Object Pose Estimation from RGBD Using Bounding Box
SUMMARY
Estimate 3D object poses (centroids) from a point cloud by defining a 3D bounding box ROI, filtering points inside it, clustering with DBSCAN, and computing the centroid of each cluster. Uses Vitreous for filtering, clustering, and centroid computation; Rerun for 3D visualization.
Overview
This example shows how to get 3D pose estimates (centroids) from an RGBD point cloud when you have a known region of interest (ROI) as a 3D bounding box—for example from a CAD model or a rough workspace volume.
The pipeline loads a PLY point cloud, keeps only points inside the box, clusters the remaining points with DBSCAN to separate individual objects, and computes the centroid of each cluster as a 3D pose. Results are visualized in Rerun (input cloud, bbox, filtered cloud, centroids, and coordinate frames).
Inputs
- Path to a PLY point cloud file
- Bounding box center and half-sizes (3D ROI)
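The bounding-box ROI reduces to a per-axis comparison against center ± half-size. A minimal NumPy sketch of that containment test (independent of Vitreous; the function name is hypothetical):

```python
import numpy as np

def filter_points_in_box(points: np.ndarray, center, half_size) -> np.ndarray:
    """Keep points whose coordinates lie within center +/- half_size on every axis.

    points: (N, 3) array; center, half_size: length-3 sequences.
    """
    center = np.asarray(center, dtype=np.float64)
    half_size = np.asarray(half_size, dtype=np.float64)
    # A point is inside iff |p - c| <= h holds for x, y, and z simultaneously.
    mask = np.all(np.abs(points - center) <= half_size, axis=1)
    return points[mask]

pts = np.array([
    [0.13, -0.09, 0.74],   # inside the example ROI below
    [1.00,  1.00, 1.00],   # far outside
])
inside = filter_points_in_box(pts, [0.132, -0.0909, 0.741], [0.11, 0.1, 0.005])
```

Vitreous's `filter_point_cloud_using_bounding_box` also carries colors and other per-point attributes through the filter; this sketch only handles positions.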
Required Telekinesis Skills
- Vitreous — filter by bounding box, DBSCAN clustering, point cloud centroid
Optional: Rerun for visualization.
Use Cases
- Bin picking / tabletop — Restrict to a table or bin volume, then cluster and get object centroids.
- Multi-object 3D pose — When multiple objects lie in a known 3D region, get one centroid per object.
- ROI-based processing — Reduce data and compute cost by only processing points inside a box.
The Pipeline
This pipeline focuses on 3D pose estimation by filtering the point cloud with a bounding box, clustering, and computing one centroid per cluster.
Load Point Cloud (PLY)
↓
Filter Point Cloud Using Bounding Box
↓
Cluster Point Cloud Using DBSCAN
↓
Calculate Point Cloud Centroid (per cluster)
↓
Estimated 3D Poses (centroids)
- Filter Point Cloud Using Bounding Box — Keeps only points inside the given 3D box.
- Cluster Point Cloud Using DBSCAN — Groups points into separate clusters (e.g. one per object).
- Calculate Point Cloud Centroid — Computes the 3D centroid of each cluster as the pose.
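The clustering and centroid steps can be illustrated without Vitreous. A rough equivalent using scikit-learn's DBSCAN, where `eps`/`min_samples` play the roles of Vitreous's `max_distance`/`min_points` (the helper name is an assumption, not a Vitreous API):

```python
import numpy as np
from sklearn.cluster import DBSCAN

def centroids_from_points(points: np.ndarray, eps: float, min_samples: int) -> np.ndarray:
    """Cluster (N, 3) points with DBSCAN and return one centroid per cluster.

    Noise points (label -1) are discarded, so sparse outliers produce no pose.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)
    cluster_ids = sorted(set(labels) - {-1})
    return np.array([points[labels == k].mean(axis=0) for k in cluster_ids])

# Two well-separated blobs -> two centroids.
pts = np.vstack([np.zeros((5, 3)), np.ones((5, 3))])
centroids = centroids_from_points(pts, eps=0.1, min_samples=3)
```

Note that a centroid is a position only; the coordinate frames drawn in the visualization below are axis-aligned, since no orientation is estimated.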
The Code
Visualization is defined at the top. The pipeline is in get_3d_pose_using_bounding_box(); each Vitreous step is followed by the relevant visualization. main() has no arguments; paths and bbox parameters are constants.
import numpy as np
from loguru import logger
import rerun as rr
import rerun.blueprint as rrb
import pathlib
from datatypes import datatypes
from datatypes import io
from telekinesis import vitreous
ROOT_DIR = pathlib.Path(__file__).parent
DATA_DIR = ROOT_DIR / "data"
PLY_PATH = DATA_DIR / "point_clouds/test_point_cloud_001.ply"
BBOX_CENTER = [0.132, -0.0909, 0.741]
BBOX_HALF_SIZE = [0.11, 0.1, 0.005]
FRAME_SIZE = 0.02 # axis length for coordinate frames
# -----------------------------------------------------------------------------
# Visualization
# -----------------------------------------------------------------------------
def visualize_bbox_filtered(
    point_cloud,
    bbox: datatypes.Boxes3D,
    filtered_point_cloud,
    centroids: np.ndarray,
) -> None:
    """Log input cloud, bbox, filtered cloud, and centroids with coordinate frames to Rerun."""
    rr.init("object_pose_estimation_bbox", spawn=True)
    blueprint = rrb.Blueprint(
        rrb.Vertical(
            contents=[
                rrb.Spatial3DView(
                    name="input point cloud",
                    origin="/input",
                    contents=["/input/points", "/input/bbox"],
                ),
                rrb.Spatial3DView(
                    name="filtered point cloud",
                    origin="/filtered",
                    contents=["/filtered/points", "/filtered/centroids/**"],
                ),
            ],
        )
    )
    rr.send_blueprint(blueprint)
    rr.log("/input/points", rr.Points3D(positions=point_cloud.positions, colors=point_cloud.colors))
    rr.log("/input/bbox", rr.Boxes3D(centers=bbox.centers, half_sizes=bbox.half_sizes))
    rr.log("/filtered/points", rr.Points3D(
        positions=filtered_point_cloud.positions,
        colors=filtered_point_cloud.colors,
    ))
    if len(centroids) > 0:
        rr.log("/filtered/centroids", rr.Points3D(
            positions=centroids,
            colors=[255, 0, 0],
            radii=0.001,
        ))
    # Draw an axis-aligned RGB frame (x=red, y=green, z=blue) at each centroid.
    for idx, centroid in enumerate(centroids):
        origins = np.array([centroid, centroid, centroid])
        vectors = np.array([
            [FRAME_SIZE, 0, 0],
            [0, FRAME_SIZE, 0],
            [0, 0, FRAME_SIZE],
        ])
        colors = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
        rr.log(f"/filtered/centroids/frame_{idx}", rr.Arrows3D(
            origins=origins, vectors=vectors, colors=colors,
        ))
# -----------------------------------------------------------------------------
# Pipeline
# -----------------------------------------------------------------------------
def get_3d_pose_using_bounding_box():
    """Load PLY, filter by bbox, cluster with DBSCAN, compute centroid per cluster; then visualize."""
    # 1. Load point cloud
    point_cloud = io.load_point_cloud(str(PLY_PATH))
    logger.info(f"Loaded point cloud with {len(point_cloud.positions)} points")

    # 2. Filter using bounding box (Vitreous)
    bbox = datatypes.Boxes3D(
        centers=np.array([BBOX_CENTER]),
        half_sizes=np.array([BBOX_HALF_SIZE]),
    )
    filtered_point_cloud = vitreous.filter_point_cloud_using_bounding_box(
        point_cloud=point_cloud,
        bbox=bbox,
    )
    logger.info(f"Filtered point cloud has {len(filtered_point_cloud.positions)} points")

    # 3. Cluster with DBSCAN (Vitreous)
    clusters = vitreous.cluster_point_cloud_using_dbscan(
        point_cloud=filtered_point_cloud,
        max_distance=0.00065,
        min_points=8,
    )
    logger.info(f"DBSCAN resulted in {len(clusters)} clusters")

    # 4. Centroid per cluster (Vitreous)
    centroids = []
    for cluster in clusters.to_list():
        c = vitreous.calculate_point_cloud_centroid(cluster)
        centroids.append(c)
    centroids = np.asarray(centroids, dtype=np.float32)
    logger.success(f"Estimated {len(centroids)} 3D poses (centroids)")

    # Visualization
    visualize_bbox_filtered(point_cloud, bbox, filtered_point_cloud, centroids)
    return point_cloud, filtered_point_cloud, centroids


def main():
    get_3d_pose_using_bounding_box()


if __name__ == "__main__":
    main()
