
Object Pose Estimation from RGBD Using Bounding Box

SUMMARY

Estimate 3D object poses (centroids) from a point cloud by defining a 3D bounding box ROI, filtering points inside it, clustering with DBSCAN, and computing the centroid of each cluster. Uses Vitreous for filtering, clustering, and centroid computation; Rerun for 3D visualization.

Overview

This example shows how to get 3D pose estimates (centroids) from an RGBD point cloud when you have a known region of interest (ROI) as a 3D bounding box—for example from a CAD model or a rough workspace volume.

The pipeline loads a PLY point cloud, keeps only points inside the box, clusters the remaining points with DBSCAN to separate individual objects, and computes the centroid of each cluster as a 3D pose. Results are visualized in Rerun (input cloud, bbox, filtered cloud, centroids, and coordinate frames).
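The bounding-box filter at the heart of the pipeline is just an axis-aligned containment test. As a minimal NumPy sketch (the values below reuse the example's bbox constants; the actual `vitreous.filter_point_cloud_using_bounding_box` implementation may differ), a point is inside the box when its offset from the center is within the half-size along every axis:

```python
import numpy as np

# Bbox constants from this example (center + half-sizes of the ROI).
center = np.array([0.132, -0.0909, 0.741])
half_size = np.array([0.11, 0.1, 0.005])

# Synthetic stand-in for a loaded point cloud: random points near the ROI.
points = center + np.random.default_rng(0).uniform(-0.2, 0.2, size=(1000, 3))

# A point lies inside the axis-aligned box iff |p - center| <= half_size
# holds on all three axes.
inside = np.all(np.abs(points - center) <= half_size, axis=1)
filtered = points[inside]
```

Clustering and centroid computation then operate only on `filtered`, which is what makes the ROI an effective way to cut data volume before DBSCAN.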

Inputs

  • Path to a PLY point cloud file
  • Bounding box center and half-sizes (3D ROI)

Required Telekinesis Skills

  • Vitreous — filter by bounding box, DBSCAN clustering, point cloud centroid

Optional: Rerun for visualization.

Use Cases

  • Bin picking / tabletop — Restrict to a table or bin volume, then cluster and get object centroids.
  • Multi-object 3D pose — When multiple objects lie in a known 3D region, get one centroid per object.
  • ROI-based processing — Reduce data and compute cost by only processing points inside a box.

Input-Output

Input
PLY point cloud; 3D bounding box (center + half-sizes).
Output
Filtered point cloud; list of 3D centroids (one per cluster); Rerun view with input cloud, bbox, filtered points, centroids, and coordinate frames.

The Pipeline

This pipeline focuses on 3D pose estimation by filtering the point cloud with a bounding box, clustering, and computing one centroid per cluster.

text
Load Point Cloud (PLY)
        ↓
Filter Point Cloud Using Bounding Box
        ↓
Cluster Point Cloud Using DBSCAN
        ↓
Calculate Point Cloud Centroid (per cluster)
        ↓
Estimated 3D Poses (centroids)
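The last two steps reduce to taking one mean position per cluster. A minimal NumPy sketch, using hypothetical labels in the shape DBSCAN would assign them (`-1` marks noise, `0..k-1` mark clusters; the Vitreous skills wrap this in their own datatypes):

```python
import numpy as np

# Toy point cloud: two tight clusters plus one noise point.
points = np.array([
    [0.0, 0.0, 0.0], [0.01, 0.0, 0.0],    # cluster 0
    [1.0, 1.0, 1.0], [1.01, 1.0, 1.0],    # cluster 1
    [5.0, 5.0, 5.0],                       # noise
])
# Hypothetical DBSCAN output: -1 = noise, otherwise the cluster index.
labels = np.array([0, 0, 1, 1, -1])

# One centroid per cluster: mean of that cluster's points, noise excluded.
centroids = np.array([
    points[labels == k].mean(axis=0)
    for k in sorted(set(labels) - {-1})
])
```

Here `centroids` has shape `(2, 3)`, one 3D pose estimate per cluster, which mirrors what the pipeline returns for the filtered cloud.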

The Code

The visualization helper is defined first. The pipeline lives in get_3d_pose_using_bounding_box(); each Vitreous step logs its intermediate result for visualization at the end. main() takes no arguments; the input path and bounding-box parameters are module-level constants.

python
import pathlib

import numpy as np
import rerun as rr
import rerun.blueprint as rrb
from loguru import logger

from datatypes import datatypes
from datatypes import io
from telekinesis import vitreous

ROOT_DIR = pathlib.Path(__file__).parent
DATA_DIR = ROOT_DIR / "data"
PLY_PATH = DATA_DIR / "point_clouds/test_point_cloud_001.ply"
BBOX_CENTER = [0.132, -0.0909, 0.741]
BBOX_HALF_SIZE = [0.11, 0.1, 0.005]
FRAME_SIZE = 0.02  # axis length for coordinate frames

# -----------------------------------------------------------------------------
# Visualization
# -----------------------------------------------------------------------------

def visualize_bbox_filtered(
    point_cloud,
    bbox: datatypes.Boxes3D,
    filtered_point_cloud,
    centroids: np.ndarray,
) -> None:
    """Log input cloud, bbox, filtered cloud, and centroids with coordinate frames to Rerun."""
    rr.init("object_pose_estimation_bbox", spawn=True)
    blueprint = rrb.Blueprint(
        rrb.Vertical(
            contents=[
                rrb.Spatial3DView(
                    name="input point cloud",
                    origin="/input",
                    contents=["/input/points", "/input/bbox"],
                ),
                rrb.Spatial3DView(
                    name="filtered point cloud",
                    origin="/filtered",
                    contents=["/filtered/points", "/filtered/centroids/**"],
                ),
            ],
        )
    )
    rr.send_blueprint(blueprint)

    rr.log("/input/points", rr.Points3D(positions=point_cloud.positions, colors=point_cloud.colors))
    rr.log("/input/bbox", rr.Boxes3D(centers=bbox.centers, half_sizes=bbox.half_sizes))
    rr.log("/filtered/points", rr.Points3D(
        positions=filtered_point_cloud.positions,
        colors=filtered_point_cloud.colors,
    ))

    if len(centroids) > 0:
        rr.log("/filtered/centroids", rr.Points3D(
            positions=centroids,
            colors=[255, 0, 0],
            radii=0.001,
        ))
        for idx, centroid in enumerate(centroids):
            origins = np.array([centroid, centroid, centroid])
            vectors = np.array([
                [FRAME_SIZE, 0, 0],
                [0, FRAME_SIZE, 0],
                [0, 0, FRAME_SIZE],
            ])
            colors = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
            rr.log(f"/filtered/centroids/frame_{idx}", rr.Arrows3D(
                origins=origins, vectors=vectors, colors=colors,
            ))


# -----------------------------------------------------------------------------
# Pipeline
# -----------------------------------------------------------------------------

def get_3d_pose_using_bounding_box():
    """Load PLY, filter by bbox, cluster with DBSCAN, compute centroid per cluster; then visualize."""
    # 1. Load point cloud
    point_cloud = io.load_point_cloud(str(PLY_PATH))
    logger.info(f"Loaded point cloud with {len(point_cloud.positions)} points")

    # 2. Filter using bounding box (Vitreous)
    bbox = datatypes.Boxes3D(
        centers=np.array([BBOX_CENTER]),
        half_sizes=np.array([BBOX_HALF_SIZE]),
    )
    filtered_point_cloud = vitreous.filter_point_cloud_using_bounding_box(
        point_cloud=point_cloud,
        bbox=bbox,
    )
    logger.info(f"Filtered point cloud has {len(filtered_point_cloud.positions)} points")

    # 3. Cluster with DBSCAN (Vitreous)
    clusters = vitreous.cluster_point_cloud_using_dbscan(
        point_cloud=filtered_point_cloud,
        max_distance=0.00065,
        min_points=8,
    )
    logger.info(f"DBSCAN resulted in {len(clusters)} clusters")

    # 4. Centroid per cluster (Vitreous)
    centroids = []
    for cluster in clusters.to_list():
        c = vitreous.calculate_point_cloud_centroid(cluster)
        centroids.append(c)
    centroids = np.asarray(centroids, dtype=np.float32)
    logger.success(f"Estimated {len(centroids)} 3D poses (centroids)")

    # Visualization
    visualize_bbox_filtered(point_cloud, bbox, filtered_point_cloud, centroids)

    return point_cloud, filtered_point_cloud, centroids


def main():
    get_3d_pose_using_bounding_box()


if __name__ == "__main__":
    main()