Pedestrian Segmentation From RGB Image
SUMMARY
Segment pedestrians from one RGB image for safety zones, people counting, or collision-free navigation. Uses SAM with a bounding box around each person; outputs masks and boxes, with Rerun visualization.
Overview
Safety zones, people counting, and mobile robot navigation in warehouses, factories, or public spaces often require segmenting pedestrians from a single RGB frame. This example shows how to segment one or more pedestrians in an RGB image using a bounding box prompt: you provide the image and an ROI (or multiple ROIs) around each person, and the pipeline returns per-person instance masks and bounding boxes for safety logic, counting, or obstacle avoidance.
Inputs
- Single RGB image with one or more pedestrians in view
- Bounding box around each pedestrian as [x_min, y_min, x_max, y_max] (one per person per run, or batch multiple boxes)
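Each ROI is a flat [x_min, y_min, x_max, y_max] list in pixel coordinates, and several can be batched in one call. As a minimal sketch (the helper name and image size are invented for this example), a sanity check on ROIs before prompting might look like:

```python
def is_valid_roi(roi, img_w, img_h):
    """Check that an ROI is a well-formed [x_min, y_min, x_max, y_max]
    box lying inside an img_w x img_h image."""
    x_min, y_min, x_max, y_max = roi
    return 0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h

# One box per pedestrian; multiple boxes can be passed in a single run.
rois = [[40, 70, 330, 414], [350, 60, 520, 400]]
assert all(is_valid_roi(r, img_w=640, img_h=480) for r in rois)
```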
Required Telekinesis Skills
- Cornea — Segment Image Using SAM for instance segmentation from bounding box prompts
Optional: Rerun for visualization.
Use Cases
This pipeline segments pedestrians in RGB images using SAM with a bounding box prompt.
Typical applications include:
- Safety zones — Isolate each person to enforce keep-out zones or alert when too close to machinery.
- People counting — Segment and count pedestrians for occupancy or flow analytics.
- Collision-free navigation — Provide masks and boxes for mobile robots or AGVs to avoid people.
- Monitoring — Visualize pedestrian regions for dashboards or incident review.
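To illustrate the safety-zone case, a hedged sketch (the zone, frame size, and pixel threshold are invented for this example) that flags when a pedestrian's binary mask overlaps a keep-out region:

```python
import numpy as np

def mask_in_zone(mask, zone_mask, min_overlap_px=50):
    """Return True if a binary pedestrian mask overlaps a binary
    keep-out-zone mask by at least min_overlap_px pixels."""
    return int(np.logical_and(mask > 0, zone_mask > 0).sum()) >= min_overlap_px

# Toy example: 100x100 frame; the pedestrian square sits inside the zone.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:40, 20:40] = 1
zone = np.zeros_like(mask)
zone[10:60, 10:60] = 1
print(mask_in_zone(mask, zone))  # True: 400 overlapping pixels
```

In a real deployment the zone mask would be rasterized once from a floor-plan polygon and compared against every per-person mask returned by the pipeline.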
Input-Output


The Pipeline
The pipeline loads an RGB image, defines a bounding box around the pedestrian, runs SAM for instance segmentation, then extracts the mask and bounding box and visualizes with Rerun.
Load RGB Image
↓
Define ROI (Bounding Box Prompt)
↓
Segment Image Using SAM
↓
Postprocess Masks
↓
Extract Bounding Boxes
↓
Visualize with Rerun
- Segment Image Using SAM — Instance segmentation from a bounding box prompt; outputs mask and bbox per pedestrian.
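The "Postprocess Masks → Extract Bounding Boxes" steps above can be sketched with plain NumPy, independent of the SAM output format (the helper name is illustrative, not part of the Cornea API):

```python
import numpy as np

def bbox_from_mask(mask):
    """Derive a tight [x_min, y_min, x_max, y_max] box from a binary
    mask, or None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]

mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1
print(bbox_from_mask(mask))  # [3, 2, 6, 4]
```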
The Code
The script loads an image, defines a bounding box for the pedestrian, runs SAM, extracts the mask and bounding box from the annotations, and visualizes with Rerun. Image path and ROI are set at the top; the pipeline runs in the main block with no function arguments.
# Standard imports; `cornea`, `io`, `DATA_DIR`, and `logger` are assumed to be
# provided by the Telekinesis SDK, as in the original script.
import cv2
import numpy as np
import rerun as rr
import rerun.blueprint as rrb
from pycocotools import mask as mask_utils

# Load image
image_path = DATA_DIR / "images/pedestrians.jpg"
image = io.load_image(image_path)
logger.info(f"Loaded image shape: {image.to_numpy().shape}")

# Define a bounding box: (x_min, y_min, x_max, y_max)
bounding_box = [40, 70, 330, 414]

# Segment using SAM
result = cornea.segment_image_using_sam(
    image=image,
    bboxes=[bounding_box],
)
annotations = result.to_list()

# Rerun visualization: connect to a running viewer, or spawn one
rr.init("pedestrian_segmentation_using_sam", spawn=False)
try:
    rr.connect()
except Exception:
    rr.spawn()
rr.send_blueprint(
    rrb.Blueprint(
        rrb.Horizontal(
            rrb.Spatial2DView(name="Input", origin="input"),
            rrb.Spatial2DView(name="Bboxes & Segments", origin="segmented"),
        ),
        rrb.SelectionPanel(),
        rrb.TimePanel(),
    ),
    make_active=True,
)

image = image.to_numpy()
rr.log("input/image", rr.Image(image=image))
rr.log("segmented/image", rr.Image(image=image))

# Build a single label image: pixel value = instance label (0 = background)
h, w = image.shape[:2]
segmentation_img = np.zeros((h, w), dtype=np.uint16)
ann_bboxes = []
class_ids = []
for idx, ann in enumerate(annotations):
    label = idx + 1
    mask_i = np.zeros((h, w), dtype=np.uint8)
    if "mask" in ann and isinstance(ann["mask"], np.ndarray):
        # Dense mask array: binarize floats/bools at 0.5, ints at 0
        m = ann["mask"]
        if m.dtype.kind in ("f", "b"):
            mask_i = (m > 0.5).astype(np.uint8)
        else:
            mask_i = (m > 0).astype(np.uint8)
    elif "segmentation" in ann and ann["segmentation"]:
        seg = ann["segmentation"]
        if isinstance(seg, dict):
            # COCO RLE-encoded mask
            mask_dec = mask_utils.decode(seg)
            if mask_dec.ndim == 3:
                mask_dec = mask_dec[:, :, 0]
            mask_i = (mask_dec > 0).astype(np.uint8)
        elif isinstance(seg, list) and len(seg) > 0:
            # Polygon(s): rasterize onto a blank canvas
            temp = np.zeros((h, w), dtype=np.uint8)
            polys = seg if isinstance(seg[0], list) else [seg]
            for poly in polys:
                pts = np.array(poly).reshape(-1, 2).astype(np.int32)
                cv2.fillPoly(temp, [pts], 1)
            mask_i = (temp > 0).astype(np.uint8)
    if mask_i.sum() == 0:
        continue
    segmentation_img[mask_i > 0] = label
    bbox = ann.get("bbox", None)
    if bbox is None:
        continue
    ann_bboxes.append(list(bbox))
    class_ids.append(label)

rr.log("segmented/masks", rr.SegmentationImage(segmentation_img))
if ann_bboxes:
    rr.log(
        "segmented/boxes",
        rr.Boxes2D(
            array=np.asarray(ann_bboxes, dtype=np.float32),
            array_format=rr.Box2DFormat.XYWH,
            class_ids=np.asarray(class_ids, dtype=np.int32),
        ),
    )
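For the people-counting use case, the count falls out of the masks collected above: one non-empty mask per pedestrian. A minimal sketch (the helper name and area threshold are invented; the threshold filters degenerate detections):

```python
import numpy as np

def count_pedestrians(masks, min_area_px=25):
    """Count binary masks with at least min_area_px foreground pixels."""
    return sum(1 for m in masks if int((m > 0).sum()) >= min_area_px)

masks = [np.ones((10, 10), dtype=np.uint8), np.zeros((10, 10), dtype=np.uint8)]
print(count_pedestrians(masks))  # 1: the empty mask is filtered out
```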
