Forklift Segmentation From RGB Image
SUMMARY
Segment forklifts or industrial vehicles from one RGB image for warehouse safety, fleet tracking, or collision avoidance. Uses SAM with a bounding box prompt; outputs masks and bounding boxes, with Rerun visualization.
Overview
Warehouse safety, fleet tracking, and AGV or robot collision avoidance often require segmenting vehicles (e.g. forklifts) from a single RGB frame.
This example shows how to segment a forklift or similar vehicle in an RGB image using a bounding box prompt: you provide the image and an ROI around the vehicle, and the pipeline returns the instance mask and bounding box for monitoring, path planning, or collision checks.
Inputs
- Single RGB image of the scene (e.g. warehouse) with the forklift or vehicle in view
- Bounding box around the forklift or vehicle as [x_min, y_min, x_max, y_max]
Optional: Rerun for visualization.
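The prompt box uses XYXY corner order, while the Rerun box logging later in the script uses XYWH. A small helper (hypothetical, not part of the pipeline) makes the two conventions explicit:

```python
def xyxy_to_xywh(box):
    """Convert [x_min, y_min, x_max, y_max] to [x, y, width, height]."""
    x_min, y_min, x_max, y_max = box
    assert x_max > x_min and y_max > y_min, "degenerate box"
    return [x_min, y_min, x_max - x_min, y_max - y_min]

# The forklift ROI used in the example script:
print(xyxy_to_xywh([18, 216, 303, 389]))  # → [18, 216, 285, 173]
```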
Use Cases
This pipeline segments forklifts or industrial vehicles in RGB images using SAM with a bounding box prompt.
Typical applications include:
- Warehouse safety — Isolate vehicles to enforce safety zones or alert when humans are too close.
- Fleet tracking — Segment and localize each vehicle for occupancy or flow analytics.
- Collision avoidance — Provide masks and boxes for AGV or robot path planning around vehicles.
- Monitoring — Visualize vehicle regions for dashboards or logging.
Input-Output


The Pipeline
The pipeline loads an RGB image, defines a bounding box around the vehicle, runs SAM for instance segmentation, then extracts the mask and bounding box and visualizes with Rerun.
Load RGB Image
↓
Define ROI (Bounding Box Prompt)
↓
Segment Image Using SAM
↓
Postprocess Masks
↓
Extract Bounding Boxes
↓
Visualize with Rerun

- Segment Image Using SAM — Instance segmentation from a bounding box prompt; outputs the mask and bounding box for the vehicle.
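The "Postprocess Masks" and "Extract Bounding Boxes" steps can be sketched in isolation. The helper below is illustrative (not the pipeline's actual API) and assumes a single float or binary mask as SAM might return it:

```python
import numpy as np

def postprocess_mask(mask, threshold=0.5):
    """Binarize a (possibly float) mask and derive its XYXY bounding box.
    Returns (binary_mask, bbox), with bbox None if the mask is empty."""
    if mask.dtype.kind == "f":
        binary = (mask > threshold).astype(np.uint8)
    else:
        binary = (mask > 0).astype(np.uint8)
    ys, xs = np.nonzero(binary)
    if len(xs) == 0:
        return binary, None
    return binary, [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]

# Toy example: a 3x4 blob of confidence 0.9 inside an 8x8 mask
mask = np.zeros((8, 8), dtype=np.float32)
mask[2:5, 3:7] = 0.9
binary, bbox = postprocess_mask(mask)
print(bbox)  # → [3, 2, 6, 4]
```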
The Code
The script loads an image, defines a bounding box for the forklift, runs SAM, extracts the mask and bounding box from the annotations, and visualizes with Rerun. Image path and ROI are set at the top; the pipeline runs in the main block with no function arguments.
# Third-party imports used below; DATA_DIR, io, cornea, and logger are
# assumed to come from the example project's own modules (not shown here).
import numpy as np
import cv2
import rerun as rr
import rerun.blueprint as rrb
from pycocotools import mask as mask_utils  # assumed source of mask_utils

# Load image
image_path = DATA_DIR / "images/forklift.jpg"
image = io.load_image(image_path)
logger.info(f"Loaded image shape: {image.to_numpy().shape}")
# Define a bounding box: (x_min, y_min, x_max, y_max)
bounding_box = [18, 216, 303, 389]
# Segment using SAM
result = cornea.segment_image_using_sam(
    image=image,
    bboxes=[bounding_box],
)
annotations = result.to_list()
# Rerun visualization
rr.init("forklift_segmentation_using_sam", spawn=False)
try:
    rr.connect()
except Exception:
    # No viewer listening: spawn one locally instead
    rr.spawn()
rr.send_blueprint(
    rrb.Blueprint(
        rrb.Horizontal(
            rrb.Spatial2DView(name="Input", origin="input"),
            rrb.Spatial2DView(name="Bboxes & Segments", origin="segmented"),
        ),
        rrb.SelectionPanel(),
        rrb.TimePanel(),
    ),
    make_active=True,
)
image = image.to_numpy()
rr.log("input/image", rr.Image(image=image))
rr.log("segmented/image", rr.Image(image=image))
h, w = image.shape[:2]
segmentation_img = np.zeros((h, w), dtype=np.uint16)
ann_bboxes = []
class_ids = []
for idx, ann in enumerate(annotations):
    label = idx + 1  # 0 is background in the segmentation image
    mask_i = np.zeros((h, w), dtype=np.uint8)
    if "mask" in ann and isinstance(ann["mask"], np.ndarray):
        # Dense mask: binarize float/bool masks at 0.5, integer masks at 0
        m = ann["mask"]
        if m.dtype.kind in ("f", "b"):
            mask_i = (m > 0.5).astype(np.uint8)
        else:
            mask_i = (m > 0).astype(np.uint8)
    elif "segmentation" in ann and ann["segmentation"]:
        seg = ann["segmentation"]
        if isinstance(seg, dict):
            # COCO RLE: decode to a binary mask
            mask_dec = mask_utils.decode(seg)
            if mask_dec.ndim == 3:
                mask_dec = mask_dec[:, :, 0]
            mask_i = (mask_dec > 0).astype(np.uint8)
        elif isinstance(seg, list) and len(seg) > 0:
            # Polygon(s): rasterize with cv2.fillPoly
            temp = np.zeros((h, w), dtype=np.uint8)
            polys = seg if isinstance(seg[0], list) else [seg]
            for poly in polys:
                pts = np.array(poly).reshape(-1, 2).astype(np.int32)
                cv2.fillPoly(temp, [pts], 1)
            mask_i = (temp > 0).astype(np.uint8)
    if mask_i.sum() == 0:
        continue
    segmentation_img[mask_i > 0] = label
    bbox = ann.get("bbox", None)
    if bbox is None:
        continue
    ann_bboxes.append(list(bbox))
    class_ids.append(label)
rr.log("segmented/masks", rr.SegmentationImage(segmentation_img))
if ann_bboxes:
    rr.log(
        "segmented/boxes",
        rr.Boxes2D(
            array=np.asarray(ann_bboxes, dtype=np.float32),
            array_format=rr.Box2DFormat.XYWH,
            class_ids=np.asarray(class_ids, dtype=np.int32),
        ),
    )
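Once the label image is built, downstream checks such as the warehouse-safety use case reduce to simple array operations. The `zone_overlap` helper below is a hypothetical sketch (not part of the script) that measures how much of a rectangular safety zone is covered by any segmented vehicle:

```python
import numpy as np

def zone_overlap(segmentation_img, zone_xyxy):
    """Fraction of a rectangular safety zone covered by any segmented vehicle.
    `segmentation_img` is a label image (0 = background, >0 = instance label),
    `zone_xyxy` is [x_min, y_min, x_max, y_max]."""
    x0, y0, x1, y1 = zone_xyxy
    zone = segmentation_img[y0:y1, x0:x1]
    return float((zone > 0).mean()) if zone.size else 0.0

# Toy example: a 4x4 vehicle mask inside a 10x10 scene
seg = np.zeros((10, 10), dtype=np.uint16)
seg[4:8, 4:8] = 1
print(zone_overlap(seg, [2, 2, 8, 8]))  # 16/36 ≈ 0.444 of the zone is covered
```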
