Detect Objects Using RF-DETR

SUMMARY

This Skill detects objects in an image with RF-DETR and returns COCO-like annotations, with category names drawn from the COCO 80-class label set.

This Skill is designed for transformer-based object detection in scenarios where global context understanding is beneficial, such as dense object scenes or complex warehouse environments. A typical case is detecting overlapping boxes, pallets, or workers in cluttered industrial layouts.

Use this Skill when you want to detect and label objects using COCO 80-class categories with a modern transformer-based detection architecture.

The Skill

WARNING

This Skill is currently in beta and may fail when provided with empty annotations. We are continuously improving its robustness and reliability, and the documentation will be updated as improvements are validated.

python

from telekinesis import retina

annotations, categories = retina.detect_objects_using_rfdetr(
    image=image,
    score_threshold=0.5,
)

API Reference

Example

Input Image

Original image

Detected Objects

Detected persons with bounding boxes, labels and scores.

The Code

python

from telekinesis import retina
from datatypes import io
import pathlib

# Optional for logging
from loguru import logger

DATA_DIR = pathlib.Path("path/to/telekinesis-data")

# Load image
filepath = str(DATA_DIR / "images" / "warehouse_1.jpg")
image = io.load_image(filepath=filepath)
logger.success(f"Loaded image from {filepath}")

# Detect Objects
annotations, categories = retina.detect_objects_using_rfdetr(
    image=image,
    score_threshold=0.5,
)

# Access results
annotations = annotations.to_list()
categories = categories.to_list()  
logger.success(f"RF-DETR detected {len(annotations)} objects.")

The Explanation of the Code

This example shows how to use the detect_objects_using_rfdetr Skill to detect objects in an image. The code begins by importing the necessary modules from Telekinesis and Python, and optionally sets up logging with loguru to provide feedback during execution.

python

from telekinesis import retina
from datatypes import io
import pathlib

# Optional for logging
from loguru import logger

The image is loaded from a .jpg file using io.load_image. The logger then reports the path of the loaded image, which helps confirm that the intended input was read and is ready for processing.

python

DATA_DIR = pathlib.Path("path/to/telekinesis-data")

# Load image
filepath = str(DATA_DIR / "images" / "warehouse_1.jpg")
image = io.load_image(filepath=filepath)
logger.success(f"Loaded image from {filepath}")

The detection parameters are configured:

image specifies the input image
score_threshold sets the minimum confidence score required for a detection to be returned

python

annotations, categories = retina.detect_objects_using_rfdetr(
    image=image,
    score_threshold=0.5,
)

The function returns annotations in a COCO-like format and categories with class label information. Convert both results to Python lists with to_list(), then log the number of detected objects.

python

# Access results
annotations = annotations.to_list()
categories = categories.to_list()
logger.success(f"RF-DETR detected {len(annotations)} objects.")
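To make the result handling more concrete, here is a minimal sketch of how the two lists might be consumed. It assumes each annotation is a dict with the COCO-style keys `bbox` (as `[x, y, width, height]`), `category_id`, and `score`, and that each category is a dict with `id` and `name`. These key names follow the COCO convention and are an assumption, not confirmed output of the Skill; the sample data and the helper `summarize_detections` are hypothetical.

```python
# Hypothetical sample data standing in for annotations.to_list() and
# categories.to_list(); the key names are assumed from the COCO convention.
annotations = [
    {"bbox": [10.0, 20.0, 50.0, 100.0], "category_id": 1, "score": 0.91},
    {"bbox": [200.0, 40.0, 80.0, 60.0], "category_id": 8, "score": 0.62},
]
categories = [
    {"id": 1, "name": "person"},
    {"id": 8, "name": "truck"},
]


def summarize_detections(annotations, categories):
    """Return (label, score, [x_min, y_min, x_max, y_max]) per detection."""
    id_to_name = {c["id"]: c["name"] for c in categories}
    summary = []
    for ann in annotations:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        label = id_to_name.get(ann["category_id"], "unknown")
        summary.append((label, ann["score"], [x, y, x + w, y + h]))
    return summary


for label, score, corners in summarize_detections(annotations, categories):
    print(f"{label}: score={score:.2f}, corners={corners}")
```

Converting to corner coordinates is often convenient for drawing or cropping; if the actual output uses different keys, only the dictionary lookups need to change.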

This Skill provides a fast, model-driven approach to object detection that is useful for identifying and labeling objects in industrial vision pipelines.

Running the Example

Runnable examples are available in the Telekinesis examples repository. Follow the README in that repository to set up the environment. Once set up, you can run this specific example with:

bash

cd telekinesis-examples
python examples/retina_examples.py --example detect_objects_using_rfdetr

How to Tune the Parameters

The detect_objects_using_rfdetr Skill has one tunable parameter:

score_threshold:

Minimum confidence score required for a detection to be returned
Typical range: 0.3 to 0.7 (task-dependent)
Increase to reduce false positives and keep only high-confidence detections
Decrease to improve recall when objects are small, partially occluded, or hard to detect

TIP

Best practice: Start with score_threshold=0.5. Raise it if you see too many false positives; lower it if true objects are being missed.
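The effect of the threshold can be illustrated with a small sketch. It uses made-up confidence scores and a hypothetical helper, `count_kept`, and it assumes a detection is kept when its score is greater than or equal to the threshold; whether the Skill itself uses an inclusive comparison is not stated here.

```python
# Hypothetical detection scores, for illustration only.
scores = [0.95, 0.81, 0.64, 0.52, 0.41, 0.33]


def count_kept(scores, score_threshold):
    """Count detections whose score is at or above the threshold (assumed rule)."""
    return sum(1 for s in scores if s >= score_threshold)


for threshold in (0.3, 0.5, 0.7):
    print(f"score_threshold={threshold}: {count_kept(scores, threshold)} kept")
```

With these sample scores, raising the threshold from 0.3 to 0.7 reduces the kept detections from 6 to 2, which mirrors the precision/recall trade-off described above.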

Where to Use the Skill

Detect objects using RF-DETR is commonly used in the following pipelines:

Warehouse and logistics monitoring - Detecting pallets, boxes, people, and equipment for operational visibility
Quality inspection and compliance checks - Verifying object presence/absence and category-level correctness
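As an illustration of a presence/absence check, the sketch below compares detected category names against a list of required and forbidden labels. It assumes COCO-style dicts with `category_id`, `id`, and `name` keys, which is an assumption about the converted output; the sample data and the helper `check_presence` are hypothetical.

```python
# Hypothetical sample data; key names are assumed from the COCO convention.
annotations = [
    {"bbox": [10.0, 20.0, 50.0, 100.0], "category_id": 1, "score": 0.91},
    {"bbox": [200.0, 40.0, 80.0, 60.0], "category_id": 8, "score": 0.62},
]
categories = [{"id": 1, "name": "person"}, {"id": 8, "name": "truck"}]


def check_presence(annotations, categories, required, forbidden=()):
    """Report required labels that are missing and forbidden labels that appear."""
    id_to_name = {c["id"]: c["name"] for c in categories}
    detected = {id_to_name.get(a["category_id"], "unknown") for a in annotations}
    return {
        "missing": sorted(set(required) - detected),
        "unexpected": sorted(set(forbidden) & detected),
    }


report = check_presence(
    annotations,
    categories,
    required=["person", "stop sign"],
    forbidden=["truck"],
)
print(report)  # {'missing': ['stop sign'], 'unexpected': ['truck']}
```

An empty `missing` and `unexpected` list would indicate that the scene passes this simple check.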

Alternative Skills

Skill	vs. Detect objects using RF-DETR
detect_objects_using_yolox	Use yolox when you need speed and real-time performance.

When Not to Use the Skill

Do not use Detect objects using RF-DETR when:

GPU memory is limited (Transformer-based models typically consume more GPU memory than CNN-based detectors)
Real-time performance is required (RF-DETR may be too slow compared to YOLO-style detectors)
Running on edge devices or resource-constrained systems
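One way to judge whether the real-time constraint applies is to measure latency on the target hardware. The sketch below is a generic timing helper, not part of Telekinesis: `mean_latency_seconds` and `fake_detector` are hypothetical names, and the stand-in detector only sleeps briefly so the example runs without a model.

```python
import time


def mean_latency_seconds(detect_fn, image, runs=5):
    """Average wall-clock time of detect_fn(image) over several runs."""
    detect_fn(image)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(runs):
        detect_fn(image)
    return (time.perf_counter() - start) / runs


def fake_detector(image):
    # Hypothetical stand-in for the real detection call.
    time.sleep(0.001)
    return []


latency = mean_latency_seconds(fake_detector, image=None)
budget = 1 / 30  # example budget: a 30 frames-per-second target
print(f"mean latency: {latency * 1000:.1f} ms, within budget: {latency <= budget}")
```

In practice, `detect_fn` could be a small wrapper such as `lambda img: retina.detect_objects_using_rfdetr(image=img, score_threshold=0.5)`, with the budget chosen to match the frame rate the application requires.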
