Synthetic Datasets
SUMMARY
Telekinesis publishes photorealistic synthetic datasets for training and evaluating computer vision and robotic manipulation models. The datasets are generated with Illusion, the synthetic data generation module of the Telekinesis SDK, which is built around Extreme Domain Randomization (EDR) to enable robust sim-to-real transfer. All datasets are freely available on Kaggle.
Why Synthetic Data?
Collecting and annotating real-world data for robotics applications is expensive, time-consuming, and difficult to scale. Synthetic data generation addresses this by rendering photorealistic scenes together with pixel-perfect annotations, producing large, diverse datasets without manual labeling.
Key advantages:
- Pixel-perfect annotations: Segmentation masks, bounding boxes, and depth maps are generated automatically during rendering
- Scalability: Generate thousands of annotated images in hours instead of weeks
- Extreme Domain Randomization (EDR): Systematically vary lighting, camera angles, object poses, materials, and backgrounds over a deliberately wide range of conditions to narrow the sim-to-real gap, so that models generalize to unseen real-world scenes
- Safety: No physical setup is required, which enables data collection for hazardous or hard-to-reproduce scenarios
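To make the domain randomization idea above concrete, here is a minimal Python sketch of per-image scene sampling. It is not the Illusion API: the `SceneParams` fields, the value ranges, and the material and background names are all assumptions chosen for illustration. It only shows the general pattern of drawing every scene property independently from a wide range before each render.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """Hypothetical per-image scene parameters (not part of the Illusion API)."""
    light_intensity: float       # arbitrary units
    camera_elevation_deg: float  # 90 = looking straight down into the bin
    camera_azimuth_deg: float
    object_yaw_deg: float
    material: str
    background: str


# Hypothetical option pools, chosen only for illustration.
MATERIALS = ["brushed_steel", "anodized_aluminum", "black_plastic"]
BACKGROUNDS = ["gray_bin", "blue_bin", "cardboard"]


def sample_scene(rng: random.Random) -> SceneParams:
    """Draw one scene configuration, sampling each property independently."""
    return SceneParams(
        light_intensity=rng.uniform(0.1, 10.0),
        camera_elevation_deg=rng.uniform(20.0, 90.0),
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        object_yaw_deg=rng.uniform(0.0, 360.0),
        material=rng.choice(MATERIALS),
        background=rng.choice(BACKGROUNDS),
    )


def sample_dataset(n_images: int, seed: int = 0) -> list[SceneParams]:
    """Sample n_images configurations reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n_images)]


if __name__ == "__main__":
    for params in sample_dataset(3):
        print(params)
```

In a real pipeline, each sampled configuration would be handed to a renderer that also emits the matching masks, boxes, and depth maps. Seeding the generator keeps a dataset reproducible while still covering a broad range of conditions.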
Available Datasets
- Bin Picking: Hinge 09 (Instance Segmentation)
- Bin Picking: Rod End 01 (Instance Segmentation)
- Bin Picking: Pistons (Instance Segmentation)







