Synthetic Datasets

SUMMARY

Telekinesis publishes photorealistic synthetic datasets for training and evaluating computer vision and robotic manipulation models. The datasets are generated with Illusion, a module in the Telekinesis SDK for synthetic data generation. Illusion is built around Extreme Domain Randomization (EDR) to enable robust sim-to-real transfer. All datasets are freely available on Kaggle.

Why Synthetic Data?

Collecting and annotating real-world data for robotics applications is expensive, time-consuming, and difficult to scale. Synthetic data generation addresses this by rendering photorealistic scenes with pixel-perfect annotations, which yields large, diverse datasets without manual labeling.

Key advantages:

  • Pixel-perfect annotations: Segmentation masks, bounding boxes, and depth maps are generated automatically during rendering
  • Scalability: Generate thousands of annotated images in hours instead of weeks
  • Extreme Domain Randomization (EDR): Systematically vary lighting, camera angles, object poses, materials, and backgrounds across an aggressive range of conditions to close the sim-to-real gap and produce models that generalize to unseen real-world scenes
  • Safety: No physical setup required, enabling data collection for hazardous or hard-to-reproduce scenarios
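
To make the randomization idea in the list above concrete, here is a minimal sketch of how scene parameters could be sampled over wide ranges. It is not the Illusion API: `SceneParams`, `sample_scene`, `sample_dataset`, the numeric ranges, and the material and background names are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """One randomized scene configuration (all field names are hypothetical)."""
    light_intensity: float       # arbitrary units
    camera_azimuth_deg: float    # rotation of the camera around the scene
    camera_elevation_deg: float  # height angle of the camera
    object_yaw_deg: float        # in-plane rotation of the object
    material: str
    background: str


# Illustrative choices only; a real pipeline would use its own asset lists.
MATERIALS = ["brushed_steel", "anodized_aluminium", "matte_plastic"]
BACKGROUNDS = ["grey_bin", "blue_bin", "wooden_table"]


def sample_scene(rng: random.Random) -> SceneParams:
    """Draw every parameter independently from a deliberately wide range."""
    return SceneParams(
        light_intensity=rng.uniform(0.1, 10.0),
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        camera_elevation_deg=rng.uniform(20.0, 90.0),
        object_yaw_deg=rng.uniform(0.0, 360.0),
        material=rng.choice(MATERIALS),
        background=rng.choice(BACKGROUNDS),
    )


def sample_dataset(n: int, seed: int = 0) -> list[SceneParams]:
    """Sample n scenes from a seeded generator so the list is reproducible."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n)]


scenes = sample_dataset(1000, seed=42)
```

Each sampled `SceneParams` would then be handed to a renderer, which produces the image together with its annotations. Seeding the generator is a design choice that lets the same set of scenes be regenerated later.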

Available Datasets

Bin Picking: Hinge 09
Instance Segmentation
Bin Picking: Rod End 01
Instance Segmentation
Bin Picking: Pistons
Instance Segmentation