Skills: Modular Building Blocks for Robotics and Physical AI
SUMMARY
In the Telekinesis ecosystem, Skills are reusable, modular operations for perception, robotics, and decision-making that can be chained into workflows for Physical AI applications in manufacturing, logistics, and more.
What are Skills?
In the Telekinesis ecosystem, a Skill is the fundamental building block for Physical AI applications.
A Skill is a self-contained, reusable operation that performs a specific task such as 3D perception, image processing, motion planning, or decision logic.
A useful mental model: Skills are to Physical AI what functions are to software — small, composable units that can be combined into larger programs.
Skill Interface
Each Skill is defined as a strongly typed function-like interface:
```python
outputs = skill(inputs)
```
Where:
- Inputs: strict data types defined in the Data Engine Layer
- Outputs: strict data types defined in the Data Engine Layer
This allows for seamless composition, as the output of one Skill can be directly fed into another.
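The function-like interface can be illustrated in plain Python. Everything below (the `PointCloud` and `BoundingBox` types, the `crop_skill` and `count_skill` functions) is invented for illustration and is not part of the Telekinesis API; it only shows how strict input/output types make Skills composable.

```python
from dataclasses import dataclass

# Stand-ins for strict data types such as those in the Data Engine Layer.
@dataclass
class PointCloud:
    points: list[tuple[float, float, float]]

@dataclass
class BoundingBox:
    min_corner: tuple[float, float, float]
    max_corner: tuple[float, float, float]

# Each "skill" is a typed function: outputs = skill(inputs).
def crop_skill(cloud: PointCloud, box: BoundingBox) -> PointCloud:
    inside = [
        p for p in cloud.points
        if all(box.min_corner[i] <= p[i] <= box.max_corner[i] for i in range(3))
    ]
    return PointCloud(points=inside)

def count_skill(cloud: PointCloud) -> int:
    return len(cloud.points)

# Composition: the output of one skill feeds directly into the next.
cloud = PointCloud(points=[(0, 0, 0), (1, 1, 1), (5, 5, 5)])
box = BoundingBox(min_corner=(0, 0, 0), max_corner=(2, 2, 2))
n = count_skill(crop_skill(cloud, box))
```

Because both skills agree on the `PointCloud` type, the chain needs no glue code; this is the property the strict typing is meant to guarantee.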
Example Skill
```python
from telekinesis import vitreous

# Beginner-friendly usage
pc = vitreous.load_point_cloud('path/to/point_cloud.ply')
downsampled_pc = vitreous.filter_point_cloud_using_voxel_downsampling(
    point_cloud=pc,
    voxel_size=0.01
)
```
How Skills Enable Physical AI Agents
In the Telekinesis ecosystem, Physical AI Agents generate programs by selecting, composing, and parameterizing Skills from the Telekinesis Agentic Skill Library. Instead of directly outputting robot actions, the agent produces structured Python code that orchestrates Skills into executable workflows.
This creates a clear separation between:
- Reasoning (Agent layer) — decides what to do
- Execution (Skill layer) — defines how it is done
A Physical AI workflow typically follows this structure:
User Instruction
↓
Agent reasoning (LLM / VLM)
↓
Skill selection (from library)
↓
Skill composition (Python program)
↓
Execution in standard Python runtime
The diagram below illustrates how Physical AI Agents interact with the Skill Library to generate executable robotics programs.
A large-scale skill library orchestrated by Physical AI agents
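The reasoning/execution split above can be sketched in a few lines of plain Python. The skill names and library structure here are made up for illustration and are not the Telekinesis Agentic Skill Library: the agent layer decides *what* to do by producing an ordered plan over skill names, and the skill layer defines *how* each step is done.

```python
# Skill layer: each entry defines HOW a step is done.
# (Toy image skills operating on nested lists, for illustration only.)
SKILL_LIBRARY = {
    "grayscale": lambda img: [[sum(px) // 3 for px in row] for row in img],
    "threshold": lambda img, t=100: [[1 if v > t else 0 for v in row] for row in img],
}

# Agent layer: decides WHAT to do, expressed as a plan over skill names
# (in practice this plan would be generated by an LLM / VLM).
plan = [("grayscale", {}), ("threshold", {"t": 100})]

# Execution: the plan runs as an ordinary Python program.
image = [[(200, 200, 200), (10, 10, 10)]]
result = image
for name, params in plan:
    result = SKILL_LIBRARY[name](result, **params)
```

Because the agent emits structured code over a fixed library rather than raw robot actions, the same plan can be inspected, replayed, or edited before execution.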
How Skills Are Organized
Skills are grouped into modules based on functionality. Each module represents a domain (e.g. perception, planning, control) and exposes a set of related operations.
| Module Name | Description |
|---|---|
| Vitreous | 3D point cloud processing: filtering, clustering, registration |
| Retina | Object detection from images |
| Cornea | Semantic and instance segmentation for images |
| Pupil | Image processing and transformations |
| Illusion | Synthetic data generation |
| Iris | Model training and evaluation |
| Synapse | Robotics motion planning, kinematics, and control |
| Medulla | Hardware communication skills for cameras, sensors, and devices |
Modules are accessed directly from the telekinesis library:
```python
from telekinesis import vitreous   # point cloud processing skills
from telekinesis import retina     # object detection skills
from telekinesis import cornea     # image segmentation skills
from telekinesis import pupil      # image processing skills
from telekinesis import illusion   # synthetic data generation skills
from telekinesis import iris       # AI model training skills
from telekinesis import synapse    # robotics skills
from telekinesis import medulla    # hardware communication skills
```
Skills are organized in Skill Groups:
Cornea - Image segmentation skills
```python
from telekinesis import cornea
```
- Color-based segmentation: RGB, HSV, LAB, YCrCb
- Region-based segmentation: Focus region, Watershed, Flood fill
- Deep learning segmentation: BiRefNet (foreground), SAM
- Graph-based segmentation: GrabCut
- Superpixel segmentation: Felzenszwalb, SLIC
- Filtering: Filter by area, color, mask
- Thresholding: Global threshold, Otsu, Local, Yen, Adaptive, Laplacian-based
See all the Cornea Skills.
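As background on the Otsu entry above: Otsu's method picks the global threshold that maximizes the between-class variance of the resulting foreground/background split. The NumPy sketch below shows the computation itself and is independent of Cornea's actual API.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the threshold maximizing between-class variance (Otsu's method)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# A bimodal test image: a dark half and a bright half.
img = np.array([[10] * 8 + [200] * 8] * 4, dtype=np.uint8)
t = otsu_threshold(img)
mask = (img >= t).astype(np.uint8)
```

On a cleanly bimodal image like this one, any threshold between the two modes is optimal, so the returned value lands between 10 and 200 and the mask separates the two halves exactly.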
Retina - Object detection skills
```python
from telekinesis import retina
```
- Classical shape detection: Hough Transform, Contours
- 2D object detection: YOLOX, RF-DETR
- Open-vocabulary detection: Qwen-VL, Grounding DINO
See all the Retina Skills.
Pupil - Image processing skills
```python
from telekinesis import pupil
```
- Morphology: erode, dilate, open/close, gradient, top-hat
- Structure: Frangi, Hessian, Sato, Meijering
- Edges: Sobel, Scharr, Laplacian, Gabor
- Denoising: Gaussian, median, bilateral, box filters
- Enhancement: CLAHE, gamma correction, white balance
- Transforms: pyramids, mask thinning
See all the Pupil Skills.
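As background on the morphology entries above: binary erosion keeps only the pixels whose entire neighborhood is foreground, shrinking shapes by one layer. A minimal NumPy sketch with a 3x3 square structuring element (this is the standard operation, not Pupil's API):

```python
import numpy as np

def erode(mask: np.ndarray) -> np.ndarray:
    """Binary erosion with a 3x3 square structuring element."""
    padded = np.pad(mask, 1, constant_values=0)
    out = np.ones_like(mask)
    # A pixel survives only if its full 3x3 neighborhood is set.
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= padded[1 + dy : 1 + dy + mask.shape[0],
                          1 + dx : 1 + dx + mask.shape[1]]
    return out

mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1        # a 3x3 block of foreground
eroded = erode(mask)      # only the block's center pixel survives
```

Dilation is the dual (a pixel survives if *any* neighbor is set), and opening/closing chain the two in opposite orders.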
Vitreous - Point cloud skills
```python
from telekinesis import vitreous
```
- Point cloud: centroids, normals, bounding boxes, principal axes
- Filtering: masks, outliers, downsampling, plane & cylinder removal
- Segmentation: DBSCAN, density, color, plane-based clustering
- Transforms: rigid transforms, scaling, projection
- Registration: ICP (P2P, P2Plane), global registration, cuboid sampling
- Meshes: shapes, mesh to point cloud, convex hull, Poisson reconstruction
See all the Vitreous Skills.
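As background on the downsampling entry (and the `voxel_size` parameter used in the earlier example): voxel downsampling buckets points into a 3D grid and keeps one representative point per occupied cell, here the cell centroid. A minimal NumPy sketch, independent of how Vitreous implements it:

```python
import numpy as np

def voxel_downsample(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Keep one representative point (the centroid) per occupied voxel."""
    buckets = {}
    for p in points:
        key = tuple(np.floor(p / voxel_size).astype(int))  # voxel grid index
        buckets.setdefault(key, []).append(p)
    return np.array([np.mean(b, axis=0) for b in buckets.values()])

pts = np.array([[0.001, 0.002, 0.0],
                [0.004, 0.003, 0.0],    # same 1 cm voxel as the first point
                [0.100, 0.100, 0.0]])   # a different voxel
down = voxel_downsample(pts, voxel_size=0.01)
```

With `voxel_size=0.01` (1 cm, as in the earlier example), the first two points collapse into one centroid, so three input points reduce to two.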
Synapse - Robotics skills
```python
from telekinesis import synapse
```
- Kinematics
- Motion planning
- Control
- Robot database
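As background on the kinematics entry: forward kinematics maps joint angles to an end-effector position. The sketch below uses a hypothetical 2-link planar arm with made-up link lengths; it illustrates the math only and is not Synapse's API.

```python
import math

def forward_kinematics(theta1: float, theta2: float,
                       l1: float = 0.4, l2: float = 0.3) -> tuple[float, float]:
    """End-effector (x, y) of a 2-link planar arm; angles in radians."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# Both joints at zero: the arm is fully extended along x,
# so the end effector sits at (l1 + l2, 0).
x, y = forward_kinematics(0.0, 0.0)
```

Inverse kinematics and motion planning build on this mapping, searching for joint trajectories whose forward kinematics trace the desired end-effector path.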
Illusion - Synthetic data generation skills
```python
from telekinesis import illusion
```
- Synthetic image data generation for AI model training
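The appeal of synthetic data is that images and labels are produced together by construction, so ground truth is free. A toy NumPy sketch of the idea (not Illusion's API): each sample is a bright square pasted onto noise, with its pixel-accurate mask generated alongside.

```python
import numpy as np

rng = np.random.default_rng(0)

def synth_sample(size: int = 32):
    """Generate one synthetic image/mask pair: a bright square on noise."""
    img = rng.normal(30, 5, (size, size))          # noisy background
    mask = np.zeros((size, size), dtype=np.uint8)
    x, y = rng.integers(0, size - 8, size=2)       # random object position
    img[y:y + 8, x:x + 8] += 150                   # paste a bright 8x8 object
    mask[y:y + 8, x:x + 8] = 1                     # and its ground-truth label
    return img, mask

img, mask = synth_sample()
```

Generating thousands of such pairs with varied positions, sizes, and textures yields a labeled training set without any manual annotation.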
Iris - AI model training and deployment skills
```python
from telekinesis import iris
```
- AI model training pipelines
- Fine-tuning and evaluation of foundation models
Medulla - Hardware communication skills
```python
from telekinesis.medulla import cameras
```
- Streaming camera data for vision pipelines
- Orchestrating multiple cameras for robot perception and control
- Connecting cameras to Physical AI agents
See all the Medulla Skills.

