Skills: Concept, Architecture, and Usage
Overview
INFO
In the Telekinesis ecosystem, Skills and Physical AI Agents are the foundational building blocks for developing Physical AI solutions.
A Skill is a reusable, self-contained operation that performs a specific task in a Physical AI application. Skills span a wide range of capabilities, including 2D/3D perception, motion planning, and decision logic. Under the hood, they leverage classical algorithms, foundational models, and deep learning, giving users access to both classical and learning-based approaches without managing low-level complexity. Skills can be chained together to create complete workflows, enabling real-world robotics applications in manufacturing, logistics, and beyond.
A Physical AI Agent, typically a Vision Language Model (VLM) or Large Language Model (LLM), autonomously interprets natural language instructions and generates high-level Skill plans. In autonomous Physical AI systems, Agents continuously produce and execute Skill plans, allowing the system to operate with minimal human intervention.
Figure: Telekinesis Skill Groups aligned with core robotics domains: perception, motion, and Physical AI reasoning
What is a Skill?
In the Telekinesis ecosystem, a Skill is the fundamental building block for Physical AI applications.
A Skill is a self-contained, reusable operation that performs a specific task: for example, 3D perception, image processing, motion planning, or decision logic. Each Skill:
- Encapsulates expertise: Uses foundational AI models, deep learning models, or classical algorithms under the hood.
- Is composable: Can be combined with other Skills in pipelines to create complete applications.
- Is configurable: Offers beginner-friendly defaults and advanced parameters for precise control.
- Has well-defined inputs and outputs: Returns consistent datatypes.
Example Skill Usage:
```python
from telekinesis import vitreous

# Beginner-friendly usage
pc = vitreous.load_point_cloud('path/to/point_cloud.ply')
downsampled_pc = vitreous.filter_point_cloud_using_voxel_downsampling(
    point_cloud=pc,
    voxel_size=0.01
)
```
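Because Skills have well-defined inputs and outputs, they can be chained: the result of one Skill feeds directly into the next. The sketch below is a minimal composition example under that assumption; `cluster_point_cloud_using_dbscan` and its parameters are hypothetical names used for illustration, not confirmed SDK functions.

```python
from telekinesis import vitreous

# Load and downsample a point cloud, as in the example above.
pc = vitreous.load_point_cloud('path/to/point_cloud.ply')
downsampled_pc = vitreous.filter_point_cloud_using_voxel_downsampling(
    point_cloud=pc,
    voxel_size=0.01
)

# Feed the downsampled output into a clustering Skill.
# Hypothetical Skill name and parameters, shown only to illustrate chaining.
clusters = vitreous.cluster_point_cloud_using_dbscan(
    point_cloud=downsampled_pc,
    eps=0.02,        # neighborhood radius (hypothetical parameter)
    min_points=10    # minimum cluster size (hypothetical parameter)
)
```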
What is a Physical AI Agent?
In the Telekinesis ecosystem, a Physical AI Agent is a Vision Language Model (VLM) or Large Language Model (LLM) that plans and executes sequences of Skills to achieve a goal.
An Agent is the reasoning and task-planning unit that:
- Interprets instructions: Understands high-level goals expressed in natural language.
- Generates Skill plans: Decides which Skills to execute and in what order to accomplish the task.
- Executes autonomously: Runs Skills in pipelines and adapts plans based on observations or feedback.
- Coordinates across domains: Combines Skills from perception, motion, control, and logic modules seamlessly (see the sketch after this list).
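The sketch below shows, in hedged form, how an Agent from the cortex module (listed in the next section) might be driven. The class and method names (`PhysicalAIAgent`, `plan`, `execute`) are illustrative assumptions, not confirmed SDK API.

```python
# Hypothetical sketch: PhysicalAIAgent, plan, and execute are illustrative
# names, not confirmed SDK API.
from telekinesis import cortex, vitreous, neuroplan

# Create an agent backed by a VLM/LLM and give it access to Skill modules.
agent = cortex.PhysicalAIAgent(skill_modules=[vitreous, neuroplan])

# The agent interprets a natural-language goal and generates a Skill plan.
plan = agent.plan("Pick up the red box on the conveyor and place it in bin A")

# The agent runs the planned Skills and can adapt the plan based on feedback.
result = agent.execute(plan)
```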
Skill and Agent Modules in the Telekinesis Developer SDK
The Telekinesis Developer SDK is organized into modules, each representing a core concept.
Each module contains Skills relevant to its area, making it easier to find, compose, and reuse operations. The table below summarizes the modules.
| Module Name | Description |
|---|---|
| Vitreous | 3D point cloud processing: filtering, clustering, bounding boxes, registration. |
| Retina | Object detection from images. |
| Cornea | Semantic and instance segmentation for images. |
| Pupil | Image processing: filters, transformations, and preprocessing operations. |
| Illusion | Synthetic data generation for training and simulation. |
| Iris | Model training and evaluation pipelines for perception and motion models. |
| Neuroplan | Robotics: motion planning, trajectory generation, kinematics, and control. |
| Medulla | Hardware communication: robot interfaces, sensors, and actuators. |
| Cortex | Physical AI agents powered by LLMs and VLMs for reasoning and decision-making. |
Each module can be imported easily from the telekinesis SDK:
```python
from telekinesis import vitreous   # point cloud processing skills
from telekinesis import retina     # object detection skills
from telekinesis import cornea     # image segmentation skills
from telekinesis import pupil      # image processing skills
from telekinesis import illusion   # synthetic data generation skills
from telekinesis import iris       # AI model training skills
from telekinesis import neuroplan  # robotics skills
from telekinesis import medulla    # hardware communication skills
from telekinesis import cortex     # Physical AI agents
```
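As a rough illustration of how Skills from different modules can be composed, the sketch below passes an image through a preprocessing Skill and an object detection Skill. The function names (`load_image`, `resize_image`, `detect_objects`) and their parameters are hypothetical placeholders, not confirmed SDK API.

```python
# Hypothetical cross-module composition: an image processing Skill from pupil
# feeding an object detection Skill from retina. All function names below are
# illustrative assumptions.
from telekinesis import pupil, retina

image = pupil.load_image('path/to/frame.png')                # hypothetical loader
resized = pupil.resize_image(image, width=640, height=480)   # hypothetical preprocessing Skill
detections = retina.detect_objects(image=resized)            # hypothetical detection Skill

for detection in detections:
    print(detection)  # e.g., class label, confidence score, bounding box
```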
Where to Go Next?
Let's dive into implementation, starting with the Vitreous Skills!

