Skills: Concept, Architecture, and Usage
Overview
INFO
In the Telekinesis ecosystem, Skills and Physical AI Agents are the foundational building blocks for developing Physical AI solutions.
A Skill is a reusable, self-contained operation that performs a specific task in a Physical AI application. Skills span a wide range of capabilities, including 2D/3D perception, motion planning, and decision logic. Under the hood, they leverage classical algorithms, foundational models, and deep learning, giving users access to both classical and learning-based approaches without managing low-level complexity. Skills can be chained together to create complete workflows, enabling real-world robotics applications in manufacturing, logistics, and beyond.
A Physical AI Agent, typically a Vision Language Model (VLM) or Large Language Model (LLM), autonomously interprets natural language instructions and generates high-level Skill plans. In autonomous Physical AI systems, Agents continuously produce and execute Skill plans, allowing the system to operate with minimal human intervention.
Figure: Telekinesis Skill Groups aligned with core robotics domains: perception, motion, and Physical AI reasoning
What is a Skill?
In the Telekinesis ecosystem, a Skill is the fundamental building block for Physical AI applications.
A Skill is a self-contained, reusable operation that performs a specific task: for example, 3D perception, image processing, motion planning, or decision logic. Each Skill:
- Encapsulates expertise: Uses foundational AI models, deep learning models, or classical algorithms under the hood.
- Is composable: Can be combined with other Skills in pipelines to create complete applications.
- Is configurable: Offers beginner-friendly defaults and advanced parameters for precise control.
- Has well-defined inputs and outputs: Returns consistent datatypes.
Example Skill Usage:
```python
from telekinesis import vitreous

# Beginner-friendly usage
pc = vitreous.load_point_cloud('path/to/point_cloud.ply')
downsampled_pc = vitreous.filter_point_cloud_using_voxel_downsampling(
    point_cloud=pc,
    voxel_size=0.01
)
```
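Because Skills have well-defined inputs and outputs, they can be chained: the result of one Skill feeds directly into the next. The sketch below is a minimal composition example under that assumption; `cluster_point_cloud_using_dbscan` and its parameters are hypothetical names used for illustration, not confirmed SDK functions.

```python
from telekinesis import vitreous

# Load and downsample a point cloud, as in the example above.
pc = vitreous.load_point_cloud('path/to/point_cloud.ply')
downsampled_pc = vitreous.filter_point_cloud_using_voxel_downsampling(
    point_cloud=pc,
    voxel_size=0.01
)

# Feed the downsampled output into a clustering Skill.
# Hypothetical Skill name and parameters, shown only to illustrate chaining.
clusters = vitreous.cluster_point_cloud_using_dbscan(
    point_cloud=downsampled_pc,
    eps=0.02,        # neighborhood radius (hypothetical parameter)
    min_points=10    # minimum cluster size (hypothetical parameter)
)
```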
What is a Physical AI Agent?
In the Telekinesis ecosystem, a Physical AI Agent is a Vision Language Model (VLM) or Large Language Model (LLM) that plans and executes sequences of Skills to achieve a goal.
An Agent is the reasoning and task-planning unit that:
- Interprets instructions: Understands high-level goals expressed in natural language.
- Generates Skill plans: Decides which Skills to execute and in what order to accomplish the task.
- Executes autonomously: Runs Skills in pipelines and adapts plans based on observations or feedback.
- Coordinates across domains: Combines Skills from perception, motion, control, and logic modules seamlessly (see the sketch after this list).
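The sketch below shows, in hedged form, how an Agent from the cortex module (listed in the next section) might be driven. The class and method names (`PhysicalAIAgent`, `plan`, `execute`) are illustrative assumptions, not confirmed SDK API.

```python
# Hypothetical sketch: PhysicalAIAgent, plan, and execute are illustrative
# names, not confirmed SDK API.
from telekinesis import cortex, vitreous, neuroplan

# Create an agent backed by a VLM/LLM and give it access to Skill modules.
agent = cortex.PhysicalAIAgent(skill_modules=[vitreous, neuroplan])

# The agent interprets a natural-language goal and generates a Skill plan.
plan = agent.plan("Pick up the red box on the conveyor and place it in bin A")

# The agent runs the planned Skills and can adapt the plan based on feedback.
result = agent.execute(plan)
```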
Skill and Agent Modules in the Telekinesis Developer SDK
The Telekinesis Developer SDK is organized into modules, each representing a core concept.
Each module contains Skills relevant to its area, making it easier to find, compose, and reuse operations. The table below summarizes the modules.
| Module Name | Description |
|---|---|
| Vitreous | 3D point cloud processing: filtering, clustering, bounding boxes, registration. |
| Retina | Object detection from images. |
| Cornea | Semantic and instance segmentation for images. |
| Pupil | Image processing: filters, transformations, and preprocessing operations. |
| Illusion | Synthetic data generation for training and simulation. |
| Iris | Model training and evaluation pipelines for perception and motion models. |
| Neuroplan | Robotics: motion planning, trajectory generation, kinematics, and control. |
| Medulla | Hardware communication: robot interfaces, sensors, and actuators. |
| Cortex | Physical AI agents powered by LLMs and VLMs for reasoning and decision-making. |
Each module can be imported easily from the telekinesis SDK:
```python
from telekinesis import vitreous   # point cloud processing skills
from telekinesis import retina     # object detection skills
from telekinesis import cornea     # image segmentation skills
from telekinesis import pupil      # image processing skills
from telekinesis import illusion   # synthetic data generation skills
from telekinesis import iris       # AI model training skills
from telekinesis import neuroplan  # robotics skills
from telekinesis import medulla    # hardware communication skills
from telekinesis import cortex     # Physical AI agents
```
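As a rough illustration of how Skills from different modules can be composed, the sketch below passes an image through a preprocessing Skill and an object detection Skill. The function names (`load_image`, `resize_image`, `detect_objects`) and their parameters are hypothetical placeholders, not confirmed SDK API.

```python
# Hypothetical cross-module composition: an image processing Skill from pupil
# feeding an object detection Skill from retina. All function names below are
# illustrative assumptions.
from telekinesis import pupil, retina

image = pupil.load_image('path/to/frame.png')                # hypothetical loader
resized = pupil.resize_image(image, width=640, height=480)   # hypothetical preprocessing Skill
detections = retina.detect_objects(image=resized)            # hypothetical detection Skill

for detection in detections:
    print(detection)  # e.g., class label, confidence score, bounding box
```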
Where to Go Next?
Let's dive into implementation, starting with the Vitreous Skills!

