Project Pixel to Camera Point

SUMMARY

Project Pixel to Camera Point projects a pixel and depth to a 3D point in camera coordinates.

Unprojects a 2D pixel and depth value into 3D camera space using camera intrinsics and distortion coefficients. Essential for RGB-D processing, 3D reconstruction, and converting image coordinates to 3D points.

Use this Skill when you want to convert pixel + depth to 3D camera coordinates.

The Skill

python
from telekinesis import pupil
import numpy as np

camera_T_point = pupil.project_pixel_to_camera_point(
    camera_intrinsics=camera_intrinsics,
    distortion_coefficients=distortion_coefficients,
    pixel=pixel,
    depth=depth,
)

Example

Projects pixel (320, 240) with depth 1.0 to a 3D point in camera coordinates. The result is a 4x4 transform matrix or equivalent representation; the 3D point is typically extracted from the translation component.
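For intuition, the expected result can be predicted from the pinhole model alone. The sketch below is numpy-only and independent of the pupil API; it assumes the standard pinhole unprojection with zero distortion.

```python
import numpy as np

# Example intrinsics: fx = fy = 500, principal point (cx, cy) = (320, 240)
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
u, v, depth = 320.0, 240.0, 1.0

# Pinhole unprojection (zero distortion): back-project through the optical center
x = (u - cx) * depth / fx
y = (v - cy) * depth / fy
z = depth

point = np.array([x, y, z])
print(point)  # the pixel at the principal point lies on the optical axis: [0. 0. 1.]
```

Because the example pixel coincides with the principal point, the resulting 3D point sits exactly on the optical axis at the given depth.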

The Code

python
from telekinesis import pupil
import numpy as np
from loguru import logger

# Camera intrinsics 3x3 (fx, fy, cx, cy)
camera_intrinsics = np.array(
    [[500.0, 0, 320.0], [0, 500.0, 240.0], [0, 0, 1.0]],
    dtype=np.float64,
)
distortion_coefficients = np.array([0.0, 0.0, 0.0, 0.0, 0.0], dtype=np.float64)
pixel = np.array([320.0, 240.0], dtype=np.float64)
depth = 1.0

# Project pixel and depth to 3D point in camera coordinates
camera_T_point = pupil.project_pixel_to_camera_point(
    camera_intrinsics=camera_intrinsics,
    distortion_coefficients=distortion_coefficients,
    pixel=pixel,
    depth=depth,
)

logger.success(
    "Projected pixel to camera point. camera_T_point shape: {}",
    np.asarray(camera_T_point.matrix).shape if hasattr(camera_T_point, "matrix") else "N/A",
)

The Explanation of the Code

The code begins by importing the necessary modules: pupil for image processing operations, numpy for numerical operations, and loguru for logging.

python
from telekinesis import pupil
import numpy as np
from loguru import logger

Next, camera intrinsics, distortion coefficients, pixel coordinates, and depth are configured. The intrinsics matrix contains fx, fy (focal lengths) and cx, cy (principal point). The pixel is (u, v) in image coordinates; depth is the distance along the optical axis.

python
camera_intrinsics = np.array(
    [[500.0, 0, 320.0], [0, 500.0, 240.0], [0, 0, 1.0]],
    dtype=np.float64,
)
distortion_coefficients = np.array([0.0, 0.0, 0.0, 0.0, 0.0], dtype=np.float64)
pixel = np.array([320.0, 240.0], dtype=np.float64)
depth = 1.0

The main operation uses the project_pixel_to_camera_point Skill from the pupil module. This Skill unprojects a 2D pixel and depth value into 3D camera space. The output is a transform; extract the 3D position from the matrix translation (e.g., matrix[:3, 3]).

python
camera_T_point = pupil.project_pixel_to_camera_point(
    camera_intrinsics=camera_intrinsics,
    distortion_coefficients=distortion_coefficients,
    pixel=pixel,
    depth=depth,
)
logger.success(
    "Projected pixel to camera point. camera_T_point shape: {}",
    np.asarray(camera_T_point.matrix).shape if hasattr(camera_T_point, "matrix") else "N/A",
)
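Extracting the 3D point from the returned transform works the same way for any 4x4 homogeneous matrix. The sketch below uses a hand-built matrix rather than the actual return value, since the exact type of `camera_T_point` depends on the library.

```python
import numpy as np

# A stand-in 4x4 transform whose translation column holds the 3D point
# (the real camera_T_point is assumed to expose an equivalent matrix)
T = np.eye(4)
T[:3, 3] = [0.0, 0.0, 1.0]

# The 3D point is the translation component: first three rows of the last column
point = T[:3, 3]
print(point)  # [0. 0. 1.]
```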

This operation is particularly useful in robotics and vision pipelines for RGB-D processing, 3D reconstruction, picking, and sensor fusion, where converting 2D pixel + depth to 3D camera coordinates is required.

Running the Example

Runnable examples are available in the Telekinesis examples repository. Follow the README in that repository to set up the environment. Once set up, you can run this specific example with:

bash
cd telekinesis-examples
python examples/pupil_examples.py --example project_pixel_to_camera_point

How to Tune the Parameters

The project_pixel_to_camera_point Skill has no tunable parameters; its output is fully determined by the camera calibration data, pixel coordinates, and depth you provide:

camera_intrinsics (no default—required): 3x3 matrix with fx, fy, cx, cy. Obtain from camera calibration.

distortion_coefficients (default: np.array([0.0, 0.0, 0.0, 0.0, 0.0])): Lens distortion coefficients. Use zeros for undistorted models.

pixel (no default—required): 2D pixel (u, v) in image coordinates.

depth (no default—required): Distance along the optical axis. Must be valid and positive for correct 3D position.
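The 5-element distortion vector most likely follows the common (k1, k2, p1, p2, k3) plumb-bob convention used by OpenCV, though this library's exact convention is an assumption here. The sketch below implements that model's forward distortion on normalized coordinates, mainly to show why all-zero coefficients are safe: they make the mapping an identity.

```python
# Plumb-bob (Brown-Conrady) forward distortion of normalized coords (x, y);
# the (k1, k2, p1, p2, k3) layout is assumed, matching the OpenCV convention
def distort(x, y, coeffs):
    k1, k2, p1, p2, k3 = coeffs
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return xd, yd

# With all-zero coefficients the mapping is the identity
print(distort(0.1, -0.2, (0.0, 0.0, 0.0, 0.0, 0.0)))  # (0.1, -0.2)
```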

Where to Use the Skill in a Pipeline

Project Pixel to Camera Point is commonly used in the following pipelines:

  • RGB-D processing - Convert depth image to point cloud
  • 3D reconstruction - Unproject pixels to 3D
  • Picking - Convert 2D pick point + depth to 3D grasp point
  • Sensor fusion - Align image and 3D data
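For the RGB-D case, unprojecting an entire depth image pixel-by-pixel through the Skill would be slow; the same pinhole math vectorizes cleanly in numpy. This is a numpy-only sketch with a synthetic constant-depth image, not a call into the pupil API.

```python
import numpy as np

# Vectorized depth-image -> point-cloud conversion via the pinhole model
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
depth = np.full((480, 640), 2.0)  # synthetic depth image, 2 m everywhere

# Per-pixel (u, v) coordinate grids
v, u = np.indices(depth.shape, dtype=np.float64)
x = (u - cx) * depth / fx
y = (v - cy) * depth / fy
points = np.stack([x, y, depth], axis=-1)  # shape (480, 640, 3)

print(points[240, 320])  # principal-point pixel lands on the optical axis: [0. 0. 2.]
```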

Related skills to build such a pipeline:

  • project_camera_point_to_pixel: Inverse operation
  • project_pixel_to_world_point: Project to world frame
  • project_world_point_to_pixel: World to pixel

Alternative Skills

  • project_pixel_to_world_point: Use when you need the point in world coordinates (requires world_T_camera).
  • project_camera_point_to_pixel: Inverse operation, mapping a 3D camera point back to a pixel.
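Since project_camera_point_to_pixel is the exact inverse, a round trip should recover the original pixel. The sketch below verifies this with the pinhole equations directly (zero distortion assumed), rather than with the library calls.

```python
import numpy as np

# Round trip: pixel + depth -> camera point -> pixel
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
u, v, d = 445.0, 365.0, 2.0

# Unproject (pixel + depth -> 3D camera point)
p = np.array([(u - cx) * d / fx, (v - cy) * d / fy, d])

# Reproject (3D camera point -> pixel)
u2 = fx * p[0] / p[2] + cx
v2 = fy * p[1] / p[2] + cy
print(u2, v2)  # 445.0 365.0
```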

When Not to Use the Skill

Do not use Project Pixel to Camera Point when:

  • You need world coordinates (Use project_pixel_to_world_point)
  • You have a 3D point and need pixel (Use project_camera_point_to_pixel)
  • Depth is invalid or missing (Ensure valid depth before projection)
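Regarding the last point: real depth sensors report zeros or NaNs where no return was measured, and those pixels must be masked out before unprojection. A minimal numpy sketch of such a validity filter:

```python
import numpy as np

# Filter out invalid depth readings (zeros and NaNs mean "no measurement")
depth = np.array([0.0, 1.2, np.nan, 0.8])
valid = np.isfinite(depth) & (depth > 0)
print(valid)  # [False  True False  True]
```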