What is Pose Estimation?

Pose Estimation — AI Glossary

Pose Estimation

Definition A computer vision technique that detects the position and orientation of a person's body parts in images or video, mapping out the skeleton of human poses.

In Depth

Pose estimation identifies the spatial locations of body joints (keypoints) such as shoulders, elbows, wrists, hips, knees, and ankles in images or video. By connecting these keypoints, the system constructs a skeletal representation of a person's pose. This can be done for single or multiple people simultaneously, and in 2D (pixel coordinates) or 3D (real-world coordinates).

Leading approaches include top-down methods (first detecting people with object detection, then estimating each person's pose) and bottom-up methods (detecting all keypoints first, then grouping them into individual people). Models like OpenPose, HRNet, and MediaPipe Pose are widely used. More recent transformer-based approaches have further improved accuracy, especially for occluded or crowded scenes.

Pose estimation powers diverse applications: fitness and sports analysis (tracking form and movement), healthcare rehabilitation (monitoring exercise compliance), gaming and AR (motion capture without specialized equipment), sign language recognition, fall detection for elderly care, and workplace safety monitoring. Its ability to understand human body language and movement makes it one of the most practically useful computer vision capabilities.

Pose Estimation

In Depth

Browse more terms