Our pose refinement process improves the accuracy of 3D part poses in assembly videos. This process is crucial for ensuring physically valid assembly sequences and accurate 3D reconstructions.
- Initial Estimation: We estimate initial poses with the Perspective-n-Point (PnP) algorithm. The resulting poses overlay well on the 2D video frame but are often inaccurate in 3D.
- Issue Identification: By viewing the scene from different angles, particularly side views, we reveal incorrect spatial relationships between parts that aren't apparent from the camera's perspective.
- Refinement Process: We've developed an interactive interface that allows annotators to:
  - Control the virtual camera using axis-aligned controls
  - View the 3D scene from different orthographic perspectives
  - Refine part poses by rotating and translating them in 3D space
  - Compare the real-time 3D view with corresponding video frames
- Relative Pose Accuracy: To improve the accuracy of relative poses, parts that appear together in a video frame are annotated simultaneously, with a visualization of their 3D locations.
- Temporal Smoothness: We initialize part poses with poses from the previous frame to improve the temporal smoothness of part trajectories.
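
To illustrate why a good 2D overlay can hide a 3D error, the following is a minimal NumPy sketch rather than our annotation tool. The camera intrinsics `K`, the 5 cm cube, and the 10 cm depth offset are all hypothetical values chosen for illustration. It compares the two poses in the camera image and in an orthographic side view.

```python
import numpy as np

def project_perspective(points, K):
    """Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def side_view(points):
    """Orthographic side view: look along the camera x-axis, keep (z, y)."""
    return points[:, [2, 1]]

# Hypothetical intrinsics (focal length 800 px, principal point at image center).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Corners of a hypothetical 5 cm cube standing in for a small part.
cube = 0.025 * np.array(
    [[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
    dtype=float,
)

true_t = np.array([0.0, 0.0, 2.0])  # assumed true position, 2 m from the camera
est_t = np.array([0.0, 0.0, 2.1])   # assumed estimate, 10 cm off along the viewing axis

# Largest per-corner discrepancy in the camera image (pixels).
overlay_error_px = np.linalg.norm(
    project_perspective(cube + true_t, K) - project_perspective(cube + est_t, K),
    axis=1,
).max()

# Largest per-corner discrepancy in the orthographic side view (meters).
side_error_m = np.linalg.norm(
    side_view(cube + true_t) - side_view(cube + est_t), axis=1
).max()

print(f"camera-view error: {overlay_error_px:.2f} px")
print(f"side-view error:   {side_error_m:.2f} m")
```

Under these assumed values the two poses differ by less than one pixel in the camera image, while the side view shows the full 10 cm offset. This is the kind of discrepancy the orthographic views are meant to expose.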
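
The previous-frame initialization can be sketched as a small helper. This is an illustrative sketch under stated assumptions, not the actual implementation: the function name `initial_poses`, the 4x4 matrix pose representation, the part names, and the fallback to the PnP estimate for parts without a previous-frame pose are all hypothetical.

```python
import numpy as np

def initial_poses(part_ids, previous_poses, pnp_poses):
    """Choose a starting pose (4x4 matrix) for each part in the current frame."""
    init = {}
    for part_id in part_ids:
        if part_id in previous_poses:
            # Start from the refined pose of the previous frame.
            init[part_id] = previous_poses[part_id].copy()
        else:
            # Assumption: parts with no previous-frame pose start from PnP.
            init[part_id] = pnp_poses[part_id].copy()
    return init

def translation(x, y, z):
    """Build a 4x4 pose with identity rotation and the given translation."""
    T = np.eye(4)
    T[:3, 3] = (x, y, z)
    return T

# Hypothetical parts: "leg" was refined in the previous frame, "seat" is new.
previous_poses = {"leg": translation(0.1, 0.0, 2.0)}
pnp_poses = {
    "leg": translation(0.1, 0.0, 2.3),
    "seat": translation(0.0, 0.2, 2.1),
}

start = initial_poses(["leg", "seat"], previous_poses, pnp_poses)
```

Here `start["leg"]` is a copy of the previous frame's pose and `start["seat"]` falls back to the PnP estimate, so the annotator only needs to adjust each part by its motion since the last frame.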