Why Bimanual Data Collection Is Harder
In single-arm data collection, a bad demonstration affects only one arm's trajectory. You record 50 demos, discard 5 bad ones, and train on 45. In bimanual data collection, a mistake at the handoff point invalidates both arms' trajectories for that demo simultaneously. The failure modes are coupled.
This coupling has two practical implications. First, you need more demonstrations (100 instead of 50), because bimanual tasks have higher variance and the policy needs more examples to learn the coordination structure. Second, each demonstration must be more consistent. A single-arm demo that's 80% consistent trains reasonably well. A bimanual demo where one arm is consistent and the other varies teaches the policy nothing useful about coordination timing, because the timing relationship between the two arms changes from one demo to the next.
The workspace coverage challenge is also greater: you need both arms in frame, and the handoff point — the highest-complexity moment — must be reliably captured by at least one camera. Check your camera angles before starting and adjust if the handoff occurs outside the workspace camera's field of view.
LeRobot Bimanual Dataset Format
The DK1 integration with LeRobot extends the standard single-arm format with dual joint-state arrays. Each timestep in the dataset contains:
The key difference from single-arm: the action space is 14-dimensional (6 joints per arm plus one gripper per arm, 6 + 6 + 2). ACT handles this natively: you specify the action dimension in the training config and no other changes are required.
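To make the 14-dimensional count concrete, here is a minimal sketch of how a flat action vector could be split back into per-arm parts. The ordering (left joints, left gripper, right joints, right gripper) and all names are assumptions made for illustration; check the actual feature layout of your recorded dataset before relying on any particular order.

```python
# Hypothetical layout, NOT taken from the DK1 or LeRobot documentation:
# [left joints (6), left gripper (1), right joints (6), right gripper (1)]
JOINTS_PER_ARM = 6
GRIPPERS_PER_ARM = 1
ARMS = ("left", "right")


def build_slices():
    """Return (name -> slice map, total dimension) for the flat action vector."""
    slices = {}
    offset = 0
    for arm in ARMS:
        slices[f"{arm}_joints"] = slice(offset, offset + JOINTS_PER_ARM)
        offset += JOINTS_PER_ARM
        slices[f"{arm}_gripper"] = slice(offset, offset + GRIPPERS_PER_ARM)
        offset += GRIPPERS_PER_ARM
    return slices, offset


SLICES, ACTION_DIM = build_slices()  # ACTION_DIM is 6 + 1 + 6 + 1 = 14


def split_action(action):
    """Split one flat action (a sequence of ACTION_DIM numbers) into named parts."""
    if len(action) != ACTION_DIM:
        raise ValueError(f"expected {ACTION_DIM} values, got {len(action)}")
    return {name: list(action[s]) for name, s in SLICES.items()}
```

A helper like this is mainly useful as a sanity check when inspecting recorded episodes: if the length check fails, one of the joint-state arrays is missing or truncated.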
Recording Workflow
Run 10–15 practice demos before starting the recording session to warm up your motor memory for the task. The first 5–10 recorded demos will be your worst — that's expected. Do not stop to review them during the session; review and cull bad demos after the full 100 are recorded.
Quality Checklist for Bimanual Data
Review every demo after recording using LeRobot's replay viewer. Discard any demo that fails two or more of these criteria:
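The discard rule above (drop a demo that fails two or more criteria) can be sketched as a small filter over your review notes. The criterion names used in the example are placeholders, not the actual checklist items, and the review-notes structure is an assumption; this is bookkeeping you would do yourself, not a LeRobot feature.

```python
# A demo is discarded when it fails two or more criteria,
# so at most one failed criterion is tolerated.
MAX_FAILED_CRITERIA = 1


def keep_demo(results):
    """results maps a criterion name to True (passed) or False (failed)."""
    failed = sum(1 for passed in results.values() if not passed)
    return failed <= MAX_FAILED_CRITERIA


def cull(reviews):
    """reviews maps an episode index to its results dict.

    Returns the sorted list of episode indices to keep.
    """
    return sorted(ep for ep, results in reviews.items() if keep_demo(results))


# Placeholder criterion names, for illustration only.
example_reviews = {
    0: {"criterion_a": True, "criterion_b": True, "criterion_c": True},
    1: {"criterion_a": False, "criterion_b": True, "criterion_c": True},
    2: {"criterion_a": False, "criterion_b": False, "criterion_c": True},
}
kept = cull(example_reviews)  # episodes 0 and 1 are kept, episode 2 is discarded
```

Recording the per-criterion result for each episode, rather than just a keep/discard flag, also shows which criterion fails most often, which is useful if you need to re-record.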
Unit 4 Complete When...
You have 100 recorded demonstrations in LeRobot format at ~/dk1-datasets/cube-handoff-v1/. After reviewing and culling, at least 90 demos pass the quality checklist. Both joint state arrays are present at 50Hz for every episode. Both camera feeds are present and show the full task sequence including the handoff moment. You have run python -m lerobot.scripts.visualize_dataset --repo-id cube-handoff-v1 and confirmed the dataset structure is valid.
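Two of the completion criteria above are easy to check numerically: the number of demos that pass review, and the 50Hz sampling rate. The sketch below assumes you have already extracted per-episode timestamps (in seconds) into plain Python lists; how you load them depends on your dataset tooling and is not shown. The function names and the 5% tolerance are illustrative choices, not part of LeRobot.

```python
EXPECTED_HZ = 50   # sampling rate required by the unit
MIN_PASSING = 90   # minimum number of demos that must pass the checklist


def is_sampled_at(timestamps, hz=EXPECTED_HZ, rel_tol=0.05):
    """True if consecutive timestamps (seconds) are spaced about 1/hz apart.

    rel_tol is the allowed deviation as a fraction of the nominal period.
    """
    if len(timestamps) < 2:
        return False
    period = 1.0 / hz
    return all(
        abs((later - earlier) - period) <= rel_tol * period
        for earlier, later in zip(timestamps, timestamps[1:])
    )


def unit_complete(num_passing, episode_timestamps):
    """Check the pass count and the sampling rate of every episode."""
    return (
        num_passing >= MIN_PASSING
        and len(episode_timestamps) > 0
        and all(is_sampled_at(ts) for ts in episode_timestamps)
    )
```

This does not replace inspecting the camera feeds by eye; it only catches dropped frames or a misconfigured recording rate.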