Diffusion Policy for ALOHA TransferCube Task (Baseline)

⚠️ Note: This model underperforms ACT on this task. It is published for comparison purposes only.

A Diffusion Policy model trained on the ALOHA simulation TransferCube task. It is published as a baseline for comparison: in these runs, ACT reached a 42% success rate on TransferCube, versus about 10% for Diffusion Policy.

Key Finding

Model	Steps	Success Rate	Parameters
ACT	60K	42%	52M
Diffusion Policy	200K	10%	~100M

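Conclusion: On this task, ACT reaches a higher success rate with fewer training steps and fewer parameters, so it is the recommended starting point for ALOHA tasks.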

Model Description

Property	Value
Architecture	Diffusion Policy
Parameters	~100M
Task	ALOHA TransferCube-v0
Training Steps	200,000
Batch Size	32
Success Rate	~10%

Training Data

Dataset: lerobot/aloha_sim_transfer_cube_human_image
Episodes: 50 human demonstrations
Frames: 20,000
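
As a rough sanity check on the training budget, the figures in this card imply how long each demonstration is and how often training revisits the data. The sketch below uses only numbers stated here (50 episodes, 20,000 frames, batch size 32, 200,000 steps). The variable names are illustrative, and it assumes each training sample corresponds to one dataset frame.

```python
# Figures taken from this model card.
episodes = 50
frames = 20_000
batch_size = 32
training_steps = 200_000

# Average demonstration length in frames.
frames_per_episode = frames / episodes

# Total samples drawn during training, and the equivalent number of
# passes over the dataset (assumes one sample per frame).
samples_seen = batch_size * training_steps
dataset_passes = samples_seen / frames

print(f"frames per episode: {frames_per_episode:.0f}")
print(f"samples seen:       {samples_seen:,}")
print(f"dataset passes:     {dataset_passes:.0f}")
```

This works out to 400 frames per episode and about 320 passes over the dataset.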

Task Description

The TransferCube task requires a bimanual robot to:

Pick up a red cube with the right arm
Transfer the cube to the left gripper


Training Environment

GPU: RTX A6000
Framework: LeRobot 0.4.3
Training Time: Around 12 hours

Usage

Installation

pip install lerobot gym-aloha

Training

lerobot-train \
    --policy.type=diffusion \
    --dataset.repo_id=lerobot/aloha_sim_transfer_cube_human_image \
    --env.type=aloha \
    --env.task=AlohaTransferCube-v0 \
    --batch_size=32 \
    --steps=200000 \
    --eval.n_episodes=10 \
    --eval_freq=20000 \
    --save_freq=20000 \
    --output_dir=./outputs/dp_aloha_transfer_cube \
    --wandb.enable=false \
    --policy.push_to_hub=false

Evaluation

lerobot-eval \
    --policy.path=LeTau/dp_aloha_transfer_cube \
    --env.type=aloha \
    --env.task=AlohaTransferCube-v0 \
    --eval.batch_size=1 \
    --eval.n_episodes=20

Results

Evaluation	Episodes	Success Rate	Avg Sum Reward
Training (100K)	10	10%	23.7
Training (200K)	10	10%	23.3
Independent	20	10%	28.3

Expected success rate: ~10%

Detailed Evaluation Results (Independent)

Sum Rewards: [0.0, 0.0, 253.0, 4.0, 0.0, 0.0, 0.0, 81.0, 21.0, 0.0,
              0.0, 0.0, 0.0, 0.0, 0.0, 207.0, 0.0, 0.0, 0.0, 0.0]

Successes: 2/20 episodes
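
With only 20 episodes, the measured success rate carries wide uncertainty. The sketch below recomputes the average sum reward from the list above and attaches a 95% Wilson score interval to the 2/20 success count. The choice of the Wilson interval and the helper name `wilson_interval` are assumptions made for illustration, not part of the LeRobot evaluation output.

```python
import math

# Per-episode sum rewards from the independent evaluation above.
sum_rewards = [0.0, 0.0, 253.0, 4.0, 0.0, 0.0, 0.0, 81.0, 21.0, 0.0,
               0.0, 0.0, 0.0, 0.0, 0.0, 207.0, 0.0, 0.0, 0.0, 0.0]
successes = 2  # reported above as 2/20


def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for k successes in n trials (z=1.96 is ~95%)."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


n = len(sum_rewards)
avg_sum_reward = sum(sum_rewards) / n
low, high = wilson_interval(successes, n)

print(f"avg sum reward: {avg_sum_reward:.1f}")
print(f"success rate:   {successes / n:.0%} (95% CI {low:.1%} to {high:.1%})")
```

The interval spans roughly 3% to 30%, so the 10% figure is best read as an estimate rather than a precise rate.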

Why Does Diffusion Policy Underperform?

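ACT is designed for ALOHA: ACT was introduced together with the ALOHA hardware and was developed specifically for its bimanual manipulation tasks
Data efficiency: Diffusion Policy may need more than the 50 demonstrations used here to learn effectively
Task characteristics: TransferCube appears to reward precise, near-deterministic motions, so Diffusion Policy's ability to model multi-modal action distributions may offer little benefit on this task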
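
To make the distinction in the last point concrete, the toy sketch below contrasts a deterministic predictor that outputs the mean of the demonstrated actions with a sampler that draws one of the demonstrated actions. It is purely illustrative: the one-dimensional "actions" and the helper names are hypothetical, and it models neither ACT nor Diffusion Policy.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical 1-D demonstrations for a single observation.
# Multi-modal: half the demonstrators go left (-1), half go right (+1).
multimodal_demos = [-1.0] * 50 + [1.0] * 50
# Unimodal: every demonstrator does nearly the same thing.
unimodal_demos = [0.5 + random.gauss(0.0, 0.01) for _ in range(100)]


def mean_prediction(demos):
    """Output that minimizes squared error: the average demonstrated action."""
    return mean(demos)


def sampled_prediction(demos):
    """Stand-in for a generative policy: draw one demonstrated action."""
    return random.choice(demos)


for name, demos in [("multi-modal", multimodal_demos),
                    ("unimodal", unimodal_demos)]:
    print(name,
          "mean:", round(mean_prediction(demos), 3),
          "sample:", round(sampled_prediction(demos), 3))
```

In the multi-modal case the mean matches neither demonstrated mode, which is the failure a sampling-based policy avoids. In the unimodal case the two approaches agree closely, so that advantage largely disappears.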

Recommendation

For ALOHA bimanual tasks, use ACT instead:

LeTau/act_aloha_transfer_cube - 42% success rate

Citation

@article{zhao2023learning,
  title={Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware},
  author={Zhao, Tony Z and Kumar, Vikash and Levine, Sergey and Finn, Chelsea},
  journal={arXiv preprint arXiv:2304.13705},
  year={2023}
}

@article{chi2023diffusion,
  title={Diffusion Policy: Visuomotor Policy Learning via Action Diffusion},
  author={Chi, Cheng and Feng, Siyuan and Du, Yilun and Xu, Zhenjia and Cousineau, Eric and Burchfiel, Benjamin and Song, Shuran},
  journal={arXiv preprint arXiv:2303.04137},
  year={2023}
}

Acknowledgments

LeRobot framework by Hugging Face
ALOHA project by Stanford University
Diffusion Policy project by Columbia University
