pi05_droid_jointpos (lerobot PyTorch port)

This checkpoint is OpenPI's pi05_droid_jointpos (π0.5 fine-tuned on DROID, joint-position action space) converted to the lerobot PI05Policy (PyTorch) format.

  • Source: gs://openpi-assets-simeval/pi05_droid_jointpos (JAX/orbax), the official OpenPI release.
  • Conversion: openpi/examples/convert_jax_model_to_pytorch.py (JAX→PyTorch), then assembled with the lerobot config and processors.
  • Verified: RoboLab (Isaac Lab) BananaInBowlTask 8-env server-client eval → 7/8 (on par with the 6/8 of the openpi JAX baseline using the same weights).

Files

file | contents
config.json | PI05Config: max_state_dim=8, chunk_size=15, dtype=bfloat16, STATE/ACTION QUANTILES
model.safetensors | 812 keys (Gemma embed_tokens is tied to lm_head)
policy_preprocessor.json (+ *_normalizer_processor.safetensors) | rename → batch → DROID quantile normalize → state discretize+tokenize → device
policy_postprocessor.json (+ *_unnormalizer_processor.safetensors) | DROID quantile unnormalize
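
As a quick sanity check on a downloaded copy, the values listed for config.json can be compared against the parsed file. This is a minimal sketch: it assumes the three settings are top-level JSON keys with exactly these names, and check_config is a hypothetical helper, not part of lerobot.

```python
import json

# Values listed for config.json in the Files table (assumed to be top-level keys).
EXPECTED = {"max_state_dim": 8, "chunk_size": 15, "dtype": "bfloat16"}


def check_config(cfg: dict) -> list[str]:
    """Return human-readable mismatches; an empty list means all values match."""
    problems = []
    for key, want in EXPECTED.items():
        got = cfg.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    return problems


# In-memory stand-in for a config that was saved with the pi05 default of 32.
sample = json.loads('{"max_state_dim": 32, "chunk_size": 15, "dtype": "bfloat16"}')
print(check_config(sample))  # ['max_state_dim: expected 8, got 32']
```

To check a real file, pass json.load(open(path)) for a local config.json instead of the sample dictionary.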

Usage

from lerobot.policies.pi05.modeling_pi05 import PI05Policy
from lerobot.policies.factory import make_pre_post_processors

repo = "DAVIAN-Robotics/pi05_droid_jointpos"
policy = PI05Policy.from_pretrained(repo).eval()
preprocessor, postprocessor = make_pre_post_processors(policy.config, pretrained_path=repo)

DROID I/O adapter (required)

This checkpoint is self-contained only for the lerobot-native stages (normalize / tokenize / model / unnormalize). The DROID-specific input/output transforms are OpenPI droid_policy logic and do not exist in lerobot, so an external adapter is required:

  • Input: observation.state = concat(joint_position[7], gripper_position[1]) (raw; the processor normalizes it). For images, supply only two, observation.images.base_0_rgb (exterior) and observation.images.left_wrist_0_rgb (wrist); the model automatically pads the third camera with -1. (Images are [0,1] float, CHW.)
  • Output: the model emits joint deltas (dims 0 to 6), so add the current joint positions to make them absolute (AbsoluteActions, mask = 7×True + 1×False), then use only action[:, :8] (7 joints + 1 gripper).
  • Reference implementation: sft/scripts/pi05_lerobot_server/ (droid_glue.py + policy.py), an OpenPI websocket-protocol server that the RoboLab pi0_family client can connect to as-is.
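
The two adapter steps above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the reference droid_glue.py: the function names are hypothetical, camera frames are assumed to arrive as HWC uint8, the "task" key for the language prompt is an assumption, and batching and tensor conversion are left to the caller.

```python
import numpy as np

# Joints (dims 0 to 6) are deltas; the gripper (dim 7) is already absolute.
ABS_MASK = np.array([True] * 7 + [False])


def build_observation(joint_position, gripper_position, exterior_img, wrist_img, prompt):
    """Pack raw DROID readings into the observation keys named above."""
    state = np.concatenate([
        np.asarray(joint_position, dtype=np.float32).reshape(7),
        np.asarray(gripper_position, dtype=np.float32).reshape(1),
    ])

    def to_chw(img):
        # Assumed input: HWC uint8. Output: CHW float32 in [0, 1].
        return np.transpose(np.asarray(img).astype(np.float32) / 255.0, (2, 0, 1))

    return {
        "observation.state": state,  # raw; the processor normalizes it
        "observation.images.base_0_rgb": to_chw(exterior_img),
        "observation.images.left_wrist_0_rgb": to_chw(wrist_img),
        "task": prompt,  # assumed key for the language instruction
    }


def to_absolute(action_chunk, current_state):
    """Turn an unnormalized (T, A >= 8) chunk into absolute (T, 8) actions."""
    state = np.asarray(current_state, dtype=np.float32)
    actions = np.array(action_chunk, dtype=np.float32)[:, :8]  # works on a copy
    actions[:, ABS_MASK] += state[ABS_MASK]  # add current joints to the 7 deltas
    return actions
```

Here to_absolute would be applied to the postprocessor output, using the same raw state that was fed to the model for that step.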

Caveats

  • max_state_dim=8: pi05 discretizes the normalized state into 256 bins and inserts it into the prompt. The actual DROID state is 8-dimensional (7 joints + 1 gripper), so only 8 values must be inserted to match openpi. Leaving the pi05 default (32) adds 24 surplus tokens that contaminate the conditioning and degrade performance (most visibly, the gripper closes too early). This repo sets max_state_dim=8 explicitly in both config.json and policy_preprocessor.json (prepare-state step), so standard loading works correctly. (lerobot's default save does not serialize this value for the prepare step, so it was added by hand; if you build the checkpoint yourself, set it explicitly in the same way.)
  • Do not confuse this with RLinf/RLinf-Pi05-Polaris-droid_jointpos: the param key structure and norm_stats are the same, but those are different, RL-fine-tuned weights. This repo is the conversion of the official OpenPI pi05_droid_jointpos.
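
The effect of max_state_dim on the prompt can be illustrated with a small NumPy sketch. It assumes the state is zero-padded up to max_state_dim and binned uniformly over [-1, 1] with np.digitize, which mirrors the 256-bin scheme described above but is not copied from the lerobot or openpi source. discretize_state is a hypothetical name, and inputs are assumed to already lie in [-1, 1].

```python
import numpy as np


def discretize_state(norm_state, max_state_dim, n_bins=256):
    """Zero-pad a normalized state to max_state_dim and map each value to a bin index."""
    state = np.asarray(norm_state, dtype=np.float32)
    padded = np.zeros(max_state_dim, dtype=np.float32)
    padded[: state.shape[0]] = state
    # Left edges of n_bins uniform bins covering [-1, 1].
    edges = np.linspace(-1.0, 1.0, n_bins + 1)[:-1]
    return np.digitize(padded, edges) - 1


state = np.linspace(-1.0, 1.0, 8)        # stand-in for a normalized DROID state
tokens_8 = discretize_state(state, 8)    # 8 state tokens
tokens_32 = discretize_state(state, 32)  # same 8 tokens plus 24 padding tokens
print(len(tokens_8), len(tokens_32))     # 8 32
```

With max_state_dim=32, the 24 extra entries all come from padding rather than from the robot, which is the surplus conditioning the first caveat warns about.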

License / attribution

The weights are derived from pi05_droid_jointpos by OpenPI (Physical Intelligence). Terms of use follow the upstream (openpi) license.
