HaozhiQi
diff --git a/LICENSE b/LICENSE
Lines changed: 20 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 124 additions & 0 deletions
diff --git a/assets/ball.urdf b/assets/ball.urdf
Lines changed: 21 additions & 0 deletions
diff --git a/assets/cube.urdf b/assets/cube.urdf
Lines changed: 23 additions & 0 deletions
diff --git a/assets/cube_multicolor.mtl b/assets/cube_multicolor.mtl
Lines changed: 59 additions & 0 deletions
diff --git a/assets/cube_multicolor.obj b/assets/cube_multicolor.obj
Lines changed: 36 additions & 0 deletions
diff --git a/assets/cylinder.urdf b/assets/cylinder.urdf
Lines changed: 23 additions & 0 deletions
diff --git a/assets/cylinder/pencil-5-7/0000.npy b/assets/cylinder/pencil-5-7/0000.npy
152 Bytes
diff --git a/assets/cylinder/pencil-5-7/0000.urdf b/assets/cylinder/pencil-5-7/0000.urdf
Lines changed: 22 additions & 0 deletions
diff --git a/assets/cylinder/pencil-5-7/0001.npy b/assets/cylinder/pencil-5-7/0001.npy
152 Bytes
@@ -0,0 +1,20 @@
+Copyright (c) 2024 Authors of Lessons from Learning to Spin “Pens”
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,124 @@
+# Lessons from Learning to Spin “Pens”
+
+<p align="center">
+  <img src="assets/teaser.gif" width="1000"/>
+</p>
+
+This repository contains a reference PyTorch implementation of the paper:
+
+<b>Lessons from Learning to Spin “Pens”</b> <br>
+[Jun Wang*](https://wang59695487.github.io/),
+[Ying Yuan*](https://yingyuan0414.github.io/),
+[Haichuan Che*](https://www.linkedin.com/in/haichuan-che-7338721b1/),
+[Haozhi Qi*](https://haozhi.io/),
+[Yi Ma](http://people.eecs.berkeley.edu/~yima/),
+[Jitendra Malik](https://people.eecs.berkeley.edu/~malik/),
+[Xiaolong Wang](https://xiaolonw.github.io/) <br>
+[[Website](https://penspin.github.io/)]
+
+## Installation
+
+See [installation instructions](docs/install.md).
+
+## Introduction
+
+Our pen-spinning method consists of the following four steps:
+1. Learn an oracle policy with RL in simulation, using privileged information, point clouds, and tactile sensor output.
+2. Learn a student policy from rollouts of the oracle policy, also in simulation.
+3. Roll out trajectories generated by the oracle policy on a real robot, with the initial state distribution matched. Successful trajectories are collected and failures are discarded.
+4. Fine-tune the student policy from step 2 with the successful real-world trajectories.
+
+The following sections only provide example scripts for our method. For baselines, check out [baselines](docs/baseline.md).
+
+## Step 0: Visualize a Pre-trained Oracle Policy
+
+```
+cd outputs/AllegroHandHora
+gdown 1LCRFE6lvKSUDPpUfEATOmpDUPDbB7n8d
+unzip demo.zip -d ./
+cd ../../
+scripts/vis_teacher.sh demo
+```
+
+
+## Step 1: Oracle Policy Training
+
+To train an oracle policy $f$ with RL, run
+
+```
+# 0 is the GPU id
+# 42 is experiment seed
+scripts/train_teacher.sh 0 42 output_name
+```
+
+After training your oracle policy, you can visualize it as follows:
+```
+scripts/vis_teacher.sh output_name
+```
+
+## Step 2: Student Policy Pretraining
+
+In this section, we train a proprioceptive student policy by distilling from our trained oracle policy $f$.
+
+Note that we train the student policy on teacher rollouts, in contrast to the DAgger approach used in previous work.
+
+```
+scripts/train_student_sim.sh train.ppo.is_demon=True train.demon_path=ORACLE_CHECKPOINT_PATH 
+```
+A reference teacher checkpoint is available on [Google Drive](https://drive.google.com/file/d/1LCRFE6lvKSUDPpUfEATOmpDUPDbB7n8d/view?usp=sharing).
+
+## Step 3: Open-Loop Replay on Real Hardware
+
+To generate open-loop replay data for the student policy $\pi$, run
+```
+python real/robot_controller/teacher_replay.py --data-collect --exp=0 --replay_data_dir=REPLAY_DATA_DIR
+```
+where `REPLAY_DATA_DIR` is the directory to save the replay data.
+
+Then process the replay data.
+
+## Step 4: Real-world Fine-tuning
+
+To fine-tune the student policy $\pi$ using real data, run
+```
+scripts/finetune_ppo.sh --real-dataset-folder=REAL_DATA_PATH --checkpoint-path=YOUR_CHECKPOINTPATH
+```
+
+## Real Data Download
+Please download the real reference data from [Google Drive](https://drive.google.com/drive/folders/1TAMAvqLp3b5vEmdyrdcgW0kBW1GAxoyy?usp=sharing).
+```
+Real data:
+  real_data.h5 is an HDF5 file that contains the following keys:
+  - replay_demon_{idx}: the idx-th replay demonstration data
+    - qpos: the current qpos of the robot
+    - action: the delta action applied to the robot
+    - current_target_qpos: the target qpos of the robot
+
+  real_data_full.h5 is the full version of real_data.h5 and contains the following keys:
+  - replay_demon_{idx}: the idx-th replay demonstration data
+    - qpos: the current qpos of the robot
+    - action: the delta action applied to the robot
+    - current_target_qpos: the target qpos of the robot
+    - rgb_ori: the original rgb image
+    - rgb_c2d: the rgb image after camera2depth image processing
+    - depth: the depth image
+    - pc: the point cloud
+    - obj_ends: the position of object ends 
+```
+
+## Acknowledgement
+
+This repository is built on [Hora](https://github.com/HaozhiQi/hora) and [IsaacGymEnvs](https://github.com/isaac-sim/IsaacGymEnvs).
+
+## Citing
+
+If you find **PenSpin** or this codebase helpful in your research, please consider citing:
+
+```
+@article{wang2024penspin,
+  author={Wang, Jun and Yuan, Ying and Che, Haichuan and Qi, Haozhi and Ma, Yi and Malik, Jitendra and Wang, Xiaolong},
+  title={Lessons from Learning to Spin “Pens”},
+  journal={},
+  year={2024}
+}
+```
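The `real_data.h5` layout described in the README's "Real Data Download" section can be inspected with a few lines of Python. The following is a minimal sketch, not code from the repository: it assumes the file opens with the third-party `h5py` package and that each demonstration group's name literally starts with `replay_demon_`. The helper names `summarize_replays` and `load_summary` are hypothetical.

```python
def summarize_replays(store, prefix="replay_demon_"):
    """Map each demonstration group to the shapes of its datasets.

    `store` is any mapping of group name -> mapping of dataset name -> array-like,
    which is how an open h5py.File behaves. The prefix is an assumption taken
    from the README's key description.
    """
    summary = {}
    for name in store:
        if not name.startswith(prefix):
            continue  # skip anything that is not a replay demonstration
        group = store[name]
        # Datasets (and NumPy arrays) expose .shape; fall back to None otherwise.
        summary[name] = {key: getattr(group[key], "shape", None) for key in group}
    return summary


def load_summary(path):
    """Open an HDF5 file read-only and summarize its replay demonstrations."""
    import h5py  # third-party; imported lazily so the helper above has no dependencies

    with h5py.File(path, "r") as f:
        return summarize_replays(f)
```

Calling `load_summary("real_data.h5")` should list `qpos`, `action`, and `current_target_qpos` for every demonstration if the file follows the documented layout; the same call on `real_data_full.h5` would additionally show the image and point-cloud keys.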
@@ -0,0 +1,21 @@
+<?xml version="1.0"?>
+<robot name="ball">
+  <link name="ball">
+    <visual>
+      <origin xyz="0 0 0"/>
+      <geometry>
+        <mesh filename="ycb/056_tennis_ball/google_16k/textured.obj" scale="1.1934954497985975 1.1934954497985975 1.1934954497985975"/>
+      </geometry>
+    </visual>
+    <collision>
+      <origin xyz="0 0 0"/>
+      <geometry>
+        <sphere radius="0.04"/>
+      </geometry>
+    </collision>
+    <inertial>
+      <mass value="0.05" />
+      <inertia ixx="0.0001" ixy="0.0" ixz="0.0" iyy="0.0001" iyz="0.0" izz="0.0001"/>
+    </inertial>
+  </link>
+</robot>
@@ -0,0 +1,23 @@
+<?xml version="1.0"?>
+<robot name="object">
+  <link name="object">
+    <visual>
+      <origin xyz="0 0 0"/>
+      <geometry>
+        <mesh filename="cube_multicolor.obj" scale="0.08 0.08 0.08"/>
+      </geometry>
+    </visual>
+
+    <collision>
+      <origin xyz="0 0 0"/>
+      <geometry>
+        <box size="0.08 0.08 0.08"/>
+      </geometry>
+    </collision>
+
+    <inertial>
+      <mass value="0.05" />
+      <inertia ixx="0.0001" ixy="0.0" ixz="0.0" iyy="0.0001" iyz="0.0" izz="0.0001"/>
+    </inertial>
+  </link>
+</robot>
@@ -0,0 +1,59 @@
+newmtl red
+Ns 10.0
+Ka 1.0 1.0 1.0
+Kd 1.0 0.0 0.0
+Ks 0.125 0.125 0.125
+Ke 0.0 0.0 0.0
+Ni 1.0
+d 1.0
+illum 2
+
+newmtl green
+Ns 10.0
+Ka 1.0 1.0 1.0
+Kd 0.0 1.0 0.0
+Ks 0.125 0.125 0.125
+Ke 0.0 0.0 0.0
+Ni 1.0
+d 1.0
+illum 2
+
+newmtl blue
+Ns 10.0
+Ka 1.0 1.0 1.0
+Kd 0.0 0.0 1.0
+Ks 0.125 0.125 0.125
+Ke 0.0 0.0 0.0
+Ni 1.0
+d 1.0
+illum 2
+
+newmtl yellow
+Ns 10.0
+Ka 1.0 1.0 1.0
+Kd 1.0 1.0 0.0
+Ks 0.125 0.125 0.125
+Ke 0.0 0.0 0.0
+Ni 1.0
+d 1.0
+illum 2
+
+newmtl cyan
+Ns 10.0
+Ka 1.0 1.0 1.0
+Kd 0.0 1.0 1.0
+Ks 0.125 0.125 0.125
+Ke 0.0 0.0 0.0
+Ni 1.0
+d 1.0
+illum 2
+
+newmtl white
+Ns 10.0
+Ka 1.0 1.0 1.0
+Kd 1.0 1.0 1.0
+Ks 0.125 0.125 0.125
+Ke 0.0 0.0 0.0
+Ni 1.0
+d 1.0
+illum 2
@@ -0,0 +1,36 @@
+mtllib cube_multicolor.mtl
+
+v  -0.5  -0.5  -0.5
+v  -0.5  -0.5  0.5
+v  -0.5  0.5  -0.5
+v  -0.5  0.5  0.5
+v  0.5  -0.5  -0.5
+v  0.5  -0.5  0.5
+v  0.5  0.5  -0.5
+v  0.5  0.5  0.5
+
+vn  -0.5  -0.5  0.5
+vn  -0.5  -0.5 -0.5
+vn  -0.5  0.5  -0.5
+vn  -0.5 -0.5  -0.5
+vn  0.5  -0.5  -0.5
+vn -0.5  -0.5  -0.5
+
+usemtl red
+f  1//2  7//2  5//2
+f  1//2  3//2  7//2
+usemtl green
+f  1//6  4//6  3//6
+f  1//6  2//6  4//6
+usemtl blue
+f  3//3  8//3  7//3
+f  3//3  4//3  8//3
+usemtl yellow
+f  5//5  7//5  8//5
+f  5//5  8//5  6//5
+usemtl cyan
+f  1//4  5//4  6//4
+f  1//4  6//4  2//4
+usemtl white
+f  2//1  6//1  8//1
+f  2//1  8//1  4//1
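The face records in `cube_multicolor.obj` use the `vertex//normal` index form, with 1-based indices and one `usemtl` line per colored side. As a rough illustration of how such a file can be read, the sketch below parses vertices and faces with the standard library only and recomputes each triangle's geometric normal from its vertex positions, independently of the file's `vn` entries. The function names and the embedded excerpt (the eight vertices and the two red triangles) are chosen for illustration and are not part of the repository.

```python
CUBE_OBJ_EXCERPT = """\
v  -0.5  -0.5  -0.5
v  -0.5  -0.5  0.5
v  -0.5  0.5  -0.5
v  -0.5  0.5  0.5
v  0.5  -0.5  -0.5
v  0.5  -0.5  0.5
v  0.5  0.5  -0.5
v  0.5  0.5  0.5
usemtl red
f  1//2  7//2  5//2
f  1//2  3//2  7//2
"""


def parse_obj(text):
    """Return (vertices, faces); each face is (material, zero-based vertex indices)."""
    vertices, faces, material = [], [], None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "usemtl":
            material = parts[1]
        elif parts[0] == "f":
            # "1//2" means vertex 1, no texture coordinate, normal 2.
            faces.append((material, [int(p.split("/")[0]) - 1 for p in parts[1:]]))
    return vertices, faces


def face_normal(vertices, idx):
    """Unit normal of a triangle from the cross product of two of its edges."""
    a, b, c = (vertices[i] for i in idx[:3])
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = sum(x * x for x in n) ** 0.5
    return tuple(x / length for x in n)


if __name__ == "__main__":
    verts, tris = parse_obj(CUBE_OBJ_EXCERPT)
    for mat, idx in tris:
        print(mat, face_normal(verts, idx))
```

Both red triangles lie in the plane z = -0.5, so their recomputed normals point along the negative z axis.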
@@ -0,0 +1,23 @@
+<?xml version="1.0"?>
+<robot name="object">
+  <link name="object">
+    <visual>
+      <origin xyz="0 0 0"/>
+      <geometry>
+        <cylinder radius="0.04" length="0.08"/>
+      </geometry>
+    </visual>
+
+    <collision>
+      <origin xyz="0 0 0"/>
+      <geometry>
+        <cylinder radius="0.04" length="0.08"/>
+      </geometry>
+    </collision>
+
+    <inertial>
+      <mass value="0.05" />
+      <inertia ixx="0.0001" ixy="0.0" ixz="0.0" iyy="0.0001" iyz="0.0" izz="0.0001"/>
+    </inertial>
+  </link>
+</robot>
@@ -0,0 +1,22 @@
+<robot name="object">
+  <link name="object">
+    <visual>
+      <origin xyz="0 0 0" />
+      <geometry>
+        <cylinder radius="0.04" length="0.4" />
+      </geometry>
+    </visual>
+
+    <collision>
+      <origin xyz="0 0 0" />
+      <geometry>
+        <cylinder radius="0.04" length="0.4" />
+      </geometry>
+    </collision>
+
+    <inertial>
+      <mass value="0.05" />
+      <inertia ixx="0.0001" ixy="0.0" ixz="0.0" iyy="0.0001" iyz="0.0" izz="0.0001" />
+    </inertial>
+  </link>
+</robot>
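The object URDFs above all share one shape: a single link with `visual`, `collision`, and `inertial` children. To check what a generated variant such as `pencil-5-7/0000.urdf` contains, the collision primitive and mass can be read with Python's standard `xml.etree.ElementTree`. This is a minimal sketch under that single-link assumption; `describe_urdf` is a hypothetical helper, and the embedded string is a shortened copy of the pencil URDF.

```python
import xml.etree.ElementTree as ET

PENCIL_URDF = """<robot name="object">
  <link name="object">
    <collision>
      <origin xyz="0 0 0" />
      <geometry>
        <cylinder radius="0.04" length="0.4" />
      </geometry>
    </collision>
    <inertial>
      <mass value="0.05" />
    </inertial>
  </link>
</robot>"""


def describe_urdf(urdf_text):
    """List each link's collision primitive, its raw attributes, and its mass."""
    root = ET.fromstring(urdf_text)
    described = []
    for link in root.iter("link"):
        shape = link.find("collision/geometry/*")  # first child of <geometry>
        mass = link.find("inertial/mass")
        described.append({
            "link": link.get("name"),
            "shape": shape.tag if shape is not None else None,
            # Attributes stay as strings: a box "size" holds three numbers,
            # and a mesh "filename" is not numeric at all.
            "params": dict(shape.attrib) if shape is not None else {},
            "mass": float(mass.get("value")) if mass is not None else None,
        })
    return described


if __name__ == "__main__":
    for entry in describe_urdf(PENCIL_URDF):
        print(entry)
```

The same helper should work on `ball.urdf`, `cube.urdf`, and `cylinder.urdf`, where it would report a `sphere`, `box`, and `cylinder` collision shape respectively.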