Add configs and adapt exporter for RSL-RL distillation #2182
Signed-off-by: Mayank Mittal <[email protected]>
Hey! Thanks a lot for this feature. I actually needed something like this, and you already had it covered. Much appreciated! I had a quick question: how do you plan to integrate the teacher-student setup within the broader IsaacLab framework? At the moment, I've defined two separate environments, one for the teacher and one for the student. The only real difference between them is the observation space.
Does this align with your intended setup? When using the student though, I always need to initialize the teacher first. So I was thinking of adding a This would be different from

env config:

@configclass
class G1FlatGaitRewardTeacherCfg(G1FlatCfg):
    observations: TeacherObservationsCfg = TeacherObservationsCfg()
    rewards: G1GaitRewards = G1GaitRewards()
    commands: CommandsCfg = CommandsCfg()

    def __post_init__(self):
        # post init of parent
        super().__post_init__()
        # disable the feet air time reward and track the gaited velocity command instead
        self.rewards.feet_air_time = None
        self.rewards.track_lin_vel_xy_exp.params["command_name"] = "gaited_base_velocity"
        self.rewards.track_ang_vel_z_exp.params["command_name"] = "gaited_base_velocity"
        # observation terms (order preserved)


@configclass
class G1FlatGaitRewardStudentCfg(G1FlatGaitRewardTeacherCfg):
    # the student differs from the teacher only in its observation space
    observations: StudentObservationsCfg = StudentObservationsCfg()

    def __post_init__(self):
        # post init of parent
        super().__post_init__()

agent config:

@configclass
class G1FlatStudentGaitRewardPPORunnerCfg(G1RoughPPORunnerCfg):
    def __post_init__(self):
        super().__post_init__()
        self.num_steps_per_env = 64
        self.max_iterations = 300
        self.experiment_name = "g1_flat_gait_student"
        self.teacher_experiment_name = "g1_flat_gait_teacher"
        self.policy = RslRlDistillationStudentTeacherCfg(
            init_noise_std=0.001,
            student_hidden_dims=[256, 128, 128],
            teacher_hidden_dims=[256, 128, 128],
            activation="elu",
        )
        self.algorithm = RslRlDistillationAlgorithmCfg(
            num_learning_epochs=5,
            learning_rate=1e-03,
            # gradient_length is declared as an int in the algorithm config
            gradient_length=2,
        )


@configclass
class G1FlatTeacherGaitRewardPPORunnerCfg(G1RoughPPORunnerCfg):
    def __post_init__(self):
        super().__post_init__()
        self.max_iterations = 30000
        self.experiment_name = "g1_flat_gait_teacher"
        self.policy.actor_hidden_dims = [256, 128, 128]
        self.policy.critic_hidden_dims = [256, 128, 128]

then in the main script adding something like:

if isinstance(agent_cfg.policy, RslRlDistillationStudentTeacherCfg):
    # locate the log directory of the previously trained teacher
    teacher_root_path = os.path.join("logs", "rsl_rl", agent_cfg.teacher_experiment_name)
    teacher_root_path = os.path.abspath(teacher_root_path)
    trained_teacher_path = get_checkpoint_path(
        teacher_root_path, agent_cfg.load_run_teacher, agent_cfg.load_checkpoint_teacher
    )

Let me know what you think!
Hey! Yes, your setup looks good :) However, I don't think you need an attribute for the teacher. Just pass the directory with the
Signed-off-by: Mayank Mittal <[email protected]>
    """The learning rate for the student policy."""

    gradient_length: int = MISSING
    """The number of environment steps the gradient flows back."""
@ClemensSchwarke should this parameter default to 1?
Suggested change:
-    """The number of environment steps the gradient flows back."""
+    """The number of rollout steps for gradient propagation.
+
+    This is useful for sequential training of a recurrent student network.
+    """
Can you please also update the extension.toml and CHANGELOG?
Thank you! Looks good otherwise.
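To make the effect of `gradient_length` from the review thread concrete, here is a minimal sketch in plain Python. It assumes the distillation loss is accumulated over `gradient_length` consecutive rollout steps before each backward pass and optimizer step; that reading is an assumption based on the docstring, not a statement about the RSL-RL implementation. The function name `optimizer_step_indices` is hypothetical, and no gradients are computed; the sketch only shows where updates would fall.

```python
def optimizer_step_indices(num_steps_per_env: int, gradient_length: int) -> list[int]:
    """Return the 1-based rollout steps after which an optimizer step would occur.

    Assumption for illustration: the loss is summed over `gradient_length`
    consecutive steps and then backpropagated, after which the recurrent
    hidden state is detached so gradients do not flow further back.
    """
    if gradient_length < 1:
        raise ValueError("gradient_length must be a positive integer")
    return [step for step in range(1, num_steps_per_env + 1) if step % gradient_length == 0]


print(optimizer_step_indices(8, 1))  # [1, 2, 3, 4, 5, 6, 7, 8]
print(optimizer_step_indices(8, 2))  # [2, 4, 6, 8]
print(optimizer_step_indices(8, 3))  # [3, 6]
```

Under this assumption, `gradient_length=1` optimizes every step in isolation, so a recurrent student would receive no gradient through time, while larger values let the gradient span several steps at the cost of fewer updates per rollout.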
# Description

This PR adds configuration classes for Student-Teacher Distillation and adapts the policy exporters to be able to export student policies.

## Type of change

- Non-breaking change

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Signed-off-by: Mayank Mittal <[email protected]>
Co-authored-by: Mayank Mittal <[email protected]>