Transparent Reward Model

This repository contains the source code for the paper "Learning Transparent Reward Models via Unsupervised Feature Selection" by Daulet Baimukashev, Gokhan Alcan, Kevin Sebastian Luck, and Ville Kyrki (CoRL 2024).

We propose a novel approach to construct compact and transparent reward models from automatically selected state features. These inferred rewards have an explicit form and enable the learning of policies that closely match expert behavior by training standard reinforcement learning algorithms from scratch. We validate our method’s performance in various robotic environments with continuous and high-dimensional state spaces.
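To make "explicit form" concrete, here is a minimal sketch of a reward written as a weighted sum of readable state features. It assumes a reward that is linear in the selected features. The function name `transparent_reward`, the two example features, and the weights are hypothetical and are not taken from this repository.

```python
import numpy as np

def transparent_reward(state, feature_fns, weights):
    """Evaluate a reward that is an explicit weighted sum of state features."""
    # Each feature is a small, human-readable function of the state.
    phi = np.array([f(state) for f in feature_fns], dtype=float)
    return float(weights @ phi)

# Hypothetical features for a 2-D state [position, velocity].
feature_fns = [
    lambda s: s[0] ** 2,  # squared position
    lambda s: s[1] ** 2,  # squared velocity
]
weights = np.array([-1.0, -0.1])  # illustrative weights, not learned values

state = np.array([2.0, 1.0])
r = transparent_reward(state, feature_fns, weights)
print(r)
```

Because every term is a named feature multiplied by a scalar weight, the contribution of each feature to the reward can be read off directly.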

Installation

Software requirements

Python
PyTorch
CUDA
JAX

Install all required packages in a conda environment by running:

conda env create -f environment.yml

Data

The data can be downloaded from this link. Put the data inside the folder data/{env_name}.

Alternatively, the data can be collected by training an RL policy:

sh train_expert.sh

Configuration

Configuration files for all environments are located in src/cfg/.

Training

Reward learning consists of two steps: extracting a feature set and learning the feature weights.
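The two-step structure can be illustrated with a toy sketch. The selection criterion and the weight update used by this repository are not described in this README, so the code below uses stand-ins that are assumptions: variance ranking for the feature-selection step and a single feature-expectation-matching gradient step for the weight-learning step. The function names `select_features` and `feature_matching_step` are hypothetical.

```python
import numpy as np

def select_features(candidates, k):
    """Step 1 stand-in: keep the k candidate features with the highest variance."""
    scores = candidates.var(axis=0)
    top_k = np.argsort(scores)[::-1][:k]
    return np.sort(top_k)

def feature_matching_step(w, expert_feats, policy_feats, lr=0.1):
    """Step 2 stand-in: one gradient step that moves the weights toward
    matching expert and policy feature expectations."""
    grad = expert_feats.mean(axis=0) - policy_feats.mean(axis=0)
    return w + lr * grad

# Toy candidate features evaluated on three expert states (rows).
candidates = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 4.0],
    [1.0, 2.0, 8.0],
])

idx = select_features(candidates, k=2)      # the constant column is dropped
expert_feats = candidates[:, idx]
policy_feats = np.zeros_like(expert_feats)  # placeholder policy rollouts

w = feature_matching_step(np.zeros(len(idx)), expert_feats, policy_feats)
print(idx, w)
```

Keeping the two steps as separate functions mirrors the split between the feature-extraction script and the training scripts described in this section.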

To select important reward features and save symbolic expressions to file, run:

python extract_features.py

To learn the feature weights, run the training scripts following the examples in:

run_experiments.py

To test the trained model, run:

sh test_expert.sh

Citation

@inproceedings{
baimukashev2024learning,
title={Learning Transparent Reward Models via Unsupervised Feature Selection},
author={Daulet Baimukashev and Gokhan Alcan and Kevin Sebastian Luck and Ville Kyrki},
booktitle={8th Annual Conference on Robot Learning},
year={2024},
}
