holarissun
🎯
Focusing

Highlights

  • Pro

Pinned

  1. Prompt-OIRL Public

    Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"

    Python · 39 stars · 6 forks

  2. RewardModelingBeyondBradleyTerry Public

    Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives"

    Python · 34 stars · 1 fork

  3. RewardShifting Public

    Code for the NeurIPS 2022 paper "Exploiting Reward Shifting in Value-Based Deep RL"

    Python · 29 stars · 3 forks

  4. embedding-based-llm-alignment Public

    Codebase for the paper "Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs"

    4 stars · 1 fork

  5. Accountable-Offline-RL Public

    Code for the NeurIPS 2023 paper "Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples"

    Python · 5 stars · 1 fork