    Repositories list

    • Any to Any Modality Model Training Framework. Lean for research and hacking.
      Python
      22513 Updated Oct 16, 2025
    • Fully Open Framework for Democratized Multimodal Training
      Python
      34528140 Updated Oct 15, 2025
    • lmms-eval (Public)
      One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
      Python
      3923.2k2664 Updated Oct 9, 2025
    • .github (Public)
      0100 Updated Sep 29, 2025
    • [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
      Python
      1115840 Updated Sep 26, 2025
    • VideoMMMU (Public)
      Python
      26000 Updated Sep 5, 2025
    • sglang (Public)
      SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
      Python
      3.1k300 Updated Aug 26, 2025
    • MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.
      Python
      1733220 Updated Aug 26, 2025
    • Enjoy the magic of Diffusion models!
      Python
      966000 Updated Aug 23, 2025
    • Deploying High-Performance Lean 4 Server in One Click
      Python
      0801 Updated Aug 14, 2025
    • MGPO (Public)
      High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning
      04940 Updated Jul 23, 2025
    • sae (Public)
      A framework for applying Sparse AutoEncoders to any model
      Python
      14120 Updated Jul 11, 2025
    • Open-source implementation of AlphaEvolve
      Python
      612200 Updated Jun 20, 2025
    • DeepEyes (Public)
      Python
      51300 Updated Jun 16, 2025
    • agent-rl (Public)
      A fork of verl that supports multi-turn tool use and other agentic tasks.
      Python
      43100 Updated Jun 14, 2025
    • Aero-1 (Public)
      Python
      67830 Updated May 4, 2025
    • EgoLife (Public)
      [CVPR 2025] EgoLife: Towards Egocentric Life Assistant
      Python
      1933570 Updated Mar 19, 2025
    • LongVA (Public)
      Long Context Transfer from Language to Vision
      Python
      20394270 Updated Mar 18, 2025
    • A fork to add multimodal model training to open-r1
      Python
      701.4k221 Updated Feb 8, 2025
    • demos (Public)
      Python
      0000 Updated Sep 18, 2024
    • The math library of Lean 4
      Lean
      835000 Updated Aug 7, 2024
    • Otter (Public)
      🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
      Python
      2103.3k622 Updated Mar 5, 2024
    • Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image.
      Python
      2245460 Updated Jul 4, 2023