- Shenzhen, China
- @felix1987_
Starred repositories
Thunder gives you PyTorch models superpowers for training and inference. Unlock out-of-the-box optimizations for performance, memory and parallelism, or roll out your own.
Ring attention implementation with flash attention
LettuceDetect is a hallucination detection framework for RAG applications.
Commonly used components for AI applications (inference interface, processing utils, serving, etc.).
A pure-Rust LLM inference engine (supporting any LLM-based MLLM, such as Spark-TTS), powered by the Candle framework.
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)
[SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs"
A general framework for bridging LLMs and recommendation systems via reinforcement learning. https://arxiv.org/pdf/2503.24289
A new chunking strategy developed by ZeroEntropy for general semantic chunking using Llama-70B.
🐳 Python GPU adds a minimal install of CUDA and cuDNN on top of the official python:3.x-slim base image
✂️ Sentence segmentation with wtpsplit's state-of-the-art Segment any Text (SaT) models
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
A high performance gRPC server on top of Apache Lucene
📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
Official Repo for ACL 2024 "Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering"
Official code for "SearchLM: Language Models Can Self-Incentivize as Search Reasoners"
Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance
High-performance safetensors model loader
Official Implementation of SynthTIGER (Synthetic Text Image Generator), ICDAR 2021