Awesome series for Large Language Models (LLMs)
Name | Parameter Size | Announcement Date | Provider |
---|---|---|---|
Gemma-3 | 1B, 4B, 12B, 27B | March 2025 | Google |
GPT-4.5 | Undisclosed | February 2025 | OpenAI |
Grok‑3 | Undisclosed | February 2025 | xAI |
Gemini-2 | Undisclosed | February 2025 | Google |
DeepSeek-VL2 | 4.5B | February 2025 | DeepSeek |
DeepSeek-R1 | 671B | January 2025 | DeepSeek |
DeepSeek-V3 | 671B | December 2024 | DeepSeek |
o1 | Undisclosed | September 2024 | OpenAI |
Qwen-2.5 | 0.5B, 1.5B, 3B, 7B, 14B, 72B | September 2024 | Alibaba Cloud |
Gemma-2 | 2B, 9B, 27B | June 2024 | Google |
Qwen-2 | 0.5B, 1.5B, 7B, 57B, 72B | June 2024 | Alibaba Cloud |
GPT‑4o | Undisclosed | May 2024 | OpenAI |
Yi‑1.5 | 6B, 9B, 34B | May 2024 | 01.AI |
DeepSeek-V2 | 236B (21B active) | April 2024 | DeepSeek |
Llama-3 | 8B, 70B | April 2024 | Meta |
Gemma-1.1 | 2B, 7B | April 2024 | Google |
DeepSeek-VL | 7B | March 2024 | DeepSeek |
Claude-3 | Undisclosed | March 2024 | Anthropic |
Grok‑1 | 314B | March 2024 | xAI |
DBRX | 132B (36B active) | March 2024 | Databricks |
Gemma | 2B, 7B | February 2024 | Google |
Qwen-1.5 | 0.5B, 1.8B, 4B, 7B, 14B, 72B | February 2024 | Alibaba Cloud |
Qwen‑VL | Undisclosed | January 2024 | Alibaba Cloud |
Phi‑2 | 2.7B | December 2023 | Microsoft |
Gemini | Undisclosed | December 2023 | Google |
Mixtral | 46.7B | December 2023 | Mistral AI |
Grok‑0 | 33B | November 2023 | xAI |
Yi | 6B, 34B | November 2023 | 01.AI |
Zephyr‑7b‑beta | 7B | October 2023 | HuggingFace H4 |
Solar | 10.7B | September 2023 | Upstage |
Mistral | 7.3B | September 2023 | Mistral AI |
Qwen | 1.8B, 7B, 14B, 72B | August 2023 | Alibaba Cloud |
Llama-2 | 7B, 13B, 70B | July 2023 | Meta |
XGen | 7B | July 2023 | Salesforce |
Falcon | 7B, 40B, 180B | June/Sept 2023 | Technology Innovation Institute (UAE) |
MPT | 7B, 30B | May/June 2023 | MosaicML |
LIMA | 65B | May 2023 | Meta AI |
PaLM-2 | 340B | May 2023 | Google |
Vicuna | 7B, 13B, 33B | March 2023 | LMSYS ORG |
Koala | 13B | April 2023 | UC Berkeley |
OpenAssistant | 30B | April 2023 | LAION |
Jurassic‑2 | Undisclosed | April 2023 | AI21 Labs |
Dolly | 6B, 12B | March/April 2023 | Databricks |
BloombergGPT | 50B | March 2023 | Bloomberg |
GPT‑4 | Undisclosed | March 2023 | OpenAI |
Bard | Undisclosed | March 2023 | Google |
Stanford-Alpaca | 7B | March 2023 | Stanford University |
LLaMA | 7B, 13B, 33B, 65B | February 2023 | Meta |
ChatGPT | Undisclosed | November 2022 | OpenAI |
GPT‑3.5 | 175B | November 2022 | OpenAI |
Jurassic‑1 | 178B | November 2022 | AI21 |
Galactica | 120B | November 2022 | Meta |
Sparrow | 70B | September 2022 | DeepMind |
NLLB | 54.5B | July 2022 | Meta |
BLOOM | 176B | July 2022 | BigScience (Hugging Face) |
AlexaTM | 20B | August 2022 | Amazon |
UL2 | 20B | May 2022 | Google |
OPT | 175B | May 2022 | Meta (Facebook) |
PaLM | 540B | April 2022 | Google |
AlphaCode | 41.4B | February 2022 | DeepMind |
Chinchilla | 70B | March 2022 | DeepMind |
GLaM | 1.2T | December 2021 | Google |
Macaw | 11B | October 2021 | Allen Institute for AI |
T0 | 11B | October 2021 | Hugging Face |
Megatron‑Turing-NLG | 530B | January 2022 | Microsoft & NVIDIA |
LaMDA | 137B | January 2022 | Google |
Gopher | 280B | December 2021 | DeepMind |
GPT‑J | 6B | June 2021 | EleutherAI |
GPT‑NeoX-20B | 20B | February 2022 | EleutherAI |
T5 | 60M, 220M, 770M, 3B, 11B | October 2019 | Google |
BERT | 108M, 334M, 1.27B | October 2018 | Google |
- Gemma 3 (1B, 4B, 12B, 27B) - Announced by Google / 2025
- DeepSeek-R1 (671B) - Announced by DeepSeek / 2025
- LLaMA 3 (8B, 70B) - Announced by Meta / 2024
- Gemma 2 (2B, 9B, 27B) - Announced by Google / 2024
- DeepSeek-V2 (238B) - Announced by DeepSeek / 2024
- Mistral 7B - Announced by Mistral AI / 2023
- Solar (10.7B) - Announced by Upstage / 2023
- DBRX (132B) - Announced by Databricks / 2024
- Falcon (7B, 40B, 180B) - Announced by Technology Innovation Institute / 2023
- MPT (7B, 30B) - Announced by MosaicML / 2023
- Dolly (6B, 12B) - Announced by Databricks / 2023
- Phi-2 (2.7B) - Announced by Microsoft / 2023
- GPT-NeoX-20B - Announced by EleutherAI / 2022
- GPT-J (6B) - Announced by EleutherAI / 2021
- Stanford Alpaca (7B) - Announced by Stanford University / 2023
- OpenAssistant (30B) - Announced by LAION / 2023
- Visual ChatGPT - Announced by Microsoft / 2023
- LMOps - General technology from Microsoft for enabling AI capabilities with LLMs and MLLMs.
- GPT 4 (Parameter size unannounced, gpt-4-32k) - Announced by OpenAI / 2023
- ChatGPT (175B) - Announced by OpenAI / 2022
- ChatGPT Plus (175B) - Announced by OpenAI / 2023
- GPT 3.5 (175B, text-davinci-003) - Announced by OpenAI / 2022
- Gemini - Announced by Google Deepmind / 2023
- Bard - Announced by Google / 2023
- Codex (11B) - Announced by OpenAI / 2021
- Sphere - Announced by Meta / 2022. 134M documents split into 906M passages as the web corpus.
- Common Crawl - 3.15B pages and over 380 TiB of data; public and free to use.
- SQuAD 2.0 - 100,000+ question dataset for QA (see the loading sketch after this list).
- Pile - An 825 GiB diverse, open-source language modelling dataset.
- RACE - A large-scale reading comprehension dataset with more than 28,000 passages and nearly 100,000 questions.
- Wikipedia - Wikipedia dataset containing cleaned articles of all languages.
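Most of the datasets listed above are published on the Hugging Face Hub. Below is a minimal loading sketch, assuming the `datasets` library is installed (`pip install datasets`) and using `squad_v2`, the Hub ID for SQuAD 2.0; other datasets follow the same pattern with their own IDs.

```python
# Minimal sketch: load SQuAD 2.0 from the Hugging Face Hub with the `datasets` library.
# Assumption: `pip install datasets`; "squad_v2" is the Hub ID for SQuAD 2.0.
from datasets import load_dataset

squad = load_dataset("squad_v2")       # DatasetDict with "train" and "validation" splits
print(squad["train"].num_rows)         # roughly 130k training examples
example = squad["train"][0]
print(example["question"])             # question text
print(example["answers"]["text"])      # gold answer spans (empty list for unanswerable questions)
```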
Below are key websites and references used for evaluating and comparing large language models (LLMs) and their benchmarks:
- [Chatbot Arena](https://chatbotarena.com/) - A platform for head-to-head evaluations of AI chatbots.
- [LLM Leaderboard 2025 – Verified AI Rankings](https://llm-stats.com/) - Comparative rankings of leading AI models based on quality, price, and performance.
- [Artificial Analysis LLM Leaderboards](https://artificialanalysis.ai/leaderboards/models) - Detailed comparisons across multiple metrics (output speed, latency, context window, etc.).
- [MMLU – Wikipedia](https://en.wikipedia.org/wiki/MMLU) - Information about the Measuring Massive Multitask Language Understanding benchmark.
- [Language Model Benchmark – Wikipedia](https://en.wikipedia.org/wiki/Language_model_benchmark) - Overview of various benchmarks used for evaluating LLM performance.
- Megatron-Turing NLG (530B) - Announced by NVIDIA and Microsoft / 2021
- LaMDA (137B) - Announced by Google / 2021
- GLaM (1.2T) - Announced by Google / 2021
- PaLM (540B) - Announced by Google / 2022
- AlphaCode (41.4B) - Announced by DeepMind / 2022
- Chinchilla (70B) - Announced by DeepMind / 2022
- Sparrow (70B) - Announced by DeepMind / 2022
- NLLB (54.5B) - Announced by Meta / 2022
- LLaMA (65B) - Announced by Meta / 2023
- AlexaTM (20B) - Announced by Amazon / 2022
- Gopher (280B) - Announced by DeepMind / 2021
- Galactica (120B) - Announced by Meta / 2022
- PaLM2 Tech Report - Announced by Google / 2023
- LIMA - Announced by Meta / 2023
- DeepSeek-R1 (671B) - Announced by DeepSeek-AI / 2025
- Llama 2 (70B) - Announced by Meta / 2023
- Luminous (13B) - Announced by Aleph Alpha / 2021
- Turing NLG (17B) - Announced by Microsoft / 2020
- Claude (52B) - Announced by Anthropic / 2021
- Minerva (Parameter size unannounced) - Announced by Google / 2022
- BloombergGPT (50B) - Announced by Bloomberg / 2023
- Dolly (6B) - Announced by Databricks / 2023
- Jurassic-1 - Announced by AI21 / 2022
- Jurassic-2 - Announced by AI21 / 2023
- Koala - Announced by Berkeley Artificial Intelligence Research (BAIR) / 2023
- Gemma - Gemma: Introducing new state-of-the-art open models / 2024
- Grok-1 - Open Release of Grok-1 / 2024
- Grok-1.5 - Announced by xAI / 2024
- DBRX - Announced by Databricks / 2024
- Grok-2 - Announced by xAI / 2024
- BigScience - Maintained by HuggingFace
- HuggingChat - Maintained by HuggingFace / 2023
- OpenAssistant - Maintained by Open Assistant / 2023
- StableLM - Maintained by Stability AI / 2023
- EleutherAI Language Model - Maintained by EleutherAI / 2023
- Falcon LLM - Maintained by Technology Innovation Institute / 2023
- Gemma - Maintained by Google / 2024
- Stanford Alpaca - A repository of the Stanford Alpaca project, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations.
- Dolly - A large language model trained on the Databricks Machine Learning Platform.
- AutoGPT - An experimental open-source attempt to make GPT-4 fully autonomous.
- dalai - The CLI tool to run LLaMA on the local machine.
- LLaMA-Adapter - Fine-tuning LLaMA to follow instructions within 1 hour and with 1.2M parameters.
- alpaca-lora - Instruct-tune LLaMA on consumer hardware (a minimal LoRA sketch appears after this list).
- llama_index - A project that provides a central interface to connect your LLMs with external data.
- openai/evals - A framework for evaluating LLMs and LLM systems, plus an open-source registry of benchmarks.
- trlx - A repository for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF).
- pythia - A suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters.
- Embedchain - Framework to create ChatGPT-like bots over your dataset.
- google-deepmind/gemma - Open-weights LLM from Google DeepMind.
- DeepSeek-R1 - A first-generation reasoning model from DeepSeek-AI.
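As referenced in the alpaca-lora and LLaMA-Adapter entries above, parameter-efficient fine-tuning is what makes instruct-tuning feasible on consumer hardware. Below is a minimal, illustrative LoRA sketch using the Hugging Face `peft` and `transformers` libraries; it is not the alpaca-lora repository's own training script, and the base model ID and hyperparameters are assumptions.

```python
# Illustrative LoRA setup with Hugging Face peft + transformers (not alpaca-lora's actual script).
# Assumptions: `pip install transformers peft`, and a LLaMA-style base checkpoint you can access.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_id = "huggyllama/llama-7b"  # assumed/illustrative model ID; substitute any causal LM

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# LoRA: freeze the base weights and train small low-rank adapters on the attention projections.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # rank of the low-rank update
    lora_alpha=16,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # typical choice for LLaMA-style models
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of parameters remain trainable
# The wrapped model can then be fine-tuned on instruction data with the usual transformers Trainer.
```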
- OpenAssistant SFT 6 - A 30-billion-parameter LLaMA-based model from the Open Assistant project, fine-tuned for chat and hosted on Hugging Face.
- Vicuna Delta v0 - An open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
- MPT 7B - A decoder-style transformer pre-trained from scratch on 1T tokens of English text and code. This model was trained by MosaicML.
- Falcon 7B - A 7B parameters causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora.
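Open-weight checkpoints like the MPT-7B and Falcon-7B models described above can generally be loaded through the Hugging Face `transformers` API. The sketch below is a minimal example; the Hub ID `tiiuae/falcon-7b` and the prompt are assumptions, and some checkpoints additionally require `trust_remote_code=True`.

```python
# Minimal sketch: load an open-weight checkpoint and generate text with transformers.
# Assumptions: `pip install transformers accelerate torch` and enough memory for a 7B model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"  # assumed Hub ID; swap in any open checkpoint from the lists above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to fit a 7B model on a single GPU
    device_map="auto",           # requires `accelerate`; places layers on available devices
)

inputs = tokenizer("Large language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```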
- Introducing Gemma 3: The most capable model you can run on a single GPU or TPU
- Phi-2: The surprising power of small language models
- StackLLaMA: A hands-on guide to train LLaMA with RLHF
- PaLM2
- PaLM2 and Future work: Gemini model
We welcome contributions to the Awesome LLM list! If you'd like to suggest an addition or make a correction, please follow these guidelines:
- Fork the repository and create a new branch for your contribution.
- Make your changes to the README.md file.
- Ensure that your contribution is relevant to the topic of LLMs.
- Use the following format to add your contribution:
[Name of Resource](Link to Resource) - Description of resource
- Add your contribution in alphabetical order within its category.
- Make sure that your contribution is not already listed.
- Provide a brief description of the resource and explain why it is relevant to LLMs.
- Create a pull request with a clear title and description of your changes.
We appreciate your contributions and thank you for helping to make the Awesome LLM list even more awesome!