Awesome LLM

An awesome list of resources for Large Language Models (LLMs).

Contents

Models

Overview

| Name | Parameter Size | Announcement Date | Provider |
| --- | --- | --- | --- |
| Gemma-3 | 1B, 4B, 12B, 27B | March 2025 | Google |
| GPT-4.5 | Undisclosed | February 2025 | OpenAI |
| Grok-3 | Undisclosed | February 2025 | xAI |
| Gemini-2 | Undisclosed | February 2025 | Google |
| DeepSeek-VL2 | 4.5B | February 2025 | DeepSeek |
| DeepSeek-R1 | 671B | January 2025 | DeepSeek |
| DeepSeek-V3 | 671B | December 2024 | DeepSeek |
| OpenAI o1 | Undisclosed | September 2024 | OpenAI |
| Qwen-2.5 | 0.5B, 1.5B, 3B, 7B, 14B, 72B | September 2024 | Alibaba Cloud |
| Gemma-2 | 2B, 9B, 27B | June 2024 | Google |
| Qwen-2 | 0.5B, 1.5B, 7B, 57B, 72B | June 2024 | Alibaba Cloud |
| GPT-4o | Undisclosed | May 2024 | OpenAI |
| Yi-1.5 | 6B, 9B, 34B | May 2024 | 01.AI |
| DeepSeek-V2 | 238B (21B active) | April 2024 | DeepSeek |
| Llama-3 | 8B, 70B | April 2024 | Meta |
| Gemma-1.1 | 2B, 7B | April 2024 | Google |
| DeepSeek-VL | 7B | March 2024 | DeepSeek |
| Claude-3 | Undisclosed | March 2024 | Anthropic |
| Grok-1 | 314B | March 2024 | xAI |
| DBRX | 132B (36B active) | March 2024 | Databricks |
| Gemma | 2B, 7B | February 2024 | Google |
| Qwen-1.5 | 0.5B, 1.8B, 4B, 7B, 14B, 72B | February 2024 | Alibaba Cloud |
| Qwen-VL | Undisclosed | January 2024 | Alibaba Cloud |
| Phi-2 | 2.7B | December 2023 | Microsoft |
| Gemini | Undisclosed | December 2023 | Google |
| Mixtral | 46.7B | December 2023 | Mistral AI |
| Grok-0 | 33B | November 2023 | xAI |
| Yi | 6B, 34B | November 2023 | 01.AI |
| Zephyr-7b-beta | 7B | October 2023 | HuggingFace H4 |
| Solar | 10.7B | September 2023 | Upstage |
| Mistral | 7.3B | September 2023 | Mistral AI |
| Qwen | 1.8B, 7B, 14B, 72B | August 2023 | Alibaba Cloud |
| Llama-2 | 7B, 13B, 70B | July 2023 | Meta |
| XGen | 7B | July 2023 | Salesforce |
| Falcon | 7B, 40B, 180B | June/September 2023 | Technology Innovation Institute (UAE) |
| MPT | 7B, 30B | May/June 2023 | MosaicML |
| LIMA | 65B | May 2023 | Meta AI |
| PaLM-2 | 340B | May 2023 | Google |
| Koala | 13B | April 2023 | UC Berkeley |
| OpenAssistant | 30B | April 2023 | LAION |
| Jurassic-2 | Undisclosed | April 2023 | AI21 Labs |
| Dolly | 6B, 12B | March/April 2023 | Databricks |
| Vicuna | 7B, 13B, 33B | March 2023 | LMSYS ORG |
| BloombergGPT | 50B | March 2023 | Bloomberg |
| GPT-4 | Undisclosed | March 2023 | OpenAI |
| Bard | Undisclosed | March 2023 | Google |
| Stanford-Alpaca | 7B | March 2023 | Stanford University |
| LLaMA | 7B, 13B, 33B, 65B | February 2023 | Meta |
| ChatGPT | Undisclosed | November 2022 | OpenAI |
| GPT-3.5 | 175B | November 2022 | OpenAI |
| Jurassic-1 | 178B | November 2022 | AI21 |
| Galactica | 120B | November 2022 | Meta |
| Sparrow | 70B | September 2022 | DeepMind |
| AlexaTM | 20B | August 2022 | Amazon |
| NLLB | 54.5B | July 2022 | Meta |
| BLOOM | 176B | July 2022 | BigScience (Hugging Face) |
| UL2 | 20B | May 2022 | Google |
| OPT | 175B | May 2022 | Meta (Facebook) |
| PaLM | 540B | April 2022 | Google |
| Chinchilla | 70B | March 2022 | DeepMind |
| AlphaCode | 41.4B | February 2022 | DeepMind |
| GPT-NeoX-20B | 20B | February 2022 | EleutherAI |
| Megatron-Turing-NLG | 530B | January 2022 | Microsoft & NVIDIA |
| LaMDA | 137B | January 2022 | Google |
| GLaM | 1.2T | December 2021 | Google |
| Gopher | 280B | December 2021 | DeepMind |
| Macaw | 11B | October 2021 | Allen Institute for AI |
| T0 | 11B | October 2021 | Hugging Face |
| GPT-J | 6B | June 2021 | EleutherAI |
| T5 | 60M, 220M, 770M, 3B, 11B | October 2019 | Google |
| BERT | 108M, 334M, 1.27B | October 2018 | Google |

⬆️ Go to top

Open models

⬆️ Go to top

Projects

  • Visual ChatGPT - Announced by Microsoft / 2023
  • LMOps - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities.

⬆️ Go to top

Commercial models

GPT

⬆️ Go to top

Gemini

  • Gemini - Announced by Google DeepMind / 2023

Bard

  • Bard - Announced by Google / 2023

⬆️ Go to top

Codex

⬆️ Go to top

Datasets

  • Sphere - Announced by Meta / 2022
    • A web corpus of 134M documents split into 906M passages.
  • Common Crawl
    • Over 3.15B pages and more than 380 TiB of data; public and free to use.
  • SQuAD 2.0
    • A question-answering dataset of 100,000+ questions, including unanswerable ones.
  • Pile
    • An 825 GiB diverse, open-source language-modelling dataset.
  • RACE
    • A large-scale reading comprehension dataset with more than 28,000 passages and nearly 100,000 questions.
  • Wikipedia
    • A Wikipedia dataset containing cleaned articles in all languages (see the loading sketch below).
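
Most of these datasets can be pulled directly with the Hugging Face `datasets` library. A minimal sketch, assuming the Hub IDs `squad_v2` and `wikipedia` (with the `20220301.en` config) still match the dataset cards:

```python
# Minimal sketch: loading two of the datasets above with the
# Hugging Face `datasets` library (pip install datasets).
from datasets import load_dataset

# SQuAD 2.0: question answering, including unanswerable questions.
squad = load_dataset("squad_v2", split="validation")
print(squad[0]["question"])

# Wikipedia: cleaned articles; the config selects dump date and language.
wiki = load_dataset("wikipedia", "20220301.en", split="train")
print(wiki[0]["title"])
```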

⬆️ Go to top

Benchmarks

Below are key websites and references for evaluating and comparing large language models (LLMs):

⬆️ Go to top

Materials

Papers

Posts

⬆️ Go to top

Projects

GitHub repositories

  • Stanford Alpaca (tatsu-lab/stanford_alpaca) - A model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations.
  • Dolly (databrickslabs/dolly) - A large language model trained on the Databricks Machine Learning Platform.
  • AutoGPT (Significant-Gravitas/Auto-GPT) - An experimental open-source attempt to make GPT-4 fully autonomous.
  • dalai (cocktailpeanut/dalai) - A CLI tool to run LLaMA on the local machine.
  • LLaMA-Adapter (ZrrSkywalker/LLaMA-Adapter) - Fine-tuning LLaMA to follow instructions within 1 hour and 1.2M parameters.
  • alpaca-lora (tloen/alpaca-lora) - Instruct-tune LLaMA on consumer hardware (see the LoRA sketch after this list).
  • llama_index (jerryjliu/llama_index) - A project that provides a central interface to connect your LLMs with external data.
  • openai/evals - A framework for evaluating LLMs and an open-source registry of benchmarks.
  • trlx (CarperAI/trlx) - A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF).
  • pythia (EleutherAI/pythia) - A suite of 16 LLMs, all trained on public data seen in exactly the same order and ranging in size from 70M to 12B parameters.
  • Embedchain (embedchain/embedchain) - A framework to create ChatGPT-like bots over your dataset.
  • google-deepmind/gemma - Open-weights LLM from Google DeepMind.
  • DeepSeek-R1 (deepseek-ai/DeepSeek-R1) - A first-generation reasoning model from DeepSeek-AI.
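
Several of the repositories above (LLaMA-Adapter, alpaca-lora) center on parameter-efficient instruct-tuning. As a rough illustration of the LoRA idea, here is a minimal sketch using the `peft` library; the base-model ID and hyperparameters are illustrative assumptions, not any project's actual settings.

```python
# Illustrative LoRA setup with `peft` (pip install peft transformers),
# in the spirit of alpaca-lora. All values below are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # assumed Hub ID

lora = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Training then proceeds with any standard `transformers` trainer; because only a small fraction of the weights receive gradients, fine-tuning on consumer hardware becomes feasible.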

⬆️ Go to top

HuggingFace repositories

  • OpenAssistant SFT 6 - A 30B LLaMA-based model from the OpenAssistant project, fine-tuned for chat conversations.
  • Vicuna Delta v0 - An open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
  • MPT 7B - A decoder-style transformer pre-trained from scratch on 1T tokens of English text and code. This model was trained by MosaicML.
  • Falcon 7B - A 7B-parameter causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora (see the loading sketch after this list).
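
Any of these checkpoints loads through the same `transformers` pattern. A minimal sketch, assuming the `tiiuae/falcon-7b` Hub ID; older `transformers` releases may additionally need `trust_remote_code=True`:

```python
# Minimal sketch: loading a checkpoint from the Hub with `transformers`
# (pip install transformers torch). The model ID is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tok("The Falcon model was trained on", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```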

⬆️ Go to top

Reading materials

⬆️ Go to top

Contributing

We welcome contributions to the Awesome LLM list! If you'd like to suggest an addition or make a correction, please follow these guidelines:

  1. Fork the repository and create a new branch for your contribution.
  2. Make your changes to the README.md file.
  3. Ensure that your contribution is relevant to the topic of LLMs.
  4. Use the following format to add your contribution (a hypothetical example follows this list):
[Name of Resource](Link to Resource) - Description of resource
  5. Add your contribution in alphabetical order within its category.
  6. Make sure that your contribution is not already listed.
  7. Provide a brief description of the resource and explain why it is relevant to LLMs.
  8. Create a pull request with a clear title and description of your changes.
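
For instance, a hypothetical new entry under Datasets would look like:

[Example Corpus](https://example.com/example-corpus) - A 100B-token open web corpus for pre-training; relevant to LLM training-data research.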

We appreciate your contributions and thank you for helping to make the Awesome LLM list even more awesome!

⬆️ Go to top