
[Bug] "Get Model List" clears out Github models #4547

Closed
ignaciocastro opened this issue Oct 29, 2024 · 4 comments · Fixed by #4638 or #4644 · May be fixed by #4645
Labels
🐛 Bug Something isn't working | 缺陷 released

Comments

@ignaciocastro

📦 Environment

Vercel

📌 Version

v1.26.11

💻 Operating System

Windows

🌐 Browser

Chrome

🐛 Bug Description

When "Get Model List" is pressed on Github, it reports "0 models available in total" even though you can see the models available on console. This also removes the already set models

📷 Recurrence Steps

  1. Click "Get Model List" on GitHub

🚦 Expected Behavior

The model list should update to show whatever models are actually available instead of reporting 0 models.

📝 Additional Information

Response from https://models.inference.ai.azure.com/models:
[
    {
        "id": "azureml://registries/azureml-ai21/models/AI21-Jamba-Instruct/versions/2",
        "name": "AI21-Jamba-Instruct",
        "friendly_name": "AI21-Jamba-Instruct",
        "model_version": 2,
        "publisher": "AI21 Labs",
        "model_family": "AI21 Labs",
        "model_registry": "azureml-ai21",
        "license": "custom",
        "task": "chat-completion",
        "description": "Jamba-Instruct is the world's first production-grade Mamba-based LLM model and leverages its hybrid Mamba-Transformer architecture to achieve best-in-class performance, quality, and cost efficiency.\n\n**Model Developer Name**: _AI21 Labs_\n\n## Model Architecture\n\nJamba-Instruct leverages a hybrid Mamba-Transformer architecture to achieve best-in-class performance, quality, and cost efficiency.\nAI21's Jamba architecture features a blocks-and-layers approach that allows Jamba to successfully integrate the two architectures. Each Jamba block contains either an attention or a Mamba layer, followed by a multi-layer perceptron (MLP), producing an overall ratio of one Transformer layer out of every eight total layers.\n",
        "summary": "Jamba-Instruct is the world's first production-grade Mamba-based LLM model and leverages its hybrid Mamba-Transformer architecture to achieve best-in-class performance, quality, and cost efficiency.",
        "tags": [
            "chat",
            "rag"
        ]
    },
    {
        "id": "azureml://registries/azureml-cohere/models/Cohere-command-r/versions/3",
        "name": "Cohere-command-r",
        "friendly_name": "Cohere Command R",
        "model_version": 3,
        "publisher": "cohere",
        "model_family": "cohere",
        "model_registry": "azureml-cohere",
        "license": "custom",
        "task": "chat-completion",
        "description": "Command R is a highly performant generative large language model, optimized for a variety of use cases including reasoning, summarization, and question answering. \n\nThe model is optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic.\n\nPre-training data additionally included the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian.\n\n## Resources\n\nFor full details of this model, [release blog post](https://aka.ms/cohere-blog).\n\n## Model Architecture\n\nThis is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, this model uses supervised fine-tuning (SFT) and preference training to align model behavior to human preferences for helpfulness and safety.\n\n### Tool use capabilities\n\nCommand R has been specifically trained with conversational tool use capabilities. These have been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template will likely reduce performance, but we encourage experimentation.\n\nCommand R's tool use functionality takes a conversation as input (with an optional user-system preamble), along with a list of available tools. The model will then generate a json-formatted list of actions to execute on a subset of those tools. Command R may use one of its supplied tools more than once.\n\nThe model has been trained to recognise a special directly_answer tool, which it uses to indicate that it doesn't want to use any of its other tools. The ability to abstain from calling a specific tool can be useful in a range of situations, such as greeting a user, or asking clarifying questions. We recommend including the directly_answer tool, but it can be removed or renamed if required.\n\n### Grounded Generation and RAG Capabilities\n\nCommand R has been specifically trained with grounded generation capabilities. This means that it can generate responses based on a list of supplied document snippets, and it will include grounding spans (citations) in its response indicating the source of the information. This can be used to enable behaviors such as grounded summarization and the final step of Retrieval Augmented Generation (RAG).This behavior has been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template may reduce performance, but we encourage experimentation.\n\nCommand R's grounded generation behavior takes a conversation as input (with an optional user-supplied system preamble, indicating task, context and desired output style), along with a list of retrieved document snippets. The document snippets should be chunks, rather than long documents, typically around 100-400 words per chunk. Document snippets consist of key-value pairs. The keys should be short descriptive strings, the values can be text or semi-structured.\n\nBy default, Command R will generate grounded responses by first predicting which documents are relevant, then predicting which ones it will cite, then generating an answer. Finally, it will then insert grounding spans into the answer. See below for an example. 
This is referred to as accurate grounded generation.\n\nThe model is trained with a number of other answering modes, which can be selected by prompt changes . A fast citation mode is supported in the tokenizer, which will directly generate an answer with grounding spans in it, without first writing the answer out in full. This sacrifices some grounding accuracy in favor of generating fewer tokens.\n\n### Code Capabilities\n\nCommand R has been optimized to interact with your code, by requesting code snippets, code explanations, or code rewrites. It might not perform well out-of-the-box for pure code completion. For better performance, we also recommend using a low temperature (and even greedy decoding) for code-generation related instructions.\n",
        "summary": "Command R is a scalable generative model targeting RAG and Tool Use to enable production-scale AI for enterprise.",
        "tags": [
            "rag",
            "multilingual"
        ]
    },
    {
        "id": "azureml://registries/azureml-cohere/models/Cohere-command-r-plus/versions/3",
        "name": "Cohere-command-r-plus",
        "friendly_name": "Cohere Command R+",
        "model_version": 3,
        "publisher": "cohere",
        "model_family": "cohere",
        "model_registry": "azureml-cohere",
        "license": "custom",
        "task": "chat-completion",
        "description": "Command R+ is a highly performant generative large language model, optimized for a variety of use cases including reasoning, summarization, and question answering. \n\nThe model is optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic.\n\nPre-training data additionally included the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian.\n\n## Resources\n\nFor full details of this model, [release blog post](https://aka.ms/cohere-blog).\n\n## Model Architecture\n\nThis is an auto-regressive language model that uses an optimized transformer architecture. After pretraining, this model uses supervised fine-tuning (SFT) and preference training to align model behavior to human preferences for helpfulness and safety.\n\n### Tool use capabilities\n\nCommand R+ has been specifically trained with conversational tool use capabilities. These have been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template will likely reduce performance, but we encourage experimentation.\n\nCommand R+'s tool use functionality takes a conversation as input (with an optional user-system preamble), along with a list of available tools. The model will then generate a json-formatted list of actions to execute on a subset of those tools. Command R+ may use one of its supplied tools more than once.\n\nThe model has been trained to recognise a special directly_answer tool, which it uses to indicate that it doesn't want to use any of its other tools. The ability to abstain from calling a specific tool can be useful in a range of situations, such as greeting a user, or asking clarifying questions. We recommend including the directly_answer tool, but it can be removed or renamed if required.\n\n### Grounded Generation and RAG Capabilities\n\nCommand R+ has been specifically trained with grounded generation capabilities. This means that it can generate responses based on a list of supplied document snippets, and it will include grounding spans (citations) in its response indicating the source of the information. This can be used to enable behaviors such as grounded summarization and the final step of Retrieval Augmented Generation (RAG).This behavior has been trained into the model via a mixture of supervised fine-tuning and preference fine-tuning, using a specific prompt template. Deviating from this prompt template may reduce performance, but we encourage experimentation.\n\nCommand R+'s grounded generation behavior takes a conversation as input (with an optional user-supplied system preamble, indicating task, context and desired output style), along with a list of retrieved document snippets. The document snippets should be chunks, rather than long documents, typically around 100-400 words per chunk. Document snippets consist of key-value pairs. The keys should be short descriptive strings, the values can be text or semi-structured.\n\nBy default, Command R+ will generate grounded responses by first predicting which documents are relevant, then predicting which ones it will cite, then generating an answer. Finally, it will then insert grounding spans into the answer. See below for an example. 
This is referred to as accurate grounded generation.\n\nThe model is trained with a number of other answering modes, which can be selected by prompt changes . A fast citation mode is supported in the tokenizer, which will directly generate an answer with grounding spans in it, without first writing the answer out in full. This sacrifices some grounding accuracy in favor of generating fewer tokens.\n\n### Code Capabilities\n\nCommand R+ has been optimized to interact with your code, by requesting code snippets, code explanations, or code rewrites. It might not perform well out-of-the-box for pure code completion. For better performance, we also recommend using a low temperature (and even greedy decoding) for code-generation related instructions.\n",
        "summary": "Command R+ is a state-of-the-art RAG-optimized model designed to tackle enterprise-grade workloads.",
        "tags": [
            "rag",
            "multilingual"
        ]
    },
    {
        "id": "azureml://registries/azureml-cohere/models/Cohere-embed-v3-english/versions/3",
        "name": "Cohere-embed-v3-english",
        "friendly_name": "Cohere Embed v3 English",
        "model_version": 3,
        "publisher": "cohere",
        "model_family": "cohere",
        "model_registry": "azureml-cohere",
        "license": "custom",
        "task": "embeddings",
        "description": "Cohere Embed English is the market's leading text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed English has top performance on the HuggingFace MTEB benchmark and performs well on a variety of industries such as Finance, Legal, and General-Purpose Corpora.The model was trained on nearly 1B English training pairs. For full details of this model, [release blog post](https://aka.ms/cohere-blog).",
        "summary": "Cohere Embed English is the market's leading text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering.",
        "tags": [
            "RAG",
            "search"
        ]
    },
    {
        "id": "azureml://registries/azureml-cohere/models/Cohere-embed-v3-multilingual/versions/3",
        "name": "Cohere-embed-v3-multilingual",
        "friendly_name": "Cohere Embed v3 Multilingual",
        "model_version": 3,
        "publisher": "cohere",
        "model_family": "cohere",
        "model_registry": "azureml-cohere",
        "license": "custom",
        "task": "embeddings",
        "description": "Cohere Embed Multilingual is the market's leading text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering. Embed Multilingual supports 100+ languages and can be used to search within a language (e.g., search with a French query on French documents) and across languages (e.g., search with an English query on Chinese documents). This model was trained on nearly 1B English training pairs and nearly 0.5B Non-English training pairs from 100+ languages. For full details of this model, [release blog post](https://aka.ms/cohere-blog).",
        "summary": "Supporting over 100 languages, Cohere Embed Multilingual is the market's leading text representation model used for semantic search, retrieval-augmented generation (RAG), classification, and clustering.",
        "tags": [
            "RAG",
            "search"
        ]
    },
    {
        "id": "azureml://registries/azureml-meta/models/Meta-Llama-3-70B-Instruct/versions/6",
        "name": "Meta-Llama-3-70B-Instruct",
        "friendly_name": "Meta-Llama-3-70B-Instruct",
        "model_version": 6,
        "publisher": "meta",
        "model_family": "meta",
        "model_registry": "azureml-meta",
        "license": "custom",
        "task": "chat-completion",
        "description": "Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. \n\n## Model Architecture\n\nLlama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.\n\n## Training Datasets\n\n**Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data.\n\n**Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models respectively. \n",
        "summary": "A powerful 70-billion parameter model excelling in reasoning, coding, and broad language applications.",
        "tags": [
            "conversation"
        ]
    },
    {
        "id": "azureml://registries/azureml-meta/models/Meta-Llama-3-8B-Instruct/versions/6",
        "name": "Meta-Llama-3-8B-Instruct",
        "friendly_name": "Meta-Llama-3-8B-Instruct",
        "model_version": 6,
        "publisher": "meta",
        "model_family": "meta",
        "model_registry": "azureml-meta",
        "license": "custom",
        "task": "chat-completion",
        "description": "Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction tuned generative text models in 8 and 70B sizes. The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety. \n\n## Model Architecture\n\nLlama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.\n\n## Training Datasets\n\n**Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data.\n\n**Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B and December 2023 for the 70B models respectively. \n",
        "summary": "A versatile 8-billion parameter model optimized for dialogue and text generation tasks.",
        "tags": [
            "conversation"
        ]
    },
    {
        "id": "azureml://registries/azureml-meta/models/Meta-Llama-3.1-405B-Instruct/versions/1",
        "name": "Meta-Llama-3.1-405B-Instruct",
        "friendly_name": "Meta-Llama-3.1-405B-Instruct",
        "model_version": 1,
        "publisher": "meta",
        "model_family": "meta",
        "model_registry": "azureml-meta",
        "license": "custom",
        "task": "chat-completion",
        "description": "The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned\ngenerative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on\ncommon industry benchmarks.\n\n## Model Architecture\n\nLlama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.\n\n## Training Datasets\n\n**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.\n\n**Data Freshness:** The pretraining data has a cutoff of December 2023.\n",
        "summary": "The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.",
        "tags": [
            "conversation"
        ]
    },
    {
        "id": "azureml://registries/azureml-meta/models/Meta-Llama-3.1-70B-Instruct/versions/1",
        "name": "Meta-Llama-3.1-70B-Instruct",
        "friendly_name": "Meta-Llama-3.1-70B-Instruct",
        "model_version": 1,
        "publisher": "meta",
        "model_family": "meta",
        "model_registry": "azureml-meta",
        "license": "custom",
        "task": "chat-completion",
        "description": "The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned\ngenerative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on\ncommon industry benchmarks.\n\n## Model Architecture\n\nLlama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.\n\n## Training Datasets\n\n**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.\n\n**Data Freshness:** The pretraining data has a cutoff of December 2023.\n",
        "summary": "The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.",
        "tags": [
            "conversation"
        ]
    },
    {
        "id": "azureml://registries/azureml-meta/models/Meta-Llama-3.1-8B-Instruct/versions/1",
        "name": "Meta-Llama-3.1-8B-Instruct",
        "friendly_name": "Meta-Llama-3.1-8B-Instruct",
        "model_version": 1,
        "publisher": "meta",
        "model_family": "meta",
        "model_registry": "azureml-meta",
        "license": "custom",
        "task": "chat-completion",
        "description": "The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned\ngenerative models in 8B, 70B and 405B sizes (text in/text out). The Llama 3.1 instruction tuned text only models (8B, 70B, 405B) are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on\ncommon industry benchmarks.\n\n## Model Architecture\n\nLlama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.\n\n## Training Datasets\n\n**Overview:** Llama 3.1 was pretrained on ~15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 25M synthetically generated examples.\n\n**Data Freshness:** The pretraining data has a cutoff of December 2023.\n",
        "summary": "The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.",
        "tags": [
            "conversation"
        ]
    },
    {
        "id": "azureml://registries/azureml-mistral/models/Mistral-large/versions/1",
        "name": "Mistral-large",
        "friendly_name": "Mistral Large",
        "model_version": 1,
        "publisher": "mistralai",
        "model_family": "mistralai",
        "model_registry": "azureml-mistral",
        "license": "custom",
        "task": "chat-completion",
        "description": "Mistral Large is Mistral AI's most advanced Large Language Model (LLM). It can be used on any language-based task thanks to its state-of-the-art reasoning and knowledge capabilities.\n\nAdditionally, Mistral Large is:\n\n- **Specialized in RAG.** Crucial information is not lost in the middle of long context windows (up to 32K tokens).\n- **Strong in coding.**  Code generation, review and comments. Supports all mainstream coding languages.\n- **Multi-lingual by design.** Best-in-class performance in French, German, Spanish, and Italian - in addition to English. Dozens of other languages are supported.\n- **Responsible AI.** Efficient guardrails baked in the model, with additional safety layer with safe_mode option\n\n## Resources\n\nFor full details of this model, please read [release blog post](https://aka.ms/mistral-blog).\n",
        "summary": "Mistral's flagship model that's ideal for complex tasks that require large reasoning capabilities or are highly specialized (Synthetic Text Generation, Code Generation, RAG, or Agents).",
        "tags": [
            "reasoning",
            "rag",
            "agents",
            "multilingual"
        ]
    },
    {
        "id": "azureml://registries/azureml-mistral/models/Mistral-large-2407/versions/1",
        "name": "Mistral-large-2407",
        "friendly_name": "Mistral Large (2407)",
        "model_version": 1,
        "publisher": "mistralai",
        "model_family": "mistralai",
        "model_registry": "azureml-mistral",
        "license": "custom",
        "task": "chat-completion",
        "description": "Mistral Large (2407) is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge and coding capabilities.\n\n**Multi-lingual by design.**\u00a0Dozens of languages supported, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch and Polish\n\n**Proficient in coding.** Trained on 80+ coding languages such as Python, Java, C, C++, JavaScript, and Bash. Also trained on more specific languages such as Swift and Fortran\n\n**Agent-centric.** Best-in-class agentic capabilities with native function calling and JSON outputting \n\n**Advanced Reasoning.**\u00a0State-of-the-art mathematical and reasoning capabilities\n",
        "summary": "Mistral Large (2407) is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge and coding capabilities.",
        "tags": [
            "reasoning",
            "rag",
            "agents"
        ]
    },
    {
        "id": "azureml://registries/azureml-mistral/models/Mistral-Nemo/versions/1",
        "name": "Mistral-Nemo",
        "friendly_name": "Mistral Nemo",
        "model_version": 1,
        "publisher": "mistralai",
        "model_family": "mistralai",
        "model_registry": "azureml-mistral",
        "license": "custom",
        "task": "chat-completion",
        "description": "Mistral Nemo is a cutting-edge Language Model (LLM) boasting state-of-the-art reasoning, world knowledge, and coding capabilities within its size category.\n\n**Jointly developed with Nvidia.** This collaboration has resulted in a powerful 12B model that pushes the boundaries of language understanding and generation.\n\n**Multilingual proficiency.** Mistral Nemo is equipped with a new tokenizer, Tekken, designed for multilingual applications. It supports over 100 languages, including but not limited to English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch, Polish, and many more. Tekken has proven to be more efficient than the Llama 3 tokenizer in compressing text for approximately 85% of all languages, with significant improvements in Malayalam, Hindi, Arabic, and prevalent European languages.\n\n**Agent-centric.** Mistral Nemo possesses top-tier agentic capabilities, including native function calling and JSON outputting.\n\n**Advanced Reasoning.** Mistral Nemo demonstrates state-of-the-art mathematical and reasoning capabilities within its size category.\n",
        "summary": "Mistral Nemo is a cutting-edge Language Model (LLM) boasting state-of-the-art reasoning, world knowledge, and coding capabilities within its size category.",
        "tags": [
            "reasoning",
            "rag",
            "agents"
        ]
    },
    {
        "id": "azureml://registries/azureml-mistral/models/Mistral-small/versions/1",
        "name": "Mistral-small",
        "friendly_name": "Mistral Small",
        "model_version": 1,
        "publisher": "mistralai",
        "model_family": "mistralai",
        "model_registry": "azureml-mistral",
        "license": "custom",
        "task": "chat-completion",
        "description": "Mistral Small is Mistral AI's most efficient Large Language Model (LLM). It can be used on any language-based task that requires high efficiency and low latency.\n\nMistral Small is:\n\n- **A small model optimized for low latency.**\u00a0Very efficient for high volume and low latency workloads. Mistral Small is Mistral's smallest proprietary model, it outperforms Mixtral 8x7B and has lower latency. \n- **Specialized in RAG.**\u00a0Crucial information is not lost in the middle of long context windows (up to 32K tokens).\n- **Strong in coding.**\u00a0Code generation, review and comments. Supports all mainstream coding languages.\n- **Multi-lingual by design.**\u00a0Best-in-class performance in French, German, Spanish, and Italian - in addition to English. Dozens of other languages are supported.\n- **Responsible AI.**\u00a0Efficient guardrails baked in the model, with additional safety layer with safe_mode option\n\n## Resources\n\nFor full details of this model, please read [release blog post](https://aka.ms/mistral-blog).\n",
        "summary": "Mistral Small can be used on any language-based task that requires high efficiency and low latency.",
        "tags": [
            "low latency",
            "multilingual"
        ]
    },
    {
        "id": "azureml://registries/azure-openai/models/gpt-4o/versions/2",
        "name": "gpt-4o",
        "friendly_name": "OpenAI GPT-4o",
        "model_version": 2,
        "publisher": "Azure OpenAI Service",
        "model_family": "openai",
        "model_registry": "azure-openai",
        "license": "custom",
        "task": "chat-completion",
        "description": "GPT-4o offers a shift in how AI models interact with multimodal inputs. By seamlessly combining text, images, and audio, GPT-4o provides a richer, more engaging user experience.\n\nMatching the intelligence of GPT-4 Turbo, it is remarkably more efficient, delivering text at twice the speed and at half the cost. Additionally, GPT-4o exhibits the highest vision performance and excels in non-English languages compared to previous OpenAI models.\n\nGPT-4o is engineered for speed and efficiency. Its advanced ability to handle complex queries with minimal resources can translate into cost savings and performance.\n\nThe introduction of GPT-4o opens numerous possibilities for businesses in various sectors: \n\n1. **Enhanced customer service**: By integrating diverse data inputs, GPT-4o enables more dynamic and comprehensive customer support interactions.\n2. **Advanced analytics**: Leverage GPT-4o's capability to process and analyze different types of data to enhance decision-making and uncover deeper insights.\n3. **Content innovation**: Use GPT-4o's generative capabilities to create engaging and diverse content formats, catering to a broad range of consumer preferences.\n\n## Resources\n\n- [\"Hello GPT-4o\" (OpenAI announcement)](https://openai.com/index/hello-gpt-4o/)\n- [Introducing GPT-4o: OpenAI's new flagship multimodal model now in preview on Azure](https://azure.microsoft.com/en-us/blog/introducing-gpt-4o-openais-new-flagship-multimodal-model-now-in-preview-on-azure/)\n",
        "summary": "OpenAI's most advanced multimodal model in the GPT-4 family. Can handle both text and image inputs.",
        "tags": [
            "multipurpose",
            "multilingual",
            "multimodal"
        ]
    },
    {
        "id": "azureml://registries/azure-openai/models/gpt-4o-mini/versions/1",
        "name": "gpt-4o-mini",
        "friendly_name": "OpenAI GPT-4o mini",
        "model_version": 1,
        "publisher": "Azure OpenAI Service",
        "model_family": "OpenAI",
        "model_registry": "azure-openai",
        "license": "custom",
        "task": "chat-completion",
        "description": "GPT-4o mini enables a broad range of tasks with its low cost and latency, such as applications that chain or parallelize multiple model calls (e.g., calling multiple APIs), pass a large volume of context to the model (e.g., full code base or conversation history), or interact with customers through fast, real-time text responses (e.g., customer support chatbots).\n\nToday, GPT-4o mini supports text and vision in the API, with support for text, image, video and audio inputs and outputs coming in the future. The model has a context window of 128K tokens and knowledge up to October 2023. Thanks to the improved tokenizer shared with GPT-4o, handling non-English text is now even more cost effective.\n\nGPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o. It also demonstrates strong performance in function calling, which can enable developers to build applications that fetch data or take actions with external systems, and improved long-context performance compared to GPT-3.5 Turbo.\n\n## Resources\n\n- [OpenAI announcement](https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/)\n",
        "summary": "An affordable, efficient AI solution for diverse text and image tasks.",
        "tags": [
            "multipurpose",
            "multilingual",
            "multimodal"
        ]
    },
    {
        "id": "azureml://registries/azure-openai/models/text-embedding-3-large/versions/1",
        "name": "text-embedding-3-large",
        "friendly_name": "OpenAI Text Embedding 3 (large)",
        "model_version": 1,
        "publisher": "Azure OpenAI Service",
        "model_family": "openai",
        "model_registry": "azure-openai",
        "license": "custom",
        "task": "embeddings",
        "description": "Text-embedding-3 series models are the latest and most capable embedding model. The text-embedding-3 models offer better average multi-language retrieval performance with the MIRACL benchmark while still maintaining performance for English tasks with the MTEB benchmark.",
        "summary": "Text-embedding-3 series models are the latest and most capable embedding model from OpenAI.",
        "tags": [
            "RAG",
            "search"
        ]
    },
    {
        "id": "azureml://registries/azure-openai/models/text-embedding-3-small/versions/1",
        "name": "text-embedding-3-small",
        "friendly_name": "OpenAI Text Embedding 3 (small)",
        "model_version": 1,
        "publisher": "Azure OpenAI Service",
        "model_family": "openai",
        "model_registry": "azure-openai",
        "license": "custom",
        "task": "embeddings",
        "description": "Text-embedding-3 series models are the latest and most capable embedding model. The text-embedding-3 models offer better average multi-language retrieval performance with the MIRACL benchmark while still maintaining performance for English tasks with the MTEB benchmark.",
        "summary": "Text-embedding-3 series models are the latest and most capable embedding model from OpenAI.",
        "tags": [
            "RAG",
            "search"
        ]
    },
    {
        "id": "azureml://registries/azureml/models/Phi-3-medium-128k-instruct/versions/3",
        "name": "Phi-3-medium-128k-instruct",
        "friendly_name": "Phi-3-medium instruct (128k)",
        "model_version": 3,
        "publisher": "microsoft",
        "model_family": "microsoft",
        "model_registry": "azureml",
        "license": "mit",
        "task": "chat-completion",
        "description": "The Phi-3-Medium-128K-Instruct is a 14B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered publicly available websites data with a focus on high-quality and reasoning dense properties.\nThe model belongs to the Phi-3 family with the Medium version in two variants 4k and 128K which is the context length (in tokens) that it can support.\n\nThe model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization for the instruction following and safety measures.\nWhen assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Medium-128K-Instruct showcased a robust and state-of-the-art performance among models of the same-size and next-size-up.\n\n## Resources\n\n\ud83c\udfe1 [Phi-3 Portal](https://azure.microsoft.com/en-us/products/phi-3) <br>\n\ud83d\udcf0 [Phi-3 Microsoft Blog](https://aka.ms/Phi-3Build2024) <br>\n\ud83d\udcd6 [Phi-3 Technical Report](https://aka.ms/phi3-tech-report) <br>\n\ud83d\udee0\ufe0f [Phi-3 on Azure AI Studio](https://aka.ms/phi3-azure-ai) <br>\n\ud83d\udc69\u200d\ud83c\udf73 [Phi-3 Cookbook](https://github.com/microsoft/Phi-3CookBook) <br>\n\n## Model Architecture\n\nPhi-3-Medium-128k-Instruct has 14B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidelines.\n\n## Training Datasets\n\nOur training data includes a wide variety of sources, totaling 4.8 trillion tokens (including 10% multilingual), and is a combination of \n1) Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code; \n2) Newly created synthetic, \"textbook - like\" data for the purpose of teaching math, coding, common sense reasoning, general knowledge of the world (science, daily activities, theory of mind, etc.); \n3) High quality chat format supervised data covering various topics to reflect human preferences on different aspects such as instruct-following, truthfulness, honesty and helpfulness.\n\nWe are focusing on the quality of data that could potentially improve the reasoning ability for the model, and we filter the publicly available documents to contain the correct level of knowledge. As an example, the result of a game in premier league in a particular day might be good training data for frontier models, but we need to remove such information to leave more model capacity for reasoning for the small size models. More details about data can be found in the [Phi-3 Technical Report](https://aka.ms/phi3-tech-report).\n",
        "summary": "Same model as Phi-3-medium (4k) but with larger context size. Use this for RAG or few shot prompting.",
        "tags": [
            "reasoning",
            "understanding",
            "large context"
        ]
    },
    {
        "id": "azureml://registries/azureml/models/Phi-3-medium-4k-instruct/versions/3",
        "name": "Phi-3-medium-4k-instruct",
        "friendly_name": "Phi-3-medium instruct (4k)",
        "model_version": 3,
        "publisher": "microsoft",
        "model_family": "microsoft",
        "model_registry": "azureml",
        "license": "mit",
        "task": "chat-completion",
        "description": "The Phi-3-Medium-4K-Instruct is a 14B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered publicly available websites data with a focus on high-quality and reasoning dense properties.\nThe model belongs to the Phi-3 family with the Medium version in two variants 4K and 128K which is the context length (in tokens) that it can support.\n\nThe model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization for the instruction following and safety measures.\nWhen assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Medium-4K-Instruct showcased a robust and state-of-the-art performance among models of the same-size and next-size-up.\n\n## Resources\n\n\ud83c\udfe1 [Phi-3 Portal](https://azure.microsoft.com/en-us/products/phi-3) <br>\n\ud83d\udcf0 [Phi-3 Microsoft Blog](https://aka.ms/Phi-3Build2024) <br>\n\ud83d\udcd6 [Phi-3 Technical Report](https://aka.ms/phi3-tech-report) <br>\n\ud83d\udee0\ufe0f [Phi-3 on Azure AI Studio](https://aka.ms/phi3-azure-ai) <br>\n\ud83d\udc69\u200d\ud83c\udf73 [Phi-3 Cookbook](https://github.com/microsoft/Phi-3CookBook) <br>\n\n## Model Architecture\n\nPhi-3-Medium-4K-Instruct has 14B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidelines.\n\n## Training Datasets\n\nOur training data includes a wide variety of sources, totaling 4.8 trillion tokens (including 10% multilingual), and is a combination of \n1) Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code; \n2) Newly created synthetic, \"textbook-like\" data for the purpose of teaching math, coding, common sense reasoning, general knowledge of the world (science, daily activities, theory of mind, etc.); \n3) High quality chat format supervised data covering various topics to reflect human preferences on different aspects such as instruct-following, truthfulness, honesty and helpfulness.\n\nWe are focusing on the quality of data that could potentially improve the reasoning ability for the model, and we filter the publicly available documents to contain the correct level of knowledge. As an example, the result of a game in premier league in a particular day might be good training data for frontier models, but we need to remove such information to leave more model capacity for reasoning for the small size models. More details about data can be found in the [Phi-3 Technical Report](https://aka.ms/phi3-tech-report).\n",
        "summary": "A 14B parameter model. Use this larger model for better quality than Phi-3-mini, and with reasoning-dense data.",
        "tags": [
            "reasoning",
            "understanding"
        ]
    },
    {
        "id": "azureml://registries/azureml/models/Phi-3-mini-128k-instruct/versions/10",
        "name": "Phi-3-mini-128k-instruct",
        "friendly_name": "Phi-3-mini instruct (128k)",
        "model_version": 10,
        "publisher": "microsoft",
        "model_family": "microsoft",
        "model_registry": "azureml",
        "license": "mit",
        "task": "chat-completion",
        "description": "The Phi-3-Mini-128K-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets.\nThis dataset includes both synthetic data and filtered publicly available website data, with an emphasis on high-quality and reasoning-dense properties.\n\nAfter initial training, the model underwent a post-training process that involved supervised fine-tuning and direct preference optimization to enhance its ability to follow instructions and adhere to safety measures.\nWhen evaluated against benchmarks that test common sense, language understanding, mathematics, coding, long-term context, and logical reasoning, the Phi-3 Mini-128K-Instruct demonstrated robust and state-of-the-art performance among models with fewer than 13 billion parameters.\n\n## Resources\n\n\ud83c\udfe1 [Phi-3 Portal](https://azure.microsoft.com/en-us/products/phi-3) <br>\n\ud83d\udcf0 [Phi-3 Microsoft Blog](https://aka.ms/Phi-3Build2024) <br>\n\ud83d\udcd6 [Phi-3 Technical Report](https://aka.ms/phi3-tech-report) <br>\n\ud83d\udee0\ufe0f [Phi-3 on Azure AI Studio](https://aka.ms/phi3-azure-ai) <br>\n\ud83d\udc69\u200d\ud83c\udf73 [Phi-3 Cookbook](https://github.com/microsoft/Phi-3CookBook) <br>\n\n## Model Architecture\n\nPhi-3 Mini-128K-Instruct has 3.8B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidelines.\n\n## Training Datasets\n\nOur training data includes a wide variety of sources, totaling 4.9 trillion tokens, and is a combination of \n1) Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code; \n2) Newly created synthetic, \"textbook - like\" data for the purpose of teaching math, coding, common sense reasoning, general knowledge of the world (science, daily activities, theory of mind, etc.); \n3) High quality chat format supervised data covering various topics to reflect human preferences on different aspects such as instruct-following, truthfulness, honesty and helpfulness.\n\nWe are focusing on the quality of data that could potentially improve the reasoning ability for the model, and we filter the publicly available documents to contain the correct level of knowledge. As an example, the result of a game in premier league in a particular day might be good training data for frontier models, but we need to remove such information to leave more model capacity for reasoning for the small size models. More details about data can be found in the [Phi-3 Technical Report](https://aka.ms/phi3-tech-report).\n",
        "summary": "Same model as Phi-3-mini (4k) but with larger context size. Use this for RAG or few shot prompting.",
        "tags": [
            "reasoning",
            "understanding",
            "low latency"
        ]
    },
    {
        "id": "azureml://registries/azureml/models/Phi-3-mini-4k-instruct/versions/10",
        "name": "Phi-3-mini-4k-instruct",
        "friendly_name": "Phi-3-mini instruct (4k)",
        "model_version": 10,
        "publisher": "microsoft",
        "model_family": "microsoft",
        "model_registry": "azureml",
        "license": "mit",
        "task": "chat-completion",
        "description": "The Phi-3-Mini-4K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered publicly available websites data with a focus on high-quality and reasoning dense properties.\nThe model belongs to the Phi-3 family with the Mini version in two variants 4K and 128K which is the context length (in tokens) that it can support.\n\nThe model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization for the instruction following and safety measures.\nWhen assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3 Mini-4K-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.\n\n## Resources\n\n\ud83c\udfe1 [Phi-3 Portal](https://azure.microsoft.com/en-us/products/phi-3) <br>\n\ud83d\udcf0 [Phi-3 Microsoft Blog](https://aka.ms/Phi-3Build2024) <br>\n\ud83d\udcd6 [Phi-3 Technical Report](https://aka.ms/phi3-tech-report) <br>\n\ud83d\udee0\ufe0f [Phi-3 on Azure AI Studio](https://aka.ms/phi3-azure-ai) <br>\n\ud83d\udc69\u200d\ud83c\udf73 [Phi-3 Cookbook](https://github.com/microsoft/Phi-3CookBook) <br>\n\n## Model Architecture\n\nPhi-3 Mini-4K-Instruct has 3.8B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidelines.\n\n## Training Datasets\n\nOur training data includes a wide variety of sources, totaling 4.9 trillion tokens, and is a combination of \n1) Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code; \n2) Newly created synthetic, \"textbook - like\" data for the purpose of teaching math, coding, common sense reasoning, general knowledge of the world (science, daily activities, theory of mind, etc.); \n3) High quality chat format supervised data covering various topics to reflect human preferences on different aspects such as instruct-following, truthfulness, honesty and helpfulness.\n\nWe are focusing on the quality of data that could potentially improve the reasoning ability for the model, and we filter the publicly available documents to contain the correct level of knowledge. As an example, the result of a game in premier league in a particular day might be good training data for frontier models, but we need to remove such information to leave more model capacity for reasoning for the small size models. More details about data can be found in the [Phi-3 Technical Report](https://aka.ms/phi3-tech-report).\n",
        "summary": "Tiniest Phi-3 model. Optimized for both quality and low latency.",
        "tags": [
            "reasoning",
            "understanding",
            "low latency"
        ]
    },
    {
        "id": "azureml://registries/azureml/models/Phi-3-small-128k-instruct/versions/3",
        "name": "Phi-3-small-128k-instruct",
        "friendly_name": "Phi-3-small instruct (128k)",
        "model_version": 3,
        "publisher": "microsoft",
        "model_family": "microsoft",
        "model_registry": "azureml",
        "license": "mit",
        "task": "chat-completion",
        "description": "The Phi-3-Small-128K-Instruct is a 7B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered publicly available websites data with a focus on high-quality and reasoning dense properties. The model supports 128K context length (in tokens).\n\nThe model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization for the instruction following and safety measures.\nWhen assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Small-128K-Instruct showcased a robust and state-of-the-art performance among models of the same-size and next-size-up.\n\n## Resources\n\n+ [Phi-3 Microsoft Blog](https://aka.ms/phi3blog-april)\n+ [Phi-3 Technical Report](https://aka.ms/phi3-tech-report)\n\n## Model Architecture\n\nPhi-3 Small-128K-Instruct has 7B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidelines.\n\n## Training Datasets\n\nOur training data includes a wide variety of sources, totaling 4.8 trillion tokens (including 10% multilingual), and is a combination of \n1) Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code; \n2) Newly created synthetic, \u201ctextbook-like\u201d data for the purpose of teaching math, coding, common sense reasoning, general knowledge of the world (science, daily activities, theory of mind, etc.); \n3) High quality chat format supervised data covering various topics to reflect human preferences on different aspects such as instruct-following, truthfulness, honesty and helpfulness.\n\nWe are focusing on the quality of data that could potentially improve the reasoning ability for the model, and we filter the publicly available documents to contain the correct level of knowledge. As an example, the result of a game in premier league in a particular day might be good training data for frontier models, but we need to remove such information to leave more model capacity for reasoning for the small size models. More details about data can be found in the [Phi-3 Technical Report](https://aka.ms/phi3-tech-report).",
        "summary": "Same Phi-3-small model, but with a larger context size for RAG or few shot prompting.",
        "tags": [
            "reasoning",
            "understanding",
            "large context"
        ]
    },
    {
        "id": "azureml://registries/azureml/models/Phi-3-small-8k-instruct/versions/3",
        "name": "Phi-3-small-8k-instruct",
        "friendly_name": "Phi-3-small instruct (8k)",
        "model_version": 3,
        "publisher": "microsoft",
        "model_family": "microsoft",
        "model_registry": "azureml",
        "license": "mit",
        "task": "chat-completion",
        "description": "The Phi-3-Small-8K-Instruct is a 7B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered publicly available websites data with a focus on high-quality and reasoning dense properties. The model supports 8K context length (in tokens).\n\nThe model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization for the instruction following and safety measures.\nWhen assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Small-8K-Instruct showcased a robust and state-of-the-art performance among models of the same-size and next-size-up.\n\n## Resources\n\n\ud83c\udfe1 [Phi-3 Portal](https://azure.microsoft.com/en-us/products/phi-3) <br>\n\ud83d\udcf0 [Phi-3 Microsoft Blog](https://aka.ms/Phi-3Build2024) <br>\n\ud83d\udcd6 [Phi-3 Technical Report](https://aka.ms/phi3-tech-report) <br>\n\ud83d\udee0\ufe0f [Phi-3 on Azure AI Studio](https://aka.ms/phi3-azure-ai) <br>\n\ud83d\udc69\u200d\ud83c\udf73 [Phi-3 Cookbook](https://github.com/microsoft/Phi-3CookBook) <br>\n\n## Model Architecture\n\nPhi-3 Small-8K-Instruct has 7B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidelines.\n\n## Training Datasets\n\nOur training data includes a wide variety of sources, totaling 4.8 trillion tokens (including 10% multilingual), and is a combination of \n1) Publicly available documents filtered rigorously for quality, selected high-quality educational data, and code; \n2) Newly created synthetic, \u201ctextbook-like\u201d data for the purpose of teaching math, coding, common sense reasoning, general knowledge of the world (science, daily activities, theory of mind, etc.); \n3) High quality chat format supervised data covering various topics to reflect human preferences on different aspects such as instruct-following, truthfulness, honesty and helpfulness.\n\nWe are focusing on the quality of data that could potentially improve the reasoning ability for the model, and we filter the publicly available documents to contain the correct level of knowledge. As an example, the result of a game in premier league in a particular day might be good training data for frontier models, but we need to remove such information to leave more model capacity for reasoning for the small size models. More details about data can be found in the [Phi-3 Technical Report](https://aka.ms/phi3-tech-report).",
        "summary": "A 7B parameters model. Use this larger model for better quality than Phi-3-mini, and with reasoning-dense data.",
        "tags": [
            "reasoning",
            "understanding"
        ]
    },
    {
        "id": "azureml://registries/azureml/models/Phi-3.5-mini-instruct/versions/2",
        "name": "Phi-3.5-mini-instruct",
        "friendly_name": "Phi-3.5-mini instruct (128k)",
        "model_version": 2,
        "publisher": "microsoft",
        "model_family": "microsoft",
        "model_registry": "azureml",
        "license": "mit",
        "task": "chat-completion",
        "description": "Phi-3.5-mini is a lightweight, state-of-the-art open model built upon datasets used for Phi-3 - synthetic data and filtered publicly available websites - with a focus on very high-quality, reasoning dense data. The model belongs to the Phi-3 model family and supports 128K token context length. The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure precise instruction adherence and robust safety measures.\n\n### Resources\n\ud83c\udfe1 [Phi-3 Portal](https://azure.microsoft.com/en-us/products/phi-3) <br>\n\ud83d\udcf0 [Phi-3 Microsoft Blog](https://aka.ms/phi3.5-techblog) <br>\n\ud83d\udcd6 [Phi-3 Technical Report](https://arxiv.org/abs/2404.14219) <br>\n\ud83d\udc69\u200d\ud83c\udf73 [Phi-3 Cookbook](https://github.com/microsoft/Phi-3CookBook) <br>\n\n### Model Architecture\nPhi-3.5-mini has 3.8B parameters and is a dense decoder-only Transformer model using the same tokenizer as Phi-3 Mini. It is a text-only model best suited for prompts using chat format.\n\n### Training Data\nPhi-3.5-mini is a static model trained on an offline dataset with 3.4T tokens and a cutoff date October 2023 for publicly available data. Future versions of the tuned models may be released as we improve models.\n",
        "summary": "Refresh of Phi-3-mini model",
        "tags": [
            "reasoning",
            "understanding",
            "low latency"
        ]
    }
]
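
For context, here is a minimal TypeScript sketch (not the actual LobeChat code; `toChatModelCards` and `fetchGitHubModels` are hypothetical names) of how this response could be mapped to chat model cards. Note that the endpoint returns a bare JSON array; as an assumption about the failure mode, a fetcher expecting an OpenAI-style `{ data: [...] }` envelope would read `undefined` and end up with "0 models available in total":

```ts
// Sketch only: field names are taken from the response pasted above;
// the helper names are hypothetical and not from the LobeChat codebase.
interface GitHubModel {
  id: string;
  name: string;
  friendly_name: string;
  task: string; // e.g. "chat-completion" or "embeddings"
  tags: string[];
}

interface ChatModelCard {
  id: string;
  displayName: string;
}

// Keep only chat models and map them to the card shape used in the UI.
const toChatModelCards = (models: GitHubModel[]): ChatModelCard[] =>
  models
    .filter((m) => m.task === 'chat-completion')
    .map((m) => ({ id: m.name, displayName: m.friendly_name }));

const fetchGitHubModels = async (token: string): Promise<ChatModelCard[]> => {
  const res = await fetch('https://models.inference.ai.azure.com/models', {
    headers: { Authorization: `Bearer ${token}` },
  });
  const json = await res.json();
  // Guard against both shapes: a bare array (what this endpoint returns)
  // and an OpenAI-style `{ data: [...] }` envelope.
  const list: GitHubModel[] = Array.isArray(json) ? json : (json?.data ?? []);
  return toChatModelCards(list);
};
```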
ignaciocastro added the 🐛 Bug Something isn't working | 缺陷 label Oct 29, 2024
github-project-automation bot moved this to Roadmap - Chat 1.x in Lobe Chat Routine Oct 29, 2024

dosubot bot commented Oct 29, 2024

Hey there, @ignaciocastro! I'm here to help you with bug fixes, questions, and more while you wait for a human maintainer. Let's squash those bugs together! 🐞

The issue you're experiencing, where pressing the "Get Model List" button results in "0 models available in total" and clears out already set models, could be due to several factors. Here are some potential causes and solutions:

  1. Data Fetching and Transformation: Ensure that the data fetching and transformation logic is correctly implemented. The application uses functions like extractEnabledModels and transformToChatModelCards to handle model data. Verify that these functions are correctly processing the data from GitHub [1].

  2. Error Handling: Check if there are any errors during the fetching process. The application has error handling mechanisms in place, such as using a try-catch block in the getModels method of the OllamaService class. Ensure that any errors are being caught and handled appropriately [2].

  3. Caching Issues: The application uses useSWR for data fetching, which includes caching. Ensure that the cache is not causing stale data to be displayed. You might need to adjust the revalidateOnFocus and revalidateOnMount options to ensure fresh data is fetched [3].

  4. UI Update Logic: Verify that the UI update logic is correctly implemented. The useFetchProviderModelList function should update the model list in the UI once new models are retrieved. Ensure that the onSuccess callback is correctly updating the state and refreshing the model list (a rough sketch of this pattern follows below) [3].

  5. Configuration and State Management: Check if the configuration or state management is inadvertently clearing the models. Ensure that the logic for setting and maintaining the model list state is robust and not resetting the list unexpectedly.

By reviewing these areas, you should be able to identify and resolve the issue causing the model list to display as empty.
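
As a rough illustration of points 3 and 4, here is a sketch of the SWR pattern involved, assuming a generic setup rather than the actual useFetchProviderModelList implementation:

```ts
// Sketch only: the hook name and option choices are illustrative assumptions,
// not the real LobeChat implementation.
import useSWR from 'swr';

interface ChatModelCard {
  id: string;
  displayName: string;
}

const useProviderModelListSketch = (
  provider: string,
  fetcher: () => Promise<ChatModelCard[]>,
  onUpdate: (models: ChatModelCard[]) => void,
) =>
  useSWR(['provider-model-list', provider], fetcher, {
    // Revalidate on mount so a stale cache entry does not mask fresh data.
    revalidateOnMount: true,
    // Avoid surprise refetches (and possible state resets) on window focus.
    revalidateOnFocus: false,
    // Only overwrite the stored list when models actually came back;
    // writing an empty array would wipe the models the user already set.
    onSuccess: (models) => {
      if (models.length > 0) onUpdate(models);
    },
  });
```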

To continue talking to Dosu, mention @dosu.
