multimodal

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Nov 22, 2024
Python

mediar-ai / screenpipe

Star

rewind.ai x cursor.com = your AI assistant that has all the context. 24/7 screen & voice recording for the age of super intelligence. get your data ready or be left behind

machine-learning ai computer-vision ml vision multimodal llm

Updated Nov 22, 2024
Rust

bentoml / BentoML

Star

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

python machine-learning deep-learning model-serving multimodal mlops ml-engineering ai-inference llm generative-ai llmops llm-serving model-inference-service llm-inference inference-platform

Updated Nov 22, 2024
Python

rerun-io / rerun

Star

Visualize streams of multimodal data. Free, fast, easy to use, and simple to integrate. Built in Rust.

visualization python rust computer-vision cpp robotics multimodal

Updated Nov 21, 2024
Rust

Generative AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. It features AI personas, AGI functions, multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, much more. Deploy on-prem or in the cloud.

ui beam agi openai gpt mistral multimodal groq openai-api gpt-4 large-language-models stable-diffusion generative-ai chatgpt chatgpt-ui gpt-5 anthropic

Updated Nov 21, 2024
TypeScript

facebookresearch / mmf

Star

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

deep-learning dialog pytorch vqa pretrained-models captioning multimodal multi-tasking textvqa hateful-memes

Updated Nov 15, 2024
Python

SkalskiP / courses

Sponsor

Star

This repository is a curated collection of links to various courses and resources about Artificial Intelligence (AI)

nlp machine-learning natural-language-processing tutorial deep-neural-networks computer-vision deep-learning transformers generative-model multimodal mlops stable-diffusion

Updated Apr 22, 2024
Python

swyxio / ai-notes

Star

notes for software engineers getting up to speed on new AI developments. Serves as datastore for https://latent.space writing, and product brainstorming, but has cleaned up canonical references under the /Resources folder.

ai openai gpt multimodal gpt-3 prompt-engineering stable-diffusion

Updated Nov 20, 2024
HTML

kyegomez / tree-of-thoughts

Sponsor

Star

Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by atleast 70%

deep-learning prompt artificial-intelligence multimodal gpt4 prompt-learning prompt-tuning prompt-engineering chatgpt

Updated Oct 29, 2024
Python

modelscope / ms-swift

Star

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)