Orchestrate Swarms of Agents From Any Framework, Such as OpenAI, LangChain, and More, for Business Operations Automation. Join our Community: https://discord.gg/DbjBMJTSWD
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
An open-source implementation of LLaVA-NeXT.
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
✨✨Latest Papers and Datasets on Multimodal Large Language Models and Their Evaluation.
Simple Implementation of a Transformer in Apple's New MLX Framework
The open-source implementation of Google's Gemini, the model said to "eclipse ChatGPT"
Multi-Modal Tree of Thoughts for DALLE-3-Like Automatic Self-Improvement
Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta (a minimal sketch of the expert-routing idea appears after this list)
Implementation of MambaByte from the paper "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta
My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"
Implementation of the Q-Former from BLIP-2 in Zeta Lego blocks.
Algorithms and Publications on 3D Object Tracking
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Toward a Multi-Modality Language Model: An Implementation of GPT-4o/Project Astra
The official repository for the Vista dataset, a Vietnamese multimodal dataset containing more than 700,000 samples of conversations and images
Official code for NeurIPS2023 paper: CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection
[CVPR2024 Highlight] Official Code for "ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object"
PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
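
Several entries above build on mixture-of-experts routing, most directly the MoE-Mamba implementation. As a rough illustration of the core idea only, here is a minimal top-1 MoE feed-forward layer in PyTorch; the class name, parameters, and defaults are hypothetical and are not taken from any of the listed repositories.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Minimal top-1 mixture-of-experts feed-forward layer.

    A hypothetical sketch of the routing idea behind MoE layers;
    not code from the MoE-Mamba repository or the Zeta library.
    """

    def __init__(self, dim: int, num_experts: int = 4, hidden: int = 256):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # router: one logit per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = x.reshape(-1, x.shape[-1])           # flatten (batch, seq, dim) -> (tokens, dim)
        probs = F.softmax(self.gate(tokens), dim=-1)  # routing probabilities per token
        weight, expert_idx = probs.max(dim=-1)        # top-1 expert and its gate weight
        out = torch.zeros_like(tokens)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i                    # tokens routed to expert i
            if mask.any():
                # scale each token's expert output by its gate probability
                out[mask] = weight[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape_as(x)

# Usage: route a dummy batch of 2 sequences, 10 tokens each, through the layer.
moe = Top1MoE(dim=64)
print(moe(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```

Top-1 routing keeps the per-token compute close to a single expert's feed-forward cost; production implementations typically add load-balancing losses and batched dispatch in place of the per-expert Python loop used here.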