sshh12 / multi_token (150 stars)
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
Topics: multimodal, multi-modality, large-language-models, llm, vision-language-model, llava, large-context, large-multimodal-models
Updated Mar 27, 2024. Language: Python.