Awesome Personalized Large Multimodal Models

📝 A curated list about Personalized Multimodal Models and related resources~ 📚


Problem Setting: Using 3-5 images of a novel concept/subject (e.g., a pet named `<bo>`), can we personalize Large Multimodal Models so that they (1) retain their original capabilities (e.g., Describe a dog) while (2) gaining capabilities tailored to the novel concept (e.g., Describe `<bo>`)?

Over the years, we’ve witnessed the evolution of personalization across various tasks (e.g., object segmentation, image generation).
Now, with the rise of Large Multimodal Models (LMMs), we have the opportunity to personalize these generalist, large-scale AI systems.
It’s time to take the leap and bring personalization into the realm of Large Multimodal Models, making them not only powerful but also user-specific!

^ The caption above was actually generated by GPT-4o: I fed it the figure and asked it to write a caption, haha!

(This figure was created by me. If anything is incorrect, please feel free to correct me! Thank you!)

Papers

⚠️ Minor Note: The works listed below focus on settings where users provide 3-5 images and the system needs to learn those concepts. There is also research on other subtopics (e.g., role-playing, persona); for those topics, this repo might provide better coverage.

Title	Venue	Year	Input	Output	Link / Code
[paper title]	xx	2024	image, text	image, text
─── Vision Language Model ───
MC-LLaVA: Multi-Concept Personalized Vision-Language Model	arXiv	2024	image, text	text	Code
Retrieval-Augmented Personalization for Multimodal Large Language Models	arXiv	2024	image, text	text	Page, Code
Yo'LLaVA: Your Personalized Language and Vision Assistant	NeurIPS	2024	image, text	text	Page, Code
MyVLM: Personalizing VLMs for user-specific queries	ECCV	2024	image, text	text	Page, Code
─── Large Language Models ───
Personalized Large Language Models	ICDMw	2024	text	text
LaMP: When Large Language Models Meet Personalization	ACL	2024	text	text	Page, Code
Learning to Predict Persona Information for Dialogue Personalization without Explicit Persona Description	ACL	2023	text	text
Call for Customized Conversation: Customized Conversation Grounding Persona and Knowledge	AAAI	2022	text	text	Code
A Personalized Dialogue Generator with Implicit User Persona Detection	COLING	2022	text	text
Personalizing Dialogue Agents: I have a dog, do you have pets too?	ACL	2018	text	text

Datasets

Name	Year	# Concepts	Link	Notes
MC-LLaVA	2024	--	GitHub	with MC-LLaVA, multiple concepts
Yo'LLaVA	2024	40	GitHub	with Yo'LLaVA, single concept
MyVLM	2024	29	GitHub	with MyVLM, single concept

Applications

Memory and new controls for ChatGPT

⣶⣶⣶⣶⣶⣖⣒⡄⠀⣶⡖⠲⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⣤⠠⡄⠀⠀⠀⠀
⠙⠛⣿⣿⣿⡟⠛⠃⢀⣿⣿⣆⣦⣴⠂⠤⠀⠀⠀⣠⣤⣴⣆⠠⢄⠀⠀⠀⣤⡤⢤⣤⣤⠤⢄⠀⠀⢻⣿⣦⡇⢀⣤⢤⠀
⠀⢀⣿⣿⣿⡇⠀⠀⢸⣿⣿⣿⠛⣿⣷⣄⡇⠀⣼⣿⣿⡟⢿⣷⡄⣣⠀⢘⣿⣿⣿⠿⣿⣧⣈⡆⠀⢹⣿⣿⣷⣾⣧⣴⠀
⠀⢰⣿⣿⣿⠀⠀⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⠙⠛⣻⣧⣾⣿⣿⡷⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⢸⣿⣿⣿⣿⣿⡇⠀
⠀⢸⣿⣿⣿⠀⠀⠀⢸⣿⣿⡿⠀⣿⣿⣿⠃⠀⣰⣾⣿⡿⣿⣿⣿⣟⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⢸⣿⣿⣿⣿⡏⢇⠀
⠀⣼⣿⣿⣿⠀⠀⠀⣸⣿⣿⣟⢠⣿⣿⣿⠀⠀⣿⣿⡟⣇⣾⣿⣿⣯⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⢼⣿⣿⣿⣿⣷⡈⡀
⠀⠻⠿⠿⠟⠀⠀⠀⠻⠿⠿⠏⠸⣿⣿⣿⠀⠀⢿⣿⣿⣿⣿⣿⣿⡇⠀⢸⣿⣿⣿⠀⣿⣿⣿⡇⠀⣿⣿⣿⡟⢻⣿⣧⣇
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠀⠀⠉⠉⠀⠀⠀⠉⠉⠁⠀⠉⠉⠉⠀⠀⠘⠙⠋⠁⠈⠋⠛⠉
⠀⠀⠀⠀⠀⠀⢀⣠⣤⡀⠀⢀⣀⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣤⡤⠠⡄⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⢹⣿⣄⠱⣠⣿⣧⣴⠀⠀⣠⣤⣤⣀⣀⡀⠀⠀⢀⣤⠤⡀⢀⣠⡤⢄⠀⠈⣿⣿⣦⡇⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠈⢿⣿⣷⣿⣿⣿⡏⠀⣾⣿⣿⣿⣶⣄⡉⡄⠀⣿⣿⣤⣝⢸⣿⣦⣼⠀⠀⣿⣿⣿⡇⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢿⣿⣿⣿⠏⠀⠐⣿⣿⣿⠉⣿⣿⣷⡇⠀⣽⣿⣿⣯⢸⣿⣿⣿⠀⠀⢹⣿⣿⡇⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢸⣿⣿⣿⠀⠀⢠⣿⣿⣿⠀⣿⣿⣿⡇⠀⣻⣿⣿⡷⢸⣿⣿⣿⠀⠀⢸⣿⣿⠇⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢸⣿⣿⣿⠀⠀⠀⢿⣿⣿⣄⣿⣿⣿⠇⠀⢹⣿⣿⣿⣸⣿⣿⣿⠀⠀⢠⣽⣧⡄⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠛⠛⠋⠀⠀⠀⠈⠛⠛⠛⠛⠛⠉⠀⠀⠈⠛⠛⠛⠋⠛⠛⠋⠀⠀⠈⠛⠛⠁⠀⠀⠀⠀⠀⠀⠀

And good luck with your research! 🤗✨


thaoshibe/awesome-personalized-lmms