Rename lmm-tools to vision-agent #5

Merged 2 commits on Feb 26, 2024
2 changes: 1 addition & 1 deletion .github/workflows/ci_cd.yml
@@ -42,7 +42,7 @@ jobs:
poetry run black --check --diff --color .
- name: Type Checking
run: |
-poetry run mypy lmm_tools
+poetry run mypy vision_agent
- name: Test with pytest
run: |
poetry run pytest -v tests
16 changes: 8 additions & 8 deletions README.md
@@ -1,14 +1,14 @@
-# Large Multimodal Model Tools
-LMM-Tools (Large Multmodal Model Tools) is a minimal library that helps you utilize multimodal models to organize and structure your image data. One of the problems of dealing with image data is it can be difficult to organize and quickly search. For example, you might have a bunch of pictures of houses and want to count how many yellow houses you have, or how many houses with adobe roofs. This library utilizes LMMs to help create these tags or descriptions and allow you to search over them, or use them in a database to do other operations.
+# Vision Agent
+Vision Agent is a minimal library that helps you utilize multimodal models to organize and structure your image data. One of the problems of dealing with image data is it can be difficult to organize and quickly search. For example, you might have a bunch of pictures of houses and want to count how many yellow houses you have, or how many houses with adobe roofs. This library utilizes LMMs to help create these tags or descriptions and allow you to search over them, or use them in a database to do other operations.

## Getting Started
### LMMs
To get started you can create an LMM and start generating text from images. The following code will grab the LLaVA-1.6 34B model and generate a description of the image you pass it.

```python
-import lmm_tools as lmt
+import vision_agent as va

-model = lmt.lmm.get_model("llava")
+model = va.lmm.get_model("llava")
model.generate("Describe this image", "image.png")
>>> "A yellow house with a green lawn."
```
@@ -19,13 +19,13 @@ model.generate("Describe this image", "image.png")
You can use the `DataStore` class to store your images, add new metadata to them such as descriptions, and search over different columns.

```python
-import lmm_tools as lmt
+import vision_agent as va
import pandas as pd

df = pd.DataFrame({"image_paths": ["image1.png", "image2.png", "image3.png"]})
-ds = lmt.data.DataStore(df)
-ds = ds.add_lmm(lmt.lmm.get_model("llava"))
-ds = ds.add_embedder(lmt.emb.get_embedder("sentence-transformer"))
+ds = va.data.DataStore(df)
+ds = ds.add_lmm(va.lmm.get_model("llava"))
+ds = ds.add_embedder(va.emb.get_embedder("sentence-transformer"))

ds = ds.add_column("descriptions", "Describe this image.")
```
2 changes: 1 addition & 1 deletion docs/index.md
@@ -13,5 +13,5 @@ This library provides a set of tools to help you build applications with Large M
First, install the library:

```bash
-pip install lmm-tools
+pip install vision-agent
```
10 changes: 5 additions & 5 deletions pyproject.toml
@@ -3,17 +3,17 @@ requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

[tool.poetry]
-name = "lmm-tools"
+name = "vision-agent"
version = "0.0.8"
-description = "Toolset for Large Multi-Modal Models"
+description = "Toolset for Vision Agent"
authors = ["Landing AI <[email protected]>"]
readme = "README.md"
-packages = [{include = "lmm_tools"}]
+packages = [{include = "vision_agent"}]

[tool.poetry.urls]
"Homepage" = "https://landing.ai"
-"repository" = "https://github.com/landing-ai/lmm-tools"
-"documentation" = "https://github.com/landing-ai/lmm-tools"
+"repository" = "https://github.com/landing-ai/vision-agent"
+"documentation" = "https://github.com/landing-ai/vision-agent"

[tool.poetry.dependencies] # main dependency group
python = ">=3.10,<4.0"
2 changes: 1 addition & 1 deletion tests/test_data.py
@@ -5,7 +5,7 @@
import pandas as pd
import pytest

-from lmm_tools.data import DataStore, build_data_store
+from vision_agent.data import DataStore, build_data_store


@pytest.fixture(autouse=True)
File renamed without changes.
File renamed without changes.
File renamed without changes.
4 changes: 2 additions & 2 deletions lmm_tools/data/data.py → vision_agent/data/data.py
@@ -12,8 +12,8 @@
from tqdm import tqdm
from typing_extensions import Self

-from lmm_tools.emb import Embedder
-from lmm_tools.lmm import LMM
+from vision_agent.emb import Embedder
+from vision_agent.lmm import LMM

tqdm.pandas()

File renamed without changes.
File renamed without changes.
File renamed without changes.
6 changes: 4 additions & 2 deletions lmm_tools/lmm/lmm.py → vision_agent/lmm/lmm.py
@@ -1,9 +1,11 @@
import base64
-import requests
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, List, Optional, Union, cast
-from lmm_tools.config import BASETEN_API_KEY, BASETEN_URL
+
+import requests
+
+from vision_agent.config import BASETEN_API_KEY, BASETEN_URL


def encode_image(image: Union[str, Path]) -> str:
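A rename like this one touches imports, docs, CI, and packaging metadata, so it can help to scan the tree for leftover references to the old name before merging. The snippet below is only a rough sketch of such a check and is not part of this PR: the helper `find_stale_references`, the `OLD_NAME` pattern, and the list of file suffixes are hypothetical and chosen for illustration. The only names taken from the diff are the old spellings `lmm_tools` and `lmm-tools`.

```python
import re
from pathlib import Path
from typing import Iterator, Sequence, Tuple

# Matches both spellings of the old name seen in the diff: lmm_tools and lmm-tools.
OLD_NAME = re.compile(r"lmm[_-]tools")


def find_stale_references(
    root: Path,
    suffixes: Sequence[str] = (".py", ".md", ".toml", ".yml"),  # assumed file types
) -> Iterator[Tuple[Path, int, str]]:
    """Yield (path, line number, stripped line) for lines that still mention the old name."""
    for path in sorted(root.rglob("*")):
        # Skip directories and file types we did not ask for.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if OLD_NAME.search(line):
                yield path, lineno, line.strip()
```

Calling `list(find_stale_references(Path(".")))` from the repository root would list any remaining hits. An empty result suggests the rename is complete for the scanned file types, although it says nothing about files with other extensions.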