[Bug]: Using v1.0.3 client/server with a database created by v0.6.3 client/server causes chromadb.errors.InternalError #4217

Open
Tracked by #36
Davidyz opened this issue Apr 8, 2025 · 15 comments
Labels
bug Something isn't working

@Davidyz

Davidyz commented Apr 8, 2025

What happened?

A collection.get() call using v1.0.3 client and v1.0.3 server on a collection created by v0.6.3 chromadb produced a chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor.

This is the relevant code snippet:

await collection.get(
    where={"path": full_path_str},
    include=["metadatas"],
)

Aside from this error, I'm wondering whether a database created by 0.x.x chromadb is supposed to "just work"? Also, what's the expected outcome if a user has a mismatched chromadb client and server? I'm experiencing errors, and at the same time, I couldn't find many useful guides on what to expect over the upgrade (from 0.x.x to 1.x.x). The Discord channel (migrations) hasn't been updated for a few months. I'm a bit lost at this point because I don't know what to expect when trying to bump the chromadb version for my project.
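A downstream tool could guard against a mismatched client and server by comparing major versions before issuing any queries. The following is only a minimal sketch: the helper names are hypothetical, the two strings are assumed to come from the installed client package and from whatever version the server reports, and treating a differing major version as incompatible is an assumption rather than documented chromadb behaviour.

```python
import re


def major_version(version: str) -> int:
    """Extract the leading major number from a string such as '1.0.3' or 'v0.6.3'."""
    match = re.match(r"\s*v?(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1))


def same_major(client_version: str, server_version: str) -> bool:
    """Return True when both version strings share the same major number."""
    return major_version(client_version) == major_version(server_version)


if __name__ == "__main__":
    # Hypothetical inputs mirroring the versions mentioned in this issue.
    print(same_major("1.0.3", "1.0.4"))
    print(same_major("0.6.3", "1.0.3"))
```

A check like this only compares version strings; it says nothing about whether the on-disk database was migrated correctly.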

Versions

Collection created by chromadb v0.6.3 and accessed by chromadb v1.0.3, on python 3.13, arch Linux

Relevant log output

Traceback (most recent call last):
  File "/home/davidyz/git/VectorCode/src/vectorcode/main.py", line 85, in async_main
    return_val = await vectorise(final_configs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 192, in vectorise
    await task
  File "/usr/lib/python3.13/asyncio/tasks.py", line 634, in _wait_for_one
    return f.result() if resolve else f
           ~~~~~~~~^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 43, in chunked_add
    await collection.get(
    ...<2 lines>...
    )
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/models/AsyncCollection.py", line 127, in get
    get_results = await self._client._get(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 471, in _get
    resp_json = await self._make_request(
                ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<11 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 149, in _make_request
    BaseHTTPClient._raise_chroma_error(response)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/base_http_client.py", line 96, in _raise_chroma_error
    raise chroma_error
chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor
@Davidyz Davidyz added the bug Something isn't working label Apr 8, 2025
@tazarov
Contributor

tazarov commented Apr 8, 2025

@Davidyz, this is likely an HNSW initialization error. Rust errors are a little harder to trace, so I've just added a PR, #4219. I'll come back with a way to debug this further.

@tazarov
Contributor

tazarov commented Apr 8, 2025

@Davidyz, try to build a new server image with the following Dockerfile:

FROM rust:1.81.0 AS builder

ARG RELEASE_MODE=release

WORKDIR /chroma/

ENV PROTOC_ZIP=protoc-25.1-linux-x86_64.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v25.1/$PROTOC_ZIP \
    && unzip -o $PROTOC_ZIP -d /usr/local bin/protoc \
    && unzip -o $PROTOC_ZIP -d /usr/local 'include/*' \
    && rm -f $PROTOC_ZIP

RUN git clone --branch trayan-04-08-fix_improving_backfil_error_propagation https://github.com/chroma-core/chroma.git .

RUN apt-get update && \
    apt-get install -y python3.11-dev

# Build dependencies first (for better caching)
RUN cargo build --bin chroma --release

#FROM gcr.io/distroless/cc-debian12
FROM debian:bookworm-slim

# Copy the binary from the build stage
COPY --from=builder /chroma/target/release/chroma /usr/local/bin/
RUN apt-get update && \
    apt-get install -y python3.11-dev
EXPOSE 8000

ENTRYPOINT ["chroma"]
CMD ["run","--path","/data","--host","0.0.0.0"]

Build it:

docker build -t chromarust -f Dockerfile .

Then run your server:

docker run -v ./<local_dir>:/data  -p 8000:8000 chromarust

Test it out and let me know what error you get then.

@Davidyz
Author

Davidyz commented Apr 10, 2025

Hi @tazarov, thanks for the instructions, and apologies for the late reply. Here's the error output using the v1.0.3 client (the server is built from the Dockerfile provided above):

Traceback (most recent call last):
  File "/home/davidyz/git/VectorCode/src/vectorcode/main.py", line 85, in async_main
    return_val = await vectorise(final_configs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 192, in vectorise
    await task
  File "/usr/lib/python3.13/asyncio/tasks.py", line 634, in _wait_for_one
    return f.result() if resolve else f
           ~~~~~~~~^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 43, in chunked_add
    await collection.get(
    ...<2 lines>...
    )
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/models/AsyncCollection.py", line 127, in get
    get_results = await self._client._get(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 471, in _get
    resp_json = await self._make_request(
                ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<11 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 149, in _make_request
    BaseHTTPClient._raise_chroma_error(response)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/base_http_client.py", line 96, in _raise_chroma_error
    raise chroma_error
chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor: Error reading from metadata segment reader

EDIT: this error persists on v1.0.4 client

@tazarov
Contributor

tazarov commented Apr 10, 2025

hey @Davidyz, thanks for providing the logs. I see that now we do get, as anticipated, more feedback from the error messages. Looking at the code, this seems to be related to sqlite3.

Can I bother you to build another image from a new branch, where I've added propagation of sqlite3-related errors:

FROM rust:1.81.0 AS builder

ARG RELEASE_MODE=release

WORKDIR /chroma/

ENV PROTOC_ZIP=protoc-25.1-linux-x86_64.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v25.1/$PROTOC_ZIP \
    && unzip -o $PROTOC_ZIP -d /usr/local bin/protoc \
    && unzip -o $PROTOC_ZIP -d /usr/local 'include/*' \
    && rm -f $PROTOC_ZIP

RUN git clone --branch trayan-04-10-chore_local_compation_manager_error_propagation_for_sqlite https://github.com/chroma-core/chroma.git .

RUN apt-get update && \
    apt-get install -y python3.11-dev

# Build dependencies first (for better caching)
RUN cargo build --bin chroma --release

#FROM gcr.io/distroless/cc-debian12
FROM debian:bookworm-slim

# Copy the binary from the build stage
COPY --from=builder /chroma/target/release/chroma /usr/local/bin/
RUN apt-get update && \
    apt-get install -y python3.11-dev
EXPOSE 8000

ENTRYPOINT ["chroma"]
CMD ["run","--path","/data","--host","0.0.0.0"]

@Davidyz
Author

Davidyz commented Apr 10, 2025

Hi @tazarov, thanks for the patch. Here's the new error message:

Traceback (most recent call last):
  File "/home/davidyz/git/VectorCode/src/vectorcode/main.py", line 85, in async_main
    return_val = await vectorise(final_configs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 192, in vectorise
    await task
  File "/usr/lib/python3.13/asyncio/tasks.py", line 634, in _wait_for_one
    return f.result() if resolve else f
           ~~~~~~~~^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 43, in chunked_add
    await collection.get(
    ...<2 lines>...
    )
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/models/AsyncCollection.py", line 127, in get
    get_results = await self._client._get(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 471, in _get
    resp_json = await self._make_request(
                ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<11 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 149, in _make_request
    BaseHTTPClient._raise_chroma_error(response)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/base_http_client.py", line 96, in _raise_chroma_error
    raise chroma_error
chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor: Error reading from metadata segment reader: error occurred while decoding column 0: mismatched types; Rust type `u64` (as SQL type `INTEGER`) is not compatible with SQL type `BLOB`

This also applies to both 1.0.3 and 1.0.4

@tazarov
Contributor

tazarov commented Apr 10, 2025

Hey @Davidyz, thanks so much for the quick turnaround. Once again, I much appreciate you sticking with this, and I can understand your frustration. As you point out, things should "just work", and indeed that is the intent here :). The reality, though, is that bugs do creep in from time to time.

The error above starts to make sense now. My hypothesis at this point is a misbehaving DB migration. Can I bother you for one last bit of feedback:

sqlite3 <persist_dir>/chroma.sqlite3 'select segment_id, hex(seq_id)  from max_seq_id;'

The above command will output something like this:

95f13d36-824b-4256-a9ce-d3dbd63a6a5e|0000000000001452
265719aa-cd49-486f-b6c9-35590ee09bcd|0000000000001388
63a8e0d4-8f05-42c9-a14b-ccf411b28abb|00000000000027DA
31c2b40c-10a1-47e5-bd40-e24903bdb925|00000000000028A4

Also this:

sqlite3 <persist_dir>/chroma.sqlite3 'select dir,version,filename,hash from migrations;'

Resulting in something like this:

embeddings_queue|1|00001-embeddings.sqlite.sql|d3755dfd232be8e8301f4d7fcfb3a486
embeddings_queue|2|00002-embeddings-queue-config.sqlite.sql|8fbfe4ffb3e57f1d8bfdc58510a82e85
sysdb|1|00001-collections.sqlite.sql|38352d725ad1c16074fac420b22b4633
sysdb|2|00002-segments.sqlite.sql|2913cb6a503055a95f625448037e8912
sysdb|3|00003-collection-dimension.sqlite.sql|42d22d0574d31d419c2a0e7f625c93aa
sysdb|4|00004-tenants-databases.sqlite.sql|048867ce8fcdefe4023c7110e4433591
sysdb|5|00005-remove-topic.sqlite.sql|b1367c826b8fba5f96f27befdc1d42d2
sysdb|6|00006-collection-segment-metadata.sqlite.sql|4eea7468935bf25d4604a0fed2366116
sysdb|7|00007-collection-config.sqlite.sql|1c7e63bba346a42a18b6ab7f1c989bed
sysdb|8|00008-maintenance-log.sqlite.sql|0a0e7e93111a01789addf64961c6127c
sysdb|9|00009-segment-collection-not-null.sqlite.sql|054355aef9e63702bf54ea29e61563f1
metadb|1|00001-embedding-metadata.sqlite.sql|2b4cf52c4bb2676e21d6860a4409f856
metadb|2|00002-embedding-metadata.sqlite.sql|12a570f7121b3a8ce750a2a7c36da20f
metadb|3|00003-full-text-tokenize.sqlite.sql|f97ad6334aeaa8f419f01110b648b97a
metadb|4|00004-metadata-indices.sqlite.sql|fb36603a45ee2cd0254cef3ef86585e8
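The two queries above can also be run from Python with the standard-library sqlite3 module, which makes the storage type of seq_id explicit through SQLite's typeof() function. This is a sketch under assumptions: the function names are hypothetical, and the table and column names are taken only from the commands quoted in this thread.

```python
import sqlite3


def inspect_max_seq_id(conn: sqlite3.Connection) -> list[tuple]:
    """Return (segment_id, storage_type, hex_value) for every row of max_seq_id."""
    return conn.execute(
        "SELECT segment_id, typeof(seq_id), hex(seq_id) FROM max_seq_id"
    ).fetchall()


def unmigrated_segments(conn: sqlite3.Connection) -> list[str]:
    """Return the segment ids whose seq_id is still stored as a blob."""
    rows = conn.execute(
        "SELECT segment_id FROM max_seq_id WHERE typeof(seq_id) = 'blob'"
    ).fetchall()
    return [segment_id for (segment_id,) in rows]


def applied_migrations(conn: sqlite3.Connection) -> list[tuple]:
    """Return (dir, version, filename) for every recorded migration."""
    return conn.execute(
        "SELECT dir, version, filename FROM migrations ORDER BY dir, version"
    ).fetchall()
```

To use it, open the database with sqlite3.connect("<persist_dir>/chroma.sqlite3") and pass the connection to each function. A non-empty result from unmigrated_segments would match the BLOB-versus-INTEGER mismatch reported in the error message.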

@Davidyz
Author

Davidyz commented Apr 10, 2025

@tazarov, here is the output for the commands that you requested:

553d87e7-287f-4473-96f5-5e87fb692677|000000000000022C
embeddings_queue|1|00001-embeddings.sqlite.sql|d3755dfd232be8e8301f4d7fcfb3a486
embeddings_queue|2|00002-embeddings-queue-config.sqlite.sql|8fbfe4ffb3e57f1d8bfdc58510a82e85
sysdb|1|00001-collections.sqlite.sql|38352d725ad1c16074fac420b22b4633
sysdb|2|00002-segments.sqlite.sql|2913cb6a503055a95f625448037e8912
sysdb|3|00003-collection-dimension.sqlite.sql|42d22d0574d31d419c2a0e7f625c93aa
sysdb|4|00004-tenants-databases.sqlite.sql|048867ce8fcdefe4023c7110e4433591
sysdb|5|00005-remove-topic.sqlite.sql|b1367c826b8fba5f96f27befdc1d42d2
sysdb|6|00006-collection-segment-metadata.sqlite.sql|4eea7468935bf25d4604a0fed2366116
sysdb|7|00007-collection-config.sqlite.sql|1c7e63bba346a42a18b6ab7f1c989bed
sysdb|8|00008-maintenance-log.sqlite.sql|0a0e7e93111a01789addf64961c6127c
sysdb|9|00009-segment-collection-not-null.sqlite.sql|054355aef9e63702bf54ea29e61563f1
metadb|1|00001-embedding-metadata.sqlite.sql|2b4cf52c4bb2676e21d6860a4409f856
metadb|2|00002-embedding-metadata.sqlite.sql|12a570f7121b3a8ce750a2a7c36da20f
metadb|3|00003-full-text-tokenize.sqlite.sql|f97ad6334aeaa8f419f01110b648b97a
metadb|4|00004-metadata-indices.sqlite.sql|fb36603a45ee2cd0254cef3ef86585e8
metadb|5|00005-max-seq-id-int.sqlite.sql|0e9de46758761b373ce682925edcc326

@tazarov
Contributor

tazarov commented Apr 10, 2025

Hey @Davidyz, much appreciated. This tells me two things:

metadb|5|00005-max-seq-id-int.sqlite.sql|0e9de46758761b373ce682925edcc326 - the migration from 0.6.3 -> 1.0.x is recorded as applied.

553d87e7-287f-4473-96f5-5e87fb692677|000000000000022C - the migration did not actually take effect, as the second column is still a blob (a big-endian encoded int).
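To make the "big-endian encoded int" remark concrete, the hex string printed by hex(seq_id) can be decoded back into the integer it represents. A minimal sketch, with a hypothetical helper name:

```python
def decode_seq_id(hex_blob: str) -> int:
    """Decode a hex-encoded big-endian blob into the integer it represents."""
    return int.from_bytes(bytes.fromhex(hex_blob), byteorder="big")


# The value reported above for segment 553d87e7-287f-4473-96f5-5e87fb692677.
print(decode_seq_id("000000000000022C"))  # 556
```

Per the error message, the 1.0.x server expects the column to hold an INTEGER value like this rather than the 8-byte blob.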

Good news is that I am able to reproduce your error 100% and the fix for it is relatively simple. Just run:

sqlite3 <persist_dir>/chroma.sqlite3 "delete from migrations where dir ='metadb' and filename='00005-max-seq-id-int.sqlite.sql';" 

Then run your query again.
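For tools that want to apply the same workaround programmatically, the delete statement above translates directly to the standard-library sqlite3 module. This is a sketch, not an official migration path: the function name is hypothetical, and stopping the server and copying chroma.sqlite3 beforehand is a precaution assumed here rather than something stated in this thread.

```python
import sqlite3

MIGRATION_DIR = "metadb"
MIGRATION_FILE = "00005-max-seq-id-int.sqlite.sql"


def forget_max_seq_id_migration(conn: sqlite3.Connection) -> int:
    """Delete the record of the max_seq_id migration and return the rows removed."""
    with conn:  # commits on success, rolls back if the statement raises
        cursor = conn.execute(
            "DELETE FROM migrations WHERE dir = ? AND filename = ?",
            (MIGRATION_DIR, MIGRATION_FILE),
        )
    return cursor.rowcount
```

Call it with a connection opened via sqlite3.connect("<persist_dir>/chroma.sqlite3"). A return value of 0 means the record was not present, so nothing was changed.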

While the above will solve the issue you are having, it doesn't explain how the migration script failed. It was possibly due to another error that you encountered along the way, which may have prevented the migration from applying. I'll keep digging.

@Davidyz
Author

Davidyz commented Apr 10, 2025

Thanks! I'm glad I was able to help!

I'm maintaining a code repository indexing tool that acts as a context provider (MCP tools) for LLM applications, and the current version pins chromadb to 0.6.3. Should I stay at 0.6.3 until this is fixed in a future release of chromadb? (If I unpin chromadb now, it'll work for new users, but for existing users, it might break their setup.)

@tazarov
Contributor

tazarov commented Apr 10, 2025

@Davidyz, Chroma 1.0 brings lots of improvements, the main one being performance. However, if you have a large base of existing users on 0.6.3, I would recommend pinning the version for the time being, until we figure out the root cause of the failure and possibly fix it.

I will try to reproduce the upgrade error. Are there any particular steps you followed in your upgrade process that might be useful? Or did you just unpin and install the latest available version?

@Davidyz
Author

Davidyz commented Apr 10, 2025

Yes I just unpinned the version number in my pyproject.toml.

One thing that may be unusual is that I personally deployed chromadb with systemd, which is essentially the same as starting it from the CLI (native install with pipx, without Docker). This worked perfectly fine for me before 1.0.0, so I chose to stick with it.

@jeffchuber
Contributor

@Davidyz should we keep this open or is everything ironed out for now?

@Davidyz
Author

Davidyz commented Apr 16, 2025

@Davidyz should we keep this open or is everything ironed out for now?

This bug (using old database with new client) is still there (at least in the latest release). @tazarov mentioned that he's trying to figure out what happened, so I think it's better to keep this open?

@tazarov
Contributor

tazarov commented Apr 16, 2025

@Davidyz, I tried to reproduce it with a 0.6.3 -> 1.0.x upgrade, with both the server (Docker) and the CLI, but in vain. I think there may be either a step or a separate process that holds locks on the sqlite3 database, but even then the whole application startup should generally fail, as the db will be locked.

Perhaps it is worth following your exact process:

Correct the above to match your sequence.

@Davidyz
Author

Davidyz commented Apr 19, 2025

Yes these steps are what I used, except that I used a user service that can be created and managed without sudo:

# ~/.config/systemd/user/chromadb.service
[Unit]
Description = Chroma Service
After = network.target

[Service]
Type = simple
WorkingDirectory = /opt/chromadb
ExecStart=/home/davidyz/.local/bin/chroma run --host 127.0.0.1 --port 8000 --path /opt/chromadb/data --log-path /var/log/chromadb.log

[Install]
WantedBy = default.target

and then

systemctl start --user chromadb

HammadB pushed a commit that referenced this issue Apr 21, 2025
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
- We also propagate errors from sqlite metadata reader - related to
debugging of #4217

## Test plan
*How are these changes tested?*

- [ ] Tests pass locally with `pytest` for python, `yarn test` for js,
`cargo test` for rust

## Documentation Changes
*Are all docstrings for user-facing APIs updated if required? Do we need
to make documentation changes in the [docs
repository](https://github.com/chroma-core/docs)?*