Added BGE embeddings model and Web Embeddings options for similarity search #30

Open
wants to merge 35 commits into
base: main

michaeljelly
Contributor

Everything seems to work for me. I added a couple of tests for BGE Embeddings.

Web embeddings work with Cloudflare's default deployment of bge-small-en-v1.5.

michaeljelly and others added 24 commits July 10, 2023 03:28
I think there's a concurrency issue:
`Thread 2: EXC_BAD_ACCESS (code=1, address=0xfffffffffffffff0)`
happens on `indexItems.append(item)` in `addItem()`.

Making it all one task should avoid that.

For more context, here was my chat with GPT-4 ;)
# Memory Access Error

**USER:**
I'm getting: `Thread 2: EXC_BAD_ACCESS (code=1, address=0xfffffffffffffff0)` when I do indexItems.append(item) in Swift. Why would that be?

**ASSISTANT:**
The `EXC_BAD_ACCESS` error in Swift usually means you're trying to access memory that is not allocated, or is no longer allocated. This can happen due to a variety of reasons, such as accessing an array out of its bounds, accessing a deallocated object, or due to a race condition in multi-threaded code.

In your case, since the error happens when you're trying to append an item to an array (`indexItems.append(item)`), it's possible that you're dealing with a race condition where `indexItems` is being accessed from multiple threads at the same time.

Swift's Array is not thread-safe, which means you can't safely read or write to an array from multiple threads without proper synchronization. If you're appending to the array from one thread and reading or writing to it from another thread at the same time, you can get this kind of crash.

Here's a simple way to make your array access thread-safe using `DispatchQueue`:

```swift
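import Foundation  // provides DispatchQueue

class ThreadSafeArray<T> {
    private var array: [T] = []
    // Serial queue: every read and write of `array` goes through it, one block at a time.
    private let accessQueue = DispatchQueue(label: "com.example.ThreadSafeArray")

    // Enqueues the write and returns immediately; the append happens later on the queue.
    func append(_ item: T) {
        accessQueue.async {
            self.array.append(item)
        }
    }

    // Blocks until all previously enqueued writes have run, then returns a copy.
    var items: [T] {
        return accessQueue.sync {
            array
        }
    }
}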
```

Then you can use `ThreadSafeArray` instead of `Array` for `indexItems`.

Please note that this is just one possible cause of the error. If this doesn't solve your issue, you might need to debug your program further to find out the exact cause.
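
As a rough illustration of the suggestion above, the following sketch appends to a `ThreadSafeArray` from many threads at once and then reads the result. The class is repeated from the chat so the snippet is self-contained. The element type, the iteration count, and the use of `DispatchQueue.concurrentPerform` to simulate concurrent callers are assumptions made for the example and are not taken from this PR's code. It is written for Swift 5 language mode; stricter concurrency checking may require `Sendable` annotations.

```swift
import Foundation

// Same class as in the chat above, repeated so this sketch compiles on its own.
final class ThreadSafeArray<T> {
    private var array: [T] = []
    private let accessQueue = DispatchQueue(label: "com.example.ThreadSafeArray")

    func append(_ item: T) {
        accessQueue.async { self.array.append(item) }
    }

    var items: [T] {
        accessQueue.sync { array }
    }
}

// Hypothetical stand-in for the PR's `indexItems`; Int is used only for illustration.
let indexItems = ThreadSafeArray<Int>()

// Simulate many concurrent callers. concurrentPerform returns only after
// every iteration has finished, i.e. after every append has been enqueued.
DispatchQueue.concurrentPerform(iterations: 1_000) { i in
    indexItems.append(i)
}

// The serial queue runs blocks in FIFO order, so this synchronous read
// executes after all of the appends enqueued above.
let snapshot = indexItems.items
print(snapshot.count)  // prints 1000; the element order is not deterministic
```

Because writes are asynchronous, a caller that needs to see its own append must read through `items`, which waits for earlier writes on the same queue. An `actor` would be another way to get the same serialization with Swift concurrency.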
Owner

@ZachNagengast ZachNagengast left a comment

Hi @michaeljelly, this looks cool! Glad you were able to get it working 💪 Could you provide more info on the web embeddings? I'm not familiar with Cloudflare's deployment. It might make sense to move that into a separate PR, so that we could also add OpenAI and other web embedding APIs, along with protocols for the various API formats.
Also, for code clarity in this PR: I think your editor is using 2 spaces for indentation. If you could update these changes to use 4 instead, the PR would be a lot more readable, because the actual changes would stand out from changes that are just indentation. There are also a few comments throughout that seem unnecessary; would you be able to check those? Thanks again for the contribution, I'll give it a second pass in the meantime.

michaeljelly and others added 7 commits December 2, 2023 17:30
Occasionally the compression ratio was above 5, so I had to increase the max compression ratio to 10 to accommodate these cases.

2 participants