esutil BulkIndexer Body ReadSeeker is incompatible with JSONReader #592

Open
project0 opened this issue Jan 24, 2023 · 2 comments

Comments

@project0

I like that the library provides a convenient way to bulk index data.
I think that in v8 BulkIndexerItem.Body was changed from the io.Reader interface to io.ReadSeeker for better performance.

This makes it very hard to use the library in an efficient way; even the provided JSONReader is now incompatible.
Setting the body becomes pretty hard with standard library packages that work with the io.Reader interface.
We now have to buffer the data repeatedly just so it can be wrapped in a ReadSeeker, and I doubt this really makes things more efficient.

However, this no longer works:

esutil.BulkIndexerItem{
    Body: esutil.NewJSONReader(data),
}

Instead, I need to do it manually:

indexerData, err := json.Marshal(&cloudEvent.Data)
if err != nil {
    // handle the marshalling error
}
esutil.BulkIndexerItem{
    // json.Marshal already returns []byte, so no conversion is needed.
    Body: bytes.NewReader(indexerData),
}
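
For readers who hit the same problem, the manual workaround above could be wrapped in a small helper. This is only a minimal sketch using the standard library: `jsonReadSeeker` and `readTwice` are hypothetical names that are not part of esutil, and `event` stands in for `cloudEvent.Data`. It relies on the fact that `*bytes.Reader` implements `io.ReadSeeker`, so the marshalled payload can be rewound and read again.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// jsonReadSeeker is a hypothetical helper (not part of esutil). It marshals v
// once and returns a reader over the resulting bytes. *bytes.Reader implements
// io.ReadSeeker, so the result can be rewound and read again.
func jsonReadSeeker(v interface{}) (io.ReadSeeker, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	return bytes.NewReader(b), nil
}

// readTwice reads rs to the end, seeks back to the start, and reads it again.
// It only exists to demonstrate that the reader is seekable.
func readTwice(rs io.ReadSeeker) (first, second string, err error) {
	b1, err := io.ReadAll(rs)
	if err != nil {
		return "", "", err
	}
	if _, err = rs.Seek(0, io.SeekStart); err != nil {
		return "", "", err
	}
	b2, err := io.ReadAll(rs)
	if err != nil {
		return "", "", err
	}
	return string(b1), string(b2), nil
}

func main() {
	// event is an assumed stand-in for the issue's cloudEvent.Data.
	event := map[string]string{"type": "example"}

	body, err := jsonReadSeeker(event)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}

	first, second, err := readTwice(body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(first)  // {"type":"example"}
	fmt.Println(second) // same payload after rewinding
}
```

The value returned by such a helper could then be assigned to `BulkIndexerItem.Body`. The trade-off is the one described above: the whole payload is held in memory as a byte slice instead of being streamed.
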
@Anaethelion
Contributor

Hi @project0

Thanks for raising this. That change was made to align the existing API with the behavior of the indexer rather than as a performance optimization.

I have received parallel requests for a different version of the bulk indexer, which I think better fits the use case you are describing.
This indexer would not keep sent items in memory beyond the request and would not have per-item callbacks. It would be more of a fire-and-forget type of indexer.

Would you be interested in that? Let me know what you think!

@project0
Author

project0 commented Feb 1, 2023

@Anaethelion my use case is pretty simple. I receive a stream of event data (currently via HTTP requests, hopefully eventually via Kafka, etc.) and need to ingest it into Elasticsearch.
I am not sure what you mean by fire-and-forget, but I find the per-item callback useful for catching and logging errors for individual items.
