Preventing Duplicate Events #42

Open
shwetas-syd opened this issue Mar 31, 2020 · 4 comments
Labels
bug Something isn't working

Comments

@shwetas-syd

We've noticed duplication of events and we're looking at ways to prevent them. I tried adding the add_id processor, but it's not available in the list of processors.

@chris-counteractive
Collaborator

I'd love to learn more, to differentiate whether this is a case of the beat repeating content downloads or if it's an artifact of the API itself. I'll check on the add_id processor, but the events themselves should have unique IDs already.

@shwetas-syd
Author

Debugging logs show the beat querying the artifact and publishing events from the same date range multiple times, so I suspect O365Beat isn't getting an acknowledgement from Elastic in time.
Elastic recommends the add_id processor to prevent data duplication: https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-deduplication.html
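
To illustrate the reasoning above, here is a minimal Python sketch of why attaching an ID before publishing makes a retry harmless. It is a simulation only, not the add_id processor or any Beats code: `assign_ids`, `publish`, and the in-memory `index` dict are hypothetical names standing in for the shipper and the index.

```python
import uuid


def assign_ids(events):
    """Attach an ID to each event once, before the first publish attempt (hypothetical helper)."""
    return [{**event, "_id": uuid.uuid4().hex} for event in events]


def publish(index, batch):
    """Index by ID, so re-sending an event overwrites it instead of appending a copy."""
    for event in batch:
        index[event["_id"]] = event


index = {}  # stand-in for the destination index
batch = assign_ids([{"Operation": "FileAccessed"}, {"Operation": "UserLoggedIn"}])

publish(index, batch)  # first attempt; assume the acknowledgement is lost
publish(index, batch)  # retry of the same, already-tagged batch

print(len(index))  # 2
```

Note the limit of this approach in the sketch: it only helps when the retry resends the same tagged events. Events fetched again from the source would pass through `assign_ids` a second time and receive new IDs.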

@rob570

rob570 commented May 14, 2020

Hi,
I think it's not an o365beat issue. This is my pipeline:
o365beat-->logstash (with geo info enrichment by a filter)-->output to file and to ES
I see duplicate events in the file too.
I solved it on the ES side by mapping document_id in the Logstash conf to the O365 "Id" field.
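
The idea behind that workaround can be sketched in Python. This is an illustration under assumptions, not the actual Logstash configuration: `index_events` and the dict-based `index` are hypothetical, and the sample events are made up apart from the "Id" field name mentioned above.

```python
def index_events(index, events, id_field="Id"):
    """Store each event under its own "Id", so a repeated event replaces the earlier copy."""
    for event in events:
        index[event[id_field]] = event
    return index


# Two fetches covering overlapping date ranges (illustrative data only).
first_fetch = [
    {"Id": "a1", "Operation": "FileAccessed"},
    {"Id": "b2", "Operation": "UserLoggedIn"},
]
second_fetch = [
    {"Id": "b2", "Operation": "UserLoggedIn"},  # duplicate of an earlier event
    {"Id": "c3", "Operation": "FileDeleted"},
]

index = {}
index_events(index, first_fetch)
index_events(index, second_fetch)

print(sorted(index))  # ['a1', 'b2', 'c3']
```

Because the key comes from the event itself rather than from the shipper, the duplicate collapses no matter where in the pipeline it was introduced. A file output has no such key, which would be consistent with duplicates still showing up in the file.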

@chris-counteractive
Collaborator

I'm thinking these duplicate events could be part of the same underlying issue described in my recent reply to @rob570's issue. I'll let you know when a fix is posted, and hopefully we can test it under your conditions!

@chris-counteractive chris-counteractive added the bug Something isn't working label May 26, 2020

3 participants