Situation:
Data is being received on the HEC from Kafka-Connect using the Splunk plugin. Multiple data sources are being sent in one stream of data.
Issue:
When using the ‘Event’ endpoint, the timestamp in the metadata added by Kafka-Connect takes precedence over the timestamp extracted from the event itself. This metadata timestamp reflects the moment Kafka received the event, not the moment the event was generated. If a problem on the log source introduces a delay, the timestamp in Splunk will be incorrect, leading to correlation issues.
When using the ‘Raw’ endpoint this issue does not occur; however, the ‘Raw’ endpoint is unable to handle the volume of events we are receiving.
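To make the difference concrete, here is a minimal Python sketch of the JSON envelope sent to the ‘Event’ endpoint. `build_hec_event` is a hypothetical helper written for illustration only, not part of the connector, and the sourcetype name is made up. The point is that once the envelope carries a `time` key, that value is what ends up as the event timestamp, regardless of what the raw event text says.

```python
import json


def build_hec_event(raw_event, sourcetype, kafka_time=None):
    """Build a JSON envelope for the HEC 'Event' endpoint.

    Hypothetical helper for illustration; the real connector builds
    its envelopes internally.
    """
    envelope = {"event": raw_event, "sourcetype": sourcetype}
    if kafka_time is not None:
        # Epoch seconds taken from the Kafka record metadata. When this
        # key is present it wins over the timestamp inside raw_event.
        envelope["time"] = kafka_time
    return json.dumps(envelope)


# Event generated at 2023-01-01 but received by Kafka much later.
delayed = build_hec_event(
    "2023-01-01T00:00:00 login ok", "mixed_stream", kafka_time=1700000000.0
)

# Without the metadata timestamp, the envelope carries no "time" key.
untouched = build_hec_event("2023-01-01T00:00:00 login ok", "mixed_stream")
```

What I am asking for is effectively a switch that produces the second form, so that the timestamp in the event text can be used.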
Temporary fix:
As a workaround, I’ve resorted to using an ‘ingest_eval’ for the sourcetype, with an elaborate case() statement that attempts to find all the possible timestamps using strptime and substr. However, when timestamps conflict in this logic, the events are dropped.
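The shape of that logic, sketched in Python rather than Splunk eval syntax: try each candidate position and format in order, and fall back to the metadata timestamp when nothing matches instead of dropping the event. The candidate formats, the UTC assumption, and the name `extract_time` are all assumptions for illustration; the real sourcetypes have more variants.

```python
from datetime import datetime, timezone

# Hypothetical candidates: (start offset, length, strptime format).
# Mirrors the substr + strptime pairs in the case() statement.
CANDIDATES = [
    (0, 19, "%Y-%m-%dT%H:%M:%S"),
    (0, 19, "%Y-%m-%d %H:%M:%S"),
]


def extract_time(raw, metadata_time):
    """Return epoch seconds parsed from raw, else metadata_time."""
    for start, length, fmt in CANDIDATES:
        chunk = raw[start:start + length]
        try:
            parsed = datetime.strptime(chunk, fmt)
        except ValueError:
            # This candidate does not match; try the next one.
            continue
        # Assumption: timestamps in the events are in UTC.
        return parsed.replace(tzinfo=timezone.utc).timestamp()
    # No candidate matched: keep the event with the metadata timestamp.
    return metadata_time
```

The first matching candidate wins, so ordering matters when formats overlap. That ordering is exactly where the conflicts come from in the real case() statement.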
Proposed fix:
Introduce an option on either the HEC or the kafka_connect plugin to choose whether the metadata timestamp takes precedence or is ignored.
I had hoped that “splunk.hec.use.record.timestamp” would allow this, but sadly it does not fix the problem.