Description
Summary
Add a setting in the interface to ban bots based on their User-Agent. Or, at minimum, could the User-Agents be recorded so that they can be blocked manually?
Motivation
There is a toggle called "Filter out known web crawlers" on the Inbound Filters settings page. Sometimes new crawlers appear that slip past that filter. The problem is that ignoring them in beforeSend doesn't always work, even when the User-Agent clearly identifies the bot. They might be hitting some kind of cached versions of pages, or they might be stripping script objects from pages; in any case, for us the flood of about 200 events per hour hasn't stopped even after adding this code in beforeSend():
if (/lyticsbot/.test(window.navigator.userAgent) ||
    (event.request && event.request.headers &&
     event.request.headers['User-Agent'] &&
     event.request.headers['User-Agent'].search('lyticsbot') !== -1)) {
  return null;
}
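For reference, the check above can be factored into a standalone predicate that also guards against a missing or lowercase header key, which makes it unit-testable outside the browser. This is a sketch under assumptions: the helper name isBotEvent and the explicit userAgent parameter are illustrative, not part of the Sentry SDK.

```javascript
// Sketch: a defensive bot filter for Sentry's beforeSend hook.
// Matches the UA from either the live browser or the event's request headers.
const BOT_UA = /lyticsbot/i;

function isBotEvent(event, userAgent) {
  // userAgent is passed in (e.g. window.navigator.userAgent) for testability.
  if (userAgent && BOT_UA.test(userAgent)) {
    return true;
  }
  // The header key may be missing or differently cased depending on
  // how the event was assembled, so guard every step.
  const headers = (event && event.request && event.request.headers) || {};
  const headerUA = headers['User-Agent'] || headers['user-agent'] || '';
  return BOT_UA.test(headerUA);
}

// Hypothetical usage with the standard beforeSend hook:
// Sentry.init({
//   dsn: '...',
//   beforeSend(event) {
//     return isBotEvent(event, window.navigator.userAgent) ? null : event;
//   },
// });
```

As the issue notes, client-side filtering like this only helps when the bot actually executes the SDK's JavaScript; events originating from cached pages or stripped scripts would still need a server-side filter.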
Additional Context
Here is the issue page: https://sentry.io/organizations/policeone/issues/982063112/events/latest/?project=67360 It actually started after upgrading from raven (3.26.4) to the new JS SDK (4.6.6). Prior to that, checking window.navigator.userAgent was sufficient for blocking errors from that bot.
In almost 100% of cases the crawler's UA is:
User-Agent: lyticsbot-external