View Issue Details

ID: 5822
Project: Composr
Category: core
View Status: public
Last Update: 2024-07-31 03:15
Reporter: Guest
Assigned To: PDStig
Priority: normal
Severity: feature
Status: assigned
Resolution: open
Summary: 5822: Perhaps spam detection should not ban web crawlers
Description: Some web crawlers get banned by the antispam if they match enough criteria for their score to reach the ban limit. When this happens, the health check fails, reporting that a web crawler was accidentally banned. Perhaps web crawlers should be exempt from automatic bans based on perceived spam (but they should not be exempt from manual bans or hack-attack bans).
Tags: No tags attached.
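
The proposed policy can be sketched as a single decision function. This is only an illustration of the rule described above, not Composr code: `BanReason`, `should_ban`, `is_crawler`, and the default `ban_limit` of 100 are all hypothetical names and values.

```python
from enum import Enum


class BanReason(Enum):
    """Hypothetical categories of ban; not taken from Composr."""
    SPAM_SCORE = "spam_score"    # automatic ban from antispam scoring
    HACK_ATTACK = "hack_attack"  # automatic ban from hack-attack detection
    MANUAL = "manual"            # ban placed by a staff member


def should_ban(reason: BanReason, is_crawler: bool,
               score: int = 0, ban_limit: int = 100) -> bool:
    """Return True if the ban should be applied.

    Crawlers are exempt only from score-based spam bans.
    Manual bans and hack-attack bans always apply.
    """
    if reason is BanReason.SPAM_SCORE:
        if is_crawler:
            return False  # proposed exemption for web crawlers
        return score >= ban_limit
    return True


if __name__ == "__main__":
    print(should_ban(BanReason.SPAM_SCORE, is_crawler=True, score=150))
    print(should_ban(BanReason.MANUAL, is_crawler=True))
```

How `is_crawler` gets decided is left open here, and it is the hard part of the proposal.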
Sponsorship open


Activities

PDStig

2023-11-26 19:34

administrator   ~9014

This will need some consideration. Do we really want to exclude web crawlers? Or do we want to follow the philosophy that a web crawler that triggers spam detection is a poorly-made web crawler and "gets what it deserves"?

Chris Graham

2024-07-30 19:38

administrator   ~9015

It's difficult because any request can easily have a 'spoofed' (if you can even call it that) user-agent.
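
One common way around user-agent spoofing is forward-confirmed reverse DNS: resolve the IP to a hostname, check that the hostname belongs to a known crawler domain, then resolve that hostname back and confirm it yields the same IP. Below is a minimal sketch of that check, not something Composr is known to do. The function name `is_verified_crawler`, the injectable `reverse`/`forward` resolvers, and the suffix list are assumptions for illustration; a real list would come from each crawler operator's documentation.

```python
import socket
from typing import Callable, List, Sequence

# Illustrative suffix list only; treat it as an assumption, not a complete registry.
CRAWLER_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")


def _reverse(ip: str) -> str:
    # Reverse (PTR) lookup; raises an OSError subclass on failure.
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> List[str]:
    # Forward lookup; gethostbyname_ex returns IPv4 addresses only.
    return socket.gethostbyname_ex(host)[2]


def is_verified_crawler(ip: str,
                        reverse: Callable[[str], str] = _reverse,
                        forward: Callable[[str], List[str]] = _forward,
                        suffixes: Sequence[str] = CRAWLER_SUFFIXES) -> bool:
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record, so the claim cannot be verified
    if not host.endswith(tuple(suffixes)):
        return False  # hostname is not under a known crawler domain
    try:
        return ip in forward(host)  # hostname must map back to the same IP
    except OSError:
        return False
```

The resolvers are passed in as parameters so the logic can be exercised without network access. The DNS lookups are slow, so in practice the result would probably need caching per IP.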

Chris Graham

2024-07-30 19:40

administrator   ~9016

I've moved this to a feature request. I doubt this is happening on default settings. If it is, please say so. Either way, I would like to know which settings it is happening with.

TBH the importance of search engines is declining. Personally I rarely Google stuff nowadays because I either use ChatGPT, or some kind of walled garden (e.g. Reddit). That's sad, but it does lower the criticality of this kind of thing, especially if the crawlers are faulty in some way.

PDStig

2024-07-30 19:43

administrator   ~9017

Possibly the health check should use different wording then. Instead of saying "accidentally banned a web crawler", which implies it should be unbanned, maybe it should say "Banned a potential web crawler... (IP). If you believe this to be a false positive, unban the IP address."


Issue History

Date Modified Username Field Change
2024-07-31 03:15 Guest New Issue
2024-07-31 03:15 Guest Issue generated from: 5472