#5472 - Perhaps spam detection should not ban web crawlers
| Identifier | #5472 |
|---|---|
| Issue type | Trivial issue (does not break functionality) |
| Title | Perhaps spam detection should not ban web crawlers |
| Status | Completed |
| Tags | Roadmap: v11 (custom) |
| Handling member | PDStig |
| Version | 11 alpha1 |
| Addon | core |
| Description | Some web crawlers get banned by the antispam if they match enough criteria to cause their score to reach the ban limit. But when this happens, health check fails saying accidentally banned a web crawler. Perhaps web crawlers should be exempt from automatic bans based on perceived spam (but they should not be excluded from manual bans or hack attack bans). |
| Steps to reproduce | |
| Funded? | No |
| Commits | |


Comments
TBH the importance of search engines is declining. Personally I rarely Google stuff nowadays because I either use ChatGPT or some kind of walled garden (e.g. Reddit). That's sad, but it does lower the criticality of this kind of thing, especially if the crawlers are faulty in some way.
```php
if ((!is_our_server($ip)) && (!is_unbannable_bot_dns($ip)) && (!is_unbannable_bot_ip($ip))) {
```
And we should do the same for antispam.
> Some web crawlers get banned by the antispam if they match enough criteria to cause their score to reach the ban limit. But when this happens, health check fails saying accidentally banned a web crawler.
This hotfix adds checks to be sure we're not trying to ban the server or a known / trusted bot. A hack attack will still be logged regardless of whether the IP gets banned (though advanced_banning stops logging by default).
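The exemption logic above can be sketched as follows. This is a minimal Python sketch, not the actual PHP implementation: the helper names mirror `is_our_server` and `is_unbannable_bot_ip` from the quoted PHP condition, but their bodies, the ban-reason values, and the score limit are all assumptions for illustration.

```python
# Hypothetical sketch of the exemption rule: trusted crawlers and the server
# itself are exempt from score-based spam bans, but NOT from manual bans or
# hack-attack bans. All names and thresholds below are illustrative.
BAN_LIMIT = 10  # assumed spam-score ban threshold

def is_our_server(ip: str) -> bool:
    # Placeholder: a real check would compare against the server's own IPs.
    return ip == "203.0.113.1"

def is_unbannable_bot_ip(ip: str) -> bool:
    # Placeholder: a real check would consult published crawler IP ranges
    # (ideally confirmed by reverse DNS, as is_unbannable_bot_dns suggests).
    return ip.startswith("66.249.")

def should_ban(ip: str, spam_score: int, reason: str) -> bool:
    """Decide whether to ban this IP.

    Manual and hack-attack bans always go through; automatic spam bans
    skip the server itself and trusted crawlers.
    """
    if reason in ("manual", "hack_attack"):
        return True
    if is_our_server(ip) or is_unbannable_bot_ip(ip):
        return False  # exempt from automatic spam bans only
    return spam_score >= BAN_LIMIT

print(should_ban("66.249.66.1", 15, "spam"))         # trusted crawler: False
print(should_ban("198.51.100.7", 15, "spam"))        # ordinary IP: True
print(should_ban("66.249.66.1", 0, "hack_attack"))   # hack attack: True
```

The key design point, per the issue, is that the exemption applies only to the automatic spam-score path; manual bans and hack-attack bans take a separate branch that never consults the crawler whitelist.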