View Issue Details

ID: 5472
Project: Composr
Category: core
View Status: public
Last Update: 2024-10-12 23:32
Reporter: PDStig
Assigned To: PDStig
Priority: normal
Severity: trivial
Status: resolved
Resolution: fixed
Product Version: 11.alpha1
Summary: 5472: Perhaps spam detection should not ban web crawlers
Description: Some web crawlers get banned by the antispam if they match enough criteria for their score to reach the ban limit. When this happens, the health check fails, reporting that a web crawler was accidentally banned. Perhaps web crawlers should be exempt from automatic bans based on perceived spam (but they should not be excluded from manual bans or hack attack bans).
Tags: Roadmap: v11

Activities

PDStig

2023-11-26 19:34

administrator   ~8071

This will need some consideration. Do we really want to exclude web crawlers? Or do we want to follow the philosophy that a web crawler that triggers spam detection is a poorly-made web crawler and "gets what it deserves"?

Chris Graham

2024-07-30 19:38

administrator   ~9007

It's difficult because any request can easily have a 'spoofed' (if you can even call it that) user-agent.

Chris Graham

2024-07-30 19:40

administrator   ~9008

I've moved this to a feature request. I doubt this is happening on default settings. If it is, please say so. Either way, I would like to know which settings it is happening with.

TBH the importance of search engines is declining. Personally I rarely Google stuff nowadays because I either use ChatGPT, or some kind of walled garden (e.g. Reddit). That's sad, but it does lower the criticality of this kind of thing, especially if the crawlers are faulty in some way.

PDStig

2024-07-30 19:43

administrator   ~9009

Possibly the health check should use different wording then... instead of saying "accidentally banned a web crawler", implying it should be unbanned, maybe it should say "Banned a potential web crawler... (IP). If you believe this to be a false-positive, unban the IP address."

Chris Graham

2024-08-01 21:29

administrator   ~9056

Changed back to bug. Reviewing the code, it is working by IP, not by UA. Hackattack banning already does:
        if ((!is_our_server($ip)) && (!is_unbannable_bot_dns($ip)) && (!is_unbannable_bot_ip($ip))) {
And we should do the same for antispam.
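
A minimal sketch of how the same guard might look on the antispam path. This is not the actual hotfix code. The three check functions are stubbed stand-ins with the names quoted above (the real Composr implementations are not shown in this issue), and `may_auto_ban`, `handle_spam_score`, the score threshold parameter, and the `'log_only'` outcome are all hypothetical names chosen for illustration.

```php
<?php
// Stub stand-ins for the checks quoted above. Assumption: the real Composr
// functions take an IP string and return a boolean; their logic is not shown here.
function is_our_server(string $ip): bool
{
    return in_array($ip, ['127.0.0.1', '::1'], true);
}

function is_unbannable_bot_dns(string $ip): bool
{
    // A real check would presumably involve DNS lookups; stubbed to avoid network access.
    return false;
}

function is_unbannable_bot_ip(string $ip): bool
{
    // Illustrative only: 192.0.2.10 is in a documentation-reserved range.
    return $ip === '192.0.2.10';
}

// Hypothetical helper: the same condition the hack attack code uses.
function may_auto_ban(string $ip): bool
{
    return (!is_our_server($ip)) && (!is_unbannable_bot_dns($ip)) && (!is_unbannable_bot_ip($ip));
}

// Hypothetical antispam decision: ban only when the score reaches the limit
// AND the IP is neither our own server nor a known / trusted bot.
function handle_spam_score(string $ip, int $score, int $ban_threshold): string
{
    if ($score < $ban_threshold) {
        return 'allow';
    }
    if (!may_auto_ban($ip)) {
        return 'log_only'; // Record the event but skip the automatic ban
    }
    return 'ban';
}
```

Because the guard sits only on the automatic path, a manual ban would be unaffected, which matches the exemption asked for in the description.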

admin

2024-10-12 23:32

administrator   ~9474

Automated response: 5472: Perhaps spam detection should not ban web crawlers

Some web crawlers get banned by the antispam if they match enough criteria to cause their score to reach the ban limit. But when this happens, health check fails saying accidentally banned a web crawler.

This hotfix adds checks to be sure we're not trying to ban the server or a known / trusted bot. A hack attack will still be logged regardless of whether the IP gets banned (though advanced_banning by default stops logs).

admin

2024-10-12 23:32

administrator   ~9475

Fixed in Git commit 522e2b470c (https://gitlab.com/composr-foundation/composr/commit/522e2b470c - link will become active once code pushed to GitLab)

admin

2024-10-12 23:32

administrator   ~9476

A hotfix (a TAR of files to upload) has been uploaded to this issue. Only apply this hotfix if you absolutely need it and cannot wait until the next release of Composr (releases are more reliable and strictly tested). As of Composr version 11, the recommended way to apply a hotfix is by following the same steps as an upgrade (https://baseurl/upgrader.php, use the hotfix on the step “Transfer across new/updated files”). The upgrader will automatically skip files belonging to addons you do not have installed or that are newer on disk than in the hotfix. Otherwise, you can manually extract and replace these files (do not replace if your on-disk file is newer than the one in the hotfix). Always take backups of your site or at least files you are replacing before applying a hotfix. Not sure how to extract TAR files to your Windows computer? Try 7-zip (http://www.7-zip.org/).

Issue History

Date Modified | Username | Field | Change
2023-11-26 19:33 | PDStig | New Issue |
2023-11-26 19:33 | PDStig | Status | Not Assigned => Assigned
2023-11-26 19:33 | PDStig | Assigned To | => user4172
2023-11-26 19:34 | PDStig | Note Added: 0008071 |
2024-07-30 19:38 | Chris Graham | Note Added: 0009007 |
2024-07-30 19:38 | Chris Graham | Severity | Minor Bug => Feature or Request
2024-07-30 19:38 | Chris Graham | Project | Composr alpha bug reports => Composr
2024-07-30 19:38 | Chris Graham | Category | General / Uncategorised => core
2024-07-30 19:40 | Chris Graham | Note Added: 0009008 |
2024-07-30 19:43 | PDStig | Note Added: 0009009 |
2024-07-31 03:15 | Guest | Issue cloned: 5822 |
2024-08-01 21:29 | Chris Graham | Severity | Feature or Request => Minor Bug
2024-08-01 21:29 | Chris Graham | Note Added: 0009056 |
2024-10-12 23:30 | PDStig | Tag Attached: Roadmap: v11 |