This will need some consideration. Do we really want to exclude web crawlers? Or do we want to follow the philosophy that a web crawler that triggers spam detection is a poorly-made web crawler and "gets what it deserves"?

It's difficult because any request can easily have a 'spoofed' (if you can even call it that) user-agent.
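For what it's worth, the user-agent alone can't be trusted, but the major crawlers can be verified with forward-confirmed reverse DNS (Google and Bing both document this for their bots). A minimal sketch in Python, assuming a small allowlist of hostname suffixes (the ones below are examples, not an exhaustive list):

```python
import socket

# Example hostname suffixes only; a real deployment would use the
# suffixes each search engine documents for its crawlers.
CRAWLER_HOST_SUFFIXES = (".googlebot.com", ".google.com", ".search.msn.com")

def is_verified_crawler(ip: str) -> bool:
    """Forward-confirmed reverse DNS: the IP must reverse-resolve to a
    known crawler hostname, and that hostname must resolve back to the
    same IP. A spoofed user-agent alone can't pass this."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)  # reverse lookup
    except OSError:  # covers socket.herror
        return False
    if not host.endswith(CRAWLER_HOST_SUFFIXES):
        return False
    try:
        _, _, forward_ips = socket.gethostbyname_ex(host)  # forward lookup
    except OSError:  # covers socket.gaierror
        return False
    return ip in forward_ips
```

A check like this could run before the ban is recorded (or before the health check labels the IP a crawler at all), so a spammer faking a Googlebot user-agent still gets banned while the real bot doesn't.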

I've moved this to a feature request. I doubt this is happening on default settings; if it is, please say so. I'd actually like to know which settings it is happening under anyway.
TBH the importance of search engines is declining. Personally I rarely Google stuff nowadays because I either use ChatGPT or some kind of walled garden (e.g. Reddit). That's sad, but it does lower the criticality of this kind of thing, especially if the crawlers are faulty in some way.

Possibly the health check should use different wording then... instead of saying it "accidentally banned a web crawler", which implies it should be unbanned, maybe it should say "Banned a potential web crawler... (IP). If you believe this to be a false positive, unban the IP address."