#290 - Spammer database
| Identifier | #290 |
|---|---|
| Issue type | Feature request or suggestion |
| Title | Spammer database |
| Status | Completed |
| Handling member | Chris Graham |
| Addon | core |
| Description | Lookup IPs/browsers in a spammer database and block from posting accordingly. |
| Steps to reproduce | |
| Funded? | No |


Comments
A better spam filter would be needed for post submission checking. Perhaps something that would allow admins to enter the new word violation directly from the post.
The mod I used on the old SMF allowed you to check any/all members against the database at any time and at registration time.
Or other services that will automatically look up the ISP/IP and whether they are spammers or not. This module will also have a block for sign-up with a CAPTCHA that will automatically match against their real IP and not a proxy IP. This can help to save you time. It will also stop them from posting on your site and forums and in their signatures. Please note that MaxMind is not free, but they do have a free sign-up to get you started. Any questions?
I think it would be worthwhile seeing if you could build this service based on Project Honeypot with the possible inclusion of a honeypot to catch and report spammers back to the project. They have fairly demanding requirements to work with them (in order to protect their assets) but it might be worth looking into.
https://www.projecthoneypot.org/httpbl_api.php
As far as I can tell, it is only IP numbers. Do you know of one that does email addresses and user names? I suspect that'd be a privacy issue actually.
I note a popular HTTP:BL implementation checks on every page view and apparently still has great performance (I think it probably relies on the server's DNS caching for that, as the checks happen via DNS).
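For illustration, here is a minimal sketch of how an HTTP:BL check via DNS might look, based on my reading of the API page linked above. The function names and the `HttpBlResult` type are hypothetical, and the access key shown in the usage note is a placeholder, not a real key.

```python
import ipaddress
import socket
from typing import NamedTuple, Optional


class HttpBlResult(NamedTuple):
    days_since_last_activity: int
    threat_score: int
    # Bitset: 0 = search engine, 1 = suspicious, 2 = harvester, 4 = comment spammer.
    # For search engines (type 0) the third octet is not a threat score.
    visitor_type: int


def build_query(access_key: str, ip: str) -> str:
    """Build the DNS name to resolve: key, then the IPv4 octets reversed, then the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return f"{access_key}.{'.'.join(reversed(octets))}.dnsbl.httpbl.org"


def parse_response(answer: str) -> Optional[HttpBlResult]:
    """Decode an answer of the form 127.<days>.<threat>.<type>; anything else is ignored."""
    parts = [int(p) for p in answer.split(".")]
    if len(parts) != 4 or parts[0] != 127:
        return None
    return HttpBlResult(parts[1], parts[2], parts[3])


def lookup(access_key: str, ip: str) -> Optional[HttpBlResult]:
    """Resolve the query name; a failed resolution is treated as 'not listed'."""
    try:
        answer = socket.gethostbyname(build_query(access_key, ip))
    except socket.gaierror:
        return None
    return parse_response(answer)
```

With the placeholder key `abcdefghijkl`, checking `127.9.1.2` resolves the name `abcdefghijkl.2.1.9.127.dnsbl.httpbl.org`, and an answer of `127.3.5.1` would decode to 3 days since last activity, threat score 5, visitor type 1 (suspicious). Because each check is an ordinary DNS lookup, repeat checks for the same IP can be served from the resolver's cache.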
"As far as I can tell, it is only IP numbers. Do you know of one that does email addresses and user names? I suspect that'd be a privacy issue actually."
This is the mod I was using for my SMF based forum: http://custom.simplemachines.org/mods/index.php?mod=1547
It did an excellent job of weeding out the spammers. That mod pulls from the StopForumSpam database.
Here is a short list:
http://www.stopforumspam.com/
http://www.fspamlist.com/
http://www.spambusted.com/
Other resources:
http://akismet.com/development/api/
http://www.anatoa.com/
http://www.block-disposable-email.com/
http://blogspam.net/api/
http://www.easyantispam.com/wiki/api:home
http://spamid.servebeer.com:8081/spamid/spamid/apis.jsp
http://blocklistpro.com/
http://dnsbl.tornevall.org/
You need to watch out for the maximum number of free queries to these databases. Some of these databases limit the number of queries per day. Checking at post time may not seem like much, but when you have hundreds or thousands of forums querying a database at each post, it could create an extra load that the provider may not be happy about. Most of these databases are provided free of charge as a service to forum administrators.
If you are stopping them at registration time, they will not get the chance to create a spam post. If by chance you do get a spam post, more than likely that poster is using a new IP/email/username that was never in a database to begin with, and you will most likely miss catching them at post time. A spammer can make many posts on different forums before they are caught and reported, provided they are even reported.
A busy forum should have active moderators catching the very few spammers that slip past detection.
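One way to stay within such quotas is to cache each lookup result locally for a while, so repeated posts from the same IP do not each trigger a remote query. The following is only a sketch: `CachedLookup` and its parameters are hypothetical names, and the wrapped `lookup` function stands in for whichever database is being queried.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedLookup:
    """Wrap a lookup function so each key is queried at most once per TTL window."""

    def __init__(
        self,
        lookup: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def __call__(self, key: str) -> Any:
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no remote query
        result = self._lookup(key)
        self._cache[key] = (now, result)
        return result
```

A longer TTL means fewer queries against the provider, at the cost of acting on older data.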
This is an advantage for Project Honey Pot/HTTP:BL which, to my knowledge, does not impose a quota although they do encourage high-traffic sites to download the BL to their DNS servers and keep them synced.
I agree that whatever automated solutions are developed, manual moderation is still a requirement but, hopefully, with a much smaller number of issues.
However, before I forget, I want to mention that this is going to bump the DB version of one of the core modules, so when the patch is released it'll be necessary to run a database upgrade in the upgrader.
This code is not going to be in v8. A patch will be released for v8, for people wanting to try it ahead of whenever it is released (v9?). This is as per new policy: feature sponsorship results in supported patches for the latest version at patch construction time, and the code is added to Composr's unstable branch, but release roadmaps are unaffected.
"Spammer checking level": "Every page view", really does run RBL checks on each page view
"Spammer checking level": "Every page view", does not run Stop Web Spam checks on each page view
"Spammer checking level": "Every page view", does run Stop Web Spam checks on posting
"Spammer checking level": "Actions", really does run Stop Web Spam checks on posting as a member
"Spammer checking level": "Guest Actions", really does run Stop Web Spam checks on posting as a Guest
"Spammer checking level": "Guest Actions", really does run Stop Web Spam checks on joining
"Spammer checking level": "Never", really does not run RBL checks on joining
"Spammer checking level": "Never", really does not run Stop Web Spam checks on joining
RBL check works for Tornevall, with a confidence equal to the "Implied spammer confidence" option
RBL check works for HTTP:BL with a correct confidence level (HTTP:BL needs setting up in config first, with a key)
If an invalid RBL is configured, it does not kill Composr, but it does send an error notification
If an RBL check bans a spammer, it is only for as long as the configured "Block list cache time"
If an IP is in "Spammer checking exclusions", it is not checked against RBLs
If an IP is in "Spammer checking exclusions", it is not checked against Stop Web Spam
If an IP is in "Spammer checking exclusions", any existing IP bans for it will be ignored
HTTP:BL bans over the "Spammer ban threshold" result in bans
Bans result in appropriate ban notifications indicating the reason and IP
HTTP:BL bans over the "Spammer block threshold" but less than "Spammer ban threshold" result in blocks
Blocks result in appropriate block notifications indicating the reason and IP
HTTP:BL bans over the "Spammer approval threshold" but less than "Spammer block threshold" result in content requiring approval, even for an admin
Approval-requires result in appropriate approval notifications indicating the reason and IP
Stop Web Spam results older than "Spammer staleness threshold" but above an action threshold do not result in any action
Stop Web Spam results newer than "Spammer staleness threshold" and above an action threshold do result in action
If "Honeypot URL" is configured, honeypots are correctly advertised
If "Honeypot URL" is configured, honeypot URL injection methods are different on different pages, but are constant on each particular page
If the "Check usernames against known spammers" option is enabled then known Stop Web Spam usernames will be blocked on joining
If the "Check usernames against known spammers" option is disabled then known Stop Web Spam usernames will not be blocked on joining
Stop Web Spam email addresses will be blocked on joining
If a service request to Stop Web Spam fails, an error notification is sent
If "Blackhole detection" is enabled, fiddling with browser developer tools to fill up the blackhole will result in a hack-attack alert
If "Blackhole detection" is enabled, NOT fiddling with browser developer tools to fill up the blackhole will NOT result in a hack-attack alert
The Blackhole is marked up so as not to be visible
The Blackhole is marked up so that someone with a screen reader would not accidentally fill it in
Ban syndication not available from the action log if no key provided
Ban syndication works from the action log
Ban syndication not available from investigate user if no key provided
Ban syndication works from investigate user
Ban syndication not available from punish member if no key provided
Ban syndication works from punish member
IP ban management correctly shows the temporary bans, with all the details required (including IP, expiry time, and block reason) but they are uneditable directly
When saving IP ban management, temporary bans are not wiped
Marking a trackback as spam results in ban syndication
"Scattergun link injection" spam (detected spam that creates a hack-attack in Composr) results in ban syndication
The privacy policy mentions spam checks, if not set to 'Never'
The privacy policy does not mention spam checks, if set to 'Never'
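The threshold, staleness, and exclusion items in the checklist above can be read as a single decision function. The following is only a sketch under stated assumptions: the function name `decide`, the `SpamPolicy` fields, and every numeric value are hypothetical and not taken from Composr's code or defaults, and treating "over" a threshold as strictly greater is a guess about boundary handling.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SpamPolicy:
    # Illustrative values only; not Composr defaults.
    approval_threshold: int = 40
    block_threshold: int = 60
    ban_threshold: int = 80
    staleness_days: int = 30
    exclusions: FrozenSet[str] = frozenset()


def decide(
    ip: str,
    confidence: int,
    days_since_seen: Optional[int],
    policy: SpamPolicy,
) -> str:
    """Return one of 'allow', 'approval', 'block', or 'ban' for a lookup result."""
    # Excluded IPs are never acted on, whatever the databases say.
    if ip in policy.exclusions:
        return "allow"
    # Results older than the staleness threshold trigger no action.
    # Pass None when the source reports no age for the result.
    if days_since_seen is not None and days_since_seen > policy.staleness_days:
        return "allow"
    # Tiers are checked from most to least severe ("over" assumed to mean >).
    if confidence > policy.ban_threshold:
        return "ban"
    if confidence > policy.block_threshold:
        return "block"
    if confidence > policy.approval_threshold:
        return "approval"
    return "allow"
```

Checking the tiers from most to least severe keeps each result in exactly one band, which mirrors the "over X but less than Y" wording of the checklist.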
Welcome. :-)
I might not have been able to help sponsor this monetarily, but I was able to help sponsor in another way -- sharing my knowledge from research.
http://www.stopforumspam.com/forum/viewtopic.php?id=2256
So Tornevall API support will also be there. We support the Tornevall RBL already, so that means we can both feed off and into it.
HOWEVER! It requires PHP to have the SoapClient installed, and I'm also not sure how readily they give out API keys. You have to email to ask for access and I'm still awaiting mine.
I can sort of see where StopForumSpam is coming from: they want to be a spammer database only. I just question whether that is a good long-term strategy.
In theory the attached zip will be fine with v8-final as well as v8 RC6, as we are so close to release now.