#2384 - Anti-spam heuristics – Composr CMS

By Guest posted 7th Jun 2016, 8:58 PM

Use of the contact forms is a concern. We should log everything going into them so that '4'/'5' above can work for these.

By Guest posted 8th Jun 2016, 10:20 PM

True. But what about guest forum posts, and guest support ticket or feedback submissions, as well?

By Guest posted 8th Jun 2016, 10:23 PM

Fair point. It would need to track through somehow, then. Maybe a punish link in the staff actions, like we have for forum posts, which tracks through content_type and content_id.

By Guest posted 8th Jun 2016, 10:48 PM

If the punish links could also be tied into the warning/punishment form, as with forum posts (i.e. the content being punished is rendered as a link, or as a render box in Tempcode or Comcode, in the message field, similar to how my new reports addon works), that would make punish links more useful elsewhere.

But agreed. Technically, virtually any form of content can be submitted by guests... if permissions allow for it. Therefore, there needs to be a pipe for all content.

By Guest posted 14th Jun 2016, 9:48 AM

We should also have a privilege that lets a user bypass the spam heuristic system.

By Guest posted 25th Oct 2016, 1:50 PM

Ok, so I'm reading the comments more carefully than I did originally, as I am now implementing this.

I don't really agree with much of the discussion. It's tangential to the issue and more related to #2057, #2374 and #375, which will be considered separately.

The main issue discussed seems to be how we can do posting-frequency detection for guests, given that all guest postings are combined under a single ID. However, I think there's no real issue, because guests get the CAPTCHA, or we'd generally limit guest posting access (who'd want guests submitting news, for example?). So we can implement posting-frequency detection for non-guests only, and still have a diverse set of other techniques that do work on guests (CAPTCHA, but also all the other heuristics). We couldn't really track guests anyway, as people could use Tor (and so have rotating IPs and session IDs).
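
As a rough illustration of posting-frequency detection that skips guests, here is a minimal Python sketch. It is not Composr's actual implementation: the class name, the limits, and the `GUEST_ID` value are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque

GUEST_ID = 1  # assumption: every guest submission is stored under this one ID


class PostingFrequencyTracker:
    """Sliding-window count of recent submissions per member (hypothetical helper)."""

    def __init__(self, max_posts, window_seconds):
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # member_id -> timestamps of recent posts

    def record_and_check(self, member_id, now=None):
        """Record one submission; return True if the member is over the limit."""
        if member_id == GUEST_ID:
            # All guests share one ID, so a per-ID rate says nothing about any
            # individual guest. Leave guests to CAPTCHA and the other heuristics.
            return False
        now = time.time() if now is None else now
        stamps = self._recent[member_id]
        stamps.append(now)
        # Discard timestamps that have fallen out of the window.
        while stamps and stamps[0] <= now - self.window_seconds:
            stamps.popleft()
        return len(stamps) > self.max_posts
```

With `PostingFrequencyTracker(max_posts=3, window_seconds=60)`, a member's fourth submission inside a minute is flagged, while any number of submissions under the guest ID never is.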

Duplicate content submission can work on the guest ID with no issue - because different guests are not legitimately going to be posting the same content.

We do need to make sure the heuristics work effectively for contact forms, though.

By Guest posted 25th Oct 2016, 1:52 PM

Oh, also, I think what I was getting at was how we know what counts as duplicate content, and I suggested a mechanism using the report system for that.
That isn't really necessary. I've implemented a system that can query, via metadata provided in the CMA hooks, over a time range for a particular submitter ID. That's simpler and better than trying to do it through reporting, because it works without any reporting needing to happen.
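
To make the idea concrete, here is a minimal Python sketch of a duplicate check over a time range for one submitter ID. It is only an illustration under assumptions: the `Submission` record stands in for whatever metadata the CMA hooks expose, and the hash-based text comparison is my own choice for the example, not necessarily how the real system matches content.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical stand-in for per-content metadata (not the real CMA hook format)."""
    content_type: str   # illustrative values such as "post" or "ticket"
    content_id: int
    submitter_id: int
    timestamp: float    # seconds since epoch
    body: str


def fingerprint(body):
    """Hash of case- and whitespace-normalised text, so trivial edits still match."""
    normalised = " ".join(body.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_duplicates(submissions, new_body, submitter_id, now, window_seconds):
    """Return earlier submissions by this submitter, inside the window, with the same text."""
    target = fingerprint(new_body)
    return [
        s for s in submissions
        if s.submitter_id == submitter_id
        and now - window_seconds <= s.timestamp <= now
        and fingerprint(s.body) == target
    ]
```

Because the check is keyed on submitter ID, it also works when that ID is the shared guest ID: matching text from "different guests" within the window is still treated as a duplicate.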

By Guest posted 27th Jun 2019, 1:20 PM

For reference, W3C have a document explaining non-CAPTCHA anti-spam techniques:
https://www.w3.org/TR/turingtest/

The TLDR is that we now do everything we can that isn't awful in some way, but it's still a good reference.
