Spam protection in s9y – issues and thoughts

yellowled
Regular
Posts: 7111
Joined: Fri Jan 13, 2006 11:46 am
Location: Eutin, Germany

Post by yellowled »

Let me preface this by outlining my current “line of defense” in terms of spam protection:

1. Spamblock Bee – I only use the Honeypot feature, not the hidden captcha (see below as to why)
2. Spamblock Bayes – manual borders (60/95), learn: yes, ignore: ip, referer, trashcan: 98%
3. Spamblock – no content filter, no Akismet; moderate trackbacks: 28 days, no captchas, moderate comments: 42 days, moderate links: 3, block links: 10
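
As a rough illustration of how a layered setup like this decides what happens to a comment, here is a minimal Python sketch. It is not the plugins' actual code: the names `Comment` and `classify`, the input fields, and the reading of the settings (Bayes borders as "moderate from 60%, reject from 95%", link limits as "moderate from 3 links, reject from 10") are assumptions made for the example. The trashcan and trackback settings are left out.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    body: str
    honeypot: str = ""         # hidden form field that humans leave empty
    bayes_score: float = 0.0   # spam probability in percent (hypothetical input)
    link_count: int = 0
    entry_age_days: int = 0


def classify(c: Comment) -> str:
    """Return 'reject', 'moderate' or 'accept' for a comment.

    The thresholds mirror the settings listed above; whether the real
    plugins compare with > or >= is an assumption.
    """
    # 1. Honeypot (Spamblock Bee): bots tend to fill in hidden fields too.
    if c.honeypot.strip():
        return "reject"
    # 2. Spamblock Bayes with manual borders 60/95.
    if c.bayes_score >= 95:
        return "reject"
    if c.bayes_score >= 60:
        return "moderate"
    # 3. Standard Spamblock: link limits, then age-based moderation.
    if c.link_count >= 10:
        return "reject"
    if c.link_count >= 3 or c.entry_age_days > 42:
        return "moderate"
    return "accept"
```

The order of the checks follows the order of the plugins in the list: a comment only reaches a later stage if no earlier stage has rejected or moderated it.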

I also recently reset my Bayes database and learned all comments which already are in the blog as ham. I'm not using any imported databases in Bayes. This is a rather low-traffic blog with a very specific audience, yet it seems to get quite a lot of spam.

Now I notice the following issues:

1. The hidden captcha in Spamblock Bee sometimes fails for human users (i.e. valid comments). If that happens, the log file states “1 != ?” (“?” in fact being the infamous black diamond question mark character) which I believe indicates some issue with UTF-8 encoding. This is especially bad since Grischa as the original plugin maintainer is M.I.A. at the moment.

I've had at least one valid comment and one valid contact form email fall victim to this behaviour in 2014 (that I know of), which is why I've disabled the hidden captcha.

2. Spamblock Bayes is not working at all right now. All comments which are not caught by the Honeypot feature of Spamblock Bee go directly to the standard Spamblock plugin, although Spamblock Bayes is placed before standard Spamblock in the plugin list. I have no idea why.

3. Actually, I get most of my blog spam through the contact form. As far as I know, at least some of the spamblock features should work on the contact form as well (I know captchas do, but I really don't want to use captchas), but I'm not sure which ones. Bee does, but do Bayes and Standard as well?

4. Even though I set Bayes to ignore ip and referer, the Bayes database seems to include them. At least it has some entries for ip and referer. Are those ignored in Bayes' analysis and kept in case I ever remove them from the ignore list?

Apart from my current issues (which might just as well be because I don't understand the various spamblock plugins and their settings properly) I think this is an area in s9y which has room for improvement. I'm not a new user, and I have been using these plugins for quite some time now, yet I still don't think I really understand them. I reckon this must be utterly confusing for new users.

YL
onli
Regular
Posts: 2825
Joined: Tue Sep 09, 2008 10:04 pm

Post by onli »

yellowled wrote:The hidden captcha in Spamblock Bee sometimes fails for human users (i.e. valid comments).
Happened in my blog as well, and I drew the same conclusion. The plugin is still great with the honeypot only, but that is a bit of a pity.
yellowled wrote:Spamblock Bayes is not working at all right now.
You could try looking into the logfile; I also have no idea what could cause this. Maybe it is just too liberal at the moment?
yellowled wrote:... at least some of the spamblock features should work on the contact form as well .... Bee does, but do Bayes and Standard as well?
I have to admit, I don't know. If the contactform uses the comment hooks (and I was under the impression that it does), then yes. If it uses its own system and Bayes would need to catch additional hooks, it should be possible to add that, as long as the submissions are still in a comment structure. Someone more familiar with the contactform plugin (I never used it) could give me a hint.
yellowled wrote:Are those ignored in Bayes' analysis and kept in case I ever remove them from the ignore list?
Yes. You can test that by selecting a comment in the Analyse menu. Ignored comment parts will show no rating.
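
To make the "stored but not rated" behaviour concrete, here is a small Python sketch. It assumes per-field spam ratings between 0 and 1 and combines them with the common naive Bayes formula P = prod(p) / (prod(p) + prod(1 - p)). The field names, the example numbers and the function `combined_rating` are made up for illustration, and the plugin's real calculation may differ.

```python
from math import prod

# Hypothetical per-field spam ratings (0..1) for one comment.
ratings = {"body": 0.9, "author": 0.8, "ip": 0.99, "referer": 0.99}
ignored = {"ip", "referer"}


def combined_rating(ratings, ignored):
    """Combine per-field ratings, skipping fields on the ignore list.

    Ignored fields stay in `ratings` (the stored data is untouched);
    they simply do not take part in the calculation.
    """
    # Clamp to avoid a zero denominator when ratings hit exactly 0 or 1.
    used = [min(max(p, 0.01), 0.99)
            for field, p in ratings.items() if field not in ignored]
    if not used:
        return 0.5  # nothing to judge, so stay neutral
    spam = prod(used)
    ham = prod(1 - p for p in used)
    return spam / (spam + ham)
```

Removing a field from `ignored` later brings its stored rating back into the result without any relearning, which matches the idea of keeping the data around.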


Apart from those specific issues, I have no vision right now of how we could improve this area of s9y. Which maybe is obvious, since I haven't changed anything in Bayes in ages. It works for me…

Distributed plugins seem to me the only choice, unless we build a very well-engineered combined (modular?) plugin. Which in the end might become just another option.
yellowled

Post by yellowled »

onli wrote:Happened in my blog as well, and I drew the same conclusion. The plugin is still great with the honeypot only, but that is a bit of a pity.
The honeypot alone doesn't seem to do as much here, but that might be specific to my blog/content/whatever.
onli wrote:You could try looking into the logfile; I also have no idea what could cause this. Maybe it is just too liberal at the moment?
Yeah, seems like it. It's starting to find spam now, so I guess it was in fact related to deleting the db.
onli wrote:If the contactform uses the comment-hooks (and I was under the impression), then yes.
I think what's bugging me about this is that I can't use different settings for comments and contact form. Bayes does in fact seem to work on the contact form, but since there is no way to moderate mails, it's kind of ineffective there (depending on the plugin's settings, of course). It would be nice (but probably too much overhead) if one could set different actions for the contact form or have a separate instance of the plugin for the contact form.
onli wrote:Apart from those specific issues, I have no vision right now of how we could improve this area of s9y.
As I said – we might need a separation of comment and contact form spam, or a separate spamblock plugin for the contact form … other than that, I really don't know, either. It's a shame that spamblock bee is pretty much unmaintained, but that's probably hard to change.
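
A separation like that could be sketched as per-source settings. The following Python snippet is purely hypothetical: neither plugin offers such an option as far as this thread shows, and the names `ACTIONS` and `action_for` as well as the value 80 for the contact form are invented for the example.

```python
# Hypothetical per-source settings (not an existing plugin option).
ACTIONS = {
    "comment": {"moderate_at": 60, "reject_at": 95},
    # Mails cannot be moderated, so the contact form only knows "reject".
    "contactform": {"moderate_at": None, "reject_at": 80},
}


def action_for(source: str, score: float) -> str:
    """Pick an action for a Bayes score (in percent) depending on the source."""
    cfg = ACTIONS[source]
    if score >= cfg["reject_at"]:
        return "reject"
    if cfg["moderate_at"] is not None and score >= cfg["moderate_at"]:
        return "moderate"
    return "accept"
```

With settings like these, a score of 85 would put a comment into moderation but reject a contact form mail outright, since there is no moderation queue for mails to fall back on.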

YL
ju
Regular
Posts: 50
Joined: Wed Oct 01, 2008 4:27 pm

Post by ju »

I hope my question fits into this thread: due to recent waves of heavy comment spam in my blog I had to start using stricter rules. Now I can't comment any more myself ;) and I haven't found out why. The logfile says: "REJECTED: Sie haben das Feld "%s" nicht ausgefüllt!" (in English: "You did not fill in the field "%s"!"), but I have no idea which field it is talking about (username and e-mail address are required; before the last changes only a username was required).
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany

Post by garvinhicking »

ju wrote:I hope my question fits into this thread: due to recent waves of heavy comment spam in my blog I had to start using stricter rules. Now I can't comment any more myself ;) and I haven't found out why. The logfile says: "REJECTED: Sie haben das Feld "%s" nicht ausgefüllt!" (in English: "You did not fill in the field "%s"!"), but I have no idea which field it is talking about (username and e-mail address are required; before the last changes only a username was required).
Maybe you configured a required/mandatory field in the spamblock (or accidentally added an empty linebreak, space or whatever)?
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/
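
The "accidentally added an empty linebreak, space or whatever" case can be illustrated with a short Python sketch. It assumes the required fields are stored as a comma-separated string. The function names and the exact setting format are assumptions for the example, not the spamblock plugin's real code.

```python
def missing_fields(required_setting: str, form: dict) -> list:
    """Return the required fields that the submitted form leaves empty."""
    required = required_setting.split(",")  # naive split keeps blank entries
    return [f for f in required if not form.get(f, "").strip()]


def missing_fields_fixed(required_setting: str, form: dict) -> list:
    """Same check, but blank entries in the setting are dropped first."""
    required = [f.strip() for f in required_setting.split(",") if f.strip()]
    return [f for f in required if not form.get(f, "").strip()]
```

With a setting such as `"name,email,\n"` the naive version treats the trailing linebreak as a third required field that no visitor can ever fill in, so every comment would be rejected. The second version ignores it.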
ju

Post by ju »

Probably "or whatever" ;) I didn't change anything after my question here, but after some hours I could comment again.
Czorneboh
Regular
Posts: 385
Joined: Tue Apr 08, 2008 7:17 pm
Location: Berlin

Post by Czorneboh »

Two issues:

1) In Bayes I am missing the URL of the blog entry on which a comment dismissed by Bayes was posted. Too many ham comments are getting lost. When I study the spam log much later, I do not know which entry a ham comment refers to. If I knew that, I could copy the comment into the comment field of the right entry and still publish it under that entry (unfortunately I cannot set the date on which the comment was originally made).

Knowing the URL of the entry where a comment was caught by Bayes could also help to learn what is interesting to human spammers, or to learn about spam patterns. Such understanding could perhaps help in managing the Bayes plugin.

(Maybe the same applies to the spamblock plugin? I still have to check this.)

2) What I was wondering about is what Yellowled already explained above on 26.02.14.
Bayes is working for both comments and the contact form, right?

Yellowled:
I think what's bugging me about this is that I can't use different settings for comments and contact form. Bayes does in fact seem to work on the contact form, but since there is no way to moderate mails, it's kind of ineffective there (depending on the plugin's settings, of course). It would be nice (but probably too much overhead) if one could set different actions for the contact form or have a separate instance of the plugin for the contact form.
a) I miss that option too.
b) And I would like to know whether the ham that was filtered was written into the contact form or into a comment form (see above!). I want to know whether Bayes filters ham this often only on entry comment forms or also (at a similar level) on contact forms.

I need to get users to use the contact form (or a self-made request form) for making requests, and would perhaps explain that requests in comment forms carry the (higher) risk of getting filtered.
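
Both wishes (the entry URL and the kind of form) boil down to two extra fields per log record. The following Python sketch shows what such a record and a simple per-source count could look like. The class `SpamLogEntry`, its fields and the helper `filtered_by_source` are hypothetical and do not describe the current Bayes log format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SpamLogEntry:
    timestamp: datetime
    reason: str          # e.g. "bayes"
    body: str
    source: str          # "comment" or "contactform"
    entry_url: str = ""  # URL of the blog entry; empty for the contact form


def filtered_by_source(log):
    """Count how many filtered submissions came from each kind of form."""
    return Counter(entry.source for entry in log)
```

With the entry URL stored, a wrongly filtered comment could be traced back to its entry later, and the per-source count would show whether the comment forms or the contact form lose more ham.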