bayes fishing admin comments

Posted: Thu Jul 11, 2013 11:25 am
by Timbalu
I am looking after a blog that is not my own, with bee, bayes and spamblock activated, in this order.

But the owner's comments still get rated by bayes, funnily even higher than those of other commenters.
This option is activated for administrators in spamblock:
Disable spamblock for Authors
You can allow authors in the following usergroups to post comments without them being checked by the spamblock plugin.
Is that a plugin order problem? Or shouldn't we have an option like this for bayes too? Does bayes care about authored comments?
If this is an order issue, what are the experiences with the bee, spamblock, bayes order? Does that work well for all option cases?

Re: bayes fishing admin comments

Posted: Thu Jul 11, 2013 6:53 pm
by onli
Hi Ian
That option will have no effect on bayes. Bayes itself doesn't treat the name of the blog author any differently from all other names. If the learning works correctly, his comments should get marked as valid often enough anyway.
Regards

Re: bayes fishing admin comments

Posted: Thu Jul 11, 2013 7:18 pm
by Timbalu
I was afraid you might say that....
In that case it would still be nice to have authored comments skipped by the parser. I have already done as much as I could to train bayes on this admin, but could not push his rating noticeably below 50%...

Re: bayes fishing admin comments

Posted: Thu Jul 11, 2013 7:44 pm
by onli
What would you check for?

Code: Select all

serendipity_checkPermission('adminComments')
?
If it's a one-liner or just an additional check in the if of frontend_saveComment, we could add something like that. But I wouldn't want to add anything more complex than that, there is already too much code in that plugin.

Re: bayes fishing admin comments

Posted: Thu Jul 11, 2013 8:10 pm
by Timbalu
Yes exactly, or additionally with a boolean config option check like $this->get_config('dropAdminComments')...
I also don't think anything more would be needed.
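
For illustration, the check discussed above could look roughly like this. It is only a sketch under assumptions: `shouldSkipBayes` is a hypothetical helper, `dropAdminComments` is the proposed option and does not exist in the plugin, and the permission check is injected as a callable so the snippet runs outside of Serendipity. Inside the plugin one would call `serendipity_checkPermission('adminComments')` directly.

```php
<?php
/**
 * Sketch: decide whether the bayes check should be skipped for a comment.
 *
 * $dropAdminComments stands for the proposed (hypothetical) plugin option,
 * as it would be read via $this->get_config('dropAdminComments').
 * $checkPermission stands in for serendipity_checkPermission(); it is
 * passed in so that the sketch runs without Serendipity being loaded.
 */
function shouldSkipBayes(bool $dropAdminComments, callable $checkPermission): bool
{
    return $dropAdminComments && $checkPermission('adminComments');
}

// Demo stubs (assumptions for this example, not Serendipity code).
$asAdmin   = fn(string $perm): bool => $perm === 'adminComments';
$asVisitor = fn(string $perm): bool => false;

var_dump(shouldSkipBayes(true, $asAdmin));   // admin, option on
var_dump(shouldSkipBayes(false, $asAdmin));  // admin, option off
var_dump(shouldSkipBayes(true, $asVisitor)); // ordinary visitor
```

Only the first call yields true, so bayes would keep rating every comment unless the option is switched on and the commenter holds the adminComments permission.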

Re: bayes fishing admin comments

Posted: Wed Jul 17, 2013 5:37 pm
by Timbalu
Do you still have that in mind?

BTW, I just wanted to post a normal two-line comment and got rejected every time. That is why I inspected the settings and the database again and found an enormous spam flood on that blog, well rejected (moderated) by the anti-spam measures. The bayes database got huge too, which is why I thought it might be better to use this new order (1 bee, 2 contact, 3 spamblock, 4 bayes, ...), so that the bayes database only gets polluted after all the others have done their work. (*)

I have disabled bayes for a while now, since too many ham words were rejected as spam (which was the cause of my rejected comment), and the bee and spamblock plugins still caught all the spam just as well. After that I will possibly empty the bayes database to start from scratch, but with this new sort order, to see what happens.

Isn't this a better approach? (I appreciate experiences and comments!)

(*) This assumes that comment attempts already rejected by bee and spamblock (like those without a CAPTCHA or disallowed API-generated ones) are not passed on to bayes. Is that right?

Re: bayes fishing admin comments

Posted: Thu Jul 18, 2013 12:31 am
by onli
Hi Ian
Timbalu wrote: Do you still have that in mind?
I admit: not really. I still think it would be better to look at the real cause, like the spam flood you described.

The new order is all right, that is how my setup works too (bee, spamblock, bayes). The important thing when there is that much more spam than valid comments is to not enable autolearning, so the database doesn't get too strict.
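
To make the autolearning concern concrete, here is a toy calculation. It is only a sketch under loud assumptions: the raw-count score below is not the plugin's actual formula, `rawSpamScore` is a hypothetical helper, and all counts are invented. It just shows how a word learned mostly from valid comments can drift towards "spam" once a flood is learned automatically.

```php
<?php
/**
 * Toy word score on raw counts: the share of a word's occurrences that
 * were learned as spam. Assumption: this is NOT the plugin's formula,
 * only an illustration of the skew caused by one-sided learning.
 */
function rawSpamScore(int $spamCount, int $hamCount): float
{
    $total = $spamCount + $hamCount;
    // An unseen word is treated as neutral.
    return $total === 0 ? 0.5 : $spamCount / $total;
}

// Invented counts for an everyday word such as "thanks".
$hamCount = 40;                               // learned from valid comments
$before   = rawSpamScore(10, $hamCount);      // before the flood
$after    = rawSpamScore(10 + 400, $hamCount); // 400 flood comments autolearned

printf("before flood: %.2f, after flood: %.2f\n", $before, $after);
```

With these made-up numbers the score of the same harmless word moves from 0.20 to about 0.91, which matches the symptom of valid comments being rejected.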

Re: bayes fishing admin comments

Posted: Thu Jul 18, 2013 9:41 am
by Timbalu
OK. So the reason why bayes got that big was the wrong order and that autolearning was active. Since bayes was in 2nd place in the order, it also parsed all comments not rejected by bee, mostly these API-created ones.

I have watched this bee, spamblock order without bayes for 24h now and had ~1200 rejected comments, and none came through. This does not sound like much, since it is only about 50 per hour, but in some parts of the day they flood in every 3 seconds.

One of these API spammers uses different IPs all the time, but still has the same user agent. I blocked that one via .htaccess, above the s9y block. Maybe this helps some others out there:

Code: Select all

# block (403 Forbidden response) the "PHP/5.2.10" user agent of API-generated spam comments
# the dots are escaped so that they only match a literal "."
RewriteCond %{HTTP_USER_AGENT} ^PHP/5\.2\.10
RewriteRule ^ - [F,L]
Edit: this single rule decreased that flood from far more than 10,000 hits down to ~1200 a day, which is a lot, and it has been working for nearly 2 weeks now without problems...! :D
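
Before putting such a rule on a live site, the pattern can be tried out in PHP. This is a sketch under assumptions: `preg_match` is used as a stand-in for mod_rewrite's regex matching, `isBlockedUserAgent` is a hypothetical helper, and the pattern is written with the dots escaped so that they match literally.

```php
<?php
/**
 * Returns true when a user agent would be caught by a rule of this shape.
 * Assumption: preg_match only approximates what mod_rewrite does; the real
 * decision is made by Apache.
 */
function isBlockedUserAgent(string $userAgent): bool
{
    return preg_match('#^PHP/5\.2\.10#', $userAgent) === 1;
}

$samples = [
    'PHP/5.2.10',                      // the spammer's user agent
    'Mozilla/5.0 (X11; Linux x86_64)', // a normal browser
    'PHP/5x2y10',                      // only matches if the dots are unescaped
];

foreach ($samples as $ua) {
    printf("%-35s %s\n", $ua, isBlockedUserAgent($ua) ? '403' : 'pass');
}
```

Only the first sample is blocked here. Note that such a rule also blocks any legitimate script that still announces itself with this exact user agent.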