Page 2 of 2

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Fri Apr 20, 2012 5:23 pm
by Timbalu
Well, sure, it works fine here, just as it does without it (and better without the trailing slash).
But I still don't really get what your actual problem is. Could you please describe in detail what breaks in the pagination without that snippet? And are we talking about the standard or the bulletproof pagination? What about the .htaccess?

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Fri Apr 20, 2012 5:56 pm
by gregman
Well, it's that easy: on my server I discovered a problem with Serendipity handling search terms with umlauts. In detail: search terms were cut off just before an umlaut. When I went to examine what's going on, I found that

1) Apache does not keep the URL encoding when using mod_rewrite. The affected rewrite rule is

Code:

RewriteRule ^{PAT_SEARCH} {indexFile}?url=/{PATH_SEARCH}/$1 [L,QSA]
As mentioned in one of my former posts, this is a known issue of Apache. One solution to get this fixed is to urlencode the affected URL TWICE! The disadvantage of that solution is that I cripple URLs where there's no need.

2) As I combed through the code I found that serendipity_getUriArguments() relies on a REGEX pattern which causes the cut in URLs with umlauts. So in my opinion this is the right place to fix this behavior. After that everything works fine. So my suggestion was to get this fix into the core, as some other users out there may have the same problem with Apache mod_rewrite and urlencoded URLs.
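
To illustrate point 1), here is a minimal Python sketch (not Serendipity or Apache code) of why encoding a URL twice survives one round of decoding. It assumes the rewrite step behaves like a single percent-decoding pass, which is a simplification. The helper `rewrite_pass` and the search term are made up for the example.

```python
from urllib.parse import quote, unquote


def rewrite_pass(path: str) -> str:
    """Hypothetical stand-in for the rewrite step: percent-decodes the path once."""
    return unquote(path)


term = "grüße"  # made-up search term containing umlauts

encoded_once = quote(term, safe="")           # what the application puts in the link
encoded_twice = quote(encoded_once, safe="")  # the "urlencode TWICE" workaround

# Encoded once: after the rewrite the application sees raw umlauts.
after_single = rewrite_pass(encoded_once)

# Encoded twice: after the rewrite the term is still percent-encoded.
after_double = rewrite_pass(encoded_twice)

print(encoded_once, encoded_twice, after_single, after_double, sep="\n")
```

The drawback mentioned above shows up here as well: on a server that does not decode during the rewrite, the doubly encoded term would arrive with one encoding layer too many.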

Greg

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Fri Apr 20, 2012 6:50 pm
by Timbalu
This is much clearer. Sorry for all these detours.
Is that a commonly distributed Apache version?
I still can't imagine why Apache should lose urlencoding on its way... on some systems....

After some searching... still not really knowing which one applies to your problem...

Did you try to add a [NE] (NoEscape) flag?

Code:

RewriteRule ^search/(.*) index.php?url=/search/$1 [L,NE,QSA] 
Is that the problem?

Or maybe you need to put a

Code:

RewriteMap esc int:escape
into the main httpd.conf and/or use

Code:

/${esc:$1}
in the htaccess (see http://rdfabout.com/demo/census/htaccess.txt).

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Sat Apr 21, 2012 10:19 am
by gregman
Timbalu wrote: I still can't imagine why Apache should lose urlencoding on its way... on some systems....
Me neither, until I experienced it on my own system after a server upgrade. I did try the NE flag, with no change. Because of the upgrade there are many other construction sites I have to deal with, so I didn't have the time to figure it out in detail. There is a PHP solution mentioned as a workaround on https://issues.apache.org/bugzilla/show ... i?id=34602. Anyway, according to the linked thread the bug still seems to be present.

Greg

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Sat Apr 21, 2012 10:41 am
by Timbalu
Well yes, I have read that, and according to post 16, the 'bug' gets solved with the second solution from my last post. If you follow the link, it describes nicely what happens on the systems showing this 'bug'.
Your patch seems to do well, but it is an Apache issue in the end, isn't it?!
That is why I asked whether that is a commonly distributed Apache version, which should give us a hint whether we need to be prepared for this more often via your patch or not.

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Tue Apr 24, 2012 8:49 am
by gregman
Ok, I see. My Apache was pre-installed by my hosting provider, but I have no doubt that it's the version commonly shipped on Ubuntu 10.04 systems.

Greg

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Thu Apr 26, 2012 2:26 pm
by Timbalu
I wish we had some better info on why this happens on some (rare) Apache installations.
@Garvin, what shall we do with it now?

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Thu Apr 26, 2012 5:04 pm
by garvinhicking
Timbalu wrote: I wish we had some better info on why this happens on some (rare) Apache installations.
@Garvin, what shall we do with it now?
If it fixes things, we can implement gregman's patch proposed on the first page of this thread; I don't see much harm in it, and it's easier to fix there than to do some regexp .htaccess mumbo jumbo that is harder to maintain than our PHP code...?

Regards,
Garvin

Re: REGEX pattern in serendipity_getUriArguments()

Posted: Tue Jun 19, 2012 1:58 pm
by gregman
Hi,

after some time I come back to this issue... mainly because the provided patch does not work for phrases and some other special characters, which are wisely encoded by s9y but which my Apache apparently decodes when performing a rewrite rule. Here you can see that this bug/behavior is resolved/changed in Apache 2.5, but I'm sure there are others like me who are stuck on an earlier version of Apache.
Timbalu wrote: Your patch seems to do well, but it is an Apache issue in the end, isn't it?!
Therefore I decided to examine the workarounds given in the above link and came to a solution which may be better than the former regex patch.

Apparently %{THE_REQUEST} is not affected by the bug/behavior, so it's possible to put it in a rewrite condition before the rewrite rule of the search pattern and reference it inside the rewrite rule with %1, like this

Code:

RewriteCond %{THE_REQUEST} ^GET\ {PREFIX}{PAT_SEARCH}\ HTTP/\d\.\d
RewriteRule ^{PAT_SEARCH} {indexFile}?url=/{PATH_SEARCH}/%1 [L,QSA]
Maybe you can check the fix?
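
For anyone who wants to check the idea outside of Apache, here is a rough Python sketch of what the rewrite condition captures. It is only an approximation of mod_rewrite's matching. The values used for {PREFIX} and {PAT_SEARCH} are assumptions based on the `^search/(.*)` example earlier in this thread, and the request line is made up.

```python
import re
from urllib.parse import unquote

# Assumed placeholder values, for illustration only; the real ones come
# from the Serendipity configuration.
PREFIX = "/"
PAT_SEARCH = r"search/(.*)"

# Python counterpart of:
#   RewriteCond %{THE_REQUEST} ^GET\ {PREFIX}{PAT_SEARCH}\ HTTP/\d\.\d
THE_REQUEST_RE = re.compile(
    r"^GET " + re.escape(PREFIX) + PAT_SEARCH + r" HTTP/\d\.\d"
)


def raw_search_term(request_line: str):
    """Return the still percent-encoded search term (what %1 would hold), or None."""
    match = THE_REQUEST_RE.match(request_line)
    return match.group(1) if match else None


request_line = "GET /search/gr%C3%BC%C3%9Fe HTTP/1.1"  # made-up request line
raw = raw_search_term(request_line)  # taken from the undecoded request line
decoded = unquote(raw)               # decoding is left to the application

print(raw, decoded)
```

Two things to keep in mind with this sketch: the condition only matches GET requests, and with a greedy `(.*)` anything up to the trailing ` HTTP/x.y`, including a query string, ends up in the capture.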

Regards
Greg