Embedded Blog and .htaccess Looping

DB
Regular
Posts: 22
Joined: Mon May 01, 2006 9:40 pm
Contact:

Embedded Blog and .htaccess Looping

Post by DB »

Admittedly, the looping part is of my own making.

Since I'm using .htaccess and mod_rewrite to create the pages for the blog, I thought I'd be really sneaky and use that to pull the blog entry titles out of the rewritten URLs and plug them into the <title> in the head of my website template (the shell I embedded my s9y blog into).

Here is what I did to my .htaccess file:

RewriteRule ^((archives/([0-9]+)-([0-9a-z\.\_!;,\+\-\%]+)\.html)/?) unite.html?/$1&title=$4 [NC,L,QSA]

RewriteRule ^(authors/([0-9]+)-([0-9a-z\.\_!;,\+\-\%]+)) unite.html?/$1&title=$3 [NC,L,QSA]

RewriteRule ^(categories/([0-9;]+)-([0-9a-z\.\_!;,\+\-\%]+)) unite.html?/$1&title=$3 [NC,L,QSA]

Note the &title= at the end.

Now all I have to do is read the title parameter from $_REQUEST, and voilà, I can plug it right into my HTML head and output the title to the browser. If I didn't do this, the embedded blog would have a single title for the whole blog, rather than giving each entry its own unique one. Make sense?

Problem: As you can probably see, the rewrites are now left wide open. My stats show spambots landing at URLs like this:

../archives/2008/css/css/css/css/css/css/css/css/css/css/css/css/css/css/
css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/
css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/
css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/
css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/
css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/
css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/css/
css/css/css/css/css/css/css/

and

../authors/css/print/css/login-signup/css/css/links/css/idea/css/css/
login-signup/login-signup/accessibility/css/css/idea/css/main_redux/css/css/main_redux/
css/login-signup/css/login/css/print/css/login-signup/links/css/css/print/css/css/index/archives/css/idea/css/css/links.html

Literally about 5000 variations of this.

So, does anyone know a good way to grab the title, and maybe even the meta description/keywords, for an entry in an embedded blog like the one I describe? The main site and the blog use separate databases, so I would like to avoid opening and closing multiple database connections just to grab this minuscule amount of info. Maybe some type of conditional in the rewrite? Help!
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Embedded Blog and .htaccess Looping

Post by garvinhicking »

Hi!

That's a really bad idea; this is completely the wrong way to do it *g* :)

This is why s9y has a templating system. Simply edit the index.tpl of your template and put the variable {$entry.title} at the place where you want it. Or install the s9y "meta tags" event plugin. :-)

Best regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/

Re: Embedded Blog and .htaccess Looping

Post by DB »

garvinhicking wrote: This is why s9y has a templating system. Simply edit the index.tpl of your template and put the variable {$entry.title} at the place where you want it. Or install the s9y "meta tags" event plugin. :-)
I think I'm going to rethink the way the blog is embedded. Rather than pull a hacked-up index.tpl (minus the HTML head) into an existing template using the wrapper.php method, I will split the main website layout into an opening header function and a closing footer function. Then I'll use those two function calls to wrap my content directly in my index.tpl, so that I can still have all the goodies from my main site layout and inject all the s9y goodies into the wrapper functions from the index.tpl. Darn, why didn't I think of that earlier? I'll do that and see if it works.

Post by DB »

Okay, I wrangled my main site template a bit so that I have an opening header function and a closing footer function. Then I registered them both in a config.inc.php file in my template's directory. I call the header just before the output of the blog content in index.tpl, and the closing footer just after. I also pull in frontend_header for the meta plugin. Whoo! Now I've still got everything embedded, and meta tags coming in without the .htaccess madness.

Oh, and I turned off mod_rewrite from the configuration screen, and then turned it back on. That way I've got a brand new shiny .htaccess file without all my fiddling around in it.

I'll have to wait and see if the strange URLs stop showing up in my stats now. However, I'm still curious why a URL like:

../authors/css/print/css/login-signup/css/css/links/css/idea/css/css/
login-signup/css/main_redux/login-signup/accessibility/css/css/print/login/index/archives/links/P1.html

couldn't be accessed. I mean, with the way the rewrite is, I don't think that will drop somebody off at the 404 destination. Wouldn't me putting the "&title" at the end simply just make it so that I could $_REQUEST the page, only if called by name?

Post by DB »

DB wrote: I mean, with the way the rewrite is, I don't think that will drop somebody off at the 404 destination. Wouldn't me putting the "&title" at the end simply just make it so that I could $_REQUEST the page, only if called by name?
Or is it just that this particular spambot is making random requests, and having the title variable there lets anything through? I think I just answered my own question ;)
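
For anyone checking what the "authors" rule from the first post really accepts, here is a minimal Python sketch of it. It is only a model, not how Apache runs the rule: the character class is rewritten without the redundant escapes, [NC] is approximated with re.IGNORECASE, and the sample paths (including the author name "1-db") are made up. Because the pattern has ^ but no $, anything appended after a valid prefix still matches, while a path that starts with authors/css/ does not match this particular rule at all.

```python
import re

# Python model of the thread's "authors" RewriteRule pattern (an approximation,
# not Apache itself). [NC] is modelled with re.IGNORECASE; redundant escapes in
# the character class were dropped, the set of allowed characters is unchanged.
AUTHORS_RULE = re.compile(r"^(authors/([0-9]+)-([0-9a-z._!;,+%-]+))", re.IGNORECASE)


def rewrite_target(path):
    """Return the substitution the rule would produce, or None if it does not match."""
    m = AUTHORS_RULE.search(path)
    if m is None:
        return None
    # Mirrors "unite.html?/$1&title=$3" from the rule in the first post.
    return "unite.html?/{}&title={}".format(m.group(1), m.group(3))


samples = [
    "authors/1-db",                       # hypothetical well-formed author URL
    "authors/1-db/css/css/css/main.css",  # same prefix with junk appended
    "authors/css/print/css/links.html",   # shape of the URLs seen in the stats
]
for path in samples:
    print(path, "->", rewrite_target(path))
```

If the model is right, the first two samples are rewritten to the same target and the third is not matched by this rule, so some other rule or a default handler would have to be answering those requests.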

Still Looping

Post by DB »

Okay, so my titles/metatags have been working great, and have been implemented using a better technique. However, I'm still noticing that every couple of weeks or so a referrer bot slams my site, and my stats list thousands of looping rewrite URLs. I have reset my .htaccess file to the default for a fresh setup. Has anyone else with mod_rewrite turned on experienced this? Sure, I can just block the IP behind the referrer spam, but I don't want it to take the server down in the meantime.


/authors/css/print/css/login-signup/css/css/links/P2/P2/css/idea/css/css/
login-signup/login-signup/accessibility/css/css/idea/css/main_redux/css/css/main_redux/
css/login-signup/css/login/css/css/css/css/print/P2/css/css/login/login/P2/login/P2/
login/css/idea/portal/css/css/login/idea/P2/P2/accessibility/css/css/main_redux/
css/P2/portal/P2/P2/P2/css/idea/css/main_redux/css/links/P2/css/print/P2/
css/main_redux/css/login-signup/css/idea/css/main_redux/css/css/css/css/css/accessibility/login/
css/css/css/

Is there any type of sanitization taking place in the scripts when a rewritten URL is requested, maybe limiting it to a certain number of directories or characters? It may be that my setup is unique and requires something else. I've got a .htaccess file in my root directory (s9y is in a subdirectory), so maybe something is taking place there. I can't see anything right off.

Re: Still Looping

Post by garvinhicking »

Hi!

This repetition sounds like you have some HTML page that uses relative CSS links instead of full ones, which leads to the looping. When dealing with virtual symlinks, you must make sure your HTML pages never contain a link to a structure that is invalid, because bad relative links will then stack on top of each other.

So check your references to the "css/print"... structure and make sure it always uses /path/to/css/print from the root of your web.
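
The stacking can be reproduced with a small Python sketch that resolves the same link over and over, the way a crawler would if every rewritten URL answered with a page containing that link. The host example.com and the starting path are placeholders, and the assumption that the server answers every such URL with the same HTML is mine, not something confirmed for this site.

```python
from urllib.parse import urljoin


def follow(start_url, href, hops):
    """Resolve the same href repeatedly, as a crawler would if every
    resulting URL served a page containing that same link."""
    url = start_url
    visited = []
    for _ in range(hops):
        url = urljoin(url, href)  # standard relative-reference resolution
        visited.append(url)
    return visited


page = "http://example.com/authors/1-db/"  # hypothetical page URL

# A relative link gains one more "css/" segment on every hop.
for url in follow(page, "css/main.css", 3):
    print(url)

# A root-relative link resolves to the same place no matter how deep the page is.
for url in follow(page, "/css/main.css", 3):
    print(url)
```

With the relative href, each hop is one directory deeper than the last; with the leading slash, every hop lands on the same URL, which is why the link should be written from the root of the web.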

Regards,
Garvin

Re: Still Looping

Post by DB »

garvinhicking wrote: So check your references to the "css/print"... structure and make sure it always uses /path/to/css/print from the root of your web.

It was "css/main.css" and now it is using "/css/main.css"

I also added a <base> tag with my base URL to the head, so we'll see if the combination takes care of things.
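
As a rough illustration of why either change should stop the stacking, here is a short Python sketch of link resolution with and without a base URL. It simplifies what a browser does (it assumes the <base> href is an absolute URL), and example.com plus the deep page path are placeholders rather than my real site.

```python
from urllib.parse import urljoin


def resolve(page_url, href, base_href=None):
    """Resolve href against the <base> href when one is given,
    otherwise against the URL of the page itself (simplified model)."""
    return urljoin(base_href or page_url, href)


# Hypothetical deep URL of the kind that showed up in the stats.
page = "http://example.com/blog/authors/1-db/css/css/"

print(resolve(page, "css/main.css"))                         # old relative link, no <base>
print(resolve(page, "css/main.css", "http://example.com/"))  # relative link with <base>
print(resolve(page, "/css/main.css"))                        # root-relative link
```

In this model only the first call produces a deeper URL; the other two resolve to the same stylesheet path regardless of which page they appear on.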

Thank you so much!