I just changed my permalink structure to be more SEO friendly. In this post I'll explain
how I keep the old URLs working at the same time AND make sure that Google
and other search engines are happy, even happier than in the past.
You'll find a great tip from Google at the very end of this post.
However, let me start with the full story.
When I originally set up my S9Y blog I chose the following
permalink structure for blog posts.
%year%-%month%/%id%-%title%
E.g.
http://blog.fcon21.biz/2009-01/230-Email-Marketing-Tips-Edition-16
As you can see I did a couple of things in an unusual way.
- I combined year and month of the post in one virtual directory
instead of using the more common .../2009/01/... notation.
- I used no file extension, which is actually recommended by W3C.org, and did
not use a trailing slash "/" either.
- I also wanted the post id, e.g. "230", as part of the filename because this
makes the URL more robust against truncation and mistyping. Serendipity is
actually very good at this and tolerates a lot of bogus input while still serving the correct
blog post.
- Additionally I configured monthly archives, reachable via, e.g., ...biz/2009-01/
- And of course I introduced short URLs in the form of, e.g., ...biz/p230/ with or
even without trailing slash. Those come in handy when used in emails.
The date in the URL is supposed to provide information about the publishing date of the
blog post. That's basically a good idea but I wanted it for the wrong reason.
I thought search engine users would enjoy the additional information in the URL and
that it would help them decide to click on the link.
Well, they enjoy it too much and don't even click through to "older" posts. Most web users,
with a few exceptions, are hunting for the latest, greatest information.
However, the idea in SEO is to get people to your site. Depending on the topic
of the particular article, the date when it was published can be fairly irrelevant
to whether it provides the solution the reader is looking for.
There is a lot of "evergreen" content on the net, and on my blog as well.
But using the date in the URL simply scares potential web visitors away.
I want people to read it. Therefore, the date has to go.
The URL is 8 characters shorter all of a sudden. (A side benefit.)
Bye-bye.
Now what about the blog post id?
That's definitely a very good parameter to use in the URL because it's a "quick" database
index, which is good for performance. It also helps to protect the URL from typos and truncation.
The above sample URL still works as
http://blog.fcon21.biz/2009-01/230-Emablah-blah-blah
You'll notice that I don't redirect the URL (what you see in the browser navigation bar)
to the correct address. I only serve the blog post according to the blog post index 230.
It doesn't really matter for the user. The link works and it can be bookmarked. I'll talk
about the search engines in a bit.
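To illustrate why an id in the URL tolerates a mangled title, here is a minimal Python sketch. It is not Serendipity's actual code. The regular expression and the function name post_id_from_url are assumptions made up for this example, modelled on the old %year%-%month%/%id%-%title% structure.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern for the old structure %year%-%month%/%id%-%title%.
# Only the numeric id is captured; whatever follows it is ignored.
OLD_PERMALINK = re.compile(r"^/\d{4}-\d{2}/(\d+)(?:-.*)?$")


def post_id_from_url(url):
    """Return the post id embedded in an old-style URL, or None if there is none."""
    match = OLD_PERMALINK.match(urlparse(url).path)
    return int(match.group(1)) if match else None


# The intact URL and the mangled one resolve to the same post id.
print(post_id_from_url("http://blog.fcon21.biz/2009-01/230-Email-Marketing-Tips-Edition-16"))
print(post_id_from_url("http://blog.fcon21.biz/2009-01/230-Emablah-blah-blah"))
```

Because only the digits after the directory are used for the lookup, everything after the id can be wrong or cut off and the post is still found.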
What about SEO?
It's fairly safe to assume that the most experienced SEO experts for blogging
are active in the WordPress world.
So let's learn from them. I'll keep it short.
- The URL should contain keywords in the domain and the path.
- Cryptic URL parameters, like index.php?author=A0076&post_id=234867&category=23&language=en-gb
are not so good. Easy to see why, isn't it?
- The shorter the better.
- Omitting the date has more human than SEO reasons, but in general the structure
should not go too deep either because that can limit the set of keywords your page
can rank for. E.g.
.../animals/mammals/small/rabbit/rabbit-flowers-garden
The flower and garden part of the post title has a hard time competing against
the animal part in the path, i.e. the virtual directory structure. This might be
a silly example, but it shows the point.
- Testing and statistical analysis indicate that sites with 1 virtual directory before
the post title rank better. There's a lot of debate about this subject and search engine
algorithms are modified frequently. So in reality we are never too sure about it. I
simply trust my sources (without disclosing them here).
That gives something like this:
example.com/email-marketing/why-squeeze-page/
With or without the trailing slash "/". Using the trailing slash gives the hint that
the URL is complete and no letters have been lost. So it's actually good for
human readability.
Or you might have seen a lot of WordPress blog posts like this
example.com/why-squeeze-page/
This leaves out the category and is shorter as well. Bingo!
Bingo? Let's not forget that a category like "email-marketing" in the previous example
could influence the ranking in a positive or negative way. Therefore
my sources suggest using numbers (digits) instead. They don't have a
negative influence.
Now I say, "Bingo"!
That's the perfect place for the blog post id.
%id%/%title%
Easy, effective, elegant, robust.
My original example
http://blog.fcon21.biz/2009-01/230-Email-Marketing-Tips-Edition-16
becomes
http://blog.fcon21.biz/230/Email-Marketing-Tips-Edition-16
and the associated short version
http://blog.fcon21.biz/230/
You'll notice all three URLs work, of course.
That's how it should be.
You want to avoid link rot and keep old URLs alive through redirection. I basically
achieved that goal with a modified .htaccess file and Apache mod_rewrite. I
simply added rewrite rules that map the old versions of the URL to the new ones.
Done!
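The actual rules live in my .htaccess file and are not reproduced here. As a rough illustration of the mapping such rules perform, here is a small Python sketch. The two patterns and the function name new_path are hypothetical, and mapping the old short URL /p230 to /230/ is an assumption made for the example.

```python
import re

# Assumed old forms: /2009-01/230-Title and the short URL /p230 or /p230/.
OLD_LONG = re.compile(r"^/\d{4}-\d{2}/(\d+)-([^/]+)/?$")
OLD_SHORT = re.compile(r"^/p(\d+)/?$")


def new_path(old_path):
    """Map an old-style path to the new %id%/%title% form, or return None."""
    match = OLD_LONG.match(old_path)
    if match:
        # Keep id and title, drop the year-month directory.
        return "/{}/{}".format(match.group(1), match.group(2))
    match = OLD_SHORT.match(old_path)
    if match:
        # Short URLs carry only the id.
        return "/{}/".format(match.group(1))
    return None  # not an old post URL (e.g. a monthly archive)


print(new_path("/2009-01/230-Email-Marketing-Tips-Edition-16"))
print(new_path("/p230"))
```

A rewrite rule does the same kind of thing: it matches the old path with a regular expression and reuses the captured id and title in the new path.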
Attention: When you save certain changes in the Serendipity (S9Y) configuration
in the administration panel it will overwrite custom changes in the local .htaccess
file. That's why it is always a good idea to keep a backup with date stamp and
a README file for taking notes about particular modifications. I also made a tiny
hack in 1 or 2 Serendipity core files to make my life easier.
What about duplicate content?
Now I can access the same resource with many different URLs. That doesn't sound
too good for SEO purposes.
The most elegant way to deal with this issue is to do a "301 Permanent Redirect" to
the new permalink URL. In this case the displayed URL in the navigation bar of the Internet browser would change as well. And smart browsers could even update old bookmarks
automatically.
(I did not check on this, but speaking from my limited experience with version 1.1.2
I assume that this would require a change in the core files:
* A check whether the requested URL is identical to the permalink and, if not,
a 301 permanent redirect. This possibly requires an additional database call; I don't
know at which stage the permalink is read or constructed.)
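The check described here could look roughly like the following Python sketch. This is not Serendipity code. The PERMALINKS table stands in for the database, and the function name respond is made up for the illustration.

```python
# Hypothetical lookup table standing in for the blog database.
PERMALINKS = {230: "/230/Email-Marketing-Tips-Edition-16"}


def respond(requested_path, post_id):
    """Return (status, location) for a request already resolved to post_id."""
    permalink = PERMALINKS[post_id]  # the possible extra database call
    if requested_path == permalink:
        return 200, None  # already the permalink: serve the post
    return 301, permalink  # any other variant: permanent redirect


print(respond("/2009-01/230-Emablah-blah-blah", 230))
print(respond("/230/Email-Marketing-Tips-Edition-16", 230))
```

Every URL variant that resolves to the post would then end up at one address, which is the effect wanted for duplicate content.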
There is a much easier solution. Thanks to Google.
They now allow you to specify the canonical form of the URL, which can be
achieved by a very easy change in the index.tpl template.
Simply add:
{if $entry_id}
...
<link rel="canonical" href="{$entry.link}" />
{/if}
Fabulous.
Here's the blog post from Google.
Webmaster Central Blog :: Specify Your Canonical
Yours
John W. Furst
Update: Some additional observations about what can happen. (Call for precaution!)
In the P.S. of my blog post SEO Friendly Permalink Structure - Update