Sitemap Features / Bugs

Creating and modifying plugins.
psx
Regular
Posts: 78
Joined: Sun Dec 16, 2007 7:09 pm

Sitemap Features / Bugs

Post by psx »

Feature Request

1)

Please let the plugin add an entry to the .htaccess file that points to the sitemap.xml:

Sitemap: <sitemap_location>

2)

The sitemap should include ALL URLs on my blog. At the moment the URLs from the link list are missing; for example, my contact page does not show up:

http://www.dani.schlumpfhausen.info/pages/kontakt.html

Post by psx »

You can check the output of the sitemap plugin installed via Spartacus:

http://www.dani.schlumpfhausen.info/sitemap.xml

compared with the output of the Google Python sitemap script:

http://www.dani.schlumpfhausen.info/sitemap-google.xml

In this example these files are excluded:

  <filter action="drop" type="wildcard" pattern="*~" />
  <filter action="drop" type="wildcard" pattern="*config*" />
  <filter action="drop" type="wildcard" pattern="*inc*" />
  <filter action="drop" type="wildcard" pattern="*bundled-libs*" />
  <filter action="drop" type="wildcard" pattern="*admin*" />
  <filter action="drop" type="wildcard" pattern="*sql*" />

  <filter action="drop" type="regexp" pattern="/\.[^/]*" />

I have some problems with the Google script: it is hard to exclude "private" files. Out of the box the script simply scans every single file and puts it into the sitemap.xml, and surely none of us want config files etc. included.
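
For anyone wondering how drop rules like these behave, here is a rough Python sketch that applies the same patterns to URLs. It is not the Google script's own code. It assumes wildcard rules are shell-style globs matched against the whole URL and regexp rules are searched anywhere in the URL; the names DROP_RULES and is_dropped are made up for illustration, and example.org is a placeholder.

```python
import fnmatch
import re

# Drop rules mirroring the filter config in this post: (type, pattern).
DROP_RULES = [
    ("wildcard", "*~"),
    ("wildcard", "*config*"),
    ("wildcard", "*inc*"),
    ("wildcard", "*bundled-libs*"),
    ("wildcard", "*admin*"),
    ("wildcard", "*sql*"),
    ("regexp", r"/\.[^/]*"),
]


def is_dropped(url, rules=DROP_RULES):
    """Return True if any drop rule matches the URL (assumed semantics)."""
    for kind, pattern in rules:
        # Wildcard: case-sensitive glob against the full URL.
        if kind == "wildcard" and fnmatch.fnmatchcase(url, pattern):
            return True
        # Regexp: match anywhere in the URL.
        if kind == "regexp" and re.search(pattern, url):
            return True
    return False


if __name__ == "__main__":
    for candidate in (
        "http://example.org/pages/kontakt.html",
        "http://example.org/serendipity_config.inc.php",
        "http://example.org/.htaccess",
    ):
        print(candidate, "dropped" if is_dropped(candidate) else "kept")
```

Under these assumptions a broad pattern such as *inc* drops every URL that merely contains "inc", so it pays to keep the patterns narrow.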

I hope you can provide an "extended" version of the sitemap plugin that catches all relevant links.
garvinhicking
Core Developer
Posts: 30022
Joined: Tue Sep 16, 2003 9:45 pm
Location: Cologne, Germany
Contact:

Re: Sitemap Features / Bugs

Post by garvinhicking »

Hi!
psx wrote: Please let the plugin add an entry to the .htaccess file that points to the sitemap.xml
What? I've never heard of such a .htaccess command! I don't think Apache can interpret that!
psx wrote: The sitemap should include ALL URLs on my blog. At the moment the URLs from the link list are missing; for example, my contact page does not show up
The sitemap plugin would need to support all event plugins. That is hard work. If a developer wants to do that, we'll happily add it to the plugin; it's just a matter of SQL querying.

About your excludes: Google cannot index your PHP files; it only sees what the web server delivers. Those files cannot be viewed as text.

Regards,
Garvin
# Garvin Hicking (s9y Developer)
# Did I help you? Consider making me happy: http://wishes.garv.in/
# or use my PayPal account "paypal {at} supergarv (dot) de"
# My "other" hobby: http://flickr.garv.in/

Post by psx »

I think the sitemap is a very important thing; I hope some developer will do that work!

I knew that about indexing, but I still don't want Google to get those files into its index!

And the last point, stupid me: I meant robots.txt, not .htaccess.


You can specify the location of the Sitemap using a robots.txt file. To do this, simply add the following line:
Sitemap: <sitemap_location>

Post by garvinhicking »

Hi!
psx wrote: I think the sitemap is a very important thing; I hope some developer will do that work!
I agree. But the most important pages are already added: blog articles and static pages. Extra stuff like the guestbook and contact form are all separate plugins that would need querying. It would be nice, but I don't really think it is vital for those to be indexed by a Google spider.
psx wrote: I knew that about indexing, but I still don't want Google to get those files into its index!
But why should Google index such a URL? There are no links to your config files...?
psx wrote: I meant robots.txt, not .htaccess.
Okay. However, the plugin would then need write access to one more file, robots.txt, and it might overwrite your custom specifications. IMHO it would be safer for users to modify those files on their own? But I also think it would be possible to add it to that plugin, yes.
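
If a tool were to do this, the overwrite risk could be avoided by only ever appending. A minimal Python sketch of that idea follows; it is not plugin code (the plugin is PHP), and the function name add_sitemap_line and the example URL are hypothetical.

```python
from pathlib import Path


def add_sitemap_line(robots_path, sitemap_url):
    """Append 'Sitemap: <url>' to robots.txt unless it is already there.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(robots_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""

    for line in existing.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return False  # already present, leave the file untouched

    # Keep every existing rule; only make sure the file ends with a newline.
    if existing and not existing.endswith("\n"):
        existing += "\n"
    path.write_text(existing + f"Sitemap: {sitemap_url}\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    changed = add_sitemap_line("robots.txt", "http://www.example.org/sitemap.xml")
    print("robots.txt updated" if changed else "Sitemap line already present")
```

Existing User-agent and Disallow rules are kept as they are, and running it twice does not add a duplicate line.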

Regards,
Garvin

Post by psx »


garvinhicking wrote: But why should Google index such a URL? There are no links to your config files...?
That is exactly what the Google script does: it indexes all files in your directories.

If you don't know it, you can take a look at it:

sitemap_gen-1.4 Google Inc. <opensource@google.com>
http://heanet.dl.sourceforge.net/source ... 1.4.tar.gz

garvinhicking wrote: Okay. However, the plugin would then need write access to one more file, robots.txt, and it might overwrite your custom specifications. IMHO it would be safer for users to modify those files on their own? But I also think it would be possible to add it to that plugin, yes.
I think you are right, especially if the blog isn't the main page... and the plugin wouldn't put all the links outside the blog into the sitemap.xml....
scottwalsh
Regular
Posts: 17
Joined: Sun May 20, 2007 9:05 am
Contact:

Post by scottwalsh »

Hi psx,

The sitemap module currently supports the following types of pages:
Entries
Categories
Authors
Archives
Feeds
Static Pages
Tag Pages

As a workaround you could try using the static pages module to link to your contact pages…


As Garvin mentioned, adding support for each plugin requires a change to the sitemap plugin.

If you know a little PHP, it is not hard to do. I submitted a patch to support tag pages and it didn't take long to do… It would just take a while to support all the various plugins.
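
To give an idea of what "adding support" amounts to: the extra URLs have to be collected (for s9y that would be a query per plugin) and one url entry written for each. Here is a rough Python sketch of just the output side, assuming the standard sitemaps.org format. The real plugin is PHP, and the function name build_sitemap is only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # In a real plugin this list would come from the blog's database.
    extra_pages = ["http://www.dani.schlumpfhausen.info/pages/kontakt.html"]
    print(build_sitemap(extra_pages))
```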
------

Scott Walsh

http://zone3.net.nz/
profile, journal of travels and photos

Post by garvinhicking »

Hi!
psx wrote: That is exactly what the Google script does: it indexes all files in your directories.
No.

Google and other bots do not know what your filenames are. They are not exposed; nobody can EVER know the filesystem layout of a document root if the files are not linked somewhere.

Regards,
Garvin

Post by psx »

garvinhicking wrote: Hi!
psx wrote: That is exactly what the Google script does: it indexes all files in your directories.
No.

Google and other bots do not know what your filenames are. They are not exposed; nobody can EVER know the filesystem layout of a document root if the files are not linked somewhere.

Regards,
Garvin

That's what I tried to say: if you use Google's sitemap script tools, they generate a complete sitemap including all files in your directories...

Post by garvinhicking »

Hi!
psx wrote: That's what I tried to say: if you use Google's sitemap script tools, they generate a complete sitemap including all files in your directories...
Why use the sitemap script tools when you use the s9y sitemap plugin? The plugin is responsible for the s9y page layout; the directory-based sitemap tools are worth nothing for s9y.

Regards,
Garvin

Post by psx »

You are right!

That's why I asked for an improvement ;-)