Google Sitemaps & Textpattern
03 June 2005 @ mid-afternoon
Update (03 Jun 2005): Now possible with Sencer’s sitemap plugin.
After checking out Social Patterns’ Google Sitemaps solution for WordPress, I decided to modify Michael’s code and adapt it for Textpattern.
Download the source file (rename the file extension to .php) and place it in the document root of your domain. You may have to adjust the path to your textpattern directory in the first two lines of the file. You can also edit the lists of sections and categories that are excluded from the sitemap.
Paranoid about unwelcome eyes peering at your sitemap? Add a few lines to your .htaccess:
RewriteCond %{HTTP_USER_AGENT} !^GoogleBot [NC]
RewriteRule ^sitemap\.php$ - [F,L]
This returns a 403 Forbidden to any client whose user agent does not start with Googlebot, so only Googlebot can view your sitemap file. This is not a foolproof method: it is trivial to spoof a user agent, so this really only provides superficial security.
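To make the behaviour of that rule concrete, here is a rough Python sketch of the same test: a case-insensitive match anchored at the start of the User-Agent header. The function name `is_blocked` and the sample strings are my own illustrative assumptions; nothing here is part of the sitemap script itself.

```python
import re

# Mirrors the RewriteCond pattern "^GoogleBot" with the [NC] (no-case) flag.
GOOGLEBOT_PREFIX = re.compile(r"^GoogleBot", re.IGNORECASE)


def is_blocked(user_agent: str) -> bool:
    """Return True when the rule would answer 403 Forbidden (hypothetical helper)."""
    return GOOGLEBOT_PREFIX.search(user_agent) is None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (Windows; U; Windows NT 5.1)",    # ordinary browser
        "Googlebot/2.1 (+http://www.google.com/bot.html)",  # starts with Googlebot
        "googlebot-pretender/0.1",                     # spoofed, still starts with it
    ]
    for ua in samples:
        print("blocked" if is_blocked(ua) else "allowed", "-", ua)
```

Two things fall out of this. Any client that sends a header beginning with "Googlebot" gets through, which is the spoofing problem. And because of the `^` anchor, a user agent that only contains "Googlebot" later in the string would be blocked, so the anchor may need dropping if the crawler identifies itself that way.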
Keep in mind that this sitemap generation method will not include files outside of Textpattern. If your needs go beyond this, consider creating multiple sitemaps, using a sitemap index, and/or creating a sitemap with Google’s Sitemap Generator. Additionally, this script will not include your root section pages in the sitemap (perhaps I’ll get around to adding that).
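For the multiple-sitemaps route, a sitemap index is just a small XML file listing the location of each sitemap. The sketch below builds one in Python. It assumes the sitemaps.org 0.9 namespace (the schema Google published in 2005 used a different namespace URL), and `build_sitemap_index` and the example.com URLs are hypothetical names for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemaps.org 0.9 namespace; adjust if you target another schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing each sitemap URL."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        # ElementTree escapes characters such as "&" when serialising.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap_index([
        "http://example.com/sitemap.php",         # the Textpattern sitemap
        "http://example.com/static-sitemap.xml",  # pages outside Textpattern
    ]))
```

You would submit the index instead of the individual files, and each listed sitemap still has to respect the per-file limits on its own.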
With regard to exceeding the 10MB maximum sitemap size, I’m not really sure how I’d break the file up or at what point it would become necessary. A 10MB file would hold quite a lot of URLs, and I seriously doubt any Textpattern site currently exists with more than 50,000 URLs or 10MB of sitemap. If one did, the script could probably be modified fairly easily to fetch URLs with a limit of 50,000 and an offset, producing one sitemap per batch.
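The limit-and-offset idea might look something like the following. This is only a sketch of the arithmetic in Python, not a change to the actual script; `sitemap_pages` is a made-up helper, and how the pairs are fed into the database query is left open.

```python
MAX_URLS_PER_SITEMAP = 50_000  # the per-sitemap URL cap mentioned above


def sitemap_pages(total_urls, limit=MAX_URLS_PER_SITEMAP):
    """Yield (offset, count) pairs that together cover total_urls rows.

    Each pair describes one sitemap file: skip `offset` rows, then
    take `count` rows (at most `limit`).
    """
    for offset in range(0, total_urls, limit):
        yield offset, min(limit, total_urls - offset)


if __name__ == "__main__":
    # A hypothetical site with 120,000 URLs needs three sitemap files.
    for page, (offset, count) in enumerate(sitemap_pages(120_000), start=1):
        print(f"sitemap {page}: LIMIT {count} OFFSET {offset}")
```

For 120,000 URLs this yields (0, 50000), (50000, 50000) and (100000, 20000). It only handles the URL count; a check on the 10MB size would have to be done separately on the generated output.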
That’s about all I’ve got at the moment. Hopefully some of this will be of use, and not too much of it is flawed. You may also find Google’s list of third-party solutions based on Google Sitemaps useful.