I've used quite a few dynamic scripting languages over the last couple of years, including Groovy, Ruby and Python, but I keep coming back to Python. I think this time it's down to Peter Butler (a guy I worked with a while ago) and his complete love of the language, and I think I'm starting to see why.
Over the last week I've been bashing away at improving the rather outdated www.logicalpractice.com, and it occurred to me that it wouldn't be a bad idea to generate a sitemap XML file for Google and the other search bots.
The following code is my solution. I'm sure it's not the best Python in the world, but I do kinda like the look of it.
from __future__ import with_statement
import xmlbuilder
import sys
import os
from datetime import datetime
from xml.dom.minidom import parse as parseDom
from xml.dom.minidom import Node

def url_element(xml, loc, lastmod, changefreq="weekly", priority=0.5):
    with xml.url:
        if loc.startswith("http:"):
            xml.loc(loc)
        else:
            xml.loc("http://www.logicalpractice.com%s" % loc)
        xml.lastmod(lastmod.strftime("%Y-%m-%d"))
        xml.changefreq(changefreq)
        xml.priority(priority)

def lastmod(file_name):
    global basedir
    last_mod = os.path.getmtime(os.path.join(basedir, file_name))
    return datetime.fromtimestamp(last_mod)

basedir = os.path.join(os.path.dirname(sys.argv[0]), "..", "..")
xml = xmlbuilder.builder(version="1.0", encoding="utf-8")
with xml.urlset(xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"):
    url_element(xml, "/", lastmod("index.jsp"), priority=1.0)
    url_element(xml, "/news.jsp", lastmod("news.jsp"), priority=0.8)
    url_element(xml, "/projects.jsp", lastmod("projects.jsp"), priority=0.5)
    url_element(xml, "/profile.jsp", lastmod("profile.jsp"), priority=0.5)
    # generate elements from the news.rss
    rss = parseDom(os.path.join(basedir, "news.rss"))
    for node in rss.getElementsByTagName("item"):
        link = node.getElementsByTagName("link")[0].firstChild.data
        strdate = node.getElementsByTagName("pubDate")[0].firstChild.data
        date = datetime.strptime(strdate, "%a, %d %b %Y %H:%M:%S +0000")
        url_element(xml, link, date, priority=0.5)
print xml
The xmlbuilder used here is from Jonas Galvez, via GitHub. It seems a very simple and elegant solution for building XML documents.
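If you'd rather not pull in an extra library, roughly the same document can probably be put together with the standard library's xml.etree.ElementTree. What follows is only a minimal sketch, assuming Python 3; the names SITE, add_url and build_sitemap are made up for illustration and are not part of the script above or of xmlbuilder.

```python
from datetime import datetime
import xml.etree.ElementTree as ET

# Assumed constants, copied from the values used in the script above.
SITE = "http://www.logicalpractice.com"
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def add_url(urlset, loc, lastmod, changefreq="weekly", priority=0.5):
    """Append one <url> entry; site-relative paths are prefixed with SITE."""
    url = ET.SubElement(urlset, "url")
    full_loc = loc if loc.startswith("http:") else SITE + loc
    ET.SubElement(url, "loc").text = full_loc
    ET.SubElement(url, "lastmod").text = lastmod.strftime("%Y-%m-%d")
    ET.SubElement(url, "changefreq").text = changefreq
    ET.SubElement(url, "priority").text = str(priority)
    return url


def build_sitemap(entries):
    """Build the sitemap from (loc, lastmod, priority) tuples and return it as a string."""
    # Setting xmlns as a plain attribute keeps the child tag names unprefixed.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod, priority in entries:
        add_url(urlset, loc, lastmod, priority=priority)
    return ET.tostring(urlset, encoding="unicode")
```

Calling build_sitemap([("/", datetime(2009, 1, 1), 1.0)]) with a placeholder date returns the urlset as a single string. It is more verbose than the with-block style, which is a fair part of why xmlbuilder appeals.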
How do I know that Python must be a good thing? Well, anything that I get up at 5 in the morning to code a bit more of before work has to be a good thing.
1 comment:
Hey, xmlbuilder is now xmlwitch, now featuring a proper setup script and documentation.