Weather reports via XML

From Wildsong

Forecasts

I used to run a perl screenscraper that downloaded the weather forecast page for my area from the National Weather Service.

The script captured the parts that interested me and stored them in an HTML file. The HTML then could be embedded in my home page.

It works okay, but NWS now makes XML versions available, so I am going to use those instead. I can get the data and format it any way that I choose, as an exercise in working with XML. Note that in addition to XML you can now fetch the data in table format too, should you want a simpler solution. (Yes, I realize this has already been done 100 times before by other people.)

Visit the NWS web site, click on the map to find the area of interest, click on "New products" and you will get a form to fill out to select your area and the parameters you want for your feed. Here is the URL that it gave me:

http://ifps.wrh.noaa.gov/cgi-bin/dwf?outFormat=xml&duration=72hr&interval=6&citylist=CORVALLIS%2C44.5710%2C-123.2760&city=Go&latitude=&longitude=&ZOOMLEVEL=1&XSTART=&YSTART=&XC=&YC=&X=&Y=&siteID=PQR

I don't want to hit the NWS web site every time someone requests a page, so I will download the forecast XML page periodically with a cron script. Once I have the XML in a local file (or while downloading it) I can process it with a server-side script. I can either build a DOM/XHTML solution that defers further formatting to the browser, or do more work in the server-side script to render the XML into generic HTML.
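One detail of the cron approach is that the web server may read the local file while the script is still writing it. A minimal sketch of one way around that, assuming Python 3's standard library (not the Python 2 used elsewhere on this page): write to a temporary file in the same directory, then rename it over the old copy. The function name `save_atomically` and the idea that the web server runs as a different user are assumptions for illustration.

```python
import os
import tempfile

def save_atomically(path, data):
    """Write bytes to path so that readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target
    # for the final rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # mkstemp creates the file readable by its owner only; open it up
        # so a web server running as another user can read it (assumption).
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)  # atomic rename on POSIX systems
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The cron job would call this with the downloaded XML (and later with the generated HTML), so the page served to visitors is always either the old complete file or the new complete file.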

To do XHTML, my script would build a page that includes the contents of the forecast XML file. This new XHTML page would require that the browser understand the XML, the XSL stylesheet, and the JavaScript. This means it will never work on my Audrey! This will never do.

I spent some time playing around with XSL and Firefox. XSL is definitely worth knowing about, but for this application I think I will instead make my cron script do more work. In addition to downloading the XML, it will convert the XML to HTML and store it on the server.

The finished page weather.html will be loaded into an iframe on my home page.

Here is a sample of the XML from the weather forecast URL above.

Here is more information on Cascading Style Sheets and a tutorial on CSS2 and XML.

I have written perl for about 100 years but never tried processing XML with it. I know I could do it so it does not hold much attraction for me. Since I am also trying to learn Python I will give it a crack.

There are three steps here: (1) downloading the XML, (2) parsing it, and (3) writing it out to disk as HTML. As the script will be run from cron (probably once per hour), I don't have to think about building it as a daemon or anything like that.

Downloading

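#!/usr/bin/python
#
# Download the NWS forecast XML and print it (Python 2; uses urllib2).
import urllib2

def download(url):
    # Return the body of the page at url, or an empty string if the
    # request fails with a network or HTTP error.
    try:
        fd_url = urllib2.urlopen(url)
        data = fd_url.read()
        fd_url.close()
    except IOError:
        # urllib2.URLError and socket errors are subclasses of IOError.
        data = ""
    return data

url = "http://ifps.wrh.noaa.gov/cgi-bin/dwf?outFormat=xml&duration=72hr&interval=6&citylist=CORVALLIS%2C44.5710%2C-123.2760&city=Go&latitude=&longitude=&ZOOMLEVEL=1&XSTART=&YSTART=&XC=&YC=&X=&Y=&siteID=PQR"

print ">>Downloading forecast from NWS."
xml = download(url)
print xml
print ">>Done."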

That certainly was easy.

Parsing XML

PyXML HOWTO
More XML/Python links
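As a rough idea of what the parsing step could look like, here is a minimal sketch using `xml.etree.ElementTree` from Python 3's standard library rather than PyXML. The sample document is hypothetical: the element and attribute names (`forecast`, `period`, `temperature`, `weather`, `start`) are invented for illustration and are not the real NWS forecast schema, so the lookups would need to be adjusted to match the actual feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. These element and attribute names are invented
# for illustration and are NOT the real NWS forecast schema.
SAMPLE_XML = """\
<forecast city="CORVALLIS">
  <period start="2005-01-01T06:00">
    <temperature units="F">41</temperature>
    <weather>Rain</weather>
  </period>
  <period start="2005-01-01T12:00">
    <temperature units="F">48</temperature>
    <weather>Showers</weather>
  </period>
</forecast>
"""

def parse_forecast(xml_text):
    """Return a list of dicts, one per <period> element."""
    root = ET.fromstring(xml_text)
    periods = []
    for period in root.iter("period"):
        temp_text = period.findtext("temperature")
        periods.append({
            "start": period.get("start"),
            # Leave the temperature as None if the element is missing.
            "temperature": int(temp_text) if temp_text else None,
            "weather": period.findtext("weather", default=""),
        })
    return periods

if __name__ == "__main__":
    for p in parse_forecast(SAMPLE_XML):
        print(p["start"], p["temperature"], p["weather"])
```

Turning the XML into a plain list of dictionaries keeps the parsing separate from the HTML generation, so only this function has to change if the feed format does.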

Writing HTML output
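The output step could be as simple as building a table from the parsed values and writing it to weather.html. The following is a sketch under assumptions, written for Python 3: the input is a list of dictionaries with the hypothetical keys `start`, `temperature`, and `weather`, and the function names `render_html` and `write_html` are made up for this example.

```python
from html import escape

def render_html(city, periods):
    """Build a small HTML page with one table row per forecast period."""
    rows = []
    for p in periods:
        rows.append(
            "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                # Escape every value so stray '<' or '&' cannot break the page.
                escape(str(p["start"])),
                escape(str(p["temperature"])),
                escape(str(p["weather"])),
            )
        )
    return (
        "<html><head><title>Forecast for {city}</title></head>\n"
        "<body>\n<table>\n"
        "<tr><th>Time</th><th>Temp (F)</th><th>Weather</th></tr>\n"
        "{rows}\n"
        "</table>\n</body></html>\n"
    ).format(city=escape(city), rows="\n".join(rows))

def write_html(path, html_text):
    """Write the rendered page to disk, for example as weather.html."""
    with open(path, "w", encoding="utf-8") as out:
        out.write(html_text)
```

Because the result is generic HTML with no XSL or JavaScript, it should display in very limited browsers as well as in an iframe on the home page.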

Current conditions

Now that I have dealt with forecasts, I still want to work with the current conditions data from the National Weather Service as well.

I want to be able to log current conditions to RRD databases, so that I can plot the information in Cricket graphs. To do this I need to write a cricket 'collector' script. This is a piece of glue that will parse the XML and output simple numbers that can then be inserted into the Cricket RRD database.

Heading back to the NWS site, I see I can get observations in RSS and XML format.

RSS: http://www.weather.gov/data/current_obs/KCVO.rss
XML: http://www.weather.gov/data/current_obs/KCVO.xml

The RSS version is for insertion into a web page; the XML version is more like raw data. For logging to a database, the XML version is what I want.

Here is a link to the RSS 2.0 specification at Harvard Law.

I want to collect visibility, temperature, precipitation, barometric pressure, dew point, humidity, wind speed and direction.
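A collector along these lines might look like the sketch below (Python 3 standard library). Several things here are assumptions to verify against the real KCVO.xml feed and the Cricket setup: the element names in `FIELDS`, the sample values, and the choice to print one value per line with `U` (RRDtool's marker for an unknown value) where a reading is missing. Precipitation is left out because it is not clear which element, if any, carries it.

```python
import xml.etree.ElementTree as ET

# Assumed element names; check them against the actual current_obs feed.
FIELDS = ["visibility_mi", "temp_f", "pressure_mb", "dewpoint_f",
          "relative_humidity", "wind_mph", "wind_degrees"]

# Made-up sample values, shaped like a current-observation document.
SAMPLE_XML = """\
<current_observation>
  <station_id>KCVO</station_id>
  <visibility_mi>10.00</visibility_mi>
  <temp_f>52.0</temp_f>
  <pressure_mb>1013.2</pressure_mb>
  <dewpoint_f>45.0</dewpoint_f>
  <relative_humidity>77</relative_humidity>
  <wind_mph>6.9</wind_mph>
  <wind_degrees>230</wind_degrees>
</current_observation>
"""

def extract_values(xml_text, fields=FIELDS):
    """Return one float (or None if missing or non-numeric) per field."""
    root = ET.fromstring(xml_text)
    values = []
    for name in fields:
        text = root.findtext(name)
        try:
            values.append(float(text))
        except (TypeError, ValueError):
            # TypeError: element absent; ValueError: text such as "NA".
            values.append(None)
    return values

def format_for_collector(values):
    """One value per line; 'U' stands in for an unknown reading."""
    return "\n".join("U" if v is None else "%g" % v for v in values)

if __name__ == "__main__":
    print(format_for_collector(extract_values(SAMPLE_XML)))
```

Keeping the field list in one place means the order of the output lines is fixed, which matters if each line is mapped to a separate data source in the graphing configuration.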