Weather reports via XML

Latest revision as of 23:41, 5 January 2006

Forecasts

I used to run a Perl screen scraper that downloaded the weather forecast page for my area from the National Weather Service (http://www.nws.noaa.gov/).

The script captured the parts that interested me and stored them in an HTML file, which could then be embedded in my home page.

It works okay, but NWS now makes XML versions available, so I am going to use those instead. I can get the data and format it any way that I choose as an exercise in working with XML. Note that in addition to XML you can now fetch the data in table format too, should you want a simpler solution. (Yes, I realize this has already been done 100 times before by other people.)

Visit the NWS web site, click on the map to find the area of interest, then click on "New products". You will get a form to fill out to select your area and the parameters you want for your feed. Here is the URL that it gave me:

http://ifps.wrh.noaa.gov/cgi-bin/dwf?outFormat=xml&duration=72hr&interval=6&citylist=CORVALLIS%2C44.5710%2C-123.2760&city=Go&latitude=&longitude=&ZOOMLEVEL=1&XSTART=&YSTART=&XC=&YC=&X=&Y=&siteID=PQR

I don't want to hit the NWS web site every time someone requests a page, so I will download the forecast XML page periodically with a cron script. Once I have the XML in a local file (or while downloading it) I can process it with a server-side script. I can either build a DOM/XHTML solution that defers further formatting to the browser, or I can do more work in the server-side script to render the XML into generic HTML.
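
Because the web server may read the local copy at the same moment cron is refreshing it, it can help to write the file in one step. The following is a minimal sketch, not part of the original script: the function name save_atomically and the file name forecast.xml are made up for illustration. It writes to a temporary file in the same directory and then renames it over the old copy.

```python
import os
import tempfile


def save_atomically(data, path):
    """Write bytes to path so that readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # mkstemp creates the file readable by its owner only; open it up
        # so a web server running as another user can read it.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything went wrong, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A cron job could call save_atomically(xml_bytes, "forecast.xml") after each successful download and simply skip the call when the download fails, leaving the previous copy in place.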

To do XHTML, my script would build a page that includes the contents of the forecast XML file. This new XHTML page would require that the browser understand the XML, the XSL stylesheet, and the JavaScript. This means it will never work on my Audrey! This will never do.

I spent some time playing around with XSL and Firefox. XSL is definitely worth knowing about, but for this application I think that I will instead make my cron script do more work. In addition to downloading the XML, it will convert the XML to HTML and store it on the server.

The finished page weather.html will be loaded into an iframe on my home page.

Here is a sample of the XML (http://www.wildsong.biz/static/forecast.xml) from the weather forecast URL above.

Here is more information on Cascading Style Sheets (http://www.w3schools.com/css/css_reference.asp) and a tutorial on CSS2 and XML (http://www.zvon.org/xxl/CSS2Tutorial/General/htmlIntro.html).

I have written Perl for about 100 years but never tried processing XML with it. I know I could do it, so it does not hold much attraction for me. Since I am also trying to learn Python, I will give it a crack.

There are three steps here: (1) downloading the XML, (2) parsing it, and (3) writing it out to disk as HTML. As the script will be run from cron (probably once per hour), I don't have to think about building it as a daemon or anything like that.

Downloading

#!/usr/bin/env python3
#
import urllib.request

def download(url):
    """Return the body of url as bytes, or b'' if the request fails."""
    try:
        with urllib.request.urlopen(url) as fd_url:
            data = fd_url.read()
    except OSError:
        # URLError and HTTPError are both subclasses of OSError.
        data = b''
    return data

url = "http://ifps.wrh.noaa.gov/cgi-bin/dwf?outFormat=xml&duration=72hr&interval=6&citylist=CORVALLIS%2C44.5710%2C-123.2760&city=Go&latitude=&longitude=&ZOOMLEVEL=1&XSTART=&YSTART=&XC=&YC=&X=&Y=&siteID=PQR"

print(">>Downloading forecast from NWS.")
data = download(url)
print(data.decode("utf-8", errors="replace"))
print(">>Done.")

That certainly was easy.

Parsing XML

PyXML HOWTO: http://pyxml.sourceforge.net/topics/howto/xml-howto.html
More XML/Python links: http://pyxml.sourceforge.net/topics/
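
As a starting point for this step, here is a minimal sketch that uses the standard library's xml.etree.ElementTree module rather than PyXML. The element names (forecast, period, name, text) and the sample data are assumptions made up for illustration, so they would need to be checked against the real feed before use.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real NWS forecast feed may use different
# element names and nesting.
SAMPLE = """<forecast>
  <period><name>Tonight</name><text>Rain likely.</text></period>
  <period><name>Friday</name><text>Showers.</text></period>
</forecast>"""


def parse_periods(xml_text):
    """Return a list of (name, text) tuples, one per <period> element."""
    root = ET.fromstring(xml_text)
    periods = []
    for period in root.iter("period"):
        # findtext returns the default when the child element is missing.
        name = period.findtext("name", default="").strip()
        text = period.findtext("text", default="").strip()
        periods.append((name, text))
    return periods
```

Returning plain tuples keeps the parsing step independent of whatever produces the HTML afterwards.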

Writing HTML output
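
One possible shape for this step is sketched below. It assumes the parsed forecast is a list of (name, text) tuples. The function name render_forecast_html and the definition-list layout are illustrative choices, not something the NWS feed dictates.

```python
from html import escape


def render_forecast_html(periods, title="Forecast"):
    """Render (name, text) tuples as a small standalone HTML page."""
    rows = []
    for name, text in periods:
        # Escape everything that came from the feed so stray < or &
        # characters cannot break the page.
        rows.append("<dt>{}</dt><dd>{}</dd>".format(escape(name), escape(text)))
    return (
        "<html><head><title>{}</title></head>\n"
        "<body><dl>\n{}\n</dl></body></html>\n"
    ).format(escape(title), "\n".join(rows))
```

The returned string can then be written to weather.html, the page that the iframe on the home page loads.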

Current conditions

Now that I have dealt with forecasts, I still want to work with the current conditions data from the National Weather Service as well.

I want to be able to log current conditions to RRD databases, so that I can plot the information in Cricket graphs. To do this I need to write a Cricket 'collector' script. This is a piece of glue that will parse the XML and output simple numbers that can then be inserted into the Cricket RRD database.

Heading back to the NWS site, I see I can get observations in RSS and XML format.

RSS: http://www.weather.gov/data/current_obs/KCVO.rss
XML: http://www.weather.gov/data/current_obs/KCVO.xml

The RSS version is meant for insertion into a web page; the XML version is more like raw data, which is what I want for logging to a database.

Here is a link to the RSS 2.0 specification (http://blogs.law.harvard.edu/tech/rss) at Harvard Law.

I want to collect visibility, temperature, precipitation, barometric pressure, dew point, humidity, wind speed and direction.
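
A collector along these lines might look like the sketch below. The element names in FIELDS and the sample document are assumptions about the current-observations XML and should be checked against the actual KCVO.xml. Precipitation is left out because I am not sure the feed carries it. Printing one value per line, with U standing for an unknown value as RRDtool does, is also an assumption about what the Cricket side will want.

```python
import xml.etree.ElementTree as ET

# Assumed element names; verify them against the real KCVO.xml feed.
FIELDS = ["temp_f", "dewpoint_f", "relative_humidity", "wind_mph",
          "wind_degrees", "pressure_mb", "visibility_mi"]

# Hypothetical sample data for illustration only.
SAMPLE = """<current_observation>
  <temp_f>48.0</temp_f>
  <dewpoint_f>45.0</dewpoint_f>
  <relative_humidity>89</relative_humidity>
  <wind_mph>5.8</wind_mph>
  <wind_degrees>200</wind_degrees>
  <pressure_mb>1012.3</pressure_mb>
  <visibility_mi>NA</visibility_mi>
</current_observation>"""


def extract_readings(xml_text, fields=FIELDS):
    """Return {field: float or None} for each requested element."""
    root = ET.fromstring(xml_text)
    readings = {}
    for field in fields:
        raw = root.findtext(field)
        try:
            readings[field] = float(raw)
        except (TypeError, ValueError):
            # The element is missing (None) or holds non-numeric text.
            readings[field] = None
    return readings


def format_for_collector(readings, fields=FIELDS):
    """One value per line, in a fixed order; 'U' marks an unknown value."""
    lines = []
    for field in fields:
        value = readings.get(field)
        lines.append("U" if value is None else "%g" % value)
    return "\n".join(lines)
```

Keeping the field order fixed means each line of output always refers to the same measurement, which is what a line-oriented collector needs.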