Malleable Musings

December 23, 2008

RSS feed from a static webpage

Filed under: PHP, Yahoo Pipes — Brendan @ 10:55 pm

I mentioned in an earlier post that I’d started playing around with twitterfeed, having set up a newsfeed service on Twitter (not many followers at the moment, but it hasn’t really been promoted yet and its main output may well not be on Twitter; I’ll probably look at aggregated RSS feeds elsewhere).

Where I work is fairly unusual, and one of the problems I’m facing is that I need to bring in news from a variety of sources that aren’t in RSS format. One example is the Goldsmiths College News Feed.

The purpose of this post is to remind myself what I did to include this, as I’ll probably need to repeat the process on a whole range of other pages. (N.B. this is one of those posts written for myself, so I won’t post the full file, just some commented fragments; the variable I actually used wasn’t called $str.)

1) Create a PHP file which will use some of the following commands:

$strURL = "URL to be brought in";
$strHTML = file_get_contents($strURL);
$str = explode("<hr>",$strHTML);
// break the HTML into usable parts

$variable = strip_tags($itThree, '<a>');
// strip out the HTML apart from whatever tags are needed
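
As a reminder of how these two calls behave together, here is a minimal sketch. The inline HTML is a made-up stand-in for the fetched page (so it runs without a network call), and the names $parts and $stories are hypothetical rather than the ones in my real script.

```php
<?php
// Made-up stand-in for file_get_contents($strURL), so the sketch runs offline.
$strHTML = '<h1>News</h1><hr>'
    . '<p>23 December 2008</p><p><b>First story</b> [ <a href="/news/first">more</a> ]</p><hr>'
    . '<p>22 December 2008</p><p><i>Second story</i> [ <a href="/news/second">more</a> ]</p>';

// Split the page on its <hr> separators; the first piece is the page header.
$parts = explode('<hr>', $strHTML);
array_shift($parts);

$stories = [];
foreach ($parts as $part) {
    // Keep only <a> tags so the links survive, then trim stray whitespace.
    $stories[] = trim(strip_tags($part, '<a>'));
}

print_r($stories);
```

Each entry in $stories is then plain text plus its link, ready to be split further.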

You will probably need to explode the text further so that individual elements are part
of an array; something like the following may be needed at this stage:
$str[2] = preg_replace('/\ \[ <a href="/', 'http://www.gold.ac.uk/news', $str[2]);
The publication date will need to be in a suitable format, or it can be converted
to a timestamp for use as a guid, e.g.
$guid = strtotime($str[0]);
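
A rough sketch of the link and date handling follows, under some assumptions: the sample strings, the $base site root and the guid pattern are all hypothetical. RSS 2.0 expects pubDate in RFC 822 format, which PHP's DATE_RSS constant produces.

```php
<?php
// Fixed timezone so the formatted date does not depend on server settings.
date_default_timezone_set('UTC');

// Made-up pieces, as they might look after explode() and strip_tags().
$dateText = '23 December 2008';
$body     = 'First story [ <a href="/news/first">more</a> ]';

// Prefix site-relative hrefs with the (assumed) site root.
$base = 'http://www.gold.ac.uk';
$body = preg_replace('#href="/#', 'href="' . $base . '/', $body);

// Convert the human-readable date to a Unix timestamp...
$timestamp = strtotime($dateText);

// ...then format it for <pubDate> and reuse it in a unique-looking guid.
$pubDate = date(DATE_RSS, $timestamp);
$guid    = $base . '/news#' . $timestamp;

echo $body, "\n", $pubDate, "\n", $guid, "\n";
```

The timestamp alone works as a guid only if no two stories share a date, which is why the sketch combines it with a URL.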

Then echo it all out
echo '<?xml version="1.0" ?>
      <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
      <channel>
      <title>Gold news</title>
      <link>URL where php script is</link>
      <description>An RSS-enabled version of the Goldsmiths news page available at http://www.gold.ac.uk/news</description>
      <atom:link href="URL where php script is" rel="self" type="application/rss+xml" />
      .......

      ......
      </channel>
      </rss>';
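
The elided middle of the channel is where one item per story goes. This is a minimal sketch of building such an item, not the code from my actual script: buildRssItem and xmlEscape are hypothetical helpers and the story data is made up. Escaping matters because any stray & or < in the scraped text will stop the feed validating.

```php
<?php
// Fixed timezone so the formatted date is reproducible.
date_default_timezone_set('UTC');

// Escape text for safe use inside an XML element.
function xmlEscape(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Hypothetical helper: build one RSS 2.0 <item> from already-parsed fields.
function buildRssItem(string $title, string $link, string $description, int $timestamp): string
{
    return "<item>\n"
        . '  <title>' . xmlEscape($title) . "</title>\n"
        . '  <link>' . xmlEscape($link) . "</link>\n"
        . '  <description>' . xmlEscape($description) . "</description>\n"
        . '  <pubDate>' . date(DATE_RSS, $timestamp) . "</pubDate>\n"
        . '  <guid isPermaLink="false">' . xmlEscape($link . '#' . $timestamp) . "</guid>\n"
        . "</item>\n";
}

// Made-up story data for illustration.
$item = buildRssItem(
    'Fish & chips night',
    'http://www.gold.ac.uk/news/first',
    'Read <a href="http://www.gold.ac.uk/news/first">more</a>',
    strtotime('23 December 2008')
);

echo $item;
```

Looping over the parsed stories and echoing one such item each, between the channel header and the closing tags, gives the complete feed.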

2) Upload it and check that it validates at http://feedvalidator.org/

3) Add it to FeedBurner, then sit back and watch the news roll out.

Postscript: after writing this and getting it working, I remembered that Yahoo Pipes could do the conversion. Use this to remind self of regex options.
