# tianjara.net | Andrew Harvey's Blog

### Entries from July 2009.

30th July 2009

We are finally learning common Unix tools at uni. Gosh, I wish we had done these earlier because they are so useful! (Yes, I could have learnt them myself, and I did a bit, but I ended up just learning the parts needed to get the job done. That didn't always work, because I had very little understanding of why things worked, and why they didn't, so it all turned into trial and error.)

So anyway, I wanted an RSS feed for videos uploaded to public Facebook pages (for example http://www.facebook.com/video/?id=20916311640), so I put my newly learnt skills to good use and wrote a shell script.

```shell
#!/bin/sh
# Fetch the page (spoofing the user agent), pull out the video links,
# and wrap them in a minimal RSS 2.0 document.
wget "http://www.facebook.com/video/?id=$1" -q -O - \
  -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.8.0.3) Gecko/20060523 Ubuntu/dapper Firefox/1.5.0.3' |
  grep 'http://www.facebook.com/video/video.php?v=' |
  sed -e 's/http:\/\/www.facebook.com\/video\/video.php?v=[0-9]*/\n&\n/g' |
  grep 'http://www.facebook.com/video/video.php?v=' |
  uniq |
  sed -e 's/.*/<item><title>&<\/title><link>&<\/link><\/item>/' |
  sed "1 s/^/<?xml version=\"1.0\"?><rss version=\"2.0\"><channel><title>Facebook Video Feed<\/title><link>http:\/\/www.facebook.com\/video\/?id=$1<\/link><description>Facebook Videos for ID $1<\/description><language>en-us<\/language>/" |
  sed '$ s/$/<\/channel><\/rss>/'
```

UPDATED: The links on the page from Facebook no longer include the domain, so here is a revised version:

```shell
#!/bin/sh
# Same as above, but the scraped links are now relative, so the domain is
# added back when building the <item> elements.
wget "http://www.facebook.com/video/?id=$1" -q -O - \
  -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.8.0.3) Gecko/20060523 Ubuntu/dapper Firefox/1.5.0.3' |
  grep '/video/video.php?v=' |
  sed -e 's/\/video\/video.php?v=[0-9]*/\n&\n/g' |
  grep '/video/video.php?v=' |
  uniq |
  sed -e 's/.*/<item><title>http:\/\/www.facebook.com&<\/title><link>http:\/\/www.facebook.com&<\/link><\/item>/' |
  sed "1 s/^/<?xml version=\"1.0\"?><rss version=\"2.0\"><channel><title>Facebook Video Feed<\/title><link>http:\/\/www.facebook.com\/video\/?id=$1<\/link><description>Facebook Videos for ID $1<\/description><language>en-us<\/language>/" |
  sed '$ s/$/<\/channel><\/rss>/'
```

Facebook will actually check the user agent and refuse to serve clients it doesn't like, so I had to spoof it. Anyway, the pipeline grabs the HTML page, finds all the links to individual videos, and outputs them one per line (that's everything up to and including the uniq). Next I add some text to turn that list into a basic RSS file. I don't worry about making it fancy with the video title, thumbnail, etc., because honestly I don't care about that for my use.
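The split-then-filter trick in the middle of the pipeline is worth spelling out: sed wraps every matching URL in newlines so each link lands on its own line, grep then keeps only those lines, and uniq drops adjacent duplicates (each video is linked more than once on the page). A minimal sketch on made-up input (the HTML snippet here is invented for illustration, and the `\n` in the sed replacement relies on GNU sed):

```shell
# Invented sample of what a fragment of the page's HTML might look like:
html='<a href="/video/video.php?v=111">x</a><a href="/video/video.php?v=222">y</a>'

# Surround each match with newlines, keep only the match lines, dedupe.
printf '%s\n' "$html" |
  sed -e 's/\/video\/video.php?v=[0-9]*/\n&\n/g' |
  grep '/video/video.php?v=' |
  uniq
# prints:
#   /video/video.php?v=111
#   /video/video.php?v=222
```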

To actually use it I can use cron (actually I think it's easiest to make another shell script and put it in /etc/cron.daily/ or /etc/cron.hourly/) to run the command: ./fbvidrss.sh 20916311640 > /var/www/fbvid_20916311640.xml
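The cron wrapper could be as small as this sketch (the script location, output directory, and file name are assumptions, not necessarily what I actually use):

```shell
#!/bin/sh
# Hypothetical /etc/cron.daily/fbvidrss: regenerate the feed for each page ID.
# Both the path to fbvidrss.sh and the output directory are assumptions.
for id in 20916311640; do
    /usr/local/bin/fbvidrss.sh "$id" > "/var/www/fbvid_${id}.xml"
done
```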

I have made some changes to my original script. This new Perl script scrapes info from sbs.com.au and gives an RSS feed of the items in the specified playlist. I only know of two playlists (94 = Latest Full Episodes, 95 = Preview Clips), and only one line needs to be changed to point the script at a different playlist. The major improvement is that items only available over RTMP now have the correct URL, which was previously wrong (though the script now runs slower, as it has to grab more pages from the web). I use the URL http://player.sbs.com.au/video/smil/index/standalone/$item_code/ to find the stream details.

FLVStreamer appears to do a good job of downloading media over the RTMP protocol. Just use ./flvstreamer -r rtmp://<stream url> > file.flv. Mozilla has an article on how to add protocols to Firefox here, but I didn't bother with that as the command is simple enough, and building an app with a save-as dialogue is beyond me for now, though I hope to learn that soon.

[Update: It seems you also need to set the --swfUrl argument ('http://player.sbs.com.au/web/flash/standalone_video_player_application.swf' works). Also, the Perl script doesn't get the file name correctly: it uses the thumbnail image URL, when it should be using the URLs given at the /video/smil/index pages.]

For local use the current format is probably what you want, but in a production environment you would probably want the script to save the RSS file to disk and have people request that file, with the script set to run every now and then. Unfortunately I can't seem to upload .pl files to WordPress (I've put a link, but that will expire eventually)... I really need to get my own site. There is so much customisation I would like to do and many experiments to try out on a live server, but the $'s are too much...
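Putting the pieces from the update together, a rough download helper might look like the sketch below. The item code is a placeholder, the rtmp:// URL has to be read out of the SMIL page by hand, and only the swfUrl value is the one quoted above:

```shell
#!/bin/sh
# Sketch: given an item code, print the SMIL index URL to inspect, then
# download the stream with flvstreamer (commented out, since the rtmp://
# URL is a placeholder that must come from the SMIL page).
item_code=${1:-12345}   # 12345 is a made-up item code
smil="http://player.sbs.com.au/video/smil/index/standalone/${item_code}/"
echo "Look up stream details at: $smil"
# ./flvstreamer -r 'rtmp://<stream url from the SMIL page>' \
#   --swfUrl 'http://player.sbs.com.au/web/flash/standalone_video_player_application.swf' \
#   > video.flv
```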