avatar tianjara.net | blog icon Andrew Harvey's Blog

Facebook Video Updates as an RSS Feed (Using a Shell Script)
30th July 2009

We are finally learning common Unix tools at uni. Gosh I wish we had done these earlier because they are so useful! (yes I could have learnt them myself, and I did a bit. But I ended up just learning the parts to get the job done. This didn't always work because I had very little understanding of why things worked (and why they didn't) and thus things turned into trial and error).

So anyway I wanted an RSS feed for videos uploaded on Facebook to public pages. (For example http://www.facebook.com/video/?id=20916311640). So I put my newly learnt skills to good use and wrote a shell script.

#!/bin/sh
wget http://www.facebook.com/video/?id=$1 -q -O - -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.8.0.3) Gecko/20060523 Ubuntu/dapper Firefox/1.5.0.3' | grep 'http://www.facebook.com/video/video.php?v=' | sed -e 's/http:\/\/www.facebook.com\/video\/video.php?v=[0-9]*/\n&\n/g' | grep 'http://www.facebook.com/video/video.php?v=' | uniq | sed -e 's/.*/<item><title>&<\/title><link>&<\/link><\/item>/' | sed "1 s/^/<?xml version=\"1.0\"?><rss version=\"2.0\"><channel><title>Facebook Video Feed<\/title><link>http:\/\/www.facebook.com\/video\/?id=$1<\/link><description>Facebook Videos for ID $1<\/description><language>en-us<\/language>/" | sed '$ s/$/<\/channel><\/rss>/'

UPDATED: (links on the page from facebook no longer have the domain etc in the link)

(the line below gets cut off, but you can select it and copy paste...)

#!/bin/sh
wget http://www.facebook.com/video/?id=$1 -q -O - -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.8.0.3) Gecko/20060523 Ubuntu/dapper Firefox/1.5.0.3' | grep '/video/video.php?v=' | sed -e 's/\/video\/video.php?v=[0-9]*/\n&\n/g' | grep '/video/video.php?v=' | uniq | sed -e 's/.*/<item><title>http:\/\/www.facebook.com&<\/title><link>http:\/\/www.facebook.com&<\/link><\/item>/' | sed "1 s/^/<?xml version=\"1.0\"?><rss version=\"2.0\"><channel><title>Facebook Video Feed<\/title><link>http:\/\/www.facebook.com\/video\/?id=$1<\/link><description>Facebook Videos for ID $1<\/description><language>en-us<\/language>/" | sed '$ s/$/<\/channel><\/rss>/'

Facebook will actually check the user agent and refuse to serve users it doesn't like so I had to spoof it. So anyway the pipeline will grab the html page and find all the links to individual videos and feed these out, one line for each (this is up to just after the uniq). Next I add some text to turn this list into a basic RSS file. I don't worry about making it fancy with the video title, thumbnail etc. because honestly I don't care about that for my use.

To actually use it I can use cron, (actually I think its easiest to make another shell script and put this in /etc/cron.daily/ or /etc/cron.hourly/) to run the command, ./fbvidrss.sh 20916311640 > /var/www/fbvid_20916311640.xml

Tags: rss, sh.