tianjara.net | Andrew Harvey's Blog

Entries tagged "sh".

Using exiftool to redact EXIF metadata in some JPEGs
9th January 2011

After spending way too much time trying to edit some EXIF metadata in some JPEG photographs, I'm posting my method here for future reference.

The first thing I needed to do was geoencode/geotag the images with a GPS location tag. I did this with the Perl module Image::ExifTool::Location (script here).
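(For reference, roughly the same geotagging can be done straight from the shell with exiftool itself; this isn't the script I used, and the coordinates are just placeholders:)

# placeholder coordinates -- not the Perl module approach I actually used
exiftool -overwrite_original \
 -GPSLatitude=33.8688 -GPSLatitudeRef=S \
 -GPSLongitude=151.2093 -GPSLongitudeRef=E \
 "file_name.jpg"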

The second thing I needed to do was examine what metadata was in the JPEGs and redact some of it. I used exifprobe and exiftool -v to examine the metadata. I ended up using this command on each image:

exiftool -overwrite_original -scanForXMP \
 -MakerNotes:SerialNumber='0' -MakerNotes:OwnerName='' -MakerNotes:InternalSerialNumber="0" -XMP:SerialNumber= \
 -XMP:OwnerName="Andrew Harvey <andrew.harvey4@gmail.com>" \
 -XMP-cc:AttributionName='Andrew Harvey' -XMP-cc:License='http://creativecommons.org/licenses/by/3.0/' -XMP-cc:attributionURL="http://www.flickr.com/photos/andrewharvey4/" \
"file_name.jpg"

http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Canon.html was a good reference to find out about the MakerNotes tags. Also it took me a little experimenting and research before finding out that "MakerNotes tags may be edited, but not created or deleted individually." This is why the SerialNumber tag is set to zero, and not removed. I tried to remove the camera serial numbers, but who knows, Canon probably secretly embed the serial number into the image pixel values as well...
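A quick way to double-check that the serial numbers and owner name really were zeroed/cleared (and that the XMP-cc licence tags went in) is something like:

# list just the tags touched above, with group names (-G1) and short tag names (-s)
exiftool -a -G1 -s -SerialNumber -InternalSerialNumber -OwnerName -XMP-cc:all "file_name.jpg"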

http://wiki.creativecommons.org/XMP also provided me with some hints on how best to embed these JPEGs with at least some form of machine readable tagging as CC BY licensed.

Tags: geo, sh.
An Idea for a Media-Centreish Interface for a UNIX terminal/shell
18th December 2009

Back in July or August this year, when I was going through the notes on Unix shells for COMP2041, I came up with the idea of a shell/terminal interface that looks like an interface for a media centre, i.e. rather than looking like this,

[image: manual page for man in xterm]

it would look "like" this (obviously not exactly the same, but a similar feel),

[caption id="attachment_970" align="aligncenter" width="450" caption="XBMC skin MediaStream by Team Razorfish. http://xbmc.org/wordpress/wp-content/gallery/mediastream/viewoptions.jpg"][/caption]

The key principles I had in mind were,

My original motive was that I was just learning all these coreutils commands (ls, cat, mkdir, cp, mv...) and I found that although the shell had tab completion and apropos, it didn't categorise these commands or present them in a list of common ones. Then I came up with more abstract ideas,

I haven't really thought about it on a technical level, but it may not be as portable as, say, gnome-terminal. I don't know the real differences among the different shells out there, so I don't know how dependent this would be on bash, or even whether it would tie bash and the terminal together, but from a beginner's perspective I don't care about this.

The cloudy idea in my mind is basically a GUI/CLI hybrid, but such a program would need to be careful not to go too far. It could be made so that after doing an ls -la you could click on a file in the list and rename it, but then it starts turning into a file manager in list mode (like Dolphin or Nautilus), which is unnecessary as those tools already exist.

I'm aiming to come up with a more detailed list of requirements and a set of activity and use-case scenarios, along with some wire-frame prototypes for such an interface, soon. But for now I just needed to get it all out of my head and onto paper (and also make it public, in case someone tries to patent the concept).

Tags: computing, sh.
A Perl Script to Pause/Resume Amarok 1.4 Playback on Screensaver/Screenlock
11th December 2009

I've just uploaded to GitHub a script that pauses Amarok 1.4 playback when the screensaver/screen lock starts and un-pauses it again when the screensaver closes/the screen is unlocked. It addresses the issues I was having with the script at http://nxsy.org/getting-amarok-to-pause-when-the-screen-locks-using-python-of-course, where that script would start Amarok if it was not running, and would restart playback on screensaver end/unlock regardless of whether Amarok was playing when the screensaver started.

You could start the script on start-up or plug it into Amarok's script engine to only be active when Amarok is active.
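The gist of the logic is roughly this (just a shell sketch of the idea, not the actual Perl script; it assumes a GNOME session that emits the org.gnome.ScreenSaver ActiveChanged signal and Amarok 1.4's DCOP interface):

#!/bin/sh
# Rough sketch only -- assumes GNOME's org.gnome.ScreenSaver signal and
# Amarok 1.4's DCOP interface; the real script on GitHub is Perl.
dbus-monitor --session \
 "type='signal',interface='org.gnome.ScreenSaver',member='ActiveChanged'" |
while read -r line; do
 case "$line" in
 *"boolean true"*)  # screensaver started / screen locked
  if [ "$(dcop amarok player isPlaying 2>/dev/null)" = "true" ]; then
   was_playing=true
   dcop amarok player pause
  else
   was_playing=false
  fi
  ;;
 *"boolean false"*) # screensaver ended / screen unlocked
  [ "$was_playing" = "true" ] && dcop amarok player play
  was_playing=false
  ;;
 esac
done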

(Oh, and in the future I'll try to avoid posts that just duplicate items from other RSS/Atom feeds without adding much extra value.)

Tags: computing, dev, sh.
Saving the Wordpress.com Export File and The Linked Media Files (and wget's strictness)
7th December 2009

So I've been wanting a way to automatically back up my wordpress.com export file. I decided to go for a bash and wget mix to do the work. But I soon hit a problem: wget won't save cookies whose path differs from the path of the file you are downloading. This is a problem because, well, here is what I basically do to get the export file.

Grab wp-login.php. This will issue a cookie that WP looks for as proof that I can indeed store cookies.

Next I post login credentials to wp-login.php. This will issue a bunch of authentication cookies. Specifically,

Set-Cookie: wordpress_test_cookie=WP+Cookie+check; path=/; domain=.wordpress.com
Set-Cookie: wordpress=some_string; path=/wp-content/plugins; domain=.wordpress.com; httponly
Set-Cookie: wordpress=some_string; path=/wp-admin; domain=.wordpress.com; httponly
Set-Cookie: wordpress_logged_in=some_string; path=/; domain=.wordpress.com; httponly
Set-Cookie: wordpress_sec=some_string; path=/wp-content/plugins; domain=.wordpress.com; secure; httponly
Set-Cookie: wordpress_sec=some_string; path=/wp-admin; domain=.wordpress.com; secure; httponly

The problem is that wget refuses to save numbers 2, 3, 5 and 6 (only saving wordpress_test_cookie and wordpress_logged_in). It refuses the rest because it requires the cookie path to match the path of the file you are requesting. Using --debug, wget says,

cdm: 1 2 3 4 5 6 7 8
Attempt to fake the path: /wp-content/plugins, /wp-login.php
cdm: 1 2 3 4 5 6 7 8
Attempt to fake the path: /wp-admin, /wp-login.php
cdm: 1 2 3 4 5 6 7 8
Attempt to fake the path: /wp-content/plugins, /wp-login.php
cdm: 1 2 3 4 5 6 7 8
Attempt to fake the path: /wp-admin, /wp-login.php

Specifically to get the export file I need the wordpress_sec cookie for the path /wp-admin. I can't just request /wp-admin and try to get the cookie from there because only wp-login.php will let me post credentials.

Possible solutions are A) write a hacky solution that grabs the cookie value with grep/sed and manually adds it to the cookies file, B) recompile wget with some extra argument that makes it accept these cookies, or C) don't use wget.

I took a look at the wget source, and it was easy to identify the problem area; I could simply remove this segment,

/* The cookie sets its own path; verify that it is legal. */
if (!check_path_match (cookie->path, path))
  {
    DEBUGP (("Attempt to fake the path: %s, %s\n",
             cookie->path, path));
    goto out;
  }

But then my download script wouldn't be as portable, and I'd have to make sure the patched wget was installed and used wherever the script runs.

I ended up using curl for some parts, but I probably could have done option A.
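Option A would have looked roughly like this (an untested sketch; the blog URL and the $WP_USER/$WP_PASS variables are placeholders, and it assumes the test-cookie request from step one has already populated cookies.txt):

# Rough sketch of option A (untested). Placeholders: $WP_USER, $WP_PASS and the blog URL.
wget --load-cookies cookies.txt --save-cookies cookies.txt --keep-session-cookies \
 --server-response -O /dev/null \
 --post-data "log=$WP_USER&pwd=$WP_PASS" \
 "https://myblog.wordpress.com/wp-login.php" 2> headers.txt

# Pull the /wp-admin wordpress_sec value out of the response headers...
sec=$(sed -n 's/.*Set-Cookie: wordpress_sec=\([^;]*\); path=\/wp-admin.*/\1/p' headers.txt | head -n 1)

# ...and append it to the cookie jar in Netscape cookie-file format
# (domain, subdomain flag, path, secure, expiry, name, value).
printf '.wordpress.com\tTRUE\t/wp-admin\tTRUE\t0\twordpress_sec\t%s\n' "$sec" >> cookies.txt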

Anyhow, the script is here. It should grab the export XML file as well as any media files it references that were uploaded to that wordpress.com blog.
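The export download itself then only needs that cookie; something along these lines should work (the export.php?download=true URL is my assumption of what the wp-admin export form submits):

# Fetch the WXR export file using the cookie jar built above.
wget --load-cookies cookies.txt \
 -O wordpress-export.xml \
 "https://myblog.wordpress.com/wp-admin/export.php?download=true"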

Tags: sh.
Facebook Video Updates as an RSS Feed (Using a Shell Script)
30th July 2009

We are finally learning common Unix tools at uni. Gosh I wish we had done these earlier because they are so useful! (Yes, I could have learnt them myself, and I did a bit, but I ended up just learning the parts needed to get the job done. That didn't always work, because I had very little understanding of why things worked (and why they didn't), so things turned into trial and error.)

So anyway I wanted an RSS feed for videos uploaded on Facebook to public pages. (For example http://www.facebook.com/video/?id=20916311640). So I put my newly learnt skills to good use and wrote a shell script.

#!/bin/sh
wget http://www.facebook.com/video/?id=$1 -q -O - \
 -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.8.0.3) Gecko/20060523 Ubuntu/dapper Firefox/1.5.0.3' |
 grep 'http://www.facebook.com/video/video.php?v=' |
 sed -e 's/http:\/\/www.facebook.com\/video\/video.php?v=[0-9]*/\n&\n/g' |
 grep 'http://www.facebook.com/video/video.php?v=' |
 uniq |
 sed -e 's/.*/<item><title>&<\/title><link>&<\/link><\/item>/' |
 sed "1 s/^/<?xml version=\"1.0\"?><rss version=\"2.0\"><channel><title>Facebook Video Feed<\/title><link>http:\/\/www.facebook.com\/video\/?id=$1<\/link><description>Facebook Videos for ID $1<\/description><language>en-us<\/language>/" |
 sed '$ s/$/<\/channel><\/rss>/'

UPDATED: (links on the page from facebook no longer have the domain etc in the link)


#!/bin/sh
wget http://www.facebook.com/video/?id=$1 -q -O - \
 -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv: 1.8.0.3) Gecko/20060523 Ubuntu/dapper Firefox/1.5.0.3' |
 grep '/video/video.php?v=' |
 sed -e 's/\/video\/video.php?v=[0-9]*/\n&\n/g' |
 grep '/video/video.php?v=' |
 uniq |
 sed -e 's/.*/<item><title>http:\/\/www.facebook.com&<\/title><link>http:\/\/www.facebook.com&<\/link><\/item>/' |
 sed "1 s/^/<?xml version=\"1.0\"?><rss version=\"2.0\"><channel><title>Facebook Video Feed<\/title><link>http:\/\/www.facebook.com\/video\/?id=$1<\/link><description>Facebook Videos for ID $1<\/description><language>en-us<\/language>/" |
 sed '$ s/$/<\/channel><\/rss>/'

Facebook will actually check the user agent and refuse to serve users it doesn't like, so I had to spoof it. So anyway, the pipeline grabs the HTML page, finds all the links to individual videos and outputs them one per line (that's everything up to just after the uniq). Next I add some text to turn this list into a basic RSS file. I don't bother making it fancy with the video title, thumbnail, etc. because honestly I don't care about that for my use.

To actually use it I can use cron (I think it's easiest to make another shell script and put it in /etc/cron.daily/ or /etc/cron.hourly/) to run the command, ./fbvidrss.sh 20916311640 > /var/www/fbvid_20916311640.xml
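For example, a tiny wrapper like this dropped into /etc/cron.daily/ would do (the paths and file names here are just examples):

#!/bin/sh
# Example wrapper for /etc/cron.daily/ -- paths here are only examples.
/home/andrew/bin/fbvidrss.sh 20916311640 > /var/www/fbvid_20916311640.xml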

Tags: rss, sh.
