avatar tianjara.net | blog icon Andrew Harvey's Blog

Entries tagged "computing".

A small observation regarding usability and Gnome 3
17th September 2011

I've been running Nautilus 3.0.2 for the last few weeks, I noticed the delete key didn't actually delete files so I've been painstakingly deleting files by right click then "Move to Trash" or "Delete" from the menu. I always thought this was a bug and it would get fixed soon, but after getting sick of this workaround I finally did a Google search and found that this is by design. Now you need to use Ctrl+Del to send to trash and Shift+Del to skip the trash.

I think this is a poor choice, my reasoning is because now you have to use modifier+Del for both send to trash and skip trash. It makes sense to have Del as safe delete (send to trash) and a modified+Del for hard delete (no trash) that is logical and it is easy to remember the difference between the two. But now in this new situation a large majority of users (me included) won't be able to remember which modifier does what. Is it Ctrl or Shift and will just pick one at random, giving you a 0.5 chance of doing the actual type of delete you wanted.

Also I don't see any reason to add more safety to the straight delete key, after all it only goes to the trash so you can recover it if you accidentally hit the key. If you want add a sound upon sending something to the trash as an audio cue.

The number one thing I learned in my Human Computer Interaction course at uni was get feedback from the users (this isn't just asking for feedback, but also observing how they use the software). This is my feedback as a user, and if Gnome developers want to improve usability then they should take in the feedback, i.e. read this.

Tags: computing, usability.
I'm halfway towards an online home...
31st July 2011

Just recently I signed up with Linode for a virtual private server. I should have done this a long time ago but payment methods and too much choice lead me to put it off. So far I'm quite happy with the price and the quality of the product.

I think I would have preferred just buying a new dedicated machine myself, however unless you're in the CBD you can't get upload speeds much more than 1.5Mbps around here (if you are lucky), which is not really fast enough, hence I've gone with the 3rd party hosting option. Perhaps if this NBN the government keeps taking about is turned on and not too expensive I'll switch. In the mean time it seems silly that Australians need to host content that is mainly consumed by Australians overseas just because of the price difference.

For example if A wants to host content to Australians, but it is cheaper to do so from overseas, then A will mostly likely host their content overseas. This just means Australia has an even greater disturbance in the overseas traffic up/down ratio (i.e. we pull more than we push), which in turns means our local ISPs have to pay more compared to the overseas ISPs to lay the cables, this in turn leads to higher internet costs for Australians compared to overseas ISPs.

At any time our local ISPs could provide incentives for keeping traffic local inside the country by making overseas traffic more expensive than local traffic (for both people hosting and consuming), which would help Australians host locally and in turn even up the balance. It also means that the overseas pipes won't need to be as thick (because for sites which mostly get Australian visitors, hosting locally means the total amount of cross country traffic for that site is less, hence more efficient).

There is not much I can do about this, so I'm giving in and hosting overseas with the hope that one day things will return to sanity, but in the meantime I haven't been left off-line.

Anyway, back to Linode, I having a lot of fun at the moment setting it up and getting things working.

I'm running Debian 6.0, with unattended-upgrades, lighttpd, ufw, etckeeper (and soon either exim or postfix and possibly I'll migrate this wordpress blog across). I have run through http://www.debian.org/doc/manuals/securing-debian-howto/ it's a bit dated and I'm by no means following everything, but none the less it's still a nice read.

I've set up ssh with protocol 1 disabled (securing debian howto says it has some design flaws, so why enable it if I don't need it?), publickey authentication only, fail2ban, and only accepting traffic to the ssh port from IP ranges I know I'll be connecting from (I know if I've got security via publickey auth, I don't really need this, but it can't hurt... at least my logs don't fill up with as many break in attempts.).

However since one can always log into the machine via the console at linode.com, security here comes down to the weakest link of web based username/password and ssh publickey auth (ignoring all the other threats like compromised VM separation, compromised VM host, physical security, etc.. stuff I have no control over).

I have set up lighttpd as RAM is limited and I read that this (and nginx) are better than Apache httpd in this regard. I have also set up munin and munin-node (and adapted /etc/munin/apache.conf for lighttpd). I would prefer the munin stats not be appreciable to the world, so I just expose it to localhost only and access it via an SSH tunnel.

Additionally, while trying to debug lighttpd configuration I noticed that it wasn't responding to nc. Apparently lighttpd will only respond to my requests when lines end with \r\n but just typing away in the terminal as input to nc and using the Enter key I only end lines with \n. Apparently this is per the HTTP spec and it only works with Apache because it isn't as strict about this.

Next I have to see about getting a domain name. This is a huge problem in itself, but it since,

I don't really have a choice. Unfortunately the current ICANN DNS is the only real option at the moment, so I'm just going to have to pay up, and try to avoid having some details which I don't want listed, listed in WHOIS. At the moment I'll probably go for a .id.au domain.

I'll probably move this blog across once I set up a domain name, so more news to come on this later.

Update: I've registered with gandi.net. My site is now available at http://tianjara.net/

Tags: computing, web.
Operating Systems Notes
31st July 2010

Here are are some rough notes I put together as part of revision for a uni course. They are heavily based on the course lecture notes by Kevin Elphinstone and Leonid Ryzhyk. All diagrams are sourced from those lecture notes, some of which are in turn from the text book, A. Tannenbaum, Modern Operating Systems.

OS Overview

System Calls

Processes and Threads

During the system initialisation background processes (called daemon's on linux, service's on windows) and foreground processes can be started.

Threads Implementation



File Systems

There are many popular file systems in use today. FAT16, FAT32 are good for embedded devices; Ext2, Ext3, Ext4 are designed for magnetic disks, ISO9660 is designed for CD-ROM's, JFFS2 is optimised for flash memory as you don't want to write to the same location too many times as it wears out. Others include NTFS, ReiserFS, XFS, HFS+, UFS2, ZFS, JFS, OCFS, Btrfs, ExFAT, UBIFS.

Performance issues with standard magnetic disks

Block Allocation Strategies

External fragmentation - space wasted external to the allocated regions, this space becomes unusable as its not contiguous (so you have lots of small spaces but you need a larger space to fit a whole file in)

Internal fragmentation - space wasted internal to the allocated memory regions, eg. you get a 4K block allocated and only use 1K, but the OS can't give the leftover 3K to someone else.

Keeping track of file blocks

System call interface

File syscall interface


Journaling file system keeps a journal (or log) of FS actions which allows for the FS to recover if it was interrupted when performing an action that is not atomic, but needs to be.

Memory Management and Virtual Memory

Virtual Memory

Input Output



Multiprocessor Systems


Tags: computing.
Security Engineering Notes
31st July 2010

Here are are some rough notes I put together as part of revision for a uni course.

Security Engineering


Extra notes from Schinder 2nd Ed.


Choose two primes p and q and let n = pq. Choose e such that e and (p - 1)(q - 1) are relatively prime (ie. no common factor and both prime numbers). Let d be a solution of ed = 1 mod (p-1)(q-1). Public key $latex K = (e,n)$, private key $latex K^{-1} = (d,n)$.

$latex E_K(M) = M^e \mod n$

$latex D(M) = E_{K^-1}(M) = M^d \mod n$

Access Control


You want your elections to be,

(additional notes, but don't really need to know for exam)

Security Architecture

Security Design Principles:

Defence in depth - use many layers of security (or many security perimeters in layers)

Human Factors

Some strategies for reducing the risk,


Risk is not something I would have thought to be in a security course, but now that I think about it there are few if any bullet proof systems, so there is always some risk.

Whether it be secure communication (there is always some risk that an eavesdropper cracks your cryto and reads the message, so its important to weigh up those risks to decide if its worth sending the message in the first place), or be it running a web server (you cannot be sure that your web server doesn't have bugs, or even if you have verified the code to be matching the vulnerability free specification other things can happen like, has your CPU been verified to be bug free, are you sure that a cosmic ray won't flip a bit and subsequently create a vulnerability). So weighing up the risks involved is important to decide how much time and effort you devote to securing a system, or how the system is designed to work.

Business Risk Concepts

Degree of Risk

       <-- Probability -->
High Exposure/  |  High Exposure/         /\
Low Probability |  High Probability       ||
-----------------------------------    Exposure
Low Exposure/   |  Low Exposure/          ||
Low Probability |  High Probability       \/

Exposure - how widespread is the system? Probability - how easy is it to exploit the system, and how great is the incentive to do so (which relates to how valuable the assets you are protecting are)?

Risk management

Tags: computing.
A Great Gnome Panel Applet + A UPnP Enabled Router
26th July 2010

I have just discovered that my ADSL router supports UPnP, and that this provides an interface to access 3 important bits of information from the router (external IP address, total bytes sent and total bytes received). I had previously been scraping the router's web interface to grab the external IP. As for the bytes sent and received, I didn't even know the router had a method of reporting these.

My first instinct was to look for a Perl library for UPnP, I found two. One in the Ubuntu repositories (and in CPAN) http://search.cpan.org/perldoc?Net::UPnP::GW::Gateway and another which appears to be in neither, http://perlupnp.sourceforge.net/.

I tried out the first one and after some fiddling get it working (though I haven't yet been able to eliminate the 2 seconds it spends blocked, ie. not executing on the CPU but still not complete).

Next I found a great program that allows you to place an arbitrary command's output in the Gnome Panel, http://code.google.com/p/compa/. Which resulted in,

Compa Gnome panel applet showing total bytes recieved and sent by the router.

The Perl script I use to provide the output to the Compa applet is at http://github.com/andrewharvey/perlmisc/blob/master/upnp_router_inoutbytes_to_compa.pl

Tags: computing, dev.
The URL shown in Firefox's status bar isn't always where you end up.
27th June 2010

Unless you are aware of the more technical details of web browsing its reasonable for the average web user to assume that if you hover your mouse over a link and Firefox tells you in the status bar that the link is to http://foobar.com/, then clicking on the link will actually take you to http://foorbar.com/. But sadly this is not the case for out of the box Firefox.

Take a look at a Google search results pages. Hovering your mouse over the links gives one URL in the status bar, yet clicking the link actually takes you somewhere else.

Here is a sample of the HTML for the link,

<a href="http://www.example.com/page1.html" onmousedown="return rwt(this,'','','res','1','$ID1','&amp;sig2=$ID2','$ID3')">Page Title</a>

Hovering over the link, you see in the status bar http://www.example.com/page1.html, but as soon as you mousedown javascript goes ahead and changes the href to something else (keep in mind that Firefox only goes to the new link on mouserelease), so that when you release the mouse your browser takes you to the replacement URL.

The problem I see with this is what if some unsuspecting user checks the link in the status bar, clicks the link thinking they are going one place then get taken somewhere else. This becomes even more of a problem if that site is susceptible to certain kinds of XSS attacks. So you can think your going to https://paypal.com/, and the URL bar after clicking the link goes to https://paypal.com/ but yet you've actually got some malicious js or html injected in the paypal.com/ page that you have loaded in your browser window.

I originally thought this was clickjacking, but the Wikipedia article describes that as when a transparent layer on top of the page provides the concealed URL.

Tags: computing, web.
Australian States Map/Graph API
19th February 2010

I've managed to do a couple things all in one here. I've made use of some Geoscience Australia Creative Commons licensed material, in a nice little program with a web API, and I've aggregated some data from the myschool scraper and parser. Putting them all together gives some nice images like this.

The program for generating these images basically takes an SVG template file with placeholder markers and then fills these values based on the CGI parameters. The API is fairly simple so one should be able to work out how to use it from the example in the README file. Here are the files I used to make the graphs (and the svg versions as Wordpress.com won't let me upload them to here).

ps. This gets cut off when viewing it from the default web interface of this blog, use print preview or even better look at the RSS feed to see the cut off parts. Also I tried to ensure the accuracy of the data, but I cannot be 100% sure that there are no bugs, in fact there are discrepancies with the averages I get from my scrape of myschool and the averages provided in the report on the NPLAN website. The numbers I get seem to be consistent (ie. the state rankings seem mostly the same), but nonetheless not exactly the same as those reported in the report. Although I would be very surprised if all the numbers I got were exactly the same as in the report. I mainly did this to use map/graph code I wrote, so if you really care about how certain state averages compare in these tests look at the reports on the NPLAN website.

The lighter the colour the higher the number.


2008 2009


2008 2009
Tags: computing, education, graphics.
A Look into the myschool.edu.au Data
7th February 2010

After overcoming a few problems I managed to write a scraper for the myschool.edu.au data. Unfortunately they choose to put data in HTML, so the scraping process may have led my data to have some unknown errors. I publish (see bottom) the scraped data as I believe that per the IceTV v Nine Network [2009] HCA 14 case, any data that my scraper produces as output from the HTML input is not subject to the copyright of the original HTML content (this also means that I cannot publish the HTML pages) and the Telstra Corporation Limited v Phone Directories Company Pty Ltd [2010] FCA 44 case, that the raw data that is scraped is not subject to copyright.

I wish I could bzip2 up all those HTML pages and give them to you just to save your download, because the myschool.edu.au site doesn't compress their pages when I tell them I accept gzip over HTTP, so it took up almost 2GB of quota to download all the HTML pages, oh well.

Some preliminary statistics from the data.

I could report a lot of stats like these above, all you need is a basic knowledge of SQL, but as much as I enjoy working out these stats I find graphs and graphics much more intuitive, so that is up next. Because of the vast dimensions to the data you can make all kinds of graphs so what would be best is a system to draw graphics dynamically which allows the user to decide what is graphed, but this takes more work so that is on the todo list.

I've also looked into doing some heatmaps using the geographical location of the schools, I could have used Google Maps, or I could use OpenStreetMap and libchamplain. Both have pros and cons... But for now I used Google Maps because their API is simple and I've always wanted to experiment with it, the downside is I'm not sure about the copyright of their maps and subsequently any derivative works. This image is just a test showing a dot for each school in the system, but its very easy to change the colour, size and opacity of the dots based on features of the school.

Schools in Sydney Map

Another test (some markers will be missing or in the wrong place, like the ones in NZ!),

[caption id="attachment_1023" align="aligncenter" width="450" caption="Google Earth map showing markers for Australian schools (though not completely accurate). (Copyright notices in image)"]Google Earth map showing markers for Australian schools (though not completely accurate).[/caption]

Source code? http://github.com/andrewharvey/myschool

Don't want to scrape and parse but want the raw data in a usable form? http://github.com/andrewharvey/myschool/tree/master/data_exports/

Extra thought: Currently the code uses Google's API for geting the geolocation of the school, I could use OpenStreetMap for this also, however it would take more investiagtion to determine what tools exist. At the moment all I know is I have an .osm file of Australia, but schools aren't just one dot, they are a polygon so unless I find some other tools which probably exist, I would need to (probably) just use one of the points in the polygon.

Or I could used the Geographic Names Register for NSW, but that is just for NSW... http://www.gnb.nsw.gov.au/__gnb/gnr.zip

Tags: computing, education, politics.
anonymous FTP
22nd January 2010

So I've started reading a book about networks, and to complement this I've been taking a closer look at my network traffic in Wireshark (really great tool, by the way.).

So I pick an ftp site that I know, ftp://download.nvidia.com/ and see what happens in Wireshark when I visit it in Firefox. At the FTP application level this is what happens,

ftpsite to me: 220 spftp/1.0.0000 Server []\r\n
me to ftpsite: USER anonymous\r\n
ftpsite to me: 331 Password required for USER.\r\n
me to ftpsite: PASS mozilla@example.com\r\n
ftpsite to me: 230- \r\n
               230- ---------------------------------------------------------------------------\r\n
               230- WARNING:  This is a restricted access system.  If you do not have explicit\r\n
               230-           permission to access this system, please disconnect immediately!\r\n
               230 ----------------------------------------------------------------------------\r\n

But Firefox does not disconnect. So I did some more research and I found no references to "anonymous" users in either RFC 959 (FTP) or RFC 3659 (extensions to FTP). (Though there are references in latter RFCs, see RFC 2228).

The thing I find disconcerting is that I don't think I have "explicit permission" to access this system. I (or rather Firefox) just guessed a username and password and they happened to let me in (what if I guessed a different username and password that wasn't anonymous and it let me in?). If the RFC specified that a user of anonymous requires no password, or any password, then I would assume that the FTP server is granting me permission, but I assume rather people just started using anonymous as the user and it caught on...

The real problem here is that there are laws which govern such areas, and it doesn't help that that I don't understand what PART 6 - COMPUTER OFFENCES of the CRIMES ACT 1900 (NSW) is saying.

Tags: computing, law.
To fix broken audio, unplug faulty USB device.
19th January 2010

How weird is this, just recently when I started up my computer lots of stuff was broken, no audio (and /proc/asound/cards was empty, normally it has "0 [Intel          ]: HDA-Intel - HDA Intel\nHDA Intel at 0xfa100000 irq 22"), libsensors weren't reporting any values (eg. no CPU temp reported), eth0 dissapeared from NetworkManager, and probably a host of other things that I didn't notice. Restarting didn't fix it.

Well long story short, it turns out that everything magically fixed when I unplugged a USB hard drive that was plugged in. I had seen a lot of concerning messages sent to /var/log/messages from the kernel about it,

Jan 19 09:45:00 host kernel: [  564.100026] usb 1-3: reset high speed USB device using ehci_hcd and address 2 Jan 19 09:45:00 host kernel: [  564.237716] sd 8:0:0:0: [sdd] Unhandled error code Jan 19 09:45:00 host kernel: [  564.237719] sd 8:0:0:0: [sdd] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK

that repeated every so often, but I never thought that a dodgy USB device would break the kernel from doing some of its job.

Tags: computing.
An Idea for a Media-Centreish Interface for a UNIX terminal/shell
18th December 2009

Back in July or August this year when I was going through the notes on unix shells for COMP2041 I came up with idea of doing a shell/terminal interface that looked like an interface for a media centre ie. rather than looking like this,

manual page for man in xtermit would look "like" this (obvious not exactly the same but similar feel),

[caption id="attachment_970" align="aligncenter" width="450" caption="XBMC skin MediaStream by Team Razorfish. http://xbmc.org/wordpress/wp-content/gallery/mediastream/viewoptions.jpg"][/caption]

The key principles I had in mind were,

My original motives were that I was just learning all these core-utils commands (ls, cat, mkdir, cp, mv...) and I found that although the shell had tab completion and apropos, it didn't categorise these or give them in a list of common commands. Then I came up with more abstract ideas,

I haven't really thought about it on a technical level, but it may not be so portable as say gnome-terminal. I don't know the really differences among different shells out there so I don't know how dependent this is on bash or even if it ties bash and the terminal together, but from a beginner user perspective I don't care about this.

The cloudy idea I have in my mind is basically a GUI/CLI hybrid but I think such a program would need to be careful not to go too far, because it could be made so that after doing an ls -la you could click on a file in the list and rename it, but then we are turning into a file manager in list mode (like Dolphin or Nautilus) which is unnecessary as those tools already exist.

I'm aiming to do come up with a list and more detailed list of requirements and a set of activity and use case scenarios, along with some wire-frame prototypes for such an interface soon. But for now I just needed to get it all out of my head an onto paper (and also public (in case someone tries to patent a concept)).

Tags: computing, sh.
The Features of My Utopian Music Player
11th December 2009

Ideally I would like to write my own music player because I don't really like any that are currently available (Amarok 1.4, Amarok 2, Songbird, Rhythmbox, Banshee, Exaile). I like features from each but none seem to fit all my needs. All the time I keep rethinking what I should do and I still cannot decide. Anyway this is what my ideal music player would be like...

Now for the solution. I could try everything from writing my own music player from scratch that implements that all (but I gave up on that after I could not decide what programming language to use C, C++, Java, Perl, Python, what GUI widget toolkit to use Qt, GTK+, wxWidgets, graphics api for nice graphs Cairo, raw OpenGL, OpenGL behind Clutter, R's graph drawing, Processing, or some other CPAN Perl module for drawing nice graphs. I can mix a few but the core app needs one programming language and it needs a core GUI toolkit for the GUI. There is too much choice and I don't have enough experience to know before hand what is best and what I will find easiest and simplest to use.)

I could try to capture playback statistics by looping last.fm and audioscrobber.com to localhost and capturing the data that Amarok sends. Or I could just write a script for Amarok which captures playback, but this only solves part of the problem and then I'm stuck using a certain application. Alternatively I could just take an existing program and fork it to suit my needs.

There should be more to come on this as I start experimenting.

Tags: computing, dev.
A Perl Script to Pause/Resume Amarok 1.4 Playback on Screensaver/Screenlock
11th December 2009

I've just uploaded to GitHub a script to pause Amarok 1.4 playback when the screensaver/screenlock starts and up pause again when closed/unlocked. It addresses the issue I was having with the script at http://nxsy.org/getting-amarok-to-pause-when-the-screen-locks-using-python-of-course where the script would start Amarok if it was not running and it would restart playback on screensaver end/unlock regardless of whether it was playing when the screensaver started.

You could start the script on start-up or plug it into Amarok's script engine to only be active when Amarok is active.

(Oh and in the future I'll try to avoid posts that just duplicate item's from other RSS/Atom feeds that don't add much extra value.)

Tags: computing, dev, sh.
Computer Graphics Notes
2nd December 2009

Not really complete...

Colour notes here, transformations notes here.

Parametric Curves and Surfaces

Parametric Representation

eg. $latex C(t) = (x(t), y(t))$


Parametric Continuity

Geometric Continuity

Control Points

Control points allow us to shape/define curves visually. A curve will either interpolate or approximate control points.

Natural Cubic Splines


Bezier Curve


This Bezier curve shown has two segments, where each segment is defined by 4 control points. The curve interpolates two points and approximates the other two. The curve is defined by a Bernstein polynomial. In the diagram changing points 1 and 2 only affects that segment. Changing the corner points (0 and 3) each only affect the two segments that they boarder.

Some properties of Bezier Curves:


We can join two Bezier curves together to have C1 continuity (where B1(P0, P1, P2, P3) and B2(P0, P1, P2, P3)) if P3 - P2 = P4 - P3. That is P2, P3,  and P4 are colinear and P3 is the midpoint of P2 and P4. To get G1 continuity we just need P2, P3, and P4 to be colinear. If we have G1 continuity but not C1 continuity the curve still won't have any corners but you will notice a "corner" if your using the curve for something else such as some cases in animation. [Also if the curve defined a road without G1 continuity there would be points where you must change the steering wheel from one rotation to another instantly in order to stay on the path.]

De Casteljau Algorithm

De Casteljau Algorithm is a recursive method to evaluate points on a Bezier curve.


To calculate the point halfway on the curve, that is t = 0.5 using De Casteljau's algorithm we (as shown above) find the midpoints on each of the lines shown in green, then join the midpoints of the lines shown in red, then the midpoint of the resulting line is a point on the curve. To find the points for different values of t, just use that ratio to split the lines instead of using the midpoints. Also note that we have actually split the Bezier curve into two. The first defined by P0, P01, P012, P0123 and the second by P0123, P123, P23, P3.


The curvature of a circle is $latex \frac{1}{r}$.

The curvature of a curve at any point is the curvature of the osculating circle at that point. The osculating circle for a point on a curve is the circle that "just touches" the curve at that point. The curvature of a curve corresponds to the position of the steering wheel of a car going around that curve.

Uniform B Splines

Join with C2 continuity.

Any of the B splines don't interpolate any points, just approximate the control points.

Non-Uniform B Splines

Only invariant under affine transformations, not projective transformations.

Rational B Splines

Rational means that they are invariant under projective and affine transformations.


Non-Uniform Rational B Splines

Can be used to model any of the conic sections (circle, ellipse, hyperbola)



When rotating about an axis in OpenGL you can use the right hand rule to determine the + direction (thumb points in axis, finger indicate + rotation direction).

We can think of transformations as changing the coordinate system, where (u, v, n) is the new basis and O is the origin.

$latex \begin{pmatrix}u_x & v_x & n_x & O_x\ u_y & v_y & n_y & O_y\ u_z & v_z & n_z & O_z\ 0 & 0 & 0 & 1 \end{pmatrix}$

This kind of transformation in is known as a local to world transform. This is useful for defining objects which are made up of many smaller objects. It also means to transform the object we just have to change the local to world transform instead of changing the coordinates of each individual vertex. A series of local to world transformations on objects builds up a scene graph, useful for drawing a scene with many distinct models.

Matrix Stacks


For MODELVIEW operations include glTranslate, glScaled, glRotated... These are post multiplied to the top of the stack, so the last call is done first (ie. a glTranslate then glScaled will scale then translate.).

Any glVertex() called have the value transformed by matrix on the top of the MODELVIEW stack.

Usually the hardware only supports projection and viewport stacks of size 2, whereas the modelview stack should have at least a size of 32.

The View Volume

Can set the view volume using,(after setting the the current matrix stack to the PROJECTION stack

In OpenGL the projection method just determines how to squish the 3D space into the canonical view volume.

Then you can set the direction using gluLookAt (after calling one of the above) where you set the eye location, a forward look at vector and an up vector.

When using perspective the view volume will be a frustum, but this is more complicated to clip against than a cube. So we convert the view volume into the canonical view volume which is just a transformation to make the view volume a cube at 0,0,0 of width 2. Yes this introduces distortion but this will be compensated by the final window to viewport transformation.

Remember we can set the viewport with glViewport(left, bottom, width, height) where x and y are a location in the screen (I think this means window, but also this stuff is probably older that modern window management so I'm not worrying about the details here.)

Visible Surface Determination (Hidden Surface Removal)

First clip to the view volume then do back face culling.

Could just sort the polygons and draw the ones further away first (painter's algorithm/depth sorting). But this fails for those three overlapping triangles.

Can fix by splitting the polygons.

BSP (Binary Space Partitioning)

For each polygon there is a region in front and a region behind the polygon. Keep subdividing the space for all the polygons.

Can then use this BSP tree to draw.

void drawBSP(BSPTree m, Point myPos{
   if (m.poly.inFront(myPos)) {
      drawBSP(m.behind, myPos);
      drawBSP(m.front, myPos);
      drawBSP(m.front, myPos);
      drawBSP(m.behind, myPos);

If one polygon's plane cuts another polygon, need to split the polygon.

You get different tree structures depending on the order you select the polygons. This does not matter, but some orders will give a more efficient result.

Building the BSP tree is slow, but it does not need to be recalculated when the viewer moves around. We would need to recalculate the tree if the polygons move or new ones are added.

BSP trees are not so common anymore, instead the Z buffer is used.

Z Buffer

Before we fill in a pixel into the framebuffer, we check the z buffer and only fill that pixel is the z value (can be a pseudo-depth) is less (large values for further away) than the one in the z buffer. If we fill then we must also update the z buffer value for that pixel.

Try to use the full range of values for each pixel element in the z buffer.

To use in OpenGL just do gl.glEnable(GL.GL_DEPTH_TEST) and to clear the z-buffer use gl.glClear(GL.GL_DEPTH_BUFFER_BIT).



Line systems. eg. koch curve


IFS - Iterated Function System


Shading Models

There are two main types of rendering that we cover,

Polygon rendering is used to apply illumination models to polygons, whereas ray tracing applies to arbitrary geometrical objects. Ray tracing is more accurate, whereas polygon rendering does a lot of fudging to get things to look real, but polygon rendering is much faster than ray tracing.

We start with a simple model and build up,

Lets assume each object has a defined colour. Hence our illumination model is $latex I = k_i$, very simple, looks unrealistic.

Now we add ambient light into the scene. Ambient Light is indirect light (ie. did not come straight from the light source) but rather it has reflected off other objects (from diffuse reflection). $latex I = I_a k_a$. We will just assume that all parts of our object have the same amount of ambient light illuminating them for this model.

Next we use the diffuse illumination model to add shading based on light sources. This works well for non-reflective surfaces (matte, not shiny) as we assume that light reflected off the object is equally reflected in every direction.

Lambert's Law

"intensity of light reflected from a surface is proportional to the cosine of the angle between L (vector to light source) and N(normal at the point)."

Gouraud Shading

Use normals at each vertex to calculate the colour of that vertex (if we don't have them, we can calculate them from the polygon normals for each face). Do for each vertex in the polygon and interpolate the colour to fill the polygon. The vertex normals address the common issue that our polygon surface is just an approximation of a curved surface.

To use gouraud shading in OpenGL use glShadeModel(GL_SMOOTH). But we also need to define the vertex normals with glNormal3f() (which will be set to any glVertex that you specify after calling glNormal).

Highlights don't look realistic as you are only sampling at every vertex.

Interpolated shading is the same, but we use the polygon normal as the normal for each vertex, rather than the vertex normal.

Phong Shading

Like gouraud, but you interpolate the normals and then apply the illumination equation for each pixel.

This gives much nicer highlights without needing to increase the number of polygons, as you are sampling at every pixel.

Phong Illumination Model

Diffuse reflection and specular reflection.

Components of the Phong Model (Brad Smith, http://commons.wikimedia.org/wiki/File:Phong_components_version_4.png)
Components of the Phong Model (Brad Smith, http://commons.wikimedia.org/wiki/File:Phong_components_version_4.png)

Source: Lambert (Source: COMP3421, Lecture Slides.)

$latex I_s = I_l k_s \cos^n \left ( \alpha \right )$

n is the Phong exponent and determines how shiny the material (the larger n the smaller the highlight circle).

Flat shading. Can do smooth shading with some interpolation.

If you don't have vertex normals, you can interpolate it using the face normals of the surrounding faces.

Gouraud interpolates the colour, phong interpolates the normals.


inverse square is physically correct, but looks wrong because real lights are not single points as we usually use in describing a scene, and

For now I assume that all polygons are triangles. We can store the normal per polygon. This will reneder this polygon, but most of the time the polygon model is just an approximation of some smooth surface, so what we really want to do is use vertex normals and interpolate them for the polygon.

Ray Tracing

For each pixel on the screen shoot out a ray and bounce it around the scene. The same as shooting rays from the light sources, but only very few would make it into the camera so its not very efficient.

Each object in the scene must provide an intersection(Line2D) function and a normal (Point3D) function

Ray Tree

Nodes are intersections of a light ray with an object. Can branch intersections for reflected/refracted rays. The primary ray is the original ray and the others are secondary rays.


Can do them using ray tracing, or can use shadow maps along with the Z buffer. The key to shadow maps is to render the scene from the light's perspective and save the depths in the Z buffer. Then can compare this Z value to the transformed Z value of a candidate pixel.



Line Drawing



We use Bresenham's algorithm for drawing lines this is just doing linear interpolation, so we can use Bresenham's algorithm for other tasks that need linear interpolation.

Polygon Filling

Scan line Algorithm

The Active Edge List (AEL) is initially empty and the Inactive Edge List (IEL) initially contains all the edges. As the scanline crosses an edge it is moved from the IEL to the AEL, then after the scanline no longer crosses that edge it is removed from the AEL.

To fill the scanline,

Its really easy to fill a triangle, so an alternative is to split the polygon into triangles and just fill the triangles.



Ideally a pixel's colour should be the area of the polygon that falls inside that pixel (and is on top of other polygons on that pixel) times the average colour of the polygon in that pixel region then multiply with any other resulting pixel colours that you get from other polygons in that pixel that's not on top of any other polygon on that pixel.

Aliasing Problems


In order to really understand this anti-aliasing stuff I think you need some basic understanding of how a standard scene is drawn. When using a polygon rendering method (such as is done with most real time 3D), you have a framebuffer which is just an area of memory that stores the RGB values of each pixel. Initially this framebuffer is filled with the background colour, then polygons are drawn on top. If your rending engine uses some kind of hidden surface removal it will ensure that the things that should be on top are actually drawn on top.

Using the example shown (idea from http://cgi.cse.unsw.edu.au/~cs3421/wordpress/2009/09/24/week-10-tutorial/#more-60), and using the rule that if a sample falls exactly on the edge of two polygons, we take the pixel is only filled if it is a top edge of the polygon.

Anti-Aliasing Example Case. The pixel is the thick square, and the blue dots are samples.
Anti-Aliasing Example Case. The pixel is the thick square, and the blue dots are samples.

No Anti-Aliasing

With no anti-aliasing we just draw the pixel as the colour of the polygon that takes up the most area in the pixel.



Can use adaptive supersampling. If it looks like a region is just one colour, don't bother supersampling that region.


Often the graphics card will take over and do supersamling for you (full scene anti aliasing).

To get OpenGL to anti-alias lines you need to first tell it to calculate alpha for each pixel (ie. the ratio of non-filled to filled area of the pixel) using, glEnable(GL_LINE_SMOOTH) and then enable alpha blending to apply this when drawing using,


You can do post-filtering using the accumulation buffer (which is like the framebuffer but will apply averages of the pixels), and jittering the camera for a few times using accPerspective.

Anti-Aliasing Textures

A texel is a texture pixel whereas a pixel in this context refers to a pixel in the final rendered image.

When magnifying the image can use bilinear filtering (linear interpolation) to fill the gaps.

Mip Mapping

Storing scaled down images and choose closes and also interpolate between levels where needed. Called trilinear filtering.

Rip Mapping helps with non uniform scaling of textures. Anisotropic filtering is more general and deals with any non-linear transformation applied to the texture

Double Buffering

We can animate graphics by simply changing the framebuffer, however if we start changing the framebuffer and we cannot change it faster than the rate the screen will display the contents of the frame buffer, it gets drawn when we have only changed part of the framebuffer. To prevent this, we render the image to an off screen buffer and when we finish we tell the hardware to switch buffers.

Can do on-demand rendering (only refill framebuffer when need to) or continuois rendeing (draw method is called at a fixed rate and the image is redrawn regardless of whether the image needs to be updated.)


Mip Mapping for models. Can have some low poly models that we use when far away, and use the high res ones when close up.


Key-frames and tween between them to fill up the frames.



OpenGL 2.0  using GLSL will let us implement out own programs for parts of the graphics pipeline particularly the vertex transformation stage and fragment texturing and colouring stage.

Fragments are like pixels except they may not appear on the screen if they are discarded by the Z-buffer.

Vertex Shaders

Fragment Shaders

set gl_FragColor.

Tags: comp3421, computing, graphics.
Human Computer Interaction Notes
2nd December 2009

These notes are based around my COMP3511 course.

Interaction Design (+Scenarios)

Cooper et al. describe the process of interaction design as,

  1. Identifying needs and establishing requirements for the user experience.
  2. Developing alternative designs that meet those requirements.
  3. Building interactive versions of the designs so that they can be communicated and assessed.
  4. Evaluating what is being built throughout the process and the user experience it offers.

Scenarios are narratives about named people with an age. We need some background to understand where they are coming from (for instance their cultural background (eg. the US uses MM/DD/YYYY but Australia uses DD/MM/YYYY)). We try to remove incorrect assumptions about what we think a certain group of people are like. The scenario should explain their motivations and their goals.


Usability is all about producing things which are usable. Where something is usable when it meets these usability goals, however you should work out which goals are most important for the problem and focus on those first.

Usability Goals

User Experience Goals

Heuristics (Usability Principles)

Design Principles

When designing a system we need to consider,

Identifying Needs


When testing an interface with users/test participants, give them a high level goal and observe how they go about doing it. Don't give them specific instructions.

Use Scenario 1: For each task identified (or major tasks, or particularly special tasks if many tasks are defined), write a description of how that user would accomplish the task independent of how they would complete it within the application.

Use Case 1: If a use scenario has been implemented, include a matching use case which describes how the task use scenario can be completed in the application. There may be branching or multiple ways to complete the task, and this is a good way to document it.

To test if something is a requirement just ask, "If I remove this, will the product still fulfil its purpose?"

Design Conceptualisation

A conceptual model is a high-level description of how a system is organised and operates. --Johnson and Henderson, 2002, p. 26

I like to think of it as this. The person coding the web browsers understands that when the users types in a URL and presses enter an HTTP GET request is sent and the response is received and the HTML is processed and displayed. There are many technical interactions and details that are happening here. But the conceptual model of this is what the average non-technical uses thinks is happening. They just have some kind of model in their head that they type the URL hit enter and get the web site displayed. Its just an abstraction of what is going on.

Interface metaphors are used as they can help the user understand and determine how an interface works. We try to use them for this purpose but just using the metaphor directly can have some negative affects (eg. if your designing a radio application for desktop PC's, it may not be a good idea to just show an image of a real radio as the UI). We don't want to use the metaphor to an extent that it breaks the design principles.

A classic example of a conceptual framework is that of the relation between the design of a conceptual model and the user's understanding of it. In this framework there are three components, (Sharp et al., 2006)

[caption id="attachment_772" align="aligncenter" width="361" caption="Conceptual Framework (from Norman, 1988)"]Conceptual Framework (from Norman, 1988)[/caption]

The designers job is to create the system image so that the users will invoke the same conceptual model as the designer's.

The interface could be made more transparent so the user can see the details of how the system works, but this is not always desirable as it may cause confusion. Also many users may not want to know all the gory details, nor should they have to know the actual implementation in order to use the system.


Using A Design Diary


Here is an example wireframe.

[caption id="" align="aligncenter" width="445" caption="Example Wireframe from https://wiki.ubuntu.com/DesktopExperienceTeam/KarmicBootExperienceDesignSpec"]Example Wireframe from (https://wiki.ubuntu.com/DesktopExperienceTeam/KarmicBootExperienceDesignSpec)[/caption]

Another paper prototype with a slightly higher fidelity.

[caption id="attachment_850" align="aligncenter" width="450" caption="An example paper prototype (from https://wiki.ubuntu.com/SoftwareStore)."]An example paper prototype (from https://wiki.ubuntu.com/SoftwareStore).[/caption]

Issues Table

Usability Testing

There are some legal and ethical issues to consider. The participant,

During a Usability Test,

After the Testing,

Usability Testing

When actually running a usability test you should follow a usability test plan. The test plan just details what the test facilitator should do during the test.

The usability testing procedures we used in this course are:

  1. Explain procedures (think aloud, data collection methods in use...)
  2. Make sure they agree and sign a consent for before proceeding (you keep one, they keep one)
  3. Run a pre-test questionnaire (used to generate a participant profile) (this helps to give you an idea on their experience level, as well as any background they may have in using similar interfaces before, as these affect how the participant performs) (best to get the participant to do this a few days before the test so that you can focus on specific tasks.)
  4. Introduce scenario
  5. Run through tasks
  6. Ask any post test questions
  7. Do they have any extra comments/debriefing
  8. Thank them for their time



User Centred Design Process

The UCD process is all about focusing on the users and tasks. It also means iterate your designs often. The development is driven by users needs rather than technical concerns.

More specifically Gould and Lewis (1985) give three principles,

Affinity Diagramming

Card Sorting

Software Lifecycles

Cognitive Load Theory

Cognition is what goes on in our brains. It includes cognitive processes such as,

Some Cognitive Load Theory

Some HCI Applications


(From a psychologists perspective).

Long Term Memory


[caption id="attachment_794" align="aligncenter" width="360" caption="A Taxonomy of Memory"]A Taxonomy of Memory[/caption]

Explicit and Implicit Memory

"Imagine that you learn a list of items and are then required to recall or recognise them. This memory test would be accompanied by conscious awareness that you were remembering. Imagine that a considerable time later, a test of recall or recognition revealed no memory for the items. However if you were given the original list to relearn there would probably be some savings in learning time (i.e. you would take less time to learn the list the second time, oven though you were not aware of your memory of the items). This is the distinction between explicit memory, in which remembering is accompanied by either intentional or involuntary awareness of remembering, and implicit memory, in which remembering is not accompanied by awareness (Graf & Schacter 1985; Schacter 1987)." -- (Walker, "Chapter 9: Memory, Reasoning and Problem Solving". pg. 262 (sorry I don't have the title))

Not really related, but a good thing to hear a text book say,

"Finally, some long-enduring memories are for passages that our teachers have drulled into us... The Interesting thing about these memories is that they are preserved as they were memorised, in a very literal form, in exact wordings (Rubin, 1982). The memory code is not transferred from literal to semantic. In fact, the words are often remembered mechanically, with almost no attention to their meaning." --(Walker, "Chapter 9: Memory, Reasoning and Problem Solving". pg. 267 (sorry I don't have the title))

Obstacles to Problem Solving

External Cognition

People use external representations to extend or support ones ability to perform cognitive activities. For example, pens and paper, calculators, etc. We do this to,

  1. reduce memory load
    • eg. post-it notes, todo lists. But the placement of say post-it notes is also significant.
  2. offload computation
    • eg. pen and paper to solve a large arithmetic problem (the mechanical kind).
  3. annotate
    • modifying or manipulating the representation to reflect changes

Experts vs. Novices

Visual Design


Differences around the world,

Internationalisation (i18n) refers to designing and developing a software product to function in multiple locales. Localisation (L10n) refers to modifying or adapting a software product to fit the requirements of a particular locale. This could include translating text, changing icons, modifying layout (eg. of dates).5

A locale is a set of conventions affected or determined by human language and customs, as defined within a particular geo-political region. These conventions include (but are not necessarily limited to) the written language, formats for dates, numbers and currency, sorting orders, etc.5




Keystroke Level Model

Fitt's Law


[1] Sharp, Rodgers, Preece. (2006) Interaction Design: Beyond human computer interaction. 2nd Ed.

[2] Marcus, Nadine. (2009) COMP3511 Cognitive Load Theory Lecture Slides.

Woo, Daniel. (2009) COMP3511 Lecture Slides.

Norman, Donald. (1988) The Design of Everyday Things.

[5] http://www.mozilla.org/docs/refList/i18n/

Tags: comp3511, computing.
A Review of Gnome Shell from my Perspective (and a Comparison with Compiz)
26th September 2009

I tried out Gnome Shell today. (And it didn't break everything! I followed their instructions it build and ran fine, and when I killed it, my normal environment with the normal Gnome Panel and Compiz... went back to normal.) Its shaping together nicely, there are many good things and I think its a great effort by everyone behind it. (But just a warning I don't know all the technical things behind everything here, so please excuse me if I miss something or don't use the correct terminology. This review is just from my perspective/my view. It is not a proper usability evaluation, nor have I looked and which is better engineered or anything too technical.)

[caption id="attachment_812" align="aligncenter" width="450" caption="My Desktop Running Gnome Shell"]My Desktop Running Gnome Shell[/caption]

[caption id="attachment_814" align="aligncenter" width="450" caption="My Desktop Running Gnome"]My Desktop Running Gnome[/caption]

The obvious difference is there is no bottom panel in Gnome Shell and the top panel is different (but its still in development of course so in a later version they may make more use of it).

My Current Work-flow

Window Management

Normally in Gnome I use Compiz a lot to help me manage my open windows. Compiz/Compiz Fusion has a lot of plugins, but over time I've found a few which I really like and I use all the time.

If I have a bunch of windows in one workspace and I want to switch to another I usually use Scale (shortcut of Super + Tab), although I still sometimes use the bottom taskbar, and I always use that taskbar when the window is minimised (Because Compiz can't access minimised windows pixmaps so they don't appear in Scale unfortunately. This is a real killer.). I can also right click on a window in this view to close it. This makes it really easy and fast to kill a heap of windows that I have finished with. This makes my search space when changing windows much less and hence much easier.

[caption id="attachment_818" align="aligncenter" width="450" caption="The Compiz Fusion Scale Plugin"]The Compiz Fusion Scale Plugin[/caption]

To change workspaces I use Expo (shortcut of Super + E). But I don't actually use more than I workspace all that often, even though I think I should be. The other great thing is I can drag windows from one workspace to another while in Expo.

[caption id="attachment_819" align="aligncenter" width="450" caption="Compiz Expo Plugin"]Compiz Expo Plugin[/caption]

Some other shortcuts I use for window management very frequently are,

I keep making refinements to this, but it works very well for me as it is.

Application Starting

In Gnome I use the Panels Run Application dialogue (see pic) (with the shortcut Alt + ) and the terminal (with the shortcut Ctrl +) to start new applications. Those shortcuts really make things easier and faster.

[caption id="attachment_815" align="aligncenter" width="450" caption="The Panel's Run Application Dialogue"]The Panel's Run Application Dialogue[/caption]

The run dialogue is good. I can run programs like firefox, gedit just as you would in the terminal but it means I don't have to have a terminal open or open one first (its all amount maximising efficiency, so I can get to where I want to be as fast as possible). Also I can enter locations such as /etc/whatever and Nautilus will be opened to that location. That text box has tab completion (and it actually shows the suggestions) which makes things easier and faster.

In Gnome Shell

[caption id="attachment_821" align="aligncenter" width="450" caption="Gnome Shell Activity Mode (sorry, I'm not sure what its actually called)"]Gnome Shell Activity Mode (sorry, I'm not sure what its actually called)[/caption]

Window Management

In Gnome Shell (it uses Metacity not Compiz) you can do all your window management and application starting through the Activities mode. Which can be started either by the Super key, clicking Activities, or dragging the mouse to the top left edge (although it seems I must go to the exact 0,0 pixel not 0,1 or 1,0 which is a bit annoying). This is good it gives the user some choice they may happen to have their hand near Super so they use that, or they may only using the mouse so they can use that (actually I will set up Compiz Scale to work with both Super Tab and a top left mouse move).

On the down side, Gnome Shell did not seem to be as fast and responsive as similar Compiz tools. What I mean is that on my system where the Scale tool is fast, as in the windows move smoothly and quickly, when I go into the Activities mode its has a small delay (less than a second, but its still annoying) and its seems a bit jumpy and jerky when everything is moving. But of course its still in development so I'm not going to criticise this. Apart from this, it seems just like Compiz Expo + Scale together. This activity mode window management is good, but there are some small things like I can't seem to close windows from this activity mode (like I can in Compiz's Scale), but I can move windows from one workspace to another in Gnome Shell just like in Compiz's Expo. Also it can also be annoying to have Scale and Expo mixed together (of course I can just just Alt + Tab or move windows around so I can focus on another, but I don't really like that idea).

Unlike Compiz/Gnome's multiple workspaces, in Gnome shell you can add these dynamically. Which I think is a better idea than the static type that normal Gnome/Compiz uses.

[caption id="attachment_822" align="aligncenter" width="450" caption="Gnome Shell allows you to dynamically add/remove workspaces"]Gnome Shell allows you to dynamically add/remove workspaces[/caption]

Things seems to be shifting towards emphasising multiple workspaces. What I need to try to remember to do is USE these multiple workspaces, grouping windows together where they group nicely, instead of just putting everything in one workspace. Window managers could help me with this, like they could remember that I often have Firefox on workspace 2, so when I run it automatically put it there and switch to workspace 2. I haven't tried this, so I don't know if it would help me, or just frustrate me by doing what I don't want every time. I'm not even sure if Compiz can do this anyway.

I'm not sure where dock's like Avant or Cairo fit in, but I never really found them to make things easier.

Application Starting

The other noticeable thing in Gnome Shell is that bar on the left. In normal Gnome you have your menu bar which has Applications, Places and System (which I wish I could easily shorten to Apps, Places and Sys to save space). Given I have this new user thing on the right where I can shutdown/logout/suspend/hibernate... from the only real thing I use System for is the Preferences and Administration. Yet I can never remember if what I want is in admin or pref. I recently discovered this system preferences thing which just puts it all in one window categorised into appropriate groups. I'm sure some find the two lists easier and some find the single window easier. When I scan with my eyes in a list I just go up/down, but when I scan a grid my eyes wander all over the place with no apparent system. As such its probably a more random search than a well defined one. There is heaps of things you could test out (we looked at some in my HCI course) to try to make the grid layout faster but nonetheless I think I like the grid better.

I use the Places bar often, and I think the Gnome Shell implementation makes things easier as they are listed in two columns, unlike traditionally where the number of bookmarks is limited and I need to navigate to a sub-menu to show them all. It seems I can't change the size proportions of those three sections on the side, but again its still in development. You could look at this a number of ways but because the panels are gone, if you are using a full screen application you can focus on that, with nothing cluttering the edge or distracting you from your task at hand. Traditionally everything in layered down, you have panels, then window decorations then menu bars, status bars, tabs (in Firefox), removing all that so that you just have the task at hand in your vision can be a great thing (yes I know there is a full screen feature in Firefox, and you can set Gnome panels to hide). When you are working in a browser its up to the web site (unless you have the time to do some Greasemonkey scripts) to allow you to again remove outside clutter, yet many application-like web sites allow you to do this (Alt + Shift + G when editing in Wordpress, u in Google Reader (to some degree)). Anyway that is moving away a bit from the topic of this post.

At the bottom of the left bar, you have recent documents. I use recent documents very very rarely (as in the shortcuts to them, not the documents themselves). Although I still think that a well designed system for access to recent documents integrated with some kind of search capability would be very useful for me, and I would use it often. However I am yet to find such a system that I like. The concept in my mind is something like the Lifestream design that Wei Zhou blogged about. An interface where time is on the horizontal axis, where you could change the scale and location of this view easily, view related things such as the weather for that particular time, your location if you have a GPS enabled laptop, etc. Also it should be integrated with a good filter feature (anything such as file type, file size, location, tags...) that lets you narrow down your search space. Something like that is what I have in my mind as a great use of a "recent documents" feature. GNOME Zeitgeist looks like it may address some of this.

Lastly the top section is the application launcher.

[caption id="attachment_825" align="aligncenter" width="450" caption="Gnome Shell Menu"]Gnome Shell Menu[/caption]

The actual menu in some ways is much better than the normal Gnome menu. Larger icons and a short description of the application are good. When I open the Gnome menu bar, I never need to see what's on my screen in order to make my selection from the menu bar (and if I forgot what I wanted to start I can always close it then open it again). You have the whole screen so you may as well use it, and Gnome Shell seems better in this respect. The bad thing is I don't like the use of pages. If not everything will fit on one column, you have to change the pages at the bottom. Instead you should be able to scroll through the options with the mouse wheel, or the ones that don't fit go in another column to the right (like Windows XP can do, and yes I used to use Windows XP).

The search box above this doesn't behave like the traditional Gnome Panel's Run Application dialogue. For example I can't type a file path, and tying gedit then enter won't take me where I want to go (gedit). Instead it takes me to some other entry I have defined in the menu bar. Now I can see some reasons why this could be better. Really I want to launch any executable files in my $PATH, but a user who doesn't use the terminal probably doesn't want this. An option so that the user can choose how they want it to behave would be better, I think.

[caption id="attachment_826" align="aligncenter" width="450" caption="Gnome Shell's search box doesn't behave as I expected."]Gnome Shell's search box doesn't behave as I expected.[/caption]

Having all my icon application starters in the top Gnome Panel was nice but there is no reason those can't be added to Gnome Shell, but again it's still in development. Although now that I've been using the interface for an hour or so, I think that they may create more clutter. Actually I think I would prefer that that top panel bar in Gnome shell would only appear in the Activity mode (but still recognise the top left mouse gesture). Although this may be scary for newbie's (hey I got intimidated the first time I used Blackbox, I couldn't work out that right clicking on the desktop gave me a menu) so an option would be much better.

Anything thing I wanted to mention was, I use Firefox a lot, and a lot of the concepts and issues with window management can be applied to tabs in a browser. The folks over at Mozilla are working on this so I'm eager to see what they come up with, but as more and more things are done through HTML web pages, it just means I'm going to have more and more tabs open that I need to manage, and navigate. Like starting a new application in a desktop environment you often start a new task (web page/tab) in a web browser. I've been using Ubiquity for a while now at I find it really good. Although they are up to release 0.5, I'm still using 0.1.9rc6. Although I can think of many improvements, its still really efficient at starting new tasks.

Oh an in case you were wondering from my Screenshots there, I'm using the orange-theme (orange-theme - 1.3.0.jaunty.ppa2+nmu1) from https://launchpad.net/~bisigi/+archive/ppa/+packages.

Tags: computing, ubuntu.
SBS Playlist to RSS Feed Perl Script v2
10th July 2009

I have made some changes to my original script. This new perl script will scrape info from sbs.com.au and give an RSS feed of the items in the specified playlist. I only know of two playlists (94 = Latest Full Episodes, 95 = Preview Clips). Only one line needs to be changed to use the script to give the RSS feed of a different playlist. The major improvement is the items that are only available over RTMP now have the correct URL which was previously incorrect (but now the script runs slower as it has to grab more pages from the web). I use the url, http://player.sbs.com.au/video/smil/index/standalone/$item_code/ to find out the url details.

FLVStreamer appears to do a good job of downloading media over the RTMP protocol. Just use ./flvstreamer -r rtmp://file.flv > file.flv. Mozilla has an article on how to add protocol's to firefox here. But I didn't bother with that as the command is simple as it is, and building an app with a save as dialogue is beyond me for now, but I hope to learn that soon.

[Update: It seems that you also need to have the --swfUrl argument set ('http://player.sbs.com.au/web/flash/standalone_video_player_application.swf' works.). Also the perl script doesn't get the file name correctly (it uses the thumbnail image url, rather it should be using the url's given at the /video/smil/index pages).]

For local use the current format will probably be what you want, but in a production environment you probably want to have the script save the RSS file to disk and have people hit that RSS file with the requests. Just set the perl script to run every now and then. Unfortunately I can't seem to upload .pl files to Wordpress (I've put a link, but that will expire eventually)... I really need to get my own site.. There is so much customisation I would like to do and many experiments to try out on a live server, but the $$$'s are too much...

On another note I tried out EPIC (Eclipse Perl Integration), which was fairly simple to install. It seems much nicer than using a plain text editor and command line, especially the debugging abilities that it adds.


Tags: computing, rss.
Deque Implementation Design
29th June 2009

For Assignment 1 of COMP2911 we got the task of implementing a deque, using arrays and linked lists (in Java). Here is the design I used for the implementation.


This was quite a challenge.

One approach was to have an array of size capacity, and also store some left and right index pointers to know where the deque ranges in the array.


My first idea (actually inspired by my tutor) was basically, to add to the right of the deque you start at the left and push right, to add to the left of the deque you start at the right and push left. When the two ends collide, you make a new array larger keeping the left and right parts of the deque at the leftmost and rightmost parts of the array.

series1This turned out to have some problems. All worked well, except for a couple of things. Such as show here,

series2, because now you are left in the middle and your not just pushing straight from one side to another.

This next situation below was also troublesome because you have to be careful where you store your left/right index pointers. You need to carefully think of what if they overlap? And if they are equal or overlap, is your arrayDeque full or empty?

series3iSo what I did is look at the different ways to store the left/right index pointers. In all these diagrams orange means the location of the left index pointer, and the violet is the location of the right index pointer.

a) In this diagram I point to the next available space.

series3i1b) In this diagram I point to the location of the most recently added item. At the very beginning I loop them over.

series3i2All this just lead to much confusion for me and many bugs and problems. So I had a look at an entirely new approach. It turned out much better, and I learnt that if things aren't working out and you just keep getting lots of bugs then sometimes trying another approach can really help out. I also learnt that its really hard to know the best way to do something right from the start you really have to try something and if its not working to well than just try something else and see how it goes. So here is what I finally did.

db_1Initially I just ignored the fact that I had a limited size array, and that things are actually mod array size. I just had two index pointers (shown in orange and blue) which pointed to the next free space at the left and right ends of the deque. I kept these as ints and they I don't think of them as wrapping round. If you forget about the array based implementation and go back to the original problem of a deque you see that really all you need is a big long line with a left end and right end pointer. Now all you have to do is remember that whenever you are dealing with implementation level things such as accessing the array, to use index mod capacity, rather than the raw index value (which may be out of the array range). That and you need to have a check to know when to resize your array.

Under this scheme the number of items in your deque is rightIndex - leftIndex - 1, therefore your array is full if and only if (rightIndex - leftIndex - 1) >= capacity. (Where capacity is the size of the array). If this is true then you need to resize your array.

The method I choose was to simply shift the deque along (either direction) so that the leftIndex is at -1.


This was much simpler. Basically I made a link object that stored an item and a left and a right pointer (to other link objects). I would store the leftmost and rightmost link items for the deque and that is all. I guess I could have stored size as part of the deque object and updated it whenever new link items were added or removed to the deque, but as we were given no requirements to do it one way or another I made it so that the size method would go though the whole deque and do a summation every time it was called.

The only thing I really had to watch was to ensure that I kept the left and right pointers for each link item up to date with changes, and this was my primary source of bugs.

Tags: comp2911, computing.
Very Useful Ubuntu/Unix Commands
29th June 2009

Just some Ubuntu/Unix commands that I seem to find very useful but can be hard to remember at times.

PDF Concatenation

pdftk in1.pdf in2.pdf cat output out1.pdf

Its annoying that nautilus doesn't integrate this by default, so I could select some PDF's then right click and choose merge. Luckily I can do this with the fairly simple Nautilus Actions Configuration app. But this takes time and the average user doesn't have the time to research these things on how to do it themselves. Of course this could be extended further to match what you can do from the command line, such as choose some method of ordering the files, or choose to do weird things such as rotate every second page. Sure you want to keep things simple, so as a start it would be great to see something basic. There are probably even nautilus scripts in the repositories out there that do this exact thing, but you spend half your time finding them and installing them. I think this should be enabled by default, or put in some options page somewhere.

Truncate an mp3 file

mpgsplit input.mp3 [00:00:20-00:00:58] -o output.mp3

It appears that this doesn't re-encode so it seems to run very fast.

Mount an ISO

sudo mount -o loop ~/disk.iso /media/cdrom0

Strip audio from a video file

ffmpeg -ab 320k -i in.mov outaudio.mp3

Unfortunately when I left out the bit rate, it defaulted to 14K or something much lower than what the source file was using.

Search all files in the current directory for some string

find . -exec grep "searchthisstring" '{}' \; -print
find . -exec grep -q "searchthisstring" '{}' \; -print

Trim and nup pdfs

pdfnup --nup 2x2 --pages 26-140 --trim "1cm 1cm 1cm 1cm" infile.pdf

Wake from Suspend

sudo rtcwake -u -s 12600 -m on

... then you need to manually put the computer in suspend and 12600 seconds later the computer will resume.




sudo shutdown -h now sudo shutdown -r now

... the -h means halt (shutdown), -r means reboot.

Tar and Gz all the files in a directory

tar -cvfz file.tar.gz *

Get file info from an HTTP server without downloading

wget -S --spider http://www.site.com/file
Tags: computing, linux.
COMP2121 Quick Summary on Particular Aspects
23rd June 2009

Von-Newman vs. Harvard Architecture.

Von-Newman has a single memory space, share for data and program instructions. Harvard Architecture has separate memory spaces for data and instructions (so you cannot execute from the data memory).

2's compliment


It is important to know that the hardware does all arithmetic in 2's compliment. It is up to the programmer to interpret the number as signed or unsigned.

To convert a number from 2's compliment, for example -45 in 2's compliment is 11010011, we can do something like this,

$latex 1 \times (-2^7) + 1 \times 2^6 + 1 \times 2^6 + 0 \times 2^5 +1 \times 2^4 + 0 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = -45$

To go the other way from say -1 to the 2's compliment form 11111111 we use that $latex 2^p - X$ formula. I'm not exactly sure how its supposed to work so I've hacked it to make it work.

If the number you wish to convert is negative, let $latex X = -n$, so that X is positive then take $latex 2^p$ where p is the number of bits you are using (say 8), then subtract X. If the number to convert is less than $latex 2^p$ (where p is the number of bits, say 8 ) then leave it as is and that in your 2's compliment.

Now that was complicated. But its the only way I can get that advertised $latex 2^p - X$ formula to work with the given set of sample data (as in that table above).

Sign Extension

Why do we need sign extension? We need it in order to do operations on numbers than have different bit lengths (the number of bits used to represent the number).

Decimal to Binary

From a human kind of approach to convert 221 to binary, we see that $latex 2^7 = 128$, that is 7 is the largest power of 2 less than 221, so we have $latex 1 \times 2^7$. That gives us 128, so we still have 93 (221-128) to go. We try $latex 2^6$, this is less than 93. So far we have $latex 1 \times 2^7 + 1 \times 2^6$. 29 left now, but $latex 2^5$ is greater than 29, so we put a zero in that digit, ie. $latex 1 \times 2^7 + 1 \times 2^6 + 0 \times 2^5$. If we go on we get $latex 1 \times 2^7 + 1 \times 2^6 + 0 \times 2^5 + 1 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$. Taking the coefficients of the $latex \times 2^x$ terms we get the number 221 in binary, 11011101.

We can convert hexadecimal to binary by going from hex to decimal then decimal to binary. For hex to decimal,

$latex \mbox{F23AC} = 15 \times 16^4 + 2 \times 16^3 + 3 \times 16^2 + 10 \times 16^1 + 12 \times 16^0 = 992172$ (where F23AC is in hex and 992172 is in decimal)

Operations on Signed and Unsigned Multi byte Numbers

add al, bl
adc ah, bh
does a + b, result in is a.

There are 3 multiplication operations, MUL (Multiply Unsigned), MULS (Multiply Signed) and MULSU (Multiply Signed with Unsigned). They each do this. Notice the result is stored in r1:r0.

mulThus to do n*m = r where n is 2 bytes unsigned and m is 1 byte signed,
mulsu nl, m ;nl * (signed)m
movw rh:rl, r1:r0
mulsu nh, m ;(signed)nh * m
add rh, r0

We can also do 16bit * 16bit,

;* From AVR Instruction Set Guide, pg 99-100.
;* Signed multiply of two 16-bit numbers with 32-bit result.
;* r19:r18:r17:r16 = r23:r22 * r21:r20
clr r2
muls r23, r21 ;(signed)ah * (signed)bh
movw r19:r18, r1:r0
mul r22, r20 ;al * bl
movw r17:r16, r1:r0
mulsu r23, r20 ;(signed)ah * bl
sbc r19, r2
add r17, r0
adc r18, r1
adc r19, r2
mulsu r21, r22 ;(signed)bh * al
sbc r19, r2
add r17, r0
adc r18, r1
adc r19, r2

brge and brsh

Calculating Total Stack Space Needed

Draw a call tree, find the path with the most total weight, that total weight is the total stack size needed. Here is the sample question,

A C program consists of five functions. Their calling relations are shown as follows (the arguments and irrelevant C statements are omitted).

int main(void) {


int func1(…) {

int func2(…) {

func1() is a recursive function and calls itself 15 times for the actual parameters given in main(). Both func3() and func4() do not call any function. The sizes of all stack frames are shown as follows.

main(): 200 bytes. func1(): 100 bytes. func2(): 400 bytes. func3(): 1,400 bytes. func4(): 300 bytes. How much stack space is needed to execute this program correctly? (3 marks)

call_graphThere are three paths,
main() func1() func1() x 15 200+100+15x100 =1800 main() func2() func3() 200+400+1400 =2000 main() func2() func4() 200+400+300 =900
The path with the most total weight is main() > func2() > func3(), so this is the total stack space needed.

Nested Interrupts

nested_interrupt(Source: Hui Wu's Lecture Notes)

Keypads with 'abc' 'def' ... buttons

These keypads where to enter b you need to press the abc button twice in succession, but wait to long at it will chose a. Here is a psudo algorithm that seemed to fit this,

.def reg = rN
.def reg = rM
.def count = rX

//passvalue means that we register the given value ie. abc abc wait > b

clr reg (to some value that is != to a key value) ;set to default
clr count
rjmp keyloop

  check pins for a key
  if no key pressed rjmp keyloop, else continue

  //key was pressed, and value is stored in key
  reset someTimeCounter
  if (key == reg) {
     inc count
     if (count == 3)
     if (reg != default) ;so we don't initially passvalue
        passvalue(reg,count) ;send the last value
     reg = key ;store the new one
     count = 1

rjmp keyloop

if someTimeCounter expires and count != 0 //(count up, so expires after time to wait for anymore keypresses) (check count != 0, because if its 0 then we never had any key pressed that we need to send)
   reg = default

Switch Bounce Software Solution

When a switch makes contact, its mechanical springiness will cause the contact to bounce, or make and break, for a few millisecond (typically 5 to 10 ms). Two software solutions are wait and see and counter-based.

  1. If we detect it as closed, wait for a little bit and check again.
  2. Poll the switch constantly. For each poll if the switch is closed increment the counter. If we reach a certain value in a certain time then the switch was closed (or button pressed).

Serial Communication (Start and Stop bit)

"[The] start bit is used to indicate the start of a frame. Without the start bit, the receiver cannot distinguish between the idle line and the 1 bit because both are logical one. A stop bit is used to allow the receiver to transfer the data from the receive buffer to the memory." (Wu, Homework 6 Solutions)


uart(Source: Hui Wu, Lecture Notes)

Sample Q3a

(This code probably won't work and probably has errors (and maybe not just simple ones, but serious ones that mean that the logic is wrong))

A: .byte 20 ;array of size 10, element size 2 bytes

ldi XL, low(A)
ldi XH, high(A)

;add the contents of the array.
store 0
store 1
store 2
store 3
store 4
store 5
store 6
store 7
store 8
store 9

;find the largest value
ldi XL, low(A)
ldi XH, high(A)

;start with the 1st element of the array
ld r25, X+
ld r26, X+

ldi r20, 10 ;size of array
 cpi r20, 0
 breq endloop

 ld r21, X+
 ld r22, X+

 cp r25, r21
 cpc r26, c22
 brlo lowerthan
 ;we have a new max
 mov r25, r21
 mov r26, r22


 inc r20
 rjmp loop
endloop: rjmp endloop

.macro store
ldi r16, low(@0)
ldi r17, high(@0)

st X+, r16
st X+, r17

For some reason in my lecture notes I have "eg. fine 2nd or 3rd smallest or largest" so here is a modification to do something like that.

A: .byte 20 ;array of size 10, element size 2 bytes

ldi XL, low(A)
ldi XH, high(A)

;add the contents of the array.
store 0
store 1
store 2
store 3
store 4
store 5
store 6
store 7
store 8
store 9

;sort into accending
loopthough for the length of array, (by then we can be sure its sorted)
ldi r23, 10
 cpi r23, 0
 breq endlargeloop

 ;point X to the start of A
 ldi XL, low(A)
 ldi XH, high(A)

 ;start with the 1st element of the array
 ld r25, X+
 ld r26, X+

 ldi r20, 10 ;size of array
 cpi r20, 0
 breq endloop

 ;the next value
 ld r21, X+
 ld r22, X+

 cp r25, r21
 cpc r26, c22
 brge gethan
 ;r22:r21 < r26:r25
 ;swap the order
 st -X, r26
 st -X, r25
 st -X, r22
 st -X, r21

 ld r24, X+ ;to change the X pointer
 ld r24, X+

 ld r25, X+
 ld r26, X+


 inc r20
 rjmp loop

inf: rjmp inf

.macro store
ldi r16, low(@0)
ldi r17, high(@0)

st X+, r16
st X+, r17
Tags: comp2121, computing.
COMP2121 - Wk05 - Interrupts
20th June 2009

To explain interrupts, Wu used an example of a network card that is downloading a file. The network card has a buffer, and only once this buffer is full (or data stream is complete) should the CPU then copy the contents from the buffer to the RAM. So how does the CPU know when the network card's buffer is full and when to execute the copy? He described two ways here interrupt and polling.

Polling involves the CPU periodically asking the network card, are you full? Two problems with this method are a) there may be a delay as you have to wait for the poll request to be made and b) it wastes a lot of CPU time. Polling is implemented in software, not hardware.

An alternative to polling is using interrupts whereby the network card will send an interrupt signal to the CPU to get its attention. This needs special hardware to implement, however it is very efficient compared with polling.

An interrupt system must (among other things),

IRQ is an interrupt request signal.

A daisy chain arrangement (as seen below) allows multiple devices to send and IRQ. However the CPU cannot determine from the IRQ line which device sent the interrupt. So in a daisy chain system when the CPU receives an IRQ, it will send a signal to IO1 asking "did you send the IRQ?" if IO1 sent the request it will reply "yes", if not it will pass the question on to the next IO device and so on. The response is passed back in the same way.


Reset is an interrupt in AVR, and in the AVR Mega64 there are five different sources of reset. There is a flag in the MCU Control Register for each of these and can be used to determine the source of a reset interrupt. The watchdog timer is one source of reset.

Watchdog Timer

The watchdog timer is used to try to reset the system if an error such hang occurs. The watchdog timer in AVR can be enabled or disabled.

If the Watchdog timer is enabled, it needs to be periodically reset using the wdr instruction. When (if) the Watchdog times out, it will generate a short reset pulse.

Tags: comp2121, computing.
PostgreSQL Very Basic Cheatsheet
5th May 2009

(Wrote this a few weeks ago when I knew nothing. Indented into my brain now. Should have published earlier or just trashed the post as it seems too simple now. So instead I'll update it when I find out some new neat tricks.)

List of databases: $ psql -l

To open one of them, $ psql MyDatabase

To see what is in the database (list of relations), mydb=# \d

To examine a specific table, mydb=# \d TableName

Can execute SQL, mydb=# select * from Table;

Can edit SQL in an editor from within PSQL, mydb=# \e

To quit, mydb=# \q

To load a schema from a file $ psql mydb -f /home/foo/bar

Also from the shell, $ pg_dump -O myDB > file (-O means no ownership information is outputed)

On my server configuration (default for ubuntu) you can restart the PostgreSQL service using, $ sudo /etc/init.d/postgresql-8.3 restart

Tags: comp3311, computing.
COMP3311 - Week 7 Notes
5th May 2009

Its pointless repeating what John put in the lecture slides, so this is just my additions that he mentioned but are not in the slides.


Initial condition is defaulted to NULL. So sometimes you will need to define,

  initcond = '';

This is different to NULL, because,

  null || 'abc' --> null

(where || is append) but

  '' || 'abc' --> 'abc'


$x = 2;
myFunc() {
  global $x;

If we omit the global $x, then any references to $x in myFunc will refer to a new local x, not the first x that is set to 2. To avoid this and force any references to x inside myFunc to refer to the first x that is equal to 2, we need this global $x line.


1. $i = strpos('abc', 'a') --> 0
2. $i = strpos('abc', 'b') --> 1
3. $i = strpos('abc', 'z') --> false

if($i) will only be true in case 2 (false in case 1 and 3). So if we want to test if the second string was at all in the first string we must use,

if($i !== false)

This one will be true in case 1 and 2, but not 3.

Tags: comp3311, computing.
COMP2121 - Wk03/04 - Data Transfer/Functions
17th April 2009

[Updated: Clarified what I wrote about the way stacks in AVR work, in the Push/Pop section.]

Data Transfer Instructions


(Credit: Hui Wu/COMP2121 Lecture Notes)

Load Direct (ld):

ld Rd, v
Rd $latex \in$ {r0, r1, ..., r31} and v $latex \in$ {x, x+, -x, y, y+, -y, z, z+, -z}

(remember the X, Y, Z pointers)

Load Program Memory (lpm):

Can take on three forms,

Syntax Operation
lpm Rd, Z
lpm Rd, Z+

Z contains the byte address while the program flash memory uses word addressing. Therefore the word address must be converted into a byte address before having access to the data on the program flash memory. Example usage, (Table_1<<1 converts the word address into a byte address)

   ldi zh, high(Table_1<<1) ; Initialise Z pointer
   ldi zl, low(Table_1<<1)
   lpm r16, z+ ; r16=0x76
   lpm r17, z ; r17=0x58
Table_1: .dw 0x5876


AVR has 64 IO registers. Example,

in Rd, A
out A, Rd
where 0 ≤ A ≤ 63.


The stack pointer (implemented as two 8-bit registers in the I/O space) points to the next free space in RAM above/below (depends how you look at it) the stack. The stack in AVR is part of the SRAM space, and the stack (in AVR) grows from higher memory locations to lower memory locations. I'm talking about the actual value of the address, so 0xFFFF is a higher address than 0xDDDD. This got me a little confused at one stage because if you draw a picture of memory with 0x0000 at the top of the diagram, and 0x0001 below it and so on then in reference to the diagram a stack that is getting larger with PUSH operations is growing upwards (you usually associate higher to lower with down, this is why I got confused).

So the first thing you must do is initialise the stack pointer (RAMEND is a good starting location).

So a push operation,

push Rr

will push Rr onto the stack (ie. put the contents of Rr into the location that the SP points to), and then decrement the SP by 1. Pop has a similar opposite effect.

Shift Instructions

Logical shift left

lsl Rd

lslRotate Left Through Carry

rol Rd


Both operation change some status register flags.


We dabbed into this in first year but just to revise and extend a little I'll try to reiterate this here.

The heap is used for dynamic memory applications (eg. malloc()).

The stack is used to store return addresses, parameters, conflict registers and local variables and other things.

In passing parameters in WINAVR (C compiler for AVR) for say a function call they are passed by value for scalar variables (eg. char, int, float) and passed by reference for non-scalar variables (eg. array, struct).

Rules are needed between the caller and the callee to resolve issues such as,

If a register is used in both caller and callee and the caller needs its old value after the return from the callee, then a register conflict occurs. Either the compiler or the assembly programmer needs to check for this. The work around is to save the conflict registers on the stack.

The return value of a function needs to be stored in designated registers. WINAVR uses r25:r24 for this.

A stack consists of stack frames. A stack frame is created whenever a function is called, and it is freed whenever the function returns. avrstack

(Credit: Hui Wu/COMP2121 Lecture Notes)


The AVR assembler offers macros. A macro is just a segment of code that you define and can then use by just calling the macro. Basically the macro name is just a place holder for the macro code. When the program is assembled the macro name will be replaced by the code that macro defines. This defines a macro named mymacro,

.macro mymacro
 lds r2, @0
 lds r3, @1
 sts @1, r2
 sts @0, r3

We can then invoke this macro with,

mymacro p, q

The p, q are used like arguments. So @0 will be replaced with p and @1 will be replaced by q. In AVR you can used @0 up to @9 in the macro body.

Assembly Process

The AVR assembler uses a two pass process.

Pass One:

Pass Two:


Wu, Hui. COMP2121 09s1 Lecture Notes.

Tags: avr, comp2121, computing.
COMP2121 - ADD and ADC
14th April 2009

Just some notes on using ADD and ADC (and also a not on compare and branch instructions at the end). (Thanks also to my lab partner who helped me with some of this stuff.)

The registers for AVR are 8 bits long, as noted if I want to store a 16 bit number I need to use two registers, one will store the low byte and the other will store the high byte. eg the 16 bit binary number a = 1111111100000000 (as big endian) would be stored as a_high = 11111111, a_low = 00000000.

Another thing that I've learnt is how addition of multi byte numbers in AVR works. AVR has the instructions ADD (add without carry) and ADC (add with carry). Lets look at a simple case first.

ldi r16, 0b10101010 //the 0b indicates binary ($ or 0x for hex, nothing for decimal, 0 for octal). loads the binary number 10101010 into register 16.
ldi r17, 0b10101010
add r16, r17

What happens here is,

+ 10101010
  c c c c

The carry bit is set, and hence and overflow has occurred. The actual answer is 101010100 but this has 9 bits and cannot fix in the 8 bits of space we have for the result to be stored into r16. Here we used the ADD instruction this means (at least to the best of my understanding, please correct me if I'm wrong) that we ignore what is originally in the carry flag (part of the status register). If we used ADC and the carry flag was set to 1 then when we add the right most bit as above 0+0 = 0 but we have a carry so we would get a 1 in that last bit. We can use this to do multi byte arithmetic.

Say we have,

//a is a 16 bit number 1010101010101010
ldi al, 0b10101010
ldi ah, 0b10101010
//b is a 16 bit number 1010101010101010
ldi bl, 0b10101010
ldi bh, 0b10101010

//to do a+b (and store the result in a) we do
add al, bl
adc ah, bh

We can then extend this to as many bytes as we want. Say we have a 3 byte number spread over the registers of a_0, a_1, a_2 and another 3 byte number in the registers b_0, b_1, b_2. We could add them like this,

add a_0, b_0
adc a_1, b_1
adc a_2, b_2

The result would be spread over a_0, a_1 and a_2 (remember though that we can still overflow this).

It was also enlightening (though also a nice reminder) to see how this low level addition works. (when doing 1bit addition (without carry), a + b, the sum = a xor b and the carry = a and b.) (see picture below)

onebitadderonebitadderwithcarry(Credit: Annie Guo, Basics of Digital Systems, COMP2121 Notes)

This concept can similarly be applied to other instructions. For example,

cp a, b
breq label

will compare a and be and branch to the label label if a is equal to b (and if not continue on executing the subsequent instructions). However what if you wanted to compare if a two byte number was equal to a two byte number? Sure you could just repeat the code one time for the low byte and another for the high byte but you can also do this,

cp al, bl
cpc ah, bh
breq label

These cp and cpc instructions use some trickery (well not really but I didn't initially see how this worked, though its quite obvious now) where they modify some flags based on the result of the comparison and then the branch instructions (breq, brne, brlo...) look at the respective flag to see if they should branch or not.

Tags: avr, comp2121, computing.
COMP2121 - Wk04
13th April 2009

Some general notes from an AVR Tutorial that I have found useful and need to reiterate.

Sections marked with ¹ are ©2002-2009 by http://www.avr-asm-tutorial.net, from Beginners Introduction to the Assembly Language of ATMEL AVR Microprocessors by Gerhard Schmidt 2003. Used under the supplied license, "You may use, copy and distribute these pages as long as you keep the copyright information with it."

Tags: comp2121, computing.
COMP3311 Wk02
26th March 2009

Various things about mapping ER Designs to Relational Schemas.

Mapping Strong Entities


The relational model supports only atomic attributes. To map composite attributes you can try,

  1. Concatenate the attributes eg. Struct name {"John", "Smith"} --> "John Smith"
  2. Map atomic components of the composite attribute to a set of atomic components. eg. mapcompatt1 mapcompatt_table
  3. ??

Mapping N:M Relations


Mapping 1:N Relations

map1nrel2Mapping 1:1 Relations


Notes from the Text Book (The Lecture Notes are a Little Different)

Domain Types & User Types

In the sample code for the first assignment to define "custom types" create domain is used. eg.

create domain PersonGender as char(1) check (value in ('M','F'));

However the text also shows create type. eg.

create type Dollars as numeric(12,2) final

It goes on to explain the difference.

Pattern Matching

Patterns in SQL can be desribed using % and _.


select foo from bar where lar like '_to%'

This will match to any of these strings, "Lto" "Ato" "ltoo" "rtoto" ... (any character at the start, then the "to" string, then any (even null) trailing string)

You can define the escape character for a like comparison as follows,

like 'he\%%'  escape '\'' --matches all strings begining with 'he%'

You can also use not like.

SQL:1999 allows for similar too which is similar to Unix regular expressions.

Drop vs. Delete

drop table r will remove all the tuples from r, and removes the schema of r, whereas

delete from r will just remove all the tuples from r, but leaving the schema so you can still add values to the table.


Shepherd, John. COMP3311 09s1 Lecture Slides. http://www.cse.unsw.edu.au/~cs3311/09s1/lectures/. (Diagrams have also been sourced from here).

Silberschatz. Database System Concepts. 5th Ed.

Tags: comp3311, computing.
COMP3311 - Wk01
23rd March 2009

Just a couple random notes, to reiterate some things I need to become acquainted with. Definitely not comprehensive.

The ER Diagram



  1. Each manager manages exactly one branch, and each branch is managed by exactly one manager.
  2. Each branch holds zero or more accounts, but each account is held be at most one branch.
  3. Each customer owns zero or more accounts and each account is owned by zero or more customers.


participNot all customers must take out a loan (or it is not the case that every customer takes out a loan), but every loan is taken out by at least one customer. i.e. Every loan is associated with at least one person, but every person is not necessarily associated with at least one loan.

Attributes Linked to Relationships


In (a) you know how much time a particular person spends on a project. In (b) you only know how much time has been spend on a particular project. You don't know the distribution of that time among the researches that have worked on it.


Shepherd, John. COMP3311 09s1 Lecture Slides. http://www.cse.unsw.edu.au/~cs3311/09s1/lectures/. (Diagrams have also been sourced from here).

Tags: comp3311, computing.
COMP2121 - Wk02
20th March 2009

This week we learnt some of the things that separate assembly language from machine code in the context of AVR (or should I say AVR Studio).

Important Note 1: Assembly Language

In AVR assembly an input line takes the form of one of these, ([foo] = Optional)

[label:] directive [operands] [Comment]
[label:] instruction [operands] [Comment]
Empty Line

where a comment has form,

; [Text]

Important Note 2: Pseudo Instructions

AVR Assembly (using AVR studio) has some additional commands that are not part of the AVR instruction set, but the assembler (that is part of AVR studio) interprets into machine code.

Pseudo instructions are recognised by their preceding '.' (dot character). eg,

.equ CONST = 31

, will upon assembly go through the code and replace CONST with 31.

Here are the AVR assembly Pseudo Instructions.

Directive Description
BYTE Reserve byte to a variable
CSEG Code Segment
CSEGSIZE Program memory size
DB Define constant byte(s)
DEF Define a symbolic name on a register
DEVICE Define which device to assemble for
DSEG Data Segment
DW Define Constant word(s)
EQU Set a symbol equal to an expression
EXIT Exit from file
INCLUDE Read source from another file
LIST Turn listfile generation on
LISTMAC Turn Macro expansion in list file on
MACRO Begin macro
NOLIST Turn listfile generation off
ORG Set program origin
SET Set a symbol to an expression

.byte - Reserve some space (only allowed in dseg). eg.

  var1: .byte 4 ;reserve 4 bytes and store address in var1

  ldi r30, low(var1) ; Load address into Z low register
  ldi r31, high(var1) ; Load address into Z high register
  ld r1, Z ; Load VAR1 into register 1

...and some more...

.def FOO=r30 ;give register 30 the symbolic name FOO
.equ var = 2 ;like C's #define statement
.set var = 2 ;like a global variable in C

Important Note 3: Segments

There AVR three segments of an AVR assembly program. Also when writing an assembly program you need to be able to specify which segment you are referring to, so you need to declare this using an AVR assembler directive shown in brackets below.

  1. Code Segment (.cseg)
  2. Data Segment (.dseg)
  3. EEPROM Segment (.eseg)

"The CSEG directive defines the start of a Code Segment. An Assembler file can consist of several Code Segments, which are concatenated into one Code Segment when assembled. The BYTE directive can not be used within a Code Segment. The default segment type is Code. The Code Segments have their own location counter which is a word counter. The ORG directive can be used to place code and constants at specific locations in the Program memory. The directive does not take any parameters." [1]

Notes from the Lab

Assembly Code and Machine Code

In the lab we looked at this AVR assembly program,

.include "m64def.inc"
.def a =r16 ; define a to be register r16
.def b =r17
.def c =r18
.def d =r19
.def e =r20
main: ; main is a label
ldi a, 10 ; load 10 into r16
ldi b, -20
mov c, a ; copy the value of r16 into r18
add c, b ; add r18 and r17 and store the result in r18
mov d, a
sub d, b ; subtract r17 from r19 and store the result in r19
lsl c ; refer to AVR Instruction Set for the semantics of
asr d ; lsl and asr
mov e, c
add e, d
inc r21 ; increase the value of r21 by 1
rjmp loop ; jump to loop

The machine code executable produced by AVR studio was 24 bytes long, the question was why. It's actually quite simple, all of the here instructions are 1 word (2 bytes = 16 bits), there are 12 instruction so total 24 bytes. One thing that initially confused me was the weird encoding. Back in 1917 I got the impression that if you have something like mov r16 r17 that this would be represented in machine code as some number for the mov operation followed by two more numbers of the same bit size for the registers. However this can vary depending on the architecture.

For example the mov operation, MOV Rd, Rr has encoding 0010 11rd dddd rrrr. The instruction takes up 6 bits, the source register takes up 5 bits and the destination takes up 5 bits (total 16 bits). The reason that the source and destination bits are intertwined is that it makes things easier on the hardware implementation to do it this way.

The program above has machine code (in hexadecimal),

E00A EE1C 2F20 0F21 2F30 1B31 0F22 9535 2F42 0F43 5993 CFFE

and annotated,

+00000000:   E00A        LDI     R16,0x0A         Load immediate
+00000001:   EE1C        LDI     R17,0xEC         Load immediate
+00000002:   2F20        MOV     R18,R16          Copy register
+00000003:   0F21        ADD     R18,R17          Add without carry
+00000004:   2F30        MOV     R19,R16          Copy register
+00000005:   1B31        SUB     R19,R17          Subtract without carry
+00000006:   0F22        LSL     R18              Logical Shift Left
+00000007:   9535        ASR     R19              Arithmetic shift right
+00000008:   2F42        MOV     R20,R18          Copy register
+00000009:   0F43        ADD     R20,R19          Add without carry
+0000000A:   9553        INC     R21              Increment
+0000000B:   CFFE        RJMP    PC-0x0001        Relative jump

Status Register

Stepping through this program also shows a few of the status registers in use. The Status register, like all the other registers has 8 bits, namely,

SREG Status Register
C Carry Flag
Z Zero Flag
N Negative Flag
V Two’s complement overflow indicator
S N ⊕ V, For signed tests
H Half Carry Flag
T Transfer bit used by BLD and BST instructions
I Global Interrupt Enable/Disable Flag

Last week we saw how signed and unsigned numbers are stored, so if you look at the program above you will see that the last part just increments a register infinitely over and over. Stepping through this shows us what the status register does as we do this. Remember that AVR does all arithmetic in two's compliment.

0 H Z
1-127 H
128 H V N
129-255 H S N

As you can see the values that are negative under the two's complement 128-255 have the N (negative) flag. Also from 127 then to 128 under two's compliment we have gone from 127 to -128, so the V (Two’s complement overflow indicator) flag is indicated. 0 has the zero flag.

The Rest

The lecture notes go over some of the AVR instructions but I think the docs provided by Atmel are fine. What I do think needs explaining (and me learning) is the carry flag and the difference between operations like add without carry (ADD) and add with carry (ADC).

Here is how I understand the carry bit. Say we have 4 bit registers and try to add (in binary) 1000 and 1000, the answer is 10000 however this is 5 bits and we only have 4 bits available to store the result. An overflow has occurred. To introduce some terminology, the most significant bit (or byte) (msb) is the left most bit (or byte) in big-endian (right most in little-endian), and vice-versa for least significant bit (or byte) (lsb). The carry bit (C flag) is the carry from the msb. This can indicate overflow in some cases but not always, it those cases the V flag (Two’s complement overflow indicator) is set.

So getting back to ADD and ADC, ADD will just add the two numbers ignoring the carry bit and ignoring overflow whereas ADC will actually add the carry bit to the result. At least this is what I've observed, though I'm not 100% sure on this.


[1] AVR Assembler User Guide. www.atmel.com/atmel/acrobat/doc1022.pdf


AVR Assembler User Guide. www.atmel.com/atmel/acrobat/doc1022.pdf

Tags: comp2121, computing.
An old memory of Linux
13th March 2009

I only started using Linux when I started Uni (2008). I knew nothing about it prior to this. I had heard of it but that is all. I now wish I'd become aware of it sooner but then I realised I had the chance. A friend of mine introduced me to it in year 10 (2005) and I even got an Ubuntu 4, 5 or 6 CD (I can't recall which one). But I never used it. Why not? Well, it occurred to me that Windows was the sole reason for me not at least testing Linux out. I had had some bad experiences with some Windows operating systems and installing windows operating systems. So I had the thought "I don't want to risk breaking my computer/loosing things again with another OS install" just to test this Linux thing out. Huge mistake! Though at least I've learnt from this experience and I'm trying to try things out and give them ago before I just assume them to be poorer quality or choosing not to use them.

One of the best pieces of advice that I would give is to try new things out, if you don't try it you won't know what your missing. For if I was always too afraid to try new things out I would still be using Windows 95.

Tags: computing, linux.
COMP2121 - Wk01
13th March 2009

Just some reflections on the first week of my 09s1 COMP2121 Lectures and other materials. I will base a lot of this material on the course lecture notes by Hui Wu. Actually if you read this and the lecture notes you might see that they are pretty much identical.

The 4917 Microprocessor

This first lecture clarified a lot of the things I'd learnt about low level computing in COMP1917, which was good. Back in COMP1917 we were introduced to the 4917 microprocessor (a hypothetical designed just for this course). It was 4bit, has 16 memory locations and 4 registers (instruction pointer, instruction store, general register 0 and general register 1). Each memory location could store a number between 0 and 15, and there were 16 instructions.

It had a simple start up (registers set to 0, all memory locations set to 0, fetch-execute cycle begins), and a fetch-execute cyle,

So for example the following 4917 machine code program would print 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15.

8 0 3 11 1 15 0

This is understandable but was difficult for me to tie into my desktop computer level, for instance the CPU and the RAM. This became a bit clearer today.

The Von Newman and The Harvard Architectures

Two main computer architectures include the Von Newman Architecture and the Harvard architecture. The hypothetical 4917 from COMP1917 was a Von Newman Architecture because both the program and the data were stored in the same memory. (It seems that processors like the Intel Pentium 4, Intel Core 2, etc. use this architecture)


Another key thing is the Arithmetic and Logic Unit (ALU) and the Control Unit are collectively called the CPU (Central Processing Unit). A microprocessor is just a CPU on a single chip. A microcontroller is basically a computer on a chip, so the CPU, memory, etc. are all on a single chip.

I learnt a lot from experimenting with this architecture and the 4917 (and its successors). Though when I began to write more complex programs I found myself constantly putting a GOTO instruction number n command at the very begining, and using that skiped space as memory for data. I also came to see that this architecture allows for the buffer overflow vulnerability as program data can be executed as instructions if there are vulnerability. These two observations tend to lead to the Harvard architecture (which I am new to).


This is the architecture used for the Atmel AVR microprocessor, which is what we will focus on in this course (I think). I'll come back to this when I talk about address space.

Computer Memory

This is the computer memory hierarchy diagram from the lecture notes.


This helps put a lot of my misunderstandings of how the 4917 from COMP1917 relates to modern processors, as I didn't quite see if the instructions and data were stored in RAM or on the CPU. But really this is just an implementation issue so it didn't really matter to the 4917.

In the CPU there are registers which are really fast, but there are few (apparently in x86 (eg. Pentium, Core 2) there are only 8 integer and 8 floating point registers). Then there is the cache (on chip memory), this is what that 4MB cache that I see advertised that my CPU has. This cache can be separated for program code and data. This is faster that reading from RAM (the off chip memory), so currently active program code is moved to here from the RAM when necessary to speed things up (this is apparently implemented on the hardware level (which is always nice for a software developer ;) ). Then there is off-chip memory and auxiliary storage. This fits in nicely with the picture I get when I open up my computer.

The material on the types of RAM and ROM in the lecture notes needs no further commentary, so I'll skip that part.

ISA (Instruction Set Architecture)

"ISA is the interface between hardware and software

This sums up the situation nicely, and makes perfect sense.

There are 4 things that make up memory instruction set architectures,

RISC - Reduced Instruction Set Computer CISC - Complex Instruction Set Computer

Atmel AVR is RISC.

Number Systems

Problem: How do we represent negative numbers in a computer? 4 main methods (from Wikipedia).



It is important to know that the hardware does all arithmetic in 2's compliment. It is up to the programmer to interpret the number as signed or unsigned.

To convert a number from 2's compliment, for example -45 in 2's compliment is 11010011, we can do something like this,

$latex 1 \times (-2^7) + 1 \times 2^6 + 1 \times 2^6 + 0 \times 2^5 +1 \times 2^4 + 0 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 1 \times 2^0 = -45$

To go the other way from say -1 to the 2's compliment form 11111111 we use that $latex 2^p - X$ formula. I'm not exactly sure how its supposed to work so I've hacked it to make it work.

If the number you wish to convert is negative, let $latex X = -n$, so that X is positive then take $latex 2^p$ where p is the number of bits you are using (say 8), then subtract X. If the number to convert is less than $latex 2^p$ (where p is the number of bits, say 8 ) then leave it as is and that in your 2's compliment.

Now that was complicated. But its the only way I can get that advertised $latex 2^p - X$ formula to work with the given set of sample data (as in that table above).

This raises an issue for comparison operations. eg. Is 1111 1100two > 0000 0010two? If those two numbers are unsigned then YES, if they are signed thin NO. As such the hardware uses the S flag in AVR to indicate the result of the signed comparison, and the C flag to indicate the result of unsigned comparison.

Representing Strings. We saw one method back in COMP1917 though its nice to see that the other methods that come to mind were used also.

  1. 1st position of string reserved for length (Pascal)
  2. an accompanying variable has the length of the string (as in a structure)
  3. last position of string marked with a special character (NULL in C)

Sign Extension

How do we extend a binary number of m bits to an equivalent binary number of m + n bits? If the number is positive add n 0's to the most significant side (usually left) of the binary number. If the number is negative add n 1's to the most significant side of the binary number. This is called sign extension. To add twe binary numbers of different lengths just sign extend the shorter one so that both numbers have the same bit length then add as usual.


Wu, Hui. COMP2121 09s1 Lecture Notes.

(Updated 20th June 2009 with explanation of 2's compliment conversions and sign extension)

Tags: comp2121, computing.
Some thoughts on Feed Readers
3rd March 2009

I currently use Google Reader to subscribe to and read RSS and Atom feeds. It has pros and cons.

One of the great things about Google Reader is that its entirely in the cloud. I can access it wherever I am, whichever operating system I'm using. Also as its run on Google's servers, they pay the bandwidth bill not me. I subscribe to lots of feeds (181 as of now) but I don't read them all all the time. If I don't want to read some high volume feeds I don't open them and they don't get downloaded to my machine. This also means that its Google that checks the feed for updates all the time, not my machine. And that's the beauty of using a cloud based feed reader. Google only has to download the feed 1 time for all the subscribers of that feed in Reader, not once for each user.

This is the sole reason I use a cloud based reader, and this pro is too great to get me to switch. Unfortunately there are some things that I don't like about reader and some things I wish I could add. That's the bad thing about using a cloud service like this, I cannot edit the code for my needs. Open source desktop applications can, but with Google's cloud services you cannot.

Its good that in Reader you can download a list of your subscribed feeds, but I wish I could also download the feed items themselves in bulk. There is a copyright issue here that being, from what I've experienced when you subscribe to a new feed in Reader you can view the past postings going back probably to when it was first indexed, despite these items no longer being in the feed. I want to be able to download those too. I guess since you are already downloading them when you view them this is the same, so its not a new issue.

This is turning into a debate about cloud services vs. desktop software, so I'll stop here.

Tags: computing, google.

RSS Feed