Having battled through this in Matlab, doing it the Python way is extremely easy on any platform. One makes use of Mencoder, which we can just call using os.system. Here’s an idiom, culled mostly from the scipy wiki:

from pylab import *
import os
# set the interactive mode of pylab to ON
ion()
# opens a new figure to plot into
fig_hndl = figure()
# make an empty list into which we'll append
# the filenames of the PNGs that compose each
# frame.
files=[]	
# filename for the name of the resulting movie
filename = 'animation'
number_of_frames = 100
for t in range(number_of_frames):
	# draw the frame
	imshow(rand(100,100))
	# form a filename
	fname = '_tmp%03d.png'%t
	# save the frame
	savefig(fname)
	# append the filename to the list
	files.append(fname)
# call mencoder	
os.system("mencoder 'mf://_tmp*.png' -mf type=png:fps=10 
	-ovc lavc -lavcopts vcodec=wmv2 -oac copy 
	-o " + filename + ".mpg")
# cleanup
for fname in files: os.remove(fname)

Depending on how complicated the plot is, this can take anywhere from a while to ages and ages. It’s good quality, though! You can muck about with the frame rate in the call to encoder where it says fps=10. Change that 10 to something else (30 seems popular!). Also there are probably better codecs than wmv2!

Having written this, it strikes me we could probably do all this in Matlab in the same way, rather than trying to make movies using Matlab’s builtins…

Enumerate in Python

January 26, 2009

If you need to take an element of an array, do something to it, and store it in a new array you have some options. You could do this:

a = arange(10)
b = empty(10)
f = lambda x: x**2 #  or whatever
i = 0
for ai in a:
	b[i] = f(ai)
	i += 1

I think this is ugly because the i+=1 reminds me too much of Matlab. Maybe a bit better is:

a = arange(10)
b = empty(10)
f = lambda x: x**2
for i in range(len(a)):
	b[i] = f(a[i])

This is better, but I’m sad not to be iterating through a, and think the range(len(a)) construct is clumsy. We can have the best of both worlds using enumerate:

a = arange(10)
b = empty(10)
f = lambda x: x**2
for i,ai in enumerate(a):
	b[i] = f(ai)

Note that for these simple examples you could use b = [f(ai) for ai in a] (which works if a is any iterable object, like a list or a tuple) or simply b = f(a) (which works when a is a numpy array like above). Using enumerate often comes in handy for more complicated projects! For example, using the annoyingly indexed subplot command:

# X is a list of vectors
X = [rand(100) for i in range(10)]

for i,x in enumerate(X):
	# subplot is indexed using 1-indexing
	subplot(len(X),1,i+1) 
	plot(x)

Assertions in Python

January 26, 2009

Keeping track of your variables gets weirdly difficult with even vaguely large projects. Especially when worrying about things like integer division (try 1/2) which can seriously affect your results. So I’ve been getting into assertions. These are little checks that represent your understanding of the variable. So if you have a matrix that you know is square then we can encode this understanding before we try to operate on the matrix. For example, consider some function isposdef() acting on a square matrix A:

def isposdef(A):
	
	assert type(A) is matrix
	assert A.shape[0] == A.shape[1]
	
	...

They only take seconds to write but they can save you hours and hours of debugging. I’m trying to get into the habit of writing assertions on every single argument that is passed into a function and on every variable that’s returned by a function. They’re falling into four categories:

  • type assertions: making sure everything is the correct type
  • shape assertions: for matrices mostly, to enforce things like being column vectors
  • consistency assertions: again for matrices – making sure everything’s the correct size with respect to one another
  • value assertions: making sure values are within an allowable range, for example that a particular matrix has negative eigenvalues.

One thing to note is that you shouldn’t be using assertions to test user input. These are for (‘preventative’) debugging – if you know something about a variable then you should express it. If you need to check for user input, proper checks should be implemented. A funky thing about assertions is that you can turn all of them off for production code! So you don’t need to worry about writing millions of assertions – they’re not going to slow you down any in the end.

Along these lines, then, here are two little matrix tests:

def iscolvector(x):
	assert type(x) is matrix
	return x.shape[1] == 1
	
def issquarematrix(X):
	assert type(X) is matrix
	return X.shape[0] == X.shape[1]

Iterators and Exceptions

January 26, 2009

Here’s the situation. I have a list of symbol objects, and each object has an associated test method. The test method accepts a piece of a time series and asks whether or not the symbol associated with that object can represent this piece of a time series. If the symbol can represent that piece of time series, the test returns True, otherwise False.

Now say I have 100 different symbols in a list called symbols – how do I find which symbol corresponds to a particular piece of a time series? Simple: I check each one and stop checking once I’ve found the correct symbol. Here’s a little idiom to do that:

symbol_iter = iter(symbols)
still_looking = True
try:
	while still_looking:
		symbol = symbol_iter.next()
		if symbol.test(timeseries):
			name = symbol.name
			still_looking = False
except StopIteration: 
	raise ValueError("didn't find an appropriate symbol")

So I take my list of symbols and turn it into an iterator using iter(). This means that when I call next() as a method of the iterator, it will give me the next value in the list, or it will raise a StopIteration exception. So all I need to do is keep calling the next() method until I find the symbol. When I find it I do whatever I want to do (in this case store the name of the symbol) and halt the while loop. If for whatever reason it reaches the end of the list of symbols without having found the appropriate symbol it will throw a StopIteration which I catch and turn into a ValueError.

For a list of all the built-in exceptions see here.

Python Notes

January 26, 2009

The next few posts are going to be more little things about Python that I’m moving from my old site… I think it’s good that they live here where I can search them and so on.

I often work with lists of arrays or matrices. For example if I have a state sequence X = {x_1, x_2 \ldots x_T}, it’s often useful to store each state as a numpy.matrix so I can do fun things like write

for x in X:
    outer_product = x * x.T

which pleases me. Sometimes I need to initialise these lists and I keep tripping up on the same problem. I can think of two ways to initialise a list of arrays:

[pylab.empty(n,m)] * T
[pylab.empty(n,m) for t in range(T)]

Way number one is BAD and doesn’t work quite how I’d expect. What happens is that the array is created and then repeated T times. This is bad because each element of the resulting list is simply a pointer at a single array. So whenever you change any element of the list it changes EVERY element of the list. Gah!

Way number two is better! Here the list comprehension makes T empty matrices, which you can then edit to your heart’s content. Hurrah!

Hello World

January 23, 2009

Howdy!

I’m here to blog about my adventures in dynamic modelling and machine learning. I’ve got a PhD in spatiotemporal modelling, and I’ve been post-docing for a year and a half studying behaviour in a couple of different contexts. This work has covered quite a broad range of techniques in dynamic modelling, and has introduced me to some new ways of coding (mostly involving Python). So I thought it might be useful to write all this stuff down.