More Segment HMMs

April 23, 2009

It’s weird how you can muck about with structures for so long and still forget what the hell you’re doing all the time. This post is supposed to try and clarify segment HMMs a bit for me, so that I can approach the learning problem with a clear mind. It’s all very specific; and it’s all below the fold.

I’m going to define the model again. Notation is important:

  • p(Q_{t+l}=q_j |Q_t= q_i): state transition probability
  • p(L_t=l|Q_t=q): duration density
  • p(y_t,\ldots, y_{t+l-1}|Q_t=q,L_t=l): segment density
  • p(Q_0): initial state distribution

Some things to note.

  1. The state transition probability doesn’t really care how long the segment associated with q_i is. This means that the transition is invariant under changes to l, which kind of means the whole thing is still Markovian, though this is enough for people to start using terms like ‘semi-Markovian’, which is kind of confusing.
  2. The duration of the state at time t is dependent on the state at time t. This implies that:
  3. The segment starts at time t and lasts for l time steps. This means that the last time index in the ‘current’ segment is t+l-1.
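To keep the generative story straight in my head, here is a minimal sampling sketch of the model as defined above. Everything concrete in it is an assumption made up for illustration: the names `sample_segment` and `sample_segment_hmm`, the two-state parameters, and in particular the random-walk segment density, which is just one easy way to get observations that are dependent within a segment.

```python
import numpy as np

def sample_segment(q, l, rng, means=(0.0, 5.0)):
    # Hypothetical segment density: a random walk around a state-specific
    # mean, so the l observations within a segment are NOT independent.
    return means[q] + np.cumsum(rng.normal(scale=0.1, size=l))

def sample_segment_hmm(pi, A, dur, T, rng):
    """Sample T frames; dur[q, l-1] plays the role of p(L_t=l | Q_t=q)."""
    pi, A, dur = (np.asarray(x, dtype=float) for x in (pi, A, dur))
    q = int(rng.choice(len(pi), p=pi))            # draw from p(Q_0)
    states, obs = [], []
    while len(obs) < T:
        # Duration depends only on the current state; durations are 1..L_max.
        l = 1 + int(rng.choice(dur.shape[1], p=dur[q]))
        # The segment covers time indices t, ..., t+l-1.
        obs.extend(sample_segment(q, l, rng))
        states.extend([q] * l)
        # The next state does not depend on l.
        q = int(rng.choice(A.shape[1], p=A[q]))
    # The last segment may overshoot T, so truncate.
    return np.array(states[:T]), np.array(obs[:T])

# Made-up parameters, purely for illustration.
rng = np.random.default_rng(0)
pi = [0.5, 0.5]
A = [[0.0, 1.0], [1.0, 0.0]]                      # always switch state
dur = [[0.1, 0.3, 0.6], [0.5, 0.5, 0.0]]          # durations 1, 2, 3
states, obs = sample_segment_hmm(pi, A, dur, T=20, rng=rng)
```

Note that the state only gets resampled once per segment, not once per observation, which is exactly the "two ideas of time" business.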

What we’d like to do is write this out as a properly specified graphical model, but it turns out that this is pretty hard to do for the general case. As a first crack I tried introducing a counting variable C_t. Unfortunately it doesn’t quite make it, but it was a useful learning experience. So, with a counting variable we can write the alternative model out as follows:

  1. p(C_t=c_j|C_{t-1}=c_i,Q_t=q_j): counting variable transition distribution
  2. p(Q_{t}=q_j|Q_{t-1}=q_i,C_{t-1}=c_i): state transition probability conditioned on the counting variable at the previous time point
  3. p(y_t,\ldots, y_{t+l-1}|Q_t=q,C_t=l): segment density
  4. p(Q_0): initial state distribution

The counting variable either decrements or resets, such that p(C_t=c_j|C_{t-1}=c_i,Q_t=q_j) = \delta(c_j, c_i-1), ~ c_i > 1 (i.e. the counter moves to c_i-1 with probability 1) and p(C_t=c_j|C_{t-1}=c_i,Q_t=q_j) = p(C_t=c_j|Q_t=q_j), ~ c_i = 1. Here p(C_t=c_j|Q_t=q_j) is simply the duration density, such that when the counter resets it starts off at a new duration.

Similarly, the state variable either stays the same or transitions depending on the counting variable. So p(Q_{t}=q_j|Q_{t-1}=q_i,C_{t-1}=c_i) = \delta(q_j, q_i), ~ c_i > 1 (i.e. the state is copied with probability 1) and p(Q_{t}=q_j|Q_{t-1}=q_i,C_{t-1}=c_i) = p(Q_{t}=q_j|Q_{t-1}=q_i), ~ c_i = 1.
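One way to sanity-check these two conditionals is to multiply them into a single frame-level transition matrix over the pair (Q_t, C_t) and confirm that every row sums to one. The sketch below does that; the function name `joint_transition`, the flat indexing scheme and the toy parameters are all my own assumptions, not anything standard.

```python
import numpy as np

def joint_transition(A, dur):
    """Frame-level transition over (state, counter) pairs.

    A[i, j]     : p(Q_t=q_j | Q_{t-1}=q_i), used only when the counter is 1
    dur[j, c-1] : p(C_t=c | Q_t=q_j), the duration density used on reset
    The pair (q, c) is stored at flat index q * D + (c - 1).
    """
    A, dur = np.asarray(A, dtype=float), np.asarray(dur, dtype=float)
    K, D = dur.shape                      # K states, counter values 1..D
    P = np.zeros((K, D, K, D))
    for qi in range(K):
        for ci in range(1, D + 1):
            if ci > 1:
                # Counter decrements and the state is copied (both deltas).
                P[qi, ci - 1, qi, ci - 2] = 1.0
            else:
                # Counter is 1: the state transitions via A, then the
                # counter resets from the new state's duration density.
                P[qi, 0] = A[qi][:, None] * dur
    return P.reshape(K * D, K * D)

# Made-up toy parameters: two states, durations 1..3.
A = [[0.0, 1.0], [1.0, 0.0]]
dur = [[0.1, 0.3, 0.6], [0.5, 0.5, 0.0]]
P = joint_transition(A, dur)
print(np.allclose(P.sum(axis=1), 1.0))
```

This only covers the hidden part of the model. It says nothing about the segment density, which is where the trouble actually is.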

So we’ve swapped some awkwardness in having two ideas of time’s passage (one for the underlying state and one for the observations) for some awkwardness in defining these conditional densities.

As it stands, though, we’re still stuck with a variable topology: the introduction of the counting variable hasn’t really changed the fact that the segment density changes dimension depending on l. If each observation were independent given the state, then the counting variable would be all we need to get a nice regular graph.
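To spell out what that independence assumption would buy us: the segment density would factorise into per-frame terms, each of which hangs off a single state node, so the graph no longer needs to know l. In the notation above (with Q_s = q held fixed throughout the segment, which is my reading of the counter construction):

```latex
p(y_t, \ldots, y_{t+l-1} \mid Q_t = q, L_t = l)
  = \prod_{s=t}^{t+l-1} p(y_s \mid Q_s = q)
```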

The observations aren’t independent though! This is the big difference between Segment HMMs and what are known as explicit-duration HMMs. Hence to be able to draw a graph we must make some assumptions on the form of the output density. Murphy actually states this in one of his technical reports, but it seems to have taken a couple of months for it to actually sink in. Next step, then, is a graphical model for a pth order explicit duration switching AR model! I think I’ll save that for the next post.

