Marlin’s Gottahack
– 1998 Oct 12
Focus or lack thereof
--------------------------
In case you are wondering if Marlin has just slipped off into reliving the days when he was teaching a CS Algorithms course, well, you are partly right. However, there is a reason that I am hacking through all of this seemingly random stuff. It was and is my hope to use this newsletter as a forum for discussion of Computer Music Composition. Since everyone on the mailing list comes from a different background with respect to both CS and Music, I am attempting to go over all the background material, algorithms, and musical terms so that we have a common vocabulary for discussing this stuff. I want to know that everyone on the mailing list knows what Kohonen nets, Viterbi algorithms, and blackboard systems are.
I know that some of
you were involved with the handwriting recognition project and are thus quite
familiar with some of these CS techniques that are useful in the AI domain. I
hope that you will bear with me as I go back through all of this stuff.
The reason that you are seeing more CS than music so far is that I am culling through previously half-written CS class notes and articles and editing them up for the newsletter. I am also doing some writing from scratch and expect to be doing a lot more of that. However, there should be no doubt in your mind that I am using this newsletter as a tool to force myself to clean up a lot of the junk that has collected on my hard drive over the last decade or so. I can assure you that the week will come when I just can't think of anything to write and I will dip into my bin of poetry and fiction and fob some of that off on you.
I was inspired in
this direction by one of the talks given at Microsoft Research. They invited in
this dude, Dr. Manfred Clynes, to give a talk on his enhanced music playback
system. He is in the middle of giving his talk to about 50 people when he says,
"Hey, I want to read you this poem that I wrote while I was in
Japan."
Suddenly there we
are in the twilight zone, "time - The fourth dimension - evolution
..." The audience fidgets wondering how long this is going to last.
I mean, why not? He
had his captive audience, it was his talk, why not just launch into, "Hey
check out this tune," or, "Here's some photos of me and the dog from
our family vacation a few summers ago." It made me realize how constrained many of the forums that we frequent really are.
We are not so
constrained here. Caveat Emptor.
Dynamic Programming - An intro
------------------------------------------
Dynamic programming is a name attached to a combination of two techniques (not algorithms) that help convert some minimization problems from a potentially exponential run time into a polynomial run time, and sometimes even into a linear one.
These tricks only work on a certain class of problems (problems where the minimum of some global big thing is obtained from a simple combination of minimums for a few small things, so that a divide and conquer can work), but it is a large and useful class, so the tricks are worth learning.
One classical minimization problem that dynamic programming solves is approximate string matching. In this problem, you assign a cost function (penalty) for every letter mismatch that occurs, another cost function for every letter that occurred in the search string but not in the text, and yet another cost function for every letter that occurred in the text but did not have a matching letter in the search string. What you are trying to do is find the alignment that minimizes the total penalty that you incur. At first glance, this problem is combinatorially hard, since you can line up nearly any letter with nearly any other letter, stretching the search string over any number of letters in the text string, or skipping letters in the search string. However, as we will see later, it can actually be done in a manner that is not much more computationally expensive than standard exact string matching.
One last word on approximate string matching before we go on. The string-matching problem is particularly important not only for its use in dealing with actual matching of text, but because many recognition problems, such as speech or handwriting, can be cast as approximate string matching problems. The problem of aligning a human-played MIDI file with a score, or searching through a tune database for a not entirely correctly remembered melody, are other examples.
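Just to make the cost model concrete before moving on, here is a minimal sketch of it in the same VB style as the rest of this issue. It assumes the simplest possible penalties - a flat 1 for a mismatch and 1 for a dropped letter on either side - and the function name and layout are mine, not part of the classic statement; think of it as a preview of the fuller treatment promised at the end of this issue.
' pen(i, j) is the cheapest penalty for aligning the first i letters of s
' against the first j letters of t, assuming unit costs everywhere.
Function MatchPenalty(s As String, t As String)
    Dim i, j, miss
    Dim pen()
    ReDim pen(Len(s), Len(t))
    For i = 0 To Len(s): pen(i, 0) = i: Next i   ' drop leading letters of s
    For j = 0 To Len(t): pen(0, j) = j: Next j   ' drop leading letters of t
    For i = 1 To Len(s)
        For j = 1 To Len(t)
            If Mid$(s, i, 1) = Mid$(t, j, 1) Then miss = 0 Else miss = 1
            pen(i, j) = pen(i - 1, j - 1) + miss                              ' line the letters up
            If pen(i - 1, j) + 1 < pen(i, j) Then pen(i, j) = pen(i - 1, j) + 1 ' skip a letter of s
            If pen(i, j - 1) + 1 < pen(i, j) Then pen(i, j) = pen(i, j - 1) + 1 ' skip a letter of t
        Next j
    Next i
    MatchPenalty = pen(Len(s), Len(t))
End Function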
Before we tackle the classic problem we will take a brief digression and show how the techniques of dynamic programming can be formally applied to the "stupid" exponential-runtime version of the Fibonacci algorithm to get the standard linear-runtime algorithm. This is overkill for the Fibonacci problem, but it illustrates the techniques and helps render them obvious. The Fibonacci problem is not a minimization problem, but the transform involves the two essential steps of dynamic programming.
Technique 1: Given a recursive problem, where F(big problem) is defined in terms of F(little problem) combined with F(little problem), you often end up solving the same little problem over and over. You can typically eliminate the recursion, use an array to store the results of the subproblems, and just reuse the answers.
Technique 2: Given a large array of subproblems that you need to solve in order to solve the big problem, it may well be that by reorganizing the order in which you do things you can use a small array to do the same job. That is, you may find that once you have used the results of the small subproblems to solve some mid-size problem, you no longer need the small results, so you can recycle the space they were using.
In summary, write
the recursive function, then convert all the function calls to array calls and
then reorder what you are doing to minimize the size of the array that you need
to use.
Here is the recursive Fib program:
function Fib(n)
    if n > 1 then
        Fib = Fib(n-1) + Fib(n-2)
    else
        Fib = 1
    endif
end function
Technique 1 says convert the problem from recursion to using an array. Since array values for large n are computed from array values at indices smaller than n, we will fill the array from 0 on up.
n = 100 ' or some big number
Dim Fib(n)
Fib(0) = 1
Fib(1) = 1
For i = 2 to n
    Fib(i) = Fib(i-1) + Fib(i-2)
next i
Sho nuff, no array element is used before it has been filled. We have converted the exponential problem to a linear one. However, it still uses a lot of space. Technique 2 says to observe that we never need an element more than 2 back, so we can crank the array size down to two elements.
Dim Fib(2)
Fib(0 mod 2) = 1
Fib(1 mod 2) = 1
For i = 2 to n
    Fib(i mod 2) = Fib((i - 1) mod 2) + Fib((i - 2) mod 2)
next i
You can eliminate the mod 2 using conventional techniques to toggle values back and forth between a pair of variables if you want; a sketch of that is below.
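For what it is worth, the toggled version usually ends up looking something like this (same Fib numbering as above; nothing deep, just the usual idiom):
n = 100 ' or some big number, as before
a = 1: b = 1            ' Fib(0) and Fib(1)
For i = 2 To n
    t = a + b           ' the next term
    a = b               ' slide the window forward
    b = t
Next i
' b now holds Fib(n)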
So we have used the
two techniques to convert from recursion to arrays then to trim down the array
size.
One last point on the general method. In the Fib example I went immediately from the recursive function to the array-based one that uses FOR loops. It may not always be obvious how you are supposed to do that with a more gnarly recursive function. The difficulty is often in figuring out what order to use to fill in the arrays. There is actually a very easy reformulation of the recursive Fib problem that technically still uses recursion but in reality shortcuts to using arrays. The advantage of this is that the transformation from the exponentially slow to the polynomially fast algorithm becomes a simple syntactic change. So here is Fib once again:
Dim FibA(1000) ' this is the array that shortcuts the recursion
' rename the original recursive function to RFib so that it no longer directly calls itself
function RFib(n)
    if n > 1 then
        RFib = Fib(n-1) + Fib(n-2)
    else
        RFib = 1
    endif
end function
' now write the new function that shortcuts the recursion with the array
function Fib(n)
    if FibA(n) = 0 then ' I use the fact that Fib is never 0 to flag uncomputed elements
        FibA(n) = RFib(n) ' compute new value using the real Fib then stuff it into the array
    endif
    Fib = FibA(n) ' now I know that FibA(n) has a value so return it
end function
and finally be careful to initialize the array before you call Fib at the top level:
For i = 1 to 1000: FibA(i) = 0: next i
This allows you to
hack out the recursive function. Get it working for small n and then just wrap
the old function with a layer that shortcuts to the array. This is not as
efficient as what you get by a total rewrite that eliminates all the function
overhead. But it is way easier to do, way easier to read, and let's face it,
when yer just screwing around solving some combinatorial puzzle like the Brozz
eggs who cares about function overhead.
Solution to the Brozz Egg Problem
-----------------------------------------
As I said before, there was a puzzle alias at the soft and this one made its rounds. The debate that went around was all about whether binary search was right or not. I hacked out the dynamic programming solution and showed that binary search was NOT correct. (Though to be fair it is not very far from being right.)
I resurrected the puzzle for last week's newsletter so that I could launch this discussion of dynamic programming, and the puzzle once again leaked out into MS email and had dozens of people working on it.
I was shocked to
discover that this time the MS alias had quickly converged on a totally
different solution from any that were discussed the first time the problem went
around. Now it is possible that I gave the problem wrong in this second pass
but I don't think so.
So first, the latest and greatest solution: fill the boiling kettle with 60 eggs. Pull one out every minute until you get the perfect egg. Immediately note the time and dump out all the water; you now have several perfectly cooked eggs and know exactly how long it takes to cook them. Expected time to solution = proper cooking time.
Piece of cake! No programming, let alone dynamic programming, necessary.
So I will now restate the problem in such a way that you get to use the DP tools I just showed you.
Consider a binary tree with N nodes whose values are the numbers from 1 to N. The tree is arranged so that an in-order traversal of the binary tree prints out the numbers from 1 to N in order. (In other words, if a node has the value X, then that node's left subtree only has values less than X and that node's right subtree only has values greater than X.) Now, there are many such trees. We define the cost to a node to be the sum of the values of all its ancestor nodes plus the value in the node itself. (This is the time it would take to cook each parent egg, discover whether it is raw or overcooked, make the binary decision, and move on to the new cooking time.) We are looking for the tree that minimizes the average cost to a node. Since all the trees have exactly N nodes, minimizing the average cost is the same as minimizing the total cost.
Now that you all are DP experts, the code for this is pretty simple. You need to be able to find the optimal layout for the nodes from a to b. You do this by looking at each possible split point, seeing what the cost is for that split, and picking the best split point. Here is the pseudocode:
Function OptimalCost(a, b)
    if a = b then ' first we do the trivial case
        OptimalCost = a
    elseif a > b then ' this just stops the recursion
        OptimalCost = 0
    else ' a < b - the normal case
        BestSoFar = Cost of split at a
        BestSplit = a
        For I = a + 1 to b
            NewCost = Cost of split at I - this will be recursive, calling OptimalCost
            If NewCost < BestSoFar then ' we have a better solution
                BestSoFar = NewCost
                BestSplit = I
            Endif
        Next I
        OptimalCost = BestSoFar
        Save BestSplit somewhere so we can reconstruct the optimal tree
    endif
End function
The handwaving here is: what is the cost of a tree from a to b split at some I? There are a total of (b - a + 1) nodes in this tree. The top of the tree has value I. You must pay the cost of I not only for the node I itself, but also for every single child and grandchild of I. Thus the total cost of the tree from a to b is:
Cost(a, b, I) = (b - a + 1) * I + OptimalCost(a, I - 1) + OptimalCost(I + 1, b)
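For instance, splitting the numbers 1 to 7 at 4 gives Cost(1, 7, 4) = 7 * 4 + OptimalCost(1, 3) + OptimalCost(5, 7) = 28 + 10 + 30 = 68, which happens to be the optimal total for 1 to 7; more than one split achieves it, as we will see in the Results section.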
The other piece of handwaving that often causes trouble in dynamic programming is saving the best split. You see, it is rare that all you care about is the one optimal value. What you usually want as well is the optimal set of decisions that led to the optimal value. What you typically need to do is keep another array that, instead of holding the optimal values that drive the recursive function, keeps track of the decision that was made at each point. You then walk this array to reconstruct the actual path through the problem graph that led to the optimum value.
Before I show the real code, I will remark on one last sleazy trick. Normally I would need two arrays, one to keep track of the optimum value and one to keep track of the split point that led to that optimum value. However, note that I only care about the function for cases where a is less than b. This means that I only use the upper triangular portion of a square array. Since I can fit two triangles into a square array, I will pack both of the arrays that I need into one. I create one array, bsa() - BestScoreArray - and use the following convention: given the problem from a to b where a < b, I will store the optimum value at bsa(a, b) and I will store the split point at bsa(b, a).
Here is the final VB code, including some really ratty code to print out a tree-structure picture of the trees so that you can eyeball the shape of the tree.
It may appear that I made no effort to do step two of the DP process, that is, no effort to reduce the size of the matrix needed. This is not quite true. I spent many days trying to find some order in which I could fill the array so that I could toss out some of the information once it was no longer needed. Great effort expended but no success. I convinced myself that there is no size reduction possible in this problem. Not all problems allow you to use technique 1 and not all problems allow you to use technique 2. Actual mileage may vary.
Brozz Egg VB code
------------------------
Const n = 60
Dim bsa(n, n)

Private Sub Command1_Click() ' I just hard wire a button to calculate BestScore(1,60)
    Cls
    Print "clearing array"
    For i = 1 To n: For j = 1 To n
        bsa(i, j) = 0
    Next j: Next i
    Print "doin it"
    t = BestScore(1, n)
    Print "done"
End Sub

Public Function BestScore(a, b)
    If a > b Then
        BestScore = 0
    ElseIf a = b Then
        BestScore = a
        bsa(a, b) = a
    Else ' a < b
        If bsa(a, b) <> 0 Then
            BestScore = bsa(a, b)
        Else
            bs = (b - a + 1) * a + BestScore(a + 1, b) ' score for split at a
            bi = a
            For i = a + 1 To b
                newscore = i * (b - a + 1) + BestScore(a, i - 1) + BestScore(i + 1, b)
                If newscore < bs Then
                    bs = newscore
                    bi = i
                End If
            Next i
            BestScore = bs
            bsa(a, b) = bs ' save the cost
            bsa(b, a) = bi ' save the split point
        End If
    End If
End Function

Private Sub Command2_Click() ' will print the sub tree from a to b
    a = Val(Text1.Text)
    b = Val(Text2.Text)
    Cls
    For i = a To b
        t$ = Str$(i)
        ta = a: tb = b
        While bsa(tb, ta) <> i
            t$ = " " & t$
            'Print "("; ta; ","; tb; ")="; bsa(tb, ta);
            v = bsa(tb, ta)
            If v < i Then
                ta = v + 1
            End If
            If v > i Then
                tb = v - 1
            End If
        Wend
        Print t$
    Next i
End Sub
I think I will save
the approximate string match for another time.
Results
-------
If you actually
hack in this code and look at the trees you will find that I was totally off
base with the hint that I gave when I introduced the problem. Such is the
nature of failing memory. I remembered that the tree was skewed from a simple
balanced binary tree, but I had the skew in exactly the wrong direction. The
skew is actually toward the shorter time eggs.
An example in pictures is worth a thousand words.
Here is the perfectly balanced tree for the numbers from 1 to 7:

      4
  2       6
1   3   5   7

Here is a highly skewed one that has an identical total cost:

1
      4
  2       6
    3   5   7
In the balanced tree it costs you 4 + 2 + 1 = 7 points to get to the 1. In the skewed tree it costs you only 1 to get to the 1, but you have added the cost of 1 to the six other paths. So in the skewed tree you have shaved 6 points off the 1-path and added 1 point to each of the six other paths, and the result is two very differently shaped trees with an identical total cost (68 in both cases).
As you go up to higher numbers, and particularly as you approach 2^k - 1 nodes (which is what allows a perfectly balanced tree), you find that just before you get to perfectly balanced you can win by skewing a small number up to the top of the tree, adding that small cost to every long path and simplifying the left subtree. The optimal split point seems to vary back and forth from about 1/3 to 1/2 depending on how close you are to a power of 2. Here are the split points for the first several numbers (n, then the optimal split point):
1 1
2 1
3 1
4 1
5 2
6 3
7 1
8 2
9 3
10 3
11 4
12 5
13 6
14 7
15 1
16 3
17 4
18 5
19 6
20 6
21 7
22 7
23 8
24 9
25 10
So you see, just before one of those 2^k - 1 counts, like 14, the tree is just about balanced with a split point midway at 7, but right at 2^k - 1 (7, 15, and so on) it skews. Just for the record, the problem that I gave you to solve was the problem from 1 to 60, and 60 just happens to be just before the skew dislocation that happens at 63, and thus, to add insult to injury, the actual shape of the optimal tree for 60 is indistinguishable from a balanced binary tree, contrary to my hint.
OK, so I lied. That is just part of the thrills of a hacked-out newsletter as opposed to one of those boring old peer-reviewed journals.
Beating Fib to death
-------------------------
A quick digression for those who have not seen this trick: What is the runtime of the recursive Fib algorithm? It is exponential. It takes on the order of Fib(n) steps to calculate Fib(n). What is the runtime of the one we listed above that just uses an array? It is linear. It takes n steps to calculate Fib(n). Way faster. Is it the best we can do? Nope, not even close. Here is the order lg(n) algorithm that kicks face.
Consider the following 2x2 matrix M:
0 1
1 1
If you multiply the vector (a, b) by that matrix you get the vector (b, a+b). Pretty cool, no? If a and b happen to be two successive terms in the Fibonacci series, the matrix shifts you over one more in the series and gives you the next term. If I multiplied once more I would have (a+b, a+b+b). So if I want to know what the series looks like N terms out, I can evaluate the vector equation
(1, 1) * M^N
Now the speedup comes because we can raise a matrix to integer powers very fast. The trick is to notice that you can multiply M by itself to get M^2. You can multiply M^2 by itself to get M^4. So you can quickly crank out M raised to any power of 2, and then, using the binary representation of N, you can pick out which of those powers you need to multiply together to get M^N. So you crank out M^N in lg(N) steps, hit the vector with the matrix, and ta da!
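Here is a sketch of what that looks like, in the same VB style as the rest of this issue. One hedge: rather than working mod 32 bits (the next paragraph explains why you end up working mod something), this toy works mod 10000 so that every intermediate product fits comfortably in a 32-bit Long; the constant and the variable names are mine.
Const md = 10000 ' work mod 10^4 so every product below fits in a Long

' Fib(N) Mod md in about lg(N) matrix multiplies.
' The 2x2 matrices live in scalars: m00 m01 / m10 m11 is the running
' square of M, and p00 p01 / p10 p11 is the accumulated power of M.
Function FastFib(ByVal N As Long) As Long
    Dim m00 As Long, m01 As Long, m10 As Long, m11 As Long
    Dim p00 As Long, p01 As Long, p10 As Long, p11 As Long
    Dim t00 As Long, t01 As Long, t10 As Long, t11 As Long
    m00 = 0: m01 = 1: m10 = 1: m11 = 1 ' this is M
    p00 = 1: p01 = 0: p10 = 0: p11 = 1 ' this is the identity, M^0
    Do While N > 0
        If (N And 1) = 1 Then ' this power of 2 appears in the binary form of N
            t00 = (p00 * m00 + p01 * m10) Mod md
            t01 = (p00 * m01 + p01 * m11) Mod md
            t10 = (p10 * m00 + p11 * m10) Mod md
            t11 = (p10 * m01 + p11 * m11) Mod md
            p00 = t00: p01 = t01: p10 = t10: p11 = t11
        End If
        t00 = (m00 * m00 + m01 * m10) Mod md ' square M for the next bit
        t01 = (m00 * m01 + m01 * m11) Mod md
        t10 = (m10 * m00 + m11 * m10) Mod md
        t11 = (m10 * m01 + m11 * m11) Mod md
        m00 = t00: m01 = t01: m10 = t10: m11 = t11
        N = N \ 2
    Loop
    ' (1,1) * M^N - the first component is Fib(N) in the numbering used above
    FastFib = (p00 + p10) Mod md
End Function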
If we ignore the problem of the size of integer storage - and really we must, since the numbers get real long real fast, i.e. we change the problem from evaluating Fib(n), which has too many digits to write down for any decent-sized n, to evaluating Fib(n) mod 32 bits - and we set you the problem of calculating Fib(billion), you get that wonderful spread of execution times: the recursive one takes you several times past the heat death of the universe to run (not to mention having blown out the return stack long ago), the second one is done in seconds, and the last one is done in milliseconds.
This trick works for any simple linearly recursive function and is the method used to show that linearly recursive functions are just the discrete analog of linear differential equations. No surprise that the answers come out to be so similar.
Feedback
--------
I hear nothing from any of you. Actually, that is not exactly true: Luis recommends the movie "Lolita" playing at the Egyptian, and my dad sent me a song that he wrote.
This is not a
problem. I can continue to churn on in this vein forever. I just wanted to let
you all know that you are not the only one out there that contributes nothing.
Things I am thinking of for future issues:
Handwriting Input of Music Notation
Names, not random numbers.
Jan Ken Po - Paper, rock, scissors.
8 queens
Ants
Ink Capture
Skunk
Quick and dirty Voronoi Diagrams
The second half of Dynamic programming - string matching.
More puzzles
Card tricks
I will be going back to Russia to look in on things there in another week or so. I may or may not interrupt the Gottahack during that time. You been forewarned.