Memoization is tracking partial solutions so that they don’t have to be recalculated. A good example of where this is handy is the Fibonacci sequence. You may remember that the definition of this is:
F(n) = F(n -1) + F(n – 2)
F(1) = 1
F(2) = 1
Clearly, this is a recursive function. Let’s take a look at the naive implementation:
The good news is that we get the right answer.
local:fib(10) => 55
The bad news is that this is a seriously inefficient way to do it. Why? Let’s look at what happens when we calculate local:fib(10)
F(10) = F(9) + F(8)
As we calculate F(9), we’ll figure out F(8) and F(7). Our function will recurse through these until it has an answer, bringing us to
F(10) = 34 + F(8)
Now we recurse on F(8) to find that value. But wait, we already calculated F(8) as part of finding the value of F(9)! We need a way to keep track of the values we’ve already calculated so that we don’t have to recomputed them.
Enter the map. Maps, or associative arrays, are a tool available in many programming languages, including MarkLogic’s XQuery. While not part of the XQuery 3.0 standard, this is a tool you want in your belt. You’ll find some great documentation on the mapping operators; my goal here is to show a technique where maps will help with your runtime performance.
Let’s try another version of that function.
The base case remains the same, but when we’re computing other values, we put them into the map as they are calculated.
Performance
Let’s see what kind of difference this makes. I set up each of these functions in Query Console and called each function with n = 20 while looking at the Profile tab.
Naive version: 85,353 expressions; about 0.046 seconds.
Improved version: 354 expressions; about 0.00034 seconds.
That’s two orders of magnitude fewer expressions that needed to be run, reducing the run time by a similar amount.
This approach is useful for problems where:
1. you have a recursive solution
2. where some of the calculations would otherwise be repeated
Tags: algorithms, marklogic, performance, xquery