You may already know that MarkLogic runs each module either as a query (read-only) or as an update (read-write). The advantage of queries is that, because they run at a particular timestamp, they don’t need to deal with locks: nothing will change at that timestamp. How does the server know whether a module is a query or an update? It performs a lexical analysis of the source code. This is all explained in section 2.3 of the Application Developer’s Guide.
Reading through that, I was left with a couple of questions. In particular, I wanted to know how smart the lexical analysis is. Turns out it’s pretty good (no surprise)!
A Simple Query
Time for some code. Here’s a simple module that we hit to see whether we have a query or an update.
<h1>{ if (xdmp:request-timestamp()) then "query" else "update" }</h1>
As the documentation tells us, xdmp:request-timestamp() will return a value if and only if we’re doing a read-only query. Sure enough, when we run this module, we get <h1>query</h1>.
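If you want to run this check in more than one place, it can be wrapped in a small function. This is only a sketch: local:mode() is a hypothetical helper name of my own, not something from the documentation. It tests with fn:exists() instead of relying on the effective boolean value of the timestamp, which would be false for a timestamp of 0 (unlikely in practice, but the explicit test says what we mean).

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: reports which mode the current request runs in. :)
(: xdmp:request-timestamp() returns the empty sequence in an update,    :)
(: so fn:exists() distinguishes the two cases explicitly.               :)
declare function local:mode() as xs:string
{
  if (fn:exists(xdmp:request-timestamp())) then "query" else "update"
};

<h1>{ local:mode() }</h1>
```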
A Simple Update
if (0 = 1) then xdmp:document-insert("/foo.xml", <foo/>) else (), <h1>{ if (xdmp:request-timestamp()) then "query" else "update" }</h1>
This one announces itself as an update. Despite the fact that the xdmp:document-insert() will never be called, the server sees this as a potential update. We’re going to use this expression as our trigger to put us into update mode.
A Library Update
Let’s introduce a simple library module.
module namespace lib1 = "lib1";
declare function lib1:f1() { if (0 = 1) then xdmp:document-insert("/foo.xml", <foo/>) else () };
declare function lib1:f2() { <harmless/> };
The lib1:f1() function has our trigger. Let’s see what happens if we import the module but don’t call the function.
import module namespace lib1 = "lib1" at "lib1.xqy";
<h1>{ lib1:f2(), if (xdmp:request-timestamp()) then "query" else "update" }</h1>
That’s a query. MarkLogic Server is smart enough to see that while the library module contains an update expression, nothing we call ever reaches it. Now let’s put in a call to f1(), our trigger function, and see what happens.
import module namespace lib1 = "lib1" at "lib1.xqy";
<h1>{ lib1:f1(), if (xdmp:request-timestamp()) then "query" else "update" }</h1>
You guessed it: now we have an update.
Apply a Test
One last thing to try: can we fool the analysis with xdmp:apply()?
xquery version "1.0-ml";
import module namespace lib1 = "lib1" at "lib1.xqy";
<h1>{ xdmp:apply(xdmp:function(xs:QName("lib1:f1"))), if (xdmp:request-timestamp()) then "query" else "update" }</h1>
What do we get this time? The analysis shows that this is a query statement, but when the server actually runs the code, it finds itself doing an update. That breaks the rules: a query runs at a particular timestamp without locks, and an update without locks would obviously cause problems. So what happens when we run the module above?
500 Internal Server Error
XDMP-UPDATEFUNCTIONFROMQUERY: xdmp:apply(xdmp:function(xs:QName("lib1:f1"))) --
Cannot apply an update function from a query
Kaboom. Check for this when using xdmp:apply().
Columbo Moment
Oh and, there is just one more thing…. If you want to tell MarkLogic Server explicitly that a module should run in update mode without using a trigger expression like the one above, you can put this into your module:
declare option xdmp:update "true";
You might do that to avoid an XDMP-UPDATEFUNCTIONFROMQUERY while using xdmp:apply().
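Putting that together with the xdmp:apply() module from earlier, a minimal sketch might look like the following. It assumes the same lib1.xqy library as before. Note that the option declaration comes after the import, since XQuery requires imports to precede option declarations in the prolog. I would expect this version to report itself as an update rather than throwing the error, but do try it on your own server.

```xquery
xquery version "1.0-ml";

import module namespace lib1 = "lib1" at "lib1.xqy";

(: Tell the server explicitly to run this module as an update, :)
(: since the analysis cannot see through xdmp:apply().          :)
declare option xdmp:update "true";

<h1>{
  xdmp:apply(xdmp:function(xs:QName("lib1:f1"))),
  if (xdmp:request-timestamp()) then "query" else "update"
}</h1>
```

The trade-off is that the module now always runs with locks, even on requests where the applied function turns out to be read-only.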