Queries and Updates

Author: Dave Cassel  |  Category: Software Development

You may already know that MarkLogic runs each module either as a query (read-only) or as an update (read-write). The advantage of queries is that, because they run at a particular timestamp, they don’t need to deal with locks: nothing will change at that timestamp. How does the server know whether a module is a query or an update? It performs a lexical analysis of the source code. This is all explained in section 2.3 of the Application Developer’s Guide.

Reading through that, I was left with a couple of questions. In particular, I wanted to know how smart the lexical analysis is. Turns out it’s pretty good (no surprise)!

A Simple Query

Time for some code. Here’s a simple module that we can hit to see whether it runs as a query or an update.

<h1>{
  if (xdmp:request-timestamp()) then "query"
  else "update"
}</h1>

As the documentation tells us, xdmp:request-timestamp() will return a value if and only if we’re doing a read-only query. Sure enough, when we run this module, we get <h1>query</h1>.

A Simple Update

if (0 = 1) then xdmp:document-insert("/foo.xml", <foo/>) else (),
<h1>{
  if (xdmp:request-timestamp()) then "query"
  else "update"
}</h1>

This one announces itself as an update. Despite the fact that the xdmp:document-insert() will never be called, the server sees this as a potential update. We’re going to use this expression as our trigger to put us into update mode.

A Library Update

Let’s introduce a simple library module.

module namespace lib1 = "lib1";
declare function lib1:f1()
{
  if (0 = 1) then xdmp:document-insert("/foo.xml", <foo/>) else ()
};
declare function lib1:f2()
{
  <harmless/>
};

The lib1:f1() function has our trigger. Let’s see what happens if we import the module but don’t call the function.

import module namespace lib1 = "lib1" at "lib1.xqy";
<h1>{
  lib1:f2(),
  if (xdmp:request-timestamp()) then "query"
  else "update"
}</h1>

That’s a query. MarkLogic Server is smart enough to see that while there is an update expression in the library module, nothing in our main module calls the function that contains it. Now let’s put in a call to lib1:f1(), our trigger function, and see what happens.

import module namespace lib1 = "lib1" at "lib1.xqy";
<h1>{
  lib1:f1(),
  if (xdmp:request-timestamp()) then "query"
  else "update"
}</h1>

You guessed it: now we have an update.

Apply a Test

One last thing to try: can we fool the analysis with xdmp:apply()?

xquery version "1.0-ml";
import module namespace lib1 = "lib1" at "lib1.xqy";
<h1>{
  xdmp:apply(xdmp:function(xs:QName("lib1:f1"))),
  if (xdmp:request-timestamp()) then "query"
  else "update"
}</h1>

What do we get this time? The analysis concludes that this is a query, but when the server actually runs the code, it finds itself about to do an update. That breaks the rules: a query runs at a particular timestamp without locks, and an update without locks would obviously cause problems. So here is what running the module above produces:

500 Internal Server Error
XDMP-UPDATEFUNCTIONFROMQUERY: xdmp:apply(xdmp:function(xs:QName("lib1:f1"))) --
Cannot apply an update function from a query

Kaboom. Check for this when using xdmp:apply().

Columbo Moment

Oh, and there is just one more thing… If you want to tell MarkLogic Server explicitly that a module should run in update mode without using a trigger expression like the one above, you can put this into your module:

declare option xdmp:update "true";

You might do that to avoid an XDMP-UPDATEFUNCTIONFROMQUERY while using xdmp:apply().
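
As a rough sketch of how that might look, here is the xdmp:apply() module from the previous section with the option added. It assumes the same lib1.xqy library as above and has not been checked against any particular server version. With the option in place, I would expect the module to announce itself as an update instead of raising the error.

xquery version "1.0-ml";
import module namespace lib1 = "lib1" at "lib1.xqy";
(: Ask for update mode explicitly, since the analysis cannot see through xdmp:apply() :)
declare option xdmp:update "true";
<h1>{
  xdmp:apply(xdmp:function(xs:QName("lib1:f1"))),
  if (xdmp:request-timestamp()) then "query"
  else "update"
}</h1>

The trade-off is that the whole module now runs as an update and takes locks, even on requests where nothing ends up being written.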
