MarkLogic has long had the ability to spawn tasks, each consisting of a module, a set of variables, and some options. Those tasks go into the task queue and get worked off by the configured number of threads. MarkLogic 5 adds a new option: higher priority tasks. Let’s take a look at how that works.
My local instance of MarkLogic is configured for 16 task server threads, which I can see by going to Configure -> Groups -> Default -> Task Server in the Admin UI. By clicking on the Status tab here, I’ll be able to watch the tasks get worked off.
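If you would rather check the thread count from Query Console than from the Admin UI, something like the following should work. This is a minimal sketch, assuming the group is named "Default" as on my instance and that the user running it has admin privileges; it uses the Admin API's task server getter.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: the group is named "Default"; adjust for your cluster. :)
let $config := admin:get-configuration()
let $group-id := admin:group-get-id($config, "Default")
(: Returns the configured number of task server threads for the group. :)
return admin:taskserver-get-threads($config, $group-id)
```

On my setup I would expect this to report the same 16 threads shown in the Admin UI.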
Normal Priority
To play with the queue, I’ll start by setting up a simple module to simulate a long running task. I’ll call this write-log.xqy:
xquery version "1.0-ml";

declare variable $priority external;
declare variable $id external;

(: Simulate a long-running task, then log the parameters. :)
xdmp:sleep(5000),
xdmp:log(fn:concat("task priority: ", $priority, "; id=", $id))
Sleep for 5 seconds, then write out the parameters. Very simple. Now I’ll fire up Query Console and spawn a bunch of these tasks:
for $i in (1 to 500)
let $priority := "normal"
return
  xdmp:spawn(
    "/write-log.xqy",
    (xs:QName("priority"), $priority, xs:QName("id"), $i),
    <options xmlns="xdmp:eval"><priority>{$priority}</priority></options>
  )
I spawn 500 tasks with “normal” priority. After running this, I can refresh the Task Server Status page and see a bunch of tasks in the queue, getting worked off by the 16 threads. Watching the log file, I see 16 of the log messages bunched together as the threads wrap up around the same time; then 5 seconds later, another group of 16.
Higher Priority
The new option in MarkLogic 5 is to choose between “normal” and “higher” priority. Let’s see what happens when they mix. We’ll use the same write-log.xqy as before, but we’ll change what we do in Query Console:
for $i in (1 to 500)
let $priority := if ($i <= 250) then "normal" else "higher"
return
  xdmp:spawn(
    "/write-log.xqy",
    (xs:QName("priority"), $priority, xs:QName("id"), $i),
    <options xmlns="xdmp:eval"><priority>{$priority}</priority></options>
  )
We launch 250 normal priority tasks followed by 250 higher priority tasks. Higher priority tasks get a separate queue and their own batch of threads. That means that both sets of tasks can proceed in parallel. We see this when we look at the Task Server Status page and see 32 running threads, and in the log file where we see 32 of the log messages bunched together. Tasks 1-16 and tasks 251-266 complete around the same time; about five seconds later, we see tasks 17-32 and tasks 267-282.
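To put rough numbers on the difference between the two experiments, here is a back-of-the-envelope calculation. It is an idealized sketch, not a measurement: it assumes every task takes exactly 5 seconds, ignores scheduling overhead, and assumes the higher priority queue gets 16 threads of its own (which matches the 32 running threads seen above). The function name local:drain-seconds is my own, made up for illustration.

```xquery
xquery version "1.0-ml";

(: Rough time to drain a queue: number of batches times seconds per task.
   Idealized model; ignores scheduling overhead. :)
declare function local:drain-seconds(
  $tasks as xs:integer,
  $threads as xs:integer,
  $seconds-per-task as xs:integer
) as xs:integer
{
  xs:integer(fn:ceiling($tasks div $threads)) * $seconds-per-task
};

(
  (: 500 normal tasks on a single queue with 16 threads :)
  local:drain-seconds(500, 16, 5),
  (: 250 normal + 250 higher tasks; the two queues run in parallel,
     so the slower queue determines the total :)
  fn:max((local:drain-seconds(250, 16, 5), local:drain-seconds(250, 16, 5)))
)
```

This evaluates to the sequence (160, 80): about 160 seconds when all 500 tasks share one queue, versus about 80 seconds when they are split evenly across the two queues.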
Using Priorities
From the way this works, an application can have most tasks running as normal priority, but when higher priority tasks come along, they’ll get to run without interrupting the normal ones. If you make heavy use of the higher priority tasks, be conscious of the extra threads that will be run on your behalf.
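One way an application might apply this is to wrap xdmp:spawn in a small helper that picks the priority per task. The sketch below is only an illustration: local:spawn-with-priority and its $urgent flag are hypothetical names of mine, and it reuses the write-log.xqy module from earlier.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: spawn a task as "higher" priority only when the
   caller flags it as urgent; everything else stays "normal". :)
declare function local:spawn-with-priority(
  $module as xs:string,
  $vars as item()*,
  $urgent as xs:boolean
) as item()*
{
  xdmp:spawn(
    $module,
    $vars,
    <options xmlns="xdmp:eval">
      <priority>{ if ($urgent) then "higher" else "normal" }</priority>
    </options>
  )
};

(
  (: Routine work goes to the normal queue. :)
  local:spawn-with-priority(
    "/write-log.xqy",
    (xs:QName("priority"), "normal", xs:QName("id"), 1),
    fn:false()),
  (: An urgent task goes to the higher priority queue. :)
  local:spawn-with-priority(
    "/write-log.xqy",
    (xs:QName("priority"), "higher", xs:QName("id"), 2),
    fn:true())
)
```

Keeping the decision in one place makes it easier to audit how often the higher priority queue, and its extra threads, actually get used.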
Tags: marklogic, new feature, xquery
April 9th, 2012 at 3:58 pm
Thanks for the post, Dave. I tried looking in the API documentation and I didn’t see the priority option, so thanks for sharing this. I tried it and it works great.
April 9th, 2012 at 6:51 pm
The API documentation for xdmp:eval() lists the options for eval(), spawn(), and invoke(). The priority option is listed there to put it with the others, even though it only applies to spawn().
April 10th, 2012 at 4:38 pm
I mostly refer to xdmp:eval() and I didn’t find it here. Thanks for your reply.