Word query specification

Author: Dave Cassel  |  Category: Software Development

While looking at the database configuration in MarkLogic Server, have you ever noticed the Word Query link on the left? (Click Configure -> Databases -> the name of one of your databases -> Word Query.) Ever wonder what it does? Word Query is a simple way to specify which parts of a document should be included in word queries and which parts should be excluded. You might use this if your documents have some sections you intend to be searchable and other sections that are kept only for historical or metadata purposes and shouldn’t be involved in document discovery.

Here are just a few lines of code you can use to start exploring this feature. We’ll start by creating a document that has some structure we want to ignore:

xdmp:document-insert(
  '/word-query/doc1.xml',
  <doc>
    <invisible>you can't find me</invisible>
    <visible>here I am</visible>
  </doc>
)

Now if we run this query:

cts:search(fn:doc(), "find")

we get the document back, just as you would expect. Next, I’ll go into the database configuration and click Word Query (I’m using the Documents database). On the “excludes” tab, I type “invisible” into the localname box and leave the namespace empty, since my sample data isn’t using a namespace. After I hit the OK button, I’m sent back to the Configure tab, where the “invisible” element is now listed under Excludes.
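If you’d rather script this step than click through the Admin UI, here is a rough sketch of what I’d expect the equivalent to look like with the Admin API. It assumes the Documents database and the “invisible” element from the example, and that admin:database-excluded-element and admin:database-add-word-query-excluded-element behave in your MarkLogic version as I describe in the comments, so check the Admin API documentation before relying on it.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: we are configuring the Documents database, as in the article. :)
let $config := admin:get-configuration()
let $db := xdmp:database("Documents")

(: Empty namespace, because the sample data does not use one. :)
let $exclude := admin:database-excluded-element("", "invisible")

(: Add the exclude to the database's word query specification. :)
let $config :=
  admin:database-add-word-query-excluded-element($config, $db, $exclude)

return admin:save-configuration($config)
```

Run it as a user with admin privileges. Changing the word query settings normally causes the database to reindex if reindexing is enabled, so searches may not reflect the change immediately.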

Let’s run that search again:

cts:search(fn:doc(), "find")

This time, we get nothing back. Setting up that exclude means the text inside the “invisible” element is no longer indexed for word queries. Searching for “here” will still return the document, as we didn’t exclude the “visible” element or any of its ancestors.
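To see both results side by side, a small check like the one below may help. It is only a sketch: it limits the search to the sample document inserted earlier and reports how many matches each word produces.

```xquery
xquery version "1.0-ml";

(: Count matches for each word, looking only at the sample document. :)
for $term in ("find", "here")
let $hits := cts:search(fn:doc('/word-query/doc1.xml'), $term)
return fn:concat($term, ": ", fn:count($hits))
```

With the exclude in place and reindexing finished, the results above suggest you should see no match for “find” and one match for “here”.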


One Response to “Word query specification”

  1. Brajendu Kumar Das Says:

    Nice article.
    Sir , Pls guide me to find the content present in frames and its count….to write query…

    Thankx in advance
    regards
    Brajendu Kumar Das
