text(), fn:string() and fn:data()

Author: Dave Cassel  |  Category: Software Development

I’m going to be teaching the MarkLogic University Basic XQuery class tomorrow, and as usually happens when I get to teach, I’ve learned something while reviewing the material. In this case, it’s the differences among text(), fn:string() and fn:data().

text()

This one is commonly seen at the end of an XPath expression. It selects each text node that is a direct child of the context node. For instance:

let $x := <boast>The <em>best</em> movie ever!</boast>
return $x/text()

This returns:

The
 movie ever!

“best” isn’t in the sequence because it’s not a direct child of $x.
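To make the child-versus-descendant point concrete, here is a small sketch that counts the text nodes each path selects. The expected values in the comments assume the same $x as above:

```xquery
let $x := <boast>The <em>best</em> movie ever!</boast>
return (
  fn:count($x/text()),            (: 2: only the direct child text nodes :)
  fn:count($x//text()),           (: 3: "best" is included via the descendant axis :)
  $x/text() instance of text()+   (: true: these are text nodes, not strings :)
)
```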

fn:string()

This is similar, but there are two important differences:

1. This function will get all descendant text nodes;
2. All those descendant text nodes will be concatenated into one string.

Thus:

let $x := <boast>The <em>best</em> movie ever!</boast>
return fn:string($x)

This returns “The best movie ever!” as a single string.
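One way to check both points is to compare fn:string() against an explicit join of the descendant text nodes. This is just a sketch using the same $x:

```xquery
let $x := <boast>The <em>best</em> movie ever!</boast>
return (
  fn:string($x) instance of xs:string,              (: true: a single xs:string :)
  fn:string($x) eq fn:string-join($x//text(), "")   (: true: same as joining all descendant text nodes :)
)
```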

fn:data()

This is the one that I think confused me most. My real problem is that I thought it was specifically for getting the value of an attribute, since that was the context in which I first learned to use it. There are a couple of things wrong with that notion. First, you can use fn:string() to get an attribute’s value, just like you can use fn:data():

let $x := <input type="text"/>
return (fn:string($x/@type), fn:data($x/@type))

Both will return the same thing: “text”. Likewise, fn:data() can be used in the same context as we used fn:string() above — to get values from elements.

let $x := <boast>The <em>best</em> movie ever!</boast>
return fn:data($x)

This also returns “The best movie ever!” as a single value (an untyped atomic value rather than an xs:string). But here’s a more interesting case:

let $x := 
  <values y="1">
    <value>4</value>
    <value>8</value>
  </values>
return fn:string($x)

This way it returns a single string “48”. But if we change the return to “return fn:data($x/value)”, we get back 4 and 8 as separate values. And those values aren’t strings: they are atomic values (xs:untypedAtomic when no schema is in play, which is a subtype of xs:anyAtomicType). That gives us more flexibility as to what we want to do with them. For instance, because we know we have numbers there, we can do some math or pass those values to functions that take integers. We wouldn’t be able to do that with fn:string(), as XQuery doesn’t allow for, say, “4” + 1, a string plus a number. (Text nodes selected with text() are atomized to untyped values when used in arithmetic, so they behave like the fn:data() case.)
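Here is a sketch of what that flexibility looks like. It assumes no schema applies to the constructed elements, so fn:data() hands back xs:untypedAtomic values; $values is just a name I picked for the example:

```xquery
let $x :=
  <values y="1">
    <value>4</value>
    <value>8</value>
  </values>
let $values := fn:data($x/value)
return (
  fn:count($values),                       (: 2: two separate items :)
  $values instance of xs:untypedAtomic+,   (: true when no schema applies :)
  fn:sum($values),                         (: 12: untyped values are cast to numbers :)
  $values[1] + 1                           (: 5 :)
)
```

By contrast, fn:string($x/value[1]) + 1 should raise a type error, because the left operand is an xs:string.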

I regard these differences as kind of basic, but it’s easy to fall into using certain functions by habit, rather than by full understanding. That’s why I like teaching. I anticipated a student question and figured I should know the answer. Now I do, and hopefully it was helpful to some of you as well.


37 Responses to “text(), fn:string() and fn:data()”

  1. Joe W. Says:

    Nicely stated and illustrated!

    The next lesson might explain when and why forcing text(), string(), or data() is unnecessary and may even reduce performance. Namely, many functions like concat() atomize their parameters, rendering text() etc. redundant.

  2. Dave Cassel Says:

    Good idea, Joe, I’ll work on that for a future post.

  3. Beth Log Says:

    Nice Article! :)

  4. Pablo Jobs Says:

    I’m working with OSB and this article helps me a lot. Thanks, Dave.

    Cheers:),

    Pablo Jobs
    OSB Developer

  5. Vinod51 Says:

    Thanks :) it helped me a lot.

  6. sabbir Says:

    How can I use fn:string for scraping web content?

  7. Rahul Says:

    Thanks for the details. Just want to ask: in the last example we can directly use $x/value + 1 without using fn:data() or fn:string(). Since XQuery applies fn:data() implicitly, is it good practice to use fn:data() explicitly?

  8. Dave Cassel Says:

    Rahul, as you note, it’s often not necessary to call fn:data(). As Joe commented above, explicitly calling it can reduce performance in some cases. I never did write that follow-up post.
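
    As a rough sketch of that implicit atomization (using a cut-down version of the values example): operators and functions that expect atomic values call fn:data() on their arguments for you. One caveat I'd flag: the arithmetic operators want a single value on each side, so I've used a predicate to pick one value element rather than passing both.

    ```xquery
    let $x :=
      <values>
        <value>4</value>
        <value>8</value>
      </values>
    return (
      $x/value[1] + 1,                                (: 5: the element is atomized, then cast to a number :)
      fn:concat($x/value[1], " and ", $x/value[2]),   (: "4 and 8" :)
      fn:sum($x/value)                                (: 12: fn:sum atomizes the whole sequence :)
    )
    ```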

  9. Daniel Says:

    let $x :=

    4
    8

    The trick is not the data() or the string() but the way you call it: the value of string($x/value) and data($x/value) is the same, and the value of string($x) and data($x) is the same.
    But the type is different: you can use data($x)*2 but string($x)*2 causes an error.

  10. Rahul Says:

    I have come across a problem. I want to check whether the given XML contains any kind of text. I don’t want any output if none of the children have any text. For the example given, it should not give me anything.

    let $x :=

    (1) $x/text() will yield nothing. But if I change Hi David, then also it is returning nothing since text() works on the self node. That’s fine.

    (2) $x/string() will yield something, but I don’t know what. I was expecting it not to give me anything for the example mentioned. Can you please help?

    (3) $x/data gives “XDMP-NONMIXEDCOMPLEXCONT: Node has complex type with non-mixed complex content.” since the head node is under some namespace {WHY??}
    Removing the namespace solves this error but behaving same like string.

  11. Rahul Says:

    I don’t know why it doesn’t display the example here. Hence copying it again! Please discard double quotes in example.

    let $x := ”

  12. Dave Cassel Says:

    Rahul, try & l t ; instead of the less-than character: <

  13. Rahul Says:

    The example was;
    let $x :=
    <head xmlns="http://www.w3.org/1999/xhtml">
    <script type="text/javascript"></script>
    <title></title>
    </head>

    Thanks Dave, but I got my solution now. I did not realise that I was checking only the node itself and not its descendants. This solved my problem:
    fn:exists($x//text()[fn:normalize-space(.) ne ""])

  14. Adi Says:

    What is the best way to get the following working:
    Bob

    I want “fname” and “Bob” as separate strings.
    I am new to XML and XSLT. Will any of the above explained functions [text(), string(), data(), node()] help me?

  15. Dave Cassel Says:

    Adi, your XML didn’t come through (use & l t ; for <), but I’ll guess you’re asking about something like this:

    <fname>Bob</fname>

    In that case, fn:string() or text() or data() will get you the “Bob” value. You can get fname as an xs:QName with fn:node-name().
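
    If you want the name as a plain string rather than an xs:QName, fn:local-name() is one option. A minimal sketch, assuming the guessed XML above:

    ```xquery
    let $x := <fname>Bob</fname>
    return (
      fn:local-name($x),   (: "fname": the element name as a string :)
      fn:string($x)        (: "Bob": the element's string value :)
    )
    ```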

  16. Lars G Johnsen Says:

    Nice article. I would like to make explicit a difference between text() and string(), implicit in your article, that the text() elements are indexed and can be addressed individually, not so with string(). Good for extracting elements with mixed content:

    let $x := <boast>The <em>best</em> movie ever!</boast>
    return element boast {
    element first {$x/text()[1]},
    element second {$x/text()[2]}
    }

    <boast>
    <first>The </first>
    <second> movie ever!</second>
    </boast>

  17. Raj Says:

    I am trying to read the xpath from the node[//teiHeader/fileDesc/publicationStmt/p] and navigate to the corresponding xml node to fetch the value.

    In the below XML, if the xml node contains the xpath like //teiHeader/fileDesc/publicationStmt, i have to navigate “//teiHeader/fileDesc/publicationStmt” in the xml document to fetch the value.

    So, if I read //teiHeader/fileDesc/publicationStmt, then it should return “First published right before your very eyes” as the result.
    Please let me know if I am not clear.

    I tried using some XQuery functions but it didn’t work out.
    Can you please share your inputs to find the solution?

    Thanks a lot in advance!!!

    Sample XML:

    <TEI xmlns="http://www.tei-c.org/ns/1.0">
     <teiHeader>
      <fileDesc>
       <titleStmt>
        <title>Another short TEI Document</title>
       </titleStmt>
       <publicationStmt>
        <p>First published right before your very eyes.</p>
       </publicationStmt>
       <sourceDesc>
        <p>//teiHeader/fileDesc/publicationStmt/p</p>
       </sourceDesc>
      </fileDesc>
     </teiHeader>
     <text>
      <body>
       <p>This is a very short TEI document.</p>
      </body>
     </text>
    </TEI>
    
  18. Dave Cassel Says:

    You’ll need to escape the angle brackets for the XML, using & l t ; and & g t ; — <like-this>

  19. Dave Cassel Says:

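    Raj, suppose we have a “let $sample :=” with the sample XML you’ve provided, and that the prolog declares the TEI namespace as the default element namespace (declare default element namespace "http://www.tei-c.org/ns/1.0";), since the sample’s elements are in that namespace. Then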

    $sample//teiHeader/fileDesc/publicationStmt/p

    would return

    <p>First published right before your very eyes.</p>

    If you want the string value, you could do:

    $sample//teiHeader/fileDesc/publicationStmt/p/fn:string()
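
    The sample’s elements are in the TEI namespace, so the path steps need to resolve to it. One way to sketch that is with a prefix; tei is just a prefix chosen for this example, and the sample is trimmed to the relevant part:

    ```xquery
    declare namespace tei = "http://www.tei-c.org/ns/1.0";

    let $sample :=
      <TEI xmlns="http://www.tei-c.org/ns/1.0">
        <teiHeader>
          <fileDesc>
            <publicationStmt>
              <p>First published right before your very eyes.</p>
            </publicationStmt>
          </fileDesc>
        </teiHeader>
      </TEI>
    (: each step names the TEI namespace through the tei prefix :)
    return $sample//tei:teiHeader/tei:fileDesc/tei:publicationStmt/tei:p/fn:string()
    ```

    This returns “First published right before your very eyes.” as a string.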

  20. Raj Says:

    Thank you for the response Dave!!

    But if I have the xml as below,

    //teiHeader/fileDesc/publicationStmt/p

    then, I should populate “First published right before your very eyes” to ‘p’ field in the node ‘sourceDesc’

    //teiHeader/fileDesc/publicationStmt/p

    So the requirement is, I want to navigate to the “//teiHeader/fileDesc/publicationStmt/p” to get the data “First published right before your very eyes”.

    Please let me know if I am not clear.

    Thanks a lot in advance!!

  21. Raj Says:

    Thank you for the response Dave!!

    But if I have the xml as below,

    <p>//teiHeader/fileDesc/publicationStmt/p</p>

    then, I should populate “First published right before your very eyes” to ‘p’ field in the node ‘sourceDesc’

    <sourceDesc>
    <p>//teiHeader/fileDesc/publicationStmt/p</p>
    </sourceDesc>

    So the requirement is, I want to navigate to the “//teiHeader/fileDesc/publicationStmt/p” to get the data “First published right before your very eyes”.

    Please let me know if I am not clear.

    Thanks a lot in advance!!

  22. Saki Says:

    is there any way to get result of xml element “The movie best will be released on 2015-12-23.” and output required as,

    “The movie ‘best’ will be released on 23rd December, 2015.”

    I have tried with for each, but it doesn’t consider text() nodes.

    Any help would be appreciated.

    Thanks in advance!!

  23. Saki Says:

    sorry xml nodes are not visible in the above post.
    My xml file will be
    <boast>The movie <em>best</em> will be released on <date>2015-12-23</date>.</boast>

    Sorry for the inconvenience.

  24. Dave Cassel Says:

    Saki, it looks like you want to replace <em> with a single quote. No, none of the text, string, or data functions will do that for you. You’ll need to transform it to do that. See my post on recursive descent for a way to do so.

  25. Eduardo Says:

    Hello, I need your help. What happens if the text that I’m getting with text() is a number and I need to treat this text as a number later? Please help me.

  26. Dave Cassel Says:

    Eduardo, if you need a value as a number, the simplest thing to do is ask for it as one. Instead of using $content/value/text(), use $content/value/number(). But also know that using an untyped value, such as a text node, in a mathematical operation will cast it to a number for you.

    For instance, put this into Query Console:

    let $content :=
    <root>
    <value>42</value>
    </root>
    let $value := $content/value/text()
    return $value + 1

    You can also pass the string to xs:integer() for a more explicit conversion.
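
    Here is a sketch of the two explicit conversions side by side, using the same $content as above. Both items come back as 43; the difference is the type:

    ```xquery
    let $content :=
      <root>
        <value>42</value>
      </root>
    return (
      $content/value/number() + 1,     (: fn:number() yields an xs:double :)
      xs:integer($content/value) + 1   (: explicit cast, result stays an xs:integer :)
    )
    ```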

  27. José Júnior Says:

    Thank you very much! Very helpful information! ;)

  28. Dixit Singla Says:

    Really nice article.
    Thanks a lot Dave. :)

  29. test Says:

    Hi ,

    i tried xquery assertion to get a list of string values eg; city names.

    This is how my xqery looks :
    declare namespace ns1='http://www.webservicex.net/';
    declare namespace soap='http://www.w3.org/2003/05/soap-envelope';

    {
    for $x in //ns1:GetSupplierByZipCodeResponse/ns1:GetSupplierByZipCodeResult/ns1:SupplierDataLists/ns1:SupplierData/ns1:SupplierNumber/fn:string()
    return $x
    }

    I am not receiving the correct result, it’s returning .

    Can you help

  30. Dave Cassel Says:

    Hello. The XML in your question didn’t come through. Please use &lt; and &gt; for the angle brackets, or post your question on Stack Overflow (which has nicer formatting) with the “marklogic” tag.

  31. Vivek Says:

    Hi Dave,

    Below is the xml that was after transformation.

    Vivek
    ABC
    California

    John
    Davis
    Hinckley

    Moesis
    Chese
    Atlanta

    Data that should be converted into file is :

    “Vivek”||”ABC”||”California”
    “John”||”Davis”||”Hinckley”
    “Moesis”||”Chese”||”Atlanta”

    Thanks in advance.

    Thanks,
    Vivek.

  32. Vivek Says:

    Vivek
    ABC
    California

    John
    Davis
    Hinckley

    Moesis
    Chese
    Atlanta

  33. Dave Cassel Says:

    Please use &lt; and &gt; for the angle brackets

  34. Vinay Kumar Says:

    Nice Article. Thanks

  35. Shivling Says:

    Hi Dave,

    Thanks for this post but still i am confused which one (text() or fn:string() or fn:data()) will give you high performance when you are dealing with billions of documents and you want get the xpath value from these documents.

  36. Dave Cassel Says:

    When I’m working against billions of documents, I start with cts or higher-level queries to narrow it down to a much smaller number, then use XPath to pull specific pieces of data out of those documents. At that point, the choice among string, text, and data is more about what I want from the data than about performance.

  37. Neer Says:

    Hi Dave,
    I have one requirement where I need to check whether a “Dates overlap” error exists in the XML response below.
    8Dates overlap
