I’m going to be teaching the MarkLogic University Basic XQuery class tomorrow, and as usually happens when I get to teach, I’ve learned something while reviewing the material. In this case, it’s the differences among text(), fn:string() and fn:data().
text()
This one is commonly seen at the end of an XPath expression. It selects each text node that is a direct child of the context element. For instance:
let $x := <boast>The <em>best</em> movie ever!</boast> return $x/text()
This returns:
The movie ever!
“best” isn’t in the sequence because it’s not a direct child of $x.
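To make the "direct child" point concrete, here is a minimal sketch that reuses the same $x and counts what each path selects. The counts in the comments follow from the element's structure: two text nodes sit directly under boast, and a third sits inside em.

```xquery
let $x := <boast>The <em>best</em> movie ever!</boast>
return (
  fn:count($x/text()),   (: 2 - only "The " and " movie ever!" :)
  fn:count($x//text())   (: 3 - also picks up "best" inside <em> :)
)
```

Using the descendant axis (//) is one way to reach the nested text when you still want individual text nodes rather than one string.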
fn:string()
This is similar, but there are two important differences:
1. This function will get all descendant text nodes;
2. All those descendant text nodes will be concatenated into one string.
Thus:
let $x := <boast>The <em>best</em> movie ever!</boast> return fn:string($x)
This returns “The best movie ever!” as a single string.
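One way to picture what fn:string() does to an element is as a concatenation of every descendant text node in document order. The comparison below is only an illustrative sketch of that equivalence, not a description of how any engine implements it:

```xquery
let $x := <boast>The <em>best</em> movie ever!</boast>
return
  (: joining all descendant text nodes with no separator
     gives the same string value as fn:string() :)
  fn:string($x) eq fn:string-join($x//text(), "")
```

This should evaluate to true.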
fn:data()
This is the one that I think confused me most. My real problem is that I thought it was specifically for getting the value of an attribute, since that was the context in which I first learned to use it. There are a couple of things wrong with that notion. First, you can use fn:string() to get an attribute’s value, just like you can use fn:data():
let $x := <input type="text"/> return (fn:string($x/@type), fn:data($x/@type))
Both will return the same thing: “text”. Likewise, fn:data() can be used in the same context as we used fn:string() above — to get values from elements.
let $x := <boast>The <em>best</em> movie ever!</boast> return fn:data($x)
This also returns “The best movie ever!” as a single string. But here’s a more interesting case:
let $x := <values y="1"> <value>4</value> <value>8</value> </values> return fn:string($x)
This way it returns a single string “48”. But if we change the return to “return fn:data($x/value)”, we get back 4 and 8 as separate values. And those values aren’t strings; they are atomic values (xs:untypedAtomic when no schema applies, which is a kind of xs:anyAtomicType). That gives us more flexibility as to what we want to do with them. For instance, because we know we have numbers there, we can do some math or pass those values to functions that take integers. We wouldn’t be able to do that with fn:string(), as XQuery doesn’t allow for, say, “4” + 1: a string plus a number.
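A short sketch of that flexibility, assuming the elements are constructed in the query with no schema applied (so the atomized values are untyped):

```xquery
let $x := <values y="1"><value>4</value><value>8</value></values>
let $nums := fn:data($x/value)
return (
  fn:count($nums),                         (: 2 - separate values, not one string :)
  fn:sum($nums),                           (: 12 - untyped values are cast to numbers :)
  $nums[1] + 1,                            (: 5 :)
  $nums[1] instance of xs:untypedAtomic    (: true when no schema is in play :)
  (: by contrast, fn:string($x/value[1]) + 1 raises a type error :)
)
```

The untyped values are cast to numbers on demand, which is why the arithmetic works here and fails on the result of fn:string().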
I regard these differences as kind of basic, but it’s easy to fall into using certain functions out of habit, rather than from full understanding. That’s why I like teaching. I anticipated a student question and figured I should know the answer. Now I do, and hopefully it was helpful to some of you as well.
June 7th, 2011 at 11:49 pm
Nicely stated and illustrated!
The next lesson might explain when and why forcing text(), string(), or data() is unnecessary and may even reduce performance. Namely, many functions like concat() atomize their parameters, rendering text() etc. redundant.
June 9th, 2011 at 7:32 am
Good idea, Joe, I’ll work on that for a future post.
July 9th, 2013 at 7:36 am
Nice Article! :)
July 9th, 2013 at 7:39 am
I’m working with OSB and this article helps me a lot. Thanks, Dave.
Cheers:),
Pablo Jobs
OSB Developer
February 13th, 2014 at 1:28 pm
Thanks :) it helped me a lot.
June 18th, 2014 at 2:57 am
How can I use fn:string for scraping web content?
January 6th, 2015 at 4:46 am
Thanks for the details. Just want to ask: in the last example we can directly use $x/value + 1 without using fn:data() or fn:string(). Since XQuery applies fn:data() by default, is it good practice to use fn:data() explicitly?
January 6th, 2015 at 12:21 pm
Rahul, as you note, it’s often not necessary to call fn:data(). As Joe commented above, explicitly calling it can reduce performance in some cases. I never did write that follow-up post.
March 9th, 2015 at 10:27 am
let $x :=
4
8
The trick is not data() or string() but the way you call it: the value of string($x/value) and data($x/value) is the same, and the value of string($x) and data($x) is the same.
But the type is different: you can use data($x)*2, but string($x)*2 causes an error.
April 9th, 2015 at 2:46 am
I have come across a problem. I want to check whether the given XML contains any kind of text. I don’t want any output if none of the children has any text. For the example given, it should not give me anything.
let $x :=
(1) $x/text() will yield nothing. But if I change Hi David, then also it is returning nothing since text() works on the self node. Thats fine.
(2) $x/string() will yield something, but I don’t know what. I was expecting it to give me nothing for the example mentioned. Can you please help?
(3) $x/data gives “XDMP-NONMIXEDCOMPLEXCONT: Node has complex type with non-mixed complex content.” since the head node is under some namespace {WHY??}
Removing the namespace solves this error but behaving same like string.
April 9th, 2015 at 2:52 am
I don’t know why it doesn’t display the example here, so I am copying it again. Please disregard the double quotes in the example.
let $x := ”
“
April 9th, 2015 at 7:05 pm
Rahul, try & l t ; instead of the less-than character: <
April 17th, 2015 at 6:21 am
The example was;
let $x :=
<head xmlns="http://www.w3.org/1999/xhtml">
<script type="text/javascript"></script>
<title></title>
</head>
Thanks Dave, but I have my solution now. I did not realise that I was checking only the node itself and not its descendants. This solved my problem:
fn:exists($x//text()[fn:normalize-space(.) ne ""])
April 23rd, 2015 at 1:55 am
What is the best way to get the following working:
Bob
I want “fname” and “Bob” as separate strings.
I am new to XML and XSLT. Will any of the above explained functions [text(), string(), data(), node()] help me?
April 23rd, 2015 at 9:29 am
Adi, your XML didn’t come through (use & l t ; for <), but I’ll guess you’re asking about something like this:
<fname>Bob</fname>
In that case, fn:string() or text() or data() will get you the “Bob” value. You can get fname as an xs:QName with fn:node-name().
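As a sketch of that reply, assuming the element really is the guessed fname element: fn:local-name() returns the name as a plain string, whereas fn:node-name() returns an xs:QName.

```xquery
let $x := <fname>Bob</fname>
return (
  fn:local-name($x),   (: "fname" as a string :)
  fn:string($x)        (: "Bob" :)
)
```

If the element were in a namespace and the prefix mattered, fn:name() would be an alternative that keeps the prefixed form.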
May 20th, 2015 at 5:27 am
Nice article. I would like to make explicit a difference between text() and string() that is implicit in your article: the text() nodes form a sequence and can be addressed individually by position, which is not so with string(). This is good for extracting pieces of elements with mixed content:
let $x := <boast>The <em>best</em> movie ever!</boast>
return element boast {
element first {$x/text()[1]},
element second {$x/text()[2]}
}
<boast>
<first>The </first>
<second> movie ever!</second>
</boast>
July 13th, 2015 at 3:30 am
I am trying to read the xpath from the node[//teiHeader/fileDesc/publicationStmt/p] and navigate to the corresponding xml node to fetch the value.
In the below XML, if the xml node contains the xpath like //teiHeader/fileDesc/publicationStmt, i have to navigate “//teiHeader/fileDesc/publicationStmt” in the xml document to fetch the value.
So, if read //teiHeader/fileDesc/publicationStmt, then it should return “First published right before your very eyes” as the result.
Please let me know if i am not clear.
I tried using some XQuery functions, but it didn’t work out.
Can you please share your inputs to find the solution.
Thanks a lot in advance!!!
Sample XML:
July 13th, 2015 at 7:41 am
You’ll need to escape the angle brackets for the XML, using & l t ; and & g t ; — <like-this>
July 15th, 2015 at 1:53 pm
Raj, suppose we have a “let $sample :=” with the sample XML you’ve provided. Then
$sample//teiHeader/fileDesc/publicationStmt/p
would return
<p>First published right before your very eyes.</p>
If you want the string value, you could do:
$sample//teiHeader/fileDesc/publicationStmt/p/fn:string()
July 16th, 2015 at 4:09 am
Thank you for the response Dave!!
But if I have the xml as below,
//teiHeader/fileDesc/publicationStmt/p
then, I should populate “First published right before your very eyes” to ‘p’ field in the node ‘sourceDesc’
//teiHeader/fileDesc/publicationStmt/p
So the requirement is, I want to navigate to the “//teiHeader/fileDesc/publicationStmt/p” to get the data “First published right before your very eyes”.
Please let me know if I am not clear.
Thanks a lot in advance!!
July 16th, 2015 at 4:11 am
Thank you for the response Dave!!
But if I have the xml as below,
<p>//teiHeader/fileDesc/publicationStmt/p</p>
then, I should populate “First published right before your very eyes” to ‘p’ field in the node ‘sourceDesc’
<sourceDesc>
<p>//teiHeader/fileDesc/publicationStmt/p</p>
</sourceDesc>
So the requirement is, I want to navigate to the “//teiHeader/fileDesc/publicationStmt/p” to get the data “First published right before your very eyes”.
Please let me know if I am not clear.
Thanks a lot in advance!!
August 19th, 2015 at 9:51 am
is there any way to get result of xml element “The movie best will be released on 2015-12-23.” and output required as,
“The movie ‘best’ will be released on 23rd December, 2015.”
I have tried with for-each, but it doesn’t consider text() nodes.
Any help would be appreciated.
Thanks in advance!!
August 19th, 2015 at 10:01 am
sorry xml nodes are not visible in the above post.
My xml file will be
<boast>The movie <em>best</em> will be released on <date>2015-12-23</date>.</boast>
Sorry for the inconvenience.
August 20th, 2015 at 9:14 am
Saki, it looks like you want to replace <em> with a single quote. No, none of the text, string, or data functions will do that for you. You’ll need to transform it to do that. See my post on recursive descent for a way to do so.
April 18th, 2016 at 2:00 pm
Hello, I need your help. What happens if the text that I’m getting with text() is a number and I need to treat it as a number later? Please help me.
April 18th, 2016 at 3:16 pm
Eduardo, if you need a value as a number, the simplest thing to do is ask for it as one. Instead of using $content/value/text(), use $content/value/number(). But also know that using a text node in a mathematical operation atomizes it to an untyped value, which is then cast to the numeric type that you need.
For instance, put this into Query Console:
let $content :=
<root>
<value>42</value>
</root>
let $value := $content/value/text()
return $value + 1
You can also pass the string to xs:integer() for a more explicit conversion.
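The two explicit conversions mentioned in this reply can be sketched as follows, reusing the same $content. This is an illustrative comparison rather than a recommendation of one over the other:

```xquery
let $content :=
  <root>
    <value>42</value>
  </root>
return (
  xs:integer($content/value) + 1,    (: 43, computed as an xs:integer :)
  $content/value/fn:number() + 1     (: 43, computed as an xs:double :)
)
```

One difference worth knowing: xs:integer() raises an error if the content is not a valid integer, while fn:number() returns NaN instead.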
April 4th, 2017 at 4:08 am
Thank you very much! Very helpful information! ;)
April 5th, 2017 at 3:13 am
Really nice article.
Thanks a lot Dave. :)
July 4th, 2017 at 6:03 am
Hi ,
i tried xquery assertion to get a list of string values eg; city names.
This is how my xqery looks :
declare namespace ns1='http://www.webservicex.net/';
declare namespace soap='http://www.w3.org/2003/05/soap-envelope';
{
for $x in //ns1:GetSupplierByZipCodeResponse/ns1:GetSupplierByZipCodeResult/ns1:SupplierDataLists/ns1:SupplierData/ns1:SupplierNumber/fn:string()
return $x
}
I am not receiving the correct result; it’s returning .
Can you help
July 5th, 2017 at 9:03 am
Hello. The XML in your question didn’t come through. Please use & l t ; and & g t ; for the angle brackets, or post your question on Stack Overflow (which has nicer formatting) with the “marklogic” tag.
August 15th, 2017 at 6:54 am
Hi Dave,
Below is the xml that was after transformation.
Vivek
ABC
California
John
Davis
Hinckley
Moesis
Chese
Atlanta
Data that should be converted into file is :
“Vivek”||”ABC”||”California”
“John”||”Davis”||”Hinckley”
“Moesis”||”Chese”||”Atlanta”
Thanks in advance.
Thanks,
Vivek.
August 15th, 2017 at 6:54 am
Vivek
ABC
California
John
Davis
Hinckley
Moesis
Chese
Atlanta
August 15th, 2017 at 6:56 am
Please use & l t ; and & g t ; for the angle brackets
October 3rd, 2018 at 12:30 pm
Nice Article. Thanks
May 30th, 2019 at 10:36 pm
Hi Dave,
Thanks for this post but still i am confused which one (text() or fn:string() or fn:data()) will give you high performance when you are dealing with billions of documents and you want get the xpath value from these documents.
May 31st, 2019 at 11:08 am
When I’m working against billions of documents, I start with cts or higher-level queries to narrow it down to a much smaller number, then use XPath to pull specific pieces of data out of those documents. At that point, the choice among string, text, and data is more about what I want from the data than about performance.
August 29th, 2019 at 1:55 am
Hi Dave,
I have one requirement where I need to check whether a “Dates overlap” error exists in the XML response below.
8Dates overlap