Redirecting responses from the rewriter in MarkLogic
Author: Dave Cassel | Category: Software Development
Update: I got some feedback on this post that amounted to “Wha???” Rereading it, I see that I really went into the esoteric for this one. After spending some time dealing with a very specific problem, I posted the solution, but it’s a problem you’re not very likely to encounter. If you do and you’re using MarkLogic’s built-in REST Rewriter, Section 18.8 of the Application Developer’s Guide, Handling Redirects, is probably what you’re looking for. This post is specific enough that I thought about just taking it down, but what the heck, maybe it will help someone after all.
I recently learned something the hard way, so hopefully this saves someone some time. I was working in a rewrite module of a MarkLogic HTTP application server and wanted to call xdmp:redirect-response(). The rewrite module has to return a string, so I returned the URL that I was redirecting to. The thing is, in this particular case, the redirected URL wasn’t a real URL; it was something that would need another pass through the rewriter before it got to actual code. That may sound weird, so let me illustrate what I mean. (Note that while the example is specific to the Roxy framework, the lesson applies to any MarkLogic rewrite module.)
An Example
Roxy already lets you use URLs like /book/view?id=1, which calls the view function in the book controller module. Roxy’s normal rewriter does this by turning that URL into /roxy/query-router.xqy?controller=book&func=view&id=1. Now suppose you wanted to use a URL that looks like this instead: /book/1. If the request is a GET, you want to call the view function, but if the request is a PUT, you want to call the book controller’s update function. Either way, the request eventually has to reach query-router.xqy (or update-router.xqy if the request will make updates). The rewrite mechanism in Roxy lets you do it two different ways. Method 1:
<request uri="^/book/(\d+)" redirect="/book/view">
  <uri-param name="id">$1</uri-param>
  <http method="GET"/>
  <http method="HEAD"/>
</request>
Pretty simple: in comes “/book/1”, and the rewriter module redirects that to “/book/view?id=1”. The thing is, “/book/view?id=1” also needs to be rewritten; it doesn’t point to executable code yet. Here’s method 2:
<request uri="^/book/(\d+)" endpoint="/roxy/query-router.xqy">
  <uri-param name="controller">book</uri-param>
  <uri-param name="func">view</uri-param>
  <uri-param name="id">$1</uri-param>
  <http method="GET"/>
</request>
The advantage of the first is simplicity in the configuration; as a developer, you don’t need to pay attention to what the framework does under the hood. The advantage to the second is that no redirect is necessary; it’s just a rewrite of the URL, so the browser doesn’t change what it shows. I like the second way better, but with Roxy, we try to be flexible.
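For the PUT half of the example, a route in the style of method 2 might look like the sketch below. It is written as an XQuery element constructor so it could sit in a routes configuration. The assumptions here are mine: that update-router.xqy accepts the same controller, func, and id parameters as query-router.xqy, and that the book controller's function is named update. Treat it as an illustration, not as a route checked against a particular Roxy release.

```xquery
xquery version "1.0-ml";

(: Hypothetical PUT counterpart to the GET route in method 2.
   Assumes update-router.xqy takes the same parameters as query-router.xqy. :)
<request uri="^/book/(\d+)" endpoint="/roxy/update-router.xqy">
  <uri-param name="controller">book</uri-param>
  <uri-param name="func">update</uri-param>
  <uri-param name="id">$1</uri-param>
  <http method="PUT"/>
</request>
```

Because this is a rewrite straight to an endpoint, no redirect is involved and the problem described in the rest of this post never comes up.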
Lesson Learned
So here’s what I learned: whatever string you return from the rewriter must correspond to a main module; it can’t just be a fake URL or one that needs another round of interpretation. For the particular case I was dealing with, the solution turned out to be a no-op module: a main module that simply has an empty sequence as the only expression to be evaluated. So my rewriter called xdmp:redirect-response() with the URL that I really wanted to go to, then returned the URL of the no-op module.
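To make that concrete, here is a minimal sketch of a rewrite module that follows this pattern. It is not Roxy's actual rewriter; the hard-coded /book/ pattern and the pass-through fallback are simplifications of my own, and /roxy/no-op.xqy is assumed to be deployed as a real main module.

```xquery
xquery version "1.0-ml";

(: Sketch of a rewrite module that redirects and then returns a no-op module.
   Not Roxy's real rewriter; the /book/ pattern is hard-coded for illustration. :)

let $path := xdmp:get-request-path()
return
  if (fn:matches($path, "^/book/\d+$")) then
    let $id := fn:replace($path, "^/book/(\d+)$", "$1")
    return (
      (: Send the browser to a URL that will itself need rewriting on the next request. :)
      xdmp:redirect-response(fn:concat("/book/view?id=", $id)),
      (: The rewriter still has to return the URL of a module that exists. :)
      "/roxy/no-op.xqy"
    )
  else
    (: Anything else passes through unchanged. :)
    xdmp:get-request-url()
```

xdmp:redirect-response() returns the empty sequence, so the only item the module hands back in the redirect branch is the no-op module's URL.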
What really threw me is that the rewriter already had an example of doing this redirecting that worked fine, even though it didn’t point to a module. The rewrite rule was that if you asked for “/test”, you were redirected to “/test/”. Having learned that I couldn’t return the URL I was redirecting to (since it didn’t point to a module), I needed to return something else. I didn’t want to hard-code the “/test/” string that gets returned when doing a redirect, because in Roxy, the test directory doesn’t always exist. So I tried returning “/app/” instead, which should always exist. That didn’t work. After a little thought, I realized that the test directory has a default.xqy module in it, while the app directory does not. If you send a request to MarkLogic that is just a directory, it will look for default.xqy in that directory. When the rewriter returned “/test/”, MarkLogic found “/test/default.xqy” and was happy. When the rewriter returned “/app/” and there was no “/app/default.xqy”, MarkLogic was not happy. For my situation, “/roxy/no-op.xqy” became the URL returned when doing a redirect.
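A no-op module along those lines can be as small as this. The comment is mine; the only thing that matters is that the module exists and its body is the empty sequence.

```xquery
xquery version "1.0-ml";

(: no-op: the rewriter has already set up the redirect, so there is nothing to do here. :)
()
```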
Of course, rewriting in a single step is nicer in a lot of ways, but allowing the redirect simplifies the rewriting rules that Roxy allows developers to write.
Tags: gotcha, marklogic, rewriter, roxy
October 21st, 2016 at 12:57 am
Hi David,
After upgrading from MarkLogic 7 to MarkLogic 8, our website links are not working. I understand that rewriter.xml has some issue. Can you guide me in resolving this issue?
Thanks,
Kumar
October 21st, 2016 at 7:18 am
MarkLogic 7 used /MarkLogic/rest-api/rewriter.xqy; MarkLogic 8+ uses /MarkLogic/rest-api/rewriter.xml. Since Roxy’s rewriter handles REST API endpoints as well as its own, it had to change. If you update to the latest Roxy, that should fix your problem (“./ml upgrade” from within your project directory). If not, Stack Overflow provides a good place to ask where you can provide more details about your error and get more eyes on the problem.
October 24th, 2016 at 9:04 am
Hi David,
If we update to the latest Roxy, will it have any impact on the existing MarkLogic 8 content/application?
October 24th, 2016 at 11:12 am
kumar, MarkLogic 8.0-6 has a change to Query Console that breaks Roxy’s bootstrapping process. The problem has already been fixed on the dev branch, and we’ll be doing a new release very soon (likely this week) to bring that fix onto the master branch.