Putting CoffeeScript into XQuery Databases

One of the central challenges that vendors such as MarkLogic face right now is the limited number of XQuery developers currently in the marketplace, to the extent that it is becoming an impediment to adoption of these databases.

There may be an interesting solution to this conundrum: a delightful, albeit rather odd, language called CoffeeScript. For those unfamiliar with it, CoffeeScript is essentially a metalanguage: it is designed to be compiled into JavaScript rather than run natively, but it has a number of constructs that could just as readily be adapted for the kind of XML manipulation that XQuery excels at. Moreover, it is gaining a lot of traction in the JavaScript community, which means that if MarkLogic or a similar vendor were to build a CoffeeScript-to-XQuery compiler, it could potentially open up the MarkLogic engine to a broader programming base.

The following is a simple example of a CoffeeScript script that I modified from the CoffeeScript site:

# Map a day of the week to a sentence describing that day's activity.
activity = (day) ->
  go = (term) -> "I am #{term}."
  activities =
    work: "at work"
    relax: "relaxing"
    iceFishing: "ice fishing"
    bingo: "playing bingo"
    dancing: "dancing"
    church: "at church"
    bingoDay: "Fri"
  # The switch is an expression; its value is the function's return value.
  switch day
    when "Mon" then go activities.work
    when "Tue" then go activities.relax
    when "Thu" then go activities.iceFishing
    when "Fri", "Sat"
      if day is activities.bingoDay
        go activities.bingo
      else go activities.dancing
    when "Sun" then go activities.church
    else go activities.work

# Build the XML fragment by interpolating values into a block string.
build_message = (day, message) => """
<schedule day="#{day}">
<message>#{message}</message>
<activity>#{activity day}</activity>
</schedule>
"""

console.log build_message "Tue", "Come join me."

The construct

activity = (day) -> ...

defines a function using a Haskell-like mapping notation. Taken as a whole, the script writes the following message to the console:

<schedule day="Tue">
<message>Come join me.</message>
<activity>I am relaxing.</activity>
</schedule>
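
For readers more at home on the JavaScript side, the script above corresponds roughly to the TypeScript below. This is a hand-written equivalent for comparison, not the output of the CoffeeScript compiler, and the name buildMessage is simply a camel-cased stand-in for the message-building function. Note the explicit return statements, which the CoffeeScript version does not need because its switch and if are expressions.

```typescript
// Hand-written equivalent of the CoffeeScript example (not compiler output).
function activity(day: string): string {
  const go = (term: string): string => `I am ${term}.`;
  const activities = {
    work: "at work",
    relax: "relaxing",
    iceFishing: "ice fishing",
    bingo: "playing bingo",
    dancing: "dancing",
    church: "at church",
    bingoDay: "Fri",
  };
  switch (day) {
    case "Mon": return go(activities.work);
    case "Tue": return go(activities.relax);
    case "Thu": return go(activities.iceFishing);
    case "Fri":
    case "Sat":
      // Bingo on the designated bingo day, dancing otherwise.
      return day === activities.bingoDay
        ? go(activities.bingo)
        : go(activities.dancing);
    case "Sun": return go(activities.church);
    default: return go(activities.work);
  }
}

// Assemble the XML fragment line by line.
function buildMessage(day: string, message: string): string {
  return [
    `<schedule day="${day}">`,
    `<message>${message}</message>`,
    `<activity>${activity(day)}</activity>`,
    `</schedule>`,
  ].join("\n");
}

console.log(buildMessage("Tue", "Come join me."));
```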

What makes CoffeeScript intriguing as a front end for XQuery is that it is more formally declarative than JavaScript, which means that it can be more readily translated into XQuery-like structures. With very few exceptions, every statement in CoffeeScript is an expression, something that should feel very familiar to XQuery developers. It also includes a class structure that could map quite readily onto the modular approach used by XQuery. For instance, a cts search query might very well be rendered as:

cts::search xp#{//phrase}, cts::element-value-query qn#{ns:foo}, "myWord"

where xp#{} is a syntax for representing an XPath expression, and qn#{} represents a QName. Inline evaluation of content (either in CoffeeScript-mediated expressions or in interpolated XQuery expressions) makes creation of XML fragments trivial.
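
To make the marker idea a little more concrete, here is a minimal TypeScript sketch of the first step such a compiler might take: expanding xp#{} and qn#{} markers into XQuery source text. The function name expandMarkers is hypothetical, and emitting xs:QName("...") for a qn marker is an assumption about what the target code would look like; the marker syntax itself is only the proposal described above.

```typescript
// Hypothetical helper: expand xp#{...} and qn#{...} markers into XQuery text.
//   xp#{path}  -> the XPath expression, emitted verbatim
//   qn#{name}  -> xs:QName("name")   (assumed target form)
function expandMarkers(source: string): string {
  return source.replace(
    /\b(xp|qn)#\{([^}]*)\}/g,
    (_match: string, kind: string, body: string): string =>
      kind === "xp" ? body : `xs:QName("${body}")`
  );
}

const proposed =
  'cts::search xp#{//phrase}, cts::element-value-query qn#{ns:foo}, "myWord"';

console.log(expandMarkers(proposed));
```

A real compiler would still have to turn the implicit calls into parenthesized XQuery calls, presumably something along the lines of cts:search(//phrase, cts:element-value-query(xs:QName("ns:foo"), "myWord")); the sketch handles only the markers.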

Similarly, XQuery FLWOR expressions have corresponding constructs in CoffeeScript. For example:

eat = (food) -> xml:node("#{food}")
foods = ['broccoli', 'spinach', 'chocolate']
eat food for food in foods when food isnt 'chocolate'

would map to something like:

declare function local:eat($food as xs:string) as node() { text { $food } };
let $foods := ("broccoli", "spinach", "chocolate")
return
  for $food in $foods
  where $food != "chocolate"
  return local:eat($food)
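
As a rough illustration of how mechanical this mapping could be, the TypeScript sketch below turns a description of a CoffeeScript comprehension into the text of an XQuery FLWOR expression. Everything here is an assumption made for the sake of the example: the Comprehension shape, the emitFlwor name, and the decision to place user functions in the local: namespace. No such compiler exists.

```typescript
// Hypothetical description of a comprehension such as
//   eat food for food in foods when food isnt 'chocolate'
interface Comprehension {
  variable: string; // loop variable, e.g. "food"
  source: string;   // variable holding the sequence, e.g. "foods"
  call: string;     // function applied to each item, e.g. "eat"
  guard?: { op: "is" | "isnt"; literal: string };
}

// XQuery string literals escape a double quote by doubling it.
function xqString(value: string): string {
  return `"${value.replace(/"/g, '""')}"`;
}

// Emit the for / where / return clauses of a FLWOR expression.
function emitFlwor(c: Comprehension): string {
  const v = `$${c.variable}`;
  const lines = [`for ${v} in $${c.source}`];
  if (c.guard) {
    const op = c.guard.op === "is" ? "=" : "!=";
    lines.push(`where ${v} ${op} ${xqString(c.guard.literal)}`);
  }
  lines.push(`return local:${c.call}(${v})`);
  return lines.join("\n");
}

console.log(
  emitFlwor({
    variable: "food",
    source: "foods",
    call: "eat",
    guard: { op: "isnt", literal: "chocolate" },
  })
);
```

The sketch leaves out the let clause and the function declaration, and it sidesteps parsing entirely by starting from a ready-made structure; it is meant only to show that the when filter lines up with where and the loop body with return.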

Speaking of maps, maps (named hashes) are the one area of XQuery that isn’t consistently supported (there are forces in the W3C that have been trying to get them in as a proposal for XQuery 3, but the idea is controversial). However, most fourth-generation XQuery databases do support them, and, as the first example above illustrates, maps are intrinsic to CoffeeScript because they are intrinsic to JavaScript.

Certain aspects (such as typing) would need to be added to the CoffeeScript parser for XQuery, but this would be a relatively minor addition (XQuery is comparatively weakly typed, so types tend to rest lightly on the system). Similarly, certain JavaScript constructs would not be allowed in an XCoffeeScript variant, but in most cases these are not recommended in CoffeeScript either.

There are no efforts yet to write such a compiler, but, as mentioned, the benefits of doing so may very well spur someone to undertake the effort. The compilation step for CoffeeScript would need to be done only once, unless the script itself changes, and it would readily support (and could even optimize) inline evaluated XQuery code, since such code could be refactored as anonymous functions. Finally, it could hide a lot of the complexity associated with working with namespaces, something that can be especially off-putting for developers coming from the JavaScript side.

Overall, a CoffeeScript-to-XQuery compiler could help make XML processing attractive to the larger population of CoffeeScript developers, since from their perspective switching from JavaScript generation to XQuery generation may be as simple as changing a compiler switch. It opens up the set of potential modules that can be incorporated (indeed, such modules could be designed to work with multiple XQuery systems, enhancing their portability), and it increases the attractiveness of XQuery-based systems to organizations with strong web presences by simplifying the interfaces between the XML and JavaScript worlds.