Sablotron: Future Directions - Technical

XSLT 1.0 Specification

Result tree fragments

We are currently less restrictive about result tree fragments (RTFs) than we should be. RTFs are allowed as parts of a location path, which the spec forbids. We may keep this functionality, but users should be able to enable or disable the extension with an option.

The id() function

The id() function is not supported at all. Supporting it requires improved parsing of input files. The structures used for the key() function could be reused to some extent.
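As a rough illustration of the kind of index id() needs, the following C++ sketch maps ID values to elements and resolves a whitespace-separated list of IDs. The Element struct, the IdIndex alias, and the lookupIds function are hypothetical names, not part of Sablotron, and the sketch assumes the parser has already collected the ID-typed attributes.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an element node of the source tree.
struct Element {
    std::string name;
};

// Hypothetical index from ID value to element, similar in spirit
// to the lookup structures a key() implementation would keep.
using IdIndex = std::map<std::string, const Element*>;

// Resolves a whitespace-separated list of IDs, as id('a b') would.
// Unknown IDs are skipped. The result is not deduplicated or put
// into document order, which a real node-set would require.
std::vector<const Element*> lookupIds(const IdIndex& index,
                                      const std::string& idList) {
    std::vector<const Element*> found;
    std::istringstream in(idList);
    std::string id;
    while (in >> id) {
        IdIndex::const_iterator it = index.find(id);
        if (it != index.end()) {
            found.push_back(it->second);
        }
    }
    return found;
}
```

Filling the index is the part that depends on the improved parsing; the lookup itself is simple once the index exists.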

xsl:sort and @case-order

This attribute is not supported at all. The sort order depends on libc (or whichever standard library is in use). The implementation might be a bit tricky.
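One possible approach, sketched below in C++, is to compare keys case-insensitively first and to use the case only as a tie-breaker. The CaseOrder enum and both functions are hypothetical names, and the sketch handles ASCII letters only; real text would need locale-aware collation, which is where the tricky part lies.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical enum mirroring the two values of xsl:sort/@case-order.
enum class CaseOrder { UpperFirst, LowerFirst };

// Returns true if a sorts before b. ASCII only.
bool caseOrderLess(const std::string& a, const std::string& b,
                   CaseOrder order) {
    const std::size_t n = std::min(a.size(), b.size());
    // Primary comparison ignores case.
    for (std::size_t i = 0; i < n; ++i) {
        int ca = std::tolower(static_cast<unsigned char>(a[i]));
        int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb) return ca < cb;
    }
    if (a.size() != b.size()) return a.size() < b.size();
    // Tie-break on the first position where only the case differs.
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            bool aUpper = std::isupper(static_cast<unsigned char>(a[i])) != 0;
            return order == CaseOrder::UpperFirst ? aUpper : !aUpper;
        }
    }
    return false;  // identical strings
}

// Sorts text keys, keeping equal keys in their original order.
std::vector<std::string> sortWithCaseOrder(std::vector<std::string> keys,
                                           CaseOrder order) {
    std::stable_sort(keys.begin(), keys.end(),
                     [order](const std::string& x, const std::string& y) {
                         return caseOrderLess(x, y, order);
                     });
    return keys;
}
```

With upper-first, the keys b, a, B, A come out as A, a, B, b; with lower-first they come out as a, A, b, B.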

Forward compatible processing

Section 2.5 of the XSLT 1.0 specification describes how forwards-compatible processing should work. Sablotron is not yet fully compliant with this section.

Embedded stylesheets

Embedded stylesheets are not supported at all. Supporting them requires improved parsing and some changes to the core engine.

Miscellaneous

External functions support - plug-ins

This seems to be the most severe issue related to the processor architecture, at least from our point of view. Currently, it's relatively easy to implement extension elements, and it doesn't make much sense to introduce a generic interface to define custom extension elements (although even this can be considered).

More problems are related to extension functions. Sablotron embeds a JavaScript engine, so functions implemented in JS are integrated seamlessly. Unfortunately, this is not true for other functions, whether hard-wired ones (delivered with Sablotron) or custom ones. What is missing is recognition of functions at expression parse time and a mechanism for marshaling parameters and results.

We're not sure whether there is a real need for a dynamic plug-in architecture, where functions could be registered with the engine at run time, or whether some kind of build-time support is enough. We currently tend to prefer the second option.
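To make the two missing pieces concrete, here is a minimal C++ sketch of a function table that supports parse-time recognition and a simple marshaling type for arguments and results. Everything in it (Value, ExtFunction, FunctionTable, buildDefaultTable) is a hypothetical name chosen for illustration and does not describe Sablotron's actual internals. The table is filled at build time, in line with the second option above.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical marshaling type for arguments and results.
struct Value {
    enum Type { Number, String };
    Type type;
    double number;
    std::string text;

    static Value fromNumber(double n) { return Value{Number, n, std::string()}; }
    static Value fromString(const std::string& s) { return Value{String, 0.0, s}; }
};

using ExtFunction = std::function<Value(const std::vector<Value>&)>;

class FunctionTable {
public:
    void add(const std::string& qname, std::size_t arity, ExtFunction fn) {
        table_[qname] = Entry{arity, std::move(fn)};
    }

    // Parse-time check: is the name known with this argument count?
    bool recognizes(const std::string& qname, std::size_t argCount) const {
        std::map<std::string, Entry>::const_iterator it = table_.find(qname);
        return it != table_.end() && it->second.arity == argCount;
    }

    // Evaluation-time call with marshaled arguments.
    Value call(const std::string& qname, const std::vector<Value>& args) const {
        std::map<std::string, Entry>::const_iterator it = table_.find(qname);
        if (it == table_.end() || it->second.arity != args.size()) {
            throw std::invalid_argument("unknown function or wrong arity: " + qname);
        }
        return it->second.fn(args);
    }

private:
    struct Entry {
        std::size_t arity;
        ExtFunction fn;
    };
    std::map<std::string, Entry> table_;
};

// Build-time registration: the set of functions is fixed in the source.
FunctionTable buildDefaultTable() {
    FunctionTable table;
    table.add("math:abs", 1, [](const std::vector<Value>& args) {
        return Value::fromNumber(std::fabs(args[0].number));
    });
    return table;
}
```

A run-time plug-in architecture would expose something like add() to client code, while the build-time variant keeps it internal.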

JavaScript extensions

Generating the output

It has been suggested that output be generated directly from JS code. The idea is both interesting and scary: it completely breaks the functional, side-effect-free character of XSLT, and it raises many technical issues. For more details, see the record in Bugzilla.

XPath evaluation

It would be nice to be able to evaluate XPath expressions from JavaScript. This would be handy for many things; see Dynamic under EXSLT functions.

Parameters

Encoding

This is the most frequent complaint. There is no way to pass parameters in an encoding other than UTF-8. The API should be extended to allow other encodings.
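Until the API accepts other encodings, a caller can transcode parameter values to UTF-8 before passing them in. The sketch below does this for ISO-8859-1 input only, where each byte maps directly to the Unicode code point of the same value. The function name latin1ToUtf8 is hypothetical; other encodings would need a conversion library such as iconv.

```cpp
#include <string>

// Converts ISO-8859-1 bytes to UTF-8. Bytes below 0x80 are copied
// unchanged; bytes 0x80..0xFF become a two-byte UTF-8 sequence.
std::string latin1ToUtf8(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```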

Typing

All parameters are currently handled as strings. There is no way to pass in numbers, booleans or result tree fragments. The API should be extended to allow typed parameters to be passed.
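The following C++ sketch shows why the type matters, using the XPath 1.0 rules for converting a value to a boolean: a string is true whenever it is non-empty, so the string "false" tests as true, while a real boolean false does not. The Param struct and the helper functions are hypothetical and only illustrate what a typed parameter could carry; result tree fragments are left out.

```cpp
#include <cmath>
#include <string>

// Hypothetical typed parameter value.
struct Param {
    enum Kind { String, Number, Boolean };
    Kind kind;
    std::string text;
    double number;
    bool flag;
};

Param makeString(const std::string& s) { return Param{Param::String, s, 0.0, false}; }
Param makeNumber(double n) { return Param{Param::Number, std::string(), n, false}; }
Param makeBoolean(bool b) { return Param{Param::Boolean, std::string(), 0.0, b}; }

// XPath 1.0 boolean() conversion for the three simple types.
bool toBoolean(const Param& p) {
    switch (p.kind) {
        case Param::String:
            return !p.text.empty();  // any non-empty string is true
        case Param::Number:
            return p.number != 0.0 && !std::isnan(p.number);
        case Param::Boolean:
            return p.flag;
    }
    return false;
}
```

A stylesheet that tests a parameter with xsl:if therefore behaves differently depending on whether the caller could pass a boolean or only the string "false".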

C++ wrapper

There is no 'official' C++ class wrapping the Sablotron API. C++ programmers would probably appreciate one.

EXSLT functions

As stated in External functions support, the most critical issue related to extension functions is to provide a somewhat more generic architecture for the expression evaluator. The following points depend on this issue.

JS implementations of EXSLT functions can be used whenever acceptable, but it also makes sense to hardwire some of the most common functions.

Dates and times

We suppose that this functionality can easily be covered with JavaScript functions.

Dynamic

Functions like closure, evaluate, min, max, sum and map could be implemented natively over the existing code, or in JavaScript (provided the 'XPath evaluation' issue is resolved).

Common

The node-set() function is trivial (see Result tree fragments). The same is true for the object-type() function. The exsl:document element is already supported.

Functions

JavaScript functions are supported.

The exsl:function and exsl:result elements can be implemented over the existing code, though it would require some effort.

Math

This can be implemented in JavaScript easily.

Regular expressions

This can be implemented in JavaScript easily.

Sets

These functions could be implemented in JavaScript, but we'd need to add some methods (is-same-node()) and properties (document-order) to the Node object. It shouldn't be too painful.
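The C++ sketch below illustrates why these two additions are enough for a function such as set:difference: node identity decides membership, and document order decides the order of the result. It is written in C++ only to show the logic; the Node struct with its docOrder field and the function names are hypothetical, and a JavaScript version would use the corresponding method and property on the Node object.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical node with a precomputed document-order position.
struct Node {
    int docOrder;
};

// Node identity, not value equality.
bool isSameNode(const Node* a, const Node* b) {
    return a == b;
}

// Nodes of a that are not in b, returned in document order.
std::vector<const Node*> difference(const std::vector<const Node*>& a,
                                    const std::vector<const Node*>& b) {
    std::vector<const Node*> out;
    for (std::vector<const Node*>::const_iterator it = a.begin(); it != a.end(); ++it) {
        const Node* n = *it;
        bool inB = std::any_of(b.begin(), b.end(),
                               [n](const Node* m) { return isSameNode(n, m); });
        if (!inB) out.push_back(n);
    }
    std::sort(out.begin(), out.end(),
              [](const Node* x, const Node* y) { return x->docOrder < y->docOrder; });
    return out;
}
```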

Strings

All functions returning strings can easily be implemented in JavaScript. This is not true for the split() and tokenize() functions, as they return node-sets.

Returning node-sets is problematic because of an incompatibility with the internal context implementation. We would need to introduce a reference counting model for contexts and fake nodes in order to create node-sets containing nodes that don't belong to any of the processed trees.
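As an illustration of the reference counting idea, the C++ sketch below has a tokenize function return a node-set of free-standing nodes that are kept alive by std::shared_ptr rather than by a source or result tree. The standard library's shared_ptr is used here as a stand-in for whatever counting scheme the engine would adopt. FakeNode, NodeSet, and this tokenize function are hypothetical, and the function only splits on whitespace.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical "fake" element node that belongs to no processed tree.
struct FakeNode {
    std::string name;
    std::string text;
};

// A node-set that shares ownership of its nodes. A node is freed
// when the last node-set (or context) referring to it goes away.
using NodeSet = std::vector<std::shared_ptr<const FakeNode> >;

// Splits the input on whitespace and wraps each piece in a
// free-standing <token> node.
NodeSet tokenize(const std::string& input) {
    NodeSet result;
    std::istringstream in(input);
    std::string word;
    while (in >> word) {
        result.push_back(
            std::make_shared<const FakeNode>(FakeNode{"token", word}));
    }
    return result;
}
```

Copying the returned node-set only increments the counters, so a context can hold on to the nodes without any tree owning them.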