XML::Sablotron - a Perl interface to the Sablotron XSLT processor
use XML::Sablotron qw (:all); Process(.....);
If you prefer an object approach, you can use the object wrapper:
$sab = new XML::Sablotron();
$sab->runProcessor($template_url, $data_url, $output_url,
\@params, \@arguments);
$result = $sab->getResultArg($output_url);
Note that the Process and SablotProcess functions are deprecated. See the USAGE section for more details.
This package is an interface to the Sablotron API.
Sablotron is an XSLT processor implemented in C++ based on the Expat XML parser.
To run this package, you need to download and install Sablotron from the http://www.gingerall.cz/charlie-bin/get/webGA/act/download.act page. Sablotron in turn requires the Expat XML parser (http://expat.sourceforge.net).
See Sablotron documentation for more details.
You do _not_ need to download any other Perl packages to run the XML::Sablotron package.
Since version 0.6, Sablotron supports a very useful subset of the DOM specification. You may access the parsed trees, modify them, process them and serialize them into files. The DOM trees do not depend on the processor object, so you may use them for data or stylesheet caching.
Generally, there are two ways to use Sablotron. The first (and simplest) is based on procedural calls; the second is based on an object-oriented interface.
Note that the original procedural interface is deprecated and should not be used.
There are two functions exported from the XML::Sablotron package: ProcessStrings and Process. As mentioned above, these functions are deprecated and shouldn't be used. Many Sablotron features, such as miscellaneous handlers and the DOM model, are not available through this interface. See the Exported Functions section for the usage of these functions.
There are two classes defined to deal with the Sablotron processor object.
XML::Sablotron::Processor is a class implementing an interface to
the Sablotron processor object. Multiple concurrent processors are
supported, so you may use Sablotron in multithreaded programs easily.
The implementation of this class contains a circular reference inside Perl
structures, which has to be broken by calling the _release method. Unless
you intend to modify the internals of this package, you don't need to
use this mechanism directly.
XML::Sablotron is often the only thing you need. It's a wrapper
around the XML::Sablotron::Processor object. The sole purpose of this class is to
keep track of the life cycle of the processor, so you don't have to deal with
reference counting inside the processor class. All calls to this class are
redirected to an inner instance of the XML::Sablotron::Processor object.
In addition to the interface of previous versions of XML::Sablotron, there are new interface methods, and we strongly recommend that you use them. Previous versions used the RunProcessor method, which was called with many parameters specifying XSL params, processed buffers and URLs. The new interface methods are more intuitive and, most importantly, they can process preparsed DOM documents as well as newly supplied ones.
New methods are:
See references for more.
Since release 0.60, the whole API uses a uniform naming convention. Names start with a lower-case letter, and the first letters of the following words are capitalized. Existing users don't have to panic, since the old names are kept for compatibility.
Since release 0.60, there is a new object (used internally in previous versions) that serves several tasks. In this Perl module it is represented by the XML::Sablotron::Situation package.
At this time the situation is used only for error tracking, but in future releases its usage will become quite extensive. (It will be used for all handlers etc.)
So far you don't have to use the Situation object for processing the data (and in many cases it is not even possible). There is one exception: if you use the DOM interface (the XML::Sablotron::DOM module), you have to create and use the situation object like this:
$situa = new XML::Sablotron::Situation;
ProcessStrings($template, $data, $result);
where...
This function returns the Sablotron error code.
This function provides a more general interface to Sablotron. You may find its usage a little bit tricky, but it offers a variety of ways to modify the Sablotron behavior.
Process($template_uri, $data_uri, $result_uri,
$params, $buffers, $result);
where...
The following example should make it clear.
Process("arg:/template", "arg:/data", "arg:/result",
undef,
["template", $template, "data", $data],
$result);
does exactly the same as
ProcessStrings($template, $data, $result);
Why is it so complicated? Please, see the Sablotron documentation for details.
This function returns the Sablotron error code.
This function is deprecated and no longer supported. See the description of object interface later in this document.
This function is deprecated and no longer supported. See the description of object interface later in this document.
The constructor of the XML::Sablotron object takes no arguments, so you can create a new instance simply like this:
$sab = new XML::Sablotron();
Add an argument to the processor. Almost nothing happens at the time
of the call, but the argument may be processed later by the process
method.
$sab->addArg($situa, $name, $data);
Add a DOM document to the processor. This document may be processed
later with the process call.
$sab->addArgTree($situa, $name, $doc);
Add an XSL parameter to the processor. The parameter may be accessed
later by the process call.
$sab->addParam($situa, $name, $value);
This function starts the XSLT processing over the formerly specified
data. Data are added to the processor using addArg, addArgTree
and addParam methods.
$sab->process($situa, $template_uri, $data_uri, $result_uri);
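The calls above can be combined roughly as follows. This is only a sketch, assuming that both the stylesheet and the data are held in memory and addressed through the arg: scheme. The transform helper, the sample stylesheet and the greeting parameter are invented for illustration, and the module is loaded lazily so that the helper can be defined even where Sablotron is not installed.

```perl
use strict;
use warnings;

# Illustrative stylesheet and input document (not part of the distribution).
my $template = <<'XSL';
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="greeting"/>
  <xsl:template match="/">
    <xsl:value-of select="$greeting"/>, <xsl:value-of select="/name"/>
  </xsl:template>
</xsl:stylesheet>
XSL
my $data = '<?xml version="1.0"?><name>world</name>';

# Hypothetical helper: run one transformation over in-memory buffers.
sub transform {
    my ($template, $data, %params) = @_;
    require XML::Sablotron;                      # loaded on first use
    my $situa = XML::Sablotron::Situation->new();
    my $sab   = XML::Sablotron->new();
    $sab->addArg($situa, 'template', $template); # becomes arg:/template
    $sab->addArg($situa, 'data', $data);         # becomes arg:/data
    $sab->addParam($situa, $_, $params{$_}) for sort keys %params;
    $sab->process($situa, 'arg:/template', 'arg:/data', 'arg:/result');
    return $sab->getResultArg('result');
}

# Only attempt the transformation when the module is actually available.
if (eval { require XML::Sablotron; 1 }) {
    print transform($template, $data, greeting => 'Hello'), "\n";
}
```

Keeping the Situation object in a variable also makes it available for error inspection if the process call fails.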
runProcessor is the older method, analogous to the Process
function. You may find it useful, but the use of the process
method is recommended.
$code = $sab->runProcessor($template_uri, $data_uri, $result_uri,
$params, $buffers);
where...
URIs passed to this function may be from schemes supported internally (file:, arg:) or from any scheme handled by a registered handler (see the HANDLERS section).
Note the difference between the RunProcessor method and the Process function. RunProcessor doesn't return the output buffer ($result parameter is missing).
To obtain the result buffer(s) you have to call the getResultArg method.
Example of use:
$sab->runProcessor("arg:/template", "arg:/data", "arg:/result",
undef,
["template", $template, "data", $data] );
Call this function to obtain the result buffer after processing. The goal of this approach is to enable multiple output buffers.
$result = $sab->getResultArg($output_url);
This method returns a desired output buffer specified by its url. Specifying the ``arg:'' scheme in URI is optional.
The runProcessor example shown earlier would continue like this:
$return = $sab->getResultArg("result");
$sab->freeResultArgs();
This call frees up all output buffers allocated by Sablotron. You do not have to call this function as these buffers are managed by the processor internally.
Use this function to release huge chunks of memory while an instance of processor stays idle for a longer time.
Set an external handler of a particular type. The processor can use the handler for miscellaneous tasks such as log and error hooking.
For more details on handlers see the HANDLERS section of this document.
There are two ways to call the regHandler method:
$sab->regHandler($type, $handler);
where...
The second way allows you to create anonymous handlers defined as a set of function calls:
$sab->regHandler($type, { handler_stub1 => \&my_proc1,
handler_stub2 => \&my_proc2.... });
This form is very simple, but it does not allow you to unregister the handler later.
For the detailed description of handler interface see the Handlers section.
$sab->unregHandler($type, $handler);
This method unregisters a registered handler.
Remember that anonymously registered handlers can't be unregistered.
$sab->setEncoding($encoding);
Calling this method has no effect on processing by itself. It is useful for a miscellaneous handler, which may store the received value together with the processor instance.
$sab->setContentType($content_type);
Calling this method has no effect on processing by itself. It is useful for a miscellaneous handler, which may store the received value together with the processor instance.
$sab->setOutputEncoding($encoding);
This method overrides the encoding specified in the <xsl:output> instruction. It makes it possible to produce differently encoded outputs using one template.
$sab->setBase($base_url);
Call this method to make the processor use $base_url as the base URI when
resolving any relative URI within a data document or a template.
$sab->setBaseForScheme($scheme, $base);
Like setBase, but the given base URL is used only for the specified scheme.
$sab->setLog($filename, $level);
This method sets the log file name and the log level. See Messages handler - overview for details on log levels.
$sab->clearError();
This method clears the last internal error of the processor.
Sablotron performs almost all operations in a special context used for error tracing. This is useful for multithreaded programming or if you need to call Sablotron in a reentrant way.
The price you pay for this is the need to specify the context in many calls. Using DOM access to Sablotron structures requires it for almost every call.
The XML::Sablotron::Situation object represents the execution
context.
For example, if you want to create a new DOM document, you have to do the following:
$situa = new XML::Sablotron::Situation();
$doc = new XML::Sablotron::DOM::Document(SITUATION => $situa);
The situation object supports several methods you may use if you want more details on an error that occurred.
(Note: In upcoming releases the Situation object will be used for more tasks like handler registering etc.)
$sit->setOptions($options);
Controls some processing features. The $options parameter may be any combination of the following constants:
Returns the last error code.
Returns the string characterizing the last occurred error.
Returns an array reference with several details on the most recent error. See the example:
$arr = $situa->getExceptionDetails();
($code, $message, $uri, $line) = @$arr;
Currently, Sablotron supports three flavors of handlers.
At the moment, the XML::Sablotron extension supports only the first two of them.
The call-back functions implementing handlers have different signatures (not prototypes in the Perl sense), but the first two parameters are always the same:
The goal of this handler is to deal with all messages produced by a processor.
Each state reported by the processor is composed of the following data:
Each reported event falls into one of predefined categories, which define the event level. The valid levels include:
The numbers in the parentheses are the internal level codes.
To define a messages handler, you have to define the following functions (or methods, depending on kind of registration, see RegHandler).
To understand parameters of this call see: Messages handler - overview
A very simple message handler could look like this:
sub myMHMakeCode {
    my ($self, $processor, $severity, $facility, $code) = @_;
    return $code; # I can deal with internal numbers
}
sub myMHLog {
    my ($self, $processor, $code, $level, @fields) = @_;
    # LOGHANDLE must be a filehandle that is already open for writing
    print LOGHANDLE "[Sablot: $code]\n" . (join "\n", @fields, "");
}
sub myMHError {
    myMHLog(@_);
    die "Dying from Sablotron errors, see log\n";
}
$sab = new XML::Sablotron();
$sab->regHandler(0, { MHMakeCode => \&myMHMakeCode,
                      MHLog => \&myMHLog,
                      MHError => \&myMHError });
That's all, folks.
One of the great features of Sablotron is its support for scheme
handlers. This feature allows you to reference data from any URL
scheme. Every time the processor is asked for some URI
(e.g. using the document() function), it looks for a handler
which can resolve the required document.
Sablotron first asks the handler for the whole document at once. If the handler refuses this request, Sablotron ``opens'' a connection to the handler and tries to read the data ``per partes'' (piece by piece).
A handler can be used for the output buffers as well, so this mechanism also supports the ``put'' method.
If you're going to use the second way (giving chunks of the document), simply
don't implement this function or return the undef value from it.
The $scheme parameter holds the scheme extracted from a URI; $rest holds the rest of the URI.
$handle is the value previously returned from the SHOpen function.
Return the undef value to say ``No more data''.
See the test script (test.pl) included in this distribution.
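As a rough illustration of the chunked ("per partes") mode described above, the following sketch serves documents from an in-memory hash. Several things here are assumptions rather than facts taken from this text: the callback names SHGetAll, SHGet and SHClose, the $size parameter of the read callback, returning undef from the open callback on failure, and the numeric handler type 1 for scheme handlers. The mem: store and all my* names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical in-memory store: rest of the URI => document text.
my %documents = (
    'greeting.xml' => '<?xml version="1.0"?><greeting>hello</greeting>',
);
my %open;             # handle => { text => ..., pos => ... }
my $next_handle = 1;

# Refuse the "whole document at once" request to force chunked reading.
sub mySHGetAll {
    my ($self, $processor, $scheme, $rest) = @_;
    return undef;
}

# Open a document; the returned handle is passed to later calls.
sub mySHOpen {
    my ($self, $processor, $scheme, $rest) = @_;
    (my $key = $rest) =~ s{^/+}{};    # tolerate "/name" as well as "name"
    return undef unless exists $documents{$key};   # assumed failure value
    my $handle = $next_handle++;
    $open{$handle} = { text => $documents{$key}, pos => 0 };
    return $handle;
}

# Return the next chunk of at most $size characters, or undef at the end.
sub mySHGet {
    my ($self, $processor, $handle, $size) = @_;
    my $doc = $open{$handle} or return undef;
    return undef if $doc->{pos} >= length $doc->{text};
    my $chunk = substr $doc->{text}, $doc->{pos}, $size;
    $doc->{pos} += length $chunk;
    return $chunk;
}

# Forget the handle once the processor is done with it.
sub mySHClose {
    my ($self, $processor, $handle) = @_;
    delete $open{$handle};
    return 1;
}

# Register only when the module is available; type 1 is an assumption.
if (eval { require XML::Sablotron; 1 }) {
    my $sab = XML::Sablotron->new();
    $sab->regHandler(1, { SHGetAll => \&mySHGetAll, SHOpen  => \&mySHOpen,
                          SHGet    => \&mySHGet,    SHClose => \&mySHClose });
}
```

The callbacks keep their state in a plain hash keyed by handle, so several documents can be open at the same time.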
Sablotron supports both physical (file, buffer) and event-based output methods. "SAX handler" is a slightly confusing name, because the events produced by the engine differ somewhat from 'real' SAX events; think of this feature as a SAX-like handler.
You may set this handler if you want to catch output events and process them as you wish. Note that the XML::SAXDriver::Sablot and XML::SAXFilter::Sablot Perl modules are available, so you don't need to deal with the SAX-like handler if you want to use Sablotron in standard SAX chains.
This handler was introduced in version 0.42 and may be subject to change in the near future. To avoid a namespace collision with the message handler, the miscellaneous handler uses the prefix 'XS' (as in extended features).
$contentType holds the value of the ``media-type''
attribute, $encoding holds the value of the ``encoding'' attribute.
Return value of this callback is discarded.
Suppose a template like this:
<?xml version='1.0'?> ... <xsl:output media-type="text/html" encoding="iso-8859-2"/> ...
In this case the XSDocumentInfo callback function is called with the values ``text/html'' and ``iso-8859-2''.
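A callback of this kind might be sketched as follows. The sketch assumes the common first two parameters ($self, $processor) followed by $contentType and $encoding, and it assumes that the miscellaneous handler is registered with the numeric type 3. Neither the type code nor the %doc_info store comes from this text, so treat both as hypothetical.

```perl
use strict;
use warnings;

# Hypothetical place to remember what the stylesheet declared.
my %doc_info;

# Called with the media-type and encoding taken from <xsl:output>.
sub myXSDocumentInfo {
    my ($self, $processor, $contentType, $encoding) = @_;
    %doc_info = (content_type => $contentType, encoding => $encoding);
    return;    # the return value is discarded by the processor
}

# Register only when the module is available; type 3 is an assumption.
if (eval { require XML::Sablotron; 1 }) {
    my $sab = XML::Sablotron->new();
    $sab->regHandler(3, { XSDocumentInfo => \&myXSDocumentInfo });
}
```

A web application could, for instance, use the stored values to build its Content-Type response header after the transformation has finished.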
This package is subject to the MPL (or the GPL alternatively).
The same licensing applies for Sablotron.
Pavel Hlavnicka; pavel@gingerall.cz
perl(1).