Honza Jirousek, honza@ecn.cz
Version: DRAFT 2000.08.07.02
August 7, 2000
This section discusses roles and relations of XML application server, Charlie agent and end-user interface (browser) in a Charlie environment. We show several typical Charlie network setup examples.
Charlie is neither an XML application server framework (though it can substitute for one in some simple cases) nor an end-user interface product. Charlie is designed to filter, control and cache the traffic between the client and the server. The XML application server will most likely be an XML-enabled web application (a web application returning XML pages instead of HTML), but it can be any type of network application with an XML interface (XML-RPC, SOAP, XML database, CORBA application etc). The end-user interface will typically be provided by an HTML browser (anything from text-based Lynx to IE5 or Netscape 6 DHTML), but it can also be a WAP/WML browser or some XML-driven UI platform. Charlie is independent of the server and client platforms and of the programming language or framework of the server-side XML application.
Charlie itself is a universal agent, not specific to any particular application. A single Charlie agent installed on an end-user's computer or a proxy server can support any Charlie-aware XML application (we will use the term ch-application throughout this text). As in traditional web application methodologies, ch-applications are defined and driven entirely from the server. Authors of XML applications can build in some level of Charlie support, or structure the application entirely for Charlie, and get in return some or all of the advantages Charlie provides. The most important of them include:
Charlie architecture is based on emerging industry standard building blocks, such as XML for data/document description and transport formats, XSLT for UI template definitions, JavaScript for secure client-side code execution. Charlie architecture is also very modular and it is possible to add/replace functionality, e.g. different template or scripting language support, in a plug-in module form. Note, though, that this breaks the "universal agent" model somewhat - a versioning system will be put in place to clearly mark extensions.
While we tend to use terms like Charlie agent, Charlie server, Charlie proxy etc, it is important to understand there is always only one Charlie module between the client browser (or another UI platform) and the server (an XML application). There is also only one type of Charlie agent, but it can be set up in several networking configurations - we call the code implementing a particular networking context for Charlie a handler. The Charlie processing model is always the same, but the communication path between the client, Charlie and the server, the data flows, and the effect of caching differ. All of these network contexts are totally transparent to the ch-application design and code. The author of a ch-application does not have to think about the way Charlie is deployed: all ch-applications will work in all setups (unless explicitly restricted).
Below are examples of several typical scenarios.
1) A Charlie module installed on the XML application server itself can support multiple clients with plain browsers. An author of an XML application can take advantage of multiple UI template management and of the separation of application control logic from data generation. The caching features of Charlie are not of much interest here (they can be disabled). The client will communicate with Charlie through the HTTP protocol; the handler will take the form of either an Apache module or a set of CGI/mod_perl scripts. Charlie will communicate with the XML application either through local HTTP connections, or through Apache internal calls and direct file access where possible.
2) A variant of the previous case is a distributed server-side environment, where Charlie runs on a dedicated server (or perhaps a load-balanced server farm), communicating with a dedicated XML application server (and perhaps with other servers dedicated to serving static XML documents and XSLT templates) through HTTP over a LAN. Some of Charlie's caching features become interesting here.
3) A Charlie agent installed on the client computer. Most likely only a single user will work with a given Charlie agent, but there will be multiple client-side Charlie agents talking to one XML application server. In this scenario all Charlie features can be leveraged. The handler code can take the form of either a local HTTP server (likely a single-purpose Charlie http daemon) or a local Charlie-enabled HTTP proxy server. In the long run Charlie can also become an integral part of some browsers, e.g. as a Mozilla plug-in module. In fact one of the major goals of Ginger Alliance is to build Charlie into various embedded platforms and handheld devices.
4) A variant of the previous case is a LAN proxy setup - several clients on a LAN (or a campus network) share a single Charlie proxy server. This has most of the features of a client-side Charlie deployment, but less administrative overhead.
A straightforward variant of this case would be a "Charlie WAP gateway", where the transport protocol between client browser and Charlie is WAP. Note that this is not the only way to deploy Charlie in WAP networks - WAP clients can also talk to WML enabled HTTP Charlie servers through a generic WAP/HTTP gateway.
5) In most Charlie deployments the cases above will be mixed into one environment, where some clients with a local Charlie agent talk directly to the XML application server, while others use a server-side Charlie module. Because Charlie network contexts are transparent to ch-applications, no changes on the server side are needed, and this setup can be used for various migration and deployment scenarios.
6) A single ch-application can also integrate and dispatch several XML applications, or perhaps their pieces based on different technologies. In such a case a single Charlie module can talk to multiple servers, perhaps through different communication protocols.
This section introduces main concepts of Charlie application design. We explain how Charlie uses and manages the separation of presentation, data generation, and application logic. We introduce Charlie actions, and show some examples of their use.
The concept of presentation and content separation is well known to any seasoned web application designer - it is featured (at least to certain extent) in popular tools like ASP, EmbPerl, Cocoon and others. Aside from providing a clean application design methodology it also allows:
Charlie takes this approach to the extreme and totally separates the system providing the data/document content from the UI definitions (we call them templates) and their processing. XML provides a convenient and flexible interface between the two. Generating and serving the XML data/documents is not Charlie's concern; that is left to an independent XML application server. Charlie is responsible for locating and processing the appropriate UI templates for XML data/documents. A natural choice of UI template language for XML is XSLT (that is, XSL Transformations, not to be confused with XSL Formatting Objects). Charlie includes Sablotron, our own XSLT processor. Other template languages for XML, such as XPathScript, may be added. The repository of XSLT templates and Charlie actions (see below) is considered a part of the XML application server, but can easily be provided by a separate HTTP server (e.g. in case the XML application is a purely dynamic database).
Note: There seem to be two major types of XML applications out there - more-or-less dynamic application (e.g. database applications) that use XML as a convenient data serialization/representation format, and more-or-less static XML document management systems. There are many similarities between the two, but also many differences, which require slightly different design of appropriate application frameworks. This is sometimes a source of misunderstandings within XML/XSL community. Charlie tries to be flexible enough to cater for both but the more advanced features (e.g. selective XML data caching) are designed with dynamic XML data in mind.
Note: Some more advanced server-side application frameworks, such as Cocoon/XSP, take the data/UI template separation a step further, providing a methodology for template-based data generation from various sources through multiple "processors" (e.g. SQL, SSI-like, embedded code etc) applied in a sequence to the same XML template with processing instructions. Last step in such a system is usually an XSLT (or similar) processor converting resulting XML file into an HTML page. Charlie doesn't really care about such mechanisms within the XML application server, and can cooperate with them smoothly as long as the final XSLT step is delegated to Charlie rather than processed on the server.
A new problem appears when we separate content and presentation templates - coupling them back together. Which template will be used for a given document, where can the data for a given template be found, and how can the coupling rules be kept manageable within a growing and changing application? This turns out to be quite a difficult task. Popular "active pages" systems put the templates in the front line of application control - application URLs point to the templates themselves and the code generating the data is referred to from within the template. This is an intuitive approach for simple applications, but it breaks down with more sophisticated application flow logic - "if" and "case" types of logic tend to be very difficult and non-intuitive to express in a template-driven paradigm. The poor error handling so frequently encountered in ASP applications is one of the consequences. The second approach follows the classical "CGI model" of web applications, where URLs point to the code, which constructs the data/document and explicitly calls the template processor with an (often hardcoded) template reference. A form of this approach is an XML document management system, where each XML document carries an <?xml-stylesheet?> reference to the template. The second model provides more flexibility, but unless a sophisticated subsystem of data/template coupling rules is put in place, it can have problems with application growth, changes and switching between multiple UI designs for a single XML backend.
Charlie solves the problem by introducing a new entity, called an action, independent of data/document and template. In the Charlie model the client's URLs point to actions, which in turn point (again via URL) to a source of XML data/document and a template.
An action is actually a piece of code in some scripting language, executed within the Charlie context, and using Charlie's services. JavaScript is a convenient and secure candidate for such a language, but others, such as Perl, Penguin-secured Perl code or Zend can be plugged in. In response to client's URL request Charlie downloads and executes an appropriate action. The action in turn requests loading of XML data/document and XSLT template through Charlie service calls. A simple action might look like the following:
Example 1: A simple specific action. Let's say the URL of the action itself is http://charlie.gingerall.cz/example/example.act, and it is used to process the XML document http://charlie.gingerall.cz/example/xml/example.xml with an XSLT template http://charlie.gingerall.cz/example/xsl/example.xsl. The action uses relative URLs.
// getting document and template
data = chGetResponse(new Request("GET","xml/example.xml"));
templ = chGetResponse(new Request("GET","xsl/example.xsl"));
// processing template and document
sab = new Sablotron();
sab.setTemplate(templ);
sab.setDocument(data);
ret = sab.process();
// returning response
response.fromSablotron(sab);
1;
Charlie loads actions from the server as requested by the client, and executes them locally. If a specific action for a given URL is not found on the server, Charlie searches for an appropriate default action up the directory hierarchy. The system of default actions, together with the power of a scripting language, makes it possible to implement various naming conventions for content/template coupling and to keep the number of actions needed to a minimum. An emulation of the <?xml-stylesheet?> model can also be easily implemented with a default action - the action code inspects the <?xml-stylesheet?> instructions in the XML document received and requests the appropriate stylesheet (Charlie builtin services will help). It is also possible to define symbolic links within the charlie: URL namespace (a unified URL model used internally, independent of the Charlie network context, see the section charlie: URL namespace), which further helps to set up flexible and manageable content/template matching rules through actions.
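The upward search for a default action can be sketched in plain JavaScript as follows. This is only an illustration of the lookup order, not Charlie's implementation: the file name default.act and the function name defaultActionCandidates are hypothetical, since this paper does not say how default actions are named or stored.

```javascript
// Sketch only: "default.act" is an assumed name for a default action file.
// The path is assumed to name a file (not a directory) below the application root.
function defaultActionCandidates(path, defaultName = "default.act") {
  // Split "/example/xxx/yyy/asdf.xml" into segments and drop the file name.
  const segments = path.split("/").filter((s) => s.length > 0);
  segments.pop();
  const candidates = [];
  // Walk from the deepest directory up to the root.
  for (let depth = segments.length; depth >= 0; depth--) {
    const dir = segments.slice(0, depth).join("/");
    candidates.push("/" + (dir ? dir + "/" : "") + defaultName);
  }
  return candidates;
}
```

An agent following this order would try each candidate in turn and run the first action it finds, so a default action placed deeper in the tree overrides one placed nearer the root.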
Note: Apart from within an action, requests to load documents/templates from the server may also arise when processing <xsl:include>, <xsl:import> and document() instructions in XSLT templates.
Example 2: Imagine a company dealing with documents in one of three possible formats: XML (fixed internal DTD), HTML and plain text. If the documents are organized in a directory tree, all you need in order to access any document through Charlie is a single default action placed at the root of the tree. A request for a URL pointing to a document then starts the default action, which examines the document extension and returns the document either transformed using a predefined stylesheet (templ01.xsl) or "as is" with the appropriate content-type set.
// getting URL
url = charlieURI();
// getting file extension
fields = url.split("/");
file = fields[fields.length - 1];
parts = file.split(".");
ext = parts[parts.length - 1];
// getting document
doc = chGetResponse(new Request("GET",url));
// creating response based on extension
switch (ext) {
  // - xml doc is transformed to html, content-type=text/html
  case "xml" :
    templ = chGetResponse(new Request("GET","../templates/templ01.xsl"));
    sab = new Sablotron();
    sab.setTemplate(templ);
    sab.setDocument(doc);
    ret = sab.process();
    // the transformed output is taken from the processor, as in Example 1
    response.fromSablotron(sab);
    type = "text/html";
    break;
  // - html is left unchanged, content-type=text/html
  case "html" :
    response.content(doc.content());
    type = "text/html";
    break;
  // - other files are left unchanged, content-type=text/plain
  default :
    response.content(doc.content());
    type = "text/plain";
}
// setting the content type of the response
response.contentType(type);
1;
Charlie actions may look like an unnecessarily complicated and inefficient way to solve the content/presentation coupling problem, particularly in comparison with systems like Cocoon and AxKit, which are more tightly integrated with the HTTP server. The inefficiency of executing actions and loading documents and templates separately can be cured by HTTP server integration and direct-access optimizations when Charlie runs as an Apache module, and offset by caching when Charlie runs on another computer. A more important difference is that while Cocoon and AxKit are purely server-side systems, Charlie actions work in distributed environments the same way as in a server-side setup, and the power of a scripting language can be used for additional neat tricks.
In addition to content/presentation coupling, actions can also take over a part of application control logic. Actions can make decisions based on input (GET/POST) parameters, XML data returned from the server, or Charlie environment. Thus it is possible to implement input validation, error handling, UI design switching and dispatching/combining multiple XML application calls within an action. This may lead to a clean separation of data generation code (XML server), application control logic (actions) and presentation (templates). The server-side application can be structured as a set of small, single purpose services (data generation, data validation tasks, static documents and templates) and the glue/dispatch code can be moved to Charlie actions. For example, it is possible to provide a sophisticated control flow to "active pages" type applications through Charlie actions. In a distributed Charlie environment this concept also allows a distributed execution of parts of application code.
The following examples of action code (JavaScript) demonstrate some of the possible uses of application logic embedded in actions.
Example 3: Suppose we have an XML application (an application generating XML output) and we want to create three different interfaces: a simple HTML interface, a DHTML interface and a WAP interface.
Each interface will be implemented as a set of XSLT stylesheets in a separate directory. The selection of the desired interface can be driven by actions based on the "ui" parameter (no "ui" parameter means the simple HTML interface). ui=d invokes the DHTML interface and ui=w formats output for WAP. The following is a default action that will couple the XML datafile http://charlie.gingerall.cz/example/xxx/yyy/xml/asdf.xml with an appropriate XSLT template.
// getting ui value
url = new URI(charlieURI());
ui = url.getQuery().value("ui");
// get URL base, file name and extension
url = new String(charlie.chURL());
urldir = url.urlbase(url)+"/"+url.dir(url);
filebase = url.filebase(url);
fileext = url.fileext(url);
// get document
doc_url = urldir+"/xml/"+filebase+".xml";
doc = chGetResponse(new Request("GET",doc_url));
// get template (based on the ui parameter)
templ_url = urldir+"/xsl/"+ui+"/"+filebase+".xsl";
templ = chGetResponse(new Request("GET",templ_url));
// processing template and document
sab = new Sablotron();
sab.setTemplate(templ);
sab.setDocument(doc);
// passing the ui value on to the template
sab.addParam("ui",ui);
ret = sab.process();
// returning response
response.fromSablotron(sab);
1;
Note that the ui value is supplied as a parameter to the XSLT transformation. This is because the ui parameter has to be propagated to all URLs in the resulting page - the XSLT template may have to insert the value of the ui parameter into some URLs. There are other ways to solve this - e.g. the ui parameter may be carried not as a URL parameter, but as a part of PATH_INFO. In this case the action would still be able to find the ui value within the URL and locate the appropriate XML and XSLT files, and as long as all URLs in the template were relative, there would be no need to modify them.
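The PATH_INFO variant might be sketched like this in plain JavaScript. The layout assumed here (the ui value, when present, is the first path segment after the action) and the helper name splitUi are illustrative assumptions only; this paper does not define where the ui value would sit in the path.

```javascript
// Sketch only: assumes the ui value, when present, is the first PATH_INFO segment.
const KNOWN_UIS = ["d", "w"]; // d = DHTML, w = WAP (the values used in Example 3)

function splitUi(pathInfo) {
  const segments = pathInfo.split("/").filter((s) => s.length > 0);
  if (segments.length > 0 && KNOWN_UIS.includes(segments[0])) {
    // Recognised ui segment: strip it from the remaining path.
    return { ui: segments[0], rest: "/" + segments.slice(1).join("/") };
  }
  // No recognised ui segment: fall back to the simple HTML interface.
  return { ui: "", rest: "/" + segments.join("/") };
}
```

Because the ui value lives in the path rather than in the query string, relative links emitted by a template stay under the same ui prefix without being rewritten.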
Example 4: We can also use actions to validate user input. The preferable way may be to include JavaScript checking code in the templates, so that the validation is performed locally in the browser, but moving this part of the application logic to actions also works for browsers that don't support or allow JavaScript. The following sample action checks whether a three-field form input contains valid values (letters for the name; alphanumeric characters, at sign, dots, hyphens and underscores for the e-mail; digits for the phone number). If the input is OK, the action displays the next screen. Otherwise, the previous form is repeated. This is not an example of a default action, but rather an action corresponding to a single specific URL, say http://charlie.gingerall.cz/example/form.act. With no input or invalid input, the form.xml datafile and the form.xsl template are used; otherwise the action uses the (probably dynamic) result.xml and result.xsl.
// getting url
url = new URI(charlieURI());
// getting values
name = url.getQuery().value("name");
email = url.getQuery().value("email");
phone = url.getQuery().value("phone");
// checking data validity; rejected fields are listed in "invalid" and cleared
invalid = "";
if (name.match(/^\w{2,}$/) == null) {invalid = invalid+" name"; name = "";}
if (email.match(/^[\w._-]+@[\w._-]+$/) == null) {invalid = invalid+" email"; email = "";}
if (phone.match(/^\d{5,}$/) == null) {invalid = invalid+" phone"; phone = "";}
// choosing data and template
if (invalid) {
  stylesheet = "xsl/form.xsl";
  document = "xml/form.xml";}
else {
  stylesheet = "xsl/result.xsl";
  document = "xml-bin/result.xml";}
// getting data and template
templ = chGetResponse(new Request("GET",stylesheet));
data = chGetResponse(new Request("GET",document));
// processing data and template
sab = new Sablotron();
sab.setTemplate(templ);
sab.setDocument(data);
sab.addParam("invalid",invalid);
sab.addParam("name",name);
sab.addParam("email",email);
sab.addParam("phone",phone);
ret = sab.process();
// returning response
response.fromSablotron(sab);
1;
The invalid, name, email and phone values are passed as parameters to the XSLT transformation, so that the template may prefill the form (if any field is rejected) with the previously entered good values of the other fields.
With a Charlie agent installed on client computers, or on an intermediate proxy server, the above concepts suddenly (and transparently) become a distributed XML application environment. The XML application server remains a source of (possibly dynamic) XML documents/data and a repository of XSLT templates and actions. Actions, documents and templates are downloaded to the local Charlie agent and processed locally. All static components (likely all actions, templates and some XML documents) are cached locally, so the major part of the traffic soon consists of dynamic XML data only. XSLT template processing and the parts of application logic contained in actions are offloaded from the server to the Charlie agent. The behavior of the application and of the Charlie agent is still controlled entirely from the server, through the actions and templates stored on the server.
Many parallels can be drawn between the operation model of a local Charlie agent and the caching and dynamic/scripting features of modern browsers. Templates and static XML documents are cached on Charlie in a similar way to how browsers cache static HTML pages. Action code is cached and executed locally as well, similarly to JavaScript (or other) code within HTML pages. The disadvantage of using Charlie actions, e.g. to validate data in a form, is that they require a (local) page reload and re-rendering, while embedded JavaScript/DHTML code can do this within the context of the page. Charlie actions offset this by browser version independence (and support for browsers without fancy dynamic features) and by access to a rich library of Charlie builtin services.
A possible way to handle different browser feature sets is to design different UI templates for each - e.g. to support plain HTML browsers and DHTML+scripting browsers through different sets of templates. All DHTML features and scripts are described within HTML pages, so it is possible to work with them within the templates. Advanced uses of DHTML scripting could lead to quite a complicated application design, though: you would have to balance the code between the server, the actions and the pages.
In the long run Charlie could integrate with some browsers (e.g. Mozilla), which would allow action execution within the context of an HTML page and the use of Charlie services (XML data caching, XSLT processing) from DHTML scripts. Eventually Charlie could also interface with XML-driven GUIs (libglade derivatives etc), ideally working with them as just another (perhaps richer) interface for the same XML application backend.
From another point of view, Charlie can be considered a browser-independent client-side XSLT processor. While this is a gross simplification of the Charlie model, it is certainly an important function Charlie can play. Charlie will accommodate native XSLT support in browsers (e.g. IE5) through an optional "pass-through" flag.
We have already introduced caching of templates, actions and static XML documents on the client-side Charlie. This leaves out dynamic XML data requests. Charlie provides a mechanism to increase efficiency of handling dynamic XML data as well. There is a local storage for pieces of XML data retrieved from the server. The local XML data storage is managed in a cache-like style (all data in there are flagged with expire information), but can also provide a very persistent way of storing data through whole Charlie sessions and between them.
The preferred way of XML data cache management (deciding what to cache and when to purge) is server-driven. XML data coming from the server is marked with cache-control directives and Charlie caches it automatically. This requires the XML application to be aware of the Charlie XML data caching framework, but the server is also the most logical place to generate cache-control information (at the same time the data is generated), and this allows most of the data caching mechanism to be transparent to the action code. Alternatively, actions will have full access to the cache and can make caching decisions themselves.
To utilize the XML data cache, actions will be able to refer to stored data (e.g. for information lookup), and to send the relevant cache status information to the server with requests (so that server can only send data not present in the cache).
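A minimal sketch of such a store, in plain JavaScript, might look as follows. Since the caching framework is still being designed, everything here is an assumption made for illustration: the class name XmlDataCache, its methods, the use of a time-to-live in milliseconds as the expire information, and the idea of reporting cache status as a list of valid keys are all hypothetical and are not Charlie's API or protocol.

```javascript
// Sketch only: class and method names are hypothetical, not Charlie's API.
class XmlDataCache {
  constructor(now = () => Date.now()) {
    this.now = now;           // injectable clock, handy for testing
    this.entries = new Map(); // key -> { xml, expires }
  }

  // Store a piece of XML data with a time-to-live in milliseconds.
  put(key, xml, ttlMs) {
    this.entries.set(key, { xml: xml, expires: this.now() + ttlMs });
  }

  // Return the stored data, or null if it is missing or expired.
  // Expired entries are purged on access.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expires <= this.now()) {
      this.entries.delete(key);
      return null;
    }
    return entry.xml;
  }

  // Keys that are still valid. An action could send these with a request
  // so that the server can skip data the agent already holds.
  status() {
    const keys = [];
    for (const [key, entry] of this.entries) {
      if (entry.expires > this.now()) keys.push(key);
    }
    return keys.sort();
  }
}
```

In a server-driven setup the time-to-live would come from the cache-control directives attached to the data, and the action code would only ever call get and status.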
Examples of what can be achieved with the XML data caching range from stored lookup tables for local data validation or construction of frequently used select boxes to an increased network efficiency of the application and a lower server load through caching frequently used data (shopping cart contents, catalog entries, user profile data).
We can extend the parallel drawn in the previous section between the local Charlie agent and DHTML/scripting to the IE5 "XML island" feature. XML islands make it possible to incorporate pieces of data (e.g. a lookup table) within a page, and to use them from scripts also incorporated in the page to build a small interactive application. Charlie's local XML storage can do the same and more, with data persistence between pages (in IE5, when you reload the page, the data is gone), and, of course, for any browser.
In the long run the local caching functions will provide a path towards support for offline ch-applications. These applications would not be fully offline, of course. They would still be perfectly valid ch-applications working in all Charlie networking contexts, but would allow for some operations during periods without a connection to the server. Once all necessary templates, static XML documents, important pieces of XML data and control code in actions are cached or stored locally (this can be achieved by bulk preloading of the cache), we have all we need to run a part of the application locally. This can be used e.g. in home-order systems, where catalog browsing and selection of goods could be done offline, and the online connection would be required for order placement and catalog synchronization only. Other applications could be delivery or storage tracking systems, where a Charlie-enabled handheld device is used to keep track of packages or other items and is synchronized with a server only from time to time.
Note: At the time of writing this document, the XML data caching framework is in a design and early testing stage, so the details are somewhat blurry. Selective XML data caching is a major design goal of Charlie framework, though.
The framework described so far is nicely flexible and generic, but Charlie also takes a step further towards simplifying the task of ch-application programmer in typical cases. While keeping access to all of the low level flexibility, Charlie provides a rich set of "smart features" that can dramatically simplify the code of actions and the XML application server. Examples of such features are enhanced control of template/document cache through a "directory metadata", symbolic links in a charlie: URL namespace, the protocol for server driven XML data cache control. Some others will be added in the future.
The use of these smart features, like the use of other parts of the Charlie framework, is optional (but of course highly recommended). Charlie is designed for an "incremental" use of its features. The more features you want to use, the more you have to follow some Charlie conventions within the structure and code of your application. The reverse is also true. So you can use Charlie for "simple" things like management of multiple sets of UI templates for your XML application/document management system, or as a client-side XSLT processor, by defining a couple of default actions without many changes to the server-side application code. You can restructure your XML application code a bit and take advantage of separating the application control logic from the data generation code and of the distributed execution of actions. You can modify your server code to flag data with Charlie cache-control directives and increase the efficiency of the application by local XML data caching. And so on.
To abstract from the various networking contexts Charlie can run in (server-side module, client-side proxy, local HTTP server, independent proxy), a unified internal URL model is used. These URLs are very similar to HTTP URLs, in the form of charlie:/ch-application-id/xxx/yyy?aaa=bbb&ccc=ddd.
The client URL is translated into a charlie: URL by the Charlie handler when it accepts a request. Internally Charlie works with charlie: URLs only. Another translation is done when sending requests to the server. All URLs used in actions and most URLs used in templates are also charlie: URLs. When returning pages to the client, the Charlie handler translates all charlie: URLs back to URLs appropriate for the given client.
The charlie: URL charlie:/example1/xxx/yyy?p1=1 may be represented as http://127.0.0.1/example1/xxx/yyy?p1=1 in the local Charlie http daemon setup, but in the server-side Charlie model it can be http://charlie.gingerall.cz/charlie-bin/get/example1/xxx/yyy?p1=1.
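The translation a handler performs can be sketched in plain JavaScript as a simple prefix swap. This is an illustration under stated assumptions, not Charlie's handler code: the function names toCharlieUrl and toClientUrl are hypothetical, and so is the idea that each handler is configured with a single client-side URL prefix.

```javascript
// Sketch only: assumes each handler is configured with one client-side URL prefix.
const CHARLIE_SCHEME = "charlie:/";

// Client URL -> charlie: URL (done when the handler accepts a request).
function toCharlieUrl(clientUrl, clientPrefix) {
  if (!clientUrl.startsWith(clientPrefix)) {
    throw new Error("URL is outside this handler's prefix: " + clientUrl);
  }
  return CHARLIE_SCHEME + clientUrl.slice(clientPrefix.length);
}

// charlie: URL -> client URL (done when the handler returns a page).
function toClientUrl(charlieUrl, clientPrefix) {
  if (!charlieUrl.startsWith(CHARLIE_SCHEME)) {
    throw new Error("Not a charlie: URL: " + charlieUrl);
  }
  return clientPrefix + charlieUrl.slice(CHARLIE_SCHEME.length);
}
```

With the prefix http://127.0.0.1/ this reproduces the local daemon form, and with http://charlie.gingerall.cz/charlie-bin/get/ the server-side form, while actions and templates only ever see the charlie: form.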
Charlie URL namespace turns out to be somewhat central to Charlie implementation and is also used for various other purposes:
charlie:/example1/templates/myplatform/main.xsl
can mean different templates based on the value of locally
defined (through client platform or user preferences)
"myplatform" variable.
Currently, final work on the Charlie specification is in progress. The next version of this white paper will compare our plans and intentions to the reality of Charlie version 0.50. We also want to give more examples of advanced Charlie features (namespace, caching, local storage).
(c) 2000,2001 Ginger Alliance s.r.o.