Xineo XIL (XML Import Language) defines an XML language for transforming various record-based data sources into XML documents, and provides a fully functional XIL processing implementation. This implementation has built-in support for relational (via JDBC) and structured text (such as CSV) sources, and is extensible through its public API, which allows new data source implementations to be integrated dynamically. It also abstracts the output format: the Xineo implementation can write output documents to a stream or build them as a DOM document.
The XML Import Language (XIL) defines a way to express transformations from various record-based
data sources to XML. A transformation is expressed as
a well-formed XML document, which may include both XIL defined elements and any other elements.
XIL defined elements belong to the XIL namespace (http://www.xineo.net/XIL-1.0), which will be
referred to using the "xil" prefix in the rest of this document.
The XIL language is not restricted to a specific data source: it gives access to every source type available in the implementation used to process the XIL document. The Xineo XIL implementation has built-in support for some source types, and is extensible via the Xineo XIL Java API. The built-in source types described in this document are:
- sql: relational databases, accessed via JDBC
- regex: structured text files, matched line by line against a regular expression
- localRegex: a variant of regex that matches against a string instead of a file
As stated above, a valid XIL document contains two kinds of elements: XIL-defined elements, which belong to the XIL namespace, and any other elements.
The first step in building an import sheet is to define the data sources needed
to build the output document. Data sources are defined using the xil:source element.
Each data source has a type (which defines the kind of data source) and a user-defined
name (used to refer to it later). Each data source can also be given a number
of properties, depending on the source type. A single import sheet can define as many data sources
as needed.
Here is an example that defines a JDBC data source named "myDataSource". Properties are used to define the database connection parameters:
<?xml version="1.0" ?>
<xil:xil xmlns:xil="http://www.xineo.net/XIL-1.0">
<xil:source type="sql" name="myDataSource">
<xil:property name="url"> jdbc:mysql://myServer/myBase </xil:property>
<xil:property name="userId"> myUserId </xil:property>
</xil:source>
[...]
</xil:xil>
The complete reference of built-in data sources and corresponding properties is detailed in the input source reference.
Once the data sources are defined, output templates must be defined. These templates are ordinary XML nodes that the XIL engine instantiates to produce the output document. Some of them are static and are output exactly as they appear in the sheet; others are dynamic and depend on the data read from the data source.
Output templates are always defined inside an xil:output element.
The following example shows how to create some static elements (in this case, the
addressBook element will be the root of the produced document):
<?xml version="1.0" ?>
<xil:xil xmlns:xil="http://www.xineo.net/XIL-1.0">
<xil:source type="sql" name="myDataSource"> [...] </xil:source>
<xil:output>
<addressBook>
<title> My address book </title>
[...]
</addressBook>
</xil:output>
</xil:xil>
Suppose we want to populate our address book with data available as a view of our relational database. We previously saw how to declare a JDBC input source, which lets us query the database and construct XML elements from the data obtained.
To create dynamic nodes in the import sheet, we use the xil:node
element, which itself contains two sub-elements:
- an xil:input element, which specifies how data is obtained from a given input source;
- an xil:output element, which contains the output templates that use this data.
The xil:input element must specify which input source to use (by its name)
as well as the query used to access the data (SQL for JDBC data sources). Each record
returned by this query is available for producing the XML output.
The xil:output element specifies an output template that is instantiated
for each record returned by the given query. The data of each record is available via
a set of variables that the XIL engine substitutes when instantiating
the templates.
When using a JDBC data source, each returned column creates a variable of the same name.
For example the SELECT Id,Name FROM ... query would create two variables named
Id and Name (case-sensitive). To substitute a variable, the following
form must be used: ${variableName}. Such references to variables will be substituted
in any attribute of the output template, as well as using the xil:subst element.
The following example shows how to query the Person table of our database and
construct suitable XML elements to populate our address-book document.
<xil:output>
<addressBook>
<title> My address book </title>
<xil:node>
<xil:input source="myDataSource">
<xil:query> SELECT Id,Name,Address FROM Person </xil:query>
</xil:input>
<xil:output>
<person id="${Id}">
<name> <xil:subst value="${Name}"/> </name>
<address> <xil:subst value="${Address}"/> </address>
</person>
</xil:output>
</xil:node>
</addressBook>
</xil:output>
The following example shows a possible output of this import sheet:
<?xml version="1.0" ?>
<addressBook>
<title> My address book </title>
<person id="1">
<name> Miles Davis </name>
<address> 42 Horn Street, New-York </address>
</person>
<person id="2">
<name> Tony Williams </name>
<address> 21 Drum Street, London </address>
</person>
[...]
</addressBook>
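The ${variableName} substitution described above can be illustrated with a small Java sketch. This is not the Xineo XIL engine's code: the class name SubstSketch, the method substitute, and the choice to leave unknown variables untouched are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Xineo XIL API.
class SubstSketch {
    // Matches references of the form ${variableName}.
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} in the template with the value bound to that name
    // in the current record. Names are case-sensitive. References to unknown
    // names are left as they are (an assumption of this sketch).
    public static String substitute(String template, Map<String, String> record) {
        Matcher m = REF.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = record.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> record = Map.of("Id", "1", "Name", "Miles Davis");
        System.out.println(substitute("<person id=\"${Id}\">", record));
        System.out.println(substitute("${Name}", record));
    }
}
```

Each record returned by a query would play the role of the map here, with one entry per returned column.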
Variable scope. As in most programming languages, variables in XIL have a scope which depends
on where they are defined. The scope of the variables created by an xil:query statement
is the corresponding xil:output element. Variables defined at the sheet level (see below)
are available globally. When several variables of the same name exist at the same time, only the most
local one is visible; the more global ones become visible again when the local variable's scope is exited.
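The scoping rule can be pictured as a stack of variable tables, where a lookup starts from the innermost table. The following Java sketch illustrates that idea only; the class ScopeSketch and its methods are hypothetical and do not reflect how the Xineo engine is actually implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of nested variable scopes, not part of the Xineo XIL API.
class ScopeSketch {
    // The head of the deque is the innermost (most local) scope.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    // The global scope holds the variables defined at the sheet level.
    public ScopeSketch(Map<String, String> globals) {
        scopes.push(new HashMap<>(globals));
    }

    // Entering an xil:output element for one record opens a new scope.
    public void enter(Map<String, String> record) {
        scopes.push(new HashMap<>(record));
    }

    // Leaving the xil:output element discards the local variables.
    public void exit() {
        scopes.pop();
    }

    // Searches from the innermost scope outwards; returns null if unbound.
    public String lookup(String name) {
        for (Map<String, String> scope : scopes) {
            if (scope.containsKey(name)) {
                return scope.get(name);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ScopeSketch vars = new ScopeSketch(Map.of("userId", "bob", "Name", "global"));
        vars.enter(Map.of("Name", "Miles Davis"));
        System.out.println(vars.lookup("Name"));   // local value hides the global one
        System.out.println(vars.lookup("userId")); // globals remain visible
        vars.exit();
        System.out.println(vars.lookup("Name"));   // global value is visible again
    }
}
```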
Variables everywhere. As shown in the previous examples, variables are primarily used in output templates, but they can also be used in several other places.
First of all, variables are also substituted in queries (in xil:query elements).
For example, when a xil:node element embeds another one, results from the first query
can be used to construct another one (on the same or other data source).
Variables are also substituted in data source property definitions. Global variables can thus be used to make an import sheet externally configurable (see "Using the Xineo XIL processor").
Besides JDBC, the Xineo XIL processor provides a built-in data source for structured text files, based on regular expressions.
A regular expression based data source can be defined using the regex type.
The file to be processed is specified by the inputSource property. The encoding
of this input source may be specified by the inputEncoding property. Here is an example :
<xil:source type="regex" name="myDataSource">
<xil:property name="inputSource"> inputfile.txt </xil:property>
<xil:property name="inputEncoding"> ISO-8859-1 </xil:property>
</xil:source>
This data source can then be queried via a regular expression, specified in the
xil:query element. Each line of the input source is matched against the
given regular expression (lines that do not match are silently ignored), and each matching line
produces a record with one variable per parenthesis-captured group. Capturing groups are numbered by
counting their opening parentheses from left to right. In the expression (A)(B(C)), for
example, there are three such groups:
1. (A)
2. (B(C))
3. (C)
Each group will create a variable named by its number, for example ${1},
${2} and ${3}. Here is an example considering an input file
where each line contains three tab-separated fields:
<xil:node>
<xil:input source="myDataSource">
<xil:query> ([^\t]*)\t([^\t]*)\t([^\t]*) </xil:query>
</xil:input>
<xil:output>
<person id="${1}">
<name> <xil:subst value="${2}"/> </name>
<address> <xil:subst value="${3}"/> </address>
</person>
</xil:output>
</xil:node>
The regex input source implementation uses Java's built-in regular expression support (java.util.regex). Please consult the related documentation for more information on regular expression syntax.
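Group numbering and line-by-line matching can be tried directly with java.util.regex. The sketch below is an illustration only: the class RegexSourceSketch and the method records are hypothetical, and requiring the whole line to match (Matcher.matches) is an assumption, since this document does not say whether the engine matches whole lines or searches within them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the Xineo XIL API.
class RegexSourceSketch {
    // Returns one record per matching line. In each record, index 0 holds the
    // value of ${1}, index 1 the value of ${2}, and so on.
    public static List<List<String>> records(String regex, List<String> lines) {
        Pattern pattern = Pattern.compile(regex);
        List<List<String>> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = pattern.matcher(line);
            if (!m.matches()) {
                continue; // lines that do not match are silently ignored
            }
            List<String> record = new ArrayList<>();
            for (int g = 1; g <= m.groupCount(); g++) {
                record.add(m.group(g));
            }
            result.add(record);
        }
        return result;
    }

    public static void main(String[] args) {
        // Groups are numbered by their opening parenthesis, left to right.
        System.out.println(records("(A)(B(C))", List.of("ABC")));

        // Three tab-separated fields per line; the second line is skipped.
        List<String> lines = List.of(
                "1\tMiles Davis\t42 Horn Street, New-York",
                "this line has no tabs");
        System.out.println(records("([^\\t]*)\\t([^\\t]*)\\t([^\\t]*)", lines));
    }
}
```

Note that in Java source the backslashes of the pattern are doubled, whereas in an import sheet the pattern is written as in the xil:query example above.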
The "localRegex" data source type is a variant of the "regex" one: instead of reading a file, it matches the regular expression against a string, given by the inputString property.
Of course, it would not be very useful if the input string were static. This kind of source is generally used to match patterns in some result of an upper-level query, for example to tokenize a field value.
In the following example, we first execute a query to get some fields from a JDBC table, then use the "localRegex" data source
to tokenize the Name field and produce the corresponding token XML elements.
Note that this example also demonstrates the possibility to set data source properties in the xil:input
element instead of in the xil:source declaration (when the same property is defined in both locations,
priority is given to the one defined in xil:input elements).
<xil:source type="sql" name="myDataSource">
[...]
</xil:source>
<xil:source type="localRegex" name="myLocalRegex" />
<xil:output>
<xil:node>
<xil:input source="myDataSource">
<xil:query> SELECT ID, Name FROM Person </xil:query>
</xil:input>
<xil:output>
<person id="${ID}">
<name>
<xil:node>
<xil:input source="myLocalRegex">
<xil:query> ([^\s]+) </xil:query>
<xil:property name="inputString">${Name}</xil:property>
</xil:input>
<xil:output>
<token><xil:subst value="${1}"/></token>
</xil:output>
</xil:node>
</name>
</person>
</xil:output>
</xil:node>
</xil:output>
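The tokenizing step of this example can be approximated in plain Java by applying the pattern repeatedly to the input string. This is a sketch under an assumption: that the "localRegex" source yields one record per successive match, which the example suggests but this document does not state explicitly. The class LocalRegexSketch and the method tokens are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not part of the Xineo XIL API.
class LocalRegexSketch {
    // Collects the first capturing group (${1}) of every successive match
    // of the pattern in the input string.
    public static List<String> tokens(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Same pattern as in the import sheet above: runs of non-whitespace.
        for (String token : tokens("([^\\s]+)", "Miles Dewey Davis")) {
            System.out.println("<token>" + token + "</token>");
        }
    }
}
```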
The Xineo XIL engine provides a set of data source types by default, but this set can be extended: new data source types can be implemented by conforming to the API.
To use a third-party data source type, you have to tell the engine about it. To do so, use the xil:sourceType
element, which has just two attributes: the name attribute specifies the name
of the newly registered data source type, and the class attribute specifies the
fully-qualified name of the data source Java class, which must be a valid API-conforming data source
implementation. Here is an example:
<xil:xil>
<xil:sourceType name="mySourceType" class="com.foo.bar.MyDataSource"/>
<xil:source type="mySourceType" name="mySource">
<!-- Data source specific properties -->
</xil:source>
[...]
</xil:xil>
Note that data source types may also be registered programmatically using the SourceFactory
class (see "The Xineo XIL API").
The simplest way to run Xineo XIL as a command-line tool is: "java -jar xineo-xml-X.X.X.jar <parameters>". You may also put the ".jar" in your CLASSPATH and run the "net.xineo.xml.xil.Main" class.
The available parameters are: [inputFile [outputFile [outputEncoding [property=value ...]]]].
Omitting "inputFile" or "outputFile" makes the processor read from standard input or write to standard output, respectively.
Passing "-" in place of "inputFile" or "outputFile" has the same effect. The default output encoding is UTF-8.
The "property=value" pairs bind values to variables that are available globally in the import sheet. Here is a sample command line:
$ java -jar xineo-xil.jar myImportSheet.xil myOutputFile.xml userId=bob
The Xineo XIL processor can also be easily used from any Java program via its public API (see "The Xineo XIL API"). For example, it could be integrated in a J2EE application.
Comma Separated Value (CSV) files and their variants are structured text files where each line contains a
set of fields, usually separated by commas or by another character such as a tab or a colon.
In many cases, those files can easily be handled with XIL using the regex data source
type, as demonstrated in this tutorial.
In more complex cases, however, regular expressions are not powerful enough, for example if you want to perform queries on CSV entries. In this case, you may use the CSV-JDBC driver, which is a simple read-only JDBC driver that uses CSV files as database tables.
This section has not been written yet. Please consult the Xineo XIL API documentation.
To learn more about Xineo XIL and how to use it in your applications, please refer to:
the readme.html file, if you have not done so yet.