<!DOCTYPE page SYSTEM "page.dtd">
<page>
<section title="XML" ref="XML">

<p>In order to make it easy for people to incorporate the M'Cheyne
daily Bible readings into their own web pages I wrote a script which
serves them up as an <dfn title="Extensible Markup Language">XML</dfn>
document.  This can be interrogated from a remote site and the results
are easily interpreted by Perl or PHP so that the readings can be
extracted.  Information on using it is on the <a
href="/mcheyne/server.html">M'Cheyne server page</a>.</p>

<p>Any simple format would have done for the server output, but XML
gives flexibility and future-proofing whilst being relatively easily
interpreted by scripts.</p>

<p>The <a href="/mcheyne/server_raw.php">raw output</a> of the server
is a <em>well-formed</em> XML document.  It contains a single root
element <code>&lt;mcheyne&gt;</code> which itself contains strictly
nested elements with balanced start and end tags.  Subject to these
constraints you are free in a well-formed XML document to choose
pretty much any structure and element names you like.</p>

<pre>
&lt;mcheyne&gt;
&lt;copyright&gt;Copyright (c) 2003, Ben Edgington&lt;/copyright&gt;
&lt;version&gt;2.0&lt;/version&gt;
&lt;day&gt;8&lt;/day&gt;
&lt;date&gt;
&lt;mday&gt;9&lt;/mday&gt;
&lt;month&gt;January&lt;/month&gt;
&lt;year&gt;2003&lt;/year&gt;
&lt;/date&gt;
...
&lt;/mcheyne&gt;
</pre>
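<p>As a rough illustration of how a script might pull the date out of
output shaped like the fragment above, here is a minimal sketch in
Python (rather than Perl or PHP) using only the standard library.  The
function name <code>extract_date</code> is my own invention for this
example, and it assumes only the elements shown in the extract; the
real output contains more.</p>

<pre>
import xml.etree.ElementTree as ET


def extract_date(xml_text):
    """Return (mday, month, year) from a well-formed mcheyne document."""
    # fromstring() raises ET.ParseError if the text is not well-formed,
    # for example when start and end tags do not balance.
    root = ET.fromstring(xml_text)
    if root.tag != "mcheyne":
        raise ValueError("unexpected root element: " + root.tag)
    date = root.find("date")
    if date is None:
        raise ValueError("no date element found")
    # findtext() returns None when a child is missing, so substitute "".
    return tuple((date.findtext(name) or "").strip()
                 for name in ("mday", "month", "year"))
</pre>

<p>Fed the fragment above, this should give the three strings 9,
January and 2003.  Note that the parser only checks that the document
is well-formed; it does not validate it.</p>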

<p>Making the document well-formed would have been sufficient for
currently envisioned uses of the server, but I was also keen to make
it <em>valid</em>.  This means that it should have a corresponding
document type declaration which defines all the elements, and
constrains the relationships between them.</p>

<p>I chose to use an <em>external</em> <dfn
title="Document Type Definition">DTD</dfn> so that automatic
accesses of the M'Cheyne server by scripts do not need to download it.
To point to the DTD, the XML is declared with
<code>standalone=&quot;no&quot;</code> (strictly, this is already the
default whenever an external DTD is referenced), and the DOCTYPE declaration
points to a file.  It is a &quot;SYSTEM&quot; DTD as it is private to
the server rather than publicly available.</p>

<pre>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot; standalone=&quot;no&quot;?&gt;
&lt;!DOCTYPE mcheyne SYSTEM &quot;server.dtd&quot;&gt;
</pre>

<p>The document structure is then defined in the <em
class="file">server.dtd</em> using the somewhat arcane DTD syntax.  In
this case the structure is fairly simple and regular, so the DTD I
created is quite inflexible.  Here's an extract, or see the <a
href="/file.pl/mcheyne/server.dtd">whole DTD file</a>.</p>

<pre>
<span class="file">server.dtd</span>

&lt;!ELEMENT mcheyne (notice,copyright,version,day,date,timezone,calendar,bible,quote,family,(secret|private))&gt;
&lt;!ELEMENT notice (#PCDATA|website)*&gt;
&lt;!ELEMENT website (#PCDATA)&gt;
&lt;!ELEMENT copyright (#PCDATA)&gt;
&lt;!ELEMENT version (#PCDATA)&gt;
&lt;!ELEMENT day (#PCDATA)&gt;
&lt;!ELEMENT date (mday,month,year)&gt;
&lt;!ELEMENT mday (#PCDATA)&gt;
&lt;!ELEMENT month (#PCDATA)&gt;
...
</pre>

<p>Each element contains either only text (i.e. <code>#PCDATA</code>)
or other elements as listed (<code>notice</code> may mix the two), and
for the XML document to be valid it must conform to this structure.</p>

<p>With a DTD it is possible to validate the XML produced by the
server using an <a
href="http://www.stg.brown.edu/service/xmlvalid/">XML 1.0
validator</a>.</p>

<p>A good introduction to XML is on the <a
href="http://xmlwriter.net/xml_guide/xml_declaration.shtml">XMLwriter
website</a>.</p>
</section>

<section title="XSLT" ref="XSL">
<p>Just for fun I wanted to try to format the output of the server
nicely, so that it can not only be used by scripts to extract the
data they want, but can also be viewed directly as a simple web page.
There seem to be two main options if you want to format
<dfn title="Extensible Markup Language">XML</dfn> nicely for
display in browsers: <dfn
title="Cascading Style Sheets">CSS</dfn> and <dfn
title="Extensible Stylesheet Language Transformations">XSLT</dfn>.</p>

<subsect title="CSS limitations" ref="CSS">

<p>The first option is to use a standard <a href="xhtml.html_param_">CSS style sheet</a>.  The presentation of all the elements
can be specified in the style-sheet in the normal way.  Here's part of
a rudimentary style-sheet for the M'Cheyne server data.</p>

<pre>
<span class="file">server.css</span>

mcheyne, notice, copyright, quote, target, date { display: block; }
mday, month, year, reference, url { display: inline; }
day, timezone, calendar, bible, version  { display: none; }

mcheyne { background: #eee; }

notice {
    color: #888;
    text-align: center;
    margin: 1ex 0;
}

date {
    text-align: center;
    margin: 2ex 0;
}

quote {
    text-align: center;
    font-style: italic;
    font-size: 120%;
    margin:  1ex 0;
}
...
</pre>

<p>To tell the interpreter of the XML document to use this style-sheet
insert a line like this before your DOCTYPE declaration,</p>

<pre>
&lt;?xml-stylesheet type=&quot;text/css&quot; href=&quot;server.css&quot;?&gt;
</pre>

<p>A drawback of this approach is that you cannot easily
modify the data, for example to change its order of presentation, or
to make hyperlinks out of URLs.  I wanted to do both of these with the
M'Cheyne server XML output.</p>
</subsect>

<subsect title="XSLT kicks butt" ref="XKB">

<p>The solution is to use <dfn
title="Extensible Stylesheet Language">XSL</dfn> Transformations
(XSLT).  This is a framework for taking an input XML code tree and
transforming it in an arbitrary way to form an output tree.  In this
case, the output is an HTML document which the web browser will then
display in the normal way, although it could be any arbitrary
format.</p>

<p>An XSLT stylesheet contains templates which match specific elements
in the source and describe how they are to be transformed to create
the output.  For example, in the M'Cheyne server output a
&lt;target&gt; element contains a &lt;url&gt; element and a
&lt;reference&gt; element.  The following stylesheet fragment
transforms this &lt;target&gt; element into a standard &lt;a
href=&quot;&quot;&gt; hypertext anchor with the url as the target and
the reference as the text.</p>

<pre>
<span class="file">server.xsl</span>

...
&lt;xsl:template match=&quot;target&quot;&gt;
  &lt;a href=&quot;{url}&quot; style=&quot;padding-left: 0.5em&quot;&gt;
    &lt;xsl:value-of select=&quot;reference&quot;/&gt;&lt;/a&gt;
&lt;/xsl:template&gt;
...
</pre>

<p>See the <a href="/file.pl/mcheyne/server.xsl">full stylesheet</a> for all
the gory details.</p>
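<p>To make clear what the <code>target</code> template above
produces, here is the same transformation done by hand in Python.
This is only an analogue for illustration, not XSLT, and the function
name <code>target_to_anchor</code> is hypothetical.</p>

<pre>
import xml.etree.ElementTree as ET


def target_to_anchor(target):
    """Build an HTML a element from a target element (url + reference)."""
    # The url child becomes the href attribute, as {url} does in the template.
    anchor = ET.Element("a", {
        "href": target.findtext("url", default=""),
        "style": "padding-left: 0.5em",
    })
    # The reference child becomes the link text, as xsl:value-of does.
    anchor.text = target.findtext("reference", default="")
    return anchor
</pre>

<p>The XSLT version does the same job declaratively: the template
says what a <code>target</code> should turn into, and the processor
takes care of finding every <code>target</code> in the source.</p>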

<p>To tell the interpreter of the XML document to use this XSL
style-sheet insert a line like this before your DOCTYPE declaration,</p>

<pre>
&lt;?xml-stylesheet type=&quot;text/xsl&quot; href=&quot;server.xsl&quot;?&gt;
</pre>

<p><strong>Note:</strong> Netscape and Mozilla need the
<code>.xsl</code> file to be served up with MIME type
<code>text/xml</code> or they won't use it (this is actually as per <a
href="http://www.ietf.org/rfc/rfc2376.txt">RFC2376</a>).  You can
check if this is the case using my <a
href="/cgi-bin/http_head.cgi">HTTP header viewer</a>.  If your server
is not configured to do this you have a number of options.  For one,
you can simply give your stylesheet a <code>.xml</code> extension
instead of <code>.xsl</code>.  For another, with the Apache server,
you can add a line like &quot;<code>AddType text/xml xsl</code>&quot;
to your directory <a href="htaccess.html_param_"><em
class="file">.htaccess</em></a> file.  A third option is to make the
stylesheet a PHP or Perl CGI script that sets the correct
<code>Content-Type: text/xml</code> <a href="http.html_param_">HTTP
header</a>.</p>
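<p>If you would rather check the MIME type from a script, a minimal
Python sketch might look like the following.  Both function names are
made up for this example, and the second one needs network access to
whatever URL you pass it.</p>

<pre>
import urllib.request


def is_xml_content_type(header_value):
    """True if a Content-Type header value names the text/xml media type."""
    # Drop parameters such as "; charset=UTF-8" and ignore letter case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "text/xml"


def stylesheet_served_as_xml(url):
    """Issue a HEAD request and report whether the reply is text/xml."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return is_xml_content_type(response.headers.get("Content-Type", ""))
</pre>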

<p>Now, if your browser supports it, when the XML document is
downloaded the browser will transform the raw XML according to the XSL
stylesheet to produce nicely formatted HTML.  Mozilla 1.0, Netscape 7
and Internet Explorer 6 seem to handle this reliably, but Opera
doesn't support it.  If you point one of these browsers at the server
you <em>should</em> see a simple webpage with the data <a
href="/mcheyne/server.php">nicely formatted</a> rather than the <a
href="/mcheyne/server_raw.php">raw output</a> or just unformatted
text.</p>

<p>Of course, you can still use CSS for formatting if you want to:
just include your style sheet into the XSLT file so that it is output
with the HTML and the browser will interpret it as normal.</p>

<p>A good introduction to XSLT (albeit heavily Windows-biased) is on
the <a href="http://www.w3schools.com/xsl">W3schools website</a>.</p>
</subsect>

<subsect title="Server-side XSLT" ref="SSX">

<p>Sometimes it is more appropriate for the XSL transformation to be
done server-side rather than client-side.  For example, the body of
this page is an XML document which has been processed by XSLT before
being inserted into the page template.  In this case the work is done
on the web-server&emdash;only the XHTML output is sent to your browser&emdash;which avoids the problem of having to deal with browsers that do not
support XSLT.</p>

<p>An easy way to do server-side XSLT is to use PHP's built-in functions such
as <a href="http://www.zend.com/manual/function.xslt-process.php">xslt_process()</a>.</p>

<pre>
from <span class="file">index.php</span>

$x = xslt_create();
$result = xslt_process($x,
                       &apos;file://&apos;.getcwd().&quot;/$file.xml&quot;,
                       &apos;file://&apos;.getcwd().&apos;/website.xsl&apos;);
if ($result) {
    print $result;
} else {
    print &quot;XSLT error: &quot;.xslt_error($x).&quot;, error code: &quot;.xslt_errno($x);
}
xslt_free($x);
</pre>

<p>Note that the behaviour of Sablotron (which is used by PHP's
<code>xslt_process()</code>) has changed: you should now use an absolute
URI for the XSL and XML files, so we use PHP's <code>getcwd()</code>
function to locate them.  The URI is in the form
&quot;file://<i>absolute-pathname</i>/<i>filename</i>&quot;.</p>
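<p>For comparison, the same kind of absolute file URI can be built in
Python as sketched below.  The helper name <code>file_uri</code> is
hypothetical.  One difference from simple string concatenation is that
<code>as_uri()</code> also percent-escapes characters such as spaces in
the path.</p>

<pre>
from pathlib import Path


def file_uri(filename):
    """Return an absolute file:// URI for a file in the current directory."""
    # Path.cwd() is always absolute, which as_uri() requires.
    return (Path.cwd() / filename).as_uri()
</pre>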

<p>Your webserver may or may not be enabled for XSLT.  If you run the
<a href="perlphp.html_param_#PHI"><em class="file">phpinfo.php</em></a> script and see a section headed &quot;xslt&quot; which says XSLT is enabled then you're in luck.</p>

<h4>Example 1</h4>

<p>On this page the main work done by the XSLT is to generate the
&quot;Page Index&quot; at the top.  This is done on-the-fly to ease
maintenance: it is now very easy to add new pages and sections without
the tedious and error-prone task of creating the index by hand.</p>

<p>Have a look at the <a href="/file.pl/techie/website/xml.xml">XML
source</a> for this page and the corresponding <a
href="/file.pl/techie/website/website.xsl">XSL stylesheet</a> for an
example of how it can be used.  The XML source is basically normal
XHTML with the addition of some <code>&lt;section&gt;</code> and
<code>&lt;subsect&gt;</code> elements.  The document type declaration
at the top is used to define any HTML entities that are not recognised
by XSL, in this case an &quot;emdash&quot;, which has Unicode number
8212.</p>

<h4>Example 2</h4>

<p>Another place where I use XSLT is in the picture gallery on my
daughter's website.  This is a fairly substantial example,
illustrating lots of loops and conditionals, named templates,
variables and parameters, and the use of the <code>count()</code>
function among other things.</p>

<p>In this case the metadata for the pictures are stored in an <a
href="/file.pl/$/edgingtonfamily/hannah/pictures.xml">XML file</a>, and transformed
according to an <a href="/file.pl/$/edgingtonfamily/hannah/pictures.xsl">XSL
Transformation file</a>.  The end product is a normal, validating <a
href="http://hannah.edgingtonfamily.org/pictures.php?number=61">XHTML
webpage</a>.</p>

<h4>Example 3</h4>

<p>My <a href="/christian/sermons/">sermons</a> pages use XSLT to
produce both the main index page and the individual pages for each of
the sermons, among other things.  Details of the sermons available are held
in an <a href="/file.pl/christian/sermons/sermons.xml">XML index
file</a> which is itself automatically generated and cached from the
sermon texts with another <a
href="/file.pl/christian/sermons/index.xsl">XSLT</a> script.</p>

<p>The <a href="/file.pl/christian/sermons/sermons.xsl">XSLT for the
index</a> reformats the XML file as an HTML table: it demonstrates
sorting with XSL.  The <a
href="/file.pl/christian/sermons/sermon.xsl">XSLT for the individual
sermon pages</a> transforms the XML for the selected sermon into a
final XHTML page.  The XSL <code>document()</code> function is used to
include the body of the sermon which is stored in a separate file.</p>

<p>Incidentally, these examples demonstrate a crude way of using
arrays in XSLT.  It's neither fast nor elegant, and it violates the
standard (see below), but it is pretty useful.</p>

<p>The array is created as a node-set in the variable
<code>$months</code> which contains the twelve months in order.  Then
in the template below an array-lookup is effectively done with
<code>select=&quot;$months/month[position()=$m]&quot;</code>.</p>

<pre>
&lt;xsl:variable name=&quot;months&quot;&gt;
  &lt;month&gt;January&lt;/month&gt;
  &lt;month&gt;February&lt;/month&gt;
...
  &lt;month&gt;November&lt;/month&gt;
  &lt;month&gt;December&lt;/month&gt;
&lt;/xsl:variable&gt;

&lt;xsl:template match=&quot;date&quot;&gt;
  &lt;xsl:number value=&quot;d&quot;/&gt;
  &lt;xsl:text&gt; &lt;/xsl:text&gt;
  &lt;xsl:variable name=&quot;m&quot; select=&quot;m&quot;/&gt;
  &lt;xsl:value-of select=&quot;$months/month[position()=$m]&quot;/&gt;
  &lt;xsl:text&gt; &lt;/xsl:text&gt;
  &lt;xsl:number value=&quot;y&quot;/&gt;
&lt;/xsl:template&gt;
</pre>

<p>The XML fragment this matches looks like</p>

<pre>
&lt;date&gt;&lt;y&gt;YYYY&lt;/y&gt;&lt;m&gt;MM&lt;/m&gt;&lt;d&gt;DD&lt;/d&gt;&lt;/date&gt;
</pre>
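<p>For anyone more at home in a procedural language, the lookup the
XSLT template performs corresponds roughly to this Python sketch.  It
is an analogue only; the names <code>MONTHS</code> and
<code>format_date</code> are made up for the example.</p>

<pre>
import xml.etree.ElementTree as ET

MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def format_date(date):
    """Format a date element with y, m and d children as "D Month YYYY"."""
    # int() drops leading zeros, much as xsl:number does with its value.
    day = int(date.findtext("d"))
    month = int(date.findtext("m"))
    year = int(date.findtext("y"))
    if month not in range(1, 13):
        raise ValueError("month out of range: " + str(month))
    # XPath position() counts from 1, whereas Python lists count from 0.
    return "{} {} {}".format(day, MONTHS[month - 1], year)
</pre>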

<p>It is important to note that this array technique, strictly speaking,
defies the <a
href="http://www.w3.org/TR/xslt#section-Result-Tree-Fragments">XSLT
1.0 standard</a>, since we are trying to treat a result tree fragment
(the <code>$months</code> variable) as a node-set.  In the XSLT 2.0
standard this will be allowed.  Meanwhile, my XSLT processor
(Sablotron) automatically converts result tree fragments to node-sets.
Other processors may have an extension function (called something like
<code>exsl:node-set()</code>) that will do the conversion
explicitly.  Portable code to do this safely looks like this mess:</p>

<pre>
&lt;xsl:choose&gt;
  &lt;xsl:when test=&quot;function-available('exsl:node-set')&quot;&gt;
    &lt;xsl:value-of select=&quot;exsl:node-set($months)/month[position()=current()/m]&quot; /&gt;
  &lt;/xsl:when&gt;
  &lt;xsl:otherwise&gt;
    &lt;xsl:value-of select=&quot;$months/month[position()=current()/m]&quot; /&gt;
  &lt;/xsl:otherwise&gt;
&lt;/xsl:choose&gt;
</pre>

<h4>Example 4</h4>

<p>My XSLT for the ESV Bible text demonstrates some nifty code for
handling footnotes, including the use of the &quot;mode&quot; attribute
of &lt;xsl:apply-templates/&gt;.  See the <a href="/esv">ESV Web
Service</a> pages for more on this.</p>

</subsect>

</section>

<section title="Conclusion" ref="CNC">

<p>It looks like the combination of XML and XSLT is tremendously
powerful, and it is definitely the future of the Web.  The majority of
Web users will find that their browsers already support client-side
XSL transformation, or the transformations can be applied server-side,
as described above.</p>

<p>But it's not just about web browsing: it marks a genuine division
between the content of an XML document and the presentation of that
document as defined in the XSL file.  This means that the same data
may be delivered in a multitude of ways by a multitude of media
without the artificial constraints imposed by HTML and the visual
formatting model.  This will facilitate a host of <em>web
services</em> that can be embedded into external internet-enabled
devices; the M'Cheyne daily Bible reading server is, in its humble
way, one such service.</p>

<p>For a thorough overview of XML and related issues, have a look at
the <a href="http://www.ucc.ie/xml/faq.xml">XML FAQ</a>.</p>

</section>

</page>