Yeah, I know this isn't strictly a Linux question, but you guys are leet, so...
My dad has a slight problem with his coursework. He's been given a piece of XML which is terribly formatted: there is a lot of duplicated data. The XML in question is sitting at http://www.imen.org.uk/pd.php?id=24
The first stage of the problem was to break it down per invoice: instead of four lines that each repeat the shipping data etc., he wants only one set of that data per invoice, with the rest of the information nested under it.
That's relatively easy using a key: http://www.imen.org.uk/pd.php?id=25
Now this leads on to stage two... Each invoice contains information about individual suppliers. Some of that information is unique per line (prodCode), but some of it isn't (supplierName).
Again, the problem is to show supplierName only once per supplier, so that the first output would show three sets of lines, the first containing two products and the second and third containing one each (as opposed to the current output, which is four sets of one line each).
Now I've tried lots of things. I can't generate a composite key based on the parent id because all the records have the same parent; if the file was structured better it could be sorted better. Any ideas? (I worked on this for about four hours last night, plus whatever my dad did on it, and we're still no closer to a solution... is it indeed possible?)
Thanks,
JT
On Tue, 9 May 2006 10:23:32 +0100 (BST), jt@imen.org.uk said:
> Any ideas? (I worked on this for about four hours last night, plus whatever my dad did on it and still no closer to a solution... is it indeed possible?)
Have you got Saxon? (http://saxon.sf.net). If I understand your problem correctly, I think you could do it like this:
Replace your original XSLT with this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <xsl:key name="invoices-by-id" match="orders_rec" use="invoiceNum"/>

  <xsl:template match="/database">
    <orders>
      <xsl:apply-templates select="orders"/>
    </orders>
  </xsl:template>

  <xsl:template match="orders">
    <xsl:for-each select="orders_rec[count(. | key('invoices-by-id', invoiceNum)[1]) = 1]">
      <xsl:sort select="invoiceNum"/>

      <!-- This is per invoice information -->
      <invoice id="{invoiceNum}">
        <invoice_num><xsl:value-of select="invoiceNum"/></invoice_num>
        <contactName><xsl:value-of select="contactName"/></contactName>
        <shipToAddress><xsl:value-of select="shipToAddress"/></shipToAddress>

        <suppliers>
          <xsl:for-each select="key('invoices-by-id', invoiceNum)">
            <supplier>
              <supplierName><xsl:value-of select="supplierName"/></supplierName>
              <line>
                <prodCode><xsl:value-of select="prodCode"/></prodCode>
              </line>
            </supplier>
          </xsl:for-each>
        </suppliers>
      </invoice>
    </xsl:for-each>
  </xsl:template>

</xsl:transform>
Create a second XSLT stylesheet like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <!-- Re-create the root element so the result has a single root -->
  <xsl:template match="/">
    <orders>
      <xsl:apply-templates/>
    </orders>
  </xsl:template>

  <!-- Keep the invoice wrapper and its id so invoices stay separate -->
  <xsl:template match="invoice">
    <invoice id="{@id}">
      <xsl:apply-templates/>
    </invoice>
  </xsl:template>

  <xsl:template match="suppliers">
    <suppliers>
      <!-- Merge the supplier elements that share a supplierName -->
      <xsl:for-each-group select="supplier" group-by="supplierName">
        <supplier>
          <name><xsl:value-of select="current-grouping-key()"/></name>
          <xsl:for-each select="current-group()/line">
            <xsl:copy-of select="prodCode"/>
          </xsl:for-each>
        </supplier>
      </xsl:for-each-group>
    </suppliers>
  </xsl:template>

  <xsl:template match="invoice_num">
    <invoice_num><xsl:apply-templates/></invoice_num>
  </xsl:template>

  <xsl:template match="contactName">
    <contactName><xsl:apply-templates/></contactName>
  </xsl:template>

  <xsl:template match="shipToAddress">
    <shipToAddress><xsl:apply-templates/></shipToAddress>
  </xsl:template>

</xsl:transform>
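To make the grouping step concrete, here is roughly what the suppliers template should do to one invoice's worth of intermediate data. The supplier names and product codes below are made up for illustration and are not taken from the real file. The first element is the kind of input the second stylesheet receives, the second is the result, indented here for readability:

```xml
<example>
  <!-- Hypothetical input: one supplier element per original line -->
  <suppliers>
    <supplier><supplierName>Acme</supplierName><line><prodCode>A1</prodCode></line></supplier>
    <supplier><supplierName>Acme</supplierName><line><prodCode>A2</prodCode></line></supplier>
    <supplier><supplierName>Bolt</supplierName><line><prodCode>B1</prodCode></line></supplier>
  </suppliers>

  <!-- Result of xsl:for-each-group with group-by="supplierName" -->
  <suppliers>
    <supplier>
      <name>Acme</name>
      <prodCode>A1</prodCode>
      <prodCode>A2</prodCode>
    </supplier>
    <supplier>
      <name>Bolt</name>
      <prodCode>B1</prodCode>
    </supplier>
  </suppliers>
</example>
```

The example wrapper element is only there to keep the snippet well-formed; it is not something the stylesheet produces.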
And then transform your original data through both the stylesheets like this:
$ java net.sf.saxon.Transform database.xml transform1.xslt | java net.sf.saxon.Transform - transform2.xslt > new-database.xml
If you want it as a text file you'll have to write an additional stylesheet to do so.
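If you would rather stay in XSLT 1.0 and do it in one pass, a composite key may also work. What follows is only a sketch that I have not run against the real data. It assumes the same element names as above (database/orders/orders_rec with invoiceNum, supplierName and prodCode children), and it assumes the '|' separator never appears inside an invoice number or supplier name. The key name recs-by-invoice-and-supplier is just a name picked for this example. The idea is to key each record on the invoice number and supplier name joined together, rather than on the parent id:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- All records of one invoice -->
  <xsl:key name="invoices-by-id" match="orders_rec" use="invoiceNum"/>

  <!-- All records of one supplier within one invoice (composite key).
       Assumption: '|' does not occur in invoiceNum or supplierName. -->
  <xsl:key name="recs-by-invoice-and-supplier" match="orders_rec"
           use="concat(invoiceNum, '|', supplierName)"/>

  <xsl:template match="/database">
    <orders>
      <!-- First record of each invoice -->
      <xsl:for-each select="orders/orders_rec[count(. | key('invoices-by-id', invoiceNum)[1]) = 1]">
        <xsl:sort select="invoiceNum"/>
        <invoice id="{invoiceNum}">
          <invoice_num><xsl:value-of select="invoiceNum"/></invoice_num>
          <contactName><xsl:value-of select="contactName"/></contactName>
          <shipToAddress><xsl:value-of select="shipToAddress"/></shipToAddress>
          <suppliers>
            <!-- Within this invoice, first record of each supplier -->
            <xsl:for-each select="key('invoices-by-id', invoiceNum)[count(. | key('recs-by-invoice-and-supplier', concat(invoiceNum, '|', supplierName))[1]) = 1]">
              <supplier>
                <supplierName><xsl:value-of select="supplierName"/></supplierName>
                <!-- Every line for this invoice/supplier pair -->
                <xsl:for-each select="key('recs-by-invoice-and-supplier', concat(invoiceNum, '|', supplierName))">
                  <line>
                    <prodCode><xsl:value-of select="prodCode"/></prodCode>
                  </line>
                </xsl:for-each>
              </supplier>
            </xsl:for-each>
          </suppliers>
        </invoice>
      </xsl:for-each>
    </orders>
  </xsl:template>

</xsl:transform>
```

Since this uses only XSLT 1.0 features, it should not need Saxon specifically; any XSLT 1.0 processor ought to do.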
Cheers,
Richard
(waiting for a long make)
-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard Lewis
Sonic Arts Research Archive
http://www.sara.uea.ac.uk/
-=-=-=-=-=-=-=-=-=-=-=-=-=-