Yeah, I know this isn't strictly a Linux question, but you guys are leet, so...
My dad has a slight problem with his coursework. He's been given a piece of XML which is terribly structured: there is a lot of duplicated data. The XML in question is at http://www.imen.org.uk/pd.php?id=24
The first stage of the problem was to break it down per invoice: instead of having four lines each containing the shipping data etc., he wants only one set of that per invoice, with the rest of the information nested under it.
That's relatively easy using a key: http://www.imen.org.uk/pd.php?id=25
Now this leads on to stage two... Each invoice contains information about individual suppliers. This again includes information which is unique per line (prodCode) but also information which isn't (supplierName).
Again, the problem is to show supplierName only once per supplier, so that the first invoice in the output would show three sets of lines: the first containing two products, the second and third containing one each (as opposed to the current output, which gives you four sets of one line each).
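To make the target shape concrete, here is a rough sketch of the two-level grouping in Python rather than in the stylesheet itself. Everything in the sample is an assumption for illustration: the element names `invoices`, `line` and `invoiceNo`, the supplier names and the product codes are made up, and only `supplierName` and `prodCode` come from the description above. The pair (invoice number, supplier name) acts as the composite grouping key.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat input: the structure and values are invented for
# illustration and are NOT the real coursework file.
SAMPLE = """
<invoices>
  <line><invoiceNo>1</invoiceNo><supplierName>Acme</supplierName><prodCode>P1</prodCode></line>
  <line><invoiceNo>1</invoiceNo><supplierName>Acme</supplierName><prodCode>P2</prodCode></line>
  <line><invoiceNo>1</invoiceNo><supplierName>Bolt</supplierName><prodCode>P3</prodCode></line>
  <line><invoiceNo>1</invoiceNo><supplierName>Cord</supplierName><prodCode>P4</prodCode></line>
  <line><invoiceNo>2</invoiceNo><supplierName>Acme</supplierName><prodCode>P1</prodCode></line>
</invoices>
"""


def group_lines(xml_text):
    """Group flat lines by invoice number first, then by supplier name."""
    root = ET.fromstring(xml_text.strip())
    grouped = {}  # invoiceNo -> {supplierName -> [prodCode, ...]}
    for line in root.findall("line"):
        invoice = line.findtext("invoiceNo")
        supplier = line.findtext("supplierName")
        code = line.findtext("prodCode")
        # Lines that share both invoice and supplier end up in the same list.
        grouped.setdefault(invoice, {}).setdefault(supplier, []).append(code)
    return grouped


result = group_lines(SAMPLE)

# Print each supplier once per invoice, with its product codes beneath it.
for invoice, suppliers in result.items():
    print("invoice", invoice)
    for supplier, codes in suppliers.items():
        print("   supplier", supplier)
        for code in codes:
            print("      product", code)
```

With this made-up data the first invoice comes out as three supplier groups (two products, then one, then one), which is the shape described above. The grouping depends only on values inside each line, not on which parent element the lines share.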
Now I've tried lots of things. I can't generate a composite key based on the parent id because all the lines have the same parent; if the file were structured better, it would be easier to group. Any ideas? (I worked on this for about four hours last night, plus whatever my dad did on it, and we're still no closer to a solution... is it even possible?)
Thanks, JT