This has been driving me mad all afternoon.
How can I compare xml files from the commandline?
Specifically: If I have two XML files and one contains:
    ...
    <field name="A">ValueA</field>
    <field name="B">ValueB</field>
    ...
and the other contains:
    ...
    <field name="B">ValueB</field>
    <field name="A">ValueA</field>
    ...
... then I consider those two files to be the same, but obviously a diff will not. How can I compare just the content and not the ordering/formatting/etc?
On Mon, 23 Oct 2023 15:20:42 +0100 Mark Rogers mark@more-solutions.co.uk allegedly wrote:
> This has been driving me mad all afternoon.
> How can I compare xml files from the commandline?
> Specifically: If I have two XML files and one contains:
>     ...
>     <field name="A">ValueA</field>
>     <field name="B">ValueB</field>
>     ...
> and the other contains:
>     ...
>     <field name="B">ValueB</field>
>     <field name="A">ValueA</field>
>     ...
> ... then I consider those two files to be the same, but obviously a diff will not. How can I compare just the content and not the ordering/formatting/etc?
> Mark
Try "sorting" the files before the diff.
i.e.
sort one.txt > sortedone.txt
sort two.txt > sortedtwo.txt
diff sortedone.txt sortedtwo.txt
Mick
---------------------------------------------------------------------
Mick Morgan
gpg fingerprint: FC23 3338 F664 5E66 876B 72C0 0A1F E60B 5BAD D312
blog: baldric.net
---------------------------------------------------------------------
On Mon, 23 Oct 2023 at 22:51, mick mbm@rlogin.net wrote:
> Try "sorting" the files before the diff.
> i.e.
> sort one.txt > sortedone.txt
> sort two.txt > sortedtwo.txt
> diff sortedone.txt sortedtwo.txt
Not really something I'd recommend. You'd want something that's XML-aware. A very quick search shows this, might be useful?
https://xmldiff.readthedocs.io/en/stable/commandline.html
Regards, Srdjan
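To illustrate the concern about a plain line sort, here is a minimal Python sketch (not something posted in the thread). The element names `template`, `output` and `field` follow the examples in this discussion; the `name="first"` and `name="second"` attributes are made up for the illustration. The two documents place the fields under different parents, yet contain exactly the same lines, so sorting before diffing reports no difference.

```python
# Two documents that differ structurally but are built from the same lines.
# The name="first"/"second" attributes are hypothetical, for illustration only.
doc_one = """\
<template>
<output name="first">
<field name="A">ValueA</field>
</output>
<output name="second">
<field name="B">ValueB</field>
</output>
</template>
"""

doc_two = """\
<template>
<output name="first">
<field name="B">ValueB</field>
</output>
<output name="second">
<field name="A">ValueA</field>
</output>
</template>
"""


def sorted_lines(text):
    """Mimic `sort file` by sorting the lines of the text."""
    return sorted(text.splitlines())


# The documents are different, but their sorted lines are identical,
# so `diff` on the sorted files would stay silent.
same_after_sort = sorted_lines(doc_one) == sorted_lines(doc_two)
print(doc_one == doc_two, same_after_sort)
```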
On Mon, 23 Oct 2023 at 23:12, Srdjan Todorovic todorovic.s@googlemail.com wrote:
> On Mon, 23 Oct 2023 at 22:51, mick mbm@rlogin.net wrote:
>> Try "sorting" the files before the diff.
> Not really something I'd recommend. You'd want something that's XML-aware.
Agreed. Yes, it would work in my limited example, but the actual files are more complex and a simple sort could give misleading results.
Also: I should clarify that I'm not just looking for an "are they the same?" comparison but also a meaningful summary of where the differences are, which an xmldiff tool would most likely have given some thought to.
> A very quick search shows this, might be useful?
I did try this yesterday, and it does get close, but this is a sample of its output:
[move, /template/output[4], /template[1], 14]
[move, /template/output[4], /template[1], 15]
[delete-attribute, /template/output[1], desc]
[delete-attribute, /template/output[2], desc]
[delete-attribute, /template/output[3], desc]
[delete-attribute, /template/output[4], desc]
[delete-attribute, /template/output[5], desc]
[delete-attribute, /template/output[6], desc]
[delete-attribute, /template/output[7], desc]
[delete-attribute, /template/output[8], desc]
[move, /template/output[8]/field[14], /template/output[8], 14]
[move, /template/output[5]/field[1], /template/output[4], 0]
...
I suspect that combined with a GUI this would be excellent (and I know I asked for a commandline solution but I'd take that). But as it is that's quite hard to actually work with to work out where the differences are to fix them.
(Background: Some time ago I wrote some scripts to generate dozens of XML configuration files for a wide range of devices. The target software has now been updated and the configuration file structure has changed. Performing the software upgrade also upgraded all the XML config files, but I need to adapt my scripts to reliably generate files in the new format. So I need to find the differences between what I generate and what the upgraded files look like. A lot of the changes are minimal, like the upgraded files have had all comments stripped out, but also they've shuffled the order of various bits around in ways which are entirely cosmetic but which I can't easily replicate without major changes to my script.)
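One way to sketch an XML-aware "same content?" check that ignores comments, whitespace and sibling order is shown below, using only Python's standard library. This is an illustration under stated assumptions rather than a tool anyone in the thread used: the helper names `normalise` and `same_content` are hypothetical, and the sketch treats the order of all sibling elements as insignificant, which may not hold for every part of a real configuration format.

```python
import xml.etree.ElementTree as ET


def normalise(element):
    """Return a nested tuple describing an element, independent of
    attribute order, sibling order and surrounding whitespace."""
    # Sorting the children's normalised forms makes sibling order irrelevant.
    children = sorted(normalise(child) for child in element)
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        (element.text or "").strip(),
        tuple(children),
    )


def same_content(xml_a, xml_b):
    """True if both documents hold the same elements, attributes and text.
    Comments are dropped by ElementTree's default parser."""
    return normalise(ET.fromstring(xml_a)) == normalise(ET.fromstring(xml_b))


generated = """<template><!-- generated by script -->
  <field name="A">ValueA</field>
  <field name="B">ValueB</field>
</template>"""
upgraded = '<template><field name="B">ValueB</field><field name="A">ValueA</field></template>'

print(same_content(generated, upgraded))
```

This answers only the yes/no question; it does not say where two files differ.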
On Tue, 24 Oct 2023 at 10:54, Mark Rogers mark@more-solutions.co.uk wrote:
> I did try this yesterday, and it does get close, but this is a sample of its output:
> [move, /template/output[4], /template[1], 14]
There's also this:
https://superuser.com/questions/79920/how-can-i-diff-two-xml-files
It talks about running xmllint on both files, and then doing a normal diff on the result. Probably fiddly to use for multiple comparisons, but might be a slightly better option?
> I suspect that combined with a GUI this would be excellent (and I know I asked for a commandline solution but I'd take that). But as it is that's quite hard to actually work with to work out where the differences are to fix them.
XML Spy (commercial paid software) should be able to do it. Also there are XML plugins for VS Code (will run on Linux, but I know some might object).
Regards, Srdjan
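The "normalise both files, then run an ordinary diff" idea can also be sketched in Python instead of xmllint. This is a swapped-in technique, not the approach from the linked answer: it uses `xml.etree.ElementTree.canonicalize` (Python 3.8+) to drop comments, sort attributes and strip whitespace, then `ET.indent` (Python 3.9+) to put one element per line, and `difflib` for the diff. The helper names and the file labels `generated.xml` and `upgraded.xml` are hypothetical. Note that this does not reorder elements, so shuffled siblings still show up as differences.

```python
import difflib
import xml.etree.ElementTree as ET


def pretty_lines(xml_text):
    """Normalise a document and return it as a list of indented lines."""
    # C14N output: comments removed, attributes sorted, whitespace stripped.
    canonical = ET.canonicalize(xml_text, strip_text=True)
    root = ET.fromstring(canonical)
    ET.indent(root)  # one element per line, two-space indentation
    return ET.tostring(root, encoding="unicode").splitlines()


def xml_diff(old_text, new_text):
    """Return unified-diff lines between the two normalised documents."""
    return list(
        difflib.unified_diff(
            pretty_lines(old_text),
            pretty_lines(new_text),
            fromfile="generated.xml",  # hypothetical labels
            tofile="upgraded.xml",
            lineterm="",
        )
    )


old = '<template>\n  <field name="A" desc="x">ValueA</field>\n</template>'
new = '<template><!-- note --><field desc="x" name="A">ValueB</field></template>'
for line in xml_diff(old, new):
    print(line)
```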
On Tue, 24 Oct 2023 at 17:39, Srdjan Todorovic todorovic.s@googlemail.com wrote:
> There's also this:
> https://superuser.com/questions/79920/how-can-i-diff-two-xml-files
> It talks about running xmllint on both files, and then doing a normal diff on the result. Probably fiddly to use for multiple comparisons, but might be a slightly better option?
Thanks, I'll take a look. I'm sure it can be scripted to make it simpler.
(In an ideal world, I think that when comparing "new" and "old", the "new" file should be left unchanged but differences with the "old" shown as diffs against it, with things that have simply moved around ignored. Anything which reformats both files for the comparison will tell you that there is a difference in line X where X is meaningless in terms of either the old or new files.)
> Also there are XML plugins for VS Code (will run on Linux, but I know some might object).
I do use VSCode as an editor (not a huge fan but I haven't found a cross-platform GUI editor I prefer) so I'll take a look.
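The "ideal world" comparison described above, where differences are reported in terms that mean something in the original files and pure moves are ignored, could be approximated with the rough Python sketch below. It is only an illustration with hypothetical helper names (`element_records`, `summarise`). Each element is recorded as a (path, attributes, text) triple, where the path carries no positional index, and the two files are compared as multisets. One known limitation of this assumption: an element that moves between two parents with the same path (for example from one `output` to another) is not reported.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def element_records(xml_text):
    """Count (path, attributes, text) records, ignoring sibling position."""
    records = Counter()

    def walk(element, parent_path):
        path = f"{parent_path}/{element.tag}"
        attrs = tuple(sorted(element.attrib.items()))
        text = (element.text or "").strip()
        records[(path, attrs, text)] += 1
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return records


def summarise(old_xml, new_xml):
    """Return (records only in old, records only in new), both sorted."""
    old, new = element_records(old_xml), element_records(new_xml)
    only_old = sorted((old - new).elements())
    only_new = sorted((new - old).elements())
    return only_old, only_new


old = ('<template><output desc="main"><field name="A">ValueA</field>'
       '<field name="B">ValueB</field></output></template>')
new = ('<template><output><field name="B">ValueB</field>'
       '<field name="A">ValueA2</field></output></template>')

only_old, only_new = summarise(old, new)
for record in only_old:
    print("old only:", record)
for record in only_new:
    print("new only:", record)
```

In this example the reordered `B` field produces no output, while the dropped `desc` attribute and the changed `A` value are each listed once per side.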