On 28-May-10 20:37:39, Adam Bower wrote:
On Fri, May 28, 2010 at 08:57:51PM +0100, Ted Harding wrote:
That strikes me as completely nuts! I would welcome any comment about it. I have tried to track down where this is done, and to locate on my system (Debian Lenny, also occurs on earlier Debian Etch) any system file which defines the sort order (i.e. collation order) of the standard ASCII (and other) characters.
All help and/or insight much appreciated!
man 5 locale should be a starting point.
Adam
Thanks Adam. Thanks also to some folk on the Linux-Users list, whose hints led me to realise that the problem with

sort << EOT
"ABCD"
"A CD"
EOT
# "ABCD"
# "A CD"

sort << EOT
"ADCD"
"A CD"
EOT
# "A CD"
# "ADCD"

arises because, under the default locale's collation rules, the " " is ignored when comparing. So in the first case sort was effectively comparing "ABCD" with "ACD", and returned "ABCD", "A CD", while in the second case it was comparing "ADCD" with "ACD", and returned "A CD", "ADCD".
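To make that reasoning concrete, here is a rough sketch that imitates only the effect described: it deletes the spaces and then sorts what is left bytewise. The helper name first_pass_order is made up for illustration, and this approximates the observed behaviour; it is not how sort or the locale tables actually implement collation.

```shell
# Hypothetical helper: mimic "space is ignored" by deleting spaces,
# then sort the remaining characters bytewise (LC_ALL=C).
first_pass_order() {
    printf '%s\n' "$@" | tr -d ' ' | LC_ALL=C sort
}

first_pass_order '"ABCD"' '"A CD"'
# "ABCD"
# "ACD"

first_pass_order '"ADCD"' '"A CD"'
# "ACD"
# "ADCD"
```

With the space gone, "ACD" falls after "ABCD" but before "ADCD", which matches the order sort reported in both cases.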
A solution is to set LC_COLLATE=C and export it. Continuing in the same shell session as above:

export LC_COLLATE=C

sort << EOT
"ABCD"
"A CD"
EOT
# "A CD"
# "ABCD"

sort << EOT
"ADCD"
"A CD"
EOT
# "A CD"
# "ADCD"

Because "export" makes LC_COLLATE available to processes spawned by the shell, this also works within the application that revealed the problem in the first place.
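A quick way to check the inheritance is to let a child shell stand in for the application. This is only a sketch: the child started with sh -c is a stand-in, not the application in question, and the "unset LC_ALL" line is an added precaution, there because LC_ALL, when set, takes precedence over LC_COLLATE.

```shell
unset LC_ALL          # if set, LC_ALL would override LC_COLLATE
export LC_COLLATE=C

# The child shell inherits LC_COLLATE from this shell's environment.
child_locale=$(sh -c 'printf "%s" "$LC_COLLATE"')
child_sorted=$(printf '%s\n' '"ABCD"' '"A CD"' | sh -c 'sort')

printf 'LC_COLLATE in child: %s\n' "$child_locale"
# LC_COLLATE in child: C
printf '%s\n' "$child_sorted"
# "A CD"
# "ABCD"
```

For a one-off, the variable can instead be set for a single command, as in "LC_COLLATE=C sort", without exporting it for the whole session.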
Cheers, Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 28-May-10                                       Time: 23:10:18
------------------------------ XFMail ------------------------------