[Alug]Trouble with foreign character input

MJ Ray

15 Sep 2003

7:34 p.m.

I want to type characters like c-cedilla (French) and c-circumflex (Slavic) easily. I cannot get the compose key or the dead_circumflex character in xmodmap to work for anything other than combining a circumflex with a space to produce ^. Redefining keys with xmodmap, e.g.:

    keycode 54 = c C ccircumflex Ccircumflex

doesn't work either (output of xmodmap -pm gives:

    shift       Shift_L (0x32),  Shift_R (0x3e)
    lock        Caps_Lock (0x42)
    control     Control_L (0x25),  Control_R (0x6d)
    mod1        Alt_L (0x40)
    mod2        Num_Lock (0x4d)
    mod3
    mod4        Super_L (0x73),  Super_R (0x74)
    mod5        ISO_Level3_Shift (0x71)

in case I've got a modifier wrong.)

That used to work with XFree86 3.3.1, using AltGr-c to give c-circumflex. Indeed, it's the same xmodmap file and the same pc105 gb keyboard settings in the config file (but a different place), but I am now using XFree86 version 4.3.0. If someone knows what is wrong, please tell me. If I can give more helpful info, please tell me what you want.

-- MJR/slef My Opinion Only and possibly not of any group I know. http://mjr.towers.org.uk/ gopher://g.towers.org.uk/ slef@jabber.at Creative copyleft computing services via http://www.ttllp.co.uk/

Ted.Harding＠nessie.mcc.ac.uk

16 Sep

2:24 a.m.

On 15-Sep-03 MJ Ray wrote:

I want to type characters like c-cedilla (French) and c-circumflex (Slavic) easily. I cannot get the compose key or the dead_circumflex character in xmodmap to work for anything other than combining a circumflex with a space to produce ^.

There's a possible confusion here (apologies if I'm wrong in your case, but what you write suggests it). What you call "c-circumflex (Slavic)" is not c-circumflex but c-hacek (the accent is like an upside-down circumflex), as in Czech. Is this a possible source of the non-cooperation? You would probably need to make sure (if using iso-8859-X 256-byte font encodings) that you have iso-8859-2 (aka ISO-Latin-2) fonts ("Central European") available. At the end of the day, Unicode will solve these problems, since it uses multibyte encoding (and probably has a code for anything a 2-year-old might scribble on paper). However, it is only now beginning to become available on Linux.

Ted.
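
The "multibyte" point can be made concrete with a small sketch. It assumes UTF-8 as the Unicode encoding (the message above only says "multibyte"), and the helper name utf8_hex is made up for this illustration. It prints the code point and the UTF-8 bytes of c-cedilla, c-circumflex and c-hacek:

```python
def utf8_hex(ch):
    """Return the UTF-8 encoding of one character as space-separated hex bytes."""
    return ch.encode("utf-8").hex(" ")


if __name__ == "__main__":
    # c-cedilla, c-circumflex and c-hacek, the three letters discussed above.
    for ch in "çĉč":
        print(f"U+{ord(ch):04X} {ch} -> {utf8_hex(ch)}")
```

Each of these letters takes two bytes in UTF-8, whereas an ISO-8859-X encoding that contains the letter at all stores it in a single byte.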

--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@nessie.mcc.ac.uk
Fax-to-email: +44 (0)870 167 1972
Date: 15-Sep-03                                       Time: 20:09:14
------------------------------ XFMail ------------------------------

MJ Ray

17 Sep

12:01 p.m.

On 2003-09-15 20:09:14 +0100 (Ted Harding) Ted.Harding@nessie.mcc.ac.uk wrote:

There's a possible confusion here (apologies if I'm wrong in your case, but what you write suggests it). What you call "c-circumflex (Slavic)" is not c-circumflex but c-hacek (the accent is like an upside-down circumflex), as in Czech.

No, I mean c-circumflex from the ISO-8859-3 (Southern European) encoding. Sorry for using the imprecise term "Slavic" (I knew the encoding began with S, but didn't look up the proper name).

As I'm typing across 3 of the ISO Latin character sets (-1, -3 and -15), I do already have the machine configured for UTF-8, as far as I can tell. Some applications with alternative input methods (yudit, qemacs) can display all the characters, so that's why I think it is an input configuration problem rather than a font problem. I could be wrong.

Ted.Harding＠nessie.mcc.ac.uk

1:37 p.m.

On 17-Sep-03 MJ Ray wrote:

On 2003-09-15 20:09:14 +0100 (Ted Harding) Ted.Harding@nessie.mcc.ac.uk wrote:

There's a possible confusion here (apologies if I'm wrong in your case, but what you write suggests it). What you call "c-circumflex (Slavic)" is not c-circumflex but c-hacek (the accent is like an upside-down circumflex), as in Czech.

No, I mean c-circumflex from the ISO-8859-3 (Southern European) encoding. Sorry for using the imprecise term "Slavic" (I knew the encoding began with S, but didn't look up the proper name).

Hmmm ... 8859-3 was originally designed for Maltese, Turkish (now superseded by 8859-9) and Esperanto. Esperanto is the only place I've ever come across c-circumflex. Hmmm ...
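
As a quick cross-check of which ISO Latin sets carry which letter, here is a minimal sketch that asks Python's built-in codecs. The helper name charsets_for and the choice of the four sets to test (the three MJ Ray mentions plus 8859-2) are assumptions made for this illustration:

```python
# Python codec names for ISO-8859-1, -2, -3 and -15.
CODECS = ["iso8859_1", "iso8859_2", "iso8859_3", "iso8859_15"]


def charsets_for(ch):
    """Return the codecs from CODECS that can encode the character ch."""
    found = []
    for codec in CODECS:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            continue  # this set has no slot for ch
        found.append(codec)
    return found


if __name__ == "__main__":
    # c-cedilla, c-hacek, c-circumflex
    for ch in "çčĉ":
        print(ch, charsets_for(ch))
```

Of these four sets, c-circumflex is encodable only in ISO-8859-3 and c-hacek only in ISO-8859-2, while c-cedilla is present in all of them.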

As I'm typing across 3 of the ISO Latin character sets (-1, -3 and -15), I do already have the machine configured for UTF-8, as far as I can tell. Some applications with alternative input methods (yudit, qemacs) can display all the characters, so that's why I think it is an input configuration problem rather than a font problem. I could be wrong.

But you're probably right: I did a little hunt on

"c-circumflex" "utf-8" linux

and was led to

http://mail.nl.linux.org/linux-utf8/2002-01/msg00041.html

which doesn't solve the question but does indicate that there are built-in problems (it looks as though that whole mailing list is worth a browse, though it doesn't seem to have a spam filter ... ).

Cheers, Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@nessie.mcc.ac.uk
Fax-to-email: +44 (0)870 167 1972
Date: 17-Sep-03                                       Time: 12:18:12
------------------------------ XFMail ------------------------------

MJ Ray

2:56 p.m.

On 2003-09-17 12:18:12 +0100 (Ted Harding) Ted.Harding@nessie.mcc.ac.uk wrote:

http://mail.nl.linux.org/linux-utf8/2002-01/msg00041.html which doesn't solve the question but does indicate that there are built-in problems (it looks as though that whole mailing list is worth a browse,

Thanks Ted! I always thought that list was only concerned with the console, but there are some XFree86 messages in it. It seems that the locale was incorrect at the time xdm was started (when I used version 3.3.1, I used startx) and so X didn't recognise all the input I was trying to use. I also needed to add the "compose:rwin" XKB option to get the compose key on the right Windows key, so I can get characters like c-cedilla by pressing compose, comma, c in order.
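
The idea behind a compose sequence (compose key, then a mark key, then the base letter) can be sketched as follows. This is only an analogy: X looks sequences up in a locale-specific Compose table, whereas this sketch leans on Unicode normalisation. The MARKS table and the function name compose are hypothetical, invented for the illustration:

```python
import unicodedata

# Hypothetical table: the mark key typed after the compose key, mapped to
# the Unicode combining character it stands for.
MARKS = {
    ",": "\u0327",  # combining cedilla
    "^": "\u0302",  # combining circumflex
}


def compose(mark_key, base):
    """Combine a base letter with the mark selected by mark_key (NFC form)."""
    return unicodedata.normalize("NFC", base + MARKS[mark_key])


if __name__ == "__main__":
    print(compose(",", "c"))  # c-cedilla
    print(compose("^", "c"))  # c-circumflex
```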

I can now type ç, ĉ, ŭ and other things that won't show correctly on most of your screens, but I still don't understand why my xmodmap doesn't work. Which modifier is used for the 3rd and 4th columns? Mod3? Mod5?

Ted.Harding＠nessie.mcc.ac.uk

2:14 p.m.

On 17-Sep-03 MJ Ray wrote:

No, I mean c-circumflex from the ISO-8859-3 (Southern European) encoding. Sorry for using the imprecise term "Slavic" (I knew the encoding began with S, but didn't look up the proper name).

As I'm typing across 3 of the ISO Latin character sets (-1, -3 and -15), I do already have the machine configured for UTF-8, as far as I can tell. Some applications with alternative input methods (yudit, qemacs) can display all the characters, so that's why I think it is an input configuration problem rather than a font problem. I could be wrong.

A follow-up (which may become increasingly irrelevant if Unicode/UTF-8 gets properly integrated into Linux).

Long ago I realised that in Linux/X11 the issues of keyboard input in multilingual documents (i.e. where several languages with incompatible encodings were all present) and rendering/display of the text had to be kept separate if it was going to work. The solution was to use good typesetting software (the main candidates in Linux are groff and TeX). You can enter your funnies using ASCII transcriptions, and the software is responsible for ensuring that they come out right. A bit of preliminary work is needed, but afterwards things become pretty straightforward.

As an example, for Cyrillic I made a groff macro "cyr" which sets up a correspondence between character "names" and their groff codes. Excerpt (the original has a lot of lines):

    .de cyr
    .ft AntCy
    ...
    .char a \N'193'
    ...
    .char [i:] \N'202'
    .char k \N'203'
    ...
    .char o \N'207'
    ...
    .char s \N'211'
    ...
    .char v \N'215'
    ...
    .char [Tch] \N'254'
    ...
    ..

(with a corresponding macro /cyr which undoes all these definitions). Each ".char" definition states that the first argument (e.g. [i:]) is the name of a character which will be represented as the glyph at the position (e.g. 202) given by the second argument in the font which I've called "AntCy" (Antique Cyrillic in the original).

So, for instance, the name of the composer is entered as

    .cyr
    [Tch]a[i:]kovski[i:]
    ./cyr

which is pretty readable and will be displayed in good Cyrillic. As a more extended example, you could enter the ASCII text

    Did
    .cyr
    [Kh]ru[shch][yo]v
    ./cyr
    like
    .cyr
    [Tch]a[i:]kovski[i:]\c
    ./cyr
    's music?

which shows how, in a mixed-language document, it can become a bit cumbersome and not so easy on the eye. However, it works well and is infinitely flexible (provided you have the PostScript fonts needed). Where there are extended passages in one language, then you won't be switching all the time and it gets much more readable.
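
For readers without groff to hand, the name-to-glyph lookup can be imitated in a few lines of Python. This is a sketch, not the macro itself. The glyph positions are copied from the excerpt above; they happen to coincide with KOI8-R code points, so the sketch assumes the AntCy font follows the KOI8-R layout. The entry for plain "i" is not in the excerpt and is filled in from KOI8-R as an assumption. The function name cyr simply mirrors the macro name:

```python
import re

# Name -> glyph position, copied from the .char lines in the excerpt.
GLYPHS = {
    "a": 193,
    "[i:]": 202,
    "k": 203,
    "o": 207,
    "s": 211,
    "v": 215,
    "[Tch]": 254,
}
# Assumption: "i" is elided in the excerpt; 201 is its KOI8-R position.
GLYPHS["i"] = 201

# A token is either a bracketed name such as [Tch] or a single character.
TOKEN = re.compile(r"\[[^\]]+\]|.", re.DOTALL)


def cyr(text):
    """Replace each known character name with the glyph at its position."""
    out = []
    for name in TOKEN.findall(text):
        code = GLYPHS.get(name)
        if code is None:
            out.append(name)  # unknown names pass through unchanged
        else:
            out.append(bytes([code]).decode("koi8_r"))
    return "".join(out)


if __name__ == "__main__":
    print(cyr("[Tch]a[i:]kovski[i:]"))
```

Under those assumptions the composer's name comes out as Чайковский. Names that are not in the table, such as [Kh] here, are left as typed.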

Similar things are possible with TeX.

Just a comment ... comments invited! Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@nessie.mcc.ac.uk
Fax-to-email: +44 (0)870 167 1972
Date: 17-Sep-03                                       Time: 13:08:35
------------------------------ XFMail ------------------------------

MJ Ray

5:34 p.m.

On 2003-09-17 13:08:35 +0100 (Ted Harding) Ted.Harding@nessie.mcc.ac.uk wrote:

You can enter your funnies using ASCII transcriptions, and the software is responsible for ensuring that they come out right. A bit of preliminary work is needed, but afterwards things become pretty straightforward.

As long as it is only a minority of your work, or you have editors (like yudit and qemacs) that work around inadequate input devices and can do some smart conversion work from files actually written from the remote character set into the transliteration. ("funnies" probably isn't the best thing to call other people's characters...)

It's a good half-way solution, but the problem is going away: today's TeX can accept some Unicode input, it seems.
