RTL support
Hi,
Regarding support for Arabic, Hebrew, etc. there are three elements:

a. RTL layout
b. Shaping
c. Bidi reordering

For myself, support for the first two, without bidi reordering, would be fantastic. Usually when I work in Arabic, I don't need bidi support. IMO it should be considered an add-on. Actually I think the way Vim does it is the way to go - each feature can be dis/enabled independently.

So how hard would it be to add plain RTL layout first, and then Arabic shaping?

thanks,

gregg

_______________________________________________
emacs-bidi mailing list
emacs-bidi@...
http://lists.gnu.org/mailman/listinfo/emacs-bidi
|
|
Re: RTL support
> Date: Tue, 22 Nov 2005 11:05:43 -0600
> From: Gregg Reynolds <gar@...>
>
> Regarding support for Arabic, Hebrew, etc. there are three elements:
>
> a. RTL layout
> b. Shaping
> c. Bidi reordering

No, there are only two: shaping and Bidi reordering. The former is relevant for Arabic scripts alone, AFAIK (Hebrew certainly doesn't need that, but I'm not sure whether there are scripts besides Arabic that need it). RTL layout is an integral part of bidi reordering.

> For myself, support for the first two, without bidi reordering, would be
> fantastic. Usually when I work in Arabic, I don't need bidi support.

I don't speak Arabic, so I cannot say how useful it is to have right-to-left display without bidi reordering. I _can_ tell you that bidi support is a must for Hebrew (because numbers should be displayed left to right), and the way Arabic related bidi features are specified in the Unicode Bidirectional Algorithm makes me wonder how come such an elaborate scheme (more complex than the scheme used in Hebrew) was invented if Arabic can be written without it.

> So how hard would it be to add plain RTL layout first, and then Arabic
> shaping?

I have no idea. I certainly am not going to work on RTL without bidi. If you want partial bidi support, you might try the m17n version or hebeng.el (which you could hack to support Arabic).

As for Arabic shaping, I think some work was or is planned in the Unicode branch. You may wish to try searching the emacs-devel archives for suitable keywords, I think Kenichi Handa posted some messages there in the past.
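The point about numbers in Hebrew text can be illustrated with a toy sketch. This is not the Unicode Bidirectional Algorithm: it assumes upper-case ASCII letters stand in for RTL letters and that a line holds only such letters, spaces, and ASCII digit runs. The name `to_visual` is made up for the illustration.

```python
import re

def to_visual(logical: str) -> str:
    """Return the left-to-right screen order of an all-RTL line.

    Toy model only: upper-case letters stand for RTL letters, and the
    only left-to-right runs handled are ASCII digit sequences.
    """
    # Plain RTL layout: the first character typed ends up rightmost.
    mirrored = logical[::-1]
    # Numbers must still read left to right, so un-reverse each digit run.
    return re.sub(r"[0-9]+", lambda m: m.group(0)[::-1], mirrored)

print(to_visual("ABC 123 DEF"))  # FED 123 CBA
```

Plain RTL layout on its own (the `mirrored` string) would show the number as 321, which is the case the reply above objects to.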
|
|
|
|
|
Re: Re: RTL support
On Tue, 22 Nov 2005, Gregg Reynolds wrote:
> The point being simply that RTL layout makes for a perfectly usable
> editor with or without bidi support.

In your opinion.

> -gregg

--behdad
http://behdad.org/

"Commandment Three says Do Not Kill, Amendment Two says Blood Will Spill"
-- Dan Bern, "New American Language"
|
|
Re: RTL support
Eli Zaretskii wrote:
>>Date: Tue, 22 Nov 2005 11:05:43 -0600
>>From: Gregg Reynolds <gar@...>
>>
...
> in the Unicode Bidirectional Algorithm makes me wonder how come such
> an elaborate scheme (more complex than the scheme used in Hebrew) was
> invented if Arabic can be written without it.

1. It was legacy, so Unicode had to support it. Then they went berserk with it.

2. Whoever made that first fateful design mistake either didn't understand what he was doing, or else designing in the service of the Arabic/Hebrew/etc speaking community was not a priority (making Western software work for those languages cheaply was most likely the motivation, hence the desire to avoid handling LSD-first digits. But that's just my speculation.)

>>So how hard would it be to add plain RTL layout first, and then Arabic
>>shaping?
>
> I have no idea. I certainly am not going to work on RTL without bidi.
> If you want partial bidi support, you might try the m17n version or
> hebeng.el (which you could hack to support Arabic).
>
> As for Arabic shaping, I think some work was or is planned in the
> Unicode branch. You may wish to try searching the emacs-devel
> archives for suitable keywords, I think Kenichi Handa posted some
> messages there in the past.

Thanks, I'll take a look.

-gregg
|
|
|
|
|
Re: Re: RTL support
> Date: Tue, 22 Nov 2005 16:13:58 -0600
> From: Gregg Reynolds <gar@...>
> Cc: emacs-bidi@...
>
> Benjamin Riefenstahl wrote:
> > Hi Eli, Gregg,

Benjamin, I consistently don't see your messages on the list. Do you know why is that?

> The point being simply that RTL layout makes for a perfectly usable
> editor with or without bidi support.

Not for me, nor for most of the users of bidi languages. So you are obviously in minority here.

I responded to your other points elsewhere in this thread.
|
|
Re: Re: RTL support
> Date: Tue, 22 Nov 2005 22:07:53 -0600
> From: Gregg Reynolds <gar@...>
> Cc: emacs-bidi@...
>
> >>1. It was legacy, so Unicode had to support it. Then they went
> >> berserk with it.
> >
> > From my POV, there are very good reasons to consistently encode
> > characters in the order in which they are written. You don't want
> > visual layout for any other operation except display. You might think
> > that display is the most important operation on text, but for large
> > bits of most software it isn't.
>
> Two things. One is, directionality a design choice, not a reflection of
> some kind of objective reality.

That's true, and we decided here a long time ago to store characters in the logical order in Emacs buffers. The reasons were not only that most other software in the world made the same decision (and thus if we want to be able to import text from outside we are better off with logical order), but also which way would make common Emacs operations, like searching, easier.

It is pointless to try to convince us now to change that design decision. Even if you come up with VERY convincing arguments (which you didn't, as everything you wrote was on our table when we discussed this back then), it will be a very hard job to make us revert that decision.

> In other words "reasons to consistently encode characters in the order
> in which they are written" is essentially meaningless.

They are not meaningless, they describe a conscious design decision that was made after much discussion and deliberations. We came to the conclusion that logical-order storage will make the rest of bidi support easier.

> It boils down to an economic argument. For Arabic, we need a) RTL
> layout (a purely graphical matter); and b) shaping. Both of these are
> (relatively) inexpensive to implement. Support for bidi reordering is a
> nice enhancement, but it's a) expensive; and b) unnecessary unless you
> write in two or more languages in the same doc.

This is only true if we accept your assumption that text should be stored within Emacs in visual order. And we already rejected that design. So for us, bidi reordering during display is a must.

> Ask yourself a simple question. Software like Emacs has been around for
> what, 30 years? It gained support for e.g. Japanese, Korean, etc. years
> ago. But the 1 billion + people in the world who need RTL support are
> still waiting. Why is that?

Because precious few out of those 1 billion were able or wishing to help us integrate bidi reordering into the Emacs display engine.

> The bidi algorithm is complex and generally yucky.

Nevertheless, I think I succeeded to conquer it for Emacs. Unfortunately, I ran out of free time soon after that, so I need help in getting this to a working, reliable support. Then this support could be extended by others to make Emacs a bidi editor.

> Thought experiment: imagine a world in which nobody would implement
> English language software unless it had bidi support.

Such arguments are fruitless now, when the design decisions were made long ago.
|
|
Re: Re: RTL support
Eli Zaretskii wrote:
>>Date: Tue, 22 Nov 2005 16:13:58 -0600
>>From: Gregg Reynolds <gar@...>
>>Cc: emacs-bidi@...
>>
>>Benjamin Riefenstahl wrote:
>>
>>>Hi Eli, Gregg,
>
> Benjamin, I consistently don't see your messages on the list. Do you
> know why is that?
>
>>The point being simply that RTL layout makes for a perfectly usable
>>editor with or without bidi support.
>
> Not for me, nor for most of the users of bidi languages. So you are
> obviously in minority here.

Well, I for one am very sorry that a simple question had prompted such a strange series of responses.

I have no idea why you bring up majority/minority. I had no idea that you were competent to speak on behalf of "most users of bidi languages", especially since there is no such thing as a "bidi language".

I really just wanted to know something about RTL support in Emacs. Obviously this is not the right place.

Please don't respond. This thread has gone far enough.
|
|
Re: RTL support
>>>>> "Eli" == Eli Zaretskii <eliz@...> writes:
Eli> That's true, and we decided here long time ago to store
Eli> characters in the logical order in Emacs buffers. The reasons
Eli> were not only that most other software in the world made the
Eli> same decision (and thus if we want to be able to import text
Eli> from outside we are better off with logical order), but also
Eli> which way would make common Emacs operations, like searching,
Eli> easier.

Right. Although this is now a long time ago, I recall I gave some Lisp implementation a try in which a visual method was used. Even Ehud's version (which was by far the best around) did not do line breaking very well. I tried to set up my own and did not succeed in anything really usable, so I am not sure whether it is that easy, at least not at the Lisp level. Maybe if it is done correctly at the C level it is different.
|
|
Re: Re: RTL support
> Date: Tue, 22 Nov 2005 22:48:02 -0600
> From: Gregg Reynolds <gar@...>
> CC: b.riefenstahl@..., emacs-bidi@...
>
> Well, I for one am very sorry that a simple question had prompted such a
> strange series of responses.

What is strange about it? That we all disagreed with you?

> I have no idea why you bring up majority/minority. I had no idea
> that you were competent to speak on behalf of "most users of bidi
> languages"

I wasn't speaking on behalf of anyone. I merely told you what we decided in this forum when we discussed the design of Emacs support for languages that need bidirectional editing. Here, ``we'' means ``all those who were interested enough in Emacs support for bidi to participate, and who knew enough about Emacs internals to contribute to that discussion''.

So obviously the views and opinions expressed here are only in the context of Emacs design, they do not pretend to be broader than that, even if the specific wording might indicate otherwise. This is, after all, "Emacs bidi" mailing list, no more, no less.

> I really just wanted to know something about RTL support in Emacs.
> Obviously this is not the right place.

This _is_ the right place. I tried to answer your questions, sorry if I failed.
|
|
Re: RTL support
At 07:26 05/11/23, Gregg Reynolds wrote:
>1. It was legacy, so Unicode had to support it. Then they went berserk with it.
>2. Whoever made that first fateful design mistake either didn't understand what he was doing, or else designing in the service of the Arabic/Hebrew/etc speaking community was not a priority (making Western software work for those languages cheaply was most likely the motivation, hence the desire to avoid handling LSD-first digits. But that's just my speculation.)

Well, Unicode is of course about encoding all scripts of the world, whatever the direction. It seems extremely obvious that in that context, you'd try to come up with, or adopt, a solution that didn't only allow each script to work on its own, but also different scripts together. The final algorithm is probably more complex than it really needed to be, but that's similar for most standards. Calling it 'berserk' doesn't help in my view.

Regarding LSD (least significant digit) first, that's of course the crucial point. If you say that making Western software work for RTL languages cheaply was the motivation for the bidi algorithm, and for making RTL languages inherently bidi, then you seem to say that implementing LSD first is even more difficult/expensive than implementing bidi. I'd probably have to agree with that: While the technical details of a single LSD-first number are much easier, making sure that everybody in the world always knows which numbers are MSD-first and which numbers are LSD-first would be a very expensive nightmare. Messing up things like 123 and 321 can easily get expensive. Having text, rather than numbers, run the wrong way at times, doesn't look better, but is much better re. error detection.

Regards, Martin.
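The 123 versus 321 concern can be made concrete with a small sketch. The two function names are hypothetical and only illustrate the two conventions for a stored digit string; nothing here comes from Emacs or from the Unicode standard.

```python
def value_msd_first(digits: str) -> int:
    """Read a digit string whose first character is the most significant digit."""
    return int(digits)

def value_lsd_first(digits: str) -> int:
    """Read a digit string whose first character is the least significant digit."""
    return int(digits[::-1])

stored = "123"
print(value_msd_first(stored))  # 123
print(value_lsd_first(stored))  # 321
```

The same three stored characters give two different numbers, and nothing in the string itself says which convention was meant. Reversed words, by contrast, are usually easy for a reader to spot.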
|
|
Re: RTL support
Martin Duerst wrote:
> At 07:26 05/11/23, Gregg Reynolds wrote:
>
>>1. It was legacy, so Unicode had to support it. Then they went
> berserk with it.
>>2. Whoever made that first fateful design mistake either didn't
> understand what he was doing, or else designing in the service of the
> Arabic/Hebrew/etc speaking community was not a priority (making Western
> software work for those languages cheaply was most likely the
> motivation, hence the desire to avoid handling LSD-first digits. But
> that's just my speculation.)
>
> Well, Unicode is of course about encoding all scripts of the
> world, whatever the direction. It seems extremely obvious that
> in that context, you'd try to come up with, or adopt, a solution
> that didn't only allow each script to work on its own, but
> also different scripts together. The final algorithm is
> probably more complex than it really needed to be, but that's
> similar for most standards. Calling it 'berserk' doesn't help
> in my view.
>
> Regarding LSD (least significant digit) first, that's of course
> the crucial point. If you say that making Western software
> work for RTL languages cheaply was the motivation for the
> bidi algorithm, and for making RTL languages inherently bidi,

No, I was speculating that that might have had something to do with modeling RTL digit strings as MSD-first. Without that, you have problems with math routines. If we were starting from scratch today that might not be a big problem, but in the 50s and 60s processor time was hugely expensive, and most (business) computing was bean-counting. There were probably good economic reasons at the time in favor of the MSD-first design. But that's idle speculation.

> then you seem to say that implementing LSD first is even more
> difficult/expensive than implementing bidi. I'd probably have

Not at all; only with respect to functions etc. that interpret digit strings as numbers.

> to agree with that: While the technical details of a single
> LSD-first number are much easier, making sure that everybody
> in the world always knows which numbers are MSD-first and
> which numbers are LSD-first would be a very expensive nightmare.
> Messing up things like 123 and 321 can easily get expensive.
> Having text, rather than numbers, run the wrong way at times,
> doesn't look better, but is much better re. error detection.

Similar arguments were made on the Unicode list not too long ago. Let's please not open up that debate here ;), but for what it's worth I never understood what the worry is. Personally I don't see any possibility of confusion, but others clearly do.

-gregg
|
|
Re: RTL support
> Date: Thu, 24 Nov 2005 12:37:42 +0900
> From: Martin Duerst <duerst@...>
> Cc: emacs-bidi@...
>
> Regarding LSD (least significant digit) first, that's of course
> the crucial point. If you say that making Western software
> work for RTL languages cheaply was the motivation for the
> bidi algorithm, and for making RTL languages inherently bidi,
> then you seem to say that implementing LSD first is even more
> difficult/expensive than implementing bidi.

Unless I'm missing something, these issues have nothing to do with what we were discussing. We weren't discussing how to encode bidi text in a file or in general; we were discussing how to hold it within Emacs buffers and strings. The latter is an internal Emacs matter that shouldn't bother users at all.

The only valid arguments for how to store RTL text within Emacs buffers and strings are those which compare the difficulty of adding bidi support to relevant Emacs features. That is, one must speak about Emacs design and structure, not about anything else.

When we discussed this in the past, the conclusion was that storing RTL text in the visual order will require bidi-related changes in many places in Emacs, both in many primitive operations and in application C and Lisp code. By contrast, logical-order storage required changes in a small number of well-isolated parts of low-level code, mainly in display code and in some of the primitives that translate screen to buffer position and back.
|
|
Re: RTL support
Hi Gregg,
I am not an Emacs developer, and I don't plan to work on this issue right now. I also don't believe that you have brought up new arguments to change the decisions about how this is to be done in Emacs in the future. I do think that an occasional check of these ideas is a good thing, though.

So this exchange is mostly about: "Would I personally find useful software that worked along the lines that you suggest."

Gregg Reynolds writes:
> Two things. One is, directionality a design choice, not a
> reflection of some kind of objective reality.

The fact that Arabic and other scripts are written and read from right to left is a design feature of the script that we can't just ignore when we implement it in computers, we have to deal with it at some level. The question is at *which* level.

> There is no necessary relationship between the IO model implemented
> by an application and the corresponding textual representation,

Exactly. Which is why Unicode put the complicated parts into the IO model (for human IO) with BIDI reordering, while any software module that doesn't have human IO can completely ignore the issue. The same goes for most software that directly implements human IO but uses pre-fabricated building blocks for it (using e.g. GTK or Qt).

If OTOH you use visual ordering in the encoding you make life easier for a few primitive versions of the IO and complicated for all the rest of the software. Not to mention that it makes it even more complicated for more advanced - read: user-friendly - versions of IO.

There is a third possibility in our case, using visual order within Emacs and only storing the text in logical order. That is possible in a simple text editor (and I am sure there are some of those around). But Emacs does a lot more, of course. Every module in Emacs that needs to look at the logical order would have to make the reordering anyway. And as Emacs is about text processing that would probably be a lot of modules.

That's the choice.
I personally prefer the first way of doing it.

> More important is that RTL has no necessary relationship to mixed
> content or bidi reordering. If you only ever write documents in
> Arabic (Hebrew, Persian, Pashto, whatever) then why do you need
> bidi?

A large part (maybe still a majority) of the people that write Arabic and Hebrew on computers write in more than just one language. This is even if you discount numbers and trademarks.

> To be clear: monolingual Arabic text is not mixed content, whether
> it contains digit strings or not. So why should an Arabic user pay
> the Unicode tax of bidi support?

A large part of the user base right now does need mixed content. So you would get the tax of supporting several versions of software, the software for people that don't need mixed content and another version for people that do. Even if the first version on its own might be cheaper, on the whole this will get more costly. Not to mention that it would end up in a system where the "natives" get the "stupid" mono-lingual software and the "experts" and the westerners can afford the "intelligent" software for the mixed content.

benny
|
|
Re: RTL support
Benjamin Riefenstahl wrote:
> Hi Gregg,
>

Hi Benny,

> I am not an Emacs developer, and I don't plan to work on this issue
> right now. I also don't believe that you have brought up new
> arguments to change the decisions about how this is to be done in Emacs
> in the future. I do think that an occasional check of these ideas is
> a good thing, though.

Fair enough. (However, my original post wasn't intended to be argumentative, but to ask about some specific design options. So to be clear, I don't mean to advocate any particular option at this point, since I don't know enough about Emacs internals. The general observation that graphical layout, text reordering (via bidi or any other algo), and shaping are mutually orthogonal applies generally to any notion of text processing.)

> So this exchange is mostly about: "Would I personally find useful
> software that worked along the lines that you suggest."

Yep; this is my itch. I happen to think scratching it would benefit many others, but of course that is (informed) speculation.

> Gregg Reynolds writes:
>
>>Two things. One is, directionality a design choice, not a
>>reflection of some kind of objective reality.
>
> The fact that Arabic and other scripts are written and read from right
> to left is a design feature of the script that we can't just ignore
> when we implement it in computers, we have to deal with it at some
> level. The question is at *which* level.

Sorry, I wasn't clear. I mean that modeling a single script/language as mono- or bi-directional is a design choice, not a statement of a Law of Nature. This might be better expressed by saying the choice of number polarity - MSD or LSD first in strings - is a design choice. One could model English text with LSD-first digit strings if one wanted. This means, among other things, that "RTL" does not imply "bidi", any more than "LTR" does. RTL/LTR refers solely to graphical syntax, not to an encoding model.

>>There is no necessary relationship between the IO model implemented
>>by an application and the corresponding textual representation,
>
> Exactly. Which is why Unicode put the complicated parts into the IO
> model (for human IO) with BIDI reordering, while any software module
> that doesn't have human IO can completely ignore the issue. The same

I don't understand what you say here. Unicode as I understand it doesn't have anything at all to say about IO; it just defines character semantics and syntax (accent after base char, etc.)

Note that there are no complicated parts for monodirectional text. It's the bidi requirement itself that creates the complication.

Another clarification: I'm not arguing against bidi support where it is truly needed, namely in mixed language texts. Nor am I arguing that Emacs should not have bidi support - it should, obviously.

I guess the point is that we can get there in stages. First you implement RTL layout, then shaping, then bidi. That way we have *usable* software without having to wait for bidi support, and eventually we do have full bidi support. Vim provides the model: you can switch on/off RTL layout and Arabic shaping independently; hopefully someday somebody will add bidi support too. But in the meantime it is very useful for working with Arabic text. I'd simply like for Emacs to be as useful, since I'm firmly in the Emacs camp when it comes to editors.

> goes for most software that directly implements human IO but uses
> pre-fabricated building blocks for it (using e.g. GTK or Qt).
>
> If OTOH you use visual ordering in the encoding you make life easier
> for a few primitive versions of the IO and complicated for all the
> rest of the software. Not to mention that it makes it even more
> complicated for more advanced - read: user-friendly - versions of IO.

I don't see how. Can you provide an example of how this would make things more complicated? I mean other than with math routines. That I admit is the big problem.
There are ways around it, but that's for another thread.

> There is a third possibility in our case, using visual order within
> Emacs and only storing the text in logical order. That is possible in
> a simple text editor (and I am sure there are some of those around).
> But Emacs does a lot more, of course. Every module in Emacs that
> needs to look at the logical order would have to make the reordering
> anyway. And as Emacs is about text processing that would probably be
> a lot of modules.

I don't see why. Example?

> That's the choice. I personally prefer the first way of doing it.

I think we're talking about two separate things. In my opinion, the internal encoding used by Emacs is irrelevant, so long as I know what it is. I just want RTL layout and Arabic shaping, both of which simply operate on a string of chars/glyphs. Actually, even when Emacs has full bidi support, I would still want a "transparent" mode that will provide a graphical representation of the true (physical) ordering of the text.

The question of how best to represent text internally is an interesting one, but I haven't given it much thought. I do think Emacs did the right thing by *not* adopting Unicode as its internal representation.

>>More important is that RTL has no necessary relationship to mixed
>>content or bidi reordering. If you only ever write documents in
>>Arabic (Hebrew, Persian, Pashto, whatever) then why do you need
>>bidi?
>
> A large part (maybe still a majority) of the people that write Arabic
> and Hebrew on computers write in more than just one language. This is
> even if you discount numbers and trademarks.

Yes, I've heard this claimed many times, but I've never seen any evidence to back it up. My personal experience is that it is simply not true.
In the Arab world, at least, *most* people do *not* operate in multiple languages (just like in the US), and from what I've personally seen they get along fine using Arabic only on a computer, just as most Americans get along fine using English only. Even scholarly articles written in English about Arabic generally use transliteration. Things are no different in the Arab world. When newspapers need to write "CNN" or "FBI", they transliterate it. The need for full mixed directional support is quite specialized, probably everywhere in the world.

Add to that the fact that multilanguage computing w/out bidi support is quite feasible. I do it all the time using Vim and even Emacs.

>>To be clear: monolingual Arabic text is not mixed content, whether
>>it contains digit strings or not. So why should an Arabic user pay
>>the Unicode tax of bidi support?
>
> A large part of the user base right now does need mixed content. So

That may be true for the *current* emacs user base. Then again, Emacs has no RTL user base, since Emacs doesn't support RTL. Whether or not the potential RTL user base truly needs multilanguage (mixed directionality) support is a matter of speculation. But we *know* that they need RTL layout and shaping, and we also know that RTL layout and shaping is sufficient to make software useful.

Besides, to me the user base is everybody in the world. Whoever wants to use it, should be able to use it. Lack of bidi support need not prevent the software from being useful for people who don't need bidi support.

> you would get the tax of supporting several versions of software, the
> software for people that don't need mixed content and another version
> for people that do. Even if the first version on its own might be
> cheaper, on the whole this will get more costly.
> Not to mention that
> it would end up in a system where the "natives" get the "stupid"
> mono-lingual software and the "experts" and the westerners can afford
> the "intelligent" software for the mixed content.

I guess I wasn't clear - see my note above. As you note, it wouldn't make much sense to support two RTL versions of a piece of software, one with and one without bidi support. But there would be no reason to do so; RTL w/out bidi would just be a stage on the way to full bidi implementation.

It's interesting that you perceive the "intelligent" software as the stuff with bidi support. In my experience it is just the opposite: editors with Unicode bidi support are really stupid, from the end user point of view. They are often almost impossible to use, thanks to bizarro cursor behaviour and the directional ambiguity Unicode explicitly assigns to characters like punctuation, parens, etc. I find Vim much simpler and more user-friendly.

Somewhere in the GCC list archives there's a note from RMS in response to an issue involving support for some obscure feature of the ISO C++ standard (if I recall correctly), in which he says it all in a very few words, something along the lines of "Standards are recommendations; we should design to meet the needs of our community; if the Standard helps with that, then we support it, but if not we shouldn't hesitate to ignore it and do what is best for the community." Software support for Unicode RTL scripts provides a classic example of getting things backwards - designing to satisfy the standard instead of community needs.

In summary, there's more than one way to skin a cat, as the (American) saying goes. Emacs (and other software) can be quite useful to RTL users without bidi support. It's better to have bidi support, naturally, but the cost of bidi implementation need not stand in the way of providing useful stuff, and providing useful stuff by supporting non-bidi RTL and shaping need not inhibit implementation of bidi support.

thanks,

gregg
|
|
Re: RTL support
> Date: Fri, 25 Nov 2005 11:24:19 -0600
> From: Gregg Reynolds <gar@...>
> CC: emacs-bidi@..., Eli Zaretskii <eliz@...>
>
> I guess the point is that we can get there in stages. First you
> implement RTL layout, then shaping, then bidi. That way we have
> *usable* software without having to wait for bidi support, and
> eventually we do have full bidi support.

From what I understand, users of RTL languages want almost full bidi support, they won't settle for RTL display alone.

> Vim provides the model

I never saw anyone use Vim for Hebrew text---that I can tell you. Hebrew text without reordering is going backwards to the age of typewriters when secretaries needed to learn to type numbers and other LTR text backwards.

> But in the meantime it is very useful for working with Arabic text.

Again, users of Arabic scripts I talked to think otherwise: they want bidi support, not just RTL display.

> > There is a third possibility in our case, using visual order within
> > Emacs and only storing the text in logical order. That is possible in
> > a simple text editor (and I am sure there are some of those around).
> > But Emacs does a lot more, of course. Every module in Emacs that
> > needs to look at the logical order would have to make the reordering
> > anyway. And as Emacs is about text processing that would probably be
>
> I don't see why. Example?

The simplest example would be incremental search. Suppose you type "C-s ABCD", where upper-case letters denote RTL characters (e.g., Arabic letters). If text is stored in the buffer in visual order, Emacs will have to reorder ABCD into DCBA before passing it to the text-search primitive. Likewise, any other Emacs function that receives input from the user or from external applications will need to do the reordering from logical to visual order. In other words, many places in Emacs will need to be changed to handle visual-to-logical reordering. This would make the job of adding RTL support to Emacs unbearably hard.

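The incremental-search example can be sketched in a few lines of Python. This is only an illustration under strong assumptions: upper-case letters stand for RTL characters as in the message, the buffer holds a single all-RTL line, and "visual order" is modelled as a plain reversal. The helper `to_visual_storage` is hypothetical and is not an Emacs function.

```python
def to_visual_storage(logical: str) -> str:
    """Toy model: an all-RTL line kept in visual order is simply reversed."""
    return logical[::-1]

logical_buffer = "WXYZ ABCD EFGH"                   # upper case stands for RTL letters
visual_buffer = to_visual_storage(logical_buffer)   # "HGFE DCBA ZYXW"

query = "ABCD"  # keystrokes arrive in logical (typing) order

print(logical_buffer.find(query))        # 5
print(visual_buffer.find(query))         # -1
print(visual_buffer.find(query[::-1]))   # 5
```

With logical-order storage the typed query matches as is. With visual-order storage every caller that takes text from the user has to reorder it first, which is the maintenance burden the message describes.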
In addition, while logical-to-visual reordering is a well-defined operation, whereby for every logical-order string there's one and only one visual-order string, the reverse is not true: for some visual-order strings one can find more than one logical-order string that, when reordered according to the Unicode Bidirectional Algorithm, will all give the original visual-order string.

> I just want RTL layout and Arabic shaping, both of which simply
> operate on a string of chars/glyphs.

But the Emacs developers want what the users of RTL languages want, and that is bidi support, not just RTL display. So any work done for RTL support must be consistent with, and a part of, the full bidi support, otherwise we, the Emacs developers, will object to including it.

> The question of how best to represent text internally is an interesting
> one, but I haven't given it much thought.

We did give it much thought, and the conclusion was to use the logical order. That's why we must have bidi reordering in the display code.

> > A large part (maybe still a majority) of the people that write Arabic
> > and Hebrew on computers write in more than just one language. This is
> > even if you discount numbers and trademarks.
>
> Yes, I've heard this claimed many times, but I've never seen any
> evidence to back it up. My personal experience is that it is simply not
> true. In the Arab world, at least, *most* people do *not* operate in
> multiple languages (just like in the US), and from what I've personally
> seen they get along fine using Arabic only on a computer, just as most
> Americans get along fine using English only.

As I and others pointed out here, even Arabic-only text needs bidi reordering because of digits and other weak and neutral characters. The only way to avoid this reordering is to store characters within Emacs buffers and strings in visual order, which we decided not to do, for reasons explained above.
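Both points above (the reverse mapping is not unique, and digits need reordering even in single-language RTL text) can be illustrated with a small Python sketch. This is a deliberately simplified model, not the Unicode Bidirectional Algorithm: upper-case letters stand in for RTL letters, lower-case letters and digits are treated alike as LTR, spaces simply take the paragraph level, and the paragraph direction is taken from the first letter. The names `base_level` and `reorder` are hypothetical.

```python
def base_level(text: str) -> int:
    """Paragraph level from the first letter: 1 (RTL) if upper-case, else 0."""
    for ch in text:
        if ch.isalpha():
            return 1 if ch.isupper() else 0
    return 0


def reorder(logical: str) -> str:
    """Toy logical-to-visual reordering (simplified assumption, see lead-in)."""
    base = base_level(logical)
    levels = []
    for ch in logical:
        if ch.isupper():
            levels.append(1)                      # RTL letter
        elif ch.isalnum():
            levels.append(0 if base == 0 else 2)  # LTR letter or digit
        else:
            levels.append(base)                   # space: paragraph level
    chars = list(logical)
    # From the highest level down to 1, reverse each maximal run of
    # characters whose level is at least the current level.
    for lvl in range(max(levels, default=0), 0, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= lvl:
                j = i
                while j < len(chars) and levels[j] >= lvl:
                    j += 1
                chars[i:j] = reversed(chars[i:j])
                i = j
            else:
                i += 1
    return "".join(chars)


# Two different logical strings produce the same visual string, so the
# visual string alone does not say which one was typed.
print(reorder("AB cd"), "|", reorder("cd AB"))

# RTL-only text with a number: mirroring everything also reverses the
# digits, while reordering keeps the number readable left to right.
print("ABC 123"[::-1], "|", reorder("ABC 123"))
```

In this toy model the forward direction is a plain function of the logical string, while going back from "cd BA" requires guessing between at least two candidates.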
> In summary, there's more than one way to skin a cat, as the (American)
> saying goes. Emacs (and other software) can be quite useful to RTL
> users without bidi support. It's better to have bidi support,
> naturally, but the cost of bidi implementation need not stand in the way
> of providing useful stuff, and providing useful stuff by supporting
> non-bidi RTL and shaping need not inhibit implementation of bidi support.

This is Free Software developed by volunteers. And all the volunteers we have that know about and use RTL languages unanimously decided they wanted RTL support with bidi reordering. It is okay for you to disagree, but the only method to steer Emacs development your way would be to submit code changes to add simple RTL display without bidi reordering to Emacs. If you write such code, and it is clean and doesn't get in the way of the future bidi reordering support, I promise you I will review the code and recommend it for inclusion. But as long as you leave this job to us, we will do what we think is right, and that is RTL with bidi reordering support.
|
|
Re: Re: RTL support
On Fri, 2005-11-25 at 11:24 -0600, Gregg Reynolds wrote:
> > A large part (maybe still a majority) of the people that write Arabic
> > and Hebrew on computers write in more than just one language. This is
> > even if you discount numbers and trademarks.
>
> Yes, I've heard this claimed many times, but I've never seen any
> evidence to back it up. My personal experience is that it is simply not
> true. In the Arab world, at least, *most* people do *not* operate in
> multiple languages (just like in the US), and from what I've personally
> seen they get along fine using Arabic only on a computer, just as most
> Americans get along fine using English only. Even scholarly articles
> written in English about Arabic generally use transliteration. Things
> are no different in the Arab world. When newspapers need to write "CNN"
> or "FBI", they transliterate it. The need for full mixed directional
> support is quite specialized, probably everywhere in the world.
>
> Add to that the fact that multilanguage computing w/out bidi support is
> quite feasible. I do it all the time using Vim and even Emacs.

Hello Gregg,
With all respect, your claims are so wrong I do not know where to start.

I live in Israel and I use computers for Hebrew word processing, in addition to other purposes and other kinds of text processing. I do not know how it is in Arabic-speaking countries, but in Israel, full BiDi is mandatory in text editors and word processors that aim to handle Hebrew.

Even in Hebrew monolingual text, numbers are written in LTR order. So you already need a BiDi algorithm. In addition to this, Latin letters are frequently used in Hebrew text, especially in articles about technical topics.

I consider any word processor or editor without full BiDi support to be broken and useless for editing Hebrew texts.

I use Emacs to edit Web pages for my Web site (http://www.zak.co.il/), but when a Web page also has Hebrew text, I use gedit instead, because it has full BiDi support.

--- Omer
--
Delay is the deadliest form of denial. C. Northcote Parkinson
My own blog is at http://www.livejournal.com/users/tddpirate/

My opinions, as expressed in this E-mail message, are mine alone. They do not represent the official policy of any organization with which I may be affiliated in any way.
WARNING TO SPAMMERS: at http://www.zak.co.il/spamwarning.html
|
|
Re: Re: RTL support
Omer Zak wrote:
>
> Hello Gregg,
> With all respect, your claims are so wrong I do not know where to start.

I surrender. How can one possibly argue against an argument as clever as "you are wrong"? Good work, Omer.
|
|
Re: RTL support
Hi Gregg,

Sorry, this is much too long again :-(

Gregg Reynolds writes:
> This might be better expressed by saying the choice of number
> polarity - MSD or LSD first in strings - is a design choice. One
> could model English text with LSD-first digit strings if one wanted.

At the encoding level you could. At the input level you have to emulate how the users would do it on paper. If you don't, users will find that awkward and look for something else. We don't design scripts, we just design how they are implemented in computers.

> RTL/LTR refers solely to graphical syntax, not to an encoding model.

RTL refers to how I read, write and line-break a written phrase as a human being. Computers have to map this to their notions of graphics and coordinate systems. And then they have to store it in a convenient form.

> I guess the point is that we can get there in stages. First you
> implement RTL layout, then shaping, then bidi.

It *is* done in stages. We have had Hebrew modules in Elisp for ages. We have emacs-bidi now. The only thing is that this is not in the stock Emacs. The same as support for FE scripts was not in stock Emacs until the developers thought it was mature enough and somebody had (or made) the necessary time to integrate it.

This development process takes time. The developers do it in their free time, after all, and the issues are complicated. Doing it in the stages that you indicated would probably take more time (and require more overhead in discussing it :-() than just doing it well enough in fewer steps.

>> A large part (maybe still a majority) of the people that write
>> Arabic and Hebrew on computers write in more than just one
>> language. This is even if you discount numbers and trademarks.
>
> Yes, I've heard this claimed many times, but I've never seen any
> evidence to back it up. My personal experience is that it is simply
> not true. In the Arab world, at least, *most* people do *not*
> operate in multiple languages (just like in the US),

Hm.
In your opinion, native Arab speakers who only work in their own language are how much of the whole user base for Arabic? 50%? 70%? 90%? There are westerners studying Arabic, business types (native Arabs and westerners) doing multi-language word-processing (think of any larger company), there are programmers, scripters, folks building HTML pages, even graphical designers that need to incorporate correct Western elements into their designs, ... I've probably forgotten some important groups. All of these work with mixed content. I'd say these are probably much more than 30% of the computer users, and even just 10% would IMO be more than enough of a market to justify implementing mixed content and not starting a separate code branch for RTL only.

> Even scholarly articles written in English about Arabic generally
> use transliteration.

There were times when they didn't. Now authors regularly apologize to their readers that they can't do it. So there is demand, but publishers currently think it's too expensive.

> Things are no different in the Arab world. When newspapers need to
> write "CNN" or "FBI", they transliterate it.

As far as I have seen (and you have probably more experience), newspapers and magazines sometimes do and sometimes don't. But advertisements seem to be more likely to use at least their brand names in Latin script. And those advertisements targeting the wealthy probably use English marketing phrases, as they do over here in Germany, right?

>> A large part of the user base right now does need mixed content. So
>
> That may be true for the *current* emacs user base.

The current base and the base of the *very* near future is what Emacs is written for. Like most software Emacs is not written for an ideal world. That would waste too many resources. Emacs is actually an exception in that it is so easily adapted to users' needs in most respects, that it has a much better flexibility than a lot of other software.
This is more difficult with some things than with others, as this example shows. And even here we actually have emacs-bidi now.

> Besides, to me the user base is everybody in the world. Whoever
> wants to use it, should be able to use it.

In the way that you are framing it, I think that is unrealistic.

benny

PS: Clarifications of my previous post:

>> [...] Unicode put the complicated parts into the IO model (for
>> human IO) with BIDI reordering, while any software module that
>> doesn't have human IO can completely ignore the issue. The same
>
> I don't understand what you say here. Unicode as I understand it
> doesn't have anything at all to say about IO; it just defines
> character semantics and syntax (accent after base char, etc.)

Yes, Unicode specifies the encoding level. But by exclusion those things that are required by the task at hand, but are not specified in Unicode itself, have to be done in other levels. In this way Unicode works on the basis of an implied architecture.

>>Not to mention that it makes it even more complicated for more
>>advanced - read: user-friendly - versions of IO.
>
> I don't see how. Can you provide an example of how this would make
> things more complicated?

I spend too much time on these posts already to think of exciting examples, too. Sorry ;-) What I mean is that with a given visual encoding, if your IO model is not the exact same as the encoding model for whatever reason, then the encoding gets in your way in a big way and makes for even more complicated code (== more bugs, fewer features). So mandating a visual (or not-quite-logical) encoding was no realistic choice for Unicode and is also not realistic for a text-processing platform such as Emacs.

>> Every module in Emacs that needs to look at the logical order would
>> have to make the reordering anyway. And as Emacs is about text
>> processing that would probably be a lot of modules.
>
> I don't see why. Example?

E.g. search and replace.
I have a number of functions that do that to fix up text automatically. This works on Emacs' internal text model. Sometimes I need to find stuff in a certain order, which is of course the logical order. If I were processing bidi text (say an XML file containing Syriac content, that's something that I actually have) and the text was not in logical order, I'd have to think about it. Every time I code a search I'd have to see if the logical/visual dichotomy has an impact in the particular case.

Let's say only half of the approximately thousand Elisp modules in stock Emacs have a need for a similar review at a superficial level, to see if they are compatible with the chosen visual or not-quite-logical ordering. That is a lot of work. With a plain logical ordering, such a review is probably still needed in some places, but changes should be necessary much more rarely.