Bidirectional editing in Emacs -- main design decisions

As some of you know, I'm slowly working on adding support for
bidirectional editing in Emacs. (Before you ask: the code is not publicly available yet, and won't be until Emacs switches to bzr as its main VCS.)

While there's a lot of turf to be covered yet, I thought I'd publish the main design decisions up to this point. Many of these decisions were discussed at length years ago on the emacs-bidi mailing list, and since then I also talked them over in private email with a few people. Other decisions were made recently, as I went about changing the display engine.

My goal, and the main drive behind these design decisions, was to preserve as much as possible the basic assumptions and design principles of the current Emacs display engine. This is not just opportunism; I firmly believe that any other way would mean a total redesign and rewrite of the display engine, which is something we want to avoid. Personally, if such a redesign were necessary, I couldn't participate in that endeavor, except as an advisor.

With that preamble out of my way, here's what I can tell about the subject at this point:

1. Text storage

Bidirectional text in Emacs buffers and strings is stored in strict logical order (a.k.a. "reading order"). This is how most (if not all) other implementations handle bidirectional text. The advantage of this is that file and process I/O are trivial, as is text search. The disadvantage is that text needs to be reordered for display (see below) and also for sending to any other visual-order stream, such as a printer or a file in visual-order encoding.

2. Support for Unicode Bidirectional Algorithm

The Unicode Bidirectional Algorithm, described in Annex 9 of the Unicode Standard (a.k.a. UAX#9, see http://www.unicode.org/reports/tr9/), specifies how to reorder bidirectional text from logical to visual order.
Emacs will belong to the so-called "Full Bidirectionality" class of applications, which includes support for both implicit bidirectional reordering and explicit directional embedding codes that allow overriding the implicit reordering. This means that Emacs supports the entire spectrum of Unicode character properties and special codes relevant to bidirectional text.

3. Bidi formatting codes are retained

At some point in the reordering described by UAX#9, the various formatting codes are to be removed from the text, once they've performed their role of forcing the order of characters for display, because they are not supposed to be visible on display. Contrary to this, Emacs does not remove these formatting codes; it just behaves as if they are not there. (This behavior is acknowledged by UAX#9 under the "Retaining Format Codes" clause, so Emacs does not break conformance here.)

This is primarily because Emacs must preserve the text that was not edited; in particular, visiting a file and then saving it to a different file without changing anything must produce the same byte stream as the original file, even if the formatting codes were part of the original file. In addition, being able to show these formatting codes to the user is a valuable feature, because the way reordered text looks might not be otherwise understood or changed easily.

4. Reordering of text for display

Reordering for display happens as part of Emacs redisplay. In a nutshell, the current unidirectional redisplay code walks through buffer text and considers each character in turn. After each character is processed and translated into a `struct glyph', which includes all the information needed for displaying that character, the iterator's position is incremented to the next character. In the bidi Emacs, this _linear_ iteration through the buffer is replaced with a _non-linear_ one, whereby instead of incrementing buffer position, a function is called to return the next position in the visual order.
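As a rough illustration of the non-linear iteration described in section 4, here is a minimal Python sketch. It is not the Emacs display code: it assumes the resolved embedding levels are already available and applies only the run-reversal step of UAX#9 (rule L2). The names `visual_order` and `iter_visual` are hypothetical, and uppercase Latin letters stand in for right-to-left characters.

```python
def visual_order(levels):
    """Return logical indices in left-to-right visual order (UAX#9 rule L2).

    `levels` holds one resolved embedding level per character; computing
    those levels is assumed to have been done elsewhere.
    """
    order = list(range(len(levels)))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return order  # purely left-to-right: visual order equals logical order
    highest = max(levels)
    lowest_odd = min(odd_levels)
    # From the highest level down to the lowest odd level, reverse every
    # maximal run of characters whose level is at least that level.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


def iter_visual(text, levels):
    """Yield (logical_position, character) pairs in visual order."""
    for pos in visual_order(levels):
        yield pos, text[pos]


# "XYZ" stands for three right-to-left letters at embedding level 1.
text = "abc XYZ"
levels = [0, 0, 0, 0, 1, 1, 1]
print("".join(ch for _, ch in iter_visual(text, levels)))  # prints "abc ZYX"
```

The consumer of `iter_visual` never increments a position itself; it only asks for the next one, which is the property the design relies on.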
Whatever position it returns is processed next into a `struct glyph'. The rest of the code that produces "glyph matrices" (data structures used to decide which parts of the screen need to be redrawn) is largely ignorant of the bidirectionality of the text.

Of course, parts of the display engine that manipulate the glyph matrices directly and assume that buffer positions increase monotonically with glyph positions need to be fixed or rewritten. But these parts of the display are relatively few and localized. Also, some redisplay optimizations need to be disabled when bidirectional text is rendered for display.

5. Visual-order information is volatile

There were lots of discussions several years ago about whether Emacs should record in some way the information needed to reorder text into visual order of the characters, to reuse it later. In UAX#9 terminology, this information is the "resolved level" of each character. Various features were suggested as a vehicle for this, for example, some special text properties (except that text properties, unlike resolved levels, cannot overlap). Lots of energy went into discussing how this information would be recorded and how it would be reused, e.g. if a portion of the text was copy-pasted into a different buffer or string. The complications, it turns out, abound.

The current design doesn't record this information at all. It simply recomputes it each time a buffer or string needs to be displayed or sent to a visual-order stream. The resolved levels are computed during reordering, then forgotten.

It turns out that bidirectional iteration through buffer text is not much more expensive than the current unidirectional one. The implementation of UAX#9 written for Emacs is efficient enough to make any long-term caching of resolved levels unnecessary.

6. Reordering of strings from `display' properties

Strings that are values of `display' text properties and overlay properties are reordered individually.
This matters when such properties cover adjacent portions of buffer text, back to back. For example, PROP1 is associated with buffer positions P1 to P2, and PROP2 immediately follows it, being associated with positions P2 to P3. The current design calls for reordering the characters of the strings that are the values of PROP1 and PROP2 separately. An alternative would be to feed them concatenated into the reordering algorithm, in which case the characters coming from PROP2 could end up displayed before (to the left of) the characters coming from PROP1. However, this alternative requires major surgery on several parts of the display code. (Interested readers are advised to read the code of set_cursor_from_row in xdisp.c, as just one example.) It's not clear what is TRT to do in this case anyway; I'm not aware of any other application that provides similar features, so there's nothing I could compare it to. So I decided to go with the easier design. If the application needs a single long string, it can always collapse two or more `display' properties into one long one.

Another, perhaps more serious implication of this design decision is that strings from `display' properties are reordered separately from the surrounding buffer text. IOW, production of glyphs from reordered buffer text is stopped when a `display' property is found, the string that is the property's value is reordered and displayed, and then the rest of the text is reordered and its glyphs produced. The effect will be visible, e.g., when a `display' string is embedded in right-to-left text in an otherwise left-to-right paragraph. Again, I think in the absence of clear "prior art", simplicity of design and the amount of changes required in the existing display engine win here.

7. Paragraph base direction

Bidirectional text can be rendered in left-to-right or in right-to-left paragraphs. The former is used for mostly left-to-right text, possibly with some embedded right-to-left text.
The latter is used for text that is mostly or entirely right-to-left. Right-to-left paragraphs are displayed flushed all the way to the right margin of the display; this is how users of right-to-left scripts expect to see text in their languages.

UAX#9 specifies how to determine whether this attribute of a paragraph, called "base direction", is one or the other, by finding the first strong directional character in the paragraph. However, the Unicode Character Database specifies that NL and CR characters are paragraph separators, which means each line is a separate paragraph, as far as UAX#9 is concerned. If Emacs followed UAX#9 to the letter, each line could have a different base direction, which is, of course, intolerable. We could avoid this nonsense by using the "soft newline" or similar features, but I firmly believe that Emacs should DTRT with bidirectional text even in the simplest modes, including the Fundamental mode, where every newline is hard.

Fortunately, UAX#9 acknowledges that applications could have other ideas about what is a "paragraph". It calls this ``higher protocol''. So I decided to use such a higher protocol -- namely, the Emacs definition of a paragraph, as determined by the `paragraph-start' and `paragraph-separate' regexps. Therefore, the first strong directional character after `paragraph-start' or `paragraph-separate' determines the paragraph direction, and that direction is kept for all the lines of the paragraph, until another `paragraph-separate' is found. (Of course, this means that inserting a single character near the beginning of a paragraph might affect the display of all the lines in that paragraph, so some of the current redisplay optimizations which deal with changes to a single line need to be disabled in this case.)

There is a buffer-specific variable `paragraph-direction' that allows overriding this dynamic detection of the direction of each paragraph, and forcing a certain base direction on all paragraphs in the buffer.
I expect, for example, each major mode for a programming language to force the left-to-right paragraph direction, because programming languages are written left to right, and right-to-left scripts appear in such buffers only in strings embedded in the program or in comments.

8. User control of visual order

UAX#9 does not always produce perfect results on the screen. Notable cases where it doesn't are related to characters such as `+' and `-' which have more than one role: they can be used in mathematical context or in plain-text context; the "correct" reordering turns out to be different in each case.

Again, lots of energy was invested in past discussions about how to prevent these blunders. Several clever heuristics are known to avoid that. The problem is that all those heuristics contradict UAX#9, which means text that looks OK in Emacs will look different (i.e. wrong) in another application.

I decided it was unjustified to deviate from UAX#9. Its algorithm already provides the solution to this problem: users can always control the visual order by inserting special formatting codes at strategic places. These codes are by default not shown in the displayed text, but they influence the resolved directionality of the surrounding characters, and thus change their visual order.

We could (and probably should) have commands in Emacs to control the visual order that will work simply by inserting the appropriate formatting codes. For example, a paragraph starting with an Arabic letter could nonetheless be rendered as a left-to-right paragraph by inserting the LRM code before that Arabic character; Emacs could have a command called, say, `make-paragraph-left-to-right' that did its job simply by inserting LRM at the beginning of the paragraph.
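The LRM idea from section 8 can be tried outside Emacs. The following Python sketch is only an illustration under simplifying assumptions: `base_direction` looks at nothing but the first strong character (bidi classes L, R and AL as reported by the standard `unicodedata` module), and `make_paragraph_left_to_right` is a hypothetical stand-in for the command suggested above, not an existing Emacs function.

```python
import unicodedata

LRM = "\u200e"  # U+200E LEFT-TO-RIGHT MARK, an invisible strong L character


def base_direction(paragraph):
    """Return 'L' or 'R' from the first strong character, or None if there is none."""
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "L"
        if bidi_class in ("R", "AL"):
            return "R"
    return None


def make_paragraph_left_to_right(paragraph):
    """Force a left-to-right base direction by prepending LRM (hypothetical helper)."""
    if base_direction(paragraph) == "L":
        return paragraph  # already left-to-right, nothing to insert
    return LRM + paragraph


arabic_first = "\u0627\u0644\u0639 text"       # paragraph starting with Arabic letters
fixed = make_paragraph_left_to_right(arabic_first)
print(base_direction(arabic_first), base_direction(fixed))  # prints "R L"
```

Because the direction is carried by a character stored in the text, any other UAX#9-conforming application reading the same text should resolve the same base direction.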
This design kills two birds: (a) it produces text that is compliant with other applications, and will display the same as in Emacs, and (b) it avoids the need to invent yet another Emacs infrastructure feature to keep information such as paragraph direction outside of the text itself.

That is all for now. If you have comments or questions, you are welcome to voice them. However, I reserve the right to respond only to those I'm interested in and/or have time for. ;-)

_______________________________________________
emacs-bidi mailing list
emacs-bidi@...
http://lists.gnu.org/mailman/listinfo/emacs-bidi
Re: Bidirectional editing in Emacs -- main design decisions

> Date: Fri, 09 Oct 2009 23:18:00 +0200
> From: Eli Zaretskii <eliz@...>
> Cc:
>
> So I decided to use such a higher protocol -- namely,
> the Emacs definition of a paragraph, as determined by the
> `paragraph-start' and `paragraph-separate' regexps.

A small, but significant correction to this: these two regexps are looked for anchored at line beginning. The reasons for this deliberate deviation from the letter of the Emacs definition of a paragraph are complicated, but the upshot is that from the user point of view, it does not make sense to change paragraph direction if the paragraph separator does not begin at the beginning of a line.

As another deviation from the definition of a paragraph, text that matches `paragraph-separate' is given the same direction as the preceding paragraph. (By contrast, Emacs generally does not consider `paragraph-separate' as part of any paragraph.)
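To make the rule concrete, here is a small Python sketch (not Emacs code) of per-paragraph base direction as described so far in this thread: the separator regexp is matched only at the beginning of a line, each paragraph takes its direction from its first strong character, and separator lines inherit the direction of the preceding paragraph. The regexp value, the function names, and the left-to-right fallback for paragraphs without strong characters are assumptions made for illustration.

```python
import re
import unicodedata

# Assumed stand-in for `paragraph-separate': an empty or whitespace-only line.
PARAGRAPH_SEPARATE = re.compile(r"[ \t\f]*$")


def first_strong(text):
    """Return 'L' or 'R' for the first strong character in text, or None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("L", "R", "AL"):
            return "L" if bidi_class == "L" else "R"
    return None


def line_directions(lines, default="L"):
    """Return one base direction per line, constant within each paragraph."""
    result = []
    previous = default  # direction of the preceding paragraph
    i = 0
    while i < len(lines):
        # re.match() only matches at the start of the line (anchored).
        if PARAGRAPH_SEPARATE.match(lines[i]):
            result.append(previous)  # separator follows the previous paragraph
            i += 1
            continue
        j = i
        while j < len(lines) and not PARAGRAPH_SEPARATE.match(lines[j]):
            j += 1
        direction = first_strong("\n".join(lines[i:j])) or default
        result.extend([direction] * (j - i))
        previous = direction
        i = j
    return result


buffer_lines = ["\u05d0\u05d1 abc", "more text", "", "hello \u05d0", "world"]
print(line_directions(buffer_lines))  # prints ['R', 'R', 'R', 'L', 'L']
```

Note how the second line, which contains only left-to-right text, still gets the right-to-left direction of its paragraph; under a strict line-by-line reading of UAX#9 it would not.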
Re: Bidirectional editing in Emacs -- main design decisions

    The reasons for this deliberate deviation from the letter of the Emacs
    definition of a paragraph are complicated, but the upshot is that from
    the user point of view, it does not make sense to change paragraph
    direction if the paragraph separator does not begin at the beginning
    of a line.

The only case when the paragraph separator does not begin at the beginning of a line is when the left margin is nonzero. Why should these paragraphs be different from other paragraphs with regard to direction of text?

    As another deviation from the definition of a paragraph, text that
    matches `paragraph-separate' is given the same direction as the
    preceding paragraph. (By contrast, Emacs generally does not consider
    `paragraph-separate' as part of any paragraph.)

I don't think that conflicts at all with the normal definition of paragraphs. The separator isn't part of the paragraph, but its reading direction needs to be determined somehow.
Re: Bidirectional editing in Emacs -- main design decisions

> From: Richard Stallman <rms@...>
> CC: emacs-devel@..., emacs-bidi@...
> Date: Sat, 10 Oct 2009 05:16:58 -0400
>
>     The reasons for this deliberate deviation from the letter of the Emacs
>     definition of a paragraph are complicated, but the upshot is that from
>     the user point of view, it does not make sense to change paragraph
>     direction if the paragraph separator does not begin at the beginning
>     of a line.
>
> The only case when the paragraph separator does not begin at the
> beginning of a line is when the left margin is nonzero.

I'm not sure I understand the situation you are describing. (The word "margin" is too overloaded, even if we confine ourselves to Emacs parlance alone.) Could you please provide an example of such a paragraph? Then I could reason about it.

> Why should these paragraphs be different from other paragraphs
> with regard to direction of text?

They are not "different", they just follow the base direction of the preceding paragraph.
Re: Bidirectional editing in Emacs -- main design decisions

On Fri, 09 Oct 2009 23:18:00 Eli Zaretskii wrote:
>
> Here's what I can tell about the subject (bidi display) at this point

In general I agree with your decisions.

> 1. Text storage
>
> Bidirectional text in Emacs buffers and strings is stored in strict
> logical order (a.k.a. "reading order"). This is how most (if not
> all) other implementations handle bidirectional text. The
> advantage of this is that file and process I/O is trivial, as well
> as text search. [snip]

The search has many problems but this should not influence your bidi reordering. The changes to various search functions can be done later.

The user ALWAYS searches for the visual text s/he sees (S/he never knows the logical order unless she visits the file literally).

The problems are caused by many reasons:
 1. Different logical inputs, even without formatting characters, can
    result in the same visual output.
    e.g. Logical Hebrew text + a number in LTR reading order, the
    number may be before or after the Hebrew text, but in the visual
    output the number will always be after (to the left of) the text.
    Logical "123 HEBREW 456" appears as "123 456 WERBEH".
 2. Formatting characters are not seen and should not be searched.
 3. The visual appearance of the searched string may be different from
    what it will match. e.g. The search for logical "HEBREW 3." in
    RTL reading order will appear as ".3 WERBEH" but will match
    also something like logical "HEBREW 3.14159" whose visual
    appearance is "3.14159 WERBEH". This may be what the user wants
    but it may also disturb her because she really wants to find only
    (visual) ".3 WERBEH".

There is also a technical question, how Emacs will show the found string which is not connected as in the "3.14159 WERBEH" above.

As a minimum adjustment, I think the search must ignore the formatting characters. An option to show (or operate, in search & replace) only on found matches that are also the same visually is recommended.

> 3. Bidi formatting codes are retained

Agreed, but see my comment on search.

> 7. Paragraph base direction
>
> There is a buffer-specific variable `paragraph-direction' that
> allows to override this dynamic detection of the direction of each
> paragraph, and force a certain base direction on all paragraphs in
> the buffer. I expect, for example, each major mode for a
> programming language to force the left-to-right paragraph
> direction, because programming languages are written left to right,
> and right-to-left scripts appear in such buffers only in strings
> embedded in the program or in comments.

I think a better name is `bidi-paragraphs-direction' or even `bidi-paragraphs-reading-direction'. Note the `s' in paragraphs, because it influences all the paragraphs in the buffer.

There should be a key to toggle this variable. It will be very useful for the minibuffer.

> 8. User control of visual order

Do you intend to support all the explicit formatting characters (LRO is especially important as it allows storing visual strings as is) or just the implicit (and more used) LRM and RLM?

> This design kills two birds: (a) it produces text that is compliant
> with other applications, and will display the same as in Emacs, and
> (b) it avoids the need to invent yet another Emacs infrastructure
> feature to keep information such as paragraph direction outside of
> the text itself.

While you can store the LRM and RLM in ISO-8859-8 encoding, there is no way to store the other formatting characters.

> That is all for now. If you have comments or questions, you are
> welcome to voice them.

I found an editor that supports all the formatting characters, Yudit (http://www.yudit.org/); it is GPLed, maybe you can use it.

The W3C recommends not using explicit formatting characters (i.e. RLO/LRO/RLE/LRE/PDF) and using markup instead (see http://www.w3.org/International/questions/qa-bidi-controls , especially the "reasons" section).

Ehud.

--
 Ehud Karni           Tel: +972-3-7966-561  /"\
 Mivtach - Simon      Fax: +972-3-7976-561  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D <http://www.keyserver.net/>    Better Safe Than Sorry
Re: Bidirectional editing in Emacs -- main design decisions

> Date: Sat, 10 Oct 2009 16:57:59 +0200
> From: "Ehud Karni" <ehud@...>
> Cc: emacs-bidi@..., emacs-devel@...
>
> On Fri, 09 Oct 2009 23:18:00 Eli Zaretskii wrote:
> >
> > Here's what I can tell about the subject (bidi display) at this point
>
> In general I agree with your decisions.

Well, you brought up many of them (thanks!), so it isn't surprising ;-)

> The search has many problems but this should not influence your bidi
> reordering. The changes to various search functions can be done later.

Agreed.

> The user ALWAYS searches for the visual text s/he sees (S/he never knows
> the logical order unless she visits the file literally).

She will look for visual text, but she will type the text she looks for in the logical (reading) order, not in the visual order, where characters are reversed and/or reshuffled.

> The problems are caused by many reasons:
>  1. Different logical inputs, even without formatting characters, can
>     result in the same visual output.
>     e.g. Logical Hebrew text + a number in LTR reading order, the
>     number may be before or after the Hebrew text, but in the visual
>     output the number will always be after (to the left of) the text.
>     Logical "123 HEBREW 456" appears as "123 456 WERBEH".
>  2. Formatting characters are not seen and should not be searched.
>  3. The visual appearance of the searched string may be different from
>     what it will match. e.g. The search for logical "HEBREW 3." in
>     RTL reading order will appear as ".3 WERBEH" but will match
>     also something like logical "HEBREW 3.14159" whose visual
>     appearance is "3.14159 WERBEH". This may be what the user wants
>     but it may also disturb her because she really wants to find only
>     (visual) ".3 WERBEH".

All of these are valid and important considerations, and the search commands and primitives will have to deal with them, of course. There's also the issue of ``final'' letters in Hebrew and much more complex similar issues in Arabic, etc. I hope enough application-level Emacs programmers will come aboard and handle all this, because otherwise these scripts will never be supported well enough in Emacs.

However, taking care of this is still quite far in the future. My main difficulty in making these decisions was to convince myself that, while none of these problems are trivial to solve and their solutions are not even known yet in detail, at least not to me, they are all _solvable_in_principle_ using just the logical-order text and the reordering engine (which was designed to allow it to be used by code other than just the redisplay iterator, so that it's easy to write a Lisp primitive that takes a logical-order string and returns its visual-order variant).

Comments that question or contradict this conclusion are what I'm seeking now, because changing these decisions further down the road may be very difficult, to say nothing of the wasted effort.

> There is also a technical question, how Emacs will show the found
> string which is not connected as in the "3.14159 WERBEH" above.

I didn't yet adapt support for faces to bidi display. However, my plan is to make it so that each character produced by the bidi iterator gets the correct face, like it does today, and faces are (and will be in the future) set in the logical order of buffer positions. So, in your example, the characters underlined below will have the `isearch' face:

    3.14159 WERBEH
    --      ------

(you were saying that the search string is ".3 WERBEH"). Yes, this shows as disconnected. But other GUI applications do it that way, so I think the user will expect this behavior.

> As a minimum adjustment, I think the search must ignore the formatting
> characters.

Yes, of course. At least by default, with an option to not ignore them.

> Do you intend to support all the explicit formatting characters (LRO is
> especially important as it allows storing visual strings as is) or just
> the implicit (and more used) LRM and RLM?

All of them. They are already supported in the code that I'm using now. Like I said, Emacs will support the full set of features described by UAX#9.

> > This design kills two birds: (a) it produces text that is compliant
> > with other applications, and will display the same as in Emacs, and
> > (b) it avoids the need to invent yet another Emacs infrastructure
> > feature to keep information such as paragraph direction outside of
> > the text itself.
>
> While you can store the LRM and RLM in ISO-8859-8 encoding, there is no
> way to store the other formatting characters.

UAX#9 recommends using LRM and RLM, in preference to the other codes, for this very reason. Users who want to use the other codes (in the rare cases where they are necessary) will have to encode text in UTF-8. I don't see this as a serious problem, though: unlike several years ago, when this issue was discussed at length on emacs-bidi, the number of applications supporting UTF-8 is very large today. Heck, even Notepad groks it nowadays!

> I found an editor that supports all the formatting characters, Yudit
> (http://www.yudit.org/); it is GPLed, maybe you can use it.

Thanks, I had it installed already.

> The W3C recommends not using explicit formatting characters (i.e.
> RLO/LRO/RLE/LRE/PDF) and using markup instead (see
> http://www.w3.org/International/questions/qa-bidi-controls ,
> especially the "reasons" section).

Yes, I know. The obsession of W3C with markup is well known ;-) But Emacs is first and foremost a _text_editor_, so it doesn't make sense to me to force users to use markup just to be able to read or write bidirectional text. I also believe that converting text that uses Unicode formatting codes into markup is not such a hard job, and someone will surely come up soon enough with an Emacs function to do that.
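The default agreed on above, a search that ignores the formatting codes while leaving them in the text, can be sketched in a few lines of Python. This is an illustration only, not a proposal for the Emacs search primitives: the function name is hypothetical, the set of codes is limited to the seven named in this thread (LRM, RLM, LRE, RLE, PDF, LRO, RLO), and both strings are assumed to be in logical order.

```python
BIDI_CONTROLS = {
    "\u200e",  # LRM
    "\u200f",  # RLM
    "\u202a",  # LRE
    "\u202b",  # RLE
    "\u202c",  # PDF
    "\u202d",  # LRO
    "\u202e",  # RLO
}


def find_ignoring_bidi_controls(haystack, needle):
    """Return (start, end) of the first match in haystack, or None.

    Formatting codes are skipped on both sides, but the returned span
    indexes the original haystack, so the text itself is never modified.
    """
    kept = [(i, ch) for i, ch in enumerate(haystack) if ch not in BIDI_CONTROLS]
    stripped = "".join(ch for _, ch in kept)
    pattern = "".join(ch for ch in needle if ch not in BIDI_CONTROLS)
    if not pattern:
        return None
    hit = stripped.find(pattern)
    if hit < 0:
        return None
    start = kept[hit][0]
    end = kept[hit + len(pattern) - 1][0] + 1
    return start, end


# "XYZ" stands for right-to-left letters; RLM and LRM are embedded in the text.
text = "abc \u200fXY\u200eZ 3.14"
print(find_ignoring_bidi_controls(text, "XYZ"))  # prints (5, 9)
```

The matched span `text[5:9]` still contains the embedded LRM, which is what lets a later save reproduce the original bytes.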
Re: Bidirectional editing in Emacs -- main design decisions

>>>>> "Eli" == Eli Zaretskii <eliz@...> writes:

Eli> I'm slowly working on adding support for bidirectional editing in Emacs.

Thanks for posting that. It is a great summary of the concerns and needs of an editor when dealing with bidi text.

To be fair, I should point out before continuing that I do not read any rtl scripts. My interests deal with fonts and typography and at least seeing bidi email in its correct visual order, if only to try to learn some of it.

Eli> 1. Text storage
Eli> 2. Support for Unicode Bidirectional Algorithm
Eli> 3. Bidi formatting codes are retained
Eli> 4. Reordering of text for display
Eli> 5. Visual-order information is volatile
Eli> 6. Reordering of strings from `display' properties
Eli> 7. Paragraph base direction
Eli> 8. User control of visual order

Of those points, all but #6 are no-brainers; your choices are exactly what an editor must do.

Point six is an interesting problem; I'm also unaware of any prior art. I suspect that in the long term it would be best to note the start and end directionality of such chunks of text and set them chunk-by-chunk in a manner similar to how glyphs are set in the absence of such properties.

But in the short term I agree with the choice you outlined.

-JimC
--
James Cloos <cloos@...>         OpenPGP: 1024D/ED7DAEA6
Re: Bidirectional editing in Emacs -- main design decisions

> From: James Cloos <cloos@...>
> Cc: emacs-devel@..., emacs-bidi@...
>
> Thanks for posting that. It is a great summary of the concerns and
> needs of an editor when dealing with bidi text.

Thanks, but I think it's just the beginning. There are lots of other issues to deal with; see, for example, the aspects of search described by Ehud Karni in this thread. The hard problem in making these decisions was to become convinced that all those other issues are reasonably solvable based on these basic features, without actually solving any of them.

> Of those points, all but #6 are no-brainers; your choices are exactly
> what an editor must do.

Thanks for confirming that.

> Point six is an interesting problem; I'm also unaware of any prior
> art. I suspect that in the long term it would be best to note the
> start and end directionality of such chunks of text and set them
> chunk-by-chunk in a manner similar to how glyphs are set in the
> absence of such properties.

I think this is impossible in general, because once text is reordered, the information needed to plug in additional chunks (the resolved level of each character) is lost.

Note that it is fairly simple to reorder the text of `display' strings together with the surrounding text -- you just need to feed the characters together into the reordering engine. The problem is elsewhere -- in the code that uses the produced glyphs.

> But in the short term I agree with the choice you outlined.

The future will tell if it was the right decision. Maybe a useful first step to examining its validity would be to prepare a fairly complete list of Emacs applications that currently use the `display' text properties and overlay properties. Given such a list, one could think of their applicability to bidirectional editing, and how the strings should be displayed in each context to do what the users expect.
Re: Bidirectional editing in Emacs -- main design decisions

    > The only case when the paragraph separator does not begin at the
    > beginning of a line is when the left margin is nonzero.

    I'm not sure I understand the situation you are describing. (The word
    "margin" is too overloaded, even if we confine ourselves to Emacs
    parlance alone.) Could you please provide an example of such a
    paragraph? Then I could reason about it.

I can't give you an example, but move-to-left-margin shows what it means to be at the left margin. It is a matter of matching the paragraph regexps after the right amount of whitespace as specified by the value of `left-margin'.

When `left-margin' is nonzero, a line which fails to start with that much whitespace also starts a paragraph.

    > Why should these paragraphs be different from other paragraphs
    > with regard to direction of text?

    They are not "different", they just follow the base direction of the
    preceding paragraph.

To be fully correct, it ought to detect paragraphs correctly when `left-margin' is nonzero. I think that won't be hard to do.
Re: Bidirectional editing in Emacs -- main design decisions

> From: Richard Stallman <rms@...>
> CC: emacs-devel@..., emacs-bidi@...
> Date: Sun, 11 Oct 2009 04:41:22 -0400
>
> move-to-left-margin shows what it means to be at the left margin.
> It is a matter of matching the paragraph regexps after the right
> amount of whitespace as specified by the value of `left-margin'.

Would it be sufficient to account for any arbitrary amount of horizontal whitespace between the beginning of the line and the paragraph regexps? If so, that is an almost trivial modification of the code I already have.

My problem was with potentially more complicated situations, since paragraph regexps may in principle be anything.

I also have issues with a paragraph that is separated from the previous one by just the amount of indentation, like this:

    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
        bbbbbbbbbbbbbbbbbbbbbbbbbbbb
        bbbbbbbbbbbbbbbbbbbbbbbbbbbb

Are you suggesting that Emacs should recompute the paragraph direction of the two lines with b's, and the result could be a different base direction from that used by the two preceding lines with a's? I think that such direction changes will annoy users of bidirectional scripts. What are the use-cases where such paragraphs are useful?

> When `left-margin' is nonzero, a line which fails to start with that much
> whitespace also starts a paragraph.

You mean, a line which starts with more indentation, or a line that starts with less? Like this:

        aaaaaaaaaaaaaaaaaaaaaa
        aaaaaaaaaaaaaaaaaaaaaa
    xxxxxxxxxxxxxxxxxxxxxxxxxx
        yyyyyyyyyyyyyyyyyyyyyy

Should the "xxx" line be considered a new paragraph?

> To be fully correct, it ought to detect paragraphs correctly when
> `left-margin' is nonzero. I think that won't be hard to do.

Maybe it will not be hard -- once I make move-to-column, current-column, and the rest of indent.c work with bidirectional text. Right now, it's badly broken, because it assumes buffer positions increase linearly with screen positions. Even vertical cursor motion and C-e do not work correctly, because of that.
Re: Re: Bidirectional editing in Emacs -- main design decisions

> Date: Sun, 11 Oct 2009 22:12:54 +0200
> From: Eli Zaretskii <eliz@...>
> Cc: emacs-bidi@..., emacs-devel@...
>
> Maybe it will not be hard -- once I make move-to-column,
> current-column, and the rest of indent.c work with bidirectional
> text. Right now, it's badly broken

Broken for a buffer with bidirectional text, I should have said. It works okay with unidirectional left-to-right text, of course.
Re: Bidirectional editing in Emacs -- main design decisions

    Would it be sufficient to account for any arbitrary amount of
    horizontal whitespace between the beginning of the line and the
    paragraph regexps?

No, that's not correct. You need to skip whitespace whose width is the value of `left-margin' and then match the regexp. (More precisely, you need to skip the amount of space specified by the value that the function `current-left-margin' would return.)

Looking at the code, I think I was mistaken in what I said about "less than `left-margin' indentation starts a paragraph". I think that if the line doesn't have `left-margin' worth of indentation, then the paragraph regexps match at the end of the indentation.

Look at the code of `forward-paragraph' to see the paragraph criteria in full detail.

It is very important to support the full set of features that Emacs offers for controlling paragraphs. At least, it is important to support the full set when this is released. I won't say it has to be the very next job you work on.
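For readers who want to experiment with the matching rule described in this message, here is a Python sketch of one reading of it: skip at most `left-margin' columns of leading whitespace, then try the paragraph regexp at that point; if the line has less indentation than that, the regexp is tried at the end of the indentation. This is an interpretation of the prose above, not a port of `forward-paragraph'; the function name, the `tab_width` parameter, and the use of Python regexp syntax instead of Emacs regexp syntax are all assumptions.

```python
import re


def matches_at_left_margin(line, regexp, left_margin, tab_width=8):
    """Match `regexp` after skipping up to `left_margin` columns of whitespace."""
    column = 0
    pos = 0
    while pos < len(line) and column < left_margin and line[pos] in " \t":
        if line[pos] == "\t":
            column += tab_width - (column % tab_width)  # advance to next tab stop
        else:
            column += 1
        pos += 1
    # If the line is indented by less than left_margin, pos now sits at the
    # end of its indentation, so the regexp is tried there.
    return re.match(regexp, line[pos:]) is not None


print(matches_at_left_margin("    * item", r"\*", 4))    # prints True
print(matches_at_left_margin("      * item", r"\*", 4))  # prints False
print(matches_at_left_margin("  * item", r"\*", 4))      # prints True
```

The second call fails because only four columns are skipped, leaving two spaces in front of the `*`; that is the difference from skipping an arbitrary amount of whitespace.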
Re: Bidirectional editing in Emacs -- main design decisions

> From: Richard Stallman <rms@...>
> CC: emacs-devel@..., emacs-bidi@...
> Date: Mon, 12 Oct 2009 06:11:42 -0400
>
> It is very important to support the full set of features that Emacs
> offers for controlling paragraphs.

OK, I will add this to my TODO. Thanks.