Incorrect point-and-click URIs of files with non-ASCII content on Windows

On Windows platform only (where the default encoding is not UTF-8), non-ASCII characters in a source line shift the subsequent PDF point-and-click link column positions in that line by 1. Example:

%{ á %} { a }

The link of the note 'a' points to the space after the 'a' (column 11), not to it (column 10).

_______________________________________________
bug-lilypond mailing list
bug-lilypond@...
http://lists.gnu.org/mailman/listinfo/bug-lilypond
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009-10-01, Harmath Dénes wrote:
> On Windows platform only (where the default encoding is not UTF-8),
> non-ASCII characters in a source line shift the subsequent PDF
> point-and-click link column positions in that line by 1. Example:
>
> %{ á %} { a }
>
> The link of the note 'a' points to the space after the 'a' (column
> 11), not to it (column 10).

Hi,

I'm a little confused about this report. I don't see any indication of column 11 in LilyPond's PDF output.

Can you check the textedit URI in the Postscript output? Here is the output I get when testing the attached file.

textedit:///home/pnorcks/test.ly:3:10:10

Does point-and-click in Windows open LilyPad? If so, then I'm not surprised this isn't working, because the Windows LilyPad doesn't currently recognize UTF-8 encoding.

-Patrick

\version "2.13.6"

%{ á %} { a }
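For readers comparing URIs like the one above, here is a minimal sketch of splitting a textedit URI into its parts. It assumes the layout textedit://PATH:LINE:CHAR:COLUMN, which is inferred from the examples in this thread; the names TexteditLink, parse_number and parse_textedit are made up for illustration and are not LilyPond code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical holder for the fields of a textedit URI (names are ours).
struct TexteditLink {
  std::string path;
  unsigned long line = 0;
  unsigned long character = 0;
  unsigned long column = 0;
};

// Converts a non-empty run of ASCII digits; returns false otherwise.
// Overflow is not checked in this sketch.
static bool parse_number(const std::string& s, unsigned long& out) {
  if (s.empty()) return false;
  unsigned long value = 0;
  for (char ch : s) {
    if (!std::isdigit(static_cast<unsigned char>(ch))) return false;
    value = value * 10 + static_cast<unsigned long>(ch - '0');
  }
  out = value;
  return true;
}

// Assumed layout: textedit://PATH:LINE:CHAR:COLUMN
bool parse_textedit(const std::string& uri, TexteditLink& out) {
  const std::string scheme = "textedit://";
  if (uri.compare(0, scheme.size(), scheme) != 0) return false;

  // Split from the right, so a colon inside the path is kept in the path.
  const std::size_t c3 = uri.rfind(':');
  if (c3 == std::string::npos || c3 == 0) return false;
  const std::size_t c2 = uri.rfind(':', c3 - 1);
  if (c2 == std::string::npos || c2 == 0) return false;
  const std::size_t c1 = uri.rfind(':', c2 - 1);
  if (c1 == std::string::npos || c1 < scheme.size()) return false;
  assert(c1 < c2 && c2 < c3);

  TexteditLink link;
  link.path = uri.substr(scheme.size(), c1 - scheme.size());
  if (!parse_number(uri.substr(c1 + 1, c2 - c1 - 1), link.line)) return false;
  if (!parse_number(uri.substr(c2 + 1, c3 - c2 - 1), link.character)) return false;
  if (!parse_number(uri.substr(c3 + 1), link.column)) return false;
  out = link;
  return true;
}
```

Splitting from the right is a deliberate choice: a Windows path may itself contain a colon after the drive letter, so only the last three colons are treated as field separators.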
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009-10-26, Patrick McCarty wrote:
> On 2009-10-01, Harmath Dénes wrote:
> > On Windows platform only (where the default encoding is not UTF-8),
> > non-ASCII characters in a source line shift the subsequent PDF
> > point-and-click link column positions in that line by 1. Example:
> >
> > %{ á %} { a }
> >
> > The link of the note 'a' points to the space after the 'a' (column
> > 11), not to it (column 10).
>
> Can you check the textedit URI in the Postscript output? Here is the
> output I get when testing the attached file.
>
> textedit:///home/pnorcks/test.ly:3:10:10

By the way, the output of current git is

textedit:///home/pnorcks/test.ly:3:11:10

but it looks like your issue is unrelated.

-Patrick
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

2009/10/27 Patrick McCarty <pnorcks@...>:
> Valentin, can you add this to the tracker? The subject of this thread
> is okay for an issue summary.

Does http://code.google.com/p/lilypond/issues/detail?id=887 cover it? I haven't mentioned the weird position oddity.

Cheers,
Valentin
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

2009/10/29 Valentin Villenave <v.villenave@...>:
> 2009/10/29 Harmath Dénes <harmathdenes@...>:
>> No, the two are very different.
>> In http://code.google.com/p/lilypond/issues/detail?id=887, the _filename_
>> part of the URIs is wrong because of non-ASCII _filenames_.
>> In this bug (http://article.gmane.org/gmane.comp.gnu.lilypond.bugs/16097),
>> the _column position_ part of the URIs is wrong because of non-ASCII
>> _content_.
>
> Interesting. Patrick, doesn't it look a lot like David's report on
> http://lists.gnu.org/archive/html/bug-lilypond/2009-10/msg00049.html ?

It is very similar, and the problem probably stems from the same function, namely Source_file::get_counts().

>> Anyway, as mentioned, the latter is more general, and is related to not the
>> PDF generation, but the position calculation, since it occurs in
>> command-line compiler errors too.
>
> Yes, which is why I'd like to know if Patrick's recent work on David's
> report could have affected this one as well.

It *should* have an effect. But if I'm thinking about this correctly, there will still be an off-by-one error in the character/column counts, because my commit only affects the value of the character count when non-ASCII characters are found.

-Patrick
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009.10.29., at 22:12, Patrick McCarty wrote:
> It *should* have an effect. But if I'm thinking about this correctly,
> there will still be an off-by-one error in the character/column
> counts, because my commit only affects the value of the character
> count when non-ASCII characters are found.

I don't understand this completely, but it doesn't matter, if the column number in the error messages and at least one of the column numbers in the point-and-click hyperlinks will be right.

thSoft
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009-10-29, Harmath Dénes wrote:
> On 2009.10.29., at 22:12, Patrick McCarty wrote:
> >It *should* have an effect. But if I'm thinking about this correctly,
> >there will still be an off-by-one error in the character/column
> >counts, because my commit only affects the value of the character
> >count when non-ASCII characters are found.
>
> I don't understand this completely, but it doesn't matter, if the
> column number in the error messages and at least one of the column
> numbers in the point-and-click hyperlinks will be right.

I see the reason for confusion now. Let me outline the various cases:

1) All ASCII characters. In this case, the character and column will always be the same, as in "3:10:10".

2) Tab characters. When tabs are used, the column number is typically greater than the character number (unless you use a tab that is one character wide). An example might be "1:1:8".

3) UTF-8 characters. In UTF-8 locales, terminals need to know about the byte offset, so I am using the character count to specify this offset. An example would be "3:11:10".

The third case is arguably misleading, so maybe it should be changed to use the "3:10:10" instead. I am okay with either format. If we want to use "3:10:10" instead, then an additional parameter would be needed to calculate the byte offset.

-Patrick
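The three cases above can be reproduced with a small sketch. It is not LilyPond's Source_file::get_counts(); location_suffix is a hypothetical helper that assumes 0-based counts, a tab width of 8, and the reading of case 3 in which the "character" field is really the byte offset while the column counts each UTF-8 sequence once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not LilyPond code: builds the "line:char:column"
// suffix for the byte position `target_byte` within `line`.
//   char   = number of bytes before the target (the byte offset)
//   column = display column: a tab advances to the next tab stop, and a
//            UTF-8 sequence counts once (continuation bytes are skipped)
std::string location_suffix(unsigned line_number, const std::string& line,
                            std::size_t target_byte,
                            std::size_t tab_width = 8) {
  assert(tab_width > 0);
  assert(target_byte <= line.size());
  std::size_t column = 0;
  for (std::size_t i = 0; i < target_byte; ++i) {
    const unsigned char c = static_cast<unsigned char>(line[i]);
    if (c == '\t')
      column = (column / tab_width + 1) * tab_width;  // jump to next tab stop
    else if ((c & 0xC0) != 0x80)  // skip UTF-8 continuation bytes (10xxxxxx)
      ++column;
  }
  return std::to_string(line_number) + ":" + std::to_string(target_byte) +
         ":" + std::to_string(column);
}
```

With these assumptions, the all-ASCII line "%{ a %} { a }" at byte 10 gives "3:10:10", a line starting with one tab gives "1:1:8" at byte 1, and the UTF-8 encoded line "%{ á %} { a }" at byte 11 gives "3:11:10", matching the examples in the message.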
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

How is the tab width determined? In my case, it was always 8, so it's superfluous I think.

And also IMHO there's no sense in differentiating character & byte offset. They are misleading. I'd propose keeping only the character offset, which does not take tab width into account, but correctly counts UTF-8 characters.

thSoft

On 2009.10.30., at 4:07, Patrick McCarty wrote:
> On 2009-10-29, Harmath Dénes wrote:
>> On 2009.10.29., at 22:12, Patrick McCarty wrote:
>>> It *should* have an effect. But if I'm thinking about this correctly,
>>> there will still be an off-by-one error in the character/column
>>> counts, because my commit only affects the value of the character
>>> count when non-ASCII characters are found.
>>
>> I don't understand this completely, but it doesn't matter, if the
>> column number in the error messages and at least one of the column
>> numbers in the point-and-click hyperlinks will be right.
>
> I see the reason for confusion now. Let me outline the various cases:
>
> 1) All ASCII characters. In this case, the character and column
> will always be the same, as in "3:10:10".
>
> 2) Tab characters. When tabs are used, the column number is
> typically greater than the character number (unless you use a tab
> that is one character wide). An example might be "1:1:8".
>
> 3) UTF-8 characters. In UTF-8 locales, terminals need to know about
> the byte offset, so I am using the character count to specify this
> offset. An example would be "3:11:10".
>
> The third case is arguably misleading, so maybe it should be changed
> to use the "3:10:10" instead. I am okay with either format. If we
> want to use "3:10:10" instead, then an additional parameter would be
> needed to calculate the byte offset.
>
> -Patrick
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

> 3) UTF-8 characters. In UTF-8 locales, terminals need to know about
> the byte offset, so I am using the character count to specify this
> offset. An example would be "3:11:10".
>
> The third case is arguably misleading, so maybe it should be changed
> to use the "3:10:10" instead. I am okay with either format. If we
> want to use "3:10:10" instead, then an additional parameter would be
> needed to calculate the byte offset.

I hardly believe anyone or anything should care about byte offset. LilyPond source files contain UTF-8 characters and not bytes! If a terminal/editor doesn't support UTF-8 character streams, then the terminal/editor should be fixed.

Bert
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009-11-01, Bertalan Fodor (LilyPondTool) wrote:
> > 3) UTF-8 characters. In UTF-8 locales, terminals need to know about
> > the byte offset, so I am using the character count to specify this
> > offset. An example would be "3:11:10".
> >
> >The third case is arguably misleading, so maybe it should be changed
> >to use the "3:10:10" instead. I am okay with either format. If we
> >want to use "3:10:10" instead, then an additional parameter would be
> >needed to calculate the byte offset.
>
> I hardly believe anyone or anything should care about byte offset.
> LilyPond source files contain UTF-8 characters and not bytes! If a
> terminal/editor doesn't support UTF-8 character streams, then the
> terminal/editor should be fixed.

Well, in its current state, LilyPond is not using "UTF-8 character streams" for error message output and point-click URIs, so this would likely require a rewrite.

Do you have any suggestions about how to implement this in C++ (portably)?

-Patrick
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

Patrick McCarty wrote:
> On 2009-11-01, Bertalan Fodor (LilyPondTool) wrote:
>
>>> 3) UTF-8 characters. In UTF-8 locales, terminals need to know about
>>> the byte offset, so I am using the character count to specify this
>>> offset. An example would be "3:11:10".
>>>
>>> The third case is arguably misleading, so maybe it should be changed
>>> to use the "3:10:10" instead. I am okay with either format. If we
>>> want to use "3:10:10" instead, then an additional parameter would be
>>> needed to calculate the byte offset.
>>>
>> I hardly believe anyone or anything should care about byte offset.
>> LilyPond source files contain UTF-8 characters and not bytes! If a
>> terminal/editor doesn't support UTF-8 character streams, then the
>> terminal/editor should be fixed.
>>
>
> Well, in its current state, LilyPond is not using "UTF-8 character
> streams" for error message output and point-click URIs, so this would
> likely require a rewrite.
>
> Do you have any suggestions about how to implement this in C++
> (portably)?
>
> -Patrick

Unfortunately no. But that should not be related to the bad character offset problem.
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009-11-02, Bertalan Fodor (LilyPondTool) wrote:
> Patrick McCarty wrote:
> >On 2009-11-01, Bertalan Fodor (LilyPondTool) wrote:
> >>> 3) UTF-8 characters. In UTF-8 locales, terminals need to know about
> >>> the byte offset, so I am using the character count to specify this
> >>> offset. An example would be "3:11:10".
> >>>
> >>>The third case is arguably misleading, so maybe it should be changed
> >>>to use the "3:10:10" instead. I am okay with either format. If we
> >>>want to use "3:10:10" instead, then an additional parameter would be
> >>>needed to calculate the byte offset.
> >
> >>I hardly believe anyone or anything should care about byte offset.
> >>LilyPond source files contain UTF-8 characters and not bytes! If a
> >>terminal/editor doesn't support UTF-8 character streams, then the
> >>terminal/editor should be fixed.
> >
> >Well, in its current state, LilyPond is not using "UTF-8 character
> >streams" for error message output and point-click URIs, so this would
> >likely require a rewrite.
> >
> >Do you have any suggestions about how to implement this in C++
> >(portably)?
>
> Unfortunately no. But that should not be related to the bad
> character offset problem.

I don't have a fix yet, but I think I discovered the problem. Here are the relevant lines from lily/source-file.cc:

    #if HAVE_MBRTOWC
          wchar_t multibyte[2];
          size_t thislen = mbrtowc (multibyte, line_chars, left, &state);
    #else
          size_t thislen = 1;
    #endif /* !HAVE_MBRTOWC */

On Windows, the variable `thislen' seems to always have the value `1', so that's why a multibyte (UTF-8) character in a source file is messing up the character/column count.

This means that either

1) mbrtowc() on Windows always returns `1' regardless of the character it is considering. This is doubtful.

2) LilyPond's configure script is not detecting mbrtowc() when compiling the Windows installer with GUB, and so the `size_t thislen = 1;' line is executed instead.

The second case is much more likely. Graham, I don't want to bug you with this, but would you mind checking the log for mingw::lilypond to see if configure detects the mbrtowc() function? On my Linux system, the output is

checking for mbrtowc... yes

Thanks,
Patrick
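To make the role of `thislen' concrete, here is a self-contained sketch of a counting loop built on mbrtowc(). It is modelled loosely on the quoted lines, not copied from lily/source-file.cc; the function name count_chars_mbrtowc and the fallback of treating an undecodable byte as one character are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical helper: counts the characters in `line` by letting
// mbrtowc() report how many bytes each character occupies.
// mbrtowc() decodes according to the LC_CTYPE category of the current
// locale, so the result depends on that locale.
std::size_t count_chars_mbrtowc(const std::string& line) {
  std::mbstate_t state;
  std::memset(&state, 0, sizeof state);  // initial conversion state
  const char* p = line.data();
  std::size_t left = line.size();
  std::size_t chars = 0;
  while (left > 0) {
    wchar_t wc;
    std::size_t len = std::mbrtowc(&wc, p, left, &state);
    if (len == static_cast<std::size_t>(-1) ||
        len == static_cast<std::size_t>(-2)) {
      // Invalid or incomplete sequence: reset and count one byte.
      std::memset(&state, 0, sizeof state);
      len = 1;
    } else if (len == 0) {
      len = 1;  // an embedded NUL still occupies one byte
    }
    assert(len <= left);
    p += len;
    left -= len;
    ++chars;
  }
  return chars;
}
```

The sketch never calls setlocale() itself. In a program that has not selected a UTF-8 locale, a loop like this will typically advance one byte at a time, so the 14-byte UTF-8 line "%{ á %} { a }" is counted as 14 characters instead of 13. That is the same symptom as `thislen' always being 1, although the thread has not established that this is what happens on Windows.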
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On Mon, Nov 02, 2009 at 01:07:48AM -0800, Patrick McCarty wrote:
> The second case is much more likely. Graham, I don't want to bug you
> with this, but would you mind checking the log for mingw::lilypond to
> see if configure detects the mbrtowc() function? On my Linux system,
> the output is
>
> checking for mbrtowc... yes

I have
ac_cv_func_mbrtowc=yes
ac_cv_search_mbrtowc='none required'
and also
#define HAVE_MBRTOWC 1

Dunno what that second line means.

Cheers,
- Graham
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009-11-02, Graham Percival wrote:
> On Mon, Nov 02, 2009 at 01:07:48AM -0800, Patrick McCarty wrote:
> > The second case is much more likely. Graham, I don't want to bug you
> > with this, but would you mind checking the log for mingw::lilypond to
> > see if configure detects the mbrtowc() function? On my Linux system,
> > the output is
> >
> > checking for mbrtowc... yes
>
> I have
> ac_cv_func_mbrtowc=yes
> ac_cv_search_mbrtowc='none required'
> and also
> #define HAVE_MBRTOWC 1
>
> Dunno what that second line means.

Okay, that means mbrtowc() is supported and the preprocessor macro HAVE_MBRTOWC is enabled. So my guess was wrong. :-(

I plan on removing this function (due to the FIXME) and use a simpler approach instead that should not have these problems.

Thanks,
Patrick
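The message does not say what the simpler approach would be. One possibility, shown here purely as an illustration and not as the fix that was planned, is to count UTF-8 characters directly from the bytes, with no dependence on mbrtowc() or the locale. The name utf8_char_count is hypothetical, and the sketch assumes the input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-8 characters that start within the
// first `byte_limit` bytes of `s`. Every byte that is not a continuation
// byte (bit pattern 10xxxxxx) begins a new character.
std::size_t utf8_char_count(const std::string& s, std::size_t byte_limit) {
  assert(byte_limit <= s.size());
  std::size_t count = 0;
  for (std::size_t i = 0; i < byte_limit; ++i) {
    if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80)
      ++count;
  }
  return count;
}
```

For the example line from the original report, the 11 bytes before the note 'a' contain 10 characters, which corresponds to the byte offset 11 and column 10 discussed earlier in the thread. Because the result is computed from the bytes alone, it would be the same on Linux and Windows.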
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On Mon, Nov 02, 2009 at 09:06:46AM -0800, Patrick McCarty wrote:
> On 2009-11-02, Graham Percival wrote:
> > I have
> > ac_cv_func_mbrtowc=yes
> > ac_cv_search_mbrtowc='none required'
> > and also
> > #define HAVE_MBRTOWC 1
> >
> > Dunno what that second line means.
>
> Okay, that means mbrtowc() is supported and the preprocessor macro
> HAVE_MBRTOWC is enabled. So my guess was wrong. :-(

The first and third lines mean that, yeah. But the 'none required' -- does that mean "no extra measures are required, since we have the normal mbrtowc"? That seems like a weird interpretation, but this would hardly be the first time I've seen a difficult-to-understand compiler or configure message. :)

Cheers,
- Graham
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On 2009-11-02, Graham Percival wrote:
> On Mon, Nov 02, 2009 at 09:06:46AM -0800, Patrick McCarty wrote:
> > On 2009-11-02, Graham Percival wrote:
> > > I have
> > > ac_cv_func_mbrtowc=yes
> > > ac_cv_search_mbrtowc='none required'
> > > and also
> > > #define HAVE_MBRTOWC 1
> > >
> > > Dunno what that second line means.
> >
> > Okay, that means mbrtowc() is supported and the preprocessor macro
> > HAVE_MBRTOWC is enabled. So my guess was wrong. :-(
>
> The first and third lines mean that, yeah. But the 'none
> required' -- does that mean "no extra measures are required, since
> we have the normal mbrtowc"?

Yes, I think so. If mbrtowc() is not found in the standard place, then configure would conduct a thorough search to try and find it.

Thanks,
Patrick
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

Patrick McCarty wrote:
> On 2009-11-02, Graham Percival wrote:
>
>> On Mon, Nov 02, 2009 at 01:07:48AM -0800, Patrick McCarty wrote:
>>
>>> The second case is much more likely. Graham, I don't want to bug you
>>> with this, but would you mind checking the log for mingw::lilypond to
>>> see if configure detects the mbrtowc() function? On my Linux system,
>>> the output is
>>>
>>> checking for mbrtowc... yes
>>>
>> I have
>> ac_cv_func_mbrtowc=yes
>> ac_cv_search_mbrtowc='none required'
>> and also
>> #define HAVE_MBRTOWC 1
>>
>> Dunno what that second line means.
>>
>
> Okay, that means mbrtowc() is supported and the preprocessor macro
> HAVE_MBRTOWC is enabled. So my guess was wrong. :-(
>
> I plan on removing this function (due to the FIXME) and use a simpler
> approach instead that should not have these problems.
>
> Thanks,
> Patrick

For me the first question is how does LC_CTYPE affect mbrtowc on Windows with UTF-8 files?
Re: Incorrect point-and-click URIs of files with non-ASCII content on Windows

On Mon, Nov 2, 2009 at 1:39 PM, Bertalan Fodor (LilyPondTool) <lilypondtool@...> wrote:
> Patrick McCarty wrote:
>> Okay, that means mbrtowc() is supported and the preprocessor macro
>> HAVE_MBRTOWC is enabled. So my guess was wrong. :-(
>>
>> I plan on removing this function (due to the FIXME) and use a simpler
>> approach instead that should not have these problems.
>
> For me the first question is how does LC_CTYPE affect mbrtowc on Windows
> with UTF-8 files?

I don't know, especially since I'm not sure mbrtowc() is called at all in LilyPond on Windows, but there is not an easy way to verify this.

I will attempt to address the mbrtowc() portability issue and the character/byte-offset problem later today (or soonish). Thanks for your patience.

-Patrick