[lex] Character classes

Another question related to this topic:

> The lexer doesn't support character sets either. Everything is implemented
> based on the standard locale (namespace boost::spirit::standard). This is
> something we want to look into in the future.

Do you think it would be possible to add another charset (comparable to 'ascii.hpp' and 'iso-8859-1.hpp'), let's say 'unicode.hpp', based on this listing:

http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

and integrate it in spirit (based on wchar_t)? All the necessary information (small, capital, control etc.) seems to be in there, so I volunteer to script a conversion ;).

Cheers,
Kay

_______________________________________________
Spirit-general mailing list
Spirit-general@...
https://lists.sourceforge.net/lists/listinfo/spirit-general
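A minimal sketch of the first step of such a conversion, in case it helps the discussion: reading one line of UnicodeData.txt into a record. It assumes the semicolon-separated, 15-field layout of that file (field 0 is the code point, field 2 the general category, fields 12 and 13 the simple uppercase and lowercase mappings). The names `unicode_record`, `split_fields`, `parse_hex` and `parse_record` are made up for illustration and are not part of Spirit. Range entries (the `<..., First>` / `<..., Last>` pairs) and multi-character case mappings are ignored here.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record type; not part of Boost.Spirit.
struct unicode_record {
    std::uint32_t code_point;
    std::string general_category;               // e.g. "Lu", "Ll", "Nd", "Cc"
    std::optional<std::uint32_t> simple_upper;  // empty if the field is blank
    std::optional<std::uint32_t> simple_lower;  // empty if the field is blank
};

// Splits a line on ';', keeping empty fields.
inline std::vector<std::string> split_fields(const std::string& line) {
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const auto pos = line.find(';', start);
        if (pos == std::string::npos) {
            fields.push_back(line.substr(start));
            return fields;
        }
        fields.push_back(line.substr(start, pos - start));
        start = pos + 1;
    }
}

// Parses up to six hex digits; returns an empty optional for blank or bad input.
inline std::optional<std::uint32_t> parse_hex(const std::string& s) {
    if (s.empty() || s.size() > 6) return std::nullopt;
    std::uint32_t value = 0;
    for (const char c : s) {
        std::uint32_t digit = 0;
        if (c >= '0' && c <= '9') digit = static_cast<std::uint32_t>(c - '0');
        else if (c >= 'A' && c <= 'F') digit = static_cast<std::uint32_t>(c - 'A' + 10);
        else if (c >= 'a' && c <= 'f') digit = static_cast<std::uint32_t>(c - 'a' + 10);
        else return std::nullopt;
        value = value * 16 + digit;
    }
    return value;
}

// Parses one UnicodeData.txt line such as
//   0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
// Returns an empty optional if the line does not have 15 fields
// or the code point field is not valid hex.
inline std::optional<unicode_record> parse_record(const std::string& line) {
    const auto fields = split_fields(line);
    if (fields.size() != 15) return std::nullopt;
    const auto code_point = parse_hex(fields[0]);
    if (!code_point) return std::nullopt;
    return unicode_record{*code_point, fields[2],
                          parse_hex(fields[12]), parse_hex(fields[13])};
}
```

A generator script would loop over the file with `parse_record` and emit a lookup table from the collected records; how that table is keyed and stored is left open here.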
Re: [lex] Character classes

> Another question related to this topic:
>
> > The lexer doesn't support character sets either. Everything is implemented
> > based on the standard locale (namespace boost::spirit::standard). This is
> > something we want to look into in the future.
>
> Do you think it would be possible to add another charset (comparable
> to 'ascii.hpp' and 'iso-8859-1.hpp') let's say 'unicode.hpp' based on
> this listing:
> http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
> and integrate it in spirit (based on wchar_t)? All the necessary
> information (small, capital, control etc.) seems to be in there, so I
> volunteer to script a conversion ;).

That actually has been the plan from the beginning, but it has not been implemented yet. We wanted to 'wait' for Boost to get a Unicode library, but if you have a quicker solution, be our guest and come up with a patch!

Thanks!

Regards
Hartmut

-------------------
Meet me at BoostCon
http://boostcon.com
Re: [lex] Character classes

Hartmut Kaiser wrote:
>> Another question related to this topic:
>>
>>> The lexer doesn't support character sets either. Everything is implemented
>>> based on the standard locale (namespace boost::spirit::standard). This is
>>> something we want to look into in the future.
>>
>> Do you think it would be possible to add another charset (comparable
>> to 'ascii.hpp' and 'iso-8859-1.hpp') let's say 'unicode.hpp' based on
>> this listing:
>> http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
>> and integrate it in spirit (based on wchar_t)? All the necessary
>> information (small, capital, control etc.) seems to be in there, so I
>> volunteer to script a conversion ;).
>
> That actually has been the plan from the beginning, but it has not been
> implemented yet.
> We wanted to 'wait' for Boost to get a Unicode library, but if you have a
> quicker solution be our guest to come up with a patch!

Alas, it's not that simple. You'll find out as you dig deeper into UnicodeData.txt and its required semantics.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net
http://www.facebook.com/djowel

Meet me at BoostCon
http://www.boostcon.com/home
http://www.facebook.com/boostcon
Re: [lex] Character classes

Kay-Michael Wuerzner wrote:
>>> That actually has been the plan from the beginning, but it has not been
>>> implemented yet.
>>> We wanted to 'wait' for Boost to get a Unicode library, but if you have a
>>> quicker solution be our guest to come up with a patch!
>
> I'll try my best.
>
>> Alas, it's not that simple. You'll find out as you dig deeper into
>> UnicodeData.txt and its required semantics.
>
> Granted, but there are Unicode libraries (based on UnicodeData.txt)
> available for other languages, let's say Python. One could use the
> included 'upper', 'lower', 'digit', etc. classification to generate a
> 'wchar_t unicode_char_types[]'. From my experience, the Python Unicode
> support is very good. 'Upper'->'Lower' mappings are included even for
> really weird characters such as 'Ⅲ'.

What about UTF-7, UTF-8, UTF-16 (UCS-2), UTF-32 (UCS-4)? wchar_t alone won't cut it. It can't even represent Unicode by itself. In UTF-8, each Unicode character (code point) takes 1 to 4 octets (8-bit bytes). Code points go up to U+10FFFF, so you need 21 bits (in practice a 32-bit type) to hold any code point in a single unit, and wchar_t is not guaranteed to have 32 bits. It is 16 bits on some platforms (and can be as small as 8 bits).

uint32_t is sufficient, but it is very wasteful of memory. UTF-8 is very efficient in memory usage but can have an impact on performance. The only acceptable strategy is to be generic and not fix the data type. It is hairy to implement, but it is the right way to go.

Sure, we can hack it, but I'd rather wait for a more robust solution. The Boost unicode project is close to becoming useful.

Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net
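To make the size argument above concrete, here is a rough sketch, not anything taken from Spirit or from the Boost unicode project. `holds_any_code_point`, `utf8_length` and `append_utf8` are hypothetical names. The first is a compile-time check of whether a single code unit of a given type is wide enough for every code point (21 bits); the other two show the 1-to-4-octet UTF-8 encoding of a code point.

```cpp
#include <climits>
#include <string>

// True if one unit of type CodeUnit has at least 21 bits, which is what
// holding any code point up to U+10FFFF in a single unit requires.
// This looks only at the storage size of the type (an assumption of this sketch).
template <typename CodeUnit>
constexpr bool holds_any_code_point = sizeof(CodeUnit) * CHAR_BIT >= 21;

// Number of octets UTF-8 needs for a code point. Returns 0 for values that
// cannot be encoded: surrogates (U+D800..U+DFFF) and anything above U+10FFFF.
inline int utf8_length(char32_t cp) {
    if (cp < 0x80) return 1;
    if (cp < 0x800) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp < 0x10000) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

// Appends the UTF-8 encoding of cp to out. Returns false and leaves out
// unchanged if cp cannot be encoded.
inline bool append_utf8(char32_t cp, std::string& out) {
    switch (utf8_length(cp)) {
    case 1:
        out += static_cast<char>(cp);
        return true;
    case 2:
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
        return true;
    case 3:
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
        return true;
    case 4:
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
        return true;
    default:
        return false;
    }
}
```

On a platform with a 16-bit wchar_t, `holds_any_code_point<wchar_t>` is false, which is the reason a classification table indexed by a single wchar_t cannot cover all of Unicode there. Code that is generic over the code unit type could branch on a check like this instead of assuming one width.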