[filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Hi,

I'm trying to use fs::wpath on Linux and I'm encountering some problems with
internal conversion of wide strings to external representation. The problem can
be demonstrated with the following code:

#include <iostream>
#include <locale>
#include <boost/filesystem/path.hpp>
#include <boost/filesystem/convenience.hpp>

namespace fs = boost::filesystem;
int main()
{
    // Setting the global locale to be environment locale
    std::locale::global(std::locale(""));

    // Setting the wpath locale to be global
    fs::wpath_traits::imbue(std::locale());

    fs::wpath mypath(L"/tmp/some/directory");
    fs::create_directories(mypath);
}

To me, this code looks correct and should work.
But it terminates with the following message:

terminate called after throwing an instance of
'boost::filesystem::basic_filesystem_error<
boost::filesystem::basic_path<std::basic_string<wchar_t,
std::char_traits<wchar_t>, std::allocator<wchar_t> >,
boost::filesystem::wpath_traits> >'
  what():  boost::filesystem::wpath::to_external conversion error
Aborted

At the same time, the following code works:

#include <iostream>
#include <locale>
#include <boost/filesystem/path.hpp>
#include <boost/filesystem/convenience.hpp>

#include <libs/filesystem/src/utf8_codecvt_facet.hpp>

namespace fs = boost::filesystem;
int main()
{
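    // A locale takes ownership of a facet passed by pointer and deletes it
    // when the last locale holding it goes away, so allocate it with new.
    std::locale loc( std::locale(), new fs::detail::utf8_codecvt_facet );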
    fs::wpath_traits::imbue( loc );

    fs::wpath mypath(L"/tmp/some/directory");
    fs::create_directories(mypath);
}

which uses a UTF-8 facet from boost itself.

My understanding is that the first example should work too. Who is wrong
here: me or Boost?


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [filesystem] Problems with wpath on Linux

by Eric MALENFANT

Alexei Alexandrov, on 25 January 2008 07:11:

>
> namespace fs = boost::filesystem;
> int main()
> {
>     // Setting the global locale to be environment locale
>     std::locale::global(std::locale(""));
>
>     // Setting the wpath locale to be global
>     fs::wpath_traits::imbue(std::locale());
>
>     fs::wpath mypath(L"/tmp/some/directory");
>     fs::create_directories(mypath);
> }
>
> To me, this code looks correct and should work.
> But it terminates with the following message:
>
> terminate called after throwing an instance of
> 'boost::filesystem::basic_filesystem_error<
> boost::filesystem::basic_path<std::basic_string<wchar_t,
> std::char_traits<wchar_t>, std::allocator<wchar_t> >,
> boost::filesystem::wpath_traits> >'
>   what():  boost::filesystem::wpath::to_external conversion error
> Aborted
>
> At the same time, the following code works:

[snip: Same program as above, using fs::detail::utf8_codecvt_facet instead of the system's locale]

> which uses a UTF-8 facet from boost itself.
>
> The first example should work too - this is my understanding. Who is
> wrong here - me or boost?

... or your platform's implementation of codecvt?

This is a wild guess, as I don't know anything about your platform (in particular, which implementation of the standard library you use, and which locale environment variables are set when you run your program).
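
For reference, a minimal sketch of how the locale details asked about here could be collected from inside a program. The helper names `environment_locale_name` and `env_or_unset` are made up for this illustration; only `std::locale` and `std::getenv` come from the standard library, and the exact string returned by `name()` depends on the standard library implementation.

```cpp
#include <cstdlib>
#include <locale>
#include <stdexcept>
#include <string>

// Name of the environment locale as reported by the standard library.
// std::locale("") throws std::runtime_error when the environment names a
// locale that is not available, so that case is reported instead of aborting.
std::string environment_locale_name()
{
    try {
        return std::locale("").name();
    } catch (const std::runtime_error&) {
        return "(unavailable)";
    }
}

// Value of an environment variable, or "(unset)" if it is not defined.
std::string env_or_unset(const char* name)
{
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string("(unset)");
}
```

Printing `env_or_unset("LC_ALL")`, `env_or_unset("LC_CTYPE")`, `env_or_unset("LANG")` and `environment_locale_name()` would give the kind of information requested above.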


Éric Malenfant
---------------------------------------------
Quidquid latine dictum sit, altum viditur.

Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Eric MALENFANT <Eric.Malenfant <at> sagem-interstar.com> writes:

> > The first example should work too - this is my understanding. Who is
> > wrong here - me or boost?
>
> ... or your platform's implementation of codecvt?
>
> This is a wild guess as I don't know anything about your platform (in
> particular: which implementation of the standard library, and the locale
> environment variables used when running your program)
>

Yes, I should have provided this information in the first place. It's Linux,
x86, gcc 3.4.6. The locale is en_US.UTF-8.




Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Alexei Alexandrov wrote:

> Eric MALENFANT <Eric.Malenfant <at> sagem-interstar.com> writes:
>
>>> The first example should work too - this is my understanding. Who is
>>> wrong here - me or boost?
>> ... or your platform's implementation of codecvt?
>>
>> This is a wild guess as I don't know anything about your platform (in
>> particular: which implementation of the standard library, and the locale
>> environment variables used when running your program)
>
> Yes, I should have provided this information in the first place. It's Linux,
> x86, gcc 3.4.6. Locale is en_US.UTF-8
>

So is this information enough? This issue is a real showstopper for me
since it doesn't seem to be possible to use wpath on Linux correctly on
systems where locale is not UTF-8!

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Beman Dawes

Alexei Alexandrov wrote:
> Alexei Alexandrov wrote:
>
> So is this information enough? This issue is a real showstopper for me
> since it doesn't seem to be possible to use wpath on Linux correctly on
> systems where locale is not UTF-8!

What version of Boost are you using?

--Beman



Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Beman Dawes wrote:
> Alexei Alexandrov wrote:
>> Alexei Alexandrov wrote:
>>
>> So is this information enough? This issue is a real showstopper for me
>> since it doesn't seem to be possible to use wpath on Linux correctly on
>> systems where locale is not UTF-8!
>
> What version of Boost are you using?
>

Ah, sorry again for not providing these details - I'm using
boost.filesystem from 1.34.1 release.

I also ran the failing use case under valgrind - it showed a number of
"conditional jumps on uninitialized value" somewhere deep under
libstdc++ and also a couple of "4 bytes uninitialized read". I don't
know if it's related to the problem though - I was just trying to do
what I can.

The problem is rather serious for me - I'm ready to do whatever is
needed to help the boost.filesystem maintainer (is it you?) investigate
and fix the problem.

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Alexei Alexandrov wrote:

> Beman Dawes wrote:
>>> Alexei Alexandrov wrote:
>>>
>>> So is this information enough? This issue is a real showstopper for me
>>> since it doesn't seem to be possible to use wpath on Linux correctly on
>>> systems where locale is not UTF-8!
>> What version of Boost are you using?
>>
>
> Ah, sorry again for not providing these details - I'm using
> boost.filesystem from 1.34.1 release.
>
> I also ran the failing use case under valgrind - it showed a number of
> "conditional jumps on uninitialized value" somewhere deep under
> libstdc++ and also a couple of "4 bytes uninitialized read". I don't
> know if it's related to the problem though - I was just trying to do
> what I can.
>
> The problem is rather serious for me - I'm ready to do whatever is
> needed to help the boost.filesystem maintainer (is it you?) investigate
> and fix the problem.
>

Beman, is there any way I can help with investigating or fixing this issue?
I also wonder whether wpath is tested on Linux as part of the Boost test
suite. I mean, am I the only one who has reported this problem, or has
nobody used wchar_t with the standard codecvt on Linux so far?

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Jens Seidel

On Mon, Jan 28, 2008 at 10:02:26AM +0300, Alexei Alexandrov wrote:

> Alexei Alexandrov wrote:
> > I also ran the failing use case under valgrind - it showed a number of
> > "conditional jumps on uninitialized value" somewhere deep under
> > libstdc++ and also a couple of "4 bytes uninitialized read". I don't
> > know if it's related to the problem though - I was just trying to do
> > what I can.
> >
> > The problem is rather serious for me - I'm ready to do whatever is
> > needed to help the boost.filesystem maintainer (is it you?) investigate
> > and fix the problem.
> >
>
> Beman, is there any way to help with investigating/fixing this issue? I
> also wonder whether wpath is being tested on Linux as part of Boost test
> suite? I mean, am I the only one who has reported this problem, or has
> nobody used wchar_t with the standard codecvt on Linux so far?

There is simply no need for wchar_t on Linux. If you use a classical
encoding in your filesystem, it is an 8-bit one (unless you use an Asian
language such as Japanese). All modern distributions have already switched
to UTF-8 as the default encoding, and for this you don't need wchar_t
either. Use ordinary char* streams for this ...

Remember that with UTF-8 you can always find where the current character
ends, even if you only have a pointer to an arbitrary byte (possibly in the
middle of a multi-byte character). It's also useless to group bytes
pairwise, as a valid UTF-8 character can consist of more than two bytes.
char* is really sufficient.
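
The self-synchronizing property described above can be sketched as follows. This is only an illustration that assumes well-formed UTF-8 input; the function names `is_continuation`, `char_start` and `char_end` are invented for the example. It relies on the fact that every continuation byte in UTF-8 has the bit pattern 10xxxxxx, while lead bytes and ASCII bytes never do.

```cpp
#include <cstddef>
#include <string>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool is_continuation(char byte)
{
    return (static_cast<unsigned char>(byte) & 0xC0) == 0x80;
}

// Index of the first byte of the character that contains position pos.
// Assumes pos < s.size() and that s holds well-formed UTF-8.
std::size_t char_start(const std::string& s, std::size_t pos)
{
    while (pos > 0 && is_continuation(s[pos]))
        --pos;
    return pos;
}

// Index one past the last byte of the character that contains position pos.
// Assumes pos < s.size() and that s holds well-formed UTF-8.
std::size_t char_end(const std::string& s, std::size_t pos)
{
    ++pos;
    while (pos < s.size() && is_continuation(s[pos]))
        ++pos;
    return pos;
}
```

For a string holding a three-byte character, `char_start` and `char_end` called on any of its three byte positions return the same pair of boundaries, which is also why grouping bytes in pairs does not help.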

Jens

Re: [filesystem] Problems with wpath on Linux

by James Talbut

 

> Jens Seidel wrote
> There is simply no need for wchar_t on Linux. If you use a
> classical encoding in your filesystem, it is an 8-bit one
> (unless you use an Asian language such as Japanese). All
> modern distributions have already switched to UTF-8 as the
> default encoding, and for this you don't need wchar_t either.
> Use ordinary char* streams for this ...
>
> Remember that with UTF-8 you can always find where the current
> character ends, even if you only have a pointer to an arbitrary
> byte (possibly in the middle of a multi-byte character). It's
> also useless to group bytes pairwise, as a valid UTF-8 character
> can consist of more than two bytes. char* is really sufficient.
>

wchar_t is required on Windows; if Linux doesn't support it fully,
cross-platform work is complicated.

Jim


Re: [filesystem] Problems with wpath on Linux

by Eric MALENFANT

Alexei Alexandrov wrote:
> I mean, am I the only one who has reported this problem, or has
> nobody used wchar_t with the standard codecvt on Linux so far?

We use boost::filesystem::wpaths on Linux without problems (that I'm aware of).

Also, IIUC, explicitly imbue()-ing the environment locale (std::locale("")) is not necessary, as it seems to be the default (look at libs/filesystem/src/path.cpp).

Éric Malenfant
---------------------------------------------
Why is lemon juice made with artificial flavor,
and dishwashing liquid made with real lemons?

Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Jens Seidel wrote:

>
> There is simply no need for wchar_t on Linux. If you use a classical
> encoding in your filesystem, it is an 8-bit one (unless you use an Asian
> language such as Japanese). All modern distributions have already switched
> to UTF-8 as the default encoding, and for this you don't need wchar_t
> either. Use ordinary char* streams for this ...
>
> Remember that with UTF-8 you can always find where the current character
> ends, even if you only have a pointer to an arbitrary byte (possibly in the
> middle of a multi-byte character). It's also useless to group bytes
> pairwise, as a valid UTF-8 character can consist of more than two bytes.
> char* is really sufficient.
>

This is more of a design-choice discussion. As for Boost, it has wpath
in its interfaces on Linux, so support is claimed. We made a design
choice to use wchar_t cross-platform, since the code targets both Windows
and Linux. This is why I want to get it working.

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Eric MALENFANT wrote:
> Alexei Alexandrov wrote:
>> I mean, am I the only one who has reported this problem, or has
>> nobody used wchar_t with the standard codecvt on Linux so far?
>
> We use boost::filesystem::wpaths on Linux without problems (that I'm aware of).
>
> Also, IIUC, explicitly imbue()-ing the environment locale (std::locale("")) is not necessary, as it seems to be the default (look at libs/filesystem/src/path.cpp)
>

This is very valuable information, thanks!

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Beman Dawes

Alexei Alexandrov wrote:

> Alexei Alexandrov wrote:
>> Beman Dawes wrote:
>>>> Alexei Alexandrov wrote:
>>>>
>>>> So is this information enough? This issue is a real showstopper for me
>>>> since it doesn't seem to be possible to use wpath on Linux correctly on
>>>> systems where locale is not UTF-8!
>>> What version of Boost are you using?
>>>
>> Ah, sorry again for not providing these details - I'm using
>> boost.filesystem from 1.34.1 release.
>>
>> I also ran the failing use case under valgrind - it showed a number of
>> "conditional jumps on uninitialized value" somewhere deep under
>> libstdc++ and also a couple of "4 bytes uninitialized read". I don't
>> know if it's related to the problem though - I was just trying to do
>> what I can.
>>
>> The problem is rather serious for me - I'm ready to do whatever is
>> needed to help the boost.filesystem maintainer (is it you?) investigate
>> and fix the problem.
>>
>
> Beman, is there any way to help with investigating/fixing this issue? I
> also wonder whether wpath is being tested on Linux as part of Boost test
> suite?

Yes. See boost-root/libs/filesystem/test/wide_test.cpp.

> I mean, am I the only one who has reported this problem, or has
> nobody used wchar_t with the standard codecvt on Linux so far?

You are the only one I can recall reporting this problem, but I don't have
any way to know how widely used the facility is on Linux.

--Beman



Re: [filesystem] Problems with wpath on Linux

by Beman Dawes

Alexei Alexandrov wrote:

> Beman Dawes wrote:
>> Alexei Alexandrov wrote:
>>> Alexei Alexandrov wrote:
>>>
>>> So is this information enough? This issue is a real showstopper for me
>>> since it doesn't seem to be possible to use wpath on Linux correctly on
>>> systems where locale is not UTF-8!
>> What version of Boost are you using?
>>
>
> Ah, sorry again for not providing these details - I'm using
> boost.filesystem from 1.34.1 release.
>
> I also ran the failing use case under valgrind - it showed a number of
> "conditional jumps on uninitialized value" somewhere deep under
> libstdc++ and also a couple of "4 bytes uninitialized read". I don't
> know if it's related to the problem though - I was just trying to do
> what I can.
>
> The problem is rather serious for me - I'm ready to do whatever is
> needed to help the boost.filesystem maintainer (is it you?) investigate
> and fix the problem.

I'm the maintainer.

One possible way to isolate the problem is to try a codecvt operation on
the locale of interest without involving any boost code at all. If that
works, the problem is likely within Boost.Filesystem. But if that fails,
the problem is with the locale or use of it, not with Boost.Filesystem.
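
A minimal sketch of such a Boost-free check, under the assumption that the conversion of interest is wide-to-narrow through the locale's standard `std::codecvt<wchar_t, char, std::mbstate_t>` facet. The function name `wide_to_narrow` and the buffer sizing are choices made for this example and are not taken from Boost.Filesystem. Note that the conversion state is value-initialized before use, since `out()` reads it.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

typedef std::codecvt<wchar_t, char, std::mbstate_t> wide_codecvt;

// Converts `wide` to the narrow encoding of `loc` using only the standard
// library. Returns false if the facet reports anything other than success.
bool wide_to_narrow(const std::locale& loc, const std::wstring& wide,
                    std::string& narrow)
{
    const wide_codecvt& cvt = std::use_facet<wide_codecvt>(loc);

    // Start from the initial conversion state; an uninitialized state
    // would make the result unpredictable.
    std::mbstate_t state = std::mbstate_t();

    // max_length() bounds the number of bytes produced per wide character.
    const std::size_t per_char = static_cast<std::size_t>(cvt.max_length());
    std::string buffer(wide.size() * per_char + 1, '\0');

    const wchar_t* from_next = 0;
    char* to_next = 0;
    const std::codecvt_base::result res = cvt.out(
        state,
        wide.data(), wide.data() + wide.size(), from_next,
        &buffer[0], &buffer[0] + buffer.size(), to_next);

    if (res != std::codecvt_base::ok)
        return false;

    // Keep only the bytes that were actually written.
    narrow.assign(&buffer[0], to_next);
    return true;
}
```

Calling it with `std::locale("")` and a path containing non-ASCII characters exercises the same facet that the failing program imbues. A `false` result there would point at the locale or the standard library rather than at Boost.Filesystem.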

I've got a Linux system here I can test on, but I'm not very familiar
with Linux, so I'm hesitant to start testing here, as it always takes me
a while to come up to speed on Linux.

--Beman

Re: [filesystem] Problems with wpath on Linux

by Jamie Allsop

Beman Dawes wrote:
> Alexei Alexandrov wrote:
[ snip ]
>
>> I mean, am I the only one who has reported this problem, or has
>> nobody used wchar_t with the standard codecvt on Linux so far?
>
> You are the only one I can recall reporting this problem, but I don't have
> any way to know how widely used the facility is on Linux.

I'm currently using wpath on Linux as I'm writing a gui app using
wxWidgets that I want to run on Windows also. This seemed to be the best
approach. To get it to work I had to imbue wpath_traits with the
experimental UTF8 locale as the example (wide_test.cpp I think) does.

I have yet to really try this in anger, but regardless, I too am
interested in this.

Jamie


Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Jamie Allsop wrote:

> Beman Dawes wrote:
> To get it to work I had to imbue wpath_traits with the
> experimental UTF8 locale as the example (wide_test.cpp I think) does.
>
> I have yet to really try this in anger, but regardless, I too am
> interested in this.

It works fine for me too when imbuing the Boost UTF-8 codecvt facet (see
the original post). But somehow it doesn't work when imbuing the system
codecvt (the system locale encoding is UTF-8).

That is what looks like a bug. I don't know yet where it is: in Boost, in
my code, or in my libstdc++ implementation (gcc 3.4.6).

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Beman Dawes wrote:
> Alexei Alexandrov wrote:
>> Beman, is there any way to help with investigating/fixing this issue? I
>> also wonder whether wpath is being tested on Linux as part of Boost test
>> suite?
>
> Yes. See boost-root/libs/filesystem/test/wide_test.cpp.
>

The test imbues the Boost UTF-8 codecvt facet, and this works fine for me
too. What doesn't work is imbuing the system locale codecvt.

I'll look into it further. I might have suspected a libstdc++ bug, but I
don't think it is one, because I tested the same locale by imbuing it into
a wofstream and writing some international wchar_t data, and the data in
the file was converted to UTF-8 properly.

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Beman Dawes wrote:
> Alexei Alexandrov wrote:
>> Beman Dawes wrote:
>
> One possible way to isolate the problem is to try a codecvt operation on
> the locale of interest without involving any boost code at all. If that
> works, the problem is likely within Boost.Filesystem. But if that fails,
> the problem is with the locale or use of it, not with Boost.Filesystem.
>

This is what I did. It was something like

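#include <fstream>
#include <locale>

// utf8_to_wide() is a helper (not shown here) that converts a UTF-8
// std::string to a std::wstring.
int main()
{
     std::wofstream of("test.txt");
     // Use the environment locale's codecvt facet for the output conversion
     of.imbue(std::locale(""));

     of << utf8_to_wide("Some Russian string in UTF-8") << std::endl;
}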

and the data in the output file appeared correctly as UTF-8.
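
The `utf8_to_wide` helper used above is not shown in the thread. A hypothetical stand-in could look like the sketch below; it assumes a 32-bit `wchar_t` (as on Linux) so that each code point fits in one element, and it only performs basic structural checks (it does not reject overlong encodings or surrogate code points).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 into one wchar_t per code point.
// Assumes wchar_t is wide enough to hold every decoded code point.
std::wstring utf8_to_wide(const std::string& utf8)
{
    std::wstring wide;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        unsigned long code_point;
        std::size_t length;

        // The lead byte determines the sequence length and the first bits.
        if (lead < 0x80)                { code_point = lead;        length = 1; }
        else if ((lead & 0xE0) == 0xC0) { code_point = lead & 0x1F; length = 2; }
        else if ((lead & 0xF0) == 0xE0) { code_point = lead & 0x0F; length = 3; }
        else if ((lead & 0xF8) == 0xF0) { code_point = lead & 0x07; length = 4; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + length > utf8.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte (10xxxxxx) contributes six more bits.
        for (std::size_t k = 1; k < length; ++k) {
            const unsigned char next = static_cast<unsigned char>(utf8[i + k]);
            if ((next & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            code_point = (code_point << 6) | (next & 0x3F);
        }

        wide.push_back(static_cast<wchar_t>(code_point));
        i += length;
    }
    return wide;
}
```

With a helper like this, the test program above depends only on the standard library, so it exercises the locale's codecvt facet without any Boost code involved.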

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Alexei Alexandrov-2

Eric MALENFANT wrote:
> Alexei Alexandrov wrote:
>> I mean, am I the only one who has reported this problem, or has
>> nobody used wchar_t with the standard codecvt on Linux so far?
>
> We use boost::filesystem::wpaths on Linux without problems (that I'm aware of).

Additional question: are you sure you use fs::wpaths with the environment
locale, and not the Boost UTF-8 codecvt facet (which is what
libs/filesystem/test/wide_test.cpp does)? The Boost UTF-8 codecvt facet
works fine for me too, but I want to get the system locale working. I don't
want to rely on the system locale encoding being UTF-8.

 > Also, IIUC, explicitly imbue()-ing the environment locale
 > (std::locale("")) is not necessary,
 > as it seems to be the default (look at libs/filesystem/src/path.cpp)

This is true. So you don't imbue anything at all and it works for you?
I'd really appreciate this clarification.

Thanks a lot to you and to all who are helping me in this thread!

--
Alexei Alexandrov


Re: [filesystem] Problems with wpath on Linux

by Eric MALENFANT

Alexei Alexandrov, on 29 January 2008 01:07:

> Eric MALENFANT wrote:
>> Alexei Alexandrov wrote:
>>> I mean, am I the only one who has reported this problem, or has
>>> nobody used wchar_t with the standard codecvt on Linux so far?
>>
>> We use boost::filesystem::wpaths on Linux without problems (that I'm
>> aware of).
>
> Additional question: are you sure you use fs::wpaths with the environment
> locale, and not the Boost UTF-8 codecvt facet (which is what
> libs/filesystem/test/wide_test.cpp does)? The Boost UTF-8 codecvt facet
> works fine for me too, but I want to get the system locale working. I don't
> want to rely on the system locale encoding being UTF-8.
>
>  > Also, IIUC, explicitly imbue()-ing the environment locale
>  > (std::locale("")) is not necessary,
>  > as it seems to be the default (look at libs/filesystem/src/path.cpp)
>
> This is true. So you don't imbue anything at all and it works for you?
> I'd really appreciate this clarification.

I just did a full search for "imbue" in our entire codebase, and the only occurrences I found were on iostreams.
So yes, we don't imbue anything, and it works for us.


Éric Malenfant
---------------------------------------------
In business, if two people always agree, one of them is unnecessary.