UTF-8 mailbox names in filesystem

UTF-8 mailbox names in filesystem

by Timo Sirainen

Currently mailbox names are stored in IMAP's modified-UTF-7 format in
filesystem. I was wondering about changing this in v2.0. The default
would still be to use mUTF-7 in filesystem, but just adding :UTF8 or
something to mail_location could enable UTF-8.

Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of
weird characters, perhaps no one really wants to see them on filesystem
since there's no way to type the characters? But for small systems this
probably isn't a problem.
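
Illustration (a minimal Python sketch, not Dovecot code): the same mailbox name as it would appear in IMAP's modified UTF-7 (RFC 3501, section 5.1.3) and as raw UTF-8 bytes. The helper `encode_mutf7` and the sample name are assumptions made for this example.

```python
import base64


def encode_mutf7(name: str) -> str:
    """Encode a mailbox name as IMAP modified UTF-7 (RFC 3501, 5.1.3)."""
    out = []
    pending = []  # non-ASCII characters waiting to be base64-encoded

    def flush() -> None:
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
            # Modified base64 uses "," instead of "/" and no "=" padding.
            out.append("&" + b64.replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if ch == "&":
            flush()
            out.append("&-")  # a literal ampersand is written as "&-"
        elif 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append(ch)  # printable US-ASCII represents itself
        else:
            pending.append(ch)
    flush()
    return "".join(out)


name = "Entwürfe"  # sample mailbox name, chosen for illustration
mutf7_name = encode_mutf7(name)
utf8_bytes = name.encode("utf-8")
print(mutf7_name)  # name as stored today (mUTF-7)
print(utf8_bytes)  # bytes a UTF-8 directory name would contain
```

The mUTF-7 form is pure ASCII, while the UTF-8 form contains octets >= 0x80 that the filesystem and any tools touching it must pass through unchanged.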




Re: UTF-8 mailbox names in filesystem

by Laurent Blume

Quoting Timo Sirainen <tss@...>:
> Currently mailbox names are stored in IMAP's modified-UTF-7 format in
> filesystem. I was wondering about changing this in v2.0. The default
> would still be to use mUTF-7 in filesystem, but just adding :UTF8 or
> something to mail_location could enable UTF-8.
>
> Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of
> weird characters, perhaps no one really wants to see them on filesystem
> since there's no way to type the characters? But for small systems this
> probably isn't a problem.

I would personally find it useful. I use accented and Chinese  
characters, and I've worked in environments where they were common as  
well. Having a common name between MUA and FS would certainly be nice.

As for the risks, maybe some Unicode ranges could be restricted to  
avoid control characters and such? Or limit the use to given subsets?
It might be useful as well to be able to enable it on a per-user basis.
Would that add too much complexity?
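
Illustration (a minimal Python sketch, not Dovecot code): one way such a restriction could look, rejecting names that contain characters from selected Unicode general categories. The category list and the function names are a hypothetical policy invented for this example, not something proposed in the thread.

```python
import unicodedata
from typing import List

# Hypothetical policy: reject control (Cc), format (Cf), surrogate (Cs),
# private-use (Co), unassigned (Cn) and line/paragraph separator (Zl, Zp)
# characters. Everything else, including accented and CJK letters, passes.
REJECTED_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp"}


def find_rejected_chars(name: str) -> List[str]:
    """Return the code points in `name` that the policy rejects."""
    return [
        f"U+{ord(ch):04X}"
        for ch in name
        if unicodedata.category(ch) in REJECTED_CATEGORIES
    ]


def is_acceptable_mailbox_name(name: str) -> bool:
    """True if the name is non-empty and contains no rejected characters."""
    return name != "" and not find_rejected_chars(name)


for candidate in ["Entwürfe", "收件箱", "bad\x07name", "rtl\u202eoverride"]:
    print(repr(candidate), is_acceptable_mailbox_name(candidate))
```

Rejecting the whole Cf category is a blunt choice: it blocks bidirectional override characters, but also zero-width joiners that some scripts use legitimately, so a real policy would likely need to be finer-grained.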

I think of it as a nice feature, but not a critical one.

Laurent


Re: UTF-8 mailbox names in filesystem

by Geert Hendrickx

On Mon, Nov 09, 2009 at 09:11:23PM -0500, Timo Sirainen wrote:
> Currently mailbox names are stored in IMAP's modified-UTF-7 format in
> filesystem. I was wondering about changing this in v2.0. The default
> would still be to use mUTF-7 in filesystem, but just adding :UTF8 or
> something to mail_location could enable UTF-8.
>
> Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of
> weird characters, perhaps no one really wants to see them on filesystem
> since there's no way to type the characters? But for small systems this
> probably isn't a problem.


What's the advantage?


        Geert


--
Geert Hendrickx  -=-  ghen@...  -=-  PGP: 0xC4BB9E9F
This e-mail was composed using 100% recycled spam messages!

Re: UTF-8 mailbox names in filesystem

by Steffen Kaiser

On Tue, 10 Nov 2009, Laurent Blume wrote:

> I would personally find it useful. I use accented and Chinese characters, and

I, too.

> I've worked in environments where they were common as well. Having a common
> name between MUA and FS would certainly be nice.

It would be nicer for some scripts and plugins as well.
Will there be an API to match folder names, upper and lower case etc.pp.?

> As for the risks, maybe some Unicode ranges could be restricted to avoid
> control characters and such? Or limit the use to given subsets?

UTF8 does use octets >= 0x80, every system should be 8bit clean nowadays.

regards,

--
Steffen Kaiser

Re: UTF-8 mailbox names in filesystem

by Timo Sirainen

On Nov 10, 2009, at 9:02 AM, Steffen Kaiser wrote:

>> I've worked in environments where they were common as well. Having  
>> a common name between MUA and FS would certainly be nice.
>
> It would be nicer for some scripts and plugins as well.
> Will there be an API to match folder names, upper and lower case  
> etc.pp.?

Mailbox names have always been case-sensitive. So you could use some  
generic UTF-8 functions if you really needed to, but other than that I  
wasn't planning on doing anything.
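
Illustration (a minimal Python sketch, not a Dovecot API): what "generic UTF-8 functions" for case-insensitive matching might look like in a script that reads UTF-8 folder names from disk. The helper names are hypothetical, and the normalize/casefold recipe is an approximation of Unicode caseless matching, not a rule the server applies.

```python
import unicodedata


def match_key(name: str) -> str:
    """Build a comparison key: decompose, case-fold, decompose again."""
    folded = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", folded)


def names_match(a: str, b: str) -> bool:
    """Case-insensitive, normalization-insensitive comparison of two names."""
    return match_key(a) == match_key(b)


# The server itself compares names exactly, so these are different mailboxes:
print("Entwürfe" == "ENTWÜRFE")
# A script that wants looser matching has to do the folding itself:
print(names_match("Entwürfe", "ENTWÜRFE"))
print(names_match("Café", "Cafe\u0301"))  # precomposed vs. combining accent
```

Because mailbox names are case-sensitive, two names that this helper treats as equal can still be distinct mailboxes; the key is only useful for searching, not for deciding identity.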


1.1.20 patch - PAM file cleanup

by Bugzilla from toddr@cpanel.net

Hi,

This patch prevents a temp file cleanup issue related to PAM. It  
appears to be relevant only to 1.1 dovecot code. It seems pretty  
sensible. From what I can tell, it's not included in the 1.1.20  
release. I'm not clear if it was ever reported. Could it be merged in  
for future releases?

http://cvs.fedora.redhat.com/viewvc/rpms/dovecot/F-8/dovecot-1.0.rc2-pam-setcred.patch?revision=1.1&view=markup

Thanks,
Todd Rinaldo

Re: 1.1.20 patch - PAM file cleanup

by Timo Sirainen

On Tue, 2009-11-10 at 12:25 -0600, Todd Rinaldo wrote:
> This patch prevents a temp file cleanup issue related to PAM. It  
> appears to be relevant only to 1.1 dovecot code. It seems pretty  
> sensible. From what I can tell, it's not included in the 1.1.20  
> release. I'm not clear if it was ever reported. Could it be merged in  
> for future releases?
>
> http://cvs.fedora.redhat.com/viewvc/rpms/dovecot/F-8/dovecot-1.0.rc2-pam-setcred.patch?revision=1.1&view=markup

No need to patch. Just use:

passdb pam {
  args = setcred=no
}




Re: UTF-8 mailbox names in filesystem

by Ben Winslow

On Mon, 09 Nov 2009 21:11:23 -0500
Timo Sirainen <tss@...> wrote:

> Currently mailbox names are stored in IMAP's modified-UTF-7 format in
> filesystem. I was wondering about changing this in v2.0. The default
> would still be to use mUTF-7 in filesystem, but just adding :UTF8 or
> something to mail_location could enable UTF-8.
>
> Any thoughts? Could this be dangerous somehow? UTF-8 enables a lot of
> weird characters, perhaps no one really wants to see them on
> filesystem since there's no way to type the characters? But for small
> systems this probably isn't a problem.

A while ago, I was playing around with the idea of encoded '/'s in
Maildir names since many people have asked for a way to use them.
UTF-7 does not require that each character be representable in only 1
way like UTF-8 does, so it's possible to encode US-ASCII characters and
put them into the folder name; however, I found that most clients
decode any mUTF-7 in folder names while parsing LIST/LSUB replies and
then discard the name given by the server (expecting that they can just
re-encode any non-ASCII characters and still arrive at the correct
folder name.)  While I would argue that these clients are buggy, the
bug seems to be so common that encoding characters this way isn't
practical.  With that in mind, you do lose the ability to encode
characters like this if the folder names on disk are UTF8, but that's
not much of a loss anyway if UTF8 encoding is optional.
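
Illustration (a minimal Python sketch, not taken from any client's source): the round trip described above. A name in which "/" has been hidden inside a base64 section decodes to a plain "/", and a client that re-encodes the decoded string sends back a different name. `decode_mutf7` and the sample name are assumptions made for this example, and the input is assumed to be well-formed.

```python
import base64
import re


def decode_mutf7(encoded: str) -> str:
    """Decode IMAP modified UTF-7; "&-" stands for a literal ampersand."""

    def decode_chunk(match: re.Match) -> str:
        chunk = match.group(1)
        if chunk == "":
            return "&"
        b64 = chunk.replace(",", "/")   # undo the "," substitution
        b64 += "=" * (-len(b64) % 4)    # restore the stripped padding
        return base64.b64decode(b64).decode("utf-16-be")

    return re.sub(r"&([A-Za-z0-9+,]*)-", decode_chunk, encoded)


server_name = "Lists&AC8-Announce"   # hypothetical name with an encoded "/"
decoded = decode_mutf7(server_name)  # what the client displays

# The decoded name is printable US-ASCII without "&", so a client that
# re-encodes it canonically sends the decoded string unchanged.
assert all(0x20 <= ord(c) <= 0x7E and c != "&" for c in decoded)
client_name = decoded

print(decoded)
print(client_name == server_name)
```

The comparison fails because the client's re-encoded name contains a literal "/", which the server would treat as a hierarchy separator (or a different name) instead of the original folder.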

So far as UTF-8 on the filesystem is concerned, I've been using UTF-8
in filenames on my personal systems for years now without any real
issues.

--
Ben Winslow <rain@...>

Re: UTF-8 mailbox names in filesystem

by Timo Sirainen

On Tue, 2009-11-10 at 17:24 -0500, Ben Winslow wrote:
> A while ago, I was playing around with the idea of encoded '/'s in
> Maildir names since many people have asked for a way to use them.
> UTF-7 does not require that each character be representable in only 1
> way like UTF-8 does, so it's possible to encode US-ASCII characters and
> put them into the folder name;

This is explicitly disallowed by RFC 3501:

"Modified BASE64 MUST NOT be used to represent any printing US-ASCII
character which can represent itself."

Anyway, listescape plugin can already do that, as long as you're not
using '/' as hierarchy separator.
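
Illustration (a minimal Python sketch, not Dovecot's actual validation code): a check for the RFC 3501 rule quoted above, flagging names whose modified-base64 sections decode to printable US-ASCII characters that could have represented themselves. The function name is hypothetical and the input is assumed to be otherwise well-formed mUTF-7.

```python
import base64
import re


def has_overencoded_ascii(encoded: str) -> bool:
    """Return True if a base64 section hides self-representing US-ASCII."""
    for chunk in re.findall(r"&([A-Za-z0-9+,]+)-", encoded):
        b64 = chunk.replace(",", "/")   # undo the "," substitution
        b64 += "=" * (-len(b64) % 4)    # restore the stripped padding
        text = base64.b64decode(b64).decode("utf-16-be")
        for ch in text:
            # "&" is excluded: it cannot represent itself and has its own
            # escape ("&-"), so the quoted sentence does not cover it.
            if 0x20 <= ord(ch) <= 0x7E and ch != "&":
                return True
    return False


for candidate in ["Entw&APw-rfe", "Lists&AC8-Announce", "A&-B"]:
    print(candidate, has_overencoded_ascii(candidate))
```

Only the second name is flagged: its base64 section decodes to "/", which is printable US-ASCII and therefore must not be base64-encoded.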




Re: UTF-8 mailbox names in filesystem

by Joseph Yee

On 10-Nov-09, at 9:02 AM, Steffen Kaiser wrote:

> On Tue, 10 Nov 2009, Laurent Blume wrote:
>
>> I would personally find it useful. I use accented and Chinese  
>> characters, and
>
> I, too.
Same here.

>
>> I've worked in environments where they were common as well. Having  
>> a common name between MUA and FS would certainly be nice.
>
> It would be nicer for some scripts and plugins as well.
> Will there be an API to match folder names, upper and lower case  
> etc.pp.?


>
>> As for the risks, maybe some Unicode ranges could be restricted to  
>> avoid control characters and such? Or limit the use to given subsets?
>
> UTF8 does use octets >= 0x80, every system should be 8bit clean  
> nowadays.

I have a worry rather than a risk to raise: some MUAs may convert the
name before passing it on, so that it no longer matches... but maybe
Timo has thought about this already :)

Other than looking odd on the file system to a sysadmin who does not
read the language, especially with bidirectional text, there should
be no issue.

best,
Joseph

>
> regards,
>
> - -- Steffen Kaiser