svn log --xml not generating valid UTF-8

This usually happens when one of our devs enters a
comment containing non-ASCII text. There's a lot of this in a large legacy
project we've inherited, and it makes it inconvenient to use tools that
post-process the logs to extract information (e.g. statsvn).
|
|
|
Re: svn log --xml not generating valid utf-8

On Fri, Nov 6, 2009 at 18:26, Justin Michel <justin_michel@...> wrote:
> This usually happens when one of our devs enters a comment containing
> non-ascii text. There's a lot of this in a large legacy project we've
> inherited, and it makes it inconvenient to use tools that post-process the
> logs to extract information. (e.g. statsvn)

I suspect:

Subversion <= 1.5 assumes that the bytes for the log message (and
other internal Subversion properties) are UTF-8 but does not actually
verify this.
This works provided the client software does the transcoding. I
believe the svn client learns the local encoding from the environment
variables LANG and friends.
If Subversion believes that the console is using a different encoding
than it actually is, hilarity ensues.

When emitting XML, Subversion just assumes that the bytes it's got are
correctly encoded and drops them into the output.
(I encountered a failure to properly escape & in an earlier release,
but that's neither here nor there and probably long since fixed.)

I believe Subversion >= 1.6 is stricter on this count, rejecting log
messages which do not use the proper encoding (UTF-8) and eol-style
(LF).
But this doesn't help you if your server is older than 1.6, and it
won't help for old commits made with previous releases of Subversion.

Anyway, that's my understanding. Corrections welcome.

// Ben

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2415185

To unsubscribe from this discussion, e-mail: [users-unsubscribe@...].
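
For anyone who needs to feed such a log to a post-processing tool in the meantime, here is a rough Python sketch of one possible workaround: save the output of `svn log --xml` as raw bytes and re-decode only the byte runs that are not valid UTF-8. The function name `repair_utf8` is made up for this example, and the Latin-1 fallback is purely an assumption; the encoding the old commits were actually typed in is not known from this thread, so the right fallback may well be something else.

```python
def repair_utf8(data, fallback="latin-1"):
    """Decode bytes as UTF-8, decoding invalid byte runs with `fallback`.

    `fallback` is a guess at the legacy encoding (an assumption, not
    something Subversion records anywhere).
    """
    pieces = []
    pos = 0
    while pos < len(data):
        try:
            # Everything from pos onward is valid UTF-8: done.
            pieces.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start / err.end are offsets into the slice we tried.
            bad_start = pos + err.start
            bad_end = pos + err.end
            # Keep the valid prefix as UTF-8 ...
            pieces.append(data[pos:bad_start].decode("utf-8"))
            # ... and reinterpret only the offending bytes.
            pieces.append(data[bad_start:bad_end].decode(fallback))
            pos = bad_end
    return "".join(pieces)


def repair_utf8_bytes(data, fallback="latin-1"):
    """Same as repair_utf8, but returns clean UTF-8 bytes."""
    return repair_utf8(data, fallback).encode("utf-8")
```

One way to use it would be to read the saved log in binary mode, pass the bytes through `repair_utf8_bytes`, and write the result to a new file for the downstream tool. This only cleans up the copy of the log; it does not change what is stored in the repository.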
|
|
Re: svn log --xml not generating valid utf-8

As Ben said: Log messages are stored as UTF-8 in the repository. This is
enforced (at commit/propset time) only with 1.6.x servers, but isn't
checked by the svn client (AFAIK).

Are your log messages stored in UTF-8 in the repository? (Check with
'propget --strict svn:log'.)

B. Smith-Mannschott wrote on Fri, 6 Nov 2009 at 19:38 +0100:
> On Fri, Nov 6, 2009 at 18:26, Justin Michel <justin_michel@...> wrote:
> > This usually happens when one of our devs enters a comment containing
> > non-ascii text. There's a lot of this in a large legacy project we've
> > inherited, and it makes it inconvenient to use tools that post-process the
> > logs to extract information. (e.g. statsvn)
>
> I suspect:
>
> Subversion <= 1.5 assumes that the bytes for the log message (and
> other internal subversion properties) are UTF-8 but does not actually
> verify this.
> This works provided the client software does the transcoding. I
> believe the svn client learns this from the environment variables LANG
> and friends.
> If Subversion believes that the console is using a different encoding
> than it actually is, hilarity ensues.
>
> When emitting XML Subversion just assumes that the bytes it's got are
> correctly encoded and drops them into the output.
> (I encountered a failure to properly escape & in an earlier release,
> but that's neither here nor there and probably long since fixed.)
>
> I believe Subversion >= 1.6 is stricter on this count, rejecting log
> messages which do not use the proper encoding (UTF-8) and eol-style
> (LF).
> But this doesn't help you if your server is older than 1.6, and it
> won't help for old commits made with previous releases of Subversion.
>
> Anyway, that's my understanding. Corrections welcome.
>
> // Ben

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=2415251
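
The propget check suggested in this message could be automated along these lines. This is only a sketch: the helper names (`propget_command`, `is_valid_utf8`, `revisions_with_bad_log`) are invented for the example, and `--revprop -r REV` is added to the command because svn:log is a revision property. It also assumes that `svn propget --strict` hands back the stored bytes unchanged; the client may transcode to the local encoding, so running it under a UTF-8 locale is probably the safer setup.

```python
import subprocess


def propget_command(repo_url, revision):
    """Build the svn command line that prints one revision's raw svn:log."""
    return ["svn", "propget", "--revprop", "-r", str(revision),
            "--strict", "svn:log", repo_url]


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def revisions_with_bad_log(repo_url, revisions):
    """Return the revisions whose log message is not valid UTF-8.

    Runs one svn process per revision, so it is slow on large ranges.
    """
    bad = []
    for rev in revisions:
        result = subprocess.run(propget_command(repo_url, rev),
                                check=True, capture_output=True)
        if not is_valid_utf8(result.stdout):
            bad.append(rev)
    return bad
```

Calling `revisions_with_bad_log` with the repository URL and a range of revision numbers would give a list of commits worth inspecting by hand.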