|
View:
New views
8 Messages
—
Rating Filter:
Alert me
|
|
|
UTF-8 combine_path() bugSomething crashes hard when using combine_path() with a UTF-8 filename.
"subfolder" contains UTF-8 characters, "path" contains no UTF-8 characters: string x = combine_path(path, subfolder); // This crashes combine_path(path, subfolder); // This does not crash I'm running Windows 7 RC 64-bit and Pike 7.8.352. === Crash details === Faulting application name: pike.exe, version: 0.0.0.0, time stamp: 0x4aba8624 Faulting module name: MSVCR90.dll, version: 9.0.30729.4918, time stamp: 0x49d43da7 Exception code: 0xc0000005 Fault offset: 0x0003c15e Faulting process id: 0xf50 Faulting application start time: 0x01ca533c3efca9cc Faulting application path: C:\Program Files (x86)\Pike\bin\pike.exe Faulting module path: C:\Windows\WinSxS\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4918_none_508da958bcbd2845\MSVCR90.dll // Pontus Rodling |
|
|
Re: UTF-8 combine_path() bugPontus Rodling <pontus@...> wrote:
> Something crashes hard when using combine_path() with a UTF-8 > filename. Can you please provide a complete test program? |
|
|
Re: UTF-8 combine_path() bugMartin Stjernholm skrev:
> Pontus Rodling <pontus@...> wrote: > > >> Something crashes hard when using combine_path() with a UTF-8 >> filename. >> > > Can you please provide a complete test program? > > Simply put a folder with a UTF-8 name somewhere in c:/test and it should cause it to crash. As an example, c:/test/パイク crashes Pike. // Pontus --- int foldercount; array folders = ({ }); int main() { folders += ({ Folder("C:/test"), }); foreach (folders, Folder f) { write("Indexing: %s\n", f->path); f->deepindex(); } write("Folders: %d\n", foldercount); write("Press enter to quit... "); Stdio.stdin->read(1); } class Folder { string name; Folder parent; string path; private array subfolders; private array files; void create(string n, Folder|void p) { name = n; parent = p; if (parent) path = combine_path(parent->path, name); else path = name; foldercount++; } void deepindex() { array a = get_subfolders(); foreach (a, Folder f) { f->deepindex(); } } array get_subfolders() { if (!subfolders) { subfolders = ({ }); array a = sort(get_dir(path)); foreach (a, string b) { // CRASHES HERE if (Stdio.is_dir(combine_path(path, b))) { subfolders += ({ Folder(b, this) }); } } } return copy_value(subfolders); } } |
|
|
Re: UTF-8 combine_path() bugPontus Rodling <pontus@...> wrote:
> This is the code I was working on when i triggered the bug. > Simply put a folder with a UTF-8 name somewhere in c:/test and it > should cause it to crash. Getting a better minimized test case would have been nice. I reduced it to this: #charset utf-8 int main() { werror ("%O\n", combine_path_nt ("test", "€uro")); } Annoyingly enough it doesn't fail on unix (neither 32- nor 64-bit) and not in windbg. Where is Valgrind4Windows? :P |
|
|
Re: UTF-8 combine_path() bugMartin Stjernholm skrev:
> Pontus Rodling <pontus@...> wrote: > > >> This is the code I was working on when i triggered the bug. >> Simply put a folder with a UTF-8 name somewhere in c:/test and it >> should cause it to crash. >> > > Getting a better minimized test case would have been nice. I reduced > it to this: > > #charset utf-8 > int main() > { > werror ("%O\n", combine_path_nt ("test", "€uro")); > } > > Annoyingly enough it doesn't fail on unix (neither 32- nor 64-bit) and > not in windbg. Where is Valgrind4Windows? :P > I should also learn to put code as an attachment and not inline, doh! Anyway, a custom combine_path function will have to do for now. // Pontus |
|
|
Re: UTF-8 combine_path() bugMartin Stjernholm <mast@...> wrote:
> #charset utf-8 > int main() > { > werror ("%O\n", combine_path_nt ("test", "€uro")); > } > > Annoyingly enough it doesn't fail on unix (neither 32- nor 64-bit) and > not in windbg. Where is Valgrind4Windows? :P By pure luck I managed to catch it in a debugger. It looks like an out-of-bounds memory access in isalpha() in the microsoft CRT when it's given a 16-bit value (crashes only if the memory isn't mapped, which depends on loaded dll's, load order, sizes of compilation units, and similar phase-of-the-moon factors). The bug only occurs if the first char in the second arg is wide. I've checked in a workaround, which is to simply avoid isalpha(). Thanks for the report. |
|
|
Re: UTF-8 combine_path() bugIs this a problem in main.c also?
|
|
|
Re: UTF-8 combine_path() bug"Martin Nilsson (Opera Mini - AFK!) @ Pike (-) importmöte för mailinglistan" <6341@...> wrote:
> Is this a problem in main.c also? I looked at that one as well, but given the type of the variable and how it's used elsewhere in the function, it doesn't look like it could ever be a widestring. I imagine it isn't unlikely that the problem affects the other is* functions though, and they are used here and there. I've no idea in how many places they might be fed wide chars. |
| Free embeddable forum powered by Nabble | Forum Help |