cstring returning invalid encoding? mp3's

by xenoterracide

incomplete program... basically this function's job is to scan the
filesystem and build a database of tags:


// std
#include <iostream>
#include <sstream>
#include <string>

// boost
#include <boost/filesystem.hpp>

// libpqxx
#include <pqxx/pqxx>

// TagLib
#include <fileref.h>
#include <tag.h>
#include <tstring.h>

using namespace std;
using namespace boost::filesystem;
using namespace pqxx;

int scanner(const path &cpath)
{
        if ( !exists(cpath)) {
                cerr << cpath.filename() << " does not exist.\n";
                return 1;
        }
        if ( !is_directory(cpath) ) {
                cerr << cpath.filename() << " is not a directory.\n";
                return 2;
        }
        directory_iterator iter(cpath), iter_end;

        connection conn("user=xenoterracide dbname=xenoterracide");
        work tran(conn);
        for (; iter != iter_end; ++iter) {
                if ( is_directory(*iter)) {
                        scanner(*iter);
                }
                if ( is_regular_file(*iter)) {
                        const char *tpath = iter->path().string().c_str();
                        TagLib::FileRef tfile(tpath);

                        const char *ttitle = tfile.tag()->title().toCString();

                        cout
                                << iter->path() << "\n"
                                << ttitle << endl;
                        stringstream sql;
                        sql << "INSERT INTO korama.tracks ( file_path, title ) VALUES ('"
                                << tran.esc(iter->path().string())
                                << "','"
                                << tran.esc(ttitle)
                                << "');";
                        tran.exec(sql.str());
                }
        }
        tran.commit();
        return 0;
}

This all seems to be working ok with the flacs it's scanning... however,
when it hits mp3s:

/home/music/R/Rush/The_Spirit_of_Radio_-_Greatest_Hits_1974-1987/03-2112_Overture___The_Temples_of_Syrinx-Rush-The_Spirit_of_Radio_-_Greatest_Hits_1974-1987.flac
2112 Overture / The Temples of Syrinx
/home/music/R/Rihanna/Music_of_the_Sun/12-Now_I_Know-Rihanna-Music_of_the_Sun.mp3
0'
/home/music/R/Rihanna/Music_of_the_Sun/05-That_La,_La,_La-Rihanna-Music_of_the_Sun.mp3

terminate called after throwing an instance of 'pqxx::data_exception'
  what():  ERROR:  invalid byte sequence for encoding "UTF8": 0x90
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding"

it seems to be changing what it finds on those mp3's with each run...
am I doing something wrong?
--
Caleb Cushing

http://xenoterracide.blogspot.com
_______________________________________________
taglib-devel mailing list
taglib-devel@...
https://mail.kde.org/mailman/listinfo/taglib-devel

Re: cstring returning invalid encoding? mp3's

by Bugzilla from lalinsky@gmail.com

On Mon, Sep 21, 2009 at 9:07 AM, Caleb Cushing <xenoterracide@...> wrote:
>                        const char *ttitle = tfile.tag()->title().toCString();

See http://developer.kde.org/~wheeler/taglib/api/classTagLib_1_1String.html#ac86b42f97707048978f4ac9dd801959

"If unicode if false (the default) this string will be encoded in
Latin1. If it is true the returned C-String will be UTF-8 encoded."

>                                << tran.esc(ttitle)

And it seems that here you are trying to interpret the Latin-1 string as UTF-8.
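
To make the mismatch concrete: in Latin-1 every character is one byte, while
UTF-8 encodes anything above 0x7F as a multi-byte sequence, so a lone high byte
is rejected by a database that expects UTF-8. A minimal, self-contained sketch
of the conversion follows. latin1_to_utf8 is a hypothetical helper written for
illustration; it is not part of TagLib or libpqxx.

```cpp
#include <string>

// Convert ISO-8859-1 (Latin-1) bytes to UTF-8.
// Each Latin-1 byte value equals the Unicode code point it represents,
// so bytes below 0x80 pass through and the rest become two-byte sequences.
std::string latin1_to_utf8(const std::string &in)
{
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // lead byte
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // continuation byte
        }
    }
    return out;
}
```

For example, latin1_to_utf8("Caf\xE9") yields the five bytes 43 61 66 c3 a9.
With TagLib itself, passing true to toCString() asks the library for UTF-8
output directly, as the quoted documentation says.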

--
Lukas Lalinsky
lalinsky@...

Re: cstring returning invalid encoding? mp3's

by xenoterracide

> And it seems that here you are trying to interpret the Latin-1 string as UTF-8.

so that means passing true to toCString() right?

const char *ttitle = tfile.tag()->title().toCString(true);

it does change my error... although I'm not sure I understand why an
album full of flacs had no problems and yet the mp3s do.

what():  incomplete multibyte character

--
Caleb Cushing

http://xenoterracide.blogspot.com

Re: cstring returning invalid encoding? mp3's

by xenoterracide

changed my code around a little

      if ( is_regular_file(*iter)) {
            const char *tpath = iter->path().string().c_str();
            TagLib::FileRef tfile(tpath);

            // convert to cstring for libpqxx
            const char *ttitle = tfile.tag()->title().toCString(true);

            cout
                << iter->filename() << "\n"
                << tfile.tag()->title() << "\n" // outputs fine
                << ttitle << endl;              // outputs nothing or garbage for at least select mp3's...


even with toCString(true) set I still get the invalid byte sequence
for utf8 sometimes.
--
Caleb Cushing

http://xenoterracide.blogspot.com

Re: cstring returning invalid encoding? mp3's

by xenoterracide

sorry for constant repeat posts. I should also note I can read these
files from amarok, just fine... so I'm sure it's my code... maybe
something I'm missing...


--
Caleb Cushing

http://xenoterracide.blogspot.com

Re: cstring returning invalid encoding? mp3's

by xenoterracide

made a simpler program... to test with

#include <iostream>
#include <cstring>
#include <cstdio>
#include <fileref.h>
#include <tag.h>
#include <tstring.h>
using namespace std;
int main()
{
    TagLib::FileRef tfile("/home/music/R/Rihanna/Music_of_the_Sun/12-Now_I_Know-
    cout << tfile.tag()->title() << endl;

    const char *ttitle = tfile.tag()->title().toCString(true);

    while(*ttitle) {
        cout << hex << static_cast<int> (*ttitle++) << " ";
    }
    cout << endl;
}

compiled with
g++ -I /usr/include/taglib -l tag tags.cpp -o tagger

several runs. same file... wtf?

slave4 src $ ./tagger
Now I Know
ffffffb0 61 ffffffa9 1
slave4 src $ ./tagger
Now I Know
ffffffb0 ffffff81 52 2
slave4 src $ ./tagger
Now I Know
ffffffb0 11 6e 1
slave4 src $ ./tagger
Now I Know
ffffffb0 41 fffffff5



change the program to use a flac instead of an mp3...

slave4 src $ ./tagger
Working Man
57 6f 72 6b 69 6e 67 20 4d 61 6e
slave4 src $ ./tagger
Working Man
57 6f 72 6b 69 6e 67 20 4d 61 6e
slave4 src $ ./tagger
Working Man
57 6f 72 6b 69 6e 67 20 4d 61 6e

and it works perfectly. I take it back when I said I think it's my
code... I think there's a bug in taglib here somewhere... possibly
(probably) the conversion... I'm not sure. Nor am I sure why amarok
works fine... can anyone suggest a workaround?
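
A side note on reading the dumps above: values such as ffffffb0 are single
bytes. Where plain char is signed (an assumption that holds for g++ on x86
Linux), a byte of 0x80 or above is negative, and static_cast<int> sign-extends
it before it is printed. A minimal sketch of a dump that avoids this follows;
hex_bytes is a hypothetical helper, not something TagLib provides.

```cpp
#include <cstdio>
#include <string>

// Format each byte of a NUL-terminated string as two hex digits,
// separated by single spaces.
std::string hex_bytes(const char *s)
{
    std::string out;
    char buf[4];
    for (; *s; ++s) {
        // Go through unsigned char first so bytes >= 0x80 are not sign-extended.
        const unsigned value = static_cast<unsigned char>(*s);
        std::snprintf(buf, sizeof buf, "%02x", value);
        if (!out.empty()) {
            out.push_back(' ');
        }
        out += buf;
    }
    return out;
}
```

With this helper the flac title "Working Man" still dumps as
57 6f 72 6b 69 6e 67 20 4d 61 6e, and a byte such as 0xb0 prints as b0 rather
than ffffffb0.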
--
Caleb Cushing

http://xenoterracide.blogspot.com

Re: cstring returning invalid encoding? mp3's

by Jeff Mitchell-11

Caleb Cushing wrote:

> made a simpler program... to test with
>
> #include <iostream>
> #include <cstring>
> #include <cstdio>
> #include <fileref.h>
> #include <tag.h>
> #include <tstring.h>
> using namespace std;
> int main()
> {
>     TagLib::FileRef tfile("/home/music/R/Rihanna/Music_of_the_Sun/12-Now_I_Know-
>     cout << tfile.tag()->title() << endl;
>
>     const char *ttitle = tfile.tag()->title().toCString(true);
>
>     while(*ttitle) {
>         cout << hex << static_cast<int> (*ttitle++) << " ";
>     }
>     cout << endl;
> }
>
> compiled with
> g++ -I /usr/include/taglib -l tag tags.cpp -o tagger
Using your exact program (with a different file of course) and your
exact compile statement, I don't have the issues you are seeing.

jmitchell@heifertosh ~/scratch/taglib-test  $  ./tagger
My Name is Jonas
70 ffffffad 60
jmitchell@heifertosh ~/scratch/taglib-test  $  ./tagger
My Name is Jonas
70 ffffffad 60
jmitchell@heifertosh ~/scratch/taglib-test  $  ./tagger
My Name is Jonas
70 ffffffad 60
jmitchell@heifertosh ~/scratch/taglib-test  $  ./tagger
My Name is Jonas
70 ffffffad 60




Re: cstring returning invalid encoding? mp3's

by xenoterracide

figured it out 20-ish minutes ago... memory corruption. apparently the
memory was popping off the stack before the pointer was assigned (or
something like that).  using an intermediate variable:


            TagLib::String s( tfile.tag()->title() );
            const char *ttitle = s.toCString(true);

has solved this issue.

--
Caleb Cushing

http://xenoterracide.blogspot.com

Re: cstring returning invalid encoding? mp3's

by rengels

On Tuesday 22 September 2009 01:55:49 am ext Caleb Cushing wrote:

> figured it out 20-ish minutes ago... memory corruption. apparently the
> memory was popping off the stack before the pointer was assigned (or
> something like that).  using an intermediate variable.
>
>
>             TagLib::String s( tfile.tag()->title() );
>             const char *ttitle = s.toCString(true);
>
> has solved this issue.
>

Hi Caleb,
it should not do that.
As tfile does not go out of scope, tag and tag->title should also remain valid.
Strange...

BR,
Ralf

Re: cstring returning invalid encoding? mp3's

by xenoterracide

> it should not do that.
> As tfile does not run out of scope tag and tag->title should also remain valid.
> Strange...

I'm not sure, but the person who helped me had run my program through
valgrind and said that's what it was. So apparently it's not just a
'my system' thing.

IRC convo logs... maybe this will help..

[Monday 21 September 2009] [18:15:28] <xenoterracide>
http://privatepaste.com/fd0oC6poSx I can't figure out why it's not
spitting out a cstring right from mp3's
[Monday 21 September 2009] [18:22:27] <diddly_> xenoterracide: without
knowing anything about taglib, looks like you are referencing
un-initialized memory.  tried valgrind?
[Monday 21 September 2009] [18:24:20] <xenoterracide>   diddly_:
nope... I suppose I could
[Monday 21 September 2009] [18:24:58] <diddly_> i just did
[Monday 21 September 2009] [18:25:26] <diddly_> does the title()
method perhaps create and return a temp variable?  because if it does,
you have a ptr to a variable that fell off the stack after that call
in your code
[Monday 21 September 2009] [18:25:48] <xenoterracide>   diddly_: I'm not sure...
[Monday 21 September 2009] [18:25:58] <diddly_> sec lemme check api,
i've never used taglib
[Monday 21 September 2009] [18:26:22] <xenoterracide>   diddly_: I'm
addmittedly still a very amateur programmer...
[Monday 21 September 2009] [18:26:41] <diddly_> xenoterracide: only
one way to get better ;)
[Monday 21 September 2009] [18:27:59] <xenoterracide>   diddly_:
right. practice practice practice. but like all professions... you
have to start somewhere and you can't do everything by yourself right
off the bat
[Monday 21 September 2009] [18:28:12] <diddly_> xenoterracide: no, i
meant ask questions in IRC ;)
[Monday 21 September 2009] [18:28:34] <xenoterracide>   diddly_: lol
[Monday 21 September 2009] [18:29:30] <diddly_> xenoterracide: right
so your problem was that the title() method returns a TagLib::String
object.  Since it was never stored, the object goes out of scope
immediately after that line of
[Monday 21 September 2009] [18:29:39] <diddly_> sec i'll paste a possible fix
[Monday 21 September 2009] [18:30:08] <xenoterracide>   what I don't
get is why it only happens to the mp3's... and not the flac's
[Monday 21 September 2009] [18:30:21] <diddly_> http://dpaste.org/0dLo/
[Monday 21 September 2009] [18:31:05] <diddly_> just dumb luck, if the
flac object is smaller, the memory on the stack may not be re-used as
quickly and it still has data that hasnt been over-written yet
[Monday 21 September 2009] [18:31:29] <xenoterracide>   ah
[Monday 21 September 2009] [18:31:32] <diddly_> its one of the more
subtle c++ bugs
[Monday 21 September 2009] [18:31:49] <xenoterracide>   right
[Monday 21 September 2009] [18:32:23] <xenoterracide>   remember the stack.
[Monday 21 September 2009] [18:32:24] <xenoterracide>   fun
[Monday 21 September 2009] [18:33:51] <diddly_> typically if you see
behaviour like that (randomly changing output/results, given a fixed
input) it starts to yell memory corruption, and valgrind is your
quickest way to find it


--
Caleb Cushing

http://xenoterracide.blogspot.com

Re: cstring returning invalid encoding? mp3's

by Michael Pyne

On Tuesday 22 September 2009 07:18:07 rengels wrote:

> On Tuesday 22 September 2009 01:55:49 am ext Caleb Cushing wrote:
> > figured it out 20-ish minutes ago... memory corruption. apparently the
> > memory was popping off the stack before the pointer was assigned (or
> > something like that).  using an intermediate variable.
> >
> >
> >             TagLib::String s( tfile.tag()->title() );
> >             const char *ttitle = s.toCString(true);
> >
> > has solved this issue.
>
> Hi Caleb,
> it should not do that.
> As tfile does not run out of scope tag and tag->title should also remain
>  valid. Strange...
tfile was not the issue, it was a temporary String returned by
tfile.tag()->title().

Look at the API docs:
http://developer.kde.org/~wheeler/taglib/api/classTagLib_1_1String.html#ac86b42f97707048978f4ac9dd801959

"This string remains valid until the String instance is destroyed or another
export method is called."

With tfile.tag()->title().toCString(true);, the temporary String returned by
title() is destroyed essentially as soon as that semicolon is encountered,
leaving the memory pointed to by ttitle in an undefined state.  Forcing the
String to stay alive by storing it in a local variable also keeps the ttitle
pointer valid.
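
The lifetime rule can be sketched without TagLib installed. In the block
below, Str and title() are hypothetical stand-ins for TagLib::String and
Tag::title(); the only assumption taken from the docs quoted above is that
toCString() returns a pointer into storage owned by the String object.

```cpp
#include <string>

// Hypothetical stand-in for a string class whose toCString() hands out
// a pointer into memory owned by the object itself.
class Str {
public:
    explicit Str(const std::string &s) : data_(s) {}
    const char *toCString() const { return data_.c_str(); }
private:
    std::string data_;
};

// Hypothetical stand-in for Tag::title(): returns a new object by value.
Str title() { return Str("Now I Know"); }

// BROKEN, kept as a comment on purpose: the temporary returned by title()
// is destroyed at the semicolon, so p dangles and reading it is undefined
// behaviour.
//     const char *p = title().toCString();

// Fix 1: name the object so it outlives every use of the pointer.
std::string via_named_object()
{
    Str s(title());
    const char *p = s.toCString();  // valid for as long as s is alive
    return std::string(p);
}

// Fix 2: copy the bytes before the full expression ends.
std::string via_immediate_copy()
{
    // The temporary is still alive while std::string copies from it.
    std::string copy = title().toCString();
    return copy;
}
```

Both functions return "Now I Know". The second form has the advantage that no
raw pointer survives past the statement that created it. With TagLib itself,
String::to8Bit(true) returns a std::string by value and should serve the same
purpose.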

Regards,
 - Michael Pyne

