too many FreeDB imports?


too many FreeDB imports?

by Robert Kaye

I was chatting with Steve Wyles about the rising number of FreeDB  
imports and the number of add album mods that get added with no one  
voting on them. To combat this problem which seems to be overwhelming  
our moderators/voters, I think we should:

1. Change the moderations for add albums so that if the grace-period  
for these mods expires with still no votes, the moderation is
dropped and the album removed. Seriously, if no one cares to vouch  
for that album, no one in the MB community cares. So, why keep the  
album? When the time comes when someone cares, they will import it  
again and make sure it gets votes.

2. Implement a FreeDB blacklist. If a FreeDB album has been imported,  
or has been attempted to be imported and failed, a non-automod  
moderator cannot import the album again. This should cut down on the  
number of duplicate/useless freedb imports we get. If someone really  
wants an album imported, they can cut and paste the data into a  
regular album add or petition an automod to do it. Both require  
effort, which is what these lazy moderators who add duplicate freedb
albums aren't willing to spend.

3. On the track lookup page that is the entry page for tagger users,  
if the search they attempted didn't yield anything, we currently  
offer them to import the album from freedb. We should change this  
text and add features to do this:

=====
Didn't find what you are looking for?

1. Search again. (Things may not appear as you think they are.)
2. Import the album into MusicBrainz. (You need to spend some time  
cleaning up the data, finding release dates and grooming the data  
before it gets inserted into MusicBrainz. blah blah)
3. Screw this, I just want to tag my music!
===

If the user chooses #3, they get bounced back to Picard, where a  
freedb import dialog opens up. The user can then import an album from  
freedb INTO PICARD, click the simple guess case button, and then tag  
their files against this imported album. Once the tracks are tagged,  
the album is discarded and no UUIDs are written to the tags.

This lets tagger users tag their music without having to "hassle with  
getting data into MB". The bulk of tagger users will choose this  
option. The other part of the people that are curious and want to  
spend more time can choose the other option. Hopefully those are the  
people we pull into the MB fold and integrate into our community and  
avoid force integrating tagger users who care to tag only their music.

Thoughts?

--

--ruaok      Somewhere in Texas a village is *still* missing its idiot.

Robert Kaye     --     rob@...     --    http://mayhem-chaos.net


_______________________________________________
Musicbrainz-experts mailing list
Musicbrainz-experts@...
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-experts

Re: too many FreeDB imports?

by Simon Reinhardt

Robert Kaye wrote:

> I was chatting with Steve Wyles about the rising number of FreeDB  
> imports and the number of add album mods that get added with no one  
> voting on them. To combat this problem which seems to be overwhelming  
> our moderators/voters, I think we should:
>
> 1. Change the moderations for add albums so that if the grace-period  
> for these mods expires with still no votes, the moderations is  
> dropped and the album removed. Seriously, if no one cares to vouch  
> for that album, no one in the MB community cares. So, why keep the  
> album? When the time comes when someone cares, they will import it  
> again and make sure it gets votes.

When I started editing on MB, my first mods expired without being voted
on. And - not knowing the moderation system well enough then - I feared
they could be discarded. If that happens with albums, we might lose some
albums that someone really has put effort into. There is just so much to
vote on that good album adds can expire without votes. And this
method will not reduce the number of add album mods in the first place, so
good ones would still be dropped. We don't want to put off those users.

>
> 2. Implement a FreeDB blacklist.

Too much effort.

> 3. On the track lookup page that is the entry page for tagger users,  
> if the search they attempted didn't yield anything, we currently  
> offer them to import the album from freedb. We should change this  
> text and add features to do this:
>
> =====
> Didn't find what you are looking for?
>
> 1. Search again. (Things may not appear as you think they are.)
> 2. Import the album into MusicBrainz. (You need to spend some time  
> cleaning up the data, finding release dates and grooming the data  
> before it gets inserted into MusicBrainz. blah blah)
> 3. Screw this, I just want to tag my music!
> ===

I would switch 3. and 2. though.
While this seems to be a good solution, I would also consider:

4. Abandon freedb imports. At least into the db - for the tagger they
might still serve some purpose. There was some thought about whether we
should allow freedb imports for automods only, because they should know what
they are doing and are trusted to correct the data. I think normal users still should
be able to use the data though, it's not that they all don't care about
good data!
But the solution for this is already implemented: you can use the
TrackParser for importing data from freedb. So even if we disable freedb
import we still can use the data from it. And the good thing about this:
we don't point to the parser as much as to the freedb import. So the
newbs can't even think about using it for crappy imports. But if they
really care about what they put in the db, they are willing to take some
time to learn how MB works - and then they will also find the
TrackParser. So everyone would be happy.. or not?

Simon (Shepard)

Re: too many FreeDB imports?

by Alexander Dupuy

Robert Kaye wrote:

> I was chatting with Steve Wyles about the rising number of FreeDB  
> imports and the number of add album mods that get added with no one  
> voting on them. To combat this problem which seems to be overwhelming  
> our moderators/voters, I think we should:
>
> 1. Change the moderations for add albums so that if the grace-period  
> for these mods expires with still no votes, the moderations is  
> dropped and the album removed. Seriously, if no one cares to vouch  
> for that album, no one in the MB community cares. So, why keep the  
> album? When the time comes when someone cares, they will import it  
> again and make sure it gets votes.
>
> 2. Implement a FreeDB blacklist. If a FreeDB album has been imported,  
> or has been attempted to be imported and failed, a non-automod  
> moderator cannot import the album again. This should cut down on the  
> number of duplicate/useless freedb imports we get. If someone really  
> wants an album imported, they can cut and paste the data into a  
> regular album add or petition an automod to do it. Both require  
> effort, which is what these lazy moderators that add duplicate freedb  
> albums aren't wishing to spend.
>
These seem like good suggestions, but please make sure that a failed
FreeDB import (due to lack of votes, rather than explicit negatives)
does not get blacklisted, or the interaction of these two changes will
be overkill.  Also, while change 1 makes (some) sense for FreeDB
imports, I would be reluctant to impose it on "from scratch" add albums,
as it is much harder to recreate the data in that case.  Some kind of
advance warning to the moderator would also be very important, e.g.
"Your add album edit #1234567 has not received any votes and will fail
in one week if none are received - you should try contacting other
moderators via IRC at #musicbrainz to make sure that this does not
happen."  (It might be worth e-mailing this to moderators who are
subscribed to the artist as well).

Is there any way we can track FreeDB ids explicitly so that an attempt
to import a FreeDB album that was already imported successfully returns
a pointer to the current MB album entry (ideally, even if there have
been album merges)?  Just saying "can't do that" is not helpful if there
is correct data, but the titles/artists have changed radically enough
that searching isn't working for the user.  I know that we could
probably do this if there is a MB DiscID, since that can be used to
generate a FreeDB id, but in many cases, there is not one.
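
For reference, here is a minimal sketch of how a FreeDB id can be derived from a disc's table of contents. It follows the commonly documented CDDB disc id computation as I understand it (digit-sum checksum, disc length in seconds, track count). The function names and the example table of contents are made up for illustration and are not taken from the MusicBrainz code.

```python
FRAMES_PER_SECOND = 75  # CD addresses are counted in 1/75-second frames


def digit_sum(n: int) -> int:
    """Sum of the decimal digits of n."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def freedb_disc_id(track_offsets, leadout_offset) -> str:
    """Compute the 8-hex-digit id from frame offsets.

    track_offsets: start of each track in frames, including the
    150-frame lead-in. leadout_offset: start of the lead-out in frames.
    """
    start_seconds = [offset // FRAMES_PER_SECOND for offset in track_offsets]
    # Top byte: checksum over the start times of all tracks.
    checksum = sum(digit_sum(s) for s in start_seconds) % 0xFF
    # Middle two bytes: playing time from first track to lead-out, in seconds.
    length = leadout_offset // FRAMES_PER_SECOND - start_seconds[0]
    # Low byte: number of tracks.
    disc_id = (checksum << 24) | (length << 8) | len(track_offsets)
    return f"{disc_id:08x}"


# Hypothetical two-track single: tracks start at 0:02 and 4:08, lead-out at 8:28.
example_id = freedb_disc_id([150, 18600], 38100)
```

Because only a digit sum, the total length and the track count go into the id, different discs can end up with the same value. Moving the second track of the hypothetical disc to 4:17 (frame 19275) gives the same id.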

> If the user chooses #3, they get bounced back to Picard, where a  
> freedb import dialog opens up.


What happens if the user is using the old MB Tagger?

@alex


Re: [mailing] too many FreeDB imports?

by Marco Sola

On Thursday, January 05, 2006 10:23 PM,
Robert Kaye <rob@...> wrote:

> 1. Change the moderations for add albums so that if the grace-period
> for these mods expires with still no votes, the moderations is
> dropped and the album removed. Seriously, if no one cares to vouch
> for that album, no one in the MB community cares. So, why keep the
> album? When the time comes when someone cares, they will import it
> again and make sure it gets votes.

How can you be sure you don't discard useful info? I'm sure a lot of album entries are added without any votes or interest from voters but are then looked up and used by users and taggers. Will you discard the TRMs added? And even the cdid? Will you discard 15 minutes of someone's editing? Again, how could you evaluate whether an entry is proper only from the fact that no one votes on it? Are you sure MB voters are representative of MB users? Are you sure voters are actually listening to music, or are they just subscribed to some strange Artist to see if someone reverts their wonderful edits? (btw, is MB for voters or for users?)

But, most of all, is db clutter *really* a problem?

(Since I seem to have more than one problem with the current MB "staff") I have started editing on Wikipedia and I have experienced, and really agree with, their point of view: they discard almost nothing; they think everything, even if not perfect, is valuable. What matters most is that they put maximum effort into keeping the most important information, about the most popular topics, clean and up to date. (Here on MB, let me say, I see a lot of Guideline wars and "cd 1" to "(disc 1)" edits, but *all* the pages of important Artists are quite a mess.)

Discard Add Album entries for a reason like this and/or block freedb entries, and quite soon MB will be a dried-up place with compulsive people jumping around and adding "feat.", with the final dot.

> 2. Implement a FreeDB blacklist. If a FreeDB album has been imported,
> or has been attempted to be imported and failed, a non-automod
> moderator cannot import the album again. This should cut down on the
> number of duplicate/useless freedb imports we get. If someone really
> wants an album imported, they can cut and paste the data into a
> regular album add or petition an automod to do it. Both require
> effort, which is what these lazy moderators that add duplicate freedb
> albums aren't wishing to spend.
> 3. On the track lookup page that is the entry page for tagger users,
> if the search they attempted didn't yield anything, we currently
> offer them to import the album from freedb. We should change this
> text and add features to do this:
>  Didn't find what you are looking for?
> 1. Search again. (Things may not appear as you think they are.)
> 2. Import the album into MusicBrainz. (You need to spend some time
> cleaning up the data, finding release dates and grooming the data
> before it gets inserted into MusicBrainz. blah blah)

I have been adding albums for months and I *never* add an album I can't find on freedb. I guess most users are like me, mostly because often you don't have the album in hand, and asking people to look around the internet for a tracklist is a ridiculous prerequisite. If you block it, people will most probably look it up manually on FreeDB itself, with all its errors and, guess what?, just copy & paste without even telling you where the data came from.

If you are willing to spend a huge amount of developer time like this (and MB seems to have very little of it), maybe think about simply blocking repeat imports of the same entry: FreeDB has a unique identifier and, even if it has duplicates under different codes, I never understood why such a simple check hasn't been done yet.

Ciao

MArco


Re: too many FreeDB imports?

by Robert Kaye


On Jan 5, 2006, at 1:57 PM, Alexander Dupuy wrote:
> Is there any way we can track FreeDB ids explicitly so that an  
> attempt to import a FreeDB album that was already imported  
> successfully returns a pointer to the current MB album entry  
> (ideally, even if there have been album merges)?

I think we can -- we now keep track of which freedb ids things were  
imported from.
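
A rough sketch of how such a lookup could work, assuming a table that maps each imported freedb id to an album id plus a record of album merges. The class, its methods and the ids are hypothetical and only illustrate the idea; they are not the actual MusicBrainz schema.

```python
class FreeDBImportIndex:
    """Remembers which album each freedb id was imported into."""

    def __init__(self):
        self._album_by_freedb_id = {}  # freedb id -> album id at import time
        self._merged_into = {}         # old album id -> album it was merged into

    def record_import(self, freedb_id, album_id):
        self._album_by_freedb_id[freedb_id] = album_id

    def record_merge(self, source_album_id, target_album_id):
        self._merged_into[source_album_id] = target_album_id

    def current_album(self, freedb_id):
        """Return the album that now holds this import, or None if unknown."""
        album_id = self._album_by_freedb_id.get(freedb_id)
        if album_id is None:
            return None
        # Follow the chain of merges; the 'seen' set guards against cycles.
        seen = set()
        while album_id in self._merged_into and album_id not in seen:
            seen.add(album_id)
            album_id = self._merged_into[album_id]
        return album_id


index = FreeDBImportIndex()
index.record_import("rock/1001fa02", 10)
index.record_merge(10, 20)  # album 10 merged into album 20
index.record_merge(20, 30)  # album 20 later merged into album 30
```

With this, a repeat import of the same freedb id can be answered with a pointer to the surviving album instead of a bare refusal.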

>   Just saying "can't do that" is not helpful if there is correct  
> data, but the titles/artists have changed radically enough that  
> searching isn't working for the user.  I know that we could  
> probably do this if there is a MB DiscID, since that can be used to  
> generate a FreeDB id, but in many cases, there is not one.
>
>
>> If the user chooses #3, they get bounced back to Picard, where a  
>> freedb import dialog opens up.
>
> What happens if the user is using the old MB Tagger?

We won't even show this option for old tagger users. Thus more need  
to migrate people to picard...

--

--ruaok      Somewhere in Texas a village is *still* missing its idiot.

Robert Kaye     --     rob@...     --    http://mayhem-chaos.net



Re: [mailing] too many FreeDB imports?

by Robert Kaye


On Jan 5, 2006, at 3:41 PM, Marco Sola wrote:

> On Thursday, January 05, 2006 10:23 PM,
> Robert Kaye <rob@...> wrote:
>
>
>> 1. Change the moderations for add albums so that if the grace-period
>> for these mods expires with still no votes, the moderations is
>> dropped and the album removed. Seriously, if no one cares to vouch
>> for that album, no one in the MB community cares. So, why keep the
>> album? When the time comes when someone cares, they will import it
>> again and make sure it gets votes.
>>
>
> How can you sure you don't discard useful info? I'm sure a lot of  
> album entries are added witout any vote and interest from voter but  
> then are lookuped and used by users and taggers. Will you discard  
> TRM added? And even cdid? Will you discard a 15 min editing of  
> someone? Again, how could evaluate if a entry is proper only by the  
> fact no one votes on it? Are you sure MB voters are represetative  
> of MB users? Are you sure voters are actually listening to music or  
> are they just subscribed to some strange Artist to see if someone  
> reverts their wonderful edits?

I think the suggestion to only dump freedb imports is a good one.
What if we extend it to say that a freedb import that has no CD
id, no TRMs, no release date, and no script or language gets dumped if
it gets past the grace period?
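
To make the rule concrete, here is a small sketch of one possible reading of it: an album is dropped at expiry only if it came from freedb, received no votes, and nobody attached any of the extra data listed above. The `Album` fields and the function name are assumptions made for this example, not existing MusicBrainz code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Album:
    """Hypothetical view of an album whose add album mod just expired."""
    from_freedb: bool
    votes: int = 0
    disc_ids: List[str] = field(default_factory=list)
    trms: List[str] = field(default_factory=list)
    release_date: Optional[str] = None
    script: Optional[str] = None
    language: Optional[str] = None


def should_drop_at_expiry(album: Album) -> bool:
    """True if the album would be removed once the grace period runs out."""
    if not album.from_freedb:
        return False  # "from scratch" adds are never dropped by this rule
    if album.votes > 0:
        return False  # someone vouched for it
    # Any extra data counts as a sign that somebody cared about the album.
    has_extra_data = any([
        album.disc_ids,
        album.trms,
        album.release_date,
        album.script,
        album.language,
    ])
    return not has_extra_data
```

Under this reading, a bare freedb import with no votes is dropped, while the same import with a single TRM or a release date attached is kept.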

> (btw, is MB for voter of for users?)

It's a precarious balance between editors, voters, taggers and
customers. Everything we do needs to balance all those.

> But, most of all, is db clutter *really* a problem?

No, not for me. I think this issue is more about voter overload.

> (Since I seem to have more than a problem with the actual MB "staff")

That would be me -- since I am the only 'staff' we have. Do you mean  
you have an issue with me or someone else?

> I'm starting editing on wikipedia and I experienced and really  
> agree with their point of view: they discard almost nothing, they  
> think everything, even if not perfect, is valuable. What is primary  
> is that they give maximum effort to keep clean and up to date the  
> most important informations, about the most popular argument. (Here  
> on MB , let me say, I see a lot of Guideline wars and "cd 1" to  
> "(disc 1)" edit but *all* the pages of important Artist are a quite  
> a mess)

Well, we're moving along the spectrum from Quantity (where we  
accepted everything to bootstrap) to Quality (where we become more  
discerning) in what we accept. We're not on either end of the  
spectrum right now -- we're somewhere in the middle.

> Discard Add Album entries with a reason like this and/or block  
> freedb entries and quite soon MB will be a dried placed with  
> compulsive people jumping around and adding "feat.", with final dot.

The flipside is a place that is overrun by tagger users who only care  
to tag their music and don't care about accurate data. The key for us  
is to find the balance so we can make everyone happy.

> I add since months and I *never* add an album I can't find on  
> freedb. I guess most users are like me, mostly because often you  
> don't have that album in hand and asking to look around internet  
> for a tracklist is a ridicoulous prerequisite. If you block it,  
> people will manually look most probably on FreeDb itself, with all  
> its error and, guess what?, just do copy&paste without even telling  
> you where this come from.
>
>  If you will to spend a huge developer time like this (and MB seems  
> to have very few of it) maybe think about simply block the same  
> import: FreeDb as a unique identifier and, even if it has  
> duplicates with different code, I never understood why a check so  
> somehow simple wasn't done yet.

That's part of what I mean by the FreeDB blacklist. And it wouldn't  
be that much work...

--

--ruaok      Somewhere in Texas a village is *still* missing its idiot.

Robert Kaye     --     rob@...     --    http://mayhem-chaos.net



Re: [mailing] too many FreeDB imports?

by Marco Sola

On Friday, January 06, 2006 1:31 AM,
Robert Kaye <rob@...> wrote:

> I think the suggestion to only dump freedb imports is a good one.
> What if we extend it to say that if a freedb import that has no CD
> id, no TRMs, no release date, no script or language gets dumped if it
> gets past the grace period??

Those are things you can check and choose to include or not in a filter as you like. But in any case, how can you tell whether you are discarding the previously mentioned perfect 15-minute edit on that freedb entry?

>> (btw, is MB for voter of for users?)
 
> Its a precarious balance between editors, voters, taggers and
> customers. Everything we do needs to balance all those.

IMHO, sorry if I'm rude, active editors just need data to work on. And btw, about voters, how can you say that a single Yes is enough to let the Album in?

> I think this issue is more about voter overload.

I drew attention to expired mods on this ML just a year ago and I was kindly told not to worry so much.

Maybe a little help in decreasing open mods could be simply to lower the number of Yes votes needed to close them from 3 to 2.

>> (Since I seem to have more than a problem with the actual MB "staff")
 
> That would be me -- since I am the only 'staff' we have. Do you mean
> you have an issue with me or someone else?

No, I used "" because I can't address my statement precisely. It is just a feeling, and in fact I quite disagree with most of the elder moderators about a lot of matters. (But probably it's only the way I am.) Anyway, absolutely nothing to do with you in particular.

But your mention of IRC reminds me of another matter that has bothered me from the very start of my time here on MB and that I have also complained about loudly: I still think it's not fair that tons of big and little "rules" are agreed only there, without a single note in the official documentation, and that those decisions are forced on mods by massive collective voting, often without any explanation.

This makes the learning curve for a decent editor very steep: he has to be guided here and there, and there's no time for that; he has to fail a lot, and that's no fun; he has to recall what was agreed in dozens of mod wars which he probably didn't see.

I really wish anyone could start doing useful edits simply by reading a couple of nice pages loaded with examples. Unfortunately the official documentation is static and old, doesn't include matters we agreed on months ago, and is spread all around and cluttered with useless and dead discussions.

As the db is now, less important matters and beautifiers should be kept separate from Baseline rules; they could come later, and only if possible and reasonable: I still think it's simply crazy to refuse an entry just because it doesn't follow LatvianCapitalizationStandard.

> Well, we're moving along the spectrum from Quantity (where we
> accepted everything to bootstrap) to Quality (where we become more
> discerning) in what we accept.

I get your point but, sorry, this is not a dictionary: you can't stop Quantity; you will have dozens of new releases a day, like it or not. And Quantity is still a value: if I don't find a release because someone refused its addition because it needed a /, I just think MB sucks and see if there's something else around.

> The flipside is a place that is overrun by tagger users who only care
> to tag their music and don't care about accurate data. The key for us
> is to find the balance so we can make everyone happy.

I really feel we first need to understand what "accurate data" is, and I think I don't have to explain my own idea about it. Dozens of emails (and nothing done yet!) about adding a release country? I still see mods refused because they don't follow BootlegNameStyle. I know that for most voters and editors MB could be somehow just a relaxing game of rules and knowledge, but IMHO we are pushing it too far.

>> simply block the same  import: FreeDb as a unique identifier and, even if it has
>> duplicates with different code
 
> That's part of what I mean by the FreeDB blacklist. And it wouldn't
> be that much work...

I will joyfully welcome it and I think it would save us a good bunch of editing and votes.

Ciao

MArco / ClutchEr2


Re: [mailing] too many FreeDB imports?

by Nikki-12

On Fri, Jan 06, 2006 at 12:41:07AM +0100, Marco Sola wrote:

> Will you discard a 15 min editing of someone?

If someone cares enough to spend 15 minutes editing it, maybe they should
care enough to vote yes.

> But, most of all, is db clutter *really* a problem?

In my opinion it is. I keep checking the expired add album mods and I
always find plenty of duplicates and plenty of others with bad style. I
think we need to encourage people to give proof of an album's existence
when adding an album. If it's easy to check that it exists, people will be
more likely to vote yes.

Also, FreeDB sucks because it's full of duplicates, homebrews and bad
style. If we're going to continue allowing any old crap into the database,
we're going to end up with data of no better quality than FreeDB's.

> Discard Add Album entries with a reason like this and/or block freedb
> entries and quite soon MB will be a dried placed with compulsive people
> jumping around and adding "feat.", with final dot.

Plenty of people add albums without importing from FreeDB.

> I add since months and I *never* add an album I can't find on freedb. I
> guess most users are like me, mostly because often you don't have that
> album in hand and asking to look around internet for a tracklist is a
> ridicoulous prerequisite.

Most users, in my experience, just do the few clicks required to import an
album from FreeDB without bothering to fix the style or even check whether
the album even exists. Why is finding a tracklist such a ridiculous
prerequisite? Why should the voters have to do the hard work of finding out
whether this is yet another FreeDB homebrew or not? I would say that the
people who go to the trouble of giving proof when they're adding albums
(which is the same as finding a tracklist, really) add much better quality
data than those who just import something from FreeDB, tag against it and
leave.

--Nikki


Asking the wrong Questions (Was: too many FreeDB imports?)

by DonRedman

On Fri, 06 Jan 2006 05:54:43 +0100, Nikki wrote:

> Most users, in my experience, just do the few clicks required to import  
> an
> album from FreeDB without bothering to fix the style or even check  
> whether
> the album even exists. Why is finding a tracklist such a ridiculous
> prerequisite?

I think these are the wrong questions, and I want to strongly support  
Marco's point.

I wish we could stop talking about bad data. If we handle it properly  
there is no bad data. Let me explain:

Someone looks for an album with a common but wrong search term or using a  
badly tagged track.
MB does not yield any results, so the person adds the album _he_believes_  
it is to MB.
_This_ album will be found by future equally wrong searches.
Instead of deleting it, we should merge it with the correct one in a way
that _retains_ the wrong search terms, much like the ArtistAlias feature.

Marco is right: MusicBrainz is developing a culture in which 'we' are  
trying to _defend_ 'our' good data from 'stupid' people who corrupt it  
because 'they' do not put enough work into cleaning it up.

This approach is
a) stupid
b) not scalable

ad a) The assumption that 'they' are 'stupid' or 'lazy' is plain wrong.  
Fact is that they use MB from their point of view. However, MB is shutting  
itself off so that it takes months to realize that your point of view is  
different from the general MB point of view and then to adapt to the
general MB culture.
That would be the same as requiring everybody who wants to visit
Paris to speak fluent French.

ad b) If you look at MB and its users forming an ecological system, you  
will realize that these users enter an important bit of information into  
MB: The fact that this track is searched for by these terms. Currently we  
_discard_ this information over and over again. We defend ourselves from
information which is out there. It will come back again and again. And the  
more MB gets useful, the more often it will come back.


So here are some completely different questions:

How about logging all the search terms?
How about never again deleting duplicate albums, but merging them into the  
correct one?
How about a search index that uses this merge data to redirect future bad  
searches to the correct album?
How about not deleting bullshit albums but changing their release status  
to a new type "Hoax", and adding a sensible annotation?
These would not be displayed on the album listing, but the information  
would be used for searches. People would then suddenly be presented with  
_meaningful_ search results: "This album is a very common homeburnt copy,  
its files are shared on many P2P networks. However, the tracks are  
compiled from the following three albums... You might want to tag your
files with the correct metadata instead." (from the Hoax's annotation).
How about this option if your search did not yield any results: "Perhaps
you searched using wrong terms from bad tags. You can ask for help on the
mb-users MailingList or the musicbrainztaggers IRC channel."
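
As an illustration of the merge-and-redirect idea in the questions above, here is a minimal sketch of a search index that keeps the titles of merged duplicates as aliases for the surviving album. Everything in it (class name, normalisation rule, album ids and titles) is hypothetical and much simpler than a real search index would be.

```python
class AlbumSearchIndex:
    """Maps normalised search terms to album ids and survives merges."""

    def __init__(self):
        self._album_for_term = {}

    @staticmethod
    def _normalise(term):
        # Lower-case and collapse whitespace so trivial variants match.
        return " ".join(term.lower().split())

    def add_album(self, album_id, title):
        self._album_for_term[self._normalise(title)] = album_id

    def merge(self, duplicate_id, target_id):
        """Merge a duplicate into the correct album, keeping its terms."""
        for term, album_id in list(self._album_for_term.items()):
            if album_id == duplicate_id:
                # The 'wrong' title is retained as an alias of the target.
                self._album_for_term[term] = target_id

    def search(self, term):
        return self._album_for_term.get(self._normalise(term))


index = AlbumSearchIndex()
index.add_album(1, "The Wall")
index.add_album(2, "Pink Floyd - Wall (cd 1)")  # badly styled duplicate
index.merge(2, 1)
```

After the merge, a search for the badly styled title still finds something, but it now leads to the correct album rather than to a duplicate.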


In one word:

How about we stop trying to defend ourselves from so called 'bad' data,  
and instead find ways to absorb that data into MB in a meaningful way,  
turning it into 'good' data?

   DonRedman

PS: And additionally this one: Robert, how about we start to get WikiDocs  
out of the box _now_. Screw my exams!



--
Words that are written in CamelCase refer to WikiPages:
Visit http://wiki.musicbrainz.org/ the best MusicBrainz documentation  
around! :-)

Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by Steve Wyles

On Sat, 7 Jan 2006, Don Redman wrote:

> On Fri, 06 Jan 2006 05:54:43 +0100, Nikki wrote:
>
>> Most users, in my experience, just do the few clicks required to import an
>> album from FreeDB without bothering to fix the style or even check whether
>> the album even exists. Why is finding a tracklist such a ridiculous
>> prerequisite?
>
> I think these are the wrong questions, and I want to strongly support Marco's
> point.
>
> I wish we could stop talking about bad data. If we handle it properly there
> is no bad data. Let me explain:
>
> Someone looks for an album with a common but wrong search term or using a
> badly tagged track.
> MB does not yield any results, so the person adds the album _he_believes_ it
> is to MB.
> _This_ album will be found by future equally wrong searches.
> Instead of deleting it, we should merge it with the correct one in way that
> _retains_ the wrong search terms. Much like the ArtistAlias feature.

The issues I'm seeing aren't always a failure to match an existing entry,
but cases where it does match and the user adds a new entry anyway, resulting
in identical (apart from style) entries.

In the vast majority of cases, the import from freedb would have indicated
a > 90% confidence warning that the album they are adding is already in
the database. They are ignoring this warning and adding it anyway.

>
> Marco is right: MusicBrainz is developing a culture in which 'we' are trying
> to _defend_ 'our' good data from 'stupid' people who corrupt it because
> 'they' do not put enough work into cleaning it up.
>
> This approach is
> a) stupid
> b) not scalable
>
> ad a) The assumption that 'they' are 'stupid' or 'lazy' is plain wrong. Fact
> is that they use MB from their point of view.

In the context in which this discussion was started, the users are
ignoring the warnings presented that tell them the album is already in
the database.

>
> So here are some completely different questions:
>
> How about logging all the search terms?

This probably isn't really practical.

> How about never again deleting duplicate albums, but merging them into the
> correct one?

There is little point in merging when there is no additional information
on the newly added duplicate entry. If a certain album with a known freedb
id is already in the database, there is little point in adding it again.
Duplicate freedb id matches should be stopped at the import stage.

> How about a search index that uses this merge data to redirect future bad
> searches to the correct album?

Matching the search terms used to the album that was added is going to be
a nightmare to code.

> How about we stop trying to defend ourselves from so called 'bad' data, and
> instead find ways to absorb that data into MB in a meaningful way, turning it
> to 'good' data.
>

This discussion wasn't originally about so-called 'bad' or unverified
freedb entries, but about preventing those that are being repeatedly added
as duplicates. Adding freedb ids to the blacklist only when they are known
duplicates would still allow unique 'home-made' entries to be added.

Instead of adding the id to the blacklist on a 'failed' vote, how about
offering an option to automoderators that allows them to add the freedb id
to the blacklist because it is a duplicate? Is this a better idea?

Steve (inhouseuk)

Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by orion-6

Steve Wyles wrote:
> Instead of adding the id to the blacklist on a 'failed' vote, how about
> offering an option to automoderators that allows them to add the freedb
> id to the blacklist because it is a duplicate? Is this a better idea?

To be clear - when you say ID, do you mean the full genre/hash ID or
just the hash part?  The former seems like it might not do much good
since often the same album is under rock/ID and misc/ID and data/ID and
blues/ID until people run out of places to stick it.  If you mean
blacklisting the ID without the freedb genre part, that seems likely to
blacklist a lot of legitimate albums due to the non-uniqueness of them.
For example, Shiina Hekiru
(http://www.freedb.org/freedb_search_fmt.php?cat=blues&id=1001fa02) and
The Cure (http://www.freedb.org/freedb_search_fmt.php?cat=rock&id=1001fa02)
both have a single with the same ID. If The Cure one gets imported
several times and then blacklisted, it'll make it harder for people to
import the less well-known artist.
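
The trade-off between the two kinds of key can be sketched in a few lines. This is only an illustration: the function names and the in-memory set are assumptions, not anything that exists in the MusicBrainz server, and the two entries are the ones linked above.

```python
# Hypothetical sketch: a blacklist keyed on (category, disc ID) versus
# one keyed on the disc ID alone. Names here are made up for illustration.
THE_CURE = ("rock", "1001fa02")
SHIINA_HEKIRU = ("blues", "1001fa02")


def is_blocked_full(blacklist, category, disc_id):
    """Block only the exact category/ID pair."""
    return (category, disc_id) in blacklist


def is_blocked_hash_only(blacklist, category, disc_id):
    """Block every entry that shares the disc ID, whatever its category."""
    return any(disc_id == blocked_id for _, blocked_id in blacklist)


blacklist = {THE_CURE}

# Full key: the other artist can still be imported ...
print(is_blocked_full(blacklist, *SHIINA_HEKIRU))       # False
# ... but so can the same Cure single filed under another category.
print(is_blocked_full(blacklist, "misc", "1001fa02"))   # False
# Hash-only key: the duplicate is caught, and so is the unrelated single.
print(is_blocked_hash_only(blacklist, *SHIINA_HEKIRU))  # True
```

Neither key is right on its own: the full key misses re-filed copies, and the hash-only key produces false positives.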


Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by Steve Wyles

On Sat, 7 Jan 2006, Orion wrote:

> Steve Wyles wrote:
>> Instead of adding the id to the blacklist on a 'failed' vote, how about
>> offering an option to automoderators that allows them to add the freedb id
>> to the blacklist because it is a duplicate? Is this a better idea?
>
> To be clear - when you say ID, do you mean the full genre/hash ID or just the
> hash part?

Sorry, I should have been clearer here.

I mean the full genre/hash ID.

Steve (inhouseuk)


Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by DonRedman

On Sat, 07 Jan 2006 16:40:12 +0100, Steve Wyles wrote:

> On Sat, 7 Jan 2006, Don Redman wrote:
>
>> I think these are the wrong questions, and I want to strongly support  
>> Marco's point.
>>
>> I wish we could stop talking about bad data. If we handle it properly  
>> there is no bad data. Let me explain:
>>
>> Someone looks for an album with a common but wrong search term or using  
>> a badly tagged track.
>> MB does not yield any results, so the person adds the album  
>> _he_believes_ it is to MB.
>> _This_ album will be found by future equally wrong searches.
>> Instead of deleting it, we should merge it with the correct one in a way
>> that _retains_ the wrong search terms. Much like the ArtistAlias  
>> feature.
>
> The issues I'm seeing aren't always a failure to match an existing entry.
> Often the search does match, and the user adds a new entry anyway, resulting
> in identical (apart from style) entries.
>
> In the vast majority of cases, the import from freedb would have  
> indicated a > 90% confidence warning that the album they are adding is  
> already in the database. They are ignoring this warning and adding it  
> anyway.

Are you sure? Can this be logged and proven?

Maybe our duplication warning is not so good. I know that I get many  
"nothing found" results with both MBTagger and Picard. Now, _I_ know how  
to search MB, but how should a newbie user know?

Very short term solution: Add a link to a SearchingHints wiki page.
Experienced users could list some good searching techniques there.

E.g. I often remove information from the search box; this generally gives
better results. Also, having a featuring artist in a different field than
the one it is stored in in MB (which you have to _know_ is the TrackTitle)
totally messes up the track lookup.


>> Marco is right: MusicBrainz is developing a culture in which 'we' are  
>> trying to _defend_ 'our' good data from 'stupid' people who corrupt it  
>> because 'they' do not put enough work into cleaning it up.
>>
>> This approach is
>> a) stupid
>> b) not scalable
>>
>> ad a) The assumption that 'they' are 'stupid' or 'lazy' is plain wrong.  
>> Fact is that they use MB from their point of view.
>
> In the context in which this discussion was started, the users are
> ignoring the warnings that tell them the album is already in
> the database.

I am saying the context in which we think is wrong.

Did it occur to you that maybe there is a _reason_ people enter the same  
album over and over again?
Maybe they are not ignorant, but acting very rationally. Maybe they just
cannot find the thing they are looking for, and really believe that this
album is not in MB.

It does not help at all to warn them not to do something, if that  
something is the only way they can make MB useful to them.

I think the search function could and should be greatly improved.

Additionally, making tags editable in Picard (and _telling_ people that
this is possible) will keep tagging alternatives from being added to the
database.


>> So here are some completely different questions:
>>
>> How about logging all the search terms?
>
> This probably isn't really practical.
>
>> How about never again deleting duplicate albums, but merging them into  
>> the correct one?
>
> There is little point in merging when there is no additional information
> on the newly added duplicate entry. If a certain album with a known  
> freedb id is already in the database, there is little point in adding it  
> again. Duplicate freedb id matches should be stopped at the import stage.
>
>> How about a search index that uses this merge data to redirect future  
>> bad searches to the correct album?
>
> Matching the search terms used to the album that was added is going to  
> be a nightmare to code.
>
>> How about we stop trying to defend ourselves from so called 'bad' data,  
>> and instead find ways to absorb that data into MB in a meaningful way,  
>> turning it to 'good' data.
>>
>
> This discussion wasn't originally about so-called 'bad' or unverified
> freedb entries, but about preventing those that are being repeatedly added
> as duplicates. Adding freedb ids to the blacklist only when they are
> known duplicates would still allow unique 'home-made' entries to be
> added.
>
> Instead of adding the id to the blacklist on a 'failed' vote, how about  
> offering an option to automoderators that allows them to add the freedb  
> id to the blacklist because it is a duplicate? Is this a better idea?

I think you did not get my point.

I say that a blacklist will not help, because I assume that users are  
rationally motivated to add these duplicates.

Telling them: "You cannot add this album to MB, it is already there" does  
not help if they cannot find it. What MB should do instead is tell them:
"Hey, you are probably looking for this album".

The three things I proposed are to be taken as a whole:
Log the user's unsuccessful searches, and log the resulting freedb imports,
and log the merges into correct albums. Then put this data into an index  
that relates the unsuccessful search terms and the original freedb data to  
the _correct_ MBID.

I do not see how this is going to be a nightmare. Logging can just be
added on top of existing structures, the index can be updated at intervals,
and it only needs to be queried additionally if the main search yielded no
results (the additional search could even be invoked manually).
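
A minimal sketch of that fallback, assuming an in-memory helper index. The class, its method names and the album identifiers are all hypothetical, and for brevity the index is updated at merge time rather than at intervals.

```python
from collections import defaultdict


def normalize(terms):
    """Lower-case and collapse whitespace so near-identical queries share a key."""
    return " ".join(terms.lower().split())


class SearchHelperIndex:
    """Hypothetical index from failed search terms to the correct album."""

    def __init__(self):
        self.failed_searches = defaultdict(set)  # imported album -> search terms
        self.redirects = {}                      # search terms -> correct album

    def log_import_after_failed_search(self, terms, imported_album_id):
        self.failed_searches[imported_album_id].add(normalize(terms))

    def log_merge(self, duplicate_album_id, correct_album_id):
        # Every search that led to the duplicate now points at the real album.
        for terms in self.failed_searches.pop(duplicate_album_id, set()):
            self.redirects[terms] = correct_album_id

    def lookup(self, terms):
        return self.redirects.get(normalize(terms))


def search(main_search, helper, terms):
    """Query the helper index only when the main search finds nothing."""
    results = main_search(terms)
    if results:
        return results
    hit = helper.lookup(terms)
    return [hit] if hit is not None else []


helper = SearchHelperIndex()
helper.log_import_after_failed_search("Wrong  Title", "album-duplicate")
helper.log_merge("album-duplicate", "album-correct")
print(search(lambda terms: [], helper, "wrong title"))  # ['album-correct']
```

The main search path is untouched; the extra lookup costs nothing unless a search has already failed.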

If it really is a coding nightmare, then spend half of the energy on
optimizing the current search form.

I think MB would greatly profit from a Pimp My Tunes like search in which
you just put text (not saying to which tags it belongs), and a quantized
track length.
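
To make the idea concrete, here is a rough sketch of such a lookup. It is not how Pimp My Tunes works internally; the bucket width, the scoring and the track record layout are assumptions chosen for illustration.

```python
import re

BUCKET_SECONDS = 5  # assumed bucket width; a real system would tune this


def quantize_length(seconds, bucket=BUCKET_SECONDS):
    """Round a track length to the nearest bucket so small rip differences match."""
    return int(seconds / bucket + 0.5) * bucket


def tokens(text):
    """Split free text into a set of lower-case words."""
    return set(re.findall(r"\w+", text.lower()))


def score(query_text, query_seconds, track):
    """Count the words shared with any field; require the same length bucket."""
    if quantize_length(query_seconds) != quantize_length(track["seconds"]):
        return 0
    haystack = tokens(" ".join((track["artist"], track["album"], track["title"])))
    return len(tokens(query_text) & haystack)


# Made-up track record; the user never says which word belongs to which tag.
track = {"artist": "Some Artist", "album": "Hello", "title": "First Song",
         "seconds": 214}
print(score("hello some artist", 213, track))  # 3
print(score("hello some artist", 300, track))  # 0
```

Lengths that fall on either side of a bucket boundary would miss each other here, so a fuller version would probably check the neighbouring buckets too.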

   DonRedman


--
Words that are written in CamelCase refer to WikiPages:
Visit http://wiki.musicbrainz.org/ the best MusicBrainz documentation  
around! :-)

Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by Nikki-12

On Sat, Jan 07, 2006 at 05:15:17PM +0100, Don Redman wrote:
 
> I think the search function could and should be greatly improved.

Agreed.

> Telling them: "You cannot add this album to MB, it is already there" does
> not help if they cannot find it. What MB should do instead is tell them:
> "Hey, you are probably looking for this album".

Agreed.

> The three things I proposed are to be taken as a whole: Log the user's
> unsuccessful searches, and log the resulting freedb imports,  and log the
> merges into correct albums. Then put this data into an index  that
> relates the unsuccessful search terms and the original freedb data to
> the _correct_ MBID.

I think effort put into a system like this would be better spent on
improving the search function. The exact matches (more or less) only
approach of how the search works is probably why so many duplicates get
added. Someone doesn't get the words exactly right or has some extraneous
words and they can't find the album any more.

There's also too much emphasis on "Can't find it? Add it." when MusicBrainz
now has a lot of data and thus likely has the album the user is looking
for. In many cases we need to be saying "Can't find it? Try changing your
search terms."

--Nikki

Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by Jan van Thiel-2

On 1/7/06, Nikki <nikki@...> wrote:
> There's also too much emphasis on "Can't find it? Add it." when MusicBrainz
> now has a lot of data and thus likely has the album the user is looking
> for. In many cases we need to be saying "Can't find it? Try changing your
> search terms."

Yes, that's it! Some fuzzy search mechanism would improve the search
results. Also, do not let people choose between Artist/Album/Track,
because artist information is currently also stored in album and track
fields. Generally: use a system that resembles the
http://www.discogs.com/ system. Mr. Levenshtein's contribution to
information theory comes to mind for a possible implementation.
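
For reference, a small sketch of what a Levenshtein-based fuzzy match could look like. The edit-distance function is the standard dynamic-programming algorithm; the `fuzzy_matches` helper and its 0.4 threshold are assumptions for illustration, not an existing MusicBrainz function.

```python
def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def fuzzy_matches(query, titles, max_ratio=0.4):
    """Return titles whose distance to the query is small relative to length."""
    q = query.lower()
    hits = []
    for title in titles:
        distance = levenshtein(q, title.lower())
        if distance <= max_ratio * max(len(q), len(title)):
            hits.append((distance, title))
    return [title for _, title in sorted(hits)]


print(levenshtein("kitten", "sitting"))               # 3
print(fuzzy_matches("hulloh", ["Hello", "Goodbye"]))  # ['Hello']
```

Comparing the query against every title like this is far too slow for a large database, so in practice the distance would be applied to candidates from a cheaper index, or left to a search engine that supports fuzzy queries.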

I think, however, that there's a bigger underlying problem. Lack of
documentation for new users/taggers, the majority of the users I
imagine. Of course there's the wiki, but a new user will get lost
there within 5 minutes. We probably need an introduction to MB, with
some examples of possible edits (Add Album, Add non-album track, Merge
Artists vs. Change Track Artist) and the need for correct data (at
least, in my opinion).

And a note on voting on 'bad' data: I generally vote no on Add album
moderations for albums that have no release type and/or status and no
explanation why these aren't entered. Of course, there are people that
say 'Hey, why don't you do a quick google search and fill out that
information yourself?'. I feel that's the task for the moderator
entering the album in the first place. He/she should know. And if not,
why is he/she adding the album in the first place? For albums that
only have style issues, I do not vote no. I might clean up the release
and point out the mistakes.

Jan


Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by DonRedman

On Sat, 07 Jan 2006 17:47:39 +0100, Nikki wrote:

> I think effort put into a system like this would be better spent on
> improving the search function. The exact matches (more or less) only
> approach of how the search works is probably why so many duplicates get
> added. Someone doesn't get the words exactly right or has some extraneous
> words and they can't find the album any more.
>
> There's also too much emphasis on "Can't find it? Add it." when  
> MusicBrainz
> now has a lot of data and thus likely has the album the user is looking
> for. In many cases we need to be saying "Can't find it? Try changing your
> search terms."

OK, so the realistic options are:

1) Change the wording on the "nothing found" page like you said. Add to it  
a link to SearchingHints on the wiki, and create that page.

2) Improve the search functionality. Lucene, Pimp my Tunes and all that  
jazz.

3) I hate to bang on about this, but I think that WikiDocs is  
over-over-ripe. There have been multiple requests for more  
documentation/help in various places, and WikiDocs is _the_ answer to all  
these requests.
All I can say is: Any work that I am supposed to do for this to happen,  
I'll do it.

   DonRedman



--
Words that are written in CamelCase refer to WikiPages:
Visit http://wiki.musicbrainz.org/ the best MusicBrainz documentation  
around! :-)

Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by Nikki-12

On Sun, Jan 08, 2006 at 11:56:50PM +0100, Don Redman wrote:

> OK, so the realistic options are:
>
> 1) Change the wording on the "nothing found" page like you said. Add to it  a link to SearchingHints on the wiki, and create
> that page.
>
> 2) Improve the search functionality. Lucene, Pimp my Tunes and all that  jazz.
>
> 3) I hate to bang on about this, but I think that WikiDocs is  over-over-ripe. There have been multiple requests for more  
> documentation/help in various places, and WikiDocs is _the_ answer to all  these requests.
> All I can say is: Any work that I am supposed to do for this to happen,  I'll do it.

4) All the above! :)

Regarding number 1, perhaps we need a 'project' to go through and improve
things like this, give more and better feedback when people do something (I
suppose it would also include the terminology changes you proposed, as much
as I hate 'autoedit' and 'autoeditor').

--Nikki

Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by Robert Kaye


On Jan 7, 2006, at 6:50 AM, Don Redman wrote:
> a) The assumption that 'they' are 'stupid' or 'lazy' is plain  
> wrong. Fact is that they use MB from their point of view.

And I think the predominant point of view for people is "Don't bother
me, I want to tag my files." What label you want to put on this (lazy,
stupid, tag-motivated, whatever) doesn't matter one bit. Forcing people
to do something when they are focused on doing something else will give
you lots and lots of garbage data.

Rather than viewing these people as lazy, we should give these people  
the tools to do their job with minimal impact on their goals and MB.  
And this conversation is already headed in that direction -- I just  
wanted to pick on semantics for a bit. :)

> That would be the same thing if you'd require everybody who wants  
> to visit Paris to speak fluent French.

That's the case, isn't it?

> How about logging all the search terms?

Sure, logging search terms should be easy. But given our current
search features this may not yield much that is useful.

> PS: And additionally this one: Robert, how about we start to get  
> WikiDocs out of the box _now_. Screw my exams!

I've been thinking about this very much in the past days. The
continuing appeals for more/updated/easier docs are quickly rising to
the top of my list of things to address. However, this change will
require a schema change -- as will the promised transliteration
features. Given these, I propose the following plan of action:

1. Finish coding the WikiDocs release -- this is mostly some CSS
hacking and a Moin plugin.
2. Code the transliterated artist aliases feature. Both of these
features require schema changes, so let's bundle them.
2.a. Add in minor tweaks or wording discussed in this thread. Can
someone please be a champion for these improvements??
3. Release #1 and #2 in about one month's time.
4. In the meantime, let's start working on the improved docs that
everyone is screaming for. Updated moderation introduction with tons
of examples for starters.
5. Develop these docs in the main wiki and copy them over manually
into the existing wikidocs system. Then once the next gen wikidocs
rolls around, we can transliterate directly from the wiki.
6. Replace the search system with a Lucene-based system.

I'm on #1 and #2. How does that sound?

BTW:
My MB dev todo list looks like:
- Finish pimpmytunes
- wikidocs
- transliterated aliases

--

--ruaok      Somewhere in Texas a village is *still* missing its idiot.

Robert Kaye     --     rob@...     --    http://mayhem-chaos.net



Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by Shaun Guth

> 4. In the meantime, let's start working on the improved docs that
> everyone is screaming for. Updated moderation introduction with tons
> of examples for starters.

I'm jumping into this discussion late and uninformed, but I had this exact same idea while reflecting on a recent discussion regarding one of my track edits...

My approach was to have a blog section titled "Moderation(s) of the day/week/etc" in which one of the automods writes up a quick entry describing one of the moderations that they think was particularly interesting or required some extra discussion.  With minimal extra effort, the author can include a "quick summary" of the mods, and these summaries can be compiled into a lengthier general-purpose document on the wiki.

I think this gives the currently-active moderation community a section to be made aware of the latest in moderation discussions, and as it grows, a page for beginners to use as a reference example for applying their own edits or voting on moderations.

-- Shaun


Re: Asking the wrong Questions (Was: too many FreeDB imports?)

by DonRedman

On Mon, 09 Jan 2006 20:58:30 +0100, Robert Kaye wrote:

>
> On Jan 7, 2006, at 6:50 AM, Don Redman wrote:
>> a) The assumption that 'they' are 'stupid' or 'lazy' is plain wrong.  
>> Fact is that they use MB from their point of view.
>
> And I think the predominant point of view for people is "Don't bother
> me, I want to tag my files." What label you want to put on this (lazy,
> stupid, tag-motivated, whatever) doesn't matter one bit.
> Forcing people to do something when they are focused on doing something
> else will give you lots and lots of garbage data.
>
> Rather than viewing these people as lazy, we should give these people  
> the tools to do their job with minimal impact on their goals and MB.

Not exactly. The best solution would be to give them the tools to do their  
job with ease in a way that is _beneficial_ to MB.

Letting people import freedb tags into Picard is the wrong way. It cuts  
their problem out of the loop for MB. Ideally their problem should be an  
element of a loop that makes MB better.

And actually that is what we did. We took their problem (the album is not  
in MB) and gave them a simple way to solve it (import it from freedb) that  
was beneficial to MB (add the album to the DB).

However the conditions have evolved and are different now. The loop now  
looks like this:

They encounter a problem (they cannot find the album that matches their
broken tags), and use the easiest solution (import the badly tagged album
from freedb as advertised), which harms us (duplicate albums in the DB).

The simplest remedy is to enhance the search functionality, so that the
problem does not arise anymore (unless the album _really_ is not in the DB).

The logging stuff I proposed would be a more complex solution. Logging the  
search terms is only part of it. The whole loop goes like this:

  1. John User has a badly tagged track, this track is common to
     P2P networks.
  2. He does a MB tracklookup but cannot find the track, because
     the MB metadata has been corrected and differs too much
     (even for an enhanced search).
# Now we need to create a link between the bad tags and the correct
# MBID by an easy solution to John's problem
  3. He imports the album "HULLOH" from freedb that matches his
     bad tags closely enough.
  4. Jane Contributor sees the album and knows that it is a duplicate
     of the album "Hello". She merges "HULLOH" into "Hello".
# That's it, we have closed the loop, we just need to make the system
# aware of it:

MBSearchLogger logs search terms and subsequent freedb imports and stores
them in a table that contains MB_Album_ID, Freedb_ID, Search_Terms.
MBMergeLogger logs album merges and tries to match them to the
SearchLogger's table every once in a while.

For each album that has been merged it creates entries in a search helper  
table that links the unsuccessful search terms (and maybe the words from  
the freedb entry, too) to the correct MB_Album_ID.
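
As a rough sketch, the three tables and the periodic matching step could look like this in SQLite. The first table uses the columns named above; the merge table, the helper table, the function name and the sample IDs are all made up for illustration, and this is not the real MusicBrainz schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE search_log (      -- filled by the proposed MBSearchLogger
        mb_album_id  TEXT,         -- album created by the freedb import
        freedb_id    TEXT,
        search_terms TEXT          -- the search that found nothing
    );
    CREATE TABLE merge_log (       -- filled by the proposed MBMergeLogger
        source_album_id TEXT,      -- duplicate that was merged away
        target_album_id TEXT       -- album that survived the merge
    );
    CREATE TABLE search_helper (   -- consulted when the main search fails
        search_terms TEXT,
        mb_album_id  TEXT          -- the correct album
    );
""")


def rebuild_search_helper(conn):
    """Periodic job: link failed search terms to the album they were merged into."""
    conn.execute("DELETE FROM search_helper")
    conn.execute("""
        INSERT INTO search_helper (search_terms, mb_album_id)
        SELECT s.search_terms, m.target_album_id
        FROM search_log AS s
        JOIN merge_log AS m ON m.source_album_id = s.mb_album_id
    """)
    conn.commit()


# John's failed search and import, then Jane's merge (IDs are placeholders).
db.execute("INSERT INTO search_log VALUES (?, ?, ?)",
           ("id-hulloh", "rock/00000000", "HULLOH"))
db.execute("INSERT INTO merge_log VALUES (?, ?)", ("id-hulloh", "id-hello"))
rebuild_search_helper(db)
print(db.execute("SELECT * FROM search_helper").fetchall())
# [('HULLOH', 'id-hello')]
```

Imports that are never merged simply produce no helper rows, so only confirmed duplicates end up redirecting searches.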


Got it? Note that I am not suggesting to implement this right now and with
high priority. Enhancing the search function should be waaay further up  
the list. But before you implement something that cuts the user's problem  
out of the loop (like freedb import into Picard), think about this.


>> That would be the same thing if you'd require everybody who wants to  
>> visit Paris to speak fluent French.
>
> That's the case, isn't it?

And it is stupid, arrogant, and unfriendly [1], which was my point.

>> How about logging all the search terms?
>
> Sure, logging search terms should be easy. But given our current search
> features this may not yield much that is useful.

see above.

>> PS: And additionally this one: Robert, how about we start to get  
>> WikiDocs out of the box _now_. Screw my exams!

I am replying to the rest in a different mail.

   DonRedman

----
[1] I am allowed to say that, I am half French myself :-)
--
Words that are written in CamelCase refer to WikiPages:
Visit http://wiki.musicbrainz.org/ the best MusicBrainz documentation  
around! :-)