French Pronunciations

French Pronunciations

by Charles Darwin

Hello list,

I need some help downloading all the French pronunciation files (the
ogg files) from fr.wiktionary.org.

As a workaround I have this:

grep -o 'audio=.*ogg' frwiktionary-20081228-pages-articles.xml | sed 's/audio=\([A-Za-z].*\)/curl -sO "http:\/\/fr.wiktionary.org\/wiki\/Fichier:\1"/g' > filenames

which extracts all the file names from
`frwiktionary-20081228-pages-articles.xml`. I then used this:

http://fr.wiktionary.org/wiki/Fichier:file_names

as the base URL, so a simple

curl -O "http://fr.wiktionary.org/wiki/Fichier:Fr-chaise.ogg"

should download any of the files. It all looks fine and rosy up to here, but
when I open the file with QTPlayer it doesn't play and the info window
reports 0 bytes, whereas:

ls Fichier\:Fr-chaise.ogg
-rw-r--r--   -   24K 11 Jan 09:53 Fichier:Fr-chaise.ogg

There is nothing wrong with my players; this shell script:

test -d /tmp/frp || /bin/mkdir -p /tmp/frp ; cd /tmp/frp
for i in $(curl -s "http://fr.wiktionary.org/wiki/${1}" | grep --only-matching "http.*ogg\"" | /usr/bin/sed 's/".*$//') ; do
    curl -sO "$i"
done && open -g -a itunes /tmp/frp/Fr-${1}*.ogg

works for most entries (try chaise). With it I download the file Fr-chaise.ogg
and play it with no problem. Also:

ls /Volumes/neo/Users/pm/Music/iTunes/iTunes\ Music/Unknown\ Artist/Unknown\ Album/une\ chaise.ogg
-rw-r--r--@  -   15K 11 Jan 10:00 …/Music/iTunes/iTunes Music/Unknown Artist/Unknown Album/une chaise.ogg


Thanks,
_______________________________________________
Wiktionary-l mailing list
Wiktionary-l@...
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l

Re: French Pronunciations

by Bugzilla from kipmaster@gmail.com

Hi Charles,

when you download "http://fr.wiktionary.org/wiki/Fichier:Fr-chaise.ogg",
I think what you obtain is not an ogg file but an HTML file,
i.e. the same HTML that you see if you click on the link.
(You can open the file you obtained in a text editor to check.)
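
The same check can be done without a text editor by looking at the first bytes of the download: an Ogg stream begins with the four ASCII bytes "OggS", whereas a wiki page begins with HTML markup. A minimal Python 3 sketch; the helper names and the 64-byte read size are made up for illustration, and the HTML test is only a rough heuristic:

```python
OGG_MAGIC = b"OggS"  # capture pattern at the start of every Ogg page


def looks_like_ogg(data: bytes) -> bool:
    """Return True if the data starts with the Ogg capture pattern."""
    return data[:4] == OGG_MAGIC


def looks_like_html(data: bytes) -> bool:
    """Rough heuristic: markup starts with '<' after optional whitespace."""
    return data.lstrip()[:1] == b"<"


def describe(path: str) -> str:
    """Classify a downloaded file by reading only its first 64 bytes."""
    with open(path, "rb") as fh:
        head = fh.read(64)
    if looks_like_ogg(head):
        return "ogg"
    if looks_like_html(head):
        return "html"
    return "unknown"
```

Calling describe("Fichier:Fr-chaise.ogg") on the file saved by curl would then show which of the two it is.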

What you need to download is the file linked in this HTML,
in this case
http://upload.wikimedia.org/wikipedia/commons/a/a3/Fr-chaise.ogg
There is a method that can be used to automatically work out the /a/a3/
directories given the name of the file (Fr-chaise.ogg).

I am not sure how I did it (but I did it);
I think it is the first character of the md5sum of the file name, and then
the first two characters.
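
The scheme described above can be sketched in Python 3 as follows. This assumes the description is right (first hex character of the MD5 of the file name, then the first two) and, as a further assumption, that spaces in the name are turned into underscores before hashing. The function name hashed_upload_url is hypothetical; the base URL is the one quoted above.

```python
import hashlib

# Base taken from the Fr-chaise.ogg URL quoted in this message.
UPLOAD_BASE = "http://upload.wikimedia.org/wikipedia/commons"


def hashed_upload_url(filename: str, base: str = UPLOAD_BASE) -> str:
    """Build <base>/<h>/<hh>/<name> from the MD5 hex digest of the file name."""
    # Assumption: spaces are stored as underscores before hashing.
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # First directory: first hex character; second: first two hex characters.
    return "%s/%s/%s/%s" % (base, digest[0], digest[:2], name)


if __name__ == "__main__":
    print(hashed_upload_url("Fr-chaise.ogg"))
```

If the scheme is as described, the printed URL should contain the same /a/a3/ directories as the link above. Names with accented or other special characters would additionally need percent-encoding before being passed to curl.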

Do not hesitate to contact me if you are interested in this md5 thingy,
I know I have it somewhere.

Have fun !




Re: French Pronunciations

by Gerard Meijssen-3

Hoi,
I am surprised that these files are in the French Wiktionary. It would be
helpful if these files were all moved to Commons so that the other
Wiktionaries could benefit from them as well.
Thanks,
      GerardM

2009/1/12 Christophe Millet <kipmaster@...>


Re: French Pronunciations

by Thomas Goldammer

They are on Commons if I understand "Ce fichier et les informations de
sa page de description sont présents sur Wikimedia Commons."
correctly. ;)

Th.

2009/1/12, Gerard Meijssen <gerard.meijssen@...>:
> Hoi,
>  I am surprised that these files are in the French Wiktionary. It would be
>  helpful if these files were all moved to Commons so that the other
>  Wiktionaries could benefit from them as well.
>  Thanks,
>       GerardM
>

Re: French Pronunciations

by Charles Darwin

cirwin at #wiktionary showed me a better way of doing this using
Python. Here is a how-to for the curious:

svn co https://mwclient.svn.sourceforge.net/svnroot/mwclient/trunk
cd trunk
python  # at least Python 2.4 is needed (see the README.txt that comes with mwclient)

import mwclient
commons = mwclient.Site('commons.wikimedia.org')

# Walk the category and save every .ogg member under /tmp/
for img in commons.pages['Category:French pronunciation'].members():
    if img.name.endswith('.ogg'):
        print img.name.encode('utf-8')
        # img.name[5:] drops the 5-character namespace prefix ("File:");
        # open in 'wb' mode because the audio data is binary
        saveas = open(u"/tmp/%s" % img.name[5:], 'wb')
        remote = img.download()
        saveas.write(remote.read())
        saveas.close()

Then press Ctrl-D to get out of Python. This populates your /tmp/
folder with all the ogg files (130M).

I forgot to ask cirwin how to keep the local copy up to date, so
if anyone here is familiar with Python or mwclient, or knows Bryan
<http://commons.wikimedia.org/wiki/User_talk:Bryan> and can get an
answer from him (or cirwin), I would appreciate it very much.
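
One possible approach, not confirmed by cirwin or Bryan: rerun the loop but skip every file that already exists locally, so that only new recordings are fetched. The Python 3 sketch below covers only that bookkeeping step. The function name names_to_fetch and its parameters are hypothetical, and it would not notice a recording that was re-uploaded under the same name.

```python
import os


def names_to_fetch(remote_titles, local_dir, prefix="File:"):
    """Return the .ogg titles from remote_titles that are not yet in local_dir."""
    # A missing directory simply means nothing has been downloaded yet.
    have = set(os.listdir(local_dir)) if os.path.isdir(local_dir) else set()
    todo = []
    for title in remote_titles:
        if not title.endswith(".ogg"):
            continue
        # Strip the namespace prefix, as img.name[5:] does in the loop above.
        local_name = title[len(prefix):] if title.startswith(prefix) else title
        if local_name not in have:
            todo.append(title)
    return todo
```

The titles would come from the category listing (img.name for each member), and only the returned ones would then be downloaded.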

Thanks,
