French Pronunciations

Hello list,

I need some help downloading all the French pronunciation files (the ogg files) from fr.wiktionary.org. As a workaround I have this:

    grep -o audio=.*ogg frwiktionary-20081228-pages-articles.xml | sed 's/ audio=\([aA-zZ].*\)/curl\ -sO\ \"\http\:\/\/fr\.wiktionary\.org\/wiki\/Fichier:\1\"/g' > filenames

which extracts all the file names from `frwiktionary-20081228-pages-articles.xml` and turns each one into a curl command, using http://fr.wiktionary.org/wiki/Fichier: followed by the file name as the URL. So a simple

    curl -O "http://fr.wiktionary.org/wiki/Fichier:Fr-chaise.ogg"

should download any file. It all looks fine and rosy up to here, but when I open the file with QTPlayer it doesn't play and the info window gives me 0 bytes, whereas:

    ls Fichier\:Fr-chaise.ogg
    -rw-r--r-- - 24K 11 Jan 09:53 Fichier:Fr-chaise.ogg

There is nothing wrong with my players; this shell script:

    test -d /tmp/frp || /bin/mkdir -p /tmp/frp ; cd /tmp/frp ; for i in $(curl -s http://fr.wiktionary.org/wiki/${1} |grep --only-matching "http.*ogg"\" |/usr/bin/sed 's/\".*$//') ; do curl -sO $i; done && open -g -a itunes /tmp/frp/Fr-${1}*.ogg

works for most entries (try chaise). With it I download the file Fr-chaise.ogg and play it with no problem. Also:

    ls /Volumes/neo/Users/pm/Music/iTunes/iTunes\ Music/Unknown\ Artist/Unknown\ Album/une\ chaise.ogg
    -rw-r--r--@ - 15K 11 Jan 10:00 …/Music/iTunes/iTunes Music/Unknown Artist/Unknown Album/une chaise.ogg

Thanks,
_______________________________________________
Wiktionary-l mailing list
Wiktionary-l@...
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l
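A quick way to see what curl actually saved is to look at the first bytes of the file: an Ogg stream starts with the four bytes "OggS", while a saved web page starts with markup. The following is only a minimal sketch; the function name `looks_like_ogg` is made up for illustration and is not part of any tool mentioned in this thread.

```python
def looks_like_ogg(path):
    """Return True if the file at `path` starts with the Ogg capture pattern."""
    # Every Ogg page begins with the 4-byte magic "OggS"; an HTML page
    # saved under a .ogg name will start with something else.
    with open(path, "rb") as fh:
        return fh.read(4) == b"OggS"


if __name__ == "__main__":
    import sys

    # Usage: python check_ogg.py Fichier:Fr-chaise.ogg
    for name in sys.argv[1:]:
        print(name, "ogg" if looks_like_ogg(name) else "not ogg")
```

A file that fails this check is worth opening in a text editor to see what it really contains.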
|
|
Re: French Pronunciations

Hi Chales,

When you download "http://fr.wiktionary.org/wiki/Fichier:Fr-chaise.ogg", I think what you obtain is not an ogg file but an HTML file, i.e. the same HTML that you see if you click on the link. (You can open the file you obtained in a text editor to check.)

What you need to download is the file linked in this HTML, in this case:
http://upload.wikimedia.org/wikipedia/commons/a/a3/Fr-chaise.ogg

There is a method for automatically guessing the /a/a3/ directories from the name of the file (Fr-chaise.ogg). I am not sure how I did it (but I did it); I think it is the first character of the md5sum and then the first two characters.

Do not hesitate to contact me if you are interested in this md5 thingy, I know I have it somewhere.

Have fun!
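The md5 idea can be sketched in a few lines of Python. This is a hedged illustration, not the script mentioned above: the function name `commons_upload_url` is invented here, and it assumes the hash is taken over the file name with spaces turned into underscores and the first letter capitalised, which is how MediaWiki normalises titles. The base URL is the one from the example in this message.

```python
import hashlib
from urllib.parse import quote


def commons_upload_url(filename,
                       base="http://upload.wikimedia.org/wikipedia/commons"):
    """Guess the direct upload URL for a Commons file name (hypothetical helper)."""
    # Assumption: normalise the name the way MediaWiki titles are stored,
    # i.e. underscores instead of spaces and an upper-case first letter.
    name = filename.replace(" ", "_")
    name = name[:1].upper() + name[1:]
    # Hex md5 of the UTF-8 encoded name.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Directory layout: first hex character, then the first two hex characters.
    return "%s/%s/%s/%s" % (base, digest[0], digest[:2], quote(name))


if __name__ == "__main__":
    print(commons_upload_url("Fr-chaise.ogg"))
```

If the guess is right, the printed URL for Fr-chaise.ogg should match the /a/a3/ address given above; comparing the two is an easy check before running it over a whole list of names.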
|
|
Re: French Pronunciations

Hoi,
I am surprised that these files are in the French Wiktionary. It would be helpful if these files were all moved to Commons so that the other Wiktionaries could benefit from them as well.
Thanks,
GerardM

2009/1/12 Christophe Millet <kipmaster@...>
> Hi Chales,
>
> when you download "http://fr.wiktionary.org/wiki/Fichier:Fr-chaise.ogg",
> I think what you obtain is not an ogg file, but an html file,
> [...]
|
|
Re: French Pronunciations

They are on Commons, if I understand "Ce fichier et les informations de sa page de description sont présents sur Wikimedia Commons." ("This file and the information on its description page are on Wikimedia Commons.") correctly. ;)

Th.

2009/1/12, Gerard Meijssen <gerard.meijssen@...>:
> Hoi,
> I am surprised that these files are in the French Wikitionary.. It would be
> helpful when these files were all moved to Commons so that the other
> Wiktionaries could benefit from them as well.
> Thanks,
> GerardM
|
|
Re: French Pronunciations

cirwin at #wiktionary showed me a better way of doing this using python. Here is a how-to for the curious:

    svn co https://mwclient.svn.sourceforge.net/svnroot/mwclient/trunk
    cd trunk
    python   # at least python 2.4 is needed (see the README.txt that comes with mwclient)

Then, at the python prompt:

    import mwclient
    commons = mwclient.Site('commons.wikimedia.org')
    for img in commons.pages['Category:French pronunciation'].members():
        if img.name.endswith('.ogg'):
            print img.name.encode('utf-8')
            # img.name[5:] drops the 5-character namespace prefix from the page name
            saveas = open(u"/tmp/%s" % img.name[5:], 'wb')  # binary mode for audio data
            remote = img.download()
            saveas.write(remote.read())
            saveas.close()

Then Ctrl-D to get out of python. This is going to populate your /tmp/ folder with all the ogg files (130M).

I forgot to ask cirwin how to keep the local copy up to date, so if anyone here is familiar with python or mwclient, or knows Bryan <http://commons.wikimedia.org/wiki/User_talk:Bryan> and can get an answer from him (or cirwin), I would appreciate it very much.

Thanks,
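One simple way to approach the up-to-date question, offered only as a sketch: before downloading, compare the names in the category with what is already in the local folder and fetch only the missing ones. The helper `names_to_fetch` below is hypothetical and uses just the standard library; it expects file names with the namespace prefix already removed, and it only notices new files, not files that were re-uploaded under the same name.

```python
import os


def names_to_fetch(remote_names, local_dir):
    """Return the .ogg names from `remote_names` that are not yet in `local_dir`."""
    # A missing directory simply means nothing has been downloaded yet.
    have = set(os.listdir(local_dir)) if os.path.isdir(local_dir) else set()
    # Keep the original order so downloads happen in category order.
    return [n for n in remote_names if n.endswith(".ogg") and n not in have]


if __name__ == "__main__":
    # Example names only; in practice the list would come from the category loop.
    print(names_to_fetch(["Fr-chaise.ogg", "Fr-table.ogg"], "/tmp"))
```

Inside the download loop, the same idea amounts to skipping any file whose local copy already exists.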