A multi-languages search engine for OCAL. Your help to finalize it

by COTTE ANDRÉ

Hello,

By way of introduction, we (Kevin Albert and André Cotte) are working to
make open source software more readily available to school systems in
the province of Québec, Canada.

We’d like to bring to your attention a project that we have been working
on for some time now – in fact the project is quite far advanced – and
in particular to point out one of the few remaining obstacles.  We’re at
the stage where only the OCAL team will be able to help us with this
problem.

Basically, the project would allow users to search the OCAL database in
French, using the OCAL API and the Google Translate service. (Our
project could be adapted to other languages as well.)

We are aware that most of the OCAL keywords (tags) are in English, which
presents a problem for students in non-English language countries.  We
are also aware that the OCAL site makes available a rich API, and so we
have come up with the following process for accessing OCAL clipart
images in French.

A francophone student types the keyword for the search into the query
box.  Our program sends this word to the Google translation service,
then submits the English translation to OCAL.  OCAL returns the results
to us, and we pass them on to the student.  So far, so good.
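
As a rough illustration of this flow, here is a minimal Python sketch.
The function names, the tiny dictionary and the sample catalogue are all
invented stand-ins: the real program would call the Google translation
service and the OCAL API in their place.

```python
from typing import Callable, Dict, List


def search_in_french(
    keyword_fr: str,
    translate: Callable[[str], str],
    search_ocal: Callable[[str], List[Dict]],
) -> List[Dict]:
    """Translate a French keyword, then run the search on the translation."""
    keyword_en = translate(keyword_fr.strip().lower())
    return search_ocal(keyword_en)


# Hypothetical stand-ins so the sketch runs without any network access.
FAKE_DICTIONARY = {"chat": "cat", "maison": "house"}
FAKE_CATALOGUE = [
    {"title": "Sleeping cat", "tags": ["cat", "animal"]},
    {"title": "Red house", "tags": ["house", "building"]},
]


def fake_translate(word: str) -> str:
    # Falls back to the original word when no translation is known.
    return FAKE_DICTIONARY.get(word, word)


def fake_search(word: str) -> List[Dict]:
    # Keeps only the images whose tag list contains the English keyword.
    return [image for image in FAKE_CATALOGUE if word in image["tags"]]


results = search_in_french("Chat", fake_translate, fake_search)
titles = [image["title"] for image in results]
```

Passing the translator and the search function in as arguments keeps
the flow itself independent of whichever services end up being used.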

The problem we are encountering in an educational context is that we
cannot identify, through the OCAL API, which images are marked NSFW.  We
know that this flag exists because images are marked NSFW on the site,
but the API does not return this value.

We have come up with two possible solutions.

The first and simplest solution would be for the API to send us the NSFW
status of an image directly.  This may not be possible because you use
ccHost for the API.

The second option would be for us to program a mechanism to add an NSFW
tag to each of the clipart images for which the NSFW field has already
been checked.  The keywords that the API sends back to us would then
immediately reveal the NSFW state of the image.
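
With this second option, the filtering on our side could look roughly
like the Python sketch below.  It assumes that each result carries a
list of tags and that the marker tag is spelled "nsfw"; both are
assumptions for illustration, not a description of what the API returns
today.

```python
from typing import Dict, List

NSFW_TAG = "nsfw"  # assumed spelling of the marker tag


def is_safe(image: Dict) -> bool:
    """Return True when the image's tags do not include the NSFW marker."""
    tags = {tag.strip().lower() for tag in image.get("tags", [])}
    return NSFW_TAG not in tags


def keep_safe(images: List[Dict]) -> List[Dict]:
    """Drop every image that carries the NSFW marker tag."""
    return [image for image in images if is_safe(image)]


# Invented sample results, standing in for what a search might return.
sample = [
    {"title": "Apple", "tags": ["fruit", "food"]},
    {"title": "Anatomy sketch", "tags": ["body", "NSFW"]},
]
safe_titles = [image["title"] for image in keep_safe(sample)]
```

The comparison ignores case and surrounding spaces, so "NSFW" and
"nsfw" are treated alike.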

We are aware that in asking you to help us solve this problem we are
piling another task on what is undoubtedly an already heavy workload.
But solving our problem opens the door to making OCAL accessible to
grateful users in every language in the world.

The program we have designed to access the database in French will be
made available with an open source license to people of all languages.
There would be no problem making it available on the OCAL site.

We thank you for giving our problem your attention, and hope that we
will be able to collaborate fruitfully in making OCAL available to a
global audience. 


Yours truly,

Kevin Albert

André Cotte

Zone libre en education

http://zonelibre.grics.qc.ca 


_______________________________________________
clipart mailing list
clipart@...
http://lists.freedesktop.org/mailman/listinfo/clipart

Re: A multi-languages search engine for OCAL. Your help to finalize it

by Francis Bond-3

G'day,

This sounds like a great project.  I am afraid I don't know enough
about ccHost to help with the tagging, although I really hope someone
can fix the system so that the tags are properly inserted.

However, if you are interested I could help you look up words in
French using a different method as well.  I have tagged almost 1,000
images with WordNet tags.  Using them you can directly look up the
French WordNet (or the WordNet Libre du Français).  If someone solves
the ccHost tagging problem I can almost certainly tag many more.

With the tags, you can avoid some ambiguity.  For example, if you look
for 'chauve-souris' Google will translate it as 'bat'.  If you search
for 'bat', you will not only get the mammal bat, but also the sports
bat.  Using the WordNet tags, you could go straight from chauve-souris
to the image you want, or if there is ambiguity have it explicitly
indicated.
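
A minimal sketch of this idea in Python follows.  The sense identifiers,
the French word list and the image records are all invented for
illustration; they are not taken from the actual WordNet mappings or
from OCAL.

```python
from typing import Dict, List

# Hypothetical French-word-to-sense table (placeholder identifiers).
FRENCH_TO_SENSES: Dict[str, List[str]] = {
    "chauve-souris": ["SENSE-BAT-ANIMAL"],
    "batte": ["SENSE-BAT-SPORTS"],
    "avocat": ["SENSE-AVOCADO", "SENSE-LAWYER"],
}

# Hypothetical images, each tagged with a sense rather than a bare word.
IMAGES = [
    {"title": "Flying bat", "senses": ["SENSE-BAT-ANIMAL"]},
    {"title": "Baseball bat", "senses": ["SENSE-BAT-SPORTS"]},
    {"title": "Halved avocado", "senses": ["SENSE-AVOCADO"]},
    {"title": "Lawyer in court", "senses": ["SENSE-LAWYER"]},
]


def lookup(word_fr: str) -> Dict[str, List[str]]:
    """Group matching image titles by sense, so ambiguity stays visible."""
    senses = FRENCH_TO_SENSES.get(word_fr.strip().lower(), [])
    return {
        sense: [img["title"] for img in IMAGES if sense in img["senses"]]
        for sense in senses
    }


bat_results = lookup("chauve-souris")
ambiguous_results = lookup("avocat")
```

Here 'chauve-souris' leads only to the animal, never to the sports bat,
while an ambiguous French word such as 'avocat' comes back with one
group of images per sense, so the choice can be shown to the user.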

The mappings and some not very good documentation are at:
<http://nlpwww.nict.go.jp/wn-ja/index.en.html>.  If you are interested
and have any questions please feel free to ask me.

Yours,

--
Francis Bond <http://www3.ntu.edu.sg/home/fcbond/>
Division of Linguistics and Multilingual Studies
Nanyang Technological University