web to semantic web : an automated approach

Hello friends,

I have been following the semantic web for some time now and have seen quite a lot of projects (dbpedia, FOAF, etc.) trying to generate semantic content. While these approaches may have been successful in their own goals, one major problem plaguing the semantic web as a whole is the lack of semantic content. Unfortunately, there is nothing in sight that we can rely on to generate semantic content for the truckloads of information being put on the web every day. I think one of the _wrong_ assumptions in the semantic web community is that content creators will create semantic data, which I think is too much to ask even of the more technically sound part of the web community, let alone the whole of it. It hasn't happened over the last so many years and I don't see it happening in the near future.

I think what we need to move the semantic web forward is a mechanism to _automatically_ convert the information on the web to semantic information. There are many software packages/services that can be used for this purpose. I am currently developing a prototype that uses services from OpenCalais (http://www.opencalais.com/) to convert ordinary text to semantic form. This service is very limited in the entities it supports at the moment, but it's a very good start. I am pretty sure there are many other good options that are unknown to me. The currently very primitive prototype can be seen at http://arcse.appspot.com. It implements very few of the ideas I have for this. It is hosted on Google's AppEngine, so it sometimes gives timeout messages internally; please bear with this :).

This automatic conversion, however, is not a simple task and needs work in a lot of domains, ranging from NLP to artificial intelligence to the semantic web to logic, etc. That is why I am writing this mail. I will be more than happy if we can join together to form a like-minded team that can work on solving this most important problem plaguing the semantic web currently.

Waiting for your suggestions/criticisms,
Ravinder Thakur
|
|
|
Re: web to semantic web : an automated approach

Any thoughts on this?

On Mon, Oct 20, 2008 at 12:38 AM, ravinder thakur <ravinderthakur@...> wrote:
> Hello friends,
|
|
|
Re: web to semantic web : an automated approach

On Sunday 19 October 2008 21:08:10 ravinder thakur wrote:
> I think one of the _wrong_ assumptions in the semantic web community is
> that content creators will create semantic data, which I think is too
> much to ask even of the more technically sound part of the web community,
> let alone the whole of it. It hasn't happened over the last so many years
> and I don't see it happening in the near future.

This is indeed an essential point in the development of the Semantic Web. I'm mostly in the "it'll happen" camp with regard to people creating semantic content. There are two main sources. One is that, reportedly, 70% of the data on the web is already in some structured form, so what's needed is to clarify what that structure means. The other is what is often called "Mass Intelligence". Geonames.org is an excellent example of an application that lets people produce meaningful data, and I think that could be extended to many more domains.

I personally would prefer Mass Intelligence, i.e. the real intelligence of many people, over Artificial Intelligence (when it tries to do things humans are better at), as I fear that the Semantic Web would inherit the perceived and/or real weaknesses of AI if AI were dominant.

That's not to say that an automated approach isn't useful. I think it is, but I see it as something that should be applied with care where it makes sense. I also have colleagues who would agree with you that it is the most interesting thing to do right now.

> I am pretty sure there will be many other good options available that
> might be unknown to me.

I think Cypher, which had a 1.9 release announced here a couple of days ago, would be of great interest to you: http://www.monrai.com/products/cypher/

Also, I think IBM's SUKI http://www.research.ibm.com/UIMA/SUKI/ might be of interest.

Kind regards
Kjetil Kjernsmo
--
Senior Knowledge Engineer
Mobile: +47 986 48 234
Email: kjetil.kjernsmo@...
Web: http://www.computas.com/
| SHARE YOUR KNOWLEDGE |
Computas AS PO Box 482, N-1327 Lysaker | Phone:+47 6783 1000 | Fax:+47 6783 1001
|
|
|
Re: web to semantic web : an automated approach

Hi Mr. Ravinder,
It is a nice idea: why can't we transform web content into semantic web content?
This is a necessity, and it will avoid regenerating web content as semantic web data.
I am also working in this direction.
With regards,
Dr. Rajkumar Kannan
Associate Professor
Dept. of Computer Science
Bishop Heber College, Tiruchirappalli, TN, India
URL: http://member.acm.org/~rajkumark/
===================================================
On 10/20/08, रविंदर ठाकुर (ravinder thakur) <ravinderthakur@...> wrote:
|
|
|
|
Re: web to semantic web : an automated approach

Hi,

it's all happening, but it's not as easy as one may think at first. Basically there are multiple sources of structured/interlinked information (A) and multiple ways to expose (B) linked data on the Web.

(A)
1. generated (wrapped) from information systems (RDBMS, etc.) => needs mapping
2. user-generated (natively RDF-based systems, Semantic Wikis, etc.) => already in the right form
3. extracted (AI, heuristics, cypher, etc. - different levels of granularity; difficult, sometimes wrong)

(B)
1. RDF documents
2. SPARQL endpoints
3. embedded into HTML (RDFa)

The Linked Data Community project plays an important role regarding A1 and A2. A3 is cumbersome and may produce wrong links and information - a nightmare without implicit support for provenance. In corporate environments A3 is already very popular, but at the broader Web scale I'm a bit sceptical that this will work well. What do you think?

Regards,
AndyL

On Oct 20, 2008, at 10:35 AM, Kannan Rajkumar wrote:

Web of Data Practitioners Days / Oct 22-23 / Vienna
http://www.webofdata.info
----------------------------------------------------------------------
Dipl.-Ing.(FH) Andreas Langegger
Institute for Applied Knowledge Processing
Johannes Kepler University Linz
A-4040 Linz, Altenberger Straße 69
http://www.langegger.at
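
To make the "needs mapping" step of A1 a bit more concrete, here is a minimal sketch of wrapping one relational row as RDF and serializing it as N-Triples (one possible form of B1). The base URI, the column-to-predicate table and the function names are assumptions made up for illustration; only the FOAF namespace and its `name` and `homepage` properties are real vocabulary. A real wrapper would also need IRI validation and datatype handling, which this sketch skips.

```python
# Hypothetical base URI for subjects; an assumption for this sketch only.
BASE = "http://example.org/person/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical mapping from column names to predicates.
COLUMN_TO_PREDICATE = {
    "name": FOAF + "name",
    "homepage": FOAF + "homepage",
}

# Columns whose values are links (IRIs) rather than plain literals.
IRI_COLUMNS = {"homepage"}


def escape_literal(value):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def row_to_ntriples(row, key="id"):
    """Turn one row (a dict) into a list of N-Triples lines."""
    subject = "<%s%s>" % (BASE, row[key])
    lines = []
    for column, predicate in COLUMN_TO_PREDICATE.items():
        value = row.get(column)
        if value is None:
            continue  # no value for this column, so no triple
        if column in IRI_COLUMNS:
            obj = "<%s>" % value  # not validated as an IRI in this sketch
        else:
            obj = '"%s"' % escape_literal(str(value))
        lines.append("%s <%s> %s ." % (subject, predicate, obj))
    return lines


if __name__ == "__main__":
    for line in row_to_ntriples({"id": 7, "name": "Ada",
                                 "homepage": "http://example.org/ada"}):
        print(line)
```

The mapping table is the part that needs human judgment: the code only applies it, which is roughly why A1 counts as "needs mapping" while A2 is "already in the right form".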
|
|
|
Re: web to semantic web : an automated approach

Hi all,

Popular content management systems have a great role to play in democratizing the semantic web. Some CMSs like Drupal have already understood this and are rapidly moving towards exposing their content as RDF data. Because so many people use them, there is great potential in implementing some built-in semantic web features, and that's what's happening with Drupal 7, which will ship with RDFa support in core. Drupal will then be part of category A2, with more than 30,000 RDFa-enabled sites!

--
Stéphane, scor @ drupal.org
http://drupal.org/user/52142

On Mon, Oct 20, 2008 at 9:55 AM, Andreas Langegger <al@...> wrote:
|
|
|
|
Re: web to semantic web : an automated approach

> This is indeed an essential point in the development of the Semantic Web.
> I'm mostly in the "it'll happen" camp with regards to people creating
> semantic content. There are two main sources, one is that they say that
> 70% of the data on the web is already in some structured form, thus
> what's needed is to clarify what that structure means.

I have been in the "it will happen" camp, but nothing far-reaching seems to be happening, so I am out. I would say that most (90%) of the data out there is unstructured. Also, most of the structured data is specific to companies and they won't share it. People are writing blogs, Wikipedia and news websites are producing content continuously, people are reviewing products and putting their opinions online; the list of unstructured data is endless and will continue to grow with increasing Internet penetration in third-world countries. To assume that all users will manually convert this data to structured form seems too far-fetched. To assume that the information put online by these end users is of less use than, say, wikipedia/dbpedia would be a horrible mistake. Even if we have large data, someone needs to combine this vast amount of RDF/OWL data and create a global graph interlinking all of it. (BTW, I see some serious ontology issues that anyone is likely to hit with this approach.)

> Also, I think IBM's SUKI http://www.research.ibm.com/UIMA/SUKI/ might be
> of interest.

I have used UIMA, but it's not a one-man army's job. It's just a framework, and there is a hell of a lot still to be done on top of it, e.g. writing domain-specific components.

> A3 is cumbersome and may produce wrong links and information - a
> nightmare without implicit support for provenance. In corporate
> environments A3 is already very popular, but in the broader Web-scale I'm
> a bit sceptical this will work well. What do you think?

I am pinning a lot of hope on the progress we have made in NLP, and no doubt NLP will continue to improve in the near future. Currently, to alleviate the wrong linking/information problem, I think redundancy of information will play an important role. If we have 10 sources of the same piece of information and 6 NLP parsers give one view and the remaining 4 give another, I am pretty sure the one on which 6 agree will be the right one. Also, we don't have to be 100% right (especially in the beginning), since (other than your boss :) ) nobody is 100% right :)

> Some CMS like Drupal have already understood this and are rapidly moving
> towards exposing their content as RDF data

Here's the problem. Drupal is exposing _the data stored in Drupal_. Do we expect everyone on the web to use Drupal? No. What happens to the information on times.com, blogspot.com, googlegroups.com or kashmirtimes.com? The semantic web is not about converting someone's data and exposing it with a semantic view. It's about the _whole_ of the data out there on the web, then building a web of semantic links on top of that, and then doing reasoning on top of that, etc.

Thanks for initiating the discussion anyway. Keep it coming :)
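
The redundancy idea above (6 of 10 extractions agreeing) can be sketched as a simple majority vote. This is only an illustration under assumptions: the function name and the `min_share` threshold are made up, and the extractors are assumed to emit already-normalized (subject, predicate, object) tuples so that equal facts compare equal. Agreement raises confidence but does not prove correctness, and the sketch keeps no provenance.

```python
from collections import Counter


def majority_triple(candidates, min_share=0.5):
    """Return (triple, votes) for the most common candidate triple if its
    share of all candidates is strictly greater than min_share, else None."""
    if not candidates:
        return None
    counts = Counter(candidates)
    triple, votes = counts.most_common(1)[0]
    if votes / len(candidates) > min_share:
        return triple, votes
    return None  # no sufficiently clear winner


if __name__ == "__main__":
    a = ("Vienna", "locatedIn", "Austria")
    b = ("Vienna", "locatedIn", "Australia")
    # Six extractions agree on a, four on b, as in the example above.
    print(majority_triple([a] * 6 + [b] * 4))
```

With a 5/5 split the function returns `None` rather than picking a side, which is where a human or a provenance record would have to decide.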
|
|
|
Re: web to semantic web : an automated approach

Stefan,

I have a content management background, and I am a Drupal user. I have been looking forward to the advances that you mention below; however, my problem is modelling the triples. I am sure that a tool can automatically extract/infer triples from content, but I am not sure these would be meaningful/representative.

Assuming the functionality is available to expose the data as RDF (of course I would have to upgrade to Drupal 7, and all the custom modules and functionality written for Drupal 6 would have to be rewritten, but that's admittedly another problem), what kind of knowledge schema/ontology would it adhere to? Would the system automatically infer what is the subject and what is the predicate, and would I (website mom) be able to override what the system suggests?

I haven't quite worked out how the system would work.

Clues welcome.

Paola Di Maio

On Mon, Oct 20, 2008 at 4:28 PM, Stephane Corlosquet <scorlosquet@...> wrote:
> Hi all,
>
> Popular content management systems have a great role to play in
> democratizing the semantic web. Some CMS like Drupal have already understood
> this and are rapidly moving towards exposing their content as RDF data.
> Because so many people are using, there is a great potential in implementing
> some built in semantic web features, and that's what's happening with Drupal
> 7, which will ship with RDFa support in core. Drupal will then be part of
> the category A2, with more than 30 000 RDFa enabled sites!
>
> --
> Stéphane,
> scor @ drupal.org
> http://drupal.org/user/52142

--
Paola Di Maio
School of IT
www.mfu.ac.th
|
|
|
Re: web to semantic web : an automated approach

>> Some CMS like Drupal have already understood this and are rapidly moving
>> towards exposing their content as RDF data
>
> Here's the problem. Drupal are exposing _the data stored in Drupal. Do we
> expect everyone on web to use Drupal ? No. What happens to information on
> times.com, blogspot.com, googlegroups.com or kashmirtimes.com ? Semantic web
> is not about converting someone's data and exposing it with semantic view.
> Its about the _whole_ data out there on web and then building a web of
> semantic links on top of that and then doing reasoning on top of that etc.

I remember when RSS first, and then ATOM, became available to export feeds: first one, then two, then all the platforms supported this small but vital feature.

I figure RDF could be another one of those, but the question I asked about modelling the triples is still something I have not come to terms with.

I still can't figure it out, but I think the answer is somewhere along those lines.

pdm
|
|
|
Re: web to semantic web : an automated approach

> Here's the problem. Drupal are exposing _the data stored in Drupal. Do we
> expect everyone on web to use Drupal ? No.

I gave Drupal as an example. It's up to the other systems to follow through, maybe learning from the systems already implementing it.

> What happens to information on times.com, blogspot.com, googlegroups.com
> or kashmirtimes.com ?

You'd be surprised by the sites already using Drupal, to name a few: observer.com, fastcompany.com, amnesty.org, universalmusic.com, theonion.com. These already represent many web pages ready to make the switch.

> Semantic web is not about converting someone's data and exposing it with
> semantic view. Its about the _whole_ data out there on web and then
> building a web of semantic links on top of that and then doing reasoning
> on top of that etc.

This cannot happen in one day. This cannot happen via one project. It's a global effort that will take a bit of time.

Stéphane.

On Mon, Oct 20, 2008 at 11:12 AM, रविंदर ठाकुर (ravinder thakur) <ravinderthakur@...> wrote:
|
|
|
|
Re: web to semantic web : an automated approach

I've always been a member of the pragmatics camp. Scepticism helps indeed, but it doesn't help us move forward. I think the idea of a global "Semantic Web" was, and still is, tempting, and many bloggers, columnists, and many smart and visionary people like to talk about web-scale reasoning. Some even said the SW will replace the traditional Web, or the Web 2.0... This is so stupid.

The most important thing to me is the SW standards, the layer cake: the possibility to share and interlink information where it's appropriate. It's just about an open standard for data. For the last 10 years everybody was talking about open protocols and Web services. But what's actually communicated between endpoints is data. XML/XML-Schema won't be replaced either. But if you want to interlink data on the Web, they are not feasible. This is where SW standards rule.

Because SW research is an open and democratic process, there are many different viewpoints and interpretations of what it is. Many have stopped using ontologies and reasoners altogether; they just use RDF and maybe RDF-S, even Ora Lassila - co-author of the original Scientific American article in 2001 [1] - as far as I know. Beyond subsumption based on class hierarchies, reasoning probably does not work for everyday information, but it works very, very well for many applications, mainly from the life sciences. This is what Web 2.0 people and all those sceptical about reasoning usually don't see! They see blogs, foaf, vcards, etc. Here the possibility of RDF-only interlinking is great. I'm sure it will be successful, and I'm sure that CMSs other than Drupal have introduced or will soon introduce RDF features!

Nobody will ever demand that 100% of all information on the Web be RDFized... think pragmatically!

Regards,
AndyL

On Oct 20, 2008, at 12:12 PM, रविंदर ठाकुर (ravinder thakur) wrote:
|
|
|
Re: web to semantic web : an automated approach

There will always be people who are overly optimistic or overly pessimistic about anything in the world, so I won't treat the SW as an exception. On the other hand, we shouldn't dismiss an option just because nobody seems to be asking for it. When Faraday found a way to generate electricity, he thought it was a useful invention even though nobody was using or asking for electricity. I don't expect _everyone_ to be using RDF, but it can be used to solve many problems that no other technology can claim to solve. This is especially true since more and more data is coming to the web, and we need a better way to analyse/search/present that data to the end users looking for it.

But coming to the main point, I don't see why the semantic web as envisioned by its main proponents shouldn't work. I am not saying that the first attempt at it will be the last one, but an honest attempt is much better than some untested opinions :). What we need is:

a) an NLP system (similar to the one at www.opencalais.com) that converts the data on the web to its semantic form (RDF/OWL, etc.) for a much broader set of concepts
b) a store for the data from a)
c) a reasoner for the data stored in b)

As I see it, a) is the hardest part and, ignoring performance/scalability issues, b) and c) already exist. It's the lack of a) that is keeping them from achieving anything great with the semantic web.
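
The a) / b) / c) split above can be sketched end to end in a few lines. Everything here is a toy under stated assumptions: `extract` is a hypothetical stand-in for an NLP service and only recognizes sentences of the form "X is a Y", the store is a plain Python set of tuples, and `infer` is a naive forward-chaining loop over two RDFS entailment patterns (subclass transitivity and type propagation). It is meant to show how the three parts hand data to each other, not how a real extractor, triple store or reasoner is built.

```python
import re

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def extract(text):
    """a) Toy extractor: turns 'X is a Y' / 'X is an Y' into (X, rdf:type, Y)."""
    triples = set()
    for subject, cls in re.findall(r"(\w+) is an? (\w+)", text):
        triples.add((subject, TYPE, cls))
    return triples


def infer(store):
    """c) Apply two RDFS patterns repeatedly until no new triple appears."""
    store = set(store)  # work on a copy of b), the triple store
    while True:
        new = set()
        for s, p, o in store:
            for s2, p2, o2 in store:
                if p2 != SUBCLASS or o != s2:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))  # subclass transitivity (rdfs11)
                elif p == TYPE:
                    new.add((s, TYPE, o2))      # type propagation (rdfs9)
        if new <= store:
            return store
        store |= new


if __name__ == "__main__":
    # b) the store: extracted facts plus a small hand-written class hierarchy
    store = extract("Drupal is a CMS. Vienna is a City.")
    store.add(("CMS", SUBCLASS, "Software"))
    store.add(("Software", SUBCLASS, "Artifact"))
    for triple in sorted(infer(store)):
        print(triple)
```

Even in this toy the extractor is the weak link: the reasoner is a dozen lines, while the regular expression covers one sentence shape, which is in line with the point that a) is the hardest part.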
|
|
|
Re: web to semantic web : an automated approach

On Sun, Oct 19, 2008 at 3:08 PM, ravinder thakur <ravinderthakur@...> wrote:
> I have been following semantic web for some time now and have seen quite a
> lot of projects being run (dbpedia, FOAF etc) trying to generate some
> semantic content. While these approaches might have been successful in their
> goals, one major problem plaguing semantic web as a whole is the lack of
> semantic content. Unfortunately there is nothing in sight that we can rely
> on to generate semantic content for the truckloads of information being put
> on web everyday. I think one of the _wrong_ assumption in semantic web
> community is that content creators will be creating a semantic data which I
> think is too much for the asking from even more technically sound part of
> web community let along whole of the web community. It hasn't happened over
> last so many years and I don't see it happening in the near future.

Thanks for the thought-provoking question, Ravinder. I wrote up a longish response about how we basically need to convince people with rich domain models (RDBMS, etc.) to make their data available on the web in a particular way. But then I re-read your email and realized you already knew this :-)

My perspective is that we already see people publishing their data in machine-readable form on the web in vast quantities [1,2]. There is no lack of semantic data on the web. Although I guess we could go back and forth for a bit about the semantics of 'semantic' ... which wouldn't be much fun.

The challenge (I think) is in convincing current and future publishers that there is value in using some patterns from the semweb community [3, 4]. What sort of stuff does a linked-data space enable? What sorts of things can you _do_ with a linked data graph that you couldn't do before? Where are the real, working, non-hypothetical applications I can play with now that use, say, the linked-open-data space? How do we get people excited enough about these things to invest the time in making their data assets available in this way on the web?

These are the questions I personally need help with. I already have semantic data, expressed in databases of various kinds, painstakingly entered by real people. I need to convince my colleagues that it's a worthwhile endeavor expressing these semantics on the web using URIs and RDF.

I think web2.0 has convinced large portions of the software development and business communities that there _is_ value in making machine-readable data available on the web. But I don't think these people have yet been convinced that the entities described in this data need URIs, and that there is value in linking these entities together within the enterprise and across organizational boundaries. But it feels like we're close to a tipping point ... perhaps you feel like we've been on the tipping point for a while, eh?

In the end I think the way forward is to continue to add to the list of success stories [5], to bring semweb technologies to the broader web developer community [6,7], and to show the synergies between semweb technologies and other stuff like REST, AtomPub, OpenID, Microformats, content-management-systems (Drupal, etc). To keep on keeping on, as it were ... finding new algorithms, building useful tools and real systems now (as you are)--not giving in to despair, and retreating to an ivory tower to gaze into the future where the computers will just take care of it.

//Ed

[1] http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
[2] http://www.programmableweb.com/
[3] http://www.w3.org/TR/cooluris/
[4] http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
[5] http://www.w3.org/2001/sw/sweo/public/UseCases/
[6] http://www.w3.org/TR/xhtml-rdfa-primer/
[7] http://www.w3.org/2003/g/data-view
|
|
|
|
Dear all,

Before we can make any sensible comments on the extent to which we can structure information before it is put on the web, or convert it into SW formats, we need to know what is on the web in terms of (raw) information and data, which types of web services access this information, and who the users are. For the Semantic Web to gain momentum and expand its user base and the number of SW-compliant web pages, we need to do a survey.

For all the hype the Semantic Web has been getting, we must accept the fact that some (quite a bit, actually) of the (raw) data and information will not lend itself to useful content creation. The question that should be answered is what would be useful to convert into SW-compliant format, in what form, and for whom. More effort should be focused on this issue.

Some key words here are open archives (library exchange), open access (to digital repositories), open licenses and open source applications, open access publications, and web portals.

Rainbow Warriors International, Ekolibrium Foundation and WiserEarth are just three examples of non-profits in a specific field, namely sustainable development, who believe a people-centered approach is necessary for expanding the Semantic Web.

I am glad to see Drupal embrace SW. Last year we requested that Drupal consider incorporating SW, in a long email detailing the convergence of (new) technologies, namely 3G and 4G GSM mobile network platforms on GSM phones and search engine technologies. (In a recent submission to Google's www.project10tothe100.com we suggested the GOLDEN TIP, i.e. expanding the search heuristics in the search algorithms to include links to tagged data, which means Google would be able to filter for pages containing SW content.) We have bounced similar ideas off companies like Sybase and the Eclipse consortium (www.eclipse.org).

There are also barriers and obstacles to consider: specialty printed publications that cater to professionals, libraries, academic institutions, research institutes, etc. potentially stand to lose, as do printed newspaper and magazine publishers.

As a sustainable development organization, we are simply trying to rally as much support as possible for the broadest possible platform utilizing the semantic web for a common good. In our case we are trying to empower all stakeholders in sustainable development worldwide through the use of ICT (with mobile telephony and the internet as spearhead technology platforms).

Why? Because sustainable development has the power of UN consensus behind it, which makes it easier to persuade corporate players to throw resources at it.

Companies like Microsoft, Sun Microsystems, Sybase and Oracle, and the developers of web browsers and internet applications widely used by internet users, need to come on board. Then we will have the thrust to get things moving.

I am betting that Google, browser-developing software companies and open source networks will lead the way, with academia and libraries following close behind.

But before we get into anything, we need raw numbers and answers to the question at the beginning of this email.

Milton Ponson
GSM: +297 747 8280
Rainbow Warriors Core Foundation
PO Box 1154, Oranjestad
Aruba, Dutch Caribbean
www.rainbowwarriors.net (under revision)
Project Paradigm: A structured approach to bringing the tools for sustainable development to all stakeholders worldwide
www.projectparadigm.info (under construction)
NGO-Opensource: Creating ICT tools for NGOs worldwide for Project Paradigm
www.ngo-opensource.org (proposed project)
MetaPortal: providing online access to web sites and repositories of data and information for sustainable development
www.metaportal.info (proposed project)
SemanticWebSoftware, part of NGO-Opensource to enable SW technologies in the Metaportal project (proposed site: www.semanticwebsoftware.org)

--- On Mon, 10/20/08, Andreas Langegger <al@...> wrote:
From: Andreas Langegger <al@...>
Dear Ravinder,

You are spot on in saying that quite a few data owners won't feel any incentive to make their data SW-compatible if no monetary or other gain is to be made. I also share the idea that we cannot wait.

In the email I sent off to Google I actually made reference to tag-templates, and not tags, as an extra heuristic set to be included in the search algorithms.

I would like to know if there is anyone out there willing to look at the idea of creating a new search engine that does what Google does but includes semantic web filters. Google can already be customized for languages; how about customizing for specific tags and semantic web structure formats?

I am a mathematician by training and would love to accept the challenge of creating a new search engine which is customizable and semantic web enabled.

After all, anyone who knows the origin of Google may believe it is possible to create something new if Google no longer fits the bill. The same simple line of reasoning Page and Brin had when they built the prototype was, "Build it and they (the guys with money) will come (to our door)."

But the project will have to be OPEN SOURCE with an OPEN LICENSE approach, and preferably have support from popular browsers.

Anyone up to the challenge?

Milton Ponson

--- On Mon, 10/20/08, रविंदर ठा <ravinderthakur@...> wrote:
From: रविंदर ठा <ravinderthakur@...>
BBN, and other NLP researchers, have had considerable success in
using NLP to automatically extract instance data from unstructured text and
map it into ontological knowledge bases. The issue of co-reference
resolution remains a difficult problem. Extracting structure and
automatically creating the ontology is an even harder problem. Continued research
in these areas is important because a great deal of human knowledge is
contained in unstructured data. However, I'm personally convinced that in the
long (maybe very long) run the best approach will be to mark up data as instances
of classes and properties of ontologies as the very first step in information
processing, and then automatically generate unstructured (human-readable)
text from the knowledge bases, tailored to specific human information requests.
Human analysts would no longer spend their time writing unstructured text but
would rather populate Semantic Web knowledge bases. Of course, this approach
to publishing wouldn't apply to novels, plays, poems and other such works of
art, only to tailored responses to direct requests for information.

John

From: semantic-web-request@... [mailto:semantic-web-request@...] On Behalf Of रविंदर ठाकुर (ravinder thakur)
Sent: Monday, October 20, 2008 11:55 AM
To: John Flynn; semantic-web@...; semantic_web@...
Subject: Re: web to semantic web : an automated approach

But what's the incentive for
web site owners to mark up their websites with semantic data? A few days back I
was reading a study conducted by the Opera browser team that said that most of
the HTML generated by websites is not even valid. How can we expect them to
create correct semantic data? Also, what happens to all the other user-submitted
content (blogs, wikis, etc.)?

Instead, why not create a mechanism to automatically convert web data to
semantic data? Opencalais.com is already doing it on a small domain; why can't/shouldn't
we do it at web scale?

John: I realized that you are from BBN. In case you are aware, can you
please tell us from your experience about the state of NLP? To what extent are the
current best NLP systems capable of extracting information from unformatted
text? And what are the hopes for overcoming the current
shortcomings of NLP systems in the future?