Putting Government Data online


Putting Government Data online

by Danny Ayers

Tim typically hid his talent under a bushel

must read :
http://www.w3.org/DesignIssues/GovData.html

--
http://danny.ayers.name


Re: Putting Government Data online

by AzamatAbdoullaev

"Tim typically hid his talent under a bushel
must read : http://www.w3.org/DesignIssues/GovData.html"

I very much doubt that this note will be of much use. I recommend learning
more about the relationship of Data, Information, Knowledge and Wisdom. A
good place to start is Ackoff's paper "From data to wisdom." There is a rich
literature on the data-information-knowledge-wisdom hierarchy (pyramid),
http://en.wikipedia.org/wiki/DIKW. More advanced concepts are Linked
Information and Linked Knowledge, or the Wisdom Pyramid with a meaningfully
dynamic topology of knowledge networks: full relationship as well as line,
loop, bus, mesh, star, or tree.

It is claimed that "Linked Data allows different things in different
datasets of all kinds to be connected."
http://www.thenationaldialogue.org/ideas/linked-open-data.

As it is, Linked Data (http://linkeddata.org/) looks like a big mess-up of
data, with low-quality content and a lack of any knowledge structure or
inference mechanism.



I share the concerns recently expressed by John Sowa on another forum:

"My major complaint about the Semantic Web is that they ignored all
the development techniques that worked successfully for years, and
they failed to provide a migration path.

Following are some of the most egregious blunders:

  1. Ignoring the fact that every major web site is built on top
     of a relational database.  The major sites use big commercial
     databases.  Smaller sites are based on LAMP -- Linux, Apache,
     MySQL, and Perl, Python, or PHP.

  2. Building RDF on top of triples, instead of the SQL n-tuples.

  3. Failing to integrate their notations with UML diagrams, which
     include type hierarchies and various notations for constraints.

If the Semantic Web had addressed these three issues from the beginning,
it would have been integrated into the mainstream of data processing in
about 3 or 4 years.  Today, we would have seen some truly spectacular
applications.
The SemWeb still has a chance, but it has to be integrated with the
mainstream of data processing before it can become the mainstream."



Azamat Abdoullaev

http://standardontology.org







----- Original Message -----
From: "Danny Ayers" <danny.ayers@...>
To: "Semantic Web" <semantic-web@...>
Sent: Wednesday, June 24, 2009 3:00 PM
Subject: Putting Government Data online


> Tim typically hid his talent under a bushel
>
> must read :
> http://www.w3.org/DesignIssues/GovData.html
>
> --
> http://danny.ayers.name
>



Re: Putting Government Data online

by Chris Bizer-2

Hi Azamat,

> I much doubt that this note may have any big use. Recommend to
> learn more about the relationship of Data, Information, Knowledge
> and Wisdom.

We have done this for 10 years now with mixed results.

So why not try a slightly different approach?

Cheers,

Chris


> -----Original Message-----
> From: semantic-web-request@... [mailto:semantic-web-request@...]
> On behalf of Azamat
> Sent: Wednesday, 24 June 2009 17:24
> To: 'SW-forum'
> Cc: John F. Sowa
> Subject: Re: Putting Government Data online




Re: Putting Government Data online

by Sören Auer

Azamat wrote:
> As it is, Linked Data looks a big mess-up of data,

That's the intention of Linked Data - create a big mess-up of data. ;-)
Once a sufficient quantity of messed-up data exists and search engines
start allowing people to aggregate and interconnect it, the data will
gain structure and coherence automatically, since data providers will
strive to align their data with common schemas. I think only
such a bottom-up approach (a transition from quantity to quality, to
speak in philosophical terms) will work on the Web, nothing else!

> http://linkeddata.org/, with low quality content and lack of any
> knowledge structure or inference mechanism.

The lack of inference mechanisms might be considered a feature! I do not
see any hope that comprehensive inference algorithms will achieve the
scalability required for the Web. Keyword search on the Web already
scales. The next challenge is to make conjunctive querying
(à la SQL/Datalog/SPARQL) Web-scale. Once we have solved that, we can look
into reasoning (although some are already trying now ;-).
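
To make "conjunctive querying" concrete, here is a minimal Python sketch, not
any real SPARQL engine: a query is a list of triple patterns that must all
hold at once, and variables (strings starting with "?") are shared across
patterns. The data and the names match() and query() are invented for
illustration. The nested loops are the naive join; doing this at Web scale is
exactly the open problem mentioned above.

```python
def match(pattern, triple, binding):
    """Try to extend a variable binding so that pattern equals triple."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant that does not match this triple.
            return None
    return result


def query(patterns, triples):
    """Return every binding that satisfies all patterns (a conjunction)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


# Hypothetical data set.
triples = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("carol", "worksFor", "initech"),
    ("acme", "locatedIn", "leipzig"),
]

# "Who works for an organisation located in Leipzig?"
results = query(
    [("?person", "worksFor", "?org"), ("?org", "locatedIn", "leipzig")],
    triples,
)
print(sorted(b["?person"] for b in results))
```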

> I share the concerns recently expressed by John Sowa on other forum:
>
> "My major complaint about the Semantic Web is that they ignored all
> the development techniques that worked successfully for years, and
> they failed to provide a migration path.

I think that the initial, overly dominant focus on reasoning and
comprehensive knowledge representation hindered the deployment of the SW.
The disgrace of its late birth (in German I would phrase it
"Ungnade der späten Geburt") also slowed the SW down: companies started
converting their technologies to XML (and they are still busy with that)
and do not want to switch to yet another technology stack so soon,
although for data-oriented applications in particular the RDF stack
would be much more appropriate.

> Following are some of the most egregious blunders:
>
>  1. Ignoring the fact that every major web site is built on top
>     of a relational database.  The major sites use big commercial
>     databases.  Smaller sites are based on LAMP -- Linux, Apache,
>     MySQL, and Perl, Python, or PHP.

There was quite early support for many of the scripting languages; cf.
e.g. the Scripting for the Semantic Web workshop series [1],
Powl/OntoWiki [2], RAP [3] etc.
By now there are also many approaches to integrating DBs and RDF;
cf. the RDB2RDF XG report [4] and Triplify [5]
(which targets DB-backed Web apps).

>  2. Building RDF on top of triples, instead of the SQL n-tuples.

This is the best thing that could have happened (although I'm a big fan of
RDBs and SQL). On the Web, however, it's all about interlinking and
integrating data - n-tuples do not merge naturally, triples do!
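
A small Python sketch of the merging point, under invented data: all URIs and
property names below are hypothetical, and plain tuples stand in for RDF
triples. Two sources that describe the same resource can be combined with a
set union, whereas rows with different column layouts cannot be combined
until somebody maps one schema onto the other.

```python
# Two hypothetical sources describing the same resource as triples.
source_a = {
    ("http://example.org/id/berlin", "name", "Berlin"),
    ("http://example.org/id/berlin", "country", "http://example.org/id/germany"),
}
source_b = {
    ("http://example.org/id/berlin", "population", 3400000),
    ("http://example.org/id/berlin", "name", "Berlin"),  # overlaps with source_a
}

# Merging triples is plain set union; the duplicate statement collapses.
merged = source_a | source_b

# The same facts as n-tuples with different column layouts.
rows_a = [("http://example.org/id/berlin", "Berlin", "http://example.org/id/germany")]
rows_b = [("http://example.org/id/berlin", 3400000, "Berlin")]
# A union of rows_a and rows_b says nothing useful until the meaning of each
# position is stated, i.e. until the two schemas are aligned by hand.


def describe(subject, triples):
    """Collect every property known about one subject."""
    return {p: o for (s, p, o) in triples if s == subject}


berlin = describe("http://example.org/id/berlin", merged)
print(berlin)
```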

>  3. Failing to integrate their notations with UML diagrams, which
>     include type hierarchies and various notations for constraints.

I think the Semantic Web should rather focus on lightweight technologies
such as REST, Web apps, Wikis etc. - these will be better enablers.

Cheers,

Sören

[1] http://www.semanticscripting.org/
[2] http://ontowiki.net
[3] http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/
[4] http://esw.w3.org/topic/Rdb2RdfXG/StateOfTheArt
[5] http://Triplify.org

--

--------------------------------------------------------------
Sören Auer, AKSW/Computer Science Dept., University of Leipzig
http://www.informatik.uni-leipzig.de/~auer,  Skype: soerenauer


Re: Putting Government Data online

by Frank Manola

On Jun 24, 2009, at 11:24 AM, Azamat wrote:

> "Tim typically hid his talent under a bushel
> must read : http://www.w3.org/DesignIssues/GovData.html"
>
> I much doubt that this note may have any big use. Recommend to learn
> more about the relationship of Data, Information, Knowledge and
> Wisdom. Good to start from the Ackoff's paper: "From data to
> wisdom."  There is a rich literature on the
> data-information-knowledge-wisdom hierarchy (pyramid),
> http://en.wikipedia.org/wiki/DIKW. More advanced concepts are Linked
> Information and Linked Knowledge or the Wisdom Pyramid with
> meaningfully dynamic knowledge networks topology: full relationship
> as well as line, loop, bus, mesh, star, or tree.
>
> It is claimed that "Linked Data allows different things in different
> datasets of all kinds to be connected."
> http://www.thenationaldialogue.org/ideas/linked-open-data.
>
> As it is, Linked Data looks a big mess-up of data,
> http://linkeddata.org/, with low quality content and lack of any
> knowledge structure or inference mechanism.
>

Yes, but it's on the Web, and linked!  As opposed to lots of other  
data (much of which also has low quality content and lack of any  
knowledge structure or inference mechanism) that isn't.  There's no  
point in comparing the current state of linked data with some "data  
Eden" that doesn't (and never did) exist.  What progress is being made  
toward the S*m*ntic W*b (the S*m*ntic W*b is the alternative to the  
Semantic Web that avoids all the supposed errors of the Semantic Web)  
using these other approaches?

>
>
> I share the concerns recently expressed by John Sowa on other forum:

He may have expressed these concerns recently on another forum, but  
he's been expressing them for years.

>
> "My major complaint about the Semantic Web is that they ignored all
> the development techniques that worked successfully for years, and
> they failed to provide a migration path.

Worked successfully *for what*?  No one is debating the success of  
relational databases as database technology, but if there was a  
migration path to the S*m*ntic W*b it was either not very clearly  
marked, or those who believed in it weren't proceeding along it at any  
substantial pace, or both.

>
> Following are some of the most egregious blunders:
>
> 1. Ignoring the fact that every major web site is built on top
>    of a relational database.  The major sites use big commercial
>    databases.  Smaller sites are based on LAMP -- Linux, Apache,
>    MySQL, and Perl, Python, or PHP.

How does the Semantic Web ignore relational databases?  Do you mean  
people building triple stores?  There's nothing built into the  
Semantic Web that requires triple stores.

>
> 2. Building RDF on top of triples, instead of the SQL n-tuples.

Which enables people to grab groups of triples off the Web without  
having to find schemas to figure out what the fields of the n-tuples  
are.  I call that an *advantage* on the Web, not an "egregious  
blunder".  Besides, triples just constitute a highly-normalized form  
of relational database anyway (a number of relational database design  
experts recommend a similar type of conceptual design), so the  
foundation is pretty much the same.  And if building the S*m*ntic W*b  
directly on n-tuples is so much better, why don't more of the critics  
get busy on it, instead of just carping about the work other people  
are trying to do?
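
The "highly-normalized form" remark can be illustrated with a short Python
sketch. The table name, the column names and the "table/key" and
"table#column" naming scheme are all assumptions made up for the example; the
point is only that a row splits into one triple per non-key column and can be
rebuilt from those triples without loss.

```python
def row_to_triples(table, key_column, row):
    """Split one relational row into (subject, predicate, object) triples."""
    subject = f"{table}/{row[key_column]}"
    return [
        (subject, f"{table}#{column}", value)
        for column, value in row.items()
        if column != key_column and value is not None
    ]


def triples_to_row(key_column, triples):
    """Rebuild a row from triples that all share one subject."""
    subject = triples[0][0]
    row = {key_column: subject.split("/", 1)[1]}
    for _, predicate, value in triples:
        row[predicate.split("#", 1)[1]] = value
    return row


# Hypothetical row; the key is kept as a string so the round trip is exact.
row = {"id": "42", "title": "Evening Star", "city": "Washington"}
triples = row_to_triples("newspaper", "id", row)
rebuilt = triples_to_row("id", triples)
print(triples)
print(rebuilt == row)
```

Each triple carries its own column name, which is why a consumer can pick up
a handful of them without first locating the table definition.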

>
> 3. Failing to integrate their notations with UML diagrams, which
>    include type hierarchies and various notations for constraints.

Work has been done on this, but do you seriously believe lack of UML  
diagrams is a major issue?  Relational databases certainly didn't rely  
very much on UML diagrams for database design to become a mainstream  
technology.

>
> If the Semantic Web had addressed these three issues from the  
> beginning,
> it would have been integrated into the mainstream of data processing  
> in
> about 3 or 4 years.  Today, we would have seen some truly spectacular
> applications.

Baloney.  What evidence exists that the problem is technology, as  
opposed to cost, requirements, and politics (of putting data online)?  
Integrating/rationalizing heterogeneous data is hard work, and always  
has been (even when the data being integrated was *entirely* in  
relational databases).

> The SemWeb still has a chance, but it has to be integrated with the
> mainstream of data processing before it can become the mainstream."

Certainly true.  Let me offer a couple more truisms:

The Semantic Web still has a chance given the number of dedicated and  
smart people working on it.

The S*m*ntic W*b has *no* chance as long as those who believe in it  
don't develop their own specs and software that demonstrate all the  
purported advantage of doing it that way (whatever it is).

--Frank



Re: Putting Government Data online

by Ed Summers

On Wed, Jun 24, 2009 at 11:24 AM, Azamat<abdoul@...> wrote:
>  1. Ignoring the fact that every major web site is built on top
>    of a relational database.  The major sites use big commercial
>    databases.  Smaller sites are based on LAMP -- Linux, Apache,
>    MySQL, and Perl, Python, or PHP.

The Library of Congress recently published the contents of ~1,250,000
historic newspaper pages as linked data [1] using Python and an RDBMS
(MySQL). I think it's somewhat misleading to suggest that RDF and
Linked Data aren't useful for expressing the relations locked up in an
RDBMS using familiar LAMP tools. Although certainly the data that we
are exposing isn't perfect, and could be improved in places :-)
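
For readers who want a feel for the general idea, a rough Python sketch
follows. It is not the Library of Congress code: sqlite3 from the standard
library stands in for MySQL, and the table, the columns and both URI prefixes
are hypothetical. It reads rows from a relational table and writes one
N-Triples line per non-key column.

```python
import sqlite3

BASE = "http://example.org/newspapers/"  # hypothetical URI prefix for rows
VOCAB = "http://example.org/vocab#"      # hypothetical vocabulary prefix


def literal(value):
    """Format a value as a plain N-Triples string literal."""
    text = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{text}"'


def table_to_ntriples(conn):
    """Yield one N-Triples line per non-key column of each row."""
    cursor = conn.execute("SELECT id, title, city FROM newspaper ORDER BY id")
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        subject = f"<{BASE}{row[0]}>"
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:
                yield f"{subject} <{VOCAB}{column}> {literal(value)} ."


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE newspaper (id INTEGER PRIMARY KEY, title TEXT, city TEXT)")
conn.execute("INSERT INTO newspaper VALUES (1, 'Evening Star', 'Washington')")
lines = list(table_to_ntriples(conn))
print("\n".join(lines))
```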

//Ed

[1] http://chroniclingamerica.loc.gov/newspapers.rdf


Re: erasing relevant data

by c

> > If the Semantic Web had addressed these three issues from the  
> > beginning,
> > it would have been integrated into the mainstream of data processing  
> > in
> > about 3 or 4 years.  Today, we would have seen some truly spectacular
> > applications.
>
> Baloney.  What evidence exists that the problem is technology, as  
> opposed to cost, requirements, and politics (of putting data online)?  
> Integrating/rationalizing heterogeneous data is hard work, and always  
> has been (even when the data being integrated was *entirely* in  
> relational databases).
>
> > The SemWeb still has a chance, but it has to be integrated with the
> > mainstream of data processing before it can become the mainstream."
>
> Certainly true.  Let me offer a couple more truisms:
>
> The Semantic Web still has a chance given the number of dedicated and  
> smart people working on it.

and on the other side of the coin, we have Twitter, removing <a> from the party

http://bit.ly/sMsfI3fd ought to be enough for anyone

>
> The S*m*ntic W*b has *no* chance as long as those who believe in it  
> don't develop their own specs and software that demonstrate all the  
> purported advantage of doing it that way (whatever it is).
>
> --Frank
>


Re: erasing relevant data

by Danny Ayers

2009/6/25 carmen <_@...>:

> and on the other side of the coin, we have Twitter, removing <a> from the party

 too true, unexpected stuff

> http://bit.ly/sMsfI3fd ought to be enough for anyone
>
>>
>> The S*m*ntic W*b has *no* chance as long as those who believe in it
>> don't develop their own specs and software that demonstrate all the
>> purported advantage of doing it that way (whatever it is).

there I disagree - the specs exist, and people are using them.



--
http://danny.ayers.name


Re: Putting Government Data online

by richard murphy-3

Danny & All:

I work for the government and I'm currently working on data.gov. It's
great to have an Internet design note to point people at.

I've had this post in the queue for a while and pushed it out today
after a few updates.

http://phaneron.rickmurphy.org/?p=34

So far 2009 looks like a very good year!

Rick

Danny Ayers wrote:
> Tim typically hid his talent under a bushel
>
> must read :
> http://www.w3.org/DesignIssues/GovData.html
>

--
Rick

cell: 703-201-9129
web:  http://www.rickmurphy.org
blog: http://phaneron.rickmurphy.org


Re: Putting Government Data online

by Danny Ayers

2009/6/25 rick <rick@...>:
> Danny & All:
>
> I work for the government and I'm currently working on data.gov. It's great
> to have an Internet design note to point people at.

it reads well already, but I do hope it's not a first draft not
intended for these purposes

whatever, the interwebs gave us http://badgerbadgerbadger.com





--
http://danny.ayers.name


Re: Putting Government Data online

by AzamatAbdoullaev

FM: The Semantic Web still has a chance given the number of dedicated and
 smart people working on it.
Right. Many brilliant minds are involved in the SW Activities, and they need
wise management of the whole program and its many constituent parts,
projects and specifications.

FM: The S*m*ntic W*b has *no* chance as long as those who believe in it
don't develop their own specs and software that demonstrate all the
purported advantage of doing it that way (whatever it is).
Again, right. But it sounds as if you were estranging yourself from the
idea(l).
To reach out, the SW pile [URIs as identifiers, the Unicode character set,
XML syntax, RDF data interchange; RDFS taxonomies, SPARQL querying, OWL
vocabularies, RIF/SWRL rules; Unifying Logic; Proof; Trust] should be
reviewed in the more lucid and consistent terms of the Data, Information,
Knowledge, and Wisdom hierarchy, widely used in Information Science and
Knowledge Management Systems. Then the whole thing becomes ordered and
logical, as in:

  a. The bottom Data Level makes the web of data (ontology's ground
     elements, individuals, instances, facts, or raw data): all sorts of
     data repositories, digital archives, silos, and data warehouses in all
     fields of knowledge and practice;
  b. The Information Level makes the web of information (ontology's
     elements of classes, sets and collections of data, collections of
     facts, datasets with some structure). Here belong the Linked Data and
     a terabyte of information in the social sciences, natural sciences, or
     digital humanities, like Google Books, Project Gutenberg, Microsoft
     books.live.com, and any statistical data sets that are somehow ordered,
     such as relational databases;
  c. The Knowledge Level makes the web of knowledge (ontology's elements of
     relationships, the related facts, truths and principles and inference
     rules, content and context, proof and trust, semantic and logical
     rules). Here WILL belong SW knowledge bases, reasoning mechanisms,
     systems, tools, languages, and domain ontologies;
  d. The Wisdom Level makes the web of wisdom, or the Wisdom Web,
     Intelligent Web, Real Semantic Web (a global ontology of all resources
     implying a standard ontology of top categories and meanings and a
     single universal identification system of entities).
As the first step, all the specifications need to be aligned with the
universal concept of resources, as entities with identity, concrete,
collective or abstract, as anything, so that a URI could identify anything
and everything, everywhere and every time.

Azamat
----- Original Message -----
From: "Frank Manola" <fmanola@...>
To: "SW-forum" <semantic-web@...>
Sent: Wednesday, June 24, 2009 9:35 PM
Subject: Re: Putting Government Data online





Re: Putting Government Data online

by John F. Sowa

Frank and Azamat,

I have been the most enthusiastic proponent of a truly Semantic Web.
But along the way, the semantics got lost in an ungodly mess of
syntax.

FM> The Semantic Web still has a chance given the number of dedicated
 > and smart people working on it.
 >
 > The S*m*ntic W*b has *no* chance as long as those who believe in it
 > don't develop their own specs and software that demonstrate all the
 > purported advantage of doing it that way (whatever it is)

I very strongly agree.  And I wrote a note to ontolog forum that
explains how to restore the focus on semantics.  (Copy below)

In the process, my proposal cures the incredibly stupid blunder
that is killing the Semantic Web:  ignoring the fact that every
major web site is built around a relational database.

I used to call SQL the worst notation for logic ever conceived.
But I changed my mind after seeing RDF and OWL.

My proposal below solves that problem by integrating SQL, RDF,
and OWL on a truly equal footing.

I honestly believe that this is the only way to rescue the
original goals and hopes for the Semantic Web.

John Sowa
___________________________________________________________________

The real problem of "bringing semantics" into anything, whether a
database or the WWW or anything else, is to keep your focus on the
main goal:  representing meaning.  Everything else is a distraction.

 > Is "semantic foreign key" possible to facilitate current relational
 > database step into semantic database? In other words, if we can
 > build RDF or OWL based semantic foreign keys across different tables
 > and databases while providing those innovative foreign keys inference
 > and reasoning ability, it may help to bring the semantics into the
 > current DB.

That is not the problem.  People have been talking about integrating
semantics with relational databases for over 30 years.  The solution
was always very clear:  represent the meaning of the data in logic.

The major obstacle was also very clear:  people ignored meaning,
and devoted most of their efforts to adding more and more special
"features" to SQL to address one or another low-level syntactic
notation to support somebody's pet implementation.

The major issues in creating the Semantic Web were also very clear:
express meaning in logic.  But instead of focusing on the logic,
they started to address all kinds of special cases, such as using
triples instead of n-tuples or forcing everything into some kind
of XML syntax.

If you step back and look at the logic, all the problems disappear:

  1. First order logic hasn't changed in the past 130 years, and
     the syntax can be defined in half a page.

  2. The mapping of relational databases to and from FOL is obvious.

  3. The mapping of Description Logics to FOL is obvious.

  4. You can develop very clean, very simple mappings of the above
     three to one another.

  5. The details of XML-based notations or table-based SQL notations
     are of minor importance.  Those should *never* be allowed to
     have the slightest influence on #1, #2, and #3 above.
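
As a hedged illustration of point 2 in the list above, the following Python
sketch reads a table as a set of ground atoms: each row (v1, ..., vn) of a
table R is the atom R(v1, ..., vn), and an atom counts as true exactly when
the row exists (the closed-world reading that databases normally use). The
relation, the data and the function names are invented for the example.

```python
def rows_to_atoms(relation, rows):
    """Read each row of a table as a ground atom relation(v1, ..., vn)."""
    return {(relation, tuple(row)) for row in rows}


def holds(facts, relation, *args):
    """A ground atom is true exactly when the matching row exists."""
    return (relation, args) in facts


def select(facts, relation, pattern):
    """Rows matching a pattern; None marks an existentially quantified slot."""
    return sorted(
        args
        for (name, args) in facts
        if name == relation
        and len(args) == len(pattern)
        and all(p is None or p == a for p, a in zip(pattern, args))
    )


def format_atom(relation, args):
    """Render an atom in the usual predicate notation."""
    return f"{relation}({', '.join(map(str, args))})"


# Hypothetical table employee(name, department).
employee = [("alice", "sales"), ("bob", "research")]
facts = rows_to_atoms("employee", employee)

print(holds(facts, "employee", "alice", "sales"))
# "exists d. employee(alice, d)" corresponds to SELECT ... WHERE name = 'alice'.
print([format_atom("employee", args) for args in select(facts, "employee", ("alice", None))])
```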

That is all very clean and very simple.  But we still have to deal
with the problem of current systems such as SQL, RDF, and OWL.

The answer is also simple:  SQL, RDF, and OWL will be declared
"legacy systems".  In the terminology that IBM used, they will be
called "functionally stabilized".  That means no new features or
additions or further changes will be made to them.  They will be
supported forever, but not as the basis for future development.

All future development will focus on the very simple principles of
#1, #2, and #3 above and with further purely *logical* extensions,
not rinky-dink syntactic features of the kind that burden SQL,
RDF, OWL, and all other horrible syntaxes that have outlived
their usefulness.

That is the answer.  It's extremely simple, and it provides
*equal* support for both the current relational DBs and
the current Semantic Web.  It is a solid and secure foundation
for the future.



Re: Putting Government Data online

by Danny Ayers

John, I strongly respect your arguments (especially on the syntax
things; we have tended to get mired that way), but I do believe you
overlook the value of simply naming things - for FOL this may be
trivial, but in the context of the Web it's hugely powerful: the
possibility of using a simple protocol to retrieve more information
about the topic at hand.
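
The "simple protocol" here is HTTP: the name of a thing is a URI, and a GET
on that URI can return a description of the thing. A minimal Python sketch
using only the standard library follows; the example URI is hypothetical and
the particular Accept header is just one reasonable choice of RDF media
types, not something this thread prescribes.

```python
from urllib.request import Request, urlopen

# Media types to ask for, in order of preference (an assumption for this sketch).
RDF_TYPES = "application/rdf+xml, text/turtle;q=0.9, */*;q=0.1"


def build_request(uri):
    """Prepare a GET request that asks for a machine-readable description."""
    return Request(uri, headers={"Accept": RDF_TYPES})


def fetch_description(uri, timeout=10):
    """Dereference the URI and return (content type, raw body)."""
    with urlopen(build_request(uri), timeout=timeout) as response:
        return response.headers.get_content_type(), response.read()


# Building the request needs no network access; fetch_description() does.
request = build_request("http://example.org/id/berlin")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```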

Cheers,
Danny.


Re: Putting Government Data online

by John F. Sowa

Danny,

I never raised any objections to URIs.  In fact, the ISO standard
for Common Logic supports them as names.

 > ... for FOL this may be trivial, but in the context of the Web
 > it's hugely powerful, the possibility of using a simple protocol
 > to retrieve more information about the topic at hand.

I am always in favor of supporting simple but powerful things.
What I am against is making simple things difficult.

My recommendation for the next version of the Semantic Web
is very simple:

  1. Keep the URIs.

  2. Replace RDF with JSON (which is as readable as any of
     the recommended syntaxes for triples, but it also supports
     n-tuples).  (And JSON, by the way, is the notation that
     Google uses instead of RDF.)

  3. Replace OWL with a DL that has equivalent logical power,
     but a much cleaner syntax and the ability to use JSON.

  4. Adopt ISO 24707 for Common Logic as the semantic foundation
     for multiple dialects.  For example, a Horn-clause subset
     or a DL subset would be two different subsets of full CL.

  5. Use a tag such as <script> ... </script> for embedding such
     notations in a web page.  (But Common Logic also supports
     an XML-ified dialect called XCL, which is more compact than
     RDF for triples -- and it also supports full Common Logic.)
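
A minimal sketch of items 1 and 2, in Python: one JSON document that
carries both a triple and a 4-tuple, with URIs kept as names. The layout
(a "relation" name plus an "args" array) and the example.org URIs are
assumptions made for illustration; the list above does not prescribe a
JSON schema.

```python
import json

# Hypothetical layout: each statement is a relation URI plus its arguments.
document = [
    # A binary relation, i.e. what RDF would express as one triple.
    {"relation": "http://example.org/rel/capitalOf",
     "args": ["http://example.org/id/Paris", "http://example.org/id/France"]},
    # A 4-ary relation: supplier, part, project, quantity.
    {"relation": "http://example.org/rel/supplies",
     "args": ["http://example.org/id/Acme", "http://example.org/id/Bolt",
              "http://example.org/id/BridgeProject", 500]},
]


def is_triple(statement: dict) -> bool:
    """A statement fits the triple model when its relation is binary."""
    return len(statement["args"]) == 2


text = json.dumps(document, indent=2)  # serialize for exchange
restored = json.loads(text)            # parse on the receiving side

for statement in restored:
    arity = len(statement["args"])
    kind = "triple" if is_triple(statement) else f"{arity}-ary tuple"
    print(kind, statement["relation"])
```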

I realize that some people claim that triple stores are useful,
but there are far more efficient internal representations that
give the programmer (or the logician) a view as either tables
or as graphs.

For just one example, see the following paper:
 
http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=911FFAC5BC8B7B7A60B5E9197850E6AD?doi=10.1.1.52.3727&rep=rep1&type=url&i=0
The GMAP: a versatile tool for physical data independence

Tsatalos, the first author, did the work for his PhD dissertation,
in which he demonstrated that Gmaps (Generalized Combinatorial Maps)
provide a physical representation that is more efficient for SQL than
conventional tables and more efficient for object-oriented access
than conventional graphs.  He was hired by IBM Research, but as might
be expected, he was not able to budge the DB2 behemoth.  So he left
IBM to start his own company.

For our company, VivoMind Intelligence, we use Gmaps to represent
graphs, and they support very efficient operations with very compact
code.  Gmaps are also widely used in architectural systems to represent
huge graphs with billions of nodes.  They enable graphs that represent
a building or a complex of buildings to be mapped to any perspective
for virtual reality -- and the mappings are extremely fast, even on
huge graphs.  They can run circles around anything that could be done
with SPARQL.  And as Tsatalos showed, they can support SQL-like
queries against arbitrarily large graphs.

That is just one example of Knuth's dictum:  "Premature optimization
is the root of all evil."  The choice of triples to support the
implementation of triple stores was a premature optimization by
people who did not understand the state of the art for processing
graphs.

John



Re: Putting Government Data online

by Pat Hayes

John and Danny, you are both right :-)  John is right that the SWeb  
should be based on FOL, and Danny is right that names, and the  
processes of designing, agreeing on, and using names are critically  
important (and traditional logic hasn't paid any attention to this  
stuff.) Take a look at the last slide of

http://is.gd/1ehQK

Pat


On Jun 26, 2009, at 4:12 AM, Danny Ayers wrote:

> John, while I strongly respect your arguments (especially the syntax
> things, we have tended to get mired in that way), I do believe you
> overlook the value of simply naming things - for FOL this may be
> trivial, but in the context of the Web it's hugely powerful, the
> possibility of using a simple protocol to retrieve more information
> about the topic at hand.
>
> Cheers,
> Danny.
>
>
>

------------------------------------------------------------
IHMC                     (850)434 8903 or (650)494 3973
40 South Alcaniz St.     (850)202 4416   office
Pensacola                (850)202 4440   fax
FL 32502                 (850)291 0667   mobile
phayesAT-SIGNihmc.us     http://www.ihmc.us/users/phayes







Re: Putting Government Data online

by John F. Sowa

Pat,

I want to emphasize that my proposal is *upward compatible* with the
methodologies and practices developed by the Semantic Web community.

PH> John and Danny, you are both right  :-)   John is right that
 > the SWeb should be based on FOL, and Danny is right that names,
 > and the processes of designing, agreeing on, and using names
 > are critically important (and traditional logic hasn't paid
 > any attention to this stuff.)

There is not a single methodology, practice, or technique that
anyone uses today that they can't continue to use with my proposal.

The only thing that I suggest that people *stop* doing is turning
human eyeballs on the raw notations for RDF and OWL.  All the
current tools are being designed to make those notations as
invisible as possible to humans.

I am just proposing the next obvious step:  make the XML-based
notations for RDF and OWL *optional* for document exchange as
well:

  1. The recommended exchange form for RDF will become JSON.
     Any JSON documents that are limited to triples can use
     the old XML-based RDF form, but they can also use the
     more compact and more general full JSON.

  2. Development tools such as Protege can generate *either*
     the current XML-based notation for OWL or they can
     generate a new notation for OWL based on Common Logic.

  3. Programs that use XSLT to manipulate RDF and OWL will have
     to use the old XML-based notations.  But newer programs
     can take advantage of more powerful methodologies.

Among the newer, more powerful methodologies are -- surprise! --
*all* the old methodologies for software development such as UML.

The goal of my proposal is nothing less than a total *integration*
of the Semantic Web methodologies with the methodologies that have
been used in the traditional software development community.

That integration will also support an open-ended flowering of
new logic-based methodologies in which the boundaries between
relational DBs, object-oriented DBs, and web-based documents
vanish, disappear, and become *irrelevant* for everything
except the lowest level of tweaks and optimizations that are
performed by automated or at least semi-automated means.

PH> Take a look at the last slide of http://is.gd/1ehQK

I recommend that slide and the full talk by Pat.

I strongly endorse a logic-based vision in which the Semantic Web,
the Semantic DBs, the Knowledge Bases, and the rule-based systems
merge in a seamless *Semantic System* in which the boundaries and
distinguishing labels vanish.

John



Re: Putting Government Data online

by Frank Manola

John--

What you're proposing here is not at all unreasonable, but I do think  
there are some things that need to be qualified/clarified a bit.

You talk about the Semantic Web "ignoring the fact that every major  
web site is built around a relational database".  I may be wrong, but  
your further comments suggest that what you mean by this is mainly  
that RDF uses triples rather than being based directly on n-tuples.  I  
don't think these are quite the same thing.  It might help if we were  
to distinguish better between the notation used for the *logic*, and  
the notation used to refer to the *data* (instances).

Trying to cram FOL expressions into triples is certainly a mess.  On  
the other hand, in dealing with data instances there's a need to  
support what is sometimes called Codd's "guaranteed access principle",  
which is that every atomic value in a relational database is  
guaranteed to be logically (in the database sense of that word)  
accessible by a combination of table name, primary-key value, and  
column name.  I.e., you need the combination (table name, primary-key  
value) to select the row, and (table name, column name) to select the  
column (note that the table name is needed for disambiguation *within  
a given database* in each case;  on the Web you need to identify the  
database too).  URIs provide various ways of disambiguating these
names on the Web (e.g., you can have a URI for the table, which
disambiguates the other components *within that table*), and you may
prefer using compound names.  But RDF simply boils the (table name,
primary-key value) combination down to the subject URI, and the (table
name, column name) combination down to the predicate URI.  That is a
very direct way of providing for the guaranteed access principle, and
it *does not* ignore relational databases.
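
As a rough illustration of that mapping, here is a Python sketch that
turns one relational row into triples. The base URI, the URI layout
(table/key for subjects, table#column for predicates), and the sample
"employee" row are all assumptions for the example; other URI schemes
would serve equally well.

```python
from urllib.parse import quote

BASE = "http://example.org/db/"  # hypothetical URI prefix identifying the database


def row_to_triples(table: str, key_column: str, row: dict) -> list[tuple]:
    """Map one relational row to (subject, predicate, object) triples."""
    # (table name, primary-key value) -> subject URI
    subject = f"{BASE}{quote(table)}/{quote(str(row[key_column]))}"
    triples = []
    for column, value in row.items():
        if column == key_column:
            continue  # the key is already encoded in the subject URI
        # (table name, column name) -> predicate URI
        predicate = f"{BASE}{quote(table)}#{quote(column)}"
        triples.append((subject, predicate, value))
    return triples


employee = {"id": 7, "name": "Ada", "dept": "R&D"}
employee_triples = row_to_triples("employee", "id", employee)
for triple in employee_triples:
    print(triple)
```

Each atomic value ends up addressable by exactly one (subject,
predicate) pair, which is the triple-level counterpart of table name,
primary-key value, and column name.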

RDF also reflects an aspect of relational databases that the use of
n-tuples for logic expressions tends to ignore, namely normalization.
It's one thing to think of logical expressions having n-tuples of  
arbitrary arity, and another to think of storing and then managing  
billions of instances of those same tuples (e.g., in determining which  
stored values need to be changed when an update occurs).  The same  
normalization principles that (ideally) govern the design of  
relational databases ought to be considered in the Semantic Web.  As  
in relational databases (and as you have suggested) there's no reason  
for forcing the stored representation of the data to be the same as  
the notation used to refer to it (and lots of reasons why making them  
the same is often a *very bad idea*) so there's lots of room for  
maneuver between what is stored and what the user sees.  However, RDF  
at least directly reflects this issue in providing a way of referring  
to an exact value within the Web (although even RDF doesn't, and  
can't, disambiguate references to values when different users use  
different URIs to refer to them), once again *not* ignoring relational  
databases (in fact, reflecting a prime concern in relational database  
design).
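
One way to read the normalization point, sketched in Python: when a wide
n-tuple repeats a value, an update has to find every copy, whereas a
store of triples states the value once about the thing it describes.
The identifiers (emp:1, dept:10, and so on) and the tiny in-memory
"stores" are hypothetical and stand in for real storage.

```python
# Denormalized 4-tuples: the department's city is repeated in every row.
wide_rows = [
    ("emp:1", "Ada", "dept:10", "Boston"),
    ("emp:2", "Bob", "dept:10", "Boston"),
    ("emp:3", "Cy", "dept:20", "Austin"),
]

# The same facts as triples: each value is stated once.
triples = {
    ("emp:1", "name", "Ada"), ("emp:1", "worksIn", "dept:10"),
    ("emp:2", "name", "Bob"), ("emp:2", "worksIn", "dept:10"),
    ("emp:3", "name", "Cy"), ("emp:3", "worksIn", "dept:20"),
    ("dept:10", "city", "Boston"), ("dept:20", "city", "Austin"),
}


def move_department_wide(rows, dept, new_city):
    """Rewrite every tuple that mentions the department; return (rows, count)."""
    updated = [(e, n, d, new_city if d == dept else c) for e, n, d, c in rows]
    changed = sum(1 for old, new in zip(rows, updated) if old != new)
    return updated, changed


def move_department_triples(store, dept, new_city):
    """Replace the department's city triple; return (store, count removed)."""
    old = {t for t in store if t[0] == dept and t[1] == "city"}
    return (store - old) | {(dept, "city", new_city)}, len(old)


new_rows, wide_changes = move_department_wide(wide_rows, "dept:10", "Denver")
new_store, triple_changes = move_department_triples(triples, "dept:10", "Denver")
print("tuples rewritten:", wide_changes, "triples rewritten:", triple_changes)
```

A well-normalized relational schema would avoid the repetition too; the
sketch only shows why the question of which stored values change on
update cannot be ignored once tuples of arbitrary arity are stored.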

Finally, I want to repeat the general theme of my original reply (some  
of which you quoted below):  progress toward an alternative Semantic  
Web isn't going to be made by sniping remarks at people trying to get  
linked data on the Web, or telling people what they did or didn't  
ignore in developing the specs, but rather by working out the details  
of the alternative ideas, *showing people specifically how those ideas  
make it easier to develop a Semantic Web*, and implementing associated  
software.

--Frank

On Jun 25, 2009, at 11:26 PM, John F. Sowa wrote:

> Frank and Azamat,
>
> I have been the most enthusiastic proponent of a truly Semantic Web.
> But along the way, the semantics got lost in an ungodly mess of
> syntax.
>
> FM> The Semantic Web still has a chance given the number of dedicated
> > and smart people working on it.
> >
> > The S*m*ntic W*b has *no* chance as long as those who believe in it
> > don't develop their own specs and software that demonstrate all the
> > purported advantage of doing it that way (whatever it is)
>
> I very strongly agree.  And I wrote a note to ontolog forum that
> explains how to restore the focus on semantics.  (Copy below)
>
> In the process, my proposal cures the incredibly stupid blunder
> that is killing the Semantic Web:  ignoring the fact that every
> major web site is built around a relational database.
>
> I used to call SQL the worst notation for logic ever conceived.
> But I changed my mind after seeing RDF and OWL.
>
> My proposal below solves that problem by integrating SQL, RDF,
> and OWL on a truly equal footing.
>
> I honestly believe that this is the only way to rescue the
> original goals and hopes for the Semantic Web.
>
> John Sowa
> ___________________________________________________________________
>
> The real problem of "bringing semantics" into anything, whether a
> database or the WWW or anything else, is to keep your focus on the
> main goal:  representing meaning.  Everything else is a distraction.
>
> > Is "semantic foreign key" possible to facilitate current relational
> > database step into semantic database? In other words, if we can
> > build RDF or OWL based semantic foreign keys across different tables
> > and databases while providing those innovative foreign keys
> > inference and reasoning ability, it may help to bring the semantics
> > into the current DB.
>
> That is not the problem.  People have been talking about integrating
> semantics with relational databases for over 30 years.  The solution
> was always very clear:  represent the meaning of the data in logic.
>
> The major obstacle was also very clear:  people ignored meaning,
> and devoted most of their efforts to adding more and more special
> "features" to SQL to address one or another low-level syntactic
> notation to support somebody's pet implementation.
>
> The major issues in creating the Semantic Web were also very clear:
> express meaning in logic.  But instead of focusing on the logic,
> they started to address all kinds of special cases, such as using
> triples instead of n-tuples or forcing everything into some kind
> of XML syntax.
>
> If you step back and look at the logic, all the problems disappear:
>
> 1. First order logic hasn't changed in the past 130 years, and
>    the syntax can be defined in half a page.
>
> 2. The mapping of relational databases to and from FOL is obvious.
>
> 3. The mapping of Description Logics to FOL is obvious.
>
> 4. You can develop very clean, very simple mappings of the above
>    three to one another.
>
> 5. The details of XML-based notations or table-based SQL notations
>    are of minor importance.  Those should *never* be allowed to
>    have the slightest influence on #1, #2, and #3 above.
>
> That is all very clean and very simple.  But we still have to deal
> with the problem of current systems such as SQL, RDF, and OWL.
>
> The answer is also simple:  SQL, RDF, and OWL will be declared
> "legacy systems".  In the terminology that IBM used, they will be
> called "functionally stabilized".  That means no new features or
> additions or further changes will be made to them.  They will be
> supported forever, but not as the basis for future development.
>
> All future development will focus on the very simple principles of
> #1, #2, and #3 above and with further purely *logical* extensions,
> not rinky-dink syntactic features of the kind that burden SQL,
> RDF, OWL, and all other horrible syntaxes that have outlived
> their usefulness.
>
> That is the answer.  It's extremely simple, and it provides
> *equal* support for both the current relational DBs and
> the current Semantic Web.  It is a solid and secure foundation
> for the future.
>



Re: Putting Government Data online

by John F. Sowa

Frank,

I'd like to respond to your comment at the end of a note from
June 29th:

 > Finally, I want to repeat the general theme of my original reply
 > ... progress toward an alternative Semantic Web isn't going to be
 > made by sniping remarks at people trying to get linked data on
 > the Web, or telling people what they did or didn't ignore in
 > developing the specs, but rather by working out the details of
 > the alternative ideas, *showing people specifically how those
 > ideas make it easier to develop a Semantic Web*, and implementing
 > associated software.

During the past two months, I've been doing some traveling and
participating in a couple of conferences.  At one of them, I
presented a 3-hour tutorial with the title

    Controlled Natural Languages for Semantic Systems

That talk surveyed various issues about the development and use
of semantic systems and ways that controlled NLs can support
better interfaces.  I didn't propose "alternatives" to any of
the current systems, but rather recommendations for upward compatible
developments that could preserve existing software while enabling
better integration and interoperability.

Since then, I have made further revisions and extensions to the slides:

    http://www.jfsowa.com/talks/cnl4ss.pdf

John Sowa