UMTHES and SKOS-XL


UMTHES and SKOS-XL

by Antoine Isaac-2

Hi everyone,

I'm posting here a discussion we started with Thomas Bandholtz on UMTHES [1] on the use of SKOS-XL there (see slides at [2]). It is a long mail, but it may be interesting for a wider audience, as UMTHES is one of the first SKOS-XL implementations!

===

Dear Thomas,

So let's go. The main issue I have is that xl:Label is used in a very "term-oriented" way in UMTHES.
More precisely, I feel that you are using labels to aggregate lexical entities which indeed belong to the same "term". But these literals would be introduced as labels in basic SKOS, I think. Trying to use a concrete example from your slides:

:4711 rdf:type skos:Concept;
    skosxl:prefLabel :wasteWater.

:wasteWater rdf:type skosxl:Label;
    skosxl:literalForm "waste water";
    ext:lexicalVariant "wastewater";
    ext:compoundFrom (:waste :water).

"wastewater" is introduced as a lexical variant of "waste water". Per se, this is of course ok.
But in basic SKOS, I would have modelled that "wastewater" as a skos:altLabel or a skos:hiddenLabel of :4711. Not attaching that string to an instance of xl:Label using xl:literalForm prevents you from benefiting from the useful property chains given in XL. So I would have represented "wastewater" as an instance of xl:Label.

Of course, you may object that you can declare a property chain (or property chains) yourself that would allow one to infer that the literals that are objects of ext:lexicalVariant triples (or the ones involving sub-properties of ext:lexicalVariant) are also objects of skos:hiddenLabel (or skos:altLabel) statements attached to the skos:Concept to which your xl:Label is attached.
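
A minimal Turtle sketch of such a user-declared chain, for illustration only. The ext: namespace URI is hypothetical (the thread does not give the real one), and the chain is written with OWL 2's owl:propertyChainAxiom. As far as we understand, a chain ending in a literal-valued property falls outside OWL 2 DL, so not every reasoner may accept it:

```turtle
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/umthes-ext#> .   # hypothetical namespace

# concept --skosxl:prefLabel--> label --ext:lexicalVariant--> "literal"
#   would entail: concept skos:hiddenLabel "literal"
skos:hiddenLabel owl:propertyChainAxiom ( skosxl:prefLabel ext:lexicalVariant ) .

# the same for variants of labels attached via skosxl:altLabel
skos:hiddenLabel owl:propertyChainAxiom ( skosxl:altLabel ext:lexicalVariant ) .
```

With these two axioms, "wastewater" in the example above would surface as a skos:hiddenLabel of :4711.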

But then I'd be still uncomfortable with an xl:Label giving rise to several (SKOS-basic) labels.
Additionally, we actually introduced xl:labelRelation to handle cases like acronyms [1]. In your approach, acronym is a subproperty of lexicalVariant, which is clearly a different pattern from ours.

As I feel it, your choice may be perfectly grounded in terminology. Still, I'd be curious to hear whether this is a strong position of yours, or if you could accommodate a different pattern.

Maybe there can indeed be a solution accommodating both points of view (if I interpreted yours correctly, of course). Namely, introducing "wastewater" as the literalForm of an xl:Label which is not connected to any concept, but only connected to :wasteWater (by an ext:lexicalVariant, which would then be a sub-property of xl:labelRelation).
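
A minimal Turtle sketch of this compromise pattern, for illustration only. The ext: and default namespace URIs and the resource name :wastewater are hypothetical; only the SKOS and SKOS-XL terms are taken from the standard:

```turtle
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/umthes-ext#> .   # hypothetical namespace
@prefix :       <http://example.org/umthes/> .       # hypothetical namespace

# ext:lexicalVariant now relates two labels instead of a label and a string
ext:lexicalVariant rdfs:subPropertyOf skosxl:labelRelation .

:4711 rdf:type skos:Concept ;
    skosxl:prefLabel :wasteWater .

:wasteWater rdf:type skosxl:Label ;
    skosxl:literalForm "waste water" ;
    ext:lexicalVariant :wastewater .

# A label in its own right, but not attached to any concept via
# skosxl:prefLabel, skosxl:altLabel or skosxl:hiddenLabel
:wastewater rdf:type skosxl:Label ;
    skosxl:literalForm "wastewater" .
```

Because :wastewater is not the object of any skosxl labelling property, the SKOS-XL property chains produce no basic SKOS label for it.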

Of course you can say then that the distinction between "waste water" and "wastewater" is something very important for your UMTHES and the applications you envision with it, and that "wastewater" should never be used as a basic concept label, even a hidden one. Or not even interpreted as something that could be a label...

You can also argue that the xl:Label story is quite thin in the SKOS Reference anyway, and that you can use that class as a purely technical hook for any purpose. That's indeed not far from being the truth, and if all are rightfully motivated, well, I guess we can have several ways of handling a relation such as acronymy co-exist.
But well, having one of the first XL deployments departing from the meager guidelines we had put in the Reference would not be a great sign for us :-/


Apart from this issue of ext:lexicalVariant and its sub-properties, I found the rest really good, confirming my first enthusiastic reaction after your talk :-)

Two comments/questions, maybe:

1. Are you planning to add the language tags that seem to be missing on some slides (e.g. for the ext:inflection objects) in the real data?

2. Intuitively, I feel that the definition of :NonPreferredTerm (on slide 33) is too strong. I would have said that everything that is related via xl:altLabel to a concept cannot be a PreferredTerm. Otherwise there would be a conflict with the inferred basic SKOS labelling triples [2]. So the complementOf axiom would not be really needed. But again, it's late, and I prefer to send this mail rather than letting you wait more time for my answer...

Cheers,

Antoine

[1] http://www.w3.org/2006/07/SWD/track/issues/215
[2] http://eea.eionet.europa.eu/Public/irc/envirowindows/jad/library?l=/ecoinformatics_indicator/ecoterm_5-6102009/ecoterm09-bandholtzppt/_EN_1.0_&a=d




Re: UMTHES and SKOS-XL

by Stella Dextre Clarke-2

Ah yes. We discovered a similar problem during work on BS 8723. It was
about whether to introduce a specialisation of USE/UF to cater for
abbreviations/acronyms and their expansions, for which you might use
tags such as AB/FT. A problem arises when the abbreviation is short for
another non-preferred term rather than the preferred term.
(For example, the preferred term "Information and communication
technology" can have non-preferred terms "Information technology", "IT"
and "ICT")
It becomes apparent that the proposed specialisation is not really a
type of USE/UF. It is an inter-term relationship that can sometimes
apply between non-preferred terms. Obviously it is possible to find a
way of representing this accurately, but at the expense of making the
whole model more complicated and the tagging conventions more cumbersome.

My personal view on this is that if you try to add more value in the
shape of lexical/terminological information, you lose the virtue of
simplicity. To put it another way, if you have mixed objectives (trying
to achieve  terminological objectives as well as enabling information
retrieval) these tend to detract from each other.

Cheers
Stella

*****************************************************
Stella Dextre Clarke
Information Consultant
Luke House, West Hendred, Wantage, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
stella@...
*****************************************************




Re: UMTHES and SKOS-XL

by Thomas Bandholtz

Dear Stella & Antoine,

Antoine has raised the essential issue, Stella came up with a related use case which can be solved using the UMTHES patterns.
UMTHES distinguishes not only prefLabel from altLabel, but also both from multiple spelling conventions of any label.
We see abbreviations/acronyms as part of such spelling conventions; others are inflectional forms of the same term, or even common misspellings. If we mixed all of this together into altLabel instances, it would not make sense any more.

Stella's example about abbreviations is similar, but we separate spelling conventions ("lexical variants") from labels regardless of whether they are pref or alt.
Example:

:4711 rdf:type skos:Concept;
   skos:prefLabel "waste water";
   skos:altLabel "sewage".

makes sense, but

#not recommended:
:4711 rdf:type skos:Concept;
   skos:prefLabel "waste water";
   skos:prefLabel "waste waters";
   skos:prefLabel "wastewater";
   skos:prefLabel "wastewaters";
   skos:altLabel "sewage".

looks at least somehow "unbalanced".

UMTHES knows even more about lexical complexity (a really awful issue in German); that is why we decided to use xl:Label extensions to separate such complexity from the more prominent list of labels which are directly assigned to a skos:Concept:

# hiding lexical complexity from the list of labels
:wasteWater rdf:type skosxl:Label;
   skosxl:literalForm "waste water";
   ext:lexicalVariant "wastewater";
   ext:lexicalVariant "wastewaters";
   ext:compoundFrom (:waste :water).

Speaking in ISO Thesaurus lingo: we do not want inflectional forms etc. to become entry terms.
(see http://www.w3.org/2004/02/skos/core/proposals.html#thesaurusRepresentation-11 ...)

This is also why we really do not want to have a property chain from an ext:lexicalVariant to a skos:Concept.
We appreciate the property chain from the skosxl:literalForm to the skos:Concept.
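
As a small illustration of the difference (namespace URIs for ext: and the default prefix are hypothetical), the comments below mark what the SKOS-XL chain from skosxl:prefLabel through skosxl:literalForm entails, and what stays hidden because no chain is declared for ext:lexicalVariant:

```turtle
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/umthes-ext#> .   # hypothetical namespace
@prefix :       <http://example.org/umthes/> .       # hypothetical namespace

# Asserted data
:4711 rdf:type skos:Concept ;
    skosxl:prefLabel :wasteWater .

:wasteWater rdf:type skosxl:Label ;
    skosxl:literalForm "waste water" ;
    ext:lexicalVariant "wastewater" .

# Entailed by the SKOS-XL chain (skosxl:prefLabel, skosxl:literalForm):
#   :4711 skos:prefLabel "waste water" .
#
# Not entailed, since no chain involves ext:lexicalVariant:
#   :4711 skos:hiddenLabel "wastewater" .
```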

Why then do we need all those lexical variants at all?
First, UMTHES just has them. It is my job to serialise UMTHES in SKOS, not to change UMTHES.
Secondly, we need this stuff to support automated indexing of full text documents. Machines need to be enabled to detect the Concepts behind this weird mess of character strings that makes a document (more on this in the ecoterm presentation).

See some more notes inline below.

Stella Dextre Clarke wrote:
Ah yes. We discovered a similar problem during work on BS 8723. It was about whether to introduce a specialisation of USE/UF to cater for abbreviations/acronyms and their expansions, for which you might use tags such as AB/FT. A problem arises when the abbreviation is short for another non-preferred term rather than the preferred term.
(For example, the preferred term "Information and communication technology" can have non-preferred terms "Information technology", "IT" and "ICT")
It becomes apparent that the proposed specialisation is not really a type of USE/UF. It is an inter-term relationship that can sometimes apply between non-preferred terms. Obviously it is possible to find a way of representing this accurately, but at the expense of making the whole model more complicated and the tagging conventions more cumbersome.

My personal view on this is that if you try to add more value in the shape of lexical/terminological information, you lose the virtue of simplicity. To put it another way, if you have mixed objectives (trying to achieve terminological objectives as well as enabling information retrieval) these tend to detract from each other.

Right. If someone only wants the pure thesaurus, she might get along with the skos: part of UMTHES only and simply ignore the skosxl: part and its extensions.
Thanks to the property chain which Antoine has mentioned, each skosxl:literalForm is equivalent to a directly assigned skos:prefLabel/altLabel.
So, nothing would be missing.




Antoine Isaac wrote:
Hi everyone,

I'm posting here a discussion we started with Thomas Bandholtz on UMTHES [1] on the use of SKOS-XL there (see slides at [2]). It is a long mail, but it may be interesting for a wider audience, as UMTHES is one of the first SKOS-XL implementations!

===

Dear Thomas,

So let's go. The main issue I have is that xl:Label is used in a very "term-oriented" way in UMTHES.
More precisely, I feel that you are using labels to aggregate lexical entities which indeed belong to the same "term". But these literals would be introduced as labels in basic SKOS, I think. Trying to use a concrete example from your slides:

:4711 rdf:type skos:Concept;
   skosxl:prefLabel :wasteWater.

:wasteWater rdf:type skosxl:Label;
   skosxl:literalForm "waste water";
   ext:lexicalVariant "wastewater";
   ext:compoundFrom (:waste :water).

"wastewater" is introduced as a lexical variant of "waste water". Per se, this is of course ok.
But in basic SKOS, I would have modelled that "wastewater" as a skos:altLabel or a skos:hiddenLabel of :4711. Not attaching that string to an instance of xl:Label using xl:literalForm prevents you from benefiting from the useful property chains given in XL. So I would have represented "wastewater" as an instance of xl:Label.

Of course, you may object that you can declare a property chain (or property chains) yourself that would allow one to infer that the literals that are objects of ext:lexicalVariant triples (or the ones involving sub-properties of ext:lexicalVariant) are also objects of skos:hiddenLabel (or skos:altLabel) statements attached to the skos:Concept to which your xl:Label is attached.

As said above: we do not want such property chains.
Anyway, hiddenLabel might also hide the lexical complexity; this might be an idea.
But I don't like the idea of creating thousands of xl:Label class instances when each of them only carries "exactly one" xl:literalForm and I do not really need class instances for anything else.

But then I'd be still uncomfortable with an xl:Label giving rise to several (SKOS-basic) labels.
Additionally, we actually introduced xl:labelRelation to handle cases like acronyms [1]. In your approach, acronym is a subproperty of lexicalVariant, which is clearly a different pattern from ours.

This usage for acronyms (see also Stella's example above) is just an example, not part of the standard. We considered following this example in the beginning, but then we found "subproperty of lexicalVariant" more convenient. It still conforms, as far as I can see.

As I feel it, your choice may be perfectly grounded in terminology. Still, I'd be curious to hear whether this is a strong position of yours, or if you could accommodate a different pattern.

Maybe there can indeed be a solution accommodating both points of view (if I interpreted yours correctly, of course). Namely, introducing "wastewater" as the literalForm of an xl:Label which is not connected to any concept, but only connected to :wasteWater (by an ext:lexicalVariant, which would then be a sub-property of xl:labelRelation).

Why should we introduce such a complex linkage chain here and waste all the resources needed to handle linked class instances instead of simple string properties?

Furthermore, and maybe more importantly, I see a considerable semantic difference between a term (label) and a spelling variant of a term. That's why I do not want to handle them both equally on the model level.


Of course you can say then that the distinction between "waste water" and "wastewater" is something very important for your UMTHES and the applications you envision with it, and that "wastewater" should never be used as a basic concept label, even a hidden one. Or not even interpreted as something that could be a label...
see above

You can also argue that the xl:Label story is quite thin in the SKOS Reference anyway, and that you can use that class as a purely technical hook for any purpose. That's indeed not far from being the truth, and if all are rightfully motivated, well, I guess we can have several ways of handling a relation such as acronymy co-exist.
I would appreciate this and I am expecting nothing else. There are more patterns which have not been "harmonised" in SKOS, such as narrowerPartitive etc., for good reasons. I don't think this is a problem. Any standard should give room for some diversity at its borders.

But well, having one of the first XL deployments departing from the meager guidelines we had put in the Reference would not be a great sign for us :-/
Why this? The pattern you recommend is not bad, but its usability depends on the intentions of the thesaurus providers.
Anyway, I can think about this for acronyms.


Apart from this issue of ext:lexicalVariant and its sub-properties, I found the rest really good, confirming my first enthusiastic reaction after your talk :-)

Two comments/questions, maybe:

1. Are you planning to add the language tags that seem to be missing on some slides (e.g. for the ext:inflection objects) in the real data?

I can do so, though this is not what we want to express. As each xl:Label has exactly one xl:literalForm, this necessarily has a single language. From this it can be inferred that lexical variants of this literalForm have the same language. This is what we want to express, but I see no way to do this in Turtle or even safely in RDF/XML ...
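
A sketch of the explicit workaround, assuming English data and hypothetical namespace URIs for ext: and the default prefix. Plain Turtle has no syntax for "same language as the literalForm", so each variant literal simply repeats the tag:

```turtle
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/umthes-ext#> .   # hypothetical namespace
@prefix :       <http://example.org/umthes/> .       # hypothetical namespace

:wasteWater rdf:type skosxl:Label ;
    skosxl:literalForm "waste water"@en ;
    # the tag is repeated on every variant; nothing links it to the literalForm's tag
    ext:lexicalVariant "wastewater"@en , "wastewaters"@en .
```

This states the language of each variant but not the intended constraint that variants always share the language of their literalForm.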

2. Intuitively, I feel that the definition of :NonPreferredTerm (on slide 33) is too strong. I would have said that everything that is related via xl:altLabel to a concept cannot be a PreferredTerm. Otherwise there would be a conflict with the inferred basic SKOS labelling triples [2]. So the complementOf axiom would not be really needed.
You may be right, I'll think this over, but now I have to go out for dinner first :-)

Many thanks for your rich comments, Antoine!


Best regards, Thomas
But again, it's late, and I prefer to send this mail rather than letting you wait more time for my answer...
Cheers,

Antoine

[1] http://www.w3.org/2006/07/SWD/track/issues/215
[2] http://eea.eionet.europa.eu/Public/irc/envirowindows/jad/library?l=/ecoinformatics_indicator/ecoterm_5-6102009/ecoterm09-bandholtzppt/_EN_1.0_&a=d






-- 
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491

Re: UMTHES and SKOS-XL

by Stella Dextre Clarke-2

Thomas Bandholtz wrote:

> Secondly, we need this stuff to support automated indexing of full text
> documents. Machines need to be enabled to detect the Concepts behind this
> weird mess of character strings that makes a document (more on this in
> the ecoterm presentation).
Another interesting point. I sometimes hear people complain that
ISO2788-compliant thesauri do not help enough with retrieval from full
text of documents that have not been humanly indexed. This is hardly
surprising, since they were designed to support retrieval of documents
indexed with that same vocabulary. The same is true of BS 8723-2 and the
forthcoming ISO 25964-1.

When people want to use a thesaurus for full text retrieval, I sometimes
suggest they could improve the results by stripping the qualifiers off
the non-preferred terms. But more could be done to enhance the results
of that process, by including inflectional forms, term weighting,
Boolean expressions, additional less reliable clue-words, etc, and of
course dropping the idea of admitting the clue-words as non-preferred
synonyms with  reciprocal relationships.

I sometimes wonder if a future revised version of BS 8723 or ISO 25964
should include some recommendations to this effect. What do you think?

Stella





Re: UMTHES and SKOS-XL

by Thomas Bandholtz

Hi Stella,

remember Leonard Will's posting about "revising the ISO standard for
thesauri for information retrieval" from Feb this year?
http://lists.w3.org/Archives/Public/public-esw-thes/2009Feb/0033.html
with a huge diagram attached.
Would be curious what has happened since then.

Leonard, still on the line?

Something else regarding my previous post. I was too eager to go out for
dinner, so I made a misleading error in this turtle syntax example:
#not recommended (and not what I wanted to write)
:4711 rdf:type skos:Concept;
   skos:prefLabel "waste water";
   skos:prefLabel "waste waters";
   skos:prefLabel "wastewater";
   skos:prefLabel "wastewaters";
   skos:altLabel "sewage".

This is not what I wanted to say. It should read as:

#not recommended:
:4711 rdf:type skos:Concept;
   skos:prefLabel "waste water";
   skos:altLabel "waste waters";
   skos:altLabel "wastewater";
   skos:altLabel "wastewaters";
   skos:altLabel "sewage".

Too silly! Excuse me for such a confusion, I was somehow ... hungry!
Damn copy&paste in a hurry!

Best regards,
Thomas






Re: UMTHES and SKOS-XL

by Richard Light

In message <4ADF6185.1060309@...>, Stella Dextre Clarke
<stella@...> writes

>Thomas Bandholtz wrote:
>
>> Secondly, we need this stuff to support automated indexing of full
>>text documents. Machines need to be enabled to detect the Concepts
>>behind this  weird mess of character strings that makes a document
>>(more on this in  the ecoterm presentation).
>Another interesting point. I sometimes hear people complain that
>ISO2788-compliant thesauri do not help enough with retrieval from full
>text of documents that have not been humanly indexed. This is hardly
>surprising, since they were designed to support retrieval of documents
>indexed with that same vocabulary. The same is true of BS 8723-2 and
>the forthcoming ISO 25964-1.
>
>When people want to use a thesaurus for full text retrieval, I
>sometimes suggest they could improve the results by stripping the
>qualifiers off the non-preferred terms. But more could be done to
>enhance the results of that process, by including inflectional forms,
>term weighting, Boolean expressions, additional less reliable
>clue-words, etc, and of course dropping the idea of admitting the
>clue-words as non-preferred synonyms with  reciprocal relationships.
>
>I sometimes wonder if a future revised version of BS 8723 or ISO 25964
>should include some recommendations to this effect. What do you think?

I would say not.  "Machines detecting concepts" strikes me as an
unachievable goal, certainly with our current capabilities.  "Machines
detecting the presence of words which are also terms in a thesaurus" is
achievable, but it _isn't_ the same thing.

Richard
--
Richard Light


RE: UMTHES and SKOS-XL

by Johan De Smedt

Hi,

Suggestion: There are three levels of organization.
- Concepts (SKOS talk)
- Labels
- Text processing

A significant part of the issues discussed relate to what is on the label management level
and what is on the text processing level (thus needing a proper definition).

Language-specific text processing and analysis (including inflection)
seems to me a specialized area which global resources (language dictionaries)
like WordNet can address.
Stemming is also in this area.
It seems to me costly if this would be managed in every thesaurus.

Label management can focus on standard terms and term decomposition as relevant within a
thesaurus or taxonomy.  (equivalence relation, compound equivalence, acronym,
short-name, qualifiers ...)

Indexing and search engines combining thesaurus and text processing can use the label
management layer (of the thesaurus) to configure the relevant text processing.

Concept and label processing surely belong to the thesaurus/taxonomy/... management.
Text processing, I would suggest, is in the text processing engines.

PS:
- thanks for the UMTHES presentation - very instructive.
- would it be an idea to build on further SKOS extensions to have common schema for
  artefacts like equivalence relation and compound equivalence; or for specializing
  some xl:labelRelation ?

kr, Johan De Smedt.




Re: UMTHES and SKOS-XL

by Thomas Bandholtz

Hi Johan,
> Suggestion: There are three levels of organization.
> - Concepts (SKOS talk)
> - Labels
> - Text processing
>  
Good idea!
I would add: Labels are skosxl, text processing is not yet really
covered by skos(xl), but can be supported by extending skosxl locally.
> A significant part of the issues discussed relate to what is on the label management level
> and what is on the text processing level (thus needing a proper definition)
>
> Language specific text processing and analysis (including inflection)
> seems to me a specialized area which global resources (language dictionaries)
> like WordNet can address.
>  
http://wordnet.princeton.edu/wordnet/ starts with this sentence:
"WordNet® is a large lexical database of English". Right. We have more
than 20 languages in European GEMET. Believe me, when it comes to
language specific text processing, English is the simplest language.

> Stemming is also in this area.
> It seems to me costly if this would be managed in every thesaurus.
>  
It is costly, sure, but as I have expressed before, UMTHES has already
invested in this, and the question now is how to express the results in
a skosxl extension, not whether UMTHES should forget all the results of
this investment. You are right on one point: in general, a thesaurus
need not care about this. It is not a general requirement. But
language specific text processing needs to be solved on a language
specific level by someone somehow.
> Label management can focus on standard terms and term decomposition as relevant within a
> thesaurus or taxonomy.  (equivalence relation, compound equivalence, acronym,
> short-name, qualifiers ...)
>  
Right so far. What we try to handle is this: each such term (= label) has
multiple spelling conventions, and a spelling variant does not make a
different term on the same level. Maybe this is specific to some
languages only and not such an issue in English.
> Indexing and search engines combining thesaurus and text processing can use the label
> management layer (of the thesaurus) to configure the relevant text processing.
>  
I think this needs a third, dedicated layer.
> Concept and label processing surely belong to the thesaurus/taxonomy/... management.
> Text processing, I would suggest, is in the text processing engines.
>  
Right, but text processing engines need some structure to express the
diversity of term (Label) occurrences in natural language.
> PS:
> - thanks for the UMTHES presentation - very instructive.
>  
Thanks for the flowers, I tried hard to provide some valuable
contribution. As always, one has to surrender at some point of
complexity (just to be on time for the meeting) and leave the rest to
the next presentation, ...
> - would it be an idea to build on further SKOS extensions to have common schema for
>   artefacts like equivalence relation and compound equivalence; or for specializing
>   some xl:labelRelation ?
>  
I think we should collect more examples and patterns, and we should not
try to harmonise this too strictly.
What we tried to implement in UMTHES: separate a pure SKOS Core
representation, which everybody can handle, from a more
experimental (admittedly) extension which goes beyond established
skos(xl) patterns. But UMTHES needs it now (!) as an exchange format
in a real production scenario, so we cannot wait.

Thanks Johan for your comments, really helpful to think this over more
thoroughly!

--
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491



Re: UMTHES and SKOS-XL

by Thomas Bandholtz

Dear Robert,

> I would say not.  "Machines detecting concepts" strikes me as an unachievable goal, certainly with our current capabilities.  "Machines detecting the presence of words which are also terms in a thesaurus" is achievable, but it _isn't_ the same thing.
Richard, when the machine has detected a term (which is quite easy so far), there are some remaining problems to be solved. I give only two examples:
  • the term may simply be ambiguous (a homograph). It may designate more than one Concept. Qualifiers may help in this case (Stella mentioned this), but such qualifiers may not appear literally in the same text context ...
  • the term can designate a Concept by itself, but it may also occur as part of a compound term which designates a different Concept.
You can add more cases.
"Machines detecting concepts" means getting closer and closer to a safe automatic decision in such cases.
This will not be finalised by a "big bang", but it is not "an unachievable goal" as you say.
It is not yet achieved completely, but there are many approaches that come closer every time you revisit them.
Give this a little more time!
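
To make the two cases above concrete, here is a minimal Turtle sketch. The namespaces, the concepts, the "bank" homograph and the ext:qualifier property are hypothetical and chosen only for illustration; only ext:compoundFrom follows the UMTHES pattern quoted in this thread.

```turtle
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/ext#> .     # assumed extension namespace
@prefix :       <http://example.org/thes#> .    # assumed thesaurus namespace

# Case 1: a homograph. Two labels share one literal form but belong to
# different concepts; a (hypothetical) qualifier tells them apart.
:c1 a skos:Concept ; skosxl:prefLabel :bankRiver .
:c2 a skos:Concept ; skosxl:prefLabel :bankFinance .

:bankRiver a skosxl:Label ;
    skosxl:literalForm "bank"@en ;
    ext:qualifier "river"@en .

:bankFinance a skosxl:Label ;
    skosxl:literalForm "bank"@en ;
    ext:qualifier "finance"@en .

# Case 2: a term that is a label on its own and also part of a compound
# term designating a different concept.
:c3 a skos:Concept ; skosxl:prefLabel :water .
:c4 a skos:Concept ; skosxl:prefLabel :wasteWater .

:waste a skosxl:Label ; skosxl:literalForm "waste"@en .
:water a skosxl:Label ; skosxl:literalForm "water"@en .

:wasteWater a skosxl:Label ;
    skosxl:literalForm "waste water"@en ;
    ext:compoundFrom ( :waste :water ) .
```

A detector that finds "bank" or "water" in a text still has to choose between :c1 and :c2, or between :c3 and :c4. The data only makes the alternatives explicit.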

Best regards,
Thomas

PS: On the other hand, if someone wants to expose her knowledge to the Semantic Web, she should use a formal language such as RDF directly and not human lingo. This would make everything much easier! (Dreaming ;-)


-- 
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491

Re: UMTHES and SKOS-XL and Others!

by Christophe Dupriez

Hi!

A few thoughts coming from this discussion:

* Indexing Authority List vs Existing Concepts Inventory: MeSH is an example of merging both.
   In MeSH/UMLS, Concepts have their specific labels (terms) but they are grouped in micro-hierarchies to form a Heading entry.
  Example:
http://www.nlm.nih.gov/cgi/mesh/2010/MB_cgi?mode=&index=877&view=expanded
  I believe SKOS is able to represent most MeSH attributes:
    * The Concept Unique Identifier is the "about"
    * Tree numbers (changing from one year to another) are a notation system
    * The (Heading) Entry Unique Id is another notation system (an id within a sub-scheme)
    * The Registry Number (CAS) is another notation system (an id within another scheme)
    * Terms are preferred labels or synonyms (depending on the lexical tag value)
    * Scope Notes are SKOS scope notes. The concept references within Scope Notes have to be represented somehow.
    * Annotations and others are editorial notes or other types of SKOS notes
    * Previous indexing: relatedMatch with older Heading Schemes?
  A good way to represent Semantic Types (collections?) and Allowable qualifiers (collections too? or a SKOS extension?) remains to be found.
  This example also shows a difficult problem: the Heading entry is more specific (not more generic) than the two other "non preferred" concepts!
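
  As a rough illustration of the mapping listed above, a MeSH-like record might be sketched in Turtle as follows. Every identifier, datatype and string here is a placeholder invented for the sketch, not real MeSH data, and the use of typed skos:notation literals to separate the notation systems is an assumption.

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/mesh-sketch#> .   # hypothetical namespace

# The concept URI plays the role of the Concept Unique Identifier ("about").
ex:M0000001 a skos:Concept ;
    skos:inScheme ex:headingScheme ;
    skos:prefLabel "Example heading term"@en ;
    skos:altLabel  "Example entry term"@en ;
    # One datatype per notation system (all three datatypes are hypothetical).
    skos:notation "A01.000.000"^^ex:TreeNumber ;
    skos:notation "D000001"^^ex:HeadingId ;
    skos:notation "0000-00-0"^^ex:RegistryNumber ;
    skos:scopeNote     "Placeholder scope note."@en ;
    skos:editorialNote "Placeholder annotation."@en ;
    # Previous indexing, pointing at a heading in an older scheme.
    skos:relatedMatch ex:olderHeading .
```

  Semantic Types and Allowable qualifiers are left out on purpose, since it is not clear to me which SKOS construct fits them.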

* Full Natural Language Processing needs a way to treat the EXCEPTIONS efficiently: intuition suggests that the 80/20 rule is good enough.
   Reality is much more demanding: "small" linguistic errors are never accepted by humans (when visible: this is why Google does not document them!).
   So the representation of exceptions must be part of the design of data structures for Natural Language Processing systems.
   It is their main use (the general 80% rules can even be hard coded).
   This is way too complex to be seen as a simple SKOS extension.

* Thesaurus "projection" over a text has been used with success to generate suggestions to human indexers (not for fully automatic indexing).
   It is very useful, and it is true that having the necessary lexical information in a SKOS extension to achieve this would be nice.
   It is limited to the detection of nominal groups, and it may have problems with the different grammatical ways of expressing coordination between elementary concepts in a term.
   To succeed, this "extension" normalization effort should define properties only for that precise purpose.

   In general, a focused "purpose", open to the different applications sharing that purpose, is the only way to deliver a working standard...

I am very sorry that I cannot attend "Classification at CrossRoads" and the SKOS day on October the 30th in the Netherlands: I hope to be able to attend on another occasion.
I suppose the presentations will be available?

Have a nice day!

Christophe Dupriez

  





Re: UMTHES and SKOS-XL and Others!

by Thomas Bandholtz

Dear Christophe,

I am not familiar enough with the MeSH/UMLS schema to comment on your SKOS
mapping spontaneously.
So I limit myself to your more general statements:

>
> * Full Natural Language Processing needs a way to efficiently treat
> the EXCEPTIONS: the intuition believes that 80/20 rule is good enough.
>    Reality is much more demanding: "small" linguistic errors are never
> accepted by humans (when visible: this is why Google does not document
> them!).
>    So the representation of exceptions must be in the design of data
> structures for Natural Language Processing systems.
>    It is their main use (the general 80% rules can even be hard coded).
>    This is way too complex to be seen as a simple SKOS extension.

I agree, more or less. SKOS is not made to express rules. But you may
enhance xl:Label instances with certain linguistic data (specific to the
given language) in order to enable NLP systems to cope with the
remaining 20%. At least this is what we try in UMTHES.
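
A minimal sketch of such an enhancement, reusing property names that appear in this thread (ext:lexicalVariant, ext:inflection, ext:compoundFrom). The namespaces, the language tags and the exact values are assumptions made for illustration, not actual UMTHES data.

```turtle
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/ext#> .     # assumed extension namespace
@prefix :       <http://example.org/thes#> .    # assumed thesaurus namespace

:4711 a skos:Concept ;
    skosxl:prefLabel :wasteWater .

# The label carries the language-specific data an NLP tool would need;
# the concept itself only sees the literal form.
:wasteWater a skosxl:Label ;
    skosxl:literalForm "waste water"@en ;
    ext:lexicalVariant "wastewater"@en ;
    ext:inflection     "waste waters"@en , "wastewaters"@en ;
    ext:compoundFrom   ( :waste :water ) .

:waste a skosxl:Label ; skosxl:literalForm "waste"@en .
:water a skosxl:Label ; skosxl:literalForm "water"@en .
```

An irregular form that no general rule would produce can be listed in the same way, which is how the exceptions end up in the data rather than in the rules.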

>
> * Thesaurus "projection" over a text has been used with success to
> generate suggestions to human indexers (not for fully automatic
> indexation).

In practice, we once built a wizard making suggestions to human
indexers, and after some tests people used it for fully automatic
indexing.
This was not because the wizard was perfect; it was because
80% (or even 70%) was found to be "good enough". This depends strongly
on the use case.

>    It is very useful and it is true that having the necessary lexical
> information in a SKOS extension to achieve this would be nice.
>    It is limited to the detection of nominal groups but it may have
> problems with different grammatical ways to express coordination
> between elementary concepts in a term.
>    To succeed, this "extension" normalization effort should be done to
> define properties only for that precise purpose

Can this be "normalized"? I don't see any normalized NLP methods, so I
wonder how we can normalize the properties that will support such
methods. Do you have something in mind?

>
>    In general, focused "purpose", open to the different applications
> with that purpose, is the only way to deliver a working standard...

To me, any real-world ConceptScheme is individual to a certain extent.
SKOS (XL included) covers the common patterns and gives room for
necessarily individual extensions. Over time, we might discover more
common patterns even in the individuality of each scheme, but some
diversity will always remain. I don't think this is a problem.

Referring to the UMTHES extensions, it was not the intention to provide
a standardisation proposal.
UMTHES just needs a lossless RDF serialisation making the most of SKOS
and extending it for our specific demands, and we need all this now.
But I would be enthusiastic about some future extensions of SKOS towards
linguistics and NLP support, if they may arise from this discussion.

Kind regards,
Thomas

--
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491



Re: UMTHES and SKOS-XL

by Antoine Isaac-2

Dear Thomas,

The discussion has gone quite wild, I see :-)
I'll try to come back to the original UMTHES issue, first...

> Speaking in ISO Thesaurus lingo: we do not want inflectional forms etc. to become entry terms.

> Why then do we need all those lexical variants at all?
> At first, UMTHES just has them. It is my job to serialise UMTHES in SKOS, not to change UMTHES.
> Secondly, we need this stuff to support automated indexing of full text documents. Machine need to be enabled to detect the Concepts behind this weird mess of character strings that makes a document (more on this in the ecoterm presentation).


I think everything is here, and you don't need to say much more!
Especially the first sentence, which can be enough to define a practice (or actually recall it). I now clearly see the point of the example in your slides [2], where the main form xl:Label has dozens of variants in German. Knowledge of those could be counter-productive for many user-oriented applications, though not for sophisticated NLP-based tools.

Please keep the hiddenLabel solution in mind, however. I share your reservations about creating more instances of xl:Label, but if you see a slight chance that UMTHES could evolve towards an even more lexically intensive thing, having the instances of xl:Label could spare you some painful model changes...

Picking some elements from your mail at the bottom:

> This usage for acronyms (see also Stella's example above) is just an
> example, not part of the standard. We have considered to follow this
> example in the beginning, but then we found "subproperty of
> lexicalVariant" more convenient. It still conforms, as far as I see.


Yes!


> Why should we introduce such a complex linkage chain here and waste all
> those resources needed to handle linked class instances instead of
> simple string properties ?


The overhead is not really huge, in fact. I mean, it adds a fraction of all the triples that you already have in UMTHES; it's not as if it multiplied them by ten.

 
> Furthermore, and maybe more important, I see a considerable semantic
> difference between a term (label) and a spelling variant of a term.
> That's why I do not want to handle them both equally on the model level.


Yes, but as SKOS would not make the distinction (other than treating them as hiddenLabel, whereas the others would be pref or alt labels), there would not be a strong counter-argument to it from the SKOS perspective. And from your more practical perspective, you could still create two sub-classes of xl:Label, a bit like what you hint at in your presentation, in fact.
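
To sketch what I mean, under stated assumptions: the class names ext:Term and ext:SpellingVariant and the property ext:variantLabel are hypothetical names made up for this illustration, as are the namespaces. The point is only that both kinds of strings become xl:Label instances while staying distinguishable at the model level.

```turtle
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/ext#> .     # assumed extension namespace
@prefix :       <http://example.org/thes#> .    # assumed thesaurus namespace

# Two hypothetical sub-classes keep terms and spelling variants apart.
ext:Term            rdfs:subClassOf skosxl:Label .
ext:SpellingVariant rdfs:subClassOf skosxl:Label .

# Hypothetical label-to-label link, specializing xl:labelRelation.
ext:variantLabel rdfs:subPropertyOf skosxl:labelRelation .

:4711 a skos:Concept ;
    skosxl:prefLabel   :wasteWater ;
    skosxl:hiddenLabel :wastewaterVariant .

:wasteWater a ext:Term ;
    skosxl:literalForm "waste water"@en ;
    ext:variantLabel :wastewaterVariant .

:wastewaterVariant a ext:SpellingVariant ;
    skosxl:literalForm "wastewater"@en .
```

With this shape the XL property chains would give :4711 a plain skos:hiddenLabel "wastewater", which user-oriented applications can ignore and indexing tools can use.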


>>> I guess we can have
>>> several ways of handling a relation such as acronymy co-exist.
> I would appreciate this and I am expecting nothing else. There are more
> patterns which have not been "harmonised" in SKOS, such as
> narrowerPartitive etc., for good reasons. I don't think this is a
> problem. Any standard should give room for some diversity at its borders.
>>> But well, having one of the first XL deployments departing from the
>>> meager guidelines we had put in the Reference would not be a great
>>> sign for us :-/
> Why this? The pattern you recommend is not bad, but its usability
> depends on the intentions of the thesaurus providers.
> Anyway, I can think about this for acronyms.


You're right, much of that depends on the intention of thesaurus providers. And the pattern we had is certainly not intended as normative.


>>> 1. Are you planning to add the language tag that seem to be missing
>>> on some slides (e.g. for the ext:inflection objects) in the real data?
> I can do so, though this is not what we want to express. As each
> xl:Label has exactly one xl:literalForm, this necessarily has a single
> language. From this it can be inferred that lexical variants of this
> literalForm have the same language. This is what we want to express,
> but I see no way to do this in Turtle or even safely in RDF/XML ...


Yes. The only way to proceed is to simulate that rule by just putting the tags on all the literals that are in your data :-/

If you want to do it in a neat way, with rules, then you have to represent languages as full-fledged resources, and build axioms using them.
Note that there is some logic to this, in a way. You cannot expect the syntax to let you deal with something that sits very much at the model level, at least as I see it!
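
A small sketch of the two options, with assumed namespaces. The ext:language property, the ext:Language class and the :German resource are hypothetical names introduced only for this illustration.

```turtle
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/ext#> .     # assumed extension namespace
@prefix :       <http://example.org/thes#> .    # assumed thesaurus namespace

# Option 1: simulate the rule by repeating the tag on every literal.
:wasteWater a skosxl:Label ;
    skosxl:literalForm "waste water"@en ;
    ext:lexicalVariant "wastewater"@en , "wastewaters"@en .

# Option 2: the language is a resource attached to the label itself,
# so an axiom or rule can refer to it; the variants stay plain literals.
:German a ext:Language .

:abwasser a skosxl:Label ;
    skosxl:literalForm "Abwasser"@de ;
    ext:language   :German ;
    ext:inflection "Abwässer" .
```

Option 2 still needs a rule outside the data to state that every variant of a label shares the language of that label.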

Cheers,

Antoine


[1] http://www.w3.org/2006/07/SWD/track/issues/215
[2] http://eea.eionet.europa.eu/Public/irc/envirowindows/jad/library?l=/ecoinformatics_indicator/ecoterm_5-6102009/ecoterm09-bandholtzppt/_EN_1.0_&a=d




Re: UMTHES and SKOS-XL and Others!

by Antoine Isaac-2

Hello everyone,

Johan's suggestion
> There are three levels of organization.
> - Concepts (SKOS talk)
> - Labels
> - Text processing
makes sense indeed. Like Thomas, however, I would think that the label layer falls at least partly within the SKOS(XL) scope. And within the ISO/BS one.

But to answer Stella specifically, on what I think belongs to Johan's third point,

> I sometimes wonder if a future revised version of BS 8723 or ISO 25964 should include some recommendations to this effect. What do you think?

I also think that this is a dangerous road to go down.
I mean, I certainly think that the effort of representing lexical info is very useful. And I believe that it is possible to achieve interesting stuff based on that.
But for us (simpler KOS-oriented efforts like SKOS/ISO/BS) it would be better to just focus on:
- pointing to initiatives, such as WordNet and [1], which try to represent lexical information for NLP tools to work with.
- allowing those initiatives to be plugged onto our KOS-related efforts (or vice versa) by providing sufficient extension hooks. Which was the main rationale for SKOS-XL, in fact.
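
As a sketch of what such a hook could look like: the label stays a plain xl:Label, and one extra property points into an external lexical resource. The property ext:lexicalEntry, the lex: namespace and the entry lex:waste_water are hypothetical, and this is not how WordNet or [1] actually expose their data.

```turtle
@prefix skos:   <http://www.w3.org/2004/02/skos/core#> .
@prefix skosxl: <http://www.w3.org/2008/05/skos-xl#> .
@prefix ext:    <http://example.org/ext#> .      # assumed extension namespace
@prefix lex:    <http://example.org/lexicon#> .  # hypothetical external lexicon
@prefix :       <http://example.org/thes#> .     # assumed thesaurus namespace

:4711 a skos:Concept ;
    skosxl:prefLabel :wasteWater .

# KOS side: only the literal form is stated here.
:wasteWater a skosxl:Label ;
    skosxl:literalForm "waste water"@en ;
    # Hook: inflection, part of speech etc. live in the external resource.
    ext:lexicalEntry lex:waste_water .
```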

Trying to cope with all the required details is out of our scope and, I think, our expertise, even if the ISO/BS committees have bright people involved ;-)
In fact, finding a core model for lexical information (such as [1]) is still ongoing work, and there are multiple proposals around, which shows that it is indeed a complex problem.

Cheers,

Antoine

[1] http://code.google.com/p/lexinfo/

> Dear Christophe,
>
> I am not familiar enough with the MeSH/UMLS schema to comment your SKOS
> mapping spontaneously.
> So i limit myself to your more general statements:
>
>> * Full Natural Language Processing needs a way to efficiently treat
>> the EXCEPTIONS: the intuition believes that 80/20 rule is good enough.
>>    Reality is much more demanding: "small" linguistic errors are never
>> accepted by humans (when visible: this is why Google does not document
>> them!).
>>    So the representation of exceptions must be in the design of data
>> structures for Natural Language Processing systems.
>>    It is their main use (the general 80% rules can even be hard coded).
>>    This is way too complex to be seen as a simple SKOS extension.
>
> I agree, more or less. SKOS is not made to express rules. But you may
> enhance xl:Label instances with certain linguistic data (specific to the
> given language) in order to enable NLP systems getting along with the
> remaining 20%. At least this is what we try in UMTHES.
>
>> * Thesaurus "projection" over a text has been used with success to
>> generate suggestions to human indexers (not for fully automatic
>> indexation).
>
> In practise, we once buildt a wizzard making suggestions to human
> indexers, and after some tests people used it as a fully automatic
> indexation.
> This was not because the wizzard would have been perfect, it was because
> 80% (or even 70) were found to be "good enough". This depends strongly
> on the use case.
>
>>    It is very useful and it is true that having the necessary lexical
>> information in a SKOS extension to achieve this would be nice.
>>    It is limited to the detection of nominal groups but it may have
>> problems with different grammatical ways to express coordination
>> between elementary concepts in a term.
>>    To succeed, this "extension" normalization effort should be done to
>> define properties only for that precise purpose
>
> Can this be "normalized"? I don't see any normalized NLP methods, so I
> wonder how we can normalize the properties that will support such
> methods. Do you have something in mind?
>
>>    In general, focused "purpose", open to the different applications
>> with that purpose, is the only way to deliver a working standard...
>
> To me, any real-world conceptScheme is individual to a certain extent.
> SKOS (XL included) covers the common patterns and gives room for
> necessarily individual extensions. Over time, we might discover more
> common patterns even in the individuality of each scheme, but some
> diversity will always remain. I don't think this is a problem.
>
> Referring to the UMTHES extensions, it was not the intention to provide
> a standardisation proposal.
> UMTHES just needs a lossless RDF serialisation making the most of SKOS
> and extending it for our specific demands, and we need all this now.
> But I would be enthusiastic about some future extensions of SKOS towards
> linguistics and NLP support, if they may arise from this discussion.
>
> Kind regards,
> Thomas
>



Re: UMTHES and SKOS-XL

by Thomas Bandholtz

Hi Johan & Antoine,

from the SKOS point of view, the structure of your ev:permutedLiteralForm is very similar to that of umthes:lexicalVariant.
As both are defined as local datatype properties of skosxl:Label, the property chain S57 will not work:

S57:      "The property chain (skosxl:hiddenLabel, skosxl:literalForm) is a sub-property of skos:hiddenLabel."

You are right, OWL 2 introduces property chains for owl:ObjectProperty but not for owl:DatatypeProperty. See
http://www.w3.org/TR/2009/WD-owl2-new-features-20090611/#F8:_Property_Chain_Inclusion

There has been some discussion about "formal expression of property chains" in the skos list, but no final clarification. See
http://lists.w3.org/Archives/Public/public-esw-thes/2009May/0003.html

I think that Antoine's draft in
http://lists.w3.org/Archives/Public/public-swd-wg/2009Mar/0043.html and
is not valid OWL2 because he refers to datatype properties.

There are some valid examples in the OWL 2 primer, for instance:

 <rdf:Description rdf:about="hasUncle">
   <owl:propertyChainAxiom rdf:parseType="Collection">
     <owl:ObjectProperty rdf:about="hasFather"/>
     <owl:ObjectProperty rdf:about="hasBrother"/>
   </owl:propertyChainAxiom>
 </rdf:Description>


To me it is not really clear why this pattern is restricted to object properties in OWL 2, but it is.

Anyway, given that S57 is valid, ev:permutedLiteralForm and umthes:lexicalVariant would need to be remodelled as the xl:literalForm of some xl:Label, with an additional ev:hasPermutedLiteralForm subproperty of xl:labelRelation. Then you can point from the Concept to the permuted form using xl:hiddenLabel. Something like:

# no hiddenLabel in this example
:123 rdf:type skos:Concept;
    skosxl:prefLabel :ABC.

:ABC rdf:type skosxl:Label;
    skosxl:literalForm "Something";
    ev:permutedLiteralForm "permuted form of Something".

would need to be modified as in the next example:

# using hiddenLabel and a subproperty of skosxl:labelRelation
:123 rdf:type skos:Concept;
    skosxl:prefLabel :ABC;
    skosxl:hiddenLabel :ABCperm.

:ABC rdf:type skosxl:Label;
    skosxl:literalForm "Something".

:ABCperm rdf:type skosxl:Label;
    skosxl:literalForm "permuted form of Something".

:ABC ev:hasPermutedLiteralForm :ABCperm.


This looks a bit complicated, but this is how I read the SKOS specification, and Antoine pointed to this in a previous mail.
I have not yet decided whether to go this way for umthes:lexicalVariant.
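
To make the difference between the two modellings concrete, here is a minimal Python sketch of what the S57 chain (and the analogous chains for prefLabel and altLabel, S55 and S56 in the SKOS Reference) would entail. It is only an illustration under stated assumptions: triples are plain tuples of strings, the function name apply_xl_chains is made up for this example, and no real reasoner or RDF library is involved.

```python
# Triples are plain (subject, predicate, object) tuples; prefixed names are just strings.
# Hypothetical helper, not part of any SKOS tooling.
XL_CHAINS = {
    "skosxl:prefLabel": "skos:prefLabel",      # S55
    "skosxl:altLabel": "skos:altLabel",        # S56
    "skosxl:hiddenLabel": "skos:hiddenLabel",  # S57
}


def apply_xl_chains(triples):
    """Return the plain SKOS label triples entailed by the XL property chains."""
    # Index the literal forms of every label resource.
    literal_forms = {}
    for s, p, o in triples:
        if p == "skosxl:literalForm":
            literal_forms.setdefault(s, []).append(o)
    # Follow concept -> label -> literal form.
    inferred = set()
    for s, p, o in triples:
        if p in XL_CHAINS:
            for literal in literal_forms.get(o, []):
                inferred.add((s, XL_CHAINS[p], literal))
    return inferred


# First modelling: the permuted form is a datatype property of the label.
original = [
    (":123", "skosxl:prefLabel", ":ABC"),
    (":ABC", "skosxl:literalForm", "Something"),
    (":ABC", "ev:permutedLiteralForm", "permuted form of Something"),
]

# Second modelling: the permuted form is an xl:Label of its own.
remodelled = [
    (":123", "skosxl:prefLabel", ":ABC"),
    (":123", "skosxl:hiddenLabel", ":ABCperm"),
    (":ABC", "skosxl:literalForm", "Something"),
    (":ABCperm", "skosxl:literalForm", "permuted form of Something"),
    (":ABC", "ev:hasPermutedLiteralForm", ":ABCperm"),
]

for triple in sorted(apply_xl_chains(remodelled)):
    print(triple)
```

With the first modelling the chains only yield the skos:prefLabel triple; the permuted form stays invisible to an application that looks at skos:hiddenLabel. With the second modelling the skos:hiddenLabel triple is entailed as well.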

Maybe these properties just don't need to be hiddenLabels.
Definition of skos:hiddenLabel:
"A lexical label for a resource that should be hidden when generating visual displays of the resource, but should still be accessible to free text search operations."

This doesn't seem very clear: free text search operations may access literal values of any property, but a SPARQL query restricted to rdfs:label (including its subproperties) wouldn't.
As far as I can see, this would be the only consequence.
Did I miss something?

Kind regards,
Thomas


Johan De Smedt schrieb:
Hi Thomas,
 
The permuted label is managed by the EUROVOC editors.
ev:permutedLiteralForm is a data property on an xl:Label.
For each xl:Label/xl:literalForm, any number of label permutations may be provided.
 
On export from the maintenance system, these permutations are made available as skos:hiddenLabel by the export service.
So a skos:Concept that has an xl:Label as either an xl:prefLabel or an xl:altLabel, where that xl:Label has permuted
literal forms, gets a skos:hiddenLabel for each of those permuted literals.
(cf. the importRule annotation)
[
We could have defined property chains like
- skos:hiddenLabel (xl:prefLabel o ev:permutedLiteralForm)
- skos:hiddenLabel (xl:altLabel o ev:permutedLiteralForm)
But I think OWL2 only provides property chains on object properties (please correct if I am wrong here !)
]
[
Would it make sense to make ev:permutedLiteralForm an annotation property ?
]
 
We chose this approach because:
- The skos:hiddenLabel semantics are less precise than what is intended by the EUROVOC maintenance and editorial team
  when handling "permutations".
- We still wanted users/application only using SKOS/XL and the EUROVOC extension (so not interpreting the semantics
  of permuted literal form) to be able to search using any of the permuted literal values.
- In the back office permuted labels are not managed as "controlled vocabulary" entities, but rather as properties of the original labels.
  We felt it was not up to the export service to "create" entities/resources but to reuse the managed resources/literals
  when providing the closest possible SKOS equivalent.
 
Thanks for your remarks.
 
about PS:
- We still are working on the final form of the documentation.
  but making these considerations public is OK as that will only help to establish best practices and to improve inter-operability.
 
kr, Johan De Smedt.
===================
 


From: Thomas Bandholtz [thomas.bandholtz@...]
Sent: Saturday, 24 October, 2009 23:45
To: Johan De Smedt
Cc: Antoine Isaac; Stella Dextre Clarke; 'Bernard Vatant'
Subject: Re: UMTHES and SKOS-XL

Hi Johan,

looking closer, i see one most interesting detail:

eurovoc.owl has one extended owl:DatatypeProperty(!) of xl:Label named ev:permutedLiteralForm.
This is formally similar to umthes:lexicalVariant: in neither case is the value an xl:Label instance connected by a subproperty of xl:labelRelation.
So there exists no property chain that would infer that either would appear as a skos:hiddenLabel value to a search agent.

Generally I have my doubts that such a property chain would work in practice today; it has not even been formalized ...
Anyway, when I read your rdfs:comment about ev:permutedLiteralForm I am not sure about your intention:

"This property provides a permuted search string for the label.
It is:
- Generated by the back-office system, based on the literal form of the SKOS preferred or alternate label.
- Provided as a hidden SKOS label. {@en}"

When you declare ev:permutedLiteralForm as an owl:DatatypeProperty of xl:Label, no skos:hiddenLabel value can be inferred from it. Is this only an oversight, or some not yet resolved conflict in your model?

In other words: do you want the permutedLiteralForm  to appear as one of the hiddenLabels?

As far as I see: you have the choice, it is not SKOS who makes the decision ;-)

Looking forward to further discussion.

Best,
Thomas


-- 
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491

RE: UMTHES and SKOS-XL

by Johan De Smedt

Hi Thomas,
 
Thanks for the references, analysis and explanation.
You did not miss anything.
However, I want to iterate on some practical considerations
 
I see SKOS primarily as an exchange format not as a maintenance format.
Users of the EUROVOC thesaurus maintenance system manage permuted literals as properties of preferred or alternate xl:Labels.
Hence:
- the genuine managed labels are ok to have a URI that can later be used in an LOD or SPARQL service interface
- permuted literal forms do not have this quality
However, when making a SKOS compliant publication, the hidden label has relevance (as search value)
Hence EUROVOC publishes for a skos:Concept :C
- :C xl:prefLabel :ptC; xl:altLabel :nptC; skos:hiddenLabel "permuted literal form of C" .
- :ptC xl:literalForm "PT of C" .
- :nptC xl:literalForm "nPT of C" .
I think this is compliant with SKOS(XL) - comment is welcome.
 
The details of why
- :nptC is an alt label
- "permuted literal form of C" is a permuted label
can be found in the equivalence relationships (simple or compound) or in the permuted literal forms of either :ptC or :nptC.
This is expressed in the EUROVOC specific SKOS extension (and thus requires knowledge of the owl schema beyond
the formal OWL expressions - i.e. the documentation that goes with the schema). 
 
The selection of something being a PT or an nPT or a permuted label is up to the thesaurus maintenance/management.
It is not always obvious if either an acronym ("OWL") or the full name ("Web Ontology Language") will be used as PT
in a real world thesaurus.  (Like considerations apply for other label relations)
However, once a name is selected as the PT, the related labels are likely (mandatory?) candidates for nPT or hidden labels.
 
As there currently is no SKOS extension capturing such label relations, we are now discussing which approach to take.
I would advocate that for some of the work done on the ISO standardization, it may be worthwhile to do some RDF
standardization effort in the future.
Possible candidates are:
- Concept groups
- Equivalence relationships (simple and compound)
Obviously, the industry may find the schema provided in the ISO standard (UML/XML) sufficient.
 

kr, Johan De Smedt.
===================

 


From: Thomas Bandholtz [mailto:thomas.bandholtz@...]
Sent: Sunday, 25 October, 2009 15:23
To: Johan De Smedt; Johan De Smedt; SKOS
Cc: 'Antoine Isaac'
Subject: Re: UMTHES and SKOS-XL

[skip]

Re: UMTHES and SKOS-XL

by Thomas Bandholtz

Hi Johan,

some considerations inline:

[skip]
I see SKOS primarily as an exchange format not as a maintenance format.

This is a very important issue. I used to see SKOS purely as an exchange format too, but since LOD I understand skosified reference vocabularies as an important building block at runtime. This places no constraint on maintenance other than that the vocabulary has to be serializable in SKOS, so that it meets the expectations of a SKOS-aware application.

Users of the EUROVOC thesaurus maintenance system manage permuted literals as properties of preferred or alternate xl:Labels.
Hence:
- the genuine managed labels are ok to have a URI that can later be used in an LOD or SPARQL service interface
- permuted literal forms do not have this quality
However, when making a SKOS compliant publication, the hidden label has relevance (as search value)
Hence EUROVOC publishes for a skos:Concept :C
- :C xl:prefLabel :ptC; xl:altLabel :nptC; skos:hiddenLabel "permuted literal form of C" .
- :ptC xl:literalForm "PT of C" .
- :nptC xl:literalForm "nPT of C" .
I think this is compliant with SKOS(XL) - comment is welcome.

I don't see any reason why this should not be compliant with SKOS. But it may not express your semantics: You provide prefLabel and altLabel with XL, but hiddenLabel in plain SKOS, so how will you express that some permuted literal form refers to one of the labels? A Concept by itself has no literal form, so I do not understand how it may have a permuted literal form. Is there exactly one permuted literal form per Concept or per Label?
Could you give an example?

 
The details of why
- :nptC is an alt label
- "permuted literal form of C" is a permuted label
can be found in the equivalence relationships (simple or compound) or in the permuted literal forms of either :ptC or :nptC.
This is expressed in the EUROVOC specific SKOS extension (and thus requires knowledge of the owl schema beyond
the formal OWL expressions - i.e. the documentation that goes with the schema).

I understand that "ptC" stands for "preferred term of a Concept" and "nptC" for "non-preferred term of a Concept", right?
The basic ISO equivalence relationship is "preferredTerm USED FOR non-preferredTerm" with inverse "non-preferredTerm USE preferredTerm".
There is no such construct in SKOS. A SKOSXL Label cannot be preferred or not by itself, it only depends on how it is linked to a Concept (pref/altLabel).
(see an example below ...)
Guess that is the reason why you have a EUROVOC specific SKOS extension (which we don't know so far).
I wonder how you express "permuted literal forms of either :ptC or :nptC", when the permuted literal form is an rdf:Literal?
This might be a special case, but xl:labelRelation is intended to point to an xl:Label instance, not to a Literal.


The selection of something being a PT or an nPT or a permuted label is up to the thesaurus maintenance/management.
It is not always obvious if either an acronym ("OWL") or the full name ("Web Ontology Language") will be used as PT
in a real world thesaurus.  (Like considerations apply for other label relations)
However, once a name is selected as the PT, the related labels are likely (mandatory?) candidates for nPT or hidden labels.
 
As there currently is no SKOS extension capturing such label relations, we are now discussing which approach to take.
I would advocate that for some of the work done on the ISO standardization, it may be worthwhile to do some RDF
standardization effort in the future.
Possible candidates are:
- Concept groups

What is the difference from a skos:Collection?

- Equivalence relationships (simple and compound)

SKOS only has altLabel and prefLabel relations from a Concept to a Label.
From this arises the question of whether the same Label may be pref of one Concept and alt of another.
Would this be compliant? Yes (maybe not intentionally).

S13: "skos:prefLabel, skos:altLabel and skos:hiddenLabel are pairwise disjoint properties."
S14: "A resource has no more than one value of skos:prefLabel per language tag."

These only keep you from saying something like:

<Love> skos:prefLabel "love"@en ; skos:prefLabel "adoration"@en .

or

<Love> skos:prefLabel "love"@en ; skos:altLabel "love"@en .

But the following is compliant:

<A> skos:prefLabel "love"@en ; skos:altLabel "adoration"@en .
<B> skos:prefLabel "adoration"@en ; skos:altLabel "love"@en .

Or even more evident in XL:

<A> skosxl:prefLabel :love; skosxl:altLabel :adoration .
<B> skosxl:prefLabel :adoration ; skosxl:altLabel :love .
:love skosxl:literalForm "love"@en .
:adoration skosxl:literalForm "adoration"@en.

SKOS pref/alt of a label is only known in the context of a given Concept, while
ISO pref/nonPref is bound to a given label (~term).
Right?
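
The reading of S13 and S14 above can be checked mechanically. The following Python sketch is only an illustration: triples are plain tuples, literals are (text, language) pairs, and the function name check_labels is an assumption of this example, not part of any SKOS tool.

```python
from collections import defaultdict

LABEL_PROPS = ("skos:prefLabel", "skos:altLabel", "skos:hiddenLabel")


def check_labels(triples):
    """Return S13/S14 violations found in (subject, predicate, (text, lang)) triples."""
    props_by_literal = defaultdict(set)  # (resource, literal) -> label properties used
    prefs_by_lang = defaultdict(set)     # (resource, lang) -> distinct prefLabel texts
    for s, p, o in triples:
        if p not in LABEL_PROPS:
            continue
        text, lang = o
        props_by_literal[(s, o)].add(p)
        if p == "skos:prefLabel":
            prefs_by_lang[(s, lang)].add(text)
    problems = []
    # S13: the three label properties are pairwise disjoint.
    for (s, o), props in props_by_literal.items():
        if len(props) > 1:
            problems.append(("S13", s, o))
    # S14: at most one prefLabel per resource and language tag.
    for (s, lang), texts in prefs_by_lang.items():
        if len(texts) > 1:
            problems.append(("S14", s, lang))
    return problems


two_prefs = [
    ("<Love>", "skos:prefLabel", ("love", "en")),
    ("<Love>", "skos:prefLabel", ("adoration", "en")),
]
pref_and_alt = [
    ("<Love>", "skos:prefLabel", ("love", "en")),
    ("<Love>", "skos:altLabel", ("love", "en")),
]
crossed = [
    ("<A>", "skos:prefLabel", ("love", "en")),
    ("<A>", "skos:altLabel", ("adoration", "en")),
    ("<B>", "skos:prefLabel", ("adoration", "en")),
    ("<B>", "skos:altLabel", ("love", "en")),
]

print(check_labels(two_prefs))
print(check_labels(pref_and_alt))
print(check_labels(crossed))
```

The first two data sets are flagged (S14 and S13 respectively), while the crossed pref/alt assignment over two concepts passes, because both constraints are scoped to a single resource.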

If you want to have ISO equivalence in SKOS you may express something like:

prefTerm subClassOf xl:Label .
nonPrefTerm subClassOf xl:Label .
prefTerm disjointWith nonPrefTerm .

xl:prefLabel range prefTerm .
xl:altLabel range nonPrefTerm .


usedFor subPropertyOf xl:labelRelation;
    domain prefTerm;
    range nonPrefTerm;
    inverseOf use .


and then:

love a prefTerm .
adoration a nonPrefTerm .
love usedFor adoration.


Obviously, the industry may find the schema provided in the ISO standard (UML/XML) sufficient.

? I do not really understand this.

-- 
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491

RE: UMTHES and SKOS-XL

by Johan De Smedt

Hi Thomas,
 
Please find in-line and prefixed with ">>JDS-2:" clarifications on your added questions.
In the examples, "ev:" stands for the EUROVOC schema prefix.
 

kr, Johan De Smedt.
===================

 


From: Thomas Bandholtz [mailto:thomas.bandholtz@...]
Sent: Sunday, 25 October, 2009 20:13
To: Johan De Smedt
Cc: 'SKOS'; 'Antoine Isaac'
Subject: Re: UMTHES and SKOS-XL

Hi Johan,

some considerations inline:

[skip]
I see SKOS primarily as an exchange format not as a maintenance format.

This is a very important issue. I used to see SKOS purely as an exchange format too, but since LOD I understand skosified reference vocabularies as an important building block at runtime. This places no constraint on maintenance other than that the vocabulary has to be serializable in SKOS, so that it meets the expectations of a SKOS-aware application.

Users of the EUROVOC thesaurus maintenance system manage permuted literals as properties of preferred or alternate xl:Labels.
Hence:
- the genuine managed labels are ok to have a URI that can later be used in an LOD or SPARQL service interface
- permuted literal forms do not have this quality
However, when making a SKOS compliant publication, the hidden label has relevance (as search value)
Hence EUROVOC publishes for a skos:Concept :C
- :C xl:prefLabel :ptC; xl:altLabel :nptC; skos:hiddenLabel "permuted literal form of C" .
- :ptC xl:literalForm "PT of C" .
- :nptC xl:literalForm "nPT of C" .
I think this is compliant with SKOS(XL) - comment is welcome.

I don't see any reason why this should not be compliant with SKOS. But it may not express your semantics:  
>>JDS-2: indeed, not all semantics are expressed; that is what I try to explain below.
 You provide prefLabel and altLabel with XL, but hiddenLabel in plain SKOS, so how will you express that some permuted literal form refers to one of the labels? A Concept by itself has no literal form, so I do not understand how it may have a permuted literal form. Is there exactly one permuted literal form per Concept or per Label?
Could you give an example?
 >>JDS-2: example:
C stands for the concept with preferred term "child abuse"
:C xl:prefLabel :childAbuse .
:childAbuse xl:literalForm "child abuse"@en .
:childAbuse ev:permutedLiteralForm "abuse, child"@en .
For this, the EUROVOC publishing service generating SKOS will additionally generate:
:C skos:prefLabel "child abuse"@en .
:C skos:hiddenLabel "abuse, child"@en .
This is based on the following two (informally noted) rules that go with the EUROVOC schema
- A chain xl:prefLabel([Concept][Term]) o ev:permutedLiteralForm([Term][literal]) → skos:hiddenLabel([Concept][literal]) .
- A chain xl:altLabel([Concept][Term]) o ev:permutedLiteralForm([Term][literal]) → skos:hiddenLabel([Concept][literal]) .
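
The two informal rules above can be sketched as a small materialisation step. This Python fragment is only an illustration of the rules as stated in this mail, not the actual EUROVOC export service; the function name export_hidden_labels and the tuple representation of triples and language-tagged literals are assumptions of the example.

```python
def export_hidden_labels(triples):
    """Generate skos:hiddenLabel triples from permuted forms of pref/alt labels."""
    # Index permuted literal forms by the label (term) that carries them.
    permuted = {}
    for s, p, o in triples:
        if p == "ev:permutedLiteralForm":
            permuted.setdefault(s, []).append(o)
    # Chain concept -> label -> permuted literal for both prefLabel and altLabel.
    generated = []
    for concept, p, label in triples:
        if p in ("xl:prefLabel", "xl:altLabel"):
            for literal in permuted.get(label, []):
                triple = (concept, "skos:hiddenLabel", literal)
                if triple not in generated:
                    generated.append(triple)
    return generated


data = [
    (":C", "xl:prefLabel", ":childAbuse"),
    (":childAbuse", "xl:literalForm", ("child abuse", "en")),
    (":childAbuse", "ev:permutedLiteralForm", ("abuse, child", "en")),
]

print(export_hidden_labels(data))
```

Because the chain ends in a literal, it is applied here as an export-time rule rather than declared as an OWL 2 property chain, which matches the approach described in the thread.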
 

 
The details of why
- :nptC is an alt label
- "permuted literal form of C" is a permuted label
can be found in the equivalence relationships (simple or compound) or in the permuted literal forms of either :ptC or :nptC.
This is expressed in the EUROVOC specific SKOS extension (and thus requires knowledge of the owl schema beyond
the formal OWL expressions - i.e. the documentation that goes with the schema).

I understand that "ptC" stands for "preferred term of a Concept" and "nptC" for "non-preferred term of a Concept", right?  >>JDS-2: yes. 
The basic ISO equivalence relationship is "preferredTerm USED FOR non-preferredTerm" with inverse "non-preferredTerm USE preferredTerm".
There is no such construct in SKOS. A SKOSXL Label cannot be preferred or not by itself, it only depends on how it is linked to a Concept (pref/altLabel).
(see an example below ...)
Guess that is the reason why you have a EUROVOC specific SKOS extension (which we don't know so far).
I wonder how you express "permuted literal forms of either :ptC or :nptC", when the permuted literal form is an rdf:Literal?
>>JDS-2: an xl:Label may have an arbitrary number of ev:permutedLiteralForm values. This is a data property (like xl:literalForm).
>>JDS-2: in contrast though, ev:permutedLiteralForm has no cardinality constraints.
>>JDS-2: further, for any xl:Label :L, its xl:literalForm and all its ev:permutedLiteralForm values must have the same language.
This might be a special case, but xl:labelRelation is intended to point to an xl:Label instance, not to a Literal.


The selection of something being a PT or an nPT or a permuted label is up to the thesaurus maintenance/management.
It is not always obvious if either an acronym ("OWL") or the full name ("Web Ontology Language") will be used as PT
in a real world thesaurus.  (Like considerations apply for other label relations)
However, once a name is selected as the PT, the related labels are likely (mandatory?) candidates for nPT or hidden labels.
 
As there currently is no SKOS extension capturing such label relations, we are now discussing which approach to take.
I would advocate that for some of the work done on the ISO standardization, it may be worthwhile to do some RDF
standardization effort in the future.
Possible candidates are:
- Concept groups

What is the difference from a skos:Collection? 
>>JDS-2: group means any subset of concepts, while collections were meant to represent "node labels" and "facets".
>>JDS-2: I think Stella and Antoine are better placed to respond to this accurately. 

- Equivalence relationships (simple and compound)

SKOS only has altLabel and prefLabel relations from a Concept to a Label.
From this arises the question of whether the same Label may be pref of one Concept and alt of another.
Would this be compliant? Yes (maybe not intentionally).

S13: "skos:prefLabel, skos:altLabel and skos:hiddenLabel are pairwise disjoint properties."
S14: "A resource has no more than one value of skos:prefLabel per language tag."

These only keep you from saying something like:

<Love> skos:prefLabel "love"@en ; skos:prefLabel "adoration"@en .

or

<Love> skos:prefLabel "love"@en ; skos:altLabel "love"@en .

But the following is compliant:

<A> skos:prefLabel "love"@en ; skos:altLabel "adoration"@en .
<B> skos:prefLabel "adoration"@en ; skos:altLabel "love"@en .

Or even more evident in XL:

<A> skosxl:prefLabel :love; skosxl:altLabel :adoration .
<B> skosxl:prefLabel :adoration ; skosxl:altLabel :love .
:love skosxl:literalForm "love"@en .
:adoration skosxl:literalForm "adoration"@en.

SKOS pref/alt of a label is only known in the context of a given Concept, while
ISO pref/nonPref is bound to a given label (~term).
Right? 
>>JDS-2: I agree and this makes ISO Thesaurus semantics more strict than SKOS (as you demonstrate in the example below). 

If you want to have ISO equivalence in SKOS you may express something like:

prefTerm subClassOf xl:Label .
nonPrefTerm subClassOf xl:Label .
prefTerm disjointWith nonPrefTerm .

xl:prefLabel range prefTerm .
xl:altLabel range nonPrefTerm . 
>>JDS-2: I do not agree with these last two rules, as they would redefine SKOS-XL.
 
>>JDS-2: Instead we define ev:EquivalenceRelation relating a prefTerm and a nonPrefTerm using properties ev:use and ev:uf respectively.
>>JDS-2: ev:use and ev:uf have range prefTerm and nonPrefTerm respectively.
>>JDS-2: then we say that :C xl:altLabel :nptC is entailed by:
>>JDS-2: :C xl:prefLabel :ptC.
>>JDS-2: :eqr rdf:type ev:EquivalenceRelation ; ev:use :ptC ; ev:uf :nptC.

usedFor subPropertyOf xl:labelRelation;
    domain prefTerm;
    range nonPrefTerm;
    inverseOf use .


and then:

love a prefTerm .
adoration a nonPrefTerm .
love usedFor adoration.
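
The entailment described in the JDS-2 comments above (an equivalence relation with ev:use and ev:uf implying xl:altLabel on the concept) can be sketched as follows. This is a hedged illustration only: the function name entail_alt_labels and the tuple representation are assumptions, and the sketch assumes each equivalence relation has a single ev:use value.

```python
def entail_alt_labels(triples):
    """Infer xl:altLabel triples from ev:EquivalenceRelation instances."""
    relations = {s for s, p, o in triples
                 if p == "rdf:type" and o == "ev:EquivalenceRelation"}
    use = {}  # relation -> preferred term (ev:use), assumed single-valued
    uf = {}   # relation -> non-preferred terms (ev:uf)
    for s, p, o in triples:
        if s in relations and p == "ev:use":
            use[s] = o
        elif s in relations and p == "ev:uf":
            uf.setdefault(s, []).append(o)
    inferred = set()
    for concept, p, term in triples:
        if p != "xl:prefLabel":
            continue
        # Every relation whose preferred term is this concept's prefLabel
        # contributes its non-preferred terms as altLabels.
        for relation, preferred in use.items():
            if preferred == term:
                for non_preferred in uf.get(relation, []):
                    inferred.add((concept, "xl:altLabel", non_preferred))
    return inferred


data = [
    (":C", "xl:prefLabel", ":ptC"),
    (":eqr", "rdf:type", "ev:EquivalenceRelation"),
    (":eqr", "ev:use", ":ptC"),
    (":eqr", "ev:uf", ":nptC"),
]

print(entail_alt_labels(data))
```

Unlike the range axioms on xl:prefLabel and xl:altLabel, this keeps the ISO-style distinction in the extension and leaves the SKOS-XL properties themselves untouched.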


Obviously, the industry may find the schema provided in the ISO standard (UML/XML) sufficient.

? I do not really understand this. 
>>JDS-2: I mean the ISO standard went a long way:
>>JDS-2: The BS preparing it defined an XML schema for Thesauri. 
>>JDS-2: This covered more than SKOS (this statement is scoped to thesaurus).
>>JDS-2: In addition a model was defined using the Unified Modeling Language (UML).
>>JDS-2: Likewise this model has more specific thesaurus artifacts. 

-- 
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491

Label management information in SKOS-XL (continuing from UMTHES and SKOS-XL)

by Christophe Dupriez

Hi!

I was asked today how I would keep track of a label's source (and other management information) in SKOS...
Fortunately, SKOS-XL reifies labels.

Looking at ISO 25964, I have:
Attributes of ThesaurusTerm:
Attribute    Type     Card Description
lexicalValue String      1 The wording of the term
identifier   String      1 A unique identifier for the term
created      date     0..1 The date when the term was created
modified     date     0..1 The date when the term was last modified
source       String   0..1 The person(s) or document(s) from which the term was taken
status       String   0..1 Indication of whether the term is candidate, approved, etc.
lang         language 0..1 A code showing the language of the term. This should be
                           included if the thesaurus supports more than one language
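
As a rough illustration of how these attributes might be carried on a reified label, here is a Python sketch. The mapping is purely an assumption for discussion: neither ISO 25964 nor SKOS-XL prescribes it. The choice of Dublin Core terms (dct:identifier, dct:created, dct:modified, dct:source), the ex:status property and the ex: node naming are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ThesaurusTerm:
    """The ThesaurusTerm attributes listed above; optional ones default to None."""
    lexical_value: str
    identifier: str
    created: Optional[date] = None
    modified: Optional[date] = None
    source: Optional[str] = None
    status: Optional[str] = None
    lang: Optional[str] = None

    def to_triples(self):
        """Map the term to SKOS-XL style triples (hypothetical mapping)."""
        node = "ex:" + self.identifier  # hypothetical URI scheme
        triples = [
            (node, "rdf:type", "skosxl:Label"),
            (node, "skosxl:literalForm", (self.lexical_value, self.lang)),
            (node, "dct:identifier", self.identifier),
        ]
        optional = [
            ("dct:created", self.created),
            ("dct:modified", self.modified),
            ("dct:source", self.source),
            ("ex:status", self.status),  # hypothetical property
        ]
        # Only emit the optional attributes that are actually set (cardinality 0..1).
        triples += [(node, p, v) for p, v in optional if v is not None]
        return triples


term = ThesaurusTerm("child abuse", "t1", status="approved", lang="en")
for triple in term.to_triples():
    print(triple)
```

An agreed application profile would have to fix exactly these choices, so that SKOS-aware applications know where to find the management data without per-thesaurus configuration.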

The subject may be touchy, but one would like to see a standardized way to achieve SKOS / ISO 25964 interchangeability
(something like a SKOS/ISO application profile).

This would allow:
1) An ISO 25964 thesaurus editor could unload/reload using SKOS files without information losses.
2) Exchanges between different ISO 25964 thesauri could be done using the SKOS format.
3) SKOS-aware applications could support ISO 25964 "extensions" without parameterization to indicate which RDF attributes contain the supplementary ISO data.
4) SKOS would benefit from the insights of ISO 25964 design team.

There is a rather striking difference between SKOS's flexibility and extensibility (opening ways to unstandardized horizons) and ISO's willingness to build upon the past within a stricter frame.

What I am suggesting is to check (and to normalize somewhat) how the complete data model of ISO can be mapped in SKOS(-XL).

By the way, label reification opens the way to labels that are composed from multiple coordinated concepts.
A reified label of a coordination concept could include an rdf:Seq.
This rdf:Seq would contain strings and/or refer to (reified) labels from the different coordinated concepts and/or refer to coordination operators (conjunctions).
This could generate a dynamic literalForm based on the labels of the different coordinated concepts.
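
To make the idea concrete, here is a rough sketch of generating a literal form from an ordered sequence of parts. Everything in it is an assumption made for illustration: the sequence stands in for an rdf:Seq, "LabelRef" is a hypothetical pointer to a reified label, the label identifiers are invented, and parts are simply joined with spaces.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class LabelRef:
    """Hypothetical reference to a reified label of a coordinated concept."""
    label_id: str


# A sequence member is either a plain string (e.g. a conjunction) or a label reference.
Part = Union[str, LabelRef]


def render_literal_form(parts: List[Part], labels: Dict[str, str]) -> str:
    """Build the literal form by resolving each label reference in order."""
    rendered = []
    for part in parts:
        if isinstance(part, LabelRef):
            rendered.append(labels[part.label_id])
        else:
            rendered.append(part)
    return " ".join(rendered)


labels = {":abuse": "abuse", ":children": "children"}
seq = [LabelRef(":abuse"), "of", LabelRef(":children")]
print(render_literal_form(seq, labels))  # abuse of children
```

Because the literal form is computed, renaming the label of one coordinated concept changes the compound label without editing it by hand.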

Have a nice evening,

Christophe


Johan De Smedt wrote:
Hi Thomas,
 
Please find in-line and prefixed with ">>JDS-2:" clarifications on your added questions.
In the examples, "ev:" stands for the EUROVOC schema prefix.
 

kr, Johan De Smedt.
===================

 


From: Thomas Bandholtz [thomas.bandholtz@...]
Sent: Sunday, 25 October, 2009 20:13
To: Johan De Smedt
Cc: 'SKOS'; 'Antoine Isaac'
Subject: Re: UMTHES and SKOS-XL

Hi Johan,

some considerations inline:

[skip]
I see SKOS primarily as an exchange format not as a maintenance format.

This is a very important issue. I used to see SKOS purely as an exchange format as well, but since LOD I understand skosified reference vocabularies as an important building block at runtime. This constrains maintenance only in that the vocabulary has to be serializable in SKOS, so that it meets the expectations of a SKOS-aware application.

Users of the EUROVOC thesaurus maintenance system manage permuted literals as properties of preferred or alternate xl:Labels.
Hence:
- the genuine managed labels are ok to have a URI that can later be used in an LOD or SPARQL service interface
- permuted literal forms do not have this quality
However, when making a SKOS compliant publication, the hidden label has relevance (as search value)
Hence EUROVOC publishes for a skos:Concept :C
- :C xl:prefLabel :ptC; xl:altLabel :nptC; skos:hiddenLabel "permuted literal form of C" .
- :ptC xl:literalForm "PT of C" .
- :nptC xl:literalForm "nPT of C" .
I think this is compliant with SKOS(XL) - comment is welcome.

I don't see any reason why this should not be compliant with SKOS. But it may not express your semantics:  
>>JDS-2: indeed, not all semantics are expressed; that is what I try to explain below.
 You provide prefLabel and altLabel with XL, but hiddenLabel in plain SKOS, so how will you express that some permuted literal form refers to one of the labels? A Concept by itself has no literal form, so I do not understand how it may have a permuted literal form. Is there exactly one permuted literal form per Concept or per Label?
Could you give an example?
 >>JDS-2: example:
C stands for the concept with preferred term "child abuse"
:C xl:prefLabel :childAbuse .
:childAbuse xl:literalForm "child abuse"@... .
:childAbuse ev:permutedLiteralForm "abuse, child"@... .
For this, the EUROVOC publishing service generating SKOS will generate in addition
:C skos:prefLabel "child abuse"@... .
:C skos:hiddenLabel "abuse, child"@...
This is based on the following two (informally noted) rules that go with the EUROVOC schema
- A chain xl:prefLabel([Concept][Term]) o ev:permutedLiteralForm([Term][literal]) → skos:hiddenLabel([Concept][literal]) .
- A chain xl:altLabel([Concept][Term]) o ev:permutedLiteralForm([Term][literal]) → skos:hiddenLabel([Concept][literal]) .
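
The two chain rules above could be applied mechanically along these lines. This is only a sketch under assumptions: triples are plain tuples of strings rather than a real RDF store, and the "@en" language tag is assumed because the archive elides the original tag.

```python
def apply_hidden_label_chains(triples):
    """Infer skos:hiddenLabel via the two informally noted EUROVOC chain rules.

    xl:prefLabel o ev:permutedLiteralForm -> skos:hiddenLabel
    xl:altLabel  o ev:permutedLiteralForm -> skos:hiddenLabel
    """
    triples = set(triples)
    inferred = set()
    for concept, first_prop, label in triples:
        if first_prop not in ("xl:prefLabel", "xl:altLabel"):
            continue
        for subject, second_prop, literal in triples:
            if subject == label and second_prop == "ev:permutedLiteralForm":
                inferred.add((concept, "skos:hiddenLabel", literal))
    # Return only the statements that were not already asserted.
    return inferred - triples


data = [
    (":C", "xl:prefLabel", ":childAbuse"),
    (":childAbuse", "xl:literalForm", '"child abuse"@en'),          # "@en" is assumed
    (":childAbuse", "ev:permutedLiteralForm", '"abuse, child"@en'),  # "@en" is assumed
]
for triple in sorted(apply_hidden_label_chains(data)):
    print(*triple, ".")
```

Note that the inferred hidden label hangs off the concept only; the link back to the xl:Label it was permuted from is not recoverable from the plain SKOS output.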
 

 
The details of why
- :nptC is an alt label
- "permuted literal form of C" is a permuted label
can be found in the equivalence relationships (simple or compound) or in the permuted literal forms of either :ptC or :nptC.
This is expressed in the EUROVOC specific SKOS extension (and thus requires knowledge of the owl schema beyond
the formal OWL expressions - i.e. the documentation that goes with the schema).

I understand that "ptC" stands for "preferred term of a Concept" and "nptC" for "non-preferred term of a Concept", right?  >>JDS-2: yes. 
The basic ISO equivalence relationship is "preferredTerm USED FOR non-preferredTerm" with inverse "non-preferredTerm USE preferredTerm".
There is no such construct in SKOS. A SKOSXL Label cannot be preferred or not by itself, it only depends on how it is linked to a Concept (pref/altLabel).
(see an example below ...)
Guess that is the reason why you have a EUROVOC specific SKOS extension (which we don't know so far).
I wonder how you express "permuted literal forms of either :ptC or :nptC", when the permuted literal form is an rdf:Literal? 
>>JDS-2: an xl:Label may have an arbitrary number of ev:permutedLiteralForm.  This is a data property (like xl:literalForm).
>>JDS-2: in contrast though, ev:permutedLiteralForm has no cardinality constraints.
>>JDS-2: further, for any xl:Label :L, its property xl:literalForm and all its ev:permutedLiteralForm must have the same language. 
This might be a special case, but xl:labelRelation is intended to point to an xl:Label instance, not to a Literal.


The selection of something being a PT or an nPT or a permuted label is up to the thesaurus maintenance/management.
It is not always obvious if either an acronym ("OWL") or the full name ("Web Ontology Language") will be used as PT
in a real world thesaurus.  (Like considerations apply for other label relations)
However, once a name is selected as the PT, the related labels are likely (mandatory?) candidates for nPT or hidden labels.
 
As there currently is no SKOS extension capturing such label relations, we are now discussing which approach to take.
I would advocate that for some of the work done on the ISO standardization, it may be worthwhile to do some RDF
standardization effort in the future.
Possible candidates are:
- Concept groups

What is the difference from a skos:Collection? 
>>JDS-2: group means any subset of concepts while collections were aimed to represent "node labels" and "facets".
>>JDS-2: I think Stella and Antoine are better placed to respond to this accurately. 

- Equivalence relationships (simple and compound)

SKOS only has altLabel prefLabel relations from a Concept to a Label.
From this arises the question whether the same Label may be pref of one Concept and alt of another.
Would this be compliant? Yes (maybe not intentionally).

S13: "skos:prefLabel, skos:altLabel and skos:hiddenLabel are pairwise disjoint properties."
S14: "A resource has no more than one value of skos:prefLabel per language tag."

These only keep you from saying something like:

<Love> skos:prefLabel "love"@en ; skos:prefLabel "adoration"@en .

or

<Love> skos:prefLabel "love"@en ; skos:altLabel "love"@en .

But the following is compliant:

<A> skos:prefLabel "love"@en ; skos:altLabel "adoration"@en .
<B> skos:prefLabel "adoration"@en ; skos:altLabel "love"@en .

Or even more evident in XL:

<A> skosxl:prefLabel :love; skosxl:altLabel :adoration .
<B> skosxl:prefLabel :adoration ; skosxl:altLabel :love .
:love skosxl:literalForm "love"@en .
:adoration skosxl:literalForm "adoration"@en.

SKOS pref/alt of a label is only known in the context of a given Concept, while
ISO pref/nonPref is bound to a given label (~term).
Right? 
>>JDS-2: I agree and this makes ISO Thesaurus semantics more strict than SKOS (as you demonstrate in the example below). 
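
A small sketch of what S13 and S14 do and do not rule out, using the love/adoration examples above. It is an assumption-laden illustration: labels are (text, language) pairs in plain tuples, and the check covers only these two constraints, not SKOS integrity as a whole.

```python
from collections import defaultdict

LABEL_PROPS = ("skos:prefLabel", "skos:altLabel", "skos:hiddenLabel")


def check_labels(triples):
    """Return sorted (rule, resource) pairs for S13 / S14 violations.

    Each triple is (resource, property, (text, lang)).
    """
    props_by_literal = defaultdict(set)   # (resource, text, lang) -> label properties
    pref_by_lang = defaultdict(set)       # (resource, lang) -> preferred texts
    for resource, prop, (text, lang) in triples:
        if prop not in LABEL_PROPS:
            continue
        props_by_literal[(resource, text, lang)].add(prop)
        if prop == "skos:prefLabel":
            pref_by_lang[(resource, lang)].add(text)

    problems = set()
    for (resource, _text, _lang), props in props_by_literal.items():
        if len(props) > 1:                # same literal under two label properties
            problems.add(("S13", resource))
    for (resource, _lang), texts in pref_by_lang.items():
        if len(texts) > 1:                # two preferred labels in one language
            problems.add(("S14", resource))
    return sorted(problems)


compliant = [
    ("<A>", "skos:prefLabel", ("love", "en")),
    ("<A>", "skos:altLabel", ("adoration", "en")),
    ("<B>", "skos:prefLabel", ("adoration", "en")),
    ("<B>", "skos:altLabel", ("love", "en")),
]
print(check_labels(compliant))  # []
```

Both constraints are scoped to a single resource, which is why the same string can be preferred for one concept and alternative for another without any violation.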

If you want to have ISO equivalence in SKOS you may express something like:

prefTerm subClassOf xl:Label .
nonPrefTerm subClassOf xl:Label .
prefTerm disjointWith nonPrefTerm .

xl:prefLabel range prefTerm .
xl:altLabel range nonPrefTerm . 
>>JDS-2: I do not follow with these last 2 rules as they would redefine SKOS-XL.
 
>>JDS-2: Instead we define ev:EquivalenceRelation relating a prefTerm and a nonPrefTerm using properties ev:use and ev:uf respectively.
>>JDS-2: ev:use and ev:uf do have range prefTerm and nonPrefTerm respectively.
>>JDS-2: then we say that :C xl:altLabel :nptC is entailed by:
>>JDS-2: :C xl:prefLabel :ptC.
>>JDS-2: :eqr rdf:type ev:EquivalenceRelation ; ev:use :ptC ; ev:uf :nptC .

usedFor subPropertyOf xl:labelRelation;
    domain prefTerm;
    range nonPrefTerm;
    inverseOf use .


and then:

love a prefTerm .
adoration a nonPrefTerm .
love usedFor adoration .
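
The entailment described in the JDS-2 comments above (an xl:altLabel statement following from an xl:prefLabel statement plus an equivalence relation) might be sketched like this. It is an illustration only: triples are plain tuples, and the ev: names are used as they appear in this thread, not checked against the actual EUROVOC schema.

```python
def entail_alt_labels(triples):
    """Infer ':C xl:altLabel :nptC' from ':C xl:prefLabel :ptC' and an
    ev:EquivalenceRelation whose ev:use is :ptC and whose ev:uf is :nptC."""
    triples = set(triples)
    relations = {s for s, p, o in triples
                 if p == "rdf:type" and o == "ev:EquivalenceRelation"}
    inferred = set()
    for relation in relations:
        preferred = {o for s, p, o in triples if s == relation and p == "ev:use"}
        non_preferred = {o for s, p, o in triples if s == relation and p == "ev:uf"}
        for concept, prop, label in triples:
            if prop == "xl:prefLabel" and label in preferred:
                for term in non_preferred:
                    inferred.add((concept, "xl:altLabel", term))
    # Return only the statements that were not already asserted.
    return inferred - triples


data = [
    (":C", "xl:prefLabel", ":ptC"),
    (":eqr", "rdf:type", "ev:EquivalenceRelation"),
    (":eqr", "ev:use", ":ptC"),
    (":eqr", "ev:uf", ":nptC"),
]
print(entail_alt_labels(data))  # {(':C', 'xl:altLabel', ':nptC')}
```

Unlike adding range restrictions to xl:prefLabel and xl:altLabel, this keeps SKOS-XL itself untouched: the extra semantics live entirely in the extension vocabulary.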


Obviously, the industry may find the schema provided in the ISO standard (UML/XML) sufficient.

? I do not really understand this. 
>>JDS-2: I mean the ISO standard went a long way:
>>JDS-2: The BS preparing it defined an XML schema for Thesauri. 
>>JDS-2: This covered more than SKOS (this statement is scoped to thesaurus).
>>JDS-2: In addition a model was defined using the Unified Modeling Language (UML).
>>JDS-2: Likewise this model has more specific thesaurus artifacts. 

-- 
Thomas Bandholtz, thomas.bandholtz@..., http://www.innoq.com 
innoQ Deutschland GmbH, Halskestr. 17, D-40880 Ratingen, Germany
Phone: +49 228 9288490 Mobile: +49 178 4049387 Fax: +49 228 9288491
  



RE: Label management information in SKOS-XL (continuing from UMTHES and SKOS-XL)

by Johan De Smedt

Hi Christophe, I provided some in-line considerations
 
nice evening

kr, Johan De Smedt.
===================

 


From: Christophe Dupriez [mailto:christophe.dupriez@...]
Sent: Wednesday, 04 November, 2009 22:06
To: Johan De Smedt
Cc: 'Thomas Bandholtz'; 'SKOS'; 'Antoine Isaac'; Dominique Vanpée
Subject: Label management information in SKOS-XL (continuing from UMTHES and SKOS-XL)

Hi!

I was asked today how I would keep track of a label's source (and other management information) in SKOS...
>>>JDS-3: dc:source applied would seem a nice candidate
Fortunately, SKOS-XL reifies labels.

Looking at ISO 25964, I have:
Attributes of ThesaurusTerm:
LexicalValue String      1 The wording of the term
identifier   String      1 A unique identifier for the term
created      date     0..1 The date when the term was created
modified     date     0..1 The date when the term was last modified
source       String   0..1 The person(s) or document(s) from which the term was taken
Status       String   0..1 Indication of whether the term is candidate, approved, etc.
lang         language 0..1 A code showing the language of the term. This should be
                           included if the thesaurus supports more than one language

The subject may be touchy, but one would like to see a standardized way to achieve SKOS / ISO 25964 interchangeability
(something like a SKOS/ISO application profile).

This would allow:
1) An ISO 25964 thesaurus editor could unload/reload using SKOS files without information losses. 
>>>JDS-3: I think it is feasible to write a SKOS extension that captures the formal model of the ISO standard.
>>>JDS-3: on an earlier version of SKOS-XL, I did an exercise some time ago to cover BS8723 (which was input for the ISO standard)
>>>JDS-3: note this exercise was never discussed on any forum yet 
2) Exchanges between different ISO 25964 thesauri could be done using the SKOS format.
3) SKOS-aware applications could support ISO 25964 "extensions" without parameterization to indicate which RDF attributes contain the supplementary ISO data.
4) SKOS would benefit from the insights of ISO 25964 design team. 
>>>JDS-3: I support these considerations.  It would provide a formal guideline for SKOS - ISO thesaurus transformation.

There is a rather striking difference between SKOS's flexibility and extensibility (opening ways to unstandardized horizons) and ISO's willingness to build upon the past within a stricter frame.

What I am suggesting is to check (and to normalize somewhat) how the complete data model of ISO can be mapped in SKOS(-XL).

By the way, label reification opens the way to labels that are composed from multiple coordinated concepts.
A reified label of a coordination concept could include an rdf:Seq.
This rdf:Seq would contain strings and/or refer to (reified) labels from the different coordinated concepts and/or refer to coordination operators (conjunctions).
This could generate a dynamic literalForm based on the labels of the different coordinated concepts.

Have a nice evening,

Christophe




Re: Label management information in SKOS-XL (continuing from UMTHES and SKOS-XL)

by Leonard Will

On Wed, 4 Nov 2009 at 23:17:06, Johan De Smedt
<johan.de-smedt@...> wrote

>Hi Christophe, I provided some in-line considerations
>
>From: Christophe Dupriez [mailto:christophe.dupriez@...]
>Sent: Wednesday, 04 November, 2009 22:06
>
>The subject may be touchy, but one would like to see a standardized way
>to achieve SKOS / ISO 25964 interchangeability (something like a SKOS/ISO
>application profile).
>
>This would allow:

>1) An ISO 25964 thesaurus editor could unload/reload using SKOS files
>without information losses.

>>>>JDS-3: I think it is feasible to write a SKOS extension that
>>>>captures the formal model of the ISO standard.

>>>>JDS-3: on an earlier version of SKOS-XL, I made an exercise some
>>>>time ago to cover the BS8723 (which was input for the ISO standard)

>>>>JDS-3: note this exercise was never discussed on any forum yet

>2) Exchanges between different ISO 25964 thesauri could be done using the
>SKOS format.

>3) SKOS-aware applications could support ISO 25964 "extensions" without
>parameterization to indicate which RDF attributes contain the supplementary
>ISO data.

>4) SKOS would benefit from the insights of ISO 25964 design team.

>>>>JDS-3: I support these considerations.  It would provide a formal
>>>>guideline for SKOS - ISO thesaurus transformation.

>There is a rather striking difference between SKOS's flexibility and
>extensibility (opening ways to unstandardized horizons) and ISO's
>willingness to build upon the past within a stricter frame.
>
>What I am suggesting is to check (and to normalize somewhat) how the
>complete data model of ISO can be mapped in SKOS(-XL).

I am encouraged that this issue has been opened again, because as a
member of the ISO 25964 working party, I am keen to resolve any
divergence between SKOS and that standard. Part 1 of that standard is
still in draft, but will soon be circulated to national standardising
bodies for comment - they may be willing to supply copies to interested
parties, for a price :-(

Some of the issues were discussed in my message of 13th February 2009
and subsequent discussion

<http://lists.w3.org/Archives/Public/public-esw-thes/2009Feb/0033.html>

The UML model has been slightly modified since then, and I attach a copy
of the latest version of the class diagram below. There are notes in the
standard which give more detail, but due to ISO copyright restrictions I
am very sorry not to be able to make them available; I shall however do
my best to clarify any further points if anyone asks.

>By the way, label reification opens the way to labels that are
>composed from multiple coordinated concepts.

>A reified label of a coordination concept could include an rdf:Seq.
>This rdf:Seq would contain strings and/or refer to (reified) labels from
>the different coordinated concepts and/or refer to coordination operators
>(conjunctions).

>This could generate a dynamic literalForm based on the labels of the
>different coordinated concepts.

This is the main element that is missing to allow the model to represent
classification schemes and other forms of pre-coordinated knowledge
organisation schemes. Such schemes typically have classes which
represent compound concepts, in which concepts from more than one facet
are combined, such as an activity and the people who carry out that
activity. When changes of facet occur within a classification hierarchy,
the relationship is one of synthesis rather than of subordination, and
neither SKOS nor the ISO model yet provide for this.

I would like to see this added to our model, and I think that it will
probably involve a solution on the lines that Christophe suggests above.
It is not just a case of combining labels, though; presumably we have to
treat the compound concept as a type of concept in the model, so that we
have concepts which are made up of compounds of other, simpler,
concepts, combined in a specified sequence, possibly with coordination
operators, and with corresponding labels. Would anyone like to try
adding this to the model below?

Coordination of concepts has previously been discussed during the
development of SKOS but was not followed through because it was "too
hard" to deal with within the time available, e.g.

<http://www.w3.org/2004/02/skos/core/proposals.html#coordination-8>
<http://www.w3.org/2006/07/SWD/track/issues/40>

Can we look at it again now?

Regards

Leonard Will

--
Willpower Information     (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants            Tel: +44 (0)20 8372 0092
27 Calshot Way                              L.Will@...
ENFIELD                                Sheena.Will@...
EN2 7BQ, UK                            http://www.willpowerinfo.co.uk/



[Attachment: Model_2009-08-31.jpg (386K)]