[geni-dev] CF Requirements: 2) Identity Vocabulary

by Max Ott-2

OK, let me start.

On 14/03/2009, at 8:59 AM, Harry Mussman wrote:
>

> 2a)  During the discussion, Larry Lannom of CNRI made the point that a
> system like GENI needs a precise vocabulary or ontology,  that is
> shared by all suites. (This is absolutely essential when multiple GENI
> suites that are federated together, as expected.) This will apply to
> principals, aggregates and slices.

I fully agree, and I have been arguing that a taxonomic approach, as
taken in RSpec, is insufficient. We need an ontology not only to have a
precise way of describing things (the first basic requirement for
achieving repeatability) but also to describe RELATIONSHIPS and
constraints among them. You can shoehorn all that into a taxonomy or,
to be less strict, into a tree structure with refids like we have in
XML, but defining the underlying vocabulary that way will be messy.

Now, one of the biggest valid criticisms is that it is very hard to
create an all-encompassing ontology. Getting a sizeable group with
diverse interests to agree on all aspects can take years; just check
the progress on some OASIS standards.

But we don't need that in order to get going. Namespaces allow us to
easily extend an existing ontology or add a new one. Obviously, it
won't help us if everyone has their own version, but there are already
a few good starting points (including RSpec for an initial set of
topics/nouns), and the various groups in GENI could add the things they
care about. There has been tremendous progress in the Semantic Web
community on automatically mapping related ontologies to each other,
and if we use their basic technologies, such as OWL, we can leverage a
lot.

Ilia Baldine is using NDL in ORCA, and I have been trying to at least
convert RSpec into an ontology (all my tools fail to parse the
current spec).


> 2b)  The current DRAFT states:
> "Each principal (also aggregate, component, slice) shall have a
> globally-unique name and/or a globally unique numerical identifier."

I would actually broaden that to something like 'artifact'. In Orbit,
besides all that, every experiment, every measurement set, every
experiment description file, configuration prototype, application, ...
has its own unique identifier. How else would we be able to describe
what we want to do, what we did, and how everything is related?

Coming back to the relationships mentioned above, there are some
interesting 'complications'. Let's pick a simple resource, such as a
computer. That obviously should have an identifier. At some stage we
replace the disk. Does the resource get a new identifier? It's not the
same anymore; its performance and capabilities may have changed. So
our inventory ontology (or database schema) breaks this down into
related resources which make up another one. (Is a computer now an
aggregate, as it aggregates such "atomic" resources as motherboard,
memory, disk, ...?)
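
To make the decomposition concrete, here is a minimal sketch of one way
an inventory could model it. The Resource class, its fields and the
replace_part helper are all hypothetical; this is not the Orbit schema.
Whether the computer should keep its identifier after a swap is exactly
the open question, and the sketch simply picks one answer.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Hypothetical inventory entry: every resource has its own identifier."""
    kind: str
    identifier: uuid.UUID = field(default_factory=uuid.uuid4)
    parts: List["Resource"] = field(default_factory=list)

    def replace_part(self, old: "Resource", new: "Resource") -> None:
        # Swap one part for another; the identifier of the containing
        # resource is left untouched (a modelling choice, not a given).
        self.parts = [new if part is old else part for part in self.parts]


old_disk = Resource("disk")
new_disk = Resource("disk")
computer = Resource(
    "computer",
    parts=[Resource("motherboard"), Resource("memory"), old_disk],
)

identifier_before = computer.identifier
computer.replace_part(old_disk, new_disk)
```

Here the change in capabilities shows up as a change in the "parts"
relationship rather than as a new identifier for the computer.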

>
>
> 2c)  Discussion:
> Current prototype implementations use a UUID as a unique identifier,
> which is a long "random number" that is (with a very high probability)
> unique within one suite, and also among all suites.

First of all, is the use of a UUID as defined in RFC 4122 a core tenet
of the architecture? There are other well-defined ways to accomplish
that. There is an efficiency argument to be made, but what else was
behind this? What prevents us from using generic URNs? You can always
get to a UUID by specifying a hash function and a mapping namespace,
something most UUID libraries provide.
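
As a rough illustration of that last point, RFC 4122 defines name-based
UUIDs (version 5 uses SHA-1) and Python's standard uuid module
implements them. The URN below and the "geni.example.org" name used to
derive the namespace are made up for the example; a federation would
have to agree on its own.

```python
import uuid

# Hypothetical URN for a component; the "geni" namespace identifier and
# the path layout are illustrative only, not taken from any spec.
urn = "urn:geni:orbit:node:node1-1"

# Assumption: a namespace UUID that all suites agree on. Here it is
# derived from a made-up DNS name using the standard DNS namespace.
GENI_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "geni.example.org")


def urn_to_uuid(name: str) -> uuid.UUID:
    """Map a URN to an RFC 4122 version 5 (SHA-1, name-based) UUID."""
    return uuid.uuid5(GENI_NAMESPACE, name)


first = urn_to_uuid(urn)
second = urn_to_uuid(urn)
other = urn_to_uuid("urn:geni:orbit:node:node1-2")
```

The mapping is deterministic, so anyone holding the URN and the
namespace can recompute the UUID, but it only works in that direction:
the URN cannot be recovered from the UUID.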

>
> However, there is no way to take a UUID and decide which suite it is
> in, and thus there is no way to find a UUID in a suite registry
> without checking the registries of all suites.


Anyway, there is a clear trade-off between the ease of creating a
unique identifier and the ease of finding information about it. But I
disagree that the only way to find a UUID is to check all registries.
We have very robust DHT technologies which can easily be used for
that. In fact, this is the route we are taking (with an interesting
twist, though).
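
For readers who have not used a DHT, the core idea can be sketched with
consistent hashing: registries and identifiers are hashed onto the same
ring, and each identifier is looked up at the first registry clockwise
from it. This is a generic sketch, not the scheme mentioned above; the
RegistryRing class and the registry names are invented for the example.

```python
import bisect
import hashlib
import uuid


def ring_position(key: bytes) -> int:
    """Hash a key onto the ring (SHA-1, as in Chord-style DHTs)."""
    return int.from_bytes(hashlib.sha1(key).digest(), "big")


class RegistryRing:
    """Toy ring with a global view of all registries (an assumption)."""

    def __init__(self, registries):
        # Place every registry on the ring, sorted by position.
        self._points = sorted(
            (ring_position(name.encode()), name) for name in registries
        )
        self._positions = [position for position, _ in self._points]

    def responsible_for(self, identifier: uuid.UUID) -> str:
        """Return the registry that would hold the record for a UUID."""
        position = ring_position(identifier.bytes)
        # First registry at or after the key; wrap around past the end.
        index = bisect.bisect_left(self._positions, position)
        return self._points[index % len(self._points)][1]


registries = [
    "registry.suite-a.example",
    "registry.suite-b.example",
    "registry.suite-c.example",
]
ring = RegistryRing(registries)
component = uuid.uuid4()
owner = ring.responsible_for(component)
```

A real DHT would not keep the full list in one place; each node knows
only a few neighbours and forwards the query, but the lookup still ends
at a single registry rather than visiting all of them.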
>

> 2d)  A proposed solution is to have the requirements read:
> "Each principal (also aggregate, component, slice) shall have a
> globally-unique name and/or a globally unique numerical identifier,
> where part of the name and/or numerical identifier directly specifies
> the identity of the GENI suite."

Not sure if this fundamentally solves the problem. How do we ensure
uniqueness of the Suite ID (another UUID), and how do we initially find
all the entry points to the various suites? If we assume that in
order to bootstrap the system we need a way to find out about all the
registries first, or have a hierarchical structure where everyone
knows THE registry and it knows (indirectly) every available suite,
then obviously we start with the relevant knowledge.

I guess, if nothing else, it limits the number of identifiers we are
looking for, and it's a rather stable set. Any gossiping scheme would
work very well.

Now, one solution for a hierarchical naming scheme, and one which makes
my networking colleagues squirm, is using IPv6 addresses for the
identifiers and the DNS infrastructure for lookups. We have a
well-established way to assign address spaces, and the SRV record
(RFC 2782), for instance, is used by XMPP to find the relevant XMPP
server for a domain (and that's how we currently implement
federation), ...
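
A small sketch of what that could look like, using only Python's
ipaddress module. The prefixes come from the 2001:db8::/32
documentation range, and the suite names, domains and the
"_geni-registry" service label are all invented; XMPP itself uses
"_xmpp-server._tcp" for server-to-server lookups.

```python
import ipaddress
from typing import Optional

# Hypothetical prefix assignments, one /48 per suite.
SUITE_PREFIXES = {
    "suite-a": ipaddress.ip_network("2001:db8:a::/48"),
    "suite-b": ipaddress.ip_network("2001:db8:b::/48"),
}


def suite_of(identifier: str) -> Optional[str]:
    """Find the suite whose prefix contains an IPv6-style identifier."""
    address = ipaddress.ip_address(identifier)
    for suite, prefix in SUITE_PREFIXES.items():
        if address in prefix:
            return suite
    return None


def srv_name(suite_domain: str) -> str:
    """Build an RFC 2782 query name (_service._proto.domain)."""
    # "_geni-registry" is a made-up service label for illustration.
    return f"_geni-registry._tcp.{suite_domain}"


known = suite_of("2001:db8:a::42")
unknown = suite_of("2001:db8:c::1")
query = srv_name("suite-a.example")
```

The prefix match answers "which suite is this identifier in?" without
contacting any registry, and the SRV query name is what a resolver
would then look up to find that suite's registry host and port.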

As this is supposed to be a discussion, I'd better end on a slightly
controversial note :)

Cheers,

-max


_______________________________________________
control-wg mailing list
control-wg@...
http://lists.geni.net/mailman/listinfo/control-wg