Concerning LET or AS

Thank you for your time and attention at the WG meeting today.

TopQuadrant would like Holger's earlier comment [1] to be treated as a formal comment (i.e. with an official WG response on this mailing list). My understanding from today's meeting is that the response is likely to be that the WG has already considered the LET design and believes the AS design to be adequate (LET is merely an abbreviated form for certain AS constructs). I also do not believe that TopQuadrant is bringing any new information that was not considered at your f2f meeting [2]. However, we feel strongly about this, and are likely to raise a formal objection (in the sense that we believe it would be better for the WG to take a few weeks longer over SPARQL 1.1 and get this right than to deliver SPARQL 1.1 on schedule without this feature).

Thinking through Steve's comments in particular, I tried to come up with an example illustrating how the ordering of operations that is sometimes required is better articulated with LET than with AS. This example is not as polished as I would like, since I believe it is more helpful to contribute during your F2F meeting.

First I wish to clarify that this is not about whether or not assignment should be in SPARQL 1.1. Assignment is in already, with the AS construct that was discussed under item 39. This issue is purely about the syntax and scoping rules for the single assignment capability.

Many of the processing tasks that we and our customers have involve mapping several legacy sources together, merging them into one RDF graph, and then doing some processing. A frequent problem is that different legacy sources represent the same data in different ways, e.g. with different case conventions, in different units, or whatever. In these cases, data laundry of one sort or another is necessary. One option for laundry is using functions and assignment within SPARQL.

So for my example, I am taking information about alumni at a college and trying to find the appropriate year photo for them. I will simplify the name problem to a name consisting of a first name and a last name (no middle initial), but people change their last name from time to time.

The data sources that I have include:

- a current mailing database, with full names, e-mail addresses, and addresses

    a:fullName a:email a:address

    _:w a:fullName "John Smith" .
    _:w a:email <mailto:john.smith@...>.

- a database with students' first names, last names and former last names; to simplify processing I just use two properties

    b:firstName b:lastName

  for example:

    _:x b:firstName "John" .
    _:x b:lastName "Doe" .
    _:x b:lastName "Smith".

  shows that the person known as John Doe and the person known as John Smith are one and the same, without clarifying the chronology of the name change.

- a database with date of matriculation, and years of study, by full name at time of matriculation

    c:matriculationDate c:studyYears c:fullName

    _:y c:fullName "John Doe" .
    _:y c:studyYears "P1Y"^^xs:yearMonthDuration .
    _:y c:matriculationDate "1988-09-01"^^xsd:date.

- and a list of graduation photo names by year.

    d:year d:fileName

    _:z d:year "1988"^^xsd:date
    _:z d:fileName "classOf88"

- I have arranged these photos as jpg files on the web at http://www.example.org/photos

    http://www.example.org/photos/classOf88.jpg

SELECT ?eMail ?image
WHERE {
  ?a a:email ?eMail .
  ?a a:fullName ?fullName
  LET ( ?fullNameSpaceNormalized=normalize-space(?fullName) )          [A]
  LET ( ?firstName=substring-before(?fullNameSpaceNormalized," ")     [B]
        ?lastName=substring-after(?fullNameSpaceNormalized," ") )
  ?b b:firstName ?firstName .
  ?b b:lastName ?lastName .
  ?b b:lastName ?altLastName .                                         [C]
  LET ( ?altName=concat(?firstName, " ", ?altLastName ) )
  ?c c:fullName ?altName .
  ?c c:studyYears ?lengthOfCourse .
  ?c c:matriculationDate ?matriculate .
  LET (?endDate=year-from-date(add-yearMonthDuration-to-date(?matriculate,?lengthOfCourse)) )
  ?d d:year ?endDate .
  ?d d:fileName ?imageFile .
  LET ( ?image = xs:anyURI(concat("http://www.example.org/photos/", ?imageFile, ".jpg" ) ) )
}

Notes:
[A] for robustness against leading/trailing space and/or double space in the middle
[B] cannot be combined with [A] because of rules discussed under issue 39
[C] ?altLastName can be the same as ?lastName

I believe the WG is considering recommending that this query should be written as follows.

SELECT ?eMail, xs:anyURI(concat("http://www.example.org/photos/", ?imageFile, ".jpg" ) ) as ?image
WHERE {
  SELECT ( * year-from-date(add-yearMonthDuration-to-date(?matriculate,?lengthOfCourse)) AS ?endDate )
  WHERE {
    SELECT ( * concat(?firstName, " ", ?altLastName ) AS ?altName )
    WHERE {
      SELECT (* substring-before(?fullNameSpaceNormalized," ") AS ?firstName,
                substring-after(?fullNameSpaceNormalized," ") AS ?lastName )
      WHERE {
        SELECT (* normalize-space(?fullName) as ?fullNameSpaceNormalized)
        WHERE {
          ?a a:email ?eMail .
          ?a a:fullName ?fullName .
        }
      }
      ?b b:firstName ?firstName .
      ?b b:lastName ?lastName .
      ?b b:lastName ?altLastName .
    }
    ?c c:fullName ?altName .
    ?c c:studyYears ?lengthOfCourse .
    ?c c:matriculationDate ?matriculate .
  }
  ?d d:year ?endDate .
  ?d d:fileName ?imageFile .
}

(Using the equivalence from [3])

We believe that this is inferior: harder to write, harder to read, and harder to understand. We also believe that the cost of complicating the language by having two ways to say the same thing is well worth paying.

Jeremy Carroll
AC Rep, TopQuadrant.

[1] http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2009Oct/0003
[2] http://www.w3.org/2009/sparql/meeting/2009-05-06#ProjectExpressions___26___20_Assignment
[3] http://www.w3.org/2009/sparql/wiki/Feature:Assignment#Equivalence_with_SubSelects_and_ProjectExpressions
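For readers unfamiliar with the XPath string functions used in steps [A] to [C] of the query, the following is a minimal Python sketch of what each step computes. The function names mirror the XPath ones but are hypothetical helpers written for this illustration; the whitespace handling is an approximation (Python's `str.split()` recognises more whitespace characters than XPath does), and a non-empty separator is assumed.

```python
def normalize_space(text):
    # Approximates XPath normalize-space: trims both ends and collapses
    # internal whitespace runs to a single space.
    return " ".join(text.split())


def substring_before(text, sep):
    # Part of `text` before the first `sep`; empty string if `sep` is absent.
    # Assumes `sep` is non-empty.
    head, found, _ = text.partition(sep)
    return head if found else ""


def substring_after(text, sep):
    # Part of `text` after the first `sep`; empty string if `sep` is absent.
    # Assumes `sep` is non-empty.
    _, found, tail = text.partition(sep)
    return tail if found else ""


# Step [A]: launder the raw mailing-list name.
full_name = normalize_space("  John   Smith ")
# Step [B]: split it into the two parts used to join against source b.
first_name = substring_before(full_name, " ")
last_name = substring_after(full_name, " ")
# Step [C]: rebuild the name under an alternative last name ("Doe" stands in
# for a value of ?altLastName found in source b).
alt_name = f"{first_name} Doe"
```

Each value is computed from values introduced earlier, which is the ordering the LET form of the query makes visible.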
Re: Concerning LET or AS

PS

My example comes from putting together the following thoughts ...

There is a suggestion that what people don't like about LET is that it is 'procedural'. However, single assignment is declarative, and perceiving LET as procedural is a failure of understanding.

Ordering constraints can be declarative or procedural; LET introduces declarative ordering constraints. The ordering constraints appear because of the shape of the problem: for example, if you compute an end date from a start date and a duration, you need to know the start date and the duration.

Each LET declaratively but concisely introduces an ordering constraint. The alternative SELECT AS WHERE construct declaratively and verbosely introduces an ordering constraint.

There is a natural ordering to do with the flow of information. This isn't necessarily the order of computation, but it is an order in which it is easier for the query author to think about the query. The LET syntax follows this natural ordering; the AS syntax does not.
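The claim that single assignment only introduces declarative ordering constraints can be illustrated with a small sketch. It models each LET-bound variable from the alumni example as a node whose expression reads other variables, and derives a valid evaluation order from those dependencies alone, whatever order the bindings are listed in. The dictionary and function names are hypothetical, and Python's standard `graphlib` module (3.9+) is used purely for illustration; nothing here reflects how a SPARQL engine is actually implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical model: LET-bound variable -> variables its expression reads.
# Deliberately listed in the reverse of the order used in the query.
LET_DEPENDENCIES = {
    "image": {"imageFile"},
    "endDate": {"matriculate", "lengthOfCourse"},
    "altName": {"firstName", "altLastName"},
    "lastName": {"fullNameSpaceNormalized"},
    "firstName": {"fullNameSpaceNormalized"},
    "fullNameSpaceNormalized": {"fullName"},
}


def evaluation_order(dependencies):
    """Return the LET-bound variables so that each one comes after every
    LET-bound variable its expression reads."""
    sorter = TopologicalSorter(dependencies)
    # static_order() also yields variables bound by triple patterns
    # (e.g. "fullName"); keep only the LET-bound ones.
    return [var for var in sorter.static_order() if var in dependencies]


order = evaluation_order(LET_DEPENDENCIES)
print(order)
```

The constraints come from the data flow (an end date needs a start date and a duration), not from the position of each binding in the text; a circular set of bindings would make `static_order()` raise `graphlib.CycleError`.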
Re: Concerning LET or AS

I thought I should share a couple of the comments I have had on this topic from TopQuadrant colleagues:

[[
Speaking as a person who teaches this stuff, more than our own reputation is at stake. Perhaps this is something to add to our objection.

Technology adoption is the goal of a standard. SPARQL fights an uphill battle for a couple of reasons: (1) It's Not SQL. (2) Pattern-based retrieval is weird (witness the fabulous popularity of PROLOG as a software engineering language). In short, many people are looking for reasons not to adopt it, and to stay with familiarity.

In short, the semantic web is faced with a huge hurdle in SPARQL. And while I applaud the "small standard" policy (which in general is a boon to teaching and adoption), it is only good when it serves ease of adoption. SPARQL 1.0 has some huge problems that give SQL fans great ammunition when it comes to saying "it's not ready"; negation and aggregates are the biggies here, and both of them have been fixed. Adding in another complex idiom (like nested subqueries) for something simple (LET) will be repeating the mistake of negation. Another reason to say, "Wait for SPARQL 3".

I would go so far as to use the "small standard" argument the other way. Subqueries are difficult. In fact, in my course, I say, "The reason SPARQL doesn't have subquery is because it is not needed. The sorts of things that you use them for in SQL are done easily in a pattern language, without resorting to a complex construct like a subquery". I challenge the room to prove me wrong. One person was able to do so. No SQL programmer can do it; subqueries are error prone and confusing.

So, faced with a confusing, difficult, unsuccessful idea from SQL (subqueries) vs a well-accepted idea from BASIC, which one fits the "small standard" mantra better? I can speak confidently as an educator in this stuff. LET wins hands-down.
]]

and

[[
Also look at our mailing list to see what our customers are doing. [X]'s message from yesterday contains:

SELECT ?stringPredicate ?stringObject ?stringAttributesNodeName ?stringAttributeNodeName
WHERE {
  CQ:GatheredData ?predicate ?object .
  LET (?stringPredicate := smf:cast(smf:name(?predicate), xsd:string)) .
  LET (?stringObject := smf:cast(smf:name(?object), xsd:string)) .
  LET (?uuid := smf:generateUUID()) .
  ?attributes a CQ:attributes .
  LET (?attributesNodeName := smf:qname(?attributes)) .
  LET (?stringAttributesNodeName := smf:cast(smf:name(?attributesNodeName), xsd:string)) .
  LET (?stringAttributeNodeName := smf:buildString("{?stringAttributesNodeName}-{?uuid}")) .
}

Will make a nice chain of sub-SELECTs...
]]
Re: Concerning LET or AS

Off list, Steve and Lee encouraged me to be clearer about what I thought LET as a keyword means.

Here is my attempt at specifying it (based on the WG Wiki page). Please note that I am not responsible for TopQuadrant's SPARQL work; Holger is our expert, and we tend to be dependent on Andy's implementation. So, I am happy with any corrections from Andy. It is not important how the word LET is spelt (i.e. as far as I know, TopQuadrant has no particular attachment to 'LET' rather than 'BIND', for example).

================================
In the FPWD of Query 1.1 we modify rule 43 for GroupGraphPattern as follows:

[43*] GroupGraphPattern ::= '{' GroupGraphPatternLetSub '}'
[A]   GroupGraphPatternLetSub ::= ( GroupGraphPatternLetSub Let '.'? )? GroupGraphPatternSub
[B]   Let ::= 'LET' '(' Var ':=' Expression ( ',' Var ':=' Expression )* ')'

Rules [43*], [A] and [B] are interpreted by rewriting queries involving LET into queries not involving LET. We will use phi(x) to be the rewritten query of x.

If x matches rule [B], then:

  phi(x) = 'SELECT' '(' * '(' Expression 'AS' Var ')' ( '(' Expression 'AS' Var ')' )* ')'

(with the variables matching respectively).

If x matches rule [A], then:

  phi(x) = phi(Let) 'WHERE' '{' phi(GroupGraphPatternLetSub) '}' GroupGraphPatternSub

The rest of the specification then applies.
==================

(Note this is a fine recipe for implementing as well.)

Specifically, this prohibits forward references. Being a macro expansion into a declarative form, this is declarative.

Jeremy
Re: Concerning LET or AS

I made a couple of mistakes in my previous text. Please allow me to withdraw that text and try again.

The errors were:
1) I missed the { } around the subselect
2) my modification to rule 43 lost the SubSelect expansion
3) I removed too many '(' ')' in the rewrite rule

I made some modifications for clarity too.

Also, as a very minor comment, rule [43] etc combined with the grammar rules from SPARQL 1.0 do not seem to expand to the example query immediately above. I take the intent of rule 43 to be:

  GroupGraphPattern ::= '{' ( SubSelect | GroupGraphPatternSub )+ '}'

(Without the +, the rule matches either a single subselect or a SPARQL 1.0 body, but not a combination of both.)

Here is modified text:

===============================
'LET' is specified as a macro-expansion, in terms of subselect queries.

In the FPWD of Query 1.1 we modify rule 43 for GroupGraphPattern as follows:

[43*] GroupGraphPattern ::= '{' GroupGraphPatternLetSub '}'
[A]   GroupGraphPatternLetSub ::= ( GroupGraphPatternLetSub LetExpr '.'? )? GroupGraphPatternNoLetSub
[B]   LetExpr ::= 'LET' '(' Var ':=' Expression ( ',' Var ':=' Expression )* ')'
[C]   GroupGraphPatternNoLetSub ::= ( SubSelect | GroupGraphPatternSub )+

Rules [A] and [B] are interpreted by rewriting queries involving LET into queries not involving LET. We will use phi(x) to be the rewritten query of x.

For clarity of exposition we will expand the two alternative readings of [A] as

[A.1] GroupGraphPatternLetSub ::= GroupGraphPatternNoLetSub
[A.2] GroupGraphPatternLetSub ::= GroupGraphPatternLetSub LetExpr '.'? GroupGraphPatternNoLetSub

Expressions matching rule [A.1] are not rewritten.

If x matches rule [B], then:

  phi(x) = 'SELECT' * '(' Expression 'AS' Var ')' ( '(' Expression 'AS' Var ')' )*

(with the variables matching respectively).

If x matches rule [A.2], with y matching GroupGraphPatternLetSub on the R.H.S., z matching LetExpr, and w matching GroupGraphPatternNoLetSub, then:

  phi(x) = '{' phi(z) 'WHERE' '{' phi(y) '}' '}' w

After this rewrite is applied to all instances matching rules [A] and [B], the rewritten query does not involve 'LET' and its meaning is as given in the rest of the specification.
======================
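The macro-expansion above can be sketched in a few lines of Python. This is only an illustration of rewrite rules [A.1], [A.2], [B] and [43*] operating on strings, assuming a group body has already been split into an initial pattern followed by (LetExpr, pattern) pairs; the `Let` class and the `phi_*` function names are hypothetical and no real parser or SPARQL library is involved.

```python
from dataclasses import dataclass


@dataclass
class Let:
    """One LetExpr: ordered (variable, expression) pairs."""
    bindings: list


def phi_let(let):
    # Rule [B]: LET ( v1 := e1, v2 := e2 ) -> SELECT * (e1 AS v1) (e2 AS v2)
    projections = " ".join(f"({expr} AS {var})" for var, expr in let.bindings)
    return f"SELECT * {projections}"


def phi_body(first, rest):
    """Rewrite a body of the form w0 (LetExpr w)*.

    `first` is w0 (text with no LET); `rest` is a list of (Let, text) pairs.
    """
    rewritten = first  # rule [A.1]: left as is
    for let, pattern in rest:
        # Rule [A.2]: '{' phi(z) 'WHERE' '{' phi(y) '}' '}' w
        # The grammar is left-recursive, so y is everything rewritten so far.
        rewritten = f"{{ {phi_let(let)} WHERE {{ {rewritten} }} }} {pattern}".rstrip()
    return rewritten


def phi_group(first, rest):
    # Rule [43*]: wrap the rewritten body in the group's own braces.
    return f"{{ {phi_body(first, rest)} }}"


example = phi_group(
    "?a a:fullName ?fullName .",
    [(Let([("?n", "normalize-space(?fullName)")]), "?b b:firstName ?n .")],
)
print(example)
```

Because each LET wraps everything to its left in a sub-SELECT, a variable is only in scope for patterns that follow its LET, which is the "no forward references" property claimed for the expansion.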
Re: Concerning LET or AS

As a personal comment (sorry, you are probably sick of me now): I don't particularly seek a response to this comment.

I am surprised that there is concern that the LET single assignment construct may mislead users into having an incorrect processing model in their heads that might be overly procedural. This surprise is because the whole point about having a declarative semantics is that the processing model is irrelevant. Thus, with a declarative language, we expect, perhaps even desire, that users have incorrect processing models. Each implementation is free to use its own processing model, and the user works with their own.

For example, when running an XSLT script, if it has some side effect of writing a message to the console, it is often surprising when these messages get written. This is because the easiest way to think of the XSLT processing model is top-down left-to-right, but good implementations tend to be lazy.

This mismatch between the user's model and the implementor's reality is desirable because:
a) it makes it easier for the user to understand the language
b) it allows the implementor to efficiently implement the language
c) the declarative language design ensures that it doesn't matter that these two views of the processing model differ, perhaps radically.

So, I think an advantage of the term LET as opposed to BIND (say) is that LET reminds some users of procedural programming in BASIC, and allows them to reuse that programming model. Now, while the details of the execution flow are very different in SPARQL than in BASIC, it seems that this apparent familiarity has pedagogical advantages.

==

For the record, TopQuadrant's position is that we don't care what word is used, whether it is LET or BIND or something else.

A further aside is that the latest release of TopBraid Composer includes a SPARQL debugger function that exposes some aspects of the insides of the SPARQL processing (I haven't used it). But I guess that using such a tool would quickly disabuse you of any incorrect notions of the processing model. (LET, I believe, is one of the SPARQL extensions supported by the tool.)

Jeremy