[jira] Created: (BOO-1236) Missing PatternMatching patterns, plus regex pattern


[jira] Created: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org

Missing PatternMatching patterns, plus regex pattern
----------------------------------------------------

                 Key: BOO-1236
                 URL: http://jira.codehaus.org/browse/BOO-1236
             Project: Boo
          Issue Type: Improvement
    Affects Versions: 0.9.1
            Reporter: Martinho Fernandes
            Priority: Minor
             Fix For: 0.9.2
         Attachments: re-patterns.patch

I noticed a few patterns were documented as NOT IMPLEMENTED in Boo.Lang.PatternMatching, so I implemented them.

I also added the ability to match with regular expressions, like this:

match aString:
    case /^\d+$/:
        print "It's a number!"
    case /^\w+$/:
        print "It's a word!"
    case /^\s+$/:
        print "It's whitespace!"
    otherwise:
        print "It's something else"

Test cases included.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



[jira] Commented: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


    [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=185644#action_185644 ]

Cedric Vivier commented on BOO-1236:
------------------------------------

Thanks for the patch! :)
I'm not sure, however, that the patch has the intended behavior with respect to ConstrainedAnd and ConstrainedOr (Rodrigo?).

Regex pattern matching is a neat idea; it would be totally awesome if we added variable binding for named capture groups in the regex, e.g.:

{code}
s = "hello hello"
match s:
    case /(?<word>\w+)\s+(\k<word>)/:
        print word #prints `hello'
{code}








[jira] Issue Comment Edited: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


    [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=185644#action_185644 ]

Cedric Vivier edited comment on BOO-1236 at 8/2/09 9:53 AM:
------------------------------------------------------------

Thanks for the patch! :)
I'm not sure, however, that the patch has the intended behavior with respect to ConstrainedAnd and ConstrainedOr (Rodrigo?).

Regex pattern matching is a neat idea; it would be totally awesome if we added variable binding for named capture groups in the regex, e.g.:

{code}
s = "hello hello"
match s:
    case /(?<word>\w+)\s+(\k<word>)/:
        print word #prints `hello'
{code}









 




[jira] Commented: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


    [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=185648#action_185648 ]

Martinho Fernandes commented on BOO-1236:
-----------------------------------------

From what little documentation there was (a comment in MatchMacro.boo):

    Pattern1 and condition -- constrained pattern  (NOT IMPLEMENTED)
    Pattern1 or condition -- constrained pattern  (NOT IMPLEMENTED)

I interpreted these as follows: the case matches if the match value matches Pattern1 and (respectively, or) the condition is true, so I implemented them that way. Let me know if something else was intended.

I also like the idea of binding variables to captures. I will work on that and update the patch.




[jira] Updated: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


     [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martinho Fernandes updated BOO-1236:
------------------------------------

    Attachment: repatterns-w-binding.patch

I've implemented the regex group binding Cedric suggested and updated the patch.

There are a few points to note:

-  Only named groups are bound. I tend to use the explicit capture option all the time, so I don't care much about numbered groups, and I think no one should, because they're fragile. And I cannot bind to variables named 1, 2, or 3. I wouldn't like it if boo inherited all that Perl nonsense of $1, $2, $3... But since I am not the one steering the boo boat, I have no say in that. Let me know if you would like to see numbered groups bound, and I will see to it.

- I decided to bind the variables to the group Captures property, instead of the match value, because there can be more than one non-adjoining match. Look at these examples:

match "foofoofoo":
    case /(f(?<os>oo))+/:
        print "${os.Count} pairs of o's"
# Output: 3 pairs of o's

match "a1b2c3":
    case /(\w+(?<number>\d+)){3}/:
        print number[0].Value + number[2].Value + number[1].Value
# Output: 132

I know that 'number[0].Value' is not as wrist-friendly as simply 'number'. I could have gone with binding to the value of the first Capture, but that would not work with multiple captures, and would also break when there is no capture:

match "bar":
    case /(?<foo>foo)*bar/:
        pass    # argument out of range => match.Groups['foo'].Captures[0]

With my approach you can do this:

match "bar":
    case /(?<foo>foo)*bar/:
        if foo.Count > 0:
            print "There is a foo!"
        else:
            print "There is no foo!"
# Output: There is no foo!

- I am not sure about what would be the expected behaviour in a few cases:

match "foobar":
    case /(?<foo>foo)/ & /(?<foo>\w+)/:
        pass    # Error "Duplicate group binding" at compile-time or runtime?
    case /(?<foo>foo)/ | /(?<foo>\w+)/:
        pass    # Should this work, with foo being bound to the first that matches?

match "foo42":
    case /(?<word>\w+)/ | /(?<number>\d+)/:
        print word is null, number is null
# Output: false true ? or false false ?

Should this be short-circuited? Or should number be evaluated if there is a match?

If you don't do any crazy mixup like those above, everything works fine though:

match "foo42":
    case /(?<word>\w+)/ & /(?<number>\d+)/:
        print "The word is ${word[0].Value} and the number is ${number[0].Value}"
# Output: The word is foo and the number is 42
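
For comparison, the Captures binding described in the points above can be approximated by calling the .NET regex API directly. This is only a sketch: it assumes the patch binds each named group to match.Groups[name].Captures, as the "argument out of range" comment in the earlier example suggests, and the variable names `pairs` and `foo` are illustrative rather than taken from the patch.

```boo
# Roughly what `case /(f(?<os>oo))+/` is assumed to bind for "foofoofoo"
re = /(f(?<os>oo))+/
m = re.Match("foofoofoo")
if m.Success:
    pairs = m.Groups["os"].Captures   # one Capture per repetition of the group
    print pairs.Count                 # 3

# A group that never participates in the match has an empty Captures collection
re = /(?<foo>foo)*bar/
m = re.Match("bar")
if m.Success:
    foo = m.Groups["foo"].Captures
    print foo.Count                   # 0, so foo[0] would be out of range
```

Binding to the collection keeps the "no capture" and "many captures" cases representable, at the cost of the extra [0].Value in the common single-capture case.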


Let me know what you think.




[jira] Commented: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


    [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=189353#action_189353 ]

Daniel Grunwald commented on BOO-1236:
--------------------------------------

Even though the captures are sometimes useful (e.g. when you want to know the start/end positions), for pattern matching I would prefer getting the value(s).
I'm not sure how to make that work with multiple captures, though - ignoring all but the first capture is a bad idea, and throwing an exception if there's no capture is a very bad idea.


case /(?<foo>foo)/ & /(?<foo>\w+)/:
What happens if you have multiple groups named <foo> inside a single regex?
The same should probably happen with &, or it could simply be a compile-time error.


case /(?<word>\w+)/ | /(?<number>\d+)/:

'|' is not short-circuiting in Boo, though I wonder why 'case' is using '|' and '&' (normally only used as bit-wise operators) instead of the short-circuiting 'or' / 'and'.
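
To illustrate the short-circuiting point for ordinary expressions, here is a minimal sketch. It assumes that '|' is accepted on bool operands, as the remark above implies, and it uses a hypothetical helper `check` that announces when it runs. It says nothing about how the match macro itself expands '|' and '&' inside a case.

```boo
# Hypothetical helper: reports that it was evaluated, then returns `result`
def check(label as string, result as bool) as bool:
    print "evaluating " + label
    return result

# Bitwise `|`: both operands are evaluated, even though the left one is already true
a = check("left", true) | check("right", false)

# Boolean `or`: the right operand is skipped once the left one is true
b = check("left", true) or check("right", false)
```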




[jira] Commented: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


    [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=189362#action_189362 ]

Martinho Fernandes commented on BOO-1236:
-----------------------------------------

Cedric's original idea was binding to the values as well. So, values it is.
But, if I bind the variables to the values, what's the expected behavior in these situations:

{noformat}
match "bar":
    case /(?<foo>foo)*bar/:
        assert foo == null # or foo == string.Empty?
{noformat}

{noformat}
match "abc":
    case /(?<letter>\w)+/:
        assert letter == "a" # or letter == ("a", "b", "c")?
{noformat}

I think these complex scenarios should be rejected by the compiler, forcing you to go back to using the regex API explicitly. But these cases are non-trivial to detect at compile time (doing so requires parsing the regex and checking whether groups can have more than one capture).

I don't like either option I've shown for the second example. Getting only the first capture is useless if you're using that regex, because you clearly want more than one capture (otherwise you would have written either /(?<letter>\w)\w*/ or /(?<letter>\w+)/). Making it an array solves that problem but goes back to my Captures approach, except you can simply write letter[0] instead of letter[0].Value.
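
The candidate bindings for the second example can be compared by using the regex API explicitly. This is a minimal sketch that relies only on the standard .NET behaviour of Group.Captures and Group.Value; the names `captures` and `letters` are illustrative and not part of the patch.

```boo
import System.Text.RegularExpressions

re = /(?<letter>\w)+/
m = re.Match("abc")
captures = m.Groups["letter"].Captures

# Option 1: bind to a single value, here the first capture
print captures[0].Value                            # a

# Option 2: bind to all captured values
letters = [c.Value for c as Capture in captures]
print join(letters, ", ")                          # a, b, c

# Note: Group.Value itself returns the last capture, not the first
print m.Groups["letter"].Value                     # c
```

Since Group.Value yields the last capture, a "single value" binding would also have to decide between first and last, which is one more argument for either rejecting such regexes or binding to the full list.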

About using '|' and '&' instead of 'or' / 'and': I used those because I was following the comment that spurred this idea, which already used 'or' / 'and' for another case (constrained patterns, whatever those are). I would vote for dropping those "constrained patterns" (a simple 'if' can do that) and using 'or' / 'and' for the "both" and "either" patterns.




[jira] Updated: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


     [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rodrigo B. de Oliveira updated BOO-1236:
----------------------------------------

    Fix Version/s:     (was: 0.9.2)
                   0.9.3




[jira] Resolved: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


     [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rodrigo B. de Oliveira resolved BOO-1236.
-----------------------------------------

    Resolution: Fixed
      Assignee: Rodrigo B. de Oliveira

Fixed in rev. 3399. Thanks for the patch!




[jira] Closed: (BOO-1236) Missing PatternMatching patterns, plus regex pattern

by JIRA jira@codehaus.org


     [ http://jira.codehaus.org/browse/BOO-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rodrigo B. de Oliveira closed BOO-1236.
---------------------------------------

