regexp and parentheses: incompatibility with MATLAB


regexp and parentheses: incompatibility with MATLAB

by spasmous2

I'm on Octave 3.03 and MATLAB 2008a,

str='(0018,0011) DS 128                    # 1, 4 Columns';

# in Octave
regexp(str,'0018,0011)')
error: regexp: unmatched parentheses at position 9 of expression

regexp(str,'(0018,0011')
error: regexp: missing ) at position 10 of expression


% in MATLAB
regexp(str,'0018,0011)')
ans = 2

regexp(str,'(0018,0011')
ans = 1

_______________________________________________
Bug-octave mailing list
Bug-octave@...
https://www-old.cae.wisc.edu/mailman/listinfo/bug-octave

regexp and parentheses: incompatibility with MATLAB

by John W. Eaton-3

On 26-Feb-2009, spasmous wrote:

| I'm on Octave 3.03 and MATLAB 2008a,
|
| str='(0018,0011) DS 128                    # 1, 4 Columns';
|
| # in Octave
| regexp(str,'0018,0011)')
| error: regexp: unmatched parentheses at position 9 of expression
|
| regexp(str,'(0018,0011')
| error: regexp: missing ) at position 10 of expression
|
|
| % in MATLAB
| regexp(str,'0018,0011)')
| ans = 2
|
| regexp(str,'(0018,0011')
| ans = 1

The error message is coming from the PCRE library, so this problem
should probably be fixed there.

Or, it might be better for you to escape the special metacharacters in
your code:

  regexp(str,'0018,0011\)')
  regexp(str,'\(0018,0011')

I think both of these will work properly in Octave and Matlab.

jwe
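
For reference, here is a minimal sketch of the escaped patterns suggested above, run against the string from the original report. The variable names `idx_close`, `idx_open` and `idx_literal` are only illustrative, and `strfind` is shown as an alternative for cases where no regular-expression features are needed.

```octave
str = '(0018,0011) DS 128                    # 1, 4 Columns';

% Escape the parentheses so they are treated as literal characters.
idx_close = regexp(str, '0018,0011\)');   % start index of "0018,0011)"
idx_open  = regexp(str, '\(0018,0011');   % start index of "(0018,0011"

% For a purely literal search, strfind avoids regular expressions entirely.
idx_literal = strfind(str, '0018,0011)');
```

With this string, `idx_close` and `idx_literal` are both 2 and `idx_open` is 1, which matches the MATLAB results in the report.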

Re: regexp and parentheses: incompatibility with MATLAB

by Sergei Steshenko-2

--- On Thu, 2/26/09, John W. Eaton <jwe@...> wrote:

> From: John W. Eaton <jwe@...>
> Subject: regexp and parentheses: incompatibility with MATLAB
> To: "spasmous" <spasmous@...>
> Cc: bug-octave@...
> Date: Thursday, February 26, 2009, 11:55 AM
> On 26-Feb-2009, spasmous wrote:
>
> | I'm on Octave 3.03 and MATLAB 2008a,
> |
> | str='(0018,0011) DS 128                    # 1, 4 Columns';
> |
> | # in Octave
> | regexp(str,'0018,0011)')
> | error: regexp: unmatched parentheses at position 9 of expression
> |
> | regexp(str,'(0018,0011')
> | error: regexp: missing ) at position 10 of expression
> |
> |
> | % in MATLAB
> | regexp(str,'0018,0011)')
> | ans = 2
> |
> | regexp(str,'(0018,0011')
> | ans = 1
>
> The error message is coming from the PCRE library, so this problem
> should probably be fixed there.
>
> Or, it might be better for you to escape the special metacharacters in
> your code:
>
>   regexp(str,'0018,0011\)')
>   regexp(str,'\(0018,0011')
>
> I think both of these will work properly in Octave and Matlab.
>
> jwe

Well, in Perl (and I guess PCRE copies the functionality from Perl) it's
the user's duty to distinguish \) and \( as literal parentheses from
(<something>), which is meant to extract that <something>.

Do Matlab/Octave allow extraction of the matching <something> in
(<something>) ?

If they don't, and PCRE does, then, I think, the application should prepend
() with \, but it becomes ugly, since the application has to parse the RE,
and this is what it doesn't want to do in the first place ...


Regards,
  Sergei.
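
On the extraction question: both Octave and Matlab document a 'tokens' option for regexp that returns the text captured by each parenthesised group. Below is a small sketch using the string from the original report. The literal parentheses are escaped and the inner groups capture; the variable names `tok`, `group` and `element` are illustrative only.

```octave
str = '(0018,0011) DS 128                    # 1, 4 Columns';

% \( and \) match literal parentheses; the unescaped (...) groups capture.
tok = regexp(str, '\((\d+),(\d+)\)', 'tokens');

group   = tok{1}{1};   % text captured by the first group
element = tok{1}{2};   % text captured by the second group
```

Here `group` is '0018' and `element` is '0011', so the same pattern mixes literal and capturing parentheses as long as the literal ones are escaped.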


     

Re: regexp and parentheses: incompatibility with MATLAB

by dbateman

Sergei Steshenko-2 wrote:

> Well, in Perl (and I guess PCRE copies the functionality from Perl) it's
> the user's duty to distinguish \) and \( as literal parentheses from
> (<something>), which is meant to extract that <something>.
>
> Do Matlab/Octave allow extraction of the matching <something> in
> (<something>) ?
>
> If they don't, and PCRE does, then, I think, the application should prepend
> () with \, but it becomes ugly, since the application has to parse the RE,
> and this is what it doesn't want to do in the first place ...

We already have to partially parse the pattern to deal with Matlab-style
named tokens. I suppose if we want full Matlab compatibility we should do a
simple parse of the RE and escape the unescaped special characters... I'll
look at this.

D.
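
As an illustration of the Matlab-style named tokens mentioned above, here is a short sketch using the 'names' option of regexp on the string from the original report. The token names `grp` and `elem` are made up for this example.

```octave
str = '(0018,0011) DS 128                    # 1, 4 Columns';

% (?<name>...) is the Matlab-style named token; \( and \) stay literal.
names = regexp(str, '\((?<grp>\d+),(?<elem>\d+)\)', 'names');

grp  = names.grp;    % field holding the first named token
elem = names.elem;   % field holding the second named token
```

The result is a struct with one field per named token, so the matched pieces can be read by name rather than by position.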

Re: regexp and parentheses: incompatibility with MATLAB

by John W. Eaton-3

On  1-Mar-2009, dbateman wrote:

| We already have to partially parse the pattern to deal with Matlab-style
| named tokens. I suppose if we want full Matlab compatibility we should do a
| simple parse of the RE and escape the unescaped special characters... I'll
| look at this

Since you can't just escape them all but you have to look for
mismatched parens (for example) this seems fairly complex to me, and
something that would be best left to PCRE, which already has to parse
the expression.  Maybe it would be better to fix the problem there.  I
don't see it as a priority for fixing this in Octave, but if you see
an easy way to do it and it can be done in a reliable way, then I
guess it would be OK.

jwe
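
To give a feel for what looking for mismatched parens involves, here is a rough sketch, not anything taken from Octave's sources. It walks each pattern once, keeps a stack of open parentheses, and backslash-escapes only those left without a partner. It deliberately ignores character classes such as [()] and every other metacharacter, which is part of why a reliable version is harder than it looks. All variable names are hypothetical.

```octave
pats  = {'0018,0011)', '(0018,0011'};   % the two patterns from the report
fixed = cell(size(pats));

for p = 1:numel(pats)
  pat = pats{p};
  n = numel(pat);
  needs_escape = false(1, n);   % marks parentheses without a partner
  open_idx = [];                % stack of indices of unclosed '('
  k = 1;
  while k <= n
    if pat(k) == '\'
      k = k + 2;                % skip the backslash and the escaped character
      continue;
    elseif pat(k) == '('
      open_idx(end+1) = k;      % remember where this group opened
    elseif pat(k) == ')'
      if isempty(open_idx)
        needs_escape(k) = true; % ')' with nothing open
      else
        open_idx(end) = [];     % closes the innermost open group
      end
    end
    k = k + 1;
  end
  needs_escape(open_idx) = true;  % any '(' that was never closed

  % Rebuild the pattern, inserting a backslash before each marked character.
  out = '';
  for k = 1:n
    if needs_escape(k)
      out = [out '\'];
    end
    out = [out pat(k)];
  end
  fixed{p} = out;
end
```

For the two patterns from the report this yields `0018,0011\)` and `\(0018,0011`, the same strings as the hand-escaped versions suggested earlier in the thread.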

Re: regexp and parentheses: incompatibility with MATLAB

by dbateman

John W. Eaton-3 wrote:

> On  1-Mar-2009, dbateman wrote:
>
> | We already have to partially parse the pattern to deal with Matlab-style
> | named tokens. I suppose if we want full Matlab compatibility we should do a
> | simple parse of the RE and escape the unescaped special characters... I'll
> | look at this
>
> Since you can't just escape them all but you have to look for
> mismatched parens (for example) this seems fairly complex to me, and
> something that would be best left to PCRE, which already has to parse
> the expression.  Maybe it would be better to fix the problem there.  I
> don't see it as a priority for fixing this in Octave, but if you see
> an easy way to do it and it can be done in a reliable way, then I
> guess it would be OK.
>
> jwe

You're right. I read too quickly, thinking the ( were in the string and not
the pattern.

D.