[GNU m4 1.4.16] Shy groups in regular expressions

by Tim Landscheidt

Hi,

the documentation for regexp says:

|  -- Builtin: regexp (STRING, REGEXP, [REPLACEMENT])
|      Searches for REGEXP in STRING.  The syntax for regular expressions
|      is the same as in GNU Emacs, which is similar to BRE, Basic
|      Regular Expressions in POSIX.  *Note Syntax of Regular
|      Expressions: (emacs)Regexps.  Support for ERE, Extended Regular
|      Expressions is not available, but will be added in GNU M4 2.0.

|      [...]

However:

| [tim@passepartout ~]$ m4
| regexp(`abc', `\(b\)')
| 1
| regexp(`abc', `\(?:b\)')
| -1
| [tim@passepartout ~]$

Emacs's documentation on "Backslash in Regular Expressions"
that is linked from m4's info file doesn't seem to imply
that shy groups were in fact ERE.  So is this a bug?

Tim



Re: [GNU m4 1.4.16] Shy groups in regular expressions

by eblake

On 03/11/2012 05:06 PM, Tim Landscheidt wrote:

> Hi,
>
> the documentation for regexp says:
>
> |  -- Builtin: regexp (STRING, REGEXP, [REPLACEMENT])
> |      Searches for REGEXP in STRING.  The syntax for regular expressions
> |      is the same as in GNU Emacs, which is similar to BRE, Basic
> |      Regular Expressions in POSIX.  *Note Syntax of Regular
> |      Expressions: (emacs)Regexps.  Support for ERE, Extended Regular
> |      Expressions is not available, but will be added in GNU M4 2.0.
>
> |      [...]
>
> However:
>
> | [tim@passepartout ~]$ m4
> | regexp(`abc', `\(b\)')
> | 1
> | regexp(`abc', `\(?:b\)')
> | -1
> | [tim@passepartout ~]$
>
> Emacs's documentation on "Backslash in Regular Expressions"
> that is linked from m4's info file doesn't seem to imply
> that shy groups were in fact ERE.  So is this a bug?

Thanks for the report.

Shy groups are not part of glibc's re_compile_pattern, and are therefore
not a part of GNU m4.  M4 only uses glibc's implementation with a flags
value of 0, which happens to be Emacs-compatible; it does not get
extensions such as shy groups, which were later added to Emacs alone,
beyond what glibc provides.

If I could ever get more time to work on m4, there is a proposal for m4
2.0 to allow the user to have more control over which flavor of regex is
in use, rather than forcing things to glibc's re_compile_pattern with
the default flag settings.  It would be feasible to make m4 2.0 compile
against libpcre or other such extension mechanism, in order to also
allow for additional regex flavors with features such as shy groups.

--
Eric Blake   eblake@...    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




Re: [GNU m4 1.4.16] Shy groups in regular expressions

by Tim Landscheidt

Eric Blake <eblake@...> wrote:

> [...]

> Shy groups are not part of glibc's re_compile_pattern, and are therefore
> not a part of GNU m4.  M4 only uses glibc's implementation with a flags
> value of 0, which happens to be Emacs-compatible; it does not get
> extensions such as shy groups, which were later added to Emacs alone,
> beyond what glibc provides.

> If I could ever get more time to work on m4, there is a proposal for m4
> 2.0 to allow the user to have more control over which flavor of regex is
> in use, rather than forcing things to glibc's re_compile_pattern with
> the default flag settings.  It would be feasible to make m4 2.0 compile
> against libpcre or other such extension mechanism, in order to also
> allow for additional regex flavors with features such as shy groups.

Thanks for the explanation.  While m4 2.0 certainly sounds
nice, I think the bug in:

|      [...]                           The syntax for regular expressions
|      is the same as in GNU Emacs, which is similar to BRE, Basic
|      Regular Expressions in POSIX.  [...]

could also be fixed by changing the reference :-).  I googled
a bit, but gave up when I ended up at
<URI:http://www.gnu.org/software/libc/manual/html_mono/libc.html#Regular-Expressions>'s:

| The GNU C library supports two interfaces for matching
| regular expressions.  One is the standard POSIX.2 interface, and
| the other is what the GNU system has had for many years.

and then did not find any documentation for "the other".
Gnulib has some information at
<URI:http://www.gnu.org/software/gnulib/manual/gnulib.html#Regular-expressions>
but it focuses on developers using the C functions rather
than on someone interested in the supported syntax.

Tim