Backreferences in source-highlight 2.7

by gnombat

I'm having trouble with backreferences in source-highlight 2.7.  If I
create a language definition file named foo.lang:

keyword = `a(.)\1`

And then I create a test file example.foo:

aaa abb acc a** a@@

And then I run source-highlight:

source-highlight --lang-def=foo.lang example.foo

Then "aaa", "abb" and "acc" are highlighted, but "a**" and "a@@" are
not.  Shouldn't the dot match any character?
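
As a sanity check that does not involve source-highlight at all, the same pattern can be tried on the sample line with the C++ standard library. This is only a sketch: it uses std::regex (ECMAScript grammar), which is not necessarily the engine source-highlight uses, and the helper name find_matches is made up for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every substring of `text` matched by `pattern`.
std::vector<std::string> find_matches(const std::string &pattern,
                                      const std::string &text) {
    const std::regex re(pattern);  // ECMAScript grammar by default
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

Calling find_matches(R"(a(.)\1)", "aaa abb acc a** a@@") should return all five tokens, which is the behavior expected from the language definition above.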



_______________________________________________
Help-source-highlight mailing list
Help-source-highlight@...
http://lists.gnu.org/mailman/listinfo/help-source-highlight

Re: Backreferences in source-highlight 2.7

by Lorenzo Bettini

Yes, you're right, so this must be a bug.

I'll try to work on it tomorrow.

Thanks for the feedback!
cheers
        Lorenzo

--
Lorenzo Bettini, PhD in Computer Science, DSI, Univ. di Firenze
ICQ# lbetto, 16080134     (GNU/Linux User # 158233)
HOME: http://www.lorenzobettini.it MUSIC: http://www.purplesucker.com
BLOGS: http://tronprog.blogspot.com  http://longlivemusic.blogspot.com
http://www.gnu.org/software/src-highlite
http://www.gnu.org/software/gengetopt
http://www.gnu.org/software/gengen http://doublecpp.sourceforge.net



Re: Backreferences in source-highlight 2.7

by gnombat

I think I figured out what is going on: is_to_isolate from
regexpstatebuilder.cpp is treating the regular expression as an
alphanumerical string because `a(.)\1` starts and ends with a letter or
number.

/**
  * An expression is isolated basically if it is an alphanumerical
  * string
  * TODO check whether this is actually correct in principle
  * @param s
  * @return
  */
bool is_to_isolate(const string &s) {
   if (s.size()) {
     if ((isalnum(s[0]) || s[0] == '_') &&
         (isalnum(s[s.size()-1]) || s[s.size()-1] == '_'))
       return true;
   }

   return false;
}

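Then the whole regular expression gets surrounded with word-boundary
assertions.  This causes the match to fail with "a**" and "a@@": each
ends in a non-word character followed by a space or the end of the line,
so there is no word boundary after it.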
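
The effect can be reproduced outside source-highlight with a small sketch. It assumes that isolation amounts to putting \b on both sides of the expression; the markers source-highlight actually inserts may differ. It also uses std::regex rather than source-highlight's own engine, and the names looks_alphanumeric, isolate and matches_of are made up for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Same test as is_to_isolate: the first and last characters of the
// expression are letters, digits or underscores.
bool looks_alphanumeric(const std::string &s) {
    auto is_word = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };
    return !s.empty() && is_word(s.front()) && is_word(s.back());
}

// ASSUMPTION: isolation is modeled with \b on both sides.
std::string isolate(const std::string &s) { return "\\b" + s + "\\b"; }

// Returns every substring of `text` matched by `pattern`.
std::vector<std::string> matches_of(const std::string &pattern,
                                    const std::string &text) {
    const std::regex re(pattern);
    std::vector<std::string> out;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

Here looks_alphanumeric(R"(a(.)\1)") is true because the expression starts with "a" and ends with "1", and matches_of(isolate(R"(a(.)\1)"), "aaa abb acc a** a@@") finds only "aaa", "abb" and "acc", which matches what source-highlight 2.7 highlights.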




Re: Re: Backreferences in source-highlight 2.7

by Lorenzo Bettini

Yes you're right!

I'll have to fix the code that calls is_to_isolate: for an expression
quoted in backticks (` `), is_to_isolate probably should not be called
at all.
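
One way such a change could look is sketched below. This is not the actual fix that went into source-highlight. The function prepare_expression and its backtick_quoted flag are hypothetical, and isolation is again assumed to mean \b on both sides.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Same check as is_to_isolate: first and last characters are word characters.
bool starts_and_ends_with_word_char(const std::string &s) {
    auto is_word = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
    };
    return !s.empty() && is_word(s.front()) && is_word(s.back());
}

// Hypothetical caller: `backtick_quoted` tells whether the definition was
// written between backticks in the .lang file.
// ASSUMPTION: isolation is modeled as \b on both sides.
std::string prepare_expression(const std::string &s, bool backtick_quoted) {
    if (backtick_quoted)
        return s;  // raw regular expression: skip the isolation check
    if (starts_and_ends_with_word_char(s))
        return "\\b" + s + "\\b";
    return s;
}
```

With this shape, a backtick-quoted `a(.)\1` is passed through unchanged, while a plain keyword such as foo is still isolated.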

hope to have a fix soon
thanks again
        Lorenzo


Re: Backreferences in source-highlight 2.7

by Lorenzo Bettini

Hi there,

The temporary version available here

http://rap.dsi.unifi.it/~bettini/source-highlight-2.7.tar.gz

should fix this problem.

Hope to hear from you soon
        Lorenzo
