One error in pattern applying

by geelpheels

I have already ported ICU 4.2 to a platform, but when I run the code, it exits abnormally.

The failure happens in the applyPropertyPattern function in the uniset_props.cpp file.

The relevant part of the applyPropertyPattern function is as follows:
UnicodeSet& UnicodeSet::applyPropertyPattern(const UnicodeString& pattern,
                                             ParsePosition& ppos,
                                             UErrorCode &ec)
......

    // Look for an '=' sign.  If this is present, we will parse a
    // medium \p{gc=Cf} or long \p{GeneralCategory=Format}
    // pattern.
    int32_t equals = pattern.indexOf(EQUALS, pos);
    UnicodeString propName, valueName;
    if (equals >= 0 && equals < close && !isName) {
        // Equals seen; parse medium/long pattern
        pattern.extractBetween(pos, equals, propName);
        pattern.extractBetween(equals+1, close, valueName);
    }

    else {
        // Handle case where no '=' is seen, and \N{}
        pattern.extractBetween(pos, close, propName);
           
        // Handle \N{name}
        if (isName) {
            // This is a little inefficient since it means we have to
            // parse NAME_PROP back to UCHAR_NAME even though we already
            // know it's UCHAR_NAME.  If we refactor the API to
            // support args of (UProperty, char*) then we can remove
            // NAME_PROP and make this a little more efficient.
            valueName = propName;
            propName = UnicodeString(NAME_PROP, NAME_PROP_LENGTH, US_INV);
        }
    }

    applyPropertyAlias(propName, valueName, ec);

......

The pattern being applied is:

static const UChar gIsWordPattern[] = {
//    [     \     p     {    A     l     p     h     a     b     e     t     i      c    }
    0x5b, 0x5c, 0x70, 0x7b, 0x61, 0x6c, 0x70, 0x68, 0x61, 0x62, 0x65, 0x74, 0x69, 0x63, 0x7d,
//          \     p     {    M     }                               Mark
          0x5c, 0x70, 0x7b, 0x4d, 0x7d,
//          \     p     {    N     d     }                         Digit_Numeric
          0x5c, 0x70, 0x7b, 0x4e, 0x64, 0x7d,
//          \     p     {    P     c     }      ]                  Connector_Punctuation
          0x5c, 0x70, 0x7b, 0x50, 0x63, 0x7d, 0x5d, 0};

The failure seems to occur because the bold lines are never entered.

The function is first reached from this line of regexst.cpp, in the constructor RegexStaticSets::RegexStaticSets(UErrorCode *status):

            fPropSets[URX_ISWORD_SET]  = new UnicodeSet(UnicodeString(TRUE, gIsWordPattern, -1),     *status);

I think execution goes down the wrong branch here. What could be the cause?
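
To check which branch should be taken for this pattern, the '=' test can be reproduced outside ICU. The following is a minimal standalone sketch, not ICU code: it uses std::string instead of UnicodeString, stores the code units as char16_t (an assumption about how UChar is defined on the target platform), and handles only \p{...}, not \N{...}. The names kIsWordPattern, decodePattern, PropertyRef and traceProperties are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <vector>

// Same code units as gIsWordPattern above, NUL-terminated.
static const char16_t kIsWordPattern[] = {
    0x5b, 0x5c, 0x70, 0x7b, 0x61, 0x6c, 0x70, 0x68, 0x61, 0x62, 0x65, 0x74, 0x69, 0x63, 0x7d,
    0x5c, 0x70, 0x7b, 0x4d, 0x7d,
    0x5c, 0x70, 0x7b, 0x4e, 0x64, 0x7d,
    0x5c, 0x70, 0x7b, 0x50, 0x63, 0x7d, 0x5d, 0};

// Hypothetical helper: turn the NUL-terminated array into a std::string.
// Every code unit in this pattern is ASCII, so a plain cast is enough.
std::string decodePattern(const char16_t* p) {
    std::string out;
    for (; *p != 0; ++p) {
        out.push_back(static_cast<char>(*p));
    }
    return out;
}

// What the branch logic would hand on for one \p{...} item.
struct PropertyRef {
    std::string propName;
    std::string valueName;
    bool sawEquals;  // true: "medium/long" branch, false: else branch
};

// Hypothetical helper: for each \p{...}, repeat the test
// "equals >= 0 && equals < close" from the excerpt above.
std::vector<PropertyRef> traceProperties(const std::string& pattern) {
    std::vector<PropertyRef> refs;
    std::string::size_type start = 0;
    while ((start = pattern.find("\\p{", start)) != std::string::npos) {
        const std::string::size_type pos = start + 3;  // first char after '{'
        const std::string::size_type close = pattern.find('}', pos);
        if (close == std::string::npos) {
            break;  // unterminated item; stop tracing
        }
        // Like indexOf(EQUALS, pos), this searches to the end of the pattern,
        // so the "equals < close" comparison is what limits it to this item.
        const std::string::size_type equals = pattern.find('=', pos);
        if (equals != std::string::npos && equals < close) {
            refs.push_back(PropertyRef{pattern.substr(pos, equals - pos),
                                       pattern.substr(equals + 1, close - equals - 1),
                                       true});
        } else {
            refs.push_back(PropertyRef{pattern.substr(pos, close - pos), "", false});
        }
        start = close + 1;
    }
    return refs;
}
```

Calling traceProperties(decodePattern(kIsWordPattern)) yields four items, none containing '=', so under these assumptions each one takes the else branch with an empty valueName. If the ported build behaves differently, comparing the values of pos, close and equals in a debugger against this trace may help narrow down where it diverges.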

Re: One error in pattern applying

by geelpheels

Nobody knows! Unbelievable!