| 1 | ChangeLog for PCRE |
---|
| 2 | ------------------ |
---|
| 3 | |
---|
| 4 | Version 3.9 02-Jan-02 |
---|
| 5 | --------------------- |
---|
| 6 | |
---|
| 7 | 1. A bit of extraneous text had somehow crept into the pcregrep documentation. |
---|
| 8 | |
---|
| 9 | 2. If --disable-static was given, the building process failed when trying to |
---|
| 10 | build pcretest and pcregrep. (For some reason it was using libtool to compile |
---|
| 11 | them, which is not right, as they aren't part of the library.) |
---|
| 12 | |
---|
| 13 | |
---|
| 14 | Version 3.8 18-Dec-01 |
---|
| 15 | --------------------- |
---|
| 16 | |
---|
| 17 | 1. The experimental UTF-8 code was completely screwed up. It was packing the |
---|
| 18 | bytes in the wrong order. How dumb can you get? |
---|
| 19 | |
---|
| 20 | |
---|
| 21 | Version 3.7 29-Oct-01 |
---|
| 22 | --------------------- |
---|
| 23 | |
---|
| 24 | 1. In updating pcretest to check change 1 of version 3.6, I screwed up. |
---|
| 25 | This caused pcretest, when used on the test data, to segfault. Unfortunately, |
---|
| 26 | this didn't happen under Solaris 8, where I normally test things. |
---|
| 27 | |
---|
| 28 | 2. The Makefile had to be changed to make it work on BSD systems, where 'make' |
---|
| 29 | doesn't seem to recognize that ./xxx and xxx are the same file. (This entry |
---|
| 30 | isn't in ChangeLog distributed with 3.7 because I forgot when I hastily made |
---|
| 31 | this fix an hour or so after the initial 3.7 release.) |
---|
| 32 | |
---|
| 33 | |
---|
| 34 | Version 3.6 23-Oct-01 |
---|
| 35 | --------------------- |
---|
| 36 | |
---|
| 37 | 1. Crashed with /(sens|respons)e and \1ibility/ and "sense and sensibility" if |
---|
| 38 | offsets passed as NULL with zero offset count. |
---|
| 39 | |
---|
| 40 | 2. The config.guess and config.sub files had not been updated when I moved to |
---|
| 41 | the latest autoconf. |
---|
| 42 | |
---|
| 43 | |
---|
| 44 | Version 3.5 15-Aug-01 |
---|
| 45 | --------------------- |
---|
| 46 | |
---|
| 47 | 1. Added some missing #if !defined NOPOSIX conditionals in pcretest.c that |
---|
| 48 | had been forgotten. |
---|
| 49 | |
---|
| 50 | 2. By using declared but undefined structures, we can avoid using "void" |
---|
| 51 | definitions in pcre.h while keeping the internal definitions of the structures |
---|
| 52 | private. |
---|
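The "declared but undefined structure" idea in item 2 can be sketched in C as below. This is only an illustration: the names demo_regex, demo_compile, demo_options and demo_free are hypothetical and are not taken from pcre.h. Callers see only an incomplete type, so they can hold and pass pointers but cannot look inside the structure.

```c
#include <stdlib.h>

/* What a public header might expose (hypothetical names). The structure
   is declared but not defined, so its fields stay private. */
struct demo_regex;
typedef struct demo_regex demo_regex;

demo_regex *demo_compile(int option_bits);
int demo_options(const demo_regex *re);
void demo_free(demo_regex *re);

/* What only the implementation file would contain. */
struct demo_regex {
  int option_bits;
};

demo_regex *demo_compile(int option_bits)
{
  /* The cast keeps C++ compilers happy as well. */
  demo_regex *re = (demo_regex *)malloc(sizeof(*re));
  if (re != NULL) re->option_bits = option_bits;
  return re;
}

int demo_options(const demo_regex *re)
{
  return re->option_bits;
}

void demo_free(demo_regex *re)
{
  free(re);
}
```

Compared with a "void" based interface, the compiler can still catch a caller who passes the wrong kind of pointer.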
| 53 | |
---|
| 54 | 3. The distribution is now built using autoconf 2.50 and libtool 1.4. From a |
---|
| 55 | user point of view, this means that both static and shared libraries are built |
---|
| 56 | by default, but this can be individually controlled. More of the work of |
---|
| 57 | handling the static/shared cases is now inside libtool instead of PCRE's make |
---|
| 58 | file. |
---|
| 59 | |
---|
| 60 | 4. The pcretest utility is now installed along with pcregrep because it is |
---|
| 61 | useful for users (to test regexs) and by doing this, it automatically gets |
---|
| 62 | relinked by libtool. The documentation has been turned into a man page, so |
---|
| 63 | there are now .1, .txt, and .html versions in /doc. |
---|
| 64 | |
---|
| 65 | 5. Upgrades to pcregrep: |
---|
| 66 | (i) Added long-form option names like GNU grep. |
---|
| 67 | (ii) Added --help to list all options with an explanatory phrase. |
---|
| 68 | (iii) Added -r, --recursive to recurse into sub-directories. |
---|
| 69 | (iv) Added -f, --file to read patterns from a file. |
---|
| 70 | |
---|
| 71 | 6. pcre_exec() was referring to its "code" argument before testing that |
---|
| 72 | argument for NULL (and giving an error if it was NULL). |
---|
| 73 | |
---|
| 74 | 7. Upgraded Makefile.in to allow for compiling in a different directory from |
---|
| 75 | the source directory. |
---|
| 76 | |
---|
| 77 | 8. Tiny buglet in pcretest: when pcre_fullinfo() was called to retrieve the |
---|
| 78 | options bits, the pointer it was passed was to an int instead of to an unsigned |
---|
| 79 | long int. This mattered only on 64-bit systems. |
---|
| 80 | |
---|
| 81 | 9. Fixed typo (3.4/1) in pcre.h again. Sigh. I had changed pcre.h (which is |
---|
| 82 | generated) instead of pcre.in, which is its source. Also made the same change |
---|
| 83 | in several of the .c files. |
---|
| 84 | |
---|
| 85 | 10. A new release of gcc defines printf() as a macro, which broke pcretest |
---|
| 86 | because it had an ifdef in the middle of a string argument for printf(). Fixed |
---|
| 87 | by using separate calls to printf(). |
---|
| 88 | |
---|
| 89 | 11. Added --enable-newline-is-cr and --enable-newline-is-lf to the configure |
---|
| 90 | script, to force use of CR or LF instead of \n in the source. On non-Unix |
---|
| 91 | systems, the value can be set in config.h. |
---|
| 92 | |
---|
| 93 | 12. The limit of 200 on non-capturing parentheses is a _nesting_ limit, not an |
---|
| 94 | absolute limit. Changed the text of the error message to make this clear, and |
---|
| 95 | likewise updated the man page. |
---|
| 96 | |
---|
| 97 | 13. The limit of 99 on the number of capturing subpatterns has been removed. |
---|
| 98 | The new limit is 65535, which I hope will not be a "real" limit. |
---|
| 99 | |
---|
| 100 | |
---|
| 101 | Version 3.4 22-Aug-00 |
---|
| 102 | --------------------- |
---|
| 103 | |
---|
| 104 | 1. Fixed typo in pcre.h: unsigned const char * changed to const unsigned char *. |
---|
| 105 | |
---|
| 106 | 2. Diagnose condition (?(0) as an error instead of crashing on matching. |
---|
| 107 | |
---|
| 108 | |
---|
| 109 | Version 3.3 01-Aug-00 |
---|
| 110 | --------------------- |
---|
| 111 | |
---|
| 112 | 1. If an octal character was given, but the value was greater than \377, it |
---|
| 113 | was not getting masked to the least significant bits, as documented. This could |
---|
| 114 | lead to crashes in some systems. |
---|
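The masking described in item 1 can be illustrated with a small sketch. The helper name read_octal_escape is hypothetical and this is not PCRE's own code; it assumes the digits after the backslash are passed in, reads at most three of them, and keeps only the low 8 bits.

```c
/* Hypothetical helper: read up to three octal digits starting at s and
   return the value masked to the least significant 8 bits. */
int read_octal_escape(const char *s)
{
  int value = 0;
  int i;
  for (i = 0; i < 3 && s[i] >= '0' && s[i] <= '7'; i++)
    value = value * 8 + (s[i] - '0');
  return value & 0xff;   /* \400 becomes 0, \777 becomes 255 */
}
```

Without the final mask, a value such as \777 (511 decimal) does not fit in a byte, which is the situation item 1 says could lead to crashes.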
| 115 | |
---|
| 116 | 2. Perl 5.6 (if not earlier versions) accepts classes like [a-\d] and treats |
---|
| 117 | the hyphen as a literal. PCRE used to give an error; it now behaves like Perl. |
---|
| 118 | |
---|
| 119 | 3. Added the functions pcre_free_substring() and pcre_free_substring_list(). |
---|
| 120 | These just pass their arguments on to (pcre_free)(), but they are provided |
---|
| 121 | because some uses of PCRE bind it to non-C systems that can call its functions, |
---|
| 122 | but cannot call free() or pcre_free() directly. |
---|
| 123 | |
---|
| 124 | 4. Add "make test" as a synonym for "make check". Corrected some comments in |
---|
| 125 | the Makefile. |
---|
| 126 | |
---|
| 127 | 5. Add $(DESTDIR)/ in front of all the paths in the "install" target in the |
---|
| 128 | Makefile. |
---|
| 129 | |
---|
| 130 | 6. Changed the name of pgrep to pcregrep, because Solaris has introduced a |
---|
| 131 | command called pgrep for grepping around the active processes. |
---|
| 132 | |
---|
| 133 | 7. Added the beginnings of support for UTF-8 character strings. |
---|
| 134 | |
---|
| 135 | 8. Arranged for the Makefile to pass over the settings of CC, CFLAGS, and |
---|
| 136 | RANLIB to ./ltconfig so that they are used by libtool. I think these are all |
---|
| 137 | the relevant ones. (AR is not passed because ./ltconfig does its own figuring |
---|
| 138 | out for the ar command.) |
---|
| 139 | |
---|
| 140 | |
---|
| 141 | Version 3.2 12-May-00 |
---|
| 142 | --------------------- |
---|
| 143 | |
---|
| 144 | This is purely a bug fixing release. |
---|
| 145 | |
---|
| 146 | 1. If the pattern /((Z)+|A)*/ was matched against ZABCDEFG it matched Z instead |
---|
| 147 | of ZA. This was just one example of several cases that could provoke this bug, |
---|
| 148 | which was introduced by change 9 of version 2.00. The code for breaking |
---|
| 149 | infinite loops after an iteration that matches an empty string wasn't working |
---|
| 150 | correctly. |
---|
| 151 | |
---|
| 152 | 2. The pcretest program was not imitating Perl correctly for the pattern /a*/g |
---|
| 153 | when matched against abbab (for example). After matching an empty string, it |
---|
| 154 | wasn't forcing anchoring when setting PCRE_NOTEMPTY for the next attempt; this |
---|
| 155 | caused it to match further down the string than it should. |
---|
| 156 | |
---|
| 157 | 3. The code contained an inclusion of sys/types.h. It isn't clear why this |
---|
| 158 | was there because it doesn't seem to be needed, and it causes trouble on some |
---|
| 159 | systems, as it is not a Standard C header. It has been removed. |
---|
| 160 | |
---|
| 161 | 4. Made 4 silly changes to the source to avoid stupid compiler warnings that |
---|
| 162 | were reported on the Macintosh. The changes were from |
---|
| 163 | |
---|
| 164 | while ((c = *(++ptr)) != 0 && c != '\n'); |
---|
| 165 | to |
---|
| 166 | while ((c = *(++ptr)) != 0 && c != '\n') ; |
---|
| 167 | |
---|
| 168 | Totally extraordinary, but if that's what it takes... |
---|
| 169 | |
---|
| 170 | 5. PCRE is being used in one environment where neither memmove() nor bcopy() is |
---|
| 171 | available. Added HAVE_BCOPY and an autoconf test for it; if neither |
---|
| 172 | HAVE_MEMMOVE nor HAVE_BCOPY is set, use a built-in emulation function which |
---|
| 173 | assumes the way PCRE uses memmove() (always moving upwards). |
---|
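A minimal sketch of the kind of emulation item 5 describes, assuming (as the entry says) that the data is always moved upwards, that is, the destination is at a higher address than the source. The name move_up_sketch is hypothetical. Copying from the last byte backwards means an overlapping source is never overwritten before it has been read.

```c
#include <stddef.h>

/* Hypothetical stand-in for memmove() that is only safe when dest is
   above src (an upward move) or the regions do not overlap. */
void *move_up_sketch(unsigned char *dest, const unsigned char *src, size_t n)
{
  unsigned char *d = dest + n;
  const unsigned char *s = src + n;
  while (n-- > 0) *(--d) = *(--s);   /* copy the highest byte first */
  return dest;
}
```

A downward move with overlapping regions would need the opposite copy direction, which this sketch deliberately does not handle.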
| 174 | |
---|
| 175 | 6. PCRE is being used in one environment where strchr() is not available. There |
---|
| 176 | was only one use in pcre.c, and writing it out to avoid strchr() probably gives |
---|
| 177 | faster code anyway. |
---|
| 178 | |
---|
| 179 | |
---|
| 180 | Version 3.1 09-Feb-00 |
---|
| 181 | --------------------- |
---|
| 182 | |
---|
| 183 | The only change in this release is the fixing of some bugs in Makefile.in for |
---|
| 184 | the "install" target: |
---|
| 185 | |
---|
| 186 | (1) It was failing to install pcreposix.h. |
---|
| 187 | |
---|
| 188 | (2) It was overwriting the pcre.3 man page with the pcreposix.3 man page. |
---|
| 189 | |
---|
| 190 | |
---|
| 191 | Version 3.0 01-Feb-00 |
---|
| 192 | --------------------- |
---|
| 193 | |
---|
| 194 | 1. Add support for the /+ modifier to perltest (to output $` like it does in |
---|
| 195 | pcretest). |
---|
| 196 | |
---|
| 197 | 2. Add support for the /g modifier to perltest. |
---|
| 198 | |
---|
| 199 | 3. Fix pcretest so that it behaves even more like Perl for /g when the pattern |
---|
| 200 | matches null strings. |
---|
| 201 | |
---|
| 202 | 4. Fix perltest so that it doesn't do unwanted things when fed an empty |
---|
| 203 | pattern. Perl treats empty patterns specially - it reuses the most recent |
---|
| 204 | pattern, which is not what we want. Replace // by /(?#)/ in order to avoid this |
---|
| 205 | effect. |
---|
| 206 | |
---|
| 207 | 5. The POSIX interface was broken in that it was just handing over the POSIX |
---|
| 208 | captured string vector to pcre_exec(), but (since release 2.00) PCRE has |
---|
| 209 | required a bigger vector, with some working space on the end. This means that |
---|
| 210 | the POSIX wrapper now has to get and free some memory, and copy the results. |
---|
| 211 | |
---|
| 212 | 6. Added some simple autoconf support, placing the test data and the |
---|
| 213 | documentation in separate directories, re-organizing some of the |
---|
| 214 | information files, and making it build pcre-config (a GNU standard). Also added |
---|
| 215 | libtool support for building PCRE as a shared library, which is now the |
---|
| 216 | default. |
---|
| 217 | |
---|
| 218 | 7. Got rid of the leading zero in the definition of PCRE_MINOR because 08 and |
---|
| 219 | 09 are not valid octal constants. Single digits will be used for minor values |
---|
| 220 | less than 10. |
---|
| 221 | |
---|
| 222 | 8. Defined REG_EXTENDED and REG_NOSUB as zero in the POSIX header, so that |
---|
| 223 | existing programs that set these in the POSIX interface can use PCRE without |
---|
| 224 | modification. |
---|
| 225 | |
---|
| 226 | 9. Added a new function, pcre_fullinfo() with an extensible interface. It can |
---|
| 227 | return all that pcre_info() returns, plus additional data. The pcre_info() |
---|
| 228 | function is retained for compatibility, but is considered to be obsolete. |
---|
| 229 | |
---|
| 230 | 10. Added experimental recursion feature (?R) to handle one common case that |
---|
| 231 | Perl 5.6 will be able to do with (?p{...}). |
---|
| 232 | |
---|
| 233 | 11. Added support for POSIX character classes like [:alpha:], which Perl is |
---|
| 234 | adopting. |
---|
| 235 | |
---|
| 236 | |
---|
| 237 | Version 2.08 31-Aug-99 |
---|
| 238 | ---------------------- |
---|
| 239 | |
---|
| 240 | 1. When startoffset was not zero and the pattern began with ".*", PCRE was not |
---|
| 241 | trying to match at the startoffset position, but instead was moving forward to |
---|
| 242 | the next newline as if a previous match had failed. |
---|
| 243 | |
---|
| 244 | 2. pcretest was not making use of PCRE_NOTEMPTY when repeating for /g and /G, |
---|
| 245 | and could get into a loop if a null string was matched other than at the start |
---|
| 246 | of the subject. |
---|
| 247 | |
---|
| 248 | 3. Added definitions of PCRE_MAJOR and PCRE_MINOR to pcre.h so the version can |
---|
| 249 | be distinguished at compile time, and for completeness also added PCRE_DATE. |
---|
| 250 | |
---|
| 251 | 4. Added Paul Sokolovsky's minor changes to make it easy to compile a Win32 DLL |
---|
| 252 | in GnuWin32 environments. |
---|
| 253 | |
---|
| 254 | |
---|
| 255 | Version 2.07 29-Jul-99 |
---|
| 256 | ---------------------- |
---|
| 257 | |
---|
| 258 | 1. The documentation is now supplied in plain text form and HTML as well as in |
---|
| 259 | the form of man page sources. |
---|
| 260 | |
---|
| 261 | 2. C++ compilers don't like assigning (void *) values to other pointer types. |
---|
| 262 | In particular this affects malloc(). Although there is no problem in Standard |
---|
| 263 | C, I've put in casts to keep C++ compilers happy. |
---|
| 264 | |
---|
| 265 | 3. Typo on pcretest.c; a cast of (unsigned char *) in the POSIX regexec() call |
---|
| 266 | should be (const char *). |
---|
| 267 | |
---|
| 268 | 4. If NOPOSIX is defined, pcretest.c compiles without POSIX support. This may |
---|
| 269 | be useful for non-Unix systems that don't want to bother with the POSIX stuff. |
---|
| 270 | However, I haven't made this a standard facility. The documentation doesn't |
---|
| 271 | mention it, and the Makefile doesn't support it. |
---|
| 272 | |
---|
| 273 | 5. The Makefile now contains an "install" target, with editable destinations at |
---|
| 274 | the top of the file. The pcretest program is not installed. |
---|
| 275 | |
---|
| 276 | 6. pgrep -V now gives the PCRE version number and date. |
---|
| 277 | |
---|
| 278 | 7. Fixed bug: a zero repetition after a literal string (e.g. /abcde{0}/) was |
---|
| 279 | causing the entire string to be ignored, instead of just the last character. |
---|
| 280 | |
---|
| 281 | 8. If a pattern like /"([^\\"]+|\\.)*"/ is applied in the normal way to a |
---|
| 282 | non-matching string, it can take a very, very long time, even for strings of |
---|
| 283 | quite modest length, because of the nested recursion. PCRE now does better in |
---|
| 284 | some of these cases. It does this by remembering the last required literal |
---|
| 285 | character in the pattern, and pre-searching the subject to ensure it is present |
---|
| 286 | before running the real match. In other words, it applies a heuristic to detect |
---|
| 287 | some types of certain failure quickly, and in the above example, if presented |
---|
| 288 | with a string that has no trailing " it gives "no match" very quickly. |
---|
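The pre-search in item 8 can be sketched as follows. This is an illustration under stated assumptions, not PCRE's implementation: the name worth_matching is hypothetical, and the caller is assumed to know the required literal character and the offset from which it must appear. For the example pattern, the required character is the closing " and it must occur after the opening one.

```c
#include <string.h>

/* Hypothetical pre-check: return 1 if required_char occurs in the subject
   at or after offset 'from', otherwise 0. When it returns 0 the expensive
   backtracking match cannot succeed and can be skipped. */
int worth_matching(const char *subject, size_t length, size_t from,
                   int required_char)
{
  if (from >= length) return 0;
  return memchr(subject + from, required_char, length - from) != NULL;
}
```

The check is a single linear scan, so it costs little compared with the nested backtracking it avoids on a subject that cannot match.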
| 289 | |
---|
| 290 | 9. A new runtime option PCRE_NOTEMPTY causes null string matches to be ignored; |
---|
| 291 | other alternatives are tried instead. |
---|
| 292 | |
---|
| 293 | |
---|
| 294 | Version 2.06 09-Jun-99 |
---|
| 295 | ---------------------- |
---|
| 296 | |
---|
| 297 | 1. Change pcretest's output for amount of store used to show just the code |
---|
| 298 | space, because the remainder (the data block) varies in size between 32-bit and |
---|
| 299 | 64-bit systems. |
---|
| 300 | |
---|
| 301 | 2. Added an extra argument to pcre_exec() to supply an offset in the subject to |
---|
| 302 | start matching at. This allows lookbehinds to work when searching for multiple |
---|
| 303 | occurrences in a string. |
---|
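Why a start offset helps lookbehinds can be shown with a toy matcher. The function find_b_after_a below is hypothetical and merely stands in for a pattern like /(?<=a)b/; it is not a PCRE call. With the whole subject plus an offset, the character before the start position is still visible. With an incremented pointer, it is not.

```c
#include <stddef.h>

/* Toy stand-in for /(?<=a)b/: find a 'b' that is immediately preceded by
   an 'a', scanning from 'start'. Returns the index, or -1 if none. */
long find_b_after_a(const char *subject, size_t length, size_t start)
{
  size_t i;
  for (i = start; i < length; i++)
    if (subject[i] == 'b' && i > 0 && subject[i - 1] == 'a')
      return (long)i;
  return -1;
}
```

Calling it on "ab" with start 1 finds the 'b' at index 1, whereas passing the pointer "ab" + 1 with length 1 and start 0 finds nothing, because the preceding 'a' has been cut off.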
| 304 | |
---|
| 305 | 3. Added additional options to pcretest for testing multiple occurrences: |
---|
| 306 | |
---|
| 307 | /+ outputs the rest of the string that follows a match |
---|
| 308 | /g loops for multiple occurrences, using the new startoffset argument |
---|
| 309 | /G loops for multiple occurrences by passing an incremented pointer |
---|
| 310 | |
---|
| 311 | 4. PCRE wasn't doing the "first character" optimization for patterns starting |
---|
| 312 | with \b or \B, though it was doing it for other lookbehind assertions. That is, |
---|
| 313 | it wasn't noticing that a match for a pattern such as /\bxyz/ has to start with |
---|
| 314 | the letter 'x'. On long subject strings, this gives a significant speed-up. |
---|
| 315 | |
---|
| 316 | |
---|
| 317 | Version 2.05 21-Apr-99 |
---|
| 318 | ---------------------- |
---|
| 319 | |
---|
| 320 | 1. Changed the type of magic_number from int to long int so that it works |
---|
| 321 | properly on 16-bit systems. |
---|
| 322 | |
---|
| 323 | 2. Fixed a bug which caused patterns starting with .* not to work correctly |
---|
| 324 | when the subject string contained newline characters. PCRE was assuming |
---|
| 325 | anchoring for such patterns in all cases, which is not correct because .* will |
---|
| 326 | not pass a newline unless PCRE_DOTALL is set. It now assumes anchoring only if |
---|
| 327 | DOTALL is set at top level; otherwise it knows that patterns starting with .* |
---|
| 328 | must be retried after every newline in the subject. |
---|
| 329 | |
---|
| 330 | |
---|
| 331 | Version 2.04 18-Feb-99 |
---|
| 332 | ---------------------- |
---|
| 333 | |
---|
| 334 | 1. For parenthesized subpatterns with repeats whose minimum was zero, the |
---|
| 335 | computation of the store needed to hold the pattern was incorrect (too large). |
---|
| 336 | If such patterns were nested a few deep, this could multiply and become a real |
---|
| 337 | problem. |
---|
| 338 | |
---|
| 339 | 2. Added /M option to pcretest to show the memory requirement of a specific |
---|
| 340 | pattern. Made -m a synonym of -s (which does this globally) for compatibility. |
---|
| 341 | |
---|
| 342 | 3. Subpatterns of the form (regex){n,m} (i.e. limited maximum) were being |
---|
| 343 | compiled in such a way that the backtracking after subsequent failure was |
---|
| 344 | pessimal. Something like (a){0,3} was compiled as (a)?(a)?(a)? instead of |
---|
| 345 | ((a)((a)(a)?)?)? with disastrous performance if the maximum was of any size. |
---|
| 346 | |
---|
| 347 | |
---|
| 348 | Version 2.03 02-Feb-99 |
---|
| 349 | ---------------------- |
---|
| 350 | |
---|
| 351 | 1. Fixed typo and small mistake in man page. |
---|
| 352 | |
---|
| 353 | 2. Added 4th condition (GPL supersedes if conflict) and created separate |
---|
| 354 | LICENCE file containing the conditions. |
---|
| 355 | |
---|
| 356 | 3. Updated pcretest so that patterns such as /abc\/def/ work like they do in |
---|
| 357 | Perl, that is the internal \ allows the delimiter to be included in the |
---|
| 358 | pattern. Locked out the use of \ as a delimiter. If \ immediately follows |
---|
| 359 | the final delimiter, add \ to the end of the pattern (to test the error). |
---|
| 360 | |
---|
| 361 | 4. Added the convenience functions for extracting substrings after a successful |
---|
| 362 | match. Updated pcretest to make it able to test these functions. |
---|
| 363 | |
---|
| 364 | |
---|
| 365 | Version 2.02 14-Jan-99 |
---|
| 366 | ---------------------- |
---|
| 367 | |
---|
| 368 | 1. Initialized the working variables associated with each extraction so that |
---|
| 369 | their saving and restoring doesn't refer to uninitialized store. |
---|
| 370 | |
---|
| 371 | 2. Put dummy code into study.c in order to trick the optimizer of the IBM C |
---|
| 372 | compiler for OS/2 into generating correct code. Apparently IBM isn't going to |
---|
| 373 | fix the problem. |
---|
| 374 | |
---|
| 375 | 3. Pcretest: the timing code wasn't using LOOPREPEAT for timing execution |
---|
| 376 | calls, and wasn't printing the correct value for compiling calls. Increased the |
---|
| 377 | default value of LOOPREPEAT, and the number of significant figures in the |
---|
| 378 | times. |
---|
| 379 | |
---|
| 380 | 4. Changed "/bin/rm" in the Makefile to "-rm" so it works on Windows NT. |
---|
| 381 | |
---|
| 382 | 5. Renamed "deftables" as "dftables" to get it down to 8 characters, to avoid |
---|
| 383 | a building problem on Windows NT with a FAT file system. |
---|
| 384 | |
---|
| 385 | |
---|
| 386 | Version 2.01 21-Oct-98 |
---|
| 387 | ---------------------- |
---|
| 388 | |
---|
| 389 | 1. Changed the API for pcre_compile() to allow for the provision of a pointer |
---|
| 390 | to character tables built by pcre_maketables() in the current locale. If NULL |
---|
| 391 | is passed, the default tables are used. |
---|
| 392 | |
---|
| 393 | |
---|
| 394 | Version 2.00 24-Sep-98 |
---|
| 395 | ---------------------- |
---|
| 396 | |
---|
| 397 | 1. Since the (?>) facility is in Perl 5.005, don't require PCRE_EXTRA to enable |
---|
| 398 | it any more. |
---|
| 399 | |
---|
| 400 | 2. Allow quantification of (?>) groups, and make it work correctly. |
---|
| 401 | |
---|
| 402 | 3. The first character computation wasn't working for (?>) groups. |
---|
| 403 | |
---|
| 404 | 4. Correct the implementation of \Z (it is permitted to match on the \n at the |
---|
| 405 | end of the subject) and add 5.005's \z, which really does match only at the |
---|
| 406 | very end of the subject. |
---|
| 407 | |
---|
| 408 | 5. Remove the \X "cut" facility; Perl doesn't have it, and (?> is neater. |
---|
| 409 | |
---|
| 410 | 6. Remove the ability to specify CASELESS, MULTILINE, DOTALL, and |
---|
| 411 | DOLLAR_ENDONLY at runtime, to make it possible to implement the Perl 5.005 |
---|
| 412 | localized options. All options to pcre_study() were also removed. |
---|
| 413 | |
---|
| 414 | 7. Add other new features from 5.005: |
---|
| 415 | |
---|
| 416 | (?<= positive lookbehind |
---|
| 417 | (?<! negative lookbehind |
---|
| 418 | (?imsx-imsx) added the unsetting capability |
---|
| 419 | such a setting is global if at outer level; local otherwise |
---|
| 420 | (?imsx-imsx:) non-capturing groups with option setting |
---|
| 421 | (?(cond)re|re) conditional pattern matching |
---|
| 422 | |
---|
| 423 | A backreference to itself in a repeated group matches the previous |
---|
| 424 | captured string. |
---|
| 425 | |
---|
| 426 | 8. General tidying up of studying (both automatic and via "study") |
---|
| 427 | consequential on the addition of new assertions. |
---|
| 428 | |
---|
| 429 | 9. As in 5.005, unlimited repeated groups that could match an empty substring |
---|
| 430 | are no longer faulted at compile time. Instead, the loop is forcibly broken at |
---|
| 431 | runtime if any iteration does actually match an empty substring. |
---|
| 432 | |
---|
| 433 | 10. Include the RunTest script in the distribution. |
---|
| 434 | |
---|
| 435 | 11. Added tests from the Perl 5.005_02 distribution. This showed up a few |
---|
| 436 | discrepancies, some of which were old and were also with respect to 5.004. They |
---|
| 437 | have now been fixed. |
---|
| 438 | |
---|
| 439 | |
---|
| 440 | Version 1.09 28-Apr-98 |
---|
| 441 | ---------------------- |
---|
| 442 | |
---|
| 443 | 1. A negated single character class followed by a quantifier with a minimum |
---|
| 444 | value of one (e.g. [^x]{1,6} ) was not compiled correctly. This could lead to |
---|
| 445 | program crashes, or just wrong answers. This did not apply to negated classes |
---|
| 446 | containing more than one character, or to minima other than one. |
---|
| 447 | |
---|
| 448 | |
---|
| 449 | Version 1.08 27-Mar-98 |
---|
| 450 | ---------------------- |
---|
| 451 | |
---|
| 452 | 1. Add PCRE_UNGREEDY to invert the greediness of quantifiers. |
---|
| 453 | |
---|
| 454 | 2. Add (?U) and (?X) to set PCRE_UNGREEDY and PCRE_EXTRA respectively. The |
---|
| 455 | latter must appear before anything that relies on it in the pattern. |
---|
| 456 | |
---|
| 457 | |
---|
| 458 | Version 1.07 16-Feb-98 |
---|
| 459 | ---------------------- |
---|
| 460 | |
---|
| 461 | 1. A pattern such as /((a)*)*/ was not being diagnosed as in error (unlimited |
---|
| 462 | repeat of a potentially empty string). |
---|
| 463 | |
---|
| 464 | |
---|
| 465 | Version 1.06 23-Jan-98 |
---|
| 466 | ---------------------- |
---|
| 467 | |
---|
| 468 | 1. Added Markus Oberhumer's little patches for C++. |
---|
| 469 | |
---|
| 470 | 2. Literal strings longer than 255 characters were broken. |
---|
| 471 | |
---|
| 472 | |
---|
| 473 | Version 1.05 23-Dec-97 |
---|
| 474 | ---------------------- |
---|
| 475 | |
---|
| 476 | 1. Negated character classes containing more than one character were failing if |
---|
| 477 | PCRE_CASELESS was set at run time. |
---|
| 478 | |
---|
| 479 | |
---|
| 480 | Version 1.04 19-Dec-97 |
---|
| 481 | ---------------------- |
---|
| 482 | |
---|
| 483 | 1. Corrected the man page, where some "const" qualifiers had been omitted. |
---|
| 484 | |
---|
| 485 | 2. Made debugging output print "{0,xxx}" instead of just "{,xxx}" to agree with |
---|
| 486 | input syntax. |
---|
| 487 | |
---|
| 488 | 3. Fixed memory leak which occurred when a regex with back references was |
---|
| 489 | matched with an offsets vector that wasn't big enough. The temporary memory |
---|
| 490 | that is used in this case wasn't being freed if the match failed. |
---|
| 491 | |
---|
| 492 | 4. Tidied pcretest to ensure it frees memory that it gets. |
---|
| 493 | |
---|
| 494 | 5. Temporary memory was being obtained in the case where the passed offsets |
---|
| 495 | vector was exactly big enough. |
---|
| 496 | |
---|
| 497 | 6. Corrected definition of offsetof() from change 5 below. |
---|
| 498 | |
---|
| 499 | 7. I had screwed up change 6 below and broken the rules for the use of |
---|
| 500 | setjmp(). Now fixed. |
---|
| 501 | |
---|
| 502 | |
---|
| 503 | Version 1.03 18-Dec-97 |
---|
| 504 | ---------------------- |
---|
| 505 | |
---|
| 506 | 1. An erroneous regex with a missing opening parenthesis was correctly |
---|
| 507 | diagnosed, but PCRE attempted to access brastack[-1], which could cause crashes |
---|
| 508 | on some systems. |
---|
| 509 | |
---|
| 510 | 2. Replaced offsetof(real_pcre, code) by offsetof(real_pcre, code[0]) because |
---|
| 511 | it was reported that one broken compiler failed on the former because "code" is |
---|
| 512 | also an independent variable. |
---|
| 513 | |
---|
| 514 | 3. The erroneous regex a[]b caused an array overrun reference. |
---|
| 515 | |
---|
| 516 | 4. A regex ending with a one-character negative class (e.g. /[^k]$/) did not |
---|
| 517 | fail on data ending with that character. (It was going on too far, and checking |
---|
| 518 | the next character, typically a binary zero.) This was specific to the |
---|
| 519 | optimized code for single-character negative classes. |
---|
| 520 | |
---|
| 521 | 5. Added a contributed patch from the TIN world which does the following: |
---|
| 522 | |
---|
| 523 | + Add an undef for memmove, in case the system defines a macro for it. |
---|
| 524 | |
---|
| 525 | + Add a definition of offsetof(), in case there isn't one. (I don't know |
---|
| 526 | the reason behind this - offsetof() is part of the ANSI standard - but |
---|
| 527 | it does no harm). |
---|
| 528 | |
---|
| 529 | + Reduce the ifdef's in pcre.c using macro DPRINTF, thereby eliminating |
---|
| 530 | most of the places where whitespace preceded '#'. I have given up and |
---|
| 531 | allowed the remaining 2 cases to be at the margin. |
---|
| 532 | |
---|
| 533 | + Rename some variables in pcre to eliminate shadowing. This seems very |
---|
| 534 | pedantic, but does no harm, of course. |
---|
| 535 | |
---|
| 536 | 6. Moved the call to setjmp() into its own function, to get rid of warnings |
---|
| 537 | from gcc -Wall, and avoided calling it at all unless PCRE_EXTRA is used. |
---|
| 538 | |
---|
| 539 | 7. Constructs such as \d{8,} were compiling into the equivalent of |
---|
| 540 | \d{8}\d{0,65527} instead of \d{8}\d* which didn't make much difference to the |
---|
| 541 | outcome, but in this particular case used more store than had been allocated, |
---|
| 542 | which caused the bug to be discovered because it threw up an internal error. |
---|
| 543 | |
---|
| 544 | 8. The debugging code in both pcre and pcretest for outputting the compiled |
---|
| 545 | form of a regex was going wrong in the case of back references followed by |
---|
| 546 | curly-bracketed repeats. |
---|
| 547 | |
---|
| 548 | |
---|
| 549 | Version 1.02 12-Dec-97 |
---|
| 550 | ---------------------- |
---|
| 551 | |
---|
| 552 | 1. Typos in pcre.3 and comments in the source fixed. |
---|
| 553 | |
---|
| 554 | 2. Applied a contributed patch to get rid of places where it used to remove |
---|
| 555 | 'const' from variables, and fixed some signed/unsigned and uninitialized |
---|
| 556 | variable warnings. |
---|
| 557 | |
---|
| 558 | 3. Added the "runtest" target to Makefile. |
---|
| 559 | |
---|
| 560 | 4. Set default compiler flag to -O2 rather than just -O. |
---|
| 561 | |
---|
| 562 | |
---|
| 563 | Version 1.01 19-Nov-97 |
---|
| 564 | ---------------------- |
---|
| 565 | |
---|
| 566 | 1. PCRE was failing to diagnose unlimited repeat of empty string for patterns |
---|
| 567 | like /([ab]*)*/, that is, for classes with more than one character in them. |
---|
| 568 | |
---|
| 569 | 2. Likewise, it wasn't diagnosing patterns with "once-only" subpatterns, such |
---|
| 570 | as /((?>a*))*/ (a PCRE_EXTRA facility). |
---|
| 571 | |
---|
| 572 | |
---|
| 573 | Version 1.00 18-Nov-97 |
---|
| 574 | ---------------------- |
---|
| 575 | |
---|
| 576 | 1. Added compile-time macros to support systems such as SunOS4 which don't have |
---|
| 577 | memmove() or strerror() but have other things that can be used instead. |
---|
| 578 | |
---|
| 579 | 2. Arranged that "make clean" removes the executables. |
---|
| 580 | |
---|
| 581 | |
---|
| 582 | Version 0.99 27-Oct-97 |
---|
| 583 | ---------------------- |
---|
| 584 | |
---|
| 585 | 1. Fixed bug in code for optimizing classes with only one character. It was |
---|
| 586 | initializing a 32-byte map regardless, which could cause it to run off the end |
---|
| 587 | of the memory it had got. |
---|
| 588 | |
---|
| 589 | 2. Added, conditional on PCRE_EXTRA, the proposed (?>REGEX) construction. |
---|
| 590 | |
---|
| 591 | |
---|
| 592 | Version 0.98 22-Oct-97 |
---|
| 593 | ---------------------- |
---|
| 594 | |
---|
| 595 | 1. Fixed bug in code for handling temporary memory usage when there are more |
---|
| 596 | back references than supplied space in the ovector. This could cause segfaults. |
---|
| 597 | |
---|
| 598 | |
---|
| 599 | Version 0.97 21-Oct-97 |
---|
| 600 | ---------------------- |
---|
| 601 | |
---|
| 602 | 1. Added the \X "cut" facility, conditional on PCRE_EXTRA. |
---|
| 603 | |
---|
| 604 | 2. Optimized negated single characters not to use a bit map. |
---|
| 605 | |
---|
| 606 | 3. Brought error texts together as macro definitions; clarified some of them; |
---|
| 607 | fixed one that was wrong - it said "range out of order" when it meant "invalid |
---|
| 608 | escape sequence". |
---|
| 609 | |
---|
| 610 | 4. Changed some char * arguments to const char *. |
---|
| 611 | |
---|
| 612 | 5. Added PCRE_NOTBOL and PCRE_NOTEOL (from POSIX). |
---|
| 613 | |
---|
| 614 | 6. Added the POSIX-style API wrapper in pcreposix.a and testing facilities in |
---|
| 615 | pcretest. |
---|
| 616 | |
---|
| 617 | |
---|
| 618 | Version 0.96 16-Oct-97 |
---|
| 619 | ---------------------- |
---|
| 620 | |
---|
| 621 | 1. Added a simple "pgrep" utility to the distribution. |
---|
| 622 | |
---|
| 623 | 2. Fixed an incompatibility with Perl: "{" is now treated as a normal character |
---|
| 624 | unless it appears in one of the precise forms "{ddd}", "{ddd,}", or "{ddd,ddd}" |
---|
| 625 | where "ddd" means "one or more decimal digits". |
---|
| 626 | |
---|
| 627 | 3. Fixed serious bug. If a pattern had a back reference, but the call to |
---|
| 628 | pcre_exec() didn't supply a large enough ovector to record the related |
---|
| 629 | identifying subpattern, the match always failed. PCRE now remembers the number |
---|
| 630 | of the largest back reference, and gets some temporary memory in which to save |
---|
| 631 | the offsets during matching if necessary, in order to ensure that |
---|
| 632 | backreferences always work. |
---|
| 633 | |
---|
| 634 | 4. Increased the compatibility with Perl in a number of ways: |
---|
| 635 | |
---|
| 636 | (a) . no longer matches \n by default; an option PCRE_DOTALL is provided |
---|
| 637 | to request this handling. The option can be set at compile or exec time. |
---|
| 638 | |
---|
| 639 | (b) $ matches before a terminating newline by default; an option |
---|
| 640 | PCRE_DOLLAR_ENDONLY is provided to override this (but not in multiline |
---|
| 641 | mode). The option can be set at compile or exec time. |
---|
| 642 | |
---|
| 643 | (c) The handling of \ followed by a digit other than 0 is now supposed to be |
---|
| 644 | the same as Perl's. If the decimal number it represents is less than 10 |
---|
| 645 | or there aren't that many previous left capturing parentheses, an octal |
---|
| 646 | escape is read. Inside a character class, it's always an octal escape, |
---|
| 647 | even if it is a single digit. |
---|
| 648 | |
---|
| 649 | (d) An escaped but undefined alphabetic character is taken as a literal, |
---|
| 650 | unless PCRE_EXTRA is set. Currently this just reserves the remaining |
---|
| 651 | escapes. |
---|
| 652 | |
---|
| 653 | (e) {0} is now permitted. (The previous item is removed from the compiled |
---|
| 654 | pattern.) |
---|
| 655 | |
---|
| 656 | 5. Changed all the names of code files so that the basic parts are no longer |
---|
| 657 | than 10 characters, and abolished the teeny "globals.c" file. |
---|
| 658 | |
---|
| 659 | 6. Changed the handling of character classes; they are now done with a 32-byte |
---|
| 660 | bit map always. |
---|
| 661 | |
---|
| 662 | 7. Added the -d and /D options to pcretest to make it possible to look at the |
---|
| 663 | internals of compilation without having to recompile pcre. |
---|
| 664 | |
---|
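The rule in item 2 of this version (a "{" only starts a quantifier when it has one of the forms "{ddd}", "{ddd,}" or "{ddd,ddd}") can be sketched as a small check. This is an illustrative sketch only, not PCRE's actual code; the names is_counted_repeat and is_dec_digit are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: 1 if c is a decimal digit, else 0. */
static int is_dec_digit(char c)
{
    return c >= '0' && c <= '9';
}

/*
 * Hypothetical helper, not a PCRE function.
 * Returns 1 if p points at "{ddd}", "{ddd,}" or "{ddd,ddd}", where ddd is
 * one or more decimal digits; returns 0 otherwise, in which case the "{"
 * would be treated as a normal character.
 */
static int is_counted_repeat(const char *p)
{
    assert(p != NULL);

    if (*p++ != '{') return 0;

    /* The minimum is mandatory: at least one digit must follow "{". */
    if (!is_dec_digit(*p)) return 0;
    while (is_dec_digit(*p)) p++;

    if (*p == '}') return 1;        /* {ddd}  */
    if (*p++ != ',') return 0;
    if (*p == '}') return 1;        /* {ddd,} */

    /* If a maximum is present it must also be one or more digits. */
    if (!is_dec_digit(*p)) return 0;
    while (is_dec_digit(*p)) p++;

    return *p == '}';               /* {ddd,ddd} */
}
```

Under this sketch "{3}", "{3,}" and "{3,5}" are accepted as quantifiers, while "{}", "{,5}" and "{ 3}" are not, so their "{" would be read as a literal.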
| 665 | |
---|
| 666 | Version 0.95 23-Sep-97 |
---|
| 667 | ---------------------- |
---|
| 668 | |
---|
| 669 | 1. Fixed bug in pre-pass concerning escaped "normal" characters such as \x5c or |
---|
| 670 | \x20 at the start of a run of normal characters. These were being treated as |
---|
| 671 | real characters, instead of the source characters being re-checked. |
---|
| 672 | |
---|
| 673 | |
---|
| 674 | Version 0.94 18-Sep-97 |
---|
| 675 | ---------------------- |
---|
| 676 | |
---|
| 677 | 1. The functions are now thread-safe, with the caveat that the global variables |
---|
| 678 | containing pointers to malloc() and free() or alternative functions are the |
---|
| 679 | same for all threads. |
---|
| 680 | |
---|
| 681 | 2. Get pcre_study() to generate a bitmap of initial characters for non- |
---|
| 682 | anchored patterns when this is possible, and use it if passed to pcre_exec(). |
---|
| 683 | |
---|
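The idea behind item 2 of this version, a bitmap of possible initial characters used to skip hopeless start positions, can be sketched as follows. This is a sketch under stated assumptions, not PCRE's implementation: the 32-byte size (one bit per byte value) and the names START_MAP_BYTES, start_map_set and next_candidate are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed size: 32 bytes * 8 bits = 256 bits, one per possible byte value. */
#define START_MAP_BYTES 32

/* Mark c as a character that a match is allowed to start with. */
static void start_map_set(unsigned char *map, unsigned char c)
{
    assert(map != NULL);
    map[c / 8] |= (unsigned char)(1u << (c % 8));
}

/*
 * Return the first offset >= from at which the subject character is in the
 * map, or length if there is none. A matcher for a non-anchored pattern
 * would only need to attempt a match at the returned offsets.
 */
static size_t next_candidate(const unsigned char *map, const char *subject,
                             size_t length, size_t from)
{
    size_t i;

    assert(map != NULL && subject != NULL);

    for (i = from; i < length; i++) {
        unsigned char c = (unsigned char)subject[i];
        if ((map[c / 8] >> (c % 8)) & 1)
            return i;
    }
    return length;
}
```

For a pattern such as /(b)|(:+)/ the map would hold just 'b' and ':', so scanning the subject "xxb:yy" from offset 0 yields offset 2 as the first place worth trying. Such a map cannot be built for a pattern that can match the empty string, which is one reason the entry says "when this is possible".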
| 684 | |
---|
| 685 | Version 0.93 15-Sep-97 |
---|
| 686 | ---------------------- |
---|
| 687 | |
---|
| 688 | 1. /(b)|(:+)/ was computing an incorrect first character. |
---|
| 689 | |
---|
| 690 | 2. Add pcre_study() to the API and the passing of pcre_extra to pcre_exec(), |
---|
| 691 | but not actually doing anything yet. |
---|
| 692 | |
---|
| 693 | 3. Treat "-" characters in classes that cannot be part of ranges as literals, |
---|
| 694 | as Perl does (e.g. [-az] or [az-]). |
---|
| 695 | |
---|
| 696 | 4. Set the anchored flag if a branch starts with .* or .*? because that tests |
---|
| 697 | all possible positions. |
---|
| 698 | |
---|
| 699 | 5. Split up into different modules to avoid including unneeded functions in a |
---|
| 700 | compiled binary. However, compile and exec are still in one module. The "study" |
---|
| 701 | function is split off. |
---|
| 702 | |
---|
| 703 | 6. The character tables are now in a separate module whose source is generated |
---|
| 704 | by an auxiliary program - but can then be edited by hand if required. There are |
---|
| 705 | now no calls to isalnum(), isspace(), isdigit(), isxdigit(), tolower() or |
---|
| 706 | toupper() in the code. |
---|
| 707 | |
---|
| 708 | 7. Turn the malloc/free function variables into pcre_malloc and pcre_free and |
---|
| 709 | make them global. Abolish the function for setting them, as the caller can now |
---|
| 710 | set them directly. |
---|
| 711 | |
---|
| 712 | |
---|
| 713 | Version 0.92 11-Sep-97 |
---|
| 714 | ---------------------- |
---|
| 715 | |
---|
| 716 | 1. A repeat with a fixed maximum and a minimum of 1 for an ordinary character |
---|
| 717 | (e.g. /a{1,3}/) was broken (I mis-optimized it). |
---|
| 718 | |
---|
| 719 | 2. Caseless matching was not working in character classes if the characters in |
---|
| 720 | the pattern were in upper case. |
---|
| 721 | |
---|
| 722 | 3. Make ranges like [W-c] work in the same way as Perl for caseless matching. |
---|
| 723 | |
---|
| 724 | 4. Make PCRE_ANCHORED public and accept as a compile option. |
---|
| 725 | |
---|
| 726 | 5. Add an options word to pcre_exec() and accept PCRE_ANCHORED and |
---|
| 727 | PCRE_CASELESS at run time. Add escapes \A and \I to pcretest to cause it to |
---|
| 728 | pass them. |
---|
| 729 | |
---|
| 730 | 6. Give an error if bad option bits passed at compile or run time. |
---|
| 731 | |
---|
| 732 | 7. Add PCRE_MULTILINE at compile and exec time, and (?m) as well. Add \M to |
---|
| 733 | pcretest to cause it to pass that flag. |
---|
| 734 | |
---|
| 735 | 8. Add pcre_info(), to get the number of identifying subpatterns, the stored |
---|
| 736 | options, and the first character, if set. |
---|
| 737 | |
---|
| 738 | 9. Recognize C+ or C{n,m} where n >= 1 as providing a fixed starting character. |
---|
| 739 | |
---|
| 740 | |
---|
| 741 | Version 0.91 10-Sep-97 |
---|
| 742 | ---------------------- |
---|
| 743 | |
---|
| 744 | 1. PCRE was failing to diagnose unlimited repeats of subpatterns that could |
---|
| 745 | match the empty string as in /(a*)*/. It was looping and ultimately crashing. |
---|
| 746 | |
---|
| 747 | 2. PCRE was looping on encountering an indefinitely repeated back reference to |
---|
| 748 | a subpattern that had matched an empty string, e.g. /(a|)\1*/. It now does what |
---|
| 749 | Perl does - treats the match as successful. |
---|
| 750 | |
---|
| 751 | **** |
---|