|
|
@ -271,11 +271,16 @@ NAME |
|
|
PCRE BUILD-TIME OPTIONS |
|
|
PCRE BUILD-TIME OPTIONS |
|
|
|
|
|
|
|
|
This document describes the optional features of PCRE that can be |
|
|
This document describes the optional features of PCRE that can be |
|
|
selected when the library is compiled. They are all selected, or dese- |
|
|
|
|
|
lected, by providing options to the configure script that is run before |
|
|
|
|
|
the make command. The complete list of options for configure (which |
|
|
|
|
|
includes the standard ones such as the selection of the installation |
|
|
|
|
|
directory) can be obtained by running |
|
|
|
|
|
|
|
|
selected when the library is compiled. It assumes use of the configure |
|
|
|
|
|
script, where the optional features are selected or deselected by pro- |
|
|
|
|
|
viding options to configure before running the make command. However, |
|
|
|
|
|
the same options can be selected in both Unix-like and non-Unix-like |
|
|
|
|
|
environments using the GUI facility of CMakeSetup if you are using |
|
|
|
|
|
CMake instead of configure to build PCRE. |
|
|
|
|
|
|
|
|
|
|
|
The complete list of options for configure (which includes the standard |
|
|
|
|
|
ones such as the selection of the installation directory) can be |
|
|
|
|
|
obtained by running |
|
|
|
|
|
|
|
|
./configure --help |
|
|
./configure --help |
|
|
|
|
|
|
|
|
@ -361,6 +366,19 @@ CODE VALUE OF NEWLINE |
|
|
conventional to use the standard for your operating system. |
|
|
conventional to use the standard for your operating system. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
WHAT \R MATCHES |
|
|
|
|
|
|
|
|
|
|
|
By default, the sequence \R in a pattern matches any Unicode newline |
|
|
|
|
|
sequence, whatever has been selected as the line ending sequence. If |
|
|
|
|
|
you specify |
|
|
|
|
|
|
|
|
|
|
|
--enable-bsr-anycrlf |
|
|
|
|
|
|
|
|
|
|
|
the default is changed so that \R matches only CR, LF, or CRLF. What- |
|
|
|
|
|
ever is selected when PCRE is built can be overridden when the library |
|
|
|
|
|
functions are called. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
BUILDING SHARED AND STATIC LIBRARIES |
|
|
BUILDING SHARED AND STATIC LIBRARIES |
|
|
|
|
|
|
|
|
The PCRE building process uses libtool to build both shared and static |
|
|
The PCRE building process uses libtool to build both shared and static |
|
|
@ -499,6 +517,33 @@ USING EBCDIC CODE |
|
|
environment (for example, an IBM mainframe operating system). |
|
|
environment (for example, an IBM mainframe operating system). |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT |
|
|
|
|
|
|
|
|
|
|
|
By default, pcregrep reads all files as plain text. You can build it so |
|
|
|
|
|
that it recognizes files whose names end in .gz or .bz2, and reads them |
|
|
|
|
|
with libz or libbz2, respectively, by adding one or both of |
|
|
|
|
|
|
|
|
|
|
|
--enable-pcregrep-libz |
|
|
|
|
|
--enable-pcregrep-libbz2 |
|
|
|
|
|
|
|
|
|
|
|
to the configure command. These options naturally require that the rel- |
|
|
|
|
|
evant libraries are installed on your system. Configuration will fail |
|
|
|
|
|
if they are not. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
PCRETEST OPTION FOR LIBREADLINE SUPPORT |
|
|
|
|
|
|
|
|
|
|
|
If you add |
|
|
|
|
|
|
|
|
|
|
|
--enable-pcretest-libreadline |
|
|
|
|
|
|
|
|
|
|
|
to the configure command, pcretest is linked with the libreadline |
|
|
|
|
|
library, and when its input is from a terminal, it reads it using the |
|
|
|
|
|
readline() function. This provides line-editing and history facilities. |
|
|
|
|
|
Note that libreadline is GPL-licenced, so if you distribute a binary of |
|
|
|
|
|
pcretest linked in this way, there may be licensing issues. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
SEE ALSO |
|
|
SEE ALSO |
|
|
|
|
|
|
|
|
pcreapi(3), pcre_config(3). |
|
|
pcreapi(3), pcre_config(3). |
|
|
@ -513,7 +558,7 @@ AUTHOR |
|
|
|
|
|
|
|
|
REVISION |
|
|
REVISION |
|
|
|
|
|
|
|
|
Last updated: 30 July 2007 |
|
|
|
|
|
|
|
|
Last updated: 18 December 2007 |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
------------------------------------------------------------------------------ |
|
|
------------------------------------------------------------------------------ |
|
|
|
|
|
|
|
|
@ -824,7 +869,7 @@ PCRE API OVERVIEW |
|
|
a Perl-compatible manner. A sample program that demonstrates the sim- |
|
|
a Perl-compatible manner. A sample program that demonstrates the sim- |
|
|
plest way of using them is provided in the file called pcredemo.c in |
|
|
plest way of using them is provided in the file called pcredemo.c in |
|
|
the source distribution. The pcresample documentation describes how to |
|
|
the source distribution. The pcresample documentation describes how to |
|
|
run it. |
|
|
|
|
|
|
|
|
compile and run it. |
|
|
|
|
|
|
|
|
A second matching function, pcre_dfa_exec(), which is not Perl-compati- |
|
|
A second matching function, pcre_dfa_exec(), which is not Perl-compati- |
|
|
ble, is also provided. This uses a different algorithm for the match- |
|
|
ble, is also provided. This uses a different algorithm for the match- |
|
|
@ -919,8 +964,11 @@ NEWLINES |
|
|
dollar metacharacters, the handling of #-comments in /x mode, and, when |
|
|
dollar metacharacters, the handling of #-comments in /x mode, and, when |
|
|
CRLF is a recognized line ending sequence, the match position advance- |
|
|
CRLF is a recognized line ending sequence, the match position advance- |
|
|
ment for a non-anchored pattern. There is more detail about this in the |
|
|
ment for a non-anchored pattern. There is more detail about this in the |
|
|
section on pcre_exec() options below. The choice of newline convention |
|
|
|
|
|
does not affect the interpretation of the \n or \r escape sequences. |
|
|
|
|
|
|
|
|
section on pcre_exec() options below. |
|
|
|
|
|
|
|
|
|
|
|
The choice of newline convention does not affect the interpretation of |
|
|
|
|
|
the \n or \r escape sequences, nor does it affect what \R matches, |
|
|
|
|
|
which is controlled in a similar way, but by separate options. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
MULTITHREADING |
|
|
MULTITHREADING |
|
|
@ -977,6 +1025,14 @@ CHECKING BUILD-TIME OPTIONS |
|
|
and -1 for ANY. The default should normally be the standard sequence |
|
|
and -1 for ANY. The default should normally be the standard sequence |
|
|
for your operating system. |
|
|
for your operating system. |
|
|
|
|
|
|
|
|
|
|
|
PCRE_CONFIG_BSR |
|
|
|
|
|
|
|
|
|
|
|
The output is an integer whose value indicates what character sequences |
|
|
|
|
|
the \R escape sequence matches by default. A value of 0 means that \R |
|
|
|
|
|
matches any Unicode line ending sequence; a value of 1 means that \R |
|
|
|
|
|
matches only CR, LF, or CRLF. The default can be overridden when a pat- |
|
|
|
|
|
tern is compiled or matched. |
|
|
|
|
|
|
|
|
PCRE_CONFIG_LINK_SIZE |
|
|
PCRE_CONFIG_LINK_SIZE |
|
|
|
|
|
|
|
|
The output is an integer that contains the number of bytes used for |
|
|
The output is an integer that contains the number of bytes used for |
|
|
@ -1106,6 +1162,15 @@ COMPILING A PATTERN |
|
|
all with number 255, before each pattern item. For discussion of the |
|
|
all with number 255, before each pattern item. For discussion of the |
|
|
callout facility, see the pcrecallout documentation. |
|
|
callout facility, see the pcrecallout documentation. |
|
|
|
|
|
|
|
|
|
|
|
PCRE_BSR_ANYCRLF |
|
|
|
|
|
PCRE_BSR_UNICODE |
|
|
|
|
|
|
|
|
|
|
|
These options (which are mutually exclusive) control what the \R escape |
|
|
|
|
|
sequence matches. The choice is either to match only CR, LF, or CRLF, |
|
|
|
|
|
or to match any Unicode newline sequence. The default is specified when |
|
|
|
|
|
PCRE is built. It can be overridden from within the pattern, or by set- |
|
|
|
|
|
ting an option when a compiled pattern is matched. |
|
|
|
|
|
|
|
|
PCRE_CASELESS |
|
|
PCRE_CASELESS |
|
|
|
|
|
|
|
|
If this bit is set, letters in the pattern match both upper and lower |
|
|
If this bit is set, letters in the pattern match both upper and lower |
|
|
@ -1291,7 +1356,7 @@ COMPILATION ERROR CODES |
|
|
9 nothing to repeat |
|
|
9 nothing to repeat |
|
|
10 [this code is not in use] |
|
|
10 [this code is not in use] |
|
|
11 internal error: unexpected repeat |
|
|
11 internal error: unexpected repeat |
|
|
12 unrecognized character after (? |
|
|
|
|
|
|
|
|
12 unrecognized character after (? or (?- |
|
|
13 POSIX named classes are supported only within a class |
|
|
13 POSIX named classes are supported only within a class |
|
|
14 missing ) |
|
|
14 missing ) |
|
|
15 reference to non-existent subpattern |
|
|
15 reference to non-existent subpattern |
|
|
@ -1299,7 +1364,7 @@ COMPILATION ERROR CODES |
|
|
17 unknown option bit(s) set |
|
|
17 unknown option bit(s) set |
|
|
18 missing ) after comment |
|
|
18 missing ) after comment |
|
|
19 [this code is not in use] |
|
|
19 [this code is not in use] |
|
|
20 regular expression too large |
|
|
|
|
|
|
|
|
20 regular expression is too large |
|
|
21 failed to get memory |
|
|
21 failed to get memory |
|
|
22 unmatched parentheses |
|
|
22 unmatched parentheses |
|
|
23 internal error: code overflow |
|
|
23 internal error: code overflow |
|
|
@ -1328,7 +1393,7 @@ COMPILATION ERROR CODES |
|
|
46 malformed \P or \p sequence |
|
|
46 malformed \P or \p sequence |
|
|
47 unknown property name after \P or \p |
|
|
47 unknown property name after \P or \p |
|
|
48 subpattern name is too long (maximum 32 characters) |
|
|
48 subpattern name is too long (maximum 32 characters) |
|
|
49 too many named subpatterns (maximum 10,000) |
|
|
|
|
|
|
|
|
49 too many named subpatterns (maximum 10000) |
|
|
50 [this code is not in use] |
|
|
50 [this code is not in use] |
|
|
51 octal value is greater than \377 (not in UTF-8 mode) |
|
|
51 octal value is greater than \377 (not in UTF-8 mode) |
|
|
52 internal error: overran compiling workspace |
|
|
52 internal error: overran compiling workspace |
|
|
@ -1336,10 +1401,18 @@ COMPILATION ERROR CODES |
|
|
found |
|
|
found |
|
|
54 DEFINE group contains more than one branch |
|
|
54 DEFINE group contains more than one branch |
|
|
55 repeating a DEFINE group is not allowed |
|
|
55 repeating a DEFINE group is not allowed |
|
|
56 inconsistent NEWLINE options" |
|
|
|
|
|
|
|
|
56 inconsistent NEWLINE options |
|
|
57 \g is not followed by a braced name or an optionally braced |
|
|
57 \g is not followed by a braced name or an optionally braced |
|
|
non-zero number |
|
|
non-zero number |
|
|
58 (?+ or (?- or (?(+ or (?(- must be followed by a non-zero number |
|
|
58 (?+ or (?- or (?(+ or (?(- must be followed by a non-zero number |
|
|
|
|
|
59 (*VERB) with an argument is not supported |
|
|
|
|
|
60 (*VERB) not recognized |
|
|
|
|
|
61 number is too big |
|
|
|
|
|
62 subpattern name expected |
|
|
|
|
|
63 digit expected after (?+ |
|
|
|
|
|
|
|
|
|
|
|
The numbers 32 and 10000 in errors 48 and 49 are defaults; different |
|
|
|
|
|
values may be used if the limits were changed when PCRE was built. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
STUDYING A PATTERN |
|
|
STUDYING A PATTERN |
|
|
@ -1532,13 +1605,14 @@ INFORMATION ABOUT A PATTERN |
|
|
|
|
|
|
|
|
Return 1 if the pattern contains any explicit matches for CR or LF |
|
|
Return 1 if the pattern contains any explicit matches for CR or LF |
|
|
characters, otherwise 0. The fourth argument should point to an int |
|
|
characters, otherwise 0. The fourth argument should point to an int |
|
|
variable. |
|
|
|
|
|
|
|
|
variable. An explicit match is either a literal CR or LF character, or |
|
|
|
|
|
\r or \n. |
|
|
|
|
|
|
|
|
PCRE_INFO_JCHANGED |
|
|
PCRE_INFO_JCHANGED |
|
|
|
|
|
|
|
|
Return 1 if the (?J) option setting is used in the pattern, otherwise |
|
|
|
|
|
0. The fourth argument should point to an int variable. The (?J) inter- |
|
|
|
|
|
nal option setting changes the local PCRE_DUPNAMES option. |
|
|
|
|
|
|
|
|
Return 1 if the (?J) or (?-J) option setting is used in the pattern, |
|
|
|
|
|
otherwise 0. The fourth argument should point to an int variable. (?J) |
|
|
|
|
|
and (?-J) set and unset the local PCRE_DUPNAMES option, respectively. |
|
|
|
|
|
|
|
|
PCRE_INFO_LASTLITERAL |
|
|
PCRE_INFO_LASTLITERAL |
|
|
|
|
|
|
|
|
@ -1813,6 +1887,14 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION |
|
|
turned out to be anchored by virtue of its contents, it cannot be made |
|
|
turned out to be anchored by virtue of its contents, it cannot be made |
|
|
unanchored at matching time. |
|
|
unanchored at matching time. |
|
|
|
|
|
|
|
|
|
|
|
PCRE_BSR_ANYCRLF |
|
|
|
|
|
PCRE_BSR_UNICODE |
|
|
|
|
|
|
|
|
|
|
|
These options (which are mutually exclusive) control what the \R escape |
|
|
|
|
|
sequence matches. The choice is either to match only CR, LF, or CRLF, |
|
|
|
|
|
or to match any Unicode newline sequence. These options override the |
|
|
|
|
|
choice that was made or defaulted when the pattern was compiled. |
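The order in which the \R settings take effect can be summarized in a short sketch. This is an illustrative model in Python, not PCRE code: the function name effective_bsr and its string arguments are hypothetical. It assumes the precedence described in this documentation: the build-time default, then an option given to pcre_compile(), then a leading (*BSR_...) sequence in the pattern (the last one wins), then an option given to pcre_exec().

```python
import re

# Leading special sequences named in this documentation. Only the
# (*BSR_...) ones affect \R; the others set the newline convention.
_PREFIX = re.compile(r"\(\*(BSR_ANYCRLF|BSR_UNICODE|CR|LF|CRLF|ANYCRLF|ANY)\)")


def effective_bsr(build_default, compile_opt=None, pattern="", exec_opt=None):
    """Return "ANYCRLF" or "UNICODE" for a hypothetical compile-and-match call.

    All parameter names are assumptions made for this sketch.
    """
    setting = build_default
    if compile_opt is not None:          # option passed at compile time
        setting = compile_opt
    pos = 0
    while True:                          # scan sequences at the very start only
        m = _PREFIX.match(pattern, pos)
        if not m:
            break
        if m.group(1).startswith("BSR_"):
            setting = m.group(1)[4:]     # a later sequence replaces an earlier one
        pos = m.end()
    if exec_opt is not None:             # option passed at match time wins
        setting = exec_opt
    return setting
```

For instance, effective_bsr("UNICODE", "ANYCRLF", "(*BSR_UNICODE)abc") yields "UNICODE", because the leading sequence in the pattern overrides the compile-time option.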
|
|
|
|
|
|
|
|
PCRE_NEWLINE_CR |
|
|
PCRE_NEWLINE_CR |
|
|
PCRE_NEWLINE_LF |
|
|
PCRE_NEWLINE_LF |
|
|
PCRE_NEWLINE_CRLF |
|
|
PCRE_NEWLINE_CRLF |
|
|
@ -1829,7 +1911,7 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION |
|
|
When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is |
|
|
When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is |
|
|
set, and a match attempt for an unanchored pattern fails when the cur- |
|
|
set, and a match attempt for an unanchored pattern fails when the cur- |
|
|
rent position is at a CRLF sequence, and the pattern contains no |
|
|
rent position is at a CRLF sequence, and the pattern contains no |
|
|
explicit matches for CR or NL characters, the match position is |
|
|
|
|
|
|
|
|
explicit matches for CR or LF characters, the match position is |
|
|
advanced by two characters instead of one, in other words, to after the |
|
|
advanced by two characters instead of one, in other words, to after the |
|
|
CRLF. |
|
|
CRLF. |
|
|
|
|
|
|
|
|
@ -1839,9 +1921,12 @@ MATCHING A PATTERN: THE TRADITIONAL FUNCTION |
|
|
failing at the start, it skips both the CR and the LF before retrying. |
|
|
failing at the start, it skips both the CR and the LF before retrying. |
|
|
However, the pattern [\r\n]A does match that string, because it con- |
|
|
However, the pattern [\r\n]A does match that string, because it con- |
|
|
tains an explicit CR or LF reference, and so advances only by one char- |
|
|
tains an explicit CR or LF reference, and so advances only by one char- |
|
|
acter after the first failure. Note than an explicit CR or LF refer- |
|
|
|
|
|
ence occurs for negated character classes such as [^X] because they can |
|
|
|
|
|
match CR or LF characters. |
|
|
|
|
|
|
|
|
acter after the first failure. |
|
|
|
|
|
|
|
|
|
|
|
An explicit match for CR or LF is either a literal appearance of one of |
|
|
|
|
|
those characters, or one of the \r or \n escape sequences. Implicit |
|
|
|
|
|
matches such as [^X] do not count, nor does \s (which includes CR and |
|
|
|
|
|
LF in the characters that it matches). |
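The start-position rule described above can be modelled in a few lines. This is a simplified sketch in Python, not PCRE's implementation: the helper names are hypothetical, Python's re module stands in for the matcher at each position, and the check for explicit CR or LF references is a naive text search that only approximates what PCRE records at compile time.

```python
import re


def has_explicit_cr_or_lf(pattern):
    # Naive check (an assumption of this sketch): a literal CR or LF
    # character, or the two-character escapes \r and \n, anywhere in
    # the pattern text. Implicit matches such as [^X] or \s do not count.
    return any(tok in pattern for tok in ("\r", "\n", "\\r", "\\n"))


def search_crlf(pattern, subject):
    """Unanchored search modelling the advance rule when CRLF is a newline."""
    regex = re.compile(pattern)
    explicit = has_explicit_cr_or_lf(pattern)
    pos = 0
    while pos <= len(subject):
        m = regex.match(subject, pos)
        if m:
            return m.span()
        # After a failure at a CRLF, skip both characters unless the
        # pattern explicitly refers to CR or LF.
        if not explicit and subject.startswith("\r\n", pos):
            pos += 2
        else:
            pos += 1
    return None
```

In this model, search_crlf(r"[\r\n]A", "\r\nA") finds a match starting at the LF, whereas search_crlf(r"\sA", "\r\nA") finds none, because \s is only an implicit match and the start position jumps past the whole CRLF.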
|
|
|
|
|
|
|
|
Notwithstanding the above, anomalous effects may still occur when CRLF |
|
|
Notwithstanding the above, anomalous effects may still occur when CRLF |
|
|
is a valid newline sequence and explicit \r or \n escapes appear in the |
|
|
is a valid newline sequence and explicit \r or \n escapes appear in the |
|
|
@ -2480,8 +2565,8 @@ AUTHOR |
|
|
|
|
|
|
|
|
REVISION |
|
|
REVISION |
|
|
|
|
|
|
|
|
Last updated: 21 August 2007 |
|
|
|
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
|
|
|
|
|
|
Last updated: 23 January 2008 |
|
|
|
|
|
Copyright (c) 1997-2008 University of Cambridge. |
|
|
------------------------------------------------------------------------------ |
|
|
------------------------------------------------------------------------------ |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@ -2765,16 +2850,23 @@ DIFFERENCES BETWEEN PCRE AND PERL |
|
|
(f) The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAP- |
|
|
(f) The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAP- |
|
|
TURE options for pcre_exec() have no Perl equivalents. |
|
|
TURE options for pcre_exec() have no Perl equivalents. |
|
|
|
|
|
|
|
|
(g) The callout facility is PCRE-specific. |
|
|
|
|
|
|
|
|
(g) The \R escape sequence can be restricted to match only CR, LF, or |
|
|
|
|
|
CRLF by the PCRE_BSR_ANYCRLF option. |
|
|
|
|
|
|
|
|
|
|
|
(h) The callout facility is PCRE-specific. |
|
|
|
|
|
|
|
|
(h) The partial matching facility is PCRE-specific. |
|
|
|
|
|
|
|
|
(i) The partial matching facility is PCRE-specific. |
|
|
|
|
|
|
|
|
(i) Patterns compiled by PCRE can be saved and re-used at a later time, |
|
|
|
|
|
|
|
|
(j) Patterns compiled by PCRE can be saved and re-used at a later time, |
|
|
even on different hosts that have the other endianness. |
|
|
even on different hosts that have the other endianness. |
|
|
|
|
|
|
|
|
(j) The alternative matching function (pcre_dfa_exec()) matches in a |
|
|
|
|
|
|
|
|
(k) The alternative matching function (pcre_dfa_exec()) matches in a |
|
|
different way and is not Perl-compatible. |
|
|
different way and is not Perl-compatible. |
|
|
|
|
|
|
|
|
|
|
|
(l) PCRE recognizes some special sequences such as (*CR) at the start |
|
|
|
|
|
of a pattern that set overall options that cannot be changed within the |
|
|
|
|
|
pattern. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
AUTHOR |
|
|
AUTHOR |
|
|
|
|
|
|
|
|
@ -2785,7 +2877,7 @@ AUTHOR |
|
|
|
|
|
|
|
|
REVISION |
|
|
REVISION |
|
|
|
|
|
|
|
|
Last updated: 08 August 2007 |
|
|
|
|
|
|
|
|
Last updated: 11 September 2007 |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
------------------------------------------------------------------------------ |
|
|
------------------------------------------------------------------------------ |
|
|
|
|
|
|
|
|
@ -2853,7 +2945,14 @@ NEWLINE CONVENTIONS |
|
|
changes the convention to CR. That pattern matches "a\nb" because LF is |
|
|
changes the convention to CR. That pattern matches "a\nb" because LF is |
|
|
no longer a newline. Note that these special settings, which are not |
|
|
no longer a newline. Note that these special settings, which are not |
|
|
Perl-compatible, are recognized only at the very start of a pattern, |
|
|
Perl-compatible, are recognized only at the very start of a pattern, |
|
|
and that they must be in upper case. |
|
|
|
|
|
|
|
|
and that they must be in upper case. If more than one of them is |
|
|
|
|
|
present, the last one is used. |
|
|
|
|
|
|
|
|
|
|
|
The newline convention does not affect what the \R escape sequence |
|
|
|
|
|
matches. By default, this is any Unicode newline sequence, for Perl |
|
|
|
|
|
compatibility. However, this can be changed; see the description of \R |
|
|
|
|
|
in the section entitled "Newline sequences" below. A change of \R set- |
|
|
|
|
|
ting can be combined with a change of newline convention. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
CHARACTERS AND METACHARACTERS |
|
|
CHARACTERS AND METACHARACTERS |
|
|
@ -3128,9 +3227,9 @@ BACKSLASH |
|
|
|
|
|
|
|
|
Newline sequences |
|
|
Newline sequences |
|
|
|
|
|
|
|
|
Outside a character class, the escape sequence \R matches any Unicode |
|
|
|
|
|
newline sequence. This is a Perl 5.10 feature. In non-UTF-8 mode \R is |
|
|
|
|
|
equivalent to the following: |
|
|
|
|
|
|
|
|
Outside a character class, by default, the escape sequence \R matches |
|
|
|
|
|
any Unicode newline sequence. This is a Perl 5.10 feature. In non-UTF-8 |
|
|
|
|
|
mode \R is equivalent to the following: |
|
|
|
|
|
|
|
|
(?>\r\n|\n|\x0b|\f|\r|\x85) |
|
|
(?>\r\n|\n|\x0b|\f|\r|\x85) |
|
|
|
|
|
|
|
|
@ -3146,6 +3245,28 @@ BACKSLASH |
|
|
rator, U+2029). Unicode character property support is not needed for |
|
|
rator, U+2029). Unicode character property support is not needed for |
|
|
these characters to be recognized. |
|
|
these characters to be recognized. |
|
|
|
|
|
|
|
|
|
|
|
It is possible to restrict \R to match only CR, LF, or CRLF (instead of |
|
|
|
|
|
the complete set of Unicode line endings) by setting the option |
|
|
|
|
|
PCRE_BSR_ANYCRLF either at compile time or when the pattern is matched. |
|
|
|
|
|
(BSR is an abbreviation for "backslash R".) This can be made the default |
|
|
|
|
|
when PCRE is built; if this is the case, the other behaviour can be |
|
|
|
|
|
requested via the PCRE_BSR_UNICODE option. It is also possible to |
|
|
|
|
|
specify these settings by starting a pattern string with one of the |
|
|
|
|
|
following sequences: |
|
|
|
|
|
|
|
|
|
|
|
(*BSR_ANYCRLF) CR, LF, or CRLF only |
|
|
|
|
|
(*BSR_UNICODE) any Unicode newline sequence |
|
|
|
|
|
|
|
|
|
|
|
These override the default and the options given to pcre_compile(), but |
|
|
|
|
|
they can be overridden by options given to pcre_exec(). Note that these |
|
|
|
|
|
special settings, which are not Perl-compatible, are recognized only at |
|
|
|
|
|
the very start of a pattern, and that they must be in upper case. If |
|
|
|
|
|
more than one of them is present, the last one is used. They can be |
|
|
|
|
|
combined with a change of newline convention, for example, a pattern |
|
|
|
|
|
can start with: |
|
|
|
|
|
|
|
|
|
|
|
(*ANY)(*BSR_ANYCRLF) |
|
|
|
|
|
|
|
|
Inside a character class, \R matches the letter "R". |
|
|
Inside a character class, \R matches the letter "R". |
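The two \R settings can be compared with a small emulation. This is a sketch in Python, whose re module has no \R escape, so each variant is spelled out by hand. The names BSR_UNICODE, BSR_ANYCRLF, and newline_spans are hypothetical. The Unicode variant uses the UTF-8 mode set, which adds U+2028 and U+2029 to the six alternatives shown above.

```python
import re

# A lookahead followed by a backreference stands in for the atomic group
# (?>...), so the sketch also runs on Python versions that lack atomic
# groups. Group 1 is reserved for this idiom.
BSR_UNICODE = r"(?=(\r\n|\n|\x0b|\f|\r|\x85|\u2028|\u2029))\1"
BSR_ANYCRLF = r"(?=(\r\n|\n|\r))\1"


def newline_spans(subject, bsr):
    """Return the (start, end) span of every newline sequence that bsr matches."""
    return [m.span() for m in re.finditer(bsr, subject)]


subject = "a\r\nb\x85c\u2028d\ne"
unicode_spans = newline_spans(subject, BSR_UNICODE)
anycrlf_spans = newline_spans(subject, BSR_ANYCRLF)
```

Here unicode_spans covers all four line endings in the subject, while anycrlf_spans covers only the CRLF and the final LF. Because the group is atomic, a CRLF is consumed as one unit: the emulated \R followed by \n does not match the two-character string CRLF.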
|
|
|
|
|
|
|
|
Unicode character properties |
|
|
Unicode character properties |
|
|
@ -3601,9 +3722,9 @@ VERTICAL BAR |
|
|
INTERNAL OPTION SETTING |
|
|
INTERNAL OPTION SETTING |
|
|
|
|
|
|
|
|
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
|
|
The settings of the PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, and |
|
|
PCRE_EXTENDED options can be changed from within the pattern by a |
|
|
|
|
|
sequence of Perl option letters enclosed between "(?" and ")". The |
|
|
|
|
|
option letters are |
|
|
|
|
|
|
|
|
PCRE_EXTENDED options (which are Perl-compatible) can be changed from |
|
|
|
|
|
within the pattern by a sequence of Perl option letters enclosed |
|
|
|
|
|
between "(?" and ")". The option letters are |
|
|
|
|
|
|
|
|
i for PCRE_CASELESS |
|
|
i for PCRE_CASELESS |
|
|
m for PCRE_MULTILINE |
|
|
m for PCRE_MULTILINE |
|
|
@ -3617,6 +3738,10 @@ INTERNAL OPTION SETTING |
|
|
is also permitted. If a letter appears both before and after the |
|
|
is also permitted. If a letter appears both before and after the |
|
|
hyphen, the option is unset. |
|
|
hyphen, the option is unset. |
|
|
|
|
|
|
|
|
|
|
|
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA |
|
|
|
|
|
can be changed in the same way as the Perl-compatible options by using |
|
|
|
|
|
the characters J, U and X respectively. |
|
|
|
|
|
|
|
|
When an option change occurs at top level (that is, not inside subpat- |
|
|
When an option change occurs at top level (that is, not inside subpat- |
|
|
tern parentheses), the change applies to the remainder of the pattern |
|
|
tern parentheses), the change applies to the remainder of the pattern |
|
|
that follows. If the change is placed right at the start of a pattern, |
|
|
that follows. If the change is placed right at the start of a pattern, |
|
|
@ -3642,9 +3767,11 @@ INTERNAL OPTION SETTING |
|
|
the effects of option settings happen at compile time. There would be |
|
|
the effects of option settings happen at compile time. There would be |
|
|
some very weird behaviour otherwise. |
|
|
some very weird behaviour otherwise. |
|
|
|
|
|
|
|
|
The PCRE-specific options PCRE_DUPNAMES, PCRE_UNGREEDY, and PCRE_EXTRA |
|
|
|
|
|
can be changed in the same way as the Perl-compatible options by using |
|
|
|
|
|
the characters J, U and X respectively. |
|
|
|
|
|
|
|
|
Note: There are other PCRE-specific options that can be set by the |
|
|
|
|
|
application when the compile or match functions are called. In some |
|
|
|
|
|
cases the pattern can contain special leading sequences to override |
|
|
|
|
|
what the application has set or what has been defaulted. Details are |
|
|
|
|
|
given in the section entitled "Newline sequences" above. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
SUBPATTERNS |
|
|
SUBPATTERNS |
|
|
@ -4644,7 +4771,7 @@ CALLOUTS |
|
|
is given in the pcrecallout documentation. |
|
|
is given in the pcrecallout documentation. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
BACTRACKING CONTROL |
|
|
|
|
|
|
|
|
BACKTRACKING CONTROL |
|
|
|
|
|
|
|
|
Perl 5.10 introduced a number of "Special Backtracking Control Verbs", |
|
|
Perl 5.10 introduced a number of "Special Backtracking Control Verbs", |
|
|
which are described in the Perl documentation as "experimental and sub- |
|
|
which are described in the Perl documentation as "experimental and sub- |
|
|
@ -4775,7 +4902,7 @@ AUTHOR |
|
|
|
|
|
|
|
|
REVISION |
|
|
REVISION |
|
|
|
|
|
|
|
|
Last updated: 21 August 2007 |
|
|
|
|
|
|
|
|
Last updated: 17 September 2007 |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
------------------------------------------------------------------------------ |
|
|
------------------------------------------------------------------------------ |
|
|
|
|
|
|
|
|
@ -4904,7 +5031,7 @@ CHARACTER CLASSES |
|
|
[^...] negative character class |
|
|
[^...] negative character class |
|
|
[x-y] range (can be used for hex characters) |
|
|
[x-y] range (can be used for hex characters) |
|
|
[[:xxx:]] positive POSIX named set |
|
|
[[:xxx:]] positive POSIX named set |
|
|
[[^:xxx:]] negative POSIX named set |
|
|
|
|
|
|
|
|
[[:^xxx:]] negative POSIX named set |
|
|
|
|
|
|
|
|
alnum alphanumeric |
|
|
alnum alphanumeric |
|
|
alpha alphabetic |
|
|
alpha alphabetic |
|
|
@ -5074,7 +5201,8 @@ BACKTRACKING CONTROL |
|
|
|
|
|
|
|
|
NEWLINE CONVENTIONS |
|
|
NEWLINE CONVENTIONS |
|
|
|
|
|
|
|
|
These are recognized only at the very start of a pattern. |
|
|
|
|
|
|
|
|
These are recognized only at the very start of the pattern or after a |
|
|
|
|
|
(*BSR_...) option. |
|
|
|
|
|
|
|
|
(*CR) |
|
|
(*CR) |
|
|
(*LF) |
|
|
(*LF) |
|
|
@ -5083,6 +5211,15 @@ NEWLINE CONVENTIONS |
|
|
(*ANY) |
|
|
(*ANY) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
WHAT \R MATCHES |
|
|
|
|
|
|
|
|
|
|
|
These are recognized only at the very start of the pattern or after a |
|
|
|
|
|
(*...) option that sets the newline convention. |
|
|
|
|
|
|
|
|
|
|
|
(*BSR_ANYCRLF) |
|
|
|
|
|
(*BSR_UNICODE) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
CALLOUTS |
|
|
CALLOUTS |
|
|
|
|
|
|
|
|
(?C) callout |
|
|
(?C) callout |
|
|
@ -5103,7 +5240,7 @@ AUTHOR |
|
|
|
|
|
|
|
|
REVISION |
|
|
REVISION |
|
|
|
|
|
|
|
|
Last updated: 21 August 2007 |
|
|
|
|
|
|
|
|
Last updated: 14 November 2007 |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
------------------------------------------------------------------------------ |
|
|
------------------------------------------------------------------------------ |
|
|
|
|
|
|
|
|
@ -5907,7 +6044,8 @@ MATCHING INTERFACE |
|
|
|
|
|
|
|
|
c. The "i"th argument has a suitable type for holding the |
|
|
c. The "i"th argument has a suitable type for holding the |
|
|
string captured as the "i"th sub-pattern. If you pass in |
|
|
string captured as the "i"th sub-pattern. If you pass in |
|
|
NULL for the "i"th argument, or pass fewer arguments than |
|
|
|
|
|
|
|
|
void * NULL for the "i"th argument, or a non-void * NULL |
|
|
|
|
|
of the correct type, or pass fewer arguments than the |
|
|
number of sub-patterns, "i"th captured sub-pattern is |
|
|
number of sub-patterns, "i"th captured sub-pattern is |
|
|
ignored. |
|
|
ignored. |
|
|
|
|
|
|
|
|
@ -6155,7 +6293,7 @@ AUTHOR |
|
|
|
|
|
|
|
|
REVISION |
|
|
REVISION |
|
|
|
|
|
|
|
|
Last updated: 06 March 2007 |
|
|
|
|
|
|
|
|
Last updated: 12 November 2007 |
|
|
------------------------------------------------------------------------------ |
|
|
------------------------------------------------------------------------------ |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@ -6183,10 +6321,9 @@ PCRE SAMPLE PROGRAM |
|
|
bility of matching an empty string. Comments in the code explain what |
|
|
bility of matching an empty string. Comments in the code explain what |
|
|
is going on. |
|
|
is going on. |
|
|
|
|
|
|
|
|
The demonstration program is automatically built if you use "./config- |
|
|
|
|
|
ure;make" to build PCRE. Otherwise, if PCRE is installed in the stan- |
|
|
|
|
|
dard include and library directories for your system, you should be |
|
|
|
|
|
able to compile the demonstration program using this command: |
|
|
|
|
|
|
|
|
If PCRE is installed in the standard include and library directories |
|
|
|
|
|
for your system, you should be able to compile the demonstration pro- |
|
|
|
|
|
gram using this command: |
|
|
|
|
|
|
|
|
gcc -o pcredemo pcredemo.c -lpcre |
|
|
gcc -o pcredemo pcredemo.c -lpcre |
|
|
|
|
|
|
|
|
@ -6233,8 +6370,8 @@ AUTHOR |
|
|
|
|
|
|
|
|
REVISION |
|
|
REVISION |
|
|
|
|
|
|
|
|
Last updated: 13 June 2007 |
|
|
|
|
|
Copyright (c) 1997-2007 University of Cambridge. |
|
|
|
|
|
|
|
|
Last updated: 23 January 2008 |
|
|
|
|
|
Copyright (c) 1997-2008 University of Cambridge. |
|
|
------------------------------------------------------------------------------ |
|
|
------------------------------------------------------------------------------ |
|
|
PCRESTACK(3) PCRESTACK(3) |
|
|
PCRESTACK(3) PCRESTACK(3) |
|
|
|
|
|
|
|
|
|