Wrapper for regexps with Str syntax
This module was written at a time when we had only the Str module for regular expressions. However, Str has an interface that does not work for multi-threaded programs, because the state of the module is visible to the outside. The module Netstring_str is similar to Str, but has a thread-compatible interface.
For an explanation why we need this module, please read [root:Regexp].
Supported regexp syntax
.        matches every character except newline
e*       matches e zero or more times
e+       matches e one or more times
e?       matches e optionally
e{m,n}   matches e at least m times and at most n times
e1\|e2   matches e1 or e2
[set]    matches the characters from set
[^set]   matches every character not in set
\(...\)  grouping parentheses
\n       back reference (n is a digit)
^        matches at the beginning of a line
$        matches at the end of a line
This is exactly what Str supports. Character classes are not implemented.
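As a rough illustration of the syntax above, the following sketch combines grouping, alternation, a negated set, and the + operator. It assumes the ocamlnet functions Netstring_str.regexp, Netstring_str.string_match (returning a result option) and Netstring_str.matched_group; the helper parse and the sample request lines are made up for the example. Each backslash has to be doubled inside an OCaml string literal.

```ocaml
(* Sketch only: assumes the netstring library from ocamlnet is installed,
   e.g. ocamlfind ocamlopt -package netstring -linkpkg example.ml *)

(* Group 1 is the method (alternation), group 2 is a run of non-space
   characters (negated set followed by +). *)
let request_re = Netstring_str.regexp "\\(GET\\|POST\\) +\\([^ ]+\\)"

(* Hypothetical helper: returns (method, path) if the line has that shape. *)
let parse line =
  match Netstring_str.string_match request_re line 0 with
  | Some r ->
      Some (Netstring_str.matched_group r 1 line,
            Netstring_str.matched_group r 2 line)
  | None -> None

let () =
  assert (parse "GET /index.html" = Some ("GET", "/index.html"));
  assert (parse "PUT /index.html" = None)
```

The match result is passed around explicitly instead of being read from module state, which is the point of the thread-compatible interface.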
The type of regular expressions
type split_result = Str.split_result =
| Text of string
| Delim of string  (* Here we keep compatibility with Str *)
The type of matching results
Quotes a string such that it can be included in a regexp
Returns a case-insensitive regexp that matches exactly the string
Returns a regexp (as string) that matches any of the characters in the argument. The argument must be non-empty
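A minimal sketch of the two quoting helpers, assuming the descriptions above belong to Netstring_str.quote and Netstring_str.quote_set, both of type string -> string. The helper is_match and the sample strings are hypothetical. The example avoids printing the quoted strings themselves and only checks how the resulting regexps behave.

```ocaml
(* Sketch only: assumes the netstring library from ocamlnet. *)

let is_match re s =
  match Netstring_str.string_match re s 0 with
  | Some _ -> true
  | None -> false

(* quote: the dot and the star are matched literally, not as operators. *)
let literal_re = Netstring_str.regexp (Netstring_str.quote "1.5*x")

(* quote_set: a regexp fragment matching any one of the given characters. *)
let seps_re = Netstring_str.regexp (Netstring_str.quote_set ";,")

let () =
  assert (is_match literal_re "1.5*x");
  assert (not (is_match literal_re "105*x"));
  assert (Netstring_str.split seps_re "a;b,c" = ["a"; "b"; "c"])
```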
Extracts the matched part from the string. The string argument must be the same string passed to string_match or the search functions, and the result argument must be the corresponding result.
Extracts the substring the nth group matches from the whole string. The string argument must be the same string passed to string_match or the search functions, and the result argument must be the corresponding result.
Returns the position where the substring matching the nth group begins
Returns the position where the substring matching the nth group ends
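The accessors described above can be sketched as follows. The example assumes they are named Netstring_str.matched_string, matched_group, group_beginning and group_end, and that Netstring_str.search_forward returns a pair of the match position and the result; the sample string is made up.

```ocaml
(* Sketch only: assumes the netstring library from ocamlnet. *)

let s = "id: item-42 (new)"
let re = Netstring_str.regexp "\\([a-z]+\\)-\\([0-9]+\\)"

(* search_forward raises Not_found when nothing matches. *)
let (pos, r) = Netstring_str.search_forward re s 0

let () =
  assert (pos = 4);
  (* The same string s must be passed that was searched. *)
  assert (Netstring_str.matched_string r s = "item-42");
  assert (Netstring_str.matched_group r 1 s = "item");
  assert (Netstring_str.matched_group r 2 s = "42");
  (* Group positions are offsets into s; the end offset is exclusive. *)
  let b = Netstring_str.group_beginning r 2
  and e = Netstring_str.group_end r 2 in
  assert ((b, e) = (9, 11));
  assert (String.sub s b (e - b) = "42")
```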
global_replace re templ s: Replaces all matches of re in s by templ.
In templ one can refer to matched groups by the backslash notation: \1 refers to the first group, \2 to the second etc. \0 is the whole match. \\ is the backslash character.
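A short sketch of the template notation, assuming the argument order global_replace re templ s stated above. The regexps and input strings are invented for the example; in OCaml source the template backslashes are doubled.

```ocaml
(* Sketch only: assumes the netstring library from ocamlnet. *)

let range_re = Netstring_str.regexp "\\([0-9]+\\)-\\([0-9]+\\)"

(* The template swaps the two groups of every match. *)
let swapped = Netstring_str.global_replace range_re "\\2-\\1" "10-20 and 3-4"

(* Group 0 is the whole match; each number is wrapped in brackets. *)
let number_re = Netstring_str.regexp "[0-9]+"
let bracketed = Netstring_str.global_replace number_re "[\\0]" "a1b22"

let () =
  assert (swapped = "20-10 and 4-3");
  assert (bracketed = "a[1]b[22]")
```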
replace_first re templ s: Replaces the first match of re in s by templ.
In templ one can refer to matched groups by the backslash notation: \1 refers to the first group, \2 to the second etc. \0 is the whole match. \\ is the backslash character.
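To show the difference between the two replacement functions, this sketch applies the same regexp and template once with replace_first and once with global_replace. The address pattern is deliberately simplistic and the input is made up.

```ocaml
(* Sketch only: assumes the netstring library from ocamlnet. *)

let addr_re = Netstring_str.regexp "\\([a-z]+\\)@\\([a-z.]+\\)"
let input = "bob@example.org, amy@example.org"

(* Only the leftmost match is rewritten. *)
let first_only = Netstring_str.replace_first addr_re "\\1 at \\2" input

(* Every match is rewritten. *)
let all = Netstring_str.global_replace addr_re "\\1 at \\2" input

let () =
  assert (first_only = "bob at example.org, amy@example.org");
  assert (all = "bob at example.org, amy at example.org")
```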
Splits the string into substrings at each match of the regexp. Occurrences of the delimiter at the beginning and the end are ignored.
Splits into at most n substrings, based on split
Same as split, but occurrences of the delimiter at the beginning and the end are returned as empty strings
Splits into at most n substrings, based on split_delim
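The differences between these splitting functions are easiest to see side by side. The sketch below assumes the bounded variants are named Netstring_str.bounded_split and Netstring_str.bounded_split_delim and take the limit n as their last argument; the expected results follow the behaviour of the corresponding Str functions.

```ocaml
(* Sketch only: assumes the netstring library from ocamlnet. *)

let comma = Netstring_str.regexp ","

let () =
  (* split: leading and trailing delimiters are ignored. *)
  assert (Netstring_str.split comma ",a,b," = ["a"; "b"]);
  (* split_delim: they show up as empty strings instead. *)
  assert (Netstring_str.split_delim comma ",a,b," = [""; "a"; "b"; ""]);
  (* Bounded variants: the last piece keeps the unsplit remainder. *)
  assert (Netstring_str.bounded_split comma "a,b,c,d" 2 = ["a"; "b,c,d"]);
  assert (Netstring_str.bounded_split_delim comma ",a,b," 2 = [""; "a,b,"])
```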
Like split_delim, but returns the delimiters in the result
Splits into at most n substrings, based on full_split
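A small sketch of full_split, assuming it returns a list of Text and Delim values as declared in the type above. The helper render and the input are hypothetical; render only makes the constructors visible as strings.

```ocaml
(* Sketch only: assumes the netstring library from ocamlnet. *)

let op_re = Netstring_str.regexp "[+-]"

(* Text pieces and the delimiters between them, in order. *)
let tokens = Netstring_str.full_split op_re "1+2-3"

(* Hypothetical helper that tags each piece with its constructor. *)
let render = function
  | Netstring_str.Text t -> "T:" ^ t
  | Netstring_str.Delim d -> "D:" ^ d

let () =
  assert (List.map render tokens = ["T:1"; "D:+"; "T:2"; "D:-"; "T:3"])
```

Because the delimiters are kept, concatenating all pieces in order gives back the original string.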
The first n characters of a string
The last n characters of a string
Same as string_before
Same as string_after