An extension of the standard StringLabels. If you open Core.Std, you'll get these in the String module.
Caseless compares and hashes strings ignoring case, so that for example
Caseless.equal "OCaml" "ocaml" and Caseless.("apple" < "Banana") are true, and
Caseless.Map and Caseless.Table lookups and Caseless.Set membership are
case-insensitive.
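As a rough illustration of these semantics (not Core's implementation), case-insensitive comparison can be modelled with the standard library by lowercasing both sides first. The names caseless_compare and caseless_equal are hypothetical, and only ASCII case folding is assumed.

```ocaml
(* Hypothetical helpers modelling Caseless semantics with the OCaml stdlib.
   Assumption: only ASCII letters are case-folded. *)
let caseless_compare a b =
  String.compare (String.lowercase_ascii a) (String.lowercase_ascii b)

let caseless_equal a b = caseless_compare a b = 0

let () =
  assert (caseless_equal "OCaml" "ocaml");
  (* "apple" sorts before "Banana" once case is ignored ... *)
  assert (caseless_compare "apple" "Banana" < 0);
  (* ... although a plain byte-wise comparison orders them the other way. *)
  assert (String.compare "apple" "Banana" > 0)
```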
Maximum length of a string.
Substring search and replace functions. They use the Knuth-Morris-Pratt algorithm (KMP) under the hood.
The functions in the Search_pattern module allow the program to preprocess the
searched pattern once and then use it many times without further allocations.
create pattern preprocesses pattern as per KMP, building an int array whose
length is length pattern. All inputs are valid.
pos < 0 or pos >= length string result in no match (hence index returns
None and index_exn raises).
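The preprocess-once, search-many idea can be sketched with the standard library alone. This is a generic KMP sketch, not Core's code: kmp_table and kmp_index are hypothetical names, and the treatment of an empty pattern is an assumption.

```ocaml
(* table.(i) = length of the longest proper prefix of pattern.[0..i]
   that is also a suffix of it. The array has length [String.length pattern]. *)
let kmp_table pattern =
  let n = String.length pattern in
  let table = Array.make n 0 in
  let k = ref 0 in
  for i = 1 to n - 1 do
    while !k > 0 && pattern.[i] <> pattern.[!k] do
      k := table.(!k - 1)
    done;
    if pattern.[i] = pattern.[!k] then incr k;
    table.(i) <- !k
  done;
  table

(* First match position >= pos, or None. An out-of-range pos gives None,
   mirroring the behaviour described above. *)
let kmp_index ?(pos = 0) pattern table text =
  let n = String.length pattern and len = String.length text in
  if pos < 0 || pos >= len then None
  else if n = 0 then Some pos (* assumption: empty pattern matches at pos *)
  else begin
    let k = ref 0 and i = ref pos and result = ref None in
    while !result = None && !i < len do
      (* On mismatch, fall back using the precomputed table. *)
      while !k > 0 && text.[!i] <> pattern.[!k] do
        k := table.(!k - 1)
      done;
      if text.[!i] = pattern.[!k] then incr k;
      if !k = n then result := Some (!i - n + 1);
      incr i
    done;
    !result
  end
```

The table is built once and can then be reused for any number of searches, which is the point of keeping preprocessing separate from searching.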
Substring search and replace convenience functions. They call Search_pattern.create and
then forget the preprocessed pattern when the search is complete. pos < 0 or pos
>= length t result in no match (hence substr_index returns None and
substr_index_exn raises). may_overlap indicates whether to report overlapping
matches; see Search_pattern.index_all.
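To make the effect of may_overlap concrete, here is a naive stdlib sketch of collecting all match positions. The function name and labels are hypothetical stand-ins, the scan is a plain substring comparison rather than KMP, and an empty pattern is simply ignored for brevity.

```ocaml
(* Hypothetical [index_all]: naive scan, stdlib only. *)
let index_all ~may_overlap ~pattern text =
  let n = String.length pattern and len = String.length text in
  let matches_at i = i + n <= len && String.sub text i n = pattern in
  let rec go i acc =
    if n = 0 || i + n > len then List.rev acc
    else if matches_at i then
      (* After a match, either step by one (overlaps allowed)
         or jump past the whole match (no overlaps). *)
      go (if may_overlap then i + 1 else i + n) (i :: acc)
    else go (i + 1) acc
  in
  go 0 []
```

With pattern "aa" and text "aaaa", allowing overlaps reports positions 0, 1 and 2, while disallowing them reports only 0 and 2.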
lfindi ?pos t ~f returns the smallest i >= pos such that f i t.[i], if there is
such an i. By default, pos = 0.
rfindi ?pos t ~f returns the largest i <= pos such that f i t.[i], if there is
such an i. By default pos = length t - 1.
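The two searches can be pictured with a small stdlib re-implementation. This is only a sketch of the described behaviour; it assumes pos is within bounds and does not reproduce whatever Core does for an invalid pos.

```ocaml
(* Smallest i >= pos with [f i t.[i]], scanning left to right. *)
let lfindi ?(pos = 0) t ~f =
  let n = String.length t in
  let rec loop i =
    if i >= n then None
    else if f i t.[i] then Some i
    else loop (i + 1)
  in
  loop pos

(* Largest i <= pos with [f i t.[i]], scanning right to left. *)
let rfindi ?pos t ~f =
  let pos = match pos with Some p -> p | None -> String.length t - 1 in
  let rec loop i =
    if i < 0 then None
    else if f i t.[i] then Some i
    else loop (i - 1)
  in
  loop pos
```

For "hello" and a predicate that tests for 'l', lfindi finds index 2 and rfindi finds index 3.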
foldi works similarly to fold, but also passes the index of each character to f.
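A minimal stdlib sketch of an indexed fold follows. The argument order of f (index, accumulator, character) is an assumption made for this example.

```ocaml
(* Indexed left fold over the characters of a string.
   Assumption: [f] receives the index, then the accumulator, then the char. *)
let foldi t ~init ~f =
  let acc = ref init in
  String.iteri (fun i c -> acc := f i !acc c) t;
  !acc

(* Example: collect the positions of every 'a' in "banana". *)
let positions_of_a =
  foldi "banana" ~init:[] ~f:(fun i acc c -> if c = 'a' then i :: acc else acc)
  |> List.rev
```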
tr_inplace target replacement s destructively modifies s (in place!),
replacing every instance of target in s with replacement.
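The text above describes mutation of a string, which matches older OCaml where strings were mutable. In recent OCaml versions strings are immutable, so this sketch uses Bytes instead. The labels ~target and ~replacement are assumptions for readability.

```ocaml
(* In-place character translation on a mutable byte sequence. *)
let tr_inplace ~target ~replacement (s : Bytes.t) =
  for i = 0 to Bytes.length s - 1 do
    if Bytes.get s i = target then Bytes.set s i replacement
  done

let demo =
  let b = Bytes.of_string "a-b-c" in
  tr_inplace ~target:'-' ~replacement:'_' b;
  (* [b] itself has been modified; no new buffer was allocated. *)
  Bytes.to_string b
```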
Operations for escaping and unescaping strings, with parameterized escape and escapeworthy characters. Escaping/unescaping using this module is more efficient than using Pcre. Benchmark code can be found in core/benchmarks/string_escaping.ml.
escape_gen_exn escapeworthy_map escape_char returns a function that will escape a
string s as follows: if (c1,c2) is in escapeworthy_map, then all occurrences of
c1 are replaced by escape_char concatenated to c2.
Raises an exception if escapeworthy_map is not one-to-one. If escape_char is
not in escapeworthy_map, then it will be escaped to itself.
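The escaping rule can be sketched with the standard library as follows. The name escape_gen is hypothetical; unlike escape_gen_exn, this version does not check that the map is one-to-one, and it returns a plain closure rather than a staged function.

```ocaml
(* Hypothetical [escape_gen]: builds an escaping function from an
   association list of (original char, char written after escape_char). *)
let escape_gen ~escapeworthy_map ~escape_char =
  (* escape_char is escaped to itself unless the map already covers it. *)
  let map =
    if List.mem_assoc escape_char escapeworthy_map then escapeworthy_map
    else (escape_char, escape_char) :: escapeworthy_map
  in
  fun s ->
    let buf = Buffer.create (String.length s) in
    String.iter
      (fun c ->
        match List.assoc_opt c map with
        | Some c' ->
          Buffer.add_char buf escape_char;
          Buffer.add_char buf c'
        | None -> Buffer.add_char buf c)
      s;
    Buffer.contents buf

(* Example: newline becomes backslash-n, backslash becomes two backslashes. *)
let escape_newlines = escape_gen ~escapeworthy_map:[ ('\n', 'n') ] ~escape_char:'\\'
```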
escape ~escapeworthy ~escape_char s is
escape_gen_exn ~escapeworthy_map:(List.zip_exn escapeworthy escapeworthy)
~escape_char.
Duplicates and escape_char will be removed from escapeworthy, so no
exception will be raised.
unescape_gen_exn is the inverse operation of escape_gen_exn. That is,
let escape = Staged.unstage (escape_gen_exn ~escapeworthy_map ~escape_char) in
let unescape = Staged.unstage (unescape_gen_exn ~escapeworthy_map ~escape_char) in
assert (s = unescape (escape s))
always succeeds as long as escapeworthy_map does not cause an exception.
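The round-trip property can be checked on a small stdlib model. The functions below are hypothetical stand-ins, not Core's implementation, and they assume the map already contains a pair for escape_char itself. Unescaping looks characters up in the inverted map, which is why the map needs to be one-to-one.

```ocaml
(* Stand-in escape: replace c1 by escape_char followed by c2. *)
let escape ~map ~escape_char s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match List.assoc_opt c map with
      | Some c' ->
        Buffer.add_char buf escape_char;
        Buffer.add_char buf c'
      | None -> Buffer.add_char buf c)
    s;
  Buffer.contents buf

(* Stand-in unescape: drop escape_char and map the next char back. *)
let unescape ~map ~escape_char s =
  let inverse = List.map (fun (a, b) -> (b, a)) map in
  let n = String.length s in
  let buf = Buffer.create n in
  let rec loop i =
    if i < n then
      if s.[i] = escape_char && i + 1 < n then begin
        let c = s.[i + 1] in
        Buffer.add_char buf
          (match List.assoc_opt c inverse with Some o -> o | None -> c);
        loop (i + 2)
      end
      else begin
        Buffer.add_char buf s.[i];
        loop (i + 1)
      end
  in
  loop 0;
  Buffer.contents buf

(* One-to-one map that also covers the escape character itself. *)
let map = [ ('\n', 'n'); ('\\', '\\') ]

let round_trip s =
  unescape ~map ~escape_char:'\\' (escape ~map ~escape_char:'\\' s)
```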
Any char in an escaped string is either escaping, escaped or literal. For example, for the escaped string "0_a0__0" with escape_char '_', pos 1 and 4 are escaping, 2 and 5 are escaped, and the rest are literal.
is_char_escaping s ~escape_char pos returns true if the char at pos is escaping,
false otherwise.
is_char_escaped s ~escape_char pos returns true if the char at pos is escaped,
false otherwise.
is_literal s ~escape_char pos returns true if the char at pos is neither escaped
nor escaping.
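The three categories can be computed in one left-to-right pass. The sketch below is a stdlib model with hypothetical names (kind, classify); treating a trailing escape_char with nothing after it as literal is an assumption.

```ocaml
type kind = Escaping | Escaped | Literal

(* Classify every position of an escaped string. *)
let classify ~escape_char s =
  let n = String.length s in
  let kinds = Array.make n Literal in
  let i = ref 0 in
  while !i < n do
    if s.[!i] = escape_char && !i + 1 < n then begin
      (* An escape_char that is not itself escaped starts a pair. *)
      kinds.(!i) <- Escaping;
      kinds.(!i + 1) <- Escaped;
      i := !i + 2
    end
    else incr i
  done;
  kinds

(* For "0_a0__0" with '_': positions 1 and 4 are escaping, 2 and 5 escaped. *)
let example = classify ~escape_char:'_' "0_a0__0"
```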
index s ~escape_char char finds the first literal (not escaped) instance of
char in s starting from 0.
rindex s ~escape_char char finds the first literal (not escaped) instance of
char in s starting from the end of s and proceeding towards 0.
index_from s ~escape_char pos char finds the first literal (not escaped)
instance of char in s starting from pos and proceeding towards the end of s.
rindex_from s ~escape_char pos char finds the first literal (not escaped)
instance of char in s starting from pos and proceeding towards 0.
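A forward search for a literal character only needs to skip each escaping/escaped pair. The function name index_literal is hypothetical, and the sketch returns an option rather than committing to any particular return type of the real functions.

```ocaml
(* First position of a literal (unescaped) [char] in [s], scanning from 0. *)
let index_literal ~escape_char s char =
  let n = String.length s in
  let rec loop i =
    if i >= n then None
    else if s.[i] = escape_char then loop (i + 2) (* skip the escaped pair *)
    else if s.[i] = char then Some i
    else loop (i + 1)
  in
  loop 0
```

In "a_,b,c" with escape_char '_', the comma at position 2 is escaped, so the first literal comma is at position 4.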
split s ~escape_char ~on returns a list of substrings of s that are separated by
literal versions of on. Consecutive on characters will cause multiple empty
strings in the result. Splitting the empty string returns a list of the empty
string, not the empty list. For example, with escape_char '_' and on ',',
"foo,bar_,baz" -> ["foo"; "bar_,baz"].
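The splitting rule can be modelled with a single scan that ignores escaped separators. This is a stdlib sketch of the described behaviour under the assumption that escape sequences are left untouched in the pieces; it is not Core's implementation.

```ocaml
(* Split [s] on literal occurrences of [on]; escaped ones are kept as-is. *)
let split ~escape_char ~on s =
  let n = String.length s in
  let rec loop i start acc =
    if i >= n then List.rev (String.sub s start (n - start) :: acc)
    else if s.[i] = escape_char then loop (i + 2) start acc (* skip pair *)
    else if s.[i] = on then
      loop (i + 1) (i + 1) (String.sub s start (i - start) :: acc)
    else loop (i + 1) start acc
  in
  loop 0 0 []
```

Because each literal separator closes the current piece, adjacent separators yield empty strings, and the empty input yields a one-element list.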
split_on_chars s ~on returns a list of all substrings of s that are separated by
one of the literal chars from on. The chars in on are not grouped, so a run of on
chars in the source string will produce multiple empty strings in the result.
For example, with escape_char '_' and on [',';'|'], "foo_|bar,baz|0" ->
["foo_|bar"; "baz"; "0"].