transform
text.normalization.SpanMapNormalizer.transform(pattern, replacement)

Apply a regex transformation to the current text while tracking, for every character in the new text, the character span in the original text that it maps to.
In the example below, each of the 4 characters in the replacement "I am" maps to span (0, 3) of the original text, which is where the pattern matched "I'm". Characters outside the match keep their one-to-one spans.
Example text: "I'm sorry"
Example pattern: r"I'm"
Example replacement: "I am"

new_text: "I am sorry"
new_span_map: [(0, 3), (0, 3), (0, 3), (0, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
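The span bookkeeping in the example above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the implementation of `SpanMapNormalizer`: the function name `transform_with_span_map`, its explicit `text` and `span_map` arguments, and the handling of zero-width matches are all assumptions made for this sketch.

```python
import re


def transform_with_span_map(text, span_map, pattern, replacement):
    """Hypothetical sketch: regex-substitute `text` and carry `span_map` along.

    span_map[i] is the (start, end) span in the original text that
    character i of `text` currently maps to.
    """
    pieces = []
    new_span_map = []
    cursor = 0
    for match in re.finditer(pattern, text):
        start, end = match.span()
        # Characters between matches are copied and keep their spans.
        pieces.append(text[cursor:start])
        new_span_map.extend(span_map[cursor:start])
        # Resolve the replacement: call it, or expand backreferences.
        if callable(replacement):
            repl = replacement(match)
        else:
            repl = match.expand(replacement)
        if end > start:
            # Every replacement character maps to the whole matched region.
            merged = (span_map[start][0], span_map[end - 1][1])
        else:
            # Assumption: a zero-width match maps to an empty span at its position.
            if start < len(span_map):
                anchor = span_map[start][0]
            else:
                anchor = span_map[-1][1] if span_map else 0
            merged = (anchor, anchor)
        pieces.append(repl)
        new_span_map.extend([merged] * len(repl))
        cursor = end
    # Copy whatever follows the last match.
    pieces.append(text[cursor:])
    new_span_map.extend(span_map[cursor:])
    return "".join(pieces), new_span_map


original = "I'm sorry"
identity_map = [(i, i + 1) for i in range(len(original))]
new_text, new_span_map = transform_with_span_map(
    original, identity_map, r"I'm", "I am"
)
print(new_text)
print(new_span_map)
```

Because the function takes the current span map as input, the output of one call can be fed into the next, so spans keep pointing at the original text across several chained transformations.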
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| pattern | str | The regex pattern to match. | required |
| replacement | str or callable | The replacement string or a function that takes a match object and returns a replacement string. | required |
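The callable form of `replacement` is described as a function that takes a match object and returns a string, which is the same contract that the standard library's `re.sub` uses for its `repl` argument. The sketch below shows that contract with `re.sub` alone; the helper `expand_contraction` and its lookup table are hypothetical, and it is an assumption that a function written this way can be passed unchanged to `transform`.

```python
import re

# Hypothetical lookup table, used only for this illustration.
CONTRACTIONS = {"I'm": "I am", "don't": "do not"}


def expand_contraction(match):
    # Receives an re.Match object and returns the replacement string.
    return CONTRACTIONS[match.group(0)]


pattern = r"I'm|don't"
result = re.sub(pattern, expand_contraction, "I'm sure you don't mind")
print(result)
```

A callable is useful when the replacement depends on what was matched, as here, where a single pattern covers several contractions.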