transform

text.normalization.SpanMapNormalizer.transform(pattern, replacement)

Apply a regex transformation to the current text while tracking, for every character in the new text, the character span it maps to in the original text.

In the example below, each of the 4 characters in the replacement "I am" maps to the matched text "I'm", which occupies the span (0, 3) of the original text. Spans are (start, end) pairs with an exclusive end, so the single character at index 3 has the span (3, 4).

Example text: "I'm sorry"
Example pattern: r"I'm"
Example replacement: "I am"

new_text: "I am sorry"
new_span_map: [(0, 3), (0, 3), (0, 3), (0, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]
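The bookkeeping in this example can be reproduced with a minimal standalone sketch. This is not the library's implementation: the function name `transform_with_span_map`, its explicit `text` and `span_map` arguments, and the handling of empty matches are all assumptions made for illustration. The idea is that unmatched characters keep their spans, and every character produced by a replacement inherits the merged span of the text it replaced.

```python
import re


def transform_with_span_map(text, span_map, pattern, replacement):
    """Hypothetical stand-in for a span-tracking regex substitution.

    `span_map[i]` is the (start, end) span in the original text that
    character `text[i]` maps to. Returns the new text and its span map.
    """
    new_parts = []
    new_span_map = []
    cursor = 0
    for match in re.finditer(pattern, text):
        start, end = match.span()
        # Characters between matches are copied unchanged with their spans.
        new_parts.append(text[cursor:start])
        new_span_map.extend(span_map[cursor:start])
        # Resolve the replacement: call it, or expand a template string.
        if callable(replacement):
            repl = replacement(match)
        else:
            repl = match.expand(replacement)
        if end > start:
            # Merge the spans of all matched characters into one span.
            merged = (span_map[start][0], span_map[end - 1][1])
        else:
            # Assumption: an empty match maps to a zero-width span.
            if start < len(span_map):
                pos = span_map[start][0]
            else:
                pos = span_map[-1][1] if span_map else 0
            merged = (pos, pos)
        new_parts.append(repl)
        new_span_map.extend([merged] * len(repl))
        cursor = end
    # Copy whatever follows the last match.
    new_parts.append(text[cursor:])
    new_span_map.extend(span_map[cursor:])
    return "".join(new_parts), new_span_map


text = "I'm sorry"
identity_map = [(i, i + 1) for i in range(len(text))]
new_text, new_span_map = transform_with_span_map(text, identity_map, r"I'm", "I am")
print(new_text)
print(new_span_map)
```

Because the function takes the current span map as input, calls can be chained: passing `new_text` and `new_span_map` into a second call keeps every span pointing at the original text rather than at the intermediate one.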

Parameters

| Name | Type | Description | Default |
|------|------|-------------|---------|
| pattern | str | The regex pattern to match. | required |
| replacement | str or callable | The replacement string or a function that takes a match object and returns a replacement string. | required |
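A callable replacement can be illustrated with the standard library alone. The sketch below uses `re.sub`, whose replacement argument also accepts a function that takes a match object and returns a string. It assumes, without confirmation from this page, that a callable passed to `transform` follows the same convention. The helper `expand_contraction` and its lookup table are hypothetical.

```python
import re

# Hypothetical lookup table of contractions and their expansions.
CONTRACTIONS = {"I'm": "I am", "don't": "do not"}


def expand_contraction(match):
    # Receives a match object and returns the replacement string.
    return CONTRACTIONS[match.group(0)]


pattern = r"I'm|don't"
result = re.sub(pattern, expand_contraction, "I'm sure you don't mind")
print(result)
```

A callable is useful when the replacement depends on what was matched, as with the table lookup here. A plain string is enough when every match gets the same replacement.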