normalization.text_normalizer
text.normalization.text_normalizer(text)

Default text normalization function.
Applies:

- Unicode normalization (NFKC)
- Lowercasing
- Normalization of whitespace
- Removal of parentheses and special characters

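The steps listed above can be illustrated with a minimal sketch built only on the Python standard library. This is not the library's implementation: the function name `sketch_normalize`, the order of the steps, and the definition of "special characters" (anything that is neither a word character nor whitespace) are assumptions made for illustration.

```python
import re
import unicodedata


def sketch_normalize(text: str) -> str:
    """Hypothetical helper: apply the four documented steps to a string."""
    # 1. Unicode normalization (NFKC), e.g. full-width letters become ASCII.
    text = unicodedata.normalize("NFKC", text)
    # 2. Lowercasing.
    text = text.lower()
    # 3. Replace parentheses and other special characters with a space.
    #    Assumption: "special" means not a word character and not whitespace
    #    (note that \w keeps underscores).
    text = re.sub(r"[^\w\s]", " ", text)
    # 4. Collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    return text


print(sketch_normalize("  Hello,  (World)!  "))
```

The actual function may apply the steps in a different order or treat characters such as hyphens and apostrophes differently, so check its behavior on your own data.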
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Input text. | required |
Returns
| Name | Type | Description |
|---|---|---|
| | list of str | List of normalized tokens. |
| | list of dict | Mapping between tokens and original text spans. |
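To make the two return values concrete, here is a hedged sketch of how a token list and a token-to-span mapping could be produced together. The function name `sketch_tokens_with_spans`, the whitespace-based tokenization, and the dictionary keys `token`, `start`, and `end` are all assumptions; the real structure of the returned dictionaries is not specified above.

```python
import re
import unicodedata


def sketch_tokens_with_spans(text: str):
    """Hypothetical helper: return (tokens, spans) for whitespace-split text."""
    tokens = []
    spans = []
    # Walk over whitespace-delimited chunks of the ORIGINAL text so that the
    # recorded offsets always refer to the input string.
    for match in re.finditer(r"\S+", text):
        normalized = unicodedata.normalize("NFKC", match.group()).lower()
        # Simplification: drop non-word characters inside the chunk instead
        # of splitting on them.
        normalized = re.sub(r"[^\w]", "", normalized)
        if not normalized:
            continue  # the chunk held only special characters
        tokens.append(normalized)
        spans.append(
            {"token": normalized, "start": match.start(), "end": match.end()}
        )
    return tokens, spans


tokens, spans = sketch_tokens_with_spans("Hello (World)!")
print(tokens)
print(spans)
```

Recording offsets against the original string, rather than the normalized one, is what lets a caller map a token back to the exact text it came from, for example `text[span["start"]:span["end"]]`.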