get_token_map

text.normalization.SpanMapNormalizer.get_token_map(tokenization_level='word')

Tokenize the current text and create a mapping of normalized tokens to the original text spans they were normalized from.

Parameters

Name                Type  Description                              Default
tokenization_level  str   Tokenization level ('word' or 'char').   "word"

Returns

Type          Description
list of dict  Token mapping: one dict per normalized token, linking it to the original span it was normalized from.
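A minimal sketch of what such a token map might look like. This is not the library's implementation: the function body, the toy lowercasing normalization, and the dict keys (`normalized_token`, `start_char`, `end_char`) are illustrative assumptions; only the idea of mapping normalized tokens back to original spans comes from the description above.

```python
import re

def get_token_map(text, tokenization_level="word"):
    """Illustrative stand-in: map each normalized token to the
    (start, end) character span of the original text it came from.
    Keys and normalization are assumptions, not the real API."""
    if tokenization_level not in ("word", "char"):
        raise ValueError("tokenization_level must be 'word' or 'char'")
    token_map = []
    if tokenization_level == "word":
        # One entry per whitespace-delimited word.
        for match in re.finditer(r"\S+", text):
            token_map.append({
                "normalized_token": match.group().lower(),  # toy normalization
                "start_char": match.start(),
                "end_char": match.end(),
            })
    else:
        # One entry per character.
        for i, ch in enumerate(text):
            token_map.append({
                "normalized_token": ch.lower(),
                "start_char": i,
                "end_char": i + 1,
            })
    return token_map

print(get_token_map("Hello World"))
```

With `tokenization_level="word"`, "Hello World" yields two entries, e.g. the first maps `"hello"` to span (0, 5) of the original text; with `"char"` each character gets its own entry.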