match.fuzzy_match

text.match.fuzzy_match(
    needle,
    haystack,
    threshold=55.0,
    max_length=300,
    return_words=False,
)

Fuzzy-match a needle (ground-truth text) against a haystack (ASR text).

Flattens the word segments of all SpeechSegment objects in haystack into a single word list and searches for the needle within it. The returned FuzzyMatch object includes both the word indices and the audio timestamps of the matched span.
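
The flatten-then-search behaviour described above can be illustrated with a minimal sketch. It is not the library's implementation: the `WordSegment` and `SpeechSegment` dataclasses are simplified stand-ins with assumed field names, `flatten_words` and `best_window` are hypothetical helpers, and the score comes from the standard library's `difflib` rather than whatever scorer `fuzzy_match` uses internally.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


# Hypothetical stand-ins; the library's real classes may have different fields.
@dataclass
class WordSegment:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class SpeechSegment:
    words: list[WordSegment] = field(default_factory=list)


def flatten_words(segments):
    """Concatenate the word lists of all segments, preserving order."""
    return [word for segment in segments for word in segment.words]


def best_window(needle, words):
    """Return (score, first_index, last_index) of the best-scoring word window.

    Slides a window with as many words as the needle over the flattened list
    and scores each candidate on a 0-100 scale with difflib (an assumption).
    """
    size = max(1, len(needle.split()))
    best = (0.0, 0, 0)
    for i in range(0, max(1, len(words) - size + 1)):
        window = words[i:i + size]
        candidate = " ".join(w.text for w in window)
        ratio = SequenceMatcher(None, needle.lower(), candidate.lower()).ratio()
        score = 100.0 * ratio
        if score > best[0]:
            best = (score, i, i + len(window) - 1)
    return best


segments = [
    SpeechSegment([WordSegment("the", 0.0, 0.2), WordSegment("quick", 0.2, 0.5)]),
    SpeechSegment([WordSegment("brown", 0.5, 0.9), WordSegment("fox", 0.9, 1.2)]),
]
words = flatten_words(segments)
score, first, last = best_window("quick brown", words)

# Word indices map back to audio timestamps through the flattened list.
print(score, first, last, words[first].start, words[last].end)
```

The match here spans two different segments, which is why the word lists are flattened before searching: the indices refer to positions in the flattened list, and the timestamps are read from the first and last matched words.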

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| needle | str | The text to search for. | required |
| haystack | list[SpeechSegment] | Speech segments containing word-level alignments to search within. The text from these segments will be concatenated to form the haystack. | required |
| threshold | float | Minimum score (0-100) for a match to be returned. | 55.0 |
| max_length | int | Character length for splitting long needles. Needles longer than 2 * max_length are matched by anchoring the first and last max_length characters independently. | 300 |
| return_words | bool | If True, also return the flattened word list as a second value. Useful for debugging (e.g. inspecting surrounding context of the match). | False |
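
The `max_length` rule can be made concrete with a small sketch. `split_needle` is a hypothetical helper, not part of the library: it only shows which needles would be split and what the two anchors contain under a plain character-level reading of the description. How `fuzzy_match` actually combines the two anchor matches, and whether it respects word boundaries when cutting, is not specified here.

```python
def split_needle(needle: str, max_length: int = 300):
    """Return (needle,) for short input, or (head, tail) anchors for long input.

    Assumption: the split is a plain character slice; the library may differ.
    """
    if len(needle) > 2 * max_length:
        return needle[:max_length], needle[-max_length:]
    return (needle,)


short = "a" * 600                           # exactly 2 * max_length: not split
long = "a" * 300 + "b" * 50 + "c" * 300     # 650 characters: split into anchors

print(len(split_needle(short)))
head, tail = split_needle(long)
print(len(head), len(tail))
```

Under this reading, the middle of a long needle (the 50 `b` characters above) plays no part in the search; only the first and last `max_length` characters are matched.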

Returns

| Name | Type | Description |
|---|---|---|
| | FuzzyMatch or None or tuple[FuzzyMatch or None, list[WordSegment]] | The match result with timestamps, or None if no match meets the threshold. If return_words is True, a (FuzzyMatch \| None, list[WordSegment]) tuple is returned instead, for debugging purposes. |
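
Because the return shape depends on `return_words`, calling code may find it convenient to normalise both shapes to one pair. The sketch below uses a hypothetical helper, `unpack_result`, and plain stand-in values instead of a real `fuzzy_match` call. It branches on the flag the caller passed rather than inspecting the result's type, since the concrete type of `FuzzyMatch` is not described here.

```python
def unpack_result(result, return_words):
    """Normalise either return shape to a (match, words) pair.

    `result` is whatever fuzzy_match returned; `return_words` must be the
    same flag that was passed to that call.
    """
    if return_words:
        match, words = result
        return match, words
    return result, None


# Stand-in values; a real call would pass the value returned by fuzzy_match.
match, words = unpack_result(None, return_words=False)
print(match, words)

match, words = unpack_result((None, ["hello", "world"]), return_words=True)
print(match, words)
```

In both cases `match` may still be None when nothing meets the threshold, so it should be checked before its indices or timestamps are read.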