match.fuzzy_match

text.match.fuzzy_match(
    needle,
    haystack,
    threshold=55.0,
    max_length=300,
    return_words=False,
)

Fuzzy match between a needle (ground-truth text) and a haystack (ASR text).

Flattens all word segments from the SpeechSegment objects into a single haystack and searches for the needle within it. The returned FuzzyMatch object includes both the word indices and the audio timestamps of the matched span.
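The flatten-then-search idea can be illustrated with a small self-contained sketch. This is not the library's implementation: the `Word` dataclass, `flatten`, and `best_window` are hypothetical stand-ins, segments are modelled as plain lists of words, and the score comes from the standard library's `difflib.SequenceMatcher` scaled to 0-100 rather than from whatever scorer `fuzzy_match` uses internally.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Word:
    """Hypothetical stand-in for a word-level alignment."""
    text: str
    start: float  # seconds
    end: float    # seconds


def flatten(segments):
    """Concatenate the word lists of all segments into one list."""
    return [word for segment in segments for word in segment]


def best_window(needle, words, threshold=55.0):
    """Slide a window of len(needle.split()) words over the haystack.

    Returns (score, first_index, last_index, start_time, end_time) for the
    best-scoring window, or None if that score is below the threshold.
    """
    n = len(needle.split())
    if n == 0 or not words:
        return None
    best = None
    for i in range(max(1, len(words) - n + 1)):
        window = words[i:i + n]
        candidate = " ".join(w.text for w in window)
        ratio = SequenceMatcher(None, needle.lower(), candidate.lower()).ratio()
        score = 100.0 * ratio
        if best is None or score > best[0]:
            best = (score, i, i + len(window) - 1, window[0].start, window[-1].end)
    return best if best[0] >= threshold else None


segments = [
    [Word("the", 0.0, 0.2), Word("quick", 0.2, 0.5)],
    [Word("brown", 0.5, 0.9), Word("fox", 0.9, 1.2), Word("jumps", 1.2, 1.6)],
]
words = flatten(segments)
hit = best_window("brown fox", words)
miss = best_window("zzzz", words)
```

Here `hit` carries both the word indices (2 to 3) and the timestamps (0.5 s to 1.2 s) of the matched span, mirroring the two kinds of information the FuzzyMatch object is described as holding, while `miss` is `None` because no window reaches the threshold.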

Parameters

Name	Type	Description	Default
needle	str	The text to search for.	required
haystack	list[SpeechSegment]	Speech segments containing word-level alignments to search within. The text from these segments will be concatenated to form the haystack.	required
threshold	float	Minimum score (0-100) for a match to be returned.	`55.0`
max_length	int	Character length for splitting long needles. Needles longer than `2 * max_length` are matched by anchoring the first and last `max_length` characters independently.	`300`
return_words	bool	If True, also return the flattened word list as a second value. Useful for debugging (e.g. inspecting surrounding context of the match).	`False`
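The `max_length` rule can be sketched as follows. The helper `split_needle` is hypothetical and only shows which parts of the needle would be used as anchors under the rule described in the table; how the library combines the two anchor matches into one result is not specified here and is left out.

```python
def split_needle(needle, max_length=300):
    """Return the text anchors to match for a needle.

    Needles longer than 2 * max_length are reduced to a head anchor and a
    tail anchor of max_length characters each; shorter needles are kept whole.
    """
    if len(needle) > 2 * max_length:
        return [needle[:max_length], needle[-max_length:]]
    return [needle]


long_anchors = split_needle("abcdefghij", max_length=4)
short_anchors = split_needle("abcdefgh", max_length=4)
```

With `max_length=4`, the ten-character needle is split into its first and last four characters, while the eight-character needle is exactly `2 * max_length` long and is therefore matched whole.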

Returns

Name	Type	Description
	FuzzyMatch or None or tuple[FuzzyMatch or None, list[WordSegment]]	The match result with timestamps, or None if no match reaches the threshold. If `return_words` is True, returns a `(FuzzyMatch | None, list[WordSegment])` tuple instead, for debugging purposes.
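Because the return shape depends on `return_words`, calling code may find it convenient to normalise both shapes into one. The helper below is a hypothetical sketch, not part of the library; it relies only on the two return shapes described above and treats the match object as opaque.

```python
def unpack(result, return_words):
    """Normalise either return shape into a (match, words) pair.

    `result` is whatever fuzzy_match returned; `return_words` is the flag
    that was passed to it. `words` is None when it was not requested.
    """
    if return_words:
        match, words = result
        return match, words
    return result, None


# Placeholder values stand in for real FuzzyMatch / WordSegment objects.
no_match = unpack(None, return_words=False)
debug_no_match = unpack((None, ["w1", "w2"]), return_words=True)
plain_match = unpack("match-object", return_words=False)
```

In every case the caller can then check `match is None` first and only afterwards look at the word list, which is present only in the debugging configuration.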