vad_pipeline_generator

pipelines.vad_pipeline_generator(
    model,
    audio_paths,
    audio_dir,
    speeches=None,
    chunk_size=30,
    sample_rate=16000,
    metadata=None,
    num_workers=1,
    prefetch_factor=2,
    save_json=True,
    save_msgpack=False,
    return_vad=False,
    output_dir='output/vad',
)

Run voice activity detection (VAD) on a list of audio files.

Parameters

Name	Type	Description	Default
model	object	The loaded VAD model.	required
audio_paths	list	List of paths to audio files.	required
audio_dir	str	Directory containing the audio files or subdirectories (used when the entries in audio_paths are relative).	required
speeches	list[list[SpeechSegment]] or None	Optional lists of SpeechSegment objects that restrict VAD to specific segments of the audio. Running VAD and alignment only on the segments that overlap with text transcripts generally improves alignment.	`None`
chunk_size	int	Maximum length, in seconds, of the chunks that VAD creates.	`30`
sample_rate	int	The sample rate to resample the audio to before running VAD.	`16000`
metadata	list[dict] or None	Optional list of additional file level metadata to include.	`None`
num_workers	int	The number of workers for the DataLoader.	`1`
prefetch_factor	int	The prefetch factor for the DataLoader.	`2`
save_json	bool	Whether to save the VAD output as JSON files.	`True`
save_msgpack	bool	Whether to save the VAD output as Msgpack files.	`False`
return_vad	bool	Whether to yield the VAD output.	`False`
output_dir	str	Directory to save the VAD output files.	`"output/vad"`
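
The sketch below shows one way to prepare `audio_paths` as paths relative to `audio_dir`, together with a matching `metadata` list. The helper names `collect_relative_audio_paths` and `build_metadata`, the extension list, and the `source_file` field are hypothetical and not part of the library. That `metadata` should follow the same order as `audio_paths` is an assumption.

```python
from pathlib import Path


def collect_relative_audio_paths(audio_dir, extensions=(".wav", ".mp3", ".flac")):
    """Return sorted paths of audio files under audio_dir, relative to audio_dir."""
    root = Path(audio_dir)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )


def build_metadata(audio_paths):
    """Build one dict per audio file, in the same order as audio_paths.

    The "source_file" key is a made-up example of file-level metadata.
    """
    return [{"source_file": path} for path in audio_paths]
```

The resulting lists could then be passed as `audio_paths=...` and `metadata=...`, with `audio_dir` set to the directory that was scanned.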

Yields

Name	Type	Description
	AudioMetadata	If `return_vad` is True, yields one AudioMetadata object per audio file.
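
Because this function is a generator, calling it only creates a generator object; in Python the body does not run until the generator is iterated. The sketch below illustrates this with `stand_in_vad_generator`, a hypothetical stand-in that is not the real `vad_pipeline_generator`. The plain dict it yields is a placeholder for an AudioMetadata object, and its behavior when `return_vad` is False (process every file, yield nothing) is an assumption about the real function.

```python
processed = []  # records which files the stand-in has handled


def stand_in_vad_generator(audio_paths, return_vad=False):
    """Hypothetical stand-in for vad_pipeline_generator (illustration only)."""
    for path in audio_paths:
        # The real function would run VAD and save output files here.
        processed.append(path)
        if return_vad:
            # Placeholder for an AudioMetadata object.
            yield {"audio_path": path}


# Creating the generator does not process anything yet.
gen = stand_in_vad_generator(["a.wav", "b.wav"], return_vad=True)
processed_before_iteration = list(processed)

# Iterating drives the pipeline, one file at a time.
results = [item for item in gen]

# Without return_vad, the generator still has to be consumed to do its work.
for _ in stand_in_vad_generator(["c.wav"]):
    pass
```

Consuming the generator in a `for` loop keeps one result in memory at a time, which matters when the list of audio files is long.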