Towards Duration Style Conversion

Tuan Dinh

We present preliminary results on improving speech intelligibility using duration conversion. We use a phase vocoder to convert the duration of habitual speech to that of slow speech, either uniformly or non-uniformly. Uniform conversion applies a single sentence-level scaling factor to habitual speech; non-uniform conversion applies multiple phoneme-level scaling factors. We assume these scaling factors are available and, for non-uniform conversion, that phoneme labels and boundaries are also available. The goal is to determine whether non-uniform conversion is better than uniform conversion. In these preliminary results, however, non-uniform conversion is not better than uniform conversion.
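The difference between the two conversion modes can be illustrated with a minimal sketch. It assumes, purely for illustration, that scaling factors are ratios of slow to habitual durations computed from aligned phoneme boundaries in seconds. The function names and the example boundaries are hypothetical and are not taken from the study. The sketch only computes the target timing; the waveform itself would still be stretched by a phase vocoder, which is not shown.

```python
import numpy as np

def uniform_factor(hab_bounds, slow_bounds):
    """Sentence-level factor: total slow duration / total habitual duration (assumed definition)."""
    hab = np.asarray(hab_bounds, dtype=float)
    slow = np.asarray(slow_bounds, dtype=float)
    return (slow[-1] - slow[0]) / (hab[-1] - hab[0])

def phoneme_factors(hab_bounds, slow_bounds):
    """Phoneme-level factors: one slow/habitual duration ratio per phoneme (assumed definition)."""
    hab_dur = np.diff(np.asarray(hab_bounds, dtype=float))
    slow_dur = np.diff(np.asarray(slow_bounds, dtype=float))
    return slow_dur / hab_dur

def warp_boundaries(hab_bounds, factors):
    """Scale each habitual phoneme duration and return new boundaries starting at 0.

    `factors` may be a scalar (uniform) or one value per phoneme (non-uniform).
    """
    hab_dur = np.diff(np.asarray(hab_bounds, dtype=float))
    new_dur = hab_dur * factors
    return np.concatenate(([0.0], np.cumsum(new_dur)))

# Hypothetical boundaries (seconds) for a three-phoneme utterance.
hab_bounds = [0.0, 0.1, 0.3, 0.4]
slow_bounds = [0.0, 0.2, 0.5, 0.8]

uniform = warp_boundaries(hab_bounds, uniform_factor(hab_bounds, slow_bounds))
non_uniform = warp_boundaries(hab_bounds, phoneme_factors(hab_bounds, slow_bounds))
print(uniform)
print(non_uniform)
```

Under these assumptions both conversions reach the same total duration, but only the non-uniform one reproduces the internal phoneme boundaries of the slow utterance; the uniform one stretches every phoneme by the same amount.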

Non-uniform vs uniform modification

Figure panels: habitual speech, non-uniform modification, uniform modification, slow speech.