Open
Description
Describe the problem
When using inference.TTS with a Cartesia model, modelOptions exposes a speed parameter.
The model options are defined by this interface:
export interface CartesiaOptions {
/** Maximum duration of audio in seconds. */
duration?: number;
/** Speech speed. Default: not specified. */
speed?: 'slow' | 'normal' | 'fast';
}
Cartesia accepts not only these values but also numeric values, where 1.0 represents 'normal'. I'd like the CartesiaOptions interface to support numbers as well. For some use cases 'fast' is too fast and 'normal' is too slow.
Describe the proposed solution
Update CartesiaOptions interface to:
export interface CartesiaOptions {
/** Maximum duration of audio in seconds. */
duration?: number;
/** Speech speed. Default: not specified. */
speed?: 'slow' | 'normal' | 'fast' | number;
}
I don't think any other changes are needed, since I'm already using this config and it works fine:
tts: new inference.TTS({
model: 'cartesia/sonic-3',
voice: 'd4db5fb9-f44b-4bd1-85fa-192e0f0d75f9',
language: 'es',
modelOptions: {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
speed: 1.25 as any,
},
}),
Alternatives considered
No response
Importance
nice to have
Additional Information
No response
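For illustration, here is a minimal, self-contained sketch of how the widened type might behave. The interface is a local copy of the proposal above, not the library's actual source, and isNumericSpeed is a hypothetical helper that does not exist in any package. It only shows that both presets and numbers would type-check without a cast:

```typescript
// Local copy of the proposed interface, for illustration only;
// the real definition lives in the library and may differ.
interface CartesiaOptions {
  /** Maximum duration of audio in seconds. */
  duration?: number;
  /** Speech speed: a named preset or a numeric value (1.0 corresponds to 'normal'). */
  speed?: 'slow' | 'normal' | 'fast' | number;
}

// Hypothetical helper (not part of any library): narrows the union so
// downstream code can tell a numeric speed from a named preset.
function isNumericSpeed(speed: CartesiaOptions['speed']): speed is number {
  return typeof speed === 'number';
}

// Both forms are accepted by the widened type.
const preset: CartesiaOptions = { speed: 'fast' };
const numeric: CartesiaOptions = { speed: 1.25 }; // no `as any` or eslint-disable needed
```

With a type like this, the modelOptions in the config above could presumably drop the cast and the eslint-disable comment, assuming nothing else in the library narrows speed to the three string presets.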