diff --git a/index.bs b/index.bs
index b28088a..5055872 100644
--- a/index.bs
+++ b/index.bs
@@ -202,9 +202,16 @@ interface SpeechRecognition : EventTarget {
     attribute EventHandler onend;
 };

+enum SpeechRecognitionQuality {
+  "command",
+  "dictation",
+  "conversation"
+};
+
 dictionary SpeechRecognitionOptions {
   required sequence<DOMString> langs;
   boolean processLocally = false;
+  SpeechRecognitionQuality quality = "command";
 };

 enum SpeechRecognitionErrorCode {
@@ -389,7 +396,7 @@ See [=default allowlist/'self'=].
 When invoked, run these steps:
@@ -400,7 +407,7 @@ See
 lang in {{SpeechRecognitionOptions/langs}} of options is not a valid [[!BCP47]] language tag, throw a {{SyntaxError}} and abort these steps.
 1. If the on-device speech recognition language pack for any lang in {{SpeechRecognitionOptions/langs}} of options is unsupported, return a resolved {{Promise}} with false and skip the rest of these steps.
 1. Let promise be a new promise.
- 1. For each lang in {{SpeechRecognitionOptions/langs}} of options, initiate the download of the on-device speech recognition language for lang.
+ 1. For each lang in {{SpeechRecognitionOptions/langs}} of options, initiate the download of the on-device speech recognition language pack for lang matching the requested {{SpeechRecognitionOptions/quality}} level floor.

Note: The user agent can prompt the user for explicit permission to download the on-device speech recognition language pack.
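
To make the effect of the new dictionary member on the install steps concrete, here is a minimal TypeScript sketch of the step ordering in the hunk above. It is an illustration under stated assumptions, not spec text: `planInstall`, `InstallPlan`, and the `isSupported` callback are hypothetical names, the built-in `SyntaxError` stands in for the `SyntaxError` DOMException, and the language-tag check relies on `Intl` rejecting structurally invalid tags.

```typescript
// Mirrors the IDL enum added in this patch.
type SpeechRecognitionQuality = "command" | "dictation" | "conversation";

// Mirrors the SpeechRecognitionOptions dictionary. `quality` is optional
// here because the IDL gives it a default value of "command".
interface SpeechRecognitionOptions {
  langs: string[];
  processLocally?: boolean;
  quality?: SpeechRecognitionQuality;
}

// Hypothetical result type describing what the install steps decide to do.
type InstallPlan =
  | { action: "resolve-false" }
  | { action: "download"; langs: string[]; quality: SpeechRecognitionQuality };

// Structural BCP 47 check: Intl throws a RangeError for malformed tags.
function isValidLanguageTag(tag: string): boolean {
  try {
    Intl.Collator.supportedLocalesOf(tag);
    return true;
  } catch (e) {
    return false;
  }
}

// Hypothetical helper. `isSupported` stands in for the user agent's own
// knowledge of which on-device language packs exist.
function planInstall(
  options: SpeechRecognitionOptions,
  isSupported: (lang: string) => boolean
): InstallPlan {
  // Apply the IDL default when the caller omits `quality`.
  const quality: SpeechRecognitionQuality =
    options.quality === undefined ? "command" : options.quality;

  // Invalid language tag: throw and abort.
  for (const lang of options.langs) {
    if (!isValidLanguageTag(lang)) {
      throw new SyntaxError("Invalid BCP 47 language tag: " + lang);
    }
  }

  // Any unsupported language pack: resolve with false, skip the rest.
  for (const lang of options.langs) {
    if (!isSupported(lang)) {
      return { action: "resolve-false" };
    }
  }

  // Otherwise download a pack for each language at the requested floor.
  return { action: "download", langs: options.langs.slice(), quality: quality };
}
```

A caller that omits `quality` gets a plan with `quality: "command"`, so existing callers of the install method would behave as if they had requested the lowest level.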

@@ -448,6 +455,19 @@ See
+<h4>SpeechRecognitionQuality Enum Values</h4>
+
+The {{SpeechRecognitionQuality}} enum indicates the semantic capability and quality level floor requested for the speech recognition model. Its values are:
+
+<dl>
+  <dt>"command"</dt>
+  <dd>Level 1: Short phrases, single speaker, limited vocabulary (e.g., voice commands for smart home or simple apps).</dd>
+
+  <dt>"dictation"</dt>
+  <dd>Level 2: Continuous speech, moderate background noise, single primary speaker (e.g., long-form text input like SMS/Email).</dd>
+
+  <dt>"conversation"</dt>
+  <dd>Level 3: Multi-speaker, complex vocabulary, high noise tolerance (e.g., meeting transcripts and continuous captioning).</dd>
+</dl>
+
 When the availability algorithm with options and promise is invoked, the user agent MUST run the following steps:
 1. If the [=current settings object=]'s [=relevant global object=]'s [=associated Document=] is NOT [=fully active=], throw an {{InvalidStateError}} and abort these steps.
 1. Let langs be {{SpeechRecognitionOptions/langs}} of options.
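
The patch describes the three {{SpeechRecognitionQuality}} values as Level 1 to Level 3 and calls the requested value a "floor". One plausible reading, sketched below in TypeScript, is that a model satisfies a request when its level is at least the requested level. The numeric ranking, `meetsQualityFloor`, and `pickSmallestSufficientModel` are hypothetical names for illustration, and choosing the lowest sufficient level is an assumed policy rather than anything the patch mandates.

```typescript
// Mirrors the IDL enum added in this patch.
type SpeechRecognitionQuality = "command" | "dictation" | "conversation";

// Level numbers taken from the enum value descriptions (Level 1 to Level 3).
const QUALITY_LEVEL: Record<SpeechRecognitionQuality, number> = {
  command: 1,
  dictation: 2,
  conversation: 3,
};

// Assumed "floor" semantics: a model is acceptable when its level is
// greater than or equal to the requested level.
function meetsQualityFloor(
  model: SpeechRecognitionQuality,
  requested: SpeechRecognitionQuality
): boolean {
  return QUALITY_LEVEL[model] >= QUALITY_LEVEL[requested];
}

// Assumed policy: among the available models, pick the lowest level that
// still meets the floor. Returns undefined when none qualifies.
function pickSmallestSufficientModel(
  available: SpeechRecognitionQuality[],
  requested: SpeechRecognitionQuality
): SpeechRecognitionQuality | undefined {
  let best: SpeechRecognitionQuality | undefined;
  for (const candidate of available) {
    if (!meetsQualityFloor(candidate, requested)) {
      continue;
    }
    if (best === undefined || QUALITY_LEVEL[candidate] < QUALITY_LEVEL[best]) {
      best = candidate;
    }
  }
  return best;
}
```

Under this reading, a user agent that only ships a "conversation" model could still serve a "command" request, while a "command"-only model could not serve a "dictation" request.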