spec: Suggestions respect capitalization #3720
Description
This is part of a broader specification for Caps Lock support on touch devices. Development of this touches multiple components; each component will be implemented in a separate PR, referencing this issue.
Related features
- spec: Caps Lock layer for touch layouts #3620
- spec: Start of text/sentence selects shift layer #3621
Introduction
Currently, when a user starts typing at the start of a sentence, KeymanWeb does not respect start-of-sentence capitalization. Because the lexical model dictionary ignores case when matching, we need a way to adjust suggestions to match the input casing: Initial case or ALL CAPS.
A flag named languageUsesCasing on the model determines if casing rules should be applied to suggestions. If this flag is not set, none of the following functionality should apply.
A model can define an applyCasing() method consumed by wordform2Key():
// Separately declared, safe for models to reference
declare type CasingForm = 'lower' | 'initial' | 'upper';
applyCasing(form: CasingForm, text: string): string
Initial case will only upper case the first letter of the string. Lower case will lower case the whole string, and upper case will upper case the whole string.
A default implementation uses toLowerCase(), toUpperCase() and a simple toInitialCase() function. Developers may override applyCasing(); an override should handle its special cases first and then delegate to defaultApplyCasing().
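As a rough illustration of the default behaviour described above, a minimal sketch might look like the following. The bodies of `toInitialCase` and `defaultApplyCasing` are assumptions based only on the description in this issue, not the shipped implementation.

```typescript
type CasingForm = 'lower' | 'initial' | 'upper';

// Assumed helper: upper-cases only the first character and leaves the rest as-is.
function toInitialCase(text: string): string {
  return text.charAt(0).toUpperCase() + text.substring(1);
}

// Assumed default: plain, locale-independent JavaScript casing.
function defaultApplyCasing(form: CasingForm, text: string): string {
  switch (form) {
    case 'lower':
      return text.toLowerCase();
    case 'upper':
      return text.toUpperCase();
    case 'initial':
      return toInitialCase(text);
    default:
      return text;
  }
}
```

Because initial casing touches only the first character, a word such as `iPhone` would become `IPhone` rather than `Iphone` under this sketch.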
Example for Turkish
function applyCasing(form: CasingForm, text: string,
    defaultApplyCasing: (form: CasingForm, text: string) => string): string {
  switch(form) {
    case 'lower':
      // Map dotted and dotless I explicitly before the default lower-casing.
      return defaultApplyCasing(form, text
        .replace(/I/g, 'ı')
        .replace(/İ/g, 'i'));
    case 'upper':
      return defaultApplyCasing(form, text
        .replace(/ı/g, 'I')
        .replace(/i/g, 'İ'));
    case 'initial':
      // Upper-case only the first character, reusing the Turkish rules above.
      return applyCasing('upper', text.charAt(0), defaultApplyCasing) + text.substring(1);
    default:
      return text;
  }
}
Note: toLocaleUpperCase/toLocaleLowerCase are only supported in Chrome M58 and later on Android (we currently support M35), so we cannot use them at this time.
Mechanism
Start of token predictions: KeymanWeb should report the current on-screen keyboard layer to the LMLayer. At start of word, this will allow predictions to display with ‘initial’ casing.
To assist with this, ‘start of token’ will trigger a predictive round of fat-finger execution based on the current on-screen keyboard layer. The LMLayer will request this from KeymanWeb.
Mid-word predictions: The LMLayer should check the current context token to determine casing requirements for suggestions. If the first character is lowercase, then predictions should be provided without case modification (this preserves proper names, acronyms, and special casing provided in the model, which covers a majority of languages).
Else, if the token is only one character long, or only the first character is upper case, then applyCasing(initial) should be applied for each suggestion, unless the first letter of the suggestion is already uppercase (again, allowing for acronyms, special cases).
Otherwise, if the token is all upper case, then applyCasing(upper) should be applied to each suggestion.
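The three mid-word rules above could be sketched roughly as follows. The names `caseSuggestion`, `isUpper`, and `sampleApplyCasing` are hypothetical, and the two-argument `applyCasing` signature is a simplification; the sketch leaves suggestions unchanged for mixed-case tokens such as `ThE`, which the rules do not address.

```typescript
type CasingForm = 'lower' | 'initial' | 'upper';
type ApplyCasing = (form: CasingForm, text: string) => string;

// Stand-in for a model's applyCasing (assumption: default-like behaviour).
const sampleApplyCasing: ApplyCasing = (form, text) =>
  form === 'lower' ? text.toLowerCase()
  : form === 'upper' ? text.toUpperCase()
  : text.charAt(0).toUpperCase() + text.substring(1);

// True only for cased characters that are in their upper-case form.
function isUpper(ch: string): boolean {
  return ch !== ch.toLowerCase() && ch === ch.toUpperCase();
}

function caseSuggestion(token: string, suggestion: string, applyCasing: ApplyCasing): string {
  // Rule 1: first character is not upper case, so keep the model's own casing.
  if (!isUpper(token.charAt(0))) {
    return suggestion;
  }
  const rest = token.substring(1);
  // Rule 2: single character, or only the first character is upper case.
  if (rest.length === 0 || rest === applyCasing('lower', rest)) {
    return isUpper(suggestion.charAt(0)) ? suggestion : applyCasing('initial', suggestion);
  }
  // Rule 3: the whole token is upper case.
  if (token === applyCasing('upper', token)) {
    return applyCasing('upper', suggestion);
  }
  // Mixed casing is not covered by the rules; leave the suggestion alone.
  return suggestion;
}
```

For example, a token of `Th` turns the suggestion `the` into `The`, while a token of `T` leaves `NASA` untouched because its first letter is already upper case.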
Testing Casing
So long as the input text adheres to one of the three standard casing patterns, testing it against itself (via input_text == applyCasing(case, input_text)) should be mostly sufficient. However, if the input is all lowercase (the default case), we should not modify the suggestion’s base casing. If the suggestion’s base form in the lexical model is either initial or upper, that implies that the wordform is invalid for lower-casing.
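One way the self-test described above might be written is shown below. `getCaseOf` and `sampleApplyCasing` are hypothetical names. The order of the checks is an assumption: a single upper-case character round-trips under both 'upper' and 'initial', so this sketch treats it as 'initial', matching the single-character rule in the Mechanism section.

```typescript
type CasingForm = 'lower' | 'initial' | 'upper';
type ApplyCasing = (form: CasingForm, text: string) => string;

// Stand-in for a model's applyCasing (assumption: default-like behaviour).
const sampleApplyCasing: ApplyCasing = (form, text) =>
  form === 'lower' ? text.toLowerCase()
  : form === 'upper' ? text.toUpperCase()
  : text.charAt(0).toUpperCase() + text.substring(1);

// Returns the first casing form under which the text round-trips, or null.
function getCaseOf(text: string, applyCasing: ApplyCasing): CasingForm | null {
  const matches = (form: CasingForm) => text === applyCasing(form, text);
  if (matches('lower')) return 'lower';
  // Checked before 'initial' because all-caps text also round-trips under an
  // initial-casing function that only touches the first character.
  if (text.length > 1 && matches('upper')) return 'upper';
  if (matches('initial')) return 'initial';
  return null;
}
```

Since the 'initial' check only constrains the first character, a token like `HeLLo` is still reported as 'initial' here, which is one reason the round-trip test is only mostly sufficient.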
So, in broad-stroke pseudocode…
if get_case_of(input_text) == lower
    return suggestion_text
else
    return applyCasing(get_case_of(input_text), suggestion_text)
C3.1 Changes to .model.ts format
Additions:
- applyCasing
- languageUsesCasing
C3.2 Changes to Compiler
Incorporates everything listed in C3.1.
May need additional validation: the new function (applyCasing) must compile properly.
C3.3 Changes to LMLayer
- adds case-matching
- is called toward end of prediction process
- is a transform on the most likely suggestions
- merging of duplicate suggestions required after the transform (e.g. god->God, God)
- Use of languageUsesCasing
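The duplicate-merging step listed above might be sketched as follows. The `Suggestion` shape and the choice to sum the probabilities of merged entries are assumptions made for illustration; this issue does not specify how merged suggestions should be weighted.

```typescript
// Hypothetical suggestion shape: display text plus a probability.
interface Suggestion {
  text: string;
  p: number;
}

// Merges suggestions whose text is identical after the casing transform.
// Assumption: merged entries sum their probabilities, then re-sort descending.
function mergeDuplicates(suggestions: Suggestion[]): Suggestion[] {
  const merged = new Map<string, Suggestion>();
  for (const s of suggestions) {
    const existing = merged.get(s.text);
    if (existing) {
      existing.p += s.p;
    } else {
      merged.set(s.text, { text: s.text, p: s.p });
    }
  }
  return Array.from(merged.values()).sort((a, b) => b.p - a.p);
}
```

With the example above, `god` cased to `God` and the model's own `God` would collapse into a single `God` entry rather than appearing twice in the suggestion banner.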
Additional notes
Future functionality plans remain in the Google Doc. We will revisit if/when we decide to implement.