spec(core): LDML KeyboardProcessor 🙀  #5015

@mcdurdin

Introduction

This is a holding issue for the Keyman Core work required to support LDML keyboards. We'll be filling this in as we complete planning and design. Much of this content should be moved to separate issues as we expand the details and establish component boundaries.

Objective: build a keystroke processor that fits the existing Keyman Core design and supports LDML keyboards, including:

  • load/save of LDML data
  • keyboard metadata API (particularly to support OSK) (this may not be needed?)
  • stateless keystroke transform

The intent is to build this in C++. It will need to cross-compile to native Windows, macOS, Linux, and WASM. The module must be standalone with no runtime dependencies (statically linking libraries is probably okay).

Related Features

LDML implementation:

Groundwork:

General Library Properties

  • no i/o, minimal/no dependencies. Why?

    • So that the WASM compiler/linker does not have to chase down file IO and deps
    • So that the lib is provably secure and portable
    • So that the lib can be tested in complete isolation from any deps or
      environmental concerns (outside data etc)
  • no_std? - to consider whether this is appropriate or not.

  • API boundary will be UTF-16 (std::u16string).

  • Internal string form will also be std::u16string.

  • Physical limits of input context and output transforms: 64? 256 chars?

  • Keyboard data storage: a binary (black-box, not necessarily optimized) format
    that the KMXPlus compiler will produce given input XML.
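
As a rough illustration of how a physical context limit could interact with the UTF-16 string form, here is a minimal sketch. The function name `clamp_context` and the policy of keeping the text nearest the caret are assumptions for discussion only; the actual limit (64? 256?) is still open.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Keyman Core.
// Keeps at most `max_units` UTF-16 code units from the end of `context`
// (the text nearest the caret) without splitting a surrogate pair.
std::u16string clamp_context(const std::u16string& context, std::size_t max_units) {
  if (context.size() <= max_units) return context;
  std::size_t start = context.size() - max_units;
  // If the cut lands on a low surrogate, skip it so the result begins
  // on a codepoint boundary.
  if (start < context.size() && context[start] >= 0xDC00 && context[start] <= 0xDFFF) {
    ++start;
  }
  return context.substr(start);
}
```

Note that under this policy the result may be one code unit shorter than the limit when the cut would otherwise fall inside a surrogate pair.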

C5015.1: Infrastructure

Moved to #5069.

  • Define LDMLKeyboardProcessor folder, build scripts, basic files
  • Add template unit tests

C5015.2: KMX+ Binary Loader

  • File Format: spec(common): KMX+ binary format supporting LDML #7043
  • Loads from the BLOB
  • No i/o requirements, so the keyboard processor can depend on this
  • Provides metadata access (see §2.1, "Library for LDML access")
  • used by:
    • LDML Keyboard Processor Library
    • unit test
    • clients (e.g. "list of keyboards", "filtering keyboards", "get osk data")
  • API:
    • ...
    • Metadata
    • Keyboard definition, transforms
    • OSK layout
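
To make "loads from the BLOB, no i/o" concrete, a loader could be built on a bounds-checked view over bytes that the caller has already placed in memory. The sketch below is only that: `BlobView` and `read_u32` are hypothetical names, and little-endian byte order is an assumption here. The real layout is defined by the KMX+ format spec (#7043), which this example does not attempt to reproduce.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical read-only view over a keyboard blob held in memory.
// The caller owns the bytes; nothing here opens or reads a file.
class BlobView {
public:
  BlobView(const std::uint8_t* data, std::size_t size) : data_(data), size_(size) {}

  // Reads a 32-bit value at `offset` (little-endian is assumed).
  // Returns false, leaving `out` untouched, if the read would run past the end.
  bool read_u32(std::size_t offset, std::uint32_t& out) const {
    if (offset > size_ || size_ - offset < 4) return false;
    out = static_cast<std::uint32_t>(data_[offset])
        | (static_cast<std::uint32_t>(data_[offset + 1]) << 8)
        | (static_cast<std::uint32_t>(data_[offset + 2]) << 16)
        | (static_cast<std::uint32_t>(data_[offset + 3]) << 24);
    return true;
  }

  std::size_t size() const { return size_; }

private:
  const std::uint8_t* data_;
  std::size_t size_;
};
```

Because every read is range-checked and nothing touches the filesystem, the same code can be exercised from unit tests, from the keyboard processor, and from a WASM build with identical inputs.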

C5015.3: LDML Keyboard Processor Library

  • no i/o
  • no state besides context
  • Depends on:
    • LDML Datablob Library
  • API:
    • constructor function:
      • LDML keyboard datablob
      • Platform immutable properties (e.g. OS, etc)?
    • processEvent function
      • Inputs:
        • context state
          • before text buffer - from app (string of Unicode characters; unspecified normalisation form; valid UTF-x)
          • transitory state - e.g. deadkeys - from previous processor run
            • may be empty/null on the first run of the processing engine
            • string of Unicode characters or index into state table?
          • "user settings" (if added to LDML)
        • incoming keystroke
          • key code - virtual key code (Windows?)
            • For hardware, the vkey is already resolved
          • modifiers - shift, ctrl, option, etc
          • toggle state keys - caps, num, etc
        • flags
          • touch or hardware (!touch)
      • Outputs:
        • Transform: Delete x Unicode codepoints before caret, insert string
        • Not supported:
          • delete x codepoints after
          • Caret repositioning
        • Transitory state for the next input event
        • Next OSK layer (5.14)
        • Changes to 'toggle' modifiers
        • Fail/error notification (in Keyman: "beep"; 5.18 ["error"])
      • Logic:
        • if !touch
          • lookup and remap vkey in vkey table
        • NOT called for switch keys; those are handled by the caller
        • Lookup (vkey, mod) in the keys section
          • Yields a UTF-32 codepoint or UTF-16LE string
        • if backspace, process as backspace and stop.
          • TODO/Q: Or is this handled by the layer?
        • push_character to context and to actions
        • TODO-LDML: Transform Mapping Here

C5015.4: Test Framework

  • Work from kmxkbd unit test model in Keyman Core
  • A data-driven test harness
    • Test harness should have no i/o
  • C++ and Typescript test runners
    • Single data source should produce identical results on all platforms
    • Allows us to verify that the interfaces are not causing trouble without additional unit tests
  • Java runner? (for CLDR CI)
  • Interactive tests
    • Web based?
    • GUI based?
  • Command line driven tooling for manual tests
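
One possible shape for the data-driven harness is shown below, as a sketch only. `TestCase`, `Processor`, and `run_cases` are hypothetical names; the cases are plain in-memory data, so the harness itself does no i/o and the same table could be fed to a TypeScript or Java runner.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical test record: one keystroke applied to a starting context.
struct TestCase {
  std::u16string context_before;
  std::uint16_t vkey;
  std::uint16_t modifiers;
  std::u16string expected_context;
};

// The processor under test, reduced to "context + key -> new context".
using Processor = std::function<std::u16string(
    const std::u16string&, std::uint16_t, std::uint16_t)>;

// Runs every case and returns the indices of the cases that failed.
std::vector<std::size_t> run_cases(const std::vector<TestCase>& cases,
                                   const Processor& process) {
  std::vector<std::size_t> failures;
  for (std::size_t i = 0; i < cases.size(); ++i) {
    const TestCase& c = cases[i];
    if (process(c.context_before, c.vkey, c.modifiers) != c.expected_context) {
      failures.push_back(i);
    }
  }
  return failures;
}
```

Returning failure indices rather than printing keeps reporting in the hands of the caller, so a command line tool or a web-based interactive runner can present results however it likes.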

C5015.5: Keyboard Delivery

Requirements:

  • The minimum version of Keyman that can load these .kmx files will be _____.
  • A .kvk file will be generated by the compiler from the LDML source file for use by desktop platforms.
  • On web, the .js will embed the .kmx as a base64-encoded binary blob, alongside the touch and kvk data and necessary metadata. The .kmx blob will be delivered to the Keyman Core WASM module.

Questions:

  • What file naming conventions should be used?
  • Should there be a limit of one LDML file per kmp package?
