Skip to content

Refactor the way VT control sequences are identified and dispatched #7276

@j4james

Description

@j4james

Description of the new feature/enhancement

The way VT sequences are currently processed, we build up a list of intermediate characters (and private parameter prefixes) in a wchar_t vector, and then pass both that vector and the final character through to the state machine engine to dispatch. The engine goes through a rather complicated process where it has to check for intermediates, and branch off into separate methods, just to figure out which command has actually been requested.

So to make things simpler, I want to propose we identify each control sequence with a single integer value, which is a combination of the private parameter prefix, the intermediates, and the final char. This identifier can then be used in a simple switch statement when dispatching commands, without having to worry about special case handling for intermediates.

Right now we only have a few commands that depend on intermediates and private prefixes, and it's already a bit of a pain to work with. Once we start adding more advanced commands, though, it's going to get a lot worse. So before we get to that point, I thought it would be worth refactoring the code to make it a little easier to extend.

Proposed technical implementation details (optional)

I've been investigating different ways of implementing this, and the option I found best was a VTID class with a single size_t field holding the sequence value, and a size_t cast operator that will return that value. It would also have a constexpr constructor that can build the value from a string, so the action code enums can easily be defined with something like DECSTR_SoftReset = VTID("!p"). Currently we're pretending that DECSTR is identified by p, which is actually something else entirely.

The actual structure of the sequence value is simply made up from the intermediate and final characters stored one byte at a time in reverse order. For example, the DECTLTC control has a private parameter prefix of ?, one intermediate of ', and a final character of s. The ASCII values of those characters are 0x3F, 0x27, and 0x73 respectively, and reversing them gets you 0x73273F, so that would then be the identifier for the control.

The reason for storing them in reverse, is because sometimes we need to look at the first intermediate to determine the operation, and treat the rest of the sequence as a kind of sub-identifier (the character set designation sequences are one example of this). This is easily achieved by masking off the low byte to get the first intermediate, and then shifting the value right by 8 bits to get a new identifier with the rest of the sequence.

The downside of the reverse order is it makes constructing the sequence in the state machine a little more complicated, because you need to keep track of how many characters have already been stored in the identifier (each new character needs to be shifted 8-bit further left). The way I've approached that was with a little VTIDBuilder class, that has both an accumulator for the value being built up, and a shift value that determine the offset at which each new character is added.

And note that we don't need to worry too much about overflowing, because any sequence that wouldn't fit in a size_t would not be a valid sequence anyway. If it looks like it's going to overflow, we simply set the accumulator to zero, and stop accumulating further characters. When it's eventually dispatched, it'll just be ignored as an unrecognised identifier.

This is quite a big change to the code base, but I do think it makes things simpler, and should make the process of adding new multicharacter control sequences a lot less hassle.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Area-VTVirtual Terminal sequence supportHelp WantedWe encourage anyone to jump in on these.Issue-TaskIt's a feature request, but it doesn't really need a major design.Product-ConhostFor issues in the Console codebaseResolution-Fix-CommittedFix is checked in, but it might be 3-4 weeks until a release.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions