Skip to content

Refactoring/reimplementation of MXP support#3625

Merged
vadi2 merged 50 commits intoMudlet:developmentfrom
gcms:mxptbufrefactoring
May 3, 2020
Merged

Refactoring/reimplementation of MXP support#3625
vadi2 merged 50 commits intoMudlet:developmentfrom
gcms:mxptbufrefactoring

Conversation

@gcms
Copy link
Copy Markdown
Contributor

@gcms gcms commented Apr 16, 2020

  • Refactoring/reimplementation of MXP protocol
    • Extracts MXP support from TBuffer
    • Separates input parsing from protocol processing: input is parsed into MxpTag objects by the MxpNodeBuilder and are passed to be handled by the MxpTagProcessor
    • Class hierarchy of MxpTagHandler for implementing support for each tag: allows for a clear separation of the implementation of each tag and makes it easier for adding support for new tags
    • Defines a clear interface (MxpClient) separating the MXP implementation from other parts of the code
    • Solves some limitations of previous implementation such as supporting element definitions not only for<SEND> tags, interpolating named and positional attributes and interpolating &text; placeholder by tag content
    • Includes support for COLOR tag
    • Handles FONT, U, I, B, VAR and !ENTITY tags, but so far no defined behavior (can be implemented in MxpMudlet)
    • Attributes that hold MXP state are now in the Host class to avoid needless copies when TBuffer is copied around
    • Introduces QtTests for some of the classes
  • Other refactorings intended to simplify TBuffer class, extracting responbilities that aren't directly related to the buffer mgmt
    • Extracted TEncodingTable from TBuffer, this class is responsible for mapping encoding names to tables. It also removes some bloat from TBuffer.
    • TLinkStore to manage links and associated hints to be displayed.
    • TEntityResolver to map character entities, such as &gt; &#32; &#x20; to their associated string values; supports interpolating strings replacing the entity placeholders by their values

Brief overview of PR changes/additions

Motivation for adding to Mudlet

Other info (issues closed, discussion etc)

* These refactorings are intended to simplify TBuffer class, extracting responbilities that aren't directly related to the buffer mgmt and support extracting MXP protocol handling from TBuffer.
- Extracted TEncodingTable, this class is now responsible for managing a map of computer-friendly encoding names as keys.
- Extract TLinkStore to manage links and associated hints to be displayed.
@gcms gcms requested a review from a team as a code owner April 16, 2020 20:27
@gcms gcms requested review from a team April 16, 2020 20:27
@add-deployment-links
Copy link
Copy Markdown

add-deployment-links bot commented Apr 16, 2020

Hey there! Thanks for helping Mudlet improve. 🌟

Test versions

You can directly test the changes here:

No need to install anything - just unzip and run.
Let us know if it works well, and if it doesn't, please give details.

gcms added 3 commits April 16, 2020 17:37
- Extracted code responsible for detecting entities such as &gt; &lt; etc from the main loop in translateToPlainText to a specialized class that keeps track of identifying an entity and replacing it into the text stream
- Extracted TMxpTagDetector class to process input and extract MXP tags for later processing
@SlySven
Copy link
Copy Markdown
Member

SlySven commented Apr 17, 2020

🚨 Please be aware that the encoding tables you are manipulating are in the process of significant reorganisation in PR #3579 . That will mess with this one somewhat.

Copy link
Copy Markdown
Member

@vadi2 vadi2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are looking good 👍 unfortunately as mentioned #3579 doesn't just add a new encoding but also re-engineers a fair bit of things so it'll conflict with this re-engineering as well. It's just about to make it in, so let's wait for that first and continue with this?

Sorry that the 2nd thing you picked was already worked on! I hope this won't happen often. Ideally we like to keep PRs small and scoped to one topic.

vadi2
vadi2 previously approved these changes Apr 17, 2020
Copy link
Copy Markdown
Member

@vadi2 vadi2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks great, let's just get the outstanding PR in first and this is good to go.

***************************************************************************/
#include "TMxpTagDetector.h"

bool TMxpTagDetector::handle(char& ch, size_t& localBufferPosition)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 really nice class.


#include "TEncodingTable.h"

const TEncodingTable TEncodingTable::defaultInstance = TEncodingTable(csmEncodings);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also happy to see this not bloat up TBuffer 👍

}

return false;
} No newline at end of file
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
}


#include "TEntityHandler.h"

bool TEntityHandler::handle(char& ch, std::string& localBuffer, size_t& localBufferPosition, size_t localBufferLength)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 nice class too. Unrelated, but Evennia and Mudlet at times still struggle to cooperate on what is a < and what is a <


#include "TLinkStore.h"

int TLinkStore::addLinks(const QStringList& links, const QStringList& hints)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice 👍

@gcms
Copy link
Copy Markdown
Contributor Author

gcms commented Apr 17, 2020

Ok, we can try to handle the merging when #3579 is finished. I have some other changes to make, I am almost managing to fully extract MXP from TBuffer.

gcms added 2 commits April 17, 2020 15:18
- Extracted TMxpTagProcessor class, responsible for parsing MXP tags and executing the expected behavior
- TMxpProcessor keeps the state of MXP and delegates tag detection and processing to TMxpTagDetector and TMxpTagProcessor
@SlySven
Copy link
Copy Markdown
Member

SlySven commented Apr 17, 2020

I have just cancelled the current MacOS CI as that has been broken for a couple of days - and I am trying a bugfix which was queued behind it...

... hopefully if it flies and goes in then you (or I) will be able to go into Travis and restart the CI build jobs for Macs.

- Includes a table of 64 standard codes (uses a hash table instead of previous if/else comparisons)
- Supports unicode codes in decimal and hexdecimal such as &#32; and &#x20;
- Supports registering custom entities (useful for implementing MXP spec)
- Includes tests for the resolver and the MXP entity handler
Copy link
Copy Markdown
Member

@vadi2 vadi2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to start on tests for the C++ side 👍


QString TEntityResolver::getResolution(const QString& entity) const
{
if (entity.front() != '&' || entity.back() != ';')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you still wrap single line ifs with braces - we've adopted that codestyle across the codebase. Apple's Goto fail was enlightening here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you still wrap single line ifs with braces - we've adopted that codestyle across the codebase. Apple's Goto fail was enlightening here.

Ok, I'll ty to stick to it. What do you use for helping to enforce the coding style?

Is there any naming convention you use in the project for classes. I am putting these Ts in front of class names just because they are already used around, but I didn get what is the meaning.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format - we have a .clang-format file at the root of the repo with the style. Unfortunately clang-format doesn't handle adding braces - there was some other tool that does, but we keep it simple with 1 formatter.

Naming convention - for variables it's m for a member variable, s for static and c for constant, hence, scm in a few places. I'm not sure about classes, to be honest :O

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a p for pointer.

*quoting many a teenage contestant on the UK ITV Quiz show from the late 80s "BlockBusters" to the host Bob Holness as in this Celebratory reprisal hosted by Dara O'Brean*:

"Can I have a P(ee) please Bob."

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Apr 19, 2020

Hey we've fixed up the failing macOS builds - I've updated your branch with the fix so it builds.

@vadi2 vadi2 added the Waiting for other PR Merge This PR is awaiting another PR to merge firstly. label Apr 19, 2020
@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Apr 21, 2020

Sorry it's taking a while - it'll be ready in 1-2 days max.

gcms added 7 commits April 21, 2020 03:24
- Extracts MXP support from TBuffer
- Support for parsing MXP tags, replacing direct string manipulation by accessing MxpTag attributes
- Class hierarchy of MxpTagHandler for implementing support for each tag. Allow for a clear separation of the implementation of each tag and makes it easier for adding new support
- Defines a clear interface separating the MXP implementation from other parts of the code
- Solves some limitations of previous implementation such as supporting element definitions not only for <SEND> tags, and interpolating named and positional attributes
- Supports COLOR tag
- Removes the MxpTagDetector, previously used to identify opening/closing <> tag symbols
- Replaced in previous commits by TMxpNodeBuilder, which detects and parse tags into a MxpTag object
- Events are added in a queue from TMxpMudlet (MXP client interface)
- TConsole sends the event to the TLuaInterpreter on printOnDisplay
@gcms gcms requested a review from a team April 21, 2020 08:13
Copy link
Copy Markdown
Member

@SlySven SlySven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a massive PR - in other circumstances I would be proud (and Vadim would, I guess, be cringing at the thought) of it. MXP is not something I want to touch with a 10 foot pole so for someone else to pick it up and give it a good going over is brilliant. 🙂

However the gains from factoring out the encoding tables are marginal IMVHO - fair enough it moves them to a separate class - however they (apart from my recent excursion into the TTextCodec class) are only really of relevance to this class. I was careful in the design to ensure that there is only ever one set of the tables in the whole application and that by using the QVector form for each table that lookups could be as good as possible. I note that you have incorporated the changes that I made recently to factor out the one potentially non-const element of the tables (their translated names for presentation in the preferences dialogue). TBH I would be happier - and this PR would be smaller if those tables were left as they were for the moment.

I guess the MXP will need a thorough testing - but does anyone (apart from Zugg) know exactly it is supposed to behave - i.e. how can we actually test this in practice? 😜

TScript.cpp
TSplitter.cpp
TSplitterHandle.cpp
TStringUtils.cpp
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent off?

TScript.h
TSplitter.h
TSplitterHandle.h
TStringUtils.h
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent wrong?

#include <QTextStream>
#include "post_guard.h"

#include "TMxpMudlet.h"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest moving these two up into the other Mudlet ones above the guarded Qt ones...

const MxpTagAttribute& MxpStartTag::getAttribute(const QString& attrName) const
{
const auto ptr = mAttrsMap.find(attrName.toUpper());
return *ptr;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does one need the local ptr value - I am not sure about return value optimisations and other things - and as you are not returning a reference the const on the return value is meaningless - same thing actually applies elsewhere in this class BTW...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange - i thought I should have spotted that '&' - guess I've been staring at too much code. 🕶️

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, hang on - you are returning a reference, and it refers to, let me think, a result from using QMap<T1, T2>::find(const T1 &) which returns an iterator pointing to the T2 type value with the given key. Maybe that is safe after all. {I've heard nasty tales of returning references to variables with local scope but that doesn't seem to be the case here, right?} 😖

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, hang on - you are returning a reference, and it refers to, let me think, a result from using QMap<T1, T2>::find(const T1 &) which returns an iterator pointing to the T2 type value with the given key. Maybe that is safe after all. {I've heard nasty tales of returning references to variables with local scope but that doesn't seem to be the case here, right?}

Exactly, this is to avoid returning a local reference. But I think the class can be redesigned to hold a value instead of reference.

for (const auto& attrName : mAttrsNames) {
result.append(' ');
if (attrName.contains(" ") || attrName.contains("<")) {
result.append('"');
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use QStringLiteral(...) here as well.

{
public:
TMxpVersionTagHandler() : TMxpSingleTagHandler("VERSION") {}
inline static const QString scmVersionString = QStringLiteral("\n\x1b[1z<VERSION MXP=1.0 CLIENT=Mudlet VERSION=%1>\n");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we really at the level of MXP 1.0 with all these changes? We certainly weren't with the previous code!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't change this part, the previous code already claimed it was MXP 1.0.

***************************************************************************/
#include "TStringUtils.h"

QStringRef TStringUtils::trimmedRef(const QStringRef& ref)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike some of the above classes and their methods, these really are working on references and need to return const values (erm, references)! Indeed they ought to be marked as const method as well shouldn't they?

return isOneOf(ch, "\'\"");
}

bool TStringUtils::isOneOf(QChar ch, const char* str)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember QStrings are UTF-16 based and that, like UTF-8, is a variable length encoding ‼️ Depending on where this method is going to be used you may encounter graphemes that require more than one QChar to convey their meaning (in which case ch.isSurrogate() will be true for both QChar that make them up) .


static void toUpper(QString& str)
{
for (auto& ch : str) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this work for graphemes not in the BMP? I.e. ones that need two QChars to convey their meaning?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this MXP code deals only with tags and their attributes, the content is just "ignored" and handled by the other parts of the code. But this function is not used anymore, I'll delete it

const QVector<QChar>& TEncodingTable::getLookupTable(const QByteArray& encoding) const
{
const auto ptr = mEncodingMap.find(encoding);
return ptr != mEncodingMap.end() ? ptr.value() : csmEmptyLookupTable;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code that used a QMap<T1, T2>::value(const T1&) is actually clearer and simpler as it automatically returned a suitable null result (well, an empty table) if the encoding key is not found - so no need to have to declare a dummy one like you have to with this STL style getLookupTable(...) method.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a semantic improvement. It's clear that this is not just a Map from a byte array to a vector of chars datastructure, but something for holding encodings and their names.

But the idea was to be a starting point, now you can move the code in TBuffer that deals with encoding to this class.

Basic principle of separation of concerns. Right now TBuffer is doing too many things. It handles protocols, char coloring, encodings, ... maybe some of these responsibilities could be better delegated to different classes.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented Apr 28, 2020

I think it's good to move those tables out - it makes the huge TBuffer class smaller and more focused on what it should do 👍

Same with the big tables in mudlet.cpp.

@SlySven
Copy link
Copy Markdown
Member

SlySven commented Apr 29, 2020

But does it slow things down with extra layers of function calls? This one is giving me indigestion as it is a lot to take in - OTOH I suppose it is sauce for the gander!

const QMap<QByteArray, QVector<QChar>> getEncodings() const { return mEncodingMap; }
QList<QByteArray> getEncodingNames() const;

const QVector<QChar>& getLookupTable(const QByteArray& encoding) const;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is pointless IMHO - the data is a single constant static QMap that is public and a Qt QMap<QByteArray, QVector<QChar>>::value(encoding) at the single point of use will return a suitable null (empty) table automagically without the need for csmEmptyLookupTable... 😝

@gcms
Copy link
Copy Markdown
Contributor Author

gcms commented Apr 29, 2020

I think it's good to move those tables out - it makes the huge TBuffer class smaller and more focused on what it should do

Same with the big tables in mudlet.cpp.

I agree. I think it is a lot better to have more smaller more focused classes than huge files with lots of members. Besides the gains in comprenhension, it makes it easier to handle merges as people are less likely to be changing the same file, reduces compile time when you change something, makes IDEs fasters (CLion gets laggy in some files of this project).

As for the performance issue of the "extra call", the compiler is smart enough to optimize things and inline it. A GUI app is not the kind of application where we should look for optimizing function calls.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented May 2, 2020

Awesome! Is this good to go guys?

Copy link
Copy Markdown
Member

@demonnic demonnic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ is not my strongest suit but this looks good to me, and the existing Lua tests all pass

image

@SlySven
Copy link
Copy Markdown
Member

SlySven commented May 2, 2020

I guess we will see how this all works in practice - if it does break anything I guess we will just have to glue the bits back together again. I cannot say that I approve it because there is too much to take in - but I realise that there is no reason to disapprove it either. Whilst I thought that de coding Extended ASCII streams is something that should be kept in here if there is no significant harm to moving that to a separate class then factoring the code out to there is not so unreasonable after all. :shipit:

When it comes to the Squash and merge please edit the commit message text down into something sensible before pushing it into the history - that has happened in the recent past and there is little merit when wandering through the repository to see a PR with a commit message that is little more than miscellaneous notes about intermittent issues in the process of creating the PR which are totally irrelevant in the end result. I am not saying that this megaPR does not have some good details in some of those commit messages - but they need to be reviewed - there may be some good stuff in there - ah, if I start the process (and abort it) I can copy the text that will be used - here it is:

* Refactor TBuffer, extracting unrelated behavior

* These refactorings are intended to simplify TBuffer class, extracting responbilities that aren't directly related to the buffer mgmt and support extracting MXP protocol handling from TBuffer.
- Extracted TEncodingTable, this class is now responsible for managing a map of computer-friendly encoding names as keys.
- Extract TLinkStore to manage links and associated hints to be displayed.

* Fixes code style formatting issues.

* Extracted MXP/HTML entity handling from TBuffer

- Extracted code responsible for detecting entities such as &gt; &lt; etc from the main loop in translateToPlainText to a specialized class that keeps track of identifying an entity and replacing it into the text stream

* Extracted MXP tag detection from TBuffer

- Extracted TMxpTagDetector class to process input and extract MXP tags for later processing

* Extracted processing of MXP tags from TBuffer

- Extracted TMxpTagProcessor class, responsible for parsing MXP tags and executing the expected behavior

* Extracted MXP protocol handling from TBuffer

- TMxpProcessor keeps the state of MXP and delegates tag detection and processing to TMxpTagDetector and TMxpTagProcessor

* Introduces TEntityResolver for mapping char entities

- Includes a table of 64 standard codes (uses a hash table instead of previous if/else comparisons)
- Supports unicode codes in decimal and hexdecimal such as &#32; and &#x20;
- Supports registering custom entities (useful for implementing MXP spec)
- Includes tests for the resolver and the MXP entity handler

* Complete refactoring/reimplementation of MXP support

- Extracts MXP support from TBuffer
- Support for parsing MXP tags, replacing direct string manipulation by accessing MxpTag attributes
- Class hierarchy of MxpTagHandler for implementing support for each tag. Allow for a clear separation of the implementation of each tag and makes it easier for adding new support
- Defines a clear interface separating the MXP implementation from other parts of the code
- Solves some limitations of previous implementation such as supporting element definitions not only for <SEND> tags, and interpolating named and positional attributes
- Supports COLOR tag

* Wrong value passed as argument TMxpFormattingTagsHandler

* Removes a class deprecated by previous commits

- Removes the MxpTagDetector, previously used to identify opening/closing <> tag symbols
- Replaced in previous commits by TMxpNodeBuilder, which detects and parse tags into a MxpTag object

* Reimplements MXP events over the new MXP code base

- Events are added in a queue from TMxpMudlet (MXP client interface)
- TConsole sends the event to the TLuaInterpreter on printOnDisplay

* Fixes bad merge

* Includes missin header #include

* Reformat files using clang-style

* Fixes MxpFontTagHandler implementation of MxpTagHandler

* Updated mudlet.pro to include missing file

* Remove uneeded member in MxpProcessor

* Reuses the same EntityResolver instance

- Different classes were initializing their own instance of this class that is supposed to be shared

* Support for MXP !ENTITY tag

* Support for MXP VAR tag

* Finished merging with upstream branch

- Moved encoding tables to TEncodingTable

* Finished merging with upstream branch

- Moved encoding tables to TEncodingTable

* Sets up mudlet-tests project

- Uses CMake
- Can be loaded through Qt Creator for editing and running the tests

* Move MXP state objects from TBuffer to Host

- As TBuffer is copied at different places is was also creating duplicates of the MXP state
- Thsi change avoids these needless copies and eliminates the needs for using pointers and copy constructors in EntityHandler

* Add missing file reference to mudlet.pro

- Includes TMxpMudlet.cpp in Qt project file

* Fixes minor issues

- Mark member functions as override
- Missing checks
- Replaces QString constructor by QString::toLatin1() due to compilation issues

* Moves LinkStore attribute back to TBuffer

- It was moved to Host together with MXP state to Host, but it's better to belong here

* Update src/TMxpVarTagHandler.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpTagProcessor.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpSendTagHandler.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Code style, naming and #includes

* Update src/TMxpProcessor.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpNodeBuilder.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpNodeBuilder.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Added pre/post guards to header include section

* Handles End Of Transmission char

- Im Achaea this char is used to signal the end of a "block" when MXP is turned on/off
- The solution handles this char as an end of line char such as \n, \r and EOF

* Updated headers' #include guard names

* Replaced mapColor member function by QColor constructor in TMxpMudlet

* Remove ununsed member function from TStringUtils

* Replaced some magic strings by QStringLiteral const attributes

Co-authored-by: Vadim Peretokin <vperetokin@gmail.com>

If @gcms can go through this and post up a cleaned up version then the person who does put it into the development code can use the that to replace the text that would be used by default! 😀

@SlySven SlySven removed their assignment May 2, 2020
@gcms
Copy link
Copy Markdown
Contributor Author

gcms commented May 3, 2020

I guess we will see how this all works in practice - if it does break anything I guess we will just have to glue the bits back together again. I cannot say that I approve it because there is too much to take in - but I realise that there is no reason to disapprove it either. Whilst I thought that de coding Extended ASCII streams is something that should be kept in here if there is no significant harm to moving that to a separate class then factoring the code out to there is not so unreasonable after all. :shipit:

When it comes to the Squash and merge please edit the commit message text down into something sensible before pushing it into the history - that has happened in the recent past and there is little merit when wandering through the repository to see a PR with a commit message that is little more than miscellaneous notes about intermittent issues in the process of creating the PR which are totally irrelevant in the end result. I am not saying that this megaPR does not have some good details in some of those commit messages - but they need to be reviewed - there may be some good stuff in there - ah, if I start the process (and abort it) I can copy the text that will be used - here it is:

* Refactor TBuffer, extracting unrelated behavior

* These refactorings are intended to simplify TBuffer class, extracting responbilities that aren't directly related to the buffer mgmt and support extracting MXP protocol handling from TBuffer.
- Extracted TEncodingTable, this class is now responsible for managing a map of computer-friendly encoding names as keys.
- Extract TLinkStore to manage links and associated hints to be displayed.

* Fixes code style formatting issues.

* Extracted MXP/HTML entity handling from TBuffer

- Extracted code responsible for detecting entities such as &gt; &lt; etc from the main loop in translateToPlainText to a specialized class that keeps track of identifying an entity and replacing it into the text stream

* Extracted MXP tag detection from TBuffer

- Extracted TMxpTagDetector class to process input and extract MXP tags for later processing

* Extracted processing of MXP tags from TBuffer

- Extracted TMxpTagProcessor class, responsible for parsing MXP tags and executing the expected behavior

* Extracted MXP protocol handling from TBuffer

- TMxpProcessor keeps the state of MXP and delegates tag detection and processing to TMxpTagDetector and TMxpTagProcessor

* Introduces TEntityResolver for mapping char entities

- Includes a table of 64 standard codes (uses a hash table instead of previous if/else comparisons)
- Supports unicode codes in decimal and hexdecimal such as &#32; and &#x20;
- Supports registering custom entities (useful for implementing MXP spec)
- Includes tests for the resolver and the MXP entity handler

* Complete refactoring/reimplementation of MXP support

- Extracts MXP support from TBuffer
- Support for parsing MXP tags, replacing direct string manipulation by accessing MxpTag attributes
- Class hierarchy of MxpTagHandler for implementing support for each tag. Allow for a clear separation of the implementation of each tag and makes it easier for adding new support
- Defines a clear interface separating the MXP implementation from other parts of the code
- Solves some limitations of previous implementation such as supporting element definitions not only for <SEND> tags, and interpolating named and positional attributes
- Supports COLOR tag

* Wrong value passed as argument TMxpFormattingTagsHandler

* Removes a class deprecated by previous commits

- Removes the MxpTagDetector, previously used to identify opening/closing <> tag symbols
- Replaced in previous commits by TMxpNodeBuilder, which detects and parse tags into a MxpTag object

* Reimplements MXP events over the new MXP code base

- Events are added in a queue from TMxpMudlet (MXP client interface)
- TConsole sends the event to the TLuaInterpreter on printOnDisplay

* Fixes bad merge

* Includes missin header #include

* Reformat files using clang-style

* Fixes MxpFontTagHandler implementation of MxpTagHandler

* Updated mudlet.pro to include missing file

* Remove uneeded member in MxpProcessor

* Reuses the same EntityResolver instance

- Different classes were initializing their own instance of this class that is supposed to be shared

* Support for MXP !ENTITY tag

* Support for MXP VAR tag

* Finished merging with upstream branch

- Moved encoding tables to TEncodingTable

* Finished merging with upstream branch

- Moved encoding tables to TEncodingTable

* Sets up mudlet-tests project

- Uses CMake
- Can be loaded through Qt Creator for editing and running the tests

* Move MXP state objects from TBuffer to Host

- As TBuffer is copied at different places is was also creating duplicates of the MXP state
- Thsi change avoids these needless copies and eliminates the needs for using pointers and copy constructors in EntityHandler

* Add missing file reference to mudlet.pro

- Includes TMxpMudlet.cpp in Qt project file

* Fixes minor issues

- Mark member functions as override
- Missing checks
- Replaces QString constructor by QString::toLatin1() due to compilation issues

* Moves LinkStore attribute back to TBuffer

- It was moved to Host together with MXP state to Host, but it's better to belong here

* Update src/TMxpVarTagHandler.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpTagProcessor.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpSendTagHandler.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Code style, naming and #includes

* Update src/TMxpProcessor.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpNodeBuilder.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Update src/TMxpNodeBuilder.cpp

Co-Authored-By: Vadim Peretokin <vperetokin@gmail.com>

* Added pre/post guards to header include section

* Handles End Of Transmission char

- Im Achaea this char is used to signal the end of a "block" when MXP is turned on/off
- The solution handles this char as an end of line char such as \n, \r and EOF

* Updated headers' #include guard names

* Replaced mapColor member function by QColor constructor in TMxpMudlet

* Remove ununsed member function from TStringUtils

* Replaced some magic strings by QStringLiteral const attributes

Co-authored-by: Vadim Peretokin <vperetokin@gmail.com>

If @gcms can go through this and post up a cleaned up version then the person who does put it into the development code can use the that to replace the text that would be used by default!

I think the PR description gives a good overview of what's in it. Maybe we can use it as commit message? If you think it's still too long I can try to make it more succinct.

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented May 3, 2020

The description is fine. Agree on the mega PR, they are painful - splitting up into bitesize pieces makes them much more manageable. But it's OK for the first time :)

Mind resolving the merge conflict?

@vadi2 vadi2 merged commit 102491a into Mudlet:development May 3, 2020
@vadi2
Copy link
Copy Markdown
Member

vadi2 commented May 3, 2020

It's in - brilliant, thanks a lot :)

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented May 4, 2020

Coverity found a few minor issues, mind fixing them?

New defect(s) Reported-by: Coverity Scan
Showing 3 of 3 defect(s)


** CID 1492834:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpNodeBuilder.h: 87 in TMxpNodeBuilder::TMxpNodeBuilder(bool)()


________________________________________________________________________________________________________
*** CID 1492834:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpNodeBuilder.h: 87 in TMxpNodeBuilder::TMxpNodeBuilder(bool)()
81         , mIsInsideSequence(false)
82         , mHasSequence(false)
83         , mIsQuotedSequence(false)
84         , mIsText(false)
85         , mIsInsideText(false)
86         {
>>>     CID 1492834:  Uninitialized members  (UNINIT_CTOR)
>>>     Non-static class member "mSequenceHasSpaces" is not initialized in this constructor nor in any functions that it calls.
87         }
88     
89         // returns true when a node (text/tag start/tag end) is available
90         // the same char has to be input again when a match is found as it may be a boundary
91         // and the class keeps no buffer of it
92         bool accept(char ch);

** CID 1492833:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpMudlet.cpp: 26 in TMxpMudlet::TMxpMudlet(Host *)()


________________________________________________________________________________________________________
*** CID 1492833:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpMudlet.cpp: 26 in TMxpMudlet::TMxpMudlet(Host *)()
20     #include "TMxpMudlet.h"
21     #include "Host.h"
22     #include "TConsole.h"
23     #include "TLinkStore.h"
24     
25     
>>>     CID 1492833:  Uninitialized members  (UNINIT_CTOR)
>>>     Non-static class member "isUnderline" is not initialized in this constructor nor in any functions that it calls.
26     TMxpMudlet::TMxpMudlet(Host* pHost) : mpHost(pHost), mLinkMode(false) {}
27     
28     QString TMxpMudlet::getVersion()
29     {
30         return scmVersion;
31     }

** CID 1492832:  Code maintainability issues  (VIRTUAL_DTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpTagHandler.h: 33 in ()


________________________________________________________________________________________________________
*** CID 1492832:  Code maintainability issues  (VIRTUAL_DTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpTagHandler.h: 33 in ()
27     #include <QString>
28     #include "post_guard.h"
29     
30     class TMxpClient;
31     class TMxpContext;
32     
>>>     CID 1492832:  Code maintainability issues  (VIRTUAL_DTOR)
>>>     Class "TMxpTagHandler" does not have a virtual destructor.
33     class TMxpTagHandler
34     {
35     public:
36         virtual TMxpTagHandlerResult handleTag(TMxpContext& ctx, TMxpClient& client, MxpTag* tag);
37     
38         virtual void handleContent(char ch) {}

@gcms
Copy link
Copy Markdown
Contributor Author

gcms commented May 4, 2020

Coverity found a few minor issues, mind fixing them?

New defect(s) Reported-by: Coverity Scan
Showing 3 of 3 defect(s)


** CID 1492834:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpNodeBuilder.h: 87 in TMxpNodeBuilder::TMxpNodeBuilder(bool)()


________________________________________________________________________________________________________
*** CID 1492834:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpNodeBuilder.h: 87 in TMxpNodeBuilder::TMxpNodeBuilder(bool)()
81         , mIsInsideSequence(false)
82         , mHasSequence(false)
83         , mIsQuotedSequence(false)
84         , mIsText(false)
85         , mIsInsideText(false)
86         {
>>>     CID 1492834:  Uninitialized members  (UNINIT_CTOR)
>>>     Non-static class member "mSequenceHasSpaces" is not initialized in this constructor nor in any functions that it calls.
87         }
88     
89         // returns true when a node (text/tag start/tag end) is available
90         // the same char has to be input again when a match is found as it may be a boundary
91         // and the class keeps no buffer of it
92         bool accept(char ch);

** CID 1492833:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpMudlet.cpp: 26 in TMxpMudlet::TMxpMudlet(Host *)()


________________________________________________________________________________________________________
*** CID 1492833:  Uninitialized members  (UNINIT_CTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpMudlet.cpp: 26 in TMxpMudlet::TMxpMudlet(Host *)()
20     #include "TMxpMudlet.h"
21     #include "Host.h"
22     #include "TConsole.h"
23     #include "TLinkStore.h"
24     
25     
>>>     CID 1492833:  Uninitialized members  (UNINIT_CTOR)
>>>     Non-static class member "isUnderline" is not initialized in this constructor nor in any functions that it calls.
26     TMxpMudlet::TMxpMudlet(Host* pHost) : mpHost(pHost), mLinkMode(false) {}
27     
28     QString TMxpMudlet::getVersion()
29     {
30         return scmVersion;
31     }

** CID 1492832:  Code maintainability issues  (VIRTUAL_DTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpTagHandler.h: 33 in ()


________________________________________________________________________________________________________
*** CID 1492832:  Code maintainability issues  (VIRTUAL_DTOR)
/home/travis/build/Mudlet/Mudlet/src/TMxpTagHandler.h: 33 in ()
27     #include <QString>
28     #include "post_guard.h"
29     
30     class TMxpClient;
31     class TMxpContext;
32     
>>>     CID 1492832:  Code maintainability issues  (VIRTUAL_DTOR)
>>>     Class "TMxpTagHandler" does not have a virtual destructor.
33     class TMxpTagHandler
34     {
35     public:
36         virtual TMxpTagHandlerResult handleTag(TMxpContext& ctx, TMxpClient& client, MxpTag* tag);
37     
38         virtual void handleContent(char ch) {}

I pushed it to this branch. Do I need to create a separate PR?

@vadi2
Copy link
Copy Markdown
Member

vadi2 commented May 4, 2020

Hm, try. If it doesn't work, try braching off development.

@gcms gcms mentioned this pull request May 4, 2020
Chris7 pushed a commit to Chris7/Mudlet that referenced this pull request Jan 2, 2022
    Refactoring/reimplementation of MXP protocol
        Extracts MXP support from TBuffer
        Separates input parsing from protocol processing: input is parsed into MxpTag objects by the MxpNodeBuilder and are passed to be handled by the MxpTagProcessor
        Class hierarchy of MxpTagHandler for implementing support for each tag: allows for a clear separation of the implementation of each tag and makes it easier for adding support for new tags
        Defines a clear interface (MxpClient) separating the MXP implementation from other parts of the code
        Solves some limitations of previous implementation such as supporting element definitions not only for<SEND> tags, interpolating named and positional attributes and interpolating &text; placeholder by tag content
        Includes support for COLOR tag
        Handles FONT, U, I, B, VAR and !ENTITY tags, but so far no defined behavior (can be implemented in MxpMudlet)
        Attributes that hold MXP state are now in the Host class to avoid needless copies when TBuffer is copied around
        Introduces QtTests for some of the classes
    Other refactorings intended to simplify TBuffer class, extracting responbilities that aren't directly related to the buffer mgmt
        Extracted TEncodingTable from TBuffer, this class is responsible for mapping encoding names to tables. It also removes some bloat from TBuffer.
        TLinkStore to manage links and associated hints to be displayed.
        TEntityResolver to map character entities, such as &gt; &Mudlet#32; &#x20; to their associated string values; supports interpolating strings replacing the entity placeholders by their values
Kebap added a commit that referenced this pull request Oct 6, 2025
#### Brief overview of PR changes/additions
Internal refactoring, no functional changes.
- use `.arg()`
- ~~use `QStringBuilder` with % operator~~ didn't work as expected with
multiple code lines

#### Motivation for adding to Mudlet
Fix #8306 
Enhance #3625 by @gcms 

#### Other info (issues closed, discussion etc)
Increasing readability, shorting code ~50%.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants