Conversation
|
SMAZ is most efficient with English; you might want to look at Unishox2, as it also handles UTF-8 better. I made a hardware implementation of it for MeshCore as a proof of concept. It can be accessed here: |
In the method I propose, the changes made to the text do not distort the information being conveyed in any way (a client without cyr2lat support will still be able to read the message that you intended to send). Cyr2lat is also applicable to both private messages and channels (which, incidentally, in my experience, are used more frequently). In other words, the final message will be just as readable as before conversion, which is extremely useful for the official Meshcore client and any third-party client. In fact, this 18–25% reduction in size is a free optimisation that complements classic compression algorithms well, as it gives the compressor less data to work on. Some measurements/statistics/demonstrations:
|
|
I mean, why not? I just have a small comment to make: change

    if( smaz_enabled ) return smaz_if_smaller(text);
    else if( cyr2lat_enabled ) return cyr2lat(text);
    else return text;

to:

    if( replace_enabled ) text = replace(text);
    if( smaz_enabled ) text = smaz_if_smaller(text);
    //else if( unishox2_enabled ) text = unishox2(text);
    return text;

Best Regards
Eric |
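For anyone following along, here is a rough Python sketch of the chained order suggested in the comment above: substitution first, then an optional "compress only if smaller" step. It is only an illustration under assumptions. The names `replace`, `compress_if_smaller` and `encode_outgoing` and the five-entry table are hypothetical, and `zlib` stands in for SMAZ purely so the example runs with the standard library.

```python
import zlib
from typing import Callable

# Hypothetical, deliberately tiny substitution table (Cyrillic look-alikes -> Latin).
# The PR's actual dictionary is not reproduced here.
REPLACEMENTS = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
})


def replace(text: str) -> str:
    """Swap 2-byte Cyrillic look-alikes for 1-byte Latin letters."""
    return text.translate(REPLACEMENTS)


def compress_if_smaller(
    data: bytes, compress: Callable[[bytes], bytes] = zlib.compress
) -> bytes:
    """Return the compressed form only when it is actually shorter.

    zlib is a stand-in for SMAZ (assumption). A real protocol would also
    need a marker so the receiver knows whether the payload is compressed.
    """
    packed = compress(data)
    return packed if len(packed) < len(data) else data


def encode_outgoing(text: str, replace_enabled: bool, compress_enabled: bool) -> bytes:
    """Apply the two steps independently instead of as mutually exclusive branches."""
    if replace_enabled:
        text = replace(text)
    payload = text.encode("utf-8")
    if compress_enabled:
        payload = compress_if_smaller(payload)
    return payload
```

With this ordering the two options no longer exclude each other: a short message falls through the compression step unchanged, while a long, repetitive one benefits from both.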
|
My view is that, since SMAZ is a dictionary-based compression method primarily designed for whole English words, replacing some characters with others will simply render it unusable for the resulting string. However, traditional compression methods (if we ever get them) will work. I.e.: |
|
Hi, I'm currently reviewing this |
|
I've come to the conclusion that the best course of action is to build a SMAZ dictionary for each language. I'm currently working on that. I tested the SMAZ method against your method, and SMAZ is 50% more efficient in transfer volume. |
Hi! There’s one issue with smaz compression: clients that don’t support it (such as the official Meshcore) won’t be able to decode and view the original message. The proposed method of replacing characters, however, is backward-compatible, so the resulting message with the replaced characters remains readable. |
This is acceptable, as it's the way things currently are. Also, official clients soon won't even see SMAZ text on their chat screen anymore: meshcore-dev/MeshCore#2392. We will be switching to sending these as binary blobs. |
|
Thank you for your comment! I'm afraid that in this case there will be a divide between participants in public channels, as many users use the official / terminal / custom apps for meshcore, and meshcore_open users will be writing "into the void" if they want to optimise their outgoing traffic. I think smaz is undoubtedly more efficient, but it's only applicable either if a local Meshcore community in a particular city is just getting started, or for private channels, or for private messages. In cities where the community is already established, however, it will be very difficult to optimise the network by promoting smaz – for example, in my city, most people use the official Meshcore client without support for any kind of optimisation, and it is much easier to offer them a client that allows them to stay in touch with everyone and enjoy the benefits of shorter outgoing packets. The compression method I am proposing is an attempt at optimisation within the context of an established community. meshcore_open users will be able to send optimised messages to public channels without separating themselves from the rest of the chat participants. Perhaps there is some chance of implementing this functionality? Update: Synchronised changes with the dev branch 🙏 |
My next goal is to add the word SMAZ to the top of the chat screen. When you click it, SMAZ is enabled and a message appears saying that only clients using meshcore open, or apps that have implemented SMAZ, will be able to see these messages. It will be shown filled or unfilled to indicate whether it's on or off. I suppose that instead of the word SMAZ it could use an icon like this or something; then, when you click it, a dialog appears with toggles for both, so you can enable SMAZ or enable this, and it could explain both. Regular clients not being able to read SMAZ has been the case since the app's inception. The only new thing is that regular users wouldn't see the base64 anymore. Another cool thing is that using binary will make the SMAZ messages even smaller. What do you think about what I proposed? |
|
I think SMAZ is a very good way to compress messages, but it’s difficult to apply to public channels with a mixed audience (meshcore_open users on one side and users of all other applications on the other). This means that public channels will remain without any optimisation until:
As an example of a fork of your app that tried to optimise sent text by changing the encoding, I can cite «Repachat». This is an iOS client in which, when the character limit was exceeded, the encoding switched from UTF-8 to UCF, where all characters are 1 byte in size. This fork did not catch on in our city community: users of the standard apps saw unreadable messages, which caused discomfort for both readers and, ultimately, the sender, and they preferred to use the classic official version. From this, we can conclude that people want to optimise their messages (if only to fit more information within the character limit), but at the same time they want other channel members to be able to read these messages as well. In fact, the proposed method is intended for channels with a mixed audience, where one does not want to lose the connection between its segments. For private channels and DMs, SMAZ is recommended. Personally, I use meshcore_open because it offers a more informative view of packet flow, clearer statistics, and, in some respects, more convenient control. And I wouldn’t want to lose contact with the rest of the network, just as I’d like to be able to optimise my own network load. |
Perhaps this will work better than SMAZ and Unishox2, but it will most likely need to be handled by the app |
|
I think we can merge this for now and re-evaluate in the future. |


Hello! This PR adds an alternative compression mode, which is useful when working with, for example, Cyrillic characters.
The idea is that Cyrillic characters are replaced with similar-looking Latin ones, and compression is achieved through the difference in character size: in UTF-8, Cyrillic characters take up 2 bytes, whilst Latin characters take up 1 byte. In practice, with the default dictionary, compression of 18–25% is achieved without visual differences – the freed-up space can be used either to optimise packet size or to fit a greater amount of information within the limit.
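To make the byte arithmetic concrete, here is a small Python sketch of the substitution idea. It is only an illustration: the table below is a hypothetical set of look-alike pairs, not the PR's actual default dictionary, and the helper names `cyr2lat`, `utf8_len` and `savings` are made up for this example.

```python
# Hypothetical look-alike table (Cyrillic code point -> Latin letter).
# Unicode escapes are used so the Cyrillic and Latin glyphs cannot be confused.
CYR2LAT = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u0443": "y", "\u0445": "x",
    "\u0410": "A", "\u0412": "B", "\u0415": "E", "\u041a": "K",
    "\u041c": "M", "\u041d": "H", "\u041e": "O", "\u0420": "P",
    "\u0421": "C", "\u0422": "T", "\u0425": "X",
})


def cyr2lat(text: str) -> str:
    """Replace Cyrillic letters that have a visually similar Latin twin."""
    return text.translate(CYR2LAT)


def utf8_len(text: str) -> int:
    """Size of the text on the wire, in bytes."""
    return len(text.encode("utf-8"))


def savings(text: str) -> tuple[int, int, float]:
    """Return (bytes before, bytes after, fraction saved)."""
    before = utf8_len(text)
    after = utf8_len(cyr2lat(text))
    return before, after, (before - after) / before if before else 0.0


if __name__ == "__main__":
    sample = "Привет, как дела?"
    before, after, ratio = savings(sample)
    print(f"{before} -> {after} bytes ({ratio:.0%} saved)")
```

Each replaced letter saves exactly one byte, so the achievable ratio depends on how many letters in a given message have a Latin look-alike in the table.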
Also, this compression mode is fully backward-compatible with regular Meshcore users on any application; in fact, no information is distorted.
Editing the substitution dictionary is supported: for example, you can extend support to other languages, or switch the text to full transliteration mode (when Cyrillic is transliterated entirely to Latin, compression increases up to 40%).
Cyr2lat compression is implemented for channels and contacts. It can be enabled for each channel/contact in its settings, but when enabled, it disables SMAZ.
PR is reopened; dart format . and flutter analyze both pass, and the branch is based on the latest dev.
Screenshots