Handle basic_string in datastream serializing/deserializing#118
Conversation
template<typename Stream, typename T>
datastream<Stream>& operator << ( datastream<Stream>& ds, const std::basic_string<T>& s ) {
   ds << unsigned_int( s.size() );
   for( const auto& i : s ) {
Wouldn't datastream.write and datastream.read below be better?
I checked in terms of performance and (if I'm reading the assembler code correctly) there should be no difference between operator<< with the above for loop and ds.write: both end in a single call to memcpy (when compiled with the -O2 flag).
From the reader's perspective, ds.write/read is a bit more concise, so I have changed the code.
basic_string operator<< with for loop: (assembly screenshot attached in the original comment)
basic_string operator<< with ds.write: (assembly screenshot attached in the original comment)
Maybe change it back then. Otherwise the write() needs sizeof I believe.
operator>> and ds.read need sizeof(T), but sizeof is a compile-time construct (there is no runtime cost), so I wouldn't worry about it.
I think it is only a matter of which version is easier for a developer to read.
// std::basic_string
ds.seekp(0);
fill(begin(datastream_buffer), end(datastream_buffer), 0);
static const std::basic_string<uint8_t> inputBasicString {0, 1, 2, 3, 4, 5};
please add a test with a 2+ byte type, something like wchar_t
That was a really, really good idea.
It turned out that the second solution, with datastream.read/write, was wrong. Probably because casting to char* was narrowing 2 bytes to 1 byte?
I reverted to the original solution and it looks good.
Thanks!
I think your issue was that you had * sizeof(T) in read but not in write. I can't think of another reason, but if that doesn't work, try static_cast instead of a C-style cast. We have std::string using read/write, and I propose keeping the code consistent. I see that -O2 is supposed to make no difference here, but this is performance-critical code either way, so I think we should use read/write to make even non-optimized code faster.
My code is copied from the vector serialization, so I believe it is consistent (basic_string is basically a container).
What is the purpose of making non-optimized code faster? There would only be a difference if someone deliberately built the code with debug flags. If it really is important, then we need to create issues for improving vector and array serialization as well.
I think your issue was that you had * sizeof(T) in read but not in write.
Thanks, that was the issue.
I additionally changed char* to void* in the read method, so I could avoid an unnecessary reinterpret_cast<char*> from wchar_t* in my code (which shouldn't be needed anyway, because memcpy takes void*).
Short answer: I don't know for sure. My comment was based on the fact that cdt is not the most recently updated compiler; it is our own patched compiler, and its main purpose is to compile to WebAssembly. So I think it is easier to just make the code optimal than to check this behavior in wasm.
Regarding having an optimal debug build: that may help avoid timeouts while we debug. You may recall errors in DUNE due to performance. Imagine you have a slow machine and debug builds; it will be even slower, so you would need additional configuration to allow big timeouts.
…ly wide characters. Add unit tests for 2 byte wchar_t.
Change Description
Resolves #103
Fixes a crash in cdt-cpp when using std::basic_string<> in an action wrapper
API Changes
Documentation Additions