Skip to content

Generic Message Compression#3490

Closed
cchengleo wants to merge 12 commits intoceph:masterfrom
cchengleo:wip-msg-compression
Closed

Generic Message Compression#3490
cchengleo wants to merge 12 commits intoceph:masterfrom
cchengleo:wip-msg-compression

Conversation

@cchengleo
Copy link
Contributor

This series commits implement a prototype for a generic message compression mechanism based on #3303 and commons. However, the interface to make use of this is still not clear.

@ghost ghost added core feature labels Jan 26, 2015
@mattbenjamin
Copy link
Contributor

Is this also something it could make sense to enable/disable for a connection or flow, rather than message by message?

@cchengleo
Copy link
Contributor Author

@mattbenjamin I agree with you. I think a per connection compression flag can save bandwidth a lot when communicating with remote OSD for site backup or something else.

@yuyuyu101
Copy link
Member

I'm not sure, maybe some messages which can't get great compress effect doesn't need to send with compress enabled. I think whether we compress a message isn't decided by connection, and should be considered by message itself. For this case, I prefer to add flag to message footer flag.

@mattbenjamin
Copy link
Contributor

That makes sense, and I agree. I'm trying to think of ways to pull conditions out of the decode path...

@liewegas
Copy link
Member

Is this also something it could make sense to enable/disable for a connection or flow, rather than message by message?

I can imagine that some messages are more compressible than others...

@markhpc
Copy link
Member

markhpc commented Jan 26, 2015

I was kind of thinking the other way, if we have data compressed at a lower layer, does it make sense to think about how to take it and pass it around without a decompression and re-compression step...

@cchengleo
Copy link
Contributor Author

@markhpc Could you please share more information on your question? What aspects do you mean for "take it and pass it around without a decompression and re-compression step..."?

@cchengleo
Copy link
Contributor Author

@yuyuyu101 I have already make use of footer.flags to tag compressed messages. We can let upper levels to make the decision and set footer.flags earlier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better way to approach this is a flag in the header, CEPH_MSG_HEADER_COMPRESSDATA and COMPRESSFRONT and COMPRESSMIDDLE (or whatever the naming scheme is). And make the first 4 bytes of each of those payloads be the decompressed length. This way we don't change any of the fundamental msg structures.

@liewegas
Copy link
Member

in the final version you'll need to drop teh first patch (or squash it in to the next one) so that we don't add the recovery-specific code and then remove it in the next patch

@liewegas
Copy link
Member

I think a sensible approach would be to add Message methods like

virtual should_compress() { return false; }
virtual should_not_compress() { return false; }

That way we can flag any messages that are good or bad candidates. Then we can take a policy that only compressings things that are explicitly flagged as good candidates, or compress everything that is not flagged as a bad candidate?

We'll also need to add a feature bit for compatibility, but we can do that last...

@liewegas liewegas closed this Jan 26, 2015
@liewegas liewegas reopened this Jan 26, 2015
@liewegas liewegas self-assigned this Jan 26, 2015
@cchengleo
Copy link
Contributor Author

@liewegas in the final version you'll need to drop teh first patch (or squash it in to the next one) so that we don't add the recovery-specific code and then remove it in the next patch

I have removed recovery-specific code originally coding by @XinzeChi in cchengleo@81d64c5

@cchengleo cchengleo force-pushed the wip-msg-compression branch 2 times, most recently from 1d1143f to a1677ba Compare January 27, 2015 05:55
@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-centos7 for 23ec502 is http://paste2.org/_p5OeZhJH

:octocat: Sent from GH.

@cchengleo cchengleo force-pushed the wip-msg-compression branch from 07497e8 to 5c250c9 Compare January 27, 2015 17:41
@cchengleo
Copy link
Contributor Author

@liewegas

I think a sensible approach would be to add Message methods like

virtual should_compress() { return false; }
virtual should_not_compress() { return false; }

That way we can flag any messages that are good or bad candidates. Then we can take a policy that only compressings things that are explicitly flagged as good candidates, or compress everything that is not flagged as a bad candidate?

We'll also need to add a feature bit for compatibility, but we can do that last...

Could you please elaborate this a little bit? Are higher level logics use these API to set flag for compression?

@liewegas
Copy link
Member

It looks to me like you have the feature bit handled correctly for compatibility.

My main concern is that having a Message::compress() and decompress() method adds another dimension of state to the Message struct, which means that other code may potentially need to cope or understand with a compressed Message (or even just that we need to meaningfully represent a compressed Message in memory). Maybe we want to go down that road.... I'm not sure.

But I think a simpler approach would be what I describe in an earlier comment:

  1. the msg header struct has the original crc and length values
  2. the msg header has bits set if the over-the-wire payload is compressed
  3. the msg/simple/Pipe.cc read_message() and write_message() methods have an alternate path that compresses/decompresses the payload right then in private buffers in the local scope.
  4. the over-the-write compressed buffer (whether it is front, middle, or data) has the compressed crc, length, and then compressed payload.

Perhaps in the future if we want to represent the compressed message as a valid Message we can tackle that.. but if we avoid it entirely for now I think it's safer and simpler!

@liewegas
Copy link
Member

An open question for me is whether we leave the header flags for compression set on the Message that we received. This tells others that the data was once compressed on the wire, which might conceivably be interesting, but is someone confusing since it the buffers are no longer compressed. Similarly, if we are sending a message that has those flags set, does that mean we are requesting or forcing compression? We'll need to make sure they are cleared before sending over the wire if, for example, the other end doesn't support compression.

@cchengleo cchengleo closed this Jan 27, 2015
@gregsfortytwo
Copy link
Member

I don't think we should leak the compressed state outside of the messenger, if that's what handles encode/decode.

A more integrated compression system (that, eg, keeps the data compressed until it's passed into the ObjectStore, or even writes it to disk and transparently handles decompressing on update)) might be interesting but a lot more work that I'm not sure is worth it (the reason to do it would be eg memory bandwidth on a system that needs to copy data between threads or something). Outside of a scenario where the external system can actually change its behavior in meaningful ways based on compression state it should be invisible.

@cchengleo cchengleo reopened this Jan 27, 2015
@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-centos7 for 60895ad3888faafbde6372255bc14d55637dc751 is http://paste2.org/_zcH0LB9W

:octocat: Sent from GH.

@cchengleo
Copy link
Contributor Author

@liewegas
On read_message, front/middle/data_len are used to get read front/middle/data correctly over-the-wire. These fields represents the real length of the following payloads. It also applies to front/middle/data_crc. That's why I didn't emit any original metadata in the header field.

@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-centos7 for d8789c0 is http://paste2.org/_fy0f2saI

:octocat: Sent from GH.

@cchengleo cchengleo force-pushed the wip-msg-compression branch from 3c84960 to c2f31d2 Compare January 31, 2015 21:48
@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-centos7 for d1dda41 is http://paste2.org/_9Mpz1h1L

:octocat: Sent from GH.

Mykola Golub and others added 12 commits February 17, 2015 13:50
It is set when crc for footer headers is calculated. Check crc for
footer headers only if this flag is set.

Note, this introduces some undesirable effect if older mons are in the
cluster: as they do not set NOHEADERCRC flag, crc for footer headers
from these mons will not be checked by new mons.

Also rename CEPH_MSG_FOOTER_NOCRC to CEPH_MSG_FOOTER_NODATACRC, which
is more correct.

Signed-off-by: Mykola Golub <mgolub@mirantis.com>
Use CEPH_MSG_FOOTER_NOHEADERCRC/NODATACRC flags instead of config options.

Signed-off-by: Mykola Golub <mgolub@mirantis.com>
ceph maybe run out of network bandwith in recovery process,
compress data if needed.

Signed-off-by: Xinze Chi <xmdxcxz@gmail.com>
Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
…ture

"CEPH_FEATURE_MSG_COMPRESS" and set conditions to perform message
compression.

The message compression can be invoked by either 1) explicitly setting
ms_compress_all option; or 2) determined by upper level when generating message.
However, the above conditions cannot guarantee the message pending for
transmitting will be compressed, it still depends on whether the communicating
peer support CEPH_FEATURE_MSG_COMPRESS.

Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
A flag in message header can indicate which parts of messages (payload, middle
or data) need to be compressed. The first 4 bytes of each of these payloads are
the original uncompressed length, and the next 4 bytes are the original CRC.
This implementation minimize the need to change fundamental msg structures.

Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
Clear header.flags if the message to be sent is not compressed at all.
Populate original front/middle/data to header to avoid the swicherroo with the
throttler and clear the header flags after compression.

Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
In read path, perform decompression before constructing a new message.
In write path, perform compression right before sending message to wire.
The message is intact in memory and upper levels should be agnostic about
compression/decompression.

TODO: have not tested in async Messenger.

Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
…o control

compression for upper level.

Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
…y for

RPM.

Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
Signed-off-by: Cheng Cheng <ccheng.leo@gmail.com>
@loic-bot
Copy link

FAIL: the output of run-make-check.sh on centos-centos7 for 18d27db is http://paste2.org/_cVmmcvAm

:octocat: Sent from GH.

@majianpeng
Copy link
Member

I wanted to compress the journal for large write(from my test, (My cluster setup is five osd w/ on ssd as journal. w/o ec & replica size = 1, all data-zero the performance increase 44%. For random-data the performance increate 37%). But at present, there is no compress func/class.
@cchengleo Can you firstly make the first patch merge into the master. After that, i can don my journal compress/decompress. Thanks!

@cchengleo
Copy link
Contributor Author

@majianpeng Okay I will put this piece in my TODO list. And make run-make-check pass. From the design aspect, I think the current one is okay to perform necessary functionality.

@ghost
Copy link

ghost commented Apr 29, 2015

@cchengleo feel free to re-open when ready.

@ghost ghost closed this Apr 29, 2015
@yuyuyu101
Copy link
Member

I heard someone could benefit network message compress for recovery and some use case like vdi. So I would like to push this again. @cchengleo Would you have a new impl or reopen it again to make complete?

@cchengleo
Copy link
Contributor Author

I can follow up this PR. Let me take a look for current network impl and make changes to accommodate. I also would like to make it complete.

Sincerely,
Cheng

On May 24, 2015, at 12:36 PM, Haomai Wang notifications@github.com wrote:

I heard someone could benefit network message compress for recovery and some use case like vdi. So I would like to push this again. @cchengleo Would you have a new impl or reopen it again to make complete?


Reply to this email directly or view it on GitHub.

@yuyuyu101
Copy link
Member

@cchengleo I hope we can get some progress for this :-) I would like to help to complete this PR if you have any problem.

@yuyuyu101
Copy link
Member

Actually we may like to do compress even in messenger level async. Online compress within messenger thread may affect network bandwidth hugely

@cchengleo
Copy link
Contributor Author

@yuyuyu101 Thanks for the help! I am recently get busy with other stuffs. Please go ahead and implement what you proposed. I will try to help also.

I am not quite get why online compressing affect network bandwidth hugely. Can you further elaborate?

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants