Conversation
|
In case it helps, this patch 05b203e should provide the missing |
|
this is great. I guess we need to rebase and wait for the x509 release. (plus then adapt lower bounds) |
|
Nice rebase! The eio patch and a fix for the latest X509 breaking change (renaming of Authenticator functions) can be found at https://github.com/art-w/ocaml-tls/commits/no-cstruct/ |
|
@dinosaure would you mind to pick @art-w commits here? |
|
|
||
| let split_str ?(start = 0) str off = | ||
| String.sub str start off, | ||
| String.sub str (start + off) (String.length str - off - start) |
There was a problem hiding this comment.
it's likely I introduced this function, but from our learnings in MirageVPN, I guess we should re-visit the allocation and ownership stuff in this PR. The Strong.sub is super-expensive, and we should try to allocate the least possible.
This doesn't need to be in this PR, but before a release. We should as well benchmark the amount of handshakes and throughput before and after this change (with varying block sizes in send()). Not sure whether we have a suitable executable for this right now. Best to compare to OpenSSL on the same machine.
| buf | ||
| let buf = Bytes.create 8 in | ||
| Bytes.set_int64_be buf 0 seq ; | ||
| Bytes.unsafe_to_string buf |
There was a problem hiding this comment.
allocation-wise likely a bad idea
There was a problem hiding this comment.
What do you mean? We should use Bytes.to_string?
There was a problem hiding this comment.
no, the entire function and allocation discipline should be different. better to allocate a bytes once, and then filling it & passing offsets along. this requires, as mentioned, a pretty deep rewriting -- which brings the suggestion to first work on benchmarks (for the latest release - e2e, maybe similar to what @reynir worked on for MirageVPN -- but would be great if cstruct allocation could be part of the benchmark), to be able to compare with the changes in here and future changes.
There was a problem hiding this comment.
better to allocate a bytes once
Not sure to understand again, the content of this buffer actually depends on the given argument (seq). We probably can pre-allocate a set of sequence_buf and use them then but not sure this is what you want. If I think in this way, it's related to the multicore usage than can be done with ocaml-tls. In such context, "allocate a bytes once" (globally? per-domain?) can be risky.
There was a problem hiding this comment.
oh, sorry for the misunderstanding. my intention is: we get some data to be sent out. now, instead of allocating a header for crypto, a header for tls, etc. - let's allocate once a buffer where the encrypted payload together with the protocol header has space. thus we won't need any other allocations, neither a concat or append. this is all local to a send_data operation, not global/per domain.
nothing for this pr, though.
| Bytes.unsafe_to_string buf | ||
| in | ||
| Uncommon.Cs.xor nonce s | ||
| Uncommon.xor nonce s |
There was a problem hiding this comment.
same, likely better without allocations, and xor_into
| Handshake_crypto.finished (state_version state) session.ciphersuite session.common_session_data.master_secret "server finished" log | ||
| in | ||
| let* () = guard (Cstruct.equal computed fin) (`Fatal `BadFinished) in | ||
| let* () = guard (String.equal computed fin) (`Fatal `BadFinished) in |
lib/handshake_client.ml
Outdated
| let piece = Cstruct.sub sh.server_random 24 8 in | ||
| let* () = guard (not (Cstruct.equal Packet.downgrade12 piece)) (`Fatal `Downgrade12) in | ||
| guard (not (Cstruct.equal Packet.downgrade11 piece)) (`Fatal `Downgrade11) | ||
| let piece = String.sub sh.server_random 24 8 in |
There was a problem hiding this comment.
allocates, maybe we have a function in eqaf or define our own to have off and len params for eq?
There was a problem hiding this comment.
I did a try about implementations of String.sub_equal (which does not guarantee the constant time as eqaf). My benchmark is available here: https://gist.github.com/dinosaure/7ef54e9431217435e3b81df890192b72. From my results (on AMD), the "naive" implementation (the Oracle one) is fast enough for the worst case. For readability, I will consider this implementation for ocaml-tls (into the lib/utils.ml file) than the other.
There was a problem hiding this comment.
Actually, and according to my benchmark, it seems that an equal_sub which uses String.sub or an equal_sub which tries to iterate into given strings have the same performance. My conclusion is: even if we do a String.sub here, try to avoid this String.sub and try to directly read bytes and compare them is not faster than the String.sub...
| let early_secret = Handshake_crypto13.(derive (empty cipher) psk) in | ||
| let hs_secret = Handshake_crypto13.derive early_secret shared in | ||
| let log = log <+> raw in | ||
| let log = log ^ raw in |
There was a problem hiding this comment.
if I remember, we should be able to use a hash state since we never need the entire log data. this is for sure a separate PR.
| let s_mac, rt1 = Core.split_str rt0 mac in | ||
| let c_key, rt2 = Core.split_str rt1 key in | ||
| let s_key, rt3 = Core.split_str rt2 key in | ||
| let c_iv , s_iv = Core.split_str rt3 iv |
There was a problem hiding this comment.
this is 10 string.sub. we can easily have fewer by using the offset and string.sub directly
|
I think this is already pretty big, and we should work towards merging it. From my point of view, what is important:
Then we can compare and merge this PR and prepare performance improvements in subsequent changesets (e.g. allocate less). We as well have these decrypt/encrypt_into functions in mirage-crypto that we should use. If someone would be eager to develop the benchmark, that'd speed up our release schedule:) |
|
I don't understand the tls-eio mdx failures... I see two options: remove it or someone who's into that stuff proposing a fix /cc @talex5 (who developed these bits) |
It doesn't seem to have much to do with tls-eio or mdx; just trying to load tls in utop no longer works: |
thanks for your investigation. @dinosaure is it known and is there a workaround for digestif not being loadable into ocaml toplevel (due to virtual library?)? |
You need to load an implementation of |
Co-authored-by: Hannes Mehnert <hannes@mehnert.org>
|
force-pushed on top of main (#500 was merged), and fixed speed.ml |
CHANGES: * API breaking change: remove usage of Cstruct.t inside of TLS, use bytes and string instead (mirleft/ocaml-tls#497 by @art-w, @hannesm, @dinosaure, @reynir) Performance is up to 3x improved (bandwidth), 2x improvement for handshake/s on an Intel Core(TM) i7-5600U CPU @ 2.60GHz * FEATURE: add tls-miou-unix package, which adds miou support for TLS (mirleft/ocaml-tls#494 mirleft/ocaml-tls#503 @dinosaure) * FEATURE: tls-lwt and tls-async: allow TLS over an existing connection `Tls_lwt.client_of_channels : Tls.Config.client -> ?host:[`host] Domain_name.t -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` and `Tls_lwt.server_of_channels : Tls.Config.server -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` (mirleft/ocaml-tls#499 @art-w @MisterDA) * API breaking changes: revise errors - reduce the polymorphic variant in size, align it with RFC specified errors, be in parts more precise about errors, in other parts skip data (mirleft/ocaml-tls#505, @hannesm - fixes mirleft/ocaml-tls#491) NB: if you relied on a specific error constructor, please open an issue * Remove unused constructors from Packet.{alert_type, compression_methods, client_certificate_type, extension_type} (mirleft/ocaml-tls#505, @hannesm) NB: if you relied on specific constructors, please open an issue * API breaking change: Tls.Config.{server,client} now return a result type instead of raising an exception (mirleft/ocaml-tls#502, @hannesm, fixes mirleft/ocaml-tls#411) * FEATURE: add bench/speed.exe, a benchmark for bandwidth (for different ciphersuites) and handshakes (different key exchanges and private keys) (mirleft/ocaml-tls#500 @hannesm @dinosaure @reynir) * BUGFIX: tests/feedback.exe update with TLS 1.3 semantics, run as test (mirleft/ocaml-tls#501, @hannesm - reported by @dinosaure)
CHANGES: * API breaking change: remove usage of Cstruct.t inside of TLS, use bytes and string instead (mirleft/ocaml-tls#497 by @art-w, @hannesm, @dinosaure, @reynir) Performance is up to 3x improved (bandwidth), 2x improvement for handshake/s on an Intel Core(TM) i7-5600U CPU @ 2.60GHz * FEATURE: add tls-miou-unix package, which adds miou support for TLS (mirleft/ocaml-tls#494 mirleft/ocaml-tls#503 @dinosaure) * FEATURE: tls-lwt and tls-async: allow TLS over an existing connection `Tls_lwt.client_of_channels : Tls.Config.client -> ?host:[`host] Domain_name.t -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` and `Tls_lwt.server_of_channels : Tls.Config.server -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` (mirleft/ocaml-tls#499 @art-w @MisterDA) * API breaking changes: revise errors - reduce the polymorphic variant in size, align it with RFC specified errors, be in parts more precise about errors, in other parts skip data (mirleft/ocaml-tls#505, @hannesm - fixes mirleft/ocaml-tls#491) NB: if you relied on a specific error constructor, please open an issue * Remove unused constructors from Packet.{alert_type, compression_methods, client_certificate_type, extension_type} (mirleft/ocaml-tls#505, @hannesm) NB: if you relied on specific constructors, please open an issue * API breaking change: Tls.Config.{server,client} now return a result type instead of raising an exception (mirleft/ocaml-tls#502, @hannesm, fixes mirleft/ocaml-tls#411) * FEATURE: add bench/speed.exe, a benchmark for bandwidth (for different ciphersuites) and handshakes (different key exchanges and private keys) (mirleft/ocaml-tls#500 @hannesm @dinosaure @reynir) * BUGFIX: tests/feedback.exe update with TLS 1.3 semantics, run as test (mirleft/ocaml-tls#501, @hannesm - reported by @dinosaure)
CHANGES: * API breaking change: remove usage of Cstruct.t inside of TLS, use bytes and string instead (mirleft/ocaml-tls#497 by @art-w, @hannesm, @dinosaure, @reynir) Performance is up to 3x improved (bandwidth), 2x improvement for handshake/s on an Intel Core(TM) i7-5600U CPU @ 2.60GHz * FEATURE: add tls-miou-unix package, which adds miou support for TLS (mirleft/ocaml-tls#494 mirleft/ocaml-tls#503 @dinosaure) * FEATURE: tls-lwt and tls-async: allow TLS over an existing connection `Tls_lwt.client_of_channels : Tls.Config.client -> ?host:[`host] Domain_name.t -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` and `Tls_lwt.server_of_channels : Tls.Config.server -> Lwt_io.input_channel * Lwt_io.output_channel -> t Lwt.t` (mirleft/ocaml-tls#499 @art-w @MisterDA) * API breaking changes: revise errors - reduce the polymorphic variant in size, align it with RFC specified errors, be in parts more precise about errors, in other parts skip data (mirleft/ocaml-tls#505, @hannesm - fixes mirleft/ocaml-tls#491) NB: if you relied on a specific error constructor, please open an issue * Remove unused constructors from Packet.{alert_type, compression_methods, client_certificate_type, extension_type} (mirleft/ocaml-tls#505, @hannesm) NB: if you relied on specific constructors, please open an issue * API breaking change: Tls.Config.{server,client} now return a result type instead of raising an exception (mirleft/ocaml-tls#502, @hannesm, fixes mirleft/ocaml-tls#411) * FEATURE: add bench/speed.exe, a benchmark for bandwidth (for different ciphersuites) and handshakes (different key exchanges and private keys) (mirleft/ocaml-tls#500 @hannesm @dinosaure @reynir) * BUGFIX: tests/feedback.exe update with TLS 1.3 semantics, run as test (mirleft/ocaml-tls#501, @hannesm - reported by @dinosaure)
#493 rebased to include #495 & #496 plus an update of
tls-async. The only remaining package about theno-cstructmove istls-eio.