Skip to content

dtls.c: Do not generate Decrypt Alert on unexpected Epoch#85

Closed
mrdeep1 wants to merge 1 commit intoeclipse-tinydtls:developfrom
mrdeep1:fixes_79
Closed

dtls.c: Do not generate Decrypt Alert on unexpected Epoch#85
mrdeep1 wants to merge 1 commit intoeclipse-tinydtls:developfrom
mrdeep1:fixes_79

Conversation

@mrdeep1
Copy link
Copy Markdown
Contributor

@mrdeep1 mrdeep1 commented Jun 7, 2021

As per RFC5246
decrypt_error
A handshake cryptographic operation failed, including being unable
to correctly verify a signature or validate a Finished message.
This message is always fatal.

Does not cover a change in Epoch when the previous Epoch's handshakes have
not completed. Silently ignore the unexpected Epoch.

RFC6347 does not add any clarity here, other than

Implementations MUST either
discard or buffer all application data packets for the new epoch
until they have received the Finished message for that epoch.

New Epoch buffering is not currently supported.

Signed-off-by: Jon Shallow supjps-libcoap@jpshallow.com

dtls.c Outdated
if ((uint16_t)(peer->handshake_params->hs_epoch + 1) == msg_epoch) {
dtls_info("Next epoch packet arrived before handshake complete\n");
/* Just validate the packet, reporting any issues */
decrypt_verify(peer, msg, rlen, &data);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this extra operation is a good idea. Why do you think it is necessary?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decrypt_verify() would otherwise not take place on finished in the new epoch following receipt of change cipher spec and throw up any errors as this packet is not buffered. I agree that subsequent decryption of application packets would fail if there is an error here, but this may not be correctly detected at this point.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decrypt_verify() essentially just removes the "other security parameter" slot that is used internally to store values from the last epoch (it is a noop if there are none or if values for the next epoch are available). I still wonder why it would make a difference whether this is done here or in the next call to decrypt_verify().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the dtls-fuzzer from Issue #79, if decrypt_verify() @4075 is removed, the following application data is still decrypted correctly.

My only concern with current fix is if this decrypt_verify() is removed, there never will be a validation of the out of order Finish packet.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 24, 2021

Wouldn't suppress the MAC error (decrypt_verify()) in epochs > 0 help?

@mrdeep1
Copy link
Copy Markdown
Contributor Author

mrdeep1 commented Jun 24, 2021

Wouldn't suppress the MAC error (decrypt_verify()) in epochs > 0 help?

Maybe, but I am operating at the edge of my limited DTLS knowledge here.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 25, 2021

Let me try to sum up my view:

The usage of the MAC is, in my opinion, bound the epochs after 0.
RFC6347, 4.1.2 therefore applies for epochs after the 0. If a record with epoch after 0 is received, and the MAC could not be verified (e.g. mismatching or the keys are not available, or the FINISH for that epoch hasn't been received), the records may be silently dropped.

If a FINISH passes that MAC test, but the contained verify_datadoesn't match the expected (e.g. caused by modifications of handshake messages), the a decrypt_error is sent back. Using the "extended master secret", that errors will happen less often and modification of handshake message will end up in MAC errors.

For epoch 0 itself, it's hard to say, what the intention is. For some, that period is that short, that an attack is considered to be very rare. Even assuming such an attack, it would still be very difficult (up to impossible) to distinguish the right messages from that attack ones. With that, ignoring or not, the handshakes will fail in my opinion ...

@mrdeep1
Copy link
Copy Markdown
Contributor Author

mrdeep1 commented Jun 25, 2021

Understood. There are 2 issues here - packets sent in right sequence but arrive in wrong order, and deliberate sending of packets in funny order with perhaps mismanagement of message seq etc.

verifying handshake order with epoch 0 along with buffering out of order epoch 0 will catch most things. The challenge is with the early arrival of epoch 1 packets while still finishing off the epoch 0 part of the handshake. Ideally these should be buffered as well, but need to be decrypted to do that based on the message seq as opposed to the packet seq which may include retries.

If insufficient of the epoch 0 handshake have arrived, then the premature epoch 1 cannot be decrypted - I guess it is game over at that point unless a different strategy of buffering epoch 1 is done.

bottomline is that the right answer is to fully buffer epoch 1 somehow. This PR fixes the invalid generate of decrypt alert and checks the simple case of a simple arrival reorder.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 25, 2021

        if (peer->handshake_params &&
            peer->handshake_params->hs_epoch != msg_epoch &&
            peer->state != DTLS_STATE_WAIT_FINISHED) {
          if ((uint16_t)(peer->handshake_params->hs_epoch + 1) == msg_epoch) {
            dtls_info("Next epoch packet arrived before handshake complete\n");
            /* Just validate the packet, reporting any issues */
            decrypt_verify(peer, msg, rlen, &data);
          }
          return -1;
        }

So the hs_epoch is mainly introduced to drop the records before expecting the FINISH?

My feeling is, it's not only about expecting, its about "having received and verified that FINISH".
(Unfortunately, I've started this week to redesign the configuration for Californium, so I'm a little too defocused for this topic :-) ).

@obgm
Copy link
Copy Markdown
Contributor

obgm commented Jun 25, 2021

Thanks @boaks for your very good summary. I think the main rationale here is indeed to drop packets that have not been expected yet.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 25, 2021

If data_length < 0 indicates a MAC error ("decrypt_verify", for me it does), then never send back an alert nor close the connection.

Turns:

      if (data_length < 0) {
        if (hs_attempt_with_existing_peer(msg, rlen, peer)) {
          data = msg + DTLS_RH_LENGTH;
          data_length = rlen - DTLS_RH_LENGTH;
          state = DTLS_STATE_WAIT_CLIENTHELLO;
          role = DTLS_SERVER;
        } else {
	  err =  dtls_alert_fatal_create(DTLS_ALERT_DECRYPT_ERROR);
          dtls_info("decrypt_verify() failed\n");
	  if (peer->state < DTLS_STATE_CONNECTED) {
	    dtls_alert_send_from_err(ctx, peer, &peer->session, err);
	    peer->state = DTLS_STATE_CLOSED;
	    dtls_stop_retransmission(ctx, peer);
	    dtls_destroy_peer(ctx, peer, 1);
	  }
          return err;
        }

into

      if (data_length < 0) {
          dtls_info("decrypt_verify() failure ignored\n");
          return 0;
      }

(Return 0 is required in order to keep libcoap from closing the connection.

  err = dtls_handle_message(dtls_context, dtls_session, data_rw, (int)data_len);

  if (err){
    coap_event_dtls = COAP_EVENT_DTLS_ERROR;
  }

)

The other topic, not to process other data, until the FINISH has not be verified yet, is something not addressed.
My feeling is, that there are more pitfall on processing the FINISH. E.g. on the server side receiving FINISH,

handle_handshake_msg:

dtls_handshake_free(peer->handshake_params)

seems to be wrong. If the server's flight 6 gets lost, the client will repeat flight 5, but, as far as I understand the code (without testing ;-) ) the server will not accept that flight 5 any longer.

So, if this PR should Fix "don't generate that alert", then in my opinion that should be done by

      if (data_length < 0) {
          dtls_info("decrypt_verify() failure ignored\n");
          return 0;
      }

@mrdeep1
Copy link
Copy Markdown
Contributor Author

mrdeep1 commented Jun 25, 2021

If data_length < 0 indicates a MAC error ("decrypt_verify", for me it does), then never send back an alert nor close the connection.

We still need to detect a new ClientHello using epoch 0, so i do not think it safe to reduce the code as you have done. If you apply your code reduction to the} else {part, it looks OK to me if you follow https://datatracker.ietf.org/doc/html/rfc6347#section-4.1.2.1 where the alert really should have been DTLS_ALERT_BAD_RECORD_MAC, not DTLS_ALERT_DECRYPT_ERROR if we elect to report it.

DTLS_ALERT_DECRYPT_ERROR should only be used as per RFC8446 (RFC5246 is similar):-

   decrypt_error:  A handshake (not record layer) cryptographic
      operation failed, including being unable to correctly verify a
      signature or validate a Finished message or a PSK binder.

and so should be generated in check_finished() as DTLS_ALERT_DECRYPT_ERROR, not DTLS_ALERT_HANDSHAKE_FAILURE. Hmm.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 25, 2021

We still need to detect a new ClientHello using epoch 0

Of course!
There is so much more to do for that! Therefore my PR #89 ! The stateless processing for CLIENT_HELLOs comes with more challenges.

But therefore:
This PR to prevent accidentally ALERTS on MAC violations. Such MAC violations may be caused by spoofing!
These violations are reported, by decrypt_verify.
So, I would prefer to do that simply, but complete:

  • No return -1, that will trigger a close in libcoap.
  • No dtls_alert_send_from_err.
  • No dtls_destroy_peer, the MAC violation may just be spoofed.
    (No DTLS_ALERT_BAD_RECORD_MAC nor DTLS_ALERT_DECRYPT_ERROR based on the decrypt_verify result).

Only, if the (record-)MAC of a FINISH is valid, then in "check_finished" report a mismatch in the "verify_data" as DTLS_ALERT_DECRYPT_ERROR (instead of DTLS_ALERT_HANDSHAKE_FAILURE). That's the "validate a Finished message" part.

And for other details of the handshake, lets do that in other PRs.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 25, 2021

decrypt_error: A handshake (not record layer) cryptographic
operation failed, including being unable to correctly verify a
signature or validate a Finished message or a PSK binder.

Just to mention, the record-MAC failure is the "record layer cryptographic operation failed".
Checking signatures or the FINISH, verify_data

struct {
          opaque verify_data[verify_data_length];
      } Finished;

      verify_data
         PRF(master_secret, finished_label, Hash(handshake_messages))
            [0..verify_data_length-1];

the "handshake cryptographic operation failed".

@mrdeep1
Copy link
Copy Markdown
Contributor Author

mrdeep1 commented Jun 25, 2021

Code pushed - needs the separate #91 to handle other Decrypt Alert cases.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 25, 2021

PR #91 and #92 may replace this PR.

Protection against decoding records before FINISH will be done in a additional PR. My feeling is, that there is more to fix for proper processing the FINISH.

@mrdeep1
Copy link
Copy Markdown
Contributor Author

mrdeep1 commented Jun 26, 2021

#91 and #92 are good to go. However, if ChangeCipherSpec and Finished arrive in the wrong order (as highlighted in #79) and the Finished is not dropped or buffered, then

  case DTLS_HT_FINISHED:
    /* expect a Finished message from server */

    if (state != DTLS_STATE_WAIT_FINISHED) {
      return dtls_alert_fatal_create(DTLS_ALERT_UNEXPECTED_MESSAGE);

kicks in. So, we do need to drop the Finished (this PR - hence the need in this PR to save the handshake epoch as well) or buffer the Finished - without doing this Finish drop we get (#91 and #92 added to develop)

Jun 26 11:32:08 DEBG received handshake packet of type: finished (20)
Jun 26 11:32:08 DEBG handle handshake packet of type: finished (20)
Jun 26 11:32:08 WARN error while handling handshake packet
Jun 26 11:32:08 DEBG dtls_prepare_record(): encrypt using TLS_ECDHE_ECDSA_WITH_AES_128_CCM_8
Jun 26 11:32:08 DEBG nonce:: (16 bytes): 7105A83C000100000000000100000000
Jun 26 11:32:08 DEBG key:: (16 bytes): 6AEFED4F0A40C33DB2B23CC6757E065B
Jun 26 11:32:08 DEBG message:: (18 bytes): 0001000000000001AC6C0109E8F3A4551768
Jun 26 11:32:08 DEBG send header: (13 bytes):
00000000 15 FE FD 00 01 00 00 00  00 00 01 00 12
Jun 26 11:32:08 DEBG send unencrypted: (2 bytes):
00000000 02 0A

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 26, 2021

However, if ChangeCipherSpec and Finished arrive in the wrong order (as highlighted in #79) and the Finished is not dropped or buffered, then kicks in.

Agreed, that there is more to do about the FINISH.

  if (peer->handshake_params &&
            peer->handshake_params->hs_epoch != msg_epoch &&
            peer->state != DTLS_STATE_WAIT_FINISHED) {
          if ((uint16_t)(peer->handshake_params->hs_epoch + 1) == msg_epoch) {
            dtls_info("Next epoch packet arrived before handshake complete\n");
          }
          return 0;
   }

Seems for me to be a work-around.
Receiving records (in difference to sending) should be possible with a clearly defined receiving epoch. That epoch is only changed with CCS, when all previous handshake messages have been received. With that, it's more a issue with dtls_security_params_epoch,

dtls_security_parameters_t *security = dtls_security_params_epoch(peer, dtls_get_epoch(header)); 

and in decrypt_verify. dtls_security_params_epoch returns security_params for epoch 1 before the handshake is ready and the CCS is received. With that, it's more about the exact definition of the security_params[2], and to split dtls_security_params_epoch into read & write. My understanding of the current implementation is, that [0] contains the current epoch to write, therefore it is switched after sending CCS. But the receiving epoch seems to be not precisely defined. I'm currently thinking on that ... and have to play with the code ... but this if is for me a work-around, not a solution.
(Really very sorry for being that picky. DTLS can easily get corrupted, especially with "if", that reflects more a current topic, that the "defined processing".)

@mrdeep1
Copy link
Copy Markdown
Contributor Author

mrdeep1 commented Jun 26, 2021

I have no issues with doing the generic correct thing. I see that the draft DTLS1.3 allocates dedicated epochs to the handshake phase https://tools.ietf.org/id/draft-ietf-tls-dtls13-34.html#rfc.section.6.1 presumable to get around what we are trying to sort out here.

The closest that I can find the DTL1.2 spec gets to handling Finished arriving before the rest of the epoch 0 HS messages is

Implementations MUST either
discard or buffer all application data packets for the new epoch
until they have received the Finished message for that epoch

... but the Finished message is in the next epoch following the ChangeCipherSpec! Hence the fix in this PR for dropping things from the next epoch.

I will look to see if it is possible to sensibly buffer the Finished message, given that it may not be possible to decrypt if HS messages other than ChangeCipherSpec also arrive in the wrong order.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 26, 2021

I prepare PR #94 in order to process the records according a specific read-epoch.

In the meantime, the "separated stateless client_hello epoch 0 processing" gets more and more important, to remove complexity out of the other parts.

@obgm
Copy link
Copy Markdown
Contributor

obgm commented Jun 29, 2021

I have just merged PR #91 and PR #92 which are intended to supersede this PR.

@boaks
Copy link
Copy Markdown
Contributor

boaks commented Jun 29, 2021

#94 supersede the left if.

I would prefer to explicitly use a "read_epoch" and not a combination as in the if in this PR.
#94 is based on my other PR #89, introducing a "separate stateless processing (only) for client_hello epoch 0".

As per RFC5246
   decrypt_error
      A handshake cryptographic operation failed, including being unable
      to correctly verify a signature or validate a Finished message.
      This message is always fatal.

Does not cover a change in Epoch when the previous Epoch's handshakes have
not completed. Silently ignore the unexpected Epoch.

RFC6347 does not add any clarity here, other than

   Implementations MUST either
   discard or buffer all application data packets for the new epoch
   until they have received the Finished message for that epoch.

New Epoch buffering is not currently supported, but the Finished is checked.

Signed-off-by: Jon Shallow <supjps-libcoap@jpshallow.com>
@mrdeep1
Copy link
Copy Markdown
Contributor Author

mrdeep1 commented Oct 6, 2021

Superceeded by #89

@mrdeep1 mrdeep1 closed this Oct 6, 2021
@mrdeep1 mrdeep1 deleted the fixes_79 branch December 8, 2021 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants