Data loss with TLS 1.3 #10880

@MrAnno

Hi,

OpenSSL version:

OpenSSL 1.1.1d  10 Sep 2019
built on: Wed Nov 13 16:09:29 2019 UTC
platform: linux-x86_64
options:  bn(64,64) rc4(16x,int) des(int) idea(int) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -Wa,--noexecstack -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -Wa,--noexecstack -D_FORTIFY_SOURCE=2 -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DVPAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPOLY1305_ASM -DNDEBUG -D_FORTIFY_SOURCE=2
OPENSSLDIR: "/etc/ssl"
ENGINESDIR: "/usr/lib/engines-1.1"
Seeding source: os-specific

OS Name, Version, Hardware platform:

Arch Linux, Ubuntu Bionic x86_64

Compiler Details (name, version):

gcc version 9.2.0

Problem Description:

When a client sends data to a TLS 1.3 server and then closes the connection, the server does not read everything that the client reported as successfully sent. The amount of data the server reads varies from run to run.

Reproduction steps:

  1. Ignore SIGPIPE
  2. TLS client connects to server
  3. TLS client sends data (in small chunks, using multiple SSL_write() calls)
  4. TLS client closes the connection immediately after all (blocking) SSL_write() calls have returned successfully
    • Delaying close() on the client side with nanosleep(), or performing a complete TCP shutdown (shutdown(SHUT_RDWR)), reduces the likelihood of data loss. My guess is that this is because the issue is related to how ECONNRESET and "new session ticket" messages are handled.
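
Step 1 matters because a write to a connection the peer has already torn down would otherwise kill the process with SIGPIPE instead of returning an error. The sketch below shows this with plain POSIX calls on a pipe rather than a TLS socket. The helper names `ignore_sigpipe` and `errno_of_write_to_closed_pipe` are hypothetical and not taken from the linked test program.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Step 1: ignore SIGPIPE for the whole process. Returns 0 on success. */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/*
 * Hypothetical helper: write one byte into a pipe whose read end is already
 * closed. Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created.
 * With SIGPIPE ignored the write fails with EPIPE; with the default
 * disposition the process would be terminated by the signal instead.
 */
int errno_of_write_to_closed_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    close(fds[0]);                    /* nobody can read from the pipe now */

    errno = 0;
    ssize_t n = write(fds[1], "x", 1);
    int saved = (n < 0) ? errno : 0;  /* capture errno before close() runs */

    close(fds[1]);
    return saved;
}
```

With SIGPIPE ignored, a failed write on a dead connection reaches the application as an error code, so the test program can report it instead of being killed.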

A minimal example that reproduces the issue (localhost, loopback interface) can be found here:
MrAnno/openssl-tls13-test@3b101e3

The data loss usually occurs if the server returns from a blocking SSL_read() with SSL_ERROR_SYSCALL:

$ ./tls13test
SSL_read returned SSL_ERROR_SYSCALL: Connection reset by peer
read: 3672 (should be 4080)

The reason I think it's a bug (even after reading #6904) is that SSL_read() sometimes returns SSL_ERROR_SYSCALL while the error queue is empty and errno is 0:

$ ./tls13test
SSL_read returned SSL_ERROR_SYSCALL: Success
read: 3876 (should be 4080)
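
The two outputs above differ only in errno: ECONNRESET in the first run and 0 in the second, where the "Success" text is presumably strerror(0). As far as I understand the SSL_get_error() documentation for 1.1.1, SSL_ERROR_SYSCALL with errno 0 indicates an unexpected EOF from the peer. A minimal sketch of how a caller could tell the cases apart follows. `describe_syscall_error` is a hypothetical helper, not an OpenSSL API, and it assumes the caller saved errno immediately after SSL_read() and checked the error queue with ERR_peek_error().

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (not part of OpenSSL): choose a message for an
 * SSL_ERROR_SYSCALL result.
 *   saved_errno  errno captured right after SSL_read() returned
 *   queue_empty  nonzero if ERR_peek_error() returned 0
 */
const char *describe_syscall_error(int saved_errno, int queue_empty)
{
    if (saved_errno != 0)
        return strerror(saved_errno);   /* e.g. ECONNRESET */
    if (queue_empty)
        return "unexpected EOF (peer closed without close_notify)";
    return "see OpenSSL error queue";
}
```

In the server's read loop this would replace a bare strerror(errno), so the errno 0 case is reported as an unexpected EOF rather than "Success".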

Labels

branch: 1.1.1 (applies to OpenSSL_1_1_1-stable branch, EOL)
branch: master (applies to master branch)
severity: important (important bugs affecting a released version)
triaged: bug (the issue/PR is/fixes a bug)