-
Notifications
You must be signed in to change notification settings - Fork 268
Non-determinism in golang runtime (etcd example, golang tests, obfs4proxy, ...) #2693
Copy link
Copy link
Closed
Labels
Type: BugError or flaw producing unexpected resultsError or flaw producing unexpected results
Milestone
Description
The syscall order isn't always consistent in the etcd example. For example, see the following two runs:
diff --git a/shadow.data/hosts/server1/server1.etcd.1000.strace b/shadow.data.2/hosts/server1/server1.etcd.1000.strace
index 3ac9e59f..17eb6044 100644
--- a/shadow.data/hosts/server1/server1.etcd.1000.strace
+++ b/shadow.data.2/hosts/server1/server1.etcd.1000.strace
@@ -207,10 +207,13 @@
00:00:00.000000000 [tid 1003] gettid(...) = 1003
00:00:00.000000000 [tid 1003] futex(...) = 1
00:00:00.000000000 [tid 1000] futex(...) = 0
+000000000000000000 [tid 1000] clock_gettime(...) = 0
00:00:00.000000000 [tid 1000] futex(...) = 1
00:00:00.000000000 [tid 1003] futex(...) = 0
+000000000000000000 [tid 1003] clock_gettime(...) = 0
00:00:00.000000000 [tid 1003] futex(...) = 1
00:00:00.000000000 [tid 1000] futex(...) = 0
+000000000000000000 [tid 1000] clock_gettime(...) = 0
00:00:00.000000000 [tid 1000] clock_gettime(...) = 0
00:00:00.000000000 [tid 1000] rt_sigprocmask(...) = 0
00:00:00.000000000 [tid 1000] mmap(0x0, 8392704, PROT_NONE, MAP_PRIVATE | MAP_ANON | MAP_ANONYMOUS | MAP_STACK, -1, 0) = 0x7fffceedd000
@@ -221,15 +224,15 @@
00:00:00.000000000 [tid 1000] rt_sigprocmask(...) = 0
00:00:00.000000000 [tid 1000] rt_sigprocmask(...) = 0
000000000000000000 [tid 1000] clock_gettime(...) = 0
-000000000000000000 [tid 1000] clock_gettime(...) = 0
-000000000000000000 [tid 1000] clock_gettime(...) = 0
00:00:00.000000000 [tid 1000] fcntl(0, 3) = 32768
00:00:00.000000000 [tid 1000] fcntl(1, 3) = 32769
00:00:00.000000000 [tid 1000] fcntl(2, 3) = 32769
000000000000000000 [tid 1000] clock_gettime(...) = 0
000000000000000000 [tid 1000] clock_gettime(...) = 0
00:00:00.000000000 [tid 1000] socket(Unix, 526338, 0) = 3
+000000000000000000 [tid 1000] clock_gettime(...) = 0
00:00:00.000000000 [tid 1000] bind(3, sockaddr_un { unnamed }, 2) = 0
+000000000000000000 [tid 1000] clock_gettime(...) = 0
00:00:00.000000000 [tid 1000] epoll_create1(...) = 4
00:00:00.000000000 [tid 1000] pipe2([1040768, 65728] (0xc000107348), O_CLOEXEC | O_NDELAY | O_NONBLOCK) = 0
00:00:00.000000000 [tid 1000] epoll_ctl(...) = 0Most of the diff is clock_gettime, but occasionally some other syscalls are moved around as well. When combined with the model_unblocked_syscall_latency option, the diff varies much more and sometimes the simulation will fail (see #2161 (comment)).
While we don't necessarily need to support etcd (we don't know of anyone using etcd in shadow), this does point towards a non-determinism issue that we should fix.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
Type: BugError or flaw producing unexpected resultsError or flaw producing unexpected results