Build on MacOS
Patch to support the MacOS for go-dqlite, basically a copy of abandoned PR #119 with some cleaning and MacOS-specific changes.
Now it's ready for review - I tested it successfully with macos->macos and macos->linux connection and regular DB workload. Tests are in the go-dqlite PR here: https://github.com/canonical/go-dqlite/pull/132
The most important change that fixed the freeze is moving r->leader_state.change = req; prior to call of the clientChangeConfiguration function (because the macos IO implementation is not async and executes directly, so quite sure even on linux it could fail in case the async executor will decide to run IO operation right after the call).
The other patches in series:
- https://github.com/canonical/go-dqlite/pull/132
- https://github.com/canonical/dqlite/pull/285
- https://github.com/canonical/dqlite/pull/288
- https://github.com/canonical/dqlite/pull/290
Codecov Report
Merging #173 (5f6ae82) into master (fefc557) will increase coverage by
0.05%. The diff coverage is69.23%.
@@ Coverage Diff @@
## master canonical/raft#173 +/- ##
==========================================
+ Coverage 86.27% 86.32% +0.05%
==========================================
Files 47 47
Lines 7532 7531 -1
Branches 2000 2003 +3
==========================================
+ Hits 6498 6501 +3
+ Misses 928 923 -5
- Partials 106 107 +1
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/byte.c | 98.57% <ø> (ø) |
|
| src/fixture.c | 92.99% <ø> (ø) |
|
| src/heap.c | 100.00% <ø> (ø) |
|
| src/uv.c | 86.45% <0.00%> (-0.51%) |
:arrow_down: |
| src/uv_os.c | 81.55% <ø> (ø) |
|
| src/uv_snapshot.c | 75.97% <ø> (ø) |
|
| src/uv_fs.c | 74.87% <60.00%> (-0.07%) |
:arrow_down: |
| src/raft.c | 82.50% <100.00%> (-1.46%) |
:arrow_down: |
| src/uv_writer.c | 81.13% <100.00%> (ø) |
|
| ... and 4 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update fefc557...5f6ae82. Read the comment docs.
Hi @stgraber could you please check this change as well?
I think I'd feel a lot more confident about this (and easier to review/revert) if it was split into a bunch of commits for the various type of changes done in there.
@stgraber you mean half-working ones? I checked the changelist again and almost any line is needed to be sure that the project could be built or work at all... Maybe you have some clues on how to split it, because it's quite none in my head right now...
@rabits Thank you for the PR. I am looking forward for this to land on master. However, I got your changes to build on MAC with this patch to raft, master of dqlite but seeing this error when I ran the dqlite-demo example,
dqlite-demo --api 127.0.0.1:8001 --db 127.0.0.1:9001
Assertion failed: (amount == (int)page_size), function vfsWalWrite, file src/vfs.c, line 1079.
[1] 79045 abort dqlite-demo --api 127.0.0.1:8001 --db 127.0.0.1:9001
Any idea what's going here?
Hi @pycaster , yeah, still working on the patches - but with the latest releases. So still need to prepare CI and retest them again.
Regarding the error you see - looks like some assert failed, hard to tell right now, but let's check when CI part will be ready.
Codecov Report
Merging #173 (23bade6) into master (b71e303) will decrease coverage by
0.15%. The diff coverage is80.43%.
@@ Coverage Diff @@
## master canonical/raft#173 +/- ##
==========================================
- Coverage 86.79% 86.63% -0.16%
==========================================
Files 47 47
Lines 7602 7648 +46
Branches 2030 2046 +16
==========================================
+ Hits 6598 6626 +28
- Misses 900 918 +18
Partials 104 104
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/byte.c | 98.57% <ø> (ø) |
|
| src/heap.c | 100.00% <ø> (ø) |
|
| src/uv_metadata.c | 91.66% <0.00%> (ø) |
|
| src/uv_segment.c | 90.09% <ø> (ø) |
|
| src/uv_snapshot.c | 76.62% <ø> (+0.64%) |
:arrow_up: |
| src/uv_os.c | 80.58% <25.00%> (-0.98%) |
:arrow_down: |
| src/client.c | 78.12% <66.66%> (-0.71%) |
:arrow_down: |
| src/uv_fs.c | 75.05% <85.71%> (-0.58%) |
:arrow_down: |
| src/raft.c | 88.75% <100.00%> (+1.25%) |
:arrow_up: |
| src/uv.c | 85.71% <100.00%> (-1.36%) |
:arrow_down: |
| ... and 10 more |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update b71e303...23bade6. Read the comment docs.
@MathieuBordere , I tested osx build but looks like it's disabled for now - do you know a way to make it through?
@rabits Looks like our Travis account needs funding :-) should be functional again soon.
Oh, sorry - hopefully it's not me who eaten the budget... Is osx pricey on travis?
Hi @MathieuBordere , could you please check the CI log - I'm a little bit confused why the test could fail:
send/changeToUnconnectedAddress [ ERROR ]
Error: test/integration/test_uv_send.c:281: uv_run: condition not met in 20 iterations
Error: child killed by signal 6
Also not sure if the other tests are fixed correctly, so am I going in the right direction?
Hi @MathieuBordere , could you please check the CI log - I'm a little bit confused why the test could fail:
send/changeToUnconnectedAddress [ ERROR ] Error: test/integration/test_uv_send.c:281: uv_run: condition not met in 20 iterations Error: child killed by signal 6Also not sure if the other tests are fixed correctly, so am I going in the right direction?
It's not obvious why it would fail, I propose that you try to run the test locally and debug it.
The steps in the failing test are:
- Try to send a message to a server at a certain address
- Try to send a message to the same server at a different, disconnected address, hitting the code path https://github.com/canonical/raft/blob/34b3ae06efdb221c819ef4c804c37128008467df/src/uv_send.c#L426 but the connection will fail.
- Try to send a message to the same server at the original, connected address (this is where it fails for you). My guess is that https://github.com/canonical/raft/blob/34b3ae06efdb221c819ef4c804c37128008467df/src/uv_send.c#L487 has a non-0 return value.
@MathieuBordere I can reproduce the problem locally.
It does not look like uvGetClient() is the culprit (it can't return non-zero unless memory allocation fails AFAICT).
Can you walk me through how a new connection is supposed to appear in the test and who should flush the pending messages?
I hacked tracing.h as follows:
diff --git a/src/tracing.h b/src/tracing.h
index aeaddb2..9b0d9e2 100644
--- a/src/tracing.h
+++ b/src/tracing.h
@@ -16,6 +16,8 @@ extern struct raft_tracer StderrTracer;
/* Emit a debug message with the given tracer. */
#define Tracef(TRACER, ...) \
do { \
+fprintf(stderr, __VA_ARGS__); \
+fprintf(stderr, "\n"); \
if (TRACER != NULL && TRACER->emit != NULL && TRACER->enabled) { \
static char _msg[1024]; \
snprintf(_msg, sizeof _msg, __VA_ARGS__); \
and see the following:
penberg@vonneumann raft % make test/integration/uv && ./test/integration/uv send/changeToUnconnectedAddress
make: `test/integration/uv' is up to date.
Running test suite with seed 0x8ff66663...
send/changeToUnconnectedAddress [ ERROR ]
no connection available -> enqueue message
connect attempt completed -> status unknown error
send pending messages
connection available -> write message
no connection available -> enqueue message
no connection available -> enqueue message
connect attempt completed -> status no connection to remote server available
timer expired -> attempt to reconnect
connect attempt completed -> status no connection to remote server available
timer expired -> attempt to reconnect
connect attempt completed -> status no connection to remote server available
Error: test/integration/test_uv_send.c:284: uv_run: condition not met in 5 iterations
Error: child killed by signal 6
0 of 1 (0%) tests successful, 0 (0%) test skipped.
Thanks for this, will have a look later this week!
a lot developer use MacOS, please consider fix this.
Oh, just recalled that this PR is still here. Sorry folks, I used dqlite in my project before, but figured out some other solution and quite sure will not going to continue work on this PR - gets too complicated to support this. Don't really mind if someone wants to continue the work)