raft icon indicating copy to clipboard operation
raft copied to clipboard

Build on MacOS

Open rabits opened this issue 5 years ago • 15 comments

Patch to support the MacOS for go-dqlite, basically a copy of abandoned PR #119 with some cleaning and MacOS-specific changes.

Now it's ready for review - I tested it successfully with macos->macos and macos->linux connection and regular DB workload. Tests are in the go-dqlite PR here: https://github.com/canonical/go-dqlite/pull/132

The most important change that fixed the freeze is moving r->leader_state.change = req; prior to call of the clientChangeConfiguration function (because the macos IO implementation is not async and executes directly, so quite sure even on linux it could fail in case the async executor will decide to run IO operation right after the call).

The other patches in series:

  • https://github.com/canonical/go-dqlite/pull/132
  • https://github.com/canonical/dqlite/pull/285
  • https://github.com/canonical/dqlite/pull/288
  • https://github.com/canonical/dqlite/pull/290

rabits avatar Feb 20 '21 06:02 rabits

Codecov Report

Merging #173 (5f6ae82) into master (fefc557) will increase coverage by 0.05%. The diff coverage is 69.23%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     canonical/raft#173      +/-   ##
==========================================
+ Coverage   86.27%   86.32%   +0.05%     
==========================================
  Files          47       47              
  Lines        7532     7531       -1     
  Branches     2000     2003       +3     
==========================================
+ Hits         6498     6501       +3     
+ Misses        928      923       -5     
- Partials      106      107       +1     
Impacted Files Coverage Δ
src/byte.c 98.57% <ø> (ø)
src/fixture.c 92.99% <ø> (ø)
src/heap.c 100.00% <ø> (ø)
src/uv.c 86.45% <0.00%> (-0.51%) :arrow_down:
src/uv_os.c 81.55% <ø> (ø)
src/uv_snapshot.c 75.97% <ø> (ø)
src/uv_fs.c 74.87% <60.00%> (-0.07%) :arrow_down:
src/raft.c 82.50% <100.00%> (-1.46%) :arrow_down:
src/uv_writer.c 81.13% <100.00%> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update fefc557...5f6ae82. Read the comment docs.

codecov-io avatar Feb 24 '21 04:02 codecov-io

Hi @stgraber could you please check this change as well?

rabits avatar Apr 22 '21 04:04 rabits

I think I'd feel a lot more confident about this (and easier to review/revert) if it was split into a bunch of commits for the various type of changes done in there.

stgraber avatar Apr 22 '21 04:04 stgraber

@stgraber you mean half-working ones? I checked the changelist again and almost any line is needed to be sure that the project could be built or work at all... Maybe you have some clues on how to split it, because it's quite none in my head right now...

rabits avatar Apr 22 '21 05:04 rabits

@rabits Thank you for the PR. I am looking forward for this to land on master. However, I got your changes to build on MAC with this patch to raft, master of dqlite but seeing this error when I ran the dqlite-demo example,

dqlite-demo --api 127.0.0.1:8001 --db 127.0.0.1:9001
Assertion failed: (amount == (int)page_size), function vfsWalWrite, file src/vfs.c, line 1079.
[1]    79045 abort      dqlite-demo --api 127.0.0.1:8001 --db 127.0.0.1:9001

Any idea what's going here?

pycaster avatar May 07 '21 15:05 pycaster

Hi @pycaster , yeah, still working on the patches - but with the latest releases. So still need to prepare CI and retest them again.

Regarding the error you see - looks like some assert failed, hard to tell right now, but let's check when CI part will be ready.

rabits avatar May 07 '21 17:05 rabits

Codecov Report

Merging #173 (23bade6) into master (b71e303) will decrease coverage by 0.15%. The diff coverage is 80.43%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     canonical/raft#173      +/-   ##
==========================================
- Coverage   86.79%   86.63%   -0.16%     
==========================================
  Files          47       47              
  Lines        7602     7648      +46     
  Branches     2030     2046      +16     
==========================================
+ Hits         6598     6626      +28     
- Misses        900      918      +18     
  Partials      104      104              
Impacted Files Coverage Δ
src/byte.c 98.57% <ø> (ø)
src/heap.c 100.00% <ø> (ø)
src/uv_metadata.c 91.66% <0.00%> (ø)
src/uv_segment.c 90.09% <ø> (ø)
src/uv_snapshot.c 76.62% <ø> (+0.64%) :arrow_up:
src/uv_os.c 80.58% <25.00%> (-0.98%) :arrow_down:
src/client.c 78.12% <66.66%> (-0.71%) :arrow_down:
src/uv_fs.c 75.05% <85.71%> (-0.58%) :arrow_down:
src/raft.c 88.75% <100.00%> (+1.25%) :arrow_up:
src/uv.c 85.71% <100.00%> (-1.36%) :arrow_down:
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b71e303...23bade6. Read the comment docs.

codecov-commenter avatar May 10 '21 06:05 codecov-commenter

@MathieuBordere , I tested osx build but looks like it's disabled for now - do you know a way to make it through?

rabits avatar May 11 '21 06:05 rabits

@rabits Looks like our Travis account needs funding :-) should be functional again soon.

MathieuBordere avatar May 11 '21 07:05 MathieuBordere

Oh, sorry - hopefully it's not me who eaten the budget... Is osx pricey on travis?

rabits avatar May 11 '21 17:05 rabits

Hi @MathieuBordere , could you please check the CI log - I'm a little bit confused why the test could fail:

send/changeToUnconnectedAddress                             [ ERROR ]
Error: test/integration/test_uv_send.c:281: uv_run: condition not met in 20 iterations
Error: child killed by signal 6

Also not sure if the other tests are fixed correctly, so am I going in the right direction?

rabits avatar May 17 '21 05:05 rabits

Hi @MathieuBordere , could you please check the CI log - I'm a little bit confused why the test could fail:

send/changeToUnconnectedAddress                             [ ERROR ]
Error: test/integration/test_uv_send.c:281: uv_run: condition not met in 20 iterations
Error: child killed by signal 6

Also not sure if the other tests are fixed correctly, so am I going in the right direction?

It's not obvious why it would fail, I propose that you try to run the test locally and debug it.

The steps in the failing test are:

  • Try to send a message to a server at a certain address
  • Try to send a message to the same server at a different, disconnected address, hitting the code path https://github.com/canonical/raft/blob/34b3ae06efdb221c819ef4c804c37128008467df/src/uv_send.c#L426 but the connection will fail.
  • Try to send a message to the same server at the original, connected address (this is where it fails for you). My guess is that https://github.com/canonical/raft/blob/34b3ae06efdb221c819ef4c804c37128008467df/src/uv_send.c#L487 has a non-0 return value.

MathieuBordere avatar May 17 '21 14:05 MathieuBordere

@MathieuBordere I can reproduce the problem locally.

It does not look like uvGetClient() is the culprit (it can't return non-zero unless memory allocation fails AFAICT).

Can you walk me through how a new connection is supposed to appear in the test and who should flush the pending messages?

I hacked tracing.h as follows:

diff --git a/src/tracing.h b/src/tracing.h
index aeaddb2..9b0d9e2 100644
--- a/src/tracing.h
+++ b/src/tracing.h
@@ -16,6 +16,8 @@ extern struct raft_tracer StderrTracer;
 /* Emit a debug message with the given tracer. */
 #define Tracef(TRACER, ...)                                              \
     do {                                                                 \
+fprintf(stderr, __VA_ARGS__); \
+fprintf(stderr, "\n"); \
         if (TRACER != NULL && TRACER->emit != NULL && TRACER->enabled) { \
             static char _msg[1024];                                      \
             snprintf(_msg, sizeof _msg, __VA_ARGS__);                    \

and see the following:

penberg@vonneumann raft % make test/integration/uv && ./test/integration/uv send/changeToUnconnectedAddress
make: `test/integration/uv' is up to date.
Running test suite with seed 0x8ff66663...
send/changeToUnconnectedAddress                             [ ERROR ]
no connection available -> enqueue message
connect attempt completed -> status unknown error
send pending messages
connection available -> write message
no connection available -> enqueue message
no connection available -> enqueue message
connect attempt completed -> status no connection to remote server available
timer expired -> attempt to reconnect
connect attempt completed -> status no connection to remote server available
timer expired -> attempt to reconnect
connect attempt completed -> status no connection to remote server available
Error: test/integration/test_uv_send.c:284: uv_run: condition not met in 5 iterations
Error: child killed by signal 6
0 of 1 (0%) tests successful, 0 (0%) test skipped.

penberg avatar Nov 11 '21 18:11 penberg

Thanks for this, will have a look later this week!

MathieuBordere avatar Nov 15 '21 22:11 MathieuBordere

a lot developer use MacOS, please consider fix this.

calvin2021y avatar Jul 30 '22 14:07 calvin2021y

Oh, just recalled that this PR is still here. Sorry folks, I used dqlite in my project before, but figured out some other solution and quite sure will not going to continue work on this PR - gets too complicated to support this. Don't really mind if someone wants to continue the work)

rabits avatar Apr 22 '23 16:04 rabits