
Implement AddSequencedLeaves in MySQL storage#1036

Merged
pav-kv merged 9 commits into google:master from pav-kv:mysql_add_sequenced_leaves
Mar 22, 2018

Conversation

@pav-kv
Contributor

@pav-kv pav-kv commented Mar 1, 2018

This is a stub which does not yet fully implement the API contract; in particular, it does not return duplicates if there are any.

}

-func (m *mySQLLogStorage) beginInternal(ctx context.Context, tree *trillian.Tree) (storage.LogTreeTX, error) {
+func (m *mySQLLogStorage) beginInternal(ctx context.Context, tree *trillian.Tree) (*logTreeTX, error) {
Contributor

Not sure why this changed.

Contributor Author

AddSequencedLeaves below needs to access logTreeTX internals, in particular tx, because I could not use LogTreeTX.UpdateSequencedLeaves as it does not update LeafData.
I think what I could do is shift this implementation a bit deeper, into a dedicated storage.LogTreeTX.AddSequencedLeaves method. WDYT?
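The option of pushing the implementation down into the transaction can be sketched as follows. This is a minimal illustration under stated assumptions, not the real Trillian API: Leaf, LogTreeTX, and logTreeTX below are hypothetical stand-ins showing why a method on the transaction interface avoids exposing internals such as tx.

```go
package main

import "fmt"

// Leaf is a hypothetical stand-in for trillian.LogLeaf.
type Leaf struct{ Index int64 }

// LogTreeTX sketches the interface-level option discussed above: callers see
// only the method, not the transaction internals.
type LogTreeTX interface {
	AddSequencedLeaves(leaves []Leaf) error
}

// logTreeTX is the concrete transaction; its private state stands in for the
// underlying SQL tx handle and the LeafData writes.
type logTreeTX struct {
	stored []Leaf
}

// AddSequencedLeaves lives on the concrete type, so it can freely use private
// state while external packages only ever hold the LogTreeTX interface.
func (t *logTreeTX) AddSequencedLeaves(leaves []Leaf) error {
	t.stored = append(t.stored, leaves...)
	return nil
}

func main() {
	var tx LogTreeTX = &logTreeTX{}
	fmt.Println(tx.AddSequencedLeaves([]Leaf{{Index: 0}, {Index: 1}}) == nil)
}
```

With this shape, the LogStorage-level AddSequencedLeaves can simply open a transaction and delegate to the interface method.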

return nil, err
}

ok := status.New(codes.OK, "OK").Proto()
Contributor

Not sure these really help that much?

Contributor Author

Indeed, the 2 vars below look better inlined. Not sure about ok, though; it seems like a good optimization for the fast-path case when most/all returned statuses are OK: instead of allocating a new proto for the status each time, we reuse this one everywhere.
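The allocation pattern under discussion can be illustrated with a small sketch. The Status struct below is a hypothetical stand-in for the status proto (the real code uses status.New(codes.OK, "OK").Proto() from gRPC); the point is only that the shared value is allocated once:

```go
package main

import "fmt"

// Status is a stand-in for the status proto; this sketch only illustrates
// the allocation pattern, not the real gRPC types.
type Status struct {
	Code    int
	Message string
}

// queueResults builds one result per leaf. The shared ok value is allocated
// once and reused for every successful leaf, so the fast path (all leaves OK)
// performs no per-leaf status allocation. Note that sharing is only safe as
// long as callers treat the returned status as read-only.
func queueResults(n int) []*Status {
	ok := &Status{Code: 0, Message: "OK"} // allocated once, reused below
	res := make([]*Status, n)
	for i := range res {
		res[i] = ok
	}
	return res
}

func main() {
	res := queueResults(3)
	fmt.Println(res[0] == res[1] && res[1] == res[2]) // all entries share one Status
}
```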

if err != nil && err != storage.ErrTreeNeedsInit {
return nil, err
}
return tx.(storage.ReadOnlyLogTreeTX), err
Contributor

Does removing this cast change any behaviour such as losing the readonly?

Contributor Author

This line did not compile as it was, because tx had become a *logTreeTX. The value returned by this function is still the read-only interface, so I suppose nothing changes on the caller's side?

Contributor Author

Now tx is again a storage.LogTreeTX, but I think the cast is unnecessary.
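For reference, the Go rule at play here can be sketched with hypothetical types (ReadOnlyTX below is not the real storage.ReadOnlyLogTreeTX): a function whose declared return type is an interface can return a concrete value directly, and the compiler performs the conversion, so no explicit type assertion is needed and nothing read-only is lost.

```go
package main

import "fmt"

// ReadOnlyTX is an illustrative interface, mirroring in spirit the read-only
// transaction interface discussed above.
type ReadOnlyTX interface {
	Commit() error
}

// logTreeTX is a concrete transaction type with extra internals that the
// interface deliberately hides.
type logTreeTX struct{ committed bool }

func (t *logTreeTX) Commit() error { t.committed = true; return nil }

// snapshot returns the concrete *logTreeTX as the read-only interface.
// No explicit assertion (tx.(ReadOnlyTX)) is needed: assigning to the
// declared interface return type is an implicit, compile-checked conversion.
func snapshot() (ReadOnlyTX, error) {
	tx := &logTreeTX{}
	return tx, nil
}

func main() {
	tx, _ := snapshot()
	fmt.Println(tx.Commit() == nil) // caller only ever sees ReadOnlyTX
}
```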

@codecov-io

codecov-io commented Mar 1, 2018

Codecov Report

Merging #1036 into master will decrease coverage by 0.27%.
The diff coverage is 31.14%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1036      +/-   ##
==========================================
- Coverage    62.6%   62.32%   -0.28%     
==========================================
  Files         103      103              
  Lines        8455     8513      +58     
==========================================
+ Hits         5293     5306      +13     
- Misses       2632     2667      +35     
- Partials      530      540      +10
Impacted Files Coverage Δ
storage/mock_storage.go 0% <0%> (ø) ⬆️
storage/mysql/queue.go 42% <0%> (ø) ⬆️
server/log_rpc_server.go 77.46% <100%> (ø) ⬆️
storage/mysql/log_storage.go 65.71% <37.5%> (-2.68%) ⬇️
storage/mysql/map_storage.go 66.66% <0%> (-1.93%) ⬇️
server/map_rpc_server.go 30.58% <0%> (-1.13%) ⬇️
crypto/verifier.go 48.14% <0%> (-0.95%) ⬇️
integration/maptest/map.go 72.05% <0%> (-0.75%) ⬇️
examples/ct/ctmapper/mapper/mapper.go 11.81% <0%> (-0.19%) ⬇️
client/map_verifier.go 0% <0%> (ø) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6282acf...2bf0a96. Read the comment docs.

+ refactor
+ dummy implementations in other storages
+ go generate
res := make([]*trillian.QueuedLogLeaf, len(leaves))
ok := status.New(codes.OK, "OK").Proto()

// Note: Leaves are sorted by LeafIndex, so no reordering is necessary.
Contributor Author

selfnit: ... no deterministic reordering is necessary.

Contributor Author

Done.

@pav-kv pav-kv requested a review from daviddrysdale March 2, 2018 10:59
@pav-kv
Contributor Author

pav-kv commented Mar 2, 2018

@AlCutter @daviddrysdale Martin is OOO, could you please take a look?

@@ -259,15 +261,26 @@ func (m *mySQLLogStorage) ReadWriteTransaction(ctx context.Context, tree *trilli
}

func (m *mySQLLogStorage) AddSequencedLeaves(ctx context.Context, tree *trillian.Tree, leaves []*trillian.LogLeaf) ([]*trillian.QueuedLogLeaf, error) {
Member

Is this func (on LogStorage impl) needed, or is this a LogTreeTX thing?

Contributor Author

It is both, similarly to QueueLeaves. Currently both are transactional, but we discussed earlier that we wanted to allow the LogStorage ones to treat entries independently. WDYT?

Member

SGTM

// TODO(pavelkalinnikov): Measure latencies.
_, err := t.tx.ExecContext(ctx, insertLeafDataSQL,
t.treeID, leaf.LeafIdentityHash, leaf.LeafValue, leaf.ExtraData, 0)
// Note: QueueTimestamp == 0 because the entry bypasses the queue.
Member

Will that mess with the integration latency metrics? (The signer sets integration_timestamp and computes the difference for each leaf, which it then adds to a histogram; that might be useful for seeing how the mirroring etc. is doing on fetching leaves vs. integrating them?)

Contributor Author

Good catch. Are you suggesting to overload QueueTimestamp so that for PREORDERED_LOG it really means AddTimestamp?
We could add another field like SequencingTimestamp (which would make sense for regular LOG mode as well), or just rename this one to a more generic AddTimestamp. WDYT?

Contributor Author

Also, I think we should split the metric in two: one for the LOG's merge delay, another for PREORDERED_LOG's integration delay.

Contributor Author

@AlCutter ping

Member

Sorry, I think splitting the metrics is a good idea.
I guess I was suggesting overloading the QueueTimestamp; it's still kinda true in that the entry is queued to be properly integrated into the tree, right?

Contributor Author

Done timestamp saving, TODOed the new metric bit.
Yeah, kinda, but not queued in the sense we are used to, rather added/stored. Leaving it as is for now though.

Contributor Author

@pav-kv pav-kv left a comment

@AlCutter Addressed some of your comments, PTAL.

return nil, err
}

// Note: If LeafIdentityHash collides, we still store the indexed entry.
Contributor Author

@AlCutter Since we decided to postpone fixing the duplicates issue, I think it's safer to not silently store the erroneous data here, but rather return an error. See the updated code.


t.treeID, leaf.LeafIdentityHash, leaf.MerkleLeafHash, leaf.LeafIndex, 0)
// TODO(pavelkalinnikov): Update IntegrateTimestamp on integrating the leaf.

if isDuplicateErr(err) {
Contributor Author

Started manually testing, found a bug. Suppose we erroneously try to insert a new unique identity at an occupied leaf index. The INSERT above stores the identity, but the second INSERT fails because the index is occupied. Still, the tx gets committed, and the side effect remains. Now this identity can't be inserted anymore.

I think I should add a test for this, and a couple of simpler scenarios.

Contributor Author

Wow, this actually complicates things more than I thought it would. Essentially, we need to do 2 INSERTs, each of which can fail independently of the other (there can be a conflicting identity and/or a conflicting leaf index). Thus, if one fails, we should cancel the effect of the other to leave the db in its old state. We can do that either by making separate transactions, which is slow, or by manually deleting the inserted keys, which doesn't look nice.

@AlCutter @Martin2112 Any thoughts?

Contributor Author

The third option is reading before doing the inserts.

Contributor

Let's look at this again next week. Might be able to use savepoints for this at least in MySQL.

Contributor Author

@pav-kv pav-kv Mar 19, 2018

Yes, this works. Will update the PR shortly (after #1061 is done as this one now depends on it).

@pav-kv
Contributor Author

pav-kv commented Mar 21, 2018

@Martin2112 PTAL.

@Martin2112
Contributor

Maybe alcutter@ knows, but how will we implement this on CloudSpanner? I don't think it has an equivalent of savepoints.

@pav-kv
Contributor Author

pav-kv commented Mar 21, 2018

I think we can at least do a ReadWrite transaction that first tries to read the identity/entry and does the two inserts only if there is no conflict. Anyway, this would be in a separate PR.
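The read-before-insert idea can be sketched as below. This is a plain in-memory illustration, not Spanner code; table, addIfFree, and the key names are all hypothetical. The point is that checking both keys inside one read-write transaction before writing leaves no partial side effect on conflict:

```go
package main

import "fmt"

// table is an in-memory stand-in for a database table keyed by a unique column.
type table map[string]bool

// addIfFree reads both keys first and inserts only when neither conflicts,
// so a conflict on either key leaves both tables untouched.
func addIfFree(ids, idxs table, id, idx string) bool {
	if ids[id] || idxs[idx] { // read phase: detect either conflict upfront
		return false
	}
	ids[id], idxs[idx] = true, true // write phase: both inserts are safe now
	return true
}

func main() {
	ids, idxs := table{}, table{}
	fmt.Println(addIfFree(ids, idxs, "id1", "0")) // true
	fmt.Println(addIfFree(ids, idxs, "id2", "0")) // false: index taken, no side effect
	fmt.Println(ids["id2"])                       // false: identity was not stored
}
```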

@Martin2112
Contributor

Yes, I'm not suggesting we write the code now. Just checking we have a plan.


// Leaves in this transaction are inserted in two tables. For each leaf, if
// one of the two inserts fails, we remove the side effect by rolling back to
// a savepoint installed before the first insert.
Contributor

I thought the savepoint would be updated each pass through the loop so only the ones that fail are rolled back? Is that not how it's meant to work?

Contributor

OK I now realize we're still in an all or nothing situation. Ignore that comment.

Contributor Author

Oh, actually we are not in all-or-nothing. Your initial interpretation was right: only the failed-to-insert entries are rolled back. This is why I update the savepoint on each loop iteration below.
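A minimal sketch of this per-leaf savepoint loop, using an in-memory stand-in for the transaction. The real code issues SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT SQL inside one MySQL transaction; fakeTx and addSequenced below are hypothetical, and a single key list stands in for both tables:

```go
package main

import (
	"errors"
	"fmt"
)

// fakeTx simulates the savepoint pattern with an in-memory list of applied
// inserts; mark records where the current savepoint was taken.
type fakeTx struct {
	applied []string
	mark    int
}

func (tx *fakeTx) savepoint()  { tx.mark = len(tx.applied) }
func (tx *fakeTx) rollbackTo() { tx.applied = tx.applied[:tx.mark] }

// insert fails on an existing key, standing in for a duplicate-key error
// from either of the two tables.
func (tx *fakeTx) insert(key string) error {
	for _, k := range tx.applied {
		if k == key {
			return errors.New("duplicate")
		}
	}
	tx.applied = append(tx.applied, key)
	return nil
}

// addSequenced inserts each leaf into two "tables". If either insert fails,
// rolling back to the per-leaf savepoint undoes the partial side effect and
// the loop moves on, so only the failed leaves are skipped, not the batch.
func addSequenced(tx *fakeTx, leaves [][2]string) []bool {
	ok := make([]bool, len(leaves))
	tx.savepoint() // installed upfront so the final release always has a target
	for i, l := range leaves {
		if err := tx.insert(l[0]); err != nil {
			tx.rollbackTo()
			continue
		}
		if err := tx.insert(l[1]); err != nil {
			tx.rollbackTo() // undo the first insert too
			continue
		}
		ok[i] = true
		tx.savepoint() // advance the savepoint past the successful leaf
	}
	return ok
}

func main() {
	tx := &fakeTx{}
	// The second leaf reuses index "idx1", so its identity insert is undone.
	res := addSequenced(tx, [][2]string{{"id1", "idx1"}, {"id2", "idx1"}, {"id3", "idx3"}})
	fmt.Println(res, tx.applied) // [true false true] [id1 idx1 id3 idx3]
}
```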

Contributor

OK. It's hard to read GitHub diffs. I'll have another look.

Contributor Author

The reason I create this savepoint upfront is to make sure the "RELEASE SAVEPOINT ..." after the loop has something to release and doesn't return an error.

glog.Errorf("Error adding savepoint: %s", err)
return nil, err
}
// TODO(pavelkalinnikov): Consider performance implication of executing this
Contributor

I think the overhead of creating a savepoint is low but we should test this theory.

Contributor Author

Yes, will get to it later.

@Martin2112
Contributor

Martin2112 commented Mar 21, 2018 via email

@pav-kv
Contributor Author

pav-kv commented Mar 21, 2018

I wonder why the 4th builder in Travis consistently complains about SQL syntax, while others don't.

@Martin2112
Contributor

Martin2112 commented Mar 21, 2018 via email

@Martin2112
Contributor

I've looked at it again and I think it's doing the right thing. Let's investigate the batched_queue Travis error now.

@pav-kv
Contributor Author

pav-kv commented Mar 21, 2018

I think the reason is that the insertSequencedLeafSQL variable gets overridden in batched_queue mode by the one in the queue_batching.go file. Will see how I can fix this.

@pav-kv
Contributor Author

pav-kv commented Mar 21, 2018

@Martin2112 I fixed the errors, PTAL. For the performance (with or without savepoints), I will probably add a Benchmark test in a follow-up PR, and play with it locally to see the difference.

@pav-kv pav-kv merged commit 812857a into google:master Mar 22, 2018
@pav-kv pav-kv deleted the mysql_add_sequenced_leaves branch March 22, 2018 10:12
gdbelvin added a commit to gdbelvin/trillian that referenced this pull request Mar 26, 2018
* master:
  storage/testdb: drop now-unused entrypoints (google#1067)
  Drop use of SQLite (google#1064)
  Implement AddSequencedLeaves in MySQL storage (google#1036)