jobs: replace old node liveness job leases #51087
craig[bot] merged 17 commits into cockroachdb:master
Conversation
cd56a3b to a1deb9e
ajwerner
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/sql/sqlliveness/slinstance/slinstance.go, line 50 at r8 (raw file):
case <-l.stopper.ShouldStop():
    return
case <-time.After(l.hb):
use timeutil.NewTimer
pkg/sql/sqlliveness/slinstance/slinstance.go, line 57 at r8 (raw file):
}
// Extend the current live session.
if err := l.db.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
Ideally you wouldn't be holding the mutex during the entire transaction.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 65 at r8 (raw file):
sqlbase.InternalExecutorSessionDataOverride{User: security.RootUser},
`UPDATE system.sqlliveness SET expiration = $1 WHERE expiration > $2 AND session_id = $3 RETURNING session_id`,
I'm not sure that this expiration > $2 condition is necessary. That matters when removing expired sessions, but I'm less certain it matters when heartbeating. If a session expires in the woods and nobody is around, did it expire? :p Furthermore, if you avoid using that predicate you never need to delete the session. My feeling is that making the deletion of the session the property for its failure is a nice simplification.
But in all seriousness, commentary on the properties of the sessions somewhere seems good.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 84 at r8 (raw file):
}
} else {
    l.mu.s.Exp = exp
I don't think you want to assign here because the function may get retried. I get that you're holding the mutex the whole time but I also am not super pleased about that.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 88 at r8 (raw file):
return nil
}); err != nil {
    log.Error(ctx, err.Error())
Do you need to clear the session if you fail?
pkg/sql/sqlliveness/slinstance/slinstance.go, line 117 at r8 (raw file):
defer l.mu.Unlock()
now := l.db.Clock().Now()
if l.mu.s != nil && l.mu.s.Expiration().WallTime > now.WallTime {
I'm skeptical of this condition.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 123 at r8 (raw file):
id := uuid.MakeV4().GetBytes()
exp := l.db.Clock().Now().Add(l.d.Nanoseconds(), 1)
if err := l.db.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
my sense is that this lazy session creation makes the locking harder to follow. I can't think of why we'd have an Instance and not want it to have a session. I think you could simplify all of this by just having the loop create or update the session.
pkg/sql/sqlliveness/slsession/slsession.go, line 11 at r8 (raw file):
// licenses/APL.txt.
package slsession
This doesn't need to be in a subpackage, it should be an unexported struct in the slinstance package.
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
// If the expiration is corrupted it's better to treat session as expired.
} else {
    if s.db.Clock().Now().WallTime < exp.WallTime {
I again think that just using the expiration is tenuous because then we have to worry about clock synchronization. My feeling is that you should only return false for a session if there is no row. That, of course, leaves open the question of who deletes expired sessions (the below TODO).
spaskob
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/sql/sqlliveness/slinstance/slinstance.go, line 117 at r8 (raw file):
Previously, ajwerner wrote…
I'm skeptical of this condition.
So you prefer Session to be a simple getter for the session, with the heartbeat loop doing all the work?
pkg/sql/sqlliveness/slinstance/slinstance.go, line 123 at r8 (raw file):
Previously, ajwerner wrote…
my sense is that this lazy session creation makes the locking harder to follow. I can't think of why we'd have an Instance and not want it to have a session. I think you could simplify all of this by just having the loop create or update the session.
That sounds good. How would you deal with cases where you create an instance and it starts the heartbeat loop, but inserting the new session takes some time? Meanwhile you try to get a session with a call to Session; does the caller have to retry?
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, ajwerner wrote…
I again think that just using the expiration is tenuous because then we have to worry about clock synchronization. My feeling is that you should only return false for a session if there is no row. That, of course, leaves open the question of who deletes expired sessions (the below TODO).
That is tricky though, isn't it? If a node goes offline, another node has to declare the session expired; otherwise, until the offline node comes back, the claimed resources will not be released. Should we use some time offset constant that is guaranteed to be >= the max time skew?
ajwerner
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @spaskob)
pkg/sql/sqlliveness/slinstance/slinstance.go, line 117 at r8 (raw file):
Previously, spaskob (Spas Bojanov) wrote…
So you prefer Session to be a simple getter for the session, with the heartbeat loop doing all the work?
Yes though the comment was more regarding the in-memory check of the expiration
pkg/sql/sqlliveness/slinstance/slinstance.go, line 123 at r8 (raw file):
Previously, spaskob (Spas Bojanov) wrote…
That sounds good. How would you deal with cases where you create an instance and it starts the heartbeat loop, but inserting the new session takes some time? Meanwhile you try to get a session with a call to Session; does the caller have to retry?
The semantics of whether this function fails or blocks should be clear.
I might make both variants. To implement blocking I'd use a channel which is initialized at construction time and then atomically swapped out when the session fails.
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, spaskob (Spas Bojanov) wrote…
That is tricky though, isn't it? If a node goes offline, another node has to declare the session expired, otherwise until the off-line node comes back the claimed resources will not be released. Should we use some time offset constant that is guaranteed to be >= the max time skew?
My hope is that if other nodes are alive they will be deleting sessions. I think we need something to periodically delete expired sessions.
I really quite strongly do not want to get into the business of thinking about clock skew and we can avoid it with the deletion techniques.
spaskob
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, ajwerner wrote…
My hope is that if other nodes are alive they will be deleting sessions. I think we need something to periodically delete expired sessions.
I really quite strongly do not want to get into the business of thinking about clock skew and we can avoid it with the deletion techniques.
I see your idea: upon creation the Storage object will start a daemon that deletes other nodes' expired sessions without worrying about clock skew. The point is that the session will be deleted, and the SQL instance that owns the session, when serving calls to Session, will either have to look up whether the session id is still in system.sqlliveness or rely on the heartbeating loop to detect that the session was deleted. In the second case, there will be an interval (during the pause of the heartbeat loop) in which the session will be deleted but still not considered expired by the owning SQL instance. This does not sound good, so we may have to query system.sqlliveness on each call to Session and use the blocking idea with channels that you described in your comment above. WDYT?
ajwerner
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @spaskob)
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
or it will have to rely on the heartbeating loop to detect that the session was deleted.
This is the whole discussion at the end of the RFC. If a user of a session wants to guarantee that the session is still live there are a variety of options.
- The view of the session will carry the latest expiration which can be used to bound a transaction commit timestamp
- The protocol can be that you have mutual exclusion so long as your claim exists and you transactionally verify the claim exists (and only remove claims of sessions which you know to no longer exist).
- You can observe your session transactionally.
This all ends up not being a problem because transactions operate in a shared time domain.
This does not sound good, so we may have to query system.sqlliveness on each call to Session and use the blocking idea using channels that you described in your comment above. WDYT?
I do not think we need to query sessions except when heartbeating.
spaskob
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/sqlliveness/slinstance/slinstance.go, line 50 at r8 (raw file):
Previously, ajwerner wrote…
use timeutil.NewTimer
Done.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 57 at r8 (raw file):
Previously, ajwerner wrote…
Ideally you wouldn't be holding the mutex during the entire transaction.
Done.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 65 at r8 (raw file):
Previously, ajwerner wrote…
I'm not sure that this expiration > $2 condition is necessary. That matters when removing expired sessions, but I'm less certain it matters when heartbeating. If a session expires in the woods and nobody is around, did it expire? :p Furthermore, if you avoid using that predicate you never need to delete the session. My feeling is that making the deletion of the session the property for its failure is a nice simplification. But in all seriousness, commentary on the properties of the sessions somewhere seems good.
Done.
Good idea!
pkg/sql/sqlliveness/slinstance/slinstance.go, line 84 at r8 (raw file):
Previously, ajwerner wrote…
I don't think you want to assign here because the function may get retried. I get that you're holding the mutex the whole time but I also am not super pleased about that.
Done.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 88 at r8 (raw file):
Previously, ajwerner wrote…
Do you need to clear the session if you fail?
That's a good question, and more generally: how should we treat failures in the heartbeat loop? We should probably raise a poison flag and kill the SQL instance; otherwise we may get blocked in a call to Session waiting on the heartbeat to extend or create a new session.
pkg/sql/sqlliveness/slsession/slsession.go, line 11 at r8 (raw file):
Previously, ajwerner wrote…
This doesn't need to be in a subpackage, it should be an unexported struct in the slinstance package.
Done.
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, ajwerner wrote…
or it will have to rely on the heartbeating loop to detect that the session was deleted.
This is the whole discussion at the end of the RFC. If a user of a session wants to guarantee that the session is still live there are a variety of options.
- The view of the session will carry the latest expiration which can be used to bound a transaction commit timestamp
- The protocol can be that you have mutual exclusion so long as your claim exists and you transactionally verify the claim exists (and only remove claims of sessions which you know to no longer exist
- You can observe your session transactionally.
This all ends up not being a problem because transactions operate in a shared time domain.
This does not sound good, so we may have to query system.sqlliveness on each call to Session and use the blocking idea using channels that you described in your comment above. WDYT?
I do not think we need to query sessions except when heartbeating.
Done.
spaskob
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, spaskob (Spas Bojanov) wrote…
Done.
Should we have a separate daemon that deletes expired sessions?
ajwerner
left a comment
Nice, this is looking really good! This still needs the daemon to actually delete the expired sessions, right? Some unit testing around the sqlliveness subpackages would be good. As a minor comment, it might be nice to rename the deprecated methods and fields in the jobs registry, as I found it sort of unclear which methods had to do with the old system and which had to do with the new system.
Reviewed 7 of 14 files at r1, 6 of 13 files at r4, 1 of 2 files at r5, 2 of 6 files at r6, 2 of 6 files at r8, 10 of 11 files at r9.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/jobs/adopt.go, line 53 at r9 (raw file):
rows, err := r.ex.QueryEx(
    ctx, "select-running/reverting-jobs", nil,
    sqlbase.InternalExecutorSessionDataOverride{User: security.RootUser}, `
do we want this to be the NodeUser instead? I'm not super pleased that the RootUser has write access to the jobs table. Seems like a recipe for disaster.
pkg/jobs/adopt.go, line 100 at r9 (raw file):
row, err := r.ex.QueryRowEx(
    ctx, "get-job-row", nil,
    sqlbase.InternalExecutorSessionDataOverride{User: security.RootUser}, `
same NodeUser comment.
pkg/jobs/adopt.go, line 109 at r9 (raw file):
}
if row == nil {
    return errors.Errorf("job %d: epoch has changed, no longer eligible for resuming", jobID)
nit: consider returning the value of crdb_internal.sql_liveness_is_alive(claim_session_id) rather than filtering on it to disambiguate the row actually getting deleted from underneath you from the case where the session is not live.
pkg/jobs/registry.go, line 215 at r9 (raw file):
adoptionCh: make(chan struct{}),
sqlInstance: slinstance.NewSqlInstance(
    stopper, clock, db, ex, DefaultAdoptInterval+10*time.Duration(time.Second), time.Second,
nit: I'd take this as a dependency and then construct it in the server.
pkg/jobs/registry.go, line 499 at r9 (raw file):
// TODO(spaskob): think about this invariant, can this ever happen?
// Probably not, so we should fail hard in this case.
panic("old job!")
consider making this a log.Fatal and adding some more surrounding information about the job type.
pkg/jobs/registry.go, line 622 at r9 (raw file):
return
}
log.Infof(ctx, "Registry live claim (instance_id: %s, sid: %s).\n",
nit: lose the .\n and consider moving this to VEventf(ctx, 1
pkg/sql/sqlbase/system.go, line 398 at r9 (raw file):
keys.StatementDiagnosticsTableID: privilege.ReadWriteData,
keys.ScheduledJobsTableID: privilege.ReadWriteData,
keys.SqllivenessID: privilege.ReadWriteData,
I think we want this to be privilege.ReadData. I don't really want root mucking around in here.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 11 at r9 (raw file):
// licenses/APL.txt.
package slinstance
nit: package level comment
pkg/sql/sqlliveness/slinstance/slinstance.go, line 34 at r9 (raw file):
Sid sqlliveness.SessionID
Exp hlc.Timestamp
nit: no reason to export these
pkg/sql/sqlliveness/slinstance/slinstance.go, line 43 at r9 (raw file):
func (s *session) Expiration() hlc.Timestamp { return s.Exp }

type SQLInstance struct {
nit: comment
pkg/sql/sqlliveness/slinstance/slinstance.go, line 83 at r9 (raw file):
func (l *SQLInstance) createSession(ctx context.Context) (*session, error) {
    id := uuid.MakeV4().GetBytes()
    exp := l.clock.Now().Add(l.ttl.Nanoseconds(), 1)
I don't think you need to add 1 to the logical field. Having a non-zero logical timestamp is pretty rare.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 150 at r9 (raw file):
if err := l.extendSession(ctx); err != nil {
    l.clearSession()
    t.Reset(0)
My sense is that we want some sort of exponential backoff here or perhaps in extendSession
pkg/sql/sqlliveness/slinstance/slinstance.go, line 187 at r9 (raw file):
}
select {
This should also listen on ctx.Done()
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, spaskob (Spas Bojanov) wrote…
Should we have a separate daemon that deletes expired sessions?
Yes, I think we should. It can live in a separate package potentially.
3aa1572 to 24a17f6
spaskob
left a comment
Responded to feedback - will add tests as an additional change after we finalize the main discussion.
There are 2 open questions left:
- Where to store the settings for heartbeat interval and expiration ttl?
- How to get a stopper for the slstorage object that is needed for the is_alive builtin?
Reviewed 1 of 12 files at r10.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/jobs/adopt.go, line 53 at r9 (raw file):
Previously, ajwerner wrote…
do we want this to be the NodeUser instead? I'm not super pleased that the RootUser has write access to the jobs table. Seems like a recipe for disaster.
Done.
pkg/jobs/registry.go, line 499 at r9 (raw file):
Previously, ajwerner wrote…
consider making this a log.Fatal and adding some more surrounding information about the job type.
Done.
pkg/sql/sqlbase/system.go, line 398 at r9 (raw file):
Previously, ajwerner wrote…
I think we want this to be privilege.ReadData. I don't really want root mucking around in here.
The tests start timing out when I do that. Before I dig far, what is the rationale? We are supposed to write to and delete from this table, correct?
pkg/sql/sqlliveness/slinstance/slinstance.go, line 83 at r9 (raw file):
Previously, ajwerner wrote…
I don't think you need to add 1 to the logical field. Having a non-zero logical timestamp is pretty rare.
Done.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 150 at r9 (raw file):
Previously, ajwerner wrote…
My sense is that we want some sort of exponential backoff here or perhaps in extendSession
Done.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 187 at r9 (raw file):
Previously, ajwerner wrote…
This should also listen on ctx.Done()
Done.
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, ajwerner wrote…
Yes, I think we should. It can live in a separate package potentially.
Code added - what's the reasoning behind it being in a separate package?
Also I will need some help - I am not sure how to get a stopper in https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/conn_executor.go#L2010 which I will need to pass to the slstorage object.
ajwerner
left a comment
- How about creating some cluster settings?
- We've been discussing offline.
Another thought occurs to me: one thing that's been an observability problem with jobs is understanding how many times a job has been adopted. How do you feel about adding an integer column next to the claim in the jobs table which we increment each time we write a new claim into the row?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
Added cluster settings for
spaskob
left a comment
Replaced all constants with cluster settings. Ready for review.
As for the adoption counter idea: if we decide to, we can do it in a follow-up PR. Currently we can deduce this counter from the logs, and we also store all resume errors in the job payload; the number of resume errors should be a good approximation.
Reviewed 3 of 12 files at r10, 1 of 6 files at r11.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
ajwerner
left a comment
Continuing to get better. Still looking for some unit testing on the implementations of the various sqlliveness interfaces. One area I'd like to focus on a bit before this merges, where you could ease some concerns I have, is the upgrade path: making sure we handle it gracefully with regard to jobs that are currently being run by the registry when the version upgrade flips.
Reviewed 1 of 11 files at r9, 2 of 13 files at r14, 3 of 6 files at r15, 1 of 2 files at r16, 4 of 6 files at r17, 5 of 5 files at r18.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/jobs/registry.go, line 476 at r18 (raw file):
defer r.mu.Unlock()
if _, alreadyRegistered := r.mu.adoptedJobs[*j.ID()]; alreadyRegistered {
    log.Fatalf(ctx, "job %d: was just created but foun din registered adopted jobs", *j.ID())
nit foun din
pkg/jobs/registry_external_test.go, line 87 at r18 (raw file):
jobs.LeniencySetting.Override(&s.ClusterSettings().SV, 0)
const cancelInterval = time.Duration(math.MaxInt64)
const adoptInterval = time.Nanosecond
under stressrace this can make things spin very fast. Consider time.Millisecond.
pkg/sql/sqlbase/system.go, line 398 at r9 (raw file):
Previously, spaskob (Spas Bojanov) wrote…
The tests start timing out when I do that, before I dig far - what is the rationale - we are supposed to write and delete from this table correct?
Well, node should have read-write access with ReadData, just root shouldn't. My guess is that something is writing to this table using the root user instead of the node user if it's timing out.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 170 at r18 (raw file):
everySecond := log.Every(time.Second)
if s, _ := l.getSessionOrBlockCh(); s == nil {
    for r := retry.StartWithCtx(ctx, opts); r.Next(); {
retry has a feature where you can pass it a stopper that should make this cleaner. See the Closer field https://godoc.org/github.com/cockroachdb/cockroach/pkg/util/retry#Options
pkg/sql/sqlliveness/slinstance/slinstance.go, line 191 at r18 (raw file):
for r := retry.StartWithCtx(ctx, opts); r.Next(); {
    select {
    // Finish immediately in tests instead of waiting for the retry.;w
nit: ;w
pkg/sql/sqlliveness/slstorage/slstorage.go, line 88 at r18 (raw file):
for {
    t := timeutil.NewTimer()
    t.Reset(0)
I think you want this above the loop
pkg/sql/sqlliveness/slstorage/slstorage.go, line 96 at r18 (raw file):
case <-t.C:
    t.Read = true
    t.Reset(s.gcInterval())
generally I prefer resetting it after the operation completes. If the delete-sessions query gets slow this might spin continuously, which I don't think we want.
pkg/sql/sqlliveness/sltorage/slstorage.go, line 55 at r8 (raw file):
Previously, spaskob (Spas Bojanov) wrote…
Code added - what's the reasoning behind it being in a separate package?
Also I will need some help - I am not sure how to get a stopper in https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/conn_executor.go#L2010 which I will need to pass to the slstorage object.
It just feels orthogonal in functionality to the storage. I'm fine either way.
spaskob
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
pkg/jobs/registry_external_test.go, line 87 at r18 (raw file):
Previously, ajwerner wrote…
under stressrace this can make things spin very fast. Consider time.Millisecond.
Done.
pkg/sql/sqlbase/system.go, line 398 at r9 (raw file):
Previously, ajwerner wrote…
Well, node should have read-write access with ReadData, just root shouldn't. My guess is that something is writing to this table using the root user instead of the node user if it's timing out.
Most probably; the root user is used everywhere in the jobs package. I propose to create an issue to fix this and do it in a follow-up PR for the whole package.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 170 at r18 (raw file):
Previously, ajwerner wrote…
retry has a feature where you can pass it a stopper that should make this cleaner. See the Closer field https://godoc.org/github.com/cockroachdb/cockroach/pkg/util/retry#Options
Done.
pkg/sql/sqlliveness/slstorage/slstorage.go, line 88 at r18 (raw file):
Previously, ajwerner wrote…
I think you want this above the loop
Done.
pkg/sql/sqlliveness/slstorage/slstorage.go, line 96 at r18 (raw file):
Previously, ajwerner wrote…
generally I prefer resetting it after the operation completes. If the delete-sessions query get slow this might spin continuously, which I don't think we want.
Done.
ajwerner
left a comment
I think you need to rebase/regenerate the cluster versions stuff.
Reviewed 5 of 12 files at r19.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @spaskob)
out, line 1 at r19 (raw file):
Running make with -j8
junk
pkg/jobs/deprecated.go, line 57 at r19 (raw file):
}

func (r *Registry) maybeAdoptJob(
nit: prefix with deprecated
pkg/jobs/deprecated.go, line 290 at r19 (raw file):
}

func (r *Registry) newLease() *jobspb.Lease {
nit: prefix with deprecated
pkg/jobs/deprecated.go, line 300 at r19 (raw file):
}

func (r *Registry) cancelAllLocked(ctx context.Context) {
nit: prefix with deprecated
pkg/jobs/deprecated.go, line 312 at r19 (raw file):
// killed and that no one else tries to resume it. This essentially works as a
// barrier that only one function can cross and try to resume the job.
func (r *Registry) register(jobID int64, cancel func()) error {
nit: prefix with deprecated
pkg/sql/sqlliveness/sqlliveness.go, line 63 at r19 (raw file):
Insert(context.Context, Session) error
Update(context.Context, Session) error
Commentary on the semantics. In particular, I suspect you're going to want a concrete error or boolean value from Update to indicate that the session has been removed so that you can stop trying to update it.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 141 at r19 (raw file):
}
everySecond := log.Every(time.Second)
if s, _ := l.getSessionOrBlockCh(); s == nil {
Why not put the retry loops into createSession and extendSession? Feels more fitting there.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 144 at r19 (raw file):
for r := retry.StartWithCtx(ctx, opts); r.Next(); {
    newSession, err := l.createSession(ctx)
    if err != nil {
if ctx.Err() != nil don't log and probably just return.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 151 at r19 (raw file):
}
if log.V(2) {
    log.Infof(ctx, "Created new SQL liveness session %s", newSession.ID())
I suspect this should not have a V gate.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 160 at r19 (raw file):
var err error
for r := retry.StartWithCtx(ctx, opts); r.Next(); {
    if err = l.extendSession(ctx); err != nil {
This seems too simplistic. In the case that somebody has removed your claim, you'll never succeed, and you should definitely go back around to create a new session. In the case that you don't know, you want to retry. In the case your context has been canceled, you want to return, which I guess is already the case though rather implicitly.
I'm not sure I understand the reason to have a max number of retries to extend. My sense is you should try to extend until you know that it won't ever succeed.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 162 at r19 (raw file):
if err = l.extendSession(ctx); err != nil {
    if log.V(2) {
        log.Infof(ctx, "Could not extend SQL liveness session %s", s.ID())
I don't think this one needs a vmodule gate, it's a pretty unexpected thing. Maybe indicate which attempt number we're on.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 168 at r19 (raw file):
break
}
if err != nil {
If ctx.Err() != nil don't log and probably just return.
pkg/sql/sqlliveness/slinstance/slinstance.go, line 179 at r19 (raw file):
}
}
t.Reset(l.hb())
This floating reset is hard to reason about given all the code above it. If more of the code were moved into helpers it might be clearer that this is the happy path.
pkg/sql/sqlliveness/slstorage/slstorage.go, line 107 at r19 (raw file):
return err
})
t.Reset(s.gcInterval())
sometimes to make it more explicit I lift this to just above the select and do:
if t.Read {
t.Reset(...)
}
d4d81a3 to e5111a3
Touches cockroachdb#47892. Add `claim_session_id` and `claim_instance_id` columns in a new column family. These two values specify the current claim by a job registry on this job row. The current live claims will live in a new table `system.sqlliveness` along with an expiration deadline which will need to be extended to keep the claims alive. Add SQL migration code to migrate old definitions to the new one and to create the new system table. Release note: none.
Release note: none.
See https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20200615_sql_liveness.md for background on the sql liveness implementation and on using sqlliveness to implement a new job leasing mechanism. The biggest difficulty was to ensure the correct migration of existing clusters with existing running or paused jobs. These need to be handled by the old job adoption and leasing implementation, while newly created jobs need to be handled by the new job leasing and adoption. There is a new job adoption function `claimAndProcessJobs` implemented using the new adoption API in `adopt.go`. The old adoption code is preserved and it will be triggered if and only if the cluster has not been finalized. This code will be deleted in release `20.2`. All jobs inherited from the previous version or created before the upgrade is finalized will be handled by the old code. When the cluster is finalized the new adoption code will kick in for newly created jobs. Old jobs that have not finished yet will be detected as having expired claims (column `claim_session_id` will have value NULL), will be canceled, and will eventually be resumed by the new adoption code. Release note: none.
We introduce a new SQL builtin function `crdb_internal.sql_liveness_is_alive` that returns true if the provided sqlliveness session id has not expired. Release note: none.
Release note: none.
Release note: none.
Release note: none.
Release note: none.
This test checks that jobs created before a backup are preserved by restore, but we are introducing a new column `claim_session_id` which is a `uuid` and needs to be excluded from the check that job rows are identical for the same job id. Also refactored the test to produce a more helpful error message via `require.Equal`. Release note: none.
The SQL instance is normally started at startup, but if an old cluster is upgraded it won't start, and when the cluster is finalized the new adoption logic will kick in without a running SQL instance, which will cause adoption to stall. Release note: none.
This was supposed to be deprecated in 20.2, but because there is some chance of 19.2 jobs not finishing in 20.1 clusters, we will deprecate in 21.0. Release note: none.
This caching, as discussed in the RFC proved to be critical when there are a large number of jobs to avoid a quadratic number of lookups. Having a large number of jobs can occur relatively commonly. Release note: None
Also augmented the test a bit. Release note: None
Release note: None
This commit does a few things: 1) Drops the `SQL` from `SQLInstance` as it merely introduced stutter in the name. 2) Made the implementation of SessionID into a string as these things are immutable and we would like to use them to key maps. 3) Split up the storage interface as only the reading portion was used externally. The use of the `Session` interface in the writing portion ended up just being cruft. 4) Add a `Provider` interface which represents the subsystem as a whole exported outwards and an implementation in `slprovider`. 5) Adds sanity to the starting of the sqlliveness subsystem. Before this commit, the various components were started multiple times in different places. This commit moves the starting to the sqlserver Start method. Release note: None
Release note: None
bors r+
Build succeeded
Touches #47892.
For full details see the sqlliveness RFC.
Add claim_session_id and claim_instance_id columns in a new column family.
These two values specify the current claim by a job registry on this job row.
The current live claims will live in a new table system.sqlliveness along with
an expiration deadline which will need to be extended to keep the claims alive.
Add sql migration code to migrate old definitions to the new one and for the new
system table.
The new job adoption logic is implemented in claimAndProcessJobs in
registry.go. It gets a live epoch, potentially steals jobs from other
registries and lays claim on them and finally it processes all claimed
jobs.
The main steps of claimAndProcessJobs are implemented in adopt.go,
and they deliberately circumvent the flawed update.go jobs API, opting
for direct interaction with the jobs table. I think their implementation
illustrates the simplicity of this approach.
The jobs currently claimed and resumed on the node are kept in a new map,
adoptedJobs, which tracks their epoch at the time of the claim
so that they can be easily canceled if the epoch is bumped.
Migration to 20.2 probably needs some more work but is also sketched here:
In mixed-version cluster we use the legacy adoption mechanism as
before.
Once the cluster is detected to be finalized, the old adoption does
not get triggered any more.
The new adoption kicks in, kills all legacy jobs and starts adopting
using the new epoch based mechanism.
Release note: none.