replay: collect tables and manifests from workloads#2049

Merged
coolcom200 merged 2 commits into cockroachdb:master from coolcom200:workload-capture-cleaner
Nov 15, 2022
Conversation

@coolcom200
Contributor

@coolcom200 coolcom200 commented Oct 24, 2022

replay: collect tables and manifests from workloads

Develop a workload collector that captures tables and manifests from a
workload. The workload collector learns about the tables and manifests
from the EventListener. The workload collector exposes an Attach
method that attaches to the pebble Options. It also reads and
saves information from the already configured options.

The workload collector replaces the cleaner of a pebble database in
order to ensure that the files that are ready for cleaning do not get
cleaned before the workload collector has a chance to process them.
During the Attach call the workload collector will save the originally
provided cleaner. This saved cleaner will be used to perform the final
cleaning once the files are obsolete for the collector.
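A minimal sketch of that cleaner handoff, using an illustrative Cleaner interface and made-up names rather than pebble's actual types:

```go
package main

import "fmt"

// Cleaner mirrors the shape of pebble's cleaner hook: it is asked to
// remove a file once the database considers it obsolete. (Illustrative
// stand-in, not the real pebble interface.)
type Cleaner interface {
	Clean(path string) error
}

// DeleteCleaner stands in for the originally configured cleaner.
type DeleteCleaner struct{ deleted []string }

func (d *DeleteCleaner) Clean(path string) error {
	d.deleted = append(d.deleted, path)
	return nil
}

// WorkloadCollector wraps the cleaner saved during Attach. Files the
// collector still needs are held back; everything else is forwarded.
type WorkloadCollector struct {
	cleaner Cleaner         // the original cleaner, saved during Attach
	needed  map[string]bool // files not yet copied by the collector
	pending []string        // held-back paths awaiting processing
}

func (w *WorkloadCollector) Clean(path string) error {
	if w.needed[path] {
		// Defer deletion until the collector has copied the file.
		w.pending = append(w.pending, path)
		return nil
	}
	return w.cleaner.Clean(path)
}

// doneProcessing marks a file as copied and runs the deferred clean.
func (w *WorkloadCollector) doneProcessing(path string) error {
	delete(w.needed, path)
	return w.cleaner.Clean(path)
}

func main() {
	orig := &DeleteCleaner{}
	w := &WorkloadCollector{cleaner: orig, needed: map[string]bool{"000005.sst": true}}
	w.Clean("000004.sst")     // not needed: cleaned immediately
	w.Clean("000005.sst")     // needed: held back
	fmt.Println(orig.deleted) // [000004.sst]
	w.doneProcessing("000005.sst")
	fmt.Println(orig.deleted) // [000004.sst 000005.sst]
}
```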

When a manifest is created the event listener will be fired, which will
trigger the onManifestCreated method that marks the manifest file
as readyForProcessing. Additionally this event handler will update the
workload collector's current manifest file number, which will be used in
the start method to ensure that the current manifest is copied once
workload collection starts. This is required as workload collection can
be started at any point in time, and collecting the workload requires
having the manifest file saved.

Similarly when a table is ingested or flushed the onTableIngest and
onFlushEnd methods will run, which also mark the tables as
readyForProcessing. After the files are marked as
readyForProcessing they won’t be cleaned until they are marked as
obsolete.
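The per-file states described here can be modeled as a small bit set; the names below follow the fileState / is(...) snippets quoted later in the review, but the exact bit values are illustrative:

```go
package main

import "fmt"

// workloadCaptureState is a bit set tracking what the collector knows
// about a file. Bit values are illustrative.
type workloadCaptureState uint8

const (
	readyForProcessing workloadCaptureState = 1 << iota
	obsolete
	capturedSuccessfully
)

// is reports whether the given flag is set.
func (s workloadCaptureState) is(flag workloadCaptureState) bool {
	return s&flag != 0
}

func main() {
	fileState := map[string]workloadCaptureState{}

	// The event listener marks a flushed table as ready.
	fileState["000007.sst"] |= readyForProcessing
	// Pebble later decides the file is obsolete; the cleaner records that.
	fileState["000007.sst"] |= obsolete

	s := fileState["000007.sst"]
	fmt.Println(s.is(readyForProcessing), s.is(obsolete), s.is(capturedSuccessfully))
	// true true false
}
```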

After the files’ states are set the table handlers mentioned above will
signal a goroutine that will perform the processing. Note that only
onTableIngest and onFlushEnd signal the goroutine.
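A sketch of that handler-to-goroutine signaling with a sync.Cond; the names here are hypothetical, and the real PR wires the handlers up through the event listener:

```go
package main

import (
	"fmt"
	"sync"
)

// collector sketches the handoff: event handlers append work under a
// mutex and Signal a condition variable; one goroutine drains the queue.
type collector struct {
	mu        sync.Mutex
	cond      *sync.Cond
	queue     []string
	processed []string
	done      chan struct{}
}

func newCollector() *collector {
	c := &collector{done: make(chan struct{})}
	c.cond = sync.NewCond(&c.mu)
	return c
}

// enqueue is what onFlushEnd/onTableIngest would call: cheap, no I/O.
func (c *collector) enqueue(path string) {
	c.mu.Lock()
	c.queue = append(c.queue, path)
	c.mu.Unlock()
	c.cond.Signal()
}

// run processes n files and exits; the real loop would run until stopped.
func (c *collector) run(n int) {
	defer close(c.done)
	for len(c.processed) < n {
		c.mu.Lock()
		for len(c.queue) == 0 {
			c.cond.Wait()
		}
		batch := c.queue
		c.queue = nil // hand off the slice rather than reusing its memory
		c.mu.Unlock()
		c.processed = append(c.processed, batch...) // copying would happen here
	}
}

func main() {
	c := newCollector()
	go c.run(2)
	c.enqueue("000012.sst")
	c.enqueue("000013.sst")
	<-c.done
	fmt.Println(c.processed) // [000012.sst 000013.sst]
}
```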

The processing goroutine performs the task of copying the tables from
the source VFS to a destination VFS. It also copies the manifest
files in chunks to the destination VFS. The VFS interface was chosen
to allow clients flexibility in where to copy the files. Once
a table file has been processed it will be marked as obsolete and
cleaned by the cleaner.

The processing goroutine mentioned above is started by a call to
Start, which does several key things:

  1. It enables the collection handlers by setting the enabled atomic to
    1 (true), which allows the aforementioned EventListener methods
    to update the table/manifest states.
  2. It takes the current manifest and adds it to the list of manifests
    to process.
  3. It starts the processing goroutine.
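Under those assumptions, Start might be sketched as below; all field names are guesses based on the description, not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// WorkloadCollector is a hypothetical sketch following the three Start
// steps in the commit message.
type WorkloadCollector struct {
	enabled         uint32 // 1 when the collection handlers should record state
	mu              sync.Mutex
	currentManifest string
	manifests       []string // manifests queued for copying
}

func (w *WorkloadCollector) Start() {
	// 1. Enable the event-listener handlers.
	atomic.StoreUint32(&w.enabled, 1)

	// 2. Queue the manifest that is already current, since collection can
	// begin at any point in the database's lifetime.
	w.mu.Lock()
	w.manifests = append(w.manifests, w.currentManifest)
	w.mu.Unlock()

	// 3. Launch the processing goroutine.
	go w.process()
}

func (w *WorkloadCollector) process() { /* copy loop elided */ }

func main() {
	w := &WorkloadCollector{currentManifest: "MANIFEST-000001"}
	w.Start()
	fmt.Println(atomic.LoadUint32(&w.enabled), w.manifests)
	// 1 [MANIFEST-000001]
}
```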

Closes: #2050

@cockroach-teamcity
Member

This change is Reviewable

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch from dfce781 to e8cd48b Compare October 24, 2022 19:30
Contributor

@jbowens jbowens left a comment


Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @coolcom200)


cleaner.go line 26 at r1 (raw file):

// handling flushed and ingested SSTs. The cleaner only deletes obsolete files
// after they have been processed by the fileHandler.
type WorkloadCaptureCleaner struct {

let's start a new package in pebble/replay to organize this workload replay code

nit: how about WorkloadCollector. although it implements the Cleaner interface, it's a bit of a minor implementation detail.


cleaner.go line 33 at r1 (raw file):

	// and FlushEnded.
	fileHandler   WorkloadCaptureFileHandler
	fileProcessed map[string]bool

i think we might need a bit more information. specifically, I'm thinking about the case where a file is Clean'd before it's been collected (since file collection needs to be asynchronous). We want to retain the knowledge that Clean was called, and then have the copying goroutine remove the file once it's finished copying.


cleaner.go line 51 at r1 (raw file):

// files over to the archive directory instead of moving them.
func (wcc DefaultWorkloadCaptureFileHandler) HandleFile(fs vfs.FS, path string) error {
	destDir := fs.PathJoin(fs.PathDir(path), "archive")

let's take a destination path as configuration, since we'll want to copy over to a separate volume (which likely won't be mounted within the database's data directory)


cleaner.go line 101 at r1 (raw file):

	defer w.Unlock()
	for _, table := range info.Output {
		w.processFileByNum(table.FileNum)

OnFlushEnd gets invoked by the goroutine that completes the flush, while it is holding DB.mu. We'll need to actually perform the copying on a separate goroutine. I think we'll want to launch a goroutine when the workload collector is created, and use a condition variable to signal to the goroutine when there are new files for it to copy.

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch from e8cd48b to f14f3f7 Compare October 26, 2022 19:11
Contributor Author

@coolcom200 coolcom200 left a comment


Reviewable status: 0 of 4 files reviewed, 5 unresolved discussions (waiting on @jbowens)


replay/workload_capture.go line 26 at r2 (raw file):

	storageDir         string
	notifier           *sync.Cond
	shouldStopNotifier bool // TODO(leon): stop.Stopper? in CRDB?

Not too sure whether it should be pebble's responsibility to expose the method that starts the file collector processing goroutine (StartCollectorFileListener) and hence to manage that goroutine (stopping it when needed). If we choose to let clients manage the goroutine, we could potentially use the stop.Stopper in CRDB. Currently we expose a start and a stop method for the goroutine, but it might make more sense for clients to manage it themselves. Any thoughts on this?


cleaner.go line 26 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

let's start a new package in pebble/replay to organize this workload replay code

nit: how about WorkloadCollector. although it implements the Cleaner interface, it's a bit of a minor implementation detail.

Done.


cleaner.go line 33 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

i think we might need a bit more information. specifically, I'm thinking about the case where a file is Clean'd before it's been collected (since file collection needs to be asynchronous). We want to retain the knowledge that Clean was called, and then have the copying goroutine remove the file once its finished copying.

Done.


cleaner.go line 51 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

let's take a destination path as configuration, since we'll want to copy over to a separate volume (which likely won't be mounted within the database's data directory)

Done.


cleaner.go line 101 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

OnFlushEnd gets invoked by the goroutine that completes the flush and which is holding DB.mu. We'll need to actually perform the copying on a separate goroutine. I think we'll want to launch a goroutine when the workload collector is created, and use a condition variable to signal to the goroutine when there are new files for it to copy.

Done.

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch 3 times, most recently from b5a71d4 to 19bf2a9 Compare November 3, 2022 18:24
Contributor

@jbowens jbowens left a comment


looking good! still working my way through

Reviewed 1 of 4 files at r2.
Reviewable status: 1 of 4 files reviewed, 13 unresolved discussions (waiting on @coolcom200)


-- commits line 2 at r7:
nit: when a commit is specific to a package, we prefix with the package name. in this case replay: ...


replay/workload_capture.go line 26 at r2 (raw file):

Previously, coolcom200 (Leon Fattakhov) wrote…

Not too sure whether it should be pebble's responsibility to expose the method that starts the file collector processing goroutine (StartCollectorFileListener) and hence to manage that goroutine (stopping it when needed). If we choose to let clients manage the goroutine, we could potentially use the stop.Stopper in CRDB. Currently we expose a start and a stop method for the goroutine, but it might make more sense for clients to manage it themselves. Any thoughts on this?

A common, generalized pattern is to pass context.Context into the function that starts the goroutine. The goroutine then respects context cancellation and exits when the <-ctx.Done() channel returns.

https://pkg.go.dev/context

Unfortunately, mixing channels and condition variables is non-trivial, since you can't select on a condition variable. We can talk this through next week.
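For reference, the conventional way to stop a condition-variable-based worker without select is a stop flag guarded by the cond's mutex, checked inside the wait loop; a sketch of the pattern with hypothetical names:

```go
package main

import (
	"fmt"
	"sync"
)

// worker drains queued work until stop is requested, then exits.
type worker struct {
	mu      sync.Mutex
	cond    *sync.Cond
	work    []string
	stopped bool
	drained []string
	done    chan struct{}
}

func newWorker() *worker {
	w := &worker{done: make(chan struct{})}
	w.cond = sync.NewCond(&w.mu)
	return w
}

func (w *worker) run() {
	defer close(w.done)
	w.mu.Lock()
	defer w.mu.Unlock()
	for {
		// Wait for either work or a stop request; both are protected by
		// mu, so there is no lost-wakeup window.
		for len(w.work) == 0 && !w.stopped {
			w.cond.Wait()
		}
		if len(w.work) == 0 && w.stopped {
			return
		}
		w.drained = append(w.drained, w.work...)
		w.work = nil
	}
}

func (w *worker) add(s string) {
	w.mu.Lock()
	w.work = append(w.work, s)
	w.mu.Unlock()
	w.cond.Signal()
}

// stop asks the goroutine to exit after draining queued work.
func (w *worker) stop() {
	w.mu.Lock()
	w.stopped = true
	w.mu.Unlock()
	w.cond.Signal()
	<-w.done
}

func main() {
	w := newWorker()
	go w.run()
	w.add("a")
	w.add("b")
	w.stop()
	fmt.Println(w.drained) // [a b]
}
```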


replay/workload_capture.go line 26 at r7 (raw file):

// own file handler to perform the workload capturing.
type WorkloadCollectorFileHandler interface {
	HandleTableFile(fs vfs.FS, path string) error

can we call this CopySSTable or something less generic?


replay/workload_capture.go line 28 at r7 (raw file):

	HandleTableFile(fs vfs.FS, path string) error
	HandleManifestFileMutated(fs vfs.FS, details workloadCollectorManifestDetails) (int64, error)
	OutputDirectory() string

this is a bit out of place on the generalized interface. For example, we may want to write workloads to blob storage eventually for use with disaggregated storage.

it looks like this is being used for constructing the manifest files. i think we can probably push that into the interface, eg

type WorkloadStorage interface {
    CopySSTable(srcFS vfs.FS, srcPath string) error
    CreateManifest(name string) (vfs.File, error)
}

since vfs.File itself is an interface, a WorkloadStorage can return its own implementation of the vfs.File that, for example, writes to blob storage.


replay/workload_capture.go line 33 at r7 (raw file):

// DefaultWorkloadCollectorFileHandler is a default workload capture tool that
// copies files over to the archive directory.
type DefaultWorkloadCollectorFileHandler struct {

these types are becoming a bit of a mouthful. wdyt of renaming WorkloadCollectorFileHandler to WorkloadStorage and DefaultWorkloadCollectorFileHandler to FilesystemWorkloadStorage?

actually, I think we don't really need to export this concrete type. We can make it filesystemWorkloadStorage and make the constructor an exported function that returns the interface type:

func FilesystemWorkloadStorage(destinationPath string) WorkloadStorage {
    return filesystemWorkloadStorage{
        // ...
    }
}

replay/workload_capture.go line 43 at r7 (raw file):

) DefaultWorkloadCollectorFileHandler {
	return DefaultWorkloadCollectorFileHandler{
		destinationDirectory: captureDestinationDirectory,

if you think it would still be equally understandable (I'd say so), it'd be a little more Go idiomatic to use abbreviated field and variable names here, like dstDir for both


replay/workload_capture.go line 51 at r7 (raw file):

// files over to the archive directory instead of moving them.
func (wcc DefaultWorkloadCollectorFileHandler) HandleTableFile(fs vfs.FS, path string) error {
	if err := fs.MkdirAll(wcc.destinationDirectory, 0755); err != nil {

let's either move this into the FilesystemWorkloadStorage constructor, or make the onus of ensuring the destination exists the caller's responsibility. we'll be calling this function on every sstable, and the directory will only not exist the first time.


replay/workload_capture.go line 73 at r7 (raw file):

}

type workloadCollectorManifestDetails struct {

nit: manifestState?


replay/workload_capture.go line 111 at r7 (raw file):

// NewWorkloadCaptureCleaner is used externally to create a New WorkloadCollector.
func NewWorkloadCaptureCleaner(

let's drop the language of Cleaner here and in the comments above. We do implement the Cleaner interface, but I don't think the caller needs to know much, if anything, about it.

I think we should expose a WorkloadCollector method that configures a DB to use the collector appropriately. For example,

func (w *WorkloadCollector) Attach(opts *pebble.Options) {
    opts.EnsureDefaults()
    // Replace the original Cleaner with the workload collector's implementation,
    // which will invoke the original Cleaner, but only once the collector's copied
    // what it needs.
    w.cleaner, opts.Cleaner = opts.Cleaner, w

    if opts.EventListener != nil {
        opts.EventListener = pebble.TeeEventListener(opts.EventListener, w)
    } else {
        opts.EventListener = w
    }

    w.srcFS = opts.FS

    // ... whatever else ...
}

replay/workload_capture.go line 133 at r7 (raw file):

// deleteFile deletes the specified path and removes the path from the fileState map
func (w *WorkloadCollector) deleteFile(path string) error {
	err := w.configuration.fs.Remove(path)

rather than deleting the file itself, let's make use of that Cleaner interface.


replay/workload_capture.go line 146 at r7 (raw file):

	w.Lock()
	fileState := w.fileState[path]
	if fileState.is(capturedSuccessfully) {

I think if a file is !ready, we also want to remove it. We're relying on Pebble to notify us of files' relevance (through the event listener) before those files are cleaned. I think there are some subtleties, eg, it may make the implementation easier to never remove manifests, in which case we need to ensure the manifest file is marked ready or handle them specially here


replay/workload_capture.go line 198 at r7 (raw file):

		sourceFilepath: info.Path,
	}
	sourceFile, err := w.configuration.fs.Open(info.Path)

can we defer this Open until the separate goroutine runs to avoid introducing I/O within the EventListener call?

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch from 19bf2a9 to 6abca2d Compare November 7, 2022 22:29
Contributor Author

@coolcom200 coolcom200 left a comment


Reviewed 2 of 4 files at r2, 1 of 2 files at r4, 2 of 2 files at r7, all commit messages.
Reviewable status: 2 of 5 files reviewed, 12 unresolved discussions (waiting on @jbowens)


replay/workload_capture.go line 26 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

can we call this CopySSTable or something less generic?

Done.


replay/workload_capture.go line 28 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

this is a bit out of place on the generalized interface. For example, we may want to write workloads to blob storage eventually for use with disaggregated storage.

it looks like this is being used for constructing the manifest files. i think we can probably push that into the interface, eg

type WorkloadStorage interface {
    CopySSTable(srcFS vfs.FS, srcPath string) error
    CreateManifest(name string) (vfs.File, error)
}

since vfs.File itself is an interface, a WorkloadStorage can return its own implementation of the vfs.File that, for example, writes to blob storage.

I think we will also need the directory for StartCollectorFileListener, as we need to take the checkpoint, which requires the output directory.

Following our discussion we decided to make WorkloadStorage create the manifest file and then we handle the copying of it. For checkpointing we will create another method that will handle checkpointing.

Code quote:

OutputDirectory

replay/workload_capture.go line 33 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

these types are becoming a bit of a mouthful. wdyt of renaming WorkloadCollectorFileHandler to WorkloadStorage and DefaultWorkloadCollectorFileHandler to FilesystemWorkloadStorage?

actually, I think we don't really need to export this concrete type. We can make it filesystemWorkloadStorage and make the constructor an exported function that returns the interface type:

func FilesystemWorkloadStorage(destinationPath string) WorkloadStorage {
    return filesystemWorkloadStorage{
        // ...
    }
}

Done.


replay/workload_capture.go line 43 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

if you think it would still be equally understandable (I'd say so), it'd be a little more Go idiomatic to use abbreviated field and variable names here, like dstDir for both

Done.


replay/workload_capture.go line 51 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

let's either move this into the FilesystemWorkloadStorage constructor, or make the onus of ensuring the destination exists the caller's responsibility. we'll be calling this function on every sstable, and the directory will only not exist the first time.

Done.


replay/workload_capture.go line 73 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

nit: manifestState?

Changed to manifestDetails


replay/workload_capture.go line 93 at r7 (raw file):

	fileState      map[string]workloadCaptureState
	filesToProcess []string
	enabled        atomic.Bool

Any ideas why this causes the tests to fail?


replay/workload_capture.go line 111 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

let's drop the language of Cleaner here and in the comments above. We do implement the Cleaner interface, but I don't think the caller needs to know much, if anything, about it.

I think we should expose a WorkloadCollector method that configures a DB to use the collector appropriately. For example,

func (w *WorkloadCollector) Attach(opts *pebble.Options) {
    opts.EnsureDefaults()
    // Replace the original Cleaner with the workload collector's implementation,
    // which will invoke the original Cleaner, but only once the collector's copied
    // what it needs.
    w.cleaner, opts.Cleaner = opts.Cleaner, w

    if opts.EventListener != nil {
        opts.EventListener = pebble.TeeEventListener(opts.EventListener, w)
    } else {
        opts.EventListener = w
    }

    w.srcFS = opts.FS

    // ... whatever else ...
}

Yeah I think you mentioned this before and I somehow didn't change the create method my bad 😬

Also the Attach method is really good idea! Done!


replay/workload_capture.go line 133 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

rather than deleting the file itself, let's make use of that Cleaner interface.

Aligned. Done.


replay/workload_capture.go line 146 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

I think if a file is !ready, we also want to remove it. We're relying on Pebble to notify us of files relevance (through the event listener) before those files are cleaned. I think there are some subtleties, eg, it may make the implementation easier to never remove manifests, in which case we need to ensure the manifest file is marked ready or handle them specially here

Discussed offline and implemented a check to ensure that we do not "clean" files that are needed, but that we do properly clean the files we do not care about, i.e. the !ready files.


replay/workload_capture.go line 198 at r7 (raw file):

Previously, jbowens (Jackson Owens) wrote…

can we defer this Open until the separate goroutine runs to avoid introducing I/O within the EventListener call?

I think we need this Open here as we need to keep a reference to the manifest file, so that we still have access to its contents in the case where the manifest is created and deleted before we have a chance to open it in the separate goroutine. Although the likelihood that we get into this case seems very slim.

Discussed offline and decided that we do need to move the I/O outside of this method, which will require managing the manifest file states to ensure we don't clean them before we are done processing them.

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch 2 times, most recently from 462d72e to f53ddc0 Compare November 9, 2022 19:47
Contributor

@jbowens jbowens left a comment


Reviewed 1 of 3 files at r9, 1 of 2 files at r10.
Reviewable status: 2 of 5 files reviewed, 5 unresolved discussions (waiting on @coolcom200)


replay/workload_capture.go line 76 at r10 (raw file):

	// fileHandler is run in the OnFlushEnd and OnTableIngested which are supposed
	// to be hooked up to the respective EventListener events for TableIngested
	// and FlushEnded.

I think this comment could be reworded now. I also think we should shift from the fileHandler language to something more akin to a thin "storage" layer. That interface is now only responsible for persistence—this WorkloadCollector handles the Pebble logic.


replay/workload_capture.go line 237 at r10 (raw file):

		}
		filesToProcess := w.mu.sstablesToProcess[:]
		w.mu.sstablesToProcess = w.mu.sstablesToProcess[:0]

we should set this to nil, instead of reusing the previous slice's memory. currently both filesToProcess and w.mu.sstablesToProcess will be slices backed by the same underlying array, resulting in a data race.


replay/workload_capture.go line 255 at r10 (raw file):

			} else {
				w.mu.fileState[filepath] |= capturedSuccessfully
				w.mu.Unlock()

unlocking and locking has some overhead. can we do something like this?:

// Copy database mutations
filesToProcess := w.mu.filesToProcess
w.mu.filesToProcess = nil
totalManifests := len(w.mu.manifests)
index := w.mu.manifestIndex

func() {
    w.mu.Unlock()
    defer w.mu.Lock()
    // Copy sstables...
    // Copy manifests...
}()

// Clean up obsolete files.
obsoleteFiles := filesToProcess[:0]
for _, f := range filesToProcess {
    if w.mu.fileState[f].is(obsolete) {
        obsoleteFiles = append(obsoleteFiles, f)
    } else {
        w.mu.fileState[f] |= capturedSuccessfully
    }
}
func() {
    w.mu.Unlock()
    defer w.mu.Lock()
    // Clean all the files in obsoleteFiles
}()

If we make w.mu.manifests's values pointers *manifestDetails, then we don't need synchronization when indexing into w.mu.manifests, since it's append only. Once we know the length of the slice is n, we know the locations < n won't be overwritten. We also won't need synchronization for manifestDetails.{sourceFile,destFile} since only this goroutine ever reads or writes those variables (although we should document them as such).


event.go line 483 at r10 (raw file):

}

func (l *EventListener) IsConfigured() bool {

let's remove IsConfigured() and rely on EnsureDefaults having been called by Options.EnsureDefaults

Contributor Author

@coolcom200 coolcom200 left a comment


Reviewed 2 of 2 files at r8, 1 of 3 files at r9, 2 of 2 files at r10, 1 of 3 files at r11, 1 of 1 files at r12, all commit messages.
Reviewable status: 3 of 5 files reviewed, 4 unresolved discussions (waiting on @jbowens)


replay/workload_capture.go line 93 at r7 (raw file):

Previously, coolcom200 (Leon Fattakhov) wrote…

Any ideas why this causes the tests to fail?

Resolved! Turns out atomic.Bool was only recently introduced and the older versions of go don't support it :(


replay/workload_capture.go line 237 at r10 (raw file):

Previously, jbowens (Jackson Owens) wrote…

we should set this to nil, instead of reusing the previous slice's memory. currently both filesToProcess and w.mu.sstablesToProcess will be slices backed by the same underlying array, resulting in a data race.

Good catch, I thought the w.mu.sstablesToProcess[:] would make a copy. Does it make sense to do an actual copy with copy, or is using nil better?


event.go line 483 at r10 (raw file):

Previously, jbowens (Jackson Owens) wrote…

let's remove IsConfigured() and rely on EnsureDefaults having been called by Options.EnsureDefaults

Removed it in the next revision.

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch 2 times, most recently from 8b0b0b8 to da60d60 Compare November 11, 2022 22:19
Contributor Author

@coolcom200 coolcom200 left a comment


Reviewed 2 of 2 files at r15, 2 of 2 files at r16, 2 of 2 files at r17, 2 of 2 files at r18, 2 of 2 files at r19, 13 of 13 files at r20, 15 of 15 files at r21, 13 of 13 files at r22, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @jbowens)


replay/workload_capture.go line 76 at r10 (raw file):

Previously, jbowens (Jackson Owens) wrote…

I think this comment could be reworded now. I also think we should shift from the fileHandler language to something more akin to a thin "storage" layer. That interface now is only responsible for persistence—this WorkloadCollector handles the Pebble logic.

Done.


replay/workload_capture.go line 255 at r10 (raw file):

Previously, jbowens (Jackson Owens) wrote…

unlocking and locking has some overhead. can we do something like this?:

// Copy database mutations
filesToProcess := w.mu.filesToProcess
w.mu.filesToProcess = nil
totalManifests := len(w.mu.manifests)
index := w.mu.manifestIndex

func() {
    w.mu.Unlock()
    defer w.mu.Lock()
    // Copy sstables...
    // Copy manifests...
}()

// Clean up obsolete files.
obsoleteFiles := filesToProcess[:0]
for _, f := range filesToProcess {
    if w.mu.fileState[f].is(obsolete) {
        obsoleteFiles = append(obsoleteFiles, f)
    } else {
        w.mu.fileState[f] |= capturedSuccessfully
    }
}
func() {
    w.mu.Unlock()
    defer w.mu.Lock()
    // Clean all the files in obsoleteFiles
}()

If we make w.mu.manifests's values pointers *manifestDetails, then we don't need synchronization when indexing into w.mu.manifests, since it's append only. Once we know the length of the slice is n, we know the locations < n won't be overwritten. We also won't need synchronization for manifestDetails.{sourceFile,destFile} since only this goroutine ever reads or writes those variables (although we should document them as such).

Done.

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch from da60d60 to 747b1f2 Compare November 11, 2022 22:27
Contributor

@jbowens jbowens left a comment


looks great! left a couple minor nits, and then there's the minor sequencing change we talked about offline, and then I think we're good to go

Reviewed 3 of 13 files at r20, 1 of 3 files at r27, 1 of 13 files at r28.
Reviewable status: 5 of 18 files reviewed, 4 unresolved discussions (waiting on @coolcom200)


replay/workload_capture.go line 102 at r28 (raw file):

		FlushEnd:        w.OnFlushEnd,
		ManifestCreated: w.OnManifestCreated,
		TableIngested:   w.OnTableIngest,

nit: let's unexport OnFlushEnd, OnManifestCreated and OnTableIngest, since we don't need to expose them to link them up to the event listener.


replay/workload_capture.go line 125 at r28 (raw file):

func (w *WorkloadCollector) setSSTableAsReadyForProcessing(fileNum base.FileNum) {
	filepath := base.MakeFilepath(w.configuration.srcFS, w.configuration.srcDir, base.FileTypeTable, fileNum)
	w.mu.fileState[filepath] |= readyForProcessing

Using the file path as a key is a bit treacherous, given there are many possible string values that logically form the same key. Maybe filepath.Clean-ing the path first to normalize it would be sufficient, but since we have our file numbering scheme readily available, I think we should key by file number. It's a bit annoying that we'll need to parse out the file number in Clean, but it shouldn't be too much trouble.


replay/workload_capture.go line 330 at r28 (raw file):

// StartCollectorFileListener starts a go routine that listens for new files that
// need to be collected.
func (w *WorkloadCollector) StartCollectorFileListener(destFS vfs.FS, destPath string) {

nit: can we name these start and stop methods Start and Stop? The CollectorFileListener is a bit redundant.

Contributor Author

@coolcom200 coolcom200 left a comment


Reviewed 14 of 15 files at r23, 2 of 2 files at r24, 2 of 2 files at r25, 2 of 2 files at r26, 3 of 3 files at r27, 13 of 13 files at r28, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @jbowens)


replay/workload_capture.go line 125 at r28 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Using the file path as a key is a bit treacherous, given there are many possible string values that logically form the same key. Maybe filepath.Clean-ing the path first to normalize it would be sufficient, but since we have our file numbering scheme readily available, I think we should key by file number. It's a bit annoying that we'll need to parse out the file number in Clean, but it shouldn't be too much trouble.

Using base.FileNum as the key can cause a conflict between the SST file number and the manifest file number, right?
I am thinking of using the filename as the key (MANIFEST-00001) instead. Does that sound reasonable?

Contributor

@jbowens jbowens left a comment


Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @coolcom200)


replay/workload_capture.go line 125 at r28 (raw file):

Previously, coolcom200 (Leon Fattakhov) wrote…

Using base.FileNum as the key can cause a conflict between the SST file number and the manifest file number, right?
I am thinking of using the filename as the key (MANIFEST-00001) instead. Does that sound reasonable?

Naw, the manifest and sstables are part of the same file numbering namespace. The filename also works.

@coolcom200 coolcom200 force-pushed the workload-capture-cleaner branch from 747b1f2 to a5c5def Compare November 14, 2022 23:53
@coolcom200 coolcom200 changed the title workload capture: collect tables from workload with custom cleaner replay: collect tables and manifests from workloads Nov 14, 2022
@coolcom200 coolcom200 marked this pull request as ready for review November 15, 2022 14:36
Contributor

@jbowens jbowens left a comment


:lgtm_strong:

Reviewable status: 5 of 18 files reviewed, 1 unresolved discussion (waiting on @coolcom200)

@coolcom200
Contributor Author

coolcom200 commented Nov 15, 2022

TFTRs!

@coolcom200 coolcom200 merged commit 2aa29fe into cockroachdb:master Nov 15, 2022
@coolcom200 coolcom200 deleted the workload-capture-cleaner branch November 15, 2022 16:30