@gaodayue gaodayue commented Aug 6, 2019

BetaRowsetWriter writes rowsets in the V2 segment format.

This PR contains several interface changes:

  1. Rowset.make_snapshot() is renamed to link_files_to because hard links are also useful in copy tasks, linked schema change, etc.
  2. Rowset.copy_files_to_path() is renamed to copy_files_to to be consistent with other names
  3. RowsetWriter.mem_pool() is removed because not all rowset writers use MemPool
  4. RowsetWriter.garbage_collection() is removed because it's not used by clients
  5. SegmentGroup's make_snapshot() is removed because link_segments_to_path() provides similar functionality
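To illustrate the difference between the two renamed operations, here is a minimal sketch using C++17 `std::filesystem`. `ToyRowset` and its members are hypothetical stand-ins, not the actual Doris classes or signatures; the sketch only assumes that a rowset owns a set of files and that linking and copying place them in a target directory.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for a rowset: just a list of segment file paths.
struct ToyRowset {
    std::vector<fs::path> files;

    // Create a hard link to every file inside `dir`. No data is copied,
    // but source and destination must be on the same filesystem.
    bool link_files_to(const fs::path& dir) const {
        for (const auto& f : files) {
            std::error_code ec;
            fs::create_hard_link(f, dir / f.filename(), ec);
            if (ec) return false;
        }
        return true;
    }

    // Copy every file into `dir`. Works across filesystems, at the cost of I/O.
    bool copy_files_to(const fs::path& dir) const {
        for (const auto& f : files) {
            std::error_code ec;
            fs::copy_file(f, dir / f.filename(), ec);
            if (ec) return false;
        }
        return true;
    }
};
```

A hard link gives the snapshot, copy task, or schema change its own directory entry for the same data, which is why a name tied only to snapshots would be too narrow.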

Future work

  1. implement zonemap for beta rowset writer
  2. collect index size in beta rowset
  3. choose write_mbytes_per_sec based on writer type (load/base compaction/cumulative compaction)


@imay imay left a comment


LGTM

@imay imay merged commit af8256b into apache:master Aug 12, 2019
RowsetSharedPtr build() override;

MemPool* mem_pool() override;
Version version() override { return _rowset_writer_context.version; }
Contributor

Add a const qualifier?

Contributor Author

ok, will fix later


RowsetSharedPtr build() override;

Version version() override { return _context.version; }
Contributor

const

Contributor Author

ok
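
As a small illustration of the `const` request, the sketch below shows why a read-only accessor benefits from the qualifier. All `Toy*` names are hypothetical and simplified, not the real Doris types. In the actual code `version()` is an override, so the base-class declaration would need the same qualifier.

```cpp
#include <cassert>

// Hypothetical stand-ins for Version and the writer context.
struct ToyVersion { long first; long second; };
struct ToyContext { ToyVersion version; };

class ToyRowsetWriter {
public:
    explicit ToyRowsetWriter(const ToyContext& ctx) : _context(ctx) {}

    // const: the accessor does not modify the writer, so it can be
    // called through const references and const pointers.
    ToyVersion version() const { return _context.version; }

private:
    ToyContext _context;
};

// Compiles only because version() is const-qualified.
inline long end_version(const ToyRowsetWriter& writer) {
    return writer.version().second;
}
```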

_rowset_meta));
auto status = rowset->init();
if (status != OLAP_SUCCESS) {
LOG(WARNING) << "rowset init failed when build new rowset";
Contributor

Add the status message to the log.

Contributor Author

Because OLAPStatus is just an enum and has no error-message field, the real reason is usually logged not by the caller but by the function that first encounters the error.
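
The logging pattern described here can be sketched as follows. `ToyStatus`, `open_segment`, and `build_rowset` are hypothetical names invented for illustration, and `std::cerr` stands in for the project's logging macros. The caller in this sketch prints the numeric status code, which is a possible middle ground rather than what this PR does.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical enum-only status, standing in for OLAPStatus.
enum ToyStatus { TOY_SUCCESS = 0, TOY_ERR_FILE_NOT_FOUND = 1 };

// The callee knows the details (which path, what went wrong),
// so the detailed message is logged here, at the point of failure.
ToyStatus open_segment(const std::string& path) {
    if (path.empty()) {
        std::cerr << "open_segment failed: empty path\n";
        return TOY_ERR_FILE_NOT_FOUND;
    }
    return TOY_SUCCESS;
}

// The caller only sees the enum value. It can add context and the
// numeric code, but it cannot recover the underlying cause.
ToyStatus build_rowset(const std::string& path) {
    ToyStatus status = open_segment(path);
    if (status != TOY_SUCCESS) {
        std::cerr << "rowset init failed when building new rowset, status="
                  << static_cast<int>(status) << "\n";
        return status;
    }
    return TOY_SUCCESS;
}
```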

OLAPStatus BetaRowsetWriter::_create_segment_writer() {
auto path = BetaRowset::segment_file_path(_context.rowset_path_prefix, _context.rowset_id, _num_segment);
segment_v2::SegmentWriterOptions writer_options;
_segment_writer.reset(new segment_v2::SegmentWriter(path, _num_segment, _context.tablet_schema, writer_options));
Contributor

I think SegmentWriter could add another constructor that takes no SegmentWriterOptions argument and uses the default SegmentWriterOptions.

Contributor Author

yeah, it can be optimized in the future
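
One way the suggested convenience constructor could look is sketched below, using a delegating constructor. `ToySegmentWriter` and `ToyWriterOptions` are hypothetical, and the option field and its default value are invented for illustration; the real `segment_v2::SegmentWriter` signature is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical options struct; the field and its default are assumptions.
struct ToyWriterOptions {
    uint32_t num_rows_per_block = 1024;
};

class ToySegmentWriter {
public:
    // Full constructor: the caller supplies the options explicitly.
    ToySegmentWriter(std::string path, uint32_t segment_id, const ToyWriterOptions& opts)
        : _path(std::move(path)), _segment_id(segment_id), _opts(opts) {}

    // Convenience overload: delegates with default-constructed options.
    ToySegmentWriter(std::string path, uint32_t segment_id)
        : ToySegmentWriter(std::move(path), segment_id, ToyWriterOptions()) {}

    const std::string& path() const { return _path; }
    uint32_t segment_id() const { return _segment_id; }
    const ToyWriterOptions& options() const { return _opts; }

private:
    std::string _path;
    uint32_t _segment_id;
    ToyWriterOptions _opts;
};
```

With such an overload, a call site that only needs defaults would no longer have to declare a local options object before constructing the writer.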

swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023