Skip to content

Conversation

@yiguolei
Copy link
Contributor

No description provided.

}

OLAPStatus FeBasedRowsetIdGenerator::get_next_id(RowsetId* gen_rowset_id) {
std::lock_guard<std::mutex> l(_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::mutex is too heavy for get next id. In most cases, it is a only add operation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


import com.google.common.collect.Maps;

public class RowsetIdGenerator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name it a PersistedIdGernator? I think it can not only be used for RowsetId, but also be used for other id

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it could not. because rowsetid generator could have many implementations, for example, we could scan all rowset id in storage engine at start time and then generate rowset it one by one. In this case it is not persistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Id generator logic, we can use a class.
Then you can let RowsetIdGenerator contain it.

// (start_rowset_id, end_rowset_id) not include start and end
struct TGetRowsetIdResponse {
1: optional i64 start_rowset_id = -1
2: optional i64 end_rowset_id = -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer [start, end) with start_id and id_count

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@yiguolei yiguolei force-pushed the master branch 2 times, most recently from f94cb37 to 7f14f10 Compare August 23, 2019 09:41
@gaodayue
Copy link
Contributor

Can anyone explain what's the purpose of this PR and give a brief description of what's changed?

@yiguolei yiguolei force-pushed the master branch 2 times, most recently from efbc29d to 0d3988a Compare August 27, 2019 02:44
vector<string> error_msgs;
TStatusCode::type status_code = TStatusCode::OK;
return_value.__set_snapshot_version(PREFERRED_SNAPSHOT_VERSION);
int32_t return_snapshot_version = PREFERRED_SNAPSHOT_VERSION;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some comment here to describe the version meaning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


bool rowset_id_in_use(RowsetId& rowset_id) { return _rowset_id_generator->id_in_use(rowset_id); };

void release_rowset_id(RowsetId& rowset_id) { return _rowset_id_generator->release_id(rowset_id); };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some comments here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

std::mutex _lock;
RowsetId _next_id = -1;
RowsetId _id_batch_end = -1;
virtual void release_id(RowsetId& rowset_id) = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment to describe the api

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// generator a id according to data dir
// rowsetid is not globally unique, it is dir level
// it saves the batch end id into meta env
OLAPStatus next_id(RowsetId* rowset_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OLAPStatus next_id(RowsetId* rowset_id);
OLAPStatus next_id(RowsetId* rowset_id) override;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// it saves the batch end id into meta env
OLAPStatus next_id(RowsetId* rowset_id);

bool id_in_use(RowsetId& rowset_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool id_in_use(RowsetId& rowset_id);
bool id_in_use(const RowsetId& rowset_id) override;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

lo = (id_version << 56) + (low & LOW_56_BITS);
}

std::string to_string() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to_string and serialize and deserialize api

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to_string is ok, because it is a string value in pb.


namespace doris {

std::string BetaRowset::segment_file_path(const std::string& dir, RowsetId rowset_id, int segment_id) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::string BetaRowset::segment_file_path(const std::string& dir, RowsetId rowset_id, int segment_id) {
std::string BetaRowset::segment_file_path(const std::string& dir, const RowsetId& rowset_id, const int segment_id) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return _valid_rowset_ids.find(rowset_id) != _valid_rowset_ids.end();
}

void UniqueRowsetIdGenerator::release_id(RowsetId& rowset_id) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void UniqueRowsetIdGenerator::release_id(RowsetId& rowset_id) {
void UniqueRowsetIdGenerator::release_id(const RowsetId& rowset_id) {

}


TEST_F(UniqueRowsetIdGeneratorTest, GenerateIdTest) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a case to test BE restart situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// options
doris::EngineOptions options;
options.store_paths = paths;
options.backend_uid = doris::UniqueId();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is backend_uid act like a random seed that will be generated every time BE started?
If so, I think the name is confused that make people think of it as the ID of a BE?
Maybe random_seed is better?

static const uint16_t OLAP_STRING_MAX_LENGTH = 65535;

static const int32_t PREFERRED_SNAPSHOT_VERSION = 2;
static const int32_t PREFERRED_SNAPSHOT_VERSION = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some comment to explain the version 1 , 2, and 3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

request.setVersion_hash(versionHash);
request.setList_files(true);
request.setPreferred_snapshot_version(2);
request.setPreferred_snapshot_version(3);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can make this a global variable? Its better not to write magic number

@@ -0,0 +1,111 @@
// Licensed to the Apache Software Foundation (ASF) under one
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should add this file to run-ut.sh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

std::unique_ptr<TabletManager> _tablet_manager;
std::unique_ptr<TxnManager> _txn_manager;

RowsetIdGenerator* _rowset_id_generator;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should delete it in destructor or use std::unique_ptr


namespace doris {

UniqueRowsetIdGenerator::UniqueRowsetIdGenerator(UniqueId backend_uid) :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
UniqueRowsetIdGenerator::UniqueRowsetIdGenerator(UniqueId backend_uid) :
UniqueRowsetIdGenerator::UniqueRowsetIdGenerator(const UniqueId& backend_uid) :

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

_backend_uid(backend_uid), _inc_id(1) {
}
// generator a id according to data dir
// rowsetid is not globally unique, it is dir level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rowset id is global unique

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@yiguolei yiguolei force-pushed the master branch 2 times, most recently from 594ee87 to 218e07e Compare August 29, 2019 07:41
@yiguolei
Copy link
Contributor Author

yiguolei commented Sep 2, 2019

#1717

Copy link
Contributor

@imay imay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@chaoyli chaoyli merged commit 6f4feca into apache:master Sep 2, 2019
swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023
swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023
* Fix BE UT failure introduced by apache#1526 in TabletSchemaCache (apache#1661)

* Fix be ut
* Revert "Fix BE UT failure introduced by apache#1526 in TabletSchemaCache (apache#1647)"
  This reverts commit bca5d32.

* (selectdb-cloud) Modify debug checks for detached schema kv (apache#1678)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants