Skip to content

Conversation

@hangc0276
Copy link
Contributor

Motivation

When use RegionAwareEnsemblePlacementPolicy to select ensemble set, it always select the same region set for all ledger creation.

The reason is that it traverse the availableRegions set in the same order for ensemble selection and select bookies in the same region set, which lead the load not balance between regions.

for (String region: availableRegions) {
        regionsWiseAllocation.put(region, Pair.of(0, 0));
}
...

for (Map.Entry<String, Pair<Integer, Integer>> regionEntry: regionsWiseAllocation.entrySet()) {
    // do selection
}

Changes

  1. introduce regionIndex field to mark next select index for available region list, avoiding traverse from the same position for each ensemble selection.

I will add the test latter.

@hangc0276
Copy link
Contributor Author

rerun failure checks

List<BookieId> ensemble2 = repp.newEnsemble(3, 3, 2,
null, new HashSet<>()).getResult();
ensemble1.retainAll(ensemble2);
assert(ensemble1.size() >= 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we depend on a Random number, is this test going to be flaky ? probably not, but I would like to double check your thoughts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the lastRegionIndex initialized to specific value, all bookkeeper clients will start with the same region, which may lead to hotpot for regions. However those bookkeeper clients create ledger speed won't keep sync, starting will the same region won't make much difference. I changed the initialized value to 0.

@hangc0276
Copy link
Contributor Author

rerun failure checks

1 similar comment
@hangc0276
Copy link
Contributor Author

rerun failure checks

@hangc0276
Copy link
Contributor Author

@eolivelli @merlimat @sijie Please take a look, thanks.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@merlimat merlimat merged commit b824a60 into apache:master May 8, 2021
@merlimat merlimat added this to the 4.14.0 milestone May 8, 2021
merlimat pushed a commit that referenced this pull request May 24, 2021
### Motivation
After add label for prometheus metric by #2650, it will cause prometheus metric format check failed when no label specified for a statsLogger. The metric list as follow.
```
replication_bookkeeper_client_bookkeeper_client_bookie_watcher_NEW_ENSEMBLE_TIME{success="false",quantile="0.9999", } NaN
```

### Modification
1. add label empty check for `PrometheusTextFormatUtil`
2. add label scope check test cover
3. add prometheus metric regex pattern check in test case

Reviewers: lipenghui <penghui@apache.org>, Andrey Yegorov <None>, Matteo Merli <mmerli@apache.org>, Jia Zhai <zhaijia@apache.org>, Addison Higham <None>, Enrico Olivelli <eolivelli@gmail.com>

This closes #2718 from hangc0276/chenhang/fix_bookeeper_metric_bug and squashes the following commits:

8590704 [hangc0276] format code
a6942d4 [chenhang] fix prometheus metric provider bug and add test to cover label scope and metric format check
bb8b1e0 [Andrey Yegorov] Include gradle files into the source artifact for releases, exclude site2/**
732b6cf [Andrey Yegorov] [maven-release-plugin] prepare for next development iteration
b0d9f10 [Andrey Yegorov] [maven-release-plugin] prepare branch branch-4.14
8dc108b [Matteo Merli] y
73e22ca [Don Inghram] ISSUE2620: RocksDB log path configurable
034ef85 [Shoothzj] Fix logger member not correct; (#2605)
b824a60 [hangc0276] fix always select the same region set bug for RegionAwareEnsemblePlacementPolicy (#2658)
683ad45 [Matteo Merli] Allow to attach labels to metrics (#2650)
8091096 [Matteo Merli] Allow to bypass journal for writes (#2401)
63867a9 [Matteo Merli] Impose a memory limit on the bookie journal (#2710)
87579b0 [Matteo Merli] Read entry error should print lastAddConfirmed in the log (#2707)
merlimat pushed a commit that referenced this pull request May 29, 2021
### Motivation
After add label for prometheus metric by #2650, it will cause prometheus metric format check failed when no label specified for a statsLogger. The metric list as follow.
```
replication_bookkeeper_client_bookkeeper_client_bookie_watcher_NEW_ENSEMBLE_TIME{success="false",quantile="0.9999", } NaN
```

### Modification
1. add label empty check for `PrometheusTextFormatUtil`
2. add label scope check test cover
3. add prometheus metric regex pattern check in test case

Reviewers: lipenghui <penghui@apache.org>, Andrey Yegorov <None>, Matteo Merli <mmerli@apache.org>, Jia Zhai <zhaijia@apache.org>, Addison Higham <None>, Enrico Olivelli <eolivelli@gmail.com>

This closes #2718 from hangc0276/chenhang/fix_bookeeper_metric_bug and squashes the following commits:

8590704 [hangc0276] format code
a6942d4 [chenhang] fix prometheus metric provider bug and add test to cover label scope and metric format check
bb8b1e0 [Andrey Yegorov] Include gradle files into the source artifact for releases, exclude site2/**
732b6cf [Andrey Yegorov] [maven-release-plugin] prepare for next development iteration
b0d9f10 [Andrey Yegorov] [maven-release-plugin] prepare branch branch-4.14
8dc108b [Matteo Merli] y
73e22ca [Don Inghram] ISSUE2620: RocksDB log path configurable
034ef85 [Shoothzj] Fix logger member not correct; (#2605)
b824a60 [hangc0276] fix always select the same region set bug for RegionAwareEnsemblePlacementPolicy (#2658)
683ad45 [Matteo Merli] Allow to attach labels to metrics (#2650)
8091096 [Matteo Merli] Allow to bypass journal for writes (#2401)
63867a9 [Matteo Merli] Impose a memory limit on the bookie journal (#2710)
87579b0 [Matteo Merli] Read entry error should print lastAddConfirmed in the log (#2707)
Sunny-Island pushed a commit to Sunny-Island/bookkeeper that referenced this pull request Jun 2, 2021
### Motivation
After add label for prometheus metric by apache#2650, it will cause prometheus metric format check failed when no label specified for a statsLogger. The metric list as follow.
```
replication_bookkeeper_client_bookkeeper_client_bookie_watcher_NEW_ENSEMBLE_TIME{success="false",quantile="0.9999", } NaN
```

### Modification
1. add label empty check for `PrometheusTextFormatUtil`
2. add label scope check test cover
3. add prometheus metric regex pattern check in test case

Reviewers: lipenghui <penghui@apache.org>, Andrey Yegorov <None>, Matteo Merli <mmerli@apache.org>, Jia Zhai <zhaijia@apache.org>, Addison Higham <None>, Enrico Olivelli <eolivelli@gmail.com>

This closes apache#2718 from hangc0276/chenhang/fix_bookeeper_metric_bug and squashes the following commits:

8590704 [hangc0276] format code
a6942d4 [chenhang] fix prometheus metric provider bug and add test to cover label scope and metric format check
bb8b1e0 [Andrey Yegorov] Include gradle files into the source artifact for releases, exclude site2/**
732b6cf [Andrey Yegorov] [maven-release-plugin] prepare for next development iteration
b0d9f10 [Andrey Yegorov] [maven-release-plugin] prepare branch branch-4.14
8dc108b [Matteo Merli] y
73e22ca [Don Inghram] ISSUE2620: RocksDB log path configurable
034ef85 [Shoothzj] Fix logger member not correct; (apache#2605)
b824a60 [hangc0276] fix always select the same region set bug for RegionAwareEnsemblePlacementPolicy (apache#2658)
683ad45 [Matteo Merli] Allow to attach labels to metrics (apache#2650)
8091096 [Matteo Merli] Allow to bypass journal for writes (apache#2401)
63867a9 [Matteo Merli] Impose a memory limit on the bookie journal (apache#2710)
87579b0 [Matteo Merli] Read entry error should print lastAddConfirmed in the log (apache#2707)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants