Skip to content

Add log loss benchmark framework#630

Closed
PettitWesley wants to merge 5 commits intoaws:mainlinefrom
PettitWesley:log-loss-framework
Closed

Add log loss benchmark framework#630
PettitWesley wants to merge 5 commits intoaws:mainlinefrom
PettitWesley:log-loss-framework

Conversation

@PettitWesley
Copy link
Copy Markdown
Contributor

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Signed-off-by: Wesley Pettit <wppttt@amazon.com>
@PettitWesley PettitWesley requested a review from a team as a code owner April 10, 2023 01:45
expectedTimeElapsed += cycleTimeInMs;
actualTimeElapsed = endSeconds - loggingStart;

if (actualTimeElapsed < expectedTimeElapsed)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a reason to recheck that we are under our total messages sent here, to avoid the wait of we have already sent all messages

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea but then the total runtime won't be correct... sicne we won't wait after the last cycle and then the calculation of the real throughput won't be correct.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this a bunch last night and the current state gives the right real throughput

Copy link
Copy Markdown
Contributor

@matthewfala matthewfala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a minor comment on some of the input validation logic. The core code looks good!

if (burst_enabled != NULL) {
burstSizeInMB = atoi(burst_enabled);
burstThroughputInKb = atoi(getenv("BURST_THROUGHPUT_IN_KB"));
if ((burstSizeInMB * 2) > totalSizeInKb) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean totalSizeInMb, this says totalSizeInKb

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or maybe burstSizeInMB * 1000 * 2

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, I thought I had remembered this

burstSizeInMB = atoi(burst_enabled);
burstThroughputInKb = atoi(getenv("BURST_THROUGHPUT_IN_KB"));
if ((burstSizeInMB * 2) > totalSizeInKb) {
printf("ERROR: BURST_SIZE_IN_MB must be less than half of TOTAL_SIZE_IN_MB");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition doesn't check what this print statement says. The print statement should say:
ERROR: BURST_SIZE_IN_MB must be greater than half of TOTAL_SIZE_IN_KB

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh oops my math is wrong

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

Comment on lines +87 to +88
burstDelayMessages = (burstDelayInMB * 1000) / sizeInKb;
burstMessagesPerCycle = (burstThroughputInKb * cycleTimeInS) / sizeInKb;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works.
Dimensional analysis
mb * (kb/mb) / (kb/log) = log -- Number of logs per burst delay
(kb/s) * (s) / (kb/log) = log -- Number of logs to output per burst

Signed-off-by: Wesley Pettit <wppttt@amazon.com>
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
Signed-off-by: Wesley Pettit <wppttt@amazon.com>
@PettitWesley PettitWesley changed the title Add logger for log loss benchmarks Add log loss benchmark framework Apr 14, 2023
Copy link
Copy Markdown
Contributor

@matthewfala matthewfala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Wesley! I took a look and this looks good! I like how you make use of ECS metadata to get a hold of the correct log stream in the validator. That's really cool, and impressive that it works.


func main() {
runningInECS := false
if (os.Getenv("ECS_CONTAINER_METADATA_URI_V4") != "") {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 Very cool

// First 8 char is the unique record ID
recordId := log[:8]
cwRecoredCounter += 1
if _, ok := inputMap[recordId]; ok {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note: if you run a 10mbps test for 1 day this map would be 6.91200GB if my math is correct:
10000 logs/seconds * 8 bytes/log * 86400 seconds = 6.912Gb. If you see oom kill it may be because of this map.

Copy link
Copy Markdown
Contributor Author

@PettitWesley PettitWesley Apr 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is part of why each test run only sends a few hundred MB of data and runs for only a few minutes

// output is comma delimited, output insights query result as CSV
// simple python code can parse the CSV
fmt.Printf("%s %s - %s, percent lost, %d, number_lost, %d, total_input_record, %d, duplicates, %d, group=%s stream=%s TOTAL_SIZE_IN_MB=%s, SIZE_IN_MB=%s, THROUGHPUT_IN_KB=%s, %s, %s, %s, %s, %s",
testName, hasLogLoss, throughputInKB, (totalInputRecord-uniqueRecordFound)*100/totalInputRecord, totalInputRecord-uniqueRecordFound, totalInputRecord, totalRecordFound - uniqueRecordFound,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should've put the buffer size in it, but oh well, thats in the test name right now

burstSizeInMB = atoi(burst_enabled);
burstThroughputInKb = atoi(getenv("BURST_THROUGHPUT_IN_KB"));
if ((burstSizeInMB * 2) > totalSizeInKb) {
printf("ERROR: BURST_SIZE_IN_MB must be less than half of TOTAL_SIZE_IN_MB");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

@PettitWesley
Copy link
Copy Markdown
Contributor Author

Closing in favor of: #670

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants