Add log loss benchmark framework by PettitWesley · Pull Request #630 · aws/aws-for-fluent-bit

PettitWesley · 2023-04-10T01:45:08Z

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Signed-off-by: Wesley Pettit <wppttt@amazon.com>

ShelbyZ · 2023-04-10T01:53:54Z

troubleshooting/tools/log-loss-test-framework/logger/log_generator.c

+        expectedTimeElapsed += cycleTimeInMs;
+        actualTimeElapsed = endSeconds - loggingStart;
+
+        if (actualTimeElapsed < expectedTimeElapsed)


There is a reason to recheck that we are under our total messages sent here: to avoid the wait if we have already sent all messages.

Yeah, but then the total runtime won't be correct, since we won't wait after the last cycle, and then the calculation of the real throughput won't be correct.

I tested this a bunch last night and the current state gives the right real throughput
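To illustrate the point about the final wait, here is a minimal sketch of the pacing arithmetic. The function names and the assumption that both elapsed times are in milliseconds are hypothetical and not taken from `log_generator.c`.

```c
#include <assert.h>

/* Hypothetical helper: milliseconds left to sleep so the run stays on
 * schedule. Both arguments are assumed to be milliseconds since logging
 * started. Returns 0 when the cycle already overran its budget. */
long remaining_wait_ms(long expected_elapsed_ms, long actual_elapsed_ms)
{
    long remaining = expected_elapsed_ms - actual_elapsed_ms;
    return remaining > 0 ? remaining : 0;
}

/* Hypothetical helper: real throughput in KB/s over the whole run. */
double real_throughput_kb_per_s(long total_kb, long runtime_ms)
{
    assert(runtime_ms > 0); /* a zero-length run has no defined throughput */
    return (double)total_kb * 1000.0 / (double)runtime_ms;
}
```

For example, if the last cycle writes 1000 KB in 400 ms against a 1000 ms cycle budget, skipping the final wait would make the measured runtime 400 ms and the reported throughput 2500 KB/s instead of the intended 1000 KB/s.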

matthewfala

Left a minor comment on some of the input validation logic. The core code looks good!

matthewfala · 2023-04-12T18:23:41Z

troubleshooting/tools/log-loss-test-framework/logger/log_generator.c

+    if (burst_enabled != NULL) {
+        burstSizeInMB = atoi(burst_enabled);
+        burstThroughputInKb = atoi(getenv("BURST_THROUGHPUT_IN_KB"));
+        if ((burstSizeInMB * 2) > totalSizeInKb) {


Did you mean totalSizeInMb? This says totalSizeInKb.

or maybe burstSizeInMB * 1000 * 2

oops, I thought I had remembered this
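A minimal sketch of the check with both sides in KB, following the `burstSizeInMB * 1000 * 2` suggestion above. The helper name and the `KB_PER_MB` constant are assumptions for illustration; the factor of 1000 matches the conversion used elsewhere in this diff.

```c
#include <assert.h>

#define KB_PER_MB 1000L /* assumed conversion, matching the "* 1000" in the diff */

/* Hypothetical helper: returns 1 when the burst size is at most half of the
 * total size, with both sides of the comparison expressed in KB. */
int burst_size_is_valid(long burst_size_mb, long total_size_kb)
{
    assert(burst_size_mb >= 0 && total_size_kb >= 0);
    return (burst_size_mb * KB_PER_MB * 2) <= total_size_kb;
}
```

With the mixed-unit comparison, a 400 MB burst against a 1000 KB total would pass (800 is not greater than 1000); with consistent units it is rejected.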

matthewfala · 2023-04-12T18:25:25Z

troubleshooting/tools/log-loss-test-framework/logger/log_generator.c

+        burstSizeInMB = atoi(burst_enabled);
+        burstThroughputInKb = atoi(getenv("BURST_THROUGHPUT_IN_KB"));
+        if ((burstSizeInMB * 2) > totalSizeInKb) {
+            printf("ERROR: BURST_SIZE_IN_MB must be less than half of TOTAL_SIZE_IN_MB");


The condition doesn't check what this print statement says. The print statement should say:
ERROR: BURST_SIZE_IN_MB must be greater than half of TOTAL_SIZE_IN_KB

oh oops my math is wrong

matthewfala · 2023-04-12T18:30:37Z

troubleshooting/tools/log-loss-test-framework/logger/log_generator.c

+        burstDelayMessages = (burstDelayInMB * 1000) / sizeInKb;
+        burstMessagesPerCycle = (burstThroughputInKb * cycleTimeInS) / sizeInKb;


This works.
Dimensional analysis
mb * (kb/mb) / (kb/log) = log -- Number of logs per burst delay
(kb/s) * (s) / (kb/log) = log -- Number of logs to output per burst
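The same dimensional analysis as a small sketch with concrete numbers. The function names are hypothetical; the formulas mirror the two lines in the diff, and integer division truncates as it would in the original.

```c
#include <assert.h>

/* MB * (KB/MB) / (KB/log) = logs: number of logs to wait between bursts. */
long burst_delay_messages(long burst_delay_mb, long size_kb)
{
    assert(size_kb > 0); /* avoid dividing by zero */
    return (burst_delay_mb * 1000) / size_kb;
}

/* (KB/s) * s / (KB/log) = logs: number of logs to emit per burst cycle. */
long burst_messages_per_cycle(long burst_throughput_kb, long cycle_time_s,
                              long size_kb)
{
    assert(size_kb > 0); /* avoid dividing by zero */
    return (burst_throughput_kb * cycle_time_s) / size_kb;
}
```

For instance, a 10 MB delay with 2 KB logs gives 5000 logs between bursts, and a 5000 KB/s burst over a 1 s cycle with 2 KB logs gives 2500 logs per cycle.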

Signed-off-by: Wesley Pettit <wppttt@amazon.com>

matthewfala

Thank you, Wesley! I took a look and this looks good! I like how you make use of ECS metadata to get a hold of the correct log stream in the validator. That's really cool, and impressive that it works.

matthewfala · 2023-04-14T21:36:26Z

troubleshooting/tools/log-loss-test-framework/validator/validate.go

+
+func main() {
+	runningInECS := false
+	if (os.Getenv("ECS_CONTAINER_METADATA_URI_V4") != "") {


👏 Very cool

matthewfala · 2023-04-14T21:42:40Z

troubleshooting/tools/log-loss-test-framework/validator/validate.go

+			// First 8 char is the unique record ID
+			recordId := log[:8]
+			cwRecoredCounter += 1
+			if _, ok := inputMap[recordId]; ok {


Just a note: if you run a 10mbps test for 1 day, the keys in this map alone would take 6.912 GB, if my math is correct:
10000 logs/second * 8 bytes/log * 86400 seconds = 6.912 GB. If you see an OOM kill, it may be because of this map.

this is part of why each test run only sends a few hundred MB of data and runs for only a few minutes
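The estimate above can be reproduced with a short sketch. The function name is hypothetical, and the result is only a lower bound, since it counts the 8-byte record IDs and ignores any per-entry overhead of the map itself.

```c
#include <assert.h>

/* Hypothetical helper: bytes needed just for the record ID keys,
 * ignoring map overhead. */
unsigned long long record_id_bytes(unsigned long long logs_per_second,
                                   unsigned long long id_bytes,
                                   unsigned long long seconds)
{
    assert(id_bytes > 0); /* a zero-length ID makes the estimate meaningless */
    return logs_per_second * id_bytes * seconds;
}
```

At 10000 logs/second with 8-byte IDs, one day (86400 s) gives 6,912,000,000 bytes, while a five-minute run (300 s) gives 24,000,000 bytes.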

matthewfala · 2023-04-14T21:50:50Z

troubleshooting/tools/log-loss-test-framework/validator/validate.go

+	// output is comma delimited, output insights query result as CSV
+	// simple python code can parse the CSV
+	fmt.Printf("%s %s - %s, percent lost, %d, number_lost, %d, total_input_record, %d, duplicates, %d, group=%s stream=%s TOTAL_SIZE_IN_MB=%s, SIZE_IN_MB=%s, THROUGHPUT_IN_KB=%s, %s, %s, %s, %s, %s",
+			   testName, hasLogLoss, throughputInKB, (totalInputRecord-uniqueRecordFound)*100/totalInputRecord, totalInputRecord-uniqueRecordFound, totalInputRecord, totalRecordFound - uniqueRecordFound,


This looks good.

I should've put the buffer size in it, but oh well, that's in the test name right now.

matthewfala · 2023-04-14T21:51:13Z

troubleshooting/tools/log-loss-test-framework/logger/log_generator.c

+        burstSizeInMB = atoi(burst_enabled);
+        burstThroughputInKb = atoi(getenv("BURST_THROUGHPUT_IN_KB"));
+        if ((burstSizeInMB * 2) > totalSizeInKb) {
+            printf("ERROR: BURST_SIZE_IN_MB must be less than half of TOTAL_SIZE_IN_MB");


PettitWesley · 2023-06-05T01:37:13Z

Closing in favor of: #670

Add logger for log loss benchmarks

b0d85ff

Signed-off-by: Wesley Pettit <wppttt@amazon.com>

PettitWesley requested a review from a team as a code owner April 10, 2023 01:45

ShelbyZ reviewed Apr 10, 2023

ShelbyZ approved these changes Apr 10, 2023

PettitWesley mentioned this pull request Apr 10, 2023

Driver testing - for awslogs driver #629

Closed

matthewfala reviewed Apr 12, 2023

log-loss-framework: add burst capability to logger

921626f

Signed-off-by: Wesley Pettit <wppttt@amazon.com>

PettitWesley force-pushed the log-loss-framework branch from fa542c2 to 921626f Compare April 12, 2023 20:08

PettitWesley added 3 commits April 14, 2023 13:48

logger: print real throughput to a file for easy checking

2daf021

Signed-off-by: Wesley Pettit <wppttt@amazon.com>

log loss framework: add validator

2ae4ff4

Signed-off-by: Wesley Pettit <wppttt@amazon.com>

log loss framework: add task def

a408ddb

Signed-off-by: Wesley Pettit <wppttt@amazon.com>

PettitWesley changed the title ~~Add logger for log loss benchmarks~~ Add log loss benchmark framework Apr 14, 2023

PettitWesley mentioned this pull request Apr 14, 2023

non-blocking bug in AWSLogs driver code moby/moby#45217

Closed

matthewfala approved these changes Apr 14, 2023

PettitWesley closed this Jun 5, 2023

PettitWesley mentioned this pull request Jun 5, 2023

log-loss-framework: all test materials developed for AWSLogs benchmark #670

Merged
