-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
Added UpgradeCode to programs table #8587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
|
|
||
| return productCodeUpgradeCodeMap; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generates a map of 334 k/v pairs. Is it ok to store this in memory? Or should we take the more CPU intensive route and iterate through the UpgradeCodes each time to look up a ProductCode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
334 is on your system or any system? That seems like a very reasonable amount of data to have in memory. I'm guessing we are talking about a few kb?
If the number of key value pairs is equal to the number of software items installed, I think it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This number is going to depend on the software installed on the system. And my system is fairly clean, I don't have a ton installed.
Given a UUID is 36 bytes (as a string with hyphens), so my 334 would be 24,048 bytes, so around ~24k.
UUIDs are certainly not cheap to store. : / I am starting to lean towards maybe eating the cost on CPU and iterating the list/performing the substring conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you. Either way I think is reasonable. If the CPU cost is significant I'd lean towards the caching. Even a few MB that stays around for just the life of the query (I think I have that right) would be not a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
If 1 MB is ok, then I'm good with sticking this into memory. 1mb of uuids would be around 27k. Feels like that's a lot more than I would expect any system to have 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to cap the cache and fall back to lookups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size_t totalMemoryUsage = sizeof(productCodeUpgradeCodeMap); // Base size of the map
...
std::transform(productCode.begin(),
productCode.end(),
productCode.begin(),
::toupper);
// Calculate memory usage of the new key-value pair
size_t keyMemory = sizeof(productCode) + productCode.capacity();
size_t valueMemory = sizeof(upgradeCode) + upgradeCode.capacity();
// Insert the key-value pair into the map
productCodeUpgradeCodeMap[productCode] = upgradeCode;
// Update the total memory usage
totalMemoryUsage += sizeof(std::pair<std::string, std::string>); // Pair overhead
totalMemoryUsage += keyMemory + valueMemory;
// Log for now
VLOG(1) << "total memory usage: " << totalMemoryUsage << " bytes";
}
}
}
}
I think that would calculate memory usage properly. I guess we could abort and return an empty map if the cache size is "too big" (what number should we use here). Then in the lookup method we could detect if the map was empty or not, and on empty we could have it use a method for lookup rather than the empty map.
if this is a big concern I can certainly hack something together.
|
We get decent coverage for I did verify this with an MSI installed version of figma Full list: |
47da09d to
fe140e6
Compare
| if (encoded.length() != 32) { | ||
| return "Invalid length"; | ||
| } | ||
| // Microsoft uses a custom encoding for GUIDs in the registry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of this seems to align with the native guid implementation in c++
https://learn.microsoft.com/en-us/windows/win32/api/guiddef/ns-guiddef-guid
typedef struct _GUID {
unsigned long Data1;
unsigned short Data2;
unsigned short Data3;
unsigned char Data4[8];
} GUID;
Data1: Represents the first 8 hexadecimal digits (32 bits).
Data2: Represents the next 4 hexadecimal digits (16 bits).
Data3: Represents the subsequent 4 hexadecimal digits (16 bits).
Data4: Represents the final 12 hexadecimal digits (64 bits), stored as an array of 8 bytes.
zwass
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor changes requested. Thank you!
| std::string decodeMsiRegistryGuid(const std::string& encoded) { | ||
| // Ensure the encoded string is exactly 32 characters long | ||
| if (encoded.length() != 32) { | ||
| return "Invalid length"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to https://github.com/osquery/osquery/pull/8587/files#diff-08a676ce9fafbd67605d303da9b66f83e38ced915d6244222258cf1547b9291cR86-R89, seems like this should return an empty string? And perhaps log at the info or verbose level depending on how common it is?
| std::string str = reverseString(encoded.substr(0, 8)) + "-" + | ||
| reverseString(encoded.substr(8, 4)) + "-" + | ||
| reverseString(encoded.substr(12, 4)) + "-" + | ||
| reverseString(encoded.substr(16, 2)) + | ||
| reverseString(encoded.substr(18, 2)) + "-" + | ||
| reverseString(encoded.substr(20, 2)) + | ||
| reverseString(encoded.substr(22, 2)) + | ||
| reverseString(encoded.substr(24, 2)) + | ||
| reverseString(encoded.substr(26, 2)) + | ||
| reverseString(encoded.substr(28, 2)) + | ||
| reverseString(encoded.substr(30, 2)); | ||
|
|
||
| std::string guid = "{" + str + "}"; | ||
| return guid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combine these lines?
| // It reverses the order of the bytes in the string | ||
| // This 2CCAB6107DB47314AB175756630CCD04 | ||
| // 1. Reverse last 2 characters 04 | ||
| // 2. Reverse next 2 characters CD | ||
| // 3. Reverse next 2 characters 0C | ||
| // 4. Reverse next 2 characters 63 | ||
| // 5. Reverse next 2 characters 56 | ||
| // 6. Reverse next 2 characters 57 | ||
| // 7. Reverse next 2 characters 17 | ||
| // 8. Reverse next 2 characters AB | ||
| // 9. Reverse next 4 characters 7314 | ||
| // 10. Reverse next 4 characters 7DB4 | ||
| // 11. Reverse first 8 characters 2CCAB610 | ||
| // becomes 016BACC2-4BD7-4137-BA71-756536C0DC40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is very helpful. Would also be nice to see a unit test for this.
| } | ||
|
|
||
| return productCodeUpgradeCodeMap; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
334 is on your system or any system? That seems like a very reasonable amount of data to have in memory. I'm guessing we are talking about a few kb?
If the number of key value pairs is equal to the number of software items installed, I think it's fine.
fc49d91 to
9578717
Compare
fef7e49 to
941afa4
Compare
zwass
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ksykulev can you please merge/rebase master? That will fix the mdfind CI issue.
941afa4 to
3c1b983
Compare
zwass
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3c1b983 to
91fcbeb
Compare
zwass
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move forward as-is and address any memory usage issues if they come up. The concern seems minimal in my eyes given the expected size of the cache.
For issue fleetdm/fleet#27759
Related to PR #8585
There are a few identifiers for windows programs.
UpgradeCodeis the unique identifier for MSI applications that is stable between version upgrades. It is different than theidentifying_numbercolumn (which is typicallyProductCode), as this changes between versions.https://learn.microsoft.com/en-us/windows/win32/msi/productcode
https://learn.microsoft.com/en-us/windows/win32/msi/upgradecode