Skip to content

Conversation

@zwass
Copy link
Member

@zwass zwass commented Jul 10, 2025

@zwass zwass requested a review from a team as a code owner July 10, 2025 02:17
@zwass zwass added the macOS label Jul 10, 2025
@zwass zwass requested a review from a team as a code owner July 10, 2025 02:17
@zwass zwass requested a review from Copilot July 29, 2025 18:42

This comment was marked as outdated.

} @catch (NSException* exception) {
LOG(WARNING) << "Exception in convertForJSON: "
<< [[exception reason] UTF8String];
return @"[Error converting object]";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is returning the error as a string? Does this make it hard to tell what's an error vs a string that happens to match?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'm not certain what would be the best UX but this feels like a reasonable compromise.

Other potential options:

  1. Log and return an empty string or null (could be harder to interpret query results as it wouldn't be clear what was actually empty/null vs error converting)
  2. Fail the entire query

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure either. I agree this might be in the least-bad range of things

namespace osquery {
namespace tables {

id convertForJSON(id obj) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naive question, but does NSJSONSerialization make this any easier?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does and it's actually used below but per the docs there is a requirement that

All objects are instances of NSString, NSNumber, NSArray, NSDictionary, or NSNull.

This function converts other object types that were encountered in testing to types that can be serialized by NSJSONSerialization.

description("Query system_profiler data types and return the full result as JSON. Returns only the data types specified in the constraints. See available data types with `system_profiler -listDataTypes`.")
schema([
Column("data_type", TEXT, "The system profiler data type (e.g., SPHardwareDataType)", index=True, required=True),
Column("value", TEXT, "A JSON representation of the full result dictionary for the data type"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know some of the system profiler data blobs are huge. Do you think there's an issue with passing them through as one giant json blob? with lots of extracts? Is this going to end up inadvertently calling it multiple times?

(I honestly don't know)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question. My initial reaction is that I sure hope SQLite knows to only generate the table once and then do the extracts on the already in-memory data.

I dug in and I believe it does do what we would hope for here.

osquery> explain SELECT json_extract(value, '$.foo'), json_extract(value, '$.bar'), json_extract(value, '$.baz') FROM system_profiler WHERE data_type = "SPParallelATADataType";
+------+-------------+----+----+-----+-----------------------+----+------------------------------+
| addr | opcode      | p1 | p2 | p3  | p4                    | p5 | comment                      |
+------+-------------+----+----+-----+-----------------------+----+------------------------------+
| 0    | Init        | 0  | 17 | 0   |                       | 0  | Start at 17                  |
| 1    | VOpen       | 0  | 0  | 0   | vtab:60000104BAB0     | 0  |                              |
| 2    | String8     | 0  | 3  | 0   | SPParallelATADataType | 0  | r[3]='SPParallelATADataType' |
| 3    | Integer     | 11 | 1  | 0   |                       | 0  | r[1]=11                      |
| 4    | Integer     | 1  | 2  | 0   |                       | 0  | r[2]=1                       |
| 5    | VFilter     | 0  | 16 | 1   |                       | 0  | iplan=r[1] zplan=''          |
| 6    | VColumn     | 0  | 0  | 4   |                       | 0  | r[4]=vcolumn(0)              |
| 7    | Ne          | 5  | 15 | 4   | BINARY-8              | 82 | if r[4]!=r[5] goto 15        |
| 8    | VColumn     | 0  | 1  | 9   |                       | 0  | r[9]=vcolumn(1)              |
| 9    | Function    | 2  | 9  | 6   | json_extract(-1)      | 0  | r[6]=func(r[9..10])          |
| 10   | VColumn     | 0  | 1  | 11  |                       | 0  | r[11]=vcolumn(1)             |
| 11   | Function    | 2  | 11 | 7   | json_extract(-1)      | 0  | r[7]=func(r[11..12])         |
| 12   | VColumn     | 0  | 1  | 13  |                       | 0  | r[13]=vcolumn(1)             |
| 13   | Function    | 2  | 13 | 8   | json_extract(-1)      | 0  | r[8]=func(r[13..14])         |
| 14   | ResultRow   | 6  | 3  | 0   |                       | 0  | output=r[6..8]               |
| 15   | VNext       | 0  | 6  | 0   |                       | 0  |                              |
| 16   | Halt        | 0  | 0  | 0   |                       | 0  |                              |
| 17   | Transaction | 1  | 0  | 185 | 0                     | 1  | usesStmtJournal=0            |
| 18   | String8     | 0  | 5  | 0   | SPParallelATADataType | 0  | r[5]='SPParallelATADataType' |
| 19   | String8     | 0  | 10 | 0   | $.foo                 | 0  | r[10]='$.foo'                |
| 20   | String8     | 0  | 12 | 0   | $.bar                 | 0  | r[12]='$.bar'                |
| 21   | String8     | 0  | 14 | 0   | $.baz                 | 0  | r[14]='$.baz'                |
| 22   | Goto        | 0  | 1  | 0   |                       | 0  |                              |
+------+-------------+----+----+-----+-----------------------+----+------------------------------+

My (AI-assisted) interpretation of this is that each value is generated once, and then passed in as arguments to the json_extract functions.

To double-check, I added logging and validated that the generate function is only called once for each data_type provided in the query.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know there are all kinds of weird edges around how joins work, and sqlite sometimes does call generate more than once. But, probably, we can ignore those edges here. If this meets the common case, that's great.

Copy link
Member Author

@zwass zwass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While investigating your questions I noticed that the implementation actually returns a bunch of data that seems to be intended to set rendering options for use in the system profiler application. I've updated it to only include the _items array which matches the actual data shown in system profiler.

} @catch (NSException* exception) {
LOG(WARNING) << "Exception in convertForJSON: "
<< [[exception reason] UTF8String];
return @"[Error converting object]";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'm not certain what would be the best UX but this feels like a reasonable compromise.

Other potential options:

  1. Log and return an empty string or null (could be harder to interpret query results as it wouldn't be clear what was actually empty/null vs error converting)
  2. Fail the entire query

description("Query system_profiler data types and return the full result as JSON. Returns only the data types specified in the constraints. See available data types with `system_profiler -listDataTypes`.")
schema([
Column("data_type", TEXT, "The system profiler data type (e.g., SPHardwareDataType)", index=True, required=True),
Column("value", TEXT, "A JSON representation of the full result dictionary for the data type"),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question. My initial reaction is that I sure hope SQLite knows to only generate the table once and then do the extracts on the already in-memory data.

I dug in and I believe it does do what we would hope for here.

osquery> explain SELECT json_extract(value, '$.foo'), json_extract(value, '$.bar'), json_extract(value, '$.baz') FROM system_profiler WHERE data_type = "SPParallelATADataType";
+------+-------------+----+----+-----+-----------------------+----+------------------------------+
| addr | opcode      | p1 | p2 | p3  | p4                    | p5 | comment                      |
+------+-------------+----+----+-----+-----------------------+----+------------------------------+
| 0    | Init        | 0  | 17 | 0   |                       | 0  | Start at 17                  |
| 1    | VOpen       | 0  | 0  | 0   | vtab:60000104BAB0     | 0  |                              |
| 2    | String8     | 0  | 3  | 0   | SPParallelATADataType | 0  | r[3]='SPParallelATADataType' |
| 3    | Integer     | 11 | 1  | 0   |                       | 0  | r[1]=11                      |
| 4    | Integer     | 1  | 2  | 0   |                       | 0  | r[2]=1                       |
| 5    | VFilter     | 0  | 16 | 1   |                       | 0  | iplan=r[1] zplan=''          |
| 6    | VColumn     | 0  | 0  | 4   |                       | 0  | r[4]=vcolumn(0)              |
| 7    | Ne          | 5  | 15 | 4   | BINARY-8              | 82 | if r[4]!=r[5] goto 15        |
| 8    | VColumn     | 0  | 1  | 9   |                       | 0  | r[9]=vcolumn(1)              |
| 9    | Function    | 2  | 9  | 6   | json_extract(-1)      | 0  | r[6]=func(r[9..10])          |
| 10   | VColumn     | 0  | 1  | 11  |                       | 0  | r[11]=vcolumn(1)             |
| 11   | Function    | 2  | 11 | 7   | json_extract(-1)      | 0  | r[7]=func(r[11..12])         |
| 12   | VColumn     | 0  | 1  | 13  |                       | 0  | r[13]=vcolumn(1)             |
| 13   | Function    | 2  | 13 | 8   | json_extract(-1)      | 0  | r[8]=func(r[13..14])         |
| 14   | ResultRow   | 6  | 3  | 0   |                       | 0  | output=r[6..8]               |
| 15   | VNext       | 0  | 6  | 0   |                       | 0  |                              |
| 16   | Halt        | 0  | 0  | 0   |                       | 0  |                              |
| 17   | Transaction | 1  | 0  | 185 | 0                     | 1  | usesStmtJournal=0            |
| 18   | String8     | 0  | 5  | 0   | SPParallelATADataType | 0  | r[5]='SPParallelATADataType' |
| 19   | String8     | 0  | 10 | 0   | $.foo                 | 0  | r[10]='$.foo'                |
| 20   | String8     | 0  | 12 | 0   | $.bar                 | 0  | r[12]='$.bar'                |
| 21   | String8     | 0  | 14 | 0   | $.baz                 | 0  | r[14]='$.baz'                |
| 22   | Goto        | 0  | 1  | 0   |                       | 0  |                              |
+------+-------------+----+----+-----+-----------------------+----+------------------------------+

My (AI-assisted) interpretation of this is that each value is generated once, and then passed in as arguments to the json_extract functions.

To double-check, I added logging and validated that the generate function is only called once for each data_type provided in the query.

@zwass zwass force-pushed the macos-system-profiler-table branch from 6dc5a5c to 31e3abe Compare July 30, 2025 23:26
@zwass zwass requested a review from Copilot July 30, 2025 23:43

This comment was marked as outdated.

@directionless
Copy link
Member

While investigating your questions I noticed that the implementation actually returns a bunch of data that seems to be intended to set rendering options for use in the system profiler application. I've updated it to only include the _items array which matches the actual data shown in system profiler.

I've seen that before, and always been unsure if it was neat or noise. But really, I'm not going to write a UI osquery 😆

Copy link
Member

@directionless directionless left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of those copilot warnings looks real. But I'd thumb this if you disagree with it

} @catch (NSException* exception) {
LOG(WARNING) << "Exception in convertForJSON: "
<< [[exception reason] UTF8String];
return @"[Error converting object]";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure either. I agree this might be in the least-bad range of things

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Aug 5, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@zwass zwass force-pushed the macos-system-profiler-table branch from 94720c8 to 51eced1 Compare August 5, 2025 18:29
@zwass zwass requested a review from Copilot August 5, 2025 18:42
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new system_profiler table for macOS that allows querying macOS system profiler data types and returning the results as JSON. The implementation leverages the existing system_profiler command-line tool to gather system information.

Key changes:

  • Adds system_profiler table implementation with JSON conversion utilities
  • Includes comprehensive integration tests for the new table
  • Provides table specification with proper constraint handling

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
osquery/tables/system/darwin/system_profiler.mm Core implementation with Objective-C utilities for JSON conversion and system profiler data retrieval
specs/darwin/system_profiler.table Table specification defining schema, constraints, and usage examples
tests/integration/tables/system_profiler.cpp Integration tests covering various query scenarios and constraint usage
tests/integration/tables/CMakeLists.txt Adds system_profiler test to build configuration
specs/CMakeLists.txt Registers the new table in the build system
osquery/tables/system/CMakeLists.txt Includes the implementation file in the build

Comment on lines 118 to 121
if (![items isKindOfClass:[NSDictionary class]]) {
LOG(WARNING) << "System profiler report items is not a dictionary for "
<< dataType;
continue;
}

Row r;
r["data_type"] = dataType;
r["value"] = nsDictionaryToJson((NSDictionary*)items);
results.push_back(r);
Copy link

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check assumes _items should be a NSDictionary, but system_profiler typically returns _items as an NSArray containing dictionaries. This will likely cause most queries to fail with a warning and return no data.

Suggested change
if (![items isKindOfClass:[NSDictionary class]]) {
LOG(WARNING) << "System profiler report items is not a dictionary for "
<< dataType;
continue;
}
Row r;
r["data_type"] = dataType;
r["value"] = nsDictionaryToJson((NSDictionary*)items);
results.push_back(r);
if ([items isKindOfClass:[NSArray class]]) {
for (id item in (NSArray*)items) {
if (![item isKindOfClass:[NSDictionary class]]) {
LOG(WARNING) << "System profiler item is not a dictionary for " << dataType;
continue;
}
Row r;
r["data_type"] = dataType;
r["value"] = nsDictionaryToJson((NSDictionary*)item);
results.push_back(r);
}
} else if ([items isKindOfClass:[NSDictionary class]]) {
Row r;
r["data_type"] = dataType;
r["value"] = nsDictionaryToJson((NSDictionary*)items);
results.push_back(r);
} else {
LOG(WARNING) << "System profiler report items is not an array or dictionary for " << dataType;
continue;
}

Copilot uses AI. Check for mistakes.
Comment on lines 118 to 125
if (![items isKindOfClass:[NSDictionary class]]) {
LOG(WARNING) << "System profiler report items is not a dictionary for "
<< dataType;
continue;
}

Row r;
r["data_type"] = dataType;
r["value"] = nsDictionaryToJson((NSDictionary*)items);
Copy link

Copilot AI Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line casts items to NSDictionary after the previous check, but if _items is actually an NSArray (which is typical for system_profiler output), this cast will result in incorrect JSON serialization.

Suggested change
if (![items isKindOfClass:[NSDictionary class]]) {
LOG(WARNING) << "System profiler report items is not a dictionary for "
<< dataType;
continue;
}
Row r;
r["data_type"] = dataType;
r["value"] = nsDictionaryToJson((NSDictionary*)items);
if (![items isKindOfClass:[NSArray class]] && ![items isKindOfClass:[NSDictionary class]]) {
LOG(WARNING) << "System profiler report items is not an array or dictionary for "
<< dataType;
continue;
}
Row r;
r["data_type"] = dataType;
r["value"] = nsObjectToJson(items);

Copilot uses AI. Check for mistakes.
@zwass zwass force-pushed the macos-system-profiler-table branch from 51eced1 to 25bad69 Compare August 5, 2025 19:06
@zwass zwass force-pushed the macos-system-profiler-table branch from ee98d5c to f0d8182 Compare August 6, 2025 00:33
@zwass zwass requested a review from directionless August 6, 2025 02:30
@zwass zwass changed the title Add system_profiler table for macOS Add system_profiler table for macOS Aug 7, 2025
@zwass zwass merged commit bc67e7d into osquery:master Aug 12, 2025
22 checks passed
@zwass zwass deleted the macos-system-profiler-table branch August 12, 2025 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants