Skip to content

Adds Implementor Data Transfer Objects#28

Merged
JasonTheAdams merged 43 commits intotrunkfrom
feature/data-transfer-objects
Jul 26, 2025
Merged

Adds Implementor Data Transfer Objects#28
JasonTheAdams merged 43 commits intotrunkfrom
feature/data-transfer-objects

Conversation

@JasonTheAdams
Copy link
Copy Markdown
Member

Per the second task of #20, this adds the DTOs for the Implementor portion of the codebase based on the defined architecture.

Note, I purposely have not added the unit tests. I figure we'll want to look this over thoughtfully first to make sure it architecturally aligns with our thinking. Once we're happy with what we're seeing then I'll add all the unit tests.

@JasonTheAdams JasonTheAdams requested a review from felixarntz July 22, 2025 14:50
@JasonTheAdams JasonTheAdams self-assigned this Jul 22, 2025
@felixarntz felixarntz mentioned this pull request Jul 23, 2025
11 tasks
Base automatically changed from feature/unit-tests to trunk July 23, 2025 22:14
Copy link
Copy Markdown
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JasonTheAdams Lots of good stuff in here! Overall this is looking great, though I also have a lot of feedback - simply because this is a lot of code and introducing many paradigms that we need to iron out.

There are probably a few things we'll need to discuss more, so let's just do that within the respective feedback threads below.

Let's take the time to get this right - once we've done that, I think we're set up for the future (e.g. the subsequent PR) to get things done fast.

Comment on lines +67 to +73

/**
* {@inheritDoc}
*
* @since n.e.x.t
*/
public static function getJsonSchema(): array
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that we didn't chat about yet, but I think we should be doing for all DTOs to support these getJsonSchema() methods is to provide easy ways to transform a DTO to and from the raw data conforming to the respective JSON schema.

I'm thinking we should define another interface in Common/Contracts like WithRawDataRepresentation or something like that. It should have methods like:

  • public function toRawData(): array
  • public static function fromRawData(array $data): static

We could also combine it somehow with PHP's native JsonSerializable interface, basically the jsonSerialize() method would be the same as toRawData(). That said, I kind of like having both since you may want to get the raw data representation in some cases without necessarily wanting to immediately JSON encode it - i.e. it's not only relevant in context of JSON data. Alternatively, if y'all feel strongly about not having two methods that do the same thing, we could consider calling fromRawData($data) something like jsonUnserialize($data) - the main thing I'd want to make sure is that it would be clear that method would be the opposite of that other method. If we do the latter, maybe it could be called WithJsonRepresentation? And then that would automatically extend JsonSerializable?

Let's discuss in this thread how to implement this. Once we've agreed on the approach, feel free to add it as part of this PR, or as a separate follow up.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea that we will add in a subsequent PR. 👍

Comment on lines +129 to +130
'type' => ['array', 'null'],
'items' => FunctionDeclaration::getJsonSchema(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this valid? Combing type=null with items seems like a problem to me. Don't we need to use a oneOf instead?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is allowed. JSON Schema keywords that don't apply to the actual type get ignored.

https://json-schema.org/draft/2020-12/json-schema-core#name-assertions-and-instance-pri

Copy link
Copy Markdown
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few additional minor points of feedback, but this is shaping nicely.

The main thing to iron out is how to deal with MIME types (which we're discussing on Slack too), plus address some outstanding feedback regarding the tool DTOs from my previous review.

@JasonTheAdams JasonTheAdams requested a review from felixarntz July 25, 2025 19:45
@JasonTheAdams
Copy link
Copy Markdown
Member Author

Alright, @felixarntz! We've done a lot of great stuff in this PR! Back to you for review. If it looks good then I'll add some unit tests to this PR and get it merged! 🙌

Copy link
Copy Markdown
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JasonTheAdams Looks great, thank you for all the iterations!

A few last follow up comments, but nothing major. Definitely in a good place to add unit tests!

Comment on lines +95 to +109
'properties' => [
'id' => [
'type' => 'string',
'description' => 'The ID of the function call this is responding to.',
],
'name' => [
'type' => 'string',
'description' => 'The name of the function that was called.',
],
'response' => [
'type' => ['string', 'number', 'boolean', 'object', 'array', 'null'],
'description' => 'The response data from the function.',
],
],
'required' => ['id', 'name', 'response'],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my FunctionCall comment regarding id and name nuances, the same consideration applies here too.

Comment on lines +168 to +170
public function getUrl(): ?string
{
return $this->url;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The classic question: Should these be named url or uri?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahah! Good question! Does this work with URIs that are not URLs?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure - happy to leave this as is.

Comment on lines +21 to +33
/**
* Inline file with base64-encoded data.
*
* @var string
*/
public const INLINE = 'inline';

/**
* Remote file referenced by URL.
*
* @var string
*/
public const REMOTE = 'remote';
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to rename these to base64Data and url (or uri, if we change that)? To me those names would be a lot more obvious in what they mean rather than inline and remote.

It would also match the type values with the corresponding other field in the JSON data (which is either base64Data or url), nicely aligned with how that is in MessagePart and Tool.

Last but not least, inline could be a bit too vague for a name, as it doesn't mandate for it to be a base64 encoded file, but that's what it is here.

These ideas are definitely not blockers for this PR. We can also think about this later - as long as there's no release we can rename and break things as much as we want :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of like the names not being quite so in the weeds of what the data literally is. Having it one level higher feels closer to the intent.

What if inline was instead encoded, embedded, or something like that?

That said, I really don't have a strong opinion about this. I think it's close enough where it's hard to know what's "right" here. I suspect more people would mean more opinions on this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, it's simply that inline is not as clear to me as to what it is compared to saying base64. Similarly, remote is not as clear to me as saying url.

I get your point on expressing at a higher level being desirable, but to me here it simply makes it less clear what it means.

But I'm happy to leave this as is for now, and we can reevaluate later.

@JasonTheAdams JasonTheAdams requested a review from felixarntz July 25, 2025 22:36
@JasonTheAdams
Copy link
Copy Markdown
Member Author

Just added the unit tests, @felixarntz! All is passing. Back to you for review!

Copy link
Copy Markdown
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JasonTheAdams Awesome work!

I didn't review all the individual tests, since it's a lot and I trust Claude Code's thoroughness based on a high-level review, especially with all of them passing.

Two small details of feedback, but nothing critical. I'm super excited to see this land! 🎉

* @param string $name The name of the function.
* @param string $description A description of what the function does.
* @param mixed $parameters The JSON schema for the function parameters.
* @param mixed|null $parameters The JSON schema for the function parameters.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't null a subset of mixed?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! I did fix this but forgot to push. I'll fix this in the next PR.

Comment on lines +78 to +90
public function testCreateFromDataUriWithMimeTypeOverride(): void
{
$base64Data = 'SGVsbG8gV29ybGQ=';
$dataUri = 'data:text/plain;base64,' . $base64Data;
$overrideMimeType = 'text/html';

$file = new File($dataUri, $overrideMimeType);

$this->assertEquals(FileTypeEnum::inline(), $file->getFileType());
$this->assertEquals($base64Data, $file->getBase64Data());
$this->assertEquals($overrideMimeType, $file->getMimeType());
$this->assertEquals('data:text/html;base64,' . $base64Data, $file->getDataUri());
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing this now, I wonder whether we should even allow for such an inconsistency. I think it's okay to consider it as an "override MIME type", but it could also be considered an accident that should throw an exception, when the MIME types differ.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like a niche situation. I suspect folks will generally try without providing the MIME type, only providing it if they're giving a Base64 or something that requires it. To take your point from earlier, this feels like the kind of situation where our validation could be interpreted as wrong.

@JasonTheAdams
Copy link
Copy Markdown
Member Author

There are two open comments I'm happy to resolve in a subsequent PR! Merging this bad boy! Thanks for all the great input, folks! 🎉

@JasonTheAdams JasonTheAdams merged commit 840fb5c into trunk Jul 26, 2025
8 checks passed
@JasonTheAdams JasonTheAdams deleted the feature/data-transfer-objects branch July 26, 2025 04:18
@JasonTheAdams JasonTheAdams added this to the Finish the foundation milestone Aug 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants