Skip to content

Conversation

@brancz
Copy link
Contributor

@brancz brancz commented Jul 16, 2025

Which issue does this PR close?

Does not yet close, but contributes towards:

Rationale for this change

See the above issues. And this is a follow up to

This was also split out from: #7929

What changes are included in this PR?

This removes the API to allow preserving dict_id set in the Schema's Field within arrow-ipc and arrow-flight. This is in an effort to remove the dict_id field entirely and make it an IPC/flight-only concern.

Are these changes tested?

Yes, all existing tests continue to pass.

Are there any user-facing changes?

Yes, these previously (in 54.0.0) deprecated functions/fields are removed:

  • arrow_ipc::DictionaryTracker.set_dict_id
  • arrow_ipc::DictionaryTracker::new_with_preserve_dict_id
  • arrow_ipc::IpcWriteOptions.with_preserve_dict_id
  • arrow_ipc::IpcWriteOptions.preserve_dict_id (function and field)
  • arrow_ipc::schema_to_fb
  • arrow_ipc::schema_to_bytes

@brancz
Copy link
Contributor Author

brancz commented Jul 16, 2025

@tustvold @alamb @thinkharderdev @adriangb this one should be very uncontroversial and easy to review as it's just following through with the previously agreed upon deprecations.

@brancz brancz force-pushed the rm-preserve-dict-id-api branch from 332ab3f to 510f8d5 Compare July 17, 2025 07:42
@github-actions github-actions bot added parquet Changes to the parquet crate arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate labels Jul 17, 2025
@brancz brancz force-pushed the rm-preserve-dict-id-api branch from 510f8d5 to 411ef8d Compare July 17, 2025 13:05
Copy link
Contributor

@adriangb adriangb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @brancz and @adriangb -- likewise this makes sense to me

cc @thinkharderdev @Dandandan @crepererum as I think your systems also use dict_id maybe in the past

@alamb alamb merged commit 82821e5 into apache:main Jul 18, 2025
28 checks passed
@alamb
Copy link
Contributor

alamb commented Jul 18, 2025

Thanks again @brancz

@brancz brancz deleted the rm-preserve-dict-id-api branch July 19, 2025 07:33
alamb pushed a commit that referenced this pull request Jul 22, 2025
…7968)

# Which issue does this PR close?

Closes #6356

# Rationale for this change

Now that #7940 is merged, nothing
useful can be done with the `dict_id` field, therefore, it is now safe
to be removed from this requirement.

This was also split out from:
#7467

# What changes are included in this PR?

No longer require the `dict_id` fields of two `Field`s of schemas being
merged to be equal, as at this point the `dict_id` is only an IPC
concern, and the fact that it is still in the struct definition is just
legacy, marked for removal, we're just going through the proper
procedure of deprecating and replacing the APIs that use it.

# Are these changes tested?

Tests passing.

# Are there any user-facing changes?

No API changes, just a behavior change, that was to be expected and
desired due to the deprecations around the `dict_id` field.

@alamb @adriangb @tustvold
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants