[RFC] Deprecation and alternatives for utf8_encode and utf8_decode #1419

Closed
IMSoP wants to merge 2 commits into encoding-function-improvements from rfc-utf8encode-deprecation

IMSoP (Collaborator) commented Feb 20, 2022

Please see IMSoP#1 for current draft

See also #1418 for improvements which don't relate to the deprecation.

IMSoP added this to the PHP 8.2 milestone Feb 20, 2022
IMSoP force-pushed the rfc-utf8encode-deprecation branch from 0f1766a to b9534bd Compare March 3, 2022 22:14
- Move utf8_encode and utf8_decode into the strings chapter, since
  they were moved out of the XML extension in 7.2
- Recommend mb_convert_encoding, iconv, and UConverter::transcode
  when mentioning encoding in passing
- Document UConverter::transcode, based on examination of source
  and upstream ICU docs
- Make the language used more consistent, e.g. "convert" rather
  than "encode"/"decode", "encoding" rather than "charset"
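
For illustration only, a minimal sketch of how the three alternatives named in the commit message might replace a `utf8_encode` call. The sample string and the `$results` array are hypothetical, and each converter is used only if its extension (mbstring, iconv, intl) happens to be loaded; note that the argument order differs between the functions.

```php
<?php
// "caf\xE9" is "café" encoded as ISO-8859-1 (one byte per character).
$latin1 = "caf\xE9";

$results = [];
if (function_exists('mb_convert_encoding')) {
    // mbstring: target encoding first, then source encoding.
    $results['mbstring'] = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
}
if (function_exists('iconv')) {
    // iconv: source encoding first, then target encoding.
    $results['iconv'] = iconv('ISO-8859-1', 'UTF-8', $latin1);
}
if (class_exists('UConverter')) {
    // intl: target encoding first, then source encoding.
    $results['intl'] = UConverter::transcode($latin1, 'UTF-8', 'ISO-8859-1');
}

foreach ($results as $extension => $utf8) {
    // Print the converted bytes as hex so the result is visible in any terminal.
    echo $extension, ': ', bin2hex($utf8), PHP_EOL;
}
```

Every converter that is available should print the same bytes, `636166c3a9`, which is "café" in UTF-8.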
IMSoP force-pushed the encoding-function-improvements branch from 467041d to 3b98512 Compare March 3, 2022 22:37
IMSoP force-pushed the rfc-utf8encode-deprecation branch from b9534bd to 5c06875 Compare March 3, 2022 22:58
IMSoP force-pushed the rfc-utf8encode-deprecation branch from 5c06875 to dfd441b Compare March 3, 2022 23:07
IMSoP force-pushed the encoding-function-improvements branch from 3b98512 to bb32cd2 Compare March 23, 2022 21:09
<listitem>
<para>

The encoding which <parameter>str</parameter> should be converted to.
Contributor

Suggested change
- The encoding which <parameter>str</parameter> should be converted to.
+ The encoding to which <parameter>str</parameter> should be converted.

Collaborator Author

Reading it back, all of these sentences were unnecessarily torturous. I've come up with a new wording across all three functions, which I think is less wordy and more precise.

Collaborator Author

Actually, that's part of #1418, which contains the changes that can land even if the deprecation doesn't go ahead.

<listitem>
<para>
- The type of encoding that <parameter>string</parameter> is being converted to.
+ The encoding which <parameter>string</parameter> should be converted to.
Contributor

Suggested change
- The encoding which <parameter>string</parameter> should be converted to.
+ The encoding to which <parameter>string</parameter> should be converted.

</note>
</refsect1>

<refsect1 role="changelog">
Contributor

Should the deprecation be mentioned in the changelog, too?

Collaborator Author

Yes, probably. Also, looks like I missed an attribute to make it list as deprecated in indexes.

<note>
<para>
This function does not attempt to guess the current encoding of the provided
string, it assumes it is encoded as ISO-8859-1 (also known as "Latin 1")
Contributor

Suggested change
- string, it assumes it is encoded as ISO-8859-1 (also known as "Latin 1")
+ string. It assumes it is encoded as ISO-8859-1 (also known as "Latin 1")

Collaborator Author

I compromised and used a semi-colon 😜
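
The note under review says the function assumes ISO-8859-1 input rather than detecting the encoding. As a rough sketch of what that assumption means in practice, the following uses `iconv` with an explicit ISO-8859-1 source (an assumption made here to mirror that behaviour, not code from the PR) on a string that is already UTF-8; the variable names are hypothetical.

```php
<?php
// "é" already encoded as UTF-8 (two bytes: C3 A9).
$utf8 = "\xC3\xA9";

// Treating those bytes as ISO-8859-1 converts each byte separately:
// C3 becomes "Ã" (C3 83) and A9 becomes "©" (C2 A9).
$doubleEncoded = iconv('ISO-8859-1', 'UTF-8', $utf8);
echo bin2hex($doubleEncoded), PHP_EOL; // c383c2a9

// Reversing the same assumption restores the original bytes.
$restored = iconv('UTF-8', 'ISO-8859-1', $doubleEncoded);
echo bin2hex($restored), PHP_EOL; // c3a9
```

The first conversion produces the familiar "Ã©" mojibake, which is why naming the source encoding explicitly tends to be safer than relying on an implied one.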

IMSoP changed the base branch from encoding-function-improvements to master April 3, 2022 21:55
IMSoP changed the base branch from master to encoding-function-improvements April 3, 2022 21:55
IMSoP deleted the branch encoding-function-improvements April 3, 2022 21:58
IMSoP closed this Apr 3, 2022
IMSoP deleted the rfc-utf8encode-deprecation branch April 3, 2022 21:58
IMSoP (Collaborator, Author) commented Apr 3, 2022

Sorry, I've made a mess of this; my intention was to compare two branches, but I ended up with two copies of each branch, confusing everything.

Re-opened here for now: IMSoP#1
