Skip to content

Fix wp_unique_filename() when converting images, take 2#1591

Closed
azaozz wants to merge 7 commits intoWordPress:masterfrom
azaozz:Fix-wp_unique_filename()-when-converting-images
Closed

Fix wp_unique_filename() when converting images, take 2#1591
azaozz wants to merge 7 commits intoWordPress:masterfrom
azaozz:Fix-wp_unique_filename()-when-converting-images

Conversation

@azaozz
Copy link
Copy Markdown
Contributor

@azaozz azaozz commented Aug 16, 2021

Alt patch for https://core.trac.wordpress.org/ticket/53668 partially based on https://core.trac.wordpress.org/attachment/ticket/53668/53668.2.diff (take 2).

Extends wp_unique_filename() to account for image files that will be converted after uploading. The file names with the original and expected extensions are checked against existing files.

Trac ticket: https://core.trac.wordpress.org/ticket/53668.


// If the extension is uppercase add an alternate file name with lowercase extension. Both need to be tested
// for uniqueness as the extension will be changed to lowercase for better compatibility with different filesystems.
if ( $ext && $lc_ext !== $ext ) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the new filename's extension is always going to be lowercase on exit, have you considered just setting the extension as lowercase early in the function, and then routinely testing both that filename and a version with an uppercase extension for existence later?

That would negate the need for if statements like this, but also cater for the case whereby a lowercase filename is uploaded but files with uppercase extensions exist that were added by some other means that does not run through wp_unique_filename (e.g. FTP).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you considered just setting the extension as lowercase early in the function

Unfortunately $ext is passed as arg in the wp_unique_filename filter at the bottom. Changing it might affect something and would be considered a regression.

files with uppercase extensions exist that were added by some other means

That's not impossible but never seen a bug report about it, and is generally not supported. Files added "by hand" to the upload dirs would not have attachment posts and will be unusable from whitin WP. Perhaps if there's a bug report/ticket it can be fixed there.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately $ext is passed as arg in the wp_unique_filename filter at the bottom. Changing it might affect something and would be considered a regression.

Ah, good shout, I'll fix my patch for that.

That's not impossible but never seen a bug report about it, and is generally not supported.

We've seen this very scenario a bunch of times in support, either via a dump of media from Windows and then sucked in to a gallery plugin that doesn't use the Media Library, or imported into the Media Library from the filesystem by some importer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've seen this very scenario a bunch of times

Did it result in lost/overwritten files? Sure, lets fix it but in another ticket and better not in a dot release.

Hm, actually, wouldn't file_exists() catch possible collisions caused by upper/lower case? It does a filesystem call so thinking it will.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, actually, wouldn't file_exists() catch possible collisions caused by upper/lower case? It does a filesystem call so thinking it will.

No, on a case sensitive filesystem (effectively all Linux servers) file_exists( 'wibble.png' ) will not see a wibble.PNG file.

* @param array $files An array of existing files in the directory. May be empty.
* @return bool True if the tested file name could match an existing file, false otherwise.
*/
function _wp_check_alternate_file_names( $filenames, $dir, $files ) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought a primary reason for proposing an alternate patch to mine was to avoid the introduction of a private helper function? 😄

Copy link
Copy Markdown
Contributor Author

@azaozz azaozz Aug 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe, not primary but yes, avoid if possible. It only makes things a bit more readable, the conditionals can also be in the while() loop.

The primary reason is to not run wp_unique_filename() recursively.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary reason is to not run wp_unique_filename() recursively.

This is where I very much disagree.

wp_unique_filename() should be called recursively to ensure the best method for checking filenames is always used, and to ensure that its filter is fired for each proposed filename, including the alternate format filenames that otherwise only wp_unique_filename() itself would check.

My next path has a test case that explains this better.

Copy link
Copy Markdown
Contributor Author

@azaozz azaozz Aug 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be called recursively...

This introduces a bug/edge case, see https://core.trac.wordpress.org/ticket/53668#comment:25. All alternate filenames should be checked together at the same time or "gaps" in the numbering (i.e. existing files like file-1, file-2, file-4) may result in overwriting.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This introduces a bug/edge case, see https://core.trac.wordpress.org/ticket/53668#comment:25. All alternate filenames should be checked together at the same time or "gaps" in the numbering (i.e. file-1, file-2, file-4) will result in overwriting.

I could be wrong, but I don't believe my code has this bug as it re-checks the original and alternate extensions set should a new version be proposed on exit from the protected loop.

Do you have a test case for this that breaks my latest patch?


// The (resized) image files would have name and extension, and will be in the uploads dir.
if ( $name && $ext && @is_dir( $dir ) && false !== strpos( $dir, $upload_dir['basedir'] ) ) {
if ( $name && $ext && $is_image && @is_dir( $dir ) && false !== strpos( $dir, $upload_dir['basedir'] ) ) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think testing for $is_image here could cause problems. As per the first early test for thumbnail/scaled/rotated to append a version regardless of type to avoid clashes with images that might end up with those suffixes, entering this section to test the newly versioned file name for the existence of files that signify a potential base name clash is important.

It's especially important when file types like PDF and audio could have preview/cover images generated for them, avoiding potential clashes and disassociated filenames by keeping the existing functionality seems worthwhile.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, fixed.

Copy link
Copy Markdown

@ianmjones ianmjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the one minor code comment regarding undefined variables, looking great @azaozz. 🎉

I ran the tests, manually tested a bunch of scenarios, and reviewed the code, passed with flying colours!

@azaozz
Copy link
Copy Markdown
Contributor Author

azaozz commented Aug 24, 2021

Committed in https://core.trac.wordpress.org/changeset/51653.

@azaozz azaozz closed this Aug 24, 2021
@azaozz azaozz deleted the Fix-wp_unique_filename()-when-converting-images branch August 24, 2021 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants