
Add skip_image_resolution to deduplicate multi-resolution dataset#2273

Merged
kohya-ss merged 5 commits into kohya-ss:sd3 from woct0rdho:min-max-orig-reso
Mar 18, 2026
Conversation

@woct0rdho
Contributor

This PR is an alternative to #2270 .

I propose to add a dataset property min_orig_resolution, so we can write a multi-resolution dataset config like

[general]
bucket_no_upscale = true

[[datasets]]
resolution = 768
[[datasets.subsets]]
image_dir = 'path/to/image/dir'

[[datasets]]
resolution = 1024
min_orig_resolution = 768
[[datasets.subsets]]
image_dir = 'path/to/image/dir'

[[datasets]]
resolution = 1280
min_orig_resolution = 1024
[[datasets.subsets]]
image_dir = 'path/to/image/dir'
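Under this config (with bucket_no_upscale), the intended deduplication rule is: an image joins a dataset only if its original area exceeds the square of that dataset's min_orig_resolution. A minimal illustration of that rule (a hypothetical helper, not the PR's actual code):

```python
def datasets_for_image(width, height, datasets):
    """Return the training resolutions whose min_orig_resolution filter
    the image passes. `datasets` is a list of (resolution, min_orig_resolution)
    pairs; an image is skipped when its area is at or below min_res**2.
    Hypothetical helper illustrating the config above, not the PR's code."""
    area = width * height
    return [
        res
        for res, min_res in datasets
        if min_res is None or area > min_res * min_res
    ]

# The three datasets from the config above:
datasets = [(768, None), (1024, 768), (1280, 1024)]
```

For example, a 600x600 image would land only in the 768 dataset, while a 900x900 image would land in the 768 and 1024 datasets but be skipped by the 1280 one.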

I've also added max_orig_resolution because it looks natural to have one.

We filter the images by their original resolutions in BaseDataset.make_buckets, and update num_train_images and num_reg_images. For DreamBoothDataset, we rebalance the number of regularization images after the filter. For ControlNetDataset, we check missing conditioning images after the filter, and ignore extra conditioning images.

There is no overhead if the user does not set min_orig_resolution and max_orig_resolution.
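The regularization rebalance described above can be sketched roughly as follows (a simplified illustration, not the PR's implementation, which also accounts for per-subset repeat counts):

```python
import math

def rebalance_reg_images(num_train_images, reg_img_paths):
    """Repeat the (filtered) regularization image list so it covers the
    surviving training images, cycling through the list as needed.
    Simplified sketch of the rebalance described above, not the PR's code."""
    if not reg_img_paths:
        return []
    repeats = math.ceil(num_train_images / len(reg_img_paths))
    return (reg_img_paths * repeats)[:num_train_images]
```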

@kohya-ss
Owner

Thank you for this PR!

However, this option seems a bit complicated and confusing. Please tell me why #2270's skip_image_resolution is not enough.

@woct0rdho
Contributor Author

woct0rdho commented Feb 20, 2026

min_orig_resolution is exactly your skip_image_resolution, but I renamed it because I think min_orig_resolution is more self-explanatory.

I can rename it back to skip_image_resolution and remove max_orig_resolution if you think that's better.

The code is indeed more complicated than I first thought, but this is the best way I (and the AI tools I use) could find to implement:

  1. Filtering by original resolution, which can only be done once the original resolutions are known in make_buckets
  2. Making regularization images work correctly with the filter
  3. Making conditioning images work correctly with the filter

@kohya-ss
Owner

Thanks for the explanation, I understand now.

skip_image_resolution explicitly states that images of that resolution will not be included, but I don't think min_orig_resolution explicitly states whether they will be included or not.

I'll try to find out if there's a simpler way to implement this.

@woct0rdho woct0rdho changed the title Add min_orig_resolution and max_orig_resolution to deduplicate multi-resolution dataset Add skip_image_resolution to deduplicate multi-resolution dataset Feb 20, 2026
@kohya-ss
Owner

kohya-ss commented Feb 22, 2026

Thank you for the update!

I think we could simply filter images with the following code.
Note that skip_image_resolution should be a tuple, just like resolution.

                            size_set_count += 1
                    logger.info(f"set image size from cache files: {size_set_count}/{len(img_paths)}")

            # from here
            if self.skip_image_resolution is not None:
                filtered_img_paths = []
                filtered_sizes = []
                skip_image_area = self.skip_image_resolution[0] * self.skip_image_resolution[1]
                for img_path, size in zip(img_paths, sizes):
                    if size is None:  # no latents cache file, get image size by reading image file (slow)
                        size = self.get_image_size(img_path)
                    if size[0] * size[1] <= skip_image_area:
                        continue
                    filtered_img_paths.append(img_path)
                    filtered_sizes.append(size)
                img_paths = filtered_img_paths
                sizes = filtered_sizes
                # add some logging here
            # to here

            # We want to create a training and validation split. This should be improved in the future
            # to allow a clearer distinction between training and validation. This can be seen as a

In FineTuningDataset, we can use the image size from the metadata.
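For FineTuningDataset, the same skip test could be applied against sizes stored in the metadata rather than read from the image files. A rough sketch (the "width"/"height" keys are hypothetical for illustration, not the actual metadata layout):

```python
def filter_by_skip_resolution(metadata, skip_image_resolution):
    """Drop metadata entries whose original area is at or below the skip
    threshold. `metadata` maps image keys to dicts that (in this sketch)
    store the original size under hypothetical "width"/"height" keys."""
    if skip_image_resolution is None:
        return metadata  # no filtering requested, no overhead
    skip_area = skip_image_resolution[0] * skip_image_resolution[1]
    return {
        key: entry
        for key, entry in metadata.items()
        if entry["width"] * entry["height"] > skip_area
    }
```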

@woct0rdho
Contributor Author

Yes, this makes the PR simpler. I've moved the filtering from make_buckets to __init__.

@kohya-ss
Owner

Thank you for the update! I will create a test dataset and review/test this soon.

@kohya-ss kohya-ss changed the base branch from main to sd3 March 18, 2026 23:43
@kohya-ss kohya-ss merged commit 1cd95b2 into kohya-ss:sd3 Mar 18, 2026
3 checks passed
@woct0rdho woct0rdho deleted the min-max-orig-reso branch March 18, 2026 23:52
kohya-ss added a commit that referenced this pull request Mar 18, 2026
Document the skip_image_resolution dataset option added in PR #2273.
Add option description, multi-resolution dataset TOML example, and
command-line argument entry to both Japanese and English config READMEs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@kohya-ss
Owner

I've merged this and added documentation for this feature. I'm sorry it took so long to merge. Thank you for this PR again!

kohya-ss added a commit that referenced this pull request Mar 19, 2026
* docs: add skip_image_resolution option to config README

Document the skip_image_resolution dataset option added in PR #2273.
Add option description, multi-resolution dataset TOML example, and
command-line argument entry to both Japanese and English config READMEs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: clarify `skip_image_resolution` functionality in dataset config

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>