Extending Isofit to handle multisurface classifications #617

Merged
pgbrodrick merged 8 commits into
isofit:dev from
evan-greenbrg:multisurface_fresh
Sep 11, 2025

@evan-greenbrg evan-greenbrg commented Dec 20, 2024

Collaborator

This includes changes to the code base to handle an external classification file that is mapped to specific configurations of surface models and executions varying per-pixel or per-superpixel.

The goals of the changes:

  1. Use external or internal classifications of surface configurations, either via apply_oe or directly in a manually configured config, in both per-pixel and superpixel modes.

  2. Remain efficient both in terms of computation time and memory requirements.

Things to highlight:

  1. New backwards-compatible "surface" section of the isofit config. Based on the input or generated classification file, the surface config is populated.
    "surface": {
        "Surfaces": {
            "glint_model_surface": {
                "surface_category": "glint_model_surface",
                "surface_file": "/home/bgreenbe/store/Projects/Biosscape/prism/study_area/prm20231110t071521/Multisurface/027/data/surface_glint_model_surface.mat",
                "surface_int": 0
            },
            "multicomponent_surface": {
                "surface_category": "multicomponent_surface",
                "surface_file": "/home/bgreenbe/store/Projects/Biosscape/prism/study_area/prm20231110t071521/Multisurface/027/data/surface_multicomponent_surface.mat",
                "surface_int": 1
            }
        },
        "base_surface_class_file": "/home/bgreenbe/store/Projects/Biosscape/prism/study_area/prm20231110t071521/Multisurface/027/input/prm20231110t071521_surface_class",
        "multi_surface_flag": true,
        "surface_class_file": "/home/bgreenbe/store/Projects/Biosscape/prism/study_area/prm20231110t071521/Multisurface/027/input/prm20231110t071521_subs_surface_class"
    }
}

The "Surfaces" key is only used when running multiple surfaces.
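
As a rough illustration of how a "Surfaces" block like the one above could be consumed, the sketch below builds a lookup from surface_int to the surface category and surface file. The helper name surface_lookup and the shortened paths are hypothetical and only for illustration; this is not the code in the PR.

```python
from typing import Dict, Tuple


def surface_lookup(surface_config: dict) -> Dict[int, Tuple[str, str]]:
    """Map each surface_int to (surface_category, surface_file).

    Hypothetical helper for illustration only; not part of Isofit.
    """
    lookup = {}
    # A config without "Surfaces" (single-surface run) gives an empty lookup.
    for entry in surface_config.get("Surfaces", {}).values():
        lookup[int(entry["surface_int"])] = (
            entry["surface_category"],
            entry["surface_file"],
        )
    return lookup


# Shortened, made-up paths standing in for the real ones in the config.
example = {
    "Surfaces": {
        "glint_model_surface": {
            "surface_category": "glint_model_surface",
            "surface_file": "data/surface_glint_model_surface.mat",
            "surface_int": 0,
        },
        "multicomponent_surface": {
            "surface_category": "multicomponent_surface",
            "surface_file": "data/surface_multicomponent_surface.mat",
            "surface_int": 1,
        },
    },
    "multi_surface_flag": True,
}

lookup = surface_lookup(example)
```

With this lookup, the integer read from the classification file for a pixel selects the surface model file to load for that pixel.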

Surface-specific surface.mat files are now possible, and encouraged. When generating from .json, the surface model uses a "surface_type" key to group the surface entries that share a type into their respective .mat files.

The cleanest implementation was to first make this a required key for multi-surface runs from .json. This could be made optional. Passed .mat files will use the same surface prior file across types.

      {
        "input_spectrum_files":
          [
            "/home/bgreenbe/store/Data/SurfaceLibraries/filtered_other"
          ],
        "n_components": 1,
        "windows": [
          {"interval":[300,740], "regularizer":10, "correlation":"decorrelated"},
          {"interval":[740,1250], "regularizer":1e-6, "correlation":"EM", "name":"shallow-water"},
          {"interval":[1250,1325], "regularizer":1e-8, "correlation":"EM", "name": "osf"},
          {"interval":[1325,1960], "regularizer":10, "correlation": "decorrelated" },
          {"interval":[1960,2070], "regularizer":1e-6, "correlation":"EM","name": "co2" },
          {"interval":[2070,2300], "regularizer":10, "correlation":"decorrelated" },
          {"interval":[2300,2500], "regularizer":1e-3, "correlation":"EM",  "isolated": 1,"name": "noise" }
        ],
        "surface_type": "multicomponent_surface"
      },
  2. Isofit is now run sequentially across surface types.
surface_index = index_spectra_by_surface(input_config, index_pairs)
for surface_class_str, class_idx_pairs in surface_index.items():
    logging.info(f"Running surfaces: {surface_class_str}")
    ...
    # If multisurface, update config to reflect surface.
    # Otherwise, returns itself
    config = update_config_for_surface(
        deepcopy(input_config), surface_class_str
    )
    self.fm = fm = ForwardModel(config)
    ...
  3. On the spectrum write, the statevector is matched against a full statevector by name (not index). A fill value populates elements that are present in the full statevector, but not a pixel statevector.

  4. Analytical line processing is handled similarly, in sequence across the surface types present.
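
To make the name-based statevector matching concrete, here is a minimal sketch. The function name match_by_name, the state element names, and the fill value of -9999 are all assumptions for illustration, not the PR's actual implementation.

```python
import numpy as np


def match_by_name(pixel_names, pixel_values, full_names, fill_value=-9999.0):
    """Place a pixel's state values into the full statevector by name.

    Elements of the full statevector that the pixel's surface model does
    not carry are left at fill_value. Hypothetical helper, not Isofit code.
    """
    full = np.full(len(full_names), fill_value, dtype=float)
    position = {name: i for i, name in enumerate(full_names)}
    for name, value in zip(pixel_names, pixel_values):
        full[position[name]] = value
    return full


# Hypothetical names: the full statevector is the union across surfaces.
full_names = ["RFL_1", "RFL_2", "GLINT", "H2OSTR"]
# A pixel run with a surface model that has no glint term.
pixel_names = ["RFL_1", "RFL_2", "H2OSTR"]
matched = match_by_name(pixel_names, [0.1, 0.2, 1.5], full_names)
```

Matching by name rather than by position means the output columns stay aligned even when different surface models produce statevectors of different lengths.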

A couple of other changes to note:

  • Refactored the analytical line. I simplified the file initialization, which reduced clutter when updating output metadata. I would like a second pair of eyes on this to make sure the read/write memory footprint is similar to that of the dev branch.
  • Tests for multisurface runs (these would need to be hooked up to GitHub Actions).
  • Built-in classifier based on the surface.json file. To avoid building an external classification pipeline, I incorporated a classifier based on the surface prior selection.
  • Generalized extractions.py to use "reducers." Extractions can now take any generic reducer to move between full-pixel and superpixel resolution. This is necessary for aggregating classified pixels.
  • Misc cleaning throughout.
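
The reducer idea can be sketched roughly as follows. All names here (mean_reducer, mode_reducer, reduce_by_segment) are hypothetical and the real extractions.py interface may differ. The point is only that spectra can be averaged per superpixel, while class labels need a different reducer such as the most common value.

```python
import numpy as np


def mean_reducer(values):
    """Average spectra within a superpixel (axis 0 indexes pixels)."""
    return np.mean(values, axis=0)


def mode_reducer(values):
    """Most common class label within a superpixel."""
    labels, counts = np.unique(values, return_counts=True)
    return labels[np.argmax(counts)]


def reduce_by_segment(data, segments, reducer):
    """Apply a reducer to the pixels of each superpixel label."""
    return {int(s): reducer(data[segments == s]) for s in np.unique(segments)}


# Five pixels assigned to two superpixels (made-up values).
segments = np.array([0, 0, 0, 1, 1])
classes = np.array([1, 1, 0, 0, 0])
radiance = np.array(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
)

class_per_segment = reduce_by_segment(classes, segments, mode_reducer)
mean_per_segment = reduce_by_segment(radiance, segments, mean_reducer)
```

Averaging class integers would produce meaningless fractional labels, which is why a label-aware reducer is needed for the classification file.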

There are a couple of glint-related updates that I'm going to include in a separate PR.

@evan-greenbrg evan-greenbrg added the enhancement New feature or request label Dec 20, 2024
Comment thread isofit/radiative_transfer/radiative_transfer.py Outdated
Comment thread isofit/utils/template_construction.py Outdated
@evan-greenbrg evan-greenbrg force-pushed the multisurface_fresh branch 2 times, most recently from c15d282 to 4ff9896 Compare February 10, 2025 21:09
@evan-greenbrg evan-greenbrg marked this pull request as draft April 9, 2025 22:50
@evan-greenbrg evan-greenbrg force-pushed the multisurface_fresh branch 4 times, most recently from 108f56d to 6b72fe5 Compare July 9, 2025 22:44
@evan-greenbrg evan-greenbrg marked this pull request as ready for review July 30, 2025 21:50
@brentwilder

Contributor

This worked for me! Testing the case of a single, multi-component surface model for an EMIT scene I have locally (emit20250327t212148).

Comment thread isofit/utils/template_construction.py Outdated
Comment on lines +491 to +518
surface_category: str,
multisurface: bool = False,

Collaborator

Add argument descriptions to the docstring for these two arguments.

Comment thread isofit/utils/template_construction.py Outdated
Comment thread isofit/utils/template_construction.py
Comment thread isofit/utils/apply_oe.py Outdated
Comment thread isofit/core/isofit.py
meshgrid = np.meshgrid(self.rows, self.cols)
index_pairs[:, 0] = meshgrid[0].flatten(order="f")
index_pairs[:, 1] = meshgrid[1].flatten(order="f")
del meshgrid

@evan-greenbrg evan-greenbrg Aug 11, 2025

Collaborator Author

I tested the indexing function using numpy.where and found that NumPy's performance is actually better than the view-based indexing.

View-based indexing: ~0.03s
Numpy-based indexing: ~0.005s

I'm going to swap the view-based indexing back to numpy. However, there is a speed difference between the memory contiguous index_pairs and the vstack (not contiguous) index_pairs.

Non-contiguous index_pairs: ~0.025s
Contiguous index_pairs: ~0.005s

This function is only called 2 * n_worker times throughout apply_oe, so the differences here are minor enough to be essentially irrelevant.
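
For readers who want to see the contiguity difference, here is a small sketch. The tiny grid and the class map are made up, and building the non-contiguous variant by transposing a vstack is my assumption about how that layout arises; the timings above were not produced with this code.

```python
import numpy as np

rows = np.arange(3)
cols = np.arange(4)
grid_r, grid_c = np.meshgrid(rows, cols)

# Preallocate an (n_pixels, 2) array and fill it: C-contiguous in memory.
index_pairs = np.empty((rows.size * cols.size, 2), dtype=int)
index_pairs[:, 0] = grid_r.flatten(order="F")
index_pairs[:, 1] = grid_c.flatten(order="F")

# Same values via vstack plus transpose: a non-contiguous view.
stacked = np.vstack(
    [grid_r.flatten(order="F"), grid_c.flatten(order="F")]
).T

# Hypothetical class map (rows x cols); select class-1 pixels with np.where.
class_map = np.array([[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]])
pixel_classes = class_map[index_pairs[:, 0], index_pairs[:, 1]]
keep = np.where(pixel_classes == 1)[0]
class_pairs = index_pairs[keep]
```

If a non-contiguous array is what you already have, np.ascontiguousarray(stacked) makes a contiguous copy before repeated indexing.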

evan-greenbrg commented Aug 12, 2025

Collaborator Author

@pgbrodrick / @unbohn

I've updated the PR with the following revisions based on our conversation:

Major changes:

  • Hard-code mapping between surface class file int and surface_category. Remove all fuzzy logic around these mappings.
  • Revise copy logic for surface model. If passing .json, make sure we're not generating an additional .mat file. Always use the data folder when running via Apply OE. Any surface file copying should only happen when passing a .mat file.
  • Get run times to back up indexing strategy (see comment above).
  • Work caching into the RT load. Pass in pre-init RT object into the forward model as an optional argument.
  • Clean up the shared memory space in the workers. Only pass in large objects.

Minor changes:

  • Call surface types key something else.
  • Add logical check in check_surface to make sure dict keys are present in .mat file if multisurface is run.
  • Change the name of reducers, since it may be a built-in name. "reducers" itself is not a standard Python name; reduce(), however, is. I've decided to keep reducers here because it's the most accurate term for what the functions do.
  • Force any mapping that does exist to be alphabetical
  • Rip out cloud portion of surface config
  • Return all indexes from construct_full_state()
  • Move match_statevector into multistate
  • Pull out delete keys from initialize function
  • Check all file closures.
  • Make sure I'm not calling extractions twice in Apply OE
  • Don't change the initialization to IO, just initialize later
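
One way to read the "alphabetical mapping" item above is the sketch below, which assigns class integers to surface categories in sorted order. The function name build_surface_mapping is hypothetical; the PR may construct the mapping differently.

```python
def build_surface_mapping(categories):
    """Assign integers to surface categories in alphabetical order.

    Hypothetical helper: sorting makes the mapping deterministic no
    matter what order the categories appear in the surface .json.
    """
    return {name: i for i, name in enumerate(sorted(set(categories)))}


mapping = build_surface_mapping(
    ["multicomponent_surface", "glint_model_surface", "multicomponent_surface"]
)
```

For the two categories used in this PR's example config, this ordering reproduces surface_int 0 for glint_model_surface and 1 for multicomponent_surface.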

)
else:
    self.svf = []

@evan-greenbrg evan-greenbrg Aug 12, 2025

Collaborator Author

@brentwilder I ended up making the skyview file consistent with the rest of the input files, in that they are held as memory objects at the Worker level.

I also edited the case of no sky-view file to get around holding the ones array in memory. Let me know if you have any reservations at all.

It'd be super useful for you to run through a sky-view case on this PR whenever you have the bandwidth to do so! (that is once I get the checks passing...)

Comment thread isofit/utils/apply_oe.py
unbohn commented Aug 27, 2025

Collaborator

Passing a .mat file instead of a .json file as --surface_path to apply_oe causes a KeyError at line 2002 of template_construction.py if classify_multisurface is set to True, because the multisurface classification expects one .mat file per surface category. We should decide on the default behavior here: either allow both input options, or clearly document that only a .json file is accepted. At a minimum, we should raise a warning in case the user accidentally provides a single prebuilt .mat file to apply_oe.
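
A guard along these lines could catch the case early. This is only a sketch: the function name check_surface_path is hypothetical, and whether to raise an error or just warn is exactly the open question here.

```python
from pathlib import Path


def check_surface_path(surface_path: str, classify_multisurface: bool) -> None:
    """Fail early if a multisurface run is given a non-.json surface input.

    Hypothetical check for illustration; not the code in apply_oe.
    """
    suffix = Path(surface_path).suffix.lower()
    if classify_multisurface and suffix != ".json":
        raise ValueError(
            f"Multisurface classification expects a .json surface "
            f"definition, but got '{surface_path}'."
        )


# Single-surface runs accept a prebuilt .mat file without complaint.
check_surface_path("surface.mat", classify_multisurface=False)
# Multisurface runs accept a .json definition.
check_surface_path("surface.json", classify_multisurface=True)
```

Raising a clear ValueError up front replaces the less informative KeyError that otherwise surfaces deep in template construction.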

Comment thread isofit/utils/multicomponent_classification.py
Comment thread isofit/utils/multicomponent_classification.py Outdated
Comment thread isofit/core/isofit.py Outdated
unbohn commented Aug 27, 2025

Collaborator

Ok, the infrastructure runs without errors for me finally. Will inspect results qualitatively now...

unbohn commented Aug 27, 2025

Collaborator

Qualitative results are interesting. Here are reflectance RGBs from running a classic single surface inversion on EMIT (upper image) vs. the multisurface approach with two surface categories (lower image):

[Images: emit20250327T212148_rfl (upper), emit20250327T212148_rfl_multisurface (lower)]

unbohn commented Aug 27, 2025

Collaborator

Looking closer at the RGBs, you notice that the multisurface-based result shifts toward greenish/bluish colors. Both images use the exact same stretch. This becomes more obvious when looking at randomly selected spectra:

[Image: randomly selected spectra]

There is a magnitude offset and in some cases even a shape difference.

unbohn commented Aug 28, 2025

Collaborator

Magnitude and shape differences are very likely introduced by using the 6c emulator for the multi surface version, as opposed to the 1c sRTMnet for the classic L2A. Therefore, I think this PR is ready to merge, based on comprehensive testing of various scenarios.

evan-greenbrg commented Aug 28, 2025

Collaborator Author

Action items:

  • Resolve question about splitting .mat inputs for multisurface runs
    • Long term solution should be to condense to a single output .mat file. surface.component should then be able to only use the priors with the matching categories.
    • Short term solution is to only allow .json surface inputs for the multisurface runs. Add checks for this case.
  • Test processing loads with new resource tracker

…or fit_params

Updates to AL and sky glint prior

Edits to sky glint prior

Editing glint fit parameters and adding indexing

Simplifying fit_params wl choice

Fixing typoes in surface_glint_model

Rebase with dev
Initializing output matrices in analytical line with numpy not envi

Fixed classification cleaning function

Fixed index shift in multicomponent classifier

Analytical Line bug fixes

Adding coupled total transm to hueristic atmosphere

Removing duplciate analytical_model function from rebase

Fixed typo in template_constructions

Changing simple to algebraic

Fixed logging

Changed state order to alphabetical

Bad object reference

Rolling back transm change

Fixing bugs in examples test

Fixing rebase issues

Fixed examples

Fixed bad invert_algebraic return

Typo in template

Fixed bad classification filtering
…put init func.

Missed incorrect function argument
Loose bugs to get running through AOE

Removing multi-scattering from AOE. Edits to AOE init to remove discontinuities

Glint updates to heuristic atm

Fix surface path in config and correct surface creation on non multisurface runs
Fixing classification cli. Tweaking glint est on init

Fixed staging and config for single surface json

Moving multistate to core

Moved multistate.py to core. Fixed surface config for input .mat files.

Fixing direct .mat input for surface

Bad indent

Missed surface_mapping var
Fixing bugs after rebase

isofit run gets n_rows from SpectrumFile

Moving tests into test_examples

editing working directory in apply oe test

Edits to test_cli

SpectrumFile errors out if trying to build output that already exists

Bug fix for tests

Added initialization method to IO class

Removing multisurface test label

updaing flush dirs to reflect correct working dir

Update Glint model. Fix surface copying

Surface matrices size checks for 1-component surfaces

Fixed typos

Fixed logging type
@evan-greenbrg

Collaborator Author

Resource logs for multisurface run:

[Image: Screenshot 2025-09-09 at 8 49 39 AM]

@pgbrodrick pgbrodrick merged commit b8d4c67 into isofit:dev Sep 11, 2025
17 checks passed

4 participants