Fmask

Fmask (Function of mask) is an automated algorithm for detecting clouds and cloud shadows in Landsat 4–9 (4, 5, 7, 8, and 9) and Sentinel-2 imagery (Figure 1). It processes Landsat 4–9 Collection 2 Level 1 imagery (Digital Number) and Sentinel-2 Level-1C imagery (Top of Atmosphere reflectance).

Figure 1: Example of Fmask5-UPL for Sentinel-2. Image ID: S2B_MSIL1C_20190803T202849_N0208_R114_T10XEF_20190803T221046.

Fmask Version 5.0 (Fmask 5) offers a Physics-Informed Machine Learning (PIML) framework (Figure 2) to enhance cloud detection accuracy, while cloud shadow detection relies on the physical geometric relationship between identified clouds and their corresponding shadows. Table 1 summarizes seven cloud detection models that either integrate physical rules with machine learning or use each approach independently. Fmask 5 applies UPL for Landsats 8–9 and Sentinel-2, and LPL for Landsats 4–7.
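The geometric relationship between a cloud and its shadow can be illustrated with a simplified projection. The sketch below is not Fmask's matching code. It assumes flat terrain, a nadir-viewing sensor, a north-up image, and a solar azimuth measured clockwise from north; the function name `shadow_offset_pixels` and its parameters are hypothetical. Under those assumptions, a cloud at height h casts its shadow a horizontal distance of h × tan(solar zenith) on the side opposite the sun.

```python
import math


def shadow_offset_pixels(cloud_height_m, solar_zenith_deg, solar_azimuth_deg, pixel_size_m):
    """Return the (d_col, d_row) shift from a cloud pixel to its shadow pixel.

    Assumptions (illustrative only): flat terrain, nadir view, north-up image,
    azimuth measured clockwise from north.
    """
    zenith = math.radians(solar_zenith_deg)
    azimuth = math.radians(solar_azimuth_deg)

    # Horizontal distance between the cloud and its shadow on the ground.
    distance_m = cloud_height_m * math.tan(zenith)

    # The shadow falls on the side opposite the sun (azimuth + 180 degrees).
    east_m = -distance_m * math.sin(azimuth)
    north_m = -distance_m * math.cos(azimuth)

    # Columns increase eastward; rows increase southward in a north-up image.
    return east_m / pixel_size_m, -north_m / pixel_size_m


# Sun in the southeast (azimuth 135 degrees) at a zenith angle of 45 degrees.
d_col, d_row = shadow_offset_pixels(1000.0, 45.0, 135.0, 30.0)
print(f"shadow shift: {d_col:.1f} columns, {d_row:.1f} rows")
```

Cloud height is unknown in practice, so a matching step has to search over a range of heights. The progress log further down also shows Fmask loading sensor angles and terrain slope and aspect, so its own matching accounts for more than this flat, nadir case.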

Table 1: Cloud detection models. UPL is recommended for Landsat 8-9 and Sentinel-2, while LPL is recommended for Landsat 4-7 (4, 5, and 7).

| Category | Model Name | Command Option | Pre-trained ML | Physical Rules | Fine-tuned ML | Post-processing |
|---|---|---|---|---|---|---|
| Baseline | Physical | PHY | No | Yes | No | Spatial morphology |
| Baseline | LightGBM | GBM | LightGBM | No | No | No |
| Baseline | UNet | UNet | UNet | No | No | No |
| Simple Combination | LPL* | LPL | LightGBM | Yes | LightGBM (3% replacement rate) | Spatial morphology |
| Simple Combination | UPU | UPU | UNet | Yes | UNet (5 epochs) | No |
| Hybrid Combination | LPU | LPU | LightGBM | Yes | UNet (5 epochs) | No |
| Hybrid Combination | UPL* | UPL | UNet | Yes | LightGBM (3% replacement rate) | UNet-overlap |

Note: The machine learning model can be pre-trained on global reference datasets and fine-tuned on “localized” training data generated by the physical rules. UNet-overlap excludes any cloud object in which no pixel was classified as cloud by the pre-trained UNet model. Abbreviations: ML: machine learning.
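The UNet-overlap rule in the note above can be sketched in a few lines. This is a minimal pure-Python illustration, not the Fmask implementation: the function name `unet_overlap_filter` is hypothetical, masks are plain nested lists of 0/1, and cloud objects are assumed to be 8-connected groups of pixels. On real rasters a library labeling routine such as `scipy.ndimage.label` would be the usual choice.

```python
from collections import deque


def unet_overlap_filter(candidate, unet):
    """Keep only candidate cloud objects that contain at least one UNet cloud pixel.

    `candidate` and `unet` are equally sized 2-D lists of 0/1 values.
    Objects are 8-connected groups of candidate pixels (an assumption).
    """
    rows, cols = len(candidate), len(candidate[0])
    seen = [[False] * cols for _ in range(rows)]
    kept = [[0] * cols for _ in range(rows)]

    for r0 in range(rows):
        for c0 in range(cols):
            if not candidate[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected cloud object.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and candidate[rr][cc] and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            # Keep the object only if the UNet mask overlaps it somewhere.
            if any(unet[r][c] for r, c in pixels):
                for r, c in pixels:
                    kept[r][c] = 1
    return kept


# Two candidate objects; only the left one overlaps a UNet cloud pixel.
candidate = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
unet = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in unet_overlap_filter(candidate, unet):
    print(row)
```

In this toy example the right-hand object is dropped because the UNet mask has no cloud pixel inside it, while the left-hand object is kept whole even though the UNet flagged only one of its pixels.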

Figure 2: Flowchart of physics-informed machine learning (PIML) for cloud detection. The approach utilizes pixel-based LightGBM and CNN-based UNet models. Arrows indicate the processing sequence, which proceeds from the gray arrows to the black arrows. Abbreviations: HOT: Haze Optimized Transformation.

Complete Package

This repository only provides the source code and does not include the integrated global auxiliary datasets or pre-trained machine learning models. To access the complete Fmask package (~3 GB), including all necessary auxiliary data and model files, please download it from the link(s) below:

| Version | Download |
|---|---|
| 5.0.1 | Link to Fmask 5.0.1 package |
| 5.0.0 | Link to Fmask 5.0.0 package |

How to Use

Installation

Create a Python 3.10 environment with (Mini)Conda

  • conda create -n fmask python=3.10

Activate the python environment

  • conda activate fmask

Install the dependencies (the packages listed below were used for testing; not all of them may be required)

  • conda install rasterio gdal -y
  • pip install -U segmentation-models-pytorch
  • pip install plotly
  • pip install --upgrade nbformat # required by plotly
  • pip install patchify
  • conda install lxml -y
  • pip install pandas
  • pip install geopandas
  • pip install -U scikit-learn
  • conda install scipy -y
  • conda install scikit-image -y
  • conda install matplotlib -y
  • pip install pyproj
  • pip install utm
  • pip install lightgbm
  • pip install click

Running Fmask from the main Folder

To apply Fmask-UPL on a single Landsat 8-9 image (default cloud dilation: 3 pixels):

python fmask.py --imagepath /path/to/image_directory_landsat8-9 --model UPL

To apply Fmask-UPL on a single Sentinel-2 image (recommended cloud dilation: 0 pixels):

python fmask.py --imagepath /path/to/image_directory_Sentinel-2.SAFE --model UPL --dcloud 0

To apply Fmask-LPL on a single Landsat 4-7 image (default cloud dilation: 3 pixels):

python fmask.py --imagepath /path/to/image_directory_landsat4-7 --model LPL
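To process many scenes with the recommended model for each sensor, the three commands above can be wrapped in a small script. The sketch below is illustrative only: `build_command` and `run_all` are hypothetical helpers, the sensor is guessed from the folder name (Sentinel-2 folders ending in `.SAFE`, Landsat folders starting with a code such as `LC08` or `LE07`), and `fmask_dir` is assumed to be the main Fmask folder containing `fmask.py`. Only the `--imagepath`, `--model`, and `--dcloud` options shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_command(scene_dir):
    """Build the fmask.py command for one scene directory.

    The sensor is guessed from the folder name (an assumption of this sketch).
    """
    scene = Path(scene_dir)
    name = scene.name.upper()
    cmd = [sys.executable, "fmask.py", "--imagepath", str(scene)]

    if name.endswith(".SAFE"):
        # Sentinel-2: UPL with the recommended cloud dilation of 0 pixels.
        cmd += ["--model", "UPL", "--dcloud", "0"]
    elif name.startswith("L") and name[2:4] in ("08", "09"):
        # Landsat 8-9: UPL with the default cloud dilation.
        cmd += ["--model", "UPL"]
    elif name.startswith("L") and name[2:4] in ("04", "05", "07"):
        # Landsat 4-7: LPL with the default cloud dilation.
        cmd += ["--model", "LPL"]
    else:
        raise ValueError(f"cannot infer sensor from folder name: {scene.name}")
    return cmd


def run_all(parent_dir, fmask_dir):
    """Run Fmask on every scene folder under parent_dir, one after another."""
    for scene in sorted(Path(parent_dir).iterdir()):
        if scene.is_dir():
            # fmask.py is run from the main Fmask folder.
            subprocess.run(build_command(scene), cwd=fmask_dir, check=True)


# Show the command that would be run for one Landsat 8 scene.
print(build_command("/path/to/LC08_L1TP_048022_20230713_20230724_02_T1"))
```

Calling `run_all("/path/to/scenes", "/path/to/Fmask")` would then process each scene folder in turn and stop at the first failure, because `check=True` raises an error on a non-zero exit code.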

🛠️ Command-Line Options

| Option | Short | Description | Default |
|---|---|---|---|
| --imagepath | -i | Path to input image directory (Landsat/Sentinel-2). | required |
| --model | -m | Cloud detection model to use (Options shown in Table 1). | UPL |
| --dcloud | -c | Dilation size (in pixels) for cloud mask. | 3 |
| --dshadow | -s | Dilation size (in pixels) for cloud shadow mask. | 5 |
| --dsnow | -n | Dilation size (in pixels) for snow/ice mask. | 0 |
| --output | -o | Directory for saving output. If not provided, results go into the input image directory. | None |
| --skip_existing | -s | Skip processing if results already exist (yes or no). | no |
| --save_metadata | -md | Save model metadata as CSV. | no |
| --display_fmask | -df | Save and display the Fmask result as a PNG. | no |
| --display_image | -di | Save and display the color composite figure (NGR: NIR-Green-Red and SNG: SWIR1-NIR-Red), cirrus band, and thermal band (if available). | no |
| --print_summary | -ps | Print cloud, shadow, snow, and clear percentage summary. | no |

Progress Information

If the tool runs successfully, you will see progress information as shown below:

************************************************
Starting Fmask 5.0.0 with dilating 3 for cloud, 5 for shadow, and 0 for snow  
Processing /gpfs/sharedfs1/zhulab/Shi/ProjectCloudDetectionFmask5/HLSDataset/Landsat/LC08_L1TP_048022_20230713_20230724_02_T1 with Fmask-UPL model  
>>> loading solar_zenith in radian  
>>> loading coastal in toa  
>>> loading blue in toa  
>>> loading green in toa  
>>> loading red in toa  
>>> loading nir in toa  
>>> loading swir1 in toa  
>>> loading swir2 in toa  
>>> loading tirs1 in bt  
>>> loading tirs2 in bt  
>>> loading cirrus in toa  
>>> calculating hot  
>>> calculating whiteness  
>>> calculating ndvi  
>>> calculating ndsi  
>>> calculating ndbi  
>>> calculating sfdi  
>>> calculating var_nir  
>>> loading dem from gtopo30  
>>> loading gswo  
>>> loading ['unet'] as base machine learning model  
>>> loading lightgbm as tune machine learning model  
>>> normalizing the datacube to [-1, 1] with percentiles [1, 99]  
>>> classifying image by unet  
>>> adjusting physical rules 01/01  
>>> cloud probability (TTT) | overlap: 0.198981859 | optimal threshold: 0.025000000  
>>> cloud probability (FTT) | overlap: 0.248279496 | optimal threshold: 0.000000000  
>>> cloud probability (TFT) | overlap: 0.110536988 | optimal threshold: 0.025000000  
>>> cloud probability (FFT) | overlap: 0.248274458 | optimal threshold: 0.000000000  
>>> cloud probability (TTF) | overlap: 0.212598233 | optimal threshold: 0.025000000  
>>> cloud probability (FTF) | overlap: 0.248252342 | optimal threshold: 0.000000000  
>>> optimal cloud probability (TFT) | optimal threshold: 0.03  
>>> tuning machine learning model 01/01  
>>> training lightgbm 100 tree based on 10001 samples  
>>> using 19 predictors: ['coastal', 'blue', 'green', 'red', 'nir', 'swir1', 'swir2', 'tirs1', 'tirs2', 'cirrus', 'hot', 'whiteness', 'ndvi', 'ndsi', 'ndbi', 'sfdi', 'var_nir', 'dem', 'swo']  
>>> classifying the image by lightgbm model  
>>> stop iterating at the end  
>>> postprocessing with morphology&unet-based elimination  
>>> masking potential cloud shadow by flood-fill  
>>> loading gtopo30-slope  
>>> loading gtopo30-aspect  
>>> loading sensor_zenith in degree  
>>> loading sensor_azimuth in degree  
>>> matching cloud shadows  
>>> saved fmask layer as geotiff to /gpfs/sharedfs1/zhulab/Shi/ProjectCloudDetectionFmask5/HLSDataset/Landsat/LC08_L1TP_048022_20230713_20230724_02_T1  
Finished with 11.90 mins

Output

The tool generates a uint8 GeoTIFF file named after the selected cloud detection model, such as '<image folder name>_UPL.tif'.

Each pixel is classified with one of the following values:

| Value | Class | Description |
|---|---|---|
| 0 | Land | Clear land surface |
| 1 | Water | Clear water surface |
| 2 | Cloud Shadow | Shadow matched with the detected cloud |
| 3 | Snow/Ice | Snow- or ice-covered surface |
| 4 | Cloud | Detected cloud |
| 255 | Filled | No-data fill (e.g., due to missing input band(s)) |

Note: Water and snow/ice pixels are labeled solely to enhance cloud detection. Their detection accuracy has not been evaluated.
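The class values above make it straightforward to summarize a result. The sketch below is a minimal, standard-library illustration and not how Fmask computes its own `--print_summary` output: the helper `class_percentages` is hypothetical, and excluding fill pixels (255) from the denominator is an assumption.

```python
from collections import Counter

# Class values from the output table above.
FMASK_CLASSES = {0: "land", 1: "water", 2: "cloud shadow", 3: "snow/ice", 4: "cloud"}
FILL_VALUE = 255


def class_percentages(pixels):
    """Return the percentage of each class among valid (non-fill) pixels.

    `pixels` is any iterable of integer Fmask values, e.g. a flattened band.
    Excluding fill pixels from the denominator is an assumption of this sketch.
    """
    counts = Counter(pixels)
    valid = sum(n for value, n in counts.items() if value != FILL_VALUE)
    if valid == 0:
        return {name: 0.0 for name in FMASK_CLASSES.values()}
    return {name: 100.0 * counts.get(value, 0) / valid
            for value, name in FMASK_CLASSES.items()}


# A tiny made-up mask: 8 valid pixels and 2 fill pixels.
sample = [0, 0, 0, 1, 2, 4, 4, 4, 255, 255]
for name, pct in class_percentages(sample).items():
    print(f"{name:>12}: {pct:5.1f}%")
```

With rasterio, which is installed in the steps above, the mask band could be read with `rasterio.open(path).read(1)` and flattened with `.ravel().tolist()` before being passed in. For full-size scenes a NumPy-based count would be faster than iterating in Python.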

Computing Efficiency

TBD

Global Validation Dataset

The global validation samples are available at this link.

Version History

5.0.1

  • As described in Qiu et al., 2025

5.0.0

  • Initially released with 512×512 image chips and the full set of predictors in machine learning.
  • Adapted cloud shadow detection from MATLAB Fmask 4.6 with minor improvements described on this page.

1.6 - 4.7

Earlier versions of the Fmask tools offered only a physical-rule-based cloud detection module, programmed in MATLAB. See this page for more details.

Contributing

We welcome and encourage contributions to Fmask! There are two primary ways to contribute:

Report Issues or Suggestions

If you encounter a problem or have a suggestion for improving Fmask, please open an issue or submit a pull request.

Share Problematic Images

We are actively collecting examples of images that the current version of Fmask does not process accurately. If you come across such an image, please share its image ID with us on this page. The collected images will be used to refine the underlying machine learning models, improving their accuracy and reliability in future versions.

Known Issues

  • False positives in cloud detection over bright surfaces. Although the most recent version of Fmask has addressed most of these issues, challenges remain in highly reflective areas, such as high-mountain snow and ice.
  • Artifacts in cloud detection under very thin clouds. Thin (cirrus) clouds are more easily identified over bright surfaces, such as buildings and cropland, because their features become more pronounced there.
  • Possible omission of cloud shadows at the image boundary, where the associated clouds lie outside the image extent and therefore cannot be identified or matched.

Note: Our team is collecting images with cloud detection issues and will continuously update the machine learning model to make improvements.

References

Qiu, S., Zhu, Z., Yang, X., Ju, J., Zhou, Q., Neigh, C., Physics-Informed Machine Learning for Cloud Detection, Remote Sensing of Environment, In revision.

Qiu, S., et al., Fmask 4.0: Improved cloud and cloud shadow detection in Landsats 4-8 and Sentinel-2 imagery, Remote Sensing of Environment, (2019), doi:10.1016/j.rse.2019.05.024 (paper for 4.0).

Zhu, Z., Wang, S., and Woodcock, C. E., Improvement and Expansion of the Fmask Algorithm: Cloud, Cloud Shadow, and Snow Detection for Landsats 4-7, 8, and Sentinel 2 images, Remote Sensing of Environment, (2015), doi:10.1016/j.rse.2014.12.014 (paper for version 3.2).

Zhu, Z. and Woodcock, C. E., Object-based cloud and cloud shadow detection in Landsat imagery, Remote Sensing of Environment, (2012), doi:10.1016/j.rse.2011.10.028 (paper for 1.6).

Qiu, S., et al., Improving Fmask cloud and cloud shadow detection in mountainous areas for Landsats 4–8 images, Remote Sensing of Environment, (2017), doi:10.1016/j.rse.2017.07.002 (paper for Mountainous Fmask (MFmask), which has been integrated into the current Fmask).

Contact Us

Shi Qiu (shi.qiu@uconn.edu) and Zhe Zhu (zhe@uconn.edu)

Global Environmental Remote Sensing Laboratory (GERSL), University of Connecticut, Storrs, USA
