Pipeline for remotely sensed imagery. The pipeline processes satellite imagery alongside auxiliary data in multiple steps to arrive at a set of trend files related to land-cover changes.
This document describes the output produced by the pipeline.
The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
The pipeline is built using Nextflow and processes data using the following steps:
- untar - Optionally extract input files
- Preparation - Create masks and boundaries for further analyses.
- Preprocessing - Preprocessing of satellite imagery.
- Higher-level-Processing - Classify preprocessed imagery and perform time series analyses.
- Visualization - Create two visualizations of the results.
- MultiQC - Aggregate report describing results and QC from the whole pipeline
- Pipeline information - Report metrics generated during the workflow execution
<digital_elevation_dir>: directory containing symlinks to decompressed digital elevation input data. Only present if a tar archive was provided for the digital elevation model. Name of the directory derived from archive contents.
<water_vapor_dir>: directory containing symlinks to decompressed water vapor input data. Only present if a tar archive was provided for water vapor data. Name of the directory derived from archive contents.
<satellite_data_dir>: directory containing symlinks to decompressed satellite imagery input data. Only present if a tar archive was provided for satellite data. Name of the directory derived from archive contents.
untar is an nf-core module used to extract files from tar archives.
Invocation of untar depends on certain parameters (i.e. input_tar, dem_tar and wvdb_tar). Thus, the output files are only generated when these are set to true.
tile_allow.txt: File listing, in FORCE notation, all tiles of the Earth's surface that should be used further in the pipeline. The first line contains the number of tiles. The following lines contain the tile identifiers.
mask/: Directory containing a subdirectory for every FORCE tile. Each subdirectory contains the aoi.tif file. This file represents a binary mask layer that indicates which pixels are eligible for analyses.
In the preparation step, usable tiles and pixels per tile are identified.
force-cube computes the usable pixels for each FORCE tile. This computation is based on the specified area of interest and the resolution. The resulting binary masks can be used to understand which pixels were discarded (e.g. because they only contain water).
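The layout of tile_allow.txt described above (a count on the first line, then one tile identifier per line) can be read with a few lines of Python. This is a minimal sketch, not part of the pipeline: the function names `read_tile_allow_list` and `mask_paths` are hypothetical, and `base_dir` is assumed to be whichever directory holds the mask/ directory.

```python
from pathlib import Path


def read_tile_allow_list(path):
    """Return the tile identifiers listed in a tile_allow.txt file."""
    lines = [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]
    if not lines:
        raise ValueError(f"{path} is empty")
    expected = int(lines[0])  # first line: number of tiles
    tiles = lines[1:]         # following lines: one tile identifier each
    if len(tiles) != expected:
        raise ValueError(f"header announces {expected} tiles, found {len(tiles)}")
    return tiles


def mask_paths(base_dir, tiles):
    """Map each tile to its expected mask file, mask/<tile>/aoi.tif.

    base_dir is an assumption: the directory that contains mask/.
    """
    return {tile: Path(base_dir) / "mask" / tile / "aoi.tif" for tile in tiles}
```

Comparing the count in the header with the number of identifiers is a cheap way to notice a truncated file before looking up the per-tile masks.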
preprocess/<SATELLITE INPUT IMAGE>/
param_files/: Directory containing parameter files for FORCE preprocessing modules. One file per satellite mission per tile.
level2_ard/: Directory containing symlinks to analysis-ready-data. Subdirectories contain the .tif files that were generated during preprocessing.
logs/: Logs from preprocessing.
Preprocessing consists of two parts: generating parameter files and the actual preprocessing.
The parameter files created through force-parameter can be viewed to understand which preprocessing techniques were applied to a given tile.
Logs and analysis-ready data (ARD) are generated using the force-l2ps command. Logs can be consulted for debugging purposes. ARD may be collected as a basis for other remote sensing workflows. For each input image, ARD comprises two .tif files, a quality data file and the atmospherically corrected satellite data, both of which can be viewed in a geographic information system (GIS). Note that ARD is only published as symbolic links due to the number and size of the files.
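Because ARD is published as symbolic links, collecting it for another workflow means following each link to the file it points to. The sketch below shows one way to do that with the Python standard library. The function name `resolve_ard_files` is hypothetical, and the code only assumes that the .tif files sit somewhere below the level2_ard/ directory.

```python
import os
from pathlib import Path


def resolve_ard_files(level2_ard_dir):
    """Map every published .tif below level2_ard/ to the file it links to."""
    resolved = {}
    # followlinks=True also descends into subdirectories that are symlinks
    for root, _dirs, files in os.walk(level2_ard_dir, followlinks=True):
        for name in sorted(files):
            if not name.endswith(".tif"):
                continue
            link = Path(root) / name
            try:
                # strict=True raises if the link target no longer exists
                resolved[link] = link.resolve(strict=True)
            except FileNotFoundError:
                resolved[link] = None  # dangling link
    return resolved
```

Entries mapped to `None` are links whose targets are gone, which is worth checking before copying ARD elsewhere.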
param_files/: Parameter files used in force-higher-level.
trend_files: Symlinks to trend files that are the result of higher-level processing.
Higher-level processing consists of two parts: generating parameter files and performing the processing tasks defined in those parameter files.
Parameter files may be consulted to derive information about the specific processing tasks performed for a given tile. In this workflow, a classification, optionally using spectral unmixing, is conducted. Next, time series analyses for different characteristics are performed.
The resulting trend files can be consulted to view trends for individual tiles. They are saved as symlinks because of their large size.
<TILE>: Auxiliary files for the mosaic visualization.
mosaic: Contains a single virtual raster file that defines the mosaic visualization.
pyramid/<TREND_TYPE>/trend/<TILE>/: Contains tile-wise pyramid visualizations for every trend analyzed in the workflow.
Two types of common visualizations are generated in the last step of the pipeline. They are results of force-mosaic and force-pyramid. Note that these visualizations do not add more logic to the workflow but rather rearrange the output files of higher-level-processing.
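The pyramid output follows the pyramid/<TREND_TYPE>/trend/<TILE>/ layout listed above, so the files can be grouped by trend type and tile directly from their paths. The following is a minimal sketch under that assumption. The function name `index_pyramids` is hypothetical, and nothing is assumed about the names of the files inside each tile directory.

```python
from collections import defaultdict
from pathlib import Path


def index_pyramids(pyramid_dir):
    """Group files by (trend type, tile) using pyramid/<TREND_TYPE>/trend/<TILE>/."""
    index = defaultdict(list)
    root = Path(pyramid_dir)
    # One wildcard per variable path component: trend type, tile, file name
    for path in sorted(root.glob("*/trend/*/*")):
        trend_type, _trend, tile, _name = path.relative_to(root).parts
        index[(trend_type, tile)].append(path)
    return dict(index)
```

The resulting dictionary makes it easy to check that every analyzed trend has a pyramid for every tile.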
multiqc_report.html: a standalone HTML file that can be viewed in your web browser.
multiqc_data/: directory containing parsed statistics from the different tools used in the pipeline.
multiqc_plots/: directory containing static images from the report in various formats.
MultiQC is a visualization tool that generates a single HTML report summarising all samples in your project. Most of the pipeline QC results are visualised in the report and further statistics are available in the report data directory.
Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQC. The pipeline has special steps which also allow the software versions to be reported in the MultiQC output for future traceability. For more information about how to use MultiQC reports, see http://multiqc.info.
- Reports generated by Nextflow:
- Reports generated by the pipeline:
The pipeline_report* files will only be present if the --email_on_fail parameter is used when running the pipeline.
- Reformatted samplesheet files used as input to the pipeline:
Nextflow provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage.
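When troubleshooting, a quick summary of task outcomes is often the first thing to look at. The sketch below counts tasks per status in a Nextflow trace file. It rests on assumptions that are not stated in this document: that a tab-separated trace file with `name` and `status` columns is available, and that failed tasks carry the status `FAILED` or `ABORTED`. The function name `summarize_trace` is hypothetical.

```python
import csv
from collections import Counter


def summarize_trace(trace_path):
    """Count tasks per status in a tab-separated Nextflow trace file."""
    with open(trace_path, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    # Assumed columns: "name" and "status"
    status_counts = Counter(row["status"] for row in rows)
    failed = [row["name"] for row in rows if row["status"] in {"FAILED", "ABORTED"}]
    return status_counts, failed
```

The names of failed tasks can then be matched against the logs of the corresponding pipeline step.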