🌐 PatchGen: Modular Tools for Deep Learning with Earth Observation Data

PatchGen is a modular framework for building deep learning–ready datasets from Earth observation imagery. It leverages Google Earth Engine (GEE), Apache Beam, TensorFlow, and open-source geospatial tools to streamline the extraction, sampling, patching, and exporting of training data for geospatial machine learning.

Whether you're performing land cover classification, vegetation monitoring, or urban mapping, PatchGen provides flexible, YAML-configurable tools to scale your workflow.


🧰 Project Structure

Directory           Description
cfg/                YAML configuration files defining patch generation pipelines
data/               Ancillary data used for stratified sampling and spatial filtering
notebooks/          Example notebooks demonstrating sampling and feature engineering
src/                Main source code, organized into independent components:
   ├── sampler/     Stratified point sampling and feature extraction from GEE
   ├── generator/   Beam-powered patch generator that creates TFRecords directly from GEE
   ├── slicer/      Slices exported multiband rasters into TensorFlow-compatible patches
   └── exporter/    Exports co-registered predictor and target images from GEE to GCS
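
As a rough illustration of what slicing a raster into patches involves, the sketch below computes the pixel windows a slicer-style step might iterate over. It is not PatchGen's actual code: the function name `patch_windows`, its parameters, and the choice to drop partial patches at the raster edges are all assumptions made for this example, which uses only the Python standard library.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (row_start, col_start, row_stop, col_stop)


def patch_windows(height: int, width: int, patch_size: int, stride: int) -> Iterator[Window]:
    """Yield pixel windows for every full patch that fits inside the raster.

    Hypothetical helper: partial patches at the right and bottom edges are
    skipped rather than padded, which is an assumption of this sketch.
    """
    if patch_size <= 0 or stride <= 0:
        raise ValueError("patch_size and stride must be positive")
    for row in range(0, height - patch_size + 1, stride):
        for col in range(0, width - patch_size + 1, stride):
            yield (row, col, row + patch_size, col + patch_size)


if __name__ == "__main__":
    # A 512 x 768 raster cut into non-overlapping 256 x 256 patches.
    for window in patch_windows(height=512, width=768, patch_size=256, stride=256):
        print(window)
```

Setting `stride` smaller than `patch_size` produces overlapping patches, which is one common way to get more training samples from the same exported raster.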

🛰️ Key Features

  • GEE-native sampling + processing: Generate rich predictor/target variables on the fly
  • Custom feature sets: Easily configure time-windowed statistics, indices, or radar metrics
  • High-throughput patch creation: Apache Beam pipeline generates compressed TFRecords
  • Multiple modes: Choose from direct-from-GEE (generator) or pre-exported raster slicing (slicer)
  • Reproducibility-first: YAML-driven configs make experiments traceable and swappable
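
To make "time-windowed statistics" and "indices" concrete, here is a minimal standard-library sketch that computes NDVI, using the standard definition (NIR - Red) / (NIR + Red), and takes its median over a date window for a single pixel. In PatchGen this kind of feature would presumably be computed inside GEE. The `Observation` type and the function names below are hypothetical and exist only for this illustration.

```python
from datetime import date
from statistics import median
from typing import List, NamedTuple, Optional


class Observation(NamedTuple):
    """One hypothetical per-pixel observation: acquisition date and two band values."""
    when: date
    red: float
    nir: float


def ndvi(red: float, nir: float) -> float:
    """Normalized difference vegetation index; returns 0.0 when both bands are zero."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def windowed_median_ndvi(obs: List[Observation], start: date, end: date) -> Optional[float]:
    """Median NDVI over observations with start <= date < end, or None if the window is empty."""
    values = [ndvi(o.red, o.nir) for o in obs if start <= o.when < end]
    return median(values) if values else None


if __name__ == "__main__":
    series = [
        Observation(date(2023, 6, 1), red=0.25, nir=0.75),
        Observation(date(2023, 6, 15), red=0.5, nir=0.5),
        Observation(date(2023, 7, 1), red=0.0, nir=0.5),
        Observation(date(2023, 9, 1), red=0.5, nir=0.5),  # outside the window below
    ]
    print(windowed_median_ndvi(series, date(2023, 6, 1), date(2023, 8, 1)))
```

The half-open window (start inclusive, end exclusive) is a design choice of this sketch: it lets consecutive windows tile a season without counting any observation twice.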

📄 License

This project is licensed under the MIT License. See LICENSE for details.