Overview

Datasets in Terrafloww Marketplace are defined using YAML configuration files. This format allows you to specify metadata, band configurations, access settings, and STAC integration details.

YAML Schema

Required Fields

Every dataset must include these essential fields:
| Field | Type | Description |
|---|---|---|
| Name | string | Dataset name (5-130 characters) |
| Description | string | Detailed description of the dataset |
| Documentation | string | URL to dataset documentation |
| Contact | string | Contact email or name |
| ManagedBy | string | Organization managing the dataset |
| License | string | SPDX license identifier or URL |
| Tags | array | Keywords for discovery |
| UpdateFrequency | string | Update frequency (e.g., Daily, Weekly) |
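
As a minimal sketch, a definition that sets only the required fields might look like the following. Every value is a placeholder invented for illustration, and the flat top-level layout is assumed to match the complete example further down this page.

```yaml
# Hypothetical minimal dataset definition: required fields only.
# All values are placeholders, not a real dataset.
Name: Example Land Cover Mosaic        # must be 5-130 characters
Description: Land cover classification mosaic used for demonstration purposes.
Documentation: https://example.com/docs/land-cover
Contact: data-team@example.com         # an email or a name
ManagedBy: Example Organization
License: CC-BY-4.0                     # SPDX identifier or a URL
UpdateFrequency: Weekly
Tags:
  - land-cover
  - example
```

The table above does not list Bands, STAC, or Access settings as required, so they are left out here; a real dataset will usually need at least some of them to be usable.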

Optional Fields

| Field | Type | Description |
|---|---|---|
| DatasetId | string | Custom slug for the dataset (defaults to slugified Name) |
| ProviderId | string | Provider identifier |
| SpatialExtent | array | Bounding box [min_lon, min_lat, max_lon, max_lat] |
| TemporalCoverage | string | Time range (e.g., 2020-01-01/2024-12-31) |
| SpatialResolution | string | Resolution (e.g., 10 m) |
| CoordinateSystem | string | EPSG code (e.g., EPSG:4326) |
| StacApiUrl | string | URL to the STAC API endpoint |
| StacCollectionId | string | STAC Collection identifier |
| Access.RequesterPays | boolean | Whether the bucket requires requester-pays |
| Access.Bucket | string | S3 bucket name |
| Access.Region | string | AWS region (e.g., us-west-2) |
| Access.Prefix | string | Path prefix in the bucket |
| Access.Protocol | string | Access protocol (s3, http, https) |
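
The dotted Access.* names correspond to keys nested under a single Access mapping. The sketch below shows what a requester-pays bucket with a path prefix might look like; the bucket name and prefix are hypothetical, and whether Prefix expects a trailing slash is an assumption to verify.

```yaml
# Hypothetical Access block for a requester-pays bucket.
Access:
  RequesterPays: true
  Bucket: example-imagery-archive   # placeholder bucket name
  Region: us-west-2
  Prefix: collections/optical/      # placeholder path prefix inside the bucket
  Protocol: s3                      # one of: s3, http, https
```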

Band Configuration

Define the spectral bands available in your dataset:
Bands:
  - Name: B04
    Assets:
      - red
    Description: Red band
    Wavelength: 665.0
    Resolution: 10.0
  - Name: B08
    Assets:
      - nir
    Description: Near Infrared
    Wavelength: 842.0
    Resolution: 10.0

Band Fields

| Field | Required | Description |
|---|---|---|
| Name | Yes | Band identifier (e.g., B04, NDVI) |
| Assets | Yes | Array of STAC asset keys |
| BandIndex | No | Index within multi-band assets |
| Description | No | Band description |
| Wavelength | No | Central wavelength in nm |
| Resolution | No | Spatial resolution in meters |
| ScaleFactor | No | Multiplicative scale factor |
| AddOffset | No | Additive offset value |
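
The optional fields can be combined as in the sketch below. The numeric values and the `visual` asset key are illustrative only. Two points are assumptions rather than documented behavior: that ScaleFactor and AddOffset are applied as `value = raw * ScaleFactor + AddOffset`, and that BandIndex counts from 1.

```yaml
# Hypothetical bands using the optional fields.
# Assumed conversion: value = raw * ScaleFactor + AddOffset
Bands:
  - Name: B04
    Assets:
      - red
    Description: Red band, stored as scaled integers
    Wavelength: 665.0       # nm
    Resolution: 10.0        # meters
    ScaleFactor: 0.0001     # illustrative value
    AddOffset: -0.1         # illustrative value
  - Name: VIS_G
    Assets:
      - visual              # assumed key of a multi-band (RGB) asset
    BandIndex: 2            # assumed 1-based; verify the indexing convention
    Description: Green channel of the true-color composite
```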

Complete Example

Here’s a complete dataset definition for Sentinel-2 imagery:
Name: Sentinel-2 Level 2A Surface Reflectance
DatasetId: sentinel-2-l2a
Description: |
  Sentinel-2 Level-2A products provide surface reflectance
  measurements from the Copernicus Sentinel-2 mission.

Documentation: https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi
Contact: contact@example.com  # placeholder address
ManagedBy: My Organization
License: CC-BY-4.0
UpdateFrequency: Daily

Tags:
  - satellite
  - optical
  - sentinel-2
  - earth-observation

StacApiUrl: https://earth-search.aws.element84.com/v1
StacCollectionId: sentinel-2-l2a

SpatialExtent: [-180, -90, 180, 90]
TemporalCoverage: 2015-06-27/..
SpatialResolution: 10 m
CoordinateSystem: EPSG:32610

Access:
  RequesterPays: false
  Bucket: sentinel-cogs
  Region: us-west-2
  Protocol: s3

Bands:
  - Name: B02
    Assets: [blue]
    Description: Blue
    Wavelength: 490.0
    Resolution: 10.0
  - Name: B03
    Assets: [green]
    Description: Green
    Wavelength: 560.0
    Resolution: 10.0
  - Name: B04
    Assets: [red]
    Description: Red
    Wavelength: 665.0
    Resolution: 10.0
  - Name: B08
    Assets: [nir]
    Description: NIR
    Wavelength: 842.0
    Resolution: 10.0

URL Rewriting

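For datasets whose asset URLs follow a non-standard pattern, use the UrlRewrite configuration. Pattern is a regular expression matched against each URL, and Substitution is the replacement, where $1 stands for the first capture group: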
UrlRewrite:
  Pattern: "^https://original-domain.com/(.*)$"
  Substitution: "s3://my-bucket/$1"
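
To make the effect concrete, here is a slightly stricter variant with a worked rewrite in the comments. It assumes standard regular-expression semantics, with `$1` expanding to the first capture group. The sample URL is made up for illustration.

```yaml
UrlRewrite:
  # Single quotes keep the backslash literal in YAML, so "\." reaches the
  # regex engine and matches only a literal dot rather than any character.
  Pattern: '^https://original-domain\.com/(.*)$'
  Substitution: 's3://my-bucket/$1'

# Under the assumptions above:
#   https://original-domain.com/tiles/B04.tif  ->  s3://my-bucket/tiles/B04.tif
# URLs that do not match Pattern would presumably be left unchanged.
```

Escaping the dots is optional for a host name this specific, but it avoids accidental matches such as `original-domainXcom`.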
