FRAME_FM.datasets package

This package holds the FRAME-FM datasets, all of which are subclasses of torch.utils.data.Dataset. Each one follows the standard map-style model: the dataset is an indexable object that a DataLoader can consume directly. The common interface is:

  • construction: __init__()

  • length: __len__()

  • get item by index: __getitem__(idx)
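As a minimal sketch of this interface, the toy class below implements the three methods and is consumed by a DataLoader. SquaresDataset and its contents are hypothetical and are not part of FRAME_FM; only the torch.utils.data calls are real.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset (not in FRAME_FM): item i is the pair (i, i**2)."""

    def __init__(self, n: int):
        # construction: store whatever is needed to serve items later
        self.n = n

    def __len__(self) -> int:
        # length: number of items the DataLoader can index
        return self.n

    def __getitem__(self, idx: int):
        # get item by index: return one sample
        x = torch.tensor(float(idx))
        return x, x ** 2


dataset = SquaresDataset(6)
loader = DataLoader(dataset, batch_size=4, shuffle=False)

# Each batch stacks the per-item tensors returned by __getitem__.
batches = [(x.tolist(), y.tolist()) for x, y in loader]
```

With six items and a batch size of four, the loader yields one full batch and one partial batch.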

Class Hierarchy

The class hierarchy is built as follows:

torch.utils.data.Dataset
    - BaseDataset
        - BaseGriddedDataset
        - BaseGeoTIFFDataset
            - BaseASCIIGridDataset
        - BaseGriddedTimeSeriesDataset
            - SoilWaterIndexGriddedTimeSeriesDataset
        - BaseShapefileDataset
            - TopsoilDataset
        - CosmosUKDataset
    - BaseShapefileDataset
    - CorrespondingTilesDataset
    - TransformedDataset
    - TransformedInputDataset
    - TransformedInputCoordsDataset
    - TransformedInputTimeCoordsDataset
    - ZipDataset

Changing datasets using preprocessors and transforms

Each Dataset class accepts two kinds of operations, passed as arguments to the constructor:

preprocessors

A list of operations that get run when the Dataset instance is created.

  • These get run once only.

  • They run sequentially: the first receives an xr.Dataset, and each later one receives the output of the one before it.

  • The final object is saved in self.data.

  • The resulting output should be ready for use by the standard methods:

    def __len__(self): ...
    def __getitem__(self, idx): ...
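
The run-once behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the FRAME_FM implementation: PreprocessedDataset, drop_missing and rescale are hypothetical names, and a plain list stands in for the xr.Dataset that the real classes start from.

```python
from typing import Any, Callable, Sequence


class PreprocessedDataset:
    """Hypothetical stand-in for a FRAME-FM dataset; not the real BaseDataset."""

    def __init__(self, raw: Any, preprocessors: Sequence[Callable[[Any], Any]] = ()):
        data = raw
        # Run each preprocessor exactly once, in order.
        # Each one receives the output of the previous one.
        for preprocess in preprocessors:
            data = preprocess(data)
        # The final object is what __len__ and __getitem__ work from.
        self.data = data

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int) -> Any:
        return self.data[idx]


def drop_missing(values):
    """Hypothetical preprocessor: remove None entries."""
    return [v for v in values if v is not None]


def rescale(values):
    """Hypothetical preprocessor: divide by the maximum value."""
    top = max(values)
    return [v / top for v in values]


dataset = PreprocessedDataset(
    [2.0, None, 4.0, 8.0],
    preprocessors=[drop_missing, rescale],
)
```

Order matters here: rescale would fail on the None entry if it ran before drop_missing.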
    
transforms

A list of operations that get run at training time, within the __getitem__(idx) call.

  • These run every time a DataLoader fetches an item, or a batch of items, from a Dataset object.

  • These are typically run like this:

    for transform in self.transforms:
        sample = transform(sample)
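
To show where that loop sits, here is a small sketch in which transforms are applied inside __getitem__ on every access. AddOffset and TransformingDataset are hypothetical names used only for illustration; they are not classes from FRAME_FM.transforms.transforms, and a plain list stands in for the stored data.

```python
class AddOffset:
    """Hypothetical callable transform; counts how often it is applied."""

    def __init__(self, offset: float):
        self.offset = offset
        self.calls = 0

    def __call__(self, sample: float) -> float:
        self.calls += 1
        return sample + self.offset


class TransformingDataset:
    """Hypothetical stand-in for a FRAME-FM dataset; not the real BaseDataset."""

    def __init__(self, data, transforms=()):
        self.data = list(data)
        self.transforms = list(transforms)

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int):
        sample = self.data[idx]
        # Transforms run here, so they are re-applied on every access.
        for transform in self.transforms:
            sample = transform(sample)
        return sample


shift = AddOffset(10.0)
dataset = TransformingDataset([1.0, 2.0, 3.0], transforms=[shift, lambda s: s * 2])

first = dataset[0]  # (1.0 + 10.0) * 2
again = dataset[0]  # same item, transforms applied a second time
```

The stored data is left untouched, which is the practical difference from preprocessors: shift.calls grows with each access while dataset.data[0] stays at its original value.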
    

Note

The FRAME_FM.transforms.transforms module contains all the transform classes that can be used in either or both of the preprocessors and transforms lists.

See the transform examples in the unit tests: tests/transforms/test_transforms.py

See the Dataset unit tests for examples: tests/datasets/test_*.py

Submodules

FRAME_FM.datasets.ImageLabel_Dataset module

Lightweight Dataset wrapper that applies transforms to images only, preserving the (image, label) structure of torchvision datasets.

class FRAME_FM.datasets.ImageLabel_Dataset.TransformedDataset(base: Dataset, transform: Any | None = None)

Bases: Dataset

PyTorch Dataset wrapper that applies transforms to images only, preserving the (image, label) structure of torchvision datasets.
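
A wrapper of this kind might look roughly like the sketch below. It is written from the description and constructor signature alone, so the class name ImageOnlyTransform, the TinyPairs base dataset and the method bodies are all assumptions rather than the FRAME_FM source. Plain lists stand in for images to keep the example free of torchvision.

```python
from typing import Any, Callable, Optional

from torch.utils.data import Dataset


class ImageOnlyTransform(Dataset):
    """Illustrative wrapper; an assumption about behaviour, not the FRAME_FM code."""

    def __init__(self, base: Dataset, transform: Optional[Callable[[Any], Any]] = None):
        self.base = base
        self.transform = transform

    def __len__(self) -> int:
        return len(self.base)

    def __getitem__(self, idx: int):
        image, label = self.base[idx]
        # Only the image goes through the transform; the label passes through.
        if self.transform is not None:
            image = self.transform(image)
        return image, label


class TinyPairs(Dataset):
    """Hypothetical base dataset yielding (image, label) pairs."""

    def __init__(self):
        self.items = [([0, 128, 255], 0), ([64, 64, 64], 1)]

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, idx: int):
        return self.items[idx]


# Scale pixel values to [0, 1]; labels are left as they are.
wrapped = ImageOnlyTransform(TinyPairs(), transform=lambda img: [p / 255 for p in img])
image, label = wrapped[0]
```

Because the wrapper returns the same (image, label) shape as its base, it can be dropped in anywhere the base dataset was used, including as the input to a DataLoader.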

FRAME_FM.datasets.InputOnly_Dataset module

Lightweight Dataset wrappers for input-only datasets.

class FRAME_FM.datasets.InputOnly_Dataset.TransformedInputCoordsDataset(base: Dataset, transform: Any | None = None)

Bases: Dataset

This class applies the transform to the input only, for datasets that also pass grid tile coordinates.

class FRAME_FM.datasets.InputOnly_Dataset.TransformedInputDataset(base: Dataset, transform: Any | None = None)

Bases: Dataset

This class is for input-only datasets, which are useful for visual autoencoders. It currently scales the input by a scaling coefficient; this will change once the transform settings are finalized.

Module contents