h3.models package#

Submodules#

h3.models.balance_process module#

h3.models.balance_process.check_files_in_list_exist(file_list: List[str] | List[Path])[source]#

Report which files do not exist and remove them from the list
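A minimal sketch of the behaviour described above. This is a hypothetical re-implementation for illustration, not the actual h3 source; the assumption is that missing files are reported and the surviving paths returned as a new list.

```python
from pathlib import Path
from typing import List, Union


def check_files_in_list_exist(file_list: List[Union[str, Path]]) -> List[Path]:
    """Report which files do not exist and return the list without them."""
    existing = []
    for entry in file_list:
        path = Path(entry)
        if path.exists():
            existing.append(path)
        else:
            # State the missing file before dropping it from the list
            print(f"{path} does not exist; removing from list")
    return existing
```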

h3.models.balance_process.data_loader(data_dir: str, ECMWF)[source]#

Loads the NOAA weather, terrain and soil EFs from their pickle files, merges them and drops the duplicates.

Parameters:

data_dir (str) – Path to the datasets: either the Google Drive path or the local path.

Returns:

Merged dataframe from all the pickled dataframes with EFs of interest.

Return type:

pandas.DataFrame

h3.models.balance_process.drop_cols_containing_lists(df: pandas.DataFrame) pandas.DataFrame[source]#

Drop columns whose values are lists, which cannot be handled downstream. N.B. for speed, only the value in the first row of each column is inspected, so a mixed-type column whose first value is not a list will slip through.
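A sketch of the first-row heuristic described above, as a hypothetical re-implementation rather than the actual h3 source:

```python
import pandas as pd


def drop_cols_containing_lists(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns whose first-row value is a list.

    Only the first row is inspected, trading robustness for speed:
    a mixed-type column whose first value is not a list is kept.
    """
    list_cols = [col for col in df.columns if isinstance(df[col].iloc[0], list)]
    return df.drop(columns=list_cols)
```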

h3.models.balance_process.main(data_dir, ECMWF=False)[source]#

Randomly samples the merged dataframe to balance the damage classes.

Parameters:

data_dir (str) – Path to the directory containing all the pickled EF data files.

Returns:

EFs dataframe, balanced to the value count of the “destroyed” damage class.

Return type:

pandas.DataFrame
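The balancing step described above — undersampling every damage class down to the row count of the “destroyed” class — can be sketched as follows. The helper name and signature are hypothetical, for illustration only:

```python
import pandas as pd


def balance_to_reference_class(
    df: pd.DataFrame, label_col: str, reference_class, random_state: int = 0
) -> pd.DataFrame:
    """Randomly undersample every class to the row count of the reference class."""
    n = int((df[label_col] == reference_class).sum())
    return (
        df.groupby(label_col, group_keys=False)
        # Sample at most n rows per class; classes smaller than n keep all rows
        .apply(lambda g: g.sample(n=min(len(g), n), random_state=random_state))
        .reset_index(drop=True)
    )
```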

h3.models.balance_process.read_and_merge_pkls(pkl_paths: List[str] | List[Path]) pandas.DataFrame[source]#

Read in the pickle files from a list of file paths and merge them on their index
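A minimal sketch of the described behaviour, assuming each pickle holds a pandas DataFrame and the merge is a successive inner join on the index (a hypothetical re-implementation, not the actual h3 source):

```python
from pathlib import Path
from typing import List, Union

import pandas as pd


def read_and_merge_pkls(pkl_paths: List[Union[str, Path]]) -> pd.DataFrame:
    """Read each pickled DataFrame and merge them one by one on their index."""
    merged = pd.read_pickle(pkl_paths[0])
    for path in pkl_paths[1:]:
        # left_index/right_index joins on the shared index rather than a column
        merged = merged.merge(pd.read_pickle(path), left_index=True, right_index=True)
    return merged
```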

h3.models.balance_process.rename_and_drop_duplicated_cols(df: pandas.DataFrame) pandas.DataFrame[source]#

Drop columns that are copies of others and rename the “asdf_x”-style headers that the merge would have produced
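The cleanup described above can be sketched as below, under the assumption that the duplicated columns carry the default pandas merge suffixes (“_x” for the kept copy, “_y” for the duplicate). This is an illustrative re-implementation, not the actual h3 source:

```python
import pandas as pd


def rename_and_drop_duplicated_cols(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the '_y' duplicates a merge produces and strip '_x' from the kept copies."""
    # Assumes every '_y' column is a copy of its '_x' counterpart
    df = df.drop(columns=[c for c in df.columns if c.endswith("_y")])
    return df.rename(columns={c: c[:-2] for c in df.columns if c.endswith("_x")})
```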

h3.models.experiments module#

h3.models.multimodal module#

class h3.models.multimodal.ClassificationLayer(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

forward(x)[source]#
class h3.models.multimodal.GenericEncoder(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

forward(x)[source]#
class h3.models.multimodal.ImageEncoder(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

forward(x)[source]#
class h3.models.multimodal.OverallModel(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

A multimodal LightningModule that combines image embeddings with environmental factor (EF) features for damage classification or regression.

Parameters:
  • training_dataset (torch.utils.data.Dataset) – Contains the data used for training

  • validation_dataset (torch.utils.data.Dataset) – Contains the data used for validation

  • image_embedding_architecture (str) –

    Determines the image embedding architecture used. Possible values are:
    • 'ResNet18'

    • 'ViT_L_16'

    • 'Swin_V2_B'

  • num_input_channels (int) – The number of channels in the input images.

  • EF_features (dict[str, list[str]]) – A dictionary mapping each type of EF to a list of EF names, e.g. {"weather": ["precip", "wind_speed"], "soil": ["clay", "sand"]}

  • dropout_rate (float) – The dropout probability

  • image_encoder_lr (float) – The learning rate for the image encoder. If 0, then image encoder weights are frozen.

  • general_lr (float) – The learning rate for all other parts of the model.

  • batch_size (int) – The batch size used during training and validation steps.

  • weight_decay (float) – Adam weight decay (L2 penalty)

  • lr_scheduler_patience (int) – The number of epochs of validation loss plateau before lr is decreased.

  • num_image_feature_encoder_features (int) – The number of features output from the encoder that operates on the features produced by the image encoder

  • num_output_classes (int) – The number of output classes. Set to 1 for regression.

  • zoom_levels (List[str]) – A list containing the different image zoom levels.

  • class_weights (torch.FloatTensor) – A tensor containing the weights to be applied to each class in the cross-entropy loss function.

  • image_only_model (bool) – If True, the model behaves as if there were no EFs, and only the images are used to make predictions.

  • loss_function_str (str) –

    Determines the loss function used. Possible values are:
    • 'BCELoss' : Binary Cross Entropy Loss, for binary classification

    • 'CELoss' : Cross Entropy Loss, for multiclass classification

    • 'MSE' : Mean Squared Error, for regression

  • output_activation (str) –

    Determines the output activation function used. Possible values are:
    • 'sigmoid' : Sigmoid, for binary classification

    • 'softmax' : Softmax, for multiclass classification

    • 'relu' : ReLU, for regression

Attributes include image_encoder, classification, and augment.
configure_optimizers()[source]#
forward(inputs)[source]#

For each zoom level Z, computes image_Z_embedding = self.image_encoder(inputs["image_Z"], image_embedding_architecture)
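The per-zoom-level embedding loop can be sketched as a plain function; the helper name is hypothetical and the encoder is a stand-in for self.image_encoder:

```python
def embed_zoom_levels(inputs: dict, zoom_levels: list, image_encoder) -> dict:
    """Compute one embedding per zoom level, as the forward docstring describes."""
    return {
        # inputs is keyed as "image_<zoom level>", mirroring the forward pass
        f"image_{z}_embedding": image_encoder(inputs[f"image_{z}"])
        for z in zoom_levels
    }
```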

train_dataloader()[source]#
training_step(batch, *args, **kwargs)[source]#
val_dataloader()[source]#
validation_step(batch, *args, **kwargs)[source]#

h3.models.opti_utils module#

h3.models.opti_utils.load_full_ram(lst_path: list, transform: Callable) list[source]#
h3.models.opti_utils.main()[source]#
h3.models.opti_utils.open_img(path: str, transform: Callable)#

h3.models.pre_train module#

h3.models.pre_train.load_image() numpy.ndarray[source]#
h3.models.pre_train.load_model()[source]#
h3.models.pre_train.main()[source]#

h3.models.simple_models module#

h3.models.simple_models.logistic_reg(x_train, y_train, x_test, y_test)[source]#
h3.models.simple_models.main()[source]#

h3.models.vision_transformer module#

h3.models.vision_transformer.get_model(name) torch.nn.Module[source]#
h3.models.vision_transformer.get_preprocess(name) Callable[source]#