h3.models package#

Submodules#

h3.models.balance_process module#

h3.models.balance_process.check_files_in_list_exist(file_list: List[str] | List[Path])[source]#

Report which files do not exist and remove them from the list
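A minimal sketch of the behaviour described above. This is a hypothetical re-implementation for illustration, not the actual h3 source; the assumption is that missing files are reported and the surviving paths returned as a new list.

```python
from pathlib import Path
from typing import List, Union


def check_files_in_list_exist(file_list: List[Union[str, Path]]) -> List[Path]:
    """Report which files do not exist and return the list without them."""
    existing = []
    for entry in file_list:
        path = Path(entry)
        if path.exists():
            existing.append(path)
        else:
            # State the missing file before dropping it from the list
            print(f"{path} does not exist; removing from list")
    return existing
```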

h3.models.balance_process.data_loader(data_dir: str, ECMWF)[source]#

Loads the NOAA weather, terrain and soil EFs from their pickle files, merges them and drops the duplicates.

Parameters:

data_dir (str) – Path to the datasets: either the Google Drive path or the local path.

Returns:

Merged dataframe from all the pickled dataframes with EFs of interest.

Return type:

pandas.DataFrame

h3.models.balance_process.drop_cols_containing_lists(df: pandas.DataFrame) pandas.DataFrame[source]#

Drop columns whose values are lists, which cannot be handled downstream. N.B. for speed, only the value in the first row of each column is inspected, so a mixed-type column whose first value is not a list will slip through.
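A sketch of the first-row heuristic described above, as a hypothetical re-implementation rather than the actual h3 source:

```python
import pandas as pd


def drop_cols_containing_lists(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns whose first-row value is a list.

    Only the first row is inspected, trading robustness for speed:
    a mixed-type column whose first value is not a list is kept.
    """
    list_cols = [col for col in df.columns if isinstance(df[col].iloc[0], list)]
    return df.drop(columns=list_cols)
```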

h3.models.balance_process.main(data_dir, ECMWF=False)[source]#

Randomly samples the merged dataframe to balance the damage classes.

Parameters:

data_dir (str) – Path to the directory containing all the pickled EF data files.

Returns:

EFs dataframe, balanced to the value count of the “destroyed” damage class.

Return type:

pandas.DataFrame
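The balancing step described above — undersampling every damage class down to the row count of the “destroyed” class — can be sketched as follows. The helper name and signature are hypothetical, for illustration only:

```python
import pandas as pd


def balance_to_reference_class(
    df: pd.DataFrame, label_col: str, reference_class, random_state: int = 0
) -> pd.DataFrame:
    """Randomly undersample every class to the row count of the reference class."""
    n = int((df[label_col] == reference_class).sum())
    return (
        df.groupby(label_col, group_keys=False)
        # Sample at most n rows per class; classes smaller than n keep all rows
        .apply(lambda g: g.sample(n=min(len(g), n), random_state=random_state))
        .reset_index(drop=True)
    )
```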

h3.models.balance_process.read_and_merge_pkls(pkl_paths: List[str] | List[Path]) pandas.DataFrame[source]#

Read in the pickle files from a list of file paths and merge them on their index
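A minimal sketch of the described behaviour, assuming each pickle holds a pandas DataFrame and the merge is a successive inner join on the index (a hypothetical re-implementation, not the actual h3 source):

```python
from pathlib import Path
from typing import List, Union

import pandas as pd


def read_and_merge_pkls(pkl_paths: List[Union[str, Path]]) -> pd.DataFrame:
    """Read each pickled DataFrame and merge them one by one on their index."""
    merged = pd.read_pickle(pkl_paths[0])
    for path in pkl_paths[1:]:
        # left_index/right_index joins on the shared index rather than a column
        merged = merged.merge(pd.read_pickle(path), left_index=True, right_index=True)
    return merged
```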

h3.models.balance_process.rename_and_drop_duplicated_cols(df: pandas.DataFrame) pandas.DataFrame[source]#

Drop columns that are copies of others and rename the “asdf_x”-style headers that the merge would have produced
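The cleanup described above can be sketched as below, under the assumption that the duplicated columns carry the default pandas merge suffixes (“_x” for the kept copy, “_y” for the duplicate). This is an illustrative re-implementation, not the actual h3 source:

```python
import pandas as pd


def rename_and_drop_duplicated_cols(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the '_y' duplicates a merge produces and strip '_x' from the kept copies."""
    # Assumes every '_y' column is a copy of its '_x' counterpart
    df = df.drop(columns=[c for c in df.columns if c.endswith("_y")])
    return df.rename(columns={c: c[:-2] for c in df.columns if c.endswith("_x")})
```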

h3.models.experiments module#

h3.models.multimodal module#

class h3.models.multimodal.ClassificationLayer(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

forward(x)[source]#
class h3.models.multimodal.GenericEncoder(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

forward(x)[source]#
class h3.models.multimodal.ImageEncoder(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

forward(x)[source]#
class h3.models.multimodal.OverallModel(*args: Any, **kwargs: Any)[source]#

Bases: LightningModule

A multimodal LightningModule that combines image embeddings with environmental factor (EF) features for damage classification or regression.

Parameters:
  • training_dataset (torch.utils.data.Dataset) – Contains the data used for training

  • validation_dataset (torch.utils.data.Dataset) – Contains the data used for validation

  • image_embedding_architecture (str) –

    Determines the image embedding architecture used. Possible values are:
    • 'ResNet18'

    • 'ViT_L_16'

    • 'Swin_V2_B'

  • num_input_channels (int) – The number of channels in the input images.

  • EF_features (dict[str, list[str]]) – A dictionary mapping each type of EF to a list of EF names, e.g. {"weather": ["precip", "wind_speed"], "soil": ["clay", "sand"]}

  • dropout_rate (float) – The dropout probability

  • image_encoder_lr (float) – The learning rate for the image encoder. If 0, then image encoder weights are frozen.

  • general_lr (float) – The learning rate for all other parts of the model.

  • batch_size (int) – The batch size used during training and validation steps.

  • weight_decay (float) – Adam weight decay (L2 penalty)

  • lr_scheduler_patience (int) – The number of epochs of validation loss plateau before lr is decreased.

  • num_image_feature_encoder_features (int) – The number of features output from the encoder that operates on the features produced by the image encoder

  • num_output_classes (int) – The number of output classes. Set to 1 for regression.

  • zoom_levels (List[str]) – A list containing the different image zoom levels.

  • class_weights (torch.FloatTensor) – A tensor containing the weights to be applied to each class in the cross-entropy loss function.

  • image_only_model (bool) – If True, the model behaves as if there were no EFs, and only the images are used to make predictions.

  • loss_function_str (str) –

    Determines the loss function used. Possible values are:
    • 'BCELoss' : Binary Cross Entropy Loss, for binary classification

    • 'CELoss' : Cross Entropy Loss, for multiclass classification

    • 'MSE' : Mean Squared Error, for regression

  • output_activation (str) –

    Determines the output activation function used. Possible values are:
    • 'sigmoid' : Sigmoid, for binary classification

    • 'softmax' : Softmax, for multiclass classification

    • 'relu' : ReLU, for regression

Attributes include image_encoder, classification, and augment.
configure_optimizers()[source]#
forward(inputs)[source]#

For each zoom level Z, computes image_Z_embedding = self.image_encoder(inputs["image_Z"], image_embedding_architecture)
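The per-zoom-level embedding loop can be sketched as a plain function; the helper name is hypothetical and the encoder is a stand-in for self.image_encoder:

```python
def embed_zoom_levels(inputs: dict, zoom_levels: list, image_encoder) -> dict:
    """Compute one embedding per zoom level, as the forward docstring describes."""
    return {
        # inputs is keyed as "image_<zoom level>", mirroring the forward pass
        f"image_{z}_embedding": image_encoder(inputs[f"image_{z}"])
        for z in zoom_levels
    }
```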

train_dataloader()[source]#
training_step(batch, *args, **kwargs)[source]#
val_dataloader()[source]#
validation_step(batch, *args, **kwargs)[source]#

h3.models.opti_utils module#

h3.models.opti_utils.load_full_ram(lst_path: list, transform: Callable) list[source]#
h3.models.opti_utils.main()[source]#
h3.models.opti_utils.open_img(path: str, transform: Callable)#

h3.models.pre_train module#

h3.models.pre_train.load_image() numpy.ndarray[source]#
h3.models.pre_train.load_model()[source]#
h3.models.pre_train.main()[source]#

h3.models.simple_models module#

h3.models.simple_models.logistic_reg(x_train, y_train, x_test, y_test)[source]#
h3.models.simple_models.main()[source]#

h3.models.vision_transformer module#

h3.models.vision_transformer.get_model(name) torch.nn.Module[source]#
h3.models.vision_transformer.get_preprocess(name) Callable[source]#