fresco.validate package
Submodules
fresco.validate.exceptions module
Simple module to catch exception errors and print helpful messages.
fresco.validate.validate_params module
Module for sanity checking parameters for data and model building.
- class fresco.validate.validate_params.ValidateClcParams(cli_args, data_source: str = 'pre-generated')[source]
Bases: object
Class to validate model-specific parameters for MOSSAIC models.
- Parameters:
cli_args – argparse list of command-line args.
data_source (str) – Indicates where the data will come from. Should be one of:
- pre-generated: data_args.yml will indicate the source.
Post-condition: model_args dict loaded and sanity checked.
- check_data_files(data_path)[source]
Verify the necessary data files exist.
- Parameters:
data_path (str) – From argparser, optional path to dataset.
Note: Setting data_path will override the path set in model_args.yml.
- check_data_train_args()[source]
Verify arguments are appropriate for the chosen model options.
Parameters: none
Pre-condition: self.model_args is not None.
Post-condition: self.model_args['train_kwargs']['doc_max_len'] is updated from the data_kwargs.
- clc_arg_check()[source]
Check and modify HiSAN-specific args.
Parameters: none
Pre-condition: self.model_args is not None.
Post-condition: self.model_args['MTHiSAN_kwargs']['max_lines'] is set to the ceiling of doc_max_len / max_words_per_line, and self.model_args['train_kwargs']['doc_max_len'] is set to max_words_per_line * max_lines.
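The arithmetic in this post-condition can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name align_doc_length is hypothetical, and only the two formulas (ceiling division, then multiplication) come from the description above.

```python
import math


def align_doc_length(doc_max_len: int, max_words_per_line: int) -> tuple[int, int]:
    """Return (max_lines, aligned_doc_max_len) following the documented formulas.

    Hypothetical helper for illustration; it is not part of fresco.
    """
    if doc_max_len <= 0 or max_words_per_line <= 0:
        raise ValueError("doc_max_len and max_words_per_line must be positive")
    # max_lines is the ceiling of doc_max_len / max_words_per_line.
    max_lines = math.ceil(doc_max_len / max_words_per_line)
    # doc_max_len is then reset to max_words_per_line * max_lines.
    return max_lines, max_words_per_line * max_lines


# An exact multiple is left unchanged.
print(align_doc_length(3000, 15))  # (200, 3000)
# Anything else is rounded up to the next full line.
print(align_doc_length(3001, 15))  # (201, 3015)
```

The effect is that doc_max_len always ends up as a whole multiple of max_words_per_line, so it can only stay the same or grow.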
- class fresco.validate.validate_params.ValidateParams(cli_args, data_source: str = 'pre-generated', model_args: dict | None = None)[source]
Bases: object
Class to validate model-specific parameters for MOSSAIC models.
- Parameters:
cli_args – argparse list of command-line args.
data_source (str) – Indicates where the data will come from. Should be one of:
- pre-generated: data_args.yml will indicate the source.
Post-condition: model_args dict loaded and sanity checked.
- check_data_files(data_path=None)[source]
Verify the necessary data files exist.
- Parameters:
data_path (str) – From argparser, optional path to dataset.
Note: Setting data_path will override the path set in model_args.yml.
- check_data_train_args(from_pretrained=False)[source]
Verify arguments are appropriate for the chosen model options.
- Parameters:
from_pretrained (bool) – Whether the model args being checked come from a pretrained model. Pretrained model args are different: some are copied from data_kwargs to train_kwargs.
Pre-condition: self.model_args is not None.
Post-condition: self.model_args['train_kwargs']['doc_max_len'] is updated from the data_kwargs, and 'max_lines' is added to the HiSAN kwargs.
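The first half of this contract, copying doc_max_len into train_kwargs, might look like the sketch below. It is only an illustration of the pre- and post-conditions: the function name sync_doc_max_len is hypothetical, and it assumes data_kwargs stores the value under a key that is also called doc_max_len.

```python
def sync_doc_max_len(model_args: dict) -> dict:
    """Copy doc_max_len from data_kwargs into train_kwargs, in place.

    Hypothetical helper; the key name inside data_kwargs is an assumption.
    """
    # Pre-condition from the docs: model_args must already be loaded.
    if model_args is None:
        raise ValueError("model_args must be loaded before validation")
    doc_max_len = model_args["data_kwargs"]["doc_max_len"]
    # Post-condition from the docs: train_kwargs reflects the data setting.
    model_args.setdefault("train_kwargs", {})["doc_max_len"] = doc_max_len
    return model_args


args = {"data_kwargs": {"doc_max_len": 3000}, "train_kwargs": {"doc_max_len": 1500}}
sync_doc_max_len(args)
print(args["train_kwargs"]["doc_max_len"])  # 3000
```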
- hisan_arg_check()[source]
Check and modify HiSAN-specific args.
Parameters: none
Pre-condition: self.model_args is not None.
Post-condition: self.model_args['MTHiSAN_kwargs']['max_lines'] is set to the ceiling of doc_max_len / max_words_per_line, and self.model_args['train_kwargs']['doc_max_len'] is set to max_words_per_line * max_lines.