earthdiagnostics

earthdiagnostics.box

Module to manage 3D space restrictions

class earthdiagnostics.box.Box(depth_in_meters=False)[source]

Bases: object

Represents a box in the 3D space.

Also allows easy conversion from the coordinate values to meaningful string representations

Parameters:depth_in_meters (bool, optional) – If True, depth is given in meters. If False, it corresponds to levels
depth_in_meters = None

If True, the depth is treated as given in meters. If False, as given in levels

Return type:bool

get_depth_str()[source]

Get a string representation of depth.

For depth expressed in meters, it adds the character ‘m’ to the end. If min_depth is different from max_depth, it concatenates the two values

Returns:string representation for depth
Return type:str
get_lat_str()[source]

Get a string representation of the latitude in the format XX{N/S}.

If min_lat is different from max_lat, it concatenates the two values

Returns:string representation for latitude
Return type:str
get_lon_str()[source]

Get a string representation of the longitude in the format XX{E/W}.

If min_lon is different from max_lon, it concatenates the two values

Returns:string representation for longitude
Return type:str
max_depth = None

Maximum depth

Return type:float

max_lat

Maximum latitude

Return type:float
max_lon

Maximum longitude

Return type:float
min_depth = None

Minimum depth

Return type:float

min_lat

Minimum latitude

Return type:float
min_lon

Minimum longitude

Return type:float
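The string conventions described for Box can be illustrated with a standalone sketch that does not use the library. The helper names (coord_str, lat_str, depth_str) and the way the two limits are joined are assumptions made for this example; the strings actually produced by Box may differ.

```python
def coord_str(value, positive, negative):
    """Format a coordinate as an absolute value plus a hemisphere letter."""
    suffix = negative if value < 0 else positive
    return "{0:g}{1}".format(abs(value), suffix)


def lat_str(min_lat, max_lat):
    """Return XX{N/S}, joining both limits when they differ."""
    if min_lat == max_lat:
        return coord_str(min_lat, "N", "S")
    # Assumption: the two values are joined without a separator.
    return coord_str(min_lat, "N", "S") + coord_str(max_lat, "N", "S")


def depth_str(min_depth, max_depth, depth_in_meters=False):
    """Return the depth limits, appending 'm' when they are given in meters."""
    suffix = "m" if depth_in_meters else ""
    if min_depth == max_depth:
        return "{0:g}{1}".format(min_depth, suffix)
    # Assumption: a hyphen separates the two limits.
    return "{0:g}-{1:g}{2}".format(min_depth, max_depth, suffix)


print(lat_str(-30, 30))
print(depth_str(0, 100, depth_in_meters=True))
```

A longitude helper would follow the same pattern with "E" and "W" as suffixes.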

earthdiagnostics.cdftools

CDFTOOLS interface

class earthdiagnostics.cdftools.CDFTools(path='')[source]

Bases: object

Class to run CDFTools executables

Parameters:path (str) – path to CDFTOOLS binaries
run(command, input_file, output_file=None, options=None, log_level=20, input_option=None)[source]

Run one of the CDFTools

Parameters:
  • command (str | iterable) – executable to run
  • input_file (str) – input file
  • output_file – output file. Not all tools support this parameter
  • options (str | [str] | Tuple[str] | None) – options for the tool.
  • log_level (int) – log level at which the output of the cdftool command will be added
  • input_option (str) – option to add before input file
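The parameters above suggest how a command line might be assembled before it is executed. The following is only a sketch under stated assumptions: build_command_line is a hypothetical helper, "sometool" and the flags are placeholders, and the argument order (input option, input file, options, output file) is a guess rather than the behaviour of CDFTools.run.

```python
def build_command_line(command, input_file, output_file=None, options=None,
                       input_option=None):
    """Assemble an argument list for a command-line tool (illustrative only)."""
    # The command may be one executable name or an iterable of arguments.
    line = [command] if isinstance(command, str) else list(command)
    if input_option:
        line.append(input_option)
    if input_file:
        line.append(input_file)
    # Options may be a string, a list or tuple of strings, or None.
    if options:
        if isinstance(options, str):
            line.extend(options.split())
        else:
            line.extend(str(option) for option in options)
    if output_file:
        # Assumption: the output file is the last positional argument.
        line.append(output_file)
    return line


print(build_command_line("sometool", "in.nc", "out.nc", "-a -b", "-f"))
```

Normalising the options to a flat list of strings means the result can be passed to subprocess.run without invoking a shell.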

earthdiagnostics.cmorizer

earthdiagnostics.cmormanager

earthdiagnostics.config

Classes to manage Earth Diagnostics configuration

class earthdiagnostics.config.CMORConfig(parser, var_manager)[source]

Bases: object

Configuration for the cmorization processes

Parameters:
any_required(variables)[source]

Check if any of the given variables is needed for cmorization

Parameters:variables (iterable of str) –
Returns:
Return type:bool
chunk_cmorization_requested(chunk)[source]

Check if the cmorization of a given chunk is required

Parameters:chunk (int) –
Returns:
Return type:bool
cmorize(var_cmor)[source]

Check if var_cmor is on variable list

Parameters:var_cmor (Variable) –
get_levels(frequency, variable)[source]

Get the levels to extract for a given variable

Parameters:
Returns:

Return type:

iterable of int

get_requested_codes()[source]

Get all the codes to be extracted from the grib files

Returns:
Return type:set of int
get_variables(frequency)[source]

Get the variables to get from the grib file for a given frequency

Parameters:frequency (Frequency) –
Returns:
Return type:str
Raises:ValueError – If the frequency passed is not supported
class earthdiagnostics.config.Config[source]

Bases: object

Class to read and manage the configuration

auto_clean = None

If True, the scratch dir is removed after finishing

cdftools_path = None

Path to CDFTOOLS executables

cmor = None

CMOR related configuration

Returns:
Return type:CMORConfig
con_files = None

Mask and meshes folder path

data_adaptor = None

Scratch folder path

data_convention = None

Data convention to use

data_dir = None

Root data folder path

data_type = None

Data type (experiment, observation or reconstruction)

experiment = None

Configuration related to the experiment

Returns:
Return type:ExperimentConfig
frequency = None

Default data frequency to be used by the diagnostics

get_commands()[source]

Return the list of commands after replacing the alias

Returns:
Return type:iterable of str
mask_regions = None

Custom mask regions file to use

mask_regions_3d = None

Custom mask regions 3D file to use

max_cores = None

Maximum number of cores to use

mesh_mask = None

Custom mesh mask file to use

new_mask_glo = None

Custom new mask glo file to use

parallel_downloads = None

Maximum number of simultaneous downloads

parallel_uploads = None

Maximum number of simultaneous uploads

parse(path)[source]

Read configuration from INI file

Parameters:path (str) –
report = None

Reporting configuration

Returns:
Return type:ReportConfig
restore_meshes = None

If True, forces the tool to copy all the mesh and mask files for the model, regardless of existence

scratch_dir = None

Scratch folder path

scratch_masks = None

Common scratch folder for masks

skip_diags_done = None

Flag to control if already done diags must be recalculated

thredds = None

THREDDS server configuration

Returns:
Return type:THREDDSConfig
use_ramdisk = None

If True, the scratch dir is created as a ram disk

exception earthdiagnostics.config.ConfigException[source]

Bases: exceptions.Exception

Exception raised when there is a problem with the configuration

class earthdiagnostics.config.ExperimentConfig[source]

Bases: object

Configuration related to the experiment

get_chunk_end(startdate, chunk)[source]

Get chunk’s last day

Parameters:
Returns:

Return type:

datetime.datetime

get_chunk_end_str(startdate, chunk)[source]

Get chunk’s last day as a string

Parameters:
Returns:

Return type:

str

See also

get_chunk_end()

get_chunk_list()[source]

Return a list with all the chunks

Returns:List containing tuples of startdate, member and chunk
Return type:tuple[str, int, int]
get_chunk_start(startdate, chunk)[source]

Get chunk’s first day

Parameters:
Returns:

Return type:

datetime.datetime

get_chunk_start_str(startdate, chunk)[source]

Get chunk’s first day string representation

Parameters:
Returns:

Return type:

str

get_full_years(startdate)[source]

Return the list of full years that are in the given startdate

Parameters:startdate (str) – startdate to use
Returns:list of full years
Return type:list[int]
get_member_list()[source]

Return a list with all the members

Returns:List containing tuples of startdate and member
Return type:tuple[str, int, int]
get_member_str(member)[source]

Return the member name for a given member number.

Parameters:member (int) – member’s number
Returns:member’s name
Return type:str
get_year_chunks(startdate, year)[source]

Get the list of chunks containing timesteps from the given year

Parameters:
  • startdate (str) – startdate to use
  • year (int) – reference year
Returns:

list of chunks containing data from the given year

Return type:

list[int]
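How chunks map to calendar years can be sketched as follows. This is not the library's implementation: year_chunks is a hypothetical function, and it assumes that the startdate is a YYYYMMDD string, that chunks are numbered from 1, that every chunk lasts chunk_size_months months, and that chunks begin on the first day of a month.

```python
def year_chunks(startdate, year, chunk_size_months, num_chunks):
    """List the 1-based chunks containing at least one month of ``year``."""
    # Work with absolute month indexes: year * 12 + (month - 1).
    start = int(startdate[0:4]) * 12 + int(startdate[4:6]) - 1
    year_first, year_end = year * 12, (year + 1) * 12
    chunks = []
    for chunk in range(1, num_chunks + 1):
        chunk_first = start + (chunk - 1) * chunk_size_months
        chunk_end = chunk_first + chunk_size_months  # exclusive limit
        # Two half-open intervals overlap when each starts before the other ends.
        if chunk_first < year_end and chunk_end > year_first:
            chunks.append(chunk)
    return chunks


print(year_chunks("19901101", 1991, chunk_size_months=4, num_chunks=6))
```

With a startdate of November 1990 and four-month chunks, the first chunk already reaches into 1991, so it is included even though it starts in 1990.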

parse_ini(parser)[source]

Parse experiment section from INI-like file

Parameters:parser (ConfigParser) –
class earthdiagnostics.config.ReportConfig(parser)[source]

Bases: object

Configuration for the reporting feature

Parameters:parser (ConfigParser) –
class earthdiagnostics.config.THREDDSConfig(parser)[source]

Bases: object

Configuration related to the THREDDS server

Parameters:parser (ConfigParser) –

earthdiagnostics.constants

Contains the enumeration-like classes used by the diagnostics

class earthdiagnostics.constants.Basin(name)[source]

Bases: object

Class representing a given basin

Parameters:name (str) – full basin’s name
name

Basin full name

Return type:str
class earthdiagnostics.constants.Basins[source]

Bases: object

Singleton class to manage available basins

get_available_basins(handler)[source]

Read available basins from file

Parameters:handler (netCDF4.Dataset) –
parse(basin)[source]

Return the basin matching the given name.

If the parameter basin is a Basin instance, directly returns the same instance. This behaviour is intended to facilitate the development of methods that can accept either a name or a Basin instance to characterize the basin.

Parameters:basin (str | Basin) – basin name or basin instance
Returns:basin instance corresponding to the basin name
Return type:Basin
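The "name or instance" convention described for parse can be sketched with stand-in classes. SimpleBasin and BasinRegistry are hypothetical and exist only for this example; the case-insensitive lookup and the None result for unknown names are assumptions, not documented behaviour of Basins.

```python
class SimpleBasin(object):
    """Minimal stand-in for a basin, identified by its full name."""

    def __init__(self, name):
        self.name = name


class BasinRegistry(object):
    """Hypothetical registry; not the library's Basins class."""

    def __init__(self, names):
        # Assumption: lookups ignore case, so names are indexed in lower case.
        self._basins = {name.lower(): SimpleBasin(name) for name in names}

    def parse(self, basin):
        # An instance is returned untouched, so callers can pass either form.
        if isinstance(basin, SimpleBasin):
            return basin
        return self._basins.get(basin.lower())


registry = BasinRegistry(["Global", "Atlantic"])
atlantic = registry.parse("atlantic")
print(atlantic.name)
```

A method that takes a basin argument can then call parse on it once and work with an instance from that point on.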
class earthdiagnostics.constants.Models[source]

Bases: object

Predefined models

ECEARTH_2_3_O1L42 = 'Ec2.3_O1L42'

EC-Earth 2.3 ORCA1 L42

ECEARTH_3_0_O1L46 = 'Ec3.0_O1L46'

EC-Earth 3 ORCA1 L46

ECEARTH_3_0_O25L46 = 'Ec3.0_O25L46'

EC-Earth 3 ORCA0.25 L46

ECEARTH_3_0_O25L75 = 'Ec3.0_O25L75'

EC-Earth 3 ORCA0.25 L75

ECEARTH_3_1_O25L75 = 'Ec3.1_O25L75'

EC-Earth 3.1 ORCA0.25 L75

ECEARTH_3_2_O1L75 = 'Ec3.2_O1L75'

EC-Earth 3.2 ORCA1 L75

ECEARTH_3_2_O25L75 = 'Ec3.2_O25L75'

EC-Earth 3.2 ORCA0.25 L75

GLORYS2_V1_O25L75 = 'glorys2v1_O25L75'

GLORYS2v1 ORCA0.25 L75

NEMOVAR_O1L42 = 'nemovar_O1L42'

NEMOVAR ORCA1 L42

NEMO_3_2_O1L42 = 'N3.2_O1L42'

NEMO 3.2 ORCA1 L42

NEMO_3_3_O1L46 = 'N3.3_O1L46'

NEMO 3.3 ORCA1 L46

NEMO_3_6_O1L46 = 'N3.6_O1L75'

NEMO 3.6 ORCA1 L75

earthdiagnostics.datafile

Module for classes to manage storage manipulation

class earthdiagnostics.datafile.DataFile[source]

Bases: earthdiagnostics.publisher.Publisher

Represent a data file

Must be derived for each concrete data file format

add_cmorization_history()[source]

Add the history line corresponding to the cmorization to the local file

add_diagnostic_history()[source]

Add the history line corresponding to the diagnostic to the local file

add_modifier(diagnostic)[source]

Register a diagnostic as a modifier of this data

A modifier diagnostic is a diagnostic that reads this data and changes it in any way. The diagnostic must be a modifier even if it only affects the metadata

Parameters:diagnostic (Diagnostic) –
clean_local()[source]

Check if a local file is still needed and remove it if not

Create a link from the original data in the <frequency>_<var_type> folder

dispatch(*args)

Notify all the subscribers of an update

Parameters:args – arguments to pass
download()[source]

Get data from remote storage to the local one

Must be overridden by the derived classes

Raises:NotImplementedError – If the derived classes do not override this
download_required()[source]

Get if a download is required for this file

Returns:
Return type:bool
classmethod from_storage(filepath, data_convention)[source]

Create a new datafile to be downloaded from the storage

has_modifiers()[source]

Check if it has registered modifiers

Returns:
Return type:bool
local_status

Get local storage status

only_suscriber(who)

Check if an object is the sole subscriber of this publisher

Parameters:who (object) –
Returns:
Return type:bool
prepare_to_upload(rename_var)[source]

Prepare a local file to be uploaded

This includes renaming the variable if necessary, updating the metadata, adding the history and managing the possibility of multiple regions

ready_to_run(diagnostic)[source]

Check if the data is ready to run for a given diagnostic

To be ready to run, the datafile should be in the local storage and no modifiers can be pending.

Parameters:diagnostic (Diagnostic) –
Returns:
Return type:bool
set_local_file(local_file, diagnostic=None, rename_var='', region=None)[source]

Set the local file generated by EarthDiagnostics

This also prepares it for the upload

Parameters:
Returns:

Return type:

None

size

File size

storage_status

Get remote storage status

subscribe(who, callback=None)

Add a subscriber to the current publisher

Parameters:
  • who (object) – Subscriber to add
  • callback (callable or None, optional) – Callback to call
suscribers

List of subscribers of this publisher

classmethod to_storage(remote_file, data_convention)[source]

Create a new datafile object for a file that is going to be generated and stored

unsubscribe(who)

Remove a subscriber from the current publisher

Parameters:who (object) – subscriber to remove
upload()[source]

Send a local file to the storage

upload_required()[source]

Get if an upload is needed for this file

Returns:
Return type:bool
class earthdiagnostics.datafile.LocalStatus[source]

Bases: object

Local file status enumeration

class earthdiagnostics.datafile.NetCDFFile[source]

Bases: earthdiagnostics.datafile.DataFile

Implementation of DataFile for netCDF files

add_cmorization_history()

Add the history line corresponding to the cmorization to the local file

add_diagnostic_history()

Add the history line corresponding to the diagnostic to the local file

add_modifier(diagnostic)

Register a diagnostic as a modifier of this data

A modifier diagnostic is a diagnostic that reads this data and changes it in any way. The diagnostic must be a modifier even if it only affects the metadata

Parameters:diagnostic (Diagnostic) –
clean_local()

Check if a local file is still needed and remove it if not

Create a link from the original data in the <frequency>_<var_type> folder

dispatch(*args)

Notify all the subscribers of an update

Parameters:args – arguments to pass
download()[source]

Get data from remote storage to the local one

download_required()

Get if a download is required for this file

Returns:
Return type:bool
classmethod from_storage(filepath, data_convention)

Create a new datafile to be downloaded from the storage

has_modifiers()

Check if it has registered modifiers

Returns:
Return type:bool
local_status

Get local storage status

only_suscriber(who)

Check if an object is the sole subscriber of this publisher

Parameters:who (object) –
Returns:
Return type:bool
prepare_to_upload(rename_var)

Prepare a local file to be uploaded

This includes renaming the variable if necessary, updating the metadata, adding the history and managing the possibility of multiple regions

ready_to_run(diagnostic)

Check if the data is ready to run for a given diagnostic

To be ready to run, the datafile should be in the local storage and no modifiers can be pending.

Parameters:diagnostic (Diagnostic) –
Returns:
Return type:bool
set_local_file(local_file, diagnostic=None, rename_var='', region=None)

Set the local file generated by EarthDiagnostics

This also prepares it for the upload

Parameters:
Returns:

Return type:

None

size

File size

storage_status

Get remote storage status

subscribe(who, callback=None)

Add a subscriber to the current publisher

Parameters:
  • who (object) – Subscriber to add
  • callback (callable or None, optional) – Callback to call
suscribers

List of subscribers of this publisher

classmethod to_storage(remote_file, data_convention)

Create a new datafile object for a file that is going to be generated and stored

unsubscribe(who)

Remove a subscriber from the current publisher

Parameters:who (object) – subscriber to remove
upload()

Send a local file to the storage

upload_required()

Get if an upload is needed for this file

Returns:
Return type:bool
class earthdiagnostics.datafile.StorageStatus[source]

Bases: object

Remote file status enumeration

class earthdiagnostics.datafile.UnitConversion(source, destiny, factor, offset)[source]

Bases: object

Class to manage unit conversions

Parameters:
classmethod add_conversion(conversion)[source]

Add a conversion to the dictionary

Parameters:conversion (UnitConversion) – conversion to add
classmethod get_conversion_factor_offset(input_units, output_units)[source]

Get the conversion factor and offset for two units.

The conversion has to be done in the following way: converted = original * factor + offset

Parameters:
  • input_units (str) – original units
  • output_units (str) – destiny units
Returns:

factor and offset

Return type:

[float, float]
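The conversion rule above is a plain linear transformation, as the following standalone sketch shows. The convert function and the CONVERSIONS table are hypothetical and are not part of UnitConversion; the two entries are ordinary physical conversions chosen for illustration.

```python
def convert(original, factor, offset):
    """Apply the documented rule: converted = original * factor + offset."""
    return original * factor + offset


# Hypothetical conversion table keyed by (input_units, output_units).
CONVERSIONS = {
    ("C", "K"): (1.0, 273.15),   # degrees Celsius to kelvin
    ("m", "km"): (0.001, 0.0),   # meters to kilometers
}

factor, offset = CONVERSIONS[("C", "K")]
kelvin = convert(25.0, factor, offset)
print(kelvin)
```

Because factor and offset are plain numbers, the same expression also works element-wise when original is a NumPy array.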

classmethod load_conversions()[source]

Load conversions from the configuration file

earthdiagnostics.datamanager

Base data manager for Earth diagnostics

class earthdiagnostics.datamanager.DataManager(config)[source]

Bases: object

Class to manage the data repositories

Parameters:config (Config) –
declare_chunk(domain, var, startdate, member, chunk, grid=None, region=None, box=None, frequency=None, vartype=1, diagnostic=None)[source]

Declare a variable chunk to be generated by a diagnostic

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

declare_year(domain, var, startdate, member, year, grid=None, box=None, vartype=1, diagnostic=None)[source]

Declare a variable year to be generated by a diagnostic

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

file_exists(domain, var, startdate, member, chunk, grid=None, box=None, frequency=None, vartype=1, possible_versions=None)[source]

Check if a file exists in the storage

Parameters:
  • domain (ModelingRealm) –
  • var (str) –
  • startdate (str) –
  • member (int) –
  • chunk (int) –
  • grid (str or None, optional) –
  • box (Box or None, optional) –
  • frequency (Frequency or None, optional) –
  • vartype (VariableType, optional) –
  • possible_versions (iterable of str or None, optional) –
Raises:

NotImplementedError – If not implemented by derived classes

Returns:

Return type:

bool

Create the link of a given file from the CMOR repository.

Parameters:
  • cmor_var
  • move_old
  • date_str
  • year (int) – if frequency is yearly, this parameter is used to give the corresponding year
  • domain (Domain) – CMOR domain
  • var (str) – variable name
  • startdate (str) – file’s startdate
  • member (int) – file’s member
  • chunk (int) – file’s chunk
  • grid (str) – file’s grid (only needed if it is not the original)
  • frequency (str) – file’s frequency (only needed if it is different from the default)
  • vartype (VariableType) – Variable type (mean, statistic)
Returns:

path to the copy created on the scratch folder

Return type:

str

prepare()[source]

Prepare the data to be used by Earth Diagnostics

request_chunk(domain, var, startdate, member, chunk, grid=None, box=None, frequency=None, vartype=None)[source]

Request a given file from the CMOR repository to the scratch folder and return the path to the scratch’s copy

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

request_year(diagnostic, domain, var, startdate, member, year, grid=None, box=None, frequency=None)[source]

Request a given year for a variable from a CMOR repository

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

earthdiagnostics.diagnostic

This module contains the Diagnostic base class and all the classes for parsing the options passed to them

class earthdiagnostics.diagnostic.Diagnostic(data_manager)[source]

Bases: earthdiagnostics.publisher.Publisher

Base class for the diagnostics.

Provides a common interface for them and also has a mechanism that allows diagnostic retrieval by name.

Parameters:data_manager (DataManager) – data manager that will be used to store and retrieve the necessary data
add_subjob(subjob)[source]

Add a subjob

Add a diagnostic that must be run before the current one

Parameters:subjob (Diagnostic) –
alias = None

Alias to call the diagnostic. Must be overridden in the derived classes

all_requests_in_storage()[source]

Check if all the data requested is in the local scratch

Returns:
Return type:bool
can_skip_run()[source]

Check if a diagnostic calculation can be skipped

Checks whether the data to be generated is already there and is not going to be modified

Returns:
Return type:bool
check_is_ready()[source]

Check if a diagnostic is ready to run and change its status accordingly

compute()[source]

Calculate the diagnostic and store the output

Must be implemented by derived classes

declare_chunk(domain, var, startdate, member, chunk, grid=None, region=None, box=None, frequency=None, vartype=1)[source]

Declare a chunk that is going to be generated by the diagnostic

Parameters:
Returns:

Return type:

DataFile

declare_data_generated()[source]

Declare the data to be generated by the diagnostic

Must be implemented by derived classes

declare_year(domain, var, startdate, member, year, grid=None, box=None, vartype=1)[source]

Declare a year that is going to be generated by the diagnostic

Parameters:
Returns:

Return type:

DataFile

dispatch(*args)

Notify all the subscribers of an update

Parameters:args – arguments to pass
classmethod generate_jobs(diags, options)[source]

Generate the instances of the diagnostics that will be run by the manager

Must be implemented by derived classes.

Parameters:
  • diags (Diags) –
  • options (list of str) –
Returns:

Return type:

list of Diagnostic

static get_diagnostic(name)[source]

Return the class for a diagnostic given its name

Parameters:name (str) –
Returns:
Return type:Type[Diagnostic] or None
only_suscriber(who)

Check if an object is the sole subscriber of this publisher

Parameters:who (object) –
Returns:
Return type:bool
pending_requests()[source]

Get the number of data requests pending to be fulfilled

Returns:
Return type:int
classmethod process_options(options, options_available)[source]

Process the configuration of a diagnostic

Parameters:
  • options (iterable of str) –
  • options_available (iterable of DiagnosticOption) –
Returns:

Dictionary of names and values for the options

Return type:

dict of str

Raises:

DiagnosticOptionError – If there are more options than admitted for the diagnostic

static register(diagnostic_class)[source]

Register a new diagnostic using the given alias.

It must be called using the derived class.

Parameters:diagnostic_class (Type[Diagnostic]) –
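Retrieval by alias, as provided by register and get_diagnostic, can be sketched with a small registry. BaseDiagnostic, its _registry dictionary and the MonthlyMean class with alias "monmean" are all hypothetical stand-ins; the validation shown here is an assumption and not necessarily what the library checks.

```python
class BaseDiagnostic(object):
    """Stand-in base class with an alias-based registry (illustrative only)."""

    alias = None
    _registry = {}  # hypothetical storage: alias -> class

    @staticmethod
    def register(diagnostic_class):
        if not issubclass(diagnostic_class, BaseDiagnostic):
            raise ValueError("Class does not derive from BaseDiagnostic")
        if diagnostic_class.alias is None:
            raise ValueError("Diagnostic class must define an alias")
        BaseDiagnostic._registry[diagnostic_class.alias] = diagnostic_class

    @staticmethod
    def get_diagnostic(name):
        # Returns None when no class was registered under that alias.
        return BaseDiagnostic._registry.get(name)


class MonthlyMean(BaseDiagnostic):
    """Hypothetical diagnostic used only for this example."""

    alias = "monmean"


BaseDiagnostic.register(MonthlyMean)
print(BaseDiagnostic.get_diagnostic("monmean").__name__)
```

Keeping the registry on the base class lets a configuration file refer to diagnostics by alias without importing each derived class by name.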
request_chunk(domain, var, startdate, member, chunk, grid=None, box=None, frequency=None, to_modify=False, vartype=1)[source]

Request one chunk of data required by the diagnostic

Parameters:
  • domain (ModelingRealm) –
  • var (str) –
  • startdate (str or None) –
  • member (int or None) –
  • chunk (int or None) –
  • grid (str or None) –
  • box (Box or None) –
  • frequency (Frequency or str or None) –
  • to_modify (bool) – Flag that must be active if the diagnostic is going to generate a modified version of this data. In this case this data must not be declared as an output of the diagnostic
  • vartype (VariableType) –
Returns:

Return type:

DataFile

request_data()[source]

Request the data required by the diagnostic

Must be implemented by derived classes

request_year(domain, var, startdate, member, year, grid=None, box=None, frequency=None, to_modify=False)[source]

Request one year of data that is required for the diagnostic

Parameters:
  • domain (ModelingRealm) –
  • var (str) –
  • startdate (str) –
  • member (int) –
  • year (int) –
  • grid (str) –
  • box (Box) –
  • frequency (Frequency) –
  • to_modify (str) –
Returns:

Return type:

DataFile

status

Execution status

subscribe(who, callback=None)

Add a subscriber to the current publisher

Parameters:
  • who (object) – Subscriber to add
  • callback (callable or None, optional) – Callback to call
suscribers

List of subscribers of this publisher

unsubscribe(who)

Remove a subscriber from the current publisher

Parameters:who (object) – subscriber to remove
class earthdiagnostics.diagnostic.DiagnosticBasinListOption(name, default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse list of basins options

parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:Basin
class earthdiagnostics.diagnostic.DiagnosticBasinOption(name, default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse basin options

parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:Basin
class earthdiagnostics.diagnostic.DiagnosticBoolOption(name, default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse boolean options

parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:bool
class earthdiagnostics.diagnostic.DiagnosticChoiceOption(name, choices, default_value=None, ignore_case=True)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse choice option

Parameters:
  • name (str) –
  • choices (list of str) – Valid options for the option
  • default_value (str, optional) – If not None, it should be a valid choice
  • ignore_case (bool, optional) – If false, value must match case of the valid choice
parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:str
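The effect of ignore_case can be illustrated with a standalone function. parse_choice is a hypothetical helper, not the library's parser; it raises ValueError where the library presumably raises DiagnosticOptionError, and returning the choice as spelled in the choices list is an assumption.

```python
def parse_choice(option_value, choices, ignore_case=True):
    """Return the valid choice matching option_value, or raise ValueError."""
    for choice in choices:
        if option_value == choice:
            return choice
        # Assumption: a case-insensitive match returns the canonical spelling.
        if ignore_case and option_value.lower() == choice.lower():
            return choice
    raise ValueError("Value {0!r} is not a valid choice".format(option_value))


print(parse_choice("MEAN", ["mean", "sum"]))
```

With ignore_case=False the same call is rejected, because "MEAN" does not match the case of any valid choice.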
class earthdiagnostics.diagnostic.DiagnosticComplexStrOption(name, default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse complex string options

It replaces ‘&;’ with ‘,’ and ‘&.’ with ‘ ’

parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:str
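The two replacements described for complex string options amount to a pair of str.replace calls. parse_complex_str below is a hypothetical standalone function written for illustration; only the two escape sequences come from the description above.

```python
def parse_complex_str(option_value):
    """Undo the escape sequences documented for complex string options."""
    return option_value.replace("&;", ",").replace("&.", " ")


parsed = parse_complex_str("first&;&.second&.part")
print(parsed)
```

The escapes presumably exist because commas and spaces already act as separators in the diagnostic configuration, so a value containing them has to be written in escaped form.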
class earthdiagnostics.diagnostic.DiagnosticDomainOption(name='domain', default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse domain options

Parameters:
  • name (str, optional) –
  • default_value (str, optional) –
parse(option_value)[source]

Parse option value

Returns:
Return type:ModelingRealm
class earthdiagnostics.diagnostic.DiagnosticFloatOption(name, default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class for parsing float options

parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:float
class earthdiagnostics.diagnostic.DiagnosticFrequencyOption(name='frequency', default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse frequency options

Parameters:
  • name (str, optional) –
  • default_value (Frequency,optional) –
parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:Frequency
class earthdiagnostics.diagnostic.DiagnosticIntOption(name, default_value=None, min_limit=None, max_limit=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class for parsing integer options

Parameters:
  • name (str) –
  • default_value (int, optional) –
  • min_limit (int, optional) – If set, any value below this will not be accepted
  • max_limit (int, optional) – If set, any value over this will not be accepted
parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:int
Raises:DiagnosticOptionError – If the parsed value is outside the limits
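The limit check can be sketched as follows. parse_int is a hypothetical standalone function, and ValueError stands in for DiagnosticOptionError; treating the limits as inclusive is an assumption based on the wording "below" and "over".

```python
def parse_int(option_value, min_limit=None, max_limit=None):
    """Convert option_value to int and enforce the optional limits."""
    value = int(option_value)
    # Assumption: the limits themselves are accepted values.
    if min_limit is not None and value < min_limit:
        raise ValueError("Value {0} is below the minimum {1}".format(value, min_limit))
    if max_limit is not None and value > max_limit:
        raise ValueError("Value {0} is over the maximum {1}".format(value, max_limit))
    return value


print(parse_int("12", min_limit=1, max_limit=12))
```

Comparing against None explicitly matters here, because a limit of 0 is a valid setting and would be skipped by a plain truthiness test.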
class earthdiagnostics.diagnostic.DiagnosticListFrequenciesOption(name, default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class for parsing an option which is a list of frequencies

Parameters:
  • name (str) –
  • default_value (list, optional) –
parse(option_value)[source]

Parse option value

Returns:
Return type:List of Frequency
class earthdiagnostics.diagnostic.DiagnosticListIntOption(name, default_value=None, min_limit=None, max_limit=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticIntOption

Class for parsing integer list options

Parameters:
  • name (str) –
  • default_value (list, optional) –
  • min_limit (int, optional) – If set, any value below this will not be accepted
  • max_limit (int, optional) – If set, any value over this will not be accepted
max_limit = None

Upper limit

min_limit = None

Lower limit

parse(option_value)[source]

Parse option value

Parameters:option_value (str) –
Returns:
Return type:list(int)
Raises:DiagnosticOptionError – If any parsed value is outside the limits
class earthdiagnostics.diagnostic.DiagnosticOption(name, default_value=None)[source]

Bases: object

Class to manage string options for the diagnostic

parse(option_value)[source]

Get the final value for the option

If option_value is empty, return default_value

Parameters:option_value (str) –
Returns:
Return type:str
Raises:DiagnosticOptionError – If the option is empty and default_value is False
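The fallback to default_value can be sketched with a standalone function. parse_option is hypothetical, ValueError stands in for DiagnosticOptionError, and reading "default_value is False" as "no default was provided" (None) is an interpretation rather than something the source confirms.

```python
def parse_option(option_value, default_value=None):
    """Return option_value, falling back to default_value when it is empty."""
    if option_value == "":
        # Assumption: a default of None marks the option as mandatory.
        if default_value is None:
            raise ValueError("Option is mandatory and no value was given")
        return default_value
    return option_value


print(parse_option("", default_value="mon"))
```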
exception earthdiagnostics.diagnostic.DiagnosticOptionError[source]

Bases: exceptions.Exception

Exception class for errors related to bad options for the diagnostics

class earthdiagnostics.diagnostic.DiagnosticStatus[source]

Bases: object

Enumeration of diagnostic status

class earthdiagnostics.diagnostic.DiagnosticVariableListOption(var_manager, name, default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse variable list options

Parameters:
parse(option_value)[source]

Parse option value

Returns:
Return type:List[Variable]
class earthdiagnostics.diagnostic.DiagnosticVariableOption(var_manager, name='variable', default_value=None)[source]

Bases: earthdiagnostics.diagnostic.DiagnosticOption

Class to parse variable options

Parameters:
parse(option_value)[source]

Parse option value

Returns:
Return type:Variable

earthdiagnostics.earthdiags

earthdiagnostics.frequency

Data frequency management tools

class earthdiagnostics.frequency.Frequencies[source]

Bases: object

Enumeration of supported frequencies

class earthdiagnostics.frequency.Frequency(freq)[source]

Bases: object

Time frequency

folder_name(vartype)[source]

Get folder name associated to this frequency

Parameters:vartype (VariableType) –
Returns:
Return type:str
static parse(freq)[source]

Get frequency instance from str

If a Frequency object is passed, it is returned as is

Parameters:freq (str or Frequency) –
Returns:
Return type:Frequency

earthdiagnostics.modellingrealm

earthdiagnostics.obsreconmanager

Data management for BSC-Earth conventions

Focused on working with observations and reconstructions as well as with downloaded but not cmorized models (like ECMWF System 4)

class earthdiagnostics.obsreconmanager.ObsReconManager(config)[source]

Bases: earthdiagnostics.datamanager.DataManager

Data manager class for CMORized experiments

Parameters:config (Config) –
declare_chunk(domain, var, startdate, member, chunk, grid=None, region=None, box=None, frequency=None, vartype=1, diagnostic=None)[source]

Declare a variable chunk to be generated by a diagnostic

Parameters:
Returns:

Return type:

DataFile

declare_year(domain, var, startdate, member, year, grid=None, box=None, vartype=1, diagnostic=None)

Declare a variable year to be generated by a diagnostic

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

file_exists(domain, var, startdate, member, chunk, grid=None, box=None, frequency=None, vartype=1, possible_versions=None)

Check if a file exists in the storage

Parameters:
  • domain (ModelingRealm) –
  • var (str) –
  • startdate (str) –
  • member (int) –
  • chunk (int) –
  • grid (str or None, optional) –
  • box (Box or None, optional) –
  • frequency (Frequency or None, optional) –
  • vartype (VariableType, optional) –
  • possible_versions (iterable of str or None, optional) –
Raises:

NotImplementedError – If not implemented by derived classes

Returns:

Return type:

bool

get_file_path(startdate, domain, var, frequency, vartype, box=None, grid=None)[source]

Return the path to a concrete file

Parameters:
  • startdate (str) – file’s startdate
  • domain (str) – file’s domain
  • var (str) – file’s var
  • frequency (Frequency) – file’s frequency
  • box (Box) – file’s box
  • grid (str) – file’s grid
  • vartype (VariableType) – Variable type (mean, statistic)
Returns:

path to the file

Return type:

str

Create the link of a given file from the CMOR repository.

Parameters:
  • cmor_var
  • move_old
  • date_str
  • year (int) – if frequency is yearly, this parameter is used to give the corresponding year
  • domain (Domain) – CMOR domain
  • var (str) – variable name
  • startdate (str) – file’s startdate
  • member (int) – file’s member
  • chunk (int) – file’s chunk
  • grid (str) – file’s grid (only needed if it is not the original)
  • frequency (str) – file’s frequency (only needed if it is different from the default)
  • vartype (VariableType) – Variable type (mean, statistic)
Returns:

path to the copy created on the scratch folder

Return type:

str

prepare()

Prepare the data to be used by Earth Diagnostics

request_chunk(domain, var, startdate, member, chunk, grid=None, box=None, frequency=None, vartype=1)[source]

Request a given file from the CMOR repository to the scratch folder and return the path to the scratch copy

Parameters:
Returns:

Return type:

DataFile

request_year(diagnostic, domain, var, startdate, member, year, grid=None, box=None, frequency=None)

Request a given year for a variable from a CMOR repository

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

earthdiagnostics.publisher

Module to allow classes to communicate when an event is produced

class earthdiagnostics.publisher.Publisher[source]

Bases: object

Base class to provide functionality to notify updates to other objects

dispatch(*args)[source]

Notify all the subscribers of an update

Parameters:args – arguments to pass
only_suscriber(who)[source]

Check whether an object is the sole subscriber of this publisher

Parameters:who (object) –
Returns:
Return type:bool
subscribe(who, callback=None)[source]

Add a subscriber to the current publisher

Parameters:
  • who (object) – Subscriber to add
  • callback (callable or None, optional) – Callback to call
suscribers

List of subscribers of this publisher

unsubscribe(who)[source]

Remove a subscriber from the current publisher

Parameters:who (object) – subscriber to remove
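
The subscribe and dispatch cycle documented for Publisher can be sketched as follows. This is a minimal stand-in and not the library's implementation: the class name SketchPublisher is hypothetical, and the sketch assumes that subscribers are kept in a dictionary keyed by the subscribing object and that dispatch forwards its arguments to every registered callback. The fallback to an update method when no callback is given is also an assumption.

```python
class SketchPublisher(object):
    """Hypothetical stand-in that mimics the documented Publisher interface."""

    def __init__(self):
        # Assumption: each subscriber is mapped to the callback to invoke
        self._subscribers = {}

    def subscribe(self, who, callback=None):
        # Assumption: use the subscriber's ``update`` method if no callback is given
        if callback is None:
            callback = getattr(who, 'update')
        self._subscribers[who] = callback

    def unsubscribe(self, who):
        del self._subscribers[who]

    def dispatch(self, *args):
        # Iterate over a copy so that callbacks may unsubscribe safely
        for callback in list(self._subscribers.values()):
            callback(*args)

    @property
    def suscribers(self):
        # Spelling follows the documented attribute name
        return list(self._subscribers.keys())

    def only_suscriber(self, who):
        return self.suscribers == [who]


received = []
publisher = SketchPublisher()
listener = object()
publisher.subscribe(listener, callback=received.append)
publisher.dispatch('ready')
```

After the dispatch call, the received list holds the single dispatched argument, and only_suscriber(listener) is true because no other object has subscribed.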

earthdiagnostics.singleton

earthdiagnostics.threddsmanager

Data manager for THREDDS server

exception earthdiagnostics.threddsmanager.THREDDSError[source]

Bases: exceptions.Exception

Exception to be raised when a THREDDS-related error is encountered

class earthdiagnostics.threddsmanager.THREDDSManager(config)[source]

Bases: earthdiagnostics.datamanager.DataManager

Data manager class for THREDDS

Parameters:config (Config) –
declare_chunk(domain, var, startdate, member, chunk, grid=None, region=None, box=None, frequency=None, vartype=1, diagnostic=None)[source]

Copy a given file from the CMOR repository to the scratch folder and return the path to the scratch copy

Parameters:
  • diagnostic
  • region
  • domain (Domain) – CMOR domain
  • var (str) – variable name
  • startdate (str) – file’s startdate
  • member (int) – file’s member
  • chunk (int) – file’s chunk
  • grid (str|None) – file’s grid (only needed if it is not the original)
  • box (Box) – file’s box (only needed to retrieve sections or averages)
  • frequency (Frequency|None) – file’s frequency (only needed if it is different from the default)
  • vartype (VariableType) – Variable type (mean, statistic)
Returns:

path to the copy created on the scratch folder

Return type:

str

declare_year(domain, var, startdate, member, year, grid=None, box=None, vartype=1, diagnostic=None)

Declare a variable year to be generated by a diagnostic

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

file_exists(domain, var, startdate, member, chunk, grid=None, box=None, frequency=None, vartype=1, possible_versions=None)[source]

Check if a file exists in the storage

Creates a THREDDSSubset and checks if it is accessible

Parameters:
Returns:

Return type:

THREDDSSubset

get_file_path(startdate, domain, var, frequency, vartype, box=None, grid=None)[source]

Return the path to a concrete file

Parameters:
Returns:

Return type:

str

get_var_url(var, startdate, frequency, box, vartype)[source]

Get url for dataset

Parameters:
  • var (str) – variable to retrieve
  • startdate (str) – startdate to retrieve
  • frequency (Frequency | None) – frequency to get
  • box (Box) – box to get
  • vartype (VariableType) – type of variable
Returns:

get_year(domain, var, startdate, member, year, grid=None, box=None, vartype=1)[source]

Get a file containing all the data for one year for one variable

Parameters:
  • domain (str) – variable’s domain
  • var (str) – variable’s name
  • startdate (str) – startdate to retrieve
  • member (int) – member to retrieve
  • year (int) – year to retrieve
  • grid (str) – variable’s grid
  • box (Box) – variable’s box
  • vartype (VariableType) – Variable type (mean, statistic)
Returns:

Create the link of a given file from the CMOR repository.

Parameters:
  • cmor_var
  • move_old
  • date_str
  • year (int) – if frequency is yearly, this parameter is used to give the corresponding year
  • domain (Domain) – CMOR domain
  • var (str) – variable name
  • startdate (str) – file’s startdate
  • member (int) – file’s member
  • chunk (int) – file’s chunk
  • grid (str) – file’s grid (only needed if it is not the original)
  • frequency (str) – file’s frequency (only needed if it is different from the default)
  • vartype (VariableType) – Variable type (mean, statistic)
Returns:

path to the copy created on the scratch folder

Return type:

str

prepare()

Prepare the data to be used by Earth Diagnostics

request_chunk(domain, var, startdate, member, chunk, grid=None, box=None, frequency=None, vartype=1)[source]

Request a given file from the CMOR repository to the scratch folder and return the path to the scratch copy

Parameters:
Returns:

Return type:

DataFile

request_year(diagnostic, domain, var, startdate, member, year, grid=None, box=None, frequency=None)

Request a given year for a variable from a CMOR repository

Parameters:
Returns:

Return type:

DataFile

Raises:

NotImplementedError – If not implemented by derived classes

class earthdiagnostics.threddsmanager.THREDDSSubset(thredds_path, file_path, var, start_time, end_time)[source]

Bases: earthdiagnostics.datafile.DataFile

Implementation of DataFile for the THREDDS server

Parameters:
  • thredds_path (str) –
  • file_path (str) –
  • var (str) –
  • start_time (datetime) –
  • end_time (datetime) –
add_cmorization_history()

Add the history line corresponding to the cmorization to the local file

add_diagnostic_history()

Add the history line corresponding to the diagnostic to the local file

add_modifier(diagnostic)

Register a diagnostic as a modifier of this data

A modifier diagnostic is a diagnostic that reads this data and changes it in any way. The diagnostic must be registered as a modifier even if it only affects the metadata

Parameters:diagnostic (Diagnostic) –
clean_local()

Check if a local file is still needed and remove it if not

Create a link from the original data in the <frequency>_<var_type> folder

dispatch(*args)

Notify all the subscribers of an update

Parameters:args – arguments to pass
download()[source]

Get data from the THREDDS server

Raises:THREDDSError – If the data can not be downloaded
download_required()

Check whether a download is required for this file

Returns:
Return type:bool
classmethod from_storage(filepath, data_convention)

Create a new datafile to be downloaded from the storage

has_modifiers()

Check if it has registered modifiers

Returns:
Return type:bool
local_status

Get local storage status

only_suscriber(who)

Check whether an object is the sole subscriber of this publisher

Parameters:who (object) –
Returns:
Return type:bool
prepare_to_upload(rename_var)

Prepare a local file to be uploaded

This includes renaming the variable if necessary, updating the metadata, adding the history and handling the possibility of multiple regions

ready_to_run(diagnostic)

Check if the data is ready to run for a given diagnostic

To be ready to run, the datafile should be in the local storage and no modifiers can be pending.

Parameters:diagnostic (Diagnostic) –
Returns:
Return type:bool
set_local_file(local_file, diagnostic=None, rename_var='', region=None)

Set the local file generated by EarthDiagnostics

This also prepares it for the upload

Parameters:
Returns:

Return type:

None

size

File size

storage_status

Get remote storage status

subscribe(who, callback=None)

Add a subscriber to the current publisher

Parameters:
  • who (object) – Subscriber to add
  • callback (callable or None, optional) – Callback to call
suscribers

List of subscribers of this publisher

classmethod to_storage(remote_file, data_convention)

Create a new datafile object for a file that is going to be generated and stored

unsubscribe(who)

Remove a subscriber from the current publisher

Parameters:who (object) – subscriber to remove
upload()

Send a local file to the storage

upload_required()

Check whether an upload is needed for this file

Returns:
Return type:bool

earthdiagnostics.utils

Common utilities for multiple topics that are not big enough to have their own module

class earthdiagnostics.utils.TempFile[source]

Bases: object

Class to manage temporary files

autoclean = True

If True, new temporary files are added to the list for future cleaning

static clean()[source]

Remove all temporary files created with TempFile until now

files = []

List of files to clean automatically

static get(filename=None, clean=None, suffix='.nc')[source]

Get a new temporary filename, storing it for automated cleaning

Parameters:
  • suffix
  • filename (str) – if it is not none, the function will use this filename instead of a random one
  • clean (bool) – if true, stores filename for cleaning
Returns:

path to the temporary file

Return type:

str

prefix = 'temp'

Prefix for temporary filenames

scratch_folder = ''

Scratch folder in which temporary files are created
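
The get and clean cycle of TempFile can be illustrated with the standard library alone. The sketch below is hypothetical (SketchTempFile is not part of earthdiagnostics) and makes two assumptions: that the scratch folder defaults to the system temporary directory, and that requesting a name also creates an empty file. The real class may only reserve the name.

```python
import os
import tempfile


class SketchTempFile(object):
    """Hypothetical stand-in illustrating the documented get/clean cycle."""

    autoclean = True
    files = []
    prefix = 'temp'
    # Assumption: use the system temporary directory as scratch folder
    scratch_folder = tempfile.gettempdir()

    @staticmethod
    def get(filename=None, clean=None, suffix='.nc'):
        if clean is None:
            clean = SketchTempFile.autoclean
        if filename:
            path = os.path.join(SketchTempFile.scratch_folder, filename)
        else:
            # mkstemp creates an empty file and returns an open descriptor
            handle, path = tempfile.mkstemp(
                dir=SketchTempFile.scratch_folder,
                prefix=SketchTempFile.prefix,
                suffix=suffix,
            )
            os.close(handle)
        if clean:
            SketchTempFile.files.append(path)
        return path

    @staticmethod
    def clean():
        # Remove every registered file that still exists, then reset the list
        for path in SketchTempFile.files:
            if os.path.exists(path):
                os.remove(path)
        SketchTempFile.files = []


temp_path = SketchTempFile.get()
existed_before_clean = os.path.exists(temp_path)
SketchTempFile.clean()
```

Passing clean=False to get would leave the file out of the list, so a later call to clean would not remove it.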

class earthdiagnostics.utils.Utils[source]

Bases: object

Container class for miscellaneous utility methods

exception CopyException[source]

Bases: exceptions.Exception

Exception raised when copy fails

exception ExecutionError[source]

Bases: exceptions.Exception

Exception to raise when a command execution fails

exception UnzipException[source]

Bases: exceptions.Exception

Exception raised when unzip fails

static available_cpu_count()[source]

Number of available virtual or physical CPUs on this system

cdo = <cdo.Cdo object>

An instance of Cdo class ready to be used

static check_netcdf_file(filepath)[source]

Check if a NetCDF file is well stored

This function is used to check if a NetCDF file is corrupted. It prefers to raise a false positive rather than to produce false negatives.

Parameters:filepath
Returns:
Return type:bool
static concat_variables(source, destiny, remove_source=False)[source]

Add variables from one NetCDF file to another

Parameters:
  • source (str) –
  • destiny (str) –
  • remove_source (bool) – if True, removes source file
static convert2netcdf4(filetoconvert)[source]

Convert a file to NetCDF4

Conversion only performed if required. Deflation level set to 4 and shuffle activated.

Parameters:filetoconvert (str) –
static convert_to_ascii_if_possible(string, encoding='ascii')[source]

Convert a Unicode string to ASCII if all characters can be translated.

If a string cannot be translated, it is left unchanged. It also automatically replaces Bretonnière with Bretonniere

Parameters:
  • string (unicode) –
  • encoding (str, optional) –
Returns:

Return type:

str

static convert_units(var_handler, new_units, calendar=None, old_calendar=None)[source]

Convert units

Parameters:
  • var_handler (Dataset) –
  • new_units (str) –
  • calendar (str) –
  • old_calendar (str) –
static copy_attributes(new_var, original_var, omitted_attributtes=None)[source]

Copy attributes from one variable to another

Parameters:
  • new_var (netCDF4.Variable) –
  • original_var (netCDF4.Variable) –
  • omitted_attributtes (iterable of str) – Collection of attributes that should not be copied
static copy_dimension(source, destiny, dimension, must_exist=True, new_names=None, rename_dimension=False)[source]

Copy the given dimension from source to destiny, including dimension variables if present

Parameters:
  • source (netCDF4.Dataset) –
  • destiny (netCDF4.Dataset) –
  • dimension (str) –
  • must_exist (bool, optional) –
  • new_names (dict of str: str or None, optional) –
static copy_file(source, destiny, save_hash=False, use_stored_hash=True, retrials=3)[source]

Copy a file and compute a hash to check if the copy is equal to the source

Parameters:
  • source (str) –
  • destiny (str) –
  • save_hash (bool, optional) – If True, stores a copy of the hash
  • use_stored_hash (bool, optional) – If True, try to use the stored value of the source hash instead of computing it
  • retrials (int, optional) – Minimum value is 1

See also

move_file()
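
The copy-and-verify idea behind copy_file can be sketched as below. This is not the library's code: the helper names file_digest and copy_with_check and the exception SketchCopyError are hypothetical, and the sketch swaps in SHA-256 from hashlib so that it runs with the standard library only (the documentation for get_file_hash mentions xxHash). Storing and reusing hashes (save_hash, use_stored_hash) is left out.

```python
import hashlib
import os
import shutil
import tempfile


class SketchCopyError(Exception):
    """Hypothetical stand-in for Utils.CopyException."""


def file_digest(path, block_size=65536):
    """Hash a file block by block (SHA-256 here, an assumption of this sketch)."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handler:
        for block in iter(lambda: handler.read(block_size), b''):
            digest.update(block)
    return digest.hexdigest()


def copy_with_check(source, destiny, retrials=3):
    """Copy source to destiny, retrying until both hashes match."""
    source_hash = file_digest(source)
    # At least one attempt is always made, matching "Minimum value is 1"
    for _ in range(max(retrials, 1)):
        shutil.copyfile(source, destiny)
        if file_digest(destiny) == source_hash:
            return source_hash
    raise SketchCopyError('Can not copy {0} to {1}'.format(source, destiny))


work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, 'source.nc')
copy_path = os.path.join(work_dir, 'copy.nc')
with open(source_path, 'wb') as handler:
    handler.write(b'example data')
copied_hash = copy_with_check(source_path, copy_path)
```

A move operation built on this sketch would call copy_with_check and delete the source only after the hashes agree.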

static copy_tree(source, destiny)[source]

Copy a full tree to a new location

Parameters:
  • source (str) –
  • destiny (str) –

See also

move_tree()

static copy_variable(source, destiny, variable, must_exist=True, add_dimensions=False, new_names=None, rename_dimension=True)[source]

Copy the given variable from source to destiny

Parameters:
  • source (netCDF4.Dataset) –
  • destiny (netCDF4.Dataset) –
  • variable (str) –
  • must_exist (bool, optional) –
  • add_dimensions (bool, optional) –
  • new_names (dict of str: str) –
Raises:

Exception – If dimensions are not correct in the destiny file and add_dimensions is False

static create_folder_tree(path)[source]

Create a folder path with all parent directories if needed.

Parameters:path (str) –
static execute_shell_command(command, log_level=10)[source]

Execute shell command

Writes the output to the log with the specified level

Parameters:
  • command (str or iterable of str) –
  • log_level (int, optional) –
Returns:

Standard output of the command

Return type:

iterable of str

Raises:

Utils.ExecutionError – If the command return value is non zero
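
A minimal sketch of the behaviour described for execute_shell_command, built on the standard subprocess module. The names run_command and SketchExecutionError are hypothetical, and the handling of string commands with shlex.split is an assumption. The default log_level of 10 in the documented signature corresponds to logging.DEBUG.

```python
import logging
import shlex
import subprocess
import sys


class SketchExecutionError(Exception):
    """Hypothetical stand-in for Utils.ExecutionError."""


def run_command(command, log_level=logging.DEBUG):
    """Run a command, log each output line and return the lines as a list."""
    if isinstance(command, str):
        # Assumption: a plain string is split into arguments, no shell involved
        command = shlex.split(command)
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, universal_newlines=True
    )
    output = []
    for line in process.stdout:
        line = line.rstrip('\n')
        logging.log(log_level, line)
        output.append(line)
    process.stdout.close()
    # A non-zero return value is reported as an error
    if process.wait() != 0:
        raise SketchExecutionError('Error executing {0}'.format(command))
    return output


lines = run_command([sys.executable, '-c', 'print("hello")'])
```

The example runs the current Python interpreter so that it does not depend on any particular shell utility being installed.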

static get_datetime_from_netcdf(handler, time_variable='time')[source]

Get time from NetCDF files

Parameters:
  • handler (netCDF4.Dataset) –
  • time_variable (str, optional) –
Returns:

Return type:

numpy.array of Datetime

static get_file_hash(filepath, use_stored=False, save=False)[source]

Get the xxHash hash for a given file

Parameters:
  • filepath (str) –
  • use_stored (bool, optional) – If True, tries to use the stored hash before computing it
  • save (bool, optional) – If True, saves the hash to a file
static get_file_variables(filename)[source]

Get all the variables in a file

Parameters:filename
Returns:
Return type:iterable of str
static get_mask(basin, with_levels=False)[source]

Return the mask for the given basin

Parameters:basin (Basin) –
Returns:
Return type:numpy.array
Raises:Exception – If mask.regions.nc is not available
static give_group_write_permissions(path)[source]

Give write permissions to the group

static move_file(source, destiny, save_hash=False, retrials=3)[source]

Move a file and compute a hash to check if the copy is equal to the source

It is just a call to Utils.copy_file followed by the removal of the source file

Parameters:
  • source (str) –
  • destiny (str) –
  • save_hash (bool, optional) – If True, stores a copy of the hash
  • retrials (int, optional) – Minimum value is 1

See also

copy_file()

static move_tree(source, destiny)[source]

Move a tree to a new location

Parameters:
  • source (str) –
  • destiny (str) –

See also

copy_tree()

nco = <nco.nco.Nco object>

An instance of Nco class ready to be used

static open_cdf(filepath, mode='a')[source]

Open a NetCDF file

Parameters:
  • filepath (str) –
  • mode (str, optional) –
Returns:

Return type:

netCDF4.Dataset

static remove_file(path)[source]

Delete a file only if it exists

Parameters:path (str) –
static rename_variable(filepath, old_name, new_name, must_exist=True, rename_dimension=True)[source]

Rename variable from a NetCDF file

This function is just a wrapper around Utils.rename_variables

Parameters:
  • filepath (str) –
  • old_name (str) –
  • new_name (str) –
  • must_exist (bool, optional) –
static rename_variables(filepath, dic_names, must_exist=True, rename_dimension=True)[source]

Rename multiple variables from a NetCDF file

Parameters:
  • filepath (str) –
  • dic_names (dict of str: str) – Gives the renaming to do in the form old_name: new_name
  • must_exist (bool, optional) –
Raises:
  • ValueError – If any original name is the same as the new
  • Exception – If any requested variable does not exist and must_exist is True
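
The two error conditions listed above can be expressed as a small validation step. The function check_renaming and its available_names argument are hypothetical and work on plain Python containers instead of a NetCDF file. The order in which the checks are made is an assumption.

```python
def check_renaming(dic_names, available_names, must_exist=True):
    """Validate an old_name: new_name dictionary before any renaming is done."""
    for old_name, new_name in dic_names.items():
        if old_name == new_name:
            raise ValueError(
                '{0} original name is the same as the new'.format(old_name)
            )
        if must_exist and old_name not in available_names:
            raise Exception('Variable {0} does not exist'.format(old_name))
    # Keep only the renamings that can actually be applied
    return {
        old: new for old, new in dic_names.items() if old in available_names
    }


applicable = check_renaming(
    {'votemper': 'thetao', 'missing': 'other'},
    available_names=['votemper', 'vosaline'],
    must_exist=False,
)
```

With must_exist=False the missing variable is skipped silently, while the same call with must_exist=True raises an exception.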
static setminmax(filename, variable_list)[source]

Set the valid_max and valid_min values to the current max and min values on the file

Parameters:
  • filename (str) –
  • variable_list (str or iterable of str) –
static untar(files, destiny_path)[source]

Untar files to a given destiny

Parameters:
  • files (iterable of str) –
  • destiny_path (str) –
static unzip(files, force=False)[source]

Unzip a list of files

Parameters:
  • files (str or iterable of str) –
  • force (bool, optional) – if True, it will overwrite unzipped files
earthdiagnostics.utils.suppress_stdout(*args, **kwds)[source]

Redirect the standard output to devnull
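
A context manager with this behaviour can be written in a few lines. The sketch below is an assumption about how such a helper might look, not the library's source, and the name quiet_stdout is hypothetical. It only redirects Python-level writes to sys.stdout, so output written directly to the file descriptor by compiled extensions is not affected.

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def quiet_stdout():
    """Send sys.stdout to os.devnull for the duration of the with block."""
    old_stdout = sys.stdout
    with open(os.devnull, 'w') as devnull:
        sys.stdout = devnull
        try:
            yield
        finally:
            # Always restore the previous stream, even if the block raises
            sys.stdout = old_stdout


stdout_before = sys.stdout
with quiet_stdout():
    print('this line is discarded')
    stdout_inside = sys.stdout
```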

earthdiagnostics.variable

Classes to manage variable definitions and aliases

class earthdiagnostics.variable.CMORTable(name, frequency, date, domain)[source]

Bases: object

Class to represent a CMOR table

Parameters:
class earthdiagnostics.variable.Variable[source]

Bases: object

Class to characterize a CMOR variable.

It also contains the static method to make the match between the original name and the standard name. Requires data_convention to be available in cmor_tables to work.

add_table(table, priority=None)[source]

Add table to variable

Parameters:
get_modelling_realm(domains)[source]

Get var modelling realm

Parameters:domains (iterable of str) –
Returns:
Return type:ModelingRealm or None
get_table(frequency, data_convention)[source]

Get a table object given the frequency and data_convention

If the variable does not contain the table information, it uses the domain to make a guess

Parameters:
Returns:

Return type:

CMORTable

Raises:

ValueError – If a table can not be deduced from the given parameters

parse_csv(var_line)[source]

Fill the object information from a csv line

Parameters:var_line (list of str) –
parse_json(json_var, variable)[source]

Parse variable json

Parameters:
  • json_var (dict of str: str) –
  • variable (str) –
class earthdiagnostics.variable.VariableAlias(alias, basin=None, grid=None)[source]

Bases: object

Class to characterize a CMOR variable.

It also contains the static method to make the match between the original name and the standard name. Requires data_convention to be available in cmor_tables to work.

Parameters:alias (str) –
exception earthdiagnostics.variable.VariableJsonException[source]

Bases: exceptions.Exception

Exception to be raised when an error related to the json reading is encountered

class earthdiagnostics.variable.VariableManager[source]

Bases: object

Class for translating variable aliases into standard names and providing the correct description for them

clean()[source]

Clean all information contained in the variable manager

create_aliases_dict()[source]

Create aliases dictionary for the registered variables

get_all_variables()[source]

Return all variables

Returns:CMOR variable list
Return type:set[Variable]
get_variable(original_name, silent=False)[source]

Return the cmor variable instance given a variable name

Parameters:
  • original_name (str) – original variable’s name
  • silent (bool) – if True, omits log warning when variable is not found
Returns:

CMOR variable

Return type:

Variable

get_variable_and_alias(original_name, silent=False)[source]

Return the cmor variable instance given a variable name

Parameters:
  • original_name (str) – original variable’s name
  • silent (bool) – if True, omits log warning when variable is not found
Returns:

CMOR variable

Return type:

Variable

load_variables(table_name)[source]

Load the CMOR csv and create the variables dictionary

Parameters:table_name (str) –
register_variable(var)[source]

Register variable info

Parameters:var (Variable) –
class earthdiagnostics.variable.VariableType[source]

Bases: object

Enumeration of variable types

static to_str(vartype)[source]

Get str representation of vartype for the folder convention

earthdiagnostics.variable_type

earthdiagnostics.work_manager

Earthdiagnostics workflow manager

class earthdiagnostics.work_manager.Downloader[source]

Bases: object

Download manager for EarthDiagnostics

We are not using a ThreadPoolExecutor because we want to be able to control priorities in the download

shutdown()[source]

Stop the downloader after all downloads have finished

start()[source]

Create the downloader thread and initialize it

submit(datafile)[source]

Add a datafile to the download queue
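
The remark about priorities can be illustrated with a single worker thread that reads from a priority queue. Everything in this sketch is an assumption made for illustration: the class SketchDownloader, the explicit priority argument of submit (the documented method takes only a datafile) and the convention that lower numbers are served first, which is how queue.PriorityQueue orders its entries.

```python
import itertools
import queue
import threading


class SketchDownloader(object):
    """Hypothetical downloader that serves requests in priority order."""

    def __init__(self):
        self._queue = queue.PriorityQueue()
        # The counter breaks ties so equal priorities keep submission order
        self._counter = itertools.count()
        self._thread = None
        self.completed = []

    def start(self):
        self._thread = threading.Thread(target=self._worker)
        self._thread.start()

    def submit(self, name, priority=0):
        self._queue.put((priority, next(self._counter), name))

    def shutdown(self):
        # The sentinel has the lowest priority, so pending work finishes first
        self._queue.put((float('inf'), next(self._counter), None))
        self._thread.join()

    def _worker(self):
        while True:
            _, _, name = self._queue.get()
            if name is None:
                break
            # A real downloader would fetch the file here
            self.completed.append(name)


downloader = SketchDownloader()
downloader.submit('low-priority.nc', priority=5)
downloader.submit('urgent.nc', priority=1)
downloader.submit('normal.nc', priority=3)
downloader.start()
downloader.shutdown()
```

Because all three requests are queued before the worker starts, they are processed strictly by priority. A ThreadPoolExecutor would instead run them in submission order.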

class earthdiagnostics.work_manager.WorkManager(config, data_manager)[source]

Bases: object

Class to produce and control the workflow of EarthDiagnostics

Parameters:
prepare_job_list()[source]

Create the list of jobs to run

run()[source]

Run all the diagnostics

Returns:True only if all diagnostics were executed correctly
Return type:bool