Configuration file options

This section lists and explains all the options available in the configuration file. Use it as a reference while preparing your configuration file. Each subsection corresponds to the section of the same name in the config file. Some subsections are further divided for the sake of clarity, but these further divisions have nothing to do with the config file syntax itself.

DIAGNOSTICS

This section contains the general configuration for the diagnostics. The explanation is divided into two subsections: the first covers the mandatory options that you must specify in every configuration, while the second covers the optional ones.

Mandatory configurations

  • SCRATCH_DIR:
    Temporary folder for the calculations. Final results will never be stored here.
  • DATA_DIR:
    ‘:’ separated list of folders to look for data in. For each folder in the list ($DATA_FOLDER), it will look for files in the paths $DATA_FOLDER/$EXPID and $DATA_FOLDER/$DATA_TYPE/$MODEL/$EXPID.
  • CON_FILES:
    Folder containing mask and mesh files for the dataset.
  • FREQUENCY:
    Default data frequency to be used by the diagnostics. Some diagnostics can override this configuration or even ignore it completely.
  • DIAGS:
    List of diagnostics to run. No specific order is needed: data dependencies will be enforced.
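
The DATA_DIR lookup described above can be sketched as follows. This is only an illustration of the two path patterns given in the option description, not the tool's actual code: the function name `candidate_data_paths`, the search order, and the sample values used below are assumptions.

```python
import posixpath


def candidate_data_paths(data_dir, expid, data_type, model):
    """Build the folders that would be searched for an experiment.

    data_dir is the ':' separated DATA_DIR value. For every folder in it,
    two candidates are produced, following the patterns in the description:
    $DATA_FOLDER/$EXPID and $DATA_FOLDER/$DATA_TYPE/$MODEL/$EXPID.
    The order of the candidates is an assumption of this sketch.
    """
    paths = []
    for folder in data_dir.split(":"):
        if not folder:
            continue  # ignore empty entries, e.g. from a trailing ':'
        paths.append(posixpath.join(folder, expid))
        paths.append(posixpath.join(folder, data_type, model, expid))
    return paths
```

For instance, with the hypothetical values `candidate_data_paths("/data/a:/data/b", "a0c2", "exp", "ecearth")`, the sketch produces four candidate folders, two for each entry of DATA_DIR.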

Optional configurations

  • SCRATCH_MASKS
    Common scratch folder for the ocean masks. This is useful to avoid replicating them for each run on the fat nodes. Default is ‘/scratch/Earth/ocean_masks’.
  • RESTORE_MESHES
    By default, Earth Diagnostics only copies the mask files if they are not present in the scratch folder. If this option is set to true, Earth Diagnostics will copy them regardless of existence. Default is False.
  • DATA_ADAPTOR
    This is used to choose the mechanism for storing and retrieving data. Options are CMOR (for our own experiments) or THREDDS (for anything else). Default value is CMOR
  • DATA_TYPE
    Type of the dataset to use. It can be exp, obs or recon. Default is exp.
  • DATA_CONVENTION
    Convention to use for file paths and names and variable naming among other things. Can be SPECS, PREFACE, PRIMAVERA or CMIP6. Default is SPECS.
  • CDFTOOLS_PATH
    Path to the folder containing the CDFTOOLS executables. Default is empty, in which case the CDFTOOLS binaries must be on the system path.
  • MAX_CORES
    Maximum number of cores to use. By default the diagnostics will use all cores available to them. It is not necessary when launching through a scheduler, as Earth Diagnostics can detect how many cores the scheduler has allocated to it.
  • AUTO_CLEAN
    If True, Earth Diagnostics removes the temporary folder just after finishing. If RAM_DISK is set to True, this value is ignored and always treated as True. Default is True.
  • RAM_DISK
    If set to True, the temporary files are created in the /dev/shm partition. This partition is not mounted from a disk. Instead, all files are created in RAM, so this will hopefully improve performance at the cost of much higher RAM consumption. Default is False.
  • MESH_MASK
    Custom file to use instead of the corresponding mesh mask file.
  • NEW_MASK_GLO
    Custom file to use instead of the corresponding new mask glo file
  • MASK_REGIONS
    Custom file to use instead of the corresponding 2D regions file
  • MASK_REGIONS_3D
    Custom file to use instead of the corresponding 3D regions file
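
As a rough illustration of how the defaults listed above combine, the following sketch reads a DIAGNOSTICS section with Python's standard `configparser` and fills in the documented default for every option that is missing. The function name `read_optional` and the sample text are hypothetical, and the sketch assumes that RAM_DISK forces AUTO_CLEAN to True, which is how the AUTO_CLEAN description reads. It does not claim to reproduce how Earth Diagnostics parses its configuration.

```python
import configparser

# Hypothetical sample: only two optional options are set explicitly.
SAMPLE = """
[DIAGNOSTICS]
RAM_DISK = True
AUTO_CLEAN = False
"""


def read_optional(text):
    """Read some optional DIAGNOSTICS options, applying the documented defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = "DIAGNOSTICS"
    options = {
        "SCRATCH_MASKS": parser.get(
            section, "SCRATCH_MASKS", fallback="/scratch/Earth/ocean_masks"
        ),
        "RESTORE_MESHES": parser.getboolean(section, "RESTORE_MESHES", fallback=False),
        "DATA_ADAPTOR": parser.get(section, "DATA_ADAPTOR", fallback="CMOR"),
        "DATA_TYPE": parser.get(section, "DATA_TYPE", fallback="exp"),
        "DATA_CONVENTION": parser.get(section, "DATA_CONVENTION", fallback="SPECS"),
        # None stands for "use all the cores available"
        "MAX_CORES": parser.getint(section, "MAX_CORES", fallback=None),
        "RAM_DISK": parser.getboolean(section, "RAM_DISK", fallback=False),
        "AUTO_CLEAN": parser.getboolean(section, "AUTO_CLEAN", fallback=True),
    }
    # Assumption of this sketch: with a RAM disk, AUTO_CLEAN is ignored
    # and cleaning is always enabled.
    if options["RAM_DISK"]:
        options["AUTO_CLEAN"] = True
    return options
```

Calling `read_optional(SAMPLE)` returns a dictionary in which AUTO_CLEAN is True despite the explicit False, because RAM_DISK is enabled in the sample.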

EXPERIMENT

This section contains options related to the experiment’s definition or configuration.

  • MODEL
    Name of the model used for the experiment.
  • MODEL_VERSION
    Model version. Used to get the correct mask and mesh files
  • ATMOS_TIMESTEP
    Time between outputs from the atmosphere. This is not the model simulation timestep! Default is 6.
  • OCEAN_TIMESTEP
    Time between outputs from the ocean. This is not the model simulation timestep! Default is 6.
  • ATMOS_GRID
    Atmospheric grid definition. Will be used as a default target for interpolation diagnostics.
  • INSTITUTE
    Institute that made the experiment, observation or reconstruction
  • EXPID
    Unique identifier for the experiment
  • NAME
    Experiment’s name. By default it is the EXPID.
  • STARTDATES
    Start dates to run, as a space-separated list.
  • MEMBER
    Members to run, as a space-separated list. You can provide just the number or also include the prefix.
  • MEMBER_DIGITS
    Minimum number of digits used to compose the member name. Default is 1. For example, the name of member 1 will be fc1 if MEMBER_DIGITS is 1, or fc01 if MEMBER_DIGITS is 2.
  • MEMBER_PREFIX
    Prefix to use for the member names. Default is ‘fc’.
  • MEMBER_COUNT_START
    Number corresponding to the first member. For example, if your first member is ‘fc1’, it should be 1. If it is ‘fc0’, it should be 0. Default is 0.
  • CHUNK_SIZE
    Length of the chunks in months
  • CHUNKS
    Number of chunks to run
  • CHUNK_LIST
    List of chunks to run. If empty, all diagnostics will be applied to all chunks
  • CALENDAR
    Calendar to use for date calculation. All calendars supported by Autosubmit are available. Default is ‘standard’
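
The way MEMBER, MEMBER_DIGITS and MEMBER_PREFIX combine into member names can be sketched as below. The helper names `member_name` and `parse_members` are hypothetical and the code only mirrors the behaviour described in the option list; the real implementation may differ, for instance in how it validates input.

```python
def member_name(number, digits=1, prefix="fc"):
    """Compose a member name: the prefix plus the number, zero-padded to `digits`."""
    return prefix + str(number).zfill(digits)


def parse_members(value, digits=1, prefix="fc"):
    """Turn a space-separated MEMBER value into a list of member names.

    Each entry may be a bare number ('1') or already carry the prefix ('fc1').
    """
    names = []
    for token in value.split():
        if token.startswith(prefix):
            token = token[len(prefix):]  # drop the prefix, keep the number
        names.append(member_name(int(token), digits, prefix))
    return names
```

With the defaults, `member_name(1)` gives `fc1`, while `member_name(1, digits=2)` gives `fc01`, matching the MEMBER_DIGITS example above.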

CMOR

In this section, you can control how the cmorization process works. All options belonging to this section are optional.

Cmorization options

These options control when cmorization runs and which variables are cmorized.

  • FORCE
    If True, launches the cmorization, regardless of existence of the extracted files or the package containing the online-cmorized ones. If False, only the non-present chunks will be cmorized. Default value is False
  • FORCE_UNTAR
    Unpacks the online-cmorized files regardless of the existence of extracted files. If FORCE is True, this parameter has no effect. If False, only the non-present chunks will be unpacked. Default value is False.
  • FILTER_FILES
    Only cmorize original files containing any of the given strings. This is a space separated list. Default is the empty string.
  • OCEAN_FILES
    Boolean flag to enable or disable the cmorization of NEMO files. Default is True.
  • ATMOSPHERE_FILES
    Boolean flag to enable or disable the cmorization of IFS files. Default is True.
  • USE_GRIB
    Boolean flag to enable or disable the cmorization of GRIB files for the atmosphere. If enabled and no GRIB files are present, it will cmorize using the MMA files instead (as if it were set to False). Default is True.
  • CHUNKS
    Space separated list of chunks to be cmorized. If not provided, all chunks are cmorized
  • VARIABLE_LIST
    Space separated list of variables to cmorize. Variables must be specified as domain:var_name. If none is specified, all the variables will be cmorized.
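
To make the VARIABLE_LIST format concrete, here is a small sketch that splits such a value into (domain, variable) pairs. The function name `parse_variable_list` is hypothetical, the domain and variable names in the usage note are placeholders, and the strict error handling is a choice of this sketch rather than documented behaviour.

```python
def parse_variable_list(value):
    """Split a space-separated VARIABLE_LIST into (domain, var_name) tuples.

    An empty result means that no filter was given, i.e. all the variables
    would be cmorized.
    """
    variables = []
    for item in value.split():
        domain, separator, var_name = item.partition(":")
        if not separator or not domain or not var_name:
            raise ValueError("expected domain:var_name, got %r" % item)
        variables.append((domain, var_name))
    return variables
```

For example, `parse_variable_list("ocean:thetao atmos:tas")` yields two pairs, one for each entry of the list.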

Grib variables extraction

These three options are used to configure the variables to be cmorized from the GRIB atmospheric files. They must be specified using the IFS code, in a comma-separated list.

You can also specify the levels to extract using one of the following syntaxes:

  • VARIABLE_CODE
  • VARIABLE_CODE:LEVEL
  • VARIABLE_CODE:LEVEL_1-LEVEL_2-…-LEVEL_N
  • VARIABLE_CODE:MIN_LEVEL:MAX_LEVEL:STEP

Some examples to clarify it further:

  • Variable with code 129 at level 30000: 129:30000
  • Variable with code 129 at levels 30000, 40000 and 60000: 129:30000-40000-60000
  • Variable with code 129 at levels between 30000 and 60000 with 10000 intervals: 129:30000:60000:10000, equivalent to 129:30000-40000-50000-60000

The three options are:

  • ATMOS_HOURLY_VARS
    Configuration of variables to be extracted on an hourly basis.
  • ATMOS_DAILY_VARS
    Configuration of variables to be extracted on a daily basis.
  • ATMOS_MONTHLY_VARS
    Configuration of variables to be extracted on a monthly basis.
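
The level syntaxes described above can be expanded mechanically. The sketch below is one possible reading of them, with the maximum level treated as inclusive as in the 129:30000:60000:10000 example. The function names `expand_levels` and `parse_variables` are hypothetical and this is not the parser used by Earth Diagnostics.

```python
def expand_levels(spec):
    """Expand one entry into (code, levels).

    Accepted forms: CODE, CODE:LEVEL, CODE:L1-L2-...-LN and CODE:MIN:MAX:STEP.
    `levels` is None when the entry carries no level information.
    """
    parts = spec.strip().split(":")
    code = int(parts[0])
    if len(parts) == 1:
        return code, None
    if len(parts) == 2:
        # a single level or a '-' separated list of levels
        return code, [int(level) for level in parts[1].split("-")]
    if len(parts) == 4:
        minimum, maximum, step = (int(part) for part in parts[1:])
        # maximum + 1 so that the upper bound itself is included
        return code, list(range(minimum, maximum + 1, step))
    raise ValueError("unrecognised level syntax: %r" % spec)


def parse_variables(option):
    """Parse a comma-separated option value such as ATMOS_DAILY_VARS."""
    return [expand_levels(item) for item in option.split(",") if item.strip()]
```

Under this reading, `expand_levels("129:30000:60000:10000")` and `expand_levels("129:30000-40000-50000-60000")` return the same result.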

Metadata options

All the options in this subsection will serve just to add the given values to the homonymous attributes in the cmorized files.

  • ASSOCIATED_EXPERIMENT
    Default value is ‘to be filled’
  • ASSOCIATED_MODEL
    Default value is ‘to be filled’
  • INITIALIZATION_DESCRIPTION
    Default value is ‘to be filled’
  • INITIALIZATION_METHOD
    Default value is ‘1’
  • PHYSICS_DESCRIPTION
    Default value is ‘to be filled’
  • PHYSICS_VERSION
    Default value is ‘1’
  • SOURCE
    Default value is ‘to be filled’
  • VERSION
    Dataset version to use (not present in all conventions)
  • DEFAULT_OCEAN_GRID
    Name of the default ocean grid for those conventions that require it (CMIP6 and PRIMAVERA). Default is gn.
  • DEFAULT_ATMOS_GRID
    Name of the default atmos grid for those conventions that require it (CMIP6 and PRIMAVERA). Default is gr.
  • ACTIVITY
    Name of the activity. Default is CMIP

THREDDS

For now, there is only one option for the THREDDS server configuration.

  • SERVER_URL
    THREDDS server URL

ALIAS

This config file section is different from all the others because it does not contain a set of configurations. Instead, in this section the user can define a set of aliases to launch their most-used configurations with ease. To do this, the user must add an option named after the desired alias and assign to it the configuration or configurations to launch when this alias is invoked. See the next example:

ALIAS_NAME = diag,opt1,opt2 diag,opt1new,opt2

In this case, the user has defined a new alias ‘ALIAS_NAME’ that can be used to launch the diagnostic ‘diag’ twice: the first time with the options ‘opt1’ and ‘opt2’, and the second time replacing ‘opt1’ with ‘opt1new’.

In this example, configuring the DIAGS as

DIAGS = ALIAS_NAME

will be identical to

DIAGS = diag,opt1,opt2 diag,opt1new,opt2
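
The substitution shown in this example can be sketched in a few lines. The function name `expand_aliases` and the dictionary used to hold the ALIAS section are assumptions of this sketch. How the tool itself matches alias names (for example regarding letter case or nested aliases) is not covered here.

```python
def expand_aliases(diags, aliases):
    """Replace every alias found in a DIAGS value with its definition.

    diags is the space-separated DIAGS value; aliases maps each alias name
    to the configuration or configurations it stands for.
    """
    expanded = []
    for item in diags.split():
        if item in aliases:
            expanded.extend(aliases[item].split())
        else:
            expanded.append(item)  # not an alias: keep it unchanged
    return " ".join(expanded)
```

With `aliases = {"ALIAS_NAME": "diag,opt1,opt2 diag,opt1new,opt2"}`, the call `expand_aliases("ALIAS_NAME", aliases)` returns the expanded DIAGS value from the example above.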