CMORize tool fails for RAWOBS if directory structure does not include Tier2/Tier3 #3640

Open
k-a-webb opened this issue May 31, 2024 · 1 comment

@k-a-webb

When attempting to CMORize RAWOBS data located in a directory that does not follow ESMValTool's convention (i.e., no Tier directories), the cmorizer.py script fails at line 266:

tier = self._get_dataset_tier(dataset)
if tier is None:
    logger.error("Data for %s not found. Perhaps you are not"
                 " storing it in a RAWOBS/TierX/%s"
                 " (X=2 or 3) directory structure?", dataset, dataset)
    return False

Ideally, the CMORizer tool could be used with the same configuration files and directory structure that are used to run ESMValTool.
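
To make the request concrete, here is a minimal sketch of a lookup that tries the TierX layout first and falls back to a search of the RAWOBS tree when no Tier directory exists. This is not the code in cmorizer.py: the function name `find_raw_dataset_dir` and the fallback behaviour are assumptions made purely for illustration.

```python
from pathlib import Path
from typing import Optional


def find_raw_dataset_dir(rawobs_root: str, dataset: str) -> Optional[Path]:
    """Return the directory holding the raw data for ``dataset``, or None.

    Hypothetical helper, not part of ESMValTool.
    """
    root = Path(rawobs_root)

    # 1. Conventional layout: RAWOBS/Tier2/<dataset> or RAWOBS/Tier3/<dataset>.
    for tier in (2, 3):
        candidate = root / f"Tier{tier}" / dataset
        if candidate.is_dir():
            return candidate

    # 2. Fallback (assumption): accept the first directory named <dataset>
    #    anywhere below the root, e.g. RAWOBS/clim/WOA.
    for candidate in sorted(root.rglob(dataset)):
        if candidate.is_dir():
            return candidate

    return None
```

A recursive search can be slow on large trees, so a real implementation would more likely resolve the `input_dir` template from config-developer.yml than walk the whole directory.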


Example configuration files:

config-user.yml:

    log_level: debug
    exit_on_warning: false
    output_file_type: svg
    output_dir: ./output
    auxiliary_data_dir: ./auxiliary_data
    compress_netcdf: false
    save_intermediary_cubes: false
    remove_preproc_dir: false
    max_parallel_tasks: null
    config_developer_file: ./config-developer.yml
    profile_diagnostic: false

    rootpath:
      RAWOBS: /space/hall6/sitestore/eccc/crd/ccrn/users/scrd113/RAWOBS
      OBS6: /space/hall6/sitestore/eccc/crd/ccrn/users/scrd113/CMOROBS

    drs:
      RAWOBS: default
      OBS6: default

config-developer.yml:

    OBS6:
      cmor_strict: false
      input_dir:
        evt_default: 'Tier{tier}/{dataset}'
        default: '{type}/{dataset}/{latestversion}/{frequency}/{short_name}'
      input_file:
        evt_default: '{project}_{dataset}_{type}_{version}_{mip}_{short_name}[_.]*nc'
        default: 'OBS6_{dataset}_{type}_{version}_{mip}_{short_name}[_.]*nc'
      output_file: 'OBS6_{dataset}_{type}_{version}_{mip}_{short_name}'
      cmor_type: 'CMIP6'

    RAWOBS:
      cmor_strict: false
      input_dir:
        default: '{type}/{dataset}/{latestversion}/{frequency}/{short_name}'
        evt_default: 'Tier{tier}/{dataset}'
      input_file:
        default: '*nc'
      output_file: 'OBS_{dataset}_{type}_{version}_{mip}_{name}' # "name" is specified in esmvaltool/cmorizers/data/cmor_config/WOA.yml
      cmor_type: 'CMIP5'
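
For reference, this is roughly how the `default` `input_dir` template for RAWOBS would have to expand to reach the WOA files in the directory layout used here. The sketch uses plain string substitution and takes the last version directory in sorted order for `{latestversion}`. It is only an illustration of the intended mapping, not ESMValCore's path-resolution code. The helper name `expand_input_dir` is hypothetical, and the facet values in the usage comment are read off the example path, not taken from any recipe.

```python
from pathlib import Path


def expand_input_dir(root: str, template: str, facets: dict) -> Path:
    """Expand a DRS-style template relative to ``root``. Illustrative helper only."""
    path = Path(root)
    for part in template.split("/"):
        if part == "{latestversion}":
            # Assumption: "latest" means the last sub-directory in sorted order.
            versions = sorted(p.name for p in path.iterdir() if p.is_dir())
            if not versions:
                raise FileNotFoundError(f"no version directories in {path}")
            part = versions[-1]
        else:
            # Fill in facets such as {type}, {dataset}, {frequency}, {short_name}.
            part = part.format(**facets)
        path = path / part
    return path


# Usage (values inferred from the example path in this report):
# facets = {"type": "clim", "dataset": "WOA",
#           "frequency": "mon", "short_name": "temperature"}
# expand_input_dir("/path/to/RAWOBS",
#                  "{type}/{dataset}/{latestversion}/{frequency}/{short_name}",
#                  facets)
```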

Attempting to CMORize WOA data located in the directory:

/space/hall6/sitestore/eccc/crd/ccrn/users/scrd113/RAWOBS/clim/WOA

with files such as:

/space/hall6/sitestore/eccc/crd/ccrn/users/scrd113/RAWOBS/clim/WOA/v2018/mon/temperature/woa18_decav81B0_t00_01.nc

using the command:

esmvaltool data format --config_file config-user.yml WOA

fails with the error message:

2024-05-31 23:00:14,355 UTC [86961] INFO    Writing program log files to:
/fs/homeu2/eccc/crd/ccrn_shr/rkw001/A4D_standard_diagnostics__cmor/work/cmor_woa/output/data_formatting_20240531_230012/run/main_log.txt
/fs/homeu2/eccc/crd/ccrn_shr/rkw001/A4D_standard_diagnostics__cmor/work/cmor_woa/output/data_formatting_20240531_230012/run/main_log_debug.txt
2024-05-31 23:00:14,356 UTC [86961] INFO    Starting the CMORization Tool at time: 2024-05-31 23:00:14 UTC
2024-05-31 23:00:14,356 UTC [86961] INFO    ----------------------------------------------------------------------
2024-05-31 23:00:14,356 UTC [86961] INFO    input_dir  = /home/rkw001/download_data/RAWOBS
2024-05-31 23:00:14,356 UTC [86961] INFO    output_dir = /fs/homeu2/eccc/crd/ccrn_shr/rkw001/A4D_standard_diagnostics__cmor/work/cmor_woa/output/data_formatting_20240531_230012
2024-05-31 23:00:14,356 UTC [86961] INFO    ----------------------------------------------------------------------
2024-05-31 23:00:14,356 UTC [86961] INFO    Running the CMORization scripts.
2024-05-31 23:00:14,356 UTC [86961] INFO    Processing datasets ['WOA']
2024-05-31 23:00:14,356 UTC [86961] INFO    Input data from: /home/rkw001/download_data/RAWOBS/Tier2/WOA
2024-05-31 23:00:14,356 UTC [86961] INFO    Output will be written to: /fs/homeu2/eccc/crd/ccrn_shr/rkw001/A4D_standard_diagnostics__cmor/work/cmor_woa/output/data_formatting_20240531_230012/Tier2/WOA
2024-05-31 23:00:14,357 UTC [86961] INFO    Reformat script: /fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/formatters/datasets/woa
2024-05-31 23:00:14,358 UTC [86961] INFO    CMORizing dataset WOA using Python script /fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/formatters/datasets/woa.py
2024-05-31 23:00:14,361 UTC [86961] INFO    CMORizing var thetao from input set temperature
2024-05-31 23:00:14,379 UTC [86961] ERROR   Program terminated abnormally, see stack trace below for more information:
Traceback (most recent call last):
  File "/fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvalcore/esmvalcore_for_A4D_standard_diagnostics/esmvalcore/_main.py", line 499, in run
    fire.Fire(ESMValTool())
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/cmorizer.py", line 489, in format
    self.formatter.format(start, end, install)
  File "/fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/cmorizer.py", line 194, in format
    if not self.format_dataset(dataset, start, end, install):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/cmorizer.py", line 284, in format_dataset
    success = self._run_pyt_script(in_data_dir, out_data_dir, dataset,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/cmorizer.py", line 386, in _run_pyt_script
    module.cmorization(in_dir, out_dir, cmor_cfg, self.config, start, end)
  File "/fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/formatters/datasets/woa.py", line 148, in cmorization
    extract_variable(in_files, out_dir, glob_attrs, raw_info, cmor_table)
  File "/fs/site5/eccc/crd/ccrn/users/rkw001/code/esmvaltool/esmvaltool_for_A4D_standard_diagnostics/esmvaltool/cmorizers/data/formatters/datasets/woa.py", line 100, in extract_variable
    cubes = iris.load(in_files, rawvar)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/iris/__init__.py", line 326, in load
    return _load_collection(uris, constraints, callback).merged().cubes()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/iris/__init__.py", line 294, in _load_collection
    result = _CubeFilterCollection.from_cubes(cubes, constraints)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/iris/cube.py", line 97, in from_cubes
    for cube in cubes:
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/iris/__init__.py", line 275, in _generate_cubes
    for cube in iris.io.load_files(part_names, callback, constraints):
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/iris/io/__init__.py", line 206, in load_files
    all_file_paths = expand_filespecs(filenames)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/space/hall5/sitestore/eccc/crd/ccrn/users/rkw001/miniconda3/envs/a4d_env_v1p3/lib/python3.11/site-packages/iris/io/__init__.py", line 184, in expand_filespecs
    raise IOError(msg)
OSError: One or more of the files specified did not exist:
    * "/home/rkw001/download_data/RAWOBS/Tier2/WOA/temperature/woa18_decav81B0_t00_01.nc" didn't match any files

Note that the CMORization scripts provided with ESMValTool were not modified.
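
Until the tool supports other layouts, one possible workaround is to expose the existing data under the path the formatter looks for by adding a symlink. The traceback shows the WOA formatter opening `<input_dir>/Tier2/WOA/temperature/<file>`, so the sketch below links `Tier2/WOA` to the directory that contains `temperature/`. It is untested against ESMValTool and has not been confirmed by the maintainers. The helper name `link_into_tier` and its parameters are hypothetical.

```python
import os
from pathlib import Path


def link_into_tier(rawobs_root: str, dataset: str,
                   source_subdir: str, tier: int = 2) -> Path:
    """Create RAWOBS/Tier<tier>/<dataset> as a symlink to an existing directory.

    Hypothetical helper; ``source_subdir`` is relative to ``rawobs_root``.
    """
    root = Path(rawobs_root)
    source = root / source_subdir
    if not source.is_dir():
        raise FileNotFoundError(f"source directory not found: {source}")

    tier_dir = root / f"Tier{tier}"
    tier_dir.mkdir(exist_ok=True)

    link = tier_dir / dataset
    if link.exists() or link.is_symlink():
        raise FileExistsError(f"{link} already exists; not overwriting")

    # A relative target keeps the link valid if the whole RAWOBS tree is moved.
    link.symlink_to(os.path.relpath(source, tier_dir), target_is_directory=True)
    return link


# Usage for the layout in this report (root path is a placeholder):
# link_into_tier("/path/to/RAWOBS", "WOA", "clim/WOA/v2018/mon")
```

The root passed in should be the directory that the log reports as `input_dir`, since that is where the tool searches for `Tier2/WOA`.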

@k-a-webb

@malininae
