Filename extension issues in cat.search() #74

headmetal · 2023-05-18T08:09:26Z

I'm not sure if this is an intended behaviour or a bug (and I apologise in advance if this is in fact intended):

In the following case, if I attempt to explicitly nominate a given filename ocean_month.1mon from the available keys() it returns an empty data_dict:

However, if I add a wildcard and exclude .1mon from the filename, the data_dict is populated as expected:

I'm guessing the .1mon isn't the real file extension, but is in fact part of the filename - so is messing up loading the file?

dougiesquire · 2023-05-18T10:24:47Z

Thanks @headmetal. ocean_month.1mon is not a filename, it's a "key" for a dataset in the intake-esm datastore. The keys are how intake-esm knows which files in the datastore to concatenate into each "dataset". For ACCESS-OM2 datastores, the keys are made up of two fields from the table (see subcat.df for the table):

the file_id, which is parsed from the filename (the ocean_month part in your case)
the frequency (the 1mon part)
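
As a rough illustration of how such a key is put together, here is a plain-Python sketch. It is not intake-esm's own code: the make_key helper, the example row and its path are hypothetical, and the "." separator is assumed from the ocean_month.1mon example.

```python
def make_key(row, key_fields=("file_id", "frequency"), sep="."):
    """Join the chosen fields of one table row into a dataset key."""
    return sep.join(str(row[field]) for field in key_fields)


# Hypothetical row, standing in for one line of subcat.df
row = {
    "file_id": "ocean_month",
    "frequency": "1mon",
    "path": "/hypothetical/path/ocean_month.nc",
}

key = make_key(row)
print(key)  # ocean_month.1mon
```

Seen this way, the ".1mon" suffix is the frequency field appended to the file_id, not a file extension.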

If you want to load a dataset directly by key, you can use something like:

subcat["ocean_month.1mon"].to_dataset_dict()

Alternatively, in this specific case where the filename ocean_month alone defines a unique dataset, you could get the same data by querying on the filename field as you have done above, e.g.:

subcat.search(filename="ocean_month").to_dataset_dict()

However, it isn't always guaranteed that filename alone defines a unique dataset. E.g. I've come across model runs containing two files with the same name containing data at different frequencies. That's why the frequency info is also needed in the key.
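
To make that concrete, here is a small sketch with made-up rows (the filenames, frequencies and paths are invented for illustration, and group_paths is a hypothetical helper, not part of intake-esm). Grouping by filename alone merges the two files into one group, while grouping by filename and frequency keeps them apart:

```python
from collections import defaultdict

# Hypothetical rows: two files share a name but hold data at different frequencies
rows = [
    {"filename": "ocean", "frequency": "1mon", "path": "run_a/ocean.nc"},
    {"filename": "ocean", "frequency": "1day", "path": "run_b/ocean.nc"},
]


def group_paths(rows, fields, sep="."):
    """Group file paths under a key built by joining the given fields."""
    groups = defaultdict(list)
    for row in rows:
        key = sep.join(row[field] for field in fields)
        groups[key].append(row["path"])
    return dict(groups)


by_name = group_paths(rows, ["filename"])
by_name_and_freq = group_paths(rows, ["filename", "frequency"])

print(by_name)           # one group holding both files
print(by_name_and_freq)  # two groups, one per frequency
```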

I hope this helps and doesn't make things even less clear. Some more info on keys in intake-esm can be found here: https://intake-esm.readthedocs.io/en/stable/how-to/understand-keys-and-how-to-change-them.html

dougiesquire · 2023-05-19T01:57:18Z

This issue originated from a confusing line in example_usage.ipynb. I've added a note for how to improve this in #75 (comment), so I'm closing this issue

dougiesquire closed this as completed May 19, 2023
