CMIP7 Guidance for Data Users
This page introduces users of climate model output to key CMIP7 concepts and tools. It serves as a landing page that directs them to the appropriate resources to learn more.
1. Accessing CMIP7 data
CMIP7 model output is available through a distributed data archive developed and operated by the Earth System Grid Federation (ESGF). The data are hosted on a collection of nodes located at centres across the world.
Understanding ESGF Nodes
ESGF is a collaboration of groups, agencies and institutions around the world that are dedicated to the development and operation of a long-term system for the management, access and analysis of climate data. The ESGF architecture is based on a system of autonomous, distributed Nodes hosted at modelling centres or data centres. Nodes exchange information about their data holdings and services, and trust each other to register users and make access control decisions. The net result is that a user can connect to any Node with a web browser or rich desktop client and seamlessly find and access data throughout the federation.
More documentation on CMIP nodes is available here.
ESGF data usage and publication metrics can be found on the CMCC dashboard.
There are 3 options to access the data:
-
MetaGrid (LLNL, DKRZ, ORNL, CEDA)
Web interface to search and download ESGF data. It provides access through HTTP downloads, wget scripts, OPeNDAP URLs and Globus transfers, and is most useful for browsing and downloading a small number of files. The data can be accessed through any of the CMIP7 web interfaces linked above, which enable users to search across the entire distributed archive as if it were all centrally located.
-
Using a Python package
For larger queries, it might be more appropriate to automate the search and downloads. A few packages are available to do this:
-
Alternative Access Platforms
Some CMIP data is also hosted in non-ESGF storage facilities. Below are links to some of these. If you know of another place CMIP data is currently being stored, please submit this form to let us and the community know!
- COMING SOON
For all non-ESGF data access routes, we encourage users to verify that the data used is the latest version.
2. Terms of use, citations and registration requirements
To enable modelling groups and others who support CMIP7 to demonstrate its impact (and secure ongoing funding), you are required to cite and acknowledge those who have made CMIP7 possible. When using CMIP7 data, you must:
-
Acknowledge CMIP7.
In the Acknowledgment section, please insert the following text:
We acknowledge the World Climate Research Programme's Coupled Model Intercomparison Project contributors who coordinated and promoted CMIP7. We thank the climate modelling groups for producing and making available their model output, the Earth System Grid Federation (ESGF) for archiving the data and providing access, and the multiple funding agencies who support CMIP7 and ESGF.
-
Cite the specific dataset(s) used.
Please include a citation in the form of:
Authors/Data Creators (publication year): Title. Version YYYYMMDD. Earth System Grid Federation. DOI.
e.g. Swart et al. (2019): CCCma CanESM5 model output prepared for CMIP6 ScenarioMIP. Version 20190429. Earth System Grid Federation. https://doi.org/10.22033/ESGF/CMIP6.1317.
If multiple models are used in a publication, please include a table with the sources (name of the model), institutions and citations. If the journal has a citation limit, a table in the Supporting Information is acceptable.
How to find the DOI and the version?
🔍 DOIs can be found through the Citation Search or in the citation tab of a dataset on MetaGrid. The version is indicated in a column on MetaGrid.
🖱️ It is also possible to take the tracking_id global attribute of a file and append it to http://hdl.handle.net/ (e.g., http://hdl.handle.net/hdl:21.14100/be06a059-363d-47a4-97a2-d5253190fd15). From there, you can follow "The file is part of the following aggregation(s)" and find the DOI and version of the dataset.
🤖 Instead of doing this by hand, you can also use the PROTOTYPE Python library CMIPcite. Input tracking_id(s), dataset PID(s) or file path(s) to retrieve the citation (as text or in BibTeX format).
Note that there are two citation granularities: experiment data and model/MIP data. The version must also be added separately, as it is not included in the DOI.
Further information on the data citation concept is described in Stockhause and Lautenschlager (2017). Citations can also be searched using DataCite's catalogue and Google's Dataset Search.
-
Cite a paper from the GMD special issue
Cite, as appropriate, one or more of the CMIP7 GMD special issue articles, which include an overview of the CMIP7 experiment design and descriptions of the CMIP7 endorsed MIPs.
-
Register your work.
Register your work on the CMIP7 Publication Hub (coming soon).
-
Adhere to the license
Adhere to the license conditions listed in the global attribute of each dataset.
-
Use the standard vocabularies
Where possible, we recommend using the CMIP7 standard names defined by the controlled vocabularies (CVs) (see Section 3) to make references as clear and unambiguous as possible. If your audience requires different terms, use those, but keep a mapping from each audience term to the standard name so that references can still be resolved unambiguously where needed. Refer to the collection of CMIP7 models as the “CMIP7 multi-model ensemble”.
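For the dataset citation and tracking_id steps in the list above, the string handling can be sketched in Python. This is a minimal illustration and not part of any CMIP tool: the function names format_citation and handle_url are hypothetical, and the values are taken from the example citation and handle shown above.

```python
def format_citation(creators: str, year: int, title: str, version: str, doi: str) -> str:
    """Assemble a citation in the 'Creators (year): Title. Version ... DOI.' form."""
    # The version is not part of the DOI, so it is passed in separately.
    if not (len(version) == 8 and version.isdigit()):
        raise ValueError("version is expected in YYYYMMDD form, e.g. 20190429")
    return (
        f"{creators} ({year}): {title}. Version {version}. "
        f"Earth System Grid Federation. {doi}."
    )


def handle_url(tracking_id: str) -> str:
    """Append a file's tracking_id global attribute to the handle resolver URL."""
    return "http://hdl.handle.net/" + tracking_id.strip()


citation = format_citation(
    creators="Swart et al.",
    year=2019,
    title="CCCma CanESM5 model output prepared for CMIP6 ScenarioMIP",
    version="20190429",
    doi="https://doi.org/10.22033/ESGF/CMIP6.1317",
)
print(citation)
print(handle_url("hdl:21.14100/be06a059-363d-47a4-97a2-d5253190fd15"))
```

Keeping the version as its own argument mirrors the note that it has to be added by hand alongside the DOI.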
Warning
The CMIP7 archive contains the output of scientific simulations of the past and potential future that are subject to multiple sources of error, ranging from errors in data handling to errors in how the real world is represented, either in the model or in the experimental setup for which the model was used. Different parts of the CMIP7 archive may be subject to differing levels of such errors, and users should be alert to these issues and their potential consequences, and to the limitations of liability expressed in the data license.
3. CMIP7 facets and their documentation
CMIP7 datasets can be identified through a series of facets that represent key attributes of the data. The main facets are:
- activity
- institution
- source
- experiment
- variant
- realm
- frequency
- variable
- grid
- version
Tip
Current advice from the CVs task team is to access the CVs only via ESGVOC. This advice may change in the future.
More information about the meaning of these facets is provided in the global attributes documentation, with further guidance provided on the Global Attributes page. The values associated with each facet are standardized through the CVs. They are used to search the ESGF database and can be found in the global attributes of the data. This section provides helpful links and gives a bit more information on a few key facets.
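As a rough illustration of how facets act as search keys, the sketch below filters a small in-memory list of dataset records by facet values. The records and the select helper are hypothetical, and the facet values are CMIP6-style placeholders rather than entries from the CMIP7 CVs; a real search would go through ESGF.

```python
from typing import Dict, Iterable, List

# The main facets listed above.
FACETS = (
    "activity", "institution", "source", "experiment", "variant",
    "realm", "frequency", "variable", "grid", "version",
)


def select(records: Iterable[Dict[str, str]], **query: str) -> List[Dict[str, str]]:
    """Return the records whose facet values match every key in the query."""
    unknown = set(query) - set(FACETS)
    if unknown:
        raise KeyError(f"unknown facet(s): {sorted(unknown)}")
    return [r for r in records if all(r.get(k) == v for k, v in query.items())]


# Placeholder records (illustrative values only).
records = [
    {"source": "CanESM5", "experiment": "historical", "variant": "r1i1p1f1", "variable": "tas"},
    {"source": "CanESM5", "experiment": "ssp585", "variant": "r1i1p1f1", "variable": "tas"},
    {"source": "CanESM5", "experiment": "ssp585", "variant": "r2i1p1f1", "variable": "pr"},
]

hits = select(records, experiment="ssp585", variable="tas")
print(hits)
```

Rejecting unknown facet names early is a design choice of this sketch: it catches typos in a query before they silently return an empty result.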
3.1. Source and Variant
- List of models (coming soon)
- Essential Model Documentation (EMD) (coming soon)
The Essential Model Documentation (EMD) provides a high-level description of model formulation that can be easily compared between different models. EMD pages contain links to more in-depth model documentation for each source.
Basic Concepts to Understand Variants
The source facet gives the name of the model, and the variant facet identifies each member of an ensemble for a given source. The variant label is also called the “ripf” identifier (“r” for realization, “i” for initialization, “p” for physics, and “f” for forcing).
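The structure of a ripf identifier can be unpacked with a short regular expression. The sketch below assumes labels of the familiar form r1i1p1f1, with a non-negative integer after each letter; the parse_variant helper is hypothetical, and the CVs remain the authority on which labels are valid.

```python
import re
from typing import Dict

# r<realization> i<initialization> p<physics> f<forcing>, e.g. "r1i1p1f1" (assumed form).
_VARIANT = re.compile(r"r(\d+)i(\d+)p(\d+)f(\d+)")


def parse_variant(label: str) -> Dict[str, int]:
    """Split a ripf variant label into its four integer indices."""
    match = _VARIANT.fullmatch(label)
    if match is None:
        raise ValueError(f"not a ripf label: {label!r}")
    r, i, p, f = (int(group) for group in match.groups())
    return {"realization": r, "initialization": i, "physics": p, "forcing": f}


print(parse_variant("r3i1p1f2"))
```

Parsing the indices into integers makes it easy to, for example, group ensemble members that share the same physics and forcing but differ in realization.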
A useful tool for evaluating the models is the Rapid Evaluation Framework (REF), which evaluates the models participating in CMIP6 and the CMIP7 Assessment Fast Track (AFT).
3.2. Experiment and Activity
- List of experiments (coming soon)
- List of activities (coming soon)
The CMIP7 protocol and experiments are described in a special issue of Geoscientific Model Development. An overview of the design and scientific strategy is provided in the lead article of that issue by Dunne et al. (2025).
Basic Concepts to Understand Experiments
Each model participating in CMIP7 will contribute results from the eight DECK experiments. These experiments are the only ones directly overseen by the CMIP Panel, and together they constitute the ongoing (slowly evolving) “CMIP” activity. In addition to the DECK, each modelling group may choose to contribute to any of the CMIP7 Community MIPs. The CMIP Panel identifies key experiments to be prioritized on different timelines through fast tracks. The first is the AFT, which includes a set of Community MIP experiments endorsed by the CMIP Panel to address specific needs.
MORE COMING SOON
3.3. Variable
- List of variables (coming soon)
- Branded variable documentation
The variables produced in CMIP7 were recommended by the CMIP7 Data Request task team. In CMIP7, each variable is identified by a branded variable name, which follows the template:
<variableRootDD>_<temporalLabelDD>-<verticalLabelDD>-<horizontalLabelDD>-<areaLabelDD>
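The template above can be split mechanically: one underscore separates the root from the labels, and hyphens separate the four labels. The sketch below assumes that none of the labels contains an underscore or a hyphen; the parse_branded_variable helper is hypothetical, and tas_tavg-h2m-hxy-u is used only as an illustrative name that should be checked against the branded variable documentation.

```python
from typing import NamedTuple


class BrandedVariable(NamedTuple):
    root: str        # <variableRootDD>
    temporal: str    # <temporalLabelDD>
    vertical: str    # <verticalLabelDD>
    horizontal: str  # <horizontalLabelDD>
    area: str        # <areaLabelDD>


def parse_branded_variable(name: str) -> BrandedVariable:
    """Split a branded variable name into its root and four labels."""
    # Split on the last underscore so the four labels are always the tail.
    root, sep, labels = name.rpartition("_")
    parts = labels.split("-")
    if not root or not sep or len(parts) != 4 or not all(parts):
        raise ValueError(f"not a branded variable name: {name!r}")
    return BrandedVariable(root, *parts)


print(parse_branded_variable("tas_tavg-h2m-hxy-u"))
```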
3.4. Frequency
- List of frequencies (coming soon)
Models report data at a variety of temporal frequencies. The MIP table defines the frequency with which requested variables in an experiment should be reported.
Calendars and Time Handling
Climate models often use simplified or idealized calendars for numerical and computational reasons.
CMIP7 data include a calendar attribute associated with the time coordinate, which determines how dates are represented.
Before working with any CMIP dataset, users should check the calendar type and handle it appropriately.
Common calendars found in CMIP data include:
- standard (or gregorian): the mixed Julian/Gregorian calendar used in the real world, including leap years.
- noleap (or 365_day): like the Gregorian calendar but without leap days (365 days every year).
- 360_day: each year has 12 months of 30 days (360 days in total).
- proleptic_gregorian: the Gregorian calendar rules extended backward to dates before the calendar was introduced in 1582.
- all_leap (or 366_day): every year has 366 days (all years include a leap day).
These calendars are stored in the calendar attribute of the time variable, for example:
ncdump -h tas_day_ACCESS-ESM1-5_ssp585.nc | grep calendar
For further reading, see the time coordinate section of the CF conventions: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html#time-coordinate
It is recommended that users use the cftime library to handle time.
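To show why the calendar matters, the sketch below counts the days in a year under each calendar named above, using only the Python standard library. It is an illustration of the calendar definitions, not a replacement for cftime; the days_in_year helper is hypothetical, and for the standard calendar it deliberately refuses years before 1583, where the Julian/Gregorian transition makes the count less simple.

```python
import calendar as pycal

# Calendars with a fixed year length, including the CF alias names.
_FIXED_LENGTH = {
    "noleap": 365, "365_day": 365,
    "all_leap": 366, "366_day": 366,
    "360_day": 360,
}


def days_in_year(year: int, cf_calendar: str) -> int:
    """Return the number of days in a year for a given CF calendar name."""
    if cf_calendar in _FIXED_LENGTH:
        return _FIXED_LENGTH[cf_calendar]
    if cf_calendar in ("standard", "gregorian"):
        if year < 1583:
            raise ValueError("sketch does not handle the Julian/Gregorian transition")
        return 366 if pycal.isleap(year) else 365
    if cf_calendar == "proleptic_gregorian":
        # Python's isleap applies the Gregorian rule to every year.
        return 366 if pycal.isleap(year) else 365
    raise ValueError(f"unknown calendar: {cf_calendar!r}")


# Length of a 2015-2100 scenario period under different calendars.
for name in ("standard", "noleap", "360_day"):
    total = sum(days_in_year(y, name) for y in range(2015, 2101))
    print(name, total)
```

The totals differ between calendars, which is why daily data from different models cannot be aligned by position alone and dates should be decoded with a calendar-aware library.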
3.5. Grid
- List of grids (coming soon)
- List of pressure levels: Table 2 of Dingley et al. 2025
- CMIP7 Guidance on Grids
Climate models use a variety of horizontal grids, which are documented in the grid registry (coming soon).
Different MIPs also have different requirements for vertical grid reporting. Output can be defined either on the native model levels, or it can be remapped to pressure levels.
Masked Averaging
Many variables in CMIP7 are masked means, that is, the mean of a quantity over the portion of the grid cell defined by an area type. For more information on this, see this webpage (coming soon).
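To make the masked-mean idea concrete, the sketch below combines per-cell masked means into a regional mean by weighting each cell by the area actually covered by the area type (cell area times covered fraction). The function name and the fractions input are assumptions made for illustration; which fraction field applies depends on the variable's area type.

```python
from typing import Sequence


def regional_masked_mean(
    values: Sequence[float],
    cell_areas: Sequence[float],
    fractions: Sequence[float],
) -> float:
    """Area-weighted mean of per-cell masked means.

    values     -- masked mean of the quantity in each grid cell
    cell_areas -- total area of each grid cell
    fractions  -- fraction of each cell covered by the area type (0 to 1)
    """
    if not (len(values) == len(cell_areas) == len(fractions)):
        raise ValueError("inputs must have the same length")
    weighted_sum = 0.0
    total_weight = 0.0
    for value, area, fraction in zip(values, cell_areas, fractions):
        weight = area * fraction  # area of the cell covered by the area type
        if weight > 0.0:          # cells with no coverage carry no information
            weighted_sum += value * weight
            total_weight += weight
    if total_weight == 0.0:
        raise ValueError("the area type covers none of the cells")
    return weighted_sum / total_weight


print(regional_masked_mean([10.0, 20.0], [1.0, 3.0], [0.5, 0.5]))
```

Weighting by the full cell area instead would give too much influence to cells where the area type covers only a small portion.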
4. CMIP7 data format
As in previous phases, all CMIP7 output has been written to netCDF files. Before being published, these files must pass the ESGF Quality Control (ESGF-QC). Many modelling centres use the CMOR software to standardize their files. They are then said to have been “CMORized”.
Essential features of CMORized data are:
- Standardized naming from CMIP CVs
- Consistent file naming convention
- Uniform metadata structure:
- Global attributes
- Coordinate variables such as time, lat, lon, plev
- One variable per file
- Self-describing (all metadata needed to interpret the data are included in the file)
- Consistent units and standard names following CF conventions
5. Reporting suspected errors
Information about known issues with CMIP7 data is recorded by the Errata Service.
Any CMIP data user can report an error by submitting an issue through the Propose button on the Errata Service website. Proposing an erratum through the webform requires a contact email address. Once the webform has been validated and the issue created, a special link is generated that can be shared, but the issue will not appear on the index page. A moderator (from the modelling centre that provided the data) will then validate, update or reject the entry. If no moderation action is taken within the 14-day validation period, the issue will be publicly indexed, albeit with a special flag. When data errors are discovered, data providers are expected to retract the affected datasets from ESGF and, if possible, republish corrected data using updated dataset version identifiers.
6. New to CMIP?
First time using CMIP? Need a bit more help? Check out the Entry-Level Documentation (coming soon), put together by the Fresh Eyes on CMIP group.
Have a more specific question? Ask it on the Fresh Eyes Platform. (You need to register here first.)