Reach Scale Hydrology


Surface meteorological forcing downscaled from NLDAS-2 and StageIV over the continental United States

1-km 1-hourly CONUS Meteorological Forcing on NWM Grid

As part of the data infrastructure built by the Hydrology Team at the Center for Western Weather and Water Extremes (CW3E), a meteorological forcing engine has been developed to produce long-term (1979 to near real time), high-resolution (1-km, 1-hourly), national-scale (CONUS) forcing data in support of research and applications in hydrologic modeling and forecasting. The product is configured to serve the center's demanding needs, for example, building near-real-time (NRT) forecasting systems for various regions and performing large-scale modeling research for the National Water Model (NWM) (both the current WRF-Hydro generation and NextGen) in collaboration with the CIROH consortium.

Methods of Production

An elevation (topography) based downscaling and merging procedure is applied to all input forcing variables (precipitation, temperature, humidity, short-/long-wave radiation, pressure, and wind), following the existing literature (Cosgrove et al., 2003) as well as AORC practices. The forcing engine ingests inputs from different sources with differing temporal/spatial resolutions, domains, reliability, periods of coverage, and lag times, and generates multiple streams of forcing data, two of which are made available to the public: retrospective data (1979 to ~7 months behind real time) and near-real-time (NRT) data (up to the present day).

Input data

A number of near-real-time and historical data products from different agencies (including CW3E) are collected and updated daily:

Eight variables (precipitation, 2-m air temperature, surface pressure, specific humidity, downward shortwave/longwave radiation, and U/V wind) are produced using the procedures shown in the table below. NLDAS-2, Stage IV, and PRISM form the backbone, and HRRR is used only for the most recent ~3.5 days. Because more data become available the longer one waits, more reliable data replace less reliable ones as they arrive; for example, NLDAS-2 is a reanalysis and thus more reliable than the HRRR analysis, so it replaces the latter once available.

Downscaling/merging procedures

The downscaling procedure follows a series of commonly adopted physical principles, for example, a fixed air temperature lapse rate with elevation, a hydrostatic pressure profile, and an emission temperature adjustment for longwave radiation. All downscaling is performed on the lat/lon grid at 0.01° resolution, and the final results are reprojected onto NWM's 1-km LCC grid. See the following table for details.
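To illustrate two of these principles, the sketch below applies a fixed lapse rate to temperature and a hydrostatic adjustment to pressure. The constants (a -6.5 K/km lapse rate, standard gravity, and the dry-air gas constant) are common textbook values assumed here for illustration; the production engine follows Cosgrove et al. (2003) and may use different formulations.

```python
import math

# Assumed constants for this sketch (the production values may differ).
LAPSE_RATE = -6.5 / 1000.0  # K per meter
G, R_DRY = 9.81, 287.0      # gravity (m/s^2), dry-air gas constant (J/kg/K)

def downscale_temperature(t_coarse_k, z_coarse_m, z_fine_m):
    """Shift a coarse-grid temperature to the fine-grid elevation."""
    return t_coarse_k + LAPSE_RATE * (z_fine_m - z_coarse_m)

def downscale_pressure(p_coarse_pa, t_k, z_coarse_m, z_fine_m):
    """Hydrostatic adjustment of pressure to the fine-grid elevation."""
    return p_coarse_pa * math.exp(-G * (z_fine_m - z_coarse_m) / (R_DRY * t_k))

# A fine-grid cell 500 m higher than the coarse cell is 3.25 K cooler.
t_fine = downscale_temperature(288.15, 1000.0, 1500.0)
```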

Data streams across time horizons and update schedule

Because more data become available the longer one waits, more reliable data replace less reliable ones as they arrive. For example, NLDAS-2 is a reanalysis and thus more reliable than the HRRR analysis, so it replaces the latter once available. The same happens between the Stage IV Real-time and Archive versions, and between the PRISM Provisional and Recent History versions. As older or less reliable versions/sources are overwritten by newer ones, multiple streams of data are created, rolling forward in time.
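This overwrite logic can be sketched as a priority scheme: each source has a reliability rank and an availability lag, and the highest-ranked source whose data have arrived wins. This is an illustrative reconstruction, not the engine's actual code; the two sources and their lags below (HRRR at ~1 hour, NLDAS-2 at ~3.5 days, loosely based on the "most recent 3.5 days" window noted above) are assumptions.

```python
from datetime import datetime, timedelta

# Illustrative (name, reliability rank, availability lag) entries; the
# highest-ranked source whose lag has elapsed wins. Lags are assumptions.
SOURCES = [
    ("NLDAS-2",       1, timedelta(days=3.5)),
    ("HRRR analysis", 0, timedelta(hours=1)),
]

def best_source(valid_time, now):
    """Pick the most reliable source already available for valid_time."""
    available = [(rank, name) for name, rank, lag in SOURCES
                 if now - valid_time >= lag]
    return max(available)[1] if available else None

now = datetime(2024, 9, 20)
```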

For the purpose of hydrologic modeling and forecasting, we define several time horizons: retrospective, near-real-time (NRT), short-range forecast, and seasonal forecast (Figure 2). Forcing data in the retrospective period are fairly stable, revised only for bug fixes and other quality improvements, while forcing data in the NRT period are subject to frequent updates. Given space limitations, we make two streams of data products available to the public: the retrospective stream (1979 to ~7 months behind real time) and the NRT stream (up to the present day).

Data format

NetCDF format and grid projection

All forcing data are in NetCDF format and follow the CF conventions, so most NetCDF-capable software can read the data and interpret the metadata (e.g., time stamps, grid/projection settings). The naming of the forcing variables and the units/sign definitions follow the WRF-Hydro convention, so the files can be read directly by the WRF-Hydro model. See the table for the list of 8 variables.

The data are 1-hourly and labeled in UTC. Following NCEP conventions, precipitation is the mean flux over the preceding hour; for example, the precipitation value labeled 12 UTC is the mean between 11 UTC and 12 UTC.
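A small stdlib sketch of these conventions: decoding a CF-style "minutes since" time value (the epoch is taken from an example file header) and converting the mean flux to an hourly accumulation. Note that a RAINRATE of 1 kg/m^2/s is numerically equal to 1 mm/s of liquid water.

```python
from datetime import datetime, timedelta

# Epoch from an example header: "minutes since 2023-12-31 00:00" (UTC).
epoch = datetime(2023, 12, 31, 0, 0)

def decode_time(minutes):
    """Convert a CF 'minutes since epoch' value to a datetime."""
    return epoch + timedelta(minutes=minutes)

def hourly_accum_mm(rainrate_mm_per_s):
    """Accumulation (mm) for the hour ending at the time label.

    The value labeled at hour t is the mean flux over (t-1h, t],
    so multiplying by 3600 s gives that hour's accumulation.
    """
    return rainrate_mm_per_s * 3600.0

stamp = decode_time(720)  # 720 minutes after the epoch -> 12 UTC
```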

The data is in Lambert Conformal Conic (LCC) projection at 1-km resolution. The WKT projection parameters are as follows:


PROJCS["Lambert_Conformal_Conic",
    GEOGCS["GCS_Sphere",
        DATUM["D_Sphere",
            SPHEROID["Sphere",6370000.0,0.0]],
        PRIMEM["Greenwich",0.0],
        UNIT["Degree",0.0174532925199433]],
    PROJECTION["Lambert_Conformal_Conic_2SP"],
    PARAMETER["false_easting",0.0],
    PARAMETER["false_northing",0.0],
    PARAMETER["central_meridian",-97.0],
    PARAMETER["standard_parallel_1",30.0],
    PARAMETER["standard_parallel_2",60.0],
    PARAMETER["latitude_of_origin",40.0],
    UNIT["Meter",1.0]]

The data grid has 3840 rows and 4608 columns (1000 m grid spacing in both the x and y directions), though the actual data are bounded between 25°N and 53°N in latitude and between 125°W and 67°W in longitude.
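For checking coordinates without a projection library, the grid geometry can be reproduced with the spherical Lambert Conformal Conic forward equations (Snyder, 1987) using the WKT parameters above. This is a sketch for spot checks; for production work use a proper projection library such as pyproj.

```python
import math

# Spherical LCC (2SP) forward projection with the WKT parameters above:
# R = 6370000 m, standard parallels 30/60, origin at 40N, 97W.
R = 6370000.0
p1, p2, p0, l0 = map(math.radians, (30.0, 60.0, 40.0, -97.0))

n = (math.log(math.cos(p1) / math.cos(p2))
     / math.log(math.tan(math.pi/4 + p2/2) / math.tan(math.pi/4 + p1/2)))
F = math.cos(p1) * math.tan(math.pi/4 + p1/2) ** n / n
rho0 = R * F / math.tan(math.pi/4 + p0/2) ** n

def lcc_forward(lat_deg, lon_deg):
    """Project geographic coordinates to LCC x/y in meters."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    rho = R * F / math.tan(math.pi/4 + lat/2) ** n
    theta = n * (lon - l0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)

x0, y0 = lcc_forward(40.0, -97.0)  # the projection origin maps to (0, 0)
```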

Here is one example dump of the NetCDF header (repeated metadata like missing values and coordinates for data variables are abridged):

dimensions:
        time = UNLIMITED ; // (24 currently)
        x = 4608 ;
        y = 3840 ;
        nv4 = 4 ;
variables:
        double time(time) ;
                time:standard_name = "time" ;
                time:long_name = "Time" ;
                time:units = "minutes since 2023-12-31 00:00" ;
                time:calendar = "standard" ;
                time:axis = "T" ;
        double lon(y, x) ;
                lon:standard_name = "longitude" ;
                lon:long_name = "longitude" ;
                lon:units = "degrees" ;
                lon:_CoordinateAxisType = "Lon" ;
                lon:bounds = "lon_bnds" ;
        double lon_bnds(y, x, nv4) ;
        double lat(y, x) ;
                lat:standard_name = "latitude" ;
                lat:long_name = "latitude" ;
                lat:units = "degrees" ;
                lat:_CoordinateAxisType = "Lat" ;
                lat:bounds = "lat_bnds" ;
        double lat_bnds(y, x, nv4) ;
        float T2D(time, y, x) ;
                T2D:standard_name = "air_temperature" ;
                T2D:long_name = "Air Temperature" ;
                T2D:units = "K" ;
                T2D:coordinates = "lat lon" ;
                T2D:_FillValue = -9.99e+08f ;
                T2D:missing_value = -9.99e+08f ;
        float Q2D(time, y, x) ;
                Q2D:standard_name = "specific_humidity" ;
                Q2D:long_name = "Specific Humidity" ;
                Q2D:units = "1" ;
        float PSFC(time, y, x) ;
                PSFC:standard_name = "air_pressure" ;
                PSFC:long_name = "Pressure" ;
                PSFC:units = "Pa" ;
        float U2D(time, y, x) ;
                U2D:standard_name = "eastward_wind" ;
                U2D:long_name = "U Wind" ;
                U2D:units = "m/s" ;
        float V2D(time, y, x) ;
                V2D:standard_name = "northward_wind" ;
                V2D:long_name = "V Wind" ;
                V2D:units = "m/s" ;
        float SWDOWN(time, y, x) ;
                SWDOWN:standard_name = "surface_downwelling_shortwave_flux_in_air" ;
                SWDOWN:long_name = "Downward Shortwave Radiation" ;
                SWDOWN:units = "W/m^2" ;
        float LWDOWN(time, y, x) ;
                LWDOWN:standard_name = "surface_downwelling_longwave_flux_in_air" ;
                LWDOWN:long_name = "Downward Longwave Radiation" ;
                LWDOWN:units = "W/m^2" ;
        float RAINRATE(time, y, x) ;
                RAINRATE:standard_name = "precipitation_flux" ;
                RAINRATE:long_name = "Precipitation" ;
                RAINRATE:units = "kg/m^2/s" ;

File and folder organization

The data files are organized by years and each data file contains data for one day. The naming follows WRF-Hydro’s convention:

[YYYY]/[YYYYMMDD].LDASIN_DOMAIN1

In Python, you can use the following lines to create file names:

from datetime import datetime
t = datetime(2024, 9, 20)
datafile = f'{t:%Y}/{t:%Y%m%d}.LDASIN_DOMAIN1'
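Extending that pattern, a small helper (a sketch, not part of the dataset tooling) enumerates the daily file paths over a date range:

```python
from datetime import date, timedelta

def daily_files(start, end):
    """Yield [YYYY]/[YYYYMMDD].LDASIN_DOMAIN1 paths, end date inclusive."""
    t = start
    while t <= end:
        yield f"{t:%Y}/{t:%Y%m%d}.LDASIN_DOMAIN1"
        t += timedelta(days=1)

# Four daily files spanning a year boundary
files = list(daily_files(date(2023, 12, 30), date(2024, 1, 2)))
```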

An example folder structure will look like:

├── 2023
│   ├── 20230101.LDASIN_DOMAIN1
│   ├── 20230102.LDASIN_DOMAIN1
⠇   ⠇
│   └── 20231231.LDASIN_DOMAIN1
└── 2024
    ├── 20240101.LDASIN_DOMAIN1
    ├── 20240102.LDASIN_DOMAIN1
    ⠇

Common tools to manipulate data

For visualization and simple plotting purposes, we recommend NASA's Panoply Viewer, which handles the map projection very nicely. It is built on Java and thus requires a Java Runtime, which is available on both Windows and Linux.

For a quick peek at the data, we highly recommend the Ncview tool by David W. Pierce at Scripps Institution of Oceanography; it can be installed with the Python package manager conda. Ncview does not reproject the data from LCC, but it is much faster and simpler than Panoply.

For data operations such as reprojection, regridding, variable extraction, time/space subsetting, and time/space averaging, we recommend cdo (Climate Data Operators). For simpler operations such as compression, decompression, and metadata management, the nco (NetCDF Operators) toolkit also works.

For example, to regrid the data to a 0.1°x0.1° lat-lon grid, we can do:

# Create lat/lon grid definition
cat << EOF > latlon_conus_0.1deg.txt
gridtype = lonlat
xsize    = 580
ysize    = 280
xfirst   = -124.95
xinc     = 0.1
yfirst   = 25.05
yinc     = 0.1
EOF
# Use cdo remapbil operator to regrid the data
cdo -f nc4 -z zip remapbil,latlon_conus_0.1deg.txt 2024/20240101.LDASIN_DOMAIN1 20240101.LDASIN_DOMAIN1_0.1deg

For low-level manipulations not covered by those tools, the Python netCDF4 library is an option.

How to download

Both the retrospective and NRT forcing products are hosted on Globus:

To access them, you need an account on the Globus website. You may already have one through your institution; otherwise, it is free to register. Once you log in to the Globus web app, clicking the links provided above will bring you directly to the data (see the screenshot).

If you prefer a command line way to download the data, or need to automate it, you can install the Globus CLI with pip or conda. Once you have the Globus CLI installed, you can list and copy data:

GLOBUS_RETRO=0351632c-c1f7-4885-8125-0a19290791ff
GLOBUS_NRT=1620b36c-6d83-45d1-8673-5143f09ac5d8
globus ls -l $GLOBUS_RETRO:1979
globus transfer $GLOBUS_RETRO:1979 [my globus endpoint]:[my data path]/1979 --recursive

Caution: very large data files and folders.

Known issues

Striping noise in Stage IV hourly data over CNRFC

Among the input datasets, the NCEP Stage IV hourly data over the CNRFC region have exhibited striping noise since July 2020. Cumulative values at 6-hourly or longer intervals are not affected. The issue is believed to stem from a buggy 6-hourly to 1-hourly temporal disaggregation procedure performed at NCEP, and no fix has been applied at NCEP so far. A temporary workaround is being developed to redo the temporal disaggregation using NSSL MRMS products (e.g., multi-sensor pass 1).

Lack of observation data outside of US border

The input data products contain few Canadian or Mexican observations, causing abrupt changes across the borders. An ongoing effort aims to blend in data from the Canadian Precipitation Analysis System (CaPA).

Bias correction by Maxwell Lab at Princeton University & HydroFrame team (2003-2005 only)

The Maxwell Lab at Princeton University and the HydroFrame team investigated temperature biases in the data against observation networks such as SNOTEL. Based on their findings, they applied corrections to the raw temperature and specific humidity data for Water Years 2003-2005 at the HUC02 level. Temperature was adjusted as described for each region below, and specific humidity was adjusted along with temperature using the Clausius-Clapeyron equation.
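One common way to carry a temperature correction into specific humidity is to hold relative humidity fixed and rescale by the ratio of saturation vapor pressures. The sketch below uses the Bolton (1980) approximation to the Clausius-Clapeyron relation; it illustrates the idea and is not necessarily the teams' exact implementation.

```python
import math

def e_sat(t_k):
    """Saturation vapor pressure (Pa), Bolton (1980) approximation."""
    t_c = t_k - 273.15
    return 611.2 * math.exp(17.67 * t_c / (t_c + 243.5))

def adjust_q(q, t_old_k, t_new_k):
    """Rescale specific humidity so relative humidity is preserved
    under a temperature correction (Clausius-Clapeyron scaling)."""
    return q * e_sat(t_new_k) / e_sat(t_old_k)

# Warming the air raises the saturation pressure, so q scales up.
q_new = adjust_q(0.008, 288.15, 290.15)
```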

The data is made available by the HydroFrame team on their HydroData platform. HydroData, developed by HydroFrame, provides a data catalog, an associated Python API library hf_hydrodata for querying and retrieving data, and other tools for manipulating the data. See the HydroData documentation here. To retrieve the bias-corrected data, install the Python API library hf_hydrodata and follow the instructions here and in the Accessing Gridded Data section:

import hf_hydrodata as hf

hf.register_api_pin("<your_email>", "<your_pin>")

# Define filters and return the data as a NumPy array
filters = {
    "dataset": "CW3E",
    "variable": "air_temp",
    "temporal_resolution": "daily",
    "start_time": "2005-01-01",
    "end_time": "2005-01-02",
}
data = hf.get_gridded_data(filters)
print(data.shape)

# Get the metadata about the returned data
metadata = hf.get_catalog_entry(filters)
print(metadata)

The dataset name is “CW3E” and the variable names can be found here.

Data citation and contact information

The data products are experimental and are provided to the community without warranty of any kind. There is not yet a peer-reviewed journal publication describing this dataset, though multiple studies have used it in their modeling work (see the next section). The following Google Document contains the technical notes for the data product and is updated more frequently than this webpage:

CW3E 1-km 1-hourly Meteorological Forcing on NWM Grid

To use these products or to report bugs or issues, please contact the data producer:

Ming Pan, Senior Hydrologist

Center for Western Weather and Water Extremes (CW3E)

Scripps Institution of Oceanography

University of California San Diego

Email: m3pan@ucsd.edu 

Journal publications using this data

Martens, H. R., Lau, N., Swarr, M. J., Argus, D. F., Cao, Q., Young, Z. M., et al. (2024). GNSS geodesy quantifies water-storage gains and drought improvements in California spurred by atmospheric rivers. Geophysical Research Letters, 51, e2023GL107721. https://doi.org/10.1029/2023GL107721