Python Functions

global_forecast_validation.compress_netcdf.compress_netcfd()

Combines the 52 individual ensemble forecast files generated by RAPIDpy into one compact NetCDF file. Disk space is saved by dropping the forecasts that aren’t daily (for example, the 3-hour and 6-hour forecasts).

Parameters:
  • folder_path (str) – The path to the folder containing the 52 ensemble forecast files in NetCDF format
  • out_folder (str) – The path to the folder where the more compact NetCDF file should be written.
  • file_name (str) – The name of the region. For example, if the files follow the pattern “Qout_africa_continental_1.nc”, this argument would be “Qout_africa_continental”.
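
To illustrate how file_name relates to the input files, the sketch below builds the 52 file names that would be expected in folder_path and reports any that are missing. Both helper names are hypothetical and not part of the package, and the numbering from 1 to 52 with a "_<n>.nc" suffix is an assumption inferred from the example pattern above.

```python
import os


def expected_ensemble_files(folder_path, file_name, n_ensembles=52):
    """Return the ensemble file paths implied by the naming pattern.

    Hypothetical helper (not part of global_forecast_validation). Assumes the
    files are named "<file_name>_<n>.nc" with n running from 1 to n_ensembles.
    """
    return [
        os.path.join(folder_path, f"{file_name}_{n}.nc")
        for n in range(1, n_ensembles + 1)
    ]


def missing_ensemble_files(folder_path, file_name, n_ensembles=52):
    """List the expected ensemble files that do not exist on disk."""
    return [
        path
        for path in expected_ensemble_files(folder_path, file_name, n_ensembles)
        if not os.path.isfile(path)
    ]
```

Running a check like this before compressing may help catch an incomplete download or a misspelled region name early.
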
global_forecast_validation.extract_data.extract_by_rivid()

Extracts data for a single stream from a folder of NetCDF forecast files (generated with the compress_netcdf function) into CSV files in the given path. The CSV files are named 1_Day_Forecasts, 2_Day_Forecasts, etc. The initialization values (water balance) and the high-resolution forecasts are also written in CSV format to the selected folder.

Parameters:
  • rivid (int) – The rivid (COMID) of the desired stream to extract data for.
  • folder_path (str) – The path to the folder containing the forecast NetCDF files (NetCDF files MUST be formatted as YYYYMMDD.nc).
  • outpath (str) – The path to the directory that you would like to write the CSV files to.
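
Because the forecast files must be named YYYYMMDD.nc, it can be useful to check a folder before extracting from it. The following is a minimal sketch under that assumption; list_forecast_files is a hypothetical helper and not part of the package.

```python
import os
from datetime import datetime


def list_forecast_files(folder_path):
    """Return (date, file name) pairs for files named YYYYMMDD.nc, sorted by date.

    Hypothetical helper, not part of the package. Raises ValueError for any
    .nc file whose name is not a valid YYYYMMDD date.
    """
    found = []
    for name in os.listdir(folder_path):
        stem, ext = os.path.splitext(name)
        if ext != ".nc":
            continue  # ignore files that are not NetCDF
        if len(stem) != 8 or not stem.isdigit():
            raise ValueError(f"{name!r} is not formatted as YYYYMMDD.nc")
        # strptime also rejects impossible dates such as 20190230
        found.append((datetime.strptime(stem, "%Y%m%d").date(), name))
    return sorted(found)
```

The returned list is in chronological order, so its first and last entries give the date range covered by the folder.
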
global_forecast_validation.validate_forecasts.compute_all()

Computes forecast metrics for all of the streams in a region.

Note that this function assumes the forecast files follow the naming convention produced by compress_netcdf.py (i.e. YYYYMMDD.nc). It calculates the following metrics and their skill scores against a persistence benchmark; the metric values for the benchmark itself are also provided in the results.

  • Continuous Ranked Probability Score
  • Mean Absolute Error
  • Mean Squared Error
  • Root Mean Square Error
  • Pearson R (Correlation)
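
For readers unfamiliar with these metrics, the sketch below shows textbook definitions in plain Python. It is not the package's implementation, which may differ (for instance in how it handles ensembles or missing values). All function names are hypothetical, and the skill score form 1 - score / benchmark_score is an assumption based on the usual convention for error-type scores.

```python
import math


def mae(forecast, observed):
    """Mean absolute error."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def mse(forecast, observed):
    """Mean squared error."""
    return sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)


def rmse(forecast, observed):
    """Root mean square error."""
    return math.sqrt(mse(forecast, observed))


def pearson_r(forecast, observed):
    """Pearson correlation coefficient."""
    n = len(observed)
    mean_f, mean_o = sum(forecast) / n, sum(observed) / n
    cov = sum((f - mean_f) * (o - mean_o) for f, o in zip(forecast, observed))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_f * var_o)


def crps_ensemble(members, observation):
    """Empirical CRPS of one ensemble forecast against one observation."""
    m = len(members)
    spread_to_obs = sum(abs(x - observation) for x in members) / m
    spread_within = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return spread_to_obs - spread_within


def skill_score(score, benchmark_score):
    """Assumed form: 1 is perfect, 0 matches the benchmark, negative is worse."""
    return 1.0 - score / benchmark_score
```

Under these definitions, a persistence benchmark would be scored with the same functions, using the most recent observation as the forecast for each later day.
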
Parameters:
  • work_dir (str) – The directory that contains all of the forecast files that were created with the compress_netcdf function. Make sure that this directory only contains the compressed forecast files.
  • out_path (str) – The path where the resulting CSV of results should be stored, including the file name. For example, to store the file in the directory the script is run from, set this parameter to just the file name (e.g. Skill_Scores.csv).
  • memory_to_allocate_gb (float) – The amount of memory, in gigabytes, to allocate to the program. Choose a conservative value, because slightly more memory may be consumed (possibly up to half a gigabyte in very large regions).
  • starting_date (str) – The starting date of the analysis formatted as YYYY-MM-DD (e.g. January 2, 2019 would be 2019-01-02).
  • ending_date (str) – The ending date of the analysis formatted as YYYY-MM-DD (e.g. January 2, 2019 would be 2019-01-02).
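
The date parameters use YYYY-MM-DD while the forecast files use YYYYMMDD.nc, so the sketch below shows one way to map a date range onto the file names it would cover. forecast_files_in_range is a hypothetical helper, and treating ending_date as inclusive is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta


def forecast_files_in_range(starting_date, ending_date):
    """Return the YYYYMMDD.nc names for each day from start to end.

    Hypothetical helper, not part of the package. Assumes ending_date is
    inclusive and that there is one forecast file per day.
    """
    start = datetime.strptime(starting_date, "%Y-%m-%d").date()
    end = datetime.strptime(ending_date, "%Y-%m-%d").date()
    if end < start:
        raise ValueError("ending_date must not be earlier than starting_date")
    n_days = (end - start).days + 1
    return [
        (start + timedelta(days=i)).strftime("%Y%m%d") + ".nc"
        for i in range(n_days)
    ]
```

Comparing this list with the contents of work_dir is a quick way to spot days with no forecast file before starting a long computation.
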
global_forecast_validation.organize_forecasts.organize_api_forecasts()

Organizes CSV files downloaded from the Streamflow Prediction Tool REST API.

Organizes the contents of a folder of forecasts that have been stored in CSV format from the Streamflow Prediction Tool REST API. This creates separate CSVs that contain the 1-Day, 2-Day, … forecasts from the given CSV files.

Parameters:
  • forecast_dir_path (str) – The path to the directory that contains all of the forecast files in CSV format that have been downloaded from the Streamflow Prediction Tool REST API. Note that the files will be sorted by name, so choose a naming convention that sorts into the correct order (YYYYMMDD format works nicely, but any convention that sorts correctly will do).
  • out_dir_path (str) – The path to the directory where the resulting organized files will be written.
  • daily (bool) – If True (default), only daily values will be saved (e.g. one-day forecasts). If False, all of the time frequencies will be used (e.g. 3-hour forecasts, 6-hour forecasts, etc.). Note that setting this parameter to False will generate many files.
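
Since the forecast files are sorted by name, the naming convention determines whether they are processed in date order. The sketch below checks whether a set of file names sorts chronologically. sorts_chronologically is a hypothetical helper, and it assumes a plain lexicographic sort, which may not match exactly how the package orders the files.

```python
import os
from datetime import datetime


def sorts_chronologically(file_names, date_format):
    """Return True if sorting by name gives the same order as sorting by date.

    Hypothetical helper, not part of the package. date_format is a strptime
    pattern describing the file name without its extension.
    """
    def parsed(name):
        stem = os.path.splitext(name)[0]
        return datetime.strptime(stem, date_format)

    by_name = sorted(file_names)
    by_date = sorted(file_names, key=parsed)
    return by_name == by_date


# YYYYMMDD names sort correctly; MMDDYYYY names do not across a year boundary.
yyyymmdd = ["20190102.csv", "20181231.csv", "20190101.csv"]
mmddyyyy = ["01022019.csv", "12312018.csv", "01012019.csv"]
yyyymmdd_ok = sorts_chronologically(yyyymmdd, "%Y%m%d")
mmddyyyy_ok = sorts_chronologically(mmddyyyy, "%m%d%Y")
```

Here yyyymmdd_ok is True and mmddyyyy_ok is False, because the MMDDYYYY names place December 31, 2018 after the January 2019 files.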