reskit.weather.NCSource#

Classes#

Index

NCSource

The NCSource object manages weather data from a generic set of netCDF4 file sources

Module Contents#

class reskit.weather.NCSource.Index#

Bases: tuple

yi#
xi#
class reskit.weather.NCSource.NCSource(source, bounds=None, index_pad=0, time_name='time', lat_name='lat', lon_name='lon', tz=None, _max_lon_diff=0.6, _max_lat_diff=0.6, verbose=True, forward_fill=True, flip_lat=False, flip_lon=False, time_offset_minutes=None, time_index_from=None)#

Bases: object

The NCSource object manages weather data from a generic set of netCDF4 file sources

It furthermore allows access to a number of common functionalities and constants which are often encountered when simulating renewable energy technologies

Note:#

Various constants can be set for a given weather source which can impact later simulation workflows.

Note that not all weather sources will have all of these constants available. Also more may be implemented besides (so be sure to check the DocString for the source you intend to use).

These constants include:

MAX_LON_DIFFERENCE

The maximum longitude difference to accept between a grid cell and the coordinates you would like to extract data for

MAX_LAT_DIFFERENCE

The maximum latitude difference to accept between a grid cell and the coordinates you would like to extract data for

WIND_SPEED_HEIGHT_FOR_WIND_ENERGY

The suggested altitude of wind speed data to use for wind-energy simulations

WIND_SPEED_HEIGHT_FOR_SOLAR_ENERGY

The suggested altitude of wind speed data to use for solar-energy simulations

LONG_RUN_AVERAGE_WINDSPEED

A path to a raster file with the long-time average wind speed in each grid cell

  • Can be used in wind energy simulations

  • Calculated at the height specified in WIND_SPEED_HEIGHT_FOR_WIND_ENERGY

  • Time range included in the long run averaging depends on the data source

LONG_RUN_AVERAGE_WINDDIR

A path to a raster file with the long-time average wind direction in each grid cell

  • Can be used in wind energy simulations

  • Calculated at the height specified in WIND_SPEED_HEIGHT_FOR_WIND_ENERGY

  • Time range included in the long run averaging depends on the data source

LONG_RUN_AVERAGE_GHI
A path to a raster file with the long-time average global horizontal irradiance in each grid cell

  • Can be used in solar energy simulations

  • Calculated at the surface

  • Time range included in the long run averaging depends on the data source

LONG_RUN_AVERAGE_DNI
A path to a raster file with the long-time average direct normal irradiance in each grid cell

  • Can be used in solar energy simulations

  • Calculated at the surface

  • Time range included in the long run averaging depends on the data source

See also

reskit.weather.MerraSource, reskit.weather.SarahSource, reskit.weather.Era5Source, CordexSource

param path:

The path to the main data file(s) to load

If multiple files are given, or if a directory of netCDF4 files is given, then it is assumed that all files ending with the extension ‘.nc’ or ‘.nc4’ should be managed by this object.

  • Be sure that all the netCDF4 files given share the same time and spatial dimensions!

type path:

str or list of strings

param bounds:
The boundaries of the data which is needed
  • Usage of this will help with memory management

  • If None, the full dataset is loaded in memory

  • The actual extent of the loaded data depends on the source’s available data

type bounds:

Anything acceptable to geokit.Extent.load(), optional

param index_pad:
The padding to apply to the boundaries
  • Useful in case of interpolation

  • Units are in longitudinal degrees

type index_pad:

int, optional

param time_name:

The name of the time parameter in the netCDF4 dataset

type time_name:

str, optional

param lat_name:

The name of the latitude parameter in the netCDF4 dataset

type lat_name:

str, optional

param lon_name:

The name of the longitude parameter in the netCDF4 dataset

type lon_name:

str, optional

param tz:

Applies the indicated timezone onto the time axis

  • For example, use “GMT” for unadjusted time

type tz:

str, optional

param verbose:

If True, then status outputs are printed when searching for and reading weather data

type verbose:

bool, optional

param forward_fill:

If True, then missing data in the weather file is forward-filled

  • Generally, there should be no missing data at all. This option is only intended to catch the rare scenarios where one or two timesteps are missing

type forward_fill:

bool, optional

param flip_lat:

If True, flips the latitude dimension when reading weather data from the source

  • Should only be given if latitudes are given in descending order

type flip_lat:

bool, optional

param flip_lon:

If True, flips the longitude dimension when reading weather data from the source

  • Should only be given if longitudes are given in descending order

type flip_lon:

bool, optional

param time_offset_minutes:

If not None, adds the specified offset in minutes to the timesteps read from the weather file

type time_offset_minutes:

numeric, optional

See also

MerraSource, SarahSource, Era5Source
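A typical workflow with such a source is to load a standard variable and then extract a time series at one or more locations. The helper below is only a sketch of that pattern: `extract_wind_speed` is a hypothetical name, the file path in the docstring is a placeholder, and it assumes that the standard loader stores its data under the same standard name that was passed to `sload`.

```python
def extract_wind_speed(source, locations, interpolation="near"):
    """Load the elevated wind speed and return its time series at `locations`.

    `source` is expected to be an NCSource (or subclass) instance, created
    for example with (placeholder path, not a real dataset):

        from reskit.weather.NCSource import NCSource
        source = NCSource("/path/to/netcdf_files", bounds=None)
    """
    # Standard loaders translate plain-English names into source-specific ones
    source.sload("elevated_wind_speed")

    # Assumption: the loaded data is kept under the standard variable name
    return source.get("elevated_wind_speed", locations, interpolation=interpolation)
```

Passing the source in as an argument keeps the helper usable with any subclass (for example a MERRA or ERA5 source) that provides the same `sload` and `get` methods.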

WIND_SPEED_HEIGHT_FOR_WIND_ENERGY = None#
WIND_SPEED_HEIGHT_FOR_SOLAR_ENERGY = None#
LONG_RUN_AVERAGE_WINDSPEED = None#
LONG_RUN_AVERAGE_WINDDIR = None#
LONG_RUN_AVERAGE_GHI = None#
LONG_RUN_AVERAGE_DNI = None#
MAX_LON_DIFFERENCE = None#
MAX_LAT_DIFFERENCE = None#
variables#
fill = True#
_allLats#
_allLons#
_maximal_lon_difference = 0.6#
_maximal_lat_difference = 0.6#
_flip_lat = False#
_flip_lon = False#
extent#
time_name = 'time'#
_timeindex_raw#
data#
var_info(var)#

Prints more information about the given variable

Parameters:

var (str) – The variable to get more information about

Return type:

None

Note:#

You can access a list of all available variables by printing the member “.variables”

to_pickle(path)#

Save the source as a pickle file, so it can be quickly reopened later

Parameters:

path (str) – The path to write the output file at

Return type:

None

static from_pickle(path)#

Load an NCSource source from a pickle file

Parameters:

path (str) – The path to read from

Return type:

NCSource

list_standard_variables()#

Prints the standard variable loaders available to this weather source

sload(*variables)#

Load standard variables into the source’s data library

Parameters:

*variables (str) – The standard variables to read from the weather source

Return type:

None

Raises:

RuntimeError – If the given standard variable name is not known to the weather source

Note:#

The names of the standard variable do not refer to the names of the data within the source.

Instead, they refer to common plain-english names which are translated to the source- specific names within the associated standard-loader function

You can see which standard loaders are available for the weather source by looking at the class methods starting with the name “sload_”

Common variable names include:

  • elevated_wind_speed -> The wind speed at WIND_SPEED_HEIGHT_FOR_WIND_ENERGY

  • surface_wind_speed -> The wind speed at WIND_SPEED_HEIGHT_FOR_SOLAR_ENERGY

  • wind_speed_at_Xm -> The wind speed at X meters above the surface

  • elevated_wind_direction -> The wind direction at WIND_SPEED_HEIGHT_FOR_WIND_ENERGY

  • surface_wind_direction -> The wind direction at WIND_SPEED_HEIGHT_FOR_SOLAR_ENERGY

  • wind_direction_at_Xm -> The wind direction at X meters above the surface

  • surface_pressure -> The pressure at the surface

  • surface_air_temperature -> The air temperature at the surface

  • surface_dew_temperature -> The dew-point temperature at the surface

  • global_horizontal_irradiance -> The global horizontal irradiance at the surface

  • direct_normal_irradiance -> The direct normal irradiance at the surface

  • direct_horzontal_irradiance -> The direct irradiance at the surface on a horizontal plane

See also

NCSource.load

load(variable, name=None, height_idx=None, processor=None, overwrite=False)#

Load a variable into the source’s data table

Parameters:
  • variable (str) –

    The variable within the curated data sources to load
    • The variable must either be of dimension (time, lat, lon) or (time, height, lat, lon)

  • name (str, optional) –

    The name to give this variable in the data library
    • If None, the name of the original variable is kept

  • height_idx (int, optional) – The height index to extract if the original variable has the height dimension

  • processor (func, optional) –

    A function to process the loaded data before loading it into the data library

    • This function must take a single matrix argument with dimensions (time, lat, lon), and must return a matrix of the same shape

    • Example: If the NC file has temperature in Kelvin and you need degrees Celsius:

      processor = lambda x: x-273.15

  • overwrite (bool, optional) –

    If False, then this function will exit early if the desired variable name already exists within the data library. Otherwise, any pre-existing data is overwritten

Return type:

None

See also

sload
  • For loading standard variables into the weather source using pre-configured calls to ‘load’
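As an illustration of the `processor` argument of `load`, the sketch below defines a named function instead of a lambda. Converting Kelvin to degrees Celsius means subtracting 273.15. The variable name "T2M", the target name and the `source` object in the commented call are placeholders and not names taken from any particular dataset.

```python
def kelvin_to_celsius(values):
    """Convert temperatures from Kelvin to degrees Celsius.

    The subtraction is element-wise on NumPy arrays, so a matrix with
    dimensions (time, lat, lon) comes back with the same shape.
    """
    return values - 273.15


# Hypothetical usage ("T2M" and `source` are placeholders):
# source.load("T2M", name="air_temperature_celsius", processor=kelvin_to_celsius)
```

A named function is easier to reuse and test than an inline lambda, and it should also survive pickling better if it ever ends up stored alongside other objects.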

static _loc_to_index_rect(lat_step, lon_step)#
loc_to_index(loc, outside_okay=False, as_int=True)#

Returns the closest X and Y indexes corresponding to a given location or set of locations

Parameters:
  • loc (Anything acceptable by geokit.LocationSet) –

    The location(s) to search for

    • A single tuple with (lon, lat) is acceptable, or a list of such tuples

    • A single point geometry (as long as it has an SRS), or a list of geometries is okay

    • geokit.Location, or geokit.LocationSet are best!

  • outside_okay (bool, optional) –

    Determines if points which are outside the source’s lat/lon grid are allowed

    • If True, points outside this space will return as None

    • If False, an error is raised

Returns:

  • If a single location is given (tuple) –

    • Format: (yIndex, xIndex)

    • y index can be accessed with ‘.yi’

    • x index can be accessed with ‘.xi’

  • If multiple locations are given (list) –

    • Format: [ (yIndex1, xIndex1), (yIndex2, xIndex2), …]

    • Order matches the given order of locations

Note:#

The default form of this function (which is the one used here) is not very efficient, ultimately leading to much longer look-up times than necessary. When the weather source has grid cells on a regular lat/lon grid, a more efficient form of this function can be configured using the function generator “_loc_to_index_rect”. In these instances, this is the recommended function to use.

For example, if the weather source uses a latitude spacing of 0.5 and a longitude spacing of 0.625, then the function generator can be used like:

> source.loc_to_index = source._loc_to_index_rect(lat_step=0.5, lon_step=0.625)
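The reason a regular grid allows a faster lookup is that the index can be computed directly from the coordinate instead of being searched for. The following is a minimal sketch of that idea, not the library's implementation: `make_rect_lookup`, the grid origin (`lat0`, `lon0`) and the stand-in `Index` namedtuple are all assumptions made for illustration, and both axes are assumed to be in ascending order.

```python
from collections import namedtuple

# Stand-in mirroring the (yi, xi) layout of the Index tuple documented above
Index = namedtuple("Index", ["yi", "xi"])


def make_rect_lookup(lat0, lon0, lat_step, lon_step):
    """Return a function mapping (lon, lat) to the nearest grid index.

    lat0 and lon0 are the coordinates of grid cell (0, 0).
    """
    def lookup(lon, lat):
        # On a regular grid the nearest index is a simple division and rounding
        yi = int(round((lat - lat0) / lat_step))
        xi = int(round((lon - lon0) / lon_step))
        return Index(yi=yi, xi=xi)

    return lookup


lookup = make_rect_lookup(lat0=30.0, lon0=-10.0, lat_step=0.5, lon_step=0.625)
idx = lookup(lon=-8.75, lat=31.2)
print(idx.yi, idx.xi)
```

Each lookup costs constant time regardless of grid size, whereas a generic nearest-cell search has to compare the location against the stored coordinates.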

get(variable, locations, interpolation='near', force_as_data_frame=False, outside_okay=False, _indices=None)#

Retrieve a time series for a variable from the source’s data library at the given location(s)

Can also use various interpolation schemes (e.g. near, bilinear, or cubic)

Parameters:
  • variable (str) – The variable within the data library to extract

  • locations (Anything acceptable by geokit.LocationSet.load( )) –

    The location(s) to search for

    • geokit.Location, or geokit.LocationSet are best

    • A single tuple with (lon, lat) is acceptable, or a list of such tuples

    • A single point geometry (as long as it has an SRS), or a list of geometries

  • interpolation (str, optional) –

    The interpolation method to use

    • ‘near’ => For each location, extract the time series from the source’s closest lat/lon index

    • ‘bilinear’ => For each location, use the time series of the source’s surrounding +/- 1 index locations to create an estimated time series at the given location using a bilinear interpolation scheme

    • ‘cubic’ => For each location, use the time series of the source’s surrounding +/- 2 index locations to create an estimated time series at the given location using a cubic scheme

  • force_as_data_frame (bool, optional) – If True, instructs the returned value to always take the form of a Pandas DataFrame regardless of how many locations are specified

  • outside_okay (bool, optional) –

    Determines if points which are outside the source’s lat/lon grid are allowed

    • If True, points outside this space will return as None

    • If False, an error is raised

Returns:

  • If a single location is given (pandas.Series) –

    • Indexes match to the source’s time dimension

  • If multiple locations are given (or if `force_as_data_frame` is True) (pandas.DataFrame) –

    • Indexes match to the source’s time dimension

    • Columns match to the given order of locations
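To make the ‘bilinear’ option more concrete, the sketch below interpolates a single time step of gridded data at a fractional grid index. It illustrates the general technique only and is not the library's code: the function name `bilinear`, the list-of-rows grid and the fractional index inputs are assumptions, and in practice the same weights would be applied to every time step.

```python
def bilinear(grid, y, x):
    """Bilinearly interpolate `grid` (a list of rows) at fractional index (y, x).

    Requires 0 <= y < len(grid) - 1 and 0 <= x < len(grid[0]) - 1.
    """
    y0, x0 = int(y), int(x)      # lower surrounding indexes
    y1, x1 = y0 + 1, x0 + 1      # upper surrounding indexes
    fy, fx = y - y0, x - x0      # fractional distance from the lower indexes

    # Interpolate along x on both rows, then along y between the rows
    row0 = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    row1 = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return row0 * (1 - fy) + row1 * fy


grid = [[0.0, 10.0],
        [20.0, 30.0]]
value = bilinear(grid, 0.5, 0.5)
print(value)  # 15.0, the mean of the four surrounding cells
```

Compared with ‘near’, this smooths the step changes that occur when a location crosses from one grid cell into the next, at the cost of reading four cells per location.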