extracting
Module containing the functions that load the downloaded data from the ENTSO-E databases.
- ecodynelec.preprocessing.extracting.create_per_country(path_dir: dict, case: str, start=None, end=None, ctry: list | None = None, savedir: str | None = None, savedir_resolution: str | None = None, n_hours: int = 2, days_around: int = 7, limit: float = 0.4, correct_data: bool = True, is_verbose=False, progress_bar: ProgressInfo | None = None)
Extracts all the data for every country.
- Parameters:
path_dir (str) – path to the local directory containing the ENTSO-E data files
case (str) – ‘generation’ or ‘import’ to select the type of data to expect.
start – starting date, as str or datetime
end – ending date, as str or datetime
ctry (list, default to None) – list of countries to involve in the computation
savedir (str, default to None) – directory to save the processed data.
savedir_resolution (str, default to None) – directory in which to save information about the frequency of each extracted time series
n_hours (int, default to 2) – maximum number of consecutive hours of missing data for a gap to count as a short gap. Short gaps are filled by linear interpolation.
days_around (int, default to 7) – number of days before and after a long gap used to build an average day from which the missing values are inferred.
limit (float, default to 0.4) – maximum size of a gap, relative to the whole series, that is auto-completed. Larger gaps are filled with zeros.
correct_data (bool, default to True) – to auto-complete the data
is_verbose (bool, default to False) – to display information
- Returns:
Transformed data for each country.
- Return type:
dict
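The short-gap rule described by n_hours can be illustrated with a minimal sketch. This is not the ecodynelec implementation: the function name fill_short_gaps, the steps_per_hour argument and the use of None for missing values are assumptions made for illustration only.

```python
def fill_short_gaps(values, n_hours=2, steps_per_hour=1):
    """Linearly interpolate runs of missing values (None) spanning at most n_hours.

    Hypothetical helper, not part of ecodynelec.
    """
    filled = list(values)
    max_len = n_hours * steps_per_hour  # longest run still treated as a short gap
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        # Locate the end of the current run of missing values.
        j = i
        while j < n and filled[j] is None:
            j += 1
        run_len = j - i
        # Interpolate only short runs that have a known value on both sides.
        if run_len <= max_len and i > 0 and j < n:
            left, right = filled[i - 1], filled[j]
            for k in range(run_len):
                filled[i + k] = left + (right - left) * (k + 1) / (run_len + 1)
        i = j
    return filled


series = [1.0, None, None, 4.0, None, None, None, 8.0]
print(fill_short_gaps(series, n_hours=2))
```

In this sketch the first gap (two steps) is interpolated, while the second gap (three steps) exceeds n_hours and is left untouched for a long-gap strategy to handle.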
- ecodynelec.preprocessing.extracting.extract(ctry: list, start=None, end=None, dir_gen=None, dir_imp=None, correct_gen: bool = True, correct_imp: bool = True, savedir_gen: str | None = None, savedir_imp: str | None = None, save_resolution: str | None = None, n_hours: int = 2, days_around: int = 7, limit: float = 0.4, is_verbose=False, progress_bar: ProgressInfo | None = None)
Extracts all the data at once. Master function of the module.
- Parameters:
ctry (list) – list of countries to involve in the computation
start – starting date, as str or datetime
end – ending date, as str or datetime
dir_gen (str, default is None) – path to local directory with ENTSO-E generation data files
dir_imp (str, default is None) – path to local directory with ENTSO-E exchange data files
correct_gen (bool, default to True) – to auto-complete the generation data
correct_imp (bool, default to True) – to auto-complete the exchange data
savedir_gen (str, default to None) – directory to save the processed generation data.
savedir_imp (str, default to None) – directory to save the processed exchange data.
save_resolution (str, default to None) – directory in which to save information about the frequency of each extracted time series
n_hours (int, default to 2) – maximum number of consecutive hours of missing data for a gap to count as a short gap. Short gaps are filled by linear interpolation.
days_around (int, default to 7) – number of days before and after a long gap used to build an average day from which the missing values are inferred.
limit (float, default to 0.4) – maximum size of a gap, relative to the whole series, that is auto-completed. Larger gaps are filled with zeros.
is_verbose (bool, default to False) – to display information
- Returns:
dict – Generation data for each country, if dir_gen is not None
dict – Import data for each country, if dir_imp is not None
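The long-gap behaviour controlled by days_around and limit might look roughly like the sketch below. It is an illustration under stated assumptions, not the library's code: fill_long_gap, steps_per_day and the None marker for missing values are hypothetical, and the series is assumed to start at midnight so that an index modulo steps_per_day gives the position within the day.

```python
def fill_long_gap(values, start, stop, steps_per_day=24, days_around=7, limit=0.4):
    """Fill values[start:stop] from an average day built on the surrounding days.

    Hypothetical helper, not part of ecodynelec.
    """
    n = len(values)
    filled = list(values)
    # A gap larger than `limit` (as a fraction of the series) is filled with zeros.
    if (stop - start) / n > limit:
        for i in range(start, stop):
            filled[i] = 0.0
        return filled
    # Look at `days_around` days before and after the gap.
    window = days_around * steps_per_day
    lo = max(0, start - window)
    hi = min(n, stop + window)
    # Average the known values separately for each position within the day.
    sums = [0.0] * steps_per_day
    counts = [0] * steps_per_day
    for i in range(lo, hi):
        if values[i] is not None:
            sums[i % steps_per_day] += values[i]
            counts[i % steps_per_day] += 1
    for i in range(start, stop):
        slot = i % steps_per_day
        if counts[slot]:
            filled[i] = sums[slot] / counts[slot]
    return filled


# Toy series with 4 steps per day and one missing day in the middle.
toy = [1.0, 2.0, 3.0, 4.0, None, None, None, None, 3.0, 4.0, 5.0, 6.0]
print(fill_long_gap(toy, start=4, stop=8, steps_per_day=4, days_around=1))
```

Averaging per position in the day keeps the daily profile of the neighbouring days, which a plain linear interpolation across a long gap would flatten.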
- ecodynelec.preprocessing.extracting.get_origin_unit(df, origin)
Gets the ordered list of sources (origin countries or production units).
- ecodynelec.preprocessing.extracting.get_parameters(case)
Function used to define parameters for later code
- ecodynelec.preprocessing.extracting.get_time_line(unique_dates)
Gets the time line and corrects it if needed