extracting

Module containing the functions that load the data downloaded from the ENTSO-E databases

ecodynelec.preprocessing.extracting.create_per_country(path_dir: dict, case: str, start=None, end=None, ctry: list | None = None, savedir: str | None = None, savedir_resolution: str | None = None, n_hours: int = 2, days_around: int = 7, limit: float = 0.4, correct_data: bool = True, is_verbose=False, progress_bar: ProgressInfo | None = None)[source]

Extracts all the data for every country.

Parameters:

path_dir (str) – path to local directory with ENTSO-E data files
case (str) – ‘generation’ or ‘import’ to select the type of data to expect.
start – starting date, as str or datetime
end – ending date, as str or datetime
ctry (list, default to None) – list of countries to involve in the computation
savedir (str, default to None) – directory to save the processed data.
savedir_resolution (str, default to None) – directory to save information about the frequency of each extracted time series
n_hours (int, default to 2) – maximum number of consecutive hours of missing data for a gap to count as a short gap. Short gaps are filled by linear interpolation.
days_around (int, default to 7) – number of days before and after a long gap used to build an average day, from which the missing values are inferred.
limit (float, default to 0.4) – maximum size of a gap, relative to the whole series, that is auto-completed. Larger gaps are filled with zeros.
correct_data (bool, default to True) – whether to auto-complete the data
is_verbose (bool, default to False) – whether to display information

Returns:

Transformed data for each country.

Return type:

dict
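The gap-filling rules described by n_hours, days_around and limit can be illustrated with a minimal sketch. It is not the ecodynelec implementation: the series is assumed to be hourly and stored as a plain list with None for missing values, limit is assumed to be compared against each gap individually, and the helper names find_gaps, average_day_value and fill_gaps are hypothetical.

```python
from typing import List, Optional, Tuple

Series = List[Optional[float]]  # hourly values, None marks a missing hour


def find_gaps(values: Series) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each run of consecutive missing values."""
    gaps, i = [], 0
    while i < len(values):
        if values[i] is None:
            start = i
            while i < len(values) and values[i] is None:
                i += 1
            gaps.append((start, i - start))
        else:
            i += 1
    return gaps


def average_day_value(values: Series, i: int, days_around: int) -> Optional[float]:
    """Mean of the known values at the same hour of day, up to days_around days away."""
    same_hour = [
        values[j]
        for d in range(1, days_around + 1)
        for j in (i - 24 * d, i + 24 * d)
        if 0 <= j < len(values) and values[j] is not None
    ]
    return sum(same_hour) / len(same_hour) if same_hour else None


def fill_gaps(values: Series, n_hours: int = 2, days_around: int = 7,
              limit: float = 0.4) -> Series:
    """Illustrative gap filling (assumed behaviour, not the library's code)."""
    filled = list(values)
    for start, length in find_gaps(values):
        end = start + length  # index of the first known value after the gap
        if length / len(values) > limit:
            # Gap too large relative to the whole series: fill with zeros.
            filled[start:end] = [0.0] * length
        elif length <= n_hours and start > 0 and end < len(values):
            # Short gap with known neighbours: linear interpolation.
            left, right = values[start - 1], values[end]
            for k in range(length):
                filled[start + k] = left + (right - left) * (k + 1) / (length + 1)
        else:
            # Long gap (or short gap at an edge): use the average day around it.
            for i in range(start, end):
                filled[i] = average_day_value(values, i, days_around)
    return filled
```

In this sketch a missing hour stays None when no neighbouring day has a value at the same hour. How the library handles that case is not stated in this reference.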

ecodynelec.preprocessing.extracting.extract(ctry: list, start=None, end=None, dir_gen=None, dir_imp=None, correct_gen: bool = True, correct_imp: bool = True, savedir_gen: str | None = None, savedir_imp: str | None = None, save_resolution: str | None = None, n_hours: int = 2, days_around: int = 7, limit: float = 0.4, is_verbose=False, progress_bar: ProgressInfo | None = None)[source]

Extracts all the data at once. Master function of the module.

Parameters:

ctry (list) – list of countries to involve in the computation
start – starting date, as str or datetime
end – ending date, as str or datetime
dir_gen (str, default is None) – path to local directory with ENTSO-E generation data files
dir_imp (str, default is None) – path to local directory with ENTSO-E exchange data files
correct_gen (bool, default to True) – whether to auto-complete the generation data
correct_imp (bool, default to True) – whether to auto-complete the exchange data
savedir_gen (str, default to None) – directory to save the processed generation data.
savedir_imp (str, default to None) – directory to save the processed exchange data.
save_resolution (str, default to None) – directory to save information about frequency of each extracted time series
n_hours (int, default to 2) – maximum number of consecutive hours of missing data for a gap to count as a short gap. Short gaps are filled by linear interpolation.
days_around (int, default to 7) – number of days before and after a long gap used to build an average day, from which the missing values are inferred.
limit (float, default to 0.4) – maximum size of a gap, relative to the whole series, that is auto-completed. Larger gaps are filled with zeros.
is_verbose (bool, default to False) – whether to display information

Returns:

dict – Generation data for each country, if dir_gen is not None
dict – Import (exchange) data for each country, if dir_imp is not None
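A call to extract might be prepared as in the sketch below, which only uses the keyword names and defaults listed in the signature above. The country codes, dates and directory paths are hypothetical placeholders, and the helpers build_extract_kwargs and run_extract are illustrative names that are not part of ecodynelec. The return value is passed through untouched, since its exact shape when both directories are given is not spelled out here.

```python
from typing import Any, Dict, List, Optional


def build_extract_kwargs(ctry: List[str], start: str, end: str,
                         dir_gen: Optional[str] = None,
                         dir_imp: Optional[str] = None) -> Dict[str, Any]:
    """Collect keyword arguments for extract(), using the documented defaults."""
    return {
        "ctry": ctry, "start": start, "end": end,
        "dir_gen": dir_gen, "dir_imp": dir_imp,
        "correct_gen": True, "correct_imp": True,
        "n_hours": 2, "days_around": 7, "limit": 0.4,
        "is_verbose": False,
    }


def run_extract(kwargs: Dict[str, Any]):
    """Call extract() with the prepared arguments and return whatever it returns."""
    # Imported lazily so the sketch can be read without ecodynelec installed.
    from ecodynelec.preprocessing.extracting import extract
    return extract(**kwargs)


kwargs = build_extract_kwargs(
    ctry=["CH", "FR"],            # hypothetical country codes
    start="2017-02-01 00:00",     # hypothetical dates
    end="2017-02-02 00:00",
    dir_gen="./data/generation",  # hypothetical local paths
    dir_imp="./data/exchanges",
)
# result = run_extract(kwargs)  # requires ecodynelec and local ENTSO-E files
```

Leaving dir_gen or dir_imp as None skips the corresponding data set, in line with the conditional return values listed above.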

ecodynelec.preprocessing.extracting.get_origin_unit(df, origin)[source]: Gets ordered list of sources (origin countries or production units)

ecodynelec.preprocessing.extracting.get_parameters(case)[source]: Function used to define parameters for later code

ecodynelec.preprocessing.extracting.get_time_line(unique_dates)[source]: Gets the time line and corrects it if needed

ecodynelec.preprocessing.extracting.load_files(path_dir, start=None, end=None, destination=None, origin=None, data=None, area=None, case=None, is_verbose=False, progress_bar: ProgressInfo | None = None)[source]: Load the ENTSO-E data and concatenate the information

ecodynelec.preprocessing.extracting.load_single_files(file_path, column_types, area, useful, date_col=['DateTime'], area_level='CTY', status_col=None)[source]: Load the ENTSO-E data for a single file