Downloading ENTSO-E data
Data from ENTSO-E is at the heart of ecodynelec. ecodynelec can
download the required data directly from the ENTSO-E servers, and it
can also work with data that was already downloaded manually. This
tutorial details both ways of obtaining the ENTSO-E data, i.e. with
and without ecodynelec.
In either case, an account must be created on the ENTSO-E website to use the SFTP service and access the data.
Manual downloading
Manual retrieval of the data requires an SFTP client. We use
FileZilla to illustrate the procedure. Provide the software with:
* Host: sftp://sftp-transparency.entsoe.eu
* Port: 22
* Personal credentials
Navigate through the remote architecture and download the files of
interest. The generation files are located in
/TP_export/AggregatedGenerationPerType_16.1.B_C/. The exchanges
files are located in /TP_export/PhysicalFlows_12.1.G/. Figure 1
gives details on how to proceed with FileZilla.
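For readers who prefer a script to a graphical client, the same manual retrieval can be sketched in Python. This is only an illustration, assuming the third-party paramiko library is installed; it is not how ecodynelec itself performs the download. The host, port and remote directories are the ones given above, while the helper names (local_target, fetch_files) and the keep filter are hypothetical.

```python
import os
import posixpath

HOST = "sftp-transparency.entsoe.eu"  # host given in this tutorial
PORT = 22                             # port given in this tutorial
REMOTE_DIRS = {
    "generation": "/TP_export/AggregatedGenerationPerType_16.1.B_C/",
    "exchanges": "/TP_export/PhysicalFlows_12.1.G/",
}


def local_target(remote_path, local_dir):
    """Return the local path under which a remote file would be saved."""
    return os.path.join(local_dir, posixpath.basename(remote_path))


def fetch_files(kind, local_dir, username, password, keep=lambda name: True):
    """Download the files of one remote directory accepted by `keep`.

    Hypothetical helper: `keep` is a user-supplied filter on file names,
    useful because the remote files are large.
    """
    import paramiko  # third-party; imported here so the helpers above work without it

    os.makedirs(local_dir, exist_ok=True)
    transport = paramiko.Transport((HOST, PORT))
    try:
        transport.connect(username=username, password=password)
        sftp = paramiko.SFTPClient.from_transport(transport)
        remote_dir = REMOTE_DIRS[kind]
        for name in sftp.listdir(remote_dir):
            if keep(name):
                remote_path = posixpath.join(remote_dir, name)
                sftp.get(remote_path, local_target(remote_path, local_dir))
    finally:
        transport.close()  # also closes the SFTP channel
```

A call would look like fetch_files("exchanges", "./downloads/exchanges/", username, password), with the password ideally read through getpass.getpass() rather than written in the script.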

Downloading via ecodynelec
The data can also be downloaded via ecodynelec. First, the
configuration must be adapted, either using a spreadsheet or using
Python. Then the download can be triggered, either in a standalone
fashion or as part of the whole computation pipeline. The global
tutorials on how to use ecodynelec fully in Python and with spreadsheet
configuration give more details on the latter.
Configuration via spreadsheet
The spreadsheet used in this tutorial can be downloaded from the examples in the GitHub repository.

Each field name must be written exactly as presented, in lowercase. The fields are:
* host: the address of the SFTP server. By default, we use “sftp-transparency.entsoe.eu”.
* port: the port used to connect to the server. By default, the port is 22.
* username: your username, as created for free on the ENTSO-E website. It should be an email address. If the field is left blank in the spreadsheet, the credential will be requested when the download is launched.
* password: your password, as created for free on the ENTSO-E website. For security reasons, we recommend leaving the field blank, which lets the downloading package ask for the password in a more secure manner.
* use server: TRUE if you want to download the data. Blank or FALSE will not download the data (default).
* remove unused: TRUE if you want the target directories (where the files are downloaded) to be emptied before downloading. Blank or FALSE leaves other files in the target directory untouched (default).
The files will be downloaded and saved in the directories indicated in the fields path generation and path exchanges of the tab Filepath of the spreadsheet (cf. Figure 3). Also make sure you set the dates accordingly (tab Parameter), so that the files to download can be selected. More information on the various configuration possibilities is available in the input data section.

Configuration in Python
from ecodynelec.parameter import Parameter # Import the configuration management class
# Initialize the parameter class
my_config = Parameter()
Downloading is disabled by default, so the server connection must be configured explicitly. Here is how to change it.
Note that credentials can be specified directly in the configuration
object, but this is not necessary. In this example we leave them
empty (or None, which is strictly equivalent), and they will be
requested later when the download starts.
### Configure the server connection
my_config.server.useServer = True # Specifically ask to download data
my_config.server.host = "sftp-transparency.entsoe.eu" # This server is already set per default after initialization
my_config.server.port = 22 # This port is already set per default after initialization
### Credentials
my_config.server.username = None
my_config.server.password = None
Each field is accessible and modifiable with the syntax
my_config.server.field. The fields are:
* host: the address of the SFTP server. By default, we use “sftp-transparency.entsoe.eu”.
* port: the port used to connect to the server. By default, the port is 22.
* username: your username, as created for free on the ENTSO-E website. It should be an email address. If the field is left unspecified, the credential will be requested when the download is launched.
* password: your password, as created for free on the ENTSO-E website. For security reasons, we recommend not specifying it, which lets the downloading package ask for the password in a more secure manner.
* useServer: True if you want to download the data. False will not download the data (default).
* removeUnused: True if you want the target directories (where the files are downloaded) to be emptied before downloading. False leaves other files in the target directory untouched (default).
* _remoteGenerationDir: where to find the generation data on the ENTSO-E server. This field should be left at its default, i.e. not specified. It is included for flexibility, in case the server modifies its architecture.
* _remoteExchangesDir: where to find the exchanges data on the ENTSO-E server. This field should be left at its default, i.e. not specified. It is included for flexibility, in case the server modifies its architecture.
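To make the credential behaviour concrete, the sketch below shows one way a missing username or password can be requested interactively at download time. It is an illustration only and not ecodynelec's actual code: resolve_credentials and its ask_username / ask_password arguments are hypothetical names. It relies on the standard getpass module, which reads the password without echoing it to the terminal.

```python
import getpass


def resolve_credentials(username=None, password=None,
                        ask_username=input, ask_password=getpass.getpass):
    """Return (username, password), prompting only for the missing values."""
    if not username:  # None and "" are both treated as "not specified"
        username = ask_username("Username: ")
    if not password:
        password = ask_password("Password: ")  # typed characters are not echoed
    return username, password
```

Leaving the password out of the configuration means it never ends up stored in a script, a notebook or a spreadsheet that might be shared.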
The data will be downloaded to the locations that the my_config.path
section points to, so these fields must be specified. Note that these
are the same path settings that ecodynelec uses to find local files
containing data for the main computation.
# Indicate where to save generation data
my_config.path.generation = "./test_data/downloads/generations/"
# Indicate where to save exchange data
my_config.path.exchanges = "./test_data/downloads/exchanges/"
Finally, the start and end dates must be specified in the main
section of the configuration object, so that only the useful files are downloaded.
### Set the dates (to select files to download)
my_config.start = '2017-02-01 05:00'
my_config.end = '2017-02-01 13:00'
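The following sketch illustrates why the dates matter for file selection. It rests on an assumption that this tutorial does not confirm: that the server stores one file per calendar month, named "<YYYY>_<MM>_<suffix>.csv". Under that assumption, a date range maps to a short list of candidate files. The function names are hypothetical and this is not ecodynelec's selection code.

```python
from datetime import datetime


def months_covered(start, end, fmt="%Y-%m-%d %H:%M"):
    """List the (year, month) pairs touched by the [start, end] interval."""
    first = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    if last < first:
        raise ValueError("end must not be earlier than start")
    months = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        months.append((year, month))
        # Step to the next calendar month, rolling over the year in December
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


def candidate_filenames(start, end, suffix):
    """Build file names for the interval.

    ASSUMPTION: monthly files named "<YYYY>_<MM>_<suffix>.csv".
    """
    return [f"{y}_{m:02d}_{suffix}.csv" for y, m in months_covered(start, end)]
```

With the dates used above, months_covered('2017-02-01 05:00', '2017-02-01 13:00') yields a single month, so only one file per data type would need to be transferred.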
Standalone download
Once the configuration is set properly, the download can be triggered.
This section demonstrates the standalone download and showcases
additional parameters that are not otherwise accessible. For more
generic usage, however, the downloading feature has also been
integrated into the whole computation pipeline of ecodynelec.
from ecodynelec.preprocessing.downloading import download
Here all parameters are specified; however, only config is mandatory,
and every other parameter uses its default value if not specified.
Note that the configuration used here relies on
Spreadsheet_download.xlsx but, as in the whole ecodynelec pipeline,
the config= parameter can also be a Parameter object, such as the
my_config built in the section above.
download(config="./Spreadsheet_download.xlsx", threshold_minutes=15, threshold_size=0.9, is_verbose=True)
Connection...
Username: ledee.public@gmail.com
Password: ········
[Generation 1/1] Transferred: 45.9 MB Out of: 118.1
KeyboardInterrupt
The download can be a time-consuming process. Thus, in the previous cell,
the few extra parameters help decide whether or not to download a
specific file from the server. They are only useful when ecodynelec is
re-used regularly; occasional or one-time usage will not be affected
by these extra parameters.
* threshold_minutes: if the last modification of a file on the server occurred less than threshold_minutes after the last download of that file (and the downloaded file still exists on the user’s computer), the remote file is not downloaded. Default is 15 min. The server “modifies” files regularly, either with no changes in the data (simple server maintenance) or with some data modifications (as new information comes in). The parameter allows a file to be skipped if the version on the server is not considered new enough.
* threshold_size: if a file has only been partially downloaded, i.e. the size of the local file is less than threshold_size times the size of the server file, then the file is downloaded again. Default is 0.9 (90%), i.e. the local copy must be less than 90% of the size of the remote file to force the download. A download may fail halfway for various reasons (connection issue, manually stopping a process that takes too long, etc.). In such cases, threshold_minutes may prevent the download from resuming. For this reason, threshold_size was added to still force a download even if a file does not comply with the threshold_minutes rule.
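The interplay of the two thresholds can be summarised as a small decision function. This is a sketch of the rules as described above, not ecodynelec's implementation: the function name should_download and its arguments are hypothetical, and the handling of the exact boundary (a remote change exactly threshold_minutes after the last download triggers a download) is an assumption.

```python
from datetime import datetime, timedelta


def should_download(remote_mtime, remote_size, local_mtime=None, local_size=None,
                    threshold_minutes=15, threshold_size=0.9):
    """Decide whether a remote file should be (re-)downloaded."""
    if local_mtime is None or local_size is None:
        return True  # no local copy yet
    if local_size < threshold_size * remote_size:
        return True  # local copy looks truncated: force a new download
    # Otherwise download only if the server copy is new enough
    return remote_mtime - local_mtime >= timedelta(minutes=threshold_minutes)
```

For instance, an interrupted transfer that left 45.9 MB of a 118.1 MB file would be downloaded again, because 45.9 is below 0.9 × 118.1, even if the server copy changed only a few minutes after the interrupted attempt.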