csis-architecture

Data Package Export and Import Tool

The Data Package Export and Import Building Block is a tool that can be used at any stage of the adaptation planning process to export (download) any data that is directly available in the CSIS (e.g. stored in a Data Repository, see section 7.4) in a standardised format, the CLARITY Data Package.

Information produced by CLARITY Climate Services must be provided in a common Data Package that accompanies the text-based report (a document with text, graphs, maps, etc.) that is (semi-)automatically generated by the Report Generation Tool (6.2) at the end of the adaptation planning process. This Data Package should contain all data used to generate the contents of the report so that it can be (re)used in later stages. For example, a climate expert hired to carry out a further local assessment of the climate impact can import the information from the CLARITY Data Package and use it for their own analyses. The expert must then generate a similar package that can in turn be imported into the CLARITY CSIS.

Technically, a standardised Data Package can be realised as a “distributed data object”, so that not all data must reside in the same location (database, server). This also creates the need for “Smart Links” (Scenario Transferability, see section 5.2) that can combine, relate and describe different information entities (in this case, the distinct elements of a Data Package). Furthermore, a serialisation feature for Data Packages is needed that puts all contents of a package into a concrete (zip) file that can be shared, e.g. with other experts. The output of Climate Services must be delivered as such a standardised Data Package to ensure technical interoperability with the CSIS and thus with the Climate Services Ecosystem. Consequently, a Data Package can either reside on the CSIS as a Virtual Data Package (distributed among several physical data stores) if the provider of the Expert Climate Service uses the CLARITY CSIS to provide its service, or exist as a concrete file (Serialised Data Package) if the provider works offline.
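The difference between a Virtual and a Serialised Data Package can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `serialise_package`, the descriptor layout and the file name `datapackage.json` (borrowed from the Frictionless Data convention) are assumptions made for this example and are not prescribed by the CSIS. Resources whose path is a URL stay remote and are only referenced, while locally held resources are embedded in the zip file.

```python
import io
import json
import zipfile


def serialise_package(descriptor: dict, local_files: dict[str, bytes]) -> bytes:
    """Write a descriptor plus locally held resources into one zip archive.

    Resources whose path is a URL stay remote: only the link is kept in the
    descriptor, so the package remains a "distributed data object".
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("datapackage.json", json.dumps(descriptor, indent=2))
        for resource in descriptor["resources"]:
            path = resource["path"]
            if path.startswith(("http://", "https://")):
                continue  # remote resource: referenced, not embedded
            archive.writestr(path, local_files[path])
    return buffer.getvalue()


# Hypothetical package with one embedded and one remote resource.
descriptor = {
    "name": "example-study",
    "resources": [
        {"name": "indicators", "path": "data/indicators.csv"},
        {"name": "hazard-map", "path": "https://example.org/hazard.tif"},
    ],
}
archive_bytes = serialise_package(
    descriptor, {"data/indicators.csv": b"indicator,value\nheat_days,12\n"}
)

# Inspect the archive: the remote resource is not among the members.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    member_names = sorted(archive.namelist())
```

In this sketch, an offline expert would receive `archive_bytes` as a file, while the remote hazard map is still resolved through its link when needed.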

Requested functionality

Baseline requirements elicitation and the assessment of presently available Test Cases have yielded the following functional requirements for this Building Block:

Baseline functionality

Functionality requested by CSIS Test Cases

Exploitation Requirements assessment

The assessment of the Exploitation Requirements [11] identified the following concrete technical and functional implications for this Building Block:

Technology support

Figure 18 gives an overview of the technological options and the related open-source frontend and backend software components that have been selected for the Technology Support Plan.

Figure 18: Data Package Export and Import Tool Technology Support

Regarding the frontend part of the Data Package Export and Import Tool, the same options as for the Data Dashboard Technology Support Plan (4.1.3) apply. The user interface could be similar to the Google Takeout Tool (https://takeout.google.com/settings/takeout) that is shown in Figure 19.

Figure 19: Google Takeout Data Export Tool (Example)

Similar to the Takeout Tool, which collects information from different Google Services, the Data Package Tool must collect and possibly convert the user’s data stored in the Integration RDBMS (7.3) and the Data Repository (7.4). This task corresponds to the serialisation feature mentioned previously and is accomplished by a server-side script that is invoked by the Data Package frontend via a RESTful API. The interoperability standards for serialising the CLARITY Data Package are based on OGC’s GeoPackage and Frictionless Data’s Data Package standards.
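How a server-side export script might be exposed to the frontend can be sketched with Python's standard `wsgiref` module. Everything specific here is an assumption made for illustration: the route `/data-package/export`, the function names and the stand-in `collect_user_data` (which would query the Integration RDBMS and the Data Repository in a real deployment) are hypothetical and do not describe the actual CSIS API.

```python
import io
import json
import zipfile
from wsgiref.util import setup_testing_defaults

EXPORT_PATH = "/data-package/export"  # hypothetical route


def collect_user_data() -> dict:
    """Stand-in for queries against the Integration RDBMS and Data Repository."""
    return {"name": "example-study", "resources": []}


def export_app(environ, start_response):
    """WSGI application: GET EXPORT_PATH returns a zipped Data Package."""
    if environ["REQUEST_METHOD"] != "GET" or environ["PATH_INFO"] != EXPORT_PATH:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("datapackage.json", json.dumps(collect_user_data()))
    body = buffer.getvalue()
    start_response("200 OK", [
        ("Content-Type", "application/zip"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Exercise the endpoint in-process, without opening a network socket.
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


environ = {}
setup_testing_defaults(environ)  # fills in a minimal GET request
environ["PATH_INFO"] = EXPORT_PATH
response = export_app(environ, start_response)
```

For a quick local trial, the same application could be served with `wsgiref.simple_server.make_server("", 8000, export_app).serve_forever()`. A production setup would more likely use a full web framework and authentication.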

As described in OGC Network™ (http://www.ogcnetwork.net/), a GeoPackage [12] is a universal open file format for geo-data provided by the Open Geospatial Consortium (OGC).

http://www.geopackage.org/spec/

It is standards-based, application- and platform-independent, and self-describing, which increases the cross-platform interoperability of geospatial applications and web services. It is designed to facilitate widespread adoption and use of a single, simple file format by open-source software applications. Since it is built on top of SQLite, it can be queried with standard SQL, offering the performance of a spatial database along with the convenience of a single file-based data set that can be easily shared.

https://www.sqlite.org/about.html
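Because a GeoPackage is an SQLite database, its table of contents can be read with plain SQL. The following Python sketch uses the standard `sqlite3` module to query `gpkg_contents`, the table in which a GeoPackage lists its data sets. To stay self-contained, it builds a small in-memory stand-in with only the two columns the query needs. This stand-in is not a valid GeoPackage, and the layer names are invented for the example.

```python
import sqlite3


def list_contents(connection: sqlite3.Connection) -> list[tuple[str, str]]:
    """Return (table_name, data_type) for every entry in gpkg_contents."""
    rows = connection.execute(
        "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
    )
    return rows.fetchall()


# Stand-in database: only the two gpkg_contents columns the query needs.
# A real file would be opened with sqlite3.connect("some_file.gpkg").
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT)")
connection.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?)",
    [("flood_zones", "features"), ("elevation", "tiles")],
)
contents = list_contents(connection)
```

Plain SQL is sufficient for this kind of inspection. Geometries, however, are stored as binary blobs, so reading them is usually left to libraries such as GDAL/OGR.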

An OGC GeoPackage is able to store:

A GeoPackage can be useful for managing vector data, since it avoids the limitations of ESRI Shapefiles, and it can be manipulated with the OpenGIS Simple Features Reference Implementation (OGR) and the Geospatial Data Abstraction Library (GDAL). Its major downside is that the underlying SQLite database is a complex binary format that is not suitable for streaming. It must either be written to the local file system or accessed through an intermediary service such as GeoServer, one of the Data Repository technologies (see 7.4.3) selected for CLARITY.

A complementary approach is to use Data Packages as described by Frictionless Data, a simple container format for arbitrary data files.

https://frictionlessdata.io/data-packages/

A Data Package can include metadata and point to local or remote files, including OGC GeoPackage files as well as raster images, NetCDF files and Indicator Data resulting from (impact and adaptation) scenario analysis (see 4.1.3).

It currently offers two complexity levels:

Data Package specification, a simple format for packaging data for sharing between tools and people.

Tabular Data Package, a format for packaging tabular data that builds on Data Package and additionally uses:

This simple format can be extended by additional CLARITY JSON metadata files that are recognised by the different Building Blocks and ICT Climate Services developed by CLARITY.
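A Frictionless Data Package is described by a JSON descriptor whose `resources` entries each carry a `name` and a `path`, where the path is either relative to the package or a URL. The Python sketch below builds such a descriptor for a package that mixes a local GeoPackage, a remote NetCDF file, a tabular indicator file and an additional metadata file. All resource names, paths and the file name `clarity.json` are hypothetical and only illustrate how a CLARITY-specific metadata file could be attached. They do not reflect an agreed CLARITY schema.

```python
import json

descriptor = {
    "name": "example-adaptation-study",
    "title": "Example adaptation study",
    "resources": [
        # Local GeoPackage with vector layers.
        {"name": "exposure-layers", "path": "data/exposure.gpkg", "format": "gpkg"},
        # Remote NetCDF file, referenced by URL instead of being embedded.
        {
            "name": "temperature-projection",
            "path": "https://example.org/data/tas_2050.nc",
            "format": "nc",
        },
        # Tabular indicator data with a simple schema describing its columns.
        {
            "name": "indicators",
            "path": "data/indicators.csv",
            "format": "csv",
            "schema": {
                "fields": [
                    {"name": "indicator", "type": "string"},
                    {"name": "value", "type": "number"},
                ]
            },
        },
        # Hypothetical CLARITY metadata file; name and content are illustrative.
        {"name": "clarity-metadata", "path": "clarity.json", "format": "json"},
    ],
}

descriptor_json = json.dumps(descriptor, indent=2)

# Resources that live outside the package and must be fetched on demand.
remote_resources = [
    resource["name"]
    for resource in descriptor["resources"]
    if resource["path"].startswith(("http://", "https://"))
]
```

Tools that do not know the CLARITY metadata file can still read the package as an ordinary Data Package, since the extra file is just one more resource.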