News


LEXIS Distributed Data Infrastructure (DDI) gets operational

Author: atanas.pushkarov | Date: 23.11.2020

In the LEXIS project, we are developing an advanced and user-friendly computing environment that converges the Big Data Analytics, High Performance Computing and Cloud Computing capabilities of European supercomputing centres. Three layers – the infrastructure layer, the data management layer and the workflow orchestration layer – constitute the main building blocks of LEXIS. With the LEXIS platform and portal, we enable users in science, industry and society to automate and run their compute- and data-intensive workflows efficiently, and thus to accelerate their Research and Development.

The LEXIS Data System allows users to consistently manage the input, output and temporary data of their workflows. Its core is the Distributed Data Infrastructure (DDI), which federates the storage systems of the LEXIS infrastructure layer and can be conveniently addressed via REST APIs, i.e., interfaces based on web technology (Figure 1). The DDI is based on the “Integrated Rule-Oriented Data System” (iRODS) and on B2SAFE of the “EUDAT Collaborative Data Infrastructure” (EUDAT CDI). Technically, this means that LEXIS data can be accessed and managed in a uniform way, independently of where the data are physically located. From a collaboration perspective, the integration of LEXIS with EUDAT is one step towards unified European research data management following the FAIR principles (Wilkinson et al., 2016, https://doi.org/10.1038/sdata.2016.18). It also gives us a straightforward way to federate our data system with more European data centres and projects.

Figure 1: LEXIS Infrastructure highlighting the core components of data management layer, the DDI and API (source: LRZ)

In practice, all project members can interact with their data through the LEXIS Portal. They can use the data within their LEXIS workflows, and iRODS automatically manages cross-site data transfer wherever necessary. If a data transfer would otherwise take too long, the novel Burst Buffer systems in LEXIS can be used to prefetch remote data. Alternatively, immediate data availability and increased data security can be obtained by activating a convenient cross-site replication functionality implemented with iRODS/B2SAFE.

Recently, the first results from “Weather and Climate Large Scale Pilot” workflows exploiting the LEXIS Computing and Distributed Data Infrastructure were published (Parodi et al., 2020, https://doi.org/10.1007/978-3-030-50454-0_25). We are proud to present these results at CISIS 2020 and, with a poster highlighting the DDI, at SC 2020 (Figures 2 and 3).

Figure 2: Forest fire prediction and prevention workflow as executed by the orchestrator (source: CIMA, ATOS, LRZ)
Figure 3: Workflow results visualized with Dewetra platform (Italian Department of Civil Protection, CIMA)

Currently, we are working on benchmarking the DDI system, taking into account the different network speeds between LEXIS sites. The LEXIS Orchestration System will be aware of physical data locations and typical transfer speeds. Thus, it can select suitable storage and computing sites for executing a given workflow with the best performance. While mastering the challenges of integrating IT4I, LRZ and further infrastructure within LEXIS, we keep our focus on providing an optimised system with immediate benefits to the users. Besides convenient usability via the portal, high performance and speed gains are key to a strong uptake of the DDI and the LEXIS platform as a whole.
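To illustrate the idea of location-aware site selection, here is a minimal sketch in Python. It is not the LEXIS Orchestration System's actual logic: the `Site` class, the `pick_site` function, the runtimes and the link speeds are all hypothetical and chosen only for illustration. The sketch assumes that the cost of running at a site is the time to move the input data there plus an estimated compute time, and picks the site with the smallest total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    compute_hours: float  # hypothetical estimated workflow runtime at this site


def transfer_hours(size_gb: float, bandwidth_gbit_s: float) -> float:
    """Hours needed to move size_gb gigabytes over a link of bandwidth_gbit_s Gbit/s."""
    seconds = size_gb * 8 / bandwidth_gbit_s  # 8 bits per byte
    return seconds / 3600


def pick_site(sites, data_site, size_gb, link_speeds):
    """Return (site, total_hours) minimising transfer time plus compute time.

    link_speeds maps (source_name, destination_name) to a speed in Gbit/s.
    These values are assumptions, e.g. taken from earlier benchmarks.
    """
    def total(site):
        if site.name == data_site:
            move = 0.0  # data is already local, nothing to transfer
        else:
            move = transfer_hours(size_gb, link_speeds[(data_site, site.name)])
        return move + site.compute_hours

    best = min(sites, key=total)
    return best, total(best)


# Illustrative numbers only: 900 GB of input data stored at IT4I.
sites = [Site("IT4I", 2.0), Site("LRZ", 1.5)]
slow_link = {("IT4I", "LRZ"): 1.0}   # Gbit/s
fast_link = {("IT4I", "LRZ"): 10.0}  # Gbit/s

print(pick_site(sites, "IT4I", 900, slow_link)[0].name)
print(pick_site(sites, "IT4I", 900, fast_link)[0].name)
```

With the slow link, moving the data costs about two hours, so computing where the data already resides is the better choice. With the fast link, the transfer takes about twelve minutes and the site with the shorter compute time wins. A real orchestrator would have to weigh further factors, such as queue waiting times and existing replicas.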
