Patrick Valduriez

INRIA
Campus Saint-Priest - Bâtiment 5
860 rue de St Priest
34392 Montpellier Cedex 5
France

Firstname.Lastname@inria.fr
Tel : +33 4 67 14 97 26
Fax : +33 4 67 41 85 00

Data integration

  • CloudMdsQL Polystore (2015-2018). Transforms queries expressed in a common SQL-like language into an optimized execution plan that a query engine runs over multiple cloud data stores (SQL, NoSQL, HDFS, etc.). The compiler/optimizer is implemented in C++ and uses the Boost.Spirit framework for parsing context-free grammars. CloudMdsQL has been validated on relational, document and graph data stores in the context of the CoherentPaaS European project.
  • WebSmatch - Web Schema Matching (2011-2014). A flexible, open environment for discovering and matching complex schemas from many heterogeneous data sources over the Web. It provides three basic functions: (1) metadata extraction from data sources; (2) schema matching; and (3) schema clustering. It is delivered through Web services, with RIA clients, to be used directly by data integrators or by other tools. Implemented in Java and delivered as Open Source Software (under LGPL), it has been used by Data Publica and CIRAD.

Scientific workflow management

  • DfAnalyzer (2017 -). A tool for monitoring, debugging, steering, and analysis of dataflows generated by scientific applications. It captures strategic domain data and registers provenance and execution data to enable queries at runtime. It provides lightweight dataflow monitoring components to be invoked by HPC applications. It can be plugged into scripts or Spark applications, in the same way users already plug in visualization library components.
  • SciFloware (2013 -). A middleware for the distributed and parallel execution of scientific workflows. SciFloware provides a development environment and a runtime environment for scientific workflows, interoperable with existing systems. We validate SciFloware with workflows for analyzing biological data provided by our partners CIRAD, INRA and IRD.

Distributed data management

  • SAVIME - Simulation And Visualization IN-Memory (2017 -). A multi-dimensional array DBMS for scientific applications. SAVIME is based on a novel data model called TARS (Typed ARray Schema), which supports typed arrays. In TARS, application-dependent data characteristics, such as data visualization and UQ computation, are supported through the definition of TAR objects, ready to be manipulated by TAR operators. This approach provides much flexibility for capturing internal data layouts through mapping functions, which makes data ingestion independent of how the simulation data has been produced.
  • Triton End-to-end Graph Mapper (2017 -). A server for managing graph data and applications for mobile social networks. The server is built on top of the OrientDB graph DBMS and a distributed middleware. It provides an End-to-end Graph Mapper (EGM) for modeling the application as (i) a set of graphs representing the business data, the in-memory data structure maintained by the application and the user interface (tree of graphical components), and (ii) a set of standardized mapping operators that map these graphs to each other.
