Type-driven data-science infrastructure for Idris2

PI: Ohad Kammar, University of Edinburgh
Researcher: Robert Wright

Data-science pipelines drive modern artificial intelligence technology and innovation. They take as input raw, unprocessed data, and pump it through layer after layer of intricate processing stages, to be emitted as data visualisations, analytics reports, or into decision-making processes. As we utilise data pipelines for more and more aspects of our life, it becomes more important to ensure that they are robust and stable.

Static typing is one of the most successful methods for developing robust software, since the compiler reports errors even before the program runs. Unfortunately, data analysis pipelines involve intricate dependencies and relationships, much too subtle for most common programming language type-systems to express.

The goal of this project is to develop infrastructure needed to explore how degree type-driven development can help with the construction of expressive and robust data analysis pipelines in Idris 2. Idris 2 is a new dependently-typed programming language that supports type-driven development and also multiple backends, including C, JavaScript, Haskell, OCaml, and Scheme.