Mechanising the Metatheory of SQL with Nulls

SQL is the standard query language used by relational databases, a multi-billion dollar industry. SQL’s semantics is notoriously subtle: the standard (ISO/IEC 9075:2016) uses natural language that implementations interpret differently.

Commercial databases optimise queries by applying rewriting rules to convert a request into an equivalent one that can be executed more efficiently. The lack of a well-understood formal semantics for SQL makes it very difficult to validate the soundness of candidate rewriting rules, and even widely used database systems have been known to return incorrect results.

Today’s databases are thus difficult to trust. Our proposal will build on exciting recent developments in (mechanised) formalisation of SQL semantics, to explore the feasibility of verified, trustworthy database systems: databases in which every important component, from query input to result, employs algorithms whose correctness is mechanically verified. This has (until recently) been unthinkable because of the absence of a plausible semantics for SQL with nulls and the difficulty of reasoning about query behaviour modulo SQL’s multiset semantics.

The proposed work would result in the first ever mechanised proofs of correctness for SQL’s real semantics (including multisets and nulls).

Researchers: Wilmer Ricciotti and James Cheney,  Laboratory for Foundations of Computer Science (LFCS), University of Edinburgh