Optimizing compilers that target Single Instruction Multiple Data (SIMD) architectures rely on sophisticated program analyses and transformations, in particular auto-vectorization, designed to automatically identify and exploit data-level parallelism. Emerging SIMD architectures increase in complexity and vary in terms of instruction sets, register sizes and types, alignment requirements, etc. To deliver the expected performance improvements, compiler writers often need to modify passes, heuristics, and cost models. This process is challenging even for the few experts who possess the required range of skills. Errors introduced in this process affect the entire software stack and are likely to compromise its reliability and security.
Ensuring the correctness of these compiler optimizations is hard due to implicit interactions between optimization passes and the abstruse details of SIMD instruction semantics and costs. This results in missed optimization opportunities and subtle bugs, such as miscompiled code, which might remain undiscovered for a long time and manifest in obscure ways across the abstraction layers of a software stack.
Our aim is to enable software to take full advantage of the SIMD capabilities of existing and new microprocessor designs, without modifying the compiler. Towards this end, we integrate *unbounded superoptimization* with *auto-vectorization*. This approach provides a way to reduce the engineering effort needed to tune a production compiler for new SIMD architectures. Moreover, it provides a way to improve the reliability of compilers without compromising the performance of generated code. We believe this approach will lead to fundamental advances in SMT solvers as well as in industrial-strength optimizing compilers targeting SIMD architectures.