Committee

  • David Walker (advisor), Zak Kincaid, Brian Kernighan

Title

Synthesizing Bijective String Transformers

Abstract

Bidirectional transformers are groups of functions that convert between data formats. The most common are parsers and pretty printers: a parser converts data from a string representation into a structured representation, and a pretty printer converts it back into a string. Other bidirectional transformers convert between data models and GUIs, between database views and the underlying tables, and between different string formats. Bidirectional transformers are usually expected to provide inversion guarantees: converting from one representation to another and back should not alter the content of the data, apart from unimportant details like whitespace. In general-purpose programming languages, this requires writing multiple functions and reasoning manually about their invertibility. Researchers have designed domain-specific languages, such as Boomerang and biXid, that express a bidirectional transformation as a single term while providing invertibility guarantees. However, coding in these languages requires learning a new paradigm, which has hindered their adoption. We aim to make one of these languages, Boomerang, more accessible by synthesizing programs in its bijective fragment. Boomerang is a bidirectional language for converting between string representations. The type of a Boomerang program is a pair of regular expressions that specify the formats of the source and target data. Our synthesizer takes two regular expressions and a set of examples as input, and outputs a Boomerang program that is typed by the input regular expressions and satisfies the examples. It uses a method called "type-directed synthesis", in which the types guide an efficient search through the space of possible programs. Existing work on type-directed synthesis operates on type systems with relatively few isomorphisms between types. However, each regular expression is equivalent to infinitely many other regular expressions. Synthesis thus requires searching through the equivalent regular expression types as well as through the possible terms of those types. Furthermore, the types for complicated data formats are much larger and more complex than the types used in existing work on type-directed synthesis, causing a combinatorial explosion when searching through terms. We resolve these issues by converting the types to a different language with fewer equivalences and by treating user-defined types semi-opaquely. We evaluate our procedure on 25 examples taken from Augeas (a program for encoding bidirectional transformations of Linux configuration files), from other papers, and from microbenchmarks built to highlight the strengths and weaknesses of the synthesizer. Our algorithm synthesizes all of these programs in less than half a second on average.
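
As a rough illustration of the round-trip guarantee and of the synthesizer's inputs described above, the following Python sketch models a bijective transformation as a pair of functions typed by two regular expressions, and checks it against input/output examples. It is not Boomerang code and does not implement the synthesis algorithm; the class `BijectiveLens`, its field names, and the name-format example are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class BijectiveLens:
    """Hypothetical model: two inverse functions typed by a pair of regexes."""
    source_type: str                    # regex describing the source format
    target_type: str                    # regex describing the target format
    to_target: Callable[[str], str]     # source string -> target string
    to_source: Callable[[str], str]     # target string -> source string

    def satisfies(self, examples: Iterable[Tuple[str, str]]) -> bool:
        """Check typing and both directions on each (source, target) example."""
        for src, tgt in examples:
            if re.fullmatch(self.source_type, src) is None:
                return False
            if re.fullmatch(self.target_type, tgt) is None:
                return False
            if self.to_target(src) != tgt or self.to_source(tgt) != src:
                return False
        return True


# Assumed toy formats: "Last, First" on one side, "First Last" on the other.
NAME = r"[A-Z][a-z]*"
SOURCE = rf"{NAME}, {NAME}"
TARGET = rf"{NAME} {NAME}"


def last_first_to_first_last(s: str) -> str:
    last, first = s.split(", ")
    return f"{first} {last}"


def first_last_to_last_first(s: str) -> str:
    first, last = s.split(" ")
    return f"{last}, {first}"


name_lens = BijectiveLens(SOURCE, TARGET,
                          last_first_to_first_last,
                          first_last_to_last_first)

examples = [("Walker, David", "David Walker")]
print(name_lens.satisfies(examples))
# Round trip in both directions leaves the data unchanged.
print(name_lens.to_source(name_lens.to_target("Kincaid, Zak")))
```

In this sketch the two directions are written by hand and checked after the fact. The work described in the abstract goes the other way: given the two regular expressions and the examples, it searches for a single term from which both directions follow.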

Reading List

Dissertations

  • Bidirectional Programming Languages. Nate Foster. 2010. link (Lenses)

Papers / Chapters

  • Combinatorial Sketching for Finite Programs. Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Vijay Saraswat, Sanjit Seshia. 2006. link (Sketching, Synthesis)
  • Stochastic Superoptimization. Eric Schkufza, Rahul Sharma, Alex Aiken. 2013. link (Stoke, Stochastic Synthesis)
  • Dual Syntax for XML Languages. Claus Brabrand, Anders Møller, Michael I. Schwartzbach. 2005. link (Bidirectional Programming)
  • Application of Theorem Proving to Problem Solving. Cordell Green. 1969. link (Classic Paper, Synthesis)
  • Ch 4 (Focusing) of Frank Pfenning's ATP notes. Frank Pfenning. link (Automated Theorem Proving / Classic)
  • Regular Combinators for String Transformations. Rajeev Alur, Adam Freilich, Mukund Raghothaman. 2014. link (String Transformations)
  • A Methodology for LISP Program Construction from Examples. Phillip D. Summers. 1977. link (Classic Paper, Synthesis)
  • Type-directed Completion of Partial Expressions. Daniel Perelman, Sumit Gulwani, Thomas Ball, Dan Grossman. 2012. link (Synthesis, IDEs)
  • Learning Semantic String Transformations from Examples. Rishabh Singh, Sumit Gulwani. 2012. link (String Transformations Synthesis)
  • Synthesis through Unification. Rajeev Alur, Pavol Cerny, Arjun Radhakrishna. 2015. link (Synthesis)

Textbooks

  • Types and Programming Languages. Benjamin Pierce. 2002. ISBN 0-262-16209-1. (General PL)

Slides

They can be found here in PDF format.