Informatics Seminar Series
Fall Quarter 2021

Friday, November 5, 2021

“The Naturalness and Artifice of Code: Exploiting the Bimodality”

Prem Devanbu
Professor, Department of Computer Science
UC Davis

While natural languages are rich in vocabulary and grammatical flexibility, most human utterances are mundane and repetitive. This repetitiveness in natural language has led to great advances in statistical NLP methods.

In our lab, we discovered (a decade ago) that, despite the considerable power and flexibility of programming languages, large software corpora are actually even more repetitive than NL corpora. We went on to show that this “naturalness” of code could be captured in statistical models and exploited within software tools. This line of work has been turbo-charged by the tremendous capacity and design flexibility of deep learning models. Numerous other creative and interesting applications of naturalness have ensued from colleagues around the world, and several industrial applications have emerged. Recently, we have been studying the consequences and opportunities arising from the observation that software is bimodal: it is written not only to be run on machines, but also to be read by humans. This makes software amenable to both algorithmic analysis and statistical prediction. Bimodality allows new ways of training machine learning models, new ways of designing analysis algorithms, and new ways to understand the practice of programming. In this talk, I will begin with a backgrounder on “naturalness” studies and the promise of bimodality.

Prem Devanbu earned his B.Tech from IIT Madras and a Ph.D. from Rutgers University. After many years developing software for Bell Laboratories and its offshoots in New Jersey, he joined UC Davis, where he conducts teaching and research in software engineering. He has won several awards for his work, including several best paper awards, distinguished paper awards, most influential paper awards, and test-of-time awards. Three of his papers were invited to appear in CACM Research Highlights. He served as PC Chair of ESEC/FSE 2006 and ICSE 2010, and also as GC of MSR 2014 and ESEC/FSE 2020. He has served on the editorial boards of ACM TOSEM, IEEE ToSE, the JSME, and the EMSE Journal; he currently serves on the CACM Editorial Board. He has been an ACM Fellow since 2018, and won the ACM SIGSOFT Outstanding Research Award in 2021. He even has his own web page.
