polozov@google.com

+1 (425) 623-4121

New York, NY

Welcome

I am Alex (Oleksandr) Polozov, a senior staff research scientist at Google DeepMind. I teach machines to write and analyze source code, build agents to assist developers, and am broadly interested in program synthesis and AI-assisted software engineering. You might want to check out my publications, notable writing, talks, or blog posts. I also spend an inordinate amount of time complaining about technology and reviewing movies on Twitter or BlueSky.

Previously, I was a principal researcher in the Deep Learning group at Microsoft Research, Redmond. There, I helped create PROSE, a framework for mass-market development of programming-by-example technologies, and shipped multiple program synthesis driven tools.

Before that, I completed my Ph.D. in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, advised by Sumit Gulwani and Zoran Popović. Originally from 🇺🇦 Ukraine.

Latest news

December 2024

Now at Google DeepMind in New York City. Working on coding in Gemini and the next generation of Gemini-powered SWE agents. Stay tuned for upcoming announcements!

November 2023

Wow, what a year. So much has happened that I lost track of updating the news. Briefly:

Our X team is now at Google, and growing!
We have collaborated with Google Brain to help release 🌴 PaLM 2, a new state-of-the-art language model that has better multilingual, coding, and reasoning capabilities and is more compute-efficient than its predecessor PaLM.
We have worked on numerous applications of large language models to coding. PaLM 2 and its derivatives power Google Codey APIs, Duet AI for developers, AI-powered code completion in Colab, Android Studio Bot, and others.
“Learning math reasoning from self-sampled correct and partially-correct solutions” appeared at ICLR’23.
“Natural Language to Code Generation in Interactive Data Science Notebooks” appeared at ACL’23.

April 2022

“PaLM: Scaling Language Modeling with Pathways” released on arXiv. Our team at X has collaborated with Google Research on 🌴 PaLM – a single 540B-parameter dense language model for multiple domains and tasks, trained over two TPUv4 Pods. We created PaLM-Coder – an adaptation of PaLM fine-tuned on code and evaluated on software engineering tasks. As we found out, a single PaLM-Coder model can write code, translate code between languages, follow chains of reasoning, and fix build errors better than dedicated models despite being trained on 11X less code than its closest competitors. Moreover, these remarkable abilities keep improving with scale and further training.
Google AI blog: Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance.

January 2022

“Synchromesh: Reliable code generation from pre-trained language models” will (virtually) appear at ICLR’22.

December 2021

“Neurosymbolic Programming”, our survey of techniques and representations for bridging neural and symbolic approaches to AI and programming, will be published in Foundations and Trends® in Programming Languages. Jointly written with Swarat Chaudhuri (UT Austin), Kevin Ellis (Cornell), Rishabh Singh (Google X), Armando Solar-Lezama (MIT), and Yisong Yue (Caltech).

October 2021

I moved to San Francisco and joined a team at X, the moonshot factory!

August 2021

“Programming Puzzles” will (virtually) appear at NeurIPS’21, the Datasets & Benchmarks track.

May 2021

“KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers” will (virtually) appear at ACL’21.

Check out our blog post “Conversations with data: Advancing the state of the art in language-driven data exploration” at Microsoft Research Blog, summarizing SCoRe, StruG, and RAT-SQL.

March 2021

“Structure-Grounded Pretraining for Text-to-SQL” will (virtually) appear at NAACL’21.

January 2021

“SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing” will (virtually) appear at ICLR’21.

December 2020

“SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing” and “Learning to Infer Run-Time Invariants from Source Code” presented at the Computer-Assisted Programming (CAP) workshop at NeurIPS 2020.

October 2020

“Structure-Grounded Pretraining for Text-to-SQL” released on arXiv.

July 2020

I gave a talk “Neuro-Symbolic Program Synthesis from Natural Language and Demonstrations” at the 9th Workshop on Synthesis (SYNT).

June 2020

“Neuro-Symbolic Visual Reasoning: Disentangling “Visual” from “Reasoning”” will (virtually) appear at ICML’20.

April 2020

“Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context” will (virtually) appear at ACL’20.

November 2019

“RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers” released on arXiv.

September 2019

I gave a talk “From Examples to Natural Language and Back” at the “State of the Art in Program Synthesis” workshop hosted by Synthetic Minds.

“Program Synthesis and Semantic Parsing with Learned Code Idioms” to appear at NeurIPS’19.

July 2019

I gave a talk on “Program Understanding, Synthesis, and Verification with Graph Neural Networks” at the Learning & Reasoning with Graph-Structured Representations workshop at ICML 2019. Talk recording and slides are available online.

June 2019

“Program Synthesis and Semantic Parsing with Learned Code Idioms” released on arXiv.

May 2019

At ICLR 2019 in New Orleans, we presented our recent work on generative code modeling with GNNs. Also, Gustavo Soares and I showed the first public demo of our new tool for automating repetitive source code editing on the fly, powered by the PROSE framework.

March 2019

“Are My Invariants Valid? A Learning Approach” released on arXiv.

December 2018

“Generative Code Modeling with Graphs” to appear at ICLR’19.

September 2018

Our new neuro-symbolic technique, execution-guided decoding, has helped two Microsoft Research models take the top two spots on the WikiSQL leaderboard!

“IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles” released on arXiv.

“Robust Text-to-SQL Generation with Execution-Guided Decoding” released on arXiv.

July 2018

New blog post: “Program Synthesis in 2017-18”.

June 2018

“Execution-Guided Neural Program Decoding” to appear at NAMPI’18.

FlashProfile to appear at OOPSLA’18.

New site layout.

May 2018

“Generative Code Modeling with Graphs” released on arXiv.

April 2018

I will be attending ICLR 2018 in Vancouver to present our work on neural-guided deductive search. Let me know if you want to meet up!

February 2018

Presented “Program Synthesis via Neural-Guided Deductive Search” at the Machine Learning + Programming Languages Workshop at UW.

Neural-Guided Deductive Search to appear at ICLR’18.