Towards Neural Synthesis for SMT-assisted Proof-Oriented ProgrammingSecurityFormal MethodsAward Winner
Proof-oriented programs mix computational content with proofs of program correctness. However, the human effort involved in programming and proving is still substantial, despite the use of Satisfiability Modulo Theories (SMT) solvers to automate proofs in languages such as F*.
Seeking to spur research on using AI to automate the construction of proof-oriented programs, we curate a dataset of 600K lines of open-source F* programs and proofs, including software used in production systems ranging from Windows and Linux, to Python and Firefox. Our dataset includes around 32K top-level F* definitions, each representing a type-directed program and proof synthesis problem—producing a definition given a formal specification expressed as an F* type. We provide a program-fragment checker that queries F* to check the correctness of candidate solutions. We believe this is the largest corpus of SMT-assisted program proofs coupled with a reproducible program-fragment checker.
Grounded in this dataset, we investigate the use of AI to synthesize programs and their proofs in F*, with promising results. Our main finding in that the performance of fine-tuned smaller language models (such as Phi-2 or StarCoder) compare favorably with large language models (such as GPT-4), at a much lower computational cost. We also identify various type-based retrieval augmentation techniques and find that they boost performance significantly. With detailed error analysis and case studies, we identify potential strengths and weaknesses of models and techniques and suggest directions for future improvements.
Thu 1 MayDisplayed time zone: Eastern Time (US & Canada) change
14:00 - 15:30 | |||
14:00 15mTalk | Large Language Models as Configuration ValidatorsSecurity Research Track Xinyu Lian University of Illinois at Urbana-Champaign, Yinfang Chen University of Illinois at Urbana-Champaign, Runxiang Cheng University of Illinois at Urbana-Champaign, Jie Huang University of Illinois at Urbana-Champaign, Parth Thakkar Meta Platforms, Inc., Minjia Zhang UIUC, Tianyin Xu University of Illinois at Urbana-Champaign | ||
14:15 15mTalk | LLM Assistance for Memory SafetySecurity Research Track Nausheen Mohammed Microsoft Research, Akash Lal Microsoft Research, Aseem Rastogi Microsoft Research, Subhajit Roy IIT Kanpur, Rahul Sharma Microsoft Research | ||
14:30 15mTalk | Vulnerability Detection with Code Language Models: How Far Are We?Security Research Track Yangruibo Ding Columbia University, Yanjun Fu University of Maryland, Omniyyah Ibrahim King Abdulaziz City for Science and Technology, Chawin Sitawarin University of California, Berkeley, Xinyun Chen , Basel Alomair King Abdulaziz City for Science and Technology, David Wagner UC Berkeley, Baishakhi Ray Columbia University, Yizheng Chen University of Maryland | ||
14:45 15mTalk | Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with JustificationsBlockchainSecurity Research Track Wei Ma , Daoyuan Wu Hong Kong University of Science and Technology, Yuqiang Sun Nanyang Technological University, Tianwen Wang National University of Singapore, Shangqing Liu Nanyang Technological University, Jian Zhang Nanyang Technological University, Yue Xue , Yang Liu Nanyang Technological University | ||
15:00 15mTalk | Towards Neural Synthesis for SMT-assisted Proof-Oriented ProgrammingSecurityFormal MethodsAward Winner Research Track Saikat Chakraborty Microsoft Research, Gabriel Ebner Microsoft Research, Siddharth Bhat University of Cambridge, Sarah Fakhoury Microsoft Research, Sakina Fatima University of Ottawa, Shuvendu K. Lahiri Microsoft Research, Nikhil Swamy Microsoft Research | ||
15:15 15mTalk | Prompt-to-SQL Injections in LLM-Integrated Web Applications: Risks and DefensesSecuritySE for AI Research Track Rodrigo Resendes Pedro INESC-ID / IST, Universidade de Lisboa, Miguel E. Coimbra INESC-ID; Instituto Superior Técnico - University of Lisbon, Daniel Castro INESC-ID / IST, Universidade de Lisboa, Paulo Carreira INESC-ID / IST, Universidade de Lisboa, Nuno Santos INESC-ID; Instituto Superior Técnico - University of Lisbon |