Ayush Agrawal

PhD Student, University of Montreal and MILA
Montreal, Canada

About

PhD Candidate in Computer Science focused on AI/ML research, particularly Large Language Models (LLMs), AI safety, mathematical reasoning, and generative AI. Track record includes multiple publications at top-tier conferences and as preprints, and awards and research funding totaling over $300K (USD and CAD combined). Applies deep expertise in PyTorch, TensorFlow, and advanced algorithms to solve complex problems in AI research.

Work

Microsoft Research

Visiting Scholar & Research Fellow

Bangalore, Karnataka, India

Summary

Led and contributed to AI research initiatives on LLM safety, mathematical reasoning, and interpretability.

Highlights

Initiated and led AI4Math projects, developing AI systems for mathematics autoformalisation and proof search, contributing to novel mathematical constructs research.

Designed and implemented advanced hallucination detection and analysis techniques for LLMs, enhancing model reliability and improving detection AUC to 0.93.

Conducted in-depth research into concept generalisation and brittleness of supervised finetuning for AI safety alignment, building comprehensive assessment datasets.

Pioneered mechanistic interpretability and probing techniques to demystify LLM black-boxes, advancing AI introspection and understanding.

Developed various prompting strategies for autoformalisation of proofs and theorem statements, leading a comprehensive study on LLM capabilities and limitations.

Proposed a neurosymbolic approach to integrate proof ingredients and powerful tactics (e.g., Aesop) into proof search, finetuning LMs to output ingredients with ~60% accuracy.

Artificial Intelligence Lab, Otto von Guericke University

Research Intern

Magdeburg, Saxony-Anhalt, Germany

Summary

Developed and implemented a WaveNet-based architecture to precisely extract singing voice from musical mixtures, enhancing audio processing capabilities.

Highlights

Developed a WaveNet-based architecture to precisely extract singing voice from musical mixtures using spectrograms and vocoder features.

Employed advanced signal processing and deep learning techniques to improve audio source separation quality and fidelity.

Data Science Lab, University of Twente

Research Intern

Enschede, Overijssel, Netherlands

Summary

Addressed medical data scarcity by generating synthetic data using GAN models and exploring augmentation techniques for pneumonia detection in chest X-rays.

Highlights

Generated synthetic medical data using existing GAN models to mitigate data insufficiency for pneumonia detection, enhancing dataset diversity.

Explored and implemented diverse augmentation techniques to improve chest X-ray analysis and diagnostic accuracy for pneumonia.

Homi Bhabha Centre for Science Education, Tata Institute of Fundamental Research (TIFR)

Software Development Intern

Mumbai, Maharashtra, India

Summary

Re-versioned two educational platforms, NROER and DOER, to significantly enhance accessibility of online resources for schools in rural areas.

Highlights

Re-versioned the NROER (website) and DOER (collection of Docker containers) educational platforms, significantly improving accessibility for rural schools.

Utilized modern software development practices, including Docker containers, to streamline deployment and management of educational content.

Volunteer

Akshaya Patra Foundation

Volunteer

India

Summary

Optimized raw material cost sourcing and supported educational and nutritional programs for underserved communities.

Highlights

Optimized raw material cost sourcing based on detailed reports and graphs, contributing to improved operational efficiency for meal programs.

Volunteered in teaching and mid-day meal programs, directly contributing to community educational and nutritional initiatives.

Education

University of Montreal and MILA
Montreal, Quebec, Canada

PhD

Computer Science

GPA: 4.3/4.3

Birla Institute of Technology and Science Pilani
Pilani, Rajasthan, India

B.E. & M.Sc.

Computer Science & Economics

GPA: 9.48/10 (Distinction)

Courses

Neural Networks & Fuzzy Logic

Information Retrieval

Data Structures & Algorithms

Object Oriented Programming

Database Systems

Linear Algebra

Probability & Statistics

Econometrics

Techniques in Social Science Research

Machine Learning

Sequence Models

Convolutional Neural Networks

Awards

AI4MATH Fund Winner

Awarded By

Renaissance Philanthropy

Recognized as one of 29 winners worldwide of the AI4Math Fund, securing $242,000 USD for the proposed research.

Bourse d'exemption UdeM

Awarded By

Université de Montréal

Awarded $44,000 CAD to support PhD studies at Université de Montréal.

DAAD WISE Scholar

Awarded By

DAAD

Awarded to 100 students across India to conduct research at German universities; valued at $3,500 CAD.

MITACS GRI Scholar

Awarded By

MITACS

Awarded to 2,000 students worldwide to conduct research at Canadian universities; valued at $4,000 CAD.

Reserve Bank of India Scholar

Awarded By

Reserve Bank of India

Awarded to 120 students across India to carry out projects under the RBI; valued at $4,000 CAD.

Bengalathon (Winner Top 10)

Awarded By

Bengalathon

Developed a facial recognition system for universities using minimal data; prize valued at $1,500 CAD.

IASC SRFP Fellow

Awarded By

IASC

Awarded to 300 students across India to carry out projects at Indian public institutions; valued at $2,000 CAD.

Smart India Hackathon (Finalist)

Awarded By

Smart India Hackathon

Proposed an approach to oil pilferage detection using a phased antenna array.

BITS Merit Scholarship

Awarded By

BITS Pilani

Awarded to the top 1% of students in the institute; valued at $13,000 CAD.

Publications

Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning

Summary

Co-authored research exploring methods to bridge gaps in multilingual reasoning for Language Models.

Shape of Thought: When Distribution Can Matter More than Correctness in Reasoning Tasks

Summary

Co-authored research challenging the 'correctness-is-key' paradigm in data curation, demonstrating superior training signals from model-generated Chains of Thought (CoTs) with incorrect final answers.

Language Models' Factuality Depends on the Language of Inquiry

Summary

Co-authored research investigating how the language of inquiry influences the factuality of Language Models.

Towards Demystifying the Optimization Landscape of RLVR Methods

Summary

Co-authored research uncovering critical training instabilities in on-policy RLVR algorithms and demonstrating the surprising stability of off-policy training.

Do Language Models Know When They're Hallucinating References?

Published by

Findings of the Association for Computational Linguistics: EACL 2024; ICBINB NeurIPS 2023

Summary

Co-authored paper investigating whether Language Models can detect their own hallucinated references, appearing in Findings of EACL 2024 and at ICBINB NeurIPS 2023.

Aesop Gets the Fables: Towards a Hybrid Proof Search Using Ingredient Prediction by Language Models

Summary

Co-authored research exploring hybrid proof search using ingredient prediction by Language Models for formal verification.

Ambient Intelligence for Securing Intelligent Vehicular Networks: Edge-Enabled Intrusion and Anomaly Detection Strategies

Published by

IEEE Internet of Things Magazine, 2023

Summary

Co-authored magazine paper on edge-enabled intrusion and anomaly detection strategies for intelligent vehicular networks.

LeanAide: Autoformalisation of Lean Theorems using Large Language Models

Summary

Co-authored research on autoformalisation of Lean theorems using Large Language Models to enhance proof assistants.

NovelADS: A novel anomaly detection system for intra-vehicular networks

Published by

IEEE Transactions on Intelligent Transportation Systems, 2022

Summary

Co-authored journal paper presenting a novel anomaly detection system for intra-vehicular networks.

Towards automating formalisation of theorem statements using large language models

Published by

MATH-AI Workshop, NeurIPS 2022

Summary

Co-authored workshop paper on automating the formalisation of theorem statements using large language models.

Towards a Mathematics Formalisation Assistant using Large Language Models

Summary

Co-authored research on developing a mathematics formalisation assistant utilizing Large Language Models.

DeepADV: A deep neural network framework for anomaly detection in VANETs

Published by

IEEE Transactions on Vehicular Technology, 2021

Summary

Co-authored journal paper presenting a deep neural network framework for anomaly detection in VANETs.

Deep neural networks for securing IoT enabled vehicular ad-hoc networks

Published by

ICC 2021-IEEE International Conference on Communications

Summary

Co-authored conference paper on deep neural networks for securing IoT-enabled vehicular ad-hoc networks.

Skills

Programming Languages

Python, Java, C++, C, F*, Lean4, MATLAB, SQL.

Libraries and Frameworks

PyTorch, TensorFlow, Librosa, OpenCV.

AI/Machine Learning

Large Language Models (LLMs), AI Safety, Mathematical Reasoning, Formal Theorem Proving, Generative AI, Deep Learning, Neural Networks, Mechanistic Interpretability, Anomaly Detection, Natural Language Processing, Computer Vision, Reinforcement Learning, Data Augmentation, GANs.

Research & Analysis

Problem Solving, Data Analysis, Experimental Design, Algorithm Development, Statistical Analysis, Academic Research, Technical Writing.

Projects

Shape of Thought: When Distribution Can Matter More than Correctness in Reasoning Tasks

Summary

Challenged the 'correctness-is-key' paradigm in data curation, demonstrating that model-generated Chains of Thought (CoTs), even with incorrect final answers, can serve as a superior training signal to human-verified data.

Analysis of Optimization Landscapes for RLVR Methods

Summary

Investigated critical training instabilities in on-policy RLVR algorithms (e.g., GRPO variants, PPO) and demonstrated the surprising stability of off-policy training.

AI Systems for Novel Mathematical Constructs

Summary

Project focused on developing AI systems capable of mathematical discovery by quantifying the usefulness of mathematics questions, definitions, and conjectures through formal theorem proving.