Learn Globally, Speak Locally: Bridging the Gaps in Multilingual Reasoning
Summary
Co-authored research exploring methods to bridge gaps in multilingual reasoning for Language Models.
PhD candidate in Computer Science focused on AI/ML research, particularly Large Language Models (LLMs), AI safety, mathematical reasoning, and generative AI. Track record includes multiple publications in top-tier conferences and preprints, and awards totaling over $300K (USD and CAD combined) in research funding. Experienced with PyTorch and TensorFlow.
Bangalore, Karnataka, India
Summary
Led and contributed to AI research initiatives focused on LLM safety, mathematical reasoning, and interpretability.
Highlights
Initiated and led AI4Math projects, developing AI systems for mathematics autoformalisation and proof search and contributing to research on novel mathematical constructs.
Designed and implemented advanced hallucination detection and analysis techniques for LLMs, enhancing model reliability and improving detection AUC to 0.93.
Conducted in-depth research into concept generalisation and brittleness of supervised finetuning for AI safety alignment, building comprehensive assessment datasets.
Pioneered mechanistic interpretability and probing techniques to demystify LLM black boxes, advancing AI introspection and understanding.
Developed various prompting strategies for autoformalisation of proofs and theorem statements, leading a comprehensive study on LLM capabilities and limitations.
Proposed a neurosymbolic approach to integrate proof ingredients and powerful tactics (e.g., Aesop) into proof search, finetuning LMs to output ingredients with ~60% accuracy.
Magdeburg, Saxony-Anhalt, Germany
Summary
Developed and implemented a WaveNet-based architecture to extract the singing voice from musical mixtures.
Highlights
Developed a WaveNet-based architecture to precisely extract singing voice from musical mixtures using spectrograms and vocoder features.
Employed advanced signal processing and deep learning techniques to improve audio source separation quality and fidelity.
Enschede, Overijssel, Netherlands
Summary
Addressed medical data scarcity by generating synthetic data using GAN models and exploring augmentation techniques for pneumonia detection in chest X-rays.
Highlights
Generated synthetic medical data using existing GAN models to mitigate data insufficiency for pneumonia detection, enhancing dataset diversity.
Explored and implemented diverse augmentation techniques to improve chest X-ray analysis and diagnostic accuracy for pneumonia.
Software Development Intern
Mumbai, Maharashtra, India
Summary
Re-versioned two educational platforms, NROER and DOER, to significantly enhance accessibility of online resources for schools in rural areas.
Highlights
Re-versioned NROER (website) and DOER (collection of Docker containers) educational platforms, significantly improving accessibility for rural schools.
Utilized modern software development practices, including Docker containers, to streamline deployment and management of educational content.
Volunteer
Summary
Optimized raw-material sourcing costs and supported educational and nutritional programs for underserved communities.
Highlights
Optimized raw-material sourcing costs based on detailed reports and graphs, contributing to improved operational efficiency for meal programs.
Volunteered in teaching and mid-day meal programs, directly contributing to community educational and nutritional initiatives.
B.E. & M.Sc.
Computer Science & Economics
GPA: 9.48/10 (Distinction)
Courses
Neural Networks & Fuzzy Logic
Information Retrieval
Data Structures & Algorithms
Object Oriented Programming
Database Systems
Linear Algebra
Probability & Statistics
Econometrics
Techniques in Social Science Research
Machine Learning
Sequence Models
Convolutional Neural Networks
Awarded By
Renaissance Philanthropy
Recognized as one of 29 worldwide winners for the proposed AI4Math fund, securing $242,000 USD for research.
Awarded By
Université de Montréal
Awarded $44,000 CAD to support PhD studies at Université de Montréal.
Awarded By
DAAD
Awarded to 100 students across India for engaging in research at German universities, valued at $3,500 CAD.
Awarded By
MITACS
Awarded to 2,000 students worldwide for engaging in research at Canadian universities, valued at $4,000 CAD.
Awarded By
Reserve Bank of India
Awarded to 120 students across India to carry out projects under RBI, valued at $4,000 CAD.
Awarded By
Bengalathon
Awarded for developing a facial recognition system for universities using minimal data; prize valued at $1,500 CAD.
Awarded By
IASC
Awarded to 300 students across India to carry out projects in Indian public institutions, valued at $2,000 CAD.
Awarded By
Smart India Hackathon
Proposed oil pilferage detection using a phased antenna array.
Awarded By
BITS Pilani
Awarded to top 1% students in the institute, valued at $13,000 CAD.
Summary
Co-authored research challenging the 'correctness-is-key' paradigm in data curation, demonstrating superior training signals from model-generated Chains of Thought (CoTs) with incorrect final answers.
Summary
Co-authored research investigating how the language of inquiry influences the factuality of Language Models.
Summary
Co-authored research uncovering critical training instabilities in on-policy RLVR algorithms and demonstrating the surprising stability of off-policy training.
Published by
Findings of the Association for Computational Linguistics: EACL 2024; ICBINB Workshop, NeurIPS 2023
Summary
Co-authored conference paper investigating hallucination detection in Language Models, published in Findings of EACL 2024 and at the ICBINB Workshop, NeurIPS 2023.
Summary
Co-authored research exploring hybrid proof search using ingredient prediction by Language Models for formal verification.
Published by
IEEE Internet of Things Magazine, 2023
Summary
Co-authored magazine paper on edge-enabled intrusion and anomaly detection strategies for intelligent vehicular networks.
Summary
Co-authored research on autoformalisation of Lean theorems using Large Language Models to enhance proof assistants.
Published by
IEEE Transactions on Intelligent Transportation Systems, 2022
Summary
Co-authored journal paper presenting a novel anomaly detection system for intra-vehicular networks.
Published by
MATH-AI Workshop, NeurIPS 2022
Summary
Co-authored workshop paper on automating the formalisation of theorem statements using large language models.
Summary
Co-authored research on developing a mathematics formalisation assistant utilizing Large Language Models.
Published by
IEEE Transactions on Vehicular Technology, 2021
Summary
Co-authored journal paper presenting a deep neural network framework for anomaly detection in VANETs.
Published by
ICC 2021 - IEEE International Conference on Communications
Summary
Co-authored conference paper on deep neural networks for securing IoT-enabled vehicular ad-hoc networks.
Python, Java, C++, C, F*, Lean4, MATLAB, SQL.
PyTorch, TensorFlow, Librosa, OpenCV.
Large Language Models (LLMs), AI Safety, Mathematical Reasoning, Formal Theorem Proving, Generative AI, Deep Learning, Neural Networks, Mechanistic Interpretability, Anomaly Detection, Natural Language Processing, Computer Vision, Reinforcement Learning, Data Augmentation, GANs.
Problem Solving, Data Analysis, Experimental Design, Algorithm Development, Statistical Analysis, Academic Research, Technical Writing.
Summary
Challenged the 'correctness-is-key' paradigm in data curation, demonstrating that model-generated Chains of Thought (CoTs), even with incorrect final answers, can serve as a superior training signal to human-verified data.
Summary
Investigated critical training instabilities in on-policy RLVR algorithms (e.g., GRPO variants, PPO) and demonstrated the surprising stability of off-policy training.
Summary
Project focused on developing AI systems capable of mathematical discovery by quantifying the usefulness of mathematics questions, definitions, and conjectures through formal theorem proving.