Kains Kavuluri

MSc student · Université de Montréal & Mila · Montréal, QC

I'm a Master's student in Computer Science (AI) at Université de Montréal and Mila - Quebec Artificial Intelligence Institute, holding a GPA of 4.3/4.3.

My current work spans two areas. In RL for drug discovery, I am building a discrete masked diffusion model that combines RL-based property optimization with natural language conditioning. On loss of plasticity, I am studying why deep RL agents lose learning capacity over time and whether evolution-inspired mechanisms can restore it. More broadly, I am drawn to questions about causal structure in RL, particularly whether latent variable models that recover causal environment dynamics could improve agent generalization, and how RL can steer generative models toward open problems in scientific discovery.

Before Mila, I spent 4+ years at Deutsche Bank's Chief Innovation Office, working on financial AI for regulatory requirements, LLM agents, fine-tuning LLMs for internal deployment, and safety evaluation of LLM systems. I hold a Bachelor's in Information Technology from NIT Karnataka.

News

May 2026 Joining CRIM as a Research Intern, researching RL and RLVR methods to improve LLMs as co-pilots for complex software.

Sept 2025 Started MSc at UdeM & Mila (GPA 4.3/4.3).

Jul 2021 Joined Deutsche Bank CIO; worked on financial AI for regulatory requirements, LLM agents, fine-tuning, and safety evaluation of LLM systems.

May 2021 Graduated with a B.Tech. from NIT Karnataka.

Research

Research projects at Mila and selected past work.

2026

ONGOING
Discrete Diffusion for Molecular Generation with RL and Natural Language Steering

Kains Kavuluri

Université de Montréal, April 2026

Learns drug-like molecule representations through discrete masked diffusion over SELFIES, with 100% chemical validity by construction. REINFORCE with per-step gradient accumulation steers generation toward desired properties, improving drug-likeness by 12.8%. Natural language conditioning via SciBERT cross-attention reaches 76% of SOTA at 15% of the parameter count.

2025

ONGOING
Reinforcement Learning in Non-Stationary Environments: Loss of Plasticity and Evolutionary Remedies

Kains Kavuluri

Ongoing research · Mila, 2025–present

Investigating loss of plasticity in deep RL agents and experimenting with evolutionary algorithms to overcome it.

2021

THESIS
Progressive Conformer for Image Classification

Kains Kavuluri

NIT Karnataka, 2021

A hybrid architecture combining Transformers and CNNs at the attention level for progressive feature learning in image classification.

Projects

Selected engineering and applied ML projects.

🏒

Expected Goal Prediction · NHL

Goal probability prediction for NHL gameplay. Peak AUC 0.88. Engineered 18+ spatial/temporal features; Bayesian and Grid Search hyperparameter tuning.

python · xgboost · pytorch · bayesian-opt

🤖

AI Agents for Computer Use

LLM agent that browses the web via natural language instructions, using DSPy for few-shot prompting over cleaned HTML context.

python · dspy · llm-agents · html-parsing

🔬

Deep UNet · Melanoma Segmentation

Image segmentation model for dermoscopy images used in downstream disease classification tasks.

pytorch · unet · medical-imaging

💡

Light-to-Camera Indoor Positioning

Novel indoor positioning system for mobile devices using light-based communication.

computer-vision

🧠

AI for Dementia

iOS app helping dementia patients manage routines via voice-to-text and text-to-voice models.

swift · ios · speech · tts

Skills

Languages

Python TypeScript Swift Go SQL Bash

ML / AI

PyTorch LangChain LangGraph DSPy MLflow JAX scikit-learn HuggingFace

Infra

GCP · Vertex AI Docker GitHub Actions Linux Kubernetes

Education

2025 – present

M.Sc. Computer Science, Artificial Intelligence

Université de Montréal · Mila, Quebec AI Institute

GPA 4.3 / 4.3 · Fundamentals of ML, Data Science, Representation Learning

2017 – May 2021

B.Tech. Information Technology

National Institute of Technology Karnataka (NITK), India

Computer Vision, Neural Networks, Deep Learning, Algorithms, Distributed Systems, HPC, Graph Theory

Experience

May 2026 –

Research Intern, LLM Agents and Software Co-pilot

CRIM (Centre de Recherche Informatique de Montréal) · Montréal, QC

Research on enabling LLMs to navigate and operate complex software as genuine co-pilots for end users, applying RL, supervised fine-tuning, and RLVR to train agents for real-world application use.

Jul 2021 – 2025

Associate / Senior Analyst / Analyst, Chief Innovation Office

Deutsche Bank Group · Pune, India

Financial AI for regulatory requirements, LLM agents for internal use cases, fine-tuning LLMs for internal deployment, and safety evaluation of LLM systems.

Apr–Jun 2020

Software Engineer Intern

Pulse Secure LLC · Bengaluru, India

Deep LSTM for insider-threat detection; Node.js log visualisation platform.