aarushi singh

ai researcher, connecting agents to humans

building at the intersection of machine learning research and systems engineering. currently focused on multimodal llms, agentic infrastructure, and retrieval-augmented generation. previously interned at microsoft on the azure data spark native execution engine. open-source contributor to transformers, langchain, and more.

work

02

software engineer intern

microsoft · azure data

worked on the azure spark native execution engine (nee) using c++, scala, velox, and gluten. integrated fuzz-testing pipelines, improved operator reliability, and enhanced ci/cd diagnostics for large-scale distributed sql execution.

jun–aug 2025

undergraduate ml researcher

bennett university

researched recommender systems and computer vision using pytorch and tensorflow. improved ndcg/mrr on matrix factorization models and benchmarked cnn/transformer architectures for emotion recognition on fer2013.

aug 2024–now

open source

05

projects

07

clipdb

privacy-first clipboard history manager with fuzzy search and aes-256 encryption. open-source.

python · cli · aes-256

distributed log aggregation

high-throughput log ingestion system handling 50k+ logs/sec using go concurrency and kafka.

go · kafka · postgresql · docker

enterprise ai workflows

end-to-end workflow system integrating azure openai and ai search for enterprise-grade automation.

python · fastapi · azure openai · react

volatility surface modelling

generative models (gan/vae) to produce smooth volatility surfaces for option pricing.

python · gan · vae · quantlib

pegasus transformer fine-tuning

fine-tuned pegasus on aeslc for abstractive summarization in low-resource email domains.

pytorch · transformers · nlp

supply chain optimization

optimal path search using dijkstra's algorithm, plus custom regression models for yield prediction.

c++ · python

seq2seq summarization

sequence-to-sequence model using encoder-decoder transformers for document summarization.

python · seq2seq · transformers

research

05

memory isolation in multi-agent llm systems

independent research · 2025

formulated memory interference as a failure mode in llm-based multi-agent systems and designed architectural variants for controlled retrieval scoping.

llmfuzz-bench

independent research · 2025

designed a controlled evaluation framework to quantify output stability of llms under stochastic decoding conditions across 1,500+ tasks.

recommender systems & diversity

bennett university · 2024 – ongoing

researching methods to improve diversity, relevance, and ranking stability in recommendation pipelines using llm-driven approaches.

emotion recognition with cnns & transformers

computer vision · 2024

benchmarked cnn and transformer architectures for emotion recognition on the fer2013 dataset, analysing accuracy-efficiency tradeoffs.

matrix factorization enhancements

collaborative filtering · 2024

improved ndcg/mrr metrics on matrix factorization models through novel regularization and training strategies.

blog

02

skills

c++ · python · go · java · scala · bash · sql · docker · git · azure devops · ci/cd · kafka · fastapi · flask · postgresql · mysql · pytorch · tensorflow · keras · transformers · hugging face · langchain · llama · scikit-learn · numpy · pandas · opencv · pytest · linux · jupyter