Research that ships

I'm Andrusha

About

From papers to production

Currently: M.S. AI at UT Austin · Research Engineering at Burns & McDonnell

I translate AI research into production systems at enterprise scale—bridging the gap between what papers discover and what products need.

From multi-agent orchestration platforms to NL-to-SQL agents serving 15,000 employees, I apply insights from interpretability, evaluation, and reinforcement learning to build systems that actually work outside a notebook.

Currently pursuing an M.S. in AI at UT Austin while leading research engineering at Burns & McDonnell.

Publications

IEEE, NAACL, arXiv

15K+

Users Impacted

Enterprise Platforms

Platforms Built

Research → Production

Experience & Education

Where I've been

2025 – 2027 (expected)

M.S. Artificial Intelligence

University of Texas at Austin

Graduate studies in AI with focus on agent systems, interpretability, and reasoning.

May 2024 – Present

Research Engineer

Burns & McDonnell

Drove adoption of AI-native engineering practices across Technology Solutions Group. Architected three production platforms—Axiom, a document processing SDK, and Experience IQ—applying research from agent orchestration, RLVR, and LLM evaluation to serve 15,000+ employees. Presenting at Google Cloud Next 2026.

2026

Speaker — Google Cloud Next

Google

Experience IQ: Enterprise NL-to-SQL Agent.

December 2025

Speaker — Google Dev Days

Google

AI Applications.

November 2024

Guest Lecturer — CS473

Purdue University

Retrieval Augmented Generation.

May 2024

Speaker — BMcD Innovation Roundtable

Burns & McDonnell

Introduction to Agentic Systems.

March 2024

Speaker — BMcD Innovation Roundtable

Burns & McDonnell

Applications of Large Language Models.

January 2024 – Present

Independent Researcher

University of Virginia (Aidong Zhang Group)

Published at IEEE Big Data 2024. Fine-tuned language models via PEFT and prompt tuning for Social Determinants of Health extraction from MIMIC-IV.

January 2023 – April 2024

Data Engineer

1898 & Co.

Reduced manual data integration time by ~70% for utility-sector clients. Built ETL pipelines, a Flask REST API automating data migration from SharePoint to Oracle APEX, and PowerBI dashboards visualizing operational KPIs for 10–15 project managers.

2018 – 2022

B.S. Computer Science · Minor: Psychology

Purdue University

3 publications during undergrad. TA for CS390 Deep Learning, CS252 Systems Programming, ECE264 Advanced C. Research across NLP, education technology, and formal language theory.

Selected Work

What I've built

Axiom

Multi-Agent Engineering Platform

Orchestration platform supporting any Vertex AI model and coding harness (OpenCode, Claude Code) with full lineage tracking. Features a context engineering layer with MCPs, Skills, and A2A protocols—admins define guidance structures while end users interact through a brief agent → planner agent → specialized agent teams pipeline. Generates documents, images, data files, and deployable web applications.

Agent Teams

100+

Beta Users

10+

Models Supported

Python · FastAPI · Vertex AI · Cloud Run · ADK · MCP · A2A

Experience IQ

Enterprise NL-to-SQL Agent

LLM agent that translates natural language into SQL over a 110-table Spanner database using intent routing, dynamic schema retrieval, and an SME-curated few-shot curriculum inspired by the skill library in NVIDIA's Voyager. Systematic evaluation cycles categorized failure modes by root cause, improving query quality from 72% to 80% and reliability from 75% to 94%. Includes an LLM-as-a-Judge evaluator and a Looker LLMOps dashboard for real-time drift monitoring.

15K

Users

94%

Reliability

−85%

Search Time

Python · GCP Spanner · BigQuery · Vertex AI · Gemini · Looker · DSPy
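The three stages named in the Experience IQ description (intent routing, dynamic schema retrieval, few-shot prompting) can be illustrated with a toy sketch. Everything below is an assumption for illustration: the table names, keywords, examples, and the keyword-overlap scoring are hypothetical stand-ins, not the production schema or retrieval method.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    intent: str
    question: str
    sql: str


# Hypothetical schema, intents, and curated examples (not the real system's).
TABLES = {
    "employees": "employees(id, name, office_id, title)",
    "offices": "offices(id, city, region)",
    "projects": "projects(id, name, lead_id, start_date)",
}
INTENT_KEYWORDS = {
    "people": {"employee", "employees", "who", "title"},
    "projects": {"project", "projects", "lead"},
}
EXAMPLES = [
    Example("people", "Who works in office 3?",
            "SELECT name FROM employees WHERE office_id = 3"),
    Example("projects", "List all project names.",
            "SELECT name FROM projects"),
]


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; underscores are kept so column names stay whole."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def route_intent(question: str) -> str:
    """Pick the intent whose keyword set overlaps the question most."""
    words = tokenize(question)
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def retrieve_schema(question: str, top_k: int = 2) -> list[str]:
    """Return up to top_k table definitions that share a token with the question."""
    words = tokenize(question)

    def score(name: str) -> int:
        return len(words & tokenize(TABLES[name]))

    ranked = sorted(TABLES, key=score, reverse=True)
    return [TABLES[name] for name in ranked[:top_k] if score(name) > 0]


def build_prompt(question: str) -> str:
    """Assemble a prompt from the retrieved schema and intent-matched examples."""
    intent = route_intent(question)
    shots = [ex for ex in EXAMPLES if ex.intent == intent]
    lines = ["Schema:"] + retrieve_schema(question) + ["", "Examples:"]
    for ex in shots:
        lines += [f"Q: {ex.question}", f"SQL: {ex.sql}"]
    lines += ["", f"Q: {question}", "SQL:"]
    return "\n".join(lines)


question = "Which projects does Dana lead?"
prompt = build_prompt(question)
```

A real system would likely swap the keyword overlap for embedding-based retrieval, but the shape stays the same: only the relevant slice of a large schema and a few matching examples reach the model.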

Document Processing SDK

Intelligent Extraction Platform

Microservice for certificate-of-insurance-to-subcontract comparisons with rule-specific preprocessing and structured prompts routed to Gemini. Applied RLVR’s strict verifier paradigm to design deterministic verification functions sourced from SMEs. Core logic extracted into a reusable SDK—users provide a prompt, JSON schema, and optional verification function; the SDK handles the rest.

85–90%

Accuracy

1.5–3K

Users

−90%

Review Time

Python · FastAPI · Gemini API · Cloud Run · Docker
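The SDK contract described for the Document Processing SDK (a prompt, a JSON schema, and an optional verification function) might look roughly like the sketch below. This is a minimal illustration under stated assumptions: `extract`, `check_schema`, the retry loop, and the stubbed model call are hypothetical names and choices, not the SDK's actual API, and the schema check covers only required keys and basic types.

```python
import json
from typing import Callable, Optional

# Hypothetical aliases; these names are illustrative, not the SDK's.
Verifier = Callable[[dict], bool]
ModelFn = Callable[[str], str]

JSON_TYPES = {"string": str, "number": (int, float), "boolean": bool,
              "object": dict, "array": list}


def check_schema(record: dict, schema: dict) -> bool:
    """Minimal check: required keys are present and typed as declared."""
    if any(key not in record for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if key not in record or "type" not in spec:
            continue
        value = record[key]
        # bool is a subclass of int in Python, so reject it for "number".
        if spec["type"] == "number" and isinstance(value, bool):
            return False
        if not isinstance(value, JSON_TYPES[spec["type"]]):
            return False
    return True


def extract(prompt: str, schema: dict, call_model: ModelFn,
            verifier: Optional[Verifier] = None,
            max_attempts: int = 2) -> Optional[dict]:
    """Call the model, then accept output only if it parses, fits the schema,
    and passes the deterministic verifier. Returns None if every attempt fails."""
    for _ in range(max_attempts):
        try:
            record = json.loads(call_model(prompt))
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict) or not check_schema(record, schema):
            continue
        if verifier is not None and not verifier(record):
            continue
        return record
    return None


# Stub standing in for a Gemini call so the sketch runs offline.
def fake_model(prompt: str) -> str:
    return '{"insurer": "Acme Mutual", "policy_limit": 2000000}'


schema = {
    "required": ["insurer", "policy_limit"],
    "properties": {"insurer": {"type": "string"},
                   "policy_limit": {"type": "number"}},
}


def meets_minimum(record: dict) -> bool:
    """Hypothetical SME rule: coverage must be at least $1M."""
    return record["policy_limit"] >= 1_000_000


accepted = extract("Extract insurer and limit.", schema, fake_model, meets_minimum)
rejected = extract("Extract insurer and limit.", schema, fake_model,
                   lambda r: r["policy_limit"] >= 5_000_000)
```

The verifier is a plain function that returns pass or fail, which is the sense in which the check is deterministic: the model may vary between calls, but the acceptance rule does not.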

Research

Published work

View all on Google Scholar

01 · IEEE International Conference on Big Data · December 2024

Context-specific feature augmentation for improving social determinants of health extraction

L. Gong, A. Shor, A. Zhang, and K. Jha

Augments EHR discharge summaries with biomedical literature context to improve SDoH extraction. Introduces an adaptive feature infusion strategy combining information from different sources, significantly outperforming baselines on the MIMIC-SDoH dataset.


02 · IEEE Frontiers in Education Conference · March 2024

Clustering entity relationship diagrams: Enhancing feedback quality and grading consistency in large database courses

S. Thadani, A. Shor, S. Ahn, L. Gong, A. Alawini, and H. Benotman

A tool for clustering Entity Relationship Diagrams using object detection, OCR, and clustering to group similar student submissions. Identifies common approaches and mistakes, improving feedback and simplifying grading at scale.


03 · NAACL: Human Language Technologies · July 2022

A holistic framework for analyzing the COVID-19 vaccine debate

M. L. Pacheco, T. Islam, M. Mahajan, A. Shor, M. Yin, L. Ungar, and D. Goldwasser

Proposes a framework connecting stance analysis, reason analysis, and moral sentiment analysis to combat misinformation. Analyzed temporal trends in 2.7M COVID-19 tweets and validated BERT sentiment classifiers for a vaccine debate framework.


04 · arXiv preprint · January 2023

Neural operator: Is data all you need to model the world?

H. Viswanath, M. A. Rahman, A. Vyas, A. Shor, et al.

A comprehensive survey of neural operator architectures for learning mappings between function spaces, exploring whether data-driven approaches can replace traditional physics-based modeling.



Skills

What I work with

A diverse skillset spanning theoretical foundations and practical applications, with a focus on designing scalable AI solutions.

AI & NLP

LLM Agents · RAG · NL-to-SQL · LLM Evaluation · Prompt Optimization (DSPy/GEPA) · Multi-Agent Orchestration · MCP / A2A Protocols · Interpretability · Reasoning Traceability · RLVR · Controlled Generation

Platforms & Infrastructure

Vertex AI · GCP (Spanner, BigQuery, Cloud Run) · Azure Serverless · Docker · CI/CD · Looker · Cloud Workflows

Frameworks & Languages

Python · SQL · JavaScript · MATLAB · C · PyTorch · TensorFlow · HuggingFace Transformers · FastAPI · Flask · Scikit-Learn · NLTK

Currently

What I'm thinking about

Problems and ideas I'm actively exploring outside of day-to-day work.

Exploring

Context Engineering

How do we give agents the right information at the right time? Exploring MCP, A2A, and skills architectures for reliable multi-agent systems.

Thinking about

LLM Evaluation at Scale

Systematic approaches to catching agent drift and measuring reliability. From trace analysis and failure mode categorization to LLM-as-a-Judge evaluators.

Thinking about

Research-Informed Engineering

Bridging the gap between what papers discover and what products need. Applying frameworks like RLVR to build deterministic verification in production systems.

Exploring

AI in High-Stakes Domains

Healthcare, engineering, education — domains where AI mistakes have real consequences. How do we build systems that earn trust?

Contact

Let's connect

Open to research collaborations, speaking opportunities, and conversations about turning research into production systems.

andrusha.shor@gmail.com

GitHub

AndrushaShor

andrey-shor

Google Scholar

Publications

Resume

Download CV