i build cool stuff... (like this website)
My name is Alexander, but all my friends call me Zander. I'm a Software Engineer specializing in Machine Learning and AI technologies. I hold a B.S. in Software Engineering from St. Mary's University, where I studied from August 2021 to December 2025.
I have extensive experience developing production LLMs, building training pipelines, and creating scalable ML solutions. Currently based in New York, NY, I work on document classification systems and ML infrastructure at scale.
When I'm not building AI systems, you'll find me deep in the world of anime and all things weeby! This pic was actually taken while I was drifting JDM cars in the Japanese mountains (yes, really). If you're into anime too, I'd love to know your top 5 - it's always great to connect with fellow weebs in the tech industry!
Languages & Libraries: Python, SQL, TypeScript, C++, JavaScript, React, Pandas, NumPy, PostgreSQL, Regex, Scikit-learn
ML, Cloud & Infrastructure: AWS (SageMaker, S3, Textract), GCP, OpenAI API, LangChain, MLflow, Kubernetes, Docker, Snowflake, Datadog, LLMs/RAG, CI/CD, NLP
Built a local Python GUI integrated with the GPT API for a children's shelter, coordinating with OpenAI to ensure HIPAA-compliant REST API endpoints. Decreased report generation time by 95%, allowing staff to shift focus from manual documentation to direct care. View on GitHub
Built a full-stack gamified learning application using React, TypeScript, and Vite with a Supabase backend, implementing Row Level Security policies, real-time data synchronization, and custom database triggers for user progress tracking. Developed a comprehensive authentication system with XP-based progression, achievement badges, and streak counters to enhance user engagement and learning retention for Japanese road sign education. View on GitHub
Built an org-wide Document Classification Service on AWS (SageMaker, S3) to automate claims processing for the Manual Submissions team, replacing a rigid legacy model with a modular system capable of identifying loss runs, financial statements, and signatures.
Achieved 99% accuracy (up from 79%) in document categorization by implementing Chain-of-Thought (CoT) prompting and designing rigorous evaluation pipelines using MLflow and Python.
Reduced token consumption by 40% and lowered overall inference costs by refactoring output logic (removing JSON brackets) and identifying cost efficiencies from migrating to newer NLP models.
Deployed a unified API endpoint integrating regex-based brand classifiers and LLMs, enabling the broader engineering organization to access scalable document intelligence tools.
Architected and maintained the core infrastructure for an ML data platform serving enterprise clients (e.g., V7 Labs, DataCurve), scaling system throughput to support $50k MRR with 99.9% uptime.
Optimized PostgreSQL performance to handle high-concurrency write loads from 150+ simultaneous annotators; implemented custom indexing strategies that reduced p99 query latency by 40% during peak traffic.
Engineered a low-latency code execution sandbox in C++ using Linux namespaces and seccomp filters (via Google Sandbox2), enabling the secure, isolated evaluation of untrusted user code on bare-metal EC2.
Engineered production-grade data pipelines on Kubernetes and AWS SageMaker, automating the secure ingestion, versioning, and fine-tuning of datasets to support rapid model iteration cycles.
Designed a distributed quality assurance system using TypeScript and LangChain agents, removing manual verification bottlenecks and ensuring data consistency across distributed engineering teams.
Engineered multi-stage QA pipelines for LLM training data, designing consensus algorithms and automated conflict resolution logic that improved dataset accuracy from 91% to 98%.
Developed automated validation frameworks using Python, Pandas, and Scikit-learn, implementing statistical checks to detect anomalies and rank annotator performance at scale.
After my internship, I stayed on a contract basis to build a proprietary Python-based CMS and automated deployment infrastructure on GCP to centrally manage and scale production for over 300 client websites.
Scaled web production capabilities, automating the deployment of 300+ responsive websites built with React and TypeScript while ensuring cross-browser compatibility.
Developed Python automation scripts to aggregate and clean client performance data, reducing manual reporting time by 30% and enabling the product team to track key engagement KPIs in real time.
Contributed to web development and component design in React (TypeScript), leading to a return offer for a contract-based software engineering role.
You can contact me at alexanderpaulsmall@gmail.com with any questions or concerns (or to squad up in Marvel Rivals).
Career updates and more information about my qualifications. Feel free to message me with any questions about my qualifications or to schedule an interview.