Maxwell J. Yin

Machine Learning Engineer specializing in scalable LLM systems, retrieval, and production ML.

About

Machine Learning Engineer specializing in large-scale LLM systems, distributed training, and production deployment, based in Toronto, Canada.

Ph.D. in NLP from Western University. Experienced in building end-to-end transformer training pipelines, retrieval systems, and production-ready ML infrastructure, with a focus on optimizing model quality and system efficiency under real-world compute constraints.

Experience

Huawei Noah’s Ark Lab — Machine Learning Engineer

Toronto, ON · 2025 – 2026

Led development and optimization of large-scale LLM training and evaluation pipelines, including a 400M-parameter Transformer trained on 40B tokens across 8×H100 GPUs.

Focused on distributed training, system profiling, and training stability to improve reproducibility, sustained multi-GPU efficiency, and deployment-aware model evaluation.

Western University — Graduate Research Assistant

London, ON · 2021 – 2025

Built production-scale semantic retrieval systems over 500K+ documents and 1.7M passages, replacing expensive reranking pipelines with FAISS-based ANN retrieval in a unified embedding space.

Combined research and engineering across retrieval, efficiency, and system design, reducing end-to-end latency by more than 5× while strengthening result quality and scalability.

Featured Project

CineSeek: Agent-Enhanced Semantic Movie Search

CineSeek is an agent-enhanced semantic movie search system designed for interactive natural-language retrieval. It combines LLM-based query understanding and rewriting, FAISS-based ANN retrieval, and agent-guided reranking and explanation to improve search quality for complex user intents.

The project reflects how I build production-oriented GenAI systems: using agents to improve query understanding and interpretability, while keeping retrieval and serving fast, scalable, and grounded in system-level efficiency.

Technical Focus

LLM Systems · Agentic Workflows · Distributed Training · Inference Optimization · Retrieval & RAG · FAISS · LLM Evaluation · PyTorch · DeepSpeed · vLLM · FastAPI

Research & Publications

First-author publications in TACL, NAACL Findings, AAAI, and Expert Systems with Applications, focusing on building practical and scalable ML systems.

  • TACL 2024: Source-Free Domain Adaptation for Question Answering with Masked Self-training
  • NAACL Findings 2024: Source-Free Unsupervised Domain Adaptation for Question Answering via Prompt-Assisted Self-learning
  • AAAI 2025: MABR: Multilayer Adversarial Bias Removal Without Prior Bias Knowledge
  • Expert Systems with Applications 2024: A Fast Local Citation Recommendation Algorithm Scalable to Multi-topics
  • Full list available on Google Scholar.

Contact

Email: [email protected]
Location: Toronto, ON, Canada