Open to new opportunities · Scarborough, Ontario

Sumit Vaise · Senior ML Engineer

I build AI/ML systems that actually ship — RAG pipelines, production LLM applications, and end-to-end MLOps infrastructure that handles millions of events daily across GCP and AWS.

9+
Years Experience
6
Companies
20M+
Daily Events Handled
40%
Avg Latency Reduction

The quick version

I stumbled into ML through a computer vision project at Bosch — spent six months teaching a model to spot defects on a production line, realized I'd found the thing I wanted to do every day, and never looked back.

These days I work across the full ML stack: designing RAG pipelines that reduce analyst research time by 40%, building real-time fraud detection systems that flag anomalies in under two seconds across 20M+ daily transactions, and leading teams to take models from notebook to production on GCP and AWS.

What I care most about is the gap between a model that works in a notebook and one that reliably works in production. That's the space I live in — MLOps, observability, CI/CD for ML, and making sure the thing you built six months ago is still the thing running today.

Currently based in Scarborough, Ontario. Canadian PR. Open to Senior ML / AI Engineer roles across Canada and the US.

🧠

Core Strength

RAG systems, LLM applications, and production-grade NLP pipelines using LangChain, HuggingFace, and OpenSearch.

☁️

Cloud & MLOps

GCP (Vertex AI, BigQuery ML, GKE) and AWS (SageMaker, EKS, Airflow on MWAA). End-to-end CI/CD with GitHub Actions and Docker.

👁️

Also builds

Computer vision models for medical imaging (diabetic retinopathy detection) and industrial quality inspection.

📍

Location

Scarborough, Ontario · Canadian PR · Open to remote, hybrid, or on-site roles.

Where I've worked

Six companies. Each one taught me something different about shipping ML at scale.

Senior ML Engineer
Fidelity Investments · Boston, MA, USA
Nov 2024 – Dec 2025
  • Built a Multi-Document RAG Pipeline using LangChain, Sentence Transformers, and ChromaDB — cut analyst research time by 40% across 500+ financial reports.
  • Deployed a Vector ETL pipeline for multi-source document ingestion, implementing recursive text chunking and automated embedding generation.
  • Developed end-to-end ETL and ML pipelines for real-time communication error detection, using Airflow, Snowflake, Flask, and AWS EKS.
  • Established MLOps best practices: MLflow experiment tracking, GitHub Actions CI/CD, automated retraining triggers, Grafana monitoring dashboards.
LangChain · RAG · ChromaDB · AWS EKS · Kafka · XGBoost · MLflow · FastAPI
Senior MLOps Developer
Definity Financial · Toronto, ON
May 2024 – Oct 2024
  • Designed, built, and maintained scalable ML pipelines and cloud infrastructure (BigQuery, Vertex AI, Docker, Kubernetes, Kubeflow) for production ML systems serving 14M+ records.
  • Implemented an end-to-end Random Forest pipeline to predict insurance policy lapses, identifying the top 2% of at-risk customers and directly informing business retention strategy.
  • Designed comprehensive monitoring, alerting, and observability systems for ML model health, maintaining strict production SLAs and minimizing incident downtime.
  • Partnered with product managers to align ML technical requirements with business stakeholder needs in Agile sprints.
Vertex AI · BigQuery ML · GCP · NLP · Python · Scikit-learn
Senior ML Engineer/Tech Lead
Quantiphi Inc. · Toronto, Canada
Jul 2021 – Mar 2024
  • Inventory Allocation Prediction (Team Lead, team of 4): Optimized US shipment locations. Partnered with cross-functional teams and non-technical stakeholders to architect and lead end-to-end ML solution delivery, translating business requirements into rapid ML prototypes and production systems.
  • Insurance Claims Processing (team of 12): Implemented a scalable deep learning classification system on GCP using TensorFlow, PyTorch, OCR, and MLOps practices, speeding up claims processing by 70% and improving operational efficiency.
  • Mentored 3 junior engineers; led bi-weekly ML guild sessions on LLM fine-tuning and retrieval-augmented generation techniques.
  • Delivered 6 production ML systems across healthcare, fintech, and media verticals on GCP and AWS.
GCP · Kubeflow · BERT · Document AI · Kubernetes · Grafana · MLOps
Data Scientist
SalonEveryWhere · Toronto, Canada
Jan 2021 – Jul 2021
  • Built a computer vision pipeline for hairstyle classification and recommendation using PyTorch CNNs; integrated it with an iOS/Android app serving 50K+ users.
  • Quantized models with TensorFlow Lite for mobile deployment, reducing model size while preserving accuracy.
PyTorch · Computer Vision · Recommendation Systems · AWS · Quantization
Computer Vision Engineer
Bosch · Bangalore, India
Aug 2017 – Dec 2018
  • Built and trained a person re-identification and tracking system using a CNN-based Siamese network (PyTorch) with a custom GUI (Qt, C++) for real-time inference.
  • Reduced the training time of a CNN object detection model for an autonomous driving project by 22% without accuracy loss. Developed a Python algorithm to optimize dataset labeling (dense to sparse) for GPU efficiency. Trained the model on an AWS EC2 instance.
TensorFlow · Computer Vision · CNNs · OpenCV · Python
Computer Vision Engineer
LNT Tech Services · Mysore, India
Jan 2015 – Aug 2017
  • Led projects in computer vision application development, including medical image stitching (MATLAB, OpenCV, C++) and thermal person detection (Raspberry Pi, 82% accuracy).
  • Developed a 3-class object detector based on VGG16, running on an NVIDIA Jetson TK1 (96% accuracy).
TensorFlow · Computer Vision · CNNs · OpenCV · Python

Academic background

🎓
Master of Engineering (MEng)
Concordia University
Jan 2019 – Dec 2020 📍 Montreal, QC, Canada
Machine Learning · Computer Vision · Cloud Computing · Deep Learning · Neural Networks

Things I've built

Production-focused. Each project was built to solve a real problem, not just to exist on GitHub.

🔍
RAG Knowledge Assistant
A production-grade Retrieval-Augmented Generation system built with LangChain, FAISS, and HuggingFace Sentence Transformers. Ingests multi-format documents (PDF, DOCX, web), chunks intelligently, embeds to a local vector store, and answers questions with source citations. Architected for extensibility — swappable LLM backends (Ollama, OpenAI, Groq).
LangChain · FAISS · HuggingFace · Ollama · FastAPI · Python
👁️
Diabetic Retinopathy Detection
Deep learning pipeline for automated diabetic retinopathy severity grading from retinal fundus images. Uses a fine-tuned EfficientNet backbone with a custom augmentation strategy to handle class imbalance in medical imaging datasets. Achieves clinician-grade accuracy on the APTOS benchmark.
PyTorch · EfficientNet · Medical Imaging · OpenCV · Python
📋
Insurance Claims Processing (GCP)
End-to-end ML pipeline for automated insurance claims classification and routing, built on Google Cloud Platform. Combines Document AI for structured extraction, BERT-based NLP for intent classification, and BigQuery ML for risk scoring. Deployed on GKE with Kubeflow orchestration and real-time monitoring.
GCP · Document AI · Vertex AI · BigQuery ML · Kubeflow · GKE

The full toolkit

Tools I reach for to build things that work in production — not just in notebooks.

AI / ML / GenAI
RAG Pipelines · LangChain · LLM Fine-tuning · HuggingFace Transformers · BERT / RoBERTa · Sentence Transformers · Agentic Workflows · Prompt Engineering · Scikit-learn · XGBoost / LightGBM
Deep Learning & Computer Vision
PyTorch · TensorFlow · CNNs · EfficientNet · ResNet · OpenCV · Image Segmentation · Object Detection
Vector Stores & Search
ChromaDB · FAISS · OpenSearch · Pinecone · Semantic Search · Hybrid Retrieval
Cloud — GCP
Vertex AI · BigQuery ML · GKE · Kubeflow · Cloud Run · Pub/Sub · Document AI · Cloud Storage
Cloud — AWS
SageMaker · EKS · EC2 · S3 · Lambda · Airflow (MWAA) · Snowflake · Kafka
MLOps & Infrastructure
Docker · Kubernetes · MLflow · GitHub Actions · Jenkins · FastAPI · Grafana · Prometheus · CI/CD for ML · Model Monitoring
Languages & Data
Python · SQL · Bash · PySpark · Pandas / NumPy · PostgreSQL · Elasticsearch

Writing about ML

Thoughts on RAG systems, production ML, and the messy reality of getting models to actually work. Published on Medium.

Let's talk

I'm actively looking for Senior ML / AI Engineer roles in Canada or the US. If you're hiring, have an interesting problem, or just want to talk shop about RAG systems and LLMs — reach out. I reply to everything.