DATA SCIENCE • FAULT TOLERANCE • ML

Data Scientist. Fault Tolerance

Research-driven ML engineer & PhD student: dependable systems for high-performance computing, fault tolerance, LLVM-based fault injection, and silent data corruption modeling. I make complex pipelines feel simple.


GITHUB

Facebook

LinkedIn

Resume

Education

PhD • Dependable Systems

The University of Florida — PhD in Computer Science

Aug 2025 – Present • Advisor: Prof. Guanpeng Li • Focus: dependable HPC, fault tolerance, GNNs, DBNs, SDC modeling, LLVM, fault injection.

Current

PhD • Dependable Systems

The University of Iowa — PhD in Computer Science

Aug 2024 – 2025 • Advisor: Prof. Guanpeng Li • Focus: dependable HPC, fault tolerance, LLVM-based fault injection & SDC modeling.

Transferred • CGPA 4.0

BSc • Distinction

LUMS — BSc Computer Science

2019 – 2023 • CGPA 3.78 • Dean’s Honor List (2019–2022). Core: ML, DL, AI, Speech, Data Mining, HCI, Security, SE, Automata, NCC.

Honors • CGPA 3.78

Skills & Abilities

Machine Learning & Data Science

PyTorch: 85%

TensorFlow: 80%

Scikit-learn: 75%

Pandas / NumPy: 85%

Development

JavaScript (ES6+): 80%

React / React Native: 85%

HTML5 / CSS3: 75%

Programming Languages

Python: 90%

C++: 80%

SQL: 75%

Systems & Tools

LLVM: 75%

Linux: 80%

Git / Docker: 75%

Experience

Graduate Research Assistant — University of Florida

Aug 2025 – Present · Part-time · Gainesville, FL

Research on dependable ML/HPC; fault tolerance and error resilience for HPC workloads.

Research Assistant — Dependable Systems Lab, University of Iowa

Jun 2025 – Aug 2025 · Full-time · Iowa City, IA

LLVM/runtime tools for instruction-level resilience analysis; fault detection in HPC workloads.

Research Assistant — Dependable Systems Lab, University of Iowa

Jan 2025 – May 2025 · Part-time · Iowa City, IA

Resilience tooling and analyses; supported experiments for dependable systems research.

Teaching Assistant — Artificial Intelligence, University of Iowa

Aug 2024 – Dec 2024

Supported 30+ students with office hours, grading, and learning outcomes for AI coursework.

Research Assistant — DebloatBench (SRI • UArizona • LUMS)

Aug 2023 – Aug 2024

A research project in collaboration with Dr. Sazzadur Rehman (Assistant Professor at the University of Arizona), Dr. Ashish Gehani (Principal Computer Scientist at SRI), and Dr. Fareed Zaffar (Assistant Professor at Lahore University of Management Sciences). The project provides a unified framework for evaluating a diverse set of container debloaters across the varied designs and execution environments they target.

Associate Consultant — AI, Systems Limited

Jun 2023 – Jul 2024

ML & LLM Solutions for Global Organizations:
• Collaborated with Microsoft to develop LLM solutions using Azure Cloud services.
• Implemented solutions with Azure Cognitive Services, Omni Channel, RAG architecture, Databricks, and DataRobot.
• Fine-tuned models using PEFT and LoRA techniques for optimal performance.
• Applied multiclass classification and clustering for tailored ML solutions, integrating the latest research advancements.

LLM Training Sessions for Professionals:
• Conducted comprehensive training on LLMs, covering Transformer architecture and RAG.
• Designed and facilitated quizzes, labs, and final projects at Systems Limited.
• Delivered practical, industry-focused courses emphasizing NLP basics and fine-tuning techniques.

Research Assistant — ISPL, LUMS

Aug 2022 – Aug 2023

Fraud Detection – The project drew on existing work in credit card fraud detection and used machine learning techniques to reduce security risks in financial transactions, leveraging large datasets provided by Clariba SEIDOR, a consultancy firm.
• Focused on refining credit card transaction security through techniques such as ensemble learning, data preprocessing, and the application of SMOTE to address class imbalance.
• Executed comprehensive literature reviews, conducted exploratory data analysis, and applied and compared classification and clustering techniques for fraud detection.
• Integrated Snowflake and Dataiku tools to harness scalable computational power for machine learning model development and assessment.
• Addressed practical challenges in credit card fraud detection, including class imbalance and memory constraints.

Data Analysis Intern — VentureDive

Jul 2022 – Oct 2022

• GeoSpatial Data Processing
  o Applied Selenium for efficient web scraping to collect National and Provincial Constituencies data in Pakistan.
  o Processed and formatted the acquired data to prepare it for subsequent analysis and visualization.
• Interactive Webpage Development
  o Developed an interactive webpage using Python, Flask, and HTML.
  o Integrated the GeoJSON file generated from the collected data, enabling users to explore and interact with the geographical information easily.
• Utilized online plotting tools to enhance the visual representation of geographical data on the webpage.
• Implemented user-friendly features, allowing users to zoom, pan, and retrieve detailed information about specific constituencies directly from the webpage.

Branch Banking Intern — Habib Bank Limited

Jun 2022 – Jul 2022

ML for employee-effectiveness metrics; campaign ops & performance analytics.

Peer Advisor — LUMS

Aug 2021 – Jun 2023

Mentored ~30 students on course/major planning and academic progress.

Teaching Assistant — Computational Problem Solving, LUMS

Aug 2021 – Dec 2022

Led tutorials for 100+ students; authored quizzes/exams; grading & office hours.

Data Science Intern — JS Bank

Jun 2021 – Aug 2021

• Executed daily SQL queries to optimize the ETL process for efficient data extraction, transformation, and loading.
• Leveraged existing customer data through SQL queries, contributing to the development of predictive analysis models.
• Translated predictive insights into actionable strategies, enhancing informed decision-making across diverse organizational departments.

Projects

LLM • Azure • RAG

Enterprise RAG & PEFT on Azure

Production-ready retrieval-augmented generation with Cognitive Services, vector search, and LoRA adapters for task-specific tuning.

Repo

Stack: Azure OpenAI, Azure Search/Blob, LangChain/Prompt-flow, PEFT/LoRA, Docker.

Chunking + hybrid retrieval (sparse+dense); guardrails for PII and prompt injection.
Tracing & eval harness (faithfulness, context recall, answer quality) with offline test sets.
Keywords: MLOps, observability, vector stores, red-teaming, governance.
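
One common way to combine sparse and dense retrieval results is reciprocal rank fusion. The sketch below is illustrative only: the project description does not say which fusion method was used, and the function name, the constant `k=60`, and the document ids are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. This is an assumed fusion rule, not
    necessarily the one used in the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword index and a vector index.
sparse_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
```

Documents that rank well in both lists rise to the top, which is why `doc_b` and `doc_a` come before the documents found by only one retriever.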

applied

NLP • PyTorch

Streaming Language Classifier

Low-latency multilingual classifier; distilled models built for real-time constraints in edge serving.

Repo

Stack: PyTorch, TorchScript, scikit-learn, FastAPI.

Tokenizer experiments (BPE vs WordPiece); class-imbalance handling; label smoothing.
Exported to TorchScript; batched inference; throughput vs accuracy tradeoffs.
Keywords: distillation, quantization, latency SLOs, A/B testing.
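
Label smoothing, mentioned above, can be illustrated without any framework. This is a minimal sketch of the usual uniform-smoothing formulation in plain Python; the function names and the default `eps=0.1` are assumptions, and the project itself presumably relied on PyTorch rather than this code.

```python
import math


def smoothed_targets(true_class, num_classes, eps=0.1):
    """Target distribution: 1 - eps + eps/K on the true class, eps/K elsewhere."""
    base = eps / num_classes
    return [base + (1.0 - eps if i == true_class else 0.0)
            for i in range(num_classes)]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def smoothed_cross_entropy(logits, true_class, eps=0.1):
    """Cross-entropy between the smoothed targets and the model's softmax."""
    targets = smoothed_targets(true_class, len(logits), eps)
    return -sum(t * lp for t, lp in zip(targets, log_softmax(logits)))
```

With `eps=0` this reduces to ordinary cross-entropy; a larger `eps` penalizes over-confident predictions.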

classification

DL • KD

Knowledge Distillation Study

CIFAR-10 compression with multi-student KD; systematic ablations over T, depth, and data aug.

Repo

Stack: PyTorch, Albumentations, WandB/Matplotlib.

Teacher-student ensembles; temperature & loss-weight sweeps; early-stopping & checkpointing.
Keywords: model compression, calibration, robustness, learning curves.
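
The temperature and loss-weight sweeps refer to the two knobs in a standard distillation objective. The sketch below shows the widely used single-teacher, single-student form (hard-label cross-entropy plus a T²-scaled KL term); it is an assumed formulation, does not cover the multi-student setup, and the names `kd_loss`, `temperature`, and `alpha` are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, true_class,
            temperature=4.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Ordinary cross-entropy on the hard label, at temperature 1.
    hard = -math.log(softmax(student_logits)[true_class])
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * kl
```

Raising `temperature` softens both distributions so the student sees more of the teacher's relative class similarities, while `alpha` trades off fitting the labels against matching the teacher.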

experiment

Compilers • PL

YAPL — Yet Another Programming Language (Interpreter)

From grammar to AST to bytecode-like executor; showcases end-to-end compiler pipeline skills.

Repo

Stack: Python, PLY (lex/yacc), unittest.

Lexer/parser, AST nodes, visitors, constant folding & dead-code elimination.
Keywords: grammars, CFG, tokens, semantic analysis, IR.
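
Constant folding over an AST can be sketched in a few lines. The node classes below (`Num`, `Var`, `BinOp`) are hypothetical stand-ins, not YAPL's actual AST, and division is left out to sidestep divide-by-zero handling.

```python
from dataclasses import dataclass
import operator


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


# Operators this sketch knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Recursively replace operator nodes whose operands are both constants."""
    if isinstance(node, BinOp):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Num) and isinstance(right, Num) and node.op in OPS:
            return Num(OPS[node.op](left.value, right.value))
        return BinOp(node.op, left, right)
    return node  # Num and Var are already as simple as they get.


# x + (2 * 3) folds to x + 6; the variable blocks any further folding.
tree = BinOp("+", Var("x"), BinOp("*", Num(2), Num(3)))
folded = fold(tree)
```

Folding bottom-up means nested constant subexpressions collapse before their parents are inspected.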

systems

CV • Flask

Celebrity Identification (Classic CV)

End-to-end CV pipeline: detection → feature engineering → classification with a lightweight web UI.

Repo

Stack: OpenCV (Haar), scikit-learn, Flask, Wavelets.

GridSearchCV for model/param selection; simple deployment artifact for demo.
Keywords: classical ML, feature extraction, evaluation, microservice.

computer vision

LLM • RAG

Knowledge-Base QA Bot

Retrieval-augmented QA with ChromaDB; adds conversation state & lightweight guardrails.

Repo

Stack: Python, ChromaDB, LangChain, FastAPI/Flask.

Chunking & embedding; answer-faithfulness evaluation; simple UI for enterprise demos.
Keywords: vector search, reranking, grounding, guardrails.
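
The chunking step that precedes embedding might look like the following. This is a sketch under assumptions: it splits on whitespace rather than model tokens, and `chunk_text`, `chunk_size`, and `overlap` are hypothetical names and defaults, not taken from the project.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk reaches the end, so no redundant tail is emitted.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Overlap keeps sentences that straddle a boundary retrievable from either neighboring chunk, at the cost of storing some text twice.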

genai

Geo • Data

Interactive GeoSpatial Web

Scraped civic data → GeoJSON → interactive maps (zoom/pan/popovers) with a small Flask backend.

Repo

Stack: Selenium, Python, GeoJSON, Flask, Leaflet/HTML.

Automated ETL & sanity checks; searchable map with constituency details.
Keywords: data engineering, scraping, spatial joins, visualization.
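
The scraped-records-to-GeoJSON step, with a basic sanity check, could be sketched as below. The record fields and sample values are hypothetical, and point geometries are used for brevity; real constituency boundaries would normally be polygons.

```python
import json


def to_feature_collection(records):
    """Convert dicts with 'lon'/'lat' keys into a GeoJSON FeatureCollection."""
    features = []
    for rec in records:
        lon, lat = float(rec["lon"]), float(rec["lat"])
        # Sanity check: reject coordinates outside the valid ranges.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"coordinates out of range for record: {rec!r}")
        properties = {k: v for k, v in rec.items() if k not in ("lon", "lat")}
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical scraped rows, for illustration only.
records = [
    {"name": "Example Constituency A", "lon": 74.35, "lat": 31.52},
    {"name": "Example Constituency B", "lon": 67.01, "lat": 24.86},
]
geojson_text = json.dumps(to_feature_collection(records), indent=2)
```

The resulting text can be served by a Flask route and loaded by a map library such as Leaflet, which reads the `properties` of each feature for popovers.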

maps

ONNX • WebGL

In-Browser MNIST Classifier

Digit prediction fully client-side via ONNX Runtime Web (WebGL → WASM fallback); ~25–40 ms inference on desktop.

Try Live

Stack: ONNX Runtime Web, Canvas API, vanilla JS.

Path-robust model loading for GitHub Pages; graceful degradation and clear UX errors.
Keywords: model export, client-side inference, WebGL, WASM, DX.

demo


Get in touch

Based in the US • Open to research collabs & ML systems work.

ahmerjamil.aj@gmail.com • +1 319-936-1014

