Che Zhu

Data & Cloud Engineer

Hi, I'm Che Zhu.

I design production AWS data pipelines, build end-to-end automated workflows in Python and SQL, and integrate LLM capabilities into data products.

12+ production pipelines
2M records processed daily
99.5% uptime SLA
75% reduction in processing time

About

Beyond the data.

I'm a Data & Cloud Engineer at Sun Life Global Investments, where I design and operate production AWS data pipelines processing ~2M records daily across 6 asset classes. I build end-to-end automated workflows, integrate LLM capabilities into data products, and deliver analytical insights through Tableau and Power BI.

My journey started in Environmental Physics at the University of Toronto — studying the patterns of our natural world. That curiosity for uncovering hidden structures led me to a Master's in Human-Centered Data Science, bridging the gap between complex models and real human impact.

Outside of work, I find calm in nature and thrill in discovery. Whether it's hiking trails around Ontario, experimenting with new recipes, or exploring the latest in generative AI — the best ideas come when you step away from the screen.

Cloud Engineering
AWS Lambda, Step Functions, Glue, S3, RDS, EventBridge — production at scale
Automation & ETL
End-to-end pipelines that run reliably at scale
AI Solutions
From LLMs to predictive models — practical AI for business
Data Analytics
Turning millions of records into actionable insight

Life Outside Work

Built different.

When I'm not engineering data pipelines, you'll find me engineering something else entirely.

Car Enthusiast

Car enthusiast at heart.

There's something about the precision of a well-tuned machine that resonates with me — both on the road and in code. I'm passionate about the automotive world, from the engineering under the hood to the design language on the surface. It's the same attention to detail I bring to every data pipeline and model I build.

Weekends are for scenic drives, wrenching, and the occasional track day. The philosophy carries over: build it right, make it fast, keep it clean.

Skills

Technologies I work with.

Data & Analytics
Python, R, SQL, SAS, Pandas, NumPy, Scikit-learn, Power BI, Tableau, Excel / VBA
Cloud & Infrastructure
AWS (Lambda, Step Functions, Glue, S3, RDS, EventBridge), Terraform, Docker, Snowflake, Snowpark, Databricks, ETL / ELT, CI/CD, Git
Machine Learning & AI
LLM Integration, RAG Pipelines, Prompt Engineering, TensorFlow, PyTorch, XGBoost, LightGBM, NLP, Classification
Visualization & Databases
Tableau, Power BI (DAX), Matplotlib, Plotly, PostgreSQL, MySQL, DB2, Snowflake SQL

Experience

Where I've made impact.

Data Analyst, Automated Agile Squad (Current)
Jul 2025 — Present
Sun Life Global Investments — Data Analytics
  • Design and maintain 12+ production AWS data pipelines (Lambda, Step Functions, Glue, EventBridge) processing ~2M records daily across 6 asset classes, achieving 99.5% uptime SLA.
  • Built end-to-end automated workflows that reduced manual data processing time by 75% (from ~4 hours to <1 hour per cycle), orchestrating S3 ingestion, Glue ETL, and RDS persistence with SNS/SQS alerting.
  • Developed 20+ Python serverless functions (boto3) to extract, validate, and reconcile investment data feeds from 8 external vendors; cut data reconciliation discrepancies by 40%.
  • Created 5 Tableau dashboards and 15+ SQL-driven reports used by 30+ portfolio managers and operations staff, reducing ad-hoc data requests by 60%.
AWS, Lambda, Step Functions, Glue, Python, Tableau, SQL, boto3
Data Analyst (Co-op)
May 2023 — Dec 2023
Royal Bank of Canada — Insurance
  • Built a predictive retention model for Future Income Options using LightGBM and XGBoost, improving accuracy by 7% to 91.6% and contributing to a projected $1.13M revenue uplift.
  • Engineered a proprietary Customer Loyalty metric from derived features (active accounts, product ownership, attrition rate), improving the model's predictive power beyond what traditional demographic inputs provided.
  • Analyzed 8M+ Term-100 life insurance records spanning 40 years with DB2 SQL and Python; segmented customers by demographic and risk factors, then deployed a propensity model via CI/CD pipeline.
  • Delivered weekly analytical reports and Power BI dashboards to senior management, translating model outputs into actionable business strategy recommendations.
Python, DB2 SQL, XGBoost, LightGBM, Power BI, CI/CD
Economic Data Research Assistant
Sep 2021 — Apr 2022
University of Toronto
  • Collaborated on education policy research, mining data from government reports and institutional assessments using Python web scraping and Excel.
  • Delivered structured datasets and analysis contributing to published research on Ontario education system disparities.
Python, Web Scraping, Excel, Data Mining

Projects

Selected work.

Auto Accounting Tool
End-to-end automated bookkeeping system integrating Plaid API with Beancount, deployed on Oracle Cloud with Cloudflare Tunnel. Features a Flask PWA and iOS widget for real-time transaction tracking.
Ongoing — 2024 to present
Python, Flask, Plaid API, Oracle Cloud, PWA
Daily Briefing Assistant
Scheduled LLM agent that aggregates news, calendar events, and personal data sources to generate personalized daily briefings with actionable insights.
LLM-powered — 2025 to present
LLM, Python, RAG, Automation
Retention Forecasting Model
Predictive retention model for insurance at RBC using LightGBM and XGBoost. Engineered a proprietary Customer Loyalty metric from derived features.
91.6% accuracy — $1.13M projected revenue uplift
Python, XGBoost, LightGBM, DB2 SQL
Fraud Detection — Hackathon Finalist
Finalist entry at 2023 UofT Faculty Hackathon. Built a fraud detection pipeline for Service Canada Family Benefit data using ensemble ML methods.
2023 Faculty Hackathon Finalist
ML Classification, Python, Data Science

Education

Academic foundation.

Master of Information
University of Toronto
Graduated Jun 2024
  • Concentration in Human-Centered Data Science
  • cGPA: 3.91 / 4.0
  • Coursework: Programming for Data Science, Data Modeling, ML, MLOps, Practical AI Dev
  • Hackathon Finalist — Fraud Detection
Honours Bachelor of Science
University of Toronto
Graduated Jun 2022
  • Specialist in Environmental Physics
  • Dean's List (2019 — 2022)
  • Coursework: Software Design, Statistics, Linear Algebra, Microeconomics
IBM Data Science Professional (2025)
Databricks Generative AI (2024)
PwC Problem Solving with Excel (2025)
SAS Programming (2022)

Contact

Let's work together.

Open to opportunities in data engineering, cloud architecture, and AI.