👋 Hi, I’m Lin Xiaoya (林小雅)

I am an undergraduate researcher at Nanyang Technological University, Singapore, specializing in Data Science and Machine Learning.

NTU President Research Scholar | NTU Science & Engineering Scholarship Recipient

CGPA: 4.89 (Highest Distinction Honour) | Dean’s List AY 23-24, 24-25 (Top 5%)

My research interests center on Adversarial Robustness, Deepfake Detection, and Image Restoration.

I am an Exchange Student at EPFL (Switzerland) for Spring 2026.

You can find my CV here: LinXiaoya’s Curriculum Vitae.

Email / GitHub / LinkedIn / WeChat

🎓 Education

Nanyang Technological University, Singapore (NTU)

Bachelor of Electrical and Electronic Engineering (Highest Distinction Honour) (Aug 2023 – May 2027)

  • CGPA: 4.89 / 5.00
  • Specialization: Data Science and Machine Learning
  • Awards:
    • NTU President Research Scholar
    • Dean’s List (AY 23-24, 24-25) - Top 5%
    • NTU Science & Engineering Undergraduate Scholarship
    • Presenter at ICUR

💼 Research & Professional Work Experience

1. GlobalFoundries | Data Scientist Intern

May 2025 – Dec 2025 | Singapore

During my 7-month internship, I combined technical innovation with rigorous research to resolve complex engineering challenges.

Trace Data ETL Pipeline

The Problem: Processing multi-month sensor traces caused memory crashes and took hours.
The Solution: Built a scalable ETL pipeline using AWS SageMaker, PySpark, and Snappy compression.

  • 📉 90% Reduction in storage file size.
  • 97% Faster data extraction time.
  • 🛠 Tech: PySpark, AWS S3/Boto3, Parquet + Snappy, ETL Optimization.

"The solution developed addressed scalability issues... resulted in 90% reduction in file size."
Trace Pipeline Demo

Trace-to-Image Fault Detection (GAF-CNN)

The Problem: Existing methods were not suitable for classifying unlabeled equipment parameter traces.
The Solution: Engineered a novel ensemble method combining Guard-Band and One-Class SVM (OC-SVM), using Gramian Angular Fields (GAF) to turn traces into images.

  • 🎯 95% Accuracy (Outperforming Guard-Band by 14%).
  • 🛡 Valid for unlabeled anomalies (Zero False Alarms).
  • 🛠 Tech: Python (PyTorch), Gramian Angular Fields (GAF), CNN, One-Class SVM.

"Her final model achieved a higher accuracy than traditional Guard-Band method by 14%."
Fault Detection Visualization
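
The trace-to-image step can be illustrated with a minimal sketch of a Gramian Angular Field. This is not the internship code: it assumes min-max rescaling to [-1, 1] and the standard summation/difference definitions, and the function name `gramian_angular_field` is hypothetical.

```python
import numpy as np


def gramian_angular_field(trace, method="summation"):
    """Encode a 1-D trace as a 2-D Gramian Angular Field image.

    Assumption: the trace is min-max rescaled to [-1, 1] before the
    polar encoding; the project's actual preprocessing is not stated.
    """
    x = np.asarray(trace, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant trace has no range to rescale; map it to 0.
        scaled = np.zeros_like(x)
    else:
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    # Guard against floating-point drift outside arccos's domain.
    scaled = np.clip(scaled, -1.0, 1.0)
    phi = np.arccos(scaled)  # angular (polar) encoding of each sample

    if method == "summation":
        # GASF: cos(phi_i + phi_j)
        return np.cos(phi[:, None] + phi[None, :])
    if method == "difference":
        # GADF: sin(phi_i - phi_j)
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("method must be 'summation' or 'difference'")


# Hypothetical sensor trace with 4 samples -> 4x4 image
image = gramian_angular_field([0.0, 1.0, 2.0, 3.0])
```

The resulting square matrix can be treated as a single-channel image, which is what makes image models or a one-class classifier on image features applicable to unlabeled traces.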

🏆 Internship Endorsement - Recommendation Letter

Supervisor: Mr. Khoo Yong (Member of Technical Staff, GlobalFoundries)

"I was able to observe first-hand her ability to combine technical innovation with rigorous research capability... I am confident she will excel in her PhD program and in future Data Science/ Engineering roles."
Recommendation Letter Preview
Click to read full letter


2. OCBC Bank | Data Analyst Intern (Project Data Analysis)

Jan 2026 – Present | Singapore

  • Led a 4-person multidisciplinary team to build an end-to-end automation system using Power Automate and Python, targeting the elimination of manual data extraction for the Talent Acquisition team.
  • Developed a Power BI dashboard to track “Active Requisitions” and “YTD Performance,” enabling real-time monitoring of recruiter-specific closure rates and vacancy trends.
  • Established a standardized information architecture to categorize diverse hiring data, ensuring high data integrity for executive decision-making.


3. Shenzhen InnoX Academy | Market Data Analyst Intern

Dec 2025 – Jan 2026 | Hybrid

Project: "Interactive Clothes" GTM Strategy

The Problem: The product lacked market fit, oscillating between saturated "fitness" and "gaming" markets with high CAC.
The Solution: Pivoted to "Somatic Wellness" via Python sentiment analysis and engineered the digital launch infrastructure.

  • 🚀 500+ Leads captured in 48h via custom Landing Page.
  • 📊 Analytics: Mined competitor reviews to identify "Subscription Fatigue" as a key pain point.
  • 🛠 Tech: Python (NLP), Tableau, HTML5/Tailwind, GTM Strategy.



4. A*STAR | Healthcare Data Research Intern

Jan 2025 – Apr 2025 | Singapore

Project: Anonymization Workflows for Confidential Medical Datasets

  • Co-led a team of 5 in a collaboration with SG National Healthcare Group (NHG) to anonymize facial healthcare datasets for AI validation.
  • Addressed the trade-off between data utility and patient privacy, a core challenge in observational healthcare data.
  • Performed feature preservation analysis through rPPG-tool to optimize preprocessing for downstream AI performance.


5. Classbro (Shanghai DAOBI EdTech) | Data Science Instructor

Jun 2024 – Dec 2024 | Hybrid

  • Developed and delivered bilingual lectures on core ML algorithms and SQL.
  • Designed course content that contributed to SGD 10,000+ in revenue growth.


🔬 Academic Projects

1. AlzCare: AI-Powered Dementia Care Platform

The Problem: Dementia patients face risks of wandering, missed medication, and delayed health intervention.
The Solution: Developed an all-in-one web platform integrating Geofencing, AI Health Reporting, and Automated Reminders.

  • 📍 Geofencing: Real-time wandering alerts using Google Maps API & WebSockets
  • 🧠 Health Analytics: AI-driven insights for early intervention
  • 📅 Smart Scheduling: Auto-syncs with Google Calendar for medication reminders

Tech Stack: React.js · Node.js · Google Maps API
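
A wandering alert of the kind described above reduces to a distance check against a safe zone. The sketch below is illustrative only: it assumes a circular geofence and the haversine formula, and the names `haversine_m` and `is_outside_geofence`, the coordinates, and the radius are hypothetical. The platform's actual logic, built on the Google Maps API and WebSockets, is not shown in this page.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) just above 1.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def is_outside_geofence(position, center, radius_m):
    """Return True when `position` lies outside a circular fence around `center`."""
    return haversine_m(*position, *center) > radius_m


# Hypothetical safe zone: 500 m around a home location
home = (1.3483, 103.6831)
alert = is_outside_geofence((1.3583, 103.6831), home, radius_m=500)
```

In a real-time setting, each incoming location update would be passed through a check like this, and an alert pushed to caregivers only when the result flips to outside.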

Presented at the International Conference of Undergraduate Research (ICUR) 2025

AlzCare Demo

▶ Click to Watch Demo

ICUR Poster

📄 View Poster

ICUR Certificate

🏆 View Certificate

Course Project Lead

  • Led a 4-member team to analyze booking behavior and predict cancellations.
  • Tuned the model to 91% accuracy using TensorFlow with L2 regularization and Batch Normalization.
  • Delivered visual storytelling insights to 50+ course instructors and peers.


📜 Certifications

  • Six Sigma | Lean Six Sigma Yellow Belt: Process Improvement · DMAIC · Root Cause Analysis · Waste Reduction
  • Stanford | Machine Learning: Supervised/Unsupervised Learning · Neural Networks · Regularization · Bias-Variance Tradeoff
  • Duke | Cloud Computing: Cloud Architecture · Virtualization · SaaS vs IaaS · Docker · CI/CD Pipeline
  • Bloomberg | Bloomberg Market Concepts: Economic Indicators · Currencies · Fixed Income · Equities · Terminal Functions
  • Macquarie | Excel Analytics: Advanced Charting · Pivot Tables · Interactive Dashboards · Data Storytelling
  • Colorado | Statistics (R): Hypothesis Testing · Regression Analysis · ANOVA · Data Wrangling · ggplot2


🛠 Tech Stack & Tools

Languages: Python · R · SQL

Machine Learning & Vision: PyTorch · TensorFlow · Scikit-Learn

Big Data & Cloud: PySpark · AWS


🌱 Community & Leadership

  • NTU Institution of Engineering and Technology (IET): Liaison Manager (Jul 2024 – Present)
    • Coordinated company visits to A*STAR, Nestlé, SBS Transit, and GlobalFoundries.
  • NTU Investment Interactive Club (IIC): Subcommittee (Jul 2024 – Present)
    • Organized investment conferences with 200+ participants.
  • CFLS Model United Nations Club: Academic Director (Dec 2019 – Dec 2021)
    • Chaired the North-East MUN Conference (1,000+ attendees).
    • Trained 100+ students in resolution drafting, diplomacy, and public speaking.