Available for New Projects

Matt Derya

Data Scientist combining 15+ years of pharmaceutical experience with GenAI and LLM engineering, building production AI systems that accelerate drug development and inform commercial decisions.

LangChain & RAG  |  Agentic AI  |  CNN / LSTM  |  FDA Regulatory  |  Pharmacokinetics  |  AWS SageMaker  |  IQVIA Analytics
15+ Years in Pharma
6+ Years AI/ML
99% Model Accuracy
90% Cost Savings

Featured Projects

Production-grade AI/ML projects spanning pharma R&D, GenAI engineering, and commercial analytics.

🔬 GitHub

Clinical Trial Cell Classification (CNN)

ResNet/EfficientNet models for Phase II clinical trial cellular analysis. Achieved 99% accuracy, reducing manual analysis from months to minutes.

PyTorch  |  CNN  |  AWS  |  Clinical Trials
View on GitHub →
🤖 GitHub

HIPAA-Compliant LLM Agent (RAG)

LangChain/LangGraph agentic AI system for enterprise pharma data querying. Multi-step reasoning over clinical databases with full regulatory compliance.

LangChain  |  RAG  |  LangGraph  |  HIPAA
View on GitHub →
📊 Kaggle

Pharma Sales Forecasting (LSTM)

LSTM time series models for pharmaceutical sales forecasting and customer churn prediction, incorporating PK/ADME domain features. R² above 0.90.

TensorFlow  |  LSTM  |  Pharmacokinetics  |  Time Series
View on Kaggle →
📊 Kaggle

Drug Interaction Prediction (XGBoost)

ML model using drug-drug interaction and ADME features for interaction risk prediction. Domain-enriched feature engineering delivering 25–30% accuracy gains.

XGBoost  |  Scikit-learn  |  ADME  |  DDI
View on Kaggle →
🔬 GitHub

Adverse Event NLP Pipeline

Transformer-based NLP pipeline extracting structured clinical signals from 10,000+ adverse event reports and scientific literature for Oncology/Immunology programs.

Hugging Face  |  Transformers  |  NLP  |  Oncology
View on GitHub →
🤖 GitHub

Demand Forecasting & CLV Models

LSTM, XGBoost & Ensemble forecasting for inventory optimization; RFM/Cohort-based Customer Lifetime Value models with A/B testing and Sentiment Analysis.

LSTM  |  XGBoost  |  CLV  |  Ensemble
View on GitHub →

Where Drug Science Meets Artificial Intelligence

I started my career as a clinical pharmacist, spending years understanding how drugs behave in the body — pharmacokinetics, drug interactions, ADME. That domain knowledge turned out to be my greatest ML asset.

Today I build production-grade AI systems for pharmaceutical R&D and commercial operations: LLM agents that query clinical databases, CNN models that analyze cell images for Phase II trials, and RAG pipelines that extract insights from regulatory documents.

The result? AI that actually understands the science — not just the data.

🤖 LLM/RAG for Regulatory

Built HIPAA-compliant LLM agents (LangChain/RAG) for regulatory use cases including automated adverse event report drafting. Deployed to 50+ users across regulatory, medical affairs, and market access teams with 40%+ productivity gains.

⚖️ FDA Compliance & AI Governance

Managed the end-to-end ML lifecycle, ensuring compliance with FDA 21 CFR Part 11 and European Pharmacopoeia standards. Implemented human-in-the-loop oversight for high-risk AI use cases that affect regulatory decisions.

📋 Regulatory Submissions Experience

Contributed to 10+ European regulatory submissions at Octapharma for Human Albumin, Factor products, and vaccines. Prepared stability, bioequivalence, and CMC documentation; all submissions were approved.

🧬 Oncology/Immunology Therapeutic Focus

Supported Oncology drug development at Mentor R&D with AI/ML initiatives. Experience across Oncology, Immunology, Cardiovascular, and Diabetes therapeutic areas.

📊 Unstructured Regulatory Data & NLP

Developed NLP solutions (Transformers/Hugging Face) to extract insights from 10,000+ unstructured documents including PubMed articles, FDA reports, and regulatory correspondence. Built automated alert systems for regulatory and medical affairs teams.

🔍 Auditability & Data Governance

Ensured every AI-generated output carries a complete audit trail back to its source documents. Implemented metadata tagging and text vectorization for regulatory data traceability.

Full-Stack AI/ML Expertise

From data pipelines to production deployment — with a pharma domain layer no generic data scientist can replicate.

🤖 GenAI & LLMs

LangChain  |  LangGraph  |  RAG  |  Agentic AI  |  OpenAI  |  Claude  |  Hugging Face
🔬 Deep Learning

TensorFlow  |  PyTorch  |  Keras  |  CNN  |  LSTM  |  ResNet  |  Transfer Learning
📈 ML & Analytics

XGBoost  |  Ensemble  |  Scikit-learn  |  NLP  |  Computer Vision  |  Time Series
💊 Pharma Domain

Pharmacokinetics  |  ADME  |  DDI  |  FDA Regulatory  |  Oncology  |  Immunology  |  IQVIA
☁️ Engineering & Cloud

AWS SageMaker  |  EC2  |  CI/CD  |  Python  |  SQL  |  Git  |  Streamlit
📊 BI & Visualization

Power BI  |  Tableau  |  Matplotlib  |  Seaborn  |  Pandas  |  NumPy

Professional Journey

15+ years of industry experience across pharmaceutical R&D, data science, and AI engineering.

Jan 2025 – Present
IMEBRANDS
Illinois, USA
Data Scientist
  • Built RAG-based chatbots and Agentic AI assistants on company SQL databases for market trend analysis and customer insights generation.
  • Engineered LSTM, XGBoost & Ensemble time series forecasting models for revenue and demand optimization, significantly reducing excess inventory costs.
  • Designed Recommendation Systems, CLV models, and Computer Vision pipelines; built interactive dashboards and forecasting reports for executive decision-making.
Aug 2019 – Jan 2025
Mentor R&D
Germantown, MD
Senior Data Scientist
  • Built CNN cellular classification models (ResNet, EfficientNet, VGG) for Phase II clinical trials, reaching 99% accuracy and 90% cost savings.
  • Developed HIPAA-compliant LLM agents (LangChain/LangGraph/RAG) for enterprise Agentic AI workflows, improving productivity by 50%.
  • Engineered pharma-domain ML features (DDI, ADME, PK) with Scikit-learn/XGBoost, yielding 25–30% performance gains.
  • Built NLP/Transformer pipelines for adverse event analysis across 10,000+ documents for Oncology/Immunology programs.
  • Deployed models to production on AWS (SageMaker, EC2) via CI/CD and built Power BI/Tableau dashboards with real-time IQVIA insights, increasing stakeholder engagement by 40%.
Nov 2015 – Jul 2019
SG Health
Trenton, NJ
Data Analyst & Owner
  • Founded and operated an independent pharmacy; built inventory forecasting models and sales dashboards that reduced waste by 40%.
  • Analyzed prescription trends and patient demographics to guide data-informed business decisions.
May 2008 – Nov 2015
Octapharma
Ankara, Turkey
Data Analyst & Product Manager
  • Led competitive analysis and pricing optimization for Human Albumin, achieving 75% market share across Turkey and EU markets.
  • Contributed to 10+ European regulatory submissions for Human Albumin, Factor products, and vaccines, all of which were approved.
  • Built ETL pipelines (Python, Pandas, NumPy) for EU pharmaceutical pricing data ensuring regulatory compliance.

Academic Background

2023

Post Graduate Program in AI & ML

California Institute of Technology (Caltech)
2017

M.S. Clinical Pharmacy & Pharmacotherapy

Marmara University, Istanbul
2006

B.S. Pharmacy

Hacettepe University, Ankara

Let's Work Together

Open to pharma/biotech and tech opportunities. Let's talk about how AI can accelerate your drug development or commercial analytics.

📍 Princeton, NJ  |  +1 929 840 4971  |  US Citizen