Data Engineer · ML · Analytics

Richard Antoine

Writing about machine learning engineering, data systems, and whatever else I find interesting. 3+ years building production data pipelines at P&G. M.Sc. Applied Mathematics & Statistics.

3+ years building enterprise data systems
$350M+ business value enabled via pipelines
11 production ML models improved

Featured writing

All articles →

Deploying an LLM to SageMaker: What the Docs Don't Tell You

Instance types, memory limits, cold start behavior — the gaps in AWS docs that cost me hours.

Serverless Inference with Lambda: Wiring It to SageMaker

How to connect a Lambda function to a SageMaker endpoint with least-privilege IAM and actually handle timeouts.

API Gateway: Turning Your Lambda Into a Public Endpoint

CORS, throttling, and API keys — finishing the serverless MLOps pipeline so the outside world can hit your model.

Coming soon

More articles on Databricks, Delta Lake, medallion architecture, and production pipeline design.

Coming soon

Deep dives into CareerPulse and other personal projects — architecture decisions, failures, and what I learned.

Coming soon

Essays on whatever I find interesting — outside the data world.

Notable projects

CareerPulse

End-to-end medallion lakehouse pipeline that ingests live job-posting data via REST APIs with incremental loading and feeds a downstream XGBoost forecasting model tracked in MLflow.

LLM Inference API

Deployed a 738M-parameter model to SageMaker and built a fully serverless inference pipeline via Lambda and API Gateway.

P&G Data Platform Migration

Led end-to-end migration of legacy enterprise pipelines to Databricks with Delta Lake, PySpark, and ACID-compliant distributed processing at scale.

Building things at the intersection of data & ML.

I'm a Data Engineer with a background in Statistics who spent 3+ years at Procter & Gamble building production-grade data infrastructure. I write about what I'm building and learning — from MLOps and lakehouse architecture to whatever rabbit hole I fall into.

Currently seeking remote ML Engineering, Data Engineering, and Analytics Engineering roles and freelance projects.

Python · PySpark · SQL · Databricks · Delta Lake · AWS · Azure · MLflow · scikit-learn · HuggingFace · XGBoost · LangChain