- Built a real-time CDC pipeline from MySQL to Azure using Debezium, Kafka, Spark Structured Streaming, and Airflow
- Streamed MySQL binlog changes to Kafka, transformed them with Spark, and wrote the results to ADLS Gen2 and Azure SQL Data Warehouse
- Orchestrated the pipeline with Airflow and containerized all services with Docker Compose
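
To illustrate the transform step, here is a minimal sketch of flattening one Debezium change event into a row, assuming Debezium's standard JSON envelope (`before`, `after`, `op`, `ts_ms`, optionally wrapped in `payload`). It is plain Python for readability, not the Spark job itself; the function name and the `_op` / `_ts_ms` column names are hypothetical.

```python
import json
from typing import Any, Dict, Optional

# Debezium operation codes: c = create, u = update, d = delete, r = snapshot read
OP_NAMES = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def flatten_change_event(raw: Optional[str]) -> Optional[Dict[str, Any]]:
    """Turn one Debezium JSON message value into a flat row (hypothetical helper)."""
    if raw is None:
        # A null value is a tombstone record emitted after a delete; skip it.
        return None
    event = json.loads(raw)
    # When the JSON converter embeds schemas, the envelope sits under "payload".
    payload = event.get("payload", event)
    op = payload["op"]
    # Deletes carry the last row state in "before"; other ops carry it in "after".
    row = payload["before"] if op == "d" else payload["after"]
    return {**(row or {}), "_op": OP_NAMES.get(op, op), "_ts_ms": payload["ts_ms"]}


# Made-up sample event for a hypothetical "customers" table.
sample = json.dumps({
    "payload": {
        "before": None,
        "after": {"id": 1, "email": "a@example.com"},
        "source": {"table": "customers"},
        "op": "c",
        "ts_ms": 1700000000000,
    }
})
print(flatten_change_event(sample))
```

In a streaming job the same field selection would typically be applied to the Kafka message value column, with the operation type kept so that downstream writes can distinguish inserts, updates, and deletes.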