- Orchestrated batch ingestion and cross-region S3 loads with Airflow and PySpark (sketch 1 below)
- Modeled datasets end-to-end in dbt with partitioning/clustering and Hudi time-travel comparisons (sketch 2 below)
- Migrated scan-heavy BI tables from Hudi to Iceberg using dbt, Airflow, Glue, and Trino, validating parity via QA reports (sketch 3 below)
- Built Metabase dashboards from Trino/S3 logs to monitor query hotspots, S3 scan volume, storage, and cost (sketch 4 below)
- Developed an internal S3-based Salesforce backup with JupyterHub access for the Salesforce team, replacing a third-party tool and delivering recurring annual cost savings (sketch 5 below)
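
Sketch 1: a minimal Airflow DAG submitting a PySpark job for a cross-region S3 load. The DAG id, script path, connection id, and bucket names are hypothetical placeholders; assumes the `apache-airflow-providers-apache-spark` package and Airflow 2.4+ (for the `schedule` argument).

```python
# Sketch of orchestrating a nightly PySpark batch ingestion job from Airflow.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="batch_ingestion_cross_region",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    SparkSubmitOperator(
        task_id="cross_region_s3_load",
        application="jobs/cross_region_load.py",  # hypothetical PySpark script
        conn_id="spark_default",
        application_args=[
            "--source", "s3a://raw-us-east-1/events/",   # hypothetical source bucket
            "--dest", "s3a://lake-eu-west-1/ingested/",  # hypothetical target bucket
        ],
    )
```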
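
Sketch 2: the partitioning/clustering config itself lives in dbt's SQL/Jinja model files, so this sketch covers only the Hudi time-travel comparison side, in PySpark. It assumes a Spark session with the Hudi bundle on the classpath; the table path and commit instant are hypothetical.

```python
# Sketch of comparing a Hudi table's latest snapshot against a past commit.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hudi_time_travel_check").getOrCreate()

table_path = "s3a://lake-bucket/warehouse/fact_orders/"  # hypothetical Hudi table

# Latest snapshot of the table
current_df = spark.read.format("hudi").load(table_path)

# Same table as of a past commit instant (yyyyMMddHHmmss)
past_df = (
    spark.read.format("hudi")
    .option("as.of.instant", "20240601000000")  # hypothetical instant
    .load(table_path)
)

# Simple row-count drift check between the two snapshots
print("current rows:", current_df.count())
print("rows as of instant:", past_df.count())
```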
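
Sketch 3: a minimal row-count parity check run through Trino after a Hudi-to-Iceberg migration, using the `trino` Python client. The coordinator host, catalogs, and table names are hypothetical; a fuller QA report would add column-level checksums.

```python
# Sketch of a post-migration QA check: do the old and new tables agree?
import trino

conn = trino.dbapi.connect(
    host="trino.internal.example.com",  # hypothetical coordinator
    port=8080,
    user="qa-bot",
)
cur = conn.cursor()

checks = {
    "hudi": "SELECT count(*) FROM hudi.bi.orders",        # hypothetical source table
    "iceberg": "SELECT count(*) FROM iceberg.bi.orders",  # hypothetical target table
}

counts = {}
for name, sql in checks.items():
    cur.execute(sql)
    counts[name] = cur.fetchone()[0]

assert counts["hudi"] == counts["iceberg"], f"row count mismatch: {counts}"
print("row counts match:", counts)
```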
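
Sketch 4: the kind of aggregation a Metabase dashboard tile could sit on, bytes scanned per table over the past week from Trino query-event logs. The `logs.trino.query_events` table and its columns are hypothetical stand-ins for however the logs are landed in S3.

```python
# Sketch of a scan-hotspot query over Trino query-event logs.
import trino

conn = trino.dbapi.connect(host="trino.internal.example.com", port=8080, user="metrics")
cur = conn.cursor()

cur.execute(
    """
    SELECT table_name,
           count(*)                  AS query_count,
           sum(physical_input_bytes) AS bytes_scanned
    FROM logs.trino.query_events          -- hypothetical event-log table
    WHERE event_date >= date_add('day', -7, current_date)
    GROUP BY table_name
    ORDER BY bytes_scanned DESC
    LIMIT 20
    """
)
for table_name, query_count, bytes_scanned in cur.fetchall():
    print(f"{table_name}: {query_count} queries, {bytes_scanned / 1e9:.1f} GB scanned")
```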
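
Sketch 5: a minimal Salesforce-to-S3 backup loop, assuming the `simple_salesforce` and `boto3` libraries. Credentials, the bucket, and the object list are hypothetical placeholders; the JupyterHub side would simply read these JSON dumps back from S3.

```python
# Sketch of backing up Salesforce objects to dated JSON files in S3.
import json
from datetime import date

import boto3
from simple_salesforce import Salesforce

sf = Salesforce(
    username="backup@example.com",  # hypothetical credentials
    password="***",
    security_token="***",
)
s3 = boto3.client("s3")
bucket = "internal-sfdc-backups"  # hypothetical bucket

for obj in ["Account", "Contact", "Opportunity"]:  # hypothetical object list
    # query_all pages through all matching records
    records = sf.query_all(f"SELECT Id, Name FROM {obj}")["records"]
    key = f"salesforce/{date.today():%Y/%m/%d}/{obj}.json"
    s3.put_object(Bucket=bucket, Key=key, Body=json.dumps(records, default=str))
    print(f"backed up {len(records)} {obj} records to s3://{bucket}/{key}")
```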