I process 2+ TB of data daily at Walmart, building the data infrastructure behind enterprise-scale retail analytics. From architecting cloud data lakes on Azure to optimizing PySpark jobs that cut runtime by 35%, I turn raw data into reliable, production-grade systems that 50+ stakeholders depend on every day.
// tech_stack
Azure (ADF, Databricks, Synapse, ADLS Gen2), Snowflake, AWS (S3, Redshift), BigQuery
PySpark, Kafka, Event Hubs, Hive, HDFS
Python, SQL, Spark SQL, Airflow, dbt, Docker, Git, CI/CD, Terraform
ETL/ELT, Kimball Star Schema, SCD Type 1/Type 2, Lakehouse Architecture, Delta Lake
Power BI, Tableau, Looker, Streamlit
// experience
Walmart — USA
Walmart — USA
Maxgen Technologies — India
// projects
Production-ready Kimball star schema warehouse with dbt and Snowflake. 35+ automated tests, SCD Type 2 snapshots, CI/CD pipeline.
Streaming and batch pipeline that ingests retail events via Kafka, processes them with PySpark, and loads them into Snowflake using a medallion architecture.
End-to-end Azure Lakehouse with ADF ingestion, Databricks transformation, ADLS Gen2 storage, and Synapse analytics layer.
Automated pipeline pulling CMS Medicare and FDA API data into Snowflake with dbt transformations and quality monitoring.
YAML-configurable monitoring framework with monitors for freshness, volume, schema changes, and anomalies, plus Slack alerts.
// education
University of Massachusetts Dartmouth — May 2025
// get_in_touch
I'm actively looking for Data Engineer opportunities. Whether you have an opening, a project idea, or just want to talk data — I'd love to hear from you.
saipcharan2023@gmail.com