Shardul Chavan

AWS Certified Data Engineer Associate

Data Engineering | Cloud Architecture | Machine Learning | ETL Pipelines

Professional Experience

Crewasis
Data Scientist Intern
Jan 2025 - Jun 2025 | Remote

Designed and deployed scalable ETL pipelines processing 2M+ unstructured text records from social media and news feeds, enabling downstream NLP, sentiment analysis, and actionable product-market insights

Built EC2-hosted machine learning pipeline orchestrated with AWS Lambda, automating Latent Semantic Indexing (LSI) training and reducing manual retraining cycles by 70%

Used Terraform scripts to provision repeatable AWS infrastructure, ensuring high availability and reproducibility

Developed interactive Tableau dashboards combining NLP insights with CRM segmentation, empowering marketing teams to make data-driven decisions and increasing campaign conversions by 15%

Skyworks Solutions Inc.
Data Analytics Engineer Co-op
Jan 2024 - Jun 2024 | Boston, MA

Engineered 25+ Airflow and dbt pipelines across Azure SQL and data lakes, reducing data refresh latency by 60% and boosting accessibility for analytics teams

Standardized RF testing data into analytics-ready formats, saving 5+ engineering hours weekly and accelerating high-volume RF module validation

Containerized data services with Docker and set up Prometheus monitoring to track performance and failures, reducing downtime by 45%

Ensured data governance by integrating validation with Great Expectations and enforcing RBAC and row-level security policies

Designed scalable dimensional data models and built interactive Power BI dashboards for 3 business units

Northeastern University
BI Analyst & AI Researcher
Sep 2023 - Dec 2024 | Boston, MA

Developed scalable PySpark pipelines on Hadoop to process 1.2M+ insurance records, enabling efficient feature engineering for risk modeling and improving predictive accuracy by 20%

Mentored 50+ graduate students in building applied AI/BI applications, integrating LLMs (OpenAI, Anthropic) into enterprise analytics use cases

Built SnowGPT, a LangChain-powered tool on Snowflake, enabling natural-language-to-SQL translation and driving a 40% increase in self-service analytics adoption

Deployed containerized ML models with Docker and GitHub Actions CI/CD, reducing manual deployment time by 60%

Accion Labs
Data Analytics Engineer
Jan 2022 - Jul 2022 | Mumbai, India

Engineered REST-based integrations between LLM APIs and ServiceNow to enhance conversational AI for support bots, reducing ticket escalations by 40%

Built parameterized SQL automation scripts to streamline reporting cycles, and partnered with QA and product teams to validate data accuracy against KPIs and SLA thresholds

Featured Projects

NewsSphere

Personalized News Platform

Engineered FastAPI-based services with optimized RAG indexing, reducing content retrieval times by 35% and driving higher user engagement.

Deployed containerized Streamlit app with Docker and Kubernetes, delivering refreshed personalized news reliably every 8 minutes.

Google Cloud | Docker | Pinecone | FastAPI | Azure SQL

YouTube Data Analytics

AWS-Based Analytics Platform

Built serverless ELT pipeline using AWS Lambda, Glue, and Redshift to analyze 80K+ user interactions in real time.

Automated deployment with Terraform IaC and scaled workloads via Kubernetes (EKS), improving pipeline resilience.

AWS Lambda | Redshift | Terraform | Kubernetes | QuickSight

Retail Sales Pipeline

Large-Scale Data Processing

Architected Spark pipeline processing 1M+ retail transactions, enabling trend analysis and accurate forecasting.

Implemented Grafana dashboards and OpenTelemetry tracing for proactive error detection and pipeline observability.

Apache Spark | Databricks | Azure | Grafana | OpenTelemetry

Iowa Sales Intelligence

Enterprise BI Solution

Engineered ETL workflows processing 25M+ sales records using Azure Data Factory and Alteryx.

Built Kimball-style dimensional models that boosted ingestion speed by 60%, and developed Power BI dashboards for improved forecasting.

Azure Data Factory | Alteryx | Power BI | Talend

Technical Skills

Programming & Scripting

Python | SQL | Scala | Bash | Java | C++

Data Engineering

Airflow | dbt | PySpark | Databricks | AWS Glue | Azure Data Factory

Cloud & Infrastructure

AWS | Azure | GCP | Kubernetes | Terraform | Docker

Data Platforms

Snowflake | Redshift | PostgreSQL | MongoDB | BigQuery

BI & Visualization

Tableau | Power BI | QuickSight | Streamlit | Superset

DevOps & Monitoring

GitHub Actions | Jenkins | Prometheus | Grafana | ELK Stack

Let's Connect

I'm always interested in discussing new opportunities in data engineering and cloud architecture.
