An end-to-end data pipeline built to provide actionable business insights for movie producers. This project automates the process of fetching, transforming, and visualizing movie financial data, focusing on Return on Investment (ROI) rather than just gross revenue.
Project Architecture
Movie producers often rely on gross revenue to evaluate performance, which hides the true profitability of a film. Financial data is scattered, partially structured, and time-sensitive, which makes it difficult to extract clear, decision-ready insights about what actually drives return on investment.
I built a fully automated, cloud-native ETL pipeline orchestrated with Apache Airflow. It fetches real-time movie data via an API, transforms and enriches it using Python and Pandas (including ROI calculation and JSON normalization), stores it securely in AWS S3, and delivers insights through an interactive Streamlit dashboard designed for business users.
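The transform step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the payload shape, the field names (`financials`, `budget`, `revenue`), and the `transform` helper are all assumptions, and ROI is taken here as (revenue - budget) / budget.

```python
import pandas as pd

# Hypothetical API payload; the nesting and field names are assumptions.
raw_movies = [
    {"title": "Film A", "financials": {"budget": 10_000_000, "revenue": 35_000_000}},
    {"title": "Film B", "financials": {"budget": 50_000_000, "revenue": 40_000_000}},
    {"title": "Film C", "financials": {"budget": 0, "revenue": 1_000_000}},
]


def transform(records):
    """Flatten nested JSON records and add an ROI column (hypothetical helper)."""
    # json_normalize flattens nested dicts into columns such as "financials_budget".
    df = pd.json_normalize(records, sep="_")

    # ROI is undefined for a zero budget, so those rows are dropped.
    df = df[df["financials_budget"] > 0].copy()

    # ROI = (revenue - budget) / budget
    df["roi"] = (
        df["financials_revenue"] - df["financials_budget"]
    ) / df["financials_budget"]

    # Highest-ROI films first, which is the ordering a dashboard would likely want.
    return df.sort_values("roi", ascending=False).reset_index(drop=True)


movies = transform(raw_movies)
print(movies[["title", "roi"]])
```

In this sample, the film with the largest gross (Film B) ends up with a negative ROI, which is the kind of gap between revenue and profitability the pipeline is meant to surface.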
Streamlit app view.
Producers can now instantly identify high-ROI films, compare performance drivers, and make data-backed investment decisions. The architecture is scalable, secure, and production-ready, demonstrating my ability to build end-to-end data systems that turn raw data into strategic business value.