Business
- Executive Summary: This repository contains the accompanying source code for the blog post "Finally! Bayesian Hierarchical Modelling at Scale", demonstrating how to build and train a Bayesian hierarchical Gamma-Poisson model for large-scale daily sales forecasting across many retail stores. The project uses JAX and NumPyro to implement a scalable probabilistic model that shares statistical strength across stores while handling missing observations and padded time series. Data is based on the Kaggle Rossmann Store Sales dataset, with preprocessing, model fitting, and evaluation orchestrated through Jupyter notebooks and a small Python package under src/bhm_at_scale. The code focuses on experiment reproducibility via conda environments, and on comparing hierarchical Bayesian predictions against a conventional regression baseline. Artifacts such as encoded feature matrices, learned parameters, and prediction summaries are written to disk for downstream analysis and visualization.
- Intended Use: The repository is intended for data scientists and researchers who want to study and prototype scalable Bayesian hierarchical models for multi-store retail demand forecasting using Python, JAX, and NumPyro. It is designed to support experimentation, educational use, and reproducible demonstrations that accompany the referenced blog post, rather than serving as a turnkey production forecasting system.
- Non-Goals
  - Providing a production-grade, fully tested, and supported forecasting service or generic time-series framework; the code is a research and teaching example that accompanies a blog post and uses a PyScaffold-based project template.
- Use Case: The main use case is forecasting daily sales for many retail stores simultaneously by fitting a Gamma-Poisson Bayesian hierarchical model that shares information across stores while modeling store-specific effects, using the Rossmann Store Sales dataset as a concrete example.
- User Populations
  - Data scientists, machine learning practitioners, and researchers familiar with Python and probabilistic programming who are interested in scalable Bayesian hierarchical modeling for retail sales forecasting.
  - Data Scientist
  - Domain Expert
  - Governance, Compliance & Ethics Officer
  - ML Engineer
  - Product Manager
  - Project Manager
  - Software Developer
  - UX Researcher
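
The Gamma-Poisson idea described in the summary and use case above can be illustrated with a small NumPy sketch. This is not the repository's NumPyro implementation: it omits covariates, fixes the shared Gamma hyperparameters instead of learning them, and every function name and numeric value below is a hypothetical choice made for illustration. It only shows how store-level rates drawn from a shared prior, a mask for missing or padded days, and the conjugate posterior update fit together.

```python
from math import lgamma

import numpy as np


def simulate_sales(n_stores, n_days, concentration=5.0, rate=0.05, seed=0):
    """Draw store-level Poisson rates from a shared Gamma prior and simulate sales.

    The hyperparameter defaults are assumptions (prior mean = 5 / 0.05 = 100).
    """
    rng = np.random.default_rng(seed)
    # NumPy parameterizes Gamma by shape and scale, so scale = 1 / rate.
    store_rate = rng.gamma(shape=concentration, scale=1.0 / rate, size=n_stores)
    sales = rng.poisson(lam=store_rate[:, None], size=(n_stores, n_days))
    # Stores have histories of different lengths; the tail of each row is padding.
    lengths = rng.integers(n_days // 2, n_days + 1, size=n_stores)
    mask = np.arange(n_days)[None, :] < lengths[:, None]  # True = observed
    return store_rate, sales, mask


# Elementwise log(k!) via lgamma(k + 1).
_log_factorial = np.vectorize(lambda k: lgamma(k + 1.0), otypes=[float])


def masked_poisson_loglik(sales, store_rate, mask):
    """Poisson log-likelihood summed over observed (unmasked) entries only."""
    lam = store_rate[:, None]
    loglik = sales * np.log(lam) - lam - _log_factorial(sales)
    return float(np.where(mask, loglik, 0.0).sum())


def posterior_rate_mean(sales, mask, concentration, rate):
    """Posterior mean of each store's rate under the conjugate Gamma prior.

    Posterior is Gamma(concentration + sum of observed sales,
                       rate + number of observed days).
    """
    observed_total = np.where(mask, sales, 0).sum(axis=1)
    observed_days = mask.sum(axis=1)
    return (concentration + observed_total) / (rate + observed_days)


if __name__ == "__main__":
    store_rate, sales, mask = simulate_sales(n_stores=4, n_days=30)
    print("masked log-likelihood:", masked_poisson_loglik(sales, store_rate, mask))
    print("posterior mean rates:", posterior_rate_mean(sales, mask, 5.0, 0.05))
```

In this sketch, padded or missing entries contribute nothing to the likelihood or to the posterior, and a store with few observed days has a posterior mean that stays close to the shared prior mean `concentration / rate`. That shrinkage is one simple way to read "sharing statistical strength across stores".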
