Technical Highlights
Time Series Analysis: Systematic comparison of ARIMA (a dedicated time series model) against Linear Regression (a simple baseline), evaluated on the same metrics.
Model Optimization: GridSearchCV for hyperparameter tuning; Ridge regression with L2 regularization for better generalization.
Surprising Results: Simple Linear Regression outperformed ARIMA, cutting RMSE by 41% ($4,792 vs $8,178) and MAE by 36% ($3,726 vs $5,800), which demonstrates the importance of choosing a model based on data characteristics.
Interactive Dashboard: Streamlit application with a forecasting interface, confidence intervals, and business insights visualization.
Skills Demonstrated
Time Series Analysis
- ▸ARIMA Modeling: AutoRegressive Integrated Moving Average
- ▸Parameter Selection: AIC-based optimization of the (p,d,q) parameters
- ▸Stationarity Testing: Augmented Dickey-Fuller test
- ▸Seasonal Decomposition: Trend, seasonality, residual analysis
Machine Learning
- ▸Linear Models: Linear Regression, Ridge Regression
- ▸Hyperparameter Tuning: GridSearchCV with cross-validation
- ▸Model Evaluation: MAE, RMSE, R-squared
- ▸Regularization: L2 regularization to prevent overfitting
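The three evaluation metrics listed above can be written out in a few lines. This is a dependency-free sketch for illustration, not the project's evaluation code; the sample numbers are hypothetical.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average size of the miss, in the target's units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: like MAE, but large misses weigh more."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Share of variance explained; 0.0 matches always predicting the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical weekly sales and predictions (not the project's data)
actual = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 115.0, 150.0, 155.0]

print(mae(actual, predicted))        # 7.5
print(r_squared(actual, predicted))  # 0.875
```

Because RMSE squares each error before averaging, a model can rank first on MAE and second on RMSE when it makes a few larger misses.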
Data Science
- ▸Exploratory Data Analysis: Distribution analysis, pattern detection
- ▸Feature Engineering: Time-based features, aggregation strategies
- ▸Data Resampling: Daily → Weekly aggregation for noise reduction
- ▸Validation Strategy: Time-series split (90-10)
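A time-series split keeps the observations in order and holds out the most recent ones, so the model is never evaluated on data older than what it trained on. A minimal sketch, assuming the 90-10 split is applied to an ordered list; `chronological_split` is a hypothetical helper and the values are placeholders.

```python
def chronological_split(series, train_fraction=0.9):
    """Split an ordered series into (train, test) without shuffling."""
    cut = int(len(series) * train_fraction)
    cut = max(1, min(cut, len(series) - 1))  # keep both sides non-empty
    return series[:cut], series[cut:]

weekly_sales = list(range(1, 14))  # 13 placeholder values, oldest first
train, test = chronological_split(weekly_sales)

print(len(train), len(test))  # 11 2
```

With only 13 weekly points, a 90-10 split leaves very few test observations, so the resulting error estimates are noisy.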
Software Engineering
- ▸Python Development: Clean, modular code with proper documentation
- ▸Interactive Dashboard: Streamlit for a user-friendly forecasting interface
- ▸Visualization: Matplotlib and Seaborn for communicating insights
- ▸Deployment: Streamlit Cloud with automated deployment
Model Comparison Results
| Model | MAE | RMSE | vs ARIMA |
|-------|-----|------|----------|
| ARIMA (5,1,0) | $5,800 | $8,178 | Baseline |
| Linear Regression | $3,726 | $4,792 | 36% lower MAE, 41% lower RMSE (best RMSE) |
| Ridge (α=0.1) | $3,676 | $5,049 | 37% lower MAE, 38% lower RMSE (best MAE) |
Winner (by MAE): Ridge Regression with alpha=0.1. Plain Linear Regression has the lower RMSE.
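The improvement over ARIMA can be recomputed directly from the MAE and RMSE values in the table. The short check below uses only those reported numbers; `pct_reduction` is a hypothetical helper name.

```python
def pct_reduction(baseline, candidate):
    """Percentage by which candidate error is lower than baseline error."""
    return (baseline - candidate) / baseline * 100

# (MAE, RMSE) in dollars, copied from the comparison table
results = {
    "ARIMA (5,1,0)": (5800, 8178),
    "Linear Regression": (3726, 4792),
    "Ridge (alpha=0.1)": (3676, 5049),
}

base_mae, base_rmse = results["ARIMA (5,1,0)"]
for name, (model_mae, model_rmse) in results.items():
    print(
        f"{name}: MAE {pct_reduction(base_mae, model_mae):.0f}% lower, "
        f"RMSE {pct_reduction(base_rmse, model_rmse):.0f}% lower"
    )
```

For Linear Regression this gives roughly 36% on MAE and 41% on RMSE, so the 41% figure corresponds to the RMSE comparison.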
Why Did Linear Regression Win?
Data Characteristics:
- ▸No strong seasonality (ARIMA's strength wasted)
- ▸Short time series (13 weeks, not enough for ARIMA)
- ▸Clear linear trend (well suited to linear regression)
- ▸Simple model prevents overfitting
Lesson: Model complexity ≠ Better performance. Always test simple baselines.
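To illustrate why a clear linear trend suits a linear model, the sketch below fits a straight line to a short series by ordinary least squares and extrapolates it. It assumes the only feature is the week index, which may differ from the project's actual feature set, and the data is hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def extrapolate(slope, intercept, start, horizon):
    """Predict `horizon` future points beginning at index `start`."""
    return [intercept + slope * t for t in range(start, start + horizon)]

# Hypothetical 13 weeks of sales growing by $50 per week
history = [1000 + 50 * t for t in range(13)]
slope, intercept = fit_linear_trend(history)
next_weeks = extrapolate(slope, intercept, start=13, horizon=3)
```

The line has only two parameters to estimate, which is manageable with 13 points; an ARIMA(5,1,0) model has to estimate five autoregressive coefficients from the same data.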
Technical Implementation
ARIMA Parameter Selection:
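from statsmodels.tsa.arima.model import ARIMA

# Test multiple (p,d,q) combinations; `train` is the training series
best_aic, best_order = float("inf"), None
for p in range(6):
    for d in range(2):
        for q in range(3):
            aic = ARIMA(train, order=(p, d, q)).fit().aic
            # Select best based on AIC (lower is better)
            if aic < best_aic:
                best_aic, best_order = aic, (p, d, q)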
Ridge Regression Tuning:
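from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
param_grid = {'alpha': [0.1, 1.0, 10.0, 100.0]}
grid_search = GridSearchCV(Ridge(), param_grid, cv=5)
grid_search.fit(X_train, y_train)  # fit() returns the search object, not the model
best_model = grid_search.best_estimator_  # Ridge refit with the best alpha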
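What the alpha grid controls can be seen in the single-feature case, where the ridge slope has a closed form: with an unpenalized intercept, the slope is Sxy / (Sxx + alpha). The sketch below is a simplified illustration with hypothetical data, not the project's multi-feature setup.

```python
def ridge_slope(xs, ys, alpha):
    """Slope minimizing sum((y - a - b*x)**2) + alpha * b**2 (intercept a unpenalized)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / (sxx + alpha)

xs = [0, 1, 2, 3, 4]
ys = [0, 2, 4, 6, 8]  # true slope is 2

# Same alpha values as the grid above, plus 0.0 for plain least squares
slopes = {alpha: ridge_slope(xs, ys, alpha) for alpha in [0.0, 0.1, 1.0, 10.0, 100.0]}
```

Larger alpha shrinks the slope toward zero. At alpha=0.1 the fit is almost identical to plain least squares, which is consistent with Ridge and Linear Regression scoring so closely in the comparison table.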
Feature Engineering
Temporal Features:
- ▸Month, Day, Weekday extraction
- ▸Hour-based patterns (peak hours: 13:00-15:00, 19:00-20:00)
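A minimal sketch of how such temporal features might be extracted from a timestamp. The field names are hypothetical, and treating the peak windows as end-exclusive (13:00-15:00 covers hours 13 and 14) is an assumption.

```python
from datetime import datetime

# Hours that start inside the reported peak windows (assumed end-exclusive)
PEAK_HOURS = {13, 14, 19}

def temporal_features(ts):
    """Derive calendar and hour-of-day features from one timestamp."""
    return {
        "month": ts.month,
        "day": ts.day,
        "weekday": ts.weekday(),  # Monday=0 ... Sunday=6
        "hour": ts.hour,
        "is_peak_hour": ts.hour in PEAK_HOURS,
    }

# Hypothetical transaction time: Friday 15 March 2024, 13:30
features = temporal_features(datetime(2024, 3, 15, 13, 30))
```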
Aggregation Strategy:
- ▸Daily sales → Weekly sales (noise reduction)
- ▸Smoothing for better trend detection
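The daily-to-weekly aggregation can be sketched with the standard library alone. Grouping by ISO week (Monday start) is an assumption, since the write-up does not state the week boundary used, and the daily figures are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def weekly_totals(daily_sales):
    """Sum (date, amount) pairs into totals keyed by (ISO year, ISO week)."""
    totals = defaultdict(float)
    for day, amount in daily_sales:
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += amount
    return dict(sorted(totals.items()))

# Two hypothetical weeks of daily sales starting Monday 1 January 2024
start = date(2024, 1, 1)
daily = [(start + timedelta(days=i), 100.0 + (i % 7) * 10) for i in range(14)]
weekly = weekly_totals(daily)

print(weekly)  # {(2024, 1): 910.0, (2024, 2): 910.0}
```

In this toy series the within-week swings disappear after aggregation, which is the noise reduction the weekly series is meant to provide.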
Derived Metrics:
- ▸Rolling averages
- ▸Growth rates
- ▸Seasonal indicators
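Rolling averages and growth rates are simple to derive from the weekly series. This is an illustrative sketch with hypothetical values; it assumes no week has zero sales, since the growth rate divides by the previous value.

```python
def rolling_mean(values, window):
    """Trailing mean; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out

def growth_rates(values):
    """Period-over-period change as a fraction; the first entry is None."""
    return [None] + [(cur - prev) / prev for prev, cur in zip(values, values[1:])]

weekly = [100.0, 110.0, 121.0, 133.1]  # hypothetical, growing about 10% per week
smoothed = rolling_mean(weekly, window=2)
growth = growth_rates(weekly)
```

Both features use only past values, so they can be computed for a given week without looking ahead.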
Business Insights
Forecast Results: 8% growth over the next 12 weeks
Actionable Recommendations:
- ▸Inventory Planning: Increase stock by 10% (with buffer)
- ▸Staffing Optimization: More staff during peak hours (13:00-15:00, 19:00-20:00)
- ▸Promotion Planning: Target slow days to boost sales
Expected Impact:
- ▸30% reduction in stockouts
- ▸20% improvement in labor efficiency
- ▸15% increase in slow-day sales
Technical Challenges Solved
Short Time Series: Only 13 weeks of data, while a common rule of thumb for ARIMA is 50+ observations. Solution: use simple models that do not overfit.
High Variability: Daily sales have a standard deviation of $1,842. Solution: weekly aggregation to smooth out the noise.
Model Selection: Systematic comparison with consistent metrics. Don't assume ARIMA is always best.
Hyperparameter Tuning: GridSearchCV with cross-validation to find the optimal Ridge alpha.
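One caveat when cross-validating time-ordered data: passing an integer such as cv=5 to GridSearchCV uses plain K-fold for regressors, so some folds train on weeks that come after the validation weeks. scikit-learn provides TimeSeriesSplit for this case. The dependency-free sketch below shows the underlying expanding-window idea; it is a possible refinement, not the setup the project is documented to use, and `expanding_window_folds` is a hypothetical helper.

```python
def expanding_window_folds(n_samples, n_folds, test_size=1):
    """Yield (train_idx, test_idx) pairs where each test block follows its training data."""
    first_test_start = n_samples - n_folds * test_size
    if first_test_start < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        test_start = first_test_start + k * test_size
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx

# 11 training weeks (hypothetical count), 5 folds, one validation week each
folds = list(expanding_window_folds(n_samples=11, n_folds=5))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each fold validates on the week immediately after its training window, which mirrors how the final model is used for forecasting.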
Deployment: Interactive Streamlit dashboard with:
- ▸Historical data visualization
- ▸Forecast horizon selection (1-52 weeks)
- ▸Confidence intervals
- ▸Business metrics (total forecast, avg weekly, growth rate)
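The business metrics shown in the dashboard could be derived from a forecast roughly as follows. This is a sketch under assumptions: the write-up does not define the growth rate, so it is taken here as the change from the last observed week to the final forecast week, and all figures are hypothetical.

```python
def forecast_summary(forecast, last_actual):
    """Summarize a weekly forecast into headline metrics."""
    total = sum(forecast)
    return {
        "total_forecast": total,
        "avg_weekly": total / len(forecast),
        # Assumed definition: final forecast week relative to last observed week
        "growth_rate": (forecast[-1] - last_actual) / last_actual,
    }

# Hypothetical 4-week forecast following a last observed week of $1,000
summary = forecast_summary([1020.0, 1040.0, 1060.0, 1080.0], last_actual=1000.0)
```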
Live Demo: https://sales-forecasting-fauza.streamlit.app/
Read Full Story: Blog Post