My journey learning machine learning from scratch, building a churn prediction system with 82.8% recall and deploying to production
This was my first machine learning project in 2024. I started with zero ML knowledge: I didn't know what overfitting was, how models were trained, or even the basic terminology.
What I did know: "ML can predict things from data." And I wanted to build something real, not just follow tutorials.
I chose churn prediction because customer churn is expensive, and it's a problem a model can actually help with.
If we can predict who will churn before they leave, companies can intervene early: offer promotions, improve service, or provide better plans.
I used the Kaggle Telco Customer Churn dataset: 7,043 telecom customers, with features covering demographics, subscribed services, contract type, payment method, and monthly/total charges, plus a Yes/No churn label.
First challenge: the dataset is imbalanced. Only about 27% of customers are churners.
If the model just predicted "everyone stays", it would score roughly 73% accuracy and be completely useless.
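That baseline is easy to verify straight from the raw data. A minimal sketch, assuming the standard Kaggle CSV name and its `Churn` column with Yes/No values:

```python
import pandas as pd

# Standard Kaggle file name for the Telco Customer Churn dataset (adjust the path as needed)
data = pd.read_csv("WA_Fn-UseC_-Telco-Customer-Churn.csv")

churn_rate = (data["Churn"] == "Yes").mean()
print(f"Churn rate: {churn_rate:.1%}")                     # roughly a quarter of customers churn
print(f"'Everyone stays' accuracy: {1 - churn_rate:.1%}")  # ~73% without learning anything
```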
Contract type: long-term contracts drastically reduce churn; month-to-month customers are by far the most likely to leave.
Tenure: new customers are the highest risk; churn drops off as tenure grows.
Payment method: electronic check users churn the most. Why? Probably because it's easier to cancel.
Created derived features to give the model more signal:
Customer Lifetime Value (CLV):
data['CLV'] = data['tenure'] * data['MonthlyCharges']
Average Monthly Charges:
data['AvgMonthlyCharges'] = data['TotalCharges'] / (data['tenure'] + 1)  # +1 avoids division by zero for brand-new customers (tenure = 0)
Tenure Groups:
data['TenureGroup'] = pd.cut(data['tenure'], bins=[0, 12, 24, 60, 72], labels=['0-1 year', '1-2 years', '2-5 years', '5-6 years'], include_lowest=True)  # include_lowest keeps tenure-0 customers in the first bucket instead of NaN
These helped the model capture non-linear relationships.
This was the trickiest part. Training on imbalanced data biases the model toward the majority class (the customers who stay).
Solution: SMOTE (Synthetic Minority Over-sampling Technique).
from imblearn.over_sampling import SMOTE

smote = SMOTE(random_state=42)
X_train_balanced, y_train_balanced = smote.fit_resample(X_train, y_train)
SMOTE creates synthetic samples from the minority class (churners) by interpolating between existing samples.
After SMOTE, churners and non-churners are equally represented in the training set, so the model no longer has an incentive to just predict "everyone stays".
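A quick check that the resampling did what it should, reusing the variables from the snippet above (assumes the target is already encoded as 0 = stay, 1 = churn):

```python
import pandas as pd

# Class counts before vs. after SMOTE; the balanced set should show equal counts per class
print(pd.Series(y_train).value_counts())
print(pd.Series(y_train_balanced).value_counts())
```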
Tested multiple models:
| Model | Recall | Precision | F1-Score |
|-------|--------|-----------|----------|
| Logistic Regression | 82.8% | 65.7% | 73.3% |
| Random Forest | 78.5% | 68.2% | 73.0% |
| Gradient Boosting | 79.3% | 67.5% | 73.0% |
Winner: Logistic Regression.
Surprising? Yes. The more complex models (Random Forest, Gradient Boosting) didn't perform better on the metrics that matter here. For this dataset, a simple linear model worked best.
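For reference, this is roughly what the comparison looks like in scikit-learn. It's a sketch, not the project's exact code: the hyperparameters are library defaults, and `X_test`/`y_test` (the untouched, still-imbalanced hold-out set) are assumed names.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import recall_score, precision_score, f1_score

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Train on the SMOTE-balanced data, but always evaluate on the untouched test set
for name, model in models.items():
    model.fit(X_train_balanced, y_train_balanced)
    y_pred = model.predict(X_test)
    print(f"{name}: "
          f"recall={recall_score(y_test, y_pred):.3f}, "
          f"precision={precision_score(y_test, y_pred):.3f}, "
          f"f1={f1_score(y_test, y_pred):.3f}")
```

The important detail: SMOTE is applied only to the training split, so the reported scores reflect the real class balance.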
In churn prediction, recall is more important than precision.
Recall: of all actual churners, how many did we catch?
Precision: of all predicted churners, how many actually churned?
The trade-off: pushing recall up usually pulls precision down, and vice versa, so you have to decide which mistake hurts more.
Business perspective: Missing a churner (false negative) is more expensive than a false alarm (false positive).
False positive: Give promo to someone who won't churn. Worst case: waste promo budget.
False negative: Miss someone who will churn. Worst case: lose customer permanently.
That's why I optimized for 82.8% recall, even though precision is only 65.7%.
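In scikit-learn terms, both metrics fall straight out of the confusion matrix. A minimal sketch, reusing `y_test` and `y_pred` from the comparison above:

```python
from sklearn.metrics import confusion_matrix

# For binary labels 0/1, confusion_matrix returns [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

recall = tp / (tp + fn)     # of all actual churners, how many we caught
precision = tp / (tp + fp)  # of all predicted churners, how many actually churned
print(f"recall={recall:.3f}, precision={precision:.3f}")
```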
The Logistic Regression coefficients double as a feature-importance ranking: the features with the largest weights are the strongest churn predictors.
These are actionable insights for the business, not just model internals.
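Reading the predictors off the fitted model is only a few lines. A sketch reusing the `models` dict from the comparison above, and assuming the training features live in a DataFrame so column names are available:

```python
import pandas as pd

log_reg = models["Logistic Regression"]

# Pair each feature with its learned weight; larger |coefficient| = stronger influence on churn.
# Note: magnitudes are only directly comparable if the features were scaled before training.
coef = pd.Series(log_reg.coef_[0], index=X_train_balanced.columns)
print(coef.reindex(coef.abs().sort_values(ascending=False).index).head(10))
```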
A good model is useless if it can't be used, so I built an interactive dashboard with Streamlit.
Deployed to Streamlit Cloud: free hosting, with auto-deploy from GitHub.
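The whole app boils down to a handful of Streamlit calls. A minimal sketch, not the deployed app's actual code: the model file name, the input fields, and the assumption that the saved artifact is a preprocessing-plus-model pipeline are all placeholders.

```python
import joblib
import pandas as pd
import streamlit as st

# Hypothetical artifact name; assumed to be a pipeline that handles encoding internally
model = joblib.load("churn_model.pkl")

st.title("Customer Churn Prediction")

tenure = st.number_input("Tenure (months)", min_value=0, max_value=72, value=12)
monthly = st.number_input("Monthly charges", min_value=0.0, value=70.0)
contract = st.selectbox("Contract", ["Month-to-month", "One year", "Two year"])

if st.button("Predict churn risk"):
    customer = pd.DataFrame([{"tenure": tenure, "MonthlyCharges": monthly, "Contract": contract}])
    churn_prob = model.predict_proba(customer)[0, 1]
    st.metric("Churn probability", f"{churn_prob:.0%}")
```

Push the repo to GitHub, point Streamlit Cloud at it, and every commit redeploys automatically.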
From the analysis, actionable recommendations:
Finding: Month-to-month contracts have 42.7% churn rate.
Action: incentivize month-to-month customers to move to one- or two-year contracts.
Expected Impact: 30-40% reduction in churn.
Finding: Customers with tenure under 12 months are highest risk.
Action: invest in onboarding and early engagement during the first year, when the risk is highest.
Expected Impact: 25% improvement in 1-year retention.
Finding: Electronic check users have 45.3% churn rate.
Action: nudge electronic check users toward automatic payment methods.
Expected Impact: 15-20% churn reduction in this segment.
Logistic Regression outperformed Random Forest and Gradient Boosting. Don't assume complex = better.
Understanding telco business helped in feature engineering (CLV, tenure groups) and making actionable recommendations.
Optimizing for recall makes business sense. Missing a churner is expensive.
Handling class imbalance properly improved recall from 65% to 82.8%.
An undeployed model = useless model. Streamlit made deployment easy.
There's plenty I would do differently if I started again, but the big lessons hold either way.
This project taught me that ML isn't just about models. It's about understanding the business, engineering features that carry real signal, and choosing the metric that matches the problem.
And most importantly: ship it. Better to have an imperfect model in production than a perfect model in a notebook.
Live Demo: https://customer-churn-fauza.streamlit.app/
For other projects, see Food Recommendation Chatbot, Sales Forecasting, and Sentinel Predictive Maintenance.