This project demonstrates an industrial-style Machine Learning pipeline for detecting spam emails.
It follows real-world ML engineering practices, covering data loading, preprocessing, model training, inference, and deployment using a Flask web application.
The system lets users enter text and receive a Spam / Not Spam prediction with a confidence level (for example, High Confidence).
Industrial-Style-ML-Model-Demo-main/
├── Spam Detection App/
│ ├── app/ # Flask web application (routes, controllers, UI)
│ ├── src/ # Core ML pipeline (preprocessing, training, inference)
│ ├── data/ # Datasets and data ingestion scripts
│ ├── artifacts/ # Trained model and vectorizer artifacts
│ ├── tests/ # Unit and integration tests
│ ├── Dockerfile # Docker configuration for containerized deployment
│ ├── requirements.txt # Python project dependencies
│ └── README.md # Application-level documentation
---
- End-to-end ML workflow (data → training → inference)
- Spam email classification
- Flask-based web UI
- Modular, production-style codebase
- Dockerized for easy deployment
- Reusable preprocessing and vectorization pipeline
git clone <repository-url>
cd Industrial-Style-ML-Model-Demo-main
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r "Spam Detection App/requirements.txt"
cd "Spam Detection App"
python app/app.py
Open your browser and navigate to:
http://127.0.0.1:5000/
- Data Loading: data/load_data.py
- Preprocessing: src/preprocessing.py
- Vectorization: src/vectorizer.py
- Training: src/train.py
- Prediction: src/predict.py
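The stages listed above can be illustrated with a minimal sketch. It does not reproduce the repository's code: the TF-IDF vectorizer, the Naive Bayes classifier, the `preprocess` and `predict` helpers, and the six example messages are all assumptions chosen for illustration, since the modules themselves are not shown here.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB


def preprocess(text: str) -> str:
    """Lowercase the text and replace every run of non-alphanumerics with a space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


# Tiny made-up dataset for illustration only: 1 = spam, 0 = not spam.
texts = [
    "Win a free prize now",
    "Free cash prize, claim now",
    "Congratulations, you won free money",
    "Meeting at noon tomorrow",
    "Please review the attached report",
    "Lunch with the team tomorrow",
]
labels = [1, 1, 1, 0, 0, 0]

# Vectorization: fit on the training text, then reuse the fitted vectorizer at inference.
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform([preprocess(t) for t in texts])

# Training: the classifier choice here is an assumption, not the project's confirmed model.
model = MultinomialNB()
model.fit(features, labels)


def predict(text: str) -> tuple[int, float]:
    """Return (predicted label, spam probability) for a single message."""
    row = vectorizer.transform([preprocess(text)])
    spam_index = list(model.classes_).index(1)
    spam_probability = float(model.predict_proba(row)[0][spam_index])
    return int(model.predict(row)[0]), spam_probability


label, probability = predict("Congratulations! You've won a free prize.")
print(label, round(probability, 3))
```

The point of the sketch is that the same `preprocess` function and the same fitted vectorizer are applied at both training and prediction time, which is what makes the preprocessing and vectorization steps reusable.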
Build and run the application using Docker (from the "Spam Detection App" directory, which contains the Dockerfile):
docker build -t spam-detection-app .
docker run -p 5000:5000 spam-detection-app
Run tests from the project root:
pytest
Key libraries used:
- Flask
- scikit-learn
- pandas
- numpy
- joblib
See requirements.txt for the full list.
Model artifacts are loaded from:
Spam Detection App/artifacts/
Ensure model.pkl and vectorizer.pkl are present before running the app.
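A quick pre-flight check along these lines could confirm that both files exist before the app starts. The helper name `missing_artifacts` and the relative path `artifacts` (which assumes the script runs from inside "Spam Detection App") are hypothetical and not part of the project.

```python
from pathlib import Path

# File names taken from the README; the helper itself is a hypothetical addition.
REQUIRED_ARTIFACTS = ("model.pkl", "vectorizer.pkl")


def missing_artifacts(artifacts_dir: str) -> list[str]:
    """Return the names of required artifact files that are absent from artifacts_dir."""
    directory = Path(artifacts_dir)
    return [name for name in REQUIRED_ARTIFACTS if not (directory / name).is_file()]


if __name__ == "__main__":
    # Assumes the current working directory is "Spam Detection App".
    missing = missing_artifacts("artifacts")
    if missing:
        print("Missing artifacts:", ", ".join(missing))
    else:
        print("All artifacts present.")
```

If the check passes, the files would typically be read with `joblib.load`, given that joblib is among the listed dependencies. How the app actually loads them is not shown here.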
Input:
Congratulations! You've won a free prize.
Output:
Spam (High Confidence)
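One way a label like the one above could be derived from a model's spam probability is sketched below. The function name `format_prediction` and the 0.9 and 0.7 thresholds are assumptions for illustration; the project's actual cut-offs are not documented here.

```python
def format_prediction(spam_probability: float, high: float = 0.9, medium: float = 0.7) -> str:
    """Turn a spam probability into a label such as 'Spam (High Confidence)'.

    The thresholds are illustrative defaults, not values taken from the project.
    """
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be between 0 and 1")

    is_spam = spam_probability >= 0.5
    label = "Spam" if is_spam else "Not Spam"
    # Confidence is the probability of whichever class was chosen.
    confidence = spam_probability if is_spam else 1.0 - spam_probability

    if confidence >= high:
        level = "High"
    elif confidence >= medium:
        level = "Medium"
    else:
        level = "Low"
    return f"{label} ({level} Confidence)"


print(format_prediction(0.97))
```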
- Model not found: Ensure the artifacts directory contains the trained files.
- Module errors: Verify your virtual environment is activated.
- Port issues: Make sure port 5000 is free.
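For the port check, a small standard-library probe like the following can tell whether something is already bound to port 5000 on the local machine. The helper `is_port_free` is a hypothetical utility, not part of the project.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to (host, port), False if it is in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Binding fails when another process already holds the port.
            return False
    return True


if __name__ == "__main__":
    print("Port 5000 free:", is_port_free(5000))
```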