RetailPulse: SQL & ETL Data Pipeline Builder. A full-stack data engineering project that extracts retail data from Kaggle, transforms it with Python, loads it into PostgreSQL, and visualizes insights via a Streamlit dashboard and a React front end. Features cloud integration (Azure Blob Storage), automation, and end-to-end ETL orchestration.


RetailPulse: SQL & ETL Data Pipeline Builder

RetailPulse is a full-stack data engineering project that demonstrates SQL proficiency, ETL pipeline development, and cloud integration. It extracts retail sales data from Kaggle, transforms it using Python, loads it into PostgreSQL, and visualizes results through a Streamlit dashboard. The project is modular, automated, and built for real-world data workflows.

Project Overview

RetailPulse simulates an enterprise-grade data workflow:

  1. Extract — Ingest open retail data from Kaggle.
  2. Transform — Clean, validate, and enrich using Python.
  3. Load — Store processed data in PostgreSQL and upload to Azure Blob Storage.
  4. Visualize — Serve analytics via API and Streamlit dashboard.
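The four stages above can be chained in a single run. The sketch below is illustrative only: the function names and the in-memory CSV hand-off are stand-ins, not the repo's actual modules.

```python
# Minimal ETL orchestration sketch: extract -> transform -> load.
import csv
import io


def extract():
    # Stand-in for the Kaggle download: return raw CSV text.
    return "order_id,amount\n1,10.5\n2,\n3,7.0\n"


def transform(raw):
    # Clean and validate: drop rows with missing amounts, cast types.
    rows = csv.DictReader(io.StringIO(raw))
    return [
        {"order_id": int(r["order_id"]), "amount": float(r["amount"])}
        for r in rows
        if r["amount"]
    ]


def load(records):
    # Stand-in for the PostgreSQL / Blob Storage load step.
    return len(records)


if __name__ == "__main__":
    print(f"loaded {load(transform(extract()))} rows")
```

Keeping each stage a separate function mirrors the repo's `extract.py` / `transform.py` / `load.py` split and lets each step be tested and automated independently.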

Architecture

```
[Kaggle Dataset]
        ↓
[Python ETL Pipeline]
        ↓
[Azure Blob Storage + PostgreSQL]
        ↓
[Flask/FastAPI API Layer]
        ↓
[Streamlit Dashboard]
```


Technologies Used

| Layer | Tools & Libraries |
| --- | --- |
| ETL Pipeline | Python 3.8+, pandas, kaggle, azure-storage-blob, sqlalchemy, logging |
| Database | PostgreSQL (local or Azure) |
| Cloud Storage | Azure Blob Storage |
| API Layer | Flask or FastAPI |
| Hosting | Streamlit Cloud |
| Automation | GitHub Actions, cron jobs |
| Version Control | Git + GitHub |

Setup Instructions

1. Clone the Repository

```bash
git clone git@github.com:MmelIGaba/RetailPulse.git
cd RetailPulse
```

2. Configure Environment

Create a .env file in the /Back-End directory based on .env.example:

```env
DB_HOST=your-db-host
DB_USER=your-username
DB_PASSWORD=your-password
DB_NAME=retailpulse
Azure_ACCESS_KEY=your-access-key
Azure_SECRET_KEY=your-secret-key
```
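At runtime the ETL scripts need these values in the process environment. The snippet below is a minimal, hedged sketch of that step: the `load_env` parser is a stand-in for a library like python-dotenv, and `database_url` simply assembles the `DB_*` variables into a PostgreSQL connection URL.

```python
# Load KEY=VALUE lines from a .env file and build a database URL.
import os


def load_env(path):
    """Read a .env-style file and put its KEY=VALUE pairs into os.environ."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks and comments; split on the first "=" only.
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())


def database_url():
    """Assemble a PostgreSQL connection URL from the loaded DB_* variables."""
    return (
        f"postgresql://{os.environ['DB_USER']}:{os.environ['DB_PASSWORD']}"
        f"@{os.environ['DB_HOST']}/{os.environ['DB_NAME']}"
    )
```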

3. Install Dependencies

```bash
pip install -r Back-End/requirements.txt
pip install -r dashboard/requirements.txt
```

4. Run the ETL Pipeline

```bash
cd Back-End/etl
python extract.py
python transform.py
python load.py
```
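To give a feel for the final step, here is a minimal sketch of what a `load.py`-style script might do. `sqlite3` stands in for PostgreSQL so the example runs without a server; the real script would use sqlalchemy with the `DB_*` settings from `.env`.

```python
# Load cleaned sales records into a relational table (sqlite3 as a
# stand-in for PostgreSQL in this sketch).
import sqlite3


def load_sales(conn, records):
    """Upsert (order_id, amount) tuples and return the total row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER PRIMARY KEY, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?)", records)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_sales(conn, [(1, 10.5), (2, 7.0)]))  # prints 2
```

Idempotent writes (here via `INSERT OR REPLACE` on the primary key) matter for automated pipelines: a scheduled re-run should not duplicate rows.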

5. Launch the Streamlit Dashboard

```bash
cd dashboard
streamlit run app.py
```

6. Launch the Front-End Dashboard

```bash
cd Front-End
npm run dev
```

Repository Structure

```
RetailPulse/
├── Back-End/
│   ├── etl/
│   ├── sql/
│   ├── requirements.txt
│   └── .env.example
├── dashboard/
│   ├── app.py
│   ├── components/
│   └── requirements.txt
├── .github/
│   └── workflows/
├── presentation/
│   └── RetailPulse_Slides.pdf
├── .gitignore
└── README.md
```

Security & Configuration

All credentials are managed via environment variables. Do not commit real credentials or API keys. Use .env files locally and GitHub Secrets for automation.
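One way to enforce this in practice is a fail-fast check at startup, so a missing credential surfaces before any data moves. The helper below is a hypothetical sketch, not part of the repo; it only reports which required variables are unset, never their values.

```python
# Fail-fast configuration check: report unset credentials at startup
# without ever printing their values.
import os

REQUIRED_VARS = ("DB_HOST", "DB_USER", "DB_PASSWORD", "DB_NAME")


def check_config():
    """Return the names of any required environment variables that are unset."""
    return [name for name in REQUIRED_VARS if not os.environ.get(name)]


if __name__ == "__main__":
    missing = check_config()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

The same check works unchanged in GitHub Actions, where the variables come from repository Secrets instead of a local `.env` file.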

Learning Outcomes

By completing this project, you will:

- Write and optimize SQL queries for analysis.
- Design modular ETL pipelines in Python.
- Integrate Azure Blob Storage and PostgreSQL.
- Automate data workflows using GitHub Actions.
- Visualize metrics with modern frontend tools.

License

This project is licensed under the Apache License 2.0.

Author

  1. Mmela Gabriel Dyantyi: Fullstack Developer and Aspiring Cloud Engineer
  2. Boipelo M Ngakane: Frontend Developer | Low Code | AI | Cloud Engineer
  3. [amend as needed]
