Skip to content

This project leverages SQL to perform comprehensive data analysis on sales, customers, and products. It includes exploratory data analysis (EDA), advanced analytics, and key performance metrics to uncover business insights. πŸš€

License

Notifications You must be signed in to change notification settings

reetphy/Exploratory-and-Advanced-SQL-Data-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

29 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Motogear International - Sales & Customer Analysis πŸοΈπŸ’°

Motogear International, a premier supplier of high-performance bikes and accessories, has been at the forefront of innovation and customer satisfaction. With a diverse product range spanning mountain bikes, road bikes, essential bike accessories, components and clothing, the company has expanded its market presence across πŸ‡¦πŸ‡Ί Australia πŸ‡ΊπŸ‡Έ USA πŸ‡¨πŸ‡¦ Canada πŸ‡©πŸ‡ͺ Germany πŸ‡¬πŸ‡§ UK and πŸ‡«πŸ‡· France

πŸš€ Project Type: SQL Analysis | MySQL | Data Analytics

πŸ“Š Objective: Perform exploratory and advanced data analysis to extract insights from customer, product, and sales data, helping to drive business decisions.


πŸ“Š Key Business Metrics

Here are some key performance indicators derived from the dataset:

  • πŸ“ˆ Total Sales: $29,356,250
  • πŸ“¦ Total Quantity Sold: 60,423 units
  • πŸ’° Average Price per Unit: $486
  • πŸ›’ Total Orders Processed: 27,659
  • 🏷️ Total Products Available: 295
  • πŸ‘₯ Total Customers: 18,484

πŸ“Œ Technologies Used

  • Database: MySQL
  • Querying & Analysis: SQL
  • Version Control: Git & GitHub

πŸ“‚ Dataset Overview

The dataset represents Motogear International’s customer base, product catalog, and sales transactions from 2010-12-29 to 2014-01-28, with products sold across six countries. It consists of three key tables:

1️⃣ dim_customers (Customer Dimension)

Stores customer details, including demographics and account creation data.

Column Name Data Type Description
customer_key Integer Unique identifier for each customer (Primary Key).
customer_id Integer Internal customer ID.
customer_number String Unique customer number used for tracking.
first_name String Customer's first name.
last_name String Customer's last name.
country String Customer's country of residence.
marital_status String Marital status (Single/Married).
gender String Gender (Male/Female).
birthdate Date Customer’s date of birth.
create_date Date Date of account creation.

2️⃣ dim_products (Product Dimension)

Contains details about the products sold, including pricing and categorization.

Column Name Data Type Description
product_key Integer Unique identifier for each product (Primary Key).
product_id Integer Internal product ID.
product_number String Unique product number for tracking.
product_name String Name of the product.
category_id String Product category ID.
category String Product category (Bikes, Components, Accessories, Clothing).
subcategory String Subcategory of the product (e.g., Mountain Bikes, Road Frames).
maintenance String Indicates if the product requires maintenance (Yes/No).
cost Integer Cost of the product.
product_line String Product line (e.g., Road, Mountain).
start_date Date Date when the product was introduced.

3️⃣ fact_sales (Sales Fact Table)

Records sales transactions, linking customers and products.

Column Name Data Type Description
order_number Integer Sales transaction identifier (Primary Key).
customer_key Integer Foreign Key linking to dim_customers.
product_key Integer Foreign Key linking to dim_products.
order_date Date Date when the order was placed.
quantity Integer Number of units sold.
price Integer Price per unit of the product.
sales_amount Integer Total revenue generated (quantity * unit_price).
shipping_date Date Date when the order was shipped.

πŸ“– Data Dictionary

  • Dimension Tables (dim_*): Contain descriptive attributes about customers and products.
  • Fact Table (fact_sales): Stores transactional data related to customer purchases.
  • Primary Keys (*_key): Uniquely identify rows in dimension tables.
  • Foreign Keys (customer_key, product_key): Establish relationships between tables.

πŸ› οΈ How to Use the Dataset

  1. Load the data into a SQL database (MySQL, PostgreSQL, or SQLite).
  2. Perform exploratory data analysis (EDA) using SQL queries.
  3. Use Python (Pandas, Matplotlib, Seaborn) to visualize trends.

πŸ› οΈ Project Workflow

The project is divided into three major phases:

1️⃣ Exploratory Data Analysis (EDA)

Objective: Understand the structure and characteristics of the dataset.

βœ… Basic Database Exploration – Identifying missing values, duplicates, and inconsistencies.
βœ… Dimension Exploration – Analyzing customer and product attributes.
βœ… Date Exploration – Examining sales trends over time.
βœ… Measure Exploration – Assessing key metrics like total revenue and order count.
βœ… Magnitude Analysis – Identifying high-value customers and top-selling products.
βœ… Ranking Analysis – Ranking customers, products, and sales regions based on performance.

2️⃣ Advanced Data Analysis

Objective: Derive deeper insights into customer behavior and sales trends.

βœ… Changes Over Time Analysis – Examining year-over-year growth and seasonality.
βœ… Cumulative Analysis – Tracking cumulative revenue and customer acquisition.
βœ… Performance Analysis – Measuring sales efficiency and customer retention.
βœ… Part-to-Whole Analysis – Understanding contributions of different products and regions.
βœ… Data Segmentation – Categorizing customers into segments based on spending patterns.

3️⃣ Report Creation

Objective: Summarize findings in a structured format for business decision-making.

πŸ“Š Customer Report

Purpose: Consolidates key customer metrics and behaviors.

βœ… Key Features:

  • 1️⃣ Demographics & Transaction Details – Names, ages, and spending patterns.
  • 2️⃣ Customer Segmentation – Classification into VIP, Regular, and New customers.
  • 3️⃣ Aggregated Customer Metrics:
    • Total Orders – Number of purchases made.
    • Total Sales – Overall revenue contribution.
    • Total Quantity Purchased – Units bought over time.
    • Total Products Purchased – Diversity in purchases.
    • Lifespan (Months) – Duration of customer activity.
  • 4️⃣ Key Performance Indicators (KPIs):
    • Recency
    • Average Order Value (AOV)
    • Average Monthly Spend

πŸ“Š Product Report

Purpose: Consolidates key product metrics and behaviors.

βœ… Key Features:

  • 1️⃣ Product Spicifications – Name, category, subcategory, and cost.
  • 2️⃣ Product Segmentation – Classification into High-Performers, Mid-Range, or Low-Performers.-
  • 3️⃣ Aggregated Customer Metrics:
    • Total Orders – Number of purchases made.
    • Total Sales – Overall revenue contribution.
    • Total Quantity Sold – Units bought over time.
    • Lifespan (Months) – Duration of product activity.
  • 4️⃣ Key Performance Indicators (KPIs):
    • Recency
    • Average Order Revenue (AOR)
    • Average Monthly Revenue

πŸ“ˆ Key Insights

1️⃣ Customer Segmentation πŸ†

βœ… Insight: New customers generate the highest revenue (37.81%), indicating strong acquisition efforts, while VIPs (36.59%) provide sustained value, and Regular customers (25.59%) may need engagement strategies to boost spending.


2️⃣ Sales Trend Over Time πŸ“…

βœ… Insight: Sales experienced a significant rise from 2010 to 2013, peaking in 2013 with over 16 million in total sales.


3️⃣ Best-Selling Products 🏍️

βœ… Insight: Bikes dominate total sales, contributing 96.46% of revenue, while Accessories and Clothing combined account for less than 4%. This suggests that the business is heavily reliant on bike sales, and there may be opportunities to diversify revenue streams by improving the sales strategy for Accessories and Clothing.


πŸ“ Business Recommendations

πŸ“Œ Customer Retention Strategies πŸš€
πŸ”Ή Target VIP customers with exclusive offers
πŸ”Ή Engage New customers with onboarding discounts
πŸ”Ή Improve Regular customer frequency through loyalty programs

πŸ“Œ Product Optimization 🏍️
πŸ”Ή Focus on Bikes & Accessories for revenue maximization
πŸ”Ή Reduce inventory for Low-Performing items

πŸ“Œ Market Expansion 🌍
πŸ”Ή Highest sales in πŸ‡ΊπŸ‡Έ USA – expand marketing efforts
πŸ”Ή Emerging markets in πŸ‡¦πŸ‡Ί Australia need localized campaigns


πŸ› οΈ Future Improvements

βœ… More Advanced Analytics: Machine learning predictions for future sales.
βœ… More Visuals: Interactive PowerBI dashboards.


πŸ“ Author

πŸ‘€ Reet Chandra
πŸ“§ reetphy@gmail.com
πŸ”— LinkedIn


About

This project leverages SQL to perform comprehensive data analysis on sales, customers, and products. It includes exploratory data analysis (EDA), advanced analytics, and key performance metrics to uncover business insights. πŸš€

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published