E-commerce Sales Analysis

Advanced SQL analysis of 50,000+ transactions to uncover revenue patterns and customer behavior insights

SQL Data Analysis E-commerce Database Queries

Project Overview

This project analyzes an e-commerce dataset containing over 50,000 transactions to identify key revenue drivers, customer purchasing patterns, and product performance metrics. Using advanced SQL techniques including CTEs, window functions, and complex joins, I extracted actionable insights to support strategic business decisions.

The analysis focused on three main areas: revenue optimization, customer segmentation, and inventory management. By identifying high-value customer segments and underperforming products, the analysis provided clear recommendations for marketing focus and inventory adjustments.

Key Findings

$2.4M
Total Revenue Analyzed
18%
Revenue from Top 10% Customers
35%
Increase in Q4 Sales

Problem Statement

The e-commerce business needed to understand which customer segments and products were driving the most revenue, identify seasonal trends, and optimize inventory management. Manual reporting processes were time-consuming and lacked the depth needed for strategic decision-making.

Methodology

1. Data Preparation

Cleaned and validated transaction data, handling missing values and duplicate records. Created standardized date formats and calculated derived fields for analysis.

2. Revenue Analysis

Used window functions to calculate running totals, month-over-month growth rates, and product performance rankings.

3. Customer Segmentation

Implemented RFM (Recency, Frequency, Monetary) analysis using CTEs to categorize customers into value segments.

4. Insights & Recommendations

Synthesized findings into actionable recommendations for marketing campaigns and inventory optimization.

SQL Code Examples

Revenue Analysis Query

This query calculates monthly revenue trends and identifies top-performing product categories.

-- Monthly Revenue Trends with Growth Rate WITH monthly_revenue AS ( SELECT DATE_TRUNC('month', order_date) AS month, SUM(total_amount) AS revenue, COUNT(DISTINCT order_id) AS num_orders FROM orders WHERE order_status = 'completed' GROUP BY DATE_TRUNC('month', order_date) ) SELECT month, revenue, num_orders, LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue, ROUND(((revenue - LAG(revenue) OVER (ORDER BY month)) / LAG(revenue) OVER (ORDER BY month) * 100), 2) AS growth_rate FROM monthly_revenue ORDER BY month DESC;

Customer Segmentation (RFM Analysis)

Segment customers based on Recency, Frequency, and Monetary value for targeted marketing.

-- RFM Customer Segmentation WITH rfm_calc AS ( SELECT customer_id, MAX(order_date) AS last_order_date, COUNT(order_id) AS frequency, SUM(total_amount) AS monetary FROM orders WHERE order_status = 'completed' GROUP BY customer_id ), rfm_scores AS ( SELECT customer_id, DATEDIFF(CURRENT_DATE, last_order_date) AS recency, frequency, monetary, NTILE(5) OVER (ORDER BY DATEDIFF(CURRENT_DATE, last_order_date) DESC) AS r_score, NTILE(5) OVER (ORDER BY frequency ASC) AS f_score, NTILE(5) OVER (ORDER BY monetary ASC) AS m_score FROM rfm_calc ) SELECT customer_id, recency, frequency, monetary, CASE WHEN r_score >= 4 AND f_score >= 4 THEN 'Champions' WHEN r_score >= 3 AND f_score >= 3 THEN 'Loyal Customers' WHEN r_score >= 4 AND f_score <= 2 THEN 'New Customers' WHEN r_score <= 2 AND f_score >= 3 THEN 'At Risk' ELSE 'Needs Attention' END AS customer_segment FROM rfm_scores ORDER BY monetary DESC;

Product Performance Analysis

Identify best and worst performing products by category with revenue contribution.

-- Top and Bottom Products by Revenue SELECT p.product_name, p.category, COUNT(oi.order_id) AS units_sold, SUM(oi.quantity * oi.unit_price) AS total_revenue, ROUND(SUM(oi.quantity * oi.unit_price) * 100.0 / SUM(SUM(oi.quantity * oi.unit_price)) OVER (), 2) AS revenue_percentage, RANK() OVER (PARTITION BY p.category ORDER BY SUM(oi.quantity * oi.unit_price) DESC) AS category_rank FROM products p JOIN order_items oi ON p.product_id = oi.product_id JOIN orders o ON oi.order_id = o.order_id WHERE o.order_status = 'completed' GROUP BY p.product_id, p.product_name, p.category ORDER BY total_revenue DESC LIMIT 20;

Results & Business Impact

  • Identified that the top 10% of customers contributed 18% of total revenue, enabling targeted VIP customer retention programs
  • Discovered 35% seasonal spike in Q4 sales, allowing proactive inventory planning for peak season
  • Uncovered 15 underperforming products accounting for 8% of inventory costs with minimal sales, recommended for clearance
  • Created automated SQL reporting queries reducing manual analysis time by 60%

Tools & Technologies

PostgreSQL SQL Server Window Functions CTEs Aggregate Functions Joins Data Cleaning Date Functions

View Full Code on GitHub

Access complete SQL scripts, sample data, and documentation

View Repository