Pratikchetry/retail-revenue-intelligence

🏪 Retail Revenue Intelligence & Anomaly Detection

End-to-End Retail Analytics Project with PostgreSQL, Tableau, Power BI, and AI-Ready Extensions


Python · PostgreSQL · Tableau · Power BI · Pandas · scikit-learn · Jupyter


Project Overview

This project builds a retail analytics system from the Online Retail II dataset and is designed to answer four business questions:

  1. What happened?
    Revenue trends, country concentration, product performance, and sales structure

  2. Why did it happen?
    Returns, adjustments, merchandise vs non-merchandise logic, and customer concentration

  3. What should be monitored next?
    Customer segmentation, anomaly detection, and time-series behavior

  4. How should reporting be built correctly?
    Through a validated PostgreSQL data layer connected to Tableau and supported by Power BI / DAX


Current Project Status

Completed

  • Raw data profiling
  • Data cleaning
  • Cleaning audit logging
  • Exploratory data analysis
  • Non-merchandise code identification
  • PostgreSQL installation and initial schema work

Current permanent data layers

  • Raw source workbook
  • Staging sales/returns/adjustments files
  • Profiling and cleaning report artifacts

Next

  • PostgreSQL loader rewrite using staging files only
  • Post-load validation against benchmark numbers
  • SQL views
  • RFM segmentation
  • Anomaly detection
  • Forecasting
  • Tableau dashboard
  • Power BI / DAX companion
  • Later extension with Rossmann and NOAA data

Dataset

Current core dataset

  • Online Retail II
  • Two sheets:
    • Year 2009-2010
    • Year 2010-2011

Source file fingerprint

  • MD5: ed54ccfc5d358481c399cc11d0a244be
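
The fingerprint above can be used to confirm that a local copy of the raw workbook is the same file the findings were computed from. The following is a minimal sketch using only the standard library; the function names `md5_of_file` and `matches_fingerprint` are illustrative and are not taken from the project's `src/` code.

```python
import hashlib
from pathlib import Path

# Fingerprint recorded in this README for the raw source workbook.
EXPECTED_MD5 = "ed54ccfc5d358481c399cc11d0a244be"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_fingerprint(path, expected=EXPECTED_MD5):
    """True when the file's MD5 equals the expected fingerprint (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_fingerprint("data/raw/online_retail_ii/online_retail_II.xlsx")` would be expected to return `True` for an unmodified copy of the workbook at the path shown in the project structure.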

Key Findings So Far

Raw profiling

  • 1,067,371 raw rows
  • 34,335 exact duplicates
  • 243,007 missing customer IDs
  • 4,382 missing descriptions
  • 22,950 negative-quantity rows
  • 5 negative-price rows
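
Counts like the ones above can be reproduced with a few pandas expressions. This is a sketch, not the notebook's actual code: it assumes the workbook's column names are `Description`, `Quantity`, `Price`, and `Customer ID`, and that "exact duplicates" means rows identical to an earlier row in every column. The helper names are hypothetical.

```python
import pandas as pd


def load_raw_workbook(path):
    """Read every sheet of the workbook and stack them into one frame.

    Assumes an Excel engine such as openpyxl is installed.
    """
    sheets = pd.read_excel(path, sheet_name=None)  # dict: sheet name -> DataFrame
    return pd.concat(list(sheets.values()), ignore_index=True)


def profile_raw(df):
    """Count the data-quality issues tracked in the raw profiling step."""
    return {
        "rows": len(df),
        # duplicated() flags every repeat of an earlier identical row.
        "exact_duplicates": int(df.duplicated().sum()),
        "missing_customer_ids": int(df["Customer ID"].isna().sum()),
        "missing_descriptions": int(df["Description"].isna().sum()),
        "negative_quantity_rows": int((df["Quantity"] < 0).sum()),
        "negative_price_rows": int((df["Price"] < 0).sum()),
    }


# Tiny synthetic frame (not project data) to show the shape of the result.
sample = pd.DataFrame(
    {
        "Invoice": ["489434", "489434", "489435", "C489436", "A506401"],
        "Description": ["MUG", "MUG", None, "BAG", "Adjust bad debt"],
        "Quantity": [2, 2, 1, -3, 1],
        "Price": [1.5, 1.5, 2.0, 4.0, -10.0],
        "Customer ID": [13085.0, 13085.0, None, 13086.0, None],
    }
)
print(profile_raw(sample))
```

If the profiling notebook counts duplicates differently (for example, counting every member of a duplicate group), the `exact_duplicates` figure from this sketch would not match the number reported above.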

Clean staging sales dataset

  • 1,007,914 valid sales rows
  • 20,476,634.02 total revenue
  • 11,205,149 total quantity sold
  • 40,078 unique invoices
  • 4,917 stock codes
  • 43 countries
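
The staging figures above are natural benchmark numbers for the planned post-load validation. The sketch below shows one way such a check might look; the metric names and the `validate_against_benchmarks` helper are assumptions for illustration, and the actual benchmarks live in `docs/validation_benchmarks.md`.

```python
import math

# Benchmark values copied from the clean staging sales figures in this README.
BENCHMARKS = {
    "sales_rows": 1_007_914,
    "total_revenue": 20_476_634.02,
    "total_quantity": 11_205_149,
    "unique_invoices": 40_078,
    "stock_codes": 4_917,
    "countries": 43,
}


def validate_against_benchmarks(actual, benchmarks=BENCHMARKS, abs_tol=0.01):
    """Return {metric: (expected, actual)} for each metric that is missing or off.

    abs_tol absorbs floating-point noise in the revenue figure; the integer
    counts still have to match exactly because they differ by at least 1.
    """
    mismatches = {}
    for name, expected in benchmarks.items():
        got = actual.get(name)
        if got is None or not math.isclose(got, expected, rel_tol=0.0, abs_tol=abs_tol):
            mismatches[name] = (expected, got)
    return mismatches
```

An empty result means the loaded data agrees with every benchmark. In practice, `actual` would be filled from aggregate queries against the PostgreSQL tables after the loader runs.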

Geographic concentration

  • United Kingdom revenue: 17,410,569.69
  • UK share of total revenue: 85.03%
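
As a quick arithmetic check, the share follows directly from the two revenue figures reported in this README. The variable names are illustrative.

```python
# Figures reported in this README.
uk_revenue = 17_410_569.69
total_revenue = 20_476_634.02

uk_share_pct = round(uk_revenue / total_revenue * 100, 2)
rest_of_world_revenue = round(total_revenue - uk_revenue, 2)
rest_of_world_share_pct = round(100 - uk_share_pct, 2)

print(f"UK share: {uk_share_pct:.2f}%")                        # UK share: 85.03%
print(f"Rest of world revenue: {rest_of_world_revenue:,.2f}")  # 3,066,064.33
print(f"Rest of world share: {rest_of_world_share_pct:.2f}%")  # 14.97%
```

The remaining 42 countries together account for roughly 15% of revenue, which is why country-level charts are likely to need the UK separated out or a log scale to stay readable.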

Seasonality

Top revenue months:

  • November → 2,968,159.92
  • October → 2,313,165.95
  • December → 2,281,745.01

Product interpretation risk

Confirmed non-merchandise codes:

  • M → Manual
  • DOT → DOTCOM POSTAGE
  • POST → POSTAGE
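
Because these codes represent fees and manual entries rather than products, product-level rankings are easier to interpret when they are separated out first. The sketch below shows one way to do that on rows represented as dictionaries; the `StockCode` key and both helper names are assumptions for illustration, and the set contains only the three codes confirmed so far.

```python
# Codes confirmed as non-merchandise in this README; others may exist in the data.
NON_MERCHANDISE_CODES = {"M", "DOT", "POST"}


def is_merchandise(stock_code):
    """True unless the code (trimmed, case-insensitive) is a known non-merchandise code."""
    return str(stock_code).strip().upper() not in NON_MERCHANDISE_CODES


def split_by_merchandise(rows, code_key="StockCode"):
    """Split rows into (merchandise, non_merchandise) lists, preserving order."""
    merchandise, non_merchandise = [], []
    for row in rows:
        target = merchandise if is_merchandise(row[code_key]) else non_merchandise
        target.append(row)
    return merchandise, non_merchandise


# Synthetic rows (not project data) to show the behaviour.
rows = [
    {"StockCode": "85123A", "Price": 2.55},
    {"StockCode": "POST", "Price": 18.00},
    {"StockCode": "dot ", "Price": 120.50},
    {"StockCode": "M", "Price": 5.00},
]
goods, fees = split_by_merchandise(rows)
print(len(goods), len(fees))  # 1 3
```

Matching is exact on the whole code, so a product code that merely starts with `M` is still treated as merchandise.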

Customer concentration

Customer-linked staging subset:

  • 779,425 rows
  • 5,878 identifiable customers

Top customer:

  • Customer ID 18102 → 580,987.04

Project Structure

retail-revenue-intelligence/
├── data/
│   ├── raw/
│   │   └── online_retail_ii/
│   │       └── online_retail_II.xlsx
│   └── staging/
│       ├── sales_main.csv
│       ├── returns_cancellations.csv
│       ├── accounting_adjustments.csv
│       └── non_merchandise_codes.csv
│
├── docs/
│   ├── ABSTRACT.md
│   ├── ABOUT_THE_ANALYST.md
│   ├── data_findings.md
│   ├── HOW_TO_READ_NOTEBOOKS.md
│   ├── TABLEAU_GUIDE.md
│   └── validation_benchmarks.md
│
├── notebooks/
│   ├── 01_data_understanding.ipynb
│   ├── 02_data_cleaning.ipynb
│   ├── 03_eda.ipynb
│   ├── 04_anomaly_detection.ipynb
│   ├── 05_forecasting.ipynb
│   ├── 06_rfm_segmentation.ipynb
│   └── 07_validation.ipynb
│
├── outputs/
│   └── reports/
│       ├── profiling_summary.json
│       └── cleaning_audit_log.csv
│
├── powerbi/
├── sql/
├── src/
├── tableau/
├── tests/
├── .env.example
├── .gitignore
├── README.md
└── requirements.txt
