|
Custom Search
| |
|
How to generate Cumulative Sum (running total) by MySQL - Part 1 Cumulative sum, also known as running total or Year to Date (YTD) total, is a sum of data that is related to a time sequence such as daily, monthly, or yearly. But running total could also be calculated if the dataset does not have a time sequence related column. For example, below is a list of daily sales and running total.
For calculating running total in SQL, at a high level, the concept includes (1) rank data within a GROUP, (2) RANK BY a column, and (3) calculate running total by using SUM. For the daily sales illustrated above, the GROUP is the entire Northwind Trader company. The RANK BY column is OrderDate. SUM is applied to DailySales. To translate the calculation in plain language, we can say that we want to RANK daily total sales BY order date within Northwind Trader and calculate running total for each order date. It may not look too obvious what the GROUP is. To better understand the concept, consider we have 2 Northwind Traders - Northwind A and Northwind B. So for each Northwind company, we want to RANK daily total sales BY order date within each Northwind Trader and calculate running total for each order date. Then the running total is calculated within each Northwind - Northwind A and Northwind B. To generate a running total (or cumulative sum), please consider the following guidelines.
Now let's use two practical examples to see how to generate running totals. Practice 1: rank within a single GROUP which is the entire Northwind company 1.1 Create the basic dataset. The result of the basic dataset is the table we want to calculate the running total for. -- Basic set select b.OrderID, b.OrderDate, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID group by b.OrderDate order by b.OrderDate;
480 records returned. 1.2 Generate running total by self join To calculate running total, join the above query to itself. Please note that, the join column must be a column that can uniquely rank every record by a time sequence. Here we used t1.OrderDate >= t2.OrderDate. If we use t1.OrderID >= t2.OrderID, it'll produce the same result because OrderID is sequentially incremented as time goes by. select DATE_FORMAT(t1.OrderDate, "%M %d %Y") as OrderDate, t1.Daily_Sales, sum(t2.Daily_Sales) as RunningTotal from ( select b.OrderID, b.OrderDate, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID group by b.OrderDate ) as t1 inner join -- this is self join ( select b.OrderID, b.OrderDate, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID group by b.OrderDate ) as t2 on t1.OrderDate >= t2.OrderDate -- this should also work -- on t1.OrderID >= t2.OrderID group by t1.OrderDate order by t1.OrderDate;
480 records returned. Below is equivalent version as above. To simplify the two queries above, we can create a physical table by using the basic dataset, then work out the running total from this new table. -- Create a new table from the basic set. create table NorthwindSales as select b.OrderID, b.OrderDate, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Daily_Sales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID group by b.OrderDate order by b.OrderDate; -- Work out the running total from the new table. select DATE_FORMAT(t1.OrderDate, "%M %d %Y") as OrderDate, t1.Daily_Sales, sum(t2.Daily_Sales) as RunningTotal from NorthwindSales as t1 inner join NorthwindSales as t2 on t1.OrderDate>=t2.OrderDate group by t1.OrderDate order by t1.OrderDate; Practice 2: rank within multiple GROUPs which are the product categories 2.1 Create the basic dataset. -- Basic set select c.CategoryName, year(b.OrderDate) as OrderYear, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as YearlySales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID inner join Products as p on a.ProductID=p.ProductID inner join Categories as c on c.CategoryID=p.CategoryID group by c.CategoryName,year(b.OrderDate) order by c.CategoryName, OrderYear;
24 records returned. 2.2 Generate running total Within each product category, calculate running total for each order year. Please note that OrderYear is a naturally available rank value that orders YearlSales for each year, so we code it in the join clause like this t1.OrderYear >= t2.OrderYear. If we need to calculate a running total where a naturally available rank value does not exist, we'll have to use MySQL RANK(), DENSE_RANK(), or ROW_NUMBER() function to create the rank value. select t1.CategoryName, t1.OrderYear, t1.Yearly_Sales, sum(t2.Yearly_Sales) as RunningTotal from ( select c.CategoryName, year(b.OrderDate) as OrderYear, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Yearly_Sales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID inner join Products as p on a.ProductID=p.ProductID inner join Categories as c on c.CategoryID=p.CategoryID group by c.CategoryName,year(b.OrderDate) ) as t1 inner join ( select c.CategoryName, year(b.OrderDate) as OrderYear, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as Yearly_Sales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID inner join Products as p on a.ProductID=p.ProductID inner join Categories as c on c.CategoryID=p.CategoryID group by c.CategoryName,year(b.OrderDate) ) as t2 on t1.CategoryName=t2.CategoryName and t1.OrderYear >= t2.OrderYear group by t1.CategoryName,t1.OrderYear order by t1.CategoryName,t1.OrderYear;
24 records returned. Below is equivalent version as above. -- This is equivalent version but create a new table for the basic dataset create table NorwindSales2 as select c.CategoryName, year(b.OrderDate) as OrderYear, round(sum(a.UnitPrice*a.Quantity*(1-a.Discount)),2) as YearlySales from Order_Details as a inner join Orders b on a.OrderID = b.OrderID inner join Products as p on a.ProductID=p.ProductID inner join Categories as c on c.CategoryID=p.CategoryID group by c.CategoryName,year(b.OrderDate) order by c.CategoryName, OrderYear; -- Calculate running total select t1.CategoryName, t1.OrderYear, t1.YearlySales, sum(t2.YearlySales) as RunningTotal from NorwindSales2 as t1 inner join NorwindSales2 as t2 on t1.CategoryName=t2.CategoryName and t1.OrderYear >= t2.OrderYear group by t1.CategoryName,t1.OrderYear order by t1.CategoryName,t1.OrderYear;
Happy Coding!
Other tutorials in this category 1. MySQL Northwind Queries - Part 1 |
Copyright © 2024 GeeksEngine.com. All Rights Reserved. This website is hosted by HostGator. No portion may be reproduced without my written permission. Software and hardware names mentioned on this site are registered trademarks of their respective companies. Should any right be infringed, it is totally unintentional. Drop me an email and I will promptly and gladly rectify it. |
Home | Feedback | Terms of Use | Privacy Policy |