Lazada Sales Analysis

Background

Our client wants to create discount promotions to increase the number of products sold and the market sales of Lazada. They are not sure which kinds of discount promotions would effectively achieve these goals. To help, we provide an analytical overview of the market as it relates to this question, then make a recommendation based on the analysis to help them decide which discount promotions to run.

Project Summary

Goal

Our goal is to find the most effective discount promotions that our client can apply to increase sales.

Objectives

To find the most effective discount promotions for our client, this project needs to:

  • Clean the raw data using various methods and make sure it is suitable for analysis.
  • Conduct exploratory data analysis (EDA) to get a market overview of Lazada.
  • Analyze products sold by brand and category.
  • Analyze Lazada's sales.
  • Make a recommendation based on the analytical results.

Dataset

  • Lazada Items

  • Contains 10,000+ rows of Lazada items, each with its ID, price, average rating, brand, category, etc.

  • Lazada Customer Review

  • Lazada reviews covering 59 months between April 2014 and October 2019. Contains 200,000+ rows of product reviews with rating, review, date, platform used, etc.

    Download Dataset

Data Cleaning

Before analyzing the data, we need to make sure it is clean so that the analysis is not biased. The cleaning steps are:

  • Remove irrelevant columns

  • The columns to remove from the items sheet are url, category, total review, retrieved date, and average rating. The category column doesn't represent the products correctly, so we remove it. From the review sheet, we remove category, name, original rating, review title, review content, like count, upvotes, downvotes, helpful, relevance score, and retrieved date.

  • Handle missing value

  • A missing-value check finds 4 rows with a missing brand name and 7,107 rows with a missing bought date. For the brand name, we fill in the value ourselves. For the bought date, we remove the rows with missing values.

  • Remove duplicates

  • This step applies only to the items sheet, so that no two items share the same ID. We skip it for the review sheet because rows that look duplicated could come from different transactions.

  • Convert data type

  • Convert item ID from integer (number) to varchar and bought date from dd-mm-yyyy to yyyy-mm-dd format. We also split the date into day, month, and year so that it is easier to process in Python for visualization later.

  • Typo check

  • We need to check for typos, especially in the brand name column, because it contains so many different brand names. In short, we:

    • Remove spaces before and after the brand name with "trim"
    • Check for brand names with 2 or more spaces and shorten them
    • Replace "-" and "_" with a space and remove unnecessary special characters
    • Change similar brand names that are actually the same brand into exactly the same name
    • Capitalize the first letter of brand names with 4 or more letters
    • Uppercase brand names with 3 or fewer letters
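The cleaning steps above can be sketched in pandas. This is an illustration rather than the project's actual code: the original cleaning appears to have been done in SQL, and the column names (`item_id`, `brand_name`, `bought_date`), the alias table, and the set of characters treated as "unnecessary" are all assumptions. The "2 or more spaces" rule is read here as collapsing repeated spaces.

```python
import re
from typing import Dict, Optional

import pandas as pd


def normalize_brand(name: str, aliases: Optional[Dict[str, str]] = None) -> str:
    """Apply the brand-name rules from the list above to one value."""
    text = re.sub(r"[-_]", " ", name)            # "-" and "_" become spaces
    text = re.sub(r"[^0-9A-Za-z& ]", "", text)   # assumed definition of "unnecessary" characters
    text = re.sub(r"\s+", " ", text).strip()     # trim, and collapse repeated spaces
    if aliases:
        # Map known variants (keyed in lowercase) to one canonical spelling.
        text = aliases.get(text.lower(), text)
    if len(text) <= 3:
        return text.upper()                      # 3 letters or fewer: all uppercase
    return text[0].upper() + text[1:]            # otherwise: capitalize the first letter


def clean_items(items: pd.DataFrame, aliases: Optional[Dict[str, str]] = None) -> pd.DataFrame:
    """Drop duplicate item IDs, cast the ID to string, and normalize brand names."""
    out = items.drop_duplicates(subset="item_id").copy()
    out["item_id"] = out["item_id"].astype(str)
    # Missing brand names are assumed to have been filled in by hand beforehand;
    # any non-string value is left untouched.
    out["brand_name"] = out["brand_name"].map(
        lambda b: normalize_brand(b, aliases) if isinstance(b, str) else b
    )
    return out


def clean_reviews(reviews: pd.DataFrame) -> pd.DataFrame:
    """Drop rows without a bought date and rewrite dd-mm-yyyy as yyyy-mm-dd."""
    out = reviews.dropna(subset=["bought_date"]).copy()
    parsed = pd.to_datetime(out["bought_date"], format="%d-%m-%Y")
    out["bought_date"] = parsed.dt.strftime("%Y-%m-%d")
    out["day"] = parsed.dt.day
    out["month"] = parsed.dt.month
    out["year"] = parsed.dt.year
    out["item_id"] = out["item_id"].astype(str)
    return out
```

With a hypothetical alias table such as `{"san disk": "sandisk"}`, the value `" san-disk "` is trimmed, has its hyphen replaced, is mapped to the canonical spelling, and then has its first letter capitalized. Note that the character filter also strips accented letters, which may not be wanted for every brand.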

Merging Data

Merging data and categorizing items

In this part, we categorize the items automatically based on the product name and then merge the items and review data.
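A minimal sketch of how this step could look, assuming a keyword lookup on the product name and an inner join on the item ID. The keyword table, the "Memory Card" category, the "Other" fallback, and the column names (`item_id`, `name`) are hypothetical; the source does not give the actual rules or the join type.

```python
import re

import pandas as pd

# Hypothetical keyword-to-category rules; the real rule set is not given in the source.
CATEGORY_KEYWORDS = {
    "TV": ["tv", "television"],
    "Memory Card": ["microsd", "memory card"],
}


def categorize(product_name: str) -> str:
    """Return the first category whose keyword appears as a whole word in the name."""
    lowered = product_name.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
                return category
    return "Other"  # assumed fallback for names that match no rule


def merge_items_reviews(items: pd.DataFrame, reviews: pd.DataFrame) -> pd.DataFrame:
    """Attach a derived category to each item, then join reviews to items by ID."""
    categorized = items.assign(category=items["name"].map(categorize))
    # Inner join (an assumption): reviews for items missing from the items sheet are dropped.
    return reviews.merge(categorized, on="item_id", how="inner")
```

Matching on whole words avoids tagging a product as "TV" just because the letters "tv" occur inside a longer word. Both frames need `item_id` in the same type (string, after the cleaning step) for the join keys to match.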

Convert date type

The bought date column needs to be converted from object to datetime so that it behaves as a date in further analysis. We convert it only now because the SQL step could not store it as a datetime (some engines, such as SQLite, have no dedicated datetime type), whereas after merging the data we can convert it with Python's pandas library.
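The conversion can be done with `pandas.to_datetime`. The sketch below uses a tiny stand-in frame in place of the merged data, and the column name `bought_date` is an assumption; the dates are in the yyyy-mm-dd format produced by the cleaning step.

```python
import pandas as pd

# Stand-in for the merged data; the real frame has many more columns.
merged = pd.DataFrame({
    "item_id": ["1", "2"],
    "bought_date": ["2019-03-27", "2019-09-09"],
})

# Parse the text column into real datetimes; an explicit format avoids ambiguity.
merged["bought_date"] = pd.to_datetime(merged["bought_date"], format="%Y-%m-%d")

# Once the column is datetime, the .dt accessor exposes date parts directly.
merged["month"] = merged["bought_date"].dt.month
merged["weekday"] = merged["bought_date"].dt.day_name()
print(merged.dtypes)
```

Passing `errors="coerce"` to `to_datetime` would turn unparseable values into `NaT` instead of raising, which can help when the column still contains malformed dates.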

EDA (Exploratory Data Analysis)

Descriptive analytics

General statistics for the data in each column: mean, median, mode, count, distinct values, standard deviation, etc.
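As a rough sketch, these statistics can be collected per column with pandas. The helper name `describe_column` is hypothetical, and the example values are made up for illustration.

```python
import pandas as pd


def describe_column(series: pd.Series) -> dict:
    """Summarize one column: count, distinct values, mode, and numeric statistics."""
    summary = {
        "count": int(series.count()),        # non-missing values
        "distinct": int(series.nunique()),   # number of distinct values
        "mode": series.mode().iloc[0],       # most frequent value (first one on ties)
    }
    if pd.api.types.is_numeric_dtype(series):
        summary["mean"] = float(series.mean())
        summary["median"] = float(series.median())
        summary["std"] = float(series.std())  # sample standard deviation (ddof=1)
    return summary


# Made-up prices, only to show the shape of the result.
prices = pd.Series([10.0, 20.0, 20.0, 50.0])
print(describe_column(prices))
```

The helper assumes the column has at least one non-missing value. For a quick overview of a whole frame, `DataFrame.describe(include="all")` reports most of the same figures in one call.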

Monthly transaction

Although 2014 lacks data for January through March, those three months do not have the lowest sales. Sales peak in September, then plummet by 71.32% in October, which has the lowest sales.
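The drop quoted above is a percent change between two monthly totals, (current - previous) / previous × 100. A minimal sketch, assuming orders are counted per calendar month across all years; the function names are hypothetical.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable


def orders_per_month(dates: Iterable[date]) -> Dict[int, int]:
    """Count orders per calendar month (1-12), pooling all years together."""
    counts = Counter(d.month for d in dates)
    return {month: counts.get(month, 0) for month in range(1, 13)}


def percent_change(previous: int, current: int) -> float:
    """Percent change from previous to current; negative values are drops."""
    if previous == 0:
        raise ValueError("percent change is undefined when the previous value is 0")
    return (current - previous) * 100.0 / previous
```

Applied to the September and October totals of the real data, `percent_change` would give the month-over-month figure reported above.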

Client Used

Most customers used the Android app to access Lazada. The remaining 15% used desktop, iOS, and other clients.

Top Selling Brand

Among the top 10 best-selling brands, SanDisk is in first place with 40,475 total orders, or 20.58% of orders across all brands.

Top Selling Category

TV is the best-selling category with 84,589 total orders, or 43.01% of orders across all categories.
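The brand and category shares above are order counts divided by the total number of orders. A minimal sketch with the standard library, assuming one value per order; the function name `top_shares` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_shares(values: Iterable[str], n: int = 10) -> List[Tuple[str, int, float]]:
    """Return the n most frequent values as (value, count, percent of all orders)."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (value, count, round(100.0 * count / total, 2))
        for value, count in counts.most_common(n)
    ]
```

Passing the brand column of the merged data gives the brand ranking, and passing the category column gives the category ranking. In pandas, `Series.value_counts(normalize=True)` yields the same proportions.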

Week to week activity

From April 2018 to October 2019, the 3 months with the highest activity are July, August, and September. The highest total for a single day, around 9,700 orders, falls on a Monday in September.

Total order by month

The chart above shows total orders by month from April 2014 to October 2019. In general, total orders in the marketplace are increasing, with a few moments of significant increase: December 2018, March 2019, May 2019, July 2019, and September 2019. We can look into those moments to find promotions that have already proven effective at increasing total orders. To do that, we break down each of those months to see what happened at the time.

In 2018, total orders increased significantly on 12 December. On that day, Lazada ran its 12.12 promotion, in the form of flash sales, discount vouchers, and free shipping.

Total orders increased significantly on 27 March 2019, which was Lazada's 7th birthday. The promotions were flash sales, a big sale, group buys, and discount vouchers.

On 14 May 2019, the significant increase in total orders was caused by Ramadan promotions such as flash sales, discount vouchers, paylater payment, and cashback.

Another significant increase in total orders happened on 12 July 2019, during the Lazada mid-year festival, with promotions such as flash sales, a 99.9% discount, discount vouchers, paylater payment, and cashback.

Lastly, on 9 September 2019, the significant increase in total orders was caused by the 9.9 promotion. It offered the same kinds of promotions: flash sales, discount vouchers, paylater, and cashback.
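The source does not say how a "significant increase" was identified, so the following is only one possible way to flag such days automatically: compare each day's total with the median of the preceding days. The window of 7 days and the factor of 2 are arbitrary assumptions, and the function name is hypothetical.

```python
from datetime import date
from statistics import median
from typing import Dict, List


def find_spikes(daily_orders: Dict[date, int], window: int = 7, factor: float = 2.0) -> List[date]:
    """Return days whose total is at least `factor` times the trailing median."""
    days = sorted(daily_orders)
    spikes = []
    for i in range(window, len(days)):
        # Baseline: median of the `window` recorded days just before this one.
        baseline = median(daily_orders[d] for d in days[i - window:i])
        if baseline > 0 and daily_orders[days[i]] >= factor * baseline:
            spikes.append(days[i])
    return spikes
```

The median is used as the baseline so that one earlier spike does not inflate it. Days with no recorded orders are simply absent from the input, so the window covers recorded days rather than calendar days.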