Instacart Basket

Uncovering information about sales patterns through Python analysis

Background

Instacart is an online grocery store that operates through an app. Its stakeholders are most interested in the variety of customers in their database. They want to target different customer groups with relevant marketing campaigns and see whether those campaigns affect product sales and purchasing behavior.

Objective

Perform an initial data and exploratory analysis to derive insights and suggest strategies for better segmentation based on the provided criteria.

Skills

Data cleaning, integration & transformation

Python

Data wrangling and merging

Deriving variables

Grouping datasets

Aggregating data

Reporting in Excel

Tools

PowerPoint

Python

Tableau

Excel

Key Questions

What are the busiest days of the week and hours of the day?

How does the frequency of product orders differ across product departments?

What types of customers are in the system, and how do their ordering behaviors differ?

Analysis Process & Visualization

Data Cleaning

Use Python for data cleaning, merging, deriving variables, and grouping datasets.

For detailed Python execution, please refer to the link: Python_code
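As a rough illustration of the four steps named above (cleaning, merging, deriving variables, grouping), here is a minimal sketch. It assumes pandas is the library in use. The table names, column names, and toy rows are all hypothetical and are not taken from the project's code.

```python
import pandas as pd

# Hypothetical toy tables; the project's real column names may differ.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "user_id": [10, 11, 11, 12],
    "order_hour_of_day": [9, 14, 14, None],
})
customers = pd.DataFrame({
    "user_id": [10, 11, 12],
    "age": [25, 47, 70],
})

# Cleaning: drop exact duplicate rows, then rows missing the order hour.
orders_clean = orders.drop_duplicates().dropna(subset=["order_hour_of_day"])

# Merging: attach customer attributes to each remaining order.
merged = orders_clean.merge(customers, on="user_id", how="inner")

# Deriving a variable: flag orders placed before noon.
merged["is_morning"] = merged["order_hour_of_day"] < 12

# Grouping: count orders per user.
orders_per_user = merged.groupby("user_id")["order_id"].count()
```

An inner merge keeps only orders with a matching customer record. Whether that is appropriate depends on how complete the customer table is.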

Population Flow

Create a population flow chart to make the data cleaning process clearer.

Time Series Analysis

Use bar charts to analyze the order volume for different time periods.

Use line charts to analyze the average and total spending for different time periods.
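The aggregates behind these charts could be computed along the following lines. This is a sketch assuming pandas. The column names (`order_hour_of_day`, `prices`) and the toy rows are assumptions, not the project's data.

```python
import pandas as pd

# Hypothetical line-item data: one row per product in an order.
orders_products = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3],
    "order_hour_of_day": [10, 10, 10, 15, 15],
    "prices": [4.0, 6.0, 8.0, 2.0, 5.0],
})

# Order volume per hour: count distinct orders (input for a bar chart).
orders_by_hour = orders_products.groupby("order_hour_of_day")["order_id"].nunique()

# Average and total spending per hour (input for line charts).
spend_by_hour = orders_products.groupby("order_hour_of_day")["prices"].agg(["mean", "sum"])
```

Counting distinct `order_id` values rather than rows avoids inflating the volume of orders that contain many items.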

Product Categorization

Use a bar chart to display the order volume for products from each department.

Categorize products into three classes based on price ranges and use a bar chart to display the order volume for each class.
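One way to assign the three price classes is with boolean masks, sketched below assuming pandas. The cut-offs of 5 and 15 and the class labels are placeholders chosen for illustration, since the actual price ranges are not stated here.

```python
import pandas as pd

# Hypothetical product prices.
products = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "prices": [2.5, 5.0, 9.9, 20.0],
})

# Assumed thresholds (not from the project): <= 5 is low, > 15 is high.
LOW_MAX = 5
HIGH_MIN = 15

products["price_class"] = "mid"  # default class
products.loc[products["prices"] <= LOW_MAX, "price_class"] = "low"
products.loc[products["prices"] > HIGH_MIN, "price_class"] = "high"

# Number of products per class (input for a bar chart).
class_counts = products["price_class"].value_counts()
```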

User Segmentation

Segment users by region, loyalty, age, income level, and family size.

Consumption Behavior Pattern

Analyze user consumption behavior based on order quantity proportion, purchase frequency distribution, spending price distribution, and preferences for different product departments.

Conclusion & Recommendation

Key Question 1

The stakeholders want to know whether there are particular times of the day when people spend the most money, as this might inform the type of products they advertise at these times.

Findings:


The average spending fluctuates minimally throughout the day, generally staying between $7.7 and $7.9. The cumulative spending data shows that the peak spending period for users is between 10 AM and 2 PM.

Recommendations:


Given that the cumulative spending is highest from 10 AM to 2 PM, it is advisable to promote high-value or high-profit products during these times. Users may have a higher willingness to purchase during these hours, and the return on advertising investment could be more favorable.


Key Question 2


Are there certain types of products that are more popular than others? The marketing and sales teams want to know which departments have the highest frequency of product orders.

Findings:


The departments with the highest sales are Department 4 (produce), Department 16 (dairy eggs), and Department 19 (snacks), indicating very high demand for products in these categories.
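A ranking like this can be produced with a simple frequency count. The sketch below assumes pandas and a `department_id` column. The toy rows are invented and do not reproduce the project's counts. Only the three id-to-name pairs come from the finding above.

```python
import pandas as pd

# Department names as given in the finding above.
department_names = {4: "produce", 16: "dairy eggs", 19: "snacks"}

# Hypothetical ordered items, one row per product ordered.
orders_products = pd.DataFrame({"department_id": [4, 4, 4, 16, 16, 19, 7]})

# Order frequency per department, sorted from most to least frequent.
dept_counts = orders_products["department_id"].value_counts()

# Top departments with readable labels where a name is known.
top_two = dept_counts.head(2)
top_two_named = {department_names.get(d, str(d)): int(n) for d, n in top_two.items()}
```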

Recommendations:


Strengthen high-sales departments:

Ensure sufficient inventory: For high-sales departments such as produce, dairy eggs, and snacks, it is crucial to maintain an adequate supply of inventory to meet the continuous high demand.

Intensify marketing efforts: Increase marketing efforts for these departments to leverage their high traffic. This could include more prominent placement in stores, targeted online advertising, promotional campaigns, and cross-promotional deals that can enhance product visibility and sales potential.


Age Profile

Findings:


I categorized users by age group, labeling those under 30 as 'young,' those aged 30 to 65 as 'middle,' and those 65 and older as 'senior'.
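The labeling could be done with `pd.cut`, as in this sketch (assuming pandas). Because the stated ranges overlap at exactly 65, the code assumes that a 65-year-old counts as 'senior'. The `age` column name and the sample ages are hypothetical.

```python
import pandas as pd

# Hypothetical customer ages, chosen to sit on the group boundaries.
customers = pd.DataFrame({"age": [29, 30, 64, 65]})

# right=False makes each bin include its left edge:
# [0, 30) -> young, [30, 65) -> middle, [65, inf) -> senior (assumed boundary).
customers["age_group"] = pd.cut(
    customers["age"],
    bins=[0, 30, 65, float("inf")],
    labels=["young", "middle", "senior"],
    right=False,
)
```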

Differences in Order Quantity:

Middle-aged users (ages 30-65) place the highest number of orders.
Senior users (ages 65 and above) place the second highest number of orders.
Young users (under 30) place the fewest orders.

Differences in Transaction Prices:

The average and median prices across all age groups are very close, indicating similar price acceptance levels across age groups.

Differences in Purchase Frequency:

Purchase frequency is measured by the interval between purchases, so a shorter interval means more frequent shopping. The interval is longest among middle-aged users, followed by young users, and shortest among seniors. This suggests that young and senior users tend to shop more frequently, while middle-aged users may prefer periodic bulk purchases.
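Since frequency is read off purchase intervals here, a shorter interval corresponds to more frequent shopping. The sketch below shows one way to compare groups, assuming pandas and a `days_since_prior_order` column, which is an assumed name. The numbers are invented toy values and are not the project's results.

```python
import pandas as pd

# Invented toy intervals (in days) between consecutive orders.
orders = pd.DataFrame({
    "age_group": ["young", "young", "middle", "middle", "senior", "senior"],
    "days_since_prior_order": [7.0, 9.0, 12.0, 14.0, 5.0, 7.0],
})

# Mean interval per group, sorted ascending: shortest interval first.
mean_interval = (
    orders.groupby("age_group")["days_since_prior_order"].mean().sort_values()
)

# The group with the shortest mean interval shops most frequently.
most_frequent_group = mean_interval.index[0]
```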

Differences in Consumption Preferences:

When comparing the consumption proportion of each department among different age groups, there is no significant difference from the overall consumption proportion across departments (without age segmentation).

Produce (department_id 4) has the highest proportion of consumption across all age groups. Additionally, the Senior age group has a slightly higher proportion of spending in this department compared to other age groups. Dairy eggs (department_id 16) also has a high consumption proportion across all age groups, with the Young age group having the highest proportion of spending within this department.
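Department shares per age group can be compared with a row-normalized cross-tabulation, sketched here assuming pandas. The column names and toy rows are hypothetical, and the sketch counts ordered items, whereas the analysis above may have weighted by spending.

```python
import pandas as pd

# Hypothetical ordered items tagged with the buyer's age group.
df = pd.DataFrame({
    "age_group": ["young", "young", "senior", "senior", "senior", "senior"],
    "department_id": [4, 16, 4, 4, 4, 16],
})

# normalize="index" makes each row (age group) sum to 1,
# so each cell is that department's share within the group.
share = pd.crosstab(df["age_group"], df["department_id"], normalize="index")
```

Reading across a row gives one group's department mix, and reading down a column compares one department across groups.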


Recommendations:


For Senior Users:

Intensify promotions for health and nutrition products to meet the elderly's interest in healthy eating.


For Middle-aged Users:
Promote products suitable for the whole family's tastes and large packaging goods to meet family needs. Offer family packs and multi-item discounts to encourage bulk purchasing.


For Young Users:
Employ digital marketing strategies to more effectively attract young consumers, such as promoting healthy snacks and trendy beverages on social media.