Consumer Buying Behaviour Pattern Prediction Using Artificial Neural Network for Automobiles Sector
Neural networks can help you retain your customers and even predict whether a new customer will buy a car.
The gist of the Model We Will Be Creating Today
Introduction
It is no secret that many businesses prioritise client retention; obtaining new customers may be many times more expensive than maintaining existing ones.
Furthermore, knowing why customers leave and calculating the risk associated with specific customers are also important components of developing a data-driven retention strategy.
As a result, Consumer Behaviour Analysis is used to forecast which existing customers will buy again.
The model built here will also help predict whether a new customer will buy a car.
Customer Analysis?
Customer analytics is crucial for developing a comprehensive understanding of consumers’ purchase behaviours, use trends, demographic distribution, and profitability.
Organizations must invest considerable time and money in learning about their customers and evaluating the data generated by their interactions with them.
How is consumer behaviour analysis performed?
On a small scale, or with a small amount of data, this may be done manually by studying the data.
When the customer dataset is massive and it is nearly impossible to search individual rows by hand and uncover patterns in the data, Machine Learning comes to the rescue!
Machine Learning has various built-in methods and models that make the analysis procedure as simple as possible.
To carry out this study, data scientists employ a variety of frameworks.
One of the frameworks is as follows:
STP
STP is an acronym that stands for Segmentation, Targeting, and Positioning.
It’s a three-step marketing strategy.
The STP approach makes it simple to segment the market, target consumers, and position the offering inside each segment.
Segmentation
It is the process of categorising a population, potential or present consumers, into groups with comparable characteristics.
Customers within a group will have similar purchasing habits.
Each segment is therefore likely to respond in a similar way to a given marketing initiative.
Targeting
It is the appraisal of the potential income from each segment and the selection of which segments to focus on.
Consider criteria for choosing whether to address the entire segment or only a portion of it.
Positioning
It is the step that follows choosing which segments to target.
In marketing, positioning is a strategic process that comprises creating an identity for a brand or product in the minds of potential purchasers.
Customer segmentation with the STP framework can be carried out in Python using techniques such as PCA, hierarchical clustering, and the K-Means algorithm, among others.
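As a minimal sketch of the segmentation step, the snippet below groups a handful of made-up customers with scikit-learn's KMeans. The customer values, the choice of two clusters, and the variable names are illustrative assumptions and are not taken from the dataset used later in this article.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customers: [age, annual salary]. Values are made up for illustration.
customers = np.array([
    [22, 20000],
    [25, 24000],
    [27, 22000],
    [48, 95000],
    [52, 110000],
    [55, 102000],
])

# Scale first so that salary does not dominate the distance calculation.
scaled = StandardScaler().fit_transform(customers)

# Assumption: two segments. In practice the number of clusters has to be chosen.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
segments = kmeans.fit_predict(scaled)

for row, segment in zip(customers, segments):
    print(f"age={row[0]}, salary={row[1]} -> segment {segment}")
```

The cluster numbers themselves are arbitrary labels; what matters is which customers end up grouped together.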
RFM
Another method is RFM, an acronym that stands for Recency, Frequency, and Monetary.
RFM segmentation is a powerful method for identifying customer groups that should be addressed differently. It lets marketers target particular groups of customers with communications that are considerably more relevant to their individual habits, which tends to yield higher response rates as well as increased loyalty and customer lifetime value.
RFM has an advantage over other segmentation models in that it uses objective numerical scales to generate a high-level image of customers that is both concise and informative. Furthermore, marketers may use it without the need for pricey tools. And, most importantly, the output of the segmentation process is straightforward to interpret and evaluate.
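To make the three RFM numbers concrete, here is a small sketch that computes them from a purchase log using only the Python standard library. The purchase records, customer IDs, and snapshot date are all hypothetical, chosen only to illustrate the calculation.

```python
from datetime import date

# Hypothetical purchase log: (customer_id, purchase_date, amount). Made-up values.
purchases = [
    ("C1", date(2023, 1, 10), 1200.0),
    ("C1", date(2023, 3, 5), 800.0),
    ("C2", date(2022, 11, 20), 300.0),
    ("C3", date(2023, 3, 28), 2500.0),
    ("C3", date(2023, 2, 14), 1500.0),
    ("C3", date(2023, 1, 2), 700.0),
]
snapshot = date(2023, 4, 1)  # assumed "today" for the recency calculation

rfm = {}
for customer, day, amount in purchases:
    entry = rfm.setdefault(customer, {"last": day, "frequency": 0, "monetary": 0.0})
    entry["last"] = max(entry["last"], day)   # most recent purchase date
    entry["frequency"] += 1                   # number of purchases
    entry["monetary"] += amount               # total amount spent

for customer, entry in sorted(rfm.items()):
    # Recency: days since the customer's most recent purchase.
    entry["recency"] = (snapshot - entry["last"]).days
    print(customer, entry["recency"], entry["frequency"], entry["monetary"])
```

A lower recency and a higher frequency and monetary value usually mark the customers worth prioritising.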
Python Model
That was a lot of theory 😄. Now for the most awaited part: the Neural Network Prediction Model.
IDE: Visual Studio Code
Problem Statement
A business needs to know whether an existing or new consumer is interested in purchasing an automobile.
We must predict this likelihood from the dataset provided.
About the Dataset
The dependent variable is whether or not the consumer purchased the products, whereas the independent variables are UserID, Gender, Age, and Salary.
Data has 5 Columns and 401 Rows
Approach
Only clients who are able to purchase the items can be targeted by the marketing team.
As a result, we will forecast 0 or 1 as the outcome signifying whether or not the consumer will purchase a car.
Algorithm Selected
Many algorithms can be used for Consumer Behaviour Pattern Prediction Analysis, but in my opinion neural networks are the best choice. Why neural networks? The human brain has billions of neurons that receive signals from the different senses, help the brain decode them, and help it react accordingly. Similarly, a neural network is an artificial brain made up of interconnected artificial neurons that decode the input signals and help the machine react.
There are three main types of neural networks:
- Artificial neural networks (ANNs)
- Recurrent neural networks (RNNs)
- Convolutional neural networks (CNNs)
ANNs are the simplest type of neural network. They consist of an input layer, one or more hidden layers, and an output layer. The input layer receives the inputs, the hidden layers perform the computations, and the output layer produces the results.
RNNs are similar to ANNs, but they have recurrent connections that carry information from one step to the next, which allows them to process sequences of data. This makes them well-suited for tasks such as speech recognition or machine translation and for time series data.
CNNs are designed primarily to process images. They have an input layer, a series of convolutional layers, and an output layer. The convolutional layers extract features from the images, and the output layer produces the results.
Our dataset is tabular rather than sequential or image-based. Hence, we will be using an ANN.
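To illustrate what "input layer, hidden layer, output layer" means in practice, here is a rough NumPy sketch of a single forward pass through a tiny network. The layer sizes, random weights, and input values are assumptions made for illustration; this is not the Keras model built below.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# One scaled customer with two features (age, salary); values are made up.
x = np.array([[0.5, -1.2]])

# Randomly initialised weights: 2 inputs -> 3 hidden neurons -> 1 output.
W_hidden = rng.normal(size=(2, 3))
b_hidden = np.zeros(3)
W_out = rng.normal(size=(3, 1))
b_out = np.zeros(1)

# Hidden layer: weighted sum followed by a ReLU activation.
hidden = np.maximum(0.0, x @ W_hidden + b_hidden)

# Output layer: weighted sum followed by a sigmoid, giving a value in (0, 1).
probability = sigmoid(hidden @ W_out + b_out)

print(hidden.shape, probability.shape)
```

Training a network means adjusting those weights so that the output moves closer to the true label.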
Let’s get started:
The first step is to import the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import confusion_matrix, accuracy_score
The second step is to load the dataset
dataset = pd.read_csv('Social_Network_Ads.csv')
The next step is to check the data that has been loaded
dataset.head()
The next step is to set the dependent variable(s) & independent variable(s)
X = dataset.iloc[:, 2:4].values
y = dataset.iloc[:, -1].values
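To see what those two slices pick out, here is a small sketch on a toy frame that mirrors the column layout described earlier. The rows are made up, and the column name "Purchased" is an assumption for the dependent variable.

```python
import pandas as pd

# Toy frame mirroring the described column layout; the values are made up.
toy = pd.DataFrame({
    "UserID": [101, 102, 103],
    "Gender": ["Male", "Female", "Female"],
    "Age": [19, 35, 47],
    "Salary": [19000, 60000, 25000],
    "Purchased": [0, 0, 1],   # assumed name for the dependent variable
})

X_toy = toy.iloc[:, 2:4].values   # columns at positions 2 and 3: Age, Salary
y_toy = toy.iloc[:, -1].values    # last column: the dependent variable

print(list(toy.columns[2:4]))
print(X_toy.shape, y_toy.shape)
```

Note that the slice 2:4 leaves out UserID and Gender, so only Age and Salary are fed to the model.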
The next step is to check the data in both variables i.e., dependent & independent
print(X)
print(y)
The next step is to split the data into train and test
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
print("Splitting Complete - - - 100%")
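As a quick illustration of what test_size and random_state do, the sketch below splits ten made-up rows. The array contents and variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten made-up rows with two features each, plus alternating labels.
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([0, 1] * 5)

# test_size=0.2 holds back 20% of the rows (2 of 10) for testing.
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
print(X_tr.shape, X_te.shape)

# The same random_state gives the same split on a second call.
X_tr2, X_te2, _, _ = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
print(np.array_equal(X_te, X_te2))
```

Fixing random_state makes the experiment reproducible, which helps when comparing models.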
The next step is to apply the standard scaler so that features with very different ranges, such as age and salary, are on a comparable scale.
Note:
Scaling does not distort the information in the data or make it biased. It puts the features on a uniform scale while preserving their relative ordering, so that the model can learn from the data efficiently and effectively.
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
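The arithmetic behind standard scaling can be sketched by hand with NumPy: subtract the mean of each column and divide by its standard deviation. The rows below are made up for illustration.

```python
import numpy as np

# Made-up training rows: [age, salary].
train = np.array([[20.0, 20000.0], [30.0, 50000.0], [40.0, 80000.0]])
new_row = np.array([[32.0, 50000.0]])

mean = train.mean(axis=0)
std = train.std(axis=0)   # population standard deviation, as StandardScaler uses

train_scaled = (train - mean) / std
new_scaled = (new_row - mean) / std   # reuse the training statistics for new data

print(train_scaled)
print(new_scaled)
```

After scaling, each training column has mean 0 and standard deviation 1, and the ordering of the customers within each column is unchanged.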
The next step is to check the shape of the data. Dense layers expect 2D input of shape (samples, features), which the scaled data already has, so no reshaping is needed.
X_train.shape
The next step is to build the classifier, i.e., the Artificial Neural Network
classifier = Sequential()
classifier.add(Dense(50, kernel_initializer='glorot_uniform', activation='relu', input_shape = (X_train.shape[1],)))
classifier.add(Dense(50, kernel_initializer='glorot_uniform', activation='relu'))
classifier.add(Dense(25, activation='relu'))
# Sigmoid keeps the output between 0 and 1 so it can be thresholded at 0.5 later
classifier.add(Dense(1, activation='sigmoid'))
After this, compile the model with the Adam optimizer; either “accuracy” or “rmse” can be used as the metric.
In this model, accuracy has been considered.
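Now for the most awaited step: compile the model and fit it to the training data
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
classifier.fit(X_train, y_train, batch_size = 5, epochs = 100)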
Now we use the trained model to make predictions on the test set
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5).astype(int).ravel()
We use 0.5 as the threshold to convert the predicted values into binary form, so that we can predict who will buy the car and who will not.
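The thresholding step can be illustrated on its own with a few made-up model outputs; the numbers below are not results from the trained model.

```python
import numpy as np

# Made-up model outputs for five customers.
probabilities = np.array([[0.08], [0.47], [0.50], [0.51], [0.93]])

# Values strictly above 0.5 become 1 (will buy); everything else becomes 0.
will_buy = (probabilities > 0.5).astype(int).ravel()
print(will_buy)
```

Note that a value of exactly 0.5 is mapped to 0, because the comparison is strict.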
Yay! Now, we are 99% done.
Let’s check the accuracy of our Model.
For accuracy, we will use three different approaches (Just to be Sure)
- Confusion Matrix: A confusion matrix is a table that summarises a classification algorithm’s performance by counting the correct and incorrect predictions for each class.
- Accuracy Score Sklearn: The accuracy_score function of the sklearn.metrics module in Python computes the accuracy score for a collection of predicted labels vs true labels.
- Root Mean Square Error: The Root Mean Square Error (RMSE) is the standard deviation of the residuals (prediction errors). Residuals are a measure of how far away data points are from the regression line; RMSE is a measure of how spread out these residuals are. It is primarily a regression metric; with 0/1 labels it reduces to the square root of the misclassification rate, so lower is better.
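cm = confusion_matrix(y_test, y_pred)
print(cm)
print("")
print("Accuracy : ", accuracy_score(y_test, y_pred) * 100, "%")
print("")
# Flatten and cast so the subtraction is element-wise, then square before averaging
errors = np.ravel(y_pred).astype(int) - y_test
rmse = np.sqrt(np.mean(errors ** 2))
print("RMSE : ", rmse)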
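To show how the three measures relate, here is a hand-computed sketch on eight made-up labels using NumPy only. The labels are hypothetical and are not the model's actual output.

```python
import numpy as np

# Made-up true and predicted labels for eight customers.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat = np.array([0, 1, 1, 1, 0, 0, 0, 1])

tn = int(np.sum((y_true == 0) & (y_hat == 0)))   # correctly predicted "won't buy"
fp = int(np.sum((y_true == 0) & (y_hat == 1)))   # predicted "buy", actually didn't
fn = int(np.sum((y_true == 1) & (y_hat == 0)))   # predicted "won't buy", actually did
tp = int(np.sum((y_true == 1) & (y_hat == 1)))   # correctly predicted "buy"

# Rows are true classes, columns are predicted classes (sklearn's layout).
cm_demo = np.array([[tn, fp], [fn, tp]])
accuracy = (tn + tp) / len(y_true)
rmse_demo = np.sqrt(np.mean((y_hat - y_true) ** 2))

print(cm_demo)
print(accuracy, rmse_demo)
```

For 0/1 labels the squared error of each prediction is either 0 or 1, so the RMSE equals the square root of (1 - accuracy) and carries the same information as the accuracy score.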
All the methods are showing amazing accuracy.
Now for the final code, which predicts whether a given person will buy the car or not
Let’s check with:
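One possible way to score a single customer is sketched below. The helper name predict_purchase and the two stub classes are hypothetical: the stubs only stand in for the fitted scaler and trained network so that the sketch runs on its own, and the fixed value they return is not a real prediction.

```python
import numpy as np

def predict_purchase(model, scaler, age, salary, threshold=0.5):
    """Return 1 if the model predicts a purchase for one customer, else 0."""
    row = np.array([[age, salary]], dtype=float)
    scaled = scaler.transform(row)   # apply the same scaling used for training
    probability = float(np.ravel(model.predict(scaled))[0])
    return int(probability > threshold)

# Stand-ins so the sketch runs by itself; they are NOT the trained objects.
class StubScaler:
    def transform(self, rows):
        return rows   # no real scaling, purely to exercise the helper

class StubModel:
    def predict(self, rows):
        # Always returns a fixed value, purely to exercise the helper.
        return np.full((len(rows), 1), 0.2)

print(predict_purchase(StubModel(), StubScaler(), age=32, salary=50000))
```

With the objects from this article, the call would look like predict_purchase(classifier, sc, age=32, salary=50000). The key point is that a new customer must be scaled with the already-fitted scaler before being passed to the model.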
Age = 32 years
Salary = 50000 per annum
Now, let’s increase the salary amount
Age = 32 Years
Salary = 500000 Per Annum
This means that a 32-year-old person with an annual income of Rs. 5,00,000/- won’t buy a car. Now, let’s try reducing the age.
This means that age is also a crucial factor when considering purchasing power.
Hence, we can conclude that this Model is great for Consumer Behaviour Pattern Prediction Analysis.
In case of questions, leave a Comment or Email me at aryanbajaj104@gmail.com
ABOUT THE AUTHOR
I recently completed BBA (BUSINESS ANALYTICS) from CHRIST University, Lavasa, Pune Campus.
Website — acumenfinalysis.com (CHECK THIS OUT)
CONTACTS:
If you have any questions or suggestions on what my next article should be about, please write to me at aryanbajaj104@gmail.com.
If you want to keep updated with my latest articles and projects, follow me on Medium.