Consumer Buying Behaviour Pattern Prediction Using Artificial Neural Network for Automobiles Sector

8 min read · Sep 23, 2022

Neural networks can help you retain your customers and even predict whether a new customer will buy a car.

The gist of the Model We Will Be Creating Today

Introduction

It is no secret that many businesses prioritise client retention; obtaining new customers may be many times more expensive than maintaining existing ones.

Furthermore, knowing why customers leave and calculating the risk associated with specific customers are also important components of developing a data-driven retention strategy.

As a result, Consumer Behaviour Analysis is used to forecast which existing customers will buy again.

The model built here will also help predict whether a new customer will buy a car.

What is Customer Analysis?

Customer analytics is crucial for developing a comprehensive understanding of consumers’ purchase behaviours, use trends, demographic distribution, and profitability.

Organizations must invest considerable time and money in learning about their customers and evaluating the data generated by their interactions with them.

How is consumer behaviour analysis performed?

On a small scale, or with a small amount of data, this may be done manually by inspecting the data.

When the customer dataset is massive and it is nearly impossible to search individual rows manually and uncover patterns in the data, Machine Learning comes to the rescue!

Machine Learning has various built-in methods and models that make the analysis procedure as simple as possible.

To carry out this study, data scientists employ a variety of frameworks.

One of the frameworks is as follows:

STP

STP is an acronym that stands for Segmentation, Targeting, and Positioning.

It’s a three-step marketing strategy.

The STP approach makes it simple to segment the market, target consumers, and position the offering inside each segment.

Segmentation

It is the process of categorising a population, potential or present consumers, into groups with comparable characteristics.

Customers within a group tend to have similar purchasing habits.

Customers in the same segment are therefore likely to respond in similar ways to marketing initiatives.

Targeting

It is the appraisal of the potential revenue from each segment and the selection of which segments to focus on.

Consider criteria for choosing whether to target an entire segment or only a portion of it.

Positioning

Positioning comes after deciding which segments to target.

In marketing, positioning is a strategic process that comprises creating an identity for a brand or product in the minds of potential purchasers.

Customer segmentation with the STP framework may be accomplished in Python using techniques such as PCA, Hierarchical Clustering, the K-Means algorithm and more!
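To give a feel for the clustering step behind segmentation, here is a tiny one-dimensional K-Means sketch in plain Python. The spend figures, the starting centres and the `kmeans_1d` helper are all made up for illustration; a real project would more likely rely on a library implementation.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small K-Means for one feature: assign to nearest centre, then re-average."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Keep the old centre if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical yearly spend per customer (in thousands).
spend = [18, 20, 22, 45, 50, 55]
centers, segments = kmeans_1d(spend, centers=[10.0, 60.0])
print(centers, segments)
```

Each resulting group is a candidate segment: customers with similar spend end up together, and the centre summarises the typical customer of that segment.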

RFM

Another method is RFM (Recency, Frequency and Monetary value).

Marketers may target particular groups of customers with communications that are considerably more relevant to their individual habits thanks to RFM segmentation. This strategy yields substantially higher response rates as well as increased loyalty and client lifetime value.

RFM segmentation is a powerful method for identifying customer groups that should be addressed differently.

RFM has an advantage over other segmentation models in that it uses objective numerical scales to generate a high-level image of customers that is both concise and informative. Furthermore, marketers may use it without the need for pricey tools. And, most importantly, the output of the segmentation process is straightforward to interpret and evaluate.
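As a rough illustration of the RFM idea, the sketch below scores a few made-up customers in plain Python. The purchase records, the `rank_score` and `rfm_scores` helpers and the simple rank-based scale are assumptions for this example, not part of the dataset used later.

```python
from datetime import date

# Hypothetical purchase history: customer -> list of (purchase date, amount).
purchases = {
    "alice": [(date(2022, 9, 1), 500.0), (date(2022, 8, 15), 250.0), (date(2022, 6, 1), 100.0)],
    "bob": [(date(2022, 3, 10), 80.0)],
    "carol": [(date(2022, 7, 20), 900.0), (date(2022, 7, 1), 300.0)],
}
today = date(2022, 9, 23)

def rank_score(values, reverse=False):
    """Map each customer to a score 1..n by rank, where n is the best score."""
    ordered = sorted(values, key=values.get, reverse=reverse)
    return {name: position + 1 for position, name in enumerate(ordered)}

def rfm_scores(history, as_of):
    recency = {c: (as_of - max(d for d, _ in rows)).days for c, rows in history.items()}
    frequency = {c: len(rows) for c, rows in history.items()}
    monetary = {c: sum(a for _, a in rows) for c, rows in history.items()}
    # Fewer days since the last purchase is better, so recency is ranked in reverse.
    r = rank_score(recency, reverse=True)
    f = rank_score(frequency)
    m = rank_score(monetary)
    return {c: (r[c], f[c], m[c]) for c in history}

scores = rfm_scores(purchases, today)
print(scores)
```

A customer with high scores on all three axes is a recent, frequent, high-value buyer, while low scores across the board point to a customer who may need a win-back campaign.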

Python Model

That was a lot of theory 😄. Now for the most awaited part: the Neural Network Prediction Model.

IDE: Visual Studio Code

Problem Statement

A business needs to know whether an existing or new consumer is interested in purchasing an automobile.

We must anticipate the possibility based on the dataset provided.

About the Dataset

The dependent variable is whether or not the consumer purchased the products, whereas the independent variables are UserID, Gender, Age, and Salary.

Data has 5 Columns and 401 Rows

Approach

The marketing team wants to target only those clients who are likely to purchase.

As a result, we will forecast 0 or 1 as the outcome signifying whether or not the consumer will purchase a car.

Algorithm Selected

Many algorithms can be used for Consumer Behaviour Pattern Prediction Analysis, but in my view the best choice is a neural network. Now, why neural networks? The human brain has billions of neurons that receive signals from the senses, help the brain decode them, and help it react accordingly. Similarly, a neural network is an artificial brain made of many connected artificial neurons that decode the input signals and help the machine respond.

There are three main types of neural networks:

Artificial neural networks (ANNs)
Recurrent neural networks (RNNs)
Convolutional neural networks (CNNs)

ANNs are the simplest type of neural network. They consist of an input layer, one or more hidden layers, and an output layer. The input layer receives the inputs, the hidden layers perform the computations, and the output layer produces the results.

RNNs are similar to ANNs, but they have recurrent connections that carry information from one step to the next, which allows them to process sequences of data. This makes them well suited to tasks such as speech recognition, machine translation, and time series data.

CNNs are designed to process images. They have an input layer, a series of convolutional layers, and an output layer. The convolutional layers extract features from the images, and the output layer produces the results.

Our dataset is a small table of customer attributes rather than a sequence or an image, so we will be using an ANN.
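To make the input, hidden and output layers concrete, here is a minimal forward pass through a tiny network in plain Python. The weights, biases and layer sizes are made up for illustration and are not the trained model from this article.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

# Made-up parameters for a 2-input, 2-hidden-neuron, 1-output network.
hidden_w = [[0.5, -0.2], [0.3, 0.8]]
hidden_b = [0.0, -0.1]
output_w = [[1.0, -1.0]]
output_b = [0.0]

def forward(features):
    hidden = dense(features, hidden_w, hidden_b, relu)         # hidden layer
    return dense(hidden, output_w, output_b, sigmoid)[0]       # output layer

probability = forward([1.0, 2.0])
print(probability)
```

Training is the process of adjusting those weights and biases so that the output gets closer to the known labels.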

Let’s get started:

The first step is to import the necessary libraries

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.metrics import confusion_matrix, accuracy_score

The second step is to load the dataset

dataset = pd.read_csv('Social_Network_Ads.csv')

The next step is to check the data that one has loaded

dataset.head()

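The next step is to set the independent variables (X) and the dependent variable (y). Columns 2 and 3 (Age and Salary) are used as features, and the last column is the target.

X = dataset.iloc[:, 2:4].values
y = dataset.iloc[:, -1].values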

The next step is to check the data in both variables i.e., dependent & independent

print(X)
print(y)

The next step is to split the data into train and test

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
print("Splitting Complete  - - - 100%")
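To show what a shuffled 80/20 split does, here is a plain-Python sketch. The `simple_split` helper and the toy rows are hypothetical, and this is not how scikit-learn implements `train_test_split` internally; it only mirrors the idea of shuffling with a fixed seed and holding out a share of the rows.

```python
import random

def simple_split(rows, labels, test_size=0.2, seed=0):
    """Shuffle indices with a fixed seed, then hold out the first test_size share."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(data, idx):
        return [data[i] for i in idx]

    return pick(rows, train_idx), pick(rows, test_idx), pick(labels, train_idx), pick(labels, test_idx)

# Toy data: 10 rows of [age, salary] with a 0/1 label.
rows = [[20 + i, 30000 + 1000 * i] for i in range(10)]
labels = [i % 2 for i in range(10)]
train_rows, test_rows, train_labels, test_labels = simple_split(rows, labels)
print(len(train_rows), len(test_rows))
```

Fixing the seed plays the same role as `random_state`: the same rows land in the test set on every run, which makes results reproducible.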

The next step is to scale the data with the standard scaler, so that Age and Salary, which have very different ranges, contribute on a comparable scale.

Note:

Scaling does not distort the data or make it biased. It puts every feature on a uniform scale (zero mean, unit variance) so that the model can learn from the data efficiently and effectively.

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()  # fitted on the training data only, then reused for the test data
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
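The arithmetic behind standard scaling can be sketched in plain Python as follows. The age values and the `fit_scaler` and `transform` helpers are made up for illustration; the point is that the mean and standard deviation come from the training data and are then reused for the test data.

```python
from statistics import mean, pstdev

def fit_scaler(column):
    """Learn the mean and population standard deviation of one feature."""
    return mean(column), pstdev(column)

def transform(column, mu, sigma):
    """Standardise each value: subtract the mean, divide by the standard deviation."""
    return [(value - mu) / sigma for value in column]

train_ages = [20.0, 30.0, 40.0, 50.0]   # made-up training values
test_ages = [35.0, 60.0]                # made-up test values

mu, sigma = fit_scaler(train_ages)       # statistics come from the training data only
scaled_train = transform(train_ages, mu, sigma)
scaled_test = transform(test_ages, mu, sigma)
print(scaled_train, scaled_test)
```

Reusing the training statistics on the test set keeps information about the test data from leaking into the model.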

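The next step is to check the shape of the data. Dense layers expect 2D input of shape (samples, features), which is what the scaler already returns, so no reshaping is needed.

X_train.shape

The next step is to build the classifier, i.e., the Artificial Neural Network. The hidden layers use ReLU activations and the output layer uses a sigmoid, so that the prediction falls between 0 and 1.

classifier = Sequential()
# input_shape is the number of features per sample
classifier.add(Dense(50, activation='relu', kernel_initializer='glorot_uniform', input_shape=(X_train.shape[1],)))
classifier.add(Dense(50, activation='relu', kernel_initializer='glorot_uniform'))
classifier.add(Dense(25, activation='relu'))
classifier.add(Dense(1, activation='sigmoid'))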

After this, compile the model with the Adam optimizer. Either “accuracy” or “rmse” can be tracked as a metric.

In this model, accuracy has been used.

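Now comes the most awaited step: compiling the model and fitting it to the training data. The loss function is not named above, so binary cross-entropy is assumed here, as is usual for a 0/1 target.

classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])  # loss is an assumption
classifier.fit(X_train, y_train, batch_size = 5, epochs = 100)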

Now we use the trained model to make predictions on the test set

y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)

We threshold at 0.5 to convert the model's continuous outputs into binary form, so that we can predict who will buy the car and who will not.
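The thresholding step can be sketched on its own in a few lines of plain Python. The output values and the `to_labels` helper are made up for illustration.

```python
def to_labels(outputs, threshold=0.5):
    """Turn continuous model outputs into 0/1 labels using a strict > comparison."""
    return [1 if value > threshold else 0 for value in outputs]

example_outputs = [0.08, 0.51, 0.5, 0.97]  # made-up model outputs
predicted = to_labels(example_outputs)
print(predicted)
```

Note that a value exactly equal to the threshold is labelled 0 here, which matches the `>` comparison used above. Raising the threshold makes the model more conservative about predicting a purchase.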

Yay! Now, we are 99% done.

Let’s check the accuracy of our Model.

To evaluate the model, we will use three different measures (just to be sure):

Confusion Matrix: A confusion matrix is a table that summarizes a classification algorithm’s performance by counting the correct and incorrect predictions for each class.
Accuracy Score (sklearn): The accuracy_score function of the sklearn.metrics module in Python computes the fraction of predicted labels that match the true labels.
Root Mean Square Error: The Root Mean Square Error (RMSE) is the standard deviation of the residuals (prediction errors). Residuals are a measure of how far away data points are from the regression line; RMSE is a measure of how spread out these residuals are. In other words, it indicates how concentrated the data is around the line of best fit.

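cm = confusion_matrix(y_test, y_pred)
print(cm)
print("")
print("Accuracy : ", accuracy_score(y_test, y_pred) * 100, "%")
print("")
# Flatten the predictions to match y_test, and square the errors before averaging
rmse = np.sqrt(np.mean((np.ravel(y_pred).astype(int) - y_test) ** 2))
print("RMSE : ", rmse)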
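To see what these three numbers mean, here is a hand-rolled version in plain Python on a small set of made-up labels. The `confusion_counts`, `accuracy` and `rmse_score` helpers are written for this example and are not the sklearn functions used above.

```python
import math

def confusion_counts(y_true, y_hat):
    """Return [[TN, FP], [FN, TP]]: rows are true labels, columns are predictions."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_hat):
        matrix[actual][predicted] += 1
    return matrix

def accuracy(y_true, y_hat):
    return sum(a == p for a, p in zip(y_true, y_hat)) / len(y_true)

def rmse_score(y_true, y_hat):
    # Square each error first, then average, then take the square root.
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(y_true, y_hat)) / len(y_true))

# Made-up true labels and predictions.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_hat = [0, 1, 1, 1, 0, 0, 1, 0]

print(confusion_counts(y_true, y_hat))
print(accuracy(y_true, y_hat))
print(rmse_score(y_true, y_hat))
```

For 0/1 labels every error is either 0 or 1, so the RMSE equals the square root of the error rate. It carries the same information as accuracy and serves here as a cross-check.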