Data Driven Marketing for Smartwatches

sqseah
9 min read · Apr 13, 2022

To market smartwatch products effectively and keep pace with stiff competition, we explore key analytics techniques that smartwatch marketers, or anyone else who wants to learn more about marketing analytics, can adopt into their wider strategy.

Source: RecordTrend

Our team of 4 souls passionate about "anything analytics" consists of Christel, Claire, Djulian and Zi Rong. Here are our team’s findings on how to target the right customer segments to buy smartwatches, and the thought process behind them.

Unravelling the Problem

The nascent smartwatch market is poised to grow at a CAGR of 18.3% over the next 5 years, showing immense potential and opportunity in this space. Naturally, more entrants are joining to try to capture a slice of this market.

Using this as an anchor point, we believe that analytics-driven targeted marketing can be a differentiator, allowing smartwatch marketers to refine their strategies and compete effectively.

Motivations and Objectives

We want to analyse the profiles of potential customers to gauge their propensity to purchase a smartwatch. To increase the effectiveness of our marketing strategies, we use SAS Viya to explore and develop machine learning models that identify which potential customers to target. By identifying these customers, we will be able to develop more personalised marketing strategies that appeal to the target group and encourage them to buy smartwatches.

The main business objectives are:

1. Identify potential customers who are more likely to purchase smart watches

2. Develop more personalised marketing strategies for the target customers to increase sales

Approach Description

To identify potential customers who are more likely to purchase smart watches, we came up with the following approach:

Tools and Technology

The following tools are used in our project.

Tools and technology

Data Exploration and Preparation

Prior to developing the models, it’s important to conduct data exploration and preparation. We conducted the following steps during this stage:

  1. Exploratory data analysis (EDA) on original data
  2. Data cleaning based on previous step
  3. Exploratory data analysis on cleaned data
  4. Selecting the final input variables

1. EDA on original data

The initial dataset consists of 9831 observations and 42 variables. The target variable is WatchBuyInd, which records whether a smartwatch was bought or not. 78.56% (7723) of the observations belong to target 0, while 21.44% (2108) belong to target 1. Although the target variable is mildly imbalanced, we did not apply event-based sampling because the imbalance is not severe.

The purpose of the first EDA is to better understand the structure of the dataset and, in particular, what needs to be done during data cleaning. In this stage, we identified:
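As a quick check, the class shares quoted above can be reproduced from the two counts. This is a minimal sketch in plain Python; only the counts come from the article, and the majority-to-minority ratio is an extra diagnostic added for illustration.

```python
# Counts quoted in the article for the target variable WatchBuyInd.
counts = {0: 7723, 1: 2108}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"WatchBuyInd = {label}: {counts[label]} rows ({share:.2%})")

# Ratio of majority to minority class, a common quick check for imbalance.
imbalance_ratio = counts[0] / counts[1]
print(f"Majority-to-minority ratio: {imbalance_ratio:.2f} : 1")
```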

  • Missing values
  • Outliers
  • Variable importance
  • Kurtosis & skewness
  • Correlations

2. Data cleaning based on initial EDA

Next, we cleaned the data to prepare it properly for the models. We conducted the following on the dataset:

  • Dropped unnecessary and highly correlated columns like CustCluster and id
  • Handled missing values by imputing the mean age for CustAge and replacing all null values in CustomerGender with “UNknown”, creating the new variable IMP_CustomerGender
  • For the other variables with null values, we dropped the rows with any null values
  • Transformed several variables with high skewness using the “Best” transformation feature on SAS Viya. Log transformation was used on CustSpend and LoyalTenure and Inverse Square transformation was used on DemHouseholdIncome. Additionally, we binned the customer ages to create Age_Bucket
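The cleaning steps above were carried out in SAS Viya. For readers without access to it, a rough plain-Python equivalent might look like the sketch below. The column names come from the article, but the sample values, the derived column names (LOG_..., INVSQ_...) and the age bin edges are assumptions made purely for illustration.

```python
import math
from statistics import mean

# Tiny made-up sample: column names are from the article, values are not.
rows = [
    {"CustAge": 34, "CustomerGender": "M", "CustSpend": 120.0,
     "LoyalTenure": 12, "DemHouseholdIncome": 52000.0},
    {"CustAge": None, "CustomerGender": None, "CustSpend": 80.0,
     "LoyalTenure": 3, "DemHouseholdIncome": 41000.0},
    {"CustAge": 58, "CustomerGender": "F", "CustSpend": None,
     "LoyalTenure": 30, "DemHouseholdIncome": 75000.0},
]

# Mean of the observed ages, used to impute missing CustAge values.
mean_age = mean(r["CustAge"] for r in rows if r["CustAge"] is not None)


def age_bucket(age):
    """Bin an age. These bin edges are hypothetical, not the article's."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"


cleaned = []
for original in rows:
    r = dict(original)  # work on a copy so the raw rows stay untouched
    if r["CustAge"] is None:
        r["CustAge"] = mean_age
    # Missing gender becomes its own category, stored in a new variable.
    r["IMP_CustomerGender"] = r.pop("CustomerGender") or "UNknown"
    # Any other remaining null means the whole row is dropped.
    if any(value is None for value in r.values()):
        continue
    # Skew-reducing transforms (inputs are assumed to be strictly positive).
    r["LOG_CustSpend"] = math.log(r["CustSpend"])
    r["LOG_LoyalTenure"] = math.log(r["LoyalTenure"])
    r["INVSQ_DemHouseholdIncome"] = 1 / r["DemHouseholdIncome"] ** 2
    r["Age_Bucket"] = age_bucket(r["CustAge"])
    cleaned.append(r)

print(len(cleaned), "rows kept out of", len(rows))
```

The order matters here: imputation runs before the row-drop check, so rows that were only missing age or gender survive, which matches the sequence of bullets above.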

3. EDA on cleaned data

After data preprocessing, the dataset has a total of 8771 observations. We conducted an additional EDA on cleaned data to determine how the data cleaning phase changed the data:

  • Variable importance
  • Variable selection
  • Correlation matrix

Results of the variable selection produced by SAS

4. Final Variable selection

Using the results from the variable importance feature in the previous step, we utilised the Clustered Variables Network feature and determined that Last3MVisit and Last6MVisit are highly correlated. Thus, we decided to drop Last6MVisit as using only Last3MVisit produced more accurate classification results. Additionally, through trial and error, we concluded that including variables SurveySatQ3 and CustClusterGroup improved the accuracy of the results.
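The Clustered Variables Network is a SAS Viya feature, and we do not reproduce it here. As a simpler stand-in, the sketch below flags a redundant pair of variables using the Pearson correlation coefficient. The visit counts and the 0.9 cut-off are made up for illustration and are not taken from our dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up visit counts for seven customers.
last_3m_visit = [1, 4, 2, 10, 6, 0, 8]
last_6m_visit = [3, 7, 5, 18, 11, 1, 15]

r = pearson(last_3m_visit, last_6m_visit)

# Hypothetical cut-off above which one variable of the pair is dropped.
THRESHOLD = 0.9
drop_candidates = ["Last6MVisit"] if abs(r) > THRESHOLD else []

print(f"correlation = {r:.3f}, drop candidates = {drop_candidates}")
```

Which variable of a correlated pair to drop is a separate decision; in our case it was settled by checking which one gave better classification results.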

Hence, by combining the results of Variable Selection, Variable Clustering, and trial and error, we decided to use the following variables for the classification models below:

Data Analysis

Our data analysis consists of 2 parts — classification and customer segmentation.

  1. Classification

Our solution pipeline includes running 7 different supervised machine learning models — Decision Tree, Logistic Regression, Bayesian Network, Support Vector Machine (SVM), Gradient Boosting, Forest, Neural Network. These models are selected as they are ideal for training on small to large datasets and can be used for datasets with binary targets. They will be used to predict if a customer will buy a smartwatch or not.

Solution Pipeline

In addition, each model is run in both an autotuned and a default-settings version, except for Logistic Regression, which does not have an autotuning function.

To determine the best model for classification, we conducted the following steps:

  1. Split the data into train, validate and test data sets (60%, 30% and 10% respectively). We included a test data set to obtain an unbiased estimate of each model's performance, so that we can determine the best model more accurately.
  2. Compare autotuned and default model settings. Using the misclassification error, which is the proportion of predictions that are wrong, we compared models with different settings and concluded that the autotuned models were more accurate.
  3. Select the champion model. Using the Model Comparison node, we selected the final model based on SAS Viya’s computation of each model’s accuracy.

Comparison of the top 5 models
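For reference, the 60/30/10 partition from step 1 can be sketched in a few lines of plain Python. SAS Viya handles partitioning internally and may stratify by the target; this simplified version only shuffles and slices, and the function name and seed are our own choices for illustration.

```python
import random


def split_rows(rows, fractions=(0.6, 0.3, 0.1), seed=42):
    """Shuffle rows and cut them into train / validate / test partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = int(n * fractions[0])
    n_valid = int(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains goes to test
    return train, valid, test


# Row indices standing in for the 8771 cleaned observations.
rows = list(range(8771))
train, valid, test = split_rows(rows)
print(len(train), len(valid), len(test))
```

Giving the remainder to the test partition guarantees that every row lands in exactly one partition even when the fractions do not divide the row count evenly.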

The best-performing model is the autotuned Decision Tree, which has the lowest misclassification error (0.1197), the highest accuracy (0.8803), the highest F1 score (0.6465) and the highest cumulative lift (4.1487). We also checked for overfitting by comparing its results on the Train and Test data sets: it performs better on Train than on Test, as expected, and we judged the difference too small to indicate overfitting.
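To make these metrics concrete, the sketch below shows how misclassification error, accuracy and F1 score are derived from a confusion matrix. The confusion-matrix counts are hypothetical (SAS Viya reports the metrics directly, and we did not export the underlying counts); only the relationship accuracy = 1 - misclassification error is checked against the figures above.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    Assumes at least one predicted positive and one actual positive,
    so that precision and recall are defined.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "misclassification": (fp + fn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=40, tn=740, fn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# The article's figures are consistent with accuracy = 1 - misclassification.
reported_error, reported_accuracy = 0.1197, 0.8803
print(abs((1 - reported_error) - reported_accuracy) < 1e-9)
```

On an imbalanced target such as ours, accuracy alone can flatter a model that mostly predicts the majority class, which is why F1 and cumulative lift are reported alongside it.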

Misclassification Errors for the autotuned Decision Tree model

2. Customer Segmentation

We further segmented customers using the results from the champion classification model. This is to gain deeper insight into which customer segments are more likely to purchase a smartwatch and hence curate marketing campaigns accordingly.

Using the Save Data node, we were able to view “Probability for WatchbuyInd = 1” which indicates the probability that a customer will purchase a watch as seen in the table below.

Probability that a customer will purchase a watch

Thereafter, we ranked the customers according to their probability of purchasing a watch and segmented them. The results of the segmentation are evaluated in the next section.
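The ranking and segmenting step can be sketched as follows. The number of segments, the field name p_buy and the sample probabilities are assumptions for illustration; the real scores come from the Save Data node output.

```python
def segment_by_probability(scored, n_segments=5):
    """Rank customers by predicted probability (highest first) and cut the
    ranked list into at most n_segments chunks of equal size (the last
    chunk may be smaller)."""
    if not scored:
        return []
    ranked = sorted(scored, key=lambda c: c["p_buy"], reverse=True)
    size = -(-len(ranked) // n_segments)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


# Made-up purchase probabilities for ten customers.
probabilities = [0.91, 0.12, 0.55, 0.08, 0.73, 0.31, 0.66, 0.02, 0.47, 0.25]
scored = [{"id": i, "p_buy": p} for i, p in enumerate(probabilities)]

segments = segment_by_probability(scored, n_segments=5)
for rank, segment in enumerate(segments, start=1):
    print(f"Segment {rank}: customer ids {[c['id'] for c in segment]}")
```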

Insights and key takeaways

Customer Segmentation

Segmentation groups customers who share similar characteristics so that each group can be approached in the way that benefits it most. This empowers marketers to interact with each customer segment in a way that optimises satisfaction, resulting in recurring sales.

Customer segmentation was done based on each customer's predicted probability of purchasing the smartwatch. To describe each segment, our team used the mode, as it identifies the value that appears most often in the segment.
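Profiling a segment by its mode amounts to taking the most frequent value of each attribute. Here is a minimal sketch, assuming each customer is a dictionary of categorical attributes; the attribute names appear earlier in the article, while the sample values and bucket labels are made up.

```python
from collections import Counter


def segment_profile(segment, attributes):
    """Return the most frequent value of each attribute within a segment.

    Ties are broken in favour of the value that appears first in the data.
    """
    profile = {}
    for attr in attributes:
        counts = Counter(customer[attr] for customer in segment)
        profile[attr] = counts.most_common(1)[0][0]
    return profile


# Made-up customers in one segment.
segment = [
    {"IMP_CustomerGender": "M", "Age_Bucket": "30-39"},
    {"IMP_CustomerGender": "M", "Age_Bucket": "40-49"},
    {"IMP_CustomerGender": "F", "Age_Bucket": "30-39"},
    {"IMP_CustomerGender": "M", "Age_Bucket": "30-39"},
]

profile = segment_profile(segment, ["IMP_CustomerGender", "Age_Bucket"])
print(profile)
```

The mode suits categorical attributes such as gender, where a mean has no meaning, and it is not distorted by a handful of extreme customers within a segment.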

Different customer segments and their likelihood of purchase

Insight #1 — Marketing should be targeted towards males who have a low satisfaction in mobile devices

Males who have low satisfaction with their mobile devices might be more likely to purchase a smartwatch, as their needs might not have been adequately met by those devices. They may turn to smartwatches to make up for the missing functionality.

Someone who exercises frequently, for example, might want to know their heart rate and monitor body metrics that a standalone mobile device cannot track accurately. Getting a smartwatch as an extension of their existing mobile device may therefore be an attractive option.

Insight #2 — Marketing should be targeted towards males who visit the retail store more than 9 times within 3 months

Our team considered how the frequency of store visits influences the decision to buy a smartwatch, and concluded that this group is worth targeting because of its potential.

We came up with 3 possible reasons:

  1. Being able to try and test the smartwatch in store increases the propensity to purchase one, especially when bolstered by a knowledgeable salesperson who can sell the product
  2. The effect of advertisements: frequent visitors see more in-store advertisements for smartwatches
  3. People who stay abreast of the latest smartwatch trends are more likely to visit the store often to view the latest gadgets

Quantitative analysis

2 key questions:

  1. Whom can we contact while still remaining profitable? We answer this with the Cumulative Captured Response Chart
  2. What were the returns of this campaign? We answer this with a Return on Investment (ROI) analysis

Cumulative Captured Response Chart

Cumulative Captured Response Chart

We then calculated the break-even response rate for the marketing campaign, which is the cost of contact / profit from sale = 1.6%. This means that as long as a person's predicted likelihood of purchase is at least 1.6%, we can contact them and still remain profitable after the campaign. From this, our team deduced that we can reach out to the top 99% of the dataset using our champion model.
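The break-even rule can be written out as a short calculation. The per-contact cost of 1.8 and per-sale profit of 112 used here are taken from the ROI formula later in this article; treating them as the inputs behind the 1.6% figure is our assumption, though the numbers agree.

```python
# Inputs assumed from the ROI formula: cost per contact and profit per sale.
cost_per_contact = 1.8
profit_per_sale = 112

# Contacting a customer breaks even when p * profit_per_sale == cost_per_contact.
break_even_rate = cost_per_contact / profit_per_sale
print(f"Break-even response rate: {break_even_rate:.1%}")


def worth_contacting(p_buy):
    """True when the expected profit from a contact covers its cost."""
    return p_buy * profit_per_sale >= cost_per_contact


for p in (0.01, 0.02, 0.50):
    print(f"p = {p:.2f}: contact = {worth_contacting(p)}")
```

Because the break-even rate is so low relative to typical predicted probabilities, almost every customer clears the bar, which is consistent with the conclusion above that nearly the whole dataset can be contacted profitably.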

Return on Investment (ROI)

We calculated an ROI of 8.57, that is, a return of roughly 857% on the cost of promotion, which we deem decent within the initially set scope of our project. ROI = Net Revenue / Cost of promotion = (112 × 1208) / (1.8 × 8771) ≈ 8.57

Moving forward, we hope to make better ROI estimates through monthly comparisons, in particular against the sales from the business line in the months before the campaign launch.
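The arithmetic behind the ROI figure can be checked with the short sketch below. The four numbers come from the formula above; reading 1208 as the number of purchases and 8771 as the number of customers contacted is our interpretation. Note that the result is a ratio of return to cost rather than a percentage. The net variant, which subtracts the cost first, is a common alternative definition shown only for comparison.

```python
profit_per_sale = 112    # from the ROI formula
purchases = 1208         # assumed meaning: number of resulting purchases
cost_per_contact = 1.8   # from the ROI formula
contacted = 8771         # size of the cleaned dataset

revenue = profit_per_sale * purchases
cost = cost_per_contact * contacted

# Ratio used in the article: return generated per unit of promotion cost.
roi_ratio = revenue / cost

# Alternative definition: gain net of cost, per unit of cost.
net_roi = (revenue - cost) / cost

print(f"Revenue: {revenue}, cost: {cost:.1f}")
print(f"ROI ratio: {roi_ratio:.2f} ({roi_ratio:.0%})")
print(f"Net ROI:   {net_roi:.2f} ({net_roi:.0%})")
```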

Marketing Campaigns

We came up with 3 marketing campaigns, each with a different marketing angle and target group.

  1. Marketing strategies that highlight the health tracking features: to show potential customers in the middle-aged group the health functionalities that can help them be more active and stay healthy
  2. Marketing strategies that highlight the performance enhancing features: to help potential customers enhance performance with metrics such as VO2 max (the maximum amount of oxygen your body can utilise during exercise)
  3. Marketing strategies that make one choose one watch brand over another: strategic partnerships with apps such as Strava, Nike Run and Spotify to offer smartwatch buyers free subscription trials. This might not be the main reason to make a purchase immediately, but it could be the deciding factor for someone choosing between 2 watches

Limitations and Challenges Faced

We were hampered by the inability to use CustClusterGroup in our insights analysis. Although it improves the classification results and could add value to our customer segmentation insights, the team was not able to identify the geographies or demographics behind each cluster group, which prevented us from drawing further insights.

The Way Forward

To build on our model, we propose the following:

  1. Interactive dashboard prototype:

Employees from the sales or marketing team can upload a csv list of customers and their attributes to obtain predictions on whether a customer will buy a smartwatch and their probability of purchasing by using our final champion model.

On the Prediction Page, users can manually change the values for different factors to predict whether a customer is likely to buy a smartwatch. The prediction output is based on our autotuned Decision Tree model that was trained previously.
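To illustrate the upload flow, here is a rough sketch of how an uploaded CSV could be scored. This is not the prototype's code: the scoring function is a hypothetical placeholder rule, not our trained Decision Tree, and the CSV layout, field names and 0.5 threshold are assumptions.

```python
import csv
import io


def predict_purchase_probability(customer):
    """Hypothetical placeholder for the champion model.

    A real dashboard would call the trained Decision Tree here instead
    of this single hard-coded rule.
    """
    return 0.8 if int(customer["Last3MVisit"]) > 9 else 0.1


def score_csv(csv_text, threshold=0.5):
    """Score every customer in an uploaded CSV and flag likely buyers."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        p_buy = predict_purchase_probability(row)
        results.append(
            {"id": row["id"], "p_buy": p_buy, "will_buy": p_buy >= threshold}
        )
    return results


# Made-up upload with two customers.
uploaded = "id,Last3MVisit\n101,12\n102,3\n"
results = score_csv(uploaded)
for result in results:
    print(result)
```

Keeping the model call behind a single function means the placeholder can later be swapped for the deployed model without touching the CSV handling.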

2. Enhanced Recommendation model

For customers who are more likely to buy a smartwatch, we can collect their interests, hobbies and smartwatch preferences (e.g. Beauty, Cost, Functionality, Brand, Durability) to build a model that recommends smartwatch models matching their interests or demographic profile.

Our SAS Viya Experience — Our thoughts, summarised

Our methodology and findings are not limited to smartwatch marketing; they can also be extended to wider marketing analytics.

Altogether, our experience learning and applying machine learning techniques through SAS Viya was challenging yet rewarding. We are grateful to Prof Seema and Gemma from SAS for this opportunity and their guidance. Our team hopes that more people can be exposed to analytical tools like SAS Viya and see how analytics can be automated to create powerful machine learning models that improve business operations, marketing tactics, and more.
