Every fourth person I meet talks about big data and data analytics. In fact, we are in times where people are investing in big data and data analytics like never before. Yet, some of the questions that continue to haunt us are:
These questions stem largely from the fact that most people in top management aren’t seeing the expected outcomes from their big data investments.
For starters, having terabytes of data doesn’t by itself make a company a good candidate for investing in big data, especially if the data isn’t good enough or detailed enough. You are only as good as your data.
When data analytics doesn’t yield the expected returns, the first things to look at are the data itself and how well the model fits it. A model can overfit or underfit the data. Overfitting occurs when a statistical model or machine learning algorithm begins to capture the noise in the data. More specifically, an overfit model shows low bias but high variance. Underfitting, on the other hand, occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data. An underfit model shows low variance but high bias.
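To make the bias and variance trade-off concrete, here is a minimal sketch on synthetic data. It is only an illustration under stated assumptions: the sine-shaped trend, the noise level, the k-nearest-neighbour regressor and every function name below are mine, not something taken from a real project. With k=1 the model memorises each training point, noise included (low bias, high variance). With k equal to the size of the training set it predicts the overall average everywhere and misses the trend (high bias, low variance).

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, points, k):
    """Mean squared error of the k-NN model on the given (x, y) points."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)


def make_data(n, rng, noise=0.3):
    """Noisy samples of sin(2*pi*x): the sine is the trend, the rest is noise."""
    xs = [rng.random() for _ in range(n)]
    return [(x, math.sin(2 * math.pi * x) + rng.gauss(0, noise)) for x in xs]


rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_data(40, rng)      # data the model is fitted on
holdout = make_data(40, rng)    # fresh data the model has never seen

results = {}
for label, k in [("overfit (k=1)", 1),
                 ("balanced (k=5)", 5),
                 ("underfit (k=40)", len(train))]:
    results[label] = (mse(train, train, k), mse(train, holdout, k))
    print(f"{label:16s} train MSE={results[label][0]:.3f}  "
          f"holdout MSE={results[label][1]:.3f}")
```

The pattern to look for is the gap between the two columns. The k=1 model has zero training error, yet it still makes errors on the held-out data, while the k=40 model is poor on both sets.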
Data decay is another key issue to look into. Sameer, in his blog post, brilliantly describes the issue and how to mitigate it, so I will not delve deeper into that aspect here.
If you have good data, analytics can do wonders for you. You can slice and dice customer data, unearth some amazing insights and make your team rethink its current strategy. Recently, one of our customers was awestruck by the insights our platform surfaced about their cross-sell revenue across different countries. The point is this: always consider the larger picture, break the process into logical steps, and connect the dots.
For example, let us consider a retail / eCommerce scenario. Profiling a large customer base into a few selected personas is an initial step toward ad recommendations, discount offers and so on. Understandably, every user’s buying pattern will be different. Hence, clustering users with similar buying patterns improves ad targeting and thereby the response rate.
The signals that differentiate one cluster from another could be anything from a user’s location and the type of operating system they use to their detailed transaction history (buying patterns).
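As a rough sketch of what clustering by buying pattern can look like, the snippet below groups a handful of made-up customers with plain k-means. Everything in it is an assumption for illustration: the two features (orders per month and average basket value), the customer IDs, the choice of k-means and the choice of three clusters. A real pipeline would use many more signals and a tested library.

```python
import math

# Hypothetical features per customer: (orders per month, average basket value).
customers = {
    "c1": (1.0, 20.0), "c2": (1.5, 25.0), "c3": (0.8, 18.0),    # occasional, small baskets
    "c4": (8.0, 30.0), "c5": (9.0, 35.0), "c6": (7.5, 28.0),    # frequent, small baskets
    "c7": (1.2, 250.0), "c8": (0.9, 300.0), "c9": (1.1, 270.0), # occasional, large baskets
}


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0
                  for v, l, h in zip(row, lo, hi)) for row in rows]


def kmeans(points, k, iterations=20):
    """Plain k-means; seeds are picked farthest-first so the run is deterministic."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points,
                             key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


ids = list(customers)
labels = kmeans(min_max_scale([customers[c] for c in ids]), k=3)

clusters = {}
for customer_id, label in zip(ids, labels):
    clusters.setdefault(label, []).append(customer_id)
print(clusters)
```

The scaling step matters here: without it, basket value (in the hundreds) would swamp order frequency (in single digits), and the clusters would reflect only one signal.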
Let’s assume that we are looking to identify “Influential Buyers”. They typically care more about the longevity of the product than the brand, color or price. One of the key identifiers could be whether the user has read all the reviews of the product. When we do the analytics part (here, clustering), we run into some common challenges around data sources, data validation, data transformation and domain knowledge.
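Here is a small sketch of how the "read all reviews" signal and a basic validation step might fit together. The record layout (`ProductView`, `reviews_available`, `reviews_read`) and the validation rules are hypothetical, chosen only to show the idea that bad records should be filtered out before the signal is computed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProductView:
    """Hypothetical record of one user viewing one product page."""
    user_id: str
    reviews_available: Optional[int]  # reviews shown on the product page
    reviews_read: Optional[int]       # reviews the user actually opened


def is_valid(view: ProductView) -> bool:
    """Validation: reject missing values and impossible counts."""
    if view.reviews_available is None or view.reviews_read is None:
        return False
    return 0 <= view.reviews_read <= view.reviews_available


def read_all_reviews(view: ProductView) -> bool:
    """Signal: the product has reviews and the user read every one of them."""
    return view.reviews_available > 0 and view.reviews_read == view.reviews_available


def influential_candidates(views: List[ProductView]) -> List[str]:
    """Validate first, then keep the users who show the signal."""
    valid = [v for v in views if is_valid(v)]
    return sorted({v.user_id for v in valid if read_all_reviews(v)})


views = [
    ProductView("u1", 12, 12),
    ProductView("u2", 12, 3),
    ProductView("u3", 5, 9),      # invalid: read more reviews than exist
    ProductView("u4", None, 4),   # invalid: missing count from the data source
    ProductView("u5", 0, 0),      # valid, but there was nothing to read
    ProductView("u6", 7, 7),
]

candidates = influential_candidates(views)
print(candidates)  # ['u1', 'u6']
```

A flag like this would be one input feature among many for the clustering step, not a verdict on its own. Deciding which rules count as "valid" is where domain knowledge comes in.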
To sum up, it all starts with data. Analytical and data science models are only as good as the data behind them. Data is only as good as it is actionable, and actionability is only as good as the use it is put to.
I look forward to hearing what you think about it!
Yaagneshwaran Ganesh (often called "Yaag") is among the top 100 global martech influencers, and is responsible for growth and product adoption for Freshchat, a customer engagement platform powering live chats and conversational marketing across industries. He is also a TEDx speaker, a member of Forbes Council and the author of 5 books, the latest of which is titled “Syncfluence”. He has spoken at business forums such as CII Young Indians, Chamber of Commerce Netherlands and Kerala Startup Mission (an initiative of the Government of Kerala), and at academic institutions such as the IITs, Saxion University of Applied Sciences and more. He is an active member of the startup ecosystem and is part of the Google for Entrepreneurs initiative "Startup Weekend", where he acts as a sounding board for startups in APAC and Europe. Yaag also writes columns for HuffingtonPost, Forbes, Martech Series, Martech Advisor, dtNEXT, Techstory and ManagementNext, and has been a regular blogger on RonSela.com, the thought leader on marketing practices.