How the CMO Became A Data Scientist

Kurt Cagle 10/07/2020

Marketing directors have long been the subject of scorn among most IT professionals.

They were the guys in the cheap suits and the balding hair, the ones that were more interested in sales channels, marketing campaigns, advertising spends, and Excel spreadsheets than in computers and programming. It was the marketing director that would oversell the product, forcing programmers to go to great lengths to get even close to what was promised, and inevitably, when things went south, programmers usually tended to bear the brunt of the fallout, even though they repeatedly warned that what was promised wasn't feasible.

The attitude wasn't helped by the very different worldviews of the two professions. Programmers are precise, logical, and detail-oriented. They value truth because, without that consistency and precision, programs usually don't work. Marketers, on the other hand, tend to see truth as being ... malleable. Sometimes (okay, let's face it, frequently) you need to promote the positive and emphasize the lipstick on the pig, and not the pig itself. As such, it's perhaps not all that surprising that there's always been a significant rift between marketing and programming.

Marketing, Modeling and Systems Theory

Yet something strange happened over the last couple of decades. Marketing really is as much about numbers as it is about promotion. It is not enough to do accounting. Accounting tells you where you've been, but it doesn't by itself tell you where you're going. To do that, you need to create models.

This is where Microsoft Excel, that favored tool of marketers and business analysts, really began to shine twenty years ago. You could create models in Excel, could see what happens if you plug in this vs. that sales point, determine net aggregates, and even perform basic linear regression to determine patterns in the data. Now, many of these models were hideously bad - too simple to provide meaningful insight - but they were still better than what everyone else was doing. The ones who kept at it, slowly but surely, also went from being marketing managers to being marketing analysts.
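
As a rough illustration of the kind of "what if" model described above, here is a minimal sketch of a least-squares trend line in Python rather than Excel. The figures for ad spend and units sold are invented for the example, and `fit_line` is a hypothetical helper, not something taken from any particular tool.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: monthly ad spend (in $1000s) and units sold.
ad_spend = [10, 12, 15, 18, 20, 25]
units_sold = [120, 135, 160, 178, 205, 240]

slope, intercept = fit_line(ad_spend, units_sold)

# "What if" question: plug in a spend level that has not been tried yet.
forecast = slope * 30 + intercept
print(f"units per extra $1000: {slope:.1f}, forecast at $30k: {forecast:.0f}")
```

A straight line like this is exactly the sort of model that is often too simple to be trusted far outside the range of the data, which is the limitation the paragraph above points to.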

Modeling sounds like it should be very complex, and the way that it's practiced now, it is complex, arguably too complex. Part of this comes about because of a split that occurred around fifty years ago, with the advent of systems theory, which provided a lot of the groundwork for modeling how systems behaved. While programmers were busy writing tools that made it possible to store, modify and transmit information, systems theorists were trying to figure out how complex, frequently non-linear systems behaved.

To understand that, it's worth understanding why systems get complex (and even chaotic). Linear systems are typically systems (many actors doing many things that interact with one another) in which perfect information is known. These kinds of systems are easiest to automate, in great part because the mathematics behind them is usually straightforward. You can write algorithms that determine the values that come from these kinds of models that will describe the system in perpetuity.

A classic example of this is determining the motion of a large planet around a star. This was a hard problem in the seventeenth century, but Newton and Leibniz and Huygens and others managed to solve that problem cleanly, formulating the Newtonian view of the universe as one that had the precision of clockwork. 

Add another planet to the mix, however, and the problem becomes considerably more complex. In fact, it has no general closed-form solution and has to be worked out numerically, and even then the solution is far more sensitive to initial conditions. Why is this? It is because each body's movement is now dependent upon the relative position of the other two. The orbits go from being serene and stable to chaotic. Over time, the system will become quasi-stable, usually because the planets have moved far enough apart that the impact of any one planet on another is overwhelmed by the impact of the star on each. We know this because it became possible to model the interactions numerically, and by running enough simulations astronomers were able to create an ensemble of potential orbits that gives them an idea of what is likely to happen when certain configurations are present.
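
To make the idea of numerical modeling concrete, here is a minimal sketch of a star with two planets, integrated step by step in Python. Everything in it is an assumption made for illustration: the units (G = 1), the masses, the starting positions, and the function names. It runs the same system twice with one starting position nudged slightly and reports how far apart the two runs end up. How large that gap becomes depends heavily on the configuration, and a real ensemble would use many such runs rather than a single pair.

```python
import math

G = 1.0  # gravitational constant in arbitrary units (assumption)

def accelerations(pos, masses):
    """Newtonian gravitational acceleration on each body from all the others."""
    acc = [[0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r3 = (dx * dx + dy * dy) ** 1.5
            acc[i][0] += G * masses[j] * dx / r3
            acc[i][1] += G * masses[j] * dy / r3
    return acc

def simulate_orbits(pos, vel, masses, dt=0.002, steps=10000):
    """Velocity-Verlet integration in 2D; returns the final positions."""
    pos = [p[:] for p in pos]  # work on copies so the inputs are untouched
    vel = [v[:] for v in vel]
    acc = accelerations(pos, masses)
    for _ in range(steps):
        for i in range(len(pos)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
                pos[i][k] += dt * vel[i][k]        # drift
        acc = accelerations(pos, masses)
        for i in range(len(pos)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick
    return pos

def final_gap(offset, dt=0.002, steps=10000):
    """Distance between the outer planet's end positions in two nearly identical runs."""
    masses = [1.0, 1e-3, 1e-3]  # star, inner planet, outer planet (hypothetical)
    pos = [[0.0, 0.0], [1.0, 0.0], [1.3, 0.0]]
    vel = [[0.0, 0.0], [0.0, 1.0], [0.0, math.sqrt(1.0 / 1.3)]]
    nudged = [p[:] for p in pos]
    nudged[2][0] += offset      # tiny change to one starting position
    a = simulate_orbits(pos, vel, masses, dt, steps)
    b = simulate_orbits(nudged, vel, masses, dt, steps)
    return math.dist(a[2], b[2])

gap = final_gap(1e-6)
print(f"starting positions differed by 1e-6, end positions differ by {gap:.2e}")
```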

As an aside, the early solar system likely had several dozen planets in it at one point. Most of them either were absorbed by the sun, got kicked out of the solar system altogether, or impacted other planets (such as what happened with the progenitors of the Earth and the Moon). Talk about chaotic systems!

Embracing Chaos and Randomness

So what does this have to do with marketing directors? Quite a bit, actually. In an economic model where you have customers and a single business trying to win their hard-earned cash, the modeling is pretty easy - the models are actually variants of what are called Lotka-Volterra models, which are differential equations originally designed to model predator-prey relationships. In this two-agent problem, the populations of predators and prey follow a predictable pattern, with the population of prey increasing over time, which increases the number of predators. The predators then grow faster than the prey, causing the prey's numbers to start falling again. The predators begin to starve, and so their numbers too begin to decline, which lets the prey recover and starts the cycle again.
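
The predator-prey cycle described above is usually written as the Lotka-Volterra equations, d(prey)/dt = alpha * prey - beta * prey * pred and d(pred)/dt = delta * prey * pred - gamma * pred. The sketch below integrates them with a standard fourth-order Runge-Kutta step. The parameter values and starting populations are assumptions chosen only to produce a visible cycle, and the function names are hypothetical.

```python
def lotka_volterra(prey, pred, alpha, beta, delta, gamma):
    """Rates of change for the prey and predator populations."""
    return (alpha * prey - beta * prey * pred,
            delta * prey * pred - gamma * pred)

def simulate_populations(prey, pred, alpha=1.0, beta=0.1, delta=0.075,
                         gamma=1.5, dt=0.01, steps=2000):
    """Integrate with classical 4th-order Runge-Kutta; returns [(prey, pred), ...]."""
    p = (alpha, beta, delta, gamma)
    history = [(prey, pred)]
    for _ in range(steps):
        k1 = lotka_volterra(prey, pred, *p)
        k2 = lotka_volterra(prey + 0.5 * dt * k1[0], pred + 0.5 * dt * k1[1], *p)
        k3 = lotka_volterra(prey + 0.5 * dt * k2[0], pred + 0.5 * dt * k2[1], *p)
        k4 = lotka_volterra(prey + dt * k3[0], pred + dt * k3[1], *p)
        prey += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        pred += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        history.append((prey, pred))
    return history

history = simulate_populations(prey=10.0, pred=5.0)
prey_values = [x for x, _ in history]
pred_values = [y for _, y in history]
print(f"prey ranges from {min(prey_values):.1f} to {max(prey_values):.1f}")
print(f"predators range from {min(pred_values):.1f} to {max(pred_values):.1f}")
```

Read prey as demand and predators as supply (or the other way around, depending on the market) and the same loop gives the chasing pattern of the single-business case.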

In a monopoly-held business, the same thing applies - supply and demand tend to chase one another in regular, predictable patterns. The moment you add another factor, however (such as the prey potentially overgrazing their territory or a second competing company showing up), the equations go chaotic, though usually settling after a while into a quasi-stable state - what mathematicians would call a local minimum in the phase space.

Statistics involves creating models where you are looking at aggregates of individual actors, where each actor can be thought of as a bundle of attributes that, with any luck, don't interact with one another. If you look at all of the potential states that each attribute can be in, and if you model things right (there is no bias towards one particular attribute in a set), then you can determine general sentiment in a (mostly) predictable manner. The actors are independent, and as such, you can determine patterns and clustering that represent an average snapshot, along with measures of both how much divergence there is from the mean (the variance) and how far the answers from any sample you pick are likely to be from overall sentiment (the bias of the sample).

Statistics works primarily because the differing attributes (such as opinions) tend to converge towards a mean with a large enough sample. In a perfect democracy, everyone would vote on every decision, and you'd know what was likely the best possible action because the population on average has considered all potential actions in terms of their own biases, and major divergences from this ideal tend to cancel out. It's one reason why you generally want as large a sample size as possible in an election. The mathematics behind this is the law of large numbers, a close cousin of the random walk, and it turns out to have all kinds of surprising, and sometimes very counterintuitive, implications.
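
The convergence towards the mean described above (the law of large numbers) can be checked with a small simulation. This is only a sketch: the 60% "true" share in favor is an invented number, each simulated answer is assumed to be independent, and `sample_mean` is a hypothetical helper.

```python
import random

def sample_mean(true_share, n, rng):
    """Mean of n independent yes/no answers, each 'yes' with probability true_share."""
    return sum(1 for _ in range(n) if rng.random() < true_share) / n

rng = random.Random(42)  # fixed seed so the run is repeatable
true_share = 0.6         # hypothetical share of the population in favor

# Error of the sample estimate at a few sample sizes.
errors = {
    n: abs(sample_mean(true_share, n, rng) - true_share)
    for n in (10, 100, 10_000, 100_000)
}
for n, err in errors.items():
    print(f"sample of {n:>7}: estimate is off by {err:.4f}")
```

Small samples can land anywhere, while the largest samples stay close to the true share, which is why the sample size matters so much.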

Modeling markets straddles the boundary between linear modeling, non-linear modeling and stochastic (statistical) modeling. In a market with dozens of different participants of various sizes, the number of interactions between participants gets to be large enough that behaviors become chaotic, and the surety that you get in purely linear systems goes away. What's more, you're typically dealing with emergent behaviors, islands of relative stability (local minima) that nonetheless can become destabilized by unexpected interactions. At the same time, you don't have a large enough sample size to make stochastic modeling accurate.

This is where neural networks (the basis of what is commonly called "deep learning") come into play. Neural networks take inbound sample data and use it via a multistage process to create a set of equations with different weights that can be used to classify a given actor according to certain criteria based on the previous data. Each stage is still built from linear pieces, but together they can identify, for various points of stability, the likelihood that a particular thing is of that type. It turns out that for a number of kinds of problems, this is usually pretty good, so long as you have enough data and that data is fairly independent. Mathematicians talk of piece-wise linear functions, which are a way of replacing non-linear equations with multiple sets of linear equations. This is roughly the equivalent of graphing a set of points and then calculating an approximating curve that mostly fits those points, except in many more dimensions.
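
The piece-wise linear idea can be shown in one dimension without any neural network at all. The sketch below replaces a curve (a sine wave, picked only as a convenient non-linear example) with a handful of straight segments and measures the worst-case gap. The function names are hypothetical, and this shows only the approximation idea, not how a network is trained.

```python
import math
from bisect import bisect_right

def piecewise_linear(f, lo, hi, pieces):
    """Build an approximation of f on [lo, hi] from `pieces` straight segments."""
    xs = [lo + (hi - lo) * i / pieces for i in range(pieces + 1)]
    ys = [f(x) for x in xs]

    def approx(x):
        # Find the segment containing x, clamping to the first or last one.
        i = min(max(bisect_right(xs, x) - 1, 0), pieces - 1)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return approx

def max_error(f, approx, lo, hi, samples=1000):
    """Largest gap between f and its approximation on an evenly spaced grid."""
    grid = (lo + (hi - lo) * k / samples for k in range(samples + 1))
    return max(abs(f(x) - approx(x)) for x in grid)

coarse = piecewise_linear(math.sin, 0.0, math.pi, pieces=4)
fine = piecewise_linear(math.sin, 0.0, math.pi, pieces=16)
err_coarse = max_error(math.sin, coarse, 0.0, math.pi)
err_fine = max_error(math.sin, fine, 0.0, math.pi)
print(f"4 segments: {err_coarse:.4f}, 16 segments: {err_fine:.4f}")
```

More segments give a closer fit, at the cost of more numbers to estimate, which is one reason these models need plenty of data.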

A New Role for Marketing, Or An Old One Revamped?

What this has meant in practice is that a person working in the marketing space today likely is involved in attempting to build models to approximate various aspects of business that don't break when the odd piece of counterfactual information enters into the system. In a way, this has always been the role of marketing, but this was generally not realized because of marketing's other role as the source of communication to the outside world. Even here, though, data analytics plays a big role, in that the goal of corporate communication (whether a marketing campaign or piece of PR) is to shape messages that influence the market to respond.

In that regard, marketing and communication are the sensory functions and the brain of an organization - gathering information, using it to create models that help explain why that information is salient, shaping a response based upon that model to more effectively influence that market, executing on that response, then starting the cycle all over again. This data also should (though seldom does in all too many organizations) drive internal processes - the design of new products, the stance taken in adverse market conditions, decisions about supply chains and production, all of these should be driven off of the models that are being developed by marketing.

There's one final piece that's important here, and sometimes gets lost. Modelers are not specialists in modeling. Rather, they are business subject matter experts who are able to determine what data is significant and what data is noise, and to help shape the models accordingly. What this points to is that it is generally better for a company to send out someone who understands the business for training in becoming a data scientist than it is to train a data scientist to become an expert in the business.

The tools are just that - tools. They require a good analytical mindset to master, but increasingly marketers should be learning how to use those tools if they want to survive in this industry, if only so that they can direct or manage those people who are statisticians or machine learning specialists or ontologists to refine those tools in pursuit of the broader needs for modeling in the first place.


Data is at the heart of digital transformations, but data in the absence of understanding is just noise. Ultimately marketing should be seen as the driver of business analysis within the organization, making sense of external data, using it to formulate multiple actions, then shaping the messaging that the organization provides to influence that complex market. It does require someone who understands the people side of the equation, but increasingly the expectation on the chief marketing officer is that they are skilled in being able to understand and interpret the analytics that they are receiving as part of the overall data cycle. I plan on addressing this in more depth in a subsequent newsletter.

Kurt Cagle

Tech Expert

Kurt is the founder and CEO of Semantical, LLC, a consulting company focusing on enterprise data hubs, metadata management, semantics, and NoSQL systems. He has developed large scale information and data governance strategies for Fortune 500 companies in the health care/insurance sector, media and entertainment, publishing, financial services and logistics arenas, as well as for government agencies in the defense and insurance sector (including the Affordable Care Act). Kurt holds a Bachelor of Science in Physics from the University of Illinois at Urbana–Champaign. 
