At the outset, let’s get down to the basics, especially for those who don’t know what the titular term “knowledge graph” means.
To put it simply, it is a knowledge base that holds your structured and unstructured data in graph form. It represents data using nodes and relationships (formally known as vertices and edges): nodes represent business entities or attributes, and relationships, as the name suggests, capture the type, direction, and strength of the connections between nodes.
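To make the node-and-relationship idea concrete, here is a minimal sketch in plain Python. The class names, node ids, and relationship types (`OWNS`, `TRANSFERRED_TO`) are made up for illustration; a real graph database would offer its own model and query language.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A directed, typed, weighted connection between two nodes."""
    source: str
    type: str
    target: str
    strength: float = 1.0


@dataclass
class KnowledgeGraph:
    nodes: dict = field(default_factory=dict)          # node id -> attributes
    relationships: list = field(default_factory=list)  # all edges

    def add_node(self, node_id, **attributes):
        # Create the node if needed, then merge in any new attributes.
        self.nodes.setdefault(node_id, {}).update(attributes)

    def relate(self, source, rel_type, target, strength=1.0):
        # Make sure both endpoints exist before connecting them.
        for node_id in (source, target):
            self.add_node(node_id)
        self.relationships.append(
            Relationship(source, rel_type, target, strength)
        )

    def outgoing(self, node_id, rel_type=None):
        # Relationships leaving a node, optionally filtered by type.
        return [
            r for r in self.relationships
            if r.source == node_id and (rel_type is None or r.type == rel_type)
        ]


# Hypothetical business entities and connections.
kg = KnowledgeGraph()
kg.add_node("alice", label="Customer", segment="retail")
kg.add_node("acct-1", label="Account")
kg.relate("alice", "OWNS", "acct-1")
kg.relate("alice", "TRANSFERRED_TO", "bob", strength=0.8)

for rel in kg.outgoing("alice"):
    print(rel.source, rel.type, rel.target, rel.strength)
```

Each relationship carries exactly the three things mentioned above: a type, a direction (source to target), and a strength.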
In all, it combines business data and business knowledge to give a contextualized and fully integrated view to the data consumers.
A parallel can be drawn between deep learning and knowledge graphs: just as the former came to the forefront in recent years once more compute power became available, knowledge graphs are the result of years of research in the semantic computing space, and today’s graph technology lets us put that research to work on contemporary business problems.
Apart from this organic expansion of the field, a pressing need to make sense of machine learning model outcomes has also fuelled the recent explosion of interest in knowledge graphs.
In addition, they work the way humans do. The way we build and understand relationships is closely analogous to how they are structured. Hence they are also intuitively interpretable, unlike their networked counterparts (neural nets), which work like a beefy (and data-hungry) labourer, counting on might (compute), tirelessly working by trial and error to find out which combinations work and which don’t, and settling for whatever works reasonably well. In comparison, these graphs leverage known relationships and model domain knowledge: because the key features and their order of importance are already built into the graph representation, they can fast-track and improve model performance.
Knowledge graphs can be used as a logical layer providing a single view over multiple data sources, or can be built on top of a data lake to make comprehensive sense of the data in it. They also support more than one architectural approach: acting as an index over data that stays in its original stores (virtualized), or replicating the data in graph form (materialized). Lastly, many graph platforms are designed to run as distributed systems, which makes them a good fit for enterprise data at large.
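The difference between the virtualized and materialized approaches can be sketched in a few lines of Python. Everything here is a stand-in: the two "source systems" are plain dictionaries, and the class names are hypothetical rather than any product's API.

```python
# Two hypothetical source systems, stood in for by plain dicts.
crm = {"c1": {"name": "Acme Ltd", "tier": "gold"}}
billing = {"c1": {"outstanding": 1200}}
sources = {"crm": crm, "billing": billing}


class VirtualizedNode:
    """Holds only pointers (source name, key); data is fetched on read."""

    def __init__(self, refs):
        self.refs = refs

    def properties(self):
        merged = {}
        for source_name, key in self.refs:
            merged.update(sources[source_name][key])
        return merged


class MaterializedNode:
    """Copies the data into the graph once, at load time."""

    def __init__(self, refs):
        self._props = {}
        for source_name, key in refs:
            self._props.update(sources[source_name][key])

    def properties(self):
        return dict(self._props)


refs = [("crm", "c1"), ("billing", "c1")]
virtual = VirtualizedNode(refs)
materialized = MaterializedNode(refs)

# The billing system changes after the graph was built.
billing["c1"]["outstanding"] = 0

print(virtual.properties()["outstanding"])       # reads the live source
print(materialized.properties()["outstanding"])  # still the copied value
```

The trade-off shown is the usual one: the virtualized node is always current but depends on the sources at read time, while the materialized node is self-contained but needs a refresh to pick up changes.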
Here are a few business cases where knowledge graphs can be used across industries: detecting communities of similar behaviour in fraud origination and detection, identifying key revenue-generating clients for your business, improving customer experience through journey analysis, serving real-time recommendations, optimizing information search, performing root cause analysis to map consequences to their causes, and improving the predictions of algorithms, among many others.
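As a rough illustration of the first case, the sketch below groups accounts that are linked through shared identifiers. It uses connected components found by breadth-first search, which is the simplest stand-in for community detection and not a production fraud algorithm; the account, device, and phone ids are invented.

```python
from collections import defaultdict, deque

# Hypothetical (account, shared identifier) observations.
links = [
    ("acct-1", "device-A"), ("acct-2", "device-A"),
    ("acct-2", "phone-9"), ("acct-3", "phone-9"),
    ("acct-4", "device-B"),
]

# Build an undirected adjacency map between accounts and identifiers.
adjacency = defaultdict(set)
for account, identifier in links:
    adjacency[account].add(identifier)
    adjacency[identifier].add(account)


def communities(adjacency):
    """Return the connected components of the graph as sets of node ids."""
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        queue, group = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(group)
    return groups


# Keep only the accounts from each component.
account_groups = [
    sorted(n for n in group if n.startswith("acct-"))
    for group in communities(adjacency)
]
print(account_groups)
```

Here `acct-1` and `acct-3` never share an identifier directly, yet they land in the same group because `acct-2` connects them, which is the kind of indirect link a graph makes easy to see.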
Let’s move on to a specific business case: imagine this possibility in the field of machine translation. Why can’t we create something called a meaning space, in which sentences are placed according to their semantic similarity across languages rather than their token-based closeness? We need a space that gives us vectors of synonymised, cross-language statements rather than a probabilistic approach played out on a thesaurus built from a massive corpus in one language.
For instance, to translate “barks at the moon” into another language, the phrase doesn’t need to be broken into its tokens. Rather, it should have “engaged in an unproductive activity” or similar-meaning statements as its neighbours, even in the other language, in the same meaning space. Knowledge graphs built on this method could be readily used across languages for a multitude of business cases.
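The neighbour lookup in such a meaning space can be sketched with cosine similarity. The vectors below are hand-written toy values on three imagined axes (futility, animals, astronomy), not the output of any model, and the Spanish phrases are just sample candidates.

```python
import math

# Toy "meaning space": (language, phrase) -> hand-made vector.
# Axes are imagined as (futility, animals, astronomy); values are invented.
meaning_space = {
    ("en", "barks at the moon"): (0.9, 0.2, 0.1),
    ("en", "engaged in an unproductive activity"): (0.95, 0.0, 0.0),
    ("es", "perder el tiempo"): (0.9, 0.05, 0.0),   # "to waste time"
    ("es", "el perro ladra"): (0.05, 0.95, 0.0),    # "the dog barks"
    ("es", "la luna llena"): (0.0, 0.1, 0.95),      # "the full moon"
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query_key, target_language):
    """Closest phrase to the query among phrases in the target language."""
    query_vector = meaning_space[query_key]
    candidates = [
        (cosine(query_vector, vector), phrase)
        for (language, phrase), vector in meaning_space.items()
        if language == target_language and (language, phrase) != query_key
    ]
    return max(candidates)[1]


print(nearest(("en", "barks at the moon"), "es"))
```

With these toy vectors the idiom lands next to the phrase about wasting time rather than next to the literal phrases about dogs or the moon, which is the behaviour the paragraph above asks for.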
Coming to enterprise adoption, Google uses a knowledge graph behind the scenes to augment its search results. LinkedIn has a social knowledge graph. Amazon and Netflix have also been using graphs for about a decade to learn more about their customers and their product preferences. Even Airbnb and Comcast are reaping the benefits of graph technology, along with many others. Adoption is becoming increasingly pervasive with the widespread emergence of graph databases and related data science tools.
Graph databases that have allowed industries to tap into the potential of knowledge graphs include Neo4j, ArangoDB, Titan (since succeeded by JanusGraph), OrientDB, Dgraph, and Amazon Neptune.
One of the biggest benefits of a knowledge graph is that it lets you start today with just enough data for your use case, rather than making you wait for a really large corpus or build a complex ontology first. You can begin with the simplest way to tackle a challenge or address a need you have today.
Pick a suitable database, and then try out a graph based on whatever internal data is available, in line with your business objective. Since the graph is not static, you can keep adding ontologies as your business requirements evolve, through timely iterations. This intuitive, dynamic, single source of truth is sure to serve you very well.
With nearly 15 years of industry experience, Pratik works as a delivery head for global analytics projects at a Bangalore-based MNC. Involved in various innovative projects and concepts, he applies a range of Machine Learning and Deep Learning algorithms to create and deliver strategic insights. As part of his wide range of assignments, he pieces together new technology trends and shifting business demands to bring about cutting-edge applications. For years he has been blending his analytical prowess and people skills to tap into the unexplored and less-explored business dimensions and convert them into value creators. Passionate about sharing his continual learnings, he is also a corporate trainer and a speaker at events. Pratik holds an MBA in Finance with Information Technology and a bachelor’s degree in Industrial engineering.