As we stand on the precipice of 2024, the metaphorical "glass darkly" is a lens through which we seek glimpses of the unfolding future.
In this reflective journey, we contemplate the potential trends, challenges, and transformative shifts that may shape the year ahead.
I started Through a Glass Darkly back in 2003, and it has become, for me, a chance to identify those trends that I think are likely to play out over the next year or two. I haven't always published it, and my track record is far from perfect, though usually because a trend I thought would "break out" that year took longer than expected to incubate.
If there's a theme, it's that we're now entering a period not of rapid evolution but rather iterative refinement.
As I've focused more and more on AI this year, there were times when I felt overwhelmed. Radical new developments, everything from LangChain and RAG to attention mechanisms over context to techniques for increasing the mathematical precision of calculations in queries, were coming so fast that keeping up with it all became a Herculean chore. The whole OpenAI debacle, when it broke over a weekend, seemed perfectly emblematic of the situation. Things were happening so fast that people no longer had time to think, and when that occurred, dumb decisions almost invariably followed.
There are signs that the firehose blast of AI innovations is turning into a stream - still torrential, but no longer capable of blasting paint off a car. A certain degree of LLM exhaustion has overtaken everyone, and even businesses are beginning to push back, indicating that they will not adopt this technology until things slow down and the business work that needs to be done, from testing to determining ROI scenarios, gets done. Computer scientists in the field have also sounded an alarm that the pace needs to moderate because of increasingly worrisome characteristics in the largest LLM models.
One thing that I do see - we're watching the rise of an ecosystem. ChatGPT and Gemini are ideally poised to become social media plays and represent the big whales. On the other hand, llamas - smaller, more focused LLMs - are continuing to evolve and are increasingly backed by knowledge graphs of various sorts.
What is most remarkable to me is that the architecture here is mostly the same at the various scales, from the largest to the smallest, as if we've finally settled down on a basic body plan, and the differences are largely in how they can take advantage of the data environments around them. The backing data stores, in turn, simply become components, the intrinsic DNA or the mitochondrial power source of these AI cells.
A true phase shift occurs when the organisational mechanisms within an environment transform. Ice, for example, has a comparatively rigid and regular structure; water does not. We are undergoing a phase shift and will continue to do so over the next decade.
Agents have already been targeted as the next major investment area by the powers that be. The drumbeat has already begun. No doubt there are decks at Gartner and Forrester and other industry arbiters that have been prepared with the words Agent in bold letters in the title, just ready to be dusted off to feed into the great marketing hype mill that seems to drive the tech sector.
An agent is, as you might expect, someone who does something for someone else (usually for an appropriate fee). An agent can be a human being, but the agents involved here are computer AI entities that act in lieu of their respective clients.
For instance, you may have an agent that is asked to invest $100,000 in a portfolio of stocks. That agent is given a set of permissions and the right to represent its client legally. The agent then purchases $100K in stock that meets the client's requirements, interacting with other agents representing the exchange. Agents are comparatively long-lived - they may exist to accomplish a specific task, or they may put themselves into a loop until some specific condition occurs, at which point they awaken, perform their task, and go back to sleep.
Agents are capable of doing quite a number of things, but a rather astonishing percentage of them involve the transfer of money for some good or service. Not surprisingly, there is a strong financial incentive to be the one providing the infrastructure for the agents, since whoever does so can take a cut of each transaction as a fee.
There are other roles that agents can perform - they can find or filter content autonomously, they can act to oversee distributed or map/reduce type operations, they can transmit updates to various distributed data systems and so forth. However, it is in the monetary or data access roles that they excel.
Are agents AI? That's debatable. It's probably easier to think of them as intelligent state machines, changing from state to state as needed based upon fairly fluid rules. Those rules may be generated from a neural network or llama, but they may also come from a knowledge graph or similar tool. Agents may also serve as an interface that a human uses to interact with other humans through a proxy (such as buyers or sellers of specific goods, or employees working under contract).
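The "intelligent state machine" framing can be made concrete with a small sketch. The names and rules below are entirely hypothetical - a hard-coded buy rule stands in for rules that might, in practice, be generated by a llama or a knowledge graph - but the shape (sleep, wake on a condition, act, sleep again) is the pattern described above:

```python
from enum import Enum, auto

class State(Enum):
    SLEEPING = auto()
    DONE = auto()

class LimitOrderAgent:
    """Toy agent: buy a stock when its price drops to a target.

    Hypothetical example - in practice the rule governing the state
    transition might come from an LLM or a knowledge graph, and the
    agent would act within permissions granted by its client.
    """
    def __init__(self, symbol, target_price, budget):
        self.symbol = symbol
        self.target_price = target_price
        self.budget = budget
        self.state = State.SLEEPING

    def tick(self, current_price):
        # One wake/check/act cycle; a real agent would be event-driven
        # and would talk to an agent representing the exchange.
        if self.state is State.SLEEPING:
            if current_price <= self.target_price:
                shares = int(self.budget // current_price)
                self.state = State.DONE
                return ("BUY", self.symbol, shares)
            # Condition not met: go back to sleep until the next tick.
        return None
```

Used as `agent = LimitOrderAgent("ACME", 50.0, 100_000)`, a tick at $55 returns nothing and the agent stays asleep; a tick at $49 returns a buy order and the agent retires.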
Regardless, I expect that agents will be big business towards the end of 2024 or early 2025, generally under the auspices of an Agent API. I also suspect that while there may be a push to tie this back into Blockchain and decentralized finance, most companies will likely push for a more general API (see Crystal Memory below as one such possibility).
When looking at generative AI, it's easy to focus on the interactive and compelling nature of ChatGPT and related applications. However, in many respects the real revolution that has taken place in 2023 has been in the field of AI media.
GANs (generative adversarial networks), first introduced in 2014, pit a generator network against an adversarial discriminator, a technique that eventually led to very realistic photos of people who did not exist. Over the following years, this fed a revolution in image generation, using neural-net-based models trained on hundreds or even thousands of images to create pictures from other pictures. This process raised many people's hackles along the way, especially as those training images were originally obtained primarily from the Internet without permission. That situation has been considerably ameliorated since then, but it still left a bad taste in many people's mouths.
Since then, the quality of such images has improved dramatically. One key invention that helped this process was the introduction of LoRA (Low-Rank Adaptation of Large Language Models), which made it possible to create adaptors for models using far fewer parameters. Rather than needing 4 to 6 GB for many different kinds of models, an artist could use 100 to 400 MB or sometimes less, cutting down on both space and the number of models needed, as the LoRA brought in a speciality (for instance, wearing a particular costume, or being posed in a certain way). LoRAs made such models configurable for different needs and images.
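The size savings follow directly from the low-rank trick. A minimal numpy sketch, with illustrative dimensions rather than those of any real model: instead of fine-tuning a full d×k weight matrix, LoRA trains (and ships) two thin matrices whose product approximates the weight update, r·(d+k) numbers instead of d·k.

```python
import numpy as np

d, k, r = 1024, 1024, 8   # layer dimensions and LoRA rank (illustrative only)

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))          # frozen base weight matrix
B = np.zeros((d, r))                     # LoRA matrices: B starts at zero,
A = rng.standard_normal((r, k)) * 0.01   # so the adaptor is a no-op initially

# Adapted forward pass: y = (W + B @ A) @ x; only A and B get trained.
x = rng.standard_normal(k)
y = W @ x + B @ (A @ x)

full_params = d * k          # 1,048,576 parameters to fine-tune directly
lora_params = r * (d + k)    # 16,384 parameters in the adaptor (64x fewer)
```

At real model scale the same ratio is what turns a multi-gigabyte checkpoint into a few hundred megabytes of adaptor weights.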
Additionally, other techniques, such as in-painting, made it possible to modify certain parts of a picture based on both prompts and the existing image. This opened up the possibility of camera effects such as zooms, pans and rotations, and as people experimented, it was only a matter of time before image GANs led to video GANs.
As 2023 wraps up, video GANs are poised to take off, having already made their way into Instagram and other social media platforms. Software platforms such as Pika and Runway, which use Discord and web-based apps, respectively, let people create short (3 to 4-second) snippets of video, but there is already enough in place to see that the technology is likely to explode in 2024, giving would-be filmmakers, animators, and special effects artists, armed with an Nvidia chip (or an array of same), the ability to create their masterpieces.
I expect two things are going to happen. The first is that tools to create AI videos will become commonplace and sophisticated enough for the average hobbyist or small business to make a living with them. This is going to open up a lot of "production houses" capable of building fairly sophisticated moving content both for advertising and for general consumption, and it likely means that there will be a generation of children and young adults who will take such skills for granted.
The second area I see is the integration of mesh-based systems (such as Blender or Maya) with AI systems (most notably point-cloud-based systems or splats, which in turn build on top of NeRFs), especially as the interchange between the two kinds of data files becomes more commonplace.
Two factors will facilitate that - the increasing number of higher-end gamer GPUs in the hands of the average person and the ready need for larger organizations to exchange asset files that can nonetheless be manipulated for everything from rigging to final production.
The other area where this will become critical is in the rise of avatars. You're seeing this phenomenon already in the influencer arena, where ad agencies are creating realistic but completely fabricated influencers for Instagram, TikTok, and other popular social media platforms. Such avatars are not yet completely autonomous, but the writing is on the wall.
Note that Hollywood is likely paying attention here - while many of the techniques that have fueled the AI diaspora have their origins in Hollywood or Marin County, things have been evolving so quickly that many of the latest techniques have not really made their way into the Big Media production pipeline (though the recent Spider-Verse movies suggest that someone, at least, has been paying attention).
You will likely hear a great deal about neuromorphic chips and quartz crystal storage this year. Neuromorphic chips are designed specifically to carry deep learning models and are thus optimized for complex retrieval. This is an example of a technology (neural networks) making the jump from software to hardware.
Most AI models currently run on GPUs - high-speed, high-capacity chips designed for the complex mathematical processing common in games and 3D rendering. For neural networks, however, this is more than a little overkill: what is needed is not deep, elaborate transformational pipelines but very fast processing at considerably less precision (16- and even 8-bit floating-point numbers).
Specialized chips optimized this way give up many of the more powerful (and costlier) graphics features in exchange for blazing speed and a design tuned for neural-net kernels, which sit at the heart of nearly all deep learning systems.
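The precision trade-off is easy to demonstrate in software. A small numpy sketch (the weight values are illustrative only): dropping from 32-bit to 16-bit floats halves storage, and the rounding error introduced is tiny relative to typical neural-net weights, which is why reduced-precision hardware works so well for inference.

```python
import numpy as np

# Illustrative weight values; float16 halves the storage of float32
# (and 8-bit formats would quarter it) at a precision cost that
# neural networks generally tolerate well.
w32 = np.array([0.1234567, -1.9876543, 3.1415927], dtype=np.float32)
w16 = w32.astype(np.float16)

storage_32 = w32.nbytes                                 # 12 bytes
storage_16 = w16.nbytes                                 # 6 bytes
max_error = np.abs(w32 - w16.astype(np.float32)).max()  # rounding error < 0.01
```

float16 keeps roughly three decimal digits of precision, which is ample for weights that are themselves the product of noisy stochastic training.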
Additionally, many of these chips incorporate localized data storage in each node of the processor stack, significantly reducing the latency of data reads and writes. Most of the major tech companies are now exploring some variation of the neuromorphic chip for everything from specialized llamas for companies to something that could embed AI into a car, television set, IoT device, or drone, and neuromorphic chips will likely start appearing in production toward the latter half of 2024.
One other interesting hardware innovation will be the rise of Crystalline Storage. A crystal is just that - a piece of pure quartz that is shaped in such a way as to allow for laser etching to embed information into the internal matrix of the quartz, which can then be read via a similar arrangement of lasers.
Crystalline storage is a foray into optical computing, but in this case it is used primarily as a permanent store - information written once stays inviolate, with very low overhead for retrieval. While the most obvious use case for such crystal memory is as a repository for archived regulatory data (reducing the burden on large data servers that sit mostly unused), the real value comes in identity and key management, a role blockchain was originally intended to fill.
When a transaction gets written (perhaps redundantly), it is inviolate from that point on, meaning it can be used not only as a way of identifying resources but also for securing transactions, making it a far better (and faster) mechanism than blockchain for managing e-commerce applications.
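Why does immutability secure a transaction? A minimal sketch, using only the standard library and hypothetical record fields: pair each write-once record with a cryptographic digest, and any later tampering becomes detectable without needing a distributed ledger to vouch for it.

```python
import hashlib
import json

def seal(record):
    # Canonicalize the record (sorted keys) and hash it. If the record
    # itself is stored immutably - as in a write-once crystal - this
    # digest lets any reader verify integrity later.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Hypothetical transaction record, for illustration only.
tx = {"buyer": "agent-17", "asset": "ACME", "shares": 100, "price": 49.90}
digest = seal(tx)

assert seal(tx) == digest                       # unchanged record verifies
assert seal({**tx, "shares": 1000}) != digest   # any modification is detectable
```

Blockchain achieves the same tamper-evidence by chaining such hashes across a distributed network; a physically write-once medium gets there in one step, which is the speed argument being made above.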
While AI took center stage in 2023, there is no question that social media was much more of a carnival sideshow. The once-ubiquitous Twitter, purchased in late 2022 by Elon Musk, very quickly became a clown car: the loss of most of the Twitter staff, the attempt to make money off verification, the hiring of a new CEO after the company proved a time sink for the tycoon, the shift in political philosophy, the long, steady slide in market value to a quarter of its worth, the rebranding to X, the openly antisemitic remarks that sent advertisers running for the hills, and Musk's own admission that he would not put any more money into the company all conspired to put the future of the site into serious doubt.
This hasn't stopped others from looking to fill the void. Mastodon and Bluesky both gained significant traction this year, Instagram spun off Threads as a Twitter act-alike, and platforms such as Discord stepped up their own development efforts. It's possible that someone may still step in, buy X, and revert the brand, though it's going to take a significant amount of resuscitation to get it back to health after the beating it took in 2023.
However, given the continued strength of AI mania, I think the next likely wave of social media will come from the GPT/Gemini/Mistral side of things. A multiperson chat environment with access to the Internet sounds a lot like the perfect forum for social media, especially if it becomes possible to talk to ChatGPT or Gemini as simply one more voice in the queue (Discord comes closest to this, which is why that particular platform is worth watching closely). That ChatGPT, especially, already supports gaming, plug-ins, and image generation, and will likely support video and audio, raises the stakes considerably.
Microsoft, in particular, may be in the sweet spot here - it has previous experience with social media in a way that most other companies don't, and once you get past the notion that social media today must look like Twitter or Facebook or Instagram and see it as a potential universal platform, a branded social media spinoff from Microsoft built around ChatGPT could change the dynamics of the market dramatically. If it does happen, I anticipate it will be towards the latter half of 2024 or early 2025.
In the waning months of 2023, I noticed that a growing number of knowledge graph vendors (both JSON- and RDF-based) were incorporating vector stores into their offerings: specialized databases (really indexes) used to create a catalogue of all of the tokens (think words or phrases, taking stemming variations into account) in a given document. The tokens, in this case, are labels, and the values indicate some set of frequency measures (such as count, normalized count, or some other weighted measure).
If you have two documents, you can take a weighted dot product of their term lists: if both documents share the same word, the weight of that word in one document is multiplied by its weight in the other; otherwise, the contribution is zero. By adding up this weighted sum of products, you can determine the similarity of the two documents.
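The calculation above fits in a few lines of Python. This is a toy sketch - real vector stores use learned embeddings and smarter weighting (stemming, idf, and so on) - but the dot-product machinery is the same, including the normalized (cosine) form that keeps long documents from dominating:

```python
from collections import Counter
import math

def term_vector(text):
    # Toy tokenizer: lowercase word counts. Real systems would add
    # stemming, stop-word removal, and idf-style weighting.
    return Counter(text.lower().split())

def weighted_dot(v1, v2):
    # Sum of weight products over shared tokens; tokens appearing in
    # only one document contribute zero, exactly as described above.
    return sum(w * v2[t] for t, w in v1.items() if t in v2)

def cosine_similarity(v1, v2):
    # Normalized dot product, so document length doesn't dominate.
    dot = weighted_dot(v1, v2)
    norm = math.sqrt(weighted_dot(v1, v1)) * math.sqrt(weighted_dot(v2, v2))
    return dot / norm if norm else 0.0

a = term_vector("knowledge graphs meet vector stores")
b = term_vector("vector stores complement knowledge graphs")
score = cosine_similarity(a, b)  # 4 of 5 tokens shared -> 0.8
```

Two documents sharing four of five tokens score 0.8; identical documents score 1.0, and documents with no shared tokens score 0.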
Vector stores are used heavily by LLMs to determine causal associations of tokens in a sequence. If you also have a data structure that tracks the probabilities of given sequences of tokens (typically with holes in the token set that can then be assigned to a variable for pattern matching), this drives the LLM completion operation. Vector stores have other uses as well, especially in the knowledge graph space, where the vectors can be set up to generate documents from known data. Since similarity calculations are one of the things that graphs in general do not handle well, combining knowledge graphs and vector stores opens up a huge space that was otherwise slow and cumbersome.
My anticipation is that knowledge graphs, vector stores, and llamas (one of the more general terms for neural network document models of any size) are currently in a complex dance that will see them merge into what I refer to as a LlamaGraph: the graph serves as the front end for curational purposes and provides short-term "memory," the vector store handles the similarity analysis, and the llama acts as the primary vehicle for interaction, serving as long-term, read-only storage periodically updated by the knowledge graph.
As to the nature of the knowledge graph, that's largely an implementation detail. Support for Cypher is likely one requirement, and arguably (primarily for internal purposes) support for SPARQL and SHACL, with a GraphQL interface for querying and reading from the knowledge graph. This presupposes some kind of JSON interface, but again this question comes down to what you are trying to do.
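Whatever the query language, the machinery underneath is triple-pattern matching. A toy, pure-Python sketch (the data and `ex:` names are invented for illustration) of the basic graph-pattern matching that SPARQL and Cypher engines build on:

```python
# A toy in-memory triple store; all names here are illustrative only.
triples = {
    ("ex:Llama2", "ex:type", "ex:LLM"),
    ("ex:Llama2", "ex:backedBy", "ex:ProductGraph"),
    ("ex:ProductGraph", "ex:type", "ex:KnowledgeGraph"),
}

def match(pattern, store):
    """Yield variable bindings for one triple pattern.

    Variables start with '?'. This is deliberately simplified: a real
    engine joins bindings across multiple patterns and checks that a
    repeated variable binds consistently.
    """
    for triple in store:
        binding, ok = {}, True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                binding[term] = value
            elif term != value:
                ok = False
                break
        if ok:
            yield binding

# "Which resources are LLMs?" - analogous to the SPARQL query
#   SELECT ?x WHERE { ?x ex:type ex:LLM }
llms = [b["?x"] for b in match(("?x", "ex:type", "ex:LLM"), triples)]
# -> ['ex:Llama2']
```

Cypher expresses the same pattern as paths (`MATCH (x)-[:type]->(:LLM)`), and GraphQL as nested selections, but all three resolve to this kind of pattern-to-binding step underneath.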
One other big unresolved question is the role of ontologies in this space. Some argue that ontologies should be derived from the data. Others, myself included, feel that there should be an option to establish ontologies and then transform data to and from that central ontology to better allow for data interchange, especially in light of the comparative inefficiencies of LLMs for retrieval of information with respect to knowledge graphs.
Either way, it is likely that the integration era of knowledge graphs with machine learning is upon us, and that this year will see a significant move forward in this space.
As I write this, many companies in and adjacent to the tech industry are shedding jobs as fast as they can. It's easy to draw a connection between the introduction of AI in 2023 and the massive layoffs, but I think this is more a case of two correlated events driven by deeper factors, though I expect 2024 will be a somewhat different story.
People are losing jobs because companies were borrowing money recklessly when it was cheap, and when it became expensive, they had to pay the bills in the most expeditious way they knew: cutting staff. The markets are waiting for money to become cheap enough again that they can go back to borrowing it without consequence.
They are also cutting staff because shareholders have clarified that dividends will not be cut. Investment, they reason, is not supposed to be risky, and because they control the salaries of the senior-most people at these companies, well, you get the picture.
AI, on the other hand, is an excuse for cutting jobs at the other end of the spectrum: the jobs of the makers, developers, and creators who have traditionally been the ones who actually create the product. Again, this has less to do with reality (I've noticed that for most artists who are already digitally savvy, the transition to AI has happened fairly seamlessly) than with the fact that those who would employ such artists feel they can produce the same quality of product without them.
Ironically, many are now coming to the realization that artistic (and writing) skills and perception (such as good design) are not things necessarily owned by AI agents (or that maybe being a good prompt writer is more complex than it appears on the surface).
Finally, there's been a general collapse in advertising revenue, a holdover from inflation - everything costs more, meaning more people are stretched financially, with little left over for often-frivolous purchases.
I see some of these trends turning around in 2024. Interest rates have plateaued in many places, and I suspect they will drop in the first quarter of 2024 as central banks ease up based on local conditions. Prices won't come down (exactly), but you'll see more discounting to move inventory that's stacked up, which in turn means that when new product does start moving onto the shelves, it will likely reflect those discounts. Additionally, gas prices are under pressure as the US has stepped up its own pumping, removing another factor in the inflation puzzle.
The dynamic we are seeing is reminiscent of the DotCom crash of 2001. The tech sector and those dependent upon it are in a fairly stiff recession, but the overall economy has mostly recovered from what was likely a near recession earlier this year.
What's changed is that companies that had been trying to get a handle on AI but were sitting on the sidelines because things were changing SO fast are now beginning to explore options, looking at ways they can utilize these technologies without being dependent upon a single large provider. This means it's likely that the demand for expertise in the AI field (any expertise) will rise throughout the year.
On the supply side, this accelerated development cycle has also hampered developers, who are struggling to learn a technology that has literally been changing weekly. Retraining takes time, but that is an occupational hazard of being a developer. I suspect that many people are retooling their skill sets at this point, and as such, the cost of hiring someone with decent chops in AI will come down, reducing the overall deficit created by all the firings.
It's worth noting that this expertise is so new that nobody has much of it. Do you want someone who is an expert in Retrieval Augmented Generation, Chain of Code, Mixture of Experts, and so forth, with a minimum of five years' experience in each? None of these existed six months ago. There are no best practices because the technology is still emerging.
The AI market right now is still very frothy - meaning that there is a layer of foam that seems thick but is fairly insubstantial, on top of the much more substantial milk in the latte that is "traditional" IT (how's that for stretching a metaphor beyond its breaking point). It dominates the news, yet most companies have, at best, token AI efforts underway, and many have none.
In 2024, we will see that change significantly. Expect established vendors to integrate generative capabilities into their tools, likely driven by in-house LLMs rather than working with a ChatGPT or Gemini API, with reconfigurability driven by prompt "cues" custom to each product, albeit with switchable modules. LlamaGraphs will become more common as well, as knowledge graphs and vector stores merge with Llamas. Multiuser chats will pave the way to the next social media paradigm, probably towards the end of 2024 or early 2025, even as AI media makes that medium, well, a medium.
There are many things I haven't covered here. Meta has an interesting new take on real-time live captioning, for instance. I'll be exploring these more in the new year.
Will there be dramatic new innovations in this space? Not really - most of those are in the past. 2024 will be a year for refinement, getting the engineering and the ethics right, and evaluating how this can all be used personally and in business. That's a good thing.
In Media Res,
Managing Editor, The Cagle Report
Kurt is the founder and CEO of Semantical, LLC, a consulting company focusing on enterprise data hubs, metadata management, semantics, and NoSQL systems. He has developed large scale information and data governance strategies for Fortune 500 companies in the health care/insurance sector, media and entertainment, publishing, financial services and logistics arenas, as well as for government agencies in the defense and insurance sector (including the Affordable Care Act). Kurt holds a Bachelor of Science in Physics from the University of Illinois at Urbana–Champaign.