As we recently discussed, deep learning (DL) is the most sophisticated form of fundamental AI available to enterprises today (see Exhibit 1). The multi-layered structure of deep learning networks makes them more computationally powerful and better able to perform complex tasks than other forms of AI (see Exhibit 2).
However, DL systems still have a fundamental limitation: they don’t have the equivalent of a memory. Without some type of memory, DL networks cannot develop the equivalent of transferable human skills. That makes them costly to train and less useful than they could be for enterprises looking to scale their AI implementations, especially those reliant on highly transactional interactions with clients or sensitive materials. A growing number of companies are trying to remedy this shortcoming by developing DL networks that can learn and apply knowledge more adaptively, making them more broadly useful and resource-efficient for enterprises.
Exhibit 1: The building blocks of AI

Source: HFS Research, 2019
Exhibit 2: Using deep learning networks for facial recognition

Source: MongoDB, 2018
Memory networks could be the key to helping enterprises unlock deep learning’s full value
DL systems learn by adjusting their parameters to fit a specific task. When the same network is then trained on a new task, those parameters are overwritten and what it learned for the earlier task is lost, a phenomenon known as “catastrophic forgetting.” As such, DL systems cannot carry knowledge between tasks as memories, which leaves them unable to build context that persists beyond a one-time transaction, e.g., a full picture of a customer, another entity, or a procedure. A memory-enhanced customer service chatbot, by contrast, would be able to remember its previous interactions with a customer and use that prior knowledge to anticipate some of the customer’s requests and preferences. If the customer was booking a specific hotel for the second time, for example, the chatbot could suggest the same type and quality of room. Enhancements like this could make the reservation process more efficient and more seamless for the consumer.
Because DL systems are significantly more GPU-intensive and data-hungry than simpler ML models, losing what they have learned between tasks means more resources are spent retraining them and obtaining new training data. This lack of memory makes DL systems less adaptable and diminishes their usefulness in dynamic environments.
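To make the forgetting effect concrete, here is a minimal sketch in plain Python. It is a toy illustration, not how any production DL system is trained. The two-weight linear model, the two single-example tasks (TASK_A and TASK_B), the learning rate, and every function name are assumptions, chosen so that one set of weights could in principle fit both tasks. Training on the second task alone still overwrites what was learned for the first.

```python
# Toy demonstration of catastrophic forgetting (illustrative assumptions only).

def predict(weights, inputs):
    """Linear model: weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def loss(weights, inputs, target):
    """Squared error on a single example."""
    return (predict(weights, inputs) - target) ** 2


def train(weights, inputs, target, steps=200, lr=0.1):
    """Plain gradient descent on one task; returns the updated weights."""
    weights = list(weights)
    for _ in range(steps):
        error = predict(weights, inputs) - target
        # Gradient of the squared error with respect to each weight.
        weights = [w - lr * 2 * error * x for w, x in zip(weights, inputs)]
    return weights


# Hypothetical tasks: each is (inputs, target). A single weight vector
# that fits both exists, but sequential training does not find it.
TASK_A = ((1.0, 0.5), 1.0)
TASK_B = ((0.5, 1.0), -1.0)


def run_sequential():
    """Train on task A, then on task B only, and track the loss on A."""
    weights = train([0.0, 0.0], *TASK_A)
    loss_a_before = loss(weights, *TASK_A)
    weights = train(weights, *TASK_B)  # no access to task A data here
    return {
        "loss_a_before": loss_a_before,
        "loss_a_after": loss(weights, *TASK_A),
        "loss_b_after": loss(weights, *TASK_B),
    }


if __name__ == "__main__":
    for name, value in run_sequential().items():
        print(f"{name}: {value:.4f}")
```

In this sketch the loss on task A is close to zero after the first round of training and rises sharply once the model has been trained on task B, even though task B itself is learned well. Restoring performance on task A would mean retraining on task A data, which is the resource cost described above.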
The goal of advanced memory network researchers is to develop deep learning systems capable of learning sequentially rather than by absorbing large blocks of data in one go. As one author puts it, underpinning these efforts is “a novel way of looking at sequential data: instead of analyzing it piece by piece, updating an internal fixed-size memory representation (that forgets more from the past the more inputs it gets), memory networks consider the entire history so far explicitly, with a dedicated vector representation for each history element, effectively removing the chance to ‘forget.’” In theory, this should make DL networks more adaptable and able to move between diverse contexts in complex real-world environments.
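The idea in the quotation, one stored vector per history element with reads that look across all of them, can be sketched as follows. This is an illustrative simplification, not the architecture of any specific memory network. The class name EpisodicMemory, the dot-product relevance scoring, and the plain-list vectors are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class EpisodicMemory:
    """Hypothetical memory that keeps a dedicated vector per history element."""

    def __init__(self):
        self.slots = []

    def write(self, vector):
        # Append rather than overwrite: earlier entries are never squeezed
        # into a fixed-size summary, so nothing is discarded.
        self.slots.append(list(vector))

    def read(self, query):
        """Return (weights, readout) for a query vector.

        Each stored vector is scored against the query, the scores are
        normalized with softmax, and the readout is the weighted average
        of all stored vectors.
        """
        if not self.slots:
            raise ValueError("memory is empty")
        weights = softmax([dot(query, slot) for slot in self.slots])
        size = len(self.slots[0])
        readout = [
            sum(w * slot[i] for w, slot in zip(weights, self.slots))
            for i in range(size)
        ]
        return weights, readout
```

As a usage example, writing three vectors such as [1.0, 0.0], [0.0, 1.0], and [1.0, 0.0] and then reading with the query [5.0, 0.0] puts nearly all of the weight on the first and third entries, however long ago they were stored. The trade-off of this design is that memory grows with the length of the history, since every element keeps its own slot.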
The outcome of such tweaks is DL networks with “the equivalent of a working memory system that can store fragments of inferred knowledge and their relationships so that it can be easily accessed from different layers in the network.” This ability is the equivalent of teaching an employee a transferable skill that they can apply to another role when they move jobs.
Recently, advances in AI research and breakthroughs in neuroscientific research have led to significant progress in making such memory networks a reality. Architectures that figure in this work include recurrent neural networks (RNNs), neural Turing machines (NTMs), and convolutional neural networks (ConvNets), to name just a few.
Enterprises should keep memory network R&D leaders on their radars or risk being left behind
Several companies are emerging as pioneers in the memory network field. Much of this research is still academic, but it is increasingly focused on commercializing the technology, and some companies have already started selling their solutions to enterprises. Below is our shortlist of leaders:
Exhibit 3: EWC performance in training

Source: DeepMind, 2017
Exhibit 4: FAIR research timeline

Source: Facebook
The Bottom Line
As this shortlist indicates, the holy grail of memory network R&D is developing a “general-purpose artificial intelligence,” i.e., an AI that’s capable of learning as adaptively and quickly as a human. Fast, adaptive learning would be of obvious value to enterprises looking to automate as much of their operations as possible, as it would make it easier to disseminate AI throughout their organizations.
Today, one of the biggest challenges to scaling AI across organizations and developing holistic AI strategies is that AI is trained to perform very specific tasks and specialize in narrow areas. This is due in part to training data scarcity and in part, as we’ve seen, to AI systems’ inability to develop transferable skills. As such, endowing AI with memory could be the key to overcoming this gargantuan challenge and disseminating AI throughout the business world. However, as we’ll see in Part II of this POV, what memory networks can achieve in practice still differs from what they promise in theory.