As a consequence, a variety of efforts are underway, not only to improve current model architectures but also to suggest new directions for the overall paradigms of artificial intelligence research.
Refining existing architectures
1/ Continuous learning
Machine learning professor Tom Mitchell explained in 2006 that “the vast majority of machine learning work to date involves running programs on particular data sets, then putting the learner aside and using the result. In contrast, learning in humans and other animals is an ongoing process in which the agent learns many different capabilities, often in a sequenced curriculum.” This issue is still acute today, and researchers strive to grant algorithms the ability to adjust over the long term.
Example: NELL (for Never-Ending Language Learner) is a program that has been running 24/7 since 2010, with the goal of learning to “read the web”. It started with an initial database of web pages and a hierarchy of categories (e.g. sports). By reading 100,000 Google search requests per day, it is able to learn new categories on its own and to suggest “beliefs” about what it has read.
2/ Transfer learning
Although the overall approaches developed by machine learning models can often be applied across a wide swath of domains, it is generally hard for algorithms to generalize to new problems, even closely related ones: model architectures may be reused, but training must start from scratch. For instance, a checkers or chess program could not play a simpler game like tic-tac-toe.
Combined with the exploration of continuous learning, the focus on transfer learning points to a future where machines would learn how to learn.
Example: In October 2016, researchers from DeepMind introduced a machine learning model able to “learn tasks such as finding the shortest path between specified points and inferring the missing links in randomly generated graphs, and then generalize these tasks to specific graphs such as transport networks and family trees”. The key was to add an external memory to their neural network, so as to better store data over long periods, something that neural networks usually struggle with.
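The general idea of reusing what was learned on one task for a closely related one can be sketched in a few lines. This is only a toy illustration under stated assumptions: a two-parameter linear model, two made-up tasks (`task_a` and `task_b`, both hypothetical), and plain gradient descent. It is not DeepMind's model, which relies on a neural network with external memory.

```python
# Toy transfer-learning sketch: warm-starting on a related task.
# All names and numbers here are illustrative assumptions.

def mse(params, data):
    """Mean squared error of the line y = w * x + b on the data."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(params, data, steps, lr=0.05):
    """Run `steps` iterations of gradient descent, starting from `params`."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

xs = [i / 10 for i in range(-10, 11)]           # inputs in [-1, 1]
task_a = [(x, 2.0 * x + 1.0) for x in xs]       # source task
task_b = [(x, 2.2 * x + 0.8) for x in xs]       # closely related target task

# Learn the source task thoroughly, then give both learners the same
# small training budget on the target task.
pretrained = train((0.0, 0.0), task_a, steps=500)
few_steps = 5
from_scratch = train((0.0, 0.0), task_b, steps=few_steps)
transferred = train(pretrained, task_b, steps=few_steps)

loss_scratch = mse(from_scratch, task_b)
loss_transfer = mse(transferred, task_b)
print("from scratch:", loss_scratch)
print("transferred: ", loss_transfer)
```

With the same small budget on the target task, the learner that starts from the source-task parameters ends with a lower error than the one that starts from zero, which is the benefit transfer learning aims for.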
3/ Self-generation of datasets
Machine learning models are highly dependent on the availability of datasets. Two approaches are being explored to overcome this potential roadblock: the first is to create frugal yet efficient algorithms that need less data for their training (what AI jargon calls “sparse data”), and the second is to create models that succeed in generating (part of) their own training datasets.
- The AlphaGo program did not rely only on the logs of an online Go server during its training; it also played numerous games against itself, generating a dataset of 30 million game positions in the process.
- In December 2016, Apple researchers published the company’s first machine learning paper. It deals with the limits of datasets in the field of computer vision: synthetic (3D-generated) images are increasingly used to form training sets, but the resulting algorithms are generally less accurate when confronted with real images from a test set. The solution from Apple researchers was to develop a model that can combine synthetic images with features from real pictures. This approach yields large yet much more realistic image datasets. The paper includes an illustration of this approach.
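As a toy sketch of the self-play idea from the AlphaGo example (not AlphaGo's actual method, and unrelated to the illustration in Apple's paper), the snippet below lets a program play tic-tac-toe against itself and records every position with the final outcome of its game. The random move selection, the function names and the number of games are all assumptions chosen for brevity; AlphaGo itself chose its moves with trained neural networks.

```python
import random

# The eight winning lines of a 3x3 board, as index triples.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if a player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game(rng):
    """Play one game with random moves on both sides.

    Returns a list of (position, outcome) pairs; the outcome
    ('X', 'O' or 'draw') is the final result of the whole game.
    """
    board = [" "] * 9
    positions = []
    player = "X"
    result = None
    while result is None:
        empty = [i for i, cell in enumerate(board) if cell == " "]
        if not empty:              # board full and nobody won
            result = "draw"
            break
        board[rng.choice(empty)] = player
        positions.append("".join(board))
        result = winner(board)
        player = "O" if player == "X" else "X"
    return [(pos, result) for pos in positions]

def generate_dataset(n_games, seed=0):
    """Build a labelled dataset purely from self-play, with no external logs."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_games):
        dataset.extend(self_play_game(rng))
    return dataset

dataset = generate_dataset(1000)
print(len(dataset), "labelled positions generated from 1000 self-play games")
```

Each pair in `dataset` could then serve as a training example for a model that predicts the outcome of a game from a position.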
4/ Opening the black box of machine learning algorithms
Although machine learning models may be highly efficient, they still have one major drawback compared to their rule-based forebears: for now, it is impossible to understand precisely how an algorithm reaches a particular conclusion.
Opening such a black box will be necessary to provide feedback to users of AI applications in the future: you would probably want additional information if a deep learning algorithm denied your loan application. Above all, it may help explain, and hence avoid, unacceptable outcomes such as the classification by Google Photos of two black people as “gorillas” in July 2015.
Example: In August 2016, DARPA (the R&D funding branch of the US Department of Defense) launched a program called “Explainable AI (XAI)”, whose goal is to fund projects on machine learning models that can provide rationales for their choices without sacrificing a high level of accuracy. The overall philosophy is well summarized by an illustration from DARPA.
Transforming the AI paradigms
5/ Bringing together deep learning and neuroscience
The original work on artificial neural networks arose in the 1940s and 1950s alongside advances in neuroscience. The first artificial neuron (note that “artificial” here means a program, not a physical artifact), introduced in 1943, was for instance the result of a collaboration between a neurophysiologist and a logician.
However, as a recent deep learning textbook explains, the two disciplines gradually grew apart: “The main reason for the diminished role of neuroscience in deep learning research today is that we simply do not have enough information about the brain to use it as a guide.”
Nevertheless, more and more scientists from both sides are calling for tighter integration between deep learning, neuroscience and sometimes also the cognitive sciences. Recent advances in deep learning, even if not always inspired by neuroscience, can inspire new directions for neuroscience research; conversely, a better understanding of the most evolved and complex learning system known to date, the brain, could suggest brand new or refined architectures to AI researchers.
- In July 2015, three researchers with differing backgrounds suggested a converging approach called “computational rationality” to better understand intelligence in “brains, minds and machines”.
- Demis Hassabis, CEO of DeepMind, holds a PhD in Cognitive Neuroscience.
6/ Bridging the gap between the two main AI approaches
If you have read our article about AI vocabulary, you know that there have been two main approaches pursued through the history of the field: symbolic programming (top-down; popular in the early days of AI) and machine learning (bottom-up; much more popular nowadays).
It has proven difficult to program complex algorithms with hard-coded rules: in natural language processing, for example, writing the numerous rules and equally numerous exceptions of a given grammar is time-consuming, and combining them to derive the meaning of a sentence is too complex. Even so, adding a bit of symbolic logic to machine learning models could help mitigate some of the latter’s shortcomings, such as their dependence on large datasets, which are far from existing, let alone being available, in every domain.
Example: Geometric Intelligence, acquired by Uber in December 2016 to form the basis of its AI Labs, is pursuing this approach, so as to develop models that need less data than usual.
7/ Neuromorphic computing, the next hardware architecture?
Neuromorphic computing consists of mimicking the functioning of the brain with electronic circuits. The idea has been floated since the end of the 1980s, but a recent series of large research initiatives on the brain (e.g. the Human Brain Project, kickstarted by the EU in 2013 with a budget of €1bn over 10 years) has included an exploration of the concept via dedicated computer systems.
If tomorrow artificial neural networks do not just exist as code but also as specialized electronic circuits, maybe the progress of artificial intelligence will suddenly accelerate. However, we have yet to see how neuromorphic chips stack up against more general-purpose chips such as CPUs and GPUs in terms of performance and cost. Finally, the path towards a real replication of the brain is still long, since biochemistry plays a huge role in the transmission of information. Next step: convergence between electronics and biology?
Example: IBM announced its neuromorphic chip TrueNorth in 2014. It included 1 million neurons – vs. 100 billion in a human brain – and 256 million programmable synapses. The most impressive feat was its power efficiency, comparable to an actual brain, and 1,000 times better than a typical CPU.
The TrueNorth architecture tightly integrates computation (neurons) with memory (synapses), whereas the canonical structure of computers has always separated these two components.
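To put the TrueNorth figures quoted above in perspective, here is a back-of-the-envelope calculation that uses only the numbers given in the text. The variable names are illustrative, and matching a neuron count says nothing about matching what a brain actually does.

```python
# Figures quoted in the text above.
TRUENORTH_NEURONS = 1_000_000
TRUENORTH_SYNAPSES = 256_000_000
HUMAN_BRAIN_NEURONS = 100_000_000_000

# Average number of programmable synapses available per neuron on the chip.
synapses_per_neuron = TRUENORTH_SYNAPSES // TRUENORTH_NEURONS

# Number of such chips needed to reach the neuron count of a human brain.
chips_for_brain_neurons = HUMAN_BRAIN_NEURONS // TRUENORTH_NEURONS

print(synapses_per_neuron)       # 256
print(chips_for_brain_neurons)   # 100000
```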
Beyond scientific progress per se – and by now, we hope that you have become as passionate as we are about the field of artificial intelligence – keeping an eye on these ongoing research efforts is key to better foresee the future of AI as an economic factor.
They are weak signals because their success or failure will exert a direct influence on the breadth and depth of business applications that can be envisioned – either opening up new tasks, jobs or even whole industries to automation, or on the contrary burying the hopes of artificial general intelligence.