Artificial General Intelligence: A New Perspective, with Application to Scientific Discovery

The dream of building general intelligence machines has inspired scientists for decades. Remarkable advances have been made recently; however, we are still far from achieving this goal. In this paper, we review different machine learning techniques used in scientific discovery with their limitations. We survey and discuss the main principles driving the scientific discovery process. These principles are used in different fields and by different scientists to solve problems and discover new knowledge. We provide many examples of the use of these principles in different fields such as physics, mathematics, and biology. We also review AI systems that attempt to implement some of these principles. We argue that building general intelligence machines should be guided by these principles as an alternative to the dominant approach of current AI systems that focuses on narrow objectives. Building machines that fully incorporate these principles in an automated way might open the doors for many advancements.


Introduction
Penrose [1][2] described three different worlds: the mental world, the physical world, and the mathematical world. The physical world is governed by laws that reside in the world of mathematics, our minds emerge from the physical world, and those minds are able to access the mathematical world by discovering mathematics, which is within the scope of reason.
In the mathematical world, Bourbaki [5] likened mathematics to a city whose outlying districts expand into the surrounding country. Plato believed that ideas or forms exist in an ideal world outside the physical world, which later became known as the 'Platonic world of forms' [3]. If Plato's realm exists, it is very unlikely that its different parts are disconnected from each other. They would be beautifully connected, so that one can navigate between different parts of that realm and discover new hidden structures.
Although on rare occasions the intellect might break through into those worlds and get a limited glimpse of those realms, as described by Penrose [1] and illustrated through many examples by Hadamard [6], most of the time we follow certain procedures and principles to reconstruct those realms. Similar to the mathematical and physical worlds, a crucial aspect of the mental world, and hence of Artificial General Intelligence (AGI), is to build the maps that represent other realms by using a set of principles to reconstruct these original worlds and discover new knowledge. Today these principles represent the major driving force of the scientific discovery process. We argue that these principles should be used as guiding principles for building science discovery machines. The landscape of AGI is extremely vast; in this paper we focus on the scientific discovery process [6-8, 33-37, 41-42, 54-55, 77, 81, 85, 87-91, 116-123, 149].
[...] problems, this approach has key limitations in its generalization ability, as discussed in earlier sections. The evolution-inspired path [83, 126, 127] could provide an alternative way to build more general AI systems. However, the extremely large search space and the existence of many complex interacting parts still represent a major obstacle. In this study, we argue that the building of these systems should be guided by a set of principles as an alternative to narrow objectives or open-ended evolution. The use of these principles is backed by many historical examples of how different scientists made their discoveries. Most scientific discoveries can be understood as instances of the use of one or more of these principles. They are the main approach used by scientists to solve problems and discover new knowledge. The use of these principles provides a way for machine learning systems to improve their generalization ability and to cut down the large search space of hypotheses by approaching the given problem using these principles in template ways, as an alternative to expensive random search.
Logic plays an important role in the scientific discovery process, but logic alone is not enough; we usually use more sophisticated principles and structures. In the literature, there is a focus on two main principles, concepts combination and analogies. However, other principles should be taken into account to build a comprehensive framework. Different problems in science can be solved by applying one or more of these principles in template ways. For instance, some problems require finding the equation that fits the experimental data, some require finding the optimization criteria that give rise to the observed phenomenon, others require finding the rules or the program that gives rise to the observed phenomenon, and many require combining different ideas, unifying ideas, or finding analogies with other ideas. Each scientific problem comes with an objective to meet, and the problem can be approached with these principles to find which principle best satisfies the objective. These principles should seek to expand the knowledge base by discovering new knowledge; they should also reveal new connections that link different concepts. Proposing theoretical and computational frameworks that encapsulate these principles is beyond the scope of this paper. These principles can be summarized as follows.

Mathematization
Mathematics is a very powerful tool to describe the natural world [30][31][32]. Mathematics today is very effective in studying fields as diverse as physics, computer science, finance, and biology. Mathematics is not only able to describe the natural world; on many occasions this description has led us to predict and discover new aspects of the studied phenomena. On many occasions, testing the mathematical description in new extreme conditions has led to new insights and sometimes to new theories. In 1915, for instance, General Relativity (GR) was at the frontier of the map of physics. Many physicists used mathematics to derive new knowledge from the GR equation: they were able to predict gravitational waves and black holes as solutions to the GR equation, and both of these phenomena have been confirmed experimentally in recent years.
In AI, there are many attempts to build symbolic regression algorithms, which are automated tools to find the mathematical equation that fits the experimental data [33]. Udrescu and Tegmark [34,148] developed an algorithm that combines neural network fitting with a set of physics-inspired techniques. They applied it to 100 equations from the Feynman Lectures on Physics. It was able to discover all of them, whereas the state-of-the-art algorithm was only able to discover 71. For a more difficult test set, the state-of-the-art success rate was improved from 15% to 90%. Many researchers [35][36][37] have recently started to use advances in deep learning, such as generative adversarial networks, to discover physical concepts from experimental data without being provided with any additional prior knowledge, and then to use the discovered representation to answer questions about the physical system. The main limitation of these approaches is that it is difficult to transform the learned representations into interpretable properties unless prior knowledge is available. The main purpose of the algorithm that encapsulates the mathematization principle would be to find the equations that describe the experimental data.
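The idea of searching for an equation that fits experimental data can be illustrated with a deliberately tiny sketch. It is not the algorithm of Udrescu and Tegmark [34,148]; it is a brute-force search over a small, hypothetical library of one-term models y = c * f(x), where the names `CANDIDATES`, `fit_scale`, and `discover` are assumptions introduced only for this illustration.

```python
import math

# Hypothetical candidate library: each name maps to a basis function f(x).
CANDIDATES = {
    "x": lambda x: x,
    "x**2": lambda x: x * x,
    "x**3": lambda x: x ** 3,
    "sqrt(x)": lambda x: math.sqrt(x),
    "log(x)": lambda x: math.log(x),
    "exp(x)": lambda x: math.exp(x),
}


def fit_scale(f, xs, ys):
    """Least-squares scale c for the model y = c * f(x), plus its squared error."""
    fx = [f(x) for x in xs]
    c = sum(y * v for y, v in zip(ys, fx)) / sum(v * v for v in fx)
    err = sum((y - c * v) ** 2 for y, v in zip(ys, fx))
    return c, err


def discover(xs, ys):
    """Return (name, c, err) for the candidate with the smallest squared error."""
    best = None
    for name, f in CANDIDATES.items():
        c, err = fit_scale(f, xs, ys)
        if best is None or err < best[2]:
            best = (name, c, err)
    return best


# Synthetic "experimental" data generated from y = 3 * x**2 (positive x only).
xs = [0.5 + 0.25 * i for i in range(20)]
ys = [3.0 * x * x for x in xs]
name, c, err = discover(xs, ys)
print(name, round(c, 6))
```

On this noise-free data the search selects the `x**2` template with a scale close to 3. Real symbolic regression systems search far larger expression spaces and must cope with noise, which is where the physics-inspired techniques mentioned above become important.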

Optimization
Optimization is one of the most powerful and most used principles, from the least action principle in physics to survival of the fittest in biology and utility maximization in economics; see [134][135] for a long list of examples from different disciplines. It is also one of the most used principles in everyday life: we constantly try to minimize energy, cost, distance, time, etc. Other notable uses of this principle in science include the following. Minimizing the energy and time required to distribute fuels to the cells gives rise to circulatory system networks [27]. Optimizing the balance between the input and output energy gives rise to bird migration patterns [28]. Increasing entropy drives matter to acquire life-like physical properties [29]. The main purpose of the algorithm that encapsulates the optimization principle would be to find the optimization criteria and constraints that describe the studied problem.

Analogies
Many prominent cognitive scientists [38] consider analogy to be one of the main building blocks of human cognition. There are many examples where analogy has played a crucial role in scientific discovery. Polya [39] observed that analogy has played a role in most mathematical discoveries, and he provided many historical examples where analogy played the main role. See [40] for a long list of uses of analogy in scientific discovery. Nersessian [41][42] also gave a list of examples, such as Newton's analogy between projectiles and the moon, which gave rise to universal gravitation; Darwin's analogy between selective breeding and reproduction in nature, which gave rise to natural selection; and the Rutherford-Bohr analogy between the structure of the solar system and the configuration of subatomic particles. Many algorithms in computer science have been inspired by biology to solve different problems, such as the traveling salesman problem [43], [44]. They took inspiration from ants, which are capable of finding the shortest path from the nest to a food source [45], [46] by using a chemical substance called pheromone. Other notable examples include genetic algorithms; see [47][48][49] for a list of bio-inspired algorithms.
Two of the most remarkable approaches to this principle are the High-Level Perception (HLP) theory of analogy [94] and the Structure Mapping Theory (SMT) [22,25]; however, building representations and models of the world still represents a major obstacle for these approaches. Hill et al. [93] investigated the use of neural networks to solve analogical problems. They took inspiration from both SMT and HLP, encouraging the models to compare inputs at the more abstract level of relations rather than the less abstract level of attributes. Zhang et al. [26] also took inspiration from psychology and education, where teaching new concepts by comparing with noisy examples has been shown to be effective. They built a model that set a new state of the art on two major Raven's Progressive Matrices datasets. One key limitation of the deep learning approach is the lack of transparency: the biases in many of the datasets used often lead to finding shortcuts instead of finding the real analogy [115, 132, 133]. See [22, 25, 93-96, 115] for different theoretical and computational frameworks for the analogy principle. The main purpose of the algorithm that encapsulates the analogy principle would be to find a matching between the studied problem and similar problems.

Concepts Combination
Concepts combination is a fundamental cognitive principle [50][51][52]. Many scientific discoveries are based on conceptual combination, where new concepts arise by combining old ones [53][54][55]. Concepts combination is also one of the most used themes in theoretical physics. In 1973, for instance, both general relativity and quantum mechanics were at the frontier of the map of physics; by combining ideas from these two fields, Hawking proposed that black holes emit thermal radiation. Moreover, by combining ideas from quantum mechanics and statistical mechanics, Bekenstein and Hawking proposed the formula that describes black hole entropy, which later led to the holographic principle.
Some of the most notable approaches include conceptual blending [50], amalgamation [23], and compositional adaptation [24]. These techniques combine input concepts from a knowledge base and output novel concepts. See [23, 24, 50, 56-57, 113, 114] for different theoretical and computational frameworks. One major limitation of these techniques is that they require well-formed knowledge as input. Furthermore, without deeper representations and models of the world, these approaches and other AI systems will keep operating at a very shallow level.

Emergence
Emergence is a powerful approach to explain complex behaviors by simple underlying rules. One notable example is bird flocking: some birds fly in coordinated flocks that show remarkable synchronization in movement. Heppner [60] showed that the coordinated movements could be the result of simple movement rules followed by each bird individually. Another example is the Game of Life [61], a two-dimensional cellular automaton with rules that avoid the formation of structures that grow freely or quickly disappear. Remarkable behaviors have been observed, such as the glider, a small group of cells that moves like an independent emergent entity. Graph neural networks could be more suitable than other techniques to model complex systems with multiple interacting parts. The main purpose of the algorithm that encapsulates the emergence principle would be to find the set of rules that gives rise to the emergent behavior.
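The glider mentioned above can be reproduced in a few lines. The following is a minimal sketch using the standard Game of Life rules (a dead cell with exactly three live neighbors is born, a live cell with two or three live neighbors survives) on an unbounded grid stored as a set of live cells; the function name `step` and the coordinate convention `(row, col)` are choices made for this illustration.

```python
from collections import Counter


def step(live):
    """Advance a set of live (row, col) cells by one Game of Life generation."""
    # Count, for every cell, how many live neighbors it has.
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3 neighbors.
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The classic five-cell glider.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four generations the same shape reappears, moved one cell diagonally.
shifted = {(r + 1, c + 1) for r, c in glider}
print(state == shifted)  # prints True
```

Nothing in `step` refers to a glider or to motion; the moving pattern exists only at the level of the collective behavior, which is the sense of emergence used in this section.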

Computability
Computation is a new paradigm that has revolutionized science and engineering [63,82]; it has driven many advancements in science and changed the way it is done. Many biologists would agree that biology is an information science. One of the most notable examples is DNA, which gives rise to the whole biological system. A growing number of physicists would also agree that the interactions between physical systems are information processing [64][65]. Zenil et al. [81] proposed a universal, unsupervised, and parameter-free model-oriented approach based on the concept of algorithmic probability to decompose an observation into its most likely algorithmic generative models. They demonstrated the ability of the approach to deconvolve interacting mechanisms regardless of whether the resulting objects are bit strings, images, or networks. A related topic is using machine learning for code generation (see [98] for a recent survey). The main purpose of the algorithm that encapsulates the computability principle would be to find the program that gives rise to the observed phenomenon.
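As a toy illustration of "finding the program that gives rise to the observed phenomenon", the sketch below enumerates all 256 elementary cellular automaton rules and keeps those that reproduce an observed evolution. This is an exhaustive search over a tiny program space, not the algorithmic-probability method of Zenil et al. [81]; the function names `eca_step`, `run`, and `consistent_rules` and the choice of periodic boundaries are assumptions made for this example.

```python
def eca_step(cells, rule):
    """One step of an elementary cellular automaton with periodic boundaries."""
    n = len(cells)
    out = []
    for i in range(n):
        left, mid, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (mid << 1) | right  # neighborhood as a 3-bit number
        out.append((rule >> index) & 1)           # look up that bit of the rule
    return out


def run(cells, rule, steps):
    """Return the list of configurations produced by iterating the rule."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(eca_step(history[-1], rule))
    return history


def consistent_rules(history):
    """All rule numbers in 0..255 that reproduce every observed transition."""
    return [
        rule
        for rule in range(256)
        if all(eca_step(a, rule) == b for a, b in zip(history, history[1:]))
    ]


# "Observation": a short evolution secretly generated by rule 110.
# The initial row contains every 3-cell neighborhood when read cyclically.
observed = run([0, 0, 0, 1, 0, 1, 1, 1], 110, 5)
print(consistent_rules(observed))  # prints [110]
```

Because the initial row exhibits all eight neighborhoods, the observation pins down a single rule. With less informative data several rules would remain consistent, and some additional criterion, such as a preference for simpler programs, would be needed to choose among them.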

Beauty
Aesthetic judgments play a guiding role in scientific discovery [66][67][68][69]. Scientists often evaluate models and theories based on their aesthetic appeal. Many scientists have even suggested that the goal of science is to find beauty in nature.
The role of beauty in science has met with some skepticism because we still do not have a satisfactory theory that can exactly test the claims made by scientists about the beauty of a theory [71]. Recent works on empirical aesthetics [111] show that there is general agreement on what is considered beautiful, despite the subjectivity of beauty appreciation. A recent study on the nature of aesthetics in science by Zeki et al. [72] demonstrated that the aesthetic appreciation of mathematical equations corresponds to the same brain activity as the appreciation of music and art. Zee [73], Thuan [74], and Dirac [70] also argued that attributes of beauty such as simplicity, symmetry, and elegance have universal value and should not be subject to revision in science.
Recent approaches to the beauty assessment of visual content [130][131] could shed new light on how to assess different scientific models. Deep learning could be particularly interesting here, as promising results have been reported. The main purpose of the algorithm that encapsulates the beauty principle would be to find a metric that evaluates the scientific model describing the observed phenomenon.

Universality
Universality means that a similar mathematical formulation can describe different phenomena across multiple fields. The spectral measurements of composite materials such as sea ice and human bones, the time between bus arrivals in the city of Cuernavaca in Mexico, the zeros of the Riemann zeta function, and many other phenomena have been shown to have the same statistical distribution [58]. Power laws are another example of universal laws that have been observed in a wide range of phenomena in fields as diverse as physics, biology, and computer science [59]. Recently, Mocanu et al. [144] were able to significantly reduce the number of parameters of deep learning models with no decrease in performance by enforcing a power law distribution.
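One simple way to check whether a dataset follows a power law y = a * x^k is to fit a straight line in log-log coordinates, since log y = log a + k log x. The sketch below does this with ordinary least squares on synthetic data; the function name `fit_power_law` and the data are assumptions made for illustration, and this plain log-log fit is only a first check rather than a rigorous statistical test.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log y = log a + k * log x; returns (a, k)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Slope of the regression line in log-log space is the exponent k.
    k = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly)) / sum(
        (u - mean_x) ** 2 for u in lx
    )
    a = math.exp(mean_y - k * mean_x)
    return a, k


# Synthetic data generated from y = 2 * x**(-1.5).
xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
ys = [2.0 * x ** -1.5 for x in xs]
a, k = fit_power_law(xs, ys)
print(round(a, 6), round(k, 6))
```

Two phenomena from unrelated fields that yield the same fitted exponent would be candidates for the kind of shared mathematical description that this section calls universality.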

Unification
Unification [150][151] has played a key role in physics since Newton, who unified celestial and terrestrial mechanics, and Maxwell, who unified electricity and magnetism, followed by the unification of the weak and the electromagnetic forces and, most recently, the attempts to unify all four fundamental forces. Unification has also played an important role in biology [140][141]. In addition, there have been several attempts to unify different machine learning approaches, such as neuro-symbolic [15,92], neuro-evolution [143], and many others [10,14].

Symmetry
Symmetry has played an important role in science [75, 136-139], from Newton's laws to Maxwell's equations and general relativity. Symmetry has also played a fundamental role in the development of quantum mechanics. Today, it is one of the most used principles in searching for the fundamental laws of physics and further unification. Convolutional neural networks represent an early use of the symmetry principle in deep learning.
Recently, more advanced symmetries were used to significantly reduce the number of examples required to train deep learning models [142].
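The symmetry exploited by convolutional layers is translation: shifting the input and then convolving gives the same result as convolving and then shifting. The sketch below checks this property for a one-dimensional circular convolution; the helper names `circular_conv` and `shift`, and the use of periodic boundaries, are assumptions made to keep the example exact and self-contained.

```python
def circular_conv(signal, kernel):
    """Correlate the kernel with the signal, wrapping around at the ends."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, k):
    """Cyclically shift the signal k positions to the right."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]


x = [1, 2, 3, 4, 5, 6]
kernel = [1, 0, -1]

# Translation equivariance: conv(shift(x)) equals shift(conv(x)) for every shift.
equivariant = all(
    circular_conv(shift(x, k), kernel) == shift(circular_conv(x, kernel), k)
    for k in range(len(x))
)
print(equivariant)  # prints True
```

Because the same kernel weights are reused at every position, the layer does not need to relearn a feature for each location, which is one way in which building a symmetry into the model can reduce the amount of training data required.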
Many of these principles can operate at different levels. Consider, for instance, the circulatory system example in the optimization principle. By studying the literature, one can find that energy and time should be minimized; here the optimization principle is operating at the conceptual level. The principle can then operate at the mathematical level by using a mathematical description of the optimization process. Similar reasoning can be applied to other principles, such as concepts combination, where the ideas are first combined at the conceptual level and then at the mathematical level.

Discussion and Conclusion
This paper has presented a review of different machine learning techniques used in scientific discovery, along with their limitations. It discussed and reviewed the main principles used by scientists to solve problems and discover new knowledge. We argue that a key step to improve the generalization ability of AI systems is to build systems guided by these principles rather than focusing on solving specific and narrow problems or searching the extremely large space of the evolution-inspired approaches. The main challenge in building science discovery machines and automating the scientific discovery process is to build the theoretical and computational frameworks that encapsulate these principles. Some principles, such as concepts combination and analogy, are harder to automate because the challenge of building representations and models of the world is more dominant. However, a lot of progress can be made by working on other principles such as mathematization and emergence. Deep learning could be a very effective tool to implement some of these principles; it has shown promising results for the mathematization principle. However, it might be limited for other principles. In the literature, there is a focus on a few principles. We believe that there is room for many interesting future contributions on the rest of the principles, either by building different theoretical and computational frameworks or by investigating the use of some existing AI techniques. Incorporating these principles fully in an automated scientific discovery framework might open the doors for many advancements. Pursuing this research direction holds great promise to help scientists in their research and to speed up the scientific discovery process.