Abstract

In an attempt to better understand how the navigation part of the brain works, and to possibly create smarter and more reliable navigation systems, many papers have been written in the field of biomimetic systems. This paper presents a literature survey of state-of-the-art research performed since the year 2000 on navigation systems based on rodent neurobiology and neurophysiology, that is, systems that incorporate models of the brain cells responsible for spatial awareness and navigation. The main focus is to explore the functionality of the cognitive maps developed in these mobile robot systems with respect to route planning, along with a discussion and analysis of the computational complexity required to scale these systems.

1. Introduction

This paper reviews the current state of research in mobile robot navigation systems that are based on the rodent’s specialized spatial awareness and navigation brain cells. Specifically, these cells include place cells, grid cells, border cells, and head direction cells. The advantages of using a neurobiologically based system include the possible performance rewards that may be realized in the future pertaining to navigation and smart systems, as well as the benefits of using accurate models of the brain for other, related research [1, 2]. For artificial intelligence to take a major leap forward, machines will at minimum need to learn and think the way humans do. This will require computational elements that behave similarly to, and are as compact as, the neurons and accompanying dendrites and axons found in the human brain.

Although there is a need for new technical paradigms in artificial intelligence, this paper does not propose or present new methods but outlines work that may be a path to such answers. The most important attributes of the neurobiology based navigation systems covered are the types of cognitive maps produced by these systems and how they are, or can be, used for route planning. Thus, the focus of the analysis of the reviewed literature will be centered on mapping and route planning capabilities of these neurobiologically based systems.

Most papers that have reviewed such systems appeared around or before the year 2000, such as [3–5]. However, since 2000, much advancement has taken place in the miniaturization of electronic packaging (Moore’s Law), thus increasing the practicality of placing better sensors, faster processors, richer algorithms, more memory, and so forth on mobile robots. In addition, the discovery of the grid cell by the Mosers in 2005 [6–9] added more insight as to how rodents navigate. Therefore, this paper fills the need for a formal review of the state-of-the-art neurobiologically based navigation systems researched and developed from 2000 onward. On the nonneurobiological (classical) side of mobile robot navigation, a good review of map-learning and path planning strategies can be found in the paper by Meyer and Filliat, 2003 [10], as well as in many books on the topic (e.g., [11]).

Thus, the outline of this survey proceeds as follows: Section 2 discusses the basics of the simultaneous localization and mapping (SLAM) method of navigation, as well as some fundamental issues that plague every navigation system (neurophysiologically based or not); Section 3 gives a brief review of the definitions of the neural cells that will be the center of focus in this paper; Section 4 covers state-of-the-art research that has been performed on neurobiologically based navigation systems (only those that have been realized in working, prototype mobile robot systems), with a critique of the cognitive maps developed for route planning algorithms at the end of each subsection; Section 5 presents a general analysis and discussion of the research performed in the literature; and Section 6 covers neural networks and addresses scalability of the neurophysiologically based features in a mobile robot platform.

2. General Robot Navigation Background

2.1. Simultaneous Localization and Mapping

For a mobile robot to be truly autonomous, it needs to be able to operate and navigate without human intervention and in a non-specially engineered environment. More specifically, the following needs to be true: a mobile robot must be able to locate itself in an unknown location of an unknown environment by incrementally building a map of its environment, while simultaneously locating itself in that environment by use of the derived map. This process is known as simultaneous localization and mapping (SLAM) [15–18]. As described in [18], the fundamental parts of a classical SLAM system are landmark extraction, data association, state estimation, state update, and landmark update. Of course, to accomplish these SLAM steps the system requires hardware with which the agent senses and acts on the environment (i.e., sensors, actuators, a processor, etc.), plus any filters and/or methods required to adequately perform these five tasks (e.g., sensor noise suppression, error correction algorithms). SLAM is not unique to classical systems: rodents accomplish it, in some similar form, by use of the hippocampus [2, 19–21]. The special neurons or brain cells which accomplish this will be covered in Section 3.
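As a concrete illustration of how these five steps fit together, the following minimal Python sketch runs them on a toy 2D world. The landmark IDs, noise levels, and the simple averaging correction are illustrative assumptions, not taken from any of the cited systems; a real implementation would use a probabilistic filter such as an EKF or particle filter.

```python
import numpy as np

# Toy SLAM cycle: landmark extraction, data association, state estimation,
# state update, and landmark update on a 2D point robot with ID-tagged landmarks.
rng = np.random.default_rng(0)
true_landmarks = {0: np.array([2.0, 0.0]), 1: np.array([0.0, 3.0])}

def sense(true_pose):
    """Landmark extraction: noisy robot-relative measurements of each landmark."""
    return {k: (p - true_pose) + rng.normal(0, 0.05, 2) for k, p in true_landmarks.items()}

def slam_cycle(est_pose, lmap, odometry, scan):
    est_pose = est_pose + odometry                            # state estimation (path integration)
    known = {k: z for k, z in scan.items() if k in lmap}      # data association by landmark ID
    if known:                                                 # state update: correct pose with the map
        fixes = [lmap[k] - z for k, z in known.items()]
        est_pose = 0.5 * est_pose + 0.5 * np.mean(fixes, axis=0)
    for k, z in scan.items():                                 # landmark update: refine or add landmarks
        lmap[k] = 0.5 * lmap[k] + 0.5 * (est_pose + z) if k in lmap else est_pose + z
    return est_pose, lmap

true_pose, est_pose, lmap = np.zeros(2), np.zeros(2), {}
for step in range(20):
    motion = np.array([0.1, 0.05])
    true_pose = true_pose + motion
    odom = motion + rng.normal(0, 0.02, 2)                    # noisy idiothetic data
    est_pose, lmap = slam_cycle(est_pose, lmap, odom, sense(true_pose))
print("true pose:", true_pose, " estimated pose:", est_pose)
```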

2.2. Fundamental Navigation Issue: Sensor Error

There are fundamental issues which plague every navigation system (neurobiologically based or not) [25]. These issues, largely path integration related, propagate up into the mapping and localization phases, levels L0 to L1 in Figure 1. In a neurobiological or neurophysiologically based navigation system, this is equivalent to lesions introduced into the hippocampus and related areas, lack of allothetic stimuli, or other similar targeted manipulations on rats [9, 23, 26]. The outcome, thus, has a negative effect on the accuracy of the overall navigation system.

Therefore, for any navigation system to work adequately, the mobile robot’s sensor data error must be within a usable margin and be reset periodically by allothetic information, whether visual, tactile, olfactory, or other. Idiothetic data is the most basic navigational data for the robot to use to track its movements and is the basis for path integration [28–30]. The inherent issue with any ground based robot navigation system is the mobile robot’s measurement accuracy with respect to distance traveled and directionality (idiothetic data), since this data is used to derive the robot’s pose. In classical and most neurobiologically based navigation systems, measurement errors come from the data obtained from odometry devices, inertial measurement units (IMUs), distance sensors, and other position/pose measurement systems (use of idiothetic stimuli only). The sources of these errors fall into two categories, as described in [30, 31]: systematic and nonsystematic. Additionally, these errors accumulate over time [18, 28, 30–32], making environment localization and mapping inaccurate if these measurements are used directly. Methods for correcting errors in odometry and related position/pose data have been, and still are, a major topic of research. In classical systems, probabilistic filters (e.g., the extended Kalman filter (EKF)) or particle filters, together with allothetic stimuli (e.g., landmarks), are used with SLAM algorithms to help correct these errors in the pose data and location estimation.
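To make the effect of systematic and nonsystematic errors concrete, the short sketch below (illustrative bias and noise values only, not taken from any cited system) integrates noisy, biased wheel odometry and prints how the dead-reckoned position error grows with each step when no allothetic reset is available.

```python
import numpy as np

# Dead reckoning drift: a small constant heading bias (systematic error) plus
# per-step noise (nonsystematic error) makes the estimated pose diverge steadily.
rng = np.random.default_rng(1)
true_xy, est_xy = np.zeros(2), np.zeros(2)
true_heading, est_heading = 0.0, 0.0
heading_bias = np.deg2rad(0.5)                    # systematic error per step

for step in range(200):
    turn, dist = np.deg2rad(1.8), 0.05            # commanded motion (a gentle arc)
    true_heading += turn
    true_xy += dist * np.array([np.cos(true_heading), np.sin(true_heading)])
    # measured (idiothetic) motion = commanded motion + bias + noise
    meas_turn = turn + heading_bias + rng.normal(0, np.deg2rad(0.3))
    meas_dist = dist * (1 + rng.normal(0, 0.02))
    est_heading += meas_turn
    est_xy += meas_dist * np.array([np.cos(est_heading), np.sin(est_heading)])
    if step % 50 == 49:
        print(f"step {step + 1:3d}: position error = {np.linalg.norm(true_xy - est_xy):.3f} m")
```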

Similarly, whether for animals, insects, or animats, these navigation systems require path integration (PI) systems with corrective error mechanisms [6, 23, 33]. In the case of animats, or, more specifically, the neurophysiologically modeled navigation systems reviewed in this paper, it is shown that visual data is key to keeping PI errors to a workable minimum. This will also be touched on in Section 5.

3. Rodent Spatial Awareness and Navigation Cells

The following is a review of the definitions and characteristics of the specialized navigation neurons, or cells, found in the hippocampus and entorhinal cortex of the rodent brain, as well as in the human brain [34]. This material is covered in other literature [6, 20, 21, 35, 36] but is included here for completeness.

3.1. Place Cells

Place cells in rodents were discovered by O’Keefe and Dostrovsky in 1971 [37, 38]. Each of these cells, primarily located in the CA1-CA3 regions of the hippocampus, fires at a dedicated location in a rodent’s roaming area. The place cell’s firing location is invariant to the head direction or body pose of the rodent. The firing area of each place cell also seems to follow the summation of two or more Gaussian distribution curves, one for each salient distal cue [34].
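As a rough illustration of this firing characteristic, the following toy model computes a place-cell-like rate as a sum of Gaussian tuning curves, one per distal cue; the cue positions, preferred distances, and tuning width are made-up values, not measurements from the cited works.

```python
import numpy as np

# Toy place-cell firing rate: a sum of Gaussian tuning curves, one per salient distal cue.
cues = np.array([[0.0, 0.0], [1.0, 0.8]])   # two distal cue positions (m), illustrative
preferred = np.array([0.6, 0.9])            # preferred distance to each cue (m), illustrative
sigma = 0.15                                # tuning width (m)

def place_cell_rate(xy):
    d = np.linalg.norm(cues - xy, axis=1)                       # distance to each cue
    return np.sum(np.exp(-(d - preferred) ** 2 / (2 * sigma ** 2)))

for xy in [np.array([0.5, 0.3]), np.array([0.9, 0.1]), np.array([0.2, 0.9])]:
    print(xy, "->", round(place_cell_rate(xy), 3))
```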

3.2. Head Direction Cells

Head direction cells were discovered in rodents more than a decade after the place cells [39, 40]. These cells are place invariant, and each has a preferred direction with respect to the rat’s head direction in the horizontal plane, at which it fires at a maximum rate. They are silent for all other directions, except within a small region (a few degrees) around their preferred direction angle. Head direction cells fire only as a function of the rat’s head direction and not its body. Additionally, although the cells have different preferred directions, these seem to fall into a finite set of directions (e.g., N, NE, and SW). The directionality is relative: the cells align to a dominant external cue of the environment the rat is introduced to, if one is available; otherwise, a direction is set based on other, unknown origins [23, 41].
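A minimal tuning-curve sketch of this behavior is given below; the preferred direction, tuning width, and peak rate are illustrative values chosen for readability, not physiological measurements.

```python
import numpy as np

# Toy head-direction cell: maximal firing at a preferred direction, falling to
# silence away from it, independent of the animal's location.
def hd_cell_rate(heading_deg, preferred_deg=45.0, width_deg=20.0, peak_hz=40.0):
    delta = (heading_deg - preferred_deg + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
    rate = peak_hz * np.exp(-0.5 * (delta / width_deg) ** 2)
    return rate if abs(delta) < 3 * width_deg else 0.0              # silent far from preferred

for h in [0, 45, 90, 180, 315]:
    print(f"heading {h:3d} deg -> {hd_cell_rate(h):5.1f} Hz")
```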

3.3. Border Cells

A border cell can be thought of as a specialized place cell that only fires with respect to a certain border or barrier [42, 43]. The areas covered by different border cells can vary drastically with respect to one another. Similar to the place cell, the firing characteristic of the border cell is invariant to the rat’s head direction.

3.4. Grid Cells

The grid cell was discovered by Edvard and May-Britt Moser and coworkers in 2005 [6–9]. This set of special, navigation related brain cells, which is the most recent to be discovered, is located in the medial entorhinal cortex. Grid cells have a very interesting firing characteristic compared to place cells and border cells. A single place cell or border cell fires only at a specific location/region, whereas a single grid cell fires at a geometrical constellation of locations/regions. These firing locations tile the rodent’s roaming area as a hexagonal lattice of equilateral triangles, with each firing location at a vertex of the triangles. The hexagonal lattice of each grid cell’s firing field is defined within a very short time of a rodent being introduced to a novel area. The lattice appears to be anchored in orientation and phase to external landmarks and geometric boundaries of its environment [6, 7, 44].
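A common analytical approximation of such a firing map, used in the computational modeling literature rather than in the robotic systems reviewed here, sums three cosine gratings whose wave vectors are 60° apart, which produces the hexagonal lattice of firing fields described above. The sketch below uses arbitrary spacing, orientation, and phase values.

```python
import numpy as np

# Three-cosine approximation of a grid-cell firing map (illustrative parameters).
def grid_cell_rate(xy, spacing=0.5, orientation=0.0, phase=np.zeros(2)):
    angles = orientation + np.deg2rad([0.0, 60.0, 120.0])
    k = (4 * np.pi / (np.sqrt(3) * spacing)) * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    g = np.sum(np.cos(k @ (np.asarray(xy) - phase)))
    return max(0.0, g / 3.0)            # rectified and normalized to [0, 1]

# Peak responses recur at the vertices of equilateral triangles across the arena.
for xy in [(0.0, 0.0), (0.25, 0.0), (0.433, 0.25), (0.5, 0.0)]:
    print(xy, "->", round(grid_cell_rate(xy), 3))
```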

4. State-of-the-Art Research in Neurobiologically Based Navigation Systems for Mobile Robots

This section covers state-of-the-art research in neurobiologically based navigation systems, where the systems have been implemented in a mobile robot since the early 2000s. Although some of the systems covered rely on external CPUs to perform neurophysiological simulation for the robot (e.g., Khepera mobile robot platform), they have been included anyway. However, by the definition of autonomous mobile robot used in this paper, these systems would be considered nonautonomous because of the need to communicate with external computers. Thus, the autonomy classification of each robot presented will be included in Table 1.

The neurophysiologically based navigation systems fall into three categories, based on the central navigation cell that is being functionally emulated: place cell centric, theoretical cell centric, and grid cell centric. A theoretical cell combines one or more true neural navigation cells (typically including the place cell) into a new, fictional cell that sits at the center of its navigation system. Although fictional, these cells, or functions, may indeed be plausible and real in one form or another. Basic features and capabilities of these systems are summarized in Table 1.

4.1. Place Cell Centric Systems
4.1.1. Arleo and Gerstner 2000

The study by Arleo and Gerstner, 2000 [12], has influenced, in one form or another, many of the later works covered in this section, particularly [2, 13]. The references used in [12] fall into two categories: neuroscience (O’Keefe and Nadel, 1978 [45]; Taube et al., 1990 [39]; Redish, 1997 [4]; and so forth) and neurophysiologically inspired circuits and models (Burgess et al., 1994 [26]; Brown and Sharp, 1995 [46]; Redish and Touretzky, 1997 [47]; Zrehen and Gaussier, 1997 [48]; and so forth), which together form a basis of references used by the succeeding studies. More references can be found in Arleo and Gerstner, 2000 [12, 49]. Additionally, this paper’s presentation and functional use of the specialized spatial navigation cells found in the rodent’s hippocampus, as models for robotic navigation, are central to the theme of all of the papers covered.

(1) System Architecture. In [12], the Khepera robot system used consists of the following: an onboard camera for vision based self-localization (90° field of view in horizontal plane), eight infrared (IR) sensors for obstacle detection and light detection, a light detector for measuring ambient light, and an odometer for sensing self-motion signals. The neurobiologically based navigation system models two crucial spatial navigation cells: head direction cells and place cells. This is performed on an external computer.

(2) Head Direction and Place Cells for Spatial Navigation. In Figure 2, the allothetic inputs consist of data from the onboard camera, which is used for the place cells in the sEC submodule, as well as data from the eight IR sensors and the ambient light sensor, which are used by the visual bearing cells in the VIS submodule (left side of Figure 2). The neural networks (Sanger’s [50]) from the camera input to the place cells are programmed offline during an initial unsupervised, Hebbian learning phase [51]. During this initial exploration/neural network training phase, each place cell location is learned by dividing the images taken into smaller 32 × 32 pixel patches and running each reduced image through 10 different visual filters at 5 set scales each. This is done for the north, west, south, and east views of the robot’s arena from each snapshot/place cell location. The networks of each cell are then trained with the reduced images and adjusted for maximum response at each image location. Thus the place cells are trained neural networks with the onboard camera image, divided into four quadrants of 32 × 32 pixels each, at the input, and they allow for self-localization in the online mode.

A light source is added to one wall of the robot’s arena, where the IR sensors and ambient light sensor can lock onto this global direction (with the help of neural networks for fine-tune positioning to the light source). This allows for calibration of the robot’s directional module (right side of Figure 2), which bounds the accumulated error in directionality.

The robot uses three different neural populations of cells (right side of Figure 2) to calculate its head direction from its current and anticipated angular velocities, together with feedback from the system output and calibration cells. The end result is a set of quantized directional cells that drive the robot’s motors for proper heading.

(3) Computational Complexity. The computational complexity of this system is more involved than can be covered briefly here; further details can be found in [12, 49, 52]. However, any neural network system will have a relatively high to extremely high computational complexity, depending on the number of neural networks and on whether learning is performed offline or online/in real time. The environment is somewhat engineered and needs to be static. This is true, though, of any system in the initial stages of wringing out system integration errors, model problems/accuracy, and so forth.

(4) Mapping and Route Planning. Visual based mapping, through the use of snapshot recognition (place cells), is used to help correct head direction error and not for obstacle avoidance or route planning. Therefore, true mapping and any form of route planning are not addressed in [12, 49].

4.1.2. Fleischer et al. 2007

(1) System Architecture. The neurophysiological modeled navigation system for Darwin XI mobile robot designed by Fleischer et al. [27] is not autonomous, by the definition used in this paper, due to the use of external computers to simulate a detailed neurophysiology based system. However, the system pushes the limit on simulating large scale features of vertebrate neuroanatomy and neurophysiology (the medial temporal lobe specifically) in real time. Through the use of a Beowulf cluster of 12 1.4 GHz Pentium IV computers running a Linux operating system, sensor data is communicated on a wireless link from the mobile robot to one of the cluster computers, while motor data is sent back to the robot. The simulation processing cycle from sensor data input to motor command output is approximately 200 ms of real time. The simulator, referred to as the brain-based device (BBD), simulates 57 neural areas, 80,000 neuronal units, and approximately 1.2 million synaptic connections.

Darwin XI is equipped with a visual system (camera), a head direction system (compass) plus wheel odometry (current head direction), a laser range finder system (facing downward to detect neuronal reward), and a whisker system which reads bumps along the plus-maze walls.

(2) Modeled Hippocampus. A schematic of the mobile robot’s I/O sensors connected to the corresponding neurophysiologically based navigation system can be found in [27]. However, Figure 3 illustrates the type of connections and simulated parts of the medial temporal lobe, including the hippocampus entities. Figure 3 is similar to that found in [53, 54]; however, further details pertaining to the various layers of the entorhinal cortex are lacking in this figure.

Although both the previous research using Darwin X [55, 56], which used a dry variant of the Morris water maze task [57], and that using Darwin XI, which uses the plus-maze, are performed on rodent based navigation testing platforms, the focus of these studies is on the formation of episodic memory. Through the use of a backtrace analysis tool, several seconds of neuronal activity and synaptic changes can be analyzed to determine causality of a particular neural event. Both studies showed the strongest synaptic influence on episodic memory coming from the entorhinal neuronal units, particularly via the perforant path projections shown in Figure 3, while Darwin XI specifically focused on journey-dependent and journey-independent memory, as well as path prediction. A further detailed analysis can also be found in [58].

4.1.3. Strösslin et al. 2005

(1) System Architecture. Strösslin et al. [13] use the same mobile robot platform (Khepera) as Arleo. The robot has a camera, odometers, and proximity sensors. Thus, the robot only uses bodycentric, local sensor information for navigation. The Khepera is attached to a computer, running the neural model, with a long cable that also provides power to the robot and allows for sensor data to be transmitted from the robot to the computer.

(2) Neural Model: Place Code and Cognitive Maps. In a dry water maze, similar to that used for Darwin X, a navigation map is learned by the place cells in 20 trials, which is similar to the results found with rodents in the water maze [57]. Thus, visual and idiothetic information feeds the external neural model, which is composed of step cells (SCs) and rotation cells (RCs). These cells make up the local view (LV) and are fed by the visual input, a head direction (HD) system in the postsubiculum (poSb), path integration (PI) in the medial entorhinal cortex (mEC), and combined place code (CPC) in the hippocampus (HPC) and subiculum. The directional action cell (AC) in the nucleus accumbens (NA) is what eventually drives the navigational learning of the CPC. See Figure 4 for connectivity.

The cognitive map or spatial representation of the robot’s environment is accomplished through unsupervised Hebbian learning between the place cells and the head direction cells. Additionally, route planning is accomplished by use of biologically inspired reinforcement learning mechanism in continuous state space (place cells) and ACs.

4.1.4. Hafner 2008

(1) Place Code and Cognitive Maps. In [14], Hafner uses place cells for creating a cognitive map of a mobile robot’s area. The mobile robot, outfitted with only an omnidirectional camera and a compass, produces a cognitive map during an exploration phase, where the map is represented by place fields and place cells. Each snapshot taken by the camera is converted into a 16-dimensional transformation, which is used as the sensory input to a neural network system. That is, each 360° camera snapshot is divided up into 16 angular, azimuth sections of 22.5° each, filtered, and sent to the place cells’ neural networks. The weights of each neural network, initially set to random values, take on evolved values during the exploration phase. The place cells, as shown in the “output layer/map layer” in Figure 5, become relationally connected to each other based on a self-organizing map (SOM) methodology [59], where each single winner of a particular snapshot becomes connected to the previous winner and the corresponding connection weight is increased. Since the place cells are not geometrically fixed, they are assigned relative angles to each other, creating a topological map. This is all done without the use of reward during learning. Additionally, there is no goal state.
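The winner-linking step described above can be sketched as follows; this is an illustrative reconstruction with made-up dimensions and learning rate, not Hafner's code. For each filtered snapshot vector, the best-matching place cell is found, its weights are nudged toward the input, and an edge to the previous winner is created or strengthened.

```python
import numpy as np

# SOM-style topological map building: winners of consecutive snapshots become linked.
rng = np.random.default_rng(2)
n_cells, dim = 8, 16
weights = rng.random((n_cells, dim))     # place-cell input weights (start random)
edges = {}                               # topological map: (cell_i, cell_j) -> link strength
prev_winner = None

def explore_step(snapshot_vec, lr=0.1):
    global prev_winner
    winner = int(np.argmin(np.linalg.norm(weights - snapshot_vec, axis=1)))
    weights[winner] += lr * (snapshot_vec - weights[winner])      # move winner toward the input
    if prev_winner is not None and prev_winner != winner:
        key = tuple(sorted((prev_winner, winner)))
        edges[key] = edges.get(key, 0.0) + 1.0                    # strengthen the link
    prev_winner = winner

for _ in range(200):                      # simulated exploration phase with random snapshots
    explore_step(rng.random(dim))
print("learned topological links:", sorted(edges.items())[:5])
```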

(2) Simulated Route Planning. However, once the neural cognitive maps have been built, they can only be used in simulation for navigation. The topological and metric information requires too much memory to reside in the mobile robot. Thus the mobile robot relies on landmark (snapshot) recognition and use of the SOM to reach goal spots or areas.

4.1.5. Barrera and Weitzenfeld 2008

(1) System Overview. Barrera and Weitzenfeld [2, 22] propose and implement a very complex, intricate, and modular neurophysiologically based navigation model. As with Arleo and Gerstner [12, 49], all of the proposed functionalities are mapped back to existing neurophysiological entities. Additionally, many of these modules are implemented using Gaussian distributions and the Hebbian learning rule/equation for neural networks. The main goals of this research are for the mobile robot to be able to learn and unlearn path selections for goal locations based on changing rewards, to create a realistic neuroscience based test bed for use in further behavior studies, and to add to the existing gap in the SLAM model between mapping and map exploitation [2]. The mobile robot’s test environment configurations are limited to the T-maze and the 8-arm radial maze.

The neurophysiological theory that forms the basis for this study comes from [60]. Thus, in addition to idiothetic and allothetic sensory inputs, there are also internal state/incentives and affordances information sensory inputs. Figure 6 shows the functional modules of this system, while removing many of the underlying details of the neurophysiological framework. Further details, such as model description, the neurophysiological framework, and equations for each of these modules can be found in [2, 6163].

Since the system lacks odometry and compass sensors, the idiothetic data comes in the form of kinesthetic data that is sent to an external motor control module, via the Action Selection module as shown in Figure 6, which is used for executing rotations and translations of the robot.

(2) Place Cells and Cognitive Map Generation. The Place Representation module in Figure 6 is where the cognitive map is made, stored, and accessed for the mobile robot to select movement options. Thus, this module represents the functionality of the hippocampus. The path integration information is combined with landmark information, through the Hebbian learning rule, to create a place cell layer. The overlapping place cell fields in this layer represent given locations or nodes that are found in the world graph layer (WGL), as shown in Figure 7.

The WGL uses a simple algorithm to decide its next move. It analyzes the active nodes connected to the Actor Unit and, based on the highest weight, chooses the step that brings it closer to its learned goal, or the best available move when the goal has been changed or has not yet been learned.
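A minimal sketch of this selection rule is shown below, under the assumption that the map stores per-neighbor learned weights; the names and the random fallback behavior are illustrative, not the authors' implementation.

```python
import random

# Pick the active neighbor with the highest learned weight; fall back to a random
# choice when no useful weights exist yet (e.g., the goal changed or is unlearned).
def choose_next_node(current, connections, active_nodes):
    """connections: dict mapping node -> {neighbor: learned weight}."""
    candidates = {n: w for n, w in connections.get(current, {}).items() if n in active_nodes}
    if not candidates or max(candidates.values()) <= 0.0:
        return random.choice(sorted(active_nodes))   # exploration fallback
    return max(candidates, key=candidates.get)

connections = {"A": {"B": 0.2, "C": 0.7}, "C": {"D": 0.9}}
print(choose_next_node("A", connections, active_nodes={"B", "C"}))   # -> "C"
```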

(3) Computational Complexity. Because of the high computational complexity of this neurophysiologically based navigation system, most of the model runs on an external 1.8 GHz Pentium 4 PC, which communicates wirelessly with a Sony AIBO ERS-210 4-legged robot. Thus, the system is not autonomous.

4.2. Theoretical Cell Centric Systems
4.2.1. Wyeth and Milford: RatSLAM, Version 3

(1) System Overview. In [19, 20], Wyeth and Milford focus on a neurobiologically inspired, SLAM based mapping and navigation system for a mobile robot, built on models and earlier versions of RatSLAM [13]. Their robot, a Pioneer 2-DXE base system, performs mock deliveries in a large, single floor office building using simple sensors: motor encoders for odometry, sonar and a laser range finder for collision avoidance and pathway centering, and a panoramic camera system for landmark recognition. This system, named RatSLAM, uses the concept of place cells coupled to head direction (HD) cells to derive what they call pose cells.

(2) Pose Cells. The competitive attractor network (CAN) [4, 13] based pose cells are used with local view cells, which are snapshots of the panoramic camera along the robot’s journey. Thus, Milford and Wyeth have added a new type of cell: the pose cell. The pose cell is similar to the conjunctive grid cells, as they report, which is a combination of grid cells and head direction cells found in the rodent brain. The pose cells work like weighted probabilities that each local view cell is in the direction and location of the stored pose (averaged). Figure 8 illustrates the connectivity of the RatSLAM, version 3, as described here and in [23].

(3) Cognitive Map. The mapping algorithm incorporates loop closure and map relaxation techniques to correct PI errors, thus creating more of a topological map than a metric map. A loop closure event occurs only when a threshold number of consecutive local view cells match the camera’s input, thus allowing for a change in the pose data. To preserve the original pose data, the relaxed map is saved to an “Experience Map” (see Figure 9 for an illustration of the Experience Map Space), and the local view cells with accompanying pose cell data are stored in a connection matrix. Due to the topological nature of the Experience Map, transitions between experiences are stored, thus making route planning possible.

The benefit that comes from this design is that it is a first step toward implementing the functionality of some of the specialized navigation and spatial awareness brain cells in a mobile robot. The downside is that it has been shown that the competitive attractor network can be easily replaced by a filter system [25], which leads to a substantial computational speedup. Additionally, even with pruning in the Experience Map, data storage and processing appear to grow without bound.

4.2.2. Cuperlier et al. 2007

(1) Transition Cell. Cuperlier et al. built a neurobiologically inspired mobile robot navigation system in 2007 [24] using a new cell type which they named the “transition cell.” This cell is based on the concept of moving from one place cell to the next over a defined interval of time. Thus, two place cells are mapped to a single transition cell, creating a cell which represents both position and direction of movement (a spatiotemporal transition), which naturally yields a graph-like structure.
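One way to picture the resulting structure is to treat each transition cell as an ordered pair of place cells; chaining transitions then yields the graph-like map. The snippet below is purely an illustrative data-structure sketch, not the authors' code.

```python
from dataclasses import dataclass

# A transition cell encodes "being at place A and moving toward place B".
@dataclass(frozen=True)
class TransitionCell:
    from_place: str     # place cell active at time t
    to_place: str       # place cell active at time t + dt

# Transitions observed during exploration become the nodes of a graph-like map;
# consecutive transitions (B->C after A->B) become its edges.
observed = [TransitionCell("A", "B"), TransitionCell("B", "C"), TransitionCell("C", "A")]
edges = [(t1, t2) for t1 in observed for t2 in observed if t1.to_place == t2.from_place]
for t1, t2 in edges:
    print(f"{t1.from_place}->{t1.to_place}  then  {t2.from_place}->{t2.to_place}")
```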

(2) Computational Complexity. Multiple neural networks span the system’s architecture, as shown in Figure 10, from the landmark extraction/recognition stage to the cognitive map and motor transition stages. The many inputs of video, place cells, and so forth into a system of neural networks require many calculations to be carried out during each time step. This complexity is similar to Arleo and Gerstner [12, 49] and Barrera and Weitzenfeld [2, 22, 63], covered in the previous section. To illustrate the amount of processing required, it is stated in [24] that the system uses three dual-core Pentium 4 processors, each running at 3 GHz. Azimuth angles are measured using an onboard compass, displacement is obtained from wheel encoders, and visual input is obtained from a panoramic camera.

The navigation process starts at the leftmost part of Figure 10, where a single potential landmark is selected and analyzed at a given time. This occurs up to a set number of times per snapshot, a limit chosen to help balance the algorithm’s efficiency with its robustness. Therefore, as expected in any visual extraction/recognition system, a fair amount of processing time and power is spent during this stage. Additionally, during the initial exploration phase, weighted neural network coefficients are calculated for each potential landmark (32 × 32 pixels) and azimuth grid value, so that these small local views can be learned online. For more details on the calculations performed to arrive at the place cells from the landmark-azimuth matrix (PrPh), consult [24].

(3) Cognitive Map. Each place cell (center of Figure 10) is connected to each neuron of the landmark-azimuth matrix, where each connection has its own unique, learned weights for that landmark-azimuth-place cell combination, as well as temporary scalars for the current, potential landmark view. However, it is very likely that several place cells will be sufficiently active at a given location. The paper states that once a whole area has been mapped during the initial exploration phase, the place cells are divided up into their own areas to eliminate these overlaps (see Figure 11), thus creating a cognitive map.

An assumption is made about the average number of possible place cell transitions from any particular place cell for the test conducted in [24]. This reduces the neural network based transition matrix from N × N to N × k, where k represents the number of possible transition place cell targets, thus greatly reducing the computational complexity from O(N²) to O(N). However, this value may not work for all test cases or in-field use.

(4) Route Planning. The robot’s cognitive map, built during an initial exploration phase as previously described, consists of nodes and edges, as shown in Figure 12, and is thus a graph G = (V, E). Each node is a transition cell, and an edge signifies that the robot has traveled between the two transition cells or nodes. The edges hold weight values (e.g., a function of use) and the nodes hold activity values. The recorded nodes/edges of the cognitive map are used in a neural network version of the Bellman-Ford algorithm [64] to find the most direct route from a motivation point to the single source destination, while several types of motivations (drink, eat, sleep, etc.) are used to initiate the robot’s travel to the proper destination source. The satisfaction level of each motivation changes with time and distance traveled, while increasing at the source.
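For reference, the sketch below runs the classical (non-neural) Bellman-Ford relaxation over a toy cognitive-map graph with made-up nodes and edge costs; the cited work implements an equivalent computation with a neural network.

```python
import math

def bellman_ford(nodes, edges, source):
    """edges: list of (u, v, cost). Returns shortest-path costs and predecessors."""
    dist = {n: math.inf for n in nodes}
    pred = {n: None for n in nodes}
    dist[source] = 0.0
    for _ in range(len(nodes) - 1):                  # relax every edge |V| - 1 times
        for u, v, cost in edges:
            if dist[u] + cost < dist[v]:
                dist[v], pred[v] = dist[u] + cost, u
    return dist, pred

nodes = ["nest", "T1", "T2", "water"]
edges = [("nest", "T1", 1.0), ("T1", "T2", 2.0), ("T1", "water", 4.0), ("T2", "water", 1.0)]
dist, pred = bellman_ford(nodes, edges, "nest")

# Reconstruct the route from the motivation's destination back to the start node.
node, route = "water", []
while node is not None:
    route.append(node)
    node = pred[node]
print("route:", " -> ".join(reversed(route)), "| cost:", dist["water"])
```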

4.3. Grid Cell Centric Systems

Perhaps because the grid cell was not discovered until 2005, or due to its complex nature and its unknown functionality/contribution to navigation, there are few robot navigation systems that are based on the grid cell. Instead, related research on grid cells comes from computational/oscillatory models [36, 53, 65–68].

There are currently two prevailing classes of computational model for describing the stimuli configuration required for the grid cell firing pattern. The first is the attractor network, which follows along the lines of what was covered under the RatSLAM navigational model [53]. The second is a much more computationally complex model called oscillatory interference [69]. The oscillatory interference model is typically simulated using spiking neural networks on nonrobotic systems [36, 67, 68]. Both models have strong arguments for and against their validity. The continuous attractor model, as introduced in [19, 20, 70], will be briefly covered in the next section on neural networks, while the computational model for a neurophysiologically correct oscillatory interference model is beyond the scope of this paper.

As covered in the previous section, Milford and Wyeth [19, 20] use pose cells in their neurobiologically based navigation model RatSLAM, which are based on the conjunctive grid cells found in the deeper layers of the medial entorhinal cortex (MEC), as further described in [70, 71]. Additionally, the wrapping connectivity of the pose cell grid creates a grid cell type firing pattern. However, much scientifically backed detail is missing pertaining to the functionality of the regular, nonconjunctive grid cells found in the superficial layers of the MEC, as well as the specifics of how the conjunctive grid cells’ connectivity is modeled with respect to scale, orientation, and phase. Thus, this work remains in the theoretical cell category.

Additionally, Gaussier et al. [72, 73] used a mathematical model of the grid cell for their mobile robot navigation system, in which the grid cell’s firing pattern is a modulo projection of the path integration input. The tests performed on the mobile robot show poor grid cell firing patterns when relying on path integration alone, with accumulated errors growing as expected. Adding visual input to reset and recalibrate the path integration corrects the noisy path integration input, thus sharpening the firing pattern of the grid cells. The grid cells are therefore used more as a test pattern for various arenas and path integration degradation settings; they do not aid in the mapping and route planning of the mobile robot. Thus, this study does not fully fit this section and will not be covered in any more detail.

5. Literature Survey Analysis

As stated previously, the main focus of this paper is to present research on state-of-the-art mobile robot navigation systems that are based on true rodent neurobiological spatial awareness and navigation brain cells. More specifically, this paper critiques how closely these navigation systems emulate neurobiological entities (e.g., posterior parietal cortex, dorsolateral medial entorhinal cortex, hippocampus, basal ganglia, place cells, and head direction cells) and the systems’ autonomy classification, as well as their cognitive mapping and route planning capabilities. A summary of the answers to these questions can be found in Table 1, as well as critiques at the end of each source surveyed.

5.1. Which Comes First: Technological Advances in Robotics or Insights for Neuroscientists?

The question of which aspects of the models covered in the surveyed literature may benefit technological advances in robotics, versus generating new insights and testable predictions for biologists and neuroscientists, needs addressing. It is the authors’ belief that the current and future state of computational technology is what drives the answer to this question.

The systems covered in this paper generally fall into two categories: robots linked to external computers, which run relatively large scale neurophysiologically based navigation models (neural simulators), and robots with onboard computers running smaller, partially neural models. The neurophysiological models that ran on external computers in the covered work obviously had the advantage of being more detailed, as well as offering the capability for backtracing (e.g., [27, 55, 56]), which is a type of neural model debugging. It is these types of systems that have the most potential for giving biologists and neuroscientists data that will help in gaining new insights and testable predictions. Having a physical robot gathering sensor data and reacting to motor commands also adds a dimension that cannot easily be realized, and exposes issues that cannot be anticipated, in a simulator.

However, it requires ingenuity and out-of-the-box thinking to implement a neurally based system within confined parameters, and it is these types of systems that would most likely aid technological advances in robotics first. For example, as will be covered in Section 6 (Neural Networks), the use of graphics processing units for massively parallel, general purpose computing (GPGPU) is being introduced into robotic systems, primarily for deep learning. Deep learning is a powerful approach for visual object recognition, speech recognition, object detection, and many other applications. Additionally, deep learning is key to place recognition for visual SLAM [74].

5.2. Importance of Visual Recognition in Navigation

As discussed in Section 2 and exemplified in the literature summarized above, it is quite apparent that there is a strong correlation between the visual recognition capabilities and the overall navigation capabilities of a neurobiologically based mobile robot. Navigation that is accomplished mainly by visual cues is referred to as taxon navigation, and it applies to animals, humans, insects, and so forth, as well as to classical and neurobiologically based mobile robot navigation systems. This comes as no surprise, as it has been shown that the specialized navigation and spatial awareness cells of a rodent are dependent to some degree on visual cues [23, 54, 75–77]. Additionally, biological systems, such as those found in rodents, can navigate on nonvisual cues as well, such as auditory, olfactory, and/or somatosensory cues.

5.3. Possible Future Directions in Model Computation

Who knows what type of new, neural network based computational system might be realized in the future from these studies? Certainly, such a system should be composed of processing elements and connectivity networks that more closely resemble those of a brain, thus reducing power and size requirements. Such a radically new, yet familiar, processing system would require the findings from both the large, detailed models run on external computer clusters and the smaller onboard implementations.

6. Neural Networks

For completeness, we present a discussion on the computational complexity of the various neural networks used, such as the continuous attractor network of the RatSLAM, the Hebbian learning rule and how it relates to the type of artificial neural networks (ANNs) used in the literature, and deep learning, which was not used but has interesting possibilities given current computational technologies. Additionally, the computational limitations due to scalability of these types of navigation systems are covered.

6.1. Continuous Attractor Network

To stay on target with closely modeling neurophysiological systems, both allothetic and idiothetic stimuli are fed into ANNs in all of the literature. The one difference is with the RatSLAM system [19, 20, 78, 79], which uses an ANN variant called the (3D) continuous attractor network (CAN) (see Figure 9). Although the CAN is a type of ANN, it is less computationally complex to update because the activity values of the CAN units vary between 0 and 1 while the weighted connections remain fixed. However, the statistical nature of the RatSLAM cell calculations, as covered shortly, will tax the processing system. Changes in a CAN cell’s activity level are given in [20] by

$$\Delta P = P \ast \varepsilon - \varphi \tag{1}$$

or

$$\Delta P_{x',y',\theta'} = \sum_{i}\sum_{j}\sum_{k} P_{i,j,k}\,\varepsilon_{a,b,c} - \varphi, \tag{2}$$

where $P$ represents the activity matrix of the network, $\varepsilon$ is the connection matrix, $\ast$ is the convolution operator, and the constant $\varphi$ is used to create global inhibition in the connection matrix. At the CAN cell level, as described in (2), $\Delta P_{x',y',\theta'}$ is the change in activity level for each cell, and $\varepsilon_{a,b,c}$ is the 3D Gaussian distribution of weighted connections that creates local excitation and inhibition at the cell level, where $a$, $b$, and $c$ are wrap-around functions of $x'$, $y'$, and $\theta'$, respectively. Greater detail can be found in [70].
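A sketch of one such attractor update step is given below, under the assumption of a wrap-around Gaussian excitation kernel (implemented here with SciPy's gaussian_filter) and illustrative grid sizes and constants rather than RatSLAM's actual parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# One attractor-dynamics step on a 3D (x, y, theta) pose-cell grid, following the
# structure of (1)-(2): local excitation, constant global inhibition, rectification,
# and renormalization so that total activity stays constant.
def can_update(P, sigma=(1.5, 1.5, 1.0), phi=1e-4):
    excited = gaussian_filter(P, sigma=sigma, mode="wrap")   # local excitation with wrap-around
    P = np.maximum(excited - phi, 0.0)                       # global inhibition, then rectify
    return P / P.sum()                                       # keep total activity constant

P = np.zeros((20, 20, 36))          # (x, y, theta) pose cells, illustrative resolution
P[10, 10, 18] = 1.0                 # energy packet at the current pose estimate
for _ in range(5):
    P = can_update(P)
idx = np.unravel_index(np.argmax(P), P.shape)
print("activity peak remains at:", idx)
```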

Another difference between the RatSLAM system and the rest of the systems presented in the literature review section is that the other systems use ANNs throughout their navigation system (thus increasing the computational complexity but staying with the neurophysiological model theme), while RatSLAM only uses the CAN for mobile robot pose determination. The visual snapshot matching appears to be a non-ANN based algorithm, hence a scaling down of neurophysiological realism due to onboard computational constraints.

6.2. Hebbian Learning Rule

Hebbian based ANNs used in the research literature covered in this paper can be described by the general equation

$$\Delta w_{ij} = \eta\, x_i\, y_j, \tag{3}$$

where $y_j$ is the output from neuron $j$, $x_i$ is the $i$th input, and $w_{ij}$ is the weight from $x_i$ to $y_j$. The scalar $\eta$ is known as the learning rate and it may change with time. The Hebbian learning rule (3) is named after Hebb [51] and his theory that the connection or synapse between two neurons strengthens as a result of a repeated pre- and postsynaptic neuron firing relationship. Incorporating a bias or threshold term $\theta_j$ and a transfer function $f(\cdot)$ results in the Hebbian rule, as shown in [80–82], in the form of

$$\Delta w_{ij} = \eta\, x_i\, f\!\left(\sum_{k} w_{kj} x_k - \theta_j\right). \tag{4}$$

The transfer function is typically a discrete step function,

$$f(u) = \begin{cases} 1, & u \geq 0,\\ 0, & u < 0, \end{cases} \tag{5}$$

or a smooth “sigmoid”, for example,

$$f(u) = \frac{1}{1 + e^{-u}}. \tag{6}$$

The sigmoid, as well as the tanh and rectified linear unit (ReLU) functions, are typical nonlinear neuron activations; the ReLU is currently a very popular activation in deep learning.

The general Hebbian equation is inherently unstable: the synapses can all either saturate at their maximum allowed value or decay to zero [83–85]. Thus, simple bounded alternatives to (4) are used in [12, 13, 86].
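The exact bounded form varies between these works; one representative rule, shown here only as an illustrative sketch and not as the precise equation from any one of [12, 13, 86], gates the update by the postsynaptic activity and decays each weight toward its presynaptic input, which keeps the weights within the range of the activity values:

$$\Delta w_{ij} = \eta\, y_j \left( x_i - w_{ij} \right).$$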

The neural networks used in the literature surveyed typically use no more than a single hidden layer and are feedforward neural networks; see Figure 13. These ANNs are adequate for simple, discrete input/output combinations, such as heading and turn angle.

6.3. Deep Learning

Deep learning is a growing variant of the previously described simple ANNs, owing to its ability to find intricate structure in large data sets. A deep learning network accomplishes this through multiple added nonlinear processing layers. These processing, or hidden, ANN layers are able to extract successive layers of object features. As previously stated, deep learning has offered advances in many domains, such as image recognition, speech recognition, reconstructing brain circuits, natural language understanding, and relational data. Specifically for navigation, it is the visual object recognition ability of deep learning and deep convolutional networks (e.g., traffic sign recognition, detection of pedestrians) which allows autonomous mobile robots and self-driving cars [87] to be realized.

Further details on specifics of the structure of deep learning neural networks, backpropagation mathematics, and so forth are beyond the scope and theme of this paper; thus they will not be covered here.

6.4. Computational Complexity Limiting Realism Scalability

When determining the computational complexity of a neural network, there are three important parameters to consider: size, depth, and weight of the network. The size is the number of neurons and the depth is the length of the longest path from an input point to an output neuron, while the sum total of the absolute values of the weights represents the weight of the network.

The training of the ANNs that are used for complex pattern recognition, such as those interfacing allothetic stimuli to the navigation system, can realistically only be accomplished offline; the processing power and time required would have too large an impact on mobile robot resources and usability. This is due to the many forward propagation and backpropagation cycles required to set the weights of the ANN to the most optimal values possible (given a set number of training cycles) for each training sample in the training phase. This is particularly true for deep neural networks, which have many hidden layers. Thus, the time complexity will be a function of network size and particularly depth. An example of a simple two-input, two-output, single layer ANN is given in Figure 13. Further examples can be found in the literature surveyed.
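For instance, a toy counterpart to the two-input, two-output, single-layer network of Figure 13 (with made-up weights) can be used to compute both its forward pass and the three complexity parameters defined above.

```python
import numpy as np

# Two-input, two-output, single-layer network with illustrative weights.
W = np.array([[0.8, -0.3],        # weights from the two inputs to output neuron 1
              [0.2,  0.5]])       # weights from the two inputs to output neuron 2
b = np.array([0.1, -0.2])         # bias/threshold terms

def forward(x):
    return 1.0 / (1.0 + np.exp(-(W @ x + b)))   # sigmoid transfer function

x = np.array([1.0, 0.5])
print("outputs:", forward(x))
print("size  :", W.shape[0])              # number of (output) neurons
print("depth :", 1)                       # longest input-to-output path
print("weight:", np.abs(W).sum())         # sum of absolute weight values
```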

Ways in which to add neurobiologically based entities, such as allothetic stimuli, other percepts, and/or controlling influences (e.g., nucleus accumbens, grid cells) from various parts of the brain, while maintaining a usable mobile robot footprint, are as follows:
(1) Use of a mobile GPGPU to run more complex ANNs.
(2) Removing ANNs from simpler parts of the system that can easily be replaced by a good, cheap sensor (e.g., replacing the head direction ANN in [12] with a MEMS gyroscope).
(3) Creating an application-specific integrated circuit (ASIC) that models ANNs.

Option (3) would be the most expensive but also the most efficient in power, size, and processing capabilities. Option (1) is more flexible but still requires a great deal of power and special programming expertise; an example of what is available is NVIDIA’s® Tegra® K1 mobile GPU with 192 lightweight parallel processor cores, and NVIDIA GPUs can be programmed using CUDA or cuDNN. Option (2) takes the system away from the realism of a neurobiological system, but some tradeoffs need to be made to model the portions that are most important to the research.

7. In Brief

Certainly, one of the most important neurobiologically inspired systems in use today is the ANN. It offers a computational paradigm that conventional programming can hardly recreate: our computers excel at arithmetic and at following sequences of code, whereas visual and pattern recognition applications are too complex to program explicitly. Thus, research being done in this area, especially with the benefits found from deep learning, will continue to contribute to the field of artificial intelligence.

It is the hope of many researchers that work being performed in neurobiologically based navigation and spatial awareness systems will offer added technological advances to the autonomous navigation capabilities of mobile robots, as well as to better understanding of at least a small portion of the brain.

Competing Interests

The authors declare that they have no competing interests.