As AI research and technological development advance, we also need to consider the energy and infrastructure resources required to manage large datasets and perform computationally demanding calculations. When we look to nature for models of efficiency, the human brain stands out for its ability to handle complex tasks with remarkably little energy. Inspired by this, Microsoft researchers aim to understand the brain's efficient processes and replicate them in AI.
Microsoft Research Asia, in collaboration with Fudan University, Shanghai Jiao Tong University, and the Okinawa Institute of Technology, has three noteworthy projects underway: one to introduce neural networks that simulate how the brain learns and computes information, another to improve the accuracy and efficiency of predictive models of future events, and a third to improve AI's language processing and pattern prediction capabilities. In addition to improving performance, the projects highlighted in this blog post aim to significantly reduce power consumption and pave the way for more sustainable AI technology.
CircuitNet simulates brain-like neural patterns
Many AI applications rely on artificial neural networks designed to mimic the brain's complex neural patterns, but these networks typically replicate only one or two types of connectivity patterns. In contrast, the brain communicates information using a variety of neural connectivity patterns, including feedforward excitation and inhibition, reciprocal inhibition, lateral inhibition, and feedback inhibition (Figure 1). The brain's networks contain densely interconnected local regions with fewer connections between distant regions. Each neuron forms thousands of synapses, most of them within its own region, but some synapses link distinct functional clusters: groups of interconnected neurons that work together to perform a specific function.
Figure 1: Four neural connectivity patterns in the brain. Each circle represents a neuron and each arrow represents a synapse.
Inspired by this biological structure, researchers developed CircuitNet, a neural network that replicates multiple types of connectivity patterns. CircuitNet's design features a combination of densely connected local nodes and fewer connections between distant regions, enhancing signal transmission through circuit motif units (CMUs), small repeating connection patterns that aid in information processing. This structure, shown in Figure 2, supports multiple rounds of signal processing and has the potential to advance the way AI systems process complex information.
Figure 2: CircuitNet architecture: A general-purpose neural network performs a variety of tasks, accepting a variety of inputs and producing corresponding outputs (left). CMUs increase efficiency by keeping most connections local and reducing long-range connections (center). Each CMU has densely interconnected neurons to model universal circuit patterns (right).
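To make this locality idea concrete, here is a minimal PyTorch sketch: each hypothetical CMU module processes its own densely connected cluster of features over several local rounds, while a single shared projection stands in for the sparse long-range connections between clusters. The class names, dimensions, and pooling-based long-range link are illustrative assumptions, not the published CircuitNet implementation.

```python
import torch
import torch.nn as nn

class CMU(nn.Module):
    """Hypothetical circuit motif unit: a small, densely connected
    cluster of neurons that exchanges signals over several local rounds."""
    def __init__(self, dim: int, rounds: int = 2):
        super().__init__()
        self.dense = nn.Linear(dim, dim)  # dense local connectivity
        self.rounds = rounds

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.rounds):  # multiple rounds of local signal processing
            x = torch.tanh(self.dense(x)) + x
        return x

class CircuitNetSketch(nn.Module):
    """Several CMUs linked by a single shared long-range pathway,
    keeping most connections local."""
    def __init__(self, num_cmus: int = 4, dim: int = 32):
        super().__init__()
        self.cmus = nn.ModuleList(CMU(dim) for _ in range(num_cmus))
        self.long_range = nn.Linear(dim, dim, bias=False)  # sparse cross-cluster link

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_cmus, dim) -- one feature slice per CMU
        local = torch.stack([cmu(x[:, i]) for i, cmu in enumerate(self.cmus)], dim=1)
        # Each CMU receives one pooled message instead of dense all-to-all links.
        message = self.long_range(local.mean(dim=1, keepdim=True))
        return local + message

out = CircuitNetSketch()(torch.randn(8, 4, 32))
print(out.shape)  # torch.Size([8, 4, 32])
```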
Evaluation results are promising: CircuitNet outperformed several popular neural network architectures in function approximation, reinforcement learning, image classification, and time series forecasting. In many cases it matched or exceeded the performance of other neural networks while using fewer parameters, demonstrating its effectiveness and strong generalization across a range of machine learning tasks. The next step is to test CircuitNet's performance on large-scale models with billions of parameters.
Spiking neural networks: A new framework for time series forecasting
Spiking neural networks (SNNs) are emerging as energy-efficient artificial neural networks with potential applications in fields such as robotics, edge computing, and real-time processing. Unlike traditional neural networks, which process signals continuously, SNNs activate neurons and generate spikes only when a certain threshold is reached. This approach simulates how the brain processes information and conserves energy. However, SNNs have not been well suited to predicting future events based on historical data, a capability important in fields such as transportation and energy.
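The threshold behavior described above is commonly modeled with a leaky integrate-and-fire (LIF) neuron: a membrane potential accumulates input, decays over time, and emits a spike only when it crosses a threshold, after which it resets. The following minimal sketch illustrates the general mechanism; the parameter values are arbitrary, and this is not the researchers' implementation.

```python
import numpy as np

def lif_neuron(inputs, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire dynamics: the membrane potential
    accumulates input, decays each step, and emits a spike (1) only
    when it crosses the threshold, after which it resets."""
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = leak * potential + current  # integrate with leak
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0                     # reset after spiking
        else:
            spikes.append(0)
    return np.array(spikes)

print(lif_neuron([0.3, 0.4, 0.5, 0.1, 0.9]))  # [0 0 1 0 0]
```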
To improve the predictive capabilities of SNNs, researchers have proposed an SNN framework designed to forecast trends over time, such as power consumption and traffic patterns. This approach leverages the efficiency of spiking neurons in processing temporal information and synchronizes the SNN with periodically collected time series data. Two encoding layers convert the time series data into spike sequences, allowing the SNN to process them and make accurate predictions (see Figure 3).
Figure 3: A novel framework for SNN-based time series forecasting: Time series data are encoded into spikes by a spike encoder (center, bottom). Then, the spikes are processed by SNN models (Spike-TCN, Spike-RNN, and Spike-Transformer) for training (top). Finally, the learned features are sent to the projection layer for prediction (bottom right).
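The post doesn't detail how the encoding layers work, but delta modulation is one standard way to turn a real-valued series into spikes: emit a spike on one channel when the signal rises by more than a threshold, and on another when it falls. The sketch below is an illustrative assumption along those lines, not the framework's actual encoder.

```python
import numpy as np

def delta_spike_encode(series, threshold=0.1):
    """Encode a real-valued series as two spike channels: one fires when
    the value rises by more than `threshold`, the other when it falls."""
    up, down = [0], [0]                  # no change at the first step
    for prev, curr in zip(series[:-1], series[1:]):
        up.append(1 if curr - prev > threshold else 0)
        down.append(1 if prev - curr > threshold else 0)
    return np.stack([up, down])          # shape: (2, len(series))

load = [0.52, 0.55, 0.70, 0.69, 0.50]    # toy normalized power-consumption readings
print(delta_spike_encode(load))
```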
Tests show that this SNN approach is highly effective at time series forecasting, performing as well as or better than traditional methods while consuming significantly less energy. SNNs capture temporal dependencies well and model the dynamics of time series, providing an energy-efficient approach that closely matches how the brain processes information. We plan to continue exploring ways to further improve SNNs by drawing on these brain-inspired mechanisms.
Improving SNN sequence prediction
SNNs can help models predict future events, but because they rely on spike-based communication, many techniques that work for conventional artificial neural networks are difficult to apply to them directly. For example, SNNs struggle to effectively handle the rhythmic and periodic patterns found in natural language processing and time series analysis. In response, researchers have developed a new approach to SNNs called CPG-PE, which combines two techniques:
Central pattern generator (CPG): A neural circuit in the brainstem and spinal cord that autonomously generates rhythmic patterns to control functions such as movement, breathing, and chewing.

Positional encoding (PE): A process that helps artificial neural networks identify the order and relative position of elements in a sequence.
By integrating these two techniques, CPG-PE enables the SNN to identify the location and timing of signals, improving its ability to process time-based information. This process is shown in Figure 4.
Figure 4: Application of CPG-PE in an SNN. X, X′, and X_output are spike matrices.
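A rough sketch of the idea follows: rhythmic, CPG-style oscillations at several periods are thresholded into binary spike trains and concatenated with the input spike matrix, giving downstream spiking layers a positional reference. The channel count, periods, and thresholding here are assumptions for illustration, not the published CPG-PE formulation.

```python
import numpy as np

def cpg_positional_spikes(num_steps, num_channels=4):
    """Hypothetical CPG-style positional encoding: rhythmic sinusoids at
    different periods, thresholded into binary spike trains so they can
    be concatenated with an SNN's spike matrix."""
    t = np.arange(num_steps)
    periods = [2 ** (k + 2) for k in range(num_channels)]  # e.g., 4, 8, 16, 32
    waves = np.stack([np.sin(2 * np.pi * t / p) for p in periods])
    return (waves > 0).astype(int)  # spike on the positive half of each cycle

def add_cpg_pe(spike_matrix):
    """Append positional spike channels to an input spike matrix of
    shape (channels, time), mirroring how PE augments features."""
    pe = cpg_positional_spikes(spike_matrix.shape[1])
    return np.concatenate([spike_matrix, pe], axis=0)

x = np.random.randint(0, 2, size=(2, 16))  # toy spike matrix: 2 channels, 16 steps
print(add_cpg_pe(x).shape)                 # (6, 16): 2 input + 4 positional channels
```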
Researchers evaluated CPG-PE using four real-world datasets: two covering traffic patterns and one each for electricity consumption and solar energy. Results show that SNNs using our method significantly outperform SNNs without positional encoding (see Table 1). Moreover, CPG-PE can be easily integrated into any SNN designed for sequence processing, making it adaptable to a wide variety of neuromorphic chips and SNN hardware.
Table 1: Time series forecasting results on two benchmarks with prediction lengths of 6, 24, 48, and 96. “Metr-la” and “Pems-bay” are traffic pattern datasets. The best SNN results are in bold; upward arrows mark metrics for which higher values are better.
Ongoing AI research aimed at improving capabilities, efficiency, and sustainability
The innovations highlighted in this blog demonstrate the potential to create AI that is not only more capable, but also more efficient. We are excited to deepen our collaboration and continue to apply neuroscience insights to AI research, exploring ways to develop more sustainable technologies.