CSL professor explores the limits of AI
In 2019, the CEO of Nvidia declared that Moore’s law, the observation that the number of transistors on an integrated circuit doubles roughly every two years, was dead: transistor density could no longer keep increasing at that pace. While there is still debate in the scientific community over the accuracy of this empirical observation, Moore’s law has had an indisputable impact on the manufacturing of computers and the study of computational systems, as it played a massive role in driving innovation, planning, and thought about energy limits.
With Moore’s law coming to an end for traditional complementary metal-oxide-semiconductor (CMOS) transistors, new low-power nanotechnologies are being explored as computational substrates. The intersection between energy limits and cutting-edge artificial intelligence is where CSL research assistant professor Lav Varshney and his colleague from the Indian Institute of Technology Madras, Avhishek Chatterjee, decided to focus their efforts.
In their paper, “Energy-Reliability Limits in Nanoscale Feedforward Neural Networks and Formulas,” Varshney and Chatterjee explore what it would look like to implement deep neural networks, a technique within artificial intelligence, in nanoscale circuits. Deep neural networks are now being deployed in society for all kinds of tasks including vision, natural language processing, and speech processing.
Nanoscale semiconductor devices (such as spin electronics and carbon nanotubes) are being used to implement computational systems because of energy-efficiency requirements, but they are noisy. That is, bits may be flipped randomly, so these devices perform individual computations with some probability of error. As such, redundancy techniques from fault-tolerant computing are necessary to achieve circuit-level or system-level reliability despite device-level unreliability.
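To make the redundancy idea concrete, here is a minimal sketch of the simplest textbook scheme: compute the same bit on several noisy devices and take a majority vote. It is not the construction analyzed in the paper. The function name majority_error_probability, the flip probability eps, and the number of copies n are illustrative assumptions, as is the simplification that the devices fail independently and that the voter itself is noiseless.

```python
from math import comb


def majority_error_probability(eps: float, n: int) -> float:
    """Probability that a majority vote over n noisy copies of a bit is wrong.

    Assumptions (illustrative, not from the paper): each copy is flipped
    independently with probability eps, and the voter is error-free.
    n must be odd so that ties cannot occur.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must be in [0, 1]")
    # The vote fails when more than half of the n copies are flipped.
    return sum(
        comb(n, k) * eps**k * (1 - eps) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    # More copies mean a lower error rate but also more devices to power.
    for n in (1, 3, 5, 9):
        print(n, majority_error_probability(0.1, n))
```

Under these assumptions, each extra pair of copies lowers the error probability whenever eps is below 0.5, but every copy also consumes energy. That tension between energy spent and reliability gained is the kind of tradeoff the energy-reliability limits in the paper address.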
“This work originated in the Systems on Nanoscale Information Fabrics (SONIC) Center led by CSL’s Naresh Shanbhag, and focused on a variety of fundamental questions that arise in nanoscale computing,” explained Varshney. “As information theorists, it was natural for us to ask about fundamental limits in such computational systems.”
The mathematical theory they developed for efficiently implementing deep learning at a nanoscale level was also predictive of the way the brain works. Several experimentally measurable properties of neural structure and energy allocation in the mammalian sensory cortex all line up in the way the theory predicts.
In a practical sense, their work provides a way to design reliable nanoscale AI circuits. Theoretically, though, this work serves as a contribution to the understanding of fault-tolerant computing limits by putting an emphasis on energy, beyond what researchers like Bruce Hajek had looked at in the past.
Varshney and Chatterjee are pleased not only with the research results themselves, but also to be contributing to the expansion of information theory into a broad range of fields: their paper will be published in the first issue of a new journal, the IEEE Journal on Selected Areas in Information Theory, Special Issue on Deep Learning.
“Deep learning is such an important, topical area within AI,” shared Varshney, who is currently on leave at Salesforce Research studying deep learning in the context of AI ethics and AI for social good. “Having more energy-efficient implementations of deep learning will allow this technology to be more sustainable when deployed into many corners of the world.”