How a Machine-Learning System Based on Light Could Yield More Efficient LLMs


Deep Neural Networks (DNNs) excel at extracting patterns from vast amounts of data. They enable computers to understand images, converse in natural language, drive cars autonomously, and even help doctors spot diseases.

Loosely inspired by how our brains work, DNNs are effective at solving tough problems that conventional computer programs struggle with. But the digital hardware underpinning today's DNNs is reaching the limits of its capabilities, even as the field of machine learning continues to expand. These technologies also demand massive amounts of energy and remain confined to large data centers. This situation is motivating the search for new computing approaches.

In response, a team led by MIT researchers has devised a system with the potential to far surpass the capabilities of the machine-learning program driving ChatGPT, while consuming significantly less energy than the cutting-edge supercomputers that run modern machine-learning models. The researchers report a more than 100-fold improvement in energy efficiency and a 25-fold improvement in compute density. The system performs its calculations by manipulating light instead of electrons, using hundreds of micron-scale lasers.
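To make the idea concrete, here is a toy numerical sketch of how an optical system can compute a matrix-vector product, the workhorse operation of neural networks. The encoding scheme and all names below are illustrative assumptions for this simulation, not the MIT team's actual hardware design: each input is encoded as the amplitude of a laser, each weight as a modulator's transmission, and each photodetector sums the light it receives.

```python
import numpy as np

# Toy simulation of optical matrix-vector multiplication.
# All names and the amplitude-encoding scheme are illustrative
# assumptions, not the MIT team's actual design.

rng = np.random.default_rng(0)

def optical_matvec(weights: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Simulate a photonic matrix-vector product.

    Each input x[j] is encoded as the amplitude of one laser, each
    weight weights[i, j] as a modulator's transmission, and each
    photodetector i sums the light arriving at it -- a dot product.
    """
    modulated = weights * x       # light from laser j through modulator (i, j)
    return modulated.sum(axis=1)  # each detector integrates its row of light

W = rng.normal(size=(4, 8))  # "modulator" settings, one per weight
x = rng.normal(size=8)       # "laser" amplitudes, one per input

# The simulated optical path matches an ordinary electronic matmul.
assert np.allclose(optical_matvec(W, x), W @ x)
```

Because the summation happens as light accumulates at a detector rather than through sequential digital arithmetic, the same physical step handles an entire row of multiply-accumulates at once, which is where the efficiency gains of optical computing come from.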

The researchers emphasize that the technique opens an avenue for large-scale optoelectronic processors to accelerate machine-learning tasks, moving them from data centers to decentralized edge devices. In other words, cell phones and other small devices could become capable of running programs that today can only be computed at large data centers. Optical computations consume significantly less energy than their electronic counterparts, and light can convey far more information across a much smaller space.

Dirk Englund, an associate professor in MIT's Department of Electrical Engineering and Computer Science and the leader of the work, says that the capabilities of today's supercomputers constrain ChatGPT's size, and that training much larger models is not economically practical. The technology his team has pioneered could make it possible to run machine-learning models that would otherwise remain out of reach in the coming years. He adds that we do not yet know what a next-generation ChatGPT will be capable of if it is 100 times more powerful, but that is the regime of discovery that this kind of technology can allow.

Despite these benefits, current optical neural networks (ONNs) face significant challenges of their own. For instance, they consume a great deal of energy because they are inefficient at converting incoming electrical data into light. The components required for these conversions are also bulky and occupy substantial space. And although ONNs handle linear computations such as addition well, they are limited in nonlinear operations such as multiplying two signals together and evaluating conditional statements, which neural networks need between layers.
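A common workaround in the optical-computing literature is a hybrid optoelectronic design: let optics perform the linear matrix-vector products and fall back to electronics for the nonlinearity between layers. The following is a minimal NumPy sketch of that division of labor; the layer structure and function names are illustrative assumptions, not the specific architecture described by the MIT team.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(z: np.ndarray) -> np.ndarray:
    """Nonlinear activation, applied in the electronic domain after detection."""
    return np.maximum(z, 0.0)

def hybrid_layer(W: np.ndarray, x: np.ndarray) -> np.ndarray:
    """One illustrative optoelectronic layer."""
    z = W @ x       # linear matrix-vector product: the part optics does well
    return relu(z)  # nonlinearity: handled electronically between optical stages

x = rng.normal(size=16)                        # input activations
h = hybrid_layer(rng.normal(size=(8, 16)), x)  # first hybrid layer
y = hybrid_layer(rng.normal(size=(4, 8)), h)   # stacking layers forms a small DNN
```

In such a design, every optical-to-electronic round trip costs energy and chip area, which is exactly why the conversion inefficiency and bulky components mentioned above matter so much in practice.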
