Thanks to its ultra-low latency, the system processes requests as fast as it receives them.
"Real-time AI is becoming increasingly important as cloud infrastructures process live data streams, whether they be search queries, videos, sensor streams, or interactions with users," said Doug Burger, an engineer at Microsoft, in a blog post late on Tuesday.
'Project Brainwave' uses the massive field-programmable gate array (FPGA) infrastructure that Microsoft has been deploying over the past few years.
"By attaching high-performance FPGAs directly to our datacentre network, we can serve DNNs as hardware microservices, where a DNN can be mapped to a pool of remote FPGAs and called by a server with no software in the loop," Burger said.
He added that the system architecture reduces latency, since the CPU does not need to process incoming requests, and allows very high throughput, with the FPGA processing requests as fast as the network can stream them.
The system has been architected to deliver high sustained performance across a wide range of complex models, with batch-free execution.
Microsoft claimed that the system, designed for real-time AI, can handle complex, memory-intensive models such as Long Short Term Memories (LSTM), without using batching to juice throughput.
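The latency benefit of batch-free execution can be illustrated with a small sketch. The function names, timings, and the queuing model below are illustrative assumptions for this comparison, not Microsoft's implementation: in a batched server, a request must wait for enough peers to arrive before the batch runs, while batch-free execution processes each request the moment it lands.

```python
# Hypothetical sketch: per-request latency under batched serving versus
# batch-free (process-on-arrival) serving. All numbers are assumptions
# chosen for illustration, not measurements of Project Brainwave.

def batched_latencies(arrival_gap_ms, batch_size, compute_ms):
    """Latency per request when the server waits to fill a batch.
    Request i must wait for the remaining (batch_size - 1 - i)
    arrivals, then the whole batch is processed together."""
    return [
        (batch_size - 1 - i) * arrival_gap_ms + compute_ms
        for i in range(batch_size)
    ]

def batch_free_latencies(arrival_gap_ms, batch_size, compute_ms):
    """Latency per request when each request runs on arrival:
    no waiting for peers, so every request pays only compute time."""
    return [compute_ms for _ in range(batch_size)]

if __name__ == "__main__":
    # Requests arrive 5 ms apart; one inference takes 2 ms.
    print("batched   :", batched_latencies(5, 4, 2))   # earliest arrival waits longest
    print("batch-free:", batch_free_latencies(5, 4, 2))
```

Under these assumed numbers, the first request in a batch of four waits 15 ms for its peers before the 2 ms inference even starts, whereas batch-free execution gives every request the same 2 ms latency.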
"As we tune the system over the next few quarters, we expect significant further performance improvements," Burger added.
"With the 'Project Brainwave' system incorporated at scale and available to our customers, Microsoft Azure will have industry-leading capabilities for real-time AI," Burger noted.