Yazar "692582" ile İTÜ Akademik Çalışmalar'a göz atma
Sayfa başına sonuç
ÖgeApproximate artificial neural network hardware aware synthesis tool(Lisansüstü Eğitim Enstitüsü, 2021) Nojehdeh, Mohammadreza Esmali ; Altun, Mustafa ; 692582 ; Elektronik ve Haberleşme MühendisliğiIn the previous decade, artificial neural networks (ANNS) have attracted considerable attention from researchers in many areas and have become a favorite method; from business to aerospace applications. We live in the information age where this information feeds artificial intelligence (AI). According to Forbes' estimate, over the last two years alone 90 percent of the data in the world was generated. At first glance, processing more information may seem like a dissipation of more power in central processing units(CPUs) and graphic processing units (GPUs) or spending more time to obtain the results, but for the portable systems due to limitations in battery capacity, power, and hardware area limitations, different concerns emerge. For example, less consumption of energy is vital to extend the battery supporting time for mobile devices. The problem starts to be bold when software engineers regardless of the hardware sources (especially for portable devices) develop different ANNs architecture, where they intend to achieve a network with the best performances. Similarly, hardware engineers' AI knowledge is limited and any change within hardware design in lack of this knowledge may yield a catastrophic defect in the expected performance. As a result, this uninformed state yields a gap between the hardware and software sides of ANNs. The emerged gap provides a pitch to hardware and software researchers to play their best performance, where more information about the rival side makes their performance more eye-catching. By obtaining this gap, the co-design method or hardware-aware training methods become prevalent recently. The object of this dissertation is also to develop a methodology to realize the ANNs with minimum hardware cost by regarding the software performance. Limitation in hardware cost, consumed energy, and dissipated power for devices leads designers to find new architectures and approaches. Approximate computing is one of them, where this method is an useful technique for error essence systems. By leveraging the approximate level, a trade-off between the output accuracy and hardware cost is attainable. For example, assume a 1-bit exact adder costs 18 transistors, and by removing 3 transistors, a new approximate adder by 15 transistors is achievable, but the new approximate adder generates inexact results when the input is $(0,0)$, and suppose that the results for the rest set of the inputs$((0,1),(1,0),(1,1))$ are correct. Therefore, the approximate adder saves 3 transistors at the cost of 1 inexact result. Generally, approximate computing is apple of designers' eye in applications with error tolerance capability, consequently, error tolerance inherence of ANNs nominates approximate computing as a potential method to reduce the hardware complexity of ANNs. Since multipliers and adders are fundamental building blocks of ANNs, in this thesis, by introducing novel approximate multipliers and adders we replace them with exact adders and multipliers. As mentioned earlier, approximate computing is a trade-off between accuracy and hardware cost, to adjust this trade-off, we synthesized the proposed approximate blocks based on the desired error metric. Also, we proposed an equation to calculate the mean absolute error of the introduced approximate multiplier and adders. Based on our best knowledge, the proposed approximate blocks are the only ones which are synthesized based on the mean error value. In next step, we introduced a new error metric called the approximate level to evaluate the performance of the proposed approximate blocks in ANNs. On the other hand, ANNs are made up of a lot of multipliers and adders, where the search space for the best combination of these blocks grows with the increase of bit-width or neuron numbers. To tackle this problem and by exploiting the proposed error metric, we introduce a new search algorithm to find the appropriate combination of the approximate and exact versions of the arithmetic blocks by taking into account the expected accuracy of ANNs. Also, in this thesis we realized ANNs under different synthesis techniques to obtain the pros and cons of each approach. Since the parallel architecture requires a large area we considered the time-multiplexed architecture as the main architecture method, where computing resources are re-used in the multiply-accumulate (MAC) blocks. As an application, the MNIST and Pen-digit database are considered. To examine the efficiency of the proposed method, various architectures and structures of ANNs are realized. Our experimental results show that exploiting the proposed approximate multipliers yields smaller area and power consumption compared to those designed using previously proposed prominent approximate multipliers. Also, according to these results, concurrent use of approximate multipliers and adders provides remarkable results in terms of hardware cost, where we obtain $60\%$ and $40\%$ reduction in energy consumption and occupied area of the ANN design with the same or better hardware accuracy compared to the exact adders and multipliers. To demonstrate the proposed method's scalability, we propose an efficient method to realize a convolution layer of convolution neural networks (CNNs). Inspired by the fully-connected neural network architecture, we introduce an efficient computation approach to implement convolution operations.