Model-based AI accelerator design on FPGA with in-depth evaluation of design parameters

Date
2025-02-11
Authors
Özdil, Gözde
Journal Title
Journal ISSN
Volume Title
Publisher
Graduate School
Abstract
Artificial intelligence (AI) has advanced considerably in recent years. However, as models become increasingly complex, traditional hardware such as GPUs and CPUs faces limitations in meeting demands for power efficiency, low latency, and energy optimization. FPGAs, with their parallel processing capabilities, low latency, reconfigurable architecture, and reduced power consumption, have gained traction as an alternative platform for deploying AI models. Consequently, research on AI model deployment on FPGAs has grown significantly in recent years. To facilitate AI deployment on FPGAs, frameworks such as Vitis AI have been introduced. This study used the Vitis AI framework to implement three different AI accelerators on the Kria KV260 Vision AI Starter Kit, a system-on-chip (SoC) platform. The first design performed vehicle color recognition using the ResNet-18 CNN model in PyTorch, fine-tuned on the Vehicle Color Recognition (VCoR) dataset. The model was quantized with the Vitis AI quantizer, compiled and optimized for deployment with the Vitis AI compiler, and then integrated onto the FPGA. The Vitis TRD flow, rather than the Vivado TRD, was followed to create the hardware design, simplifying the process and eliminating the need for PetaLinux, which is often more complex to use. Deployment on the FPGA was performed using the PYNQ framework, enabling Python-based model integration without requiring any hardware description languages. The dataset was pre-processed before inference, and real-time performance was achieved by integrating the FPGA with a camera. In the second accelerator, the ResNet-18 model was fine-tuned for pneumonia diagnosis using chest X-ray images. This design reused the existing hardware, demonstrating the flexibility and adaptability of the model-based design approach for different classification tasks without additional hardware modifications.
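The pre-processing step mentioned above, which prepares camera frames for the ResNet-18 classifier before PYNQ-based inference, can be sketched as follows. This is a minimal illustration, not the thesis's actual code: the 224x224 input size and the ImageNet mean/std constants are typical ResNet-18 assumptions, not values confirmed by the abstract.

```python
import numpy as np

# Assumed constants: ImageNet normalization statistics and a 224x224 input,
# as commonly used when fine-tuning ResNet-18 (hypothetical for this thesis).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB frame into a normalized 3x224x224 float32 tensor."""
    h, w, _ = frame.shape
    # Nearest-neighbour resize to 224x224 (keeps the sketch free of an OpenCV dependency).
    rows = np.arange(224) * h // 224
    cols = np.arange(224) * w // 224
    resized = frame[rows][:, cols].astype(np.float32) / 255.0
    # Normalize per channel, then reorder from HWC to the CHW layout CNNs expect.
    normalized = (resized - MEAN) / STD
    return normalized.transpose(2, 0, 1)
```

In an actual PYNQ deployment, a tensor like this would be written into the DPU's input buffer before launching inference; the exact buffer handling depends on the overlay used.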
The third design implemented object detection using the YOLOv3 CNN model, pre-trained on the COCO dataset and obtained in pre-quantized form from the Vitis Model Zoo. The model was compiled to fit the existing hardware configuration and deployed on the FPGA. Both pre-processing and post-processing steps were integrated using PYNQ, allowing bounding box generation and visualization of detected objects. Real-time object detection was achieved by connecting a live camera feed to the system. A key contribution of this work is the adoption of a model-based design approach, which simplifies FPGA deployment by avoiding the need for PetaLinux or low-level hardware design. By following the Vitis TRD flow, a more accessible and user-friendly alternative to the Vivado TRD, this study developed a detailed guide for deploying AI models on FPGAs. Furthermore, the study conducted a comprehensive analysis of DPU configuration parameters and frequency settings, filling a significant gap in the literature. The results provided new insights into performance, resource utilization, power consumption, and energy efficiency, clarifying open questions and addressing potentially misleading conclusions in previous studies. This work contributes to the field by presenting practical, user-friendly methods for designing and analyzing AI accelerators for both classification and object detection tasks. The study highlights the advantages of Vitis AI and model-based FPGA design, providing a guide for future research and development in high-performance, energy-efficient AI deployment on FPGAs.
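The YOLOv3 post-processing described above typically filters raw detections by confidence and then applies non-maximum suppression before drawing bounding boxes. A minimal NumPy sketch of greedy NMS is shown below; the 0.45 IoU threshold is an illustrative default, not a value taken from the thesis.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all given as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy non-maximum suppression; returns indices of the boxes to keep."""
    order = np.argsort(scores)[::-1]  # highest-confidence box first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Discard remaining boxes that overlap the kept box too strongly.
        order = rest[iou(boxes[i], boxes[rest]) < iou_thresh]
    return keep
```

The surviving indices select the boxes that would then be scaled back to the original frame size and drawn over the live camera feed.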
Description
Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2025
Keywords
artificial intelligence, AI accelerator design
Citation