LEE - Electronics Engineering - Master's (M.Sc.)
Recent Submissions
Items 1 - 5 of 49
-
Item: Design of a clutchless automated gear control system for electric tractors (Graduate School, 2025-06-25)

The electrification of agricultural machinery is becoming increasingly important due to rising fossil fuel prices and stringent environmental regulations aimed at reducing emissions. In developing countries, agricultural activities account for approximately 35% of total emissions, making the transition in this sector inevitable. Compared to internal combustion engines, electric agricultural machinery offers significant advantages in terms of higher energy efficiency, lower operational costs, and improved sustainability. Moreover, electrification not only enhances energy efficiency but also enables various subsystems within the vehicle architecture to become more effective and reliable. During this transformation, traditional mechanical components are being replaced by advanced electronic and software-based control strategies. This study focuses on the control of a dog clutch mechanism that enables automated, clutchless gear shifting, with the objective of developing a more efficient and reliable gear transition method. While conventional synchronization methods rely on mechanical components, software-based control strategies allow for more precise synchronization of motor speeds, leading to smoother and more efficient gear transitions. The study utilizes optimized gear ratios and implements gear shifting through hydraulic actuators, which have been relatively little explored in the literature. While most existing studies focus on using electric motors for gear synchronization, this research prioritizes hydraulic actuators, leveraging the vehicle's existing hydraulic system. This approach not only makes efficient use of the available hydraulic infrastructure but also highlights the advantages of hydraulic systems over electrical ones, particularly in terms of instantaneous power transfer and high power capacity.
The primary objective of this research is to develop an optimized gear control algorithm for agricultural machinery, enabling the electronic control of a dog clutch mechanism in a moving vehicle. To achieve this, real-time feedback mechanisms have been developed through the vehicle management unit to enhance gear shift accuracy and improve system response time. Two different software algorithms—one incorporating active speed synchronization and the other without it—were tested on an actual vehicle to evaluate their performance and compare their benefits. The developed software aims to reduce reliance on mechanical components, thereby enhancing system durability and long-term operational stability. This thesis contributes to the advancement of software-based gear control in electric agricultural machinery, aiming to eliminate the inefficiencies of traditional systems. The performance of the developed algorithms has been evaluated under field conditions, demonstrating their advantages over conventional systems. Ultimately, the proposed approach seeks to reduce the number of gears in equivalent internal combustion engine-powered agricultural vehicles from up to 32 to just 2, thereby minimizing maintenance requirements, reducing system complexity, and optimizing dimensional efficiency.
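As a toy illustration of the active speed synchronization described above, the sketch below models the shift controller as a proportional loop that drives the motor speed toward the target shaft speed before the dog clutch is allowed to engage. The first-order motor model, the gain, and the tolerance are illustrative assumptions, not the thesis's actual vehicle-management-unit algorithm.

```python
# Hedged sketch: proportional speed synchronization before dog-clutch
# engagement. All constants and the motor model are illustrative.

def synchronize_speed(motor_rpm, target_rpm, kp=0.2, tol_rpm=10.0, max_steps=500):
    """Drive motor_rpm toward target_rpm; return (rpm_trace, engaged)."""
    trace = [motor_rpm]
    for _ in range(max_steps):
        error = target_rpm - motor_rpm
        if abs(error) <= tol_rpm:        # speeds matched: safe to engage dogs
            return trace, True
        motor_rpm += kp * error          # proportional speed command
        trace.append(motor_rpm)
    return trace, False                  # timed out: abort the shift

trace, engaged = synchronize_speed(motor_rpm=3000.0, target_rpm=1800.0)
```

With the chosen gain the speed error decays geometrically, so the clutch is released for engagement only once the residual slip is within the tolerance band.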
-
Item: Model-based design and implementation of schedulers in ARINC-664 end system as a system on chip (Graduate School, 2022)

ARINC-664 is an Ethernet-based deterministic network protocol that provides bounded delay and jitter using redundant communication among avionics applications. Achieving the end-to-end bounded delay objectives requires that incoming Ethernet frames be regulated according to the ARINC-664 standard. In ARINC-664, each rate-constrained flow, i.e., Virtual Link (VL), is regulated by End Systems (ESs) using the Bandwidth Allocation Gap (BAG). Only one regulated VL can be served at a time, so a scheduling mechanism is needed when more than one queue is ready to be served. The ARINC-664 standard does not specify the details of the scheduling algorithm; however, several algorithms have been proposed in the literature for ARINC-664 scheduling. The Field Programmable Gate Array (FPGA) is one of the most preferred implementation choices for ARINC-664 due to its low power consumption, low-latency data transfer, and security advantages. Traditional FPGA development requires building the design and verification with Hardware Description Languages (HDLs). Instead of this time-consuming FPGA development flow, model-based hardware design enables a faster prototyping and testing environment. In this thesis, first, a Single Queue model is designed and developed in Simulink to provide a basic queueing infrastructure for the ARINC-664 ES. Then, the ARINC-664 ES model is developed on top of the Single Queue model. The scheduling algorithms in the ARINC-664 ES are designed and developed using HDL-convertible components. The Smallest BAG (SB), Smallest Size (SS), Longest Queue (LQ), and First-In-First-Out (FIFO) ARINC-664 ES scheduling algorithms are implemented. This implementation allows collecting the mean, standard deviation, and maximum of the jitter performance of each scheduling algorithm.
In addition, an ARINC-664 ES Dynamic Scheduler model whose components can be converted to HDLs and C/C++ is built. This model contains all the scheduling algorithms, and the user can switch among the scheduling algorithms while the model is operating.
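To make the arbitration idea concrete, the following software sketch shows Smallest-BAG (SB) selection among ready VL queues. The thesis implements this in HDL-convertible Simulink blocks; the Python data structures and field names here are illustrative only.

```python
# Hedged sketch of Smallest-BAG (SB) arbitration among ready VL queues.
# Field names and the software representation are illustrative assumptions.

from dataclasses import dataclass, field
from collections import deque

@dataclass
class VirtualLink:
    vl_id: int
    bag_ms: float                         # Bandwidth Allocation Gap for this VL
    queue: deque = field(default_factory=deque)

def smallest_bag_select(links):
    """Return the ready VL with the smallest BAG, or None if all queues are empty."""
    ready = [vl for vl in links if vl.queue]
    return min(ready, key=lambda vl: vl.bag_ms) if ready else None

links = [VirtualLink(1, 8.0, deque([b"frameA"])),
         VirtualLink(2, 2.0, deque([b"frameB"])),
         VirtualLink(3, 4.0, deque())]
winner = smallest_bag_select(links)       # VL 2: smallest BAG among ready VLs
```

The SS, LQ, and FIFO policies differ only in the key used to pick among the ready queues (frame size, queue length, or arrival order respectively).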
-
Item: Model-based AI accelerator design on FPGA with in-depth evaluation of design parameters (Graduate School, 2025-02-11)

Artificial intelligence (AI) has advanced considerably in recent years. However, as models become increasingly complex, traditional hardware, such as GPUs and CPUs, faces limitations in meeting demands for power efficiency, low latency, and energy optimization. FPGAs, with their parallel processing capabilities, low latency, reconfigurable architecture, and reduced power consumption, have gained traction as an alternative platform for deploying AI models. Consequently, research on AI model deployment on FPGAs has grown significantly in recent years. To facilitate AI deployment on FPGAs, frameworks such as Vitis AI have been introduced. This study used the Vitis AI framework to implement three different AI accelerators on the Kria KV260 Vision AI Starter Kit, a system-on-chip (SoC) platform. The first design involved vehicle color recognition using the ResNet-18 CNN model in PyTorch, fine-tuned with the Vehicle Color Recognition (VCoR) dataset. The model was quantized using the Vitis AI quantizer, compiled and optimized for deployment using the Vitis AI compiler, and then integrated onto the FPGA. The Vitis TRD flow, rather than the Vivado TRD, was followed to create the hardware design, simplifying the process and eliminating the need for PetaLinux, which is often more complex to use. Deployment on the FPGA was performed using the PYNQ framework, enabling Python-based model integration without requiring any hardware description languages. The dataset was pre-processed before inference, and real-time performance was achieved by integrating the FPGA with a camera. In the second accelerator, the ResNet-18 model was fine-tuned for pneumonia diagnosis using chest X-ray images.
This design reused the existing hardware, demonstrating the flexibility and adaptability of the model-based design approach for different classification tasks without additional hardware modifications. The third design implemented object detection using the YOLOv3 CNN model, pre-trained on the COCO dataset and obtained in pre-quantized form from the Vitis AI Model Zoo. The model was compiled to fit the existing hardware configuration and deployed on the FPGA. Both pre-processing and post-processing steps were integrated using PYNQ, allowing bounding-box generation and visualization of detected objects. Real-time object detection was achieved by connecting a live camera feed to the system. A key contribution of this work is the adoption of a model-based design approach, which simplifies FPGA deployment by avoiding the need for PetaLinux or low-level hardware design. By following the Vitis TRD flow, a more accessible and user-friendly alternative to the Vivado TRD, this study developed a detailed guide for deploying AI models on FPGAs. Furthermore, the study conducted a comprehensive analysis of DPU configuration parameters and frequency settings, filling a significant gap in the literature. The results provided new insights into performance, resource utilization, power consumption, and energy efficiency, clarifying gaps and addressing potentially misleading conclusions in previous studies. This work contributes to the field by presenting practical, user-friendly methods for designing and analyzing AI accelerators for both classification and object detection tasks. The study highlights the advantages of Vitis AI and model-based FPGA design, providing a guide for future research and development in high-performance, energy-efficient AI deployment on FPGAs.
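The host-side steps around DPU inference can be sketched hardware-free in plain Python: symmetric int8 quantization of a preprocessed image (conceptually what the Vitis AI quantizer does) followed by top-1 selection from the output logits. The `run_dpu` stand-in, the scale value, and the label list are illustrative assumptions, not the actual Vitis AI or pynq_dpu API.

```python
# Hedged, hardware-free sketch of the host-side classification flow.
# run_dpu() is a placeholder for the real DPU runner call on the board.

import numpy as np

def quantize_int8(x, scale):
    """Symmetric per-tensor quantization: float -> int8 with a given scale."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def top1(logits, labels):
    """Map the highest logit to its class label."""
    return labels[int(np.argmax(logits))]

def run_dpu(int8_input):                  # placeholder for the DPU runner
    return np.array([0.1, 2.5, -0.3])     # fake logits for illustration

img = np.random.rand(224, 224, 3).astype(np.float32)   # stand-in camera frame
q = quantize_int8(img, scale=1 / 128)
label = top1(run_dpu(q), ["black", "white", "red"])
```

On the actual board, the placeholder would be replaced by the overlay's runner executing the compiled .xmodel, while the quantization scales come from the compiled model itself.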
-
Item: System-on-chip design with open-source FPGA IP (Graduate School, 2025-03-07)

In recent years, the demand for computing power has increased due to the growing number of Internet of Things (IoT) devices and artificial intelligence applications. Field programmable gate arrays (FPGAs) are frequently used to meet this demand because of their simultaneous computation, reconfigurability, and high bandwidth. However, integrating FPGAs into a system is challenging. Since they use multiple voltage levels and consume considerable power, developing a suitable printed circuit board (PCB) takes time and increases design costs. At the same time, FPGA packages are large and the price per piece is higher than that of many integrated circuits, so the procurement costs of products containing FPGAs are also high. Embedded FPGAs (eFPGAs) aim to solve these problems by integrating FPGA fabrics into system-on-chips (SoCs). eFPGA vendors can produce FPGA fabrics with fewer look-up tables (LUTs), i.e., fabrics with less area and power consumption, to satisfy customer requirements. There are two design methods for embedded FPGAs: hard and soft. Hard eFPGAs are designed at the transistor level, similar to discrete FPGAs, and are specific to a semiconductor manufacturing technology. Soft eFPGAs are generated as RTL code, and since they are independent of the manufacturing technology, their architectures can be easily fine-tuned and manufactured using different technologies. In this study, a system-on-chip with a soft eFPGA intellectual property (IP) block is designed. There are similar studies on this topic, but the aim here is to show that it is possible to design an SoC with open-source tools and designs. In addition to the embedded FPGA IP, the SoC contains the processor, a memory that stores both the program data for the processor and the bitstream files for the FPGA, and two UART elements: one for the processor to use and one for loading data into the memory element.
The Advanced eXtensible Interface 4 (AXI4) protocol of the Advanced Microcontroller Bus Architecture (AMBA) standard provides the on-chip communication. The system is mostly prepared with open-source design tools or taken from open-source projects. The processor is CVA6, previously developed in the PULP Platform group at ETH Zürich and now maintained by the OpenHW Group. It is a 64-bit processor with the open-source RISC-V architecture, supporting the I, M, C, and A extensions. It is a parametric core; optimum performance can be obtained by tuning the parameters that define it. The embedded FPGA IP is generated with an open-source FPGA fabric generator called OpenFPGA. OpenFPGA can produce Verilog RTL for the FPGA, a verification environment, Synopsys timing constraints, and bitstream files suitable for the desired architecture, using the Yosys open-source RTL synthesizer and the Versatile Place-and-Route (VPR) placement and routing tool. The FPGA prepared in this study includes 1960 six-input LUTs, 1960 flip-flops, 50 input-output cells, and a register interface. The logic blocks in the FPGA consist of ten LUTs, ten flip-flops, and local routing multiplexers; this configuration gives the lowest area-delay product. The local routing multiplexers are reduced to 50%, which lowers delays in the logic block. Thus, the critical paths and the area occupied by multiplexers are reduced without damaging the logic block functionality. The switch block uses the Wilton style, which is the best in terms of routability and area, with a flexibility coefficient of 3. Multiplexers are selected for the switches in the FPGA because they occupy a smaller area than tri-state buffers and can be optimized in digital implementation tools. Due to the small size of the routing architecture, L4 segments were used; longer ones were not preferred.
The input-output (IO) blocks use the standard IO cells of the chosen production technology, and vertical and horizontal IO blocks are built from separate cells to remain compatible with the technology's grid. The register interface provides the AXI interface through which the processor and other blocks communicate with the design inside the FPGA. The interface consists of two control and status registers and six 64-bit data registers, so that the status of the eFPGA can be read and a total of 384 bits of data can be transferred to the FPGA simultaneously. For programming the embedded FPGA IP, the memory-bank protocol was selected from the five programming protocols available in OpenFPGA; it occupies less area than the other protocols because it permits a latch-based structure for the configuration memory. Up to three bitstream files can be stored inside the system and read back to reprogram the fabric at runtime. A configuration circuit is designed to program the IP according to the protocol preferred in this study. Each configurable element is controlled through bit line (BL) and word line (WL) signals.
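The memory-bank programming scheme above can be modeled behaviorally: each configuration latch sits at a (word line, bit line) crossing and captures the bit-line value when its word line is asserted. The sketch below is a software model only; the dimensions and method names are illustrative, not those of the thesis's fabric.

```python
# Behavioral model of memory-bank style eFPGA configuration: asserting one
# word line (WL) makes that row of latches capture the bit-line (BL) values.
# Dimensions and names are illustrative assumptions.

class ConfigMemoryBank:
    def __init__(self, n_wl, n_bl):
        self.bits = [[0] * n_bl for _ in range(n_wl)]   # latch array, all cleared

    def program_word(self, wl, bl_values):
        """Assert word line wl; its row of latches captures the bit lines."""
        self.bits[wl] = list(bl_values)

    def bitstream(self):
        """Flatten the latch contents row by row."""
        return [b for row in self.bits for b in row]

bank = ConfigMemoryBank(n_wl=4, n_bl=8)
bank.program_word(0, [1, 0, 1, 1, 0, 0, 1, 0])
```

Programming the whole fabric then amounts to sweeping the word lines while presenting one bitstream row per assertion, which is why the latch-based bank needs no per-bit shift chain.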
-
Item: Implementation of the YOLOv8 convolutional neural network block on FPGA (Graduate School, 2025-01-07)

Convolutional Neural Networks (CNNs) are among the most extensively studied topics, with applications in areas such as computer vision, object detection, and image processing. CNN operations are divided into two main phases: training and inference. Training requires intensive computation, including forward and backward propagation. Forward propagation estimates the output using the current weights, while backward propagation calculates the error between predicted and actual outputs, updating the weights iteratively using gradient descent. Hardware acceleration studies focus on improving either training or inference, given the computational intensity of CNNs, which often require millions of operations for a single output. Hardware platforms for CNNs include CPUs, GPUs, FPGAs, and ASICs. While ASICs offer high speed, their cost and single-purpose nature limit their use. CPUs lack flexibility for custom operations, and GPUs are best suited for tasks with high-density algorithms; for workloads with low-density algorithms, however, GPUs show performance limitations. FPGAs fill this gap by offering a reconfigurable architecture, parallelism, and pipelining, making them versatile for both simple and complex CNN layers. In this study, FPGA-based CNN implementations achieved a maximum frequency of 275 MHz, with most IP blocks operating at 250 MHz. Hard-wired structures were utilized for data storage, outperforming LUT-based approaches in terms of efficiency and temporal performance. BRAM-based FIFO structures with depths ranging from 16 to 512 entries provided substantial improvements, although the benefits plateaued beyond 512 entries. Cascading also minimized delays caused by repeated memory fetches. In this study, parallelization is one of the main drivers of the performance increase.
For example, with 8 parallel operations, the best execution time was 0.14 seconds, which could drop to around 0.07 seconds with 16 parallel operations. This highlights the importance of extensive parallelization, particularly in deeper CNN layers with more channels. Optimal performance requires segmented designs with minimal LUT and Flip-Flop utilization to maximize frequency and avoid internal resource bottlenecks. In conclusion, the study emphasizes parallelization, correct use of hard-wired structures, and cascading as critical strategies to enhance CNN performance on FPGAs while ensuring resource efficiency and scalability.
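The scaling reported above is consistent with ideal channel-level parallelism, where execution time is inversely proportional to the number of parallel lanes. The sketch below is a back-of-the-envelope model using the 8-lane figure from the text; treating the 16-lane time as a pure projection is an assumption, since real designs lose some scaling to memory bandwidth and routing.

```python
# Back-of-the-envelope model: under ideal channel-level parallelism,
# execution time scales inversely with the number of parallel lanes.
# base_time_s = 0.14 s at 8 lanes comes from the text; the 16-lane
# value is the projected ideal, not a measurement.

def ideal_exec_time(base_time_s, base_lanes, lanes):
    """Ideal linear scaling of a channel-parallel CNN layer."""
    return base_time_s * base_lanes / lanes

t16 = ideal_exec_time(base_time_s=0.14, base_lanes=8, lanes=16)
```

In practice the achievable lane count is bounded by LUT/flip-flop utilization and BRAM port contention, which is why the text stresses segmented designs with minimal resource usage.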