Battery state of health estimation based on test data using machine learning and deep learning methods
| dc.contributor.advisor | Çalışkan, Fikret | |
| dc.contributor.author | Arslantaş, Mehmet Ali | |
| dc.contributor.authorID | 518221010 | |
| dc.contributor.department | Mechatronics Engineering | |
| dc.date.accessioned | 2025-10-27T08:30:11Z | |
| dc.date.available | 2025-10-27T08:30:11Z | |
| dc.date.issued | 2025 | |
| dc.description | Thesis (Master's) -- İstanbul Teknik Üniversitesi, Lisansüstü Eğitim Enstitüsü, 2025 | |
| dc.description.abstract | Lithium-ion batteries are used in many fields, from electric vehicles to energy storage systems, and the performance loss they undergo over their service life poses serious risks to system safety and efficiency. Accurate and reliable estimation of battery health is therefore critical to the sustainability of these systems. Conventional methods attempt to calculate capacity from full charge-discharge cycles, but this approach is both time-consuming and shortens cell life. They also disregard the effect of variables such as temperature, internal resistance, and current, which limits their applicability in the field. Against this background, this study aims to estimate battery capacity using data-driven modeling methods. Parameters collected from the battery, such as temperature, current, voltage, internal resistance, state of charge, and cycle count, were used; in preprocessing, the data were reduced through anomaly detection and summarization and made tractable while preserving their statistical meaning. In particular, extracting summary statistics such as charge duration and SOC range substantially reduced the data volume while still allowing meaningful model outputs. Because temporal information is lost, the summarized data led to some performance limitations, particularly for time-series models. Model selection was therefore evaluated together with the data structure. Analyses with different modeling algorithms such as SVR, RF, and LSTM revealed the advantages and limitations of each model. Although the SVR model achieves high accuracy, the interpretability of its decision mechanism remains weak. The RF model stood out for both its accuracy and its ability to explain variable importance, making it the strongest candidate for field deployment. 
The LSTM model, although it produces strong predictions when given time-series information, lost performance when working with summarized, static data; this shows that the model architecture must match the data structure directly. In addition, the applicability of LSTM on the embedded system remains limited by processing-power constraints. Real-time embedded system tests were used to analyze not only the accuracy of the models but also their processing latency, impact on system resources, and prediction times. The RF model emerged as the most suitable structure for the embedded environment thanks to its low computational load and high stability, whereas deep-learning-based models such as LSTM had to be run at a lower frequency, which caused data loss in some time intervals. These results show that systems intended for field use must be designed not only for accuracy but also around factors such as timestamping, data buffering, and hardware compatibility. The final chapter of this study gives a general assessment of the field applicability of the developed approach and its place in the literature. With its generalizability, interpretability, and low resource requirements, the RF model provides a strong foundation for future work. It is noted that the developed system could be made more flexible by testing it with different battery chemistries and environmental conditions, and that models such as LSTM and GRU could run more efficiently on embedded systems if converted into more compact, quantized forms. It is also concluded that the long-term sustainability of the model could be ensured by establishing end-to-end automated data collection, training, and update systems. The path followed throughout the study covers the entire process, from the development of each model to its testing in the field. The second chapter details the data sources, data structure, and preprocessing steps. 
The third chapter describes the structure and advantages of the SVR, RF, and LSTM algorithms and how they were applied in this study. The fourth chapter presents both the statistical performance of the trained models and their real-time test results on the embedded system. The fifth and final chapter compares the models and offers recommendations for future applicability. By bringing out both the theoretical depth and the practical benefit of the work, this structure aims to guide future research with similar goals. | |
| dc.description.abstract | Lithium-ion batteries have become one of the most crucial enablers of the modern energy transition due to their high energy density, extended cycle life, and scalability across various applications. From electric vehicles and portable consumer electronics to renewable energy integration and grid-level storage systems, these batteries serve as a backbone for sustainable power systems. Their efficiency, compactness, and relatively low maintenance requirements have led to widespread deployment across multiple industrial sectors. However, despite their advantages, lithium-ion batteries undergo irreversible degradation processes over time, which reduce their performance, safety, and operational reliability. As battery-powered systems increasingly operate in dynamic and safety-critical contexts, the need for robust methods to monitor and predict battery condition has become paramount. In this regard, the concept of State of Health (SOH) has emerged as a key metric used to quantify battery degradation and determine remaining useful life. SOH is typically defined as the ratio of the current full charge capacity of a battery to its rated nominal capacity. It provides a percentage-based measure of how much of the original storage capability remains after aging effects have taken place. Accurate estimation of SOH is essential for system-level decision-making, such as load balancing, thermal management, charging optimization, and preventive maintenance scheduling. Moreover, from a safety standpoint, early detection of deterioration can help mitigate the risk of thermal runaway, capacity loss, or sudden failure, especially in high-voltage battery packs used in electric vehicles. Unfortunately, the process of monitoring and estimating SOH presents significant challenges, especially under real-world operating conditions where data is noisy, incomplete, and collected under varying environmental and load profiles. 
Traditional SOH estimation methods often rely on full charge-discharge cycles to assess the capacity degradation of a cell or module. While effective under controlled laboratory environments, this approach has multiple drawbacks when applied in practical scenarios. Full cycling is time-consuming, reduces battery availability, and accelerates wear by increasing stress on electrode materials. In addition, real-time systems rarely operate under standardized load profiles that allow such controlled cycling. Moreover, such methods typically fail to account for operational conditions such as varying current rates, ambient and cell temperatures, state-of-charge ranges, and load duty cycles, all of which significantly influence aging mechanisms. As a result, these methods cannot reliably track SOH under actual usage conditions, especially in complex systems with multiple battery modules or large-scale storage arrays. A widely used method in battery management systems (BMS) is the ampere-hour counting technique, which involves integrating current over time to estimate charge throughput and track remaining capacity. While simple and efficient, this method suffers from several limitations. It is highly sensitive to measurement drift, sensor errors, and cumulative integration inaccuracies. Furthermore, it cannot detect internal degradation mechanisms such as loss of active material, SEI growth, lithium plating, or structural changes in electrodes. Consequently, ampere-hour counters tend to diverge over time and fail to accurately reflect the health status of the battery without frequent recalibration. These shortcomings have prompted researchers to explore alternative approaches that leverage operational data and intelligent algorithms to infer battery health indirectly. One promising direction involves the use of data-driven models that employ machine learning algorithms trained on real or simulated battery data. 
These models can learn complex non-linear relationships between input features and target variables, enabling them to estimate SOH using features derived from current, voltage, temperature, internal resistance, and other signals readily available from the BMS. In this study, a data-driven SOH estimation framework was developed using multiple model types. The experimental dataset used in this research was collected from lithium-ion battery packs, each comprising 108 cells. Two identical packs were used: one for data collection and model training, and the other for independent testing. The tests spanned the entire state-of-charge range from 0 to 100 percent and covered a wide temperature window from -15 to 45 degrees Celsius. No variation in battery chemistry was introduced to ensure that model evaluations were not confounded by differences in material behavior. The raw data collected consisted of high-frequency time-series measurements of voltage, current, temperature, resistance, and cycle number. To enable efficient model training and minimize computational overhead, a data preprocessing pipeline was applied. The pipeline included anomaly detection to eliminate erroneous samples and summarization techniques to compress time-series segments into statistical features. These features included mean and standard deviation of signals, SOC range, charge duration, and resistance variation. Importantly, the summarization was performed over the 30 to 60 percent SOC interval, as this region is known to exhibit more linear voltage behavior and is less influenced by hysteresis. While summarization significantly reduces the data volume and enables efficient model training, it also results in the loss of temporal structure, which affects models that rely on sequential data. Three primary machine learning models were selected for training and evaluation: Support Vector Regression (SVR), Random Forest (RF), and Long Short-Term Memory (LSTM) neural networks. 
SVR showed high predictive performance on summarized features but suffered from limited interpretability and poor generalization to data outside the training domain. RF provided strong accuracy, fast inference, and the ability to quantify feature importance, making it highly suitable for embedded deployment. The LSTM model, designed for time-series tasks, underperformed when applied to summarized inputs, as the lack of sequential context limited its capacity to capture temporal dependencies. Moreover, LSTM models imposed substantial computational requirements, rendering them unsuitable for real-time execution on resource-constrained embedded systems. Real-time evaluation of the models was carried out on an Infineon-based embedded BMS. Among the three, only the Random Forest model successfully maintained low latency and real-time operation while preserving predictive performance. It seamlessly integrated into the existing BMS architecture and operated across a wide range of conditions without requiring specialized hardware. In contrast, LSTM and SVR models required lower execution frequencies to avoid overloading the processor, leading to synchronization issues and delayed outputs, which are unacceptable in safety-critical systems. In addition to the core models, further experimentation was conducted using a simulation-based battery model to evaluate three additional methods: linear regression, Multilayer Perceptron (MLP), and Gated Recurrent Unit (GRU) networks. Linear regression was included as a baseline due to its simplicity, but it failed to capture non-linear aging behaviors and yielded the lowest performance. MLP, a feedforward neural network consisting of multiple hidden layers, performed moderately well and was able to model non-linear interactions between input features and capacity. However, it lacked robustness under variable conditions and required significant hyperparameter tuning. 
GRU, a simplified variant of LSTM, retained sequential modeling capabilities while reducing computational load. It performed better than MLP and linear regression in simulation and exhibited a favorable trade-off between accuracy and complexity. Nevertheless, since these models were not deployed on physical hardware, their performance under real-time conditions remains to be verified. Beyond modeling, a deeper understanding of battery degradation mechanisms is essential to contextualize SOH estimation. Degradation processes are typically categorized into calendar aging and cycle aging. Calendar aging arises from chemical reactions that occur during storage, such as SEI layer growth and electrolyte decomposition, even when the battery is not in use. Cycle aging, on the other hand, results from repeated charging and discharging, leading to lithium plating, active material loss, and microstructural damage to electrodes. These changes reduce the number of cyclable lithium ions and increase impedance, leading to capacity fade and voltage instability. Such degradation is influenced by several factors, including C-rate, temperature, depth of discharge, and overcharge or over-discharge events. Importantly, these mechanisms do not produce direct and easily measurable changes, making capacity estimation a complex inferential task. During each charge-discharge cycle, lithium ions shuttle between the cathode and anode. Over time, the efficiency of this transfer is reduced due to irreversible side reactions. Electrolyte oxidation, SEI thickening, gas generation, and particle isolation further impair the mobility and storage capability of ions. These internal changes are not directly visible in current or voltage readings, which is why advanced analytics are needed to extract degradation signals from indirect indicators such as resistance rise, capacity delay, or SOC hysteresis. 
This study demonstrates that machine learning models trained on summarized data can successfully estimate SOH, provided that appropriate features are selected and model architecture is matched to the data structure. Among all models tested, Random Forest proved to be the most effective in balancing accuracy, interpretability, and deployment feasibility. While LSTM and GRU remain promising for raw time-series inputs, their performance is limited when data is summarized or when embedded constraints exist. Future research directions include the exploration of hybrid models that combine physics-based modeling with data-driven learning, deployment of quantized neural networks for embedded inference, and implementation of adaptive pipelines that periodically retrain models using updated operational data. Expanding the dataset to include different battery chemistries and cell formats would also enhance generalizability. Ultimately, robust SOH estimation under embedded constraints is essential to enable the next generation of intelligent, autonomous, and safe battery systems for a wide range of applications. To further support practical deployment in safety-critical systems such as electric vehicles and grid-scale storage, explainability and reliability of SOH models must also be addressed. While Random Forest models offer a degree of transparency through feature importance rankings, integrating model uncertainty estimation and anomaly detection mechanisms can enhance trust in predictions. This is especially critical when models encounter out-of-distribution data or operate under rare conditions not captured during training. Embedding self-checking logic and fallback strategies, such as reverting to conservative thresholds or physics-based approximations when confidence is low, can improve robustness and operational safety. 
As regulatory bodies and industry standards evolve to mandate more rigorous battery diagnostics and safety protocols, the development of certifiable, explainable, and adaptive SOH estimation frameworks will become increasingly vital for commercial adoption and long-term system reliability. | |
| dc.description.degree | Master's | |
| dc.identifier.uri | http://hdl.handle.net/11527/27818 | |
| dc.language.iso | tr | |
| dc.publisher | İTÜ Lisansüstü Eğitim Enstitüsü | |
| dc.sdg.type | Goal 9: Industry, Innovation and Infrastructure | |
| dc.subject | mekatronik mühendisliği | |
| dc.subject | mechatronics engineering | |
| dc.subject | derin öğrenme | |
| dc.subject | deep learning | |
| dc.subject | elektrikli araçlar | |
| dc.subject | electric vehicles | |
| dc.subject | batarya yönetim sistemleri | |
| dc.subject | battery management systems | |
| dc.title | Test verilerine dayalı, makine öğrenmesi ve derin öğrenme yöntemleri ile batarya sağlık durumu tahmini | |
| dc.title.alternative | Battery state of health estimation based on test data using machine learning and deep learning methods | |
| dc.type | Master Thesis |
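
The abstract defines SOH as the ratio of current full-charge capacity to rated nominal capacity, describes ampere-hour counting as integrating current over time, and mentions summarizing time-series segments over the 30 to 60 percent SOC interval into statistical features. The following is a minimal Python sketch of those three ideas, not the thesis implementation. The function names, the sample dictionary layout, the exact feature set, the trapezoidal integration rule, and the convention that positive current means charging are all assumptions made for illustration; the numbers are synthetic.

```python
from statistics import mean, pstdev


def soh_percent(current_capacity_ah, nominal_capacity_ah):
    """SOH as current full-charge capacity over rated nominal capacity, in percent."""
    if nominal_capacity_ah <= 0:
        raise ValueError("nominal capacity must be positive")
    return 100.0 * current_capacity_ah / nominal_capacity_ah


def ampere_hours(times_s, currents_a):
    """Ampere-hour counting: trapezoidal integration of current (A) over time (s)."""
    total_amp_seconds = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total_amp_seconds += 0.5 * (currents_a[i] + currents_a[i - 1]) * dt
    return total_amp_seconds / 3600.0  # convert ampere-seconds to ampere-hours


def summarize_soc_window(samples, soc_lo=30.0, soc_hi=60.0):
    """Compress the samples inside an SOC window into summary features.

    `samples` is a hypothetical time-ordered list of dicts with keys
    t (s), soc (%), current (A), voltage (V), temperature (degC), resistance (ohm).
    Returns None when fewer than two samples fall inside the window.
    """
    window = [s for s in samples if soc_lo <= s["soc"] <= soc_hi]
    if len(window) < 2:
        return None
    volts = [s["voltage"] for s in window]
    temps = [s["temperature"] for s in window]
    resistances = [s["resistance"] for s in window]
    return {
        "voltage_mean": mean(volts),
        "voltage_std": pstdev(volts),
        "temperature_mean": mean(temps),
        "temperature_std": pstdev(temps),
        "resistance_variation": max(resistances) - min(resistances),
        "soc_range": window[-1]["soc"] - window[0]["soc"],
        "charge_duration_s": window[-1]["t"] - window[0]["t"],
        "charge_ah": ampere_hours(
            [s["t"] for s in window], [s["current"] for s in window]
        ),
    }


# Synthetic constant-current charge, one sample every 360 s (illustrative only).
samples = [
    {
        "t": 360.0 * i,
        "soc": 10.0 * i,
        "current": 10.0,
        "voltage": 3.0 + 0.1 * i,
        "temperature": 25.0,
        "resistance": 0.020 + 0.0001 * i,
    }
    for i in range(11)
]
features = summarize_soc_window(samples)
```

A feature dictionary like this, one per charge segment, is the kind of fixed-length input a regressor such as SVR or RF can consume. It also shows why sequence models lose information under summarization: the ordering of samples inside the window is no longer present in the features.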