LEE- Bilgisayar Mühendisliği-Yüksek Lisans
Browsing LEE- Bilgisayar Mühendisliği-Yüksek Lisans by publication type "Master Thesis"
-
A variational graph autoencoder for manipulation action recognition and prediction (Graduate School, 2022-06-23) Akyol, Gamze ; Sarıel, Sanem ; Aksoy, Eren Erdal ; 504181561 ; Computer Engineering
Despite decades of research, understanding human manipulation actions has always been one of the most appealing and demanding study problems in computer vision and robotics. Recognition and prediction of observed human manipulation activities have their roots in, for instance, human-robot interaction and robot learning from demonstration applications. The current research trend heavily relies on advanced convolutional neural networks to process structured Euclidean data, such as RGB camera images. However, in order to process high-dimensional raw input, these networks must be immensely computationally complex, so training them requires a huge amount of time and data. Unlike previous research, in this thesis a deep graph autoencoder is used to simultaneously learn recognition and prediction of manipulation tasks from symbolic scene graphs rather than from structured Euclidean data. The deep graph autoencoder model developed in this thesis requires less time and data for training. The network features a two-branch variational autoencoder structure, one branch for recognizing the input graph type and the other for predicting future graphs. The proposed network takes as input a set of semantic graphs that represent the spatial relationships between subjects and objects in a scene. The reason for using scene graphs is their flexible structure and their capability to model the environment. The network produces a label set reflecting the detected and predicted class types. Two separate datasets, MANIAC and MSRC-9, are used for the experiments. The MANIAC dataset consists of 8 different manipulation action classes (e.g., pushing, stirring) from 15 different demonstrations. MSRC-9 consists of 9 different hand-crafted classes (e.g., cow, bike) over 240 real-world images. Two such distinct datasets are used to measure the generalizability of the proposed network. On these datasets, the proposed model is compared to various state-of-the-art methods, and it is shown that the proposed model can achieve higher performance. The source code is released at https://github.com/gamzeakyol/GNet.
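A minimal plain-PyTorch sketch of the two-branch idea summarized in this abstract: a shared graph encoder yields a latent code, one branch classifies the manipulation action of the input scene graph, and the other predicts the next graph's adjacency. The dense-adjacency propagation step, layer sizes, and graph readout are illustrative assumptions, not the thesis architecture (see the released source for that).

```python
# Illustrative two-branch variational graph autoencoder, assumed dimensions.
import torch
from torch import nn

class TwoBranchVGAE(nn.Module):
    def __init__(self, n_nodes=8, feat_dim=16, hid=32, latent=16, n_classes=8):
        super().__init__()
        self.w1 = nn.Linear(feat_dim, hid)                   # simple GCN-style layer
        self.mu = nn.Linear(hid, latent)
        self.logvar = nn.Linear(hid, latent)
        self.classifier = nn.Linear(latent, n_classes)       # recognition branch
        self.decoder = nn.Linear(latent, n_nodes * n_nodes)  # prediction branch

    def forward(self, adj, x):
        # One propagation step: aggregate neighbor features, then transform.
        h = torch.relu(self.w1(adj @ x))
        h = h.mean(dim=0)                                    # graph-level readout
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        action_logits = self.classifier(z)
        next_adj = torch.sigmoid(self.decoder(z)).view(adj.shape)
        return action_logits, next_adj, mu, logvar

adj = torch.eye(8)          # placeholder scene-graph adjacency
x = torch.randn(8, 16)      # placeholder node (object) features
model = TwoBranchVGAE()
logits, pred_adj, mu, logvar = model(adj, x)
print(logits.shape, pred_adj.shape)  # torch.Size([8]) torch.Size([8, 8])
```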
-
Ağ iletişimlerinde temel yenilikçi çözümlerin standartlaştırılması [Standardization of fundamental innovative solutions in network communications] (Graduate School, 2023-08-30) Kalkan, Muhammed Salih ; Seçinti, Gökhan ; 504191579 ; Computer Engineering
Problems in network communication date back a long way, and many studies have been carried out to solve them. This work produced the layered communication structure we now call the OSI model. One of these layers is the application layer; messaging-related problems belong to this layer, so messaging-related features are implemented there. To standardize some messaging features, application-layer protocols such as AMQP and MQTT have been created. In this research, fundamental innovative solutions are likewise evaluated at the application layer. Applications solve messaging problems in different ways: some features are provided in application code, some through libraries, and some by being standardized in protocols. Messaging features added to application code must be rewritten for every application, which wastes labor, increases the likelihood of errors, and makes the code more complex each time. Solving messaging problems with library code requires sharing that library with all other endpoints. Messaging features should therefore be standardized in a protocol. This study aims to standardize fundamental innovative features for use in local networks and the IoT, saving labor, reducing application-code complexity, and sharing the solutions across all endpoints. Creating a protocol standard requires background knowledge of protocol features, so binary versus text protocols, communication models, and centralized versus decentralized approaches are examined first. Binary protocols transmit data in binary form, while text protocols transmit data as Unicode or ASCII. Binary protocols perform better because they transmit data in smaller sizes; text protocols transmit larger payloads but are easier to debug, and their data is human-readable. To combine high performance with readable data, a monitoring endpoint must know the textual counterparts of the binary data. Byte order (endianness) also matters for binary protocols but not for text protocols: a device may be little-endian or big-endian, and in binary protocols, when two devices with different endianness communicate, the byte order of the data must be reversed before serialization and before deserialization. If the solutions to these problems are standardized at the application layer, developers no longer need to solve them again and again. The client-server model is one in which multiple client endpoints request service from a single server endpoint. The publish-subscribe model lets publisher and subscriber endpoints exchange messages through a central message-oriented middleware: endpoints subscribe to topics or publish messages, and the message broker forwards published messages to the endpoints subscribed to them.
The message broker provides loose coupling and flexibility: endpoints keep messaging regardless of whether the others are present, and transformers and filters can run on the broker. Loose coupling is also a disadvantage, because publisher endpoints cannot be sure whether subscriber endpoints are communicating. As the numbers of publishers and subscribers grow, they can overload the broker; being central, the broker can become a bottleneck, which limits horizontal scalability. Routing messages through the broker instead of sending them directly to the target endpoints also increases latency. A decentralized publish-subscribe model is needed to avoid these broker-related problems. One of the biggest problems for messaging endpoints is that the message structures on one endpoint may be out of date or incorrectly implemented, and existing messaging protocols have no standard way of checking that the message structures of the endpoints on a connection are compatible. Monitoring the messages sent and received in a communication can also be critical, yet when messaging is point-to-point, a third remote monitoring endpoint cannot join the communication, because the destination IP address in the IP packet header must belong to a single device; this problem can be solved by forwarding to third remote endpoints at the application layer. Many application-layer protocols also depend on a particular transport-layer protocol, which can constrain future use. Suppose, for example, that QUIC replaces TCP and TCP implementations disappear: dozens of TCP-based protocols would then need new QUIC-based versions. Being abstracted from the underlying protocols is therefore important for future use. Using multiple protocols normally requires multiple communication interfaces, but if a protocol can run over multiple lower-layer protocols, a single communication interface suffices. In this study, data were collected on how well existing protocols solve these problems, and a table was built matching existing protocols against the features that solve them. Since other application-layer protocols do not support all of these features, a new protocol providing them is needed. This protocol is called the Messaging Control Protocol (MCP). MCP mainly targets local-network communication and concentrates on features usable in local networks, asynchronous communication, non-stateless communication, and embedded systems. To make MCP independent of lower-layer protocols and usable over several of them, MCP has two components: the MCP Adapter and the communication interface. The MCP Adapter provides MCP's preconditions, and the communication interface exposes the functions of the underlying protocols; together they make MCP independent of, and usable with, multiple lower-layer protocols. MCP has two message classes: MCP Standard Messages and MCP Application Messages. MCP Standard Messages are built-in messages independent of application code, and there are five types: Handshake, Heartbeat, Role Application, Subscribe, and Unsubscribe. Clients send the structures of their user-defined messages in JSON format with the handshake request message, so the message compatibility of the endpoints can be checked.
The server sends its endianness type in the handshake response message, so the client learns the server's endianness; if their endianness types differ, the client automatically swaps the byte order of the data. Heartbeat messages are sent periodically to detect whether the connection is alive. A client uses the Subscribe and Unsubscribe messages to subscribe to or unsubscribe from a message. MCP Application Messages are messages defined in application code, and there are four types: Request-Response, Event, Startup, and Report. For request-response messages, the corresponding response message is generated only when the related request message is received. Event messages are sent when an event is triggered and are delivered to all connected subscriber clients. The startup message is essentially an event message triggered when the connection is established, and the report message is an event message triggered on a schedule. Role-based access control is used for authorization: clients have roles on an MCP connection, and these roles determine which messages in the messaging interface they can access. The server decides which client roles may access each message, while a client with the admin role assigns roles to clients. Clients that want to monitor messages in point-to-point communication hold the monitor role, which is independent of message accessibility; all messages in the point-to-point communication are forwarded to the monitor client. The monitoring client sends a connection request to join the communication, receives the message structures via the handshake message during connection setup, and thus learns the textual counterparts of the binary data, so the data can be displayed as text even though it is transmitted in binary form. Built at the application layer, MCP standardizes the solutions to messaging problems by placing them in protocol code. Other application-layer protocols cannot solve all of the problems MCP solves, and this is what sets MCP apart. With MCP, the solutions described in this study no longer need to live in application code, which reduces application-code complexity and eliminates errors that could arise in messaging features. MCP does not only offer many messaging features; it also values performance: it uses a dynamic header size and is a binary protocol. Because MCP focuses on fundamental messaging problems and cares about performance, it can be applied to the IoT as well as local networks, and future analyses can explore its use in the IoT. In conclusion, MCP provides innovative fundamental messaging features, standardizes them to reduce the likelihood of errors, and reduces the complexity of application code.
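A minimal sketch of two of the MCP-style standard-message mechanisms described above, using only the Python standard library: exchanging user-defined message structures as JSON in the handshake and packing fields in the peer's byte order after the handshake response. The function names and message format are illustrative assumptions, not the MCP specification.

```python
import json
import struct

def handshake_request(message_structs: dict) -> bytes:
    """Client side: send the user-defined message structures as JSON."""
    return json.dumps({"type": "handshake", "messages": message_structs}).encode()

def check_compatibility(server_structs: dict, client_structs: dict) -> bool:
    """Server side: endpoints are compatible only if the structures match."""
    return server_structs == client_structs

def pack_field(value: int, peer_endianness: str) -> bytes:
    """Serialize a 32-bit field in the peer's byte order, as the abstract
    describes the client doing automatically after the handshake response."""
    fmt = "<i" if peer_endianness == "little" else ">i"
    return struct.pack(fmt, value)

# Example: structures match, so the connection proceeds and fields are packed
# in whichever byte order the server announced.
structs = {"Report": {"temperature": "int32"}}
assert check_compatibility(structs, json.loads(handshake_request(structs))["messages"])
print(pack_field(42, "big").hex())     # 0000002a
print(pack_field(42, "little").hex())  # 2a000000
```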
-
Akademik hukuk makalelerinde atıf önerisi [Citation recommendation in academic legal articles] (Graduate School, 2023-06-22) Arslan, Doğukan ; Eryiğit, Gülşen ; 504201515 ; Computer Engineering
At the intersection of law and natural language processing lie "Legal NLP" studies, which focus on understanding, processing, interpreting, and generating legal text and address various subtasks over different types of legal documents. One of these is the citation recommendation task, which covers identifying potential citations for a given text in scientific articles. Work on this task, however, generally neglects problems such as datasets not being sufficiently comprehensive across fields and being unevenly distributed among them. A recent study addressed these issues and built a new dataset covering different fields, yet some core fields, such as law, are still left out of such work. As a result, even in subtasks such as citation recommendation, language models trained on large datasets can show field-specific shortcomings. In the context of legal NLP, citation recommendation mostly aims to retrieve citations that justify existing arguments from non-academic legal texts such as court decisions. Legal systems can be divided into two main categories, common law and civil (Continental European) law. In common-law countries, outcomes are determined by examining past cases, so decisions contain many citations to other decisions, whereas in civil-law countries decision making relies more on factual evidence and the relevant statutory articles, which leads decisions to cite laws and regulations more heavily. In both systems it is important for legal practitioners to find precedent decisions, but the process can be time-consuming. In Turkey, more than 7 million decisions published by the Court of Cassation (Yargıtay) exist, and lawyers spend a significant amount of time searching for relevant case law. Despite the current importance and benefits of the legal citation recommendation task, academic legal texts have not received the attention they deserve and have been left outside its scope. Yet labeled data obtained by automatically extracting citation information from scientific articles could become an important resource for legal NLP tasks, where creating annotated data is costly. This approach could be effective not only for citation recommendation but also for tasks such as precedent retrieval, legal document similarity, and legal judgment prediction. In this way, academic legal texts could be used more efficiently to build better-performing language models. Moreover, legal texts have linguistic characteristics that differ from other scientific fields and require special attention. Legal citation recommendation, which diverges from the traditional citation recommendation task, needs language models that can understand these characteristics and provide effective citation suggestions. With the rapid expansion of scientific publishing, concerns about the reliability and quality of citations have arisen, and the citation recommendation task has gained importance over time. Methods such as collaborative filtering, graph-based filtering, and content-based filtering are used for this task, and many text types, from news to patents and court decisions, have been used in it.
Depending on the scope of the recommendation, the task is also usually divided into two main categories, local and global. Various datasets of academic papers have been used to develop and test citation recommendation techniques. Citation recommendation methods are being adapted to the legal domain to retrieve non-academic legal sources (court decisions, statutes, cited laws, etc.); these adaptations go under the name of legal citation recommendation. For this thesis, a dataset of academic legal articles was collected for the legal citation recommendation task. The dataset was used in four experimental setups testing seven models in total, models that either perform well on citation recommendation and related tasks or were trained on legal corpora. The experiments covered using pre-trained models directly, fine-tuning them, and re-ranking articles retrieved with BM25. The adopted two-stage approach aims to reduce the computational burden of language models by using faster but less accurate models such as BM25 to quickly select candidate articles; this approach is frequently used in information retrieval to improve system efficiency. In the first stage, fast models retrieve candidate documents, which are then re-ranked by slower but more accurate models. For the English legal citation recommendation task, articles were downloaded from LawArXiv, a database of 1,366 scientific legal articles covering a variety of legal topics. Google Scholar was used to obtain the sources cited by the articles, yielding more than 10 thousand citations. The abstracts of the obtained articles were extracted with the Python package pdfplumber, and the English articles whose abstracts were extracted successfully were selected. The articles were cleaned with preprocessing steps and their abstracts extracted. The experiments used a dataset of 719 LawArXiv articles, 8,887 cited works, and 10,111 citation links. In line with similar content-based global citation recommendation studies, the article abstracts were used as input in the fine-tuning, representation-extraction, and test stages. The data were split into 70% for training and 30% for testing. A triplet loss was used in the fine-tuning stage; it compares a reference (anchor) input with a positive (similar) input and with a negative input that does not match the anchor. After fine-tuning and representation extraction, the document representation vectors were ranked by their similarity in vector space. The Sentence-Transformers framework was used for all training and testing. Results are reported with three metrics widely used in information retrieval: Mean Average Precision (MAP), Recall, and Mean Reciprocal Rank (MRR). Considering that an article has 14 citation links on average, these metrics are reported over the top 10 retrieved documents (n=10).
A comparison of various pre-trained models with a BM25 model fit on the compiled dataset showed that SciBERT had the lowest performance among the models, that models trained on legal corpora such as Law2Vec and LegalBERT failed at the citation recommendation task, and that SGPT performed better than SPECTER and SciBERT, while BM25 stood out as the most successful model. These results are also consistent with field-based citation recommendation studies in the literature. When the pre-trained models were fine-tuned, they generally performed similarly but still could not surpass BM25. However, the fine-tuned LegalBERT improved substantially, suggesting that combining familiarity with the task and domain knowledge boosts performance. SciNCL and SciBERT stand out among the most successful models, and the jump in SciBERT's performance is notable. The experiments that combine the ranking ability of the pre-trained models with BM25's retrieval capacity show that the pre-trained models could not improve on BM25's performance, although SciNCL was clearly the most successful of them. When the articles retrieved with BM25 were re-ranked with the fine-tuned models, all fine-tuned models improved BM25's performance, and SciNCL, consistent with the other experiments, was the most successful model (0.30 MAP@10). In this study, an English legal citation recommendation dataset was built, and models successful at citation recommendation were compared against domain-trained models; a two-stage retrieval method was also used. The results support the stated hypotheses: for language models to succeed at legal citation recommendation, academic legal articles must be included, and likewise, for models trained on legal documents to be more comprehensive, academic legal articles must be part of the training data. The two-stage retrieval method combines the strengths of large language models and BM25 and improves overall performance; using BM25 together with SciNCL gives the best results on the legal citation recommendation task. Two-stage retrieval is an important direction for future work, as is applying the resulting legal citation recommendation model to other legal tasks and testing its performance. Various efforts could also grow the dataset; in particular, as the dataset grows, BM25's speed and performance can be evaluated better.
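A rough sketch of the two-stage retrieval pipeline described above, assuming the rank_bm25 and sentence-transformers packages: BM25 quickly selects candidate abstracts, and a dense model re-ranks them by cosine similarity. The corpus, query, and model name are illustrative, and the thesis's triplet-loss fine-tuning of models such as SciNCL is omitted.

```python
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

corpus = [
    "Precedent retrieval in common law systems",
    "Statutory interpretation and civil law citations",
    "Graph-based citation recommendation for scientific papers",
]
query = "recommending citations for legal scholarship"

# Stage 1: fast lexical retrieval with BM25.
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
bm25_scores = bm25.get_scores(query.lower().split())
candidates = sorted(range(len(corpus)), key=lambda i: bm25_scores[i], reverse=True)[:2]

# Stage 2: re-rank the candidates with dense sentence embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder, not a fine-tuned model
query_emb = model.encode(query, convert_to_tensor=True)
cand_embs = model.encode([corpus[i] for i in candidates], convert_to_tensor=True)
sims = util.cos_sim(query_emb, cand_embs)[0]
reranked = [candidates[int(i)] for i in sims.argsort(descending=True)]
print([corpus[i] for i in reranked])
```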
-
An online network intrusion detection system for DDoS attacks with IoT botnet (Graduate School, 2022-05-23) Aydın, Erim ; Bahtiyar, Şerif ; 504181513 ; Computer Engineering
The necessity for reliable and rapid intrusion detection systems that identify distributed denial-of-service (DDoS) attacks carried out with IoT botnets has become more evident as the IoT environment expands. Many network intrusion detection systems (NIDS) built on deep learning algorithms that provide accurate detection have been designed to address this demand. However, since most of the developed NIDSs depend on network traffic flow features rather than incoming packet features, they may be incapable of providing an online solution. On the other hand, online and real-time systems either do not utilize the temporal characteristics of network traffic at all, or employ recurrent deep learning models (RNN, LSTM, etc.) that remember time-based characteristics of the traffic only in the short term. This thesis presents a network intrusion detection system built on the CNN algorithm that can work online and makes use of both the spatial and temporal characteristics of the network data. Two memories are added to the system: with the first, the system can keep track of the characteristics of previous traffic data over a longer period, and with the second, which keeps the previously classified traffic flow information, it can avoid examining every packet with the time-consuming deep learning model, reducing intrusion detection time. It has been seen that the suggested system is capable of detecting malicious traffic coming from IoT botnets in a timely and accurate manner.
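A small sketch of the second memory described in the abstract: a cache of already-classified flows keyed by the connection 5-tuple, so packets of a known flow reuse the earlier verdict instead of passing through the CNN. The classify_with_cnn stub and the packet fields are illustrative assumptions.

```python
from typing import Dict, Tuple

FlowKey = Tuple[str, str, int, int, str]  # (src_ip, dst_ip, src_port, dst_port, proto)

flow_cache: Dict[FlowKey, str] = {}

def classify_with_cnn(packet: dict) -> str:
    """Stand-in for the time-consuming deep model."""
    return "ddos" if packet.get("syn_flood") else "benign"

def classify_packet(packet: dict) -> str:
    key: FlowKey = (packet["src_ip"], packet["dst_ip"],
                    packet["src_port"], packet["dst_port"], packet["proto"])
    if key in flow_cache:                # flow seen before: reuse the verdict
        return flow_cache[key]
    verdict = classify_with_cnn(packet)  # new flow: pay the CNN cost once
    flow_cache[key] = verdict
    return verdict

pkt = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "src_port": 40000,
       "dst_port": 80, "proto": "tcp", "syn_flood": True}
print(classify_packet(pkt), classify_packet(pkt))  # second call hits the cache
```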
-
Anthropometric measurements from images (Graduate School, 2023-07-18) Ertürk, Rumeysa Aslıhan ; Kamaşak, Mustafa Ersel ; Külekci, Oğuzhan ; 504201535 ; Computer Engineering
In this work, a system that simultaneously estimates several anthropometric measurements (namely, height and the circumferences of the bust, waist, and hip) using only two 2D images of a human subject has been proposed and tested. The proposed system has two components: a customized camera setup with four laser pointers and image analysis software. The camera setup includes an Android smartphone, four laser pointers around the smartphone's camera, and a tripod carrying the hardware. The image analysis software is a web-based application that is not publicly available. The application takes the images as input, processes them, and yields the aforementioned anthropometric measurements in centimeters. The pipeline of the proposed system has the following components: 1) feeding the images to the software; 2) determining the locations of the body parts that will be measured; 3) calculating the width of each body part at the specific location in both images (anterior and lateral); 4) transforming pixel widths into physical units; and 5) estimating the circumference of the body part (or the height). To determine the locations of the body parts that will be measured, the software applies pre-trained pose estimation and body segmentation models to both input images. For pose estimation, the MediaPipe framework, a tool developed for constructing pipelines based on sensory data, has been used. For body segmentation, BodyPix 2.0 in TensorFlow, a powerful tool that can perform whole-body segmentation on humans in real time, has been adopted. With the help of these models, the body parts to be measured have been located in the input images. The width of a body part is measured as the largest distance between the left and right sides of the specific body part in the image. The laser points attached to the camera are leveraged while transforming pixel widths into physical units (i.e., centimeters). The last step is converting the width into a circumference. It is assumed that the cross-sectional areas of the body parts considered in this research, namely the bust, waist, and hip, are elliptical, and that their circumferences correspond to the perimeters of these ellipses. With the axes of the ellipses in hand, it is possible to estimate these anthropometric measurements. To evaluate the performance of the model, experiments were done on 19 volunteer human subjects whose actual measurements were collected with traditional manual methods. The results obtained from the proposed model were compared with the actual measurements of the subjects, and the relative percentage errors were evaluated. The proposed hardware is an improved version of the prototype that was designed to assess the validity of the idea, and the experiments described in this work also include the previous version of the camera setup for better analysis and comparison. During the image collection stage, the subjects were photographed with both versions of the camera setup, and the images were processed with software calibrated for each camera setup. Finally, the collected images were fed to a commercially available system that creates 3D meshes of humans from 2D images.
This product can estimate body measurements from these meshes; to compare the proposed system with a commercial product, this tool was included in the experiments. The images collected from the subjects who participated in the experiment were processed with the three systems mentioned earlier: the initial prototype, the improved version, and the commercially available tool. The results show that the initial prototype's relative errors for the bust, waist, and hip circumferences and height are 7.32%, 9.7%, 7.12%, and 5.0%, respectively. For the improved version, the errors become 15.97%, 9.92%, 2.01%, and 4.43%. The commercial product included in the study has relative errors of 7.8%, 10.69%, 12.43%, and 3.33% for the aforementioned body measurements. The main advantage of the proposed system over the alternative automatic methods is that, unlike state-of-the-art measuring techniques, our method does not require predefined environmental conditions such as a specific background, a predetermined distance from the camera, or clothing constraints. The lack of these restrictions makes the proposed system adaptable to various conditions, such as indoor and outdoor environments. The target user profile for this application would be medical practitioners, personal trainers, and individuals who want to keep track of their weight-loss progress, since the system is lightweight, easy to use, and adaptable to various environments.
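A worked sketch of the final measurement step under the abstract's elliptical cross-section assumption: the anterior-view width and lateral-view depth act as the two ellipse axes, and Ramanujan's approximation gives the perimeter. The pixel counts and the laser-derived centimeters-per-pixel scale are illustrative values.

```python
import math

def pixels_to_cm(width_px: float, cm_per_px: float) -> float:
    """Convert an image-space width to centimeters using the laser-derived scale."""
    return width_px * cm_per_px

def ellipse_circumference(a_cm: float, b_cm: float) -> float:
    """Ramanujan's approximation for an ellipse with axes a_cm and b_cm."""
    a, b = a_cm / 2.0, b_cm / 2.0
    h = ((a - b) ** 2) / ((a + b) ** 2)
    return math.pi * (a + b) * (1 + 3 * h / (10 + math.sqrt(4 - 3 * h)))

# Example: the waist looks 300 px wide from the front and 220 px deep from the
# side, with an assumed scale of 0.1 cm per pixel.
waist_width = pixels_to_cm(300, 0.1)  # 30 cm axis from the anterior view
waist_depth = pixels_to_cm(220, 0.1)  # 22 cm axis from the lateral view
print(round(ellipse_circumference(waist_width, waist_depth), 1))  # ~82.2 cm
```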
-
Crowd density map estimation system from aerial images (Graduate School, 2023-07-31) Çetinkaya, Osman Tarık ; Ekenel, Hazım Kemal ; 504201559 ; Computer Engineering
Today, urbanization, which has emerged from people's choice or need to live in cities, is a social and economic transformation. In recent times, the notion of a "smart city" has gained significant popularity due to its ability to incorporate various elements like sustainability, livability, quality of life, competition, branding, governance, participation, social welfare, and digitalization, thereby contributing to the advancement of urban development. Cities of varying sizes across different regions of the globe have been formulating smart city strategies for many years. Making a city "smart" emerges as a strategy to alleviate the problems caused by urban population growth and rapid urbanization. A good example is providing a smart response to the increasing traffic density in a big city through detailed analyses, for instance an automatic system that directs newly arriving vehicles to empty spaces according to the total capacity of the parking lots and does not allow new vehicles to enter when the capacity is full. The earthquake that took place in Kahramanmaraş, Turkey, in February 2023 showed that a system that can automatically detect the places where earthquake victims are concentrated has already become a necessity. In any such natural disaster, it is very important to be able to quickly identify groups of people in the affected regions and provide support with the help of drones. Military use cases can be mentioned as another application area for crowd counting. Today, it is very important for unmanned vehicles developed for military purposes to process the images in videos or photographs and continue their duty within the framework of an algorithm. In situations such as smuggling activities at the borders or illegal immigration, being able to detect people and crowds in images taken from UAVs is becoming a great need. Crowd analysis is also very important for situations that require visual surveillance, such as anomaly and alarm situations. In recent years, many different methods have been proposed for crowd density map estimation, and processing density maps has become the most popular way to estimate crowd counts. These density maps are usually calculated with the help of CNNs. Most of the crowd counting datasets in the literature consist of images collected from surveillance cameras. Such images are taken at an oblique, fixed angle, with people occupying most of the frame, and from a distance much closer than typical drone footage. The approach proposed in this study is therefore of great importance for emergencies where images must be taken by drones in environments without surveillance cameras. The developed system consists of two stages. In the first stage, we determine whether the image contains any persons with the help of a binary classifier. If there are persons in the input image, the crowd estimation algorithm then calculates the density map of people in the given image. This study involves the development of a crowd density map estimation system that leverages the robust feature extraction capabilities of deep CNN architectures. In our system, a binary classifier comes into play before the CNN designed for the crowd counting task is run.
This binary classifier is included in the system to determine whether or not there are any persons in an image taken from a UAV. To test the performance of the proposed system, we used the VisDrone-CC2020 dataset [1]. We applied image inpainting methods to this dataset to create UAV images that do not contain any humans. For binary classification, the pretrained ResNet50 model [6] was then fine-tuned on the dataset, and 87% accuracy was achieved. For crowd counting, the second stage of this system, we used SGANet [9], which was designed specifically for this problem, and created a new architecture by adding several layers to this network. To train the network, ground truth density maps were first created. Ground truth density maps are produced using the images and labels provided by the dataset, while output density maps are learned with our SGANet. By comparing the learned density map with the ground truth density map, a loss is evaluated, and this loss is used to train our SGANet. We obtained an MAE of 8.65; MAE is the most widely used metric in the crowd counting task. We then performed an error analysis for the models trained for both binary classification and crowd counting. For binary classification, we deduced that the incorrect outputs can be caused by artifacts formed in the photos after image inpainting. For crowd counting, we deduced that small percentage errors in dense scenes affect MAE a lot, so new metrics should be developed for this problem. In addition, in photographs taken from a very high altitude, the pixels representing an object on the ground and the pixels representing a person are close to each other and very few in number; we therefore deduced that in these scenes the calculations behave as if there were more people than the actual number.
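A short sketch of how ground-truth density maps are commonly produced for crowd counting (the abstract mentions creating them from the dataset's images and labels) and how MAE is computed, assuming NumPy and SciPy; the head coordinates and kernel width are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(shape, head_points, sigma=4.0):
    """Place a unit impulse at each annotated head and smooth it with a
    Gaussian; the map then integrates to the number of people in the image."""
    dmap = np.zeros(shape, dtype=np.float32)
    for y, x in head_points:
        dmap[int(y), int(x)] += 1.0
    return gaussian_filter(dmap, sigma)

def mae(pred_counts, gt_counts):
    return float(np.mean(np.abs(np.asarray(pred_counts) - np.asarray(gt_counts))))

gt = density_map((120, 160), [(30, 40), (32, 44), (80, 120)])
print(round(float(gt.sum()), 2))     # ~3.0 people
print(mae([2.4, 7.0], [3.0, 6.0]))   # 0.8
```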
-
Çapraz e-ticaret pazarlarında hibrit öneri sistemi [A hybrid recommendation system for cross-market e-commerce] (Graduate School, 2023-08-04) Köse, Emre ; Yaslan, Yusuf ; 504181559 ; Computer Engineering
Recommendation systems aim to suggest products that match users' needs on film, music, e-commerce, and other platforms using a variety of algorithms. These algorithms generally make recommendations by obtaining user-item representations. Early work proceeded with matrix factorization; later, different memory- or model-based approaches were developed, and continue to be developed, for both collaborative and content-based recommendation. The cross-market recommendation problem, which has emerged in social media, e-commerce applications, and other online platforms, can be described as a new line of work that uses the data of one or more source markets to make recommendations to users in a data-scarce target market. Some points require attention when learning from the data: models learned and optimized on source-market data can produce problematic results if they are applied without taking the target market's behavior into account. Consider, for example, a country where the clothing category is used much more heavily than other categories. If that country's average temperature is much higher than the target market's, recommending a t-shirt to a customer who bought standard trousers may make sense in the source market but be irrelevant in the target market. Learning from the data should therefore be done in a way that can account for the distributions and biases in both markets. Although cross-market recommendation can be described as a relatively new topic that has emerged in recent years, the methods mentioned above can be used as solutions here in various ways. In the literature, the FOREC algorithm is an important study in this area, both for the solution it brings and for the open dataset it provides. Building on the concepts of market adaptation and meta-learning, the multi-network algorithm developed in the Cross-Market Product Recommendation study published in 2021 also includes the XMarket dataset, which contains the user-item pairs and ratings of 18 local markets, i.e., countries, in 16 categories. The algorithm first performs market-agnostic training, in which source- and target-market data are used together with GMF, MLP, and NMF models; in this step it additionally applies few-shot learning with the MAML framework. In the second, market-specific stage, extra MLP layers are trained with target-market data only, completing the training of the FOREC system. Although neural networks with millions of parameters, fed with user-item pairs, can obtain representations whose similarities we can understand and compare, at the starting point each data sample, for example each user (or item), is not represented in a structure that reflects physical proximity. At this point, representing the data as a graph, considering that users and items are in interaction, can bring a different architecture and learning method into the picture.
Graph convolutional networks, using neighborhood aggregation in a simplified form, can exploit neighbor relations at different depths that deep neural networks or the few-shot learning method cannot learn architecturally, and on many markets' data they achieve successful results on their own, outperforming the other approaches. In this work, the recommendation system developed for cross markets uses a graph structure. A Light Graph Convolutional Network (LGCN) is trained with market-agnostic and market-specific steps, as in the FOREC study. By applying representation transfer between these two stages, the system we developed consists of a simpler training flow. In the first training step, the graph built from the pairs in the source- and target-market data is trained with the data of these two markets. The user and item representations saved after this stage are used in the second step as the initialization of half of the new representations when the new graph is built; the other part of each representation is initialized randomly from a given distribution so that it can focus on market-specific learning. Before the test stage, our study also examines several metrics that might correlate with the validation data, in order to explore the relations between different markets' data and potential points of improvement with the trained graph network. These metrics are listed below.
- The average rating a user gives to the items in the training data
- How many items the user interacts with at first degree in the target-market training set
- How many second-degree pairs users have in the source and target training sets
- Degree Centrality
- Closeness Centrality
- Node Redundancy Coefficient
- Clustering Coefficient
As can be seen, these values include both basic statistics that can be extracted from the raw data and metrics that can be computed after the bipartite graph is built. The conclusion we draw from the results of this stage is that users' individual nDCG scores correlate more with the Node Redundancy Coefficient and the Clustering Coefficient obtained from the bipartite graph than with the other values. The thesis discusses how these correlations could be used in future work. The experimental results cover seven different models. Five of these are our own reproductions of the results reported in FOREC, the study we take as our reference. The other two models are, respectively, the result of the market-agnostic first step of the system we developed for this problem and the final hybrid LGCN model obtained after the two-stage training. The reported results come from experiments that train and evaluate markets in pairs: FOREC reports results over seven target markets, using for each target each of the remaining six markets individually as the source market and training accordingly. In our system, whose training we run similarly to the FOREC reference, we proceeded with four of these target markets: Germany, Japan, Mexico, and the United Kingdom. In addition, the data of the United States market was used only as source data in the experiments.
With our two-stage approach, we observed results that are better than all of FOREC's results by margins between 5% and 8% for the different target markets. In addition, it emerged that the market-specific training applied after the first step contributed an improvement of 1% to 2% to the results. In conclusion, this study proposes a model learned with a two-stage graph neural network for cross markets and compares its performance with the FOREC algorithm, which has been observed to give strong results in this area. Under the nDCG@10 evaluation metric, the proposed model gives better results than the FOREC algorithm on the different target markets.
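A small PyTorch sketch of the representation-transfer idea described above: when the market-specific graph model is built, half of each embedding is initialized from the market-agnostic stage and the other half is drawn randomly. The sizes and the random distribution are illustrative, not the thesis configuration.

```python
import torch

num_users, dim = 1000, 64
pretrained = torch.randn(num_users, dim // 2)  # stands in for stage-1 embeddings

market_specific = torch.empty(num_users, dim)
market_specific[:, : dim // 2] = pretrained                       # transferred half
torch.nn.init.normal_(market_specific[:, dim // 2 :], std=0.1)    # market-specific half

embedding = torch.nn.Embedding.from_pretrained(market_specific, freeze=False)
print(embedding.weight.shape)  # torch.Size([1000, 64])
```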
-
Effect of semi-supervised self-data annotation on video object detection performance (Graduate School, 2022-06-22) Akman, Vefak Murat ; Töreyin, Behçet Uğur ; 704191017 ; Computer Sciences
Access to annotated data is more crucial than ever as deep learning frameworks replace traditional machine learning methodologies. Even if the method is robust, training performance can be inadequate if the data is of poor quality. Some methods have been developed to address data-related issues; these methods, however, have a negative impact on algorithm complexity and processing cost. Errors related to human factors, such as misclassification or inaccurate labeling, should also be considered. Multiple steps in the data annotation process cost time and money: data gathering, annotation, and formatting according to the deep learning model architecture. Unfortunately, these steps are still not fully standardized, and the whole process comes with many difficulties. In this study, the effect of semi-supervised data annotation on video object detection is analyzed using the Soft Teacher algorithm. Soft Teacher is a semi-supervised learning method with a Swin Transformer backbone, which gives it a major advantage in overcoming limited data. The Swin Transformer is a type of vision transformer; it creates hierarchical feature maps by merging image patches in deeper layers and has computational complexity linear in the input image size. As such, it can be used as a general-purpose backbone for tasks like classification and object detection. In Soft Teacher, there are two types of models: the Student model and the Teacher model. The Teacher model performs pseudo-labeling on weakly augmented unlabeled images, and the Student model is trained on both labeled and strongly augmented unlabeled images while updating the Teacher model. The Soft Teacher model was trained on the open-source COCO dataset, which covers 80 labels. The dataset, which contains 118,287 training, 123,403 unlabeled, and 5,000 validation images, was annotated by humans. Soft Teacher was trained with 1%, 5%, 10%, and 100% of the labeled data, respectively. Then, using those trained Soft Teacher models, new annotations were created from the same raw data, and some state-of-the-art object detection algorithms were trained with the newly annotated data. To compare results, these object detection models were also trained with the manually annotated data. In terms of mAP, the model trained with human-annotated data was shown to be less successful than the other; however, the model trained with self-annotated data produced more false positives, because the trained model can mislabel objects when generating new data. In conclusion, the results suggest that semi-supervised data annotation degrades detection performance in exchange for huge training-time savings.
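A small sketch of the pseudo-labeling step that a teacher-student scheme such as Soft Teacher relies on: only the teacher's high-confidence detections on weakly augmented unlabeled images are kept as training targets for the student. The detection format, class names, and threshold are illustrative assumptions.

```python
from typing import Dict, List

def filter_pseudo_labels(detections: List[Dict], score_thr: float = 0.9) -> List[Dict]:
    """Keep high-confidence teacher detections as pseudo ground truth."""
    return [d for d in detections if d["score"] >= score_thr]

teacher_output = [
    {"bbox": [10, 20, 50, 80], "label": "person", "score": 0.96},
    {"bbox": [100, 40, 140, 90], "label": "dog", "score": 0.55},
]
pseudo_labels = filter_pseudo_labels(teacher_output)
print(pseudo_labels)  # only the 0.96-confidence box survives
```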
-
Event extraction from Turkish Trade Registry Gazette (Graduate School, 2023-05-16) Demirtaş, İrem Nur ; Eryiğit, Gülşen ; 504191565 ; Computer Engineering
The Turkish Trade Registry Gazette is the official gazette published by The Union of Chambers and Commodity Exchanges of Türkiye. Companies announce crucial events such as changes in management, changes in capital, or bankruptcy in the gazette. In many industries, the gazette is used as an important source of information and intelligence. The gazette has a history of almost 70 years, and its issues are publicly available on the internet in image PDF format. This format is both hard to read for humans and hard to process for computers. On top of that, since the gazette has been published in newspaper layout, the text is usually in columns, and in later issues some information can be given in tables. Although optical character recognition looks like a viable option for text extraction, it must be supported with image processing. To extract information from the Turkish Trade Registry Gazette, announcements of selected companies between January 2014 and August 2022 were collected. The collected data consists of PDF documents of gazette pages for the selected companies and the related metadata; the metadata contains the issue number, the page number, and the type of announcement the company has on the given page. Text was extracted using an image processing and optical character recognition pipeline and then manually annotated. Since the text is extracted from the whole document, it contains multiple announcements; thus, announcement boundaries were annotated. Based on the most important and frequent announcement types encountered in the Turkish Trade Registry Gazette, four event types were defined: Composition with Creditors, Notice to Creditors, Change in Management, and Change in Working Capital. Events consist of triggers that signal the occurrence of the event, event arguments that specify general and event-specific entities involved in the events, and event roles that define the relations between triggers and arguments. Using these definitions, triggers, arguments, and roles were defined and annotated for each of these event types. Using the announcement boundaries, an announcement splitting model was trained. After all collected announcements were split using this model, the announcements listed in the metadata table were located in the pages, and an announcement classification dataset with 16 announcement types was created. Using this dataset, an announcement classification model was trained. Since announcements are documents of varying lengths, the effect of context was observed. The announcement classification model achieves an F1 score of 0.83. For trigger and argument extraction, experiments were carried out in different settings; the effects of IOB tags, an added CRF layer, and handling argument and trigger extraction separately were observed. The best-performing model was determined to be the two-stage one that does not use IOB tags or a CRF layer, with a micro F1 score of 82.5. For event extraction, a rule-based model and Doc2EDAG [1] were explored. Although the rule-based model performs better on simpler event types, Doc2EDAG was found to be better, with a micro F1 score of 73.9 on gold arguments and 54.2 on predicted arguments. Four approaches were proposed to improve the performance.
Of these, removing the CRF layer and applying transfer learning yielded improved micro F1 scores of 74.9 and 75.2 over gold arguments and 60.5 and 62.9 over predicted arguments, respectively. The other two proposed methods, namely turning off the path expansion memory and field-aware path expansion, yielded poorer results than the baseline.
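A small illustration of the token-level tagging scheme the trigger and argument extraction experiments compare (with and without IOB prefixes); the announcement tokens, spans, and label names are invented for the example.

```python
def spans_to_bio(tokens, spans):
    """spans: (start, end, label) token ranges, end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags

tokens = ["Şirket", "sermayesi", "1.000.000", "TL", "olarak", "artırılmıştır"]
# Invented example: the final verb acts as a Change in Working Capital trigger
# and the amount "1.000.000 TL" as one of its arguments.
tags = spans_to_bio(tokens, [(2, 4, "AMOUNT"), (5, 6, "TRIGGER")])
print(list(zip(tokens, tags)))
```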
-
Fight recognition from still images in the wild (Graduate School, 2022-06-22) Aktı, Şeymanur ; Ekenel, Hazım Kemal ; 504191539 ; Computer Engineering
Violence in general is a sensitive subject and can have a negative impact on both the people involved and witnesses. Fighting is one of the most common types of violence and can be defined as an act where individuals intend to harm each other physically. In daily life, these kinds of situations might not be faced too often; however, violent content on social media is also a big concern for users. Since violent acts, or fights in particular, are considered an anomaly or intriguing by some, people tend to record these scenes and upload them to their social media accounts. Similarly, news agencies regard them as newsworthy material in some cases. As a result, fighting scenes frequently become available on social media platforms. Some users may be sensitive to these kinds of media content, and children, who can be harmed by the aggressive nature of fight scenes, also use social media. These facts make it necessary to detect and limit the distribution of violent content on social media. There are some systems focusing on violence and fight recognition in visual data. However, these works mostly propose methods for other violence domains such as movies, surveillance cameras, etc., and the social media case remains unexplored. Furthermore, even though most of the fight scenes shared on social media are video sequences, there is also a non-negligible amount of image data depicting violent fighting. Yet no prior work tackles fight recognition from still images instead of videos. Thus, in this thesis, the problem of fight recognition from still images is investigated. In this scope, first, a novel dataset named Social Media Fight Images (SMFI) was collected from social media images. The dataset was collected from Twitter and Google Images, and some frames were included from the NTU CCTV-Fights video dataset. The fight samples were chosen from samples recorded in uncontrolled environments. In order to crawl a large amount of data, different keywords were used in various languages. The non-fight samples were also chosen from the data crawled from social media in order to keep the domain consistent across the classes. The dataset is made publicly available by sharing the links to the images. For the classification of the Social Media Fight Images dataset, several image classification methods were applied. First, Convolutional Neural Networks (CNN) were employed for the task and their performance was assessed. Then, a recent approach, the Vision Transformer (ViT), was exploited for the classification of fight and non-fight images. The comparison showed that the Vision Transformer gives better results on the dataset, achieving higher accuracy with less overfitting. A further experiment investigated the effect of varying dataset sizes on the performance of the model. This was seen as necessary because data shared on social media may be deleted in the future, and it is not always possible to retrieve the whole dataset. So, the model was trained on different partitions of the dataset, and the results showed that, even though using more data is better, the model could still give satisfying performance even in the absence of 60% of the dataset.
Following the successful results on the still-image fight recognition problem, another experimental study was conducted on the classification of video-based datasets using a single frame from each sample. The experiment included four video-based fight datasets, and the results showed that three of them could be successfully classified without using any temporal information. This indicated that there might be a dataset bias for these three datasets, where the visual difference between classes is high. Cross-dataset experiments also supported this hypothesis, as the models trained on these video datasets perform poorly on the other fight recognition datasets. Nonetheless, the network trained on the proposed SMFI dataset gave promising accuracy on the other datasets as well, showing that the dataset generalizes the fight recognition problem better than the others.
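A rough sketch, not the thesis's exact setup, of fine-tuning a Vision Transformer for binary fight versus non-fight classification with torchvision; the pretrained weights, head replacement, optimizer settings, and dummy batch are illustrative assumptions.

```python
import torch
from torch import nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, 2)  # fight vs. non-fight

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of 224x224 images.
images = torch.randn(4, 3, 224, 224)
labels = torch.tensor([0, 1, 1, 0])
logits = model(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
print(float(loss))
```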
-
GAN-based intrinsic exploration for sample efficient reinforcement learning (Graduate School, 2022-05-23) Kamar, Doğay ; Ünal, Gözde ; 504181511 ; Computer Engineering
Reinforcement learning is a sub-area of artificial intelligence in which the learner learns in a trial-and-error manner. The learner does so by executing an action depending on the current state it is in and observing the results. After executing an action, a reward signal is given to the learner, and through the rewards the learner can learn which actions are best in different situations. However, the learner is not given any prior information about the environment it is in or about which action is best in the current state. Therefore, exploring the environment is important for gathering the information needed to navigate toward high rewards. The most common exploration strategies involve occasional random action selection. However, they only work under certain conditions, such as the rewards being dense and well-defined. These conditions are hard to meet for many real-world problems, and an efficient exploration strategy is needed for such problems. Utilizing Generative Adversarial Networks (GAN), this thesis proposes a novel module for sample-efficient exploration, called the GAN-based Intrinsic Reward Module (GIRM). The GIRM computes an intrinsic reward for states, and the aim is to compute higher rewards for novel, unexplored states. The GIRM uses a GAN to learn the distribution of the states the learner observes and contains an encoder, which maps a query state to the input space of the generator of the GAN. Using the encoder and the generator, the GIRM can detect whether a query state belongs to the distribution of the observed states. If it does, the state is regarded as a visited state; otherwise, it is a state that is novel to the learner, in which case the intrinsic reward will be higher. As the learner receives higher rewards for such states, it is incentivized to explore the unknown, leading to sample-efficient exploration. The GIRM is evaluated in two settings: a sparse-reward and a no-reward environment. It is shown that, in both settings, the GIRM is indeed capable of exploring compared to the base algorithms, which use random exploration methods. Compared to the other studies in the field, the GIRM also manages to explore more efficiently in terms of the number of samples. Finally, we identify a few weaknesses of the GIRM: performance degrades when sudden changes to the distribution of the observed states occur, and the exploitation of very large rewards is not avoided.
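A loose sketch of the intrinsic-reward idea summarized above: an encoder maps a query state into the generator's latent space, and the reconstruction error of G(E(s)) serves as a novelty signal that is higher for unfamiliar states. The network sizes are illustrative, and the GAN and encoder training itself is omitted.

```python
import torch
from torch import nn

STATE_DIM, LATENT_DIM = 16, 4
generator = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(), nn.Linear(64, STATE_DIM))
encoder = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, LATENT_DIM))

def intrinsic_reward(state: torch.Tensor) -> torch.Tensor:
    """High reconstruction error -> state is far from the learned distribution."""
    with torch.no_grad():
        reconstruction = generator(encoder(state))
    return torch.mean((state - reconstruction) ** 2, dim=-1)

states = torch.randn(5, STATE_DIM)
print(intrinsic_reward(states))  # one novelty score per state
```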
-
Generalized multi-view data proliferator (GEM-VIP) for boosting classification (Graduate School, 2022-08-08) Çelik, Mustafa ; Rekik, Islem ; 504131531 ; Computer Engineering
Multi-view network representation reveals multi-faceted alterations of the brain as a complex interconnected system, particularly in mapping neurological disorders. Such a rich data representation maps the relationships between different brain views, which has the potential to boost neurological diagnostic tasks. However, multi-view brain data is scarce and generally collected in small sizes; thus, this data type is broadly overlooked by researchers due to its relatively small size. Despite the existence of data proliferation techniques as a way to overcome data scarcity, to the best of our knowledge, multi-view data proliferation from a single sample has not been fully explored. Here, we propose to bridge this gap with our GEneralized Multi-VIew data Proliferator (GEM-VIP), a framework aiming to proliferate synthetic multi-view brain samples from a single multi-view brain to boost multi-view brain data classification tasks. Given a Connectional Brain Template (CBT, i.e., an approximation of a population's brain graphs that captures the connectivity its subjects share), we set out to proliferate synthetic multi-view brain graphs using the inverse of the multivariate normal distribution (MVND). However, this requires two crucial components: the mean and the covariance of a given population. As such, our proposed GEM-VIP framework first obtains a population-representative tensor (i.e., drawn from the prior CBT), which can mathematically be regarded as the mean of the population. Second, drawing inspiration from the genetic algorithm paradigm, GEM-VIP learns the covariance matrix of the population using the given CBT. Lastly, it proliferates synthetic samples by plugging the previously obtained representative tensor and the learned covariance matrix of the population into the MVND equation. We evaluate GEM-VIP against several comparison methods. The results show that our framework boosts multi-view brain data classification accuracy on the AD/lMCI and eMCI/normal control (NC) datasets. In short, our GEM-VIP method boosts the diagnosis of neurological disorders.
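A sketch of the final proliferation step under the stated assumptions: a flattened connectional brain template acts as the population mean, and synthetic samples are drawn from a multivariate normal distribution with a given covariance. The CBT values and the covariance are random placeholders; the genetic-algorithm-inspired covariance learning is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_views, n_regions = 4, 10
cbt = rng.random((n_views, n_regions, n_regions))     # stands in for the CBT tensor

mean = cbt.reshape(-1)                                # population-representative mean
cov = 0.01 * np.eye(mean.size)                        # placeholder learned covariance

samples = rng.multivariate_normal(mean, cov, size=10) # 10 synthetic subjects
synthetic_tensors = samples.reshape(10, n_views, n_regions, n_regions)
print(synthetic_tensors.shape)  # (10, 4, 10, 10)
```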
-
ÖgeIQ-flow: Mechanism design for inducing cooperative behavior to self-interested agents in sequential social dilemmas(Graduate School, 2022-12-20) Güresti, Bengisu ; Üre, Nazım Kemal ; 504191557 ; Computer EngineeringAchieving and maintaining cooperation between agents in order to accomplish a common objective is one of the central goals of Multi-Agent Reinforcement Learning (MARL). Although many methods promise high performance in the literature, these methods are mainly concerned with obtaining that performance in the same agent set-up as in training. However, in real-world scenarios the environment is open-ended, and any number of agents can enter it. Furthermore, in many real-world scenarios, separately trained and specialized agents are deployed into a shared environment, or the environment requires multiple sets of objectives to be achieved by different coexisting parties. These variations among specialties and objectives are likely to cause mixed motives that eventually result in a social dilemma where all the parties are at a loss. Nevertheless, when there is a single specialty and the objectives do not cause a mixed-motive problem, we can approach the situation as a transfer and generalization problem in cooperative MARL with decentralized execution. Thus, we first examine the scenarios with a single objective and deduce whether an external mechanism is necessary to promote cooperation in these scenarios. Then, we turn our focus to cases where there is an underlying social dilemma in the environment, and study and propose incentivization-based methods to promote cooperation under sequential social dilemmas. Centralization and decentralization are two approaches used for cooperation in MARL. While fully decentralized methods are prone to converge to suboptimal solutions due to partial observability and nonstationarity, the methods involving centralization suffer from scalability limitations and the lazy-agent problem. The centralized training with decentralized execution (CTDE) paradigm brings out the best of these two approaches; however, centralized training still has an upper limit of scalability, not only in terms of the achieved coordination performance but also in model size and training time. Since we want to study the situation where any number of agents with a single cooperative objective can be deployed into a shared environment, we adopt the CTDE paradigm for our first study and investigate the generalization and transfer capacity of the trained models across a variable number of agents. The generalization and transfer capacity of the agents is assessed by training a variable number of agents in a specific MARL problem and then performing greedy evaluations with a variable number of agents for each training configuration. Thus, we analyze the evaluation performance for each combination of agent count for training versus evaluation. We perform experimental evaluations on predator-prey and traffic junction environments and demonstrate that it is possible to obtain similar or higher evaluation performance by training with fewer agents. We deduce that the optimal number of agents to perform training may differ from the target number of agents and argue that transfer across a large number of agents can be a more efficient solution to scaling up than directly increasing the number of agents during training.
Thus, we conclude that deploying trained agents to an open-ended environment does not constitute a problem or necessitate an external incentivizing mechanism when the objective is single and all of the agents use the same policy. Turning the focus to the deployment of separately trained and specialized agents into a shared environment necessitates the study of Sequential Social Dilemmas (SSDs), since agents with different specializations are prone to have mixed motives. Sequential Social Dilemmas have been gaining attention in recent years. Current trends either focus on engineering incentive functions that modify rewards to reach general welfare, or on developing learning-based approaches that modify the reward function by accounting for the impact of the incentives on policy updates. One of the most significant works in the learning-based approach is LIO, which enables independent self-interested agents to incentivize each other through an additive incentive reward. LIO assumes that agents continually learn and adapt according to the changing incentives they give each other, and it has demonstrated success in several sequential social dilemma environments. We investigate LIO's performance under a variety of setups in the public goods game Cleanup in order to analyze its robustness to the necessity of including inductive bias in the incentive function and to randomness in initial agent positions (optionally with asymmetric incentive potential), and to assess its stability under frozen incentive functions after the agents' exploration is reset. We observe and demonstrate empirically that LIO is indeed sensitive to these settings and that it is not reliable for obtaining incentives that keep the system stable once the incentive function is held static. We conclude with research directions that would improve the robustness of the method and of incentive-learning research in general. Finally, we study using a single incentivizing mechanism instead of giving every agent the ability to incentivize the others. We aim to preclude the suboptimal consequences of agents with mixed motives by using a central mechanism that learns its incentives adaptively while the agents in question learn their policies. Thus, we propose the Incentive Q-Flow (IQ-Flow) algorithm, which modifies the system's reward setup with an incentive regulator agent such that the cooperative policy also corresponds to the self-interested policy for the agents. Unlike existing methods that learn to incentivize self-interested agents or adaptive mechanisms, IQ-Flow does not make any assumptions about agents' policies or learning algorithms, which enables generalization of the developed framework to a wider array of applications. IQ-Flow performs offline evaluation of the optimality of the learned policies using the data provided by other agents to determine cooperative and self-interested policies. Next, IQ-Flow uses meta-gradient learning to estimate how policy evaluation changes according to the given incentives and modifies the incentives such that the greedy policies for the cooperative objective and the self-interested objective yield the same actions. We present the operational characteristics of IQ-Flow in Iterated Matrix Games. We demonstrate that IQ-Flow outperforms the state-of-the-art incentive-design algorithm in the Escape Room and Cleanup environments. We further demonstrate that a pretrained IQ-Flow mechanism significantly outperforms the shared-reward setup in the Cleanup environment.
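To ground the idea of a central mechanism reshaping rewards so that the cooperative action is also the greedy self-interested action, here is a deliberately simplified matrix-game sketch. It is not the meta-gradient procedure IQ-Flow actually uses; the function name, the single-agent view, and the fixed margin are illustrative assumptions.

```python
def cooperation_incentive(payoff, coop_action=0, margin=1e-3):
    """Toy central mechanism for a matrix game (not IQ-Flow itself).

    payoff[a] is one agent's reward for action a, assuming the other agents
    cooperate. Returns the smallest bonus to pay on the cooperative action so
    that cooperating becomes the greedy self-interested choice.
    """
    best_alternative = max(r for a, r in enumerate(payoff) if a != coop_action)
    gap = best_alternative - payoff[coop_action]
    return max(0.0, gap + margin)

# Prisoner's-dilemma-style row player: payoffs [cooperate, defect] = [3, 5].
# A bonus of ~2.001 on cooperation makes the greedy and cooperative actions coincide.
print(cooperation_incentive([3.0, 5.0]))
```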
-
ÖgeMAC sublayer protocol design and optimization for aerial swarms(Graduate School, 2023-07-28) Aydın, Esin Ece ; Seçinti, Gökhan ; 504211514 ; Computer EngineeringThe main objective of this thesis is to design and optimize a MAC sublayer protocol for ad hoc networks, with a primary focus on maintaining reliable communication. Ad hoc networks comprising aerial swarms provide benefits such as ease of use and operation in diverse environments, thanks to their simple and economical deployment and their remarkable maneuverability. However, the communication standard used in these networks, IEEE 802.11 (widely known as Wi-Fi), is primarily designed for networks with limited mobility and minimal changes in network topology. As a result, the existing Wi-Fi standards have limitations in accommodating rapidly changing network topologies. This limitation becomes particularly problematic for aerial swarms that require reliable and high-bandwidth multi-hop communication links, ultimately leading to an inability to meet quality of service (QoS) requirements. Due to the dynamic and contested nature of ad hoc networks, ensuring reliable communication can be challenging at times. To address network management challenges in highly decentralized networks, a self-organizing TDMA-based protocol is proposed. This protocol is designed to tackle communication difficulties in ad hoc networks and optimize the overall communication process by incorporating intelligent topology management, dynamic slot assignment, slot migration, and slot releasing as key components. By integrating these features, the protocol aims to enhance communication reliability and address the specific requirements of ad hoc networks. Implementing this protocol at the data link layer allows for decentralized coordination among nodes, removing the requirement for a central unit and ensuring continuous communication even in dynamically changing environments and conditions. In contrast to existing MAC-sublayer protocols, the goal of this research is to present and simulate a protocol that meets the specific requirements of ad hoc networks. The thesis begins with an examination and modeling of the current situation, followed by an outline of services, message formats, procedural rules, and sequence diagrams for the subsequent protocol design stage. The protocol's design incorporates a number of notable capabilities, such as slot operations, frame-size modification, topology management, optimized control-packet exchange, and collision avoidance, all of which contribute to the protocol's successful operation. To validate the findings of this thesis, the proposed protocol is evaluated in the OMNeT++ simulation environment. In contrast to previous studies, the proposed S-TDMA protocol is assessed based on four key metrics: energy efficiency, control traffic, packet delivery ratio, and average channel utilization. The evaluation results indicate a substantial enhancement in overall channel utilization, reaching up to 55%, while also reducing control traffic overhead by approximately 13%. These outcomes highlight the effectiveness and benefits of the proposed protocol in improving network performance and resource utilization. The simulation results provide important insights into the protocol's performance and its ability to adapt to changing network conditions.
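To illustrate the flavor of decentralized slot assignment described above, the following is a minimal sketch in which each node claims the lowest-numbered slot not announced by its one-hop neighbors. It is an assumption-laden simplification: S-TDMA additionally handles slot migration, slot releasing, frame-size changes, and collision avoidance, none of which are modeled here, and the function and variable names are illustrative.

```python
def assign_slot(node_id, neighbor_slots, frame_size):
    """Toy decentralized TDMA slot selection (not the full S-TDMA procedure).

    neighbor_slots maps each one-hop neighbor id to the slot it has announced;
    the node claims the lowest free slot in the current frame, or None if the
    frame is full and a frame-size increase would be needed.
    """
    occupied = set(neighbor_slots.values())
    for slot in range(frame_size):
        if slot not in occupied:
            return slot
    return None

# Example: two neighbors already hold slots 0 and 2 in a 4-slot frame.
print(assign_slot("uav-3", {"uav-1": 0, "uav-2": 2}, frame_size=4))  # -> 1
```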
-
ÖgeMeasuring and predicting software requirements volatility for large-scale safety-critical avionics projects(Graduate School, 2022-02-01) Holat, Anıl ; Tosun Kühn, Ayşe ; 504171560 ; Computer EngineeringDuring the software development life cycle, software requirements are subjected to many changes despite the recent developments in software engineering. These modifications, additions, or removals are referred to as requirements volatility. Constantly changing requirements affect the cost of the project, the project schedule, and the quality of the product. In the worst case, projects fail or are only partially completed due to requirements volatility. Various requirements volatility measures have been used in previous volatility prediction studies and industrial volatility measurement practices. In this thesis, a very large safety-critical avionics software project from ASELSAN, with thousands of software requirements, is employed to forecast the number of changes for each software requirement as a measure of requirements volatility. To explain requirements volatility, we use a complete collection of the following metrics: requirement textual metrics, project-specific characteristics, and interdependencies between software requirements. The requirement textual metrics in this thesis are chosen from two requirements quality analyzer tools used in the literature. The project-specific metrics are created by examining safety-critical avionics project features one by one and including those that would give information on requirements volatility. Traceability links between system and software requirements are used to create a network graph, and network centrality metrics are computed for software requirements with respect to this graph. Requirements volatility prediction is performed using several machine learning techniques utilized by the baseline studies: k-nearest neighbor regression, linear regression, random forest regression, and support vector regression. Combining input metric groups with machine learning algorithms, 28 predictive models are created in this study. This research evaluates the performance of the proposed models in predicting software requirement change-proneness, identifies the best-performing input metric combinations and machine learning techniques, and assesses the success of the proposed models in labeling highly volatile software requirements. The model that combines requirement textual measurements, avionics project features, and network centrality metrics with a k-nearest neighbor learner produces the best prediction results (MMRE=0.366). Furthermore, the best predictive model correctly labels 63.2 percent of the highly volatile software requirements, which account for 80 percent of the total software requirement changes. The findings of our research are promising in terms of developing automated requirement change analyzer tools to minimize requirements volatility concerns in early development phases.
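As a concrete reference for the evaluation setup, the sketch below shows a k-nearest neighbor regressor trained on a concatenated feature vector and scored with the Mean Magnitude of Relative Error (MMRE), the metric reported above. The feature matrix here is random placeholder data, and the number of features and neighbors are assumptions; only the MMRE definition and the choice of learner follow the text.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def mmre(y_true, y_pred):
    """Mean Magnitude of Relative Error over requirements with known change counts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / np.maximum(y_true, 1e-9)))

# Placeholder data: per-requirement textual, project-specific, and centrality
# metrics concatenated into 12 features (illustrative, not the thesis features).
rng = np.random.default_rng(42)
X_train, y_train = rng.random((200, 12)), rng.integers(1, 10, 200)
X_test, y_test = rng.random((50, 12)), rng.integers(1, 10, 50)

model = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
print("MMRE:", mmre(y_test, model.predict(X_test)))
```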
-
ÖgeMemory-based approaches to problems in probabilistic modeling(Lisansüstü Eğitim Enstitüsü, 2022-10-25) Akgül, Abdullah ; Ünal, Gözde ; 504201504 ; Computer EngineeringDeep neural networks are an accepted solution for many problems in machine learning; however, their application to safety-critical areas such as health care is still a hot research topic. To employ deep neural networks in such fields, they are expected to fit the in-domain data set, provide calibrated predictions on problematic regions of the target domain, and separate out-of-domain queries. Even though these expectations have been studied extensively, the studies are highly fragmented. Therefore, no model is able to meet these requirements simultaneously. Continual Learning (CL) is a framework that aims to learn numerous tasks in a sequential way. An ideal CL method should adapt to new tasks without forgetting previous ones. However, neural networks suffer from catastrophic forgetting, a performance drop on previously learned tasks caused by learning a new task. Yet, to obtain intelligent systems capable of adapting to environmental change, CL is crucial. Because of this, CL is a hot topic, but research on CL mainly targets image classification tasks, with limited work on time-series classification tasks and, to our knowledge, no work on multi-modal dynamics modeling. In this thesis, we employ an external memory to deal with problems in probabilistic modeling. Our solutions for these problems can be summarized as follows: i) Evidential Turing Processes (ETP): First, we define total calibration for the first time. After investigating two Bayesian paradigms, the Bayesian model and the Evidential Bayesian model, we introduce the Complete Bayesian Model (CBM), which unifies the two. We develop the ETP as an instance of CBMs with neural episodic memory. We build a pipeline to evaluate the models' performance for total calibration. We compare our solution, the ETP member of CBMs, with state-of-the-art members of other paradigms, and we also provide an ablation study. We investigate the models' performance on five real-world data sets, including one time-series classification task and four image classification tasks. Furthermore, we test the models on corrupted versions of the data sets. We use four metrics: test error for prediction accuracy, Expected Calibration Error for in-domain calibration, Negative Log-Likelihood (NLL) for model fit, and area under the ROC curve for out-of-domain detection. We report that only the ETP can excel in all three aspects of total calibration simultaneously. ii) Continual Dynamic Dirichlet Process (CDDP) for Continual Learning of Multi-modal Dynamics: We introduce a new problem: CL of multi-modal dynamics. Since the problem is novel, we create a baseline from existing methods. For this new problem, we introduce a novel solution that employs an external memory to transfer knowledge between tasks. We curate a pipeline for this newly introduced problem, in which new tasks arrive sequentially and each task contains samples from a certain number of different modes. Differences in task order may cause different results in CL setups; therefore, we change the task order for each replication. We also generate synthetic data sets and adapt time-series classification data sets to evaluate the models' performance on this problem.
We compare the models' performance using Normalized Mean Squared Error as a measure of prediction accuracy and NLL as a measure of Bayesian model fit that quantifies uncertainty. We reveal that our approach, CDDP, compares favorably to the established parameter-transfer approach in CL of multi-modal dynamical systems. To sum up, in this thesis we show experimentally that an external memory architecture can be used both for calibrating neural networks for use in safety-critical areas and for CL of multi-modal dynamics.
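For reference, the sketch below computes the Expected Calibration Error (ECE), one of the four metrics listed above, in its standard equal-width-binning form; the bin count and variable names are assumptions, and the thesis may use a different binning scheme.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Standard ECE: weighted average gap between confidence and accuracy per bin.

    confidences: predicted probability of the predicted class for each sample.
    correct: 1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return ece
```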
-
ÖgeMQTT-CT: Intelligent MQTT protocol with cloud integration(Graduate School, 2023-06-20) Erol, Muhammed Raşit ; Canberk, Berk ; 504201533 ; Computer EngineeringThe MQTT protocol, named Message Queuing Telemetry Transport, has become widely recognized as a superior communication protocol in the Internet of Things (IoT) community. However, conventional MQTT protocols described in the existing literature have limitations in supporting distributed environments and scalability. To address these limitations, a more advanced MQTT protocol called MQTT-ST has been developed, which offers bridging capabilities within distributed environments, making it an attractive choice for IoT systems. We have created an improved version, called MQTT-CB, which extends MQTT-ST with intelligence, scalability, and container-based distribution, making it easy to transport and deploy. Moreover, we have simplified the deployment of a cloud-based architecture that takes advantage of cloud technology. Our research focuses on enhancing the MQTT-ST protocol by incorporating intelligence capabilities. We utilize an LSTM (Long Short-Term Memory) network, a deep-learning model that can capture intricate patterns over time. In addition, our protocol uses predictive algorithms that enable it to anticipate retransmitted packets, dynamically adjust the number of brokers in real time, and reduce the number of brokers when clients are inactive. We have extensively tested MQTT-CB against MQTT-ST. The results show that MQTT-CB outperforms the traditional MQTT-ST protocol in reducing latency between subscribers and publishers. This provides better efficiency and responsiveness in IoT systems. Furthermore, our protocol adapts to publication rate changes and provides robustness in dynamic environments. MQTT-CB is a dependable and effective means of communication for IoT applications. Its ability to seamlessly adapt to changing conditions makes it ideal for IoT systems deployed in distributed environments. MQTT-CB opens up new possibilities for IoT solutions that can operate effectively in various scenarios where scalability, intelligence, and distribution capabilities are crucial for success. In summary, MQTT-CB significantly advances MQTT-ST protocols, introducing intelligence, scalability, and distribution to enable efficient and reliable communication between IoT devices. Furthermore, with its integration of the predictive LSTM algorithm, MQTT-CB optimizes the performance of the MQTT-ST protocol, paving the way for enhanced IoT applications with improved responsiveness and adaptability in distributed environments. The content of this thesis, including the methodology and results presented in all sections, is based on my research paper titled "MQTT-CB: Cloud Based Intelligent MQTT Protocol".
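To make the prediction-driven broker scaling tangible, the following is a minimal sketch of an LSTM that forecasts the next publication rate from a recent window, plus a toy rule that converts the forecast into a containerized broker count. The network size, input features, and per-broker capacity are illustrative assumptions and do not reflect the exact MQTT-CB architecture.

```python
import math
import torch
import torch.nn as nn

class LoadForecaster(nn.Module):
    """Hypothetical sketch of LSTM-based publication-rate forecasting."""

    def __init__(self, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, rates: torch.Tensor) -> torch.Tensor:
        # rates: (batch, window, 1) past per-interval publication rates.
        out, _ = self.lstm(rates)
        return self.head(out[:, -1])  # predicted rate for the next interval

def brokers_needed(predicted_rate: float, capacity_per_broker: float = 500.0) -> int:
    """Toy scaling rule: keep enough brokers for the forecast, never fewer than one."""
    return max(1, math.ceil(predicted_rate / capacity_per_broker))

# Example: forecast from a 10-step window of rates, then size the broker pool.
window = torch.rand(1, 10, 1) * 1000.0
forecast = LoadForecaster()(window).item()
print(brokers_needed(forecast))
```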
-
ÖgeNext generation wireless networks for social good(Graduate School, 2023-08-18) Çoğay, Sultan ; Seçinti, Gökhan ; 504211531 ; Computer EngineeringThe advancement of technology and communication systems has yielded beneficial outcomes in everyday life, and next generation wireless networks are an integral component of this evolutionary process. Consequently, the advancement of technology and evolving needs have led to the enhancement of wireless communication systems through next generation wireless networks, rendering them more powerful and efficient. These technologies, such as mobile communications, industrial applications, and the Internet of Things (IoT), significantly impact our lives. In addition to the factors above, wireless networks have emerged as a pivotal tool in addressing societal challenges. Next generation wireless networks have the potential to help manage various critical domains such as natural disasters, environmental concerns, traffic and transportation challenges, and public health issues. For these reasons, this thesis has two main objectives utilizing wireless networks. Firstly, we propose a wildfire monitoring method. Wildfires have emerged as a significant worldwide concern. The prevalence and severity of wildfires have increased due to climate change, anthropogenic actions, and natural influences. In response to the prevailing ecological crisis, researchers and professionals in science and engineering are actively exploring a range of technological and supplementary precautions. The findings of this investigation indicate that unmanned aerial vehicles (UAVs) can significantly contribute to combating forest fires. UAVs have become essential tools in firefighting and monitoring operations due to their notable attributes, including user-friendly interfaces, exceptional maneuvering capabilities, and enhanced availability. Nevertheless, the constrained energy capacity of a single UAV poses a significant challenge in efficiently surveilling expansive fire zones. To effectively tackle these challenges and enhance the efficiency of firefighting operations, we propose an advanced monitoring application called "Phoenix." Phoenix provides an advanced fire-tracking monitoring system, which integrates path planning, a graph engine, and modified Traveling Salesman Problem (TSP) algorithms. This system aids the UAV in effectively tracking fire areas and optimizing its trajectory. This capability enables the UAV to scan the fire area more efficiently, reducing response time and consequently helping to mitigate the spread of the fire. Phoenix also provides a network architecture that facilitates the prompt transmission of monitoring data to the fire brigade and other firefighting units. This enables the firefighting crews to remain informed about the prevailing conditions at the site and to enhance their coordination efforts. The Phoenix application performs energy optimization to tackle the energy limitations an individual UAV faces. Therefore, UAVs can remain airborne for an extended duration and effectively survey larger geographical regions, enhancing the efficacy of firefighting operations. The application operates by employing elliptical fire modeling and simulation techniques. Additionally, the analysis of critical fire zones incorporates fuel moisture content (FMC) data within the fire zone.
This enables Phoenix to respond effectively to real-world situations, thereby increasing the likelihood of success in firefighting endeavors. Secondly, we propose a blind spot detection method to protect pedestrians, cyclists, and motorcyclists in traffic and prevent accidents. Traffic crashes are a significant issue that regrettably results in numerous fatalities and injuries today. Traffic accidents are a prominent contributor to global mortality rates, particularly in middle-income nations with high traffic volumes and insufficient or inadequate infrastructure. Despite the implementation of numerous safety measures to address this issue, a significant level of risk remains, particularly for vulnerable road users, including pedestrians, cyclists, and motorcyclists. Vehicle blind spots are a crucial factor in such accidents. Despite the recent introduction of advanced safety systems incorporating costly hardware, detecting vulnerable users remains challenging, particularly in situations where the field of view is obstructed. We utilize ultra-wideband (UWB) technology to develop this system. UWB is an advantageous wireless communication technology in terms of both cost-effectiveness and widespread availability. We use the Time Difference Of Arrival (TDOA) method to detect the vehicle or pedestrian in the blind spot. We developed a demo of the proposed method using four UWB kits and a UWB-supported mobile phone, implementing both the software running on the kits and the mobile phone application ourselves. In addition, we compared our method with alternative methods in simulation. In conclusion, this thesis proposes two next-generation wireless network approaches. First, Phoenix, an advanced monitoring application, powers the proposed wildfire monitoring technique. This novel technology uses UAVs, advanced algorithms, and fire modeling to revolutionize firefighting, save lives, preserve ecosystems, and reduce wildfire damage. Phoenix shows how technology can safeguard our environment and help build a more resilient and sustainable future as we battle climate change and wildfires. The second part of this thesis proposes and examines road safety technology, such as blind spot detection, which reduces traffic accidents and saves lives. UWB technology and new algorithms may make roads safer and more inclusive. These road safety applications combine technology, legislation, and public awareness to reduce accidents and make roads safer.
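To illustrate the TDOA step, the sketch below estimates a tag's 2D position from arrival-time differences measured at four fixed UWB anchors (e.g., kits mounted on the vehicle) via nonlinear least squares. The anchor layout, solver choice, and function names are illustrative assumptions, not the exact algorithm implemented on the kits.

```python
import numpy as np
from scipy.optimize import least_squares

C = 299_792_458.0  # propagation speed (m/s) for UWB radio signals

def locate_tdoa(anchors, tdoas, ref=0, x0=(1.0, 1.0)):
    """Toy TDOA positioning: tdoas[i] is the arrival time at anchor i minus
    the arrival time at the reference anchor, in seconds."""
    anchors = np.asarray(anchors, dtype=float)
    tdoas = np.asarray(tdoas, dtype=float)

    def residuals(p):
        d = np.linalg.norm(anchors - p, axis=1)   # distances to candidate point
        return (d - d[ref]) - C * tdoas           # range-difference mismatch

    return least_squares(residuals, x0).x

# Example: anchors at the vehicle corners, a tag (e.g., a pedestrian's phone) nearby.
anchors = [(0, 0), (2, 0), (2, 5), (0, 5)]
true_pos = np.array([3.0, 2.0])
d = np.linalg.norm(np.asarray(anchors, dtype=float) - true_pos, axis=1)
print(locate_tdoa(anchors, (d - d[0]) / C))  # approximately [3.0, 2.0]
```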
-
ÖgeOcclusion robust and aware face recognition(Graduate School, 2023-05-25) Erakın, Mustafa Ekrem ; Ekenel, Hazım Kemal ; 504201532 ; Computer EngineeringFaces occluded by accessories such as sunglasses and face masks present a challenge for current face recognition systems. This thesis provides a comprehensive exploration of the issues caused by occlusions, particularly upper-face and lower-face obstructions, in real-world scenarios. The increased prevalence of sunglasses and face masks, the latter due to the COVID-19 pandemic, has amplified the importance of addressing these problems. In this thesis, the Real World Occluded Faces (ROF) dataset is gathered: a collection of faces with both upper-face and lower-face occlusions that serves as a critical resource for this area of study. In contrast to synthetic occlusion data, the ROF dataset provides an authentic representation of the issue, which our benchmark experiments have shown to be a significant impediment for even the most sophisticated deep face representation models. These models, while highly effective on synthetically occluded faces, exhibit substantial performance degradation when tested against the ROF dataset. This research comprises two distinct, yet interconnected sections. The first stresses the vital role of real-world data for the design and refinement of occlusion-robust face recognition models. Our experiments demonstrate the increased challenges posed by real-world occlusions in comparison to their synthetic counterparts. This insight allows us to gauge the performance and limitations of various model architectures under different occlusion conditions. The second section presents a novel, occlusion-robust, and occlusion-aware face recognition system, designed to increase performance on occlusions caused by sunglasses and masks, with minimal impact on generic face recognition performance. The system incorporates an occlusion-robust face recognition model, an occlusion-aware model, and an innovative layer integrating the outputs of these models to minimize occlusion effects. This unique configuration ensures the system's resilience to occlusions, focusing less on occluded regions and more on overall facial recognition. This thesis provides a thorough investigation of the challenges presented by occluded face recognition and proposes an innovative solution to them. It underscores the necessity of utilizing real-world data for developing robust face recognition models and introduces a novel occlusion-aware face recognition system. This work has the potential to significantly enhance the performance of occluded face recognition methods in various real-world scenarios.
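As a rough illustration of how an occlusion-aware signal can modulate the final representation, the sketch below blends the embeddings of an occlusion-robust model and a generic model according to an occlusion probability. The fusion rule, weighting scheme, and names are illustrative assumptions, not the integration layer actually proposed in the thesis.

```python
import numpy as np

def fuse_embeddings(emb_robust, emb_generic, occlusion_prob):
    """Toy occlusion-aware fusion: the more occluded the face appears
    (occlusion_prob -> 1), the more the representation relies on the
    occlusion-robust model; unoccluded faces lean on the generic model."""
    w = float(np.clip(occlusion_prob, 0.0, 1.0))
    fused = w * np.asarray(emb_robust, dtype=float) + (1.0 - w) * np.asarray(emb_generic, dtype=float)
    return fused / (np.linalg.norm(fused) + 1e-12)  # unit norm for cosine matching

# Example: a heavily masked face leans almost entirely on the robust embedding.
print(fuse_embeddings([0.1, 0.9], [0.8, 0.2], occlusion_prob=0.95))
```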
-
ÖgeOrder dispatching via deep reinforcement learning(Graduate School, 2022) Kavuk, Eray Mert ; Kühn Tosun, Ayşe ; 712817 ; Department of Computer EngineeringIn this thesis, the unique order dispatching problem of Getir, a retail and logistics company, has been studied. Getir serves many cities in multiple countries, and its service area is expanding day by day. Getir, which serves thousands of customers every day across many different business lines, is the market pioneer in this field. This thesis focuses on ultra-fast delivery, the company's first and best-known service area, which Getir pioneered worldwide. The aim of Getir's ultra-fast delivery business model is to deliver orders to its customers within minutes. In this business model, orders are fulfilled from the company's warehouses. Completing order delivery in such a short time is a very challenging goal. Achieving the ultra-fast delivery goal becomes a real problem due to traffic congestion and high numbers of orders at certain times of the day or on certain days of the week. In addition, due to the Covid-19 pandemic and changing customer habits, people increasingly prefer home delivery and online shopping. For this reason, serious changes can be observed in the expected number of orders on a daily and weekly basis. Previously unknown curfews or other restrictions cause changes in the expected number of orders and their content. Therefore, it is not possible to predict these changes with data analysis and estimation methods. For these reasons, an order dispatching algorithm that can adapt to changing conditions is vital. In the ultra-fast delivery model, the goal is to serve as many customers as possible within the predetermined and promised time. Orders can be placed at any time during the working hours of the warehouses in the customer's service zone. Incoming orders are accepted or rejected according to the order density of the relevant warehouses in the region and the courier shift plans. For this decision-making algorithm, we recommend using a deep reinforcement learning approach instead of a rule-based structure that does not violate constraints. We argue that the algorithm should be able to keep up with the growth rate of Getir, a fast-growing company, and adapt to the different characteristics of the regions. Before presenting the deep reinforcement learning methods that can be applied to this problem, we describe Getir's problem and one of the methods currently used by the company. We discuss the problems, limitations, and shortcomings of the method in use. We compare and highlight the differences between the proposed method and the current method. We measure the success of the approaches by comparing the proposed methods and the currently used methods on actual order data. In the ultra-fast delivery business model, the aim is to deliver the order to the customer within 10-15 minutes.
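As a minimal illustration of the accept/reject decision described above, the sketch below defines a small Q-network that maps a state vector to Q-values for the two actions and picks the greedy one. The state features, network sizes, and class names are illustrative assumptions; they do not represent Getir's actual model or training setup.

```python
import torch
import torch.nn as nn

class DispatchPolicy(nn.Module):
    """Toy deep RL accept/reject head for an incoming order.

    The state vector could encode warehouse order density, courier shift plans,
    time of day, and promised-delivery slack (illustrative features only).
    Action 0 = reject, action 1 = accept.
    """

    def __init__(self, state_dim: int = 16):
        super().__init__()
        self.q = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 2),  # Q-values for reject / accept
        )

    @torch.no_grad()
    def decide(self, state: torch.Tensor) -> int:
        return int(self.q(state).argmax().item())  # greedy decision at dispatch time

# Example: decide on a single incoming order from a (placeholder) state vector.
policy = DispatchPolicy()
print(policy.decide(torch.rand(16)))  # 0 (reject) or 1 (accept)
```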