LEE - Electrical Engineering - Master's


Recent Submissions

Now showing 1 - 5 of 19
  • Item
    Evaluation of dielectric performance of high-temperature vulcanizing silicone rubber samples
    (Graduate School, 2023-01-19) Bilgiç, Taylan Özgür ; Kalenderli, Özcan ; 504191050 ; Electrical Engineering
    Electricity has become a must-have rather than a need in our current times. In addition to holding a very important place in people's daily lives, it is also a great need in industrial facilities. An unplanned power outage causes huge financial losses for industrial facilities. Therefore, it is necessary to minimize power outages and to ensure a continuous generation, transmission and distribution of electricity. One of the reasons for power cuts is due to the material used. Interruptions occur due to faults in the distribution and transmission networks of electricity from the generation stage until it reaches more users. In the selection of the materials used here, it is necessary to choose according to the place and conditions where they will be used, and attention should be paid to their lifetime. In addition, when the materials used have better properties, these new materials should be used to prevent future failures. Insulators are used in transmission lines to provide insulation between the energy part and the ground. If there is a problem in one of the insulators in a transmission line, high short-circuit current will be drawn as there will be a short-circuit and a malfunction will occur in the system. This brings about the necessity to pay attention to the lifetime of the insulators and to be aware of the innovations. For this reason, traditional insulators, which are ceramic and glass ones, are replaced by silicone insulators. Silicone insulators are preferred because of their hydrophobic properties, their lightness, their resistance to impacts, their cheapness, ease of installation, protection of their properties at wide temperatures and electrical resistance. Malfunctions in insulators are generally caused by short-circuit currents due to environmental conditions, namely weather conditions such as rain, fog and snow. 
The reason is that dirt accumulated on the insulator surfaces, together with the water deposited on the surface by these weather conditions, creates a conductive path. Once this conductive path forms, a short circuit occurs and short-circuit currents flow. Silicone insulators can help prevent this thanks to their hydrophobic properties. On a silicone insulator that has not lost its hydrophobicity, water does not form a continuous path on the surface; it runs off drop by drop. In this way, the formation of short-circuit current is prevented. In this study, high-temperature vulcanizing (HTV) silicone rubber samples were investigated in 3 different experimental setups. The first experiment is the inclined plane test, which examines the tracking and erosion resistance of HTV silicone samples. The experiment was carried out with 3 voltage types, AC, –DC and +DC, and the results were compared. For AC, –DC and +DC, voltages of 4.5, 3.15 and 2.45 kV were tested, respectively. The pre-resistances and the contaminant liquid flow rate were determined according to these voltage levels. A total of 5 samples were tested simultaneously in the experiment. In addition, the temperatures of the samples were measured over 6 hours with a thermal camera, and leakage current data were recorded using LabVIEW. The second test was the corona discharge test, in which the hydrophobicity of HTV silicone samples was investigated. AC, –DC and +DC voltage types were again tested. The voltage level required to create a corona discharge was found through trials: 5 kV was applied for AC voltage and 21 kV for –DC and +DC voltage. In addition, tests were carried out at different temperatures and different pressures to examine the effect of ambient conditions on hydrophobicity. 
For each test, 2 samples were used, and corona discharge was applied with needle electrodes at 3 points marked on each sample surface. While the discharge was applied to these 6 points, and afterwards during the recovery of hydrophobicity, water drops were placed on the surfaces at different times and photographed. In these photographs, the change in hydrophobicity was examined by measuring the angle between the drop and the surface with the help of a software program. This change was examined first as loss and then as recovery. The third test was the dynamic drop test. In this test, the hydrophobicity of HTV silicone samples was also investigated, again with AC, –DC and +DC voltage types. A voltage of 6 kV was applied for all 3 voltage types. Five samples were used for each test. In this test, the samples are subjected to electrical stress by means of 2 electrodes while a liquid runs over their surface. As a result of the electrical stress, the samples lose their hydrophobic properties over time. At first, while the samples are still hydrophobic, no accumulation or water path forms on the surface during the liquid flow. As time passes and they start to lose their hydrophobicity, water drops form on the sample surface. Then, when they completely lose their hydrophobicity, a water path is formed. The innovative approach of this study is to use 3 different tests to examine the properties of HTV silicone rubber samples and to perform these tests with AC, –DC and +DC voltage types. A more important innovation is that the corona discharge test is performed at different temperatures and humidity levels to examine the effect of ambient conditions. Insulators in transmission and distribution lines are located in the open air and are affected by changes in weather conditions. 
By performing tests at different temperatures and humidity values and examining the hydrophobic behavior of the samples, information can be obtained about the hydrophobicity of silicone insulators in various climates, including conditions typical of seasons such as summer and winter. In the inclined plane test at 4.5 kV AC voltage, all 5 samples lasted 6 hours and passed the test. The average temperature of the 5 samples was 81.5 °C, and the average of their maximum temperatures was 113 °C. The highest temperature, 133 °C, was reached by the 2nd sample. The average mass loss of the 5 samples was 0.0496 grams. In the inclined plane test at 3.15 kV negative DC voltage, all 5 samples survived for 6 hours and passed the test. The average temperature of the 5 samples was 242.81 °C, and the average of their maximum temperatures was 549.45 °C. The 3rd and 4th samples reached 670.09 °C, which is the highest temperature that can be measured. The average mass loss of the 5 samples was 0.0828 grams. In the inclined plane test at 2.45 kV positive DC voltage, only the first sample survived for 6 hours and passed the test. The other 4 samples failed in less than two and a half hours because their erosion length exceeded the value specified in the standard. The first sample, with an erosion length of 2.45 cm, stayed just below the 2.5 cm limit. However, the greatest mass loss, 1.23 grams, occurred in the 1st sample, because it eroded considerably in the transverse as well as the longitudinal direction. The average mass loss of the 5 samples was 0.85 grams. The average temperature of the 5 samples was 98.95 °C. 
The average of the maximum temperatures of the 5 samples was 648.37 °C, and the 1st sample had the lowest maximum temperature, 602.64 °C. As these results show, the best results were obtained at AC voltage and the worst at +DC voltage. In the corona discharge test (CDT), recovery of hydrophobicity for the HTV silicone rubber samples was best for all 3 voltage types under the high-temperature condition, i.e. 30 °C and 54% humidity. For recovery of hydrophobicity, the worst case for all three voltage types was the low-temperature condition, i.e. 18 °C and 54% humidity. For hydrophobicity loss, the worst ambient condition was high temperature for all three voltage types. The best condition for loss of hydrophobicity at AC and positive DC voltage was low humidity, i.e. 24 °C and 45% humidity. The best condition for loss of hydrophobicity at negative DC voltage was low temperature. Although the samples tested at high temperature gave the worst results in terms of hydrophobicity loss, the loss rate is lower than the recovery rate. In other words, the loss is greater, but the recovery is greater still. In the dynamic drop test at AC voltage, the shortest time to loss of hydrophobicity was 116 minutes for the 2nd sample, the longest was 212 minutes for the 4th sample, and the average over the 5 samples was 157.4 minutes. At negative DC voltage, the shortest time was 45 minutes for the 2nd sample, the longest was 239 minutes for the 4th sample, and the average over the 5 samples was 124.2 minutes. At positive DC voltage, the shortest time was 75 minutes for the 5th sample, the longest was more than 720 minutes for the 2nd and 3rd samples, and the average over the 5 samples was 387.2 minutes. As these results show, the best results were obtained at +DC voltage and the worst at –DC voltage. 
At AC voltage, the times at which the samples lose their hydrophobic properties are close to one another, with the lowest standard deviation, 42.34. Although the best results are obtained at +DC voltage, the times at which the individual samples lose their hydrophobic properties differ greatly.
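
The contact-angle measurement described in this abstract can be sketched as follows. The abstract does not name the program or the fitting method used, so this is only a minimal sketch that assumes the common half-angle (spherical-cap) approximation, theta = 2 * atan(2h / w), where h is the drop height and w its base width. The function names and the 90 degree threshold are illustrative assumptions, not taken from the thesis.

```python
import math


def contact_angle_deg(drop_height, base_width):
    """Estimate the contact angle (degrees) of a sessile drop.

    Assumes the drop profile is a spherical cap (half-angle method):
    theta = 2 * atan(2 * h / w). Height and width must share one unit.
    """
    if drop_height <= 0 or base_width <= 0:
        raise ValueError("drop height and base width must be positive")
    return math.degrees(2.0 * math.atan(2.0 * drop_height / base_width))


def is_hydrophobic(angle_deg, threshold_deg=90.0):
    """Treat angles above the threshold as hydrophobic (assumed convention)."""
    return angle_deg > threshold_deg
```

Under this approximation a hemispherical drop (height equal to half the base width) gives 90 degrees; a flatter drop gives a smaller angle, which would indicate loss of hydrophobicity.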
  • Item
    Improved tracking algorithm for rooftop pv systems employing multi-input DC-DC converter
    (Graduate School, 2023-01-27) Bayraktar, Gökhan ; Yıldırım, Deniz ; 504191021 ; Electrical Engineering
    The energy need of humankind has been increasing rapidly with the population and consumption increment. Various energy production methods have been investigated to meet this need since the beginning of the 20th century. After half of that century, solar energy has become one of the most studied concepts of energy production methods. With the help of economy-politics crises upon oil or natural gas, investment in non-dependent energy types has increased. Solar energy has become one of the invested areas. Here wishful thoughts may be such that the environmental risks of using fossil fuels are also one of the reasons for this tending, but it is not. The solar energy concept consists of three following main parts. Photovoltaic (PV) panels for transforming solar photon energy into DC electrical energy. Power electronics devices for MPPT implementation and manipulating the electrical power according to the load side. Lastly, the load part of the systems can be a DC load, AC load, or the utility grid directly. In this thesis, a study about the power electronics part of the concept has been completed. At the power electronics aspect, the system may have a single DC/AC converter or two-stage with a DC/DC and a DC/AC converter. As known, PV panel characteristics are not linear; therefore, a maximum power point tracking (MPPT) algorithm should be designed to extract the maximum available power from the panel. Also, to transform the PV panel's DC power into AC power, some electronic manipulation should be configured with switching mode power supplies. These main requirements can be provided within a converter that forms the single-stage PV power system or can be divided into two converters to build a two-stage PV power system. Both systems have their benefits and drawbacks. This study's content is designing a DC/DC converter of the two-stage PV power system. 
The main targets of the converter are implementing the MPPT algorithm and boosting the low DC voltage level of the PV panel up to 400V DC level for being transformed into AC voltage for utility grid injection. Additionally, the designed converter accepts four PV panels as its input and applies the MPPT algorithm to each one independently. The converter is named as Collector module. As a result, the Collector module consists of four small power electronics topologies whose outputs are connected in parallel to form the single high 400V DC voltage output. The input of the system (PV panels) can have various parameters between 25V to 50V voltage and up to 400W power. Thus, the total nominal output of the module is 1600W. The reason for this individual MPPT configuration is to eliminate the problems with the string-connected PV panel systems. As known, a PV panel has I-V and P-V curves due to PV cell configuration and environmental aspects such as irradiance strength and temperature. The MPPT algorithm aims to carry the PV panel operation point through these curves and locates the maximum power point. When the PV panels are serial or parallel connected to increase the system's power, these curves change according to connection configuration. However, the system performance degrades significantly if some shading effect or other problem occurs on even a single PV panel. Because in this case, the problematic PV panel is not just a lack of contribution to the total system but also has adverse effects on the power produced by other PV panels. In literature, many MPPT algorithms have been theoretically and practically examined and applied to PV system converters. They have advantages and disadvantages regarding implementation easiness, accuracy, stability, or settling speed towards ambient changes. These aspects can be calculated and predicted with theoretical methods. However, another phenomenon is named "power traps" above the I-V and P-V curves of the system. 
This phenomenon is caused by the interaction of PV panels and power electronics circuits. Its outcome is a distorted non-linear I-V curve. Even though the ideal theoretical curve is not linear, its fundamental property is an inverse relationship: at a given irradiance, PV current should increase when PV voltage decreases. However, once the panel and circuit are integrated, the resulting curve does not follow this rule, especially around the boundary between discontinuous and continuous conduction modes (DCM-CCM). Consequently, when a regular MPPT algorithm is applied to the system, the steady-state operating point is observed to be far from the actual maximum power point. For this reason, an improved version of the Incremental Conductance (InC) algorithm has been developed and applied to each circuit independently by a single microcontroller. This thesis mainly focuses on the system's software structure, such as designing the novel MPPT algorithm and scheduling the required measurements for the four circuits relative to the MPPT calculations. The four circuits are driven with the interleaved technique, with a 45° phase shift between the PWM signals of consecutive circuits. In addition, the collector module's hardware has been designed for this study. Push-pull topology has been used for the four power circuits. The designed module has been tested in various ways. First, the individual power circuits were connected separately to a PV simulator to check the MPPT accuracy. A PV simulator is an analog device whose output characteristic matches that of actual PV panels. This device provides a controllable imitation of a PV panel, so the circuits in the collector module could be tested under various input powers. According to the results, the MPPT efficiencies of all circuits are above 99%. This verifies that the designed MPPT algorithm successfully tracks the maximum power point. 
On the other hand, power transfer efficiency is around 92%-93% for each circuit. Then, all inputs of the collector module were loaded at the same time to verify simultaneous power transfer. Firstly, 4 PV panels are used as inputs. Secondly, 3 PV panels and the PV simulator are used as inputs. In both cases, both MPPT and power transfer efficiency ended up with similar values to the individual test results. Consequently, simultaneous MPPT operation and power transfer are verified with these tests, as well as the availability of using different PV sources simultaneously.
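
For orientation, the textbook Incremental Conductance update that the thesis improves upon can be sketched as below. This is a minimal sketch of the standard rule only (at the maximum power point dI/dV = -I/V); it is not the improved algorithm of the thesis, whose details the abstract does not give. The function names, step size, tolerance and the toy I-V curve are all hypothetical and serve only to exercise the update.

```python
def inc_cond_step(v, i, v_prev, i_prev, v_ref, step=0.1, tol=1e-3):
    """One textbook Incremental Conductance update of the voltage reference."""
    dv = v - v_prev
    di = i - i_prev
    if abs(dv) < 1e-9:
        # Voltage unchanged: react only to a current change (e.g. irradiance shift).
        if abs(di) < tol:
            return v_ref
        return v_ref + step if di > 0 else v_ref - step
    # At the maximum power point dP/dV = 0, which is equivalent to dI/dV = -I/V.
    error = di / dv + i / v
    if abs(error) < tol:
        return v_ref
    # error > 0 means dP/dV > 0: operating left of the MPP, so raise the voltage.
    return v_ref + step if error > 0 else v_ref - step


def toy_pv_current(v):
    """Hypothetical smooth I-V curve, used only to exercise the update rule."""
    return max(0.0, 10.0 - 0.005 * v * v)


def track(v_start, iterations=200, step=0.1):
    """Run the update repeatedly on the toy curve and return the final voltage."""
    v_prev, v = v_start - step, v_start
    for _ in range(iterations):
        v_next = inc_cond_step(v, toy_pv_current(v), v_prev,
                               toy_pv_current(v_prev), v, step)
        v_prev, v = v, v_next
    return v
```

On the toy curve the power peak lies near 25.8 V, and `track(30.0)` or `track(20.0)` settles within a step or two of it. A distorted curve such as the "power traps" described above would violate the assumption of a single smooth peak, which is the situation the thesis's improved algorithm addresses.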
  • Item
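    Konvansiyonel ve mikro şebeke içeren güç sistemlerinde dinamik ekonomik yük ve emisyon dağıtımının sezgisel yöntemlerle analizi [Analysis of dynamic economic load and emission dispatch in power systems with conventional units and microgrids using heuristic methods]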
    (Graduate School, 2022-06-17) Aydın, Esra ; Türkay, Belgin ; 504191019 ; Electrical Engineering
    Population growth and technological development over the years have driven up energy demand. With this increase, the number of electricity generation systems is growing, and power systems are becoming larger and more complex. The growth of power systems that accompanies rising demand makes their economic operation a matter of great importance. In this respect, the Economic Load Dispatch problem, one of the optimization planning tasks of power systems, has become highly important. Economic load dispatch is an economic planning task that aims to minimize fuel cost in thermal power plants. In this context, the output powers of the generating units are planned optimally so that they meet the demanded power at minimum fuel cost. System constraints must be taken into account in this planning: the optimum schedule must be found subject to power balance constraints, generator constraints and ramp rate constraints. Generating units that burn fossil fuels release emission gases into the atmosphere. These gases, also known as greenhouse gases, cause the greenhouse effect in the atmosphere and threaten life on Earth in many ways. Efforts to reduce the concentration of emission gases in the atmosphere have reached a global scale. Considering that power systems are among the largest contributors to emissions, economic emission dispatch, which aims to minimize emissions, has become an important topic. Economic emission dispatch aims to minimize the emission level and is independent of fuel cost. The economic load dispatch problem, by contrast, aims to minimize fuel cost and disregards the emission level. When economic load dispatch and emission dispatch are considered together, a combined economic emission-load dispatch function is formed, and the aim is to minimize both fuel cost and emission level. 
Various optimization methods are used to solve the economic load and emission dispatch problems of power systems. These methods fall into two groups: classical and heuristic. Because the systems are large, heuristic methods are more suitable than classical ones. Advantages such as applicability to complex problems and short solution times have increased the popularity of heuristic methods. Genetic Algorithm, Particle Swarm Optimization, Tabu Search and Artificial Neural Networks are among the heuristic methods most preferred in applications today. In this thesis, dynamic economic load dispatch and emission dispatch of power systems were carried out. The Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) heuristic methods were used to analyze the problems. The algorithms were applied to 5-unit and 10-unit systems and to a system containing a microgrid. The algorithms were coded in MATLAB. For the 5-unit and 10-unit systems, dynamic economic load dispatch, emission dispatch and dynamic economic emission-load dispatch were performed. For the system containing a microgrid, economic load dispatch was performed. The power balance constraint, generator limits, line losses, ramp rate constraints and the valve-point effect were taken into account. The results were compared with findings reported in the literature, and the GA and PSO methods were found to yield better results. When the two methods were compared with each other, the PSO algorithm was found to give more favorable results.
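
The objective that a heuristic such as GA or PSO would evaluate for this kind of problem can be sketched as follows. The thesis code was written in MATLAB and is not reproduced in the abstract, so this Python sketch is only an illustration: it uses the widely used quadratic fuel-cost model with a rectified-sine valve-point term, a quadratic emission model, a simple weighted sum to combine the two, and a penalty for power-balance violations. Line losses are ignored here, and all coefficient names, the weighting scheme and the penalty value are assumptions.

```python
import math


def fuel_cost(p, a, b, c, e=0.0, f=0.0, p_min=0.0):
    """Fuel cost of one unit at output p, with an optional valve-point term."""
    return a + b * p + c * p * p + abs(e * math.sin(f * (p_min - p)))


def emission(p, alpha, beta, gamma):
    """Quadratic emission model (a common simplification, assumed here)."""
    return alpha + beta * p + gamma * p * p


def ramp_ok(p_now, p_prev, ramp_up, ramp_down):
    """Check the ramp rate constraint between two consecutive time intervals."""
    return -ramp_down <= p_now - p_prev <= ramp_up


def combined_objective(outputs, units, demand, weight=0.5, penalty=1e3):
    """Weighted cost-emission objective for one interval (losses ignored).

    units: list of dicts with a "cost" tuple (a, b, c, e, f, p_min)
    and an "emis" tuple (alpha, beta, gamma); both layouts are hypothetical.
    """
    cost = sum(fuel_cost(p, *u["cost"]) for p, u in zip(outputs, units))
    emis = sum(emission(p, *u["emis"]) for p, u in zip(outputs, units))
    imbalance = abs(sum(outputs) - demand)
    return weight * cost + (1.0 - weight) * emis + penalty * imbalance
```

Setting `weight=1.0` reduces the objective to pure economic load dispatch and `weight=0.0` to pure emission dispatch, which mirrors the three problem variants described in the abstract.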
  • Item
    Wide speed sensorless control of PMSM drive with smooth transition between HFSI and extended Luenberger observer
    (Graduate School, 2023-01-18) Avcı, Mustafa Mus Ab ; Öztürk, Salih Barış ; 504191041 ; Electrical Engineering
    It is well-known that the most efficient way to generate mechanical power is to use electric motors. In the beginning, conventional DC motors with commutator excitation are used. However, with the spread of AC power distribution, AC motors have become popular because they have superior efficiency and low maintenance and installation cost. Thus, induction motors have become a dominant factor in the industry. Moreover, although induction motors have satisfied the needs of the industry for a while, the energy shortage and the efficiency criteria caused a reorganization with respect to efficiency classes among electric motors. Thus, using new-generation motors with permanent magnets to create magnetic excitation has become indispensable worldwide. However, the transition to permanent magnet motors requires a motor controller for power electronics. Since, unlike induction motors, electromechanical commutation must be carried out electronically. With the improvements in semiconductor technologies and digital signal controllers, controller chips are becoming more available with low cost and high horsepower. In addition, sophisticated digital signal controllers allow engineers to develop superior control algorithms. At first, AC motors could only be controlled via scalar control. Since only controllable quantities in scalar control are voltage and frequency, naturally controlling bandwidth and dynamical responses are poor compared to vector control schemes. One of the well-known vector control schemes is field-oriented control, which enables control of motor armature and excitation voltages independently and on a vector basis. Thus, new-generation motors are driven with new-generation controllers. Although field-oriented control is one of the suitable control methodologies, it requires geometric knowledge of the position of the rotor flux vector during the operation. Besides, there are direct and indirect sensing operations to achieve rotor position information. 
Although direct sensing of rotor position via a position sensor mounted on the shaft is the simplest way, it has some drawbacks. In particular, position sensors increase total system cost and decrease the reliability of the drive system under environmental conditions such as temperature, moisture, and altitude. Indirect sensing methods, on the other hand, rely on mathematical manipulation and computation using machine quantities and parameters such as voltage, current, resistance, and inductance. Self-sensing methods can be categorized as machine model-based or saliency tracking-based. Machine model-based approaches use observers to obtain the rotor flux position by iteratively computing the machine model state equations. Unlike model-based solutions, saliency tracking solutions inject a signal and detect the rotor position by demodulating the injected frequency after it has been modulated by the motor itself. In summary, although model-based approaches perform well in the medium- to high-speed regions, they have poor or moderate performance and controllability in the low-speed region, because the magnitude of the back electromotive force is proportional to the rotating speed. In contrast, saliency-tracking approaches give a better-quality position estimate at zero, near-zero and low speeds. However, as the rotational frequency increases, signal processing becomes a burden and cannot be handled with the same sampling infrastructure and dynamics. Consequently, hybrid observer structures have become more popular, since each approach is beneficial in a different speed region. A soft transition method must be developed to combine the two self-sensing methods while keeping operation stable in both transient and steady-state conditions. In this thesis, a hybrid sensorless field-oriented control methodology is proposed. 
First, the literature is reviewed in the aspects of machine model-based sensorless algorithms and saliency-tracking approaches. Two of the self-sensing methods are further investigated. One is the Luenberger observer that computes back electromotive force to estimate rotor flux position, and the other is high-frequency signal injection. Also, a soft transition from one method to the other is proposed. Furthermore, verification is made by modeling, analysis, and simulation using the MATLAB®/Simulink® environment. In the simulation, the motor is started with the estimated position referenced by the outcome of the high-frequency signal injection method. Next, beyond a defined transition point, the estimated position reference is changed to the observer algorithm. It is observed that a soft transition is necessary to keep the system under stable conditions. After the hybrid method is realized and implemented, a motor driver is designed via Altium. All component selection, design of gate drive circuits, design of current and voltage measurement circuits, as well as digital and analog interfaces are described. Beyond the hardware design phase, embedded software is developed to run hybrid control algorithms. Besides, embedded software flow and control loop descriptions are detailed. Also, a test bench including two identical motors, a dummy load, and a rectifier circuit for the generator side is prepared for the experiments. Experimental results are recorded, discussed and presented.
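
One generic way to realize a soft transition between two position estimates is a speed-dependent linear blend, sketched below. The abstract does not describe the transition law actually used in the thesis, so this is an assumed illustration only. The function names and the two speed thresholds `w_low` and `w_high` are hypothetical; the angle difference is wrapped so that the blend behaves correctly across the 0 / 2*pi boundary.

```python
import math

TWO_PI = 2.0 * math.pi


def wrap_angle(x):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (x + math.pi) % TWO_PI - math.pi


def blend_weight(speed, w_low, w_high):
    """Return 0 below w_low (injection only) and 1 above w_high (observer only)."""
    if w_high <= w_low:
        raise ValueError("w_high must be greater than w_low")
    ratio = (abs(speed) - w_low) / (w_high - w_low)
    return min(1.0, max(0.0, ratio))


def blended_position(theta_hfsi, theta_obs, speed, w_low, w_high):
    """Blend the injection-based and observer-based rotor angle estimates."""
    k = blend_weight(speed, w_low, w_high)
    # Move from the injection estimate towards the observer estimate
    # along the shortest angular path.
    diff = wrap_angle(theta_obs - theta_hfsi)
    return (theta_hfsi + k * diff) % TWO_PI
```

Below `w_low` the output equals the signal-injection estimate, above `w_high` it equals the observer estimate, and in between the weight changes linearly, so the position reference fed to the field-oriented controller has no step at the handover.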
  • Item
    DNS big data processing for detecting customers behaviour of ISP using an optimized Apache Spark cluster
    (Graduate School, 2022-02-03) Alkhanafseh, Yousef ; Akıncı, T. Çetin ; 504191100 ; Electrical Engineering
    During the past few decades, technology fields, especially the Internet of Things (IoT), have evolved remarkably, which in turn has contributed to a great proliferation of data sources. Unfortunately, at that time, the available data processing tools were insufficient, in terms of variety and advancement, to analyze that huge amount of data in a reasonable time. They suffered from several problems such as slowness, lack of comprehensiveness, limited cluster size, and high expense. These problems constituted major obstacles to progress and achievement in the big data field. Therefore, data was left unused for a while. However, once its enormous benefits, such as making smart decisions, saving time and cost, monitoring servers, improving performance, minimizing hidden correlations, and providing high-quality reports, were closely realized, processing big data started to become prevalent. When dealing with big data, the most famous question that can be asked is "how can big data analysis make the enterprise jobs and business better?". Currently, huge amounts of structured and unstructured data-sets, called big data, have started to be processed by different types of companies such as telecommunications, software and hardware, marketplaces, social media and so on. The current advanced services, hardware, and software have played an important role in promoting big data processing by making its analysis faster, easier and less expensive. It is important to know the difference between big data and traditional data sources. The main difference between them can be clearly seen in data size, types, frequency, capturing speed, and the processing tools used. 
Despite the current advanced technologies, processing exabytes (EB) or even yottabytes (YB) of data efficiently, making optimal use of the system by fully utilizing its features, is still a challenge and needs an expert who has a good mathematical background, knowledge of statistics, and extensive experience in this field. Based on that, this thesis aims to provide a comprehensive approach to setting up a system that consists of three stages, namely collecting, processing, and visualizing a huge amount of DNS data, 1.3 TB daily, using an optimized YARN-based Apache Spark cluster. The process was carried out on two clusters that differ in where they were established. The first was established in the cloud using Amazon Web Services Elastic MapReduce (AWS EMR), and the other was established on local machines using Apache Ambari. Nevertheless, in this project, only the cloud cluster is discussed and reported in detail. The main goal of the cloud cluster is to determine the specifications of the machines needed for the local cluster. Moreover, it made the various Apache Spark configurations easier to understand by trying each of them with different values. Additionally, different structures of Python code, especially related to PySpark, were tried in order to identify the most efficient one. The thesis starts with an extensive introduction that covers subjects such as big data concepts, properties, sources, importance, future, limitations, challenges, and processing tools. 
Moreover, the architecture of the DNS servers used is thoroughly explained by stating their general purpose and working principle. Similarly, under the title of data collecting, the project's main big data, DNS, and the other data-sets used, which are Call Detail Record (CDR), Customer Relationship Management (CRM), Carrier-grade Network Address Translation (CGNAT), and IP-Blocks, are clearly described by presenting a sample of each one in separate tables. All these data-sets are encrypted, and only the concerned authorities can understand their content. Then, an additional data-set that was captured from internet websites is introduced by presenting a sample of it. The web scraping method is discussed as well. There were more than one thousand URLs, which can be classified into almost 31 categories including education, games, VPNs, services, banks, economy, etc. After that, several services that are used to process the data, such as Apache Spark, Yet Another Resource Negotiator (YARN), Hadoop Distributed File System (HDFS), ZooKeeper, and Hive, are briefly examined by explaining their importance, working principle, architecture, and main configurations. Specifically, Apache Spark is the data processing engine in this project. HDFS and Hive, on the other hand, were used as general storage for processed data-sets and metadata, respectively. ZooKeeper is a service used to maintain centralized configuration information and provide distributed synchronization. Other services such as AWS EMR and AWS S3 were also used in this project. AWS EMR is a platform on which Apache Spark clusters can be built. AWS S3 is a cloud storage service that was temporarily used for saving processed data-sets. Next, the differences between the Apache Spark APIs, which are Resilient Distributed Dataset (RDD), DataFrame, and Dataset, are concisely illustrated based on different factors. 
Subsequently, a procedure for optimizing a YARN-based Apache Spark cluster is proposed by explaining the mathematical equations used and giving a detailed example of how to start the Apache Spark object in an optimal way. Both the Apache Spark and the YARN configurations related to application properties, run-time environment and networking, shuffle behavior, compression and serialization, memory management, and execution behavior are described in detail. Next, various data processing experiments were carried out using different cluster sizes, ranging from a small number of machines with small amounts of RAM and vCores to large clusters with many machines and large amounts of RAM and vCores. These clusters were optimized based on the previously stated configurations, and the values shown on both the ResourceManager and the Spark admin interface were exactly the same as the calculated ones for the amount of RAM, number of vCores, number of containers, and parallel tasks, which in turn confirms the efficient use of the available resources. As a result, about 95% of the RAM and CPUs of the clusters were successfully utilized. The results of the experiments, which include input data size, number of operations, execution time, and output data size, are also reported. Based on these results, a local cluster with the same specifications as the most appropriate cluster found in the experiments was established. After that, the output DNS data was grouped according to a specific schema and saved in the compressed Parquet format, which reduces the size of the data approximately four times. Then, it was transferred to an optimized Elasticsearch cluster, which was established in order to run fast queries on the output data and visualize it using an interactive Kibana dashboard. The Elasticsearch cluster includes one master node and two slave nodes. 
The indices of Elasticsearch were properly configured and split into small indices. Also, they were defined in a way that uses only the needed features, which in turn improves and tunes disk performance. The captured visualizations have played a major role in determining useful information such as the status of DNS servers, customer segmentation, distribution of DNS traffic across neighborhoods in Turkey, types of customers, most visited categories, most used URLs, and suitable places for advertising. Finally, an application based on time series forecasting was built. A sample of the output data was prepared for time series forecasting using the Facebook Prophet model, which was selected after trying several models such as autoregression (AR), Seasonal Autoregressive Integrated Moving-Average (SARIMA) and Vector Autoregression (VAR). However, only a comparison between VAR and Fbprophet is discussed in this project. The main target of this prediction is determining the load density of the DNS servers used, giving information about missing data, and providing approximate information about the future of the servers. The models were evaluated by comparing the test data-set with the predicted one and calculating the mean absolute error. It was almost 2.49% for Fbprophet. In short, some of the achievements of this thesis can be summarized as providing solid knowledge about cloud computing systems and different big data processing tools, performing various experiments on clusters with different sizes and resources, establishing a local cluster based on these experiments, transforming 1.3 TB of raw data daily into meaningful information, and building a system for processing new data continuously. Furthermore, this processed, informative DNS data is used in a wide range of fields such as congestion prediction for DNS servers, classifying customers, enhancing the content delivery network of some specific websites, and running successful market advertising campaigns.
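
The kind of calculation involved in sizing executors for a YARN-based Spark cluster can be sketched as follows. The abstract does not give the equations used in the thesis, so this sketch applies a widely circulated rule of thumb instead: reserve one core and some RAM per node for the operating system and daemons, use about five cores per executor, leave one container for the YARN application master, and set aside roughly 10% of each container for memory overhead. The function name, its defaults and the `parallel_tasks` key are assumptions; only the three `spark.executor.*` keys are real Spark properties.

```python
def executor_layout(nodes, vcores_per_node, ram_gb_per_node,
                    cores_per_executor=5, reserved_cores=1,
                    reserved_ram_gb=1, overhead_fraction=0.10):
    """Rule-of-thumb executor sizing for a YARN-based Spark cluster."""
    usable_cores = vcores_per_node - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("not enough vCores per node for one executor")
    # One container is left for the YARN application master.
    total_executors = nodes * executors_per_node - 1
    if total_executors < 1:
        raise ValueError("cluster too small for this layout")
    # RAM available to each executor container on a node.
    container_gb = (ram_gb_per_node - reserved_ram_gb) / executors_per_node
    # Heap size once the off-heap overhead share is set aside (rounded down).
    heap_gb = int(container_gb / (1.0 + overhead_fraction))
    return {
        "spark.executor.instances": total_executors,
        "spark.executor.cores": cores_per_executor,
        "spark.executor.memory": f"{heap_gb}g",
        "parallel_tasks": total_executors * cores_per_executor,  # not a Spark key
    }
```

For example, `executor_layout(10, 16, 64)` describes a hypothetical cluster of ten 16-vCore, 64 GB nodes. Comparing such calculated values with what the ResourceManager and the Spark interface report is the type of consistency check the abstract describes.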