LEE - Music Graduate Program

Permanent URI for this community

http://hdl.handle.net/11527/19406

Computational harmonic analysis with rhythmical weights

(Graduate School, 2022-08-16) İkeda, Ayşe Ruhan ; Karadoğan, Can ; Mazzola, Guerino ; 409132003 ; Music

Analysis of harmony is the first step in analyzing Western music of the common practice period (Baroque, Classical and Romantic), because in these genres musical structure is aligned with tonal motion and organicity is created primarily by harmony. The analysis includes finding regions of tonality/key, labeling chords and cadences, assigning functions to chords, finding prolonged harmonic functions and consequently forming a tree-like hierarchy. The result of this effort is the discovery of the harmonic motion and of how musical entities function within this motion. This is how a music theorist analyzes harmony-based polyphonic music, i.e., music of the common practice period. A chord's function in its tonal context has an emotional projection that is perceived by listeners: a harmonic tension that rises or falls, is held in suspension or is resolved. A rise in harmonic tension raises an expectation of resolution, and its resolution is an emotional relief. The fine balance between increase and decrease in harmonic tension through time is perceived by the emotionally sensitive listener. This ebb and flow in tension is a critical determinant of the aesthetics of harmonic language. In this thesis, we describe an algorithm for harmonic analysis of polyphonic music and demonstrate its implementation on the RUBATO Composer music composition and analysis environment. Our harmonic analysis model completes Riemann's unfinished program by assigning a function to any chord (not only triads and sevenths) based on the pitch content of the chord and the harmonic tension created between consecutive chords. As background for music modeling, we give an overview of mathematical approaches to music analysis at the symbolic level (i.e., note level and above), and then we examine how computational power can be used for modeling and analysis of music and musical processes.
Then, we review similarities and differences between music and language, as well as methodologies for musical structure analysis borrowed from linguistics research, such as grammars and parsers. In Chapter 3, we give an overview of the RUBATO Composer music composition and analysis environment, including its historical line of development. We summarize its mathematical pillars and its software architecture. We also describe a number of rubettes that we designed and programmed on RUBATO for computational analysis purposes: one that enables mixing of weights, one that translates a MIDI file into a MIDI denotator, and one that trims MIDI files. We also explain a rubette that translates harmonic analysis output to LilyPond, a music typesetting format. In Chapter 4, we give our motivation for computational harmonic analysis by reviewing related concepts such as tonality, harmony and tonal tension, together with computational models for the analysis of tonal tension and harmony. In Chapter 5, we describe our mathematical and computational models for the analysis of harmony and their software implementation. During this thesis, we added some components to the Computational Harmonic Analysis Network, a suite of rubettes for analyzing harmony. The additions are an implementation of the Viterbi algorithm for optimum path computation and the direct-thirds method for Riemann Matrix computation. The core of the thesis, however, is the ability to analyze harmony using the metric/rhythmic information of musical events as well. Our harmonic analysis model had previously assumed that all chords have the same metric importance. However, as musicians we know that meter imposes a hierarchy on the perception of musical events in time, i.e., not every instant and not every beat is equally important during perception. Thus, we extended our model to include the metric importance of musical events.
In Chapter 6, we review recent neuroscience research on the perception of time and periodicity. Then, we focus on temporality in the perception of music. We consider meter and rhythm as the skeletal system that spans time, whereas melody and harmony are the flesh over the bones. This ontological order, in which meter and rhythm are primordial, is relevant for a vast majority of musical genres, including music of the common practice period. Then, we give an overview of the metrical analysis algorithm based on Mazzola's metric analytics, which reveals local (inner) meters in music, and of its implementation as the new MetroRubette. Finally, in Chapter 7, we describe a computational model for harmonic analysis of music in which, next to the pitch content and temporal position of neighboring chords, the metric position of chords is also considered. Music is given to the analysis algorithm at a symbolic level as a MIDI file. We explain the new algorithm in detail and give sample analyses produced by its software implementation on RUBATO Composer. We compare harmonic analyses with and without metrical proximity, examine their differences and discuss the results.
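
The abstract above mentions a Viterbi algorithm for optimum path computation and an extension that takes the metric importance of chords into account. The following is only a minimal sketch of that general idea, not the thesis's actual model: the function labels, the cost values, the `step_cost` transition function and the way metric weights scale the local costs are all assumptions made for illustration.

```python
def viterbi_min_cost(local_costs, transition_cost, weights=None):
    """Return (labels, total_cost) of the cheapest path through a chord sequence.

    local_costs: one dict per chord, mapping a candidate label to a cost.
    transition_cost: function (previous_label, label) -> cost.
    weights: optional per-chord multipliers applied to the local costs
             (a hypothetical stand-in for metric weights).
    """
    n = len(local_costs)
    if n == 0:
        return [], 0.0
    if weights is None:
        weights = [1.0] * n

    # Cheapest cost of any path ending in each label of the first chord.
    best = {label: weights[0] * cost for label, cost in local_costs[0].items()}
    back = []  # back[i - 1][label] = best predecessor of `label` at chord i

    for i in range(1, n):
        new_best, pointers = {}, {}
        for label, cost in local_costs[i].items():
            prev = min(best, key=lambda p: best[p] + transition_cost(p, label))
            new_best[label] = (
                best[prev] + transition_cost(prev, label) + weights[i] * cost
            )
            pointers[label] = prev
        best = new_best
        back.append(pointers)

    # Trace the cheapest path backwards from the best final label.
    last = min(best, key=best.get)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


def step_cost(previous_label, label):
    """Hypothetical transition cost: changing function costs 1, staying costs 0."""
    return 0.0 if previous_label == label else 1.0


# Hypothetical costs for three chords with candidate functions T and D.
local_costs = [
    {"T": 0.0, "D": 2.0},
    {"T": 1.0, "D": 0.0},
    {"T": 0.0, "D": 3.0},
]

unweighted = viterbi_min_cost(local_costs, step_cost)
weighted = viterbi_min_cost(local_costs, step_cost, weights=[1.0, 3.0, 1.0])
```

In this toy input the middle chord fits D slightly better than T. Without weights the cheapest path keeps T throughout, whereas giving the middle chord a larger weight makes its local fit matter more and the path switches to D there.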

Evaluating the functionality of attributes in 3D sound with semantic differential scale on three dimensions: Importance, comprehensibility and noticeability

(Graduate School, 2024-08-01) Şahin, Laçin ; Karadoğan, Can ; 409152009 ; Music

This dissertation proposes a methodology for selecting and ranking sound attributes for 3D sound according to their functionality. Functionality is operationalized as a semantic differential scale with three dimensions: importance, comprehensibility, and noticeability. A group of 20 expert assessors is presented with 51 attributes. They grade these attributes on the provided scale while listening to 3D sound excerpts. The dimensions are assigned weights derived from the results of a prior experiment. Using these weights, a weighted average score is calculated for each attribute. This weighted average is labeled the functionality score, and the attributes are ranked from 1 to 51 according to it. Finally, the 20 most functional attributes are discussed in more detail.
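
The scoring procedure described above can be sketched as follows. This is an illustrative reading only: the abstract does not state how assessor grades are aggregated, so averaging grades per dimension is an assumption, and the attribute names, grades and weights below are hypothetical values, not data from the dissertation.

```python
from statistics import mean

DIMENSIONS = ("importance", "comprehensibility", "noticeability")


def functionality_score(grades, weights):
    """Weighted average of the per-dimension mean grades for one attribute."""
    total_weight = sum(weights[d] for d in DIMENSIONS)
    weighted_sum = sum(weights[d] * mean(grades[d]) for d in DIMENSIONS)
    return weighted_sum / total_weight


def rank_attributes(all_grades, weights):
    """Return (attribute, score) pairs sorted from most to least functional."""
    scores = {
        name: functionality_score(grades, weights)
        for name, grades in all_grades.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical dimension weights and assessor grades (two assessors each).
weights = {"importance": 0.5, "comprehensibility": 0.3, "noticeability": 0.2}
all_grades = {
    "distance": {
        "importance": [4, 4],
        "comprehensibility": [6, 6],
        "noticeability": [3, 3],
    },
    "envelopment": {
        "importance": [6, 7],
        "comprehensibility": [5, 5],
        "noticeability": [4, 6],
    },
}

ranking = rank_attributes(all_grades, weights)
```

Dividing by the sum of the weights keeps the score on the original grading scale even if the weights do not add up to one.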

Multipart music transcription using deep neural networks

(Graduate School, 2025-04-17) Germen, Emin ; Karadoğan, Can ; 409072004 ; Music

This research presents a comprehensive framework for automatic music transcription, specifically designed to replicate the auditory capabilities of a "trained ear" in identifying and interpreting complex musical interactions. Traditional Turkish instruments, Qanun and Oud, are used as the focal point of this study, addressing challenges associated with polyphonic music transcription in non-Western musical contexts. Using a foundational corpus and advanced machine learning models, the research aims to bridge the gap between traditional auditory analysis and contemporary computational approaches. The study emphasizes the importance of crafting a robust yet basic corpus capable of simulating essential auditory tasks while capturing the unique timbral and harmonic characteristics of these instruments. A pivotal aspect of this research is the development of a specialized corpus designed to emulate the core perceptual abilities of a trained ear. The corpus incorporates systematic combinations of musical notes played by Qanun and Oud, including sustained tones, chromatic sequences, and randomized patterns. These combinations simulate a wide spectrum of musical scenarios, encompassing monophonic and polyphonic textures as well as overlapping harmonic interactions. Despite its basic design, the corpus provides a detailed representation of the dynamic interplay between the two instruments, enabling computational models to learn critical aspects of pitch recognition, timbral distinction, and harmonic understanding. The corpus generation process begins with the systematic recording of individual notes and their combinations. Each recording captures the transient and sustained qualities of the Qanun and Oud, highlighting their contrasting timbres, the bright and resonant sound of the Qanun versus the dark and mellow tone of the Oud. 
This structured approach ensures that the dataset reflects real-world auditory challenges, such as identifying simultaneous pitches and distinguishing between overlapping harmonic structures. The inclusion of randomized patterns introduces an element of variability, further improving the corpus's ability to mimic real-world musical performances. To analyze and transcribe the complex interactions captured in the corpus, a Deep Neural Network (DNN) and a Convolutional Neural Network (CNN) were developed. These models are trained using a carefully curated feature set that includes the Short-Time Fourier Transform (STFT), the Constant-Q Transform (CQT), Spectral Centroids (SC), and the Band Energy Ratio (BER). Each feature contributes to a holistic representation of the audio signals, capturing their temporal, spectral, and energetic characteristics. The integration of these features enables the models to extract meaningful information from the data, such as note onset times, harmonic structures, and timbral nuances. The DNN architecture consists of six layers, each optimized to handle the multidimensional nature of the input data. Its ReLU activation functions and softmax output layer allow the model to classify 37 distinct musical notes across three octaves. The CNN model, in turn, uses its convolutional layers to analyze spectrogram images, offering an alternative approach to learning musical patterns. The CNN architecture is particularly effective at identifying features in visual representations of audio signals, such as pitch contours and harmonic structures, making it a valuable complement to the DNN. Transient detection and onset analysis are critical components of this framework, providing the temporal precision necessary for accurate music transcription. Transients, characterized by rapid changes in amplitude and frequency, mark the beginning of new sound events, such as the attack phase of a note.
Onset analysis further refines this process by pinpointing the exact start times of these events, enabling the models to capture intricate rhythmic and melodic details. In traditional instruments such as the Oud and Qanun, the acoustically limited sustain durations often led to misinterpretation by the model: sustained notes, despite being musically longer, were incorrectly classified as rests. To address this issue, a heuristic method was developed. Using data-driven statistical analysis, the time segments misclassified as rests were reinterpreted to better align with plausible note durations. As a result, note lengths were represented more realistically, leading to a notable improvement in overall transcription accuracy. The proposed framework has shown significant success in transcribing two-part music played by Qanun and Oud, achieving high accuracy in pitch and timbral recognition. The corpus, though basic in construction, has proven effective in capturing the essential harmonic and melodic characteristics of these instruments. This work provides a solid foundation for further advances in the transcription of more complex musical frameworks, such as Maqam music, which features intricate microtonal scales. The research has broad implications for the fields of musicology, auditory science, and machine learning. By bridging traditional musical practices with modern computational tools, the framework contributes to the development of culturally informed auditory systems, advancing the field of automatic music transcription. Furthermore, the corpus and models developed in this study can serve as valuable resources for musicians, educators, and researchers, fostering a deeper understanding of diverse musical traditions and enhancing the accessibility of non-Western music in digital formats.
This study demonstrates the potential of combining basic corpus design with advanced machine learning techniques to achieve robust and accurate music transcription. By focusing on Qanun and Oud, the research highlights the importance of culturally specific datasets in addressing the unique challenges of non-Western music transcription. The proposed framework not only replicates the critical auditory capabilities of a trained ear, but also provides a scalable foundation for future research in complex musical systems. Through this work, significant progress has been made in bridging the gap between traditional auditory analysis and modern computational approaches, offering new avenues for exploring and preserving the rich diversity of the global musical heritage.
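
The abstract does not specify the heuristic used to reinterpret segments misclassified as rests. One simple possibility, shown here purely as an illustration, is to absorb a rest into the note that precedes it whenever the rest is no longer than some threshold. The event representation (pitch number or `None` for a rest, duration in beats) and the `max_rest` threshold are assumptions; in a data-driven setting the threshold might be estimated from the duration statistics of the corpus.

```python
def absorb_short_rests(events, max_rest):
    """Merge each short rest into the note that immediately precedes it.

    events: list of (pitch, duration) pairs, where pitch is None for a rest.
    max_rest: longest rest (same unit as duration) treated as a decayed note.
    """
    merged = []
    for pitch, duration in events:
        follows_note = bool(merged) and merged[-1][0] is not None
        if pitch is None and follows_note and duration <= max_rest:
            previous_pitch, previous_duration = merged[-1]
            merged[-1] = (previous_pitch, previous_duration + duration)
        else:
            merged.append((pitch, duration))
    return merged


# Hypothetical transcription output: MIDI pitch numbers, durations in beats.
events = [(60, 0.5), (None, 0.5), (62, 1.0), (None, 4.0), (64, 1.0)]
cleaned = absorb_short_rests(events, max_rest=1.0)
```

Here the half-beat rest after the first note is folded into that note, while the four-beat rest exceeds the threshold and is kept as a genuine rest.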

Narrative contribution of immersive audio sound design in audio-visual works

(Graduate School, 2022) Murtezaoğlu, Yusuf Can ; Karadoğan, Can ; 733200 ; Music Programme

Audio-visual works, whether films, TV shows, documentaries, animations, live or recorded performing arts, or entirely abstract works of artistic representation, often rely heavily on sound. It can be argued that this reflects how the human senses work in tandem to make sense of reality. This reliance on observing and comprehending with the ears puts the emphasis on sound design to take charge of structuring a world, describing a scene, enhancing what is seen and, collectively, furthering the storytelling. With advances in technology and know-how, this effect has become more prominent over time. One of the most apparent re-imaginings of the aural part of audio-visual media came with an approach called immersive audio. This is something of an umbrella term that encompasses any form of soundstage reproduction with spatial components. It was a revolutionary step towards a more enhanced experience for the viewer. It also opened many creative paths, since there are multiple ways to achieve a given result. Music, dialogue, sound effects and other miscellaneous audio events that come not only from the screen (or from a stereoscopic array very close to the screen) but possibly from all around the audience's sphere of aural perception stimulated the industry towards experimentation and creativity. In the decades following the invention of this approach, many standards were implemented in the recording, editing and playback phases of production. As in all things, there are advantages and disadvantages to an approach that is developing so fast and is prone to change. Most attempts to standardize it have either changed and evolved into something else or become obsolete. Some, however, have emerged as dominant and prominent forms with promising potential for future use. One of these is ambisonics.
Of the many benefits of working with ambisonics, a few stand out and are in fact the reason this study was carried out with it. Its ease of use, accessibility and future-proofing are industry-leading, to say the least. Its scalability and its expanded and enhanced playback options are also arguably unmatched. The main purpose of this thesis is to explore the possible narrative contributions of immersive sound design, made with the ambisonics method, to an audio-visual work. This is achieved through both thought experimentation and a highly repeatable scientific observation approach.

Top plate vibration analysis of the kanun instrument

(Institute of Social Sciences, 2020) Ömeroğlu, Cem ; Karadoğan, Can ; 409032001 ; Music

This study focuses on the vibration analysis of the top plate of the kanun. Its aim is to identify the natural vibration frequencies of the top plate, and to predict them during the design process, with the help of a fully working three-dimensional physical model whose validity is demonstrated within this same study. In this way, the frequency spectrum of the instrument's top plate is analyzed and evaluated together with the model results, not only for spruce and plane wood but also for metal and composite materials. The method begins with experimental work based on measuring the natural vibration frequencies through an impact experiment known as the hammer test. The top plates were struck with a hammer, and the impact responses were recorded into computer software via an accelerometer. Multiple discrete points on the plates were investigated, and these data were set aside and stored for use in the next stage, the validation of the three-dimensional physical model. The experimental work thus served as a reference point for ensuring that the physical model works correctly and consistently, and for showing that it approximates real-world conditions as closely as possible. Different top plates were physically modeled, their material properties were defined, and the results were computed in software according to the free vibration modes of the three-dimensional physical model. Once the model computations and the experimental results agreed and had been cross-checked against the free vibration modes, the physical model was confirmed to be safe to use in the subsequent work. Next, the fixed vibration modes were studied for both woods, and the results were evaluated as follows. Comparing the fixed vibration modes of the two woods, plane wood (22) was observed to have more overtone content than spruce (18) within the frequency range of the instrument.
This result may explain why a plane-wood top plate has greater potential than spruce in terms of loudness and sound radiation. Even a change on the order of centimeters in the plate dimensions that define the geometry affects the natural vibration frequencies. Consequently, the overtone content of the sound and its radiation intensity can be expected to vary within the sound field, depending on the amplitudes. The dimensions determine the wavelengths; therefore, when the dimensions change while the speed of sound stays constant, a change in the natural vibration frequency can be expected. When various top plates were examined at the production stage, variability was observed in their weights, and hence in their densities, even though all plates had almost the same geometry and dimensions. In short, the density and relative moisture content of the wood strongly affect the natural vibration frequency. The speed of sound in solids depends on Young's modulus and density. Accordingly, when only the density parameter decreases, the natural vibration frequencies were observed to increase. Spruce and plane-wood top plates resonate in different regions of the frequency spectrum. Here again, the similarity of the geometries can be emphasized. In addition to density, Young's modulus, the modulus of rigidity and Poisson's ratios jointly determine these natural vibration frequencies. Materials that could serve as alternatives to wood for the top plate were also investigated through physical modeling. Al 3003-H18 is presented at this stage as the metal, and the GFRP and CFRP Toray materials as the composites. All materials were compared in terms of plate thickness and the number of natural frequencies falling within the instrument's frequency range, and their material properties are also reported. Shape and geometry were studied and presented through two alternatives, a top plate with a single hole and one with three holes.
As a result, the single-hole top plate made of the GFRP Toray material yielded the maximum number of harmonics (25) among the natural frequencies within the instrument's frequency range. The formulas and algorithms among the parameters of the finite element method used in the model, together with changes that may be made in later stages, can also be expected to serve as physical modeling tools for digital sound processing and synthesis.
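
The relationship between density, Young's modulus and natural frequency noted in the abstract can be illustrated with the textbook expression for the longitudinal speed of sound in a thin solid bar, c = sqrt(E / rho). This is a rough sketch under strong simplifying assumptions: wood is orthotropic and the thesis uses a finite element model, so the isotropic formula, the assumption that natural frequencies scale with c at fixed geometry, and all numeric values below are illustrative only.

```python
import math


def bar_sound_speed(youngs_modulus_pa, density_kg_m3):
    """Longitudinal speed of sound in a thin isotropic bar: c = sqrt(E / rho)."""
    return math.sqrt(youngs_modulus_pa / density_kg_m3)


def frequency_after_density_change(frequency_hz, old_density, new_density):
    """Rescale a natural frequency when only the density changes.

    Assumes fixed geometry and stiffness, so frequency is proportional to
    1 / sqrt(density).
    """
    return frequency_hz * math.sqrt(old_density / new_density)


# Hypothetical material values, not measurements from the thesis.
speed = bar_sound_speed(youngs_modulus_pa=10e9, density_kg_m3=400.0)
shifted = frequency_after_density_change(200.0, old_density=484.0, new_density=400.0)
```

With these made-up numbers the speed is 5000 m/s, and lowering the density from 484 to 400 kg/m^3 raises a 200 Hz mode to about 220 Hz, which matches the direction of the effect reported in the abstract.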
