## Stream cipher systems and linear complexity (Dizi şifreleme sistemleri ve doğrusal karmaşıklık)

##### Date

1994

##### Authors

Savaş, Erkay

##### Publisher

Fen Bilimleri Enstitüsü (Institute of Science and Technology)

##### Abstract

Today, very large amounts of data are processed by computers and transmitted over electronic channels that are open to everyone's access. Against the danger of data being intercepted and modified by unauthorized parties, communication security is sought by concealing the data with the help of encryption systems. For anyone who does not hold the secret key, reaching the content of the hidden message must be impossible. The rapid development of computer technology and computational techniques has had both good and bad consequences for data security. Encryption systems that require extremely complex operations can now be carried out quickly and easily with the help of computers. On the other hand, it must be kept in mind that those who wish to break the cipher and reach the content of an encrypted message may also succeed, aided by high technology and computers with superior computing power. For this reason, the designers of encryption systems must first measure the security of their own systems; to do so, they need methods and tools similar to those in the hands of would-be codebreakers. This study examines stream cipher systems which, although not very new, are still very widely used. The conditions necessary for a stream cipher system to be strong and secure are investigated, and how such systems should be designed is discussed. A number of analysis methods and tools that can be useful in the construction of strong stream cipher systems are presented; the most important of these analysis tools is linear complexity. Various algorithms used to compute the linear complexity have been investigated, and faster algorithms have been pursued. The algorithms have been implemented as C++ programs, and these programs run on UNIX machines. The programs can be used directly in the analysis of stream cipher systems in actual use.

While the information age is being celebrated nowadays, the problem of securing and authenticating data has become an inseparable aspect of modern computer and communication systems. More and more data are stored and transmitted by electronic means, and are thereby openly exposed to unauthorized parties. The privacy of individuals as well as of organizations depends on protecting communicated information from unauthorized disclosure and modification. Cryptographic systems provide secrecy through transformations. The sender, under control of the enciphering key, transforms the plaintext message into the ciphertext, which is supposed to be unintelligible to any enemy cryptanalyst without the secret deciphering key. The authorized receiver, who holds the secret deciphering key, retransforms the ciphertext into the original plaintext. Cryptographic systems are commonly classified into block and stream ciphers. Block ciphers divide the plaintext into blocks, usually of fixed size, and encipher each block independently. Block ciphers are therefore simple substitution ciphers and must have large alphabets to prevent cryptanalysis by exhaustive search. Stream ciphers, on the other hand, divide the plaintext into characters and encipher each character with a time-varying function whose time dependency is controlled by the internal state of the stream cipher. After each character is enciphered, the device changes state according to some rule; two occurrences of the same plaintext character will therefore usually not result in the same ciphertext character. One of the most remarkable of all stream ciphers, perhaps of all known ciphers, is the one-time-pad, where the plaintext message is added bit by bit (or, in general, character by character) to a non-repeating random sequence of the same length. The one-time-pad is also referred to as the Vernam cipher in honor of G. Vernam, who developed the principle in 1917 [1] for telegraph communications.
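The bit-by-bit addition of the one-time-pad can be sketched in a few lines of Python (a toy illustration, not part of the thesis; the message and the use of the `secrets` module for the random key are assumptions of this sketch):

```python
# One-time pad (Vernam cipher): XOR the plaintext with a truly random,
# non-repeating key of the same length.  Decryption is the same XOR again.
import secrets

plaintext = b"ATTACK AT DAWN"                      # hypothetical message
key = secrets.token_bytes(len(plaintext))          # key as long as the message

ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
recovered = bytes(c ^ k for c, k in zip(ciphertext, key))

assert recovered == plaintext                      # XOR with the key inverts itself
```

The key must never repeat and must be as long as the message, which is precisely the practical burden the pseudo-random generators discussed below try to remove.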
The remarkable fact about the one-time-pad is its perfect secrecy. Assuming a ciphertext-only attack, Shannon [2] proved that even with infinite computing power and unlimited time, the cryptanalyst could never separate the true plaintext from all other meaningful plaintexts. The disadvantage of one-time-pad cipher systems is the unlimited amount of key needed for perfect secrecy. The appealing features of the one-time-pad suggested building stream ciphers which encipher the plaintext using a pseudo-random sequence, thereby removing the need for unlimited key. This pseudo-random sequence is generated, under the control of a secret key, by a deterministic algorithm called the key stream generator. In a stream cipher system, it is assumed that a persistent eavesdropper will obtain some segment of the plaintext and thereby a segment of the pseudo-random sequence. For a stream cipher to be secure against this type of attack, the pseudo-random sequence must be unpredictable, i.e., it must be computationally infeasible to recover more of the key sequence from a short captured segment of itself. There are at least two approaches to this concept of unpredictability. Authors such as Shamir [3] and Blum and Micali [4] insist that determining the next element of the key sequence from the previous elements be provably as difficult as inverting a one-way function. On the other hand, Groth [5], Key [6], and Rueppel [7] emphasize linear complexity as a measure of unpredictability. In this thesis, the linear complexity of the key sequences is considered as a cryptanalytic measure of unpredictability, and the term cryptographically strong is used as a synonym for unpredictability of key sequences in stream ciphers. The pseudo-random sequence generator employed in a stream cipher must be capable of rapidly producing random-looking bit streams and must be analyzable. Linear feedback shift registers (LFSRs) satisfy these requirements.
An LFSR consists of n storage units, or stages, and a linear feedback function. An LFSR is controlled by a clock; at each clock pulse the content of stage i is shifted into stage (i-1), with the element in stage 0 taken as the output, and the element inserted into stage (n-1) is calculated from the linear feedback function. Linear feedback shift registers can be considered as basic finite state machines, so the output sequence of any LFSR is eventually periodic in the sense that there exists a k such that (s_k, s_{k+1}, s_{k+2}, ...) is periodic. The maximum possible period for an n-stage LFSR is 2^n - 1, and any sequence that attains this upper bound is referred to as a pseudo-noise sequence. It is well known that the characteristic polynomial of a pseudo-noise sequence is primitive. Golomb [8] proposes three "randomness postulates" and suggests that sequences which satisfy these postulates are statistically random. Rueppel [7] has shown that Golomb's randomness postulates almost exclusively describe pseudo-noise sequences, i.e., sequences generated by LFSRs with a primitive characteristic polynomial. These properties make pseudo-noise sequences useful in many applications where pseudo-random bit streams are required. But it is well known that pseudo-noise sequences are cryptographically weak, i.e., highly predictable. The linear complexity of a sequence is the length of the shortest LFSR that generates the sequence. Linear complexity is a useful cryptanalytic tool in the search for cryptographically strong pseudo-random sequences. One thing that makes linear complexity so popular as a cryptanalytic tool is the existence of an elegant and efficient method for determining the linear complexity of any periodic sequence: the well-known Berlekamp-Massey algorithm, an iterative procedure originally introduced by Berlekamp [9] for decoding BCH codes.
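The LFSR mechanism described above can be sketched as follows (a hypothetical 4-stage register with the primitive feedback polynomial x^4 + x + 1; the tap convention and start state are assumptions of this sketch, not taken from the thesis):

```python
def lfsr(state, taps, nbits):
    """Fibonacci-style LFSR over GF(2).

    state -- list of n bits; state[0] is stage 0, the output stage.
    taps  -- stage indices whose XOR forms the feedback bit.
    Each clock pulse: emit stage 0, shift every stage down by one,
    and insert the feedback bit into stage n-1.
    """
    state = list(state)
    out = []
    for _ in range(nbits):
        out.append(state[0])
        fb = 0
        for t in taps:
            fb ^= state[t]
        state = state[1:] + [fb]
    return out

# 4-stage LFSR realizing s_{t+4} = s_{t+1} + s_t (primitive x^4 + x + 1):
# any nonzero start state yields a pseudo-noise sequence of maximal
# period 2^4 - 1 = 15, containing eight 1's and seven 0's.
seq = lfsr([1, 0, 0, 0], taps=(0, 1), nbits=30)
assert seq[:15] == seq[15:30]   # period is exactly 15
assert sum(seq[:15]) == 8       # balance property (Golomb's first postulate)
```

The balance check illustrates one of Golomb's randomness postulates: over one full period, the number of 1's exceeds the number of 0's by exactly one.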
Massey later showed that this algorithm actually provides a general solution to the problem of synthesizing the shortest linear feedback shift register capable of generating a given finite sequence of digits [10]. The Berlekamp-Massey algorithm requires only 2L consecutive bits of a sequence to completely determine a sequence with linear complexity L. Since the time complexity of the Berlekamp-Massey algorithm is O(n^2), applying the algorithm to very long sequences can be difficult if not infeasible, so the need for fast algorithms for computing the linear complexity of a given sequence has increased. For example, Blahut developed a fast Berlekamp-Massey algorithm by splitting the iterations into two, four, eight, etc. parts [11]; in this manner he achieves a time complexity of O(n log^2 n), which is a relatively good result in the search for the linear complexity of very long sequences. In the special case where the period of the sequence is n = 2^m, there is another algorithm which is more efficient, with time complexity proportional to log_2 n [12]. But that procedure works only for a very limited class of sequences, and the kind of modification required to make the algorithm applicable to other sequences is not known; still, the method is promising, and it is hoped that good results will be derived from the concepts it introduces in the near future. According to the Berlekamp-Massey algorithm, anyone who obtains 2L consecutive bits of a key sequence with linear complexity L can efficiently recover the entire pseudo-random key sequence; hence, in a stream cipher, the entire plaintext can be recovered. A cryptographically strong sequence must therefore have a high linear complexity. However, a high linear complexity alone does not guarantee that a sequence is cryptographically strong, so the other aspects of linear complexity must also be taken into consideration.
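A minimal GF(2) Berlekamp-Massey implementation may make the O(n^2) behaviour concrete (a textbook-style sketch, not the thesis's own C code; variable names are this sketch's choices):

```python
def berlekamp_massey(s):
    """Length L of the shortest LFSR generating the binary sequence s.

    Classic O(n^2) iteration: maintain the current connection polynomial c,
    the previous polynomial b (from the last time L grew), and at each
    position compute the discrepancy d between s and the LFSR's prediction.
    """
    n = len(s)
    c = [0] * (n + 1); c[0] = 1       # current connection polynomial
    b = [0] * (n + 1); b[0] = 1       # previous connection polynomial
    L, m = 0, -1                      # current complexity, last change point
    for i in range(n):
        d = s[i]                      # discrepancy at position i
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            for j in range(n - i + m + 1):
                c[i - m + j] ^= b[j]  # cancel the discrepancy
            if 2 * L <= i:            # the register must grow
                L, m, b = i + 1 - L, i, t
    return L

# The full period of the 4-stage pseudo-noise sequence has L = 4,
# so 2L = 8 consecutive bits already determine the generator.
assert berlekamp_massey([1,0,0,0,1,0,0,1,1,0,1,0,1,1,1]) == 4
```

A sequence of i zeros followed by a single one has linear complexity i + 1, which is why a high linear complexity alone, as noted above, says little about strength.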
The notion of the linear complexity profile extends the concept of linear complexity by considering the growth of the linear complexity with respect to the length of the sequence; in other words, the linear complexity profile records how the linear complexity of a given sequence grows. Rueppel states that a cryptographically strong sequence should have a linear complexity near the maximum possible, and its linear complexity profile should follow the n/2 line "closely but irregularly" [7]. Since the message sequence has finite length, it is enciphered using only a finite subsequence of the complete key sequence; therefore the local aspects of the linear complexity profile must also be considered for key sequences. These are inspected in detail in Carter's work [13]. The Berlekamp-Massey algorithm also computes the linear complexity profile of any given sequence. The lesser-known Zero-Square algorithm has effectively the same running time and storage requirements as the Berlekamp-Massey algorithm [14]. Besides that, it simultaneously produces the linear complexity profiles of all shifted subsequences, and thus greatly reduces the time required for an investigation of a sequence that includes determining the linear complexity profiles of all shifted subsequences. This kind of inspection can be needed when the local aspects of the linear complexity profiles of key sequences are examined. The Zero-Square algorithm relies heavily on the investigation of Toeplitz-type matrices obtained from the given key sequences. Another consideration in this regard is the k-error linear complexity and the k-error linear complexity profile. The k-error linear complexity of a sequence is the smallest linear complexity that can be obtained when arbitrary k or fewer digits of the sequence are altered; it can be interpreted as a worst-case measure of the linear complexity when k or fewer errors occur.
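Since Berlekamp-Massey maintains the complexity L incrementally, recording L after every bit yields the profile directly. The sketch below (assumptions as before: a textbook GF(2) variant, not the thesis's code) shows that the pseudo-noise sequence from the earlier 4-stage LFSR has a poor profile despite reaching L = 4:

```python
def lc_profile(s):
    """Linear complexity L_i of every prefix s[0..i], via Berlekamp-Massey.

    A strong key sequence's profile should follow the i/2 line closely
    but irregularly (Rueppel); long flat stretches betray structure.
    """
    n = len(s)
    c = [0] * (n + 1); c[0] = 1
    b = [0] * (n + 1); b[0] = 1
    L, m, profile = 0, -1, []
    for i in range(n):
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            for j in range(n - i + m + 1):
                c[i - m + j] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
        profile.append(L)             # record L after each new bit
    return profile

# The pseudo-noise sequence: the profile jumps to 4 early and then stays
# flat for the rest of the period -- a predictable, cryptographically
# weak sequence, exactly as stated above.
assert lc_profile([1,0,0,0,1,0,0,1,1,0,1,0,1,1,1]) == \
       [1,1,1,1,4,4,4,4,4,4,4,4,4,4,4]
```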
Obviously, if a high proportion of the entire key sequence can be recovered by changing some of its bits, then the security of the stream cipher is in danger. The k-error linear complexity of any sequence could be computed by repeated applications of the Berlekamp-Massey algorithm or, if appropriate, the Chan-Games algorithm, but such repeated applications require too much time and storage, even for short sequences. There exists an efficient algorithm [15] for computing the k-error linear complexity in the case where the period of the sequence is a power of 2; unfortunately, no efficient algorithm is known for computing the k-error linear complexity of an arbitrary sequence. Here, by utilizing the local aspects of the linear complexity profile, the Berlekamp-Massey algorithm is modified in order to compute the k-error linear complexity of any given sequence; nevertheless, a fully satisfactory result could not be obtained. All the algorithms mentioned above have been implemented as C programs for SUN workstations. Since standard ANSI C is employed, the programs can be used on all UNIX machines; moreover, all the programs can be executed on PCs if the sources are recompiled, and they can actually be used as cryptanalytic tools in the search for strong pseudo-random key sequences. Although this thesis is devoted to the cryptanalysis of stream ciphers based on linear complexity, the linear and nonlinear aspects of stream ciphers are also considered. Furthermore, in order to make the subject clear, other considerations, such as the relationship between linear complexity and continued fractions and the growth properties of linear complexity profiles, are also given. Another aspect of the issue is how to interpret the results obtained by these tools; for this purpose, the features that a strong key sequence must have are subjected to a detailed investigation.
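The cost of the naive approach is easy to see in code: re-running Berlekamp-Massey on every pattern of at most k flipped bits takes on the order of C(n, k) runs, which is exactly why an efficient general algorithm is sought. A brute-force sketch (illustrative only; usable just for very short sequences and small k):

```python
from itertools import combinations

def berlekamp_massey(s):
    """Linear complexity of the binary sequence s (classic O(n^2) iteration)."""
    n = len(s)
    c = [0] * (n + 1); c[0] = 1
    b = [0] * (n + 1); b[0] = 1
    L, m = 0, -1
    for i in range(n):
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]
        if d:
            t = c[:]
            for j in range(n - i + m + 1):
                c[i - m + j] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L

def k_error_linear_complexity(s, k):
    """Smallest LC reachable by altering at most k digits of s (brute force)."""
    best = berlekamp_massey(s)
    for e in range(1, k + 1):
        for positions in combinations(range(len(s)), e):
            t = list(s)
            for p in positions:
                t[p] ^= 1             # flip the chosen bits
            best = min(best, berlekamp_massey(t))
    return best

# [0,0,0,1] has L = 4, but a single flip yields the all-zero
# sequence with L = 0 -- a drastic drop under one error.
assert k_error_linear_complexity([0, 0, 0, 1], 1) == 0
```

The example shows why the k-error measure matters: a sequence whose linear complexity collapses under a few bit changes offers no real resistance, however high its nominal complexity.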
Thus, this thesis can also be seen as a reference work for cryptanalysis based on the linear complexity of key stream sequences. These results are useful, however, only if other considerations, such as the statistical properties of the key sequences, are also taken into account. Briefly, in the cryptanalysis and design of stream ciphers, it has been shown that linear complexity has valuable properties as a measure of the unpredictability, or equivalently the randomness, of the running key as a finite sequence. But the issue needs further investigation; in particular, devising an efficient algorithm for the k-error linear complexity of finite sequences should be one of the main interests of future studies.


##### Description

Thesis (M.Sc.) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 1994

##### Keywords

Data security,
Data encryption methods,
Encryption