WEBVTT
00:06.520 --> 00:07.600
So, good morning.
00:08.000 --> 00:17.160
This is the last lecture of this year for Algorithms for Internet
00:17.160 --> 00:17.860
Applications.
00:18.600 --> 00:26.980
And it is a novel that I am recording this lecture looking into a
00:26.980 --> 00:30.260
classroom where there are not that many people sitting.
00:30.940 --> 00:32.640
So, welcome to those who are here.
00:34.260 --> 00:37.840
Last time we have looked at cryptographic algorithms again.
00:38.600 --> 00:41.320
And we would like to continue with that.
00:41.440 --> 00:51.240
We have looked at the different notions like cryptology separated into
00:51.240 --> 00:52.720
cryptography and cryptanalysis.
00:53.660 --> 01:00.120
I showed you the separation into different areas like symmetric key
01:00.120 --> 01:01.820
algorithms, public key algorithms.
01:01.820 --> 01:04.800
We have looked already at symmetric key algorithms.
01:04.980 --> 01:14.820
I showed you the very simple examples of a Caesar cipher, of a table
01:14.820 --> 01:17.920
cipher, then the VigenĂ¨re cipher, one-time pad.
01:18.040 --> 01:21.980
I showed you the encryption with bit sequences which is actually a
01:21.980 --> 01:25.500
standard method for stream encryption.
01:26.440 --> 01:32.300
And like a typical stream cipher where the key, the secret, is in the
01:32.300 --> 01:34.500
seed for the random number generator.
01:35.380 --> 01:42.080
And then we also looked at the notion of transposition, different from
01:42.080 --> 01:45.160
substitution which we looked at up to that point.
01:46.040 --> 01:50.900
We saw that simple example of reading out a matrix in different order
01:50.900 --> 01:52.700
than having written the text into it.
01:52.700 --> 01:57.520
And then we finally looked at the DES algorithm.
01:57.720 --> 02:01.800
I showed you the principle structure, the 16 rounds of the algorithm.
02:02.140 --> 02:11.020
I showed you the essential part, the Feistel encryption around having
02:11.020 --> 02:14.900
the property that it is easily invertible because it is a Xoring
02:14.900 --> 02:15.460
operation.
02:15.640 --> 02:18.700
That's one of the benefits of the Xoring operation that you can easily
02:18.700 --> 02:25.780
decrypt it or invert it by just applying the same operand twice to the
02:25.780 --> 02:30.020
same other operand.
02:30.700 --> 02:36.660
And that way you can retrieve the previous ciphers from ciphers
02:36.660 --> 02:40.000
further down in this algorithm.
02:40.000 --> 02:44.760
And so in this way invert the encryption which means decipher an
02:44.760 --> 02:46.900
encrypted message.
02:47.140 --> 02:50.960
And then we looked in more detail at the Feistel round within DES.
02:51.360 --> 02:57.400
I showed you that there we have these compression permutations for the
02:57.400 --> 02:59.160
key to extract the round key.
02:59.160 --> 03:09.800
We have the expansion permutation for the right half of the block that
03:09.800 --> 03:10.720
has to be encrypted.
03:11.640 --> 03:17.120
And then I also showed you what's inside the S-box, at least in one.
03:17.220 --> 03:21.060
You can easily look up the other ones in the internet or in documents.
03:21.800 --> 03:28.940
And so here we have a way of transforming a 6-bit entry which is
03:28.940 --> 03:31.780
separated into row and column index.
03:33.240 --> 03:38.280
And by looking up the value there which is a 4-bit value and this way
03:38.280 --> 03:41.800
you get the mapping from 6 to 4 bits.
03:42.500 --> 03:46.460
And we also looked at the tags on DES.
03:46.760 --> 03:48.940
I at least showed you some of the results.
03:48.940 --> 03:52.000
Otherwise we didn't go really into details of the different
03:52.000 --> 03:52.540
approaches.
03:52.800 --> 03:58.100
I showed you some hint on where to look for more information on
03:58.100 --> 03:59.480
cryptanalytic attacks.
04:00.060 --> 04:05.820
We also looked at side-channeled attacks like physical attacks to
04:05.820 --> 04:11.400
cryptographic algorithms which just modify the way you actually access
04:11.400 --> 04:14.300
information on what's being done in the algorithm.
04:14.300 --> 04:21.480
And so it is a really hard job to make statements on the security of
04:21.480 --> 04:26.040
an algorithm because you have to consider all these potential side
04:26.040 --> 04:28.680
-channel attacks which is a very difficult task.
04:29.940 --> 04:34.740
Then we looked also at an alternative, at public-key cryptosystems.
04:34.820 --> 04:39.520
I briefly showed you the requirements that we have and one important
04:39.520 --> 04:43.380
part there is the existence of one-way functions which are easily
04:43.380 --> 04:48.240
computed, like one-way, but the inverse is hard to compute.
04:48.740 --> 04:53.500
I gave you simple examples and the first new slide for today is the
04:53.500 --> 05:02.920
RSA cryptosystem which I showed, which is the standard algorithm for
05:02.920 --> 05:04.320
public -key methods.
05:04.860 --> 05:06.360
There certainly are others around.
05:06.480 --> 05:09.640
I will briefly mention them at the end of today's lecture.
05:09.640 --> 05:12.680
Or within this lecture today.
05:13.360 --> 05:15.500
So what is this algorithm about?
05:15.600 --> 05:19.680
The RSA cryptosystem designed by Rivis, Charmy and Edelman in 1978.
05:20.860 --> 05:30.120
So they have a public key, P for public, which is a pair of integers,
05:30.320 --> 05:33.600
n and c, where c stands for code.
05:33.600 --> 05:39.240
There is a secret key, S, which is again a pair of integers, so the n
05:39.240 --> 05:41.820
is not really the secret, but the d is the secret.
05:42.420 --> 05:43.960
And d stands for decode.
05:45.060 --> 05:50.460
C and d are values which are very large numbers.
05:51.340 --> 06:00.200
They have at most as many bits as n and the standard lengths of this n
06:00.200 --> 06:04.520
are a few thousand bits nowadays.
06:05.700 --> 06:09.820
The RSA encryption works the following way, that you have your message
06:09.820 --> 06:16.920
and you chop your message into segments, which are all less or equal
06:16.920 --> 06:17.700
than n.
06:18.650 --> 06:23.660
And so this is then m1, m2 and so on.
06:23.740 --> 06:28.120
So you have these sequences of blocks smaller than n.
06:29.160 --> 06:34.280
And actually not less or equal, but actually smaller, really smaller
06:34.280 --> 06:34.780
than.
06:35.780 --> 06:45.160
So the operation that you perform now is a numeric operation.
06:45.700 --> 06:49.680
You compute mi to the c modulo n.
06:50.620 --> 07:00.700
So you compute, mi is a value which can be almost as large as n, which
07:00.700 --> 07:03.100
may have a few thousand bits.
07:03.100 --> 07:08.520
And c is a value which can by itself have a few thousand bits.
07:08.680 --> 07:13.840
And so you can imagine that that is a really time-consuming operation
07:13.840 --> 07:15.440
to compute mi to the c.
07:16.060 --> 07:19.860
And so you really have to do something about making that as efficient
07:19.860 --> 07:20.660
as possible.
07:21.660 --> 07:25.940
And then you always compute in the range of numbers less than n,
07:26.020 --> 07:29.600
because you computed modulo this value n.
07:29.600 --> 07:32.480
Now this is the encryption.
07:32.720 --> 07:38.540
You get some new value which again is a value that is less than n by
07:38.540 --> 07:39.500
the modulo operation.
07:39.840 --> 07:46.360
And then the decryption works in a way that you divide the encrypted
07:46.360 --> 07:52.440
message c again into sequences of numbers c1, ck smaller than n.
07:52.440 --> 08:04.400
And you compute mi now from this ci by taking ci to the power d modulo
08:04.400 --> 08:04.680
n.
08:04.880 --> 08:08.900
And again we have very large numbers, may have very large numbers,
08:09.000 --> 08:11.520
certainly not all of these numbers will be large, but most of them.
08:11.780 --> 08:17.240
And so this is a time-consuming operation, violating in some way the
08:17.240 --> 08:20.120
requirement that encryption and decryption should be easy.
08:21.440 --> 08:26.000
So how can we choose, like this looks a bit strange, how is that
08:26.000 --> 08:35.800
possible that you can retrieve the value mi by computing the power of
08:35.800 --> 08:40.360
the original mi to the c and then the result taking the power to the
08:40.360 --> 08:40.580
d.
08:41.380 --> 08:45.540
Now this works because we specifically choose n, c and d.
08:46.700 --> 08:53.260
And so certainly here I say for n having 1024 bits, but this certainly
08:53.260 --> 08:56.000
can be any other value also.
08:56.140 --> 08:57.900
This is just one assumption.
08:58.380 --> 09:03.900
Nowadays you would rather have 2048 or even 4096.
09:06.380 --> 09:08.160
So this is a 2 there.
09:08.980 --> 09:10.560
Now what do we do?
09:10.640 --> 09:14.440
We randomly generate two large prime numbers p and q.
09:15.400 --> 09:20.720
Both having, like if we have 1024 bits, we choose two numbers having
09:20.720 --> 09:22.080
512 bits.
09:22.260 --> 09:27.960
In this way the product has exactly double the number of bits.
09:29.100 --> 09:36.240
And so we want to have that many bits, so the two prime numbers should
09:36.240 --> 09:36.980
be very large.
09:37.840 --> 09:41.240
We will come back to the problem of how we can actually select those
09:41.240 --> 09:41.920
prime numbers.
09:42.380 --> 09:44.580
Now we compute the product of those prime numbers.
09:44.680 --> 09:48.320
This can be done certainly by standard arithmetic, although then it's
09:48.320 --> 09:49.240
quite time consuming.
09:49.380 --> 09:51.260
It should be done more efficiently.
09:52.120 --> 09:55.060
And the product of p and q is our value n.
09:55.960 --> 10:02.720
You see, if you just have n, you actually don't know p and q unless
10:02.720 --> 10:07.740
you manage to factor that number n into those two prime factors, which
10:07.740 --> 10:15.040
I already mentioned is one of those one-way functions.
10:16.820 --> 10:24.320
Now we determine randomly one value c, but the requirement is that c
10:24.320 --> 10:28.120
is prime relative to p-1 times q-1.
10:28.280 --> 10:31.700
I will explain to you in a moment why we have that requirement, that
10:31.700 --> 10:33.820
value p-1 times q-1.
10:34.430 --> 10:40.180
And then we determine a value d such that the product of c times d is
10:40.180 --> 10:44.880
congruent to one modulo p-1 times q-1.
10:45.780 --> 10:49.040
And we have the following lemma.
10:49.520 --> 10:56.700
If we have chosen n, c, and d in this way, then we have for all values
10:56.700 --> 11:02.360
less than n this nice property that m taken to the power cd is
11:02.360 --> 11:04.060
congruent to m modulo n.
11:04.540 --> 11:08.400
And we will go into a little bit into showing why that is the case.
11:08.560 --> 11:11.280
We don't go into the complete depth of that proof.
11:11.980 --> 11:17.760
But first of all, let's briefly look again what actually it means to
11:17.760 --> 11:22.360
say x and y are congruent modulo z.
11:22.940 --> 11:23.980
What does it mean?
11:24.440 --> 11:30.060
That the difference of x minus y is a multiple of z.
11:30.060 --> 11:35.740
So it means that the difference here, the difference of m to the c
11:35.740 --> 11:40.300
times d and m is a multiple of n.
11:41.260 --> 11:47.340
And up here, cd-1 is a multiple of p-1 times q-1.
11:47.340 --> 11:49.300
So this is the essential point.
11:49.660 --> 11:56.900
Or you could also say if you divide the values x and y by z, then you
11:56.900 --> 11:59.180
have the same remainder.
12:01.040 --> 12:07.120
Okay, now the validity of that lemma depends on two essential insights
12:07.120 --> 12:08.440
of number theory.
12:08.440 --> 12:13.400
The first one is the small or the little theorem of Fermat.
12:14.320 --> 12:18.480
And this little theorem is about the following.
12:18.720 --> 12:25.500
It states that if you have a prime number p and another value a which
12:25.500 --> 12:31.800
is not divisible by p, then a to the p minus 1 is congruent to 1
12:31.800 --> 12:32.660
modulo p.
12:33.920 --> 12:36.280
Now this looks similar to what we have up here.
12:36.620 --> 12:37.820
We'll come back to that.
12:37.820 --> 12:44.140
So p prime a not divisible by p means a to the p minus 1 is congruent
12:44.140 --> 12:45.240
to 1 modulo p.
12:45.920 --> 12:50.160
And then there is a generalization of that, which is the following.
12:50.360 --> 12:55.680
For n out of n, we define the following function phi of n, which is
12:55.680 --> 12:56.960
called the Euler function.
12:57.680 --> 13:07.180
And this Euler function counts the number of values m between 1 and n,
13:07.780 --> 13:12.160
which are prime relative to n, which means that they don't have a
13:12.160 --> 13:14.980
common divisor with n.
13:16.480 --> 13:20.220
We'll look at specific examples of that in a moment.
13:20.220 --> 13:23.040
So this is the Euler function.
13:24.080 --> 13:30.960
And the theorem that we utilize for this proof of the correctness of
13:30.960 --> 13:38.260
the RSA algorithm is that if n and a are natural numbers and a is
13:38.260 --> 13:43.640
prime relative to n, that means they don't have common divisors, non
13:43.640 --> 13:48.740
-trivial common divisors, then a to the phi of n is congruent to 1
13:48.740 --> 13:49.460
modulo n.
13:49.460 --> 13:55.140
So another statement like the one we had in the little theorem of
13:55.140 --> 14:00.360
Fermat, but here it states, or it is not a to the p minus 1, where p
14:00.360 --> 14:05.620
is a prime number, but here it is a to the phi of n, which is the
14:05.620 --> 14:07.560
Euler function of that value n.
14:08.060 --> 14:14.680
Now, assume that this little theorem of Fermat and Euler's theorem
14:14.680 --> 14:21.300
hold, then we will show that the RSA algorithm actually is correct.
14:21.860 --> 14:28.400
The corollary of this certainly is that for all a and n, a to the phi
14:28.400 --> 14:34.100
of n minus 1 is the modular inverse of a with respect to n.
14:35.100 --> 14:44.300
Yes, so if you have modular computation for n, then a to the phi of n
14:44.300 --> 14:53.060
minus 1 certainly is the modular inverse, because a to the phi of n
14:53.060 --> 14:58.720
minus 1 times a is congruent to 1 modulo n.
15:01.440 --> 15:02.540
So that's trivial.
15:03.340 --> 15:06.640
Okay, nice little corollary which will be used.
15:07.220 --> 15:12.420
Okay, and now let's briefly look at our situation where we have these
15:12.420 --> 15:14.200
two numbers p and q.
15:14.800 --> 15:16.600
These two prime numbers p and q.
15:18.620 --> 15:23.640
And you know we have computed n, which is the product of p and q, and
15:23.640 --> 15:27.960
now the statement of this lemma is that the Euler function of p and q
15:27.960 --> 15:33.200
is just this value p minus 1 times q minus 1, which also appeared in
15:33.200 --> 15:38.320
the construction of the public and secret keys.
15:38.800 --> 15:40.240
So why is this the case?
15:40.320 --> 15:43.180
This is a rather simple observation.
15:43.960 --> 15:48.680
If p and q are prime, then we can easily see which numbers have a
15:48.680 --> 15:53.820
common divisor with p and q, so which are not prime relative to n.
15:55.800 --> 15:57.440
So which numbers are these?
15:57.560 --> 16:02.240
Well, it's just the multiples, all the multiples of p and all the
16:02.240 --> 16:03.340
multiples of q.
16:03.900 --> 16:08.100
They definitely have a non-trivial common divisor.
16:09.220 --> 16:14.660
In particular, these values here have a non-trivial common divisor p,
16:15.180 --> 16:18.340
and those here have a non-trivial common divisor q.
16:19.160 --> 16:21.240
So how many numbers are these?
16:21.400 --> 16:26.180
Here we have just definitely q values.
16:26.580 --> 16:31.180
Here we have p values, so we have p plus q values there.
16:31.880 --> 16:34.280
p plus q minus 1.
16:34.580 --> 16:36.440
You see here only up to p minus 1.
16:37.280 --> 16:43.060
So this is the number of values which have a common divisor with p
16:43.060 --> 16:49.360
times q, so all the remaining numbers are numbers which are prime
16:49.360 --> 16:51.680
relative to p times q.
16:52.640 --> 16:54.060
So what is that?
16:54.120 --> 17:02.040
p times q are all the numbers, and we subtract p plus q minus 1.
17:05.440 --> 17:11.820
And these are the numbers which are smaller than p times q and have
17:11.820 --> 17:13.440
non -trivial common divisor.
17:14.060 --> 17:17.580
And this is exactly p minus 1 times q minus 1.
17:18.500 --> 17:21.840
So this is easily computed, certainly.
17:22.020 --> 17:27.500
So this is the number of integers which are prime relative to p times
17:27.500 --> 17:33.540
q, which is our value n, the modulus for the RSA computation.
17:34.300 --> 17:38.120
Now, let's go back to the RSA algorithm and see what this means for
17:38.120 --> 17:44.900
getting insights into the correctness of that algorithm.
17:45.440 --> 17:49.440
So if p and q are prime, we now know that the Euler function is p
17:49.440 --> 17:51.940
minus 1 times q minus 1.
17:53.400 --> 17:59.320
Thus, if we multiply, or if we compute m to the p minus 1 times q
17:59.320 --> 18:05.360
minus 1, this is m to the phi of n, this is congruent to 1 modulo n
18:05.360 --> 18:07.920
due to the result from Euler's theorem.
18:08.560 --> 18:14.520
Now, certainly if we multiply both sides with m, then m to the p minus
18:14.520 --> 18:19.700
1 times q minus 1 plus 1 is congruent to m modulo n.
18:19.700 --> 18:25.400
Now, it would be nice if this here actually would be in some way, or
18:25.400 --> 18:29.920
could be related to c times d, because remember that was what we were
18:29.920 --> 18:37.880
looking for, some exponent to m such that m to this exponent is
18:37.880 --> 18:39.560
congruent to m modulo n.
18:41.120 --> 18:48.000
Now, we have chosen c and d such that c times d is congruent to 1
18:48.000 --> 18:52.200
modulo p minus 1 times q minus 1, the Euler function of n.
18:53.280 --> 18:59.640
Now, this is exactly the same as stating that cd minus 1 is a multiple
18:59.640 --> 19:01.420
of that modulus.
19:01.420 --> 19:04.820
So, a multiple of p minus 1 times q minus 1.
19:04.900 --> 19:09.220
There exists a k such that cd minus 1 is k times that modulus.
19:10.220 --> 19:17.680
And now this means, if we compute m to the c times d, this c times d
19:17.680 --> 19:28.400
is, well, a multiple of p minus 1 times q minus 1 plus 1.
19:33.100 --> 19:37.740
And a multiple of p minus 1 times q minus 1, each multiple of p minus
19:37.740 --> 19:45.800
1 times q minus 1 in the exponent of m is, as we have stated here,
19:46.280 --> 19:47.120
congruent to 1.
19:47.120 --> 19:49.400
So, all these cancel out.
19:49.960 --> 19:55.380
The only thing which remains is that m to the c times d is actually
19:55.380 --> 19:56.260
congruent to m.
19:56.940 --> 19:58.600
And so, this is what we wanted to have.
19:59.660 --> 20:06.340
And so, we have shown this is actually a correct algorithm.
20:07.040 --> 20:13.140
And the interesting point is that even if m is not prime relative to
20:13.140 --> 20:18.220
n, so this was the requirement here, m is prime relative to n, now we
20:18.220 --> 20:19.800
assume that m is not prime.
20:20.060 --> 20:24.940
We know that there are p plus q minus 1 values which are not prime
20:24.940 --> 20:25.760
relative to n.
20:26.320 --> 20:28.740
What happens to those values?
20:29.520 --> 20:34.420
Let's assume we have a value m which is a multiple of q.
20:36.300 --> 20:46.720
Now, if we, like t definitely is less than p, and so t is prime with
20:46.720 --> 20:51.920
respect to p, and so t to the p minus 1 is congruent to 1 modulo p
20:51.920 --> 20:57.960
because of the little fama, and also q to the p minus 1 definitely is
20:57.960 --> 20:59.520
congruent to 1 modulo p.
21:00.260 --> 21:06.140
Now, we can compute, we can look at the product of those in a moment.
21:06.140 --> 21:13.360
So, we also know that for all k greater than 0, we have that t to the
21:13.360 --> 21:19.100
k times p minus 1 times q minus 1 is definitely also congruent to 1
21:19.100 --> 21:21.720
because it's just a repetition of what we have seen up here.
21:22.320 --> 21:27.260
And also q to the k times p minus 1 times q minus 1 is congruent to 1
21:27.260 --> 21:31.100
modulo p because this is also a repetition of what we have seen up
21:31.100 --> 21:31.340
here.
21:32.880 --> 21:40.420
And that means if we compute, if we now form the product of those
21:40.420 --> 21:47.540
values, t to this power and q to this power, if we take t times q to
21:47.540 --> 21:54.960
the power k times p minus 1 times q minus 1, we get that that is
21:54.960 --> 21:57.460
congruent to 1 modulo p.
21:58.780 --> 22:05.940
And it's also, if we multiply it with t, it is congruent to t modulo
22:05.940 --> 22:06.200
p.
22:07.200 --> 22:13.300
And now we can also multiply that with q, and an easy check with the
22:13.300 --> 22:18.560
modular arithmetic shows that the product, if we now multiply with q,
22:20.100 --> 22:28.380
then here we have multiplied this power of tq with t times q, so we
22:28.380 --> 22:33.960
have t times q to this power k times p minus 1 times q minus 1 plus 1.
22:34.620 --> 22:43.320
This is t times q, congruent to t times q modulo p times q, and this
22:43.320 --> 22:44.920
is exactly what we wanted to have.
22:44.920 --> 22:51.580
So this shows that even for m, we have this nice relationship that we
22:51.580 --> 22:56.320
can retrieve m by taking it to the power c times d.
22:57.960 --> 23:03.360
So this certainly can be done in an analogous way for all multiples of
23:03.360 --> 23:05.760
p, here this was a multiple of q.
23:06.650 --> 23:14.380
Okay, now we have shown that for all values less than n, we have this
23:14.380 --> 23:19.000
nice property that m to the c times d is congruent to m modulo n.
23:19.520 --> 23:21.860
And we are done with the proof of the correctness.
23:24.660 --> 23:28.560
So let's look now at the efficiency of that algorithm.
23:28.720 --> 23:30.500
What are the costs of the RSA algorithm?
23:31.540 --> 23:36.380
Encryption and decryption, as you have seen, use exponentiation.
23:36.660 --> 23:38.520
We certainly have to do that in a fast way.
23:38.520 --> 23:43.860
Now the problem is, what does it mean if you do it in a naive way, if
23:43.860 --> 23:48.500
you have to compute x to the n, you could say, okay, I just take x
23:48.500 --> 23:54.600
times x times x and so on times x, and this n times, and this
23:54.600 --> 23:59.780
certainly would be n times the effort to multiply two numbers with
23:59.780 --> 24:01.560
increasingly larger numbers.
24:02.700 --> 24:11.940
Now, the interesting observation is that for all x and n, we can
24:11.940 --> 24:16.680
compute x to the n with order of log n arithmetic operations.
24:17.340 --> 24:21.180
The method, the easy method, as we'll see on the next slide, is that
24:21.180 --> 24:25.660
we just continuously square x and multiply the computed powers.
24:25.660 --> 24:28.000
So let's go to the next slide.
24:29.120 --> 24:34.460
And there, now we look at the details of fast exponentiation.
24:35.480 --> 24:43.780
So x is an arbitrary integer, and n, we want to compute x to the n, is
24:43.780 --> 24:45.680
also an arbitrary integer.
24:45.680 --> 24:50.540
And an integer can be represented as a binary number.
24:51.240 --> 25:02.340
So we just represent it as a sequence of like n as a k, a k minus one,
25:03.000 --> 25:06.100
down to a zero.
25:07.220 --> 25:10.960
Now, this is the dual representation or the binary representation of
25:10.960 --> 25:11.260
n.
25:11.780 --> 25:19.140
And if we now compute x to the n, it means that we take x to a sum of
25:19.140 --> 25:24.300
powers of two, which are weighted by these ai.
25:24.780 --> 25:28.740
So that means the ai select the powers of two that are actually needed
25:28.740 --> 25:30.840
here because ai are either zero or one.
25:30.840 --> 25:37.020
Now, if we take x to a sum of values, we can also compute or rearrange
25:37.020 --> 25:46.220
this as a product of powers of x, which are just these x to the ai
25:46.220 --> 25:47.380
times two to the i.
25:47.380 --> 25:52.140
And it means that we now can do exactly what I told you.
25:52.420 --> 26:00.100
We can successively square all the, like square x, certainly to get up
26:00.100 --> 26:06.940
to two to the k, the largest power that may be contained in, or that
26:06.940 --> 26:07.920
is contained in n.
26:08.810 --> 26:18.500
We just need k operations, and k certainly is equal to log n.
26:19.660 --> 26:29.360
Actually, the ceiling of that, no, sorry, floor of that, but it is
26:29.360 --> 26:30.220
order of log n.
26:30.880 --> 26:33.640
And so these are the powers that we need.
26:34.620 --> 26:38.440
And so this was supposed to be k.
26:39.400 --> 26:49.000
And the ai are used to select which powers of x actually have to be
26:49.000 --> 26:49.400
multiplied.
26:50.120 --> 26:56.480
So if ai is zero, this cancels, such a power cancels out.
26:56.580 --> 27:00.120
And those where ai is one are multiplied.
27:00.420 --> 27:05.760
And so this is a simple way of squaring and multiplying with some
27:05.760 --> 27:13.180
intermediate value to get in the end the appropriate product of x to a
27:13.180 --> 27:14.040
powers of two.
27:14.040 --> 27:18.280
Now a different way of representing that would be that we just
27:18.280 --> 27:24.140
rearrange the computation and say, okay, we can also just take x to
27:24.140 --> 27:29.520
the a0, x to the a1, x to the a2, and so on, and actually start here
27:29.520 --> 27:32.340
with x to the ak and square that.
27:33.100 --> 27:37.220
And then multiply with x to the ak-1.
27:37.400 --> 27:47.060
That means if ak-1 or some ai is zero, it is multiplication with one,
27:47.180 --> 27:48.400
which means you don't have to multiply.
27:48.640 --> 27:51.400
If it is one, we have to multiply with x.
27:51.400 --> 27:59.880
And so here we would decide whether we have to multiply with x, and
27:59.880 --> 28:02.400
then we square the resulting value.
28:03.260 --> 28:08.420
And this is exactly the same result as we have up here, so those two
28:08.420 --> 28:12.760
are equal, and immediately shows that we have two different ways of
28:12.760 --> 28:16.900
computing this power x to the n.
28:17.580 --> 28:23.760
So one way would be to start from the least significant bit and go to
28:23.760 --> 28:27.140
the most significant bit of n, so start with a0.
28:27.600 --> 28:29.180
This is what we do here.
28:30.240 --> 28:34.180
And then go to a1, a2, and so on.
28:34.180 --> 28:38.360
And this is what we do in this way, so this is right to left
28:38.360 --> 28:43.980
computation with respect to the ordering of this direction with
28:43.980 --> 28:45.740
respect to the bits of n.
28:46.920 --> 28:52.420
And if we decide for this algorithm, we would start with the most
28:52.420 --> 28:57.700
significant and go to the least significant bit, which means there we
28:57.700 --> 29:02.360
go left to right looking at the bits of n.
29:03.000 --> 29:08.020
And so this is here actually in this formula, we start, we go from
29:08.020 --> 29:13.860
right to left, but referring to the ordering of the bits in the binary
29:13.860 --> 29:17.080
representation of n, it is right to left or left to right.
29:17.080 --> 29:20.960
Least significant or most significant, or most significant or least
29:20.960 --> 29:21.440
significant.
29:22.180 --> 29:28.840
The complexity of both operations is, of both algorithms is the same,
29:29.500 --> 29:33.260
but the arrangement, the way we compute is slightly different.
29:33.260 --> 29:38.780
So we have successive squarings and we have to decide what we do with
29:38.780 --> 29:41.200
the value of those bits of n.
29:42.040 --> 29:46.040
Okay, let's go back to the previous slide.
29:46.980 --> 29:53.800
Here we have to look at what that means for the costs, so we have to
29:53.800 --> 29:56.100
perform multiplication, squaring.
29:57.080 --> 30:02.380
Now we have long integers, the values that we have here have several
30:02.380 --> 30:06.780
thousand bits, or at least a thousand, usually a few thousand bits.
30:07.360 --> 30:12.060
Now this means that our standard arithmetic is not sufficient, we have
30:12.060 --> 30:13.840
operations on long integers.
30:13.840 --> 30:20.360
So if this is the large number, we have to chop it into our root size,
30:20.540 --> 30:22.500
so maybe 32.
30:22.980 --> 30:29.120
If you are lucky you have a 64-bit computer, then you have 64-bit
30:29.120 --> 30:32.960
values, but still this is not sufficient.
30:32.960 --> 30:45.080
So if you have 1024 bits in your number, you need 32 words in that
30:45.080 --> 30:50.100
value, so it will represent 1024 bits.
30:51.160 --> 30:58.400
Okay, so if you just perform naive multiplication, you have order of k
30:58.400 --> 31:01.220
-square single word operations.
31:01.380 --> 31:04.340
That's simple, naive matrix multiplication.
31:04.640 --> 31:08.620
Not matrix multiplication, this is number multiplication.
31:09.480 --> 31:16.480
Now, if you do it in a slightly more efficient way, you use a
31:16.480 --> 31:23.980
recursive approach and you can take that complexity down to k to the 1
31:23.980 --> 31:24.760
.5.
31:24.920 --> 31:27.580
It's an easy exercise to try to do that.
31:28.420 --> 31:33.020
And if you are even more knowledgeable, you would use FFT, the Fast
31:33.020 --> 31:38.560
Fourier Transform, which I don't cover in this course, but in my
31:38.560 --> 31:39.820
course on efficient algorithms.
31:40.820 --> 31:45.480
This FFT approach for multiplication actually leads to a reduction
31:45.480 --> 31:55.240
down to k log k operations on long operations.
31:56.120 --> 32:04.540
So here the approach is that if we have k word integers, that means k
32:04.540 --> 32:11.380
words which make up one value, this is showing the complexity of that,
32:11.580 --> 32:12.540
order of k log k.
32:12.540 --> 32:16.120
Now what does this mean for specific values?
32:16.280 --> 32:21.720
If we have 32-bit arithmetic, an exponent length of 1024 bits, that's
32:21.720 --> 32:24.800
the length of n for the exponent.
32:25.540 --> 32:27.300
Now the exponent is n.
32:30.720 --> 32:36.000
It means that... so actually the number of operations that we have
32:36.000 --> 32:41.880
here, the order of log n it states, it's 2 times log n actually.
32:41.880 --> 32:50.900
And now log n for a 1024-bit value is 1024 bits.
32:51.240 --> 33:01.860
So it's 2 times 1024 operations, and now each operation takes k log k
33:01.860 --> 33:04.260
single word operations.
33:04.260 --> 33:10.960
Now that is k, in this case is 32, and log k is 5, and this product
33:10.960 --> 33:20.560
is, as you see here, 327,680 operations.
33:20.560 --> 33:26.640
Now this is a rather naive approach to do it in this way, although we
33:26.640 --> 33:33.360
already have used here intelligent multiplication, but nevertheless it
33:33.360 --> 33:34.820
can be done in a different way.
33:35.500 --> 33:41.060
You can actually use modular arithmetic, you can use the Chinese
33:41.060 --> 33:48.180
remainder theorem, you can use parallelization, pipelining, construct
33:48.180 --> 33:50.280
a hardware implementation of that algorithm.
33:51.420 --> 33:54.240
Would be a pleasure for me to tell you more about that, but we don't
33:54.240 --> 33:56.060
have the time for that, not in this course.
33:56.060 --> 34:00.680
But in this way you can achieve RSA encryption at quite high data
34:00.680 --> 34:07.000
rates, but nevertheless the costs are still quite high, and so usually
34:07.000 --> 34:14.680
RSA is viewed to be too large to be used on large amounts of data.
34:14.680 --> 34:16.700
The costs of RSA are just too large.
34:17.700 --> 34:19.800
And so we have to do something about that.
34:21.720 --> 34:26.780
At first sight it would mean, well, you don't use RSA at all on the
34:26.780 --> 34:31.380
documents that we would like to communicate, but you will see a little
34:31.380 --> 34:34.020
bit later how we actually get around that.
34:34.020 --> 34:40.540
So we just have to combine RSA with other cryptographic methods which
34:40.540 --> 34:43.340
are computed in an easier way.
34:45.080 --> 34:53.960
Now let's look at how we can disperse fast exponentiation, which we
34:53.960 --> 34:54.760
have looked at already.
34:56.180 --> 34:59.360
Again, let's look now at the cost of cryptanalysis.
34:59.520 --> 35:03.480
The first was the cost of computing the encryption and decryption.
35:03.820 --> 35:05.600
Now what about cryptanalysis?
35:05.700 --> 35:09.920
I briefly mentioned that factorization is a one-way function.
35:09.920 --> 35:16.200
So the problem that we have to look at here is that we are given N, we
35:16.200 --> 35:21.800
are given C, and we would like, like this is the public key, and now
35:21.800 --> 35:24.160
you would like to get P and Q from that.
35:24.160 --> 35:29.080
Now this requires factorization of N, and I mentioned already that it
35:29.080 --> 35:34.320
is a very complex operation, so the currently best-known methods are
35:34.320 --> 35:37.600
in complexity around what I have stated here.
35:37.600 --> 35:41.900
And so B is the number of bits of N.
35:42.040 --> 35:47.180
You see that here you have that in the exponent of E, and then you
35:47.180 --> 35:49.160
have also log B in there.
35:49.400 --> 35:55.360
This is quite a large value, and if you assume or make certain
35:55.360 --> 36:02.920
assumptions on the speed of your computer, and if you just assume that
36:02.920 --> 36:11.000
N is a number having 300 decimal digits, then in a naive over-the
36:11.000 --> 36:15.900
-thumb computation, you get something like a few thousand years to
36:15.900 --> 36:20.360
actually compute the factorization regardless of which number of that
36:20.360 --> 36:21.240
size you have.
36:21.240 --> 36:26.520
Now this is at least true with respect to current knowledge.
36:27.340 --> 36:34.140
There may be some advances in algorithmics which might give us new
36:34.140 --> 36:36.480
insights on how to compute factorization.
36:36.480 --> 36:42.040
In particular, we know of one insight which is valid as soon as we
36:42.040 --> 36:43.420
have quantum computing.
36:44.160 --> 36:48.960
And so if you have a quantum computer, you can actually get
36:48.960 --> 36:54.900
factorization in a very simple way, and this certainly is some kind of
36:54.900 --> 37:02.440
threat to the security of everything which is encrypted using such a
37:02.440 --> 37:04.380
public -key method like RSA.
37:04.380 --> 37:12.640
But to do something against that, if you are interested in having
37:12.640 --> 37:17.480
security over a long period of time, we need very long keys, much
37:17.480 --> 37:22.380
longer than for symmetric methods, because here, as I said, the
37:22.380 --> 37:27.600
strength of the algorithm, the cryptographic strength, depends on the
37:27.600 --> 37:30.340
complexity of factoring the numbers.
37:30.340 --> 37:39.520
And so these have to be very large numbers, whereas for the symmetric
37:39.520 --> 37:46.060
methods, we had other problems that had to be looked at in order to
37:46.060 --> 37:51.020
successfully attack these symmetric methods.
37:51.480 --> 37:53.240
So there the keys could be smaller.
37:54.640 --> 37:55.240
Okay.
37:56.960 --> 37:59.740
One question is, we need these prime numbers.
37:59.820 --> 38:00.540
How do we do that?
38:00.640 --> 38:03.380
How can we actually determine large prime numbers?
38:03.560 --> 38:08.740
So we have to find the largest prime number less than or equal to some
38:08.740 --> 38:10.120
value n.
38:12.760 --> 38:18.000
Like I said, we would like to have prime numbers having 512 or 1024
38:18.000 --> 38:22.320
bits, and so 2 to the 1024 would be that value.
38:22.320 --> 38:26.700
Now one simple method is the sieve of Eratosthenes.
38:27.200 --> 38:31.360
So there we just write down all numbers from 2 to n.
38:31.420 --> 38:33.260
Let me just do that.
38:53.700 --> 38:58.520
22, 23, 24, 25, and so on.
38:59.280 --> 39:00.720
Now, what is this for?
39:01.540 --> 39:08.740
So up to n, so here I can only write down a few numbers, but you just
39:08.740 --> 39:12.540
can see in a moment what the approach actually is about.
39:12.540 --> 39:14.520
Then we repeat the following.
39:15.280 --> 39:21.220
We delete the first number, which is 2, and all its multiples from the
39:21.220 --> 39:21.560
list.
39:21.960 --> 39:33.820
So we delete 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and so on.
39:33.820 --> 39:40.360
So we delete the first number and all its multiples until the list
39:40.360 --> 39:41.940
consists of a single element.
39:42.280 --> 39:43.220
So we repeat that.
39:43.640 --> 39:51.680
We delete 3, its next multiple is 9, its next multiple is 15, next
39:51.680 --> 39:57.140
multiple is 21, next multiple is, well, not on this list.
39:58.010 --> 39:59.960
We do the same with 5.
40:00.580 --> 40:06.420
We delete 5 multiple.
40:07.920 --> 40:09.140
There's 25.
40:10.020 --> 40:16.720
We delete 7, and see there's no multiple of 7 on this list.
40:17.720 --> 40:24.620
We delete 11, and see there's no multiple of 11 over there.
40:24.780 --> 40:30.820
We delete 17, we delete 19, we delete 23, and so on, and all their
40:30.820 --> 40:31.180
multiples.
40:31.180 --> 40:38.400
And in this way you see that the list is reduced quite fast to fewer
40:38.400 --> 40:44.300
elements, and the largest remaining value on the right-hand side
40:44.300 --> 40:53.500
certainly is the largest prime, which is less or equal to n.
40:53.500 --> 40:58.340
Certainly you only have to go to half the size of the list.
41:00.360 --> 41:03.800
Okay, you don't have to go all the way to the right.
41:03.900 --> 41:11.900
So it is quite an efficient algorithm, but if you imagine you have 2
41:11.900 --> 41:18.720
to the 1024 values, that's a very, very long list and it's still just
41:18.720 --> 41:20.360
unfeasible to do that.
41:21.220 --> 41:24.780
So if you have these large numbers, it's unfeasible.
41:25.720 --> 41:28.980
And so another approach is to use a probabilistic approach.
41:29.960 --> 41:37.320
We just choose a random odd number having 512 bits and test whether x
41:37.320 --> 41:37.920
is prime.
41:38.800 --> 41:42.300
If it is not prime, we just increase x by 2.
41:42.480 --> 41:44.320
Certainly we only have to look at odd numbers.
41:45.120 --> 41:50.040
And then we test it again, and we do that until we finally have found
41:50.040 --> 41:54.460
the next prime which is larger than x.
41:55.420 --> 42:01.420
Now this sounds as being much simpler because there will be some
42:01.420 --> 42:04.180
value, some prime number in that region.
42:04.900 --> 42:11.280
And if we are lucky, the first one is actually a prime number.
42:11.280 --> 42:15.300
But then the question remains, how can we actually test whether x is
42:15.300 --> 42:15.660
prime?
42:16.660 --> 42:19.560
And this looks like an interesting question.
42:19.880 --> 42:23.280
So how can we test whether x actually is prime?
42:23.400 --> 42:27.880
Like in that sieve of Eratosthenes, we knew that the remaining values
42:27.880 --> 42:34.060
were prime because they have not been multiples of any smaller value.
42:34.990 --> 42:38.500
Now, there is a very interesting approach.
42:39.220 --> 42:46.220
It was a seminal paper by Rabban and Miller, and they designed one of
42:46.220 --> 42:53.800
the first probabilistic algorithms by just doing the following.
42:53.800 --> 43:01.280
They first of all checked whether x is divisible by some small prime,
43:03.280 --> 43:06.520
so less than 10,000 for example.
43:07.080 --> 43:12.380
If you have just 10,000 values, you can easily do that using the sieve
43:12.380 --> 43:13.300
of Eratosthenes.
43:13.300 --> 43:19.660
And then you repeat at most r times, so this is a limit on the number
43:19.660 --> 43:29.180
of iterations, you choose a random value less than x until this value
43:29.180 --> 43:35.180
a is a witness of the compositeness of x.
43:35.180 --> 43:36.980
Now, what does it mean?
43:37.080 --> 43:46.220
If a is a value which can show you that a is not a prime number but a
43:46.220 --> 43:48.540
multiple of a smaller value than x.
43:50.500 --> 43:57.680
So, we would like to test whether x is prime, so if we stop by having
43:57.680 --> 44:04.020
found a value a, which is a witness of the compositeness of x, then we
44:04.020 --> 44:05.320
know x is not prime.
44:05.320 --> 44:12.120
If we have repeated that r times without having found such a witness,
44:12.360 --> 44:15.380
we conclude that x must be prime.
44:16.100 --> 44:22.260
Now, this is quite a courageous statement, we will see later that it
44:22.260 --> 44:25.220
is done or is made on quite good grounds.
44:25.900 --> 44:30.640
So, what is this algorithm actually doing?
44:30.800 --> 44:35.920
How can you say that you have a witness for the compositeness of x?
44:36.060 --> 44:36.720
What do we do?
44:37.620 --> 44:42.700
We just exploit again our insights that we had seen before.
44:42.700 --> 44:52.640
The little theorem by Fermat that if x is prime, then a to the x minus
44:52.640 --> 44:54.800
1 is congruent to 1 modulo x.
44:56.040 --> 44:58.840
We have seen that little theorem of Fermat already.
44:59.220 --> 45:04.080
Now, we would like to test whether x is prime, and we know that if x
45:04.080 --> 45:07.580
is prime, then we have this property.
45:12.700 --> 45:20.620
There is another theorem, another insight from number theory, that if
45:20.620 --> 45:29.280
x is prime, and if t square, just some value t, t square is congruent
45:29.280 --> 45:39.900
to 1 modulo x, then t must be 1 or minus 1 congruent to x.
45:39.900 --> 45:55.900
Now, that means if we take the opposite of that, so if we say, okay,
45:56.080 --> 46:04.320
this is statement A, from A we conclude B, now here we just draw B,
46:04.680 --> 46:09.080
not B, then implies the contraposition.
46:09.900 --> 46:12.020
That is, not B implies not A.
46:13.180 --> 46:27.040
So, if t square is congruent to 1 modulo x, and this here does not
46:27.040 --> 46:31.940
hold, that means t is not congruent to 1, and t is not congruent to
46:31.940 --> 46:36.560
minus 1, then x cannot be prime.
46:40.120 --> 46:45.740
Okay, so what we would like to compute, the other thing is that if A
46:45.740 --> 46:52.420
to the x minus 1 is not congruent to 1 modulo x, then x cannot be
46:52.420 --> 46:52.740
prime.
46:52.740 --> 46:57.740
So, those two properties, those two contrapositions actually are used
46:57.740 --> 47:03.080
in this algorithm by Robin and Miller, and to compute this witness
47:03.080 --> 47:03.520
function.
47:04.540 --> 47:08.720
So, the idea is to compute A to the x minus 1 modulo x.
47:08.720 --> 47:10.640
We know this should be 1.
47:10.720 --> 47:13.240
If it is not 1, we know x is not prime.
47:13.880 --> 47:19.620
And on the way to that, we check whether we get values which are
47:19.620 --> 47:29.420
actually congruent to 1 modulo x, while t is not equal to 1 and t is
47:29.420 --> 47:37.000
not equal to minus 1 or not equal to x minus 1 modulo x.
47:37.000 --> 47:40.040
x is not prime, we know that.
47:41.480 --> 47:49.160
So, we compute that value, we start with some value 1, this is our
47:49.160 --> 47:54.100
value t, you see here we have, there we need a value t, this is the
47:54.100 --> 47:55.020
intermediate value.
47:55.020 --> 48:00.460
We compute t and d now is the square of t modulo x.
48:01.160 --> 48:07.260
So, here we have t square, which is d, and there we look at t.
48:08.900 --> 48:13.260
So, now we have t and t square, certainly modulo x.
48:14.140 --> 48:26.160
And if now d is 1, this is this test, if d is 1 and t is not equal to
48:26.160 --> 48:33.520
1 and not equal to x minus 1, this is this test, then we know that x
48:33.520 --> 48:35.820
is not prime, we return true.
48:35.980 --> 48:38.920
That means we have found a witness for the compositeness of x.
48:40.440 --> 48:44.540
Otherwise, we have to continue with our computation of a to the x
48:44.540 --> 48:45.640
minus 1 mod x.
48:46.580 --> 48:52.620
And if now ai is 1, you see here we start with k, go down to a0, that
48:52.620 --> 48:56.120
means we start with the most significant bit, go down to the least
48:56.120 --> 49:02.040
significant bit of our power x minus 1.
49:02.040 --> 49:06.980
So, ak to a0 is the binary representation of x minus 1.
49:08.220 --> 49:14.400
And so, if ai is 1, you know that we have to multiply our value d,
49:14.900 --> 49:16.940
which we just have computed, we have squared.
49:17.620 --> 49:21.460
Now we have to compute it, or we have to multiply it with a.
49:22.740 --> 49:30.500
And in this way, that means we multiply with a to the ai mod x.
49:31.840 --> 49:32.700
That's it.
49:33.260 --> 49:35.260
We just repeat this.
49:36.280 --> 49:40.180
So this is the block here, the inner block of that for statement.
49:41.360 --> 49:46.320
So this is just repeating this computation, successive squarings and
49:46.320 --> 49:50.480
multiplications with a, and in this way we compute a to the x minus 1
49:50.480 --> 49:52.100
modulo x.
49:54.080 --> 50:02.040
Now, if the result d, if we did not jump out of this by returning
50:02.040 --> 50:06.700
true, then we check whether d is actually 1 or not.
50:06.780 --> 50:13.560
If it is not equal to 1, then if this is not equal to 1, we know that
50:13.560 --> 50:18.360
x cannot be prime, we return true.
50:18.360 --> 50:21.460
If it is equal to 1, we return false.
50:22.720 --> 50:32.480
Now, the property that a to the x minus 1 mod x is equal to 1 or
50:32.480 --> 50:37.240
congruent to 1 modulo x does not imply that x is prime.
50:38.240 --> 50:44.360
We just have found in that case, or it may be that we have some value
50:44.360 --> 50:47.700
which has this property without being prime.
50:48.200 --> 50:51.180
Now, there are some rare numbers having that property, and they are
50:51.180 --> 50:57.580
really rare, and we come back to that in the next slide.
50:59.000 --> 51:06.720
Another interesting experience or interesting result is that, or
51:06.720 --> 51:14.220
information, is that this problem of determining the complexity of the
51:14.220 --> 51:18.800
primality test, this has been an open problem.
51:18.800 --> 51:22.340
People did not know whether it is actually a polynomial algorithm or
51:22.340 --> 51:25.300
whether it is an NP-complete problem.
51:26.720 --> 51:33.180
And in 2002, so 12 years ago, the first polynomial time algorithm has
51:33.180 --> 51:33.880
been presented.
51:33.880 --> 51:40.520
And this was remarkable in particular because this algorithm was
51:40.520 --> 51:47.160
designed by students like Professor Agarwal at the IIT, the Indian
51:47.160 --> 51:50.880
Institute of Technology in Kanpur, and two of his students at the
51:50.880 --> 51:51.580
bachelor level.
51:51.580 --> 51:56.280
They just got an assignment where they had to solve a small problem
51:56.280 --> 52:01.940
with respect to number theory, and they actually designed a polynomial
52:01.940 --> 52:04.820
time algorithm for testing primality.
52:05.560 --> 52:08.800
Remarkable, but things like that happen again and again that people
52:08.800 --> 52:13.340
who are not aware of all the interesting theories around an open
52:13.340 --> 52:19.340
problem just have an intuition which leads to a new algorithm.
52:20.340 --> 52:23.400
So, it is a polynomial time problem.
52:24.720 --> 52:29.580
Still, it is expensive, but this algorithm here certainly can be
52:29.580 --> 52:31.440
executed very fast.
52:32.180 --> 52:37.860
So, to compute this iteration is done very fast.
52:38.040 --> 52:41.720
We just have to do that r times, which we can determine beforehand.
52:41.720 --> 52:44.100
Now, the question is, how do we determine that?
52:44.380 --> 52:52.140
How often do we have to repeat that choice of a random value a less x
52:52.140 --> 52:59.800
before we actually can be sure that we have a prime number?
52:59.800 --> 53:05.460
So, the probability of a false result of witness AX, that means
53:05.460 --> 53:10.660
witness AX returns false, but the value is not a prime number.
53:11.120 --> 53:15.060
This probability is smaller than one quarter.
53:15.060 --> 53:19.120
Now, if we repeat that several times, and since these choices are
53:19.120 --> 53:23.020
independent of each other, the probability of a false result of the
53:23.020 --> 53:28.700
Rubin -Miller test, since we repeat that r times, is one over two to
53:28.700 --> 53:29.500
the two r.
53:30.240 --> 53:36.240
Now, this easily gets down to a very, very small probability.
53:37.140 --> 53:43.680
And so, even with a few repetitions, you have a very high probability
53:43.680 --> 53:52.560
than the result that x is actually a prime number.
53:52.560 --> 53:58.580
So, it is a suitable, fast, and sufficiently secure method for
53:58.580 --> 54:00.920
generating prime numbers.
54:01.520 --> 54:07.160
So, as I said, you select a random number with appropriate number of
54:07.160 --> 54:16.520
bits that you are interested in, and then you repeat that test until
54:16.520 --> 54:21.100
you finally have found a number which is, according to that test, a
54:21.100 --> 54:21.700
prime number.
54:23.100 --> 54:28.600
Another question is, we now know how to determine a prime number, so
54:28.600 --> 54:34.040
we can compute p times q, and now we have to find c and d.
54:35.080 --> 54:36.440
So, how do you find those?
54:36.440 --> 54:38.840
So, to find c is not that difficult.
54:39.540 --> 54:45.740
You just take c' with respect to p-1 times q-1.
54:46.400 --> 54:49.860
You just take some prime value.
54:50.040 --> 54:55.820
You can actually choose a not-too-large integer, because this is for
54:55.820 --> 54:57.520
coding, for encryption.
54:59.000 --> 55:03.540
And so, this should be a small number.
55:03.700 --> 55:08.000
Not too small, but you have a choice of just taking any number there.
55:09.380 --> 55:11.440
Any prime number less than n.
55:12.080 --> 55:14.320
Another question is, how do you actually get d?
55:14.820 --> 55:19.060
With this property that c times d is congruent to 1 modulo the Euler
55:19.060 --> 55:21.340
value of the product p times q.
55:22.720 --> 55:29.040
So, the answer is that you use the extended Euclidean algorithm.
55:29.140 --> 55:32.680
Now, you know the Euclidean algorithm, which is the algorithm, the
55:32.680 --> 55:40.600
simple algorithm for determining the largest common divisor of two
55:40.600 --> 55:41.180
values.
55:41.940 --> 55:51.760
So, you test whether the greatest common divisor of c and p-1q-1 is 1.
55:51.900 --> 56:01.180
This is the Euclidean algorithm to test whether two values, x and y,
56:01.740 --> 56:06.440
have only a trivial largest common divisor.
56:08.200 --> 56:12.200
The interesting point is that we use an extended version of that.
56:12.340 --> 56:20.400
So, in the simple version, you just get this answer, well, they are
56:20.400 --> 56:22.420
prime with respect to each other or not.
56:23.360 --> 56:26.540
But, in the extended version, you actually get more.
56:26.860 --> 56:32.100
So, in the extended Euclidean algorithm, what you compute is, for
56:32.100 --> 56:38.120
integers c and e, which you are interested in, so here c and e, you
56:38.120 --> 56:45.320
compute integers x, y, and r, where r is that value.
56:45.320 --> 56:52.040
So, you compute those integers x, y, and r such that e times x plus c
56:52.040 --> 56:59.340
times y is equal to r, the greatest common divisor.
57:00.340 --> 57:06.640
So, this actually gives you the desired result.
57:06.800 --> 57:16.560
If r is 1, you have that e times x plus c times y is equal to 1.
57:16.560 --> 57:22.000
Now, for our problem, it means we take, like c is our prime number,
57:22.100 --> 57:28.320
which we have chosen for the public key, and now e is just this Euler
57:28.320 --> 57:30.040
value of n.
57:31.000 --> 57:39.620
So, the problem is, we look for some d such that c times d is
57:39.620 --> 57:46.680
congruent to 1 modulo e, or we can rephrase that and say that we look
57:46.680 --> 57:55.380
for some value, or we are interested, or we look for some x such that
57:55.380 --> 58:02.240
cd minus 1 is a multiple of e.
58:03.180 --> 58:10.540
That's what we said is like x is congruent to y modulo z if that
58:10.540 --> 58:12.960
difference is a multiple of z.
58:12.960 --> 58:16.020
And here we have c and d and 1.
58:17.260 --> 58:25.860
The difference is a multiple of the modulus e, or you could say e
58:25.860 --> 58:29.540
times minus x plus c times d is equal to 1.
58:29.540 --> 58:34.160
Now, this is exactly what we stated there with respect to e times x
58:34.160 --> 58:36.820
plus c times y equals to r.
58:37.580 --> 58:45.260
Okay, so if we have, if this extended Euclidean algorithm delivers us
58:45.260 --> 58:51.460
x and y, then we have exactly what we need.
58:51.780 --> 58:54.960
We have then found such a value d.
58:56.680 --> 59:06.380
Okay, and so this d is then, as I said, the d is then this value of x
59:06.380 --> 59:10.960
that we, no, sorry, this value of y that we are interested in.
59:11.500 --> 59:14.880
y is what we look for, that will be our d.
59:16.340 --> 59:19.220
Okay, now, what is this method doing?
59:19.520 --> 59:20.920
It is an iterative algorithm.
59:21.040 --> 59:23.980
As you know, the Euclidean algorithm is a simple iterative method.
59:24.780 --> 59:32.000
And so we have that schema, which is indicated here.
59:32.100 --> 59:33.540
We start with e and c.
59:34.160 --> 59:39.400
We have some intermediate values x0, x1, y0, y1.
59:39.400 --> 59:48.860
And we have some quotient qi, which is the integer, use integer
59:48.860 --> 59:58.860
division, ai minus 1 over ai, ai minus 1 over ai, a0, a1 are e and c.
59:58.860 --> 01:00:07.720
So we compute e over c, we divide e by c, and then take the integer
01:00:07.720 --> 01:00:08.480
value.
01:00:09.960 --> 01:00:22.160
Then we also look like here, ai minus 2, no, sorry, ai, then is ai
01:00:22.160 --> 01:00:27.280
minus 2 minus ai minus 1 times qi minus 1.
01:00:27.280 --> 01:00:38.300
And so a2 in this case would be a0, e minus ai minus 1 times qi minus
01:00:38.300 --> 01:00:47.920
1, in this case would be just the remainder of that value qi, which is
01:00:47.920 --> 01:00:52.960
the integer quotient of e over c in the initial round.
01:00:52.960 --> 01:00:57.240
And then we continue that, we just compute the appropriate, here some
01:00:57.240 --> 01:01:04.780
appropriate differences, continue that and iterate until the smallest,
01:01:05.520 --> 01:01:10.400
or until this ak actually is 0, you remember that from the Euclidean
01:01:10.400 --> 01:01:10.700
method.
01:01:10.700 --> 01:01:20.840
And then we know that the previous value of that a sequence, ak minus
01:01:20.840 --> 01:01:26.760
1, is the greatest common divisor, which certainly should be 1 in this
01:01:26.760 --> 01:01:27.140
case.
01:01:27.140 --> 01:01:36.360
And we also have for all intermediate values that e times xi plus c
01:01:36.360 --> 01:01:46.920
times yi is ai, and so the yi minus 1 would have been the value d that
01:01:46.920 --> 01:01:48.660
we are interested in.
01:01:48.660 --> 01:02:00.740
And so then if the greatest common divisor of c and e is 1, then y is
01:02:00.740 --> 01:02:06.600
the multiplicative inverse of c and modulo e, this is the y in this
01:02:06.600 --> 01:02:07.320
statement here.
01:02:07.320 --> 01:02:12.820
c times y is congruent to 1 modulo e, that's what we wanted to get.
01:02:13.400 --> 01:02:18.040
So this is the simple extended Euclidean algorithm, so you see that c
01:02:18.040 --> 01:02:26.540
and d are easily computed as soon as we have computed n, which was
01:02:26.540 --> 01:02:32.120
based on the determination of two large prime numbers p and q.
01:02:33.020 --> 01:02:38.740
Nevertheless, when I say this is easily computed, we still have all
01:02:38.740 --> 01:02:46.960
the operations here are operations on large integers, so they are
01:02:46.960 --> 01:02:47.600
expensive.
01:02:49.980 --> 01:02:57.200
So y is the required value of d, as I said, and if we apply the RSA
01:02:57.200 --> 01:03:02.700
algorithm, certainly we can use it for encryption of messages,
01:03:03.340 --> 01:03:08.940
certainly the asymmetry avoids the problem of secure key exchange.
01:03:10.300 --> 01:03:19.640
We don't need n squared, like if we have our n partners who would like
01:03:19.640 --> 01:03:24.940
to communicate, for every communication that we have here, we don't
01:03:24.940 --> 01:03:36.340
need to construct keys for every line in that complete graph here.
01:03:36.340 --> 01:03:44.180
These n square keys, but here every person only needs a public key,
01:03:44.580 --> 01:03:47.880
and everybody can use the public key to send an encrypted message,
01:03:47.880 --> 01:03:50.340
which only the recipient can decrypt.
01:03:50.340 --> 01:03:55.300
And so here we only need n public-private key pairs.
01:03:56.740 --> 01:04:03.800
So this is much better, and the disadvantage certainly is that the
01:04:03.800 --> 01:04:09.260
cost for encryption and decryption is quite significant, and so we
01:04:09.260 --> 01:04:10.280
have to consider that.
01:04:10.280 --> 01:04:15.680
Another problem is that it is easily broken if you only have few
01:04:15.680 --> 01:04:17.440
possible plain text messages.
01:04:17.920 --> 01:04:19.180
Why is this the case?
01:04:19.180 --> 01:04:29.900
Assume you have somebody who is taking care of a certain, like has a
01:04:29.900 --> 01:04:35.180
certain number of values in a bag, and he wants to draw a certain
01:04:35.180 --> 01:04:42.040
value, and then send the result to his friend, and this friend would
01:04:42.040 --> 01:04:46.480
like to know, and somebody else, you know, Alice, Bob, and Eve.
01:04:46.480 --> 01:04:51.560
Now, Eve would like to know what the selection actually is.
01:04:52.800 --> 01:05:00.680
Now, she might know that in this box there are values, let's say, A1,
01:05:00.880 --> 01:05:03.000
A2, and so on, up to AK.
01:05:03.620 --> 01:05:04.940
K different values.
01:05:04.940 --> 01:05:10.680
Now, if these are K different values, potential names of participants,
01:05:10.780 --> 01:05:16.040
for example, in some draw or whatever you would like to do, maybe you
01:05:16.040 --> 01:05:20.220
would have to decide for buying certain assets on a stock market, or
01:05:20.220 --> 01:05:24.900
you can easily imagine other situations where you have only a limited
01:05:24.900 --> 01:05:31.560
number of plain text messages which are known to Eve, but she doesn't
01:05:31.560 --> 01:05:34.280
know which one of those actually is selected.
01:05:34.280 --> 01:05:39.640
Now, what she can easily do is, she can just, like Eve can do the
01:05:39.640 --> 01:05:47.100
following, she just takes the public key of Alice, no, of Bob, and
01:05:47.100 --> 01:05:54.580
like assuming that A will use this algorithm, like use Bob's public
01:05:54.580 --> 01:06:02.280
key to send the result of her draw, Eve just computes, uses Bob's
01:06:02.280 --> 01:06:16.780
public key, computes PB of A1 down to PB of AK, and then since Alice
01:06:16.780 --> 01:06:24.400
will send the encrypted value of some AI which she has chosen, Eve
01:06:24.400 --> 01:06:29.560
only has to look at the encrypted values, which can easily determine
01:06:29.560 --> 01:06:33.660
which one of those has been selected, because she does not have to
01:06:33.660 --> 01:06:40.180
decrypt, she only has to know which message actually has been sent,
01:06:40.360 --> 01:06:45.120
and since this is a limited number of messages that can be sent, she
01:06:45.120 --> 01:06:49.380
can easily find out what the information is that has been transmitted.
01:06:49.380 --> 01:06:55.500
So this is an important insight that you have to be very careful how
01:06:55.500 --> 01:07:00.460
you apply such a public key method, because you have this public key
01:07:00.460 --> 01:07:01.900
which is used for encryption.
01:07:03.380 --> 01:07:05.060
So we have to do something about that.
01:07:05.800 --> 01:07:10.040
Whenever you have a limited number of potential alternatives which
01:07:10.040 --> 01:07:15.040
have to be encrypted, you can easily circumvent the problem of having
01:07:15.040 --> 01:07:16.920
to decrypt the message.
01:07:17.940 --> 01:07:22.940
Then there are some alternatives to RSA, which to some extent also
01:07:22.940 --> 01:07:27.680
have these problems as all the public key methods, like alternatives
01:07:27.680 --> 01:07:36.520
with respect to public key methods, which I don't have the time to go
01:07:36.520 --> 01:07:37.960
into more details.
01:07:37.960 --> 01:07:42.900
So Diffie-Hellman is a typical algorithm for exchanging keys in a
01:07:42.900 --> 01:07:48.260
secure way, like keys for symmetric methods, and elliptic curve
01:07:48.260 --> 01:07:55.000
algorithms are another approach to having public keys, and there
01:07:55.000 --> 01:08:00.400
actually the length of the keys can be significantly smaller than with
01:08:00.400 --> 01:08:03.280
the RSA or Diffie-Hellman algorithm.
01:08:05.060 --> 01:08:10.880
So maybe I should include that in future editions of this course.
01:08:12.020 --> 01:08:17.080
But I at least wanted to mention that these are practically relevant
01:08:17.080 --> 01:08:21.380
algorithms, in particular the elliptic curve algorithms are very
01:08:21.380 --> 01:08:27.080
important for secure encryption and decryption, particularly if you
01:08:27.080 --> 01:08:33.340
have to perform something like digital signatures and so on, which
01:08:33.340 --> 01:08:35.440
will be one of the next topics.
01:08:36.220 --> 01:08:40.620
So what we're interested in is secure key exchange for symmetric
01:08:40.620 --> 01:08:45.460
methods, because that could be one application of our public key
01:08:45.460 --> 01:08:45.800
methods.
01:08:45.900 --> 01:08:50.620
But first of all, let's try to find out how could we actually exchange
01:08:50.620 --> 01:08:55.360
keys for symmetric methods in a secure way.
01:08:55.360 --> 01:09:00.880
So, without making use of public key methods, you could just agree
01:09:00.880 --> 01:09:06.400
upon a general key, G, and you have some way of agreeing on that by
01:09:06.400 --> 01:09:10.140
phone, by letter, by courier, or whatever you have as a method for
01:09:10.140 --> 01:09:12.720
secure communication, a general key.
01:09:15.160 --> 01:09:22.400
And then you can transmit that by phone, by letter, by courier, or you
01:09:22.400 --> 01:09:29.400
just transfer some segments, so if this here is your value, a simple
01:09:29.400 --> 01:09:32.420
way of choosing segments would be to do something like that.
01:09:32.420 --> 01:09:38.140
A more sophisticated way would be to just perform some selection, so
01:09:38.140 --> 01:09:46.580
just take certain parts of that value and then have some pattern how
01:09:46.580 --> 01:09:52.840
you select the keys into those G1 to GK, and then you just have to
01:09:52.840 --> 01:10:03.320
know that function for getting the value G from those segments that
01:10:03.320 --> 01:10:03.920
you have sent.
01:10:04.920 --> 01:10:08.080
So this is some standard approach.
01:10:08.780 --> 01:10:14.660
So what you could also do is, for example, you have your value G, you
01:10:14.660 --> 01:10:23.060
could also just add some random value, and then you transmit G plus
01:10:23.060 --> 01:10:27.320
that random value, and if you know the random value, you can retrieve
01:10:27.320 --> 01:10:27.680
G.
01:10:27.680 --> 01:10:30.240
If you don't, you don't get it.
01:10:30.580 --> 01:10:36.000
But then, again, you have to recombine that R.
01:10:36.380 --> 01:10:39.740
So there are methods, there are interesting methods around those
01:10:39.740 --> 01:10:40.620
topics.
01:10:41.160 --> 01:10:43.240
We'll come back to that a little bit later.
01:10:45.480 --> 01:10:51.960
So this would be a way of transmitting some key, and then what you
01:10:51.960 --> 01:10:58.020
could do is you just use a new random session key for every exchange
01:10:58.020 --> 01:10:58.660
of messages.
01:10:59.600 --> 01:11:04.200
And the message M is encrypting using that session key for the
01:11:04.200 --> 01:11:09.860
symmetric method, and this session key is encrypted using your general
01:11:09.860 --> 01:11:17.020
key, so you have your previously transmitted general key, and then you
01:11:17.020 --> 01:11:20.720
only exchange encrypted session keys.
01:11:20.720 --> 01:11:27.320
The session keys are used only once, so you cannot perform any of
01:11:27.320 --> 01:11:31.400
those typical attacks where you have a range or a collection of
01:11:31.400 --> 01:11:36.380
messages which you just try to get encrypted and then analyze what
01:11:36.380 --> 01:11:40.860
happens to those messages in order to find out something about the key
01:11:40.860 --> 01:11:41.640
that has been used.
01:11:41.640 --> 01:11:45.920
If it's just used once, there's no way how you can actually perform
01:11:45.920 --> 01:11:46.640
those attacks.
01:11:47.940 --> 01:11:52.780
So G is used for short messages only, only for session keys, and S is
01:11:52.780 --> 01:11:54.320
used for one message only.
01:11:54.320 --> 01:11:59.720
And now certainly an alternative to this scheme would be that you
01:11:59.720 --> 01:12:04.260
don't use that general key, but you use asymmetric methods for key
01:12:04.260 --> 01:12:04.780
exchange.
01:12:05.920 --> 01:12:08.980
So it is definitely an improved protection against cryptanalytic
01:12:08.980 --> 01:12:15.660
attacks, and now we transfer that into using public key methods for
01:12:15.660 --> 01:12:16.320
key exchange.
01:12:16.320 --> 01:12:24.060
So Alice and Bob are wanting to communicate, and Alice would like to
01:12:24.060 --> 01:12:29.140
send a letter, and what she can do is she chooses a new session key
01:12:29.140 --> 01:12:31.140
for every exchange of messages.
01:12:31.140 --> 01:12:39.100
So she selects some random key, she encrypts the message, so now you
01:12:39.100 --> 01:12:42.900
have the encrypted message which can only decrypt if you have that
01:12:42.900 --> 01:12:43.540
session key.
01:12:44.460 --> 01:12:48.680
She has to send that session key to Bob, and she does that by just
01:12:48.680 --> 01:12:54.580
using Bob's public key, so the public key of the receiver using some
01:12:54.580 --> 01:12:56.240
asymmetric method.
01:12:56.240 --> 01:12:59.900
You could use RSA, but also some other method, elliptic curve, or
01:12:59.900 --> 01:13:00.560
whatever you like.
01:13:01.640 --> 01:13:08.020
And then send the encrypted message together with the encrypted key,
01:13:08.860 --> 01:13:14.300
which means that in some way you have a sealed envelope where the key
01:13:14.300 --> 01:13:19.620
is put into, and Bob certainly knows how to open that envelope,
01:13:21.160 --> 01:13:27.040
because Bob has his secret key, which is used to retrieve the session
01:13:27.040 --> 01:13:32.620
key, so he can use that, his secret key, get the session key, and then
01:13:32.620 --> 01:13:35.300
he can retrieve the message that Alice has sent.
01:13:36.500 --> 01:13:41.720
So this looks like a nice combination of asymmetric and symmetric
01:13:41.720 --> 01:13:46.660
methods, because the public key method is only used for encrypting
01:13:46.660 --> 01:13:48.020
those short keys.
01:13:48.660 --> 01:13:54.040
Certainly short key means something like 128 or 256 bits.
01:13:56.300 --> 01:14:04.140
And then the message is encrypted using a symmetric method, which is
01:14:04.140 --> 01:14:04.820
done fast.
01:14:05.380 --> 01:14:07.440
So this looks like quite a nice approach.
01:14:08.560 --> 01:14:14.940
So you don't have the need for secretly transmitting some general key,
01:14:15.140 --> 01:14:16.720
as I mentioned on the previous slide.
01:14:16.720 --> 01:14:22.240
The high cost of RSA certainly are tolerable for short session keys,
01:14:23.200 --> 01:14:27.700
and every key of the symmetric method is used only once, which is very
01:14:27.700 --> 01:14:28.240
important.
01:14:28.740 --> 01:14:34.720
This definitely increases the security of those messages.
01:14:34.720 --> 01:14:37.240
Okay, what's the problem?
01:14:38.440 --> 01:14:45.200
So this is almost the method that is used extensively in our secure
01:14:45.200 --> 01:14:48.460
communications, for example, in HTTPS and so on.
01:14:48.560 --> 01:14:53.400
It's exactly that scheme that is used, but almost, not quite.
01:14:54.260 --> 01:14:54.960
What's missing?
01:14:54.960 --> 01:15:02.720
The problem is that there might be some man in the middle.
01:15:03.740 --> 01:15:08.240
So this would be, in this case, Edgar, the man in the middle.
01:15:09.920 --> 01:15:16.240
And if the public key would not be... Alice assumes that the public
01:15:16.240 --> 01:15:19.200
key is PB, Bob's public key.
01:15:19.200 --> 01:15:26.360
But if that is replaced with the public key of Edgar, this man in the
01:15:26.360 --> 01:15:36.180
middle, then Edgar could retrieve that message, modify it on its own
01:15:36.180 --> 01:15:42.120
discretion, because he has his secret key to get the session key and
01:15:42.120 --> 01:15:46.040
decrypt the message and modify the message and so on.
01:15:46.040 --> 01:15:51.420
And then he will send it on using the public key of Bob, and Bob can
01:15:51.420 --> 01:15:52.460
decrypt that message.
01:15:52.660 --> 01:15:56.660
So that would be an attack, which is quite simple.
01:15:57.220 --> 01:16:03.720
And it shows that we need some knowledge about or some statement on,
01:16:03.880 --> 01:16:09.000
is that actually a valid public key of the recipient?
01:16:09.810 --> 01:16:13.840
Okay, we'll come back to that problem in a moment.
01:16:17.040 --> 01:16:23.300
I used to present another symmetric method, the method IDEA, which is
01:16:23.300 --> 01:16:25.740
an interesting method.
01:16:26.640 --> 01:16:32.640
It was designed at ETH in Zurich, but I don't have time to do that, so
01:16:32.640 --> 01:16:35.900
I will just go over that.
01:16:36.320 --> 01:16:40.720
Without covering it, you are welcome to look at those slides, but this
01:16:40.720 --> 01:16:43.420
will not be relevant for the final exam.
01:16:43.840 --> 01:16:51.620
And I rather would like to show you which other internationally
01:16:51.620 --> 01:16:55.680
accepted encryption standard is used.
01:16:55.760 --> 01:16:59.320
Meanwhile, I mentioned that DES is not the choice.
01:16:59.460 --> 01:17:04.180
I did not present IDEA, so I present the Advanced Encryption Standard,
01:17:04.900 --> 01:17:08.900
which is the result of a competition, as I mentioned already, to find
01:17:08.900 --> 01:17:09.960
a successor for DES.
01:17:09.960 --> 01:17:16.700
This was done in the last years of the 19th century.
01:17:17.920 --> 01:17:22.020
And there were several finalists.
01:17:22.300 --> 01:17:24.800
So here are those five finalists.
01:17:24.800 --> 01:17:31.320
And one, actually, Reindahl, was the one which was selected, designed
01:17:31.320 --> 01:17:34.200
by Joan Damon and Vincent Ryman.
01:17:35.040 --> 01:17:41.240
And so that's how this name, Reindahl, actually appeared.
01:17:42.900 --> 01:17:48.520
Now, it's now only known under the name AES, Advanced Encryption
01:17:48.520 --> 01:17:48.940
Standard.
01:17:49.320 --> 01:17:55.720
You can find a lot of information on that standard on the pages of the
01:17:55.720 --> 01:18:00.180
NIST institution in the United States.
01:18:00.180 --> 01:18:07.300
And you can also find publicly available codes in different languages
01:18:07.300 --> 01:18:09.520
for actually using that AES.
01:18:10.200 --> 01:18:12.420
So you can download the algorithm and use it.
01:18:14.020 --> 01:18:18.000
The information on the next slides is from the official specification.
01:18:19.060 --> 01:18:24.580
And the official specification by those two scientists was a little
01:18:24.580 --> 01:18:28.380
bit more general than what was actually put into the standard.
01:18:28.380 --> 01:18:35.580
The standard, under the acronym FIPS 197, was published in 2001.
01:18:36.460 --> 01:18:44.060
And that's what I use in the presentation of Reindahl in the following
01:18:44.060 --> 01:18:44.580
slides.
01:18:44.580 --> 01:18:58.880
So, Reindahl has two essential parameters, Nb and Nk, number of
01:18:58.880 --> 01:19:05.540
blocks, like that's the block length Nb and the key length Nk.
01:19:06.100 --> 01:19:10.540
And you see that the block length means it certainly is a block
01:19:10.540 --> 01:19:11.640
-oriented encryption.
01:19:13.080 --> 01:19:22.040
And so if Nb, like that, can have the values 4, 6, or 8, and Nk can
01:19:22.040 --> 01:19:26.660
have the values 4, 6, and 8, and this results in certain parameters in
01:19:26.660 --> 01:19:31.260
that table, what does it mean to have Nb equal 4?
01:19:31.260 --> 01:19:38.580
Nb equal 4 means that we multiply 32 by that number Nb.
01:19:39.240 --> 01:19:48.980
So if Nb is 4, like 32 bits, now 32 is exactly 4 bytes.
01:19:49.600 --> 01:19:54.960
So this is 4 bytes and this is repeated 4 times.
01:19:54.960 --> 01:20:00.740
So this is the block length for Nb equals 4.
01:20:01.500 --> 01:20:06.760
And in the FIPS 197, we only have Nb equals 4.
01:20:07.800 --> 01:20:13.860
Now, this is now the standard, but the algorithm in principle could
01:20:13.860 --> 01:20:18.900
also be applied to having something for Nb equals 6 or Nb equals 8.
01:20:19.920 --> 01:20:25.940
So, not just 128, but you could also have a block length of 192 or
01:20:25.940 --> 01:20:26.420
256.
01:20:27.640 --> 01:20:31.620
Now, the key length can be variable, so you can use that either with
01:20:31.620 --> 01:20:38.680
128, 192, or 256 bits, so quite a large number of bits for the key.
01:20:39.360 --> 01:20:45.440
And so this certainly has implications on the strength of the
01:20:45.440 --> 01:20:45.900
algorithm.
01:20:47.160 --> 01:20:55.200
And so this is the array of bytes for that key.
01:20:55.620 --> 01:21:00.120
And this, as we had for the DS algorithm, we again have a number of
01:21:00.120 --> 01:21:00.440
rounds.
01:21:00.720 --> 01:21:05.560
So the number that you see in there, for example, for Nk equals 6 and
01:21:05.560 --> 01:21:10.580
Nb equals 4 would be 12, which means that we have 12 rounds of
01:21:10.580 --> 01:21:11.080
encryption.
01:21:11.670 --> 01:21:16.660
And so every round has the following structure.
01:21:18.420 --> 01:21:26.280
The state means this is a variable for the current value of our block,
01:21:26.840 --> 01:21:33.240
so these four blocks of these four columns of four bytes.
01:21:35.500 --> 01:21:37.360
And that's the state.
01:21:37.980 --> 01:21:43.500
And then we have the round key, which is some number of some matrix
01:21:43.500 --> 01:21:44.000
again.
01:21:45.040 --> 01:21:47.520
And what we do is the following.
01:21:48.000 --> 01:21:52.960
We modify the state in some way by byte substitution.
01:21:52.960 --> 01:21:59.680
We shift the rows of the state, we mix the columns of the state, and
01:21:59.680 --> 01:22:04.040
then add the round key to the residing state.
01:22:04.340 --> 01:22:12.700
So we modify this state, the current intermediate cipher, and then at
01:22:12.700 --> 01:22:14.580
some point add that key.
01:22:15.940 --> 01:22:19.300
Okay, the final round of the cipher is slightly different there.
01:22:19.540 --> 01:22:22.240
We don't have the mix column part.
01:22:23.320 --> 01:22:29.380
And certainly you also have to look at that you have to inverse the
01:22:29.380 --> 01:22:32.240
whole thing, and this also works nicely.
01:22:32.240 --> 01:22:36.960
So all the operations that I will show you are all invertible
01:22:36.960 --> 01:22:37.880
operations.
01:22:39.880 --> 01:22:44.020
But you have to know the essential ingredients.
01:22:45.020 --> 01:22:49.760
Okay, let me start by showing you the first step, byte substitution.
01:22:50.320 --> 01:22:57.320
Now all these operations here are operations on finite fields, so
01:22:57.320 --> 01:23:02.380
actually arithmetic in a finite Galois field.
01:23:02.380 --> 01:23:04.000
I don't know if you know about Galois.
01:23:04.140 --> 01:23:14.080
Galois was a scientist, mathematician, some time ago, quite some time
01:23:14.080 --> 01:23:22.380
ago, and he actually went into a duel, into a fight with somebody
01:23:22.380 --> 01:23:25.640
else, on some nice lady, and he lost.
01:23:26.310 --> 01:23:32.140
And when people went to his room, they found a number of papers which
01:23:32.140 --> 01:23:37.900
had some formulas scribbled on them, and it was actually the Galois
01:23:37.900 --> 01:23:40.760
theory, the theory of finite fields.
01:23:40.760 --> 01:23:48.780
Very ingenious theory, and it is about having a finite set of values,
01:23:49.860 --> 01:23:55.180
and you know that if you have some value M, some set M, you have
01:23:55.180 --> 01:24:03.120
addition and multiplication, and now this set M has to be a group with
01:24:03.120 --> 01:24:06.920
respect to addition and multiplication has to be a field.
01:24:08.300 --> 01:24:13.780
And so we also have 0 and 1, and we have all these nice properties,
01:24:15.160 --> 01:24:23.080
and so in particular it means that if you have a finite field, we can
01:24:23.080 --> 01:24:27.380
compute plus and minus, that means the inverse of addition is also
01:24:27.380 --> 01:24:32.220
there, and we can compute multiplication and division, which means
01:24:32.220 --> 01:24:36.480
that multiplication is also an inverse operation, every element has an
01:24:36.480 --> 01:24:36.800
inverse.
01:24:38.140 --> 01:24:47.040
Now, this is done on small sets in this part here, Galois field, of 2
01:24:47.040 --> 01:24:48.560
to the 8 values.
01:24:48.880 --> 01:24:51.860
2 to the 8 values, why do we have this 8?
01:24:52.160 --> 01:24:54.860
We look at arithmetic on bytes.
01:24:54.860 --> 01:25:01.360
Every byte is a value of your 2 to the 8 different byte values, and so
01:25:01.360 --> 01:25:09.800
we have to look whether we can actually construct field operations on
01:25:09.800 --> 01:25:13.500
those values.
01:25:13.500 --> 01:25:19.760
So, the plus operation in this case is just bitwise XORing, so if we
01:25:19.760 --> 01:25:25.240
have 2 bytes, if we add that, we just perform XORing of that.
01:25:25.580 --> 01:25:30.160
Can be done very efficiently, just a bitwise operation, we don't have
01:25:30.160 --> 01:25:32.760
any carry that is rippling through.
01:25:34.240 --> 01:25:36.920
Multiplication, how can that be done?
01:25:37.040 --> 01:25:41.820
The result has to be a byte again, and so this is multiplication,
01:25:42.240 --> 01:25:47.500
polynomial multiplication, modulo this polynomial.
01:25:48.700 --> 01:25:56.740
Now, polynomial multiplication, modulo that value, how is that done?
01:25:57.340 --> 01:26:06.220
You can assume that every byte, you can interpret a byte as being,
01:26:08.080 --> 01:26:17.080
like if you have a byte B7, B6, B5 and so on.
01:26:18.040 --> 01:26:23.380
B1, B0, sorry that doesn't make sense here, take that out.
01:26:25.940 --> 01:26:36.020
B1 and B0, you just say, okay, X to the 7 plus B6, X to the 6 and so
01:26:36.020 --> 01:26:39.320
on, B1, X plus B0.
01:26:39.320 --> 01:26:44.040
So this is a polynomial, you can interpret a byte as a polynomial of
01:26:44.040 --> 01:26:44.720
degree 7.
01:26:45.840 --> 01:26:50.480
Now, if you multiply those polynomials, then you get another
01:26:50.480 --> 01:26:55.340
polynomial, certainly which has a higher degree, a degree of 14, but
01:26:55.340 --> 01:27:00.820
you do that, modulo another polynomial, in this case a degree 8
01:27:00.820 --> 01:27:05.620
polynomial, and the remainder of that will be a polynomial of degrees
01:27:05.620 --> 01:27:06.460
at most 7.
01:27:06.460 --> 01:27:12.740
And so the polynomial multiplication, modulo this polynomial here, is
01:27:12.740 --> 01:27:21.580
another degree 7 polynomial, which means that this actually is then
01:27:21.580 --> 01:27:24.480
another byte.
01:27:25.040 --> 01:27:30.300
The coefficients of that polynomial form the new byte.
01:27:31.040 --> 01:27:35.220
And this polynomial has been chosen such that this multiplication is
01:27:35.220 --> 01:27:36.200
invertible.
01:27:36.320 --> 01:27:42.900
That means every byte has an inverse with respect to that
01:27:42.900 --> 01:27:43.540
multiplication.
01:27:43.540 --> 01:27:51.620
And this is chosen here, so here you first take the multiplicative
01:27:51.620 --> 01:27:59.920
inverse of every byte in that color field, then you apply an affine
01:27:59.920 --> 01:28:04.400
transformation like this, so here you have a byte, there you have a
01:28:04.400 --> 01:28:09.500
specific value, and if you have a matrix, this is the affine
01:28:09.500 --> 01:28:16.060
transformation, which results then in some new byte value, and if you
01:28:16.060 --> 01:28:21.020
combine all that, like by taking the multiplicative inverse and
01:28:21.020 --> 01:28:26.580
afterwards applying the affine transformation, this corresponds to
01:28:26.580 --> 01:28:34.400
taking your byte value, you chop that into x and y, x is the row
01:28:34.400 --> 01:28:41.620
index, y the column index, and you just retrieve the values that give
01:28:41.620 --> 01:28:43.200
the new byte value.
01:28:43.380 --> 01:28:49.980
So this is a hexadecimal, these are pairs of hexadecimal digits, and
01:28:49.980 --> 01:28:56.880
so in this case you have the results, so the S-box, which is just
01:28:56.880 --> 01:29:09.840
those two steps, transforms every byte of that matrix into a new byte,
01:29:09.940 --> 01:29:12.420
so this aij is transformed into bij.
01:29:12.420 --> 01:29:16.220
This is the first step, this is byte substitution, and how that
01:29:16.220 --> 01:29:21.600
continues will be shown to you next time, and let me just very briefly
01:29:21.600 --> 01:29:29.080
show you, so this is the end of this presentation, but I would like to
01:29:29.080 --> 01:29:39.360
briefly show you one other thing, as this is the last lecture before
01:29:39.360 --> 01:29:49.480
Christmas, I would like to show you one nice, like very briefly, a
01:29:49.480 --> 01:29:54.740
nice program which actually utilizes the capabilities of a
01:29:54.740 --> 01:29:55.120
touchscreen.
01:29:55.120 --> 01:29:58.040
So this is a physics illustrator.
01:29:58.120 --> 01:29:59.660
What is a physics illustrator?
01:30:00.280 --> 01:30:06.240
It means I can draw some figures like this, or, sorry, I didn't want
01:30:06.240 --> 01:30:13.740
to draw that, I can draw this, and now I can say, okay, this should be
01:30:13.740 --> 01:30:23.380
moved that way, and I can also say, okay, I would like to turn that in
01:30:23.380 --> 01:30:28.740
some way, and I could animate that, and now those objects are moving
01:30:28.740 --> 01:30:30.260
according to physical properties.
01:30:31.000 --> 01:30:37.020
I could also say, okay, I just draw here something, and make that into
01:30:37.020 --> 01:30:41.040
a solid element, and now I do the same again.
01:30:41.040 --> 01:30:47.240
You see, now all of a sudden those elements turn around according to
01:30:47.240 --> 01:30:53.180
physical properties, so I can, if I look at this here, I can look at
01:30:53.180 --> 01:30:58.760
the physical properties here, for example, it could be rubber, steel,
01:30:59.100 --> 01:31:01.800
wood, ice, plastic, clay, or rock.
01:31:01.800 --> 01:31:05.040
You can imagine that you can play around with that quite nicely.
01:31:05.580 --> 01:31:16.860
I would like to show you one pre-generated design, which, like a very
01:31:16.860 --> 01:31:20.880
nice one, is this domino effect, which I can show you.
01:31:20.880 --> 01:31:26.860
You see here a nicely drawn arrangement, and if we animate that, then
01:31:26.860 --> 01:31:31.800
all of a sudden you see how this actually is generated, quite a
01:31:31.800 --> 01:31:33.360
complicated domino effect.
01:31:34.100 --> 01:31:37.380
You see that many elements are moved around in some way.
01:31:37.380 --> 01:31:42.560
They have different properties, some are solid, and even that ball
01:31:42.560 --> 01:31:49.180
here will be pushed away again by that element there, you see.
01:31:49.720 --> 01:31:50.100
Interesting.
01:31:50.100 --> 01:32:00.920
And now if, for example, I would modify the gravitational force, I
01:32:00.920 --> 01:32:05.900
could modify it in that way, what would happen, then certainly all of
01:32:05.900 --> 01:32:07.460
a sudden everything would be different.
01:32:08.080 --> 01:32:10.780
That domino effect would be no longer there.
01:32:10.780 --> 01:32:19.420
And the final slide, which I wanted to show you, is now this final
01:32:19.420 --> 01:32:22.440
slide for this course.
01:32:23.220 --> 01:32:28.360
So I wish you Merry Christmas, Frohe Weihnachten, and we can animate
01:32:28.360 --> 01:32:33.160
that also, and here you see that this will now crumble away, because
01:32:33.160 --> 01:32:35.560
all these elements will be moved in some way.
01:32:35.560 --> 01:32:43.160
So have a nice Christmas, and come back filled with eagerness to
01:32:43.160 --> 01:32:49.120
actually come to the lecture and see me in person and follow the
01:32:49.120 --> 01:32:50.320
contents of this course.
01:32:50.760 --> 01:32:55.840
Okay, it has been nice to teach you this in this year, next year we
01:32:55.840 --> 01:32:56.540
will start again.
01:32:57.560 --> 01:32:59.020
Okay, that's it for today.