WEBVTT

00:05.150 --> 00:09.850
Welcome to the lecture on theoretical basic computer science.

00:12.600 --> 00:14.200
What are we doing today?

00:14.480 --> 00:17.340
We repeat the last lecture at the beginning.

00:18.180 --> 00:21.120
We had met the classes P and NP.

00:22.320 --> 00:26.780
These were the classes of problems, decision problems, which were

00:26.780 --> 00:31.740
solved by deterministic or not deterministic Turing machines in

00:31.740 --> 00:33.140
polynomial time.

00:34.560 --> 00:39.480
You have to imagine the class P as the problems that we can solve

00:39.480 --> 00:43.780
efficiently with a computer in polynomial time.

00:45.380 --> 00:53.580
And the class NP has many problems, such as the Turing-Saltzmann

00:53.580 --> 00:58.580
problem, where we don't know how to solve them efficiently.

00:58.580 --> 01:05.200
The question is whether we were just not smart enough to find an

01:05.200 --> 01:09.660
algorithm that is fast enough for such problems as Turing-Saltzmann.

01:10.220 --> 01:13.160
In this case, P equals NP.

01:15.180 --> 01:20.720
Or whether these two classes are actually different and the ones in P

01:20.720 --> 01:26.440
are the simple problems, which we can solve exactly in a computer in a

01:26.440 --> 01:27.740
suitable time.

01:28.640 --> 01:37.460
And the others in NP, but not NP, the NP-complete ones, would be the

01:37.460 --> 01:41.740
ones that are simply too difficult, where it would be impossible to

01:41.740 --> 01:43.360
find an efficient algorithm.

01:47.100 --> 01:50.380
We then got to know NP-completeness.

01:51.060 --> 01:55.620
We said that a problem is NP-complete if it is in NP, i.e.

01:55.960 --> 02:00.420
if we have non-determinism in the Turing machine, which is not the

02:00.420 --> 02:02.140
case in a realistic sense.

02:03.140 --> 02:07.040
But in the conceptual sense, if we had a non-deterministic Turing

02:07.040 --> 02:10.380
machine, then we could do it in polynomial time.

02:10.580 --> 02:14.020
That means we can verify answers quickly.

02:18.880 --> 02:24.180
And every other problem in NP can be reduced to our problem.

02:24.300 --> 02:28.860
That means our problem is at least as difficult as all other problems

02:28.860 --> 02:29.920
in NP.

02:30.860 --> 02:32.500
That's the idea.

02:36.080 --> 02:40.640
NP-completeness means that these are the problems in NP that are at

02:40.640 --> 02:42.480
least as difficult as all other problems.

02:44.580 --> 02:49.460
We assume that the question up here has to be answered with no, i.e.

02:49.520 --> 02:51.260
that P is not equal to NP.

02:51.940 --> 02:56.740
That means these are NP-complete problems for which we simply cannot

02:56.740 --> 02:57.940
find an efficient algorithm.

03:01.320 --> 03:07.780
If you prove that a problem is NP-complete, then it not only has

03:07.780 --> 03:13.520
theoretical benefits, but it simply tells you that you can give up

03:13.520 --> 03:19.060
solving the problem exactly in a passable runtime.

03:19.760 --> 03:25.660
Either you have to invest a very long runtime to solve this problem

03:25.660 --> 03:30.200
exactly, or you don't have to solve the problem exactly.

03:33.000 --> 03:36.760
That means if you have a problem, a combinatorial problem, and you try

03:36.760 --> 03:41.760
to set up an algorithm, but you don't find an efficient algorithm,

03:42.980 --> 03:47.380
then it might be a good idea to consider whether the problem is NP

03:47.380 --> 03:47.700
-complete.

03:48.760 --> 03:53.440
If it is NP-complete, then you can stop looking for an efficient

03:53.440 --> 04:00.480
algorithm or at least stop wanting to have both at the same time, i.e.

04:00.560 --> 04:02.100
exact time and fast runtime.

04:07.440 --> 04:12.520
In this definition of NP-complete, we had this little symbol here,

04:13.320 --> 04:17.200
which means to reduce or to transform polynomially.

04:18.260 --> 04:23.360
This polynomial transformation was between two problems, i.e.

04:23.400 --> 04:29.580
from a problem P' to a problem P. The idea is that we can solve

04:29.580 --> 04:39.020
instances of P' if we can solve instances of P. With an algorithm for

04:39.020 --> 04:41.780
the more difficult problem, we can also solve the easier problem.

04:44.440 --> 04:50.820
Formally, we have a map that maps each instance of P' to an instance

04:50.820 --> 04:58.840
of P. This mapping has to be calculable by a Turing machine.

05:01.880 --> 05:05.660
This mapping of instances is such that instances of Y are mapped to

05:05.660 --> 05:08.500
instances of Y and instances of Y to instances of Y.

05:09.700 --> 05:14.360
Instead of solving my original instance of P', I can look at the

05:14.360 --> 05:18.580
mapped instance of P and distinguish yes from no.

05:19.220 --> 05:21.900
This would be the same answer as before.

05:30.320 --> 05:33.460
To show that P' is NP-complete, we would have to transform

05:33.460 --> 05:35.600
polynomially every problem into NP.

05:36.700 --> 05:38.860
This is a lot of work.

05:39.780 --> 05:46.000
In fact, it is enough to show that instead of taking all problems into

05:46.000 --> 05:51.600
NP, it is enough to take a single problem, P'.

05:51.600 --> 05:53.080
But this has to be NP-complete.

05:54.440 --> 05:57.580
We have to know that it is NP-complete, P'.

05:57.580 --> 06:04.260
If we make a single reduction from P' to P, we know that P is NP

06:04.260 --> 06:05.900
-complete, too.

06:05.900 --> 06:06.740
If P is in NP.

06:08.940 --> 06:12.440
This NP-completeness can be propagated.

06:13.260 --> 06:19.760
If P' is already NP-complete and I can reduce P' to P, P is also in

06:19.760 --> 06:22.700
NP, then P is NP-complete, too.

06:23.800 --> 06:25.940
Last time we set the starting point.

06:26.400 --> 06:34.080
We proved the sentence by Cook that SAT is NP-complete.

06:34.080 --> 06:41.340
Now we can take this as a starting point, as a P', and reduce it from

06:41.340 --> 06:46.640
SAT somewhere else to show that the other problems are NP-complete,

06:47.060 --> 06:47.980
too.

06:50.100 --> 06:53.520
We will do this all the time today.

06:53.940 --> 06:57.600
We will do a lot of polynomial transformations.

06:58.480 --> 06:59.720
This is the plan for today.

07:01.520 --> 07:03.360
This is the diagram.

07:04.260 --> 07:09.560
We are in class NP and we want to separate the NP-complete from the

07:09.560 --> 07:10.380
polynomials.

07:12.320 --> 07:15.900
We already know that SAT is in this red area.

07:17.420 --> 07:22.400
An arrow means a reduction or a transformation from SAT to another

07:22.400 --> 07:23.000
problem.

07:23.640 --> 07:28.840
We will start by showing that 3SAT is NP-complete.

07:28.840 --> 07:31.860
All the problems will be defined here.

07:32.960 --> 07:38.100
We show that 3SAT is in NP, in the yellow box.

07:38.780 --> 07:43.520
We will do a reduction, a polynomial transformation from SAT to 3SAT.

07:44.380 --> 07:46.960
Now we know that 3SAT is NP-complete.

07:49.280 --> 07:59.740
We can transform from SAT to clique or we know that 3SAT is NP

07:59.740 --> 08:00.040
-complete.

08:00.380 --> 08:04.760
We can transform from 3SAT to clique or 3SAT to color and in the end

08:04.760 --> 08:07.880
we will transform from 3COLOR to EXACTCOVER.

08:10.620 --> 08:15.880
Every time, to show that a problem is NP-complete, it is important to

08:15.880 --> 08:16.500
show two things.

08:17.680 --> 08:24.380
The problem is in NP and there is a reduction for a well-known,

08:24.380 --> 08:26.580
difficult problem, P'.

08:26.580 --> 08:30.320
The first part is usually very easy.

08:30.620 --> 08:36.920
The second part is usually not so easy.

08:37.720 --> 08:41.940
There is no real cooking recipe that I can give you.

08:43.920 --> 08:48.160
It is a kind of art to do this reduction.

08:49.280 --> 08:51.100
There are people who are very focused on this.

08:51.100 --> 08:57.760
You have to have a feeling for problems and what can possibly be

08:57.760 --> 08:58.520
transformed into what.

08:59.740 --> 09:05.940
That is why the approach here in the lecture is that you learn this by

09:05.940 --> 09:08.100
showing as many examples as possible.

09:08.880 --> 09:18.120
Then I try to push you through the art of reduction and maybe take

09:18.120 --> 09:24.240
something from each reduction, see a few tricks, what you can do, and

09:24.240 --> 09:27.560
then maybe get some simple reductions yourself.

09:31.160 --> 09:33.040
So let's get started.

09:35.300 --> 09:41.340
3SAT is NP-complete and the corresponding transformation from SAT to

09:41.340 --> 09:41.840
3SAT.

09:42.320 --> 09:44.280
What is the problem 3SAT?

09:44.280 --> 09:53.580
Well, the problem 3SAT is a specialization of the problem SAT, where

09:53.580 --> 09:56.660
we have almost the same thing again.

09:56.680 --> 10:01.320
We have a lot of U of variables and a lot of C of clauses about U.

10:03.180 --> 10:10.100
A clause is a denotation of literals from U and a literal is either a

10:10.100 --> 10:12.680
positive or a negated variable.

10:13.540 --> 10:18.500
But now we also have the additional property that each clause contains

10:18.500 --> 10:20.500
exactly three literals.

10:23.000 --> 10:27.180
Last time with the sentence of Cook we had any forms, some of which

10:27.180 --> 10:27.680
were very long.

10:27.940 --> 10:30.480
We also had forms with only two or one literal.

10:30.980 --> 10:35.740
And now we have made life easier for ourselves.

10:35.780 --> 10:41.080
We have limited our instances so that the clauses only have a length

10:41.080 --> 10:41.500
of three.

10:43.640 --> 10:46.680
So actually this is a specialization of SAT.

10:47.520 --> 10:54.220
But we will show that the specialization is at least as difficult as

10:54.220 --> 10:58.920
the general problem, by transforming the general problem into the

10:58.920 --> 10:59.440
specialization.

11:01.680 --> 11:06.780
I forgot to mention what the question is about the given instance.

11:06.940 --> 11:10.360
The question is whether there is a fulfilling proof of truth for C, i

11:10.360 --> 11:10.360
.e.

11:10.400 --> 11:14.980
a proof of truth for each variable from U, the truth value true or

11:14.980 --> 11:19.240
false, so that each clause in C is fulfilled.

11:22.780 --> 11:27.940
And the sentence we want to show is that the problem 3SAT is NP

11:27.940 --> 11:28.280
-complete.

11:30.940 --> 11:31.980
Two steps.

11:32.400 --> 11:37.600
Here I will make the first step, that 3SAT is NP-complete, a bit more

11:37.600 --> 11:37.920
detailed.

11:39.080 --> 11:42.140
But in the end we will only write it down in half a sentence.

11:44.960 --> 11:48.700
So, we have to show that 3SAT is NP-complete.

11:49.000 --> 11:50.500
Again, what does that mean?

11:50.700 --> 11:57.020
It means that there exists a non-deterministic Turing machine with a

11:57.020 --> 12:04.140
polynomial time complexity function which always stops and in Qj holds

12:04.140 --> 12:10.060
for the yes-instances of 3SAT and in Qn holds for the no-instances.

12:10.300 --> 12:14.120
Everything under a suitable encoding of instances of this problem.

12:17.180 --> 12:24.220
We prove the existence of such a Turing machine by assigning it an

12:24.220 --> 12:26.440
oracle Turing machine.

12:26.500 --> 12:30.820
This is our model of a non-deterministic Turing machine, which is the

12:30.820 --> 12:31.640
most suitable.

12:32.140 --> 12:37.640
The oracle Turing machine has two phases, oracle phase and check

12:37.640 --> 12:38.260
phase.

12:38.980 --> 12:43.780
The oracle module in the first phase gives us a solution reference to

12:43.780 --> 12:54.660
the given instance and we have to think about what would be suitable.

12:55.280 --> 13:00.580
Well, we want to know if this instance of 3SAT has a true proof.

13:00.920 --> 13:03.640
So, it would be nice if the oracle gives us a true proof.

13:05.100 --> 13:08.540
So, the oracle module gives us a true proof.

13:09.360 --> 13:12.100
Either variable u, a true or a false.

13:13.180 --> 13:17.040
Then it becomes inactive, finally the control becomes active and the

13:17.040 --> 13:22.360
final control has to check if this true proof really fulfills every

13:22.360 --> 13:23.360
clause in C.

13:26.200 --> 13:33.980
And if this is the case, if all clauses are fulfilled, the final

13:33.980 --> 13:36.980
control goes into Qj state and is accepted.

13:37.860 --> 13:42.560
And if at least one clause is found that is not fulfilled under this

13:42.560 --> 13:46.480
true proof, then it goes into Qn and rejects the word.

13:49.160 --> 13:56.060
And now it is important to see that there is a suitable oracle

13:56.060 --> 14:02.380
reference that ultimately ends in Qj exactly when the instance is

14:02.380 --> 14:05.020
really solvable, when it is a yes instance.

14:06.200 --> 14:10.220
Only if there is a suitable proof, can the oracle write down the

14:10.220 --> 14:16.360
suitable proof that fulfills all clauses and only then will the final

14:16.360 --> 14:17.740
control end in Qj.

14:19.420 --> 14:21.180
Last point, the runtime.

14:21.460 --> 14:24.660
We need the polynomial time complexity function.

14:24.660 --> 14:30.320
The runtime would be linear in the size of the given clause.

14:30.680 --> 14:37.620
So the final control has to check every literal in every clause.

14:37.920 --> 14:44.920
That was the detailed version.

14:46.460 --> 14:51.420
The short version would be a solid true proof that the true proof T

14:51.420 --> 14:58.200
can be checked in the polynomial time complexity function if T

14:58.200 --> 14:58.800
fulfills all clauses.

15:01.520 --> 15:06.320
That is usually what we write down to show that there is a problem in

15:06.320 --> 15:06.660
NP.

15:07.420 --> 15:17.080
We only say that a yes instance can be checked in the polynomial time

15:17.080 --> 15:24.040
complexity function But we come to the difficult part.

15:25.760 --> 15:30.160
Sat can be polynomially transformed into 3-sat.

15:31.460 --> 15:38.520
We apply a polynomial transformation f from sat to 3-sat.

15:39.320 --> 15:43.640
If we can solve 3-sat, we can also solve sat.

15:44.320 --> 15:50.380
We have to start with any sat instance with any length in the clause

15:51.680 --> 16:00.060
and during this transformation convert it into an instance of 3-sat.

16:00.400 --> 16:02.360
In the clause of length 3.

16:03.280 --> 16:05.680
We have to make sure that it is fulfilled.

16:06.820 --> 16:10.500
If it was a yes instance before, it is now a yes instance.

16:10.500 --> 16:13.880
If it was a no instance before, it is now a no instance.

16:15.580 --> 16:18.820
So let i be any sat instance.

16:19.820 --> 16:23.540
We construct a 3-sat instance f of i.

16:24.900 --> 16:32.180
To do this, we use every clause c in i and convert it into one or more

16:32.180 --> 16:36.860
clauses that I would like to call f of c in f of i.

16:39.120 --> 16:42.960
We do this for every clause.

16:42.960 --> 16:49.020
For a fixed clause, if it has only one literal I reproduce it on the

16:49.020 --> 16:50.700
reproduction of the same literal.

16:51.380 --> 16:52.320
Three times.

16:52.740 --> 16:53.700
Exactly three times.

16:54.100 --> 16:58.000
I have to construct clauses with length 3 for 3-sat.

16:58.280 --> 17:02.440
So from the clause only x1 becomes x1 or x1 or x1.

17:04.780 --> 17:08.040
Why we do this is still relatively clear.

17:08.320 --> 17:14.120
Of course this clause is fulfilled when the original clause is

17:14.120 --> 17:14.360
fulfilled.

17:14.600 --> 17:16.040
We want to achieve this.

17:18.000 --> 17:23.100
If it consists of two literals, we repeat one literal.

17:23.260 --> 17:28.400
If it consists of x1 or x2, we do x1 or x2 or x1.

17:31.460 --> 17:35.120
If it consists of three literals, we do nothing.

17:35.880 --> 17:42.120
We leave this clause as it is even in the 3-sat instance.

17:42.600 --> 17:45.380
A difficult case is when the clause is longer.

17:45.820 --> 17:48.040
Now we have to break it up.

17:49.200 --> 17:51.180
It's not as easy as it looks.

17:51.500 --> 17:55.040
We can't just take three literals and three literals.

18:01.000 --> 18:08.120
Instead, for a clause x1 to xk with k greater than 3 we introduce new

18:08.120 --> 18:10.080
variables that I want to mark with red.

18:11.640 --> 18:14.100
These are new variables that weren't there before.

18:17.660 --> 18:19.480
k-3 will be there.

18:19.740 --> 18:23.380
They depend on this clause.

18:23.600 --> 18:26.160
For another clause, the other new variable will be there.

18:26.520 --> 18:33.780
They are called yc3-yck-1.

18:35.060 --> 18:36.320
What do we want to do now?

18:36.620 --> 18:40.400
Before, the clause was just x1 or xk.

18:40.880 --> 18:46.360
Now we replace it with a clause of length 3.

18:47.340 --> 18:52.120
On the right side, there is a quote with the intention behind it.

18:52.300 --> 18:53.160
Why do we do this?

18:57.380 --> 18:59.480
Let's go through it one by one.

18:59.480 --> 19:05.780
The first clause is x1 or x2 or yc3.

19:07.120 --> 19:12.420
The intention is a kind of implication as we saw last week.

19:15.800 --> 19:27.040
If x1 and x2 are set to false, then this yc3 has to be set to true.

19:27.040 --> 19:30.560
So that this clause is fulfilled.

19:31.700 --> 19:38.140
If x1 or x2 are set to true, then the clause above is true and the one

19:38.140 --> 19:38.700
below is true.

19:41.500 --> 19:43.920
In both cases, we want the same result.

19:44.100 --> 19:51.380
But if both are set to false, then the red variable has to be set to

19:51.380 --> 19:51.380
true.

19:53.040 --> 20:02.040
The second one is not yc3 or x3 or yc4.

20:03.060 --> 20:04.200
Here is the interpretation.

20:05.500 --> 20:10.920
If the first two literals don't fulfill this clause, then the last one

20:10.920 --> 20:10.920
has to.

20:11.880 --> 20:20.120
If yc3 is set to true, then not yc3 doesn't help.

20:21.000 --> 20:28.420
If yc3 is set to false, then yc4 has to be set to true.

20:32.320 --> 20:36.920
So if the first three, x1 and x2, are set to false,

20:40.700 --> 20:42.560
then yc4 has to be set to true.

20:43.980 --> 20:45.320
That's the idea.

20:45.320 --> 20:47.940
I want to fulfill the clause.

20:48.540 --> 20:54.820
If I don't find a true literal at the beginning, then there has to be

20:54.820 --> 20:56.040
a true one at the end.

20:56.120 --> 21:00.240
Now we want to push it through the clause.

21:01.000 --> 21:02.340
Then it goes on like this.

21:05.880 --> 21:17.300
If yck-2 is the penultimate new variable, then yck-1, xk-2 or yck-1

21:17.300 --> 21:18.720
have the same interpretation.

21:19.080 --> 21:26.180
So if yck-2 is set to true, maybe because of this chain of

21:26.180 --> 21:35.600
implications and xk-2 is also false, then at least yck-1 has to be

21:35.600 --> 21:35.860
true.

21:37.220 --> 21:38.320
And it goes on like this.

21:40.020 --> 21:53.040
If yck-1 is set to false, then at least xk-1 or xk-1 has to be true.

21:53.680 --> 21:55.720
That's the intuition.

21:56.600 --> 22:00.140
We'll see in a moment why this is the case.

22:00.640 --> 22:05.780
This is the construction rule.

22:05.780 --> 22:08.380
From this clause c we make

22:12.180 --> 22:14.740
k -4 or so clauses.

22:20.760 --> 22:22.660
This is the construction rule.

22:23.560 --> 22:28.260
Now I've told you how to transfer any set instance to a 3-set

22:28.260 --> 22:28.760
instance.

22:31.820 --> 22:34.520
Now I have to show that this is the right thing to do.

22:34.660 --> 22:38.580
The first thing I have to show is that this formation has to be

22:38.580 --> 22:39.560
possible in polynomial time.

22:42.200 --> 22:47.540
What we usually do is to count how big what we create is.

22:47.700 --> 22:52.580
And if it's only polynomially big then everything is fine.

22:52.720 --> 22:55.940
How many clauses are we going to build?

22:56.480 --> 23:00.380
Well, we're going to build at most c times u.

23:02.240 --> 23:06.840
So, how long the clauses were before basically we just go through

23:06.840 --> 23:07.380
every clause.

23:08.100 --> 23:10.320
And that's linear time to build the new clauses.

23:16.340 --> 23:22.280
Now I have to say that this transformation is satisfiable.

23:23.780 --> 23:27.540
So, yes instances exactly when later a yes instance.

23:28.100 --> 23:37.080
So, i, the set instance, is satisfiable exactly when f of i, the 3-set

23:37.080 --> 23:38.500
instance, is satisfiable.

23:40.700 --> 23:45.640
And that's always after you had the idea how to build it you have to

23:45.640 --> 23:48.900
show the equivalence and that always goes in two steps.

23:49.180 --> 23:54.940
We assume that step i is satisfiable and show that f of i is

23:54.940 --> 23:55.400
satisfiable.

23:55.720 --> 23:59.940
So, how does a solution from my original instance go into a solution

23:59.940 --> 24:02.160
of the mapped instance?

24:03.540 --> 24:08.040
So, if the original instance is satisfiable that means we have a true

24:08.040 --> 24:15.760
proof T and we have to construct a true proof for the 3-set instance.

24:16.840 --> 24:22.000
And what we do is we take the truth values of the blue variables that

24:22.000 --> 24:27.720
survived just like they are in the true proof of i.

24:28.880 --> 24:34.580
And we just extend them to the red variables.

24:34.680 --> 24:37.260
We don't have a truth value for them yet.

24:38.940 --> 24:44.600
So, if there is a literal x in i and f of i then leave T of x as it

24:44.600 --> 24:44.820
was.

24:46.280 --> 24:47.780
So, we only have these red variables

24:51.120 --> 24:53.140
for the long clauses.

24:53.280 --> 24:54.760
Let's go through the short clauses first.

24:57.420 --> 25:05.420
If the clause x1 to xk is at most 3 then it was mapped to this

25:05.420 --> 25:10.680
repeating of literals then it is, as I said, still satisfiable.

25:14.380 --> 25:20.880
But if the clause is longer than 3 then I want all the clauses I

25:20.880 --> 25:22.580
created to be satisfiable.

25:24.300 --> 25:28.600
And for that I have to give the red variables a truth value.

25:29.600 --> 25:37.200
And what I do is for this clause I have these red variables y, c, j

25:37.200 --> 25:50.220
and I set to true if at least one of the x's is set to true with an

25:50.220 --> 25:51.700
index of at least j.

25:52.840 --> 26:00.460
So, the idea behind the y, c, j is that if there is no true value in

26:00.460 --> 26:02.840
the front then there has to be a true value in the back.

26:02.980 --> 26:06.220
So, there has to be a true value among the 3 or later.

26:07.340 --> 26:09.960
That's the idea behind the red variable.

26:10.680 --> 26:14.660
And if I already know that there has to be a true value among the 3 or

26:14.660 --> 26:17.800
later but 3 is false, then there has to be a true value among the 4 or

26:17.800 --> 26:18.160
later.

26:21.000 --> 26:21.980
That's the idea.

26:22.120 --> 26:23.580
And that's how I define it here.

26:25.200 --> 26:29.620
I say y, c, j is true if there is a true value among the j or later.

26:34.440 --> 26:35.600
And false otherwise.

26:37.380 --> 26:42.940
And then you look at the clause in the front and see that under this

26:42.940 --> 26:49.020
condition because at least one of the blue variables is true each of

26:49.020 --> 26:51.400
the 3 set clauses is true.

26:55.740 --> 26:59.060
So, this extension fulfills all clauses that belong to c.

27:03.030 --> 27:04.270
That's one direction.

27:05.090 --> 27:10.410
If I know my set formula or my set instance was true then my 3 set

27:10.410 --> 27:11.050
instance is also true.

27:11.290 --> 27:12.850
But I still have to show the other direction.

27:15.270 --> 27:21.170
If I know my 3 set instance is true then my original set instance

27:21.170 --> 27:21.550
should also be true.

27:23.970 --> 27:27.050
So, I assume that the 3 set instance is true.

27:27.330 --> 27:31.690
That means I have a true value for the blue and red variables and it

27:31.690 --> 27:35.470
fulfills these mixed clauses of length 3.

27:36.630 --> 27:39.670
And I need a true value only for the blue ones.

27:40.270 --> 27:43.990
And what I do is, I simply take the true value of the blue ones as it

27:43.990 --> 27:45.810
is in the 3 set instance.

27:46.890 --> 27:48.330
I simply forget the red ones.

27:53.040 --> 27:57.060
So, we simply limit the true value of f from i to i.

27:57.520 --> 28:02.300
So, for every literal x from i, we simply take the value t of x as it

28:02.300 --> 28:04.580
is in f of i and forget the red ones.

28:05.660 --> 28:10.900
Now we have to show that we know that all 3 clauses are true and that

28:10.900 --> 28:12.820
all clauses are fulfilled short or long.

28:13.820 --> 28:16.880
Again, if the clauses were short, then it's very easy.

28:17.460 --> 28:22.380
And if the clauses were long, then it has to be based on this special

28:22.380 --> 28:23.000
construction.

28:24.540 --> 28:25.180
Good.

28:28.460 --> 28:31.480
That was the first short clause again.

28:32.280 --> 28:34.040
I know that it is fulfilled.

28:36.280 --> 28:42.460
So, either x1 is true, x2 is true, or yc3 is true.

28:43.060 --> 28:46.880
What I want to show is that this one long clause is fulfilled.

28:47.060 --> 28:47.820
Then I'm done.

28:49.100 --> 28:53.000
So, if x1 is set to true, then I can stop immediately.

28:53.240 --> 28:56.900
I have shown that there is a true value in here, namely x1.

28:56.900 --> 29:02.340
And if x2 is set to true, then I'm done, too.

29:02.500 --> 29:07.220
The only bad thing is that the first two are false.

29:08.980 --> 29:14.100
But because this clause is fulfilled, then this yc3 has to be set to

29:14.100 --> 29:14.340
true.

29:16.420 --> 29:18.280
And I still have work to do.

29:21.000 --> 29:25.340
But now I already know that this yc3 is set to true.

29:25.340 --> 29:31.660
So, the second clause, which is also fulfilled, the yc3, or in general

29:31.660 --> 29:34.540
the ycj, doesn't help me, because it was set to true.

29:36.380 --> 29:40.380
So, this clause is either fulfilled by the blue one here, then I'm

29:40.380 --> 29:43.960
happy again, because the blue one also fulfills the one up there.

29:43.960 --> 29:47.520
Or it must have been fulfilled by the ycj plus 1.

29:50.560 --> 29:59.280
And then I go on, then I know that ycj plus 1 is true, and I go to the

29:59.280 --> 30:04.180
next one, and then either the blue one, which fulfills it, or the next

30:04.180 --> 30:05.220
red one, and so on.

30:05.480 --> 30:09.900
But at some point, at least in the last step, when I haven't found a

30:09.900 --> 30:15.640
blue one yet, then I know that this yck minus 1 is also set to true, i

30:15.640 --> 30:15.820
.e.

30:15.880 --> 30:20.360
negates to false, then it either has to do xk minus 1 or xk.

30:21.700 --> 30:24.960
So, at the latest there, I find blue ones that were set to true.

30:25.960 --> 30:31.000
So, in any case, I find one literal up there, which fulfills the

30:31.000 --> 30:31.220
clause.

30:32.860 --> 30:39.860
So, t of xi is true for at least an i, and therefore this c clause is

30:39.860 --> 30:41.240
fulfilled in the original instance.

30:43.400 --> 30:44.000
Good.

30:45.580 --> 30:49.240
So, that was the first reduction for today.

30:49.500 --> 30:51.060
From sat to 3sat.

30:52.180 --> 30:54.600
Now you might ask, why 3sat?

30:54.740 --> 30:58.980
Of course there is also 4sat and 5sat, but there is also 2sat.

30:59.420 --> 31:04.700
1sat might not be so interesting, but 2sat, what would 2sat be?

31:04.840 --> 31:09.400
Well, 2sat would also be a sat instance, where the clauses have all

31:09.400 --> 31:10.060
lengths of 2.

31:10.620 --> 31:13.860
There are exactly two literals in there.

31:14.740 --> 31:17.380
So, is there a legitimate problem?

31:18.000 --> 31:23.960
The thing is, however, that a reduction like ebend does not exist, or

31:23.960 --> 31:25.400
most likely does not solve.

31:27.060 --> 31:30.760
Because you can show that 2sat is in p.

31:31.480 --> 31:35.560
There is an efficient algorithm that solves 2sat in polynomial time.

31:37.420 --> 31:39.300
So it will not be np-complete.

31:40.480 --> 31:43.300
Most likely, if p is not equal to np.

31:45.160 --> 31:47.180
I will not mention the algorithm here.

31:47.300 --> 31:48.380
I will refer to the exercise.

31:48.780 --> 31:51.540
But let the exercise instructors know if they actually do that.

31:55.440 --> 31:58.360
Another problem is max2sat.

31:59.000 --> 32:00.200
We don't want that anymore.

32:00.200 --> 32:03.140
It's the same given, just that we have a number k.

32:04.220 --> 32:08.280
And we don't want to find a proof that fulfills all clauses, but at

32:08.280 --> 32:09.660
least k of the clauses.

32:11.880 --> 32:17.720
So we add to the formula, or to the number of clauses, and now we try

32:17.720 --> 32:19.800
to fulfill at least 117 of these clauses.

32:23.420 --> 32:26.560
This, on the other hand, is np-complete.

32:28.600 --> 32:35.300
So we can efficiently decide whether everything is feasible or not.

32:35.440 --> 32:41.100
But we cannot efficiently decide whether 117 of the 152 are feasible

32:41.100 --> 32:42.000
or not.

32:44.160 --> 32:46.360
I will not make the reduction here either.

32:46.640 --> 32:50.980
Just to give you a bit of context, that in this satisfiability world

32:50.980 --> 32:54.600
there is a bit more than sat and 3sat.

32:55.880 --> 33:00.100
So, I have also entered the two here without the arrows, because I

33:00.100 --> 33:01.120
didn't show them to you.

33:01.300 --> 33:05.900
So max2sat is up here, 2sat itself is in p.

33:08.980 --> 33:14.000
So, let's move on and solve the problem of clique.

33:16.040 --> 33:20.200
And clique is a problem that is completely different than

33:20.200 --> 33:22.200
satisfiability.

33:23.440 --> 33:25.680
It is a problem on graphs.

33:27.160 --> 33:33.120
And whenever this happens, that we have to make a reduction of a

33:33.120 --> 33:38.200
problem that is satisfiability, and we have to reduce it to a problem

33:38.200 --> 33:42.500
that is in graphs, then the reductions are usually not so obvious.

33:43.620 --> 33:47.980
So you have to somehow manage to convert proofs and clauses into

33:47.980 --> 33:49.040
graphs and cliques.

33:49.040 --> 33:51.980
So, let's move on.

33:52.800 --> 33:53.780
What is a clique?

33:54.980 --> 34:01.560
In a graph, G, V, E, something like this, nodes and edges.

34:02.020 --> 34:04.580
I hope this is all known.

34:05.100 --> 34:11.260
So, a clique is a lot of nodes, a subset of nodes that are paired

34:11.260 --> 34:13.320
together by edges.

34:14.520 --> 34:18.480
So, if you imagine that it is motivated by friendships, if the nodes

34:18.480 --> 34:24.620
are people and the edges are friendships, then a clique is a group of

34:24.620 --> 34:26.380
people that are paired with each other.

34:29.700 --> 34:35.540
For example, the three on the left do not form a clique because the

34:35.540 --> 34:36.800
two on the bottom are not paired.

34:39.580 --> 34:47.740
And the question is, given a graph and a number k, is there a clique

34:47.740 --> 34:48.980
of at least k size?

34:49.140 --> 34:52.460
So, I could ask you, I give you this graph and I give you the number

34:52.460 --> 34:58.280
4, and the question is, are there 4 nodes in there that are paired

34:58.280 --> 34:58.880
together?

34:59.460 --> 35:01.300
Is there a clique of size 4?

35:02.940 --> 35:08.900
Okay, and you can see that it is not so easy to see if there is one.

35:10.620 --> 35:14.360
On the one hand, it is relatively easy even if someone shows you a

35:14.360 --> 35:18.640
clique of size 4, then you can see that it is actually one.

35:18.960 --> 35:24.200
But to see why there are none is not so trivial.

35:24.980 --> 35:28.640
Okay, and this is a hint that it might be a difficult problem.

35:30.120 --> 35:30.440
Okay.

35:32.600 --> 35:37.500
So, the sentence we show is, the clique problem is NP-complete.

35:39.120 --> 35:40.060
Again, two steps.

35:40.060 --> 35:42.780
The first is, clique lies in NP.

35:44.320 --> 35:45.980
Here is the short version.

35:47.040 --> 35:52.860
For a given amount of V-strips of nodes, we can check in polynomial

35:52.860 --> 36:03.160
time whether the pairs are actually connected by edges and that at

36:03.160 --> 36:04.540
least k are many.

36:06.320 --> 36:11.460
If someone comes across this, look here, the 4 do it.

36:12.060 --> 36:14.660
Then you can check quickly, yes, correct.

36:14.840 --> 36:16.480
These are really 4 and they are connected.

36:18.660 --> 36:19.780
That's enough.

36:21.540 --> 36:23.520
This shows that the problem lies in NP.

36:26.120 --> 36:28.000
The reduction is the difficult part.

36:29.500 --> 36:34.000
We now have to take a known NP-complete problem and transform it into

36:34.000 --> 36:34.260
clique.

36:35.400 --> 36:40.560
And so far we only know SAT and 3SAT as NP-complete problems.

36:40.680 --> 36:49.300
That means we have to transform clauses and variables into graphs and

36:49.300 --> 36:57.080
a number k, so that the true proof to clique somehow corresponds.

36:57.980 --> 37:04.400
And then it gets a bit strange and a bit unintuitive, but it works.

37:05.780 --> 37:06.940
So, how do we do this?

37:07.600 --> 37:14.460
First we take 3SAT, then we know a bit about our input, that the

37:14.460 --> 37:15.480
clauses have all lengths 3.

37:16.120 --> 37:22.580
So we have a 3SAT instance with clauses c1 to cn.

37:24.920 --> 37:35.500
Clause i is x i1 or x i2 or x i3 and all xijs are literals from the

37:35.500 --> 37:36.120
number of variables.

37:36.400 --> 37:42.560
So the variables are u1 to um, then the literals are u1 to um, u1 to

37:42.560 --> 37:43.280
um.

37:45.160 --> 37:50.040
And now we transform this thing into a clique instance.

37:50.100 --> 37:55.000
A clique instance has two things, namely a graph and a number k.

37:56.980 --> 37:58.440
So, how do we do this?

37:59.380 --> 38:05.020
We say the knots will be exactly 3n pieces.

38:05.780 --> 38:09.820
n was the number of clauses we have here.

38:10.120 --> 38:16.420
So we say a knot vij for each i1 to n and j1 to 3, so for each

38:16.420 --> 38:17.280
literal.

38:17.720 --> 38:21.080
So I draw the cliques and surround them.

38:21.080 --> 38:22.960
Each literal is now a knot.

38:24.760 --> 38:27.340
x1 can appear multiple times, but if it appears multiple times in

38:27.340 --> 38:29.620
different cliques, it's also different knots.

38:32.040 --> 38:35.640
These are my knots, so I have exactly 3n pieces, because I had n

38:35.640 --> 38:38.100
clauses for each of the three literals.

38:40.280 --> 38:41.700
And now edges.

38:42.760 --> 38:52.960
I insert an edge between vij and vkl if i is not equal to j.

38:54.180 --> 38:58.480
These are in different clauses, so I will never insert an edge between

38:58.480 --> 39:00.840
the three knots of a clause.

39:04.240 --> 39:08.540
And if I insert an edge between clauses, I have to make sure that the

39:08.540 --> 39:11.560
two literals do not contradict each other.

39:11.900 --> 39:17.660
So I can insert an x1 in clause 7 with an x1 in clause 5.

39:18.080 --> 39:22.900
I can also insert an x1 in clause 7 with an x2 or x2 crosswise in

39:22.900 --> 39:23.460
clause 5.

39:24.140 --> 39:28.760
What I can't do is insert an x1 with an x1 crosswise somewhere in the

39:28.760 --> 39:29.000
graph.

39:29.600 --> 39:30.760
That's the only condition.

39:32.160 --> 39:38.280
So I insert everything except those in the same clause and those that

39:38.280 --> 39:39.240
contradict each other.

39:39.860 --> 39:46.460
So no x1 with x1 crosswise, no u1 with u1 crosswise, no u3 with u3

39:46.460 --> 39:47.840
crosswise, and so on.

39:48.460 --> 39:50.220
Okay, that defines my edges.

39:52.060 --> 39:57.460
Now I have defined the graph and now I have to set the number k and I

39:57.460 --> 39:58.420
set k to n.

39:59.500 --> 40:04.980
So I want to ask in this graph if there is a clique of size n there.

40:06.500 --> 40:13.760
And my claim is that if I had an algorithm that decides whether there

40:13.760 --> 40:18.880
is a clique of size n in this graph, then I could decide in the SAT or

40:18.880 --> 40:21.240
3SAT instance whether there is a true or false assumption.

40:23.440 --> 40:24.160
Okay.

40:25.860 --> 40:27.640
Oh, here's an example.

40:29.120 --> 40:31.780
This is a 3SAT instance, a very small one.

40:31.860 --> 40:34.000
Three clauses, three variables.

40:34.500 --> 40:38.680
So I will have nine nodes in the corresponding graph.

40:38.680 --> 40:45.520
Here I can say that the first line is the first clause u1 or u2 or u3

40:45.520 --> 40:45.840
crosswise.

40:46.860 --> 40:49.160
The second line is the second clause.

40:49.500 --> 40:51.600
The third line is the third clause.

40:51.960 --> 40:55.600
As you can see, I have nothing connected within the lines.

40:56.560 --> 40:59.800
Between the lines I have a lot connected, just a few things not.

40:59.880 --> 41:04.440
For example, I have not connected u1 with u1 crosswise, nor this u1

41:04.440 --> 41:06.020
with this u1 crosswise down here.

41:06.260 --> 41:09.660
Or this u2 crosswise with this u2 and so on.

41:10.000 --> 41:12.240
These were a few things I did not connect.

41:12.460 --> 41:17.460
But I have connected u1 with u1 or u3 crosswise with u3 crosswise.

41:19.180 --> 41:19.780
Good.

41:21.760 --> 41:27.480
And now we have to show that this transformation does the right thing.

41:27.800 --> 41:28.440
Two things.

41:28.440 --> 41:30.480
It is in polynomial time.

41:31.400 --> 41:35.140
Again, it is enough to say that it is polynomially large.

41:35.940 --> 41:39.080
We know that there are exactly 3n nodes.

41:39.740 --> 41:44.780
And if the SAT instance had a total length of 3n, then it is linear

41:44.780 --> 41:47.260
and there are many edges.

41:49.380 --> 41:53.640
More important to show is that JAR instances are mapped to JAR

41:53.640 --> 41:54.100
instances.

41:56.300 --> 42:03.040
The 3SAT instance C is a JAR instance, just like the CLICK instance GK

42:03.040 --> 42:05.400
is a JAR instance.

42:06.780 --> 42:11.000
Again, what does JAR instance mean for the 3SAT instance?

42:11.780 --> 42:16.980
For 3SAT it means that there is a true proof that fulfills all clauses

42:16.980 --> 42:18.760
at the same time.

42:19.820 --> 42:24.680
And what does it mean that this graph G with parameter k is a JAR

42:24.680 --> 42:25.080
instance?

42:25.540 --> 42:31.460
It means that there is a node set V' with a size of at least k and all

42:31.460 --> 42:35.100
nodes are pairwise connected in this node set.

42:35.560 --> 42:38.680
There is a CLICK with a size of at least k.

42:40.340 --> 42:40.580
Good.

42:41.440 --> 42:45.600
We need two directions again.

42:46.080 --> 42:50.120
The first direction is the original 3SAT instance is a JAR instance.

42:50.120 --> 42:53.180
The second direction is the CLICK instance is a JAR instance.

42:53.560 --> 42:57.900
So we need to extract a CLICK from a true proof.

42:59.800 --> 43:01.320
And we do that as follows.

43:02.780 --> 43:10.480
We have a true proof T and we know that every clause is true.

43:11.220 --> 43:14.260
That means that for every clause there is at least one of the three

43:14.260 --> 43:18.500
that are evaluated as true, one of the three literals.

43:18.500 --> 43:21.260
Maybe more than one, but at least one.

43:21.500 --> 43:24.020
Just choose one of them.

43:24.120 --> 43:26.700
Choose a true literal from each clause.

43:27.960 --> 43:33.540
For example, this is a true proof that does it for this example.

43:33.840 --> 43:38.720
And then the first clause U1 or U2 or U3 crosswise.

43:39.440 --> 43:41.760
In fact, all three of them would fulfill this clause.

43:42.300 --> 43:43.840
I just choose U1.

43:43.840 --> 43:45.020
That fulfills the clause.

43:45.780 --> 43:48.640
I do the same for the second clause.

43:50.140 --> 43:54.760
U1 can't do it anymore.

43:55.240 --> 43:59.320
U1 is set to true, so not U1 fulfills the clause.

44:00.100 --> 44:02.760
But U2 is also set to true, so I can choose U2 here.

44:04.380 --> 44:06.440
And so I get from true proof to knots.

44:08.360 --> 44:12.240
And the observation is now, well, how many knots did I choose?

44:12.240 --> 44:17.220
I chose exactly one knot from each clause, so exactly N pieces.

44:17.840 --> 44:20.480
And N is just our K, as we chose it.

44:21.220 --> 44:24.440
And the observation is, these knots form a clique.

44:26.040 --> 44:27.940
They are paired together.

44:29.160 --> 44:30.600
That's how I defined the edges.

44:31.400 --> 44:38.380
I said that knots are paired when they come from different clauses and

44:38.380 --> 44:41.380
when they don't contradict each other in the truth values.

44:42.300 --> 44:45.760
So as long as I don't pick up U1 and U1 crosswise at the same time,

44:46.740 --> 44:47.660
everything is fine.

44:48.000 --> 44:49.940
And as long as I don't pick up two from the same clause.

44:51.540 --> 44:56.640
But I can't pick up U1 and U1 crosswise at the same time, because U1

44:56.640 --> 44:59.340
can't fulfill a clause and U1 crosswise can't fulfill another clause.

45:00.720 --> 45:02.400
It only has one truth value.

45:03.380 --> 45:06.680
And that's why they actually form a clique of size 3.

45:07.520 --> 45:09.240
Or K in general.

45:10.360 --> 45:12.680
That means it's a yes instance in the graph.

45:15.320 --> 45:16.500
And the other way around.

45:17.380 --> 45:21.480
That would be to another truth value.

45:21.600 --> 45:23.200
Sure, there are other truth values.

45:23.360 --> 45:28.500
Then there are other fillers in the clauses and other cliques.

45:28.500 --> 45:30.920
But it's important that there is a truth value.

45:32.220 --> 45:34.820
Then there is also a clique of the right size.

45:35.540 --> 45:36.860
Now comes the reverse direction.

45:37.900 --> 45:42.820
We know that there is a clique of size n.

45:43.460 --> 45:46.480
And we have to construct a truth value from it.

45:47.460 --> 45:51.680
We choose a clique of size n.

45:52.740 --> 45:59.260
Then we know that the only way to achieve this is to take a node in

45:59.260 --> 45:59.260
each clause.

46:01.100 --> 46:03.040
Because I can't pick up two from the same clause.

46:03.540 --> 46:04.400
They are not connected.

46:06.920 --> 46:11.220
And since there are exactly n clauses and the clique has size n, the

46:11.220 --> 46:14.840
only way is to have one from each line.

46:15.620 --> 46:16.580
And they are paired.

46:20.660 --> 46:30.960
What I'm doing now is I'm setting the literal value in this clique to

46:30.960 --> 46:31.320
true.

46:31.540 --> 46:34.680
So if I picked up U1, I set U1 to true.

46:35.120 --> 46:39.420
If I didn't pick up U2, then U2 is false.

46:43.700 --> 46:44.780
And that's how I do it.

46:45.060 --> 46:47.420
I go through and see if I have to set something here.

46:47.520 --> 46:48.480
Maybe I have to set something double.

46:48.480 --> 46:50.080
Maybe I have to set something I already did.

46:50.800 --> 46:53.880
Or maybe some variables haven't been set yet.

46:56.700 --> 46:59.660
So that's how I set some truth values.

47:03.360 --> 47:09.520
The observation is that I won't try to set the same variable to true

47:09.520 --> 47:10.840
and at the same time to false.

47:13.460 --> 47:16.800
That would be the risk that I would see U1 and U1 cross.

47:16.800 --> 47:22.000
And both tell me that U1 should be true and U1 should be false at the

47:22.000 --> 47:22.000
same time.

47:23.320 --> 47:28.000
But that can't happen because I don't have these edges in the graph.

47:28.180 --> 47:29.940
So they can't be in this clique at the same time.

47:32.200 --> 47:38.380
And I also see that if I choose this truth value like this, then each

47:38.380 --> 47:39.060
clause is fulfilled.

47:39.440 --> 47:41.640
Because each clause has one from the clique.

47:41.640 --> 47:43.960
And this one already fulfills the clause.

47:44.840 --> 47:46.700
What the others do doesn't matter.

47:48.720 --> 47:52.380
So from the existence of the clique I can read the truth value.

47:54.760 --> 47:55.860
Here's an example.

47:56.680 --> 47:59.880
This would also be a clique of size 3 in this graph.

48:00.420 --> 48:03.100
The first one tells me that U2 should be set to true.

48:04.300 --> 48:05.900
This one tells me that U1 should be set to true.

48:05.900 --> 48:08.520
This one tells me that U2 should be set to false.

48:09.620 --> 48:10.740
We already knew that.

48:11.780 --> 48:15.180
And U3 doesn't matter if I set it to true or false.

48:17.700 --> 48:22.260
If you want a full truth value, you can do whatever you want.

48:23.440 --> 48:24.080
Good.

48:25.820 --> 48:26.780
So that was clique.

48:28.820 --> 48:30.440
Now another problem.

48:30.440 --> 48:37.340
We're going through the canonical problems in theoretical computer

48:37.340 --> 48:37.700
science.

48:38.220 --> 48:42.800
After satisfiability, clique and then color are the most important

48:42.800 --> 48:43.580
graph problems.

48:46.500 --> 48:48.940
Color is the following problem.

48:49.220 --> 48:53.220
We have a graph G and a parameter K again.

48:53.540 --> 48:56.680
The instance looks the same as in clique, but the question is

48:56.680 --> 48:56.980
different.

48:57.960 --> 49:03.140
The question is, is there a node color of G with at most K colors?

49:03.320 --> 49:06.320
We want to assign colors to the node colors.

49:08.140 --> 49:11.080
We have K colors available in our palette.

49:12.680 --> 49:18.600
This color should be such that every two adjacent nodes, i.e.

49:18.600 --> 49:23.120
nodes connected to an edge, are not allowed to get the same color.

49:25.460 --> 49:28.860
Whenever I see an edge, the two nodes have to be colored differently.

49:29.580 --> 49:33.600
If two nodes are equally colored, they are not allowed to be connected

49:33.600 --> 49:33.980
to an edge.

49:34.520 --> 49:36.880
That's the only forbidden situation.

49:37.540 --> 49:42.340
Can I do that for this graph with this number of K colors?

49:42.700 --> 49:43.280
That's color.

49:47.180 --> 49:52.880
Then there's the problem where the parameter K is fixed.

49:53.780 --> 49:58.560
On K, 3 is fixed globally, not coded in the instance.

49:59.640 --> 50:01.120
Then it's called 3color.

50:01.940 --> 50:04.960
Then I only gave the graph.

50:06.560 --> 50:09.660
The question is, can I color it with 3 colors?

50:13.020 --> 50:14.920
That's the problem we want to look at.

50:15.120 --> 50:17.780
3color and we want to show that it's either complete or not.

50:17.780 --> 50:19.000
That's a difficult problem.

50:19.160 --> 50:21.620
We won't be able to solve it efficiently.

50:27.370 --> 50:29.770
First of all, 3color is in NP.

50:32.610 --> 50:37.650
If someone tells us how to color with 3 colors, we can check in

50:37.650 --> 50:42.490
polynomial time whether we actually only used 3 colors and whether

50:42.490 --> 50:47.190
every two nodes with the same color are not connected to an edge.

50:48.270 --> 50:52.150
It can be done in linear time in the number of edges.

50:52.290 --> 50:53.110
I just have to check every edge

50:56.170 --> 50:59.110
whether the coloring with 3 colors is really permissible.

50:59.750 --> 51:02.170
Whether the endpoints of the edges have different colors.

51:05.950 --> 51:06.550
Reduction?

51:08.390 --> 51:13.430
Well, you may have been torn back and forth where you want to reduce

51:13.430 --> 51:13.470
from.

51:13.470 --> 51:17.390
So we're looking for a problem that we can solve with a 3-color

51:17.390 --> 51:18.010
algorithm.

51:20.410 --> 51:23.630
Another problem that we can solve with a hypothetical 3-color

51:23.630 --> 51:24.210
algorithm.

51:26.430 --> 51:28.490
Now, color is a graph problem.

51:28.710 --> 51:29.870
Maybe you're thinking of clique.

51:31.690 --> 51:34.550
Maybe 3 suggests 3-sat.

51:35.830 --> 51:38.050
Here we do 3-sat.

51:39.330 --> 51:43.210
We're showing a reduction from 3-sat to 3-color.

51:44.010 --> 51:46.850
At least 3 appears on both sides.

51:47.930 --> 51:48.510
Okay.

51:50.010 --> 51:52.930
Constructions are getting trickier and trickier.

51:53.630 --> 51:56.950
Here it's even more difficult than before.

51:58.070 --> 51:58.770
Again.

51:59.590 --> 52:03.830
We have to take any instance i of 3-sat.

52:03.830 --> 52:08.010
It has variables u1-um and clauses c1-cn.

52:08.650 --> 52:11.830
We have to construct a graph out of it.

52:12.570 --> 52:15.170
Like before, but this time another graph.

52:15.310 --> 52:20.550
It has to have the property that the 3-sat is colorable exactly when

52:20.550 --> 52:22.970
the 3-sat instance was feasible.

52:24.570 --> 52:26.450
This should also be possible in polymer time.

52:27.370 --> 52:29.330
The construction goes in several steps.

52:30.350 --> 52:31.510
First step.

52:32.330 --> 52:37.730
We take a small triangle with nodes t, f, and a.

52:38.510 --> 52:43.770
This should stand for true, false, and other.

52:44.650 --> 52:44.930
Okay.

52:47.750 --> 52:49.890
They are paired together.

52:50.550 --> 52:56.730
If this graph can be colored in 3, they get 3 different colors.

52:57.570 --> 53:02.130
The color t is blue, f is red, and a is yellow.

53:02.870 --> 53:06.830
Blue is true, red is false.

53:07.430 --> 53:10.110
Yellow is an auxiliary color.

53:11.370 --> 53:11.470
Okay.

53:12.850 --> 53:15.350
Now we need variables for each variable.

53:16.390 --> 53:23.050
We model them by adding two nodes to the positive and negated

53:23.050 --> 53:23.590
literals.

53:25.090 --> 53:29.350
We connect the two nodes so that they can't be colored in red.

53:32.230 --> 53:37.550
And we connect both nodes so that they can't get the yellow color.

53:39.750 --> 53:44.950
We make two nodes for each variable ui ui and ui cross and a triangle

53:44.950 --> 53:47.130
for ui, ui cross, and a.

53:49.290 --> 53:57.350
The interpretation is if the graph can be colored in 3 then I have to

53:57.350 --> 53:58.790
put red and blue on both.

53:59.230 --> 54:02.550
I have to decide if red is left, blue is right, or red is right, blue

54:02.550 --> 54:02.890
is left.

54:03.390 --> 54:08.170
This should code whether the variable should be set to true or false.

54:10.570 --> 54:11.430
That's the idea.

54:11.430 --> 54:15.670
Now I have the clauses.

54:16.690 --> 54:22.710
For each clause you build a six-node component.

54:24.750 --> 54:30.450
It consists of an inner triangle and three satellites.

54:32.310 --> 54:34.670
You don't ask how to get the idea.

54:34.990 --> 54:37.550
I'm the oracle in the oracle-turing machine.

54:37.550 --> 54:42.830
I give you the solution and you are the final control and verify that

54:42.830 --> 54:43.310
it's true.

54:45.070 --> 54:47.830
That's how you build it.

54:48.110 --> 54:50.550
Six nodes for each clause.

54:51.490 --> 54:54.970
Now we have to do something with the literals.

54:55.550 --> 54:56.850
We have to add edges.

54:58.410 --> 55:07.150
If the clause is x, y, z then I imagine this is x, this is y, and this

55:07.150 --> 55:07.630
is z.

55:08.350 --> 55:12.010
I want to connect them with the corresponding variables.

55:12.870 --> 55:25.490
This clause would be u1 or u2 crosswise or um crosswise.

55:26.430 --> 55:27.890
That's how I connect them.

55:27.890 --> 55:32.650
I also connect all the satellites with true.

55:36.050 --> 55:37.230
That's the construction.

55:38.010 --> 55:40.890
That's how we get a graph from a 3-SAT instance.

55:43.870 --> 55:47.010
Now we have to argue that it does what we want.

55:47.830 --> 55:50.730
First, the construction is polynomial.

55:51.310 --> 55:52.210
It's not too big.

55:52.990 --> 55:56.330
How many nodes and edges does this graph have?

55:57.230 --> 56:00.830
For each variable we have two nodes.

56:01.030 --> 56:02.890
For each clause we have six nodes.

56:03.870 --> 56:05.770
Then we have three extra nodes.

56:06.370 --> 56:10.470
In fact, the number of edges is also linear.

56:11.890 --> 56:14.850
The number of nodes is in there.

56:15.530 --> 56:16.990
So the transformation is polynomial.

56:17.750 --> 56:20.930
We also have to show that it's doing the right thing.

56:21.030 --> 56:24.510
Yes instances are mapped to yes instances, no instances to no

56:24.510 --> 56:25.010
instances.

56:25.470 --> 56:32.010
The 3-SAT formula is equivalent to the 3-SAT formula.

56:35.550 --> 56:37.010
Again, two directions.

56:37.530 --> 56:40.910
First, we assume the 3-SAT formula is feasible.

56:41.690 --> 56:45.650
The 3-SAT instance has a valid proof of truth T.

56:46.730 --> 56:53.210
We have to show that this instance of three colors, this graph, is

56:53.210 --> 56:54.090
also three-colorable.

56:54.910 --> 57:03.090
So we take the proof of truth and define on this basis a coloring for

57:03.090 --> 57:03.530
the graph.

57:05.390 --> 57:09.530
As I said, the coloring for the graph will look like this.

57:09.530 --> 57:12.630
In the starting triangle, I have red, yellow and blue.

57:13.470 --> 57:16.430
T is blue, F is red, A is yellow.

57:18.090 --> 57:25.190
First, we color the literals in the same way as the proof of truth

57:25.190 --> 57:26.870
that fulfills all clauses.

57:28.530 --> 57:32.490
I can't use yellow here anyway, only red and blue.

57:34.150 --> 57:39.030
If U1 is set to true, then U1 is set to false.

57:39.470 --> 57:42.450
So I make U1 blue and U1 red.

57:43.310 --> 57:45.350
If it's the other way around, I make it the other way around.

57:49.690 --> 57:51.670
Now I have colored the knots.

57:52.510 --> 57:55.030
So far, all edges are good.

57:56.390 --> 57:58.430
Next, I want to color the satellites.

58:00.090 --> 58:03.050
These are the three outer edges of the clauses.

58:04.650 --> 58:07.110
They are all connected to T.

58:07.670 --> 58:09.210
So I can't use blue here.

58:11.010 --> 58:12.730
I can't use blue here.

58:18.310 --> 58:24.050
I'd like to use red for at least one satellite.

58:24.950 --> 58:26.690
I can't use blue here.

58:26.950 --> 58:27.830
I don't want to make them all yellow.

58:27.830 --> 58:29.270
That would be bad.

58:30.010 --> 58:32.930
I want to make red when red works.

58:33.610 --> 58:35.210
And when does red work?

58:35.310 --> 58:39.870
Well, red works when the knot over here isn't red yet.

58:42.250 --> 58:43.570
But what does that mean?

58:45.150 --> 58:49.530
I can't make this one red, because U1 is already red.

58:50.490 --> 58:53.210
In my proof of truth, U1 is set to false.

58:56.490 --> 58:58.510
I'd like to make this one red, but I can't.

58:58.850 --> 59:00.370
U2 is already red.

59:01.350 --> 59:04.190
In my proof of truth, U2 is set to false.

59:06.470 --> 59:08.950
But I know the clause was fulfilled.

59:09.350 --> 59:13.370
The clause was U1 or U2 or UM.

59:13.830 --> 59:20.110
But the last one can't be red either.

59:20.970 --> 59:23.190
Then all three letters would be red.

59:23.190 --> 59:23.930
So this clause is false.

59:24.690 --> 59:27.830
There's at least one satellite that sees a blue one.

59:28.930 --> 59:32.690
I want to set that to red, and the rest to yellow.

59:32.750 --> 59:35.150
As long as I have one red on the satellites, I'm happy.

59:37.570 --> 59:38.890
I'll do it this way.

59:40.230 --> 59:42.770
Every clause has at least one red satellite.

59:44.050 --> 59:47.170
They can't get blue satellites anyway, the rest is yellow.

59:50.210 --> 59:50.770
Good.

59:51.390 --> 59:52.710
Now I have to go on inside.

59:54.230 --> 59:58.550
Inside you can see, if I have this yellow-yellow-red pattern, I can

59:58.550 --> 59:59.510
complete it inside.

01:00:02.310 --> 01:00:05.470
The red one gets the yellow neighbor, and the other two are red and

01:00:05.470 --> 01:00:05.650
blue.

01:00:06.970 --> 01:00:11.430
If I had yellow-yellow-yellow, I couldn't complete it inside.

01:00:12.190 --> 01:00:12.970
That would be a problem.

01:00:13.990 --> 01:00:14.670
Okay.

01:00:15.270 --> 01:00:20.370
But since I have a red satellite on every clause, I can only use these

01:00:20.370 --> 01:00:22.770
three colors, and all edges look good.

01:00:24.990 --> 01:00:26.350
That's one direction.

01:00:27.410 --> 01:00:31.430
If I don't have a true proof, I get a tricolor.

01:00:31.830 --> 01:00:33.610
Now I have to do the other direction.

01:00:34.210 --> 01:00:38.670
If the instance has a tricolor, I have to construct a true proof.

01:00:39.970 --> 01:00:41.650
I'll start with the tricolor.

01:00:41.950 --> 01:00:44.450
It's already there, but I'll cover it up.

01:00:45.730 --> 01:00:51.330
First, the triangle D is colored with three different colors.

01:00:51.550 --> 01:00:52.010
OBDA.

01:00:53.110 --> 01:00:56.170
This is blue, this is red, and this is yellow.

01:00:57.590 --> 01:01:02.050
Then I see all the literals.

01:01:02.370 --> 01:01:04.110
They're all not yellow.

01:01:05.030 --> 01:01:08.110
I know one of them is red, and one of them is blue.

01:01:08.350 --> 01:01:13.370
They can't get the same color, and they can't get yellow, because

01:01:13.370 --> 01:01:14.090
they're all neighbors to A.

01:01:16.090 --> 01:01:18.810
Now I interpret this as a true proof.

01:01:20.470 --> 01:01:23.330
If U1 is red, then U1 is false.

01:01:23.970 --> 01:01:26.430
U2 is blue, so U2 is true.

01:01:27.870 --> 01:01:29.850
UM is red, so UM is false.

01:01:30.910 --> 01:01:32.470
This is the true proof.

01:01:33.210 --> 01:01:36.590
Now I have to argue that the true proof does the right thing.

01:01:37.350 --> 01:01:41.570
Every clause has at least one literal that fulfills the clause.

01:01:43.170 --> 01:01:44.590
I look at a clause.

01:01:47.050 --> 01:01:48.890
I look at the satellites.

01:01:49.970 --> 01:01:52.650
I want to argue why C1 fulfills the clause.

01:01:52.810 --> 01:01:55.610
I see that C1 has a decent color.

01:01:57.170 --> 01:02:01.390
The satellites of C1 are definitely not blue, because they're all

01:02:01.390 --> 01:02:02.510
connected to the T node.

01:02:05.870 --> 01:02:07.590
Exactly, this is not a satellite.

01:02:07.730 --> 01:02:08.650
It has the color of T.

01:02:09.630 --> 01:02:18.350
Not all satellites are yellow, because if they were all yellow, then

01:02:18.350 --> 01:02:20.090
it would not be possible to complete this in the middle.

01:02:20.090 --> 01:02:23.610
But I assume that there are three colors.

01:02:24.150 --> 01:02:26.590
So there is at least one red satellite.

01:02:28.090 --> 01:02:30.150
And this red satellite

01:02:33.970 --> 01:02:36.930
has to be connected to something that is blue.

01:02:37.590 --> 01:02:39.630
So this literal has to fulfill the clause.

01:02:41.090 --> 01:02:44.390
At least one literal per clause is blue.

01:02:45.890 --> 01:02:48.150
So every clause is fulfilled.

01:02:48.150 --> 01:02:52.410
So this true proof is actually a true proof that fulfills the clause.

01:02:55.870 --> 01:03:00.430
This also proves that three-color NP is complete.

01:03:01.430 --> 01:03:06.890
Let's take a short break and talk in general about polynomial

01:03:06.890 --> 01:03:08.470
reductions or transformations.

01:03:10.130 --> 01:03:17.230
As I said, it's an art to find the actual construction that does the

01:03:17.230 --> 01:03:17.890
desired thing.

01:03:17.950 --> 01:03:23.570
You have seen that it can be quite complex, but sometimes it's also

01:03:23.570 --> 01:03:26.350
relatively close to what you have to do.

01:03:27.550 --> 01:03:31.690
This is the best thing that came to my mind as a guide.

01:03:32.550 --> 01:03:33.930
I can't get it any better.

01:03:35.810 --> 01:03:38.650
The first and most important thing is that we want to make a

01:03:38.650 --> 01:03:39.970
polynomial transformation.

01:03:39.970 --> 01:03:42.730
So we make a line after pi.

01:03:43.900 --> 01:03:49.090
The most important thing is to choose the right direction for the

01:03:49.090 --> 01:03:49.710
transformation.

01:03:51.620 --> 01:03:59.850
So pi' on the left is what is already known to be NP-complete.

01:04:01.110 --> 01:04:06.170
And pi is what we have to show that it is NP-complete.

01:04:07.870 --> 01:04:13.050
So the workflow starts in this table, so to speak, at the top right.

01:04:14.090 --> 01:04:19.190
You only get this pi and the task is to show that it is NP-complete.

01:04:20.110 --> 01:04:22.610
You are on the right side of the transformation.

01:04:24.530 --> 01:04:28.270
And you have to transform something that you don't know yet into your

01:04:28.270 --> 01:04:28.690
problem.

01:04:29.630 --> 01:04:31.570
It's important that you don't transform in your direction.

01:04:34.010 --> 01:04:40.030
So if you start at the top right, think of a suitable problem pi'

01:04:40.310 --> 01:04:42.030
where it might work.

01:04:43.390 --> 01:04:50.190
And the rule is as similar as possible to your given problem.

01:04:53.130 --> 01:05:00.410
So if pi has something to do with graphs, then maybe pi' also has

01:05:00.410 --> 01:05:00.970
something to do with graphs.

01:05:01.930 --> 01:05:06.790
Or if it's a satisfiability variant, then maybe also a satisfiability

01:05:06.790 --> 01:05:07.370
variant.

01:05:07.870 --> 01:05:12.110
So that the transformation doesn't get so weird between graphs and

01:05:12.110 --> 01:05:12.690
formulas.

01:05:14.950 --> 01:05:17.350
So as similar as possible, I would say.

01:05:17.510 --> 01:05:19.230
And it has to be NP-complete.

01:05:19.750 --> 01:05:23.250
You only have a limited pool of possibilities that you learn here in

01:05:23.250 --> 01:05:23.650
the lecture.

01:05:24.290 --> 01:05:26.050
And one of them has to be it.

01:05:26.050 --> 01:05:28.290
So the workflow.

01:05:28.510 --> 01:05:29.030
We start at the top right.

01:05:29.570 --> 01:05:30.110
We have pi.

01:05:30.330 --> 01:05:34.810
We have chosen a suitable candidate pi', which is already NP-complete.

01:05:35.910 --> 01:05:39.110
Now we have to work with this candidate first.

01:05:39.650 --> 01:05:42.110
With pi', which you have chosen.

01:05:42.770 --> 01:05:49.030
And transform any instance from here into special instances of the

01:05:49.030 --> 01:05:50.290
problem pi.

01:05:51.870 --> 01:05:53.090
And transform any instance of the problem pi.

01:05:55.630 --> 01:06:01.950
So you have to choose a problem and you have to deal with any instance

01:06:01.950 --> 01:06:04.510
of this problem in general.

01:06:05.530 --> 01:06:09.090
And any instance of this problem in general.

01:06:09.090 --> 01:06:14.230
is transformed into special instances of the problem pi.

01:06:16.010 --> 01:06:18.750
They are usually larger.

01:06:19.690 --> 01:06:23.710
You have just seen that for every clause there are 6 nodes.

01:06:24.050 --> 01:06:26.290
Or for every variable there are 2 nodes.

01:06:26.890 --> 01:06:28.670
So they are usually larger.

01:06:29.090 --> 01:06:32.030
Sometimes much larger than what you have just seen.

01:06:32.870 --> 01:06:37.130
But you always have to argue that it is only polynomially much larger.

01:06:38.270 --> 01:06:41.150
You are not allowed to make it exponentially much larger.

01:06:42.250 --> 01:06:43.250
That's important.

01:06:44.410 --> 01:06:48.890
It's okay if they get bigger, but not too big.

01:06:49.050 --> 01:06:49.890
Only polynomially.

01:06:52.070 --> 01:06:58.170
And when you have done that, you have to say that it is a polynomial

01:06:58.170 --> 01:06:58.810
transformation.

01:06:59.230 --> 01:07:01.810
And then you have to show this equivalence.

01:07:02.350 --> 01:07:05.950
Yes instance becomes yes instance and yes instance on this side is

01:07:05.950 --> 01:07:06.950
also a yes instance here.

01:07:08.230 --> 01:07:10.950
So there are still two directions to show.

01:07:13.150 --> 01:07:13.910
Okay.

01:07:16.670 --> 01:07:20.270
Maybe a little alternative perspective.

01:07:24.580 --> 01:07:29.320
You want to prove that the problem pi is NP-complete.

01:07:30.540 --> 01:07:33.840
That means for us, you don't believe that there is an efficient

01:07:33.840 --> 01:07:34.860
algorithm for it.

01:07:37.700 --> 01:07:41.620
Imagine someone claims to have an efficient algorithm for pi.

01:07:43.260 --> 01:07:45.400
And they want to say that's nonsense.

01:07:48.900 --> 01:07:54.120
If you have an efficient algorithm for pi, then I can build an

01:07:54.120 --> 01:07:56.140
efficient algorithm for pi'.

01:07:56.140 --> 01:08:00.140
And from pi' I already know that something like that probably doesn't

01:08:00.140 --> 01:08:00.600
exist.

01:08:02.280 --> 01:08:02.540
Okay.

01:08:03.300 --> 01:08:07.140
So if someone tells me that he can solve 3SAT efficiently, then I say,

01:08:07.700 --> 01:08:12.120
if you can solve 3SAT efficiently, then I could solve general SAT

01:08:12.120 --> 01:08:12.820
efficiently.

01:08:14.100 --> 01:08:15.600
And how do I do that?

01:08:15.700 --> 01:08:21.420
I go to my SAT instance that I want to solve efficiently, change it

01:08:21.420 --> 01:08:26.260
into a 3SAT instance, use this hypothetical algorithm that someone

01:08:26.260 --> 01:08:30.900
gives me to decide the 3SAT instance, and show that this decision was

01:08:30.900 --> 01:08:33.160
the correct decision for my SAT instance.

01:08:35.380 --> 01:08:35.480
Okay.

01:08:36.600 --> 01:08:43.680
So reductions or transformations are just a special kind of algorithm

01:08:44.500 --> 01:08:49.100
that has a black box at one point that you are not allowed to see.

01:08:50.840 --> 01:08:56.160
Someone gives you the black box to solve pi, and their task is to

01:08:56.160 --> 01:09:00.200
solve a difficult problem pi' with the help of this black box.

01:09:02.580 --> 01:09:03.060
Okay.

01:09:04.960 --> 01:09:08.860
And it's important that your algorithm has polynomial time, which is

01:09:08.860 --> 01:09:12.360
the polynomiality of the transformation.

01:09:13.320 --> 01:09:15.720
And it's important that you call the black box only once.

01:09:20.450 --> 01:09:25.590
So that's what I can say about that.

01:09:26.150 --> 01:09:31.230
I tried to find simple problems where you can test yourself.

01:09:32.750 --> 01:09:35.090
So that's the test yourself task.

01:09:35.830 --> 01:09:38.590
Can you show that the following problems are not NP-complete?

01:09:40.150 --> 01:09:47.210
The left is given a graph G and a number k, and the question is, is

01:09:47.210 --> 01:09:50.790
there a fraction of knots k' from V,

01:09:55.710 --> 01:10:00.770
so that no two knots are connected in V'.

01:10:03.390 --> 01:10:04.130
Okay.

01:10:05.050 --> 01:10:11.430
So I want to know in this graph 1 million knots and k is 100.

01:10:12.070 --> 01:10:17.290
Are there 100 knots among the 1 million so that these 100 are not

01:10:17.290 --> 01:10:17.650
connected in pairs?

01:10:20.470 --> 01:10:21.190
Okay.

01:10:21.630 --> 01:10:22.810
That's a decision problem.

01:10:23.290 --> 01:10:25.230
Can you prove that this is NP-complete?

01:10:27.750 --> 01:10:32.570
The other problem is given in the graph.

01:10:34.750 --> 01:10:41.210
Is there a quadruple of knots, so that no two even-colored knots are

01:10:41.210 --> 01:10:41.530
connected?

01:10:41.530 --> 01:10:43.610
So that's four colors, so to speak.

01:10:44.490 --> 01:10:47.610
That's color where the parameter for the number of colors is exactly

01:10:47.610 --> 01:10:48.010
4.

01:10:50.710 --> 01:10:52.730
Can you show that this is also difficult?

01:10:54.490 --> 01:10:56.690
That's the test yourself task.

01:10:59.210 --> 01:10:59.890
So.

01:11:01.170 --> 01:11:03.970
That was the plan for today.

01:11:03.970 --> 01:11:05.130
We're almost through.

01:11:05.330 --> 01:11:06.330
We've done a lot.

01:11:06.330 --> 01:11:14.230
We've shown that 3SAT is NP-complete by reducing SAT to 3SAT.

01:11:15.150 --> 01:11:18.630
If we could solve 3SAT, we could also solve SAT.

01:11:19.310 --> 01:11:23.210
We've shown that clique is NP-complete by reducing 3SAT to clique.

01:11:24.410 --> 01:11:29.590
We've shown that 3COLOR is NP-complete by reducing 3SAT to 3COLOR.

01:11:32.190 --> 01:11:36.530
With every successful transformation,

01:11:39.550 --> 01:11:43.550
our pool of NP-complete problems grows, from which we can then create

01:11:43.550 --> 01:11:46.270
to show new problems that are NP-complete.

01:11:46.650 --> 01:11:49.550
We only have to take one out of the pool, already known NP-complete,

01:11:49.810 --> 01:11:51.070
and make a reduction to something new.

01:11:52.790 --> 01:11:59.530
This is a technique that is used all the time, because it tells us

01:11:59.530 --> 01:12:03.330
that we don't have to hope that we'll be able to solve the problem

01:12:03.330 --> 01:12:04.190
efficiently.

01:12:06.930 --> 01:12:11.270
Today there are millions of problems that have all been shown to be NP

01:12:11.270 --> 01:12:11.570
-complete.

01:12:12.490 --> 01:12:15.570
And again and again you look for a suitable problem.

01:12:16.530 --> 01:12:18.170
It's already known that it's NP-complete.

01:12:18.410 --> 01:12:19.290
You make this reduction.

01:12:23.590 --> 01:12:25.770
I'm going to show you one more reduction at the end.

01:12:27.830 --> 01:12:28.510
EXACTCOVER.

01:12:29.610 --> 01:12:32.590
This is the most difficult reduction of today.

01:12:33.490 --> 01:12:34.730
Why am I showing it to you?

01:12:34.850 --> 01:12:36.910
I don't expect you to come up with such a reduction yourself.

01:12:39.050 --> 01:12:44.350
It's important for you to remember that the steps we have to go

01:12:44.350 --> 01:12:46.530
through all make sense to you.

01:12:49.390 --> 01:12:50.530
It's in NP.

01:12:50.690 --> 01:12:51.310
What does that mean?

01:12:52.010 --> 01:12:55.030
There's a reduction of a NP-complete problem.

01:12:55.570 --> 01:12:57.990
What does it mean that the construction is polynomial?

01:12:58.530 --> 01:13:01.310
What does it mean that Y-instances go to Y-instances?

01:13:04.390 --> 01:13:08.510
The other reason I'm showing you this reduction is because EXACTCOVER

01:13:08.510 --> 01:13:13.010
is another type of problem that occurs quite often.

01:13:13.630 --> 01:13:16.730
In the next lectures we're going to look at this type of problem.

01:13:18.750 --> 01:13:23.590
It's neither a Boolean formula nor a graph behind it.

01:13:23.730 --> 01:13:25.130
It's simply a mass system.

01:13:28.950 --> 01:13:34.310
There are other types of problems, but these three are the most

01:13:34.310 --> 01:13:36.750
important types of problems that you want to get to know.

01:13:37.930 --> 01:13:38.570
Good.

01:13:40.050 --> 01:13:40.690
EXACTCOVER.

01:13:43.110 --> 01:13:44.890
What is the problem EXACTCOVER?

01:13:45.870 --> 01:13:53.250
We've given a quantity X of something, letters or something, and a

01:13:53.250 --> 01:13:55.290
family of partial quantities of X.

01:13:55.530 --> 01:13:56.950
An example looks like this.

01:13:57.190 --> 01:13:59.050
X is the number 1 to 7.

01:14:00.010 --> 01:14:03.910
This family of partial quantities is this quantity of partial

01:14:03.910 --> 01:14:04.110
quantities.

01:14:04.110 --> 01:14:06.050
We have the partial quantities 1, 2, 3.

01:14:06.290 --> 01:14:07.570
We have the partial quantities 1, 2, 4.

01:14:07.770 --> 01:14:11.150
We have the partial quantities 2, 3, 4, but also 5, 7.

01:14:11.850 --> 01:14:16.030
But not the partial quantities 1, 6, 7, 5, 3.

01:14:18.350 --> 01:14:22.170
Just a selection of partial quantities of X.

01:14:23.570 --> 01:14:31.750
The question is, can I select a family of partial quantities S' from

01:14:31.750 --> 01:14:37.010
S, so that I select every element of X exactly once?

01:14:44.400 --> 01:14:48.700
I want to select every element of X, but only once.

01:14:51.400 --> 01:14:56.620
If I decide to take 1, 2, 3, I can't take 1, 2, 4, because then I'd

01:14:56.620 --> 01:14:57.300
have 1 twice.

01:15:00.320 --> 01:15:01.240
That's the question.

01:15:01.360 --> 01:15:04.640
Is there a selection of partial quantities from this set of partial

01:15:04.640 --> 01:15:07.640
quantities, so that every element of X is hit exactly once?

01:15:09.740 --> 01:15:14.700
Again, it's not clear whether this is a yes-instance.

01:15:16.280 --> 01:15:19.900
Again, it should be relatively clear that a yes-instance is easy to

01:15:19.900 --> 01:15:20.340
check.

01:15:20.640 --> 01:15:23.460
If I show you one, you can easily verify it.

01:15:26.100 --> 01:15:27.780
But it's not that easy to find one right away.

01:15:28.400 --> 01:15:30.200
In this case, here's one.

01:15:32.700 --> 01:15:37.420
For example, this is a selection of partial quantities.

01:15:37.840 --> 01:15:38.740
Let's check it.

01:15:38.840 --> 01:15:40.220
Let's play a final check.

01:15:41.980 --> 01:15:45.280
First, we have to see if 1, 5 actually appears in the list.

01:15:45.280 --> 01:15:46.140
Yes, 1, 5 appears.

01:15:46.680 --> 01:15:48.540
2, 3, 4 also appears.

01:15:48.540 --> 01:15:51.040
6, 7 is also allowed.

01:15:51.160 --> 01:15:58.720
Let's see if every element of X appears and appears exactly once.

01:15:59.840 --> 01:16:00.820
Yes, 1 appears once.

01:16:01.120 --> 01:16:05.520
2, 3, 4, 5, 6, 7 appear once and not twice.

01:16:07.680 --> 01:16:09.760
That's a yes-instance.

01:16:11.280 --> 01:16:13.560
That's the exact cover problem.

01:16:15.900 --> 01:16:19.560
The sentence we show is exact cover is NP-complete.

01:16:20.700 --> 01:16:22.820
Again, the same method.

01:16:23.340 --> 01:16:26.140
First, we show exact cover is in NP.

01:16:26.540 --> 01:16:29.080
There is a non-deterministic Turing machine that decides this in

01:16:29.080 --> 01:16:30.680
polynomial time functions.

01:16:31.640 --> 01:16:35.880
The problem, and everything we say about it, is that it can be checked

01:16:35.880 --> 01:16:40.340
in polynomial time whether a given partial quantity S' really consists

01:16:40.340 --> 01:16:42.780
of the smallest quantity and covers X completely.

01:16:43.360 --> 01:16:46.580
Whether this is really a solution as requested.

01:16:47.400 --> 01:16:49.180
This is easy to do in polynomial time.

01:16:50.660 --> 01:16:52.260
Difficult part is a reduction.

01:16:54.880 --> 01:16:58.760
I now use the problem of reducing three colors.

01:17:00.440 --> 01:17:05.040
I know three colors are NP-complete according to what we did before.

01:17:06.440 --> 01:17:08.520
So it's suitable.

01:17:09.100 --> 01:17:11.820
I know that the reduction works.

01:17:15.980 --> 01:17:19.860
So, G is a random three-color instance.

01:17:20.100 --> 01:17:22.800
Now I have to be able to deal with random graphs of three colors.

01:17:25.140 --> 01:17:29.020
By the way, not only the ones we saw in the reduction.

01:17:29.480 --> 01:17:31.580
I now have to be able to deal with random graphs of three colors.

01:17:31.580 --> 01:17:33.700
I now have to be able to deal with random graphs of three colors.

01:17:35.240 --> 01:17:38.440
I now construct an exact cover instance from such a graph.

01:17:39.200 --> 01:17:44.660
When I see the graph G, I have to say what is X and what is this

01:17:44.660 --> 01:17:45.780
partial quantity S.

01:17:47.380 --> 01:17:53.400
It should be so that the graph is three-colored when this exact cover

01:17:53.400 --> 01:17:54.500
instance is an exact cover.

01:17:54.780 --> 01:17:56.820
So X has an exact cover.

01:17:58.440 --> 01:17:59.120
Good.

01:17:59.900 --> 01:18:03.700
The construction is getting a bit bigger again.

01:18:06.640 --> 01:18:09.120
C is just my amount of colors.

01:18:09.260 --> 01:18:11.960
It just stands for red, blue and green.

01:18:12.300 --> 01:18:13.500
Just for the sake of notation.

01:18:15.840 --> 01:18:17.920
This is an excerpt from my graph.

01:18:19.340 --> 01:18:25.760
This is a node V and its so-called neighborhood N of V.

01:18:26.000 --> 01:18:29.740
These are all nodes U that are connected to the edge.

01:18:30.660 --> 01:18:32.260
U1 to U5.

01:18:32.760 --> 01:18:34.080
This node has five neighbors.

01:18:35.700 --> 01:18:43.920
What I do is, I define for each node V in X a whole bunch of elements.

01:18:44.540 --> 01:18:47.600
Namely, in X there will be an element, which is also called V.

01:18:48.480 --> 01:18:53.840
If the neighborhood has a size of five, then there will be three times

01:18:53.840 --> 01:18:57.360
six other elements for this node V.

01:18:58.600 --> 01:19:04.100
There are three special ones, which are called E...

01:19:08.380 --> 01:19:10.760
I can't read it with this font.

01:19:11.460 --> 01:19:14.400
E, V, R.

01:19:14.400 --> 01:19:17.220
The red one is E, V, R.

01:19:19.460 --> 01:19:21.600
The blue one is E, V, B.

01:19:22.420 --> 01:19:23.920
And the green one is E, V, G.

01:19:25.160 --> 01:19:27.040
Three special ones for the colors.

01:19:27.720 --> 01:19:31.360
And for each neighbor and each color an element.

01:19:33.220 --> 01:19:36.600
For each of these five neighbors I have a red one.

01:19:36.760 --> 01:19:38.720
For each of the five neighbors I have a blue one.

01:19:38.720 --> 01:19:41.300
And for each of the five neighbors I have a green one.

01:19:42.400 --> 01:19:44.340
That's what I do for each node.

01:19:46.860 --> 01:19:49.040
Then I have a whole bunch of elements in X.

01:19:49.300 --> 01:19:51.260
Now I have to define these timings S.

01:19:52.960 --> 01:19:57.040
They are caught up in curves.

01:19:59.040 --> 01:20:05.460
In the timings S, when I'm at node V, I have two elements.

01:20:05.460 --> 01:20:10.160
I say there are V and these special ones for the nodes.

01:20:11.440 --> 01:20:13.400
These are three two-dimensional elements.

01:20:14.900 --> 01:20:18.520
And then I have everything that has to do with red for this node V.

01:20:20.080 --> 01:20:21.560
That's a bunch I can choose.

01:20:22.540 --> 01:20:26.060
And everything that has to do with blue for this node is a bunch I can

01:20:26.060 --> 01:20:26.280
choose.

01:20:26.600 --> 01:20:28.920
And a bunch for all the greens.

01:20:31.160 --> 01:20:36.920
Those are the amounts and the elements I choose.

01:20:37.020 --> 01:20:40.780
I won't be able to draw a complete instance because that would be too

01:20:40.780 --> 01:20:41.000
big.

01:20:41.160 --> 01:20:42.960
It's just a section for one node.

01:20:44.920 --> 01:20:48.140
The interpretation is that the three amounts are for the three colors

01:20:48.140 --> 01:20:51.140
that the node might get in the three coloring.

01:20:51.380 --> 01:21:00.480
The idea is that when the node is colored green, then this two-amount

01:21:00.480 --> 01:21:04.500
should be chosen in the exact cover.

01:21:05.380 --> 01:21:06.320
And vice versa.

01:21:06.740 --> 01:21:10.460
If this two-amount is chosen in the exact cover, then the node should

01:21:10.460 --> 01:21:10.900
be colored green.

01:21:14.300 --> 01:21:18.000
Because it's an exact cover, V has to be covered.

01:21:18.320 --> 01:21:20.220
Those are the only three amounts that contain V.

01:21:20.220 --> 01:21:22.740
There are only three possibilities how V is covered.

01:21:23.860 --> 01:21:26.480
And it's covered by exactly one of these.

01:21:27.140 --> 01:21:29.900
The node will be red, blue or green.

01:21:32.420 --> 01:21:35.340
That's how I model the color of the node.

01:21:36.120 --> 01:21:41.000
But I also have to say that neighboring nodes don't get the same

01:21:41.000 --> 01:21:41.320
color.

01:21:42.140 --> 01:21:45.560
So for an edge like this, I have to hide it.

01:21:45.560 --> 01:21:51.520
This is a random edge in the graph.

01:21:51.920 --> 01:21:52.820
U and V.

01:21:53.280 --> 01:21:56.180
I made this huge construction for both of them.

01:21:56.880 --> 01:22:02.860
And now I'm going to pull in threads that prevent them from getting

01:22:02.860 --> 01:22:06.100
the same color when there's an exact cover.

01:22:07.300 --> 01:22:12.840
And the idea is that all the amounts that I'm pulling in are of size

01:22:12.840 --> 01:22:13.460
2.

01:22:15.620 --> 01:22:23.580
And they run between different colored blocks and the corresponding

01:22:23.580 --> 01:22:28.640
elements that belong to this neighboring edge.

01:22:30.620 --> 01:22:34.660
So remember, V has the three specials up there.

01:22:35.040 --> 01:22:38.280
And for every neighbor, V had five neighbors, and for every neighbor,

01:22:38.440 --> 01:22:39.720
there was one element down here.

01:22:39.940 --> 01:22:43.100
So one of these five belongs to neighbor U.

01:22:45.040 --> 01:22:48.100
And so U has three neighbors.

01:22:48.300 --> 01:22:53.880
So for each of its three neighbors, it has three in red, three in blue

01:22:53.880 --> 01:22:54.580
and three in green.

01:22:55.020 --> 01:22:56.880
So one of them belongs to V.

01:22:56.980 --> 01:23:01.100
Let's say the second one up here belongs to U and the first one up

01:23:01.100 --> 01:23:01.700
here belongs to V.

01:23:02.840 --> 01:23:07.600
And then I'm going to connect this U element with this V element but

01:23:07.600 --> 01:23:09.940
not with the same color.

01:23:12.280 --> 01:23:15.240
It's supposed to be allowed for the one up here to turn red.

01:23:16.080 --> 01:23:19.760
Like here, this thick border means that this node just turned red.

01:23:22.720 --> 01:23:24.660
Then this one down here turned blue.

01:23:24.960 --> 01:23:28.120
And then these two are supposed to connect like this.

01:23:28.120 --> 01:23:28.940
Okay.

01:23:32.540 --> 01:23:37.540
So, that's the complete construction and a little idea of why it

01:23:37.540 --> 01:23:38.020
should work.

01:23:38.500 --> 01:23:43.680
The idea is, this is what the exact covers should look like.

01:23:44.340 --> 01:23:47.020
There's supposed to be one or two for the node.

01:23:47.160 --> 01:23:49.300
It's supposed to indicate the color for the node.

01:23:50.120 --> 01:23:53.680
The colors that are not selected are covered by these big things.

01:23:53.680 --> 01:24:03.140
And where it was selected, I still have to cover the small stuff and

01:24:03.140 --> 01:24:04.740
it's covered by these little things.

01:24:05.820 --> 01:24:08.720
And this one here is covered by another one.

01:24:09.020 --> 01:24:10.380
It belongs to another neighbor.

01:24:10.680 --> 01:24:15.600
So that's the idea.

01:24:18.320 --> 01:24:20.600
Okay, let's do this formally.

01:24:20.600 --> 01:24:22.200
First, the construction.

01:24:22.400 --> 01:24:23.220
Yes, it's big.

01:24:23.940 --> 01:24:26.580
But it's not too big, it's just polynomially big.

01:24:28.660 --> 01:24:33.760
And the three-colorability is supposed to exist when the exact cover

01:24:33.760 --> 01:24:34.180
is there.

01:24:34.420 --> 01:24:39.180
If we have the color, we have to construct the cover.

01:24:40.220 --> 01:24:42.440
And we construct it like I just indicated.

01:24:42.820 --> 01:24:50.020
If the node V is red in the color, select this amount here, V with E,

01:24:50.260 --> 01:24:54.160
V at the bottom and red at the top, into the exact cover.

01:24:55.700 --> 01:24:59.600
If the node U is blue at the bottom, we select this amount into the

01:24:59.600 --> 01:25:00.240
exact cover.

01:25:01.020 --> 01:25:05.200
From the three large amounts for V, we select the two others that are

01:25:05.200 --> 01:25:05.640
not red.

01:25:06.440 --> 01:25:09.540
And for U we select the two others that are not blue.

01:25:10.940 --> 01:25:18.100
And for each edge of the graph, each edge of the graph runs between

01:25:18.100 --> 01:25:23.520
nodes of different colors, we select the corresponding, between a red

01:25:23.520 --> 01:25:26.720
and a blue node, we select the corresponding red-blue connection

01:25:26.720 --> 01:25:29.100
between these neighboring elements.

01:25:30.740 --> 01:25:31.400
Okay.

01:25:32.300 --> 01:25:33.520
And that's the construction.

01:25:33.800 --> 01:25:38.600
And you have to make sure that each element is actually covered and

01:25:38.600 --> 01:25:39.860
each element is covered exactly once.

01:25:40.500 --> 01:25:43.960
That means that this is actually a solution of exact cover, if the

01:25:43.960 --> 01:25:47.060
color at the front was a decent, permissible 3-color.

01:25:49.460 --> 01:25:55.000
And vice versa, if the thing has an exact cover, then we roll the

01:25:55.000 --> 01:25:56.560
whole thing up from behind.

01:25:57.460 --> 01:26:00.820
We assume that the node U is somehow covered.

01:26:01.100 --> 01:26:02.980
But there are only these three possibilities.

01:26:03.760 --> 01:26:08.780
That means that the exact cover tells us either this one, or this one,

01:26:08.820 --> 01:26:10.040
or this one is in the cover.

01:26:10.400 --> 01:26:12.520
And if it's blue, then we want to color the node blue.

01:26:14.180 --> 01:26:16.800
That's how we define the 3-color from the exact cover.

01:26:19.680 --> 01:26:22.200
And now we have to show that the 3-color is permissible.

01:26:22.760 --> 01:26:27.920
So that it's not the case that the blue thing was taken from the top

01:26:27.920 --> 01:26:30.800
of the exact cover and the blue thing was taken from the bottom.

01:26:32.060 --> 01:26:34.500
That it's not possible.

01:26:34.860 --> 01:26:36.420
Well, and why is it not possible?

01:26:37.140 --> 01:26:41.100
If it was already taken blue at the bottom, then the only possibility

01:26:41.100 --> 01:26:42.680
is to cover this point here.

01:26:43.780 --> 01:26:45.140
It's only in two amounts.

01:26:46.060 --> 01:26:48.460
The only possibility is to take this whole big red.

01:26:49.640 --> 01:26:52.940
And the only possibility is to take all the big green here.

01:26:52.940 --> 01:26:59.280
And now the only possibility to take this one here, the small one that

01:26:59.280 --> 01:27:02.620
belongs to this neighbor, to the neighbor V of U.

01:27:03.620 --> 01:27:05.740
I can't take the big blue one anymore.

01:27:07.060 --> 01:27:11.920
I can only take one of these two over here, either to the red one or

01:27:11.920 --> 01:27:12.860
to the green one.

01:27:13.100 --> 01:27:14.160
There is no blue one.

01:27:15.100 --> 01:27:24.120
And if I take it to the red one, then the one up here can no longer

01:27:24.120 --> 01:27:25.100
take non-red.

01:27:27.660 --> 01:27:30.540
Then it has to be red-colored, so to speak.

01:27:31.260 --> 01:27:32.460
If I take it to the red one.

01:27:34.260 --> 01:27:36.780
So, basically the same argument, only backwards.

01:27:37.880 --> 01:27:41.180
In the text form it says exactly that, only formally.

01:27:42.300 --> 01:27:46.460
And with that, if you think about all the details in detail, I know it

01:27:46.460 --> 01:27:52.440
was a very complicated transformation, we have shown that exact cover

01:27:52.440 --> 01:27:54.360
is also an NP-complete problem.

01:27:55.560 --> 01:27:59.120
That means we have learned a lot of NP-complete problems today.

01:27:59.520 --> 01:28:02.520
These are problems that we cannot assume that we can solve quickly and

01:28:02.520 --> 01:28:03.940
exactly at the same time.

01:28:05.580 --> 01:28:08.940
The center of choice to show that something is NP-complete is

01:28:08.940 --> 01:28:10.020
polynomial transformation.

01:28:11.140 --> 01:28:16.160
We have seen four or five examples today, and we hope that you develop

01:28:16.160 --> 01:28:18.000
a feeling for how something like this works.

01:28:18.900 --> 01:28:19.680
Thank you very much.

