WEBVTT
00:00:03.839 --> 00:00:16.064
So welcome, everybody, to our lecture on autonomous
driving again. It has been two weeks since we had the last
00:00:16.064 --> 00:00:25.185
lecture. So let us start with one of the last slides that
we saw two weeks ago. We were talking about hidden
00:00:25.185 --> 00:00:32.988
Markov models. Markov models are statistical
models of systems that evolve over time, that change their
00:00:32.988 --> 00:00:42.456
internal state over time, and that can be observed by an
observer. An observer is, say, a sensor that is observing the
00:00:42.456 --> 00:00:52.570
system, and that is measuring some variables which depend on
the system state. And for such a system, we derived these
00:00:52.570 --> 00:01:04.330
equations that you can see here. The equation at
the bottom starts with the probability of the current
00:01:04.330 --> 00:01:15.141
state, given all observations up to the present point in
time. That means we assume that we have observed a
00:01:15.141 --> 00:01:25.249
certain sequence of observations, and based on that we have
calculated the probability of being in a certain state.
00:01:25.260 --> 00:01:32.635
And what we aim at is to predict the next
state, the subsequent state, for the next point in
00:01:32.635 --> 00:01:43.534
time. So that is what we want to derive, and it is shown
here in the blue parts. And yeah, we were using the rules for
00:01:43.534 --> 00:01:50.945
calculating with probabilities and, of course, the assumptions
of stochastic independence which we made for hidden
00:01:50.945 --> 00:02:01.175
Markov models to simplify the equation, so that we get this
equation here. And what we have here, this probability of S(t+1)
00:02:01.175 --> 00:02:10.002
given S(t), that is the state transition probability.
So that is the probability with which we expect that, if we
00:02:10.002 --> 00:02:19.625
are in a certain state, we reach a certain next
state. And this is what we
00:02:19.625 --> 00:02:27.764
model in the hidden Markov model. So we can assume that we
know it, or that we have made assumptions about it. The
00:02:27.764 --> 00:02:40.450
equation at the top, vice versa, relates the probability
of a certain state at a certain point in time,
00:02:40.450 --> 00:02:52.533
given only the observations up to the point in time before,
with the probability after integrating the new observation
00:02:52.533 --> 00:03:05.232
z(t) into this reasoning. So we assume we have such a predicted
state probability for the point in time t. Now we make the
00:03:05.232 --> 00:03:14.558
measurement z(t), and so we want to update our probabilities
and integrate this measurement z(t) as a newly
00:03:14.558 --> 00:03:24.443
observed variable. And we can see again, with the Bayes
formula, with the classical rules for calculating with
00:03:24.443 --> 00:03:31.978
probabilities, we can derive this relationship here.
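The two equations being discussed can be written out as follows; this is a reconstruction of the slide's formulas from the description in the lecture, with z denoting the observations and S the state:

```latex
% Innovation step: integrate the new measurement z_t
P(S_t = s \mid z_1, \dots, z_t) \;\propto\; P(z_t \mid S_t = s)\, P(S_t = s \mid z_1, \dots, z_{t-1})

% Prediction step: propagate to the next point in time
P(S_{t+1} = s' \mid z_1, \dots, z_t) \;=\; \sum_{s} P(S_{t+1} = s' \mid S_t = s)\, P(S_t = s \mid z_1, \dots, z_t)
```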
So the new probability with an integrated measurement is
00:03:31.978 --> 00:03:39.343
proportional to the predicted state probability times
the probability to make a certain observation, assuming a
00:03:39.343 --> 00:03:46.381
certain state. Again, this probability of an observation is
something that we specify in the hidden Markov model, or we
00:03:46.381 --> 00:03:57.118
can assume that we know it. Now we have these two equations
that actually establish two steps in an algorithm with
00:03:57.118 --> 00:04:07.278
which we can incrementally calculate the state probabilities.
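As a sketch, the two-step cycle just described could be implemented like this for discrete state and observation variables (an illustration, not code from the lecture; distributions are represented as plain dictionaries):

```python
def innovation(predicted, obs_prob, z):
    """Integrate a new measurement z: multiply each predicted state
    probability by P(z | state), then normalize so the result sums to one."""
    unnorm = {s: obs_prob[s][z] * p for s, p in predicted.items()}
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}

def prediction(current, trans_prob):
    """Predict the next state distribution: for each successor state,
    sum over all possible previous states, weighted by P(next | previous)."""
    states = list(current)
    return {s2: sum(trans_prob[s1].get(s2, 0.0) * current[s1] for s1 in states)
            for s2 in states}
```

Cycling innovation, prediction, innovation, and so on reproduces the incremental algorithm; only the most recent distribution has to be kept.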
This works as shown here. So we might start here. We have
00:04:07.278 --> 00:04:15.162
some predicted state probability for the very first point
in time, then we make a measurement, and we integrate this
00:04:15.162 --> 00:04:24.146
measurement in a step that is called innovation step. And
that is just implementing the equation at the top of the last
00:04:24.146 --> 00:04:33.366
slide in a computer program. Yeah, that is calculating the
left hand side of this equation by evaluating the right hand
00:04:33.366 --> 00:04:42.046
side of this equation. By doing that, we get these state
probability distributions, and now we can make use of the other
00:04:42.046 --> 00:04:51.362
equation, the equation that was written at the bottom of the
last slide in order to make a prediction for the next point
00:04:51.362 --> 00:04:59.869
in time, to predict the state probability distribution
for the next point in time, without yet having made the
00:04:59.869 --> 00:05:07.025
measurement for the next point in time. This is called the
prediction step, and it yields these predicted state
00:05:07.025 --> 00:05:14.925
probabilities. Then we have, again, these predicted
state probabilities. And once we make a new measurement, we
00:05:14.925 --> 00:05:23.643
can integrate that in the next innovation step. And by
doing that, we can cycle through these two steps and
00:05:23.643 --> 00:05:30.907
incrementally calculate the state probabilities. And we can
either start here, or we can also start here, with the state
00:05:30.907 --> 00:05:37.947
probabilities. It is a little bit application-dependent
where we start, whether we start on the left-hand side and
00:05:37.947 --> 00:05:46.485
first apply a prediction step, or whether we start with an
innovation step. So if we know the initial state very well
00:05:46.485 --> 00:05:55.581
and we do not immediately make a
measurement, then we would start with a state
00:05:55.581 --> 00:06:03.154
probability, on the state-probability side, and first
make a prediction. If we know the state very
00:06:03.154 --> 00:06:11.181
well and we immediately make a
measurement, without first having one step in time, we would
00:06:11.181 --> 00:06:19.347
start on the right side with a predicted
state probability. And if we don't know anything about the
00:06:19.347 --> 00:06:27.646
initial state, we typically start here on the right-hand
side as well. And we actually encode in the probability
00:06:27.646 --> 00:06:34.620
distribution with which we start, in this initial
probability distribution, that we don't know what the state is.
00:06:34.629 --> 00:06:43.045
So if we have two possible values for the state variable, we
would say, "Okay, we do not know anything, so each of these
00:06:43.045 --> 00:06:51.047
two possibilities occurs with a probability of fifty
percent." That would mean we encode, so to say, in the
00:06:51.047 --> 00:07:00.297
probability distribution that we are perfectly unsure about the real state.
Okay, so that means the algorithm performs like that. So
00:07:00.297 --> 00:07:08.937
we might start here with the probability of S(1). Then we
make an innovation step to integrate the first measurement, then
00:07:08.937 --> 00:07:15.537
we make the prediction step to predict the probability
of the state in the second point in time, given the first
00:07:15.537 --> 00:07:23.203
measurement, then we make an innovation step, and so on. We
would continue like that. And as I said, we could also start
00:07:23.203 --> 00:07:33.289
here, with the probability of S(0), and first make a
prediction step. Yeah, okay. So let us execute this
00:07:33.289 --> 00:07:41.023
algorithm and look at that for a very special case, namely
this case of a finite number of possible states and finite
00:07:41.023 --> 00:07:49.255
number of possible observations. So we are in the discrete random
variable case where, say, we have two or three or four
00:07:49.255 --> 00:07:57.420
or ten or one hundred or two hundred or one thousand
possible state values, or values that the state variable can
00:07:57.420 --> 00:08:04.981
take on. And as well, we have a certain number of possible
observations, so the observation is not a distance that you would
00:08:04.981 --> 00:08:12.519
measure with a real number, but the measurement would be
just a discrete value from a discrete set of
00:08:12.519 --> 00:08:20.741
possible values, something like "I can see a car" or "I
can't see a car". That would be a discrete random variable,
00:08:20.741 --> 00:08:28.794
and that would be a measurement that we would use here. So
the hidden Markov model in this example is described by such
00:08:28.794 --> 00:08:37.723
a state transition graph. And what do we see? Well, first of
all, here in this area on the left, we see in total three
00:08:37.723 --> 00:08:45.817
states, which I've named A, B, and C here, to be able to
distinguish them. Then we have these arrows, the black arrows,
00:08:45.817 --> 00:08:52.894
which indicate which transitions might happen between the
states, and the numbers next to these arrows indicate the state
00:08:52.894 --> 00:09:00.787
transition probability. So for instance, if the system is
in state A, then the probability to stay in state A, the
00:09:00.787 --> 00:09:09.934
probability that at the next point in time the state is
still A, is 0.8. For that we have this
00:09:09.934 --> 00:09:18.511
reflexive arrow, and the probability to make a transition to
state B can be seen here; it is 0.2. And of course,
00:09:18.511 --> 00:09:26.635
those must add up to one. We don't have any arrow from
state A to state C. This means the probability is zero to
00:09:26.635 --> 00:09:33.144
make such a transition. Now, it might also happen that some
transitions are deterministic, say, if the system is in
00:09:33.144 --> 00:09:40.636
state B, then the probability to make a transition
to state C is one. That means there is no other option; we
00:09:40.636 --> 00:09:48.635
will always make a transition from B to C once we enter B.
Okay. So, these are the black arrows. Then we have
00:09:48.635 --> 00:09:55.537
the observations. I have put them here in these rectangular
blocks: our observations U and V, whatever U and V
00:09:55.537 --> 00:10:05.238
might mean. As I said, "I can see a car" or "I can't see a
car" or something like that. And then we have these dashed
00:10:05.238 --> 00:10:12.589
arrows, and the dashed arrows indicate the observation
probabilities. So that means, for instance, this 0.6
00:10:12.600 --> 00:10:22.926
here means: if the system is in state A, then the
probability to make observation U is 0.6, and the
00:10:22.926 --> 00:10:32.103
probability, if the system is in state A, to make observation
V is 0.4. Yeah, both numbers must
00:10:32.103 --> 00:10:42.391
add up to one in this case. Yeah, so these are the
observation probabilities. Now let us use the basic
00:10:42.391 --> 00:10:51.156
algorithm to do some calculations with this very small hidden
Markov model. We are interested in calculating
00:10:51.156 --> 00:10:57.745
the state probabilities given
a sequence of observations, for a certain observation
00:10:57.745 --> 00:11:06.337
sequence, say U, U, V, U. So U is the first observation,
the second observation is U as well, the third observation
00:11:06.337 --> 00:11:14.716
is V, and the fourth observation is U again. And we have to
make an assumption about the initial state probability. So
00:11:14.716 --> 00:11:23.253
in this case, we assume that the probability to be in A or in B
is 0.5 each, and the probability to be in C is zero
00:11:23.253 --> 00:11:32.639
now. So we do not know whether we start in A or in B, but we
are sure that we don't start in C. Okay. Now, let
00:11:32.639 --> 00:11:40.695
us have a look. In this case, do we start with a
prediction step? No, we start with the innovation step. Okay,
00:11:40.695 --> 00:11:49.389
so let us do a little bit of calculation on the
blackboard. And then later on, we have a look at the table over
00:11:49.389 --> 00:12:04.533
there. So what do we know? We know from the text
the initial probability: P of S(1)
00:12:04.533 --> 00:12:21.579
equals A is 0.5, P of S(1) equals B is as
well 0.5, and P of S(1) equals C is equal to zero.
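For reference, the numbers of this example, as read off the state transition diagram described in the lecture, can be collected like this (a sketch; the V entries for states B and C are inferred from the sum-to-one constraint, since only P(U|B) = 0.2 and P(U|C) = 0.7 are stated explicitly):

```python
# State transition probabilities P(next | current), from the black arrows.
transition = {
    "A": {"A": 0.8, "B": 0.2, "C": 0.0},  # no arrow from A to C
    "B": {"A": 0.0, "B": 0.0, "C": 1.0},  # deterministic: from B we always go to C
    "C": {"A": 0.5, "B": 0.5, "C": 0.0},
}

# Observation probabilities P(observation | state), from the dashed arrows.
observation = {
    "A": {"U": 0.6, "V": 0.4},
    "B": {"U": 0.2, "V": 0.8},
    "C": {"U": 0.7, "V": 0.3},
}

# Initial state probabilities as assumed in the text.
initial = {"A": 0.5, "B": 0.5, "C": 0.0}

# Sanity check: every distribution must sum to one.
for dist in [*transition.values(), *observation.values(), initial]:
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```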
00:12:21.590 --> 00:12:32.937
So now let us assume we make the first measurement. The
first measurement z(1) is, as it is stated here, U. So
00:12:32.937 --> 00:12:51.649
z(1) is equal to U. I write it like that. So now
we have to calculate the probability P of S(1)
00:12:51.649 --> 00:13:03.864
equal to A given that z(1) is equal to U. This
is the innovation step: we integrate the new measurement
00:13:03.864 --> 00:13:12.221
into our probabilities. And of course,
we also have to calculate the probability of S(1) being
00:13:12.221 --> 00:13:25.597
equal to B, given that z(1) is equal to U. And the
third one is P of
00:13:25.597 --> 00:13:37.052
S(1) equal to C, given that z(1) is equal to U,
00:13:37.052 --> 00:13:45.862
like that. So these three probabilities are what we have to
calculate now. Now we can go back to the slide before, or
00:13:45.862 --> 00:13:54.287
some slides before, and have a look at what we have to
do in this innovation step. So let us go back, back, back,
00:13:54.287 --> 00:14:04.152
back, back. So the innovation step was the topmost step. And
what we see is we take the probabilities that we already
00:14:04.152 --> 00:14:12.085
have and multiply them with the observation probability
for our hidden Markov model. So here we take these
00:14:12.085 --> 00:14:21.261
probabilities, the blue ones, the ones that I have written
on the blackboard, at the top of the blackboard,
00:14:21.261 --> 00:14:30.751
and then we have to multiply each one with the
probability of observing U, this observation U, at the
00:14:30.751 --> 00:14:42.827
respective state. And this yields what? The result is proportional
to the probability that we want to calculate. Okay, let
00:14:42.827 --> 00:14:58.096
us go back to this slide. Okay, so let us remove the
equality and say this is proportional to,
00:14:58.096 --> 00:15:11.592
okay, the observation probability. That means the probability
to observe U in state A. And this probability, if we
00:15:11.592 --> 00:15:22.033
look at this diagram, the probability to observe U in
state A, is given by the dashed arrow, so this arrow
00:15:22.033 --> 00:15:33.182
here, and we can just take the probability that is written
there, 0.6. So this is 0.6 times the
00:15:33.182 --> 00:15:46.354
probability to be in state A. This is given as 0.5,
the topmost line on the blackboard, 0.5. This
00:15:46.354 --> 00:15:59.592
is equal to 0.3, right? If you find a mistake,
please complain. Then the second probability, the
00:15:59.592 --> 00:16:10.259
probability to be in state B given the observation U:
again, we have to take the probability to observe U,
00:16:10.269 --> 00:16:19.232
the observation U, in state B. We can read that again from
the state transition plot and see it is 0.2.
00:16:19.232 --> 00:16:31.197
We can see this arrow from B to U, and the
number next to it is 0.2. So we have 0.2 times
00:16:31.197 --> 00:16:42.621
the probability to be in state B, which is 0.5. This
yields 0.1. And finally, the third probability
00:16:42.621 --> 00:16:51.701
that we want to calculate, the probability to be in state C
given that z(1) is equal to U: well, we look at state
00:16:51.701 --> 00:17:00.757
C, and we see the probability to observe U in state C
is 0.7. We get that from the state
00:17:00.757 --> 00:17:11.351
transition diagram here. So we look at this arrow here, at
this arrow, with 0.7 written next to it. So it is
00:17:11.351 --> 00:17:22.210
0.7 times, well, the probability to be in C, which is
zero. So the result is equal to zero. Okay, so far. Well,
00:17:22.210 --> 00:17:34.428
now, of course, we only know it is proportional to this. And
if we sum up these numbers, we find the sum is zero point
00:17:34.428 --> 00:17:45.779
four. We expect that the sum is one, because the state can
either be A or B or C, nothing else. So the sum of all those
00:17:45.779 --> 00:17:54.584
probabilities must be equal to one. So how can we achieve
that? Well, we know there is a proportionality factor
00:17:54.584 --> 00:18:02.789
hidden here somewhere. But we know that the sum of all the
probabilities must be equal to one. So we normalize them.
00:18:02.789 --> 00:18:12.724
Yeah, we divide all these numbers by the sum of those numbers,
so by 0.4. That means from that we can conclude
00:18:12.724 --> 00:18:25.199
that P of S(1) being equal to A,
given that z(1) is equal to U, is equal to
00:18:25.199 --> 00:18:35.741
0.3 over 0.4. This is 0.75.
And the probability of S(1) being equal to B, given
00:18:35.741 --> 00:18:46.704
z(1) equal to U, is equal to 0.1 over
0.4, which is equal to 0.25. And the
00:18:46.704 --> 00:18:58.472
probability to be in state C, given that z(1) is equal
to U, is equal to zero over 0.4. And this is
00:18:58.472 --> 00:19:09.622
zero. Okay. So now we are done with the first innovation
step. Well, now we have these probabilities, we know those
00:19:09.622 --> 00:19:24.197
values, and we did the first step. Okay. So now
we have to make the next step after the innovation step. The
00:19:24.197 --> 00:19:33.524
next step is the prediction step. And for the prediction step,
we can have a look back at the slide where the formula is
00:19:33.524 --> 00:19:43.044
written. That is the equation at the bottom of this slide;
we see that this red probability that we have just
00:19:43.044 --> 00:19:50.979
calculated enters the equation, and then we have to consider
the state transition probability, which we get from the
00:19:50.979 --> 00:20:01.798
state transition diagram. And we have to sum up over all
possible prior states, or previous states, better to say, and
00:20:01.798 --> 00:20:16.058
then we get the result. Okay, what does that look like in this
case? Okay, so now we calculate the probability that
00:20:16.058 --> 00:20:29.025
state S(2), the next state, is equal to A, given the first
observation. Well, given the first observation that we
00:20:29.025 --> 00:20:39.011
already observed. Now, by definition,
by the equation that we have found, this is nothing else
00:20:39.011 --> 00:20:48.116
than the following: we go through all possible states at the
first point in time. So for all these, we have to consider
00:20:48.116 --> 00:20:58.103
all these three cases. We do not know what S(1) really
is, so we have to consider all possibilities. For each of
00:20:58.103 --> 00:21:03.984
those three possibilities, we multiply this value, the
respective value in this line, with the state transition
00:21:03.984 --> 00:21:13.434
probability. Okay, this means S(1) could be A. If
it is A, this happens with a probability of
00:21:13.434 --> 00:21:23.293
0.75, and the state transition
probability to make a transition from A to A is given by this
00:21:23.293 --> 00:21:33.269
arrow here. There is a transition from A to A, and it happens
with a probability of 0.8. So it is 0.8
00:21:33.269 --> 00:21:42.725
times 0.75. That is one possibility
of how we can enter state A. The second possibility
00:21:42.725 --> 00:21:53.257
of entering state A is that we have been in state B, which
00:21:53.257 --> 00:22:02.990
happens with a probability of 0.25,
and that we make a transition from B to A. If
00:22:02.990 --> 00:22:10.948
we look at the state transition diagram. There is no arrow
from B to A. So the probability of making such a transition
00:22:10.948 --> 00:22:24.590
is zero. So, zero times 0.25. And the third
possibility is that we have been in state C and that we
00:22:24.590 --> 00:22:34.243
make a transition, following this arrow, from C to A. That
happens with probability 0.5, so 0.5 as the
00:22:34.243 --> 00:22:44.996
transition probability times the probability of having been in
state C, which is zero in this case. So now we add up all
00:22:44.996 --> 00:22:57.315
these three possibilities to get the result, which is,
okay, this is zero, there is a zero, and 0.8
00:22:57.315 --> 00:23:11.372
times 0.75, this is 0.6.
Okay, that is the probability to be in state A. Now
00:23:11.372 --> 00:23:23.739
let us calculate the probability of the state B given that z(1)
equals U. So the same story again, so to say: we check
00:23:23.739 --> 00:23:31.994
all possibilities how we can enter state B. And for each
possibility, we multiply the state transition probability with
00:23:31.994 --> 00:23:41.724
the probability that we have been in the respective previous
state. So, okay, state A: the probability to be in state A is
00:23:41.724 --> 00:23:52.997
0.75; the probability to make a
transition from state A to B is given in the state transition
00:23:52.997 --> 00:24:01.523
diagram, it is 0.2. So, 0.2 times
0.75. The probability that we are already in
00:24:01.523 --> 00:24:11.021
state B is 0.25; the probability to make a
transition from B to B, that means to stay in B, well, if you
00:24:11.021 --> 00:24:18.617
look in the transition diagram, there is no arrow that is
reflexive, that goes from B to B. So the probability is
00:24:18.617 --> 00:24:26.844
zero: zero times 0.25. And the probability of having been in
state C is zero; the transition probability is given
00:24:26.844 --> 00:24:39.692
by this arrow, it is 0.5. So it is 0.5
times zero, and we get zero and zero.
00:24:39.692 --> 00:24:53.079
What remains is 0.2 times 0.75, which is
0.15.
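The arithmetic of this innovation step and prediction step can be checked with a short, self-contained sketch (model numbers as read off the diagram; `innovate` and `predict` are hypothetical helper names, not from the lecture):

```python
def innovate(prior, likelihood):
    """Multiply prior state probabilities by the observation
    likelihoods and normalize to sum to one."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)
    return [v / total for v in unnorm]

def predict(current, T):
    """Sum over all previous states, weighted by the
    transition probabilities T[i][j] = P(state j | state i)."""
    n = len(current)
    return [sum(T[i][j] * current[i] for i in range(n)) for j in range(n)]

# States ordered A, B, C.
T = [[0.8, 0.2, 0.0],
     [0.0, 0.0, 1.0],
     [0.5, 0.5, 0.0]]
likelihood_U = [0.6, 0.2, 0.7]  # P(U | A), P(U | B), P(U | C)

after_innovation = innovate([0.5, 0.5, 0.0], likelihood_U)
print([round(p, 6) for p in after_innovation])  # [0.75, 0.25, 0.0]
after_prediction = predict(after_innovation, T)
print([round(p, 6) for p in after_prediction])  # [0.6, 0.15, 0.25]
```

These are exactly the numbers derived on the blackboard, including the 0.25 for state C that follows from the deterministic transition B to C.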
00:24:53.089 --> 00:25:10.626
Okay. Thank you. Okay, so: the probability
to enter state C, when we know that the observation,
00:25:10.626 --> 00:25:18.683
the first observation, is U. Okay. Again the three
cases: the probability to be in A and make a transition from
00:25:18.683 --> 00:25:26.486
A to C. Well, the probability to be in A is again
0.75; the probability to make a transition from
00:25:26.486 --> 00:25:34.371
A to C, when we look into the transition diagram, there is
no arrow, that means it is zero. Zero times 0.75. The
00:25:34.371 --> 00:25:44.792
probability to be in state B is 0.25; the
probability to make a transition from B to C is one, following
00:25:44.792 --> 00:25:53.257
the transition diagram. And the probability of having been
in C already is zero, so that
00:25:53.257 --> 00:26:04.223
contribution is zero as well. So we
get 0.25 for the last case. As a result, if you
00:26:04.223 --> 00:26:11.641
like, we can check whether all our calculations are correct
by summing up all these probabilities and checking whether
00:26:11.641 --> 00:26:22.514
they sum up to one. If they don't sum up to one, then
we made a mistake. If they sum up to one, then we cannot be
00:26:22.514 --> 00:26:31.001
sure that we didn't make a mistake, but at least we can
find some mistakes by doing that. Okay, so this is the next
00:26:31.001 --> 00:26:39.129
step. And this is actually the prediction step. Now,
the next step would be to consider the next observation. So
00:26:39.129 --> 00:26:48.085
the second observation, in this case z(2), is equal to U
again. And now we would have to execute the innovation step
00:26:48.085 --> 00:26:57.566
as we did here, but now for the
second observation. For time reasons,
00:26:57.566 --> 00:27:08.744
I leave that for you to do at home.
You find the result here in the table on the slides.
00:27:08.744 --> 00:27:18.273
And when we look there, so
we start actually there: here are the initial probabilities
00:27:18.273 --> 00:27:28.332
that are given. Then we do the first innovation step, that is
what we did at the left side of the blackboard, and we get
00:27:28.332 --> 00:27:35.453
these probabilities, this, after integrating the first measurement.
Then we make the prediction step, and that is what we
00:27:35.453 --> 00:27:43.184
just did on the right-hand side of the blackboard, and get
the predicted state probabilities. Then the next step
00:27:43.184 --> 00:27:51.248
again would be to integrate the new measurement U, and we
would end up with these probabilities, which are given here
00:27:51.248 --> 00:27:59.696
rounded to two decimals. The real numbers would
have more decimals, but I've rounded all the numbers to two
00:27:59.696 --> 00:28:07.545
decimals. Okay, then we would make a prediction step again
and end up with these probabilities. Then we would integrate the
00:28:07.545 --> 00:28:16.177
third measurement, which is V in this case, to get these
numbers here, and then we might make a prediction step
00:28:16.177 --> 00:28:23.106
again. Then we integrate the fourth measurement U in the
next innovation step and make another prediction step. And
00:28:23.106 --> 00:28:36.899
like that, we can go on and continue for subsequent
observations. As we can see, what we need to do the innovation
00:28:36.899 --> 00:28:44.744
and prediction steps is, of course, the information
from the state transition diagram here: for the innovation
00:28:44.744 --> 00:28:52.709
step we need the observation probabilities, for the prediction
step we need the state transition
00:28:52.709 --> 00:29:00.692
probabilities, and we need only those probabilities which we
have calculated in the previous step of the algorithm. So
00:29:00.692 --> 00:29:09.276
we do not need to memorize things that we were calculating in
the very beginning to make calculations later. And that
00:29:09.276 --> 00:29:18.012
is the big advantage of this algorithm: we only need
to store the last state probabilities. We don't
00:29:18.012 --> 00:29:29.476
need to memorize all the past; that is not necessary. Just the
probabilities of the last steps are sufficient. So, okay,
00:29:29.476 --> 00:29:41.613
now let us go on. This is, so to say, the simple case, if you
have a finite number of possible state values and a finite
00:29:41.613 --> 00:29:50.014
number of possible observation values. We will come back
to that later in the lecture. But now, since we were
00:29:50.014 --> 00:29:57.464
talking about tracking objects and how we can measure
velocities and estimate velocities. Of course, those are state
00:29:57.464 --> 00:30:05.308
variables which are continuous, which are not discrete
random variables, and where the number of possible
00:30:05.308 --> 00:30:15.798
values is infinite. Well, then the question is: if we have an
uncountable number of states, and/or an uncountable number
00:30:15.798 --> 00:30:26.602
of observations, how can we implement this algorithm that
we developed so far for this case? So the first thing that we
00:30:26.602 --> 00:30:36.482
can do when we talk about that is we replace discrete
random variables by continuous random variables. And that
00:30:36.482 --> 00:30:46.135
means we have to switch from discrete probability distributions,
where we can store a probability for all possible
00:30:46.135 --> 00:30:55.904
values, for each value, to probability density functions to
represent the state distribution. So that means we have to
00:30:55.904 --> 00:31:05.054
use state probability densities in all the calculations, and
at every place where we had a summation in the formulas so
00:31:05.054 --> 00:31:14.128
far, we have to replace the sum by an integral. And if we
do so, we end up with these two equations, or these
00:31:14.128 --> 00:31:21.635
two formulas that are written here, which are nothing else
than what we have just derived before, but written with
00:31:21.635 --> 00:31:29.515
density functions at each point, and replacing the summation
by the integration here. But indeed, it is nothing else. So
00:31:29.515 --> 00:31:41.321
we can derive it in the same way as for the discrete case.
Now, the question is, how can we implement that? We need
00:31:41.321 --> 00:31:49.792
something with which we can do this calculation here.
Each step maps a probability density function to another
00:31:49.792 --> 00:31:57.648
probability density function. And this only works with a
reasonable amount of work if the probability density functions
00:31:57.648 --> 00:32:06.041
that we deal with have some nice properties, otherwise we
run into trouble. When we want to do this integration here,
00:32:06.041 --> 00:32:14.129
for instance, or when we want to do this multiplication, and
then we need to normalize everything, then we would run
00:32:14.129 --> 00:32:23.164
into trouble. But for some very nice cases of probability
distributions, things work, and we can really, explicitly and
00:32:23.164 --> 00:32:31.473
analytically, do this integration here and analytically resolve
all these calculations that are necessary here. So for
00:32:31.473 --> 00:32:41.230
which kind of probability distributions does it
apply? The answer is: for so-called linear Gaussian models.
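A scalar linear Gaussian model of the kind described next can be sketched like this (illustrative numbers, not from the lecture; `step` and `observe` are hypothetical helper names):

```python
import random

def step(s, a, u, sigma):
    """One transition of a scalar linear Gaussian model:
    s_next = a * s + u + eps, with eps drawn from N(0, sigma^2)."""
    return a * s + u + random.gauss(0.0, sigma)

def observe(s, c, tau):
    """Linear Gaussian observation: z = c * s + delta, delta ~ N(0, tau^2)."""
    return c * s + random.gauss(0.0, tau)

random.seed(0)
s = 0.0
for _ in range(5):
    s = step(s, a=1.0, u=2.0, sigma=0.1)  # drifts by about 2 per time step
z = observe(s, c=1.0, tau=0.5)            # noisy measurement of the state
```

The point of the linearity and Gaussianity assumptions is that the filtered distribution then stays Gaussian, so the update and prediction steps can be carried out analytically.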
00:32:41.240 --> 00:32:51.784
What does it mean? It means that the relationship between
the state and the successor state should be described by a
00:32:51.784 --> 00:33:01.742
linear mapping plus an additive Gaussian noise, a small
amount of noise that is randomly chosen from a Gaussian
00:33:01.742 --> 00:33:10.438
distribution. Now, that is one piece. And the second part is
that also the observation is assumed to depend on the state
00:33:10.438 --> 00:33:20.195
by a linear relationship. So the observation depends linearly
on the state, plus an additive, small, random noise that
00:33:20.195 --> 00:33:30.122
is taken from a Gaussian distribution. So this is shown
here in formulas. So we assume that the
00:33:30.122 --> 00:33:39.985
successor state S(t+1) depends on the present state
S(t) by a linear
00:33:39.985 --> 00:33:48.548
function. If we deal with vectors, so S(t) is a vector and
S(t+1) might be a vector, a linear function is
00:33:48.548 --> 00:33:56.649
implemented by a matrix multiplication, plus an additional
offset that we might add. So S(t+1) equals a
00:33:56.649 --> 00:34:06.806
matrix A(t) that describes the system behavior, times the
present state, plus u(t), a constant offset that we add to
00:34:06.806 --> 00:34:17.520
the state. So that is the general form of a linear function
of S(t). And then we have this additional
00:34:17.520 --> 00:34:29.862
epsilon(t). This epsilon(t) is assumed to be a random variable
that is true, chosen from a Gauchian distribution with
00:34:29.862 --> 00:34:41.911
Cyril Mien and a certain matrix um. This is said
to be often in literature. This is known as white noise. A
00:34:41.911 --> 00:34:50.229
white noise means a random variable absolent T, which has
zero expectation value. Therefore, it is called white .
00:34:50.239 --> 00:35:00.662
And your noise means just some random dissipation I saw.
This is Eclipse. So as I said, is a a matrix. We
00:35:00.662 --> 00:35:08.186
assume that we can describe this system
property, the randomness that is in the system, and that we
00:35:08.186 --> 00:35:17.422
know it and can describe it by this covariance matrix
Q_t. So, yeah, you might see that these variables,
00:35:17.422 --> 00:35:30.299
A_t, u_t, Q_t: I always put this index t. This means,
in theory, they can differ from time step to time step,
00:35:30.299 --> 00:35:42.059
but they must be known in advance, and they must not depend
on the current state, but they might be
00:35:42.059 --> 00:35:51.815
different from time step to time step. In the literature,
people often don't write this lower index t and make it a
00:35:51.815 --> 00:36:00.815
little bit simpler by saying: well, we assume that those state
transition matrices and so on are always the same for
00:36:00.815 --> 00:36:09.989
all steps. That is what you typically find in
the literature. But indeed, what you really could do is you
00:36:09.989 --> 00:36:18.234
could change these variables for each time step. And in
practice, this becomes relevant when we deal with real
00:36:18.234 --> 00:36:25.759
measurement processes. When we run real measurement
processes, it might happen that the measurements do not come
00:36:25.759 --> 00:36:33.620
in at equally spaced time intervals, but at unequally spaced
time intervals. So it might happen that between the
00:36:33.620 --> 00:36:42.219
first and the second measurement, maybe there is one
second in between, and between the second and the third
00:36:42.219 --> 00:36:50.904
measurement, there is one point five seconds in between.
And of course, if you think about a car that is moving, the
00:36:50.904 --> 00:36:59.582
distance that is covered in one second or in one point five
seconds is different, and therefore A_t needs to be
00:36:59.582 --> 00:37:08.935
different. In these cases, the same may hold for u_t; it depends
on what you want. And maybe if there is more time between
00:37:08.935 --> 00:37:17.007
two measurements, between two points in time,
then maybe also this noise is larger, because the driver
00:37:17.007 --> 00:37:24.825
might have accelerated more, whatever. Now, therefore, in
practice, when you implement something and you have unequally
00:37:24.825 --> 00:37:36.360
spaced points in time, then it becomes relevant to make
this system matrix and this offset and the
00:37:36.360 --> 00:37:46.125
noise covariance depend on the length of the time interval,
for instance. But if you go to the literature, for the
00:37:46.125 --> 00:37:55.926
simplest case, you might first assume that
A_t and u_t and Q_t, that those are
00:37:55.926 --> 00:38:04.130
independent of the time step. Okay. So, that is
the state transition model. And now we look at the
00:38:04.130 --> 00:38:11.271
measurement model. We see a very similar
structure: z_t, the measurement, depends on s_t by a
00:38:11.271 --> 00:38:18.250
multiplication with the matrix H_t, a measurement matrix H_t
that relates in which way the measurement depends on s_t.
00:38:18.250 --> 00:38:25.860
That is also a linear relationship, as we can see. The only
difference is that we don't have such an offset u_t here.
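The two equations, state transition and measurement, can be sketched together as a single simulation step. This is just an illustration in numpy; the matrices passed in are whatever model you have, and the values in the test usage below are placeholders.

```python
import numpy as np

def simulate_step(s_t, A, u, Q, H, R, rng):
    """One step of a linear Gaussian model:
    s_{t+1} = A @ s_t + u + eps,    eps   ~ N(0, Q)  (state transition)
    z_{t+1} = H @ s_{t+1} + delta,  delta ~ N(0, R)  (measurement)"""
    eps = rng.multivariate_normal(np.zeros(A.shape[0]), Q)
    s_next = A @ s_t + u + eps
    delta = rng.multivariate_normal(np.zeros(H.shape[0]), R)
    z_next = H @ s_next + delta
    return s_next, z_next
```

With Q and R set to zero matrices, the step becomes deterministic, which is a convenient way to check that A, u, and H are wired up correctly.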
00:38:25.869 --> 00:38:35.494
That is more for practical reasons. You might also add an
offset here in this equation; it would work in the same way,
00:38:35.494 --> 00:38:44.613
actually. However, in practice this
kind of technique is often implemented in the
00:38:44.613 --> 00:38:53.732
simplified version, because we
are free in defining what we treat as the measurement. So
00:38:53.732 --> 00:39:03.230
if the sensor, say, outputs a raw measurement, and we
assume that there is some offset here in the measurement,
00:39:03.230 --> 00:39:12.728
then we can subtract that offset from the measurement and treat
this modified measurement as the measurement that we use for
00:39:12.728 --> 00:39:20.594
the description of this relationship. So actually, if
you like to write an additional offset here, then you
00:39:20.594 --> 00:39:28.796
are free to do that. But we don't need it. However, we have
a linear relationship given by this measurement matrix
00:39:28.796 --> 00:39:36.526
H_t. And again, we have this delta here. That, again,
is some random variable which describes the randomness in
00:39:36.526 --> 00:39:44.042
the measurement. Again, some noise: Gaussian noise, white
Gaussian noise. We assume that it is taken from a Gaussian
00:39:44.042 --> 00:39:51.971
distribution with zero mean and with a certain covariance
matrix R_t that we again assume to be known. And again,
00:39:51.971 --> 00:40:03.743
you see these indices here, for R_t here and H_t here.
In the literature, these entities are usually not
00:40:03.743 --> 00:40:12.808
indexed by the point in time. But again, theoretically,
you could do that. You could say: okay, for each
00:40:12.808 --> 00:40:19.603
point in time, the uncertainty, for instance, of the measurement
is different, so this covariance matrix, or this
00:40:19.603 --> 00:40:25.469
measurement equation, is different. Now, this might be relevant,
for instance, if you have two different sensors, and you
00:40:25.469 --> 00:40:32.129
want to integrate them. At some points in time, you get
measurements from one sensor, and at other points in
00:40:32.129 --> 00:40:39.071
time, you get measurements from the other sensor. And at some
point in time, you get measurements from both sensors, and
00:40:39.071 --> 00:40:46.728
actually you have to update the H_t matrix depending
on which measurements are available. So that would be
00:40:46.728 --> 00:40:54.779
a case in which that is necessary, but for simplicity, for
understanding the principle, assume that the measurements are
00:40:54.779 --> 00:41:04.806
all of the same type every time, and that H_t and also
R_t are the same every time. But theoretically, in
00:41:04.806 --> 00:41:13.380
some cases, it might be relevant to modify them
from time to time. But of course,
00:41:13.380 --> 00:41:24.440
this choice of R_t and H_t must not depend on the present
state, definitely not, nor on our guess of what the present state
00:41:24.440 --> 00:41:33.026
is, or something like that. Also, theoretically, it is
not allowed to make, for instance, R_t depend on the
00:41:33.026 --> 00:41:42.103
measurement that we currently made, although people do it
sometimes in practice. Theoretically, this is not sound. Yeah,
00:41:42.103 --> 00:41:53.870
this is what I said. So let us have a look at a very simple
example of how we can model a real problem like that. So
00:41:53.870 --> 00:42:03.569
we assume we have a car that is moving along a road. We
treat that as a one-dimensional problem, in the sense that
00:42:03.569 --> 00:42:12.232
we only consider the longitudinal position of the car,
not the lateral position within the lane, but only the
00:42:12.232 --> 00:42:20.195
longitudinal position. So we have a coordinate system, an axis
along which we measure the position of the car, or want to
00:42:20.195 --> 00:42:27.378
model the position of the car. Then the car has a
velocity. We assume that it is driving with more
00:42:27.378 --> 00:42:35.519
or less constant velocity. So then we could say: okay, the
state vector consists of x_t and v_t, the position and
00:42:35.519 --> 00:42:43.779
the velocity. And we know that the position at
the next point in time is actually the present position,
00:42:43.779 --> 00:42:51.562
plus the length of the time interval.
Let us say delta t should be the length of the
00:42:51.562 --> 00:42:58.294
time interval, times v_t, the present velocity. If
we assume constant velocity, then we can conclude that
00:42:58.294 --> 00:43:05.862
this is the expected position for the next point in time. And
of course, there is some random noise, this
00:43:05.862 --> 00:43:13.689
white Gaussian noise, which occurs because the car is not perfect
in its motion. The driver might accelerate a
00:43:13.689 --> 00:43:21.900
little bit, whatever, so there is always some imprecision
in this movement. And for the velocity, we conclude: well,
00:43:21.900 --> 00:43:29.161
the new velocity is equal to the old velocity, plus some
randomness here. Again, the driver might accelerate a
00:43:29.161 --> 00:43:37.285
little bit, and that can be modelled by this uncertainty
term. So now we see that, okay, this is the
00:43:37.285 --> 00:43:44.911
time interval, as I already said.
Now we can take these two equations and combine them
00:43:44.911 --> 00:43:52.872
into a matrix-vector multiplication equation. So if we
summarize the state as s_t, the state vector that
00:43:52.872 --> 00:44:02.810
contains x_t and v_t, then we can say that the new state,
s_{t+1}, is equal to this matrix,
00:44:02.829 --> 00:44:13.148
one, delta t, zero, one, times s_t, plus this
vector of uncertainties. If we do this
00:44:13.148 --> 00:44:21.706
multiplication, then we see that what we get is exactly the
right-hand side of the two equations above, so we are able
00:44:21.706 --> 00:44:29.957
to rewrite these two system equations in the appropriate way
that we need to treat this as a linear Gaussian model. Well,
00:44:29.957 --> 00:44:39.492
so now, A_t is this matrix here, and u_t is a vector
that just contains zeros. So we don't have a constant
00:44:39.492 --> 00:44:47.668
offset for this problem, and the vector of uncertainties
is the vector that contains the position and velocity
00:44:47.668 --> 00:44:55.638
uncertainties. Now, for the measurement: let us assume
we have a binocular camera system. With a
00:44:55.638 --> 00:45:03.555
binocular camera system, we can measure the position of the vehicle.
So we say the measurement, the measured position, our
00:45:03.555 --> 00:45:13.125
measurement z_t, is equal to x_t plus delta_t. Here x_t
is the real position, which we do not know, but
00:45:13.125 --> 00:45:22.479
which somehow exists. And z_t
is the sensed position, what we get from the camera system.
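As a small sanity check of this car model (dt = 1 s; the numeric state values are made up for illustration), the matrix form with A = [[1, dt], [0, 1]] and H = [1, 0] reproduces the two scalar equations exactly:

```python
import numpy as np

dt = 1.0                        # length of the time interval (assumed)
A = np.array([[1.0, dt],        # x_{t+1} = x_t + dt * v_t
              [0.0, 1.0]])      # v_{t+1} = v_t
u = np.zeros(2)                 # no constant offset in this problem
H = np.array([[1.0, 0.0]])      # we measure only the position

s_t = np.array([5.0, 1.5])      # hypothetical state: 5 m, 1.5 m/s
s_next = A @ s_t + u            # noise-free transition: [6.5, 1.5]
z_t = H @ s_t                   # noise-free measurement: [5.0]
```
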
00:45:22.489 --> 00:45:30.623
Both are similar, but not the same, because there is always
some imprecision in the sensor and in the evaluation
00:45:30.623 --> 00:45:38.964
of the camera image, in the evaluation process. So therefore
this uncertainty, this little bit of randomness which
00:45:38.964 --> 00:45:48.074
exists, is covered by this delta_t, by this measurement
noise. So if we have this, this is our only measurement
00:45:48.074 --> 00:45:56.494
that we get with the camera; we can't measure
anything else. We assume we just measure the position. So let us
00:45:56.494 --> 00:46:04.863
rewrite this measurement equation in the form that we need to
see that it is a linear Gaussian model. So we say that z_t is
00:46:04.863 --> 00:46:14.140
equal to a matrix with one row and two columns, that
contains one, zero, times s_t. If we multiply out this
00:46:14.140 --> 00:46:22.851
product here, we get exactly one times x_t
plus delta_t, the measurement noise. So this
00:46:22.851 --> 00:46:31.983
means that also this equation fits our needs of a linear
Gaussian model. So for linear Gaussian models, like the ones
00:46:31.983 --> 00:46:42.058
that we just introduced, it holds that if we choose a Gaussian
distribution, a Gaussian probability density function,
00:46:42.058 --> 00:46:52.490
to model these kinds of distributions, the distribution
of the current state given all observations up to that
00:46:52.490 --> 00:47:02.686
point, then it can be shown analytically, by a lot of annoying
calculations, that this predicted state probability
00:47:02.686 --> 00:47:12.386
can also be described by a Gaussian probability density
function. So that means we start with a Gaussian distribution,
00:47:12.386 --> 00:47:23.552
and the prediction step also provides a
Gaussian distribution. And furthermore, if these
00:47:23.552 --> 00:47:33.284
predicted state probabilities can be described by a Gaussian
distribution, then the next time step's state
00:47:33.284 --> 00:47:42.678
probabilities can also be described by a Gaussian. And that
means the innovation step also maps Gaussian distributions to
00:47:42.678 --> 00:47:51.547
Gaussian distributions. That means if we have a linear Gaussian
model, and we use Gaussian distributions to model the
00:47:51.547 --> 00:47:58.692
state probabilities, then all the distributions
which occur throughout all calculations are Gaussian. And that
00:47:58.692 --> 00:48:07.585
is nice because to represent a Gaussian distribution, we
only need to know two things: the expectation value, this
00:48:07.585 --> 00:48:18.575
mu vector, and the covariance matrix, this Sigma matrix. So that
means, in this case, a prediction step takes an expectation
00:48:18.575 --> 00:48:29.142
value vector and a covariance matrix of a Gaussian distribution
and yields an expectation value vector and an
00:48:29.142 --> 00:48:40.884
other covariance matrix. And the innovation step takes again
such an expectation value vector and a covariance matrix and
00:48:40.884 --> 00:48:51.012
a measurement, and yields an expectation value vector and
a covariance matrix. So we can calculate purely with these
00:48:51.012 --> 00:48:59.352
expectation value vectors and the covariance matrices,
and map those to new ones, and everything can be
00:48:59.352 --> 00:49:06.565
calculated analytically. So we don't need
numerical integration and all that annoying stuff. Okay, this
00:49:06.565 --> 00:49:16.283
is shown here. So, say, we start here with
an initial estimate of the predicted state
00:49:16.283 --> 00:49:23.955
distribution, using a density function to represent
this distribution. That means we provide this value of
00:49:23.955 --> 00:49:31.913
expectation, this expectation value, which is
written here as mu_{t+1} predicted, and the covariance matrix,
00:49:31.913 --> 00:49:40.268
which is given by Sigma_{t+1} predicted. Then we
execute an innovation step that integrates a new measurement
00:49:40.268 --> 00:49:47.431
vector z_{t+1}, and provides another Gaussian distribution
represented by the new expectation value
00:49:47.431 --> 00:49:56.779
vector mu and the new covariance matrix Sigma. So
what we do is: we take these two things here,
00:49:56.789 --> 00:50:04.454
mu_{t+1} predicted and Sigma_{t+1} predicted, and the
measurement, and we produce mu_t and Sigma_t for the
00:50:04.454 --> 00:50:12.401
next point in time; in between, we increment the time
index. And then, for the prediction step, we take these two
00:50:12.401 --> 00:50:19.496
here, mu_t and Sigma_t, and we calculate mu_{t+1}
predicted and Sigma_{t+1} predicted. And like that, we
00:50:19.496 --> 00:50:28.835
can iterate. We can cycle through that loop and integrate
all the measurements incrementally. Okay. What does
00:50:28.835 --> 00:50:38.123
that look like? If we do all the calculations analytically,
we find out that the steps are implemented like that. So
00:50:38.123 --> 00:50:47.009
in the prediction step here, we have mu_t and Sigma_t given, and
we want to calculate mu_{t+1} predicted
00:50:47.009 --> 00:50:56.490
and Sigma_{t+1} predicted. And the formulas to do
that, the equations which we use to implement it,
00:50:56.500 --> 00:51:05.980
look like that. So mu_{t+1} predicted, the new expectation
value vector, is equal to A_t, the state transition matrix,
00:51:05.980 --> 00:51:15.079
times mu_t, plus u_t, the offset. So actually, what
we do is: we take the mu_t vector, we treat it as if it
00:51:15.079 --> 00:51:23.786
were the true state of the system, and we apply this linear
dynamics, this linear transition, to this state vector
00:51:23.786 --> 00:51:34.217
mu_t, or to this expectation value vector mu_t.
For the covariance matrix, the equation looks like that. Well,
00:51:34.217 --> 00:51:45.024
how to explain it? So actually, the offset doesn't
matter for the covariance matrix: if I take a random variable
00:51:45.024 --> 00:51:54.600
and I just shift it by a known offset, then the spread
of the random variable
00:51:54.600 --> 00:52:01.534
doesn't change; only its location changes.
Therefore, this u_t doesn't occur here in this second
00:52:01.534 --> 00:52:08.774
equation. And this multiplication, A_t Sigma_t
A_t transpose, is actually something that we can
00:52:08.774 --> 00:52:18.147
interpret as applying this linear mapping, given
by A_t, to the covariance matrix. That is actually what
00:52:18.147 --> 00:52:26.827
happens here. And then what we do here: we add up the uncertainty
which comes in from the transition noise, from the
00:52:26.827 --> 00:52:37.099
imprecision in the state transition. And this is described
by this covariance matrix Q_t. So we add this up as an
00:52:37.099 --> 00:52:45.774
additional source of uncertainty to our knowledge.
Okay. That is the prediction step. The innovation
00:52:45.774 --> 00:52:54.209
step looks a little bit more difficult. There we start
with these predicted values and with a measurement, and we
00:52:54.209 --> 00:53:03.932
want to get mu_t and Sigma_t. So here
the calculation looks a little bit more difficult. So,
00:53:03.932 --> 00:53:12.919
typically, one intermediate matrix is calculated, called the
Kalman gain. This is this K variable here.
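In the scalar special case mentioned later in the lecture (one state variable, one measurement, H = 1), the innovation step collapses to a few lines, and the gain is visibly a ratio between the predicted-state variance and the total variance. This sketch uses that 1-D simplification, not the general matrix form:

```python
def scalar_innovation(mu_pred, var_pred, z, r):
    """Innovation step for a 1-D state with H = 1.
    k is the Kalman gain: a weighting factor between 0 and 1."""
    k = var_pred / (var_pred + r)        # ratio of the two uncertainties
    mu = mu_pred + k * (z - mu_pred)     # shift prediction toward measurement
    var = (1.0 - k) * var_pred           # posterior uncertainty shrinks
    return mu, var
```

With var_pred equal to r, the two sources are equally trusted, k = 0.5, and the estimate lands halfway between prediction and measurement; a very reliable measurement (small r) drives k toward 1.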
00:53:12.929 --> 00:53:21.851
The interpretation of these formulas is actually a
little bit more difficult, and it is not that easy. But what I
00:53:21.851 --> 00:53:31.622
can say is the following: for the mu_t value, what do we do?
Well, we compare the measurement that we have made with H_t
00:53:31.622 --> 00:53:40.604
times mu_t predicted. This mu_t predicted is the most probable
predicted state in which we are; multiplying it with H_t
00:53:40.604 --> 00:53:49.973
means that we calculate here, in this part, the most probable
observation that we expect to be faced with. So we compare
00:53:49.973 --> 00:54:00.330
the true observation that we made with the observation that
we expect to make. And so this term in brackets, so to say
00:54:00.330 --> 00:54:11.897
tells us something about how well our expectation fits the
measurement that we made. And based on that, our predicted
00:54:11.897 --> 00:54:20.673
state, or this expectation value for the
predicted state, is modified a little bit. And this K
00:54:20.673 --> 00:54:28.930
matrix can be interpreted as a kind of weighting factor
that somehow weighs the uncertainty in the measurement
00:54:28.930 --> 00:54:37.923
and the uncertainty in the knowledge of the present state
against each other. So this is this term here,
00:54:37.923 --> 00:54:47.307
where we can see that this uncertainty in
the measurement enters this equation in its inverse form, to
00:54:47.307 --> 00:54:57.340
the power minus one, and the uncertainty of our knowledge of
the present state enters here, so to say, in
00:54:57.340 --> 00:55:07.884
the non-inverted way. So it is a kind of ratio that is
calculated here between how sure we
00:55:07.884 --> 00:55:19.812
are about the state, from our calculations so far,
and how reliable the measurement is. Yeah, so that is
00:55:19.812 --> 00:55:31.922
actually this part. And the update of the uncertainty matrix
Sigma_t goes on like that here. I never found any
00:55:31.922 --> 00:55:43.676
intuitive idea to explain that. For simple cases,
if the state vector just contains one variable, and if the
00:55:43.676 --> 00:55:53.074
measurement just contains one variable, the measurement
vector, then try it out: things simplify, and it becomes
00:55:53.074 --> 00:56:03.478
clearer what these formulas do. For this general case of matrix
multiplications, at least, I didn't find any intuitive
00:56:03.478 --> 00:56:11.767
explanation for the last equation. Okay. So, that is the
prediction and the innovation step. Once we implement them,
00:56:11.767 --> 00:56:20.333
we have implemented something that has a name:
it is called a Kalman filter, going back to a
00:56:20.333 --> 00:56:27.980
researcher named Kalman. And this Kalman
filter is quite popular and used in many applications.
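Putting the two steps together, here is a minimal Kalman filter sketch in numpy. The notation follows the lecture (A, u, Q for the transition model; H, R for the measurement model); treat it as an illustration of the equations, not production code.

```python
import numpy as np

class KalmanFilter:
    def __init__(self, A, u, Q, H, R):
        self.A, self.u, self.Q, self.H, self.R = A, u, Q, H, R

    def predict(self, mu, Sigma):
        # mu_pred = A mu + u ;  Sigma_pred = A Sigma A^T + Q
        mu_pred = self.A @ mu + self.u
        Sigma_pred = self.A @ Sigma @ self.A.T + self.Q
        return mu_pred, Sigma_pred

    def innovate(self, mu_pred, Sigma_pred, z):
        # Kalman gain: weighs prediction uncertainty against measurement noise
        S = self.H @ Sigma_pred @ self.H.T + self.R
        K = Sigma_pred @ self.H.T @ np.linalg.inv(S)
        mu = mu_pred + K @ (z - self.H @ mu_pred)
        Sigma = (np.eye(len(mu_pred)) - K @ self.H) @ Sigma_pred
        return mu, Sigma
```

With a very uncertain prior (huge Sigma), one innovation step essentially snaps the measured state component onto the measurement, which is a quick way to convince yourself the gain behaves as the ratio argument suggests.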
00:56:27.989 --> 00:56:38.728
Yeah. So let us apply
it to this example of observing a car that is driving at
00:56:38.728 --> 00:56:46.921
constant velocity. We assume a frame rate of one frame per second. That
means that the time interval between two points in time is
00:56:46.921 --> 00:56:57.774
assumed to be one second. So now, based on what we have
derived so far, we can say that this matrix A has the shape
00:56:57.774 --> 00:57:08.900
one, one, zero, one; the offset u is zero, zero; and the
matrix H is one, zero. That is what we already derived.
00:57:08.900 --> 00:57:17.706
Now, of course, we need these covariance
matrices to describe the uncertainty of the state
00:57:17.706 --> 00:57:26.654
transitions, Q, and the uncertainties of the measurements, R.
And of course, it is always a little bit of a tricky story
00:57:26.654 --> 00:57:36.826
how to get them. I personally prefer to make some rough
guesses, to simplify things, and not to run into too
00:57:36.826 --> 00:57:47.347
much trouble: choose those matrices normally as diagonal
matrices. Actually, these matrices must be
00:57:47.347 --> 00:57:57.820
symmetric, positive definite matrices. So you can put
values different from zero into the non-diagonal elements,
00:57:57.820 --> 00:58:09.580
but then managing to get a positive definite matrix is tricky.
So I would suggest, if there is no reason to do
00:58:09.580 --> 00:58:18.789
it differently, to do it like that: choose the non-
diagonal elements as zero, and for the diagonal
00:58:18.789 --> 00:58:26.985
elements, you have to find reasonable values. For Q, of
course, this models the imprecision in the state transition;
00:58:26.985 --> 00:58:35.592
each of the diagonal elements refers to the variance, to the
spread, to the uncertainty, for one state variable. So
00:58:35.592 --> 00:58:42.718
in this case, this number here refers to the uncertainty
in the position of the vehicle. And this one refers to the
00:58:42.718 --> 00:58:49.842
uncertainty in the velocity of the vehicle. Now think a little
bit about what typical values we expect. So how much
00:58:49.842 --> 00:58:58.739
imprecision do we expect to observe from a car when it is
driving for one second? How many meters of error do you
00:58:58.739 --> 00:59:09.595
expect? And if you say: well, I expect maybe errors of up to,
say, ten centimeters, or zero point one meters. If that is your
00:59:09.595 --> 00:59:19.231
guess, then there is a rule of thumb that says: take
this value, divided by two, three, or four; it doesn't matter so
00:59:19.231 --> 00:59:27.220
much. Now, I prefer to divide by two, and then take the
square of it. And that yields a reasonable number. So here,
00:59:27.220 --> 00:59:34.884
in this case, I was assuming an error of one meter, so a
maximum error in the position of one meter. So I took one
00:59:34.884 --> 00:59:42.294
meter, divided it by two, so I get a half, and
then I took the square of it. So therefore: zero point five
00:59:42.294 --> 00:59:50.246
squared here. And for the velocity, calculating velocities in
meters per second, I thought: okay, maybe the error could be
00:59:50.246 --> 00:59:59.327
up to zero point two meters per second within one second,
using some assumptions about which typical accelerations and
00:59:59.327 --> 01:00:09.307
acceleration values occur. And okay, then I get zero point two
divided by two, which yields zero point one squared. And I was
01:00:09.307 --> 01:00:18.379
choosing that as the second entry in this matrix. And for the
measurement noise, the same story. So we have one variable,
01:00:18.379 --> 01:00:27.248
one measurement variable. So the matrix R becomes
a one-by-one matrix. That is very simple. If we had two
01:00:27.248 --> 01:00:34.904
independent measurements, say, if you were able to measure
the position and the velocity of the car directly, because
01:00:34.904 --> 01:00:43.068
we have another kind of sensor, then we would have
a two-by-two matrix. If we had three independent
01:00:43.068 --> 01:00:49.466
measurements, because we have some very strange, whatever, sensor
that is also able to measure accelerations, then we would
01:00:49.466 --> 01:00:57.233
have a three-by-three matrix. It must again be positive definite
and symmetric; again, the same trick: choose all non-
01:00:57.233 --> 01:01:04.828
diagonal elements to be zero, at least as a first guess,
and choose appropriate values for the diagonal
01:01:04.828 --> 01:01:12.875
elements. Here I would say: a stereo camera
is not that accurate in measuring things. So I was
01:01:12.875 --> 01:01:21.844
saying: okay, let us say the measurement accuracy would be
something like four meters, which would really be a very
01:01:21.844 --> 01:01:33.087
large error. So then take half of it and take the square of
it, and then you end up with two squared here. Then later
01:01:33.087 --> 01:01:42.139
on, when you apply this Kalman filter and you test the system
and you observe what is going on in the Kalman filter.
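The rule of thumb just described (guess the maximum error, divide by two, square it) is easy to put into a small helper; the error guesses below are exactly the lecture's example values.

```python
import numpy as np

def variance_from_max_error(max_err, divisor=2.0):
    """Rule of thumb from the lecture: divide the expected maximum error
    by 2 (or 3, or 4) and square the result to get a variance."""
    return (max_err / divisor) ** 2

# Transition noise Q: up to ~1 m position error and ~0.2 m/s velocity
# error per one-second step (the lecture's guesses).
Q = np.diag([variance_from_max_error(1.0),    # 0.5 squared
             variance_from_max_error(0.2)])   # 0.1 squared

# Measurement noise R: a stereo camera with up to ~4 m position error.
R = np.array([[variance_from_max_error(4.0)]])  # 2 squared
```

Both matrices are diagonal with positive entries and therefore trivially symmetric and positive definite, as required.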
01:01:42.150 --> 01:01:48.508
Please vary these numbers and check whether things
become better, whether the performance of the system becomes better,
01:01:48.508 --> 01:01:55.574
with other values, because sometimes you are wrong in your
first guess what could be reasonable values, then try out
01:01:55.574 --> 01:02:03.449
what happens if you change the values, and whether the system
behavior improves. Okay, so that is the thing. Now we
01:02:03.449 --> 01:02:11.840
finally need the initial guess for the state and the initial
guess for the covariance matrix, to express how sure we
01:02:11.840 --> 01:02:19.851
are. And here in this case, maybe we do not know anything.
We don't have any prior knowledge about where the car could
01:02:19.851 --> 01:02:28.770
be and how fast it could be. If that happens, then choose just
an arbitrary number, for instance zero, as the initial
01:02:28.770 --> 01:02:37.245
expectation value, and choose a covariance matrix with very
large entries on the diagonal. This expresses: well, it could
01:02:37.245 --> 01:02:49.419
be zero, zero, the initial state, but I'm very, very unsure about
it. And so you don't disturb the Kalman
01:02:49.419 --> 01:02:57.283
filter by providing some bad initial guess. Of course, if
you know something, if you know that the initial situation
01:02:57.283 --> 01:03:05.479
always starts with zero velocity, because you know that,
whatever, you track a car that is standing still at the
01:03:05.479 --> 01:03:13.797
beginning. Then, of course, you can choose other values here.
And then you can also express how sure you are about this
01:03:13.797 --> 01:03:23.204
initial situation. But here, for this example, let us say we
prefer to express that we don't know anything
01:03:23.204 --> 01:03:32.397
in the beginning. Now, let us run the Kalman filter and see
what the output is. So here I am plotting the result
01:03:32.397 --> 01:03:41.226
after each innovation step. In red, we see
the measured position. The horizontal axis here is the
01:03:41.226 --> 01:03:50.257
time, the point in time. The vertical axis is the position
of the vehicle. The red cross is the measurement. The
01:03:50.257 --> 01:04:01.012
blue cross is the estimated position, the new mu
value that we calculate in the Kalman filter. And you see,
01:04:01.012 --> 01:04:09.642
yeah, it fits somehow. It is not exactly the same as
the measurement; there is noise, which is filtered out by
01:04:09.642 --> 01:04:17.374
the Kalman filter. And the vertical bars indicate
the uncertainty that exists. So they express somehow
01:04:17.374 --> 01:04:25.400
the shape of the covariance matrix: the larger this bar is, the
more uncertain the Kalman filter is about the real position.
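The shrinking error bars can be reproduced in a self-contained loop. Note that the covariance recursion does not depend on the actual measurement values, so the sequence of bar lengths below is deterministic; the simulated 1 m/s track and its sensor noise are made up to mimic the plot.

```python
import numpy as np

# Model matrices as chosen in the lecture (dt = 1 s).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = np.diag([0.5**2, 0.1**2])
R = np.array([[2.0**2]])

mu = np.zeros(2)
Sigma = np.diag([1e6, 1e6])          # "we don't know anything" prior

rng = np.random.default_rng(42)
position_stds = []
for t in range(20):
    # innovation step: integrate the measurement z
    z = np.array([1.0 * t + rng.normal(0.0, 2.0)])   # car driving at 1 m/s
    S = H @ Sigma @ H.T + R
    K = Sigma @ H.T @ np.linalg.inv(S)
    mu = mu + K @ (z - H @ mu)
    Sigma = (np.eye(2) - K @ H) @ Sigma
    position_stds.append(float(np.sqrt(Sigma[0, 0])))
    # prediction step for the next point in time
    mu = A @ mu
    Sigma = A @ Sigma @ A.T + Q

# The bars shrink but level off at a nonzero value, because Q is added
# back in every prediction step.
```
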
01:04:25.420 --> 01:04:33.522
So this means the true position might be somewhere in this
interval; we do not exactly know where it is, but the most
01:04:33.522 --> 01:04:41.196
probable position is at the blue cross here. So for the estimated
position, this looks very much like just smoothing, so
01:04:41.196 --> 01:04:49.158
to say, of the measurements that we get. For the velocity, it may
be a little bit more interesting. Of course, we don't have
01:04:49.158 --> 01:04:56.335
velocity measurements, but we can have a look at what the
Kalman filter provides as output. For the first measurement,
01:04:56.335 --> 01:05:03.512
we can see the output is zero, which is not really surprising,
because if you make just one measurement, one position
01:05:03.512 --> 01:05:11.238
measurement of a vehicle, we cannot conclude anything about
its velocity. So therefore, there is no reason to
01:05:11.238 --> 01:05:21.077
change the value of zero to any other value. But we also see
that this error bar here is rather large. So it is
01:05:21.077 --> 01:05:29.256
still very unsure what the real velocity of the car is because,
of course, the information was not sufficient to make a
01:05:29.256 --> 01:05:38.151
good guess. At the second point in time,
when the second measurement was integrated, we see that the
01:05:38.151 --> 01:05:46.721
method was already able to estimate that there is some non
zero velocity, and the uncertainty is decreasing, and the
01:05:46.721 --> 01:05:54.862
more measurements we get, the smaller the uncertainty becomes,
and the more stable also the estimate becomes, and the
01:05:54.862 --> 01:06:03.528
more sure we are that the true velocity is something like
one meter per second. However, this uncertainty will never
01:06:03.528 --> 01:06:13.033
become zero; it does not converge to zero. Why? Because
in the state transition step, in the prediction step, we
01:06:13.033 --> 01:06:23.945
always add some uncertainty to the estimate. And
therefore this converges, but it does not become
01:06:23.945 --> 01:06:34.684
zero. So this behavior is not converging
to zero. Okay. Here is a slide that we already
01:06:34.684 --> 01:06:44.719
have. Here is another example: the same story, the same
modeling. But here I was mimicking that the car is
01:06:44.719 --> 01:06:51.908
moving forward. And at a certain point in time, it is
immediately moving backward. So that is physically impossible
01:06:51.908 --> 01:07:00.098
here; it is more like observing a marble or a ball
that is bouncing against a wall and then bouncing back,
01:07:00.098 --> 01:07:08.661
something like that, you might imagine. Of course, this
is a behavior that is not modeled in our state transition
01:07:08.661 --> 01:07:16.724
model. In the state transition modeling, we assume that the
object is moving straight forward all the time. But here
01:07:16.724 --> 01:07:24.171
something happens which contradicts the state transition
model. So what happens? In the beginning, in this period of
01:07:24.171 --> 01:07:31.931
time, the same story happens as we have seen before: the
Kalman filter becomes more and more sure about the real
01:07:31.931 --> 01:07:40.182
velocity of one meter per second. And so it becomes better
and better. But at that point in time when this velocity is
01:07:40.182 --> 01:07:46.068
changing, we see that the position
measurements are decreasing again, but the Kalman filter still
01:07:46.068 --> 01:07:54.703
says, "I am sure that the object
is moving forward with one meter per second." So it still
01:07:54.703 --> 01:08:02.426
expects that the object is moving forward with one meter
per second, and therefore the
01:08:02.426 --> 01:08:10.638
estimated positions where the object will be, deviate very
much from the measurements. At this point in time, the
01:08:10.638 --> 01:08:18.851
assumption of constant velocity is violated, and the Kalman
filter has really big problems to follow this change, to
01:08:18.851 --> 01:08:27.280
adapt to this change. And it takes a long time until the
Kalman filter learns or adapts to the new situation that the
01:08:27.280 --> 01:08:36.351
velocity is not plus one meter per second anymore, but minus
one meter per second. We see, when we observe the velocity,
01:08:36.351 --> 01:08:45.687
that at the end here, after thirty or forty points in
time, it was able to adapt to the new velocity. But we see
01:08:45.687 --> 01:08:53.697
that it took a long time. And what may be surprising is
that when we are in this situation here, where obviously the
01:08:53.697 --> 01:09:01.705
measurements and the state of the system that is estimated
by the Kalman filter do not fit together anymore,
01:09:01.705 --> 01:09:09.627
the uncertainties are still not increasing, but decreasing.
That is a little bit surprising. That comes from the fact
01:09:09.627 --> 01:09:19.524
that we modeled the system as a hidden Markov model. And
from that it follows that the Kalman filter will not react to such
01:09:19.524 --> 01:09:28.227
a violation of the basic assumption of
constant velocity by increasing the uncertainty, but it still
01:09:28.227 --> 01:09:37.283
will decrease the uncertainty. So it will
not increase the uncertainty, but it will
01:09:37.283 --> 01:09:45.173
decrease the uncertainty of the estimate.
Yeah, so that is a little bit counterintuitive, but this
01:09:45.173 --> 01:09:54.788
comes from the simplifications which we made with the hidden
Markov model, and from the fact that we did not consider
01:09:54.788 --> 01:10:05.205
violations of the assumptions that we made.
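The fact that the uncertainty converges, but not to zero, can be reproduced with a small numerical sketch. The model and all numbers here are assumptions for illustration, not the lecture's example: a one-dimensional random-walk state with process-noise variance q and measurement-noise variance r.

```python
# Variance recursion of a 1D Kalman filter (toy values, assumed for illustration).
# Model: s_{t+1} = s_t + process noise (variance q), observation = s_t + noise (variance r).
q = 0.01   # process-noise variance, added in every prediction step
r = 0.25   # measurement-noise variance

sigma2 = 1.0            # initial state variance
history = []
for _ in range(50):
    sigma2 = sigma2 + q                   # prediction step: uncertainty grows
    k = sigma2 / (sigma2 + r)             # Kalman gain
    sigma2 = (1.0 - k) * sigma2           # innovation step: uncertainty shrinks
    history.append(sigma2)

print(history[-1] > 0.0)                        # True: the variance never reaches zero
print(abs(history[-1] - history[-2]) < 1e-6)    # True: but it does converge
```

Because every prediction step adds q, the recursion settles at the positive fixed point of sigma2 = (sigma2 + q) * r / (sigma2 + q + r) instead of at zero.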
Now we have two methods with which we can track objects like
01:10:05.205 --> 01:10:12.920
cars, and estimate velocities of objects. We started with
linear regression in the beginning of this chapter. And
01:10:12.920 --> 01:10:20.899
now we have seen the Kalman filter. You might ask, how do
these two methods compare with each other? Both can
01:10:20.899 --> 01:10:30.778
be used. Both are used in practice. What are the advantages
and disadvantages of both? So the first thing is, if we
01:10:30.778 --> 01:10:39.463
consider the models on which those methods are
based, then, of course, in both cases, we assume some linear
01:10:39.463 --> 01:10:46.996
system behavior, linear dependencies and Gaussian noise.
That applies to both models. For regression I didn't make it
01:10:46.996 --> 01:10:54.962
explicit, but implicitly Gaussian noise is assumed. And
for the Kalman filter as well. But there is a small difference,
01:10:54.962 --> 01:11:02.004
and the difference is that in the Kalman filter, we assume
that the state changes over time, at least slightly, while
01:11:02.004 --> 01:11:08.426
in the regression model, we assume that the state does
not change over time; only the observation depends
01:11:08.426 --> 01:11:17.586
linearly on the state. What plays the role of the state vector in the
Kalman filter is the vector of regression coefficients in
01:11:17.586 --> 01:11:27.183
the regression case. Then, which stochastic
independence assumptions did we make? Well, in the Kalman
01:11:27.183 --> 01:11:35.515
filter, we made the assumption of Markovian independence.
That means the subsequent state depends only on the
01:11:35.515 --> 01:11:43.854
present state and not on the past; and the present
observation only depends on the present state and not on the
01:11:43.854 --> 01:11:51.491
past. In the linear regression case, we also make an assumption,
the so-called i.i.d. assumption: identically and independently
01:11:51.491 --> 01:11:58.836
distributed measurements. Now we assume also that the
measurements are independent of each other. That is very similar
01:11:58.836 --> 01:12:06.894
to the Markov assumption, and that they follow
the same distribution, that they are all distributed with
01:12:06.894 --> 01:12:15.844
respect to the same Gaussian. So
how do we calculate it? We have seen that in the Kalman filter,
01:12:15.844 --> 01:12:21.693
we do an incremental calculation: with each measurement that
we get, we calculate a new prediction and innovation
01:12:21.693 --> 01:12:30.322
step, based on the result that we have calculated so far. In
the regression case, it is not that easy; we have to
01:12:30.322 --> 01:12:39.004
more or less do a repeated calculation. So for each measurement
that we want to add to our calculation, we have to
01:12:39.004 --> 01:12:46.299
calculate the full regression again. That is what is
meant by repeated calculation. Of course, if we are
01:12:46.299 --> 01:12:55.008
clever, we can save a little bit of time and so on, by storing
some intermediate results. But in general, we would say
01:12:55.008 --> 01:13:03.140
it is a repeated calculation, while the Kalman filter does an
incremental calculation. So what do we have to store
01:13:03.140 --> 01:13:11.538
when we want to calculate these filters in an incremental or
repeated way? In the Kalman filter, we only have to store
01:13:11.538 --> 01:13:18.501
this mu value, this expected state value, and the
covariance matrix, nothing else. You can forget everything
01:13:18.501 --> 01:13:26.223
else. Once we know these two things, once we know the
present Gaussian that describes the state distribution, that is
01:13:26.223 --> 01:13:33.230
sufficient to do all future calculations. For linear
regression, we need to memorize all the measurements that
01:13:33.230 --> 01:13:40.992
we made so far and that we want to incorporate in future
calculations. So the memory requirements are a little
01:13:40.992 --> 01:13:50.580
bit different. Then we can ask how much influence the
measurements have. So assume we made one hundred measurements,
01:13:50.580 --> 01:14:01.281
and now we calculate the present state with
a Kalman filter, or we do a regression approach. We might
01:14:01.281 --> 01:14:10.899
ask how much influence does each measurement have on the
result. And then we find for the linear regression model. All
01:14:10.899 --> 01:14:18.460
the measurements have the same influence. It doesn't matter
whether they were sensed just now or ten seconds before,
01:14:18.460 --> 01:14:25.745
doesn't matter. All have the same influence. All the measurements
which are used in the linear regression, in a standard
01:14:25.745 --> 01:14:33.270
linear regression, let us say, all have the same influence on
the result. In the Kalman filter, it is not like that.
01:14:33.279 --> 01:14:43.142
But we could argue that somehow the influence decreases
over time. The last measurement has the strongest influence on
01:14:43.142 --> 01:14:51.636
the state estimation, while the oldest measurement has the
smallest influence. And roughly speaking, the influence
01:14:51.636 --> 01:15:02.241
decreases exponentially over time. Okay.
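This difference in influence can be made visible with a small experiment. Everything below is an assumed toy setup, not the lecture's example (a scalar random-walk model with noise values q and r): since the filter is linear, its final estimate is a weighted sum of the measurements, so the weight of measurement i can be read off by feeding in a unit impulse at position i.

```python
n = 20
q, r = 0.05, 1.0    # assumed process- and measurement-noise variances

def kalman_final_estimate(measurements):
    """Run a scalar Kalman filter (random-walk model) and return the final mean."""
    mu, sigma2 = 0.0, 1e6          # vague prior
    for z in measurements:
        sigma2 += q                # prediction step
        k = sigma2 / (sigma2 + r)  # Kalman gain
        mu += k * (z - mu)         # innovation step
        sigma2 *= (1.0 - k)
    return mu

# Weight of measurement i = final estimate when measurement i is 1 and the rest are 0.
kf_weights = [kalman_final_estimate([1.0 if j == i else 0.0 for j in range(n)])
              for i in range(n)]

# In a standard linear regression of a constant state, every measurement has weight 1/n.
ls_weights = [1.0 / n] * n

print(kf_weights[-1] > kf_weights[0])   # True: the newest measurement counts most
```

The Kalman weights shrink roughly geometrically for older measurements, while the regression weights are all equal, which is exactly the contrast described above.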
So, what else? What do we need to know and to specify
01:15:02.241 --> 01:15:11.089
if we want to apply it, especially concerning the uncertainty,
the noise? These matrices Q and R, we have to know them
01:15:11.089 --> 01:15:19.599
for implementing a Kalman filter. So we have to know the
amount of measurement noise, the matrix R. And this must
01:15:19.599 --> 01:15:28.352
be known. A big advantage of linear regression is that
we don't need to provide it; we don't need to provide
01:15:28.352 --> 01:15:37.931
how certain the measurements are. We only need to know that
the uncertainty, this covariance matrix, is the same for all
01:15:37.931 --> 01:15:47.571
measurements, but we do not need to specify it. Okay. And
then the variances that we estimate, that means what we have
01:15:47.571 --> 01:15:56.968
just seen. We have seen, more or less, in the Kalman filter,
if we analyze the measures of
01:15:56.968 --> 01:16:05.632
uncertainty, they decrease over time. That is a
rough idea. While in a linear regression, if you do a
01:16:05.632 --> 01:16:13.421
repeated linear regression and add some measurements,
then the variances that we can derive for the regression
01:16:13.421 --> 01:16:20.737
might also increase. So the regression does not have this
unintuitive behavior that if the measurements deviate very
01:16:20.737 --> 01:16:29.330
much from the expected measurements, the
uncertainty still decreases; that does not apply to linear
01:16:29.330 --> 01:16:36.736
regression. So in this way, linear regression is more intuitive
in that respect. Both techniques
01:16:36.736 --> 01:16:45.151
are possible. If you are faced with a problem, try both, I
would say. Sometimes the Kalman filter is easier
01:16:45.151 --> 01:16:53.208
or better to use; sometimes a regression approach is
easier or better to use. So try both, and do
01:16:53.208 --> 01:17:01.181
not choose one just because "Kalman filter" sounds
fancy. Regression
01:17:01.181 --> 01:17:10.633
might be more beneficial in some situations, but sometimes
the Kalman filter is also better than regression.
01:17:10.633 --> 01:17:22.300
Okay. So now, so far we have limited ourselves to
linear systems, linear Gaussian systems. The question is,
01:17:22.300 --> 01:17:30.286
what happens if the system is not linear anymore? And this
easily happens in practice. It might happen that either
01:17:30.286 --> 01:17:37.619
the measurement depends in a nonlinear way on the
state, or the state transition is nonlinear. That means we
01:17:37.619 --> 01:17:47.609
have to generalize things and say, "Okay, now S t plus one is
not a linear function of S t, but just some
01:17:47.609 --> 01:17:57.609
function." Say there is a function f, maybe a nonlinear
function, that models how S t plus one depends on S t.
01:17:57.630 --> 01:18:05.365
And let h be a function that maps the state onto the
observation, of course plus some additive Gaussian noise, which
01:18:05.365 --> 01:18:14.817
is not mentioned here on this slide, but which we still
assume. Of course, that is not
01:18:14.817 --> 01:18:23.590
what we want. So what can we do in such a case? Well, the
Kalman filter doesn't work, because the
01:18:23.590 --> 01:18:31.211
Kalman filter assumes that we have linear relationships, so it
cannot be used. But there are other techniques, extensions. A
01:18:31.211 --> 01:18:38.079
first one is the so-called extended Kalman filter. That is the simpler
version. And then there is the unscented Kalman filter, which is also
01:18:38.079 --> 01:18:45.661
a kind of extension of the Kalman filter for systems
which are a little bit more difficult to deal with. And a
01:18:45.661 --> 01:18:53.441
very general solution is the so-called particle filter. We
will have a look at that later on. Okay, let us start with
01:18:53.441 --> 01:18:59.980
the extended Kalman filter. The basic idea of the extended
Kalman filter is very easy, and it is what you already
01:18:59.980 --> 01:19:07.147
know from other lectures, also from this lecture. If
you deal with a nonlinear system and you want to use a
01:19:07.147 --> 01:19:14.058
linear technique on it, what do you do? Well, you just linearize
the system locally, around the point of interest. That
01:19:14.058 --> 01:19:21.298
is actually the basic idea of extended Kalman filtering.
If you understood that, you understood what the extended Kalman
01:19:21.298 --> 01:19:31.244
filter is: it is just using a linearization of the
system around the present point of interest. So if you do
01:19:31.244 --> 01:19:38.668
that, we can derive the prediction
and innovation steps of the extended Kalman filter like
01:19:38.668 --> 01:19:49.734
that. The update of the expected state value, mu
t plus one predicted, can be done by just applying the state
01:19:49.734 --> 01:19:58.656
transition function f to this mu vector. This
mu vector has the same shape as a state and can be interpreted
01:19:58.656 --> 01:20:07.257
as a state vector, so we can apply the state transition
function to it. So we consider all the nonlinearity
01:20:07.257 --> 01:20:17.623
which is in f. For the covariance update this is not possible;
we need something like a matrix that we can apply from the
01:20:17.623 --> 01:20:27.573
right and the left to the Sigma matrix instead. So
which matrix can be used here? Well, the matrix that we use
01:20:27.573 --> 01:20:37.646
is the Jacobian of f, of this function f, evaluated at
mu t, at this point mu t, at the present best estimate of
01:20:37.646 --> 01:20:51.425
what the present state is. So what is the Jacobian of
a function f? No one knows? The Jacobian is the matrix that
01:20:51.425 --> 01:21:03.070
contains all the partial derivatives of the function. Never
heard of the Jacobian?
01:21:03.069 --> 01:21:11.724
01:21:11.724 --> 01:21:20.378
01:21:20.378 --> 01:21:30.094
Okay, so back to English: the Jacobian is the
first-order derivative of the function f. So
01:21:30.094 --> 01:21:40.030
okay, that is the Jacobian. So we use the Jacobian here instead
of this matrix A t that we had in the
01:21:40.030 --> 01:21:50.063
Kalman filter. So here the Jacobian is playing the role
of this matrix. Now, if f is a linear function that can
01:21:50.063 --> 01:21:59.737
be represented as A times s plus u, and you calculate
the Jacobian, you get A. So that makes sense. Okay, so that
01:21:59.737 --> 01:22:08.502
is the prediction step. The innovation step looks like this.
Here again, we can apply this predicted
01:22:08.502 --> 01:22:19.509
state to the measurement function h directly, because
it looks like a state vector, so we can apply this
01:22:19.509 --> 01:22:27.544
nonlinear function. And everywhere
else where we had, in the Kalman filter innovation step,
01:22:27.544 --> 01:22:37.354
this matrix H t, we also have a matrix H t here. But now this
H t is the Jacobian of the function small h around
01:22:37.354 --> 01:22:47.103
this point, the predicted state. So we take this measurement
function h, calculate its first-order derivative at the
01:22:47.103 --> 01:22:56.586
point of interest, namely this predicted state, and enter
it into the Kalman filter innovation step formula. That is
01:22:56.586 --> 01:23:07.275
all. So that is it: we just do a linearization of the system model,
and we use the linearized model. That is the extended
01:23:07.275 --> 01:23:15.303
Kalman filter. Let me check back that we didn't forget any
slide. Okay, now, the second version is the
01:23:15.303 --> 01:23:24.165
unscented Kalman filter. The unscented Kalman filter is a little
bit different. It also uses a kind of approximation,
01:23:24.165 --> 01:23:33.399
a linear approximation, or an approximation of the nonlinear
function f, but it is a little bit different in how it works.
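To make the extended Kalman filter concrete before turning to the unscented one, here is a minimal scalar sketch. The functions f and h and all noise values below are toy assumptions, not the lecture's model; the point is only the pattern: apply f and h directly to the mean, and use their derivatives, the one-dimensional Jacobians, for the variance and the gain.

```python
import math

def f(s):  return s + 0.1 * math.sin(s)    # assumed nonlinear state transition
def df(s): return 1.0 + 0.1 * math.cos(s)  # its Jacobian (here just a derivative)

def h(s):  return s * s                    # assumed nonlinear measurement function
def dh(s): return 2.0 * s                  # its Jacobian

q, r = 0.01, 0.1                           # assumed noise variances

def ekf_step(mu, sigma2, z):
    # Prediction: nonlinear f on the mean, Jacobian of f on the variance.
    mu_pred = f(mu)
    a = df(mu)
    sigma2_pred = a * sigma2 * a + q
    # Innovation: nonlinear h on the predicted mean, Jacobian of h in the gain.
    H = dh(mu_pred)
    k = sigma2_pred * H / (H * sigma2_pred * H + r)
    mu_new = mu_pred + k * (z - h(mu_pred))
    sigma2_new = (1.0 - k * H) * sigma2_pred
    return mu_new, sigma2_new

true_s, mu, sigma2 = 2.0, 1.5, 0.5
for _ in range(30):
    true_s = f(true_s)                           # simulate the (noise-free) system
    mu, sigma2 = ekf_step(mu, sigma2, h(true_s))
print(abs(mu - true_s) < 0.05)                   # True: the estimate tracks the state
```

If f and h were linear, df and dh would be the constant matrices A and H of the ordinary Kalman filter, and the step above would reduce to it exactly.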
01:23:33.409 --> 01:23:43.945
So assume, in this case, we have a two
dimensional state space; this is x and this is v for
01:23:43.945 --> 01:23:52.373
our motion model case. Then a Gaussian distribution
is described by, of course, the expectation value,
01:23:52.373 --> 01:24:03.342
that would be this point here, and by the covariance
matrix. And if we ask, for such a Gaussian density,
01:24:03.342 --> 01:24:15.683
okay, for which values does this Gaussian
density function yield the same value, then we end up with
01:24:15.683 --> 01:24:25.724
ellipses. So an ellipse of this kind somehow
models this Gaussian distribution. Yeah, the center
01:24:25.724 --> 01:24:35.434
position is the expectation value, and the covariance matrix is
somehow represented by the shape. The
01:24:35.434 --> 01:24:43.568
larger the diameter of this ellipse, the
larger is the uncertainty in this direction. So here, the
01:24:43.568 --> 01:24:51.369
uncertainty in this direction is smaller than the uncertainty
in this direction. Yeah, so that is actually how this can
01:24:51.369 --> 01:24:59.102
be interpreted. So now, what we do in an unscented Kalman filter
to, for instance, implement the prediction step is that we
01:24:59.102 --> 01:25:07.076
create some points, which are called Sigma points. One of the
Sigma points is this point, the center. And then there are
01:25:07.076 --> 01:25:15.822
some other points which are spread around the center, following
somehow the shape of this ellipse. This is done in a
01:25:15.822 --> 01:25:24.781
systematical way, so not in a random way. We do not
randomly sample sigma points, but we analyze the covariance
01:25:24.781 --> 01:25:33.186
matrix, and based on the matrix we generate
these points. And by construction, these points
01:25:33.186 --> 01:25:41.872
can be used to represent this Gaussian
distribution. So if we are given these points, and if we
01:25:41.872 --> 01:25:49.804
calculate their expectation value, we end up with this point.
And if we calculate their covariance matrix, we end up with the
01:25:49.804 --> 01:25:56.931
matrix which is given by this blue ellipse. That means
representing such a Gaussian distribution can be done with
01:25:56.931 --> 01:26:04.621
these so-called sigma points. A sigma point, however,
can also be interpreted as a state vector. So if it is a state
01:26:04.621 --> 01:26:11.729
vector, we can apply the state transition function f
to it, map it, and look at the result. So in
01:26:11.729 --> 01:26:19.419
this case, if we apply f or h, it doesn't matter which, onto that:
maybe this point is mapped here, and this one there.
01:26:19.430 --> 01:26:27.237
This one here, this one here, this one here. Now we again
have a set of sigma points. And now what we can do is,
01:26:27.237 --> 01:26:34.018
based on these sigma points, we can estimate a mean and covariance
matrix again. So we can estimate a Gaussian
01:26:34.018 --> 01:26:42.920
distribution based on them. So what we did now is that we have
solved this problem of the nonlinear function by first generating
01:26:42.920 --> 01:26:50.447
these sigma points as a kind of representative points
that describe this Gaussian distribution, then apply this
01:26:50.447 --> 01:26:58.962
nonlinear function to each of these sigma points. And based
on the sigma points which we get as a result, we re-estimate
01:26:58.962 --> 01:27:06.473
a Gaussian distribution. And by doing that, we can overcome
these problems of nonlinearity. So that is the
01:27:06.473 --> 01:27:14.339
basic idea of the unscented Kalman filter. Of course, the
details are a little bit more technical, but for reasons of
01:27:14.339 --> 01:27:22.582
time, I've omitted those technical details here in the lecture.
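Even with those details omitted, the sigma-point construction itself fits in a few lines. All numbers and the mapping g below are assumptions for illustration: build symmetric sigma points from the Cholesky factor of the covariance matrix, push each point through the nonlinear function, then re-estimate mean and covariance from the mapped points.

```python
import math

m = [1.0, 0.5]                  # assumed mean of the state (x, v)
C = [[0.3, 0.1],
     [0.1, 0.2]]                # assumed covariance matrix
n = 2

# Cholesky factor L of C (C = L L^T), written out for the 2x2 case.
l00 = math.sqrt(C[0][0])
l10 = C[1][0] / l00
l11 = math.sqrt(C[1][1] - l10 * l10)
cols = [(l00, l10), (0.0, l11)]

# Simple symmetric sigma-point set: m +/- sqrt(n) * (column of L), equal weights.
s = math.sqrt(n)
sigma_points = []
for a, b in cols:
    sigma_points.append((m[0] + s * a, m[1] + s * b))
    sigma_points.append((m[0] - s * a, m[1] - s * b))
w = 1.0 / (2 * n)

def g(p):                       # assumed nonlinear mapping of the state
    x, v = p
    return (x + v + 0.05 * v * v, v)

mapped = [g(p) for p in sigma_points]

# Re-estimate mean and covariance from the mapped sigma points.
mean = [sum(w * p[i] for p in mapped) for i in range(2)]
cov = [[sum(w * (p[i] - mean[i]) * (p[j] - mean[j]) for p in mapped)
        for j in range(2)] for i in range(2)]
print(mean)    # the quadratic term in g shifts the x-mean above m[0] + m[1]
```

By construction these sigma points reproduce m and C exactly, and after the mapping the re-estimated Gaussian captures the effect of the nonlinearity up to second order, which is exactly the idea sketched on the slide.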
If you are interested, just read the original papers, or
01:27:22.582 --> 01:27:31.529
read the chapter about the Kalman filter techniques in the
book on Probabilistic Robotics. There it is explained in a
01:27:31.529 --> 01:27:42.460
nice way, and you will find the details. Yeah, so that is
the matter. Okay, I think time is up for today.
01:27:42.470 --> 01:27:54.399
So let us continue next time, then, with an example that
shows how these methods can be used in practice. Yeah.