WEBVTT
00:05.560 --> 00:12.880
Okay, so, last time we finished the chapter on optical flow and image
00:12.880 --> 00:13.580
-based tracking.
00:14.320 --> 00:18.540
And today we want to continue with, well, again, tracking.
00:18.740 --> 00:22.800
But now we don't want to track objects inside of an image,
00:23.400 --> 00:26.520
representing their position in pixel coordinates.
00:26.740 --> 00:30.500
But we want to track these objects in the real world, in 3D
00:30.500 --> 00:33.320
coordinates in the real world.
00:33.600 --> 00:39.420
And we want to be able to not only say this object that we see once in
00:39.420 --> 00:44.740
one image at that position is now visible at another position in the
00:44.740 --> 00:45.640
next image.
00:45.980 --> 00:50.660
But we also want to be able to estimate their motion.
00:51.100 --> 00:57.200
We want to estimate with which velocity or acceleration objects are
00:57.200 --> 01:00.160
moving, in which direction, and things like that.
01:00.280 --> 01:05.640
So now we leave this world of two-dimensional images and we enter the
01:05.640 --> 01:08.420
area of the three-dimensional world.
01:08.880 --> 01:13.060
So everything that comes now within this chapter, and also within
01:13.060 --> 01:17.240
chapter number six, refers to 3D world coordinates.
01:17.480 --> 01:22.060
So we assume that we have detected objects in the image with a
01:22.060 --> 01:24.180
suitable detection algorithm.
01:24.440 --> 01:28.700
We have used a binocular camera system so that we can determine the
01:28.700 --> 01:31.200
position of this object in the three-dimensional world.
01:31.720 --> 01:34.840
And based on these three-dimensional coordinates we want to track
01:34.840 --> 01:35.700
these objects.
01:36.800 --> 01:39.600
Okay, literature and references for this chapter.
01:40.160 --> 01:43.600
Well, we will introduce one technique called the Kalman filter.
01:44.260 --> 01:48.040
Here is a tutorial that might be useful for you to understand this
01:48.040 --> 01:48.760
Kalman filter.
01:55.260 --> 02:00.200
This paper here is the original paper introducing a technique that is
02:00.200 --> 02:02.140
called the unscented Kalman filter.
02:03.260 --> 02:05.420
And we touch on this technique as well.
02:05.540 --> 02:10.200
So if you want to see the background of this technique, look at that
02:10.200 --> 02:10.820
paper.
02:11.400 --> 02:16.400
Then the robotics book of Thrun, Burgard, and Fox contains three chapters
02:16.400 --> 02:18.000
on these tracking techniques.
02:18.640 --> 02:20.800
So chapter number two to four.
02:20.980 --> 02:24.260
Then there's an older book, Applied Optimal Estimation.
02:24.360 --> 02:26.100
It also introduces a Kalman filter.
02:26.700 --> 02:29.180
So that's the book where I learned about the Kalman filter.
02:29.660 --> 02:32.260
If you are interested you might look at this book.
02:33.280 --> 02:42.200
And this one here is a larger book about general
02:42.200 --> 02:44.360
topics of object tracking.
02:45.060 --> 02:48.940
And it contains much more than what we introduce in the lecture.
02:49.080 --> 02:53.420
So this is further reading for those who are more interested in
02:53.420 --> 02:56.840
this topic and want to get more information about it.
02:57.240 --> 03:00.120
However, it's already 17 years old.
03:00.120 --> 03:08.380
So let's start with one of the simplest ways to do that.
03:08.960 --> 03:12.200
It's often forgotten that this is also an option.
03:12.700 --> 03:16.880
But we should discuss it because it's often very useful.
03:17.500 --> 03:20.340
Namely motion estimation of objects by regression.
03:20.800 --> 03:21.940
What do we mean with that?
03:22.260 --> 03:25.960
So we observe an object at different points in time.
03:26.200 --> 03:28.180
That means, for instance, we have a camera image.
03:28.280 --> 03:30.300
We detect the object in the camera image.
03:30.420 --> 03:32.860
We calculate its position in the three-dimensional world.
03:33.900 --> 03:38.980
So that means for each point in time at which we have an image, we
03:38.980 --> 03:40.740
also have the position of the object.
03:41.260 --> 03:44.160
And what we want is, we want to estimate the position and the velocity
03:44.160 --> 03:45.040
of the object.
03:46.560 --> 03:48.620
Okay, so simple example.
03:49.320 --> 03:56.200
We want to estimate the motion of this car or the motion profile of
03:56.200 --> 03:56.640
this car.
03:56.800 --> 04:00.800
This car is waiting in front of a red traffic light, so it's not
04:00.800 --> 04:01.580
moving at all.
04:02.200 --> 04:08.780
And within each image, we somehow segment this car and then we
04:08.780 --> 04:09.920
determine its position.
04:10.300 --> 04:16.660
And due to some imprecision in the whole image processing algorithms
04:16.660 --> 04:21.200
and noise in the camera images and so on, the position that we
04:21.200 --> 04:24.660
measure varies from image to image.
04:24.660 --> 04:28.980
So it might happen that in the first image, we measure a position here
04:28.980 --> 04:33.300
in longitudinal direction of minus 0.3 meters, then in the next one
04:33.300 --> 04:37.320
of 0.1 meters, and in the next one of minus 0.4 meters.
04:37.820 --> 04:41.860
So we assume that the car is really waiting in front of the traffic
04:41.860 --> 04:42.260
lights.
04:42.280 --> 04:43.900
That means that it is not moving.
04:44.500 --> 04:48.160
So we want to say, okay, what is the real position of the car?
04:48.260 --> 04:52.060
These are the sensed positions, which are somehow affected by
04:52.060 --> 05:00.300
imprecision and noise and disturbances in the whole sensor processing
05:00.300 --> 05:00.720
chain.
05:01.060 --> 05:05.320
So what would you do to get a best estimate of the position when you
05:05.320 --> 05:06.760
have these measurements?
05:08.200 --> 05:08.900
Very simple.
05:09.080 --> 05:10.660
Everyone would do it, I guess.
05:11.400 --> 05:12.200
What would you do?
05:13.260 --> 05:18.160
I hope everyone would do something like averaging these positions.
05:18.720 --> 05:24.280
Calculating just the average of these positions yields a
05:24.280 --> 05:26.580
position estimate, say x hat.
05:27.240 --> 05:35.180
And this x hat is the best position estimate that we can get if we
05:35.180 --> 05:39.300
assume some properties of the imprecision that occur in this
05:39.300 --> 05:42.880
measurement and image processing process.
05:43.460 --> 05:49.560
Now if we assume that this imprecision that occurs follows some
05:49.560 --> 05:54.340
statistical properties, then there is a guarantee that this estimate,
05:54.820 --> 05:59.240
this averaging of the positions, is the best thing that we can do:
05:59.240 --> 06:07.000
at least on average, the position value we get is better than any
06:07.000 --> 06:08.980
other position estimate that we could compute.
06:09.600 --> 06:17.400
We can also describe under these assumptions how certain this position
06:17.400 --> 06:17.700
is.
06:17.780 --> 06:20.880
That means how much uncertainty exists in this position.
06:21.500 --> 06:26.540
And this is typically described either by an interval, so-called
06:26.540 --> 06:31.180
confidence interval, that says, okay, with a probability of say 95
06:31.180 --> 06:35.740
percent the true position of the car is within a certain interval.
06:36.120 --> 06:37.460
That would be a confidence interval.
06:37.980 --> 06:43.820
Or another way to express this uncertainty is to provide the variance
06:43.820 --> 06:47.460
of this estimate that is provided here.
06:47.580 --> 06:49.120
It could be calculated like that.
06:49.760 --> 06:52.920
That's a measure of, say, imprecision.
06:53.020 --> 06:56.820
The larger this value is, the more imprecise is a position.
06:57.020 --> 06:59.220
That means the more uncertain it is.
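As a small illustration, this averaging step and the two uncertainty measures could be computed like this in Python (a sketch; the function name and the 1.96 constant for a normal-approximation 95% interval are illustrative additions, not from the lecture):

```python
from statistics import mean, variance
from math import sqrt

def estimate_position(measurements):
    """Best estimate of a static position from noisy measurements:
    the sample mean, plus the variance of that estimate and a rough
    95% confidence interval (assuming independent, zero-mean noise)."""
    n = len(measurements)
    x_hat = mean(measurements)             # best estimate x hat
    var_est = variance(measurements) / n   # variance of the mean
    half_width = 1.96 * sqrt(var_est)      # ~95% interval, normal approx.
    return x_hat, var_est, (x_hat - half_width, x_hat + half_width)

# The three noisy longitudinal positions from the example
x_hat, var_est, ci = estimate_position([-0.3, 0.1, -0.4])
```

With these numbers the mean is -0.2 m, which is the estimate one would report for the waiting car.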
07:00.720 --> 07:05.800
Okay, so this is a basic step, and this is actually the simplest step,
07:05.840 --> 07:12.440
and I think for this case no one would be surprised if we use this
07:12.440 --> 07:14.940
average as the best estimate.
07:15.400 --> 07:19.460
Now let's extend this task a little bit and say now we are observing
07:19.460 --> 07:23.500
this car which is moving on a highway with more or less constant
07:23.500 --> 07:24.100
velocity.
07:24.520 --> 07:29.720
So again we observe it over time along the road and we have several
07:29.720 --> 07:34.000
points in time at which we observe it and say now we get these
07:34.000 --> 07:34.420
measurements.
07:34.780 --> 07:39.200
So this is a point in time where we have taken a measurement and these
07:39.200 --> 07:41.620
are the positions in longitudinal direction.
07:42.540 --> 07:47.440
So now we would say, okay, what is the... how can we describe the
07:47.440 --> 07:52.500
movement of this vehicle, of this object, and how can we estimate its
07:52.500 --> 07:53.860
motion?
07:55.740 --> 08:00.680
Okay, the idea is actually an extension of the idea that we had
08:00.680 --> 08:01.100
before.
08:01.460 --> 08:06.800
Of course now averaging these positions does not really help because
08:06.800 --> 08:12.240
the vehicle is moving over time, so just averaging the positions would
08:12.240 --> 08:12.800
not help.
08:13.280 --> 08:16.260
But instead of that we could go another way.
08:16.340 --> 08:20.700
We could say, okay, we make an assumption about how this vehicle
08:20.700 --> 08:21.280
moves.
08:21.480 --> 08:26.160
If we say the vehicle moves with constant velocity in the same
08:26.160 --> 08:30.880
direction, then we could argue that we can describe the position of
08:30.880 --> 08:35.940
this object over time, so x depending on time, by a certain initial
08:35.940 --> 08:41.700
position plus the velocity of the vehicle times the time.
08:42.520 --> 08:48.960
So this comes from just the basic physical motion equations for the
08:48.960 --> 08:50.340
movement with constant velocity.
08:50.840 --> 08:57.500
The measurements that we get here must follow this equation plus some
08:57.500 --> 09:02.820
additional measurement noise, which occurs here in these measurements.
09:02.820 --> 09:07.840
So this is, so to say, an assumption that we might make about how this
09:07.840 --> 09:09.880
vehicle moves.
09:10.220 --> 09:13.800
Now we could enter these numbers here in this equation.
09:14.180 --> 09:17.840
So we measure the position, that would be the left-hand side of this
09:17.840 --> 09:22.240
equation, and we are given the point in time, that means this value
09:22.240 --> 09:25.420
for t is something that we could add.
09:26.660 --> 09:31.200
That means, in this case, we would get four equations, and we have
09:31.200 --> 09:37.640
in total two unknown variables: x0, the initial position, and v, the
09:37.640 --> 09:39.700
velocity of the vehicle.
09:40.300 --> 09:43.140
So four equations with two unknown variables.
09:44.640 --> 09:48.340
In the best case, this system of linear equations has a unique
09:48.340 --> 09:48.840
solution.
09:50.500 --> 09:53.020
In the worst case, it doesn't have any solution.
09:53.700 --> 09:57.180
And that is what happens in practice: it doesn't have a solution.
09:57.700 --> 09:58.040
Why?
09:58.240 --> 10:00.280
Because there is this unknown noise term.
10:01.420 --> 10:05.980
So it's not exact, an exact equality here, but it's only an
10:05.980 --> 10:07.640
approximate equality.
10:08.380 --> 10:10.460
So we have to deal with this noise.
10:10.620 --> 10:14.700
So we could say, there is an additional noise term in each equation.
10:15.280 --> 10:20.840
And now we could argue for which values of x0 and v do these
10:20.840 --> 10:24.020
additional noise terms become as small as possible.
10:26.700 --> 10:32.720
So that means what we would do is, we would isolate the noise term
10:32.720 --> 10:41.100
here by, for instance, subtracting this term x0 plus v times t from
10:41.100 --> 10:42.720
both sides of the equation.
10:42.960 --> 10:49.740
Then we get x of t minus x0 minus v times t is equal to the noise
10:49.740 --> 10:50.040
term.
10:50.700 --> 10:55.460
So now we want to minimize the noise in all equations in parallel.
10:55.760 --> 11:02.180
This can be done by minimizing the sum of the squared noise terms.
11:02.560 --> 11:05.800
And then we get a minimization problem that looks like that.
11:06.180 --> 11:15.880
So we want to minimize these terms here, which are equal to the noise
11:15.880 --> 11:18.600
term in each of these equations that we get.
11:19.360 --> 11:25.460
So we want to minimize the sum of the squared noise terms over all the
11:25.460 --> 11:27.480
equations that we get from these measurements.
11:28.080 --> 11:31.780
And we want to minimize it with respect to x0 and v.
11:32.480 --> 11:37.260
Now this is a function that should be minimized.
11:38.960 --> 11:42.680
To minimize this function, we can go on as we have learned it in the
11:42.680 --> 11:43.300
math class.
11:43.460 --> 11:47.480
Namely, we take the derivatives of this term here with respect to the
11:47.480 --> 11:52.560
unknown variables x0 and v and then set these partial derivatives to zero.
11:53.040 --> 11:57.000
By doing that, we get a system of linear equations with two equations
11:57.000 --> 11:59.840
and two unknown variables x0 and v.
12:01.400 --> 12:06.000
So this yields a method that is called linear regression.
12:06.780 --> 12:13.780
So the resulting system of linear equations looks like that.
12:14.480 --> 12:19.140
So on the left hand side we have this matrix that contains just the
12:19.140 --> 12:22.080
points in time at which the measurements were taken.
12:22.360 --> 12:25.700
And we have this vector here on the right hand side which contains the
12:25.700 --> 12:30.580
positions at which the object has been detected and the products of
12:30.580 --> 12:31.460
times and positions.
12:31.940 --> 12:34.400
And here we have the vector of unknown variables.
12:34.960 --> 12:40.960
Now if this matrix is invertible, so if it has rank two, full rank,
12:41.380 --> 12:46.480
then we can get a unique solution of this system of linear equations.
12:46.860 --> 12:52.220
And this matrix has rank two as soon as we have at least two
12:52.220 --> 12:54.880
measurements taken at different points in time.
12:56.180 --> 12:57.800
And that's a good message.
12:57.960 --> 13:01.840
So as soon as we have two such measurements, we can calculate
13:01.840 --> 13:05.820
these terms, the initial position and the velocity of the vehicle.
13:06.460 --> 13:10.360
And based on that, we can also make predictions of where the vehicle
13:10.360 --> 13:12.620
will be at a certain point in the future.
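To make this concrete, here is a small sketch of the constant-velocity regression in Python, solving the 2x2 normal equations directly (the example numbers are invented for illustration):

```python
def fit_constant_velocity(times, positions):
    """Least-squares fit of x(t) = x0 + v*t to noisy position
    measurements. Solves the 2x2 system obtained by setting the partial
    derivatives of the squared-error sum with respect to x0 and v to zero."""
    n = len(times)
    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    s_x = sum(positions)
    s_tx = sum(t * x for t, x in zip(times, positions))
    # Normal equations:
    #   [ n    s_t  ] [x0]   [ s_x  ]
    #   [ s_t  s_tt ] [v ] = [ s_tx ]
    det = n * s_tt - s_t * s_t  # nonzero iff at least two distinct times
    x0 = (s_x * s_tt - s_tx * s_t) / det
    v = (n * s_tx - s_t * s_x) / det
    return x0, v

# A car moving at roughly 2 m/s from about x0 = 1 m, with some noise
x0, v = fit_constant_velocity([0, 1, 2, 3], [1.0, 3.1, 4.9, 7.0])
# Prediction for a future time t: x0 + v * t
```

For these numbers the fit gives x0 = 1.03 m and v = 1.98 m/s, and a prediction at any future time is just x0 + v*t.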
13:15.120 --> 13:20.560
Okay, so if you are interested in the uncertainty, so up to now we
13:20.560 --> 13:24.780
were only providing this system of linear equations from which we get
13:24.780 --> 13:27.680
the best estimate of x0 and v.
13:28.900 --> 13:38.420
Under some assumptions about the noise in these measurement processes,
13:38.880 --> 13:42.620
what we get here is the best estimate of x0 and v.
13:43.120 --> 13:45.960
However, we might also be interested in the uncertainty.
13:46.400 --> 13:53.780
So the best estimate for the velocity might be something like say 30
13:53.780 --> 13:54.840
kilometers per hour.
13:54.980 --> 13:59.160
But of course, it makes a difference whether we are very sure about
13:59.160 --> 14:03.520
the velocity and can say, okay, it's definitely between 29 and 31
14:03.520 --> 14:04.480
kilometers per hour.
14:04.680 --> 14:08.440
Or whether we are very unsure and can only say, okay, it's somewhere
14:08.440 --> 14:11.560
between 10 kilometers per hour and 50 kilometers per hour.
14:11.700 --> 14:13.040
That makes a difference.
14:13.720 --> 14:17.960
So if you are again interested in these measures of uncertainty,
14:18.360 --> 14:24.600
it's also possible to derive variances and confidence intervals for
14:24.600 --> 14:26.540
those terms here, if you like.
14:27.180 --> 14:31.980
I don't provide this formula here, but it's possible with little
14:31.980 --> 14:32.440
effort.
14:33.000 --> 14:38.080
Similar to the case where we had a non-moving vehicle, we could also
14:38.080 --> 14:41.120
derive these variances for these terms here.
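For completeness, one common way to compute such variances for the constant-velocity fit looks like this (a sketch under the assumption of independent Gaussian noise with constant variance; the formula sigma^2 * (A^T A)^-1 is a standard result, not one given on the slides):

```python
def regression_uncertainty(times, positions, x0, v):
    """Variances of the least-squares estimates x0 and v, assuming
    independent Gaussian measurement noise with constant variance.
    Uses sigma^2 * (A^T A)^-1 with A the [1, t] design matrix."""
    n = len(times)
    residuals = [x - (x0 + v * t) for t, x in zip(times, positions)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # noise variance estimate
    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    det = n * s_tt - s_t * s_t
    var_x0 = sigma2 * s_tt / det
    var_v = sigma2 * n / det
    return var_x0, var_v

# A rough 95% confidence interval for v would then be
# v +/- 1.96 * var_v ** 0.5 (normal approximation).
```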
14:42.680 --> 14:46.640
Okay, so that would be the second possibility.
14:46.900 --> 14:49.420
We assume the vehicle is moving with constant velocity.
14:50.040 --> 14:55.440
That is useful in many situations, but not in all.
14:55.540 --> 14:57.260
So it might also happen like that.
14:57.420 --> 15:00.300
So the vehicle still is waiting at the traffic lights.
15:00.800 --> 15:04.380
They go to green and then the vehicle is accelerating.
15:05.060 --> 15:08.700
Of course, if we assume constant velocity while the vehicle is
15:08.700 --> 15:14.860
accelerating like that, our assumptions are not met and the estimates
15:14.860 --> 15:17.420
that we would get would not be valid.
15:17.600 --> 15:20.300
It would not fit to the real scenario.
15:20.800 --> 15:25.280
So let's discuss how we could model this scenario of a vehicle that
15:25.280 --> 15:29.380
has a considerable acceleration that differs from zero.
15:30.500 --> 15:32.280
Okay, what could we do?
15:32.400 --> 15:37.460
We could follow the same way as we were arguing before, but now we
15:37.460 --> 15:38.960
need another motion model.
15:39.100 --> 15:42.500
We need another model that describes in which way the vehicle moves.
15:43.020 --> 15:48.180
And if you go back to the basic physical motion equations and assume
15:48.180 --> 15:52.620
that the vehicle is moving with constant acceleration, then we get
15:52.620 --> 15:57.000
this equation here, just from the basic physical equations.
15:57.260 --> 16:03.220
So the position of the vehicle at a certain point in time is equal to
16:03.220 --> 16:07.800
the initial position plus the initial velocity times the time plus a
16:07.800 --> 16:10.180
half times the acceleration times t squared.
16:10.700 --> 16:17.720
And if we are modeling not the real
16:17.720 --> 16:24.620
position of the vehicle, but the sensed position, then we also have to
16:24.620 --> 16:29.320
consider the uncertainty and the imprecision and the noise and
16:29.320 --> 16:33.480
distortions in the image processing process.
16:33.940 --> 16:36.220
So we have to add some measurement noise here.
16:37.460 --> 16:41.520
So based on this equation we can do the same as we did before.
16:41.640 --> 16:49.180
We were arguing that we get one of these equations for each
16:49.180 --> 16:54.580
measurement that we made and we want to minimize the sum of the
16:54.580 --> 16:59.100
squared noise terms here or error terms that occur.
17:00.380 --> 17:04.440
And by doing that we say we want to minimize this term here.
17:05.000 --> 17:11.780
We want to minimize these squared errors that we make in each
17:11.780 --> 17:16.720
measurement equation sum up over all the measurement equations and we
17:16.720 --> 17:20.520
want to minimize this term with respect to the three unknown variables
17:20.520 --> 17:23.740
which are now x0, v0 and a.
17:24.680 --> 17:28.620
And again we can take this term here, calculate the partial
17:28.620 --> 17:32.540
derivatives with respect to the three unknown variables, set these
17:32.540 --> 17:38.080
derivatives to zero, resolve everything, and we get a system of linear equations.
17:38.620 --> 17:44.180
And based on that we can determine the three unknown variables x0, v0
17:44.180 --> 17:44.680
and a.
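A sketch of this least-squares fit for the constant-acceleration model in Python (the example trajectory is invented; a real implementation would typically use a linear-algebra library instead of a hand-rolled 3x3 solve):

```python
def fit_constant_acceleration(times, positions):
    """Least-squares fit of x(t) = x0 + v0*t + 0.5*a*t^2.
    Builds the 3x3 normal equations A^T A p = A^T x for the design
    matrix with rows [1, t, 0.5*t^2] and solves by Gaussian elimination."""
    rows = [[1.0, t, 0.5 * t * t] for t in times]
    # Normal equations M p = b
    M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * x for r, x in zip(rows, positions)) for i in range(3)]
    # Gaussian elimination with partial pivoting
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 3):
                M[r][c] -= f * M[col][c]
            b[r] -= f * b[col]
    # Back-substitution
    p = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        p[r] = (b[r] - sum(M[r][c] * p[c] for c in range(r + 1, 3))) / M[r][r]
    return p  # [x0, v0, a]

# A car pulling away from the lights: x0 = 0, v0 = 0, a = 2 m/s^2
x0, v0, a = fit_constant_acceleration([0, 1, 2, 3], [0.0, 1.0, 4.0, 9.0])
```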
17:45.980 --> 17:51.880
So as well we could extend this method for any motion model that we
17:51.880 --> 17:53.940
can somehow describe physically.
17:54.400 --> 17:58.460
So if you are interested in modeling a movement not just in a
17:58.460 --> 18:04.280
one-dimensional space, as we did on the slides before, but in
18:04.280 --> 18:08.840
a two-dimensional space where a vehicle is moving on a plane and we
18:08.840 --> 18:12.880
want to determine the direction in which the vehicle is
18:12.880 --> 18:15.380
moving, we can go on in the same way.
18:15.560 --> 18:19.280
So we just have to think about the question what is the appropriate
18:19.280 --> 18:23.800
equation that describes the motion of this vehicle.
18:24.200 --> 18:27.640
If we are assuming that the vehicle is not just moving straight
18:27.640 --> 18:35.560
forward but that it might also steer and drive on a curve, we have to
18:35.560 --> 18:38.400
consider the motion model for driving on a curve.
18:39.020 --> 18:44.320
Then we have some more unknown variables, like the yaw rate that
18:44.320 --> 18:50.480
describes the amount of turning of the vehicle, and we establish this
18:50.480 --> 18:52.260
motion model.
18:52.620 --> 18:57.760
Based on that we determine this optimization problem where we want to
18:57.760 --> 19:01.980
minimize the sum of the squared errors that occur in each measurement
19:01.980 --> 19:04.460
and then we try to minimize it.
19:04.460 --> 19:08.620
From a certain point on if the motion model becomes too complicated
19:08.620 --> 19:14.460
there is no analytical solution for this optimization problem anymore,
19:14.620 --> 19:20.020
but with numerical minimizers it's still possible to find a solution
19:20.020 --> 19:21.380
or with some tricks.
19:24.600 --> 19:30.900
Let us now summarize these techniques, regression techniques for
19:30.900 --> 19:32.920
estimating the motion of an object.
19:33.500 --> 19:39.300
The advantage is that these methods are really simple, easy to use,
19:40.000 --> 19:42.460
the calculations are all efficient.
19:43.300 --> 19:47.920
You have to solve a system of linear equations but a pretty small
19:47.920 --> 19:52.820
system of linear equations that does not require much computation
19:52.820 --> 19:53.420
time.
19:53.700 --> 19:58.420
It can be used to estimate straight movements in one, two or three
19:58.420 --> 20:00.440
dimensions if you like.
20:01.900 --> 20:04.700
It's possible to find analytic solutions for that.
20:05.240 --> 20:11.400
If it's a movement that is not straight, then you have to solve a
20:11.400 --> 20:15.820
non-linear optimization problem, which requires numerical optimization
20:15.820 --> 20:17.840
techniques, but it is still possible.
20:18.660 --> 20:23.420
My suggestion for you is if you are faced with a problem like that
20:23.420 --> 20:25.480
give it a try with regression.
20:25.760 --> 20:30.120
Often it turns out to be easier than the techniques that we will
20:30.120 --> 20:38.780
introduce next, namely the Kalman filter, which looks pretty nice but
20:38.780 --> 20:41.560
in practice is a little bit difficult to tune.
20:42.820 --> 20:47.300
So give this regression a try, it's worth trying it.
20:48.720 --> 20:53.300
So shortcomings, well non-linear methods are necessary for curved
20:53.300 --> 20:55.220
trajectories, that is what I already said.
20:56.120 --> 20:58.500
Then the computational effort grows.
20:59.840 --> 21:02.140
Okay, let's have a look at an example.
21:02.360 --> 21:06.280
So this is work that is by now roughly 12 years old.
21:06.940 --> 21:13.280
At that time I was creating together with some colleagues a robot
21:13.280 --> 21:21.360
soccer team, a team of robots that are playing soccer in an automated
21:21.360 --> 21:21.780
fashion.
21:21.960 --> 21:26.340
So they are not tele-operated, but fully autonomous.
21:26.900 --> 21:31.460
So these are our robots, this one here, these are the robots from
21:31.460 --> 21:31.920
Stuttgart.
21:32.880 --> 21:38.500
And yeah, there were real competitions in soccer playing robots and
21:38.500 --> 21:41.400
yeah we were one of these teams that were participating there.
21:42.140 --> 21:45.680
Of course if you want to play soccer you need to know where the ball
21:45.680 --> 21:48.900
is and you also need to know how fast the ball is moving and in which
21:48.900 --> 21:49.380
direction.
21:49.560 --> 21:53.460
Otherwise you will not be able to play soccer successfully.
21:53.760 --> 21:56.100
So that's a typical task for motion estimation.
21:57.460 --> 22:01.700
And actually for this task we were implementing this kind of
22:01.700 --> 22:03.980
regression algorithm quite successfully.
22:04.640 --> 22:05.240
Of course,
22:07.420 --> 22:12.040
implementing it took a little bit of time to model all the details.
22:12.600 --> 22:16.400
So of course a ball might roll on the field; while it is rolling,
22:16.560 --> 22:21.480
it is decelerating a little bit. But it might also be kicked by a
22:21.480 --> 22:21.720
robot.
22:21.940 --> 22:25.580
Then it is accelerating or it might collide with a robot.
22:25.860 --> 22:30.480
Then it is somehow reflected at the robot and changes the direction of
22:30.480 --> 22:30.860
movement.
22:31.420 --> 22:33.580
We might have to be faced with chip kicks.
22:33.820 --> 22:38.140
That means the ball might leave the ground while it is kicked.
22:38.540 --> 22:40.580
And we had to deal with all these details.
22:40.980 --> 22:46.820
But the basic method to estimate the motion of the ball was to use a
22:46.820 --> 22:47.600
regression model.
22:48.640 --> 22:51.440
And that was pretty successful in our case.
22:51.720 --> 22:56.220
Although the measurements that we had of the ball position were
22:56.220 --> 23:01.520
quite limited in accuracy because of the whole sensor system.
23:02.200 --> 23:05.780
Here you see the camera on top, which was looking into a mirror that
23:05.780 --> 23:06.640
was above it.
23:07.080 --> 23:09.980
So that it was looking into the whole environment.
23:11.180 --> 23:14.760
So it had a 360 degrees around view.
23:15.500 --> 23:18.420
However the resolution of the camera was quite poor.
23:18.600 --> 23:21.560
So it was 640 times 480 pixels.
23:21.940 --> 23:27.960
And if you compare that with a nowadays consumer camera with whatever,
23:28.200 --> 23:31.140
10, 11, 12, 13 million pixels,
23:32.140 --> 23:35.500
this is pretty little resolution.
23:36.940 --> 23:42.580
So the precision of each measurement of the ball position was
23:42.580 --> 23:44.740
pretty limited.
23:45.520 --> 23:50.440
But still we were able to interact with the ball pretty nicely.
23:51.560 --> 23:52.920
So how was it done?
23:53.060 --> 23:56.860
As I said, the horizontal movements are projected
23:56.860 --> 23:57.860
onto the ground plane.
23:58.020 --> 24:01.600
We were using a constant velocity model of the ball.
24:02.680 --> 24:10.860
And we were always using a time frame of the last 3 to 15
24:10.860 --> 24:12.560
points in time.
24:13.500 --> 24:19.360
That is, the last 3 to 15 images from which we were taking the ball
24:19.360 --> 24:19.840
position.
24:20.740 --> 24:24.400
And then we were estimating the velocity and the position of the ball
24:24.400 --> 24:27.880
based on the assumption of constant velocity.
24:28.520 --> 24:32.480
So we had a frame rate of 30 images per second.
24:32.660 --> 24:39.080
That means 3 images means a time interval of in total something like
24:39.080 --> 24:40.560
67 milliseconds.
24:41.260 --> 24:44.900
And 15 images means half a second of measurements.
24:46.100 --> 24:51.120
For the vertical movement we were assuming an accelerated motion.
24:51.840 --> 24:53.620
We know there is gravity.
24:54.280 --> 24:57.800
That means we also know the acceleration in vertical direction.
24:58.380 --> 25:04.340
And so if the ball was chip kicked, we were assuming a motion
25:04.340 --> 25:09.620
model in vertical direction that is accelerated with an
25:09.620 --> 25:11.580
acceleration given by gravity.
25:11.920 --> 25:15.380
Of course we had to consider deflections of the ball.
25:15.500 --> 25:19.640
When the ball bounces on the ground it bounces back.
25:20.280 --> 25:23.060
That had to be modeled explicitly.
25:23.700 --> 25:28.420
This could not be integrated into this vertical motion model that we
25:28.420 --> 25:31.100
were using together with linear regression.
25:31.420 --> 25:34.160
But this was an additional modeling step.
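The vertical fit with the acceleration fixed to gravity can be sketched like this (a reconstruction of the idea, not the original robot code; bounces would still need the separate handling mentioned above):

```python
G = 9.81  # m/s^2, gravitational acceleration

def fit_vertical(times, heights):
    """Fit y(t) = y0 + vy*t - 0.5*G*t^2 with the acceleration fixed to
    gravity, so only y0 and vy are unknown: add the known gravity term
    back onto the measurements and run an ordinary linear regression."""
    adjusted = [y + 0.5 * G * t * t for t, y in zip(times, heights)]
    n = len(times)
    s_t = sum(times)
    s_tt = sum(t * t for t in times)
    s_y = sum(adjusted)
    s_ty = sum(t * y for t, y in zip(times, adjusted))
    det = n * s_tt - s_t * s_t
    y0 = (s_y * s_tt - s_ty * s_t) / det
    vy = (n * s_ty - s_t * s_y) / det
    return y0, vy
```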
25:34.840 --> 25:38.180
As I said we took between 3 and 15 observations.
25:39.080 --> 25:43.280
And we made it adaptive: that means we were always checking
25:43.280 --> 25:47.820
whether this assumption of a constant ball movement was still valid or
25:47.820 --> 25:48.080
not.
25:48.380 --> 25:52.560
And if we observed that the new measurements did not fit our
25:52.560 --> 25:55.820
motion model anymore, then we said something happened.
25:55.960 --> 26:01.240
The ball might be kicked or the ball might be deflected at an
26:01.240 --> 26:01.720
obstacle.
26:02.100 --> 26:06.440
That means we were then assuming that the motion changed.
26:07.040 --> 26:11.700
And that means then we were reducing the number of observations that
26:11.700 --> 26:13.520
we considered to 3.
26:14.000 --> 26:24.000
And then we added observations in our temporal window up to a maximum
26:24.000 --> 26:24.940
number of 15.
26:26.680 --> 26:28.480
That was the procedure.
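This adaptive-window procedure could be sketched like this in Python (a reconstruction under stated assumptions: the mismatch threshold of 0.5 m and the function names are illustrative, not the original implementation):

```python
def fit_cv(window):
    """Constant-velocity least-squares fit over (time, position) pairs;
    returns a prediction function t -> x0 + v*t."""
    n = len(window)
    s_t = sum(t for t, _ in window)
    s_tt = sum(t * t for t, _ in window)
    s_x = sum(x for _, x in window)
    s_tx = sum(t * x for t, x in window)
    det = n * s_tt - s_t * s_t
    x0 = (s_x * s_tt - s_tx * s_t) / det
    v = (n * s_tx - s_t * s_x) / det
    return lambda t: x0 + v * t

def track_step(window, t, x, threshold=0.5, min_window=3, max_window=15):
    """One tracking step. If the new measurement deviates too much from
    the constant-velocity prediction, the ball was probably kicked or
    deflected: restart with only the most recent observations."""
    if len(window) >= min_window:
        predicted = fit_cv(window)(t)
        if abs(predicted - x) > threshold:
            window = window[-(min_window - 1):]  # motion changed: shrink
    window = window + [(t, x)]
    return window[-max_window:]  # never keep more than max_window points
```

Feeding consistent measurements grows the window up to 15 observations; a sudden jump (a kick) shrinks it back to 3.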
26:29.480 --> 26:31.920
Here is some result.
26:32.320 --> 26:34.480
This was actually an experiment in lab.
26:34.920 --> 26:36.740
What you can see is a soccer field.
26:37.340 --> 26:39.060
This is the position of the robot.
26:39.060 --> 26:41.300
So here is the front of the robot.
26:41.480 --> 26:43.020
That's the rear side of the robot.
26:44.140 --> 26:47.740
And here is what you can see is a ball.
26:47.920 --> 26:55.580
So this dark red dashed circle is the sensed position of the ball.
26:55.680 --> 26:58.420
That means that's a position that the camera sensed.
26:59.060 --> 27:06.040
And this light red solid circle is the position of the ball that was
27:06.040 --> 27:08.500
estimated using the linear regression model.
27:09.140 --> 27:13.520
And the line here, this red line, indicates the motion direction and
27:13.520 --> 27:14.140
the velocity.
27:14.360 --> 27:19.640
That means the longer this line is, the larger is the estimated
27:19.640 --> 27:20.220
velocity.
27:21.880 --> 27:26.160
So this experiment runs in a cycle.
27:26.400 --> 27:29.900
So when we see observations here, the sequence starts.
27:30.700 --> 27:33.200
And now we have the first two measurements.
27:35.000 --> 27:38.660
We did not yet estimate the motion of the ball.
27:39.000 --> 27:44.340
But from the third observation on, we started to estimate the motion
27:44.340 --> 27:44.900
of the ball.
27:47.440 --> 27:55.740
And especially for the goalkeeper, we had a binocular camera set up so
27:55.740 --> 27:59.400
that we could also observe the ball when it is not located on the
27:59.400 --> 28:01.340
ground, but when it was chip kicked.
28:02.480 --> 28:08.600
For the other robots, we only had a monocular camera set up so that we
28:08.600 --> 28:13.040
could only get the position of the ball when it was lying on the
28:13.040 --> 28:13.380
ground.
28:14.300 --> 28:15.380
But that was sufficient.
28:15.820 --> 28:19.760
But for the goalkeeper, it was pretty important, of course, to be able
28:19.760 --> 28:22.320
to react also on balls that were chip kicked.
28:23.780 --> 28:30.500
Okay, so that's one example where such a linear regression model was
28:30.500 --> 28:33.880
very useful to estimate the motion of objects.
28:34.260 --> 28:38.680
Of course, this can also be done if you want to track cars in a
28:38.680 --> 28:43.620
traffic scenario or if you want to track pedestrians or if you want to
28:43.620 --> 28:45.540
track any other moving objects.
28:46.340 --> 28:54.000
Okay, so this was linear regression with an example for tracking
28:54.000 --> 28:54.520
objects.
28:55.140 --> 28:58.600
Well, the probability theory is something that we already introduced,
28:59.120 --> 29:04.660
so we can skip it today and can go to the next approach.
29:04.980 --> 29:10.360
First, a very generic approach to tracking, which applies not only
29:10.360 --> 29:15.920
to tracking objects in the sense of estimating their motion, but also
29:15.920 --> 29:19.700
to tracking system states in general.
29:19.980 --> 29:23.080
We will see later on how we can generalize it.
29:24.200 --> 29:29.020
But first of all, we need to introduce a stochastic model with which
29:29.020 --> 29:33.720
we can model these kinds of processes, like the process of a ball that
29:33.720 --> 29:39.400
moves or the process of the car that drives or the process of a system
29:39.400 --> 29:42.140
that evolves and changes over time.
29:43.240 --> 29:47.000
These are the so-called hidden Markov models that we want to use.
29:47.380 --> 29:52.440
So our main idea is we observe something and we treat that as a
29:52.440 --> 29:58.620
system, whatever system means, but as a kind of entity that can be
29:58.620 --> 30:04.000
described by some equations, physical equations for instance, which
30:04.000 --> 30:12.420
describe a motion of an object or other equations from other sciences,
30:12.760 --> 30:15.060
if you want to describe other systems.
30:15.960 --> 30:19.820
But for us here, let's assume we want to describe the motion of an
30:19.820 --> 30:20.100
object.
30:20.560 --> 30:26.140
We treat this object and its motion as a system and we assume that the
30:26.140 --> 30:28.080
system is described by a state.
30:28.320 --> 30:29.160
What is a state?
30:29.160 --> 30:31.720
A state is just a set of variables.
30:33.100 --> 30:37.920
So for instance, if you want to model the movement of a car, you might
30:37.920 --> 30:41.440
say the movement is fully described if you know the position of the
30:41.440 --> 30:46.300
car, if you know the velocity of the car, the driving direction, maybe
30:46.300 --> 30:47.340
the steering angle.
30:47.700 --> 30:49.500
Then this motion is fully described.
30:50.100 --> 30:55.040
And such a set of variables that we need to fully describe the
30:55.040 --> 30:58.720
behavior of such a system is what we call the system state.
31:00.620 --> 31:06.040
And one property of the system state is that we can describe the
31:06.040 --> 31:11.340
changes of the system over time only based on these variables that are
31:11.340 --> 31:12.780
contained within the state.
31:13.140 --> 31:17.120
If we need some other variables, then the state we were using up to
31:17.120 --> 31:19.760
now is somehow not complete.
31:20.400 --> 31:22.020
Yeah, so one example.
31:22.740 --> 31:28.080
Let's go back to the scenario of the car accelerating at the traffic lights.
31:28.080 --> 31:31.720
If we don't know the acceleration of the car, we cannot make good
31:31.720 --> 31:33.720
predictions of its future behavior.
31:34.020 --> 31:37.780
But if we know the acceleration, then we can make good predictions.
31:38.100 --> 31:42.500
So the acceleration is necessary, so the acceleration must be part of
31:42.500 --> 31:45.900
the state of the system "moving car".
31:46.820 --> 31:52.440
Yeah, okay, so here is this example.
31:52.780 --> 31:55.300
First of all, this example without acceleration.
31:55.600 --> 31:59.380
So if you want to model a car that moves more or less with constant
31:59.380 --> 32:04.460
velocity, then we might say the system is fully described knowing its
32:04.460 --> 32:07.080
current position and its current velocity.
32:07.660 --> 32:13.460
And if we are interested in 2D motion or 3D motion, these two
32:13.460 --> 32:14.940
parameters become vectors.
32:15.600 --> 32:19.120
Yeah, so the vector-valued position of the vehicle and the vector
32:19.120 --> 32:21.420
-valued velocity of the car.
32:22.960 --> 32:28.780
Now, based on this system state, we can predict the changes of
32:28.780 --> 32:29.960
the system into the future.
32:30.160 --> 32:35.700
We can say, okay, for some point in time, at a future point in time,
32:35.720 --> 32:39.520
which differs from the present point in time t by a certain amount
32:39.520 --> 32:44.900
capital delta, we could argue that based on the system description,
32:45.400 --> 32:49.500
the future position is the current position plus delta times the
32:49.500 --> 32:50.340
current velocity.
32:51.380 --> 32:55.840
And the future velocity is equal to the current velocity.
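Written out, the constant-velocity prediction just described is (with x the position, v the velocity, and Delta the time step):

```latex
x(t + \Delta) = x(t) + \Delta \cdot v(t), \qquad v(t + \Delta) = v(t)
```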
32:56.640 --> 33:03.640
Of course, in practice, none of these systems is deterministic.
33:04.020 --> 33:07.100
Yeah, there is always a little bit of imprecision in the system.
33:07.680 --> 33:13.000
So in this case, the car driver might change the velocity a little
33:13.000 --> 33:13.360
bit.
33:13.680 --> 33:15.720
It might vary it a little bit.
33:16.040 --> 33:20.880
Or there might be some friction or whatever, so that this position
33:20.880 --> 33:24.860
equation here is not perfectly met.
33:25.660 --> 33:30.780
So therefore, we have to add some statistics, some random noise in
33:30.780 --> 33:34.920
these two equations, which model this imprecision in the system.
33:36.580 --> 33:40.880
The imprecision and the transitions between different points in time.
33:41.660 --> 33:44.160
Now this is not yet the measurement noise.
33:44.560 --> 33:46.500
We didn't talk about measurements yet.
33:46.680 --> 33:50.380
This is just the imprecision in the system as such.
33:50.840 --> 33:54.600
The imprecision with which a driver can keep his
33:54.600 --> 33:55.720
velocity, for instance.
33:57.020 --> 34:01.580
So in this case, we can even rewrite these two equations in terms of a
34:01.580 --> 34:03.040
matrix vector multiplication.
34:03.600 --> 34:08.500
So if the state is written here as a kind of vector of state
34:08.500 --> 34:13.660
variables, then we can describe these transitions as a matrix-vector
34:13.660 --> 34:14.180
multiplication.
34:14.180 --> 34:22.100
In this case, the new state is equal to the matrix (1, delta; 0, 1) times
34:22.100 --> 34:26.600
the old state plus noise, where the noise here is a noise vector that
34:26.600 --> 34:30.140
has two components, one for the position and one for the velocity.
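As a concrete illustration, this transition can be sketched in a few lines of Python; the step size `delta`, the noise level, and the function name `system_step` are illustrative assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

delta = 0.1                       # assumed time step in seconds
F = np.array([[1.0, delta],       # transition matrix: new position = x + delta * v
              [0.0, 1.0]])        #                    new velocity = v

def system_step(s, noise_std=0.0):
    """Advance the state s = [position, velocity] by one time step.

    With noise_std > 0 the random term models the imprecision of the
    system itself (the driver varying the speed, friction, ...)."""
    return F @ s + rng.normal(0.0, noise_std, size=2)

s = np.array([0.0, 5.0])          # at position 0, driving 5 m/s
s_next = system_step(s)           # noise-free step: [0.5, 5.0]
```

With a positive `noise_std`, repeated calls produce slightly different trajectories, which is exactly the system imprecision discussed above.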
34:31.220 --> 34:35.880
Okay, so now let's talk about the measurements that we make.
34:36.020 --> 34:39.580
So we assume we have some sensor that can take measurements from the
34:39.580 --> 34:39.940
system.
34:40.780 --> 34:48.100
And we assume that we cannot necessarily measure the system
34:48.100 --> 34:52.140
variables, the state variables, but that we can make some
34:52.140 --> 34:57.760
observations that are only somehow related to the state variables.
34:58.580 --> 35:03.420
And that means we assume that there is a mapping that describes how
35:03.420 --> 35:09.880
the sensor works, that maps the state, the true state of the system to
35:09.880 --> 35:11.520
a measurement z.
35:12.320 --> 35:16.900
So z is always used within chapter number five for measurements.
35:17.400 --> 35:21.620
The z might be a single scalar measurement, or it might be a vector
35:21.620 --> 35:22.420
of measurements.
35:23.000 --> 35:27.100
Furthermore, we assume that the whole measurement
35:27.100 --> 35:28.860
process is not perfect.
35:29.100 --> 35:33.420
So there is some uncertainty, some random imprecision in the
35:33.420 --> 35:35.660
measurement system as well.
35:35.760 --> 35:38.920
That means we have to add some noise here as well.
35:39.660 --> 35:44.900
So the sensor takes the true state of the system and maps it onto an
35:44.900 --> 35:49.700
observation and adds some imprecision, some statistical, some
35:49.700 --> 35:50.480
random noise.
35:51.220 --> 35:56.440
And this indeed is the measurement noise, the noise that is created by
35:56.440 --> 35:59.860
the sensor and the sensor evaluation process.
36:01.440 --> 36:06.520
So for instance, let's assume we have this example of a car that
36:06.520 --> 36:07.880
drives at constant velocity.
36:08.500 --> 36:10.980
The system state is described by these variables.
36:11.500 --> 36:14.880
Now we assume, for instance, that we have a camera.
36:15.320 --> 36:19.340
The camera can measure the position of the vehicle, but not its
36:19.340 --> 36:19.860
velocity.
36:19.860 --> 36:23.700
Then we could argue that this measurement equation looks like that.
36:24.040 --> 36:26.180
So Z of t is equal to X of t.
36:26.340 --> 36:30.620
We can directly measure the position of the vehicle, plus some
36:30.620 --> 36:32.040
additional measurement noise.
36:33.460 --> 36:37.300
For other sensors, we might be able to sense the velocity of the
36:37.300 --> 36:38.700
vehicle, but not the position.
36:39.080 --> 36:42.540
For other sensors, we might be able to sense both the position and the
36:42.540 --> 36:42.920
velocity.
36:43.580 --> 36:48.040
In such a case, Z of t would be a vector of two entries.
36:49.260 --> 36:53.940
And again, in this case, we can represent that as well as a matrix
36:53.940 --> 36:56.400
times vector multiplication.
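A minimal sketch of such a measurement mapping in Python, assuming a camera-like sensor that observes only the position; the matrix `H`, the noise level, and the function name `measure` are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Measurement matrix: the sensor observes the position, not the velocity.
H = np.array([[1.0, 0.0]])

def measure(s, noise_std=0.5):
    """Map the true state s = [position, velocity] to an observation z.

    z = H @ s + noise; this is the measurement noise of the sensor,
    not the imprecision of the system itself."""
    return H @ s + rng.normal(0.0, noise_std, size=1)

s_true = np.array([12.0, 5.0])
z = measure(s_true)               # close to 12.0, but not exactly 12.0
```

Note that `z` is related to the true position but never identical to it, which is exactly the distinction between the true state and the observation made below.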
36:59.100 --> 37:04.820
Okay, so what we have to think about, or what is important to see, is
37:04.820 --> 37:07.560
that Z of t, these are the measurements.
37:07.720 --> 37:10.280
That is something that we can observe.
37:10.380 --> 37:13.020
That is something that the sensor provides to us.
37:14.380 --> 37:18.080
This S is something that we don't know.
37:18.380 --> 37:19.460
It's unknown to us.
37:20.680 --> 37:23.240
This is unknown to us.
37:23.660 --> 37:28.500
And it's also important to see that in this case, Z of t is not the
37:28.500 --> 37:29.540
same as X of t.
37:30.060 --> 37:34.380
We sense the position of the vehicle, but the measurement that we get
37:34.380 --> 37:40.060
is not exactly the position, but it's the position plus some noise on
37:40.060 --> 37:40.300
it.
37:40.800 --> 37:46.500
So we must be careful in dealing with these equations and always think
37:46.500 --> 37:52.160
carefully whether we talk about the true state, the true position of
37:52.160 --> 37:57.780
the vehicle, which would be X of t here, or whether we talk about the
37:57.780 --> 38:01.020
sensed position, the observation that we have.
38:01.580 --> 38:05.620
They should be related to each other, but they are not the same.
38:07.620 --> 38:13.400
Okay, so now let's assume we have modeled the system that we are
38:13.400 --> 38:18.300
interested in as such a system with a system state and with some
38:18.300 --> 38:23.160
observations and with a mapping from system states at one point in
38:23.160 --> 38:28.500
time to future points in time, and with a modeling that describes how
38:28.500 --> 38:33.560
the observation that the sensor provides is related to the state.
38:34.260 --> 38:39.740
Then, if we make some assumptions, we can describe that as a so-called
38:39.740 --> 38:42.720
hidden Markov model, or for short, HMM.
38:43.160 --> 38:43.720
What is it?
38:44.060 --> 38:48.540
It's a time-discrete, stochastic state transition system.
38:48.900 --> 38:52.720
Its observation and its successor state depend entirely on its present
38:52.720 --> 38:56.340
state and do not depend on previous states or observations.
38:57.000 --> 38:59.760
Okay, let's go through this definition step by step.
39:00.480 --> 39:04.220
So, first of all, it's a state transition system.
39:04.520 --> 39:05.140
That's clear.
39:05.280 --> 39:06.520
That's what we already saw.
39:06.640 --> 39:09.780
We have some state and the state is changing over time.
39:10.060 --> 39:11.760
Now that's a state transition system.
39:12.060 --> 39:13.500
Then it's stochastic.
39:13.840 --> 39:16.920
This means there's some randomness in the state transition system.
39:16.920 --> 39:20.540
We always have these noise terms, these random noise terms.
39:20.860 --> 39:24.700
It's not deterministic, but there is some randomness in it.
39:24.980 --> 39:29.100
Then time discrete means we evaluate the system only at discrete
39:29.100 --> 39:29.940
points in time.
39:30.540 --> 39:35.860
We are modeling it not as a continuous time process, but we are only
39:35.860 --> 39:38.600
focusing on some discrete points in time.
39:38.840 --> 39:46.780
For instance, if we deal with cameras, and the cameras are
39:46.780 --> 39:52.700
providing, say, 30 images per second, that means one image every
39:52.700 --> 39:57.860
33 milliseconds, then these time intervals that we consider are 33
39:57.860 --> 39:58.400
milliseconds.
39:59.520 --> 40:05.360
And we only consider the system state for these intervals of 33
40:05.360 --> 40:08.540
milliseconds and we are not interested what goes on between.
40:09.620 --> 40:15.900
This makes it a discrete-time state transition system, and therefore we
40:15.900 --> 40:19.280
can enumerate the states over time with integers.
40:19.860 --> 40:24.760
So point in time 0, 1, 2, 3, 4, 5, etc.
40:28.380 --> 40:33.780
Now its observation and its successor state depend entirely on its
40:33.780 --> 40:37.400
present state and do not depend on previous states or observations.
40:38.180 --> 40:39.220
What does it mean?
40:39.660 --> 40:45.140
If we know the true state of the system, then
40:45.140 --> 40:51.580
we can make good predictions for the observation that we will get and
40:51.580 --> 40:55.620
for the state at the next point in time.
40:56.380 --> 41:01.720
And knowing the previous states and the previous
41:01.720 --> 41:07.540
observations does not help us to get better predictions about the
41:07.540 --> 41:07.880
future.
41:08.240 --> 41:13.260
So all the information that describes the system is contained in the
41:13.260 --> 41:14.140
state variables.
41:15.780 --> 41:20.540
Knowing the history doesn't help to predict the future because all
41:20.540 --> 41:24.060
relevant information is already contained in the present state
41:24.060 --> 41:24.600
variables.
41:25.200 --> 41:26.880
That's the main message of this.
41:32.280 --> 41:40.140
Stochastically, this property can be defined by these two equations,
41:40.860 --> 41:49.800
which implement this assumption as stochastic
41:49.800 --> 41:50.280
equations.
41:50.440 --> 41:56.180
It says the probability for the successor state, based on the
41:56.180 --> 42:00.000
knowledge of the present state, the present observation, the previous
42:00.000 --> 42:06.660
state, the previous observation, and so on, up to the very first
42:06.660 --> 42:15.240
state, is the same as the probability of the successor state
42:15.240 --> 42:16.640
given the present state.
42:16.980 --> 42:21.600
Which means all this history of earlier states and earlier
42:21.600 --> 42:23.620
observations does not matter.
42:24.300 --> 42:30.220
Once we are faced with such an equation, with such a probability, we
42:30.220 --> 42:32.860
can simplify it to that probability.
42:34.020 --> 42:37.920
This is an assumption that we make about the world or about the
42:37.920 --> 42:38.260
system.
42:38.460 --> 42:43.900
It's not in general valid for any system, but it's an assumption that
42:43.900 --> 42:48.420
we have to make if we want to model a system as a hidden Markov model.
42:49.000 --> 42:52.520
And the second equation is similar to the first one.
42:52.660 --> 42:57.140
It says: if we want to predict the present observation,
42:57.140 --> 43:01.980
that is, if we ask for the probability of the present
43:01.980 --> 43:06.860
observation given the present state and all the previous observations
43:06.860 --> 43:12.480
and all previous states, this can be simplified by assumption in a
43:12.480 --> 43:17.600
hidden Markov model to the much simpler probability of the
43:17.600 --> 43:22.380
current observation given the current state, forgetting about the
43:22.380 --> 43:22.760
past.
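The two assumptions just described can be written compactly (with s_t the state and z_t the observation at time t):

```latex
P(s_{t+1} \mid s_t, z_t, s_{t-1}, z_{t-1}, \ldots, s_1, z_1) = P(s_{t+1} \mid s_t), \\
P(z_t \mid s_t, z_{t-1}, s_{t-1}, \ldots, z_1, s_1) = P(z_t \mid s_t).
```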
43:23.660 --> 43:28.220
A hidden Markov model is a system where we can always forget about what
43:28.220 --> 43:31.580
happened in the past because it will not influence the future.
43:32.220 --> 43:35.160
At least, as long as we know the present state.
43:37.120 --> 43:39.780
This is not valid for all systems, of course.
43:41.000 --> 43:45.660
For some systems, you must know from where you came to be able to
43:45.660 --> 43:50.420
predict into which direction, so to say, you will move in the future.
43:50.880 --> 43:55.420
Such a system would not be a hidden Markov model.
43:56.460 --> 44:01.340
For what follows, we assume that all the systems that we model meet
44:01.340 --> 44:06.840
these equations and these properties and can therefore be modeled as a
44:06.840 --> 44:07.920
hidden Markov model.
44:10.840 --> 44:13.760
So now we can ask some questions.
44:13.940 --> 44:18.940
Namely, one question would be: let us assume we made a sequence of
44:18.940 --> 44:25.600
observations and we want to know which state the system
44:25.600 --> 44:27.120
is currently in.
44:27.820 --> 44:33.660
Yeah, what is the present state of the system once we made a sequence
44:33.660 --> 44:34.520
of observations?
44:36.440 --> 44:42.900
So we observed a car several times and now we want to know how fast it
44:42.900 --> 44:43.400
is driving.
44:44.640 --> 44:46.660
Yeah, that would be a question like that.
44:47.240 --> 44:50.520
And another question is related to the first question.
44:50.680 --> 44:51.540
It looks like that.
44:51.780 --> 44:56.980
It says we observed the car several times and we want to know where
44:56.980 --> 45:04.100
it will be and how fast it will be moving one time step ahead.
45:04.800 --> 45:07.540
So we want to make a prediction into future.
45:08.760 --> 45:15.340
So these are the two frequent questions that are relevant for many
45:15.340 --> 45:16.740
applications.
45:17.240 --> 45:22.460
So this is what is the present state of the vehicle based on the
45:22.460 --> 45:23.780
observations that we make.
45:23.920 --> 45:29.820
And this is the question, what is the future state of the vehicle once
45:29.820 --> 45:31.560
we made a sequence of observations.
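In symbols, these two questions are the filtering and the one-step prediction distributions:

```latex
P(s_t \mid z_1, \ldots, z_t) \quad \text{(present state)}, \qquad
P(s_{t+1} \mid z_1, \ldots, z_t) \quad \text{(predicted state)}.
```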
45:32.240 --> 45:42.460
Now let's do that and derive equations for that based on the
45:42.460 --> 45:43.960
assumptions of a hidden Markov model.
45:44.080 --> 45:46.840
First of all, let's consider this equation here.
45:47.140 --> 45:51.660
The probability of the present state given the observations up to the
45:51.660 --> 45:52.800
present point in time.
45:53.840 --> 45:55.180
What can we do?
45:55.560 --> 46:00.800
Well, first of all, we can use Bayes' theorem to transform this
46:00.800 --> 46:01.700
equation.
46:02.520 --> 46:05.360
What do we exchange?
46:06.420 --> 46:13.000
Yes, we exchange ST and ZT in this conditional distribution and we
46:13.000 --> 46:17.580
keep the older observations here on the right hand side of this
46:17.580 --> 46:21.000
conditional probability distribution.
46:21.760 --> 46:27.220
Then we get this conditional probability where the role of ZT and ST
46:27.220 --> 46:32.640
has changed, and some correction terms, namely the probability of ST
46:32.640 --> 46:37.120
given all the previous observations and in the denominator the
46:37.120 --> 46:40.840
probability of the current observation given all previous
46:40.840 --> 46:41.460
observations.
46:42.220 --> 46:47.100
So now we can make two simplifications.
46:47.820 --> 46:53.860
The first simplification is, if we look at this equation, at this
46:53.860 --> 46:57.200
probability here, we see that's the probability of the current
46:57.200 --> 47:01.780
observation given the current state and some previous observations.
47:02.920 --> 47:07.440
So if we assume that this system is a hidden Markov model, then we can
47:07.440 --> 47:12.500
simplify this probability, this conditional probability, because we
47:12.500 --> 47:16.960
said for a hidden Markov model the history doesn't matter once we know
47:16.960 --> 47:17.840
the present state.
47:19.400 --> 47:23.020
All the information that is contained in this history of observations
47:23.020 --> 47:25.200
is also contained in ST.
47:26.040 --> 47:32.520
Therefore we can omit these previous observations in the right hand
47:32.520 --> 47:36.340
side part of this conditional probability.
47:37.260 --> 47:43.200
Now then this is simplified and this becomes this probability here.
47:43.780 --> 47:49.100
Furthermore this denominator, it just contains observations.
47:49.400 --> 47:51.460
It doesn't contain any states.
47:52.520 --> 47:54.260
It only contains observations.
47:54.440 --> 47:56.640
The observations are what we know.
47:56.920 --> 47:58.160
We have measured them.
47:59.080 --> 48:04.520
And that means this denominator is just a constant that is independent
48:04.520 --> 48:06.340
of the present state.
48:07.320 --> 48:11.360
And therefore we could argue that this term is a constant,
48:11.600 --> 48:16.560
so we hide it inside this "is proportional to" relation.
48:17.500 --> 48:23.640
So this "is proportional to" here just hides the denominator, which
48:23.640 --> 48:27.020
is just a constant independent of the present state of the system.
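Collecting the Bayes step and the two simplifications, this derivation gives:

```latex
P(s_t \mid z_1, \ldots, z_t)
  = \frac{P(z_t \mid s_t, z_1, \ldots, z_{t-1}) \, P(s_t \mid z_1, \ldots, z_{t-1})}
         {P(z_t \mid z_1, \ldots, z_{t-1})}
  \;\propto\; P(z_t \mid s_t) \, P(s_t \mid z_1, \ldots, z_{t-1}).
```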
48:29.620 --> 48:34.340
So this is this part.
48:34.960 --> 48:37.260
Now let's have a look at this equation.
48:37.560 --> 48:43.920
This deals with the predicted probability of the future state of the
48:43.920 --> 48:46.800
system given the observations up to now.
48:47.380 --> 48:48.660
What can we do here?
48:49.300 --> 48:53.360
Well, we can again use the calculation rules for
48:53.600 --> 48:55.080
calculating with probabilities.
48:55.660 --> 49:00.340
What we do is we add another variable here in this probability, namely
49:00.340 --> 49:01.920
ST, the present state.
49:03.140 --> 49:09.560
And to remove the influence of ST again, we sum over all
49:09.560 --> 49:13.000
possible values that this variable ST can take.
49:13.680 --> 49:17.580
Now this is the marginalization rule that we introduced in the
49:17.580 --> 49:20.340
repetition of probability theory.
49:20.820 --> 49:26.080
So this equality here really holds because we neglect the influence of
49:26.080 --> 49:32.080
this added random variable by summing up over all
49:32.080 --> 49:33.040
possible values.
49:35.300 --> 49:39.960
Okay, so based on this equation we can split up this equation a little
49:39.960 --> 49:43.580
bit again using the definition of conditional probabilities.
49:43.860 --> 49:48.720
We want to push, so to say, this ST variable to the right hand side
49:48.720 --> 49:50.820
part of the conditional distribution.
49:50.820 --> 49:56.540
We can do so by multiplying it with some correction term like that.
49:57.060 --> 50:01.200
So the Z1 to ZT is always in the condition part, in the right hand
50:01.200 --> 50:07.700
part of these conditional probabilities.
50:08.760 --> 50:14.760
So here we have, so to say, the joint probability of ST
50:14.760 --> 50:17.780
plus one and ST given these observations.
50:17.920 --> 50:22.060
And here we have the conditional distribution of ST plus one given ST
50:22.060 --> 50:28.640
and all the observations, times the marginal, so to
50:28.640 --> 50:33.220
say, of ST, again given the observations.
50:33.380 --> 50:36.620
So this transformation, as well, just uses the definition of a
50:36.620 --> 50:40.140
conditional probability.
50:41.000 --> 50:43.900
Now we can simplify this term.
50:44.340 --> 50:49.140
So this term is a probability of a successor state given the present
50:49.140 --> 50:52.940
state and some observations from the past.
50:53.160 --> 50:56.140
Again we assume we are faced with a hidden Markov model.
50:56.480 --> 51:01.120
That means all the information from the past is already contained in
51:01.120 --> 51:02.180
the present state.
51:03.200 --> 51:08.720
And that means that we can neglect the history and remove these
51:08.720 --> 51:13.840
observations here from this conditional probability.
51:14.740 --> 51:18.720
And then things simplify and we get these simpler transition
51:18.720 --> 51:19.460
probabilities.
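The marginalization and the Markov simplification just described give the prediction equation:

```latex
P(s_{t+1} \mid z_1, \ldots, z_t)
  = \sum_{s_t} P(s_{t+1} \mid s_t, z_1, \ldots, z_t) \, P(s_t \mid z_1, \ldots, z_t)
  = \sum_{s_t} P(s_{t+1} \mid s_t) \, P(s_t \mid z_1, \ldots, z_t).
```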
51:21.000 --> 51:22.460
Okay, so far.
51:22.540 --> 51:26.420
Now let's have a look at these two equations that we get.
51:26.880 --> 51:29.160
First of all, what is that?
51:29.460 --> 51:34.020
Well, that's just the simple state transition probability that says if
51:34.020 --> 51:38.560
you're in a certain state, how likely is it that you enter another
51:38.560 --> 51:39.320
state from it?
51:39.440 --> 51:44.180
So that's very simple and something that we defined in the hidden
51:44.180 --> 51:45.040
Markov model.
51:45.820 --> 51:50.120
Well, this term here has the form of the probability of the current
51:50.120 --> 51:54.040
state given the observations up to the current point in time.
51:54.320 --> 51:59.140
And that's actually what we get as a result here in the first
51:59.140 --> 51:59.600
equation.
51:59.600 --> 52:05.800
So this term that enters here was calculated as a result of the
52:05.800 --> 52:08.600
equation here on the top of the slide.
52:09.920 --> 52:14.680
So that means once we have this result from here, we can enter it
52:14.680 --> 52:14.960
here.
52:15.240 --> 52:18.320
This is something that we defined with a hidden Markov model.
52:18.520 --> 52:23.220
That means then we can evaluate this term and we get this probability
52:23.220 --> 52:24.780
of the predicted state.
52:25.880 --> 52:28.600
Now let's go to the row here.
52:30.040 --> 52:31.180
What is this?
52:31.320 --> 52:33.520
Well, that's just the observation probability.
52:33.900 --> 52:34.740
That's a probability.
52:35.100 --> 52:39.760
If you are in a certain state, what is the probability to make a
52:39.760 --> 52:40.640
certain observation?
52:40.800 --> 52:43.960
That is what we defined within the hidden Markov model.
52:45.280 --> 52:47.000
Okay, and what is this term?
52:47.260 --> 52:51.340
Well, this is a probability of the present state given the
52:51.340 --> 52:54.260
observations up to the point in time before.
52:56.140 --> 53:02.240
So this is a predicted probability that predicts which state the system
53:02.240 --> 53:07.180
will be in at the next point in time, once we saw all the observations up to the
53:07.180 --> 53:08.460
previous point in time.
53:09.300 --> 53:15.060
So it has the structure of this probability here, but we see that this
53:15.060 --> 53:18.560
index here has just been incremented by one.
53:19.000 --> 53:24.440
So here we have st given z1 to zt minus one and here we have st plus
53:24.440 --> 53:27.080
one given z1 to zt.
53:27.340 --> 53:31.300
So the time index has just been incremented by one.
53:32.260 --> 53:36.500
But we see that the structure of this probability is actually the same
53:36.500 --> 53:37.340
as this one.
53:37.420 --> 53:41.760
That means if we want to calculate this term, we have to apply this
53:41.760 --> 53:45.480
calculation here at the bottom, but for the point in time before.
53:46.840 --> 53:51.440
That means in total we get some recurrent dependence between these two
53:51.440 --> 53:52.120
equations.
53:52.520 --> 53:57.180
The result that we calculate here at the bottom enters the calculation
53:57.180 --> 54:01.820
here on the right hand side of the calculation at the top of the slide
54:01.820 --> 54:05.480
and the result that we get here on the top of the slide enters the
54:05.480 --> 54:07.640
calculation at the bottom of the slide.
54:08.140 --> 54:15.800
But within such a cycle the time is incremented by one.
54:16.380 --> 54:22.220
That means the calculations are always based on the results that we
54:22.220 --> 54:24.600
obtained at the point in time before.
54:26.520 --> 54:30.280
That means, of course, we still have to start at a
54:30.280 --> 54:33.180
certain initial point in time with a certain assumption.
54:33.760 --> 54:38.240
So an initial assumption about the distribution or the probabilities
54:38.240 --> 54:43.120
of the state variables; that would actually be something like the
54:43.120 --> 54:55.080
probability of S1 given no
54:55.080 --> 54:56.260
observations at all.
54:56.920 --> 54:59.300
That's something that we have to assume somehow.
55:00.160 --> 55:06.480
If we have that, then we have this term here for S1, so where t
55:06.480 --> 55:12.180
actually is zero; then we can enter that here and go on like
55:12.180 --> 55:18.940
that, doing all the calculations step by step over time.
55:19.820 --> 55:24.860
So here is another scheme that shows this schedule.
55:25.220 --> 55:29.920
So we have an initial estimate about this predicted state
55:29.920 --> 55:30.620
probabilities.
55:31.260 --> 55:35.760
We take it from some background knowledge, or we specify in the
55:35.760 --> 55:38.900
probability distribution with which we start that we have no
55:38.900 --> 55:42.260
knowledge about the present state; that's also possible.
55:42.880 --> 55:44.680
And then we have these two steps.
55:45.660 --> 55:49.580
One step that takes these predicted state probabilities and
55:52.920 --> 55:57.600
integrates a new observation to get the state probabilities.
55:58.540 --> 56:02.880
And then the other step that goes from here to here that does the
56:02.880 --> 56:03.280
prediction.
56:03.820 --> 56:06.420
And these two steps have names.
56:06.420 --> 56:14.360
Once we calculate these predicted state probabilities from those here,
56:14.680 --> 56:16.920
this is the so-called prediction step.
56:17.540 --> 56:23.340
And the step where we integrate a new observation is called either the
56:23.340 --> 56:28.620
correction step or the innovation step of this approach.
56:34.520 --> 56:38.840
Then the whole thing looks like that over time.
56:39.320 --> 56:42.180
So we start with this p of s1.
56:43.420 --> 56:49.080
Then we make an innovation step to get p of s1 given z1.
56:49.480 --> 56:54.240
Then we make a prediction step to get this p of s2 given z1.
56:54.480 --> 56:58.600
Then we make an innovation step to get p of s2 given z1 and z2.
56:59.380 --> 57:05.260
Then we make again a prediction step and so on and so on.
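For discrete state variables, this alternating cycle can be sketched as two small Python functions; the function names and the two-state example numbers are illustrative, not from the lecture:

```python
import numpy as np

def innovation(pred, O, z):
    """Correction/innovation step: fold a new observation z into the
    predicted state probabilities P(s_t | z_1..z_{t-1})."""
    p = O[:, z] * pred            # P(z | s) * P(s | earlier observations)
    return p / p.sum()            # normalize (the hidden proportionality constant)

def prediction(filt, T):
    """Prediction step: P(s_{t+1} | z_1..z_t) = sum_s P(s' | s) P(s | z_1..z_t)."""
    return T.T @ filt

# Tiny two-state example with made-up numbers.
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])        # T[i, j] = P(next state j | current state i)
O = np.array([[0.7, 0.3],
              [0.4, 0.6]])        # O[i, k] = P(observation k | state i)

belief = np.array([0.5, 0.5])     # initial predicted state probabilities
belief = innovation(belief, O, z=0)
belief = prediction(belief, T)
```

Each cycle consumes one observation and advances the time index by one, exactly as in the recursion between the two equations above.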
57:05.700 --> 57:09.960
We could start either here or here.
57:10.340 --> 57:13.420
That depends very much on the problem.
57:14.460 --> 57:17.900
Whether it's better to start here or whether it's better to start
57:17.900 --> 57:18.160
here.
57:19.600 --> 57:21.340
Sorry, or to start here.
57:21.480 --> 57:25.480
So that's possible to start with p of s0 and then first make
57:25.480 --> 57:26.720
prediction step.
57:26.940 --> 57:29.100
That's also a possible starting point.
57:29.580 --> 57:32.000
But that depends very much on the application.
57:32.440 --> 57:36.320
Whether it's more reasonable to start here or more reasonable to start
57:36.320 --> 57:36.600
here.
57:41.400 --> 57:45.880
Yeah, so this is a very general scheme.
57:47.140 --> 57:52.860
If we look at the formulas of this
57:52.860 --> 57:54.600
general scheme,
58:00.470 --> 58:03.130
we see that it can be implemented directly
58:03.630 --> 58:05.030
as long as
58:08.160 --> 58:13.210
the state variables are discrete random variables which can only take
58:13.210 --> 58:17.390
on values from a finite discrete set of possible values.
58:18.050 --> 58:22.130
And the observations are as well discrete random variables.
58:22.390 --> 58:26.770
That means the observations are also just discrete numbers.
58:28.570 --> 58:33.430
Then this algorithm can be implemented on a computer very efficiently
58:33.430 --> 58:37.610
to do the filtering and the prediction.
58:37.910 --> 58:44.230
So let's make an example for such a case to get a better
58:44.230 --> 58:48.390
intuitive feeling for this kind of algorithm.
58:49.270 --> 58:55.090
So in this case, we describe the hidden Markov model in the form of a
58:55.090 --> 58:56.050
transition diagram.
58:56.650 --> 59:02.650
And in this transition diagram each of these ellipses refers to one
59:02.650 --> 59:05.190
value that the state variable can take.
59:05.250 --> 59:09.170
So we assume one discrete random variable, a state variable.
59:09.410 --> 59:13.110
And it can take on three different values: a, b, and c.
59:14.290 --> 59:18.870
And with the observation we assume that we have one discrete random
59:18.870 --> 59:23.190
variable for the observation and it can take on two different values
59:23.190 --> 59:24.890
either u or v.
59:26.510 --> 59:31.030
Now we have some transition probabilities between the different states
59:31.030 --> 59:34.070
and these are shown here by these black arrows.
59:34.570 --> 59:39.230
So each arrow models a possible transition and the numbers here refer
59:39.230 --> 59:40.990
to the transition probabilities.
59:41.650 --> 59:47.070
So this 0.2 is a probability that the system makes a transition from
59:47.070 --> 59:51.550
state a to state b from one point in time to the subsequent point in
59:51.550 --> 59:51.770
time.
59:53.390 --> 59:58.370
And we can also stay in state a with the probability of 0.8 for
59:58.370 --> 59:58.770
instance.
59:59.290 --> 01:00:04.570
And then these dashed cyan lines refer to the observation
01:00:04.570 --> 01:00:05.390
probabilities.
01:00:06.350 --> 01:00:10.350
And the numbers here provide the respective probabilities.
01:00:10.590 --> 01:00:17.270
So this 0.8 says the probability to make the observation v in state b
01:00:17.270 --> 01:00:19.130
is 80 percent.
01:00:21.330 --> 01:00:26.870
And of course, since these probabilities must sum up to one, the
01:00:26.870 --> 01:00:32.710
probability to make the observation u in state b must then be 0.2.
01:00:36.370 --> 01:00:42.090
Okay, so this is the transition diagram.
01:00:42.470 --> 01:00:47.390
And for this kind of system let's execute this algorithm a little bit.
01:00:47.610 --> 01:00:52.790
So we want to calculate these state probabilities for an observation
01:00:52.790 --> 01:00:54.730
sequence u, u, v, u.
01:00:54.970 --> 01:00:58.810
So in the first point in time we get an observation u, in the second
01:00:58.810 --> 01:01:02.550
point in time we get observation u, in the third point in time we get
01:01:02.550 --> 01:01:06.050
the observation v, and in the fourth point in time we get the
01:01:06.050 --> 01:01:07.210
observation u again.
01:01:08.030 --> 01:01:13.010
And we assume as initial distribution that the probability that s1 is
01:01:13.010 --> 01:01:17.830
equal to a is equal to 0.5.
01:01:18.070 --> 01:01:21.750
And the probability that s1 is equal to b is also 0.5.
01:01:21.870 --> 01:01:26.390
And the probability that s1 is equal to c is equal to zero.
01:01:27.270 --> 01:01:33.490
That is our assumption for the initial point in time t = 1.
01:01:34.130 --> 01:01:36.390
So now let's do the calculations.
01:01:37.090 --> 01:01:44.230
So here is a table that contains all the results that we get.
01:01:44.870 --> 01:01:52.170
So here at the bottom these three lines refer to the predicted state
01:01:52.170 --> 01:01:52.910
probabilities.
01:01:53.510 --> 01:02:00.670
The state probability that the predicted state is either a or b or c.
01:02:01.810 --> 01:02:06.610
And the top rows, the three top rows refer to the estimated state
01:02:06.610 --> 01:02:11.130
probabilities after integrating the present observation.
01:02:12.270 --> 01:02:16.770
That means if you go from the top rows to the bottom rows we make a
01:02:16.770 --> 01:02:17.530
prediction step.
01:02:18.070 --> 01:02:23.810
And if we go from the bottom rows in diagonal direction to the right
01:02:23.810 --> 01:02:28.370
upper neighbor then we make an innovation step.
01:02:29.810 --> 01:02:34.810
So we started here with the predicted state
01:02:34.810 --> 01:02:35.430
probabilities.
01:02:35.830 --> 01:02:38.310
So first of all we have to make an innovation step.
01:02:38.910 --> 01:02:43.130
And the first symbol or the first observation that we got was u,
01:02:43.590 --> 01:02:44.650
observation u.
01:02:44.810 --> 01:02:46.890
So now we make an innovation step.
01:02:47.350 --> 01:02:49.050
So what did it look like?
01:02:49.170 --> 01:02:50.930
Well what did we have to do?
01:02:51.030 --> 01:02:55.510
We had to multiply the respective predicted state
01:02:55.510 --> 01:02:59.990
probabilities with the respective observation probabilities and
01:02:59.990 --> 01:03:05.950
afterwards normalize the numbers that we get such that they sum up to
01:03:05.950 --> 01:03:06.190
one.
01:03:06.590 --> 01:03:09.790
That means we have to find a normalization factor with which we
01:03:09.790 --> 01:03:14.210
multiply these values that we get such that afterwards the sum is
01:03:14.210 --> 01:03:14.890
equal to one.
01:03:15.310 --> 01:03:17.730
If we do that in this case what do we get?
01:03:18.210 --> 01:03:21.370
Well for state a the probability is 0.5.
01:03:21.790 --> 01:03:25.330
The probability to make an observation u is 0.6.
01:03:25.430 --> 01:03:29.890
So we have 0.5 times 0.6 yields 0.3.
01:03:31.010 --> 01:03:33.990
Okay, let's memorize 0.3 here.
01:03:34.630 --> 01:03:35.870
So we note 0.3 for state a.
01:03:36.610 --> 01:03:42.830
For b the probability to be in it is 0.5, times the probability to make
01:03:42.830 --> 01:03:44.870
observation u, which is 0.2.
01:03:44.990 --> 01:03:48.850
So 0.5 times 0.2 is 0.1.
01:04:03.490 --> 01:04:06.690
So we would have 0.3 here and 0.1 here.
01:04:06.850 --> 01:04:09.410
For c the probability to be in it is 0.
01:04:10.150 --> 01:04:13.010
The observation probability is 0.7.
01:04:13.130 --> 01:04:15.090
So 0 times 0.7 is 0.
01:04:15.650 --> 01:04:21.350
So by which factor do we have to multiply such that the probabilities
01:04:21.350 --> 01:04:22.290
sum up to one?
01:04:22.670 --> 01:04:24.270
Well we have here 0.3.
01:04:24.430 --> 01:04:25.590
Here we have 0.1.
01:04:25.730 --> 01:04:28.350
So the sum is 0.4.
01:04:28.770 --> 01:04:32.490
Let's divide the numbers by this sum, by 0.4.
01:04:32.630 --> 01:04:37.690
Then we get 0.75 here and 0.25 here and zero here.
01:04:38.010 --> 01:04:41.690
And now we see that these probabilities sum up to one again.
01:04:42.490 --> 01:04:47.450
So this normalization factor in this case was just the sum of the
01:04:47.450 --> 01:04:50.490
values that we were calculating here before normalization.
01:04:50.930 --> 01:04:54.630
Now we divide all the numbers by this sum and then we get the
01:04:54.630 --> 01:04:55.270
probabilities.
01:04:55.830 --> 01:04:56.390
Okay?
01:04:56.650 --> 01:04:57.950
Then we are here.
01:04:59.150 --> 01:05:09.810
And we see that introducing this observation helps us to get more
01:05:09.810 --> 01:05:13.650
certain about which state we are really in.
01:05:14.590 --> 01:05:18.710
So before we did not know whether we were in state a or b.
01:05:18.890 --> 01:05:22.110
Both were possible with the same probability.
01:05:22.670 --> 01:05:26.750
Now we know that it's much likelier to be in state a than to be in
01:05:26.750 --> 01:05:27.150
state b.
01:05:27.630 --> 01:05:32.430
So the innovation step helps us to get more information about what the
01:05:32.430 --> 01:05:33.290
real state is.
01:05:35.470 --> 01:05:41.190
Now let's start from here and execute the prediction step.
01:05:41.650 --> 01:05:44.130
In the prediction step what do we have to do?
01:05:44.510 --> 01:05:53.310
Say we want to know the probability here of entering state a.
01:05:54.150 --> 01:05:56.130
How can we enter state a?
01:05:56.330 --> 01:06:01.630
Well either we have been in state a before and went this way.
01:06:01.790 --> 01:06:04.690
So we have stayed in state a before.
01:06:05.010 --> 01:06:08.750
Or we have been in state c and went this way here.
01:06:10.010 --> 01:06:13.410
Okay, let's calculate the probabilities for these two possibilities.
01:06:14.650 --> 01:06:18.350
To be in state a happens with a probability of 0.75.
01:06:19.450 --> 01:06:25.070
To stay in state a happens with a probability of 0.8.
01:06:25.490 --> 01:06:28.970
So we have 0.75 times 0.8.
01:06:29.970 --> 01:06:33.490
And if you have a pocket calculator you might check that this yields
01:06:33.490 --> 01:06:34.130
0.6.
01:06:34.410 --> 01:06:38.490
The second possibility is we could have been in state c with a
01:06:38.490 --> 01:06:43.410
probability of 0 and made this transition with a probability of 0.5
01:06:43.410 --> 01:06:46.870
which in total yields 0.5 times 0 equals 0.
01:06:47.350 --> 01:06:49.950
That means this possibility does not contribute.
01:06:50.710 --> 01:06:57.450
That means in total what we have to write here is 0.75 times 0.8 plus
01:06:57.450 --> 01:07:03.570
0 times 0.5 and this yields 0.6 here.
01:07:04.990 --> 01:07:10.050
For state number b what is the probability that in the next point in
01:07:10.050 --> 01:07:11.510
time we are in state number b?
01:07:11.990 --> 01:07:13.610
Well there are two possibilities.
01:07:13.910 --> 01:07:19.230
Either we have been in state a and have gone this way, or we have been
01:07:19.230 --> 01:07:21.430
in state c and have gone this way.
01:07:22.530 --> 01:07:29.470
So the first way happens with a probability of 0.75 times 0.2 and the
01:07:29.470 --> 01:07:36.310
second path happens with a probability of 0 times 0.5 and in total if
01:07:36.310 --> 01:07:40.350
you sum up these two possibilities we get 0.15 here.
01:07:41.870 --> 01:07:46.450
For state c how likely is it that we are in state c at the next point
01:07:46.450 --> 01:07:46.990
in time?
01:07:47.850 --> 01:07:48.910
Well how can it happen?
01:07:49.270 --> 01:07:53.570
The only way that it happens is if we are in state b and we go this
01:07:53.570 --> 01:08:02.150
way; in total 0.25 times 1 yields 0.25.
01:08:03.250 --> 01:08:06.290
Okay that's a prediction step.
01:08:07.130 --> 01:08:12.390
Now maybe one further column here, to get a better understanding.
01:08:13.030 --> 01:08:17.450
In the next step we make an innovation step; here we get the
01:08:17.450 --> 01:08:18.290
observation u.
01:08:18.910 --> 01:08:21.730
Well for state a what do we have to consider?
01:08:21.890 --> 01:08:26.070
We have to take this predicted probability, 0.6,
01:08:26.070 --> 01:08:31.390
and multiply it with the observation probability, which is also 0.6, so
01:08:31.390 --> 01:08:33.050
we get 0.36 here.
01:08:34.490 --> 01:08:39.670
Then for b the predicted probability is 0.15
01:08:39.670 --> 01:08:57.710
and the observation probability is 0.2, so we get 0.03.
01:08:57.710 --> 01:08:59.330
Yes, something like that.
01:09:00.270 --> 01:09:09.770
If I'm wrong, just complain. And for state c it's 0.25
01:09:09.770 --> 01:09:16.590
times the observation probability of 0.7. That's something that you
01:09:16.590 --> 01:09:20.830
might ask your pocket calculator for, and then you get some numbers
01:09:20.830 --> 01:09:26.910
here. You sum up these numbers, which yields your correction term;
01:09:26.910 --> 01:09:30.650
you divide all these numbers by this correction term, and then the
01:09:30.650 --> 01:09:34.590
numbers that you get are 0.64, 0.05, and 0.31.
01:09:36.330 --> 01:09:40.590
Okay, by doing that you execute the innovation step, and now the
01:09:40.590 --> 01:09:45.470
prediction step again. Now it's a little bit more interesting: for
01:09:45.470 --> 01:09:49.430
state a, what are the ways to enter state a? Either you have been in
01:09:49.430 --> 01:09:53.890
state a and you go this way, which happens with a probability of 0.64
01:09:53.890 --> 01:10:00.290
times 0.8, or you might have been in state c, which happened with a
01:10:00.290 --> 01:10:04.930
probability of 0.31, and you went this way with a probability of 0.5.
01:10:04.930 --> 01:10:10.470
You add up these two possibilities and you get as a result 0.67 here.
01:10:10.470 --> 01:10:15.530
In this way you fill out the other cells as well.
01:10:15.830 --> 01:10:27.910
Okay, so that is an example of how we can calculate these
01:10:27.910 --> 01:10:34.470
probabilities in a hidden Markov model with
01:10:34.470 --> 01:10:38.510
discrete state variables and discrete observations.
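To make the scheme concrete, the table we just filled out by hand can be reproduced with a short script. The states, observations, and probabilities are the ones from the transition diagram above (the v-column follows because each row must sum to one); the function and variable names are my own sketch.

```python
# Forward filtering for the discrete hidden Markov model from the example:
# states a, b, c; observations u, v.

STATES = ["a", "b", "c"]

# TRANS[i][j] = P(next state = j | current state = i), from the black arrows.
TRANS = {
    "a": {"a": 0.8, "b": 0.2, "c": 0.0},
    "b": {"a": 0.0, "b": 0.0, "c": 1.0},
    "c": {"a": 0.5, "b": 0.5, "c": 0.0},
}

# OBS[s][o] = P(observation o | state s), from the dashed cyan lines;
# the v-probabilities follow since each row sums to one.
OBS = {
    "a": {"u": 0.6, "v": 0.4},
    "b": {"u": 0.2, "v": 0.8},
    "c": {"u": 0.7, "v": 0.3},
}

def innovation(pred, obs):
    """Multiply predicted probabilities by observation probabilities, then normalize."""
    unnorm = {s: pred[s] * OBS[s][obs] for s in STATES}
    z = sum(unnorm.values())  # the normalization factor is just the sum
    return {s: p / z for s, p in unnorm.items()}

def prediction(est):
    """Sum over all ways to enter each state, weighted by transition probabilities."""
    return {j: sum(est[i] * TRANS[i][j] for i in STATES) for j in STATES}

def forward(pred0, observations):
    """Alternate innovation and prediction, starting from predicted probabilities."""
    history = []
    pred = pred0
    for obs in observations:
        est = innovation(pred, obs)
        history.append(est)
        pred = prediction(est)
    return history

ests = forward({"a": 0.5, "b": 0.5, "c": 0.0}, ["u", "u", "v", "u"])
print({s: round(p, 2) for s, p in ests[0].items()})  # {'a': 0.75, 'b': 0.25, 'c': 0.0}
```

The first estimate matches the hand computation (0.3 and 0.1, normalized by their sum 0.4), and the second step yields approximately 0.64, 0.05, and 0.31 as on the slide.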
01:10:39.050 --> 01:10:44.550
This is useful for some cases; we will come back to these kinds of
01:10:44.550 --> 01:10:50.570
examples later on. But for many processes and many estimation tasks
01:10:53.090 --> 01:11:00.930
it is not yet useful, because many things are not described by discrete
01:11:00.930 --> 01:11:03.050
variables but by continuous variables.
01:11:03.230 --> 01:11:10.790
So if you want to estimate the velocity of a traffic participant, the
01:11:10.790 --> 01:11:15.670
velocity is not just "drives fast", "drives slow", or "doesn't drive at
01:11:15.670 --> 01:11:26.330
all", some discrete values; it is a velocity of 35.82 kilometers per
01:11:26.330 --> 01:11:30.110
hour, so a value from a continuous variable.
01:11:31.110 --> 01:11:34.770
And of course this kind of calculation that we have seen here only
01:11:34.770 --> 01:11:37.570
works for discrete variables.
01:11:38.270 --> 01:11:42.610
So what is the case if you are dealing with continuous random
01:11:42.610 --> 01:11:43.230
variables?
01:11:43.870 --> 01:11:47.330
If we deal with continuous random variables we can do the same
01:11:47.330 --> 01:11:51.470
calculations in general that we did, but instead of having
01:11:51.470 --> 01:11:56.730
probabilities we have to deal with probability density functions, and
01:11:56.730 --> 01:12:02.030
instead of summing up over probabilities we have to integrate
01:12:02.030 --> 01:12:05.570
over these terms. That's the only difference.
01:12:05.950 --> 01:12:10.530
So these equations are actually the same that we have derived for the
01:12:10.530 --> 01:12:11.270
discrete case.
01:12:11.710 --> 01:12:16.950
The first one describes the innovation step, the second one describes
01:12:16.950 --> 01:12:24.030
the prediction step, but now we just replace the discrete random
01:12:24.030 --> 01:12:29.090
variable probabilities by probability density functions.
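In symbols (this notation, with s for the state and z for the observation, is my transcription of the two formulas referred to here), the continuous-case steps read:

```latex
% Innovation step: weight the predicted density with the observation
% likelihood and renormalize (\eta is the normalization constant).
p(s_t \mid z_{1:t}) = \eta \, p(z_t \mid s_t) \, p(s_t \mid z_{1:t-1})

% Prediction step: the sum over predecessor states becomes an integral.
p(s_{t+1} \mid z_{1:t}) = \int p(s_{t+1} \mid s_t) \, p(s_t \mid z_{1:t}) \, \mathrm{d}s_t
```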
01:12:30.970 --> 01:12:36.410
But the disadvantage in this general case is that these
01:12:36.410 --> 01:12:42.050
calculations, especially these integrations, often lead to terms
01:12:42.050 --> 01:12:47.530
that are either hard to integrate analytically or
01:12:47.530 --> 01:12:51.250
even impossible to integrate analytically.
01:12:51.750 --> 01:12:52.830
That's the disadvantage.
01:12:52.830 --> 01:12:59.770
But the good message is that for at least one case it is
01:12:59.770 --> 01:13:05.330
possible to solve all these integrals analytically,
01:13:05.330 --> 01:13:11.490
and this is the case when we are dealing with so-called linear
01:13:11.490 --> 01:13:12.550
Gaussian models.
01:13:12.730 --> 01:13:13.990
What are linear Gaussian models?
01:13:14.610 --> 01:13:17.450
Well they are linear and they are Gaussian.
01:13:17.550 --> 01:13:22.910
So Gaussian refers to the noise: the noise should be distributed
01:13:22.910 --> 01:13:27.850
according to a Gaussian, a normal distribution, and not to anything
01:13:27.850 --> 01:13:33.550
else. And linear refers to the system dynamics: the
01:13:33.550 --> 01:13:38.810
dependence between the future state variables and the present
01:13:38.810 --> 01:13:42.350
state variables should be linear, as well as the dependence between the
01:13:42.350 --> 01:13:45.090
observations and the state variables.
01:13:45.610 --> 01:13:52.710
So that means such a system is linear Gaussian if we can write the
01:13:52.710 --> 01:13:59.450
state transition equation like that: the future state is equal to a
01:13:59.450 --> 01:14:01.630
matrix times vector multiplication, plus offset and noise terms.
01:14:02.270 --> 01:14:08.530
So A is a matrix with fixed values independent of the current state.
01:14:09.330 --> 01:14:16.250
st is the current state vector, and ut is an offset, and it doesn't
01:14:16.250 --> 01:14:20.410
matter how you calculate the offset as long as it is independent of
01:14:20.410 --> 01:14:21.370
the state variables.
01:14:22.010 --> 01:14:27.030
So this offset might be just a constant offset this offset might be
01:14:27.030 --> 01:14:30.370
determined from some control input of your system.
01:14:31.110 --> 01:14:36.630
It doesn't matter just some offset that is independent of the present
01:14:36.630 --> 01:14:37.010
state.
01:14:38.890 --> 01:14:47.290
Plus some noise: epsilon t is a noise term that should be taken from a
01:14:47.290 --> 01:14:49.270
Gaussian distribution with zero mean.
01:14:50.350 --> 01:14:52.550
Zero mean Gaussian distribution.
01:14:53.450 --> 01:14:57.790
And the measurement equation must be written like that: the
01:14:57.790 --> 01:15:04.430
measurement vector is equal to a matrix Ht that contains just some
01:15:04.430 --> 01:15:09.530
constants, which are independent of the state variables,
01:15:09.530 --> 01:15:12.990
times the state vector, plus delta t.
01:15:13.090 --> 01:15:17.970
Delta t is also a noise term, namely the sensor noise, and the sensor
01:15:17.970 --> 01:15:22.090
noise must also be taken from a zero mean Gaussian distribution.
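Written out as equations (the symbols At, ut, Qt, Ht, Rt follow the remarks in this passage; the layout is my own), a linear Gaussian model has the form:

```latex
% State transition: linear in the state, plus an offset, plus zero-mean
% Gaussian transition noise with covariance Q_t.
s_{t+1} = A_t s_t + u_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, Q_t)

% Measurement equation: linear in the state, plus zero-mean Gaussian
% sensor noise with covariance R_t.
z_t = H_t s_t + \delta_t, \qquad \delta_t \sim \mathcal{N}(0, R_t)
```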
01:15:23.990 --> 01:15:30.430
If this is true for the system that you want to model, and furthermore
01:15:30.430 --> 01:15:35.890
(okay, this is everything that I already
01:15:35.890 --> 01:15:36.310
said)
01:15:37.850 --> 01:15:43.050
If this is true and furthermore you assume that your initial state
01:15:43.050 --> 01:15:49.350
distribution is taken from a Gaussian then all these equations are
01:15:49.350 --> 01:15:54.290
solvable in an analytical way and all the probability distributions
01:15:54.290 --> 01:16:00.070
that occur in this calculation are always Gaussian distributions
01:16:00.070 --> 01:16:00.450
again.
01:16:00.890 --> 01:16:06.850
That means that in the calculations that we did, the prediction step takes
01:16:06.850 --> 01:16:11.810
a Gaussian distribution and yields another Gaussian distribution, and
01:16:11.810 --> 01:16:14.750
the innovation step likewise takes a Gaussian distribution and a
01:16:14.750 --> 01:16:17.630
measurement and yields another Gaussian distribution.
01:16:18.090 --> 01:16:22.510
That means the whole process just maps Gaussian distributions
01:16:22.510 --> 01:16:23.990
onto Gaussian distributions.
01:16:24.590 --> 01:16:26.170
So some remarks here.
01:16:27.790 --> 01:16:32.770
Yeah, At models the transition from state to state; it might vary from time
01:16:32.770 --> 01:16:38.570
to time, that's true, but it must not depend on the present
01:16:38.570 --> 01:16:39.050
state.
01:16:40.530 --> 01:16:46.490
ut might also vary from time to time, might depend on whatever, but not
01:16:46.490 --> 01:16:47.750
on the state variables.
01:16:49.330 --> 01:16:53.650
Epsilon t is this white Gaussian noise so the zero mean Gaussian
01:16:53.650 --> 01:16:54.210
noise.
01:16:56.870 --> 01:17:02.090
So white Gaussian noise means Gaussian noise with zero mean.
01:17:03.690 --> 01:17:11.590
Qt is the covariance matrix of the transition noise; it must be
01:17:11.590 --> 01:17:11.870
given.
01:17:11.870 --> 01:17:17.110
It might also depend on the time, but it must not at all depend on the
01:17:17.110 --> 01:17:21.510
state variables or anything like that.
01:17:22.310 --> 01:17:26.590
Then delta t is the measurement noise or sensor noise.
01:17:27.330 --> 01:17:31.510
It's modeled as well as white that means zero mean Gaussian noise.
01:17:33.550 --> 01:17:39.110
Rt is the covariance matrix of the sensor noise, so it
01:17:39.110 --> 01:17:41.650
describes how reliable the sensor is.
01:17:41.910 --> 01:17:49.270
It might vary from time to time but it must not at all depend on the
01:17:49.270 --> 01:17:52.390
state variables nor on the measurement.
01:17:52.970 --> 01:17:54.690
So at least theoretically not.
01:17:54.930 --> 01:17:58.290
So what you find in practice is that it often depends on the
01:17:58.290 --> 01:18:02.450
measurement itself, but from a theoretical point of view this is not
01:18:02.450 --> 01:18:02.750
allowed.
01:18:02.970 --> 01:18:07.650
So it might depend on the time but it must not depend on the
01:18:07.650 --> 01:18:12.490
measurement itself nor on the state the current state.
01:18:12.930 --> 01:18:17.610
So you clearly see these are linear equations, therefore linear
01:18:17.610 --> 01:18:22.510
Gaussian models, and you see here we are dealing only with Gaussian
01:18:22.510 --> 01:18:23.130
distributions.
01:18:25.790 --> 01:18:29.810
So let's make an example.
01:18:30.310 --> 01:18:32.730
Let's conclude with this example.
01:18:33.470 --> 01:18:38.670
So again we assume a car that moves with constant velocity, and we
01:18:38.670 --> 01:18:42.350
know that we can model the state of this car knowing the present
01:18:42.350 --> 01:18:45.750
position xt and the present velocity vt of the car.
01:18:46.010 --> 01:18:51.850
That means we can predict the future
01:18:51.850 --> 01:18:56.990
position of the car, xt plus one, as xt plus some delta t, which is the
01:18:56.990 --> 01:19:01.410
time interval from now to the next point in time
01:19:01.410 --> 01:19:08.730
(so if the camera provides 30 images per second, then this delta t is 33
01:19:08.730 --> 01:19:12.250
milliseconds, for instance, the time interval between two measurements),
01:19:12.250 --> 01:19:20.150
times vt, the present velocity, plus some random noise in the
01:19:20.150 --> 01:19:20.510
position.
01:19:21.510 --> 01:19:27.650
And vt plus one, the next velocity, is equal to the
01:19:27.650 --> 01:19:31.770
present velocity vt plus some random noise in the velocity.
01:19:32.550 --> 01:19:37.830
So this delta t is the time interval; that's what I already said.
01:19:38.690 --> 01:19:45.710
Okay, that means if st contains xt and vt, then we can rewrite this
01:19:45.710 --> 01:19:51.730
state transition as this matrix times the state vector, plus an
01:19:51.730 --> 01:19:59.430
offset vector, of zero zero in this case, plus a random noise term that
01:19:59.430 --> 01:20:03.030
contains the random noise in the position and the velocity.
01:20:03.510 --> 01:20:09.670
So we see that this has exactly the form that we need for such a
01:20:09.670 --> 01:20:10.670
linear Gaussian model.
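In matrix form, the constant-velocity transition just described would look as follows (the noise symbols are my own labels for the position and velocity noise):

```latex
\begin{pmatrix} x_{t+1} \\ v_{t+1} \end{pmatrix}
=
\begin{pmatrix} 1 & \Delta t \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x_t \\ v_t \end{pmatrix}
+
\begin{pmatrix} 0 \\ 0 \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{x,t} \\ \varepsilon_{v,t} \end{pmatrix}
```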
01:20:12.670 --> 01:20:16.870
And the measurement well let's assume we have a binocular camera
01:20:16.870 --> 01:20:20.050
system here that is measuring the position of the car.
01:20:21.090 --> 01:20:25.270
Then we might say the measurement that we get is equal to the position
01:20:25.270 --> 01:20:29.310
plus some random noise some measurement noise.
01:20:30.990 --> 01:20:36.050
And we can rewrite that as a matrix times the state vector
01:20:36.050 --> 01:20:37.090
multiplication.
01:20:37.210 --> 01:20:44.830
So this is a one row, two column matrix Ht, times the state vector, plus delta t,
01:20:44.830 --> 01:20:49.150
the noise. That means this measurement equation has exactly the
01:20:49.150 --> 01:20:52.010
form that we need for a linear system.
01:20:53.050 --> 01:20:56.790
And if we assume that delta t is distributed
01:20:56.790 --> 01:21:00.730
according to a Gaussian and epsilon t is distributed according to
01:21:00.730 --> 01:21:04.490
a Gaussian, we are faced with a linear Gaussian model.
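As a sketch, this linear Gaussian car model can be written down and simulated directly. The matrices follow the model just described (30 images per second, position-only measurement); the noise magnitudes, the one-second simulation loop, and all names are my own assumptions for illustration, not part of the lecture.

```python
import random

# Constant-velocity linear Gaussian model:
#   s_{t+1} = A s_t + u_t + eps_t,   z_t = H s_t + delta_t
DT = 1.0 / 30.0            # ~33 ms between two camera images (30 Hz)
A = [[1.0, DT],            # x_{t+1} = x_t + DT * v_t
     [0.0, 1.0]]           # v_{t+1} = v_t
H = [1.0, 0.0]             # the binocular camera measures only the position

# Assumed noise standard deviations (illustrative values only).
SIGMA_POS, SIGMA_VEL, SIGMA_MEAS = 0.01, 0.05, 0.1

def step(state):
    """One state transition: A times s, plus zero-mean Gaussian process noise."""
    x, v = state
    x_new = A[0][0] * x + A[0][1] * v + random.gauss(0.0, SIGMA_POS)
    v_new = A[1][0] * x + A[1][1] * v + random.gauss(0.0, SIGMA_VEL)
    return (x_new, v_new)

def measure(state):
    """One measurement: H times s, plus zero-mean Gaussian sensor noise."""
    x, v = state
    return H[0] * x + H[1] * v + random.gauss(0.0, SIGMA_MEAS)

random.seed(0)             # deterministic run for reproducibility
s = (0.0, 10.0)            # start at position 0 m with velocity 10 m/s
track = []
for _ in range(30):        # simulate one second of driving
    s = step(s)
    track.append(measure(s))

# After one second at ~10 m/s the measured positions lie near 10 m.
print(track[-1])
```

These are exactly the densities a filter for this model would have to process; next time the lecture derives the filter equations that recover the state from such a measurement track.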
01:21:05.770 --> 01:21:10.030
So I think that's a good point to stop for today.
01:21:10.750 --> 01:21:15.270
The next time we can continue and derive the filter equations for
01:21:15.270 --> 01:21:17.670
these linear Gaussian models.