WEBVTT

00:00.250 --> 00:02.970
Welcome to this video.

00:03.850 --> 00:07.510
It is about an important estimation method, the maximum likelihood

00:07.510 --> 00:11.330
method, for the situation where we have to estimate several

00:11.330 --> 00:15.710
probabilities at the same time, which is the case with the multinomial

00:15.710 --> 00:16.290
distribution.

00:17.470 --> 00:22.390
In this case, we have the situation that n independent, identical trials

00:22.390 --> 00:28.810
with s possible outcomes, which we number from 1 to s, are carried

00:28.810 --> 00:29.310
out.

00:29.310 --> 00:34.410
The unknown probability of outcome j is denoted by pj.

00:35.750 --> 00:42.010
For each j from 1 to s, pj is a non-negative number and all

00:42.010 --> 00:44.490
probabilities naturally add up to 1.

00:45.790 --> 00:51.190
Suppose that in the n trials, outcome j has occurred hj times.

00:52.010 --> 00:57.290
Here, h1 to hs are non-negative integers that add up to n.

00:58.290 --> 01:05.110
If we take the probabilities p1 to ps as given, the probability of

01:05.110 --> 01:07.150
this result is equal to this expression.
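
NOTE
A sketch of the expression shown on screen at this point, assuming it is the
standard multinomial probability. The symbol P and the random counts H_1 to H_s
are notational assumptions; they are not named in the spoken text.
```latex
\[
P(H_1 = h_1, \dots, H_s = h_s)
  = \frac{n!}{h_1! \cdots h_s!} \, \prod_{j=1}^{s} p_j^{h_j},
\qquad h_j \in \{0, 1, \dots, n\}, \quad h_1 + \dots + h_s = n .
\]
```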

01:08.010 --> 01:12.050
If you are not familiar with this, you can watch my video about the

01:12.050 --> 01:13.330
multinomial distribution.

01:14.270 --> 01:19.150
In the maximum likelihood method, we ask for which values

01:19.150 --> 01:23.010
of p1 to ps this expression becomes maximal.

01:24.590 --> 01:30.450
Since the quotient of factorials does not depend on the pj, the

01:30.450 --> 01:35.970
equivalent task is to maximize the product of pj to the power hj as a function

01:35.970 --> 01:36.670
of the pj.

01:38.670 --> 01:44.150
Because the natural logarithm is a strictly increasing function, you can

01:44.150 --> 01:48.070
also maximize its logarithm, that is,

01:48.210 --> 01:52.870
this function of p1 to ps.

01:54.370 --> 02:00.250
Since we have allowed the values of pj to be equal to 0 and the

02:00.250 --> 02:05.670
logarithm of 0 is equal to minus infinity, we define 0 times minus

02:05.670 --> 02:06.970
infinity as 0.
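
NOTE
A sketch of the function referred to here, reconstructed from the spoken
description (the logarithm of the product of pj to the power hj). The name
L with index h1 to hs is the one used later in the video.
```latex
\[
L_{h_1, \dots, h_s}(p_1, \dots, p_s) = \sum_{j=1}^{s} h_j \log p_j ,
\qquad \text{with the convention } 0 \cdot \log 0 = 0 \cdot (-\infty) := 0 .
\]
```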

02:07.970 --> 02:13.670
Note that this is a maximization problem for a function of several

02:13.670 --> 02:16.130
variables under constraints.

02:16.890 --> 02:21.590
In this context, one often uses the Lagrange multiplier rule.

02:22.230 --> 02:28.590
Without additional considerations, this rule provides only local extrema.

02:29.310 --> 02:32.330
We will see that in our situation there is a much easier way to proceed.

02:34.950 --> 02:39.950
I formulate the result as a theorem about maximum likelihood

02:39.950 --> 02:42.590
estimation in the case of the multinomial distribution.

02:43.550 --> 02:47.570
The function L index h1 to hs of p1 to ps, which is defined for vectors with

02:47.570 --> 02:51.750
non-negative components that add up to 1

02:55.770 --> 03:03.730
and which is given by this sum, attains its maximum at the following

03:03.730 --> 03:05.450
vector of probabilities.

03:06.410 --> 03:11.590
It is this vector of relative frequencies of the individual trial

03:11.590 --> 03:12.070
outcomes.

03:12.850 --> 03:17.450
The notation with hats is common when parameters are

03:17.450 --> 03:18.010
estimated in models.
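
NOTE
A sketch of the theorem as stated in the spoken text, written out in symbols.
The set name \Delta for the admissible probability vectors is an assumption
introduced only for this note.
```latex
\[
\Delta = \Bigl\{ (p_1, \dots, p_s) : p_j \ge 0, \ \sum_{j=1}^{s} p_j = 1 \Bigr\},
\qquad
\max_{p \in \Delta} L_{h_1, \dots, h_s}(p)
  = L_{h_1, \dots, h_s}(\hat p_1, \dots, \hat p_s),
\qquad
\hat p_j = \frac{h_j}{n} .
\]
```
For s = 2 this reduces to the binomial case with estimate h_1 / n.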

03:18.910 --> 03:22.490
Note that this result generalizes the binomial case.

03:23.490 --> 03:27.690
There, the relative frequency of successes is the maximum likelihood

03:27.690 --> 03:30.370
estimate for the unknown probability of success.

03:32.170 --> 03:36.670
The proof uses this upper bound for the natural logarithm,

03:36.670 --> 03:40.830
which is valid for every positive x.
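
NOTE
A sketch of the bound referred to here, assuming it is the usual tangent-line
inequality for the natural logarithm at x = 1.
```latex
\[
\log x \le x - 1 \quad \text{for all } x > 0,
\qquad \text{with equality if and only if } x = 1 .
\]
```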

03:41.150 --> 03:46.130
By the definition of the function L index h1 to hs, we only need to

03:46.130 --> 03:50.690
sum over those j for which hj is positive.

03:51.630 --> 03:54.850
This gives this equality.

03:56.370 --> 04:00.010
We now write this a little differently, namely,

04:00.190 --> 04:06.590
using the logarithm law, logarithm of a times b equals logarithm of a

04:06.590 --> 04:09.550
plus logarithm of b, in this form.

04:11.670 --> 04:15.490
Now we bound this from above, that is,

04:15.630 --> 04:21.150
we keep the sum and hj, and for the first logarithm we use the

04:21.150 --> 04:24.410
logarithm inequality stated above.

04:25.910 --> 04:29.150
We leave the second logarithm unchanged.

04:30.810 --> 04:38.450
We now move the sum of hj times the logarithm of hj over n to the front, and

04:38.450 --> 04:41.870
what remains is n times this sum, after cancelling hj.

04:45.110 --> 04:50.930
We still have to subtract the sum of the positive hj because of the

04:50.930 --> 04:52.970
minus 1, but that is equal to n.

04:53.950 --> 05:00.490
We can bound this from above by keeping only the first sum, because

05:00.490 --> 05:04.330
the sum over these pj is at most equal to 1.

05:05.310 --> 05:10.630
The last expression is the function L index h1 to hs,

05:10.630 --> 05:15.370
evaluated at the vector of relative frequencies, and

05:15.370 --> 05:16.550
this was to be shown.
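
NOTE
A sketch of the chain of steps described in the proof, reconstructed from the
spoken text; the on-screen layout may differ. It assumes p_j > 0 whenever
h_j > 0 (otherwise the left-hand side is minus infinity and the bound holds
trivially) and uses the amsmath aligned environment.
```latex
\[
\begin{aligned}
L_{h_1, \dots, h_s}(p_1, \dots, p_s)
  &= \sum_{j:\, h_j > 0} h_j \log p_j \\
  &= \sum_{j:\, h_j > 0} h_j \Bigl[ \log \frac{n p_j}{h_j} + \log \frac{h_j}{n} \Bigr] \\
  &\le \sum_{j:\, h_j > 0} h_j \Bigl[ \frac{n p_j}{h_j} - 1 + \log \frac{h_j}{n} \Bigr] \\
  &= \sum_{j:\, h_j > 0} h_j \log \frac{h_j}{n}
     + n \sum_{j:\, h_j > 0} p_j - \sum_{j:\, h_j > 0} h_j \\
  &= \sum_{j:\, h_j > 0} h_j \log \frac{h_j}{n}
     + n \sum_{j:\, h_j > 0} p_j - n \\
  &\le \sum_{j:\, h_j > 0} h_j \log \frac{h_j}{n}
   = L_{h_1, \dots, h_s}\Bigl( \frac{h_1}{n}, \dots, \frac{h_s}{n} \Bigr) .
\end{aligned}
\]
```
The first inequality applies log x <= x - 1 with x = n p_j / h_j; the second
uses that the sum of the p_j with h_j > 0 is at most 1.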

05:17.850 --> 05:20.490
That already brings us to the end of this video.

05:21.630 --> 05:24.010
Thank you very much for watching.

05:24.550 --> 05:26.670
I am always grateful for suggestions and constructive criticism.

