
Kalman Filter

  • Algorithms
  • Mathematics

There are a lot of different articles on the Kalman filter, but it is difficult to find one that explains where all the filtering formulas come from. I think that without that understanding this science looks completely impenetrable. In this article I will try to explain everything in a simple way.

The Kalman filter is a very powerful tool for filtering many kinds of data. The main idea behind it is that one should use information about the physical process. For example, if you are filtering data from a car’s speedometer, then the car’s inertia gives you the right to treat a large speed deviation as a measurement error. The Kalman filter is also interesting because, in a certain sense, it is the best possible filter; we will discuss precisely what that means. At the end of the article I will show how the formulas can be simplified.


First, let’s recall some definitions and facts from probability theory.

Random variable

When we say that a random variable $\xi$ is given, we mean that it may take random values, and different values come with different probabilities. For example, if someone throws a die, then the set of values is discrete: $\{1,2,3,4,5,6\}$. When you deal with the speed of a moving particle, you obviously work with a continuous set of values. The values that come out of each experiment (measurement) we denote by $x_1, x_2,\dots$, but sometimes we use the same letter as for the random variable itself, $\xi$. In the case of a continuous set of values, a random variable is characterized by its probability density function $\rho(x)$. This function gives the probability that the random variable falls into a small neighbourhood $dx$ of the point $x$. As we can see in the picture, this probability is equal to the area $\rho(x)dx$ of the hatched rectangle under the graph.
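The discrete case is easy to see numerically. Below is a minimal sketch (my own illustration, not from the original article) that throws a fair die many times and checks that the empirical frequency of each face approaches the probability $1/6$:

```python
import random
from collections import Counter

random.seed(0)

# Throw a fair die n times and count how often each face comes up.
n = 100_000
counts = Counter(random.randint(1, 6) for _ in range(n))

# Each empirical frequency should be close to 1/6.
for face in range(1, 7):
    print(face, round(counts[face] / n, 3))
```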

Quite often in real life, random variables have the Gaussian distribution, whose probability density is $\rho(x)\sim e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.

We can see that the bell-shaped function $\rho(x)$ is centered at the point $\mu$ and that its characteristic width is about $\sigma$.
Since we are talking about the Gaussian distribution, it would be a sin not to mention where it comes from. Just as the numbers $e$ and $\pi$ are firmly embedded in mathematics and turn up in the most unexpected places, the Gaussian distribution has deep roots in probability theory. The following remarkable statement partly explains its presence in so many processes:
Let a random variable $\xi$ have an arbitrary distribution (in fact there are some restrictions on this arbitrariness, but they are not restrictive at all). Let’s perform $n$ experiments and compute the sum $\xi_1+\dots+\xi_n$ of the values that come up. If we repeat this many times, each repetition gives a different value of the sum. In other words, the sum is itself a random variable with its own distribution law. It turns out that for sufficiently large $n$ the distribution of this sum tends to a Gaussian distribution (by the way, the characteristic width of the bell grows like $\sqrt n$). Read more in Wikipedia: Central limit theorem. In real life there are a lot of quantities that are sums of a large number of independent, identically distributed random variables, so these quantities have the Gaussian distribution.
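The $\sqrt n$ growth of the bell’s width is easy to verify numerically. The sketch below (my own illustration, with uniform draws standing in for the arbitrary distribution) repeats the sum experiment for two values of $n$ and compares the spreads, which should be in the ratio $\sqrt{40/10} = 2$:

```python
import random
import statistics

random.seed(1)

def sum_of_uniforms(n):
    """One experiment: the sum xi_1 + ... + xi_n of n uniform(0, 1) draws."""
    return sum(random.random() for _ in range(n))

# Repeat the experiment many times for two values of n and measure the
# spread of the sums; the standard deviation should grow like sqrt(n).
trials = 20_000
spread = {}
for n in (10, 40):
    sums = [sum_of_uniforms(n) for _ in range(trials)]
    spread[n] = statistics.stdev(sums)
    print(n, round(statistics.mean(sums), 2), round(spread[n], 2))
```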

Mean Value

By definition, the mean value of a random variable is the value we get in the limit as we perform more and more experiments and average the values that come up. The mean value is denoted in different ways: mathematicians write $E\xi$ (expectation), while physicists write $\overline{\xi}$ or $\langle\xi\rangle$. We will denote it as mathematicians do.

For instance, the mean value of the Gaussian distribution $\rho(x)\sim e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ is equal to $\mu$.


For the Gaussian distribution, we clearly see that the random variable tends to fall within a certain region around its mean value $\mu$. Let us enjoy the Gaussian distribution once again:

In the picture, one can see that the characteristic width of the region where values mostly fall is $\sigma$. How can we estimate this width for an arbitrary random variable? We could draw the graph of its probability density function and visually evaluate the characteristic range, but it would be better to have a precise algebraic way of doing this. We could take the mean length of the deviation from the mean value, $E|\xi-E\xi|$. This value is a good estimate of the characteristic deviation of $\xi$. However, we know very well how problematic it is to use absolute values in formulas, so this formula is rarely used in practice.

A simpler approach (simpler from the computational point of view) is to calculate $E(\xi-E\xi)^2$.

This value is called the variance and is denoted by $\sigma_\xi^2$. The square root of the variance is a good estimate of the random variable’s characteristic deviation. It is called the standard deviation.

For instance, one can compute that for the Gaussian distribution $\rho(x)\sim e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ the variance is equal to $\sigma^2$, and thus the standard deviation is $\sigma$. This result really matches our geometric intuition. In fact, a small cheat is hidden here. In the definition of the Gaussian distribution you see the number $2$ in the denominator of the exponent $-\frac{(x-\mu)^2}{2\sigma^2}$. That $2$ is there on purpose, so that the standard deviation $\sigma_\xi$ is exactly equal to $\sigma$. In other words, the formula of the Gaussian distribution is written with the computation of its standard deviation already in mind.
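These definitions can be checked by simulation. A minimal sketch (my own illustration; the values $\mu = 3$, $\sigma = 1.5$ are arbitrary) draws Gaussian samples and estimates the mean, the variance $E(\xi-E\xi)^2$, and the standard deviation:

```python
import random
import statistics

random.seed(2)

mu, sigma = 3.0, 1.5

# Draw many samples from a Gaussian with mean mu and standard deviation sigma.
samples = [random.gauss(mu, sigma) for _ in range(200_000)]

mean = statistics.fmean(samples)                # estimate of E(xi)
variance = statistics.pvariance(samples, mean)  # estimate of E(xi - E xi)^2
std = variance ** 0.5                           # standard deviation

# The estimates should be close to mu, sigma^2 and sigma respectively.
print(round(mean, 2), round(variance, 2), round(std, 2))
```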

Independent random variables

Random variables may or may not depend on each other. Imagine that you throw a needle on the floor and measure the coordinates of both its ends. These two coordinates are random variables, but they depend on each other, since the distance between them must always equal the length of the needle. Random variables are independent if the outcome of the first does not depend on the outcome of the second. For two independent variables $\xi_1$ and $\xi_2$ the mean of their product is equal to the product of their means: $E(\xi_1\cdot\xi_2) = E\xi_1\cdot E\xi_2$
For instance, having blue eyes and finishing school with higher honors are independent random variables. Let’s say that $20\% = 0.2$ of people have blue eyes and $5\% = 0.05$ finish with higher honors. Then $0.2\cdot 0.05 = 0.01 = 1\%$ of people have both blue eyes and higher honors. This example helps us understand the following. For two independent random variables $\xi_1$ and $\xi_2$ given by their probability densities $\rho_1(x)$ and $\rho_2(y)$, the density $\rho(x,y)$ (that the first variable falls at $x$ and the second at $y$) can be found by the formula

$\rho(x,y) = \rho_1(x)\cdot\rho_2(y)$
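The product rule for means can also be checked empirically. The sketch below (my own illustration, using two independent uniform variables on $[0,1]$, for which both sides equal $0.5\cdot 0.5 = 0.25$) compares $E(\xi_1\cdot\xi_2)$ with $E\xi_1\cdot E\xi_2$:

```python
import random
import statistics

random.seed(3)

n = 200_000
xs = [random.random() for _ in range(n)]  # xi_1 ~ uniform(0, 1)
ys = [random.random() for _ in range(n)]  # xi_2 ~ uniform(0, 1), independent of xi_1

# For independent variables the mean of the product should equal
# the product of the means: E(xi_1 * xi_2) = E(xi_1) * E(xi_2).
mean_product = statistics.fmean(x * y for x, y in zip(xs, ys))
product_of_means = statistics.fmean(xs) * statistics.fmean(ys)

print(round(mean_product, 3), round(product_of_means, 3))  # both close to 0.25
```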
