# Advice 1: How to build a distribution function

The distribution law of a random variable is a relation that links the possible values of the random variable to the probabilities with which they occur in a trial. There are three main forms of distribution law: the probability distribution series (for discrete random variables), the distribution function, and the probability density.
Instruction
1
The distribution function (sometimes called the integral distribution law) is a universal form of the distribution law, suitable for the probabilistic description of both discrete and continuous random variables (RV) X. It is defined as a function of the argument x (which may also be one of the possible values, X=x) equal to F(x)=P(X<x). That is, it is the probability that the RV X takes a value less than the argument x.
2
Consider the problem of constructing F(x) for a discrete random variable X that is given by a probability distribution series and represented by the distribution polygon in figure 1. For simplicity, we restrict ourselves to 4 possible values.
3
When x≤x1, F(x)=0, because the event {X<x} is impossible. When x1<x≤x2, F(x)=p1, because there is now exactly one way to satisfy the inequality X<x, namely X=x1, which occurs with probability p1. Thus at (x1+0) F(x) jumps from 0 to p1. When x2<x≤x3, similarly F(x)=p1+p2, because here there are two ways to satisfy the inequality X<x: X=x1 or X=x2. By the theorem on the probability of a sum of incompatible events, the probability of this is p1+p2. Hence at (x2+0) F(x) jumps from p1 to p1+p2. Similarly, when x3<x≤x4, F(x)=p1+p2+p3.
4
When x>x4, F(x)=p1+p2+p3+p4=1 (normalization condition). Another explanation: in this case the event {X<x} is certain, since all possible values of the random variable are less than such an x (and the RV must take one of them in the experiment). The graph of F(x) is shown in figure 2.
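The construction in steps 2 to 4 can be sketched in Python. The values `xs` and probabilities `ps` below are made-up illustration data, not taken from figure 1, and the helper name `make_cdf` is hypothetical. The sketch follows the definition F(x)=P(X<x), so each jump happens just to the right of a possible value and F reaches 1 only for x greater than the largest value.

```python
from bisect import bisect_left
from itertools import accumulate

def make_cdf(xs, ps):
    """Return F(x) = P(X < x) for a discrete RV with sorted values xs and probabilities ps."""
    if abs(sum(ps) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1 (normalization condition)")
    # cumulative[k] = p1 + ... + pk, with cumulative[0] = 0
    cumulative = [0.0] + list(accumulate(ps))

    def F(x):
        # bisect_left counts how many possible values are strictly less than x
        k = bisect_left(xs, x)
        return cumulative[k]

    return F

# Hypothetical example with 4 possible values
xs = [1, 2, 3, 4]
ps = [0.1, 0.2, 0.3, 0.4]
F = make_cdf(xs, ps)

for x in [1, 1.5, 2.5, 3.5, 4, 4.5]:
    print(x, round(F(x), 10))
```

Evaluating `F` at points on either side of each `xs[i]` shows the steps: the value stays constant between neighbouring possible values and increases by `ps[i]` immediately after `xs[i]`.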
5
For a discrete RV with n values, the number of "steps" on the graph of the distribution function is obviously equal to n. As n tends to infinity, assuming that the discrete points "completely" fill the entire number line (or a segment of it), more and more steps appear on the graph of the distribution function, each of them smaller and smaller (and "creeping" upward, by the way). In the limit they merge into a solid line that forms the graph of the distribution function of a continuous random variable.
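A small numeric sketch of this limit, under an assumption the text does not state explicitly: take n equally likely values spread evenly over the segment [0, 1] and compare the resulting step function with the straight line F(x)=x of the continuous uniform distribution. The function name `max_gap` and the grid size are hypothetical choices.

```python
def max_gap(n, grid=1000):
    """Largest distance, on a grid, between the n-step CDF and the line F(x) = x on [0, 1]."""
    # n equally likely values at the midpoints (k + 0.5) / n of n equal subintervals
    values = [(k + 0.5) / n for k in range(n)]
    worst = 0.0
    for i in range(grid + 1):
        x = i / grid
        step_cdf = sum(1 for v in values if v < x) / n   # F_n(x) = P(X < x)
        worst = max(worst, abs(step_cdf - x))            # continuous CDF is F(x) = x
    return worst

for n in [4, 16, 64]:
    print(n, max_gap(n))
```

Under these assumptions each step has height 1/n, so the gap between the staircase and the line cannot exceed 1/(2n) and shrinks as n grows.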
6
It is worth noting a basic property of the distribution function: P(x1≤X<x2)=F(x2)-F(x1). So, if you want to build the statistical distribution function F*(x) (based on experimental data), these probabilities should be replaced by the interval frequencies pi*=ni/n (n is the total number of observations, ni is the number of observations in the i-th interval). Then use the method of constructing F(x) for a discrete random variable. The only difference is that the "steps" are not drawn; instead, the points are connected in sequence by straight lines. The result must be a non-decreasing broken line. An approximate graph of F*(x) is shown in figure 3.
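The procedure can be sketched as follows. The sample, the interval edges, and the helper name `empirical_cdf_points` are illustrative assumptions; the sketch also assumes that every observation falls inside the chosen intervals.

```python
from itertools import accumulate

def empirical_cdf_points(data, edges):
    """Return the (x, F*(x)) vertices of the broken line for grouped data.

    edges are the interval boundaries a0 < a1 < ... < ak; the i-th interval
    is [a_{i-1}, a_i), except that the last one also includes its right end.
    """
    n = len(data)
    counts = [0] * (len(edges) - 1)
    for value in data:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if edges[i] <= value < edges[i + 1] or (last and value == edges[-1]):
                counts[i] += 1
                break
    freqs = [c / n for c in counts]               # p_i* = n_i / n
    heights = [0.0] + list(accumulate(freqs))     # F*(a_i) = p_1* + ... + p_i*
    return list(zip(edges, heights))

# Hypothetical sample of 10 observations grouped into 4 intervals
data = [1.2, 2.7, 3.1, 3.3, 4.8, 5.0, 5.5, 6.1, 6.9, 7.4]
edges = [0, 2, 4, 6, 8]
points = empirical_cdf_points(data, edges)
print(points)
```

Connecting consecutive `points` with straight segments gives the broken line; because every frequency is non-negative, the heights never decrease.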

# Advice 2: How to check normality

So you have done a lot of work: analyzed the existing sources, formulated a hypothesis, gathered empirical data, and now it is time to process the data mathematically. Most quantities studied in statistics follow the normal distribution law, but you see a deviation from the normal curve, or a spike in the dependent measure. Your task is to determine whether these deviations are random or whether you have discovered something new in science. Or perhaps you simply formed the sample incorrectly.
Instruction
1
To determine whether your data follow a normal distribution, you need statistics on the entire general population. Most likely you do not have them, because if the distribution of the studied indicator were known in advance, your research would simply not be needed.
2
However, if you do have statistics on the general population, you can check whether you formed the sample correctly. Most often the Pearson criterion, also called the chi-square statistic, is used for this. This criterion is typically used for samples with more than 30 observations; otherwise Student's t-test is used.
3
First calculate the sample mean and the standard deviation. These indicators are needed for all further calculations. Next, determine the theoretical (hypothetical) frequency distribution of the studied trait. It is the expected distribution of the quantity, based on data from the general population or, if there are none, on the empirical data.
4
Thus you obtain two series of values, the empirical and the theoretical frequencies, between which there is some dependence. Now check how well the two series agree, using the Pearson, Kolmogorov, or Romanovsky criterion at a given level of error probability alpha.
5
If the computed measure of agreement between the empirical and the theoretical distribution of the trait falls outside the limit set by the given level of error probability, the hypothesis that the trait you are studying follows the normal distribution in the population should be rejected. Further interpretation of the results of the statistical processing depends on the research objectives and, to some extent, on your scientific intuition or imagination.
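Steps 3 to 5 can be sketched with Pearson's chi-square criterion, using only the Python standard library. Several things here are assumptions for illustration: the generated sample, the interval edges, the helper name `chi_square_normal`, and the choice to estimate the mean and standard deviation from the sample itself (the "based on empirical data" case). The critical value is not computed; it is the standard table value for alpha = 0.05 and 2 degrees of freedom and has to be looked up again for any other setting.

```python
import random
from statistics import NormalDist, mean, stdev

def chi_square_normal(data, edges):
    """Pearson chi-square statistic for H0: data come from a normal distribution.

    edges are inner interval boundaries; the outer intervals extend to
    -infinity and +infinity so that the theoretical probabilities sum to 1.
    Returns (statistic, degrees_of_freedom).
    """
    n = len(data)
    model = NormalDist(mean(data), stdev(data))   # parameters estimated from the sample
    cuts = [float("-inf")] + list(edges) + [float("inf")]
    chi2 = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        observed = sum(1 for v in data if lo <= v < hi)   # empirical frequency n_i
        # theoretical frequency n * P(lo <= X < hi) under the fitted normal law
        p_lo = 0.0 if lo == float("-inf") else model.cdf(lo)
        p_hi = 1.0 if hi == float("inf") else model.cdf(hi)
        expected = n * (p_hi - p_lo)
        chi2 += (observed - expected) ** 2 / expected
    # number of intervals, minus 1, minus 2 estimated parameters
    dof = (len(cuts) - 1) - 1 - 2
    return chi2, dof

rng = random.Random(0)                            # fixed seed so the run is repeatable
sample = [rng.gauss(50, 10) for _ in range(200)]  # hypothetical data
chi2, dof = chi_square_normal(sample, edges=[35, 45, 55, 65])

CRITICAL_005_DOF2 = 5.991                         # table value: alpha = 0.05, 2 degrees of freedom
print("chi2 =", round(chi2, 3), "dof =", dof)
print("reject normality" if chi2 > CRITICAL_005_DOF2 else "no grounds to reject normality")
```

The intervals should be wide enough that each theoretical frequency is not too small (a common rule of thumb is at least 5); otherwise neighbouring intervals are usually merged before the statistic is computed.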