Use of Standard Deviation in Machine Learning
What is Standard Deviation?
Standard deviation is a number that describes how spread out the values are.
A low standard deviation means that most of the numbers are close to the mean (average) value.
A high standard deviation means that the values are spread out over a broad range.
Example: Suppose we have recorded the ages of 7 elderly people:
age = [86,87,88,86,87,85,86]
The standard deviation is:
0.9
Another example, this time with a set of ages that covers a wider range:
age = [90,98,100,89,86,95,97]
The standard deviation is:
4.87
Most of the values are within 4.87 of the mean value, which is 93.57.
In the example above, the higher standard deviation shows that the values are spread out over a wider range.
Example using the NumPy std() method to find the standard deviation:
import numpy
age = [86,87,88,86,87,85,86]
x = numpy.std(age)
print(x)
The same calculation for the set of ages with a wider range:
import numpy
age = [90,98,100,89,86,95,97]
x = numpy.std(age)
print(x)
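The statement that most values lie within one standard deviation of the mean can be checked directly. The following is a minimal sketch using the wider-range ages from above; the variable names mean, std and within are chosen here for illustration only.

```python
import numpy

age = [90, 98, 100, 89, 86, 95, 97]

mean = numpy.mean(age)  # average age
std = numpy.std(age)    # standard deviation of the ages

# Keep only the ages that lie within one standard deviation of the mean
within = [a for a in age if abs(a - mean) <= std]

print(within)
print(len(within), "of", len(age), "values are within one standard deviation of the mean")
```

For this data set, 5 of the 7 ages fall inside that range; 100 and 86 lie further from the mean.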
Variance
Variance is another number that describes how spread out the values are.
In fact, if you take the square root of the variance, you get the standard deviation!
Conversely, if you multiply the standard deviation by itself, you get the variance!
To calculate the variance of the ages 90, 98, 100, 89, 86, 95, 97 by hand, first find the mean:
(90+98+100+89+86+95+97) / 7 = 93.57
Find the difference from the mean for each value:
90 - 93.57 = -3.57
98 - 93.57 = 4.43
100 - 93.57 = 6.43
89 - 93.57 = -4.57
86 - 93.57 = -7.57
95 - 93.57 = 1.43
97 - 93.57 = 3.43
Square each difference:
(-3.57)² = 12.7449
(4.43)² = 19.6249
(6.43)² = 41.3449
(-4.57)² = 20.8849
(-7.57)² = 57.3049
(1.43)² = 2.0449
(3.43)² = 11.7649
The variance is the average of these squared differences:
(12.7449+19.6249+41.3449+20.8849+57.3049+2.0449+11.7649) / 7 = 23.67
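The hand calculation above can be sketched in plain Python, one statement per step. This is only an illustration of the steps, not how NumPy computes the result internally; the variable names differences and squared are chosen here for readability.

```python
age = [90, 98, 100, 89, 86, 95, 97]

# Step 1: find the mean
mean = sum(age) / len(age)

# Step 2: find the difference from the mean for each value
differences = [a - mean for a in age]

# Step 3: square each difference
squared = [d ** 2 for d in differences]

# Step 4: the variance is the average of the squared differences
variance = sum(squared) / len(squared)

print(round(mean, 2))      # 93.57
print(round(variance, 2))  # 23.67
```

Because the code does not round the intermediate values, its later decimal places can differ slightly from a calculation done by hand with rounded numbers.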
Example using the NumPy var() method to find the variance:
import numpy
age = [90,98,100,89,86,95,97]
x = numpy.var(age)
print(x)
Standard Deviation
The standard deviation is the square root of the variance:
√23.67 ≈ 4.87
Example calculating the standard deviation using NumPy:
import numpy
age = [90,98,100,89,86,95,97]
x = numpy.std(age)
print(x)
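The relationship between the two numbers can be confirmed in code. The sketch below assumes the default behaviour of numpy.var() and numpy.std(), which divide by the number of values just like the hand calculation; math.isclose is used because floating-point results are rarely exactly equal.

```python
import math
import numpy

age = [90, 98, 100, 89, 86, 95, 97]

variance = numpy.var(age)
std = numpy.std(age)

# The square root of the variance is the standard deviation
print(math.isclose(math.sqrt(variance), std))  # True

# The standard deviation multiplied by itself is the variance
print(math.isclose(std * std, variance))       # True
```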
Symbols
Sigma: σ is used to represent standard deviation
Sigma squared: σ² is used to represent variance
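With these symbols, the steps used in the variance calculation can be written compactly. The notation below is an assumption added for illustration: x_i stands for the individual values, N for how many values there are, and μ (mu) for the mean.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
\qquad
\sigma = \sqrt{\sigma^2}
```

For the ages 90, 98, 100, 89, 86, 95, 97, this gives N = 7, μ ≈ 93.57, σ² ≈ 23.67 and σ ≈ 4.87.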