The following are the rules for using **ddof** in NumPy.

`np.std()`

**Rule 1:** If you are calculating the standard deviation for a *sample*, set **ddof = 1** in **np.std()**.

`np.std(sample_name, ddof=1)`

**Rule 2:** If you are calculating the standard deviation for a *population*, set **ddof = 0** in **np.std()**.

`np.std(population_name, ddof=0)`
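As a quick illustration of the two rules, the sketch below applies both settings to the same small array. The array `data` is made up for this example and does not come from the examples in this post. Note that `np.std()` uses `ddof=0` when the argument is omitted, so the sample version has to be requested explicitly.

```python
import numpy as np

# Hypothetical data, chosen only for illustration
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

sd_population = np.std(data, ddof=0)  # divisor is N
sd_sample = np.std(data, ddof=1)      # divisor is N - 1
sd_default = np.std(data)             # ddof defaults to 0

print("Population SD (ddof=0):", sd_population)
print("Sample SD (ddof=1):", sd_sample)
print("Default np.std():", sd_default)
```

For the same data, the sample SD is always a little larger than the population SD, because the sum of squares is divided by a smaller number.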

## Example of **ddof = 1**

The following Python code uses **ddof = 1** in `np.std()` to calculate the standard deviation for a sample.

```
# import numpy
import numpy as np
# set the seed for reproducibility (10 is arbitrary; any integer works)
np.random.seed(10)
# Generate 5 numbers following standard normal distribution
Array_numbers = np.random.randn(5)
print("Array of Numbers: \n", Array_numbers)
# setting ddof=1, if Array_numbers is a sample
print("Use np.std for a sample (ddof=1): \n",np.std(Array_numbers,ddof=1))
```

The following is the output, and we can see the standard deviation for this sample is 1.09667.

Array of Numbers: `[ 1.3315865 0.71527897 -1.54540029 -0.00838385 0.62133597]`

Use np.std for a sample (ddof=1): `1.0966713483434376`

## Example of **ddof = 0**

The following Python code uses **ddof = 0** in `np.std()` to calculate the standard deviation for a population.

```
# import numpy
import numpy as np
# set the seed for reproducibility (10 is arbitrary; any integer works)
np.random.seed(10)
# Generate 10 numbers following standard normal distribution
Array_numbers = np.random.randn(10)
print("Array of Numbers: \n", Array_numbers)
# setting ddof=0, if Array_numbers is a population
print("Use np.std for a population (ddof=0): \n",np.std(Array_numbers,ddof=0))
```

The following is the output, and we can see the standard deviation for this population is 0.7519756.

Array of Numbers: `[ 1.3315865 0.71527897 -1.54540029 -0.00838385 0.62133597 -0.72008556 0.26551159 0.10854853 0.00429143 -0.17460021]`

Use np.std for a population (ddof=0): `0.7519756036909285`

## Formulas for np.std()

### 1. General Formula

The following is the full formula for np.std().

\[\sqrt{\frac{1}{N-ddof} \sum_{i=1}^N (x_i - \overline{x})^2}\]

where

- **\( x_i \)**: the \( i^{th} \) element in the data set
- **\( \overline{x} \)**: the mean of the data set
- **\( N \)**: the number of elements in the data set
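The general formula above can be sketched as a small helper and checked against `np.std()`. The function name `std_with_ddof` and the sample values are hypothetical, introduced only for this illustration; raising an error when `N - ddof` is not positive is a choice made in this sketch.

```python
import numpy as np

def std_with_ddof(values, ddof=0):
    """Standard deviation with divisor N - ddof (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    n = values.size
    if n - ddof <= 0:
        raise ValueError("N - ddof must be positive")
    mean = values.mean()
    # sum of squared deviations from the mean, divided by N - ddof
    return np.sqrt(np.sum((values - mean) ** 2) / (n - ddof))

data = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up values
for ddof in (0, 1):
    print("ddof =", ddof,
          "| from formula:", std_with_ddof(data, ddof),
          "| np.std:", np.std(data, ddof=ddof))
```

Both columns should agree for each value of `ddof`, since the helper only spells out the same formula step by step.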

### 2. For population SD

When you have the whole population, you do NOT need **ddof=1**: the mean you compute is the population mean itself, so nothing has to be estimated (we already have all the data in the population).

In this case, `ddof=0`, and the formula below calculates the SD for population data.

\[ \text{population: } \sqrt{\frac{1}{N-ddof} \sum_{i=1}^N (x_i - \overline{x})^2}=\sqrt{\frac{1}{N} \sum_{i=1}^N (x_i - \overline{x})^2}\]

### 3. For sample SD

When calculating the standard deviation for a sample, you need to set **ddof=1**. Here, ddof = 1 means that one degree of freedom from the sample is used up when the sample mean stands in for the unknown population mean.

In this case, `ddof=1`. The following is the formula to calculate SD for a sample.

\[ \text{sample: } \sqrt{\frac{1}{N-ddof} \sum_{i=1}^N (x_i - \overline{x})^2}=\sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \overline{x})^2}\]
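To see why the divisor matters for samples, the following simulation sketch draws many small samples from a normal distribution with a known spread and averages the estimates. It compares variances (`np.var()`, which accepts the same `ddof` argument) because the correction is exact for the variance; the sample SD with `ddof=1` is still slightly low on average. The seed, the true SD of 2.0, and the sample sizes are arbitrary assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

true_sd = 2.0                   # assumed population SD (variance = 4.0)
n_samples, sample_size = 20000, 5

# each row is one sample of 5 values
samples = rng.normal(loc=0.0, scale=true_sd, size=(n_samples, sample_size))

# average the per-sample variance estimates under each divisor
var_ddof0 = np.var(samples, axis=1, ddof=0).mean()
var_ddof1 = np.var(samples, axis=1, ddof=1).mean()

print("True variance:", true_sd ** 2)
print("Average estimate with ddof=0:", var_ddof0)
print("Average estimate with ddof=1:", var_ddof1)
```

With `ddof=0` the average estimate should land near 4/5 of the true variance for samples of size 5, while `ddof=1` should land close to the true variance.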

## Calculate Population SD from Scratch

We can also calculate the population standard deviation (SD) from scratch in Python, without relying on `np.std()`. The following is the full Python code.

```
# import numpy
import numpy as np
# set the seed for reproducibility (10 is arbitrary; any integer works)
np.random.seed(10)
# Generate 10 numbers following standard normal distribution
Array_numbers = np.random.randn(10)
print("Array of Numbers: \n", Array_numbers)
# setting ddof=0, if we assume it is a population
print("Use np.std for a population (ddof=0): \n",np.std(Array_numbers,ddof=0))
# Standard deviation function from scratch for a population
mean_number=np.mean(Array_numbers)
sd_from_scratch_population=np.sqrt((1/len(Array_numbers))*np.sum(np.square(Array_numbers-mean_number)))
print('SD function from scratch for a population:\n',sd_from_scratch_population)
```

The following is the output. We can see that **np.std()** and our from-scratch calculation of the population SD give the same number (i.e., 0.7519756).

Array of Numbers: `[ 1.3315865 0.71527897 -1.54540029 -0.00838385 0.62133597 -0.72008556 0.26551159 0.10854853 0.00429143 -0.17460021]`

Use np.std for a population (ddof=0): `0.7519756036909285`

SD function from scratch for a population: `0.7519756036909285`

## Calculate Sample SD from Scratch

We can also calculate the sample standard deviation (SD) from scratch in Python, without relying on `np.std()`. The following is the full Python code.

```
# import numpy
import numpy as np
# set the seed for reproducibility (10 is arbitrary; any integer works)
np.random.seed(10)
# Generate 5 numbers following standard normal distribution
Array_numbers = np.random.randn(5)
print("Array of Numbers: \n", Array_numbers)
# setting ddof=1, if we assume it is a sample
print("Use np.std for a sample (ddof=1): \n",np.std(Array_numbers,ddof=1))
# Standard deviation from scratch for a sample (divide by N - 1)
mean_number=np.mean(Array_numbers)
sd_from_scratch_sample=np.sqrt((1/(len(Array_numbers)-1))*np.sum(np.square(Array_numbers-mean_number)))
print('SD function from scratch for a sample:\n',sd_from_scratch_sample)
```

The following is the output. We can see that **np.std()** and the from-scratch calculation of the sample SD give the same number (i.e., 1.09667).

Array of Numbers: `[ 1.3315865 0.71527897 -1.54540029 -0.00838385 0.62133597]`

Use np.std for a sample (ddof=1): `1.0966713483434376`

SD function from scratch for a sample: `1.0966713483434376`
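As an extra cross-check, Python's standard library `statistics` module offers `pstdev()` (population SD) and `stdev()` (sample SD), which correspond to `ddof=0` and `ddof=1`. The sketch below reuses the five sample values as printed above, so the inputs are rounded to eight decimals and the results may differ from the earlier output in the last few digits. The variable names are chosen here for illustration.

```python
import statistics
import numpy as np

# The five sample values as printed in the output above (rounded)
data = [1.3315865, 0.71527897, -1.54540029, -0.00838385, 0.62133597]

sample_sd_numpy = np.std(data, ddof=1)       # divisor N - 1
sample_sd_stdlib = statistics.stdev(data)    # sample SD

population_sd_numpy = np.std(data, ddof=0)   # divisor N
population_sd_stdlib = statistics.pstdev(data)  # population SD

print("Sample SD:", sample_sd_numpy, sample_sd_stdlib)
print("Population SD:", population_sd_numpy, population_sd_stdlib)
```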