 # Study Case with Excel Statistics functions #### Frédéric LE GUEN

Last Updated on 12/05/2023

Excel offers a large number of Statistics functions. This article will show you how to use these functions in your analyses.

## Datasets

To illustrate the statistical functions, we will start from the following table which contains the (fictitious) salaries of employees

And the distribution of wages is represented by this chart

The file to do the example is here

## Common Excel Statistics functions

To perform the wage analysis, we will use many of the most common Excel statistics functions.

### Min and Max functions

These functions return the minimum and maximum of the data set

=MIN(tbl_Employee[Salary]) => \$1,520

=MAX(tbl_Employee[Salary]) => \$4,750

### How many values are in the series

Watch out! There may be a trap because COUNT and COUNTA don't return the same information

=COUNT(tbl_Employee[Salary]) => 50

=COUNTA(tbl_Employee[Salary]) => 50

The two functions return the same value here, but this is not always the case 🤔

The COUNT function counts only the numeric values in your range of cells. The COUNTA function counts ALL the values (texts and numbers)

## Functions to analyze the entire series

Here we will see several functions that return different results. Properly used, these functions will reveal important information about your data.

### The AVERAGE

Everyone knows the principle of averaging; we divide the sum of the values by the number of values

On the other hand, many people forget that the average gives only a partial vision. Indeed, it is always necessary to complete the mean by the standard deviation

=AVERAGE( tbl_Employee[Salary]) => \$1,722.31

### Standard deviation

I have already presented the interest of the standard deviation in this article. But to summarize, it allows you to find the dispersion of your values around the average.

=STDEV.S(tbl_Employee[Salary]) => \$688.56

The difference between Average and Standard Deviation is essential.

• 68% of the data are between the mean and 1 times the standard deviation
• 95% of the data are between the mean and 2 times the standard deviation
• ...

### The Median

The MEDIAN is not the most known function but returns essential information.

The median returns the value that divides your series into two equal parts.

=MEDIAN(tbl_Salairés[Salary]) => \$1,765

Therefore, we can say that half of the salaries are less than \$1,765 (and also the opposite).

You can also find that the result of an Average and a Median has nothing to do with it and should not be confused.

### MODE

The MODE function returns the most present value. different of the MOD function for modulo)

=MODE.SNGL(tbl_Employee[Salary]) => \$1,560

So this indicates that the most common salary is \$1,560 in our company. But we do not know how many times this salary is present on our list 🤔

This information could be easily obtained with the COUNTIFS function.

=COUNTIFS(tbl_Employee[Salary];MODE.SNGL(tbl_Employee[Salary])) => 4

The salary of \$1,560 is present 4 times in our data set.

# Study Case with Excel Statistics functions

Last Updated on 12/05/2023

Excel offers a large number of Statistics functions. This article will show you how to use these functions in your analyses.

## Datasets

To illustrate the statistical functions, we will start from the following table which contains the (fictitious) salaries of employees

And the distribution of wages is represented by this chart

The file to do the example is here

## Common Excel Statistics functions

To perform the wage analysis, we will use many of the most common Excel statistics functions.

### Min and Max functions

These functions return the minimum and maximum of the data set

=MIN(tbl_Employee[Salary]) => \$1,520

=MAX(tbl_Employee[Salary]) => \$4,750

### How many values are in the series

Watch out! There may be a trap because COUNT and COUNTA don't return the same information

=COUNT(tbl_Employee[Salary]) => 50

=COUNTA(tbl_Employee[Salary]) => 50

The two functions return the same value here, but this is not always the case 🤔

The COUNT function counts only the numeric values in your range of cells. The COUNTA function counts ALL the values (texts and numbers)

## Functions to analyze the entire series

Here we will see several functions that return different results. Properly used, these functions will reveal important information about your data.

### The AVERAGE

Everyone knows the principle of averaging; we divide the sum of the values by the number of values

On the other hand, many people forget that the average gives only a partial vision. Indeed, it is always necessary to complete the mean by the standard deviation

=AVERAGE( tbl_Employee[Salary]) => \$1,722.31

### Standard deviation

I have already presented the interest of the standard deviation in this article. But to summarize, it allows you to find the dispersion of your values around the average.

=STDEV.S(tbl_Employee[Salary]) => \$688.56

The difference between Average and Standard Deviation is essential.

• 68% of the data are between the mean and 1 times the standard deviation
• 95% of the data are between the mean and 2 times the standard deviation
• ...

### The Median

The MEDIAN is not the most known function but returns essential information.

The median returns the value that divides your series into two equal parts.

=MEDIAN(tbl_Salairés[Salary]) => \$1,765

Therefore, we can say that half of the salaries are less than \$1,765 (and also the opposite).

You can also find that the result of an Average and a Median has nothing to do with it and should not be confused.

### MODE

The MODE function returns the most present value. different of the MOD function for modulo)

=MODE.SNGL(tbl_Employee[Salary]) => \$1,560

So this indicates that the most common salary is \$1,560 in our company. But we do not know how many times this salary is present on our list 🤔

This information could be easily obtained with the COUNTIFS function.

=COUNTIFS(tbl_Employee[Salary];MODE.SNGL(tbl_Employee[Salary])) => 4

The salary of \$1,560 is present 4 times in our data set.