+1 vote
5.2k views
asked in Data Science Interview Questions by (600 points)  
  

1 Answer

0 votes
answered by (115k points)  

All three measures (mean, median, and mode) give us a representative "average" for a data sample. However, each has different properties, so each suits different situations. Based on the type of variable, the following table shows which measure to use:

Type of variable               Best measure of central tendency
Categorical (nominal)          Mode
Ordinal                        Median
Interval/Ratio (not skewed)    Mean
Interval/Ratio (skewed)        Median
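
As a rough sketch, the table above can be turned into a small helper built on Python's standard statistics module. The function name central_tendency, its parameters, and the sample data are made up for illustration. It also assumes ordinal values are encoded as integer ranks, so median_low returns a value that actually occurs in the data.

```python
import statistics

def central_tendency(values, variable_type, skewed=False):
    """Pick a measure of central tendency by variable type (hypothetical helper)."""
    if variable_type == "nominal":
        # Categories have no order, so only the most frequent value makes sense.
        return statistics.mode(values)
    if variable_type == "ordinal":
        # Assumes ranks are encoded as integers; median_low returns an observed rank.
        return statistics.median_low(values)
    if variable_type in ("interval", "ratio"):
        # The median resists skew; otherwise the mean is used.
        return statistics.median(values) if skewed else statistics.mean(values)
    raise ValueError(f"unknown variable type: {variable_type}")

# Illustrative data only
colors = ["red", "blue", "red", "green"]   # nominal
ratings = [1, 2, 2, 3, 5]                  # ordinal, encoded as ranks
heights = [160, 170, 180]                  # ratio, not skewed

print(central_tendency(colors, "nominal"))   # red
print(central_tendency(ratings, "ordinal"))  # 2
print(central_tendency(heights, "ratio"))    # 170
```

Whether a variable counts as skewed is left to the caller here. In practice you would inspect a histogram or a skewness statistic first.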

 

Consider the effect of Outliers

In addition, when a ratio variable (such as a numeric measurement) contains outliers, we should use the median instead of the mean. An example is a salary column that may contain a few very large or very small values. These pull the mean toward them, whereas the median is barely affected and so gives a better representative for the "average". That is why many websites report the median salary for a job position instead of the mean. For more information, you can take a look at this page.
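
To make the salary example concrete, here is a minimal sketch using Python's statistics module. The salary figures are invented for illustration and do not come from any real dataset.

```python
import statistics

# Hypothetical salaries: four typical values and one extreme outlier
salaries = [40_000, 45_000, 50_000, 55_000, 1_000_000]

mean_salary = statistics.mean(salaries)      # pulled upward by the outlier
median_salary = statistics.median(salaries)  # middle value, unaffected by the outlier

print(mean_salary)    # 238000
print(median_salary)  # 50000
```

Four of the five people earn 55,000 or less, yet the mean suggests a typical salary of 238,000. The median of 50,000 is much closer to what most of them earn.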

 
