First time here? Checkout the FAQ!
x
+1 vote
181 views
asked in Data Science by (170 points)  
In case of 3 sensors reporting loads of values individually.. one sensor might be off. The average of the 2 trustworthy sensors is to be reported.. the third in need for recalibration is to be neglected. I'm in need of an (excel) formula looking at three columns which row-by-row detects a significant deviation compared to the others and calculate the average of the most trustworthy.
Example:
48.1 ; 45.2 ; 45.4 => 45.3, as sensor 1 is way off....
36.0 ; 37;0 ; 45.0 => 36.5, as sensor 3 is way off....
36.0 ; 36;5 ; 37.0 => 36.5 as the deviation is too small to be considered an anomaly, so all values are valid to create the average.

Working with long periods of time.. the readings might be trustworthy for a few weeks, but in defect from moment X up until now... so simply ruling out one sensor is not really an option either.. What is the best way forward?
Please help. Highly appreciated.
  

1 Answer

0 votes
answered by (170 points)  
What seems to work is simple: create 4 new columns:
x=average(3 values) ; y=stdev.p(3 values) ; low threshold = x-y ; high threshold = x+y
, then repeat the numbers if these are within the boundaries and make a 'n/a' if outside of stdev.
With the values repeated (as within thresholds), the average can be calculated neglecting the extreme values.
Example:
10 ; 11 ; 20 : avg=13.67: stdev.p = 4.49; Low=9.17; high=18.16, so
10 ; 11; n/a  (as 20  > 18.16)
this gives an average of 10.5   :-)

Agree?
...