
Table 4 Skewness in the data distribution inflates the overall false alarm rate in the presence of true outliers, but to a different degree depending on whether the IQR, MAD, or FAST-MCD approach is used

From: Input data quality control for NDNQI national comparative statistics and quarterly reports: a contrast of three robust scale estimators for multiple outlier detection

Column groups: Asymmetry in data distribution (Preset skewness); Data composition (Planted outliers, Simulated observations); True and false outlier rates by different approach (IQR, MAD, FAST-MCD).

| Preset skewness | Planted outliers | Simulated observations | IQR | MAD | FAST-MCD |
|---|---|---|---|---|---|
| 0.000 | 10 | | 1.000 (0.000) | 1.000 (0.000) | 1.000 (0.000) |
| 1.000 | | 990 | 0.028 (0.006) | 0.029 (0.006) | 0.076 (0.013) |
| Overall Outliers Detected / (10 + 990) | | | 0.038 | 0.039 | 0.085 |
| 0.000 | 10 | | 1.000 (0.000) | 1.000 (0.000) | 1.000 (0.000) |
| 2.000 | | 990 | 0.053 (0.006) | 0.066 (0.008) | 0.223 (0.014) |
| Overall Outliers Detected / (10 + 990) | | | 0.062 | 0.075 | 0.231 |
| 0.000 | 10 | | 1.000 (0.000) | 1.000 (0.000) | 1.000 (0.000) |
| 3.000 | | 990 | 0.081 (0.006) | 0.123 (0.008) | 0.344 (0.012) |
| Overall Outliers Detected / (10 + 990) | | | 0.090 | 0.131 | 0.351 |

  1. Planted outliers were drawn from a normal distribution; simulated observations were drawn from a Gamma distribution.
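
The "Overall Outliers Detected / (10 + 990)" rows appear to be the count-weighted combination of the two rates above them: the detection rate on the 10 planted outliers and the false outlier rate on the 990 simulated observations. The sketch below recomputes that ratio from the rounded rates in the table. It is an illustration of the arithmetic under that reading, not the authors' code; the names `TABLE4`, `overall_rate`, and `recomputed` are hypothetical. Because the inputs are already rounded to three decimals, the recomputed values agree with the reported ones only to within about 0.001.

```python
# Assumption: the overall rate is (flagged planted + flagged simulated) / total.
N_PLANTED = 10      # planted outliers per data set (from the table)
N_SIMULATED = 990   # simulated Gamma observations per data set (from the table)

# (preset skewness, method) -> (true outlier rate, false outlier rate, reported overall)
# All numbers are copied from the table as published.
TABLE4 = {
    (1.0, "IQR"): (1.000, 0.028, 0.038),
    (1.0, "MAD"): (1.000, 0.029, 0.039),
    (1.0, "FAST-MCD"): (1.000, 0.076, 0.085),
    (2.0, "IQR"): (1.000, 0.053, 0.062),
    (2.0, "MAD"): (1.000, 0.066, 0.075),
    (2.0, "FAST-MCD"): (1.000, 0.223, 0.231),
    (3.0, "IQR"): (1.000, 0.081, 0.090),
    (3.0, "MAD"): (1.000, 0.123, 0.131),
    (3.0, "FAST-MCD"): (1.000, 0.344, 0.351),
}


def overall_rate(true_rate, false_rate,
                 n_planted=N_PLANTED, n_simulated=N_SIMULATED):
    """Share of all observations flagged, weighting each rate by its group size."""
    flagged = true_rate * n_planted + false_rate * n_simulated
    return flagged / (n_planted + n_simulated)


# Recompute the overall rate for every skewness/method combination.
recomputed = {key: overall_rate(t, f) for key, (t, f, _) in TABLE4.items()}

for (skew, method), (_, _, reported) in TABLE4.items():
    value = recomputed[(skew, method)]
    print(f"skewness={skew:.1f} {method:<8} "
          f"recomputed={value:.4f} reported={reported:.3f}")
```

Since every method flags all 10 planted outliers, the differences in the overall rate across columns come entirely from the false outlier rate on the 990 skewed observations.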