Outlier Checking Mahalanobis Distance

Essay added: 22-10-2015, 20:34

This section screens the data before analysis and investigates issues that might affect the findings. Data screening also examines where each observation sits relative to the rest of the data. The 398 responses were entered into SPSS version 20 and analyzed using AMOS version 18.0.

The data screening included outlier detection, missing data, descriptive statistics, reliability, univariate normality, multicollinearity, and linearity. Each part of the data screening is discussed in the following sections.

4.2.1 Outlier Checking (Mahalanobis Distance)

Outliers are observations that are numerically distant from the rest of the dataset (Byrne, 2010). Several studies have examined methods of detecting outliers, among them classifying data points by their observed (Mahalanobis) distance from the expected values (Hair et al., 2010; Hau & Marsh, 2004). One argument in favor of outlier treatment based on the Mahalanobis distance is that a predetermined threshold defines whether a point is categorized as an outlier or not (Gerrit et al., 2002). For this research, the table of chi-square statistics was used to obtain the threshold value.

This decision is in line with Hair et al. (2010), who emphasized the need to create a new variable in the SPSS data file, called “response”, that numbers the cases from the first to the last. The Mahalanobis distance can then be obtained by running a linear regression with the newly created response number as the dependent variable and all measurement items, apart from the demographic variables, as independent variables.

This procedure produced a new output variable, Mah2, whose values were compared with the chi-square value given in the table. On this basis, the study identified 6 of the 398 respondents as outliers, because their Mah2 values exceeded the threshold value from the table of chi-square statistics for the 41 measurement items entered as independent variables; these cases were subsequently deleted from the dataset. Following this treatment, the final regressions in this study were run on the remaining 392 cases. Multivariate outlier detection assesses each observation across the full set of variables in the analysis. Multivariate outliers can be detected in SPSS by calculating the Mahalanobis distance for each respondent.

This measure has statistical properties that allow for significance testing.
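The regression-based procedure above can also be sketched outside SPSS. The following is a minimal illustration, not the study's actual workflow: it computes the squared Mahalanobis distance of every case from the item means and flags cases above a critical value. The function names, the simulated responses, and the use of NumPy are assumptions made for the example; the critical value 74.745 is the one reported in this section.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the column means."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)              # sample covariance (n - 1 denominator)
    cov_inv = np.linalg.inv(np.atleast_2d(cov))
    # Row-wise quadratic form (x - mean)^T S^-1 (x - mean)
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

def flag_outliers(X, critical_value):
    """Return the distances and a boolean mask of cases above the threshold."""
    d2 = mahalanobis_sq(X)
    return d2, d2 > critical_value

# Hypothetical data: 398 simulated respondents answering 41 five-point items.
rng = np.random.default_rng(0)
items = rng.integers(1, 6, size=(398, 41)).astype(float)

d2, is_outlier = flag_outliers(items, critical_value=74.745)
print("mean D2:", round(float(d2.mean()), 3))
print("cases flagged:", int(is_outlier.sum()))
```

With the sample covariance, the mean of D2 always equals p(n - 1)/n for p items and n cases, which gives a quick sanity check on any saved distance variable.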

Table 4.1: Outlier Detection (Mahalanobis Distance)

                                     Minimum     Maximum     Mean       Std. Deviation    N
Predicted Value                      -6.32       451.63      199.50     77.192            398
Std. Predicted Value                 -2.666      3.266       .000       1.000             398
Standard Error of Predicted Value    5.899       59.259      28.507     6.600             398
Adjusted Predicted Value             -31.67      463.81      198.32     78.266            398
Residual                             -253.756    193.111     .000       85.293            398
Std. Residual                        -2.817      2.144       .000       .947              398
Stud. Residual                       -2.911      2.314       .006       .997              398
Deleted Residual                     -270.853    226.881     1.179      94.793            398
Stud. Deleted Residual               -2.942      2.329       .006       1.000             398
Mahal. Distance                                  170.845     40.897     18.576            398
Cook's Distance
Leverage Value                       .002        .430        .103       .047              398

A case is designated an outlier when its Mahalanobis distance (D2) exceeds a critical value; for the D2/df measure, the threshold should be set at a conservative level of significance (0.005 or 0.001) (Hair et al., 2010). In this study, the maximum D2 is 170.845, which is greater than the critical chi-square value of 74.745. At least one case therefore lies well beyond the threshold and qualifies as a multivariate outlier.
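As a rough check on the threshold, the sketch below recomputes the chi-square critical value without a lookup table and compares it with the reported maximum D2. It assumes 41 degrees of freedom (one per measurement item) and a significance level of 0.001, and it assumes that the leverage reported in Table 4.1 is the centered leverage, which equals D2/(n - 1). The helpers chi2_cdf and chi2_critical are illustrative names, not SPSS or AMOS functions.

```python
import math

def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= half_x / (a + k)
        total += term
        k += 1
    log_scale = a * math.log(half_x) - half_x - math.lgamma(a)
    return min(1.0, total * math.exp(log_scale))

def chi2_critical(df, alpha):
    """Value whose upper-tail probability is alpha, found by bisection."""
    target = 1.0 - alpha
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < target:   # widen the bracket until it contains the target
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Values reported in this section.
N_CASES = 398
D2_MAX = 170.845
D2_MEAN = 40.897

critical = chi2_critical(df=41, alpha=0.001)       # should land close to the reported 74.745
implied_items = D2_MEAN * N_CASES / (N_CASES - 1)  # mean D2 = p(n - 1)/n, solved for p
leverage_max = D2_MAX / (N_CASES - 1)              # centered leverage implied by D2_MAX

print("critical value:", round(critical, 3))
print("max D2 exceeds it:", D2_MAX > critical)
print("implied number of items:", round(implied_items, 2))
print("implied max leverage:", round(leverage_max, 3))
```

The implied number of predictors and the implied maximum leverage both agree with the figures in Table 4.1, which suggests the threshold was read from the chi-square table at 41 degrees of freedom.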

Once potential outliers are identified, cases that represent a viable segment of the population should perhaps be retained. Deleting outliers carries its own risk, because information in the data is lost. This study therefore also assessed each observation's status as an outlier from complementary perspectives.

4.2.2 Missing Data

Once the questionnaires are collected, the first step in data screening is to identify missing data. The concern is the extent of the missing data and its effect on the analysis results. Normally, missing data under 10 percent for an individual case or observation is not a problem, except when the missing data occur in a specific nonrandom pattern (Hair et al., 2010).
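The 10 percent guideline can be illustrated with a short sketch. The responses below are hypothetical, the function names are made up for the example, and a missing answer is represented by None. A case at exactly 10 percent is treated as acceptable here, which is one possible reading of the guideline; the sketch does not test whether missing values follow a nonrandom pattern.

```python
def missing_share(case):
    """Fraction of a case's answers that are missing (None)."""
    return sum(value is None for value in case) / len(case)

def cases_over_threshold(cases, max_share=0.10):
    """Indices of cases whose missing share exceeds max_share."""
    return [i for i, case in enumerate(cases) if missing_share(case) > max_share]

# Hypothetical responses: three respondents, ten five-point items each.
responses = [
    [4, 5, 3, 4, 2, 5, 4, 3, 4, 5],           # complete
    [4, None, 3, 4, 2, 5, 4, 3, 4, 5],        # 10% missing, at the limit
    [None, None, 3, None, 2, 5, 4, 3, 4, 5],  # 30% missing
]

flagged = cases_over_threshold(responses)
complete = sum(missing_share(case) == 0 for case in responses)
print("cases over the limit:", flagged)   # [2]
print("complete cases:", complete)        # 1
```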

In this study, none of the questionnaires contains missing data. All cases are therefore complete on every variable, so the sample size available for data analysis is unaffected and no remedies are required.

Table 4.3: Missing Data


4.2.3 Descriptive Statistics

The following profile emerged from the data screening process. The descriptive statistics for the latent constructs include the maximum, minimum, mean, standard deviation, mode, and median. The nine latent constructs (continuous learning (CL), inquiry and dialogue (ID), team learning (TL), embedded system (ES), empowerment (EM), system connection (SC), strategic leadership (SL), organizational innovativeness (OI), and organizational performance (OP)) are presented in Table 4.4.

Table 4.4: Descriptive Statistics of Variables

                      CL      ID      TL      ES      EM      SC      SL      OI      OP
N Valid               392     392     392     392     392     392     392     392     392
N Missing             0       0       0       0       0       0       0       0       0
Mean                  3.169   3.379   3.304   3.401   3.480   3.098   3.361   3.342   2.998
Std. Error of Mean                                            .065            .048
Std. Deviation        1.263   1.147   1.223   1.134   1.129   1.294   1.174   .945    1.037

Table 4.4 presents the mean values of the nine constructs, which comprise 41 items. Organizational performance (OP) has the lowest mean (2.998), while empowerment (EM) has the highest (3.480).

For the standard deviation, system connection (SC) has the highest value (1.294), followed by continuous learning (CL) at 1.263, team learning (TL) at 1.223, strategic leadership (SL) at 1.174, inquiry and dialogue (ID) at 1.147, embedded system (ES) at 1.134, empowerment (EM) at 1.129, and organizational performance (OP) at 1.037; organizational innovativeness (OI) has the lowest value (0.945). Likewise, the standard error of the mean is highest for system connection (SC) (0.065) and lowest for organizational innovativeness (OI) (0.048). The medians of the nine constructs range from 3.000 to 3.667, and the modes are 3.00 and 4.00.

The maximum and minimum are 5.00 and 1.00 respectively.
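The reported standard errors can be cross-checked against the standard deviations, since the standard error of the mean is the standard deviation divided by the square root of N. The sketch below is a hedged illustration that uses the standard deviations from Table 4.4 and N = 392; the variable and function names are chosen for the example only.

```python
import math

N = 392  # valid cases per construct (Table 4.4)

# Standard deviations as reported in Table 4.4.
STD_DEV = {
    "CL": 1.263, "ID": 1.147, "TL": 1.223, "ES": 1.134, "EM": 1.129,
    "SC": 1.294, "SL": 1.174, "OI": 0.945, "OP": 1.037,
}

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

std_error = {name: round(standard_error(sd, N), 3) for name, sd in STD_DEV.items()}

highest = max(std_error, key=std_error.get)
lowest = min(std_error, key=std_error.get)
print(highest, std_error[highest])   # SC 0.065
print(lowest, std_error[lowest])     # OI 0.048
```

Both results match the highest and lowest standard errors quoted in the text.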
