
A nontechnical analogy: a mother sees various bumps and shapes under a blanket at the bottom of a bed. When one shape moves toward the top of the bed, all the other bumps and shapes move toward the top also, so the mother concludes that what is under the blanket is a single thing: her child. Similarly, factor analysis takes as input a number of measures and tests, analogous to the bumps and shapes. Those that move together are considered a single thing, which it labels a factor. That is, in factor analysis the researcher assumes that a "child" is out there in the form of an underlying factor, and takes simultaneous movement (correlation) as evidence of its existence. If the correlation is spurious for some reason, this inference will be mistaken, so when conducting factor analysis it is important to take into account variables which might introduce spuriousness, such as antecedent causes.
Factor analysis is part of the general linear model (GLM) family of procedures and makes many of the same assumptions as multiple regression: linear relationships, interval or near-interval data, untruncated variables, proper specification (relevant variables included, extraneous ones excluded), lack of high multicollinearity, and multivariate normality for purposes of significance testing. Factor analysis generates a table in which the rows are the observed raw indicator variables and the columns are the factors or latent variables which explain as much of the variance in these variables as possible. The cells in this table are factor loadings, and the meaning of the factors must be induced by seeing which variables are most heavily loaded on which factors. This inferential labeling process can be fraught with subjectivity, as different researchers may impute different labels.
There are several different types of factor analysis, the most common being principal components analysis (PCA), which is preferred for purposes of data reduction. However, common factor analysis, also called "principal factor analysis" (PFA), is preferred for purposes of causal analysis and for confirmatory factor analysis in structural equation modeling, among other settings.
Principal components analysis (PCA), a.k.a. components analysis or factor analysis, vs. principal factor analysis (PFA), a.k.a. principal axis factoring (PAF), common factor analysis, or factor analysis:

PCA: Analyzes a correlation matrix in which the diagonal contains 1's. (This is not equivalent to analyzing the covariance matrix.)
PFA: Analyzes a correlation matrix in which the diagonal contains the communalities. (This is equivalent to analyzing the covariance matrix, which is also what structural equation modeling does.)

PCA: Accounts for the total variance of the variables. Factors, properly called components, reflect the common variance of the variables plus the unique variance. That is, manifest variables may be conceptualized as reflecting a combination of common and unique variance explained by the components, plus error variance not explained by the components.
PFA: Accounts for the covariation among the variables. Factors reflect the common variance of the variables, excluding unique (variable-specific) variance. That is, manifest variables may be conceptualized as reflecting a combination of common variance explained by the factors, plus unique and error variance not explained by the factors.

PCA: Components seek to reproduce the total variable variance as well as the correlations. That is, PCA accounts for the total variance of the variables.
PFA: Factors seek to reproduce only the correlations of the variables. That is, PFA accounts for the covariation among the variables.

PCA is thus a variance-focused approach; PFA is a correlation-focused approach.

PCA: For the first component, PCA creates a linear equation which extracts the maximum total variance from the variables; for the second component, PCA removes the variance explained by the first component and creates a second linear equation which extracts the maximum remaining variance; and so on, continuing until the components explain all the common and unique variance in the set of variables.
PFA: PFA seeks the least number of factors which can account for the covariance shared by a set of variables. For the first factor, PFA creates a linear equation which extracts the maximum covariance from the variables; for the second factor, PFA removes the covariance explained by the first factor and creates a second linear equation which extracts the maximum remaining covariance; and so on, continuing until the factors explain all the covariance in the set of variables.

PCA: Normally, components are orthogonal to (uncorrelated with) one another, though an oblique (correlated) option is available.
PFA: Normally, factors are orthogonal to (uncorrelated with) one another, though an oblique (correlated) option is available.

PCA: Adding variables to the model will change the factor loadings.
PFA: In principle, it is possible to add variables to the model without affecting the loadings of the existing factors.

PCA: Used when the research purpose is data reduction or exploration. PCA is not used in causal modeling (ex., not used with structural equation modeling).
PFA: Used when the research purpose is theory confirmation and causal modeling. A type of PFA is built into structural equation modeling programs, for instance.

PCA: In SPSS: Analyze, Data Reduction, Factor; click the Extraction button; from the "Method" drop-down, select "Principal components".
PFA: In SPSS: Analyze, Data Reduction, Factor; click the Extraction button; from the "Method" drop-down, select "Principal axis factoring".
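The diagonal difference between the two approaches can be sketched numerically. The following is a minimal Python/NumPy illustration, using a hypothetical correlation matrix; the PFA side is a single-pass extraction with squared multiple correlations (SMCs) as initial communality estimates, whereas production software normally iterates the communalities:

```python
import numpy as np

# Hypothetical 4-variable correlation matrix (illustrative values only)
R = np.array([
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
])

# PCA: eigendecomposition of R with 1's on the diagonal (total variance)
pca_vals, pca_vecs = np.linalg.eigh(R)
order = np.argsort(pca_vals)[::-1]
pca_vals, pca_vecs = pca_vals[order], pca_vecs[:, order]
pca_loadings = pca_vecs * np.sqrt(pca_vals)   # component loadings

# PFA: replace the diagonal with initial communality estimates (SMCs),
# i.e. each variable's squared multiple correlation with the others
Rinv = np.linalg.inv(R)
smc = 1 - 1 / np.diag(Rinv)
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
pfa_vals, pfa_vecs = np.linalg.eigh(R_reduced)
order = np.argsort(pfa_vals)[::-1]
pfa_vals, pfa_vecs = pfa_vals[order], pfa_vecs[:, order]
pfa_loadings = pfa_vecs * np.sqrt(np.maximum(pfa_vals, 0))

print("PCA first-component loadings:", np.round(pca_loadings[:, 0], 3))
print("PFA first-factor loadings:  ", np.round(pfa_loadings[:, 0], 3))
```

Because the PFA diagonal contains communalities (all below 1), the eigenvalues of the reduced matrix, and hence the loadings, are smaller than their PCA counterparts, reflecting the exclusion of unique variance.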
Warning: Factor analysis is not a silver bullet. Simulations comparing factor analysis with structural equation modeling (SEM) on simulated data indicate that, at least in some circumstances, factor analysis may not identify the correct number of latent variables, or even come close. While factor analysis may demonstrate that a particular model with a given number of latent variables is not inconsistent with the data, researchers should understand that other models with different numbers of latent variables may also have good fit by SEM techniques.
A Q-mode issue has to do with negative factor loadings. In conventional factor analysis of variables, loadings are loadings of variables on factors, and a negative loading indicates a negative relation of the variable to the factor. In Q-mode factor analysis, loadings are loadings of cases (often individuals) on factors, and a negative loading indicates that the case/individual displays responses opposite to those who load positively on the factor. In conventional factor analysis, a loading approaching zero indicates the given variable is unrelated to the factor. In Q-mode factor analysis, a loading approaching zero indicates the given case is near the mean for the factor. Cluster analysis is now more common than Q-mode factor analysis. Note, however, that correlations in factor analysis are treated within a general linear model which takes control variables into account, whereas cluster analysis uses correlations simply as similarity measures. For this reason, some researchers still prefer Q-mode factor analysis for clustering purposes.
The following modes are rare.
In the SPSS example below, focused on subjects' music preferences (coded from 1 = "like it" to 3 = "dislike it"), the red cells show the loadings for the measured (row) variables most associated with each of the six extracted components (factors). The green cell illustrates a weak to moderate cross-loading. Ideally, the researcher wants a "simple factor structure," with all main loadings greater than .70 and no cross-loadings greater than .40 (some say greater than .3). Usually, as here, actual patterns fall short of simple factor structure, though this example comes close. Rap music preference in component 3 is the most clearly and heavily loaded. Component 1 is the most diverse, associated with disliking classical, opera, Broadway, and big band music and cross-loaded with being less educated.
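The simple-structure rule of thumb can be checked mechanically. Below is a small Python/NumPy sketch with a hypothetical rotated loading matrix; the .70 and .40 cutoffs are the ones cited above:

```python
import numpy as np

# Hypothetical rotated loading matrix: 5 variables x 2 components
loadings = np.array([
    [0.82, 0.12],
    [0.78, 0.05],
    [0.75, 0.38],   # borderline cross-loading
    [0.10, 0.85],
    [0.08, 0.79],
])

# Main loading: largest absolute loading for each variable
main = np.abs(loadings).max(axis=1)
# Cross-loading: second-largest absolute loading for each variable
cross = np.sort(np.abs(loadings), axis=1)[:, -2]

simple_structure = bool(np.all(main > 0.70) and np.all(cross < 0.40))
print("main loadings:", main)
print("largest cross-loadings:", cross)
print("simple factor structure:", simple_structure)
```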
In the SPSS example below, again on analysis of music preferences, 18 components (factors) would be needed to explain 100% of the variance in the data. However, using the conventional criterion of stopping when the initial eigenvalue drops below 1.0, only 6 of the 18 factors were actually extracted in this analysis. These six account for 72% of the variance in the data.
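The Kaiser criterion and the cumulative-variance figures come directly from the eigenvalues of the correlation matrix, since the total variance to be explained equals the number of variables. A minimal Python/NumPy sketch with a hypothetical five-item correlation matrix:

```python
import numpy as np

# Hypothetical correlation matrix for 5 survey items (illustrative values)
R = np.array([
    [1.0, 0.7, 0.6, 0.2, 0.1],
    [0.7, 1.0, 0.6, 0.2, 0.1],
    [0.6, 0.6, 1.0, 0.1, 0.1],
    [0.2, 0.2, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.1, 0.5, 1.0],
])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
# Kaiser criterion: retain components with eigenvalue > 1.0
n_retained = int((eigvals > 1.0).sum())
# Percent of total variance explained (total variance = number of variables)
pct_cum = 100 * eigvals.cumsum() / R.shape[0]

print("eigenvalues:", np.round(eigvals, 3))
print("components retained:", n_retained)
print("cumulative % variance:", np.round(pct_cum, 1))
```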
The reproduced correlation residuals matrix may help the researcher to identify particular correlations which are ill reproduced by the factor model with the current number of factors. By experimenting with different models with different numbers of factors, the researcher may assess which model best reproduces the correlations which are most critical to his or her research purpose.
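The reproduced correlation matrix is simply the loadings matrix times its transpose, and the residuals are its differences from the observed correlations. A Python/NumPy sketch with a hypothetical correlation matrix and a two-component model (the .05 flag threshold is the one SPSS uses when counting nonredundant residuals):

```python
import numpy as np

# Hypothetical four-variable correlation matrix
R = np.array([
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
])

# Loadings for the first k = 2 components
vals, vecs = np.linalg.eigh(R)
order = np.argsort(vals)[::-1][:2]
loadings = vecs[:, order] * np.sqrt(vals[order])

# Reproduced correlations and their residuals
R_reproduced = loadings @ loadings.T
residuals = R - R_reproduced
off_diag = residuals[~np.eye(len(R), dtype=bool)]
print("max |residual|:", round(float(np.abs(off_diag).max()), 3))
# SPSS flags nonredundant residuals with absolute values > .05
print("share of residuals > .05:", float(np.mean(np.abs(off_diag) > 0.05)))
```

Refitting with a different number of components and re-running this check shows directly which correlations a given model reproduces well or badly.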
Oblique rotations, discussed below, allow the factors to be correlated, and so a factor correlation matrix is generated when oblique is requested. Normally, however, an orthogonal method such as varimax is selected and no factor correlation matrix is produced as the correlation of any factor with another is zero.
Problems arise even when the number of categories is greater than two. Spurious factors may be created not because items are similar in meaning but because they are similar in difficulty (Gorsuch, 1974; Lawrence, Shaw, Baker, Baron-Cohen, & David, 2004). Treating ordinal variables as interval is a form of measurement error and hence involves attenuation of correlation. This is why basing exploratory factor analysis (EFA) on a matrix of polychoric correlations, which are designed for ordinal data, results in higher factor loadings and higher eigenvalues as a rule. Monte Carlo studies by Joreskog & Sorbom (1986) uphold the desirability of basing EFA on polychoric matrices, as does research by Muthen & Kaplan (1985) and by Gilley & Uhlig (1993). Polychoric correlation matrices can be created in PRELIS, the front end to LISREL, described by Joreskog & Sorbom (1986). See the discussion of levels of data.
COMPARING FACTOR ANALYSIS, MULTIDIMENSIONAL SCALING, AND CLUSTER ANALYSIS

(Factor = factor analysis; MDS = multidimensional scaling; Cluster = cluster analysis.)

Multivariate? (Uses partial coefficients which control for other variables in the model.)
Factor: Yes.
MDS: No.
Cluster: No.

Group both variables and cases?
Factor: Designed for variables, but the data matrix could be flipped to factor cases (Q-mode).
MDS: Designed for variables; the data matrix could be flipped to scale cases.
Cluster: Hierarchical clustering gives a choice of clustering either. Other methods are designed for clustering cases; the data matrix could be flipped to cluster variables.

How many groups?
Factor: Use the Kaiser criterion (eigenvalues > 1) or the scree plot. The number of factors is influenced by the number of variables.
MDS: Minimize stress or use the scree plot. The number of groups is not influenced by the number of variables, and MDS may yield fewer groups.
Cluster: In hierarchical clustering, stop when the dendrogram distance jump is large. In two-step clustering, use the lowest BIC and the largest ratio of distance change.

How to label groups?
Factor: Infer from factor loadings.
MDS: Infer from which objects cluster in the p-space map. Some researchers then confirm groups in cluster analysis.
Cluster: Infer from group memberships of cases.

Criteria for good model fit?
Factor: Cumulative % of variance explained in the eigenvalue table; simple factor structure; high communalities; low reproduced correlation residuals. Maximum likelihood and GLS extraction methods have goodness-of-fit tests.
MDS: R-squared > .60; scatterplots of linear (covariates) or nonlinear (factors) fit form a 45-degree line.
Cluster: Low proximity coefficient in the agglomeration table for hierarchical clustering; low mean square error in the SPSS ANOVA table for k-means clustering, or in SAS a high overall R-square and CCC > 3; low BIC and high ratio of distance in the auto-clustering table for two-step clustering.

Save group membership if cases are grouped?
Factor: Yes; factor scores can be saved for cases when factoring variables, or, in Q-mode, factor scores reflect group membership tendencies.
MDS: No; this would have to be done manually.
Cluster: All three clustering methods can save the cluster membership number.

Most central output?
Factor: Table of factor loadings.
MDS: Perceptual map.
Cluster: Cluster membership table for all methods; tree diagram (dendrogram) for hierarchical clustering.

How to tell which variables are most important?
Factor: Rotated factor loadings.
MDS: Decomposition of normalized stress table in Proxscal.
Cluster: Variable-wise importance plot in two-step clustering (the predictor importance plot in SPSS 20).

How to spot influential cases?
Factor: Not available; consider preprocessing with regression, which has casewise diagnostics and can save influence measures.
MDS: A solution can be obtained for each individual when the input is a rectangular matrix; the weirdness index is available for INDSCAL/WMDS models. Also consider preprocessing with regression.
Cluster: Distance in the cluster membership table in k-means clustering.

Assumptions?
Factor: Assumptions of general linear models, such as linearity and normally distributed variables.
MDS: GLM assumptions do not apply.
Cluster: GLM assumptions do not apply.
where
varlist is a list of variable names separated by commas
meanslist is a list of the means of variables, in the same order as varlist
stddevlist is a list of standard deviations of variables, in the same order
CORR statements define a correlation matrix, with variables in the same order (data above are for illustration; one may have more or fewer CORR statements as needed according to the number of variables).
Note the period at the end of the MATRIX DATA and END DATA commands.
Then, if the MATRIX DATA command is part of the same control syntax working file, add the FACTOR command as usual but add the subcommand "/MATRIX=IN(*)" (without the quotation marks). If the MATRIX DATA command is not part of the same syntax set but has been run earlier, the matrix data file name is substituted for the asterisk.
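For readers working outside SPSS, the same summary-statistics input suffices: the covariance matrix is fully implied by the correlations and standard deviations, and extraction can proceed from the correlation matrix with no raw cases at all. A minimal Python/NumPy sketch with purely illustrative numbers:

```python
import numpy as np

# Analogue of SPSS MATRIX DATA input: factoring from summary statistics
# rather than raw cases (all numbers are purely illustrative)
stddevs = np.array([1.2, 0.8, 2.5])
R = np.array([
    [1.0, 0.4, 0.3],
    [0.4, 1.0, 0.2],
    [0.3, 0.2, 1.0],
])

# The implied covariance matrix: cov_ij = r_ij * sd_i * sd_j
D = np.diag(stddevs)
cov = D @ R @ D

# Extraction can proceed from the correlation matrix directly
eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
print("eigenvalues of R:", np.round(eigvals, 3))
```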
Using confirmatory factor analysis in structural equation modeling, having several or even a score of indicator variables for each factor will tend to yield a model with more reliability, greater validity, higher generalizability, and stronger tests of competing models than will CFA with two or three indicators per factor, all other things being equal. However, the researcher must take account of the statistical artifact that models with fewer variables yield apparently better fit as measured by SEM goodness-of-fit coefficients, all other things being equal.
However, "the more, the better" may not hold when there is a possibility of suboptimal factor solutions ("bloated factors"). Too many overly similar items will mask true underlying factors, leading to suboptimal solutions. For instance, items like "I like my office," "My office is nice," and "I like working in my office" may create an "office" factor when the researcher is trying to investigate the broader factor of "job satisfaction." To avoid such suboptimization, the researcher should start with a small set of the most defensible (highest face validity) items which represent the range of the factor (ex., ones dealing with work environment, coworkers, and remuneration in a study of job satisfaction). Assuming these load on the same job satisfaction factor, the researcher then adds one additional variable at a time, keeping only items which continue to load on the job satisfaction factor, and noting when the factor begins to break down. This stepwise strategy results in the most defensible final factors.
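The stepwise strategy can be sketched as a loop: compute first-factor loadings for the current item set plus one candidate, and retain the candidate only if it still loads on the factor. The Python/NumPy sketch below uses hypothetical correlations, a single-pass principal-factor extraction with SMC communalities, and an assumed .40 retention cutoff:

```python
import numpy as np

def first_factor_loadings(R):
    # Single-pass principal-factor loadings (SMC initial communalities)
    Rr = R.copy()
    np.fill_diagonal(Rr, 1 - 1 / np.diag(np.linalg.inv(R)))
    vals, vecs = np.linalg.eigh(Rr)
    v = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
    return v if v.sum() >= 0 else -v   # fix sign for readability

# Hypothetical correlations among 5 candidate job-satisfaction items;
# item 5 barely correlates with the rest
R = np.array([
    [1.00, 0.55, 0.50, 0.45, 0.10],
    [0.55, 1.00, 0.50, 0.45, 0.10],
    [0.50, 0.50, 1.00, 0.40, 0.10],
    [0.45, 0.45, 0.40, 1.00, 0.10],
    [0.10, 0.10, 0.10, 0.10, 1.00],
])

core = [0, 1, 2]                 # defensible starting set
for candidate in [3, 4]:
    trial = core + [candidate]
    loadings = first_factor_loadings(R[np.ix_(trial, trial)])
    if loadings[-1] > 0.40:      # keep items that still load on the factor
        core = trial
print("retained items:", core)
```

With these illustrative numbers, item 4 is retained and item 5 is dropped, mirroring the "note when the factor begins to break down" advice above.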
Computation: To compute KMO overall, the numerator is the sum of squared correlations of all variables in the analysis (except the 1.0 selfcorrelations of variables with themselves, of course). The denominator is this same sum plus the sum of squared partial correlations of each variable i with each variable j, controlling for others in the analysis. The concept is that the partial correlations should not be very large if one is to expect distinct factors to emerge from factor analysis. See Hutcheson and Sofroniou, 1999: 224.
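The computation described above can be sketched directly, with the partial correlations obtained from the inverse of the correlation matrix (a standard identity: the partial correlation of i and j controlling for all others is -A_ij / sqrt(A_ii * A_jj), where A is the inverse of R). A Python/NumPy sketch with a hypothetical correlation matrix:

```python
import numpy as np

def kmo_overall(R):
    """Overall KMO: sum of squared correlations divided by that sum
    plus the sum of squared partial (anti-image) correlations."""
    A = np.linalg.inv(R)
    d = np.sqrt(np.diag(A))
    # Partial correlation of i and j, controlling for all other variables
    partial = -A / np.outer(d, d)
    off = ~np.eye(len(R), dtype=bool)
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)

# Hypothetical four-variable correlation matrix
R = np.array([
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
])
kmo = kmo_overall(R)
print("KMO overall =", round(kmo, 3))
```

Large partial correlations inflate the denominator and drive KMO down, matching the intuition that distinct factors should not emerge when the correlations are mostly "explained away" by the other variables.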
SPSS: In SPSS, KMO is found under Analyze > Statistics > Data Reduction > Factor > Variables (input variables) > Descriptives > Correlation Matrix; check KMO and Bartlett's test of sphericity and also check Anti-image; then Continue > OK. The KMO output is KMO overall. The diagonal elements of the anti-image correlation matrix are the individual KMO statistics for each variable.
The factor invariance test, discussed above, is a structural equation modeling technique (available in AMOS, for ex.) which tests for deterioration in model fit when factor loadings are constrained to be equal across sample groups.
The comparison measures method requires computation of various measures which compare factor attributes of the two samples. Factor comparison is discussed by Levine (1977: 37-54), who describes these factor comparison measures:
However, occasionally an oblique rotation will still result in a set of factors whose intercorrelations approach zero. This, indeed, is the test of whether the underlying factor structure of a set of variables is orthogonal. Orthogonal rotation mathematically ensures that the resulting factors are uncorrelated.
Also, oblique rotation is necessary as part of hierarchical factor analysis, which seeks to identify higher-order factors on the basis of correlated lower-level ones.
When modeling, oblique rotation may be used as a filter. Data are first analyzed by oblique rotation and the factor correlation matrix is examined. If the factor correlations are small (ex., < .32, corresponding to less than 10% variance explained), then the researcher may feel warranted in assuming orthogonality in the model. If the correlations are larger, then covariance between the factors should be assumed (ex., in structural equation modeling, one adds double-headed arrows between latents).
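This filtering rule reduces to a single threshold check on the factor correlation matrix. A trivial Python/NumPy sketch with hypothetical oblimin output (note .32 squared is roughly .10, i.e. 10% shared variance):

```python
import numpy as np

# Hypothetical factor correlation matrix from an oblimin rotation
phi = np.array([
    [1.00, 0.21, 0.15],
    [0.21, 1.00, 0.08],
    [0.15, 0.08, 1.00],
])

off = phi[~np.eye(3, dtype=bool)]
# < .32 corresponds to < ~10% shared variance (.32**2 ~ .10)
orthogonal_ok = bool(np.abs(off).max() < 0.32)
print("treat factors as orthogonal:", orthogonal_ok)
```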
For purposes other than modeling, such as seeing if test items sort themselves out on factors as predicted, orthogonal rotation is almost universal.
HFA is a two-stage process. First an oblique (oblimin) factor analysis is conducted on the raw dataset. As it is critical in HFA to obtain the simplest factor structure possible, it is recommended to run oblimin for several different values of delta, not just the default delta = 0. A delta of 0 gives the most oblique solutions; the more negative the delta the researcher specifies (in the SPSS "Factor Analysis: Rotation" dialog, invoked by clicking the Rotation button), the less oblique the factors become. To override the default delta of 0, the researcher enters a value less than or equal to 0.8.
When the researcher feels the simplest factor structure has been obtained, one has a correlated set of lowerorder factors. Factor scores or a correlation matrix of factors from the first stage can be input to a secondstage orthogonal factor analysis (ex., varimax) to generate one or more higherorder factors.
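The second stage can be sketched by factoring the factor correlation matrix itself. A minimal Python/NumPy illustration with hypothetical stage-one factor correlations, extracting a single higher-order factor by eigendecomposition:

```python
import numpy as np

# Stage 1 output: correlation matrix of three oblique lower-order factors
# (hypothetical values)
phi = np.array([
    [1.00, 0.45, 0.40],
    [0.45, 1.00, 0.35],
    [0.40, 0.35, 1.00],
])

# Stage 2: factor the factor correlations to get higher-order loadings
vals, vecs = np.linalg.eigh(phi)              # ascending eigenvalues
g_loading = vecs[:, -1] * np.sqrt(vals[-1])   # loadings on the higher-order factor
g_loading = g_loading if g_loading.sum() >= 0 else -g_loading
print("higher-order factor loadings:", np.round(g_loading, 3))
```

The substantial correlations among the lower-order factors are what justify extracting the higher-order factor at all; with near-zero correlations (see the oblique-filter discussion) there is nothing for stage two to explain.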
Note, however, that this orthogonalization comes at a price. Now, instead of explicit variables, one is modeling in terms of factors, the labels for which are difficult to impute. Statistically, multicollinearity is eliminated by this procedure, but in reality it is hidden in the fact that all variables have some loading on all factors, muddying the purity of meaning of the factors.
A second research use for component scores is simply to be able to use fewer variables in, say, a correlation matrix, in order to simplify presentation of the associations.
Note also that factor scores are quite different from factor loadings. Factor scores are coefficients of cases on the factors, whereas factor loadings are coefficients of variables on the factors.
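The distinction can be made concrete: loadings form a variables-by-factors matrix, while scores form a cases-by-factors matrix. A Python/NumPy sketch using simulated data and one common way of computing scores, the regression method (scores = Z R⁻¹ L, where Z is the standardized data and L the loadings):

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated standardized data: 200 cases, 4 variables (illustrative only)
Z = rng.standard_normal((200, 4))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)

R = np.corrcoef(Z, rowvar=False)
vals, vecs = np.linalg.eigh(R)                # ascending eigenvalues
loadings = vecs[:, -2:] * np.sqrt(vals[-2:])  # variables x factors (k = 2)

# Regression-method factor scores: coefficients of cases on the factors
scores = Z @ np.linalg.inv(R) @ loadings      # cases x factors

print("loadings shape (variables x factors):", loadings.shape)
print("scores shape (cases x factors):", scores.shape)
```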
Common factor analysis (PFA) determines the least number of factors which can account for the common variance in a set of variables. This is appropriate for determining the dimensionality of a set of variables such as a set of items in a scale, specifically to test whether one factor can account for the bulk of the common variance in the set, though PCA can also be used to test dimensionality. Common factor analysis has the disadvantage that it can generate negative eigenvalues, which are meaningless.
Copyright 1998, 2008, 2009, 2010, 2011, 2012 by G. David Garson.
Last update: 2/9/2012.