I have recently been doing some basic Empirical
Orthogonal Function (EOF) analysis of some oceanographic data and have
found the literature to be rather confusing. Here I have
collected a few notes on the subject, MATLAB code and useful
references. The discussion is very basic and is not meant to
be an in-depth treatment of EOF analysis. If you have any
corrections to these notes, my contact information is
here.
Terminology:
First of all, there is no consistent terminology for EOF
analysis; several competing and at times contradictory
terminologies are in use. This can make it exceedingly
difficult to understand the literature. I will use "empirical
orthogonal functions" or EOFs to refer to the "spatial" patterns that
are the result of doing an EOF analysis and "expansion coefficients"
or ECs to refer to the "temporal" patterns. In the literature,
you will find:
EOFs = principal component loading patterns or, at times,
just principal components
ECs = EOF time series, expansion coefficient time series, principal
component time series, principal component scores, principal component
amplitudes or, at times, just principal components
There is also talk of covariance matrices and communalities. I
will explain what these are later.
Doing EOF analysis in 5 minutes or less:
This is the quickstart to doing EOF analysis.
- Put your data into a matrix so that the rows indicate temporal
development and the columns are variables or spatial data points.
The temporal relationship between rows is unimportant (i.e., the
sampling does not have to be uniform). The same holds for the
spatial relationship between columns.
- Detrend the columns of the resulting matrix. Some EOF
routines do this for you, but I prefer to do it separately.
- Use singular value decomposition (svd) to break up your data
into 3 matrices:
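Z = U * D * V^T
where Z is the detrended data matrix, U and V have orthonormal
columns, D is diagonal and ^T denotes the transpose. Then,
EOFs = V
ECs = U * D
covariance matrix (of the ECs) = ECs^T * ECs / (n-1) = D^2 / (n-1)
communalities matrix = ECs * ECs^T
where n is the number of rows (times) in Z.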
That is really all there is to it. The individual EOFs are the
columns of the EOFs matrix. I have included
MATLAB code that performs the last step above (the SVD).
See the references for a more detailed discussion.
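The quickstart above can be sketched in a few lines of MATLAB. This is
only a minimal illustration, not the EOF.m routine listed on this page:
the synthetic data matrix Z and all variable names are assumptions made
up for the example, and the built-in detrend is assumed to be an
acceptable way to do the detrending step.

```matlab
% Synthetic example data (an assumption, for illustration only):
% rows are times, columns are spatial points.
t = (0:49)';                 % 50 time samples
x = linspace(0, 1, 8);       % 8 spatial points
Z = sin(2*pi*t/25) * cos(pi*x) ...
    + 0.5 * cos(2*pi*t/10) * sin(2*pi*x) ...
    + 0.01 * t * x;          % includes a linear trend in time

Z = detrend(Z);              % remove a linear trend from each column
n = size(Z, 1);              % number of times (rows)

[U, D, V] = svd(Z, 'econ');  % economy-size SVD: Z = U * D * V'

EOFs = V;                    % spatial patterns, one per column
ECs  = U * D;                % expansion coefficients, one per column

% Covariance matrix of the ECs; it is diagonal and equals D.^2 / (n - 1).
covECs = ECs' * ECs / (n - 1);
```

The data can be recovered as ECs * EOFs', which is a quick sanity
check that the decomposition was assembled correctly.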
After finishing these calculations, you will probably want to reduce
the EOFs and ECs to only those which explain a significant percentage
of the overall variance by keeping only the corresponding columns of
the ECs and EOFs. You then may or may not wish to rotate the EOFs to
make the resulting patterns easier to interpret physically.
Finally, there are a number of useful ways to visualize the
results of your analysis. I will not discuss visualization here.
I will also not discuss EOF analysis of several fields.
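One possible way to do the truncation is sketched below. The fraction
of variance explained by each EOF follows from the squared singular
values; the 90% cutoff is a purely hypothetical threshold chosen for
the example, and the synthetic data and variable names are likewise
assumptions.

```matlab
% Synthetic example data (an assumption, for illustration only).
t = (0:49)';
x = linspace(0, 1, 8);
Z = detrend(sin(2*pi*t/25) * cos(pi*x) ...
            + 0.5 * cos(2*pi*t/10) * sin(2*pi*x));

[U, D, V] = svd(Z, 'econ');

s = diag(D);                          % singular values, largest first
varFrac = s.^2 / sum(s.^2);           % fraction of variance per EOF

threshold = 0.90;                     % hypothetical cutoff
k = find(cumsum(varFrac) >= threshold, 1);   % smallest k reaching it

EOFs_k = V(:, 1:k);                   % leading k spatial patterns
ECs_k  = U(:, 1:k) * D(1:k, 1:k);     % matching expansion coefficients

Z_k = ECs_k * EOFs_k';                % rank-k approximation of the data
```

Other selection rules exist; this one simply keeps the smallest number
of leading EOFs whose cumulative variance fraction reaches the cutoff.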
Rotation of EOFs:
At times, the EOFs that result from the analysis will be difficult to
explain in terms of physical forces. In this case, it is often
beneficial to rotate the orthogonal basis you found to one which can
be better explained in terms of physical forces. Upon rotation,
you will at least lose the property that the ECs, the projections of
the data onto the basis, are mutually uncorrelated (no
cross-correlations). You will also lose orthonormality of the EOFs
matrix if you choose a non-orthogonal transformation. It is also
important to note that these rotations do not use any particular
property of the EOFs (such as orthonormality), so once you rotate you
have essentially reduced EOF analysis to noise reduction (via the
reduction in the number of EOFs); i.e., the result no longer has
anything to do with EOF analysis.
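A small numerical check can make the effect of rotation concrete. The
sketch below uses made-up synthetic data and an arbitrary 30 degree
rotation angle (both assumptions for illustration) to rotate the two
leading EOFs with an orthogonal matrix. The rotated patterns remain
orthonormal, but the covariance matrix of the data projected onto them
acquires off-diagonal terms.

```matlab
% Synthetic example data (an assumption, for illustration only).
t = (0:49)';
x = linspace(0, 1, 8);
Z = detrend(sin(2*pi*t/25) * cos(pi*x) ...
            + 0.5 * cos(2*pi*t/10) * sin(2*pi*x));
n = size(Z, 1);

[U, D, V] = svd(Z, 'econ');
EOFs2 = V(:, 1:2);                    % two leading EOFs
ECs2  = Z * EOFs2;                    % same as U(:, 1:2) * D(1:2, 1:2)

theta = pi / 6;                       % arbitrary rotation angle
R = [cos(theta), -sin(theta); ...
     sin(theta),  cos(theta)];        % orthogonal 2-by-2 rotation

rotEOFs = EOFs2 * R;                  % rotated spatial patterns
rotECs  = Z * rotEOFs;                % data projected onto rotated basis

covBefore  = ECs2'  * ECs2  / (n - 1);   % diagonal
covAfter   = rotECs' * rotECs / (n - 1); % off-diagonal terms appear
orthoCheck = rotEOFs' * rotEOFs;         % still the identity
```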
I will only discuss a particularly common orthogonal rotation known as
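varimax. It seems to be the most popular and certainly has a
logical explanation. As the name suggests, the criterion is usually
stated as maximizing (not reducing) the variance of the squared
elements within each rotated pattern, so that each pattern is
dominated by a few large values with the rest near zero. The
projection of the data onto the rotated basis then plays the role
that the ECs play for the unrotated EOFs. The aim is to put the basis
closer to the actual data and increase interpretability.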
I have included MATLAB code and references for doing varimax rotation.
The code has extensive documentation that represents my best
understanding of varimax rotation.
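For orientation, here is a minimal sketch of one widely used
SVD-based iteration for the raw varimax criterion. It is not the
varimax.m routine listed on this page and may differ from it in
detail: it rotates the unit-length EOFs directly, applies no row
normalization, and the iteration cap, tolerance, synthetic data and
variable names are all assumptions made for the example.

```matlab
% Synthetic example data (an assumption, for illustration only).
t = (0:49)';
x = linspace(0, 1, 8);
Z = detrend(sin(2*pi*t/25) * cos(pi*x) ...
            + 0.5 * cos(2*pi*t/10) * sin(2*pi*x));

[U, D, V] = svd(Z, 'econ');
L = V(:, 1:2);                 % patterns to rotate (two leading EOFs)
[p, k] = size(L);

R = eye(k);                    % accumulated orthogonal rotation
d = 0;                         % convergence monitor
for iter = 1:100               % hypothetical iteration cap
    dOld = d;
    Lam = L * R;               % current rotated patterns
    % Matrix whose SVD gives the next rotation for the varimax criterion.
    G = L' * (Lam.^3 - Lam * diag(sum(Lam.^2, 1)) / p);
    [Ug, Sg, Vg] = svd(G);
    R = Ug * Vg';              % nearest orthogonal matrix to G
    d = sum(diag(Sg));
    if dOld ~= 0 && d / dOld < 1 + 1e-8   % hypothetical tolerance
        break
    end
end

rotEOFs = L * R;               % varimax-rotated patterns
rotECs  = Z * rotEOFs;         % data projected onto the rotated basis
```

Because R is orthogonal, the rotated patterns span the same subspace
as the original two EOFs; only the choice of basis within that
subspace changes.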
Matlab Code:
These are a couple of generic routines for EOF analysis and rotation.
EOF.m
varimax.m