Exercise 7-4
Decomposition using Principal Components

Earth observation imagery typically shows a great deal of variability over time, so it is common to want to decompose that variability into its underlying constituents. One of the most popular ways of doing this is Principal Components Analysis (PCA), also known as Empirical Orthogonal Function (EOF) analysis.


If you have not done so already, read the Principal Components section of the Earth Trends Modeler chapter in the IDRISI Manual. Then open the PCA panel on the Analysis tab. Select the SST data set as the input series. The defaults are set for typical use in time series analysis, so you can immediately click the Run button. When it has finished, ETM will automatically switch to the Explore PCA/EOT/Fourier PCA/Wavelet panel of the Explore tab. The first component will be displayed.

Note: A Clarification About Terminology. The Climatology/Atmospheric Science communities use a different terminology from that used in the Geography and Remote Sensing communities, and the difference goes beyond calling the technique EOF rather than PCA. The starting point for a standardized PCA/EOF is a correlation matrix (or a variance/covariance matrix if it is unstandardized). In Geography/Remote Sensing applications (and in ETM), this correlation matrix is between images over time. Thus, if you have 300 images over time, it is a 300 x 300 matrix of correlations. In the climatological community, the correlations are between pixels over space. Thus, if you have an image series with 100 columns and 100 rows, the correlation matrix will be a 10,000 x 10,000 matrix. Both procedures produce a series of images and a corresponding set of graphs, which are identical. In other words, there is only one solution regardless of how you construct the correlation matrix. This is because the solution is orthogonal over both space and time. However, the terminology is different. In the implementation here, the images are called components and the graphs are called loadings. If your correlations are between pixels, the graphs are the components and the images are the loadings. Note also that some climatologists refer to each component/loading pair as a mode.
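The duality described in the note can be checked numerically. The following is a minimal sketch, not ETM's implementation: it assumes a single, already standardized data matrix Z (a hypothetical 12 images by 40 pixels of random values) is used on both sides, and it compares the eigen-decomposition of the small image-by-image matrix with that of the large pixel-by-pixel matrix.

```python
import numpy as np

# Hypothetical toy series: 12 images (rows) of 40 pixels (columns),
# assumed to be already standardized.
rng = np.random.default_rng(0)
n_images, n_pixels = 12, 40
Z = rng.normal(size=(n_images, n_pixels))

time_side = Z @ Z.T     # 12 x 12: relationships between images over time
space_side = Z.T @ Z    # 40 x 40: relationships between pixels over space

# eigh returns eigenvalues in ascending order; reverse to descending.
evals_t, evecs_t = np.linalg.eigh(time_side)
evals_s, evecs_s = np.linalg.eigh(space_side)
evals_t, evecs_t = evals_t[::-1], evecs_t[:, ::-1]
evals_s, evecs_s = evals_s[::-1], evecs_s[:, ::-1]

# The nonzero eigenvalues agree between the two constructions.
same_eigenvalues = np.allclose(evals_t, evals_s[:n_images])

# The leading spatial pattern can be recovered from the time-side solution
# by projecting the data onto the leading time-side eigenvector.
pattern_from_time = Z.T @ evecs_t[:, 0]
pattern_from_time /= np.linalg.norm(pattern_from_time)
agreement = abs(pattern_from_time @ evecs_s[:, 0])  # 1.0 means identical up to sign

print(same_eigenvalues, round(float(agreement), 6))
```

The practical consequence is that the much smaller image-by-image matrix is enough to obtain both the graphs and the images.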



Look at the first loading graph. This shows time on the X axis and correlation on the Y axis. Notice that the values are all very high. What this tells us is that every image has this pattern present within it. Thus, this is essentially the pattern of the long term average sea surface temperature. Note that in interpreting the components, you should focus on the pattern over space and not the absolute values of the component scores (the values in the image). Because it is a standardized analysis and successive components are based on residuals from previous components, it becomes increasingly hard to relate these values back to the original imagery. However, we can see in the title of the loading graph that this first component accounts for 98.22% of the variability in sea surface temperature over space and time. All remaining variability is contained within the remaining 1.78%.
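To see why a shared long-term pattern produces uniformly high loadings and a dominant first component, consider the sketch below. It uses a synthetic series (a hypothetical spatial gradient plus noise, not the SST data) and assumes the common convention that a loading is an eigenvector element scaled by the square root of its eigenvalue, which may differ in detail from ETM's computation.

```python
import numpy as np

# Synthetic series: every image is the same spatial gradient plus noise.
rng = np.random.default_rng(1)
n_images, n_pixels = 24, 500
mean_pattern = np.linspace(0.0, 30.0, n_pixels)   # hypothetical long-term pattern
series = mean_pattern + rng.normal(scale=1.0, size=(n_images, n_pixels))

# Standardized PCA between images: correlation matrix of the 24 images.
R = np.corrcoef(series)                           # rows are treated as variables
evals, evecs = np.linalg.eigh(R)
evals, evecs = evals[::-1], evecs[:, ::-1]        # sort descending

percent_variance = 100.0 * evals / evals.sum()

# Loadings of each image on component 1 (correlation-like values).
loadings_1 = evecs[:, 0] * np.sqrt(evals[0])
if loadings_1.sum() < 0:                          # eigenvector sign is arbitrary
    loadings_1 = -loadings_1

print(percent_variance[0], loadings_1.min(), loadings_1.max())
```

In this toy case every image loads strongly on the first component, which mirrors the behavior seen in the SST loading graph.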

Now in the Explore PCA panel, select Component 2 and click the Display (map) icon to its right. The component image will display. Notice that the loadings follow an annual cycle that is symmetric about the 0 correlation position. The loadings are positive during the northern hemisphere late summer/early autumn and negative in the early spring. Then notice that the component image also has positive and negative values. In a case like this, it is best for the contrast stretch to be symmetric about 0 so that negative and positive values can be distinguished unambiguously. Therefore, make sure that the PCA layer is highlighted in Composer (it might not be if you have an automatic vector overlay), and click the middle STRETCH button at the bottom of Composer to create a symmetric stretch about zero.
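The idea behind a symmetric stretch can be sketched in a few lines. This is only an illustration of the principle, assuming the display limits are set to plus and minus the largest absolute value in the image; Composer may choose its limits differently. The function name is hypothetical.

```python
import numpy as np

def symmetric_limits(image):
    """Return (low, high) display limits centered on zero.

    Hypothetical helper: uses the largest absolute value in the image,
    ignoring NaN, so that zero falls in the middle of the palette.
    """
    limit = float(np.nanmax(np.abs(image)))
    return -limit, limit

# Toy component image with both negative and positive scores.
component = np.array([[-0.2, 0.5],
                      [0.1, -0.3]])
low, high = symmetric_limits(component)
print(low, high)
```

With limits chosen this way, the midpoint color of a diverging palette always corresponds to a score of zero.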

Notice the hemispheric (north/south) differences in the component scores (the image values). Also notice in the Atlantic how the division between the hemispheres falls in the same position as the Atlantic Equatorial Counter Current noted earlier. Clearly this is the annual seasonal cycle. Notice also that while the component explains only a little over 1.5% of the total variance in SST over space and time, this represents over 85% of the variance remaining after the effects of Component 1 are removed.
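The relationship between the two percentages can be worked through with the figures quoted in the text. The sketch below uses only the 98.22% reported for Component 1 and the stated 85% share; the exact percentage for Component 2 is not given here, so the code derives the minimum share that would be consistent with "over 85%".

```python
# Figures taken from the text; variable names are illustrative.
component_1_share = 98.22                  # percent of total variance
remaining = 100.0 - component_1_share      # percent left after Component 1

# Smallest share of total variance that still exceeds 85% of the remainder.
minimum_c2_share = 0.85 * remaining

print(round(remaining, 2), round(minimum_c2_share, 2))
```

So any Component 2 share above roughly 1.51% of the total variance amounts to more than 85% of what remains after Component 1.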

Looking at the loadings graph and the component image as a pair, the loadings say that geographically the pattern looks most like this during the boreal late summer/early autumn (August/September, when the loadings are high) and the opposite of this during the boreal early spring months (February/March, when the loadings are highly negative). The nearly perfect sinusoidal pattern of the loadings supports the interpretation of this as the annual cycle, but there is evidently a lag in its maximum impact.

d) Now display the loading graph and component image for Component 3. Also use the STRETCH button on Composer to stretch the image symmetrically. This is also an annual cycle, but notice that it is aligned more with the early winter (December) and early summer (June) and that it accounts for much less variance (only about 4% of the variance explained by Component 2).

1 Compare the areas that have the strongest seasonality in Components 2 and 3. Given the timing of loadings, what does this suggest about the relationship between components over space and time? We know that components are independent of each other. Are they independent of each other in time, space or over both?





Now display and examine the loading graphs for Components 4, 5 and 6. Stretch each of the component images symmetrically using the middle STRETCH option in Composer. Component 4 is also clearly a seasonal cycle; however, it is semi-annual. Component 5 is clearly an interannual cycle (we will have more to say about this shortly), while Component 6 appears to be a mix between a seasonal cycle (again, semi-annual) and an interannual oscillation. This highlights an interesting issue regarding PCA/EOF. Although the components can represent true underlying sources of variability, they can also represent mixtures. We will explore this further in subsequent exercises.

Often it is these interannual oscillations that are a key interest in image time series analysis. If this is the case, then it is usually advisable to run the PCA on deseasoned data. Therefore, let's go back to the Analysis tab and run PCA again, but this time use the anomalies in SST you created in an earlier exercise. Use all the same parameters that you did the first time (i.e., the defaults).
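Deseasoning can be sketched as subtracting, for each calendar month, the long-term mean of that month. The example below is a minimal illustration under stated assumptions: a monthly series that starts in January and spans whole years, stored as an array with time on the first axis. It is not necessarily how ETM computes its anomalies, and the function name is hypothetical.

```python
import numpy as np

def monthly_anomalies(series):
    """Subtract the mean of each calendar month from a monthly series.

    Assumes time is the first axis, the series starts in January,
    and its length is a multiple of 12.
    """
    n_months = series.shape[0]
    if n_months % 12 != 0:
        raise ValueError("series length must be a multiple of 12")
    by_year = series.reshape(n_months // 12, 12, -1)   # (years, months, pixels)
    climatology = by_year.mean(axis=0)                 # long-term mean per month
    return (by_year - climatology).reshape(series.shape)

# Toy series: a strong annual cycle plus a small year-to-year signal.
months = np.arange(36)
seasonal = 5.0 * np.sin(2 * np.pi * months / 12)
interannual = np.repeat([1.0, 0.0, -1.0], 12)
series = (seasonal + interannual)[:, None] * np.ones((1, 4))   # 36 months x 4 pixels

anomalies = monthly_anomalies(series)
print(anomalies[:, 0].round(6))
```

In the toy series the annual cycle disappears and only the year-to-year signal remains, which is the kind of variability the PCA on anomalies is meant to isolate.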

Now look at Component 1 from this new analysis and compare it to Component 5 from your previous one. Clearly they are the same thing (although the loading for Component 1 of the anomalies in SST is more coherent over time), but the patterns are inverted in both the component images and the loading graphs. Because both are inverted, they represent the same thing, much as the negative of a negative number is positive. This leads to an important point. It is mathematically permissible to invert the loadings graph (by multiplying by -1) if you also invert the component image. The end result is identical mathematically, but in some cases may be easier to explain. Don't hesitate to do this. For the graph, export the data to a spreadsheet (right-click on empty space in the graph, choose the clipboard text option, paste into your spreadsheet, and then multiply by -1); for the component image, use the SCALAR module or Image Calculator to multiply by -1.
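The reason a joint inversion is harmless can be shown with a small sketch. It assumes the contribution of one component to the series is the product of its loading at each time step and its component image; the values below are hypothetical.

```python
import numpy as np

# Hypothetical loadings (one per time step) and a 2 x 2 component image.
loadings = np.array([0.8, -0.1, -0.7])
component = np.array([[1.5, -0.5],
                      [0.2, -2.0]])

# Contribution of this component to each image in the series.
contribution = loadings[:, None, None] * component

# Invert both the loadings and the component image.
flipped = (-loadings)[:, None, None] * (-component)

unchanged = np.array_equal(contribution, flipped)
print(unchanged)
```

Inverting only one of the pair would reverse the sign of the contribution, which is why the graph and the image must always be flipped together.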

If you have not yet stretched Component 1 from your anomalies analysis, do so now (with the symmetric option). This is the El Niño / La Niña phenomenon (also known as the El Niño / Southern Oscillation, abbreviated as ENSO). ENSO is an irregular oscillation, typically in the 2.5-7 year range. El Niño events are associated with a weakening (or even a reversal) of the prevailing easterlies (trade winds) along the equator. Normally, the frictional effect of these easterlies on the sea surface causes a movement of warm surface waters to the Asian side of the Pacific. In fact, the sea surface on the Asian side is normally higher (by about 40 cm) than on the South American side. When the trade winds weaken, this warm pool of water flows back to the South American side under the force of gravity. After a period of about 6-12 months of warming, the trade winds resume and the pattern reverses. In fact, El Niño events are characteristically followed by an abnormal strengthening of the trades, producing the opposite effect known as a La Niña.

2 Looking at your loading graph, the big peaks and big valleys represent El Niño and La Niña events, respectively. Tabulate the periods when you think El Niño conditions existed, when the La Niña pattern was prevalent and when neither was present (some call this "La Nada"). What do you think is the typical length of a complete El Niño event? What about the typical length of a La Niña? How normal are La Nada conditions?
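If you export the loadings to a file, a simple way to organize the tabulation is to label each time step by comparing the loading with a cutoff and then group consecutive steps with the same label. The sketch below is only an aid: the threshold is an arbitrary assumption that you must choose and justify yourself, the function name and sample values are hypothetical, and it assumes the sign convention in which peaks correspond to El Niño.

```python
import numpy as np
from itertools import groupby

def label_phases(loadings, threshold):
    """Group consecutive time steps into El Nino / La Nina / La Nada runs.

    Returns (label, first_index, last_index) tuples. The threshold is a
    user-chosen cutoff, not an official ENSO definition.
    """
    values = np.asarray(loadings, dtype=float)
    labels = np.where(values >= threshold, "El Nino",
                      np.where(values <= -threshold, "La Nina", "La Nada"))
    runs = []
    start = 0
    for label, group in groupby(labels.tolist()):
        length = len(list(group))
        runs.append((label, start, start + length - 1))
        start += length
    return runs

# Hypothetical loadings for seven consecutive months.
example = [0.1, 0.6, 0.7, 0.2, -0.5, -0.8, 0.0]
phases = label_phases(example, threshold=0.4)
for label, first, last in phases:
    print(label, first, last)
```

Trying several thresholds shows how sensitive the apparent event lengths are to the cutoff, which is worth keeping in mind when answering the questions above.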



ENSO is known as a climate teleconnection because it leads to correlated climate conditions over widely dispersed areas of the globe. A teleconnection can also be defined as a characteristic pattern of variability. There is great interest in the study of teleconnections because of their utility in seasonal forecasting. By monitoring SST in the central Pacific, we now have good warning about the development of ENSO conditions, which has facilitated seasonal forecasting around
