To predict the effect of the oceans on land climate, Earth scientists have developed ocean climate indices (OCIs), which are time series that summarize the behavior of selected areas of the Earth's oceans. For example, the Southern Oscillation Index (SOI) is an OCI that is associated with El Niño. In the past, Earth scientists have used observation and, more recently, eigenvalue analysis techniques, such as principal components analysis (PCA) and singular value decomposition (SVD), to discover ocean climate indices. However, these techniques are only useful for finding a few of the strongest signals and, furthermore, impose a condition that all discovered signals must be orthogonal to each other. We have developed an alternative methodology for the discovery of OCIs that overcomes these limitations and is based on clusters that represent ocean regions with relatively homogeneous behavior [STK+01]. The centroids of these clusters are time series that summarize the behavior of these ocean areas. We divide the cluster centroids into several categories: those that correspond to known OCIs, those that are variants of known OCIs, and those that represent potentially new OCIs. The centroids that correspond to known OCIs provide a validation of our methodology, while some variants of known OCIs may provide better predictive power for some land areas. Also, we have shown that, in some sense, our current cluster centroids are relatively complete, i.e., capture most of the possible candidate OCIs. For further details, the reader is referred to [STK+01].

A number of aspects of Earth Science data and the previously described analyses require the use of high-performance computing. First, satellites are providing measurements of finer granularity. For instance, a 1° by 1° grid produces 64,800 data points, while a 0.1° by 0.1° grid produces 6,480,000 data points. Second, more frequent measurements, e.g., daily measurements, multiply monthly data by a factor of 30.
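The grid sizes quoted above follow from simple arithmetic on a global latitude-longitude grid (360° of longitude by 180° of latitude). The short Python sketch below reproduces them; the function name `grid_points` and the constant `DAYS_PER_MONTH` are illustrative assumptions, not part of the original work.

```python
# Illustrative sketch only: names below are hypothetical, not from the paper.

DAYS_PER_MONTH = 30  # approximate factor quoted in the text for daily vs. monthly data


def grid_points(resolution_deg: float) -> int:
    """Return the number of cells in a global lat-lon grid at the given resolution."""
    n_lon = round(360 / resolution_deg)  # cells around the globe
    n_lat = round(180 / resolution_deg)  # cells from pole to pole
    return n_lon * n_lat


coarse = grid_points(1.0)  # 1 degree by 1 degree grid
fine = grid_points(0.1)    # 0.1 degree by 0.1 degree grid

print(coarse, fine, fine // coarse)
# Values per month at fine resolution if measurements become daily:
print(fine * DAYS_PER_MONTH)
```

Refining the grid by a factor of 10 in each dimension multiplies the number of points by 100, and moving from monthly to daily measurements multiplies the data volume by roughly another 30.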
Also, looking at weather instead of climate requires finer resolution to enable the detection of fast-changing patterns, e.g., the movement of fronts. Our current clustering analysis, while effective, requires O(n²) comparisons since it needs to evaluate the correlation of every ocean point with every land point. Furthermore, association rule algorithms can also be very compute-intensive. Indeed, the computational complexity of these algorithms is potentially very much greater than O(n²). Finally, the amount of memory required for cluster-