Next: Methods Up: Statistical Vector Field Analysis Previous: Statistical Vector Field Analysis


Combined cross-sectional and longitudinal data sets are routinely collected in many developmental studies [McArdle et al. 1991, Donaldson & Horn 1992]. Yet these data pose a number of problems for researchers attempting to understand the underlying structure of the processes that the measurements record. Often the data are incomplete or censored. Often different individuals have been measured different numbers of times at varying lag intervals. Sometimes portions of the population are under-represented or missing entirely [McArdle & Anderson 1990]. The researcher would like to be able to look for trends or growth curves in these data in a manner which tolerates substantial amounts of missing data [Tukey & Tukey 1981].

Growth curves rest on the underlying assumption that an individual changes with age. But which age is being measured? Almost inevitably chronological age must be used as an approximation to a developmental age [Schroots & Birren 1990]. In fact, the best approximation for a developmental age may often be obtained by fitting the individual growth curve to a population normal growth curve. Multiple growth curve prototypes may also exist within a single population, giving rise to what is called a growth surface, that is, a multidimensional growth curve. These problems, among others, make the development of normalized growth curves exceedingly difficult.

We present a method that tolerates missing data and does not force the data to fit a single, parameterized growth curve. This method has its own inherent liabilities, but can be of considerable help in visualizing trends in difficult data sets.

Researchers in physics, magnetics and fluid dynamics have successfully used a tool called vector field analysis to represent complex flux in a field [Slepian 1951, Halliday & Resnick 1967, Roberts & Potter 1970]. The notion is that each point in the field has associated with it a vector which represents the flux in the field at that point. In two dimensions this can be represented by partitioning the plane into many small rectangles. The total flux within each rectangle can be thought of as an arrow whose direction represents the direction of the flux and whose length represents the strength of the flux. This regular division of the plane has the benefit of isolating the flux within one rectangle from the flux within the surrounding rectangles. This isolation allows the analysis of even complex fields which are not differentiable or even continuous functions [Thompson & Stewart 1986, Glass & Mackey 1988, Parker & Chua 1989]. The method results in a data set that can be graphed to show field flux in an intuitive fashion [Tufte 1983].

Figure 1 illustrates the basic idea of a vector field plot. Imagine some three dimensional surface such as the "hill" shown in Figure 1-a. At each point on the hill, there is a direction in which water would flow if it were poured onto the hill, and a magnitude of slope in that direction. We can represent these two quantities by using an arrow, where the arrow's direction and length represent the direction and magnitude of slope on the hill. When an arrow is used in such a way, it is commonly called a vector. Figure 1-b is a two dimensional vector field plot of the slope of the three dimensional hill.

Figure 1. (a) Three dimensional plot of a computer generated "hill". (b) Vector field plot of the slope of the hill.
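
The idea behind a plot like Figure 1-b can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Gaussian surface in `hill`, the grid size, and the function names are hypothetical choices, not the surface or software used to produce Figure 1. At each grid node the sketch estimates the slope by central differences and returns the downhill vector together with its magnitude.

```python
import math


def hill(x, y):
    """Hypothetical smooth hill: a Gaussian bump centred on the origin."""
    return math.exp(-(x * x + y * y))


def downhill_vector(x, y, h=1e-5):
    """Approximate the downhill slope vector at (x, y) by central differences."""
    dz_dx = (hill(x + h, y) - hill(x - h, y)) / (2 * h)
    dz_dy = (hill(x, y + h) - hill(x, y - h)) / (2 * h)
    # Water flows opposite to the gradient, in the direction of steepest descent.
    return (-dz_dx, -dz_dy)


def vector_field(n=5, extent=2.0):
    """Evaluate the downhill vector on an n-by-n grid over [-extent, extent]^2.

    Returns a list of (x, y, u, v, magnitude) tuples, one per grid node.
    """
    step = 2 * extent / (n - 1)
    field = []
    for i in range(n):
        for j in range(n):
            x = -extent + i * step
            y = -extent + j * step
            u, v = downhill_vector(x, y)
            field.append((x, y, u, v, math.hypot(u, v)))
    return field


if __name__ == "__main__":
    for x, y, u, v, mag in vector_field():
        print(f"({x:5.2f}, {y:5.2f})  direction=({u:7.4f}, {v:7.4f})  slope={mag:.4f}")
```

Each tuple corresponds to one arrow: `(u, v)` gives its direction and `magnitude` its length. At the summit the vector is zero, and on the flanks every arrow points away from the summit.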

We have taken these ideas from vector field analysis and added statistical methods which allow one to apply vector field analysis to combined cross-sectional and longitudinal data sets. We partition a two dimensional plot of score vs. age into small rectangles and accumulate summary statistics for measurements falling within each rectangle. Finally, we plot the resulting summary matrices to show a graph which we call a statistical vector field (svf). The svf plot fills the role of a scatterplot, but for balanced and unbalanced repeated measures data. The svf software creates these plots automatically from an ASCII data file.
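
The partition-and-accumulate step can be illustrated with a minimal Python sketch. It is not the svf software, and the choice of summary statistic is an assumption made for illustration only: each pair of consecutive measurements on one individual is assigned to the rectangle containing the earlier measurement, and the mean change in age and score is accumulated per rectangle. The function names, cell widths, and sample records are all hypothetical.

```python
from collections import defaultdict


def cell_index(age, score, age_width, score_width):
    """Return the (column, row) of the rectangle containing (age, score)."""
    return (int(age // age_width), int(score // score_width))


def summarize(records, age_width=1.0, score_width=10.0):
    """Accumulate a mean change vector for each rectangle of the age-score plane.

    records maps an individual's id to a list of (age, score) measurements.
    Individuals may have any number of measurements at any lag; those with a
    single measurement contribute no change vector (assumed handling).
    Returns {cell: (mean_delta_age, mean_delta_score, count)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for occasions in records.values():
        ordered = sorted(occasions)  # order each individual's measurements by age
        for (a0, s0), (a1, s1) in zip(ordered, ordered[1:]):
            acc = sums[cell_index(a0, s0, age_width, score_width)]
            acc[0] += a1 - a0
            acc[1] += s1 - s0
            acc[2] += 1
    return {cell: (da / n, ds / n, n) for cell, (da, ds, n) in sums.items()}


if __name__ == "__main__":
    # Hypothetical unbalanced data: three, two, and one measurement per person.
    sample = {
        "p1": [(5.2, 31.0), (6.1, 38.0), (7.3, 44.0)],
        "p2": [(5.6, 35.0), (7.0, 47.0)],
        "p3": [(6.4, 33.0)],
    }
    for cell, (d_age, d_score, n) in sorted(summarize(sample).items()):
        print(cell, round(d_age, 3), round(d_score, 3), n)
```

Each entry of the result describes one arrow: the rectangle it sits in, its horizontal and vertical components, and the number of measurement pairs behind it. Because rectangles are summarized independently, individuals with different numbers of measurements or irregular lags need no special treatment.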


Steven M. Boker
Sun Feb 12 18:20:50 EST 1995