Abstract

Shape analysis provides a unique insight into biological processes. This paper evaluates the properties, performance, and utility of elliptical Fourier (eFourier) analysis to operationalise global shape, focussing on the human corpus callosum. 8000 simulated corpus callosum contours were generated, systematically varying in terms of global shape (midbody arch, splenium size), local complexity (surface smoothness), and nonshape characteristics (e.g., rotation). 2088 real corpus callosum contours were manually traced from the PATH study. Performance of eFourier was benchmarked in terms of its capacity to capture and then reconstruct shape and systematically operationalise that shape via principal components analysis. We also compared the predictive performance of corpus callosum volume, position in Procrustes-aligned Landmark tangent space, and position in eFourier n-dimensional shape space in relation to the Symbol Digit Modalities Test. Jaccard index for original vs. reconstructed from eFourier shapes was excellent (M=0.98). The combination of eFourier and PCA performed particularly well in reconstructing known n-dimensional shape space but was disrupted by the inclusion of local shape manipulations. For the case study, volume, eFourier, and landmark measures were all correlated. Mixed effect model results indicated all methods detected similar features, but eFourier estimates were most predictive, and of the two shape operationalization techniques had the least error and better model fit. Elliptical Fourier analysis, particularly in combination with principal component analysis, is a powerful, assumption-free and intuitive method of quantifying global shape of the corpus callosum and shows great promise for shape analysis in neuroimaging more broadly.

1. Introduction

Structural neuroimaging studies have provided invaluable insight into the normative development of the human brain and aetiology of neurodegeneration and disease. A focus on brain region is gradually being supplemented by recognition of the importance of also exploring shape characteristics [1]. For example the human corpus callosum, the main bundle of fibres between the left and right cerebral hemispheres, has been extensively studied due to its critical role in connecting distant specialised brain areas, and because of its implication in retardation and sensory and cognitive deficits when impaired. The shape of the corpus callosum is biologically meaningful because it reflects topological distribution of interhemispheric connectivity [2, 3]. Accordingly, corpus callosum shape has been shown to have clinical importance in dysfunction associated with interhemispheric connectivity (e.g., schizophrenia [4]) and neurodegenerative disease (e.g., multiple sclerosis [5, 6]).

Past studies have indirectly captured corpus callosum shape using area partitioning, where the two-dimensional contour is divided and the area of each division is compared (e.g., [7, 8]), or the more modern equivalent where parcellation or regional thickness is used in combination with the cross-section midline (e.g., [9, 10]). These approaches have provided useful insights into the importance of corpus callosum shape, but are limited in accounting for more subtle but potentially significant global shape characteristics (e.g., “fat and arched with bulbous splenium” and “thin and long with pointed splenium”).

Shape is the aspect of form that is invariant across rotation, rescaling, and translation [11]. Global shape is the overall form, distinct from small local details: global shape can be thought of as the overall shape of a country on a map, and local shape (sometimes called local complexity) as the detailed perturbations of the coastline [12]. In neuroimaging, the process of MRI scan registration and segmentation largely removes nonshape information (translation, scale, and orientation). At this point, shape can be captured using pointwise or deformation techniques, where shapes in each scan are directly compared with an atlas shape on a point-to-point or deformation field basis [13]. Alternatively, shape can be captured by specifying a series of points applied to corresponding locations on a set of scans, known as landmarks [14, 15]. These approaches have been very successful in demonstrating developmental or pathological differences in the brain [11, 13]. Particularly in the case of the corpus callosum, these techniques have revealed differences between normal controls and those with dyslexia (e.g., [16]), autism (e.g., [17]), bipolar disorder (e.g., [18]), foetal alcohol syndrome (e.g., [19]), and individuals with history of methamphetamine use (e.g., [20]). Although innovations such as automated semilandmarks have improved traditional entirely manual landmark assignment approaches [21], the reliance on a priori decisions about shape (e.g., choice of atlas, choice of landmark number and position, and semilandmark number) may lead to undetectable omission of potentially important shape features from analysis. There is therefore a need for techniques which can operationalise global shape without such a priori assumptions, for use in subsequent analyses.

Fourier analysis uses overlapping trigonometric functions to capture shape information. Different forms of Fourier analysis use different trigonometric functions. Radius (rFourier) focusses on the distance of any point on the outline to the centroid of the shape, and tangent (tFourier) focusses on the variation of the tangent angle for any point [22]. These techniques assume particular properties of the shape representation not typically found in real-world contours (e.g., equally spaced points around the contour). Their more modern extension, elliptical Fourier (eFourier), begins with an ellipse (derived from the whole perimeter and aligned to the first point of the contour). It considers shape in terms of an abscissa and a series of harmonics for each of the x and y axes [23]. It makes no assumptions about the form of the coordinates denoting a shape contour and draws from the whole perimeter (rather than a manually assigned “start” coefficient) to instantiate its initial ellipse, so it is more flexible and useable. Of the Fourier methods, eFourier analysis is well established as a technique for operationalising shape in the animal biology and anthropology literatures (e.g., [24, 25]). Yet, while a number of Fourier methods (primarily tFourier) have been used as part of broader processes to establish shape space and conduct shape analyses (e.g., [26, 27]), and there are historical examples of eFourier and principal components analysis being used in combination to operationalise shape [28-30], there are only a few examples of eFourier analysis being applied in neuroscience shape research [1, 31, 32].

While Fourier shape analysis has undergone rigorous scrutiny (e.g., [33-35]), these investigations can be opaque to nonmathematicians, and rarely address questions of pragmatic importance to neuroscience and health researchers: those relating to sensitivity (how well does this technique capture shape?), reliability (how consistent is error across shapes?), specificity (which nonshape considerations may bias results?), and interpretability (does this output reflect the shapes seen in the raw data?). Although there was growing interest and an increasing number of examples of Fourier shape analysis being applied in neuroimaging research and clinical outcomes (e.g., [32, 36, 37]), publications using this technique have slowed, and there remains a lack of field-relevant benchmarking of the method. The aim of this paper is therefore to establish the properties of Fourier shape analysis techniques with a specific focus on properties relevant to neuroimaging and health analyses, using the corpus callosum for demonstration.

2. Methods

2.1. Shapes

The human corpus callosum exhibits a wide range of shapes but can be characterised by three sections: the rounded splenium (or “head”), the midbody, and the genu (or “tail”). This partitioning is biologically meaningful as these regions are histologically distinct [38] and have been linked with different neural regions (e.g., midbody linking the motor cortices [39]) and pathologies (e.g., schizophrenia is linked to the splenium but not the genu [40]). Global shape is formed by variability in the relative size and angle of these components, as well as the thickness and arch of the midbody. In order to establish a measure’s sensitivity to global shape (driven by variation in these components seen in real-world corpus callosum shapes), we simulated 8000 corpus callosum-like shapes in R version 3.2.0 running on a Windows computer [41] using the packages raster, version 2.4.20 [42], and sp, version 1.2.1 [43] (Figures 1 and 2). These shapes consisted of a circle (pseudo-splenium) with 1/5th of the perimeter removed for attachment to the midbody, which consisted of two straight lines twice the diameter of the original circle in length. The genu consisted of a blunted outer curve and thinner inner curve dynamically sized to ensure a closed non-self-intersecting contour when joined with the midbody lines. As scale is not included in shape information, all manipulations were undertaken relative to this starting position.
As shown in Figure 2, eight sets of 1000 shapes were manipulated to systematically vary on (a) global shape characteristics: (1) midbody curve (induced by raising the centre of the initial midbody lines then extrapolating smooth curves while retaining connections to splenium and genu, varying from initial start point to 50% of total midbody length), (2) splenium scale (from -80% to +80% of initial starting size), (3) midbody length (from start point described above to 10 times longer relative to splenium circle size), and (4) a combination of these characteristics; (b) local complexity (regular puckering across the contour built from a curve of a sine wave at varying amplitudes); and (c) nonshape characteristics: (1) distribution of points along the outline (induced by resampling x y coordinates along the contour), (2) scale (linear transformation), and (3) rotation (from shape centroid).
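The nonshape manipulations in (c) are plain coordinate transformations, which can be illustrated briefly. The following Python sketch is not the R code used in this study; the function names and the use of the vertex mean as the centroid are assumptions made for illustration. It rotates and rescales a contour about its centroid and uses the shoelace formula to confirm the expected effect on enclosed area.

```python
import math

def centroid(points):
    """Mean of the contour vertices (a simple stand-in for the shape centroid)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def rotate_about_centroid(points, angle_rad):
    """Rotate every vertex by angle_rad around the contour centroid."""
    cx, cy = centroid(points)
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy)) for x, y in points]

def scale_about_centroid(points, factor):
    """Linearly rescale the contour around its centroid."""
    cx, cy = centroid(points)
    return [(cx + factor * (x - cx), cy + factor * (y - cy)) for x, y in points]

def polygon_area(points):
    """Enclosed area of a closed, non-self-intersecting contour (shoelace formula)."""
    n = len(points)
    twice_area = sum(points[i][0] * points[(i + 1) % n][1]
                     - points[(i + 1) % n][0] * points[i][1] for i in range(n))
    return abs(twice_area) / 2.0

# Illustrative contour: rotation leaves area unchanged, scaling by k multiplies it by k^2.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(polygon_area(rotate_about_centroid(square, math.pi / 3)))
print(polygon_area(scale_about_centroid(square, 2.0)))
```

A shape measure that is specific to shape should return the same value for the original contour and for any rotated or rescaled copy of it.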

Real corpus callosum shapes were drawn from the Personality and Total Health (PATH) Through Life study [44]. The current study focusses on 926 participants in the cohorts aged 40-45 and 60-65 years at baseline, followed up over twelve years with between one and three scans per participant. As described in Shaw, Sachdev [45], T1-weighted MRI scans were acquired in sagittal orientation (1 mm slices, repetition time 1160 ms, echo time 4.24 ms, flip angle 150, and matrix size 512x512) and processed in FreeSurfer v5.3, with each voxel sized to 1.0 mm3. Estimated intracranial volume was derived from a transformation between voxelwise intracranial volume and Talairach space transform [46]. A total of 2088 corpus callosum tracings were available for this analysis. Contours were manually traced from the two-dimensional mid-sagittal slice of each scan. A subset of these tracings (selection process and further detail described below) was used in the later exploration of the applied utility of the eFourier technique. Approval for the study was obtained from the human research ethics committee of the Australian National University and all participants provided written informed consent.

2.2. Global Shape and Shape Space Extraction

As outlined in Figure 1, all simulated and real corpus callosum shapes were denoted by points in two-dimensional Cartesian coordinates describing non-self-intersecting contours, which were imported into R and resampled to 200 points per contour (for consistency). All eFourier analyses were undertaken in Momocs, version 0.9.48 [47], with 50 harmonics specified. Though eFourier could theoretically perfectly reproduce a shape with a sufficiently high number of harmonics, the number of harmonics is limited by the number of initial points (two or more points are required for each harmonic). Computational expense was historically a limitation, but very large numbers of harmonics are tractable with modern computing power. In practice, 12-20 harmonics are typically used for morphometric analyses; 50 harmonics were selected here to balance common practice against high shape recovery fidelity. Shape recovery (via inverse eFourier transform) and shape space construction (via Principal Components Analysis, PCA) were also undertaken in Momocs.
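To make the forward and inverse eFourier steps concrete, the following Python sketch computes elliptical Fourier coefficients for a closed polygonal contour and reconstructs the contour from them. It is an illustration only and not the Momocs implementation: the function names are hypothetical, the coefficient formulas are the standard closed-form expressions for a piecewise-linear contour parameterised by arc length, and no normalisation for size, rotation, or starting point is applied.

```python
import math

def efourier(points, n_harmonics):
    """Return (a0, c0, T, coeffs) where coeffs[n-1] = (a_n, b_n, c_n, d_n)."""
    k = len(points)
    segs, t = [], 0.0
    for i in range(k):                      # walk the closed contour segment by segment
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % k]
        dx, dy = x1 - x0, y1 - y0
        dt = math.hypot(dx, dy)
        if dt == 0.0:                       # skip duplicated consecutive points
            continue
        segs.append((x0, y0, dx, dy, dt, t))
        t += dt
    T = t                                   # total perimeter
    # Mean of the piecewise-linear x(t) and y(t) over one period (the constant terms).
    a0 = sum((x0 + dx / 2) * dt for x0, _, dx, _, dt, _ in segs) / T
    c0 = sum((y0 + dy / 2) * dt for _, y0, _, dy, dt, _ in segs) / T
    coeffs = []
    for n in range(1, n_harmonics + 1):
        w = 2 * math.pi * n / T
        scale = T / (2 * n * n * math.pi ** 2)
        a = b = c = d = 0.0
        for _, _, dx, dy, dt, t0 in segs:
            dcos = math.cos(w * (t0 + dt)) - math.cos(w * t0)
            dsin = math.sin(w * (t0 + dt)) - math.sin(w * t0)
            a += dx / dt * dcos
            b += dx / dt * dsin
            c += dy / dt * dcos
            d += dy / dt * dsin
        coeffs.append((scale * a, scale * b, scale * c, scale * d))
    return a0, c0, T, coeffs

def reconstruct(a0, c0, T, coeffs, n_points):
    """Inverse transform: sample the harmonic sum at n_points equal arc-length steps."""
    out = []
    for j in range(n_points):
        t = T * j / n_points
        x, y = a0, c0
        for n, (a, b, c, d) in enumerate(coeffs, start=1):
            w = 2 * math.pi * n * t / T
            x += a * math.cos(w) + b * math.sin(w)
            y += c * math.cos(w) + d * math.sin(w)
        out.append((x, y))
    return out

# Illustrative contour: a unit circle sampled at 200 points, as in the resampling step.
circle = [(math.cos(2 * math.pi * i / 200), math.sin(2 * math.pi * i / 200))
          for i in range(200)]
a0, c0, T, coeffs = efourier(circle, 10)
rebuilt = reconstruct(a0, c0, T, coeffs, 200)
print(coeffs[0])   # first harmonic: for a circle this is close to (1, 0, 0, 1)
```

Each harmonic contributes four coefficients, so 50 harmonics describe a contour with 200 numbers plus the two constant terms.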

2.3. Benchmarking Metrics

Fidelity of shape recovery was investigated by comparing original shape against shape reconstructed by eFourier harmonics, via the Jaccard index [48] (illustrated in Figure 2, panels (a) and (b)). If $A$ is the original shape and $B$ is the reconstructed shape, then the Jaccard index superimposes $A$ and $B$ in space and describes their intersection relative to the union of the two shapes. In R, original and reconstructed shapes were rasterized into a 200x200 matrix, a resolution chosen to balance computational expense against sensitivity (these dimensions provide high resolution of 40000 points of possible overlap). An intersection matrix, $I$, was calculated such that $I = A \cap B$; then $J(A, B) = |I| / |A \cup B|$. The index is between 0 and 1, with values approaching 1 indicating greater similarity, and values approaching 0 indicating greater discrepancy between $A$ and $B$. A value of 0.8 or higher (indicating 80% overlap) is typically regarded as good to excellent in neuroimaging studies of this kind. The Dice index was also calculated but is not reported because results were very similar between Jaccard and Dice indices. Logistic regression and Nagelkerke pseudo-R2 were used to explore the impact of different shape manipulations on the Jaccard index.
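The rasterise-then-overlap calculation can be sketched as follows. This Python example is an illustration rather than the R code used here: the helper names are hypothetical, cells are classified by testing their centres with an even-odd ray-casting rule, and the grid spans the joint bounding box of the two contours (both choices are assumptions).

```python
def point_in_polygon(x, y, poly):
    """Even-odd ray casting: True if (x, y) lies inside the closed contour."""
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        if (yi > y) != (yj > y):            # edge straddles the horizontal ray
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside

def jaccard_index(shape_a, shape_b, grid=200):
    """Overlap of two contours on a grid x grid raster: |A and B| / |A or B|."""
    xs = [x for x, _ in shape_a + shape_b]
    ys = [y for _, y in shape_a + shape_b]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
    inter = union = 0
    for r in range(grid):
        y = ymin + (r + 0.5) * (ymax - ymin) / grid      # cell-centre coordinates
        for c in range(grid):
            x = xmin + (c + 0.5) * (xmax - xmin) / grid
            in_a = point_in_polygon(x, y, shape_a)
            in_b = point_in_polygon(x, y, shape_b)
            inter += in_a and in_b
            union += in_a or in_b
    return inter / union if union else 0.0

# Two overlapping 2 x 2 squares: intersection area 2, union area 6.
a = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
b = [(1.0, 0.0), (3.0, 0.0), (3.0, 2.0), (1.0, 2.0)]
print(jaccard_index(a, b, grid=60))
```

A pure-Python loop over a 200x200 grid is slow for contours with many vertices, so the example uses a coarser grid; the principle is unchanged.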

Principal Component Analysis (PCA) constructs a low dimensional view of a higher dimensional space. Broadly, the intention is to explain the greatest portion of variability in a large set of elements (such as items in a questionnaire or biochemical assay [49], spatiotemporal EEG signals [50], or facial features [51]) using the minimum set of dimensions, so that a complex set of measures can be summarised parsimoniously for characterisation and subsequent modelling. Here, we apply PCA to construct a lower dimensional view of a “shape space” (mirroring the terminology “face space” typically used in the application of PCA in the facial feature literature [51]). In the case of the simulated shapes, the dimensions of this shape space are determined by the aspects of shape we manipulate, and the position within those dimensions corresponds to the degree of manipulation. We term “shape space recovery” to refer to the degree to which the lower dimensional view of shape space produced by PCA reflects the known properties of this shape space.

Fidelity of shape space recovery was investigated by comparing the known shape space properties (introduced at the point of simulation) with reconstructed shape space ascertained via Principal Component Analysis (PCA) outputs. The degree to which the reconstructed shape space resembles the known shape space gives insight into the success of the operationalisation technique. As summarised in Figure 1, the two considerations are whether PCA detects the known number of components (shape space dimensionality), and how well it detects each shape’s position on these components (sensitivity to shape properties within shape space). Three PCA were undertaken on different subsets of the simulated shapes: first including those shapes with only global shape features (to explore sensitivity; manipulations of midbody curve, splenium size, midbody length), second with the addition of shapes which had nonshape features manipulated (to explore bias from nonshape cues; manipulations of the points denoting the contour, scale, and rotation), and finally with the addition of simulated shapes with local shape manipulations (to explore bias from local rather than global shape).
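As a rough illustration of how PCA yields both the dimensionality of a shape space and each shape's position within it, the following Python sketch extracts leading components from a matrix of per-shape descriptors (for example, rows of eFourier coefficients). It is not the Momocs implementation: it uses power iteration with deflation on the covariance matrix, a fixed starting vector, and hypothetical function names, all of which are assumptions chosen to keep the example dependency-free.

```python
import math

def pca(rows, n_components, n_iter=500):
    """Return (components, explained, scores) for a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    X = [[r[j] - means[j] for j in range(p)] for r in rows]      # centred data
    cov = [[sum(X[i][a] * X[i][b] for i in range(n)) / (n - 1)
            for b in range(p)] for a in range(p)]
    total_var = sum(cov[j][j] for j in range(p))
    components, variances = [], []
    for _ in range(n_components):
        # Assumed start vector; it must not be orthogonal to the leading axis.
        v = [1.0 / math.sqrt(p)] * p
        for _ in range(n_iter):
            w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
        components.append(v)
        variances.append(lam)
        # Deflate: remove the variance already explained by this component.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(p)] for a in range(p)]
    explained = [lam / total_var for lam in variances]
    scores = [[sum(x[j] * c[j] for j in range(p)) for c in components] for x in X]
    return components, explained, scores

# Toy "shape space" with one true dimension: every descriptor is driven by step t.
rows = [(float(t), 2.0 * t, 0.0) for t in range(10)]
components, explained, scores = pca(rows, n_components=1)
print(explained[0])            # proportion of variance on the first component
print([s[0] for s in scores])  # position of each shape on that component
```

In the simulated data, the analogue of `explained` indicates whether the number of components matches the number of manipulated characteristics, and the analogue of `scores` can be correlated with the known degree of manipulation.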

2.4. A Case Study: eFourier in Practice

Fidelity in recovering shape and shape space are important only inasmuch as they impact the sensitivity of shape operationalisation in application. We therefore present a case study of applying eFourier and PCA in combination to explore the association between corpus callosum shape and lexical fluency in a large, community-living sample of older adults. The Symbol Digit Modalities Test (SDMT) was chosen as an example due to its sensitivity to several of the major correlates of corpus callosum health (executive function, visual search, attention, and processing speed) and use in clinical evaluation of disorders impacting the corpus callosum such as multiple sclerosis [5, 6]. Drawing from the PATH study, we excluded all individuals with neurological pathology during the course of the study (dementia, epilepsy, and stroke), resulting in a sample size of 868 participants (aged 40-65 years at baseline, 45% female, 11% left-handed). As well as demographic (age, gender, years of education, and handedness side and degree) and MRI data, these participants completed a series of cognitive tests, including the SDMT [52]. Briefly, participants were presented with a key assigning abstract symbols to numbers 1 through 9 and were required to match the symbols to their paired digits in a worksheet below. They were given 90 seconds to match as many symbol/digit pairs as possible from a total set of 110. The final score was the total number of correct symbol/digit pairs.

Following the historical development of shape analysis in neuroimaging, we compared the performance of two-dimensional corpus callosum volume (calculated from the area of the traces), position in Procrustes-aligned Landmark tangent space (LPC), and position in shape space from PCA following eFourier operationalisation (EPC). Briefly, LPC involves manually assigning “landmarks,” discrete anatomical points that are homologous across shapes, sometimes (as in the current study) supplemented by “semilandmarks,” points positioned arbitrarily along a line that describes a curve. As with eFourier, a sufficiently high number of landmarks and semilandmarks would perfectly represent shape, but practical aspects such as computational expense and researcher time result in sometimes very few (5-10) landmarks being used with some forms. Procrustes alignment uses scaling, rotation, and translation to remove nonshape information denoted by these landmarks. These aligned shapes are projected into tangent space, an n-dimensional abstract space much like the n-dimensional abstract space invoked in PCA. We compare the performance of LPC and EPC in a mixed effects (hierarchical) linear model framework, with repeated measures nested within individuals (random intercepts allowed). Total intracranial and corpus callosum volumes were obtained directly from Freesurfer output and divided by 1000 (converted to mm3). Much as in [14] and shown in Figure 3, 10 fixed landmarks were assigned at the rostrum while 35 sliding “semilandmarks” denoted the remaining contour, using the geomorph package, v 3.0.1 [53]. The purpose of this analysis is to compare the two methods rather than to focus on the meaning of results, so no rotations will be applied, and only the first shape space component for each of eFourier/PCA and Landmark/Tangent space will be presented (denoted EPC1 and LPC1).
Variance explained by each component for each method will also be presented for context. All measures were converted into z scores for the purposes of comparison. Sensitivity analyses were carried out using three-dimensional corpus callosum volume (from automated Freesurfer voxel counting), and with unstandardized versions of volume, EPC1 and LPC1.
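The way Procrustes alignment removes translation, scale, and rotation can be shown with a short Python sketch. This is a pairwise (ordinary) Procrustes fit of one landmark configuration to a reference, written with complex numbers for brevity; it is not the generalised alignment with sliding semilandmarks performed in geomorph, and the function names are hypothetical.

```python
import math

def center_and_scale(points):
    """Remove translation and scale: centre on the mean, divide by centroid size."""
    z = [complex(x, y) for x, y in points]
    mean = sum(z) / len(z)
    zc = [w - mean for w in z]
    size = math.sqrt(sum(abs(w) ** 2 for w in zc))     # centroid size
    return [w / size for w in zc]

def procrustes_align(shape, reference):
    """Rotate `shape` onto `reference` (same landmarks, same order).

    Returns the aligned landmarks and the residual Procrustes distance.
    """
    z = center_and_scale(shape)
    ref = center_and_scale(reference)
    # The least-squares rotation is the unit complex number in the direction of
    # sum(ref * conj(z)).
    inner = sum(r * w.conjugate() for r, w in zip(ref, z))
    rot = inner / abs(inner) if abs(inner) > 0 else 1.0
    aligned = [w * rot for w in z]
    dist = math.sqrt(sum(abs(a - r) ** 2 for a, r in zip(aligned, ref)))
    return [(a.real, a.imag) for a in aligned], dist

# A rotated, rescaled, shifted copy of a configuration has the same shape.
ref = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 2.0)]
theta, k, tx, ty = 0.7, 2.5, 10.0, -3.0
copy = [(k * (x * math.cos(theta) - y * math.sin(theta)) + tx,
         k * (x * math.sin(theta) + y * math.cos(theta)) + ty) for x, y in ref]
aligned, dist = procrustes_align(copy, ref)
print(dist)   # residual after alignment; near zero for an identical shape
```

Any residual distance after alignment reflects a difference in shape rather than in position, size, or orientation.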

3. Results

3.1. Shape Recovery

As Figure 2 (panel (c)) shows, the Jaccard index for original vs. reconstructed shapes was reliably excellent and highly stable across all shapes (M=0.98, SD=0.03). Logistic regression with global shape manipulations as a comparison group indicated that local shape manipulations were associated with slightly lower (b=-0.09, SE<0.01) Jaccard index, while nonshape manipulations were associated with a slightly higher (b<0.01, SE<0.01) Jaccard index. The global shape manipulation with the lowest Jaccard index and most variability was splenium size, though the correspondence between original and reconstructed shape remained excellent (M=0.90, SD=0.06). Nagelkerke pseudo-R2 indicated that nonshape manipulations (scale, orientation, and randomness of points along the contour) explained more variance in the Jaccard index than shape manipulations (68% vs. 33%), though this has limited meaning due to the consistently limited variability in the Jaccard index (due to near ceiling performance).

3.2. Shape Space Recovery

PCA on eFourier operationalisations of shape resulted in a highly consistent, intuitive representation of shape space that maps onto the known manipulated properties. Strong sensitivity to shape properties within shape space can be clearly seen in Figure 2 (panels (d), (e), and (f)) and is reflected in the correlations between the degree of manipulation (1000 steps between none and most extreme) and position on each component, particularly when only global shape characteristics were included in the PCA: , , and . PC3 captured hybrid characteristics of midbody curve (r=0.89) and splenium size (r=0.92). Turning to shape space recovery, cumulative variance explained for the first and fifth components shows how the inclusion of nonshape and then local shape shifted the explanatory power of some individual components but did not disrupt variability in shape explained by the overall shape space: PC1 79.1% vs. 75.1% vs. 60.8%; PC5 98.4% vs. 98.3% vs. 98.0%. This pattern indicates that the combination of eFourier and PCA performs excellently at recovering a known n-dimensional shape space, though the impact of the local shape manipulation indicates that while this space is specific to shape, it encapsulates both global and local shape information.

3.3. Case Study Outcomes

PCA from both eFourier and landmark methods resulted in a first component (PC1) which varied primarily in midbody thickness and arch (high arch + thin through low arch + thick, as in Figure 3 panel (b)). Cumulatively eFourier explained more variance in shape than Landmarks. For both methods, variability explained by each component indicated that substantive interpretation would require multiple components. For eFourier, PC1 explained 45.6% of the variance in shape, PC2 an additional 16.5%, PC3 an additional 11.5%, PC4 an additional 6.1%, and PC5 an additional 4.9%. For landmarks, PC1 explained 26.3% of variance in shape, PC2 an additional 17.6%, PC3 an additional 11.7%, PC4 an additional 8.3%, and PC5 an additional 7.1%. This indicates that the first component of the landmarks was somewhat poorly defined.
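The cumulative comparison follows directly from the per-component percentages reported above. A brief Python check (variable names are illustrative; the landmark PC3 value is taken to be a percentage like the others):

```python
from itertools import accumulate

# Per-component variance explained (%), PC1 to PC5, as reported in the text.
efourier_pct = [45.6, 16.5, 11.5, 6.1, 4.9]
landmark_pct = [26.3, 17.6, 11.7, 8.3, 7.1]

def cumulative(pcts):
    """Running total of variance explained, rounded to one decimal place."""
    return [round(v, 1) for v in accumulate(pcts)]

print(cumulative(efourier_pct))   # [45.6, 62.1, 73.6, 79.7, 84.6]
print(cumulative(landmark_pct))   # [26.3, 43.9, 55.6, 63.9, 71.0]
```

The first five eFourier components therefore account for 84.6% of shape variance, compared with 71.0% for the first five landmark components.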

The median SDMT score was 54 (range 13-89, SD=10.91). Mixed effects pseudo-R2 (MuMIn package v1.42.1 [54]) indicated that CC volume explained 17% of variability in SDMT, while EPC1 explained just 2% and LPC1 1%. Models including the first ten PCA components (trading parsimony for greater reflection of the multidimensional shape space) indicated eFourier components 1-10 together explained 10% of variability, while Landmark components 1-10 together explained only 2%. The superior performance of analyses drawing on more of the shape space should be noted, but for the purposes of clarity of comparison with CC volume, only EPC1 and LPC1 will be the focus.

Corpus callosum volume was more closely associated with EPC1 (r=0.24) than with LPC1 (r=0.07). EPC1 and LPC1 were similar in both shape space and first component, which denoted a trend from a tall and arched shape to a flat stocky shape (Figure 3). Position on the first component within these spaces was strongly positively (but not perfectly) correlated, r=0.89. This is reflected in a similar, but slightly higher, Jaccard index for original contours vs. eFourier reconstructions than that of polygons denoted by the landmarks (M=0.95, SD=0.02). Mixed effects model results in Table 1 show that CC volume, LPC1, and EPC1 were all significantly positively associated with SDMT score, though LPC1 and EPC1 significance did not survive once covariates (intracranial volume, handedness side and degree, gender, age, and years of education) were added. The strength of this association (model slope) was lowest for LPC1 and highest for volume.

Mixed effects model fit was best for EPC1 (AIC= 13,530.38), followed by volume (AIC= 13,574.53), and worst for landmarks (AIC= 13,587.35). Together, these results indicate all three methods (volume, LPC, and EPC) were detecting similar features (the thin-to-stockier component detected by both LPC and EPC logically corresponds to low-to-higher volume) and that EPC outperformed LPC in terms of coefficient magnitude and overall model fit.
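The size of the differences in model fit can be read off as AIC differences from the best-fitting model (lower AIC indicates better fit). A small Python sketch using the values reported above; the dictionary keys are labels chosen for illustration:

```python
# AIC values for the three mixed effects models, as reported in the text.
aic = {"EPC1": 13530.38, "volume": 13574.53, "LPC1": 13587.35}

best = min(aic, key=aic.get)                       # model with the lowest AIC
delta = {name: round(value - aic[best], 2) for name, value in aic.items()}

print(best)    # EPC1
print(delta)   # {'EPC1': 0.0, 'volume': 44.15, 'LPC1': 56.97}
```

On these figures the volume model sits 44.15 AIC units and the landmark model 56.97 AIC units above the eFourier model.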

4. Discussion

This investigation benchmarked the properties of eFourier shape analysis techniques, focussing on performance and usability relevant to neuroimaging and health analyses, using the corpus callosum for demonstration. Elliptical Fourier demonstrated excellent capacity to capture and reconstruct shape with minimal error, and similarly captured and reconstructed a known n-dimensional shape space in combination with principal components analysis. Further, the case study demonstrates that the combination of eFourier and Principal Components Analysis can produce equally clinically useful outcomes that may be more precise than those from landmark methods.

Previous combinations of eFourier and principal components analysis (PCA) were innovative but stopped short of leveraging the full potential of this combination of techniques. Ferrario et al. (1994) [55] conducted eFourier and PCA on corpus callosum curves but then reduced the rich information provided by position on multiple components to a single Cartesian distance from the median shape for subsequent analysis, in the interests of parsimony. Our results demonstrate that the n-dimensional shape space resulting from eFourier and PCA directly and intuitively maps onto a series of global shape characteristics which may each have concurrent but distinct biological meaning. We therefore argue one or more component loadings should be retained as indicators of global shape which directly correspond to actual viewable shapes.

Point-to-point, deformation, and landmark methods have proven useful [11, 13], but have been criticised for their need for a priori choices such as landmark placement schemes which may omit unknown but important aspects of shape. In the current case study, we assigned substantially more landmarks than are typically used (e.g., 15 in [56]). This was made pragmatically feasible by advances in the usability and flexibility of semilandmarks [53]. Building on previous highly mathematically rigorous explorations of shape space constructed from landmarks [15, 56], our results demonstrated that, with sufficiently high resolution, a landmark-based approach could provide high fidelity reconstruction of shape and lead to a very similar shape space to that found by a combination of eFourier and PCA, and consequently operationalises shape in a way that leads to very similar conclusions in downstream analyses. However, the complete automation of eFourier remains a substantial advantage over the labour-intensive process of assigning the number of landmarks required for comparable, or in the current case slightly inferior, performance.

This paper has some limitations and leaves several avenues for further investigation. This paper was the first to use simulation to establish a shape space with known properties for the human corpus callosum. The choice of characteristics to vary was therefore necessarily arbitrary, leaving substantial scope for manipulation of further features. Similarly, analyses were limited to the corpus callosum due to the clear link between shape and function and easily identified contour. Similar benchmarking for other brain areas such as ventricles and hippocampus, or for pathologies such as cancer masses and stroke lesions, where shape is known to be important may prove fruitful. Real-world interpretation of the case study was limited by the focus of position on the first component rather than the whole shape space: while visual scatter plots and variance explained in both shape and SDMT indicated substantive rationale for attending to the shape space more broadly, the methodological focus of this paper required a simpler construction of using the first component only. Finally, although a focus on two dimensions is informative, there are potentially important shapes in neuroimaging that can only truly be captured in three dimensions (e.g., gyri). A clear future step is to expand the current approach to Spherical Harmonic Analysis (SPHARM), where overlapping trigonometric harmonics deform a sphere in three dimensions rather than eFourier’s deformation of ellipses in two [57], particularly given recent strides in SPHARM usability made in clinical medicine [58, 59].

4.1. Conclusions

Elliptical Fourier analysis, particularly in combination with principal component analysis, is a powerful, assumption-free, and intuitive method of quantifying global shape in neuroimaging data. Its weakness is that it is not a pure measure of global shape, as it is also sensitive to local shape information. Its strength is that it exhibits comparable if not greater sensitivity to shape at each stage of analysis than an equivalent landmark-based approach.

Data Availability

The underlying data for this submission (chiefly the PATH project data) cannot be made publicly available due to ethical and privacy concerns. However, interested researchers are welcome to apply for PATH data access following the procedures outlined on the project’s website: https://rsph.anu.edu.au/research/projects/personality-total-health-path-through-life.

Conflicts of Interest

The authors have reported no conflicts of interest.

Authors’ Contributions

All the authors have approved the final article.

Acknowledgments

The authors are grateful to Anthony Jorm, Helen Christensen, Kaarin Anstey, Peter Butterworth, Andrew McKinnon, Perminder Sachdev, and the PATH Team. The study was supported by NHMRC of Australia Grant No. 1002160, 1063907 and ARC Grant 130101705. This research was partly undertaken on the National Computational Infrastructure (NCI) facility in Canberra, Australia, which is supported by the Australian Commonwealth Government.

Supplementary Materials

This image depicts an overview of the contents of the manuscript, presented in visual form. It outlines the challenge of assigning a number to a neural shape, the example being the human corpus callosum. It then outlines our benchmarking of a combination of elliptical Fourier and principal components analysis for the purposes of extracting valid, parsimonious shape information for use in further analyses.