Abstract

In disease-association studies using neuroimaging data, evaluating the biological or clinical significance of individual associations requires not only detection of disease-associated areas of the brain but also estimation of the magnitudes of the associations or effect sizes for individual brain areas. In this paper, we propose a model-based framework for voxel-based inferences under spatial dependency in neuroimaging data. Specifically, we employ hierarchical mixture models with a hidden Markov random field structure to incorporate the spatial dependency between voxels. A nonparametric specification is proposed for the effect size distribution to flexibly estimate the underlying effect size distribution. Simulation experiments demonstrate that compared with a naive estimation method, the proposed methods can substantially reduce the selection bias in the effect size estimates of the selected voxels with the greatest observed associations. An application to neuroimaging data from an Alzheimer’s disease study is provided.

1. Introduction

In disease-association studies using neuroimaging data, such as those related to brain magnetic resonance imaging (MRI), screening of disease-associated regions in the brain is a fundamental statistical task to understand the underlying mechanisms of disease and also to develop disease diagnostics. Such screening analysis typically involves detection of disease associations in the framework of hypothesis testing, followed by estimation of the magnitudes of the associations or their effect sizes to determine their biological or clinical significance.

Many statistical methods have been proposed to detect disease associations. In a cluster-level inference, groups of contiguous voxels whose association statistic values are above a certain threshold are defined and then associated with disease status [1, 2]. Another approach is to test every voxel individually, which takes into account the serious multiplicity problem of testing enormous numbers of voxels simultaneously. In this voxel-level inference, several model-based methods based on random field theory have been proposed. Smith and Fahrmeir proposed to use an Ising prior in a classical Markov random field to model the dependency among contiguous voxels [3]. More recently, Shu et al. [4] proposed to use hidden Markov random field modelling and developed a multiple testing procedure based on the local index of significance (LIS) proposed by Sun and Cai [5] in multiple testing under dependency. Brown et al. proposed to use a Gaussian random field with conditional autoregressive models [6]. With these voxel-level methods, contiguous voxels may be more prone to rejection than conventional, voxel-level multiple testing procedures. They may also facilitate the interpretation of significant voxels or regions in neuroimaging data, as in cluster-level inference, while circumventing the problems with that approach, including the arbitrariness of the threshold used in initial clustering and the lack of spatial specificity [1].

On the other hand, for the problem of estimating disease associations, traditional neuroimaging studies reported “naive” estimates, such as Cohen’s , for significant voxels. However, several authors have pointed out that such methods may suffer from overestimation, reflecting a selection bias for picking up voxels with the greatest effect sizes, possibly due to random errors [7, 8]. Reddan et al. recommended several ways to either avoid such bias, for instance by testing predefined regions of interest or integrating effects across multiple voxels into a particular model, or to adjust bias using independent samples [7]. However, in association analysis of neuroimaging data with spatial dependency, the estimation problem has not been well studied compared with the detection problem using multiple testing.

In this paper, we use empirical Bayes estimation and hierarchical modelling of summary statistics from the whole set of features to derive shrinkage estimation for individual features [9, 10] and adapt this method to the analysis of disease-association studies using neuroimaging data with spatial dependence. Specifically, we employ hierarchical mixture models with a hidden Markov random field structure to incorporate the spatial dependency between voxels. We assume a nonparametric distribution for the underlying distribution of voxel-specific effect sizes. With a generalized expectation-maximization (EM) algorithm, we can estimate all the parameters in the model, including the effect size distribution. We then obtain shrinkage estimates for individual voxels and also an estimate of the LIS for control of the false discovery rate (FDR) in the detection problem based on the fitted model.

With an appropriate effect size statistic and its asymptotic sampling distribution, our method is generally applicable to effect size estimations in many neuroimaging association studies where general linear models have been employed, such as those with functional/structural MRI (fMRI/sMRI), diffusion tensor imaging (DTI), and so forth. This paper is organized as follows. We provide the proposed method in Section 2. We describe simulation experiments to evaluate the performance of the proposed methods and an application to neuroimaging data from an Alzheimer’s disease study in Section 3. We discuss the details of the methods and results in Section 4. Finally, we conclude this paper in Section 5.

2. Materials and Methods

We propose an estimation method based on a hierarchical mixture model in which the underlying distribution of voxel-specific effect sizes is specified. We suppose a simple situation where diseased and normal control subjects are compared without any covariates (see Section 2.5 for incorporation of covariates). We introduce a binary disease status variable with a group label of either 1 or 2, for example, disease or normal. Let and be the numbers of diseased and normal control subjects, respectively, and be the total number of subjects. We suppose that spatial normalization [1] has been performed for each subject to adjust for differences in the size or shape of the observed image, and the image is divided into voxels by a three-dimensional grid. We also suppose a further normalization to ensure normality of the voxel-level intensity values across subjects within each group. Let be the set of all voxels in the neuroimaging data, and denotes the number of voxels in . In order to measure the association of the observed intensity values from individual voxels with the disease status variable, we define the standardized mean difference between the two groups. Specifically, for voxel , where and are the means of voxel for groups 1 and 2, respectively, and is the common standard deviation for voxel across groups. As an estimate of , we use the following statistic: where and are sample means of voxel values in the two groups and is an estimator of the common within-group variance. This statistic is essentially a two-sample -statistic, apart from the sample size term. One may consider a calculation of from the -value provided by software packages such as Statistical Parametric Mapping (SPM, https://www.fil.ion.ucl.ac.uk/spm/). Let be the vector of for all voxels. Of note, the reason for using the standardized mean difference, rather than test statistics such as -statistics, is that it is a direct interpretation of the effect size of individual voxels with no dependency on the sample size.

2.1. Hierarchical Mixture Models in a Hidden Markov Random Field

We assume a hidden Markov random field model [4] for . Let be a set of latent variables, where if the voxel is null (i.e., no association with disease) and otherwise (i.e., association with disease). The dependence structure across contiguous voxels is modeled assuming that this latent variable is generated from the following Ising model with two parameters : where and is the normalizing constant. In the vector , the first component pertains to a summation over all pairs of contiguous voxels, , and the second component to a summation over all voxels, .

Given the latent status , we assume that the statistics s are mutually independent, such that

For the component , we define as the null density function, , and as the nonnull density function, . We assume the distribution of as the mixture of null and nonnull distributions,

Of note, this is an instance of the so-called “two-groups model” [11] when the hidden Markov random field model is introduced. When the sample size is sufficiently large, it is reasonable to employ asymptotic normality for . For the null voxels, we assume to be a normal distribution, , where . For the nonnull voxels, we assume the hierarchical structure with two levels:

At the first level, the conditional distribution of for effect size is normal with mean and variance , again based on asymptotic normality for . At the second level, the voxel-specific effect size has an effect size distribution . From this hierarchical structure, we can express the nonnull density function as the marginal density function, , where is a conditional density function in the first level of Equation (5). Note that Equation (5) is the Brown-Stein model for estimating effect sizes [9, 12, 13].

If the sample size is not large enough, as occurs in many exploratory neuroimaging studies, it is reasonable to use the -distribution rather than the normal distribution. In this case, the statistic follows a -distribution with degrees of freedom for the null voxels, and we consider the following hierarchical model for the nonnull voxels: where represents a noncentral -distribution with degrees of freedom and noncentrality parameter .

2.2. Nonparametric Effect Size Distribution

We can consider both parametric and nonparametric specifications for the effect size distribution . However, the information regarding the parametric form of is generally limited because of the exploratory nature of disease-association studies that observe neuroimaging data with a large number of voxels (see Section 4 for discussion of the technical difficulty of specifying parametric mixture models for the effect size distribution). We therefore consider a nonparametric specification and estimate it based on presumed parallel association structures across a large number of voxels. For this estimation, we propose to perform the smoothing-by-roughening method [14]; in the same way, this method has been used for analyzing genomic data [15]. We approximate that has discrete probabilities at each mass point , where is a sufficiently large number of mass points and discrete probability satisfies . In practice, we set , following the guideline by Shen and Louis [14]. The mass point may be specified to cover a possible range of and for any .

When asymptotic normality is assumed, then based on Equations (5) and (7), the marginal nonnull distribution of , , can be expressed as a mixture of normal distributions, where represents the density function of normal distribution, . If the sample size is not large enough, the noncentral -distribution, , is substituted for the normal distribution, , in Equation (8), where represents the density function of the noncentral -distribution . In this case, the marginal nonnull distribution of , , is a mixture of noncentral -distributions.

The parameter set specifying the above hierarchical model is . We use the vector to represent the set of all parameters, including those in the Ising model. The parameter set is estimated by a generalized EM algorithm. Details of the algorithm are provided in Appendix A. Another approach to estimating the effect size distribution is a nonparametric Bayes estimation with a Dirichlet process (DP) prior [16]. Assuming a DP prior for the discretized version of , Equation (5) forms a DP mixture model that is equivalent to an infinite mixture model. It is pointed out that the estimated nonparametric distribution based on the smoothing-by-roughening algorithm with initial distribution behaves similarly to the one based on DP hyper-prior with mean , where the number of repetitions in the smoothing-by-roughening algorithm is related to prior precision of the DP [15].

2.3. FDR Estimation

In our framework, multiple testing methods can be derived based on the estimated model. We employ the LIS [5] to estimate the FDR to incorporate the spatial dependency between voxels. As a function of the parameter , the LIS is defined as the posterior probability that the voxel is null given all s,

Note that the LIS corresponds to the local FDR [17] when independence across voxels is assumed. Multiple testing is based on the LIS. Let represent a series of ordered LISs across voxels, and let be the null hypothesis (representing no association with disease) on the voxel corresponding to . A LIS-based, oracle LIS procedure was proposed for minimizing the false-negative rate subject to a constraint on FDR under hidden Markov chain dependence [5]; this procedure was then extended under a hidden Markov random field for analyzing neuroimaging data [4]. The oracle LIS procedure determines rejected voxels using the following rule:

This procedure controls the FDR level at . Since the parameter is unknown, a plug-in estimator, , is used. This probability, , can be calculated using the Gibbs sampler from the distribution of [4],

In applying the aforementioned FDR estimation procedure to neuroimaging data, it is generally reasonable to divide all voxels into neurologically defined subregions with distinct functional or structural features, such as Automated Anatomical Labeling (AAL, [18]) (thus resulting in plausible heterogeneity in effect size and dependence structure across subregions). We apply the pooled LIS [19] and fit the model separately for each subregion, thereby obtaining LIS values within subregion. We then determine rejected voxels by Equation (10), where is the ordered LIS for a pool of all subregions.

2.4. Effect Size Estimation

As mentioned in Section 1, estimation of effect sizes for selected voxels is important for evaluating their biological or clinical significance. Of note, the naive estimator given by generally overestimates the true effect size (absolute ) for the selected “top” voxels with the highest statistical significance. This estimation bias reflects the selection bias, caused by random variation, that is inherent in selecting voxels with the largest absolute . We consider shrinkage estimation for selected voxels. Specifically, we extend posterior indices originally developed in the case of independent s [10] to the case of dependent s.

The posterior mean of for a nonnull voxel is given by where is the posterior probability, when the normal approximation is employed for the sampling distribution of . Since the effect size under the null hypothesis is zero, the posterior mean of the effect size of the voxel is given by

Based on these formulas, we can then estimate the effect size using the following posterior indices: where and are plug-in estimators, and . Based on Equations (12) and (13), we have the following form for the estimator :

This posterior mean is the shrinkage estimate of effect sizes, given that the voxel is nonnull. The probability depends on the multiple testing index , which incorporates spatial dependency and is calculated using the Gibbs sampler from the distribution of , presented in Equation (11). These two posterior indices adjust for two different errors. The first is overestimation of effect sizes, and the shrinkage estimate is used to adjust this bias. The second is incorrect selection of the null voxel, and is used to correct for this error. Again, if the sample size is not large enough, the -distribution is substituted for the normal distribution .

2.5. Incorporating Additional Covariates

We shall now address adjustment for additional subject-level covariates (other than the disease status) by employing general linear models. For each voxel, we first standardize all the intensity values across subjects based on the common within-group variance () such that the within-group variance equal to . Let be the standardized intensity value of voxel on subject (. We then assume a general linear model for as the observed intensity values for voxel , where is the binary variable on disease status, represents the additional covariates on subject , and is an error term. As an estimate of the effect size for voxel (with adjustment for the additional covariates), we use with the variance (in the first level of the hierarchical model). When (no additional covariates), may reduce to in Equation (1). We approximate that the distribution of is normal, , where and , and represents the (2, 2) entry of the inverse matrix . If the sample size is not large enough, we assume where and .

3. Results

3.1. Simulation Experiments

We conducted simulation experiments to evaluate the performance of effect size estimation in the proposed method. We simulated the values of the summary statistic according to the hierarchical mixture model in a hidden Markov random field, as given in Section 2.1. With this simulation, we supposed implementation of appropriate preprocessing normalization procedures for various neuroimaging analysis platforms and devices to obtain normally distributed intensity data across subjects for individual voxels. We considered a simple situation where disease and normal control subjects were compared with no additional covariates. The numbers of disease and normal control subjects, and , were set as . We specified the total number of subjects as 50, 100, or 200. Further, we specified the number of voxels as , which was the number of voxels per subregion defined based on brain parcellation in effect size estimation within subregion (see the application in Section 3.2). We generated the true latent variables from an Ising model with parameter values . We considered that the parameter values represented weak, intermediate, and strong degrees of dependency across voxels, respectively. Another parameter, , was determined such that the proportion of disease-associated voxels accounted for 10%, 20%, and 50% of all the voxels. When (i.e., the voxel was not associated with the disease status), the true effect size was set as ; otherwise, the true effect size was set as and generated from . Here, it is reasonable to assume positive effects only (i.e., one-sided detection) when studying the loss of neurological function after disease onset. The statistics were generated from a -distribution, .

For simulated data for voxels, we applied a counterpart of the proposed estimation method with normal approximation for the sampling distribution of (given the true effect size for voxel ), and also a method assuming a -distribution without normal approximation for the sampling distribution of (see Section 2). To reduce the computational burden when performing the proposed methods, we assumed that the parameters in the Ising model were constant. We ascertained similar simulation results for a small number of simulation repetitions when the parameters in the Ising model were estimated (results not shown). Following the guideline on the smoothing-by-roughening method [14], we used in these simulation experiments. We also ascertained similar results in estimating effect sizes of individual voxels when we used a smaller number (results not shown), indicating that the estimation is relatively insensitive to the selection of .

In evaluating the proposed method’s performance regarding effect size estimation, estimation biases for voxels with the greatest statistical significance (i.e., greatest values of ) were compared between the naive estimator and the proposed estimators. We conducted 100 simulations for each configuration of the parameter values in the Ising model and the total sample size. Figure 1 plots average bias values, each defined as the estimate minus the true value of effect size, over simulations at each voxel ranking for the naive estimator and the two counterparts of the proposed posterior mean in Equation (15), for the case in which the proportion of disease-associated voxels was 20% of all the voxels. Note that the top-ranked voxels differed across the 100 simulated datasets, but the three estimates pertained to the same voxels (based on the ranking based on Ys) for each simulated dataset. We also note that we had similar results for the other proportions of disease-associated voxels, i.e., 10% and 50% (see Appendix B).

From Figure 1, we can see that naive estimators suffered from serious overestimation. The proposed estimators were generally less biased. Moreover, we can see that the counterpart of the proposed method, based on a -distribution, generally gave less biased estimates for compared with the method based on normal distribution.

We also evaluated the performance in effect size estimation for two scenarios where the model was misspecified. Specifically, for Scenario 1, the true latent variables were generated independently across voxels as in Brown et al. [6], but the true effect sizes were smoothed with a Gaussian kernel after initial effect sizes were independently generated from across voxels. For Scenario 2, the true effect sizes were smoothed with a Gaussian kernel as in Scenario 1, but the true latent variables were generated from an Ising model to reflect spatial dependency. We ascertained similar performance in effect size estimation for these two scenarios where the model was misspecified.

3.2. Application

Alzheimer’s disease (AD) is one of the most common neurodegenerative disorders responsible for dementia with brain atrophy. We illustrated our method using a dataset on T1-weighted MRI images from the Open Access Series of Imaging Studies (OASIS), including longitudinal MRI measurements from subjects aged to years (website: https://www.oasis-brains.org/; dataset: “OASIS-2”) [20]. Each subject underwent MRI scans using the same scanner with identical sequences at two or more visits with intervals of at least one year. At each subject visit, three or four individual T1-weighted MRI images were obtained during a single imaging session, and the Clinical Dementia Rating (CDR) scale was administered. Here, we evaluated whether assessment of brain subregions at the first visit (baseline) could be used for early diagnosis of AD, by associating the baseline MRI measurements with the conversion from mild cognitive impairment (MCI) at baseline to AD at the second visit, where MCI was defined as and AD was defined as CDR ≥1. Specifically, in the original dataset, we identified MCI subjects (with CDR ) at baseline; of those 51, at the second visit there were nonconverters with and converters with CDR ≥1. Of note, converters were diagnosed as at the second visit within 2 years after the baseline visit. We thus compared baseline MRI data between the nonconverter and converter groups.

The baseline MRI data were obtained as follows. In order to make the subject-specific MRI data comparable in assessing brain atrophy at each coordinate across subjects, we utilized the SPM software (https://www.fil.ion.ucl.ac.uk/spm/) to obtain a voxel image grid with 2-mm cubic voxels for each subject. Specifically, three or four individual scan images were obtained during single imaging sessions at baseline for each subject and were then coregistered (to make them comparable across each subject’s scan images), and image intensity values at respective coordinates were averaged across scan images. The software was then used to achieve the following: segmenting the images into different tissue classes, coregistration of segmented gray and white matter (to make the averaged images comparable among subjects) using the algorithm Diffeomorphic Anatomical Registration using Exponentiated Lie algebra (DARTEL, [21]), normalization to a standard brain space (MNI-space, developed by Montreal Neurological Institute), modulation of the transformation of intensity values of gray and white matter images into the tissue volume for each coordinate, and smoothing across contiguous voxels based on an 8-mm cube of full-width at half maximum of the Gaussian blurring kernel. After the processing by SPM, gray matter intensity normalization was performed based on white matter intensity using R package WhiteStripe [22] to obtain comparable images across subjects. See Appendix F for more details of the aforementioned processes used to transform the original raw data to normalized data eligible for association analysis using the proposed method.

In the association analysis after the preprocessing of MRI data, the summary statistic in Section 2.5 was calculated from a -statistic for testing in the general linear model in Equation (17) with the gray matter intensity as the dependent variable and sex, age, and total intracranial volume as covariates. Owing to plausible heterogeneity in voxel intensity across brain regions, we divided the whole brain image into 116 subregions based on the AAL, and fit the model for each subregion separately. Of note, we can consider brain subregions other than those based on AAL. We then obtained the effect size estimate in Equation (15) and the LIS statistic in Section 2.4 for individual voxels based on the estimated model within each subregion. We used as the number of mass points used to estimate the effect size distribution . We also used a smaller number, , for some subregions with small sizes, but obtained similar results for and . We detected disease-associated voxels at by applying the pooled LIS procedure [19], where all the LIS values were pooled across subregions and ordered to determine rejection of voxels based on the criterion in Equation (10).

Since the total sample size was relatively small, we provide the estimation results based on the proposed method with -distribution for the sampling distribution of (see Appendix D for results based on the proposed method with normal sampling distribution). Figures 2(a) and 2(b) display significant voxels at by the pooled LIS procedure and all positive effect size estimates in Equation (15), based on the region-specific estimated models. We note that there were few voxels with negative effects; this is reasonable because brain atrophy should be linked to positive effects. In comparison with Figures 2(a) and 2(b) on effect size estimation apparently provides more information about the variation in the strength in disease association. As a reference, we also fit the counterpart of the proposed method based on normal distribution, but similar results were obtained (Appendix D).

For each subregion, we then calculated average effect sizes for significant voxels based on the proposed method with -distribution. Table 1 shows subregions with the greatest average effect sizes. As expected, the effect size estimates based on proposed method were generally smaller than those based on the naive estimation method for top voxels. See Appendix E for the differences in effect size estimates for top voxels within subregion between the proposed and naive methods. The top subregion, corresponding to the right middle temporal pole (TPOmid.R), has been reported by a connectivity analysis to be a region in which converters exhibited a decreased short-range degree of functional connectivity [23]. The other regions have already been associated with conversion to Alzheimer’s disease. For example, the left medial occipital lobe including the left cuneus (CUN.L) has been reported to be associated with MCI conversion [24], and the fusiform gyrus (including FFG.R) and parahippocampal gyrus (including PHG.R) have been reported as the regions with reduced volume in converters [25]. The right anterior portion of the parahippocampal gyrus (part of PHG.R) and left precuneus (PCUN.L) have been used to predict conversion [26]. The amygdala (including AMYG.R) has been used as a predictor of conversion from MCI to AD in many studies [2729]. The middle and inferior temporal gyri (including MTG.R and ITG.R) have been reported as the regions with reduced volume in converters [30]. Hypometabolism in the inferior parietal lobe (including SMG.R) has been used as a predictor of cognitive decline from MCI to AD dementia [31]. Although the right superior temporal pole (TPOsup.R) has not been examined in association studies based on the AAL, the temporal pole has been reported to be associated with disease conversion [32].

4. Discussion

This research was motivated by the growing recognition of the importance of effect size estimation for detected brain areas in disease-association studies using neuroimaging data [7, 8]. In order to permit flexible modelling of effect size distribution across a large number of voxels, while also incorporating the inherent spatial structure among voxels in neuroimaging data, we have integrated the frameworks of semiparametric hierarchical mixture modelling and hidden Markov random field modelling. The integrated framework allows for more accurate effect size estimation for individual voxels and also facilitates the accurate estimation of false discovery rates when detecting disease-associated voxels through multiple testing. With this framework, we could assess both voxel-level effect sizes and false discovery rates based on the integrated model without needing additional independent datasets. As shown in Figure 2(b), voxel-level effect size estimates can provide detailed and unbiased information about the association between detected brain areas and the disease, which may be helpful for biological or clinical analysis of the identified areas. We stress that the effect size index in Equation (1) allows for evaluation without dependency on sample size. This feature may be particularly useful for comparing effect size estimates across different studies with distinct sample sizes. Note that our proposed framework is generally applicable to many neuroimaging analyses where general linear models have been employed.

Although we have supposed a particular effect size statistic, i.e., the standardized mean difference between two groups as in Equation (1), and its sampling distributions, i.e., the normal or -distributions as in Equations (5) and (6), we can consider another effect size statistic and its sampling distribution. With specification of the appropriate effect size statistic and its sampling distribution, our method is widely applicable to many neuroimaging association studies where general linear models have been employed, such as those with fMRI/sMRI, DTI, and so forth. Related to this point, we can accommodate unequal variances between diseased and healthy brain images, rather than equal variance represented in Equation (1). Specifically, we may define the fold change, , as the effect size estimate, and assume asymptotic normality with fixed variances specified using reasonable estimators of the group-specific variances, although in our original formulation equal variance could be achieved by an adjustment for appropriate covariates in the framework of general linear models (see Section 2.5). Similarly, in fMRI analyses, an absolute effect size such as percent signal change can be evaluated, and asymptotic normality is assumed for the sampling distribution (see Desmond and Glover [33] for the specification of the asymptotic variance).

We have proposed two counterparts of the proposed method: one uses normal approximation, and the other is based on -distribution for the sampling distribution of the voxel-level summary statistic (or ), for both null and nonnull voxels (see Section 2.1). Our simulation experiments demonstrated that the proposed method with normal approximation could substantially overestimate voxel-level effect sizes when the sample size was small (), due to the erroneous assumption of a smaller dispersion of the sampling distribution of the statistic (or ) for both null and nonnull voxels, such that greater mass probabilities would be assigned for large effect sizes in estimating the effect size distribution . However, this problem disappears as the sample size becomes large, as demonstrated in our simulations. One advantage of the proposed method with normal approximation is shorter computational time for model estimation, compared with the counterpart with -distribution and heavier tails. We recommend using the proposed method with normal approximation if the sample size is sufficiently large (say, ); otherwise, use the its counterpart with -distribution.

As for the specification of the null distribution in Equation (4), we have specified the theoretical null, represented by or central -distribution, with the Ising model to incorporate spatial dependency in the association status across voxels. To accommodate residual dependency, we could assume the empirical null, say , and estimate the null parameters using the central matching method that fits an estimated curve for the frequency distribution of , such that we obtain an estimate [34]. However, for many neuroimaging data, the central peak may not pertain to a “null” distribution, rather a “nonnull” distribution, because moderate to large nonnull effects can dominate over small null effects, especially when the estimation is performed within subregion, as seen in our application example in Section 3.2.

With respect to specification of the effect size distribution , we have employed a flexible, nonparametric specification because the information about the distributional form of is generally limited in exploratory disease-association studies. Other flexible specifications may include the use of a parametric effect size distribution with several components, such as finite normal mixture models. When this type of model is assumed, the marginal distribution of may also have a finite normal mixture form when the sampling distribution of is normal, as in Equation (5). In this case, the model parameters can be estimated using the method described by Shu et al. [4], where a penalized likelihood is used to avoid an unbounded likelihood function (or nonidentifiability of the variances of the individual normal components) and Bayesian information criteria are used for selecting the number of components. However, a fundamental problem with this approach is that it lacks a natural constraint preventing the variance of the particular normal component in the marginal distribution of from becoming no smaller than the variance of the sampling distribution of (i.e., in Equation (5)). By contrast, the nonparametric specification incorporates this constraint in principle; each of a large number of mass points corresponds to a “component”, as seen in Equation (8), and the variance of the marginal distribution corresponding to each component is specified as the variance of the sampling distribution (). In addition, the nonparametric specification does not need a penalized likelihood maximization or repeated model fitting to select the number of components based on a model selection criterion, and thus, the computational burden is much lower.

Our method with a nonparametric effect size distribution, in principle, can capture any forms of the effect size distribution, and voxel-level effect sizes will be estimated based on the fitted effect size distribution. In practice, however, it is reasonable to consider estimation within subregions (e.g., those based on the AAL in Section 3.2) to take account of a large heterogeneity in the effect size distribution across subregions or to avoid influence of the heterogeneity on the estimation of voxel-level effect sizes in a particular subregion. Although our model could be extended to incorporate the heterogeneity, e.g., by introducing a hidden structure on the effect size distribution across subregions, estimation results may become difficult to interpret. We therefore simply recommend subregion analysis based on biologically relevant and interpretable brain parcellations in which effect sizes within subregion are deemed relatively homogeneous.

One inherent feature of the Ising model is that there is a critical value for the spatial interaction term , beyond which the model has a so-called phase transition, in which almost all binary (null or nonnull) indicators will have the same value. Thus, the algorithm for estimating does not converge, while the parameters in the hierarchical mixture model converge since the plug-in estimate assumes values close to 0 or 1 in such a situation. In implementing our algorithm, for the samples of under candidate new values of , we reject the values of if all the samples of are equal. Details of the algorithm and its implementation, including specification of the number of iterations, are provided in Appendix A.

It is interesting to discuss different approaches to modelling the association status (null/nonnull) and effect size distribution. Brown et al. [6] considered a parametric model where the association status and effect size follow a Bernoulli distribution and a conditional normal distribution, respectively, independently across voxels, but the mean of the conditional distribution is a weighted mean or smoothed across adjacent voxels, like the misspecified model investigated in our simulation (see Appendix C). On the other hand, our proposed model incorporates spatial dependency in the association status, but not the effect size, using the Ising model. Further, for effect sizes, a nonparametric marginal distribution is specified as in Equations (4) or (5). Even under the absence of the specification of dependence in effect sizes across voxels, our method worked well under various simulation models in Section 3.1. This could be explained by the feature of our method that it can yield similar effect size estimates for similar values of the observed association statistic from relatively adjacent voxels. However, integration of different modelling approaches for more efficient estimation is an interesting area for future study.

Lastly, another important aspect of the proposed framework for disease-association studies with neuroimaging data is that it can provide a flexible statistical model for the distribution of all neuroimaging data with a large number of voxels. Based on such a whole-brain, voxel-based model, it is appropriate to make a formal inference for a particular group of brain areas or contiguous voxels. In addition, power and sample size calculations of disease-association studies involving neuroimaging are another important direction based on whole-brain modelling.

5. Conclusions

The proposed method allows for accurate estimation of voxel-level effect sizes, as well as detection of significant voxels with disease association, based on the flexible, hierarchical semiparametric model incorporating spatial dependency across voxels. Our method can be generally applicable for many neuroimaging disease-association studies where general linear models can be assumed for voxel-level intensity values.

Data Availability

This research uses a publicly available dataset “OASIS-2: Longitudinal MRI Data in Nondemented and Demented Older Adults” available at: https://www.oasis-brains.org/. The codes used in this research are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by a grant-in-aid for Scientific Research (16H06299) and JST-CREST (JPMJCR1412) from the Ministry of Education, Culture, Sports, Science and Technology of Japan. For the Alzheimer’s disease data in Section 3.2, we appreciate the Open Access Series of Imaging Studies (OASIS) supported under grants P50 AG05681, P01 AG03991, R01 AG021910, P50 MH071616, U24 RR021382, and R01 MH56584.

Supplementary Materials

Appendix A (referenced in Section 2.2) shows details of generalized EM algorithm for parameter estimation. Appendices B and C (referenced in Section 3.1) show the simulation results for other simulation settings. Appendices D and E (referenced in Section 3.2) show the results of the proposed method with normal approximation. Appendix F (referenced in Section 3.2) shows the details of the preprocess conducted in application of proposed method. (Supplementary Materials)