Abstract

In structural earthquake engineering, a single parameter is often insufficient to depict the severity of ground motions, so multiple parameters are needed. The correlation among these parameters is therefore an important issue. The conventional approach to developing the correlation is based on regression analysis, supplemented by the simple pair copula approaches proposed in recent years. In this study, the vine copula technique is introduced for the first time to develop an empirical model for the multivariate dependence of pseudospectral accelerations (PSAs), which are the most commonly used earthquake ground motion parameters. This approach not only offers a more flexible way of describing nonlinear dependence among multivariate PSAs, separately from the marginal distribution functions, but also highlights the extreme (tail) dependence. The results can be conveniently applied in ground motion selection and in seismic risk and loss assessment based on multivariate parameters.

1. Introduction

In structural earthquake engineering, a single ground motion parameter (GMP) is often insufficient to characterize the severity of earthquake ground motions, so multiple parameters are needed. Consequently, it is critical to evaluate the correlation among multiple GMPs when they are used to select ground motions [1, 2] or to calculate aggregated seismic losses of distributed infrastructures and portfolios [3, 4]. Efforts have been made to study this correlation during the past few years [5–7]. As pseudospectral accelerations (PSAs) play an important role in the seismic design of structures, their correlation at different vibration periods has been widely investigated [8, 9].

The correlation is computed from the residuals of ground motion prediction equations (GMPEs) (e.g., [10–13]) derived from a large number of ground motion records. Conventionally, the correlation model of earthquake parameters (such as PSAs at various vibration periods) is developed by regression analysis (such as [6–8]) on Pearson product-moment correlation coefficients, which are derived from the residuals of the GMPEs. In recent years, copula techniques have been increasingly applied in engineering [14–18] because of their advantages in probabilistic analysis [19, 20]. Goda and Atkinson [21] introduced a simple bivariate copula technique to model the PSA interperiod dependence. Their study demonstrates that PSAs are marginally lognormally distributed and validates that the conventional two-step approach is appropriate for developing the correlation model. However, these techniques can describe only a linear correlation and take only a bivariate interperiod dependence of parameters into account. In addition, the study of Weatherill et al. [3] shows that the interperiod dependence of PSAs results in a larger difference in losses at lower annual probabilities of exceedance. Given that the detailed dependence has a significant impact on low-probability risk, the detailed dependence of PSAs should be highlighted in risk assessment.

Multivariate elliptical copulas, such as the normal copula and the t copula, have been used for modeling multivariate dependence [22–24], including for earthquake GMPs [21, 25]. However, the multivariate normal copula is unable to capture extreme dependence because it is asymptotically independent in the tail region. Meanwhile, the multivariate t copula has been criticized for using only a single parameter, i.e., the degrees of freedom, to determine tail dependence. Thus, it has limited ability to describe complicated tail dependence in the multivariate context. In contrast, the vine copula approach, which is based on a decomposition into different bivariate copulas, can perform better in high-dimensional cases since it can embrace heterogeneous dependence structures among variables and can use a series of pair copulas to capture a complex relationship [26–29].

Therefore, this study aims to develop a multivariate joint probability function of PSAs using the vine copula technique. Instead of using pairwise interperiod dependence, we develop a multivariate dependence structure among PSAs at different periods and also investigate the tail dependence. We first introduce the vine copula. The multivariate dependence structure of PSAs is then derived from a ground motion record database. Since each ground motion record has two orthogonal horizontal components, we use the geometric mean of the two components at different periods to calculate the interperiod dependence.

2. Vine Copula Dependence

In this section, we discuss the vine copula in detail. We review the basic concepts of the vine copula methodology, namely, the definitions, the properties (in particular, tail dependence), and some widely used bivariate copula families that are nested in the trees of a vine copula.

2.1. Vine Copula Construction

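A $d$-dimensional copula is a cumulative distribution function on $[0, 1]^d$ with uniform margins on [0, 1]. Let $F$ be a $d$-dimensional distribution function with marginal distributions $F_1, F_2, \ldots, F_d$. According to Sklar’s theorem [30], there exists a $d$-dimensional copula $C$ such that, for all $(x_1, \ldots, x_d) \in \mathbb{R}^d$,

$$F(x_1, \ldots, x_d) = C\big(F_1(x_1), \ldots, F_d(x_d)\big). \tag{1}$$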

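If $F_1, \ldots, F_d$ are continuous, the copula $C$ is unique. In the continuous case, taking derivatives on both sides of equation (1) gives the joint density representation of $F$ as follows:

$$f(x_1, \ldots, x_d) = c\big(F_1(x_1), \ldots, F_d(x_d)\big) \prod_{i=1}^{d} f_i(x_i), \tag{2}$$

where $c$ is the copula density and $f_i$ is the density of the $i$-th margin.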

Joe [26] noticed that a $d$-dimensional copula can be represented by different bivariate copulas through a decomposition method. Later studies [28, 29] introduced the vine copula method on the basis of such decomposition approaches. This method builds multivariate copulas from the product of a series of simple pair copulas. Aas et al. [31] applied this pair copula construction (PCC) method to model the multivariate dependence structure of returns and provided a complete algorithm for model estimation and simulation.

The vine copula approach allows us to combine different families of bivariate copulas for different pairs of margins and for higher order dependencies. Fischer et al. [27] compared this approach with other methods and found that the vine copula method performed better in high-dimensional cases. Other studies [32–34] reached the same conclusion.

Through vine copula decomposition, a multivariate density can be represented as a product of pair copula densities and marginal densities. For this reason, appropriate bivariate copulas can capture the variant dependence relationship of different pairs of margins instead of using only one multidimensional copula to describe the whole dependence structure of the multivariate density.

In general, a $d$-dimensional vine structure is represented by $d-1$ trees. The $j$-th tree $T_j$ has $d+1-j$ nodes and $d-j$ edges, with each edge corresponding to a pair copula density. The edges in $T_j$ are the nodes in $T_{j+1}$. For a $d$-dimensional multivariate distribution, the decomposition is not unique. Bedford and Cooke [28] presented the multivariate density in terms of a regular vine, termed an R-vine, as follows:

$$f(x_1, \ldots, x_d) = \prod_{k=1}^{d} f_k(x_k) \prod_{i=1}^{d-1} \prod_{e \in E_i} c_{j(e),k(e) \mid D(e)}\big(F(x_{j(e)} \mid \mathbf{x}_{D(e)}),\, F(x_{k(e)} \mid \mathbf{x}_{D(e)})\big), \tag{3}$$

where $E_i$ is the set of edges associated with tree $T_i$; $e = \{j(e), k(e)\}$ is the edge in $E_i$ connecting the variables labeled $j(e)$ and $k(e)$; $\mathbf{x}_{D(e)}$ collects the variables labeled by $D(e)$; and $D(e)$ denotes the conditioning set for the edge $e$ (see Corollary 1 in the study of Bedford and Cooke [28] for the full representation and more details). This is a general representation, and it does not restrict the vine structure to a particular pattern. The selection of the vine structure depends on the observed data.

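Aas et al. [31] identified two types of vines with particular structures, namely, the D-vine and the C-vine. As shown in equation (4), each tree of a D-vine is a path, and each node is connected to no more than two other nodes in each tree. In contrast, a C-vine has a star structure: in each tree, there is one unique node connected to all other nodes (equation (5)):

$$f(x_1, \ldots, x_d) = \prod_{k=1}^{d} f_k(x_k) \prod_{j=1}^{d-1} \prod_{i=1}^{d-j} c_{i,i+j \mid i+1,\ldots,i+j-1}\big(F(x_i \mid x_{i+1}, \ldots, x_{i+j-1}),\, F(x_{i+j} \mid x_{i+1}, \ldots, x_{i+j-1})\big), \tag{4}$$

$$f(x_1, \ldots, x_d) = \prod_{k=1}^{d} f_k(x_k) \prod_{j=1}^{d-1} \prod_{i=1}^{d-j} c_{j,j+i \mid 1,\ldots,j-1}\big(F(x_j \mid x_1, \ldots, x_{j-1}),\, F(x_{j+i} \mid x_1, \ldots, x_{j-1})\big). \tag{5}$$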
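The path structure of a D-vine can be made concrete by listing its pair copulas tree by tree. The following is a minimal sketch, not the procedure used in this study; the function name `dvine_edges` and the period labels are illustrative assumptions.

```python
def dvine_edges(labels):
    """List the pair copulas of a D-vine, tree by tree.

    Each edge is (left, right, conditioning_set), standing for the
    pair copula c_{left, right | conditioning_set}.
    """
    d = len(labels)
    trees = []
    for j in range(1, d):              # tree index j = 1, ..., d-1
        edges = []
        for i in range(d - j):         # tree j has d-j edges
            left, right = labels[i], labels[i + j]
            cond = tuple(labels[i + 1:i + j])  # variables between the two ends
            edges.append((left, right, cond))
        trees.append(edges)
    return trees


# Hypothetical labels: the five vibration periods (in seconds) as strings.
periods = ["0.1", "0.2", "1.0", "2.0", "4.0"]
for j, edges in enumerate(dvine_edges(periods), start=1):
    print(f"Tree {j}: {edges}")
```

For five variables this gives four trees with 4, 3, 2, and 1 edges, i.e., ten pair copulas in total.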

It has been shown that the vine copula decomposition offers a great deal of flexibility in modeling a complex dependence structure, especially in relation to tail dependence, compared with traditional multivariate copulas (see [27, 35, 36] for details). In high-dimensional cases, various pairs of variables may exhibit heterogeneous dependence patterns that traditional multivariate copulas are unable to capture. Vine copulas can solve this problem and provide a good fit for high-dimensional data. We can fit the standardized residuals with an appropriate vine structure and select a suitable bivariate copula to describe the dependence pattern of each pair of margins.

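In this paper, we fit the margins of the PSA residuals with normal distributions and select the vine structure and the pair copula families using the Akaike information criterion. Finally, the parameters of the overall model are estimated using the maximum likelihood method:

$$\hat{\boldsymbol{\theta}} = \arg\max_{\boldsymbol{\theta}} \sum_{t=1}^{N} \ln c\big(u_{t,1}, \ldots, u_{t,d};\, \mathcal{V}, \mathcal{B}, \boldsymbol{\theta}\big), \tag{6}$$

where $c$ is the vine copula density evaluated at the $N$ transformed observations, and $\mathcal{V}$ denotes the vine structure, $\mathcal{B}$ is a collection of pair copula families, and $\boldsymbol{\theta}$ denotes the parameters of the copulas, respectively.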

2.2. Tail-Dependence Coefficients

Venter [37] investigated the tail concentration functions of different copula families and suggested selecting copulas for a given dataset according to its tail concentration characteristics. Tail dependence is expressed as the conditional probability that one variable incurs a large loss (or gain), given that another variable also experiences a large loss (or gain). Consider two random variables $X_1$ and $X_2$ with continuous joint cumulative distribution function $F$, copula $C$, and margins $F_1$ and $F_2$. The lower tail-dependence coefficient is defined as

$$\lambda_L = \lim_{u \to 0^+} P\big(X_2 \le F_2^{-1}(u) \mid X_1 \le F_1^{-1}(u)\big). \tag{7}$$

The upper tail-dependence coefficient is defined as

$$\lambda_U = \lim_{u \to 1^-} P\big(X_2 > F_2^{-1}(u) \mid X_1 > F_1^{-1}(u)\big). \tag{8}$$

If $F_1$ and $F_2$ are continuous, $\lambda_L$ and $\lambda_U$ can be expressed in terms of the copula. For the lower tail-dependence coefficient, formula (7) can be rewritten as

$$\lambda_L = \lim_{u \to 0^+} \frac{C(u, u)}{u}. \tag{9}$$

Analogously, for the upper tail-dependence coefficient, we have

$$\lambda_U = \lim_{u \to 1^-} \frac{1 - 2u + C(u, u)}{1 - u}. \tag{10}$$

If $C$ is radially symmetric, $\lambda_L = \lambda_U$ (see [20] for the proof). Intuitively, if $\lambda_L$ and $\lambda_U$ exist and fall in $(0, 1]$, $X_1$ and $X_2$ show lower or upper tail dependence. If $\lambda_L$ and $\lambda_U$ are equal to 0, the two variables are said to be independent in the tails; hence, extreme events appear to occur independently. Different tail-dependent behavior can be described by choosing an appropriate copula model.
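In practice the limits in the copula form of the tail-dependence coefficients are not observable, so a common diagnostic is to evaluate the ratios $C(q, q)/q$ and $(1 - 2(1-q) + C(1-q, 1-q))/q$ at a small finite threshold $q$ using the empirical copula. The sketch below illustrates this idea under stated assumptions: the inputs are samples already transformed to [0, 1], and the function name and default threshold are hypothetical choices, not part of this study.

```python
def empirical_tail_dependence(u, v, q=0.05):
    """Finite-threshold estimates of (lambda_L, lambda_U).

    u, v : equal-length sequences of values in [0, 1]
    q    : small tail threshold (assumed value, to be tuned to the sample size)
    """
    n = len(u)
    # Empirical copula on the diagonal at q and at 1 - q.
    c_low = sum(1 for a, b in zip(u, v) if a <= q and b <= q) / n
    c_high = sum(1 for a, b in zip(u, v) if a <= 1 - q and b <= 1 - q) / n
    lam_lower = c_low / q                       # C(q, q) / q
    lam_upper = (1 - 2 * (1 - q) + c_high) / q  # (1 - 2u + C(u, u)) / (1 - u), u = 1 - q
    return lam_lower, lam_upper


# Perfectly dependent (comonotone) toy data: both estimates are close to 1.
grid = [i / 1000 for i in range(1, 1000)]
print(empirical_tail_dependence(grid, grid))
```

Such estimates are sensitive to the choice of `q` and are noisy for small samples, so they are better read as a diagnostic than as a substitute for a fitted copula model.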

In the aggregated seismic loss assessment of distributed infrastructures and portfolios, the difference between the losses computed with and without the dependence of PSAs tends to increase with decreasing annual probability of exceedance [3]. Hence, more attention should be paid to the tail dependence among PSAs when estimating losses for extreme events. Herein, we use the tail-dependence coefficient to measure the concordance between the extreme events of different random parameters.

2.3. Bivariate Copula Families

In this paper, we use the vine copula approach to measure the interperiod dependence structure of PSAs. The vine copula is a “pair copula construction” method; hence, we focus on the two-dimensional copula families that can be selected. The elliptical copulas, which are derived from elliptical distributions, are the most widely used in many research fields (see [21]).

2.3.1. Gaussian Copula

In the bivariate case, the Gaussian copula is defined by the following expression:

$$C(u_1, u_2; \rho) = \Phi_\rho\big(\Phi^{-1}(u_1), \Phi^{-1}(u_2)\big), \tag{11}$$

where $\Phi_\rho$ is the bivariate normal cumulative distribution function with linear correlation coefficient $\rho$, $\Phi$ is the standard normal cumulative distribution function, and $\Phi^{-1}$ is its inverse function. The bivariate Gaussian copula density is symmetric, so it has limited capability to capture skewness in the dependence structure. Far into the tail, extreme events tend to be independent even when a very high correlation is chosen. In fact, the Gaussian copula is asymptotically independent in the tail regions; more details can be found in [20].

2.3.2. t Copula

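The t copula corresponds to a Student t-distribution. It is defined as

$$C(u_1, u_2; \rho, \nu) = t_{\rho,\nu}\big(t_\nu^{-1}(u_1), t_\nu^{-1}(u_2)\big), \tag{12}$$

where $t_{\rho,\nu}$ is the cumulative distribution function of a two-dimensional t distribution, $\nu$ is the degrees of freedom, and $\rho$ is a measure of dependence. The t copula also has a symmetric shape: the upper and lower tail-dependence coefficients are identical and are completely determined by $\rho$ and $\nu$. As $\nu$ becomes large, the t copula converges to the Gaussian copula. The expression for $\lambda_L$ and $\lambda_U$ is as follows:

$$\lambda_L = \lambda_U = 2\, t_{\nu+1}\!\left(-\sqrt{\nu + 1}\, \sqrt{\frac{1 - \rho}{1 + \rho}}\right), \tag{13}$$

where $t_{\nu+1}$ is the cumulative distribution function of a univariate t distribution with $\nu + 1$ degrees of freedom [20].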

Archimedean copulas, defined by their generator functions, are also used intensively. Generally, a function $\varphi$ that is continuous, strictly decreasing, and convex can be considered as a generator function of an Archimedean copula. By definition, a $d$-dimensional Archimedean copula has the following expression:

$$C(u_1, \ldots, u_d) = \varphi^{-1}\big(\varphi(u_1) + \cdots + \varphi(u_d)\big). \tag{14}$$

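Different generator functions create different Archimedean copulas. More details about the generator function can be found in the studies of Joe [38] and Nelsen [39]. In the bivariate case, the copula function is defined by

$$C(u_1, u_2) = \varphi^{-1}\big(\varphi(u_1) + \varphi(u_2)\big), \tag{15}$$

where $\varphi$ is a function with $\varphi(1) = 0$ and $\lim_{t \to 0^+} \varphi(t) = \infty$.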

2.3.3. Frank Copula

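The Frank copula is defined by

$$C(u_1, u_2; \theta) = -\frac{1}{\theta} \ln\!\left(1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)}{e^{-\theta} - 1}\right), \quad \theta \neq 0. \tag{16}$$

The generator function is $\varphi(t) = -\ln\dfrac{e^{-\theta t} - 1}{e^{-\theta} - 1}$.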

Like the Gaussian copula, the Frank copula is radially symmetric and is not sensitive to the relationship between extreme values in either the upper or the lower tail. It is asymptotically independent in the tails, whereas it has strong dependence in the center of the distribution. The Frank copula therefore fails to capture tail-dependence behavior, and it is suitable only when the tail dependence of a given dataset is relatively weak. As $\theta \to 0$, the Frank copula reduces to the independence copula as a special case.
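The properties described above can be checked numerically. The sketch below assumes the standard one-parameter form of the Frank copula, $C(u_1, u_2; \theta) = -\theta^{-1} \ln\big(1 + (e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)/(e^{-\theta} - 1)\big)$; the function name and the test values are illustrative only.

```python
import math


def frank_copula(u, v, theta):
    """Frank copula CDF (standard form, assumed); theta near 0 gives independence."""
    if abs(theta) < 1e-10:
        return u * v  # limiting case: independence copula
    # expm1/log1p keep precision when theta is small.
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    den = math.expm1(-theta)
    return -math.log1p(num / den) / theta


# Near theta = 0 the copula is close to the product u * v.
print(frank_copula(0.3, 0.7, 1e-6), 0.3 * 0.7)

# The diagonal ratio C(q, q) / q shrinks with q, consistent with
# zero lower tail dependence even for fairly strong dependence.
for q in (1e-2, 1e-3, 1e-4):
    print(q, frank_copula(q, q, 10.0) / q)
```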

Joe [38] provided some examples of two-parameter bivariate copulas, such as the BB1, BB4, and BB8 copulas. Two-parameter copulas differ from the bivariate copula families mentioned above in that their two parameters give them high flexibility in modeling bivariate dependence structures, especially asymmetric tail-dependence behavior. More details about these copulas can be found in the studies of Joe et al. [40, 41].

2.3.4. Joe’s BB1 Copula

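The BB1 copula is defined as

$$C(u_1, u_2; \theta, \delta) = \left(1 + \left[(u_1^{-\theta} - 1)^{\delta} + (u_2^{-\theta} - 1)^{\delta}\right]^{1/\delta}\right)^{-1/\theta}, \tag{17}$$

where $\theta > 0$ and $\delta \ge 1$. The lower tail-dependence coefficient is $\lambda_L = 2^{-1/(\theta\delta)}$, and the upper tail-dependence coefficient is equal to $\lambda_U = 2 - 2^{1/\delta}$, which is independent of $\theta$. The concordance of the two variables increases as $\delta$ increases. The Gumbel copula is the limiting case of the BB1 copula as $\theta \to 0$. The BB1 copula reduces to the Clayton copula when $\delta$ is equal to 1 and to the independence copula as $\theta \to 0$ and $\delta = 1$.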
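As a numerical illustration of the asymmetric tail behavior of BB1, the sketch below assumes the standard BB1 form $C(u_1, u_2) = \big(1 + [(u_1^{-\theta} - 1)^\delta + (u_2^{-\theta} - 1)^\delta]^{1/\delta}\big)^{-1/\theta}$ with $\theta > 0$, $\delta \ge 1$, and the closed forms $\lambda_L = 2^{-1/(\theta\delta)}$ and $\lambda_U = 2 - 2^{1/\delta}$. The parameter values are arbitrary and are not the fitted values of this study.

```python
def bb1_cdf(u, v, theta, delta):
    """BB1 (Clayton-Gumbel) copula CDF, standard form (assumed)."""
    s = (u ** -theta - 1) ** delta + (v ** -theta - 1) ** delta
    return (1 + s ** (1 / delta)) ** (-1 / theta)


def bb1_tail_dependence(theta, delta):
    """Closed-form (lambda_L, lambda_U) of the BB1 copula."""
    return 2 ** (-1 / (theta * delta)), 2 - 2 ** (1 / delta)


theta, delta = 0.5, 1.5          # arbitrary illustrative parameters
lam_l, lam_u = bb1_tail_dependence(theta, delta)
q = 1e-6
print("lambda_L closed form:", lam_l)
print("C(q, q) / q at q=1e-6:", bb1_cdf(q, q, theta, delta) / q)
print("lambda_U closed form:", lam_u)
```

With $\delta = 1$ the upper coefficient vanishes and the lower one becomes $2^{-1/\theta}$, the Clayton value, which matches the special case noted above.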

2.3.5. Joe’s BB8 Copula

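The BB8 copula is defined as

$$C(u_1, u_2; \theta, \delta) = \frac{1}{\delta}\left(1 - \left[1 - \frac{\big(1 - (1 - \delta u_1)^{\theta}\big)\big(1 - (1 - \delta u_2)^{\theta}\big)}{1 - (1 - \delta)^{\theta}}\right]^{1/\theta}\right), \tag{18}$$

where $\theta \ge 1$ and $0 < \delta \le 1$. The independence copula is obtained as $\delta \to 0$ or $\theta = 1$. The BB8 copula reduces to the Joe copula when $\delta$ is equal to 1, while the Frank copula is obtained as $\theta \to \infty$ and $\delta \to 0$ with the product $\theta\delta$ held constant, and the single parameter of the Frank copula can be calculated as $\theta\delta$. The BB8 copula does not exhibit tail dependence except when $\delta$ is equal to 1.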

2.3.6. Tawn Copula

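The Tawn copula was introduced by Tawn [42] and can be regarded as an extension of the Gumbel copula with three parameters. It can be expressed by

$$C(u_1, u_2) = \exp\!\left\{\ln(u_1 u_2)\, A\!\left(\frac{\ln u_2}{\ln(u_1 u_2)}\right)\right\}, \quad A(t) = (1 - \psi_1)(1 - t) + (1 - \psi_2)\,t + \left[\big(\psi_1 (1 - t)\big)^{\theta} + (\psi_2 t)^{\theta}\right]^{1/\theta}, \tag{19}$$

where $\theta \ge 1$, $0 \le \psi_1 \le 1$, and $0 \le \psi_2 \le 1$. If $\psi_1 = \psi_2 = 1$, the Gumbel copula is obtained from the Tawn copula. $\psi_1$ and $\psi_2$ can be interpreted as skewness parameters. Hence, the Tawn copula can be divided into two types, each of which has one of the asymmetry parameters fixed to 1 so that the corresponding copula density is either left or right skewed (i.e., $\psi_1 = 1$ or $\psi_2 = 1$). The upper tail-dependence coefficient has the following expression:

$$\lambda_U = \psi_1 + \psi_2 - \left(\psi_1^{\theta} + \psi_2^{\theta}\right)^{1/\theta}. \tag{20}$$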
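The role of the two skewness parameters can be illustrated through the Pickands dependence function. The sketch below assumes the asymmetric logistic form $A(t) = (1 - \psi_1)(1 - t) + (1 - \psi_2)t + [(\psi_1(1 - t))^\theta + (\psi_2 t)^\theta]^{1/\theta}$ and uses the general extreme-value relation $\lambda_U = 2 - 2A(1/2)$; the symbol names and parameter values are illustrative assumptions rather than values fitted in this study.

```python
def tawn_pickands(t, theta, psi1, psi2):
    """Pickands dependence function of the Tawn copula (assumed parametrization)."""
    return ((1 - psi1) * (1 - t) + (1 - psi2) * t
            + ((psi1 * (1 - t)) ** theta + (psi2 * t) ** theta) ** (1 / theta))


def tawn_upper_tail(theta, psi1, psi2):
    """Upper tail-dependence coefficient in closed form."""
    return psi1 + psi2 - (psi1 ** theta + psi2 ** theta) ** (1 / theta)


# Type with psi1 fixed to 1 (arbitrary illustrative values).
theta, psi1, psi2 = 3.0, 1.0, 0.6
print(tawn_upper_tail(theta, psi1, psi2))
print(2 - 2 * tawn_pickands(0.5, theta, psi1, psi2))  # same value via A(1/2)

# Gumbel special case psi1 = psi2 = 1: lambda_U = 2 - 2**(1/theta).
print(tawn_upper_tail(2.0, 1.0, 1.0), 2 - 2 ** 0.5)
```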

3. Multivariate Dependence Structure of Ground Motion Parameters

3.1. Ground Motion Data and Residuals

In this section, we investigate the multivariate dependence of pseudospectral accelerations (PSAs) using the vine copula technique. The residuals of PSAs calculated with equation (21) are used to model the multivariate dependence structure based on the vine copula:

$$\varepsilon(T) = \ln Y(T) - \mu_{\ln Y}(M, R, \Theta, T), \tag{21}$$

where $\varepsilon(T)$ represents the residual of the PSA at vibration period $T$; $Y(T)$ is the observed PSA; and $\mu_{\ln Y}$ is the predicted median PSA in logarithmic space, calculated through ground motion prediction equations (GMPEs) [10] as a function of magnitude $M$, fault-to-site distance $R$, and other sets of variables $\Theta$. For illustration, this study focuses only on the 5%-damped PSAs at vibration periods of 0.1, 0.2, 1.0, 2.0, and 4.0 s, but the conclusions drawn in the paper can also be applied to PSAs at other vibration periods. We use the same set of ground motion records as that used by Campbell and Bozorgnia [10] to develop their GMPEs. It consists of 1550 pairs of accelerogram components, which are used to calculate $Y(T)$. $\mu_{\ln Y}$ is computed using the GMPEs proposed by Campbell and Bozorgnia [10].

3.2. Vine Copula-Based Multivariate Dependence Structure
3.2.1. Multivariate Copula Calibration

The lognormality of PSAs has been well documented in the literature (see [21] for more details).

First, we use the empirical cumulative distribution function to transform the residual values of PSAs into the so-called pseudo-observations in the domain [0, 1] by the following formula:

$$u_{i,T} = \frac{1}{N + 1} \sum_{j=1}^{N} \mathbf{1}\big(\varepsilon_{j,T} \le \varepsilon_{i,T}\big), \tag{22}$$

where $\mathbf{1}(\cdot)$ denotes the indicator function, which takes the value 1 if $\varepsilon_{j,T} \le \varepsilon_{i,T}$ and the value 0 otherwise; $N$ is the number of records; and $\varepsilon_{0.1}$, $\varepsilon_{0.2}$, $\varepsilon_{1.0}$, $\varepsilon_{2.0}$, and $\varepsilon_{4.0}$ denote the residuals of PSAs at vibration periods of 0.1, 0.2, 1.0, 2.0, and 4.0 s, respectively. The residuals are thus transformed to uniform data in the domain [0, 1], and the pure dependence structure among the residuals at the five vibration periods can be captured by the copula model using these uniform data, eliminating the effect of the margins at each period. Figure 1 displays the bivariate dependence features for each pair of $u_{0.1}$, $u_{0.2}$, $u_{1.0}$, $u_{2.0}$, and $u_{4.0}$. The histograms on the diagonal panels indicate that the transformed data are uniformly distributed. Meanwhile, the bivariate contour plots and scatter plots reveal that different pairs of variables show heterogeneous relationships, i.e., asymmetry and dependence in the tail regions. Notably, the traditional multivariate normal copula and t copula fail to handle these complex dependence patterns.
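The rank-based transformation to pseudo-observations can be sketched as follows. This is an illustration only: dividing by $N + 1$ rather than $N$ is a common convention that keeps the values strictly inside (0, 1) and is assumed here, and the sample residuals are made up.

```python
def pseudo_observations(residuals):
    """Map a sample to (0, 1) through its scaled empirical CDF.

    Each value is replaced by (number of sample values <= it) / (N + 1).
    """
    n = len(residuals)
    return [sum(1 for xj in residuals if xj <= xi) / (n + 1) for xi in residuals]


# Made-up residuals at a single vibration period.
eps = [0.3, -1.2, 2.0, 0.1]
print(pseudo_observations(eps))
```

The double loop is quadratic in the sample size, which is acceptable for a record set of the size used here; a sort-based ranking would be preferable for much larger samples.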

To check whether the vine copula is more flexible than elliptical copulas for modeling multivariate dependence in the high-dimensional case, we fit the residuals with the normal copula, the t copula, and a D-vine copula, respectively. The estimation results of the five-dimensional normal copula and t copula are reported in Table 1. All parameters are highly significant, and the off-diagonal elements of the correlation matrix are similar between these two multivariate elliptical copulas. We then fit the residuals with a five-dimensional D-vine copula. The structure of the D-vine is defined by equation (4), and the pair copula families are restricted to the bivariate t copula for the purpose of comparison. Since the five-dimensional t copula is nested in the D-vine copula structure, the likelihood ratio test can be performed between these two copulas [22]. The results of the D-vine copula are reported in Table 2. In each tree, the residuals of PSAs are connected as a path from the shorter period to the longer period. For each pair, the tail-dependence coefficients can also be calculated.

Then, we calculate quantitative measures, namely, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). The three fitted multivariate copula models for the residuals of PSAs are compared based on two criteria: (1) lower AIC and BIC values indicate a better goodness of fit; (2) a greater log-likelihood value indicates a better goodness of fit. The results are displayed in Table 3. The five-dimensional D-vine copula shows the lowest AIC and BIC values and the largest log-likelihood value. We also perform the likelihood ratio test between the two nested models, i.e., the t copula and the D-vine copula. The statistic equals 25.978 (2922.2330 − 2896.2550) with 9 degrees of freedom (20 − 11), and the associated p value is almost zero. This implies that the five-dimensional t copula can be rejected in favor of the D-vine copula for our data.

In the comparison above, we restricted the vine copula to a D-vine structure and limited the pair copulas to the bivariate t copula so that the three copulas could be compared. We now fit the transformed residual data with a vine copula without these restrictions. The selection of the vine structure and the choice of pair copula families are data driven: we choose the vine copula model with the smallest AIC value. The suitability of this approach for selecting a vine copula model has been shown (see [41]). An independence test for each bivariate copula has also been performed. The test statistic is

$$T = \sqrt{\frac{9n(n - 1)}{2(2n + 5)}}\; \lvert\hat{\tau}\rvert,$$

where $n$ is the number of observations and $\hat{\tau}$ is the empirical Kendall’s $\tau$ of the data $u$ and $v$. The test statistic $T$ is asymptotically normally distributed under the null hypothesis that the two variables are independent (see [43]). Figure 2 shows the four trees of the vine structure. The nodes denote the margins, and the edges denote the bivariate copula between two linked nodes. The labels on the edges give the families and Kendall’s $\tau$ of the corresponding pair copulas. Figure 2 also indicates that the fitted vine has a D-vine structure and that no node plays a major role in the whole structure. However, its nested pair copulas are heterogeneous: they are described not only by the bivariate t copula but also by other two-parameter copulas.
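The independence test based on Kendall’s $\tau$ can be sketched as below. The sketch assumes that the statistic takes the standard form $T = \sqrt{9n(n-1)/(2(2n+5))}\,|\hat{\tau}|$ and that the two-sided p value is obtained from the standard normal distribution; the function names are hypothetical, ties are ignored, and the quadratic-time $\tau$ computation is chosen for clarity rather than speed.

```python
import math


def kendall_tau(u, v):
    """Empirical Kendall's tau (concordant minus discordant pairs, no tie correction)."""
    n = len(u)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (u[i] - u[j]) * (v[i] - v[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def independence_test(u, v):
    """Return (T, p_value) for the null hypothesis of independence."""
    n = len(u)
    tau = kendall_tau(u, v)
    stat = math.sqrt(9 * n * (n - 1) / (2 * (2 * n + 5))) * abs(tau)
    p_value = math.erfc(stat / math.sqrt(2))  # equals 2 * (1 - Phi(stat))
    return stat, p_value


# Toy example: perfectly concordant data are clearly not independent.
x = [i / 21 for i in range(1, 21)]
print(independence_test(x, x))
```

A pair whose p value exceeds the chosen significance level would be assigned the independence copula instead of a fitted family.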

The estimated results are reported in Table 4, and all parameters are highly significant. The pair copula families and copula parameters are estimated by the joint maximum likelihood method. Compared with the sequential estimation method, this method can provide more precise results since all the parameters are estimated simultaneously rather than pair by pair. Kendall’s $\tau$ and the corresponding tail-dependence coefficients are also displayed in Table 4. The AIC and BIC scores decrease significantly compared with the other three multivariate copulas mentioned previously (i.e., the multivariate normal copula, the multivariate t copula, and the D-vine restricted to pair t copulas). Hence, the unrestricted vine copula performs better at modeling the PSA dependence structure at different vibration periods. Notably, some pairs reveal strong tail dependence. In Tree 1, the dependence between PSAs at 0.1 s and 0.2 s is captured by the survival Tawn copula [38]. This two-parameter copula can describe lower tail dependence. It implies that PSAs at 0.1 s and 0.2 s show a very strong co-movement probability in the left tail region, i.e., an extremely small PSA at 0.1 s tends to be accompanied by an extremely small PSA at 0.2 s. PSAs at 0.2 s and 1 s show a slight symmetric tail dependence, which is captured by the t copula. PSAs at 1 s and 2 s do not exhibit any unusual character in the tail regions since the normal copula has been chosen for this pair. The dependence between PSAs at 2 s and 4 s is described by the BB1 copula (Clayton–Gumbel), which can capture asymmetric tail dependence [38]: the co-movement of the extremely small values is greater than that of the extremely large values. The remaining pair copulas do not show any unusual character in the tail regions. The densities of the corresponding pair copulas for PSA residuals at different vibration periods are illustrated in Figures 3–5. The traditional multivariate normal copula and t copula fail to describe the multivariate distribution as comprehensively as this model.

3.2.2. Joint Distribution Modeled by Vine Copula

The joint distribution of PSAs at the five vibration periods is obtained by combining the marginal distribution of PSAs at each period with the best-fitted vine copula. In particular, the joint density is the product of the marginal densities and the corresponding vine copula density, as in equation (2). The margins carry the information at each period, and the copula contains the information about the pure dependence structure of PSAs at the five vibration periods. The joint density is defined by the following equation:

$$f(\varepsilon_1, \ldots, \varepsilon_5) = c\big(F_1(\varepsilon_1), \ldots, F_5(\varepsilon_5)\big) \prod_{i=1}^{5} f_i(\varepsilon_i), \tag{23}$$

where $c$ is the fitted vine copula density, and $F_i$ and $f_i$ denote the marginal distribution and density of the residuals of PSAs at the i-th period, i = 1, …, 5, i.e., a normal distribution. The vine copula model parameters (i.e., the corresponding series of pair copula densities) are reported in Table 4.
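The construction of a joint density from margins and a copula density can be illustrated in two dimensions. The sketch below deliberately swaps the fitted vine copula for a single bivariate Gaussian pair copula, which is a simplification and not the model of this study; the correlation and the marginal standard deviations are made-up values. With normal margins and a Gaussian copula, the product must reproduce the bivariate normal density, which provides a convenient check.

```python
import math
from statistics import NormalDist


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula at (u, v)."""
    std = NormalDist()
    x, y = std.inv_cdf(u), std.inv_cdf(v)
    expo = -(rho ** 2 * (x * x + y * y) - 2 * rho * x * y) / (2 * (1 - rho ** 2))
    return math.exp(expo) / math.sqrt(1 - rho ** 2)


def joint_density(e1, e2, margin1, margin2, rho):
    """Joint density = copula density at the marginal CDFs times marginal densities."""
    u, v = margin1.cdf(e1), margin2.cdf(e2)
    return gaussian_copula_density(u, v, rho) * margin1.pdf(e1) * margin2.pdf(e2)


# Hypothetical normal margins for residuals at two periods.
m1, m2 = NormalDist(0.0, 0.6), NormalDist(0.0, 0.7)
print(joint_density(0.3, -0.2, m1, m2, rho=0.5))
```

In the vine copula model, the single Gaussian pair copula is replaced by the product of the pair copula densities of all trees, each evaluated at the appropriate conditional distribution functions.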

3.2.3. Vine Copula-Based Joint Distribution in Earthquake Engineering

In seismic hazard and risk assessment, when calculating the aggregated losses of portfolios or infrastructures that include different types of structures, it is necessary to use different fragility functions characterized by different ground motion parameters. In this sense, multiple ground motion parameters are embedded in the seismic hazard and risk assessment. Herein, we present an example to illustrate the application and performance of the proposed vine copula-based multivariate joint distribution function through the joint exceedance probability of multiple ground motion parameters. The joint exceedance probability of multiple parameters is defined as the probability that a set of parameters (X1 to Xn) simultaneously exceeds the corresponding set of values (x1 to xn), as follows:

$$P\big(X_1 > x_1, X_2 > x_2, \ldots, X_n > x_n\big). \tag{24}$$

In the example, the following earthquake scenario is assumed: (1) a hypothetical site is located at a distance of 30 km from a point strike-slip earthquake source; (2) an earthquake with a moment magnitude of 6 occurs at this source; and (3) the soil condition at the site is characterized by VS30 = 720 m/s. We investigate the PSAs at 0.1 s, 0.2 s, 1 s, 2 s, and 4 s at this site. The median values of the PSAs in logarithmic space, namely, $\mu_{\ln Y}$, are calculated using the GMPEs proposed by Campbell and Bozorgnia [10]. Based on the proposed vine copula-based multivariate joint probability function, we use the Monte Carlo method to generate 20,000 realizations of the jointly distributed multivariate residuals of PSAs ($\varepsilon$) at the given site. We obtain the final realizations of the PSAs in logarithmic space by adding the residual realizations to the predicted median PSA values in logarithmic space.

Figure 6 shows the joint exceedance probability of PSAs based on the above realizations and demonstrates the effects of the multivariate joint distribution of the PSAs. The joint exceedance probability is calculated through equation (24), where n = 5, X1 to X5 are the PSAs at 0.1 s, 0.2 s, 1 s, 2 s, and 4 s, respectively, and x1 to x5 indicate a given level of the PSAs at 0.1 s, 0.2 s, 1 s, 2 s, and 4 s, respectively. In this case, the level is the mean value plus or minus a multiple of the standard deviation of each PSA investigated. Two other cases are also considered in Figure 6 for comparison, in which the residuals of the PSAs are assumed to be either independent (uncorrelated) or perfectly dependent. The results imply that the proposed vine copula-based multivariate distribution function can properly characterize the joint distribution of multiple ground motion parameters. The joint exceedance probability of the ground motion parameters is underestimated if their correlations are ignored and overestimated if they are assumed to be perfectly correlated. In particular, the differences among the three cases become larger in the tail region, that is, at large ground motion levels, suggesting that the proposed copula-based multivariate distribution model is needed in the analysis and is especially important in the extreme region.
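The two bounding cases discussed above can be reproduced with a short Monte Carlo sketch. It does not sample from the fitted vine copula: the residuals are assumed to be standard normal, the two samples correspond to fully independent and perfectly dependent residuals, and the threshold of one standard deviation above the mean is an arbitrary illustrative level.

```python
import random


def joint_exceedance(samples, thresholds):
    """Fraction of realizations in which every component exceeds its threshold."""
    hits = sum(1 for row in samples
               if all(x > t for x, t in zip(row, thresholds)))
    return hits / len(samples)


rng = random.Random(0)            # fixed seed for reproducibility
n_sim, dim = 20000, 5             # 20,000 realizations, five periods

# Case 1: independent standardized residuals at the five periods.
independent = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_sim)]
# Case 2: perfectly dependent residuals (one draw shared by all periods).
dependent = [[z] * dim for z in (rng.gauss(0.0, 1.0) for _ in range(n_sim))]

level = [1.0] * dim               # mean plus one standard deviation (assumed level)
p_ind = joint_exceedance(independent, level)
p_dep = joint_exceedance(dependent, level)
print(p_ind, p_dep)
```

Realizations drawn from a fitted dependence model would be passed to the same `joint_exceedance` function, and for positively dependent residuals the resulting probability would be expected to fall between these two cases.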

4. Conclusion

In this study, a multivariate joint probability function of PSAs at different vibration periods is calibrated using the vine copula technique. The dependence structure is developed from a large set of ground motion data consisting of 1550 ground motion records. We show that the vine copula not only better captures the multivariate dependence of PSAs at different vibration periods but also captures their tail dependence, which is critical to loss estimation for low-probability, high-impact risks.

In particular, (1) the vine copula performs better than the normal and t copulas according to the AIC, BIC, and likelihood ratio test results; (2) among all the investigated vine copula structures, the best-fitted one is a D-vine structure; (3) no PSA residual plays a major role in the structure; instead, the residuals are connected as a path from the shorter period to the longer period; and (4) the bivariate copulas may show asymmetric tail dependence, which the normal and t copulas cannot capture.

The vine copula-based correlation model proposed in this study can be conveniently used in the probabilistic aggregated seismic loss assessment of portfolios or infrastructures.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request. The recorded time histories for these events were obtained from the PEER-NGA database (http://peer.berkeley.edu/nga/).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was financially supported by the National Natural Science Foundation of China (Grants nos. 51708460 and 71703123) and the Fundamental Research Funds for the Central Universities (2682017CX004 and 2452019117).