More Efficient Prediction for Ordinary Kriging to Solve a Problem in the Structure of Some Random Fields

Saber, Mohammad Mehdi; Aldallal, Ramy Abdelhamid

doi:https://doi.org/10.1155/2022/9712576

Complexity


Special Issue

Complexity Arising in Financial Modelling and its Applications


Research Article | Open Access

Volume 2022 | Article ID 9712576 | https://doi.org/10.1155/2022/9712576


Mohammad Mehdi Saber¹ and Ramy Abdelhamid Aldallal²

Academic Editor: Atila Bueno

Received 05 Nov 2021

Revised 28 May 2022

Accepted 24 Jun 2022

Published 06 Jul 2022

Abstract

Recently, some specific random fields have been defined based on multivariate distributions. This paper shows that almost all of these random fields have a deficiency in their spatial autocorrelation structure and recommends a method for coping with this problem. A further application of these random fields is spatial data prediction, for which the Kriging estimator is the most widely used method; it does not require defining the mentioned random fields. Although it has the minimum mean-squared error among unbiased linear estimators, it does not necessarily have the minimum mean-squared error in the class of all linear estimators. In this work, a biased estimator is introduced that has a smaller mean-squared error than the Kriging estimator under some conditions. The asymptotic behavior of its basic component is also investigated.

1. Introduction

Random fields (RFs), as statistical models, can be applied to many real-life problems, such as biological sequences, management of soil resources in agriculture and forestry, text and image processing, designing environmental monitoring networks, artificial intelligence, and road and tunnel planning.

An RF is usually constructed by taking its finite-dimensional distributions from a multivariate distribution. For example, the skew-normal (SN) RF based on the multivariate SN distribution [1] has been defined by Kim and Mallick [2]. The closed skew-normal (CSN) RF based on the multivariate CSN distribution [24] has been established by Allard and Naveau [3]. Similar work has been performed by Karimi et al. [4], Hosseini et al. [5], and Karimi and Mohammadzadeh [6]. Zareifard and Jafari Khaledi [7] defined a second-order stationary RF named the unified skew-normal (SUN) RF by using the multivariate SUN distribution [8]. The generalized skew-normal (GSN) RF introduced by Mahmoudian [9] has also been constructed from the multivariate GSN distribution of Sahu et al. [10], while the generalized asymmetric Laplace (GAL) RF, defined through the multivariate GAL distribution of Kozubowski et al. [11], was introduced by Saber et al. [12]. Some other RFs may be defined by the multivariate skew-t distribution and its general form, the multivariate extended skew-t (EST) distribution, as shown by Arellano-Valle and Genton [13].

For the spatial structure of the model, Wall [14] studied the spatial structure implied by the CAR and SAR models. According to Saber et al. [12] and Mahmoudian [15], almost all of the RFs mentioned in the previous paragraph are not well defined by the Kolmogorov existence theorem. Saber et al. [12] showed that these RFs do not have a proper spatial autocorrelation structure (PSAS). In the present article, we will show that almost all of the aforementioned skew RFs share this deficiency.

A final application of any random field is spatial data prediction, which has been applied in works by authors such as Basu and Reinsel [16], Saber and Nematollahi [17], and Saber [18]. Many researchers have studied the prediction of spatial data using the unsuitable RFs mentioned above. To solve the previously discussed problem, some authors such as Mojiri et al. [19] have considered a univariate distribution for the errors of the process, which results in a new problem in modeling spatial correlation. Others, such as Hosseini and Karimi [21], have used an approximate skew-normal RF. Although this approach attains a high percentage of the PSAS, it has some fundamental problems, and it cannot solve the two mentioned problems.

Kriging is the most widely used method of spatial data prediction, and its use has increased during recent years because it does not require defining the previously mentioned random fields that lack the PSAS. Cressie [22] presented an unbiased linear estimator for Kriging that has the minimum mean-squared error (MSE) among unbiased linear estimators, although it is not the best estimator in the larger class of all linear estimators. On the other hand, according to Moyeed and Papritz [23], linear Kriging is as good as any nonlinear method, especially for symmetric data. However, for skewed data, some of the nonlinear methods perform better in estimating prediction uncertainty. The focus of this article is on linear Kriging: we suggest a biased estimator for Kriging where the only interest is to minimize the MSE. Our recommended estimator has a smaller MSE than the Cressie estimator under some conditions. Therefore, it is appropriate for skewed data when there is a problem in defining a valid RF and ordinary Kriging does not attain the precision of nonlinear methods.

The paper is organized as follows: In Section 2, some properties of the previously mentioned multivariate distributions are reviewed to show that the RFs defined from them do not have the PSAS property. After that, an estimator for the mean of the process and then a predictor with a smaller MSE than the ordinary Kriging predictor within the class of linear predictors are presented in Section 3. Finally, Section 4 is devoted to investigating some asymptotic behavior of the recommended estimator.

2. A Problem in Some RFs

The first part of this section reviews the variance matrix of multivariate SN, CSN, EST, GSN, SUN, and GAL distributions.

Consider a p-dimensional continuous random vector that has an SN distribution, denoted by . Then, we have
and consider another p-dimensional continuous random vector but with a CSN distribution, denoted by . Then, we have
where and are complicated matrices based on . The details of these matrices are not essential for us, although they are given in the study conducted by González-Farías et al. [24].

The variance matrix of a multivariate EST distribution is denoted by , which is given as follows:

However, the variance matrix of a multivariate GSN distribution is denoted by , which is given as follows:

Arellano-Valle and Azzalini [25] recently provided a very complicated and cumbersome computation of the variance matrix of multivariate SUN. Let . Then we have,

Finally, the variance matrix of the GAL distribution continuous p-dimensional random vector denoted by is given as follows:

After that, we state the following remark.

Remark 1. Let be the observations from an RF at n locations in the area . It is assumed that where is the norm of the vector and is a real value function. In a matrix form, where and with a notation . Some RFs have been defined so that they satisfy the condition , while in some other cases, the covariance matrix has been used somehow in which .
In continuation of this section, we investigate the condition where is named as PSAS in some well-known RFs.
Gaussian RF: and . Therefore, we choose to have a spatial correlation between variables.
t RF: and . It seems to be an appropriate choice to reach the PSAS when .
SN RF: and where it is convenient to substitute to get to the PSAS.
CSN RF: and . Choosing provides an adequate choice to reach the PSAS.
EST RF: and . By substituting , we will obtain the PSAS.
SUN RF: and . Making will fulfill the PSAS.
GSN RF: and . It provides a suitable choice to choose and then consider as a nugget and to accomplish the PSAS.
GAL RF: and . After finding the PSAS, we will choose , which results as and where is the matrix of covariates and is the regression coefficient.
As previously shown, some of the RFs satisfy the PSAS, while others do not. In the following remark, we denote this matter concisely.

Remark 2. The Gaussian RF, t RF, GSN RF, and GAL RF satisfy the PSAS. The other RFs, namely the SN RF, CSN RF, EST RF, and SUN RF, do not satisfy the PSAS. In some of the previous works that applied the second group of RFs, it was supposed that , although it does not lead to for achieving the PSAS.
Imposing the equality , without any other constraints, may result in a matrix that is not positive definite. One remedy is to add a small value to the diagonal of , but it does not work in every case. So, there is still a problem in obtaining both the PSAS and positive definiteness for the matrix.
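The point about the diagonal adjustment can be illustrated with a small numerical sketch. The 3 x 3 matrix below is a hypothetical example chosen for illustration (it is not taken from any of the cited RFs): its off-diagonal entries are imposed pairwise, and the result is not positive definite. A small value added to the diagonal does not repair it, while a value large enough to repair it changes the diagonal substantially.

```python
import numpy as np

def is_positive_definite(m: np.ndarray) -> bool:
    """Return True when the symmetric matrix m has only positive eigenvalues."""
    return bool(np.linalg.eigvalsh(m).min() > 0.0)

# Hypothetical matrix with unit diagonal and pairwise-imposed off-diagonal
# entries: sites 1-2 and 2-3 strongly positive, sites 1-3 strongly negative.
imposed = np.array([
    [1.0, 0.9, -0.9],
    [0.9, 1.0, 0.9],
    [-0.9, 0.9, 1.0],
])

# Smallest eigenvalue of the imposed matrix (negative, so it is not a covariance).
min_eig = np.linalg.eigvalsh(imposed).min()

# A small diagonal shift leaves the smallest eigenvalue negative.
small_jitter = imposed + 1e-6 * np.eye(3)

# The shift must exceed |min_eig|, which inflates every variance on the diagonal.
large_jitter = imposed + (abs(min_eig) + 1e-6) * np.eye(3)

print("min eigenvalue:", min_eig)
print("PD as imposed:", is_positive_definite(imposed))
print("PD with small jitter:", is_positive_definite(small_jitter))
print("PD with large jitter:", is_positive_definite(large_jitter))
```

In this sketch the required shift is about 0.8, so the repaired matrix no longer has the unit diagonal that was originally intended.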

3. Linear Kriging as an Alternative for the Nonlinear Predictors Conducted by RFs

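Let the RF $Z(\cdot)$ satisfy the following model:

$$Z(\mathbf{s}) = \mu + W(\mathbf{s}) + \varepsilon(\mathbf{s}), \tag{7}$$

where $\mu$ is the mean value, $W(\cdot)$ is the zero-mean intermediate-scale variation, and $\varepsilon(\cdot)$ is a white noise error with variance $\sigma_{\varepsilon}^{2}$.

Kriging belongs to the family of linear least squares estimation algorithms. Given the observations $\mathbf{Z} = (Z(\mathbf{s}_1), \ldots, Z(\mathbf{s}_n))'$, a common task is to predict $Z(\mathbf{s}_0)$ at an unobserved location $\mathbf{s}_0$ and to calculate the prediction error variance at each such location. The Kriging estimator is a linear predictor for $Z(\mathbf{s}_0)$, which is as follows:

$$\hat{Z}(\mathbf{s}_0) = \sum_{i=1}^{n} \lambda_i Z(\mathbf{s}_i), \tag{8}$$

where the $\lambda_i$s (the Kriging weights) are scalars that may be found under two conditions:
(a) $E[\hat{Z}(\mathbf{s}_0)] = E[Z(\mathbf{s}_0)]$ (unbiased condition)
(b) $E[(\hat{Z}(\mathbf{s}_0) - Z(\mathbf{s}_0))^{2}]$ is minimized

Simple Kriging (known mean) and ordinary Kriging (unknown but constant mean) are two well-known divisions of Kriging based on the mean value specification. In simple Kriging, $\mu$ is assumed to be known, but it is unknown in the case of ordinary Kriging, which occurs more often in applications. As in the previous section, let $\boldsymbol{\Sigma}$ denote the covariance matrix of the observations and let $\mathbf{c}$ denote the vector containing the covariances between the observations and the value at the unobserved location $\mathbf{s}_0$.

According to Cressie [22], the simple Kriging estimator and its error variance are given, respectively, by

$$\hat{Z}_{SK}(\mathbf{s}_0) = \mu + \mathbf{c}'\boldsymbol{\Sigma}^{-1}(\mathbf{Z} - \mu\mathbf{1}) \tag{9}$$

and

$$\sigma_{SK}^{2}(\mathbf{s}_0) = C(0) - \mathbf{c}'\boldsymbol{\Sigma}^{-1}\mathbf{c}, \tag{10}$$

where $C(0) = \mathrm{Var}(Z(\mathbf{s}_0))$.

Also, the ordinary Kriging estimator and its error variance are given, respectively, by

$$\hat{Z}_{OK}(\mathbf{s}_0) = \left[\mathbf{c} + \mathbf{1}\,\frac{1 - \mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{c}}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}}\right]'\boldsymbol{\Sigma}^{-1}\mathbf{Z} \tag{11}$$

and

$$\sigma_{OK}^{2}(\mathbf{s}_0) = C(0) - \mathbf{c}'\boldsymbol{\Sigma}^{-1}\mathbf{c} + \frac{(1 - \mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{c})^{2}}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}}, \tag{12}$$

where $\mathbf{1}$ is the $n \times 1$ vector whose components are all 1.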
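The two predictors can be sketched numerically using their standard textbook forms (as in Cressie [22]). Everything specific in the sketch is an assumption made for illustration: the exponential covariance function `exp_cov`, the absence of a nugget, the four site coordinates, the observed values, and the known mean passed to simple Kriging. The names `Sigma`, `c`, and `c0` stand for the covariance matrix of the observations, the covariance vector between the observations and the prediction location, and the variance at the prediction location.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_=2.0):
    """Exponential covariance C(h) = sill * exp(-h / range_) (an assumed model)."""
    return sill * np.exp(-np.asarray(h) / range_)

def simple_kriging(z, Sigma, c, c0, mu):
    """Simple Kriging prediction and error variance for a known mean mu."""
    w = np.linalg.solve(Sigma, c)            # Sigma^{-1} c
    pred = mu + w @ (z - mu)
    var = c0 - c @ w
    return pred, var

def ordinary_kriging(z, Sigma, c, c0):
    """Ordinary Kriging prediction, error variance, and weights (unknown constant mean)."""
    ones = np.ones(len(z))
    Si_c = np.linalg.solve(Sigma, c)         # Sigma^{-1} c
    Si_1 = np.linalg.solve(Sigma, ones)      # Sigma^{-1} 1
    m = (1.0 - ones @ Si_c) / (ones @ Si_1)  # correction enforcing unbiasedness
    weights = Si_c + m * Si_1
    pred = weights @ z
    var = c0 - c @ Si_c + (1.0 - ones @ Si_c) ** 2 / (ones @ Si_1)
    return pred, var, weights

# Hypothetical observation sites, prediction location, and observed values.
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
s0 = np.array([0.5, 0.5])
z = np.array([1.2, 0.8, 1.0, 1.5])

dists = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
Sigma = exp_cov(dists)
c = exp_cov(np.linalg.norm(sites - s0, axis=1))
c0 = float(exp_cov(0.0))

sk_pred, sk_var = simple_kriging(z, Sigma, c, c0, mu=1.0)
ok_pred, ok_var, ok_weights = ordinary_kriging(z, Sigma, c, c0)

print("simple Kriging:", sk_pred, sk_var)
print("ordinary Kriging:", ok_pred, ok_var)
print("sum of ordinary Kriging weights:", ok_weights.sum())
```

The ordinary Kriging weights sum to one, which is the unbiasedness constraint, and the ordinary Kriging error variance is never smaller than the simple Kriging one, since the extra term is the price of not knowing the mean.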

3.1. New Estimator

The Kriging estimator for ordinary Kriging is the best estimator in the sense of having the minimum MSE among unbiased linear estimators. However, a biased estimator with a smaller MSE is preferable in some circumstances. An estimator with this property exists, and we give our motivation for recommending it in the following theorem.
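The idea that giving up unbiasedness can lower the MSE can be seen in a toy setting that is deliberately simpler than Kriging and is not the estimator proposed in this paper. Assume an unbiased estimator of a mean `mu` with variance `v`, and shrink it by a factor `k` (all names here are hypothetical). The shrunk estimator has bias `(k - 1) * mu` and variance `k**2 * v`, so its MSE is a quadratic in `k` that lies below `v` exactly when `k` is between the two roots of `MSE(k) - v = 0`.

```python
def shrinkage_mse(k: float, mu: float, v: float) -> float:
    """MSE of k * T when T is unbiased for mu with variance v: variance plus squared bias."""
    return k ** 2 * v + (k - 1.0) ** 2 * mu ** 2

# Hypothetical values of the true mean and of the unbiased estimator's variance.
mu, v = 2.0, 1.0

# MSE(k) - v = (k - 1) * ((k + 1) * v + (k - 1) * mu**2), with roots lower_root and 1.
lower_root = (mu ** 2 - v) / (mu ** 2 + v)

# Minimizer of the quadratic MSE(k); it depends on the unknown mu.
k_best = mu ** 2 / (mu ** 2 + v)

for k in (0.5, lower_root, k_best, 1.0):
    print(f"k = {k:.2f}, MSE = {shrinkage_mse(k, mu, v):.3f}")
```

As in this toy case, the gain is conditional: the biased estimator wins only inside an interval that depends on unknown quantities, and it loses outside that interval.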

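Theorem 1. An unbiased linear estimator for the mean of the process given in (7) exists and has a minimum MSE among all unbiased linear estimators for the mean. This estimator is in the form of

$$\hat{\mu} = \frac{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{Z}}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}}, \tag{13}$$

and its variance is given by

$$\mathrm{Var}(\hat{\mu}) = \frac{1}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}}, \tag{14}$$

where $\boldsymbol{\Sigma}$ is the covariance matrix of the observations $\mathbf{Z}$ and $\mathbf{1}$ is the $n \times 1$ vector of ones.

Proof. A linear estimator for $\mu$ is in the form of $\mathbf{a}'\mathbf{Z}$. Unbiasedness implies that $\mathbf{a}'\mathbf{1} = 1$. Hence, under this condition, $\mathrm{Var}(\mathbf{a}'\mathbf{Z}) = \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a}$ must be minimized. By using the Lagrange equation and by vector differentiating, one can solve this problem to reach $\mathbf{a} = \boldsymbol{\Sigma}^{-1}\mathbf{1} / (\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1})$, which gives (13). Now, we have $\mathrm{Var}(\hat{\mu}) = \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a} = 1 / (\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1})$, which completes the proof.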
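The Lagrange step in the proof can be written out as a short worked derivation. The notation is assumed here for illustration: $\mathbf{a}$ is the weight vector of the linear estimator, $\boldsymbol{\Sigma}$ the covariance matrix of the observations, $\mathbf{1}$ the vector of ones, and $\lambda$ a Lagrange multiplier (the factor 2 in front of $\lambda$ is only a convenience).

```latex
\begin{aligned}
L(\mathbf{a}, \lambda) &= \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a}
    - 2\lambda\,(\mathbf{a}'\mathbf{1} - 1), \\
\frac{\partial L}{\partial \mathbf{a}} &= 2\boldsymbol{\Sigma}\mathbf{a} - 2\lambda\mathbf{1} = \mathbf{0}
    \;\Longrightarrow\; \mathbf{a} = \lambda\,\boldsymbol{\Sigma}^{-1}\mathbf{1}, \\
\mathbf{a}'\mathbf{1} = 1 &\;\Longrightarrow\;
    \lambda\,\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1} = 1
    \;\Longrightarrow\; \lambda = \frac{1}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}}, \\
\mathbf{a} &= \frac{\boldsymbol{\Sigma}^{-1}\mathbf{1}}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}},
\qquad
\mathbf{a}'\boldsymbol{\Sigma}\mathbf{a}
    = \lambda^{2}\,\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}
    = \frac{1}{\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}}.
\end{aligned}
```

Because $\boldsymbol{\Sigma}$ is positive definite, the objective is strictly convex, so this stationary point is the unique minimizer under the constraint.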
Now, a simple Kriging method can be applied instead of an ordinary Kriging method. By substituting (13) in (7), we can reach and . From (9), we can conclude that
which is the new estimator for interpolation. is a special case of when .

Corollary 1. The bias and prediction error variance of the estimator are given by

Proof. The proof comes from Theorem 1, and a direct computation is presented in the following equations:
Because of this fact, we deduce that is a biased estimator. Now, to compare with the ordinary Kriging estimator given by (11), the variance is not a useful criterion. The MSE is the better criterion, not only in this case but whenever estimators are compared and at least one of them is biased. Finally, Theorem 2 shows that is better than the ordinary Kriging estimator (11) regarding the MSE under some conditions.

Theorem 2. Under one of the following two conditions (i) and (ii), will be less than .(i) and are between and (ii) and or )

Proof. From Corollary 1, we have . On the other hand, Equation (12) results in . Then, by defining and , one can show that
From the positive definiteness of , it is found that . By multiplying the above equality by , we get . Replacing by leads to
where . The quadratic form (18) has two roots and with respect to . Under condition (i) or (ii), Equation (18) will be negative; hence, under condition (i) or (ii) as well. This completes the proof.

4. Asymptotic Behavior of the Mean Estimator

Since the estimator of μ (13) in Theorem 1 has an essential role, its asymptotic behavior is of interest for this study. First, we state the following lemma, which is required for the remainder of this section.

Lemma 1. For two sequences of unbiased estimators and we can say that , , if . Here, states the mean square convergence.

Theorem 3. Let a random process (s) exist on a lattice in with coordinates, , for all and as tends to infinity, then and .

Proof. Without loss of generality, assume denotes . First, we know that . Now, we can compute the variance of . The last term has components. The assumption that , as tends to infinity, gives that for all there exists a large enough integer such that for all and and that . For a fixed , the number of so that is less than . Therefore, the total number of points with condition is at most which leads to
which clearly proves . By Theorem 1, is unbiased for and . Finally, by Lemma 1 we get that .
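The behavior of the mean estimator's variance can be checked numerically with a small sketch. It assumes, for illustration only, an exponential covariance whose value decays to zero with distance and a square unit lattice of growing size; the function name `gls_mean_variance` and the range parameter are hypothetical. The variance is computed as $1 / (\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1})$, the standard form for the minimum-variance linear unbiased estimator of a constant mean.

```python
import numpy as np

def gls_mean_variance(k: int, range_: float = 2.0) -> float:
    """Variance 1 / (1' Sigma^{-1} 1) of the best linear unbiased mean estimator
    on a k x k unit lattice under an assumed exponential covariance."""
    xs, ys = np.meshgrid(np.arange(k), np.arange(k))
    sites = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
    dists = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    Sigma = np.exp(-dists / range_)          # covariance decays to 0 with distance
    ones = np.ones(k * k)
    return 1.0 / (ones @ np.linalg.solve(Sigma, ones))

# Nested lattices: each larger lattice contains the smaller ones.
sizes = [2, 4, 8, 12]
variances = [gls_mean_variance(k) for k in sizes]

for k, var in zip(sizes, variances):
    print(f"{k} x {k} lattice: Var(mu_hat) = {var:.4f}")
```

The variance shrinks as the lattice grows, which also means that $\mathbf{1}'\boldsymbol{\Sigma}^{-1}\mathbf{1}$ grows. A finite sketch like this cannot prove convergence to zero; it only illustrates the direction of the result.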

Corollary 2. It is an immediate consequence of Theorem 3 that under its conditions tends to infinity as tends to infinity.

Theorem 4. Under the conditions of Theorem 3 and at least one of the following conditions (i) or (ii), and .
(i) as n tends to infinity.
(ii) is bounded.

Proof. Regarding Lemma 1 and the unbiasedness of , we see that if it satisfies condition (i) or (ii). Since is the linear unbiased estimator for with the minimum variance and is the linear unbiased estimator for , we obtain that
Corollary 1 and equation (20) result in if and only if . On the other hand, a direct computation shows that and under condition (ii), it obviously tends to . Another form of is , with the fact that and if (i) is satisfied, it will tend to .

5. Conclusion

In this paper, we reviewed some previously defined RFs that have been extensively applied to modeling time series, spatial, and spatiotemporal data. However, they share a limitation: they cannot accommodate the PSAS, which has a crucial and basic role in modeling such data. We presented some alternative RFs without this problem, although a more general solution is still needed. The paper recommended a method for coping with this problem by finding a new predictor in ordinary Kriging. The customary predictor for ordinary Kriging has the minimum MSE in the class of all linear unbiased predictors. However, it does not necessarily have the minimum MSE in the class of all linear predictors. Therefore, we obtained a biased predictor with a smaller MSE than the Kriging predictor under some conditions. The asymptotic behavior of this predictor alongside the customary Kriging predictor was provided.

We studied six well-known multivariate distributions in this article. Some other multivariate distributions studied by Kotz et al. [26] such as the truncated multivariate normal distribution, truncated multivariate t distribution, Linnik’s distribution, gamma distribution, and logistic distribution can be surveyed under this paper’s point of view.

Although we have provided some solutions for the lack of PSAS, it seems that the best method for modeling skewed spatial data is to define a consistent skew RF. This may be addressed in future work.

Data Availability

There are no data related to this paper.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

The idea of the paper was conceived and planned by Mohammad Mehdi Saber. Ramy Abdelhamid Aldallal was involved in writing the manuscript. All authors provided critical feedback and helped shape the manuscript’s research, analysis, and construction.

References

[1] A. Azzalini, “A class of distributions which includes the normal ones,” Scandinavian Journal of Statistics, vol. 12, pp. 171–178, 1985.
[2] H.-M. Kim and B. Mallick, “A Bayesian prediction using the skew Gaussian distribution,” Journal of Statistical Planning and Inference, vol. 120, no. 1-2, pp. 85–101, 2004.
[3] D. Allard and P. Naveau, “A new spatial skew-normal random field model,” Communications in Statistics - Theory and Methods, vol. 36, no. 9, pp. 1821–1834, 2007.
[4] O. Karimi, H. Omre, and M. Mohammadzadeh, “Bayesian closed-skew Gaussian inversion of seismic AVO data for elastic material properties,” Geophysics, vol. 75, pp. 1–11, 2010.
[5] F. Hosseini, J. Eidsvik, and M. Mohammadzadeh, “Approximate Bayesian inference in spatial GLMM with skew normal latent variables,” Computational Statistics & Data Analysis, vol. 55, no. 4, pp. 1791–1806, 2011.
[6] O. Karimi and M. Mohammadzadeh, “Bayesian spatial prediction for discrete closed skew Gaussian random field,” Mathematical Geosciences, vol. 43, no. 5, pp. 565–582, 2011.
[7] H. Zareifard and M. J. Jafari Khaledi, “Non-Gaussian modeling of spatial data using scale mixing of a unified skew Gaussian process,” Journal of Multivariate Analysis, vol. 114, pp. 16–28, 2013.
[8] R. B. Arellano-Valle and A. Azzalini, “On the unification of families of skew-normal distributions,” Scandinavian Journal of Statistics, vol. 33, no. 3, pp. 159–188, 2006.
[9] B. Mahmoudian, “A skewed and heavy-tailed latent random field model for spatial extremes,” Journal of Computational & Graphical Statistics, vol. 26, no. 3, pp. 658–670, 2017.
[10] S. K. Sahu, D. K. Dey, and M. D. Branco, “A new class of multivariate skew distributions with applications to Bayesian regression models,” Canadian Journal of Statistics, vol. 31, no. 2, pp. 129–150, 2003.
[11] T. J. Kozubowski, K. Podgórski, and I. Rychlik, “Multivariate generalized Laplace distribution and related random fields,” Journal of Multivariate Analysis, vol. 113, pp. 59–72, 2013.
[12] M. M. Saber, A. R. Nematollahi, and M. Mohammadzadeh, “Generalized asymmetric Laplace random fields: existence and application,” Journal of Data Science, vol. 18, pp. 51–68, 2018.
[13] R. B. Arellano-Valle and M. G. Genton, “Multivariate extended skew-t distributions and related families,” Metron, vol. 68, no. 3, pp. 201–234, 2010.
[14] M. Wall, “A close look at the spatial structure implied by the CAR and SAR models,” Journal of Statistical Planning and Inference, vol. 121, no. 2, pp. 311–324, 2004.
[15] B. Mahmoudian, “On the existence of some skew-Gaussian random field models,” Statistics & Probability Letters, vol. 137, pp. 331–335, 2018.
[16] S. Basu and G. C. Reinsel, “Properties of the spatial unilateral first-order ARMA model,” Advances in Applied Probability, vol. 25, no. 3, 1993.
[17] M. M. Saber and A. R. Nematollahi, “Comparison of spatial interpolation methods in the first order stationary multiplicative spatial autoregressive models,” Communications in Statistics - Theory and Methods, vol. 46, no. 18, pp. 9230–9246, 2017.
[18] M. M. Saber, “Performance of extrapolation based on Pitman's measure of closeness in spatial regression models with extended skew t innovations,” Communications in Statistics - Theory and Methods, vol. 48, no. 2, pp. 282–299, 2019.
[19] A. Mojiri, Y. Waghei, H. R. Nili Sani, and G. R. Mohtashami Borzadaran, “Comparison of predictions by kriging and spatial autoregressive models,” Communications in Statistics - Simulation and Computation, vol. 47, no. 6, pp. 1785–1795, 2018.
[20] A. Mojiri, Y. Waghei, H. R. Nili Sani, and G. R. Mohtashami Borzadaran, “Non-stationary spatial autoregressive modeling for the prediction of lattice data,” Communications in Statistics - Simulation and Computation, forthcoming, 2021.
[21] F. Hosseini and O. Karimi, “Approximate pairwise likelihood inference in SGLM models with skew normal latent variables,” Journal of Computational and Applied Mathematics, vol. 398, 2021.
[22] N. A. C. Cressie, Statistics for Spatial Data, Wiley, New York, NY, USA, 1993.
[23] R. A. Moyeed and A. Papritz, “An empirical comparison of kriging methods for nonlinear spatial point prediction,” Mathematical Geology, vol. 34, no. 4, pp. 365–386, 2002.
[24] G. González-Farías, J. A. Domínguez-Molina, and A. K. Gupta, “The closed skew normal distribution,” in Skew-Elliptical Distributions and Their Applications: A Journey beyond Normality, M. G. Genton, Ed., Chapman & Hall, Boca Raton, FL, USA, pp. 25–42, 2004.
[25] R. B. Arellano-Valle and A. Azzalini, “Some properties of the unified skew-normal distribution,” Mathematics arXiv: Statistics Theory, vol. 63, pp. 461–487, 2020.
[26] S. Kotz, N. Balakrishnan, and N. L. Johnson, Continuous Multivariate Distributions. Models and Applications, Wiley, New York, NY, USA, vol. 1, 2000.

Copyright

Copyright © 2022 Mohammad Mehdi Saber and Ramy Abdelhamid Aldallal. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
