Abstract

Space use, estimated from location data, provides fundamental knowledge in the basic and applied ecology of wild animals. Location data collected from a tracking device attached to an animal are subject to a trade-off between sampling frequency and duration because the battery of the device has a limited life. In this study, we assessed how different combinations of sampling frequency and duration affect estimates of home-range size and maximum utilization area (MUA) of Japanese macaques, using datasets subsampled from the original dataset by reducing frequency and/or duration. MUA was likely to be overestimated if the sampling duration was shorter than 80 days for Japanese macaques. Reductions in sampling frequency and duration had opposite effects on estimates of home-range size: the estimated area decreased with decreasing sampling duration, while it increased with decreasing sampling frequency. Moreover, these opposite effects can offset each other when the sampling frequency and duration are reduced simultaneously. We discuss the applicability of our results to animals other than Japanese macaques and how to design the sampling frequency and duration in future research.

1. Introduction

Quantifying animal space use is important for developing conservation and management strategies as well as for basic ecology. Wild animals have high-usage areas typically used for activities such as foraging and reproduction, and peripheral sites outside the main area of activity are utilized occasionally (e.g., rendezvous sites). The home range, an important concept in basic ecology, is restricted to the area utilized by an individual in their normal activities, and peripheral sites outside the normal utilization area are not included in the home range [1]. However, there is increasing evidence that peripheral sites outside the home range are important in applied ecology such as wildlife conservation and management [2–4]. For example, incidents related to human-wildlife conflict have often been reported at peripheral sites [5, 6]. Hence, the concept of “maximum utilization area (MUA)” includes both the home range and peripheral sites [7].

A variety of estimators are available to calculate animal space use from location data, such as the minimum convex polygon (MCP), kernel density estimation (KDE), and the dynamic Brownian bridge movement model (dBBMM). The MCP estimator determines the smallest polygon that encompasses all locations including peripheral sites beyond the main area of activity (i.e., 100%-MCP). The KDE estimator uses the spatial distribution of locations to calculate the density of space use, in which space boundaries are built by joining sites with equal density. The dBBMM estimator, based on the temporal distribution (i.e., movement pathway) as well as the spatial distribution, was developed for home-range estimation to handle spatiotemporally high-resolution data collected with a global positioning system (GPS) device. In recent studies on animal home ranges, KDE and dBBMM are considered superior estimators to MCP [8–10] because the home range is restricted to the area utilized by an individual in their normal activities. In contrast, the MCP estimator, in which the space boundary includes peripheral sites in addition to home ranges, is applied for the estimation of MUA [11].

The widespread use of global positioning system (GPS) devices, one of the recent advances in animal tracking technology, has allowed researchers to collect location data of wild animals accurately [12]. However, since batteries attached to the GPS devices have limited lives and cannot be replaced with new ones unless the animal is recaptured, the GPS devices are inevitably subject to the trade-off between sampling frequency and duration of data collection [13]. Higher sampling frequencies result in higher battery usage and thus shorter durations of data collection. For the estimation of home ranges with the KDE estimator, Fleming et al. [14] showed that shorter sampling durations can result in smaller areas. Noonan et al. [15] reported that a high sampling frequency can cause autocorrelation in location data, resulting in a smaller area. These studies clearly indicate that the area of animal space use can vary depending on both frequency and duration of observation, but the simultaneous effects of sampling frequency and duration on the areal estimates have not been examined using empirical data. In previous studies on animal space use, the sampling frequency and duration of location data considerably varied depending on the tracking device (e.g., very high-frequency (VHF) radio tracking device vs. GPS data-logging collar), study animal (e.g., small vs. large mammals), and research objectives (e.g., annual vs. seasonal). To link animal space use based on datasets with different sampling regimes, we here assessed how different combinations of sampling frequency and duration would affect estimates of space use of Japanese macaques. In the Discussion section, the applicability of the results obtained from the Japanese macaques was checked with other animal species and simulated data.

2. Materials and Methods

2.1. Study Animal and Sites

The Japanese macaque (Macaca fuscata) was used as our model species. The natural foods of the species are mainly seeds, nuts, fruits, young leaves, flower buds, and shoots, depending on the season [16]. They are distributed on the main islands of the Japanese archipelago excluding Hokkaido (i.e., Honshu, Shikoku, and Kyushu) and some small islands. We focused on nine troops (TID1–TID9) of Japanese macaques, located on Honshu (the largest island of Japan) and Shikoku (Appendix S1: Table S1). These troops inhabit Satoyama, a Japanese traditional socioecological landscape including paddy fields, secondary forests, plantations, and grasslands. Japanese macaques inhabiting Satoyama generally have home ranges of one to tens of square kilometers in size [17–20]. In each troop, location data were collected from one adult female with a GPS data-logging collar (Tellus 1C light, Followit, Lindesberg, Sweden; GLT-02, Circuit Design, Nagano, Japan) because adult females are unlikely to leave their troops [20]. The location data were collected every hour between 6:00 and 18:00 (13 locations per day). Because of battery life constraints, the observation periods in this study were shorter than one year. Details of the collection of location data for analysis and ethical treatment of macaques were described in our earlier paper [7].

2.2. Calculation of Space Use

Here, we focused on both home-range size and maximum utilization area (MUA) as measures of animal space use. The area of space use should reach an asymptote with an adequate sample size [10, 21]. Since MUA includes both the typical utilization area (home range) and the occasional utilization area (peripheral sites), MUA was calculated using the 100%-MCP estimator based on a modified asymptotic curve of the utilization area plotted against the sample size, as proposed by Terayama et al. [7]. The modified asymptotic curve was developed to reduce unpredictable effects due to predator avoidance and rainfall, which were associated with animal activity and environmental conditions. For the home-range estimators, kernel density estimation (KDE) and the dynamic Brownian bridge movement model (dBBMM), we calculated the areas of the 95% isopleths. In KDE, which relies on a smoothing parameter (bandwidth, h) to generate a utilization distribution, we used the reference bandwidth method (KDEhref) as the bandwidth selection algorithm. In dBBMM, we used a moving window size of 13, a margin of 3, and a location error of 22 m.
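As a rough illustration of the plain 100%-MCP idea only (not the modified asymptotic-curve procedure of Terayama et al. [7], and not the adehabitatHR implementation used in this study), the following Python sketch computes the area of the convex hull of a set of projected coordinates. The function names, the assumption that coordinates are in metres, and the toy coordinates in the usage note are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed: projected coordinates in metres


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Convex hull by the monotone-chain method, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Polygon area by the shoelace formula (same units squared)."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def mcp_area_km2(points_m: List[Point]) -> float:
    """Area (km2) of the smallest convex polygon enclosing all locations."""
    return polygon_area(convex_hull(points_m)) / 1e6
```

For example, `mcp_area_km2([(0, 0), (2000, 0), (2000, 1000), (0, 1000), (1000, 500)])` gives 2.0, because the interior location does not change the enclosing polygon. A single peripheral location, in contrast, would enlarge the polygon, which is why the 100%-MCP suits MUA rather than the home range.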

2.3. Manipulations of Sampling Frequency and Duration

To investigate the effects of the sampling frequency (sampling interval length) on estimates of space use, we created a series of subsamples with increasing intervals of 2, 3, 6, and 24 hours as manipulated datasets, subsampled from the original dataset with intervals of 1 hour (Appendix S1: Table S1). For example, the sample size (number of locations) of the manipulated dataset with intervals of 2 hours was reduced to one-half of the original dataset (reduction rate, r = 1/2). For clarity, we denote the sampling frequencies (fr) of the original (1-hour intervals; r = 1) and manipulated (n-hour intervals; r = 1/n) datasets as f1 and f1/n, respectively. Note that the sampling frequency of the manipulated dataset with intervals of 24 hours (one location per day) was expressed as f1/13 because the original location data were collected only in the daytime (13 locations per day). We also considered additional subsamples with sampling intervals of <2 hours (i.e., f5/6, f3/4, and f2/3; see Appendix S1: Table S1 for details).
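The thinning described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: fixes are complete (no missing locations), every m-th scheduled fix is kept, and the 24-hour interval is treated as a step of 13 scheduled fixes because only 13 daytime fixes exist per day. The actual schedules are those given in Appendix S1: Table S1 and may differ; the function names are hypothetical.

```python
from typing import List, Tuple

Fix = Tuple[int, int]  # (day index, hour of day)

FIXES_PER_DAY = 13  # hourly fixes from 6:00 to 18:00, as stated in the text


def daily_schedule(n_days: int) -> List[Fix]:
    """Original 1-hour schedule (f1): hours 6..18 on each day."""
    return [(day, hour) for day in range(n_days) for hour in range(6, 19)]


def thinning_step(interval_h: int) -> int:
    """Number of scheduled fixes to step over for a given interval.

    Assumption: a 24-hour interval means one fix per day, i.e. a step of
    13 scheduled fixes, which corresponds to r = 1/13 rather than 1/24.
    """
    if interval_h == 24:
        return FIXES_PER_DAY
    if interval_h not in (1, 2, 3, 6):
        raise ValueError("interval must be 1, 2, 3, 6, or 24 hours")
    return interval_h


def subsample_by_interval(fixes: List[Fix], interval_h: int) -> List[Fix]:
    """Keep every m-th scheduled fix, so the retained fraction is about 1/m."""
    return fixes[::thinning_step(interval_h)]
```

With a 12-day schedule (156 fixes), intervals of 2, 3, 6, and 24 hours retain 78, 52, 26, and 12 fixes, that is, r = 1/2, 1/3, 1/6, and 1/13. Real data with missing fixes would need thinning by timestamp rather than by position in the list.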

To investigate the effects of the sampling duration (number of days) on estimates of space use, we created a series of subsamples with reduced durations as manipulated datasets, subsampled from the original dataset with the full duration. Following the manipulations of the sampling frequency, the sample sizes (numbers of days) of manipulated datasets were reduced by the same reduction rates as sampling frequency (i.e., r = 5/6, 3/4, 2/3, 1/2, 1/3, 1/6, and 1/13), and the sampling duration (Tr) of the original and manipulated datasets were denoted based on the sample size (e.g., T1 for the original dataset and T1/2 for the manipulated dataset with one-half duration). The reduced sampling duration was obtained from all possible combinations of, for example, selecting k consecutive days from the original dataset with n days, where k = r n (k was rounded to the nearest integer). In this case, there are n − k + 1 possible combinations.
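The window enumeration described above can be sketched as follows. The rounding rule for ties (half rounds up) is an assumption, since the text only states that k was rounded to the nearest integer; the function names are hypothetical.

```python
from fractions import Fraction
from math import floor
from typing import List


def window_length(n_days: int, r: Fraction) -> int:
    """k = r * n rounded to the nearest integer (ties round up: assumption)."""
    k = floor(Fraction(r) * n_days + Fraction(1, 2))
    return max(1, k)


def consecutive_windows(n_days: int, r: Fraction) -> List[range]:
    """All runs of k consecutive days; there are n - k + 1 of them."""
    k = window_length(n_days, r)
    return [range(start, start + k) for start in range(n_days - k + 1)]
```

For a troop tracked for n = 214 days and r = 1/3, `window_length` gives k = 71, and `consecutive_windows` returns 214 − 71 + 1 = 144 windows, each of which would be used to compute one predicted area.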

We assumed that the area calculated with the original dataset (f1 and T1) of each troop was the most accurate and complete estimate of space use, and we used this area as our reference area. Based on the reference area, we compared how manipulations (i.e., reductions in sampling frequency and duration) affected the space use estimate. To quantify differences between the reference area and the predicted areas estimated with manipulated datasets (see Figure 1), we calculated errors of omission (false negative: FN) and errors of commission (false positive: FP). We defined the omission error ratio as the ratio of the FN area to the reference area. Likewise, we defined the commission error ratio as the ratio of the FP area to the reference area.
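The two error ratios can be illustrated with a small Python sketch. It assumes, for illustration only, that the reference and predicted areas have been rasterized onto a common grid of equal-area cells, so that areas reduce to cell counts; the source does not state how the overlap was computed, and the function names are hypothetical. The ratio of the predicted area to the reference area is also returned for comparison.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (column, row) of an equal-area grid cell


def error_ratios(reference: Set[Cell], predicted: Set[Cell]) -> Dict[str, float]:
    """Omission and commission error ratios relative to the reference area."""
    if not reference:
        raise ValueError("reference area must not be empty")
    ref_area = len(reference)
    false_negative = len(reference - predicted)  # in reference, missed
    false_positive = len(predicted - reference)  # predicted, not in reference
    return {
        "omission": false_negative / ref_area,
        "commission": false_positive / ref_area,
        "area_ratio": len(predicted) / ref_area,
    }


def block(x0: int, x1: int, y0: int, y1: int) -> Set[Cell]:
    """Rectangular block of cells, used here only to build toy examples."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}
```

For a single reference/prediction pair, the predicted area equals the reference area minus the FN area plus the FP area, so the area ratio equals 1 − omission ratio + commission ratio. With `block(0, 10, 0, 10)` as the reference and `block(2, 12, 0, 10)` as the prediction, both error ratios are 0.2 while the area ratio is exactly 1, which shows how a prediction of the correct size can still be misplaced on a map.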

All analyses were performed in R v. 3.6.1 [22] using the range estimation techniques implemented in the R packages adehabitatHR [23], drc [24], and move [25]. Since the package adehabitatHR requires at least 5 locations to calculate an area, days with fewer than 5 locations were excluded from the analyses.

3. Results

The full sampling durations (T1) of nine macaque troops ranged from 214 to 283 days (Table 1). The reference areas of maximum utilization area (MUA) ranged from 3.1 to 87.1 km², those of kernel density estimation (KDE) ranged from 3.1 to 57.5 km², and those of dynamic Brownian bridge movement model (dBBMM) ranged from 2.9 to 32.2 km².

The area ratio (AR), defined as the ratio of the predicted area to the reference area, calculated with MUA depended on the sampling duration rather than on the sampling frequency (Figure 2(a)). The predicted area was comparable to the reference area if T ≥ T1/3 (i.e., AR: 0.77–1), while it was considerably greater than the reference area when T < T1/3 (AR: 1.6–5.26). For KDE, the area ratio slightly decreased with decreasing sampling duration, while it increased with decreasing sampling frequency (Figure 2(b)). For example, AR = 1 with T = T1 and f = f1, AR = 0.86 with T1/13 and f1; AR = 1.24 with T1 and f1/13, and AR = 1.22 with T1/13 and f1/13. For dBBMM, the area ratio decreased with decreasing duration, while it increased with decreasing frequency except for f1/13 (Figure 2(c)): AR = 0.55 with T1/13 and f1, AR = 1.43 with T1 and f1/6, and AR = 0.86 with T1/13 and f1/6. The area ratio with f1/13 drastically decreased from 1.43 (T = T1) to 0.54 (T1/13). For KDE and dBBMM, a decrease in sampling frequency resulted in an increase in the predicted area, while a decrease in sampling duration resulted in a decrease in the predicted area.

For KDE and dBBMM, errors of omission increased with reductions in duration and frequency, and the errors were more sensitive to reductions in duration than reductions in frequency (Figures 3(a), 3(c)). For example, in KDE, the omission error ratio (ERomi) was 0.12 when only duration was halved (f1 and T1/2), and it was 0.0075 when only frequency was halved (f1/2 and T1). In dBBMM, ERomi = 0.20 with f1 and T1/2 and ERomi = 0.021 with f1/2 and T1. Note that ERomi = 0 with the original dataset (f1 and T1). Errors of commission increased with reductions in duration and frequency (Figures 3(b) and 3(d)). In KDE, the commission error ratio (ERcom) was 0.080 with halved duration (f1 and T1/2), and it was 0.061 with halved frequency (f1/2 and T1). In dBBMM, ERcom = 0.067 with f1 and T1/2 and ERcom = 0.17 with f1/2 and T1. Note that ERcom = 0 with f1 and T1.

4. Discussion

The effects of reductions in sample size (i.e., decreased sampling frequency and duration) on the predicted area differed between maximum utilization area (MUA) and home-range size. Using location data of Japanese macaques, we showed that MUA, which includes peripheral sites as well as the home range, was likely to be overestimated when the sampling duration (T) was shorter than the critical duration (T1/3 = 82 ± 6 days, mean ± SD, n = 9). To judge whether the critical duration depended on the absolute duration (82 days) or on the duration relative to the full sampling duration (T1/3), the MUA calculation was applied to simulated data with different full sampling durations (i.e., 180, 360, 720, and 1080 days). With simulated data created by a biased correlated random walk, the critical duration depended on the absolute duration (Appendix S1: Figure S1). By using published data (see Appendix S1: Figure S2), we also applied the MUA calculation to other animals. The results obtained from the bobcat (Lynx rufus) and coyote (Canis latrans) were qualitatively similar to those of the macaques in that the predicted area was close to the reference area (i.e., macaque: AR = 0.77–1, bobcat: AR = 0.83–1, and coyote: AR = 0.96–1.04) as long as T was equal to or longer than a critical duration (Appendix S1: Figure S2). The critical durations of the bobcat and coyote were 115 and 86 days, respectively.
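The simulated data mentioned above were created by a biased correlated random walk, for which the text gives no parameter values. The following Python sketch shows one common way to build such a walk: each heading is drawn around a weighted circular mean of the previous heading (correlation) and the bearing toward a home centre (bias). All parameter names and default values here are hypothetical and are not those used in this study.

```python
import math
import random
from typing import List, Optional, Tuple


def simulate_bcrw(
    n_steps: int,
    bias_weight: float = 0.3,    # hypothetical: pull toward the home centre
    kappa: float = 2.0,          # hypothetical: concentration of turning noise
    mean_step_m: float = 150.0,  # hypothetical: mean step length in metres
    centre: Tuple[float, float] = (0.0, 0.0),
    seed: Optional[int] = None,
) -> List[Tuple[float, float]]:
    """Return n_steps + 1 locations of a biased correlated random walk."""
    rng = random.Random(seed)
    x, y = centre
    heading = rng.uniform(0.0, 2.0 * math.pi)
    track = [(x, y)]
    for _ in range(n_steps):
        to_centre = math.atan2(centre[1] - y, centre[0] - x)
        # Weighted circular mean of the bias direction and the last heading.
        mean_dir = math.atan2(
            bias_weight * math.sin(to_centre)
            + (1.0 - bias_weight) * math.sin(heading),
            bias_weight * math.cos(to_centre)
            + (1.0 - bias_weight) * math.cos(heading),
        ) % (2.0 * math.pi)
        heading = rng.vonmisesvariate(mean_dir, kappa)
        step = rng.expovariate(1.0 / mean_step_m)
        x += step * math.cos(heading)
        y += step * math.sin(heading)
        track.append((x, y))
    return track
```

Calling `simulate_bcrw(180 * 13, seed=1)` would mimic 180 days of 13 fixes per day under these assumed parameters; longer full durations (360, 720, and 1080 days) only change `n_steps`.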

For kernel density estimation (KDE) and dynamic Brownian bridge movement model (dBBMM), reductions in sampling frequency and duration had opposite effects on the predicted area: the predicted area increased with decreasing sampling frequency, and it decreased with decreasing sampling duration. The KDE estimator calculates the area based on the probability density that an animal is found at a given point in space, where the smoothing parameter (bandwidth, h) controls the “width” of the probability density placed over each location point and a wider bandwidth results in a larger predicted area [23]. Noonan et al. [15] reported that the predicted areas decreased with increasing sampling frequency due to autocorrelation in location data. To take autocorrelation in location data into consideration, Fleming et al. [14] developed the autocorrelated kernel density estimation (aKDE), which optimizes the bandwidth using a fitted autocorrelation model. In addition, Long and Nelson [26] reported that a low sampling frequency resulted in a large predicted area, due to an increase in the bandwidth. Using our Japanese macaque data, we confirmed that a reduction in the sampling frequency resulted in an increase in the bandwidth of KDE, whereas the area predicted with aKDE decreased simply because of the decreased sample size (Appendix S1: Figures S3 and S4). These results suggest that location data with a high sampling frequency are likely to result in a small predicted area due to decreased bandwidth associated with autocorrelation. Location data with a low sampling frequency are likely to result in a large predicted area due to increased errors of commission (Figure 3(b)).
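The link between sample size and bandwidth can be made concrete with a short Python sketch. It assumes the commonly cited form of the reference bandwidth for a bivariate normal kernel, h = σ n^(−1/6) with σ² = (var(x) + var(y))/2; whether KDEhref in this study follows exactly this form is an assumption not confirmed by the text, and the function names are hypothetical.

```python
import math
from statistics import variance
from typing import Sequence


def href(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Reference bandwidth, assuming h = sigma * n**(-1/6)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two coordinate lists of equal length >= 2")
    sigma = math.sqrt(0.5 * (variance(xs) + variance(ys)))
    return sigma * n ** (-1.0 / 6.0)


def bandwidth_inflation(r: float) -> float:
    """Factor by which h grows when n is reduced to r * n, sigma unchanged."""
    if not 0.0 < r <= 1.0:
        raise ValueError("reduction rate r must be in (0, 1]")
    return r ** (-1.0 / 6.0)
```

Under this assumption, thinning hourly fixes to one fix per day (r = 1/13) leaves the spatial spread σ roughly unchanged but inflates the bandwidth by a factor of 13^(1/6), about 1.53, which is consistent with the larger predicted areas at low sampling frequencies.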

In contrast to reductions in sampling frequency, reductions in sampling duration decreased the predicted area. Location data with a short sampling duration resulted in large errors of omission (Figure 3(a)), indicating that the predicted area failed to reflect the seasonal movement of animals. In fact, the home ranges of our macaque data varied monthly in size and shape (Appendix S1: Figure S5). A short sampling duration may not provide data representative of the movement of an animal because it can take a considerable amount of time for the animal to journey through its home range [14].

The predicted area calculated with dBBMM decreased more markedly with reductions in sampling duration than the area calculated with KDE (Figures 2(b) and 2(c)). The dBBMM estimator calculates the area based on the temporal distribution (i.e., movement pathway), as well as the spatial distribution (probability density), of the location data of an animal [27]. Therefore, it tends to prune the home range down to the footprint of the animal along its movement track such as core areas and corridors [9, 28]. Our Japanese macaque data showed that this tendency was strengthened when the sampling duration was reduced (Appendix S1: Figure S6).

We here estimated home ranges using 95%-KDEhref or 95%-dBBMM. Applying other estimators such as KDE with least-squares cross-validation as the bandwidth selection algorithm (KDElscv), 50%-KDEhref and 50%-dBBMM did not alter the results that reductions in sampling frequency and duration had opposite effects on the predicted area (Appendix S1: Figures S7 and S8).

In this paper, we used Japanese macaques as our model species. The simultaneous effects of reductions in sampling frequency and duration on the home-range estimation were tested for location data of other medium-sized mammals (bobcat and coyote), and we confirmed that the predicted area increased with decreasing sampling frequency, while it decreased with decreasing sampling duration (Appendix S1: Figure S9). Based on a comprehensive analysis of location data of animals including large-sized mammals (e.g., buffalo and lion), birds (e.g., hornbill and vulture), and other vertebrates (e.g., turtle), Noonan et al. [15] showed the effects of sampling frequency and of sampling duration independently on the estimates of home ranges, similar to the results obtained in this study. Taken together, our results based on Japanese macaque data, namely that home-range estimates are likely to increase with decreasing sampling frequency and to decrease with decreasing sampling duration, would be applicable to a wide variety of vertebrates.

Our results provide insights into the estimation of home ranges in relation to the sampling regime in cases where home ranges are compared based on datasets with different sampling frequencies and/or durations. Among datasets with different sampling frequencies, a dataset with a lower sampling frequency is likely to overestimate the home range. Among datasets with different sampling durations, a dataset with a shorter sampling duration is likely to underestimate the home range. The opposite effects of reductions in sample size can be more or less offset when a dataset with a lower sampling frequency and a shorter duration is compared to another dataset (Appendix S1: Figure S10), which is novel information on the effects of sample size on home-range estimates. In other words, a home range estimated with a low sampling frequency and a short duration (i.e., low-resolution data) can be close in size to a home range estimated with a high sampling frequency and a long duration (high-resolution data). However, researchers should keep in mind that the home-range boundary with low-resolution data does not reflect the accurate home range on a map because errors of omission and commission increase with decreasing sampling frequency and duration. Since the life of the battery attached to a GPS data logger is inevitably limited, the insights provided here can help to design the sampling regime (i.e., the trade-off between sampling frequency and duration) in future research. For example, underestimating a home range due to a short sampling duration is not appropriate for seasonally migrating animals. We suggest that researchers pay more attention to the sampling duration (i.e., ≥1 year including all seasons) than to the sampling frequency if the battery attached to an animal is limited in size for animal welfare reasons. In contrast, overestimating a home range may be inappropriate for megaherbivores such as elephants because restoring their vast home range is not feasible due to a lack of funds or manpower [29]. For such animals, an increase in sampling frequency would reduce the overestimation of the home range.

Data Availability

The data used to support the findings of the study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

We thank Matthew Helmus for comments on the manuscript. This work was supported by the Sasakawa Scientific Research Grant (Grant no. 2023-5001) from the Japan Scientific Society. We also thank the Japanese Ministry of Environment and local governments of Kyoto and Tokushima Prefectures and Toyokawa City (Aichi Prefecture) for permission to use location data for analysis.

Supplementary Materials

Table S1: sampling schedules in relation to sampling frequency in our original and manipulated datasets. Figure S1: effects of reductions in sampling frequency (f) and duration (T) on MUA estimates using simulated data with 180 (a), 360 (b), 720 (c), and 1080 days (d). Figure S2: effects of reductions in sampling frequency (f) and duration (T) on MUA estimates using location data of the bobcat, Lynx rufus (a), and the coyote, Canis latrans (b). Figure S3: comparison of the variation in area ratio between 95%-aKDE (autocorrelated kernel density estimation) and 95%-KDEhref (KDE with the reference bandwidth method), at different sampling frequencies, for Japanese macaque (a) and simulated data (b). Figure S4: standard deviation (SD) in smoothing parameter (bandwidth, h) at different sampling frequencies for Japanese macaque (a) and simulated data (b). Figure S5: distribution map of monthly home ranges of nine troops (TID1–TID9) estimated using KDEhref. Figure S6: examples of the reference and predicted areas estimated with KDEhref and dBBMM for the three troops of Japanese macaques (TID2, TID7, and TID8). Figure S7: effects of reductions in sampling frequency (f) and duration (T) on the home-range estimates based on KDElscv (KDE with least-squares cross-validation as the bandwidth selection algorithm) using Japanese macaque data. Figure S8: effects of reductions in sampling frequency (f) and duration (T) on the 50%-KDEhref (a), 50%-KDElscv (b), and 50%-dBBMM estimates (c) using Japanese macaque data. Figure S9: effects of reductions in sampling frequency (f) and duration (T) on KDEhref and dBBMM estimates using bobcat (a, b) and coyote data (c, d). Figure S10: effects of the simultaneous reductions in sampling frequency (f) and duration (T) on home-range estimates (KDEhref and dBBMM) using Japanese macaque (a, b), bobcat (c, d), and coyote data (e, f), where the sampling frequency (f) and duration (T) were reduced at the same reduction rate (r).