Abstract

FTIR microspectroscopy (FTIRM) is a commonly used nondestructive method to characterise thin bone sections. However, spectrum analysis methods are often highly sensitive to small variations (e.g., boundary limits), thus implying a time-consuming and redundant analysis process. To solve this issue, software has been developed based on several algorithms to automate the analysis. Furthermore, a rigorous framework has been established concerning the peak fitting method to obtain the systematic best potential solution. Validation of the automatic method has been performed by comparison with the manual method. Results and validation proved the reliability of the automatic process. The developed algorithms provide the means necessary to fully compare the results between bone FTIRM studies and between different laboratories.

1. Introduction

Bone analysis methods are quite varied [13], ranging from the noninvasive, e.g., X-ray scanning, to the destructive, e.g., mechanical rupture tests. FTIR microspectroscopy (FTIRM) is a promising field which characterises bone samples, as it is able to provide information on the chemical composition and structural organisation of the analysed matter without damaging it on a local level of a thin section [49]. This method combines optic, quantum mechanics, chemistry, and biology knowledge, which leads to a complex understanding that might have an impact on the conclusion of an FTIR study [10]. In addition, the spectrum analysis is laborious if done “manually.”

Several different methods are currently used to analyse FTIRM spectra (peak fitting method [1113], second derivative method [13, 14], etc.), and each has its advantages and limitations; moreover, none are considered to be a gold standard. We chose to stick with the peak fitting method used in [11, 12]. The second derivative method, although quite promising due to its objective and purely mathematical analysis, is too sensitive to noise (producing peaks that are not there initially) and quite inadequate in the case of peaks that are too close and with different full width at half maximum (FWHM) (the wider peak is then masked by the thinner one). This is why the second derivative method is not chosen (for more explanation, cf. Appendix C).

Traditionally, the peak fitting of a bone spectrum is done by selecting a region of interest (ROI) in the bone spectrum, followed by fitting the curve with a sum of Gaussians. Previously, this fitting was done by assuming both position and characteristics of the peaks (height and FWHM), giving a starting state for the least-squares fitting method. The result of the fitting process was then left to the operator for visual verification. As one can easily imagine, this technique proved to be fastidious, as correct results were typically obtained through several tries. Indeed, the regression may sometimes be highly sensitive and be executed multiple times for one ROI in order to find the correct fitting result.

In a bone spectrum (Figure 1), six distinct regions are identified as the vibrational response of different constituents:(i)The amide (I, II, and III) areas (1300, 1700) cm−1 corresponding to the response of the organic matrix in the bone tissue (with also a ν3CO3 contribution at around 1400 cm−1)(ii)The ν1ν3PO4 area (900, 1200) cm−1 corresponding to the symmetric and antisymmetric stretching of the phosphates(iii)The ν2CO3 area (800, 900) cm−1 corresponding to the symmetrical stretching of the carbonates (CO3), which also contains a large HPO4 contribution, which is removed before analysis by using a wisely placed baseline correction(iv)The ν4PO4 area (500, 650) cm−1 corresponding to the antisymmetric deformation of the phosphates (PO4).

We automated the methods used in [9, 11, 14, 15], which are biologically relevant methods (not meant to be spectroscopically accurate), to obtain the five parameters within these regions to characterise the sample:

Directly measured on the spectra are as follows:(i)Mineral-to-organic ratio (mineral/matrix): ratio of the ν1ν3PO4 area (910, 1184) cm−1 over the amide I area (1592, 1730) cm−1 [14](ii)Carbonation: ratio of the ν2CO3 area (862, 894) cm−1 over the ν1ν3PO4 area (910, 1184) cm−1 [14].

Postpeak fitting is given as follows(i)Crystallinity Index: inverse of the FWHM of the ∼604 cm−1 peak [15](ii)Mineral Maturity (MM): ratio of the apatitic (∼1030 cm−1 peak) over nonapatitic (∼1110 cm−1 peak) part of the ν1ν3PO4 area [15](iii)Collagen Maturity (Col Mat): ∼1660 cm−1 peak over ∼1690 cm−1 peak (part of the amide I area) [5, 11].

The manual peak fitting method might have induced subjectivity and imprecision, as there were no strict means to analyse bone spectra. Also, these methods are not always spectroscopically accurate (i.e., there are not always the right number of peaks corresponding to the exact composition of the studied material); however, these methods have been validated and are sufficient to conclude the proportion of compounds in the analysed bone.

These methods might be improved by the careful study and translation of the analysis part into an algorithm implemented into custom software.

The aim of this technical note was to offer a solution to the automation process of the peak fitting method used in FTIR spectrum analysis of the bone. Software, i.e., LYOS spectrum analysis, written in Python, has been developed to fully automate the analysis process of healthy human and thin mouse bone section’s infrared spectra, even though the software can be used to analyse other types of spectrum.

2. Materials and Methods

The validation process has been done using the following:(1)Fifty-one trabecular thin sections of human L2 vertebra from 51 donors (French body donation to science program, declaration number: DC-2015-2357; Laboratory of Anatomy, Faculty of Medicine Lyon Est, University of Lyon, France) (20 spectra per sample at 30 × 100 µm2 spatial resolution)(2)Four trabecular and cortical thin sections of human iliac crest from four donors (Laboratoire d’anatomie, Université Claude Bernard, Lyon, France) (78 spectra at 50 × 50 µm2 spatial resolution)(3)Twelve cortical thin sections of human femur from 12 subjects (MULTIPS: ANR-13-BS09-0006-01) (12 spectra at 150 × 150 µm2 spatial resolution)(4)One trabecular and cortical thin section of mouse tibia from one mouse (Balb/C E2A-55 #DR2015-58 MENESR: #APAFIS# 39132015121 0903612v2, Lysbone: ANR-15- CE14-0010) (10 spectra at 30 × 30 µm2 spatial resolution).

After dissection, samples were fixed in 70% ethanol for two weeks, dehydrated for 48 h in 100% ethanol, substituted for 48 h in methylcyclohexane (solvent that removes lipids in the samples), and then embedded in methyl methacrylate (MMA). The sections of 2 µm thickness were obtained with a microtome Polycut E (Reichert-Jung, Leica, Germany) and stored between two slides at room temperature.

The raw spectra were collected at 1 cm−1 resolution and averaged by 40 scans per scanned region in transmission mode with a Perkin–Elmer GXII AutoIMAGE microscope (Norwalk, CT, USA) equipped with a wide-band detector (mercury-cadmium-telluride) (7800–400 cm−1). The instrument used a Cassegrain objective of numerical aperture 0.6. The system has a spatial resolution of 10 µm at typical midinfrared wavelengths. Contribution from air was subtracted from the original spectrum.

The commercial software Spectrum (v6.2.0.0055) provided by Perkin–Elmer and Grams/AI 8.0 Spectroscopy Software provided by the Thermo Electron Corporation offer various preprocessing options and customisation for the automatic peak fitting process. The mandatory methods used in these software have been implemented and improved for automation in the custom Python software and are described below, such as a transformation from transmittance to absorbance values, a spectrum component removal (to extract the components of interest from a raw spectrum), and a baseline correction method (to remove the high-scale effects of a bone FTIRM measure).

2.1. Transmittance to Absorbance

The FTIRM spectrum of a thin section of the bone is obtained in transmittance. However, the spectrum analysis is conventionally done on an absorbance spectrum. The transmittance spectrum (T, in percent) is then transformed into its absorbance (A) form by the following function:

2.2. Spectrum Component Removal

Thin bone sections could not be realised without an embedding medium (methyl methacrylate, MMA). This medium creates a potential infrared-reactive component in the spectrum that must be removed in order to obtain the bone-only spectrum. This component presents a high absorption peak around 1730 cm−1 (Figure 2). The first step is to scan the MMA to have its pure spectrum in the same range as the bone spectrum (450, 4000) cm−1. Then, a custom algorithm of optimisation subtracts the MMA spectrum from the bone spectrum (cf. Appendix A). This process is done by selecting the boundaries of the highest MMA peak (1826, 1706) cm−1 between which is computed with the second derivative of the bone spectrum (corresponding to the curvature of the spectrum). An iterative loop consecutively guesses a coefficient by which the whole MMA spectrum is multiplied, and the subtraction result is evaluated on the convexity of the shape of the spectrum. The convexity of a curve is mathematically defined by the positivity of the second derivative. The best result is the one that has the minimum negative part. Each loop will check the negative part of the second derivative of the bone-embedded spectrum and try to minimise it by subtracting the weighted MMA-only spectrum (Figure 3). The algorithm can be summed up as follows:Find which is minimumwith : the spectrum subtracted by the weighted MMA-only spectrumand

2.3. Baseline Correction

The baseline of the bone in the (450, 4000) cm−1 is convex. The purpose is to realign this line horizontally. A multilinear baseline is chosen as an approximation of the effects of the interferences (beam spreading, sample heating, etc). Based on the convexity hypothesis, the algorithm used to find the baseline will first detect the local minima using a Savitzky–Golay smoothing [16, 17] (keeping intensity and position of the local extrema of the raw spectrum) and draw the segments between two minima containing the most minima above it while forbidding the presence of minima beneath it (Figure 4).

2.4. Peak Fitting

The peak fitting method is one of the methods used in the conventional FTIRM spectrum analysis [18] (least-squares method with trust region reflective algorithm to find the best suited parameters). The curve fitting function used is curve_fit from the scipy.optimize module version 1.0.0. Usually, only a first guess state for each Gaussian peak is given for the manual method, allowing the peaks to place everywhere with any characteristics (position, FWHM, and height). This first guess-only method often induces inaccurate results and forces the operator to execute the method multiple times in order to find a valid fit. The automation process is implemented by adding boundaries to find each peak’s parameter. These boundaries are carefully chosen for the solution to be unique (restricting the plurality of solutions induced by the manual method with first state guessing only). Ideally, boundaries should be chosen considering the characteristics of each sample (temperature, internal constraints, crystal properties, etc.). However, which characteristic and how each influences a peak’s parameter cannot be answered precisely with our current knowledge. Thus, a more empirical approach has been chosen to determine the boundaries of each peak. Based on a set of more than a thousand spectra considered to be well peak fitted, some schemes have been observed. Eight peaks were found in the amide area, five peaks in the ν1ν3PO4 area, and four peaks in the ν4PO4 area. Boundaries for each peak’s parameters have been established and are registered in Table 1. For each area, the first guess of each peak located at a specific wave number is set at 90% of the absorbance in height and at 5 cm−1 in FWHM. The ranges for the peak fitting of each area are chosen to give mathematically correct results. They are related to the biologically relevant range given in the introduction part but generally need to be wider, as mathematical information is present beyond the biologically relevant range (for example, the amide area range is (1300, 1700) cm−1 and the mathematical range used is (1300, 1787) cm−1, the same for the ν1ν3PO4 and ν4PO4). The algorithm can be summed up as follows:Find the Gaussians parameters for which minimizes with and : amplitude of the Gaussian,: position of the Gaussian,: of the Gaussian

with respect to the constraints to each Gaussian’s parameters explicited in Table 1.

2.4.1. The Amide Area (1300, 1787) cm−1

The area of interest inside the amide part of the spectrum lies around the (1600, 1730) cm−1 boundary (Figure 5). Even if (1300, 1600) cm−1 is not used in the measured parameters, it was kept to accurately fit the (1300, 1730) cm−1 area. The main peaks are the dashed-blue peak (∼1690 cm−1) and the dashed-purple peak (∼1660 cm−1). The 1660 cm−1 peak is highly influenced by the 1633 cm−1 (dashed orange), which is as well influenced by the 1600 cm−1 peak (full green peak). In bone, the 1633 cm−1 peak must be higher than the local minima at 1600 cm−1, and the height of the 1600 cm−1 peak greatly influences the height of the 1633 cm−1. All the parameters are constrained with static values (unchanged regarding the spectrum) (cf. Table 1). It may happen that the MMA peak is not correctly removed due to small differences of the shape of the MMA peak between the pure MMA spectrum and the MMA component in the bone + MMA spectrum. In order to minimise the implications of the residual MMA peak, an additional peak is included for the peak fitting method, which is located at 1745 cm−1. This is an automatic process realised when the algorithm detects a significant peak between 1730 cm−1 and 1760 cm−1 (i.e., when the positive area of the derivate between those values is superior to the half of the negative area of the derivate).

2.4.2. The ν1ν3PO4 Area (910, 1184) cm−1

The two peaks of interest here are the 1030 and 1110 cm−1 peak (full purple and green) (Figure 6). These two peaks are mainly influenced by the position of the peaks between 1060 and 1070 cm−1. Every parameter for this area determination is a static value (cf. Table 1).

2.4.3. The ν4PO4 Area (531, 659) cm−1

Only the 604 cm−1 peak (full yellow) is used to characterise the bone in the ν4PO4 area (Figure 7). This peak is mainly influenced by the 575 cm−1 peak. In bone, the 575 cm−1 peak’s height must be inferior to the 565 cm−1 peak to avoid the 575 cm−1 peak to spread too much on the 604 cm−1 peak area. Every parameter for this area is a static value (cf. Table 1).

2.4.4. Statistical Analysis

The validation of the spectrum analysis methods has been done using the root mean square of the coefficient of variation to assess the variation in manual methods, with the Bland–Altman method to compare manual and automatic results.

3. Results and Discussion

3.1. Validation

The first step of the validation process is done by analysing the precision of the manual peak fitting method. Two types of precision are measured: (1) the intraoperator precision that measures the ability of one operator to obtain the same result multiple times on the same spectrum and (2) the interoperator precision that measures the ability to get the same result from two different operators on the same spectrum.

The first operator (operator 1) is experienced with the bone FTIRM peak fitting, and is considered the reference. The second (operator 2) is an inexperienced operator who has been given the instructions to execute the peak fitting analysis.

Five different bone spectra have been peak fitted with the manual method two times by each operator (cf. Table 2). For the manual peak fitting, the root mean square of the coefficient of variation shows that results are operator-dependant.

Then, we compared both automatic and manual methods based on the spectra extracted from the human samples and mouse sample. The Bland–Altman graph plots the means and the differences between automatic and manual methods measured by a human operator. For the human bones, Figure 8 shows, at first sight, an almost null shift for every parameter. The ratio of valid results (i.e., results between the mean ± 1.96 ∗ SD) is 92% for the Crystallinity Index, 93% for the Mineral Maturity, and 92% for the Collagen Maturity.

For the mice tibia (Figure 9), the shift is also almost null for every parameter. The ratio of valid results under the aforementioned hypothesis is 90% of valid results for the Crystallinity Index, 100% for the Mineral Maturity, and 100% for the Collagen Maturity.

3.2. Discussion

Gaussians have been long used to establish methods and peaks of interests in human bone, meaning changing the fitting function would end in a change of the peaks of interest. Thus, to maintain consistency with previous studies, Gaussians are kept for the fitting method. However, it should be noted that Lorentzians or Voigt functions would be more accurate regarding the quantum reality of the spectra [19].

The development of the automatic method not only allowed a significant gain of time but also suppressed the interoperator and intraoperator variability.

Currently, there is no gold standard to check the exactitude of the results obtained. After careful examination of the automatic peak fitting of these spectra, it has been found that the results obtained with the automatic method are valid compared with the analysis done by the expert using the manual method.

The observed shifts are mainly due to the choice of boundaries within which the peak fitting method is applied. Indeed, there are no strict boundaries required for the manual peak fitting of the ν4PO4 area. This induces the variation of the shift, as shown in Appendix B. This problem is solved by the automatic method that fixes those boundaries, chosen after careful analysis of several sets of boundaries.

We could argue that it is not the most exact method in terms of localisation of the peaks, compared with the manual method (which is not a gold standard method and can be different from one laboratory to another). However, the main advantage provided by the automatic peak fitting is the absence of intraoperator and interoperator variation and reproducibility. The second advantage of the automatic process is the normalisation of its preprocessing methods (MMA removal and baseline subtraction), removing the influence a human operator has in the spectrum preparation. Finally, the automatic method reduces drastically the time needed to analyse a spectrum (10 s is required for the automatic process, while the manual process takes 5 to 10 min per spectra).

4. Conclusions

The algorithms described in this paper offer the possibility to analyse quickly an infrared spectrum of the thin bone section. This automatic method will allow the analysis of large sets of samples, which confer robust results.

By using machine learning methods on large data of deconvolved FTIRM spectra, the automatic method could be improved. This could bring specificity of peak fitting parameters regarding the type of the sample analysed, thus improving the exactitude and reliability of the results.

The software will be available under GNU GPL license.

Appendix

A. Simple Algorithm of Spectrum Component Removal

Variables used in the algorithm (Figure 10):

OrigSpectrumData: raw MMA-embedded bone spectrum.

RemSpectrumData: MMA-only spectrum.

FinalSpectrumData: bone spectrum (removed from the MMA component).

MainPeakBoundaryInf: inferior boundary of the position of the main MMA peak.

MainPeakBoundarySup: superior boundary of the position of the main MMA peak.

Coef: coefficient used to weight the MMA-only spectrum.

CoefSaved: the current best coefficient used to weight the MMA-only spectrum.

Delta: increment for the coefficient.

Area: area of the negative part of the second derivative of OrigSpectrumData − RemSpectrumData ∗ coef.

AreaSaved: current minimum area.

Precision: maximum precision for the coefficient.

MaxLoop: maximum number of loop.

n: current number of iteration.

B. An Example of the Choice of the Region of Interest Boundaries for the Same Spectrum for the Crystallinity Index

In Figure 11(a), if we choose the (536, 634) cm−1 boundaries, compared to the (531, 659) cm−1 (Figure 11(b)), we can see the Crystallinity Index varies from 0.040 to 0.035. The chosen boundaries for the automatic deconvolution are those of the bottom figure (531, 659) cm−1.

C. Searching the Position of Underlying Peaks in a Spectrum: Commentary on the Method of Second Derivative

The purpose of this study is to balance advantage/disadvantage of the method using the second derivative for the analysis of spectra.

The second derivative of the Gaussian function gives a peak located at the same abscissa for the maximal amplitude, but its pseudofull width at half maximum (FWHM) is around 0.53 ( is the FWHM of the original Gaussian).

The main advantage is that peaks are well discerned using the second derivate compared to the original Gaussian, and then, this value (0.53 ) permits to better discriminate peaks in spectra. An example is given in Figure 12. Taking a Gaussian (blue) with FWHM = 1, after calculation of the second derivative of this function (green), we can see that the peaks are located at the same abscissa, and the second derivative also shows a thinner peak (Figure 12, an example of a Gaussian with its FWHM = 1 and, for visualization, its second derivative normalized by the maximum of the Gaussian).

However, the second derivative brings up two major disadvantages:The emergence of two perturbations from either side of the main peak (cf. the 2 small green peaks around the main one).The maximal amplitude of the second derivative (A″) depends not only on the maximal amplitude of the original Gaussian A, but also of its FWHM (A″ ≈ 2.77 ∗ A/). In other words, thinner peaks in the original spectrum will be more present and visible in the second derivative than wider peaks. In some cases, it may happen that the wider peak in the original spectrum will be hidden (in the second derivative) by the thinner one (Figure 13).

If we use more complex spectra, the discrimination of peaks becomes more difficult. An example is given in Figure 14. As previously mentioned, the 5 original Gaussian curves (pink) are summed into one spectrum (blue). The second derivative (green) curve shows only 3 minima, which corresponds to the thinner or isolated original peaks. The position of the peak is also slightly shifted. In an ideal analysis, we should get 5 minima corresponding to the 5 original peaks (in abscissa).

A simple example is given in Figure 13. Two simple original Gaussian curves (dotted pink, vertical dotted line showing the position of the maximal amplitude for each) are summed into one spectrum (plain blue line). The second derivative of this spectrum is shown in green. As we can see, this method does not allow the obtaining of the 2 initial pink peaks. One of the wide pink peaks has been “hidden” after using the second derivative method.

And finally, if we introduce noise and a Savitzky–Golay smoothing (commonly find in the spectra treatment), other peaks start to appear and the position of peaks slightly shifts (Figure 15).

An example is given in Figure 15. As previously mentioned, the 5 original Gaussian curves (pink) are summed into one spectrum. This curve is degraded by adding noise, and then, a smoothing is applied (Savitzky–Golay smoothing), to correspond to the “classical” spectra we obtain in the bone (blue curve). The second derivative (green) curve shows more minima (9) than the original Gaussian curves. The position of the peak is also slightly shifted.

The method second derivative is applied to our bone sample which is shown in Figure 16.

One of the domains of the bone spectrum of the bone is shown between 900 and 1200 cm−1 (phosphate), with smoothing and baseline correction. The second derivative (green) is calculated, and outcomes enounced previously are visible on the graph. There are 5 minima. One appears (the first one) on the left and does not correspond to the presence of a real peak, and one disappears around (1100–1050 cm–1).

In conclusion, the second derivative method shows great advantages (peak thinning, easier discrimination, and localization of maxima in the case of a simple curve). However, in the case of complex spectra, the “purely mathematical” aspect of this method does not take into account the real composition of the analyzed samples and may give partially false results. In our lab, we decided to use another method: the peak fitting using a fix number of peaks, previously defined in [11, 12, 15], and based on the composition of the material (bone).

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

There are no conflicts to declare.

Acknowledgments

This study was partly funded by the ANR project MULTIPS (ANR-13-BS09-0006) and by the ANR project LYSBONE (ANR-15-CE14-0010), which provided the mice. The authors would like to thank Pr. Gérard Panczer from Université Claude Bernard Lyon 1 for his expert advice in FTIR analysis and his scientific review of the manuscript.