Abstract

In the field of cell and molecular biology, green fluorescent protein (GFP) images provide functional information embodying the molecular distribution of biological cells while phase-contrast images maintain structural information with high resolution. Fusion of GFP and phase-contrast images is of high significance to the study of subcellular localization, protein functional analysis, and genetic expression. This paper proposes a novel algorithm to fuse these two types of biological images via generative adversarial networks (GANs) by carefully taking their own characteristics into account. The fusion problem is modelled as an adversarial game between a generator and a discriminator. The generator aims to create a fused image that well extracts the functional information from the GFP image and the structural information from the phase-contrast image at the same time. The target of the discriminator is to further improve the overall similarity between the fused image and the phase-contrast image. Experimental results demonstrate that the proposed method can outperform several representative and state-of-the-art image fusion methods in terms of both visual quality and objective evaluation.

1. Introduction

In the field of cell and molecular biology, fluorescent imaging and phase-contrast imaging are two representative imaging approaches. As a widely used tool in fluorescent imaging, green fluorescent protein (GFP) displays bright green fluorescence when exposed to light in the blue-to-ultraviolet range. The GFP image contains functional information related to the molecular distribution of biological cells but has very low spatial resolution. Phase-contrast imaging is an optical microscopy technique that visualizes phase shifts by converting them into variations of amplitude or contrast in the image. The phase-contrast image provides structural information with high resolution. Fusion of GFP and phase-contrast images is of great significance to the localization of subcellular structures, the functional analysis of proteins, and the analysis of gene expression [1].

In recent years, a variety of image fusion methods have been proposed. Generally, existing image fusion algorithms consist of three main steps: image transform, fusion, and inverse transform [2]. Representative fusion methods include multiscale transform-based ones [3–8], sparse representation-based ones [9–13], spatial domain-based ones [14–17], hybrid transform-based ones [18–21], etc. In most existing image fusion methods, the role of each input image is equivalent in terms of the fusion system, which means that the input images generally undergo identical transforms and uniform fusion rules. However, for the problem of GFP and phase-contrast image fusion, considering that the input images vary significantly from each other, different roles can be assigned to them in the fusion system by carefully addressing their own characteristics, which is likely to provide a more effective way to tackle this fusion issue.

In this paper, we propose a novel GFP and phase-contrast image fusion method based on generative adversarial networks (GANs). The fusion problem is modelled as an adversarial game between a generator and a discriminator. The aim of the generator is to obtain a fused image that integrates the functional information from the GFP image together with the structural information from the phase-contrast image, while the discriminator further ensures the overall similarity between the fused image and the phase-contrast image. This adversarial process enables the fusion result to capture the complementary information from different input images as much as possible. An example of the proposed method is illustrated in Figure 1, where the input GFP and phase-contrast images are shown in Figures 1(a) and 1(b), respectively. Figure 1(c) shows the fusion result obtained by the proposed method. By referring to the input images, it can be seen that our method achieves high performance in terms of the preservation of functional and structural information. The main contributions of this paper are summarized as follows:
(1) We propose a deep learning- (DL-) based GFP and phase-contrast image fusion method via generative adversarial networks (GANs). To extract information from these two kinds of biological images adequately, the input images are treated differently in the proposed fusion model according to their own characteristics.
(2) Extensive experiments on more than 140 pairs of input images demonstrate that the proposed method outperforms several representative image fusion methods in terms of both visual quality and objective evaluation.

The remainder of this paper is organized as follows. Section 2 reviews related work. In Section 3, the proposed GAN-based image fusion method is introduced in detail. The experimental results and discussions are given in Section 4. Finally, Section 5 concludes the paper.

2. Related Work

2.1. GFP and Phase-Contrast Image Fusion

Fusion of GFP and phase-contrast images is conducive to the study of subcellular localization and the functional properties of proteins. In the past few years, several image fusion methods have been proposed to address this issue [22–24]. Li and Wang [22] proposed an NSCT-based GFP and phase-contrast image fusion method. In their method, the intensity components of the input images are decomposed by NSCT and the obtained coefficients are then merged by a variable-weight fusion rule. In [23], Feng et al. introduced a fusion approach for GFP and phase-contrast images based on the sharp frequency localization contourlet transform (SFL-CT). To fuse the decomposed coefficients, they designed a maximum region energy- (MRE-) based rule, a maximum absolute value- (MAV-) based rule, and a neighborhood consistency measurement- (NCM-) based rule to merge the approximation subbands, the finest detailed subbands, and the other detailed subbands, respectively. Recently, Qiu et al. [24] presented a complex shearlet transform- (CST-) based method to fuse GFP and phase-contrast images. The high-frequency subbands are fused with the traditional absolute-maximum rule, while a Haar wavelet-based energy rule is introduced to merge the low-frequency subbands.

It is worth noting that all of the above GFP and phase-contrast image fusion methods are based on conventional multiscale transforms. Moreover, the role of each input image is equivalent in these fusion methods, as they handle the GFP image (more precisely, its intensity component) and phase-contrast image in the same way.

2.2. Deep Learning-Based Image Fusion

In recent years, due to the high effectiveness and convenience of deep learning (DL) models in feature representation, DL-based study has emerged as a very active direction in the field of image fusion [25]. Many DL models such as stacked autoencoders (SAEs) and convolutional neural networks (CNNs) have been employed in a wide range of image fusion problems including remote sensing image fusion [26, 27], multifocus image fusion [28–30], multiexposure image fusion [31, 32], medical image fusion [33, 34], and infrared and visible image fusion [35–37]. In [26], Huang et al. first introduced deep learning into remote sensing image fusion by applying a sparse denoising autoencoder to characterize the nonlinear mapping between low- and high-resolution multispectral image patches. Liu et al. [28] proposed a CNN-based multifocus image fusion method in which a Siamese network is designed to simultaneously play the roles of activity level measurement and fusion rule. In [31], Kalantari and Ramamoorthi introduced a learning-based multiexposure image fusion approach via CNN to model the complex deghosting process in dynamic scenes. Hermessi et al. [33] presented a CNN-based medical image fusion method which preextracts the shearlet features of the source images as network input. Most recently, Ma et al. [35] introduced a novel generative adversarial network- (GAN-) based infrared and visible image fusion method by modelling the fusion problem as an adversarial game, aiming to preserve infrared intensities and visible details at the same time. This work demonstrates the high potential of GAN models for multimodal image fusion.

2.3. Motivations of This Work

In this work, considering that the characteristics of the GFP image and the phase-contrast image are significantly different, and unlike the existing fusion methods for this issue introduced in Section 2.1, different roles are assigned to the input images so as to extract information from them more effectively. To this end, and inspired by the great progress recently achieved in image fusion by deep learning, a GAN-based GFP and phase-contrast image fusion method is presented. We mainly adopt the GAN-based fusion scheme introduced in [35] due to its effectiveness and simplicity in multimodal image fusion, while carefully devising the loss functions according to the characteristics of the GFP and phase-contrast images. To the best of our knowledge, this is the first time that a DL-based approach has been applied to GFP and phase-contrast image fusion.

3. The Proposed Method

3.1. Overview

Figure 2 shows the schematic diagram of the proposed GFP and phase-contrast image fusion method. The fusion issue is formulated as an adversarial problem in order to preserve the complementary information contained in the input images as much as possible. The GFP image is treated as an RGB color image in the fusion process. It is first converted into the YUV color space, which effectively separates the intensity (luminance) component from the chrominance components. This is a widely used approach in the field of functional and structural image fusion [6, 38].

During the training process, the GFP image is converted into the YUV color space to acquire the Y, U, and V components, denoted by $I_Y$, $I_U$, and $I_V$. Then, $I_Y$ and the phase-contrast image $I_{PC}$ are concatenated in the channel dimension to generate a two-channel map $I_{in}$, in which the first channel is $I_Y$ and the second channel is $I_{PC}$. Next, $I_{in}$ is fed into the generator $G$ and the output is termed the intermediate fused image $I_F$, which is expected to maintain the functional information of $I_Y$ and retain the structural information of $I_{PC}$. Then, $I_F$ and $I_{PC}$ are fed into the discriminator $D$ to further ensure the overall similarity between them. In this way, an adversarial game between $G$ and $D$ is established.

During the testing process, $I_Y$ and $I_{PC}$ are concatenated in the channel dimension and then fed into the trained generator to obtain the intermediate fused image $I_F$. The final fused image is acquired by performing the inverse YUV conversion (i.e., YUV to RGB) over $I_F$, $I_U$, and $I_V$.
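As a concrete illustration of this pipeline, the following Python sketch (not the authors' released code) traces the test-time steps; the BT.601 conversion matrices and the helper names are assumptions made for illustration, and the trained generator is assumed to be a PyTorch module.

```python
# A minimal sketch of the test-time fusion pipeline described above.
import numpy as np
import torch

RGB2YUV = np.array([[ 0.299,  0.587,  0.114],
                    [-0.147, -0.289,  0.436],
                    [ 0.615, -0.515, -0.100]])
YUV2RGB = np.linalg.inv(RGB2YUV)

def fuse_pair(gfp_rgb, phase, generator):
    """gfp_rgb: HxWx3 array in [0, 1]; phase: HxW array in [0, 1]."""
    yuv = gfp_rgb @ RGB2YUV.T                       # separate luminance from color
    i_y, i_u, i_v = yuv[..., 0], yuv[..., 1], yuv[..., 2]
    # concatenate the Y component and the phase-contrast image as a 2-channel input
    x = torch.from_numpy(np.stack([i_y, phase]))[None].float()
    with torch.no_grad():
        i_f = generator(x)[0, 0].numpy()            # intermediate fused image I_F
    # if the generator uses no padding, its output is smaller: crop U/V to match
    h, w = i_f.shape
    dh, dw = (i_u.shape[0] - h) // 2, (i_u.shape[1] - w) // 2
    i_u = i_u[dh:dh + h, dw:dw + w]
    i_v = i_v[dh:dh + h, dw:dw + w]
    fused_yuv = np.stack([i_f, i_u, i_v], axis=-1)  # recombine with chrominance
    return np.clip(fused_yuv @ YUV2RGB.T, 0.0, 1.0) # inverse YUV -> RGB
```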

3.2. Network Architecture

The network architecture of the generator $G$ is shown in Figure 3. The input of the generator is the concatenation of $I_Y$ and $I_{PC}$, which is followed by a five-layer convolutional network. The kernel sizes of the filters in the first two layers, the next two layers, and the last layer are different (see Figure 3 for the specific values). The symbol "n256s1" denotes that the corresponding layer has 256 feature maps and a stride of 1, and so forth. In each convolutional layer, the stride is 1 and no padding is applied. To preserve the details contained in the source images, no downsampling is adopted in any layer. Besides, to alleviate the problems of vanishing gradients and sensitivity to data initialization, batch normalization is employed in the first four layers. Leaky ReLU and tanh activation functions are used in the first four layers and the last layer, respectively. The output of $G$ is the intermediate fused image $I_F$.
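A possible PyTorch realization of this generator is sketched below. The kernel sizes (5-5-3-3-1) and the channel widths 256-128-64-32 follow the FusionGAN design [35] that this work builds on and should be treated as assumptions, since only the 256-feature-map first layer ("n256s1") is stated explicitly in the text.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        def block(c_in, c_out, k):
            # stride 1, no padding, batch normalization + leaky ReLU (first four layers)
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=k, stride=1, padding=0),
                nn.BatchNorm2d(c_out),
                nn.LeakyReLU(0.2, inplace=True))
        self.features = nn.Sequential(
            block(2, 256, 5),     # n256s1
            block(256, 128, 5),   # n128s1
            block(128, 64, 3),    # n64s1
            block(64, 32, 3))     # n32s1
        self.out = nn.Conv2d(32, 1, kernel_size=1, stride=1)  # last layer, tanh output

    def forward(self, x):         # x: (B, 2, h, w) = concat(I_Y, I_PC)
        return torch.tanh(self.out(self.features(x)))
```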

The network architecture of the discriminator $D$ is shown in Figure 4. The inputs of the discriminator are $I_F$ and $I_{PC}$, each processed by a five-layer network in which the first four layers are convolutional with a stride of 2 (see Figure 4 for the kernel sizes and numbers of feature maps). The discriminator actually plays the role of a classifier. Batch normalization is employed in the second, third, and fourth layers, the leaky ReLU activation function is used in the first four layers, and the last layer is a linear layer. The output of the discriminator is the predicted label (a scalar).
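A matching sketch of the discriminator is given below; the 3×3 kernels, the channel widths 32-64-128 of the first three layers, and the input patch size are assumptions (only the stride of 2, the 256-feature-map fourth layer, and the linear output layer are stated in the text).

```python
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, patch_size=72):   # assumed spatial size of the input patches
        super().__init__()
        def block(c_in, c_out, bn=True):
            layers = [nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=0)]
            if bn:                        # no batch normalization in the first layer
                layers.append(nn.BatchNorm2d(c_out))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return nn.Sequential(*layers)
        self.features = nn.Sequential(
            block(1, 32, bn=False), block(32, 64), block(64, 128), block(128, 256))
        n = patch_size
        for _ in range(4):
            n = (n - 3) // 2 + 1          # spatial size after each stride-2 convolution
        self.fc = nn.Linear(256 * n * n, 1)   # linear last layer -> predicted label

    def forward(self, x):                 # x: (B, 1, h, w) fused or phase-contrast patch
        return self.fc(self.features(x).flatten(1))
```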

3.3. The Definition of the Loss Functions

The loss functions of our network are composed of two parts: the loss function of the generator, $L_G$, and the loss function of the discriminator, $L_D$. To improve the quality of the generated images and the stability of the training process, they are designed based on the least squares generative adversarial networks (LSGANs) introduced by Mao et al. [39].

3.3.1. The Loss Function of the Generator

The loss function of $G$ is formulated as
$$L_G = L_{adv} + \lambda L_{con},$$
where $L_{adv}$ and $L_{con}$ denote the adversarial loss between the generator and the discriminator and the content loss, respectively. The parameter $\lambda$ is used to control the balance between $L_{adv}$ and $L_{con}$. The first term is defined as
$$L_{adv} = \frac{1}{N}\sum_{n=1}^{N}\left(D\left(I_F^{\,n}\right)-c\right)^{2},$$
where $N$ is the number of training samples in a batch and $I_F^{\,n}$ denotes the fused image with index $n$. The parameter $c$ is the value that the generator expects the discriminator to believe in terms of the fake data. The second term is formulated as
$$L_{con} = \frac{1}{HW}\left\|I_F - I_Y\right\|_F^{2} + \xi\,\frac{1}{HW}\left\|I_F - I_{PC}\right\|_F^{2} + \eta\left(1-\mathrm{SSIM}\left(I_F, I_{PC}\right)\right),$$
where $H$ and $W$ indicate the height and width of the input images, respectively, $\left\|\cdot\right\|_F$ denotes the matrix Frobenius norm, and $\mathrm{SSIM}(\cdot)$ represents the structural similarity operation [40]. The first term of $L_{con}$ is designed to preserve the functional information of the GFP image. The second term aims to extract the energy (represented by image intensity) of the phase-contrast image, and the third term is devised to maintain the structural information contained in the phase-contrast image. $\xi$ and $\eta$ are trade-off parameters to balance these three terms.
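The following sketch shows how this generator loss could be implemented; a differentiable SSIM is assumed to be available (here from the pytorch_msssim package), the default weights correspond to the setting $\lambda=\xi=\eta=6$ found in Section 4.2, and all patch tensors are assumed to be spatially aligned with the generator output.

```python
import torch
from pytorch_msssim import ssim   # assumed third-party dependency

def generator_loss(d_out_fused, i_f, i_y, i_pc, c, lam=6.0, xi=6.0, eta=6.0):
    # adversarial term: least-squares loss pushing D's score on fused patches toward c
    l_adv = torch.mean((d_out_fused - c) ** 2)
    n, hw = i_f.shape[0], i_f.shape[-2] * i_f.shape[-1]
    # content term: GFP (functional) fidelity + phase-contrast energy + structure
    l_gfp = torch.sum((i_f - i_y) ** 2) / (n * hw)
    l_energy = torch.sum((i_f - i_pc) ** 2) / (n * hw)
    l_struct = 1.0 - ssim(i_f, i_pc, data_range=1.0)  # data_range depends on normalization
    l_con = l_gfp + xi * l_energy + eta * l_struct
    return l_adv + lam * l_con
```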

3.3.2. The Loss Function of the Discriminator

The information of $I_{PC}$ is incapable of being completely expressed only by its energy and structural information. For example, the texture details may not be fully extracted in this way. To further improve the overall similarity between $I_F$ and $I_{PC}$, a discriminator is introduced into the proposed framework. The loss function of $D$ is formulated as
$$L_D = \frac{1}{N}\sum_{n=1}^{N}\left(D\left(I_{PC}^{\,n}\right)-a\right)^{2} + \frac{1}{N}\sum_{n=1}^{N}\left(D\left(I_F^{\,n}\right)-b\right)^{2},$$
where $a$ and $b$ stand for the labels of $I_{PC}$ and $I_F$, respectively.
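A corresponding sketch of this LSGAN-style discriminator loss is given below; d_out_pc and d_out_fused denote the discriminator scores for phase-contrast and fused patches, and the soft labels a ("real") and b ("fake") are supplied as described in Section 3.4.

```python
import torch

def discriminator_loss(d_out_pc, d_out_fused, a, b):
    # push scores for phase-contrast patches toward a and for fused patches toward b
    return torch.mean((d_out_pc - a) ** 2) + torch.mean((d_out_fused - b) ** 2)
```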

3.4. Training Details

The popular GFP database released by the John Innes Centre [1], which is available at http://data.jic.ac.uk/Gfp/, is employed as the training data in this work. The database contains 148 pairs of registered GFP and phase-contrast images of Arabidopsis thaliana cells.

In order to obtain sufficient data for network training, each pair of input images is cropped into a large number of patches of the same size. The stride for cropping is set to 12. As a result, we acquire 65268 pairs of GFP and phase-contrast image patches in total, and the intensity range of each patch is normalized. In each iteration during training, the input of the generator contains $N$ pairs of input image patches (i.e., the batch size is $N$), and the output intermediate fused patches together with the central parts of the corresponding phase-contrast patches (cropped to the same size as the generator output) are employed as the input of the discriminator. Moreover, in each iteration, the discriminator is first trained $k$ times (i.e., the training step is $k$) using the Adam optimizer [41], and then the generator is trained. Algorithm 1 summarizes the procedure of network training.
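Before turning to Algorithm 1 below, the patch preparation step can be sketched as follows; the patch size used here (84 × 84) is only a placeholder, as the exact value is not recoverable from the text.

```python
import numpy as np

def extract_patches(i_y, i_pc, patch=84, stride=12):
    """Crop aligned GFP (Y component) and phase-contrast images into overlapping patches."""
    ys, pcs = [], []
    h, w = i_y.shape
    for r in range(0, h - patch + 1, stride):
        for c in range(0, w - patch + 1, stride):
            ys.append(i_y[r:r + patch, c:c + patch])
            pcs.append(i_pc[r:r + patch, c:c + patch])
    return np.stack(ys), np.stack(pcs)
```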

(1) for number of training iterations do
(2)  for $k$ steps do
(3)   Select $N$ fused patches $\{I_F^{\,n}\}_{n=1}^{N}$ from the generator;
(4)   Select $N$ phase-contrast image patches $\{I_{PC}^{\,n}\}_{n=1}^{N}$;
(5)   Update the discriminator $D$ with the Adam optimizer by minimizing $L_D$;
(6)  end for
(7)  Select $N$ GFP image patches $\{I_Y^{\,n}\}_{n=1}^{N}$ as well as $N$ phase-contrast image patches $\{I_{PC}^{\,n}\}_{n=1}^{N}$ from the training data;
(8)  Update the generator $G$ with the Adam optimizer by minimizing $L_G$;
(9) end for
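A PyTorch sketch of Algorithm 1 (one epoch), reusing the Generator, Discriminator, generator_loss, and discriminator_loss components sketched in the previous subsections, is given below; the data loader and the patch sizes are assumptions for illustration, and the soft-label ranges follow Section 3.4.

```python
import torch

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters())
opt_d = torch.optim.Adam(D.parameters())
k = 2                                          # discriminator steps per iteration

def center_crop(t, size):
    """Crop the central size[0] x size[1] region of a (B, C, H, W) tensor."""
    dh = (t.shape[-2] - size[0]) // 2
    dw = (t.shape[-1] - size[1]) // 2
    return t[..., dh:dh + size[0], dw:dw + size[1]]

for y, pc in loader:                           # assumed loader of aligned patch batches
    x = torch.cat([y, pc], dim=1)              # two-channel generator input
    for _ in range(k):                         # train the discriminator k times
        fused = G(x).detach()
        pc_c = center_crop(pc, fused.shape[-2:])
        a = 0.7 + 0.5 * torch.rand(fused.size(0), 1)   # soft labels of phase-contrast patches
        b = 0.3 * torch.rand(fused.size(0), 1)         # soft labels of fused patches
        loss_d = discriminator_loss(D(pc_c), D(fused), a, b)
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    fused = G(x)                               # then train the generator once
    c = 0.7 + 0.5 * torch.rand(fused.size(0), 1)       # target label for the generator
    loss_g = generator_loss(D(fused), fused,
                            center_crop(y, fused.shape[-2:]),
                            center_crop(pc, fused.shape[-2:]), c)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```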

In our experiments, the parameters for training are set as follows. The batch size $N$ and the number of epochs are set to 32 and 10, respectively. Accordingly, the number of training iterations is approximately (65268/32) × 10 ≈ 20400. The training step $k$ of the discriminator is fixed as 2, and a fixed learning rate is used for the Adam optimizer. For easier training, as suggested in [35], soft labels are adopted for $a$, $b$, and $c$. That is, they are set to random numbers rather than fixed values. The label $b$ of the fused image and the label $a$ of the phase-contrast image lie within the ranges of 0 to 0.3 and 0.7 to 1.2, respectively. The label $c$ ranges from 0.7 to 1.2.

4. Experiments

4.1. Experimental Settings
4.1.1. Testing Images

Considering that the proposed method is an unsupervised approach (there are no ground-truth fused images for training), all the 148 pairs of images used for training in the GFP database [1] (as mentioned in Section 3.4) also serve as the testing images.

4.1.2. Compared Methods

Seven representative multimodal image fusion methods are selected for performance comparison: the dual-tree complex wavelet transform- (DTCWT-) based method [3], the curvelet transform- (CVT-) based method [4], the non-subsampled contourlet transform- (NSCT-) based method [5], the sparse representation- (SR-) based method [9], the convolutional neural network- (CNN-) based method [36], the sharp frequency localization contourlet transform- (SFL-CT-) based method [23], and the complex shearlet transform- (CST-) based method [24]. The first three are based on popular multiscale transforms, and their parameters are set to the optimal values reported in an influential comparative study [42]. The fourth one is based on sparse representation via the simultaneous orthogonal matching pursuit (SOMP) algorithm. The fifth one is a recently proposed deep learning- (DL-) based method, while the last two are fusion methods specifically designed for GFP and phase-contrast images. The parameters in these methods are all set to the default values for an unbiased comparison.

4.1.3. Objective Metrics

In [43], Liu et al. presented a comprehensive review of objective evaluation metrics for image fusion and classified them into four categories: information theory-based ones, image feature-based ones, image structural similarity-based ones, and human perception-inspired ones. In this paper, to conduct an all-round objective assessment, one widely used metric is chosen from each category. The first one is the normalized mutual information (QMI) [44], which measures the mutual dependence between the input images and the fused image. The second one is an image feature-based metric using phase congruency (QP) [45]. This metric assesses the fusion quality by comparing the local cross correlation of corresponding feature maps of the input and fused images. The third one is Yang’s metric (QY) [46], which evaluates the structural similarity between the input images and the fused one. The last one is proposed by Chen and Blum (QCB) [47] based on human visual system (HVS) models. In addition, the visual information fidelity (VIF) measure [48] between the input phase-contrast image and the fused image is also employed for objective assessment. By characterizing the relationship between image information and visual quality, the VIF measure has been widely verified to be highly consistent with subjective evaluation. It is worth noting that the same measure between the GFP image and the fused image is not included. As reported in [23] (Table 1), the result on the VIF measure between the GFP image and the fused image (where the method proposed in [23] has the lowest score) is contrary to that of the VIF measure between the phase-contrast image and the fused image (where the method proposed in [23] has the highest score). We also verify this point in our experiment. Specifically, we experimentally find that the VIF measure between the phase-contrast image and the fused image is highly consistent with the other fusion metrics, while the situation for the GFP image is just the opposite. One possible explanation for this issue is that most of the pixels or regions in the GFP image are dark (the intensity is zero), which differs significantly from the situation of the fused image or the phase-contrast image. Therefore, a higher VIF measure between the GFP image and the fused image may not indicate a better fusion result. Based on the above observations, only the VIF measure between the phase-contrast image and the fused image is used for evaluation in this work. For each of the above metrics, a higher score indicates better performance.
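For reference, the following sketch implements the normalized mutual information metric in the form commonly attributed to [44], QMI = 2[MI(A,F)/(H(A)+H(F)) + MI(B,F)/(H(B)+H(F))]; the 256-bin histogram estimation and the assumption that images lie in [0, 1] are implementation choices rather than details taken from this paper.

```python
import numpy as np

def entropy(img, bins=256):
    p, _ = np.histogram(img, bins=bins, range=(0, 1))
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_info(x, y, bins=256):
    pxy, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins, range=[[0, 1], [0, 1]])
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz]))

def q_mi(a, b, f):
    """a, b: input images; f: fused image (all float arrays in [0, 1])."""
    return 2.0 * (mutual_info(a, f) / (entropy(a) + entropy(f)) +
                  mutual_info(b, f) / (entropy(b) + entropy(f)))
```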

4.2. Parameter Analysis

In this section, the impacts of the three trade-off parameters $\lambda$, $\xi$, and $\eta$ in our method are quantitatively studied via the objective fusion metrics. Based on a large number of experiments, we obtain an appropriate setting: $\lambda = 6$, $\xi = 6$, and $\eta = 6$. As a popular approach for analysing the impacts of multiple parameters, controlling for a single variable is adopted to verify this point. The results are shown in Figure 5. Considering that it is practically difficult to show all the results, which would involve too many combinations, only one set of results is provided to exhibit the impact of each parameter, obtained by fixing the other two at their well-performing values (this is a widely used manner in the study of image fusion [8, 38]). For each metric, the average score over the 148 image pairs is employed for evaluation in Figure 5. It is obvious that for each parameter, the best performances on all the five metrics are mostly obtained when its value is 6. Accordingly, these three free parameters are all set to 6 in our method.
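The controlled-variable sweep can be summarized by the sketch below; the candidate values and the evaluate_fusion helper (which trains/fuses with the given setting and returns the average metric scores over the 148 pairs) are hypothetical and only illustrate the procedure of varying one parameter while fixing the other two.

```python
best = {"lam": 6.0, "xi": 6.0, "eta": 6.0}
candidates = [1, 2, 4, 6, 8, 10]               # hypothetical candidate values

for name in best:
    for value in candidates:
        setting = dict(best, **{name: value})  # vary one parameter, fix the other two
        scores = evaluate_fusion(**setting)    # e.g., {"QMI": ..., "QY": ..., "VIF": ...}
        print(name, value, scores)
```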

4.3. Results and Discussion

Figures 6 and 7 provide two sets of fusion results which include the input images and the fused images obtained by different methods. In each image, two representative regions are enlarged as close-ups for better comparison.

It can be seen that the DTCWT-based, CVT-based, NSCT-based, and SR-based methods can well capture the functional information from the GFP image and the spatial details from the phase-contrast image. However, these methods tend to lose a large amount of image energy from the phase-contrast image. As a result, the brightness of the fused images is obviously lower than that of the phase-contrast image, leading to undesirable visual artifacts (see the first close-ups in Figures 6(b)–6(f) and 7(b)–7(f)).

For the CNN-based method, the image energy can be well preserved, but the functional information is not well handled, as the green regions are overemphasized compared with the GFP input image. As a consequence, some structural details are concealed by the green regions (see the second close-ups in Figures 6(g) and 7(g)). The SFL-CT-based and CST-based methods achieve obvious improvement on this issue but still suffer from this defect to a certain degree (see the second close-ups in Figures 6(h)–6(i) and 7(h)–7(i)).

The proposed method achieves the highest visual quality among all the methods. On the one hand, the functional information from the GFP image is accurately preserved by our method. On the other hand, the fused images of our method well inherit both the structural information and the image energy from the phase-contrast image.

The objective assessment of the different fusion methods on the above five metrics is listed in Table 1. For each method, the mean value (MV) and the standard deviation (SD) of each metric over the 148 pairs of input images are reported. Moreover, the number of image pairs on which the corresponding method achieves the highest score is counted and termed the winning times (WT) in Table 1. The maximum mean value, minimum standard deviation, and maximum winning times among all the methods are indicated in bold. It can be seen that the proposed method clearly outperforms the DTCWT-based, CVT-based, NSCT-based, SR-based, CNN-based, and SFL-CT-based methods on all the five evaluation metrics. In comparison to the CST-based method, which wins first place on QY and QCB, our method has an obvious advantage on QMI, QP, and VIF while achieving very close performance on QY and QCB. Besides, the proposed method obtains relatively small standard deviations on all the five metrics, which indicates that it can stably produce high-quality fusion results.

Based on the above qualitative and quantitative comparisons, the proposed method exhibits clear advantages over the other seven methods. Moreover, its computational efficiency is sufficiently high for practical usage. Specifically, on a hardware platform consisting of an Intel Core i7-7820K CPU and an NVIDIA TITAN Xp GPU, it takes only about 0.06 seconds for our method to fuse a pair of source images from the dataset. Since all the other methods are implemented in Matlab, their running time is not provided for comparison.

4.4. Influence of Network Architecture

In this section, we study the influence of network architecture on the fusion performance of the proposed method. Specifically, the impacts of the number of feature maps and the number of convolutional layers are studied. Firstly, two sets of experiments are conducted to investigate the influence of the number of feature maps, one of which is halving the number of the feature maps in the first four layers of the generator and the discriminator, and the other is doubling them. Secondly, to analyse the impact of the number of convolutional layers, we perform another two sets of experiments, one of which is removing the first layer of the generator and the fourth layer of the discriminator (both of them contain 256 feature maps), while the other is adding a convolutional layer with 512 feature maps into the generator before the first layer and into the discriminator after the fourth layer, respectively.
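These width variants can be generated programmatically, as in the sketch below; the base channel widths and kernel sizes are the same assumptions as in the Generator sketch of Section 3.2, and a width multiplier of 0.5 or 2.0 produces the halved and doubled configurations.

```python
import torch
import torch.nn as nn

def make_generator(width=1.0):
    chans = [int(c * width) for c in (256, 128, 64, 32)]   # assumed base widths
    layers, c_in = [], 2
    for c_out, k in zip(chans, (5, 5, 3, 3)):
        layers += [nn.Conv2d(c_in, c_out, kernel_size=k, stride=1),
                   nn.BatchNorm2d(c_out),
                   nn.LeakyReLU(0.2, inplace=True)]
        c_in = c_out
    layers += [nn.Conv2d(c_in, 1, kernel_size=1), nn.Tanh()]  # output layer with tanh
    return nn.Sequential(*layers)

g_half, g_base, g_double = make_generator(0.5), make_generator(1.0), make_generator(2.0)
```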

Table 2 lists the objective evaluation results of the above experiments, which are denoted by halved feature maps, doubled feature maps, reduced layers, and increased layers. The results of the original network architecture are also given as a reference. For each approach, the mean value of each metric over the 148 pairs of input images is reported. It can be seen that the proposed method generally obtains better performance with more feature maps and convolutional layers. In particular, the number of feature maps has a relatively larger effect on the fusion performance in this task than the number of convolutional layers. Taking the results in Table 1 into consideration as well, we can see that the proposed method with a lighter model (halved feature maps or reduced layers) is still competitive among all the fusion methods. A heavier model (doubled feature maps or increased layers) provides some further improvement over the original network architecture, but the extent is not significant. Considering factors such as memory consumption and computational efficiency, it is an appropriate choice to employ the network architectures described in Section 3 as the default settings.

4.5. Verification of the Overfitting Problem

As mentioned above, the proposed fusion method is essentially an unsupervised approach since there are no ground-truth fused images used for training. Accordingly, the whole dataset can be employed for both training and testing in the above experiments, without dividing it into a training set and a testing set. Although this is a reasonable way to obtain fusion results for all the images, the performance of the trained model on new testing data remains unknown.

To address this issue, we conduct a 5-fold cross validation to study whether the proposed fusion model suffers from overfitting. Specifically, all the 148 pairs of images are randomly divided into five groups, with 30 pairs in each of the first four groups and 28 pairs in the last group. In each fold, four groups are employed as training data and the remaining one is used for testing. Therefore, each pair of images is employed for testing only once, and all the 148 fused images obtained in the testing process are used for objective evaluation. Table 3 shows the objective assessment results of the 5-fold cross validation experiment, along with the results of the original training/testing manner for comparison. For each approach, the mean value of each metric over the 148 pairs of input images is given. It is not surprising that the performance of the cross validation approach decreases slightly compared with that of the original manner. By referring to the performances of the other fusion methods reported in Table 1, we can find that this decrease is very small, which demonstrates that there is no obvious overfitting and that the proposed image fusion model generalizes well to new examples.
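The 5-fold split described here can be sketched as follows; the random seed is arbitrary, and the training/evaluation steps are left as comments.

```python
import numpy as np

rng = np.random.default_rng(0)
indices = rng.permutation(148)
# groups of 30, 30, 30, 30, and 28 pairs
folds = [indices[0:30], indices[30:60], indices[60:90], indices[90:120], indices[120:148]]

for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # train the fusion model on the pairs in train_idx,
    # then fuse and objectively evaluate the pairs in test_idx
```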

5. Conclusion and Future Work

In this paper, we propose a GFP and phase-contrast image fusion method based on generative adversarial networks. The fusion problem is addressed as an adversarial game between a generator and a discriminator by carefully considering the characteristics of the different input images. Experimental results demonstrate that the proposed method can simultaneously extract the functional information from the GFP image and the structural information from the phase-contrast image, leading to better performance than several existing methods in terms of both visual quality and objective assessment. The proposed fusion framework has high generality for functional and structural image fusion problems. In the future, we will study its feasibility in multimodal medical image fusion tasks such as magnetic resonance (MR) and positron emission tomography (PET) image fusion.

Data Availability

The data supporting this study are from previously reported studies and datasets, which have been cited. The dataset used in this research work is available at http://data.jic.ac.uk/Gfp/, released by the John Innes Centre.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (grant nos. 61701160 and 81571760), the Provincial Natural Science Foundation of Anhui (grant no. 1808085QF186), the Fundamental Research Funds for the Central Universities (grant no. JZ2018HGTB0228), and the SenseTime Research Fund.