Abstract

Filaments are a type of wide-existing astronomical structure. It is a challenge to separate filaments from radio astronomical images, because their radiation is usually weak. What is more, filaments often mix with bright objects, e.g., stars, which makes it difficult to separate them. In order to extract filaments, A. Men’shchikov proposed a method “getfilaments” to find filaments automatically. However, the algorithm removed tiny structures by counting connected pixels number simply. Removing tiny structures based on local information might remove some part of the filaments because filaments in radio astronomical image are usually weak. In order to solve this problem, we applied morphology components analysis (MCA) to process each singe spatial scale image and proposed a filaments extraction algorithm based on MCA. MCA uses a dictionary whose elements can be wavelet translation function, curvelet translation function, or ridgelet translation function to decompose images. Different selection of elements in the dictionary can get different morphology components of the spatial scale image. By using MCA, we can get line structure, gauss sources, and other structures in spatial scale images and exclude the components that are not related to filaments. Experimental results showed that our proposed method based on MCA is effective in extracting filaments from real radio astronomical images, and images processed by our method have higher peak signal-to-noise ratio (PSNR).

1. Introduction

A substantial part of interstellar medium exists in the form of a fascinating web of omnipresent filamentary structures [1], called filaments. The astronomical filament is first discovered in the Milky Way. Along with the development of telescopes, various filaments come into sight. Among them the filaments in star-forming regions are the most fascinating, many magnetohydrodynamic (MHD) simulations have shown that giant molecular clouds (GMCs) primarily evolve into filaments before they collapse to form stars [2, 3]. Recent observations also confirm these simulations [4, 5]. Since the formation of massive stellar objects is still unclear, further research on filaments is essential. Filaments in Galactic and cosmological fields are also important. Studies have argued that low mass galaxies got their gas through “cold accretion”, which is often directed along filaments [6, 7]. Some researchers paid more attention to the large-scale filaments of the universe [8], which may give clues to better understand the slightly nonuniform cosmic microwave background (CMB) and the birth of the first generation of stars. In addition, filaments have been observed in other objects, such as supernova remnants (SNR) [9] and protoplanetary disk [10].

The fact that many filaments are fuzzy in images causes difficulty to distinguish them from background and surrounding objects. Schneider et al. [11] investigated spatial and density structure of the Rosette molecular cloud, by applying a curvelet analysis, a filament-tracing algorithm (DisPerSE), and probability density functions (PDFs) on Herschel column density maps. Hennebelle et al. [12] showed a method based on adaptive mesh refinement magneto hydrodynamic simulations, which treat self-consistently cooling and self-gravity. Tugay [13] proposed a layer smoothing method, which described cellular large-scale structure of the universe (LSS) as a grid of clusters with density larger than a limited value, to detect extragalactic filaments. Men’shchikov [14] proposed a multi-scale filaments extraction method named getfilaments, which decomposed a simulated astronomical image containing filaments into spatial images at different scales to prevent interaction influence of different spatial scale structures. The getfilaments works well in simulated images and has been used to identify filaments for real astronomical images, e.g., the far-infrared images of Musca cloud observed with Herschel [15]. However, getfilaments might exclude some tiny structure of filaments in astronomy images, because it removes tiny structures just by counting connected pixels number, and filaments in astronomy images are usually weak.

In this paper, we develop an improved method based on morphology components analysis (MCA) and getfilaments. MCA is able to decompose the image into morphological components based on different features from the perspective of mathematical morphology and is often used in image restoration, separation, and decomposition [1619]. The basic idea of MCA decomposition algorithm is to choose two dictionaries: smooth dictionary and texture dictionary, to represent morphology components [20]. We can design different dictionaries to represent different sparse components in the image. Smooth dictionary produces the decomposed smooth component which carries the geometric and piecewise smooth information of the image, and texture dictionary produces the decomposed texture component which carries the marginal and edge information.

The paper is structured as follows. The improved method named filament extraction algorithm based on MCA is described in Section 2. Section 3 is devoted to discussing experimental results of our method and comparing our method with the getfilaments method by employing data from GALFA-HI of Arecibo.

2. Filament Extraction Algorithm Based on MCA

2.1. MCA Model

MCA was proposed by Starck et al. [17]. MCA is a kind of decomposition algorithm based on signal sparsity and morphological diversity. MCA assumes that signals are linear combinations of several morphological components, and each morphological component can be sparsely represented on its own dictionary.

We assume that image comprises different morphological components: . We design different dictionaries for different morphological components and assume all components mix together linearly. The image as an one-dimensional vector of length M can then be represented as follows: where the matrix is a dictionary. is the vector of sparse coefficients. The equivalent constrained optimization problem is as follows:However, this model does not take into account factors that may lead to the failure of the image decomposition, such as noise. When noise exists in the image , the vector might be not sparse since noise cannot be sparsely represented. For this kind of noise, we put the noise in the error item to achieve the sparse decomposition of the image . The constraint in (2) is modified as follows:where represents the noise level in the image .

2.2. MCA Decompostion Algorithm

In this paper, we focus on the image decomposition into two components: cartoon layer and texture layer. Cartoon layer contains cartoon and piecewise smooth information, and texture layer may contain other texture information, marginal information, and noises [21, 22]. Studies [23, 24] have shown that noises exist in both cartoon and texture layer. In other words, the smooth part not only contains the majority of the useful information, but also contains a small part of the noise. If we set the same threshold of noise variance for the whole image, rather than calculating the threshold for each part of the image, some useful information might be removed. We therefore introduce the MCA decomposition algorithm to process an image into smooth (cartoon) layer and texture layer.

We assume that matrix is the dictionary matrix of the texture layer and that is the dictionary matrix of the cartoon layer. A solution for the decomposition could be obtained by relaxing the constraint in (3) to become an approximate one: where is a Lagrange operators. Define and . Given , we can recover as , where is the Moore-Penrose pseudoinverse of . In order to get piecewise smooth component, add a TV (Total Variation) penalty [25] to fit the smooth layer. TV is used to damp ringing artifacts near edges and oscillating. Put these back into (4), and, thus, we obtain the following: is the TV regularization parameter, in multiscale method, a lower is able to remove the artefacts caused by curvelet. is a measure of the amount of oscillations in the cartoon layer. Penalizing with TV, the cartoon layer is closer to the piecewise smooth image. However, TV suffers from the so-called staircase effect that impacts the quality of images reconstruction. The adaptive TV [26] and the higher order derivative [27] are solutions to reduce the staircase effect.

Then we discuss the choice of dictionaries for the cartoon and the texture layer. Appropriate dictionaries are very important for sparse representations over the image. Generally, the choice of dictionaries depends on experiences. The structure in the dictionary is more matched with the image easier to form a sparse representation. The commonly used dictionary of MCA includes wavelet transform, ridgelet transform, curvelet transform, discrete cosine transform (DCT), and so on. The two dictionaries used in this paper are described as follows.

First, we choose curvelet as the dictionary for cartoon layer. The curvelet transform based on the multiscale ridgelet transform was proposed by Cand & Donoho [28]. It first decomposes the image into a set of wavelet bands and then analyzes each band with the ridgelet transform at different scale levels. The curvelet transform performs well at the detection of anisotropic structures, smooth curves, and edges of different lengths [29].

We next choose the local DCT as the dictionary for texture layer. DCT is a variant of the discrete Fourier transform (DFT). It uses a symmetric signal extension to replace complex analysis with real numbers and is appropriate for the sparse representation of the texture and periodic part of the image.

2.3. Filament Extraction Algorithm Based on MCA

The shape of the filaments can be obtained by applying the above filament extraction algorithm to radio astronomical images. We finally use the watershed algorithm to highlight filaments.

The watershed algorithm [30, 31] is based on mathematical morphology. The watershed algorithm segments an image into nonoverlapping regions and gets a pixel width and continuous boundary for the purpose of extracting and identifying a specific area [32, 33]. A grayscale image can be viewed as a topographic surface. A high grayscale value of a pixel denotes a peak or hill while a low grayscale denotes a valley. Each local minimum of pixels and the affected region are called a catchment basin, and the boundary of catchment basins forms the watershed. By filling each isolated valley (local minimum) with differently colored water (labels), the region of influence of each local minimum gradually expands outwards. The adjacent regions then converge, and the boundaries that form the watershed appear.

The whole filament extraction algorithm (https://github.com/MiWBY/MCA) can be roughly divided into four steps (as shown in Figure 1).

(1) Convolution. First, using a Gaussian filter, we convolve the original images into a series of layered images. Different full widths at half maximum can be set for different image layers:where is the original image, is the jth subimage after convolution, and are different Gaussian beams for different spatial components, is the convolution operation, and is the number of the layers.

In this process, structures at different scales in the astronomical images can be separated into different layers (subimages), and each layer contains similar scales, which make the input sources become simpler in the later denoising and extraction process.

(2) Decompostion. We apply the MCA algorithm to each layered image so that each layered image is decomposed into a cartoon layer and a texture layer. Here we use the curvelet as the dictionary for the cartoon layer and local DCT as the dictionary for the texture layer as described in Section 2.2. The cartoon layer contains most of filaments and low-frequency noise, and the texture layer contains sources, high-frequency noise, and small part of filaments.

Starck et al. [17] proposed the MCA decomposition algorithm based on BCR algorithm (Block Coordinate Relaxation). The algorithm is given as follows. Input: The subimage after convolution, which is described as the input image here, dictionary of the cartoon layer, dictionary of the texture layer, number of iterations , and the threshold .

Output: Cartoon layer and texture layer .(1)Initialize , and , where is the value of noise level. Then the threshold .(2)For For (i) Update assuming is fixed:(a)Caculate the residual .(b)Calculate .(c)Soft thresholding the coefficient with the threshold and obtain .(d)Reconstruct by .(ii) Update with the above methodApply the TV correction by , where is the minimum parameter, and is chosen by a line-decreasing the overall penalty function, or as a fixed step-size of moderate value that guarantees convergence.(3)Update the threshold by .If , return step 2. Else, finish.

(3) Denoise. First, we denoise each layer using the iterative cleaning algorithm proposed by Men’shchikov et al. [34]. The cleaning algorithm employs a global intensity threshold for single-scale images, as the larger-scale background has been effectively filtered out by the spatial decomposition. This iterative algorithm automatically finds a cut-off level that separates the signal of important sources from the noise and background at each scale. Next, we enhance details for both the cartoon layer and texture layer.

(4) Combination and Extraction. To get the extracted filaments, we first merge the cartoon layer and textual layer for each layered image. Because filaments are irregular, and structures of filaments exist in both cartoon layer and texture layer, it is not appropriate to use just one component to represent the filaments. For example, if we just use cartoon layer to represent filaments, filaments may lose some texture. Thus, we merge the cartoon layer and textual layer to represent filaments better. Next, the layered subimages are added together to produce the filaments. Finally, we apply the watershed algorithm to highlight the contours of filaments.

By applying MCA to decompose a real image, new features (components) can be obtained. This leads to better image separability. Furthermore, the smooth components have a better signal-to-noise ratio than the original image.

3. Extraction Results

3.1. Results for a Simulated Image

Before applying our method to real radio astronomical images, we simulated an image that is composed of a straight filament with size of FWHM, a string of sources with size of FWHM, a simple background with size of FWHM, and a moderate-level noise with noise level=1.05 to test the improved algorithm (Figure 2(a)). The simulation method is the same as that mentioned in Men’shchikov et al. [14]. In the simulated image, there is only one spatial component, while our method assumes there are many spatial components, which is similar to real astronomical images. In other words, even if there is only one spatial component, our method will also treat it as many components.

We first extract filaments using MCA method without convolution and denoising (Figure 2). In Figure 2(c), texture layer (especially in the area marked by red box) still contains part of filaments structures, which means just using cartoon layer to represent filaments is insufficient. So it is necessary to combine cartoon layer and texture layer. However, noises and sources also exist in texture layer. If the two layers are combined directly, the reconstructed filaments contains noises and sources (Figure 2(d)), so denoising is necessary before combination.

Next, extracted results obtained using our improved method are shown in Figure 3. Compared to Figure 2(d), the reconstructed filaments in Figure 3(e) contain less noise. The edge of the filament is unreal as the result of decomposition.

3.2. Results for Astronomical Images
3.2.1. Decomposition and Denosing Results

Aiming at real radio astronomical images, we compare the extraction results of our method with those of the getfilaments method. We employ data from GALFA-HI of Arecibo as example images. The equatorial coordinates of the objects are (12.00h, +), and the object name is ’GALFA-HI RA+DEC Tile 004.00+02.35’. The data cube contains 2048 images at different velocity (with respect to the local standard stationary system). Here we select the 715th image from the 2048 original images as the experimental image (Figure 4). In the experimental images, filaments describe significantly elongated structures. After convolution of the 715th image, we obtain 99 layers (subimages) at different scales (Figure 5) and select the 40th subimage as a comparison example (Figure 5(b)). In order to display the image properly and improve visual contrast between getfilaments and our method, we mark the image with different colors according to the intensity (Unit: MJy/sr) in the image.

First, we apply the MCA algorithm to process image layers before applying the iterative cleaning algorithm. As described in Section 2.2, we choose curvelet as the cartoon layer dictionary and LDCT as the texture layer dictionary for MCA. We decompose each layered image to get the cartoon layer and texture layer (Figure 6). The cartoon layer contains smooth parts of the image and retains most of the low-frequency information of the filament in the layered image. The frequency of the texture layer is higher. The texture layer contains more edge information that is difficult to be distinguished visually in the layered image. The texture layer is also part of the filament. The texture layer also contains part of the filament. It shows some artefacts that might be caused by DCT in texture layer; these artefacts can be removed after denoising.

We then set different reasonable threshold for cartoon layer and texture layer, respectively, and apply the iterative cleaning algorithm to the cartoon layer and texture layer to remove noise (as shown in Figure 7). Compared to Figure 6(b), Figure 7(b) almost contains no artefacts.

Next, the cartoon layer and texture layer are fused according to the intensity ratio (e.g., the information of the texture layer is expanded by a factor of 5). Small structures can then be retained and interference information is removed. Processing results after fusion are shown in Figure 8(c). To allow comparison with our method, we also use the iterative cleaning algorithm directly to process the 40th subimage; the results are shown in Figure 8(b).

As seen in Figure 8(b), most of the noises in the 40th subimage are removed by cleaning. However, getfilaments method sets only one noise threshold, and the values less than the threshold are cleared. This might clear weak information of the filament. As shown in the red box of Figure 8(b), the weak part of the filament is directly removed. Figure 8(c) is the denoised image after MCA decomposition. Structures of the filament are more complete than those in panel b. Setting different threshold of noise variance for the two parts can avoid the removal of useful information, especially in the area marked by red box. Our method not only removes noise from the cartoon layer and texture layer but also strengthens details in the image synthesis process. Applying MCA to extract filament can retain much structural information of the filament.

3.2.2. Extraction Results

Finally, the layered subimages are added together to produce the filament. Figure 9 shows the extraction results obtained using the getfilaments algorithm. Figure 9(a) is the extracted filament done by getfilaments. Compared to the input filament in Figure 3(a), most of noises are cleaned and structures of the filament are also removed (marked in red boxes in Figures 9(b) and 9(c)). The extraction results obtained using our method are shown in Figure 10. Comparing with the getfilaments method, our method obtains the filament more complete, excludes noise, and retains more structural information, especially in the three areas marked by red boxes (Figures 10(b) and 10(c)).

3.3. Peak Signal-to-Noise Ratio

We compare the peak signal-to-noise ratio (PSNR) of images processed by the getfilaments algorithm with that of our method. The PSNR is the objective criterion most widely used to evaluate image quality. An image has less noise when the PSNR is higher. The PSNR is defined as follows:where is the mean square error (i.e., difference) between the original image and the image after noise is superimposed and n is 8 since pixels are represented using 8 bits per sample.

For different intensities of salt-and-pepper noise, we analyze the PSNRs of images processed by different methods. Table 1 shows that the PSNR of the images processed using our improved method is always higher than that of getfilaments method, which means that the images processed by MCA have less noise.

Data Availability

The data suffixed with fits used to support the findings of this study were supplied by National Astronomical Observatory of China under license and so cannot be made freely available.

Disclosure

F. Q. Duan present address is College of Information Science and Technology, Beijing Normal University, Beijing, China. The authors presented this work in 2016 Astronomical Data Analysis Systems and Software Conference.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The study is supported by the National Natural Science Foundation of China (no. 41272359), Ministry of Land and Resources for the Public Welfare Industry Research Projects (201511079-02), and Ph.D. Programs Foundation of Ministry of Education of China (20120003110032).