Ergun, H. (2021). "Segmentation of rays in wood microscopy images using the U-net model," BioResources 16(1), 721-728.


Segmentation of Rays in Wood Microscopy Images Using the U-Net Model

Halime Ergun *

Rays are an important anatomical feature for tree species identification. They occur in proportions that vary from species to species. In this study, the U-Net model is adopted for the first time to detect wood rays. A dataset is created with images taken from a wood anatomy database. The resolution of the microscopic wood images in the tangential section is 640×400 pixels. The input images for training are divided into 32×32 image blocks, and each pixel in the dataset is labeled as belonging to a ray or to the background. The dataset is then augmented by applying scaling, rotation, salt-and-pepper noise, a circular mean filter, and a Gaussian filter. The U-Net network created for ray segmentation is trained using the Adam optimization algorithm. The experimental results show that the ray segmentation accuracy in testing is 96.3%.

Keywords: Image segmentation; Wood rays; U-Net

Contact information: Konya Necmettin Erbakan University, Seydişehir Ahmet Cengiz Faculty of Engineering, Computer Engineering, Seydişehir/Konya, Turkey;

* Corresponding author: hboztoprak@erbakan.edu.tr

INTRODUCTION

Rays are formed by combinations of parenchyma cells and have a porous structure in the tangential section (Noshiro and Suzuki 2001). Besides the known physiological functions of substance storage and conduction, ray parenchyma also contributes to the biomechanics of living trees, a role that has previously been underestimated (Burgert and Eckstein 2001).

Microscopic wood identification requires the anatomical description of the microscopic characters (Alfonso et al. 1989) of all individual cell types (vessels, axial and ray parenchyma, and fibres) in the three anatomical directions. For the characteristics of each cell type to be determined, each cell must be segmented correctly.

The properties of the elements in microscopic wood images are key to identifying the species of a sample. The shapes, sizes, and number distributions of the cells or particles in microscopic images provide important information for the evaluation of the sample. Therefore, the particles in each image must be localized and segmented to provide quantitative support (Oktay and Gurses 2019). However, particle detection is made difficult by varying particle shapes and sizes, low image quality, and overlapping particles (Wei et al. 2019). This situation applies to many microscopic images. In addition, manual localization and detection of cells takes a long time and is a subjective process.

Scientists have conducted many studies on wood image processing; some of them are as follows. Kennel et al. (2010) used the watershed algorithm to identify cells in microscopic images of coniferous trees. Pan and Kudo (2012) developed mathematical morphological algorithms to classify gray-level images according to the blank spaces in the image; vessels, which occupy 6 to 30% of the structure of hardwoods, were identified. Brunel et al. (2014) studied only three cell types, as a detailed examination of each wood cell type is required; vessels, tracheids, and rays, the main cells of wood in the radial section, were analyzed using image analysis software, and the wall thickness, height, circularity, and surface area of the cell and lumen were calculated for each examined cell. In a study by Boztoprak and Ergün (2017), vessel and fiber ratios were identified in a cross-section image of Juglans regia (common walnut) after morphological processing.

Morphology-based image processing techniques in particular have parameters (such as structuring elements) that must be adjusted according to the properties of the images. Such methods may not achieve the same success for all images. The quest for more accurate and effective methods, together with the increasing number of complex problems, has led researchers to deep learning.

Deep learning is a data-driven technique that does not require manually crafted rules. The model-building process consists of selecting an appropriate network structure (a set of nested layers), a function to evaluate the model output (the loss function), and an optimization algorithm (Liu et al. 2019).

The application of deep learning methods to wood images is still in its infancy. The reason for this is that convolutional deep learning models need large datasets for the training process. U-Net-based architectures achieve high precision with a small training set (Liu et al. 2019). For small datasets, different solutions, such as data augmentation and transfer learning, have been proposed. Another study states that transfer learning is not a viable option for some applications, so the model must be trained from scratch (Yang et al. 2019). Although U-Net has been used in many areas, such as medical image processing, semantic segmentation, and autonomous driving, it has not yet been applied to wood images.

This work focused mainly on the segmentation of wood rays. This segmentation stage is very important, as it affects the later stages. There are a number of difficulties in automated ray segmentation: if ray cells are merged or determined incorrectly, the properties of the rays (width and height) will be calculated incorrectly, and rays are inherently difficult to characterize in detail because of their multicellular nature. Therefore, a method is proposed for the segmentation of ray cells using the U-Net model, which gives good results with little data.

MATERIALS AND METHODS

In this study, an automatic segmentation method for wood images was proposed to extract ray cells in tangential-section microscopic images using the U-Net model. MATLAB software (R2018a, MathWorks, Natick, MA, USA) was used in this work.

Dataset

In this study, the Wood Anatomy Database (Schoch et al. 2004) was used as the source of microscopic images of wood in the tangential section. The size of each image is 640 × 400 pixels.

Fig. 1. a: Original image, b: labeled image (ground truth)

Each pixel in the dataset was labeled as belonging to a ray or to the background. The labeled slices were transformed into black and white images (Fig. 1). A value of 1 at pixel (x_i, y_i) indicated that a cell was present at that location. These black and white masks were used as the ground-truth masks, i.e., the network's objective (what should be predicted), and were used to measure the network's error rate.
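
As an illustration of this labeling step, a binary ground-truth mask could be produced along the lines of the sketch below. The study itself was implemented in MATLAB; this Python sketch is only a hedged illustration, and the file path and threshold value are assumptions.

```python
import numpy as np
from skimage import io  # any image-reading library could be used instead

def make_ground_truth(label_path):
    """Load a labeled slice (Fig. 1b) and binarize it: 1 = ray pixel, 0 = background."""
    label = io.imread(label_path, as_gray=True)
    # Threshold at half of the maximum intensity (an assumed value); labeled
    # ray regions become 1 and everything else becomes 0.
    return (label > 0.5 * label.max()).astype(np.uint8)
```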

Data Augmentation

Data augmentation is a technique for increasing the size of the training set by applying random transformations such as rotation, flipping, noise addition, and cropping.

Focus blur is one of the most common deformations seen in microscopic images. While some of the particles in an image are sharp, others may be blurred because of the focal distance. This deformation can be simulated; therefore, a circular averaging filter was added at the data augmentation stage to simulate focus blur. The circular averaging (pillbox) filter is the point spread function of an out-of-focus lens, given by Eq. 1,

h(x, y) = 1/(πR²) if x² + y² ≤ R², and h(x, y) = 0 otherwise     (1)

where R is the radius of the blur disk.

The dataset was enlarged by applying scaling, rotation, salt-and-pepper noise, a circular mean filter, and a Gaussian filter.
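
A minimal Python sketch of this augmentation step is given below, assuming grayscale images scaled to [0, 1]. The transformation parameters (zoom factor, rotation angle, noise amount, filter radius, and sigma) are illustrative assumptions, not values reported in the paper.

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import disk

rng = np.random.default_rng(0)

def salt_and_pepper(img, amount=0.02):
    """Set a small, random fraction of pixels to black (0) or white (1)."""
    noisy = img.copy()
    u = rng.random(img.shape)
    noisy[u < amount / 2] = 0.0
    noisy[u > 1 - amount / 2] = 1.0
    return noisy

def pillbox_blur(img, radius=3):
    """Circular averaging (pillbox) filter of Eq. 1, simulating out-of-focus blur."""
    kernel = disk(radius).astype(float)
    kernel /= kernel.sum()
    return ndimage.convolve(img, kernel, mode='reflect')

def augmented_variants(img):
    """Yield the augmented copies used to enlarge the training set."""
    yield ndimage.zoom(img, 1.2)                    # scaling (factor assumed)
    yield ndimage.rotate(img, 15, reshape=False)    # rotation (angle assumed)
    yield salt_and_pepper(img)                      # salt-and-pepper noise
    yield pillbox_blur(img)                         # circular mean filter
    yield ndimage.gaussian_filter(img, sigma=1.0)   # Gaussian filter (sigma assumed)
```

The geometric transforms (scaling and rotation) would also be applied to the corresponding ground-truth masks, whereas the noise and blur operations affect the images only.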

U-Net Model

The U-Net is a convolutional neural network developed for biomedical image segmentation (Ronneberger et al. 2015). It is used in a wide range of applications, from the segmentation of cells in microscopic images to the detection of ships or houses in satellite photography.

 

Fig. 2. The architecture of the U-Net model (Ronneberger et al. 2015) (Each blue box corresponds to a multi-channel feature map. White boxes represent copied feature maps. The arrows denote the different operations.)

The U-Net architecture is symmetrical and consists of two main parts: the encoder and the decoder. The input image is transformed into a segmented output map at the output. The network does not have a fully connected layer. Each standard convolution is activated by a ReLU (rectified linear unit). The U-Net uses a loss function evaluated for each pixel of the image: a softmax is applied to each pixel, followed by the loss function. This transforms the segmentation problem into a classification problem in which each pixel must be assigned to one of the classes. The architecture of the U-Net model is illustrated in Fig. 2.
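
As a rough illustration of this encoder-decoder layout, a reduced-depth U-Net for 32×32 single-channel patches could look like the following PyTorch sketch. The study itself was implemented in MATLAB; the channel counts, two-level depth, and layer choices here are assumptions, not the exact configuration used in the paper.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions, each followed by ReLU, as in the standard U-Net."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class SmallUNet(nn.Module):
    """Reduced-depth U-Net for 32x32 grayscale patches (ray vs. background)."""
    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, 2, 1)  # two classes: ray, background

    def forward(self, x):
        e1 = self.enc1(x)                                   # encoder level 1
        e2 = self.enc2(self.pool(e1))                       # encoder level 2
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))  # decoder + skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)  # per-pixel class scores; softmax is applied in the loss
```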

Class Imbalance

In the case of data imbalance, the model becomes biased toward the majority class, because that class has a greater impact on the loss. Weights can be added to the losses corresponding to the different classes to eliminate this bias. This technique helps to reduce the problem of data imbalance and to improve model generalization across the different classes.

Data class imbalance is a major problem that especially affects minority classes negatively (Japkowicz and Stephen 2002). A wide variety of strategies have been developed to overcome this common problem, including oversampling, undersampling, preservation of the natural class proportions in the training samples, data synthesis, and class-weighted loss functions. Although oversampling or undersampling helps to eliminate data imbalance, duplicated data increases the likelihood of overfitting (Weiss and Provost 2001). In this study, instead of oversampling and undersampling, different loss functions and the dataset balancing method of Eigen and Fergus (2015) were tried.
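
As a hedged illustration of class-weighted balancing in the spirit of Eigen and Fergus (2015), median frequency balancing computes one weight per class that can then be passed to the loss function. The sketch below assumes binary masks with 1 = ray and 0 = background, and that both classes occur somewhere in the training set.

```python
import numpy as np

def median_frequency_weights(masks):
    """Class weights from median frequency balancing.

    `masks` is an iterable of binary ground-truth arrays (1 = ray, 0 = background).
    Rare classes receive weights above 1, frequent classes below 1.
    """
    pixel_counts = np.zeros(2)   # pixels of each class over the whole set
    image_counts = np.zeros(2)   # total pixels of the images in which each class appears
    for m in masks:
        for c in (0, 1):
            if np.any(m == c):
                pixel_counts[c] += np.sum(m == c)
                image_counts[c] += m.size
    freq = pixel_counts / image_counts
    return np.median(freq) / freq
```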

Loss Function

During learning, the loss function measures the difference between the target (actual) values and the values predicted by the neural network. Choosing a proper loss function for a complex problem is important to guarantee the performance of the deep learning model.

For classification problems, the cross-entropy loss is often used. The binary cross-entropy (BCE) function is commonly used as the loss function for binary classifiers (Eq. 2),

BCE = -(1/N) Σ_i [y_i log(p_i) + (1 - y_i) log(1 - p_i)]     (2)

where N is the number of pixels, y_i is the ground-truth label of pixel i, and p_i is the predicted probability that pixel i belongs to a ray.

However, the BCE function does not account for the imbalance between foreground and background pixels, which leads to a bias toward the class with an excess number of pixels.

The Dice coefficient is a performance criterion often used to evaluate segmentation success in biomedical images (Eq. 3),

Dice = 2|X ∩ Y| / (|X| + |Y|)     (3)

where X is the set of predicted ray pixels and Y is the set of ground-truth ray pixels.
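
A straightforward way to compute the Dice coefficient of Eq. 3 for binary masks is sketched below; the small epsilon term is an assumption added to avoid division by zero for empty masks.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice overlap between a predicted binary mask and the ground truth (Eq. 3)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```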

Model Training

The network was trained using the Adam optimization algorithm (Kingma and Ba 2015). The Adam algorithm is based on gradient descent, but each parameter is updated within a bounded range at every iteration. A parameter therefore does not change drastically because of a single large gradient value, and its value remains relatively stable.

The He initializer (He et al. 2015), which is similar to Xavier initialization, was used to determine the initial weights.

How the weights are initialized affects the training and convergence speed of the model. If the weights are initialized with small random numbers, this works for small networks, but it leads to a heterogeneous distribution of activations across the network layers. In He-normal initialization, the weights of the network are initialized from a zero-mean normal distribution whose variance factor is multiplied by two. The variance factor is given by Eq. 4,

Var(W) = 2 / fan_in     (4)

where fan_in is the number of input neurons.
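
In Python, He-normal initialization of a weight tensor following Eq. 4 could look like the minimal sketch below; the study used MATLAB's built-in initializer, so the function name, seed handling, and example shape here are illustrative only.

```python
import numpy as np

def he_normal(shape, fan_in, seed=None):
    """Sample weights from N(0, 2 / fan_in), i.e., He-normal initialization (Eq. 4)."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(2.0 / fan_in)   # standard deviation corresponding to variance 2 / fan_in
    return rng.normal(0.0, std, size=shape)

# Example: a 3x3 convolution kernel with 32 input channels has fan_in = 3 * 3 * 32.
weights = he_normal((64, 32, 3, 3), fan_in=3 * 3 * 32)
```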

The input images of the training set were divided into 32×32 image blocks, and 1500 block images were used in training. The 32×32 size was selected because the resolution of the images is low, and this size contains sufficient ray information. In the literature, the numbers and sizes of the input images used in training vary; for example, Tong et al. (2018) used 975 image blocks in their training set, obtained by dividing 512×512 2D images into 64×64 blocks.
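
The division of a training image into 32×32 blocks could be done as in the following sketch; how partial blocks at the image border were handled is not stated in the paper, so this version simply drops them.

```python
import numpy as np

def extract_blocks(image, block=32):
    """Split a 2-D grayscale image into non-overlapping block x block patches."""
    h, w = image.shape
    patches = [image[r:r + block, c:c + block]
               for r in range(0, h - block + 1, block)
               for c in range(0, w - block + 1, block)]
    return np.stack(patches)

# A 640 x 400 image yields 20 x 12 = 240 complete 32 x 32 blocks with this scheme.
```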

All input data were normalized to the range of 0 to 1. Parameters such as the learning rate, mini-batch size, and number of training epochs were kept constant across trials. A mini-batch size of 64 was selected. A fixed learning rate was used during the first 20 epochs of training; afterwards, the learning rate was reduced to 0.1 times its previous value.
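
These settings (Adam, a mini-batch size of 64, and a learning rate held fixed for the first 20 epochs and then multiplied by 0.1) could be wired together roughly as in the PyTorch sketch below, which reuses the SmallUNet sketch from the U-Net Model section. The initial learning rate, total number of epochs, and the placeholder data are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

# Placeholder batch standing in for the real 32 x 32 patch dataset; a proper
# torch.utils.data.DataLoader would be used in practice.
train_loader = [(torch.rand(64, 1, 32, 32), torch.randint(0, 2, (64, 32, 32)))]

model = SmallUNet()                                        # reduced U-Net sketch from above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial learning rate assumed
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)

for epoch in range(40):                                    # total number of epochs assumed
    for images, masks in train_loader:                     # inputs already normalized to [0, 1]
        optimizer.zero_grad()
        logits = model(images)
        # A `weight=` tensor (e.g., from median frequency balancing) could be
        # passed here to counter the ray/background class imbalance.
        loss = F.cross_entropy(logits, masks.long())
        loss.backward()
        optimizer.step()
    scheduler.step()   # multiplies the learning rate by 0.1 every 20 epochs
```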

The network was trained separately with different loss functions and with and without data augmentation. During data augmentation, the noise was added to the images randomly and the results were saved, so that the same database could be used in each trial. The trained networks were then tested using the test images.

RESULTS AND DISCUSSION

For the test images, the results obtained both with and without data augmentation are given in Table 1. The values given are the mean results over the test images. Because there was no common dataset for fair comparison in the field of wood segmentation, the segmentation results were evaluated using the intersection over union (IoU), accuracy, and BF (contour matching) score measures.

Table 1. Test Results with Data Augmentation and Without Data Augmentation of U-Net Structure

IoU: intersection over union; BF: contour matching score
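
For reference, the IoU and pixel accuracy reported in Table 1 can be computed per test image as in the sketch below; the BF contour-matching score is omitted here because it requires an additional boundary-matching step.

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Per-image IoU and pixel accuracy for binary ray masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = intersection / union if union else 1.0
    accuracy = (pred == target).mean()
    return iou, accuracy
```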

In the proposed U-Net structure, the depth of the encoder was reduced because of the small size of the images used in training. Accordingly, the dilation factor of the convolution layers was increased while the number of max-pooling layers was decreased. As shown in Table 1, the U-Net structure gave better results with data augmentation.

Sample segmentation results of Alnus incana DC, Berberis vulgaris L., and Quercus robur L. species are shown in Fig. 3. Details of the morphological processes used in comparison are provided in the study by Ergün (2019). As shown in Fig. 3, the method applied in determining the rays yielded quite good results, with a decrease in the number of incorrectly identified rays.

Fig. 3. Results of image segmentation using the sample images: a) original images; b) morphological operations; and c) U-Net for ray segmentation (Alnus incana DC, Berberis vulgaris L., and Quercus robur L., respectively)

The network was trained with images of small size. Because a full image contains a large amount of data and the segmentation target occupies only a small fraction of its pixels (Tong et al. 2018), the network trained with small images was also used for the segmentation of larger images. The pixels in the border region were added symmetrically around the image for seamless segmentation of the larger images. The smaller input image size can be an advantage for low-resolution images, because such patches still contain enough information. Thus, deep learning methods can be used without the need for advanced hardware.
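
The symmetric border extension described above corresponds to reflect (mirror) padding; a minimal sketch is shown below, where the border width is an assumed value.

```python
import numpy as np

def mirror_pad(image, border=16):
    """Reflect the border pixels symmetrically around the image so that a larger
    image can be tiled into patches and segmented without edge artifacts."""
    return np.pad(image, border, mode='reflect')

# Usage: pad the full image, tile it into patches, run the network on each patch,
# stitch the predictions back together, and finally crop the border off again.
```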

The U-Net is an architecture built from convolutional neural network layers, and it yields more successful results in pixel-based image segmentation than conventional models.

CONCLUSIONS

  1. Segmentation of images can be a challenging problem, especially when sufficient data are not available, whether at high or low resolution. It is an area in which both current and older approaches can be evaluated to develop new methods.
  2. The U-Net model is an effective method for detecting wood rays. It allows the rays of wood to be characterized more quantitatively.
  3. Image segmentation is especially important for the later stages. Successful segmentation will help to determine ray characteristics, such as homogeneous, heterogeneous, multiseriate, and uniseriate, more accurately and automatically in the subsequent stages.

REFERENCES CITED

Alfonso, V. A., Baas, P., Carlquist, S., Chimelo, J. P., Coradin, V. T. R., Détienne, P., Gasson, P. E., Grosser, D., Ilic, J., Kuroda, K., Miller, R. B., Ogata, K., and Richter, H. G. (1989). “IAWA list of microscopic features for hardwood/softwood identification,” 116.

Boztoprak, H., and Ergun, M. E. (2017). “Determination of vessel and fibers in hardwoods,” Gaziosmanpasa Journal of Scientific Research 6(2), 87-96.

Brunel, G., Borianne, P., Subsol, G., Jaeger, M., and Caraglio, Y. (2014). “Automatic identification and characterization of radial files in light microscopy images of wood,” Ann. Bot-London. 114(4), 829-840. DOI: 10.1093/aob/mcu119

Burgert, I., and Eckstein, D. (2001). “The tensile strength of isolated wood rays of beech (Fagus sylvatica L.) and its significance for the biomechanics of living trees,” Trees, 15(3), 168-170. DOI: 10.1007/s004680000086

Eigen, D., and Fergus, R. (2015). “Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture,” in: Proceedings of the IEEE International Conference On Computer Vision, Santiago, Chile, pp. 2650-2658.

Ergün, H. (2019). “Determination of homogene rays with morphological processes in hardwood and softwood,” Journal of Engineering Sciences and Design 7(1), 52-59. DOI: 10.21923/jesd.463819

He, K., Zhang, X., Ren, S., and Sun, J. (2015). “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification,” in: IEEE International Conference on Computer Vision (ICCV). DOI: 10.1109/ICCV.2015.123

Japkowicz, N., and Stephen, S. (2002). “The class imbalance problem: A systematic study,” Intell. Data Anal. 6(5), 429-449. DOI: 10.3233/IDA-2002-6504

Kennel, P., Subsol, G., Guéroult, M., and Borianne, P. (2010). “Automatic identification of cell files in light microscopic images of conifer wood,” in: 2010 2nd International Conference on Image Processing Theory, Tools and Applications (IPTA), IEEE, Paris, France, pp. 98-103. DOI: 10.1109/IPTA.2010.5586800

Kingma, D. P., and Ba, J. (2015). “Adam: A method for stochastic optimization,” in: International Conference on Learning Representations (ICLR), San Diego, CA, USA (arXiv:1412.6980).

Liu, Z., Cao, Y., Wang, Y., and Wang, W. (2019). “Computer vision-based concrete crack detection using U-net fully convolutional networks,” Automation in Construction 104, 129-139. DOI: 10.1016/j.autcon.2019.04.005

Noshiro, S., and Suzuki, M. (2001). “Ontogenetic wood anatomy of tree and subtree species of Nepalese Rhododendron (Ericaceae) and characterization of shrub species,” Am. J. Bot. 88(4), 560-569. DOI: 10.2307/2657054

Oktay, A. B., and Gurses, A. (2019). “Automatic detection, localization and segmentation of nano-particles with deep learning in microscopy images,” Micron 120, 113-119. DOI: 10.1016/j.micron.2019.02.009

Pan, S., and Kudo, M. (2012). “Recognition of wood porosity based on direction insensitive feature sets,” Trans. MLDM 5(1), 45-62.

Ronneberger, O., Fischer, P., and Brox, T. (2015). “U-Net: Convolutional networks for biomedical image segmentation,” in: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi (eds.), Springer International Publishing, New York, NY, USA, pp. 234-241. DOI: 10.1007/978-3-319-24574-4_28

Schoch, W., Heller, I., Schweingruber, F. H., and Kienast, F. (2004). Wood Anatomy of Central European Species, Swiss Federal Institute for Forest, Snow and Landscape Research (WSL).

Tong, G., Li, Y., Chen, H., Zhang, Q., and Jiang, H. (2018). “Improved U-NET network for pulmonary nodules segmentation,” Optik 174, 460-469. DOI: 10.1016/j.ijleo.2018.08.086

Wei, Y., Chen, H., Wang, H., Wei, D., Wu, Y., and Fan, K. (2019). “Detection of nano-particles based on machine vision,” in: 2019 IEEE International Conference on Manipulation, Manufacturing and Measurement on the Nanoscale (3M-NANO), Jiangsu, China, pp. 189-192. DOI: 10.1109/3M-NANO46308.2019.8947355

Weiss, G. M., and Provost, F. (2001). The Effect of Class Distribution on Classifier Learning: An Empirical Study (Technical Report ML-TR-44), Rutgers University, Piscataway, NJ, USA.

Yang, J., Faraji, M., and Basu, A. (2019). “Robust segmentation of arterial walls in intravascular ultrasound images using Dual Path U-Net,” Ultrasonics 96, 24–33. DOI: 10.1016/j.ultras.2019.03.014

Article submitted: April 11, 2020; Peer review completed: October 18, 2020; Revised version received and accepted: November 26, 2020; Published: December 7, 2020.

DOI: 10.15376/biores.16.1.721-728