
Wood Species Classification in Open Set Using an Improved NNO Classifier

Ke-Xin Zhang and Peng Zhao *

A wood species classification scheme was developed for the open set scenario using an improved Nearest Non-Outlier (NNO) classifier. Near infrared (NIR) spectral curves were collected in the spectral band 950 to 1650 nm by a micro spectrometer. Spectral dimension reduction was performed with a Metric Learning (ML) algorithm. Two improvements were proposed for the subsequent NNO classifier. First, a cluster analysis was performed within each wood class using a Density Peak Clustering (DPC) algorithm to obtain 1 to 3 clusters. Second, the fixed threshold shared by all wood classes was replaced by a variable threshold for each cluster. This threshold defines an internal boundary for one wood species, which is then used to compute a class membership score for all wood species. The classification accuracy based on the clusters of each wood class was better than that based on each whole class. Experimental results in different open set scenarios demonstrate that the improved NNO classifier outperformed the original NNO classifier and other state-of-the-art open set recognition (OSR) algorithms.

DOI: 10.15376/biores.20.1.944-955

Keywords: Wood species classification; Open set recognition; Spectral analysis; NNO algorithm

Contact information: School of Computer Science and Technology, Guangxi University of Science and Technology, Liuzhou 545006, China; * Corresponding author: bit_zhao@aliyun.com

INTRODUCTION

There are approximately 60,000 tree species around the world according to statistical data. It is very hard to develop a wood species classification system that covers all of these tree species. In most cases, only the tree species in a specific region (e.g., Heilongjiang Province of China) or a specific category (e.g., mahogany) need classification. These specific tree species are included in the training set of the classification system; however, tree species not in the training set should be rejected correctly by the system. In summary, wood species classification should be studied in an open set scenario in practice. However, most wood species classification investigations are performed in a closed set scenario, with emphasis on classification methodologies such as spectral analysis and anatomical analysis (Zhan et al. 2023; Ma et al. 2021; Park et al. 2021; Tuncer et al. 2021). In a closed set scenario, the wood species in the training set (i.e., the known species) can be classified correctly, whereas those not in the training set (i.e., the unknown species) will be misclassified as one of the known species. In fact, to the best of our knowledge, the number of wood species in the training sets of almost all wood species classification systems is under 100. It is therefore appropriate to study wood species classification systems in an open set scenario, since such systems are very likely to encounter wood samples from unknown wood species.

Open set recognition (OSR) has been investigated for more than 10 years. It can not only classify the known classes in the training set correctly, but also reject those unknown classes not included in the training set (Geng et al. 2021). Scheirer et al. (2013) proposed a 1-vs-Set Machine based on the one-class Support Vector Machine (SVM) for OSR applications in image processing. A Compact Abating Probability (CAP) model was proposed later (Scheirer et al. 2014) in combination with statistical Extreme Value Theory (EVT), and a Weibull distribution calibrated SVM (W-SVM) was proposed to further improve the OSR classification accuracy. Jain et al. (2014) proposed the PI-SVM, which also uses the EVT to model the positive training samples near the decision boundary. Apart from the SVM-based OSR schemes, some other machine learning based OSR schemes have been proposed. For instance, Zhang and Patel (2017) proposed a sparse representation based OSR scheme, which uses the EVT to model the tail distribution of the reconstruction loss. Junior et al. (2017) proposed an Open Set Version of the Nearest Neighbor Classifier (OSNN). In this scheme, the distances between a detected sample s and its two nearest neighbor samples t and u from two different classes are calculated, and their ratio is computed as Ratio = d(s, t)/d(s, u). If Ratio ≤ TR (a preset threshold), then s is classified into the class that includes sample t; otherwise, s is rejected as an unknown class. Moreover, some deep learning based OSR schemes have been proposed, such as the OpenMax neural network (Bendale and Boult 2016), where the SoftMax layer is replaced by an OpenMax layer to modify the membership probabilities of the known and unknown classes.

Bendale and Boult (2015) extended the Nearest Class Mean (NCM) classifier and proposed the Nearest Non-Outlier (NNO) classifier for OSR use. The distances between a detected sample and each known class mean are computed to classify the known classes and reject the unknown classes. Moreover, a Metric Learning (ML) algorithm (Mensink et al. 2013) is used for feature dimension reduction. When a new class is added into the training set of the classification system, the previous projection matrix W can still be used with near-zero errors, as pointed out by Mensink et al. (2013). However, the NNO classifier can achieve satisfactory classification accuracy in an open set scenario only when every known class has a similar spherical distribution (i.e., approximately the same sphere radius). In practice, this strict constraint is hard to satisfy.

In this article, the NNO classifier was improved so that it can handle known classes with different shape distributions (e.g., sphere structures or manifold structures) in OSR use. Specifically, every known class is further processed by an automatic cluster analysis to obtain 1 to 3 approximately spherical clusters, and within a known class each cluster usually has a different size. The distances between a detected sample and each cluster of one known class are then computed to obtain the membership probability of the known classes. Moreover, each cluster's size threshold is computed by an optimal strategy. In this way, the proposed improved NNO classifier can achieve more accurate open set classification results in practice for known classes with different shape distributions. As for the classification feature, the NIR spectral curve is used here, since it offers fast speed, high accuracy, and non-destructive testing (Ma et al. 2021; Park et al. 2021; Tuncer et al. 2021).

EXPERIMENTAL

There were in total 35 wood species in the experimental wood dataset, including both broadleaved and coniferous tree species. The dataset contained some similar wood species with similar colors and textures or within the same genus. The specific tree species information is given in Table 1. The cross sections of these wood species were used for spectral acquisition and are illustrated in Fig. 1. An Ocean Optics Flame-NIR micro spectrometer was used to acquire the NIR spectral curves. The effective wavelength band was 950 to 1650 nm, with a wavelength resolution of 5.4 nm. Each wood species consisted of 50 samples (i.e., NIR spectral curves), so there were 1750 samples in total. Each spectral curve was represented by a 128-dimensional (128D) vector.

Table 1. Information on Experimental Samples

Before the wood spectral collection, wood sample pre-processing was performed. First, 25 wood blocks from different trees were selected for every tree species. These 25 wood blocks were then cut into small wood samples with a size of 2 × 2 × 3 cm. The 2 × 2 cm surface was the cross section, while the 2 × 3 cm surfaces were the radial or tangential sections. Second, 2 wood samples from every wood block were randomly selected so as to obtain 50 wood samples with a size of 2 × 2 × 3 cm in total for each wood species. To remove the uneven burrs left by the wood cutting procedure, sandpaper of 800 to 1200 mesh was used to polish the cross sections of the wood samples. Finally, because the wood NIR spectral curves may be sensitive to external environmental factors such as temperature and humidity, the spectral acquisition was performed in a room with a temperature of 24 °C and a humidity of 35%. It should be noted that the physical properties of wood samples are influenced by variables such as the age of the trees, geographic origin, growth ring position, and the proportion of latewood versus earlywood. These variables were controlled effectively during wood spectral acquisition so that the within-class difference of the spectral curves for each wood species was adequately small. In practice, this control was implemented by ensuring that trace(Sw) was small or below a threshold for every species, where Sw denotes the within-class scatter matrix.
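The exact form of this check is not specified in the text; as one plausible reading, the NumPy sketch below computes trace(Sw) for the 50 spectral curves of a single species, using the usual definition of the within-class scatter matrix:

```python
import numpy as np

def within_class_scatter_trace(spectra):
    """trace(Sw) for one species: summed squared deviation of its
    spectral curves from the species mean (smaller = tighter class)."""
    X = np.asarray(spectra, dtype=float)   # shape: (n_samples, n_bands)
    centered = X - X.mean(axis=0)
    Sw = centered.T @ centered             # within-class scatter matrix
    return float(np.trace(Sw))

# Example: 50 curves of 128 bands; flag the species if the trace is too large.
curves = np.random.rand(50, 128)           # stand-in for one species' NIR curves
print(within_class_scatter_trace(curves))
```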

Fig. 1. Cross sections of the 35 wood species (the serial numbers are the same as those in Table 1)

Figure 2 shows the experimental spectral collection setup, which mainly consists of a computer, a spectrometer, an optical fiber, and a light source (i.e., a halogen lamp). The spectral acquisition is performed in the following steps. First, a spectral calibration is performed using a standard whiteboard. Then, one wood sample is placed on the holder, and the distance between the wood sample and the fiber probe is adjusted. Finally, the spectral reflectance curves are acquired and saved on the computer.

Fig. 2. Experimental spectral acquisition setup

Spectral Dimension Reduction

Before the wood spectral dimension reduction, a wood spectral pre-processing procedure is required. Figure 3 illustrates the spectral reflectance curves of the 35 wood species. A standard normal variate (SNV) correction and a smoothing correction with a moving window of 5 × 5 size are often applied to the NIR spectral curves to ensure a good classification accuracy.
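A minimal sketch of this pre-processing is given below; it assumes a one-dimensional, 5-point moving-average reading of the stated smoothing window and an SNV applied per curve, which is one common interpretation rather than the authors' exact implementation:

```python
import numpy as np

def snv(spectrum):
    """Standard normal variate correction: zero mean and unit variance
    for each individual spectral curve."""
    return (spectrum - spectrum.mean()) / spectrum.std()

def smooth(spectrum, window=5):
    """Moving-average smoothing; a 1-D window of length 5 is assumed."""
    kernel = np.ones(window) / window
    return np.convolve(spectrum, kernel, mode="same")

curve = np.random.rand(128)        # stand-in for one 128D NIR reflectance curve
pre = smooth(snv(curve))           # SNV correction followed by smoothing
```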

Fig. 3. Spectral reflectance curves of cross sections of 35 wood species

The NIR spectral curve is a 128D vector, so a spectral dimension reduction is usually performed to decrease the redundant information and increase the computational efficiency. Feature dimension reduction algorithms that can be used include principal component analysis (PCA) (Reddy et al. 2020), multidimensional scaling (MDS) (Mignotte 2011), locally linear embedding (LLE) (Yu et al. 2020), Laplacian eigenmaps (Belkin and Niyogi 2003), and Kernel PCA (Alhayani and Ilhan 2017).

In this work, the Metric Learning (ML) algorithm proposed by Mensink et al. (2013) was used for spectral dimension reduction. When a new class is added into the training set of the open set classifier, the previous projection matrix W can still be used with near-zero errors, as pointed out by Mensink et al. (2013). Therefore, this projection matrix W can be used in an incremental learning classifier, in which the number of known classes increases gradually. Such an incremental learning classifier is natural in an open set scenario, since one often wishes to gradually increase the number of known classes in the training set so that more species can be classified as known classes rather than rejected as unknown. Due to these advantages, this ML algorithm is also used in the NNO classifier (Bendale and Boult 2015). If one defines an original 128D spectral vector as v, then the new spectral vector after spectral dimension reduction is denoted as Wv. The detailed computation procedure is given by Mensink et al. (2013) and is omitted here.
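The practical payoff for incremental learning can be sketched in a few lines of Python; the dimensions, the random stand-in for W, and the species label below are illustrative assumptions, not the authors' trained values. Registering a new known species only requires its projected mean, while W itself is reused unchanged:

```python
import numpy as np

m, n = 32, 128                  # assumed reduced and original dimensions
W = np.random.randn(m, n)       # stand-in for the learned ML projection matrix

class_means = {}                # wood species label -> projected class mean

def add_class(label, spectra):
    """Incremental learning: a new known species only needs its projected
    mean; the previously learned projection W is reused unchanged."""
    class_means[label] = (np.asarray(spectra) @ W.T).mean(axis=0)

add_class("Pinus koraiensis", np.random.rand(50, n))   # hypothetical species
```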

Original NNO Classifier

The original NNO classifier was proposed by Bendale and Boult (2015). In summary, the original NNO classifier is an extended version of the NCM classifier for OSR use. In an NNO classifier, a confidence score for one known class y is defined as follows,

s_y(x) = Z_\tau \left( 1 - \frac{\lVert W x - \mu_y \rVert_2}{\tau} \right)    (1)

where x is a detected sample vector, and the parameter \tau is a fixed threshold that defines a sphere radius around each known class mean \mu_y. This \tau is determined in advance by an experienced expert, and it is a constant value for all known classes in the original NNO classifier. The distance \lVert W x - \mu_y \rVert_2 is measured in the space projected by W. The normalization factor is defined as Z_\tau = \Gamma(m/2 + 1) / (\pi^{m/2} \tau^m), so that s_y integrates to 1 over the domain s_y(\cdot) > 0 (\Gamma represents the standard gamma function). In fact, Z_\tau is the inverse of the volume of a sphere with radius \tau and dimension m.

As for open set classification, a detected sample x is rejected by one known class y when s_y(x) \le 0, and x is rejected as an unknown class only when it is rejected by all known classes. Otherwise, x is classified into the known class with the largest positive s_y(x). The projection matrix W is learned offline on an initial training set of known classes. This matrix can still be used with near-zero errors in an incremental learning classifier where new classes are gradually added into the training set, as pointed out by Mensink et al. (2013).
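For concreteness, a minimal Python sketch of this decision rule is given below; it assumes the class means are already in the projected space and is an illustration of Eq. 1, not the authors' implementation:

```python
import numpy as np
from math import gamma, pi

def nno_score(x, mu, W, tau):
    """Confidence score of Eq. 1; positive iff x falls inside the sphere
    of radius tau around the projected class mean mu."""
    m = W.shape[0]                                        # reduced dimension
    Z = gamma(m / 2 + 1) / (pi ** (m / 2) * tau ** m)     # inverse sphere volume
    return Z * (1 - np.linalg.norm(W @ x - mu) / tau)

def nno_classify(x, class_means, W, tau):
    """Pick the known class with the largest positive score, or return
    None to reject x as an unknown class."""
    scores = {y: nno_score(x, mu, W, tau) for y, mu in class_means.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```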

Proposed Improved NNO Classifier

The original NNO classifier (Bendale and Boult 2015) has some disadvantages. First, a constant threshold \tau is used for all known classes. Even assuming that all known classes have spherical distribution structures, their sphere radii are usually different, so different threshold values should be used. Second, in practice, the known classes may have different topological distribution structures, such as sphere structures and manifold structures. A manifold structure can usually be divided into different clusters, and this division can be accomplished by a cluster analysis.

To overcome the above-mentioned two disadvantages, an improved NNO classifier was proposed here. For every known class, a Density Peak Clustering (DPC) algorithm is used to perform a cluster analysis (Rodriguez and Laio 2014). In this clustering algorithm, the number of clusters does not need to be determined in advance; it is determined automatically in the clustering process. Moreover, this algorithm is hardly influenced by outliers. The detailed clustering procedure is as follows.

The clustering centers have a relatively high local density, and they are relatively far from any points with a higher local density. A local density \rho_i and a distance \delta_i were calculated for each sample x_i. The local density \rho_i is calculated by either a cut-off kernel (Eq. 2) or a Gaussian kernel (Eq. 3),

\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c)    (2)

\rho_i = \sum_{j \neq i} \exp\left( -\left( d_{ij} / d_c \right)^2 \right)    (3)

\chi(x) = \begin{cases} 1, & x < 0 \\ 0, & x \ge 0 \end{cases}    (4)

where d_{ij} is the distance between sample x_i and sample x_j; d_c is the cut-off distance that is determined in advance by an expert; and \chi(\cdot) is the 0-1 function defined by Eq. 4. Therefore, the \rho_i computed by Eq. 2 is a discrete value, whereas that computed by Eq. 3 is a continuous value. The \delta_i is the distance between a sample x_i and its nearest sample among those samples with a higher local density, as illustrated by Eq. 5 (for the sample with the highest density, \delta_i is taken as the largest distance to any other sample). Therefore, a sample is possibly a clustering center when it has both a relatively large \rho_i and a relatively large \delta_i, so the likelihood of one sample being a clustering center can be computed by Eq. 6. Once the clustering centers are determined, each remaining sample is classified into the cluster of its nearest neighbor with a higher local density.

\delta_i = \min_{j:\, \rho_j > \rho_i} d_{ij}    (5)

\gamma_i = \rho_i \delta_i    (6)
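To make these quantities concrete, the following NumPy sketch computes \rho, \delta, and \gamma for a sample matrix X using the Gaussian kernel only; the final cluster-assignment step is omitted, and the code is an illustration rather than the authors' implementation:

```python
import numpy as np

def dpc_scores(X, d_c):
    """Local density (Gaussian kernel, Eq. 3), delta (Eq. 5), and the
    center score gamma = rho * delta (Eq. 6) for every sample in X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    np.fill_diagonal(d, np.inf)                 # exclude the j == i terms
    rho = np.exp(-(d / d_c) ** 2).sum(axis=1)   # Eq. 3 (diagonal contributes 0)
    delta = np.empty(len(X))
    for i in range(len(X)):
        higher = rho > rho[i]                   # samples with higher density
        if higher.any():
            delta[i] = d[i, higher].min()       # Eq. 5
        else:                                   # densest sample: farthest distance
            delta[i] = d[i][np.isfinite(d[i])].max()
    return rho, delta, rho * delta              # centers = largest gamma values
```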

The DPC algorithm is applied to the spectral vectors after spectral dimension reduction by the above ML algorithm (Mensink et al. 2013); such a reduced vector is denoted as Wv. In the following wood spectral classification experiments, 1 to 3 clusters were obtained for each known wood species. All these clusters can be approximated by spheres with different radii, and each radius serves as the threshold for its cluster sphere. Then the NNO classification is performed using these clusters of different sizes. Please note that the distance in Eq. 1 is computed as a Mahalanobis distance instead of a Euclidean distance, as illustrated by Eq. 7,

d(x, \mu_c) = \sqrt{ (W x - \mu_c)^{\mathrm{T}} \, C^{-1} \, (W x - \mu_c) }    (7)

where \mu_c is a cluster center after spectral dimension reduction, and C is the covariance matrix of the cluster whose center is \mu_c. The different thresholds of the different clusters can be obtained by an optimal grid search. More accurate classification is achieved because the NNO classifier is applied to the extracted clusters with their individual radius thresholds. In contrast, the original NNO classifier is applied to the original known classes with one shared radius \tau, even though some classes may have manifold distribution structures, which may produce large classification errors.
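The paper states that the per-cluster thresholds come from an optimal grid search without giving the selection criterion; the sketch below therefore assumes a simple coverage rule (the smallest candidate radius enclosing about 95% of the cluster's own training samples) purely for illustration:

```python
import numpy as np

def mahalanobis(x_proj, center, cov):
    """Mahalanobis distance of Eq. 7 between a projected sample and a cluster."""
    diff = x_proj - center
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

def grid_search_tau(cluster_samples, center, cov, candidates):
    """Choose a per-cluster radius threshold from a candidate grid; the
    95% coverage criterion is an assumption, not the paper's rule."""
    dists = np.array([mahalanobis(s, center, cov) for s in cluster_samples])
    for tau in sorted(candidates):
        if (dists <= tau).mean() >= 0.95:       # smallest tau covering ~95%
            return tau
    return max(candidates)
```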

RESULTS AND DISCUSSION

Classification Performance Evaluations

Performance evaluation measures play an important role in judging the classification performance of an OSR classifier. To comprehensively assess the classification performance in OSR, three measures were used: the F-Score, the Kappa coefficient, and the overall recognition accuracy (ORA). The F-Score computation is based on the Precision and Recall, as illustrated in Eqs. 8 to 10. Here TP represents the number of correctly classified samples from the known classes, whereas FN and FP represent the numbers of misclassified samples from the known and unknown classes, respectively.

\text{Precision} = \frac{TP}{TP + FP}    (8)

\text{Recall} = \frac{TP}{TP + FN}    (9)

\text{F-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}    (10)
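As a worked example with hypothetical counts (not taken from the paper's tables), the following snippet evaluates Eqs. 8 to 10:

```python
def osr_f_score(tp, fp, fn):
    """Precision (Eq. 8), Recall (Eq. 9), and F-Score (Eq. 10)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical counts: 820 known samples classified correctly, 60 unknown
# samples wrongly accepted, 120 known samples misclassified or rejected.
print(osr_f_score(tp=820, fp=60, fn=120))   # ~ (0.932, 0.872, 0.901)
```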

Dataset Partition

There are 35 wood species in total in the wood dataset, as illustrated in Table 1, and each wood species consists of 50 spectral samples. The wood dataset was divided into 3 groups to test the proposed improved NNO classifier in an open set scenario. These 3 groups are explained as follows.

Group 1: The initial 5 wood species are selected randomly as the known species to form the training set of the NNO classifier, and these 5 wood species are used in the ML algorithm (Mensink et al. 2013) to obtain the projection matrix W. Then, in the incremental learning process, another 5 wood species are selected randomly and added into the training set of the improved NNO classifier. This incremental learning step is repeated 4 times. Finally, the training dataset consists of 25 known wood species, and the unknown wood dataset consists of the remaining 5 wood species. This dataset partition is illustrated in Table 2.

Group 2: The initial 5 wood species are selected randomly as the known species to form the training set, and these 5 wood species are used in the ML algorithm (Mensink et al. 2013) to obtain the projection matrix W. Then, in the incremental learning process, another 5 wood species are added into the training set. This incremental learning step is repeated 3 times. Finally, the training dataset consists of 20 known wood species, and the unknown wood dataset consists of the remaining 10 wood species. This dataset partition is illustrated in Table 3.

Group 3: The initial 10 wood species are selected randomly as the known species to form the training set, and these 10 wood species are used in the ML algorithm (Mensink et al. 2013) to obtain the projection matrix W. Then, in the incremental learning process, another 5 wood species are added into the training set. This incremental learning step is repeated 3 times. Finally, the training dataset consists of 25 known wood species, and the unknown wood dataset consists of the remaining 5 wood species. This dataset partition is illustrated in Table 4.

Table 2. The Known and Unknown Wood Species Number Partition in Group 1

Table 3. The Known and Unknown Wood Species Number Partition in Group 2

Table 4. The Known and Unknown Wood Species Number Partition in Group 3

Wood Species Classification Comparisons

The proposed improved NNO classifier was compared in the open set scenario with 5 other representative OSR classifiers. The first was the original NNO classifier (Bendale and Boult 2015), used as a baseline. The second was a conventional OSR classifier, which consisted of two parts.

Table 5. The OSR Classification Performance Comparisons in Group 1

Table 6. The OSR Classification Performance Comparisons in Group 2

Table 7. The OSR Classification Performance Comparisons in Group 3

The first part was a one-class classifier, Weight-SVDD (Tax and Duin 2004), which classifies wood samples into the known and unknown categories. The second part was a multi-class classifier, LibSVM, which classifies the known category given by the first part into specific wood species. The optimal parameters of the Weight-SVDD and the LibSVM were tuned separately. In the third OSR classifier, a K-Means cluster analysis was used instead of the DPC automatic clustering, with the cluster number K = 2 for each known class in an NNO classifier. The fourth was the OSNN classifier proposed by Junior et al. (2017), with the threshold TR = 0.3 and the ML algorithm used for its spectral dimension reduction.

The final comparative OSR classifier was the OpenMax network (Bendale and Boult 2016), with a ResNet-50 as the backbone network. This OpenMax network is mainly applied in the image processing field, so the visible images of wood cross sections were used as its input. The original image size was 1280 × 960 at a magnification of 50×; each original image was cropped to a 960 × 960 image, which was then resized to a 224 × 224 image. The class probability threshold was set to 40%, so that a detected sample was classified as an unknown class if its largest class membership probability was less than 40%. The specific OSR classification comparisons are presented in Tables 5 to 7, where bold fonts indicate the best classification accuracy in the relevant column.
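A rough sketch of this input pipeline and rejection rule is given below; the centered crop position and the helper names are assumptions, and the OpenMax probability recalibration itself is omitted:

```python
import numpy as np
from PIL import Image

def preprocess(path):
    """Crop the 1280 x 960 capture to 960 x 960 (a centered crop is assumed),
    then resize to the 224 x 224 input of the ResNet-50 backbone."""
    img = Image.open(path)
    img = img.crop((160, 0, 1120, 960))     # box = (left, upper, right, lower)
    return img.resize((224, 224))

def decide(class_probs, threshold=0.40):
    """Accept the top class only if its probability reaches 40%;
    otherwise reject the sample as an unknown species (label -1)."""
    best = int(np.argmax(class_probs))
    return best if class_probs[best] >= threshold else -1
```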

CONCLUSIONS

  1. As for the spectral dimension reduction, the metric learning (ML) algorithm proposed by Mensink et al. (2013) seems to be a proper choice. This algorithm can be used efficiently in an incremental learning classifier: when new classes are added into the training set of the open set recognition (OSR) classifier, the previous projection matrix W can still be used with near-zero errors, as pointed out by Mensink et al. (2013). In contrast, in other feature dimension reduction algorithms, such as principal component analysis (PCA) and Kernel PCA, the projection matrix must be relearned and recalculated whenever new classes are added.
  2. In the specific OSR experimental comparisons across the 3 groups, the proposed improved nearest non-outlier (NNO) classifier (NNO+DPC) greatly outperformed the original NNO classifier. Moreover, in most cases, the improved NNO classifier also outperformed the five other representative OSR classifiers. Its better classification performance comes from the classification strategy based on the different clusters of the known classes with their different thresholds. In summary, the proposed improved NNO classifier performs a more refined and specific classification in an open set scenario.
  3. In each experimental group, the proposed improved NNO classification performance measures (i.e., ORA, Kappa coefficient, and F-Score) decreased to some extent when the known class number increased while the unknown class number remained fixed. This indicates that the number of known classes affects the NNO classification performance.

ACKNOWLEDGEMENTS

This research was supported by the National Natural Science Foundation of China (Grant number 62265001), and the Guangxi University of Science and Technology Doctoral Research Fund (Grant number 22Z07).

Availability of Data and Materials

The wood spectral dataset used to support the findings of this study is confidential, but it is available from the corresponding author upon reasonable request.

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Ke-Xin Zhang performed the wood species recognition experiments and collated the experimental results. Peng Zhao proposed the research idea and the experimental framework and wrote the manuscript. All authors read and approved the final manuscript.

REFERENCES CITED

Alhayani, B., and Ilhan, H. (2017). “Hyper-spectral image classification using dimensionality reduction techniques,” International Journal of Innovative Research in Electrical, Electronics, Instrumentation and Control Engineering 5(4), 71-74. DOI: 10.17148/IJIREEICE.2017.5414

Belkin, M., and Niyogi, P. (2003). “Laplacian eigenmaps for dimensionality reduction and data representation,” Neural Computation 15(6), 1373-1396. DOI: 10.1162/089976603321780317

Bendale, A., and Boult, T. E. (2015). “Towards open world recognition,” in: Proceedings of IEEE Computer Vision and Pattern Recognition, Boston, USA, pp. 1893-1902.

Bendale, A., and Boult, T. E. (2016). “Towards open set deep networks,” in: Proceedings of IEEE Computer Vision and Pattern Recognition, Las Vegas, USA, pp. 1563-1572.

Geng, C. X., Huang, S. J., and Chen, S. C. (2021). “Recent advances in open set recognition: A survey,” IEEE Trans Pattern Analysis and Machine Intelligence 43(10), 3614-3631. DOI: 10.1109/TPAMI.2020.2981604

Jain, L. P., Scheirer, W. J., and Boult, T. E. (2014). “Multi-class open set recognition using probability of inclusion,” in: Proceedings of European Conference of Computer Vision, Zurich, Switzerland, pp. 393-409.

Junior, P. R. M., Souza, R. M. D., Werneck, R. D. O., Stein, B. V., Pazinato, D. V., Almeida, W. R., Penatti, O. A., Torres, R. D., and Rocha, A. (2017). “Nearest neighbors distance ratio open set classifier,” Machine Learning 106(3), 359-386. DOI: 10.1007/s10994-016-5610-8

Ma, T., Inagaki, T., and Tsuchikawa, S. (2021). “Demonstration of the applicability of visible and near-infrared spatially resolved spectroscopy for rapid and nondestructive wood classification,” Holzforschung 75(5), 419-427. DOI: 10.1515/hf-2020-0074

Mensink, T., Verbeek, J., Perronnin, F., and Csurka, G. (2013). “Distance-based image classification: generalizing to new classes at near-zero cost,” IEEE Trans Pattern Analysis and Machine Intelligence 35(11), 2624-2637. DOI: 10.1109/TPAMI.2013.83

Mignotte, M. (2011). “MDS-based multiresolution nonlinear dimensionality reduction model for color image segmentation,” IEEE Transactions on Neural Networks 22(3), 447-460. DOI: 10.1109/TNN.2010.2101614

Park, S. Y., Kim, J. H., Kim, J. C., Yang, S. Y., and Choi, I. G. (2021). “Classification of softwoods using wood extract information and near infrared spectroscopy,” BioResources 16(3), 5301-5312. DOI: 10.15376/biores.16.3.5301-5312

Reddy, G. T., Reddy, M. P. K., Lakshmanna, K., Kaluri, R., Rajput, D. S., and Srivastava, G. (2020). “Analysis of dimensionality reduction techniques on big data,” IEEE Access 8, 54776-54788. DOI: 10.1109/ACCESS.2020.2980942

Rodriguez, A., and Laio, A. (2014). “Clustering by fast search and find of density peaks,” Science 344(6191), 1492-1496. DOI: 10.1126/science.1242072

Scheirer, W. J., Jain, L. P., and Boult, T. E. (2014). “Probability models for open set recognition,” IEEE Trans Pattern Analysis and Machine Intelligence 36(11), 2317-2324. DOI: 10.1109/TPAMI.2014.2321392

Scheirer, W. J., Rocha, A. D. R., Sapkota, A., and Boult, T. E. (2013). “Toward open set recognition,” IEEE Trans Pattern Analysis and Machine Intelligence 35(7), 1757-1772. DOI: 10.1109/TPAMI.2012.256

Tax, D. M. J., and Duin, R. P. W. (2004). “Support vector data description,” Machine Learning 54(1), 45-66.

Tuncer, F. D., Dogu, D., and Akdeniz, E. (2021). “Efficiency of preprocessing methods for discrimination of anatomically similar pine species by NIR spectroscopy,” Wood Material Science & Engineering 1-10. DOI: 10.1080/17480272.2021.2012821

Yu, Z., Qin, L., Chen, Y., and Parmar, M. D. (2020). “Stock price forecasting based on LLE-BP neural network model,” Physica A: Statistical Mechanics and its Applications 553, 124197. DOI: 10.1016/j.physa.2020.124197

Zhan, W., Chen, B., Wu, X., Yang, Z., Lin, C., Lin, J., and Guan, X. (2023). “Wood identification of Cyclobalanopsis (Endl.) Oerst based on microscopic features and CTGAN-enhanced explainable machine learning models,” Frontiers in Plant Science 14, article 1203836. DOI: 10.3389/fpls.2023.1203836

Zhang, H., and Patel, V. M. (2017). “Sparse representation-based open set recognition,” IEEE Trans Pattern Analysis and Machine Intelligence 39(8), 1690-1696. DOI: 10.1109/TPAMI.2016.2613924

Article submitted: September 30, 2024; Peer review completed: October 26, 2024; Revised version received and accepted: November 1, 2024; Published: November 27, 2024.

DOI: 10.15376/biores.20.1.944-955