Wu, X., Wu, G., Wang, B., and Li, J. (2023). “Classification of alfalfa hay based on infrared spectroscopy,” BioResources 18(3), 5399-5416.

Abstract

Alfalfa hay plays a decisive role in the quality and safety of livestock products. Chemical analytical methods for alfalfa hay are laborious, time-consuming, and costly. Therefore, suitable methods are required for rapid and accurate detection of alfalfa hay. This study evaluated the feasibility of infrared (IR) spectroscopy for identifying different alfalfa hays. A total of 105 alfalfa hay samples prepared under three different drying methods were analysed. Results indicated that the full spectra models built by BP and SVM after standard normal variate (SNV) transformation, first-derivative (FD), and second-derivative (SD) preprocessing had the best performance, with accuracies of up to 100%. Under the same preprocessing method, the accuracy of BP neural networks was better than that of support vector machine models in most cases. Among the characteristic wavelength-based models, SNV-SD-SPA with BP exhibited better performance than the other pretreatment methods, such as SNV-SPA, SNV-FD-SPA, and SNV-GA. The classification accuracies of moldy-dried alfalfa, sun-dried alfalfa, and shade-dried alfalfa in the training set were 100%, 100%, and 99.5%, respectively, and the accuracies in the prediction set reached 100%, 97.6%, and 97.4%, respectively. Thus, a better theoretical basis was obtained for the grading and online monitoring of alfalfa hay.


Classification of Alfalfa Hay Based on Infrared Spectroscopy

Xiaoqing Wu,a,b Guifang Wu,a,* Bo Wang,a and Jie Li a

DOI: 10.15376/biores.18.3.5399-5416

Keywords: Alfalfa hay; Infrared spectroscopy; Machine learning; Classification

Contact information: a: College of Mechanical & Electrical Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, P.R. China; b: College of Physics and Electronic Information, Inner Mongolia Normal University, Hohhot, 010022, P.R. China; *Corresponding author: wgfsara@imau.edu.cn

INTRODUCTION

Alfalfa hay is an important feed for dairy cows and plays an important role in the healthy and stable development of the dairy market (Darabighane et al. 2020; Lorenzo et al. 2020). Dried alfalfa needs to be compressed and processed into bales of a certain size for storage and transportation (Cheng et al. 2018; Vanzant et al. 1990). To preserve alfalfa nutrition, baling is carried out at a certain water content (Han et al. 2004; Lim et al. 2020). Inadequate anti-mildew measures favor the proliferation of microorganisms and cause alfalfa mildew (Wang et al. 1996). Mildew infestation destroys the nutrient content of alfalfa and lowers its feeding value, and it can cause livestock poisoning and affect milk product quality (Coblentz et al. 1996). Therefore, if alfalfa mildew can be quickly identified during drying or storage, losses can be effectively reduced. Traditional chemical methods for detecting mold are generally cumbersome to operate, slow, and costly (Gfrerer et al. 2004).

Infrared spectroscopy is a fast, nondestructive chemical analysis method that detects the characteristic absorbance frequencies of specific molecules in substances (Xiong et al. 2016; Zhou et al. 2022). Because different chemical components contain different chemical groups with different group frequencies, the positions of their characteristic absorbance peaks differ; moreover, for the same chemical component, the intensity of the characteristic absorbance peaks varies with its content (Hell et al. 2016). Therefore, infrared spectroscopy can be used for both qualitative and quantitative analyses of substances. Traditional mid-infrared spectroscopic analysis requires the preparation of potassium bromide pellets for solid samples. Attenuated total reflection (ATR) technology obtains the information through the reflection signal of the sample surface (Undugodage et al. 2018). It offers high sensitivity, clear characteristic bands, and simple operation, and it requires no sample preparation (Kuronuma et al. 2020). However, the intensity of the overall signal is hard to control when using ATR plate methods, since it depends on the smoothness of the specimen and the pressure with which it is pressed onto the plate. In recent years, infrared spectroscopy techniques have been widely used in food, pharmaceuticals, petrochemicals, tea, wood, feed, and other fields (Mohebby 2010; Wallén et al. 2018; Zapata et al. 2021). There are also some research reports on the detection of food mildew by infrared spectroscopy. Shen and Huang (2019) established an online corn mold detection system using spectra and image information fusion technology. By collecting the spectra and image information of corn samples on days 6, 9, 12, and 15 of storage and establishing a linear discriminant analysis model, they obtained an overall recognition rate of 91.1% for different degrees of mildew. Chu et al. (2014) used near-infrared spectroscopy to detect corn kernels with different degrees of mildew. They used principal component analysis to reduce the dimensionality of the spectral data and established a Fisher discriminant analysis (FDA) model, which had a classification accuracy of 91.4%. At present, most studies use spectroscopy and machine vision techniques to detect mildew in food, and there are few reports on the use of infrared spectroscopy to detect mildew in alfalfa hay.

Infrared spectral data are high-dimensional, and adjacent spectral variables are highly correlated (Tanaka et al. 2011). Building a model on full-spectra data increases the computing time, and the recognition and classification results may not be ideal.

With the development of computer science and artificial intelligence, more machine learning algorithms have been developed and applied to information mining of infrared spectra. Machine learning is a field of study that automatically detects patterns in data from a given database of knowledge and then uses the detected patterns to predict unknown data. Therefore, infrared spectroscopy combined with machine learning may be a potential solution for identifying the quality of alfalfa hay (Kumar et al. 2017).

The objectives of this study were as follows: (1) to acquire spectra of alfalfa hay, (2) to determine the optimal wavelength using successive projection algorithm (SPA) and genetic algorithm (GA), (3) to construct a classification model by using the full spectra and optimal wavelengths, and (4) to use neural network and SVM procedures to classify the extracted features.

EXPERIMENTAL

Preparation of Experimental Samples

Samples used for this study were collected from an experimental field of Inner Mongolia Agricultural University in 2019. They were split into three categories of dry alfalfa: alfalfa dried in the shade, alfalfa naturally dried in the sun, and moldy alfalfa naturally dried in the sun. The alfalfa moisture content was approximately 15% to 20%. The three types of dry alfalfa were first ground into powder using an electric high-speed pulverizer and then passed through a 1-mm mesh sieve to remove impurities. Finally, 5 grams of the powdered alfalfa were weighed using an electronic scale into a 50-mL test tube and covered with plastic wrap for storage. Thirty-five samples were prepared for each type of dried alfalfa, giving 105 samples in total.

Infrared Spectral Acquisition

Infrared spectra were recorded using an attenuated total reflectance sampling accessory (PerkinElmer, Boston, MA, USA) and PE series software. Reflectance data were recorded over the wavenumber range of 400 to 4000 cm-1 with 64 scans per spectrum and a spectral resolution of 4 cm-1. The acquisition time for a single spectrum was 66 s. Background spectra were collected under the same experimental conditions with no sample present on the crystal. To assess repeatability and identify any problems caused by the sample’s finite particle size, three spectra were gathered for each sample, and their average was used for the statistical analysis. The files were exported as comma-separated value (csv) files and imported into the MATLAB software (Mathworks, 2020b, Natick, MA, USA) for analysis. Because the noise was relatively large at both ends of the spectral range, only the 600 to 2000 cm-1 region was retained, leaving 701 variables in each spectrum for preprocessing and modelling.

Methods

Pretreatment of the spectral data

Pretreatment of the averaged spectra was required to eliminate mechanical noise and baseline drift. Pretreatment methods include standard normal variate (SNV) transformation, multiplicative scatter correction (MSC), first derivatives (FD), second derivatives (SD), and Savitzky-Golay convolution smoothing (SG). To eliminate intensity differences between samples, all data were normalized before preprocessing. The SNV transformation primarily corrects for the influence of surface scattering and light intensity changes on the spectra, and MSC is used to eliminate the influence of particle size and scattering caused by particle inhomogeneity (Kamruzzaman et al. 2016). A derivative operation is used to eliminate baseline shift. SG smoothing improves the smoothness of the spectra and reduces the interference of noise (Rahman et al. 2016). In this study, spectral data preprocessing was performed using Unscrambler 10.1 (Camo Software, Oslo, Norway).
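As an illustration of two of the pretreatments named above, the following minimal sketch applies SNV and a first derivative to a single spectrum. It is not the Unscrambler implementation used in the study: the function names, the made-up absorbance values, and the use of a plain finite-difference derivative (rather than a Savitzky-Golay derivative) are all assumptions made for clarity, and the code uses only the Python standard library.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 in the denominator)
    return [(x - m) / s for x in spectrum]

def first_derivative(spectrum, step=1.0):
    """Finite-difference derivative: central differences inside, one-sided at the ends."""
    n = len(spectrum)
    deriv = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        deriv.append((spectrum[hi] - spectrum[lo]) / ((hi - lo) * step))
    return deriv

# Hypothetical absorbance values, not taken from the study.
raw = [0.10, 0.12, 0.18, 0.30, 0.22, 0.15]
offset = [x + 0.05 for x in raw]        # same spectrum with a constant baseline shift
scaled = [2.0 * x + 0.05 for x in raw]  # same spectrum with a gain and an offset

snv_raw = snv(raw)
snv_scaled = snv(scaled)  # SNV removes both the gain and the offset
```

In this toy case, SNV maps the raw and the rescaled spectrum to the same curve, and the first derivative of the raw and the baseline-shifted spectrum coincide, which is the behaviour the text attributes to these two pretreatments.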

Characteristic Wavelength Selection

Infrared spectral data contain hundreds of continuous wavelengths, which are redundant and multicollinear. Eliminating redundant wavelengths and selecting optimal variables can simplify the modelling process, reduce costs and running time, and improve the performance of the model. In this study, the uninformative variable elimination (UVE)-SPA and GA methods were selected to extract the optimal wavelengths in MATLAB (Version 2020a, MathWorks, Natick, MA, USA).

In the UVE-SPA method, UVE removes a large number of uninformative variables. Modelling on the variables selected by UVE can avoid overfitting and improve predictive ability. The SPA mainly addresses collinearity: it is used to select the wavenumbers with the least redundant information and obtain the useful variables with the least collinearity (Mário et al. 2001). SPA has been widely used in the selection of spectral characteristic variables. Its basic principle is to project the candidate wavelengths in the vector space and select the wavelength subset with the least redundancy. The algorithm assumes that the first wavelength k(0) and the number of wavelengths to select, N, are given.
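The projection idea behind SPA can be sketched as follows. This is a simplified illustration under stated assumptions, not the MATLAB routine used in the study: the function name spa, the list-of-columns data layout, and the toy matrix are hypothetical, and the sketch omits the validation step that is normally used to choose k(0) and N.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def spa(columns, k0, n_select):
    """Select n_select column indices, starting from k0, by successive projections.

    columns: list of J vectors, one per wavelength (each holds the values of
    that wavelength across all calibration samples).
    """
    cols = [list(c) for c in columns]  # working copies, projected step by step
    selected = [k0]
    for _ in range(n_select - 1):
        ref = cols[selected[-1]]
        ref_norm2 = dot(ref, ref)
        best_j, best_norm2 = None, -1.0
        for j in range(len(cols)):
            if j in selected:
                continue
            # Project column j onto the subspace orthogonal to the last selected column.
            coef = dot(cols[j], ref) / ref_norm2
            cols[j] = [a - coef * b for a, b in zip(cols[j], ref)]
            norm2 = dot(cols[j], cols[j])
            if norm2 > best_norm2:
                best_j, best_norm2 = j, norm2
        if best_j is None or best_norm2 <= 1e-12:
            break  # nothing left that adds new information
        selected.append(best_j)
    return selected

# Toy data: column 1 is an exact multiple of column 0, so it is fully redundant.
toy_columns = [
    [1.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
]
chosen = spa(toy_columns, k0=0, n_select=3)
```

With this toy matrix the redundant column 1 is never chosen, because its projection orthogonal to column 0 has zero length; the column with the largest remaining projection is picked at each step.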

The genetic algorithm is an intelligent optimization algorithm proposed by John Holland according to the evolutionary laws of organisms in nature. It simulates the phenomena of reproduction, crossover, and gene mutation that occur in natural selection and natural genetic processes (Ji et al. 2022). In each iteration, a set of candidate solutions is retained, and the better individuals are selected from the solution group according to a certain index. Genetic operators (selection, crossover, and mutation) are then used to combine these individuals to produce a new generation of candidate solutions, and this process is repeated until a certain convergence index is met. The specific procedure of the genetic algorithm is depicted in Fig. 1.

Fig. 1. The flow chart of GA
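The loop in Fig. 1 can be sketched for binary wavelength masks as follows. This is a generic illustration, not the GAOT configuration used in the study: tournament selection, single-point crossover, bit-flip mutation, elitism, the toy fitness function, and all parameter values other than the population size of 30 are assumptions introduced here.

```python
import random

def run_ga(fitness, n_bits, pop_size=30, generations=40, p_mut=0.02, seed=0):
    """Maximise `fitness` over binary masks of length n_bits."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: the best individual survives unchanged
        while len(new_pop) < pop_size:
            # Selection: best of three randomly drawn individuals, done twice.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Crossover: single cut point.
            cut = rng.randrange(1, n_bits)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with probability p_mut.
            child = [1 - b if rng.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history

# Hypothetical fitness: reward three "informative" positions, penalise mask size.
INFORMATIVE = (3, 7, 12)

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return 3 * hits - sum(mask)

best_mask, fitness_history = run_ga(toy_fitness, n_bits=20)
```

Because the best individual is always carried over, the best fitness recorded per generation never decreases. In a real application the fitness would be computed from a calibration model built on the wavelengths switched on in the mask.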

BP Neural Network

The back propagation (BP) neural network is a concept proposed by a group of scientists led by Rumelhart and McClelland in 1986 (Rumelhart et al. 1986). The learning process of the BP network is an error correction learning algorithm that is composed of forward propagation and back propagation. In the forward propagation process, the input signal propagates from the input layer through the hidden layer to the output layer via the activation functions. The neuron state of each layer only affects the neuron state of the next layer. If the desired output cannot be obtained in the output layer, the network switches to back propagation, and the error is returned along the original link path. The topology of the neural network is shown in Fig. 2. Equations 1 and 2 provide the weights, thresholds, and transfer functions that link the neurons in the input layer, hidden layer, and output layer,

where n, p, and q are the numbers of neurons in the input layer, hidden layer, and output layer, respectively. In Eq. 2, f1 and f2 are the activation functions of the hidden layer and the output layer. The remaining terms are the weight from the nth input neuron to the pth hidden neuron, the weight from the pth hidden neuron to the qth output neuron, the threshold from the input layer to the hidden layer (Zp), the threshold from the hidden layer to the output layer, and the output of the neural network, yq.
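For orientation, one common way to write the forward pass of a three-layer network of this kind is given below. The symbols x_i, h_j, w_{ij}, v_{jk}, \theta_j, and \varphi_k are notation introduced here for illustration and may not match the symbols or sign conventions of Eqs. 1 and 2 in the original article.

```latex
% Hidden-layer output of neuron j (j = 1, ..., p), given inputs x_1, ..., x_n
h_j = f_1\!\left( \sum_{i=1}^{n} w_{ij}\, x_i - \theta_j \right)

% Network output of neuron k (k = 1, ..., q)
y_k = f_2\!\left( \sum_{j=1}^{p} v_{jk}\, h_j - \varphi_k \right)
```

Here w_{ij} is the weight from input neuron i to hidden neuron j, v_{jk} is the weight from hidden neuron j to output neuron k, and \theta_j and \varphi_k are the hidden-layer and output-layer thresholds under this assumed convention.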

Fig. 2. The structure of neural network

Support Vector Machine

The support vector machine (SVM) is a commonly used machine learning algorithm in spectral analysis, which performs well when classifying small amounts of high-dimensional data (Chang and Lin 2007). By using different kernel functions, SVM can handle both linear and nonlinear problems. In this study, the radial basis function (RBF) was selected as the kernel function, and the parameters c and g were determined through optimization.
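A minimal sketch of the RBF kernel is shown below, assuming that g plays the role of the kernel width parameter usually written as gamma, so that K(x, y) = exp(-g * ||x - y||^2). The function name and the example vectors are hypothetical; the penalty parameter c does not appear in the kernel itself, since it controls the trade-off between margin width and training errors.

```python
import math

def rbf_kernel(x, y, g):
    """RBF kernel value exp(-g * squared Euclidean distance) for two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-g * sq_dist)

# Hypothetical two-variable "spectra" for illustration only.
a = [0.0, 0.0]
b = [1.0, 1.0]
c_far = [3.0, 3.0]

k_same = rbf_kernel(a, a, g=0.5)      # identical vectors give the maximum value
k_near = rbf_kernel(a, b, g=0.5)      # squared distance 2, so exp(-1)
k_far = rbf_kernel(a, c_far, g=0.5)   # larger distance, smaller similarity
```

A larger g makes the kernel value fall off faster with distance, which gives a more flexible decision boundary and a greater risk of overfitting; this is why g is usually tuned together with c.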

Model Performance Evaluation

To test the stability of the model, Monte Carlo cross-validation was used to partition the dataset randomly in 20 different ways, and the average accuracy over these partitions was calculated as the basis for comparing different models. At the same time, the coefficient of variation was used to evaluate the stability of the model under the different data sets; a smaller coefficient of variation indicates a more stable model. Data set A (84 samples) was used as the calibration set, and data set B (21 samples) was used as the test set. The models were evaluated using classification accuracy as the criterion.
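The evaluation scheme described above can be sketched as follows, assuming simple random (non-stratified) splits of the 105 samples into 84 calibration and 21 test samples, repeated 20 times. The function names, the random seed, and the accuracy values are hypothetical; whether the study stratified the splits by drying method is not stated.

```python
import random
from statistics import mean, stdev

def monte_carlo_splits(n_samples, n_test, n_repeats, seed=0):
    """Yield (calibration_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

splits = list(monte_carlo_splits(n_samples=105, n_test=21, n_repeats=20))

# Hypothetical per-split accuracies; in practice one value per split would come
# from training on the calibration indices and scoring on the test indices.
accuracies = [0.95, 1.00, 0.95, 0.90, 1.00]
avg_accuracy = mean(accuracies)
cv = coefficient_of_variation(accuracies)
```

The mean accuracy is the quantity used to compare models, and the coefficient of variation summarises how much that accuracy moves from split to split.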

RESULTS AND DISCUSSION

Figure 3 shows the average spectra of the three alfalfa hay types, which describe the main features of the mid-infrared (MIR) spectra of alfalfa hay. The C-OH stretch of cell wall polysaccharides is responsible for the strongest band in the spectra, located at 1040 cm-1. Other peaks include the CH3 symmetrical bending vibration peak near 1375 cm-1, the N/N vibration peak near 1418 cm-1, and weak absorbance peaks from 1500 cm-1 to 1600 cm-1, which are attributed to the benzene ring (Kaya and Huck 2017; Josiah et al. 2018). The average spectra of the three alfalfa hay types were broadly similar but showed differences, as shown in Fig. 4.

Fig. 3. The average spectra

Fig. 4. The average spectra of three alfalfa species

Spectral Characteristics

Figure 5(a) shows the original spectral data from the 105 alfalfa hay specimens. The original spectra follow a broadly consistent overall shape. However, the degree of dispersion is high, and the samples dried in the three ways cannot be distinguished directly from the raw spectral data. Figure 5(b through f) shows the spectral curves after the various pretreatments: SNV, FD, SD, SNV+MSC, and SNV+SG. After pretreatment, the spectra retained their original spectral features. The results show that the different pretreatment techniques alter the spectral data in different ways and make the curves smoother.

 

Fig. 5. The raw and pretreated spectral curves of all alfalfa via different methods: (a) raw; (b) SNV; (c) FD; (d) SD; (e) SNV+MSC; and (f) SNV+SG

Characteristic Wavelength Selection

To simplify the classification model, UVE-SPA and the GA were used to select characteristic wavelengths. The screening results are shown in Fig. 6 and listed in Table 1. Figure 6 shows the characteristic wavelength extraction results for the SNV, MSC, FD, SD, and SG pretreated data, respectively. The characteristic spectra after 4 and 5 cycles of dimensionality reduction are shown in Fig. 7.

As presented in Fig. 6 and Table 1, after variable selection by the SPA, the number of characteristic wavelengths was reduced by 96%, 96.3%, 97.1%, 97.9%, and 96.7% when the SNV, MSC, SD, FD, and SG pretreatment methods were employed, respectively. These results indicate the effectiveness of the SPA in dimension reduction. After wavelength selection, the spectral reflectance values at the selected wavelengths were extracted and used in place of the full spectra as the input for the subsequent simplified classification model.

As shown in Fig. 7 and Table 1, after variable selection by the GA, the number of characteristic wavelengths was reduced by 96.7%, 96.9%, 96.4%, and 97.4%.

Fig. 6. Wavelength selection results on the pretreated spectral data via the UVE-SPA method: (a) SNV; (b) MSC; (c) FD; (d) SD; and (e) SG

The specific parameter settings for feature spectra extraction using the GA were as follows: each individual in the initial population was a binary string (0 or 1) of length 701, one bit per wavenumber, and the population size was 30. In the experiment, the genetic algorithm reduced the number of wavenumbers from 701 in the whole band to between 20 and 40 in 5 cycles. Each cycle retained roughly half of the wavenumbers from the previous cycle. The genetic algorithm uses the fitness function to evaluate the quality of individual solutions; the larger the value of the fitness function, the better the solution. In this paper, the error sum of squares was used as the fitness function, and the GAOT package was used. The specific genetic operator settings were as follows: normGeomSelect for the selection operator, simpleXover for the crossover operator, and boundaryMutation for the mutation operator.

Table 1. Wavelength Selection for Classification

The dimensionality reduction by the genetic algorithm reduced the 701 wavenumbers to fewer than 50. Figure 7 shows the characteristic band spectra after the fourth and fifth optimizations. After the fourth optimization, 47 feature bands were extracted, and after the fifth optimization, the number was reduced to 23 characteristic bands.
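The band counts above can be related to the reduction percentages quoted for variable selection with a short calculation, sketched below. The function names are hypothetical; the only inputs are the 701 variables per spectrum and the 47 and 23 bands reported after the fourth and fifth optimizations.

```python
def reduction_percent(n_total, n_kept):
    """Percentage of variables removed when n_kept of n_total are retained."""
    return 100.0 * (1.0 - n_kept / n_total)

def kept_for_reduction(n_total, percent_removed):
    """Approximate number of variables left after removing a given percentage."""
    return round(n_total * (1.0 - percent_removed / 100.0))

after_fourth = reduction_percent(701, 47)    # about 93.3% removed
after_fifth = reduction_percent(701, 23)     # about 96.7% removed
kept_at_96 = kept_for_reduction(701, 96.0)   # about 28 variables remain
```

Under this calculation, keeping 23 of 701 wavenumbers corresponds to a reduction of about 96.7%, and a 96% reduction corresponds to roughly 28 retained wavenumbers.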

The specific characteristic spectra are shown in Table 2.

Table 2. Wavelength Selection after GA

Fig. 7. Result of characteristic variables by GA (a) After 4 cycles; (b) After 5 cycles

Construction of the Full Spectra Model

The corresponding full spectra BP neural network and SVM models were built after pretreating the original spectra using the various techniques. When using the full band for training, the specific parameters of the neural network were set as follows: the hidden layer size was set to 10, the learning rate was set to 0.0001, and the number of iterations was set to 30. The results of this model are listed in Table 3. The back propagation neural network achieved 100% classification accuracy on the full band alfalfa hay spectra with SNV, FD, and SD pretreatment, but the model took a long time to run. The other two pretreatment techniques also produced good results.

With MSC preprocessing, the classification results of the training set were 100%, 98.8%, and 98.2%, while the classification results of the prediction set were 100%. With SG preprocessing, the training set’s classification result was 100%, with the exception of the naturally dried sample, which had a classification result of 98.7%. Table 4 shows the full spectrum classification results using support vector machines. In the prediction set, the classification accuracy of moldy dried alfalfa hay reached 100%, while the classification accuracy of the other two drying methods was lower, with a minimum accuracy of 80.6%.

Characteristic Wavelength Model

Spectral data have high dimensionality and strong correlation, so the full spectra model takes a long time to run. GA and SPA were used to extract the characteristic wavelengths from the entire spectrum under multiple pretreatment techniques to simplify the model, decrease the model’s running time, and improve classification results. The BP neural network and SVM models were then built using the extracted characteristic wavelengths. Tables 5 and 6 provide the models’ performance data.

It can be seen from Tables 5 and 6 that both models built on the characteristic bands obtained relatively good classification results. The moldy alfalfa samples in the calibration set were 100% recognized by both models. Except for SNV-FD-GA and SNV-SD-GA, moldy alfalfa in the prediction set was 100% identified using the BP model. In the SVM model, the classification accuracy for moldy alfalfa hay also reached 100% in the prediction set for over half of the preprocessing methods. All naturally dried alfalfa in the validation set, excluding SNV-MSC-UVE-SPA, SNV-FD-UVE-SPA, and SNV-SG-UVE-SPA, was correctly identified at 100%; however, the recognition results of the prediction set were not as good. The recognition rate for most of the preprocessing methods reached more than 90%. In the model using SVM, the accuracy of four preprocessing methods reached 100%, while the accuracy of the other methods remained above 80%. Most of the recognition results for the shade-dried alfalfa samples in the training set reached 100%, while most of those for the validation sets also remained above 90%. In the SVM model, only the training set under SNV-FD-SPA processing achieved 100%, while the others were above 89%, and most of the validation set accuracies were above 80%.

Table 3. Result Using the Full Spectra by BP

Table 4. Result Using the Full Spectra by SVM

Table 5. Characteristic Wavelength by BP Model Data