Abstract
The quality detection of alfalfa hay is crucial for the development of animal husbandry. In this study, a method for quality detection of alfalfa hay is proposed that fuses multisource information from near-infrared spectroscopy, image processing, and an electronic nose. After Savitzky-Golay (SG) convolution smoothing of the spectral data, feature wavelengths were extracted using competitive adaptive reweighted sampling (CARS) and the successive projections algorithm (SPA). The image data were denoised using adaptive wavelet thresholding, and color and texture features were extracted using color histograms and the random forest algorithm, respectively. Principal component analysis was used to reduce the dimensionality of the electronic nose data. Support Vector Machine, Extreme Learning Machine, and Multi-Layer Perceptron were employed to establish quality detection models of alfalfa hay based on spectroscopy, image, and gas information individually and in combination. Experimental results demonstrate that the fusion of near-infrared spectroscopy, image data, and gas information effectively enhances the classification accuracy of the model. The accuracy on the test set reaches 100%, with root mean square error and determination coefficient values of 0.1728 and 0.9239, respectively, surpassing prediction models established on any single information source. This study provides new insights into alfalfa hay quality detection.
Quality Detection of Alfalfa Hay Based on Multisource Information Fusion: A Preliminary Study
Huihe Yang,a,b,# Jie Li,a,b,# Guifang Wu,a,b,* Xuehong De,a,b,* Yong Zhang,a,b Fang Guo,a,b Shubin Yan,a,b Xiangping Bai,c Haowen Xiao,c and Yang Cao c
DOI: 10.15376/biores.19.3.4531-4546
Keywords: Alfalfa hay; Image processing; Near-infrared spectroscopy; Electronic nose; Machine learning
Contact information: a: College of Mechanical & Electrical Engineering, Inner Mongolia Agricultural University, Hohhot, 010018, P.R. China; b: Inner Mongolia Engineering Research Center of Intelligent Equipment for the Entire Process of Forage and Feed Production, Hohhot, 010018, P.R. China; c: Inner Mongolia Autonomous Region Agricultural and Pastoral Technology Extension Center, Hohhot, 010010, P.R. China; #: Huihe Yang and Jie Li contributed to the work equally and should be regarded as co-first authors; * Corresponding authors: wgfsara@126.com and dexuehong@126.com
INTRODUCTION
Purple alfalfa, originating from Persia, is the most widely distributed and oldest cultivated leguminous forage grass in the world, often referred to as the “king of forage grass”. It contains not only many important nutrients such as proteins, minerals, and vitamins but also essential amino acids, trace elements, and unidentified growth factors required by animals. Therefore, it is used as the main raw material for protein extraction and as the primary high-quality feed for animals (Li et al. 2023). Classifying alfalfa hay of different qualities is thus an important research task. Traditional evaluation methods include sensory evaluation and objective measurements (Li et al. 2020). In sensory evaluation, experienced personnel judge the hay mainly by its morphological appearance. Objective measurements usually employ chemical and biological methods such as protein electrophoresis, gas chromatography, spectrophotometry, and high-performance liquid chromatography to detect relevant compounds, or microbial analysis to quantify specific spoilage organisms. Sensory evaluation is time-consuming, labor-intensive, and susceptible to subjective factors, whereas the objective methods are more accurate but require trained personnel and cannot be performed online. From both economic and technological perspectives, a new method is needed to overcome these limitations.
Near-infrared spectroscopy is a rapid and non-destructive analysis method capable of detecting the different absorption frequencies of specific molecules in substances. Different chemical components contain different chemical groups with different group frequencies, so their characteristic absorbance peaks appear at different positions, and the peaks of the same component vary with its content (Tang et al. 2023). Therefore, both quantitative and qualitative analyses of substances can be conducted using near-infrared spectroscopy. In this experiment, a quantitative prediction model for crude fat in alfalfa hay was established based on near-infrared spectroscopy to predict the crude fat content from spectral data.
Image processing technology refers to the computer analysis of images to extract information from them. In recent years, image processing has been widely used in agricultural production for crop detection (Song 2022), pest and disease identification (Kemal et al. 2022), soil analysis, and similar tasks, thereby helping to improve agricultural production efficiency and quality. Alfalfa hay is prone to moisture absorption and mold. If moldy alfalfa is mixed into feed for cows, their intake decreases because the alfalfa has lost its original taste and carries peculiar smells. This leads to digestive disorders such as rumen stasis and reduced rumination, as well as symptoms of poisoning such as drooling. Milk production also decreases sharply and the quality of dairy products suffers, with protein, fat, and lactose all failing to meet requirements (Xue 2006). Therefore, detecting the degree of moldiness in alfalfa hay is crucial. After alfalfa becomes moldy, its color and texture change significantly, so a qualitative discrimination model built from images of alfalfa hay can detect whether the hay is moldy.
The electronic nose is a biomimetic detection system developed to mimic the human olfactory system. It captures the odor information of samples through gas-sensitive sensors, converts it into electrical signals, and uses this information for quality detection of samples (Shi et al. 2024). Grass undergoes significant changes in odor before and after molding. By collecting its odor information and combining it with sensory evaluation, a model can be established to discriminate whether alfalfa is moldy.
Multi-source information fusion, also known as multi-sensor information fusion, refers to the integration and processing of multiple information sources of different origins, types, and spatial resolutions. It can provide richer information than a single information source, thereby enhancing the cognitive ability and decision-making level towards the target (Jiang et al. 2023). In recent years, multi-source information fusion has been extensively researched and advanced in fields such as military (Zhou et al. 2024), medical (Li et al. 2023), unmanned driving (Ding et al. 2023), and geology (Kong et al. 2022). However, there are few cases applying multi-source information fusion to agricultural production. This paper proposes a new approach that applies multi-source information fusion to alfalfa detection, with the aim of more accurate control over the quality of alfalfa.
Machine learning is a field of study that automatically detects patterns and rules from a given database and uses the detected patterns to predict unknown data (Quintero et al. 2023). Therefore, combining digital images, near-infrared spectroscopy, electronic nose, and machine learning may be a potential solution for identifying the quality of alfalfa hay, thereby achieving rapid and non-destructive detection of alfalfa hay quality.
EXPERIMENTAL
Samples and Equipment
Following the basic requirements of plant biology experiments, three representative samples of alfalfa from Inner Mongolia were selected as experimental objects. After impurities such as weeds and sand were removed, the material was subjected to one of two treatments, sun drying or mold treatment. Sixty samples were taken for each treatment, totaling 120 samples. After treatment, images were collected, and the samples were then ground into powder and passed through a 100-mesh sieve before near-infrared spectroscopy data and electronic nose data were collected. The fat content of the samples was determined using an ANKOM XT15i fat analyzer. Representative samples of alfalfa hay were placed on a black background and photographed using a digital camera. During photography, the camera lens was kept parallel to the plane of the sample, with the camera positioned 20 cm above the sample. The obtained images had a resolution of 3072×4096 pixels. One or more sub-images were cropped manually from each image to construct the image library.
Fig. 1. Information collection process
The near-infrared spectroscopy instrument used for data collection was the Quality Spec Pro spectrometer manufactured by Analytical Spectral Devices (ASD Inc., USA). The wavelength range of the spectrometer was 350 to 1830 nm, with a spectral sampling interval of 1 nm. After the spectrometer was preheated for 20 min, the sample was placed in its measurement chamber. External interference such as light and sound was eliminated, near-infrared light was irradiated onto the sample, and the reflected or transmitted spectral data were recorded. Multiple measurements were taken on each sample to obtain the average spectrum and to check measurement repeatability. The electronic nose was the PEN3 model manufactured by AIRSENSE, Germany, equipped with ten gas-sensitive sensors. For detection, 15 to 18 g of dried purple alfalfa sample was placed in a centrifuge tube and heated in a 50 °C water bath for 30 min; the washing time was 12 s and the measurement time was 24 s. The software used for spectral analysis, image processing, and modeling was Unscrambler X 10.4 from CAMO, Norway, and Matlab R2022b from MathWorks, USA. The odor information was analyzed using the Win Muster software supplied with the PEN3 instrument. The information collection process is shown in Fig. 1.
Spectral Data Preprocessing
The spectral data obtained by the spectrometer are susceptible to interference from unrelated information such as stray light, baseline drift (Sun et al. 2023), noise, and sample background, which affects the modeling result. To improve the accuracy of the near-infrared measurements and enhance the signal-to-noise ratio of the spectra, the noisy range of 350 to 429 nm was excluded, and models were built using the spectral band from 430 to 1830 nm. Before the model was built, the spectral data were preprocessed using Savitzky-Golay (S-G) smoothing. The derivative order was set to 0, the window size was set to 7 points, and the polynomial (smoothing) order was set to 3. The original spectra and the spectra after S-G convolution smoothing are shown in Fig. 2.
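The smoothing step can be illustrated with a minimal sketch, assuming the standard Savitzky-Golay formulation (a local least-squares polynomial fit evaluated at the window centre). The function names and the edge-padding choice are illustrative assumptions, not the settings of the software used in the study; `scipy.signal.savgol_filter` offers a library equivalent.

```python
import numpy as np

def savgol_coefficients(window=7, polyorder=3):
    """Return Savitzky-Golay smoothing weights (derivative order 0)."""
    if window % 2 == 0 or window <= polyorder:
        raise ValueError("window must be odd and larger than polyorder")
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Design matrix of the local polynomial: offsets**0 ... offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse yields the fitted value at the window centre.
    return np.linalg.pinv(design)[0]

def savgol_smooth(spectrum, window=7, polyorder=3):
    """Smooth one spectrum; edges are handled by repeating the end values (assumption)."""
    coeffs = savgol_coefficients(window, polyorder)
    half = window // 2
    padded = np.pad(np.asarray(spectrum, dtype=float), half, mode="edge")
    # Reversing the kernel turns np.convolve into a sliding weighted sum.
    return np.convolve(padded, coeffs[::-1], mode="valid")

if __name__ == "__main__":
    wavelengths = np.arange(430, 1831)           # 430 to 1830 nm at 1 nm steps
    rng = np.random.default_rng(0)
    noisy = np.sin(wavelengths / 200.0) + rng.normal(0, 0.02, wavelengths.size)
    print(savgol_smooth(noisy).shape)
```

With a 7-point window and a cubic polynomial, the weights reduce to the classical values (-2, 3, 6, 7, 6, 3, -2)/21, so any cubic trend in the spectrum passes through the filter unchanged away from the edges.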
Fig. 2. (a) Reflectivity curve of alfalfa hay samples, (b) Spectral curve after SG preprocessing
Preprocessing of Image Data
To ensure the accuracy and stability of the experimental results, the collected images were preprocessed. After irrelevant backgrounds were removed, images of different sizes were resized to the same pixel dimensions, and noise introduced during collection and transmission was suppressed. These steps prepare the images for the identification of moldy alfalfa.
Noise introduced during image collection and transmission degrades image quality. It affects segmentation, the extracted feature parameters, and the accuracy of mold recognition, and it can seriously impair image processing algorithms. Noise removal is therefore an essential step in image processing. Adaptive wavelet thresholding was selected to denoise the collected images (Liu et al. 2022), because it eliminates noise while preserving details and edge information in the moldy alfalfa images.
Wavelet threshold denoising can be divided into three steps: transforming the image into the wavelet domain using a wavelet transform, applying a nonlinear shrinkage rule to the wavelet coefficients, and applying the inverse wavelet transform to the thresholded coefficients to obtain the denoised image. The effectiveness of denoising depends on several factors: the choice of wavelet basis, the number of decomposition levels, the threshold function, and the threshold estimation method. The adaptive wavelet thresholding algorithm selects thresholds dynamically based on the local characteristics of the image, so that noise is suppressed while edges and details are retained. Examples of images before and after denoising are shown in Fig. 3.
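The three steps of wavelet threshold denoising can be sketched as follows. This is only an illustration under stated assumptions: a single-level Haar basis, soft thresholding, and a global universal threshold estimated from the diagonal detail band. The adaptive, locally varying threshold rule used in the study is not specified in the text and is not reproduced here.

```python
import numpy as np

def haar2d(img):
    """One-level 2D Haar transform; img needs even height and width."""
    s = np.sqrt(2.0)
    lo = (img[0::2, :] + img[1::2, :]) / s      # sums of row pairs
    hi = (img[0::2, :] - img[1::2, :]) / s      # differences of row pairs
    ll = (lo[:, 0::2] + lo[:, 1::2]) / s        # approximation band
    lh = (lo[:, 0::2] - lo[:, 1::2]) / s        # detail bands
    hl = (hi[:, 0::2] + hi[:, 1::2]) / s
    hh = (hi[:, 0::2] - hi[:, 1::2]) / s
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Exact inverse of haar2d."""
    s = np.sqrt(2.0)
    h, w = ll.shape
    lo = np.empty((h, 2 * w))
    hi = np.empty((h, 2 * w))
    lo[:, 0::2], lo[:, 1::2] = (ll + lh) / s, (ll - lh) / s
    hi[:, 0::2], hi[:, 1::2] = (hl + hh) / s, (hl - hh) / s
    img = np.empty((2 * h, 2 * w))
    img[0::2, :], img[1::2, :] = (lo + hi) / s, (lo - hi) / s
    return img

def soft_threshold(coeffs, t):
    """Shrink coefficients toward zero by t (nonlinear shrinkage rule)."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

def wavelet_denoise(img):
    """Transform, shrink the detail bands, and invert."""
    img = np.asarray(img, dtype=float)
    ll, lh, hl, hh = haar2d(img)
    # Noise level from the median absolute deviation of the HH band (assumption).
    sigma = np.median(np.abs(hh)) / 0.6745
    t = sigma * np.sqrt(2.0 * np.log(img.size))   # universal threshold (assumption)
    return ihaar2d(ll, soft_threshold(lh, t), soft_threshold(hl, t), soft_threshold(hh, t))
```

Only the detail bands are shrunk; the approximation band is left untouched, which is why large-scale structure such as leaf outlines survives the operation.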
Fig. 3. (a) Normal alfalfa (b) Denoised normal alfalfa (c) Moldy alfalfa (d) Denoised moldy alfalfa
Electronic Nose Data Preprocessing
The odors released when alfalfa hay molds are usually associated with volatile organic compounds, including but not limited to ketones, alcohols, aldehydes, and acids (Tian et al. 2021). Electronic noses can effectively capture such signals and convert them into digital information. The data collected by the electronic nose often exhibit a waveform that first rises, then decreases, and finally stabilizes. The odors or volatile compounds produced by moldy grass usually reach a stable state after the grass molds and may continue to be released for a period of time. Analyzing the data in the final stable stage therefore captures the odor characteristics related to molding while excluding possible interference at the beginning of the collection, such as environmental odors or other noise. For this reason, the data collected during the stable 20 seconds were analyzed.
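One simple way to use only the stable stage is to keep the last 20 seconds of each recording and average each sensor over that window. The sketch below assumes one reading per second and a time-by-sensor array layout, and the averaging itself is also an assumption; the text states only that the stable 20 seconds were analyzed.

```python
import numpy as np

def stable_stage_features(response, sample_rate_hz=1.0, stable_seconds=20.0):
    """Average each sensor over the final stable part of one measurement.

    response: array of shape (n_time_points, n_sensors).
    """
    response = np.asarray(response, dtype=float)
    n_stable = int(round(stable_seconds * sample_rate_hz))
    if n_stable < 1 or n_stable > response.shape[0]:
        raise ValueError("stable window does not fit inside the recording")
    # Drop the initial transient and keep the trailing window only.
    return response[-n_stable:].mean(axis=0)
```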
Feature Extraction and Dimensionality Reduction of Near-Infrared Spectroscopy
The preprocessed spectral data contain a large number of wavelength variables, resulting in high data dimensionality, excessive redundancy, prolonged processing time, and potential degradation in classification results if models are directly built (Gibertoni et al. 2022).
Fig. 4. Feature wavelength extraction map
Hence, the competitive adaptive reweighted sampling (CARS) algorithm and the successive projections algorithm (SPA) were employed to extract feature wavelengths, selecting fewer wavelength variables on the principle of minimizing the root mean square error (RMSE), in order to establish a predictive model for the nutritional quality of alfalfa hay.
CARS is a wavelength selection method that treats each wavelength as an individual competing for survival. In each Monte Carlo sampling iteration, a partial least squares (PLS) model is fitted to a random subset of the calibration samples, and the absolute values of its regression coefficients serve as the importance weights of the wavelengths. An exponentially decreasing function then forces the removal of wavelengths with small weights, and adaptive reweighted sampling selects among the remaining wavelengths in proportion to their weights. The subset with the lowest root mean square error of cross-validation (RMSECV) over all iterations is retained (Zhang et al. 2023). Here, CARS was used to extract feature wavelengths from the full spectral range, with 100 sampling iterations and 10-fold cross-validation. The minimum RMSECV was reached at iteration 47, yielding a total of 67 feature wavelengths. The results are shown in Fig. 4.
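In the commonly published form of CARS, the share of wavelengths kept at iteration i follows an exponentially decreasing function r_i = a·exp(-k·i), with a and k chosen so that all wavelengths are kept at the first iteration and two remain at the last. The sketch below computes that schedule only; it omits the PLS fitting and reweighted sampling steps. The count of 1401 input wavelengths (430 to 1830 nm at 1 nm) and the use of this exact schedule in the study are assumptions.

```python
import math

def cars_retention_schedule(n_wavelengths, n_iterations):
    """Wavelengths kept by the exponentially decreasing function at iterations 1..N."""
    # Constants chosen so that r_1 = 1 and r_N = 2 / n_wavelengths.
    a = (n_wavelengths / 2) ** (1 / (n_iterations - 1))
    k = math.log(n_wavelengths / 2) / (n_iterations - 1)
    return [round(n_wavelengths * a * math.exp(-k * i))
            for i in range(1, n_iterations + 1)]

if __name__ == "__main__":
    schedule = cars_retention_schedule(1401, 100)
    # Index 46 is iteration 47 because the list is zero-based.
    print(schedule[0], schedule[46], schedule[-1])
```

With these assumed values the schedule keeps about 67 wavelengths at iteration 47, which is in line with the number of feature wavelengths reported above.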
The Successive Projections Algorithm is a forward variable selection algorithm that minimizes collinearity in vector space. It eliminates redundant information in the original spectral matrix and thereby selects fewer feature wavelengths, which can improve predictive performance when they are used to build models. Here, the Successive Projections Algorithm extracted 51 feature wavelengths, as shown in Fig. 4.
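The projection step at the heart of SPA can be sketched as below: starting from one wavelength, each remaining column is projected onto the subspace orthogonal to the last selected column, and the column with the largest remaining norm is chosen next. This is a simplified illustration. A full SPA run also tries every starting wavelength and picks the number of variables by validation error, and the names used here are hypothetical.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Return n_select column indices of X chosen for minimal collinearity."""
    X = np.asarray(X, dtype=float)
    residual = X - X.mean(axis=0)            # mean-centre every wavelength
    selected = [start]
    for _ in range(n_select - 1):
        v = residual[:, selected[-1]]
        # Remove the component along v from every column (orthogonal projection).
        residual = residual - np.outer(v, v @ residual) / (v @ v)
        norms = np.linalg.norm(residual, axis=0)
        norms[selected] = -1.0               # never pick a column twice
        selected.append(int(np.argmax(norms)))
    return selected
```

A wavelength that is an exact multiple of one already selected has a residual norm near zero after projection, so it is never chosen while informative columns remain.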
Feature Extraction and Dimensionality Reduction of Image Data
Image preprocessing only removes irrelevant information from moldy alfalfa images. To achieve the automatic recognition function of the mold recognition system, feature extraction of images is also required. Selecting appropriate feature extraction methods and categories is a key factor in ensuring recognition accuracy. Since the moldy parts of the image have significant differences in color and texture compared to normal leaves, color features and texture features of the moldy images are extracted for model building.
Color features were extracted using color histograms, giving nine features: R, G, and B; H (hue), S (saturation), and V (brightness); and L (lightness), a (the red-green axis), and b (the yellow-blue axis). Texture features were extracted using the Tamura algorithm, which yielded 22 texture features such as autocorrelation, entropy, and contrast. From these 22 features, the Random Forest (RF) algorithm was used to select 10, including energy, entropy, contrast, and variance, for texture representation.
Color histograms are widely used color features in many image retrieval systems (ZhangZhong et al. 2023). They describe the proportion of different colors in the entire image, without considering the spatial position of each color, thus unable to describe objects or entities in the image. Color histograms are particularly suitable for describing images that are difficult to automatically segment. The color histogram of alfalfa hay is shown in Fig. 5. In the RGB histogram, the horizontal axis represents pixel values (0 to 255), representing the brightness or color component values of the image. The vertical axis represents the frequency at which the pixel value appears in the image, i.e., the number of pixels with a specific pixel value in the image. In the HSV histogram, for the H channel, the horizontal axis represents the range of hue values (0 to 1), representing the color in the image. For the S and V channels, the horizontal axis also represents the range (0 to 1), representing the saturation and brightness in the image. The vertical axis represents the frequency or count of data within the corresponding range.
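A per-channel histogram of the kind described above can be computed as in the following sketch. It assumes an 8-bit RGB image stored as a height × width × 3 array and returns relative frequencies rather than raw counts; both choices are assumptions, and the step from histograms to the nine color features is not covered.

```python
import numpy as np

def rgb_histogram(image, bins=256):
    """Return a (3, bins) array of normalized histograms for the R, G, B channels."""
    image = np.asarray(image)
    hists = []
    for channel in range(3):
        # Count how many pixels fall into each intensity bin of this channel.
        counts, _ = np.histogram(image[..., channel], bins=bins, range=(0, 256))
        hists.append(counts / counts.sum())
    return np.stack(hists)
```

Because pixel positions are discarded, two images with the same color proportions but different layouts give identical histograms, which matches the limitation noted in the text.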
Fig. 5. Color Histograms (a) RGB Histogram (b) HSV Histogram
Random Forest (RF) exhibits high prediction accuracy, good robustness, and strong resistance to overfitting. Additionally, it can handle high-dimensional data and is relatively robust to missing and outlier values. The calculation method for extracting texture features using Random Forest is as follows (Ye et al. 2023):
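Suppose there are J features $x_1, x_2, x_3, \ldots, x_J$ collected from the image samples. First, the Gini index is calculated at each node, and then the importance score of each texture feature $x_j$, denoted $VIM_j^{(Gini)}$, is computed. The Gini index of node m is calculated using Eq. 1,
$$GI_m = \sum_{k=1}^{K} p_{mk}\,(1 - p_{mk}) = 1 - \sum_{k=1}^{K} p_{mk}^{2} \tag{1}$$
where K is the number of categories, k indexes the category, and $p_{mk}$ denotes the proportion of category k in node m. $GI_m$ can also be seen as the probability that two samples drawn at random from node m have different category labels.
The importance of feature $x_j$ at node m, denoted $VIM_{jm}^{(Gini)}$, is the change in Gini index before and after node m splits and is calculated using Eq. 2,
$$VIM_{jm}^{(Gini)} = GI_m - GI_l - GI_r \tag{2}$$
where $GI_l$ and $GI_r$ are the Gini indices of the two child nodes produced by the split. If $M_i$ is the set of nodes in tree i that split on $x_j$, the importance of $x_j$ in tree i is given by Eq. 3,
$$VIM_{ij}^{(Gini)} = \sum_{m \in M_i} VIM_{jm}^{(Gini)} \tag{3}$$
Assuming that the texture feature $x_j$ appears in a total of n trees in the Random Forest, the evaluation formula for its final importance is Eq. 4,
$$VIM_{j}^{(Gini)} = \sum_{i=1}^{n} VIM_{ij}^{(Gini)} \tag{4}$$
Finally, the calculated importance scores of the texture features are sorted to obtain the required texture features.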
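The Gini-based ranking can be illustrated with a short sketch. It computes the Gini index of a node from its class labels, the change in Gini index for a split, and a ranking of features by accumulated importance. The unweighted difference between parent and child nodes is used by default as an assumption; many implementations weight each child by its share of samples, which is offered as an option. All function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(parent, left, right, weighted=False):
    """Change in Gini index when a node is split into two children."""
    if weighted:
        n = len(parent)
        return gini(parent) - len(left) / n * gini(left) - len(right) / n * gini(right)
    return gini(parent) - gini(left) - gini(right)

def rank_features(decreases_by_feature):
    """Sum the per-node decreases of each feature and sort in descending order."""
    totals = {name: sum(values) for name, values in decreases_by_feature.items()}
    return sorted(totals, key=totals.get, reverse=True)
```

For example, `rank_features({"entropy": [0.3, 0.2], "contrast": [0.1]})` places `"entropy"` first, and keeping the top 10 names from such a ranking corresponds to the selection of 10 texture features from 22.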
Dimensionality Reduction of Electronic Nose Data
Original electronic nose data may contain a large number of features, which can lead to very high computational complexity during model training and inference. Dimensionality reduction can reduce the number of features, thus reducing computational complexity. Additionally, electronic nose data may contain some redundant information or noise, which can affect the model’s performance. Dimensionality reduction helps remove this redundant information, improving the model’s generalization ability and efficiency. Principal Component Analysis (PCA) was chosen in this study to reduce the dimensionality of electronic nose data (Tian et al. 2023). To verify whether the data is suitable for PCA, the Kaiser-Meyer-Olkin (KMO) test and Bartlett’s sphericity test were conducted. The results indicate that the KMO sampling adequacy measure is 0.656, greater than 0.6, and the significance (sig) is less than 0.05, indicating that the data supports PCA. Following the criterion of eigenvalues greater than 1, two common factors are extracted, contributing to a cumulative variance of 87.2%. Hence, extracting two common factors can reflect 87.2% of the variance in the original data, achieving dimensionality reduction of electronic nose data.
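The retention rule described above (keep components whose eigenvalue exceeds 1) can be sketched with a correlation-matrix PCA. The sketch assumes a samples × sensors array and does not cover the KMO or Bartlett tests, which were run separately; the function name is hypothetical.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on standardized data, keeping components with eigenvalue > 1."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # z-score every sensor
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                               # Kaiser criterion
    scores = Z @ eigvecs[:, keep]
    explained = eigvals[keep].sum() / eigvals.sum()    # cumulative variance share
    return scores, eigvals, explained
```

The returned `explained` value plays the role of the cumulative variance contribution quoted in the text, and `scores` holds the reduced features passed on to the models.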
Information Fusion Method
The spectrometer provides spectral absorption information containing chemical bonds and functional groups, while the electronic nose provides comprehensive gas information obtained using a cross-sensitive gas sensor array. Images capture color and texture information from the material surface. These three types of information are fused using a feature set fusion method (Jia et al. 2021). Features are extracted from each data source, and they are combined to form larger feature vectors. Here, the weight of each information type is set to 1. The fused information obtained is used as input to evaluate the performance of the alfalfa hay quality detection model based on multi-source information fusion.
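Feature set fusion with equal weights amounts to concatenating the per-source feature blocks sample by sample. The sketch below shows this, assuming one row per sample in every block. The block sizes in the usage note follow the counts reported earlier (67 CARS wavelengths, 9 color plus 10 texture features, 2 principal components); whether exactly these blocks entered the fused vector is an assumption.

```python
import numpy as np

def fuse_features(blocks, weights=None):
    """Concatenate feature blocks column-wise after scaling each by its weight."""
    if weights is None:
        weights = [1.0] * len(blocks)          # equal weight for every source
    arrays = [np.asarray(block, dtype=float) for block in blocks]
    n_samples = arrays[0].shape[0]
    if any(a.shape[0] != n_samples for a in arrays):
        raise ValueError("all blocks must describe the same samples")
    return np.hstack([w * a for w, a in zip(weights, arrays)])
```

For 120 samples, `fuse_features([spectral, image, nose])` with blocks of 67, 19, and 2 columns would give a 120 × 88 matrix. In practice the blocks are often standardized first so that no source dominates by scale, although the text does not state whether this was done.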
Alfalfa Hay Quality Model Construction and Optimization
In this study, three algorithms, namely Support Vector Machine (SVM), Extreme Learning Machine (ELM), and Multi-Layer Perceptron (MLP), were employed to explore the optimal model for identifying the quality of alfalfa hay. The model evaluation criterion was the confusion matrix method, which assesses the model’s performance based on the accuracy of identification. The preprocessed dataset was subjected to 10-fold cross-validation, in which it was divided into 10 equally sized subsets. Each time, 9 subsets were used as the training set and the remaining subset was used as the testing set. This process was repeated 10 times, with a different subset serving as the testing set each time. Finally, the average of the 10 evaluation results was taken as the performance indicator of the model. Regression models were established for the spectral bands, and the predictive results are shown in Table 1. Classification models were built for the image and electronic nose data, with results presented in Tables 2 and 3, respectively. The model results for fused information are shown in Table 4.
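The 10-fold procedure can be sketched as an index generator: the sample indices are shuffled once, dealt into 10 folds, and each fold serves as the testing set exactly once. The shuffling seed and function name are illustrative assumptions.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

With 120 samples and k = 10, every split has 108 training and 12 testing samples, and averaging the 10 per-fold scores gives the reported performance indicator.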
Support Vector Machine (SVM) is a common machine learning algorithm used for classification and regression problems (Shen et al. 2023). In classification tasks, SVM works as follows. The data are first mapped to a high-dimensional feature space. A hyperplane is then found in that space that maximizes the margin between the classes. The samples lying closest to the hyperplane are called support vectors, and they determine its position. New samples are mapped to the same feature space and classified according to the side of the hyperplane on which they fall. The advantages of SVM include applicability to both linearly separable and non-linearly separable data, the capacity to handle high-dimensional data with many features, good results even with a small number of samples, and the ability to handle non-linear problems through the choice of kernel function (Qin et al. 2017). Cross-validation was used to search for the best penalty factor (C) and radial basis function parameter (γ). For the full spectrum, the CARS feature wavelengths, and the SPA feature wavelengths, the selected values were C = 1, 1000, and 100 and γ = 0.1, 0.01, and 0.01, respectively. The results show that using the full spectrum as input led to overfitting because of the large data dimensionality and high model complexity. The model built on feature wavelengths extracted by CARS performed better than the one built on SPA wavelengths, so the CARS-extracted feature wavelengths were chosen to build the fusion model.
Extreme Learning Machine (ELM) is a fast and effective machine learning algorithm used for solving classification and regression problems. Compared to traditional neural network algorithms, ELM has faster training speed and better generalization capabilities (Swati et al. 2023). The training process of ELM is simple and efficient: the weight matrix and threshold vector of the hidden layer are initialized randomly, the input data are passed through the nonlinear mapping of the hidden layer to obtain the hidden-layer output, and the least squares method (or a similar method) is used to perform a linear regression between the hidden-layer output and the target, which gives the weight matrix of the output layer and completes model training. The advantages of ELM include fast training and strong generalization capability (Mumtaz et al. 2022). Unlike traditional neural network algorithms, ELM does not require an iterative optimization process; it trains quickly because the hidden-layer weight matrix and threshold vector are chosen at random. These hidden-layer weights stay fixed during training, so learning reduces to a linear regression in the output layer, which helps limit the overfitting seen in traditional neural networks. However, ELM also has limitations: it may be affected by data containing a large amount of noise, and the random initialization of the weight matrix and threshold vector may lead to different model performance from one initialization to the next. In this experiment, the number of hidden-layer nodes was set to 10 for models built on spectral and fusion information, and to 100 for models built on image and electronic nose data.
Multi-Layer Perceptron (MLP) is a common type of Artificial Neural Network (ANN) model used to solve supervised learning problems, including classification and regression tasks (Ma et al. 2023). An MLP consists of an input layer, one or more hidden layers, and an output layer, with each neuron connected to all neurons in the previous layer. The network is trained using the backpropagation algorithm to minimize the error between the predicted output and the actual labels. During training, forward propagation calculates the output for each sample, the error is computed, and the weights in the network are then adjusted through backpropagation to reduce that error. Each neuron in an MLP has an activation function that introduces nonlinearity. Common activation functions include sigmoid, ReLU (Rectified Linear Unit), and tanh. Nonlinear activation functions enable the MLP to learn and represent complex nonlinear relationships, resulting in better fitting of data (Najmeh et al. 2022). The advantages of MLP are that it can automatically learn effective feature representations from raw data, it generalizes well, and its calculations can be highly parallelized, which helps to improve training speed and performance. Important hyperparameters of MLP include the number of hidden layers, the number of neurons in each hidden layer, the learning rate, and the regularization parameter. The selection of these hyperparameters is crucial for the performance and generalization capability of the model and usually requires tuning through techniques such as cross-validation. In this experiment, the number of hidden layers was set to 100, the learning rate was set to 0.01, and the random seed (rng) was set to 0.
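The radial basis function kernel governed by γ has the standard form K(x, z) = exp(-γ‖x - z‖²). The sketch below computes the kernel matrix only and is not a full SVM solver; the function name is hypothetical, and the γ values in the usage note are the ones reported above.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Kernel matrix with entries exp(-gamma * squared distance) between rows of A and B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Squared Euclidean distances via the expansion |a|^2 + |b|^2 - 2 a.b.
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))
```

A smaller γ, such as 0.01 compared with 0.1, makes the kernel decay more slowly with distance, so each training sample influences a wider region of the feature space and the decision function becomes smoother.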
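The ELM training procedure described above fits in a few lines: random hidden-layer weights and thresholds, a sigmoid hidden layer, and output weights obtained by least squares through the pseudo-inverse. This is a minimal regression sketch under those assumptions; the class name, the sigmoid activation, and the normal initialization are illustrative and not taken from the study.

```python
import numpy as np

class ELMRegressor:
    """Single-hidden-layer ELM with a sigmoid activation (illustrative sketch)."""

    def __init__(self, n_hidden=10, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Nonlinear mapping of the inputs through the fixed random hidden layer.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        # Hidden weights and thresholds are drawn once and never updated.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the Moore-Penrose pseudo-inverse.
        self.beta = np.linalg.pinv(H) @ np.asarray(y, dtype=float)
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta
```

Changing `seed` changes the random hidden layer and therefore the fitted model, which illustrates the sensitivity to initialization mentioned in the text.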
Table 1. Prediction Results of Regression Models Based on Near-Infrared Spectroscopy
Table 2. Prediction Results of Classification Model Based on Images
Table 3. Prediction Results of Classification Model Based on Electronic Nose
Table 4. Model Prediction Results based on Fusion of Near Infrared Spectroscopy, Imaging, and Electronic Nose Information
Fig. 6. Line chart of estimated vs actual values for the SG-CARS-SVM Model based on fused information
Fig. 7. Prediction accuracy for the SG-CARS-SVM Model based on fused information
RESULTS AND DISCUSSION
This study utilized near-infrared spectroscopy, image processing, electronic nose technology, and various preprocessing and feature extraction methods to successfully establish a predictive model for the quality of alfalfa hay based on the fusion of multiple sources of information. For spectroscopic data, SG convolution smoothing was used to process the alfalfa hay spectroscopic data in the range of 430 to 1830 nm to reduce noise interference. Then the competitive adaptive reweighted sampling (CARS) algorithm and successive projections algorithm (SPA) were used to extract feature wavelengths from the spectral bands, effectively extracting key information related to crude fat content in alfalfa hay from complex spectroscopic data. For image data, adaptive wavelet thresholding was selected to denoise images and extracted color and texture features since alfalfa hay undergoes significant changes in color and texture after mold growth. Nine color features were extracted, including RGB, and the random forests (RF) algorithm was used to extract 10 texture features including energy, entropy, contrast, and variance from 22 texture features based on importance ranking. For electronic nose data, unstable data were excluded to eliminate external interference and then used principal component analysis was applied to reduce data dimensionality, decrease computational complexity, and improve the model’s generalization ability and efficiency.
When objects are illuminated by near-infrared light, different chemical compositions with different chemical groups inside them correspond to different group frequencies and generate spectral feature absorption peaks at different positions. Therefore, near-infrared spectroscopy can be used for qualitative analysis of substances. In this experiment, a quantitative prediction model was established for crude fat in alfalfa hay based on near-infrared spectroscopy using SVM, ELM, and MLP algorithms. It was found that using the full spectrum as input led to overfitting due to high data dimensionality and model complexity. The model established based on feature wavelengths extracted by CARS outperformed SAP, hence the CARS-extracted feature wavelengths were selected to establish the fusion model. Subsequently, by combining image and electronic nose data with machine learning algorithms, a qualitative discrimination model was built to detect mold growth in alfalfa hay effectively. Finally, using feature set fusion method, there was a merging of spectroscopic, image, and electronic nose data to separately establish quantitative detection models and qualitative discrimination models for alfalfa hay.
The results demonstrated that the models based on fused information achieved higher accuracy than the models based on a single information source. The fusion model performed best under the SG-CARS-SVM algorithm combination, with test set root mean square error and determination coefficient values of 0.1728 and 0.9239, respectively, an improvement over the best results of 0.2161 and 0.8217 achieved by the prediction model based on near-infrared spectroscopy alone. The fusion model achieved 100% accuracy in discriminating mold growth, surpassing the best results of 91.562% and 80.549% from the models based on image and electronic nose data, respectively.
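The evaluation metrics quoted above can be computed as in the following sketch. It assumes the usual definitions: the root mean square error of the residuals, the determination coefficient as one minus the ratio of residual to total sum of squares, and accuracy as the fraction of correctly classified samples. The study does not state its formulas explicitly, and the sample values below are hypothetical.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels, predicted):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(a == b for a, b in zip(labels, predicted)) / len(labels)


# Hypothetical crude fat values (%) and mold labels, for illustration only.
measured = [2.0, 2.5, 3.0, 3.5]
predicted = [2.1, 2.4, 3.2, 3.4]
print(rmse(measured, predicted), r_squared(measured, predicted))
print(accuracy(["moldy", "normal", "normal"], ["moldy", "normal", "moldy"]))
```

A lower root mean square error and a determination coefficient closer to 1 both indicate a better quantitative model, so the two metrics move in opposite directions when a model improves.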
In summary, this study not only successfully established predictive models for alfalfa quality using near-infrared spectroscopy, image processing, electronic nose technology, and machine learning algorithms but also confirmed the effectiveness and reliability of these models in accurately predicting the nutritional content of alfalfa. This innovative approach will provide more effective tools and methods for agricultural production.
However, this study has certain limitations. Noise and lighting changes during image processing may affect the results, so the algorithms and acquisition techniques require further optimization. Additionally, the limited sample size may restrict how well the results generalize. Subsequent experiments will need to use larger sample sets to improve the reliability and stability of the models.
CONCLUSIONS
- This study explored the integration of image processing techniques, electronic nose technology, and spectroscopy for evaluating the quality of alfalfa hay. By combining high-resolution imaging, electronic nose outputs, and spectral data from alfalfa hay with machine learning algorithms, the study presents a novel, non-contact, efficient, and non-destructive method for assessing forage quality.
- The experimental results demonstrated that the SG-CARS-SVM-based fusion information model delivered exceptional accuracy and reliability in determining forage quality, achieving a 100% accuracy rate on the test set and surpassing the best results of 91.562% and 80.549% from the models based on image and electronic nose data, respectively. Furthermore, it reported a root mean square error of 0.15018 and a coefficient of determination (R^2) of 0.92151, better than the best results of 0.2161 and 0.8217 achieved by the prediction model based on near-infrared spectroscopy alone, underscoring its effectiveness.
- By amalgamating visual, olfactory, and spectral analyses, this research offers a holistic approach to forage quality assessment. This groundbreaking method stands to benefit agricultural and livestock management practices significantly, paving the way for more sustainable agricultural advancements.
ACKNOWLEDGMENTS
This project is supported by the National Natural Science Foundation of China (Grant Nos. 32060414, 51766016); the Natural Science Foundation of Inner Mongolia, China (2022MS06023, 2023QN05034); the Natural Science Foundation of The Autonomous Region Military-Civilian Integration Key Research & Soft Science Research Projects of Inner Mongolia, China (JMZD202201); and the Scientific Research Project of Universities in Inner Mongolia, China (NJZY21461).
REFERENCES CITED
Ding, P., Zou, Y., Gou, X.L., Chen. X., and Lu, F. S. (2023). “An automatic control method for semi-active suspension of driverless vehicle based on multi- sensor information fusion in complex environment,” Journal of Automotive Safety and Energy 14(03), 355-364. DOI: 10.3969/j.issn.1674-8484.2023.03. 011
Gibertoni, G., Lenzini, N., Ferrari, L., and Rovati, L. (2022). “Design and performance of a near-infrared spectroscopy measurement system for in-field alfalfa moisture measurement photonics,” Photonics 9(3), 178-178. DOI: 10.3390/PHOTONICS9030178
Jia, Y. R., Huang, S., and Zhang, T. J. (2021). “KK-DBP: A multi-feature fusion method for DNA-binding protein identification based on random forest,” Frontiers in Genetics 12(13), 12811158-811158. DOI:10.3389/FGENE.2021.811158
Jiang, C. S., Zeng, Z., and Wang, J. (2023). “A review of research advances in multi-source information fusion,” Modern Computer 29(18),1-9. DOI: 10.3969/j.issn.1007-1423.2023.18.001
Kemal, A., Metin, M. O., and Ziya, A. (2022). “A sugar beet leaf disease classification method based on image processing and deep learning,” Multimedia Tools and Application 82(8), 12577-12594. DOI: 10.1007/S11042-022-13925-6
Kong, X. L., Xia. Y. H., and Wu, X. Q. (2022). “Discontinuity recognition and information extraction of high and steep cliff rock mass based on multi-source data fusion,” Applied Sciences 12(21), 11258-11258. DOI: 10.3390/APP122111258
Li, Z. M., Zhang, C., Zhang, C. Y., and Zhang, G. G. (2020). “The relationship between nutrients and biological yield of different varieties of alfalfa,” Scientia Agircultura Sinica 53(6), 1269-1277. DOI: 10.3864/j.issn.0578-1752,2020.06.018
Li, Z.W., Wang, Y.W., Liu, J., Chen, D., Feng, G., Chen, M., Feng, Y., Zhanga, R., and Yan, X. (2023). “The potential role of alfalfa polysaccharides and their sulphated derivatives in the alleviation of obesity,” Food & Function 14(16), 7586-7602. DOI: 10.1039/D3FO01390A
Liu, Z. C., Yuan, L. Y., and Gai, X. H. (2022). “Cow monitoring image enhancement algorithm in complex environment based on dual domain decomposition,” Jiangsu Agricultural Sciences 50(09), 203-210. DOI: 10.15889/j.issn.1002-1302.2022.09.033
Li, H., Xie, M. D., and Gui, X. J. (2023). “Research progress of multi-source information fusion technology in quality evaluation of traditional Chinese medicine,” Acta Pharmaceutica Sinica 58(10), 2835-2852. DOI:10.16438/j.0513-4870.2023-0195
Ma, L., Zhou, Q. L., and Zhao, L. Y. (2023). “Classification and recognition of tomato leaf diseases based on deep learning,” Journal of Chinese Agricultural Mechanization 44(07), 187-193+206. DOI:10.13733/j.jcam.issn.2095-5553,2023.07.025
Mumtaz. A., and Xiang, Y. (2022). “Coupled online sequential extreme learning machine model with ant colony optimization algorithm for wheat yield prediction,” Scientific Reports 12(1), 5488-5488. DOI: 10.1038/S41598-022-09482-5
Najmeh, H., Adel, B., and Sedigheh, M. (2022). “Monitoring Botrytis cinerea infection in kiwifruit using electronic nose and machine learning techniques,” Food and Bioprocess Technology 16(4), 749-767. DOI: 10.1007/S11947-022-02967-1
Qin, F., Liu, D. X., and Sun, B. D. (2017). “Image recognition of four different alfalfa leaf diseases based on deep learning and support vector machine,” Journal of China Agricultural University 22(07), 123-133. DOI:10.11841/j.issn.1007-4333,2017.07.15
Quintero, D., Andrade, A. M., Cholula, U., and Solomon, J. K. Q. (2023). “A machine learning approach for the estimation of alfalfa hay crop yield in Northern Nevada,” AgriEngineering 5(4), 1943-1954. DOI: 10.3390/AGRIENGINEERING5040119
Shen, S. C., Zhang, J. X., and Chen, N. H. (2023). “Estimation of above-ground biomass and chlorophyll content of different alfalfa varieties based on UAV multi-spectrum,” Spectroscopy and Spectral Analysis 43(12), 3847-3852. DOI: 10.3964/j.issn.1000-0593(2023) 12-3847-06
Shi, Y., Ren, Y. Q., and Wang, S. Y. (2024). “Adaptive fusion of gas spectral bimodal information for peanut origin traceability,” Transactions of the Chinese Society for Agricultural Machinery, 1-13.
Song, Y. B. (2022). “Leaf area measurement system based on digital image processing,” Technology Journal of Agriculture 12(02), 73-75.
Sun, P. Y. (2023). Research on High-Precision Detection Technology of Field Near-Infrared Spectrometer, Master’s Thesis, Jilin University, Jilin, China.
Swati, V., Praveen, K., and Chandra, M. T. (2023). “Crop yield prediction using improved extreme learning machine,” Communications in Soil Science and Plant Analysis 54(1), 1-21. DOI: 10.1080/00103624.2022.2108828
Tang, Y., Wang, X. P., and Lu, C. C. (2023). “Estimating the canopy water content of alfalfa based on the PROSAIL model and spectral index,” Journal of Lanzhou University (Natural Sciences) 59(01), 55-62. DOI: 10.13885/j.issn.0455-2059.2023.01.008
Tian, B., Ma, C., and Di, Y. W. (2023). The feeding value of different forage nutrients was evaluated based on principal component analysis. Feed Industry, 1-5
Tian, H. X., Yang, R. Q., Zou, H.Q. Guo, X. Y., Hong, W. F., Yao, Y. B., Liu, Y., and Yan, Y. H. (2021). “High-speed identification of odor changes and substance basis of Myristicae Semen mildew by electronic nose and HS-GC-MS,” Zhong guo Zhong yao za zhi 46(22), 5853-5860. DOI: 10.19540/j.cnki.cjcmm.20210526.302
Xue, X. Y. (2006). “Harm and prevention of moldy alfalfa grass on cows,” Northern Animal Husbandry 12(20), 20.
Ye, W. C., Luo, S. Y., and Li, J. H. (2023). “Research on classification method of hybrid rice seeds based on the fusion of near-infrared spectra and images,” Spectroscopy and Spectral Analysis 43(09), 2935-2941. DOI: 10.3964/j.issn.1000-0593(2023) 09-2935-07
Zhang, F., Cao, W. Y., and Cui, X. H. (2023). “Non-destructive detection of soluble solids in cherry tomatoes by visible/near infrared spectroscopy based on SG-CARS-IBP,” Spectroscopy and Spectral Analysis 43(03), 737-743. DOI: 10.3964/j.issn.1000-0593(2023)03-0737-07
ZhangZhong, L. L., He, T. T., and Li, Z. W. (2023). “Quantitative grading method for tomato maturity using regional brightness correction,” Transactions of the Chinese Society of Agricultural Engineering 39(07), 195-204. DOI: 10.11975/j.issn.1002-6819.202211192
Zhou, Y., Zhao, L., Yang, H., Zhang, Y., Li, G. W., and Liu, D. (2024). “Application of multi-information intelligent sensor fusion technology in joint operations,” National Defense Technology 45(01), 22-29. DOI:10.13943/j. issn1671-4547.2024.01.04
Article submitted: April 4, 2024; Peer review completed: April 24, 2024; Revised version received and accepted: May 7, 2024; Published: May 20, 2024.
DOI: 10.15376/biores.19.3.4531-4546