Calibration optimization and efficiency in near infrared spectroscopy
Calibration optimization in near infrared spectroscopy (NIRS) is a complex process that requires long-term database maintenance and model updates to include new sources of variation. A sample selection procedure was introduced to identify how many samples, and which ones, are required in a NIRS calibration model. The example case is the determination of moisture, protein and oil content in whole soybeans. The original large database is composed of soybean NIR transmittance spectra (n>8,000) spanning crop years (2001-2011), varieties and locations. Uniform random, Kennard-Stone and D-optimal algorithms were compared for calibration sample selection. The optimal models built on calibration sets selected by the uniform random method outperformed the benchmark calibrations built on the original dataset while using less than 7% of the original dataset for moisture, and less than 30% for protein and oil content. This procedure was applied to a network of four instruments from two vendors (Foss Infratecs and Bruins OmegAnalyzerGs) to examine the effect of the calibration set on calibration transfer. Calibration models for protein and oil content built on the smallest optimal representative subsets (about 10% and 35% for protein and oil, respectively) were transferred across instrument units of the same brand. The results showed that post-regression slope and bias correction effectively standardized the values predicted by models built on calibration subsets. Calibrations (n=120) built on the selected master instruments were then evaluated for robustness against temperature fluctuation, an external perturbation. Several temperature compensation approaches were applied to incorporate information from five well-selected perturbed samples.
The extended global model and the difference augmentation method successfully removed the temperature effect and reduced the standard errors of prediction (SEPs) on both the Bruins (SEPs=0.60% and 0.47% for protein and oil, respectively) and Infratec (SEPs=0.57% and 0.46%, respectively) instruments. Improvements in the predictions of regular samples from crop year 2011 were also examined, giving SEPs of 0.51% and 0.34% for protein and oil, respectively, on the Bruins instrument and SEPs of 0.52% and 0.34%, respectively, on the Infratec instrument. The compensated models used only one or two additional PCs.
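As an illustration of one of the sample selection algorithms compared above, the following is a minimal sketch of Kennard-Stone selection in its textbook form. It is not the implementation used in this work: the function name `kennard_stone`, its parameters, and the use of plain Euclidean distance on the raw input matrix are assumptions made for the example, and any spectral preprocessing or score-space projection is omitted.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return the row indices of X chosen by Kennard-Stone selection.

    X        : (n_samples, n_features) array, e.g. spectra or scores
    n_select : number of calibration samples to pick (hypothetical parameter)
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Pairwise squared Euclidean distances (full n x n matrix in memory).
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(dist, 0.0, out=dist)  # clip tiny negative rounding errors

    # Step 1: start with the two samples that are farthest apart.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]

    # min_dist[k] is the distance from sample k to its nearest selected sample.
    min_dist = np.minimum(dist[i], dist[j])
    min_dist[selected] = -1.0  # exclude samples that are already selected

    # Step 2: repeatedly add the sample farthest from the selected set.
    while len(selected) < n_select:
        k = int(np.argmax(min_dist))
        selected.append(k)
        min_dist = np.minimum(min_dist, dist[k])
        min_dist[selected] = -1.0

    return selected
```

Calling `kennard_stone(spectra, 500)` on a hypothetical `spectra` array would return 500 row indices in order of selection. The full distance matrix grows quadratically with the number of samples, so a database of the size described here would call for a more memory-conscious variant.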