Paula Marcela Valencia Ramírez
Ingenio Providencia, Colombia
Sucrose content is one of the most important indicators of profitability for Colombian sugar mills, so understanding its drivers and forecasting its levels are fundamental to the business. A model was developed to predict monthly percentage sucrose from historical data on mechanically harvested farms, in order to support planning of monthly and annual sugar production. The Lasso regularization method was used to select the predictor variables. The most important variables were the accumulated rainfall from 10 months of crop age, the weeks of maturation, the foreign matter content, and the percentage of Diatraea (stemborer) infestation. The XGBoost algorithm was used to predict percentage sucrose, and the model was validated on 20% of the available data. The theoretical mean square error of the model for percentage sucrose was 0.25, and the model predicted monthly data for cane milled at Ingenio Providencia reasonably well. For 2023, the average difference between the predicted and the actual sucrose content was 0.11%.
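The validation figures above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names, the random seed, and the sucrose values are hypothetical, and it assumes that the 20% validation set was a random holdout and that the "average difference" is a mean absolute difference. Neither assumption is confirmed by the abstract. The sketch also shows that a mean square error of 0.25 corresponds to a root mean square error of 0.5 in units of percentage sucrose.

```python
import math
import random


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and reserve test_fraction of them for validation.

    Assumption: a random holdout; the abstract does not say how the 20% was chosen.
    """
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_difference(actual, predicted):
    """Average of the absolute differences (assumed reading of 'average difference')."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly sucrose values (% cane); NOT data from the study.
actual = [12.8, 13.1, 13.4, 12.9, 13.0]
predicted = [12.6, 13.3, 13.3, 13.1, 12.9]

mse = mean_squared_error(actual, predicted)
rmse = math.sqrt(mse)
mad = mean_absolute_difference(actual, predicted)

# The reported mean square error of 0.25 implies an RMSE of 0.5 (% sucrose).
reported_rmse = math.sqrt(0.25)

print(f"MSE={mse:.3f} RMSE={rmse:.3f} mean |diff|={mad:.3f}")
print(f"RMSE implied by the reported MSE of 0.25: {reported_rmse}")
```

In a full pipeline, the training portion returned by `holdout_split` would be used to fit the model and the held-out portion would supply the `actual` and `predicted` pairs passed to the two metric functions.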