Modelling electrical conductivity of the dilution of Bulgarian linden honey

Linden honey is among the most common monofloral types of honey in the region of Bulgaria, and it has no reference values for its electrical conductivity (EC) in the international and national standards. Three variants of a model of the EC dependence on the concentration of linden honey (based on Gaussian, quadratic and cubic functions) are proposed. A standard EC meter with temperature compensation to 25 °C is used to capture the experimental data. The EC readings are then recalculated to the temperature specified in the honey standard (20 °C) using a linear temperature compensation coefficient for honey (2.6 %/°C). Verification of the proposed models and their statistical evaluation are done in the MATLAB environment. The models can be further developed, generalized and used as an additional tool for the classification of honey on the basis of its botanical and/or geographical origin.


Introduction
The botanical and geographical declaration of the origin of honey is regarded as one of the fundamental aspects of honey quality that affects its commercial value (Robbins 2003; Sergiel et al. 2014). The international honey standards (Codex Alimentarius 1981/2001; EU Directive) allow specific denominations for honey produced from a particular botanical origin (unifloral honeys), but they do not specify the characteristics of the various honey types (Acquarone et al. 2007). Pollen analysis is widely employed to ascertain the botanical origin of honey, but the interpretation of pollen analysis data may be difficult and depends mainly on the experience and performance of the operator (Mateo et al. 1998; Acquarone et al. 2007). Honeys of the same floral source can vary due to seasonal climatic variations or to different geographical origins (Anklam 1998). The mineral content of honey samples gives an indication of environmental pollution and hence also an indication of geographical origin (Anklam 1998). The electrical conductivity of honey is closely related to the concentration of mineral salts, organic acids and proteins, and has proved useful for discriminating honeys of different floral origins (Mateo et al. 1998; Acquarone et al. 2007). It has been shown (Accorti et al. 1987) that a highly significant linear correlation exists between the ash content and the electrical conductivity of honey.
The conductivity of honey is an important parameter for determining its authenticity and quality, and it is specified both in the Codex Alimentarius Draft revised standard for honey (2001) and in the EU Directive relating to honey (2002). The Bulgarian Regulations of the Council of Ministers (2002) and of the Ministry of Agriculture and Forests (2005) relating to honey introduced the requirements of the EU Directive in Bulgaria. According to those documents, the electrical conductivity of nectar honey and of mixtures of blossom and honeydew honeys should be not more than 0.8 mS·cm⁻¹, and that of honeydew and chestnut honeys not less than 0.8 mS·cm⁻¹. There are no requirements for some kinds of honey, namely strawberry tree (Arbutus unedo), bell heather (Erica), eucalyptus, lime (Tilia spp.), ling heather (Calluna vulgaris), manuka or jelly bush (Leptospermum) and tea tree (Melaleuca spp.), because of the extremely high variation in their conductivity. The national honey standards of some countries additionally set minimum requirements for the electrical conductivity of some kinds of honey. For example, according to the Polish standard, they are: for nectar honey in general, not less than 0.2 mS·cm⁻¹; for blends of honeydew with blossom honey, not less than 0.6 mS·cm⁻¹; for deciduous honeydew honey, not less than 0.8 mS·cm⁻¹; and for coniferous honeydew honey, not less than 0.95 mS·cm⁻¹ (Szczęsna et al. 2004).
Conductivity is the ability of a solution to pass an electric current. It depends on a number of factors, including the concentration, mobility and valence of the ions, and the temperature. As the temperature of a solution increases, the mobility of the ions in the solution also increases, which leads to an increase in its conductivity. Therefore conductivity measurements must always be associated with a reference temperature, usually 20 °C or 25 °C (www.jenway.com/adminimages/A02_001A_Effect_of_temperature_on_conductivity.pdf). The standard requirements for conductivity readings of honey are defined at 20 °C. If the measurement is carried out at a temperature different from 20 °C, a temperature correction factor should be used to calculate the conductivity of the honey at exactly 20 °C. For example, a correction factor of 3.2 %/°C has been proposed by the Harmonised Methods of the European Honey Commission (Bogdanov et al. 1997), and a value of 2.6 %/°C by Szczęsna et al. (2004).
The purpose of the paper is to model the dependence of the electrical conductivity (EC) of linden honey on its concentration at two reference temperatures, 20 °C and 25 °C. The first temperature corresponds to the standards for honey and the second is in line with international reference measurements. Linden honey is among the most common monofloral types of honey in the region of Bulgaria, and it has no reference values for its EC in international and national standards. The model can be further developed, generalized and used as an additional tool for the classification of honey on the basis of its botanical and/or geographical origin. It is envisaged that EC measurements should be carried out without maintaining a constant reference temperature, using instead a linear correction factor for honey. The testing of the proposed models and their statistical evaluation are done in the MATLAB environment.

Materials and Methods
Electrical conductivity of honey. The electrical conductivity was measured with a "Dist" conductivity meter by Hanna Instruments, which has temperature compensation to 25 °C. Measurements were performed at room temperature (25 °C ± 2 °C) at concentrations ranging from 15% to 40% (w/v) in steps of 5%. All EC readings were performed in triplicate and the average value was taken. According to the Harmonised Methods of the European Honey Commission, EC is measured at 20 °C and at a concentration of 20%, that is, a 20 g·100 mL⁻¹ solution of honey (dry matter basis) in deionized water (Bogdanov et al. 1997). The results are then recalculated to the reference temperature using the linear temperature correction formula (1).
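A linear temperature correction of this kind can be sketched as follows. The sketch assumes that a fixed percentage of the reading is subtracted per degree above the reference temperature (and added per degree below it); this form is an assumption and not necessarily the exact formula used in the study. The function name `ec_at_reference` and its parameter names are hypothetical, and the default coefficient is the 2.6 %/°C value attributed to Szczęsna et al. (2004).

```python
def ec_at_reference(ec_measured, temp_measured_c, temp_ref_c=20.0, coeff_per_c=0.026):
    """Recalculate an EC reading to a reference temperature.

    Assumed linear form: EC_ref = EC_T * (1 - k * (T - T_ref)),
    where k is the fractional correction per degree Celsius.
    """
    return ec_measured * (1.0 - coeff_per_c * (temp_measured_c - temp_ref_c))


# Hypothetical reading of 1000 uS/cm reported by a meter compensated to 25 degrees C,
# recalculated to the 20 degrees C reference of the honey standard.
ec20 = ec_at_reference(1000.0, 25.0)
```

Under this assumed form, a reading compensated to 25 °C is reduced by 5 × 2.6 % = 13 % when it is expressed at 20 °C.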

Models of electrical conductivity of honey.
Parametric fitting involves finding coefficients (parameters) for one or more models that are fitted to data. The data are assumed to be statistical in nature and are divided into two components, a deterministic component and a random component. The deterministic component is given by a parametric model and the random component is often described as the error associated with the data: data = parametric model + error. The model is a function of the independent (predictor) variable and one or more coefficients. The error represents random variations in the data that follow a specific probability distribution (usually Gaussian). The variations can come from many different sources, but they are always present at some level when dealing with measured data. Systematic variations can also exist, and they can lead to a fitted model that does not represent the data well.
Three types of model are compared for modelling the electrical conductivity upon dilution of Bulgarian linden honey:
a) A Gaussian model, equation (2), in which EC and ECmax are the electrical conductivities at concentration C and at the concentration of maximum conductivity Cmax, respectively, and SD is the width of the Gaussian curve. This model is of 1st order and is hereafter referred to as Gauss1. The unknown parameters (ECmax, Cmax, SD) are estimated by the Levenberg-Marquardt algorithm.
b) A 2nd-order polynomial model in the standard form, equation (3), with a vector of three unknown parameters. Hereafter this model is referred to as POLY2.
c) A 3rd-order polynomial model in the standard form, equation (4). In this case the vector of unknown parameters has four elements. Hereafter this model is referred to as POLY3.
Least-squares fitting is a common task in the sciences. The polynomial models (2nd and 3rd order) are estimated by linear least-squares methods. The Gaussian model is obtained by non-linear regression using the iterative Levenberg-Marquardt (LM) algorithm (Lampton 1997). The method has been widely presented and is a component of several numerical mathematics packages, e.g. MATLAB. LM is applicable to a wide variety of nonlinear problems because it is adaptive. In the LM method, an adjustable N-dimensional parameter vector controls an M-dimensional vector of residuals. The discrepancy of the fit is measured by the sum of squares (SOS) of the components of the residual vector. The residuals depend nonlinearly on the parameters. From a given starting point, LM produces a sequence of parameter vectors, each step being an improvement in fit, i.e. a reduction in the SOS. The sequence terminates near a minimum of the SOS.
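For orientation, the three model forms can be sketched as below. The notation is an assumption rather than the authors' own: the Gaussian is written in the form of the `gauss1` library model of the MATLAB Curve Fitting Toolbox (no factor 1/2 in the exponent), and the polynomial coefficient names p_1 to p_4 are hypothetical.

```latex
\begin{align}
EC &= EC_{\max}\,\exp\!\left[-\left(\frac{C - C_{\max}}{SD}\right)^{2}\right] \tag{2}\\
EC &= p_1 C^{2} + p_2 C + p_3 \tag{3}\\
EC &= p_1 C^{3} + p_2 C^{2} + p_3 C + p_4 \tag{4}
\end{align}
```

With the quadratic form written this way and p_1 < 0, the curve has its maximum at C = -p_2 / (2 p_1), which plays the role of Cmax in the Gaussian model.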
The patterns of the EC variables upon honey dilution were modelled using the Curve Fitting Toolbox for MATLAB (https://www.mathworks.com/help/curvefit/parametric-fitting.html#bszh0sy-2). The software lets the user conduct regression analysis using the library of linear and nonlinear models provided, or custom equations specified by the user. It performs exploratory data analysis, data pre-processing and post-processing, compares candidate models, and optimizes their parameters. All parameters are determined with 95% confidence intervals.
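The Levenberg-Marquardt iteration described above can be illustrated with a short plain-Python sketch. It is not the Curve Fitting Toolbox routine used in the study: the Gaussian form EC = ECmax·exp(−((C − Cmax)/SD)²), the damping schedule (factor 10), the starting values and the synthetic, noise-free data are all assumptions made for the example, and all function names are hypothetical.

```python
import math


def gauss1(c, ec_max, c_max, sd):
    """Assumed Gaussian form: EC = ECmax * exp(-((C - Cmax) / SD)**2)."""
    return ec_max * math.exp(-((c - c_max) / sd) ** 2)


def jacobian_row(c, ec_max, c_max, sd):
    """Partial derivatives of gauss1 with respect to (ECmax, Cmax, SD)."""
    z = (c - c_max) / sd
    e = math.exp(-z * z)
    return [e, ec_max * e * 2.0 * z / sd, ec_max * e * 2.0 * z * z / sd]


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def sum_of_squares(params, conc, ec):
    """SOS of the residuals between data and model."""
    return sum((y - gauss1(c, *params)) ** 2 for c, y in zip(conc, ec))


def levenberg_marquardt(conc, ec, start, max_iter=200, lam=1e-3):
    """Minimise the SOS; every accepted step reduces it."""
    params = list(start)
    sos = sum_of_squares(params, conc, ec)
    n = len(params)
    for _ in range(max_iter):
        rows = [jacobian_row(c, *params) for c in conc]
        resid = [y - gauss1(c, *params) for c, y in zip(conc, ec)]
        jtj = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
        jtr = [sum(r[i] * e for r, e in zip(rows, resid)) for i in range(n)]
        # Damped normal equations: (J^T J + lam * diag(J^T J)) * step = J^T r
        damped = [[jtj[i][j] + (lam * jtj[i][i] if i == j else 0.0)
                   for j in range(n)] for i in range(n)]
        step = solve(damped, jtr)
        trial = [p + d for p, d in zip(params, step)]
        trial_sos = sum_of_squares(trial, conc, ec)
        if trial_sos < sos:      # improvement: accept the step, relax the damping
            params, sos = trial, trial_sos
            lam /= 10.0
        else:                    # no improvement: reject the step, increase the damping
            lam *= 10.0
        if sos < 1e-18 or lam > 1e12:
            break
    return params, sos


# Synthetic, noise-free data (not measured values) on the 15-40 % grid used in the text.
conc = [15.0, 20.0, 25.0, 30.0, 35.0, 40.0]
ec = [gauss1(c, 500.0, 30.0, 25.0) for c in conc]
fitted, final_sos = levenberg_marquardt(conc, ec, [450.0, 28.0, 20.0])
```

The damping parameter switches the step between a Gauss-Newton-like update (small damping) and a short gradient-like update (large damping), which is the adaptivity mentioned in the text.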
Statistical analysis of models. Once a regression model has been constructed, it may be important to confirm the goodness of fit of the model and the statistical significance of the estimated parameters. The following statistical parameters, which indicate the goodness of fit of a model to experimental data, were calculated: the sum of squared errors (SSE), the coefficient of determination (R²), the adjusted R² and the root-mean-square error (RMSE). The sum of squared errors of prediction (SSE), also known as the residual sum of squares (RSS), is the sum of the squares of the residuals (deviations of the predicted values from the actual empirical values of the data). It is a measure of the discrepancy between the data and an estimation model. The coefficient of determination, denoted R² or R-squared, is a measure of how well observed outcomes are replicated by the model, based on the proportion of the total variation of outcomes explained by the model (Draper et al. 1998). When more terms are added to a model, R² increases. The adjusted R² is an attempt to take into account a penalty for the number of terms in a model. The adjusted R² is therefore more appropriate for comparing how different models fit the same data. The root-mean-square error (RMSE), or root-mean-square deviation (RMSD), is a frequently used measure of the differences between the values predicted by a model and the values actually observed. The RMSE represents the sample standard deviation of the differences between predicted and observed values. These individual differences are called residuals when the calculations are performed over the data sample that was used for estimation, and prediction errors when computed out-of-sample. The RMSE is a measure of accuracy used to compare the forecasting errors of different models for a particular dataset and not between datasets, as it is scale-dependent.
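The four statistics can be computed as in the following sketch. It assumes the usual degrees-of-freedom convention, with v = n − m residual degrees of freedom for n data points and m fitted coefficients; the function name `goodness_of_fit` and the dictionary keys are hypothetical, and the numbers in the usage lines are illustrative, not measured data.

```python
import math


def goodness_of_fit(observed, predicted, n_params):
    """Return SSE, R^2, adjusted R^2 and RMSE for a fitted model.

    Assumes v = n - n_params residual degrees of freedom.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((y - f) ** 2 for y, f in zip(observed, predicted))   # residual SOS
    sst = sum((y - mean_obs) ** 2 for y in observed)               # total SOS
    dof = n - n_params
    return {
        "sse": sse,
        "r2": 1.0 - sse / sst,
        "adj_r2": 1.0 - (sse / dof) / (sst / (n - 1)),
        "rmse": math.sqrt(sse / dof),
    }


# Illustrative numbers only: four observations and a two-parameter model.
stats = goodness_of_fit([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], n_params=2)
```

Because the adjusted R² and the RMSE divide by the residual degrees of freedom, a model with more coefficients is penalised when it does not reduce the SSE enough, which is relevant when only a few concentrations are measured.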

Results and Discussion
Grounds for the expected results. Gaussian and quadratic functions have been proposed to model the dependence of the EC of honey on its concentration (dry matter basis) in the works of Acquarone et al. (2007) and Sancho et al. (1991), respectively. The results obtained in this paper are, on the one hand, a general verification of models already proposed by other authors and, on the other hand, a study of Bulgarian linden honey for the purpose of the future classification of its botanical or geographical origin. As is known, the ash content may be a complex function of both floral and geographical origin (Acquarone et al. 2007). A highly significant linear correlation exists between the ash content and the electrical conductivity of honey (Accorti et al. 1987). The EC depends on sample dilution, owing to changes in the physical and chemical environment of the ions and to their interactions with the different components present in the system: sugars, amino acids and salts (Acquarone et al. 2007). Therefore, the variation of EC with increasing honey concentration could be very useful for discriminating the floral and/or geographic origin of different honeys. Acquarone et al. (2007) have shown that EC peak values (ECmax) higher than those from other geographic regions are probably due to the higher mineral content of the soils in the region; that the lowest ECmax values are probably due to the low mineral content and the high amount of organic matter in the soil of the region; and that ECmax values intermediate between those two cases indicate soils with intermediate amounts of minerals and organic substances. Therefore, the ECmax values could be of interest for determining the geographic origin of honey.

Fitted surfaces of the dependence of electrical conductivity of honey upon dilution.
The experimental data on the electrical conductivity of honey upon dilution were considered for the two samples studied, at temperatures of 20 °C and 25 °C. The values of the coefficients in the models that describe the dependence of conductivity on honey dilution are listed in Table 1 for the three studied models. The values in brackets after each coefficient in Table 1 are the lower and upper confidence limits, respectively, of the default 95% confidence intervals for the coefficients.
Table 1. Parameters that describe conductivity dependence on honey dilution for the three studied models
Fig. 1-3 show the dependence of electrical conductivity EC on honey concentration. The solid lines in Fig. 1-3 represent the curves predicted by (2)-(4) for θ = 20 °C and θ = 25 °C, and the 'circle' and 'square' marks correspond to the data measured at the same temperatures. In the figure legends these two cases are denoted by EC20 and EC25. In these figures the conductivity is given in μS·cm⁻¹. The curve of the dependence of electrical conductivity EC on honey concentration exhibits a behaviour characterized by a maximum value at a given dilution Cmax. The EC values depend on the concentration and mobility of the ions present in the honey solution. In the region of the curve corresponding to the more diluted samples, EC increases as a result of increasing ion concentration, up to a maximum value. However, ion mobility simultaneously decreases, and this effect predominates in the descending region of the curve (high honey concentration) (Acquarone et al. 2007). The values of the obtained parameters (Table 1) indicate that (2)-(4) fit the experimental data very well. The curves predicted by (2) through non-linear regression analysis are shown in Fig. 1, and those predicted by (3) and (4) through linear regression analysis in Fig. 2 and Fig. 3, respectively. It is evident that the models approximate the experimental data very well in all cases. All obtained statistics used in the context of statistical models, namely the sum of squared errors (SSE), the coefficient of determination (R²), the adjusted R² and the root-mean-square error (RMSE), are given in Table 2.
Table 2. Quality assessments for the tested models
It seems that all the considered statistical features have comparable values, i.e. the three models are comparable on these benchmarks. The calculated R² and adjusted R² are highest for the second model, POLY2, and lowest for the non-linear Gauss1, for both temperatures (θ = 20 °C and θ = 25 °C).
In the narrow range of analyzed EC values, a high R-squared was found between EC and concentration/dilution for all the considered model structures. Acquarone et al. (2007) found a similar correlation for the first model. The Curve Fitting Toolbox for MATLAB calculates R-squared for a nonlinear regression, but the research literature shows that it is an invalid goodness-of-fit statistic for this type of model. In nonlinear regression, the regression SOS and the error SOS do not add up to the total SOS. This invalidates R-squared for nonlinear models, and it no longer has to lie between 0 and 100%. Spiess and Neumeyer (2010) have performed thousands of simulations for their study that show how using R-squared to evaluate the fit of nonlinear

Conclusions
Three variants of a model of the EC dependence on the concentration of linden honey (Gaussian, quadratic and cubic) are proposed. The results obtained confirm the conclusions of other authors investigating similar problems. For a wider range of honey concentrations the shape of the Gaussian model is more appropriate, and for a narrower range the shape of the quadratic one. If the temperature varies in a narrow range around 25 °C, it is acceptable to use a standard EC meter with temperature compensation to 25 °C and to recalculate the EC readings to the temperature specified in the honey standard (20 °C) using a linear temperature compensation coefficient for honey (2.6 %/°C). This simplifies and accelerates the EC measurements, since there is no need to maintain a constant temperature of 20 °C. The maximum EC value may be an indicator of the geographical region of a honey of a given botanical origin, for example linden honey.
Future work will include expanding the database with new EC characteristics of honey, covering (1) honeys of different botanical origin (linden, acacia, lavender, honeydew, etc.) and (2) honeys from different geographical regions within a given botanical origin, as well as (3) the development of honey classifiers using EC models, among others.

Fig. 1-3. Dependence of electrical conductivity EC on honey concentration. The solid lines represent the curves predicted by (2)-(4) for θ = 20 °C and θ = 25 °C, and the 'circle' and 'square' marks correspond to the data measured at the same temperatures (denoted EC20 and EC25 in the legends). Conductivity is given in μS·cm⁻¹.
Figure 1. Gaussian model, first order (Gauss1)
It should be emphasized that the polynomial models are linear in their parameters and that their statistics in Table 2 are therefore much more reliable than those of the non-linear Gauss1 model. On the other hand, considering that the EC was modelled within a small range of concentrations, the non-linear Gauss1 model is preferable, because it provides high predictive precision over a wide range of concentrations of the honey solution (for example, from 1 to 100 %).