Talk:Determination of equilibrium constants
Earlier discussion has been archived to reduce size:
- Archive 1: to 15:45, 15 December 2007. Summary: Discussion of a unified presentation of mass balance. Pgpotvin (talk) 16:28, 24 December 2007 (UTC)
Compromise Text
The chemical model
The chemical model consists of a set of chemical species present in solution, both the reactants added to the reaction mixture and the complex species formed from them. Denoting the reactants by A, B ..., each complex species is specified by the stoichiometric coefficients that relate the particular combination of reactants forming them.
:pA + qB + ... \rightleftharpoons A_pB_q... and \beta_{pq...}=\frac{[A_pB_q...]}{[A]^p[B]^q...}
The β constants are therefore often called cumulative formation constants. This representation does not ignore ionic charges, which would be included in the chemical formulae of the reactants symbolized here by A, B, ... . For consistency, all the equilibrium constants should be association constants. When using general-purpose computer programs, it is usual to use cumulative constants. With aqueous solutions, the constant for the self-dissociation (the ionic product) of water should be included.
If either H+ or OH− is one of the reactants, say reactant A, then the self-dissociation of water, with ionic product
:K_W=[H^+][OH^-]
can be represented by
:[OH^-] = K_W[H^+]^{-1} if H+ is reactant A, or
:[H^+] = K_W[OH^-]^{-1} if OH− is reactant A.
It is quite usual to omit from the model ...
Speciation calculations
The speciation calculations consist of simultaneously solving the mass-balance equations at each data point for the free reactant concentrations, [A], [B], ..., given the total concentration T of each reactant present,
:T_A=[A]+\sum p\beta_{pq...}[A]^p[B]^q...
:T_B=[B]+\sum q\beta_{pq...}[A]^p[B]^q...
- ...
To simplify and unify the representation, some authors[1] include the free reactant terms in the sums by declaring identity (unit) β constants, whereby
:\beta_{10...} = {[A_1B_0...]}/{[A]^1[B]^0...} = 1
:\beta_{01...} = {[A_0B_1...]}/{[A]^0[B]^1...} = 1
- ...
In this manner, all chemical species, including the free (unreacted) reactants, are treated alike, i.e. as having been formed from some combination of reactants specified by the coefficients p,q,..., as by
:[A_{p}B_q...]=\beta_{pq...}[A]^p[B]^q...
whence
:T_A=\sum p\beta_{pq...}[A]^p[B]^q...
:T_B=\sum q\beta_{pq...}[A]^p[B]^q...
- ...
suffices. In general, the concentration of the i'th species S_i, formed by a combination of the N_R reactants R, is
:[S_i] = \beta_i \prod_k^{N_R} [R_k]^{a_{i,k}}
where the coefficient a_{i,k} is the number of equivalents of the k'th reactant entering into the formation of the i'th species and β_i is the formation constant governing that assembly, and the j'th total reactant concentration can be written
:[T_j] = \sum_i^{N_S} a_{i,j} [S_i] = \sum_i^{N_S} a_{i,j} \beta_{i} \prod_k^{N_R} [R_k]^{a_{i,k}}
This unification makes for simpler coding of computer programs during the calculation of species concentrations and derivatives.
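As an aside on implementation, the double-indexed bookkeeping above reduces to a few array operations; the following is a minimal NumPy sketch (the function names and the tiny A/B/AB example system are hypothetical, for illustration only):

```python
import numpy as np

def species_concentrations(beta, a, R):
    """[S_i] = beta_i * prod_k [R_k]^a_{i,k}, for all N_S species at once.
    beta: (N_S,) formation constants; a: (N_S, N_R) coefficients;
    R: (N_R,) free reactant concentrations."""
    return beta * np.prod(R ** a, axis=1)

def total_concentrations(beta, a, R):
    """[T_j] = sum_i a_{i,j} [S_i]; the free reactants themselves are rows
    of `a` with a unit beta and a single coefficient of 1."""
    return a.T @ species_concentrations(beta, a, R)

# Example: reactants A, B; species A, B and AB (beta_11 = 1e4)
beta = np.array([1.0, 1.0, 1.0e4])
a = np.array([[1, 0], [0, 1], [1, 1]])
print(total_concentrations(beta, a, np.array([1.0e-3, 2.0e-3])))
```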
At each data point, the T would be known quantities, ...
Pgpotvin (talk) 16:01, 15 December 2007 (UTC)
Proposed Insertion in 'Speciation calculations'
Note: Pre-existing text is coloured grey.
The unknown concentrations [R] are solved from initially guessed values by corrections that increasingly satisfy the mass-balance equations, calculated and applied iteratively until they become insignificantly small. The corrections are obtained by solving the family of first-order-truncated Taylor series
:[T_j] = \sum_i^{N_S} a_{i,j} [S_i] + \sum_k^{N_R} \frac{\partial [T_j]}{\partial [R_k]} \Delta [R_k]
The series can be gathered together and written in matrix notation as
:T = T^{calc} + J \Delta[R]
where J, with elements J_{j,k} = {\partial [T_j]}/{\partial [R_k]}, is the matrix of partial derivatives,[2] T and T^{calc} are vectors containing the analytical and calculated total reactant concentrations, respectively, and Δ[R] is the vector of corrections to be applied to the reactant concentrations. This is solved with
:\Delta[R] = J^{-1}(T - T^{calc})
since the matrix J will be square and invertible.[3] At each iteration, the derivatives (and concentrations) need to be recomputed, and the process is repeated until the corrections Δ[R] become insignificant, whence the final values of the concentrations will have been obtained (a code sketch is given below). The number of iterations required depends on the quality of the initial guesses at the first iteration. In a titration context, where the samples are ordered, such guesses are only needed at the first titration point, and the number of iterations needed at each subsequent point is lessened.[4] In general, solving these non-linear equations can present a formidable challenge because of the huge range over which the free concentrations may vary. For this reason, logarithmic expressions are often used, since logarithms span a much narrower range.
Special care is needed when reactants are present at only very small concentrations in their free states, which may warrant a reformulation of the chemical model. [5]
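For concreteness, the Newton-Raphson procedure described above might be coded along the following lines. This is only a sketch (assuming NumPy), with a crude step-halving guard to keep concentrations positive and one of several possible convergence tests:

```python
import numpy as np

def solve_speciation(beta, a, T, R_guess, tol=1e-10, max_iter=200):
    """Newton-Raphson solution of the mass-balance equations for the
    free reactant concentrations [R], given the total concentrations T."""
    R = np.asarray(R_guess, dtype=float).copy()
    for _ in range(max_iter):
        S = beta * np.prod(R ** a, axis=1)      # species concentrations [S_i]
        T_calc = a.T @ S                        # calculated totals
        misfit = T - T_calc
        if np.all(np.abs(misfit) <= tol * np.abs(T)):
            break
        # J_{j,k} = d[T_j]/d[R_k] = sum_i a_{i,j} [S_i] a_{i,k} / [R_k]
        J = (a * S[:, None]).T @ (a / R)
        delta = np.linalg.solve(J, misfit)      # corrections Delta[R]
        # halve any free concentration that a full step would drive negative
        R += np.where(R + delta <= 0.0, -0.5 * R, delta)
    return R
```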
Proposed Insertions in 'Equilibrium constant refinement'
Note: Pre-existing text is coloured grey.
The method described here is essentially that delineated by Alcock et al. in 1978.[6]
The objective of the refinement process... Most often, only the diagonal elements can be estimated, in which case the objective function simplifies to
:U = \sum_i W_{i,i}(y_i^{obs} - y_i^{calc})^2
with Wi,j = 0 for j≠i. Unit weights, Wi,i = 1, are often used but, in that case, the expectation value of U is the root mean square of the experimental errors. Estimates of the diagonal elements of W are obtained from error propagation as
:W_{i,i} = 1\Big/\sum_j \left(\frac{\partial y_i^{calc}}{\partial Q_j}\right)^2 \sigma^2(Q_j)
for all parameters Q that are not being determined (refined), where σ(Qj) represents a realistic estimate of the uncertainty in the j'th Q parameter. This recognizes that any one parameter will not affect each data point uniformly, and the contribution of the y residual at each point is de-emphasized according to its uncertainty arising from all error sources.
The minimization may be performed... using implicit differentiation.
Parameter increments δP are calculated by solving the normal equations, the first-order Taylor series gathered in matrix form as
:y^{obs} - y^{calc} = J \delta P
where J is the Jacobian matrix of derivatives, with elements J_{i,j} = \partial y_i^{calc}/\partial P_j. Weighting is applied here as well to de-emphasize the most error-prone data in determining the P parameters:
:(J^T W J) \delta P = J^T W (y^{obs} - y^{calc})
Solving for δP gives
:\delta P = (J^T W J)^{-1} J^T W (y^{obs} - y^{calc})
where the superscript T indicates the transpose of the matrix and J^T W J is called the Hessian matrix, represented by H. The increments δP are added to the current parameter estimates to obtain better estimates, the species concentrations, ycalc and W values are recalculated at every data point, and the procedure is repeated until no further useful reduction in U is achieved.
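A bare-bones sketch of one such refinement, assuming NumPy and user-supplied (hypothetical) functions for ycalc and the Jacobian:

```python
import numpy as np

def refine_constants(P, y_obs, W, calc_y, calc_jacobian, max_iter=50, tol=1e-8):
    """Gauss-Newton refinement: repeatedly solve (J^T W J) dP = J^T W (y_obs - y_calc)."""
    U_old = np.inf
    for _ in range(max_iter):
        r = y_obs - calc_y(P)              # residuals at current parameters
        U = r @ W @ r                      # weighted sum of squared residuals
        if U_old - U < tol * U_old:        # no further useful reduction in U
            break
        U_old = U
        J = calc_jacobian(P)               # (N_D, N_P) matrix of dy_calc/dP
        H = J.T @ W @ J                    # the Hessian (normal-equations) matrix
        P = P + np.linalg.solve(H, J.T @ W @ r)
    return P, H, U
```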
Marquardt-Levenberg Modification
To modulate the refinement, one can apply the Marquardt-Levenberg algorithm and instead use
:(J^T W J + \lambda I) \delta P = J^T W (y^{obs} - y^{calc})
where λ is the Marquardt parameter and I is the identity matrix. The Marquardt-Levenberg algorithm can remedy problems of undershoot, overshoot and oscillation about the minimum in U through adjustment of λ. A non-zero λ orients the search for the minimum of U toward the so-called steepest-descent direction,
:\delta P \propto J^T W (y^{obs} - y^{calc})
which results from the direct minimization of U by setting to zero all of its derivatives with respect to P.
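The corresponding change to the normal-equations solve is small; a hedged sketch (a real implementation would also include a policy for raising and lowering λ between iterations, which is not shown):

```python
import numpy as np

def marquardt_increment(J, W, r, lam):
    """Solve (J^T W J + lam*I) dP = J^T W r.  lam = 0 recovers the Gauss-Newton
    step; a large lam turns dP toward the steepest-descent direction J^T W r."""
    H = J.T @ W @ J
    return np.linalg.solve(H + lam * np.eye(H.shape[0]), J.T @ W @ r)
```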
Uncertainty of Determination and Refinement of the Chemical Model
A numerical determination of equilibrium constants is merely a test of a chemical model, and not proof of its relevance. One can examine the statistical quality indicators of the model with the notion of modifying the model to improve the determination. However, the determination ceases being a measure of constants for known equilibria and risks becoming a discovery of new equilibria.[7] One must do so wisely, guided by unbiased comparisons between models.
The uncertainty in the value of the j'th parameter P is obtained from the j,j'th (diagonal) element of the inverse of the Hessian matrix:
:\sigma(P_j) = \sigma_0 \sqrt{(H^{-1})_{j,j}}
where σ0 is the uncertainty of an observation of unit weight, which can be estimated[8] with
:\sigma_0^2 = U/(N_D - N_P)
for N_P parameters determined from N_D data.
From a statistics point of view, the uncertainty in a parameter denotes a range of values for that parameter, all of which generate indistinguishable fits, with a width that depends on the confidence level one wishes to apply. Thus, the uncertainties reflect the (in)ability of the refined parameter values to reproduce the data. Conversely, the uncertainties reflect the (in)ability of the data to pinpoint the parameter values. Modifications to the experimental design might more precisely ascertain the values of uncertain parameters.[9] Otherwise, it may be that a highly uncertain parameter can be dropped from the chemical model without 'significant' degradation of the fit to the data.
The correlation coefficients between the parameter values are obtained from the off-diagonal elements of the inverse Hessian. Between the j'th and k'th parameter values,
:\rho_{j,k} = (H^{-1})_{j,k}\Big/\sqrt{(H^{-1})_{j,j}(H^{-1})_{k,k}}
These reflect the in(ter)dependence of the parameters in modeling the data, i.e. the benefit (or lack thereof) of separately including each parameter in the chemical model. Conversely, the correlation coefficients reflect the (in)ability of the data to differentiate between parameters. Again, experimental modifications might bring remedy, but it may be that a pair of highly correlated parameters can be replaced in the chemical model by just one parameter without 'significant' loss of fit.
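Both the uncertainties and the correlation coefficients come straight from the inverse Hessian; a minimal sketch of the arithmetic (assuming NumPy, with H, U, N_D and N_P taken from a completed refinement):

```python
import numpy as np

def parameter_statistics(H, U, N_D, N_P):
    """sigma(P_j) = sigma_0 * sqrt((H^-1)_{j,j}) and the correlation matrix
    rho_{j,k} = (H^-1)_{j,k} / sqrt((H^-1)_{j,j} (H^-1)_{k,k})."""
    H_inv = np.linalg.inv(H)
    sigma0 = np.sqrt(U / (N_D - N_P))          # uncertainty of a unit-weight observation
    d = np.sqrt(np.diag(H_inv))
    return sigma0 * d, H_inv / np.outer(d, d)  # uncertainties, correlation matrix
```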
As well, it may be that the chemical model provides an 'unsatisfactory' fit of the data, which the inclusion in the model of additional complex species (and additional parameters in the form of new β values) would improve.
Given that adding parameters will usually improve the fit, whereas removing parameters will usually degrade it, some guidance is required as to what constitutes 'significant' degradation or improvement of a fit. Any alteration to a chemical model, especially the inclusion of an unanticipated chemical species or the discounting of one that had been expected, should be guided by good chemical sense and be unbiased. To help make an unbiased decision, the Hamilton R-ratio test (Hamilton, 1964), routinely used in crystallography, can be applied to let the data decide whether an alternate model should be rejected or accepted.
Finally, the measurements themselves and the parameters Q will affect the quality of the fit. For instance, certain data points (outliers) may be 'problematic' in that their removal 'significantly' improves the quality of the fit, as well as reducing the uncertainties in P, contrary to the general rule that the fit statistics (and parameter uncertainties) improve with an increasing number of data. As well, the yobs − ycalc residuals may be correlated in that the yobs are not randomly distributed about the ycalc, which the method of least squares assumes will be the case.[10] To remedy such problems, one can remove certain data, or shift the weighting by increasing or decreasing some of the σ(Q) values, or change the Q values themselves, or even refine them alongside the unknown P as if they need to be determined (which many computer programs allow). Such tinkering for the sake of improving the fit introduces bias at the risk of losing meaning. Good chemical sense, rather than bias toward a result, should guide any changes to the model, the fixed parameters, the data or their weights, and efforts at improving a determination may be more usefully devoted to an improvement in the quality of the data (i.e. the confidence in the data).[11]
Nevertheless, it might be useful in some cases to rigorously compare the results from judicious alterations of this sort. The selective omission or de-emphasis of data and the alteration of fixed parameters give rise to competing models with different W matrices, and a modified Hamilton R-ratio test[12] may be used to decide which are statistically significantly different. This modified test is also useful for comparing results from separate experiments, with different data points and different W matrices.
Uncertainties vs. Experimental Errors
As indicated above, the uncertainties calculated from the Hessian matrix H reflect the quality of the fit and carry the notion of 'determinability'. Both the parameter values and their uncertainties should be well reproduced if the experiment were conducted again under identical conditions. On the other hand, 'reproducibility' relates to the expectation of statistically indistinguishable results from completely independent determinations, ideally by different practitioners using different equipment and different concentrations of the same materials from different batches. Most often, such diverse data sources are not available.[13] Replicate experiments are nevertheless useful to minimize the effects of random errors in some systematic error sources Q.[14]
The assessment of an 'experimental error' in an equilibrium constant then ranges from the quality of fit in a single determination, or repeated determinations where some systematic error sources are minimized by averaging, to completely independent determinations where all error sources are averaged. Therefore, any report of 'experimental error' should also convey the diversity of the data used. However, simply averaging the values from separate determinations would not take into account the fact that not all determinations have equal value (and precision), as related by the uncertainties from individual determinations. Instead, the uncertainty-weighted average,
:\bar{P} = \sum_n^N \left(P_n/\sigma_n^2(P_n)\right) \Big/ \sum_n^N \left(1/\sigma_n^2(P_n)\right)
where σn(Pn) represents the uncertainty in the n'th of N determinations of P, will bias the consensus value towards the least uncertain values. Similarly, one can quantify the 'experimental error' in \bar{P} as the uncertainty-weighted standard deviation \sigma(\bar{P}) about this average,[15] from
:\sigma^2(\bar{P}) = \sum_n^N \left((P_n - \bar{P})^2/\sigma_n^2(P_n)\right) \Big/ \sum_n^N \left(1/\sigma_n^2(P_n)\right)
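A small worked illustration of these two formulas (a sketch assuming NumPy; the numbers are invented):

```python
import numpy as np

def weighted_consensus(P, sigma):
    """Uncertainty-weighted average of N determinations of P, and the
    uncertainty-weighted standard deviation about that average."""
    w = 1.0 / sigma**2
    P_bar = np.sum(w * P) / np.sum(w)
    spread = np.sqrt(np.sum(w * (P - P_bar)**2) / np.sum(w))
    return P_bar, spread

# e.g. three determinations of a log beta, the last one much less certain
print(weighted_consensus(np.array([4.02, 3.98, 4.30]), np.array([0.02, 0.03, 0.20])))
```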
Uncertainties in Derived Constants
When computing equilibrium constants that are functions of the refined formation constants, such as Ka values, the uncertainty in such derived constants is not the simple function of the uncertainties in the component formation constants that normal error propagation rules would suggest: normal error propagation requires independence between error sources, whereas the component formation constants are correlated when assessed from the same data. For instance, for protonation of a dibasic substance L,
:L + H \rightleftharpoons LH, or [LH] = \beta_{11}[L][H]
and
:L + 2H \rightleftharpoons LH_2, or [LH_2] = \beta_{12}[L][H]^2
so that (numbering the Ka values in protonation order) pKa1 = logβ11 and pKa2 = logβ12 − logβ11. The uncertainty (or variance) in pKa1 is the same as that in logβ11, but the variance (squared uncertainty) in pKa2, σ2(pKa2), is not necessarily the sum of the variances in logβ11 and logβ12, because of the (usually) non-zero correlation between the two logβ values. In fact,
:\sigma^2(pK_{a2}) = \sigma^2(\log\beta_{11}) + \sigma^2(\log\beta_{12}) - 2\rho\,\sigma(\log\beta_{11})\,\sigma(\log\beta_{12})
where ρ is the coefficient of correlation between logβ11 and logβ12.
The coefficients of correlation between derived constants are similarly functions of those between formation constants. Since the σ are positive numbers but correlation coefficients can be negative, the uncertainty in derived constants can be greater or smaller than what normal error propagation rules would suggest. Because the protonation constants logβ11 and logβ12 are likely negatively correlated (an increase in one would require a compensatory decrease in the other in order to match the data), the uncertainty in pKa2 will likely be greater than would normally be expected.
Derived constants are equivalent representations in a chemical model. For instance, pKa1 and pKa2 can entirely and equivalently replace logβ11 and logβ12 in the chemical model, even if one or the other representation is used in the model refinement for reasons of programming convention or personal preference, but the error relationships between parameters (variance and covariance) may differ between representations. A method of constructing equivalent chemical models and deriving the error relationships involving derived constants has been outlined.[16]
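Numerically, carrying the covariance term along is trivial; a sketch of the pKa2 case above (ρ would come, for example, from the inverse Hessian of the refinement):

```python
import math

def sigma_pKa2(s11, s12, rho):
    """sigma(pKa2) for pKa2 = log(beta12) - log(beta11), where s11 and s12
    are the uncertainties in log(beta11) and log(beta12) and rho is their
    correlation coefficient."""
    return math.sqrt(s11**2 + s12**2 - 2.0 * rho * s11 * s12)

# a negative correlation inflates the uncertainty relative to the naive sum
print(sigma_pKa2(0.02, 0.03, -0.8))   # > sqrt(0.02**2 + 0.03**2)
```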
Modeling Spectral Data
A particular issue arises with NMR...
Pgpotvin (talk) 22:23, 21 December 2007 (UTC)
Notes
- ^ See Motekaitis and Martell (1982), Martell and Motekaitis (1992) and Potvin (1990a).
- ^ For the j'th total reactant concentration and the k'th reactant R, {\partial [T_j]}/{\partial [R_k]} = \sum_i^{N_S} a_{i,j}[S_i]a_{i,k}/[R_k].
- ^ This illustrates the great advantage of using formation constants to define an equilibrium system: no matter how many species are present in the equilibria and needing to be quantified, the definition of all species in terms of a much smaller number of reactants reduces the number of simultaneous equations to solve, the size of the matrix to invert and the overall computational requirements. In this regard, the computational approach described here deviates from that of Alcock et al. (1978), who solved for the concentrations of all species at once.
- ^ The final concentration values calculated at one titration point can serve as initial guesses for the next point. The derivatives will then have already been computed, the corresponding ∂[T]/∂[R] matrix will already have been inverted, and one will be able to immediately apply a first set of corrections Δ[R]. Thus, only a few iterations will suffice for subsequent titration points, except near inflection points where the concentrations change more radically.
- ^ If the concentration of a reactant in free (unreacted) form is very small at all data points as a result of large formation constants for the complex species arising from it, then the calculation of species concentrations can be numerically problematic because of the need to invert a ∂[T]/∂[R] matrix containing very small elements. The degree of difficulty will depend on the floating-point precision used, the sizes of the formation constants involved, the starting guesses for the concentrations during the first iteration and the initial guesses at the values of the unknown formation constants. The subsequent refinement of unknown formation constants that depend on very small reactant concentrations can produce values plagued with high uncertainties, since perturbations of very small reactant concentrations will be felt in the computed values of the observables to an insignificant degree. For instance, an uncertainty in pH measurement of ±0.0005 log units means an uncertainty of about ±10⁻¹⁰ M in [H+] at pH 7 or about ±10⁻⁵ M at pH 2, whence a concentration of free metal ion below these values will affect the metal-proton competition for ligand to a degree that will be essentially undetectable by pH measurements. Such situations can arise with strongly bound metal-ligand complexes, for instance, where there might be negligible amounts of free metal ever present. As well, equilibria other than formation equilibria, such as ligand-exchange equilibria, might be of greater interest than the formation equilibria themselves, but the exchange equilibrium constants would be ratios of large and potentially highly uncertain formation constants. In such cases, it is possible to reformulate the model (see Potvin (1990b)) to avoid invoking free reactants at negligibly small concentrations and obviate the need to calculate formation constants therefrom, in order to assess other kinds of equilibrium constants directly and with less uncertainty.
- ^ There were earlier authors, cited in Alcock et al. (1978), describing the determination of equilibrium constants. The classic reference from pre-computer times (Rossotti, F. J. C.; H. S. Rossotti (1961). The Determination of Stability Constants. McGraw-Hill, New York.) described methods now largely displaced. Although Alcock et al. chose to refine the species concentrations at the same time as the equilibrium constants (an inefficient practice no longer employed), their presentation of the mathematical basis for statistically sound, computer-based refinement is well detailed and still pertinent, and is based on Hamilton's general treatment (Hamilton 1964).
- ^ Obtaining a significantly better fit by inclusion of a species, especially an unanticipated one, does not constitute proof of the existence of that species. Many journals require independent evidence of new species to corroborate the chemical model. For instance, the description of the journal Polyhedron specifically states that "papers concerned solely with stability constants determined by potentiometric titration data unsupported by other e.g. spectroscopic techniques will not be acceptable" (Polyhedron (Elsevier) description for authors. Retrieved on 2007-12-23.).
- ^ From Alcock et al. (1978).
- ^ Species concentration profiles across the data (speciation profiles) or the data 'sensitivities' (given by the Jacobian matrix elements \partial y_i^{calc}/\partial P_j) can indicate which data points are most informative of a particular parameter value, and suggest modifications to the experiment (widening the data set or adding interpolating points) or new experiments (at different relative reactant concentrations) to increase the concentration of the species governed by that parameter.
- ^ Correlated residuals are particularly easily detected in titration data, and the degree of correlation can be quantified. A formula for titration data derived from Neter et al. (Neter, J.; W. Wasserman and M. H. Kutner (1983). Applied Linear Regression Models. R. D. Irwin, Homewood, IL.) is given in Potvin (1994). Highly correlated residuals usually signal a systematic error in one or more of the model's fixed Q parameters (volumes, reactant concentrations). As well, some computer programs account for the unsuspected presence of carbonate impurities in alkaline reagents (from atmospheric CO2), although Gran plots can quantify this (see Martell and Motekaitis, 1992). Finally, the suspected but undetected precipitation of metal hydroxides, which some computer programs can anticipate, can justify the exclusion of high-pH data on the basis of known Ksp values.
- ^ As an example of a justifiable change to fixed parameters, Martell and Motekaitis (1992) describe a situation where a poor fit led to the identification of an impurity in a reagent, which could then be corrected.
- ^ The Hamilton R-ratio test (Hamilton, 1964) compares competing chemical models refined while applying the same weighting scheme, i.e. the same W matrix, to the same data. If comparing models resulting from different weighting schemes, different experiments, different subsets of the data or different values of Q, whence different W matrices apply, a weight-equalized test can be used instead. See Potvin (1994).
- ^ Even when diverse data sets are available, not all are useful. In 1982, Braibanti et al. carried out an analysis of inter-laboratory practice and variance in assessing the same chemical equilibria (Braibanti, A.; F. Dallavalle, G. Mori and B. Veroni (1982). "Analysis of variance applied to determinations of equilibrium constants". Talanta: 725–731.), rejecting three of seven sets of results in arriving at 'global' averages.
- ^ Unless using faulty equipment, random errors in sampling and measurement will be averaged out in replicate experiments. If the sample preparation involves combining certain volumes of the same stock solutions, then the replicates will reduce the effects of random errors in those volumes, but not in the stock concentrations. Using separately prepared stock solutions will reduce the effects of errors in their preparation (in weights and volumes), but not in their compositions, unless separate batches of materials are used. Potvin (1994) describes a method of computing equilibrium constants from individual titration data sets that reduces the impact of systematic errors, beyond the effect of simple data weighting or averaging from repeated experiments.
- ^ From Potvin (1994).
- ^ See Potvin (1990b).
References
- Alcock, R. M.; F. R. Hartley and D. E. Rogers (1978). "A Damped Non-linear Least-squares Computer Program (DALSFEK) for the Evaluation of Equilibrium Constants from Spectrophotometric and Potentiometric Data". J. Chem. Soc. Dalton Trans.: 115–123.
- Hamilton, W. C. (1964). Statistics in Physical Science. Ronald Press, New York.
- Martell, A. E.; R. J. Motekaitis (1992). The Determination and Use of Stability Constants. Wiley-VCH.
- Motekaitis, R. J.; A. E. Martell (1982). "BEST - a new program for rigourous calculation of equilibrium parameters of complex multicomponent systems". Can. J. Chem. 60: 2403–2409.
- Potvin, P. G. (1990a). "Modelling complex solution equilibria. I. Fast, worry-free least-squares refinement of equilibrium constants". Can. J. Chem. 68: 2198–2207.
- Potvin, P. G. (1990b). "Modelling complex solution equilibria. II. Systems involving ligand substitutions and uncertainties in equilibrium constants derived from formation constants". Can. J. Chem. 68: 2208–2211.
- Potvin, P. G. (1994). "Modelling complex solution equilibria. III. Error-robust calculation of equilibrium constants from pH or potentiometric titration data". Anal. Chim. Acta 299: 43–57.
Response to proposals
The section on computational methods has been extensively revised in the spirit, if not the substance, of the proposals. The revision is intentionally as concise as possible. Error propagation has been virtually rewritten, and an article on the hat matrix has been created, to help keep this article from becoming excessively long. Petergans (talk) 11:17, 13 January 2008 (UTC)
- I'm surprised there wasn't already an article on the hat matrix. You might add (there) that it's useful for identifying outliers.
- A Google search for "hat matrix outliers" led me to http://elsa.berkeley.edu/sst/regression.html#StdResid and lots of other stuff. Pgpotvin (talk) 03:42, 18 January 2008 (UTC)
- I think, however, that it's not all that important here, as I believe you're confusing residual correlation, as you've described it, with autocorrelation, particularly with regard to its occurrence in regression. Whereas residuals yobs − f(x) are always functionally correlated through f (a change in any one x or yobs will trigger a compensatory change in all residuals through a change in f in order to minimize the sum of squares), the error terms e = yobs − f(x) (from the assumption of yobs = ytrue + e, with ytrue being unbiasedly estimated as f(x)) are meant to be independent of each other as well as of the variable, and to be random (in sign and magnitude), and the residuals are unbiased estimates of those errors. The basic assumption of least squares that the true errors will be independent and random therefore leads to the expectation that the residuals will be independent and random (in magnitude and sign). Autocorrelation is a violation of that assumption.
- Pgpotvin (talk) 13:21, 16 January 2008 (UTC)
Absorption spectra taken on a scanning spectrophotometer are usually highly autocorrelated. This arises because the signal is smoothed, irrespective of whether it is analogue or digital smoothing. To my knowledge no-one has ever taken this into account in fitting applications. I once tried measuring the correlation coefficients between data at adjacent wavelengths by processing a large number of replicate scans and they came out close to 1! Data from a diode array instrument are not correlated.
- I'm not sure what you did here, exactly. In any case, smoothing is a form of local averaging, as is a moving average. Moving averages are necessarily autocorrelated. Smoothing can also be viewed as a delayed response, and, as with time-series data, would be expected to show autocorrelation. Maybe you could test whether or not smoothed data give biased results by acquiring non-smoothed data, model-fitting it, then artificially applying the instrument's smoothing function on the same data and refitting. Or maybe you've done this already. Pgpotvin (talk) 03:42, 18 January 2008 (UTC)
A point not mentioned here is that if experimental error follows a normal distribution, then so should residuals (Mardia, Kent and Bibby, Multivariate analysis, Academic Press, 1979). However, I find that idea difficult to square with the fact that residuals are correlated. Petergans (talk) 10:27, 17 January 2008 (UTC)
- As I said, residuals are always related to each other through the fitting function. Change one residual, change them all. That is normal and unremarkable. With independent observations, independent errors, and with independence between the errors and the observations, as well as between the errors and the variables, the residuals (with an expectation value of zero) should be normal if the errors are normal, as the residuals are meant to estimate the errors. When they are autocorrelated, the residuals are no longer independent of the variables and no longer unbiasedly estimate the errors. Pgpotvin (talk) 03:42, 18 January 2008 (UTC)
Changes to 'Parameter errors and Correlations'
- "...
- If so, each weight should be the reciprocal of the variance of the corresponding observation. For example, the weight of the observation from any k'th sample or at any k'th titration point can be given by
:W_k = 1\Big/\left(\sigma^2(y_k) + \sum_j \left(\frac{\partial y_k^{calc}}{\partial Q_j}\right)^2 \sigma^2(Q_j)\right)
- where σ(yk) is the error in the k'th observable (usually constant for all samples or all titration points) and σ(Qj) is the error on any j'th experimental parameter Q. For instance, in a titration, the initial volume, the initial reactant concentrations, the titrant volume, the titrant concentration, etc. are all error sources contributing to the variance in ycalc, but not necessarily equally over all data. These error sources are assumed to be independent of the observable and of one another.
- ..."
There is no reason to include some error sources and not others. Pgpotvin (talk) 05:03, 26 January 2008 (UTC)
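For what it illustrates, the quoted weight expression is a one-line error propagation; a minimal sketch (assuming NumPy; the derivative vector would in practice come from the titration model):

```python
import numpy as np

def observation_weight(sigma_y, dy_dQ, sigma_Q):
    """W_k = 1 / (sigma^2(y_k) + sum_j (dy_k/dQ_j)^2 sigma^2(Q_j)),
    treating the observable error and all Q error sources as independent."""
    return 1.0 / (sigma_y**2 + np.sum((np.asarray(dy_dQ) * np.asarray(sigma_Q))**2))
```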
Changes to 'Distribution of residuals'
I object to:
- "At the minimum in U the system can be approximated to a linear one, the residuals in the case of unit weights are related to the observations by

- The symmetric, idempotent matrix
is known in the statistics literature as the hat matrix, :
. Thus, 
- and

- where I is an identity matrix and Mr and My are the variance-covariance matrices of the residuals and observations, respectively. This shows that even though the observations may be un-correlated, the residuals are always correlated."
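The quoted algebra is easy to verify numerically for the unit-weight case; a minimal sketch (assuming NumPy):

```python
import numpy as np

def residual_covariance(J, sigma=1.0):
    """M_r = sigma^2 (I - H) for unit-weight observations with M_y = sigma^2 I,
    where H = J (J^T J)^-1 J^T is the hat matrix; since I - H is symmetric and
    idempotent, (I - H) M_y (I - H) collapses to sigma^2 (I - H).  The
    off-diagonal elements show the correlation imposed on the residuals."""
    hat = J @ np.linalg.solve(J.T @ J, J.T)
    return sigma**2 * (np.eye(hat.shape[0]) - hat)
```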
The reason for my objection was made clear earlier in the discussion of correlation vs. autocorrelation: that the residuals correlate is totally expected and is not the source of autocorrelation, which is the phenomenon to which the figure alludes.
I also object to the concluding sentence:
- "This is about as good as it gets."
Since autocorrelation is probably due to model imperfections and parameter errors, it would be misleading to suggest that statistics expects it, or that we should throw up our hands and live with it. Further, my 1994 Anal. Chim. Acta paper shows how one can generate a smaller degree of autocorrelation for a given set of data and a given model (i.e. without changing anything). Pgpotvin (talk) 05:19, 26 January 2008 (UTC)
- You have misunderstood me. I suppose that autocorrelation is absent from titration data as long as sufficient time elapses between successive measurements for the system to come to equilibrium. A good auto-titrator assures that this is so. My concern is based on the experience that residuals usually show systematic trends over and above what one might expect from correlation alone. These trends are the result of the presence of unquantifiable systematic error, be it in the measurements and/or the model. Just to give an example or two: how much error is due to the assumption of a constant activity quotient? What very minor species, such as ion-pairs involving the ionic medium, have been omitted from the model? That is why it is rare to see a better fit than the one illustrated. Petergans (talk) 10:56, 27 January 2008 (UTC)
Granted. However, I would temper the idea with something like:
"Keeping in mind that some systematic error sources that affect residual distributions (drifts in ionic strength, minor species, unexpected errors in the variables, etc.) are unquantifiable while others will remain unquantified, the quality of the fit illustrated here is about as good as one can get, conferring a high degree of confidence in the results."
Autocorrelation is often a time-dependent phenomenon in other contexts, true. As you say, an autotitrator will wait for the potential to stabilize before recording its value. Setting aside the time lag in the potential, there should be no time dependence in this context, but autocorrelation does not require a time dependence. There is nothing to which autocorrelation adds "over and above". Autocorrelation is the presence of systematic (non-random) trends in the residuals. As I've written three times now, residual correlation as you describe it in the article is actually correlation of ycalc values and is completely unremarkable. It does not generate trends among the residuals. Simulated data sets make this clear. If it did, a perfect fit would never be possible.
Pgpotvin (talk) 21:20, 27 January 2008 (UTC)
Changes to 'Derived Constants'
- "When calculating the error on the stepwise constant, the fact that the cumulative constants are correlated must be taken into account. By error propagation
"
According to my second 1990 paper,
Pgpotvin (talk) 05:27, 26 January 2008 (UTC)
Over Two Weeks
It's been well over two weeks with no response from you. I hope nothing unpleasant has befallen you. Nevertheless, one more week and I will implement the above changes myself. Pgpotvin (talk) 03:49, 15 February 2008 (UTC)
- I have been busy with other things, mainly a big tidying-up operation on least squares, linear least squares, non-linear least squares, total least squares and Gauss-Newton algorithm, which might make it possible to streamline this article somewhat. In the meantime I've lost track of the changes proposed and implemented here. Basically I've done my bit and now it's up to you to edit as you see fit. All I will say is that you should keep in mind a broad perspective, by which I mean that I have generally avoided things specific to one technique only. Having said that, I put a link in total least squares to the expression for weight in potentiometric titration. Good luck! Petergans (talk) 20:17, 17 February 2008 (UTC)