Abstract
Standardized statistical and graphical methods for analysis of limiting dilution assays are highly desirable to enable investigators to compare and interpret results and conclusions with greater accuracy and precision. According to these requirements, we present in this work a powerful statistical slope test that estimates the fit of the single-hit Poisson model to limiting dilution experiments. This method is readily amenable to a graphical representation. This slope test is obtained by modeling limiting dilution data according to a linear log-log regression model, which is a generalized linear model specially designed for modeling binary data. The result of the statistical slope test can then be graphed to visualize whether the data are compatible or not with the single-hit Poisson model. We demonstrate this statistical test and its graphical representation by using two examples: a real limiting dilution experiment evaluating the growth frequency of IL-2-responsive tumor-infiltrating T cells in a malignant lymph node involved by a B cell non-Hodgkin’s lymphoma, and a simulation of a limiting dilution assay corresponding to a theoretical non-single-hit Poisson model, suppressor two-target Poisson model.
Limiting dilution analysis has gained widespread acceptance as a tool for quantifying the frequency of cells in the immune system that possess various functional activities (1). To analyze the data, it is usually assumed that only one limiting cell of only one cell subset is necessary and sufficient for generating a positive response (the single-hit Poisson model (SHPM)2. However, the immune system is a complex network of cellular and humoral (cytokines) interactions. It is thus not unreasonable to consider that experimental data do not adhere to all-or-none functionality and will yield nonlinear titration curves (1, 2). In the literature, deviation from single-hit kinetics is usually estimated by a standard χ2 test. Previously, we demonstrated that the standard χ2 test was insufficient for estimating the goodness-of-fit to the SHPM due to its intrinsic lack of power (3). Subsequently, we presented evidence that modeling limiting dilution assays by a log-log generalized linear model results in a statistical slope test that is better able to account for deviation from linearity than the standard χ2 test (4). In the present paper we show that, in addition to its ability to accurately detect departure of the experimental data from the SHPM, this slope test is readily amenable to a graphical representation.
In recent years, the continuing development of more complex models giving mathematical representation of biological data, in parallel with the tremendous improvements in computing power, have led to the emergence of even more complex statistical methodologies. In this context, generalized linear theory-based statistical tests represent a major advance in evaluating the fit of the SHPM to limiting dilution analysis. However, new methods are rarely used in practice, and a major challenge for research statisticians is to explain to immunologists the benefit that they can draw from these new methods. In line with this problem, it must be emphasized that graphing can make the result of a statistical investigation clearer to the reader and more attractive than if the result has been presented merely in a table. In this way, we present an understandable graphical illustration of a generalized linear model-based statistical slope test with the aim of encouraging investigators to use a new statistical method for their limiting dilution data.
The generalized linear model-based statistical procedure, accompanied by its graphical representation, is illustrated in this paper with two examples. The first set of data is derived from a real experiment assessing the growth frequency of IL-2-responsive tumor-infiltrating lymphocyte T (TIL-T) in a malignant lymph node involved by a low-grade B cell non-Hodgkin’s lymphoma. The second set of data is artificially generated and corresponds to a theoretical non-SHPM, suppressor two-target Poisson model (STTPM). Specifically, in this second example, the total cell population contains a suppressor cell subset exerting a suppressor activity on the growth of a second cell subset (3). A possible physiologic illustration of this situation is given by the functional relationship which has been described between CD4+ virgin and memory T cells, where memory cells produce IL-10, which in return inhibits the growth of virgin cells (2, 5).
Materials and Methods
Abbreviations used in the text, f, k, i, n, x, r, m, and μ, are defined as follows: f is the estimate of the cell frequency according to the SHPM hypothesis; k is the number of groups of replicate wells, each group labeled i, with i = 1, 2, 3, … , k. At each i, ni is the number of replicate wells, xi is the number of cells plated in each replicate well, ri is the number of observed negative wells, mi is the observed fraction of negative wells, mi = ri/ni, μi is the theoretical fraction of negative wells.
Generalized linear model fitting the SHPM
The first step consists of modeling limiting dilution data according to a generalized linear log-log model fitting the SHPM and checking this model by an appropriate slope test (4). According to the zero term of the Poisson distribution, the theoretical fraction μi of negative wells is given by μi = exp(−fxi).
This equation can be written according to a log-log linear model as log(−log(μi)) = log(f) + log(xi). This equation is now in the form Yi = α + βXi with Yi = log(−log(μi)), α = log(f), β = 1, and Xi = log(xi).
A test of deviation from the SHPM (model checking) is a test of whether the estimate of the slope β is significantly different from 1. The appropriate statistical test z uses the normal deviates under the null hypothesis that β = 1. The hypothesis that the slope β is compatible with 1 at the 95% confidence level can also be expressed as
This statement can be rewritten as β − 1.96 SE (β) < 1 < β + 1.96 SE (β)showing that the value, 1, is included in the 95% confidence interval of the slope β if the SHPM hypothesis holds.
Maximum likelihood estimates of the parameters α and β were obtained with the Fisher’s method of scoring according to an iteratively reweighted least-squares (IRLS) procedure written for binary data (4, 6, 7). Maximum likelihood estimates satisfy the equation XTWXβ = XTWZ with X, the matrix with elements Xi = log (xi); W, the diagonal matrix of weights with elements where Vi is the variance of μi
and thus
Z is the matrix of dependent variables with elements
and thus
β is the matrix of the coefficients α and β. The revised estimateβ(m) solved iteratively by IRLS is β(m) = (XTWX)−1XTWZ where m is the mth iteration and β(m) is the matrix of the coefficients α(m)and β(m). The elements α and β of the matrix β(m) are given by
The estimates of the variances for α and β, var(α) and var(β), are given bythe diagonal elements of the variance-covariance matrix (XTWX)−1 evaluated with α(m) β(m). The variance of β is given by
The 95% confidence interval for β was calculated as
The main steps of the IRLS algorithm used for calculations of α, β, and var(β) are given in the Appendix.
Computation of the cell subset frequency according to the SHPM
The second step consists in computation of the cell subset frequency according to the SHPM by maximum likelihood estimation. Let f be the estimate of the cell frequency; the maximum likelihood of f is the value of f that maximizes where log(L) is the natural logarithm of the likelihood function L (8), and Pi is given by Pi = exp(−fxi) according to the SHPM.
The variance of f was calculated as the negative reciprocal of the second derivative of log(L) (8). The 95% confidence interval for f was calculated as 95% CI(f ) = f ± 1.96 SE (f ).
Derivation of the STPPM
For simulation experiments, it was assumed that the total CD4+ T cell population contains suppressor cells with frequency φ, demonstrating a growth suppressor activity, for instance, by the production of IL-10, on another population of CD4+ T cells capable to growth with frequency γ (3). According to the Poisson law, the fraction of wells not containing suppressor cells, but containing at least one proliferating T-lymphocyte precursor cell, is exp(−φxi)(1 − exp(−γxi)). Thus, the theoretical fraction μi of negative wells is μi = 1 − (exp(−φxi)(1 − exp(−γxi))). Calculations were made with φ = 0.08 and γ = 0.4. The theoretical number of negative wells was equal to 100 μi (100 replicate wells for each cell dose; ni = 100), and the experimental number of negative wells, ri, was taken at the nearest integer.
Results and Discussion
Table I⇓ presents the experimental and simulated limiting dilution data. Table II⇓ indicates the results of statistical analysis of the data and precursor frequency. In the real experiment conducted with TIL-T in one case of B cell lymphoma (9), the slope test z derived from the generalized linear log-log model did not reject the SHPM, and it was concluded that the SHPM adequately fits the observed data. In contrast, in the simulated experiment, conducted according to the STTPM, the statistics z clearly rejected the hypothesis that the data conformed to single-hit kinetics. This reflects the fact that the slope β was not compatible with 1 (95% confidence interval for the slope β ranged from 0.598 to 0.936).
Data of limiting dilution assaysa
Statistical analysis of limiting dilution data
Fig. 1⇓A and Fig. 2⇓ are graphical representations of the slope test z for both examples. This graphical representation is designed to allow visualization of whether the slope β of the log-log regression line fitted to the experimental data is compatible or not with 1 (SHPM adequacy or inadequacy). The graph is constructed in the following way. The log-log transform of the proportion of negative wells is plotted against the logarithm of the number of T cells per well. First, the observed data points (Xi, log(−log (mi)) are plotted. Then the log-log regression line fitted to the experimental data (regression line with equation Yi = α + βXi; Fig. 1⇓A and Fig. 2⇓, blue plain line) is plotted. Next are plotted two regression lines with the same intercept α as the log-log regression line, but with slope β equal to either the upper or the lower value of the 95% confidence interval for β (boundary regression lines, Yi = α + βupperXi, Yi = α + βlowerXi; Fig. 1⇓A and Fig. 2⇓, blue dotted lines). Finally, a theoretical, so-called SHPM regression line, with the same intercept α, but with slope β equal to 1, is superimposed on the graph (Yi = α + Xi; Fig. 1⇓A and Fig. 2⇓, red line). If the theoretical SHPM regression line with slope β equal to 1 does not lie between the two boundary regression lines for βupper and βlower, then the slope β of the log-log regression line fitted to the experimental data is not compatible with 1, and thus the SHPM does not fit the data. Reciprocally, if the SHPM regression line with slope β equal to 1 lies inside the limits of the two boundary regression lines for βupper and βlower, then the slope β of the log-log regression line fitted to the experimental data is compatible with 1, and thus the SHPM fits the data. To construct the graphs presented in Fig. 1⇓A and Fig. 2⇓, Table III⇓ gives the numerical values of the pairs (Xi, log(−log(mi)) for plotting observed data points, and Tables IV⇓ and V⇓ show determination of the pairs (Xi, Yi) for drawing the four regression lines.
A and B, Graphical displays of limiting dilution data in a real experiment estimating the growth frequency of IL-2-responsive TIL-T in one case of B cell lymphoma. A, Graphical representation of the slope test z evaluating the adequacy of the SHPM to limiting dilution data. The log-log transform of the fraction of negative wells is plotted against the logarithm of the number of cells per well. The blue plain line represents the log-log regression line fitted to the experimental data according to the generalized linear log-log model (Yi = −3.081 + 0.909Xi). The blue dotted lines represent regression lines with the same intercept α as the fitted log-log regression line, but with slope β corresponding to the lower and upper values of the 95% confidence interval for β (Yi = −3.081 + 0.763Xi; Yi = −3.081 + 1.055Xi). The red line represents the theoretical SHPM regression line with the same intercept α, but with slope β equal to 1 (Yi = −3.081 + Xi). B, Plot of the fraction of negative wells as a function of the number of cells per well. The prediction equation of the regression line (plain line) is −log(μi) = 0.0339xi, where 0.0339 is the cell frequency f estimated by the maximum likelihood method according to the SHPM hypothesis. Upper and lower dotted lines are plotted by using upper and lower values of the 95% confidence interval of f. The boundary equations are −log(μi) = 0.0289xi (lower) and −log(μi) = 0.0391xi (upper).
Graphical representation of the slope test z evaluating the adequacy of the SHPM to limiting dilution data. Data were artificially generated according to a STTPM. The log-log transform of the fraction of negative wells is plotted against the logarithm of the number of cells per well. The blue plain line represents the log-log regression line fitted to the experimental data according to the generalized linear log-log model (Yi = −1.095 + 0.767Xi). The blue dotted lines represent the regression lines with the same intercept α as the fitted log-log regression line, but with slope β corresponding to the lower and upper values of the 95% confidence interval for β (Yi = −1.095 + 0.598Xi; Yi = −1.095 + 0.936Xi). The red line represents the theoretical SHPM regression line with the same intercept α, but with slope β equal to 1 (Yi = −1.095 + Xi).
Tabulated values of the pairs of data (Xi and Yi) used for the construction of the regression lines corresponding to the limiting dilution experiment performed with TIL-T in B-NHL (Fig. 1A⇑)a
Tabulated values of the pairs of data (Xi and Yi) used for the construction of the regression lines corresponding to limiting dilution data artificially generated from the theoretical suppressor two-target Poisson model (Fig. 2⇑)a
Fig. 1⇑A represents the graphical evaluation of the slope test z in the experiment evaluating the growth frequency of IL-2-responsive TIL-T in lymphoma. The equation of the log-log regression line fitted to the experimental data is Yi = −3.081 + 0.909Xi, and the boundary equations of the two regression lines corresponding to the lower and upper limits of the slope β are Yi = −3.081 +0.763Xi (lower value) and Yi = −3.081 + 1.055Xi (upper value). Obviously, the regression line with β equal to 1, Yi = −3.081 + Xi, lies inside the boundary regression lines for β = 0.763 and β = 1.055, thereby indicating that the experimental data are compatible with the SHPM. The graphical display of the slope test z in the STTPM is shown in Fig. 2⇑. The equation of the log-log regression line fitted to the experimental data is Yi = −1.095 + 0.767Xi, and the boundary equations of the two regression lines for the lower and upper limits of the slope β are Yi = −1.095 + 0.598Xi (lower value) and Yi = −1.095 + 0.936Xi (upper value). The regression line with β equal to 1, Yi = −1.095 + Xi, lies outside the boundary regression lines for β = 0.598 and β = 0.936, indicating that the data do not conform to the SHPM, thus precluding the computation of the cell frequency f according to single-hit kinetics.
Because the limiting dilution assay with TIL-T is compatible with the SHPM, results can be presented by the usual graphical display of limiting dilution data (semilog plot, Fig. 1⇑B). The fraction of negative wells on a logarithmic scale is plotted against the number of T cells per well. The prediction equation of the regression line is −log(μi) = 0.0339xi, where 0.0339 is the cell frequency f estimated by the maximum likelihood method according to the SHPM hypothesis. Lower and upper dotted lines in Fig. 1⇑B are plotted by using lower and upper values of the 95% confidence interval of f. The boundary equations are −log(μi) = 0.0289xi (lower), and −log(μi) = 0.0391xi (upper).
Based on these findings, we propose that immunologists adopt a standard two-graph method for presenting the results of limiting dilution data. The first graph should present the log-log transform of the results with the graphical representation of the slope test z designed to visualize whether the data are consistent or not with the SHPM. Four regression lines should be presented: the log-log regression line fitted to the experimental data according to the generalized linear log-log model, the two boundary regression lines for βupper and βlower, and the theoretical regression line with β equal to 1 (SHPM regression line). The graph should be accompanied by the equation of the log-log regression line fitted to the experimental data, the SE of β, the 95% confidence interval of β, and the numerical values of z and p(z) (Fig. 1⇑A and Fig. 2⇑).
If the SHPM hypothesis holds, the second graph should be the usual graphical representation of limiting dilution data (semilog plot), where the fraction of negative wells is plotted on a logarithmic scale against the number of cells per well. The prediction equation of the regression line is −log(μi) = fxi, with the frequency f estimated by the maximum likelihood method according to the SHPM hypothesis. On this graph should also be plotted the two regression lines based on the extreme values of the confidenceinterval for f. The graph should be accompanied by the numerical value of f (or its inverse), its 95% confidence interval, and the equation of the regression line (Fig. 1⇑B).
In conclusion, this proposition broadly resembles a two-graph standard presentation of data adopted a number of years ago by immunologists dealing with radioimmunoassays. The first graph eliminates nonspecific binding by demonstrating that the concentration of bound ligand (y-axis) increases nearly linearly with increasing free ligand (x-axis), then asymptotically approaches a plateau; this graph serves as a validation for transforming the data according to a second graph (the so-called Scatchard plot; bound/free vs bound) used to determine the number of binding sites and the ligand binding affinity. In the same manner, data from limiting dilution assays could adequately be summarized and validated according to the SHPM by routine representation using the two graphs presented in this report.
Appendix 1
Calculations of α, β, and var(β) according to the IRLS algorithm
A convenient feature of this algorithm is that it suggests a simple starting procedure to get the iteration under way. This consists of using the data themselves as the initial values of the dependent variable, Z0, and the weight, W0, and from these values are then calculated the first values of the parameter estimates, α0 and β0. These first values for α and β are in turn used to calculate the first value of the theoretical fraction of negative wells, μ1, which in turn is used to obtain the revised values of weight and dependent variable, Z1 and W1, which in turn are used to calculate the revised values of the parameter estimates, α1 and β1, and so on. The iteration is continued until the absolute change in the values of αm and βm in two successive cycles of the iterative process is <0.0001. m is the mth cycle of the iteration. The estimates of α and β, thus obtained, are denoted by α̂ and β̂, respectively. The corresponding fitted values of weights, W, are denoted by Ŵ.
First step: initiation of the algorithm and calculation of the starting values, W0, Z0, α0, and β0
For each group of replicate wells, i, 1) compute the initial value of the weight, Wi0 2) Compute the initial value of the dependent variable, Zi0
3) Compute the first values of the parameter estimates, α0 and β0, according to equations 4 and 5, using Wi0 and Zi0.
Second step, the iterative process for calculating the revised values of the parameter estimates, αm and βm
1) Compute the estimates of the theoretical fraction of negative wells, μim: μim = exp(−exp(αm − 1 + βm − 1Xi)). 2) Compute Wim according to equation 2 using μim. 3) Compute Zim according to equation 3 using μim and αm − 1, βm − 1. 4) Compute the revised values of the parameter estimates, αm and βm, according to equations 4 and 5, using Wim and Zim.
Repeat steps one to four until the absolute change in the values of αm and βm in two successive cycles of the iterative process is sufficiently small (<0.0001). The estimates of α and β are denoted by α̂ and β̂. The corresponding fitted values of weights are denoted by Ŵi.
Acknowledgments
We thank J. L. Martiel (Unité Mixte de Recherche, Centre National de la Recherche Scientifique 5525), J. J. Lawrence (Institut National de la Santé et de la Recherche Médicale Unité 309), and E. Deslandres for helpful discussions throughout the development of this work.
Footnotes
-
↵1 Address correspondence and reprint requests to Dr. Thierry Bonnefoix, Groupe de Recherche sur les Lymphomes, Institut Albert Bonniot, Rond-Point de la Chantourne, 38706 La Tronche, France. E-mail address: Thierry.Bonnefoix{at}ujf-grenoble.fr
-
↵2 Abbreviations used in this paper: SHPM, single-hit Poisson model; TIL-T, tumor-infiltrating lymphocyte T; STTPM, suppressor two-target Poisson model; IRLS, iteratively reweighted least-squares.
- Received February 8, 2001.
- Accepted September 18, 2001.
- Copyright © 2001 by The American Association of Immunologists