Indian Journal of Dental Research


ORIGINAL RESEARCH
Year : 2007  |  Volume : 18  |  Issue : 4  |  Page : 163-167
Use of the generalized linear models in data related to dental caries index

1 Department of Public Health Dentistry, SDM College of Dental Sciences and Hospital, Dharwad - 580 009, Karnataka, India
2 Department of Statistics, Bangalore University, Jnanabharati, Bangalore - 560 056, Karnataka, India


Date of Submission: 18-Jan-2007
Date of Decision: 28-Apr-2007
Date of Acceptance: 08-May-2007


The aim of this study is to encourage the application of generalized linear models (GLMs) in the analysis of the covariates of decayed, missing, and filled teeth (DMFT) index data, which are not necessarily normally distributed. GLMs can be fitted under many underlying distributions; here, the Poisson distribution with the log link function and the binomial distribution with the Logit and Probit link functions are considered. The Poisson model is used for modeling the DMFT index data, and the Logit and Probit models are employed to model the dichotomous outcome DMFT = 0 versus DMFT ≠ 0 (caries free/caries present). The data comprised 7188 subjects aged 18-30 years from the study on the oral health status of Karnataka state conducted by SDM College of Dental Sciences and Hospital, Dharwad, Karnataka, India. The Poisson model and the binomial models (Logit and Probit) yielded different results at the 5% level of significance (P < 0.05). The binomial models were a poor fit, whereas the Poisson model showed a good fit for the DMFT index data. Therefore, a suitable modeling approach for DMFT index data is to use a Poisson model for the DMFT response and a binomial model for the caries-free/caries-present outcome (DMFT = 0 and DMFT ≠ 0). These GLMs allow separate estimation of the covariates that influence the presence or absence of caries and those that influence the magnitude of caries.

Keywords: Binomial models, decayed, missing, and filled teeth index, generalized linear modeling, Poisson model

How to cite this article:
Javali S B, Pandit PV. Use of the generalized linear models in data related to dental caries index. Indian J Dent Res 2007;18:163-7

The decayed, missing, and filled teeth (DMFT) index is one of the most commonly used indices in epidemiological studies to measure the degree of caries experience of subjects with primary as well as permanent dentition. It is the simple count of the number of decayed, missing, and filled teeth, which represents the cumulative severity of the dental caries experience. In such studies, the mean DMFT is commonly quoted for the total sample and used as a measure to compare the caries experience between subgroups.

In earlier studies reported in the literature, it was generally assumed that DMFT index data fulfilled the normality assumption. Hence, multiple linear regression (MLR) models were commonly used to estimate the influence of covariates. Numerous studies have examined the influence of different covariates of dental caries experience, such as gender, age, sweet consumption, frequency of brushing, etc. In the vast majority of these studies, the caries data were analyzed by using traditional MLR techniques, which assume that dental caries indices are normally distributed. [1],[2],[3],[4],[5]

It has been observed that the worldwide prevalence of dental caries, especially in the developed countries, [6] has declined rapidly during the last 20 years. Thus, the DMFT index, used either to assess the prevalence or incidence of dental caries, has become highly positively skewed among subjects and adults. [7] These changes in the DMFT have had the effect of increasing the proportion of zeros in the distribution.

However, as the prevalence of dental caries declined and the proportion of zeros increased, various investigators have questioned the assumption of normality and have stressed that greater focus should be given to the caries-free component. [8] When assumptions regarding normality do not hold, common regression techniques cannot be applied to the study data. When the caries-free component is so large that the normality assumption is not applicable, there arises a need to describe the distribution of the DMFT index with models that are readily available in existing statistical software. Various investigators have proposed techniques to transform the data so that they more closely approximate normality. However, some statisticians maintain that a discrete index cannot strictly follow a normal distribution, either untransformed or in any transformed state. Nevertheless, numerous models have been proposed by various authors to describe the distribution of the DMFT index. Grainger and Reid [9] suggested that the negative binomial distribution is the best and most satisfactory model for dental caries; Turlot et al. [10] proposed a model based on a Poisson 'with zeros' distribution; and Fabien et al. [11] introduced GLMs with the Poisson distribution to compare caries indices.

In public dental health literature, very few studies can be found where the GLM has been applied to caries count data. [8],[11] The aim of the present study was to initiate the application of GLMs in analyzing the covariates of the DMFT index data that do not require the assumption of normality.

Materials and Methods

Study area

The study was carried out from December 2000 to December 2001 in all the districts of Karnataka, including Bangalore district. The state is situated on the western edge of the Deccan plateau on the west coast of South India. It is one of the largest and most populous states in India. Karnataka has an area of 191,791 km² and an approximate population of 44.9 million. The literacy rate is nearly 60%. The domestic production per capita is average when compared with the rest of the country.

Study population and sampling procedure

Adults aged 18-30 years from Karnataka state were included in the study, and a multistage cluster sampling procedure was used. Districts of the state were the primary sampling units; these were divided into talukas. A total of 42 talukas were selected randomly; they contained 41 urban and 117 rural units. From the selected urban and rural units, the required sample of 7188 subjects aged 18-30 years was obtained. The mean age of the subjects was 25.98 ± 5.45 years.

Clinical examination

Five well-qualified dentists, assisted by three recorders, examined all the subjects. The examinations were carried out at the subjects' homes using plane mouth mirrors, WHO periodontal probes, and natural/artificial light. The DMFT examinations were conducted following standardized and widely accepted criteria, as recommended by the WHO report on oral health. [12] Besides the oral health information, data on socio-demographic and other factors, i.e., age, gender, religion, occupation, location, and oral hygiene practices (OHP), were recorded by a structured personal interview. For a more detailed description of the procedure followed, refer to Oral Health Status, Karnataka state. [13] In each district, 20 subjects were examined twice by the same dentist to assess intra-examiner agreement. The kappa value for intra-examiner agreement on tooth status in all the districts ranged from 0.61 to 0.80.
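The kappa statistic quoted above can be illustrated with a minimal sketch. The tooth-status codes and the two rating lists below are hypothetical, and the function name `cohen_kappa` is introduced here only for illustration; the original survey does not describe the software used for this step.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Cohen's kappa for two ratings of the same items by one examiner."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Proportion of items on which the two examinations agree.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal frequencies.
    freq1, freq2 = Counter(first), Counter(second)
    expected = sum(freq1[c] * freq2[c] for c in freq1) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical tooth-status codes: S = sound, D = decayed, M = missing, F = filled.
exam_1 = ["S", "S", "D", "F", "S", "M", "D", "S", "S", "F"]
exam_2 = ["S", "S", "D", "F", "S", "M", "S", "S", "D", "F"]
kappa = cohen_kappa(exam_1, exam_2)
print(round(kappa, 3))
```

With these made-up ratings the observed agreement is 0.80 and the chance agreement is 0.34, so kappa falls inside the 0.61 to 0.80 band reported for the survey.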

Data analysis

The authors were interested in establishing the covariates of caries experience (DMFT index). For convenience in fitting GLMs, the DMFT index was treated as the response variable and the dummy variable 'female' was created to represent gender. To assess the influence of OHP, the subjects who cleaned their teeth without using brush/paste/powder were coded with a dummy variable. Similarly, all the other covariates were fitted as dummy variables, except for age (in years), which was fitted as a continuous variable.
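The coding scheme described above can be sketched as follows. The records, field names, and the helper `encode` are hypothetical and serve only to show how dummy variables, a mean-centered age (as mentioned in the Results), and the dichotomized caries outcome could be built.

```python
from statistics import mean

# Hypothetical records; field names and values are illustrative only.
subjects = [
    {"age": 19, "gender": "female", "cleans_with_brush": True, "dmft": 0},
    {"age": 24, "gender": "male", "cleans_with_brush": False, "dmft": 3},
    {"age": 30, "gender": "female", "cleans_with_brush": True, "dmft": 1},
    {"age": 27, "gender": "male", "cleans_with_brush": True, "dmft": 0},
]

mean_age = mean(s["age"] for s in subjects)

def encode(subject):
    """Turn one raw record into model-ready covariates and responses."""
    return {
        # Dummy variable: 1 for female, 0 for male.
        "female": 1 if subject["gender"] == "female" else 0,
        # Dummy variable: 1 if teeth are cleaned without brush/paste/powder.
        "no_brush": 0 if subject["cleans_with_brush"] else 1,
        # Age stays continuous, centered about the sample mean.
        "age_centered": subject["age"] - mean_age,
        # Count response for the Poisson model.
        "dmft": subject["dmft"],
        # Dichotomous response for the Logit and Probit models.
        "caries_present": 1 if subject["dmft"] > 0 else 0,
    }

design = [encode(s) for s in subjects]
```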

As can be seen in [Figure - 1] and [Table - 1], the distribution of the DMFT index is markedly skewed, with the majority of the subjects having a low score and only a minority a high score. About 52% (n = 3739) of the 18-30-year-old adults presented without any sign of caries experience. Therefore, before GLMs were applied, the Shapiro-Wilk test of normality was used to check whether the DMFT index data satisfied the assumption of normality; the data were found not to be normally distributed. Hence, the traditional MLR models were not suitable, whereas these characteristics fit various GLMs. First, we fitted GLMs with the Logit link function, $f(\pi) = \log\{\pi/(1-\pi)\}$, and the Probit link function, $f(\pi) = \Phi^{-1}(\pi)$ (where $\Phi$ is the standard normal cumulative distribution function), to the dichotomized response (DMFT = 0 and DMFT ≠ 0) on a set of covariates. Second, the Poisson model with the log link function, $\log(\mu_i) = x_i^{\prime}\beta$, i.e., $\eta = \log(\mu)$, where $\mu$ denotes the expected DMFT count, was applied to the set of covariates; it means that the DMFT index data are not considered as independent events. Here, the majority are of a discrete or categorical nature and are more likely to fulfill the underlying assumptions of the Logit and Probit link functions. The findings of the Logit and Probit models were compared with those of the Poisson model.
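The three link functions named above can be written out as a short sketch, together with their inverses, which map a linear predictor back to a probability or an expected count. The function names are illustrative, and writing μ for the Poisson mean is a notational assumption.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def logit(p):
    """Logit link: log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def probit(p):
    """Probit link: inverse standard normal CDF of p in (0, 1)."""
    return _STD_NORMAL.inv_cdf(p)

def log_link(mu):
    """Log link for a positive expected count mu."""
    return math.log(mu)

def inv_logit(eta):
    """Map a linear predictor back to a probability (Logit model)."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Map a linear predictor back to a probability (Probit model)."""
    return _STD_NORMAL.cdf(eta)

def inv_log(eta):
    """Map a linear predictor back to an expected count (Poisson model)."""
    return math.exp(eta)
```

For example, `inv_logit(logit(0.48))` and `inv_probit(probit(0.48))` both return approximately 0.48, which shows that the two binomial links differ only in the scale on which the linear predictor is expressed.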

Results

A test of normality was employed to see if the DMFT index data followed a Gaussian distribution. The Shapiro-Wilk statistic computed by the univariate procedures of STATISTICA 5.0 [14] and SPSS 11.0 [15] (P < 0.001) confirmed that the DMFT index data were not normally distributed. The graphic representation of the DMFT data supported these results, as shown in [Figure - 1],[Figure - 2]. The mean DMFT score of the study subjects was 1.4961, which is smaller than the standard deviation (2.2192); the variance (about 4.92) therefore exceeds the mean, a condition called overdispersion. More than 50% of the adults were free from dental caries and only 47.98% of the adults had dental caries. It is clear that the DMFT index data set did not support an underlying normal distribution.
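Using only the mean and standard deviation reported above, a variance-to-mean ratio gives a quick, informal check for overdispersion (under a Poisson distribution the variance equals the mean, so the ratio would be close to 1). The variable names are illustrative.

```python
# Summary statistics reported in the text.
mean_dmft = 1.4961
sd_dmft = 2.2192

# Under a Poisson distribution the variance equals the mean.
variance_dmft = sd_dmft ** 2
dispersion_ratio = variance_dmft / mean_dmft

print(round(variance_dmft, 4), round(dispersion_ratio, 2))
```

The ratio is roughly 3.3, i.e., the variance is more than three times the mean.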

The proportion of adults with dental caries is shown in the [Table - 1].

The prevalence, or proportion, of adults with dental caries according to the independent variables and the bivariate odds ratios obtained by simple logistic analysis are presented in [Table - 2]. To avoid the unscientific practice of applying common statistical techniques without regard to the type of data at hand, logistic regression (binomial GLM) was applied first, followed by the Poisson model using the DMFT index data as the response variable. Each covariate was fitted as a dummy variable, except for age (in years), which was fitted as a continuous variable centered about its mean. The parameter estimates refer to the additional effect of a particular covariate after adjusting for the constant and holding the other covariates fixed.
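A bivariate odds ratio of the kind reported in [Table - 2] can be computed from a 2 × 2 table, as in the sketch below. The cell counts are hypothetical (they are not taken from the study), and the Wald confidence interval is one common choice, assumed here for illustration.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2 x 2 table.

    a, b: exposed subjects with / without caries
    c, d: unexposed subjects with / without caries
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio(120, 80, 90, 110)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

In a simple logistic regression with a single dummy covariate, the exponentiated slope equals this cross-product ratio.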

The results of the GLMs, with parameter estimates, standard errors (SE), and t values, are summarized in [Table - 3]. Location is the only statistically significant and negatively associated covariate of dental caries in both binomial models. This is an expected result, because these two models are closely related. However, when applied to other data, the differences could be more substantial and lead to different inferences being made.

At this juncture, model selection was performed after a careful scrutiny of the DMFT frequency distribution. The DMFT index data range from 0 to 12. After removing caries-free subjects, the DMFT index ranges from 1 to 12, as presented in [Figure - 2]. Therefore, the Poisson model was applied; the results indicate that age, gender, and occupation were statistically significant and positively associated covariates of dental caries, whereas religion was the only covariate that was statistically significant and negatively associated with dental caries. These findings are very different from the results of the binomial models. The MLE values for the binomial models and the Poisson model are 795.5165, 795.5782, and 198.5353, respectively. This also indicates that modeling the data using the Poisson model is more appropriate than using the binomial models; that is, the binomial models were a poor fit, whereas the Poisson model showed a good fit for the DMFT index data.
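To make the Poisson model concrete, the following is a minimal sketch of a Newton-Raphson fit of log(μ) = b0 + b1·x for one covariate. It is not the procedure used in the study, which relied on STATISTICA and SPSS; the function name `fit_poisson` and the toy data are hypothetical.

```python
import math

def fit_poisson(x, y, iterations=50, tol=1e-10):
    """Newton-Raphson fit of log(mu) = b0 + b1 * x for count data y."""
    mean_y = sum(y) / len(y)
    if mean_y <= 0:
        raise ValueError("at least one count must be positive")
    b0, b1 = math.log(mean_y), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2 x 2, symmetric).
        i00 = sum(mu)
        i01 = sum(mi * xi for xi, mi in zip(x, mu))
        i11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information times score.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

# Hypothetical DMFT counts for two groups coded by a dummy variable.
x = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y = [0, 0, 1, 0, 2, 1, 1, 3, 0, 2, 4, 2]
b0, b1 = fit_poisson(x, y)
print(round(math.exp(b0), 4), round(math.exp(b1), 4))
```

With a single dummy covariate the fit has a closed form: exp(b0) is the mean count in the reference group and exp(b1) is the ratio of the two group means, which offers a simple check on the iteration.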

Discussion

Generally, data obtained by using dental caries indices are assumed to follow a normal distribution. To estimate the covariates of the data on dental caries, MLR models based on the normality assumption are usually employed. However, on using a test of normality, it can be clearly demonstrated that the data on dental caries is not normally distributed. The findings from the data on dental caries in adults indicate that the data is skewed and the distribution is closer to a binomial or a Poisson distribution [Figure - 1],[Figure - 2]. These distributions are more suitable to describe count data such as dental caries indices.

Therefore, keeping in mind the current trends of dental caries, multiple linear models are not appropriate for estimating the covariates of DMFT index data. Subsequently, questions arise regarding the appropriate statistical methods to estimate the covariates of such caries data. GLMs were therefore introduced for estimation of the covariates of caries index data. Lewsey et al. [8] felt that the use of GLMs was more appropriate for analyzing the distribution of caries data in children. The selection of the GLM is a difficult issue. However, on examination of the sample DMFT distribution, investigators should be able to make a good estimate of which model is best suited to their requirements; for further validation, nonparametric goodness-of-fit tests should also be applied.

For consistent estimation, the influence of the covariates was measured with GLM procedures, i.e., Logit, Probit, and Poisson models, and the results were interpreted accordingly. It is important to note that the effects of the covariates were multiplicative in the Poisson model, whereas they were additive in nature in the other models, including the normal models. However, in Logit and Probit models, it is common practice to estimate odds ratios, as they help to interpret the measure of association frequently used in dental epidemiological studies.
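The multiplicative interpretation can be made explicit with a short derivation for a single covariate $x$ with coefficient $\beta_1$ (the symbols $\beta_0$, $\beta_1$, $\mu$, and $\pi$ are notation assumed here for illustration). In every case the effect is additive on the scale of the link function; exponentiating gives a ratio of expected counts in the Poisson model and an odds ratio in the Logit model.

```latex
% Poisson model with log link: a one-unit increase in x multiplies the expected count.
\log \mu(x) = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
\frac{\mu(x+1)}{\mu(x)} = \frac{e^{\beta_0 + \beta_1 (x+1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1}

% Logit model: the same step multiplies the odds of caries, giving the odds ratio.
\log\frac{\pi(x)}{1-\pi(x)} = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
\frac{\pi(x+1)/\{1-\pi(x+1)\}}{\pi(x)/\{1-\pi(x)\}} = e^{\beta_1}

% Normal linear model (identity link): the effect is additive on the response itself.
E[y \mid x] = \beta_0 + \beta_1 x
\quad\Longrightarrow\quad
E[y \mid x+1] - E[y \mid x] = \beta_1
```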

In general, the findings of the present study indicate that the Poisson model could be adopted for DMFT, and Logit and Probit models could be adopted for carious and caries-free subjects. These estimations allow us to separate those factors which influence the actual presence or absence of caries and those factors which influence the magnitude of caries.

In GLM procedures, the link function should be chosen carefully for more accurate estimation of the covariates of dental caries. Considerable differences were observed between the results obtained using the Poisson distribution and the binomial distribution. Moreover, within the same probability distribution, the choice of the link function is very important, as it depends on the nature of the clinical data.

Conclusion

The comparison of the results estimated by the use of the Poisson distribution and binomial (Logit and Probit) distribution displayed some differences. These differences could lead to incorrect interpretations, affecting the general results of the studies. Therefore, the appropriate GLMs are to be chosen when comparing the covariates of the data on caries indices.

Acknowledgements

The authors thank Dr. Bhasker Rao, Principal, and Dr. K. V. V. Prasad, Head, and the other staff of the Department of Public Health Dentistry, SDM College of Dental Sciences and Hospital, Dharwad, Karnataka, India, for providing the dental caries data used in this paper.

References

1. Dummer PM, Oliver SJ, Hicks R, Kingdon A, Kingdon R, Add M, et al. Factors influencing the caries experience of a group of children at the ages of 11-12 and 15-16 years: Results from an ongoing epidemiological survey. J Dent 1990;18:37-48.
2. Angellio IF, Romano F, Fortunato L, Montanazo D. Procedure of dental caries and enamel defects in children living in areas with different water fluoride concentration. Commun Dent Health 1990;29:424-34.
3. Venobbergen J, Martens L, Lesaffre E, Bogaerts K, Decleack D. Assessing risk indicators for dental caries in primary dentition. Community Dent Oral Epidemiol 2001;29:424-34.
4. Javali SB, Prasad KVV, Tippeswamy V. Determinants of dental caries experience. Indian J Dent Res 2001;12:230-3.
5. Javali SB, Pandit PV. Statistical analysis of data from some determinants of dental caries experience. J Pierre Fauchard Acad 2004;18:59-66.
6. Downer MC. The changing pattern of dental disease over 50 years. Br Dent J 1998;185:36-41.
7. Spencer AJ. Skewed Distributions-new outcome measures. Community Dent Oral Epidemiol 1997;25:52-9.
8. Lewsey JD, Gilthorpe MS, Bulman JS, Bedi R. Is modeling dental caries a normal thing to do? Community Dent Health 2000;17:212-7.
9. Grainger RM, Reid DB. Distribution of dental caries in children. J Dent Res 1954;33:613-23.
10. Turlot JC, Cahen PM, Frank RM. Longitudinal study of the evolution of the frequency of dental caries in a school milieu: A statistical model. Rev Epidemiol Sante Publique 1984;32:398-407.
11. Fabien V, Anne-Matrie OM, Guy H, Pierre-michal C. Use of the generalized linear model with Poisson distribution to compare caries indices. Community Dent Oral Epidemiol 1999;16:93-6.
12. World Health Organization: Oral health surveys. Basic Methods. WHO: Geneva; 1997.
13. Prasad KVV, Thanveer, Joseph J. Oral Health Status Karnataka State; An Epidemiological Survey, MDS, Dissertation, Rajiv Gandhi University of Health Sciences, Bangalore, 1999-2000.
14. Statistical Package: STATISTICA-5.0 version: Tulsa, OK 74104, Stat Soft INC, U. S. A. 1995.
15. Statistical Package for the social sciences: SPSS 11.0.1 users guide, Chicago: SPSS Inc. 2001.

Correspondence Address:
S B Javali
Department of Public Health Dentistry, SDM College of Dental Sciences and Hospital, Dharwad - 580 009, Karnataka

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/0970-9290.35825



  [Figure - 1], [Figure - 2]

  [Table - 1], [Table - 2], [Table - 3]

This article has been cited by
1 Bayesian Analysis of the Association between Family-Level Factors and Siblings’ Dental Caries
A. Wen, R.J. Weyant, D.W. McNeil, R.J. Crout, K. Neiswanger, M.L. Marazita, B. Foxman
JDR Clinical & Translational Research. 2017; 2(3): 278
2 Relationship between oral health literacy and oral health behaviors and clinical status in Japanese adults
Masayuki Ueno, Susumu Takeuchi, Akiko Oshiro, Yoko Kawaguchi
Journal of Dental Sciences. 2013; 8(2): 170-176
3 Risk indicators of oral health status among young adults aged 18 years analyzed by negative binomial regression
Hai-Xia Lu, May Chun Wong, Edward Chin Lo, Colman McGrath
BMC Oral Health. 2013; 13(1): 40
4 Gender differences in oral health in South Asia: Metadata imply multifactorial biological and cultural causes
John R. Lukacs
American Journal of Human Biology. 2011; 23(3): 398
5 Using universal patterns of caries for planning and evaluating dental care
Sheiham, A. and Sabbah, W.
Caries Research. 2010; 44(2): 141-150
6 Using zero inflated models to analyze dental caries with many zeroes
Javali, S.B. and Pandit, P.V.
Indian Journal of Dental Research. 2010; 21(4): 480-485
7 What statistical method should be used to evaluate risk factors associated with dmfs index? Evidence from the National Pathfinder Survey of 4-year-old Italian children
Giuliana Solinas, Guglielmo Campus, Carmelo Maida, Giovanni Sotgiu, Maria Grazia Cagetti, Emmanuel Lesaffre, Paolo Castiglia
Community Dentistry And Oral Epidemiology. 2009; 37(6): 539-546

