Indian Journal of Dental Research

Year : 2011  |  Volume : 22  |  Issue : 2  |  Page : 285-290
A randomized clinical study to assess the reliability and reproducibility of "Sign Grading System"

Department of Periodontics, KBH, MGV Dental College and Hospital, Panchvati, Nasik, Maharashtra, India


Date of Submission: 27-Dec-2009
Date of Decision: 15-May-2010
Date of Acceptance: 10-Nov-2010
Date of Web Publication: 27-Aug-2011


Background: Signs such as +, ++ and +++ for mild, moderate and severe stains/calculus have been used effectively in India for more than four decades. However, there are no standardized criteria for grading and no data regarding how and when this system was introduced, although it became very popular throughout India and has been in use ever since.
Aims and Objectives: An attempt was made here to standardize the criteria on which the grades would be given and designate it as "Sign Grading System". Along with this, the objective of this paper was to evaluate whether this index/system satisfies all the requirements of an ideal index, particularly reliability and reproducibility.
Settings and Design: Inter-examiner and intra-examiner reliability and reproducibility of this index were assessed through a randomized clinical study. Patients were recruited from an institutional setting by random selection from the outpatient department.
Materials and Methods: One month of training was conducted before the actual start of the study. The clinical aspect of the study involved 3 investigators and 50 patients, of whom 45 were reassessed. All the data were kept blinded by a research assistant to reduce bias. Necessary measures were taken to reduce/eliminate the confounding variables that could have affected the outcome of this study. Cohen's kappa and Fleiss' kappa statistics were employed for statistical analysis.
Results and Conclusion: The index fulfills most of the ideal requirements of an index along with a high degree of reliability and reproducibility.

Keywords: Calculus index, dental calculus: observer variation, stains index, tooth discoloration/diagnosis

How to cite this article:
Agrawal AA. A randomized clinical study to assess the reliability and reproducibility of "Sign Grading System". Indian J Dent Res 2011;22:285-90

Prevention of disease has always been known to be the most perfect form of practice in the healing arts. It rests on knowledge of the disease etiology as well as an understanding of the occurrence and distribution of related factors and conditions. One of the major problems in studying dental disease and its factors is the development of a suitable and practicable method for recording and classifying the occurrence and severity of the disease. Measuring a disease in quantitative terms allows one to assess whether new methods of treatment are superior or inferior to the existing modes and whether preventive programs are accomplishing or failing their objectives. Quantitative measurements of a disease most commonly rely on "indices". [1]

An index has been defined as "a numerical value describing the relative status of a population on a graduated scale with definite upper and lower limits, which is designed to permit and facilitate comparison with other populations classified by the same criteria and methods". [1] Some of the indices proposed earlier are still useful because their use enables a comparison to be made with epidemiological findings of an earlier period. [2],[3],[4],[5],[6],[7] However, most of the earlier known indices do not fulfill the ideal requirements of an index, which should be simple, reproducible, sensitive, acceptable, economical and amenable to statistical evaluation, as proposed by Davies. [8]

The majority of indices [2],[3],[4],[5] for assessment of calculus consider only 4-6 mandibular anterior teeth. This area has the highest likelihood of calculus formation, since even a person with good manual dexterity would have difficulty in cleaning the lingual surface of the mandibular anterior region. Consequently, assessing overall oral hygiene based on the examination of mandibular anterior teeth only would give false positive results. Secondly, for the calculus surface severity index, detection of a 0.5-1 mm difference is clinically difficult. The calculus surface index [5] has only the scores 0 and 1; thus, this index is not sensitive enough to explain small changes between sites/patients.

The aim of this paper was to propose a simple, clear and sensitive index for assessment of stains and calculus to be known as the "Sign Grading System" (SGS). Along with this, the objective of this paper was to evaluate whether this index/system satisfies all the requirements of an ideal index, particularly reliability and reproducibility.


Scoring criteria

The oral cavity is divided into six sextants: 18-14, 13-23, 24-28, 38-34, 33-43 and 44-48. A typical examination starts with the maxillary right third molar and continues, sextant-wise, in a clockwise direction until all the teeth to be assessed have been examined. All surfaces of the tooth should be examined for each grade, and the highest grade found on any surface should be allocated to that particular tooth. This allows simpler data management and assessment of the overall condition of the patient than recording different grades for each tooth surface. Similarly, the grade of a sextant is the highest grade of any tooth within that sextant. In all classifications, "crown" means clinical crown and not anatomic crown.
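The roll-up described above (surface to tooth, tooth to sextant) can be sketched in Python as follows. This is only an illustrative sketch: the surface names, the function names and the use of "-" as the label for a deposit-free surface are assumptions made for the example, not part of the published criteria.

```python
# Signs in increasing order of severity; "-" (no deposit) is an assumed label.
GRADES = ["-", "+", "++", "+++"]

# FDI tooth numbers belonging to each of the six sextants named in the text.
SEXTANTS = {
    "18-14": [18, 17, 16, 15, 14],
    "13-23": [13, 12, 11, 21, 22, 23],
    "24-28": [24, 25, 26, 27, 28],
    "38-34": [38, 37, 36, 35, 34],
    "33-43": [33, 32, 31, 41, 42, 43],
    "44-48": [44, 45, 46, 47, 48],
}


def highest(grades):
    """Return the most severe sign in a non-empty collection of signs."""
    return max(grades, key=GRADES.index)


def tooth_grade(surface_grades):
    """Grade of a tooth = highest grade found on any of its surfaces."""
    return highest(surface_grades.values())


def sextant_grades(teeth):
    """Grade of a sextant = highest tooth grade within that sextant.

    `teeth` maps an FDI tooth number to a dict of {surface name: sign}.
    Sextants with no examined teeth are left out of the result.
    """
    result = {}
    for name, members in SEXTANTS.items():
        present = [tooth_grade(teeth[t]) for t in members if t in teeth]
        if present:
            result[name] = highest(present)
    return result


# Hypothetical patient record (surface names are illustrative only).
example = {
    16: {"buccal": "+", "lingual": "++"},
    11: {"labial": "-"},
    31: {"lingual": "+++", "labial": "+"},
    41: {"lingual": "+"},
}
print(sextant_grades(example))
```

Keeping the signs in an ordered list lets the comparison work directly on +, ++ and +++ without committing to numerical scores at the recording stage.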

Sign grading system for stains

Scoring requires adequate light and a mouth mirror. If optimal conditions and chairside assistance are provided and all teeth are to be examined, scoring according to this system requires approximately 4-5 minutes. Although this index does not differentiate between extrinsic and intrinsic stains, it is recommended that it be used preferably for extrinsic stains [Table 1].
Table 1: Scoring criteria for stains


Sign grading system for calculus

Dental calculus is defined as a mineralized bacterial plaque that forms on the surfaces of natural teeth and dental prostheses. [9] Scoring requires adequate light, mouth mirror and explorer. If optimal conditions and chairside assistance are provided and all teeth are to be examined, scoring according to this system requires approximately 5-6 minutes [Table 2].
Table 2: Scoring criteria for calculus


Materials and Methods

Examiner training

An experienced principal investigator (project coordinator), AA, trained each of the two investigators, RM and SA, in the basic methodology for clinical research and the correct procedure for examination of dental stains and dental calculus. A fourth person, SS, was included as a project assistant. Training was given for a period of 1 month before the actual start of the study. All three investigators and the project assistant attended five sessions during the 1-month training period. In the first session, there was a detailed discussion regarding the research plan, and the scoring criteria were proposed by the principal investigator. The role of each investigator and of the project assistant was defined, and any modifications required in the index system were analyzed. In the second training session, the principal investigator made an audiovisual presentation of 8-10 clinical cases. Index criteria were discussed in detail, step by step, along with a clinical photograph of each variable. This was followed by an explanation of the calculation of grades and how inferences are drawn. In the third training session, all three investigators simultaneously graded an audiovisual presentation of 16 different clinical cases. After the completion of grading, the investigators compared each other's grades and doubts were cleared where differences arose. In the fourth training session, all the investigators separately scored a similar presentation of 16 different clinical cases. On completion of the analysis, all the scores were compared and doubts were cleared by discussion. In the final training session, four patients were randomly selected from the outpatient department (OPD). All four cases were examined in groups and graded after discussion. More attention was given to grading calculus, since in the second and third sessions clinical photographs were used to assess stains only. After adequate confidence was gained, the in vivo study was started.
During the entire practice time, the principal investigator regularly reviewed the examination data, provided instructive feedback to the investigators in training and thereby provided guidelines for handling specific deviations from normality.

In vivo study

The project assistant randomly sampled 50 patients from the OPD. Patients were selected irrespective of age and sex, but were required to have at least 20 teeth and to give voluntary consent to be included in the research project. The project assistant explained the informed consent form to each patient in the patient's own language and obtained a signature if the patient was willing to be included in the project. To assess the reliability of the index, each investigator examined all 50 patients, scored stains and calculus individually and submitted the form to the project assistant. Every investigator re-examined 15 patients (30%) to assess the reproducibility of the index. To reduce/eliminate the halo effect, [10],[11] the project assistant decided which investigator would re-examine which patient. There was a gap of at least 30 minutes between two examinations of a particular patient by the same investigator to reduce/eliminate the memory effect. Participants were preferably re-examined on the same day to eliminate the Hawthorne effect, [12] as they would tend to clean their teeth knowing that a dentist would examine them. Re-examining the patients on the same day also eliminated the attrition/mortality effect on the results. After the study was complete, the entire database was collected by the project coordinator and then filtered and tabulated for statistical evaluation.

Statistical evaluation of grades

For field study

The highest grade found on any surface is allocated to that particular tooth (t), and the highest tooth grade within a sextant is allocated to that sextant (X). The signs are then converted to numbers: - as 0, + as 1, ++ as 2 and +++ as 3. Sum the numerical scores of all six sextants (sX) and divide by the number of sextants examined (n). The resultant score can be interpreted in the following manner:

For example, grades allotted for stains/calculus to individual sextants:

Grade = sX/6 = 9/6 = 1.5 = moderate stains/calculus = ++

For clinical research

As per the study requirement, either the sextant can be counted as above, or the grades allotted to each tooth are converted to numerical data (T). Then, sum all the numbers (sT) to be included in the study and divide by the total number of teeth examined (N).

For example, the grades for stains/calculus of 21 individual teeth are summed as follows:

sT = 1+2+0+3+1+3+2+2+1+0+2+2+3+1+0+0+2+1+2+1+0=29

Grade = sT/N = 29/21 = 1.38 = moderate stains/calculus = ++
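The two calculations above can be reproduced with a short Python sketch. The function name is an assumption made for illustration, and the six sextant signs are hypothetical values chosen only so that they sum to 9, since the individual sextant grades are not listed in the text. The sketch stops at the mean score and does not map it back to +, ++ or +++, because the interpretation ranges are not reproduced here.

```python
# Numerical value of each sign, as stated in the text.
SIGN_TO_SCORE = {"-": 0, "+": 1, "++": 2, "+++": 3}


def mean_score(scores):
    """Sum the numerical scores and divide by the number of units examined."""
    if not scores:
        raise ValueError("at least one sextant or tooth must be examined")
    return sum(scores) / len(scores)


# Field study: one sign per sextant (hypothetical signs that sum to 9).
sextant_signs = ["+", "++", "+", "+++", "++", "-"]
sextant_scores = [SIGN_TO_SCORE[s] for s in sextant_signs]
field_grade = mean_score(sextant_scores)  # sX / n

# Clinical research: the 21 tooth scores listed in the text.
tooth_scores = [1, 2, 0, 3, 1, 3, 2, 2, 1, 0, 2,
                2, 3, 1, 0, 0, 2, 1, 2, 1, 0]
clinical_grade = mean_score(tooth_scores)  # sT / N

print(f"field study:       {sum(sextant_scores)}/{len(sextant_scores)} = {field_grade:.2f}")
print(f"clinical research: {sum(tooth_scores)}/{len(tooth_scores)} = {clinical_grade:.2f}")
```

Dividing by the number of units actually examined, rather than a fixed 6 or 32, keeps the score comparable when sextants or teeth are missing or excluded.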

Consequently, the SGS may be used in large-scale epidemiological investigations as well as in the examination of smaller groups or within the dentition of the individual, as per the requirements of the researcher. For statistical evaluation, a simple Chi-square test can also be used to test the association between two sets of responses. However, it does not provide a quantitative measure of the degree of reliability between two sets of responses. Hence, Cohen's kappa and Fleiss' kappa statistics were applied.

Results

It is well recognized that accurate assessment of stain is difficult and factors such as lighting conditions and inter- and intra-examiner variability may influence the outcome of clinical trials. To help overcome these problems, training of assessors before the commencement of the study is very often recommended, and to assess the reproducibility both within the examiner and between examiners, various statistical methods can be employed. [7]

In the present study, we first used Cohen's kappa statistics [13],[14] to assess intra-examiner reproducibility by comparing the two scores of the same examiner, separately for stains and calculus [Table 3]. This was followed by assessment of inter-examiner reliability by comparing scores between investigators I and II, I and III, and II and III [Table 4]. Since Cohen's kappa can be used only for comparison between two sets of readings, Fleiss' kappa statistics [15],[16] were applied to assess inter-examiner reliability among all three investigators simultaneously [Table 5]. In all these tables, "Po" is the observed proportion of agreement and "Pe" is the proportion of agreement expected by chance. They are not the mean values of the observation table, but are calculated using specific formulas. [13],[14],[15],[16]
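As an illustration of how Po, Pe and kappa relate for two sets of readings, the following Python sketch implements the standard unweighted Cohen's kappa, kappa = (Po - Pe) / (1 - Pe). It is not the author's calculation: the function name and the example ratings are assumptions made up for the sketch, and the article does not state whether a weighted variant was used.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Return (Po, Pe, kappa) for two equally long lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both lists must be non-empty and of equal length")
    n = len(ratings_a)

    # Po: proportion of subjects on which the two readings agree.
    po = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Pe: agreement expected by chance, from each reading's category totals.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    pe = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    # Kappa is undefined when chance agreement is already complete.
    kappa = float("nan") if pe == 1 else (po - pe) / (1 - pe)
    return po, pe, kappa


# Hypothetical first and second readings (0-3) by one examiner on 10 patients.
first = [0, 1, 1, 2, 2, 3, 0, 1, 2, 3]
second = [0, 1, 2, 2, 2, 3, 0, 1, 1, 3]
po, pe, kappa = cohens_kappa(first, second)
print(f"Po = {po:.2f}, Pe = {pe:.2f}, kappa = {kappa:.2f}")
```

The same function serves both uses described above: passing two readings by one examiner gives intra-examiner reproducibility, and passing the readings of two different examiners gives pairwise inter-examiner reliability.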
Table 3: Intra-examiner reproducibility for stains and calculus
Table 4: Inter-examiner reliability by Cohen's kappa statistics
Table 5: Inter-examiner reliability of all three investigators by Fleiss' kappa statistics

Discussion

Dental indices can be considered the main tool of epidemiological studies of dental diseases, used to determine the incidence, prevalence and severity of the diseases, on the basis of which preventive programs are adopted for their control and prevention. Epidemiological indices are attempts to quantify clinical conditions on a graduated scale, thereby facilitating comparison among populations examined by the same criteria and methods. [1]

Any classification or grading system is introduced so that information can be recorded easily and accurately and can be communicated effectively among professionals around the world. Signs such as +, ++ and +++ for mild, moderate and severe stains/calculus, respectively, have been used effectively in India for more than four decades. Since there were no standardized criteria for grading, this system was never used in any research and hence was never mentioned in the literature. There are no data regarding how and when this system was introduced, but it became very popular throughout India and has been in use ever since. With this basic fact in mind, an attempt was made here to standardize the criteria on which the grades would be given and to call it the Sign Grading System (SGS). The index could be made more detailed according to various situations, locations in the oral cavity and purpose of use, i.e., whether for field study or for clinical research, but this would make it more complicated and difficult to apply. The very purpose of putting forth the classification would then be in doubt.

Unlike many other indices, this index gives special attention to conditions such as orthodontic brackets, fractured teeth, restoration margins or anatomic defects. These conditions favor deposits in spite of adequate oral hygiene care, and so, even if deposits associated with them were present in the middle or coronal third of the crown, they were graded as "+" only. However, there are some inherent problems in applying this index for grading internal stains. For internal stains due to fluorosis or hypocalcification, assessment of severity would also be important, for which destruction of the enamel surface should be taken into account. Apart from this, internal stains are often diffuse in nature, which may lead to significant subjective variation in grading. Another limitation is that the SGS measures the extent of stains and calculus but not their thickness. A lower grade may therefore be allotted to cases that have thicker deposits of limited extent. If third molars are not accessible or visible, they can be excluded from the analysis. Teeth with full metal crown coverage can be excluded from the assessment of stains.

The present index fulfills most of the ideal requirements such as clarity, objectivity, sensitivity, acceptability and suitability for statistical evaluation, in addition to being economical. Scoring was not time consuming, as it required approximately 4-5 minutes for assessing stains and 5-6 minutes for calculus. To judge the reliability of the index, all three investigators examined 50 patients, and 30% of patients were reassessed by each investigator to evaluate the reproducibility. All necessary measures were taken to reduce or eliminate the bias and confounding variables that could have affected the outcome of this study. The sample size of 50 patients in the present study is larger than the sample size of 10 patients in the study of Gadhia et al., [7] 26 patients in the study of Marks et al. [17] and 6 patients in the study of Dombret et al. [18] Conversely, the sample size was smaller than the 185 patients studied by MacPherson et al.; [6] however, reassessment in that study was done on 12.5% of the total sample, whereas it was done on 30% of patients by each investigator in the present study.

Strong agreement between raters is considered a necessary prerequisite for the effectiveness of any subjective procedure intended for diagnostic or classification purposes. Yet, it is well acknowledged that wide variability between raters is commonly observed. [13] To date, methods for modeling agreement data are categorized as summary statistics and model-based approaches. Summary statistics include Cohen's kappa statistics, Fleiss' statistics, the intraclass correlation coefficient, the concordance correlation coefficient and others. In particular, Cohen's kappa is very popular among biomedical professionals due to its appealingly simple calculation and interpretation. Typically, it is used in assessing the degree to which two raters, examining the same data, agree when it comes to assigning the data to categories. [14] Fleiss' kappa is a chance-corrected inter-rater agreement measure for three or more raters. Whereas Cohen's kappa assumes that the same two raters have rated a set of items, Fleiss' kappa specifically assumes that although there is a fixed number of raters (e.g., three), different items may be rated by different individuals. [15],[16]
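For the three-examiner case, the following Python sketch implements the standard Fleiss' kappa formula from a table of counts, where each row is one patient and each column is one grade. It is a minimal illustration under stated assumptions: the function names and the example ratings are made up for the sketch, and the article does not reproduce the formula it used.

```python
def to_counts(ratings_per_rater, categories=(0, 1, 2, 3)):
    """Turn one list of grades per rater into a subjects x categories table.

    counts[i][j] is the number of raters who gave subject i the grade
    categories[j].
    """
    return [[subject.count(c) for c in categories]
            for subject in zip(*ratings_per_rater)]


def fleiss_kappa(counts):
    """Return (Po, Pe, kappa) for a subjects x categories table of counts."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of ratings")

    # Po: mean, over subjects, of the proportion of agreeing rater pairs.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    po = sum(per_subject) / n_subjects

    # Pe: sum of squared overall proportions of ratings in each category.
    total = n_subjects * n_raters
    pe = sum((sum(row[j] for row in counts) / total) ** 2
             for j in range(len(counts[0])))

    # Kappa is undefined when chance agreement is already complete.
    kappa = float("nan") if pe == 1 else (po - pe) / (1 - pe)
    return po, pe, kappa


# Hypothetical grades (0-3) given by three examiners to four patients.
examiner_1 = [0, 1, 2, 3]
examiner_2 = [0, 1, 2, 3]
examiner_3 = [0, 1, 2, 2]
po, pe, kappa = fleiss_kappa(to_counts([examiner_1, examiner_2, examiner_3]))
print(f"Po = {po:.3f}, Pe = {pe:.3f}, kappa = {kappa:.3f}")
```

Because the calculation works only from per-patient counts, it never needs to know which examiner gave which grade, which is consistent with the assumption noted above that the raters of different items need not be the same individuals.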

Conclusions

The SGS proposed here is a simple, clear, reliable and reproducible index for assessment of stains and calculus. It can have application in field studies and clinical research. Comparison of its scoring method with that of other indices would be necessary to document the potential of the index system. Disease and its manifestations have concrete meaning for health professionals and providers, planners and financial intermediaries. This index represents a basic step in rationalizing the measurement of dental care. As more experience is gained with the index, refinements will be necessary; but even at this early stage, the outcome measure meets an important need of the dental community. The reproducibility is good, provided the examiner's knowledge of periodontal biology and pathology is optimal.

Acknowledgments

I am grateful to supporting investigators Dr. Rakesh Mutha (RM), Dr. Shumaila Ansari (SA) and Dr. Shahabe Saquib (SS) for their valuable contribution. I would also like to thank my wife Dr. Mrs. Pooja Agrawal for all the statistical calculations. I express my sincere gratitude toward the HOD, Dr. Nitin Dani, and Principal, Dr. Ajay Bhoosreddy, for granting me the permission for carrying out the study in the Department of Periodontics and Implantology, KBH-MGV Dental College and Hospital, Nasik, Maharashtra.

References

1. Peter S. Indices in Dental Epidemiology. In: Peter S, editor. Preventive and Community Dentistry. 3rd ed. New Delhi: Arya (Medi) Publishing House; 2006. p. 123-4.
2. Greene JC, Vermillion JR. The simplified Oral Hygiene Index. J Am Dent Assoc 1964;68:7-13.
3. Muhlemann HR, Villa PR. The marginal line calculus index. Helvetica Odontologia Acta 1967;11:175-9.
4. Volpe AR, Manhold JH, Hazen SP. In vivo calculus assessment: Part 1. A method and its examiner reproducibility. J Periodontol 1965;36:292-8.
5. Ennever J, Sturzenberger CP, Radke AW. Calculus surface index for scoring clinical calculus studies. J Periodontol 1961;32:54-7.
6. Macpherson LM, Stephen KW, Joiner A, Schafer F, Huntington E. Comparison of a conventional and modified tooth stain index. J Clin Periodontol 2000;27:854-9.
7. Gadhia K, Shah R, Swaminathan D, Wetton S, Moran J. Development of a stain shade guide to aid the measurement of extrinsic dental stain. Int J Dent Hygiene 2006;4:98-103.
8. Davies GN. The different requirements of periodontal indices for prevalence studies and clinical trials. Int Dent J 1968;18:560-70.
9. Hinrichs JE. The role of dental calculus and other predisposing factors. In: Newman MG, Takei HH, Carranza FA, editors. Clinical Periodontology. 9th ed. Philadelphia: W.B. Saunders Co.; 2003. p. 182.
10. Thorndike EL. A constant error in psychological ratings. J Appl Psychol 1920;4:25-9.
11. McKinstry B, Cameron H, Elton RA, Riley SC. Leniency and halo effects in marking undergraduate short research projects. BMC Med Educ 2004;4:28.
12. McCarney R, Warner J, Iliffe S, van Haselen R, Griffin M, Fisher P. The Hawthorne effect: A randomised, controlled trial. BMC Med Res Methodol 2007;7:30.
13. Bloch DA, Kraemer HC. 2 x 2 kappa coefficients: Measurements of agreement and association. Biometrics 1989;45:269-87.
14. Klar N, Lipsitz SR, Ibrahim JG. An estimating equations approach for modelling kappa. Biom J 2000;1:45-58.
15. Sim J, Wright CC. The kappa statistic in reliability studies: Use, interpretation and sample size requirements. Phys Ther 2005;85:257-68.
16. Fleiss' kappa. Available from:''_kappa. [Last accessed on 2010 Apr 17].
17. Marks RG, Magnusson I, Taylor M, Clouser B, Maruniak J, Clark WB. Evaluation of reliability and reproducibility of dental indices. J Clin Periodontol 1993;20:54-8.
18. Dombret B, Matthijs S, Sabzevar M. Interexaminer reproducibility of ordinal and interval scaled plaque indices. J Clin Periodontol 2003;30:630-5.

Correspondence Address:
Amit A Agrawal
Department of Periodontics, KBH, MGV Dental College and Hospital, Panchvati, Nasik, Maharashtra

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/0970-9290.84305
