Educational Methodologies
Key words: self-assessment, student evaluation, self-directed learning
Submitted for publication 07/30/07; accepted 11/20/07
| Abstract |
|---|
Self-assessment is often advocated as a way to share educational responsibility and to develop patterns of reflective learning in students.8–10 The advantages of self-assessment are said to be especially valuable in formative learning by helping students direct efforts towards information and/or skills that they have not yet developed.9,11 There has been increasing interest in self-assessment because students should have a realistic sense of their own strengths and weaknesses and should be guided towards responsible self-directed learning.12,13 Although self-evaluation skills are integral to the lifelong learning process,14 self-assessment has not been a significant part of curricula in health care education.15
Most educators agree that self-assessment is potentially valuable; however, there has been a reluctance to embrace the use of this technique because of poor agreement between external measures of students' performance and students' self-assessments of performance.12,16 Generally, students do not accurately self-assess because they tend to evaluate their potential or historical levels of achievement, rather than actual performance.17,18 Investigators have shown that the highest-achieving students often self-assess with their potential in mind and, as a result, they often underrate themselves.3,17 Conversely, low-achieving students and those with the most confidence often overrate themselves.19 Self-assessment ratings are generally higher than peer ratings20 and correspond with a student's level of confidence rather than actual performance.21 Evans et al. concluded that students would prefer to have someone else (an expert) assess them rather than complete a self-assessment.22 Because self-assessment can be unsettling, it is difficult to create a school culture in which students are comfortable making judgments about their own performance.7
It is unclear if students can improve their ability to self-assess. Some investigators feel that self-assessment is a behavioral quality that remains consistent across both skill and knowledge tasks as well as over time.21,23 In a study of over 300 medical students, Fitzgerald et al. compared students' abilities to self-assess performance on a knowledge-based exam and a clinical exam.23 They concluded that students' self-assessment skills were consistent over a range of skills and tasks. In contrast to this study, other investigators have shown modest improvement in student/faculty agreement when the study design includes multiple self-assessments.5,8,24,25 In a study evaluating preclinical skills of dental students, Geissler noted student scores differed an average of 5 percent from faculty assessments but improved to a 3.3 percent difference after a four-month trial, indicating a modest improvement in the calibration of students to faculty with multiple self-assessment trials.5 Additionally, in a study of eighty-six dental students, Knight et al. showed recognition skills in preclinical exercises improved over a ten-week trial, indicating self-assessment skills can improve examination results.25 Unfortunately, there are few studies that have assessed the consistency of self-assessment over time or between trials.
Confounding the issue of student self-assessment skills is the issue of reliability among faculty ratings.21,26,27 Often, studies comparing student to faculty evaluations include methodological inconsistencies such as lack of reliability, students and faculty using different criteria or scales for assessment, or students rating their effort while faculty rate the product of that effort.21,27
It is difficult to compare self-assessment studies because investigators often differ in the way they define self-assessment. Often, they compare completely different populations and use different external measures to validate the results.21,26,27 Some investigators use self-assessment as a means to compare a student's skills against faculty criteria used as a benchmark,23 while others have measured self-assessment against various external standards.2,8 External standards of student performance have included knowledge-based written examinations, clinical decision making with objective structured clinical exams, and global ratings by faculty mentors.4,6,7 Most investigations have looked at student/faculty agreement as the basis for determining accuracy in self-assessment. Almost all self-assessment studies have used descriptive statistics, and many have used correlations to show bivariate association, but relatively few have used regression analysis to explain predictive variability. Additionally, most studies have used a cross-sectional design in which one self-assessment effort is compared to one external standard rather than measuring self-assessment over time.27
If we are interested in better understanding the relationship of self-assessment to student performance on examinations, it makes sense to measure at least two self-assessment efforts to determine change in self-assessment over time. It also makes sense to determine the relationships between changes in self-assessment scores over time and examination scores over the same period. For example, do students who improve most from the first to second self-assessment also improve the most on successive examinations? This could be evaluated by comparing faculty/student agreement over successive examinations. To the best of our knowledge, this type of comparative investigation has not previously been completed in dental education and is the rationale for our study. Our purpose was twofold: first, to determine if students who performed well on an initial preclinical examination were more accurate on initial self-assessments; and second, to determine if students who improved more from the first to second self-assessment also improved more on successive examinations.
An area of the curriculum where we felt self-assessment skills could be readily evaluated and developed in dental students was the preclinical technique courses in which psychomotor skills and clinical procedures are taught. Our goal was to complete consecutive preclinical prosthodontic assessments of students' abilities to assess their own work and determine relationships between student performance on examinations and self-assessment.
| Methods |
|---|
First and Second Examinations and Self-Assessment
The first examination required students to set maxillary denture teeth against a standardized mandibular cast mounted on an articulator (Figure 1). Students were rated on their ability to position denture teeth to an ideal patient setup simulating the clinical equivalent of a maxillary denture with bilateral balanced occlusion. Evaluation criteria included denture tooth position, arch form, midline positioning of teeth, contact in centric relation, and functional measures in working, nonworking, and protrusive movements (see Appendix). The criteria were familiar to students from previous instruction and a review of the evaluation format provided prior to the first examination. Students were advised that the self-assessment performance would count significantly towards their grade. Clear criteria and appropriate incentives were provided, as they have been shown to improve self-assessment accuracy.21,28
One faculty member graded all examinations. To establish a measure of reliability, the faculty member graded ten denture setups twice—five from each of the two examinations.
Data Analysis
The following data were collected:
For example, in the function section of the examination (Appendix), if a student considered there were working contacts on all seven denture teeth, he or she would circle all seven answers and have a self-assessment score of seven points. If the faculty felt that only five teeth had working contact, then the validated self-assessment score would be five points. As a second example, if a student thought there were working contacts on denture teeth one through five but not six or seven, and the faculty agreed, then the student self-assessment score would be five points while the validated self-assessment score would be seven, since there was agreement between the student and faculty member on all seven points.
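The two worked examples above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring procedure: the function name `self_assessment_scores` and the set-based representation of circled teeth are assumptions made here, and the first example assumes the faculty saw contact on teeth one through five (the text does not say which five).

```python
def self_assessment_scores(student_circled, faculty_circled, items=range(1, 8)):
    """Return (student_score, validated_score) for one checklist section.

    student_circled / faculty_circled: item numbers each rater marked as
    having contact. `items` is every item in the section (seven teeth in the
    working-contact example). All names here are hypothetical.
    """
    items = list(items)
    student = set(student_circled)
    faculty = set(faculty_circled)
    # Student self-assessment score: one point per item the student circled.
    student_score = sum(1 for item in items if item in student)
    # Validated score: one point per item on which student and faculty made
    # the same call, whether that call was "contact" or "no contact".
    validated_score = sum(
        1 for item in items if (item in student) == (item in faculty)
    )
    return student_score, validated_score


# First example: student circles all seven teeth, faculty sees contact on five.
print(self_assessment_scores(range(1, 8), range(1, 6)))  # (7, 5)
# Second example: student circles teeth one through five and faculty agrees.
print(self_assessment_scores(range(1, 6), range(1, 6)))  # (5, 7)
```

The sketch makes the distinction explicit: the student score counts what the student claimed, while the validated score counts agreement, so a student who correctly reports missing contacts is rewarded for it.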
We also calculated index scores that measured change for each of the three scores from the first examination to the second, as we anticipated that the changes might provide valuable criteria by which to compare student performance. The following indexes were calculated: 1) examination index: change in examination score from the first examination to the second; 2) student self-assessment index: change in the student self-assessment score from the first self-assessment to the second; and 3) validated self-assessment index: change in the validated self-assessment score from the first self-assessment to the second. The three indexes allowed comparisons of improvement, decline, or no change from the first examination to the second for the three evaluation measures. Because some students' scores decreased from the first examination to the second, resulting in negative differences, 100 was added to the differences, creating positive index values to facilitate statistical procedures such as multiple correlation and regression analysis.
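The index construction described above amounts to a difference plus a constant. A minimal sketch, assuming the change is computed as second score minus first score; the function name `change_index` and the example scores are hypothetical and are not taken from the study data.

```python
INDEX_OFFSET = 100  # constant added in the text so that indexes stay positive


def change_index(first_score, second_score, offset=INDEX_OFFSET):
    """Change from the first to the second measurement, shifted by `offset`."""
    return (second_score - first_score) + offset


# Hypothetical scores, for illustration only.
print(change_index(82, 90))  # improvement -> 108
print(change_index(90, 82))  # decline -> 92
print(change_index(85, 85))  # no change -> 100
```

Under this reading, an index above 100 indicates improvement, below 100 indicates decline, and exactly 100 indicates no change. Adding a constant shifts every value equally, so it does not alter correlations or regression slopes among the indexes.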
Descriptive measures for all variables were prepared to assess general characteristics of the data and to identify variation among students' scores on their first and second examinations, student self-assessment, and validated self-assessment scores. Paired t-tests comparing mean differences for each of the three sets of scores were used to determine if statistically significant performance differences existed.
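For readers unfamiliar with the paired t-test, the statistic can be sketched with the Python standard library alone. This is an illustration of the general technique, not the SPSS procedure the authors ran; the function name `paired_t` is an assumption, the scores are invented, and a p-value would still require a t distribution table or a statistics package.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired measurements.

    Assumes the paired differences are not all identical; otherwise the
    standard deviation is zero and the statistic is undefined.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented scores for four students on two examinations (illustration only).
t_stat, df = paired_t([80, 75, 90, 85], [82, 79, 89, 90])
print(f"t = {t_stat:.3f}, df = {df}")
```

Because each student serves as his or her own control, the test works on the per-student differences rather than on the two group means separately.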
Graphic and numeric measures of association were developed to understand general relationships between faculty scores and the two measures of self-assessment. Pearson's multiple correlation coefficients were used to evaluate individual bivariate associations among examination scores and measures of self-assessment. Scatter diagrams were also plotted to visually illustrate associations between examination and self-assessment scores.
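The bivariate association measured here is Pearson's r. A minimal standard-library sketch of the textbook formula follows; the function name `pearson_r` and the toy inputs are assumptions for illustration and do not reproduce the study data or the SPSS output.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy inputs: a perfectly increasing and a perfectly decreasing relationship.
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [3, 2, 1]))
```

Values near +1 or -1 indicate a strong linear association, while values near 0 indicate little linear association between, for example, examination scores and self-assessment scores.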
Regression modeling using SPSS software was used to predict the individual effects of student self-assessment and validated self-assessment scores on examination outcomes for both indexed and non-indexed values. Indexed and non-indexed regression models were also used to predict the individual effects of student self-assessment scores on validated self-assessment scores. This technique allowed us to measure the predictive power of both self-assessment instruments on individual examination performance and change in examination performance over time. Similarly, the indexed and non-indexed approach was also useful for predictions of student self-assessment scores with validated self-assessment scores individually and between examination periods. Nine models were formulated to evaluate these characteristics.
Regression models using indexed scores (examination, student self-assessment, validated self-assessment) were also developed to evaluate changes in student performance. Regression models evaluated change in examination scores as the dependent variable and change in student self-assessment and change in validated self-assessment as independent variables. Because the variables student self-assessment scores and validated self-assessment scores exhibited a high degree of multicollinearity, these independent variables could not be combined in a multiple regression model; therefore, all models are simple regressions.
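A simple (one-predictor) regression of the kind described above can be sketched as ordinary least squares. This is a generic illustration rather than the authors' SPSS models; the function name `simple_regression` and the toy data are assumptions.

```python
def simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared), where r_squared is the share of
    variance in y explained by x. Assumes x and y each vary.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Toy data lying exactly on y = 1 + 2x (illustration only).
print(simple_regression([1, 2, 3, 4], [3, 5, 7, 9]))
```

In the study's terms, `x` would be a self-assessment score or index and `y` an examination score or index; fitting each predictor separately avoids the multicollinearity problem noted above.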
| Results |
|---|
0.05, respectively. For all of the models, variance of the independent variables explained from 5.1 percent to 45.2 percent of the variability in the dependent variables (Table 3).
The predictive power of regression modeling improved significantly from the first examination to the second as demonstrated by both indexed and non-indexed models. For example, independent variables predicting change in scores for the three index models explained 14.2 percent, 16.3 percent, and 31.8 percent, respectively, of the variability of the three dependent variables, indicating that changes in self-assessment scores are valid predictors of change in the designated outcome variables over time.
Confirming the observation that the explanatory power of regression modeling increased over time, the predictions of the non-indexed models improved from the first examination to the second as well. As an example, for the non-indexed models, between the first and second examinations, the variation in examination scores explained by student self-assessments increased from 5.1 percent to 13.7 percent and for validated self-assessments from 11.6 percent to 18.4 percent. This trend of prediction improvement between the first and second examinations was also observed by the increase from 26.0 percent to 45.2 percent of the variance in validated self-assessment predicted by the variation in student self-assessment. That the regression modeling predictive results improved over time indicates that the self-assessment instruments gained in their ability to explain and predict changes in student outcomes.
| Discussion |
|---|
More importantly, although there was no statistically significant improvement in mean scores from the first examination period to the second, the association of examination scores with student self-assessment scores and with validated self-assessment scores strengthened between periods. The examination/student self-assessment correlation increased from r=0.225 to 0.370, the examination/validated self-assessment correlation from r=0.340 to 0.429, and the student self-assessment/validated self-assessment correlation from r=0.510 to 0.672 (Table 2; Figures 3 and 4).
These stronger associations in the second examination period indicate that student self-assessment skills became a more accurate predictor of exam performance and instructor-validated self-assessment. At the second examination, while group mean scores failed to improve significantly, individual students rearranged themselves about the mean, so that the association of competent self-assessors with exemplary exam performance and validation of self-assessment became an even stronger indicator of success. The converse held that poor self-assessment skills were more associated with poorer performance in the subsequent period.
Regression analysis showed the predictive value of student self-assessment improved from the first examination to the second (R² from 0.051 to 0.137). We noted similar improvement in predicting validated self-assessment (R² from 0.116 to 0.184). This improvement is consistent with Geissler, who found that student self-assessment scores became more closely aligned with faculty assessments over time.5 Geissler's study suggested that, with self-assessment experience, there was an increased willingness to award marks outside the middle range and a closer faculty/student agreement resulted. The finding by Geissler may help explain our findings that, although mean values did not increase, student/faculty agreement improved on successive examinations. Our findings are also consistent with Knight et al., who found that students who improved in recognition tasks also improved on examination scores and conversely that students who did not improve in recognition tasks did not improve on examination scores.25 Our findings, along with those of Geissler and Knight et al., support the concept that self-assessment practice can modestly improve self-assessment scores, which improves examination scores.
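Because every model in this study is a simple regression, each R² should equal the square of the corresponding Pearson correlation. The short check below pairs the correlations and explained-variance figures reported in this article and confirms they agree to rounding; the variable and function names are hypothetical, and the pairing of each r with each R² is inferred from the text rather than taken from the tables.

```python
# (label, reported r, reported R^2), values as quoted in the article text.
reported = [
    ("examination vs. student self-assessment, exam 1", 0.225, 0.051),
    ("examination vs. student self-assessment, exam 2", 0.370, 0.137),
    ("examination vs. validated self-assessment, exam 1", 0.340, 0.116),
    ("examination vs. validated self-assessment, exam 2", 0.429, 0.184),
    ("student vs. validated self-assessment, exam 1", 0.510, 0.260),
    ("student vs. validated self-assessment, exam 2", 0.672, 0.452),
]


def max_discrepancy(rows):
    """Largest absolute gap between r squared and the reported R^2."""
    return max(abs(r * r - r2) for _, r, r2 in rows)


for label, r, r2 in reported:
    print(f"{label}: r^2 = {r * r:.3f}, reported R^2 = {r2:.3f}")

print(f"largest gap: {max_discrepancy(reported):.4f}")
```

The agreement means the correlation results and the non-indexed regression results are two views of the same associations rather than independent lines of evidence.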
Our finding of consistent mean scores on successive student self-assessment efforts is similar to results of other investigators.10,24,29 In a study of 275 medical students followed over a two-year period, Sobral utilized a behavioral measure called the reflection-in-learning scale (RLS).10 He considered self-assessment measures to be a component of reflection-in-learning and showed that 70 percent of the students who had higher (upper quartile) RLS scores maintained them in a second appraisal.10 Fitzgerald et al. found a similar pattern when they followed over 300 medical students over a three-year period and measured consistency of self-assessment on classroom examinations and on objective structured clinical examinations (OSCEs).29 They determined that correlation of self-assessment to faculty assessment varied between r=0.46 and 0.69, depending on the task measured, and found that self-assessment scores remained steady over the three-year period. Our correlation results of examination scores with student self-assessment scores (r=0.225 and 0.370) on the first and second examinations, respectively, were lower than those of Fitzgerald et al., while our results for examination scores with validated self-assessment scores (r=0.340 and 0.429) for the first and second examinations, respectively, were also slightly lower. Our findings of validated self-assessment scores were consistent with Falchikov, who completed a meta-analysis of forty-four studies on self-assessment in higher education and calculated an average correlation of r=0.39.30
The index regression shows that students who improved in their ability to complete their self-assessment over successive examinations also improved on the self-assessment scores validated by their instructors. The model predicted 32 percent of the variation in scores on the validated self-assessment index with the variance in scores on the student self-assessment index. The student self-assessment score is the student's perception of correct answers, while the validated self-assessment score reflects agreement between student and faculty about correct and incorrect responses, providing an external validation of the student's self-assessment. It is not surprising that, when students improve in self-assessment accuracy, they will improve in their agreement with faculty assessments.
Our findings showed 16.3 percent of the variation in the examination index was explained by variation in the student self-assessment index (Table 3). This means, in essence, that change on successive student self-assessments helps explain change in successive examination scores. Although being able to predict 16.3 percent of the variation in the examination score may not seem very predictive, it actually explains more, for example, than Scholastic Aptitude Test scores predict first-year college grade point average (GPA) (8.4 percent) or how much high school GPA predicts first year of college GPA (12.6 percent).31 Therefore, the explanatory power of our model to predict examination results with student self-assessment scores is comparable to or stronger than commonly used predictors of academic performance.
We need to ask how our results might be applied to anecdotal constructs from clinical teaching. When reviewing a clinical procedure completed by a student, faculty will identify areas that might be improved. Some students will understand and positively respond to faculty feedback, while other students will not learn from faculty feedback, resulting in a frustrating experience for both parties. We believe that students who are able to see what the faculty members see and thus recognize both their correct and incorrect actions in completing a clinical procedure may be the students with higher self-assessment scores. Obviously, this anecdotal construct needs to be tested, and this is an area we plan to explore.
A potential reason self-assessments are not used more often in dental education may be the dichotomy between societal pressures and educational needs. On one hand, we live in a society in which mistakes are often perceived as a sign of weakness. Students are naturally reluctant to acknowledge their mistakes, and this can stifle learning.32 On the other hand, reflective learning exercises, such as self-assessment, require students to develop skills in recognizing, understanding, and learning from their mistakes. Resolving this incongruity requires nurturing an environment in which mistakes are openly shared, discussed, and accepted as part of the total learning experience.
Most clinical skills in dentistry and medicine are taught in an apprentice style, and clinical education is primarily based on experiential learning.9 When students don't learn from faculty mentors or peers, they often learn from their mistakes. Creating an environment in which students are encouraged to understand that their mistakes are an integral part of the educational process is a challenge, but it could lead to improved self-assessment skills, which the current study suggests could result in improved learning.
| Conclusions |
|---|
| APPENDIX |
|---|
Directions to Complete Self-Assessment Sheet:
Please circle the appropriate answer that best describes your denture set-up. For example, if you feel your set-up has excessive horizontal overlap, circle excessive and you will get credit for a correct answer. If you circle ideal horizontal overlap when your set-up had excessive horizontal overlap, you will receive no credit. In other words, if you correctly identify all your mistakes, you can receive a 100% score. (The illustrations on this form show an ideal set-up and were included for clarity!)
GENERAL TOOTH POSITION
(circle one answer in each of the three general categories)
Anterior Arch Form:
A. ideal B. too wide C. too narrow
Posterior Arch Form:
A. ideal B. too wide C. too narrow
Maxillary/Mandibular Midline:
A. ideal B. to right of mandibular midline C. to left of mandibular midline
CENTRIC RELATION: Circle all that apply in your denture set-up. In other words, if you feel the maxillary first and second molars were touching the mandibular dentition bilaterally but the two bicuspids bilaterally were not contacting the lower teeth, you would circle the 6 and 7 as shown in the following example (7 6 5 4 | 4 5 6 7). For a second example, if you felt the axial inclination of all 4 maxillary bicuspids had the correct axial inclination, you would circle the 4 and 5 as shown in the following example (5 4 | 4 5).
maxillary lingual cusp touching (circle if touching) 7 6 5 4 | 4 5 6 7
mandibular buccal cusp touching (circle if touching) 7 6 5 4 | 4 5 6 7
interdigitation (mesio/distal mesh) (circle where correct) 7 6 5 4 | 4 5 6 7
axial inclination of maxillary bicuspids (circle where correct inclination) 5 4 | 4 5
In this section please circle where you feel the overlap is ideal, too much, or too little. For example, if you feel the horizontal overlap of the upper right second molar is ideal, you would circle the I for ideal under #7. If you felt the upper right first molar had too much horizontal overlap, you would circle the + under #6. There is one correct answer for each tooth so you should be circling either I, + or – under each tooth. Follow the same instructions for the vertical overlap of the anterior teeth.
Horizontal overlap of Anterior and Posterior Teeth:
Vertical overlap of Anterior Teeth:
ANTERIOR AXIAL INCLINATION (axial inclination-as related to gingival placement of tooth) Please circle the response that best describes your anterior denture tooth set-up. For example, if the upper tooth #6 has an ideal axial inclination, circle I for ideal. If the upper tooth #6 is too distally inclined, circle D for tooth #6.
FUNCTIONS: In this section please circle the correct answer (s). For example, if in your denture set-up you see contacts on the maxillary right first and second molar in a right working movement but do not see contact on the bicuspids or anterior teeth you would circle 6 and 7, (7 6 5 4 3 2 1 | ). On the left side during a right working movement if you saw balancing contacts on the bicuspids only you would mark 4 and 5; for example (_|_4 5 6 7) .
right working contacts 7 6 5 4 3 2 1 | left balancing contacts _| 4 5 6 7 (pin off table yes no)
left working contacts _| 1 2 3 4 5 6 7 right balancing contacts 7 6 5 4 |_ (pin off table yes no)
protrusive contacts 7 6 5 4 3 2 1 | 1 2 3 4 5 6 7 (pin off table yes no)
| Acknowledgments |
|---|
| Author Information |
|---|
| REFERENCES |
|---|