JSET ejournal


Universal Design for Learning
Associate Editor Column
David Rose

With Bob Dolan, Guest Columnist


If you are a gardener and want to improve the yield of a tomato plant, you wouldn't wait until the end of the season and then measure how high the plant is. Instead, you would determine the number of tomatoes, not the plant height. Furthermore, you would do this assessment during the growing season, so that you could then use the information to help identify proper interventions to improve the plant's yield.

Educational assessment, too, can be used to inform student instruction, assuming it is used properly. However, this powerful educational tool is used improperly far too often. This column discusses some of the limitations of current assessment practice and how application of Universal Design for Learning (UDL) concepts can improve assessment accuracy and its applicability to instruction.

Purposes of Assessment
There are many reasons for conducting assessments. Assessments can be used to evaluate the performance of students, teachers, schools, school districts, and even materials and methods. For now, we will concentrate primarily on the assessment of student learning.

There are, in turn, several different purposes for conducting assessments of student learning. The school district, for example, may conduct assessments of students as a way of (a) evaluating teachers, (b) analyzing the effects of changing school practices, (c) making comparisons with other school districts, and so forth. Students may conduct their own self-assessments as a way of identifying comparative strengths and weaknesses as well as areas for the focus of study.

This article will concentrate on assessments that are designed to address directly the core issues of teaching and learning (i.e., assessments designed to monitor progress toward instructional goals, to revise and shape the course of instruction, to motivate achievement). However, the same principles provided here also hold for other uses of assessment.

Limitations in Assessment Accuracy
The value of an assessment as a means for informing instruction is a function of its accuracy. Accuracy, in turn, is a function of how well an assessment measures students' abilities vis-à-vis particular educational goals. An inaccurate assessment is one in which this information is confounded by other factors of student performance that are irrelevant to a particular learning goal.

Although there are remarkable individual differences among students in our classrooms, most existing methods of assessment have not been designed with these individual differences in mind. In fact, many assessments appear to have been designed under the assumption that learners are relatively homogeneous, and that the expected outcomes for all students are relatively the same. As a result, assessments are rarely free of confounding information, and thus rarely represent a truly accurate measure of student abilities.

Using the principles of Universal Design for Learning, we can examine the accuracy limitations of current assessment techniques in light of their capacity to provide multiple means of representation, expression, and engagement.

Limited Means of Representation. Consider Patrick, a student with dyslexia. When he is handed a standardized, printed science test, he is likely to experience an all too familiar dread. The page full of text is daunting. Decoding the questions printed in the test booklet may pose difficulties for him that are as challenging as those posed by the knowledge of science itself. In any case, with many questions to read and a strict time limit, Patrick's knowledge of science would likely be obscured by his problems in reading the test. Moreover, many objective tests like this are purposely constructed with complex syntax and other linguistic structures so that the items provide greater differentiation among students. This means that, quite independent of their knowledge of science, students like Patrick and those with other language-based disabilities are likely to achieve lower scores than those without language-based disabilities. Their scores will have little to do with science.

The problem with science tests such as this is that they do not measure only knowledge of science. They also measure the students' facility with the medium of print. For Patrick, who is not facile with printed text, delivering the assessment in print depresses his score regardless of his knowledge of science. The accuracy of the test is poor.

However, this particular test would not necessarily be more accurate even if the teacher had administered the test orally to the whole class. There is no single medium that would provide an unbiased vehicle for assessment. If we chose a different medium, like oral language, to represent the science problems, Patrick's scores might indeed be improved, but at the expense of some other students for whom another medium might better convey the test material.

Because there is an inevitable interaction between the representational demands of the medium and individual capacities of the students, for each student there will be some inadvertent effect of the medium in which the assessment occurs. For some students the effect is relatively negligible (or sometimes even positive). For others, like Patrick, the effect is large and negative. For any group of students, and particularly any group of diverse students, the fixed version of the test provides scores that are unreliable and distorted by the unseen weight of the medium. As a result, there is no single class-wide method for accurately assessing knowledge of science.
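
The interaction described above can be made concrete with a toy model. The sketch below assumes, purely for illustration, that an observed score is the sum of a student's content knowledge and a per-student effect of the test medium. The students other than Patrick, all of the numbers, and the function names `observed_score` and `ranking` are hypothetical and are not drawn from any study.

```python
# Toy model (an assumption for illustration, not a psychometric claim):
# observed score = content knowledge + effect of the medium on that student.

# Hypothetical content knowledge on a 0-100 scale.
knowledge = {"Patrick": 85, "Student B": 80, "Student C": 75}

# Hypothetical effect of each medium on each student's score.
medium_effect = {
    "print": {"Patrick": -30, "Student B": 0, "Student C": 3},
    "oral": {"Patrick": 0, "Student B": -10, "Student C": 0},
}


def observed_score(student: str, medium: str) -> int:
    """Return the score the test would report for a student in a given medium."""
    return knowledge[student] + medium_effect[medium][student]


def ranking(medium: str) -> list[str]:
    """Rank students from highest to lowest observed score in a given medium."""
    return sorted(knowledge, key=lambda s: observed_score(s, medium), reverse=True)


if __name__ == "__main__":
    for medium in medium_effect:
        print(medium, [(s, observed_score(s, medium)) for s in ranking(medium)])
```

With these invented numbers, neither the print ranking nor the oral ranking matches the ranking by content knowledge, which is the point of the paragraph above: any single fixed medium distorts the scores of some students.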

Limited Means of Expression. Consider now Billy, a student with a physical disability. If the same standardized paper-and-pencil science test is handed to him, he will fail it outright because of physical limitations. So would many other individuals with physical disabilities (e.g., Stephen Hawking, the best-selling author and pre-eminent physics professor who has ALS). In both cases, performance would reflect not their knowledge of science but merely their inability to master the means of expression required by the paper-and-pencil test. While Billy's case is extreme, making it easy to recognize that the test cannot extract an accurate measure of Billy's knowledge of science, it exemplifies the fact that a test that offers only a single mode of expression to all students presents the same obstacles to accuracy that the single mode of representation did for Patrick.

The effects of test situations such as these can be quite large, and the problem is not restricted to extreme cases like Billy. Research is emerging that shows strong effects of the mode of expression even on students with no obvious disabilities. For example, researchers at Boston College (Russell, 2000) have completed a set of studies that investigated the role of different modes of expression (handwriting versus keyboarding) on the standardized test scores of general education students. Results indicate that student scores, supposedly based on content alone, were affected by the mode of responding. That is, students who had experience on computers got much higher scores on the same test if they responded with the computer than with handwriting.

Limited Means of Engagement. Assessments can create special problems for engagement. Some non-trivial level of engagement is clearly required for an assessment to be an accurate estimate of optimal performance. That is why assessment in school is usually associated with intensified emotional states in those who are being assessed. These assessments are often the gatekeeper for the most significant rewards and punishments. These external rewards and punishments are designed to get the attention of affective networks, and are given a high level of affective significance, often fear or anxiety.

We know, however, that there are immense individual differences in affective reactions to external rewards and punishments. So the same motivators will have very different effects on different students. What seems, from an external standpoint, to be the same level of motivation can provoke a very different effect in students.

The critical difficulty is that for any particular problem, and any assessment, there is an optimal level of engagement. Very low levels of engagement and very high levels of engagement tend not to be optimal for performance on most tasks, but there is considerable difference depending on whether the task calls for creative thinking, divergent or convergent problem solving, or producing skillful rapid solutions. We have all felt the disabling effects of anxiety, the metaphorical choking. This anxiety response demonstrates that there is no stable way to assess performance that is predictive across different states of arousal. For example, the 90% foul shooter can be reduced to a 50% foul shooter by placing him/her at the line in the last seconds of a championship game. On any task there is an optimal level, and that varies across individuals and across settings (Goleman, 1995).

Applying externally uniform (fixed) rewards, the same for everybody, will have highly differential results on performance. While the rewards associated with testing and assessment are designed to raise the level of anxiety for every student, they are likely to have a clearly deleterious effect for a student experiencing heightened anxiety. If a student's level of anxiety is chronically high, then the student is likely to be highly reactive to anxiety-producing events like testing. The affective significance the student attaches to the fear of failure may move him/her very far from an optimal level, causing poor performance in those situations.

The particular affective state that any student attaches to assessment is dependent on his/her own individual makeup and on factors in the perceived punishments and rewards. The fact that these differ greatly across individuals ensures that the use of any single means of engagement will not create an accurate measure of optimal performance.

Universally Designed Assessment
In light of student differences, administering assessment using a common format does not level the playing field as many educators believe. Rather, a single format tilts the playing field, favoring some and hampering others. The solution lies in providing a flexible test administration vehicle that provides students the opportunity to demonstrate their understanding and skills according to the particular learning goals associated with the assessment.

In a universally designed curriculum, multiple means of representation, expression, and engagement are available as a normal part of every learning environment and every assessment. The following sections describe universally designed assessments.

Providing Multiple Means of Representation and Expression. The universally designed test allows some variation in the manner in which test material is presented so that students are better able to express what they know, thus making the test more accessible. For example, presenting the test on the computer, rather than solely in print, provides many options for access. In Billy's case, the computerized test allows him to take the test independently, using a single-switch access program or voice input. In the print version, by contrast, Billy is shut out entirely.

For a wide range of other students who do not have physical access limitations, the representation and input options of the assessment go beyond providing accessibility and serve to increase accuracy. We have already noted that the opportunity to type answers rather than write them on paper makes a huge difference for computer-using students (Russell, 2000), illustrating that a test often evaluates input fluency to some extent and confounds these assessment data with knowledge or comprehension data. For other students, especially those with dysgraphia and dyslexia, voice control options can have an even more dramatic effect.

The above examples, however, are merely modality transformations, not significant changes in the modes of representation and expression. In the case of representation, UDL assessments could use multimedia to present material in ways that are tailored to the best means of understanding. In this way, a student such as Patrick could understand what he is being asked without being punished for poor decoding ability.

In the case of expression, imagine future UDL assessments in which verbal comprehension questions are complemented or replaced with alternatives. For example, students complete a drawing instead of a sentence, showing the next step in a scientific process, or they predict what will happen next on the basis of existing information.

More likely, students will be presented with virtual labs, where actual manipulations of data, technologies, or substances are used to demonstrate more clearly than any verbal response that the student understands processes, methods, and outcomes: that they understand and comprehend the science, not just the words. These options create a more interesting, more accurate, and often more relevant assessment of learning strategies across a wide variety of students and a wide variety of subjects. Through these multiple means of expression, it is possible to find, for example, that a student knows how to create a good summary, pictorially or orally, even if they can't do it through text. That information provides a much clearer focus for the kinds of educational interventions that are needed for a particular student. Providing such flexibility of expression is not limited to portfolio assessments, as automated scoring methods can be applied to expression modalities such as creation of concept maps (Ruiz-Primo, Schultz, & Shavelson, 1997).

Providing Multiple Means of Engagement. One reason that traditional assessments are not very predictive of later success is that they rely too heavily on grades. While grades, as a means of engagement, may work for many students, they are highly variable in their effects. For some students the stakes are set too high by testing in general and grading in particular, leading to a reduction in performance often described as test anxiety. For others, the stakes are set too obliquely so that grades are no longer a strong motivator.

It is important to emphasize the potential value of embedding assessment in the curriculum. Most free-standing tests isolate assessment, imbuing it with the character of an ultimate obstacle, hurdle, or failure detector. This eliminates the more positive role of assessment, that of providing an ongoing source of feedback. By incorporating assessment within the routine of interacting with material, it becomes active feedback that is found in any learning situation. This type of ongoing, formative evaluation is a critical part of learning and, yet, rarely is provided in school, replaced instead by summative evaluation.

In our science example, there is only one content area in which the assessment is conducted. It is not typical to assess comprehension strategies in science. It is, however, typical to assess comprehension strategies in a fixed, uniform, artificial content with every student receiving the same passages, which are selected for their readability, their syntactic complexity, and so forth. These passages are not selected, however, for their interest, relevance, or engagement. While this design appears at first to achieve comparability across students, such a fixed design limits comparability when engagement is a key factor.

Imagine instead that the choice of content were flexible, a part of the universal design. For some students, this variability would have a very significant effect. They might be far better at producing a good summary on material that interests, or engages, them. It would provide valuable information to a teacher to know that a student is able to make a very facile summary of a car magazine article or a science passage, but not an Elizabethan sonnet or a history passage. This student would require a very different type of intervention than a student who cannot make an adequate summary regardless of his/her level of engagement. Without ensuring that the assessment of students has been conducted under optimal motivational conditions, it is quite possible to remediate the student unnecessarily when remediation is not precisely what he/she needs.

Using Universally Designed Assessment to Inform Instruction
The obvious value of UDL in the arena of assessment is to ensure that there are assessment instruments that are accessible and practical for students with disabilities. We have argued here, however, that the ultimate value of universally designed assessments goes far beyond access and practicality. With the flexibility inherent in such assessments (flexibility in representation, expression, and engagement), it is possible to reduce the common sources of error introduced by fixed assessments, errors that presently interfere with accurate measurement of learning. Further, that same flexibility allows teachers to align the assessment more sharply with teaching goals and methods, to vary those goals and methods, and to assess them accurately within the instructional venue. But the future is much richer.

The interactive capacity of new technologies allows us to engage in dynamic assessments that more organically assess the ongoing processes of learning. By tracking what supports a student uses, the kinds of actions and strategies he/she follows, the types of strategies or approaches that seem to be missing, and the aspects of the task environment that bias the student toward successful or unsuccessful approaches, the teacher gains information that can help him/her understand more about the student as a learner.

A much richer set of options is available when the lesson itself becomes a part of the assessment. Imagine that the teacher has set comprehension goals for students who need help in reading for meaning. Rather than wait until the end of the passage, assessments of comprehension could be embedded throughout the digital version of the chapter, displayed specifically to those students whose instructional goals make them appropriate or necessary.

These chapter-embedded comprehension checks function less like the traditional test, and more like scaffolds with feedback. In this manner they are more strategically useful to the reader and provide support for building meta-awareness and self-monitoring strategies as part of building comprehension skills. Expression of what the student knows, and when they know it, becomes a normal part of interacting with text, rather than pass-fail information gathered much later when performance is confounded with memory problems from the separation in time or space.

Most important, the new technologies allow two-way interactive assessments. With these technologies we will be able to create learning environments that not only teach, but also learn. By distributing the intelligence better between student and environment, the curriculum is able to learn about the student (e.g., their individual strengths and styles) and keep track of the successes and failures of its own methods. The result is a curriculum that becomes smarter, not more outdated, over time.
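
One way to picture a curriculum that keeps track of the successes and failures of its own methods is a simple tally per student and method. The following is a minimal sketch under stated assumptions, not a description of any existing system: the `MethodTracker` class, its method names, and the method labels used in the usage example are all hypothetical.

```python
from collections import defaultdict


class MethodTracker:
    """Tally successes and attempts per student and teaching method (hypothetical)."""

    def __init__(self):
        # Maps (student, method) to [successes, attempts].
        self._tally = defaultdict(lambda: [0, 0])

    def record(self, student, method, success):
        """Log one outcome for a student working with a given method."""
        entry = self._tally[(student, method)]
        entry[0] += int(success)
        entry[1] += 1

    def success_rate(self, student, method):
        """Return the observed success rate, or None if nothing was recorded."""
        key = (student, method)
        if key not in self._tally:
            return None
        successes, attempts = self._tally[key]
        return successes / attempts

    def best_method(self, student):
        """Return the method with the highest observed success rate so far."""
        rates = {
            method: successes / attempts
            for (name, method), (successes, attempts) in self._tally.items()
            if name == student
        }
        return max(rates, key=rates.get) if rates else None


if __name__ == "__main__":
    tracker = MethodTracker()
    tracker.record("Patrick", "text_to_speech", True)
    tracker.record("Patrick", "text_to_speech", True)
    tracker.record("Patrick", "print_only", True)
    tracker.record("Patrick", "print_only", False)
    print(tracker.best_method("Patrick"))
```

A real system would need far more than a running tally (for example, enough observations per method before drawing any conclusion), but the sketch shows the basic two-way idea: each student response also updates what the environment knows about its own methods.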

Finally, dynamic assessments will be universally designed. By providing a full range of customizations and adaptations as a part of assessments, student performance will be evaluated more accurately. The accuracy will come from the capacity to evaluate performance under varying conditions that range from situations in which the student's performance is constrained by barriers inherent in specific modes of representation, expression, or engagement, to conditions where appropriate adaptations and supports are available to overcome those barriers.

An Example. If we wish to evaluate Patrick's progress in learning summarization strategies, his teacher can have him read a digital version of the content with an option for text-to-speech. In this manner he/she would be able to evaluate more accurately both his knowledge of science and his growth in summarization skills. Suppose, for example, that Patrick scored dramatically better on both the science and summarization questions with speech turned on. That would suggest that he has already learned how to summarize and comprehend adequately and that his low scores reflect primarily decoding difficulties, not difficulties related to summarization. In terms of strategic teaching, the teacher would know clearly that it is necessary to concentrate on Patrick's fluency, rather than remedial work in summarizing strategies. Or, he/she could decide that to enhance Patrick's learning of science, sound should be kept on whenever he is working independently.

This is only the tip of the iceberg diagnostically. The same flexibility can be applied to many other kinds of representation such as: (a) providing vocabulary support, (b) providing links to background information, (c) providing graphic organizers before, during, or after reading, or (d) providing syntactic support. Each of these features can be a normal part of a universally-designed document. The flexibility to turn them on and off allows customization for the needs of each student, but also provides the mechanism for assessment. "Does vocabulary support help Patrick?" is an easy question to assess when the flexibility to turn it off and on is available. The key point is that a universally designed assessment provides the flexibility needed to make the assessment accessible, but also to make it more accurate and instructionally valuable. It does this by providing options; options that are essential to assess what is working and what is not.
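
The on/off comparison described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical record format in which each embedded comprehension check is logged together with the supports that were switched on and whether the response was correct. The `Attempt` class, the `support_effect` function, the support names, and the sample log are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One response to an embedded check (hypothetical record format)."""
    supports_on: frozenset  # names of the supports enabled for this item
    correct: bool


def support_effect(attempts, support):
    """Return accuracy with the support on minus accuracy with it off.

    Returns None when either condition has no attempts, since no
    comparison can be made in that case.
    """
    on = [a.correct for a in attempts if support in a.supports_on]
    off = [a.correct for a in attempts if support not in a.supports_on]
    if not on or not off:
        return None
    return sum(on) / len(on) - sum(off) / len(off)


# Hypothetical log for one student; the values are illustrative only.
log = [
    Attempt(frozenset({"vocabulary"}), True),
    Attempt(frozenset({"vocabulary"}), True),
    Attempt(frozenset({"vocabulary"}), True),
    Attempt(frozenset({"vocabulary"}), False),
    Attempt(frozenset(), True),
    Attempt(frozenset(), False),
    Attempt(frozenset(), False),
    Attempt(frozenset(), False),
]

if __name__ == "__main__":
    print(support_effect(log, "vocabulary"))
```

A clearly positive difference would suggest that the support helps this student, while a difference near zero would suggest looking elsewhere. With only a handful of items, the result is a prompt for the teacher's judgment rather than a conclusion.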

Short-term Solutions for Assessment. Currently there are very few formal assessments that are universally designed and very few curricula that have embedded universally-designed assessment within them. This lack of availability will soon change. The Center for Applied Special Technology (CAST) is already beginning to work with some educational publishers concerning these changes. The changes may happen relatively quickly because public policy appears to require such changes (e.g., the IDEA amendments and Section 504), the economic incentive for modification will be intense, and the new technologies make it practical to do so.

In the meantime, what can educators do? First, it is possible to modify existing methods of testing and assessment so that they are more accessible and flexible. One straightforward step is to administer existing print-based tests on the computer instead. In that format, the test becomes much more flexible (e.g., the print can be enlarged, the text can be read aloud) for use with a variety of students. Many of the existing accommodations offered to students with disabilities who participate in large-scale tests can be provided in this way (IDEA-Partnerships & The Council for Exceptional Children, 2000). This is an enormously helpful step, and one that is not difficult to take. For techniques and software tools that can make this step much easier, the CAST website (www.cast.org) is one place where you will find relevant information and links to helpful resources.

Next, for any new or recent curricular materials, ask publishers for accessible versions. By law, you are able to re-make accessible versions yourself. However, publishers are increasingly under pressure by many states to provide accessible versions of educational materials at the outset. In the years ahead, it is critical that educators and adoption committees request that publishers provide materials that can be used by all students. More than any other step, consumer requests drive publishers to do the right thing. As publishers become accustomed to meeting minimal standards for accessibility, they will learn to make assessments that are truly universally designed. At that point, responsible assessment of all of our students will no longer be the sole responsibility of the classroom teacher, but will be an integral part of the education system.

References
Goleman, D. (1995). Emotional intelligence: Why it can matter more than IQ. New York: Bantam Books.

IDEA-Partnerships and The Council for Exceptional Children (2000). Making assessment accommodations: A toolkit for educators. Reston, VA: The Council for Exceptional Children.

Individuals with Disabilities Education Act, Amendments of 1997, Public Law No. 105-17, [On-line]. Available: http://www.ed.gov/offices/OSERS/IDEA/the_law.html

Ruiz-Primo, M. A., Schultz, S. E., & Shavelson, R. J. (1997). Concept map-based assessment in science: Two exploratory studies. Los Angeles, CA: CRESST.

Russell, M. (2000). It's time to upgrade: Tests and administration procedures for the new millennium. The Secretary's Conference on Education Technology 2000, U.S. Department of Education.

Section 504, Rehabilitation Act of 1973. [On-line] Available: http://www.dol.gov/dol/oasam/public/regs/statutes/sec504.html.



Bob Dolan is Senior Research Scientist at the Center for Applied Special Technology. Email to: rdolan@cast.org

 

 

 
