
JSET ejournal







Universal Design
for Learning
Associate Editor Column
David Rose
With Bob Dolan, Guest Columnist
If you are a gardener and want to improve the yield of a tomato
plant, you wouldn't wait until the end of the season and then
measure how high the plant is. Instead, you would determine the
number of tomatoes, not the plant height. Furthermore, you would
do this assessment during the growing season, so that you could
then use the information to help identify proper interventions
to improve the plant's yield.
Educational assessment, too, can be used to inform student instruction,
assuming it is used properly. However, this powerful educational
tool is used improperly far too often. This column discusses some
of the limitations of current assessment practice and how application
of Universal Design for Learning (UDL) concepts can improve assessment
accuracy and its applicability to instruction.
Purposes of Assessment
There are many reasons for conducting assessments. Assessments
can be used to evaluate the performance of students, teachers,
schools, school districts, and even materials and methods. For
now, we will concentrate primarily on the assessment of student
learning.
There are, in turn, several different purposes for conducting
assessments of student learning. The school district, for example,
may conduct assessments of students as a way of (a) evaluating
teachers, (b) analyzing the effects of changing school practices,
(c) making comparisons with other school districts, and so forth.
Students may conduct their own self-assessments as a way of identifying
comparative strengths and weaknesses as well as areas for the
focus of study.
This article will concentrate on assessments that are designed
to address directly the core issues of teaching and learning
(i.e., assessments designed to monitor progress toward instructional
goals, to revise and shape the course of instruction, to motivate
achievement). However, the same principles provided here also
hold for other uses of assessment.
Limitations in Assessment Accuracy
The value of an assessment as a means for informing instruction
is a function of its accuracy. Accuracy, in turn, is a function
of how well an assessment measures students' abilities vis-à-vis
particular educational goals. An inaccurate assessment is one
in which this information is confounded by other factors of student
performance that are irrelevant to a particular learning goal.
Although there are remarkable individual differences among students
in our classrooms, most existing methods of assessment have not
been designed with these individual differences in mind. In fact,
many assessments appear to have been designed under the assumption
that learners are relatively homogeneous, and that the expected
outcomes for all students are relatively the same. As a result,
assessments are rarely free of confounding information, and thus
rarely represent a truly accurate measure of student abilities.
Using the principles of Universal Design for Learning, we can
examine the accuracy limitations of current assessment techniques
in light of their capacity to provide multiple means of representation,
expression, and engagement.
Limited Means of Representation. Consider Patrick, a student
with dyslexia who is handed a standardized paper-and-pencil science
test. When he sees the test, he is likely to experience
an all too familiar dread. The page full of text is daunting.
Decoding the questions printed in the test booklet may pose difficulties
for him that are as challenging as those posed by the knowledge
of science itself. In any case, with many questions to read and
a strict time limit, Patrick's knowledge of science would likely
be obscured by his problems in reading the test. Moreover, many
objective tests like this are purposely constructed with complex
syntax and other linguistic structures so that the items provide
greater differentiation among students. This means that, quite
independent of their knowledge of science, students like Patrick, and
those with other language-based disabilities, are set up to
achieve lower scores than those without language-based disabilities.
Their scores will have little to do with science.
The problem with science tests such as this is that they do not
measure knowledge of science alone. They also measure
the students' facility with the medium of print. For Patrick,
who is not facile with printed text, delivering the assessment
in print depresses his score regardless of his knowledge of science.
The accuracy of the test is poor.
However, this particular test would not necessarily be more accurate
even if the teacher had administered the test orally to the whole
class. There is no single medium that would provide an unbiased
vehicle for assessment. If we chose a different medium, like
oral language, to represent the science problems, Patrick's scores
might indeed be improved, but at the expense of some other students
for whom another medium might better convey the test material.
Because there is an inevitable interaction between the representational
demands of the medium and individual capacities of the students,
for each student there will be some inadvertent effect of the
medium in which the assessment occurs. For some students the
effect is relatively negligible (or sometimes even positive).
For others, like Patrick, the effect is large and negative. For
any group of students, and particularly any group of diverse
students, the fixed version of the test provides scores that
are unreliable and distorted by the unseen weight of the medium.
As a result, there is no single class-wide method for accurately
assessing knowledge of science.
Limited Means of Expression. Consider now Billy, a student
with a physical disability. If the same standardized paper-and-pencil
science test is handed to him, he will fail it outright because
of physical limitations. So would many other individuals with
physical disabilities (e.g., Stephen Hawking, the best-selling
author and pre-eminent physics professor who has ALS). In neither
case would performance reflect their knowledge of science, but
merely both individuals' inability to master the means of expression
required by the paper-and-pencil test. While Billy's case is
extreme, making it easy to recognize that the test cannot extract
an accurate measure of Billy's knowledge of science, it exemplifies
the fact that a test that requires a single mode of expression
from all students presents the same obstacles to accuracy that
the single mode of representation did for Patrick.
The effects of test situations such as these can be quite large, and
the problem is not restricted to extreme cases like Billy. Research
is emerging that shows strong effects of the mode of expression
even on students with no obvious disabilities. For example, researchers
at Boston College (Russell, 2000) have completed a set of studies
that investigated the role of different modes of expression (handwriting
versus keyboarding) on the standardized test scores of general
education students. Results indicate that student scores, supposedly
based on content alone, were affected by the mode of responding.
That is, students who had experience on computers got much higher
scores on the same test if they responded with the computer than
with handwriting.
Limited Means of Engagement. Assessments can create special
problems for engagement. Some non-trivial level of engagement
is clearly required for an assessment to be an accurate estimate
of optimal performance. That is why assessment in school is usually
associated with intensified emotional states in those who are
being assessed. These assessments are often the gatekeeper for
the most significant rewards and punishments. These external
rewards and punishments are designed to get the attention of
affective networks, and are given a high level of affective
significance, often fear or anxiety.
We know, however, that there are immense individual differences
in affective reactions to external rewards and punishments. So
the same motivators will have very different effects on different
students. What seems, from an external standpoint, to be the
same level of motivation can provoke a very different effect
in students.
The critical difficulty is that for any particular problem, and
any assessment, there is an optimal level of engagement. Very
low levels of engagement and very high levels of engagement tend
not to be optimal for performance on most tasks, but there is
considerable difference depending on whether the task calls for
creative thinking, divergent or convergent problem solving, or
producing skillful rapid solutions. We have all felt the disabling
effects of anxiety, the metaphorical choking. This anxiety response
demonstrates that there is no stable way to assess performance
that is predictive across different states of arousal. For example,
the 90% foul shooter can be reduced to a 50% foul shooter by
placing him/herself at the line in the last seconds of a championship
game. On any task there is an optimal level, and that varies
across individuals and across settings (Goleman, 1995).
Applying externally uniform (fixed) rewards, the same for everybody,
will have highly differential results on performance. While the
rewards associated with testing and assessment are designed to
raise the level of anxiety for every student, they are likely
to have a clearly deleterious effect for a student experiencing
heightened anxiety. If a student's level of anxiety is chronically
at a high level, then the student is likely to be highly reactive
to anxiety-producing events like testing. The affective significance
the student attaches to the fear of failure may move him/her
very far from an optimal level, causing poor performance in those
situations.
The particular affective state that any student attaches to assessment
is dependent on his/her own individual makeup and on factors
in the perceived punishments and rewards. The fact that these
differ greatly across individuals ensures that the use of any
single means of engagement will not create an accurate measure of
optimal performance.
Universally Designed Assessment
In light of student differences, administering assessment
using a common format does not level the playing field as many
educators believe. Rather, a single format tilts the playing
field, favoring some and hampering others. The solution lies
in providing a flexible test administration vehicle that provides
students the opportunity to demonstrate their understanding and
skills according to the particular learning goals associated
with the assessment.
In a universally designed curriculum, multiple means of representation,
expression, and engagement are available as a normal part of
every learning environment and every assessment. The following
sections describe universally designed assessments.
Providing Multiple Means of Representation and Expression.
The universally designed test allows some variation in the manner
in which test material is presented so that students are better
able to express what they know, thus making the test more accessible.
For example, presenting the test on the computer, rather than
solely in print, provides many options for access. In Billy's
case, the computerized test allows him to take the test independently,
using a single-switch access program or voice input, whereas
in the print version Billy is shut out entirely, without access.
For a wide range of other students who do not have physical access
limitations, the representation and input options of the assessment
go beyond providing accessibility and serve to increase accuracy.
We have already noted that the opportunity to type answers rather
than write them on paper makes a huge difference for computer-using
students (Russell, 2000), illustrating that a test often evaluates
input fluency to some extent and confounds these assessment data
with knowledge or comprehension data. For other students, especially
those with dysgraphia and dyslexia, voice control options can
have an even more dramatic effect.
The above examples, however, are merely modality transformations,
not significant changes in the modes of representation and expression.
In the case of representation, UDL assessments could use multimedia
to present material in ways that are tailored to the best means
of understanding. In this way, a student such as Patrick could
understand what he is being asked without being punished for
poor decoding ability.
In the case of expression, imagine future UDL assessments in
which verbal comprehension questions are complemented or replaced
with alternatives. For example, students complete a drawing instead
of a sentence, showing the next step in a scientific process,
or they predict what will happen next on the basis of existing
information.
More likely, students will be presented with virtual labs, where
actual manipulations of data, technologies, or substances are
used to demonstrate more clearly than any verbal response that
the student understands processes, methods, and outcomes; that is,
that they understand and comprehend the science, not just the
words. These options create a more interesting, more accurate,
and often more relevant, assessment of learning strategies across
a wide variety of students, and a wide variety of subjects. Through
these multiple means of expression, it is possible to find, for
example, that a student knows how to create a good summary, pictorially
or orally, even if they can't do it through text. That information
provides a much clearer focus for the kinds of educational interventions
that are needed for a particular student. Providing such flexibility
of expression is not limited to portfolio assessments, as automated
scoring methods can be applied to expression modalities such
as creation of concept maps (Ruiz-Primo, Schultz, & Shavelson, 1997).
Providing Multiple Means of Engagement. One reason that
traditional assessments are not very predictive of later success
is that they rely too heavily on grades. While grades, as a means
of engagement, may work for many students, they are highly variable
in their effects. For some students the stakes are set too high
by testing in general and grading in particular, leading to a
reduction in performance often described as test anxiety. For
others, the stakes are set too obliquely so that grades are no
longer a strong motivator.
It is important to emphasize the potential value of embedding
assessment in the curriculum. Most free-standing tests isolate
assessment, imbuing it with the character of an ultimate obstacle,
hurdle, or failure detector. This eliminates the more positive
role of assessment, that of providing an ongoing source of feedback.
By incorporating assessment within the routine of interacting
with material, it becomes active feedback that is found in any
learning situation. This type of ongoing, formative evaluation
is a critical part of learning and, yet, rarely is provided in
school, replaced instead by summative evaluation.
In our science example, there is only one content area in which
the assessment is conducted. It is not typical to assess comprehension
strategies in science. It is, however, typical to assess comprehension
strategies in a fixed, uniform, artificial content with every
student receiving the same passages, which are selected for their
readability, their syntactic complexity, and so forth. These
passages are not selected, however, for their interest, relevance,
or engagement. While this design appears at first to achieve
comparability across students, such a fixed design limits comparability
when engagement is a key factor.
Imagine instead that the choice of content were flexible,
a part of the universal design. For some students, this variability
would have a very significant effect. They might be far better
at producing a good summary on material that interests, or engages,
them. It would provide valuable information to a teacher to know
that a student is able to make a very facile summary of a car
magazine article or a science passage, but not an Elizabethan
sonnet or a history passage. This student would require a very
different type of intervention than a student who cannot make
an adequate summary regardless of his/her level of engagement.
Without ensuring that the assessment of students has been conducted
under optimal motivational conditions, it is quite possible to
remediate the student unnecessarily when remediation is not precisely
what they need.
Using Universally Designed Assessment to Inform Instruction
The obvious value of UDL in the arena of assessment is to
ensure that there are assessment instruments that are accessible
and practical for students with disabilities. We have argued
here, however, that the ultimate value of universally designed
assessments goes far beyond access and practicality. With the
flexibility inherent in such assessments (flexibility in
representation, expression, and engagement), it is possible
to reduce the common sources of error introduced by fixed assessments,
errors that presently interfere with accurate measurement of
learning. Further, that same flexibility allows teachers to align
the assessment more sharply with teaching goals and methods,
to vary those goals and methods, and to assess them accurately
within the instructional venue. But the future is much richer.
The interactive capacity of new technologies allows us to
engage in dynamic assessments that more organically assess the
ongoing processes of learning. By tracking what supports a student
uses, the kinds of actions and strategies he/she follows, the
types of strategies or approaches that seem to be missing, and
the aspects of the task environment that bias the student toward
successful or unsuccessful approaches, the teacher has the information
that can help understand more about the student as a learner.
A much richer set of options is available when the lesson itself
becomes a part of the assessment. Imagine that the teacher has
set comprehension goals for students who need help in reading for
meaning. Rather than wait until the end of the passage, assessments
of comprehension could be embedded throughout the digital version
of the chapter, displayed specifically to those students whose
instructional goals make them appropriate or necessary.
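The idea of showing embedded checks only to the students whose goals call for them can be sketched in Python. This is a minimal illustration, not a description of any existing product: the class names `EmbeddedCheck` and `Student`, the function `checks_for`, the goal labels, and the prompts are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class EmbeddedCheck:
    after_paragraph: int  # paragraph of the chapter after which the check appears
    goal: str             # instructional goal the check is meant to serve
    prompt: str           # question shown to the student

@dataclass
class Student:
    name: str
    goals: set = field(default_factory=set)

def checks_for(student, checks):
    """Return the checks matching the student's goals, in reading order."""
    matching = [c for c in checks if c.goal in student.goals]
    return sorted(matching, key=lambda c: c.after_paragraph)

# Hypothetical chapter checks and students, invented for illustration.
chapter_checks = [
    EmbeddedCheck(6, "summarization", "Summarize the section you just read."),
    EmbeddedCheck(2, "reading_for_meaning", "What is the main idea so far?"),
    EmbeddedCheck(4, "vocabulary", "What does 'photosynthesis' mean here?"),
]
patrick = Student("Patrick", {"reading_for_meaning", "summarization"})
billy = Student("Billy")  # no comprehension goals set, so no checks appear

for check in checks_for(patrick, chapter_checks):
    print(f"After paragraph {check.after_paragraph}: {check.prompt}")
```

Keeping the goal on each check, rather than on the chapter, is one possible design: the same digital chapter can then serve every student, with only the checks varying.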
These chapter-embedded comprehension checks function less like
the traditional test, and more like scaffolds with feedback.
In this manner they are more strategically useful to the reader
and provide support for building meta-awareness and self-monitoring
strategies as part of building comprehension skills. Expression
of what the student knows, and when they know it, becomes a normal
part of interacting with text, rather than pass-fail information
gathered much later when performance is confounded with memory
problems from the separation in time or space.
Most important, the new technologies allow two-way interactive
assessments. With these technologies we will be able to create
learning environments that not only teach, but also learn. By
distributing the intelligence better between student and environment,
the curriculum is able to learn about the student (e.g., their
individual strengths and styles) and keep track of the successes
and failures of its own methods. The result is a curriculum that
becomes smarter, not more outdated, over time.
Finally, dynamic assessments will be universally designed. By
providing a full range of customizations and adaptations as a
part of assessments, student performance will be evaluated more
accurately. The accuracy will come from the capacity to evaluate
performance under varying conditions that range from situations
in which the student's performance is constrained by barriers
inherent in specific modes of representation, expression, or
engagement, to conditions where appropriate adaptations and supports
are available to overcome those barriers.
An Example. If we wish to evaluate Patrick's progress
in learning summarization strategies, his teacher can have him
read a digital version of the content with an option for text-to-speech.
In this manner he/she would be able to evaluate more accurately
both his knowledge of science and his growth in summarization
skills. Suppose, for example, that Patrick scored dramatically
better on both the science and summarization questions with speech
turned on. That would suggest that he has already learned how
to summarize and comprehend adequately and that his low scores
reflect primarily decoding difficulties, not difficulties related
to summarization. In terms of strategic teaching, the teacher
would know clearly that it is necessary to concentrate on Patrick's
fluency, rather than remedial work in summarizing strategies.
Or, he/she could decide that to enhance Patrick's learning of
science, sound should be kept on whenever he is working independently.
This is only the tip of the iceberg diagnostically. The same
flexibility can be applied to many other kinds of representation
such as: (a) providing vocabulary support, (b) providing links
to background information, (c) providing graphic organizers before,
during, or after reading, or (d) providing syntactic support.
Each of these features can be a normal part of a universally-designed
document. The flexibility to turn them on and off allows customization
for the needs of each student, but also provides the mechanism
for assessment. "Does vocabulary support help Patrick?"
is an easy question to assess when the flexibility to turn it
off and on is available. The key point is that a universally
designed assessment provides the flexibility needed to make the
assessment accessible, but also to make it more accurate and
instructionally valuable. It does this by providing options;
options that are essential to assess what is working and what
is not.
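The on/off comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `support_effects`, the condition labels, and every score are hypothetical, invented for the example and not drawn from any real assessment.

```python
from statistics import mean

def support_effects(scores_by_condition, baseline="no_support"):
    """Return the mean score gain of each support condition over the baseline.

    scores_by_condition maps a condition name to a list of scores (0-100)
    earned on comparable passages under that condition.
    """
    base = mean(scores_by_condition[baseline])
    return {
        condition: mean(scores) - base
        for condition, scores in scores_by_condition.items()
        if condition != baseline
    }

# Hypothetical scores for one student, invented for illustration only.
patrick = {
    "no_support": [40, 50],
    "text_to_speech": [80, 90],
    "vocabulary_support": [45, 55],
}

gains = support_effects(patrick)
# List the supports from most to least helpful.
for condition, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{condition}: {gain:+.1f} points")
```

In this invented case, a large gain with text-to-speech and a small gain with vocabulary support would point the teacher toward decoding rather than vocabulary as the main barrier. With so few scores per condition, such differences are only suggestive.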
Short-term Solutions for Assessment. Currently there are
very few formal assessments that are universally designed and
very few curricula that have embedded universally-designed assessment
within them. This lack of availability will soon change. The
Center for Applied Special Technology (CAST) is already beginning
to work with some educational publishers concerning these changes.
The changes may happen relatively quickly because public policy
appears to require such changes (e.g., the IDEA amendments and
Section 504), the economic incentive for modification will be
intense, and the new technologies make it practical to do so.
In the meantime, what can educators do? First, it is possible
to modify existing methods of testing and assessment so that
they are more accessible and flexible. One straightforward step
is to administer existing print-based tests on the computer instead.
In that format, the test becomes much more flexible (e.g., the
print can be enlarged, the text can be read aloud) for use with
a variety of students. Many of the existing accommodations offered
to students with disabilities who participate in large-scale
tests can be provided in this way (IDEA-Partnerships & The
Council for Exceptional Children, 2000). This is an enormously
helpful step, and one that is not difficult to take. For techniques
and software tools that can make this step much easier, the CAST
website (www.cast.org) is one place
where you will find relevant information and links to helpful
resources.
Next, for any new or recent curricular materials, ask publishers
for accessible versions. By law, you are able to re-make accessible
versions yourself. However, publishers are increasingly under
pressure by many states to provide accessible versions of educational
materials at the outset. In the years ahead, it is critical that
educators and adoption committees request that publishers provide
materials that can be used by all students. More than any other
step, consumer requests drive publishers to do the right thing.
As publishers become accustomed to meeting minimal standards
for accessibility, they will learn to make assessments that are
truly universally designed. At that point, responsible assessment
of all of our students will no longer be the sole responsibility
of the classroom teacher, but will be an integral part of the
education system.
References
Goleman, D. (1995). Emotional intelligence: Why it can matter
more than IQ. New York: Bantam Books.
IDEA-Partnerships and The Council for Exceptional Children (2000).
Making assessment accommodations: A toolkit for educators. Reston,
VA: The Council for Exceptional Children.
Individuals with Disabilities Education Act, Amendments of 1997,
Public Law No. 105-17, [On-line]. Available: http://www.ed.gov/offices/OSERS/IDEA/the_law.html
Ruiz-Primo, M. A., Schultz, S.E., & Shavelson, R.J. (1997).
Concept map-based assessment in science: Two exploratory studies.
Los Angeles, CA: CRESST.
Russell, M. (2000). It's Time to Upgrade: Tests and administration
procedures for the new millennium. The Secretary's Conference
on Education Technology 2000, U.S. Department of Education.
Section 504, Rehabilitation Act of 1973. [On-line] Available:
http://www.dol.gov/dol/oasam/public/regs/statutes/sec504.html.
Bob Dolan is Senior Research Scientist at the Center for Applied
Special Technology. Email to: rdolan@cast.org