Research and Theory

Test-Taking Strategies

Test-taking strategies are defined as those test-taking processes which the respondents have selected and which they are conscious of, at least to some degree. In other words, the notion of strategy implies an element of selection. Otherwise, the processes would not be considered as strategies.

At times, these strategies constitute opting out of the language task at hand (for example, through a surface matching of identical information in the assigned passage with information in one of the response choices).

At other times, the strategies may constitute short-cuts to arriving at answers (for example, not reading the text as instructed but simply looking immediately for the answers to the given reading comprehension questions). In such cases, the respondents may be using test-wiseness to circumvent the need to tap their actual language knowledge or lack of it, consistent with Fransson's (1984) assertion that respondents may not proceed via the text but rather around it.

In the majority of testing situations, however, test-taking strategies do not lead to opting out or to the use of short cuts. In some cases, quite the contrary holds true.

In a study of test-taking strategies in Israel, one Hebrew second-language respondent determined that he had to produce a written translation of a text before he could respond to questions dealing with that text (Cohen & Aphek, 1979).

At times, the use of a limited number of strategies in a response to an item may indicate genuine control over the item, assuming that these strategies are well-chosen and are used effectively. At other times, true control requires the use of a host of strategies.

It is also best not to assume that any test-taking strategy is a good or a poor choice for a given task. It depends on how given test takers – with their particular cognitive style profile and degree of cognitive flexibility, their language knowledge, and their repertoire of test-taking strategies – employ these strategies at a given moment on a given task.

Some respondents may get by with the use of a limited number of strategies that they use well for the most part. Others may be aware of an extensive number of strategies but may use few, if any, of them effectively. So, for example, while a particular skimming strategy (such as paying attention to subheadings) may provide adequate preparation for a given test taker on a recall task, the same strategy may not work well for another respondent. It also may not work well for the same respondent on another text which lacks reader-friendly subheadings.

As long as the task is part of a test, students may find themselves using strategies that they would not use under non-test conditions. It is for this reason that, during the pilot phase, it is crucial for test constructors to find out what their tests are actually measuring.

Verbal Report as a Window onto Test-Taking Strategies

Test-taking involves cognitive processes that are not readily open to objective observation and evaluation. Consequently, in order to get the best picture possible of what it is that respondents do as they, for example, read test prompts and respond to test questions, researchers have tended to use verbal report protocols.

A comprehensive and in-depth overview of how verbal reports can be and are used in language testing has been provided by Green (1998). According to him, “Verbal protocols are increasingly playing a vital role in the validation of assessment instruments and methods” in that they “offer a means for more directly gathering evidence that supports judgments regarding validity than some of the other more quantitative methods” (p. 3).

Green, in fact, notes that verbal reports are frequently used to address “one of the most fundamental questions” about language tests: what is it that a test actually measures (p. 3). Verbal reports include data that reflect:

self-report: learners' descriptions of what they do, characterized by generalized statements, in this case, about their test-taking strategies – for example, "On multiple-choice items, I tend to scan the reading passage for possible surface matches between information in the text and that same information appearing in one of the alternative choices." Questionnaires and other kinds of prompts that ask learners to describe the way they usually take a certain type of language test are likely to elicit self-report data.
self-observation: the inspection of specific, not generalized language behavior, either introspectively, that is, within 20 seconds of the mental event, or retrospectively – for example, "What I just did was to skim through the reading passage for possible surface matches between information in the text and that same information appearing in one of the alternative choices."
Self-observation data would entail reference to some actual instance(s) of language testing behavior. For example, recollections of why certain distracters were rejected in search of the correct multiple-choice response on previously answered items would count as retrospective self-observation.
self-revelation: "think-aloud," stream-of-consciousness disclosure of thought processes while the information is being attended to – for example, "Hmm...I wonder if the information in one of these alternative choices also appears in the text."
Self-revelation or think-aloud data are only available at the time that the language event is taking place (that is, within 20 seconds of it), and the assumption would be that the respondent is simply describing, say, the struggle to determine which five out of seven or more statements constitute the best set of main points for a text. Any thoughts that the respondent has which are immediately analyzed would constitute introspective self-observation – for example, “Now, does this utterance call for the present or imperfect-subjunctive? Let me see...”

Verbal reports can and usually do comprise some combination of these (Radford, 1974; Cohen & Hosenfeld, 1981; Cohen, 1987).

Asking test takers to think aloud as they work through a series of test items makes it possible to analyze the resulting protocol and identify the cognitive processes involved in carrying out the task.

Think-aloud protocols have the advantage of giving a more direct view of how readers process a text as they indicate what they are doing at the moment they are doing it (Cohen, 1987).

Retrospective interviews, in turn, provide an opportunity for investigators to ask directed questions to gain clarification of what was reported during the think-aloud.

Early work in verbal report with language testing found, for example, that some assumptions were ill-founded. One finding was that technical vocabulary does not cause as much difficulty as non-technical vocabulary and non-technical vocabulary used technically within a given field. Furthermore, seemingly obvious discourse markers may not be so obvious to the L2 reader. In addition, the problems arising from syntactic features may be quite limited in scope – stemming mostly from structures such as heavy noun phrases (Cohen, Glasman, Rosenbaum-Cohen, Ferrara, & Fine, 1979). Cohen (1986) laid out a series of measures to be taken to ensure that verbal report tasks could be used effectively to obtain data on the reading process.

More recently, numerous studies have been done to determine the strategies that students use to read texts (see Singhal, 2001, for a review). Upton (1997, 1998), for example, reported on 11 native speakers of Japanese, half still taking ESL classes and half finished with courses. The students were asked to provide think-aloud protocols while they read academic passages.

In retrospective interviews, they then listened to their tape-recorded protocols and were asked to clarify and explain their thoughts.

Upton’s study demonstrated how verbal report can be used to describe the ways in which nonnatives can misconstrue the meaning of words and phrases as they read an L2 text, and how this throws off their understanding of the entire text. He found that many reading errors could be explained in terms of what Laufer (1991) has called synforms – that is, words that look or sound similar to other words that the readers know. The respondents would make vocabulary in the passage conform in their minds to what they thought the meaning of these look-alike words was. A more recent study by Upton and Lee-Thompson (2001) used verbal report with 20 native speakers of Chinese and Japanese to explore the question of when and how they use L1 resources while reading L2 texts.

Verbal report measures have helped determine how respondents actually take reading comprehension tests as opposed to what they may be expected to be doing (Cohen, 1984, 1994a: 130-136). Studies calling on respondents to provide immediate or delayed retrospection as to their test-taking strategies regarding reading passages with multiple-choice items have, for example, yielded the following results:

When the instructions ask students to read the passage before answering the questions, students have reported either reading the questions first or reading just part of the article and then looking for the corresponding questions.
When advised to read all alternatives before choosing one, students stop reading the alternatives as soon as they have found one that they decide is correct.
Students use a strategy of matching material from the passage with material in the item stem and in the alternatives, and prefer this surface-structure reading of the test items to one that calls for more in-depth reading and inferencing.
Students rely on their prior knowledge of the topic and on their general vocabulary.

From these findings and from others, a description of what respondents do to answer questions is emerging. Unless trained to do otherwise, they may use the most expedient means of responding available to them – such as relying more on their previous experience with seemingly similar formats than on a close reading of the description of the task at hand. Thus, when given a passage to read and multiple-choice items to answer, students may attempt to answer the items just as they have answered other multiple-choice reading items in the past, rather than paying close attention to what is called for in the current one. Often, this strategy works, but on occasion the particular task may require subtle or major shifts in response behavior in order to perform well.

Assessing the Interaction of Reading and Writing

Perhaps at the cutting edge of research on reading and writing is the assessment of language behavior at the intersection of the two. One vehicle for doing this is a close inspection of the process of summarizing.

Summarization tests are complex in nature. The reading portion entails identifying topical information, distinguishing superordinate from subordinate material, and identifying redundant and trivial information.

The writing of the summary entails the selection of topical information (or generating it if it is not provided), deleting trivial and redundant information, substituting superordinate material, and restating the text so that it is coherent and polished (Brown & Day, 1983; Kintsch & van Dijk, 1978).

Given the lack of clarity that often accompanies such tasks, research has shown that it may be useful to give test takers specific instructions about how to go about the summarization task (Cohen, 1993, 1994b). For example:

Summarization task

How to Read:

Read to extract the most important points – for example, those constituting topic sentences signaled as crucial by the paragraph structure: points that the reader of the summary would want to read.
Reduce information to superordinate points.
Avoid redundant information – points off.

How to Write:

Prepare in draft form and then rewrite.
Link points smoothly.
Pay attention to the required length for the summary (e.g., it may be 10 percent of the original text, so 75 words for a 750-word text).
Write the summary in your own words.
Be brief.
Write legibly.
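
The length guideline above can be turned into a quick check. The following is a minimal sketch only: the function names, the whitespace-based word count, and the 10 percent tolerance band are illustrative assumptions and are not part of the original instructions.

```python
def target_summary_length(source_text: str, ratio: float = 0.10) -> int:
    """Return the target summary length in words (default: 10 percent of the source)."""
    # Assumption: words are counted by splitting on whitespace.
    source_words = len(source_text.split())
    return round(source_words * ratio)


def within_length(summary: str, target: int, tolerance: float = 0.10) -> bool:
    """Return True if the summary is within a tolerance band around the target.

    The tolerance band is a hypothetical choice made for illustration;
    a test constructor would set an actual limit.
    """
    summary_words = len(summary.split())
    return abs(summary_words - target) <= tolerance * target


# A 750-word text yields a 75-word target, matching the example in the instructions.
source = " ".join(["word"] * 750)
target = target_summary_length(source)
```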

It may also be beneficial to give raters specific instructions as to how to assess the summaries. For example:

Assessing your Summary:

Check to see whether each important point is included (points that were agreed upon by a group of experts in advance).
Check to make sure that these points are linked together by the key linking/integrating elements appearing on the master list.
Point(s) off for each irrelevant point.
Points off for illegibility.
A rubric for what would constitute “writing in your own words.”

This last item is a difficult one to regulate since some texts or phrases in a text lend themselves to paraphrase better than others. At some points in a text, the best summary of a point requires the use of the words found there. In other cases, paraphrase is not only possible but preferable.
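
One way to see how the rater checklist above might be made operational is the small scoring sketch below. It is an illustration under stated assumptions, not a published scoring scheme: the class name, field names, and all credit and penalty weights are hypothetical, and the "own words" judgment is left to the rater as a simple yes/no decision because, as noted, it is difficult to regulate.

```python
from dataclasses import dataclass


@dataclass
class SummaryRating:
    """A rater's judgments for one summary (all field names are hypothetical)."""
    points_included: int    # important points from the experts' list found in the summary
    links_included: int     # key linking elements from the master list found in the summary
    irrelevant_points: int  # points in the summary that are not on the experts' list
    illegible: bool         # rater judged the handwriting illegible
    own_words: bool         # rater judged the summary to be written in the student's own words


def score_summary(
    rating: SummaryRating,
    point_credit: float = 1.0,
    link_credit: float = 0.5,
    irrelevant_penalty: float = 1.0,
    illegibility_penalty: float = 1.0,
    own_words_credit: float = 1.0,
) -> float:
    """Combine the checklist judgments into one score (weights are assumptions)."""
    score = rating.points_included * point_credit
    score += rating.links_included * link_credit
    score -= rating.irrelevant_points * irrelevant_penalty
    if rating.illegible:
        score -= illegibility_penalty
    if rating.own_words:
        score += own_words_credit
    # Do not let penalties push the score below zero.
    return max(score, 0.0)


# Example: four important points, two linking elements, one irrelevant point,
# legible, and written in the student's own words.
example = SummaryRating(points_included=4, links_included=2,
                        irrelevant_points=1, illegible=False, own_words=True)
example_score = score_summary(example)
```

Keeping the weights as parameters means the relative value of points, links, and penalties stays an explicit decision for the test constructor and is not buried in the scoring logic.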

Assessing Written Expression

Perhaps the main thing to be said about a given test of written expression is that it is a poor substitute for repeated samplings of a learner’s writing ability while not under the pressure of an exam situation.

The current process-oriented approach to writing raises the question of whether it is sound testing practice to have learners write a single draft of a composition as a measure of their writing ability. Instead, might it not be more appropriate to have learners prepare multiple drafts that are reviewed both by peers (in small groups) and by the teacher at given moments?

Hence, if writing is to be assessed on a test, it would be important to provide the learners with specific guidelines as to the nature of the task. For example:

Your boss has asked you to rough out an argument for why the factory employees should not get longer coffee breaks. Try to present your arguments in the most logical and persuasive way. Do not worry about grammar and punctuation at this point. There is no time for that now. Just concern yourself with the content of your ideas, their organization, and the choice of appropriate vocabulary to state your case.

It is important for the person doing the assessment of the writing to pay attention only to those aspects of the task that learners were requested to consider.

Furthermore, the field of L2 writing has embraced the use of portfolios whereby the students prepare a series of compositions (possibly including the various drafts of each as well). Each entry may represent a different type of writing – for instance, one a narrative or descriptive or expressive piece, the second a formal essay, and the third an analysis of a prose text. Hence, the portfolio represents multiple measures of the students' writing ability. (For more on portfolios, see Hamp-Lyons, Condon, & Farr, 2000. The National Capital Language Resource Center has developed a manual, Portfolio Assessment in the Second Language Classroom, that can be downloaded or ordered from the web site.)

Final Thoughts

Advances in assessment have brought relatively untapped elements of language into assessment measures.

For example, language assessment may now include more finely-tuned assessment of languages for specific purposes (see Douglas, 2000) and of vocabulary (see Read, 2000), more sophisticated computer-based assessment (Dunkel, 1999), as well as the assessment of cross-cultural pragmatics (see Hudson, Detmer, & Brown, 1995; Brown, 2001; Cohen, in press).

With regard to pragmatics, expertise is accumulating in the assessment of speech acts such as complaining, apologizing, requesting, and so forth.

Likewise, the assessment field is refining its means of assessing language sensitive to the use of the target language in specific, often technical contexts; the field is taking assessment of second-language vocabulary knowledge beyond simplistic measures to better assess the depth and breadth of lexical control; and testers are pursuing research and development projects to provide us with not only computer-assisted assessment measures but computer-adaptive ones as well.

Center for Advanced Research on Language Acquisition (CARLA) • 140 University International Center • 331 - 17th Ave SE • Minneapolis, MN 55414
