Benchmark Assessment Efficacy Studies, Research Base and White Papers

BAS Research

Research Background

After the construction of the Benchmark Assessment System, an outside evaluation team conducted an independent study of the system's reliability and validity as a way of measuring reading progress against grade-level criteria. An independent agency reviewed the data. The first stage of the study provided valuable information for adjusting the difficulty of texts in detailed ways. The second stage provided data to assure that the texts provide a true gradient—that is, that each level is more difficult than the previous level and is easier than the next level. The study also provided information on internal consistency—that the fiction and nonfiction selections at each level are equivalent. The assessment was also correlated with the existing Reading Recovery® leveled assessment and a close fit was discovered. You can review either the Executive Summary or the Full Report.

The Benchmark Assessment System is new, but the F&P Text Level Gradient™ on which it is based has been developed over the last twenty years and used with high reliability to establish grade-level expectations. The F&P Text Level Gradient™, which was first published in the 1990s, has been refined over the years. You can now find over 50,000 books listed by level on fountasandpinnellleveledbooks.com. This gradient was used as a standard by the New Standards Project® (Resnick & Hampton, 2009). New Standards is a joint project of the Learning Research and Development Center at the University of Pittsburgh (Pennsylvania) and The National Center on Education and the Economy (Washington, D.C.). Heading a consortium of 26 U.S. states and six school districts, New Standards developed performance standards in English language arts and other areas.

The F&P Text Level Gradient™ is a defined continuum of characteristics related to the level of support and challenge that a reader meets in a text. Terms such as easy and hard are always relative terms that refer to the individual reader's foundation of background knowledge. At each level (A to Z) texts are analyzed using ten characteristics: (1) genre/form; (2) text structure; (3) content; (4) themes and ideas; (5) language and literary features; (6) sentence complexity; (7) vocabulary; (8) word difficulty; (9) illustrations/graphics; and (10) book and print features.

Texts are leveled using a highly reliable process in which teams of trained teachers, working independently and then through consensus, assign a level to books after analyzing them according to the ten factors. They are then analyzed by Fountas and Pinnell. The Benchmark Assessment books were actually created to precisely match the F&P Text Level Gradient™, and they were independently analyzed using the same process.

Often, information from readability formulas such as the Spache and Flesch-Kincaid is used as part of the text analysis process; however, those formulas measure a narrower range of factors, such as sentence length and number of syllables in words. The leveling system on which this assessment is based takes into account a more complex range of text factors (for example, literary features and abstractness of theme). In fact, it is well known that the grade levels produced by different formulas vary widely according to what is being analyzed.
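
To make concrete how narrow a readability formula's inputs are, here is a minimal sketch of the Flesch-Kincaid grade-level calculation in Python. The coefficients are the standard published ones; the sentence splitter and the vowel-group syllable counter are rough assumptions made for illustration and are not part of the Benchmark Assessment System or its leveling process.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level from words per sentence and syllables per word."""
    # Naive sentence split on ., ! and ? (an assumption; real tools are more careful).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


simple = "The cat sat. The dog ran."
dense = "Understanding complicated narratives requires considerable background knowledge."
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

Only two ratios enter the formula, so features such as genre, theme, or illustrations cannot affect the result. That is why such a score is not expected to agree exactly with a level assigned using the ten characteristics.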

So, we would not expect an exact correlation between those formulas and this assessment system. The assessment's levels do, however, predict student performance on the kinds of texts and comprehension tasks students are expected to handle in school. In a small evaluation in a city in Ohio, data showed that if students were proficient at levels M or N, there was a strong predictability of proficiency on the Ohio Achievement Test in grade 3. More data are being collected.

The Benchmark Assessment System is appropriate for use in RTI. It does not provide national norms or percentiles; it is not intended for national achievement testing. However, it is based on widely used grade-level criteria (see the website for detailed documents). It enables the classroom teacher and specialist teacher to engage in diagnosis of a variety of subskills. This complex and comprehensive assessment system is designed to measure progress in each of the subskills in a way that informs instruction. It is linked to a detailed continuum of observable behaviors to assess and teach for at every level (see The Literacy Continuum). Included in every BAS, this continuum offers a very specific bridge to instruction.

Resnick, L. B., & Hampton, S. (2009). Reading and writing grade by grade. Newark, DE: International Reading Association.

Field Study of Reliability and Validity

A formative evaluation of the Fountas & Pinnell Benchmark Assessment System was conducted to ensure that (1) the leveling of the texts is reliable and (2) the reading scores are valid and accurately identify each student's reading level. The purpose of the study was twofold. The first purpose was to examine every book, at every level, for the reliability of its designated level within a broader literacy framework and across corresponding fiction and nonfiction genres, i.e., is the readability of the books consistent across the fiction and nonfiction domains? For example, are the level G fiction and nonfiction books not only typical level G books, but do corresponding fiction and nonfiction books at this level have the same degree of readability? The second purpose of the evaluation was to determine the correlation between the Fountas & Pinnell Benchmark Assessment System and other reading assessments, i.e., to what extent is the Fountas & Pinnell Benchmark Assessment System associated with other valid reading assessments?

RESEARCH QUESTIONS
In order to determine the reliability and validity of the Fountas & Pinnell Benchmark Assessment System, the following three research questions guided the formative evaluation:

Research Question 1

How reliable is the Fountas & Pinnell Benchmark Assessment System? That is, how consistent and stable is the information derived from the reading books?
Does each book of the Fountas & Pinnell Benchmark Assessment System consistently occupy the same position on the gradient of readability, based on multiple readings by age-appropriate students? That is, does each book, levels A–Z, represent a degree of increased difficulty that is consistent with other Fountas and Pinnell leveled texts?

Research Question 2

To what extent are the gradients of difficulty for fiction and nonfiction books aligned within the Fountas & Pinnell Benchmark Assessment System? Do fiction and nonfiction books represent similar levels of difficulty within similar levels of reading?

Research Question 3

To what extent is the Fountas & Pinnell Benchmark Assessment System associated with other established reading assessments?

What is the convergent validity between the System 1 and Reading Recovery® assessment texts?
What is the convergent validity between the System 2 and the Slosson Oral Reading Test—Revised (SORT-R3) and the Degrees of Reading Power® (DRP)?
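
As an illustration of what a convergent validity check can look like, the sketch below correlates letter levels with scores from a second assessment. The page does not say which statistic the field study used; Spearman's rank correlation is an assumption here, chosen because levels A to Z are ordinal. The function names and the six student records are hypothetical and are not data from the study.

```python
from math import sqrt


def level_to_number(level: str) -> int:
    """Map a gradient level letter (A to Z) to 1..26."""
    level = level.strip().upper()
    if len(level) != 1 or not "A" <= level <= "Z":
        raise ValueError(f"not a level A-Z: {level!r}")
    return ord(level) - ord("A") + 1


def average_ranks(values):
    """Rank values starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up records for six hypothetical students (NOT data from the field study).
bas_levels = ["C", "E", "E", "H", "K", "M"]
other_scores = [3, 6, 5, 9, 14, 16]
rho = spearman([level_to_number(lv) for lv in bas_levels], other_scores)
print(round(rho, 3))
```

A coefficient near 1 would indicate that the two assessments order students in nearly the same way, which is the sense of "associated with" in Research Question 3.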