The New SAT -- an overview and very sketchy guide
Change triggers three typical reactions: roughly 15% of those affected will embrace it uncritically and will offer only positive criticisms of the change; another 15% will categorically reject the worth and principles of the change without giving either much thought or time; the remaining 70% (remember, these numbers are approximate) will want to know as much as possible about the change and will then "wait to see the results." It is to the 70% that I address myself, with the idea of giving as much information as possible and inviting them to a kind of window from which to "wait and see the results."
WHAT'S GOING ON
This March 12, a newly formatted SAT will be presented to high school juniors (also to sophomores wishing to get a jump on the task, and to sixth and seventh graders who want to take part in the Johns Hopkins Academically Talented Youth Search). The new format includes what should be more relevant math tasks, a critical reading section that promises to be a better judge of critical reading skills, and a writing section that supposedly gauges a student's grasp of Standard Written English. I use terms such as "should" and "supposedly" for two reasons: the SAT has not proven itself to be an accurate indicator of anything in the past, and there has not been enough testing of this test instrument to prove that the test makers have really done what they say they have done.
SO WHAT?
I'll address the first statement. There are two terms in test making that should be defined. The first is "reliability," which refers to a statistic indicating the degree to which a test will render the same score to the same student, or type of student, over repeated testing attempts. In plain terms, if a student takes an SAT in March and scores a 500 on the verbal, a 500 on the math, and a 500 on the writing, that same student should get the same scores in May and then in October, provided the student makes no unusual efforts to prepare. The SAT is a very reliable test, according to the statistics.
The second term is "validity," which refers to a statistic showing a direct correlation between performance on the test and performance on other assessment instruments (tests, essays, school performance) in the areas for which the test claims to give indicators. The SAT has never shown itself to be valid, according to this definition.
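For the curious, both of these statistics boil down to a correlation coefficient. Here is a minimal sketch in Python of how one might compute them; every number below is invented for illustration, and real reliability and validity studies use thousands of test-takers and more careful models than this.

# Both "reliability" and "validity" reduce to a correlation between
# two lists of numbers. All data here are made up for illustration.

def pearson_r(xs, ys):
    # Pearson correlation coefficient between two equal-length lists.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

# Reliability: the same five students sit the test twice.
march_verbal = [500, 620, 480, 710, 550]
may_verbal   = [510, 610, 490, 700, 560]
print("reliability r =", round(pearson_r(march_verbal, may_verbal), 2))

# Validity: the same students' scores against an outside measure,
# say first-year college GPA (again, invented numbers).
freshman_gpa = [2.9, 3.4, 2.7, 3.8, 3.1]
print("validity r =", round(pearson_r(march_verbal, freshman_gpa), 2))

An r near 1.0 between two sittings is what makes the test "reliable"; an r near 1.0 against outside measures like grades is what would make it "valid," and that is the number the SAT has never produced.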
No less a personage than Ralph Nader looked into this SAT thing. (I have some bad karma to work off over him, because my dad was the detective the Corvair engineers hired to find dirt on poor old Ralphie... as of 1966, clean, incidentally.) In the mid-eighties, Nader and his organization, using the correct formulas to calculate these correlations, compared SAT scores with several other performance measures: grades, schools attended, classes taken... in short, anything they could put a number to. The only factor that came even close to a one-to-one correlation was family income. High SAT scores are made by students who come from financially well-off families. (Dad isn't proud of trying to ruin Nader, by the way.) For a test that claims to predict how well suited a student might be for a particular college or university, this isn't good. (Dad got fired for not finding anything, and it put off my parents' wedding... this is somehow the fault of the SAT.)
The problem is, there really isn't anything else that the colleges can use as a standardized view of the thousands of students who apply. Yes, there is the ACT, but it has a great deal in common with the SAT, and there is no room here to discuss that test. The SAT is reliable and valid as an indicator of how well a student can do on an SAT, which is an intellectual task, so the test isn't a thoroughgoing fraud. Moreover, students can improve their scores using materials that the test makers distribute, so there is an element of effort and choice in the process.
Now I'll address the second statement. The SAT score that most students care about is generated by calculating the raw scores (a process that involves subtracting 1/4 of a point for each incorrect answer from the total of correct answers) of all the students who took the test on any given Saturday and determining the median. Students at the median get 500 on that section. The rest of the scores are then distributed along the 200-to-800-point scale, with the idea of generating a bell curve.
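To make the arithmetic concrete, here is a small sketch. The 1/4-point penalty comes straight from the description above; the scaling step is a deliberately simplified stand-in, since ETS's actual equating tables are not public.

# Raw scoring as described above: one point per correct answer,
# minus 1/4 point per wrong answer; omitted questions cost nothing.
def raw_score(num_correct, num_incorrect):
    return num_correct - 0.25 * num_incorrect

# Toy scaling: pin the median raw score to 500 and spread the rest
# over 200-800. (ETS's real equating is more involved and not public.)
def scale(raw_scores):
    ordered = sorted(raw_scores)
    median = ordered[len(ordered) // 2]
    spread = (max(raw_scores) - min(raw_scores)) or 1
    scaled = []
    for r in raw_scores:
        s = 500 + (r - median) / spread * 600
        scaled.append(int(max(200, min(800, round(s / 10) * 10))))
    return scaled

# Five hypothetical students on one section: (correct, incorrect)
raws = [raw_score(c, w) for c, w in [(50, 4), (38, 10), (44, 8), (30, 20), (55, 2)]]
print(raws)         # [49.0, 35.5, 42.0, 25.0, 54.5]
print(scale(raws))  # the median raw score (42.0) maps to 500

The point of the exercise: a 500 says only "this student was in the middle of the pack that Saturday," nothing more.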
ETS has not yet given this four-hour monster to anyone in a real testing situation, so there is very little data for anyone to go on. We do not know how anyone is going to perform. We don't even know how the rich kids are going to do. Daddy, get out the checkbook; Harvard needs a new squash court.
HOW'D THIS GET STARTED
In 1912, immigration to US shores was near its historic peak, consisting largely of ethnic whites: Italians, Eastern Europeans, and European Jews. The board members (which does not necessarily mean the presidents) of the top colleges in the US found this situation threatening. They predicted that these people would try to be educated in top US schools, and might even contemplate sending their children to these schools. This, they knew, would not do. With their joint and several danders at a proper altitude, they formed The College Board. They decided to devise a test that would weed out the non-native from the native, and thus keep the good schools a bastion of Anglo-American stock. Dr. Carl Campbell Brigham, who created the test, was enthusiastic about the project; as enthusiastic as he was about his other projects, which all involved eugenics and the sterilization of the "unfit." A less objective person might draw the conclusion that the SAT is a tool of white supremacy, but fortunately market forces and good educational philosophy came along and kept that somewhat in check. If the test is biased in today's world, it is simply because the individual test makers themselves have a bias that makes them write questions that skew to the upper-middle class.
WHAT ARE THEY ASKING?
The new SAT begins with an essay. Students have 25 minutes to produce an original essay on a given topic and are scored by two readers on a scale of 1 to 6, with 6 as the high score. To earn that 6, students must demonstrate outstanding critical thinking, clear and consistent mastery of the language, and insightful completion of the task.
The students are instructed to use their readings, studies, observations, and experiences to answer the question. The questions are kept general, and worded such that a (false?) dichotomy is easy to see and to support regardless of which side a student might embrace. Of the essays I have read, those based on personal experience tend to ramble, use vocabulary and sentence structure that are too simplistic, and often miss the point. Using examples from history and literature often makes for better essays; students are more accustomed to finding and supporting viewpoints when they have had a structured introduction to the themes of books or the issues in history.
The test then continues with three sections each of math, largely taken from algebra and geometry, in multiple-choice and student-produced-response formats, and critical reading, with sentence completions invented by the test makers and reading passages culled from academic trade journals (all of these questions are multiple choice). Beyond that, there are two writing sections with grammar questions of three types: finding the errors in sentences, identifying the best rewrite for a faulty sentence, and finding the best fixes for faulty elements of essays. The grammar issues addressed are finer points -- one might almost call them trivial (if one didn't fear the wrath of the English Teacher descending in swift and terrible vengeance) -- of standard written English. There is an additional section, called the "experimental section," where new questions are tested to see how students will react to the wrong answers.
Anyone curious enough to wonder how he or she would do on today's test can download a full sample test from the College Board website; the sample omits the experimental section. Space and copyright prevent me from reproducing any questions here.
WHAT ARE WE WATCHING FOR?
The reliability is the thing. This is the virtue of the SAT. ETS can reliably present a bell curve to the colleges and universities, which entities can then use those scores as they choose. Consumers and observers of the SAT must now wait to see if the anxiety of the essay, the shift of the verbal section from vocabulary to reading comprehension, the refocus of the math section to include real mathematical knowledge, and most of all the longer testing time, will have some kind of impact on the bell curve.
The key to taking a balanced view of the SAT is to understand that those who look at the scores appreciate their real worth and weigh them accordingly. Those who want something to worry about might reflect on the fact that, most often, SAT scores are used to eliminate rather than to choose applicants. If the new test proves unreliable as well as invalid, perhaps that will happen less often. It might restore a sense of perspective and provide an impetus for educators to find better ways to select applicants. Educators could come to value appropriately the results of a single Saturday in the life of a college applicant.