FIELD SOBRIETY TESTING: A LOOK AT THE SCIENTIFIC RESEARCH UNDERLYING THE TESTING.
Jennifer Wirth
In late
1975, the National Highway Traffic Safety Administration (“NHTSA”)
During the
1977 study, the researchers had the participating police officers attend one
training session on the administration of FSTs prior
to testing subjects in the lab environment.
Id. at 13.
The alcohol consumption of the test subjects was controlled prior to
undergoing the FST battery so that the subjects’ alcohol intake varied from
intoxicating levels to placebo doses. Id. at 15.
After a subject completed the test battery, the officer would indicate
whether they would have arrested or released that subject in a field
setting. Id. at 25.
Of the total arrest decisions made in the lab environment, the officers
made an incorrect decision to arrest a subject 47% of the time. Id.
At the
conclusion of the 1977 study, Marcelline Burns and Herbert Moskowitz, the SCRI
researchers, authored a final report detailing their laboratory findings. Id.
The report provided the percentage of times the officers had made a
correct determination that a test subject was intoxicated based on their FST
performance. Id. at 39.
The percentages of accurate decisions for the
walk-and-turn and finger-to-nose tests were reported among the findings.
Id.
According to the study, the officers made correct determinations nearly
57% of the time based on a subject’s performance on the finger-to-nose test
when the subject had a blood alcohol content (BAC) of
.10 or more. Id. Similarly, the
officers correctly identified intoxicated subjects 60% of the time after
administering the walk-and-turn test. Id.
Following
the 1977 study, the SCRI researchers began to conduct field validation studies
of their laboratory results on the FSTs. “Development
and Field Test of Psychophysical Tests for DWI Arrest.” V. Tharp, M. Burns, and H. Moskowitz. DOT-HS-805-864 (1981).
In 1981, the researchers prepared a report detailing their findings. Id.
The report indicated that the walk-and-turn test, the one-leg-stand, and
the horizontal gaze nystagmus test detected intoxication in the field as well
as in the lab. Id. Based on their field
observations, the researchers recommended standardization of the three-test
battery. Id.
In September of 1983, the SCRI
researchers published additional results of their field validation studies of
the FSTs. “Field Evaluation of a Behavioral Test
Battery for DWI: Research and
Development.” M. Burns and H. Moskowitz.
DOT-HS-806-475 (Sept. 1983). In
this field study, an evaluation of the three-test battery was performed by
having battery-trained police officers record data on 1,506 drivers stopped for
DWI. Id.
The study reported that officers made correct decisions to arrest based
on the walk-and-turn test 83 percent of the time. Id.
Despite this accuracy rate, the
authors cautioned that significant reasons existed to be wary of the data and
conclusions in the 1983 study. Id.
First, the officers were not randomly assigned to different groups, so
the outcomes may have been affected by selection and assignment bias. Id.
Further, the subjects were given portable breath tests (“PBT”) prior to
the arrest decision in the majority of cases. Id.
Because of the use of the PBTs, the researchers expressed concern that
the officers’ evaluations of the subjects’ test performance may have been
affected by their knowledge of the PBT readings. Id.
At the close of the three studies,
the National Highway Traffic Safety Administration (“NHTSA”) published its
first training manual for field sobriety testing. “Improved
Sobriety Testing.” DOT-HS-0-421-018 (1984). The manual implemented the recommendations of
the SCRI researchers by creating a standardized three-test battery composed of
the walk-and-turn, one-leg-stand, and horizontal gaze nystagmus tests. Id.
The finger-to-nose test, at issue in this case, was not recommended for
the standardized battery, and thus, was not included in the manual.
The NHTSA training manual provided
detailed instructions on how the standardized test battery should be
administered. Id. The NHTSA cautioned
that “if the standardized testing and scoring procedures in this Manual are not
followed, the decision making guidelines will not be accurate.” Id.
The NHTSA based its accuracy estimates on the previous SCRI studies it
had commissioned. Id.
For several years, the scientific
community largely ignored the initial SCRI studies that had given rise to the
standardized FST battery. Finally, in
1994, the first independent review of the SCRI studies was published by Dr.
Spurgeon Cole, a psychologist, and Ronald Nowaczyk. “Field
Sobriety Tests: Are They Designed for
Failure?” Dr. S.
Cole and R. Nowaczyk. Journal of Perceptual
and Motor Skills, pp. 99-104 (1994).
In the article, Cole and Nowaczyk critiqued the prior SCRI research on
several grounds. First, they highlighted
that in the original 1977 study, 47% of the subjects would have been falsely
arrested based on their performance on the FSTs. Id.
at 100. Similarly, Cole and Nowaczyk were alarmed
that 32% of the participants were incorrectly judged as having BACs of .10 or
higher in the 1981 SCRI study. Id.
Due to these high false arrest rates, they condemned the standardized
FSTs on the ground that the accepted reliability coefficient for standardized
clinical tests is 85% or higher, yet the reliability coefficients for the
standardized FSTs, as reported in the NHTSA studies, ranged from 61% to 72% for
individual tests and 77% for individuals that were tested on two different
occasions while dosed with the exact same BAC.
Id. Finally, the
researchers were concerned by the low inter-rater reliability rates that existed
where different officers scored the same subject. Id.
They highlighted that the inter-rater reliability rates of the SCRI studies
were only 34% to 60%, with an overall rate of 60 percent. Id.
Based on
their review of the SCRI studies, Cole and Nowaczyk theorized that the
standardized FSTs, particularly the
walk-and-turn and one-leg-stand tests, required the subjects to perform
unfamiliar, unpracticed motions and noted that very minor miscues can result in
a false conclusion that the subject is legally intoxicated. Id.
Their hypothesis was that sober individuals would find the tests
difficult to perform, and thus, be classified as intoxicated as a result of
unfamiliarity with the test, rather than actual intoxication. Id.
The
researchers tested their hypothesis by videotaping twenty-one completely sober
individuals performing “normal abilities tests” (reciting their addresses,
phone numbers or walking in a normal manner). Id.
They also taped the same individuals performing the walk-and-turn and
one-leg-stand tests. Id. The
researchers had a group of police officers view the videotape and determine
whether the subjects were intoxicated. Id. at 100-101. Based on the subjects’
performance on the “normal abilities” tests, the police officers made incorrect
determinations of intoxication 15% of the time. Id. at 102.
However, when viewing the walk-and-turn and one-leg-stand test
performance, the officers incorrectly judged a subject as intoxicated 46% of
the time. Id. Based on these
findings, the researchers concluded that:
The
standardized field sobriety tests must be held to the same standards the
scientific community would expect of any reliable and valid test of behavior.
This study brings the validity of field sobriety tests into question. If law enforcement officials and the courts
wish to continue to use field sobriety tests as evidence of driving impairment,
then further study needs to be conducted addressing the direct relationship of
performance on these and other tests with driving. To date, research has concentrated on the
relationship between test performance and BAC and officers’ perceptions of
impairment. This study indicates that
these perceptions may be faulty. Id. at 103.
This research marked the first published criticism of the
standardized FSTs by the scientific community.
Following
the Cole study, Dr. Marcelline Burns, the scientist on the original SCRI
research, began new validation studies of the standardized test battery. In 1995, Burns and Anderson conducted a
validation study, funded by the Colorado Department of Transportation, to
determine the accuracy of arrest decisions when the standardized FSTs are
administered by experienced officers. “A Colorado Validation
Study for the SFST Battery.”
M. Burns and E. Anderson. 95-408-17-05 (1995). The Colorado validation study claimed that the
officers’ arrest decisions on subjects who submitted to chemical testing were
accurate 86 percent of the time. Id.
In a study published in 1997, Burns
reported her findings after conducting additional validation studies in Florida. “A Florida Validation
Study for the Standardized Field Sobriety Test Battery.”
M. Burns and T. Dioquino. (1997). In this study, the arrest decisions for 256 subjects were checked against
BAC records. Id. Based on the BAC
records, Burns concluded that 95% of the arrest decisions were correct when the
officers administered the three-test battery. Id.
In 1998,
Burns published a final report of her findings based on an additional
standardized FST validation study conducted in San Diego, California. “Validation of
the Standardized Field Sobriety Test Battery at BACs Below .10 Percent.” M. Burns and J. Struster. (August 1998).
In this field study, battery-trained officers administered FSTs on
routine patrols and completed a data collection form for each test
administered. Id. At the conclusion of
the FST battery, the officers obtained a breath alcohol test from the subject
to validate their arrest decision. Id.
Based on the combined use of the FSTs and breath test results,
Burns reported that the officers’ decisions to arrest were correct 91% of the
time. Id.
Since 1998,
the various SCRI studies have been subject to harsh scrutiny by the scientific
community. The newly emerging criticisms
of the SCRI research undermine its scientific value by detailing several core
problems with the research, including, but not limited to, high “guess rates,”
unacceptable false arrest rates,
incomplete findings, poor sample population, bias, and tainted data
collection procedures. “DWI. NHTSA Field Sobriety Tests: Validation v. Invalidation.” Spurgeon Cole, Ph.D., and Phillip Price. The Champion.
(April 2001); “Field Sobriety Tests:
Are They Designed for Failure?”
Dr. S. Cole and R. Nowaczyk. Journal of Perceptual and Motor Skills,
pp. 99-104 (1994); Affidavit
of Harold Brull in Horn Case; Affidavit of Joel P. Wiesen
in Horn Case. Further, the relevant
scientific community is beginning to raise concerns as to the general validity
of FSTs due to the failure of credible research to
demonstrate an independent link between FST
performance and intoxication.
Field Sobriety Testing
should be treated as scientific evidence, and therefore, should have to meet
the standard of being generally accepted in the scientific community before a
Court can consider a Defendant’s negative performance as evidence of
intoxication.
The Illinois courts have adopted the standard set
forth in Frye v. United States to determine the admissibility of
scientific evidence. Frye v. United States, 293 F. 1013 (D.C. Cir. 1923); People v.
Basler, 193 Ill.2d 545 (2000).
Pursuant to Frye, scientific or technical evidence is admissible
when the question before the court is beyond the general knowledge of the
average individual so long as expert testimony provides a proper foundation for
the scientific principle sought to be introduced. Frye, 293 F. at 1014; People v. Vega, 145
Ill.App.3d 996 (4th Dist. 1986).
In order to provide such a foundation, the proponent of such evidence
must demonstrate that the evidence is generally accepted by the relevant
scientific community. Frye,
293 F. at 1014; People v. Basler, 193 Ill.2d 545 (2000). The requirement of “general acceptance”
assures that those most qualified to assess the general validity of a scientific
method will have the determinative voice.
“The Admissibility of Novel Scientific Evidence: Frye v. United States – A Half-Century Later.” 80 Colum.L.Rev. 1197 (1980).
The
Illinois Supreme Court has never been faced with the opportunity to address the
issue of whether field sobriety tests are “scientific” and/or “technical,” and
therefore, must meet the Frye standard before being admitted into
evidence. However, the Second and Fourth
Appellate Districts have considered this issue and have ruled that the only
foundation required to introduce the results of FSTs
is the experience of the officer administering the tests. People v. Vega, 145 Ill.App.3d 996 (4th
Dist. 1986); People
v. Sides, 199 Ill.App.3d 203 (4th Dist. 1990); People v. Bostelman, 325 Ill.App.3d 22 (2nd Dist.
2001). These decisions largely rest on
dicta set forth in People v. Vega.
Id. et al. This continual reliance on Vega is
problematic since the Vega court was not presented with any scientific
materials upon which to base its determination.
For
instance, in People v. Vega, the issue before the Court was whether the
State had laid a proper foundation under Frye for the admission of the
Horizontal Gaze Nystagmus (HGN) test. Id. at 997. The defense never challenged the admission of
the other FSTs performed on the Defendant. Id. at
996-1002. At the trial
court level, neither the State nor defense counsel presented any scientific
evidence regarding field sobriety testing.
Id. at 1001. In
challenging the admissibility of the HGN test, defense counsel relied on the
officer’s testimony as to how he had administered the test. Id. at 1000.
On appeal to the Fourth District, the State
and Defendant attached scientific literature to their briefs. Id. at 1001. The Court refused to consider such evidence
on the ground that the attachments to the appellate briefs had not been seen by
the trial court. Id. Due to the lack of scientific evidence, the
Court declined to rule on the admissibility of the HGN test. Id. at 1001. However, in making that ruling, the Court
stated as dicta that “The other tests, ‘walk the line,’ ‘one leg stand,’ and ‘finger
to nose,’ are not so abstruse as to require a foundation other than the
experience of the officer administering them.”
Id. at 1001.
Four years
later, the Fourth District was squarely presented with the issue of whether FSTs must meet the Frye standard before being
admitted as evidence. People v. Sides, 199 Ill.App.3d at 205. The opinion does not indicate that any
scientific evidence was presented to the Court to aid in its
determination. Id. at
203-207. Rather, defense counsel argued
that Frye was not met because the officer was unaware of the scientific
theory underlying the FSTs. Id. at 205.
In arriving
at their ruling, the Sides court relied heavily on the dicta of Vega
and Illinois Pattern Jury Instructions 1.01 and 23.05. Id. at 206. Under IPI 23.05, the jury could determine a
defendant was under the influence “when, as a result of drinking any amount of
intoxicating liquor, his mental and/or physical faculties are so impaired as to
reduce his ability to think and act with ordinary care.” Id. Further, pursuant to IPI 1.01, the jury is
entitled to consider all the evidence in light of their own
observations and experience in life. Id. After considering Vega and the IPI
instructions in conjunction, the Court held that “no expert testimony is needed
nor is a showing of scientific principles required before a jury can be
permitted to conclude that a person who performs badly on the field sobriety
tests may have his mental or physical faculties ‘so impaired as to reduce his
ability to think and act with ordinary care.’”
Id. at
206-207.
The reasoning behind the Sides holding
is virtually unfounded. First, the
reliance on Vega is flawed given that the Vega court made its
statement as dicta without any evidence on FSTs
before it for consideration. Vega,
145 Ill.App.3d at 1001. Further, the IPI Instructions are not actual
governing law and as such, carry no precedential
value. The Sides Court narrowed
its reasoning to the IPI instructions presented to that particular jury in that
particular case as the instructions existed at that particular time. All of these factors invariably can change on
a case-by-case basis and are not an appropriate tool to determine whether
something qualifies as “scientific” or not under Frye.
Finally,
the core issue of the scientific reliability of the FSTs
was never truly addressed by the Sides Court. Rather, the Court sidestepped the issue by
stating that a jury is experienced in driving a car, understands what physical
acuity is necessary to do so, and can make the inference that poor performance
on the FSTs can demonstrate impairment in operating a
motor vehicle. Id. at 206. This is meaningless given that the issue
before the Court was not whether the jury can make the inference that the
Defendant was impaired after poor performance on tests that demonstrate
intoxication – the issue was that FSTs do not carry a
reliable link between poor performance and intoxication. Id. at 205. The court never addressed this issue.
Eleven
years later, the Second District considered the issue of whether the
Defendant’s trial counsel was ineffective for failing to exclude the FST
results as lacking foundation. Bostelman, 325 Ill.App.3d at 24. Defense counsel argued that the foundational
requirements were not met because the officer failed to testify as to his
training and experience in regard to the administration of FSTs. Id. at 32. The Court held that “so fundamental are
exercises of balance, coordination, and basic cognition to the activity of the
average person that ‘even a layperson is competent to testify regarding a
person’s intoxication.’” Id. at 33. In so holding, the Court relied largely on
the Vega dicta and the Sides decision.
Since
Bostelman, the Federal District Court of
Maryland has had the opportunity to consider whether FSTs
are scientifically reliable indicators of intoxication. United States v. Horn, 185 F.Supp.2d 530 (D. Md. 2002). (Online Copy of Order and
Memorandum, accessible at http://www.mdd.uscourts.gov/) In determining whether the FSTs are “scientific,” the Court stated that “because the
results of the SFSTs … may involve the application of
scientific, technical, or other specialized information, the requirements of
Rule 702, as recently revised, are of paramount importance.” Id. at 8.
In analyzing the admissibility of FSTs, the
Court applied the Daubert factors to determine
whether the scientific principles underlying FSTs are
reliable. Id. at 56. By doing so, the Court necessarily
crossed the threshold of determining that FSTs are
“scientific.” If it had not, there
would have been no need to apply the Daubert
factors to the FST evidence.
Further,
the scientific community is making it clear that FSTs
are by their very nature, “scientific.” In Horn, the defense provided
several affidavits from experts that detailed their objections to the SCRI
research that gave rise to the current standardized test battery. In the affidavit of Harold P. Brull, the Vice-President of a large
industrial/organizational psychology consulting organization, he explicitly
concluded that the FSTs are in fact scientific
tests. He stated:
There is absolutely no question that the
use of FSTs to predict impairment or blood alcohol
concentrations is a scientific question.
Neither the fact that the tests are behavioral or, in some cases, do not
require mechanical devices, obviates this fact.
The measurement of pulse by one’s fingers applied to an artery is no
less a scientific test than the measurement of body temperature via a
thermometer. The behaviors required of a
field sobriety test are not analogous to those of driving a car. One must make an inference from the former to
the latter. This is comparable to an instrument
reading from which one makes an inference regarding aspects of an individual’s
health (e.g. elevated body temperature as an indication of infection). (Brull
Affidavit, Page 5.)
Aside from the Brull affidavit, the
scientific nature of FSTs was explained in a 2001
article, published in the Journal of Clinical Forensic Medicine. In the article, the author stated:
If we consider exactly what SFSTs actually assess, there can be no doubt the answer
must be not only sobriety, but also a variety of physical, neurological,
intellectual, and cognitive functions which interlink information processing,
organization skills, short term memory, spatial awareness, balance and
coordination, however not the least, the ability to perform these rigid and
complicated tests under stress. “Drugs, Driving, Standardized Field Sobriety
Tests: A Survey of Police Surgeons in Strathclyde.”
O’Keefe. Journal of Clinical
Forensic Medicine, pp. 63-64.
(2001).
More recently, Steven Rubenzer, a
clinical and forensic psychologist, explained the scientific character of the FSTs in an article published in the Champion
Magazine. “The Psychometrics and Science of Standardized
Field Sobriety Tests (Parts I and II).” S. Rubenzer. Champion Magazine. May 2003 & June 2003. In the article, he
stated that “standardized FSTs are quite similar to
neuropsychological tests, which detect brain damage and assess sensory, motor,
and cognitive impairment.” Id.
In
light of these statements from the scientific community, Illinois courts are at a better vantage point
than previous courts to determine if FSTs implicate
scientific principles. In the prior Illinois rulings, legal minds relied largely
on their instincts to determine whether the FSTs are
worthy of being classified as “scientific.”
With the growing body of research on FSTs, the
courts can make more informed determinations regarding the scientific nature of
FSTs by relying on the unambiguous conclusions of
scientists that FSTs indeed bear the brand of being
“scientific.” By using a “test” that
purportedly measures behavioral and cognitive functions,
the courts have delved into the Frye arena.
The general acceptance by the relevant
scientific community of FSTs has not been adequately
litigated in Illinois courts.
If the FSTs
are deemed “scientific,” Frye requires that the tests be generally
accepted in the relevant scientific community before they are admitted into
evidence. Frye,
293 F. at 1014. This requirement
assures that those most qualified to assess the general validity of a scientific
method will have the determinative voice.
“The Admissibility of Novel Scientific Evidence: Frye v. United States – A Half-Century Later.” 80 Colum.L.Rev.
1197 (1980).
Because prior Illinois cases never crossed the threshold
that FSTs are in fact scientific, the issue of their
general acceptance has not been adequately litigated in Illinois courts.
In United States v. Horn, the
Federal District Court of Maryland had the opportunity to consider whether the
standardized FSTs are “generally accepted” when
determining the admissibility of FSTs under the Daubert standard.
In that decision, the Court found that standardized FSTs
were not subject to “general acceptance” by the scientific community. In so ruling, the Court stated:
…
Acceptance by a relevant scientific or technical community implies that the
community has the expertise critically to evaluate the methods and principles
that underlie the test or opinion in question.
However skilled law enforcement officials, highway safety specialists,
prosecutors and criminologists may be in their fields, the record before me
provides scant comfort that these communities have the expertise needed to
evaluate the methods and procedures underlying human performance tests such as
the SFSTs. Horn
Order, Pages 61-62.
Upon making this determination, the Court went on to rule
that SFSTs are not admissible as direct evidence of
intoxication or impairment because they fail to meet the standards for
admissibility in federal court. Id. at 62.
Since
the 1977 study, the FSTs have been subject to
ever-increasing criticism by the scientific community. As stated earlier, the early SCRI research
was not subject to much peer review because it was never submitted for
scientific publication. As such, the
1994 Cole Study marked the first independent review of the FSTs
by the scientific community. In that
study, the researchers not only criticized the findings of the SCRI research,
but concluded that the link between FSTs and
intoxication must be re-examined after their research demonstrated that
officers would have arrested sober individuals 46% of the time based upon their
performance on the SFSTs.
In 2001, police surgeons were interviewed at a
conference as to their opinions regarding the reliability of SFSTs. In an article
published in the Journal of Clinical Forensic Medicine, O’Keefe reports that 46%
of the police surgeons expressed reservations regarding the overall use of the SFSTs, with the walk-and-turn test causing concern from at
least 50% of the doctors. Id. at 61. Aside from intoxication, O’Keefe notes that poor
performance on the FSTs may be due to dyslexia,
general fatigue, stress, or other undiagnosed conditions. Id. at 63. Because FST performance can be affected by
individual health, the author questions the validity of the tests when used as
a basis for determining intoxication. Id. at
63-64.
Later
in 2001, Cole co-authored an article criticizing three of the validation
studies for the standardized FST battery. “DWI.
NHTSA Field Sobriety Tests:
Validation v. Invalidation.” Spurgeon
Cole, Ph.D., and Phillip Price. The Champion.
(April 2001) (“Appendix F”). In the article, Cole explains that the
results of the Colorado study are
inflated because the “guess rate” was 79 percent. The term “guess rate” refers to the proportion of
persons in the sample population who were intoxicated. Id. at 2. He explains that the effect of having such a
high guess rate is that “an officer can simply arrest everyone in the sample
and be correct 79 percent of the time.” Id.
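The arithmetic behind the guess-rate criticism can be sketched in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the inputs are simply the 86 percent accuracy figure reported for the Colorado study and the 79 percent guess rate cited in the Cole article. It compares the reported accuracy with what an officer would achieve by arresting every subject in the sample.

```python
def arrest_everyone_accuracy(guess_rate: float) -> float:
    """Accuracy of arresting every subject when `guess_rate` of them are intoxicated.

    Hypothetical helper: if an officer arrests everyone, every intoxicated
    subject is a correct arrest, so accuracy equals the guess rate.
    """
    if not 0.0 <= guess_rate <= 1.0:
        raise ValueError("guess_rate must be between 0 and 1")
    return guess_rate


def margin_over_guessing(reported_accuracy: float, guess_rate: float) -> float:
    """Percentage points by which reported accuracy exceeds the arrest-everyone baseline."""
    return (reported_accuracy - arrest_everyone_accuracy(guess_rate)) * 100


# Figures taken from the text: 86% reported accuracy, 79% guess rate (Colorado study).
colorado_margin = margin_over_guessing(reported_accuracy=0.86, guess_rate=0.79)
print(f"Margin over arresting everyone: {colorado_margin:.0f} percentage points")
```

On these figures, the reported accuracy exceeds the arrest-everyone baseline by roughly seven percentage points, which is the sense in which the article describes the Colorado results as inflated.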
The
article also criticizes the 1998 Validation Study conducted by the NHTSA on the
ground that the data collection procedures were improper. Id. In the 1998 study, the officers were given PBTs and no observers were used at the roadside. Id. Cole claims that this procedure totally
nullifies the attempted validation of the standardized FST battery. Id. He explains that:
The fact that a PBT
was furnished to the arresting officers with no observers present is an
improper method of data collection… Data
must be collected in a trustworthy manner with objectivity built in to insure a
fair sampling process… If one interjects subjectivity and/or the opportunity of
unreliable data with no controls, the experiment fails. No reliable conclusions can be drawn from a
study when all the participants are given a method and an opportunity to know
the answers to the test. Id.
Cole also expresses concern that,
even with the data collection errors favoring law enforcement, 29% of persons
with a BAC of less than .08 were arrested in the study. Id.
Finally,
the Cole article notes that the Florida Validation Study is so incomplete that it
is incapable of evaluation by the scientific community because no reliability
or validity scores were provided by the researchers – a necessary requirement
for scientific research. Id.
Further, as in the Colorado study, the accuracy findings were
falsely inflated due to the fact that 80 percent of the sample population was severely
intoxicated. Id.
The article concludes that “an accuracy rate of 90 percent does not look
very good when you consider the guess rate is 80 percent and the mean BAC level
is almost twice the legal limit of .08 percent.” Id.
In
the Horn case, the scientific community again questioned the validity of
the SCRI research underlying the standardized FSTs. Horn
Order. In Horn, the defense
submitted several affidavits from scientists that detailed their concerns with
the reliability of the SCRI studies. Brull Affidavit and Wiesen
Affidavit. In his affidavit, Brull notes that there are no “known error rates” for the
SCRI studies. He explains that a “known
error rate” exists where comparable results are achieved by independent
observation. Id.
Because the lab results have never been replicated or been subject to
peer review, known error rates for the standardized FSTs
cannot be determined. Id.
Brull also expressly states that the studies do not meet
scientific standards because there are fatal reliability problems with the
research. He begins by focusing on
“inter-rater reliability rates.” Id. According to Brull,
“inter-rater reliability” refers to the likelihood that different test
administrators would reach the same conclusion. Id.
He notes that the inter-rater reliability rate in the
SCRI research for arrest decisions was only 59 percent. Id.
Similarly, he points out that the near 50 percent rate of false arrest
decisions in the 1977 lab study is “comparable to deciding whether a person
should be arrested by flipping a coin.” Id.
Aside from his many other
criticisms, Brull echoes the concerns of the Cole
article by concluding that the accuracy rates of the SCRI studies are inflated
due to the high number of persons who were intoxicated in the sample
population. Id.
He also determined that the later validation studies are too incomplete to
evaluate them as studies – calling them mere “summary reports, without
foundation of findings.” Id.
In the Horn case,
Joel P. Wiesen, an expert in test development and a
psychologist, provided an affidavit that set forth a lengthy list of reasons
why the SCRI studies do not meet scientific standards. Among his extensive criticisms, Wiesen cites bias, false positives, high guess rates,
inflated accuracy rates, tainted data collection procedures, and incomplete
findings. Id. He concluded that
“these publications [the SCRI research], singly and taken together, show only
that the FST may have promise as a psychological test. …If any of these studies were submitted for
publication in a peer-reviewed research publication, in my opinion they would
be rejected due to their serious shortcomings in methodology and data
analysis.” Id.
Further, in its Order, the Horn Court details the testimony of Dr.
Spurgeon Cole at the hearing on the defense’s Motion in Limine. Horn
Order, Pages 20-26. At the hearing,
Cole explained that the combined “test-retest reliability” rates in the studies
do not meet scientific standards. Id.
He defined “test-retest reliability” as the achievement of the same test
result with the same individual under the same conditions at different points
in time. Id.
He observed that the “test-retest reliability” rate for the SCRI
research was only 77 percent. Id.
Cole explained this does not meet scientific standards because the
scientific community “expects reliability coefficients to be in the upper .80s
or .90 for a test to be scientifically reliable.” Id. at 23.
During his testimony,
Cole also disputed the claim that the SCRI research is “generally accepted” in
the scientific community. Id. at 25. In doing so, he stated that “it is
difficult to see how the NHTSA could claim that the FST is accepted in the
scientific community, when results of studies on the validation of the FST have
never appeared in a scientific peer reviewed journal, which is a basic
requirement for acceptance by the scientific community.” Id.
He also highlighted the internal errors of the studies, stating that “a
careful reading of the reports themselves provides support for the inadequacy
of the FST battery. The reports include
low reliability estimates for the tests, false arrest rates between 32 and 46.5
percent, and a field test of the FST that was flawed because the officers in
many cases had breathalyzer results at the time of the arrest.” Id. at 26.
Since the Horn
decision, Steven Rubenzer, a clinical and forensic
psychologist, provided an extensive list of his criticisms of the standardized FSTs in a recently-published article in the Champion Magazine. Appendix E. Among his many concerns, Rubenzer
states that the SCRI studies are unreliable due to the lack of peer review,
failure to control variables, poor sample population, variety of situations
permitted for the walk-and-turn test (e.g., imaginary line, crooked line, offset
line), lack of studies on the impact of anxiety during testing, low
reliability coefficients, high guess rates, and failure to employ
“double-blind” research techniques. Id.
In his conclusion, Rubenzer explains that “the
standardized FSTs have significant limitations as
tests that should be understood by those who encounter them in the legal
arena. … Prosecutors and judges need to
critically examine the SFST evidence offered in DUI cases so that innocent
people are not wrongfully convicted.” Id.
Given the large-scale
criticism from researchers and psychologists, it cannot be said that the FSTs are “generally accepted” in the scientific
community. In fact, the scientific
community was not even asked for its acceptance by the SCRI researchers when
seeking to demonstrate a link between FSTs and
intoxication. The studies have never been
subjected to peer review in a journal, and, judging by the newly emerging attacks
on the research, the SCRI studies might not even have met the standards for
journal publication.
Although the lack of
scientific acceptance is not a “winning issue” in Illinois, it is an important
issue that defense attorneys should raise in DUI cases to highlight
the scientific criticism of the research underlying the FSTs.
It may prejudice a client to admit the FST evidence
in Court if it is not “scientific” because the “tests” carry an aura of
scientific reliability that has not been adequately established by the
NHTSA-commissioned research.
Even if the
FSTs are not subject to Frye under Illinois
law, defense attorneys should still consider making a motion to bar the introduction
of FST evidence at trial due to their potential for prejudice.
Relevant
evidence is admissible if it is more probative than prejudicial. People v. Pantoja, 231 Ill.App.3d 351, 354 (2d. Dist. 1992). It is
the trial court’s function to weigh the probative value and prejudicial effect
of evidence to determine whether it should be admitted. Id.
It is a basic proposition
that the FSTs are either “scientific” or they are
not. If Illinois courts determine that
the FSTs do not fall within the realm of “scientific”
methods, they should exclude the FST evidence on the ground that such evidence
is more prejudicial than probative.
As the
previously discussed studies have indicated, poor performance on FSTs can be the result of a wide array of innocent factors,
such as fatigue, anxiety, unfamiliarity with the tests, brain damage and other
undiagnosed medical conditions. When
holding out the FSTs as “tests” that diagnose
intoxication, there is a high risk that a jury hearing such testimony will
believe that poor performance on FSTs is a reliable
scientific basis for the officer’s conclusion that the Defendant was under the
influence. If no scientific basis exists
for such a conclusion, the FSTs carry a high risk of
prejudicial impact by creating an aura of scientific reliability in so-called
“tests” that are no more than mere observations that may or may not indicate
intoxication.
The extensive
errors in the research that produced the test battery undermine any alleged link
between poor test performance and intoxication.
Further, the recent studies demonstrating that FSTs
diagnose conditions unrelated to intoxication illustrate the heightened risk of
prejudice when introducing FSTs as evidence of
intoxication. The lack of scientific
basis for FSTs presents a clear danger that the jury
will trust that FSTs, by formalizing human movements into a
purported “test” battery, truly are capable of determining
intoxication, when in fact there is no scientific basis
for this conclusion.
The
prejudicial impact is also not cured by the determination that FSTs do not carry any aura of scientific reliability, and
therefore, fall strictly within the realm of mere observations of the
officer. In a study entitled
“Psychology, Public Policy, and the Evidence for Alcohol Intoxication,” the
research findings clearly indicated that non-medical observers are not able to
reliably diagnose intoxication by mere observation. “Psychology, Public
Policy, and the Evidence for Alcohol Intoxication.” Langenbucher, J.
and Nathan, P. American
Psychologist. Pages 1070-1077. (October 1983). In the study, the researchers tested the
ability of social drinkers, police officers, and bartenders to accurately
determine whether an individual was intoxicated. Id. at 1071. The study reported that all three groups were
only able to correctly judge an individual’s intoxication 25% of the time when
basing their determination on visual observation alone. Id. at 1076.
At a minimum, Courts should not allow the field
sobriety exercises to be termed “tests” or “examinations” if they are not
determined to be scientific.
If
Illinois courts continue to characterize the FSTs as
non-scientific, defense attorneys should make efforts to minimize the
prejudicial impact to their client by making a motion to bar any testimony
stating that the Defendant “passed” or “failed” these exercises. Further, the motion should request that
witnesses be prohibited from making any reference to the FSTs
as “tests” or “examinations.”
As stated
earlier, the FSTs must logically be considered either
“scientific” or “non-scientific.” If the
field sobriety tests are not scientific, they do not qualify as tests of human
performance. To allow any language, explicit
or implied, suggesting that field sobriety exercises amount to “tests” or “examinations”
that a defendant could “pass” or “fail” would unduly prejudice the defendant in that it
would cast an aura of scientific reliability on these exercises that does not
exist.
Illinois
appellate courts have not yet explicitly considered this argument. However, in State v. Ferrer,
the Hawaii Appellate Court was recently confronted with this issue. 23 P.3d 744 (Hawaii Ct. of
Appeals, 2001). In that case, the
Court made a determination that the FSTs were not
“scientific.” Id. In light of that conclusion, the
Court held that an officer may not testify that a
Defendant “passed” or “failed” the FSTs because a
layperson would not use these terms to describe their observations of FST
performance. Id. The Court explained that the State must lay a
proper foundation regarding the officer’s experience, training, and compliance
with FST testing procedures before an officer can use the terms “passed” or
“failed.” Id.
In so
holding, the Hawaii Court of Appeals explained that its ruling is designed to
avoid the danger of affording undue scientific validity to lay opinions. Id.
The Court adopted the rationale employed by the Oregon Supreme Court, which
stated:
Evidence perceived by lay jurors to be
scientific in nature possesses an unusually high degree of persuasive
power. The function of the court is to
ensure that the persuasive appeal is legitimate. The value of proffered expert scientific
testimony critically depends on the scientific validity of the general
propositions utilized by the expert.
Propositions that a court finds possess significantly increased
potential to influence the trier of fact as
scientific assertions, therefore, should be supported by appropriate scientific
validation. This approach ensures that
expert testimony does not enjoy persuasive appeal of science without subjecting
its propositions to the verification processes of science. Id. at 19.
The high probability that the terms “test” or “examination”
would cast FSTs in a scientific light, after a
determination that they are in fact not scientific, creates an extreme risk of
prejudice in that lay jurors would likely assign a higher degree of value to
FST testimony than it should be afforded.
As such, DUI defense attorneys should seek to exclude any reference to
the terms “test” and “examination,” as well as any statements regarding whether
the Defendant “passed” or “failed” the FST exercises, due to their high
potential for prejudice by creating a false sense of scientific validation.
Conclusion
The long-standing acceptance of field sobriety test evidence
in Illinois courts must be challenged by skilled DUI defense attorneys in light
of the newly-emerging criticisms in the scientific community. Admittedly, the potential for a court to determine
that the FSTs, aside from the HGN, are subject to Frye
is minimal. However, defense attorneys
should bring these motions to highlight the inherent contradictions in the
judicial rationale for admitting the FST evidence. If the tests are scientific, they should meet
the Frye standard for admissibility.
If they are not, they are extraordinarily prejudicial to a defendant
because they carry an aura of scientific reliability where, in fact, none has
been shown to exist under the current studies.