What do I need to know about millisecond timing accuracy?

If you are a psychologist, neuroscientist or vision researcher who
uses a computer to run experiments and reports timing measures in
units of a millisecond, then it's likely your timings are wrong!
This can lead to replication failure, spurious results and
questionable conclusions. Timing error can affect your work even
when you use an experiment generator such as E-Prime, SuperLab,
Inquisit, Presentation, Paradigm, OpenSesame or PsychoPy.
Our products' sole aim is to help you improve the quality of your
research prior to publication. The Black Box ToolKit v3, for
example, helps you check your own millisecond timing accuracy in
terms of stimulus presentation accuracy, stimulus synchronization
accuracy and response time accuracy, and then tune your experiment
to deliver better stimulus and response timing. The mBBTK (event
marking version) helps you independently TTL event mark or produce
TTL triggers to send to other equipment. Our range of response
pads and other devices helps you ensure that your response timing
is millisecond accurate and consistent.
A summary of the types of millisecond timing error likely to
affect your computer-based experiment is shown below:
Idealized experiment shown top; what may happen in reality on
your own equipment shown bottom.
Put simply, if you are using a computer to run experiments and
report timing measures in units of a millisecond then it's likely
that your presentation and response timings are wrong! Modern
computers and operating systems, whilst running much faster, are
not designed to offer the user millisecond accuracy. As a result
you may not have conducted the experiment you thought you had!
Hardware is designed to be as cheap as possible to mass produce
and to appeal to the widest market, whilst multitasking operating
systems are designed to offer a smooth user experience and look
attractive. No doubt you'll have noticed that your new computer
and operating system don't seem to run the latest version of
your word processor any faster than your old system!
Don't commercial experiment generator packages solve all my
problems?
Unfortunately, using a commercial experiment generator such as
E-Prime, SuperLab, Inquisit and the like will not guarantee you
accurate timing, as they are designed to run on commodity hardware
and operating systems. They all quote millisecond precision, but
"millisecond precision" refers only to the timing units the
software reports in. It should not be confused with "millisecond
accuracy", that is, whether events actually occur in the real
world with millisecond accuracy.
If you write your own software you will remain just as uncertain
as to its timing accuracy. You should also be wary of in-built
time audit measures. Because they are derived by the software
itself, they can lead to a false sense of security. For example,
if you swap a monitor it is impossible for the software to know
anything about a TFT panel's timing characteristics, or for that
matter about a response device, soundcard or other device you are
working with.
It is also impossible to find out which experiment generator
offers the most accurate presentation and response timing using
generic benchmarks. Often such benchmarks have been conducted
using devices such as our BBTK v3, or homemade response hardware,
and the experiment generator scripts tuned to give consistent
results. The fatal flaw in such an approach is that the authors
have tuned the experiment generator to give good results on their
own hardware within a very simple script. If you think about it
for a moment, what this actually shows is that you should be
checking and tuning your own experiment on your own hardware with
a BBTK v3 to give better results. Results from generic benchmarks
cannot be assumed to apply to your own hardware and experiment,
which will be markedly different.
What about switching to Mac/PC/Linux?
It doesn't matter which hardware you work with, PC or Mac, or
which operating system you use, Microsoft Windows, Apple's OS X or
a variety of Linux: you will succumb to timing error. What's more,
it's getting harder to source the equipment you might have used
previously. For example, CRT monitors are now virtually impossible
to source at a reasonable cost. Input lag can have a huge effect
on TFT panels, whereas traditional CRTs don't suffer from this
effect and can be well over 20x faster when displaying images.
What's more, each TFT make and model has different timing
characteristics for input lag and panel response time. This means
you should check each and every TFT you use. If you can see or
hear a timing error, you know you have a problem!
Human variability and adding more trials
There has been a long-standing argument that human responses are
far more variable than the hardware and software itself, so
equipment timing error does not matter. In most cases this is only
true if the error is truly random, falls within certain limits and
you are not interacting with other external hardware. Where these
conditions are not met, timing error can introduce spurious
artifacts and conditional biases that make replications difficult.
In the same way, carrying out an unspecified number of additional
trials will not lessen the effect of any systematic presentation,
synchronization or measurement error.
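The point about additional trials can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 300 ms true response time, the 50 ms human variability and the 20 ms systematic delay are hypothetical figures chosen for illustration, and simulate_mean_rt is a made-up helper, not part of any experiment generator.

```python
import random
import statistics


def simulate_mean_rt(n_trials, true_rt_ms=300.0, human_sd_ms=50.0,
                     systematic_delay_ms=0.0, seed=0):
    """Return the mean measured response time over simulated trials.

    All default values are hypothetical. Human variability is modelled
    as random Gaussian noise; the equipment error is a constant delay
    added to every trial.
    """
    rng = random.Random(seed)
    measured = [
        rng.gauss(true_rt_ms, human_sd_ms) + systematic_delay_ms
        for _ in range(n_trials)
    ]
    return statistics.mean(measured)


# Same simulated participant (same seed) with and without the delay.
for n in (20, 200, 20000):
    without_delay = simulate_mean_rt(n)
    with_delay = simulate_mean_rt(n, systematic_delay_ms=20.0)
    print(f"{n:>6} trials: mean {without_delay:7.2f} ms without delay, "
          f"{with_delay:7.2f} ms with delay, "
          f"difference {with_delay - without_delay:5.2f} ms")
```

As the number of trials grows, the random human variability averages out and the mean settles near the true value, but the gap between the two means stays at the full size of the systematic delay.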
Aren't humans pretty slow?
The latest research suggests that humans may actually be able to
process information much faster than previously thought. For
example, Thurgood et al (2011) propose that humans can identify
animals with only 1 millisecond of visual exposure. To her credit,
in order to test this her team had to develop their own
light-emitting diode (LED) tachistoscope. Put simply, off-the-shelf
equipment was not fast enough. If differences as small as a
millisecond can have an experimental effect, this implies that
timing errors in a typical study could also have more of an effect
than you might think. In the auditory arena a lag of just 10
milliseconds can be reliably detected.
Human error when designing experiments
Human error when creating the experimental scripts themselves is
also an unrecognized problem. For example, software commonly used
for experimental work has a variety of settings which can affect
presentation of both audio and visual stimuli. Often researchers
are unsure what impact various settings might have. It is also not
unknown for researchers to set incorrect values or introduce bugs
into their own code that affect timings. Such errors can be
clearly identified and corrected if studies are checked at an
early stage.
Do computers lie?
Computers don't, and more to the point can't, always do what you
tell them, and you shouldn't blindly rely on the results they give
you. For example, you can tell a piece of software used to run
experiments to present a priming image for 11 milliseconds whilst
playing a tone in the left headphone for 100 milliseconds. You've
dialled in the numbers and the computer has accepted them, but the
hardware can't possibly do what you've asked due to TFT panel
input lag and soundcard start-up latency. The question is: does
this make your experiment less valid because you are not running
the experiment you thought you were? More shockingly, different
hardware and software have wildly different timing characteristics.
So if you reran your study with identical stimulus materials and
settings but on different hardware, would you be running a
different study? Would your results be comparable?
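One further reason an 11 millisecond request cannot be honoured, separate from the input lag and soundcard latency mentioned above, is that a fixed-refresh display can only change its image on whole refresh frames. The sketch below assumes the software picks the nearest whole number of frames, with a minimum of one; the refresh rates are illustrative assumptions and achievable_duration_ms is a hypothetical helper, not a function from any experiment generator.

```python
def achievable_duration_ms(requested_ms, refresh_hz=60.0):
    """Return the nearest whole-frame duration for a requested duration.

    Assumes a fixed-refresh display and that at least one frame is
    always shown. Input lag and panel response time are ignored.
    """
    frame_ms = 1000.0 / refresh_hz
    n_frames = max(1, round(requested_ms / frame_ms))
    return n_frames * frame_ms


# The 11 ms priming image from the text on three assumed refresh rates.
for hz in (60.0, 100.0, 144.0):
    shown = achievable_duration_ms(11, refresh_hz=hz)
    print(f"requested 11 ms at {hz:.0f} Hz -> about {shown:.2f} ms on screen")
```

Under these assumptions the same setting produces a different on-screen duration on each display, which is one way identical settings can end up as a different study on different hardware.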
Face and faith validity
In computer-based studies, researchers are often prepared to
blindly believe what the computer tells them. If the computer
reports that a reaction time is 300.14159265 milliseconds then, on
the face of it, with so many digits after the decimal point surely
this must be an accurate measure? Well, actually no. All it tells
us is that the computer is quite precise, not that it has given
you an accurate measure. A wall clock can be 10 minutes slow yet
still be precise to the second. If we knew this, would we still
say the time we read from its face is accurate? If we didn't know
the clock was 10 minutes slow then it would also achieve faith
validity. In much the same way, we place our faith in computers
being accurate when often they are not.
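The difference between a precise reading and an accurate one can be made concrete by comparing reported timings against an independent reference. The following is a minimal sketch: the reference and reported values are invented for illustration, and timing_bias_and_spread is a hypothetical helper rather than anything supplied with a particular product.

```python
import statistics


def timing_bias_and_spread(reported_ms, reference_ms):
    """Return (mean error, standard deviation of error) in milliseconds.

    The mean error reflects accuracy (how far off the readings are on
    average); the spread reflects how consistent the readings are.
    """
    errors = [r - t for r, t in zip(reported_ms, reference_ms)]
    return statistics.mean(errors), statistics.stdev(errors)


# Hypothetical data: reference values from an external measurement,
# reported values from software that is consistently about 10 ms late.
reference = [250.0, 300.0, 350.0, 400.0]
reported = [260.14159265, 310.14159265, 360.14159265, 410.14159265]

bias, spread = timing_bias_and_spread(reported, reference)
print(f"mean error {bias:.3f} ms, spread {spread:.3f} ms")
```

In this made-up data set the readings are reported to eight decimal places and are highly consistent, yet every one of them is roughly 10 ms off. Only the comparison with an independent reference reveals that.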
In a nutshell
In a nutshell, bad timing will negatively affect the reliability
and validity of your experimental work and the results you find.
You may also be unable to replicate your own findings over the
longer term. The cornerstone of good science is experimental
control and replication.
I need to talk specifics
If you would like to discuss the functionality of any of our
products or would like to inquire about our consultancy services,
feel free to contact us. Please note we are unable to give
specific advice on timing unless you are a user of one of our
products.