PEST: Parameter Estimation by Sequential Testing

Introduction

The PEST algorithm is based on the procedure described by M. M. Taylor and C. Douglas Creelman, in "PEST: Efficient Estimates on Probability Functions" (Journal of the Acoustical Society of America, Volume 41, Number 4, 1967).

It is a set of rules for adjusting the difficulty of a task to quickly find the point at which performance reaches a predefined level. More formally, a property of some signal (the independent variable) is adjusted to find the magnitude that results in a performance of specified accuracy (the dependent variable). For instance, Taylor and Creelman tested participants' ability to detect a sound by varying its loudness until participants detected it on 75% of presentations.

For a test of mental-processing speed, one might adjust the rate at which stimuli are presented until the participant responds correctly (simple detection, or the correct selection from a number of options) to a given percentage of the stimuli.

Thus, PEST presents trials at a certain level of the IV, and if the participant's accuracy is too high, the IV's level is changed to make the task harder, and vice versa. Whenever the IV is changed, the accuracy is reset, and testing begins again. A series of consecutive trials at the same IV level is called a 'step' or 'block'.

PEST is especially efficient in obtaining a given performance level from the participant with the minimal number of trials. This efficiency comes from the rules that govern the level chosen at each change.

Terminology

Level
The task stimulus-property being adjusted (the Independent Variable or IV).
DV
The response from the participant (the Dependent Variable).
Step
A contiguous sequence of trials at a given level.
Step-change
The change in level between steps.
Trial
A single stimulus-presentation to which the participant should respond.

Input Parameters

Starting level
The level of the IV for the first step. Must be easy for the participant.
First change
The direction (positive or negative) and amount by which to change the IV level after the first step. Must make the task more difficult. A positive value means that the level is increased when performance is too high (and decreased when it is too low); a negative value means the opposite.
Maximal change
The largest absolute difference between consecutive steps.
Minimal value
The smallest absolute value of step. (The algorithm, if uncorrected, can choose negative values, or values that are otherwise unsuitable for the task.)
Target accuracy
The performance accuracy level to be achieved by the participant. Expressed as a floating point number between zero (0%, or all wrong), and one (100%, or all correct).
Deviation limit
The maximal difference (in number of trials) between the expected number of correct trials (derived from the target accuracy) and the actual number of correct trials, beyond which the level should be changed. Note: this is a floating point number.
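As a rough illustration, the six inputs could be gathered into one validated record. This is only a sketch: the class name `PestParams`, the field names, and the specific checks (for example, that the first change must not exceed the maximal change) are assumptions made for this example, not part of Taylor and Creelman's description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PestParams:
    """Hypothetical container for the six PEST inputs listed above."""
    starting_level: float   # level for the first step; should be easy
    first_change: float     # signed; its sign is the "harder" direction
    maximal_change: float   # largest absolute step-change allowed
    minimal_value: float    # lower bound described under "Minimal value"
    target_accuracy: float  # 0.0 (all wrong) to 1.0 (all correct)
    deviation_limit: float  # in trials, e.g. 1.1

    def __post_init__(self):
        # These checks are assumptions about sensible inputs.
        if not 0.0 <= self.target_accuracy <= 1.0:
            raise ValueError("target_accuracy must lie between 0 and 1")
        if self.first_change == 0:
            raise ValueError("first_change needs a sign, so it cannot be 0")
        if self.maximal_change <= 0:
            raise ValueError("maximal_change must be positive")
        if abs(self.first_change) > self.maximal_change:
            raise ValueError("first_change exceeds maximal_change")
        if self.deviation_limit < 0:
            raise ValueError("deviation_limit cannot be negative")


# Example: a loudness task in which a lower level (quieter) is harder.
params = PestParams(60.0, -8.0, 16.0, 0.0, 0.75, 1.1)
```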

The Rules

These rules are designed to generate a sequence of levels that quickly 'home in' on the desired level, and that track any changes that might occur over time (e.g. if the participant loses concentration).

  1. On the first step-change in a given direction (i.e. after reversing step direction), halve the step size.
  2. On the second step-change in a given direction, use the same step-change size.
  3. On the third step-change in a given direction, if the step-change immediately before the most recent reversal was a doubling, keep the step-change size unchanged. Otherwise, double the previous step-change.
  4. On the fourth and later step-changes in a given direction, double the previous step-change.

The first rule avoids retesting the participant at a level at which they were recently tested (which was found to give the wrong performance level). The second rule ensures that the second level after a reversal is the same as the last-but-one level before the reversal. Similarly, the third rule ensures that the third level after a reversal is the same as the last-but-two level before the reversal. See Figures 1 and 2.

Figure 1: Prior Doubling

Note that, although the figures depict a task that gets harder then easier, the direction is irrelevant; the same rules apply in the opposite direction (imagine a mirror image of the curve going down then up, or simply swap the "Easy" and "Hard" labels).

Figure 2: No Prior Doubling

The fourth rule is based on the assumption that if the direction of change hasn't reversed after several steps (i.e. the participant's performance is still too high, or still too low), the participant is performing well above, or well below, their actual performance threshold. Thus, it accelerates the rate of change between steps in order to reach the participant's threshold sooner.
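The four rules can be sketched as a small state machine. The class `StepSizer`, its field names, and the encoding of direction as +1 (harder) or -1 (easier) are all hypothetical, chosen for this example. It also assumes that the First change input counts as the first step-change in the "harder" direction, and that a doubled size is capped at the Maximal change.

```python
from dataclasses import dataclass


@dataclass
class StepSizer:
    size: float                  # magnitude of the most recent step-change
    max_size: float              # the "Maximal change" input
    direction: int = +1          # +1 = harder, -1 = easier
    run_length: int = 1          # step-changes made so far in this direction
    last_doubled: bool = False   # was the most recent step-change a doubling?
    doubled_before_reversal: bool = False

    def next_size(self, direction: int) -> float:
        """Return the magnitude of the next step-change in `direction`."""
        if direction != self.direction:
            # Rule 1: a reversal halves the step-change.
            self.doubled_before_reversal = self.last_doubled
            self.direction = direction
            self.run_length = 1
            self.size /= 2
            self.last_doubled = False
        else:
            self.run_length += 1
            # Rule 2: second change keeps the size (no doubling).
            # Rule 3: third change doubles unless the change just before
            #         the most recent reversal was itself a doubling.
            # Rule 4: fourth and later changes always double.
            double = self.run_length >= 4 or (
                self.run_length == 3 and not self.doubled_before_reversal
            )
            if double:
                self.size = min(self.size * 2, self.max_size)
            self.last_doubled = double
        return self.size


# Prior doubling: three more changes harder, then four changes easier.
sizer = StepSizer(size=8, max_size=64)
sizes = [sizer.next_size(d) for d in (+1, +1, +1, -1, -1, -1, -1)]
```

Starting from a level of 0 with a first change of 8, the sizes above give the levels 8, 16, 32, 64 on the way up and 48, 32, 16 after the reversal, so the second and third levels after the reversal coincide with the last-but-one and last-but-two levels before it.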

Note that the process requires that PEST 'knows' the difference between making the task easier and making it harder. In other words, if the participant's performance is too high, the task is too easy, so PEST must know how to make it more difficult, and vice versa. PEST obtains this information from the sign (positive or negative) of the first step-change; if it is positive, PEST will increase the level whenever the participant's performance is too high (task too easy), and decrease it whenever performance is too low. If it is negative, the two directions are swapped.

Thus, in order for the difficulty to be adjusted properly, PEST requires that two input values are chosen carefully: (1) the initial level must be guaranteed to be easy for participants, and (2) the sign of the initial step-change must result in a second step that is harder than the first. You should also verify that the initial step-change is not too large, since a large change in difficulty level, especially at the start of the run, often unsettles the participant.
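The role of the sign can be shown with a short sketch. The function `next_level` and its parameter names are hypothetical; the only idea taken from the text is that the sign of the first change identifies the direction that makes the task harder.

```python
import math


def next_level(level: float, size: float, first_change: float,
               too_easy: bool) -> float:
    """Move `level` by `size` in the harder or easier direction."""
    # The sign of the first change marks the "harder" direction.
    harder = math.copysign(1.0, first_change)
    direction = harder if too_easy else -harder
    return level + direction * size


# Loudness example: the first change is negative, so quieter is harder.
quieter = next_level(100.0, 10.0, first_change=-20.0, too_easy=True)
louder = next_level(100.0, 10.0, first_change=-20.0, too_easy=False)
```

Here `quieter` is 90.0 and `louder` is 110.0. With a positive first change the same calls would move the level the other way.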

Performance Level

PEST monitors the participant's performance by dividing the number of trials to which the participant responded correctly by the total number of trials in order to obtain a score between zero (all wrong) and one (all correct). It then compares this score with the target accuracy in order to decide whether to change the task level. If the accuracy is too high, it makes the task more difficult, and vice versa.

When comparing the target and actual accuracy levels, PEST allows for a deviation limit, which prevents the level being changed too often (i.e. steps having too few trials). Without such a limit, for any target accuracy other than zero and one, the actual accuracy would always be too high or too low after the first trial.

Suppose the target accuracy were 0.5 (50%). If the participant responded correctly to the first trial in any step, their accuracy would be 1.0. PEST would consider this too high, and thus make the task more difficult after only one trial. Similarly, if the participant got the first trial wrong, their accuracy would be 0.0, so PEST would make the task easier after only one trial.

It can be seen that, for any target accuracy between zero and one, and with no deviation limit, PEST would change the level after only a single trial in each step. This rapid change between steps results in an unstable presentation of difficulty levels. To stabilise this behaviour, the target accuracy is treated as 'a range of acceptable values' rather than a single point.

This range is set by the deviation limit, which is specified in units of trials. It is applied by adding it to, and subtracting it from, the expected number of correct trials within the current step. For instance, suppose the target accuracy is 0.75, and the deviation limit is 1.1. In this case, after four trials in a given step, the expected number of correct trials is 4 x 0.75 = 3 trials. Applying the deviation limit gives an acceptable range from 3 - 1.1 = 1.9 to 3 + 1.1 = 4.1. So, PEST would make the task easier if the participant had at most one trial correct after four trials in the current step. On the other hand, it would make the task more difficult only if the participant had five trials correct; this is impossible after only four total trials, so PEST maintains the same difficulty level as long as the participant got at least two trials correct.
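The comparison described above can be written as a single check. This is a minimal sketch; the function name `decide` and its return values are hypothetical, and the defaults are simply the example values of 0.75 and 1.1.

```python
def decide(correct: int, trials: int,
           target: float = 0.75, limit: float = 1.1) -> str:
    """Compare the correct count in the current step with the allowed range."""
    expected = trials * target       # expected number of correct trials
    if correct < expected - limit:   # below the range: task is too hard
        return "easier"
    if correct > expected + limit:   # above the range: task is too easy
        return "harder"
    return "stay"


# After four trials the acceptable range is 1.9 to 4.1 correct trials.
outcomes = [decide(c, 4) for c in range(5)]
```

For zero to four correct trials out of four, `outcomes` is "easier", "easier", "stay", "stay", "stay".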

With the given parameters of 0.75 and 1.1, at least five trials would have to be performed in a given step before PEST could increase the difficulty level. This is because the number of correct trials exceeds the target number of trials plus the deviation limit only after five trials. In mathematical terms, five is greater than (5 x 0.75) + 1.1 = 4.85.
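The same reasoning gives a closed form for the shortest possible step. With target accuracy a and deviation limit d, an all-correct step of n trials triggers a change once n > n x a + d, i.e. n > d / (1 - a), and an all-wrong step triggers one once 0 < n x a - d, i.e. n > d / a. The sketch below assumes 0 < a < 1; the function name is hypothetical, and ratios that fall exactly on an integer are subject to floating-point rounding.

```python
import math


def first_possible_change(target: float, limit: float) -> tuple:
    """Return (n_easier, n_harder): the earliest trial counts at which an
    all-wrong or an all-correct step can trigger a change of level."""
    # All wrong: 0 < n * target - limit, so n > limit / target.
    n_easier = math.floor(limit / target) + 1
    # All correct: n > n * target + limit, so n > limit / (1 - target).
    n_harder = math.floor(limit / (1.0 - target)) + 1
    return n_easier, n_harder


earliest = first_possible_change(0.75, 1.1)
```

For 0.75 and 1.1 this gives (2, 5): two wrong trials are enough to make the task easier, and five correct trials are needed to make it harder.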

Table 1 lists the limits of the number of correct trials that cause a change in task difficulty. The values assume an accuracy of 0.75 and a deviation limit of 1.1, which were the values proposed by Taylor and Creelman.

Step trials  Target trials  Minimum trials  Maximum trials  Task easier if...    Task harder if...
1            0.75           -0.35           1.85            Not possible         Not possible
2            1.5            0.4             2.6             All wrong            Not possible
3            2.25           1.15            3.35            Two or more wrong    Not possible
4            3              1.9             4.1             Three or more wrong  Not possible
5            3.75           2.65            4.85            Three or more wrong  All correct
6            4.5            3.4             5.6             Three or more wrong  All correct
7            5.25           4.15            6.35            Three or more wrong  All correct
8            6              4.9             7.1             Four or more wrong   All correct
9            6.75           5.65            7.85            Four or more wrong   More than seven correct
10           7.5            6.4             8.6             Four or more wrong   More than eight correct
Table 1: Conditions Invoking a Change in Difficulty
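The limits in Table 1 can be recomputed with a few lines of code, which is useful when trying other parameter values. This is a sketch: `table_rows` is a hypothetical helper, and the last two entries of each row are derived here as correct-trial counts rather than taken from the paper.

```python
import math


def table_rows(target=0.75, limit=1.1, max_trials=10):
    """Yield (trials, expected, minimum, maximum, easier_at, harder_at)."""
    rows = []
    for n in range(1, max_trials + 1):
        expected = n * target
        low, high = expected - limit, expected + limit
        # Largest correct count strictly below the range (None if impossible).
        easier_at = math.ceil(low) - 1
        easier_at = easier_at if easier_at >= 0 else None
        # Smallest correct count strictly above the range (None if impossible).
        harder_at = math.floor(high) + 1
        harder_at = harder_at if harder_at <= n else None
        rows.append((n, expected, low, high, easier_at, harder_at))
    return rows


for n, expected, low, high, easier_at, harder_at in table_rows():
    print(f"{n:2d} {expected:5.2f} {low:6.2f} {high:5.2f} "
          f"{easier_at} {harder_at}")
```

In each row, `easier_at` is the highest number of correct trials that still makes the task easier, and `harder_at` is the lowest number that makes it harder; `None` marks a change that is not possible at that point.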

The effects of different values of accuracy and deviation limit are depicted in Figures 3 and 4. The green band represents the range in the score required to effect a change in task difficulty. The blue dots represent a participant performing with perfect accuracy, and the red dots represent complete failure. PEST will change the task difficulty after the first trial (dot) that falls outside the green band. The accuracy level affects the steepness of the green band, whilst the deviation limit increases the width of the band. The deviation limit effectively reduces the rate at which PEST changes step. In other words, it tends to increase the number of trials in each step.

Figure 3: Deviation Limits for 75% Accuracy

Thus, in Figure 3, it can be seen that for deviation limits of 1.1 and 1.3, the task difficulty will change (become easier) if the first two trials in the step both receive wrong responses. On the other hand, if the participant responds perfectly, the task difficulty changes (becomes harder) after the first five trials for a deviation limit of 1.1, and after the first six trials for 1.3. Note that these numbers apply only to the first trials in each step; five (or six for a DL of 1.3) consecutive correct responses will have a different effect if they were preceded by some wrong responses in the same step.
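The effect of earlier wrong responses in the same step can be checked by replaying a sequence of responses. The function `first_change` below is a hypothetical helper that walks through one step and reports the trial at which a response count first leaves the acceptable range.

```python
def first_change(responses, target=0.75, limit=1.1):
    """Return (trial_number, "easier" or "harder") for the first trial that
    falls outside the acceptable range, or None if none does."""
    correct = 0
    for n, was_correct in enumerate(responses, start=1):
        correct += bool(was_correct)
        expected = n * target
        if correct < expected - limit:
            return n, "easier"
        if correct > expected + limit:
            return n, "harder"
    return None


perfect = first_change([True] * 10)             # all correct
failing = first_change([False] * 10)            # all wrong
recovered = first_change([False] + [True] * 10) # one wrong, then all correct
```

With the default values, a perfect participant triggers a change at trial 5 and a failing one at trial 2, as in Figure 3. A single wrong response at the start of the step delays the change to trial 9, even though it is followed by eight consecutive correct responses.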

Figure 4: Deviation Limits for 85% Accuracy


Copyright © Neil Carter

Last updated: 2017-03-12