Analysis of Signal Detection and Recognition Experiments, Papers of Linguistics

Surveys models for complete identification experiments, partial identification designs, and choice experiments in signal detection and recognition. It covers stimulus similarity, response bias, and threshold curves, makes use of Eq. 5 and Assumptions 1 to 3, and gives examples of data analysis using confusion matrices and observed versus predicted values.

Typology: Papers

Pre 2010

Uploaded on 09/17/2009

Detection and Recognition¹,²

R. Duncan Luce
University of Pennsylvania

1. This work was supported in part by grants NSF G-8864 and NSF G-17637 from the National Science Foundation to the University of Pennsylvania.
2. I wish to thank Robert R. Bush, Eugene Galanter, D. M. Green, Francis W. Irwin, M. V. Mathews, B. S. Rosner, Elizabeth F. Shipley, S. S. Stevens, and W. P. Tanner, Jr., for their careful reading and criticism of various drafts of this material. Numerous conversations with Dr. Galanter during the last four years have led to the view of psychophysics that is reflected in this as well as in Chapters 4 and 5.

Contents

1. Representations of the Response Probabilities
   1.1 Signal detectability theory
   1.2 Choice theory
   1.3 Threshold (or neural quantum) theory
2. Simple Detection
   2.1 Signal detectability analysis
   2.2 Choice analysis
   2.3 Threshold analysis
   2.4 Comparison of models with data
3. Isosensitivity Curves
4. Complex Detection
   4.1 Signal detectability analysis
   4.2 Choice analysis
   4.3 Threshold analysis
   4.4 Comparison of models with data
5. Theories of the Bias Parameters
   5.1 Expected value
   5.2 Asymptotic learning
6. Theories of the Stimulus Parameters
   6.1 Ideal observers
   6.2 The neural quantum model
   6.3 The neural quantum experiment
7. Pure Recognition
   7.1 Introduction
   7.2 Information theory analysis
   7.3 A choice analysis of the results of information theory
8. Simultaneous Detection and Recognition
   8.1 Signal detectability analysis
   8.2 Choice analysis
9. Detection of an Unknown Stimulus
10. Conclusions
References

At one time the data from recognition experiments were either presented in uncondensed form as confusion matrices or summarized by one or another of the standard contingency-table statistics. Until quite recently, nothing like a coherent theory had evolved, and the empirical generalizations were few. Three may be mentioned. Let the presentations be labeled $s_1, s_2, \ldots, s_k$ and the responses $1, 2, \ldots, k$ in such a way that $\iota(r) = s_r$. First, the largest entry in row $s_r$ is generally the main diagonal one, $p(r \mid s_r)$. Second, although the matrix is not strictly symmetric in the sense that $p(r \mid s_{r'}) = p(r' \mid s_r)$, there is a definite tendency in that direction. Third, when the presentations are physically ordered (for example, by intensity, size, or frequency) so that $s_1 < s_2 < \cdots < s_k$, the value of $p(r \mid s_r)$ dips down rapidly from $p(1 \mid s_1)$, reaches a plateau in the midrange, and then rises again rapidly to $p(k \mid s_k)$. In other words, plots of $p(r \mid s_r)$ versus $r$ are usually U-shaped. Their exact nature depends upon the number and spacing of the presentations as well as upon the experimental conditions. Examples of size, color, and position confusion matrices can be found in Shepard (1958b).

In the early 1950's a number of psychologists began to analyze recognition confusion matrices in terms of Shannon's information measure, and several comparatively simple generalizations resulted. These we discuss in Sec. 7 on recognition experiments. What is still lacking is an adequate, detailed response theory to explain these somewhat gross results.

The study of detection has proceeded largely independently of the work on recognition, with, however, some fusion developing in the last several years. Detection research began early in the history of experimental psychology with determinations of absolute and difference thresholds.
Theoretical contributions were scattered until the early 1950's, when a program of theoretical and experimental research emerged at the University of Michigan. Somewhat later several related programs developed elsewhere in the United States. Our study begins with the several analyses of detection experiments which are currently of interest, and in Secs. 7 and 8 some of the same ideas are applied to recognition experiments.

1. REPRESENTATIONS OF THE RESPONSE PROBABILITIES

Attempts to account for the behavioral relations among various types of identification experiments, both complete and partial, have so far resulted in three distinct response theories. In this section each is described in moderately general terms with little reference to specific experiments; in the remainder of the chapter and in much of the next two they are applied to specific designs.

The reader may well wonder why three different theories for the same behavior should be presented, when, after all, at most one can be correct. One reason is that there is no assurance that the same response theory is appropriate for all modalities or for different tasks within one modality. More important, it has been impossible so far to choose among them on empirical grounds even for a single type of experiment within one modality. Their predictions tend to be similar, and where there are differences the experimental results have been either inconclusive or contradictory. This situation is hopefully transitory; in fact, considerable clarification can be expected in the next few years.

1.1 Signal Detectability Theory

The notions underlying signal detectability theory originally took root in psychology during the period bounded by Fechner and Thurstone. Later they reappeared in a slightly different guise explaining not just discrimination but also detection and recognition. W. P.
Tanner, Jr., and his colleagues at the University of Michigan reinterpreted and modified analyses of optimal physical detection of electrical signals in noise (Peterson, Birdsall, & Fox, 1954; van Meter & Middleton, 1954) into a psychophysical theory. Some of the same ideas were also developed by Smith and Wilson (1953). In addition to the theory, a series of interrelated experiments has been performed (Birdsall, 1955, 1959; Clarke, Birdsall, & Tanner, 1959; Creelman, 1959, 1960; Egan, Schulman, & Greenberg, 1959; Green, 1958, 1960; Green, Birdsall, & Tanner, 1957; Speeth & Mathews, 1961; Swets, 1959, 1961b; Swets & Birdsall, 1956; Swets, Shipley, McKey, & Green, 1959; Swets, Tanner, & Birdsall, 1955, 1961; Tanner, 1955, 1956, 1960, 1961; Tanner & Birdsall, 1958; Tanner, Birdsall, & Clarke, 1960; Tanner & Norman, 1954; Tanner & Swets, 1954a,b; Tanner, Swets, & Green, 1956; Veniar, 1958a,b,c). These researches go under the name of signal detection, or, as Tanner prefers, signal detectability theory. Survey papers by Green (1960), Licklider (1959), and Swets (1961a) give summaries of the central ideas and experimental findings.

The main notion is that the pertinent information available to the subject as a result of the stimulation can be summarized by a number; however, repeated presentations of the same stimulus produce not the same number but a distribution of them. The subject is assumed to behave as if he knew these distributions. He evaluates the particular number arising on a trial in terms of the distributions from which it could have arisen, much as a statistician evaluates an observation to decide between a null and an alternative hypothesis (Wald, 1950). Indeed, the two models are formally the same.
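For the case of two possible presentations, the statistician's evaluation can be pictured as a comparison of how likely the observed number is under each distribution. The following is only a sketch under stated assumptions: the normal form of the two distributions, their means and standard deviations, and the cut-point value of 1 are all hypothetical choices made for illustration, and the function names are not from the text.

```python
from statistics import NormalDist


def likelihood_ratio(x, dist_s, dist_s_prime):
    """Density of the observation x under s divided by its density under s'."""
    return dist_s.pdf(x) / dist_s_prime.pdf(x)


def decide(x, dist_s, dist_s_prime, cut_point=1.0):
    """Report "s" when the likelihood ratio exceeds the cut-point, otherwise "s'"."""
    if likelihood_ratio(x, dist_s, dist_s_prime) > cut_point:
        return "s"
    return "s'"


# Hypothetical distributions of the internal number (assumed, not from the text).
effect_s = NormalDist(mu=1.0, sigma=1.0)
effect_s_prime = NormalDist(mu=0.0, sigma=1.0)

for observation in (-1.0, 0.5, 2.0):
    ratio = likelihood_ratio(observation, effect_s, effect_s_prime)
    print(observation, round(ratio, 3), decide(observation, effect_s, effect_s_prime))
```

Raising or lowering `cut_point` changes how readily "s" is reported without changing the two distributions, which is the sense in which a criterion is separable from sensitivity.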
The theory does not say where these distributions come from, although their usually assumed normality easily suggests a pseudoneurology in which many small independent neuronal errors accumulate to form the resultant error; nor does it tell how the subject comes to know the distributions, but a learning process during pretraining seems a likely candidate; nor does it suggest how the subject carries out the various needed transformations and calculations. Unexplained, "internal" numerical representations such as these are characteristic of almost all psychophysical theories and they simply indicate, I suspect, the relatively primitive state of the theory. Nonetheless, we are no more obliged to account for them in, say, physiological terms than were the authors of the first macroscopic physical theories required to explain planetary motions in terms of elementary particle properties.

Tanner and his colleagues arrived at the representation in this way. The effect of a presentation $s$ is supposed to be a random vector $\mathbf{s}$ which assumes values in a $k$-dimensional Euclidean space $E_k$; that is, the effect of stimulation is assumed to be adequately described by a $k$-tuple of numbers. Not only does this seem moderately plausible, but for frequency-bounded temporal signals the important sampling, or $2WT$, theorem (Shannon & Weaver, 1949) shows that in the limit as $T \to \infty$ such a representation of their physical properties is indeed possible. If $\mathbf{x} \in E_k$, the probability density that stimulus $s$ produces the effect $\mathbf{x}$, $p^{(k)}(\mathbf{x} \mid s)$, is assumed to exist.

Suppose, for the moment, that one of two presentations, $s$ or $s'$, occurs on each trial and that on a particular trial an observation $\mathbf{x}$ occurs. The subject must use it and his assumed knowledge of the distributions $p^{(k)}(\cdot \mid s)$ and $p^{(k)}(\cdot \mid s')$ to decide which of the two presentations fathered it. In such matters there is an inherent uncertainty.
It is plausible that he might decide by considering the relative likelihood of the two presentations generating $\mathbf{x}$. Specifically, let us suppose that he calculates the likelihood ratio

$$l(\mathbf{x}) = \frac{p^{(k)}(\mathbf{x} \mid s)}{p^{(k)}(\mathbf{x} \mid s')}.$$

If this number is large, it is only sensible to say that $s$ was presented; if small, $s'$. This suggests that the subject should establish a cut-point (or criterion) $c$ and use the decision rule:

[...]

Fig. 3. Three normal distributions on a single decision axis, which cannot occur if the decision axis is a monotonic function of likelihood ratio. (Horizontal axis: decision axis.)

theory has been applied one of the presentations, $n$, is a noise background that contains no stimulus and the others, $s, s', \ldots$, consist of stimuli embedded in the noise. The several distributions of effects over the Euclidean $k$-space are assumed to be statistically independent of each other. If we suppose that the stimuli $s$ and $s'$ differ only on one dimension, say energy, it seems plausible to think of the three distributions as existing over a common decision axis, as in Fig. 3. Unfortunately, such a simple representation in terms of a likelihood ratio axis does not follow from the argument just given for two stimuli. The trouble is that when we compare $n$ and $s$ the likelihood ratio noise distribution, $p(z \mid n)$, depends not only upon $p^{(k)}(\cdot \mid n)$, which by assumption is independent of the other presentations, but also upon the set $L(z)$. Because $L(z)$ is defined in terms of the likelihood ratio, it depends upon $p^{(k)}(\cdot \mid s)$ as well as upon $p^{(k)}(\cdot \mid n)$. That is to say, in terms of likelihood ratio each stimulus has its own separate noise distribution. Or we may see it another way. In the two-stimulus case we have just shown that $p(\log l \mid s)/p(\log l \mid s') = l$, so where two distributions intersect, that is, where $p(\log l \mid s) = p(\log l \mid s')$, the likelihood ratio must be 1. But there are three such intersections in Fig.
3, all of which would have to correspond to the same likelihood ratio of 1, which is clearly impossible.

This means that complete identification experiments with more than two presentations cannot usually be reduced to a one-dimensional representation because each pair of presentations has its own likelihood ratio axis. It is customary in signal detectability theory to assume that the logarithms of these several axes can be embedded in a Euclidean space of appropriate dimension. Two serious problems result: when should the separate axes be assumed to be orthogonal, and to what class of partitions should the decision rule be restricted? No class of rules seems nearly so compelling for a space as does the cut-point rule for a line.

1.2 Choice Theory

Choice theory has been discussed, in one variant or another, by Bradley (1954a,b, 1955), Bradley and Terry (1952), Clarke (1957), Luce (1959), Restle (1961),³ Shepard (1958a,b), and Shipley (1960, 1961) for either complete or partial identification experiments or both. No very systematic statement of the intuitions underlying these models has yet been given. Although I shall attempt some clarification, I am still far from a completely satisfactory axiomatic statement of all that is involved. First, the basic representation will be stated, and then I shall consider briefly some of the justifications that have been given for it.

Two ratio scales (i.e., scales unique up to multiplication by positive constants)

$$\eta : S \times S \to \text{positive real numbers}$$
$$b : R \to \text{positive real numbers}$$

are assumed to exist such that when $\iota$ is the identification function the response probabilities are of the form

$$p(r \mid s) = \frac{\eta(s, \iota(r))\, b(r)}{\sum_{r' \in R} \eta(s, \iota(r'))\, b(r')}. \qquad (5)$$

The scale $\eta$ is interpreted as a measure of the similarity between the presented stimulus $s$ and the one, $\iota(r)$, for which $r$ is the correct response. The scale $b$, which is associated only with responses, is interpreted as a measure of response bias.
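The representation can be read as saying that, for a given presentation, each response probability is proportional to the product of a similarity value and a bias value, normalized over the available responses. A minimal sketch of that reading follows; the function name, the three-stimulus similarity matrix, and the bias values are hypothetical and chosen only for illustration.

```python
def choice_probabilities(similarity_row, bias):
    """Response probabilities for one presentation under the choice representation.

    similarity_row[j] is the similarity between the presented stimulus and the
    stimulus for which response j is correct; bias[j] is the bias of response j.
    """
    weights = [eta * b for eta, b in zip(similarity_row, bias)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical similarity scale for three stimuli: symmetric, with unit diagonal.
similarity = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]
bias = [1.0, 2.0, 1.0]  # hypothetical response biases

confusion_matrix = [choice_probabilities(row, bias) for row in similarity]
for row in confusion_matrix:
    print([round(p, 3) for p in row])
```

Because both scales are ratio scales, multiplying every bias value by the same positive constant leaves the predicted confusion matrix unchanged; only ratios of bias values matter.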
At present, Eq. 5 is useful only if we make certain additional assumptions. Those we make, which are in large part suggested by Shipley's (1961) work, all arise from preconceived notions about the intuitive meaning of the $\eta$ scale and from considerations of mathematical simplicity; they are neither obviously necessary nor clearly dictated by data, even though their consequences have received some empirical support. The first three assumptions can be interpreted as formalizing our interpretation of $\eta$ as a measure of the similarity between stimuli or, equally well, as postulating that the logarithm of $\eta$ behaves like a measure of "psychological distance."

3. Although I had an opportunity to read Restle's interesting book in manuscript form, it was not available to me when this chapter was being drafted and so no attempt has been made to incorporate his ideas directly.

Assumption 1. For all $s, s' \in S$, $\eta(s, s') = \eta(s', s)$.

Assumption 2. For all $s \in S$, $\eta(s, s) = 1$.

Assumption 3. For all $s, s', s'' \in S$, $\eta(s, s'') \ge \eta(s, s')\,\eta(s', s'')$.

The heart of Assumption 2 is that the number $\eta(s, s)$ is independent of $s$; setting it equal to 1 merely fixes the unit of the $\eta$ scale. It is easy to see that

$$d(s, s') = -\log \eta(s, s') \qquad (6)$$

satisfies the usual distance axioms, namely:

1. $d(s, s') = d(s', s)$.
2. $d(s, s') \ge 0$ and $d(s, s) = 0$.
3. $d(s, s'') \le d(s, s') + d(s', s'')$.

These three assumptions are used in all applications of the choice theory later. In addition, a fourth assumption, which is suggested by the interpretation of $d$ as a distance measure, will sometimes be made. It plays exactly the same role in the choice theory as the orthogonal embeddings of the logarithm-of-likelihood-ratio axes into Euclidean spaces which are used in signal detectability theory.

Assumption 4. If $S = S_1 \times S_2 \times \cdots \times S_k$ and if $\eta$ is defined over $S$ and over each of the $S_i$, then for $s = (s^1, s^2, \ldots, s^k)$ and $t = (t^1, t^2, \ldots, t^k) \in S$,

$$d(s, t)^2 = \sum_{i=1}^{k} d(s^i, t^i)^2,$$

where $d$ is defined by Eq. 6.

This states that if the stimuli can be viewed as having $k$ distinct components (the $S_i$), then the several distance measures are interrelated as one would expect them to be, provided that the $S_i$ were to correspond to the orthogonal dimensions of a Euclidean $k$-space and the distance, to the natural metric in that space. It is, of course, possible to write a weaker assumption in which the coordinates are not orthogonal, but this adds many more parameters to the model, namely the angles between the coordinates. Because we do not need this weaker version, it will not be described in detail.

The origins of this particular representation lie in two papers, a book, Shipley's thesis, and some unpublished work. Shepard (1957) suggested what amounts to Eq. 5 and Assumptions 1 to 3 to relate either stimuli to stimuli or responses to responses, and later Shepard (1958a) suggested the formula

[...]

dynamics of that system, and a theory of its role in generating responses constitutes a partial description of the response mechanism of the organism.

Absolute and difference threshold experiments differ in this way. When the subject is asked to detect whether or not a stimulus has been presented, we are usually concerned with the value of the absolute threshold. But when he is asked to detect the difference between two temporal or spatial regions of stimulation, we are concerned with the difference threshold. Because the first can be viewed as detecting a difference between two regions of stimulation, one of which is null, no attempt is made in the formal theory to distinguish between the two problems.

The value of the absolute threshold is generally defined to be that level of stimulation that is detected 50 (or some other arbitrarily chosen) per cent of the time when the observing conditions are relatively ideal.
The actual techniques used are various; some are rapid and probably yield biased or variable estimates, others are more painstaking. But, however the determinations may be made, two things are important to us. First, although the resulting numbers are called threshold values, there is nothing to prevent the procedures from yielding the numbers even if there are no thresholds. Both the detectability and choice theories, which postulate no thresholds, lead one to expect that "threshold" values can be determined. Second, the behavior of the subject is not really the object of study; rather, attention is paid to the limiting characteristics of his sensory system. When the behavior has been examined, it has usually been for "methodological" reasons: to improve the reliability or speed of the techniques. Examples of such research can be found in Blackwell's (1953) monograph on the determination of visual thresholds. Thus, although the threshold literature is large, it still remains for other psychologists to derive behavioral predictions from a threshold model, to use these to discover whether thresholds really exist, and to determine how such a psychophysical theory interacts with other factors affecting behavior.

In making threshold determinations, valued outcomes, or even information feedback, have rarely been used. Often the subject is instructed to minimize his "false-alarm" rate, and during pretraining he may be informed about his errors on catch trials. Some experimenters wait until the false-alarm rate is sufficiently low (as they say, until the subject has established a "good criterion") before proceeding to the main part of the experiment. Others simply estimate the false-alarm rate from the pretraining trials, and still others include catch trials during the experiment proper from which they estimate the rate. Sometimes these rates are simply reported; at other times they are used to "correct for guessing."
I shall discuss the model for this correction presently (Sec. 2.3).

The existing threshold model was developed in two stages. The first, which in essence is a discrete analogue of the Thurstone-Tanner statistical model for stimulus effects, was initially stated along with supporting evidence by Békésy (1930); later Stevens, Morgan, and Volkmann (1941) refined the statement and added appreciably to the evidence. The second stage, which is concerned with the biases introduced by the subject, began with the model for correcting for guessing, was reformulated by Tanner and Swets (1954a), and was then extended by Luce (1963). For a general discussion of many of the issues involved and for some alternatives to the model we will discuss, see Swets (1961b).

The Békésy-Stevens model assumes that the effects of stimulation are discrete, not continuous as in the other two models. There is supposed to be a finite (or countable) sequence of "neural" quanta, which we may identify by the integers 1, 2, 3, . . . . A neural quantum is not identified with any particular neural configuration, although presumably it has some physiological correlate. At a given moment, stimulation is assumed to "excite" the first $j$ of these quanta, but because of irregular fluctuations this number does not necessarily remain fixed over time, even when the stimulation is constant. The main feature of this quantal structure is that two stimuli, no matter how different they may be physically, cannot be different to the subject if they excite the same number of neural quanta. If this model is correct, the only changes that he can possibly notice are those producing a change in the number of excited quanta.

Let us suppose that just prior to the presentation of $s$ on trial $i$, $j$ quanta are excited by the residual environment plus the background, if any. When $s$ is presented, suppose $j'$ quanta are excited.
The change, then, is $j' - j$, and so we can think of the effect of presenting $s$ on trial $i$ as $\theta(s, i) = j' - j$; that is, the presentations generate a function $\theta$ taking values in $I$, where $I$ denotes the set consisting of zero and the positive and negative integers. Because the background is assumed to have a fluctuating effect, $\theta(0, i)$ is not necessarily 0, as one might first think.

The subject can detect a presentation only if $\theta(s, i)$ is not zero, but there are situations in which it is reasonable for a subject to require a change of more than one quantum before responding that a stimulus was presented. With that in mind, we have the following class of "internal" detection rules:

$$\text{A presentation } s \text{ on trial } i \text{ is detected as }
\begin{cases}
\text{the same as the background} & \text{if } -k \le \theta(s, i) \le k', \\
\text{different from the background} & \text{if } \theta(s, i) < -k \text{ or } \theta(s, i) > k',
\end{cases} \qquad (7)$$

where $k$ and $k'$ are nonnegative integers.

Although the evidence makes one suspicious, it is generally assumed that $\Pr[\theta(s, i) < -k \text{ or } > k']$ is independent of $i$ but not of $k$ and $k'$. We denote this probability by $q(s, k, k')$, or simply by $q(s)$ when $k$ and $k'$ are assumed fixed, and we speak of it as the true detection probability. This is not necessarily the same as the corresponding response probability which one estimates from experimental data; there may be response biases.

The second component of the model is the effect of the outcome structure. The obvious parallel to signal detectability theory is to suppose that biases are introduced by the selection of the cut-points $k$ and $k'$. This model should be explored, but it has not been. Rather, it has been assumed that $k$ and $k'$ are fixed in a given situation and that the subject biases his responses in the light of the payoff structure simply by falsely converting some detection observations into negative responses or some no-detection observations into positive ones.

As in the other two models, there are two types of parameters.
The true detection probabilities $q(s)$ are considered to be stimulus-determined, and the proportions of falsified responses are bias parameters which depend upon, among other things, the outcomes. The main body of the Békésy-Stevens work is concerned with the dependence of the stimulus parameters upon the stimulus (see Secs. 6.2 and 6.3), whereas other authors have focused more upon the response model and the dependence of the biasing parameters upon the outcome structure (see Secs. 5.2 and 6.3).

We turn now to two of the simplest detection experiments and show in turn how each of these three theories tries to account for the behavior.

2. SIMPLE DETECTION

Of the possible detection experiments, the simplest is the Yes-No design. At regular time intervals, which define the trials and which are marked off by, say, lights, a stimulus may or may not be added to a continuous background. Following each such interval, the subject responds, usually by pressing one of two buttons, to the effect that "Yes, a stimulus was present" or "No, none was there." Each presentation, stimulus plus background and background alone, is repeated some hundreds of times according to a random schedule, and the conditional choice probabilities are estimated by the relative frequencies of choices.

This is easily seen to be a complete identification experiment in which $\mathcal{S} = \{A, 0\}$, where 0 is the null stimulus, $S = \mathcal{S}$, and $R = \{Y, N\}$, where $Y$ means Yes and $N$, No. The identification function is, of course, $\iota(Y) = A$ and $\iota(N) = 0$. It is convenient to think of the background as noise and to change notation to the extent that $s$ denotes "$A$ plus noise"

[...]

Fig. 4. Normal distributions of $X_1 - X_2$ for the two-alternative forced-choice design, assuming $S$ and $N$ are normally distributed with standard deviation $\sigma$ and separated by an amount $d$. (Horizontal axis: logarithm of the ratio of likelihood ratios.)
6.3), or when each interval includes a stimulus and they differ only in one dimension, as in the traditional discrimination experiments (see Sec. 3.1 of Chapter 4), then it is usually assumed that the effects of successive presentations are correlated. No very firm argument has been given why this should be, but the feeling seems to be that the random errors introduced by the subject are due to comparatively slow changes in his reactions to the dimension being varied. When, however, the background is random noise and the stimulus is a tone, it is believed to be more plausible that successive presentations have effects that are independent of one another. Under that assumption, it is easy to determine the two distributions we need by invoking the well-known fact that the distribution of the difference of two independent, normally distributed random variables is normal with mean equal to the difference of the means and variance equal to the sum of the variances. Thus the distribution of effects from $(s, n)$ is normal with mean $d$ and standard deviation $\sqrt{2}\,\sigma$, and that from $(n, s)$, normal with mean $-d$ and standard deviation also $\sqrt{2}\,\sigma$. The situation is shown graphically in Fig. 4. From this and the assumed decision rule, Eq. 10, we obtain

[...]

Observe that $c' = 0$ corresponds to no bias in the sense that $p(1 \mid (s, n)) = p(2 \mid (n, s))$. The data from two-alternative experiments suggest that there is little or no bias when the presentation probability is 1/2 and the payoff matrix is symmetric, so we will assume that $c' = 0$ for such studies. The common response probability, the probability of a correct response, is denoted $p_2(C)$: 2 for the number of alternatives and $C$ for "correct." Given an estimate of $p_2(C)$ from data, we can calculate the corresponding normal deviate $2d/\sqrt{2}\,\sigma = \sqrt{2}\,d'$, where $d'$ is the sensitivity parameter used in the analysis of the corresponding Yes-No experiments.
Thus $d'$ can be independently estimated from both Yes-No and two-alternative forced-choice experiments. Later, in Sec. 2.4, we compare Yes-No and forced-choice estimates for two sets of data.

2.2 Choice Analysis

If we assume that the choice model, Eq. 5 and Assumptions 1 to 3, holds, and if we denote $\eta(s, n)$ by $\eta$, $b(N)/b(Y)$ by $b$, and recall that $\eta(s, s) = 1$, then the scale values for the Yes-No experiment are

                              Response
                              Y         N
Stimulus presentation   s     1         ηb
                        n     η         b

The corresponding confusion matrix is

                              Y             N
                        s     1/(1 + ηb)    ηb/(1 + ηb)
                        n     η/(η + b)     b/(η + b)

In what follows, we usually write down only the tables of scale values and not the corresponding probability tables, which are obtained by dividing each scale value by the sum of the scale values in its row.

As with signal detectability theory, this model describes the response frequencies perfectly: there are two parameters to account for two independent probabilities. The equations for the parameters are

$$\eta = \left[\frac{p(N \mid s)\, p(Y \mid n)}{p(Y \mid s)\, p(N \mid n)}\right]^{1/2}, \qquad b = \left[\frac{p(N \mid s)\, p(N \mid n)}{p(Y \mid s)\, p(Y \mid n)}\right]^{1/2}. \qquad (13)$$

Our interest in $\eta$ and $b$ is not as simple transformations of the Yes-No data but in the possibility that they can be used to predict other data. This possibility stems from our interpretation of the two scales: under otherwise fixed experimental conditions and for a given subject, $\eta$ is supposed to depend only upon the stimuli and $b$, upon the payoffs, presentation probabilities, and instructions. It is believed that $\eta$ is a measure of the subject's detection sensitivity, just as $d'$ is in the first model, and that $b$ is a bias which, like $c$, reflects the relative attractiveness to him of the two responses.

The matrix of scale values for the two-alternative, forced-choice design is

                  1                           2
(s, n)    η((s,n),(s,n)) b(1)         η((s,n),(n,s)) b(2)
(n, s)    η((n,s),(s,n)) b(1)         η((n,s),(n,s)) b(2)

By Assumption 2, we know that

$$\eta((s, n), (s, n)) = \eta((n, s), (n, s)) = 1,$$

and because $S \subset \mathcal{S} \times \mathcal{S}$, it is reasonable to invoke Assumption 4 under the same conditions as we did the independence assumption in the signal detectability model (e.g., a noise background), yielding

$$[\log \eta((s, n), (n, s))]^2 = [\log \eta(s, n)]^2 + [\log \eta(n, s)]^2 = [-\sqrt{2}\, \log \eta]^2 = [-\log \eta^{\sqrt{2}}]^2.$$
Thus, by Assumption 1, $\eta((n, s), (s, n)) = \eta((s, n), (n, s)) = \eta^{\sqrt{2}}$, and so, letting $b' = b(2)/b(1)$, the matrix of scale values reduces to

                  1              2
(s, n)            1              η^√2 b'
(n, s)            η^√2           b'

where $\eta$ is the Yes-No detection parameter. Note that $b' = 1$ corresponds to no response bias.

Again, there is no question about the model reproducing the data. What is not automatic is that the estimate of $\eta$ from the Yes-No data will be the same as that from the forced-choice data, as is alleged by the theory.

[...]

two give him no hint how to respond. Presumably, these are the observations he should bias, at least when there is nothing to drive him to extreme biases. We assume this.

To write the equations for the response probabilities, we need to know the probabilities of each of these observation outcomes for each of the presentations. As with the other two models, we assume that successive stimulus effects are independent, so that, for example, the probability of a $(D, \bar{D})$ observation when $(s, n)$ is presented is simply $q(s)[1 - q(n)]$. The other cases are similar:

[...]

where $u$ and $w$ are biasing parameters such that $0 < u, w < 1$. Again, the model has too many parameters to permit any check on it with just forced-choice data. In Sec. 3 we discuss experiments in which it, as well as the other models, can be tested.

2.4 Comparison of Models with Data

Swets (1959) reported data for three subjects run in both Yes-No and two-alternative forced-choice designs. The stimuli were 1000-cps tones of 100-ms duration at several different energy levels in a background of white noise. Five hundred observations were obtained from each subject in each energy-design condition. The presentation probabilities were approximately 1/2 and a symmetric payoff matrix was used. The data, which Professor Swets has kindly provided me, and the estimates of $d'$ for two different designs are shown in Table 1.
Shipley (1961) also ran three subjects in both designs, using a background of white noise and 500- and 1000-cps stimuli at one energy level each. A total of 1600 observations were obtained in each condition for each subject using presentation probabilities of 1/2 and a symmetric payoff matrix. The data and $d'$ estimates are shown in Table 2.

With the exception of subject 3 on the 1000-cps stimulus, all pairs of estimates are within 10 per cent of each other. Shipley's estimates seem considerably more consistent than Swets's, but in large part this is due to the increased number of observations. To see this, suppose $d' = 1.2$ and $p(Y \mid n) = 0.2$; then $p(Y \mid s) = 0.640$. An increase of 10 per cent in $d'$, with $p(Y \mid n) = 0.2$, yields $p(Y \mid s) = 0.683$. With 250 $s$ presentations, a difference of $0.683 - 0.640 = 0.043$ corresponds to about 1.4 standard deviations, whereas with 800 presentations it corresponds to about 2.5 standard deviations.

These same data are reanalyzed in terms of the choice model in Tables 3 and 4, and much the same pattern is exhibited as for the detectability model.
For example, the largest difference in Table 4 is in the same place and is of comparable magnitude. In both analyses the Yes-No parameters do not seem to be consistently larger or smaller than the forced-choice ones.

Table 1. Yes-No and Two-Alternative Forced-Choice Acoustic Data (Swets, 1959) and the Corresponding Estimates of d'

                       Yes-No             Forced-Choice                   d'
Subject  S/N in db  p(Y|s)  p(Y|n)   p(1|(s,n))  p(1|(n,s))   Yes-No  Forced-Choice
1           9.4     0.793   0.226      0.824       0.187       1.57       1.29
           14.5     0.872   0.180      0.931       0.060       2.05       2.15
           16.6     0.902   0.120      0.963       0.071       2.45       2.29
2           9.4     0.753   0.288      0.670       0.149       1.24       1.03
           11.7     0.771   0.254      0.777       0.194       1.40       1.14
           14.5     0.833   0.295      0.854       0.145       1.51       1.50
           16.6     0.867   0.232      0.855       0.078       1.83       1.83
3           9.4     0.731   0.195      0.835       0.149       1.48       1.43
           11.7     0.836   0.254      0.870       0.142       1.65       1.56
           14.5     0.816   0.169      0.959       0.125       1.85       1.96
           16.6     0.895   0.149      0.953       0.037       2.29       2.45

The stimuli were 1000-cps tones of 100-ms duration in noise of 50 db re 0.0002 dyne/cm²; 500 observations were made on each subject at each energy level for each condition. Presentation probabilities of about 1/2 and symmetric payoff matrices were used.

Table 2. Yes-No and Two-Alternative Forced-Choice Acoustic Data (Shipley, 1961) and the Corresponding d'

                 Yes-No             Forced-Choice                   d'
Subject       p(Y|s)  p(Y|n)   p(1|(s,n))  p(2|(n,s))   Yes-No  Forced-Choice
Stimulus 1
1             0.768   0.148      0.895       0.880       1.78       1.72
2             0.712   0.258      0.798       0.796       1.20       1.18
3             0.746   0.216      0.836       0.866       1.44       1.48
Stimulus 2
1             0.695   0.201      0.835       0.838       1.35       1.38
2             0.675   0.199      0.795       0.812       1.30       1.20
3             0.693   0.287      0.791       0.832       1.07       1.25

Each stimulus lasted for 100 ms in a 500-ms interval and was imbedded in wide-band noise at 0.0435 volt across the terminals of the earphones. Stimulus 1 was 500 cps at 0.0023 volt and Stimulus 2 was 1000 cps at 0.0026 volt. Each presentation of each condition occurred approximately 800 times. Presentation probabilities of 1/2 and symmetric payoff matrices were used.
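The d' columns of Tables 1 and 2 and the standard-deviation argument about sample size can be checked with a few lines of Python. This is only a sketch under stated assumptions: it assumes the equal-variance normal model, in which the Yes-No estimate is z[p(Y | s)] - z[p(Y | n)] and the forced-choice estimate is the corresponding z-difference divided by √2; the function names are hypothetical, and the binomial standard deviation is taken at the baseline p(Y | s). For Table 2, where the second forced-choice column is p(2 | (n, s)), one would first convert to p(1 | (n, s)) = 1 - p(2 | (n, s)).

```python
from math import sqrt
from statistics import NormalDist

UNIT_NORMAL = NormalDist()  # mean 0, standard deviation 1


def dprime_yes_no(p_yes_given_s: float, p_yes_given_n: float) -> float:
    """d' = z[p(Y|s)] - z[p(Y|n)] under the equal-variance normal model."""
    return UNIT_NORMAL.inv_cdf(p_yes_given_s) - UNIT_NORMAL.inv_cdf(p_yes_given_n)


def dprime_forced_choice(p1_given_sn: float, p1_given_ns: float) -> float:
    """Two-interval estimate, divided by sqrt(2) to put it on the Yes-No scale."""
    z_difference = UNIT_NORMAL.inv_cdf(p1_given_sn) - UNIT_NORMAL.inv_cdf(p1_given_ns)
    return z_difference / sqrt(2)


def hit_rate(d_prime: float, p_yes_given_n: float) -> float:
    """p(Y|s) implied by a given d' and false-alarm rate p(Y|n)."""
    return UNIT_NORMAL.cdf(d_prime + UNIT_NORMAL.inv_cdf(p_yes_given_n))


def shift_in_sd_units(d_prime: float, p_yes_given_n: float,
                      n_signal_trials: int, relative_change: float = 0.10) -> float:
    """Change in p(Y|s) caused by a relative change in d', in binomial SD units."""
    base = hit_rate(d_prime, p_yes_given_n)
    raised = hit_rate(d_prime * (1 + relative_change), p_yes_given_n)
    binomial_sd = sqrt(base * (1 - base) / n_signal_trials)
    return (raised - base) / binomial_sd


if __name__ == "__main__":
    # Subject 1 at 9.4 db in Table 1.
    print(dprime_yes_no(0.793, 0.226), dprime_forced_choice(0.824, 0.187))
    # The d' = 1.2, p(Y|n) = 0.2 example with 250 and 800 signal presentations.
    print(shift_in_sd_units(1.2, 0.2, 250), shift_in_sd_units(1.2, 0.2, 800))
```

Applied to the first row of Table 1, the two estimates come out close to the tabled 1.57 and 1.29, and the shift measured in standard-deviation units grows with the square root of the number of s presentations, which is the point of the comparison between the two experiments.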
Thus, although the two theories differ in their approach, it is evident that they do not differ appreciably in their predictions from one simple detection design to another. To see this more vividly, we make the following calculation. For each of several different values of d', determine from Eq. 9 the values of p(Y | s) and p(Y | n) corresponding to different choices of c. Elliott's (1959) tables of d' are handy for this. For these pairs of probabilities, determine η from Eq. 13. A plot of the logarithm of η versus d' is shown in Fig. 5. It is evident that the relation is approximately linear and that the correlation is high. The points that differ the most from the main trend are those for which at least one of the probabilities is near 0 or 1.

As noted earlier, the threshold model has too many parameters to be tested with these data.

Table 3. Estimates of η for Swets's (1959) Data. Columns: Subject; S/N in db; η (Yes-No); η (Forced-Choice). See Table 1 for a description of the experimental conditions. [Table values missing in source.]

Table 4. Estimates of η for Shipley's (1961) Data. Columns: Subject; Stimulus 1 (Yes-No, Forced-Choice); Stimulus 2 (Yes-No, Forced-Choice). See Table 2 for a description of the experimental conditions. [Table values missing in source.]

[...]

Fig. 6. Typical signal detectability and choice model isosensitivity curves for the Yes-No design. The data points, reported in Tanner, Swets, & Green (1956), were obtained by presenting pure tones in noise, with P varied from 0.1 to 0.9 in steps of 0.2 and a fixed symmetric payoff matrix.

Note that the theoretical curves in Fig. 6 are symmetric about the main diagonal that runs from (0, 1) to (1, 0). Not all data are symmetric, however, as the visual ones shown in Fig. 7 for one of four subjects studied by Swets, Tanner, and Birdsall (1955, 1961) indicate. The empirical isosensitivity curve was swept out by varying the payoffs and holding P = 1/2. The data for the other subjects are similar.
It is evident that these data reject both the choice and the equal-variance signal detectability models. The theoretical curves of Fig. 7, one of which corresponds reasonably well with the data, were obtained from the detectability model by assuming that the stimulus plus noise standard deviation is 1 + d'/4 times the noise standard deviation. For these data, d' is in the range of 2 to 4, so the factor is 1.5 to 2. Thus a second stimulus parameter allows detectability theory to account for these data, but I am at a loss to understand why adding a faint tone to the noise should have such major repercussions on the variance of the distribution of effects. In all likelihood, there is some plausible way to add a second stimulus parameter to the choice model so that it does just about as well, but none has yet been suggested.

Fig. 7. Nonsymmetric signal detectability isosensitivity curves for the Yes-No design. The data points were obtained by presenting local increases in light intensity, with P = 1/2 and different payoff matrices. See text for an explanation of the theoretical curves. Adapted with permission from Swets, Tanner, & Birdsall (1961, p. 319).

The threshold theory isosensitivity curves are obtained by eliminating t from Eq. 15 and u from Eq. 16:

$$
p(Y \mid s) =
\begin{cases}
\dfrac{q(s)}{q(n)}\, p(Y \mid n), & 0 \le p(Y \mid n) \le q(n), \\[2ex]
1 - \dfrac{1 - q(s)}{1 - q(n)}\,[1 - p(Y \mid n)], & q(n) \le p(Y \mid n) \le 1.
\end{cases}
$$

This equation represents two line segments: one from (0, 0) to (q(n), q(s)), which is referred to as the lower limb, and the other, the upper limb, from (q(n), q(s)) to (1, 1).

Fig. 8. Threshold isosensitivity curve fitted by eye to subject 1 acoustic data previously plotted in Fig. 6. The theoretical crosses are explained in Sec. 5.2.

The traditional "correction-for-guessing" procedure assumes that q(n) = 0, in which case the lower limb runs along the ordinate to q(s), at which point the upper limb departs for (1, 1). It is abundantly clear that the q(n) = 0 model does not describe the data of Figs.
6 or 7; as a result, Tanner and Swets (1954a) concluded that these detection data reject the high-threshold hypothesis. Sometimes their conclusion has been interpreted as a rejection of all sensory thresholds, but the more general threshold model appears to be quite adequate. In Fig. 8, threshold curves are fitted to Tanner, Swets, and Green's (1956) subject 1 data and in Fig. 9 to those of Swets, Tanner, and Birdsall's (1955) subject 4. These theoretical curves are comparable to those of Fig. 7, not to the symmetric ones of Fig. 6, because the threshold model, like the unequal-variance signal-detectability model, has two estimated parameters. The threshold curves appear to be just as satisfactory as the signal-detectability ones.

I shall not carry out the parallel development of isosensitivity curves for the two-alternative forced-choice design using the detectability or choice models. Suffice it to say that the same equations result, except that √2 d' replaces d' and η^(√2) replaces η. The isosensitivity curve for the threshold model is obtained by subtracting the second expression in Eq. 18 from the first and rewriting the result as

[...]

successive presentations, except when explicitly stated otherwise. In particular, this means that successive presentations cannot simply be tape recordings of the initial one. The reason for imposing this experimental limitation is to permit us to assume independence of effects in the analysis, as was done earlier in the discussions of the two-alternative forced-choice design.

4.1 Signal Detectability Analysis

Let X_i denote the logarithm of the likelihood ratio of the observation on the ith look of the multiple-look Yes-No experiment; then it is assumed that

$$ X = \sum_{i=1}^{m} X_i $$

is the random variable used to arrive at a decision. First, suppose noise is presented m times.
If each presentation is independent and normally distributed with mean 0 and standard deviation σ, then X is normally distributed with mean 0 and standard deviation σ_m = √m σ. Similarly, if the stimulus is presented each time and the presentations are independent, then the mean is d_m = md and the standard deviation is σ_m = √m σ. Thus the effective detection parameter is

$$ d'_m = \frac{d_m}{\sigma_m} = \frac{m d}{\sqrt{m}\,\sigma} = \sqrt{m}\, d', $$

and so we can predict multiple-look data from simple Yes-No data (Swets, Shipley, McKey, & Green, 1959).

The generalization of the two-alternative signal detectability model to the k-alternative forced-choice design is comparatively complicated if response biases are included and very simple if they are not. I shall sketch the general idea of the former and carry out the latter in detail.

As presented in terms of differences, it is not easy to see how to generalize the two-alternative analysis; however, if we view it in a different but equivalent way, the outlines of the generalization become clear (Swets & Birdsall, 1956). As before, suppose that the observations in the two intervals are independent, in which case it is plausible to represent the two decision axes as orthogonal coordinates in the plane. Joint normal distributions for (s, n) and for (n, s) are assumed to exist and to have equal variances. When projected on either axis, these distributions generate the usual one-dimensional noise and signal distributions, the means of the noise distributions being at the origin. This is diagrammed in Fig. 11.

Fig. 11. The two-dimensional signal detectability representation of the two-alternative forced-choice experiment. The dotted lines represent the family of decision rules that correspond to the cutpoints in the decision axis representation.

The observational random variable is the pair (X_1, X_2), and the decision rule is no longer characterized by a point but by a division of the plane into two nonoverlapping regions.
Under reasonable assumptions about the subject's goals, it can be shown that the division of the plane must be by a line located at 45° between the two decision axes; typical ones are shown dotted. It is not difficult to see that our original representation in terms of differences is simply the projection of the present model onto a plane orthogonal both to this family of 45° lines and to the decision plane. The intersection of each 45° line with this plane corresponds to a possible cut-point c'.

The generalization to k alternatives is now clear. There are k random variables, X_1, X_2, ..., X_k, corresponding to observations in each of the intervals. For each stimulus presentation, s_i, where the stimulus is in the ith interval and noise is in the others, there is a density function having the value p(x_1, x_2, ..., x_k | s_i) at X_1 = x_1, X_2 = x_2, ..., and X_k = x_k. These are assumed to be independent multivariate normal distributions with equal variances. The decision rule is a partition of the k-dimensional Euclidean space into k response regions; the simplest rule involves a division of the space by hyperplanes. The mathematics required for specific numerical calculations is, of course, rather clumsy, and, so far as I know, no actual work has used this general form of the model.

If, however, we assume that the payoffs are symmetric and that the subject introduces no biases, matters are very much simpler. The subject makes an observation, X_i, in each interval, and he is assumed simply to say that the stimulus is located in the interval having the largest observation. Because there is no bias, it does not matter which interval actually contains the stimulus; the probability of a correct response, p_k(C), is the same for all.
The probability density that s generates an effect x that is the largest is simply p(x | s) times the probability that all k - 1 of the noise observations are less than x, that is, p(x | s) P(x | n)^(k-1), where

$$ P(x \mid n) = \int_{-\infty}^{x} p(t \mid n)\, dt. $$

Because the particular value of the largest observation is immaterial to the response made, we integrate over all x to obtain

$$ p_k(C) = \int_{-\infty}^{\infty} p(x \mid s)\, P(x \mid n)^{k-1}\, dx. \tag{23} $$

Of course, we assume that p(x | s) and p(x | n) are normal, have the same variance, and are separated by an amount d, just as in the Yes-No model. That being so, an estimate of d' from either the Yes-No or two-alternative forced-choice data is sufficient to predict p_k(C) from Eq. 23 (Tanner & Swets, 1954a).

The analysis of the multiple-look k-alternative forced-choice design is analogous to that for the multiple-look Yes-No design (Swets, Shipley, McKey, & Green, 1959). The subject is assumed to make observation X_ij in interval i on the jth observation. The sums

$$ \sum_{j=1}^{m} X_{ij} $$

are calculated, and then the subject chooses the interval having the largest sum. Because the X_ij are assumed to be independent and normally distributed, each sum is normally distributed with mean 0 and standard deviation √m σ when n is presented in interval i and with mean md and standard deviation √m σ when s is presented. Thus Eq. 23 can be used to calculate the probability of a correct detection. Note that if d'(k) denotes the value of d' estimated from the simple k-alternative forced-choice design and d'_m(k) denotes that corresponding to the m-look design, they are related by

$$ d'_m(k) = \sqrt{m}\, d'(k). \tag{24} $$

4.2 Choice Analysis

By repeated use of the independence Assumption 4, it is easy to see that the choice model matrix of scale values for the multiple-look Yes-No design is

[...]

4.4 Comparison of Models with Data

There do not appear to be any published raw data for the multiple-look Yes-No design, and so all we can do is attempt to compare the several theories.
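Because Eq. 23 has no simple closed form for general k, a numerical sketch may help in following the comparisons below. The Python fragment that follows is an illustration, not the procedure used in the studies cited: it assumes unit-variance normal densities with noise mean 0 and signal mean d', evaluates the integral of p(x | s) P(x | n)^(k-1) by the trapezoid rule over an arbitrarily chosen range of plus or minus 8 standard deviations, and applies the √m scaling of Eq. 24 for m looks. The function names and the step count are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_correct_k_afc(d_prime: float, k: int, steps: int = 4000) -> float:
    """Trapezoid-rule value of the integral of p(x|s) * P(x|n)**(k-1) over x."""
    noise = NormalDist(0.0, 1.0)       # noise-alone effects
    signal = NormalDist(d_prime, 1.0)  # signal-plus-noise effects
    lower, upper = -8.0, d_prime + 8.0  # range wide enough to hold the mass
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lower + i * width
        end_weight = 0.5 if i in (0, steps) else 1.0
        total += end_weight * signal.pdf(x) * noise.cdf(x) ** (k - 1)
    return total * width


def multi_look_dprime(d_prime: float, looks: int) -> float:
    """Effective d' after `looks` independent observations (square-root rule)."""
    return sqrt(looks) * d_prime


if __name__ == "__main__":
    for k in (2, 3, 4, 5, 8):
        single = prob_correct_k_afc(1.5, k)
        four_looks = prob_correct_k_afc(multi_look_dprime(1.5, 4), k)
        print(k, round(single, 3), round(four_looks, 3))
```

Two limiting cases give a quick check on the sketch: with d' = 0 the result should be the chance level 1/k, and with k = 2 it should agree with the unit normal distribution function evaluated at d'/√2.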
In all three cases there is a free parameter which gives one a good deal of freedom: neither the bias parameters in the detectability and choice models nor the value of k_m in the threshold model need be the same as the number of looks is changed. If we assume no bias in the first two models and suppose that, for m = 1, p(Y | s) = [illegible] and p(Y | n) = [illegible], then we get the solid points on the diagonal of Fig. 12. The two models differ so little that the separate points cannot be shown on a graph of this size. The majority rule for the threshold model also yields points on the diagonal, but they do not approach the corner quite so rapidly. The threshold m = 7 and 8 point is nearly the same as the detectability and choice m = 5. The other two sets of threshold points are for the extremes k_m = 1 and k_m = m, and it is clear that by other choices for k_m almost any other curve to the left and above these extremes can be generated. Much the same is true for the other two models because of the freedom in choosing the bias parameter.

Fig. 12. Typical predictions of the several models for the multiple-look Yes-No design. The parameter m denotes the number of independent repetitions of the stimulus plus noise or of the noise alone. (Legend: choice and signal detectability; majority rule threshold; k = 1 threshold; k = m threshold. Abscissa: p(Y | n).)

In addition to the Yes-No and two-alternative forced-choice data given in Table 1, Swets (1959) collected four-alternative forced-choice data on the same subjects. Using Eq.
23, or Elliott's (1959) tables, p_4(C) for the detectability model can be predicted from the observed values of p_2(C) and for the choice model from Eq. 27. The predictions for the threshold model, Eq. 30, depend upon knowing both q(n) and q(s). These may be estimated from the Yes-No and two-alternative forced-choice data, with, however, the upper limb-lower limb ambiguity inherent in the Yes-No model. The details about how this was done can be found in Luce (1963). Both sets of estimates and the predictions for all three models are shown in Table 5. It is clear that there is little to choose between the detectability and choice models. The threshold model is adequate only if we admit the possibility that the subjects did not all operate on the same limb and that some may have shifted from one limb to the other as the stimulus energy was increased. Both seem like reasonable possibilities.

Table 5. Observed and Predicted Values of p_4(C) for Swets's (1959) Data. Columns: Subject; E/N_0 in db; threshold parameters q(n) and q(s) for the lower limb and for the upper limb; observed p_4(C); predicted p_4(C) for the signal detectability, choice, threshold upper limb, and threshold lower limb models. See Table 1 for a description of experimental conditions. [Table values missing in source.]

Swets (1959) also reported p_k(C) estimates for three subjects and k = 2, 3, 4, 5, and 8. Assuming equal biases, so that Eqs. 23 and 27 can be used, the detectability and choice models can be compared. The results are shown in Fig. 13, and they clearly favor the detectability model.

Fig. 13. Typical predictions of the signal detectability (solid curves) and choice (dotted curves) models for the unbiased k-alternative forced-choice design. The data points for these subjects are from Swets (1959).

To what extent this conclusion depends upon the assumption of equal biases is not clear.
If, for example, the biases were U-shaped, so that the relative frequency of the first and last responses was in excess of 1/k and of the middle ones less than 1/k, then it is quite possible that p_[illegible](C) would be artificially inflated and that p_k(C), k > 5, would be artificially deflated. If that were the case, the data surely would not support the choice model and, depending upon the magnitude of the effect, might very well not support the signal detectability model. On the other hand, if the biases formed an inverted U, then p_[illegible](C) could easily be deflated and p_k(C), k > 5, inflated. If the effect were large enough, this could cause us to accept the choice and reject the detection model. Swets does not indicate the nature of the biases in his data, but my best guess (based upon biases in recognition confusion matrices) is that they were of the first type, in which case the choice model is inadequate to account for these data. Until more detailed data are available, however, no very certain decision is possible.

Without Yes-No data on the same subjects under the same conditions, it is impossible to predict p_k(C) uniquely using the threshold model. The

[...]

was designed to test among the three models but rather to decide about the adequacy of the signal detectability one. So far as these data are concerned, there is nothing in my opinion that clearly favors one model over another. There is some suggestion in the k-alternative forced-choice data that the choice model is inferior to the detectability one, but the reverse is true for the multiple-look four-alternative data. Because of the threshold model's larger number of parameters, none of these experiments adequately taxes it.

5. THEORIES OF THE BIAS PARAMETERS

To a traditional psychophysicist, what we have been doing so far in this chapter must seem strange, if not totally irrelevant to his interests.
He wants to know the laws relating responses to well-controlled, specifiable stimuli, and yet nothing at all has been said about them. The reason is that many contemporary psychophysicists do not believe that this problem is nearly so straightforward as it seems. The current view is that it should be divided into three distinct parts.

The first is a theory of responses that relates responses to responses, not to stimuli. Such theories (they are what we have discussed so far) contain estimable parameters, such as η, d', or q(s) and q(n), and b, c, or t and u, which are thought to depend upon and to summarize the relevant decision-making effects of the stimulating and reward conditions of the experiment. Because such parameters can be estimated from the response data, there is actually no need to measure the physical properties of the stimuli or the characteristics of the outcome structure of the experiment; they need only be under control and reproducible at will. This sort of theory, as we have seen, uses the data from one experiment to predict the results of others having different designs but involving the same stimuli, background, and residual environment.

Once such a theory is developed and has received enough confirmation so that one feels that it may be approximately correct, one can begin to look into the other two problems: first, relations between the stimulus parameters of the theory and measurable properties of the stimuli, and second, relations between bias parameters and other aspects of the experimental conditions. There is precious little point, however, in trying to establish such relations until the response-response theory has been rather carefully tested.
If we are correct in supposing that parameters of the one class measure the subject's sensitivity to the stimuli and those of the other measure response biases that are under his control, then we must anticipate separate theories relating each to certain aspects of the experimental situation. This section presents two quite different theories for the bias parameters. The next discusses theories of the sensitivity parameters.

5.1 Expected Value

As we have seen in Sec. 3 on isosensitivity curves, experimental manipulations of either the presentation probability or of the payoff matrix appreciably affect the response probabilities, even when the stimulating conditions are fixed. This suggests that a theory of the bias parameters must involve at least these two experimental factors.

Broadly speaking, mathematical psychologists have come up with two ideas about this dependence. The one that we look into in this subsection stems mainly from the economic and statistical literature. It says that subjects choose the parameters to optimize something. The other, which is discussed in the next subsection, says that subjects continually adjust the parameters in an "adaptive" fashion; that is, they learn.

Suppose that the presentation and payoff structure in the Yes-No design is

Presentation    Stimulus            Response
probability     presentation      Y        N
P               s                 o11      o12
1 - P           n                 o21      o22

where the o_ij are sums of money. One reasonably sensible criterion that a subject might use is to select that bias parameter that maximizes his total expected money return during the course of the experiment. Because the trials are assumed to be independent and because the response probabilities are assumed to be constant, this is the same as selecting it to maximize the expected value of a single trial. This assumption is criticized later.
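As a rough numerical illustration of this criterion, the following Python sketch searches for the cut-point c that maximizes the expected payoff of a single trial. Everything specific in it is an assumption made for illustration: the equal-variance normal detectability model with noise mean 0, signal mean d', and unit variance; the reading of o11, o12, o21, o22 as the payoffs for (s, Y), (s, N), (n, Y), (n, N); and the hypothetical function names. For this particular model the maximizing cut-point also has the closed form c = ln β / d' + d'/2, where β = (1 - P)(o22 - o21) / [P(o11 - o12)], provided that o11 > o12 and o22 > o21; the sketch compares the two.

```python
from math import log
from statistics import NormalDist

UNIT_NORMAL = NormalDist()


def expected_value(c, d_prime, P, o11, o12, o21, o22):
    """Expected payoff of one trial when the subject says Yes above cut-point c."""
    p_yes_s = 1.0 - UNIT_NORMAL.cdf(c - d_prime)  # p(Y|s)
    p_yes_n = 1.0 - UNIT_NORMAL.cdf(c)            # p(Y|n)
    from_signal = p_yes_s * o11 + (1.0 - p_yes_s) * o12
    from_noise = p_yes_n * o21 + (1.0 - p_yes_n) * o22
    return P * from_signal + (1.0 - P) * from_noise


def best_cutpoint_by_search(d_prime, P, o11, o12, o21, o22,
                            lower=-5.0, upper=5.0, steps=10000):
    """Grid search for the cut-point with the largest expected payoff."""
    width = (upper - lower) / steps
    grid = (lower + i * width for i in range(steps + 1))
    return max(grid, key=lambda c: expected_value(c, d_prime, P, o11, o12, o21, o22))


def best_cutpoint_closed_form(d_prime, P, o11, o12, o21, o22):
    """Cut-point at which the likelihood ratio p(c|s)/p(c|n) equals beta."""
    beta = ((1.0 - P) * (o22 - o21)) / (P * (o11 - o12))
    return log(beta) / d_prime + d_prime / 2.0


if __name__ == "__main__":
    settings = dict(d_prime=1.5, P=0.25, o11=1.0, o12=-1.0, o21=-1.0, o22=1.0)
    print(best_cutpoint_by_search(**settings), best_cutpoint_closed_form(**settings))
```

With a symmetric payoff matrix and P = 1/2, β equals 1 and the cut-point falls midway between the two means; making the stimulus rarer raises β and moves the cut-point toward the signal distribution, so the subject says Yes less often.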
The expected outcome, E(o), is simply the money value of each of the four possible presentation-response conditions weighted by their respective probabilities of occurring:

$$
\begin{aligned}
E(o) &= P\,[\,p(Y \mid s)\,o_{11} + p(N \mid s)\,o_{12}\,] + (1 - P)\,[\,p(Y \mid n)\,o_{21} + p(N \mid n)\,o_{22}\,] \\
     &= [\,p(Y \mid s) - \beta\, p(Y \mid n)\,]\,P\,(o_{11} - o_{12}) + P\,o_{12} + (1 - P)\,o_{22},
\end{aligned}
\tag{31}
$$

where

$$ \beta = \frac{(1 - P)(o_{22} - o_{21})}{P\,(o_{11} - o_{12})}. \tag{32} $$

If the response probabilities depend upon a single bias parameter z, then to find that value of z that maximizes E(o) we set the derivative of E(o) with respect to z equal to 0 and solve for z:

$$ \frac{dp(Y \mid s)}{dz} = \beta\, \frac{dp(Y \mid n)}{dz}. \tag{33} $$

For the signal detectability model, we simply calculate the derivative of Eqs. 8 with respect to c and find that

$$ \frac{p(c \mid s)}{p(c \mid n)} = \beta. \tag{34} $$

Thus, given the payoffs and presentation probabilities, we can calculate β and from this determine c via Eq. 34, provided that we know the forms of p(· | s) and p(· | n). An exactly parallel development holds for the two-alternative forced-choice design, except that β equals the ratio of the difference density for (s, n) to the difference density for (n, s).

Equation 33 says that the slope of the isosensitivity curve should equal the optimum β defined in Eq. 32, so that one comparison we can make is between these two quantities, using, say, the theoretical signal detectability curve to estimate the slope. Green (1960) has done this, and the results are shown in Fig. 15. It is clear that the data depart considerably from the

Fig. 15. Obtained versus optimum values of β assuming a maximization of expected value and the detectability Yes-No model. Adapted with permission from Green (1960, p. 1195).

[...]

choice model this is the only other possibility. If so, then the probability of applying an operator depends upon the product of the presentation probability, which is constant during the experimental run, and the response probability, which is not. Because the probability that a particular operator will be applied is changing over trials, the resulting stochastic process is exceedingly complicated.
At present too little is known about its asymptotic properties for it to be of any use to us. This is, of course, a limitation in practice, not in principle.

So we confine our attention to models in which the subject decides how to change his bias on the basis of the presentation and the internal observation resulting from it. These models are called experimenter-controlled in learning theory. In many ways the internal observation seems a much more relevant event than the subject's response, for it is these observations that he must use in the future to decide what responses to make. By assumption, the conditional probability of an internal effect occurring is constant over trials, so the probability of applying a given learning operator is also constant, which eliminates the major difficulty mentioned above. With the signal detectability model, however, a problem still remains, namely that there is a continuum of effects. Although Suppes (1959, 1960) has begun work on such learning models, too little is currently known about them to arrive at a theory of biasing. The choice model does not suffer from this difficulty because there are no internal observations, nor does the threshold model because there are only two observation states, D and D̄.

The choice theory learning model has already been presented in Sec. 1.2 as an argument for assuming the choice theory. As far as the biases are concerned, we found that

$$ b(r) = P[\iota(r)]\,\theta(r), $$

where ι is the identification function, P the presentation probability, and θ a learning rate parameter. Assuming that this learning model is correct, the major unsolved bias problem is how learning rates depend on the payoffs and whatever else they depend on.

A somewhat similar analysis can be given for the threshold model. Suppose, first, that the subject is operating upon the lower limb of the isosensitivity curve; that is, he is saying Yes only to a proportion t of the D observations and No, otherwise.
He is adjusting t on the basis of his experiences. It is surely inappropriate for him to change it on those trials when a D̄ observation occurs. (Such an observation may, of course, influence his decision to shift from the lower to the upper limb.) So suppose a D observation occurs. If it resulted from an s presentation, he should increase his tendency t to say Yes to D observations; whereas, if it resulted from an n presentation, he should decrease t. With this in mind and assuming linear operators, we postulate that

$$
t_{i+1} =
\begin{cases}
t_i + \theta\,(1 - t_i), & \text{if } s \text{ and } D \text{ occur on trial } i, \\
t_i - \theta'\, t_i, & \text{if } n \text{ and } D \text{ occur on trial } i, \\
t_i, & \text{if } \bar{D} \text{ occurs on trial } i,
\end{cases}
\tag{36}
$$

where t_i is the bias on trial i. It follows that the expected value of t_{i+1} given t_i is

$$ E(t_{i+1} \mid t_i) = t_i + \theta P\, q(s)\,(1 - t_i) - \theta'(1 - P)\, q(n)\, t_i. $$

Because all of the probabilities on the right are trial-independent, we can take expectations over t_i:

$$ E(t_{i+1}) = [\,1 - \theta P\, q(s) - \theta'(1 - P)\, q(n)\,]\,E(t_i) + \theta P\, q(s). \tag{37} $$

If we assume that the asymptotic expectation of t_i, call it t_∞, exists, then by taking the limit of Eq. 37 as i goes to infinity we may solve for t_∞:

$$ t_\infty = \frac{q(s)}{q(s) + q(n)\, b}, \tag{38} $$

where

$$ b = \frac{(1 - P)\,\theta'}{P\,\theta}. \tag{39} $$

Note that, as in the choice model, the bias parameters are the product of the presentation probability and the corresponding learning rate parameters.

The parallel model for the upper limb assumes

$$
u_{i+1} =
\begin{cases}
u_i + \theta\,(1 - u_i), & \text{if } s \text{ and } \bar{D} \text{ occur on trial } i, \\
u_i - \theta'\, u_i, & \text{if } n \text{ and } \bar{D} \text{ occur on trial } i, \\
u_i, & \text{if } D \text{ occurs on trial } i,
\end{cases}
\tag{40}
$$

and it results in the asymptotic expectation of u_i,

$$ u_\infty = \frac{1 - q(s)}{[\,1 - q(s)\,] + [\,1 - q(n)\,]\, b}. \tag{41} $$

The quantity b is formally similar to β (Eq. 32) in that the presentation probability enters in the same way. Presumably, the learning-rate parameters depend in some fashion upon the payoffs, but no one has yet reported a theory for this dependence. Much research is needed to determine whether this sort of model is adequate and to understand the relation between learning rates and payoffs.
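The lower-limb learning process is simple enough to simulate, which gives a feel for the asymptotic result. The Python sketch below is an illustration under explicit assumptions: it reads the linear operators as t → t + θ(1 - t) after an s presentation that yields a D observation, t → t - θ't after an n presentation that yields a D observation, and no change otherwise, and it compares the long-run average bias with q(s) / [q(s) + q(n) b], where b = (1 - P)θ' / (Pθ). The function names, parameter values, and seed are hypothetical.

```python
import random


def asymptotic_bias(q_s, q_n, P, theta, theta_prime):
    """Asymptotic expected bias on the lower limb: q(s) / [q(s) + q(n) b]."""
    b = ((1.0 - P) * theta_prime) / (P * theta)
    return q_s / (q_s + q_n * b)


def expected_bias_after(trials, t0, q_s, q_n, P, theta, theta_prime):
    """Iterate the recursion for the expected bias, one step per trial."""
    t = t0
    for _ in range(trials):
        t = t + theta * P * q_s * (1.0 - t) - theta_prime * (1.0 - P) * q_n * t
    return t


def simulated_mean_bias(trials, t0, q_s, q_n, P, theta, theta_prime, seed=0):
    """Average bias over a simulated run of the linear-operator process."""
    rng = random.Random(seed)
    t, total = t0, 0.0
    for _ in range(trials):
        total += t
        stimulus_presented = rng.random() < P
        detect = rng.random() < (q_s if stimulus_presented else q_n)
        if detect and stimulus_presented:
            t += theta * (1.0 - t)   # s and D: raise the tendency to say Yes
        elif detect:
            t -= theta_prime * t     # n and D: lower the tendency to say Yes
        # a non-detect observation leaves t unchanged
    return total / trials


if __name__ == "__main__":
    params = dict(q_s=0.8, q_n=0.2, P=0.5, theta=0.1, theta_prime=0.1)
    print(asymptotic_bias(**params))
    print(expected_bias_after(2000, 0.5, **params))
    print(simulated_mean_bias(200_000, 0.5, **params))
```

The simulated bias keeps fluctuating from trial to trial even after its average has settled, and the fluctuations are larger for larger learning rates.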
An interesting feature of these asymptotic results for the threshold model is that the response probabilities can approach the true detection probabilities only under very special conditions. If the subject is operating on the lower limb and b has a moderate value somewhere in the neighborhood of 1, then t_∞ approaches 1 only as q(n) approaches 0. On the upper limb u_∞ approaches 0 only as q(s) approaches 1. Thus, if, as in the data of Figs. 8 and 9, q(n) > 0 and q(s) < 1, the theory predicts that no data points lie in the immediate neighborhood of (q(n), q(s)), and none seems to. In other words, one effect of information feedback, according to this model, is to prevent the subject from revealing directly the true detection probabilities. It is not known what he does when there is no information feedback, but it certainly should not be assumed that p(Y | s) = q(s) and p(Y | n) = q(n) without careful investigation.

A second point of interest is that at asymptote the response probabilities are still fluctuating under the processes described by Eqs. 36 and 40. An expression can be derived for the variance of the response probability at asymptote which shows that the more rapid the learning, the larger the variance. Most experimenters feel that there is more than binomial variability in much psychophysical data, and learning may very well be one source. If so, considerable care must be exercised in applying the standard tests of significance that postulate constant underlying probabilities.

A similar learning model can be developed for the biases v and w of the two-alternative forced-choice design (see Luce, 1963). Suffice it to say that v_∞ = w_∞ = 1/(1 + b). Note that, when b = 1, v_∞ = w_∞ = 1/2, which implies p(1 | (s, n)) = p(2 | (n, s)). For P = 1/2, b = 1 if and only if θ = θ'.
Thus the apparent tendency toward behavioral symmetry when P = 1/2 and the payoff matrix is symmetric suggests that the learning rates corresponding to symmetric payoffs are approximately equal. In that case b = (1 - P)/P. Assuming this, we may use Eqs. 38 and 41 to predict the data shown earlier in Fig. 8. The predicted values, which correspond to the points by pairs as one sweeps around the isosensitivity curve, are shown as crosses in Fig. 8.

6. THEORIES OF THE STIMULUS PARAMETERS

Relatively little is yet known about the way in which the stimulus parameters of the several theories depend upon physical measures of the

[...]

performs as he does, assuming that ideal observers continue to be defined to have properties such as those listed in the foregoing quotation.

A somewhat cruder approach that nonetheless may merit attention involves parametric studies in which one or at most two physical variables are manipulated and the model parameters are calculated from the data to see what, if any, simple relations appear to exist.

6.2 The Neural Quantum Model

Suppose that we have a simple background, such as a tone, and that a stimulus involves a short duration change of the background on one dimension, such as energy. The Békésy-Stevens neural quantum model attempts to relate the change in the number of neural quanta excited by the stimulus to a physical measure of the increment (or decrement) introduced in the background. The model supposes that this physical dimension can be partitioned at any instant into nonoverlapping intervals that correspond to the neural quanta. Thus two different levels of stimulation lying within one interval excite the same number of quanta, whereas two in different intervals excite different numbers of quanta. We may think of the subject as imposing a quantal grid over the physical dimension.
Over time, the quantal grid is assumed to fluctuate slowly as the result of changes internal to the subject, and so the number of quanta excited by a constant stimulus also fluctuates, sometimes increasing, at other times decreasing. Although it is generally felt that it is the grid that shifts, it is more convenient mathematically to view the grid as fixed and to suppose that the physical measure corresponding to the stimulus does the fluctuating. The two ways of viewing the matter are completely equivalent as long as the grid is equally spaced, as we shall assume. In terms of a fixed grid, a given background will have some distribution, such as that shown in Fig. 16.

Suppose that just prior to presenting stimulus s, which, it will be recalled, is simply an increment (or decrement) in the background, the effect of the background is X. This effect is a random variable distributed in some manner, as shown in Fig. 16. The addition of s is assumed to change the effect from X to X + A(s), where A(s) depends only upon s. Thus, whenever s is presented, the same increment is always added. This is an important point. We are saying that the effect of the background just before stimulation and the effect of the background plus the stimulus are perfectly correlated. This assumption is quite different from the independence assumptions we have repeatedly made when discussing the detection of stimuli in noise. The correlation assumption is interlocked with our earlier supposition that the grid fluctuates slowly: in order that a perfect correlation may exist, the stimulus presentation must be of sufficiently short duration so that little or no change in the grid location will take place during the presentation. In practice, a duration of the order of 100 ms has been deemed sufficiently short.

Fig. 16. A typical density of assumed stimulus effects in the quantal model. (Figure labels: one quantum; X.)

It will be recalled (Sec.
1.3) that we set up the decision rule that a change in stimulation is noted when it equals or exceeds some number k of quanta. Thus, if the physical increment corresponding to s, A(s), is less than the physical increment corresponding to k - 1 quanta, it fails to produce a detection observation. If, however, it equals or exceeds that corresponding to k quanta, then it will always be detected. And when it is between that corresponding to k - 1 and k quantal intervals, a detection observation may or may not occur. To be specific, suppose that A(s) corresponds to 1/2 of a quantum interval more than k - 1 quanta; then, if the random variable X overflows an integral number of intervals by less than 1/2, the stimulus cannot excite the necessary k quanta. However, if it overflows 1/2 or more, then s excites the required k quanta. So the probability that a presentation of s produces a detection observation depends upon the probability that the background residue, as it is called, is greater than 1/2.

It follows, then, that the probability of a detection observation occurring depends upon the distribution of residues. To talk about this distribution without specifying just how many quanta are excited by the background, as we have been doing, makes sense only if the physical measure we are using has the property that all quantal intervals are of the same size. It is not obvious that the usual physical measures have this property, but under very general conditions it is possible to find a continuous monotonic transformation that has. We assume that this is the measure we are using.

Fig. 17. Predicted true probability of detection versus a measure of the stimulus increment in quantal units (i.e., the true psychometric function), assuming the quantal model, a uniform distribution of residues, and a two-quantum criterion.
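For the special case named in the caption of Fig. 17 (uniform residues, two-quantum criterion), the decision rule just described can be sketched numerically. The following is a minimal illustration, not part of the original analysis: the function names, the unit quantum size, and the seeded simulation are assumptions introduced here. A detection observation is taken to occur when the residue plus the increment spans at least k whole quanta.

```python
import math
import random


def p_detect_uniform(delta, k=2):
    """Closed form for P(detection) when residues are uniform on [0, 1).

    `delta` is the stimulus increment in quantal units and `k` is the
    criterion number of quanta. Detection requires residue >= k - delta.
    """
    return min(1.0, max(0.0, delta - (k - 1)))


def p_detect_simulated(delta, k=2, trials=100_000, seed=0):
    """Monte Carlo check of the same rule with simulated uniform residues."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(trials):
        residue = rng.random()  # background overflow within one quantum
        # Number of additional quanta excited by the increment.
        if math.floor(residue + delta) >= k:
            hits += 1
    return hits / trials
```

Under these assumptions the function is 0 below k - 1 quantal units, rises linearly between k - 1 and k, and equals 1 above k, which is the rectilinear shape the text attributes to the quantal model.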
Now, if the distribution of residues is uniform in that measure, as has generally been assumed in discussions of neural quantum theory, then the probability of detection is easily seen to be rectilinear, as in Fig. 17. Stevens, Morgan, and Volkmann (1941) attempted to argue verbally, and Corso (1956) alleged that it follows from Bayes's theorem, that the distribution of residues is uniform independent of the distribution of X. This is false. For example, suppose that X is distributed according to [...] and that the quantal boundaries are located at the points iq, where i = 0, ±1, ±2, . . . , and q > 0 is the size of one quantum. If R denotes the residue random variable, then its distribution for 0 < r < q is given by [...]

[...] then a run of identical increments is presented. The subject responds to each of these. Thus he knows in advance that Yes is the correct answer on every trial. The possibility for biasing seems great.

One school has argued as follows. Suppose that the true detection function is a smooth ogive. For a stimulus with a high detection probability, say 0.9, there is a tendency for the number of Yes responses to be inflated artificially because of two factors. One is that the subject knows that Yes is the correct response, and the other is that he is assumed to have a tendency to perseverate his responses, most of which have been Yes. This means, then, that the data function must be above the true function, and it intercepts the p = 1 line in much the same way as a linear function does. For a stimulus with a low true detection probability, say 0.1, the argument is less clear because the perseveration tendency decreases the number of Yes responses, whereas his knowledge of the presentation schedule tends to increase the number. So, according to this argument, we may expect the upper intercept to confirm the neural quantum model, but the lower one should vary from subject to subject and, on the whole, be more rounded.
Although it is difficult to prove formally, inspection of the published as well as of considerable unpublished data suggests that just the opposite is true: the lower intercept seems more stable and more in line with the quantal model than the upper one.

Assuming that the rectilinear quantal model correctly describes the dependence of the true detection probability upon the stimulus magnitude, the learning model of Sec. 5.2 suggests that the observed responses should distort this function, especially at the upper intercept when there are no or only a few catch trials (Luce, 1963). Specifically, let us assume that a detection observation occurs when and only when a two-quanta change occurs, as suggested by the data. In addition, however, let us suppose that a conservative lower-limb bias is used by the subject when the detection observation is based upon a change of only two neural quanta, whereas with three or more he uses an upper-limb bias. Thus, for any stimulus of magnitude less than two quantal units, a lower-limb bias is in force, and so, by Eqs. 15 and 38,

$$p(Y \mid s) = t_\infty \, q(s).$$

For stimuli two quantal units or larger, q(s) = 1. For such stimuli, the foregoing equation yields p(Y | s) = 1/[1 + q(n)b] for the lower limb and, by Eq. 16, p(Y | s) = 1 for the upper limb. The probability that a lower-limb bias is used decreases from 1 to 0 linearly as the stimulus magnitude increases from two to three quantal units. In summary, then, if s denotes the stimulus magnitude in quantal units, the response function is p(Y | s) = 0 if 0 < s < 1, [...]

Fig. 18. Predicted observed probability of detection versus a measure of the stimulus increment in quantal units, assuming a true underlying rectilinear function and the response biasing model described in the text.
The parameter q(n) is the true false-alarm rate and b is a quantity that depends on the frequency of "catch" trials and the learning rates of the subject.

Fig. 19. Quantal data for the detection of frequency increments by one subject at four levels of sound intensity. Each data point is based on 100 observations. The theoretical curves were drawn subject to the condition that the intercepts stand in the relation of 2 to 1. Abscissa: size of ΔF (cycles). Adapted with permission from Stevens, Morgan, and Volkmann (1941, p. 327).

[...] see that for small values of q(n)b, which, for example, corresponds to a small proportion of catch trials, the only effect is a slight distortion of the true quantal function near the upper intercept. As q(n)b becomes larger, we obtain a function that is approximately a straight line with 3:1 intercepts, and as q(n)b becomes still larger the function approaches a 3:2 line.

In spite of all the arguments why the observed functions should not be rectilinear, the surprising thing is how linear they are. In Fig. 19 are data for one subject detecting frequency increments at different levels of intensity. The theoretical lines have 2:1 intercepts. Similar data for the detection of intensity differences of a pure tone for two subjects are shown in Fig. 20. Again 2:1 lines are shown. In my opinion, the main challenge of these results for those who do not believe that thresholds exist is to explain, using a continuous theory, why the apparent intercepts should exhibit a 2:1 ratio. This has yet to be done.
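Returning to the response-biasing model behind Fig. 18, the segment between two and three quantal units is fully specified by the verbal description in the text: q(s) = 1, the lower-limb response probability is 1/[1 + q(n)b], the upper-limb probability is 1, and the chance that the lower-limb bias is in force falls linearly from 1 to 0. The sketch below covers only that segment, since the remaining cases of the response function are not available here; the function name and argument names are assumptions introduced for illustration.

```python
def p_yes_upper_segment(s, q_n, b):
    """Observed P(Yes | s) for 2 <= s <= 3 quantal units.

    s   : stimulus magnitude in quantal units
    q_n : true false-alarm rate q(n)
    b   : bias quantity from the learning model of Sec. 5.2
    """
    if not 2.0 <= s <= 3.0:
        raise ValueError("this sketch covers only 2 <= s <= 3")
    weight_lower = 3.0 - s               # P(lower-limb bias): 1 at s=2, 0 at s=3
    p_lower = 1.0 / (1.0 + q_n * b)      # lower-limb response probability, q(s)=1
    p_upper = 1.0                        # upper-limb response probability
    return weight_lower * p_lower + (1.0 - weight_lower) * p_upper
```

When q(n)b is small the segment stays close to 1, and when q(n)b is very large it approaches the straight line s - 2, which agrees with the qualitative remark that the function then tends toward a 3:2 line.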
7. PURE RECOGNITION

[...]

Table 7. Percentage of Detected and of Undetected Responses Recognized as Stimulus s (Shipley, 1961). Yes-No design: for each stimulus presentation and each of Subjects 1, 2, and 3, the columns are Detected and Undetected. Forced-choice design: for each stimulus presentation and each of Subjects 1, 2, and 3, the columns are Correct Detection and Incorrect Detection. [Table entries not preserved.]

[...] of detectability could easily lead to apparent differences in recognition, even when none existed. The relative stability, both experimentally and theoretically, of the forced-choice as against the three-response category scheme in discrimination studies (see Chapter 4, Sec. 5.2) must not be interpreted as a blanket recommendation for forced-choice designs in other types of experiments.

These data raise again the question of a threshold, for, when symmetric payoffs are used, the "no stimulus" response contains no residual information about the identity of the presented stimulus. This is certainly consistent with the notion of a detection threshold. It does not, however, prove that one exists. Another interpretation is that all information about the presentation is lost once the subject decides that no stimulus was presented. It should be possible to decide between these two hypotheses by running a simultaneous detection and recognition experiment with various asymmetric payoff matrices. If there is a threshold, the recognition of detected stimuli will be degraded in a predictable fashion as the detection frequency is increased by changing the payoffs; whereas, if we are witnessing a decision phenomenon, the recognition of detected stimuli will be independent of the frequency of detection responses. This study has not been performed.

Having made clear that the two types of recognition experiments must be treated separately, I devote the remainder of this section to pure recognition studies.
7.2 Information Theory Analysis

Aside from traditional statistics of contingency tables, the main mathematical tool that has come to be used to study pure recognition with more than two or three stimulus presentations is information theory. It is impossible to devote the space needed for a complete review of Shannon's theory (Shannon & Weaver, 1949) or even of its varied uses in psychophysics, but fortunately several suitable summaries with extensive bibliographies already exist (Attneave, 1959; Luce, 1960; Miller, 1953, 1956).

Except for the words used, the description of a communication system assumed by the information theorists is identical to our complete identification design. They interpret S as the set of elementary signals that can be transmitted by the system, R as the set of signals that can be received, and i as a given one-to-one correspondence between them. Thus S might be the ordinary alphabet, R the sequences of dots and dashes used in the Morse code, and i the code relating them. A probability distribution p over S × R is assumed to exist; p(s, r) is interpreted as the joint probability that signal s is transmitted and r is received. If we define

$$P(s) = \sum_{r \in R} p(s, r), \qquad p(r) = \sum_{s \in S} p(s, r),$$

and

$$p(r \mid s) = \frac{p(s, r)}{P(s)},$$

then P(s) is the probability that signal s is transmitted, p(r) the unconditional probability that r is received, and p(r | s) the conditional probability that r is received when s is transmitted. In a complete identification experiment, P(s) is the probability that s is presented, p(r) the unconditional probability of response r, and p(r | s) the conditional probability of response r given stimulus s. In the communication terminology the matrix of conditional probabilities p(r | s) is called a noise matrix, for by definition that which prevents communication from being perfect is noise; in a complete identification experiment it is called a confusion matrix.
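The bookkeeping that links the joint distribution to the presentation probabilities, the response probabilities, and the confusion matrix can be illustrated with a small sketch. The function name and the 2 × 2 joint distribution are invented for illustration and do not come from any data set discussed here.

```python
def marginals_and_confusion(joint):
    """Split a joint distribution p(s, r) into P(s), p(r), and p(r | s).

    `joint[i][j]` is the probability that stimulus i is presented and
    response j is made; every row is assumed to have positive total mass.
    """
    p_s = [sum(row) for row in joint]                       # P(s): row sums
    n_responses = len(joint[0])
    p_r = [sum(row[j] for row in joint) for j in range(n_responses)]  # p(r)
    confusion = [[cell / total for cell in row]             # p(r | s)
                 for row, total in zip(joint, p_s)]
    return p_s, p_r, confusion


# Hypothetical joint distribution for two stimuli and two responses.
joint = [[0.40, 0.10],
         [0.05, 0.45]]
p_s, p_r, cond = marginals_and_confusion(joint)
```

Each row of the resulting confusion matrix sums to 1, so it describes how the responses to a given stimulus are spread over the alternatives regardless of how often that stimulus is presented.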
Information theorists undertook to state by means of a single summary number the average information-transmitting characteristics of such a system. It was to be a measure that would satisfy certain a priori criteria and permit one to capture in precise theorems certain known empirical results concerning channel capacity, information transmission, and error correction. The major a priori requirement imposed by Shannon was this. Suppose that several signals are, in the statistical sense, independently selected and transmitted; then the average amount of information created by their joint selection shall be the sum of the average amounts of information created by their separate selections, that is, average information is postulated to be additive when the selections are independent. He showed that this coupled with other much weaker conditions implies that the measure must be of the form

$$H(S) = -\sum_{s \in S} P(s) \log P(s).$$

Usually, the base of the logarithm is chosen to be 2, thereby setting the unit of measure. Following a suggestion by J. Tukey, this unit is called a bit. A choice between two equally likely alternatives creates one bit; among 4, two bits; among 8, three bits; etc.

Two features of this measure should be noted. First, it is nonnegative and has the value 0 when and only when one of the probabilities is 1 (and so all the rest are 0). That is to say, no information is generated by the selection of an alternative that is certain to be selected; this agrees, for example, with the view that little or no information is transmitted by the conventional replies to conventional greetings. Second, the measure has its maximum value when all of the probabilities are equal; if there are k alternatives, the maximum is $\log_2 k$.

In like manner, we have as the information measure of the responses

$$H(R) = -\sum_{r \in R} p(r) \log_2 p(r)$$

and, as the conditional measure of the response given the stimulus,

$$H(R \mid S) = -\sum_{s \in S} \sum_{r \in R} p(s, r) \log_2 p(r \mid s).$$

The quantity

$$H(R) - H(R \mid S)$$

is called the information transmitted from the stimulus to the response. It not only plays a significant role in information theory itself, but it has proved to be a useful measure in psychology. It is not difficult to show that [...]

[...] Thus the confusion matrix of scale values is of the form [matrix not preserved], where the bias parameters are omitted.

The first thing to note is that the model has the often observed U-shape when the probabilities $p(r \mid s_r)$ are plotted against r. In Table 8 the predicted probabilities for the end and middle stimuli are presented for several small k's and for several plausible values of $\eta$. The dip is evident. Of course, the bias parameters affect the exact form of this U-shaped function.

Table 8. Comparison of the Theoretical Probability of Correct Identification for End and Middle Stimuli. [Table entries not preserved.]

Next we look into the question of transmitted information. By what we have assumed for our stimuli, we know that the matrix of scale values for the k = 2 recognition design is

$$\begin{array}{c|cc} & s_i & s_{i+1} \\ \hline s_i & 1 & \eta \\ s_{i+1} & \eta & 1 \end{array}$$

again omitting biases. Assuming the independence condition, Assumption 4 on p. 114, the parallel forced-choice design has the matrix of scale values

$$\begin{array}{c|cc} & 1 & 2 \\ \hline (s_{i+1}, s_i) & 1 & \eta^{\sqrt{2}} \\ (s_i, s_{i+1}) & \eta^{\sqrt{2}} & 1 \end{array}$$

In the light of our discussion of the quantal model, it is not clear whether the independence assumption is justified, but in order to continue the discussion we accept it. In discrimination work (see Chapter 4, Sec. 1.2) two stimuli are said to be one jnd (one just noticeable difference) apart if $p(1 \mid (s_{i+1}, s_i)) = 3/4$, in which case the forced-choice model yields

$$\eta^{\sqrt{2}} = \tfrac{1}{3}.$$

For stimuli that are m jnds apart, in the sense that m - 1 stimuli can be found between them such that successive ones are one jnd apart, Eq. 46 implies

$$\eta = \left(\tfrac{1}{3}\right)^{m/\sqrt{2}}. \tag{49}$$

For our calculations, let us consider k stimuli so spaced that successive ones are m jnds apart; thus the total range of stimuli is (k - 1)m jnds. Specifically, let us fix the range at $2^6 = 64$ jnds and let k = 5, 9, 17, and 33 stimuli, which means that successive ones are separated by m = 16, 8, 4, and 2 jnds, respectively. For each k the confusion matrix of probabilities can be determined from the scale values given in Eq. 48, using Eq. 49 to determine $\eta$. Assuming that the stimuli are equally likely, the information transmitted is calculated using the formulas in Sec. 7.2. The results are shown in the last column of Table 9. Up to something just over three bits presented, the information transmitted is nearly equal to the stimulus information. Increasing the stimulus information further, the transmitted information increases less rapidly, reaching a maximum of about 3.6 bits. Not only does this correspond qualitatively to the data, but it is in about the right range of values. The data, however, appear to have arisen from a somewhat broader range of stimulus values and to have resulted in somewhat smaller maxima, which suggests that we are using too small a value for $\eta$. It is obvious that we could select a value of $\eta$ that would yield quantitatively the same summary results as the data; it is a much more subtle question whether this model can reproduce the whole confusion matrix in detail, and that has not been thoroughly investigated.

Table 9. Choice Theory Predictions of Information Transmitted versus Information Presented. Columns: Number of Equally Likely Alternatives; Bits Presented; Bits Transmitted for a Stimulus Range of 16 jnds; Bits Transmitted for a Stimulus Range of 64 jnds. [Table entries not preserved.]

A second major information result is that the increase in information transmitted is relatively slight as the range is increased (Pollack, 1952). Two ranges, differing by a factor of four, are shown in columns 3 and 4 of Table 9.
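The calculation behind Table 9 can be sketched as follows, under assumptions that should be read as such. Because the general matrix of scale values is not reproduced here, the sketch assumes that the scale value relating stimuli i and j is η raised to the power |i - j|, with biases omitted, and it takes η = (1/3)^(m/√2) for stimuli spaced m jnds apart, which corresponds to a jnd criterion of 3/4 in the forced-choice design. All function names are hypothetical.

```python
import math


def eta_for_spacing(m):
    """Assumed similarity parameter for neighbours m jnds apart."""
    return (1.0 / 3.0) ** (m / math.sqrt(2.0))


def confusion_matrix(k, m):
    """Choice-model confusion matrix p(r | s) for k equally spaced stimuli.

    Assumes scale values eta ** |i - j| and no response bias.
    """
    eta = eta_for_spacing(m)
    rows = []
    for i in range(k):
        scale = [eta ** abs(i - j) for j in range(k)]
        total = sum(scale)
        rows.append([v / total for v in scale])
    return rows


def entropy_bits(probs):
    """Shannon measure in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def transmitted_bits(cond):
    """H(R) - H(R | S) for equally likely stimuli and confusion matrix cond."""
    k = len(cond)
    p_s = 1.0 / k
    p_r = [sum(p_s * cond[i][j] for i in range(k)) for j in range(k)]
    h_r_given_s = sum(p_s * entropy_bits(row) for row in cond)
    return entropy_bits(p_r) - h_r_given_s


def predictions(range_jnds, ks=(5, 9, 17, 33)):
    """Bits transmitted for each k when the k stimuli span range_jnds."""
    return {k: transmitted_bits(confusion_matrix(k, range_jnds / (k - 1)))
            for k in ks}


t64 = predictions(64)  # range used for the last column of Table 9
t16 = predictions(16)  # range smaller by a factor of four
```

Under these assumptions the 64-jnd predictions track the presented information for the smaller values of k and then level off at roughly 3.5 to 3.6 bits, whereas the 16-jnd predictions stay below about two bits. The dependence of the ceiling on range is therefore built into this form of the model.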
A considerable difference in the maximum exists, strongly suggesting that the present model is inadequate to explain these results.

A third information theory result concerns the less than additive increase of the transmitted information as the number of dimensions per stimulus is increased. We consider the simplest possible case of two perfectly detectable stimuli differing on two equally recognizable dimensions. Suppose s and s' are the stimulus values on one dimension and t and t' on the other and that their confusion matrices of scale values are [matrices not preserved]. For the choice model with Assumption 4, the scale values for the composite stimuli are [...]

[...] We apply this estimation scheme to McGuire's data on size recognition (reported in Shepard, 1958b), although they are not ideal because ten subjects are averaged together. Each subject responded to 80 presentations of each of nine circular areas. A $\chi^2$ comparison of the predicted and observed proportions yields a value of 96.3, which with 8 × 9 - 16 = 56 degrees of freedom is highly significant. This is none too surprising because the estimation scheme completely ignores the small entries in the table, which, of course, contribute heavily to $\chi^2$. That the estimates are probably nonoptimal is indicated, for example, by the fact that about half the total contribution to $\chi^2$ comes from the first column. In the light of the failings of the estimation procedure, a more reasonable test of the model is to lump together all entries to the left of cell i - 1 in row i and all of those to the right of the i + 1 entry. This reduces $\chi^2$ to 15.6 and the degrees of freedom to 14, yielding 0.2 < p < 0.5. The observed versus predicted proportions for this collapsing are shown in Fig. 21. These results suggest that a better estimation method might very well result in a nonsignificant overall $\chi^2$.

8. SIMULTANEOUS DETECTION AND RECOGNITION

Even restricting our attention to the simplest simultaneous detection and recognition designs, namely those in which the set of presentations is {s, s', n}, with n denoting noise alone, we find that relatively little work has been reported. Because the ideas are adaptations of those we have already discussed, it will suffice simply to outline them.

8.1 Signal Detectability Analysis

Following the general structure of the signal detectability model, there is a decision axis relating stimulus s to stimulus s', another relating s to noise, and a third relating s' to noise. Tanner (1956) assumed that they can be represented in the plane, as in Fig. 22. The three intersections are supposed to occur at the means of the distributions projected on the several axes. The noise and each of the stimulus-plus-noise distributions are assumed to be independent and normal, all with equal variance. The two detection axes are separated by some angle $\theta$, not necessarily 90°. Tanner (1956) used this structure to analyze the pure recognition experiment with presentations {s, s'}. I did not present this in the last section because no testable conclusions seem to derive from it. Swets and Birdsall (1956) discussed the simultaneous detection and recognition experiment, proposing the decision rule shown in Fig. 23.

Fig. 23. A decision rule proposed by Swets and Birdsall (1956) for the two-stimulus simultaneous detection and recognition experiment.
The three straight lines meet at a point in the triangle formed by connecting the means of the three distributions; each line is orthogonal to a side of the triangle.

8.2 Choice Analysis

The choice model analysis follows immediately from Eq. 5 and Assumptions 1 to 3: [matrix of scale values not preserved], where $\eta = \eta(s, n)$, $\eta' = \eta(s', n)$, and $\lambda = \eta(s, s')$.

The adequacy of the choice model for this simultaneous identification design can be tested, using Shipley's (1961) data. The parameters $\eta$ and $\eta'$ are estimated as the mean of those obtained from the simple Yes-No and two-alternative forced-choice experiments (Table 2). The remaining [...]

9. DETECTION OF AN UNKNOWN STIMULUS

[...] absence of any compelling data or arguments, one hopes that the phenomenon is of a more general response character and that it requires fewer specific assumptions than those made in the filter theories.

Shipley (1960) suggested such an idea, one that is applicable to any response model. She supposed that the subject is covertly recognizing the stimuli as well as overtly detecting them, even though he makes no recognition response. If so, and if we assume the choice model, the matrix of scale values is simply Eq. 50 with the two detection responses combined. To show that this predicts results qualitatively similar to those observed, consider the special case in which the stimuli are equally detectable and there is no recognition bias, that is, $\eta = \eta'$ and the recognition bias parameter equals 1. Then we can combine the two stimulus presentations into one because the two rows are the same: [matrix not preserved]. This is equivalent, in the sense of generating the same probabilities, to a matrix of scale values of the form [matrix not preserved], where [...]. This is the form of the simple Yes-No matrix discussed in Sec. 2.2, and so $\xi$ is an apparent detection parameter and the other quantity is an apparent bias parameter.
The probability of detecting an unknown stimulus is smaller than that of a known stimulus if and only if $\xi > \eta$, and that in turn holds if and only if $\lambda < 1$, as it must be for distinct stimuli. Moreover, as the stimuli become more recognizable, that is, as $\lambda$ gets smaller, $\xi$ gets larger and so detectability becomes poorer, as has been observed.

A parallel development exists for the forced-choice design. Omitting the biases, the simultaneous detection-recognition matrix of scale values is seen to be [matrix not preserved], where d is given by

$$(\log d)^2 = (\log \eta)^2 + (\log \eta')^2$$

and the other symbols have their previous meanings. To see what happens qualitatively, again suppose $\eta = \eta'$ and that the biases all equal 1; then we can collapse on both rows and columns: [matrix not preserved]. The standard form for the unbiased forced-choice matrix of scale values is [matrix not preserved], so the effective stimulus parameter is [...]. As for the Yes-No design, there is an increasing reduction in detectability as the stimuli are made more identifiable, that is, as $\lambda$ is made smaller. We observe that the apparent loss in detectability is greater in the forced-choice than in the Yes-No design because [...]

To see whether Shipley's idea has any possibility of being correct, I turn again to her (1961) data. Both the simultaneous detection-recognition and the detection of an unknown stimulus conditions were run, so we can collapse the first data matrix on the recognition responses and compare it with the second. This is done in Table 11 for the Yes-No experiment and in Table 12 for the forced-choice experiment. Although there are some differences, which may very well be due to different response biases, they seem in sufficient accord to warrant more study of the idea.
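The Yes-No collapse that underlies Shipley's suggestion can be checked numerically. The sketch below is a reconstruction under stated assumptions, not the original derivation. It assumes that the simultaneous detection-recognition scale values, for equally detectable stimuli with no recognition bias, are (1, λ, ηb) for presentation s, (λ, 1, ηb) for s', and (η, η, b) for noise, with columns "Yes, s", "Yes, s'", and "No", and that the simple Yes-No model gives p(Y | s) = 1/(1 + ξβ) and p(Y | n) = ξ/(ξ + β). The label β for the apparent bias and the closed forms for ξ and β are derived here from those assumptions.

```python
import math


def collapsed_yes_no(eta, lam, b):
    """P(Yes) after combining the two detection responses (assumed matrix)."""
    rows = {
        "s":  [1.0, lam, eta * b],
        "s'": [lam, 1.0, eta * b],
        "n":  [eta, eta, b],
    }
    # Yes = either detection response; normalise each row of scale values.
    return {name: (v[0] + v[1]) / sum(v) for name, v in rows.items()}


def apparent_parameters(eta, lam, b):
    """Apparent detection (xi) and bias (beta) parameters, as derived here."""
    xi = eta * math.sqrt(2.0 / (1.0 + lam))
    beta = b / math.sqrt(2.0 * (1.0 + lam))
    return xi, beta


def simple_yes_no(xi, beta):
    """Response probabilities of the assumed simple Yes-No choice model."""
    return {"s": 1.0 / (1.0 + xi * beta), "n": xi / (xi + beta)}
```

With these forms ξ exceeds η exactly when λ < 1 and grows as λ shrinks, which matches the qualitative statements in the text; whether the closed forms coincide with the expressions originally printed cannot be confirmed from what survives here.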
Table 11. Per Cent Yes Responses in the Yes-No Design When the Signal Is Unknown. Rows: stimulus presentations. Columns: Observed and Calculated for each of Subjects 1, 2, and 3. The calculated columns are obtained by collapsing the recognition responses in the corresponding detection and recognition experiment (Shipley, 1961). See Table 2 for a description of the experimental conditions. [Table entries not preserved.]

Table 12. Per Cent Correct Responses in Forced-Choice Design When Stimulus Is Known and When It Is Unknown. Rows: stimulus presentations. Columns: Stimulus Known, Stimulus Unknown, and Calculated for each of Subjects 1, 2, and 3. The calculated columns are obtained by collapsing the recognition responses for the four-response detection and recognition experiment (Shipley, 1961). See Table 2 for a description of the experimental conditions. [Table entries not preserved.]

10. CONCLUSIONS

Although detection and recognition experiments have long been performed (not always under those names), interest in them in theoretical [...]

REFERENCES

[...]
Edwards, W. The theory of decision making. Psychol. Bull., 1954, 51, 380-417.
Edwards, W. Behavioral decision theory. In Annual Reviews of Psychology, Vol. 12. Palo Alto, Calif.: Annual Reviews, Inc., 1961. Pp. 473-498.
Egan, J. P., Schulman, A. I., & Greenberg, G. Z. Operating characteristics determined by binary decisions and by ratings. J. acoust. Soc. Amer., 1959, 31, 768-773.
Elliott, P. B. Tables of d'. T.R. 97, Electronics Defense Group, Univer. of Mich., Ann Arbor, 1959.
Garner, W. R., & McGill, W. J. Relation between uncertainty, variance, and correlation analysis. Psychometrika, 1956, 21, 219-228.
Green, D. M. Detection of multiple component signals in noise. J. acoust. Soc. Amer., 1958, 30, 904-911.
Green, D. M. Psychoacoustics and detection theory. J. acoust. Soc. Amer., 1960, 32, 1189-1203.
Green, D. M., Birdsall, T. G., & Tanner, W. P., Jr. Signal detection as a function of signal intensity and duration. J. acoust. Soc. Amer., 1957, 29, 523-531.
Halsey, R. M., & Chapanis, A. On the number of absolutely identifiable spectral hues. J. opt. Soc. Amer., 1954, 41, 1057-1058.
Klemmer, E. T., & Frick, F. C. Assimilation of information from dot and matrix patterns. J. exp. Psychol., 1953, 45, 15-19.
Licklider, J. C. R. Three auditory theories. In S. Koch (Ed.), Psychology: a study of a science, Vol. 1. New York: McGraw-Hill, 1959. Pp. 41-144.
Luce, R. D. Individual choice behavior. New York: Wiley, 1959.
Luce, R. D. The theory of selective information and some of its behavioral applications. In R. D. Luce (Ed.), Developments in mathematical psychology. Glencoe, Ill.: The Free Press, 1960.
Luce, R. D. A threshold theory for simple detection experiments. Psychol. Rev., in press, 1963.
McGill, W. J. Multivariate information transmission. Psychometrika, 1954, 19, 97-116.
McGill, W. J. Isomorphism in statistical analysis. In H. Quastler (Ed.), Information theory in psychology. Glencoe, Ill.: The Free Press, 1955. Pp. 56-62. (a)
McGill, W. J. The relation between uncertainty and variance. Proc. 1954 Conf. Test Probl. Educ. Test. Serv., 1955, 37-42. (b)
Miller, G. A. What is information measurement? Amer. Psychol., 1953, 8, 3-11.
Miller, G. A. The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol. Rev., 1956, 63, 81-97.
Neisser, U. R. G. A methodological study of the quantal hypothesis in auditory psychophysics. Unpublished doctoral dissertation, Harvard University, 1955.
Peterson, W. W., Birdsall, T. G., & Fox, W. C. The theory of signal detectability. IRE Trans. Professional Group on Information Theory, 1954, 4, 171-212.
Pollack, I. Information of elementary auditory displays. J. acoust. Soc. Amer., 1952, 24, 745-750.
Pollack, I. The information of elementary auditory displays. II. J. acoust. Soc. Amer., 1953, 25, 765-769.
Pollack, I., & Ficks, L. Information of elementary multidimensional auditory displays. J. acoust. Soc. Amer., 1954, 26, 155-158.
Restle, F. Psychology of judgment and choice. New York: Wiley, 1961.
Senders, Virginia L. Further analysis of response sequences in the setting of a psychophysical experiment. Amer. J. Psychol., 1953, 66, 215-228.
Senders, Virginia L., & Sowards, A. Analysis of response sequences in the setting of a psychophysical experiment. Amer. J. Psychol., 1952, 65, 358-374.
Shannon, C. E., & Weaver, W. The mathematical theory of communication. Urbana: Univer. of Illinois Press, 1949.
Shepard, R. N. Stimulus and response generalization: a stochastic model relating generalization to distance in a psychological space. Psychometrika, 1957, 22, 325-345.
Shepard, R. N. Stimulus and response generalization: deduction of the generalization gradient from a trace model. Psychol. Rev., 1958, 65, 242-256. (a)
Shepard, R. N. Stimulus and response generalization: tests of a model relating generalization to distance in psychological space. J. exp. Psychol., 1958, 55, 509-523. (b)
Shipley, Elizabeth F. A model for detection and recognition with signal uncertainty. Psychometrika, 1960, 25, 273-289.
Shipley, Elizabeth F. Detection and recognition with uncertainty. Unpublished doctoral dissertation, Univer. of Pennsylvania, 1961.
Smith, M., & Wilson, Edna A. A model of the auditory threshold and its application to the problem of the multiple observer. Psychol. Monogr., 1953, 67 (9, Whole No. 359).
Speeth, S. D., & Mathews, M. V. Sequential effects in the signal detection situation. J. acoust. Soc. Amer., 1961, 33, 1046-1054.
Stevens, S. S., Morgan, C. T., & Volkmann, J. Theory of the neural quantum in the discrimination of loudness and pitch. Amer. J. Psychol., 1941, 54, 315-335.
Suppes, P. A linear model for a continuum of responses. In R. R. Bush & W. K. Estes (Eds.), Studies in mathematical learning theory. Stanford: Stanford Univer. Press, 1959. Pp. 400-414.
Suppes, P. Stimulus-sampling theory for a continuum of responses. In K. J. Arrow, S. Karlin, & P. Suppes (Eds.), Mathematical methods in the social sciences, 1959. Stanford: Stanford Univer. Press, 1960. Pp. 348-365.
Swets, J. A. Indices of signal detectability obtained with various psychophysical procedures. J. acoust. Soc. Amer., 1959, 31, 511-513.
Swets, J. A. Detection theory and psychophysics: a review. Psychometrika, 1961, 26, 49-63. (a)
Swets, J. A. Is there a sensory threshold? Science, 1961, 134, 168-177. (b)
Swets, J. A., & Birdsall, T. G. The human use of information. III. Decision making in signal detection and recognition situations involving multiple alternatives. IRE Trans. Professional Group on Information Theory, 1956, 2, 138-165.
Swets, J. A., Shipley, Elizabeth F., McKey, Molly J., & Green, D. M. Multiple observations of signals in noise. J. acoust. Soc. Amer., 1959, 31, 514-521.
Swets, J. A., Tanner, W. P., Jr., & Birdsall, T. G. The evidence for a decision-making theory of visual detection. T.R. 40, Electronic Defense Group, Univer. of Mich., Ann Arbor, 1955.
Swets, J. A., Tanner, W. P., Jr., & Birdsall, T. G. Decision processes in perception. Psychol. Rev., 1961, 68, 301-340.
Tanner, W. P., Jr. On the design of psychophysical experiments. In H. Quastler (Ed.), Information theory in psychology. Glencoe, Ill.: The Free Press, 1955. Pp. 403-414.
Tanner, W. P., Jr. Theory of recognition. J. acoust. Soc. Amer., 1956, 28, 882-888.
Tanner, W. P., Jr. Theory of signal detectability as an interpretive tool for psychophysical data. J. acoust. Soc. Amer., 1960, 32, 1140-1147.
Tanner, W. P., Jr. Physiological implications of psychophysical data. Ann. N. Y. Acad. Sci., 1961, 89, 752-765.
Tanner, W. P., Jr., & Birdsall, T. G. Definitions of d' and η as psychophysical measures. J. acoust. Soc. Amer., 1958, 30, 922-928.
Tanner, W. P., Jr., Birdsall, T. G., & Clarke, F. R. The concept of the ideal observer in psychophysics. T.R. 98, Electronic Defense Group, Univer. of Mich., Ann Arbor, 1960.
Tanner, W. P., Jr., & Norman, R. Z. The human use of information. II. Signal detection for the case of an unknown signal parameter. IRE Trans. Professional Group on Information Theory, 1954, 4, 222-226.
Tanner, W. P., Jr., & Swets, J. A. A decision making theory of visual detection. Psychol. Rev., 1954, 61, 401-409. (a)
Tanner, W. P., Jr., & Swets, J. A. The human use of information. I. Signal detection for the case of the signal known exactly. IRE Trans. Professional Group on Information Theory, 1954, 4, 213-221. (b)
Tanner, W. P., Jr., Swets, J. A., & Green, D. M. Some general properties of the hearing mechanism. T.R. 30, Electronic Defense Group, Univer. of Mich., Ann Arbor, 1956.
Van Meter, D., & Middleton, D. Modern statistical approaches to reception in communication theory. IRE Trans. Professional Group on Information Theory, 1954, 4, 119-145.
Veniar, Florence A. Effect of auditory cue on discrimination of auditory stimuli. J. acoust. Soc. Amer., 1958, 30, 1079-1081. (a)
Veniar, Florence A. Signal detection as a function of frequency ensemble. I. J. acoust. Soc. Amer., 1958, 30, 1020-1024. (b)
Veniar, Florence A. Signal detection as a function of frequency ensemble. II. J. acoust. Soc. Amer., 1958, 30, 1075-1078. (c)
Wald, A. Statistical decision functions. New York: Wiley, 1950.