The Polygraph and Lie Detection (2003)

Chapter: 7 Uses of Polygraph Tests

Suggested Citation:"7 Uses of Polygraph Tests." National Research Council. 2003. The Polygraph and Lie Detection. Washington, DC: The National Academies Press. doi: 10.17226/10420.

7
Uses of Polygraph Tests

The available evidence indicates that in the context of specific-incident investigation and with inexperienced examinees untrained in countermeasures, polygraph tests as currently used have value in distinguishing truthful from deceptive individuals. However, they are far from perfect in that context, and important unanswered questions remain about polygraph accuracy in other important contexts. No alternative techniques are available that perform better, though some show promise for the long term. The limited evidence on screening polygraphs suggests that their accuracy in field use is likely to be somewhat lower than that of specific-incident polygraphs.

This chapter discusses the policy issues involved in using an imperfect diagnostic test such as the polygraph in real-life decision making, particularly in national security screening, which presents very difficult tradeoffs between falsely judging innocent employees deceptive and leaving major security threats undetected. We synthesize what science can offer to inform the policy decisions, but emphasize that the choices ultimately must depend on a series of value judgments incorporating a weighting of potential benefits (chiefly, deterring and detecting potential spies, saboteurs, terrorists, or other major security threats) against potential costs (such as those of falsely accusing innocent individuals and losing potentially valuable individuals from the security-related workforce). Cost-benefit tradeoffs like this vary with the situation. For example, the benefits are greater when the security threat being investigated is more serious; the costs are greater when the innocent individuals who might be accused are themselves vital to national security. For this reason, tradeoff decisions are best made by elected officials or their designees, aided by the principles and practices of behavioral decision making.

We first summarize what scientific analysis can contribute to understanding the tradeoffs involved in using polygraph tests in security screening. (These tests almost always use the comparison question or relevant-irrelevant formats because concealed information tests can only be used when there are specific pieces of information that can form the basis for relevant questions.) We then discuss possible strategies for making the tradeoffs more attractive by improving the accuracy of lie detection, either by making polygraph tests more accurate or by combining them with other sources of information. We also briefly consider the legal context of policy choices about the use of polygraph tests in security screening.

TRADEOFFS IN INTERPRETATION

The primary purpose of the polygraph test in security screening is to identify individuals who present serious threats to national security. To put this in the language of diagnostic testing, the goal is to reduce to a minimum the number of false negative cases (serious security risks who pass the diagnostic screen). False positive results are also a major concern, both to innocent individuals, who may lose the opportunity for gainful employment in their chosen professions and the chance to help their country, and to the nation, which may lose valuable employees who have much to contribute to improved national security or suffer lowered productivity in its national security organizations. The mere prospect of false positive results can have the same effect on the nation if employees resign or prospective employees do not seek employment because of polygraph screening.

As Chapter 2 shows, polygraph tests, like any imperfect diagnostic tests, yield both false positive and false negative results. The individuals judged positive (deceptive) always include both true positives and false positives, who are not distinguishable from each other by the test alone. Any test protocol that produces a large number of false positives for each true positive, an outcome that is highly likely for polygraph testing in employee security screening contexts, creates problems that must be addressed. Decision makers who use such a test protocol might have to decide to stall or sacrifice the careers of a large number of loyal and valuable employees (and their contributions to national security) in an effort to increase the chance of catching a potential security threat, or to apply expensive and time-consuming investigative resources to the task of identifying the few true threats from among a large pool of individuals who had positive results on the screening test.

Quantifying Tradeoffs

Scientific analysis can help policy makers in such choices by making the tradeoffs clearer. Three factors affect the frequency of false negatives and false positives with any diagnostic test procedure: its accuracy (criterion validity), the threshold used for declaring a test result positive, and the base rate of the condition being diagnosed (here, deception about serious security matters). If a diagnostic procedure can be made more accurate, the result is to reduce both false negatives and false positives. With a procedure of any given level of accuracy, however, the only way to reduce the frequency of one kind of error is by adjusting the decision threshold—but doing this always increases the frequency of the other kind of error. Thus, it is possible to increase the proportion of guilty individuals caught by a polygraph test (i.e., to reduce the frequency of false negatives), but only by increasing the proportion of innocent individuals whom the test cannot distinguish from guilty ones (i.e., frequency of false positives). Decisions about how, when, and whether to use the polygraph for screening should consider what is known about these tradeoffs so that the tradeoffs actually made reflect deliberate policy choices.
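The threshold tradeoff described above can be illustrated numerically. The following is a minimal sketch, not the committee's own calculation. It assumes an equal-variance binormal score model (scores of truthful examinees distributed as N(0, 1), scores of deceptive examinees as N(d, 1)) and treats the accuracy value as the area under the ROC curve. The function name `error_rates` and the cutoff values are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def error_rates(accuracy: float, threshold: float) -> tuple[float, float]:
    """Return (false_negative_rate, false_positive_rate) for one cutoff.

    Assumption of this sketch: truthful scores ~ N(0, 1), deceptive
    scores ~ N(d, 1), with d chosen so that the area under the ROC
    curve equals `accuracy`. A score above `threshold` counts as positive.
    """
    d = sqrt(2) * STD_NORMAL.inv_cdf(accuracy)
    false_positive_rate = 1 - STD_NORMAL.cdf(threshold)   # truthful, but above cutoff
    false_negative_rate = STD_NORMAL.cdf(threshold - d)   # deceptive, but below cutoff
    return false_negative_rate, false_positive_rate


if __name__ == "__main__":
    # Hypothetical cutoffs: raising the cutoff lowers false positives
    # and raises false negatives, at a fixed level of accuracy.
    for cutoff in (0.0, 0.5, 1.0, 1.5, 2.0):
        fnr, fpr = error_rates(0.80, cutoff)
        print(f"cutoff={cutoff:.1f}  false negatives={fnr:.3f}  false positives={fpr:.3f}")
```

Under these assumptions, every move of the cutoff that lowers one error rate raises the other; only a higher accuracy value lowers the false negative rate without raising the false positive rate.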

Tradeoffs between false positives and false negatives can be calculated mathematically, using Bayes’ theorem (Weinstein and Fineberg, 1980; Lindley, 1998). One useful way to characterize the tradeoff in security screening is with a single number that we call the false positive index: the number of false positive cases to be expected for each deceptive individual correctly identified by a test. The index depends on the accuracy of the test; the threshold set for declaring a test positive; and the proportion, or base rate, of individuals in the population with the condition being tested (deception, in this case). The specific mathematical relationship of the index to these factors, and hence the exact value for any combination of accuracy (A), threshold, and base rate, depends on the shape of the receiver operating characteristic (ROC) curve at a given level of accuracy, although the character of the relationship is similar across all plausible shapes (Swets, 1986a, 1996:Chapter 3). Hence, for illustrative purposes we assume that the ROC shapes are determined by the simplest common model, the equivariance binormal model. 1 Because this model, while not implausible, was chosen for simplicity and convenience, the numerical results below should not be taken literally. However, their orders of magnitude are unlikely to change for any alternative class of ROC curves that would be credible for real-world polygraph test performance, and the basic trends conveyed are inherent to the mathematics of diagnosis and screening.
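As a rough illustration of how the false positive index depends on accuracy, threshold, and base rate, the sketch below computes it under the equal-variance binormal model. It is an assumption of this sketch that A is the area under the ROC curve and that the threshold is expressed through the sensitivity it produces; the function names are hypothetical and the outputs should be read as orders of magnitude only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def false_positive_rate(accuracy: float, sensitivity: float) -> float:
    """False positive rate implied by an accuracy index and a sensitivity.

    Assumes truthful scores ~ N(0, 1) and deceptive scores ~ N(d, 1),
    where d is set so that the area under the ROC curve is `accuracy`.
    """
    d = sqrt(2) * STD_NORMAL.inv_cdf(accuracy)
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(sensitivity) - d)


def false_positive_index(accuracy: float, sensitivity: float, base_rate: float) -> float:
    """Expected false positives per correctly identified deceptive examinee."""
    fpr = false_positive_rate(accuracy, sensitivity)
    return fpr * (1 - base_rate) / (sensitivity * base_rate)


if __name__ == "__main__":
    for base_rate in (0.5, 0.1, 0.01, 0.001):
        index = false_positive_index(0.90, 0.80, base_rate)
        print(f"base rate {base_rate:<6} false positive index {index:8.2f}")
```

Holding accuracy and sensitivity fixed, the index in this sketch changes only through the factor (1 - base rate) / base rate, which is why the base rate dominates the result.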

Although accuracy, detection threshold, and base rate all affect the false positive index, these determinants are by no means equally important. Calculation of the index for diagnostic tests at various levels of accuracy, using various thresholds, and with a variety of base rates shows clearly that base rate is by far the most important of these factors. Figure 7-1 shows the index as a function of the base rate of positive (e.g., deceptive) cases for three thresholds for a diagnostic test with A = 0.80. It illustrates clearly that the base rate makes more difference than the threshold across the range of thresholds presented. Figure 7-2 shows the index as a function of accuracy with the threshold held constant so that the diagnostic test’s sensitivity (percent of deceptive individuals correctly identified) is 50 percent. It illustrates clearly that base rate makes more difference than the level of accuracy across the range of A values represented.

Figures 7-1 and 7-2 show that the tradeoffs involved in relying on a diagnostic test such as the polygraph, represented by the false positive index values on the vertical axis, are sharply different in situations with high base rates typical of event-specific investigations, when all examinees are identified as likely suspects, and the base rate is usually above 10 percent, than in security screening contexts, when the base rate is normally very low for the most serious infractions. The false positive index is about 1,000 times higher when the base rate is 1 serious security risk in 1,000 than it is when the base rate is 1 in 2, or 50 percent.

FIGURE 7-1 Comparison of the false positive index and base rate for three sensitivity values of a polygraph test protocol with an accuracy index (A) of 0.80.

FIGURE 7-2 Comparison of the false positive index and base rate for four values of the accuracy index (A) for a polygraph test protocol with threshold set to correctly identify 50 percent of deceptive examinees.
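The roughly 1,000-fold difference between a base rate of 1 in 1,000 and a base rate of 1 in 2 follows directly from the definition of the index and does not depend on the ROC model. In the worked equation below, the symbols are introduced here for illustration only: p is the base rate, s the sensitivity, and f the false positive rate at the chosen threshold, with s and f held fixed.

```latex
\[
\mathrm{FPI}(p) \;=\; \frac{f\,(1-p)}{s\,p},
\qquad
\frac{\mathrm{FPI}(1/1000)}{\mathrm{FPI}(1/2)}
\;=\; \frac{(1-0.001)/0.001}{(1-0.5)/0.5}
\;=\; \frac{999}{1}
\;\approx\; 1{,}000 .
\]
```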

The index is also affected, though less dramatically, by the accuracy of the test procedure: see Figure 7-2. (Appendix I presents the results of calculations of false positive indexes for various levels of accuracy, base rates, and thresholds for making a judgment of a positive test result.) With very low base rates, such as 1 in 1,000, the false positive index is quite large even for tests with fairly high accuracy indexes. For example, a test with an accuracy index of 0.90, if used to detect 80 percent of major security risks, would be expected to falsely judge about 200 innocent people as deceptive for each security risk correctly identified. Unfortunately, polygraph performance in field screening situations is highly unlikely to achieve an accuracy index of 0.90; consequently, the ratio of false positives to true positives is likely to be even higher than 200 when this level of sensitivity is used. Even if the test is set to a somewhat lower level of sensitivity, it is reasonable to expect that each spy or terrorist that might be correctly identified as deceptive by a polygraph test of the accuracy actually achieved in the field would be accompanied by at least hundreds of nondeceptive examinees mislabeled as deceptive. The spy or terrorist would be indistinguishable from these false positives by polygraph test results. The possibility that deceptive examinees may use countermeasures makes this tradeoff even less attractive.

It is useful to consider again the tradeoff of false positives versus false negatives in a manner that sets an upper bound on the attractiveness of the tradeoff (see Table 2-1, p. 48). The table shows the expected outcomes of polygraph testing in two hypothetical populations of examinees, assuming that the tests achieve an accuracy index of 0.90, which represents a higher level of accuracy than can be expected of field polygraph testing. One hypothetical population consists of 10,000 criminal suspects, of whom 5,000 are expected to be guilty; the other consists of 10,000 employees in national security organizations, of whom 10 are expected to be spies.

The table illustrates the tremendous difference between these two populations in the tradeoff. In the hypothetical criminal population, the vast majority of those who “fail” the test (between 83 and 98 percent in these examples) are in fact guilty. In the hypothetical security screening population, however, because of the extremely low base rate of spies, the vast majority of those who “fail” the test (between 95 and 99.5 percent in these examples) are in fact innocent of spying. Because polygraph testing is unlikely to achieve the hypothetical accuracy represented here, even these tradeoffs are overly optimistic. Thus, in the screening examples, an even higher proportion than those shown in Table 2-1 would likely be false positives in actual practice. We reiterate that these conclusions apply to any diagnostic procedure that achieves a similar level of accuracy. None of the alternatives to the polygraph has yet been shown to have greater accuracy, so these upper bounds apply to those techniques as well.
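The contrast between the two hypothetical populations can be sketched in a few lines. This is an illustration under stated assumptions, not a reproduction of Table 2-1: it uses the equal-variance binormal model with A taken as the area under the ROC curve, and the two sensitivities tried (80 and 20 percent) are assumptions of this sketch, so the counts it produces need not match the table's entries exactly. The function name `expected_outcomes` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def expected_outcomes(population: int, deceptive: int,
                      accuracy: float, sensitivity: float) -> dict[str, float]:
    """Expected test outcomes for a population with a known number of deceivers."""
    d = sqrt(2) * STD_NORMAL.inv_cdf(accuracy)
    fpr = STD_NORMAL.cdf(STD_NORMAL.inv_cdf(sensitivity) - d)
    true_pos = deceptive * sensitivity
    false_pos = (population - deceptive) * fpr
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "false_negatives": deceptive - true_pos,
        "true_negatives": population - deceptive - false_pos,
        # Share of those who "fail" the test who are in fact innocent.
        "innocent_share_of_failures": false_pos / (true_pos + false_pos),
    }


if __name__ == "__main__":
    scenarios = (("criminal suspects", 5_000), ("security screening", 10))
    for label, n_deceptive in scenarios:
        for sens in (0.80, 0.20):  # assumed thresholds, see lead-in
            result = expected_outcomes(10_000, n_deceptive, 0.90, sens)
            print(label, sens, {k: round(v, 3) for k, v in result.items()})
```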

Tradeoffs with “Suspicious” Thresholds

If the main objective is to screen out major security threats, it might make sense to set a “suspicious” threshold, that is, one that would detect a very large proportion of truly deceptive individuals. Suppose, for instance, the threshold were set to correctly identify 80 percent of truly deceptive individuals. In this example, the false positive index is higher than 100 for any base rate below about 1 in 500, even with A = 0.90. That is, if 20 of 10,000 employees were serious security violators, and polygraph tests of that accuracy were given to all 10,000 with a threshold set to correctly identify 16 of the 20 deceptive employees, the tests would also be expected to identify about 1,600 of the 9,980 good security risks as deceptive. 2

Another way to think about the effects of setting a threshold that correctly detects a very large proportion of deceptive examinees is in terms of the likelihood that an examinee who is judged deceptive on the test is actually deceptive. This probability is the positive predictive value of the test. If the base rate of deceptive individuals in a population of examinees is 1 in 1,000, an individual who is judged deceptive on the test will in fact be nondeceptive more than 199 times out of 200, even if the test has A = 0.90, which is highly unlikely for the polygraph (the actual numbers of true and false positives in our hypothetical population are shown in the right half of part a of Table 2-1). Thus, a result that is taken as indicating deception on such a test does so only with a very small probability.

These numbers contrast sharply with their analogs in a criminal investigation setting, in which people are normally given a polygraph test only if they are suspects. Suppose that in a criminal investigation the polygraph is used on suspects who, on other grounds, are estimated to have a 50 percent chance of being guilty. For a test with A = 0.80 and a sensitivity of 50 percent, the false positive index is 0.23 and the positive predictive value is 81 percent. That means that someone identified by this polygraph protocol as deceptive has an 81 percent chance of being so, instead of the 0.4 percent (1 in 250) chance of being so if the same test is used for screening a population with a base rate of 1 in 1,000. 3
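The two positive predictive values compared above can be approximated with a short calculation. As before, this is a sketch under assumptions: an equal-variance binormal model with A read as the area under the ROC curve, and a hypothetical function name. It applies Bayes' theorem to the same test (A = 0.80, sensitivity 50 percent) at the two base rates.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def positive_predictive_value(accuracy: float, sensitivity: float,
                              base_rate: float) -> float:
    """Probability that an examinee judged deceptive is actually deceptive."""
    d = sqrt(2) * STD_NORMAL.inv_cdf(accuracy)
    fpr = STD_NORMAL.cdf(STD_NORMAL.inv_cdf(sensitivity) - d)
    true_pos = sensitivity * base_rate      # deceptive and judged deceptive
    false_pos = fpr * (1 - base_rate)       # truthful but judged deceptive
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    for label, base_rate in (("criminal suspects", 0.5), ("screening", 0.001)):
        ppv = positive_predictive_value(0.80, 0.50, base_rate)
        print(f"{label}: positive predictive value = {ppv:.4f}")
```

The positive predictive value and the false positive index carry the same information: in this sketch, PPV equals 1 / (1 + false positive index).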

Thus, a test that may look attractive for identifying deceptive individuals in a population with a base rate above 10 percent looks very much less attractive for screening a population with a very low base rate of deception. It will create a very large pool of suspect individuals, within which the probability of any specific individual being deceptive is less than 1 percent, and even so, it may not catch all the target individuals in the net. To put this another way, if the polygraph identifies 100 people as indicating deception, but only 1 of them is actually deceptive, the odds that any particular one of these identified examinees is attempting to deceive are quite low, and it would take strong and compelling evidence for a decision maker to conclude on the basis of the test that this particular examinee is that 1 in 100 (Murphy, 1987).

Although actual base rates are never known for any type of screening situation, base rates can be given rough bounds. In employee screening settings, the base rate depends on the security violation. It is probably far higher for disclosure of classified information to unauthorized individuals (including “pillow talk”) than it is for espionage, sabotage, or terrorism. For the most serious security threats, the base rate is undoubtedly quite low, even if the number of major threats is 10 times as large as the number of cases reported in the popular press, reflecting both individuals caught but not publicly identified and others not caught. The one major spy caught in the FBI is one among perhaps 100,000 agents who have been employed in the bureau’s history. The base rate of major security threats in the nation’s security agencies is almost certainly far less than 1 percent.

Appendix I presents a set of curves that allow readers to estimate the false positive index and consider the implied tradeoff for a very wide range of hypothesized base rates of deceptive examinees and various possible values of accuracy index for the polygraph testing, using a variety of decision thresholds. It is intended to help readers consider the tradeoffs using the assumptions they judge appropriate for any particular application.

Thus, using the polygraph with a “suspicious” threshold so as to catch most of the major security threats creates a serious false-positive problem in employee security screening applications, mainly because of the very low base rate of guilt among those likely to be screened. When the base rate is one in 1,000 or less, one can expect a polygraph test with a threshold that correctly identifies 80 percent of deceptive examinees to incorrectly classify at least 100 nondeceptive individuals as deceptive for each security threat correctly identified. Any diagnostic procedure that implicates large numbers of innocent employees for each major security violator correctly identified comes with a variety of costs. There is the need to investigate those implicated, the great majority of whom are innocent, as well as the issue of the civil liberties of innocent employees caught by the screen. There is the potential that the screening policy will create anxiety that decreases morale and productivity among the employees who face screening. Employees who are innocent of major security violations may be less productive when they know that they are being tested routinely with an instrument that produces a false positive reading with non-negligible probability and when such a reading can put them under suspicion of disloyalty. Such effects are most serious when the deception detection threshold is set to detect threats with a reasonably high probability (above 0.5), because such a threshold will also identify considerable numbers of false positive outcomes among innocent employees. And there is the possibility that people who might have become valued employees will be deterred from taking positions in security agencies by fear of false positive polygraph results.

To summarize, the performance of the polygraph is sharply different in screening and in event-specific investigation contexts. Anyone who believes the polygraph “works” adequately in a criminal investigation context should not presume without further careful analysis that this justifies its use for security screening. Each application requires separate evaluation on its own terms. To put this another way, if the polygraph or any other technique for detecting deception is more accurate than guesswork, it does not necessarily follow that using it for screening is better than not using it, because a decision to use the polygraph or any other imperfect diagnostic technique must consider its costs as well as its benefits. In the case of polygraph screening, these costs include not only the civil liberties issues that are often debated in the context of false positive test results, but also two types of potential threats to national security. One is the false sense of security that may arise from overreliance on an imperfect screen: this could lead to undue relaxation of other security efforts and thus increase the likelihood that serious security risks who pass the screen can damage national security. The other cost is associated with damage to the national security that may result from the loss of essential personnel falsely judged to be security risks or deterred from employment in U.S. government security agencies by the prospect of false-positive polygraph results.

Tradeoffs with “Friendly” Thresholds

The discussion to this point assumes that policy makers will use a threshold such that the probability of detecting a spy is fairly high. There is, however, another possibility: they may decide to set a “friendly” threshold, that is, one that makes the probability of detecting a spy quite low. To the extent that testing deters security violations, such a test might still have utility for national security purposes. This deterrent effect is likely to be stronger when there is at least a certain amount of ambiguity concerning the setting of the threshold. (If it were widely known that no one “failed” the test, its deterrent effect would be considerably lessened.) It is possible, however, to set a threshold such that almost no one is eventually judged deceptive, even though a fair number undergo additional investigation or testing. There is a clear difference between employment in the absence of security screening tests, a situation lacking in deterrent value against spies, and employment policies that include screening tests, even if screening identifies few if any spies.

Our meetings with various federal agencies that use polygraph screening suggest that different agencies set thresholds differently, although the evidence we have is anecdotal. Several agencies’ polygraph screening programs, including that of the U.S. Department of Energy, appear to adopt fairly “friendly” effective thresholds, judged by the low proportion of polygraph tests that show significant response. The net result is that these screening programs identify a relatively modest number of cases to be investigated further, with few decisions eventually being made that the employee has been deceptive about a major security infraction.

There are reasons of utility, such as possible deterrent effects, that might be put forward to justify an agency’s use of a polygraph screening policy with a friendly threshold, but such a polygraph screening policy will not identify most of the major security violators. For example, the U.S. Department of Defense (2001:4) reported that of 8,784 counterintelligence scope polygraph examinations given, 290 (3 percent) individuals gave “significant responses and/or provided substantive information.” The low rate of positive test results suggests that a friendly threshold is being used, such that the majority of the major security threats who took the test would “pass” the screen. 4
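The link between a low rate of positive results and a low detection rate can be sketched as follows. The sketch makes several assumptions that are not in the report: it uses the equal-variance binormal model with A as the area under the ROC curve, and it treats the overall positive rate (290 of 8,784) as an approximation to the false positive rate, which is reasonable only if nearly all examinees are truthful. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def sensitivity_at_positive_rate(accuracy: float, false_positive_rate: float) -> float:
    """Sensitivity implied by a threshold that yields the given false positive rate."""
    d = sqrt(2) * STD_NORMAL.inv_cdf(accuracy)
    return STD_NORMAL.cdf(d + STD_NORMAL.inv_cdf(false_positive_rate))


if __name__ == "__main__":
    positive_rate = 290 / 8_784  # assumed to approximate the false positive rate
    for accuracy in (0.70, 0.80, 0.90):
        sens = sensitivity_at_positive_rate(accuracy, positive_rate)
        print(f"A = {accuracy:.2f}: implied sensitivity = {sens:.2f}")
```

Under these assumptions the implied sensitivity stays below 50 percent for each accuracy value tried, which is consistent with the statement that most major security threats would pass such a screen.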

On April 4, 2002, the director of the Federal Bureau of Investigation (FBI) was quoted in the New York Times as saying that “less than 1 percent of the 700” FBI personnel who were given polygraph tests in the wake of the Hanssen spy case had test results that could not be resolved and that remain under investigation (Johnston, 2002). Whatever value such a polygraph testing protocol may have for deterrence or eliciting admissions of wrongdoing, it is quite unlikely to uncover an espionage agent who is not deterred and does not confess. A substantial majority of the major security threats who take such a test would “pass” the screen. 5 For example, if Robert Hanssen had taken such tests three times during 15 years of spying, the chances are that, even without attempting countermeasures, he would not have been detected before considerable damage had been done. (He most likely would never have been detected unless the polygraph protocol achieved a criterion validity that we regard as unduly optimistic, such as A = 0.90.) Furthermore, if Hanssen had been detected as polygraph positive (along with a large number of non-spies, that is, false positives), he would not necessarily have been identified as a spy.
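The point about repeated testing can be illustrated with a simple calculation. This is a sketch under strong assumptions that are not taken from the report: an equal-variance binormal model with A as the area under the ROC curve, a threshold that yields a false positive rate of 1 percent (standing in for the "less than 1 percent" figure), no countermeasures, and statistically independent results across the three tests. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def detection_probability(accuracy: float, false_positive_rate: float,
                          n_tests: int) -> float:
    """Chance that a deceptive examinee tests positive at least once in n_tests.

    Assumes independent tests, each with the same threshold and accuracy.
    """
    d = sqrt(2) * STD_NORMAL.inv_cdf(accuracy)
    per_test = STD_NORMAL.cdf(d + STD_NORMAL.inv_cdf(false_positive_rate))
    return 1 - (1 - per_test) ** n_tests


if __name__ == "__main__":
    for accuracy in (0.70, 0.80, 0.90):
        chance = detection_probability(accuracy, 0.01, 3)
        print(f"A = {accuracy:.2f}: chance of at least one positive in 3 tests = {chance:.2f}")
```

In this sketch the chance of at least one positive result in three tests exceeds one half only at the optimistic value A = 0.90, and a positive result would still leave the examinee among a pool of false positives.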

There may be justifications for polygraph screening with a “friendly” threshold on the grounds that the technique may have a deterrent effect or may yield admissions of wrongdoing. However, such a screen will not identify most of the major security threats. In our judgment, the accuracy of polygraph testing in distinguishing actual or potential security violators from innocent test takers is insufficient to justify reliance on its use in employee screening in federal agencies.

Although we believe it likely that polygraph testing has utility in screening contexts because it might have a deterrent effect, we were struck by the lack of scientific evidence concerning the factors that might produce or inhibit deterrence. In order to properly evaluate the costs and benefits associated with polygraph screening, research is needed on deterrence in general and, in particular, on the effects of polygraph screening on deterrence.

Recent Policy Recommendations on Polygraph Screening

We have great concern about the dangers that may arise for national security if federal agencies use the polygraph for security screening with an unclear or incorrect understanding of the implications of threshold-setting choices for the meaning of test results. Consider, for instance, decisions that might be made on the basis of the discussion of polygraph screening in the recent report of a select commission headed by former FBI director William H. Webster (the “Webster Commission”) (Commission for the Review of FBI Security Programs, 2002). This report advocates expanded use of polygraph screening in the FBI, but does not take any explicit position on whether polygraph testing has any scientific validity for detecting deception. This stance is consistent with a view that much of the value of the polygraph comes from its utility for deterrence and for eliciting admissions. The report’s reasoning, although not inconsistent with the scientific evidence, has some implications that are reasonable and others that are quite disturbing from the perspective of the scientific evidence on the polygraph.

The Webster Commission recognizes that the polygraph is an imperfect instrument. Its recommendations for dealing with the imperfections, however, address only some of the serious problems associated with these imperfections. First, it recommends increased efforts at quality control and assurance and increased use of “improved technology and computer driven systems.” These recommendations are sensible, but they do not address the inherent limitations of the polygraph, even when the best quality control and measurement and recording techniques are used. Second, it takes seriously the problem of false positive errors, noting that at one point, the U.S. Central Intelligence Agency (CIA) had “several hundred unresolved polygraph cases” that led to the “practical suspension” of the affected officers, sometimes for years, and “a devastating effect on morale” in the CIA. The Webster Commission clearly wants to avoid a repetition of this situation at the FBI. It recommends that “adverse personnel actions should not be taken solely on the basis of polygraph results,” a position that is absolutely consistent with the scientific evidence that false positives cannot be avoided and that in security screening applications, the great majority of positives will turn out to be false. It also recommends a polygraph test only for “personnel who may pose the greatest risk to national security.” This position is also strongly consistent with the science, though the commission’s claim that such a policy “minimizes the risk of false positives” is not strictly true. Reducing the number of employees who are tested will reduce the total number of false positives, and therefore the cost of investigating false positives, but will not reduce the risk that any individual truthful examinee will be a false positive or that any individual positive result will be false. That risk can only be reduced by finding a more accurate test protocol or by setting a more “friendly” threshold.
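The distinction drawn above, between the total number of false positives and the risk attached to any individual result, can be shown with simple arithmetic. The base rate, sensitivity, and false positive rate below are hypothetical values chosen for illustration, and the sketch assumes they are the same in the smaller tested group as in the larger one.

```python
def screening_summary(n_tested: int, base_rate: float,
                      sensitivity: float, false_positive_rate: float) -> dict[str, float]:
    """Expected false positives and the share of positives that are false."""
    true_pos = n_tested * base_rate * sensitivity
    false_pos = n_tested * (1 - base_rate) * false_positive_rate
    return {
        "expected_false_positives": false_pos,
        "share_of_positives_that_are_false": false_pos / (false_pos + true_pos),
    }


if __name__ == "__main__":
    # Hypothetical inputs: base rate 1 in 1,000, sensitivity 50 percent,
    # false positive rate 10 percent.
    for n_tested in (10_000, 1_000):
        summary = screening_summary(n_tested, 0.001, 0.50, 0.10)
        print(n_tested, {k: round(v, 4) for k, v in summary.items()})
```

Testing one-tenth as many people yields one-tenth as many expected false positives, but the share of positive results that are false is unchanged, because it does not depend on the number tested.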

Because the Webster Commission report does not address the problem of false-negative errors in any explicit way, it leaves open the possibility that federal agency officials may draw the wrong conclusions from negative polygraph test results. On the basis of discussions with polygraph program and counterintelligence officials in several federal agencies (including the FBI), we believe there is a widespread belief in this community that someone who “passes” the polygraph is “cleared” of suspicion. Acting on such a belief with security screening polygraph results could pose a danger to the national security because a negative polygraph result provides little additional information on deceptiveness, beyond the knowledge that very few examinees are major violators, especially when the test protocol produces a very small percentage of positive test results. As already noted, a spy like Robert Hanssen might easily have produced consistently negative results on a series of polygraph tests under a protocol like the one currently being used with FBI employees. Negative polygraph results on individuals or on populations of federal employees should not be taken as justification for relaxing other security precautions.

Another recent policy report raises some similar issues in the context of security in the U.S. Department of Energy (DOE) laboratories. The Commission on Science and Security (2002:62), headed by John H. Hamre (the “Hamre Commission”), issued a recommendation to reduce the use of polygraph testing in the laboratories and to use it “chiefly as an investigative tool” and “sparingly as a screening tool.” It recommended polygraph screening “for individuals with access only to the most highly sensitive classified information” (a much more restricted group than those subjected to polygraph screening under the applicable federal law).

Several justifications are given for reducing the use of polygraph screening, including the “severe morale problems” that polygraph screening has caused, the lack of acceptance of polygraph screening among the DOE laboratory employees, and the lack of “conclusive evidence for the effectiveness of polygraphs as a screening technique” (Commission on Science and Security, 2002:54). The report goes so far as to say that use of polygraphs “as a simplistic screening device . . . will undermine morale and eventually undermine the very goal of good security” (p. 55). Much of this rationale thus concerns the need to reduce the costs of false positives, although the report makes no reference to the extent to which false positives may occur.

The Hamre Commission did not address the false negative problem directly, but its recommendations for reducing security threats can be seen as addressing the problem indirectly. The commission recommended various management and technological changes at the DOE laboratories that would, if effective, make espionage more difficult to conduct and easier to detect in ways that do not rely on the polygraph or other methods of employee screening. Such changes, if effective, would reduce the costs inflicted by undetected spies, and therefore the costs of false negatives from screening, regardless of the techniques used. Given the limitations of the polygraph and other available employee screening techniques, any policies that decrease reliance on employee screening for achieving security objectives should be welcomed.

Although the commission recommended continued polygraph security screening for some DOE employees, it did not offer any explicit rationale for continuing the program, particularly considering the likelihood that the great majority of positive test results will be false. It did not claim that screening polygraphs accurately identify major security threats, and it left open the question of how DOE should use the results of screening polygraphs. We remain concerned about the false negative problem that can be predicted to occur if people who “pass” a screening polygraph test that gives a very low rate of positive results are presumed therefore to be “cleared” of security concerns. Given this concern, the Hamre Commission’s emphasis on improving security by means other than screening makes very good sense.
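The expectation that most positive screening results will be false follows from simple arithmetic on a population with very few violators. The Python sketch below is illustrative only. The population size, the number of violators, and the two pairs of sensitivity and false positive rate are hypothetical values assumed here (one pair for a threshold that flags many examinees, one for a threshold that flags few), and the function name is our own.

```python
def expected_positive_counts(n_examinees, n_violators,
                             sensitivity, false_positive_rate):
    """Expected numbers of true and false positive results in one screening."""
    true_positives = n_violators * sensitivity
    false_positives = (n_examinees - n_violators) * false_positive_rate
    return true_positives, false_positives


# Hypothetical settings, assumed for illustration only:
# (label, P(positive | violator), P(positive | non-violator))
settings = [
    ("many positives", 0.80, 0.16),
    ("few positives", 0.20, 0.004),
]

false_shares = {}
for label, sensitivity, false_positive_rate in settings:
    tp, fp = expected_positive_counts(10_000, 10,
                                      sensitivity, false_positive_rate)
    # Fraction of all positive results that belong to non-violators.
    false_shares[label] = fp / (tp + fp)
    print(f"{label}: {tp:.0f} true positives, {fp:.0f} false positives, "
          f"{false_shares[label]:.1%} of positives are false")
```

With these assumed inputs, more than 95 percent of positive results are false under either threshold. Changing the threshold changes how many examinees are flagged, not the fact that nearly all of those flagged are innocent.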

Both the Webster and Hamre Commission reports make recommendations to reduce the costs associated with false positive test results, although neither takes explicit cognizance of the extent to which such results are likely to occur in security screening. More importantly, neither report explicitly addresses the problem that can arise if negative polygraph screening results are taken too seriously. Overconfidence in the polygraph—belief in its validity that goes beyond what is justified by the evidence—presents a danger to national security objectives because it may lead to overreliance on negative polygraph test results. The limited accuracy of all available techniques of employee security screening underlines the importance of pursuing security objectives in ways that reduce reliance on employee screening to detect security threats.

Making Tradeoffs

Because of the limitations of polygraph accuracy for field screening applications, policy makers face very unpleasant tradeoffs when screening for target transgressions with very low base rates. We have summarized what is known about the likely frequencies of false positive and false negative results under a range of conditions. In making choices about employee security policies, policy makers must combine this admittedly uncertain information about the performance of the polygraph in detecting deception with consideration of a variety of other uncertain factors, including: the magnitude of the security threats being faced; the potential effect of polygraph policies on staff performance, morale, recruitment, and retention; the costs of back-up policies to address the limitations of screening procedures; and effects of different policies on public confidence in security organizations.

In many fields of public policy, such tradeoffs are informed by systematic methods of decision analysis. Appendix J describes what would be involved in applying such techniques to policy decisions about polygraph screening. We were not asked to do a formal policy analysis, and we have not done so. Considering the advantages and disadvantages of quantitative benefit-cost analysis, we do not advocate its use for making policy decisions about polygraph security screening. The scientific basis for estimating many of the important parameters required for such an analysis is quite weak. Moreover, there is no scientific basis for comparing on a single numerical scale some of the kinds of costs and of benefits that must be considered. Reasonable and well-informed people may disagree greatly about many important matters critical for a quantitative benefit-cost analysis (e.g., the relative importance of maintaining morale at the national laboratories compared with a small increased probability of catching a spy or saboteur or the value to be placed on the still-uncertain possibility that polygraph tests may treat different ethnic groups differently). When social consensus appears to be lacking on important value issues, as is the case with polygraph screening, science can help by making explicit the possible outcomes that people may consider important and by estimating the likelihood that these outcomes will be realized under specified conditions. With that information, participants in the decision process can discuss the relevant values and the scientific evidence and debate the tradeoffs. Given the state of knowledge about the polygraph and the value issues at stake, it seems unwise to put much trust in attempts to quantify the relevant values for society and calculate the tradeoffs among them quantitatively (see National Research Council, 1996b). However, scientific research can play an important role in evaluating the likely effects of different policy options on dimensions of value that are important to policy makers and to the country.

Other Potential Uses of Polygraph Tests

The above discussion considered the tradeoffs associated with polygraph testing in employee security screening situations in which the base rate of the target transgressions is extremely low and there is no specific transgression that can be the focus of relevant questions on a polygraph test. The tradeoffs are different in other applications, and the value of polygraph testing should be judged on the basis of an assessment of the aspects of the particular situation that are relevant to polygraph testing
