Section 1: Cover Sheet

2006 Annual Report
Period of Performance: 9/1/05 – 9/1/06

Principal Investigator, MURI Team, and Addresses:

Dr. Suvrajeet Sen
The University of Arizona
Department of Systems and Industrial Engineering
PO Box 210020
Tucson, AZ 85721-0020

Dr. J. Cole Smith
University of Florida
Department of Industrial and Systems Engineering
Gainesville, FL 32611-6595

Dr. Jionghua (Judy) Jin
University of Michigan
Department of Industrial and Operations Engineering
Ann Arbor, MI 48109-2117

Dr. Ronald G. Askin
Arizona State University
Department of Industrial Engineering
Tempe, AZ 85287-5906

Award Number: F49620-03-1-0377
Proposal Title: Predicting and Prescribing Human Decision Making Under Uncertain and Complex Scenarios

Section 2: Objectives

The objectives of our research are virtually the same as written in last year’s report. However, based on our progress to date, we have reorganized them here to present a more coherent and unified set of focuses. The overarching objective of this research remains the same as before: to develop models that accomplish one or more of the following:

• emulate human decision-making behavior
• provide guidelines/policies which can help improve the effectiveness of human decision-making
• provide insights into settings where human decision-making tends to degrade (e.g., fatigue, time pressure, uncertainty, etc.)
• support human decision-making in settings that are either unfamiliar or have a tendency to lose effectiveness in decision-making
• lead to tractable algorithms for decision problems arising in important Air Force applications such as “Network Interdiction,” “Network Design under Threat,” “Social Network Simulation,” etc.

Reasons that decision makers may choose suboptimal decisions in practice vary widely. For one, the objective by which “good” decisions are evaluated may be incomplete.
The decision maker may fail to enumerate all relevant considerations (as suggested by Support Theory) or be unable to obtain or quantify relevant data on one or more relevant criteria. For instance, the criteria by which we evaluate the quality of a decision may be incomplete, and could ignore certain aspects of what the decision maker considers to be important. A more common reason is that the presence of uncertainties and/or complexities, such as interactions and nonlinearities, within the model often makes it nearly impossible to determine the best decision. Other factors, such as regret, misinformation, stress, and fatigue, also influence the behavior of decision makers, although quantitative models that incorporate these effects are not well-developed at this time.

This team is using its expertise in the areas of random processes, decision behavior analysis, optimization, simulation, and stochastic programming to formulate models that describe the decision maker’s behavior, as well as those that prescribe the best decisions that can be made in each situation. Our efforts will reconcile the differences in these two sets of decisions, and will either change the model to incorporate a more complete knowledge of the decision maker’s objectives, or will detect trends that can be used to train decision makers to make superior decisions. Additionally, knowledge of trends observed in suboptimal decision making will lead to models that exploit weaknesses in enemy behavior and mitigate errors committed by “friendly” decision makers.

The research associated with this project may be categorized into four main thrusts, which are outlined below and summarized in depth in Section 4.

Thrust A: Sequential Decision-Making

In this thrust area, we investigate systematic and replicable patterns of behavior in an attempt to formulate descriptive models that are psychologically interpretable, have potential practical implications, and can better account for human decision behavior. These studies range from theoretical underpinnings to the exploration of new decision-making theories on human subjects. In most cases, the data supporting our studies are collected by presenting financially motivated subjects, whose payoff is contingent on their performance, with various decision scenarios that simulate or otherwise capture major ingredients of real decision-making situations.

Thrust B: Computational Decision-Making Models and Algorithms

This area covers several computational aspects associated with decision-making. In some cases, our computational research suggests experimental work with human/animal subjects, whereas in others we consider simulation experiments. The specific themes that we are investigating within this thrust area include computational models of human decision-making processes, decision-making based on game theoretic models, and models and algorithms for decision-making under risk and uncertainty. These themes provide a comprehensive framework for computationally oriented decision-making research.

Thrust C: Network Decision-Making Research

Research in this area encompasses both descriptive and prescriptive decision-making research in the field of networks, with a particular interest in network interdiction and secure network planning problems. The themes in this thrust area include network design, path planning, and even human decision-making behavior over congested networks. We have also designed a decision-making game in which networks can be designed by one player, and activities can be thwarted by another player. This software is flexible enough to allow either humans or algorithms to act as players.
The purpose of such an exercise is to better understand relationships between human decision-makers and computational methods based on optimization or game theory.

Thrust D: Applied Decision-Making

The goal of this thrust is to develop a high-fidelity, synthetic human decision-making model under complex and realistic environments such as military command and control systems, automated manufacturing systems, and individual behaviors under emergency situations. The availability of such a model will allow us to understand and/or evaluate dynamics of systems involving humans more accurately. In this research endeavor, a number of engineering methodologies and technologies have been employed to help reverse-engineer (understand and extract features from) human behaviors, to represent the human decision-making model formally, and to develop a realistic simulated environment.

Section 3: Status of the Effort

The MURI project has matured into a comprehensive study, covering many facets of human decision-making in uncertain and complex scenarios. This effort covers a broad collection of behavioral, computational, mathematical and practical aspects of decision-making, and their application in decision-making issues facing Air Force personnel. We have not only produced an impressive list of publications, but we have also disseminated this work within the classroom and into the industrial sector. The project underwent a thorough review in November 2005, at which the team presented a complete briefing to the AFOSR review team consisting of the program manager, Dr. Jerome Busemeyer, and his colleagues Dr. Kevin Gluck (AFRL) and Dr. Todd Coombs (AFOSR). The review team was very positive about the progress made by our project. The review team also suggested some fine-tuning in our thrust areas, and the current report reflects these changes.
The project also hosted a two-day workshop (February 2006) to which we invited researchers from fields covering brain science, human factors, industrial engineering, management, mathematics, operations research and, of course, psychology.

From the very beginning, this MURI project has focused on research that integrates the human decision-maker with the study of models and algorithms for decision-making. In some cases, this requires novel experiments that provide data regarding similarities and differences between alternative approaches (behavioral, computational, mathematical). In other cases, humans are explicitly included within a simulation or optimization loop, and in still other cases, models of human decision-making processes are included within more extensive real-world models. The activities associated with this project have given rise to a rich set of research issues at the interface between human perceptions of complexity, risk, and uncertainty, and computational and mathematical tools from operations research. This MURI has thus given birth to a new area of study, which we call Behavioral Operations Research.

Section 4: Accomplishments / New Findings: Research Highlights and Relevance to the Air Force Mission

This section presents the principal research accomplishments during the past year. It is organized into four main sub-sections, each representing a thrust area of our research program. Each thrust area is composed of several themes which represent building blocks upon which the thrusts rest. Finally, each theme is associated with specific research papers, each of which represents a nugget of knowledge that has been developed over the past year. These new nuggets of knowledge cover a wide array of decision-making research, ranging from new theory, experiments, algorithms, simulations, software, and games to efforts aimed at decision-making practice.
Moreover, the following discussion presents the relevance of each thrust, theme, and nugget to the mission of the Air Force (see footnote 1). Thus, the MURI project provides a comprehensive multidisciplinary vehicle covering basic research and applications of significant relevance to the Air Force.

4.1 Sequential Decision Making

We have been engaged in several interrelated lines of research on sequential decision making. Below, we survey our previous work on these problems and describe our more recent work on developing computational models of decision making in sequential decision problems. The latter is our most important objective and will be the focus of the bulk of our efforts for the duration of the grant.

4.1.1 Optimal Stopping Behavior

Over the course of the grant, we have done significant work on better understanding decision behavior in optimal stopping problems. This program has involved both theoretical modeling of optimal decision behavior and experimental work that allows us to examine the behavior of actual human decision makers (DMs) in these problems. The ultimate objective of this program is to develop computational models of decision making in sequential (multi-stage) decision problems. Below, we sketch the work that we have completed and describe our current work on computational modeling.

The basic optimal stopping problem can be informally stated as follows: A DM sequentially encounters a set of decision alternatives and must decide which to accept. Depending on the problem formulation, the set of alternatives may be infinite or finite, the DM may be able to recall previously encountered alternatives or not, and the DM may have more or less information about the distribution from which the alternatives are taken. This kind of problem is faced by Air Force DMs in a wide range of contexts. Scientists and administrators must decide when to pursue work on (potential) new technologies that become known sequentially in time.
Crews on a sortie must decide which sequentially encountered targets to engage, etc. Terminating a search too soon, say by engaging a relatively unimportant enemy position, may mean that high-value alternatives down the road are missed. On the other hand, searching for too long may result in high-value alternatives being passed up. Therefore, understanding how and why these stopping decisions are likely to depart from optimality can be extremely valuable when training combat flight crews.

Footnote 1: In this section, when a reader encounters one or more sentences in italics, he/she should interpret the content as being directly relevant to the Air Force and its Technological Challenges.

Our research on optimal stopping problems has required that we do significant formal, theoretical work on deriving optimal decision policies for the problems that we use in our behavioral experiments. The optimal models serve as our benchmark, providing us with a means of determining how good decision making could be, and also give us a starting point for developing computational models of actual decision making. Recently, Smith, Bearden, and Lim have been working on optimal decision policies for optimal stopping problems in which the DM must expend resources to evaluate the value of each encountered alternative. For instance, a flight crew must decide how much time to spend on surveying a potential target before deciding whether to actually engage it. Spending too much time evaluating an obviously low-value target may result in their missing the opportunity to engage significantly higher valued targets later on. Likewise, spending too little time evaluating a target may lead to the decision to expend valuable resources on what turns out to be a strategically insignificant target.
Throughout the course of a mission, the crew faces a set of very difficult stopping problems: They must continuously decide whether to continue evaluating a target to learn its value, and whether to engage a target (which then reduces the resources they have to engage subsequent potential targets).

Thus far, two theoretical publications have resulted from this collaboration between engineering (Smith and Lim) and management (Bearden). The first, Lim, Bearden, and Smith (2005), proposed a new class of optimal stopping problems that captures the scenarios described above and presented methods for solving a special class of these problems. The second, Smith, Lim, and Bearden (in press), extended this earlier work to a broader class of problems.

Bearden and Connolly (under 2nd review) used the work of Smith, Lim, and Bearden as the basis for a set of behavioral experiments on multi-attribute optimal stopping problems. In short, they showed that, relative to the optimal policies, DMs have a tendency to search too much within alternatives and to stop too soon when searching across alternatives. These findings were very robust and consistent across nearly all experimental subjects. Therefore, one might worry that flight crews would commit too many resources to evaluating potential targets (say by continuing to gather intelligence on a target that has a low expected importance), and be biased to engage relatively low-value targets early in their sorties and consequently forgo opportunities to engage higher-valued targets later on.

It is important to stress that computing optimal decision policies for these kinds of stopping problems is exceptionally complex. It is therefore not surprising that DMs do not behave optimally. Bearden and Connolly (in press) examine the theoretical bounds on simplified search policies that DMs might employ in multi-attribute stopping problems.
They show that relatively simple policies perform near optimally if the policies are correctly parameterized. The qualitative features of these heuristic policies can be easily communicated. Thus, this work may have value in training DMs who must act quickly and in real time and who obviously do not have time to decide “optimally.” Bearden and Connolly are currently working on a paper on the robustness of heuristic search policies. They plan on submitting this paper to the Journal of Mathematical Psychology in fall 2006.

Bearden, Murphy, and Rapoport (2005) considered a variant of multi-attribute sequential search problems in which the DM learns only rank information about each of the alternatives. They developed a numerical procedure for computing optimal policies for these problems, and also presented results from several behavioral experiments. In short, their data revealed that DMs tend to make poor trade-offs: They have a tendency to search for alternatives that meet minimal conditions on each attribute, and fail to appreciate that high values on some attributes can compensate for low values on others. (The work by Bahill et al., which is described in Section 4.4 of this report, is aimed at improving the quality of trade-off decisions in multi-attribute decision problems.)

This recent work on multi-attribute search problems extends the work on single-attribute search problems that we conducted during the early stages of this grant (e.g., Bearden, Rapoport, and Murphy (2006)). Below, we describe how we are incorporating this entire program of research into a comprehensive effort to develop computational models of decision making in sequential (multi-stage) decision problems.

4.1.2 Multi-Stage Decision Problems with Risky Alternatives

Optimal behavior in sequential decision problems depends crucially on the DM’s objective.
For instance, trying to maximize one’s expected payoff and trying to maximize the probability that one’s payoff exceeds some pre-specified threshold can involve very different optimal policies. Obviously, the optimal action when one is trying to achieve some particular tactical objective (e.g., trying to neutralize a particular radar installation) is not necessarily the optimal action for strategic purposes (e.g., trying to win an air campaign). What is not obvious is how sensitive actual DMs’ policies are to different objectives, and in what ways actual policies depart from optimal policies (given the appropriate objective). (The work described in the previous section only involved payoff-maximizing objectives.)

Askin, Krishnan, and Connolly have been doing both theoretical and experimental work on sequential decision problems with different objectives. They have derived optimal decision policies for a broad class of multi-stage risky decision problems with different objective functions. We will illustrate the basic structure of this work by example. Suppose that at the beginning of a 10-stage decision problem, a DM is given one of the following objectives:

Objective 1: Maximize your total accumulated points over the course of the 10 stages.
Objective 2: Maximize the probability that you will earn 1000 points over the course of the 10 stages.

Then, in each stage, the DM must choose between the following options:

Option A: 50% chance to gain 100 points and 50% chance to lose 50 points
Option B: 10% chance to gain 500 points and 90% chance to lose 100 points

The optimal policy under Objective 1 is straightforward: in each stage, choose the option that maximizes expected points (i.e., Option A, which has an expected gain of 25 points per stage, whereas Option B has an expected loss of 40 points). However, under Objective 2, the optimal action in each stage depends on the number of stages remaining and the cumulative points at that stage.
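The dependence of the Objective 2 policy on the stages remaining and the accumulated points can be illustrated with a small dynamic program over the two-option example above. The following is only a minimal sketch, not the derivation used by Askin, Krishnan, and Connolly: it assumes that the DM starts with 0 points, must choose an option in every stage, and succeeds if the final total is at least 1000 points. All function names are hypothetical.

```python
from functools import lru_cache

# Options from the illustrative example: tuples of (probability, point change).
OPTIONS = {
    "A": ((0.5, 100), (0.5, -50)),
    "B": ((0.1, 500), (0.9, -100)),
}
TARGET = 1000  # assumed success criterion: final total of at least 1000 points


def expected_gain(name):
    """Expected point change of one option (the quantity relevant to Objective 1)."""
    return sum(p * x for p, x in OPTIONS[name])


@lru_cache(maxsize=None)
def success_prob(stages_left, points):
    """Maximum probability of finishing with at least TARGET points (Objective 2)."""
    if stages_left == 0:
        return 1.0 if points >= TARGET else 0.0
    return max(option_value(name, stages_left, points) for name in OPTIONS)


def option_value(name, stages_left, points):
    """Success probability if `name` is chosen now and play is optimal afterwards."""
    return sum(p * success_prob(stages_left - 1, points + x)
               for p, x in OPTIONS[name])


def best_option(stages_left, points):
    """Option maximizing the success probability in the given state (ties go to A)."""
    return max(OPTIONS, key=lambda name: option_value(name, stages_left, points))
```

Under these assumptions, `best_option(1, 900)` selects Option A, which reaches 1000 with probability 0.5, while `best_option(1, 500)` selects Option B, the only option that can still reach 1000, even though Option A has the higher expected gain in both states.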
In particular, the DM must carefully consider the variance of the options’ payoffs in addition to their expectations. (In the experimental studies of these problems, the DM was faced with between 5 and 10 options at each stage. The example employed only 2 options purely for illustrative purposes.) Askin, Krishnan, and Connolly have worked out optimal policies for problems in which the duration of the multi-stage problem is known and also for ones in which the DM only has probabilistic information about the duration of the problem.

Most important, this group has been extending descriptive models of risky choice that were developed for static problems to this more general class of multi-stage risky decision problems. Specifically, they have drawn upon theoretical notions from Prospect Theory (Tversky and Kahneman, 1992) and Decision Field Theory (Busemeyer and Townsend, 1993) in order to develop a more comprehensive model of decision behavior in risky multi-stage decision problems.

The experimental data indicate that risk aversion decreases with good (better than the statistical expectation) outcomes and vice versa. Good outcomes are measured with respect to the expected state at a given point in the game. Whereas individuals can reasonably determine and make optimal choices at the start of the game, deviations from the optimal decision increase over time as the current state deviates from the planned trajectory. One potential application relates to behavior in confrontational situations that extend over an interval of time and require discrete operational or tactical decisions to adapt to random outcomes that may deviate from initial planned scenarios. For instance, would the willingness of a battlefield commander to take risks vary with recent successes or setbacks in uncertain situations? Likewise, can an opponent’s changes in decision behavior be predicted based on real-time feedback of random outcomes vs. the expected outcome?
Direct application of results is of course limited by the recognition that the game and hence the fitted models are based on economic payoffs. Additional testing with more substantial penalties and adverse environmental conditions is needed in future research.

4.1.3 Sequencing Problems

Our group is now extending this general line of work to problems in which the DM can dictate the order in which the decision alternatives are evaluated. Suppose for the sake of demonstration that a crew must decide the order in which to engage three potential targets (X, Y, and Z). One target, X, may be of relatively low value, but a mission to engage it may be completed quickly, leaving time for additional missions. Another target, Y, may be of relatively high importance but may also require considerable resources. Target Z may be a very high-value target, but engaging it may require expending all of the mission’s resources, leaving none available to engage additional targets. Is it better to engage X and then to try to get to Y? Should they ignore X and Y and focus on Z?

Bearden, Lim, and Smith are currently collaborating on a project in which they are modeling these kinds of problems and developing methods for finding optimal sequencing policies. This work has immediate applications in domains ranging from command and control to research and development. Next, we are going to use our theoretical work as the basis for evaluating the behavior of actual human DMs in sequencing problems. By discovering how and why behavior is likely to depart from optimality, we can help advise DMs and improve their decision performance. This experimental work will involve the collaborative team of Bearden, Smith, Rapoport, and Connolly.

4.1.4 Models of Decision Behavior in Sequential Decision Problems

As noted above, our deepest objective is to develop computational models of decision making in sequential decision problems (including stopping, sequencing, and assignment problems).
We have established a significant body of experimental data in the first phase of our MURI work on optimal stopping problems. Many of the experimental results are described in the following papers:

• Bearden and Connolly (under 2nd review)
• Bearden, Rapoport, and Murphy (in press)
• Bearden, Murphy, and Rapoport (2005)
• Bearden, Rapoport, and Murphy (2006)
• Bearden and Rapoport (2005, INFORMS)
• Bearden, Murphy, and Rapoport (under review)

Based on this work, we now have a very good understanding of how sequential decision behavior is likely to go awry (i.e., how it is likely to depart from optimality). In each paper, we have attempted to provide explanations for the observed departures from optimality; however, we did not develop a comprehensive model of decision behavior in these problems that would provide a deep understanding of the cognition that underlies behavior.

The next phase of this research program will focus on developing a general computational account of sequential decision behavior. To do so, we are drawing on theoretical work from computer science. In particular, we are using reinforcement learning (RL) models (e.g., Sutton and Barto, 1997; we are also drawing on ideas from Bertsekas and Tsitsiklis, 1997) as the foundation for our behavioral models. We are using the RL models as our basic theoretical infrastructure and are extending them by incorporating behavioral principles. We are currently using temporal difference learning (Q-learning) to model learning in sequential decision problems such as those studied in the papers cited above. (Interestingly, the work from operations research/computer science on temporal difference learning has been largely ignored by psychologists interested in learning in complex decision problems, though it has received attention from some researchers working on animal learning.)
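To make the Q-learning idea concrete, the following is a minimal tabular sketch on a toy stopping problem, together with the backward-induction benchmark it can be compared against. It is not the model fitted to the experimental data: the problem size, the uniform distribution of alternative values, the zero payoff for passing the last alternative, and all function names are assumptions made for illustration, and none of the behavioral extensions discussed in this section are included.

```python
import random

N_STAGES = 5    # assumed number of sequentially encountered alternatives
N_VALUES = 10   # assumed: each alternative is worth 0..9, uniformly at random
CONTINUE, STOP = 0, 1


def optimal_continuation_values():
    """Backward induction: expected payoff of passing at each stage under optimal play."""
    cont = [0.0] * N_STAGES  # passing the last alternative pays nothing
    for stage in range(N_STAGES - 2, -1, -1):
        next_cont = cont[stage + 1]
        cont[stage] = sum(max(v, next_cont) for v in range(N_VALUES)) / N_VALUES
    return cont


def greedy_action(q, stage, value):
    """Action with the larger learned value in state (stage, value); ties go to STOP."""
    stop_q, cont_q = q[stage][value][STOP], q[stage][value][CONTINUE]
    return STOP if stop_q >= cont_q else CONTINUE


def train(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and no discounting."""
    rng = random.Random(seed)
    q = [[[0.0, 0.0] for _ in range(N_VALUES)] for _ in range(N_STAGES)]
    for _ in range(episodes):
        value = rng.randrange(N_VALUES)
        for stage in range(N_STAGES):
            if rng.random() < epsilon:
                action = rng.choice((CONTINUE, STOP))
            else:
                action = greedy_action(q, stage, value)
            last = stage == N_STAGES - 1
            if action == STOP:
                target = float(value)       # payoff is the accepted value
            elif last:
                target = 0.0                # nothing left to accept
            else:
                next_value = rng.randrange(N_VALUES)
                target = max(q[stage + 1][next_value])  # bootstrapped estimate
            q[stage][value][action] += alpha * (target - q[stage][value][action])
            if action == STOP or last:
                break
            value = next_value
    return q
```

Comparing `greedy_action` on the learned table with the thresholds implied by `optimal_continuation_values()` gives a simple way to measure how far a learner is from the optimal policy at any point during training.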
Temporal difference (TD) learning is built upon principles of dynamic programming, which are also the bases for the optimal decision policies in most of the problems we have been studying. Thus, TD models are a natural starting point for this project. The TD models provide methods for DMs to learn how to adjust their policies with experience. A priori, we know that we should not expect the subjects in our experiments to solve dynamic programs “in their head” at the beginning of the experiment in order to decide how to behave. Rather, it is more sensible to assume that they will learn how to solve the problems through experience (much like pilots do in flight simulators). What we would like to know is:

• How do people learn to perform sequential decision problems with experience?
• What are the properties of decision behavior during the course of and at the end of learning?

The answers to these questions are important both theoretically and practically. On the theory side, the answers will contribute to the decision making literature in psychology, which has largely ignored sequential decision problems. More important, perhaps, the answers will have implications for improving the decision behavior of actual DMs. This work could, for example, be used in developing training programs for Air Force personnel.

One of the most robust findings from our experimental studies is that, even with considerable experience, DMs have a strong tendency to under-search: they do not examine enough alternatives before making a stopping decision. Right now, we are using TD principles to build neural network models of decision behavior. (We have also been exploring look-up table variants.) Our modeling results are encouraging. Even with considerable experience, the models have a tendency to search inadequately.
More importantly, it seems that in order to capture most of the regularities in the experimental data, we have had to modify the conventional TD model in several psychologically relevant ways. For instance, by adding a “regret” factor to the model we can pick up on response patterns that are difficult to explain otherwise. Specifically, if the model experiences negative payoffs for passing up what turn out to be relatively good alternatives, then it experiences regret and assigns strong negative payoffs to the action of passing up alternatives. This element seems essential in modeling the learning of human subjects in sequential decision problems. This aspect of the model draws on the recent work by Connolly and colleagues (Connolly and Butler, 2006; Connolly and Reb, under review). Another factor that helps the models behave in a more humanlike way is to have them distort probabilistic information when making decisions. Using the Prospect Theory (Tversky and Kahneman, 1992) weighting function to distort the probabilistic information that the model uses to decide among actions (by overweighting small probabilities and underweighting large ones) improves the correlation between the model behavior and human behavior.

We also plan on merging our efforts on this front with those of Askin, Krishnan, and Connolly described above. This project will add a learning dimension to the models developed by Askin et al.

4.1.5 Sequential Decision Making in Interactive Settings

Rapoport and colleagues have extended our group’s studies of sequential decision making to game-theoretic settings in which agents’ actions affect both their own and others’ payoffs. As with the work described above, this program of research involves both theoretical (e.g., solving for Nash equilibria) and experimental work. Common examples of these kinds of problems are deciding when to join a queue and deciding what traffic route to take.
Understanding the quality of (game-theoretic) strategic planning may be important for improving decision behavior in transportation logistics, for instance.

Rapoport and colleagues (2005) studied a class of single-server queuing systems with a finite population size, FIFO queue discipline, and no balking or reneging. In contrast to the predominant assumptions of queuing theory of exogenously determined arrivals and steady state behavior, this work considers queuing systems with endogenously determined arrival times and focuses on transient rather than steady state behavior. When arrival times are endogenous, the resulting interactive decision process is modeled as a non-cooperative n-person game with complete information. Assuming discrete strategy spaces, the mixed-strategy equilibrium solution for groups of n=20 agents is computed using a Markov chain method. Using a 2×2 between-subject design (private vs. public information by short vs. long service time), arrival and staying out decisions are presented and compared to the equilibrium predictions. The experimental results indicate that players generate replicable patterns of behavior that are accounted for remarkably well on the aggregate, but not individual, level by the mixed-strategy equilibrium solution unless congestion is unavoidable and information about group behavior is not provided. These results are of interest and potential application to any queuing system in which people decide when to join. In other words, aggregate behavior in queuing systems may be well-predicted by game-theoretic models. This can be valuable in logistical planning.

This work on queuing has been extended in a number of directions, in order to examine the generalizability of previous results. A second project examined the decisions agents make in two queuing games with endogenously determined arrivals and batch service (in press, Games and Economic Behavior).
In both games, agents are asked to independently decide when to join a queue to receive bulk service, or they may simply choose not to join it at all. The symmetric mixed-strategy equilibria of two games in discrete time, one in which balking is prohibited and one in which it is allowed, are tested experimentally in a study that varies the game type (balking vs. no balking) and information structure (private vs. public information) in a 2×2 between-subject design. All four experimental conditions result in aggregate, but not individual, behavior approaching mixed-strategy equilibrium play. Individual behavior can be accounted for by relatively simple heuristics. These results have applications to the formation of queues with bulk service. Additional related work by our group on interactive decision problems is described in the Network section of this report.

Though the game-theoretic models can account for the aggregate (i.e., all subjects taken together) experimental queuing and network results, the models fare poorly when trying to account for individual behavior. A complete understanding of behavior in queuing and traffic scenarios will require a descriptively accurate model of individual decision making. Some of the most ambitious tests of our computational models of sequential decision making will be in these kinds of interactive problems. We will ask: How well do populations of interacting instantiations of the model capture the behavior of actual human subjects?

4.2 Computational Decision-Making Models and Algorithms

MURI research in this area focuses on a variety of models, some of which are intended to explain computations that may form the basis for human decision-making, others that are intended to describe choices under competitive pressures, and still others that study algorithms for optimal decisions in applications of interest to the Air Force.
4.2.1 Human Decision-making Processes

Computational Models

There is growing evidence in the literature that diffusion models may be at the crux of the human decision-making process. In the mathematical psychology literature, several researchers (Bogacz et al (2006), Busemeyer and Townsend (1992, 1993), Diederich (1997), Diederich and Busemeyer (2003) and others) have used diffusion models to explain experimental findings about human cognition. Similarly, the neuroscience literature has observed that the process underlying individual neural activation can also be modeled using diffusion processes (MacClennan (1996), Smith and Ratcliff (2004)), and moreover, diffusion processes can also be designed for optimization (e.g. Steinbeck et al (1995)). Despite these mathematical connections, there remains a significant gap in our understanding of the decision-making process adopted by humans, and our research is aimed at seeking a unifying theory. This work is in the spirit of recent papers by Bogacz et al (2006), and Busemeyer et al (2006). Our working paper (Huang, Sen and Szidarovszky (2006)) presents a model that addresses several common themes among those published in the literature. Just as important, however, we demonstrate that some of these approaches may be inconsistent with each other, and experimental work is necessary to discover which specific models may be most pertinent to human decision-making.

Experimental Investigations

One experiment that has already been carried out by Askin, Krishnan and Connolly (2006) deals with hypothesizing analytical parametric extensions of prospect theory and decision field theory to model how individuals might adjust repeated choices among alternatives in multi-stage decision processes in the presence of random outcomes. This experiment was reported in the previous section, and the basic hypothesis, as confirmed from the history of decision research, is that humans misjudge probability.
In particular, Askin et al (2006) postulate that when planning, humans tend to expect the average and underestimate the amount of randomness in future random events. The experimental data indicate that risk aversion decreases with good (better than the statistical expectation) outcomes and vice versa. Good outcomes are measured with respect to the expected state at a given point in the game. Whereas individuals can reasonably determine and make optimal choices at the start of the game, deviations from the optimal decision increase over time as the current state deviates from the planned trajectory. (Additional details of this experimental work were provided in the previous section.)

Two further experiments are currently being designed to investigate the modeling issues associated with diffusion models. The first experiment (led by Tamar Kugler) is a behavioral study using the network interdiction game that has been designed as part of the MURI project (Desai, Huang, and Sen (2006)). Human participants will make sequential decisions in attempting to interdict simple networks. The results will be used to estimate the weights participants put on multiple attributes of the decision model, and to compare those to the predictions of theoretical models. This experiment will combine two main themes the grant has been focusing on: network interdiction, and development of a new cognitive model for decision making as presented in Huang, Sen and Szidarovszky (2006).

A second experiment will be carried out in collaboration with Jennie Si (Arizona State University). In this experiment, our goal is to obtain neural level data from rats that are subjected to stimuli in the laboratory. The experiment will find parameters for models of binary choice decisions modeled by a diffusion process. The experimental setup has already been used in another study investigating the effectiveness of support vector machines to classify choices made by rats.
Our study will provide response-time data for use in diffusion models.

4.2.2 Choices modeled with Game Theory

Leader-Follower Games

Smith, Lim and Alptekinoglu (2006) consider a set of entities that can be claimed by either of two players. Each entity is worth a certain (common) benefit to each player. The two players take turns as in a Stackelberg game: the leader acts first to claim as many entities as possible, followed by the follower. In this game, we consider a predatory follower, whose goal is to limit the amount of profit that can be made by the leader.

To illustrate the mechanism by which entities are claimed by the two players, consider an example in which two armies are positioning themselves for geographically diverse resources. The leader army has a limited number of bases that it establishes. The follower army responds by positioning its bases in opposition to the leader. After both armies’ bases have been established, each asset will be controlled by whichever army has established a base closest to the asset. If two bases are established equidistant from the asset, then the armies share the asset. If no base is established within a certain minimum radius from the asset, then the asset is claimed by neither army.

Several assumptions are worth noting. One, the set of potential base locations is limited to the asset locations themselves. Two, armies can be collocated, implying that they are in direct competition with one another. It is not necessary to assume equal strength of the armies; in fact, our models can handle any proportion of the leader’s strength relative to the follower’s strength, and this proportion can be asset-dependent (or even dependent on the combination of base and asset location). Three, the follower army’s goal is to minimize the amount of benefit that the leader can obtain from its bases, which is not necessarily the same as maximizing its own benefit.
(This situation is common when the follower is acting as an entrenched defender.) Such problems are difficult to solve as nonlinear optimization problems, but the paper by Smith, Lim and Alptekinoglu (2006) provides a methodology for solving them as bilevel integer programming problems. We present specialized methods for these problems that permit the exact solution of strategic-level problems that might be encountered in practical military scenarios. However, applications of this problem go far beyond the scenario presented above, and we also prescribe two heuristic procedures for quickly generating near-optimal solutions to larger optimization problems.

Dynamic Games and Extensions

Szidarovszky and colleagues have investigated several extensions of ordinary dynamic games discussed in the economics literature. While many of these models and the associated analyses were mathematically elegant, they failed to capture several economic realities, such as intertemporal interactions, variable and uncertain cost/price trajectories, etc. This line of research is intended to provide more realistic models and dynamic properties. Szidarovszky and Zhao (2006) included intertemporal demand interaction, since demand at each time period usually depends on previous consumption, implying that the successive time periods are interdependent. Another feature considered by Chiarella and Szidarovszky (2006), and Szidarovszky and Zhao (2006) deals with the inclusion of increasing cost profiles to accommodate increased activity levels. They developed a general model in which the best responses are discontinuous and there are infinitely many equilibria. In spite of this more complicated setting, we were able to establish conditions for which the equilibrium set is a continuum. In continuous time scales, the limiting set is always the boundary of this continuum; however, in discrete time scales, any point of this set can be obtained as the limit.
Other extensions of dynamic games occur in cases where the players consist of groups, and the payoff is measured by benefits to individual members. Okuguchi and Szidarovszky (2006) proved the existence and uniqueness of Nash equilibrium of such games under realistic mathematical conditions.

Dynamic games with probabilistic success rates arise in models for analyzing missions of multi-national forces. Here the probability of success is the ratio of the individual effort of each group to the total effort of all participants. For such games, Szidarovszky and Matsumoto (2006) gave conditions for the stability of the system. Yousefi and Szidarovszky (2006) also performed simulation studies examining the probability of unique or multiple equilibria, as well as the probability of stability.

In most decision-oriented games, the participants have only delayed information about the actions of others. Such delays may cause a loss of stability of the system. Chiarella and Szidarovszky (2006) have examined the effect of information delays and established conditions under which stability can be preserved. These conditions are based on the probabilistic properties of the delay process. In case of lost stability, cyclic behavior can be observed.

A summary of these models was presented in Szidarovszky (2005).

Dynamic Games under Uncertainty

Such games arise when the consequences of the actions of players are uncertain, which is modeled with stochastic methods. It is assumed that the players want to maximize their average payoff, but at the same time, they want to decrease the risk as much as possible. This leads to the need to use multi-objective optimization methods at each time period. Embedding multi-objective methodology into the game leads to a different equilibrium and stability analysis (Chiarella and Szidarovszky (2006)).
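As a rough illustration of the trade-off between average payoff and risk described above, the sketch below scores each candidate action by its mean payoff minus a weighted standard deviation. The weighted-sum scalarization, the weight `risk_weight`, the action names, and the payoff samples are all illustrative assumptions; they are not taken from the cited papers, which embed the multi-objective step in a full dynamic game.

```python
from statistics import mean, pstdev

def mean_risk_score(payoffs, risk_weight):
    """Scalarize two objectives: reward the mean payoff, penalize its spread."""
    return mean(payoffs) - risk_weight * pstdev(payoffs)

def best_action(payoff_samples, risk_weight):
    """Return the action with the highest mean-risk score."""
    return max(
        payoff_samples,
        key=lambda action: mean_risk_score(payoff_samples[action], risk_weight),
    )

# Hypothetical payoff outcomes (equally likely) for two candidate actions.
samples = {
    "aggressive": [0.0, 20.0],  # mean 10, standard deviation 10
    "cautious": [7.0, 9.0],     # mean 8, standard deviation 1
}

risk_neutral_choice = best_action(samples, risk_weight=0.0)
risk_averse_choice = best_action(samples, risk_weight=0.5)
```

With these numbers a risk-neutral player prefers the aggressive action, while a player who weights risk at 0.5 switches to the cautious one, which is the kind of shift that changes the resulting equilibrium analysis.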
In another paper, Genc, Reynolds, and Sen (2005) present models that may be used to predict choices in situations where decisions by a number of players are necessary to describe the alternative economic scenarios that may unfold. We study three alternative behavioral assumptions. In the first formulation, the players make decisions based on a collection of probabilistic scenarios, which we refer to as a game with probabilistic scenarios (GPS). Here the decisions will depend on the scenario that unfolds, although the decision trajectories are required to obey a non-clairvoyance condition which states that decisions cannot depend on information revealed in the future. The second formulation we investigate is called a game with expected scenarios (GES), where investment decisions are based on an expected scenario (since the experiments of Askin, Krishnan, and Connolly (2006) suggest that human decision-makers may use expectations for forecasting). Once the investment decisions are made in a given period, one of the possible scenarios unfolds, and players make their production decisions in response to the specific scenario that unfolds. Finally, we study a third formulation which we call a hybrid game (HG), which combines features from the GPS and GES games. The analysis in Genc, Reynolds, and Sen (2005) indicates that competing game models such as GES might seem attractive, but using HG, we argue that the GPS game is the most tenable of the three. In addition, we show that under certain assumptions (i.e., symmetric cost structures), the presence of volatility also provides greater expected profits in a game. This provides the intuition about why players may continue to participate, even though market volatility may be on the rise. In addition, we study multistage (sequential) games under uncertainty, and provide a formulation that allows lags.
The paper illustrates the advantages of our approach with an example that is well out of reach for standard dynamic programming methodology.

Agent-based simulation

Szidarovszky and his colleagues are studying agent-based methodology to analyze the combined effect of individual personalities, environment, and external influence on the behavior and repeated decisions of individuals in a large artificial society. Such models are particularly relevant in studying social networks, which provide the computational foundations for important problems of homeland security. The number of agents is usually very large, and the governing dynamic rules are discontinuous and stochastic; consequently, the analysis of such societal interactions is mathematically intractable. Games of dilemma (as in Prisoner’s Dilemma) can be studied by systematically and continuously varying model parameters. The game structures gradually move from one type to another, and the behavior near the boundary (between models) can be observed (Zhao et al (2006)). These observations can be used by policy makers to predict responses to policy changes. An important class of games examined by agent-based simulation is based on binary choices of the players. In the paper by Merlone et al (2006) we have proved that there are only finitely many equivalence classes of such games, and developed an algorithm to decide the class to which a given game belongs. Therefore, a specified (and very reasonable) number of simulation studies can describe all such games.

4.2.3 Decision-making under Risk and Uncertainty

Most realistic decision problems include a variety of complicating factors such as resource constraints, risk, uncertainty, and interdependencies. Moreover, as scenarios evolve, decision-makers must be able to process new information, leading to greater situational awareness, and adapt decisions to new information.
For Air Force personnel, such decisions arise in several situations ranging from preparation and planning to combat. Decisions during preparations may involve strategic questions such as recruiting allies, locating bases, and developing an understanding of the objectives/values of both allies and the enemy. Uncertainties in this preparatory phase arise because of the breadth of its scope, and because of the temporal separation between this phase and a full-fledged war. By the time specific plans are drawn up (e.g. planning sorties), certain aspects (e.g. number of planes available) may be better known, but the uncertainty regarding the success of the operation continues. It is only after a battle that the effectiveness of a battalion/squadron becomes clear, and the overall effectiveness of the allied forces becomes clear only at the end of the war. Thus uncertainty pervades decision-making problems in the Air Force. The MURI research team has made some fundamental advances in this area, and they are summarized below.

Two-stage Decision-Making under Uncertainty

In these models, a decision-maker first determines a set of “binary” first-stage decisions, in the sense that he/she must either choose to take a set of actions or not. For instance, these actions could represent the decision of whether or not to fortify certain resources against attack, support a military mission, or establish new bases in forward areas. Next, a random event related to the first-stage decision occurs. In the example of fortifying resources against attack, the random event could represent an attack on some of our infrastructure, the severity of the attack, and the effectiveness of our fortifications against such an attack. Following the first-stage binary decisions and the random outcome, the decision-maker makes another decision in response to the random outcome.
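The two-stage structure just described can be sketched on a toy fortification instance by enumerating every binary first-stage plan and averaging the best second-stage response over the attack scenarios. All numbers, the two-asset setup, the repair-or-abandon recourse, and the assumption that a fortified asset always withstands an attack are hypothetical simplifications chosen for illustration; the cited papers address far larger models that cannot be enumerated this way.

```python
from itertools import product

# Hypothetical data: asset values, fortification costs, and attack scenarios.
ASSET_VALUE = {"A": 10.0, "B": 6.0}
FORTIFY_COST = {"A": 3.0, "B": 3.0}
REPAIR_COST = 8.0
SCENARIOS = [(0.5, "A"), (0.3, "B"), (0.2, None)]  # (probability, attacked asset)

def second_stage_benefit(fortified, attacked):
    """Best response after the random event: repair the asset or abandon it."""
    total = sum(ASSET_VALUE.values())
    if attacked is None or attacked in fortified:
        return total  # assumption: a fortified asset always survives
    repair = total - REPAIR_COST
    abandon = total - ASSET_VALUE[attacked]
    return max(repair, abandon)

def total_expected_benefit(fortified):
    """First-stage benefit (negative cost) plus expected second-stage benefit."""
    first_stage = -sum(FORTIFY_COST[a] for a in fortified)
    expected_second = sum(
        prob * second_stage_benefit(fortified, attacked)
        for prob, attacked in SCENARIOS
    )
    return first_stage + expected_second

def best_first_stage():
    """Enumerate all binary first-stage plans and keep the best one."""
    assets = sorted(ASSET_VALUE)
    candidates = []
    for choice in product([0, 1], repeat=len(assets)):
        fortified = frozenset(a for a, chosen in zip(assets, choice) if chosen)
        candidates.append((total_expected_benefit(fortified), fortified))
    return max(candidates, key=lambda pair: pair[0])

best_value, best_plan = best_first_stage()
```

On this instance, fortifying only asset A gives the highest total expected benefit; fortifying both assets costs more than the extra protection is worth.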
The challenge in these problems is to maximize the benefit achieved in the first stage, plus the expected benefit achieved in the second stage.

Our research covers a gamut of models, ranging from those that are well structured (Huang, Sen, and Zhou (2006)) to others that are significantly more complex (Sherali and Smith (2006)). The common theme tying these papers together arises from the need to recognize decisions that are acceptable under uncertainty. In case of the former paper, Huang, Sen and Zhou (2006) seek decisions that have a high probability of being near-optimum, whereas results in Sherali and Smith (2006) address approaches that help achieve a “satisficing” criterion, i.e., to ensure that a threshold goal is satisfied as often as possible. For instance, rather than maximizing some expected function of enemy attacks that are thwarted (in the above example), it is more likely that a military commander would attempt to maximize the chance that key positions are not lost to the enemy due to, for example, loss of communications or combinations of critical resources.

One of the main mathematical challenges addressed by the research reported in Sherali and Smith (2006) arises from the need to introduce integer variables into the second stage (representing whether or not we must concede that the goal is not achieved under certain circumstances). The presence of these variables thus precludes the use of standard linear-programming based methods for solving the problems. Our approach uses new techniques for overcoming these difficulties, and we demonstrate that our methods are valid and efficient enough to permit the analysis of a broad array of optimization problems fitting this description. The challenges arising from combinatorial choices (integer variables) in the second stage also appear in a variety of applications studied in connection with our project.
Sen and Higle (2005), and Sen and Sherali (2006) present general search methodologies for making decisions when the response to uncertainty is combinatorial. The attractive feature of these methods is that they are based on using approximate solutions of small decision models to arrive at a well-hedged solution aimed at accommodating a large number of scenarios. These methods are applicable to decisions seeking the best combination of bases (as in Ntaimo and Sen (2005)) or similar applications where decisions under uncertainty are complicated due to combinatorial choices. We have also prepared a survey article that provides an overview of such decision problems (Sen (2005)).

Multi-stage Decision-Making under Uncertainty

The two-stage decision models described above are a special case of multi-stage decision models in which one plans a sequence of decisions under uncertainty. Such models are essentially constrained/stochastic generalizations of control models, and find applications in command and control. Some of the simplest examples of multi-stage decision-making arise in air-to-air refueling, route-planning for sorties, and personnel deployment decisions. Uncertainty in such applications arises from the inability to predict the manner in which the mission will evolve. The class of models discussed here allows decisions to evolve with the state of the mission.

In multi-stage models, linear dynamics provide one of the more tractable settings, even in the presence of inequality constraints. Although such constraints make dynamic programming-based procedures computationally intractable, Casey and Sen (2005) propose a successive approximation method which can provide near optimal policies. Note that in the presence of uncertainty, it becomes important to go beyond decisions, and seek policies, so that as sensor information becomes available, the system can adapt to updated state estimates.
Casey and Sen (2005) provide a new algorithm that yields policy polyhedrons which provide guidelines for long-run decision-making under uncertainty. Such policy polyhedrons are expected to provide decision tools which a user might interpret easily.

Another class of multi-stage models where linear dynamics plays a critical role is for the case in which decisions are required to satisfy some integer restrictions. One specialized model, which has applications in aircraft parts inventory, ammunition replenishment, personnel recruitment, etc., is the lot sizing or batch sizing model, which has typically been studied under assumptions of certainty. During wars, and similar uncertain situations, traditional deterministic models are difficult to justify. Instead, uncertainty in demands, lead-times and other parameters becomes important. Huang and Kucukyavuz (2006) present an efficient (polynomial) algorithm for such problems. Related work dealing with uncertainty appears in Lulli and Sen (2004), and a mathematical characterization of the problem appears in Kucukyavuz (2006).

One of the most widely used tools for multi-stage decision-making is the decision tree. Specifically, decision trees help in decomposing a complex decision into a sequence of decision-making steps. While the sequential process is effective in the modeling phase, the use of backward induction in the solution process (e.g. dynamic programming) limits the ability of decision trees to accommodate models in which the objective is not stage-wise separable. Moreover, the presence of constraints and time lags makes it difficult to implement backward induction. Instead, we propose to convert such decision trees to models based on stochastic integer programming techniques. A novel path-based formulation that allows for additional constraints such as non-separability of objective functions, lag constraints, and other complex decisions is presented.
We are currently developing robust algorithmic techniques that exploit the special structure of this formulation and provide efficient solution methodologies. These results will be reported in a paper shortly (to be co-authored by Desai, Huang and Sen).

4.3: Network Decision-Making Research

In this section, we highlight descriptive and prescriptive decision-making research performed by our team in the last year in the field of networks, with a particular interest in network interdiction and secure network planning problems.

4.3.1: Network interdiction

We first describe recent research performed by our team that can be described as “Stackelberg games” on networks. A two-player Stackelberg game is one in which a leader makes a set of decisions in order to achieve some objective (e.g., maximizing profit or minimizing risk), after which a follower makes decisions in reaction to the leader. In general, the follower might be trying to optimize his own objective without regard to the leader’s objective, or might be trying to compete with the leader over a common objective. Smith and Lim (2006) describe these types of problems in a general setting in a forthcoming book chapter.

In network interdiction games, a network exists over which an operator wishes to execute some function, such as finding a shortest path, shipping a maximum flow, or transmitting a minimum cost combination of flows across a network. The role of the interdictor is to compromise certain network elements before the operator acts, by (for instance) increasing the cost of flow or reducing capacity on an arc, perhaps destroying it altogether. Hence, the interdictor acts as the leader, and the network operator acts as the follower.

In order to compare the performance of human decision-makers’ solutions to optimal solutions, we first study optimal decision-making behavior in general network flow scenarios (Lim and Smith (2006)).
For these problems, an attacker disables a set of network arcs in order to minimize the maximum profit that can be obtained from shipping commodities across the network. The attacker is assumed to have some budget for destroying arcs, and each arc is associated with a positive interdiction expense. Their study examines problems in which interdiction must be discrete (i.e., each arc must either be left alone or completely destroyed), and in which interdiction can be continuous (the capacities of arcs may be partially reduced). While the follower’s “reaction” problem is well-studied and not computationally difficult, the leader’s (interdictor’s) problem is very difficult. The contributions made by our team include exact and approximate models for solving the leader’s problems under either discrete or continuous interdiction assumptions.

Given this study, we next examine the problem of building or fortifying a network to defend against enemy attacks in various scenarios (Smith et al. (2006)). Now, the Stackelberg game mentioned above is extended to three stages. The leader is now the network operator, whose mission is to fortify the network in advance of an attack. The follower is now the interdictor, who acts to destroy portions of the network. Finally, the leader acts last to conduct flows across the network, again trying to maximize the profit that can be obtained. In particular, Smith et al. (2006) examine the case in which an enemy can destroy any portion of any arc that a designer constructs on the network, subject to some interdiction budget. While most studies of this nature assume that the enemy will act optimally, in real-world scenarios one cannot necessarily assume rationality on the part of the enemy.
Hence, the authors prescribe optimal network design algorithms for three different profiles of enemy action: an enemy destroying arcs based on capacities, based on initial flows, or acting optimally to minimize our maximum profits obtained from transmitting flows. The suboptimal profiles among these correspond to human decision-making scenarios in which the topology of the network is not fully understood (and hence informed decisions cannot readily be made regarding the optimal interdiction actions), or in which the decision-maker is constrained by time and cannot readily compute an optimal decision.

A different approach to fortification comes in response to random attacks, or attacks that are due to nature instead of malicious behavior. Moreover, fortification in such problems is not likely to completely prevent an attack; rather, increased fortification merely decreases the probability that an attack on the fortified infrastructure is successful. Desai and Sen (2006) model the probability of failure of an arc as a (convex) decreasing function of allocated mitigation resources. The resulting problem attempts to minimize a combination of design cost and network security. This is a difficult nonlinear optimization problem, which is solved using a mixture of contemporary integer and nonlinear programming methods.

We have also conducted research efforts toward the assessment of vulnerabilities in networks, and how effectively decision-makers might be able to spot such weaknesses. Traditional studies on network vulnerability take a static view, considering only the topological structure of the network (i.e., the way that nodes are connected to one another via links). However, there can be other important factors involved in the estimation of network vulnerability in partially observed networks. Consider a setting in which the enemy attacks the network along a finite time horizon. At the beginning, the enemy can only see part of the network.
As time goes by, additional features of the network are revealed to the attacker. Given the partially observed network, the attacker will try to interdict the network flow as much as possible, within a given budget. Huang and Sen (2006) provide an alternative estimation of the vulnerability of a network under dynamic attack via stochastic programming. These new network vulnerability estimation techniques help us predict the robustness of a network in dynamic situations, and play an important role in understanding human decision-making in scenarios where the networks are only partially observable.

4.3.2: Path planning problems

As opposed to the Stackelberg description of games presented above, some network problems involving an operator and an attacking agent involve decisions that are made simultaneously by the two agents. Suppose that the operator seeks a least-cost path between two nodes. If an assessment of link failure probabilities can be made by the network operator, then we may consider several different methods for finding a least-cost path subject to the constraint that the path must survive (all links must operate) with a sufficiently large probability. (A. K. Andreas briefly considers such a problem in her dissertation, supported by this funding.)

However, in many practical settings, decision makers desire the existence of several backup paths as well. For instance, a critical mission may be attempted by several teams working in concert with one another, with redundant capabilities in case one team fails. “Cost” in this setting may refer to the amount of time required to complete the mission. Planning these missions can be quite difficult, because the mission paths should ideally be diverse (so that interdiction of an arc will not disrupt multiple teams), but the total time required by the teams would ideally be minimized.
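To make the survival constraint and the value of a backup path concrete, the sketch below computes the probability that a single path survives and the probability that at least one of two paths survives. It assumes that link failures are statistically independent and that the two paths share no links; both assumptions, and the link probabilities used, are illustrative and are not stated in the cited work.

```python
from math import prod

def path_survival(link_survival_probs):
    """A path survives only if every one of its links operates."""
    return prod(link_survival_probs)

def at_least_one_survives(path_a, path_b):
    """Survival probability of a primary/backup pair with no shared links."""
    p_a = path_survival(path_a)
    p_b = path_survival(path_b)
    return 1.0 - (1.0 - p_a) * (1.0 - p_b)

def meets_threshold(path_a, path_b, threshold):
    """Check the reliability constraint for the pair of paths."""
    return at_least_one_survives(path_a, path_b) >= threshold

# Hypothetical per-link survival probabilities for two team routes.
primary = [0.9, 0.9]         # two links
backup = [0.8, 0.95, 0.9]    # three links, more exposure

single = path_survival(primary)
pair = at_least_one_survives(primary, backup)
```

Here the primary path alone survives with probability 0.81, while the pair survives with probability of roughly 0.94, so a threshold of 0.9 can be met only by adding the backup path.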
An interesting question that we will investigate is whether human decision-makers tend to optimize such problems, and whether diverse routing considerations tend to dominate cost considerations. However, it is not clear which of these considerations dominates in “optimal” solutions.

Andreas and Smith (2006b) analyze the problem in which two paths between a source and destination node are established, such that the probability that at least one path remains operational is not less than some threshold. These authors consider the case where both paths must be arc-disjoint and the case where arcs can be shared between the paths. This study is the first of its kind, and yields insights as to the limitations imposed by requiring arc-disjoint paths versus permitting limited arc sharing (provided that the overall probability that at least one path survives is sufficiently large). This study is continued by Andreas et al. (2006), in which some h arc-disjoint paths are established between a source and destination. While the mathematical approach is fundamentally different from the two-path study, the results are positive in the sense that we can now compare solutions to these problems to optimal solutions produced by our algorithms.

4.3.3: Human behavior on congested networks

The Braess paradox (BP) (Braess (1968)) consists of showing that, in equilibrium, adding a new link that connects two routes running between a common origin and common destination may raise the travel cost for each network user. Rapoport et al. (2006b) report the results of two experiments designed to study whether the paradox is behaviorally realized in two simulated traffic networks that differ from each other in their topology.
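The paradox itself can be reproduced on a small textbook-style network. The sketch below uses a hypothetical four-node network (S, A, B, E), 4000 travelers, and assumed link cost functions; none of these values come from the experiments reported here. It checks which route assignments are equilibria, in the sense that no used route costs more than any available route, before and after a free link from A to B is added.

```python
# Hypothetical network S -> {A, B} -> E with an optional free link A -> B.
TRAVELERS = 4000

ROUTES = {
    "S-A-E": ["SA", "AE"],
    "S-B-E": ["SB", "BE"],
    "S-A-B-E": ["SA", "AB", "BE"],  # uses the added link A -> B
}

def link_costs(flows):
    """Per-traveler link costs given route flows (assumed cost functions)."""
    on_sa = flows.get("S-A-E", 0) + flows.get("S-A-B-E", 0)
    on_be = flows.get("S-B-E", 0) + flows.get("S-A-B-E", 0)
    return {"SA": on_sa / 100, "AE": 45, "SB": 45, "BE": on_be / 100, "AB": 0}

def route_costs(flows, available):
    """Travel cost of each available route under the given flows."""
    costs = link_costs(flows)
    return {r: sum(costs[link] for link in ROUTES[r]) for r in available}

def is_equilibrium(flows, available):
    """No used route may cost more than the cheapest available route."""
    costs = route_costs(flows, available)
    cheapest = min(costs.values())
    return all(
        costs[r] <= cheapest + 1e-9 for r in available if flows.get(r, 0) > 0
    )

without_link = ["S-A-E", "S-B-E"]
with_link = ["S-A-E", "S-B-E", "S-A-B-E"]

even_split = {"S-A-E": TRAVELERS // 2, "S-B-E": TRAVELERS // 2}
all_on_new_link = {"S-A-B-E": TRAVELERS}

cost_before = route_costs(even_split, without_link)["S-A-E"]
cost_after = route_costs(all_on_new_link, with_link)["S-A-B-E"]
```

In this instance the even split is an equilibrium only while the link is absent; once it is added, every traveler is drawn onto the new route and the equilibrium cost per traveler rises from 65 to 80.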
Implementing a within-subjects design, both experiments include large groups of participants in a computer-controlled setup who independently and repeatedly choose travel routes in one of two types of traffic networks, one with the added links and the other without them. Their results reject the hypothesis that the paradox is of marginal value and that its force, if at all evident, diminishes with experience. Rather, they strongly support the alternative hypothesis that, with experience in traversing the networks, players converge to choosing the equilibrium routes in the network with added capacity despite sustaining a sharp decline in their earnings.

The BP in traffic and communication networks is a powerful illustration of the possible counterintuitive implications of the Nash equilibrium solution. Extending previous research by Rapoport et al. (2006b), Rapoport et al. (2006a) report the results of a new experiment with a richer topology and asymmetric link costs of travel designed to assess the descriptive power of the BP. Their results show that with experience in traversing the network, players’ choice frequencies approach the equilibrium solution as predicted by the BP.

Given the self-optimization that people tend to exhibit, as demonstrated by the foregoing studies, we adapt these decision-making principles to evacuation networks. These evacuation networks can refer to organized retreats from a battlefield, evacuation of a city from an impending disaster, or a fire evacuation plan from a large building. Andreas and Smith (2006a) note that self-selection of evacuation routes can lead to congestion and poor throughput in a system. Moreover, the evaluation of the quality of evacuation routes is often flawed: the average travel time is the most common metric. However, consider two candidate evacuation plans. Plan 1 evacuates 90% of the network’s occupants in 20 minutes, and 10% of the occupants in 40 minutes. Plan 2 evacuates all occupants in 23 minutes.
While the average evacuation time of Plan 1 (22 minutes) is better than that of Plan 2 (23 minutes), one might prefer Plan 2 if there is a critical evacuation deadline. Such deadlines occur, for instance, in hurricane evacuation, or in a retreat problem when the arrival time of a malicious force can be anticipated. Andreas and Smith (2006a) examine the design of an evacuation tree, in which evacuation is subject to capacity restrictions on arcs. The cost of evacuating people in the network is determined by the sum of penalties incurred on arcs on which they travel, where penalties are determined according to a nondecreasing function of time. Given a discrete set of disaster scenarios affecting network population, arc capacities, transit times, and penalty functions, this study seeks to establish an optimal a priori evacuation tree that minimizes the expected evacuation penalty. The centralized planning that we exert over the system helps to mitigate the negative impacts of selfish routing shown in the laboratory by Rapoport et al. (2006a,b). The tree structure can be implemented in practice by simply displaying a set of directional arrows, for instance, in building hallways or at road intersections. The solution strategy is based on a decomposition technique, which allows us to analyze time-expanded networks, i.e., networks whose flows are linked spatially as well as temporally. In this fashion, we are able to quickly obtain tighter lower and upper bounds on the optimal solution, and can indeed identify an optimal solution within a few hours of computational time for instances of moderate size.

4.3.4: The Network Interdiction Game

A large group of the MURI team (Suvrajeet Sen, Jitamitra Desai, Kai Huang, Balaji Ganesan, Arvind Narayanan, Zhihong Zhu, Tamar Kugler, J. Cole Smith, Mofya Chisonge) has been involved in designing a network-interdiction game that can be used in a variety of experiments.
At the most basic level, it can be used to test the robustness of a network design to attack from automated algorithms or human attackers. At other levels, it can be used to study the behavior of human attackers, and finally, it can also be used to investigate the relative power of automated design and attack algorithms. In this research effort, we model two opposing sides with conflicting objectives (or missions). The design mission is to design a network that satisfies flow/demand constraints in the most cost-efficient fashion, given that the network is subject to attack from the opposing side. The dual-purpose design objective is to construct networks that are not susceptible to enemy attack while simultaneously being (relatively) cost effective. On the other hand, the attack mission is to limit the flow through the network to the largest extent possible. Both sides are subject to budget constraints. This network game is set up as a simulation environment via a distributed computing framework, wherein the simulation module provides the centralized framework and is responsible for interacting with both the design and attack modules. Another aspect of the game is the reveal module, which randomly reveals a connected portion of the designed network to the attack module. Other important features include dynamic flow updates at discrete time intervals, time-based cost structures for the attack module, dynamic graph revelations, etc. Figures 1-6 display one instance of the game in progress.
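As a rough illustration of the attack mission described above, the following minimal sketch computes the maximum flow on a toy network and then enumerates every attack that removes at most a fixed number of arcs, keeping the attack that leaves the least flow. The network, its capacities, the unit attack cost per arc, and the function names are all hypothetical; the reveal and redesign steps of the actual game are not modeled, and brute-force enumeration is used here only because the example is tiny.

```python
from collections import deque
from itertools import combinations


def max_flow(arcs, source, sink):
    """Edmonds-Karp maximum flow; arcs maps (tail, head) -> capacity."""
    residual = {}
    for (u, v), cap in arcs.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + cap
        residual[v].setdefault(u, 0)
    if source not in residual or sink not in residual:
        return 0
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity, then push that much flow along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def best_attack(arcs, source, sink, budget):
    """Brute-force attacker: remove at most `budget` arcs to minimize max flow."""
    best_arcs, best_flow = (), max_flow(arcs, source, sink)
    for k in range(1, budget + 1):
        for removed in combinations(sorted(arcs), k):
            remaining = {a: c for a, c in arcs.items() if a not in removed}
            flow = max_flow(remaining, source, sink)
            if flow < best_flow:
                best_arcs, best_flow = removed, flow
    return best_arcs, best_flow


# Hypothetical four-node design: s -> {a, b} -> t, plus a cross arc a -> b.
design = {("s", "a"): 4, ("s", "b"): 3, ("a", "t"): 3, ("b", "t"): 4, ("a", "b"): 2}
baseline = max_flow(design, "s", "t")
attack, damaged = best_attack(design, "s", "t", budget=1)
print(baseline, attack, damaged)
```

On this toy design the baseline flow is 7, and a single-arc attack can reduce it to 3. A designer could use the same two functions to compare candidate designs by their post-attack flow rather than by their undamaged flow.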
Figure 1: Initial design
Figure 2: Two arcs revealed
Figure 3: Revealed arcs attacked
Figure 4: Redesign: Attacked arcs rebuilt
Figure 5: Three arcs revealed
Figure 6: Revealed arcs attacked

4.4 Applied Decision-Making

The goal of this thrust is to develop a high-fidelity, synthetic human decision-making model under complex and realistic environments such as military command and control systems, automated manufacturing systems, and individual behaviors under emergency situations. The availability of such a model will allow us to understand and/or evaluate the dynamics of systems involving humans more accurately. In this research endeavor, a number of engineering methodologies and technologies have been employed to help reverse-engineer (understand and extract features from) human behaviors, to represent the human decision-making model formally, and to develop a realistic simulated environment. They are 1) component behavioral models (research outcomes) described in three other thrusts in this report, 2) extended BDI (belief, desire, intention) agent framework, 3) tradeoff studies, 4) extended Decision Field Theory, 5) advanced monitoring and control techniques for complex multivariate systems, 6) immersive virtual reality technology, 7) human-in-the-loop and distributed simulation.

4.4.1 Extended BDI Framework and Human-in-the-loop Experiments.

Zhao and Son (2006) proposed an extended BDI (belief, desire, intention) agent framework (Rao and Georgeff (1998)) for modeling partial human decision-making in complex automated manufacturing systems (preliminary work was reported in last year’s report).
The proposed framework is capable of 1) generating plans in real time (suitable for dynamically changing environments), 2) supporting both reactive and proactive decision-making, 3) maintaining situation awareness in a human-language-like logic to facilitate interfacing with a real human, and 4) adapting the commitment strategy to historical performance (denoted as the confidence index). In Zhao and Son (2006), the proposed model has been developed in the context of the human operator who is responsible for error detection and recovery in a complex automated manufacturing system. LORA (logic of rational agents) is employed to represent the models. A scheme for integrating the proposed human agent with an automated shop floor control system (environment) is also developed to demonstrate the proposed agent in the context of an automated manufacturing system. A distributed computing platform based on the DOD High Level Architecture (now IEEE 1516 Standard) has been used to integrate an agent (implemented in JACK), a real human, and the environment (Arena simulation software). Although our work has been developed and demonstrated in the context of error detection and recovery personnel in a complex automated manufacturing environment, it is expected that the model is directly applicable to human operators dealing with complex systems in the Air Force (e.g. pilots during combat) and in civilian systems, such as operators in a nuclear reactor, power plant, or extended manufacturing enterprise. Later, Son and Jin (2006) and Lee et al. (2006) further developed the proposed BDI framework (see Figure 7), where the two major additions are the use of a Bayesian belief network for the perceptual processor module and the use of the SOAR program to implement the real-time planner module.
Furthermore, to enhance the generality of the proposed BDI framework, we have applied it to various scenarios including 1) error detection and resolution personnel in a complex manufacturing facility (Zhao and Son (2006); this was reported in last year’s report), 2) evacuation behaviors under a terrorist bomb attack (Shendarkar et al. (2006)), 3) a rifle shooting problem (Lee et al. (2006); Son and Jin (2006)), and 4) evacuation behaviors under fire in a factory (Vasudevan and Son (2006)). For the first two scenarios, BDI models have been developed only conceptually, without involving human experiments. For the last two scenarios, simulation software systems were developed to allow human-in-the-loop experiments. Figure 8(a) shows a snapshot of software running on a PC to mimic the rifle shooting situation, where the decisions considered concern the frequency of calibration in shooting. The simulation software was used to conduct a preliminary experiment involving 5 subjects (Lee et al. (2006); Son and Jin (2006)), which has helped us refine the corresponding BDI models. In Fall 2006, the same software will be used to conduct experiments involving more than 60 students at The University of Michigan (Jin’s Design of Experiment class). The experimental results will be used to analyze the impact of factors (e.g. mean shift of a noise variable, standard deviation of a noise variable, training) on human shooting performance as well as to fit the series of shooting behaviors into the BDI models. Figure 8(b) shows a snapshot of a human interacting with the simulated factory under fire in the immersive virtual reality environment, where the decision considered is the choice of one of several alternative evacuation paths. Currently, the human subjects review process for this experiment is being undertaken at The University of Arizona. After the review is approved, we are planning to conduct experiments with human subjects to develop corresponding BDI models.
Figure 7: Extended BDI (belief, desire, intention) agent framework
Figure 8: Screenshots of software system in PC and immersive VR environments for human-in-the-loop experiments

Another class of applications being considered is one in which multiple decision-makers (e.g. multiple units of a joint force) are to be coordinated through a central command. In such cases, the overall risk associated with the mission is measured by the coordinator, whereas each individual unit has its own risk exposure. Because of the difference in the level of detail faced by the central command and the individual units, it is important to provide decision support that allows decision-makers to evaluate the consequences of their decisions. In particular, we are developing new methodology that will allow the central command to achieve a coordinated decision based on a sequence of inputs from the (subordinate) units. This methodology, which is reported in Desai et al. (2006), is an extension of the so-called multi-disciplinary optimization (MDO) methodology, and appears to be ideally suited for maximizing autonomy while maintaining coordination.

4.4.2 Tradeoff Studies and Application of Extended Decision Field Theory.

This research endeavor concerns tradeoff studies, which are relevant to the decision executor module in the extended BDI framework (see Figure 7). Tradeoff studies provide an ideal, rational standard for making a choice among alternatives. Air Force officers routinely use tradeoff studies to help select contractors, architecture and system designs. Also, tradeoff studies are broadly recognized and mandated as the method for simultaneously considering multiple alternatives with many criteria, and as such are recommended in the Software Engineering Institute’s Capability Maturity Model Integration (CMMI 2006) Decision Analysis and Resolution (DAR 2004) process.
The work by Bearden et al. (2005), which is described in the Sequential Decision Making section of this report, emphasizes the importance of tradeoff studies as well. In this research, we have been developing tools, techniques, and strategies to help engineers, managers, military officers and politicians perform tradeoff studies and document their decision-making processes. Tradeoff studies, which involve human numerical judgment, calibration and data updating, are often approached with underconfidence by analysts and are often distrusted by decision makers. The fields of Judgment and Decision Making, Cognitive Science and Experimental Economics have built up a large body of research on human biases and errors in considering numerical and criteria-based choices. Smith (2006) studied hundreds of these experimental papers and isolated seven dozen biases that could specifically affect the components of tradeoff studies. Similarities between experiments in these fields and the elements of tradeoff studies show that tradeoff studies are susceptible to human biases, but also indicate ways to eliminate the presence, or ameliorate the effects, of human biases on tradeoff studies. Smith et al. (2006a) have proposed strategies to ameliorate the effects of human biases on tradeoff studies. Sensitivity analysis, a mandatory component of tradeoff studies, is a powerful tool for understanding systems, but precise mathematics, subtle tricks and customizations have to be used to reap the benefits. Smith et al. (2006b) have shown how to overcome some of the difficulties of performing sensitivity analyses. That paper draws examples from a broad range of fields, including bioengineering, process control, tradeoff studies and system design, and generalizes the important points that can be extracted from a literature covering diverse fields and long time spans.
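To make the role of sensitivity analysis in a tradeoff study concrete, the sketch below assumes a simple weighted-sum combining function, which is one common choice and not necessarily the one used in the cited papers. The two alternatives, the criteria, the scores, and the weights are invented for illustration, and the function names are hypothetical. The sketch ranks the alternatives, shows that a change in the weights can reverse the ranking, and estimates by finite differences how each overall score responds to a small change in one raw weight.

```python
def weighted_score(scores, weights):
    """Weighted-sum combining function with weights normalized to sum to 1."""
    total = sum(weights.values())
    return sum(weights[c] / total * scores[c] for c in weights)


def rank(alternatives, weights):
    """Return alternative names ordered from best to worst overall score."""
    return sorted(
        alternatives,
        key=lambda name: weighted_score(alternatives[name], weights),
        reverse=True,
    )


def weight_sensitivity(alternatives, weights, criterion, step=0.01):
    """Finite-difference change in each overall score per unit change in one raw weight."""
    bumped = dict(weights)
    bumped[criterion] += step
    return {
        name: (weighted_score(s, bumped) - weighted_score(s, weights)) / step
        for name, s in alternatives.items()
    }


# Hypothetical scores on a 0-1 scale for two candidate designs.
alternatives = {
    "design_A": {"cost": 0.9, "performance": 0.4},
    "design_B": {"cost": 0.5, "performance": 0.8},
}
weights = {"cost": 0.6, "performance": 0.4}

baseline_rank = rank(alternatives, weights)
shifted_rank = rank(alternatives, {"cost": 0.4, "performance": 0.6})
sens = weight_sensitivity(alternatives, weights, "performance")
print(baseline_rank, shifted_rank, sens)
```

With these made-up numbers, design_A is preferred under the original weights and design_B once performance is weighted more heavily than cost. The sensitivities have opposite signs for the two designs, which is the kind of signal that tells an analyst the recommendation depends on how carefully the weights were elicited.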
Another important component of tradeoff studies is the derivation of weights of importance for the criteria. Botta and Bahill (2006) developed a prioritization process, which has been used to derive weights of importance for the criteria in tradeoff studies and to prioritize goals, customer needs, capabilities, risks, directives, initiatives, issues, activities, requirements, technical performance measures, features and functions. It has been used at National Security Solutions of BAE Systems. Technical performance measures (TPMs) are tools that show how well a system is satisfying its requirements or meeting its goals. Oakes et al. (2006) demonstrated the use of TPMs for National Security Solutions of BAE Systems. It is believed that the prioritization of weights of importance, and the use of TPMs by contractors, will increase the probability of success of Air Force systems.

In the field of tradeoff studies, the dynamic evolution of preferences among options (alternatives) has not been investigated extensively. Therefore, we also studied the evolution of the preference state based on Decision Field Theory (DFT) (Busemeyer and Townsend (1993)). Lee and Son (2006) have been investigating extensions of DFT for dynamic and realistic environments. The first extension is to consider the psychological fluctuation or evaluation error of a human subject. This is a situation where the subjective values of the attributes of each option may change over time. The second extension is to consider the case when the focus of human attention (the weight vector over attributes) may change dynamically from one attribute to another. This case was considered by Diederich (1997), where a Markov process was used to model the stochastic changes in weights over time. Later, Roe et al. (2001) assumed that the weights are independently and identically distributed (iid) over time. However, a Markov process does not change its sub-processes dynamically in response to a changing environment.
Thus, we employed a Bayesian belief network (BBN) to model when and how human attention changes in response to a dynamic environment. To test the feasibility of the proposed ideas, we developed simulation software (see Figure 8(c)) of a stock market to allow human-in-the-loop experiments, where the decisions considered are when to sell the stocks. In this example, the weights on each attribute are assumed to be affected by three factors: 1) past investment return, 2) the amount of available funds, and 3) the current market trend. More factors can be considered in a similar manner. The prior probabilities in the BBN are obtained from the actual human experiment. In Lee and Son (2006), the experimental data from one human subject was used to illustrate the proposed concepts. In Fall 2006 and Spring 2007, we are planning to conduct more extensive experiments to test the feasibility of the proposed ideas.

4.4.3 Advanced Monitoring and Control Techniques for Complex Multivariate Systems.

Two major components of the extended BDI framework (see Figure 7) are the perceptual processor (monitoring) module and the decision executor (control) module. The increasing complexity of systems and recent developments in sensing and computer technology have resulted in a data-rich environment in most automatic monitoring systems. In this research endeavor, we have been developing advanced monitoring and control techniques for complex, multivariate systems. In such systems, a high-dimensional profile signal is often observed in the measurement of system responses, in which each signal profile is measured over a complete cycle of a system operation. Generally, when a system is operated under the same condition, different cycles of operation should have the same average profile signal of the system responses.
Different clusters of these profile signals can reflect different operational characteristics of the monitored system under different conditions. In contrast with currently available supervised classification approaches that heavily depend on the training dataset, Zhou and Jin (2005) developed an automatic feature selection method for unsupervised clustering of high-dimensional profile data. First, principal component analysis (PCA) is applied to the raw profile signals. Then a new method is proposed to select only the informative principal components to allow clustering to be performed effectively. The dimension of the selected features for clustering can be significantly reduced through the use of these two steps. Finally, a model-based clustering method is applied to the selected principal components to automatically find the clusters in the analyzed dataset. This research can be further applied to the automatic grouping of decision maker behaviors in Air Force applications, to find different clusters of decision makers in terms of their multidimensional behavioral responses. For example, an automatic feature selection method can be developed for clustering the behaviors of allies and the enemy in combat situations.

Another important aspect of monitoring is the early and effective detection of changes in the system state. In general, system states are monitored by multiple cross-correlated attributes, and changes in system functionality are reflected by different patterns of attribute changes. In Air Force combat situations, the impact of attackers can be characterized by different patterns of the received monitoring attribute signals. Early and effective detection of different attack patterns is extremely important for being well prepared to handle them.
Among all the potential attackers, some attack patterns may be known in advance from prior knowledge or learned from historical training datasets, while there are always other unknown attack patterns (which have been neither discovered nor learned before). Therefore, monitoring control charts should be designed to detect both known and unknown attack patterns. Moreover, the detection sensitivity weights among these patterns must be allocated appropriately based on the occurrence probability or the risk weight of each known pattern and the class of unknown patterns. As a result, the commonly used non-directional multivariate control charts cannot be applied directly. For this purpose, Zhou et al. (2005) proposed a new directionally variant multivariate control chart monitoring system, in which multiple univariate projection charts are designed to monitor the critical specific pattern changes while one non-directional multivariate control chart is used to monitor the unknown pattern changes. This research has developed a systematic way to optimize the Type I error allocation among those control charts according to the severity or occurrence probability of those patterns, which can achieve maximal system detection power under a given total system Type I error. It proved that, for the predefined patterns, the detection power of the conventional non-directional multivariate control chart is improved by adding simultaneous univariate “projection” control charts. However, for the unknown faulty condition, the detection power of the combined chart could be reduced. The overall benefit of using the proposed combined control charts is determined by the tradeoff between the detection of known patterns and the detection of unknown patterns. This research discussed the generic conditions and justification for using the proposed charts.
It shows that as the occurrence probability, severity, or mean shift magnitude of the known patterns increases, the proposed combined control chart yields a greater improvement in overall detection power. This research can be generally applied to automatically design multivariate monitoring control charts in combat that effectively allocate detection sensitivity weights among different attack patterns based on their occurrence probability or their importance to the success of the mission.

We also discovered that incorporating the causal relationships among the monitored system components allows us to develop a more effective monitoring system. Multivariate statistical process control (SPC) using the Hotelling $T^2$ statistic is widely adopted for change detection in complex multivariate systems. However, a $T^2$ control chart alone is not capable of identifying the root causes of a change in individual components. Thus, the decomposition of $T^2$ was proposed by Mason et al. (1995) (called the MTY approach), which provides a way to identify the variables significantly contributing to an out-of-control $T^2$ signal. However, the MTY approach is computationally expensive and has a limited capability in root cause diagnosis when the number of variables is large. To overcome this problem, Li et al. (2006) proposed a causation-based $T^2$ decomposition method by effectively integrating causal models with the MTY approach. Theoretical analysis and simulation studies demonstrate that the proposed method substantially reduces the computational complexity and enhances the diagnosability, compared with the MTY approach. This research can be further applied to the development of an advanced monitoring and diagnosis system based on the causal relationships among the monitored system components. It can support effective decisions for quickly detecting, diagnosing, and potentially preventing failures in a military system.
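For readers unfamiliar with the statistic, the following minimal two-variable sketch computes Hotelling's $T^2 = (x - \bar{x})^\top S^{-1} (x - \bar{x})$ and splits it into an unconditional term for the first variable and a conditional term for the second variable given the first, which is the first step of an MTY-style decomposition. The in-control data, the test observations, and the function names are hypothetical, no control limit is computed, and the causation-based method of Li et al. (2006) is not reproduced here.

```python
from statistics import mean


def covariance_2d(data):
    """Sample mean vector and 2x2 sample covariance of (x1, x2) pairs."""
    n = len(data)
    m1 = mean(p[0] for p in data)
    m2 = mean(p[1] for p in data)
    s11 = sum((p[0] - m1) ** 2 for p in data) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in data) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in data) / (n - 1)
    return (m1, m2), (s11, s12, s22)


def hotelling_t2(x, center, cov):
    """T^2 = (x - center)' S^{-1} (x - center), using the closed-form 2x2 inverse."""
    d1, d2 = x[0] - center[0], x[1] - center[1]
    s11, s12, s22 = cov
    det = s11 * s22 - s12 ** 2
    return (s22 * d1 ** 2 - 2 * s12 * d1 * d2 + s11 * d2 ** 2) / det


def mty_terms(x, center, cov):
    """Return (unconditional term for x1, conditional term for x2 given x1)."""
    t2 = hotelling_t2(x, center, cov)
    t1 = (x[0] - center[0]) ** 2 / cov[0]
    return t1, t2 - t1


# Hypothetical in-control history in which x2 closely tracks x1.
history = [(1.0, 1.0), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2), (5.0, 4.8)]
center, cov = covariance_2d(history)

consistent = (4.0, 4.0)  # follows the usual x1-x2 relationship
broken = (4.0, 2.0)      # same x1, but x2 departs from the relationship
t2_consistent = hotelling_t2(consistent, center, cov)
t2_broken = hotelling_t2(broken, center, cov)
terms_consistent = mty_terms(consistent, center, cov)
terms_broken = mty_terms(broken, center, cov)
print(t2_consistent, t2_broken, terms_consistent, terms_broken)
```

Both test observations have the same unconditional term for the first variable, so a univariate chart on that variable would treat them identically. Only the conditional term separates them, which illustrates why a decomposition is needed to attribute a $T^2$ signal to a specific variable.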
Employing the above-mentioned monitoring techniques, Jin and Son (2006) proposed to develop an effective human decision support system for complex environments. When multiple sensors measuring different components raise alarms simultaneously in a complex environment, humans often make poor decisions, especially under high stress. The proposed human decision support system will allow a complex multivariate monitoring problem to be explicitly decomposed into a sequence of univariate monitoring procedures that can be easily handled by human beings. Meanwhile, Decision Field Theory (DFT) (Busemeyer and Townsend (1993)) is further integrated to analyze the tradeoff between the human’s belief about the monitored components’ states and the sensor monitoring results. The obtained results can be used to support the human in developing defensive strategies and decisions for security/protection systems with distributed sensors.

In addition to the advanced monitoring techniques for complex systems, we also studied advanced control techniques. Generalized Predictive Control (GPC) is often used when the studied dynamic system has a large dead time (the lag between executing the control action and obtaining its corresponding response output measure) and an ARMAX model can be built. The critical issue in implementing GPC is how to get an accurate prediction of the system change trend. If a system can exhibit different change patterns at different times, automatic detection and estimation of the change pattern is needed. For this purpose, Jin et al. (2006) have shown how to integrate SPC (Statistical Process Control) to develop a supervisory strategy for GPC controller development. The paper shows that appropriate feedback control strategies can be developed by adaptively adjusting the control decision based on the online predicted trends from SPC monitoring.
This research can be further extended to control decisions for complex dynamic systems with large and varying dead-time lags, such as policy decision making in social science and resource allocation in military deployment.

4.5 Relevance to the Air Force and Applications to the Air Force Technology Challenges

Our MURI team is committed to focusing on research that is of great relevance to the Air Force. As a result, we have presented our research findings in the context of decision-making issues faced by Air Force personnel and, moreover, we are investigating applications that pose technological challenges for the Air Force. In order to highlight this commitment, we have required every thrust, theme and nugget in this report to present its relevance explicitly in subsections 4.1-4.4. These discussions are highlighted using italics in each of the previous four subsections, and we recommend that a reader who might have overlooked the connections review those highlights again.

Section 5: Personnel Supported

The following is a list of faculty and students who have been supported by this grant in the previous year.

Faculty: Ron Askin, Terry Bahill, Terry Connolly, Aleksander Ellis, Jionghua (Judy) Jin, Simge Kucukyavuz, Tamar Kugler, Amnon Rapoport, Suvrajeet Sen, J. Cole Smith, Young Jun Son, Ferenc Szidarovszky

Graduate and Post-Doctoral Students: Neil Bearden, Jiaqiong Chen, Jitamitra Desai, Manish Garg, Balaji Ganesan, Kai Huang, Shravan Krishnan, Seungho Lee, Churlzu Lim, Jingjie Long, E. Chisonge Mofya, Arvind Natarajan, Sairam Rayaprolu, Ameya Shendarkar, Eric Smith, Fransisca Sudargho, Karthik Vasudevan, Jayendran Venkateswaran, Matthew Young, Jijun Zhao, Xiaobing Zhao, Zhihong Zhou

Section 6: Publications

Andreas, A.K. and Smith, J.C. 2006a. “Decomposition Algorithms for the Design of a Non-Simultaneous Capacitated Evacuation Tree Network,” submitted to Networks.
Andreas, A.K. and Smith, J.C. 2006b.
“Mathematical Programming Algorithms for Two-Path Routing Problems with Reliability Constraints,” submitted to INFORMS Journal on Computing.
Andreas, A.K., Smith, J.C., and Küçükyavuz, S. 2006. “A Branch-and-Price-and-Cut Algorithm for Solving the Reliable h-paths Problem,” submitted to Operations Research.
Askin, R., Shravan, K., and Connolly, T. “Evaluating the Effect of Random Outcomes and End of Horizon Effects in Sequential Decision Processes,” in preparation.
Bahill, T. and Botta, R. 2006. “Fundamental Principles of Good System Design,” submitted to Journal of Engineering Design.
Bahill, T., Botta, R., and Daniels, J. 2006. “The Zachman Framework Populated with Baseball Models,” submitted to Journal of Enterprise Architecture.
Bahill, T., Szidarovszky, F., Botta, R., and Smith, E. 2006. “Valid Models Require Defined Levels,” submitted to International J. of General Systems.
Bearden, J. N. 2006. “A New Secretary Problem with Rank-Based Selection and Cardinal Payoffs.” Journal of Mathematical Psychology 50 58-59.
Bearden, J. N., Connolly, T. “Optimal Satisficing,” in press. MURI Chapter.
Bearden, J. N., Connolly, T. “Satisficing in Sequential Search,” under 2nd review for Organizational Behavior & Human Decision Processes.
Bearden, J. N., Connolly, T. “On the Robustness of Satisficing Search Policies,” in preparation.
Bearden, J. N., Lim, C., Smith, J. C. “Experimental Tests of Optimal Sequencing of Alternatives,” in preparation.
Bearden, J. N., and Rapoport, A. 2005. “Operations Research in Experimental Psychology.” In J. C. Smith (Ed.), Tutorials in Operations Research: Emerging Theory, Methods, and Applications, 213-236. INFORMS: Hanover, MD.
Bearden, J.N., Rapoport, A., Murphy, R.O. “Sequential Observation and Selection with Rank-dependent Payoffs: An Experimental Test,” in press Management Science.
Bearden, J.N., Rapoport, A., Murphy, R.O. 2006. “Sequential Selection and Assignment: An Experimental Study.” Journal of Behavioral Decision Making 19 229-250.
Bearden, J.N., Murphy, R.O., Rapoport, A. 2005. “A Multi-Attribute Extension of the Secretary Problem: Theory and Experiments.” Journal of Mathematical Psychology 49 410-425.
Bearden, J. N., Murphy, R.O., Rapoport, A. “Decision Biases in Revenue Management: Some Laboratory Evidence,” under 1st review Manufacturing & Service Operations Management.
Bischi, G., L. Sbragia and F. Szidarovszky. 2006. “Learning the Demand Function in a Repeated Cournot Oligopoly Game,” submitted to Mathematical and Computer Modelling.
Botta, R. and Bahill, T. 2006. “A Prioritization Process,” submitted to Systems Engineering. (Also presented at INCOSE Symposium 2006.)
Botta, R., Bahill, Z., and Bahill, T. 2006. “When are Observable States Necessary?,” Systems Engineering, Vol. 9, No. 3, pp. 228-240.
Casey, M. and Sen, S. 2005. “The Scenario Generation Algorithm for Multi-stage Stochastic Linear Programming,” Mathematics of Operations Research, 30, pp. 615-631.
Chiarella, C. and Szidarovszky, F. 2006. “Discrete Dynamic Oligopolies with Intertemporal Demand Interactions,” working paper, SIE Department, University of Arizona.
Chiarella, C. and Szidarovszky, F. 2006. “Dynamic Oligopolies with Production Adjustment Costs,” submitted to Journal of Economic Behavior and Organization.
Connolly, T., Butler, D. 2006. “Regret in Economic and Psychological Theories of Choice.” Journal of Behavioral Decision Making 19 139-154.
Connolly, T., Reb, J. “Decision Justifiability and Anticipated Regret,” under 1st review Journal of Behavioral Decision Making.
Desai, J., Missoum, S., Sen, S. and Gupte, A. 2006. “A Multi-disciplinary Design Optimization Algorithm for Autonomous Sub-systems,” AIAA Conference, 2006.
Desai, J., Huang, K., Sen, S. 2006. “The Network Interdiction Game,” in preparation.
Desai, J. and Sen, S. 2006. “A Global Optimization Algorithm for Reliable Network Design,” submitted to Networks.
Genc, T., Reynolds, S. and Sen, S.
“Dynamic Oligopolistic Games Under Uncertainty: A Stochastic Programming Approach,” to appear in Journal of Economic Dynamics and Control.
Huang, K. and Sen, S. 2006. “Dynamic Estimation of Network Vulnerability Under Attacks,” in preparation.
Huang, K., Sen, S. and Szidarovszky, F. 2006. “Connections Among Decision Field Theory Models,” working paper, MORE Institute, University of Arizona, Tucson, AZ 85721.
Huang, K., Sen, S. and Zhou, Z. 2006. “Stochastic Decomposition and Extensions,” invited paper in honor of G.B. Dantzig.
Jin, J., Guo, H., and Zhou, S. 2006. “Statistical Process Control Based Supervisory Generalized Predictive Control of Thin Film Deposition Processes,” Journal of Manufacturing Science and Engineering, Vol. 128, No. 1, pp. 315-325.
Jin, J. and Son, Y. “Decision Support Systems for Monitoring and Control of Multivariate Systems,” presented in AFOSR Cognition & Decision Program Review Workshop, Fairborn, April, 2006.
Lee, S., Shendarkar, A., Son, Y., and Jin, J. 2006. “BDI-based Human Decision-Making Model in Shooting Problem,” working paper, SIE Department, University of Arizona.
Lee, S. and Son, Y. 2006. “Decision Field Theory Extension for Complex Dynamic Environment,” working paper, SIE Department, University of Arizona. (Also to be presented at INFORMS Annual Meeting, Pittsburgh, November 2006).
Li, J., Jin, J. and Shi, J. 2006. “Causation-Based $T^2$ Decomposition for Multivariate Process Monitoring and Diagnosis,” proceedings of Industrial Engineering Research Conference, 2006, Orlando. (Also received the Best Paper Award from the conference).
Lim, C., Bearden, J.N., Smith, J.C. 2006. “Sequential Search with Multi-attribute Options.” Decision Analysis 3 3-15.
Lim, C., and Smith, J.C. 2006. “Algorithms for Discrete and Continuous Multicommodity Flow Network Interdiction Problems,” to appear in IIE Transactions.
Lulli, G. and Sen, S. 2004.
“A Branch and Price Algorithm for Multi-stage Stochastic Integer Programs with Applications to Stochastic Lot Sizing Problems,” Management Science, 50, pp. 786-796.
Mason, R., Tracy, N. and Young, J. 1995. “Decomposition of $T^2$ for Multivariate Control Chart Interpretation.” Journal of Quality Technology 27, pp. 99-108.
Merlone, U., Szidarovszky, F. and Szilagyi, M. N. 2006. “Finite Neighborhood Games with Binary Choices,” submitted to International Game Theory Review.
Ntaimo, L. and Sen, S. 2005. “The Million Variable “March” for Stochastic Combinatorial Optimization,” Journal of Global Optimization, 32, no. 3.
Oakes, J., Botta, R., and Bahill T. 2006. “Technical Performance Measures,” presented at INCOSE Symposium 2006.
Okuguchi, K. and Szidarovszky, F. 2006. “Existence and Uniqueness of Equilibrium in Labor-managed Cournot Oligopoly,” presented and published in the proceedings of the International Game Theory Conference in Zaragoza, Spain, July 2006.
Rapoport, A., Kugler, T., Dugar, S., and Gisches, E. 2006a. “Braess Paradox in the Laboratory: An Experimental Study of Route Choice in Traffic Networks with Asymmetric Costs,” to appear in Decision Modeling and Behavior in Uncertain and Complex Environments (T. Kugler, J.C. Smith, Y.-J. Son, T. Connolly, eds), Springer.
Rapoport, A., Kugler, T., Dugar, S., and Gisches, E. 2006b. “Choice of Routes in Congested Traffic Networks: Experimental Tests of the Braess Paradox,” submitted to Games and Economic Behavior.
Rapoport, A., Mak, V., Zwick, R. “Navigating Congested Networks with Variable Demand: Experimental Evidence,” in press Journal of Economic Psychology.
Seale, D. A., Parco, J. E., Stein, W. E., Rapoport, A. 2005. “Joining a Queue or Staying Out: Effects of Information Structure and Service Time on Arrival and Exit Decisions.” Experimental Economics 8 117-144.
Sen, S. 2005. “Algorithms for Stochastic Mixed-Integer Programming Models,” Chapter 9, Handbook of Discrete Optimization, (K. Aardal, G.L.
Nemhauser, and R. Weismantel eds.) North-Holland Publishing Co.
Sen, S. and Higle, J.L. 2005. “The C³ Theorem and a D² Algorithm for Large Scale Stochastic Integer Programming.” Mathematical Programming, 104, pp. 1-20.
Sen, S. and Sherali, H.D. 2006. “Decomposition with Branch-and-Cut Approaches for Two Stage Stochastic Integer Programming,” Mathematical Programming, 106, pp. 203-223.
Shendarkar, A., Vasudevan, K., Lee, S. and Son, Y. 2006. “Crowd Simulation for Emergency Response using BDI Agent Based on Immersive Virtual Reality,” submitted to Simulation Modelling Practice and Theory. (Also to be presented at Winter Simulation Conference, Monterey, December 2006).
Sherali, H.D. and Smith, J.C. 2006. “Two-Stage Stochastic Risk Threshold and Hierarchical Multiple Risk Problems: Models and Algorithms,” submitted to Mathematical Programming.
Smith, E. 2006. “How Cognitive Biases Affect Tradeoff Studies,” Ph.D. Dissertation at The University of Arizona, Tucson, AZ.
Smith, E. and Bahill, T. “Tradeoff Studies and Cognitive Biases,” presented at INCOSE Symposium July 2006.
Smith, E., Son, Y., and Bahill, T. 2006a. “Ameliorating the Effects of Cognitive Biases on Tradeoff Studies,” submitted to Systems Engineering.
Smith, E., Szidarovszky, F., Karnavas, J., and Bahill, T. 2006b. “Sensitivity Analysis, a Powerful System Validation Tool,” submitted to IEEE SMC.
Smith, E.D., Szidarovszky, F., Karnavas, W.J. and Bahill, T. 2006. “Sensitivity Analysis, a Powerful System Validation Tool,” submitted to IEEE SMC, April 26, 2006.
Smith, J. C., Bearden, J. N., Lim, C. “Optimal Sequencing of Alternatives,” in preparation.
Smith, J. C., Lim, C., Alptekinoglu, A. 2006. “Protection of Assets Against Intelligent Opponents,” in preparation for submission to Management Science.
Smith, J.C., and Lim, C. 2006. “Algorithms for Network Interdiction and Fortification Games,” to appear in Pareto Optimality, Game Theory and Equilibria (A. Migdalas, P.M. Pardalos, L. Pitsoulis, and A.
Chinchuluun, eds), Springer.
Smith, J.C., Lim, C., Bearden, J.N. “On the multi-attribute stopping problem with general value functions,” in press Operations Research Letters.
Smith, J.C., Lim, C., and Sudargho, F. 2006. “Survivable Network Design Under Optimal and Heuristic Interdiction Scenarios,” to appear in Journal of Global Optimization.
Son, Y. and Jin, J. “Extended BDI Framework and Technologies for Modeling Partial Human Decision-Making,” presented in AFOSR Cognition & Decision Program Review Workshop, Fairborn, April, 2006.
Stein, W. E., Rapoport, A., Seale, D. E., Zhang, H., Zwick, R. “Batch Queues with Choice of Arrivals: Equilibrium Analysis and Experimental Study,” in press Games and Economic Behavior.
Szidarovszky, F. 2006. “Delayed Nonlinear Cournot and Bertrand Dynamics with Product Differentiation,” working paper, SIE Department, University of Arizona.
Szidarovszky, F. 2005. “Extended Oligopoly Models and Their Asymptotical Behavior,” presented at the Nonlinear Economic Dynamics 2005 Conference, July 28-30, 2005, Urbino, Italy.
Szidarovszky, F. and Zhao, J. 2006. “Dynamic Oligopolies with Intertemporal Demand Interaction,” submitted to International Journal of Computers and Mathematics.
Vasudevan, K. and Son, Y. 2006. “Evaluating Egress Schemes for Factory Layouts in a Virtual Reality CAVE Environment,” working paper, SIE Department, University of Arizona.
Xu, Y. and Sen, S. 2005. “A Distributed Computing Architecture for Simulation and Optimization,” proceedings of the Winter Simulation Conference (M.E. Kuhl, N.M. Steiger, F.B. Armstrong, J.A. Jones, eds.).
Yousefi, S. and Szidarovszky, F. 2006. “Once more on Price and Quantity Competition in Differentiated Duopolies: A Simulation Study,” accepted in Pure Mathematics and Applications.
Zhao, J. and Szidarovszky, F. 2006. “N-firm Oligopolies with Production Adjustment Costs: Best Responses and Equilibrium,” submitted to Journal of Economic Behavior and Organization.
Zhao, J., Szidarovszky, F.
and Szilagyi, M. N. 2006. “Finite Neighborhood Binary Games: A Structural Study,” submitted to Artificial Societies and Social Simulation.
Zhao, J., Szidarovszky, F. and Szilagyi, M. N. 2006. “Repeated Prisoner’s Dilemma and Battle of Sexes Games: A Simulation Study,” (T. Kugler, J.C. Smith and T. Connolly, eds.).
Zhao, L. and Sen, S. 2006. “A Comparison of Sample-path Based Simulation-Optimization and Stochastic Decomposition for Multi-location Transshipment Problems,” to appear in Proceedings of the 2006 Winter Simulation Conference, (L.F. Perrone et al, eds.).
Zhao, X. and Son, Y. 2006. “BDI-based Human Decision-Making Model in Automated Manufacturing Systems,” accepted for International Journal of Modeling and Simulation.
Zhou, S., and Jin, J. 2005. “Automatic Feature Selection for Unsupervised Clustering of Cycle-based Signals in Manufacturing Processes,” IIE Transactions on Quality and Reliability, Vol. 37, pp. 569-584.
Zhou, S., Jin, N., and Jin, J. 2005. “Cycle-based Signal Monitoring Using A Directionally Variant Control Chart System,” IIE Transactions on Quality and Reliability, Vol. 37, pp. 971-982.

Citations from the Literature

Bertsekas, D. P., Tsitsiklis, J. N. 1997. “Neuro-Dynamic Programming.” Athena Scientific.
Bogacz, R., Brown, E., Moehlis, J., Holmes, P. and Cohen, J.D. 2006. “The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced choice tasks,” Psychological Review, in press.
Busemeyer, J.R., Jessup, R.K., Johnson, J.G., Townsend, J.T. 2006. “Building Bridges between Neural Models and Complex Decision Making Behavior,” working paper, Indiana University.
Busemeyer, J.R. and Townsend, J.T. 1992. “Fundamental derivations from decision field theory,” Mathematical Social Sciences, 23:255-282.
Busemeyer, J. R., Townsend, J. T. 1993. “Decision field theory: A Dynamic-cognitive Approach to Decision Making in an Uncertain Environment.” Psychological Review 100: 432-459.
CMMI, "Capability Maturity Model Integration," 2006. Retrieved March 2006 from Software Engineering Institute: http://www.sei.cmu.edu/cmmi/.
DAR, "DAR basics: Applying decision analysis and resolution in the real world," 2004. Retrieved March 2006 from Software Engineering Institute: http://www.sei.cmu.edu/cmmi/presentations/sepg04.presentations/dar.pdf.
Diederich, A. 1997. “Dynamic Stochastic Models for Decision Making Under Time Constraints.” Journal of Mathematical Psychology, 41(3), 260-274.
Diederich, A. and Busemeyer, J.R. 2003. “Simple matrix methods for analyzing diffusion models of choice probability, choice response time, and simple response time,” Journal of Mathematical Psychology, 47:304-322.
MacLennan, B. 1996. “Field Computations in Motor Control,” in Self Organization, Computational Maps, and Motor Control (P.G. Morasso, V. Sanguineti, eds.), North-Holland Publishing Co.
Rao, A.S., Georgeff, M.P. 1998. “Decision Procedures for BDI Logics.” Journal of Logic and Computation, 293-342.
Roe, R., Busemeyer, J.R., and Townsend, J.T. 2001. “Multialternative Decision Field Theory: A Dynamic Connectionist Model of Decision Making.” Psychological Review, 108(2), 370-392.
Smith, P.L. and Ratcliff, R. 2004. “Psychology and neurobiology of simple decisions.” Trends in Neurosciences, 27(3):161-168.
Steinbock, O., Toth, A. and Showalter, K. 1995. “Navigating complex labyrinths: optimal paths from chemical waves,” Science, 267: 868-871.
Sutton, R., Barto, A. 1998. “Reinforcement Learning.” MIT Press, Cambridge.
Tversky, A.V. and Kahneman, D.V. 1992. “Advances in prospect theory: Cumulative representation of uncertainty,” Journal of Risk and Uncertainty, 5, 297-323.

Section 7: Interactions / Transitions

Part a: The following is a list of significant seminars and conference participation relevant to this grant.
• Bearden, J.N.
“Tutorial: Empirical versus Optimal Decision-Making in Sequential Search Problems,” INFORMS conference, November 2005.
• Bearden, J.N. and Rapoport, A. “Behavioral Operations Research,” Behavioral Decision Making in Management Conference, Santa Monica, June 2006.
• Bearden, J.N. was invited to speak at the conference “Behavioral Research in Operations and Supply Chain Management,” Penn State, June 2006.
• Botta, R. and Bahill, A.T. “A Prioritization Process,” proceedings of the 16th Annual International Symposium of the International Council on Systems Engineering (INCOSE), Orlando, FL, July 9-13, 2006, (refereed).
• Casey, M. and Sen, S. “Dynamic Stochastic Programming with Decision Policies (DSPDP),” INFORMS Conference, November 2005.
• Connolly, T. and Butler, D. “Regret in Economic and Psychological Theories of Choice,” Society for Judgment & Decision Making meeting, Toronto, Nov 2005.
• Connolly, T. and Reb, J. “Mental Appropriation, Subjective Ownership, and Decision Making,” Behavioral Decision Making in Management Conference, Santa Monica, June 2006.
• Desai, J. “Convexification-based Global Optimization Algorithms in Emergency Response Allocation Problems,” INFORMS conference, November 2005.
• Genc, T. and Sen, S. “Economic Interpretations for Resource Allocation Models in the Presence of Indivisible Goods,” INFORMS Conference, November 2005.
• Huang, K. “The Planning Horizon of the Infinite Horizon Stochastic Lot-Sizing Problem,” INFORMS conference, November 2005.
• Kucukyavuz, S. “Facets of the Lot-Sizing Polyhedron with Backlogging,” INFORMS conference, November 2005.
• Kucukyavuz, S. “Facets of the Lot-Sizing Polyhedron with Backlogging,” ISMP 2006.
• Kucukyavuz, S. “Stochastic Lot-Sizing Problem with Random Lead Times,” ISMP 2006.
• Kugler, T., and Rapoport, A. “Public Good Provision in Inter-group Conflicts: Effects of Asymmetry and Profit-sharing Rules.” Annual International Meeting of the Economic Science Association. Montreal, Canada.
June 23-June 26, 2005.
• Kugler, T., and Rapoport, A. “Choice of Routes in Traffic Networks.” Workshop on “Decision Modeling and Behavior in Uncertain and Complex Scenarios.” University of Arizona. February 27-28, 2006.
• Lee, S., Zhao, X., Shendarkar, A., Vasudevan, K., and Son, Y. “Epoch Time Synchronization Method with Continuous Update for Distributed Supply Chain Simulation,” 12th IFAC Symposium on Information Control Problems in Manufacturing, Saint-Etienne, France, May, 2006.
• Oakes, J., Botta, R. and Bahill, A.T. “Technical Performance Measures,” proceedings of the 16th Annual International Symposium of the International Council on Systems Engineering (INCOSE), Orlando, FL, July 9-13, 2006, (refereed).
• Rathore, A., Balaraman, B., Zhao, X., Baek, S., Venkateswaran, J., Son, Y., and Wysk, R. “Development and Benchmarking of an Epoch Time Synchronization Method for Distributed Simulation,” IFIP 5.7 Conference, Rockville, MD, September, 2005.
• Rapoport, A., Kugler, T. “Route Choices in Traffic Networks.” Annual International Meeting of the Economic Science Association. Montreal, Canada, June, 2005.
• Rapoport, A. “Choice of routes in congested traffic networks.” Department of Economics, Osaka University, Japan. Invited seminar. November 2, 2005.
• Rapoport, A. “Choice of routes in congested traffic networks.” Invited seminar. Faculty of Management and Industrial Engineering, Technion, Israel Institute of Technology, December 14, 2005.
• Rapoport, A. “Choice of Routes in Traffic Networks.” Department of Information Management, Yuan Ze University, Taiwan. Invited seminar. January 6, 2006.
• Rapoport, A. “Choice of Routes in Congested Traffic Networks: Experimental Tests of the Braess Paradox.” Decision Analysis Taiwan National Conference. Yuan Ze University. Keynote address. January 7, 2006.
• Rapoport, A.
“Choice of Routes in Congested Traffic Networks: Experimental Tests of the Braess Paradox.” Inaugural Asia-Pacific Meeting of the Economic Science Association. Hong Kong University of Science and Technology. Keynote address. January 23-25, 2006.
• Rapoport, A. “Embedding Social Dilemmas in Intergroup Competitions Reduces Free Riding.” Inaugural Asia-Pacific Meeting of the Economic Science Association. Hong Kong University of Science and Technology. January 23-25, 2006.
• Sen, S. “A Comparative Study of Decomposition Algorithms for Stochastic Mixed-Integer Programming,” INFORMS, San Francisco, November, 2005.
• Sen, S. (Panelist) “Creating a Testbed of Industry Problems for OR Model and Algorithm Development,” INFORMS Conference, November 2005.
• Sen, S. "Service Enterprise Engineering," Keynote Lecture, First IEEE-Service Operations, Logistics and Transportation Conf. Beijing, China, August, 2005.
• Sen, S. "Operations Research: The Glue for Infrastructure Systems," Opening Keynote Lecture, Operations Research Society of India National Meeting, Bangalore, India, December 2005.
• Sen, S. "Stochastic Server Location and Related Stochastic Network Design Problems," Risk Symposium, Santa Fe, NM. 2006.
• Sen, S. “Algorithms for Stochastic Combinatorial Optimization,” University of Wisconsin, March 2006.
• Sen, S. "On Connections Between Differential Dynamic Programming and Nested Benders' Decomposition," NSF Workshop on Approximate Dynamic Programming, Cocoyoc, Mexico, April 2006.
• Smith, E.D. and Bahill, A.T. “Tradeoff Studies and Cognitive Biases,” proceedings of the 16th Annual International Symposium of the International Council on Systems Engineering (INCOSE), July 9-13, 2006, Orlando, FL, (refereed).
• Smith, J.C., “Survivable Network Design under Various Interdiction Scenarios,” International Workshop on Global Optimization, September 2005, San José, Spain, (refereed).
• Smith, J.C., “A Mixed-Integer Programming Model and Algorithm for Determining the Branchwidth of a Graph,” INFORMS conference, November 2005.
• Smith, J.C., “Optimization Methods for Routing Problems on Networks with Stochastic Failures,” Invited Lecture, Auburn University, February 2006, Auburn, AL.
• Smith, J.C. "Network Design Under Varying Interdiction Behavior," Risk Symposium 2006, Santa Fe, March 2006.
• Son, Y. “Development of an Epoch Time Synchronization Method for Distributed Simulation,” INFORMS conference, November 2005.
• Son, Y., Kulvatunyou, B., Cho, H., and Feng, S. “A Semantic Web Service and Simulation Framework to Intelligent Distributed Manufacturing,” ASME IMECE 2005, Orlando, FL, November, 2005.
• Son, Y. and Jin, J. “Extended BDI Framework and Technologies for Modeling Partial Human Decision-Making,” AFOSR Cognition & Decision Program Review Workshop, Fairborn, OH, April, 2006.
• Son, Y., Venkateswaran, J., and Askin, R. “Federation of Multi-resolution Hybrid Models for Hierarchical Supply Chain Planning,” INFORMS International 2006, Hong Kong, June, 2006.
• Venkateswaran, J., Son, Y., Jones, A., Min, J. “Production and Distribution Planning for Dynamic Supply Chains,” IFIP 5.7 Conference, Rockville, MD, September, 2005.
• Venkateswaran, J. and Son, Y. “Information Synchronization Effects on the Stability of Collaborative Supply Chain,” Winter Simulation Conference 2005, Orlando, FL, December, 2005.
• Young-Jun Son gave an invited lecture at LG Electronics, Pyung-Taek, Korea, July 2006.
• Young-Jun Son gave an invited tutorial at 2006 IIE Annual Conference, Orlando, FL, May 2006.
• Zhao, X. and Son, Y. “BDI-Agent based Human Decision-making Software Models in Distributed Computing Platform,” 2006 IIE annual conference, Orlando, FL, May, 2006.
• Rapoport, A. and Bearden, J. Round Table Participants, XII (12th International Conference on the Foundations & Applications of Utility, Risk and Decision Theory). Rome, Italy, July, 2006.

Part b:
The MURI team (headed by Smith, Son, and Connolly) held a workshop involving approximately 25 speakers in the area of “Behavioral and Mathematical Decision Modeling” in February, 2006. A collection of papers presented at this workshop will be published by Springer as a book edited by Kugler, Smith, Connolly, and Son.

Section 8: New Discoveries

Nothing to report (outside of the various technological advances reported in section 4).

Section 9: Honors/Awards

Ron Askin is a Fellow of the Institute of Industrial Engineers.
A. Terry Bahill is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and a Fellow of the International Council on Systems Engineering (INCOSE).
Terry Connolly is a Fellow of the American Psychological Society.
Judy Jin received the Best Paper Award at the Industrial Engineering Research Conference.
Suvrajeet Sen was selected to be a Fellow of INFORMS.
Young-Jun Son received the Outstanding Young Industrial Engineer Award from the Institute of Industrial Engineers (IIE grants the award to at most one person each year).