| 1/29 | John de Figueiredo (Sloan School, MIT) Academic Earmarks and the Returns to Lobbying Authors: John M. de Figueiredo and Brian S. Silverman Abstract: In this paper, we statistically estimate the returns to lobbying by universities for educational earmarks (which now represent 10% of federal funding of university research). The returns to lobbying approximate zero for universities not represented by a member of the Senate Appropriations Committee (SAC) or House Appropriations Committee (HAC). However, the average lobbying university with representation on the SAC receives an average return to one dollar of lobbying of $11-$17; lobbying universities with representation on the HAC obtain $20-$36 for each dollar spent. Moreover, we cannot reject the hypothesis that lobbying universities with HAC or SAC representation set the marginal benefit of lobbying equal to its marginal cost, although the large majority of universities with representation on the HAC and SAC do not lobby, and thus do not take advantage of their representation in Congress. On average, 45 percent of universities are predicted to choose the optimal level of lobbying. In addition to addressing questions about the federal funding of university research, we also discuss the impact of our results for the structure of government. |
| 2/5 | Gary King (Government, Harvard University) Enhancing the Validity and Cross-cultural Comparability of Survey Research Authors: Gary King, Christopher J.L. Murray, Joshua A. Salomon, and Ajay Tandon Abstract: We offer a new approach to writing survey questions and a new statistical model that together at least partially ameliorate two long-standing problems in survey research. The first is how to measure complicated concepts, such as freedom, health, political efficacy, pornography, etc., that researchers know how to define clearly only with reference to examples. The second problem is when different respondents interpret identical survey questions in incomparable ways, as can occur when comparing respondents in different countries speaking different languages, but it also occurs frequently with different groups in the same country. Our approach to these problems is to ask respondents for self-assessments of the concept being measured along with assessments, on the same scale, of each of several hypothetical individuals described by short vignettes. The actual (but not necessarily reported) levels for the people in the vignettes are, by the design of the survey, invariant over respondents and thus provide anchors for our statistical model to transform the self-assessments to a comparable scale. With analysis, simulations, and real surveys in several countries, we show how ignoring these problems can lead to the wrong substantive conclusions and how our approach can fix them. Our methods build on insights from application-specific research on voters and legislators in political science to produce a more general measurement device. (You may also be interested in the Anchoring Vignettes web site which includes information about conferences on the subject, a FAQ, software, example vignettes, and other materials.) |
| 2/12 | James Fowler (Government, Harvard University) Connecting the Dots: How Do We Impute the Structure and Content of Large Social Networks? |
| 2/19 | Mark Glickman (Mathematics and Statistics, Boston University) Combining Speed and Accuracy to Assess Error-free Cognitive Processes Authors: Mark Glickman, Jeremy Gray (Washington Univ in St. Louis), Carlos Morales (Worcester Polytechnic Institute) Abstract: Many experiments on human cognition involve having a subject make a judgment as quickly and accurately as possible. Both reaction times and error rates are widely used indices of human performance in such experiments. A difficulty in relying on either one of these indices alone is the problem of a speed/accuracy tradeoff; subjects who react quickly are more likely to have higher error rates, whereas subjects who are more accurate are likely to have slower reaction times. Another difficulty arises when subjects respond slowly and inaccurately (rather than quickly but inaccurately), e.g., due to a lapse of attention. We introduce an approach that combines response time and accuracy information that addresses both situations. The modeling framework assumes two latent competing processes. The first, the error-free process, always produces correct responses. The second, the residual process, results in all observed errors and some of the correct responses (but does so via non-specific processes, such as guessing in compliance with instructions to respond on each trial). Inferential summaries of the speed of the error-free process provide a principled assessment of cognitive performance reducing the influences of both fast and slow guesses. Likelihood analysis is discussed for the basic model and extensions. |
| 2/26 | Donna Spiegelman (Epidemiology and Biostatistics, Harvard School of Public Health) Correlated errors in biased surrogates: study designs and methods for measurement error correction Abstract: The measurement error model proposed by Kipnis et al. (1999) allows for correlations between subject-specific biases and between random within-subject errors in the surrogates obtained from two modes of measurement. However, most of these model parameters are not identifiable from the standard validation study design, including, importantly, the attenuation factor needed to correct for bias in relative risk estimates due to measurement error using the method of regression calibration (Rosner et al., 1989). We propose validation study designs that permit estimation and inference for the attenuation factor and other parameters of interest when this model applies. We use an estimating equations framework to develop semi-parametric estimators for these parameters, exploiting instrumental variables techniques. The methods are illustrated through application to data from the Nurses' Health Study (Willett et al., 1992) and Health Professionals' Follow-up Study (Grobbee et al., 1990) and comparisons are made to simpler models. |
| 3/5 | Donald Rubin (Statistics, Harvard University) |
| 3/12 | Scott Desposato (Political Science, University of Arizona) Correcting for Bias in Roll-Call Cohesion Scores Abstract: Cohesion scores are a standard measure of legislative behavior and common in legislative studies. Their strengths explain their wide use: cohesion scores are simple, intuitive, and easy to calculate. Perhaps because of this, little attention has been paid to cohesion scores' statistical properties or relationship to existing theories of legislative behavior. In this paper, I show how under a basic random utility model of legislative behavior, cohesion scores suffer a serious bias problem: scores are artificially inflated for small parties and weak parties. This bias challenges a consistent finding in the comparative literature on political parties: that small parties are consistently more disciplined than large parties. I propose an intuitive solution that will eliminate bias for a wide variety of models: making large parties smaller by drawing samples without replacement from them. |
| 3/19 | David Lazer and Nancy Katz (John F. Kennedy School of Government, Harvard University) Putting the Network into Teamwork |
| 4/2 | Mary Beth Landrum (Health Care Policy, Harvard University) |
| 4/9 | Tao Li (Government, Harvard University) Legislative Rule Selection: a Study about US House of Representatives |
| 4/16 | Jasjeet Sekhon (Government, Harvard University) Robust Estimation and Outlier Detection for Overdispersed Multinomial Models of Count Data Abstract: We develop a robust estimator, the hyperbolic tangent (tanh) estimator, for overdispersed multinomial regression models of count data. The tanh estimator provides accurate estimates and reliable inferences even when the specified model is not good for an unknown minority of the data. Seriously ill-fitted counts (outliers) are identified as part of the estimation. A Monte Carlo sampling experiment shows that the tanh estimator produces good results at practical sample sizes even when ten percent of the data are generated by a significantly different process. Theoretical results suggest that asymptotically the estimator will produce good results when up to half of the data are contaminated. The experiment shows that, with contaminated data, estimation fails using four other estimators: the nonrobust maximum likelihood estimator, the additive logistic model and two SUR models. Using the tanh estimator to analyze data from Florida for the 2000 presidential election matches well-known features of the election that the other four estimators fail to capture. In an analysis of data from the 1993 Polish parliamentary election, the tanh estimator gives sharper inferences than does a previously proposed heteroscedastic SUR model. |
| 4/23 | Elizabeth Stuart (Statistics, Harvard University) Matching and the Use of Multiple Control Groups in the Context of Causal Inference Abstract: In observational studies, it is desirable to reduce bias due to covariates by obtaining treated and control groups with similar distributions of the covariates. This is often done by choosing well matched samples of the original treated and control groups. However, sometimes the originally chosen control units cannot provide adequate matches for the treated units. In these cases, it may be desirable to obtain matched controls from two control groups. Multiple control groups have been used in the context of causal inference to test for hidden biases; however, little work has been done on their use in matching or adjustment for these biases. In this talk, we address two issues associated with the use of multiple control groups. The first relates to longitudinal data and the necessity to define "baseline" for non-randomized control units. This will be discussed in the context of a randomized clinical trial for a drug that may become commercially available, thus invalidating the traditional use of the control group. Historical data on patients with the disease will be used to supplement the original randomized control group. The second issue discussed will be how to quantify differences in unobserved variables between multiple control groups. Quantifying this bias can help guide the use of multiple control groups. This topic will be discussed in the context of the evaluation of a school-wide dropout prevention program where students in the original treated and control schools were significantly different from one another. The method explores the use of external national data on high school students in addition to the local control students. This is work in progress; no paper is available. |
| 4/30 | Lee Fleming (Business School, Harvard University) Small Worlds Enhance Innovation Authors: Lee Fleming, Charles King III, Adam Juda Abstract: We'll first describe our data set of the collaboration networks of 2.1 million inventors from the last 30 years. We'll then present a short paper that looks at the influence of small world networks and indirect ties upon regional productivity (i.e., is Silicon Valley really different from Boston, and does it matter?). Finally, we'll sketch a preliminary research design for doing analysis at the individual inventor level and open it up to a discussion of the potential and problems at this level of analysis. |