Over at Greg Laden’s blog, there is notice of a study that uses epidemiological procedures to look at the relationship between gun possession and being injured in a gun assault. The paper is titled “Investigating the Link Between Gun Possession and Gun Assault”.
Source: Branas, C., Richmond, T., Culhane, D., Ten Have, T., & Wiebe, D. (2009). Investigating the Link Between Gun Possession and Gun Assault. American Journal of Public Health. DOI: 10.2105/AJPH.2008.143099
The researchers concluded,
On average, guns did not protect those who possessed them from being shot in an assault. Although successful defensive gun uses occur each year, the probability of success may be low for civilian gun users in urban areas. Such users should reconsider their possession of guns or, at least, understand that regular possession necessitates careful safety countermeasures.
I started writing a comment at Greg’s blog, but it just kept going and going, so I’m making a post here instead.
I will start by saying that after looking it over pretty carefully, I am unimpressed with this study. It appears to turn a non-significant result about gun possession into headlines by a series of steps that cannot be replicated from the description in the methods.
I’ve read the full paper, and the logic and math by which they arrived at even their numerical results are incompletely described, to say nothing of the further conclusions that they draw. Of course, my own time in epidemiology research was brief, and it was over twenty years ago, so I’ve done a bit of review, too. They used a case-control study design. Because most of their data is nominal, not numerical, they employ “conditional logistic regression”. They mention models, but provide neither descriptions of the particulars of the models nor any parameters. (Cf. the “Statistical Procedures” part of the Methods section, where they describe a series of models and regression analyses, but without specificity.) As far as I can tell, even someone with a similar dataset to work with would not be able to fully replicate the procedure taken in this paper simply from the paper’s description of its methods. The journal site does not have a link for supplemental materials, so there does not appear to be any more extensive description of data, methods, or results than is present in the paper itself.
Their dataset comprises only shooting incidents. This limits which question they may reasonably be said to have addressed, as I will discuss in more detail later.
[…] this in mind, we conducted a population-based […] to investigate the relationship between being injured with a gun in an assault and an individ[…] both fatal and nonfatal outcomes and accounted for a variety of individual and situational confounders also measured at the time of assault.
Even after these exclusions, the study only needed a subset of the remaining shootings to test its hypotheses. A random number was thus assigned to these remaining shootings, as they presented, to enroll a representative one third of them.
If this struck you as a strange dataset on which to base conclusions about whether having a gun in your possession at the time of an attack changes the odds of being shot, you are not alone. I’ll run through what the case-control method does. A case-control analysis seeks to determine how strong an association there is between a risk factor and an outcome. In epidemiology, this is put in terms of exposure to a risk factor and whether the person has a disease (the outcome). Cases are the instances of people with the outcome/disease. Controls are people without the outcome/disease. The standard way to split things up is in a 2×2 table.
Given numbers to put in for A, B, C, and D (A and B being the exposed and unexposed cases, C and D the exposed and unexposed controls), one can then compute an “odds of exposure” for cases (A/B) and for controls (C/D), and an “odds ratio” for the risk factor as (A*D)/(B*C). The overall “odds ratio” serves as an estimate of the relative risk faced by people who are exposed to the risk factor. If the odds of exposure are higher among the cases than among the controls, then the odds ratio is greater than 1 and indicates an association of the risk factor with the disease or outcome. If it goes the other way, the ratio is less than 1 and likewise indicates an association, but one in which the risk factor is somehow correlated with reduced incidence of the disease or outcome. (Description above based on the fine page here.)
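As a minimal sketch of that arithmetic, here is the 2×2 calculation in Python. The cell layout (A = exposed cases, B = unexposed cases, C = exposed controls, D = unexposed controls) is inferred from the formulas above, and the counts are hypothetical placeholders, not numbers from any study.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 case-control table.

    Assumed layout (inferred from the formulas in the text):
        a = exposed cases      b = unexposed cases
        c = exposed controls   d = unexposed controls
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
a, b, c, d = 30, 70, 10, 90

case_odds = a / b      # odds of exposure among cases
control_odds = c / d   # odds of exposure among controls

print(f"odds of exposure, cases:    {case_odds:.3f}")
print(f"odds of exposure, controls: {control_odds:.3f}")
print(f"odds ratio:                 {odds_ratio(a, b, c, d):.3f}")
```

With these made-up counts the ratio comes out above 1, which is the pattern that would indicate an association between the exposure and the outcome.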
The authors of the present paper were pleased to note that some early work tying smoking as a risk factor to lung cancer was done with case-control analysis. I wasn’t able to grab the early paper cited, but I did find a CDC worksheet that claimed to pass along their numbers. I’ll put that here for an example. It will be laid out somewhat differently, since the work showed multiple categories of exposure, so the “unexposed” numbers I will put in the top row.
|Cigarettes/Day||Cases (had lung cancer)||Controls (did not have lung cancer)||Odds Ratio|
OK, that lays out how a historic use of the case-control method was done, and in a fairly simple way. Part of the reason the smoking/lung cancer study was so influential, though, lay in the categorization with amount of tobacco use, which goes some way toward showing a dosage-response relation on top of the lumped odds ratio of a 9x relative risk of lung cancer with smoking. One reason the case-control method worked for this was that in the smoking/lung cancer association, there was a strong signal in the data. That comes through in the magnitude of the relative risk values. Keep that in mind as we come back to consideration of the present paper.
The present study is in no way simple. Nor are its methods as given clear enough to replicate. There is, though, a table (Table 1) summarizing the basic numbers they started with in the study. It is sufficient to get the raw odds ratios for many of the conditions that they took note of or adjusted for in the study. The remaining ones are given in units that don’t permit easy calculation in terms of ratios. So here are the odds ratios based on the unprocessed, unadjusted numbers from Table 1. (Note: I’ve had to do a lot of transcriptions by hand, so there could be typos or worse, errors. I’ll fix them as they are pointed out to me.)
| Risk Factor | All Shootings | Fatal Shootings | Chance to Resist |
|---|---|---|---|
| Illicit drug involvement | 1.56 | 6.12 | 1.02 |
| Occupation: Working class | 0.521 | 0.629 | 0.507 |
One thing that stands out as a huge difference between the case group and the control group is location. The case group (the folks who got shot) were outdoors in 83% of non-fatal shootings and in 71% of fatal shootings. The control group, by contrast, were outdoors at the same time in only 9% of instances. That translates into a very high relative risk, an odds ratio of roughly 50 against the control group for the non-fatal shootings, from simply going outside. If the authors wanted an attention-grabbing lead, the “being outdoors” risk factor is what they should have played up. “Don’t Go Outside” could have been the headline, validating Philadelphia agoraphobics.
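The “being outdoors” figure can be checked from the percentages alone. This is a rough sketch that assumes the rounded percentages quoted above and that the same 9% control figure applies to both the non-fatal and the fatal comparison; the function names are mine.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    return p / (1.0 - p)


def odds_ratio_from_pct(case_pct, control_pct):
    """Unadjusted odds ratio from two percentages (case vs. control)."""
    return odds(case_pct / 100.0) / odds(control_pct / 100.0)


# Rounded percentages as quoted in the text; the 9% control figure is
# assumed to apply to both comparisons.
nonfatal = odds_ratio_from_pct(83, 9)
fatal = odds_ratio_from_pct(71, 9)

print(f"outdoors, non-fatal shootings: OR = {nonfatal:.1f}")
print(f"outdoors, fatal shootings:     OR = {fatal:.1f}")
```

On these inputs the non-fatal ratio lands just under 50 and the fatal one near 25, both far larger than anything the raw gun possession numbers show.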
For myself, I’d have expected that a dataset to do what was reported would have included assaults with a firearm both where the victim was shot and where the victim was not, with whatever “adjusting” magic was needed to control for different conditions between those two cases applied to find the effect of the victim being in possession of a firearm at the time. But this is definitely not the way they’ve described their data and methods.
The fact that exactly the class of assaults where a victim had possession of a firearm and was not shot was excluded from this study would seem to me to be a large methodological hole in the research. That makes the following statement from the discussion all the more bizarre:
Although successful defensive gun uses can and do occur,33,57 the findings of this study do not support the perception that such successes […]
But it seems to me that this was precisely the question that was at issue, and the choice of methodology reduced the authors to this flabby and, so far as I can tell, poorly substantiated claim. Again, maybe I’m missing something here, so if someone could clarify this, I would appreciate it.
Another mysterious item from the discussion:
[Alterna]tively, an individual may bring a gun to an otherwise gun-free conflict only to have that gun wrested away and turned on them.
Why is it that this is only stated as a possibility, and not actually quantified? Surely this would have come out in the investigation of the non-fatal shootings, and also in a part of the fatal ones, and could have been presented at least as a raw number, if not analyzed as well to reduce the confounding factors. Given the full pool of over 3,000 shooting incidents that they took their dataset from, this should easily have been characterized as a proportion of the dataset at the very least.
There is no simple relation between the relative numbers and the resultant risk factors that are reported. Since case-control studies are supposed to work by highlighting how a case group differs from a control group, it is hard to see just how the small, non-significant differences between these groups in the particular characteristic of interest get expanded into the fairly large risk factors given in the paper. The raw, unadjusted numbers don’t paint gun possession as being generically risky; in fact, across all cases, they show a slight association with less relative risk. So it is only through the opaque method of adjustment for confounding factors that the startling relative risk estimates for gun possession come about. That process has a lot to do with how the models mentioned are constructed, and that information is not available via the published paper.
If we assume that all “adjustments” are made to the control data, we can estimate just how much “adjustment” had to occur in order to arrive at the published relative risk numbers for the “gun possession” condition in the three different contexts. I’ll lay this out in a set of tables, too.
All Shootings

| Case (%) | Control (raw %) | Raw Odds Ratio | Adjusted Odds Ratio | Est. Adjusted % | Adjustment Factor |
|---|---|---|---|---|---|
| 5.92 | 7.16 | 0.816 | 4.46 | 1.39 | 0.195 |

Fatal Shootings

| Case (%) | Control (raw %) | Raw Odds Ratio | Adjusted Odds Ratio | Est. Adjusted % | Adjustment Factor |
|---|---|---|---|---|---|
| 8.8 | 7.85 | 1.13 | 4.23 | 2.14 | 0.272 |

Chance to Resist Shootings

| Case (%) | Control (raw %) | Raw Odds Ratio | Adjusted Odds Ratio | Est. Adjusted % | Adjustment Factor |
|---|---|---|---|---|---|
| 8.28 | 7.37 | 1.13 | 5.45 | 1.63 | 0.221 |
These tables show that the models used for “adjustment” of the raw data, if applied to the “control” raw data, would have reduced each raw control value for “gun possession” to somewhere between about one-fifth and one-quarter of its original value. This seems pretty aggressive for a model only described in general terms. This information reinforces the point that the startling and headline-grabbing statements about “gun possession” being risky are not founded upon the raw data, but instead come from the model. In order to put trust in the findings, the model needs far more transparency than this paper delivers.
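One way to reproduce that back-calculation, as a sketch: hold the case percentage fixed, ask what control percentage would yield the published adjusted odds ratio, and divide by the raw control percentage. The function names are hypothetical, and the premise that all adjustment lands on the control side is the simplifying assumption stated above, not anything the paper specifies.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    return p / (1.0 - p)


def implied_control_pct(case_pct, adjusted_or):
    """Control percentage that would give `adjusted_or` against `case_pct`.

    Assumes the case percentage is left untouched and the whole
    adjustment is applied to the control group.
    """
    target_odds = odds(case_pct / 100.0) / adjusted_or
    return 100.0 * target_odds / (1.0 + target_odds)


# (case %, raw control %, adjusted odds ratio), taken from the tables.
rows = {
    "All shootings": (5.92, 7.16, 4.46),
    "Chance to resist": (8.28, 7.37, 5.45),
}

for label, (case_pct, control_pct, adjusted_or) in rows.items():
    est = implied_control_pct(case_pct, adjusted_or)
    factor = est / control_pct
    print(f"{label}: est. adjusted % = {est:.2f}, factor = {factor:.3f}")
```

This gives about 1.39% and 1.63% for the two rows shown, in line with the tables. Run on the fatal-shootings row (8.8, 7.85, 4.23), the same function returns roughly 2.2 rather than 2.14, so that row may rest on slightly different inputs or rounding.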
Going back to the conclusions given:
On average, guns did not protect those who possessed them from being shot in an assault.
It seems beyond the scope of the study to say anything about not being shot in an assault with a firearm. The question that the research addresses is limited by the data that was considered, and what the chosen dataset can address is the question, “Is the general population much different in the characteristic of ‘gun possession’ from a sample of people who were shot in an assault with a firearm?” The raw data clearly says “No, there is no significant difference between the groups,” while the model apparently gives the opposite answer with statistical authority. This should cause introspection on the part of the researchers, not broad statements claiming to have established a basis for a sweeping change in perception of the issue.
Methodologically, comparing gun assaults without shootings to those resulting in shootings could have addressed the issue of whether gun possession changed the risk of being shot in an assault, but no such comparison was attempted. This seems distinctly odd, since the logistical burden of the research would have been lower with that approach, and reducing the research burden was specifically mentioned as a reason for adopting the case-control approach. If the risks estimated with the adjusted data were accurate, comparing gun assaults without shootings to those with would have provided a simple way to independently corroborate the finding. On the other hand, there would be relatively little cause for “adjustment” of the data between those two cases, and such a comparison might not corroborate the modeling effort undertaken here. If the Philadelphia police could be prevailed upon to provide a set of numbers from gun assaults without shootings, giving the total number of such cases and the number in which the victim possessed a gun, it would be sufficient to use as the “control” group and provide an odds ratio for comparison to that in the published paper.
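To make the proposed check concrete, here is a rough sketch of how such a comparison might be computed once the counts were in hand. Every count below is a hypothetical placeholder, since no such police data appears in the paper or in this post. The interval uses the standard log-based (Woolf) approximation for an unadjusted odds ratio, which is a swapped-in textbook technique and not the paper's conditional logistic regression.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with an approximate 95% confidence interval.

    a = shot, had a gun          b = shot, no gun
    c = not shot, had a gun      d = not shot, no gun
    Uses the log-based (Woolf) standard error; all cells must be > 0.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# HYPOTHETICAL counts, for illustration only.
shot_with_gun, shot_without_gun = 40, 460
unshot_with_gun, unshot_without_gun = 50, 950

or_, lower, upper = odds_ratio_ci(
    shot_with_gun, shot_without_gun, unshot_with_gun, unshot_without_gun
)
print(f"odds ratio = {or_:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

If real counts produced an interval sitting near the paper's adjusted figures, that would count as corroboration; an interval hugging 1 would point back at the model.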
Although successful defensive gun uses occur each year, the probability of success may be low for civilian gun users in urban areas.
This is entirely speculative: nothing in this study quantifies successful defenses against assaults with firearms, much less offers anything that would rise even to a guess about probability.
Such users should reconsider their possession of guns or, at least, understand that regular possession necessitates careful safety countermeasures.
I think better safety countermeasures is a goal we can all get behind, but I didn’t need a study with opaque and not obviously functional methodology to tell me that. As to reconsidering possession of firearms, I think I’ll take that with a grain of salt until someone is able to explain how that conclusion really follows from the data and methods described. Right now, it looks like a big non sequitur leap to me.
As noted above, it seems that there is a fairly simple way to check the model against reality. If that is done and it validates the model, I’d be somewhat surprised, but I’d be satisfied on the methodological issues that a real result had been obtained. But in the absence of either a transparent model permitting replication of results or the independent check I outlined above, my impression of the study is that it is more a means for assumptions to be converted into conclusions than a solid piece of empirical work.