Surgery and Sex: Are They Related?
When research, media hype, and gender politics meet in the OR, the truth gets sliced up
“Women More Likely to Die After Operation by Male Surgeon.”
“It’s better to be operated on by a female surgeon than a male counterpart, according to two studies.”
“Female surgeons have lower rates of long-term adverse outcomes than their male peers, study finds.”
These are just a few of the hundreds of headlines interpreting findings from large observational studies published in reputable medical journals over the past few years. Last Sunday, JAMA Surgery summarized its latest entry to the genre in a tweet: “This analysis of national Medicare claims data shows that female surgeons had better long-term post-operative outcomes compared to male surgeons, particularly for female patients.” The journal’s social media staff surely didn’t expect to get “community-noted” and ratioed by a mix of indignant surgeons, skeptical academics, and reply guys who either hate women or hate being told diversity might matter.
But is it true? Do women make better surgeons? Do female patients fare better when treated by female surgeons? The short answer is no. Women and men make equally good surgeons. I should know: I was Reviewer 1 for one of these studies, published in the British Medical Journal, arguably the most rigorous and transparent of the bunch. So, let’s talk about what the data actually show, and why so many people want them to say more than they do.
I first came across one of these studies in 2021. It was published in JAMA Surgery and received impressive global media coverage from 193 outlets, including Elle Magazine, The Washington Post, and the Parisian magazine Au Féminin. As a budding physician-scientist interested in physician workforce diversity, I was intrigued. As I wrote in my letter to the JAMA Surgery editor, I was struck by how understated the study’s most important and robust finding was: there was no association between sex discordance and complications for female patients during emergent surgeries.
Emergencies offer something precious in observational research: a quasi-experimental design. In these cases, neither surgeons nor patients have the time to pick and choose who operates on whom. It’s about as close as you can get to randomization. So, if differences show up in general but disappear in emergent cases, that’s a clue: the story is likely less about surgical skill or a surgeon’s post-operative care and more about everything else that happens before the surgery, like referrals or patients and surgeons carefully selecting one another. This study was based on the experience of surgeons and adult patients from Ontario, and the benefits of having a female surgeon for female patients were limited to elective cases, suggesting unmeasured confounding is likely at play.
Whether these differences were generalizable to all surgeries (they were not) or to these outlets’ audiences did not seem to matter in the coverage. Elle Magazine’s blockbuster article makes no mention of Ontario, and one of the authors interviewed says they “have demonstrated […] we are failing some female patients and that some are unnecessarily falling through the cracks with adverse, and sometimes fatal, consequences.” However, that’s quite an overinterpretation, especially since it remains unclear why they observed those differences only in elective surgeries. But I have some ideas.
About a year later, an editor at BMJ asked me to review a paper on the same subject, but this time using U.S. data, focused on older adults. As in the JAMA Surgery study, the authors found no sex-related differences when the analysis was restricted to emergency surgeries (the most robust test!). However, for elective surgeries, they found that female surgeons had slightly better outcomes, with a 0.2% lower mortality risk for female patients and a 0.3% lower risk for male patients. Media reports still ran with the simplest headline: Women are better surgeons. But there is more. For example, at teaching hospitals, male patients benefited more from having female surgeons. In contrast, at non-teaching hospitals, they benefited more from having male surgeons, a phenomenon not observed among female patients. When the authors adjusted for frailty (a strong predictor of bad outcomes after surgery), many of the differences disappeared, suggesting that male surgeons may take on sicker or more complex patients.
Surgery remains a male-dominated field; however, gender parity has increased among younger cohorts. Older, more experienced surgeons skew more male on average, and with experience comes a greater willingness to take on risk. And despite women making up about 18% of the surgical workforce in the U.S., they performed fewer than 6% of cases in this study, highlighting a significant skew in patient mix or payer exposure, particularly away from older adults. But why?
Well, the culture in medicine, and in society in general, can still penalize women more harshly for adverse outcomes, or for outcomes that can be perceived as reflecting poor skill. One study found that after a complication, female surgeons not only receive fewer referrals but are also more reluctant to take on riskier cases. These labor market dynamics may help explain why the BMJ study found that sex-based differences in outcomes disappeared at hospitals where female surgeons had higher case volumes. In settings where female surgeons are more established, the apparent advantage vanishes, suggesting that previously found differences may reflect structural inequities that shape exposure to risk, rather than true performance differences. Interestingly, in low-performing hospitals, only male patients seemed to benefit from having a female surgeon, raising further questions about how gendered dynamics play out under stress, or in systems with fewer resources.
If these patterns are real and not artifacts of statistical noise, they paint a far more nuanced picture than what headlines suggest. The emergency surgery findings already tell us that men and women are equally competent surgeons. But the rest of the data point to something beyond individual skill or behavior: perhaps it’s not that female surgeons are more careful in selecting patients. Perhaps it’s that hospitals that integrate women better perform better for everyone. And when adequate resources and institutional respect are in place, small gender differences stop mattering.
The latest JAMA Surgery study used the same surgeon and patient cohort as the BMJ study, focusing on longer-term outcomes, including one-year mortality, a metric far harder to attribute to a surgeon’s actions. That may explain some of the indignant response it received. On the other hand, those who find solace or vindication in non-causal findings misinterpreted as evidence of superiority are similarly misguided. One surgeon, for example, rightly advocating for pay equity in surgery, tweeted “Pay us” alongside a headline declaring, “Patients have better outcomes with female surgeons, studies find.” But that raises an uncomfortable question: knowing that observational studies using large administrative claims data are prone to errors and even false positives, and given how small these differences really are, what if the findings had gone the other way?
Studies of gender or racial concordance, flawed as many of them are, can still help us understand how cultural and structural forces shape outcomes. That’s useful. Case in point: research showing communication differences between Black and White physicians treating Black patients can teach us about the importance of both verbal and non-verbal communication in building trust with patients. But it’s a mistake to stretch these findings into proof that every Black patient should have a Black doctor, or every woman a female surgeon. It’s not realistic, and we’d never argue the reverse if the data pointed in the other direction.
A diverse physician workforce with equitable pay is worth fighting for, but the reasons are, first and foremost, moral and political. Stretching observational findings into catnip for DEI consultants or media clickbait won’t persuade skeptics. If anything, it risks undermining the kind of vital research that moves public health policy and helps address inequality. Is it worth that cost? I’m not so sure.