How to Discuss Uncertainty: Cancer Edition

September 9th, 2018 by Potato

My dad has read Atul Gawande’s Being Mortal, which discusses the suffering that people can experience near the end of their lives — particularly the suffering of medical treatments to extend those lives. There was no question about having surgery to remove his tumour — the cost-benefit there was huge (esp. as it had sent him to emerg). But he explicitly did not want to go through adjuvant chemotherapy, because chemo sucks and he knows that first hand, and he didn’t think it would have much in the way of benefits.

Then we met with the oncologist and found out that the regimen he’d have now would be much less severe than the kind of chemo he had a decade ago, and that the benefits were very real.

But exactly how to convey those benefits is a tricky matter. She told us, roughly speaking, that 50% of people with his kind of cancer would still be alive and cancer-free 5 years out: the surgery alone totally cured them, and taking chemo would be a pain in the rear but not actually help them because they were cured already. Another 20% would have their disease come back without chemo, but not with it, while 30% would see their disease come back regardless of chemo. So with a 20 percentage point increase in his chances, chemo sounded pretty good.

But it’s hard to frame that in a way that sticks. After hitting the first wall in chemo side effects, “20%” didn’t sound that great any more. So my dad wanted me to explain it to him plainly: he was giving up 6 months of his life (or at least quality of life) to chemo. What was he giving it up for? “How much longer will I live?” And I get it: he wants the benefit expressed in the same units as the cost, which would make decision-making so much easier. But the cancer stats just don’t seem to be expressed that way.

And it’s complicated: it’s not as simple as giving up 6 months, as there are probabilities and uncertainties there, like permanent adverse events, versus the probability of having a recurrence and dying (or having a much harsher round of treatment), or the probability of being cured of cancer for the rest of his life but then having a heart attack or stroke. Even if he was cancer free in 5 or 10 or 15 years, how much longer would he live otherwise? And giving up a year at 69 when he could go golf or enjoy the cottage is perhaps not worth gaining a year at 79 where he might not be having as much fun. These are all hard things to say.

Later, I found a chart that looked basically like this that helped show the survival benefit:

A rough sketch of a survival curve for colorectal cancer patients with and without adjuvant chemotherapy after surgery, where the benefit of chemotherapy is about a 20 percentage point increase in 5-year disease-free survival.

But even that is lacking, as when choosing whether or not to take chemo the percentage point increase may not be as important as the percentage increase. That is, if you’ve already got a 50% chance of survival, bumping that up to 70% is a 40% improvement in your situation — taking the chemo is better than the “20 percent” figure makes it sound.
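To make the two framings concrete, here is a small Python sketch using only the rough figures the oncologist gave us (50% cancer-free with surgery alone, 70% with chemo added). The function names are my own, made up for illustration, and the inputs are rounded numbers rather than real trial data.

```python
def absolute_gain(baseline, with_treatment):
    """Percentage-point difference between two survival probabilities."""
    return with_treatment - baseline


def relative_gain(baseline, with_treatment):
    """Improvement expressed as a fraction of the baseline probability."""
    return (with_treatment - baseline) / baseline


# Rough figures from the oncologist's explanation (not trial data).
surgery_alone = 0.50
surgery_plus_chemo = 0.70

abs_gain = absolute_gain(surgery_alone, surgery_plus_chemo)
rel_gain = relative_gain(surgery_alone, surgery_plus_chemo)

print(f"absolute: {abs_gain * 100:.0f} percentage points")
print(f"relative: {rel_gain * 100:.0f}% better than surgery alone")
```

Same two inputs, two quite different-sounding numbers, which is exactly the framing problem.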

So I made a pictograph, which I think may be a better framing for showing the benefit of chemo. The MSKCC nomogram has a similar display.

An infographic with happy faces to visualize the relative survival benefit of adjuvant chemotherapy vs surgery alone.

None of this really helps answer the question in the way my dad wants, with the benefit in the same units as the cost. I started to go down the road of maybe integrating under those survival curves, to try to quantify what the expected increase in lifespan was. I even went into some of the quality-adjusted life years research and found a set of results that I could use as a weighting function — after all, an extra month of good health at 69 is not quite the same as an extra month at 79 or 89. But I stopped because that’s guaranteed to be an exercise in false precision, and I’m not sure giving him what he wants there is the best way (and I’m also doubting myself because no one that I’ve seen in health science presents results in this way to patients — at most such things are used in health economics to talk about big picture costs and programs).

CIHR Kerfuffle

July 10th, 2016 by Potato

There’s been a bit of a protest over the current “pilot” project scheme[1] grant competition at CIHR. In fact, it’s so bad that the federal Minister has told them to meet with the scientists and sort it out.

The CBC does a surprisingly good job of explaining this to people who may not be scientists, but you might want a bit more context and background on the kerfuffle, which is where this post[2] comes in.

Grants Background: In brief, the Canadian Institutes of Health Research (CIHR) is one of Canada’s main sources of research grants, funding basic, translational, and clinical research related to human health. A grant is some money given to a research team to pay for the costs of a research project: reagents and software, grad student and post-doctoral fellow stipends, etc. There is never enough money from funding agencies to go around: human curiosity knows no bounds (and the research enterprise is big). To get a grant, typically a scientist will write[3] a proposal describing what they intend to study, explaining why it’s important, demonstrating the need for the research (what questions will it answer, what problems will it solve, what impact will it have on health care), describing the methods they will use in the study, and justifying why they are the ones to get the money and do the research (track record, expertise).

This proposal, along with dozens or hundreds or thousands of others, gets reviewed by the funding agency and expert external peer reviewers, who point out potential flaws in the methodology, or other issues with the proposed work. It gets scored and ranked by a review panel, based on the input from the external reviews and discussions at the panel[4]. In this way, the best, most promising research usually gets funded. It’s not a perfect system, but (without bothering to look up the research) the top ~10% of proposals are generally consistently funded, the bottom portion generally rejected, with some random chance elements in the middle as to what made it over the funding cut-off and what didn’t.

Funding Shortfall: Funds are tight in research: there isn’t nearly enough money to go around. This has been a long-standing problem that has been getting worse and worse over the years, particularly under the Harper Conservatives in Canada and in the post-GFC/sequestration era in the US. There are many ways to deal with a shortfall of funds, and none of them are perfect: NSERC, for example, maintains a high success rate in their core Discovery grants, but cuts the requested budget of all but the top few score ratings so that most awardees don’t receive enough money to pay the costs of even a single full-time trainee. For other grant competitions where budgets are not cut, the success rate becomes abysmal.

That funding crunch is part of the background to the changes that CIHR made to its funding programs.

Early Career Researchers: On top of the general funding tightness and peer review issues, another issue with the reforms is the effect on early career researchers. Cancelling a year’s worth of applications can lead to a gap in funding for many, which can be deadly for a career. Plus the change in the funding model on the foundation grant side reduced the amount of money going to early career researchers — will there be an increase on the project scheme side to offset that? (Unlikely)

The Changes: CIHR made a bunch of sweeping changes at once, from combining a bunch of separate programs into a single competition, to changing the application format, to changing the way peer review was handled. All aspects are drawing fire in one way or another, but it’s the changes to peer review in particular that are the centre of the current unrest and the open letter (and Ministerial response).

Oh, and all these changes were implemented at once, after cancelling a few competitions so there was added pressure to apply now. CIHR called this a “pilot”, but that suggests a partial, limited-scale test — this post covers that aspect of the affair.

Peer Review: Here’s a great idea: rather than spending money in an already stretched environment to fly peer reviewers from all over the world to Ottawa for face-to-face meetings, let’s use the technology of the internet to do virtual conferences. Sounds brilliant, like one of those obvious cost-saving measures that you can’t believe they’re not doing already. But what’s hilarious/tragic in reading the background to this story is that it’s been tried before (in actual pilots) and has been a massive failure. Indeed, back when the changes were first proposed, an open letter from a group of Université de Montréal scientists in 2012 predicted exactly this outcome of virtual peer review.

The core problem is that scientists doing peer review are humans. Incredibly busy humans. So yes, reading other people’s grants and scoring them is important for science and the integrity of the peer review system… but lots of things are important today. What is it that ultimately makes a scientist sit down and start poring over research proposals? Usually, it’s the knowledge that they’re going to have to sit at a table with their colleagues to discuss these grants and the pressure to not be the only one who didn’t do their homework. Plus, that puts a hard deadline on the process to activate the panic monster, and gives them a nice plane ride to sit down and actually do the reviews in a panicked sweat. When things are virtual, they don’t have to look their peers in the eye (they may never even know who the slacker is), especially when the instructions acknowledge that they’re busy people and suggest they can do it, like, whenever. So the reviews aren’t as good, there’s less peer pressure to be timely, and that’s what’s driving a lot of the uproar here: a large number of grants still did not have all their reviews in as the virtual conferences were nearing their end, and many of the reviews were not of high quality.

On top of that, the circumstances of this particular competition have exacerbated the problem: to avoid conflicts-of-interest, people who have a grant in the current competition aren’t invited to serve as peer reviewers. However, everyone and their dog is applying to this competition — sometimes with more than one proposal — because of the pent-up demand caused by the poor funding environment and two cancelled rounds of the previous open operating grant. So more applications, and very few people left as eligible reviewers. Plus, figuring out who has the expertise to be an expert reviewer has changed, and many are finding that the system (which I believe is now automatic and keyword-based) is matching people to grants that they are not fully qualified to review — though it’s not clear to me if that’s because the system is inherently broken, or a unique feature of this round where there are so many applications and a relative paucity of reviewers (though I’m sure it’s something that’s on the agenda for the meetings with CIHR).

The virtual reviews have also created a point of contention around how the reviews are ultimately combined and averaged: the formula hasn’t actually been released.

The Future: The next round of the competition was supposed to have been announced last week, with applications to be due in the fall. For the moment that’s on hold while CIHR meets with some scientists to sort this out and possibly re-jig how peer review is handled. Given the uproar, it’s likely something will be tweaked.

However, the funding success rates continue to be poor. That’s part of the background to the story, but the reforms (and possibly reverting them) will not be able to change that — there still isn’t enough money to go around. Though success rates were over 30% in the not-too-distant past, they have plunged below 20% in recent years, and the estimates are that this competition will see a ~13% success rate (likely about 500 grants from about 3800 applications). If there can’t be more funding, people want to be sure that the awards that do get made are fairly — and clearly, observably fairly — given to the best grant proposals. With the current system it looks like there is more noise and randomness, so it’s not so clear that the best-ranked grants are truly the best applications, because of the issues happening with the quality of peer review. In other words, low success rate + random scores = lottery. Indeed, I have in the past made the quip that when grant competitions have heartbreakingly low success rates, you may be better off spending the time you would have been writing an application on a 2nd job flipping burgers, and using that money to buy scratchers at the convenience store to fund your research program.
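To put a rough shape on the “low success rate + random scores = lottery” quip, here is a toy Python simulation. Everything in it apart from the 3800 applications and 500 awards is an assumption made up for illustration: each proposal gets a hidden “true quality” drawn from a normal distribution, the review adds normally distributed noise, and the top-scoring 500 get funded. It is not a model of how CIHR actually scores anything; it only shows how quickly the funded list drifts away from the best list as review noise grows.

```python
import random


def funded_overlap(n_apps=3800, n_funded=500, noise_sd=1.0, seed=0):
    """Fraction of the truly best proposals that actually get funded.

    Hypothetical model: quality ~ N(0, 1), score = quality + N(0, noise_sd).
    """
    rng = random.Random(seed)
    quality = [rng.gauss(0, 1) for _ in range(n_apps)]
    score = [q + rng.gauss(0, noise_sd) for q in quality]

    def top(values):
        # Indices of the n_funded highest values.
        order = sorted(range(n_apps), key=lambda i: values[i], reverse=True)
        return set(order[:n_funded])

    return len(top(quality) & top(score)) / n_funded


for noise in (0.0, 0.5, 1.0, 2.0):
    share = funded_overlap(noise_sd=noise)
    print(f"review noise {noise:.1f}: {share:.0%} of the best 500 funded")
```

With no noise the funded list is exactly the best list; a pure lottery would fund about 13% of the best proposals by chance alone, and noisier reviews push the result toward that floor.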

Of course, more money for research would really help here, so take a moment to write your MP and ask them to increase tricouncil[5] funding for research.

1. Wayfare says: “Calling it a scheme was their first mistake. Nothing good is called a scheme.”
2. A quick note: I work in developing grants for CIHR and other agencies as part of my day job. It is not done to openly criticize the people who give you money, unless it’s very constructive. The criticisms I’m posting are those of others, for context on the controversy.
3. If they’re very lucky, they’ll have someone like me help to write/edit it. {/self-promote}
4. And at this point I’ll have to say much of the mechanics of this is a bit of a black box to me; I likely know more than most of my readers, but I have never seen a review panel in action.
5. Tricouncil refers to Canada’s three core federal research funding agencies: NSERC, CIHR, and SSHRC. There are other funding bodies, some of which have received increases in targeted funding even in the black Harper years, but these are the ones that could really use a letter of support from the electorate. A letter I suppose I should draft and post here… stay tuned.

Scaling Problem: House Size and Heating Bills

December 25th, 2014 by Potato

There was an article in the Globe & Mail a while ago claiming that it’s best to go with a smaller house because the bigger the house, the bigger the associated bills. Ok, that makes perfect sense.

But then it went on to claim that “it would seem reasonable to assume that it would cost twice as much to heat (or air condition) a 3,200 square foot home than it would one that is 1,600 square feet. But, as reasonable as this seems, it’s incorrect; it actually costs more than twice as much. […] Circumstances vary, but it can cost up to three times as much or more to heat and cool a home that is only twice as big.”

Now that just doesn’t make physical sense to me. We all know how scaling laws work: assume you have a spherical house, then the surface area will scale by r^2, while the volume will scale by r^3.

Ok, we don’t live in spherical houses, but still, this guy’s math must be way off. So I thought about it, and scaling with houses is actually a problem without any clear answer. Let’s set aside the complications like your own body heat or the waste heat of your home server farm (everyone has that, right?) and just talk about heat loss through the outside walls: even narrowed down with all that ceteris paribus it’s still a tricky question because houses are not spherical.

The simplest case I can think of is to take a cubical house. It has 6 unit surfaces: the roof, floor, and 4 walls. Now if you make that house twice as big by adding a second storey, the roof and ground floor are the same, and you’ve doubled the size of your walls (8 unit-walls). So doubling your floor space was less than doubling in your heat transfer area: only 1.67 times as much.

There are other ways to double the size of a house. You could go longer: expanding your floor plan from a unit square to a 2×1 rectangle. You only save on one shared wall between the unit squares in that case, so you do nearly double the outside area: 6 unit walls facing the outside, 2 floors, 2 roofs… but that’s again a 1.67 times increase (though more roof and floor with fewer walls added). Oh yeah, that’s just the first case turned sideways.

If you want to go crazy with shapes you could try to find a way to get really inefficient. If you built a really long house (or made a C-shaped house to fit it on the lot, same difference for walls) that was 5 times as big as our unit square house, then it would be 3.67 times as costly to heat… wait, that’s still going the way I thought it would, with bigger houses being more costly, but scaling less than the increase in space.
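The face-counting in these cases is easy to check with a few lines of Python. This is just a sketch under the same simplifications as the text: the house is a box built from unit cubes, and every outside face (roof, ground floor, wall) is assumed to lose heat equally. The function name is mine.

```python
def exposed_area(length, width, storeys):
    """Outside surface of a box-shaped house built from unit cubes.

    Counts unit faces: roof + ground floor + four walls, all treated alike.
    """
    roof_and_floor = 2 * length * width
    walls = 2 * (length + width) * storeys
    return roof_and_floor + walls


base = exposed_area(1, 1, 1)  # the unit cube: 6 faces

for label, dims in [
    ("two storeys", (1, 1, 2)),
    ("2x1 bungalow", (2, 1, 1)),
    ("5x1 long house", (5, 1, 1)),
]:
    floor_space = dims[0] * dims[1] * dims[2]
    ratio = exposed_area(*dims) / base
    print(f"{label}: {floor_space}x the floor space, {ratio:.2f}x the surface")
```

In every case the surface ratio comes out below the floor-space ratio, which is the whole point.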

In fact, the only way the author’s math works out is if you do non-apples-to-apples comparisons, like one house at 1,600 sq.ft. with 8’ ceilings and one at 3,200 sq.ft. with 16’ ceilings to drive the volume up but not the livable space measured in square feet. Or maybe it comes down to one of the simplifications I made, like treating floors and walls as roughly equivalent in terms of heat loss… but I doubt it.

He does mention more windows and doors just after the part I quoted, but again that doesn’t make sense to me. Yes, I lose more heat through my door than through a solid wall, but my house has two doors. A slightly bigger house would still have two doors. My parents’ house, which is maybe 2.5-3 times the size of our house, does have four doors, and my friend’s parents’ house, which is in-between, has three. But again, the number of doors is not scaling up faster than the increase in the size of the house. And the portion of the walls that are windows is not really any different with the bigger house.

So I will conclude for now that yes, a larger house will cost more to heat and cool, but it’s likely to scale less than the difference in size, because math. Fortunately, the massive building boom of recent times means that somewhere out there are a few developments with good test houses, ones built with the same insulation and materials and styles, but to different sizes. If anyone has some experimental data to back up (or refute) the spherical house reasoning, I’d love to hear it.

Yawn Interrupted

October 27th, 2013 by Potato

I have — quite by accident, as one does — achieved something truly incredible. In just a few short days I have completely destroyed a fundamental behavioural reflex that is shared by all the mammals I’ve ever shared a house with: yawning. I didn’t use pain or taste conditioning, no electrodes or supra-threshold induced currents with TMS. I didn’t use dopamine agonists, opioids, or GABA inhibitors — I didn’t inject anything at all.

What happened was that I was exhausted and a little silly from playing with Blueberry. Wayfare was going off to bed and yawning quite a lot. So I mimicked her: as she yawned, I opened my mouth. I didn’t have a sympathy yawn, I didn’t mimic the movement in my eyes or the tilt of my head: all I did was open my mouth wide until she was done yawning. And that cognitive dissonance of a not-quite-a-sympathetic-yawn made her laugh. So I did it again. Over the course of 2-3 days I did it maybe 5 or 6 times, that’s it. Now, completely untaught by me, Blueberry is doing it to her too.

And now she can’t yawn in front of either of us without laughing. “It’s so frustrating, I can’t satisfy my urge to yawn,” she says, leaving the room to get a good yawn in.

Beyond the fact that this is so hilarious that I pretty much give myself an asthma attack laughing at her tragic yet ultimately trivial problem, it is completely fascinating. Yawning is a hard reflex to suppress (though to be fair “broken” is not the same as suppressed), so I’m surprised that a few bouts of giggling is all it took. Fair warning to you all: I may be trying this out on the next few people I catch yawning.

Just Noticeable Difference

September 12th, 2012 by Potato

The just noticeable difference (JND) is the smallest difference in something that can be perceived. For instance, if you show me two pieces of string that are very nearly the same length, and then another similar pair, and another, there’s a certain length difference that I will just be able to perceive, and any that are closer together than that I won’t be able to tell apart. Similarly for other senses: two audio tones have to have a certain amount of difference in their volume or frequency in order for me to tell that they were different rather than the same tone repeated. The size of the JND is dependent on methods: you can notice a smaller difference in lengths if you look at two pieces of string side-by-side rather than one on one day, and one on the other. It can also help if there’s a point of reference, such as a grid in the background. But nevertheless, there will be some small difference below which you will be unable to tell two things apart.

So the JND can vary quite a bit depending on the experimental procedures, but given a particular method, the JND scales with the starting size of what you’re looking at: dl/L = constant, or equivalently JND ∝ L. If you have double the length of string, the difference in length between two comparison pieces also has to double before you’ll notice that there has been a change. If you’re in a dark room with one candle lit, lighting a 2nd is a very noticeable addition to the brightness. If you’re in a bright room with a thousand candle power light on, lighting a candle may not noticeably increase the brightness. And if you can just notice adding (or subtracting) one candle against a background of say 200, then you should be able to just notice a change of 1/200th of a candle against a background of one candle.

Let’s consider the case of hair. I cut mine every 50 days or so. It goes from about 0.3″ when freshly cut to about 1″ in that time, for a rate of growth of 0.014″ per day. After I cut my hair it takes about a week before I notice that it’s gotten longer. So the constant for the JND is:
0.014*7/0.3 = 0.33

If the starting length for hair was instead say, 12″, then the scaling indicates the JND would be 3.9″. That is, a girl with shoulder-length hair would have to cut off about 4″ in order to have a good expectation that — with a one day to the next observation — a boy would notice that indeed her hair had been cut. Getting a 2″ trim would fall well below the JND, and psychophysically, it would be highly unlikely for such a difference to be spontaneously noticed. Nay, nearly physiologically impossible for such a difference to be detected under such conditions.
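For anyone who wants to plug in their own hair, the arithmetic above fits in a few lines of Python. This is only a sketch of the back-of-envelope calculation in this post: the function names are mine, the inputs are my own rough measurements, and it assumes the simple proportional scaling holds all the way from 0.3″ to 12″.

```python
def weber_fraction(growth_per_day, days_to_notice, starting_length):
    """JND constant: the just-noticeable change divided by the starting length."""
    return growth_per_day * days_to_notice / starting_length


def jnd(length, k):
    """Smallest noticeable change at a given length, assuming JND = k * length."""
    return k * length


# My numbers: 0.3" freshly cut, 1" after about 50 days, noticed after a week.
growth_per_day = (1.0 - 0.3) / 50
k = weber_fraction(growth_per_day, 7, 0.3)

print(f"JND constant: {k:.2f}")
print(f"JND at 12 inches: {jnd(12, k):.1f} inches")
print(f"2 inch trim noticeable at 12 inches? {2 >= jnd(12, k)}")
```

That reproduces the 0.33 constant and the roughly 3.9″ threshold, and confirms that a 2″ trim falls short of it.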

Everyone’s JND constant will be different, and circumstances can vary (e.g., someone may consistently wear shirts with horizontal markings on them to serve as a guidepost, or an observer may have superhuman vision discrimination, or the hair may be pulled into a ponytail, making the judgment even more difficult).

But whatever the individual circumstances, don’t forget the pioneering psychophysics work of Weber when someone doesn’t notice your haircut — they may not have been able to!