(“Kath has Issues”)
Code
library(gh)

owner <- "kathsherratt"
repo <- "kathsherratt"

# Fetch every open issue from the repo (paginating until done)
issues <- gh("/repos/:owner/:repo/issues",
             owner = owner, repo = repo,
             state = "open", .limit = Inf)

for (issue in issues) {
  # Build coloured label badges as inline HTML spans
  labels <- ""
  if (length(issue[["labels"]]) > 0) {
    for (label in issue[["labels"]]) {
      labels <- paste0(labels,
        ' <span style="background-color:#', label$color,
        '; color:white; padding:2px 6px; border-radius:3px; font-size:12px;">',
        label$name, '</span>')
    }
  }
  # Print each issue as a markdown section: title + labels, date, body, rule
  date <- strsplit(issue[["updated_at"]], "T")[[1]][1]
  title <- issue[["title"]]
  body <- issue[["body"]]
  cat("## ", title, " ", labels, "\n\n",
      date, "\n\n",
      if (!is.null(body)) body else "", "\n\n",
      "---\n\n",
      sep = "")
}
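For the cat() output above to come out as markdown headings and horizontal rules rather than console text, the chunk presumably runs with knitr's "asis" results; gh() will also pick up a personal access token from the GITHUB_PAT environment variable if one is set, which avoids the unauthenticated rate limit. A minimal chunk-option sketch (assumed, not taken from this page):

#| code-fold: true
#| results: asis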
EA
2025-11-13
Note to self: troubles with EA - need to write this up because I keep needing a reference doc
Pegasus evaluation relevance
2025-11-13
- identify each key output from SPI-M consensus docs
- rate each for quality:
  - Credibility of consensus estimate
    - internal modelling quality
    - consistency with individual estimates
    - reflection of group discussion
  - Relevance to policy
  - Legitimacy of consensus process
Structuring feedback
- SPI-M is intended to provide evidence to SAGE, i.e. reducing uncertainty, based on the best available research
- this takes several forms, including:
  - parameter estimates for underlying epidemiology
  - short-term nowcasts/forecasts
  - scenario projections
This can be evaluated in terms of:
- quality of evidence provided
- robustness and legitimacy of the process for evidence production
challengers, champions, and consensus in SPI-M relevance
2025-11-13
Reading the consensus statements written by the chairs of SPI-M in the latest round of the Pegasus exercise (a simulated pandemic to test preparedness). Interestingly, we have moved back to a champion/challenger format of modelling evidence synthesis. In this format, the UKHSA produces estimates for relevant policy questions, and SPI-M’s role is to challenge these based on expertise. Rather than comparing modelling work on an equal footing, we have a kind of strong prior to be confronted and updated with other models/expertise.
I can see a few implications for SPI-M in terms of the function, process, and outputs…
Function
- does this model encourage less (or even less rigorous) quantitative work among SPI-M? Not necessarily a bad thing - but when might we actually need multiple independent models, rather than critique? Are there some questions we should prioritise for challenge?
Process
- should/how would we structure discussion to avoid an anchoring bias on UKHSA estimates (if we want to)?
- how much disagreement, and in what form, is needed to actually shift away from the UKHSA estimate?
- if there were substantial disagreement, what would happen? (How strong should the prior be?)
Outputs
- do we write the consensus statement with the UKHSA estimates, or should these be adjusted based on other work, and if so how?
- how is the language of consistency/conflict expressed, and interpreted, as with phrases validated for expressing a scale of uncertainty? (“Other models were broadly consistent with”?)
ethical frameworks legitimacy
2025-11-13
thinking about evaluating modelling work and the ‘wrong but useful’ phrase (a massive cop out imo). In any case, this is a very utilitarian way of looking at models - usefulness as the object. What would other ethical frameworks imply for model evaluation?
often wrong, never in doubt
2025-10-20
- a Danny Truellism
… and also a nice catchphrase for modelling evaluations
Cf. Box: always wrong, sometimes useful
baselines / benchmarks credibility
2025-11-05
graceful degradation credibility
2025-11-05
what a wonderful term https://en.wikipedia.org/wiki/Fault_tolerance
aria nausea legitimacy
2025-11-05
Reading about ARIA’s Innovator Circles: https://aria.org.uk/innovator-circles
Conflicted because the idea of small group creativity/change is so precisely what I find fascinating and enjoyable and intellectually productive. And every other word in ARIA’s description makes me want to puke. Silicon Valley speak is drowning out the idea, which is a pretty reasonable one, that a bit of open funding (and a flag of credibility to wave at evaluators) comes in handy to help groups get together.
Also, why describe an “experiment”? It’s not, is it? After all, there’s no counterfactual… And they’re trying to recruit scientists who one might expect to care about such a definition. So why? Why not just say it’s a good old-fashioned “pilot”?
Also, this is really a sticking plaster on a broken leg in terms of the barriers to bringing together small groups with similar interests. I am thinking of academic competition, elitism, and accessibility for those outside the “discipline”, let alone the academy.
Between baselines and counterfactuals credibility
2025-11-05
I dislike the framing of evaluating around a counterfactual. Maybe because it is so oddly specific in its extra-ordinary hypothetical world; it feels very frequentist. OTOH I find it much more helpful - intuitively and operationally - to think about baselines.
They are similar:
- used to think about the impact of an intervention
- in terms of an alternative without the intervention
- typically hypothetical and retrospective in the type of social research setting I work in
But different:
- baseline = context
- counterfactual = binary
- a baseline allows for a dose response
To me, a baseline is more like describing a prior, so that the evaluation is searching for the posterior.
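A toy sketch of what I mean, with entirely made-up numbers (a conjugate Gamma-Poisson update, just to illustrate the framing rather than anything I'd use in practice):

# Toy sketch of baseline-as-prior (all numbers invented).
# Baseline: roughly 100 cases/week, encoded as a Gamma prior on the weekly rate.
prior_shape <- 100
prior_rate  <- 1

# Hypothetical weekly counts observed after the intervention.
observed <- c(80, 75, 90, 70)

# Conjugate Gamma-Poisson update: the posterior is the "evaluation".
post_shape <- prior_shape + sum(observed)
post_rate  <- prior_rate + length(observed)

# Posterior rate relative to the baseline mean: a graded effect,
# not a binary with/without-counterfactual contrast.
round(qgamma(c(0.05, 0.5, 0.95), post_shape, post_rate) / (prior_shape / prior_rate), 2)

The effect comes out as a graded ratio against the baseline, which is where the dose-response feel comes from.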
add work highlights
2025-09-25
draft to start with
A few recent highlights
- Software engineering a dashboard for humanitarian advocacy in Gaza
- Co-developing and [delivering](https://www.lshtm.ac.uk/newsevents/news/2024/cmmid-members-teach-nowcasting-and-forecasting-stockholm) an open-source training course on Nowcasting & Forecasting Infectious Disease
the “matthew effect” legitimacy
2025-11-05
The “Matthew effect” is the academic research equivalent of “the rich get richer”. It’s been demonstrated pretty widely and consistently that researchers who win funds early in their careers continue to accumulate more funding over the rest of their careers than those who missed out on early opportunities.
I think this effect might be wildly intensified for those of us working in outbreak response research. Publishing an early estimate of some fundamental epidemiological characteristic results in a snowball of citations as an outbreak progresses. If funding decisions at least partly consider these citation metrics then we might expect a pretty outsized Matthew effect among researchers who publish early in an outbreak.
Project proposal: use bibliometrics and/or funding data to trace career trajectories through and between outbreaks, among those who publish on outbreak response.
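A rough sketch of the shape of that analysis, using an entirely hypothetical publication table (author_id, outbreak, pub_date, and citations are invented columns; real data would need a bibliometric source plus funder records):

library(dplyr)

# Hypothetical toy data: one row per author-paper (all values invented).
pubs <- tibble::tribble(
  ~author_id, ~outbreak,  ~pub_date,    ~citations,
  "A",        "covid-19", "2020-01-20", 500,
  "A",        "mpox",     "2022-05-30", 120,
  "B",        "covid-19", "2020-06-01", 40,
  "B",        "mpox",     "2022-08-15", 10
) |>
  mutate(pub_date = as.Date(pub_date))

# How early did each author publish within each outbreak,
# and how do their citations accumulate across outbreaks?
pubs |>
  group_by(outbreak) |>
  mutate(days_from_first_paper = as.numeric(pub_date - min(pub_date))) |>
  group_by(author_id) |>
  arrange(pub_date, .by_group = TRUE) |>
  mutate(cumulative_citations = cumsum(citations)) |>
  ungroup()

The interesting comparison would then be whether small days_from_first_paper values predict steeper cumulative_citations (and, later, funding) trajectories.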
my public library
2025-11-05
I’ve made my Zotero library public, available here:
https://www.zotero.org/kathsherratt/library