Can data visualization be used to track the evolution of COVID-19, and help scientists track the spread of disease?
Yes! Nextstrain is a set of open source data tools for sharing virus genome datasets, and enabling that data to be visually analyzed and presented in context.
On March 13, Shirley Wu interviewed Nextstrain co-creator Colin Megill, with the goal of empowering those without a biology background to understand how the tool can be used, and to provide interested open-source developers with the necessary background to get involved. As I recently started contributing to the project, I was extremely interested to learn more.
To make the recording more accessible to new developers (and because I lack video editing skills), I made a React component to skip the breaks + enable viewers to jump to key segments of the raw video.
Getting involved: Links
- Project Tracking
- Nextstrain.org
- Narrative Situation Reports (Translations available)
- Open Data Sources (used in Nextstrain Visualizations)
- Interviewee: Colin Megill / @colinmegill
- Nextstrain collaborators: Trevor Bedford / @trvb, Richard Neher
- Interviewer: Shirley Wu / @sxywu
Miscellaneous
Terminology
See the second dropdown for direct links to in-video discussions.
Other COVID Data Visualizations
- Harry Stevens’s Flatten the Curve “Bouncing Balls” Interactive Explanation for the Washington Post
- Nicholas Kristof / Stuart Thompson’s Scrollyteller What-If Analysis “How Much Worse the Coronavirus Could Get, in Charts” for NYTimes
- Gabriel Goh’s Epidemic Calculator
- MIT Technology Review’s 10 Dashboard Roundup
- Ian Johnson is working on showing time slices only when the map is displayed
Personal notes
- Examine the “molecular clock” segment more closely to understand how sequencing helps estimate how many people were infected
- Each individual sequenced result is much more valuable when contextualized by metadata from other sequenced results
- Each tree dot represents a test, which costs ~$1000 + lab technician time.
- The “Narratives” workflow lets scientists write interactive reports just by modifying a Markdown file. This would be a useful workflow for other interactive data exploration tools. To make this integration simple, most of the application’s configuration state (filters, layout settings, etc.) lives in URL parameters.
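To illustrate the idea of keeping view state in URL parameters, here is a minimal sketch of encoding and decoding a configuration object as a query string. The `ViewState` shape, the parameter names (`layout`, `colorBy`, `filter_*`), and the default values are assumptions made up for this example; they are not Nextstrain's actual URL schema.

```typescript
// Hypothetical view state. Field names are illustrative only,
// not the real Nextstrain parameter names.
interface ViewState {
  layout: string;
  colorBy: string;
  filters: Record<string, string[]>;
}

// Serialize the state into a query string such as "layout=...&colorBy=...".
function encodeState(state: ViewState): string {
  const params = new URLSearchParams();
  params.set("layout", state.layout);
  params.set("colorBy", state.colorBy);
  // Sort keys so the same state always produces the same URL.
  for (const key of Object.keys(state.filters).sort()) {
    // Assumes filter values contain no commas.
    params.set(`filter_${key}`, state.filters[key].join(","));
  }
  return params.toString();
}

// Rebuild the state from a query string, falling back to assumed defaults.
function decodeState(query: string): ViewState {
  const params = new URLSearchParams(query);
  const filters: Record<string, string[]> = {};
  params.forEach((value, key) => {
    if (key.startsWith("filter_")) {
      filters[key.slice("filter_".length)] = value === "" ? [] : value.split(",");
    }
  });
  return {
    layout: params.get("layout") ?? "rect",
    colorBy: params.get("colorBy") ?? "region",
    filters,
  };
}
```

With state stored this way, a report written in a text file only needs to embed a link per section to put the visualization into a specific view, which is roughly what makes the file-driven workflow easy to integrate.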
Technical notes
- The playback tool was built with the react-player API.
- I used the Chrome Invideo extension to search the video transcript for keywords when building the timestamp list.
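The break-skipping and jump-to-segment logic can be sketched independently of the player. The version below is a minimal illustration, not the actual component: the `Segment` type, the function names, and the example timestamps are all hypothetical. In a react-player setup, I would expect `nextPlayableTime` to be called from the `onProgress` callback (which reports `playedSeconds`) and its result passed to the player instance's `seekTo(seconds, "seconds")`.

```typescript
// A labelled stretch of the raw video worth watching, in seconds.
// Hypothetical shape; the real timestamp list may look different.
interface Segment {
  label: string;
  start: number;
  end: number;
}

// If time `t` falls in a gap between segments, return the start of the
// next segment so the player can seek there. Return null when `t` is
// already inside a segment or past the last one (nothing to skip).
function nextPlayableTime(segments: Segment[], t: number): number | null {
  const sorted = [...segments].sort((a, b) => a.start - b.start);
  for (const s of sorted) {
    if (t >= s.start && t < s.end) return null; // currently inside a segment
    if (t < s.start) return s.start; // in a break before this segment
  }
  return null;
}

// Look up where to seek when the viewer picks a segment from a dropdown.
function startOf(segments: Segment[], label: string): number | null {
  const match = segments.find((s) => s.label === label);
  return match ? match.start : null;
}

// Made-up example timestamps, for illustration only.
const exampleSegments: Segment[] = [
  { label: "Introduction", start: 0, end: 60 },
  { label: "Demo", start: 120, end: 300 },
];
```

Keeping the timestamp list as plain data means the same array can drive both the automatic skipping and the dropdown of direct links.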