Visualizing Pathology Data: Colin Megill Interview

Shirley Wu’s interview with Colin Megill on Nextstrain, a tool for visual analysis of coronavirus (COVID-19) and pathology data
data visualization
Published

March 15, 2020

Can data visualization be used to track the evolution of COVID-19, and help scientists track the spread of disease?

Yes! Nextstrain is a set of open-source tools for sharing virus genome datasets and for analyzing and presenting that data visually, in context.

On March 13, Shirley Wu interviewed Nextstrain co-creator Colin Megill, with the goal of empowering those without a biology background to understand how the tool can be used, and to provide interested open-source developers with the necessary background to get involved. As I recently started contributing to the project, I was extremely interested to learn more.

To make the recording more accessible to new developers (and because I lack video editing skills), I made a React component to skip the breaks + enable viewers to jump to key segments of the raw video.

Twitter

Miscellaneous

Terminology

See the second dropdown for direct links to in-video discussions.

Other COVID Data Visualizations

Personal notes

  • Examine the “molecular clock” segment more closely to understand how sequencing helps estimate how many people were infected
  • Each individual sequenced result is much more valuable when contextualized by metadata from other sequenced results
  • Each tree dot represents a test, which costs ~$1000 + lab technician time.
  • The “Narratives” workflow lets scientists write interactive reports just by modifying a Markdown file. This would be a useful workflow for other interactive data exploration tools. To make this integration simple, most of the application’s configuration state (filters, layout settings, etc.) lives in URL parameters.
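
To illustrate the last note, here is a minimal sketch of keeping view state in URL parameters, so that a link in a Markdown report can restore a specific view. The `ViewState` shape, the parameter names (`layout`, `colorBy`, `filter_*`), and the default values are my own assumptions for illustration, not Nextstrain's actual parameter scheme.

```typescript
// Hypothetical view state; the field names are assumptions for illustration.
interface ViewState {
  layout: string;
  colorBy: string;
  filters: Record<string, string[]>;
}

const FILTER_PREFIX = "filter_";

// Serialize the view state into a query string that can be embedded in a link.
function encodeViewState(state: ViewState): string {
  const params = new URLSearchParams();
  params.set("layout", state.layout);
  params.set("colorBy", state.colorBy);
  for (const [field, values] of Object.entries(state.filters)) {
    // Skip empty filters so the URL stays short.
    if (values.length > 0) {
      params.set(FILTER_PREFIX + field, values.join(","));
    }
  }
  return params.toString();
}

// Rebuild the view state from a query string, falling back to defaults
// (the defaults here are placeholders) when a parameter is missing.
function decodeViewState(query: string): ViewState {
  const params = new URLSearchParams(query);
  const filters: Record<string, string[]> = {};
  params.forEach((value, key) => {
    if (key.startsWith(FILTER_PREFIX)) {
      filters[key.slice(FILTER_PREFIX.length)] = value.split(",");
    }
  });
  return {
    layout: params.get("layout") ?? "rect",
    colorBy: params.get("colorBy") ?? "country",
    filters,
  };
}
```

Because the whole view round-trips through a string, a report only needs to store one URL per paragraph to drive the visualization. Note that this sketch joins filter values with commas, so it assumes the values themselves contain none.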

Technical notes

  • The playback tool was built with the react-player API.
  • I used the Chrome Invideo extension to search the video transcript for keywords when building the timestamp list.
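
For anyone curious, the jump-to-segment logic can be sketched roughly as below. This is a simplified reconstruction rather than the component itself: the `Seekable` interface stands in for a player instance (in react-player v2, as far as I know, instances expose a compatible `seekTo(amount, type)` method), and the timestamps in `segments` are placeholders, not the real ones from the video.

```typescript
// Minimal shape of a player that can seek. Assumption: a react-player v2
// instance obtained through a ref satisfies this interface.
interface Seekable {
  seekTo(amount: number, type?: "seconds" | "fraction"): void;
}

interface Segment {
  label: string;
  start: string; // "mm:ss" or "hh:mm:ss"
}

// Convert "mm:ss" or "hh:mm:ss" into a number of seconds.
function toSeconds(timestamp: string): number {
  return timestamp
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Move the player to the start of the chosen segment.
function jumpTo(player: Seekable, segment: Segment): void {
  player.seekTo(toSeconds(segment.start), "seconds");
}

// Placeholder segment list: the timestamps are hypothetical.
const segments: Segment[] = [
  { label: "Introduction", start: "00:45" },
  { label: "Molecular clock", start: "12:30" },
  { label: "Narratives", start: "1:02:10" },
];
```

In a component, each entry in `segments` would render as a button or dropdown option whose click handler calls `jumpTo` with the player ref.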