Wikipedia Proves Useful for Tracking Flu
Scientists may have figured out the best way yet to track the spread of flu: Watch how many people visit Wikipedia articles about the flu and its symptoms.
Researchers at Boston Children’s Hospital and Harvard Medical School say the technique beats the Centers for Disease Control and Prevention’s (CDC) estimates of flu levels by up to two weeks. It also outperformed Google’s Flu Trends data by 17 percent, the researchers said.
“Having a timely estimate of what is happening in the population is crucial to being able to accurately plan for vaccines strategies and to coordinate public health and medical personnel,” said Dr. David McIver, one of the study’s authors. “The earlier we are able to know what the flu burden is, the better we will be able to distribute resources and limit disease spread.”
The CDC estimates the flu kills between 3,000 and 49,000 Americans each year.
“Each influenza season provides new challenges and uncertainties to both the public as well as the public health community,” the team wrote in a statement. “We’re hoping that with this new method of influenza monitoring, we can harness publicly available data to help people get accurate, near-realtime information about the level of disease burden in the population.”
To come up with their data, McIver and fellow researcher Dr. John Brownstein “calculated the number of times certain Wikipedia articles were accessed every day from December 2007 to August 2013.”
According to McIver, they looked at 35 Wikipedia articles in total, including several that were meant to act as beacons of normal website traffic, such as the Wikipedia Main Page.
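The idea of comparing flu-related page views against a baseline of normal traffic can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' actual model: the function name `weekly_flu_share`, the article titles and the view counts are all hypothetical, and dividing by Main Page views is just one simple way to use such a baseline.

```python
from collections import defaultdict
from datetime import date


def weekly_flu_share(daily_views, flu_articles, baseline_article="Main_Page"):
    """Return {(iso_year, iso_week): flu-article views / baseline views}.

    daily_views maps (date, article_title) -> view count.
    Dividing by a baseline article is an assumed, simplified way to
    correct for overall swings in site traffic.
    """
    flu_totals = defaultdict(int)
    base_totals = defaultdict(int)
    for (day, article), views in daily_views.items():
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # ISO year and ISO week number
        if article in flu_articles:
            flu_totals[week] += views
        elif article == baseline_article:
            base_totals[week] += views
    # Skip weeks with no baseline traffic to avoid dividing by zero.
    return {
        week: flu_totals[week] / base_totals[week]
        for week in sorted(base_totals)
        if base_totals[week] > 0
    }


# Made-up counts for two days that fall in the same ISO week.
sample = {
    (date(2013, 1, 7), "Influenza"): 300,
    (date(2013, 1, 7), "Fever"): 100,
    (date(2013, 1, 7), "Main_Page"): 10000,
    (date(2013, 1, 8), "Influenza"): 500,
    (date(2013, 1, 8), "Fever"): 100,
    (date(2013, 1, 8), "Main_Page"): 10000,
}
shares = weekly_flu_share(sample, {"Influenza", "Fever"})
```

A rising share from one week to the next would suggest growing interest in flu topics relative to ordinary browsing; turning that into an estimate of actual illness would still require fitting the numbers against official surveillance data.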
The researchers said their model worked well during “severe” flu seasons as well as during the outbreak of H1N1 in 2009.
While the data proved speedy when predicting the spread of flu, McIver doesn’t think it should replace data from the CDC or Google.
“It would be tough to say that this data is better than CDC data – they are very different sources of information,” he said. “All methods of estimating flu activity have their pros and cons. The best tool for measuring this type of disease burden will probably end up being a system that combines different types of data together, to get a greater overview of what is really happening in the population.”
The study was published in PLoS Computational Biology.