Collaboration is the future, according to Amazon Web Services Chief Data Scientist Matt Wood

Once data makes its way to the cloud, it opens up entirely new methods of collaboration where researchers or even entire industries can access and work together on shared datasets too big to move around. “This sort of data space is something that’s becoming common in fields where there are very large datasets,” Wood said, citing as an example the 1000 Genomes project dataset that AWS houses.



DNAnexus’s cloud-based architecture

The genetics space is drooling over the promise of cloud computing. The 1000 Genomes database is only 200TB, Wood explained, but very few project leads could get the budget to store that much data and make it accessible to their peers, much less pay for the computing power required to process it. And even in fields such as pharmaceuticals, Amazon CTO Werner Vogels told me during an earlier interview, companies are using the cloud to collaborate on certain datasets so they don’t have to spend time and money reinventing the wheel.

