The Rise of Data Science in the Age of Big Data Analytics: Why data distillation and machine learning aren't enough
Big Data matters because we want to use it to make sense of our world. It's tempting to think there's some "magic bullet" for analyzing big data, but simple "data distillation" often isn't enough, and unsupervised machine-learning systems can be dangerous. (Like, bringing-down-the-entire-financial-system dangerous.) Data Science is the key to unlocking insight from Big Data: by combining computer science skills with statistical analysis and a deep understanding of the data and the problem, we can not only make better predictions but also fill in gaps in our knowledge, and even find answers to questions we hadn't thought of yet.
In this talk, David will:
- Introduce the concept of Data Science, and give examples of where Data Science succeeds with Big Data ... and where automated systems have failed.
- Describe the Data Scientists' Toolkit: the systems and technology components Data Scientists need to explore, analyze and create data apps from Big Data.
- Share some thoughts about the future of Big Data Analytics, and the diverging use cases for computing grids, data appliances, and Hadoop clusters.
- Discuss the skills needed to succeed.
- Talk about the technology stack that a data scientist needs to be effective with Big Data, and describe emerging trends in the use of various data platforms for analytics: specifically, Hadoop for data storage and data "refinement"; data appliances for performance and production; and computing grids for data exploration and model development.