Over 200 people packed into the Microsoft NERD Center on May 1st for the MassTLC Big Data Summit: You have the data, now what? At the outset of the program, MassTLC released three research reports on the intersection of big data and connected cities, life sciences, and healthcare, each detailing the need for the talent and technologies related to big data and complex analytics to push these industries forward.
The keynote that followed, from Oracle Big Data Strategist Paul Sonderegger, fired up the audience for the rest of the day. Somehow, Paul was able to redefine big data in ways that I’m sure the data scientists in the room had never thought of before. From putting a stop to an outbreak of cholera in the 1800s to inventing electricity, his examples of using data were unique and attention-grabbing.
Some key takeaways from the keynote: use structured and unstructured data from both internal and external sources, marry them all together, and create competitive advantages. Doing this correctly lets you: 1) get faster answers to new questions; 2) predict more, and more accurately; 3) create a data reservoir; and 4) accelerate data-driven action.
How are the leaders doing it?
From there, we moved into our data scientists panel, How are the leaders doing it?, which included: Chris Baker, Data Science Lead, Dyn; Joe Hendrickson, Vice President, athenahealth; Pete Martin, VP of Engineering, Pixability; and Ingo Mierswa, CEO and Founder, RapidMiner.
Each of our panelists hailed from a different industry and company size, but they had many things in common, primarily the conviction that a team approach is the best approach: almost nobody has the breadth and depth of skills it takes to carry out all the tasks of a data science team. They also agreed that to use big data most effectively, you don’t want to go out in search of the needle in the haystack. Instead, you want to identify where you can begin to see patterns or anomalies and then work from there. Finally, they offered some wise words about using the information you gain from data in a thoughtful and careful manner. Nothing is 100% certain, so you must not let your executives jump to rash decisions without weighing all the circumstances.
Privacy and Governance
A fireside chat with Paul Barth, Co-Founder of NewVantage Partners, and Justin Holmes, Interim CIO of the City of Boston, provided views into the early stages of the whys, whats, and hows of looking at and putting into place policies around privacy and governance. With the constant push and pull between wanting transparency and access and ensuring privacy, this is quite a heady topic. Paul provided some very interesting (and cautionary) examples of how information gleaned from data may seem harmless to some but cause problems for others, so the effect on all stakeholders and potential stakeholders needs to be thought through. Also, within many industries (financial services, healthcare, etc.), combining different data sets can have legal ramifications, for example when a data set meant only for anonymous use is combined with a data set from which personal information can be obtained.
Justin offered a view from the other end. For the City of Boston, a primary goal of opening data and making it accessible is to enable its citizens, constituents, and visitors to gain value from it (speed bumps, snow plow tracking, restaurant inspections). At the same time, the City must always find a balance between the impact that making the information public could have and the value that it provides. The City is consistently looking for new opportunities as it works to make many more data sets available to the public, and it has just recently embarked on its Privacy and Governance policy process. Be sure to check it out and provide your feedback.
Sourcing the Next Big Thing
Wrapping up our day were the folks who put everything we discussed into action. Sourcing the Next Big Thing was meant to show that the next big thing in data is how industries are putting it to use. Richard Dale, COO of Optum; Steve Dodson, CTO of Prelert; Robert Nagle, VP & GM of Data Platforms; and moderator Chris Selland, VP Marketing & Business Development for HP Vertica, each talked about their own experiences and how they have been able to achieve positive outcomes. Again, a common theme through the day was to assemble and use a team approach. Another theme of this panel: capture and keep as much data as you can, because you never know when you might want to use it. And again, execution on the findings is key.