As data grows in both volume and complexity, data scientists have much to consider about how to manage it and benefit from it. About thirty scientists who study data for business or academia came together in Chicago recently to discuss common issues in their work and see what they could learn from each other. Most of these experts agreed that they wanted to do more for their organizations or companies, but that they faced “infrastructural roadblocks” that prevented them from succeeding. In the public health field, for example, legal issues prevent the sharing of data. The scientists agreed that what matters is not just the ability to use powerful computational systems, but also the ability to “think through complex problems before” starting the computational processes. People skills, natural curiosity, and the ability to think and question openly were also deemed important. According to Dr. Scott Nicholson of Accretive Health, “My definition of a data scientist is someone who uses data to solve problems, end to end, from asking the right questions to making insights actionable.” Data scientists should ideally be able to work closely with CEOs, because clear communication of problems and solutions is very important. One participant commented that cleaning and organizing data is 70% of the work. Another problem the industry faces is a shortage of workers for data-related jobs. According to Kirk Borne of George Mason University, “For every hundred job openings, there may just be a couple of applicants.”
One of the presentations at the meeting was about the intelligence behind salesforce.com, a service that helps businesses understand customer actions. The customer behavioral data it gathers totals about 1 billion transactions a day, and this data is then transferred to Hadoop. Idea exchange is a partner site that allows customers to record their responses directly. From the data gathered through salesforce.com and Idea exchange, the teams have delivered “about four ideas per week based on the analysis”. This is a clear example of making data useful for business.