Analytics in a Big Data World
For organizations looking to enhance their business capabilities through data analytics, this resource is the go-to reference.
Big Data and Analytics
Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. 1 In relative terms, this means 90 percent of the data in the world has been created in the last two years. Gartner projects that by 2015, 85 percent of Fortune 500 organizations will be unable to exploit big data for competitive advantage and about 4.4 million jobs will be created around big data. 2 Although these estimates should not be interpreted in an absolute sense, they are a strong indication of the ubiquity of big data and of the strong need for analytical skills and resources. As the data pile up, managing and analyzing them in an optimal way become critical success factors in creating competitive advantage and strategic leverage.
Figure 1.1 shows the results of a KDnuggets 3 poll conducted during April 2013 about the largest data sets analyzed. The total number of respondents was 322, and the numbers per category are indicated between brackets. The median was estimated to be in the 40 to 50 gigabyte (GB) range, which was about double the median answer for a similar poll run in 2012 (20 to 40 GB). This clearly shows how quickly the data sets that analysts work with are growing. A further regional breakdown of the poll showed that U.S. data miners lead other regions in big data, with about 28 percent of them working with terabyte (TB) size databases.
Figure 1.1 Results from a KDnuggets Poll about Largest Data Sets Analyzed
Source: www.kdnuggets.com/polls/2013/largest-dataset-analyzed-data-mined-2013.html.
A main obstacle to fully harnessing the power of big data using analytics is the lack of skilled resources and "data scientist" talent required to exploit big data. In another poll run by KDnuggets in July 2013, a strong need emerged for analytics/big data/data mining/data science education. 4 It is the purpose of this book to try to fill this gap by providing a concise and focused overview of analytics for the business practitioner.
Analytics is everywhere and strongly embedded into our daily lives. As I write this, I have already been the subject of various analytical models today. When I checked my physical mailbox this morning, I found a catalogue that was most probably sent to me as the result of a response modeling exercise indicating that, given my characteristics and previous purchase behavior, I am likely to buy one or more products from it. Today, I was also the subject of a behavioral scoring model at my financial institution. This is a model that looks at, among other things, my checking account balance over the past 12 months and my credit payments during that period, together with other kinds of information available to my bank, to predict whether I will default on my loan during the next year. My bank needs to know this for provisioning purposes. Also today, my telephone services provider analyzed my calling behavior and my account information to predict whether I will churn during the next three months. As I logged on to my Facebook page, the social ads appearing there were based on an analysis of all the information (posts, pictures, my friends and their behavior, etc.) available to Facebook. My Twitter posts will be analyzed (possibly in real time) by social media analytics to understand both the subject of my tweets and their sentiment. As I checked out in the supermarket, my loyalty card was scanned first, followed by all my purchases. My supermarket will use this information to analyze my market basket, which will help it decide on product bundling, next best offer, improving shelf organization, and so forth. As I made the payment with my credit card, my credit card provider used a fraud detection model to see whether it was a legitimate transaction