Big data analytics is the often complex process of examining large and varied data sets ("big data") generated by sources such as eCommerce, mobile devices, social media, and the Internet of Things (IoT). It involves integrating different data sources, transforming unstructured data into structured data, and generating insights using specialized tools and techniques that distribute data processing across a network of machines.
The amount of digital data is growing rapidly, doubling roughly every two years. Big data analytics offers a different approach to managing and analyzing all of these data sources. While the principles of traditional data analytics generally still apply, the scale and complexity of big data required the development of new ways to store and process the petabytes of structured and unstructured data involved.
The demand for faster speeds and greater storage capacities created a technological vacuum that was soon filled by new approaches to storing and processing data.
Big data analytics applies advanced analytic techniques to very large data sets that include structured, semi-structured, and unstructured data from various sources, ranging in size from terabytes to zettabytes.
Prior to the invention of Hadoop, the technologies underpinning modern storage and compute systems were relatively basic, limiting companies mostly to the analysis of "small data." Even this form of analytics could be difficult, especially when integrating new data sources. Traditional data analytics relies on relational databases of structured data, so every byte of raw data needs to be formatted in a specific way before it can be ingested into the database for analysis. This often lengthy process, commonly known as extract, transform, load (ETL), is required for each new data source. The main problem with this three-part process is that it is extremely time- and labor-intensive, sometimes requiring up to 18 months for data scientists and engineers to implement or change.
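To make the three ETL steps concrete, here is a minimal sketch in Python using only the standard library. It is illustrative rather than a description of any particular product: the `orders` table, its columns, the sample CSV text, and the function names are all assumptions made for this example. It shows why each new source takes work, since the transform step has to force every raw row into the fixed schema that the relational database expects.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a new data source (assumed for illustration).
RAW_CSV = """order_id,customer,amount,order_date
1001, Alice ,19.99,2023-01-05
1002,Bob,not_a_number,2023-01-06
1003,Carol,5.00,2023-01-07
"""


def extract(raw_text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: coerce each row into the fixed schema; skip rows that do not fit."""
    clean = []
    for row in rows:
        try:
            clean.append((
                int(row["order_id"]),
                row["customer"].strip(),
                float(row["amount"]),
                row["order_date"],
            ))
        except ValueError:
            # A real pipeline would log or quarantine the bad row.
            continue
    return clean


def load(conn, records):
    """Load: insert the structured records into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, "
        "amount REAL NOT NULL, order_date TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", records)
    conn.commit()


def run_etl(raw_text):
    """Run extract, transform, and load against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(raw_text)))
    return conn


if __name__ == "__main__":
    conn = run_etl(RAW_CSV)
    for row in conn.execute("SELECT * FROM orders ORDER BY order_id"):
        print(row)
```

In this sketch, the row with the malformed amount is dropped during the transform step, so only two of the three raw rows reach the table. A source with a different layout would need its own extract and transform logic, which is where much of the time and labor described above goes.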
Once data was inside the database, though, in most cases it was easy enough for data analysts to query and analyze. But then along came the Internet, eCommerce, social media, mobile devices, marketing automation, and Internet of Things (IoT) devices, and the volume and complexity of raw data became too much for all but a handful of institutions to analyze in the normal course of business.
Big data analytics helps organizations harness their data using advanced data science techniques such as natural language processing, deep learning, and machine learning. These techniques uncover hidden patterns, unknown correlations, market trends, and customer preferences, helping organizations identify new opportunities and make more informed business decisions.
