Wednesday, March 22, 2017

Understanding Data, Analytics and Technology platforms Part 1

Note: This article was co-authored with Saraswathi Ramachandra

These days, a lot of young engineers ask me about analytics.

Often, the word "Analytics" is used in conjunction with "Big Data" or "Hadoop". But in the real business world, Hadoop or big data forms a small (but growing) portion of today's business analytics.

This prompted us to write this blog to educate folks on the realities of analytics in business today. The blog is written in three parts for ease of reading.

Part-1: Understanding Data for Analytics
Part-2: Understanding different Uses of Analytics
Part-3: Understanding Analytics Technology Platforms

Part-1: Understanding Data for Analytics

Before we start talking about analytics, we need to understand data. What kind of data is used in business analytics?

Today, businesses have a wide range of data sources that are used in business analytics. The diagram below shows the popularity of data types.

In business organizations today, structured data still rules. According to research done by TDWI Research in 2015 [1], structured data is used by 90% of the companies they surveyed, a sample covering 357 large enterprises across 4 continents and all categories.



Figure-1: Different Data Types & Popularity

Organizations are trying to bring together multiple disparate data sources. For now, many of those data sources are traditional data types such as structured data in databases, data warehouses, legacy data, and data from previous reports. These data sources are typically well curated and well understood by businesses.

However, new data sources such as unstructured data (mainly audio and video streams), data from IoT devices, Point-of-Sale machines, social media, telemetry data, web logs, click streams, etc. are becoming more relevant and important for businesses - as they help businesses get a real-time view of what is happening to their business. Therefore the use & value of these data types are increasing rapidly.

For instance, close to 31% of respondents cited geospatial data as being in use today. Geospatial data can be very useful for real-time analytics. Airlines use geospatial data to track and simulate the flight paths of thousands of flights a day, and they can view all of these flights on an interactive map. If weather or other events occur, air traffic dispatchers can tweak flight paths as part of air traffic control. This can help reduce costs.

Click-stream data, machine-generated data from sensors, and data from social media are also being used by businesses to understand customers in real time. Data is continually processed, and if a calculated value exceeds a certain threshold, an alert is generated or a downstream application is notified. This allows businesses to respond in near real time to customer activities. Today, companies are using very sophisticated technologies to analyze data in real time and manage customer experiences.
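To make the threshold idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular product works: the function name monitor_stream, the rolling-average calculation, the window size and the sample click counts are all our own assumptions.

```python
from collections import deque
from typing import Callable, Iterable, List


def monitor_stream(
    readings: Iterable[float],
    threshold: float,
    window: int = 3,
    notify: Callable[[int, float], None] = lambda position, average: None,
) -> List[int]:
    """Return the positions at which the rolling average exceeded the threshold.

    Hypothetical helper: the "calculation" here is a simple rolling mean.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` readings
    alerts = []
    for position, value in enumerate(readings):
        recent.append(value)
        if len(recent) < window:
            continue  # not enough data for the calculation yet
        average = sum(recent) / window
        if average > threshold:
            alerts.append(position)
            notify(position, average)  # stand-in for a downstream application
    return alerts


# Made-up clicks-per-second values, purely for illustration.
clicks_per_second = [10, 12, 11, 30, 45, 50, 12, 9]
alert_positions = monitor_stream(clicks_per_second, threshold=25.0)
print(alert_positions)
```

In a real deployment the readings would arrive from a streaming platform rather than a list, and notify would call whatever alerting or downstream system the business uses.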

Recently, streaming data from sensors and other IoT devices has started to be analyzed at the source of data generation to create time-series data (this is often referred to as analytics on the edge). The time-series data can then be stored in databases & used for further analysis.
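As a rough sketch of what analytics on the edge can look like, the Python snippet below collapses raw sensor samples into one summary row per time bucket, which is the kind of compact time-series record that could then be sent to a database. The function name summarize_at_edge, the 60-second bucket and the sample readings are assumptions made for this example only.

```python
from typing import Dict, Iterable, List, Tuple


def summarize_at_edge(
    samples: Iterable[Tuple[int, float]],
    bucket_seconds: int = 60,
) -> List[Tuple[int, float, float, float]]:
    """Collapse raw (timestamp, value) samples into per-bucket summary rows.

    Each row is (bucket_start, mean, minimum, maximum). Hypothetical helper.
    """
    buckets: Dict[int, List[float]] = {}
    for timestamp, value in samples:
        # Round the timestamp down to the start of its bucket.
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets.setdefault(bucket_start, []).append(value)
    return [
        (start, sum(values) / len(values), min(values), max(values))
        for start, values in sorted(buckets.items())
    ]


# Made-up (seconds, temperature) readings, purely for illustration.
raw_samples = [(0, 20.0), (15, 22.0), (45, 21.0), (60, 30.0), (90, 34.0)]
time_series = summarize_at_edge(raw_samples)
for row in time_series:
    print(row)
```

The point of doing this at the device is that only the summary rows need to travel over the network, instead of every raw reading.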

Coming Next: Part-2: Understanding different uses of analytics
