Showing posts with label data scientist. Show all posts

Monday, June 28, 2021

Types of Graph Analysis

 




Graph analysis has become a groundbreaking way for organizations to look at their data and understand the relationships within it. For two years running, Gartner selected graphs as one of its top analytics and data trends because of their significant potential for value creation.

Graphs capture relationships and connections between entities, and those relationships are what the analysis works on. Knowing how the data is connected, and building a graph to understand the relationships, is becoming increasingly important because it makes it easier to explore those connections and uncover new insights. For example, a graph can show how a person’s buying pattern is influenced by all the entities that person is connected with.


Centrality analysis: Estimates how important a node is for the connectivity of the network. It helps to identify the most influential people in a social network, or the most frequently accessed web pages, for example by using the PageRank algorithm.

Community detection: Distance and density of relationships can be used to find groups of people interacting frequently with each other in a social network. Community analytics also deals with the detection and behaviour patterns of communities.

Connectivity analysis: Determines how strongly or weakly connected two nodes are.

Path analysis: Examines the relationships between nodes. Mostly used in shortest distance problems.
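
To make three of these analysis types concrete, here is a minimal Python sketch using only the standard library. The small social graph and all the names in it are hypothetical, and the PageRank shown is a plain power iteration that assumes every node has at least one neighbour; a production system would more likely use a graph library or a graph database. Community detection is left out because it needs a more involved algorithm.

```python
from collections import deque

# Hypothetical undirected social graph: each key lists that person's contacts.
graph = {
    "ann": ["bob", "cat", "dan"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob"],
    "dan": ["ann", "eve"],
    "eve": ["dan"],
    "fay": ["gus"],
    "gus": ["fay"],
}

def pagerank(graph, damping=0.85, iterations=50):
    """Centrality analysis: power-iteration PageRank over an adjacency dict.
    Assumes every node has at least one neighbour (no dangling nodes)."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in graph}
        for node, neighbours in graph.items():
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank

def shortest_path(graph, start, goal):
    """Path analysis: breadth-first search for the route with the fewest hops."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None  # no route: the two nodes are not connected

def connected_components(graph):
    """Connectivity analysis: group the nodes that can reach each other."""
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        component, queue = set(), deque([node])
        seen.add(node)
        while queue:
            current = queue.popleft()
            component.add(current)
            for neighbour in graph[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

ranks = pagerank(graph)
most_central = max(ranks, key=ranks.get)
path = shortest_path(graph, "bob", "eve")
components = connected_components(graph)
print(most_central, path, len(components))
```

In this toy graph the best-connected person comes out as the most central node, and the two people with no link to the rest form their own component, so no path exists between the two groups.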



Thursday, August 30, 2018

Interesting Careers in Big Data


Big Data & data analytics have opened a wide range of new & interesting career opportunities. There is an urgent need for Big Data professionals in organizations.

Not all these careers are new; many of them are remappings or enhancements of older job functions. For example, statisticians were formerly deployed mostly in government organizations, or in sales and manufacturing for sales forecasting and financial analysis. Today they have become central to business operations. Similarly, business analysts have become key to data analytics, because they play a critical role in understanding business processes and identifying solutions.


Here are 12 interesting & fast-growing careers in Big Data.

1. Big Data Engineer
Big data engineers architect, build & maintain the IT systems for storing & analyzing big data. They are responsible, for example, for designing a Hadoop cluster used for data analytics. These engineers need a good understanding of computer architectures and must be able to develop the complex IT systems needed to run analytics.

2. Data Engineer
Data engineers understand the source, volume and destination of data, and have to build solutions to handle this volume of data. This could include setting up databases for handling structured data, setting up data lakes for unstructured data, securing all the data, and managing data throughout its lifecycle.

3. Data Scientist
Data scientist is a relatively new role. Data scientists are primarily mathematicians who can build complex models from which one can extract meaningful analysis.

4. Statistician
Statisticians are masters in crunching structured numerical data & developing models that can test business assumptions, enhance business decisions and make predictions.

5. Business Analyst
Business analysts are the conduits between the big data team and the business. They understand business processes and business requirements, and identify solutions to help the business. Business analysts work with data scientists, analytics solution architects and the business to create a common understanding of the problem and the proposed solution.

6. AI/ML Scientist
This is a relatively new role in data analytics. Historically, this work was part of large government R&D programs; today, AI/ML scientists are becoming the rock stars of data analytics.

7. Analytics Solution Architect
Solution architects are the programmers who develop software solutions that lead to automation and to reports for faster, better decisions.

8. BI Specialist
BI Specialists understand data warehouses and structured data, and create reporting solutions. They also work with the business to evangelize BI solutions within the organization.

9. Data Visualization Specialist
This is a relatively new career. Big data presents a big challenge in terms of how to make sense of such vast amounts of data. Data visualization specialists have the skills to convert large amounts of data into simple charts & diagrams that visualize various aspects of the business. This helps business leaders understand what’s happening in real time and make better, faster decisions.

10. AI/ML Engineer
These are elite programmers who can build AI/ML software based on algorithms developed by AI/ML scientists. In addition, AI/ML engineers need to monitor AI solutions for the output & decisions made by AI systems and take corrective action when needed.

11. BI Engineer
BI Engineers build, deploy, & maintain data warehouse solutions, manage structured data through its lifecycle and develop BI reporting solutions as needed.

12. Analytics Manager
This is a relatively new role created to help business leaders understand and use data analytics and AI/ML solutions. Analytics Managers work with business leaders to smooth solution deployment and act as a liaison between the business and the analytics team throughout the solution lifecycle.

Friday, August 17, 2018

4 Types of Data Analytics


Data analytics can be classified into 4 types based on complexity & value. In general, the most valuable analytics are also the most complex.

1. Descriptive analytics

Descriptive analytics answers the question:  What is happening now?

For example, in IT management, it tells how many applications are running at a given instant and how well those applications are working. Tools such as Cisco AppDynamics, SolarWinds NPM etc., collect huge volumes of data, analyze it and present it in an easy to read & understand format.

Descriptive analytics compiles raw data from multiple data sources to give valuable insights into what is happening & what happened in the past. However, this type of analytics does not tell what is going wrong or explain why; it only helps trained managers and engineers to understand the current situation.
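
As a rough illustration of what "compiling raw data" can look like, the sketch below summarizes a handful of monitoring samples per application. The sample data, field layout and the `summarize` helper are all hypothetical and are not taken from any of the tools named above.

```python
from statistics import mean

# Hypothetical monitoring samples: (application, response time in ms, status).
samples = [
    ("billing", 120, "ok"),
    ("billing", 340, "error"),
    ("search", 80, "ok"),
    ("search", 95, "ok"),
    ("billing", 150, "ok"),
]

def summarize(samples):
    """Report, per application, how many samples were seen, the mean
    response time, and the share of samples that reported an error."""
    by_app = {}
    for app, response_ms, status in samples:
        by_app.setdefault(app, []).append((response_ms, status))
    report = {}
    for app, rows in by_app.items():
        errors = sum(1 for _, status in rows if status == "error")
        report[app] = {
            "samples": len(rows),
            "mean_response_ms": mean(ms for ms, _ in rows),
            "error_rate": errors / len(rows),
        }
    return report

report = summarize(samples)
for app, stats in report.items():
    print(app, stats)
```

The output describes what is happening, but, as noted above, it says nothing about why the billing application is slower or failing.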

2. Diagnostic analytics

Diagnostic analytics uses real-time data and historical data to automatically deduce what has gone wrong and why. Typically, diagnostic analysis is used for root cause analysis.

Large amounts of data are used to find dependencies and relationships and to identify patterns that give a deep insight into a particular problem. For example, Dell EMC Service Assurance Suite can provide fully automated root cause analysis of IT infrastructure. This helps IT organizations to rapidly troubleshoot issues & minimize downtime.

3. Predictive analytics

Predictive analytics tells what is likely to happen next.

It uses historical data to identify recurring patterns of events and predict what will happen next. Descriptive and diagnostic analytics are used to detect tendencies, clusters and exceptions, and predictive analytics is built on top of them to predict future trends.

Advanced algorithms such as forecasting models are used to make the predictions. It is essential to understand that a forecast is just an estimate. Its accuracy depends heavily on data quality and on the stability of the situation, so it requires careful treatment and continuous optimization.

For example, HPE InfoSight can predict what can happen to IT systems, based on current & historical data. This helps IT companies to manage their IT infrastructure and prevent future disruptions.
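
A very simple forecasting model can be sketched as a straight-line trend fitted to past values with ordinary least squares. This is only an illustration under assumed data: the weekly disk-usage numbers and function names are hypothetical, and it is not how any particular product makes its predictions.

```python
def fit_linear_trend(values):
    """Ordinary least-squares line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast(values, steps_ahead):
    """Extend the fitted trend line steps_ahead periods past the last value."""
    slope, intercept = fit_linear_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)

# Hypothetical weekly disk usage in percent for one storage system.
disk_usage = [52.0, 55.0, 58.0, 61.0, 64.0]
next_week = forecast(disk_usage, 1)
print(next_week)
```

Because the sample history grows by exactly three points a week, the line simply continues that trend. Real data is noisier, which is why the estimate has to be re-fitted and checked as new data arrives.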



4. Prescriptive analytics

Prescriptive analytics is used to literally prescribe what action to take when a problem occurs.

It uses vast data sets and intelligence to analyze the outcome of each possible action and then select the best option. This state-of-the-art type of data analytics requires not only historical data, but also external information from human experts (an approach also known as expert systems) in its algorithms to choose the best possible decision.

Prescriptive analytics uses sophisticated tools and technologies, like machine learning, business rules and algorithms, which makes it complex to implement and manage.

For example, IBM Runbook Automation tools help IT Operations teams to simplify and automate repetitive tasks. Runbooks are typically created by technical writers working for top tier managed service providers. They include procedures for every anticipated scenario, and generally use step-by-step decision trees to determine the effective response to a particular scenario.
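
The step-by-step decision tree idea behind a runbook can be sketched in a few lines of Python. The tree, the questions and the recommended actions below are invented for illustration and do not reflect the format of any real runbook product.

```python
# Hypothetical runbook expressed as a nested decision tree. Inner nodes ask a
# yes/no question about the observed system; leaves hold the prescribed action.
RUNBOOK = {
    "question": "service_responding",
    "yes": {"action": "no action needed"},
    "no": {
        "question": "disk_full",
        "yes": {"action": "purge old logs, then restart service"},
        "no": {"action": "restart service and escalate if it fails again"},
    },
}

def prescribe(tree, observations):
    """Walk the decision tree using the observed facts and return the action."""
    node = tree
    while "action" not in node:
        answer = observations[node["question"]]
        node = node["yes"] if answer else node["no"]
    return node["action"]

# Example: the service is down and its disk is full.
action = prescribe(RUNBOOK, {"service_responding": False, "disk_full": True})
print(action)
```

Keeping the tree as plain data means the people who write the procedures can review and update it without touching the code that walks it.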

Thursday, July 26, 2018

4 Stages of Developing a Data Lake

Companies generally go through the following four stages of development when building a data lake:


Wednesday, July 04, 2018

Skills Needed To Be A Successful Data Scientist

Data scientist, the most in-demand job of the 21st century, requires multidisciplinary skills: a mix of math, statistics, computer science, communication & business acumen.


Monday, May 14, 2018

Popular Programming Languages for Data Analytics


Data analysis is becoming a very important and exciting field to work in. To become a data scientist, one needs advanced mathematical skills, advanced statistical skills and real-world programming ability. In addition to C/C++ & Java, there are several programming languages that are designed for data analysis.

I have listed the most popular programming languages for data analysis.





Thursday, April 05, 2018

Data Preparation Process


Data preparation is the first step in modern data analytics and BI, data science, and data integration.

Data preparation takes more than 60% of the data analytics time. With businesses demanding faster time to insight to remain competitive, analytics is becoming more pervasive across the enterprise, and insights are being derived from larger numbers of diverse data sources, both internal and external to the enterprise, with varying degrees of trustworthiness. This increases complexity.

Data preparation processes reduce time to insight for analytics & are the first step in data analytics.

Thursday, March 15, 2018

What is a data scientist?

Rising alongside the relatively new technology of big data is the new job title of data scientist.
While not tied exclusively to big data projects, the data scientist role does complement them because of the increased breadth and depth of data being examined, as compared to traditional roles.

So what does a data scientist do?

A data scientist represents an evolution from the business or data analyst role. The formal training is similar, with a solid foundation typically in computer science and applications, modeling, statistics, analytics and math. What sets the data scientist apart is strong business acumen, coupled with the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge. Good data scientists will not just address business problems, they will pick the right problems that have the most value to the organization.

The data scientist has to be part analyst and part artist. Data scientists need to know the math and must have a curious, inquisitive mind to discover mathematical relationships and trends across different sets of data.

Data scientists must be inquisitive enough to explore, ask questions, do "what if" analysis, and question existing assumptions and processes. A data scientist should be curious to explore and examine data from multiple disparate sources. The data scientist will sift through all incoming data with the goal of discovering a previously hidden insight, which in turn can provide a competitive advantage or address a pressing business problem. A data scientist does not simply collect and report on data, but also looks at it from many angles, determines what it means, then recommends ways to apply the data.

Armed with data and analytical results, a top-tier data scientist will then help automate various parts of decision making in organizations and help leaders make better decisions.