Monday, June 21, 2021
Fintech Use Case for Graph Analytics
In my previous blog, I wrote about high-level use cases for Graph Analytics. In today's blog, let's dive deeper and take a look at how fintech companies can use graph analytics to allocate credit to customers and manage risk.
Today, banks and other fintech companies have access to tons of information, but their current databases and IT solutions cannot make the best use of it. Customer information such as KYC data and other demographic data is often stored in traditional RDBMS databases; transactional data is stored in a separate database; customer interaction data from web and mobile apps is stored in big data HDFS stores; and social network or other network data about customers is often not used at all.
This is where graph databases such as Neo4j, or the graph capabilities of Oracle Autonomous Database, come into play.
A graph database can connect the dots between different sources of information, and one can build a really cool, intelligent AI solution to predict future purchases, credit needs, and risks. Predictions can then be validated against actual transactional data to iterate and build better models.
Graph databases are built for high scalability and performance. There are several open-source algorithms and libraries that can detect components and connections within the data; you can evaluate these connections and start making predictions, which will only get better over time.
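To make this concrete, here is a minimal sketch, assuming hypothetical customer data and using the open-source networkx library (the post does not prescribe a specific library). It links customers through shared attributes and flags any group connected to a known defaulter, the kind of signal a credit-risk model can consume.

```python
import networkx as nx

# Hypothetical example: link customers that share identifying
# attributes (phone, address) and flag clusters that contain a
# known defaulter. All names and records are illustrative only.
G = nx.Graph()
G.add_edges_from([
    ("cust_a", "phone_1"), ("cust_b", "phone_1"),  # shared phone
    ("cust_b", "addr_9"),  ("cust_c", "addr_9"),   # shared address
    ("cust_d", "phone_2"),                         # unconnected customer
])

known_defaulters = {"cust_a"}

# Each connected component is a cluster of linked identities.
for component in nx.connected_components(G):
    customers = {n for n in component if n.startswith("cust_")}
    flagged = bool(customers & known_defaulters)
    print(sorted(customers), "-> linked to a defaulter:", flagged)
```

At bank scale, the same connected-component idea runs inside a graph database such as Neo4j, with many more node and edge types feeding the prediction models described above.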
Wednesday, June 16, 2021
BFSI Use cases for Graph Analytics
Graph analytics is used to analyze relationships among entities such as customers, products, operations, and devices. Businesses run on these relationships: between customers, between customers and products, in how/where/when customers use products, and in how business operations affect these relationships. In a nutshell, it is like analyzing social networks, and financial companies can gain immensely by using graph analytics.
Let's look at the four biggest use cases of graph analytics in the world of finance.
- Stay Agile in Risk & Compliance
Financial services firms today face increased regulation when it comes to risk and compliance reporting. Rather than update data manually across silos, today's leading financial organizations use Neo4j to unite data silos into a federated metadata model, enabling them to trace and analyze their data across the enterprise as well as update their models in real time.
- Fraud Protection
Dirty money is passed around to blend it with legitimate funds and then turned into hard assets. Detecting circular money transfers helps prevent money laundering via money mules. Graph analytics discovers the network of individuals and common transfer patterns in real time to prevent common frauds, such as illegal ATM transactions. Data like IP addresses, cards used, branch locations, and the timing of transfers can be instantly tied to individuals to prevent fraudulent transactions. (A minimal sketch of circular-transfer detection appears after this list.)
- Leverage Data Across Teams
Data is the lifeblood of finance. Companies strive to actively collect, store, and use data. At the same time, financial companies are governed by laws, regulations, and standards around data, and the burden of staying compliant and ensuring data privacy has become ever more complex and expensive.
Graph analytics allows tracking data lineage through the data lifecycle. Data can be tracked and navigated, vertex by vertex, by following the edges. With graph analytics, it is possible to follow the path and find where the information originated, where it was copied, and where it was utilized. This makes it easier to remain compliant and to use data for its full value.
- Capture a 360-Degree View of Customers
Marketing is all about understanding the relationships between customers and products. Knowing the relationships between customers, customers' transactions, and products builds a 360-degree view of the customer, which can be used for better marketing and to more effectively provide customers with what they want.
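Here is the promised sketch of circular-transfer detection: a minimal, hypothetical example using the open-source networkx library (the post itself names Neo4j; networkx is used here only to keep the illustration self-contained). Account names and amounts are invented.

```python
import networkx as nx

# Hypothetical transfer records: (from_account, to_account, amount).
transfers = [
    ("acct_1", "acct_2", 9500),
    ("acct_2", "acct_3", 9400),
    ("acct_3", "acct_1", 9300),   # closes a suspicious loop
    ("acct_4", "acct_5", 120),    # ordinary one-off payment
]

# Model transfers as a directed graph; a cycle means money
# eventually returns to its origin -- a classic laundering pattern.
G = nx.DiGraph()
for src, dst, amount in transfers:
    G.add_edge(src, dst, amount=amount)

for cycle in nx.simple_cycles(G):
    print("Possible money-mule ring:", " -> ".join(cycle))
```

In a production system the same query would run inside the graph database itself (for example, as a Cypher pattern in Neo4j), enriched with timing and amount thresholds to cut false positives.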
Tuesday, June 15, 2021
How Banks Can Benefit from Blockchain Analytics
Blockchain is a digital, decentralized public ledger that records transactions across several computers linked in a peer-to-peer network. It was originally developed for cryptocurrency assets like Bitcoin, Dogecoin, and Ethereum, but in recent years several new use cases have emerged in financial services. (See: Blockchain for Secure Healthcare Records, How Banks & Financial Institutions can use Blockchain Technology, Blockchain use cases)
As blockchain's use cases go beyond cryptocurrency, into government applications, healthcare, identity management, art, and IPR, the database of all blockchain transactions grows even bigger, richer, and more valuable for banks, if they can tap this data via analytics and use the insights to build better services. The benefit of blockchain is its inherent transparency: the blockchain's decentralized, open network allows banks to collect data from blockchain transactions.
The Rise of Data Analytics
Aside from all the aforementioned areas, blockchain also has huge potential in analytics. Modern businesses have been benefiting from data analytics for several years now. Currently, the big problem with any data analytics effort is getting quality data from different sources and correlating it, along with the question of whether there is enough of the right data.
This is where blockchain technology helps. Data recorded in a blockchain is irrefutable and can easily be cross-verified from any node in the network. Having access to a large network that provides high-quality data across a vast number of datasets is invaluable.
A good potential application is blockchain analytics to understand cryptocurrency customers and traders. A bank's asset and wealth management business and its consumer banking marketing organization can use these analytics for future marketing campaigns and for managing cryptocurrency as an asset class in wealth management. Such a system can also be used to forecast price movements for cryptocurrencies.
Today there are more than 100 digital assets including Bitcoin, Ethereum, ERC-20 tokens, and other crypto coins, representing over $200 billion worth of transactions per month.
Other use cases include risk analysis on crypto transactions: uncovering activities related to money laundering, terrorist fundraising, fraud, and other financial crimes. Blockchain analytics can de-anonymize fund flows by actively collecting millions of data points every week and applying machine learning to this huge data pool to track flows to both legitimate entities and criminal activity.
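As a toy illustration of de-anonymizing fund flows (not any vendor's method), the sketch below walks a hypothetical transaction graph outward from a flagged wallet and reports paths that reach labeled entities. All addresses and labels are invented.

```python
from collections import deque

# Hypothetical on-chain transfers, plus labels mapping some
# addresses to known entities (exchanges, mixers, ...).
transfers = {
    "wallet_x": ["wallet_y", "wallet_z"],
    "wallet_y": ["exchange_hot_1"],
    "wallet_z": ["mixer_1"],
    "mixer_1":  ["exchange_hot_2"],
}
labels = {
    "exchange_hot_1": "Exchange A deposit address",
    "mixer_1":        "Known mixing service",
    "exchange_hot_2": "Exchange B deposit address",
}

def trace(source):
    """Breadth-first walk from `source`, printing paths to labeled endpoints."""
    seen, queue = {source}, deque([(source, [source])])
    while queue:
        addr, path = queue.popleft()
        if addr in labels:
            print(" -> ".join(path), f"[{labels[addr]}]")
        for nxt in transfers.get(addr, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))

trace("wallet_x")
```

Real blockchain-analytics platforms do this over millions of addresses, layering machine-learned clustering on top of the raw graph walk.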
Thursday, August 30, 2018
Interesting Careers in Big Data
Big Data and data analytics have opened a wide range of new and interesting career opportunities, and there is an urgent need for Big Data professionals in organizations.
Not all these careers are new; many are remappings or enhancements of older job functions. For example, statisticians were formerly employed mostly in government organizations or in sales and manufacturing for sales forecasting and financial analysis; today, they have become central to business operations. Similarly, business analysts have become key to data analytics, as they play the critical role of understanding business processes and identifying solutions.
Here are 12 interesting and fast-growing careers in Big Data.
1. Big Data Engineer
Big data engineers architect, build, and maintain the IT systems for storing and analyzing big data; for example, they design the Hadoop clusters used for data analytics. These engineers need a good understanding of computer architectures and build the complex IT systems needed to run analytics.
2. Data Engineer
Data engineers understand the source, volume and destination of data, and have to build solutions to handle this volume of data. This could include setting up databases for handling structured data, setting up data lakes for unstructured data, securing all the data, and managing data throughout its lifecycle.
3. Data Scientist
Data scientist is a relatively new role. Data scientists are primarily mathematicians who can build complex models from which one can extract meaningful insights.
4. Statistician
Statisticians are masters in crunching structured numerical data & developing models that can test business assumptions, enhance business decisions and make predictions.
5. Business Analyst
Business analysts are the conduits between big data team and businesses. They understand business processes, understand business requirements, and identify solutions to help businesses. Business analysts work with data scientists, analytics solution architects and businesses to create a common understanding of the problem and the proposed solution.
6. AI/ML Scientist
This is a relatively new role in data analytics. Historically, this work was part of large government R&D programs; today, AI/ML scientists are becoming the rock stars of data analytics.
7. Analytics Solution Architects
Solution architects are the programmers who develop software solutions that deliver automation and reporting for faster, better decisions.
8. BI Specialist
BI Specialists understand data warehouses, structured data and create reporting solutions. They also work with business to evangelize BI solutions within organizations.
9. Data Visualization Specialist
This is a relatively new career. Big data presents a big challenge in terms of how to make sense of this vast data. Data visualization specialists have the skills to convert large amounts of data into simple charts & diagrams – to visualize various aspects of business. This helps business leaders to understand what’s happening in real time and take better/faster decisions.
10. AI/ML Engineer
These are elite programmers who build AI/ML software based on algorithms developed by AI/ML scientists. In addition, AI/ML engineers need to monitor the outputs and decisions of AI systems and take corrective action when needed.
11. BI Engineer
BI Engineers build, deploy, & maintain data warehouse solutions, manage structured data through its lifecycle and develop BI reporting solutions as needed.
12. Analytics Manager
This is a relatively new role created to help business leaders understand and use data analytics and AI/ML solutions. Analytics managers work with business leaders to smooth solution deployment and act as a liaison between business and the analytics team throughout the solution lifecycle.
Tuesday, August 21, 2018
Fundamentals of Data Management in the Age of Big Data
Data management, data privacy, and security risks pose a great management challenge. To address these challenges, companies need to put proper data management policies in place. Here are eight fundamental policies of data management that need to be adhered to by all companies.
Friday, August 17, 2018
4 Types of Data Analytics
Data analytics can be classified into 4 types based on complexity and value. In general, the most valuable analytics are also the most complex.
1. Descriptive analytics
Descriptive analytics answers the question: What is happening now?
For example, in IT management, it tells how many applications are running at that instant and how well those applications are working. Tools such as Cisco AppDynamics and SolarWinds NPM collect huge volumes of data, analyze it, and present it in an easy-to-read format.
Descriptive analytics compiles raw data from multiple data sources to give valuable insights into what is happening and what happened in the past. However, it does not say what is going wrong, or explain why; it helps trained managers and engineers understand the current situation.
2. Diagnostic analytics
Diagnostic analytics uses real-time and historical data to automatically deduce what has gone wrong and why. Typically, diagnostic analysis is used for root cause analysis, to understand why things went wrong.
Large amounts of data are used to find dependencies and relationships and to identify patterns, giving deep insight into a particular problem. For example, the Dell EMC Service Assurance Suite can provide fully automated root cause analysis of IT infrastructure, helping IT organizations rapidly troubleshoot issues and minimize downtime.
3. Predictive analytics
Predictive analytics tells what is likely to happen next.
It uses all the historical data to identify patterns of events and predict what will happen next. Descriptive and diagnostic analytics are used to detect tendencies, clusters, and exceptions, and predictive analytics is built on top of them to predict future trends.
Advanced algorithms such as forecasting models are used to make the predictions. It is essential to understand that a forecast is just an estimate whose accuracy depends heavily on data quality and the stability of the situation, so it requires careful treatment and continuous optimization.
For example, HPE Infosight can predict what can happen to IT systems, based on current & historical data. This helps IT companies to manage their IT infrastructure to prevent any future disruptions.
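To show the forecasting idea in its simplest form (this is not how InfoSight or any other product works internally), here is a trailing moving-average forecast over hypothetical daily CPU-load readings.

```python
# Hypothetical daily CPU-load readings (%) for one server.
readings = [62, 64, 61, 70, 72, 75, 78, 80]

def moving_average_forecast(values, window=3):
    """Predict the next value as the mean of the last `window` values."""
    recent = values[-window:]
    return sum(recent) / len(recent)

# The rising trend pushes the forecast toward the recent highs.
print("Next-day forecast:", moving_average_forecast(readings))
```

As the surrounding text notes, such a forecast is only an estimate; its usefulness depends entirely on data quality and on how stable the underlying workload is.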
4. Prescriptive analytics
Prescriptive analytics is used to literally prescribe what action to take when a problem occurs.
It uses vast data sets and intelligence to analyze the outcome of each possible action and then select the best option. This state-of-the-art type of data analytics requires not only historical data but also external input from human experts (so-called expert systems) in its algorithms to choose the best possible decision.
Prescriptive analytics uses sophisticated tools and technologies, like machine learning, business rules, and algorithms, which makes it complex to implement and manage.
For example, IBM Runbook Automation tools help IT operations teams simplify and automate repetitive tasks. Runbooks are typically created by technical writers working for top-tier managed service providers. They include procedures for every anticipated scenario and generally use step-by-step decision trees to determine the effective response to a particular scenario.
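As a toy sketch of the business-rules side of prescriptive analytics (not IBM's implementation), the example below maps a diagnosed condition to a prescribed action, the way a runbook's decision tree does. Conditions and actions are invented.

```python
# Hypothetical runbook: each diagnosed condition maps to a
# prescribed action; unknown conditions escalate to a human.
RUNBOOK = {
    "disk_full":   "archive old logs, then extend the volume",
    "memory_leak": "restart the offending service and open a ticket",
    "link_down":   "fail over to the standby link",
}

def prescribe(condition):
    return RUNBOOK.get(condition, "escalate to the on-call engineer")

print(prescribe("disk_full"))
print(prescribe("unknown_alert"))
```

Real prescriptive systems combine such rules with machine learning that scores the likely outcome of each candidate action before choosing one.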
Wednesday, July 25, 2018
Why Edge Computing Is Critical for IoT Success
Edge computing is the practice of processing data near the edge of your network, where the data is being generated, instead of in a centralised data-processing warehouse.
Edge computing is a distributed, open IT architecture that features decentralised processing power, enabling mobile computing and Internet of Things (IoT) technologies. In edge computing, data is processed by the device itself or by a local computer or server, rather than being transmitted to a data centre.
Edge computing enables data-stream acceleration, including real-time data processing without latency. It allows smart applications and devices to respond to data almost instantaneously, as it is being created, eliminating lag time. This is critical for technologies such as self-driving cars, and has equally important benefits for business.
Edge computing allows for efficient data processing: large amounts of data can be processed near the source, reducing Internet bandwidth usage. This both reduces costs and ensures that applications can be used effectively in remote locations. In addition, the ability to process data without ever putting it into a public cloud adds a useful layer of security for sensitive data.
Thursday, July 19, 2018
5 Pillars of Data Management for Data Analytics
Data is the lifeblood of big data analytics and all the AI/ML solutions built on top of it.
Here are 5 basic data management principles that must never be broken.
1. Secure Data at Rest
- Most data lives in storage systems, which must be secured.
- All data at rest must be encrypted (a minimal sketch follows this list).
2. Fast & Secure Data Access
- Fast access to data from databases, storage systems. This implies using fast storage servers and FC SAN networks.
- Strong access control & authentication is essential
3. Manage Networks for Data in Transit
- This involves building fast networks - a 40Gb Ethernet for compute clusters and 100Gb FC SAN networks
- Fast SD-WAN technologies ensure that globally distributed data can be used for data analytics.
4. Secure IoT Data Stream
- IoT endpoints are often in remote locations and have to be secured.
- Corrupt data from IoT will break Analytics.
- Having Intelligent Edge helps in preprocessing IoT data - for data quality & security
5. Rock Solid Data backup and recovery
- Accidents and disasters do happen. Protect against data loss and unavailability with a rock-solid data backup solution.
- Robust disaster recovery solutions can give zero RTO/RPO.
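Here is the sketch promised under pillar 1: a minimal example of encrypting data at rest using the open-source cryptography package. The record and file name are hypothetical, and in production the key would live in a KMS/HSM rather than next to the data.

```python
from cryptography.fernet import Fernet

# Hypothetical example: encrypt a record before writing it to disk.
key = Fernet.generate_key()   # in production: fetched from a KMS/HSM
fernet = Fernet(key)

record = b'{"customer_id": 42, "kyc_status": "verified"}'

with open("record.enc", "wb") as f:
    f.write(fernet.encrypt(record))

# Reading it back requires the key; the data at rest stays unreadable.
with open("record.enc", "rb") as f:
    assert fernet.decrypt(f.read()) == record
```

Storage arrays and databases usually provide this transparently (self-encrypting drives, transparent data encryption); the application-level version above just makes the principle visible.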
Wednesday, July 18, 2018
Business Success with Data Analytics
Data and advanced analytics have arrived. Data is becoming ubiquitous, but several organizations are struggling to use data analytics in everyday business processes. Companies that adopt data analytics at the truest and deepest levels will have a significant competitive advantage; those who fall behind risk becoming irrelevant. Analytics has the potential to upend the prevailing business models in many industries, and CEOs are struggling to understand how analytics can help.
Here are 10 key points that must be followed to succeed.
- Understand how Analytics can disrupt your industry
- Define ways in which Analytics can create value & new opportunities
- Top managers should learn to love metrics and measurements
- Change Organizational structures to enable analytics based decision making
- Experiment with data driven, test-n-learn decision making processes
- Data Ownership must be well defined & Data Access must be made easier
- Invest in data management, data Security & analytics tools
- Invest in training & hiring people to drive analytics
- Establish Organizational Benchmarks for data analytics
- Lay out a long-term road map for business success with Analytics
Wednesday, July 04, 2018
Top Challenges Facing AI Projects in Legacy Companies
Companies reluctantly start a few AI projects, only to abandon them.
Here are the top 7 challenges AI projects face in legacy companies:
1. Management Reluctance
- Fear of exacerbating the asymmetrical power of AI
- Need to protect their domains
- Pressure to maintain the status quo
2. Ensuring Corporate Accountability
- Internal fissures
- Legacy processes hinder accountability for AI systems
3. Copyrights & Legal Compliance
- Inability to agree on data copyrights
- Legacy Processes hinder compliance when new AI systems are implemented
4. Lack of Strategic Vision
- Top management lacks a strategic vision for AI
- Leaders are unaware of AI's potential
- AI projects are not fully funded
5. Data Authenticity
- Lack of tools to verify data Authenticity
- Multiple data sources
- Duplicate Data
- Incomplete Data
6. Understanding Unstructured Data
- Lack of tools to analyze Unstructured data
- Middle management does not understand value of information in unstructured data
- Incomplete data for AI tools
7. Data Availability
- Lack of tools to consolidate data
- Lack of knowledge on sources of data
- Legacy systems that prevent data sharing
Monday, July 02, 2018
Big Data Analytics for Digital Banking
Big Data has a huge impact on banking, especially in the era of digital banking.
Here are six main benefits of data analytics for banks.
1. Customer Insights
Banks can follow customers' social media and gain valuable insights into customer behavior patterns.
Social media analysis gives more accurate insights than traditional customer surveys.
Social media analysis can be near real time, helping banks understand customer needs better.
2. Customer Service
Big data analysis of a customer's historical data and current web data can be used to identify customer issues proactively and resolve them even before the customer complains.
E.g.: Analyzing customers' geographical data can help banks optimize ATM locations.
3. Customer Experience
Banks can use big data analytics to customize websites in real time, to enhance customer experience.
Banks can use analytics to send real-time messages and communications regarding account status.
With big data analytics, banks can be proactive in enhancing customer service.
4. Boosting Sales
Social media analysis gives more accurate insights into customers' needs and helps promote the right banking products to customers. For example, customers looking at housing advertisements and discussing housing finance on social media are most likely in need of a housing loan.
Data analytics can accurately assess customers' needs, and banks can promote the right types of solutions.
5. Fraud Detection
Big Data analysis can detect fraud in real time and prevent it
Data from third parties and banking networks holds valuable information about customer interactions.
6. New Product Introduction
Big data analysis can identify new needs and help develop products that meet those needs.
E.g.: Mobile payment services, open bank APIs, ERP integration gateways, and international currency exchange services are all based on data analytics.
Wednesday, June 20, 2018
Data Life Cycle Management in the Age of Big Data
Organizations are eager to harness the power of big data. Big data creates tremendous opportunities and challenges.
The data lifecycle stretches through multiple phases as data is created, used, shared, updated, stored, and eventually archived or defensibly disposed of. Data lifecycle management plays an especially key role in three of these phases of data's existence:
1. Disclose Data
2. Manipulate Data
3. Consume Data
Organizations can benefit from data only if they manage the entire data lifecycle, focus on good governance, and use, share, and monetize their data.
Monday, May 21, 2018
AI for IT Infrastructure Management
AI is being used today for IT infrastructure management. IT infrastructure generates lots of telemetry data from sensors and software, which can be used for observation and automation. As IT infrastructure grows in size and complexity, standard monitoring tools do not work well. That is when we need AI tools to manage IT infrastructure.
Like any classical AI system, an IT infrastructure management system has 5 standard steps:
1. Observe:
Typical IT systems collect billions of data points from thousands of sensors, sampling every 4-5 minutes. I/O pattern data is also collected in parallel and parsed for analysis.
2. Learn:
Telemetry data from each device is modeled along with its global connections, and the system learns each device's and application's stable and active states, as well as their unstable states. Abnormal behavior is identified by learning from the I/O patterns and configurations of each device and application (a toy sketch of this learn-then-flag idea follows step 5).
3. Predict:
AI engines learn to predict issues based on pattern-matching algorithms. Even application performance can be modeled and predicted based on historical workload patterns and configurations.
4. Recommend:
Based on predictive analytics, recommendations are developed using expert systems. Recommendations are based on what constitutes an ideal environment, or what is needed to improve the current condition.
5. Automate:
IT automation is done via Run Book Automation tools, which run on behalf of IT administrators, and all details of the event and automation results are entered into an IT ticketing system.
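Here is the toy sketch referenced in step 2: flagging abnormal telemetry with a simple z-score over a hypothetical latency series. Real AIOps engines use far richer models; this only shows the learn-then-flag pattern.

```python
import statistics

# Hypothetical I/O latency readings (ms), sampled every few minutes.
latencies = [2.1, 2.3, 2.0, 2.2, 2.4, 2.1, 9.8, 2.2]

# "Learn" the stable state from the data itself...
mean = statistics.mean(latencies)
stdev = statistics.stdev(latencies)

# ...then flag samples more than 2 standard deviations away from it.
for i, value in enumerate(latencies):
    z = (value - mean) / stdev
    if abs(z) > 2:
        print(f"sample {i}: {value} ms looks abnormal (z = {z:.1f})")
```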
Thursday, May 17, 2018
How to select use cases for AI automation
AI is growing rapidly, and companies are actively looking at how to use AI in their organizations and automate work to improve profitability.
Approaching the problem from a business management perspective, the ideal areas to automate are at the periphery of business operations, where jobs are routine and repetitive but need little human intelligence, like warehouse operators or metro train drivers. These jobs follow a set pattern, and even if there is a mistake, whether by a human operator or a robot, the costs are very low.
Business operations tend to employ large numbers of people with minimal skills and use lots of safety systems to minimize the cost of errors. These areas are usually the low-hanging fruit for automation with AI and robotics.
Developing an AI application is a lot more complex, but all apps involve 4 basic steps:
1. Identify the area for automation: areas where automation solves a business problem and saves money.
2. Identify data sources: automation needs tons of data, so one needs to identify all possible sources of data and start collecting and organizing it.
3. Develop the application: once data is collected, AI applications can be developed. Today there are several AI libraries and tools for developing new applications; my next blog covers the popular AI application development tools.
4. Deploy and improve: once an AI tool to automate a business process is developed, it has to be deployed, monitored, and checked for additional improvements, which should be part of the regular business improvement program.
Monday, May 14, 2018
Popular Programming Languages for Data Analytics
I have listed the most popular programming languages for data analysis.
Tuesday, May 08, 2018
Build Modern Data Center for Digital Banking
Building a digital bank needs a modern data center. The dynamic nature of fintech and digital banking calls for a new kind of data center: highly dynamic, scalable, agile, and highly available, offering all compute, network, storage, and security services as programmable objects with unified management.
A modern data center enables banks to respond quickly to the dynamic needs of the business.
Rapid IT responsiveness is architected into the design of a modern infrastructure that abstracts traditional infrastructure silos into a cohesive, virtualized, software-defined environment that supports both legacy and cloud-native applications and seamlessly extends across private and public clouds.
A modern data center can deliver infrastructure as code to application developers for even faster provisioning of both test and production deployments via rapid DevOps.
Modern IT infrastructure is built to deliver automation: to rapidly configure, provision, deploy, test, update, and decommission infrastructure and applications (legacy, cloud-native, and microservices).
Modern IT infrastructure is built with security as a solid foundation to help protect data, applications, and infrastructure in ways that meet all compliance requirements, and also offer flexibility to rapidly respond to new security threats.
Thursday, May 03, 2018
Data Analytics for Competitive Advantage
Data analytics is touted as THE tool for competitive advantage.
In this article, I break down data analytics into its three main components and list the various activities that are done in each category.
Three Main Components of Data Analytics
1. Data Management
2. Standard Analytics
3. Advanced Analytics
Data Management
Data management forms the foundation of data analytics. About 80% of effort and cost is incurred in data management functions. The world of data management is vast and complex; it consists of several activities that need to be done:
1. Data Architecture
2. Data Governance
3. Data Development
4. Data Security
5. Master Data Management
6. Metadata Management
7. Data Quality Management
8. Document & Content Management
9. Database & Data Warehousing Operations
Standard Analytics
Advanced Analytics
Monday, April 30, 2018
Build State of Art AI Deep Learning Systems
The HPE Apollo 6500 Gen10 System is an ideal HPC and deep learning platform, providing unprecedented performance with industry-leading GPUs, fast GPU interconnect, high-bandwidth fabric, and a configurable GPU topology to match your workloads. The ability of computers to autonomously learn, predict, and adapt using massive data sets is driving innovation and competitive advantage across many industries, and these applications are driving the platform's requirements.
The system, with rock-solid reliability, availability, and serviceability (RAS) features, includes up to eight GPUs per server, NVLink 2.0 for fast (up to 300 GB/s) GPU-to-GPU communication, support for Intel® Xeon® Scalable processors, and a choice of high-speed, low-latency fabrics, and it can be tuned to the workload through flexible configuration options. For deep learning workloads, AI models that would previously take days or weeks to train can now be trained in a few hours or minutes.
HPE SDS Storage Solutions
HPE Solution for Intel Enterprise Edition for Lustre is a high-performance compute (HPC) storage solution that includes the HPE Apollo 4520 System and Intel® Enterprise Edition for Lustre*. The Apollo 4520 is a dual-node system with up to 46 drives. Capacity can be increased by adding additional drives in a disk enclosure. The solution can scale by adding more systems in parallel to scale performance and capacity nearly linearly.
Scality RING running on HPE ProLiant servers provides an SDS solution for petabyte-scale data storage that is designed to interoperate in the modern SDDC. The RING software creates a scale-out storage system, deployed as a distributed system on a minimum cluster of six storage servers.
Scality RING object storage software is designed to run on industry-standard server platforms, offering lower infrastructure costs and scalability beyond the capacity points of typical file-server storage subsystems. The HPE Apollo 4200 series servers provide a comprehensive and cost-effective set of storage building blocks for customers who wish to deploy an object storage software solution.
Vertica v9
Formerly an HPE product, Vertica version 9 delivers high-performance in-database machine learning and advanced analytics: a unified advanced analytics database featuring major advancements in in-database machine learning.