Big Data Analysis

Using this Presentation Template

This presentation provides an adaptable format for presenting information about Big Data projects to your organization. On its own, it introduces the basic concepts of Big Data and how they relate to one another.

We encourage you to use the format to expand each section to support the individual needs of your organization.

Data Mining versus Data Analytics

Structured and Unstructured Data

By commonly cited estimates, nearly 80% of all data is unstructured.

Data analysis is traditionally performed only on structured data.

Unstructured data must become structured in order to be analyzed. This can be a complex and expensive endeavor.
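As a minimal sketch of what "becoming structured" can mean, the example below turns free-text notes into records with named fields. The note format, the field names, and the pattern are assumptions made for illustration; real unstructured data is rarely this regular, which is where the complexity and expense come from.

```python
import re

# Hypothetical free-text notes; this layout is an assumption for illustration.
NOTES = [
    "2023-01-05 customer 1042 reported late delivery, refund 25.00",
    "2023-01-06 customer 2077 asked about invoice",
    "2023-01-07 customer 1042 reported damaged item, refund 40.50",
]

# Named groups become the columns of the structured record.
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) customer (?P<customer_id>\d+) "
    r"(?P<text>.*?)(?:, refund (?P<refund>\d+\.\d{2}))?$"
)


def structure(note):
    """Return a dict of typed fields, or None if the note does not fit the pattern."""
    match = PATTERN.match(note)
    if match is None:
        return None
    refund = match.group("refund")
    return {
        "date": match.group("date"),
        "customer_id": int(match.group("customer_id")),
        "text": match.group("text"),
        "refund": float(refund) if refund else 0.0,
    }


# Notes that cannot be parsed are skipped rather than guessed at.
records = [r for r in (structure(n) for n in NOTES) if r is not None]
total_refunds = sum(r["refund"] for r in records)
```

Once the notes are records, ordinary analysis such as summing refunds per customer becomes a simple query.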

Why Data Mining?

What value does data mining provide?

Supports decisions using unbiased information

Predicts future trends based on historical trends

Influences business focus and priorities

What limitations face data mining activities?

Security and privacy of the original data may go unmanaged

Misuse of information

Inaccuracies in information

Why Data Analytics?

What are the benefits of Data Analytics?

Targeted analysis of risk areas

Leveraging analysis across several projects

Increased frequency of review for high-risk activities

What are the limitations of Data Analytics?

Cost of increased data quality

Data Volume – finding the necessary value

Improperly budgeted efforts

Specialized skill sets required

Increasing Data Analysis Efforts

How to avoid common mistakes in Big Data

Be realistic, not optimistic.

Don’t put all your eggs into software.

Change the way you think.

Learn from mistakes.

Find the people who know.

Finish what you started.

Be practical; don’t oversell.

General Implementation Process

Choose a problem area.

Define data inclusions and exclusions.

Define business rules.

Translate rules into analytical queries and algorithms.

Choose appropriate presentation of results.

Maintain and improve analytics.
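The steps above can be sketched on a small scale. The example below assumes a hypothetical problem area (possible duplicate payments), a hypothetical record layout, and a hypothetical business rule; none of these come from the template, and a real implementation would depend on your organization's data and procedures.

```python
from collections import defaultdict

# Step 1 (problem area): possible duplicate payments. Records are hypothetical.
PAYMENTS = [
    {"id": 1, "vendor": "Acme", "amount": 500.0, "status": "paid"},
    {"id": 2, "vendor": "Acme", "amount": 500.0, "status": "paid"},
    {"id": 3, "vendor": "Birch", "amount": 120.0, "status": "paid"},
    {"id": 4, "vendor": "Acme", "amount": 500.0, "status": "void"},
    {"id": 5, "vendor": "Cedar", "amount": 75.0, "status": "paid"},
]


def in_scope(payment):
    # Step 2 (inclusions and exclusions): only paid records; voided ones are excluded.
    return payment["status"] == "paid"


def find_duplicates(payments):
    # Steps 3 and 4: the assumed rule "same vendor paid the same amount more
    # than once" is expressed as a grouping query over the in-scope records.
    groups = defaultdict(list)
    for payment in payments:
        if in_scope(payment):
            groups[(payment["vendor"], payment["amount"])].append(payment["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


def report(duplicates):
    # Step 5 (presentation): one plain-text line per suspected duplicate group.
    return [
        f"{vendor}: {amount:.2f} paid {len(ids)} times (ids {ids})"
        for (vendor, amount), ids in sorted(duplicates.items())
    ]


duplicates = find_duplicates(PAYMENTS)
lines = report(duplicates)
```

Step 6, maintaining and improving the analytics, would correspond to revising `in_scope` and the grouping rule as reviewers confirm or reject the flagged groups.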

Anomalies and False Positives

Anomaly – an occurrence that is unique or distinctly different from what is expected.

False Positive – a result indicating the presence of a given condition when the condition is not actually present.
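To make the two definitions concrete, the sketch below flags values that sit far from the average and then compares the flags against known outcomes. The daily counts, the labels, and the threshold of one standard deviation are all assumptions chosen for illustration, not a recommended detection method.

```python
from statistics import mean, stdev

# Hypothetical daily transaction counts.
counts = [100, 102, 98, 101, 99, 130, 100, 97, 103, 160]
# Hypothetical ground truth: only the last day had a real problem.
truly_bad = [False, False, False, False, False, False, False, False, False, True]


def flag_anomalies(values, threshold=1.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    return [abs(v - mu) / sigma > threshold for v in values]


flags = flag_anomalies(counts)

# Anomalies: every position the rule flagged.
anomalies = [i for i, flagged in enumerate(flags) if flagged]

# False positives: flagged by the rule, but no real problem on that day.
false_positives = [i for i in anomalies if not truly_bad[i]]
```

With these assumed numbers the rule flags two days, and one of them is a false positive: the count was unusual, but nothing was actually wrong. Raising the threshold would remove that false positive at the risk of missing real problems.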

Primary Capabilities of Data Analytics

Locating Data – identifying data sources, extracting the data from the source, and validating the data.

Normalizing Data – imposing regulatory and business standards on the data so that it is in a usable format, is organized, and has its anomalies and false positives handled as required by procedure.

Analyzing Data – identifying any significant trends, patterns, or differences that should be investigated or communicated.
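The three capabilities can be read as stages of a small pipeline. The sketch below assumes a hypothetical CSV extract with `region` and `sales` columns and a hypothetical standard (lowercase region names, numeric sales); the function names are illustrative only.

```python
import csv
import io

# Hypothetical extract standing in for a real data source.
RAW = """region,sales
north, 1200
South,950
north,not available
EAST,1800
south,1050
"""


def locate(raw_text):
    # Locating: extract rows from the source and validate the expected columns.
    reader = csv.DictReader(io.StringIO(raw_text))
    rows = list(reader)
    if reader.fieldnames != ["region", "sales"]:
        raise ValueError("unexpected columns in source")
    return rows


def normalize(rows):
    # Normalizing: apply the assumed standard and drop rows that cannot meet it.
    clean = []
    for row in rows:
        try:
            sales = float(row["sales"].strip())
        except ValueError:
            continue  # a real procedure might log or route these for review
        clean.append({"region": row["region"].strip().lower(), "sales": sales})
    return clean


def analyze(rows):
    # Analyzing: total sales per region, so differences between regions stand out.
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["sales"]
    return totals


totals = analyze(normalize(locate(RAW)))
```

Keeping the stages as separate functions means each one can be checked and improved on its own, for example by tightening validation in `locate` without touching the analysis.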