INCONSISTENCIES IN BIG DATA
Prepared by Minu Joseph
Guided by Mr. Thomas Varghese


Contents
• Introduction
• Problem Statement
• 3V’s
• Big data
• Defining big data
• Dimensions of big data
• Sources and applications of big data
• Inconsistencies in big data
• Inconsistency-induced learning
• Conclusion
• References


Introduction
• A torrent of data is generated and captured in digital form due to advances in science and technology.
• Everything we do increasingly leaves a digital trace.
• Big data refers to data sets so large and complex that traditional data processing applications are inadequate.


Problem Statement

• Big data: the next big thing in the IT industry.
• Classification of big data inconsistencies.
• Big data and big data analysis in terms of issues and challenges.
• Inconsistency-induced learning: a tool to turn big data inconsistencies into helpful formulas for better analysis of results.


Big Data
• Big data can be described by:
  • Volume
  • Velocity
  • Variety
  • Variability
  • Veracity
  • Complexity


What is BIG DATA?

(Embedded video: What is Big Data and how does it work)


Dimensions In Big Data


Levels of Knowledge


INCONSISTENCIES IN BIG DATA

• Temporal
• Spatial
• Text
• Functional dependency


Temporal Inconsistencies

• Conflicting information.
• Data items with conflicting circumstances may coincide or overlap in time.
• Software requirements specifications (SRS) often contain inconsistent information.
• Inconsistent information affects the correctness and performance of the system.
• Example: concurrent programming errors in the Therac-25 led to six accidents between 1985 and 1987.
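
The overlap condition described above can be sketched in code. This is a minimal illustration, assuming each data item is a hypothetical `Fact` record with a subject, a value, and a half-open validity interval. Two facts are flagged when they concern the same subject, carry different values, and their intervals overlap. The record layout, the function name, and the sample data are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Fact:
    subject: str   # entity the fact is about (hypothetical field)
    value: str     # asserted state of that entity
    start: int     # start of the validity interval (inclusive)
    end: int       # end of the validity interval (exclusive)


def temporal_conflicts(facts):
    """Return pairs of facts about the same subject whose values differ
    while their validity intervals overlap."""
    conflicts = []
    for a, b in combinations(facts, 2):
        same_subject = a.subject == b.subject
        overlap = a.start < b.end and b.start < a.end
        if same_subject and overlap and a.value != b.value:
            conflicts.append((a, b))
    return conflicts


# Made-up sample data for illustration only.
facts = [
    Fact("valve-7", "open", 0, 10),
    Fact("valve-7", "closed", 8, 15),   # overlaps the first fact on [8, 10)
    Fact("valve-7", "closed", 15, 20),  # does not overlap the "open" fact
]
for a, b in temporal_conflicts(facts):
    print(a.subject, a.value, "vs", b.value)
```

The pairwise comparison is quadratic in the number of facts, which is acceptable for a sketch; a large data set would more plausibly be grouped by subject and sorted by start time first.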


List of temporal inconsistencies


Spatial Inconsistencies

• Occur in datasets that include geometric or spatial dimensions.
• Traditional DB systems are enhanced to include spatially referenced data.
• Spatial inconsistencies can arise from:
  • Geometric representation of objects
  • Spatial relationships between objects
  • Aggregation of composite objects
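
One way to picture an inconsistency in the spatial relationship between objects is a pair of regions that the data declares disjoint while their geometries overlap. The sketch below assumes axis-aligned rectangles as a stand-in for real geometries; the `Rect` type, the function names, and the parcel data are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def interiors_overlap(a: Rect, b: Rect) -> bool:
    """True when the rectangles share interior area (touching edges do not count)."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def disjointness_violations(shapes, declared_disjoint):
    """Return the declared-disjoint name pairs whose geometries overlap."""
    return [
        (p, q) for p, q in declared_disjoint
        if interiors_overlap(shapes[p], shapes[q])
    ]


# Made-up sample data for illustration only.
shapes = {
    "parcel_a": Rect(0, 0, 10, 10),
    "parcel_b": Rect(8, 8, 20, 20),  # overlaps parcel_a in the square (8,8)-(10,10)
    "parcel_c": Rect(10, 0, 20, 8),  # only touches parcel_a along x = 10
}
declared = [("parcel_a", "parcel_b"), ("parcel_a", "parcel_c")]
print(disjointness_violations(shapes, declared))
```

Here only the pair `parcel_a` and `parcel_b` is reported, because `parcel_c` merely shares a boundary with `parcel_a`.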


Spatial Inconsistencies contd..


Text Inconsistencies

• Inconsistencies found in unstructured natural language text.
• Such data is generated from social media, blogs, emails, etc.
• If two texts refer to the same event or entity, they are said to be co-referent.
• Contradiction detection identifies text inconsistencies and has many applications.
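
As a rough illustration of contradiction detection, the toy heuristic below flags two sentences when they use the same words apart from negation markers and exactly one of them is negated. It is a deliberately naive sketch, not a description of any real contradiction-detection system: the negation word list and the function names are assumptions, and it misses antonyms, paraphrases, and numeric mismatches.

```python
import re

# Small, assumed list of negation markers; real systems use far richer cues.
NEGATIONS = {"not", "no", "never"}


def _tokens(sentence):
    """Lowercase the sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def naive_contradiction(s1, s2):
    """Flag two sentences as contradictory when they share the same
    non-negation words but exactly one of them contains a negation."""
    t1, t2 = _tokens(s1), _tokens(s2)
    content1 = {t for t in t1 if t not in NEGATIONS}
    content2 = {t for t in t2 if t not in NEGATIONS}
    negated1 = any(t in NEGATIONS for t in t1)
    negated2 = any(t in NEGATIONS for t in t2)
    return content1 == content2 and negated1 != negated2


print(naive_contradiction("The bridge is open.", "The bridge is not open."))
print(naive_contradiction("The bridge is open.", "The bridge is closed."))
```

The second call returns `False` even though the sentences disagree, which shows why practical approaches first establish co-reference and then rely on more than surface negation.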


Text Inconsistencies contd..


Functional Dependency Inconsistency

• A functional dependency requires that when certain attribute values are equal, other attribute values must also be equal.
• Many big databases are stored, aggregated, and cleaned with the help of an RDBMS.
• Here, functional dependencies play an important role in enforcing the integrity constraints of the database.


Functional Dependency Inconsistency contd…

• Violations of functional dependencies result in inconsistencies in data and information.
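
A functional dependency check can be sketched in a few lines. The example below assumes rows are plain dictionaries and tests a dependency of the form lhs -> rhs by grouping rows on the left-hand attributes and reporting any group that carries more than one right-hand value. The function name, the attribute names, and the sample rows are hypothetical and used only for illustration.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Check the functional dependency lhs -> rhs over a list of dict rows.

    Returns a mapping from each lhs value tuple to the set of rhs value
    tuples seen with it, restricted to keys with more than one rhs value.
    """
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        seen[key].add(tuple(row[a] for a in rhs))
    return {key: vals for key, vals in seen.items() if len(vals) > 1}


# Made-up sample rows for illustration only: zip is expected to determine city.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "94105", "city": "San Francisco"},
]
print(fd_violations(rows, ["zip"], ["city"]))
```

The key `("10001",)` is reported because equal `zip` values appear with two different `city` values, which is exactly the situation the dependency forbids.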


Inconsistency Induced Learning

• Improves data quality.
• Helps to enhance big data applications.
• Accommodates lifelong learning by allowing successive learning episodes to be triggered by the inconsistencies an agent encounters during its problem-solving episodes.
• The basic idea is to identify the cause of an inconsistency and then apply cause-specific heuristics to resolve it.
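
The "identify the cause, then apply a cause-specific heuristic" idea can be sketched as a small dispatch table. This is an illustrative assumption about how such a step might be organized, not the method from the cited work: the cause labels, the two heuristics, and the record layout are all hypothetical, and the sketch assumes the cause has already been identified.

```python
from collections import Counter


def resolve_keep_latest(records):
    """Assumed heuristic for a temporal cause: trust the most recent value."""
    return max(records, key=lambda r: r["recorded_at"])["value"]


def resolve_majority(records):
    """Assumed heuristic for a functional-dependency cause: keep the most common value."""
    return Counter(r["value"] for r in records).most_common(1)[0][0]


# Hypothetical mapping from an identified cause to its resolution heuristic.
HEURISTICS = {
    "temporal": resolve_keep_latest,
    "functional_dependency": resolve_majority,
}


def resolve(inconsistency):
    """Look up the heuristic for the inconsistency's cause and apply it."""
    cause = inconsistency["cause"]
    heuristic = HEURISTICS.get(cause)
    if heuristic is None:
        raise ValueError(f"no heuristic registered for cause {cause!r}")
    return heuristic(inconsistency["records"])


# Made-up sample case for illustration only.
case = {
    "cause": "temporal",
    "records": [
        {"value": "open", "recorded_at": 1},
        {"value": "closed", "recorded_at": 5},
    ],
}
print(resolve(case))  # closed
```

Keeping the heuristics in a table means a new cause can be supported by registering one more function, which loosely mirrors the notion of learning episodes extending what the agent can handle.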


Conclusion

• Multidimensional issues and challenges in big data and big data analysis.
• Types of inconsistencies.
• How to improve the quality of big data analysis.


References
• www.slideshare.com
• dl.acm.org
• www.ieeexplore.ieee.org
• D. Zhang, On Temporal Properties of Knowledge Base Inconsistency. Springer Transactions on Computational Science.
• M. Schroeck, R. Shockley, J. Smart, D. Romero-Morales, and P. Tufano, Analytics: the real-world use of big data: how innovative enterprises extract value from uncertain data, Executive Report, IBM Institute for Business Value and Said Business School at the University of Oxford.
• Nasrin Irshad Hussain, Big Data, www.slideshare.com


QUESTIONS?
