Distributed Computing with Turing Machine


Turing machine

Turing machines are an abstract model of computation. They provide a precise, formal definition of what it means for a function to be computable. A Turing machine is similar to a finite automaton, but with unlimited and unrestricted memory:
◦ it uses an infinite tape as its unlimited memory;
◦ it has a head that can read and write symbols and move right or left on the tape;
◦ the tape initially contains the input string, and the rest of the tape is blank.
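The tape, head and state control described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `run_turing_machine`, the blank symbol `_`, and the example machine `EVEN_ONES` (which accepts binary strings with an even number of 1s) are all assumptions chosen for the example.

```python
from collections import defaultdict

BLANK = "_"  # assumed blank symbol for this sketch


def run_turing_machine(transitions, tape_input, start, accept, reject,
                       max_steps=10_000):
    """Simulate a single-tape deterministic Turing machine.

    transitions maps (state, symbol) -> (new_state, symbol_to_write, move),
    where move is "R" or "L". Returns True on accept, False on reject.
    """
    # A defaultdict models the unbounded tape: unvisited cells read as blank.
    tape = defaultdict(lambda: BLANK)
    for i, ch in enumerate(tape_input):
        tape[i] = ch

    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        key = (state, tape[head])
        if key not in transitions:
            return False  # no applicable rule: treat as rejecting
        state, tape[head], move = transitions[key]
        head += 1 if move == "R" else -1
    raise RuntimeError("step limit reached without halting")


# Hypothetical example machine: accept strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("even", BLANK): ("accept", BLANK, "R"),
    ("odd", BLANK): ("reject", BLANK, "R"),
}

print(run_turing_machine(EVEN_ONES, "1001", "even", "accept", "reject"))
print(run_turing_machine(EVEN_ONES, "10", "even", "accept", "reject"))
```

The step limit exists only so the sketch always terminates; a real Turing machine has no such bound.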


Distributed computing based on big data

A big data system is a distributed system. It has a distributed file system named HDFS, which stores large data sets across a cluster and manages them.

If the input is a very large string, even a Turing machine that runs in polynomial time still takes a long time to process it.


Similarities between big data and the Turing machine

Mass storage: The Turing machine model uses an infinite tape as its unlimited memory. A Turing machine can therefore store a very large input, and instructions as well, on its tape. Big data systems draw their data from the internet, so their data is also very large.

Main control system: A Turing machine has a transition function that controls what the head reads and writes and which direction it moves on the tape. A big data system also has a main control component, the NameNode. It assigns data blocks to DataNodes and tells clients which DataNodes to operate on.


Differences between big data and the Turing machine

Some problems can be solved on a deterministic Turing machine in polynomial time.

The running time depends on the size of the input and on the transition function that controls the movement of the head.

All input and all computation are confined to the machine's own tape.

A big data system uses HDFS to manage the data, and the computation can be executed on many computers.


HDFS: Distributed File System

Large Data Assets

HDFS Parts

NameNode
◦ manages the filesystem namespace
◦ manages opening, closing, renaming, etc.
◦ maps blocks to DataNodes

DataNodes
◦ manage stored blocks (create/delete)
◦ serve reads/writes for data blocks
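The division of labour between the two parts can be illustrated with a toy model. This is only a sketch under stated assumptions, not the HDFS API: the classes `NameNode` and `DataNode`, the helpers `put_file` and `get_file`, the round-robin placement and the 4-byte block size are all hypothetical, and real HDFS additionally replicates each block.

```python
class DataNode:
    """Stores blocks and serves reads/writes for them."""

    def __init__(self):
        self.blocks = {}  # block_id -> bytes

    def write(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]

    def delete(self, block_id):
        del self.blocks[block_id]


class NameNode:
    """Holds only metadata: the namespace and the block-to-DataNode map."""

    def __init__(self, datanode_names):
        self.datanode_names = list(datanode_names)
        self.namespace = {}  # path -> [(block_id, datanode_name), ...]
        self.next_id = 0

    def allocate(self, path, num_blocks):
        # Assumed placement policy: simple round-robin over the DataNodes.
        placements = []
        for _ in range(num_blocks):
            name = self.datanode_names[self.next_id % len(self.datanode_names)]
            placements.append((self.next_id, name))
            self.next_id += 1
        self.namespace[path] = placements
        return placements

    def locate(self, path):
        return self.namespace[path]

    def rename(self, old_path, new_path):
        # Renaming touches metadata only; no block is moved.
        self.namespace[new_path] = self.namespace.pop(old_path)


def put_file(namenode, datanodes, path, data, block_size=4):
    """Client side: ask the NameNode where blocks go, then write to DataNodes."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    placements = namenode.allocate(path, len(chunks))
    for (block_id, name), chunk in zip(placements, chunks):
        datanodes[name].write(block_id, chunk)


def get_file(namenode, datanodes, path):
    """Client side: look up block locations, then read from the DataNodes."""
    return b"".join(datanodes[name].read(block_id)
                    for block_id, name in namenode.locate(path))


datanodes = {"dn1": DataNode(), "dn2": DataNode()}
namenode = NameNode(datanodes.keys())
put_file(namenode, datanodes, "/a.txt", b"hello world")
namenode.rename("/a.txt", "/b.txt")
print(get_file(namenode, datanodes, "/b.txt"))
```

In this sketch the file contents never pass through the NameNode, which mirrors the split listed above: metadata on one side, block storage on the other.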


HDFS: Data loading


Key/Value pairs

Take a collection of (key, value) pairs and map it onto a different collection of (key, value) pairs: Map(k1, v1) -> (k2, v2)

Before shuffling:
(A,1),(B,2),(C,3)
(B,3),(A,2),(C,1)
(D,1),(C,1),(B,2)
(A,5)

Shuffled:
(A,(1,2,5)) (B,(2,3,2)) (C,(3,1,1)) (D,(1))
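The shuffle step above amounts to grouping values by key. A minimal sketch in Python, using the pairs from the slide; the function name `shuffle` and the list-of-lists layout for the four input groups are assumptions made for illustration.

```python
from collections import defaultdict


def shuffle(groups):
    """Group all values by key, preserving the order in which they arrive."""
    grouped = defaultdict(list)
    for group in groups:
        for key, value in group:
            grouped[key].append(value)
    return dict(grouped)


# The four groups of pairs shown on the slide.
groups = [
    [("A", 1), ("B", 2), ("C", 3)],
    [("B", 3), ("A", 2), ("C", 1)],
    [("D", 1), ("C", 1), ("B", 2)],
    [("A", 5)],
]

shuffled = shuffle(groups)
print(shuffled)
# {'A': [1, 2, 5], 'B': [2, 3, 2], 'C': [3, 1, 1], 'D': [1]}
```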


Map-Reduce

MapReduce decomposes a large computing problem into many small blocks.

The Map function distributes these blocks to many computers over the network, and every machine computes on its own data at the same time.

Reduce is a kind of combining step; it relies on the key-value model.


Map-Reduce process

1. In the mapping phase, MapReduce takes the input data and feeds each data element to the mapper.

2. In the reducing phase, the reducer processes all the outputs from the mapper and arrives at a final result.

3. In simple terms, the mapper is meant to filter and transform the input into something that the reducer can aggregate over.
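The three steps can be put together in a single-process sketch. Word counting is used here only as a familiar illustration and does not come from the slides; the driver `run_mapreduce` and the functions `word_mapper` and `count_reducer` are hypothetical names, and a real framework would run the mappers and reducers on different machines.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Feed each record to the mapper, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for record in records:                 # mapping phase
        for key, value in mapper(record):
            grouped[key].append(value)     # grouping by key between phases
    # reducing phase: one reducer call per key
    return {key: reducer(key, values) for key, values in grouped.items()}


def word_mapper(line):
    """Transform a line of text into (word, 1) pairs."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    """Aggregate the per-occurrence counts for one word."""
    return sum(counts)


lines = ["the tape is infinite", "the head moves on the tape"]
counts = run_mapreduce(lines, word_mapper, count_reducer)
print(counts["the"], counts["tape"], counts["head"])
# 3 2 1
```

The mapper only filters and transforms; all aggregation happens in the reducer, which is the split described in step 3.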


Distributed Task Execution

Problem Statement: There is a large computational problem that can be divided into multiple parts, and the results from all parts can be combined to obtain a final result.

Case Study: Simulation of a Digital Communication System

There is a software simulator of a digital communication system, such as WiMAX, that passes some volume of random data through the system model and computes the error probability of throughput. Each Mapper runs the simulation for a specified amount of data, which is 1/N of the required sampling, and emits an error rate. The Reducer computes the average error rate.

Solution: The problem description is split into a set of specifications, and the specifications are stored as input data for the Mappers. Each Mapper takes a specification, performs the corresponding computations and emits results. The Reducer combines all emitted parts into the final result.
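The case study can be sketched as follows. This is an assumption-laden stand-in, not a WiMAX simulator: the "system model" here is a simple channel that flips each bit with a fixed probability, and the names `make_specs`, `simulate_part` and `average_error_rate`, the seeds and the parameter values are all hypothetical. The mappers run one after another in this sketch; in a cluster each call would run on a separate machine.

```python
import random


def make_specs(total_bits, parts, flip_prob):
    """Split the problem description into one specification per mapper."""
    per_part = total_bits // parts  # each part is 1/N of the required sampling
    return [{"seed": i, "num_bits": per_part, "flip_prob": flip_prob}
            for i in range(parts)]


def simulate_part(spec):
    """Mapper: simulate one share of the sampling and emit its error rate."""
    rng = random.Random(spec["seed"])  # independent random stream per part
    errors = sum(rng.random() < spec["flip_prob"]
                 for _ in range(spec["num_bits"]))
    return errors / spec["num_bits"]


def average_error_rate(rates):
    """Reducer: combine the per-part error rates into one estimate."""
    return sum(rates) / len(rates)


specs = make_specs(total_bits=200_000, parts=4, flip_prob=0.1)
rates = list(map(simulate_part, specs))
estimate = average_error_rate(rates)
print(len(specs), round(estimate, 2))
```

A plain average is only a fair combination because every part simulates the same number of bits; with unequal parts the reducer would need to weight each rate by its sample size.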


Conclusion

If the input is very large, perhaps 10 TB or more, a single Turing machine takes a long time to process it. Distributed computing splits the computation into many blocks, and each block is handled by one computer (itself a Turing machine). Every computer processes a relatively small input: with N computers, each block holds just 1/N of the original input. When the problem can be divided this way, it is solved in less time and the computation is more efficient.
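A rough back-of-the-envelope calculation shows the scale of the saving. The figures are assumptions chosen for illustration, not measurements: the work is taken to be a single linear pass over the data, each machine is assumed to process 100 MB/s, and $T_{\text{combine}}$ stands for the cost of distributing the blocks and combining the results.

```latex
% One machine scanning 10 TB at an assumed 100 MB/s
T_1 = \frac{10^{13}\ \text{B}}{10^{8}\ \text{B/s}} = 10^{5}\ \text{s} \approx 27.8\ \text{h}

% N machines, each scanning 1/N of the input, plus combining overhead
T_N \approx \frac{T_1}{N} + T_{\text{combine}}

% For N = 100
T_{100} \approx 10^{3}\ \text{s} + T_{\text{combine}} \approx 16.7\ \text{min} + T_{\text{combine}}
```

The saving holds only as long as $T_{\text{combine}}$ stays small relative to $T_1/N$, which is why the problem has to split into nearly independent parts.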


Thank you