Big Data Lambda Architecture - Streaming Layer Hands-On


Big Data Pipeline: Lambda Architecture - Streaming (Real-Time) Layer

with Apache Kafka, Apache Hadoop, Apache Spark, and Apache Cassandra

on the Amazon Web Services Cloud Platform

[Diagram: BIG Data Pipeline - INGEST, STORE, PROCESS, VISUALIZE. Sources (AngularJS web app clickstream data, Apache web logs, log/data files) are ingested through Apache Kafka, stored in S3 and HDFS, processed with Spark Streaming and Spark SQL, and served from Apache Cassandra to an AngularJS web app for visualization.]

[Diagram: BIG Data Streaming (Real-Time) Layer Pipeline - INGEST, STREAM PROCESS, STORE, VISUALIZE. Apache Kafka and TCP sockets feed a Spark cluster; processed data is stored in Apache Cassandra and explored through interactive queries.]

Install Kafka - 3-Node Cluster on AWS

Provision 3 EC2 instances for the Kafka cluster.

Repeat the following commands on all 3 EC2 instances:

cat /etc/*-release

sudo add-apt-repository ppa:webupd8team/java

sudo apt-get update

sudo apt-get install oracle-java8-installer

java -version

mkdir kafka

cd kafka

wget http://download.nextag.com/apache/kafka/0.10.0.0/kafka_2.11-0.10.0.0.tgz

tar -zxvf kafka_2.11-0.10.0.0.tgz

cd kafka_2.11-0.10.0.0

Cluster nodes (private IP / public IP):

ZooKeeper ==> 172.31.48.208 / 52.91.1.93

Kafka-datanode1 ==> 172.31.63.203 / 54.173.215.211

Kafka-datanode2 ==> 172.31.9.25 / 54.226.29.194

Modify config/server.properties for Kafka-datanode1 & Kafka-datanode2


Kafka-datanode1 (set the following properties in config/server.properties)

ubuntu@ip-172-31-63-203:~/kafka/kafka_2.11-0.10.0.0$ vi config/server.properties

broker.id=1
listeners=PLAINTEXT://172.31.63.203:9092
advertised.listeners=PLAINTEXT://54.173.215.211:9092
zookeeper.connect=52.91.1.93:2181

Kafka-datanode2 (set the following properties in config/server.properties)

ubuntu@ip-172-31-9-25:~/kafka/kafka_2.11-0.10.0.0$ vi config/server.properties

broker.id=2
listeners=PLAINTEXT://172.31.9.25:9092
advertised.listeners=PLAINTEXT://54.226.29.194:9092
zookeeper.connect=52.91.1.93:2181

Note: listeners binds each broker to its private interface, advertised.listeners is the address the broker publishes to clients (its public IP), and zookeeper.connect points both brokers at the single ZooKeeper node.

Launch ZooKeeper / Kafka-datanode1 / Kafka-datanode2


1) Start ZooKeeper:
bin/zookeeper-server-start.sh config/zookeeper.properties

2) Start the broker on Kafka-datanode1:
bin/kafka-server-start.sh config/server.properties

3) Start the broker on Kafka-datanode2:
bin/kafka-server-start.sh config/server.properties

4) Create the topic and start a console consumer:
bin/kafka-topics.sh --zookeeper 52.91.1.93:2181 --create --topic data --partitions 1 --replication-factor 2
bin/kafka-console-consumer.sh --zookeeper 52.91.1.93:2181 --topic data --from-beginning
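The console consumer in step 4 tails the topic from the shell; the same smoke test can be done programmatically. A minimal sketch assuming the kafka-python client (pip install kafka-python), which is not part of the original walkthrough:

from kafka import KafkaConsumer

# Connect through the brokers' advertised (public) listeners
consumer = KafkaConsumer(
    'data',
    bootstrap_servers=['54.173.215.211:9092', '54.226.29.194:9092'],
    auto_offset_reset='earliest')  # equivalent of --from-beginning

for message in consumer:
    print(message.value)  # raw message bytes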

Java - Kafka Producer Sample Application

package com.himanshu;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DataProducer {

    public static void main(String[] args) {

        // Topic to publish to (could also be taken from args[0])
        String topicName = "data";

        // Producer configuration; bootstrap against the brokers' public addresses
        Properties props = new Properties();
        props.put("bootstrap.servers", "54.173.215.211:9092,54.226.29.194:9092");
        props.put("acks", "all");             // wait for acknowledgement from all in-sync replicas
        props.put("retries", 0);              // do not retry failed sends
        props.put("batch.size", 16384);       // per-partition batch buffer, in bytes
        props.put("linger.ms", 1);            // small delay so sends can be batched
        props.put("buffer.memory", 33554432); // total memory available to the producer for buffering
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<String, String>(props);

        String csvFile = "/Users/himanshu/Documents/workspace/KafkaProducer/src/com/himanshu/invoice.txt";
        BufferedReader br = null;
        String lineInvoice;

        try {
            br = new BufferedReader(new FileReader(csvFile));
            // Send each line of the invoice CSV as one Kafka message
            while ((lineInvoice = br.readLine()) != null) {
                producer.send(new ProducerRecord<String, String>(topicName, lineInvoice));
                System.out.println("Message sent successfully....");
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            producer.close();
            if (br != null) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
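Since the processing side of this pipeline is Python, a Python equivalent of the producer above may be handy. A minimal sketch assuming the kafka-python client; the local file path is an illustrative assumption:

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=['54.173.215.211:9092', '54.226.29.194:9092'],
    acks='all')  # same acknowledgement setting as the Java producer

# Send each line of the invoice CSV as its own message, as the Java producer does
with open('invoice.txt') as f:  # assumption: local path to the CSV file
    for line in f:
        producer.send('data', line.strip().encode('utf-8'))

producer.flush()
producer.close()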


Sample data which we will be sending to the Kafka server from the Java Kafka producer (CSV file)

Message received on Kafka-datanode1

Real-Time Streaming with Kafka, Apache Spark, and Apache Cassandra

Launch the Kafka cluster (ZooKeeper / Kafka-datanode1 / Kafka-datanode2)

Execute the Python Kafka/Spark job


Python Spark Job Processing Data from AWS Kafka Cluster

Processed Data Stored in AWS Cassandra Cluster


Apache Spark UI

Python Spark Streaming Application
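The deck shows the streaming application only as screenshots. A minimal sketch of such a job, assuming Spark 1.6/2.x with the matching spark-streaming-kafka artifact on the classpath and the DataStax cassandra-driver installed; the Cassandra address, keyspace, table, and CSV schema below are illustrative assumptions, not taken from the deck:

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

def save_partition(partition):
    # Open one Cassandra session per partition rather than per record
    from cassandra.cluster import Cluster
    cluster = Cluster(['127.0.0.1'])         # assumption: your Cassandra node address
    session = cluster.connect('lambda_poc')  # assumption: keyspace name
    for fields in partition:
        session.execute(
            "INSERT INTO invoices (invoice_id, amount) VALUES (%s, %s)",
            (fields[0], float(fields[1])))   # assumption: 2-column CSV schema
    cluster.shutdown()

sc = SparkContext(appName="KafkaSparkCassandra")
ssc = StreamingContext(sc, 10)  # 10-second micro-batches

# Receiver-based Kafka stream through the ZooKeeper node set up earlier
stream = KafkaUtils.createStream(
    ssc, "52.91.1.93:2181", "spark-streaming-consumer", {"data": 1})

# Kafka messages arrive as (key, value) pairs; split each CSV value
records = stream.map(lambda kv: kv[1].split(","))
records.foreachRDD(lambda rdd: rdd.foreachPartition(save_partition))

ssc.start()
ssc.awaitTermination()

A job like this would typically be launched with spark-submit, passing the spark-streaming-kafka dependency that matches your Spark version via --packages or --jars.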

Thank You!
hkbhadraa@gmail.com
