Introduction to Apache Avro
Doug Cutting, 21 July 2010
Avro is...
● data serialization
● file format
● RPC format
Existing Serialization Systems: Protocol Buffers & Thrift
● expressive
● efficient (small & fast)
● but not very dynamic
● cannot browse arbitrary data
● viewing a new datatype
  – requires code generation & load
● writing a new datatype
  – requires generating schema text
  – plus code generation & load
Avro Serialization
● specifies a serialization format
● schema language is in JSON
  ● each language already has a JSON parser
● each language implements a data reader & writer
  ● in normal code
● code generation is optional (see the sketch below)
  ● sometimes useful in statically typed languages
● data is untagged
  ● schema required to read & write
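A minimal sketch of this generic, code-generation-free path, assuming the modern Avro Java API (org.apache.avro); the schema text and values are illustrative:

import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.*;
import org.apache.avro.io.*;

public class GenericRoundTrip {
  public static void main(String[] args) throws Exception {
    // parse the schema from its JSON text; no classes are generated
    Schema schema = new Schema.Parser().parse(
        "{\"name\": \"Block\", \"type\": \"record\", \"fields\": ["
        + "{\"name\": \"id\", \"type\": \"string\"},"
        + "{\"name\": \"length\", \"type\": \"int\"}]}");

    // build a record generically, field by field
    GenericRecord block = new GenericData.Record(schema);
    block.put("id", "blk-1");
    block.put("length", 64);

    // write: the binary encoding is untagged, driven entirely by the schema
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
    new GenericDatumWriter<GenericRecord>(schema).write(block, enc);
    enc.flush();

    // read: the same schema is required to decode the bytes
    BinaryDecoder dec = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
    GenericRecord copy = new GenericDatumReader<GenericRecord>(schema).read(null, dec);
    System.out.println(copy);
  }
}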
Avro Schema Evolution
● writer's schema always provided to reader
● so reader can compare:
  ● the schema used to write with
  ● the schema expected by the application
● fields that match (name & type) are read
● fields written that don't match are skipped
● expected fields not written can be identified
● same features as provided by numeric field ids (resolution sketched below)
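A hedged sketch of this resolution with the generic Java API, passing both schemas to one reader; the schemas and the encodedBytes variable are illustrative:

// writer's schema: what the data was encoded with
Schema writer = new Schema.Parser().parse(
    "{\"name\": \"Block\", \"type\": \"record\", \"fields\": ["
    + "{\"name\": \"id\", \"type\": \"string\"},"
    + "{\"name\": \"length\", \"type\": \"int\"}]}");

// reader's schema: what the application expects; "length" was written
// but is skipped, and "owner" was not written, so its default is used
Schema reader = new Schema.Parser().parse(
    "{\"name\": \"Block\", \"type\": \"record\", \"fields\": ["
    + "{\"name\": \"id\", \"type\": \"string\"},"
    + "{\"name\": \"owner\", \"type\": \"string\", \"default\": \"unknown\"}]}");

GenericDatumReader<GenericRecord> resolver =
    new GenericDatumReader<GenericRecord>(writer, reader);
GenericRecord rec = resolver.read(null,
    DecoderFactory.get().binaryDecoder(encodedBytes, null)); // encodedBytes: assumed input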
Avro JSON Schemas
// a simple three-element record
{"name": "Block",
 "type": "record",
 "fields": [
   {"name": "id",     "type": "string"},
   {"name": "length", "type": "int"},
   {"name": "hosts",  "type": {"type": "array", "items": "string"}}
 ]}

// a linked list of strings or ints
{"name": "MyList",
 "type": "record",
 "fields": [
   {"name": "value", "type": ["string", "int"]},
   {"name": "next",  "type": ["MyList", "null"]}
 ]}
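A small sketch of instantiating the recursive MyList schema with the generic Java API; MYLIST_JSON stands for the schema text above:

Schema listSchema = new Schema.Parser().parse(MYLIST_JSON);

GenericRecord tail = new GenericData.Record(listSchema);
tail.put("value", 2);     // selects the "int" branch of the union
tail.put("next", null);   // "null" branch: end of the list

GenericRecord head = new GenericData.Record(listSchema);
head.put("value", "one"); // "string" branch
head.put("next", tail);   // recursive MyList reference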
Avro IDL Schemas
// a simple three-element record
record Block {
  string id;
  int length;
  array<string> hosts;
}

// a linked list of strings or ints
record MyList {
  union { string, int } value;
  union { MyList, null } next;
}
Hadoop Data Formats
● Today, primarily:
  ● text
    – pro: interoperable
    – con: not expressive, inefficient
  ● Java Writable
    – pro: expressive, efficient
    – con: platform-specific, fragile
Avro Data
● expressive
● small & fast
● dynamic
● schema stored with data
  – but factored out of instances
● APIs permit reading & creating
  – new datatypes without generating & loading code
Avro Data
● includes a file format
  ● replacement for SequenceFile (file API sketched below)
● includes a textual encoding
● handles versioning
  ● if schema changes, can still process data
● hope Hadoop apps will
  ● upgrade from text; standardize on Avro for data
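A hedged sketch of the container-file API (org.apache.avro.file); the file name is illustrative, and `schema`/`block` are the record schema and instance from the earlier sketch:

// write: the schema is stored once in the file header,
// factored out of the individual instances
DataFileWriter<GenericRecord> writer =
    new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>(schema));
writer.create(schema, new File("blocks.avro"));
writer.append(block);
writer.close();

// read: the reader resolves against the schema stored in the file
DataFileReader<GenericRecord> fileReader = new DataFileReader<GenericRecord>(
    new File("blocks.avro"), new GenericDatumReader<GenericRecord>());
for (GenericRecord rec : fileReader)
  System.out.println(rec);
fileReader.close();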
Avro MapReduce API
● Single-valued inputs and outputs
  ● key/value pairs only required for intermediate data
● map(IN, Collector<OUT>)
  ● map-only jobs never need to create k/v pairs
● map(IN, Collector<Pair<K,V>>)
● reduce(K, Iterable<V>, Collector<OUT>)
● if IN and OUT are pairs, default is sort
● In Avro trunk today, built on Hadoop 0.20 APIs
  ● in the Avro 1.4.0 release next month
Avro MapReduce Example
public void map(Utf8 text, AvroCollector<Pair<Utf8,Long>> c,
                Reporter r) throws IOException {
  // emit a (word, 1) pair for each token in the input line
  StringTokenizer i = new StringTokenizer(text.toString());
  while (i.hasMoreTokens())
    c.collect(new Pair<Utf8,Long>(new Utf8(i.nextToken()), 1L));
}

public void reduce(Utf8 word, Iterable<Long> counts,
                   AvroCollector<Pair<Utf8,Long>> c,
                   Reporter r) throws IOException {
  // sum the per-word counts and emit a single (word, total) pair
  long sum = 0;
  for (long count : counts)
    sum += count;
  c.collect(new Pair<Utf8,Long>(word, sum));
}
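A hedged driver sketch for the word count above, assuming the org.apache.avro.mapred API on Hadoop 0.20; WordCount, WordCountMapper, WordCountReducer, and the paths are hypothetical names:

JobConf job = new JobConf(WordCount.class); // hypothetical driver class
job.setJobName("avro-wordcount");

// single-valued string input; Pair<string, long> output
AvroJob.setInputSchema(job, Schema.create(Schema.Type.STRING));
AvroJob.setOutputSchema(job,
    Pair.getPairSchema(Schema.create(Schema.Type.STRING),
                       Schema.create(Schema.Type.LONG)));

AvroJob.setMapperClass(job, WordCountMapper.class);   // extends AvroMapper
AvroJob.setReducerClass(job, WordCountReducer.class); // extends AvroReducer

FileInputFormat.setInputPaths(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
JobClient.runJob(job);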
Avro RPC
● leverage versioning support
  ● permit different versions of services to interoperate
● for Hadoop, will
  ● let apps talk to clusters running different versions
  ● provide cross-language access
Avro IDL Protocol
@namespace("org.apache.avro.test")
protocol HelloWorld {
  record Greeting { string who; string what; }

  Greeting hello(Greeting greeting);
}
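A hedged client sketch using the generic IPC classes (org.apache.avro.ipc); it assumes the IDL above has been compiled to a JSON protocol file, helloworld.avpr, and that a server is listening at the URL shown:

Protocol p = Protocol.parse(new File("helloworld.avpr"));
Transceiver t = new HttpTransceiver(new URL("http://localhost:8080/")); // assumed endpoint
GenericRequestor requestor = new GenericRequestor(p, t);

// build the Greeting argument generically
GenericRecord greeting =
    new GenericData.Record(p.getType("org.apache.avro.test.Greeting"));
greeting.put("who", "client");
greeting.put("what", "hi");

// the request is a record of the message's parameters
GenericRecord params =
    new GenericData.Record(p.getMessages().get("hello").getRequest());
params.put("greeting", greeting);

Object reply = requestor.request("hello", params); // returns a Greeting record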
Avro Status
● Current
  ● C, C++, Java, Python & Ruby APIs
  ● Interoperable RPC and data
  ● MapReduce API for Java
● Upcoming
  ● MapReduce APIs for other languages
    – efficient, rich data
  ● RPC used in Flume, HBase, Cassandra, Hadoop, etc.
    – inter-version compatibility
    – non-Java clients
Questions?