
THE CURIOUS CASE OF PROTOBUFS…

De-mystifying Google’s hottest binary protocol

Prasanna Kanagasabai, Jovin Lobo

About us:

Prasanna Kanagasabai: Security Engineer @ ThoughtWorks. Member of null - The Open Security Community. Author of IronSAP, a module over IronWASP. Speaker @ nullcon-Delhi, Clubhack, IIT Guwahati and various null meetups.

Jovin Lobo: Associate Consultant @ Aujas Networks. Member of null - The Open Security Community. Author of GameOver, a Linux distro for learning web security. Has spoken at nullcon and GNUnify.

Agenda

Introduction.
Anatomy of Protobufs.
Defining message formats in .proto files.
Protobuf compiler.
Python API to read and write messages.
Encoding scheme.
Problem statement.
Decoding like a pro with the IronWASP 'Protobuf Decoder'.

Introduction:

Protocol Buffers, a.k.a. Protobufs:
Protobufs are Google's own way of serializing structured data.
Extensible, language-neutral and platform-neutral.
Smaller, faster and simpler to implement.
Available for Java, C++ and Python.

Anatomy:

Overview:

Defining a .proto file.

#> less Example.proto
message Conference {
    required string conf_name = 1;
    required int32 no_of_days = 2;
    optional string email = 3;
}

* 1, 2 and 3 are unique tags. The binary encoding uses them to identify the fields.
* For optimization, use tags 1-15; higher numbers take at least one more byte to encode.
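The one-byte limit for tags 1-15 follows from how the field key is built: the tag is shifted left by 3 bits, combined with the 3-bit wire type, and written as a varint. A minimal sketch in plain Python, assuming non-negative values only; `encode_varint` and `key_size` are illustrative helper names, not part of the protobuf library:

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # set the MSB: more bytes follow
        else:
            out.append(low7)         # MSB clear: last byte
            return bytes(out)


def key_size(field_number: int, wire_type: int = 0) -> int:
    """Number of bytes needed for the key of a field."""
    return len(encode_varint((field_number << 3) | wire_type))


for tag in (1, 15, 16, 2047, 2048):
    print(tag, key_size(tag))
```

Tags up to 15 fit in a single key byte, tags 16 to 2047 need two, and larger tags need three or more.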

Compiling

Syntax:
protoc -I=$_input_Dir --python_out=$_out_Dir $_Path_ProtoFile

Eg:
protoc -I=. --python_out=. Example.proto

This will generate an Example_pb2.py file in the specified destination directory.

$ProtoFile_pb2.py

The Protobuf compiler generates special descriptors for all your messages, enums, and fields.

It also generates empty classes, one for each message type:

Reading and writing messages using the Protobuf binary format:

SerializeToString() serializes the message and returns it as a string.

ParseFromString(data) parses a message from the given string.

Demo: Protobuf… how it works

Encoding.

example2.proto

message Ex1 {
    required int32 num = 1;  // field tag
}

Code snippet:

import example2_pb2

obj = example2_pb2.Ex1()
obj.num = 290  # field value
obj.SerializeToString()

Output:
08 A2 02  # hex
0000 1000 1010 0010 0000 0010  # binary
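The same three bytes can be reproduced without the generated module. The following is a rough sketch, assuming a non-negative value and wire type 0; the helper names `encode_varint` and `encode_int_field` are made up for illustration and are not protobuf API calls:

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set
        else:
            out.append(low7)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one varint field: key (tag << 3 | wire type 0), then value."""
    key = (field_number << 3) | 0
    return encode_varint(key) + encode_varint(value)


encoded = encode_int_field(1, 290)
print(" ".join(f"{b:02X}" for b in encoded))  # 08 A2 02
```

Negative int32 values are encoded differently (as ten-byte varints) and are deliberately left out of this sketch.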

Problem statement.

This is what freaked him out:

08 A2 02
0000 1000 1010 0010 0000 0010

Let's decode it...

Step 1: Find the wire type.

Step 2: Find the field tag (field number).

Step 3: Find the field value.

Step 1: Finding the wire type.

0000 1000 1010 0010 0000 0010

To find the wire type, take the first byte: 0000 1000

[0]000 1000: drop the MSB from the first byte. It is the varint continuation bit, and it is 0 here because the key ends in this byte.

0001 000: the last 3 bits give the wire type.

The wire type is 000. Type 0 is Varint.

Wire types
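For reference, the protobuf binary encoding defines six wire types. A minimal sketch listing them as a Python enum; the class and member names are illustrative choices, not identifiers from the protobuf library:

```python
from enum import IntEnum


class WireType(IntEnum):
    VARINT = 0            # int32, int64, uint32, uint64, sint32, sint64, bool, enum
    FIXED64 = 1           # fixed64, sfixed64, double
    LENGTH_DELIMITED = 2  # string, bytes, embedded messages, packed repeated fields
    START_GROUP = 3       # groups (deprecated)
    END_GROUP = 4         # groups (deprecated)
    FIXED32 = 5           # fixed32, sfixed32, float


# The low 3 bits of the first byte (0x08) select the wire type.
print(WireType(0x08 & 0x07).name)  # VARINT
```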

Step 2: Field tag.

What we already have is 0001000.

Now we right-shift the value by 3 bits, and the remaining bits give us the field tag.

0001000 -> 0001 | 000
'0001', i.e. 1

So we get the field tag = 1.
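Steps 1 and 2 amount to a mask and a shift on the key. A small sketch, assuming the key fits in a single byte (continuation bit clear), as it does in this example; `split_key` is a hypothetical helper name:

```python
def split_key(key: int):
    """Split a decoded key into (field tag, wire type)."""
    wire_type = key & 0x07   # last 3 bits
    field_tag = key >> 3     # remaining bits
    return field_tag, wire_type


field_tag, wire_type = split_key(0x08)
print(field_tag, wire_type)  # 1 0
```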

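Step 3: Find the field value.

0000 1000 1010 0010 0000 0010

We drop the 1st byte: 1010 0010 0000 0010

Drop the MSB from each of these bytes. It is a continuation flag: 1 means another byte follows, 0 marks the last byte.

1010 0010 -> 010 0010
0000 0010 -> 000 0010

Reverse the order of these 7-bit groups (varints store the least significant group first) and concatenate them to obtain the field value:

000 0010 010 0010 = 100100010 in binary, i.e. 256 + 32 + 2 = 290

So we finally get the value of the field = 290.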
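Step 3 can be sketched as a loop that strips each continuation bit and places the 7-bit groups at increasing bit offsets, which has the same effect as reversing and concatenating them. The function name `decode_varint` is an illustrative choice, and the sketch assumes a well-formed, non-truncated input:

```python
def decode_varint(data: bytes, pos: int = 0):
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # drop the MSB, place the 7 bits
        if not byte & 0x80:               # MSB clear: this was the last byte
            return result, pos
        shift += 7                        # next group is 7 bits more significant


value, next_pos = decode_varint(bytes([0xA2, 0x02]))
print(value)  # 290
```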

So we successfully decoded example2.proto

message Ex1 {
    required int32 num = 1;
}

Code snippet:

import example2_pb2

obj = example2_pb2.Ex1()
obj.num = 290
obj.SerializeToString()

Output:
08 A2 02  # hex
0000 1000 1010 0010 0000 0010  # binary

We successfully decoded the value: "290"
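Putting the three steps together, a schema-less decoder can walk a whole buffer field by field. This is a minimal sketch that handles only wire types 0 (varint) and 2 (length-delimited) and assumes well-formed input; `parse_message` and `decode_varint` are hypothetical names, not the IronWASP decoder's code:

```python
def decode_varint(data: bytes, pos: int = 0):
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_message(data: bytes):
    """Return a list of (field tag, wire type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_tag, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                      # varint
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:                    # length-delimited
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields.append((field_tag, wire_type, value))
    return fields


print(parse_message(bytes.fromhex("08a202")))  # [(1, 0, 290)]
```

Without the .proto file the decoder cannot recover field names or tell a string from an embedded message, so length-delimited values are returned as raw bytes.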

Demo: Let's do this live

Automating all this with the IronWASP Protobuf Decoder:

About IronWASP:
IronWASP is an open-source web security scanner.
It is designed to be customizable to the extent that users can create their own custom security scanners using it.

Author: Lavakumar Kuppan (@lavakumark)

Website: www.ironwasp.org

ProtoBuf Decoder

Road Map for Protobuf Decoder

01101000001111010000010110111001111001001000000101000101110101011001010111001101110100011010010110111101101110011100110010000000111111

Hmmm … Decoding ……

Done … It says ……

Any Questions ?

Thank You