Spring/2002 Distributed Software Engineering C:\unocourses\4350\slides\ 1 Serialization Flatten your object for automated storage or network transfer

01 Serialization


Spring/2002 Distributed Software Engineering C:\unocourses\4350\slides\DefiningThreads


Serialization
Flatten your object for automated storage or network transfer


Software object persistence
• Persistence: saving information about an object so that it can be recreated at a different time, in a different place, or both.
• Object serialization is one means of implementing persistence: it converts an object’s state into a byte stream that can be used later to reconstruct (deserialize) a virtually identical copy of the original object.
• Default serialization for an object writes:
  – the class of the object,
  – the class signature,
  – the values of all non-transient and non-static fields.


Serialization protocol
• For serialization:
  – java.io.ObjectOutputStream, via writeObject, which relies on defaultWriteObject.
• For deserialization:
  – java.io.ObjectInputStream, via readObject, which relies on defaultReadObject.
• Any object instance that belongs to the graph of the object being serialized must be serializable as well.
• The superclass should be Serializable too. If it is not, it must provide an accessible no-argument constructor, and its fields are not written by default serialization.
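The graph rule can be sketched as follows. This is a minimal illustration, not code from the course: Engine, Car, and GraphRuleDemo are hypothetical names. Writing a Serializable object that references a non-serializable one fails with NotSerializableException.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that does NOT implement Serializable.
class Engine {
    int horsepower = 120;
}

// Hypothetical serializable class whose object graph includes an Engine.
class Car implements Serializable {
    private static final long serialVersionUID = 1L;
    String model = "demo";
    Engine engine = new Engine(); // non-serializable member of the graph
}

class GraphRuleDemo {
    // Returns true if writing obj fails because some object
    // in its graph is not serializable.
    static boolean failsToSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return false;
        } catch (NotSerializableException ex) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Car fails: " + failsToSerialize(new Car()));
        System.out.println("String fails: " + failsToSerialize("plain string"));
    }
}
```

Marking the engine field transient, or making Engine implement Serializable, would let the Car be written.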


Serialization protocol
• Customize the default: implement extended versions of the default methods in:
  – writeObject
  – readObject
  – But final fields cannot be assigned in readObject; they must be restored by the default mechanism.
• Create your own complete serialization by implementing the interface Externalizable.


Specifying persistent objects

• The class of an object to be serialized must implement the interface java.io.Serializable.

• This is an empty (marker) interface; it is used to mark the objects of such a class as persistent.


Deserialization

• It reads the values written during serialization.
• Static fields in the class are left untouched.
  – If the class needs to be loaded, normal class initialization takes place, giving static fields their initial values.
• Transient fields are initialized to default values.
• Recreation of the object graph occurs in reverse order from its serialization.
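A small sketch of the first three points, assuming a hypothetical Counter class and an in-memory byte array in place of a file. After a round trip the ordinary field keeps its value, the transient field falls back to its default, and the static counter is not touched (no constructor of the serializable class runs during deserialization).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for this illustration.
class Counter implements Serializable {
    private static final long serialVersionUID = 1L;

    static int instancesCreated = 0; // static: never part of the stream
    int value;                       // ordinary field: written and read back
    transient int cache = 42;        // transient: skipped, comes back as 0

    Counter(int value) {
        this.value = value;
        instancesCreated++;
    }
}

class DeserializationDemo {
    // Serializes the object to a byte array and reads it back.
    static Counter roundTrip(Counter c) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Counter) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Counter copy = roundTrip(new Counter(7));
        System.out.println("value = " + copy.value);
        System.out.println("cache = " + copy.cache);
        System.out.println("instancesCreated = " + Counter.instancesCreated);
    }
}
```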


Example

import java.io.Serializable;
import java.util.Calendar;
import java.util.Date;

public class PersistentTime implements Serializable {
    private Date time;

    public PersistentTime() {
        time = Calendar.getInstance().getTime();
    }

    public Date getTime() {
        return time;
    }
}


Class java.io.ObjectOutputStream

• An ObjectOutputStream instance writes primitive data types and graphs of Java objects to an OutputStream. The objects can be read (reconstituted) using an ObjectInputStream. Persistent storage of objects can be accomplished by using a file for the stream. If the stream is a network socket stream, the objects can be reconstituted on another host or in another process.

• Only objects that support the java.io.Serializable interface can be written to streams. The class of each serializable object is encoded including the class name and signature of the class, the values of the object's fields and arrays, and the closure of any other objects referenced from the initial objects.


Class java.io.ObjectOutputStream

• The method writeObject is used to write an object to the stream. Any object, including Strings and arrays, is written with writeObject. Multiple objects or primitives can be written to the stream. The objects must be read back from the corresponding ObjectInputStream with the same types and in the same order as they were written.

• Primitive data types can also be written to the stream using the appropriate methods from DataOutput. Strings can also be written using the writeUTF method.
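The ordering requirement can be sketched with a stream that mixes primitives, a UTF string, and objects. OrderedStreamDemo is a hypothetical name, and a byte array stands in for a file or socket; the reader issues its calls in exactly the order the writer used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

class OrderedStreamDemo {
    // Writes an int, a UTF string, an array, and a String object, in that order.
    static byte[] write() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(3);                       // primitive via DataOutput
            out.writeUTF("hello");                 // string via writeUTF
            out.writeObject(new int[] {1, 2, 3});  // array via writeObject
            out.writeObject("a string object");    // String via writeObject
        }
        return bytes.toByteArray();
    }

    // Reads the values back with the same types and in the same order.
    static String read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            String label = in.readUTF();
            int[] values = (int[]) in.readObject();
            String text = (String) in.readObject();
            return count + "|" + label + "|" + Arrays.toString(values) + "|" + text;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(read(write()));
    }
}
```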


Example

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class FlattenTime {
    public static void main(String[] args) {
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }
        PersistentTime time = new PersistentTime();
        // try-with-resources closes both streams even if writeObject fails
        try (FileOutputStream fos = new FileOutputStream(filename);
             ObjectOutputStream out = new ObjectOutputStream(fos)) {
            out.writeObject(time);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}


import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.Calendar;

public class InflateTime {
    public static void main(String[] args) {
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }
        PersistentTime time;
        try (FileInputStream fis = new FileInputStream(filename);
             ObjectInputStream in = new ObjectInputStream(fis)) {
            time = (PersistentTime) in.readObject();
        } catch (IOException | ClassNotFoundException ex) {
            ex.printStackTrace();
            return; // nothing was read, so there is nothing to print
        }
        System.out.println("Flattened time: " + time.getTime());
        System.out.println("Current time: " + Calendar.getInstance().getTime());
    }
}


Serializable vs. Non-Serializable objects

• java.lang.Object does not implement Serializable, so you must decide which of your classes need to implement it.

• AWT and Swing components, strings, and arrays are defined as serializable.

• Certain classes and their subclasses are not serializable: Thread, OutputStream, Socket.

• When a serializable class contains instance variables that are not, or should not be, serializable, they should be marked with the keyword transient.


Transient fields
• These fields will not be serialized.
• When deserialized, these fields will be initialized to default values:
  – null for object references
  – zero for numeric primitives
  – false for boolean fields
• If these values are unacceptable:
  – Provide a readObject() that invokes defaultReadObject() and then restores the transient fields to acceptable values.
  – Or, initialize the fields when they are used for the first time (lazy initialization).
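Both remedies can be sketched in one hypothetical class. Circle and TransientDemo are illustrative names only: the derived area field is recomputed in readObject() after defaultReadObject(), while the notes list is created lazily on first use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class used only for this illustration.
class Circle implements Serializable {
    private static final long serialVersionUID = 1L;

    private double radius;
    private transient double area;         // derived value, restored in readObject
    private transient List<String> notes;  // created lazily on first use

    Circle(double radius) {
        this.radius = radius;
        this.area = Math.PI * radius * radius;
    }

    double area() {
        return area;
    }

    // Lazy initialization: null after deserialization, created on demand.
    List<String> notes() {
        if (notes == null) {
            notes = new ArrayList<>();
        }
        return notes;
    }

    // Restores the transient field after the default fields are read.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        area = Math.PI * radius * radius;
    }
}

class TransientDemo {
    static Circle roundTrip(Circle c) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Circle) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Circle copy = roundTrip(new Circle(2.0));
        System.out.println("area = " + copy.area());
        System.out.println("notes = " + copy.notes());
    }
}
```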


Serial version UID

• You should explicitly declare a serial version UID in every serializable class.
  – Eliminates the serial version UID as a potential source of incompatibility.
  – Gives a small performance benefit, as Java does not have to compute this number at run time.
  – private static final long serialVersionUID = rlv;
  – rlv (a random long value) can be any number picked out of thin air; it is not required to be unique across classes.
  – If you want to make a new version of the class incompatible with existing versions, choose a different UID. Deserializing instances of the previous version will then fail with an InvalidClassException.
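A short sketch of an explicit declaration, assuming a hypothetical Account class and an arbitrary UID value. ObjectStreamClass.lookup reports the UID the serialization runtime will use for the class, which here is the declared one rather than a computed one.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class; the UID value is arbitrary.
class Account implements Serializable {
    private static final long serialVersionUID = 20020101L;
    String owner;
}

class SerialVersionDemo {
    // Returns the serial version UID the runtime associates with the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("Account UID: " + uidOf(Account.class));
    }
}
```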


Customizing ObjectOutputStream, ObjectInputStream

• To provide special behavior when writing or reading an object’s bytes in the stream, implement:
    private void writeObject(ObjectOutputStream out) throws IOException;
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException;


Creating your own protocol: Externalizable

• Instead of implementing the Serializable interface directly, implement Externalizable:
    public interface Externalizable extends Serializable {
        void writeExternal(ObjectOutput out) throws IOException;
        void readExternal(ObjectInput in) throws IOException, ClassNotFoundException;
    }
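A minimal sketch of an Externalizable class, using a hypothetical GridPoint. The class writes and reads its own fields explicitly, and it needs a public no-argument constructor because the runtime calls that constructor before readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class used only for this illustration.
class GridPoint implements Externalizable {
    private int x;
    private int y;

    // Required: deserialization calls this before readExternal.
    public GridPoint() {
    }

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        // Read fields in exactly the order writeExternal wrote them.
        x = in.readInt();
        y = in.readInt();
    }
}

class ExternalizableDemo {
    static GridPoint roundTrip(GridPoint p) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (GridPoint) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        GridPoint copy = roundTrip(new GridPoint(3, 4));
        System.out.println(copy.x() + ", " + copy.y());
    }
}
```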


Performance

• Serialization is a very expensive process. You should have clear reasons to use it rather than directly writing out only what you need to save about the state of an object.


Default or customized serialization? Or: implementing Serializable judiciously

• Allowing a class’s instances to be serializable can be as simple as adding the words “implements Serializable” to the class specification.

• That it is always this simple is a common misconception; the truth is far more complex.

• While efficiency is one cost associated with it, there are other long-term costs that are much more substantial.

• Using default serialization is very easy, but this ease is very specious.


Serialization Costs

• Your object’s private structure is out for viewing! It has become part of the exported API.

• A major cost is that it decreases the flexibility to change a class’s implementation once the class has been released.

• Increases the likelihood of bugs and security holes.

• Increases the testing associated with releasing a new version of the class.


Serialization caveats
• Implementing Serializable is not a decision to be undertaken lightly.
• Classes designed for inheritance should rarely implement Serializable, and interfaces should rarely extend it.
  – You should provide an accessible parameterless constructor on non-serializable classes designed for inheritance, in case the class is subclassed and the subclass wants to provide serialization.
• Inner classes should rarely, if ever, implement Serializable.
• A static member class can be serializable.


Consider using a custom serialized form

• The default serialized form of an object is an encoding of the physical representation of the object graph rooted at the object:
  – Data contained in the object.
  – Data contained in every object reachable from it.
  – Topology by which all of these objects are interlinked.

• The ideal serialized form contains only the logical data represented by the object. It is independent of its physical representation.


Consider using a custom serialized form

• Default serialization is likely to be appropriate if an object’s physical representation is identical to its logical content.
  – Appropriate: a Name class.
  – Not appropriate: a doubly linked List class.


Consider using a custom serialized form

• Disadvantages of default serialization when the physical and logical representations differ:
  – Permanently ties the exported API to the internal representation.
  – Can consume excessive space.
  – Can consume excessive time.
  – Can cause stack overflow.


Consider using a custom serialized form

• A reasonable serialized form for a List is the number of entries followed by each of the entries.

• Although the default serialized form is correct in the List case, it may not be correct for an object whose invariants are tied to implementation-specific details.
  – Example: a hash table using buckets. The bucket an entry lands in is based on the hash code of the key, which may change from JVM to JVM, or between runs on the same JVM. Thus the default serialized form can violate the hash table’s invariants in this case.
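The "number of entries followed by each of the entries" form can be sketched as below. StringList here is a hypothetical doubly linked list of strings, not a class from the slides: the link fields are transient, so only the logical content reaches the stream, and readObject rebuilds the links.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical doubly linked list with a custom serialized form.
final class StringList implements Serializable {
    private static final long serialVersionUID = 1L;

    // Physical representation: transient, so it is never written by default.
    private transient int size;
    private transient Entry head;
    private transient Entry tail;

    private static class Entry {
        String data;
        Entry next;
        Entry previous;
    }

    void add(String s) {
        Entry e = new Entry();
        e.data = s;
        if (head == null) {
            head = e;
        } else {
            tail.next = e;
            e.previous = tail;
        }
        tail = e;
        size++;
    }

    List<String> toList() {
        List<String> result = new ArrayList<>();
        for (Entry e = head; e != null; e = e.next) {
            result.add(e.data);
        }
        return result;
    }

    // Logical form: the number of entries, then each entry in order.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (Entry e = head; e != null; e = e.next) {
            out.writeObject(e.data);
        }
    }

    // Rebuilds the links from the logical form.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            add((String) in.readObject());
        }
    }
}

class StringListDemo {
    static StringList roundTrip(StringList list)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(list);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (StringList) in.readObject();
        }
    }
}
```

Because the entries are written in a loop rather than by following next/previous references recursively, a long list does not risk the stack overflow that the default form can cause.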


readObject() and security attacks
• Deserialization uses defaultReadObject() and readObject() to create a new instance of a class.
• Thus readObject is effectively a constructor!
• So, readObject must behave like any other constructor:
  – Check the validity of its arguments if need be.
  – Make copies of parameters where needed.
• Otherwise, it is a very simple job for an attacker to violate the object’s invariants:
  – Provide a hand-made serialization of the attack object.


Guide for writing a bulletproof readObject

• Object reference fields that must remain private should be initialized with defensive copies of the values read from the stream.

• Check invariants and throw an InvalidObjectException if they fail.

• As with constructors, do not invoke any overridable methods.

• If an entire object graph must be checked for validity after deserialization, the ObjectInputValidation interface should be used.
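The first three guidelines can be sketched with a hypothetical Period class whose invariant is that the start does not come after the end. Its readObject copies the mutable Date fields and then checks the invariant, just as the constructor does. The class and its names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

// Hypothetical class with an invariant: start must not be after end.
final class Period implements Serializable {
    private static final long serialVersionUID = 1L;

    // Not final, so that readObject can assign the defensive copies.
    private Date start;
    private Date end;

    Period(Date start, Date end) {
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.after(this.end)) {
            throw new IllegalArgumentException(start + " after " + end);
        }
    }

    Date start() { return new Date(start.getTime()); }
    Date end() { return new Date(end.getTime()); }

    // Treated like a constructor: copy first, then validate.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (start == null || end == null) {
            throw new InvalidObjectException("missing endpoint");
        }
        start = new Date(start.getTime());
        end = new Date(end.getTime());
        if (start.after(end)) {
            throw new InvalidObjectException(start + " after " + end);
        }
    }
}

class PeriodDemo {
    static Period roundTrip(Period p) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Period) in.readObject();
        }
    }
}
```

The class is final and readObject calls no overridable methods, in line with the third guideline.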


writeReplace()

• Sometimes it is not appropriate to serialize the actual object, but rather some other designated object in its place.

    <access> Object writeReplace() throws ObjectStreamException;
  Returns an object that will replace the current object during serialization. Any object may be returned, including the current one.
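A minimal sketch of writeReplace, using hypothetical LiveConnection and ConnectionInfo classes. The live object nominates a plain description of itself to be written in its place, so reading the stream back yields a ConnectionInfo, not a LiveConnection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical plain description that is safe to write to a stream.
class ConnectionInfo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String host;
    final int port;

    ConnectionInfo(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

// Hypothetical object that should not be serialized as-is.
class LiveConnection implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String host;
    private final int port;
    private final transient Object socket = new Object(); // stand-in for a real socket

    LiveConnection(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // The stream writes the returned object instead of this one.
    private Object writeReplace() throws ObjectStreamException {
        return new ConnectionInfo(host, port);
    }
}

class WriteReplaceDemo {
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object restored = roundTrip(new LiveConnection("example.org", 8080));
        System.out.println(restored.getClass().getSimpleName());
    }
}
```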


A comment about access qualifier

• These methods (writeReplace and readResolve) can be of any accessibility.
• They will be used if they are accessible to the object type being serialized:
  – A private readResolve only affects serialization of objects that are exactly of its type.
  – A package-accessible readResolve affects only subclasses within the same package.
  – A public or protected readResolve affects objects of all subclasses.


readResolve()
• Recall that deserialization produces a new instance of a class.
• If a given class should only ever have one instance (singleton pattern), then deserialization can produce a different instance!
• In general, you need to be concerned about what is being created for instance-controlled classes.
• Enter readResolve(): a method that returns the appropriate instance of the class at hand, in place of the one produced by the readObject() or defaultReadObject() methods.

    <access> Object readResolve() throws ObjectStreamException;
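A minimal sketch of the singleton case, assuming a hypothetical Registry class. Without readResolve the round trip would yield a second Registry; with it, the freshly deserialized object is discarded in favor of the one true instance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical singleton used only for this illustration.
final class Registry implements Serializable {
    private static final long serialVersionUID = 1L;

    static final Registry INSTANCE = new Registry();

    private Registry() {
    }

    // Called after deserialization; its result replaces the new object.
    private Object readResolve() throws ObjectStreamException {
        return INSTANCE;
    }
}

class ReadResolveDemo {
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                 new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object restored = roundTrip(Registry.INSTANCE);
        System.out.println("same instance: " + (restored == Registry.INSTANCE));
    }
}
```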