The Web Giants
Culture – Practices – Architecture
AUGMENTED
Table of Contents

Foreword ................................................................ 6
Introduction ............................................................ 9
Culture ................................................................. 11
    The Obsession with Performance Measurement .......................... 13
    Build vs Buy ........................................................ 19
    Enhancing User Experience ........................................... 27
    Code Crafters ....................................................... 33
    Open Source Contribution ............................................ 41
    Sharing Economy Platforms ........................................... 47
Organization ............................................................ 57
    Pizza Teams ......................................................... 59
    Feature Teams ....................................................... 65
    DevOps .............................................................. 71
Practices ............................................................... 85
    Lean Startup ........................................................ 87
    Minimum Viable Product .............................................. 95
    Continuous Deployment ............................................... 105
    Feature Flipping .................................................... 113
    A/B Testing ......................................................... 123
    Design Thinking ..................................................... 129
    Device Agnostic ..................................................... 143
    Perpetual Beta ...................................................... 151
Architecture ............................................................ 157
    Cloud First ......................................................... 159
    Commodity Hardware .................................................. 167
    Sharding ............................................................ 179
    TP vs. BI: the New NoSQL Approach ................................... 193
    Big Data Architecture ............................................... 201
    Data Science ........................................................ 211
    Design for Failure .................................................. 221
    The Reactive Revolution ............................................. 227
    Open API ............................................................ 235
About OCTO Technology ................................................... 243
Authors ................................................................. 245
Foreword

It has become a cliché to start a book, a talk or a preface by stating that the rate of change is accelerating. However, it is true: the world is changing faster, both because of the exponential rate of technology evolution and because of the central role of the user in today's economy. It is also the change Marc Andreessen characterized in his famous blog post as "software is eating the world". Not only is software at the core of the digital economy, but the way software is produced is changing dramatically too. This is not a topic for Web companies alone; it is a revolution that touches all companies. To cope with their changing environment, they need to reinvent themselves as software companies, with new ways of working, of organizing themselves and of producing digital experiences for their customers.

This is why I am so pleased to write the preface to "The Web Giants". I have been using this book intensely since the first French edition came onto the market. I have given copies to colleagues both at Bouygues Telecom and at AXA, and I have made it a permanent reference in my own blogs, talks and writing. Why? Because it is the simplest, most pragmatic and most convincing set of answers to the question: what should one do in this software-infused, technology-enabled, customer-centric, fast-changing 21st century?

This is not a conceptual book, a book about why you should do this or that. It is a beautifully written story about how software and service development is organized in some of the best-run companies in the world. First and foremost, this is a book about practices. The best way to grow change in a complex world is to adopt practices: it is the only way to learn, by doing. These practices are sorted into three categories: culture, organization and architecture; but there is a common logic and a systemic reinforcement between them. Practices are easier to pick up, and they are less intimidating than methodologies or concepts. However, strong will and perseverance are required.
I will not spoil your reading by summarizing what OCTO found when they looked at the most common practices of the most successful software companies in the world. I will rather try to convince you that reading this book is an urgent task for almost everyone, based on four ideas.

The first and foremost idea is that software systems must be built to change constantly. This is equally true for information systems, support systems, and embedded, web or mobile software. What we could call customer engagement platforms are no longer complex systems that one designs and builds, but continuously evolving systems that are grown. This new generation of software systems is the core of the Web Giants. Constant evolution is mandatory to cope with exponential technology change, and it is the only way to co-construct engagement platforms through customer feedback. The unpredictability of usage, especially social usage, means that digital experiences are software processes that can only be crafted through measurement and continuous improvement. This critical change, from software being designed to software being grown, means that all companies that provide digital experiences to their customers must become software companies. A stable software support system could be outsourced, delegated or bought, but a constantly evolving, self-adaptive system becomes a core capability. This capability is deeply intertwined with the business, and its delivery processes and agents are to be valued and respected.

The second key idea is that there is a new way of building such software systems. We face two tremendous challenges: churning out innovations at the rate the market expects, and constantly integrating new features while factoring out older ones, to avoid the suffocation by constant growth that plagued previous generations of software systems. The solution is a combination of open innovation - there are clearly more smart developers outside any company than inside - together with source-level "white box" integration and minimalist "platform" design principles. When all your code needs to be constantly updated to follow changes in the environment, the less you own the better. It is also time to bring source code back from the dark depths of "black box" integration. Open source culture is both about leveraging the treasure trove of what may be found in larger development communities and about mashing up composite applications by weaving source code that one may be proud of. Follow in the footsteps of the Web Giants: code that changes constantly is worth being well written, structured, documented and reviewed by as many eyeballs as possible.

The third idea is another way of saying that "software is eating the world": this book is not about software, it is about a new way of thinking about your company, whichever business you are in.
Not surprisingly, many "known" practices, such as agile development, lean startup, the obsession with measurement or with saving the customer's time - the most precious commodity of the digital age - have found their way into OCTO's list. By reading the practical testimonies from the Web Giants, a new kind of customer-focused organization emerges. Thus, this is a book for everyone, not for geeks only. This is of the utmost importance, since many of the levers of change lie in the hands of stakeholders other than the software developers themselves. For instance, a key requirement for agility is to switch from solution requirements to problem requirements, allowing the solution to be co-developed by cross-functional teams as well as users.

The last idea I would propose is that there is a price to pay for this transformation. There are technologies, tools and practices that you must acquire and learn. DevOps practices, such as continuous delivery or managing infrastructure as code, require mastering a set of tools and building skills: there is no "free lunch". A key set of benefits from the Web Giants' way of working comes from massive automation. This book also
shows some of the most recent technology patterns in the architecture section. Since this list evolves by nature, the most important lesson is to create an environment where "doers" may continuously experiment with the tools of the future, such as massively parallel cloud programming, big data or artificial intelligence. A key consequence is that there is a true difference in efficiency and competitiveness between those who master this set of tools and skills and those who do not. In the world of technology, we often use the word "Barbarians" for newcomers who leverage their software and technology skills to displace incumbents in older industries. This is not a question of mindset (taking legacy companies head-on is an age-old strategy for newcomers) but a matter of capabilities!

As stated earlier, there would be other, more conceptual ways to introduce the key ideas and practices pictured in this book. One could cite the best sources on motivation and collaborative work, such as Daniel Pink: these Web Giants practices reflect the state of the art of managing intrinsic motivation. The same could be said of the best books on lean management and self-organization; the reference to Lean Startup is one of many subtle signs of the influence of the Toyota Way on modern 21st-century forms of organization. Similarly, it would be tempting to invoke complex-systems theory - see Jurgen Appelo and his book "Management 3.0" for instance - to explain why the practices observed and selected by OCTO are the natural answer to the challenges of the increasingly changing and complex world we live in. From a technology perspective, it is striking to see the similarity with the cultural and organizational traits described by Salim Ismail, Michael Malone and Yuri van Geest in their book "Exponential Organizations".
The beauty of this pragmatic approach is that you get almost everything you need to know in a much shorter package, one that is fun and engaging to read.

To conclude this preface, I would advise you to read this book carefully, to share it with your colleagues, your friends and your children - when it is time to think about what it means to do something that matters in this new world. It tells a story about the new way of working that you cannot afford to miss. Some of its messages - measuring everything, learning by doing, loving your code and respecting those who build things - may make the most seasoned manager smile, but times are changing. This is no longer a set of suggested, "nice-to-have" practices, as it might have been ten years ago. It is the standard of web-age software development, and de facto the only way for any company to succeed in the digital world.
Yves Caseau - National Academy of Technologies of France, President of the ICT commission.
Head of Digital at AXA Group
Introduction
Something extraordinary is happening at this very moment; a sort of revolution is underway. Across the Atlantic, as well as in other parts of the world such as France, people are reinventing how to work with information technology. They are Amazon, Facebook, Google, Netflix and LinkedIn, to name but the most famous. This new generation of players has managed to shed old dogmas and examine afresh the issues at hand, coming up with new, radical and efficient solutions to long-standing IT problems.
Computer scientists are well aware that when IT tools are introduced to a trade, the benefits of computerization can only be reaped if business processes are rethought in light of the new potential offered by technology. One trade, however, has thus far mostly managed to avoid upheavals in its own processes: Information Technology itself. Many continued - and still do - to build information systems the way one would build highways or bridges. There is a tendency to forget that the material handled on a daily basis is extremely volatile. From hearing so much about Moore's law,[1] its true meaning is forgotten: what couldn't be done last year is possible today; what cannot be done today will be possible tomorrow. The beliefs and habits of the ecosystem we live in must be challenged at regular intervals. This thought is both terrifying and wonderful.
Now that the pioneers have paved the way, it is time to revisit business processes. The new approaches laid out here offer significant gains in efficiency, proactivity and the capacity for innovation, to be harnessed before the competition pulls the rug out from under your feet. The good news is that the Web Giants are not only paving the way; they espouse the vision of an IT community. They are committed to Open Source principles, communicate their practices openly to appeal to potential recruits, and work in close collaboration with the research community. Their work methods are public knowledge and very accessible to those who care to delve into them.
The aim of this book is to provide a synthesis of practices, technological solutions and the most salient traits of IT culture. Our hope is that it will inspire readers to make contributions to an information age capable of reshaping our world.
This book is designed for both linear and thematic reading. Those who opt for the former may find some repetition.
[1] An empirical law stating that, at a fixed price, computing power roughly doubles every 18 months.
Culture
The Obsession with Performance Measurement .................... 13
Build vs Buy .................................................. 19
Enhancing the User Experience ................................. 27
Code Crafters ................................................. 33
Developing Open Source ........................................ 41
The Obsession with Performance Measurement
Description
In IT, we are all familiar with quotes reminding us of the importance of performance measurement:
That which cannot be measured cannot be improved; without measurement, it is all opinion.
Web Giants have taken this idea to the extreme, and most have developed a strong culture of performance measurement. The structure of their activities leads them in this direction.
These activities often share three characteristics:

- For these companies, IT is the means of production. Their costs are therefore directly correlated to the optimal use of equipment and software: improvements in the number of concurrent users served or in CPU usage translate into rapid ROI.
- Their revenues are directly correlated to the efficiency of the service provided: improvements in conversion rates lead to rapid ROI.
- They are surrounded by computers, and computers are excellent measurement instruments, so they may as well get the most out of them!
Most Web Giants have made a habit of measuring everything: response times, the most visited web pages, the articles (content or sales pages) that work best, the time spent on individual pages...
In short, nothing unusual – at first glance.
But that's not all! They also measure the heat generated by a given CPU, the energy consumption of a transformer, and the average time between two hard disk failures (MTBF, Mean Time Between Failures).[1] This motivates them to build infrastructure that maximizes the energy efficiency of their installations, and these players closely monitor their PUE, or Power Usage Effectiveness. Most importantly, they have learned to base their action plans on this wealth of metrics.
[1] http://storagemojo.com/2007/02/19/googles-disk-failure-experience
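Both metrics just mentioned are simple ratios, and that simplicity is part of why they are so widely tracked. As a rough illustration (the function names and all figures below are invented for the example, not drawn from any Giant's actual data):

```python
def mtbf_hours(total_operating_hours: float, failures: int) -> float:
    """Mean Time Between Failures: cumulative operating time
    divided by the number of failures observed over that time."""
    return total_operating_hours / failures

def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy divided by the
    energy consumed by IT equipment alone. 1.0 is the theoretical ideal;
    every point above it is cooling, power distribution and other overhead."""
    return total_facility_kwh / it_equipment_kwh

# A hypothetical fleet of 1,000 disks running 24/7 for a year, with 25 failures:
fleet_hours = 1000 * 365 * 24
print(mtbf_hours(fleet_hours, 25))   # 350400.0 operating hours per failure
print(pue(1_500_000, 1_000_000))     # 1.5: half a kWh of overhead per IT kWh
```

The point is not the arithmetic but the habit: once such ratios are computed continuously rather than estimated annually, they can drive action plans.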
Part of this trend is A/B testing (see "A/B Testing" on p. 123 for further information), which consists of testing different versions of an application on different client groups. Does A work better than B? The best way to find out remains objective measurement: it yields concrete data that can defy common sense and reveal the limits of armchair expertise, as demonstrated by the website www.abtests.com, which references A/B testing results.
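To make the mechanics concrete, here is a minimal sketch (not any Giant's actual implementation) of the two halves of an A/B test: deterministically bucketing users into groups, and checking whether an observed difference in conversion rate is statistically meaningful via a standard two-proportion z-test. The user id and traffic figures are invented:

```python
import hashlib
import math

def assign_variant(user_id: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user by hashing their id,
    so the same user always sees the same variant."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def z_score(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test on conversion rates;
    |z| > 1.96 is significant at the usual 5% level."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    return (p_a - p_b) / se

# Invented figures: variant A converts 120 of 1,000 visitors, B 150 of 1,000.
print(assign_variant("user-42"))
print(round(z_score(120, 1000, 150, 1000), 2))
```

The deterministic hash is a common design choice: it needs no stored assignment table, yet keeps each user's experience stable across visits.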
In an interview, Yassine Hinnach - then Senior Engineering Manager at LinkedIn - described how LinkedIn teams were encouraged to quickly put any technology designed to boost site performance to the test. Decisions to adopt a given technology are thus made on the basis of observed metrics.

HighScalability.com has published an article presenting Amazon's recipes for success, based on interviews with its CTO. Among the more interesting quotes, the following caught our attention:
Everyone must be able to experiment, learn, and iterate. Position, obedience, and tradition should hold no power.
For innovation to flourish, measurement must rule.[2]
As another example of this approach, here is what Timothy B. Lee, a journalist for Wired and the New York Times, had to say about Google's culture of performance measurement:

"Rather than having intimate knowledge of what their subordinates are doing, Google executives rely on quantitative measurements to evaluate the company's performance. The company keeps statistics on everything - page load times, downtime rates, click-through rates, etc. - and works obsessively to improve these figures. The obsession with data-driven management extends even to the famous free snacks, which are chosen based on careful analysis of usage patterns and survey results."[3]
[2] http://highscalability.com/amazon-architecture
[3] http://arstechnica.com/apple/news/2011/06/fourth-times-a-charm-why-icloud-faces-long-odds.ars
The consequences of this modus operandi run deep. A number of pure players display in their offices the motto “In God we trust. Everything else, we test“. This is more than just a nod to Deming;[4] it is a profoundly pragmatic approach to the issues at hand.
An extreme example of this trend, verging on caricature, is Google's 'Project Oxygen': a team of internal statisticians combed through HR data collected internally - annual performance reviews, feedback surveys, nominations for top-manager awards - and distilled the essence of what makes a good manager down to eight rules. Reading through them, any manager worthy of the name would be struck by how jaw-droppingly obvious they seem. However, the claims were backed with cold, hard data,[5] and that made all the difference!
What about me?

The French are fond of modeling, and are often less pragmatic than their English-speaking counterparts.
Indeed, we believe that this constant, rapid feedback loop - hypothesis, measurement, decision - should be an almost systematic reflex in IT departments, and one that can be put into effect at a moment's notice.
The author of these lines still has painful memories of two four-hour meetings, each with ten attendees, organized to determine whether shifting service-layer requests to HTTP would have a "significant" impact on performance. Ten working days would have been more than enough for a developer to measure it, at a much lower cost.
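The lesson generalizes: a quick measurement beats a long debate. A minimal sketch of such a micro-benchmark might look like this, where the two call paths are hypothetical stubs standing in for a direct in-process call and the same call over HTTP:

```python
import time

def average_ms(fn, iterations: int = 1000) -> float:
    """Average wall-clock execution time of fn, in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) * 1000 / iterations

# Hypothetical stand-ins: replace these stubs with the real call paths
# under comparison before drawing any conclusion.
def call_direct():
    sum(range(100))

def call_over_http():
    sum(range(100))

print(f"direct: {average_ms(call_direct):.4f} ms")
print(f"http:   {average_ms(call_over_http):.4f} ms")
```

An afternoon spent on such a script produces a number; a meeting produces an opinion.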
OCTO consultants have also discovered, several times over, that applications performed better once the cache that had been added to improve performance was removed! The cure was worse than the disease, and its alleged efficacy had never actually been measured.
Management runs the risk of falling into the trap of believing that "hard data" analysis is a given. It may be a good idea to check regularly that this is indeed the case, and especially that the information gathered is actually used in decision-making.
[4] "In God we trust; all others must bring data", W. Edwards Deming.
[5] Adam Bryant, Google's Quest to Build a Better Boss, The New York Times, March 12, 2011: http://www.nytimes.com/2011/03/13/business/13hire.html
Nevertheless, it cannot be emphasized enough that an ecosystem fostering the application of this information is part of the Web Giants' recipe for success.
Two other practices support this culture of performance metrics:

- Automated tests: the result is either red or green, and no one can argue with that. They also ensure that it is always the same thing being measured.
- Short cycles: to measure - and especially to interpret - the data, one must be able to compare options "all other things being equal". This is crucial. We recently assessed steps taken to improve the performance of an application, but about a dozen other optimizations had been made in the same release. How, then, can efficient optimizations be distinguished from counter-productive ones?
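The "red or green" property comes from assertions: a test either passes or it fails, leaving no room for opinion. A minimal, hypothetical example using Python's built-in unittest framework (the function under test is invented for illustration):

```python
import unittest

def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who converted; 0.0 when there were no visitors."""
    if visitors == 0:
        return 0.0
    return conversions / visitors

class ConversionRateTest(unittest.TestCase):
    def test_nominal_case(self):
        # Green if the computation is right, red otherwise.
        self.assertAlmostEqual(conversion_rate(3, 100), 0.03)

    def test_no_visitors(self):
        # Edge case: no visitors must not crash with a division by zero.
        self.assertEqual(conversion_rate(0, 0), 0.0)

if __name__ == "__main__":
    unittest.main(argv=["prog"], exit=False)
```

Because the same assertions run unchanged on every build, the suite also guarantees that it is always the same thing being measured from one cycle to the next.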
Build vs Buy
Description
One striking difference between the strategy of the Web Giants and that of more traditional IT departments lies in how they arbitrate between Build and Buy.
The issue is as old as computers themselves: is it better to invest in designing software that best fits your needs, or to use a software package and benefit from the capitalization and R&D of a publisher (or community) that has had all the necessary leisure to master the technological and business details?
Most major firms have gone for the second option and have enshrined maximal use of software packages among their guiding principles, on the view that IT is not one of their core businesses and is therefore better left to professionals.
The major Web companies have tended to do the exact opposite, which makes sense given that IT is precisely their core business, and as such too sensitive to be left in the hands of outsiders. The resulting divergence is thus coherent.
Nonetheless, it is useful to push the analysis one step further, because the Web Giants have other motives too: first, being in control of the development process to ensure it is perfectly adjusted to their needs; second, the cost of scaling up! These concerns exist in other IT departments as well, which means it can be a good idea to look very closely into your software-package decisions.
Finding balanced solutions
On the first point, one of the built-in flaws of software packages is that they are designed for, and around, the needs that arise most often among the publisher's clients.[1] Your needs are thus only a small subset of what the package is built to do. Adopting a software package by definition entails overkill: an overly complex solution, not optimized for your needs, whose price in terms of execution and complexity offsets the savings made by not investing in the design and development of a bespoke application.

[1] We will not dwell here on the fact that you should not stray too far from the standard out-of-the-box software package, as this can be (very) expensive in the long term, especially when new releases come out.
This is particularly striking in the data model of software packages. Much of the model's complexity stems from the fact that the package is optimized for interoperability (a highly standardized conceptual data model, extension tables, low model expressiveness since it is a meta-model...). However, the abstractions and "hyper-genericity" this leads to in the software design have an impact on processing performance.[2]
Moreover, the Web Giants face constraints in terms of volume, transaction speed and number of simultaneous users that push traditional architectures to their limits and, consequently, require fine-tuned optimizations based on observed access patterns. Read-intensive transactions must not be optimized in the same way as those where the stakes lie in write I/O metrics.
In short, to attain such results you have to pop the hood and poke around in the engine, which is not something you can do with a software package (all guarantees are void from the moment you fiddle with the innards).
Because performance is an obsession for the Web Giants, the overhead and limited tuning possibilities of software packages make them quite simply unacceptable.
Costs
The second particularly critical point is, of course, cost at scale. When the number of processors and servers increases, costs rise very quickly, and not always linearly, which makes some line items much more visible. This is true of both business software packages and hardware.
That is precisely one of the arguments that led LinkedIn to gradually replace their Oracle database with an in-house solution, Voldemort.[3] In a similar vein, in 2010 we carried out a study of the main e-commerce
[2] When it is not a case of a cumbersome interface.
[3] Yassine Hinnach, Évolution de l'architecture de LinkedIn, enjeux techniques et organisationnels, USI 2011: http://www.usievents.com/fr/conferences/8-paris-usi-2011/sessions/1007
sites in France: at the time, eight of the ten largest sites (by annual turnover) ran on platforms developed in-house and two used e-commerce software packages.
The Web Giants thus prefer Build to Buy - but not only that. They also make massive use of open source solutions (cf. "Developing open source", p. 41). Linux and MySQL reign supreme in many firms. Development languages and technologies are almost all open source: very little .NET, for example, but rather Java, Ruby, PHP, C(++), Python, Scala... And they do not hesitate to fork other projects: Google, for example, uses a heavily modified Linux kernel.[4] The same is true of one of the main worldwide Global Distribution Systems.
Most technologies making a stir today in the world of high-performance architecture are the result of developments carried out by Web Giants and then opened up to the community: Cassandra, developed by Facebook; Hadoop and HBase, inspired by Google and developed at Yahoo!; Voldemort, by LinkedIn...
This is a way of combining the advantages of software perfectly tailored to your needs with the improvements contributed by the development community - with, as an added bonus, a market of people already trained in the technologies you use.
Coming back to the example of LinkedIn, many of their technologies are grounded in open source solutions:

- Zoie, a real-time indexing and search system based on Lucene.
- Bobo, a faceted search library based on Lucene.
- Azkaban, a batch workflow scheduler to manage Hadoop job dependencies.
- GLU, a deployment framework.
[4] http://lwn.net/Articles/357658
How can I make it work for me?

Does this mean I have to do away with software packages in my IT choices?
Of course not - not for everything. A software package can be the best solution: no one today would dream of redeveloping a payroll system. However, ad hoc development should be considered when the IT tool is key to the success of your business. Figure 1 lays out orientations in terms of strategy.
The other context where specific development can be the right choice is high performance: with companies moving to "full web" solutions, very few business software packages have an architecture able to support the traffic intensity of some websites.
As for infrastructure solutions, open source has become the norm: operating systems and application servers first and foremost, and often databases and message buses as well. Open source solutions are ideally suited to running the platforms of the Web Giants; there is no doubt as to their performance and stability.
One hurdle remains: reluctance on the part of CIOs to forgo the support that comes with commercial software packages. And yet, when you look at what actually happens when there are problems with a commercial technical platform, it is rarely the publisher's support, handsomely paid for, that provides the solution, but rather networks of specialists and help forums on the Internet.
[Figure 1: sourcing strategy by asset type. Innovations and strategic assets - unique, differentiating, perceived as a commercial asset - justify specific development (the "faster" end of the scale). Functions common to all organizations in an industry, perceived as production assets, suit a software package. Functions common to all organizations, perceived as a resource, suit BPO[5] (the "cheaper" end of the scale).]

[5] Business Process Outsourcing.
For application platforms such as databases or message buses, the answer is less clear-cut, because some commercial solutions include functionality that open source alternatives lack. However, if you are pushing an Oracle database into regions where MySQL cannot follow, you have very sophisticated needs - which is not the case in 80% of the contexts we encounter!
Enhancing User Experience
Description
Performance: a must
One conviction shared by the Web Giants is that users' perception of performance is crucial. Performance is directly linked to visitor retention and loyalty: how users feel about a particular service is tied to the speed with which its graphic interface is displayed.
Most people have no interest in software architecture, server power, or the network latency inherent in web-based services. All that matters is the impression of seamlessness.
User-friendliness is no longer negotiable
The Web Giants have fully grasped this and speak of metrics in terms of "the blink of an eye" - in other words, fractions of a second. Their measurements, carried out notably through A/B testing (cf. "A/B Testing", p. 123), are very clear:
- Amazon: a 100 ms increase in latency means a 1% loss in sales.
- Google: a page taking more than 500 ms longer to load loses 20% of its traffic (pages visited).
- Yahoo!: 400 ms more to load means 5 to 9% more abandonment.
- Bing: more than one second to load means a 2.8% loss in advertising income.
How is such performance attained?
In keeping with the Device Agnostic pattern (cf. "Device Agnostic", p. 143), the Web Giants develop either native interfaces or Web interfaces, always aiming to offer the best possible user experience. In both cases, performance as perceived by the user must be maximized.
Native applications
With the iPhone, Apple reintroduced applications developed for a specific device (stopping short of assembler, however) to maximize perceived performance. Thus Java and Flash technologies are banished from the iPhone. The platform also uses visual artifacts: when an app is launched, it displays a snapshot of the view as it appeared when the app was last loaded, to strengthen the impression that start-up is instantaneous, while the actual app loads in the background. On Android, Java applications are executed on a virtual machine optimized for the platform. They can also be written in C to maximize performance.
Generally speaking, there is a consensus around native development, especially on mobile platforms: it must be as tightly linked as possible to the device. Multi-platform technologies such as Java ME, Flash and Silverlight do not directly enhance the user experience and are therefore set aside.
Web applications
Fully loading a Web page usually takes between 4 and 10 seconds (including graphics, JavaScript, Flash, etc.).
It would seem that perceived display latency generally breaks down as roughly 5% server processing and 95% browser processing. Web Giants have therefore taken considerable care to optimize the display of Web pages.
As an illustration, here is a list of the main good practices generally agreed to optimize perceived performance:
It is crucial to cache all static resources (graphics, CSS style sheets, JavaScript scripts, Flash animations, etc.) whenever possible. There are various HTTP cache technologies for this. It is important to become skillful at optimizing the life-cycle of the resources in the cache.
It is also advisable to use a cache network, or Content Delivery Network (CDN) to bring the resources as close as possible to the end user to reduce network latency. We highly recommend that you have cache servers in the countries where the majority of your users live.
Downloading in the background is a way of masking sluggishness in the display of various elements on the page.
A common practice is to use sprites: the principle is to aggregate several images in a single file to limit the number of requests; the needed image is then selected on the fly by the browser (see the Gmail example below).
Having recourse to multiple domain names is a way to maximize parallelization in simultaneous resource loading by the browser. One must bear in mind that browsers are subject to a maximum number of simultaneous requests to the same domain. Yahoo.fr, for example, loads its images from l.yimg.com.
Placing JavaScript resources at the very end of the page to ensure that graphics appear as quickly as possible.
Using minification tools, i.e. removing from the code (JavaScript, HTML, etc.) all characters (line breaks, comments, etc.) that help humans read the code but are not needed to execute it, and shortening function names as much as possible.
Concatenating the various source files, such as JavaScript, into a single file whenever possible.
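The last two practices (minification and concatenation) can be sketched in a few lines. This is a deliberately naive illustration: real tools such as UglifyJS or the Closure Compiler also shorten identifiers and handle string literals and regular expressions safely, which this sketch does not:

```python
import re

def minify_js(source: str) -> str:
    """Naive minifier: strip comments and collapse whitespace.
    Unsafe on string and regex literals; for illustration only."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)  # block comments
    source = re.sub(r"//[^\n]*", "", source)               # line comments
    source = re.sub(r"\s+", " ", source)                   # collapse whitespace
    return source.strip()

def bundle(sources: list) -> str:
    """Concatenate several minified sources into a single payload,
    saving one HTTP request per file."""
    return "\n".join(minify_js(src) for src in sources)

payload = bundle([
    "// init\nvar a = 1;",
    "/* helper */\nfunction f(x) { return x + a; }",
])
```

Fewer bytes and fewer requests both shorten the time before the browser can render the page.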
Who makes it work for them?
There are many examples of such practices among Web Giants, e.g. Google, Gmail, Viadeo, GitHub, Amazon, Yahoo!...
References among Web Giants
Google has the most extensive distributed cache network of all Web Giants: the search giant is said to have machines in all major cities, and even a private global network, although corroboration is difficult to come by.
Google Search pushes the real-time user experience to the limits with its “Instant Search“, which loads search results as you type your query. This function stems from formidable technical skill and has aroused the interest of much of the architect community.
Gmail images are reduced to a strict minimum (two sprite images, shown in Figure 1), and the site makes intensive use of caching and loads JavaScript in the background.
Figure 1: Gmail sprite images.
France
Sites using or having used the content delivery network Akamai:
cite-sciences.fr
lemonde.fr
allocine.com
urbandive.com
How can I make it work for me?
The consequences of display latency are the same for in-house applications within any IT department: users get fed up with the application and stop using it. That is to say, this is a pattern which applies perfectly to your own business.
Sources
• Eric Daspet, “Performance des applications Web, quoi faire et pourquoi ?“, USI 2011 (French only):
> http://www.usievents.com/fr/conferences/10-casablanca-usi-2011/sessions/997-performance-des-applications-web-quoi-faire-et-pourquoi
• Articles on Google Instant Search:
> http://highscalability.com/blog/2010/9/9/how-did-google-instant-become-faster-with-5-7x-more-results.html
> http://googleblog.blogspot.com/2010/09/google-instant-behind-scenes.html
Editor’s note: sprites are, by definition, designed for screen display; we are unable to provide a better rendering of this example in print. Thank you for your understanding.
Code Crafters
Description
Today the Web Giants are there to remind us that a career as a developer can be just as prestigious as that of a manager or consultant. Indeed, some of the most striking successes of Silicon Valley have originated with one or several visionary geeks who are passionate about quality code.
When these companies’ products gain in visibility, satisfying an increasing number of users means embracing a virtuous cycle of development quality, without which success can vanish as quickly as it came.
Which is why a software development culture is so important to Web Giants, based on a few key principles:
attracting and recruiting the best programmers,
investing in developer training and allowing them more independence,
gaining their loyalty through workplace attractiveness and payscale,
being uncompromising as to the quality of software development - because quality is non-negotiable.
Implementation
The first challenge the Giants face is thus recruiting the best programmers. They have become masters at the art, which is trickier than it might at first appear.
One test which is often used by the majors is to have the candidates write code. A test Facebook uses is FizzBuzz. This exercise, inspired by a drinking game which some of you might recognize, consists in displaying the numbers from 1 to 100, replacing multiples of 3 with “Fizz“, multiples of 5 with “Buzz“, and multiples of both 3 and 5 with “FizzBuzz“. This little programming exercise weeds out 99.5% of the candidates. Similarly, to be hired by Google, between four and nine technical interviews are necessary.
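For reference, a complete FizzBuzz (counting from 1 to 100, replacing multiples of 3 with “Fizz“, multiples of 5 with “Buzz“ and multiples of both with “FizzBuzz“) fits in a few lines of Python, which is precisely what makes failing it so telling:

```python
def fizzbuzz(n: int) -> str:
    # Multiples of both 3 and 5 (i.e. of 15) must be tested first.
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

print(", ".join(fizzbuzz(n) for n in range(1, 101)))
# → 1, 2, Fizz, 4, Buzz, Fizz, 7, ...
```

The point of the test is not algorithmic difficulty but whether the candidate can turn a trivially clear specification into working code at all.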
Salary is obviously to be taken into account. To have very good developers, you have to be ready to pay the price. At Facebook, Senior Software Engineers are among the best paid employees.
Once programmers have joined your firm, the second challenge is to foster their development and fulfillment, and to enrich their skills. In such companies, programmers are not considered code laborers to be watched over by a manager but instead as key players. The Google model, which encourages developers to devote 20% of their time to R&D projects, is often cited as an example. This practice can give rise to contributions to open-source projects, which provide many benefits to the company (cf. “Open Source Contribution“, p. 41). On the Netflix blog for example, they mention their numerous open source initiatives, notably on Zookeeper and Cassandra. The benefit to Netflix is twofold: its developers gain in notoriety outside the company, while at the same time developing the Netflix platform.
Another key element in developer loyalty is working conditions. The internet provides ample descriptions of the extent to which Web Giants are willing to go to provide a pleasant workplace. The conditions are strikingly different from what one finds in most Tech companies. But that is not all! Netflix, again, has built a culture which strongly focuses on its employees’ autonomy and responsibility. More recently, Valve, a video game publisher, sparked a buzz among developers when they published their Handbook, which describes a work culture that is highly demanding but also propitious to personal fulfillment. 37signals, lastly, with their book Getting Real, lay out their very open practices, often the opposite of what one generally finds in such organizations.
In addition to efforts deployed in recruiting and holding on to programmers, there is also a strong culture of code and software quality. It is this culture that creates the foundations for moving and adapting quickly, all while managing mammoth technological platforms where performance and robustness are crucial. Web Giants are very close to the Software Craftsmanship[1] movement, which promotes a set of values and practices aiming to guarantee top-quality software and to provide as much value as possible to end-users. Within this movement, Google and GitHub have not hesitated to share their coding guidelines[2].
[1] http://manifesto.softwarecraftsmanship.org
[2] http://code.google.com/p/google-styleguide/ and https://github.com/styleguide
How can I make it work for me?
Recruiting: It is important to implement very solid recruitment processes when hiring your programmers. After a first interview to get a sense of the person you wish to recruit, it is essential to have the person code. You can propose a few technical exercises to assess the candidate’s expertise, but it is even more interesting to have them pair-program with one of your developers, to see whether there is a good fit around the project. You can also ask programmers to show their own code, especially what they are most proud of - or most ashamed of. More than the code itself, the discussion around it will yield a wealth of information on the candidate. Also, did they put their code on GitHub? Do they take part in open source projects? If so, you will have representative samples of the code they can produce.
Quality: Offer your developers the context which will allow them to continue producing top-quality software (since that is non-negotiable). Leave them time to write unit tests, to set up the development build you will need for Continuous Deployment (cf. “Continuous Deployment“, p. 105), to work in pairs, to hold design workshops in their business domain, to prototype. The practice which is known to have the most impact on quality is peer code reviewing. This happens all too rarely in our sector.
R&D: Giving your developers the chance to participate in R&D projects in addition to their work is a practice which can be highly profitable. It can generate innovation, contribute to project improvement and, in the case of Open Source, increase your company’s attractiveness for developers. It is also simply a source of motivation for this often neglected group. More and more firms are adopting the principle of Hackathons, popularized by Facebook, which consists in producing working software in one or two days.
Training: Training can be outsourced, but you can also profit from knowledge sharing among in-house developers by e.g. organizing group programming workshops, commonly called “Dojos“.[3] Developers can gather for half a day, around a video projector, to share knowledge and together learn about specific technical issues. It is also a way to share development practices and, within a team, to align on programming standards. Lastly, working on open source projects is also a way of learning about new technologies.
Workplace: Where and how you work are important! Allowing independence, promoting openness and transparency, welcoming mistakes and keeping a sustainable rhythm are all practices that pay off in the long term.
Associated patterns
Pattern “Pizza Teams“, p. 59.
Pattern “DevOps“, p. 71.
Pattern “Continuous Deployment“, p. 105.
Sources
• Company culture at Netflix:
> http://www.slideshare.net/reed2001/culture-1798664
• What every good programmer should know:> http://www.slideshare.net/petegoodliffe/becoming-a-better-programmer
• List of all the programmer positions currently open at Facebook:> http://www.facebook.com/careers/teams/engineering
• The highest salary at Facebook? Senior Software Engineer:
> http://www.businessinsider.com/the-highest-paying-jobs-at-facebook-ranked-2012-5?op=1
[3] http://codingdojo.org/cgi-bin/wiki.pl?WhatIsCodingDojo
• GitHub programming guidelines:> https://github.com/styleguide
• How GitHub grows:> http://zachholman.com/talk/scaling-github
• Open source contributions from Netflix:> http://techblog.netflix.com/2012/07/open-source-at-netflix-by-ruslan.html
• The FizzBuzz test:> http://c2.com/cgi/wiki?FizzBuzzTest
• Getting Real:> http://gettingreal.37signals.com/GR_fra.php
• The Software Craftsmanship manifesto:> http://manifesto.softwarecraftsmanship.org
• The Google blog on tests:> http://googletesting.blogspot.fr
• The Happy Manifesto:> http://www.happy.co.uk/wp-content/uploads/Happy-Manifesto1.pdf
Open Source Contribution
Description
Why is it that Web Giants such as Facebook, Google and Twitter do so much to develop Open Source?
A technological edge is a key to conquering the Web. Whether it be to stand out from the competition by launching new services (remember when Gmail came out with all its storage space at a time when Hotmail was lording it?) or more practically to overcome inherent constraints such as the growth challenge linked to the expansion of their user base. On numerous occasions, Web Giants have pulled through by inventing new technologies.
One would therefore expect their technological mastery, and the asset their code represents, to be carefully shielded from prying eyes. In fact, the widely shared pattern is quite the opposite: Web Giants are not only major consumers of open source technology, they are also major contributors.
The pattern “developing open source“ consists of making public a software tool (library, framework...) developed and used in-house. The code is made available on a public server such as GitHub, with a free license of the Apache type for example, authorizing its use and adaptation by other companies. In this way, the code is potentially open to development by the entire world. Moreover, open source applications are traditionally accompanied by much publicity on the web and during programming conferences.
Who makes it work for them?
There are many examples. Among the most representative is Facebook and its Cassandra database, built to manage massive quantities of data distributed over several servers. It is interesting to note that among current users of Cassandra, one finds other Web Giants, e.g. Twitter and Digg, whereas Facebook has abandoned Cassandra in favor of another open source storage solution - HBase - launched by the company Powerset. With the NoSQL movement, the new foundations of the Web are today massively based on the technologies of the Giants.
Facebook has furthermore opened several frameworks up to the community, such as its HipHop engine which compiles PHP into C++, Thrift, a cross-language services framework, and Open Compute, an open hardware initiative which aims to optimize how datacenters function. But Facebook is not alone.
Google has done the same with its user interface framework GWT, used notably in AdWords. Another example is the Tesseract Optical Character Recognition (OCR) tool, initially developed by HP and then by Google, which opened it up to the community a few years later. Lastly, one cannot name Google without citing Android, its open source operating system for mobile devices, not to mention its numerous scientific publications on storing and processing massive quantities of data. We are referring more particularly to the papers on BigTable and MapReduce which inspired the Hadoop project.
The list could go on and on, so we will end with Twitter and Bootstrap, its very trendy CSS framework for responsive design, and with the excellent Ruby on Rails, extracted from the Basecamp project management software and opened up to the community by 37signals.
Why does it work?
Putting aside ideological considerations, we propose to explore various advantages to be drawn from developing open software.
Open and free does not necessarily equate with price and profit wars. In fact, from one angle, opening up software is a way of nipping competition in the bud for specific technologies. Contributing to Open Source is a way of redefining a given technology sector while ensuring sway over the best available solution. For a long time, Google was the main sponsor of the Mozilla Foundation and its flagship project Firefox, to the tune of 80%: a way to diversify against Microsoft. Let us now turn to our analysis of the three advantages.
Promoting the brand
By opening cutting-edge technology up to the community, Web Giants position themselves as leaders, pioneers. It implicitly communicates a spirit of innovation reigning in their halls, a constant quest for improvements. They show themselves as being able to solve big problems, masters of technological prowess. Delivering a successful Open Source framework says that you solved a common problem faster or better than anyone else. And that, in a way, the problem is now behind you. Done and gone, you’re already moving onto the next. One step ahead of the game.
To share a framework is to make a strong statement, to reinforce the brand. It is a way to communicate an implicit and primal message: “We are the best, don’t you worry.“
And then, to avoid being seen as the new Big Brother, one can’t help feeling that the implied message is also: “We’re open, we’re good guys, fear not“.[2]
Attracting - and keeping - the best
This is an essential aspect which can be fostered by an open source approach. Because “displaying your code“ means showing part of your DNA, your way of thinking, of solving problems - show me your code and I will tell you who you are. It is the natural way of publicizing what exactly goes on in your company: the expertise of your programmers, your quality standards, what your teams work on day by day... A good means to attract “compatible“ coders who will already have been following your company’s projects.
Developing Open Source thus helps you to spot the most dedicated, competent and motivated programmers, and when you hire them you are already sure they will integrate easily into your ecosystem. In a manner of speaking, Open Source is like a huge trial period, open to all.
[2] Google’s motto: “Don’t be evil“
Attracting the best geeks is one thing, hanging on to them is another. On this point, Open Source can be a great way to offer your company’s best programmers a showcase demonstration open to the whole world.
That way they can show their brilliance, within their company and beyond. Promoting Open Source bolsters your programmers’ resumes. It takes into account the Personal Branding needs of your staff, while keeping them happy at work. All programmers want to work in a place where programming is important, within an environment which offers a career path for software engineers. Spoken as a programmer.
Improving quality
Simply “thinking open source“ is already a leap forward in quality: opening up code - a framework - to the community first entails defining its contours, naming it, describing the framework and its aim. That alone is a significant step towards improving the quality of your software because it inevitably leads to breaking it up into modules, giving it structure. It also makes it easier to reuse the code in-house. It defines accountability within the code and even within teams.
It goes without saying that programmers who are aware that their code will be checked (not to mention read by programmers the world over) will think twice before committing an untested method or a hastily assembled piece of code. Beyond making programmers more responsible, feedback from peers outside the company is always useful.
How can I make it work for me?
When properly used, Open Source can be an intelligent way not only to structure your R&D but also to assess programmer performance.
The goal of this paper was to explore the various advantages offered by opening up certain technologies. If you are not quite up to making the jump culturally speaking, or if your information system is not ready yet, it can nonetheless be useful to test the waters with a few simple-to-implement actions.
Depending on the size of your company, launching your very first Open Source project can unfortunately be met with general indifference. We do not all have the powers of communication of Facebook. Beginning by contributing to Open Source projects already underway can be a good initial step for testing the culture within your teams.
Like Google and GitHub, another action which works towards the three advantages laid out here is to write up your programming guidelines and publish them on the web. Another possibility is to encourage your programmers to open a development blog where they can discuss the main issues they have come up against. The Instagram Engineering blog on Tumblr can be a very good source of inspiration.
Sources
• The Facebook developer portal, Open Source projects:
> http://developers.facebook.com/opensource
• Open-Source Projects Released By Google:> http://code.google.com/opensource/projects.html
• The Twitter developer portal, Open Source projects:> http://dev.twitter.com/opensource/projects
• Instagram Engineering Blog:> http://instagram-engineering.tumblr.com
• The rules for writing GitHub code:> http://github.com/styleguide
• A question on Quora: “Why would a big company do open-source projects?“:
> http://www.quora.com/Open-Source/Why-would-a-big-company-do-open-source-projects
Sharing Economy platforms
Description
The principles at work in the platforms of the sharing economy (exponential business platforms) are one of the keys to the success of the Web Giants and of other startups valued at $1 billion (“unicorns“) such as BlablaCar, Cloudera and Social Finance, or at over $10 billion (“decacorns“) such as Uber, AirBnB, Snapchat and Flipkart (List and valuation of the Uni/Deca-corns). These companies are disrupting existing ecosystems, inventing new ones, and wiping out others. And yet “Businesses never die, only business models evolve“ (to learn more, see Philippe Silberzahn, “Relevez le défi de l’innovation de rupture“).

Concerns over the risks of disintermediation are legitimate given that digital technology has led to the development of numerous highly successful “exponential business platforms“ (see the article by Maurice Levy, “Se faire ubériser“).

The article below begins with a recap of what these platforms have in common and then explores the main fundamentals necessary for building or becoming an exponential business platform.
The wonderful world of the “Sharing economy“
There is a continuous stream of newcomers knocking at the door, progressively transforming many sectors of the economy and driving them towards a so-called “collaborative“ economy. Among other goals, this approach strives to develop a new type of relation: Consumer-to-Consumer (C2C). This is true e.g. in the world of consumer loans, where the company LendingHome (Presentation of LendingHome) is built on peer-2-peer lending. Another area of interest is blockchain technology, with the decentralisation and “peer-2-peer-isation“ of money through Bitcoin! What is most striking is that this type of relation can have an impact in unexpected places such as personalised urban car services (e.g. Luxe and Drop Don't Park) and movers (Lugg as an “Uber/Lyft for moving“).

Business platforms such as these favor peer-2-peer relations. They have achieved exponential growth by leveraging the multitudes (for further information, see Nicolas Colin & Henri Verdier, L'âge de la multitude: Entreprendre et gouverner après la révolution numérique). Such models make it possible for very small structures to grow very quickly by generating revenues per employee which can be from 100 to 1000 times higher than
in businesses working in the same sector but which are much larger. The fundamental question is then to know what has enabled some of them to become hits and to grow their popularity, in terms of both community and revenues. What are the ingredients in the mix, and how does one become so rapidly successful?
At this stage, the contextual elements and common ground we discern are:
An often highly regulated market where these platforms appear and then develop by providing new solutions which break away from regulations (for example the obligation for hotels to make at least 10% of their rooms disability friendly, which does not apply to individuals using the AirBnB system).
An as yet unmet need in supply and demand can make it possible to earn a living or to generate additional revenue for a better quality of life (Cf. AirBnB's 2015 communication campaign on the subject) or at the least to share costs (Blablacar). This point in particular raises crucial questions as to the very notion of work, its regulation and the taxation of platforms.
There is strong friction around the client and citizen experience, where the market has yet to provide a response (such as valet parking services in large cities around the world where parking is completely saturated).
A deliberate strategy to not invest in material assets but rather to efficiently embrace the business of creating links between people.
Given this understanding of the context, the 5 main principles we propose to become an exponential business platform are:
Develop your “network lock-in effect“.
Pair up algorithms with the user experience.
Develop trust.
Think user and be rigorous in execution.
Carefully choose your target when you launch platform experiments.
“Network lock-in effect“
The more supply and demand grow and come together, the more indispensable your platform becomes. Indispensable because in the end that is where the best offers are to be found, the best deals, where your friends are.
There is an inflection point where the network of suppliers and users becomes the main asset, the central pillar. Attracting new users is no longer the principal preoccupation. This asset makes it possible to become the reference platform for your segment. This growth can provide a monopoly over its use case, especially if there are exclusive deals that can be obtained through offers valid on your platform only.
It can then extend to offers which follow upon the first (for example Uber's position as an urban mobility platform has led them to diversify into a meal delivery service for restaurants). This is one of the elements which were very quickly theorised in the Lean Startup approach: the virality coefficient.
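A minimal sketch of what the virality coefficient captures (a simplified discrete model, not a formula from the Lean Startup literature itself; k is taken here as the average number of new users each existing user brings in per cycle):

```python
def users_after(initial_users: float, k: float, cycles: int) -> float:
    """Simplified viral-growth model: each cycle, every current user
    brings in k new users on average. k > 1 yields exponential growth;
    k < 1 means growth stalls without other acquisition channels."""
    users = initial_users
    for _ in range(cycles):
        users *= (1 + k)
    return users

# With k = 1.0 the user base doubles every cycle: 100 -> 200 -> 400 -> 800.
```

The network lock-in effect kicks in once this organic loop, rather than paid acquisition, becomes the main source of growth.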
The perfect match: User eXperience & Algorithms
What is crucial in the platform is setting up the perfect match between supply and demand: speed in making connections in time and/or space, lower prices compared to traditional systems, and even services that weren't possible before. For some platforms, the matching algorithms are the core of the operation, delivering on the daily promise of suggesting relevant connections within a few micro-seconds.
The perfect match is a fine-tuned mix between stellar research into the user experience (all the way to the swipe!), often with a mobile-first approach to exploring and offering services, built on advanced algorithms that surface relevant associations. A telling example is the swipe gesture, a uniquely tailored user experience for fast browsing, as in the dating app Tinder.
Trust & security
To get beyond the early adopters and reach the market majority, two elements are critical to the client experience: trust in the platform, and trust towards the other platform users (both consumers and providers).
Who has not experienced stress when reserving one's first AirBnB? Who has not wondered whether Uber would actually be there?
This level of trust conveyed by the platform and its users is so important that it has been one of the key leverage points, as with BlablaCar, whose ride-sharing platform thrived once transactions were handled by the platform itself.
What happens to the confidential data provided to the platform?
You may remember the recent hack of personal data on the “Ashley Madison“ site, affecting 37 million platform users who wanted total discretion (Revelations around the hacking of the Ashley Madison site). Security is thus key to protecting platform transactions, guaranteeing private data and reassuring users.
Think user & excel in execution
Above all it is about realising that what the market and the clients want is not to be found in marketing plans, sales forecasts and key functionalities. The main questions to ask revolve around the triplet Client / Problem / Solution: Do I really have a problem that is worth solving? Is my solution the right one for my client? Will my client buy it? For how much? Use whatever you can to check your hypotheses: interviews, market studies, prototypes...
To succeed, these platforms aim to reach production very quickly, iterating and improving while their competition is still exploring their business plan. It is then a ferocious race between pioneers and copycats, because in this type of race “winner takes all“ (For further reading, see The Second Machine Age, Erik Brynjolfsson & Andrew Mcafee).
Then excellence in execution becomes the other pillar. This operational excellence covers:
the platform itself and the users it “hosts“: active users, quality of the goods on offer, numerous well-rated offers...

the offers mediated by the platform (comments, satisfaction surveys...).
Note in particular the example of AirBnB on the theme of excellence in execution beyond software: the quality of the lodging descriptions, as well as beautiful photos, were a strong differentiator compared to the competition of the time (Craigslist) (A few words on the quality of the photos at AirBnB).
Critical market size
Critical market size is one of the elements which make it possible to rapidly reach a sufficiently strong network effect (speed in reaching a critical size is fundamental to not being overrun by copycats).
Critical market size is made up of two aspects:
Selecting the primary territories for deployment, most often in cities or mega-cities,
Ensuring deployment in other cities in the area, when possible in standardized regulatory contexts.
You must therefore choose cities particularly concerned by your platform's value proposition, where the number of early adopters is high enough to quickly gain traction. Mega-cities in the Americas, Europe and Asia are therefore choice targets for experimental deployments.
Lastly, during the generalisation phase, it is no surprise to see stakeholders deploying massively in the USA (a market which represents 350 million inhabitants, with standardised tax and regulatory environments, despite state and federal differences) or in China (where the Web giants are among the most impressive players, such as: Alibaba, Tencent and Weibo) as well as Russia.
In Europe, cities such as Paris, Barcelona, London, Berlin, etc. are often prime choices for businesses.
What makes it work for them?
As examined above, there are many ingredients for exponentially scalable organizations and business models on the platform model: strong possibilities for employees to self-organise, the User eXperience, continuous experimentation... algorithms (notably intelligent matchmaking), and leveraging one's community.
What about me?

For IT and marketing departments, you can begin your thinking by exploring digital innovations (looking for new uses) that fit in with your business culture (based e.g. on Design Thinking).
In certain domains, this approach can give you access to new markets or to disruption before the competition. A recent example is that of Accor which has entered the market of independent hotels through its acquisition of Fastbooking (Accor gets its hands on Fastbooking).
Still in the area of self-disruption, two main strategies are coming to the fore. The first consists in getting back into the game without shouldering all of the risk, through partnerships or capital investments via incubators. The other strategy, more ambitious and therefore riskier, is to take inspiration from these new approaches to transform the organization from within.
It is then important to examine whether some of these processes can be opened up to transform them into an open platform, thereby leveraging the multitudes.
In the distribution sector for example, the question of positioning and opening up various strategic processes arises: is it a good idea to turn your supply chain into a peer-to-peer platform so that SMEs can become consumers, and not only providers, in the supply chain? Are pharmacies next on the list of programmed uberisations, through stakeholders such as 1001pharmacie.com? In the medical domain, Doctolib.com has just raised €18 million to ensure its development (Doctolib raises funds)...
Associated patterns
Enhancing the user experience
A/B Testing
Feature Flipping
Lean Startup
Sources

• List of unicorns:
> https://www.cbinsights.com/research-unicorn-companies

• Philippe Silberzahn, “Relevez le défi de l’innovation de rupture“, éditions Pearson.

• Article by Maurice Lévy, “Tout le monde a peur de se faire ubériser“:
> http://www.latribune.fr/technos-medias/20141217tribd1e82ceae/tout-le-monde-a-peur-de-se-faire-uberiser-maurice-levy.html

• LendingHome presented on “C’est pas mon idée“:
> http://cestpasmonidee.blogspot.fr/2015/09/lendinghome-part-lassaut-du-credit.html

• Nicolas Colin & Henri Verdier, “L’âge de la multitude“, 2nd edition.

• Ashley Madison hacking:
> http://www.slate.fr/story/104559/ashley-madison-site-rencontres-extraconjugales-hack-adultere

• Erik Brynjolfsson, “The Second Machine Age“.

• Quality of AirBnB photos:
> https://growthhackers.com/growth-studies/airbnb

• “Accor met la main sur Fastbooking“:
> http://www.lesechos.fr/17/04/2015/lesechos.fr/02115417027_accor-met-la-main-sur-fastbooking.htm

• Doctolib raises €18M:
> http://www.zdnet.fr/actualites/doctolib-nouvelle-levee-de-fonds-a-18-millions-d-euros-39826390.htm
Organization
Pizza Teams..................................................................................... 59
Feature Teams ................................................................................ 65
DevOps .......................................................................................... 71
Pizza Teams
Description
What is the right size for a team to develop great software?
Organizational studies have been investigating the issue of team size for several years now. Although answers differ and seem to depend on various criteria such as the nature of tasks to be carried out, the average level, and team diversity, there is consensus on a size of between 5 and 15 members.[1][5] Any fewer than 5 and the team is vulnerable to outside events and lacks creativity. Any more than 12 and communication is less efficient, coherency is lost, there is an increase in free-riding and in power struggles, and the team’s performance drops rapidly the more members there are.
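One way to see why the studies converge on small teams: the number of pairwise communication channels grows quadratically with headcount, a point popularized by Fred Brooks. A quick illustration (illustrative arithmetic only):

```python
# Pairwise communication channels in a team of n people: n * (n - 1) / 2.
# Beyond a dozen members, coordination paths dwarf the headcount itself.

def channels(n):
    return n * (n - 1) // 2

sizes = {n: channels(n) for n in (5, 8, 12, 15)}
# 5 people -> 10 channels; 8 -> 28; 12 -> 66; 15 -> 105
```

Hence the two-pizza heuristic below: a team of about 8 keeps the channel count manageable.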
This is obviously also true in IT. The firm Quantitative Software Management, which specializes in the preservation and analysis of metrics from IT projects, has published some interesting statistics. If you like numbers, I highly recommend their Web site: it is chock-full of information! Based on a sample of 491 projects, QSM measured a loss of productivity and heightened variability as team size increases, with a quite clear break once one reaches 7 people. In correlation, average project duration increases, and development effort skyrockets beyond 15 people.[6]
In a nutshell: if you want speed and quality, cut your team size!
Why are we mentioning such matters in this work devoted to Web Giants? Very simply because they are particularly aware of the importance of team size for project success, and daily deploy techniques to keep size down.
[1] http://knowledge.wharton.upenn.edu/article.cfm?articleid=1501
[2] http://www.projectsatwork.com/article.cfm?ID=227526
[3] http://www.teambuildingportal.com/articles/systems/teamperformance-teamsize
[4] http://math.arizona.edu/~lega/485-585/Group_Dynamics_RV.pdf
[5] http://www.articlesnatch.com/Article/What-Project-Team-Size-Is-Best-/589717
[6] http://www.qsm.com/process_improvement_01.html
In fact the title of this chapter is inspired by the name Amazon gave to this practice:[7] if your team can’t be fed on two pizzas, it is too big: cut people. Admittedly these are American-size pizzas, but that still means about 8 people. Werner Vogels (Amazon VP and CTO) drove the point home with the following quote, which could almost be by Nietzsche:
Small teams are holy.
But Amazon is not alone, far from it.
To illustrate the importance that team dynamics have for Web Giants: Google hired Evan Wittenberg to be manager of Global Leadership Development; the former academic was known, in part, for his work on team size.
The same discipline is applied at Yahoo!, which limits its product teams to between 5 and 10 people in their first year. As for Viadeo, they have adopted the French pizza-size approach, with teams of 5-6 people. In the field of startups, Instagram, Dropbox, Evernote... are known for having kept their development teams as small as possible for as long as possible.
How can I make it work for me?

A small, agile team will always be more efficient than a big, sluggish one: such is the conclusion to be drawn from the accumulated literature on team size.
In the end, you only need to remember it to apply it... and to steer away from linear logic such as: “to go twice as fast, all you need is double the people!“ Nothing could be more wrong!
According to these studies, a team exceeding 15 people should set alarm bells ringing.[8][10]
[7] http://www.fastcompany.com/magazine/85/bezos_4.html
[8] https://speakerdeck.com/u/searls/p/the-mythical-team-month
[9] http://www.3circlepartners.com/news/team-size-matters
[10] http://37signals.com/svn/posts/995-if-youre-working-in-a-big-group-youre-fighting-human-nature
You then have two options:
Fight tooth and nail to prevent the team from growing and, if that fails, adopt the second option;

Split the team up into smaller teams. But think very carefully before you do so, and bear in mind that a team is a group of people motivated by a common goal. Which is the subject of the following chapter, “Feature Teams“.
Feature Teams
Description

In the preceding chapter, we saw that Web Giants pay careful attention to the size of their teams. That is not all they pay attention to concerning teams, however: they also often organize them around functionalities, in what are known as “feature teams“.

A small and versatile team is a key to moving swiftly, and most Web Giants resist multiplying the number of teams devoted to a single product as much as possible.
However, when a product is a hit, a dozen people no longer suffice to scale it up. Even in such a case, team size must remain small to ensure coherence; it is therefore the number of teams which must be increased. This raises the question of how to delimit the perimeter of each.
There are two main options:[1]
Segmenting into “technological“ layers.
Segmenting according to “functionality thread“.
By “functionality thread“ we mean being in a position to deliver independent functionalities from beginning to end, to provide a service to the end user.
In contrast, one can also divide teams along technological layers, with one team per type of technology: typically, the presentation layer, business layer, horizontal foundations, database...
This is generally the organization structure adopted in Information Departments, each group working within its own specialty.
However, whenever Time To Market becomes crucial, organization into technological layers, also known as Component Teams, begins to show its limitations. This is because Time To Market crunches often necessitate Agile or Lean approaches. This means specification, development, and production with the shortest possible cycles, if not on the fly.
[1] There are in truth other possible groupings, e.g. by release, geographic area, user segment or product family. But that would be beyond the scope of the work here; some of the options are dead ends, others can be assimilated to functionality thread divisions.
The trouble with Component Teams is you often find yourself with bottlenecks.

Let us take the example laid out in Figure 1.

Figure 1. Component teams (Front, Back, Exchange and Base) each contributing pieces of functionalities 1 to 5.
The red arrows indicate the first problem. The most important functionalities (functionality 1) are swamping the Front team. The other teams are left producing marginal elements for these functionalities, but nothing can be released until the Front team has finished. There is not much the other teams can do to help (not sharing the same specialty), so they are left twiddling their thumbs or stockpiling less important functionalities (and don’t forget that in Lean, inventory is waste...).
There is worse. Functionality 4 needs all four teams to work together. The trouble is that, in Agile mode, each team individually carries out its own detailed analysis, whereas what is needed here is a detailed impact analysis across the 4 teams. This means that the detailed analysis has to take place upstream, which is precisely what Agile strives to avoid. Similarly, downstream, the work of the 4 teams has to be synchronized for testing, which means waiting for laggards. To limit the impact, task priorities have to be defined for each team in a centralized manner. And little by little, you find yourself with a scheduling department striving to synchronize all the work as best it can, but leaving no room for team autonomy.
In short, you have a waterfall effect upstream in analysis and planning, and a waterfall effect downstream in testing and deployment to production. This type of dynamic is very well described in the work of Craig Larman and Bas Vodde, Scaling Lean & Agile Development.
Feature teams can correct these errors: with each team working on a coherent functional subset - and doing so without having to think about the technology - they are capable of delivering value to the end client at any moment, with little need to call on other teams. This entails having all necessary skills for producing functionalities in each team, which can mean (among others) an architect, an interface specialist, a Web developer, a Java developer, a database expert, and, yes, even someone to run it... because when taken to the extreme, you end up with the DevOps “you build it, you run it“, as described in the next chapter (cf. “DevOps“, p. 71).
But then how do you ensure the technological coherence of the product, if each Java expert in each feature team takes the decisions within their perimeter? This issue is addressed by the principle of community of practice. Peers from each type of specialty get together at regular intervals to exchange on their practices and to agree on technological strategies for the product being produced.
Feature Teams have the added advantage that teams quickly build up business knowledge, which in turn fosters the developers’ commitment to the quality of the final product.
In practice, the method is of course messier than what we have laid out here: defining perimeters is no easy task, team dynamics can be complicated, communities of practice must be fostered... Despite the challenges, this organization method brings true benefits compared to hierarchical structures, and is much more effective and agile.
To come back to our Web Giants, this is the type of organization they tend to favor. Facebook in particular, which communicates a lot about its culture, focuses on teams which bring together all the talents necessary to create a functionality.[2]
[2] http://www.time.com/time/specials/packages/article/0,28804,2036683_2037109_2037111,00.html
It is also the type of structure that Viadeo, Yahoo! and Microsoft[3] have chosen to develop their products.
How can I make it work for me?

Web Giants are not alone in applying the principles of Feature Teams. It is an approach also often adopted by software publishers.
Moreover, Agile is spreading throughout our Information Departments and is starting to be applied to bigger and bigger projects. Once your project reaches a certain size (3-4 teams), Feature Teams are the most effective answer, to the point where some Information Departments naturally turn to that type of pattern.[4]
[3] Michael A. Cusumano and Richard W. Selby. 1997. How Microsoft builds software. Commun. ACM 40, 6 (June 1997), 53-61: http://doi.acm.org/10.1145/255656.255698
[4] http://blog.octo.com/compte-rendu-du-petit-dejeuner-organise-par-octo-et-strator-retour-dexperience-lagilite-a-grande-echelle (French only).
DevOps
Description

The “DevOps“ movement is a call to rethink the divisions common in our organizations, separating development on one hand, i.e. those who write application code (“Devs“), from operations on the other, i.e. those who deploy and run the applications (“Ops“).
Such thinking is certainly as old as IT departments themselves, but it has found renewed life thanks notably to two groups. First there are the agilists, who have removed constraints on the development side and are now capable of providing highly valued software to their clients on a much more frequent basis. Then there are the infrastructure experts and “Prod“ managers at the Web Giants (Amazon, Facebook, LinkedIn...), who have shared their experiences of how they manage the Dev vs. Ops divide.
Beyond the intellectual beauty of the exercise, DevOps is mainly (if not entirely) geared toward reducing the Time To Market (TTM). Obviously, there are other positive effects, but the main priority, all things considered, is this TTM (hardly surprising in the Web industry).
Dev & Ops: differing local concerns but a common goal
Organizational divides notwithstanding, the preoccupations of Development and Operations are indeed distinct and equally laudable:
Figure 1. The “wall of confusion“ between Dev and Ops: local targets and different cultures. Dev seeks to innovate and deliver new functionalities (with quality), in a product culture (software); Ops seeks to rationalize and guarantee that applications run (stability), in a service culture (archiving, supervision, etc.).
Software development seeks heightened responsiveness (notably under pressure from the business and the market): it has to move fast, add new functionalities, reorient work, refactor, upgrade frameworks, deploy for testing across all environments... The very nature of software is to be flexible and adaptable.
In contrast, Operations need stability and standardization.
Stability, because it is often difficult to anticipate the impacts of a given modification to the code, architecture or infrastructure. Switching from a local disk to a storage array can impact response times; a change in code can heavily impact CPU activity, leading to difficulties in capacity planning.
Standardization, because Operations seek to ensure that certain rules (equipment configuration, software versions, network security, log file configuration...) are uniformly followed to ensure the quality of service of the infrastructure.
And yet both groups, Devs and Ops, have a shared objective: to make the system work for the client.
DevOps: capitalizing on Agility
Agility became a buzzword somewhat over ten years ago, its main objective being to reduce constraints in development processes.
The Agile method introduced the notions of “short cycle“, “user feedback“, “Product owner“, i.e. a person in charge of managing the roadmap, setting priorities, etc.
Agility also shook up traditional management structures by including cross-silo teams (developers and operators) and played havoc with administrative departments.
Today, where those barriers have been removed, software development is most often carried out in one- to two-week cycles. The business sees the software evolve during the construction phase.

It is now time to bring people from operations into the following phases:
Provisioning / spinning up environments: in most firms, setting up an environment can take from one to four months (even though environments are now virtualized). This is surprisingly long, especially when the challengers are Amazon or Google.
Deployment: this is without doubt the phase where problems come to a head, as it creates the most instability; agile teams sometimes limit themselves to one deployment per quarter to limit the impacts on production. In order to guarantee system stability, these deployments are often carried out manually; they are therefore lengthy and can introduce errors. In short, they are risky.
Incident resolution and meeting non-functional needs: Production is the other software user. Diagnosis must be fast, the problems and resilience stakes must be explained, and robustness must be taken into account.
DevOps is organized around 3 pillars: infrastructure as code (IaC), continuous delivery, and a culture of cooperation
1. “Infrastructure as Code“ or how to reduce provisioning and environment deployment delays
One of the most visible friction points is in the lack of collaboration between Dev and Ops in deployment phases. Furthermore this is the activity which consumes the most resources: half of production time is thus taken up by deployment and incident management.
Figure 2. Source: Study by Deepak Patil (Microsoft Global Foundation Services) in 2006, via a presentation modified by James Hamilton (Amazon Web Services), http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_POA20090226.pdf
And although it is difficult to establish general rules, it is highly likely that part of this cost (the 31% segment) could be reduced by automating deployment.

There are many reliable tools available today to automate provisioning and deployment of new environments, ranging from setting up Virtual Machines to software deployment and system configuration.

Figure 3. Classification of the main tools (October 2012):

Bootstrapping - VM instantiation / OS installation (installation of the Operating System): OpenStack, VMWare vCloud, OpenNebula.

System Configuration - deploying and installing the services required for application execution (JVM, application servers...) and configuring these services (logs, ports, rights, etc.): Chef, Puppet, CFEngine.

Command and control - Application Service Orchestration: deploying application code to services (war, php source, ruby...), RDBMS deployment: Capistrano, custom scripts (shell, python…).

The CMDB must reflect both the target configuration and the real-world system configuration.
These tools (each with its own language) can be used to code the infrastructure: to install and configure an HTTP service for the applications, to create the directory trees for log files... The services provided and the associated gains are many:

Guaranteeing replicable and reliable processes (no user interaction, thus removing a source of errors), notably through their capacity to manage versions and rollback operations.
Productivity. One-click deployment rather than a set of manual tasks, thus saving time.
Traceability to quickly understand and explain any failures.
Reducing Time To Recovery: in a worst-case scenario, the infrastructure can be recreated from scratch, which is highly useful in terms of recovery. In keeping with ideas stemming from Recovery Oriented Architecture, resilience can be addressed either by attempting to prevent systems from failing, working on the MTBF (Mean Time Between Failures), or by accelerating repairs, working on the MTTR (Mean Time To Recovery). The second approach, although not always possible to implement, is the least costly. It is also useful in organizations where many environments are needed: such environments are often kept permanently available yet little used, because setting them up takes too long.
Automation is furthermore a way of initiating a change in the collaboration culture between Dev and Ops. This is because automation increases the possibilities for self-service for Dev teams, at the very least over the pre-production environments.
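As a minimal sketch of the principle behind such tools (illustrative Python, not the API of Puppet, Chef or any real tool): the desired state of a machine is declared as data, and an idempotent apply step converges the actual state toward it. Version the declaration and you get replicable runs and tractable rollbacks.

```python
# "Infrastructure as Code" in miniature: desired state is data, and an
# idempotent apply() converges the real state toward it. Resource names
# below are illustrative only.

DESIRED = {
    "packages": {"nginx", "openjdk-11"},
    "directories": {"/var/log/myapp"},
    "services": {"nginx": "running"},
}

def apply(desired, actual):
    """Return the actions needed to converge `actual` toward `desired`,
    mutating `actual` as each action is carried out."""
    actions = []
    for pkg in desired["packages"] - actual["packages"]:
        actions.append(f"install {pkg}")
        actual["packages"].add(pkg)
    for d in desired["directories"] - actual["directories"]:
        actions.append(f"mkdir {d}")
        actual["directories"].add(d)
    for svc, state in desired["services"].items():
        if actual["services"].get(svc) != state:
            actions.append(f"set {svc} -> {state}")
            actual["services"][svc] = state
    return actions

actual = {"packages": {"openjdk-11"}, "directories": set(), "services": {}}
first = apply(DESIRED, actual)   # converges: installs, creates, starts
second = apply(DESIRED, actual)  # already converged: nothing to do
```

Running apply a second time yields no actions: this idempotence is what makes automated provisioning safe to re-run, and what the real tools provide at scale.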
2. Continuous Delivery
Traditionally, in our organizations, the split between Dev and Ops comes to a head during deployment phases, when development hands over (or rather tosses over the wall) its code, which then continues on its long way through the production process.
The following quote from Mary and Tom Poppendieck[1] puts the problem in a nutshell:
How long would it take your organization to deploy a change that involves just one single line of code?
The answer is of course not obvious, but in the end it is here that objectives diverge the most. Development seeks control over part of the infrastructure, for rapid, on-demand deployment to all environments. In contrast, production must see to making environments available, rationalizing costs, and allocating resources (bandwidth, CPU...).
[1] Mary and Tom Poppendieck, Implementing Lean Software Development: From Concept to Cash, Addison-Wesley, 2006.
Ironically, the less one deploys, the more the TTR (Time To Repair) increases, thereby reducing the quality of service to the end client.
Figure 4. Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
In other words, the more changes there are between releases (i.e. the higher the number of changes to the code), the lower the capacity to rapidly fix bugs following deployment, thus increasing TTR - this is the instability ever-dreaded by Ops.
Here again, addressing such waste can reduce the time taken up by Incident Management, as shown in Figure 2.
Figure 5. Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
[Graph: “Size of Deploy vs Incident TTR“ - units of changed code per deploy plotted against TTR in minutes for Sev 1 and Sev 2 incidents. Huge changesets deployed rarely mean a high TTR; tiny changesets deployed often mean a low TTR.]
To finish, Figure 5, taken from a Flickr study, shows how TTR (and therefore the severity of incidents) correlates with the amount of code deployed (and therefore the number of changes to the code).
However, continuous deployment is not easy and requires:
Automation of the deployment and provisioning processes: Infrastructure as Code.

Automation of the software construction and deployment processes. Build automation becomes the construction chain which carries the software from source control to the various environments where it will be deployed. A new build system is thus necessary, including environment management, workflow management to compile source code into binaries more quickly, the creation of documentation and release notes to swiftly understand and fix any failures, and the capacity to distribute testing across agents to reduce delays, all while guaranteeing short cycle times.

Taking these factors into account at the architecture level, and above all respecting the following principle: decouple functionality deployment from code deployment, using patterns such as Feature Flipping (cf. Feature Flipping, p. 113) and dark launches. This of course entails a new level of complexity, but offers the flexibility necessary for this type of continuous deployment.

A culture of measurement with user-oriented metrics. This is not only about measuring CPU consumption; it is also about correlating business and application metrics to understand and anticipate system behavior.
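The decoupling principle above can be made concrete with a minimal sketch of feature flipping (illustrative Python; the flag name and in-memory store are hypothetical - real systems read flags from configuration or a datastore): the code for a feature ships to production dormant and is activated later by flipping a flag, without redeploying.

```python
# Feature flipping in miniature: deployed code contains both paths, and a
# flag (not a redeploy) decides which one runs. Names are illustrative.

FLAGS = {"new_checkout": False}  # shipped off by default

def is_enabled(flag):
    return FLAGS.get(flag, False)

def checkout(cart):
    if is_enabled("new_checkout"):
        return f"new checkout flow ({len(cart)} items)"
    return f"legacy checkout flow ({len(cart)} items)"

before = checkout(["book", "pen"])   # flag off: legacy path
FLAGS["new_checkout"] = True         # activation is a flag flip, not a deploy
after = checkout(["book", "pen"])    # same deployed code, new path
```

This is what lets code be deployed often and in small batches while functionality is released on its own schedule (and rolled back instantly by flipping the flag off).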
3. A culture of collaboration if not an organizational model
These two practices, Infrastructure as Code and Continuous Delivery, can be implemented in traditional organizations (Infrastructure as Code on the Ops side, Continuous Delivery on the Dev side). However, even once development and production each reach a good level of maturity and their local optimum, both will remain hampered by the organizational divide.
This is where the third pillar comes into its own: a culture of collaboration, nay cooperation, with all teams becoming more autonomous rather than throwing problems at each other across the production process. This can mean, for example, giving Dev access to machine logs, providing them with the previous day’s production data so that they can set up the integration environments themselves, or opening up the metrics and monitoring tools (or even displaying the metrics in open spaces)... This brings that much more flexibility to Dev and shares responsibility and information on “what happens in Prod“ - tasks with little added value that Ops would no longer have to shoulder.
The main cultural elements around DevOps could be summarized as follows:
Sharing both technical metrics (response times, number of backups...) as well as business metrics (changes in generated profits...)
Ops is also the software client. This can mean making changes to the software architecture and developments to more easily integrate monitoring tools, to have relevant and useful log files, to help diagnosis (and reduce the TTD, Time To Diagnose). To go further, certain Ops needs should be expressed as user stories in the backlog.
A lean approach [http://blog.octo.com/tag/lean/] (French only), and post-mortems which focus on root causes (the 5 whys) and on implementing countermeasures.
It remains, however, that in this model the existing zones of responsibility (development, software monitoring, datacenter operation, support) are somewhat modified.
Traditional firms give the project team priority. In this model, deployment processes, software monitoring and datacenter management are spread out across several organizations.
Figure 6: Project teams - the software production flow passes from Project Teams through Application Management, Technical Management and the Service Desk before reaching Users. (Source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified.)

Inversely, some stakeholders (especially Amazon) have taken this model very far by proposing multidisciplinary teams in charge of ensuring the service functions - from the client’s perspective (cf. Feature Teams, p. 65). You build it, you run it. In other words, each team is responsible for the business, from Dev to Ops.

Figure 7: Product team - You build it, you run it. A single team owns the business and the software production flow, from products/services (build & run) through production (Service Desk, Infrastructure) to Users. (Source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified.)
Moreover it is within this type of organization that the notion of self-service takes on a different and fundamental meaning. One then sees one team managing the software and its use and another team in charge of datacenters. The dividing line is farther “upstream“ than is usual, which allows scaling up and ensuring a balance between agility and cost rationalization (e.g. linked to the datacenter architecture). The AWS Cloud is probably the result of this... It is something else altogether, but imagine an organization with product teams and production teams who would jointly offer services (in the sense of ITIL) such as AWS or Google App Engine...
Conclusion

DevOps is thus nothing more than a set of practices to leverage improvements around:
Tools to industrialize the infrastructure and reassure production as to how the infrastructure is used by development. Self-service is a concept hardwired into the Cloud. Public Cloud offerings are mature on the subject, but some offerings (for example VMWare’s) aim to reproduce the same approach internally. Without necessarily reaching such levels of maturity, one can imagine using tools like Puppet, Chef or CFEngine...
Architecture which makes it possible to decouple deployment cycles, to deploy code without deploying all functionalities… (cf. Feature flipping, p. 113 and Continuous Deployment, p.105).
Organizational methods, leading to implementation of Amazon’s “Pizza teams“ patterns (cf. Pizza Teams, p. 59) and You build it, you run it.
Processes and methodologies to render all these exchanges more fluid. How to deploy more often? How to limit risks when deploying progressively? How to apply the “flow“ lessons from Kanban to production? How to rethink the communication and coordination mechanisms at work along the development/operations divide?
In sum, these four strands make it possible to reach the DevOps goals: improve collaboration, trust and objective alignment between development and operations, giving priority to addressing the stickiest issues, summarized in Figure 8.
Figure 8. The DevOps goals: Infrastructure as Code brings faster provisioning and increased deployment reliability; Continuous Delivery brings an improved TTM and faster incident resolution (MTTR); the culture of collaboration drives continuous improvement. Together they yield improved quality of service and operational efficiency.
Sources

• White paper on the DevOps Revolution:
> http://www.cutter.com/offers/devopsrevolution.html
• Wikipedia article:> http://en.wikipedia.org/wiki/DevOps
• Flickr Presentation at the Velocity 2009 conference:> http://velocityconference.blip.tv/file/2284377/
• Definition of DevOps by Damon Edwards:> http://dev2ops.org/blog/2010/2/22/what-is-devops.html
• Article by John Allspaw on DevOps:
> http://www.kitchensoap.com/2009/12/12/devops-cooperation-doesnt-just-happen-with-deployment/

• Article on the share of deployment activities in Operations:
> http://dev2ops.org/blog/2010/4/7/why-so-many-devopsconversations-focus-on-deployment.html

• USI 2009 (French only):
> http://www.usievents.com/fr/conferences/4-usi-2009/sessions/797-quelques-idees-issues-des-grands-du-web-pour-remettre-en-cause-vos-reflexes-d-architectes#webcast_autoplay
Practices
Lean Startup ................................................................................... 87
Minimum Viable Product ................................................................. 95
Continuous Deployment................................................................ 105
Feature Flipping ............................................................................ 113
Test A/B ........................................................................................ 123
Design Thinking ............................................................................ 129
Device Agnostic ............................................................................ 143
Perpetual beta .............................................................................. 151
Lean Startup
Description

Creating a product is a very perilous undertaking. Figures show that 95% of all products and startups perish for want of clients. Lean Startup is an approach to product creation designed to reduce risks and the impact of failures by tackling organizational, business and technical aspects in parallel, through aggressive iteration. It was formalized by Eric Ries and was strongly inspired by Steve Blank’s Customer Development.
Build – Measure – Learn

All products and functionalities start with a hypothesis. The hypothesis can stem from data collected in the field or from a simple intuition. Whatever the underlying reason, the Lean Startup approach aims to:
Consider all ideas as hypotheses, whether they concern marketing or functionalities,
Validate all hypotheses as quickly as possible in the field.
This last point is at the core of the Lean Startup approach. Each hypothesis - whether it comes from business, systems administration or development - must be validated, for quality as well as metrics. Such an approach makes it possible to implement a learning loop for both the product and the client. Lean Startup refuses the approach which consists of developing a product for over a year only to discover that the choices made (in marketing, functionalities, sales) threaten the entire organization. Testing is of the essence.
Figure 1. The Build – Measure – Learn loop: ideas are built into a product, the product is measured, producing data, and the data feed the learning that yields new ideas.
Experiment to validate

Part of the approach is based on the notion of Minimum Viable Product (MVP) (cf. "Minimum Viable Product", p. 95): what is the minimum with which I can validate my hypotheses?
We are not necessarily speaking here of code and products in their technical senses, but rather of any effort that leads to progress on a hypothesis. Anything can be used to test market appetite: a Google Docs questionnaire, a mailing list or a fake functionality. Experimentation, with the lessons it brings, is an invaluable asset in piloting a product, and justifies the implementation of a learning loop.
The measurement obsession

Obviously, experiments must be systematically monitored through full and reliable metrics (cf. "The obsession with performance measurement", p. 13).
A client-centered approach – Go out of the building
Checking metrics and validating quality very often means "leaving the building", as Bob Dorf, co-author of the famous "4 Steps to the Epiphany", puts it.
“Go out of the building“ (GOOB) is at the heart of the preoccupations of Product Managers who practice Lean Startup. Until a hypothesis has been confronted with reality, it remains a supposition. And therefore presents risks for the organization.
“No plan survives first contact with customers“ (Steve Blank) is thus one of the mottoes of Product teams:
Build only the minimum necessary for validating a hypothesis.
GOOB (from face-to-face interviews to continuous deployment).
Learn.
Build, etc.
This approach also allows constant contact with the client, in other words, constant validation of business hypotheses. Zappos, a giant in online shoe sales in the US, is an example of MVP being put into users’ hands at a very early stage. To confront reality and validate that users would be willing to buy shoes online, the future CEO of Zappos took snapshots of the shoes in local stores, thereby creating the inventory for an e-commerce site from scratch. In doing so, and without building cathedrals, he quickly validated that demand was there and that producing the product would be viable.
Piloting with data
Naturally, to grasp user behavior during GOOB sessions, Product Managers meticulously gather data which will help them make the right decision. They also set up tools and processes to collect such data.
The most widely used tools are well known to all: interviews and analytics solutions.
The Lean Startup method makes ferocious use of these indicators to truly pilot the product strategy. On ChooseYourBoss.com,[1] we postulated that users would prefer to connect through LinkedIn or Viadeo: it would spare users from having to set up an account, and spare us the trouble of developing a login system. We therefore built the minimum needed to validate or invalidate this hypothesis, offering three sign-up options: LinkedIn, Viadeo, or a ChooseYourBoss account. The first two options worked well, while the results for the third indicated that a dedicated ChooseYourBoss account was not worth putting into production: users not wishing to use these networks to sign in represented only 11% of visitors to our site. We will therefore abstain, for the time being, from implementing accounts outside of social networks. We went from "informed by data" to "piloted by data".
Who makes it work for them?

IMVU, Dropbox, Heroku, Votizen and Zappos are a few examples of Web products that managed to integrate user feedback at a very early stage of product design. Dropbox, for example, completely overhauled its way of doing things by drastically simplifying the management of synchronized files. Heroku went from a development platform in the Cloud to a Cloud server solution. Examples abound, each more ingenious than the last.
[1] A site for connecting candidates and recruiters.
What about me?

Lean Startup is not a dogma. Above all, it is about realizing that what the market and the clients want is not to be found in your architecture, marketing plans, sales forecasts or key functionalities.
Once you’ve come to that realization, you will start seeing hypotheses everywhere. It all consists in setting up processes for validating hypotheses, without losing sight of the principle of validating minimum functionalities at any given instant t.
Before writing any code, the main questions to ask revolve around the triad Client / Problem / Solution:
Do I really have a problem that deserves to be resolved?
Is my solution the right one for my client?
Will my client buy it? How much?
Use whatever you can to check your hypotheses: interviews, market studies, prototypes...
The next step is to know whether the model you are testing on a small scale is replicable and expandable. How can you get clients to acquire a product they have never heard of? Will they be in a position to understand, use, and profit from your product?

The third and fourth steps revolve around growth: how do you attract clients, and how do you build a company capable of taking on your product and moving it forward?
Contrary to what one might think after reading this chapter, Lean Startup is not an approach reserved for mainstream websites. Innovation through validating hypotheses as quickly as possible and limiting financial investment is obviously logic which can be transposed to any type of information systems project, even in-house. We are convinced that this approach deserves wider deployment to avoid Titanic-type projects which can swallow colossal sums despite providing very little value for users. For more information, you can also consult the sessions on Lean Startup at USI which present the first two stages (www.usievents.com).
Sources

• Running Lean – Ash Maurya
• 4 Steps to the Epiphany – Steve Blank & Bob Dorf:
> http://www.stevenblank.com/books.html
• Startup Genome Project blog:
> http://blog.startupcompass.co/
• The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses – Eric Ries:
> http://www.amazon.com/The-Lean-Startup-Entrepreneurs-Continuous/dp/0307887898
• The Startup Owner's Manual – Steve Blank & Bob Dorf:
> http://www.amazon.com/The-Startup-Owners-Manual-Step-By-Step/dp/0984999302
Minimum Viable Product
Description

A Minimum Viable Product (MVP) is a strategy for product development. Eric Ries, the creator of Lean Startup, who strongly contributed to the elaboration of this approach, gives the following definition:
The minimum viable product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort.[1]
In sum, it is a way to quickly develop a minimal product prototype to establish whether the need for it is there, to identify possible markets, and to validate business hypotheses on e.g. income generation.
The interest of the approach is obvious: to more quickly design a product that truly meets market needs, by keeping costs down in two ways:
by reducing TTM:[2] faster means less human effort, therefore less outlay - all else being equal,
and by reducing the functional perimeter: less effort spent on functionalities which have not yet proven their worth to the end user.
In the case of startups, funds usually run low. It is therefore best to test your business plan hypotheses as rapidly as possible - and this is where a MVP shows its worth.
The advantages are well illustrated by Eric Ries’s experience at IMVU.com, an online chatting and 3D avatar website: it took them only six months to create their first MVP, whereas in a previous startup experience it took them almost five years to release their first product - which was questionably viable!
[1] http://www.startuplessonslearned.com/2009/08/minimum-viable-product-guide.html
[2] Time To Market.
Today, 6 months is considered a relatively long delay, and MVPs are often deployed in less.
This is because designing an MVP does not necessarily mean producing code or a sophisticated website, quite the contrary. The goal is to get a feel for the market very early on in the project so as to validate your plans for developing your product or service. This is what is known as a Fail Fast approach.
MVPs allow you to quickly validate your client needs hypotheses and therefore to reorient your product or service accordingly, very early on in your design process. This is known as a “pivot“ in the Lean Startup jargon. Or, if your hypotheses are validated by the MVP run, you must then move on to the next step: implementing the functionality you simulated, creating a proper web site, or simply a marketing page.
An MVP is not only useful for launching a new product: the principle is perfectly applicable for adding new functionalities to a product that already exists. The approach can also be more direct: for example you can ask for user feedback on what functionalities people would like (see Figure 1), at the same time gathering information on how they use your product.
MVPs are particularly relevant when you have no or little knowledge of your market and clients, nor any well defined product vision.
Implementation
An MVP can be extremely simple. For example, Nivi Babak states that "The minimum viable product (MVP) is often an ad on Google. Or a PowerPoint slide. Or a dialog box. Or a landing page. You can often build it in a day or a week."[3] The most minimalist approach is called a Smoke Test, in reference to electronic component testing: one checks that a component functions properly before moving on to the next stage
[3] http://venturehacks.com/articles/minimum-viable-product
of testing (stress tests, etc.) and the fact that in case of failure there is often a great deal of smoke!
The most minimal form of a Smoke Test consists of an advertisement in a major search engine, promoting the qualities of the product you hope to develop. Clicking on the ad sends the visitor to a generally static web page with minimal information but, e.g., suggestive links; the goal is to gather click information, indicative of how interested the client is in the proposed service and of their willingness to buy it. The functionalities laid out behind the links do not have to be operational at this stage! The strict minimum is the ad itself, as this is the first step in gathering information.
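The click-gathering side of such a page boils down to tallying which advertised functionality each visitor clicks on. A minimal sketch (the link names are invented for illustration; a real site would rely on an analytics service):

```ruby
# Tally clicks on the (non-functional) links of a smoke-test landing page.
clicks = Hash.new(0)

# Simulated visits: each entry is the link a visitor clicked.
visits = ['buy_now', 'learn_more', 'buy_now', 'pricing', 'buy_now']
visits.each { |link| clicks[link] += 1 }

clicks['buy_now']   # => 3, a rough signal of willingness to buy
```

However crude, a tally like this is already enough to compare the appeal of functionalities that do not yet exist.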
An early version of the website theleanstartup.com, which applies the principles it preaches (the EYODF pattern),[4] proposed at the very bottom of its home page (the MVP of theleanstartup.com) a very simple dialog box for collecting user needs. There were only two fields to fill in - e-mail address and suggestion for a new functionality - along with the invitation: "What would you like to see on future versions of this website?"
Figure 1. Form for collecting user information on the website theleanstartup.com once the fields are filled in.
In terms of tooling, services such as Google Analytics, Xiti, etc. which track all user actions and browsing characteristics on a given website, are indispensable allies. For example, in the case of a new website functionality to be implemented, it is very simple to add a new tab, menu option, advertisement, and to track user actions with this type of tool.
[4] Eat Your Own Dog Food, i.e. be the consumers of your own services.
Risks...
Beware: the MVP can generate ambiguous results, including false negatives. If an MVP is not sufficiently well thought out, or is badly presented, it can trigger a negative reaction on the targeted clients' part and seem to indicate that the planned product is not viable, whereas in fact it is only a question of iterating to perfect the process and better meet client needs. The point is not to stop at the first whiff of failure: a single step can be all it takes to go from non-viable to viable, i.e. to the MVP itself.
Henry Ford put it very aptly: "If I had asked people what they wanted, they would have said faster horses." Having a product vision can be more than just an option.
Who makes it work for them?

Once again we will mention IMVU (see above), one of the pioneers of Lean Startup, where Eric Ries & Co. tested the MVP concept, more particularly in the field of 3D avatar design. Their website, imvu.com, is an online social medium for 3D avatars, chat rooms and gaming, and has the world's largest catalog of virtual goods, most of which are created by the users themselves.
Let us also return to the example of Dropbox, an online file storage service which has seen its growth skyrocket, all based on an MVP: a demonstration video showcasing a product that did not yet exist. Following the posting of the video, a tidal wave of subscribers brought the beta list sign-ups from 5,000 to 75,000 people in one night, confirming that Dropbox's product vision was indeed solid.
How can I make it work for me?

With the prevalence of e-commerce and social media, the web is now at the heart of businesses' economic development strategies. The MVP strategy can be applied as is to a wide range of projects, whether they stem from the IT department or from Marketing - but do not forget that it can also be applied outside the web.
It can even be applied to purely personal projects. In his reference work Running Lean, Ash Maurya gives the example of applying an MVP (and Lean Startup) to the publication of that self-same book.
Auditing information systems is a major part of our work at OCTO, and we are often faced with innovation projects (community platforms, e-services, online shopping...) that struggle to get to production - releases, say, every six months - and where the release, delayed by one or two years, is often a flop, because the value delivered to users does not correspond to market demand... In the interval, millions of euros will have been swallowed up for a project that finally ends up in the waste bin of the web.
An MVP type approach reduces such risks and associated costs. On the web, delays of that length to release a product cannot be sustained, and competition is not only ferocious but also swift!
Within a business information system, it is hard to see how one could carry out Smoke Tests with advertisements. And yet there too one often finds applications and functionalities which took months to develop, without necessarily being adopted by users in the end... The virtue of Lean Startup and the MVP approach is to center attention on the value added for users, and to better understand their true needs.
In such cases, an MVP can serve to prioritize, with end users, the functionalities to be developed in future versions of your application.
Sources

• Eric Ries, Minimum Viable Product: a guide, Lessons Learned, 3 August 2009:
> http://www.startuplessonslearned.com/2009/08/minimum-viable-product-guide.html
• Eric Ries, Minimum Viable Product, StartupLessonsLearned conference:
> http://www.slideshare.net/startuplessonslearned/minimum-viable-product
• Eric Ries, Venture Hacks interview: "What is the minimum viable product?":
> http://www.startuplessonslearned.com/2009/03/minimum-viable-product.html
• Eric Ries, How DropBox Started As A Minimal Viable Product, 19 October 2011:
> http://techcrunch.com/2011/10/19/dropbox-minimal-viable-product
• Wikipedia, Minimum viable product:
> http://en.wikipedia.org/wiki/Minimum_viable_product
• Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February 2009:
> http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day
• Benoît Guillou, Vincent Coste, Lean Start-up, 29 June 2011, Université du S.I. 2011, Paris (French only):
> http://www.universite-du-si.com/fr/conferences/8-paris-usi-2011/sessions/1012-lean-start-up
• Nivi Babak, What is the minimum viable product?, 23 March 2009:
> http://venturehacks.com/articles/minimum-viable-product
• Geoffrey A. Moore, Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers, 1991 (revised 1996), HarperBusiness, ISBN 0066620022
• Silicon Valley Product Group (SVPG), Minimum Viable Product, 24 August 2011:
> http://www.svpg.com/minimum-viable-product
• Thomas Lissajoux, Mathieu Gandin, Fast and Furious Enough: définissez et testez rapidement votre premier MVP en utilisant des pratiques issues de Lean Startup, Paris Web conference, 15 October 2011 (French only):
> http://www.slideshare.net/Mgandin/lean-startup03-slideshare
• Ash Maurya, Running Lean:
> http://www.runningleanhq.com/
Continuous Deployment
Description

In the chapter "Perpetual beta", p. 151, we will see that the Web Giants improve their products continuously. How do they manage to deliver improvements so frequently, while in some IT departments the least change can take several weeks to reach production?
In most cases, they have implemented a continuous deployment process, which can be done in two ways:
Either entirely automatically - modifications to the code are automatically tested and, if validated, deployed to production.
Or semi-automatically: at any time one can deploy the latest stable code to production in one go. This is known as “one-click deployment“.
Obviously, setting up this pattern entails a certain number of prerequisites.
Why deploy continuously?
The primary motivation behind continuous deployment is to shorten the Time To Market, but it is also a means to test hypotheses, to validate them and, ultimately, to improve the product.

Let us imagine a team which deploys to production on the 1st of every month (which is already frequent for many IT departments):
I have an idea on the 1st.
With a little luck, the developers will be able to implement it in the remaining 30 days.
As planned, it is deployed to production in the monthly release plan on the 1st of the following month.
Data are collected over the next month and indicate that the basic idea needs improvement.
But it will be a month before the new improvement can be implemented, which is to say it takes three months to reach a stabilized functionality.
In this example, it is not development that is slowing things down but in fact the delivery process and the release plan.
Thus continuous deployment shortens the Time To Market but is also a way to accelerate product-improvement cycles.
This improves the famous Lean Startup cycle (cf. “Lean Startup“, p. 87):
Figure 1. The accelerated loop: ideas are coded fast, the code produces data that are measured fast, and the data feed learning that is fast as well.
A few definitions
Many people use “Continuous Delivery“ and “Continuous Deployment“ interchangeably. To avoid any errors in interpretation, here is our definition:
With each commit (or at each time interval), the code is:

Compiled, tested, deployed to an integration environment
=> Continuous Integration

Compiled, tested, delivered to the next team (Tests, Qualification, Production, Ops)
=> Continuous Delivery

Compiled, tested, deployed to production
=> Continuous Deployment
The point here is not to say that Continuous Delivery and Continuous Integration are a waste of time. Quite the contrary, they are essential steps: Continuous Deployment is simply the natural extension of Continuous Delivery, itself the natural extension of Continuous Integration.
What about quality?
One frequent objection to Continuous Deployment is the lack of quality and the fear of delivering an imperfect product, of delivering bugs.
Just as with Continuous Integration, Continuous Deployment is only fully useful if you can be sure of your code at all times. This entails a full array of tests (unit, integration, performance, etc.). Beyond the indispensable unit tests, there is a wide range of automated tests such as:
Integration tests (Fitnesse, Greenpepper, etc.)
GUI tests (Selenium, etc.)
Performance tests (Gatling, OpenSTA, etc.)
Test automation can seem costly, but when the goal is to execute them several times a day (IMVU launches 1 million tests per day), return on investment grows rapidly. Some, such as Etsy, do not hesitate to create and share tools to best meet their testing and automation needs.[1]
Furthermore, when you deploy every day, the size of the deployments is obviously much smaller than when you deploy once a month. In addition, the smaller the deployment, the shorter the Time To Repair, as can be seen in Figure 2.
[1] https://github.com/etsy/deployinator
Figure 2 (modified). Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
Etsy illustrates well the trust one can have in one's code and in the ability to repair any errors quickly: they do not even bother planning for rollbacks. "We don't roll back code, we fix it." According to one of their employees, the longest it has ever taken them to fix a critical bug was four minutes.
Big changes lead to big problems, little changes lead to little problems.
Who does things this way?

Many of the Web Giants have successfully implemented Continuous Deployment; here are a few of the most representative figures:
Facebook, very aggressive on test automation, deploys twice a day.
Flickr makes massive use of Feature Flipping (cf. “Feature Flipping“, p. 113) to avoid development branches and deploys over ten times daily. A page displays the details of the last deployment: http://code.flickr.com
Etsy (an e-commerce company) has invested hugely in automated tests and deployment tooling, and deploys more than 25 times a day.
IMVU (an online gaming and 3D avatar site) performs over a million tests a day and deploys approximately 50 times.
What about me?

Start by estimating (or, even better, measuring!) the time it takes you and your team to deliver a single line of code through to production - respecting the standard process, of course.
Setting up Continuous Deployment
Creating a "development build" is the first step towards Continuous Deployment.

To go further, you have to ensure that your tests cover most of the software. While some do not hesitate to build their own test frameworks (Netflix initiated the "Chaos Monkey" project, which shuts down servers at random), ready-made frameworks are also available, such as JUnit, Gatling and Selenium. To reduce testing time, IMVU distributes its tests over no fewer than 30 machines. Others use Cloud services such as AWS to instantiate test environments on the fly and run tests in parallel.
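Distributing a test suite, as IMVU does across 30 machines, is essentially a partitioning problem. A toy sketch in Ruby, with threads standing in for machines (the numbers and names are illustrative):

```ruby
tests = (1..12).to_a   # stand-ins for individual test cases
workers = 3

# Split the suite into one slice per worker and run the slices in parallel;
# each "test" here simply reports its own number.
slice_size = (tests.size.to_f / workers).ceil
results = tests.each_slice(slice_size).map do |slice|
  Thread.new { slice.map { |t| "test #{t} passed" } }
end.flat_map(&:value)

results.size   # => 12: every test ran, ideally in a third of the wall-clock time
```

Real tools add scheduling by historical test duration, but the principle of slicing the suite remains the same.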
Once the development build produces sufficiently tested artifacts, it can be expanded to deliver the artifacts to the teams who will deploy the software across the various environments. At this stage, you are already in Continuous Delivery.
The last team can now enrich the build to include deployment tasks. This obviously entails automating various tasks: configuring the environments, deploying the artifacts which constitute the application, migrating the database schemas, and much more. Be very careful with your deployment scripts! They are code and, like all code, must meet quality standards (use of an SCM, testing, etc.).
Forcing Continuous Deployment
A more radical but highly interesting solution is to force the rhythm of release, making it weekly for example, to stir up change.
Associated patterns

Implementing Continuous Deployment necessarily brings several associated patterns, including:
Zero Downtime Deployment, because while an hour of system shut-down isn’t a problem if you release once a month, it can become one if you release every week or every day.
Feature Flipping (see the next chapter, “Feature Flipping“), because regular releases unavoidably entail delivering unfinished functionalities or errors, you must therefore have a way of deactivating problematic functionalities instantaneously or upstream.
DevOps obviously, because Continuous Deployment is one of its pillars (cf. “DevOps“, p. 71).
Sources

• Chuck Rossi, Ship early and ship twice as often, 3 August 2012:
> https://www.facebook.com/notes/facebook-engineering/ship-early-and-ship-twice-as-often/10150985860363920
• Ross Harmess, Flipping out, Flickr Developer Blog, 2 December 2009:
> http://code.flickr.com/blog/2009/12/02/flipping-out
• Chad Dickerson, How does Etsy manage development and operations?, 4 February 2011:
> http://codeascraft.etsy.com/2011/02/04/how-does-etsy-manage-development-and-operations
• Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February 2009:
> http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day
• Jez Humble, Four Principles of Low-Risk Software Releases, 16 February 2012:
> http://www.informit.com/articles/article.aspx?p=1833567
• Fred Wilson, Continuous Deployment, 12 February 2011:
> http://www.avc.com/a_vc/2011/02/continuous-de
Feature Flipping
Description

The "Feature Flipping" pattern allows you to activate or deactivate functionalities directly in production, without having to release new code.
Several terms are used by Web Giants: Flickr and Etsy use “feature flags“, Facebook “gatekeepers“, Forrst “feature buckets“, Lyris Inc. “feature bits“, while Martin Fowler opted for “feature toggles“.
In short, everyone names and implements the pattern in their own way, and yet all of these techniques strive to reach the same goal. In this chapter we will use the term "feature flipping". Successfully implemented in our enterprise app store Appaloosa,[1] this technique has brought many advantages with just a few drawbacks.
Implementation

It is a very simple mechanism: you simply have to condition the execution of the code for a given functionality in the following way:
if Feature.is_enabled('new_feature')
  # do something new
else
  # do same as before
end
The implementation of the is_enabled function will, e.g., query a configuration file or a database to know whether the functionality is activated or not.
You then need an administration console to configure the state of the various flags on the different environments.
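A minimal in-memory version of this mechanism might look as follows (a sketch: a real store would be backed by the configuration file or database mentioned above, with the enable/disable calls driven from the administration console):

```ruby
class Feature
  @flags = {}

  class << self
    # Called from the administration console to change a flag's state.
    def enable(name)
      @flags[name] = true
    end

    def disable(name)
      @flags[name] = false
    end

    # Called from application code; unknown features default to "off".
    def is_enabled(name)
      @flags.fetch(name, false)
    end
  end
end

Feature.enable('new_feature')
Feature.is_enabled('new_feature')   # => true
Feature.disable('new_feature')
Feature.is_enabled('new_feature')   # => false
```

Defaulting unknown flags to "off" is the safer choice: a typo in a flag name hides a feature instead of exposing it.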
Continuous deployment
One of the first advantages of being able to hot-switch functionalities on or off is being able to continuously deliver the application under development. Indeed, one of the first problems faced by organizations implementing continuous delivery is:
[1] cf. appaloosa-store.com
how can one commit regularly to the source repository while guaranteeing application stability and constant production-readiness? For functionalities which cannot be finished in less than a day, committing only once the functionality is done (after a few days) runs contrary to the best practices of continuous integration: the farther apart your commits, the more complicated and risky the merges, with only limited possibilities for transversal refactoring. Given these constraints, there are two choices: "feature branching" or "feature flipping" - in other words, creating a branch in the configuration management tool, or in the code. Each has its fervent partisans; you can find some of the heated debates at: http://jamesmckay.net/2011/07/why-does-martin-fowler-not-understand-feature-branches
Feature Flipping makes it possible for developers to code inside their "ifs", and thus to commit unfinished, non-functional code, as long as it compiles and the tests pass. Other developers can pull the modifications without difficulty, as long as they do not activate the functionalities under development. The code can likewise be deployed to production since, again, the functionality will not be activated. That is where the interest lies: deploying code to production no longer depends on completing all the functionalities under development. Once a functionality is finished, it can be activated by simply changing the status of its flag on the administration console.

This has an added benefit: the activation of a functionality can be timed to coincide with, e.g., an advertising campaign - a way of avoiding mishaps on the day of the release.
Mastering deployment
One of the major gains brought by this pattern is control over deployment: it allows you to activate a functionality with a simple click, and to deactivate it just as easily, thus avoiding the drawn-out and problem-prone rollback processes needed to bring the system back to its N-1 release.
Thus you can very quickly cancel the activation of a functionality if production tests are inconclusive or user feedback is negative.
Unfortunately, things are not quite that simple: you must be very careful with your data and ensure that the model will work with or without the functionality being activated (see the paragraph “Limits and constraints > major modifications“).
Experiment to improve the product
A natural offshoot of feature flipping is that it enables you to activate or deactivate functionalities for specific sub-populations. You can thus test a functionality on a user group and, depending on their response, activate it for all users or scrap it. In this case the code will look something like this:
if Feature.is_enabled_for('new_feature', current_user)
  # do something new
else
  # do same as before
end
You can then use the mechanism to test a functionality's performance by modifying one variable of its implementation for several sub-populations. The resulting metrics will help you determine which implementation performs best. In other words, feature flipping is an ideal tool for carrying out A/B testing (cf. "A/B Testing", p. 123).
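One simple way to expose a feature to a stable sub-population is to hash each user into a bucket (our sketch; this percentage-based scheme is an assumption, not the book's implementation, and the class name is invented):

```ruby
require 'zlib'

class RolloutFeature
  # Enable the feature for a fixed percentage of users. Hashing the user id
  # (rather than drawing at random) keeps each user in the same bucket on
  # every request, so the experience stays consistent during the test.
  def self.enabled_for?(name, user_id, percentage)
    bucket = Zlib.crc32("#{name}:#{user_id}") % 100
    bucket < percentage
  end
end

RolloutFeature.enabled_for?('new_checkout', 42, 10)  # stable across calls
```

Raising the percentage from 10 to 100 then turns the experiment into a full rollout without touching the application code.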
Provide custom-made products
In some cases, it can be interesting to let clients choose for themselves. Take the example of attachments in Gmail: by default, the interface proposes a number of advanced functionalities (drag and drop, multiple uploads) which the user can deactivate with a simple click in case of dysfunction.
Conversely, you can offer users an "enhanced" mode: the "labs" features (Gmail) are a telling example of feature flipping implementation.
To do so, all you have to do is to propose an interface where users can control the activation/deactivation of certain functionalities (self service).
Managing billable functionalities
Activating paying functionalities with various levels of service can be complicated to implement, and entails conditional code of the following type:
if current_user.current_plan == ‘enterprise’ || current_user.current_plan == ‘advanced’
Let us say that some “special“ firms are paying for the basic plan but you want to give them access to all functionalities.
Or imagine that a given functionality was included in the “advanced“ plan until two months ago, when marketing decided that it should only be included in the “enterprise“ plan... except for clients who subscribed more than two months earlier.
You can use feature flipping to avoid having to manage such exceptions in the code. You just need to condition activation of the features when a client subscribes. When users subscribe to the enterprise plan, the functionalities X, Y and Z are activated. You can then very easily manage exceptions in the administration interface.
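A sketch of this data-driven approach follows; `Account` and `PLAN_FEATURES` are hypothetical names, not taken from any real billing system:

```ruby
# Features are granted to the account when it subscribes, so plan rules
# and marketing exceptions live in data, not in "if plan ==" code.
PLAN_FEATURES = {
  'basic'      => ['x'],
  'advanced'   => ['x', 'y'],
  'enterprise' => ['x', 'y', 'z']
}

class Account
  attr_reader :features

  def initialize(plan)
    # Snapshot the plan's features at subscription time: a later change
    # in marketing policy does not silently revoke what a client has.
    @features = PLAN_FEATURES.fetch(plan).dup
  end

  # Exceptions ("special" firms, grandfathered clients) are granted
  # from the administration interface, not hard-coded.
  def grant(feature)
    @features << feature unless @features.include?(feature)
  end

  def feature?(name)
    @features.include?(name)
  end
end

special = Account.new('basic')   # a "special" firm on the basic plan
special.grant('z')
special.feature?('z')            # => true
```

The application code then only ever asks “does this account have feature z?“, never “which plan is this account on?“.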
Graceful degradation
Some functionalities are more crucial to business than others. When scaling up, it is a good idea to favor certain functionalities over others. Unfortunately, it is difficult to ask your software or server to give priority to anything related to billing over the display of summary graphs... unless the graph display functionality is feature flipped.
We have already mentioned the importance of metrics (cf. “The obsession with performance measurement“, p. 13). Once your metrics are set up, it becomes trivial to flip functions accordingly. For example: “If the average response time for displaying the graph exceeds 10 seconds over a period of 3 minutes, then deactivate the feature“.
This allows you to progressively degrade website features in order to maintain a satisfying experience for the users of the core business functionalities. This is akin to the “circuit breaker“ pattern (described in the book “Release It!“ by Michael Nygard), which makes it possible to short-circuit a functionality if an external service is down.
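Such a metric-driven flip can be sketched as follows; the class name, threshold and window size are illustrative assumptions:

```ruby
# Metric-driven flipping: deactivate a non-critical feature when its
# average response time over a sliding window exceeds a threshold.
class DegradableFeature
  def initialize(threshold_seconds:, window_size:)
    @threshold = threshold_seconds
    @window = window_size
    @samples = []
  end

  def record(response_time_seconds)
    @samples << response_time_seconds
    @samples.shift while @samples.size > @window  # keep a sliding window
  end

  def enabled?
    return true if @samples.empty?
    average = @samples.sum / @samples.size.to_f
    average <= @threshold  # flip the feature OFF when latency degrades
  end
end

graphs = DegradableFeature.new(threshold_seconds: 10, window_size: 3)
graphs.record(2.0)
graphs.enabled?   # => true
graphs.record(14.0)
graphs.record(15.0)
graphs.record(16.0)
graphs.enabled?   # => false: the last 3 samples average 15s, above 10s
```

In a real system the window would be time-based (“over a period of 3 minutes“) and fed by your monitoring pipeline, but the decision logic stays this simple.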
Limits and constraints
As noted above, all you need to implement feature flipping is an “if“. However, like with any development, this can easily become a new source of complexity if you do not take the necessary precautions.
1. One “if“ = two tests

Automated tests are still the best way to check that your software is working as it should. In the case of feature flipping, you will need at least two tests: one with the feature flipped ON (activated) and one with the feature flipped OFF (deactivated).

In development, one often forgets to test the OFF state, even though this is what your clients will see as long as the feature is not ON. Therefore, once more, applying TDD[2] is a good solution: tests written in the initial development phases guarantee that the OFF behavior is covered.
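The rule can be illustrated with a pair of plain assertions; `FLAGS` and `greeting` are hypothetical stand-ins for your toggle store and the flipped code path:

```ruby
# "One if = two tests": every flag needs one test with it OFF and one
# with it ON.
FLAGS = {}

def greeting
  FLAGS['new_greeting'] ? 'Hello from the new feature!' : 'Hello!'
end

# Test 1: flag OFF, clients must still get the old behaviour.
FLAGS['new_greeting'] = false
raise 'OFF case broken' unless greeting == 'Hello!'

# Test 2: flag ON, the new behaviour takes over.
FLAGS['new_greeting'] = true
raise 'ON case broken' unless greeting == 'Hello from the new feature!'
```

In a real test suite these would be two separate test cases, but the point stands: every branch of the flag needs its own specification.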
2. Clean up!
Extensive use of feature flipping can lead to an accumulation of “ifs“, making it more and more difficult to manage the code. Remember that for some functionalities, flipping is only useful for ensuring continuous deployment.
For all functionalities that should never again need to be deactivated (free/optional functionalities which will never be degraded as they are critical from a functional perspective), it is important to delete the “ifs“ to lighten the code and keep it serviceable.
You should therefore set aside some time following deployment to production to “clean up“. Like all code refactoring tasks, it is all the easier the more regularly you do it.
[2] Test Driven Development
3. Major modifications (e.g. changing your relational model)
Some functionalities entail major changes in the code and data model. Let us take the example of a Person table containing an Address field. To meet new needs, you decide to split the table as follows:
To manage cases like this, here is a strategy you can implement:

1. Add the Address table (so that the base contains both the Address column AND the Address table). Nothing changes for the applications: they continue querying the old column.

2. Modify your existing applications so that they use the new table.

3. Migrate the existing data and delete the now unused column.

At this point, the application has usually changed little for the user, but relies on the new data model.

4. You can then start developing new functionalities based on your new data model, using feature flipping.

The strategy is relatively simple, but it entails downtime for the various releases (phases 2 and 4).
Other techniques make it possible to maintain several versions of your data model in parallel, in keeping with the “zero downtime deployment“ pattern: you update your relational schema without impacting the availability of the applications using it, relying on various types of scripts (expansion and contraction scripts), triggers to synchronize the data, or even views that expose the data to the applications through an abstraction layer.
Before:
Person (ID, Last name, First name, Address)

After:
Person (ID, Last name, First name)
Address (ID, Person_ID, Street, Post_Code, Town, Country)
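The “expand“ phase of such a zero-downtime strategy can be sketched with a dual write: while the old column and the new table coexist, the application keeps both up to date. Everything below (in-memory hashes standing in for real tables) is illustrative:

```ruby
# Sketch of the "expand" phase of a zero-downtime migration: while the
# legacy Address column and the new Address table coexist, the
# application writes to both, so either data model stays consistent.
class PersonRepository
  attr_reader :old_rows, :new_addresses

  def initialize
    @old_rows = {}       # legacy model: address embedded in Person
    @new_addresses = {}  # new model: separate Address table
  end

  def save(id, name, street:, town:)
    # Dual write: keep the legacy column up to date...
    @old_rows[id] = { name: name, address: "#{street}, #{town}" }
    # ...and populate the new table, so the cut-over (and the later
    # "contract" phase that drops the column) needs no downtime.
    @new_addresses[id] = { person_id: id, street: street, town: town }
  end
end

repo = PersonRepository.new
repo.save(1, 'Ada', street: '1 Main St', town: 'Paris')
repo.old_rows[1][:address]      # => "1 Main St, Paris"
repo.new_addresses[1][:street]  # => "1 Main St"
```

In practice the dual write is often done by database triggers rather than application code, but the principle is the same: both models stay valid until every reader has moved over.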
Changes to one’s relational model are much less frequent than changes to code, but they are complex, and have to be planned well in advance and managed very carefully. NoSQL (Not Only SQL) databases are much more flexible as concerns the data model, so they can also be an interesting option.
Who makes it work for them?

It works for us, even though we are not (yet!) Web Giants.
In the framework of our Appaloosa project we successfully implemented the various patterns described in this article.
For the Web Giants, their size and the constraints of multi-site deployment and big data migrations leave them no choice but to implement such mechanisms. Among the most famous are Facebook, Flickr and Lyris Inc. Closer to home are Meetic, the Bibliothèque nationale de France and Viadeo, with the latter being particularly insistent on code clean-up, only leaving flippers in production for a few days.
And anyone who practices continuous deployment (cf. “Continuous Deployment“, p. 105) applies, in one way or another, the feature flipping pattern.
How can I make it work for me?

There are various ready-made implementations in different languages, such as the rollout gem in Ruby and the feature flipper in Grails, but the pattern is so easy to implement that we recommend you design your own version, tailored to your specific needs.
There are multiple benefits and possible uses, so if you need to progressively deploy functionalities, or carry out user group tests, or deploy continuously, then get started!
Sources• Flickr Developer Blog:> http://code.flickr.com/blog/2009/12/02/flipping-out
• Summary of the Flickr session at Velocity 2010:> http://theagileadmin.com/2010/06/24/velocity-2010-always-ship-trunk
• Quora Questions on Facebook:> http://www.quora.com/Facebook-Engineering/How-does-Facebooks-Gatekeeper-service-work

• Forrst Engineering Blog:> http://blog.forrst.com/post/782356699/how-we-deploy-new-features-on-forrst

• Lyris Inc. slides on Slideshare:> http://www.slideshare.net/eriksowa/feature-bits-at-devopsdays-2010-us

• Lyris Inc. talk at Devopsdays 2010:> http://www.leanssc.org/files/201004/videos/20100421_Sowa_EnabilingFlowWithinAndAcrossTeams/20100421_Sowa_EnabilingFlowWithinAndAcrossTeams.html

• Lyris Inc. whitepaper:> http://atlanta2010.leanssc.org/wp-content/uploads/2010/04/Lean_SSC_2010_Proceedings.pdf

• Interview with Ryan King from Twitter:> http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king
• Blog post by Martin Fowler:> http://martinfowler.com/bliki/FeatureToggle.html
• Blog 99designs:> http://99designs.com/tech-blog/blog/2012/03/01/feature-flipping
A/B Test
THE WEB GIANTS PRACTICES / A/B TEST
Description

A/B Testing is a product development method used to test a given functionality’s effectiveness. You can, for example, test a marketing campaign sent by e-mail, a home page, an advertising insert or a payment method.
This test strategy allows you to compare several variants of a single object: the subject line of an e-mail, or the contents of a web page. Like any test designed to measure performance, A/B Testing can only be carried out in an environment capable of measuring an action’s success. Take the example of an e-mail subject line: the test must measure how many times the message was opened, to determine which subject was most compelling. For web pages, you look at click-through rates; for payments, conversion rates.
Implementation
The method itself is relatively simple. You have variants of an object which you want to test on various user subsets. Once you have determined the best variant, you open it to all users.
A piece of cake? Not quite.
The first question must be the nature of the variation: where do you set the cursor between micro-optimization and major overhaul? It all depends on where you are on the learning curve. If you are in the client exploration phase (cf. “Minimum Viable Product“, p. 95, “Lean Startup“, p. 87), A/B Testing can completely change the version tested. For example, you can set up two home pages with different marketing messages, layouts and graphics, to see user reactions to both. If you are farther along in your project, where a 1% variation in a conversion goal makes a difference, variations can be more subtle (size, color, placement, etc.).
The second question is your segmentation. How will you define the various sub-sets? There is no magic recipe, but there is a fundamental rule: the segmentation criterion must have no influence on the results of the experiment (A/B Testing means a single variable). You can use a very basic attribute such as subscription date or alphabetical order, as long as it does not affect the results.
The third question is when to stop. How do you know when you have enough responses to generalize the results of the experiment? It all depends on how much traffic you are able to generate, on how complex your experiment is and the difference in performance across your various samplings. In other words, if traffic is low and results are very similar, the test will have to run for longer. The main tools available on the market (Google Website Optimizer, Omniture Test&Target, Optimizely) include methods for determining if your tests are significant. If you manage your tests manually, you should brush up on statistics and sampling principles. There are also websites to calculate significance levels for you.[1]
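The statistic behind most of these significance calculators is a simple two-proportion z-test, sketched below; the conversion figures are made-up sample data:

```ruby
# Two-proportion z-test: is the difference in conversion rate between
# variants A and B larger than random sampling noise would explain?
def ab_z_score(conv_a, n_a, conv_b, n_b)
  p_a = conv_a.to_f / n_a
  p_b = conv_b.to_f / n_b
  p_pool = (conv_a + conv_b).to_f / (n_a + n_b)
  standard_error = Math.sqrt(p_pool * (1 - p_pool) * (1.0 / n_a + 1.0 / n_b))
  (p_b - p_a) / standard_error
end

# Variant A: 200 conversions out of 1,000; variant B: 260 out of 1,000.
z = ab_z_score(200, 1000, 260, 1000)   # z is roughly 3.19
significant = z.abs > 1.96             # => true at the 95% confidence level
```

Notice how the formula captures the intuitions in the text: a small traffic volume (small n) or a small difference between the rates both shrink z, which is why such tests must run longer.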
Let us now turn to two pitfalls to be avoided when you start A/B Testing. First, looking at performance tests from the perspective of a single goal can be misleading. Given that the test changes the user experience, you must also monitor your other business objectives. By changing the homepage of a web site for example, you will naturally monitor your subscription rate, without forgetting to look at payment performance.
The other pitfall is to offer a different experience to a single group over time. The solution you implement must be absolutely consistent for the duration of the experiment: returning users must be presented with the same experimentation version, both for the relevance of your results and the user experience. Once you have established the best solution, you will then obviously deploy it for all.
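One common way to guarantee this consistency, sketched below under the assumption that users have a stable identifier, is to derive the variant deterministically from the user id and the experiment name (all names are illustrative):

```ruby
require 'digest'

# Deterministic variant assignment: hashing the user id together with
# the experiment name always yields the same variant, so a returning
# user sees a consistent experience for the whole test.
def variant_for(experiment, user_id, variants = ['A', 'B'])
  digest = Digest::MD5.hexdigest("#{experiment}:#{user_id}")
  variants[digest.to_i(16) % variants.size]
end

v1 = variant_for('homepage_test', 'user_42')
v2 = variant_for('homepage_test', 'user_42')
v1 == v2   # => true: same user, same variant, visit after visit
```

Because the experiment name is part of the hash input, the same user can fall into different groups for different experiments, which keeps segmentations independent of each other.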
Who makes it work for them?

We cannot fail to cite the pioneer of A/B Testing: Amazon. Web players on the whole tend to share their experiments. On the Internet you will have no trouble finding examples from Google, Microsoft, Netflix, Zynga, Flickr, eBay, and many others, with at times surprising results. The site www.abtests.com lists various experiments.
[1] http://visualwebsiteoptimizer.com/ab-split-significance-calculator
How can I make it work for me?

A/B Testing is above all a right to experiment. Adopting a learning stance, with hypotheses about the results from the outset and a clear modus operandi, is a source of motivation for product teams. Linking the tests to performance is a way to set up data-driven product management.
It is relatively simple to set up A/B Testing (although you do need to maintain a certain hygiene in your practices). Google Website Optimizer, to mention but one tool, hooks up directly to Google Analytics. For a reasonable outlay, you can give your teams the means to objectively measure the impact of their actions on the end-product.
Sources
• 37signals, A/B Testing on the signup page:> http://37signals.com/svn/posts/1525-writing-decisions-headline-tests-on-the-highrise-signup-page
• Tim Ferriss:> http://www.fourhourworkweek.com/blog/2009/08/12/google-website-optimizer-case-study
• Wikipedia:> http://en.wikipedia.org/wiki/A/B_testing
Design Thinking
THE WEB GIANTS CULTURE / DESIGN THINKING
Description
In their daily quest for more connection with users, businesses are beginning to realise that these “users“, “clients“, and other “collaborators“ are first and foremost human beings. Emerging behaviour patterns, spawned by new possibilities opened up by technology, are changing consumer needs and their brand loyalties.
The web giants were among the first to adopt an approach that involves all stakeholders in the creation of a product, and therefore everyone concerned by the user experience of a given service. The way Designers have appropriated their work tools is ideal for qualifying an innovative need. Reconsidering Design has thus become a key issue: it is essential for any Organization that wishes to change and innovate to question its business culture, and to dare to go as far as disruption.
Born in the 1950s and more recently formalised by the agency IDEO,[1] Design Thinking was developed at Stanford University in the USA as well as at the University of Toronto in Canada, before making a significant impact on Silicon Valley, to the extent that it has become an approach assimilated by all major web businesses and startups. It then spread to the rest of the English-speaking world, and then to all of Europe.
“Design thinking is a human-centered approach to innovation that draws from the designer’s toolkit to integrate the needs of people, the possibilities of technology, and the requirements for business success.“

Tim Brown, IDEO
[1] > http://www.wired.com/insights/2014/04/origins-design-thinking/
A new vision of Design
Emergence of a strategic asset
First of all one must reconsider the word Design itself, to understand its deeper, almost etymological, meaning. And therefore recognise that when you speak of Design, it means that you want to give significance to something, whether a product, a service or an Organization.
In fact, Design is whenever you want to “give meaning“ to something. A far cry from the simple representation, aesthetic or merely practical, of a product.
“Great design is not something anybody has traditionally expected from Google“ – TheVerge
Several web giants became aware of the strategic relevance of “operational“ Design before implementing Design Thinking more fully.[2] This is the case for Google which, in 2011,[3] published a strong strategic vision for Design, going beyond a purely “full metrics“ approach (systematic A/B Testing, incremental feedback without direct user input...).
[2] > http://www.forbes.com/sites/darden/2012/05/01/designing-for-growth-apple-does-it-so-can-you/
[3] > http://www.theverge.com/2013/1/24/3904134/google-redesign-how-larry-page-engineered-beautiful-revolution
[Diagram: DESIGN combines MEANING (the why) and CONCEPTION (the how).]
Today, there are even Designers behind the creation of various web giants, such as AirBnB.[4] Some go so far as to consider Design the main asset in their global business strategy (Pinterest, and the various “Design Disruptors“).
The first step to implementing a strategic Design is to create an environment which fosters the expression of different opinions around the role of Design within the company. This is how you avoid conflation between operational, cognitive and strategic aspects.
[4] > http://www.centrodeinnovacionbbva.com/en/news/airbnb-design-thinking-success-story
[Diagram: the four steps of Design maturity:
Step 1: companies that do not use design.
Step 2: companies that use design for styling and appearance.
Step 3: companies that integrate design into the development process.
Step 4: companies that consider design a key strategic element.
Alongside: Emotional Design, Interaction Design and Strategic Design, producing experiences that are Delightful, Usable and Meaningful.]
Designing the experience: a dialog between users and professionals
“Design is human. It’s not about ‘is it pretty,’ but about the connection it creates between a product and our lives.“ – Jenny Arden, Design Manager, AirBnB
Design establishes a strong bond between the user and the designer: the designer offers a service and promises an experience, after which the user qualifies that experience through feedback, negative or positive, which can build loyalty towards the designer. It is this relationship that creates strong business value.
Such engagement can be seen on social networks (LinkedIn, Facebook, Pinterest, Twitter…), and therefore largely among the web giants and, by extension, across all desirable digital services.
It is the Design process which materialises this relationship; the shared history between the brand, its product, or the service behind the product and users.
“When people can build an identity with your product over time, they form a natural loyalty to it.“ Soleio Cuervo, Head of Design, Dropbox
Specialists of this precious relationship have since emerged, in the form of labs or other specialised Organizations (Google Ventures, IBM), working to optimise this new balance.
Design thinking
The working hypothesis
Design thinking entails understanding needs, and makes it possible to create tailored and adequate solutions for any problem that comes up. This means taking an interest in fellow humans in the most open, compassionate way possible.
Innovation appears at the balance point between three factors:
What is viable from a business perspective, in line with the business model.
What is technologically feasible, neither obsolete nor too far ahead of its time.
And, lastly, what is desirable: the human factor and what users take away from the experience.
The specificity of the process lies in its ability to address a problem through unprecedented collaboration between all stakeholders: from the “creators“ (those who drive the business strategy, for example the company) to the “users“ whoever they may be (in-house and external, direct and indirect).
[Diagram: INNOVATION sits at the intersection of Business (viable), Technology (doable) and Humans (desirable), together with Responsibility.]
Methodological approach
The methodological translation of the Design Thinking approach is a series of steps where the goal is to provide structure for innovation by optimising the analytical and intuitive aspects of a problem.
[Diagram, after Rotman: Design Thinking as a 50/50 mix bridging the fundamental predilection gap between 100% reliability and 100% validity.]
The approach unfolds in three main phases:
Inspiration or Discovery: learning to examine a problem or request; understanding and observing people and their habits; getting a feel for emerging wishes and needs.
Ideating or Defining: making sense of the discoveries to arrive at a concept or vision; establishing the business and technology possibilities and prototyping the target innovations as quickly as possible.
Implementing or Delivering: materialising and testing to maximise feedback on the innovation so as to swiftly make adjustments.
More precisely, these phases are often broken down into several steps to anchor the methodology. The number and nature of the steps vary depending on who is implementing them. Below are the 5+1 steps suggested by the Stanford Institute of Design and adopted by IDEO:[5]
Empathy: Begin by understanding the people who will be impacted by your product or service. This has to do with contacts, interviews, relations. It is the choice of rediscovering the demand environment. The mandate is openness, curiosity, and not formalisation.
Definition: It is the formalisation of a concept bearing on all the elements discovered during the first step. It is based on real needs, driven by potential clients rather than the company's context.
Ideation: This is the step where ideas are generated. This phase of optimism encourages all possible ideas emerging from the previously discovered concepts. Exercises and Design workshops can serve to focus on specific aspects and see what intentions are possible. Little by little, ideas are grouped together, refined, completed, and given more specific meaning.
Prototyping: Then comes the moment for materialisation, for moving on to the “how“. Here the problems are represented more concretely, to draw out potential. Speed is of the essence, especially in making mistakes so as to quickly reposition. Simple materials are used such as cardboard, putty...
Testing: It is then time to test the prototype, with potential users, to ensure its feasibility and check that it is a cultural fit for your brand. Sparked interest is proof that the prototype is a solution in tune with a user need.
Lastly, let us add evolution: The results from the preceding phases should be a new starting point for researching the best way to create value around a given need. One thus understands that the implementation of the Design approach does not end once the process has started, because it forces you to systematically evolve what you already have.
[5] > https://dschool.stanford.edu/sandbox/groups/designresources/wiki/36873/attachments/74b3d/ModeGuideBOOTCAMP2010L.pdf?sessionID=c2bb722c7c1ad51462291013c0eeb6c47f33e564
[Diagram: the five steps: Empathize, Define, Ideate, Prototype, Test.]
Some of the steps can be repeated, adjusted, refined, added to. New ideas are born out of tests: following prototyping for example, other types of potential clients can emerge... And this happens in a context of iteration, co-creation, sometimes without any hierarchy, and with a sufficiently optimistic mindset to accept any failures.
Design vs. Tech[6]

Design is currently such a major driver for the web giants that questions arise as to whether technology remains the crucial strategic element.
“Choices are made in the front-end of everything.“ – Scott Belsky, Behance
Indeed, the beneficial effects of Moore’s law are diminishing while, at the same time, users are gaining in maturity, to the point of being increasingly involved in defining the interface that suits them perfectly.
[6] http://www.kpcb.com/blog/design-in-tech-report-2015
[Diagram: Why are Tech Companies Acquiring Design Agencies?
The old way of thinking: the solution to every new problem in tech has been simple: more tech; a better experience was made with a faster CPU or more memory.
The new way of thinking: Moore’s Law no longer cuts it as the key path to a happier customer.
(modified from the Design in Tech presentation, John Maeda, KPCB partner)]
Moreover, the new generations of users no longer consider possibilities driven by technology as innovation breakthroughs but rather as basic expectations (it is normal for technology to open up new possibilities). Thus it is Design which makes the difference in what clients buy and the brands they are loyal to.
Noting this trend, many web giants started buying up companies specialised in Design from 2010 onwards.
[Chart: #DesignInTech M&A activity: the number of designer-co-founded tech companies acquired, from 2005 to the present. Mobile was the inflection point for #DesignInTech. Companies shown include Flickr, Android, YouTube, Vimeo, Fab, LevelMoney, Polar, Ultravisual, WillCall, Beats, Readmill, Simple, Sold, Tumblr, Pulse, Mailbox, Foodspotting, Forrst, Behance, Acrylic Software, Slideshare, Instagram, OMGPOP, Posterous, Gowalla, Hunch, Push Pop Press, Daytum, about.me, Songza and Mint, with two acquisitions marked at $1.0B and $1.65B. (modified from the Design in Tech presentation, John Maeda, KPCB partner)]
How can I make it work for me?

The crucial step is to evolve your company into a Design-centric Organization:
The strategy is to promote full integration of Design Thinking in your company:[7]
Design-centric leaders, who consider Design as a structural cultural edge both within their company and in the expression of their values (Products, services, expert advice, quality of product code...).
Embracing the Design culture: the development of the business culture is systematically informed by values of empathy rather than organic growth; the user experience (UX) is the most important benchmark; and the goal is to provide high-quality client experiences (CX) with true value.
The Design thought process: Design Thinking and its implementation are a given in the company mindset, so teams concentrate on the opportunities in problems rather than on project opportunities. Several implementation vectors can serve to promote this mindset:
The acquisition of talent, i.e. incorporating designers (IBM).
Calling upon consultants for help with issues which go beyond methodology.
Assimilation, by integrating Design studios and coaches.[8]
A structure built around Design. Companies are organised around attracting talent and co-leaders for each position to encourage each other to create initiatives, an integral part of their responsibilities.
Globally speaking, 10% of the 125 richest companies in the USA have top managers or CEOs with a Design background. Alongside the web giants, one notes that the CEO of Nike is a Designer; Apple is the only company to have an SVP for Design.
Adapting will take time, especially as most have yet to realise the relevance of Design. Getting help from Designers or UX specialists familiar with the approach is necessary for sharing these new tools and then putting them into operation.
[7] https://hbr.org/2015/09/design-as-strategy[8] http://www.wired.com/2015/05/consulting-giant-mckinsey-bought-top-design-firm/
[Chart: proportion of companies at each level in 2003 vs 2007: companies that do not use design; companies that use design for styling and appearance; companies that integrate design into the development process; companies that consider design a key strategic element.]
Who makes it work for them?

While the GAFAM, NATU and other web giants have been following this strategy since 2010, today all sectors refer, directly or indirectly, to Design Thinking in their quest for an optimal client experience.[9]
Among concrete examples, we will mention the following:
On the point of disappearing after multiple failures, AirBnB managed to turn themselves around thanks to Design Thinking.[10]
Exploration of aggregated services, proposed and tested by Uber in partnership with Google.[11]
Still at Uber, the Design Thinking approach underlies the entire internal structure of the company.[12]
With the same goal, IBM restructured its Organization through a Design transition.[13]
At Dropbox, Design Thinking is ubiquitous, both in its products and in its internal structure.[14] [15]

More precisely, one can describe strong involvement in Strategic Design as:
Implementation in several stages (from visual Design to strategic Design) at Google, Apple, Facebook, Dropbox, Twitter, Netflix, Salesforce, Amazon…
An overarching Design-centric strategy at Pinterest, AirBnB, Google Ventures, Coursera, Etsy, Uber, and most FinTechs.
[9] > http://blog.invisionapp.com/product-design-documentary-design-disruptors/
[10] > https://www.youtube.com/watch?v=RUEjYswwWPY
[11] > http://www.happinessmakers.com/knowledge/2015/11/29/inside-ubers-design-thinking
[12] > http://talks.ui-patterns.com/videos/applying-design-thinking-at-the-organizational-level-uber-amritha-prasad
[13] > http://www.fastcodesign.com/3028271/ibm-invests-100-million-to-expand-design-business
[14] > https://twitter.com/intercom/status/614537634833137664
[15] > http://designerfund.com/bridge/day-in-the-life-rasmus-andersson/
Associated patterns
• Pattern “Enhancing the user experience“, p. 27
• Pattern “Lean Startup“, p. 87
Sources
• Evolution of Design Thinking: special issue of the Harvard Business Review:
> https://hbr.org/archive-toc/BR1509?cm_sp=Magazine%20Archive-_-Links-_-Previous%20Issues
> http://stanfordbusiness.tumblr.com/post/129579353544/how-design-thinking-can-help-drive-relevancy-in

• The example of AirBnB:
> https://growthhackers.com/growth-studies/airbnb
> https://www.youtube.com/watch?v=RUEjYswwWPY

• Methodology:
> https://www.ideo.com/images/uploads/thoughts/IDEO_HBR_Design_Thinking.pdf
> https://www.rotman.utoronto.ca/Connect/RotmanAdvantage/CreativeMethodology.aspx
> http://www.gv.com/sprint/

• Design Value:
> http://www.dmi.org/default.asp?page=DesignDrivesValue#.VW6gfEycSdQ.twitter
> Design-Driven Innovation: Why it Matters for SME Competitiveness, White Paper, Circa Group

• Design in Tech:
> http://www.kpcb.com/blog/design-in-tech-report-2015
Device Agnostic
THE WEB GIANTS PRACTICES / DEVICE AGNOSTIC
Description

For the Web Giants, user-friendliness is no longer open to debate: it is non-negotiable.
As early as 2003, the Web 2.0 manifesto pleaded in favor of the “Rich User Experience“, and today, anyone working in the world of the Web knows the importance of providing the best possible user interface. It is held to be a crucial factor in winning market shares.
In addition to demanding a high-quality user experience, people want to access their applications anywhere, anytime, in all contexts of their daily lives. A distinction is thus generally made between situations where one is sedentary (e.g. at the office), nomadic (e.g. waiting in an airport terminal) or mobile (e.g. walking down the street).
These situations are currently linked to various types of equipment, or devices. Simply put, one can distinguish between:
Desktop computers for sedentary use.
Laptops and tablets for nomadic use.
Smartphones for mobile use.
The Device Agnostic pattern means doing one’s utmost to offer the best user experience possible whatever the situation and device.
One of the first companies to develop this type of pattern was Apple with its iTunes ecosystem. In fact, Apple first made music accessible on PC/Mac and iPod, then on the iPhone and iPad. Thus they have covered the three use situations. In contrast, Apple does not fully apply the pattern as their music is not accessible on Android or Windows Phone.
To implement this approach, it can be necessary to offer as many interfaces as there are use situations. Indeed, a generic interface of the one-size-fits-all type does not allow for optimal use on computers, tablets, smartphones, etc.
The solution adopted by many of Web Giants is to invest in developing numerous interfaces, applying the pattern API first (cf. “Open API“, p. 235). Here the principle is for the application architecture to be based on a generic API, with the various interfaces then being directly developed by the company, or indirectly through the developer and partner ecosystem based on the API.
To get the most out of each device, it is becoming ever more difficult to use only Web interfaces, because they do not give access to functionalities specific to a given device (push notifications, photo and video capture, accelerometer, etc.). Users also get an impression of lag, because the entire content has to be loaded up front,[1] whereas native applications need to load nothing, or only a few XML or JSON resources.
“I’d love to build one version of our App that could work everywhere. Instead, we develop separate native versions for Windows, Mac, Desktop Web, iOS, Android, BlackBerry, HP WebOS and (coming soon) Windows Phone 7. We do it because the results are better and, frankly, that’s all-important. We could probably save 70% of our development budget by switching to a single, cross-platform client, but we would probably lose 80% of our users.“

Phil Libin, CEO of Evernote (January 2011)
However, things are changing with HTML5, which works in offline mode and provides enough resources for the many applications that do not need GPS or an accelerometer. In sum, Web companies take two approaches: some use only native applications, such as Evernote; others take a hybrid approach, using HTML5 content embedded in a native application which then becomes a simple empty shell, capable only of receiving push notifications. This is notably the case of Gmail, Google+ and Facebook for iPhone. One of the benefits of this approach is enhanced visibility in the App Stores where users go for their applications.
The hybrid pattern is thus a good compromise: companies can reuse the same HTML5 code across a variety of devices and still distribute the application through an app store on iOS, Android, Windows Phone and, soon, Mac and Windows.
[1] This frontloading can be optimized (cf. “Enhancing the user experience“, p. 27) but there are no miracles…
THE WEB GIANTS PRACTICES / DEVICE AGNOSTIC
Who makes it work for them?
There are many examples of the Device Agnostic pattern implemented among the Web Giants. Among others:
In the category of exclusively native applications: Evernote, Twitter, Dropbox, Skype, Amazon, Facebook.
In the category of hybrid applications: Gmail, Google+.
References among Web Giants
Facebook proposes:
A Web interface for PC/Mac: www.facebook.com.
A Web interface for Smartphones: m.facebook.com.
Embedded mobile interfaces for iPad, iPhone, Android, Windows Phone, BlackBerry, PalmOS.
A text message interface to update one’s status and receive notifications of friend updates.
An email interface to update one’s status.
In addition, several embedded interfaces for Mac and PC are offered by third parties such as Seesmic and Twhirl.
Twitter stands out from the other Web Giants in that it is their ecosystem which does the implementing for them (cf. “Open API“, p. 235). Many Twitter graphic interfaces were in fact created by third parties: TweetDeck, Tweetie for Mac and PC, Twitterrific, Twidroid for smartphones... So much so that, for a time, Twitter’s own web interface was considered user-unfriendly, and many preferred the interfaces produced by the ecosystem. Twitter has since been overhauling its interfaces.
In France
The Device Agnostic pattern is found among major media groups. For example, Le Monde offers:
A Web interface for PC/Mac: www.lemonde.fr
A Web interface for Smartphones: mobile.lemonde.fr
Hybrid mobile interfaces for iPhone, Android, Windows Phone, Blackberry, PalmOS, Nokia OVI, Bada
An interface for iPad
It is also found in services with high consultation rates such as banking. For example, the Crédit Mutuel proposes:
A Web interface for PC/Mac: www.creditmutuel.fr
A redirect service for all types of device: m.cmut.fr
A Web interface for Smartphones: mobi.cmut.fr
A Web interface for tablets: mini.cmut.fr
A WAP interface: wap.cmut.fr
A simplified Java interface for low technology phones
Embedded mobile interfaces for iPad, iPhone, Android, Windows Phone, BlackBerry
An interface for iPad.
How can I make it work for me?
The pattern is useful for any B2C service where anywhere, anytime access is important.
If your budget is limited, you can implement the mobile application most used by your target clients, and propose an open API in the hope that others will develop interfaces for the remaining devices.
Associated patterns
The Open API or open ecosystem pattern, p. 235.
The Enhancing the User Experience pattern, p. 27.
Exception!
As mentioned earlier, this pattern is limited only by the budget required for its implementation.
Sources
• Rich User Experiences, Web 2.0 manifesto, Tim O’Reilly:
> http://oreilly.com/Web2/archive/what-is-Web-20.html
• Four Lessons From Evernote’s First Week On The Mac App Store, Phil Libin:
> http://techcrunch.com/2011/01/19/evernote-mac-app-store
Perpetual beta
Description
Before introducing perpetual beta, we must revisit a classic pattern from the open source world:
Release early, release often.
The principle behind this pattern consists in regularly releasing code to the community to get continuous feedback on your product from programmers, testers, and users. The practice is described in Eric Steven Raymond’s 1999 work “The Cathedral and the Bazaar“. It is in keeping with the short-iteration principle of agile methods.
The principle of perpetual beta was introduced in the Web 2.0 manifesto written by Tim O’Reilly where he writes:
Users must be treated as co-developers, in a reflection of open source development practices (...).The open source dictum, ‘release early and release often’, in fact has morphed into an even more radical position, ‘the perpetual beta’, in which the product is developed in the open, with new features slipstreamed in on a monthly, weekly, or even daily basis.
The term “perpetual beta“ refers to the fact that an application is never finalized but is constantly evolving: there are no real releases of new versions. Working this way is obviously in line with the logic of “Continuous Delivery“ (cf. “Continuous Deployment“, p. 105).
This constant evolution is possible because these are online services rather than shipped software:
In the case of software, version management usually follows a roadmap with publication milestones: releases. These releases are spread out over time for two reasons: the time it takes to deploy versions to users, and the need to ensure maintenance and support for each version in the field. Providing
support, security updates and ongoing maintenance for several versions of a single program is a nightmare, and a costly one. Take the example of Microsoft: at one point the Redmond-based publisher had to manage changes to Windows XP, Vista and 7 simultaneously. Imagine three engineering teams all working on the same software: a terrible waste of energy, and a major burden for any company lacking Microsoft’s resources. This syndrome is known as “version perversion“.
In the context of online services, only one version of the application needs to be managed. Furthermore, since the Web Giants themselves deploy and host their applications, users benefit from updates without having to manage any software deployment.
New functionalities appear on the fly, to be “happily“ discovered by users, who thus learn to use new functions progressively. Generally speaking, backward compatibility is well managed (with a few exceptions, such as Gmail’s offline support when Google Gears was abandoned). This model is widely applied by Cloud Computing players.
The “customer-driven roadmap“ is a complementary and virtuous feature of the perpetual beta (cf. “Lean Startup“, p. 87). Since the Web Giants run the production platform, they can measure use of their software in fine detail, and thereby the success of each new functionality. As mentioned previously, the Giants follow metrics very closely. So closely, in fact, that we have devoted a chapter to the subject (cf. “The obsession with performance measurement“, p. 13).
More classically, running the production platform provides opportunities to launch surveys among various target populations to get user feedback.
To apply the perpetual beta pattern, you must have the means to carry out regular deployments. The prerequisites are:
implementing automatic software builds,
practicing Continuous Delivery,
ensuring you can roll back in case of trouble.
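The rollback prerequisite can be sketched with a common release-directory convention: each deploy goes into its own directory, and a “current“ pointer (here a plain file; in practice often a symlink) names the live release, so rolling back is just re-pointing it. The layout is illustrative, not a specific tool's.

```python
# Toy release + rollback scheme: deploys never overwrite each other, so
# reverting is instantaneous and needs no rebuild.
import os
import tempfile

def deploy(root, release, history):
    os.makedirs(os.path.join(root, "releases", release), exist_ok=True)
    history.append(release)
    with open(os.path.join(root, "current"), "w") as f:
        f.write(release)                    # point "current" at the new release

def rollback(root, history):
    if len(history) < 2:
        raise RuntimeError("nothing to roll back to")
    history.pop()                           # discard the broken release
    with open(os.path.join(root, "current"), "w") as f:
        f.write(history[-1])                # re-point at the previous one

def current(root):
    with open(os.path.join(root, "current")) as f:
        return f.read()

if __name__ == "__main__":
    root, history = tempfile.mkdtemp(), []
    deploy(root, "v1", history)
    deploy(root, "v2", history)
    rollback(root, history)                 # v2 misbehaves: back to v1
    print(current(root))                    # v1
```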
There is some controversy around the perpetual beta: some clients equate “beta“ with an unfinished product and believe that services following this pattern are not reliable enough to count on. This has led some service operators to remove the “beta“ label from their site, albeit without changing their practices.
Who makes it work for them?
The reference here is Gmail, which sported the “beta“ label until 2009 (with the tongue-in-cheek “back to beta“ option added later).
It is a practice implemented by many Web Giants: Facebook, Amazon, Twitter, Flickr, Delicious, etc.
A good illustration of perpetual beta is provided by Gmail Labs: these are small, self-contained functionalities which users can choose to activate or not. Depending on the adoption rate, Google then decides whether or not to integrate them into the standard version of the service (cf. “Feature Flipping“, p. 113).
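The Gmail Labs mechanism of opt-in features whose adoption is then measured can be sketched as follows (a toy model; Gmail's actual implementation is not public):

```python
# Feature-flipping sketch: optional features a user can switch on, plus an
# adoption rate the operator reads before promoting a feature to everyone.

class FeatureFlags:
    def __init__(self, features):
        self.opted_in = {f: set() for f in features}   # feature -> users

    def enable(self, user, feature):
        self.opted_in[feature].add(user)

    def is_active(self, user, feature):
        return user in self.opted_in.get(feature, set())

    def adoption_rate(self, feature, total_users):
        return len(self.opted_in[feature]) / total_users

if __name__ == "__main__":
    flags = FeatureFlags(["offline_mode"])
    flags.enable("alice", "offline_mode")
    print(flags.is_active("alice", "offline_mode"))   # True
    print(flags.adoption_rate("offline_mode", 4))     # 0.25
```

A high enough adoption rate would argue for folding the feature into the standard service; a low one for retiring it.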
In France, the following services display, or have displayed, the beta logo on their home page:
urbandive.com: a street-view navigation service by Pages Jaunes,
sen.se: a service for storing and analyzing personal data.
Associated patterns
Pattern “Continuous Deployment“, p. 105.
Pattern “Test A/B“, p. 123.
Pattern “The obsession with performance measurement“, p. 13.
Exception!
Some Web Giants still choose to keep multiple versions up and running simultaneously. Maintaining several versions of an API is particularly relevant, as it saves developers from being forced to update their code every time a new version of the API is released (cf. “Open API“, p. 235).
The Amazon Web Services API is a good example.
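Keeping several API versions live can be sketched as version-prefixed routing: requests carry a version in their path and are dispatched to the matching handler, so old clients keep working after a new version ships. Paths and payloads below are invented for illustration.

```python
# Versioned-API sketch: /v1/ and /v2/ coexist; v1 clients are untouched by
# the v2 contract change.

def handle_v1(params):
    return {"name": params["name"]}                 # original contract

def handle_v2(params):
    # v2 renamed a field and added one; only v2 clients see this shape.
    return {"full_name": params["name"], "since": "v2"}

ROUTES = {"/v1/user": handle_v1, "/v2/user": handle_v2}

def dispatch(path, params):
    handler = ROUTES.get(path)
    if handler is None:
        return {"error": 404}
    return handler(params)
```

Retiring `/v1/` then becomes an explicit, announced decision rather than a side effect of every release.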
Sources
• Tim O’Reilly, What Is Web 2.0?, September 30, 2005:
> http://oreilly.com/pub/a/web2/archive/what-is-web-20.html
• Eric Steven Raymond, The Cathedral and the Bazaar:> http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/
Architecture
Cloud First ............................ 159
Commodity Hardware ..................... 167
Sharding ............................... 179
TP vs. BI: the new NoSQL approach ...... 193
Big Data Architecture .................. 201
Data Science ........................... 211
Design for Failure ..................... 219
The Reactive Revolution ................ 225
Open API ............................... 233
Cloud First
Description
As we saw in the description of the “Build vs. Buy“ pattern (cf. “Build vs. Buy“, p. 19), the Web Giants favor specific developments so as to control their tools from end to end, whereas many companies instead use software packages, considering software tools to be commodities.[1]
Although the Web Giants, like startups, prefer to develop critical applications in-house, they do at times have recourse to third-party commodities. In such cases, they apply the commodity logic to the fullest by choosing to completely outsource the service to the Cloud.
By favoring services in the Cloud, the Web Giants, again like startups, take a very pragmatic stance: profiting speedily from the best innovations of their peers, with an easy-to-use purchase model, so as to focus their efforts on their business strengths. This model can inspire any company wishing to move fast and reduce investment costs to win market share.
Why favor the Cloud for commodities? The table below lays out the advantages.
The Cloud approach can be divided into three main strands:
Using APIs and Mashups: Web Giants massively call upon services developed by Cloud companies (Google Maps, user identification on Facebook, payment with PayPal, statistics with Google Analytics, etc.) and integrate them in their own pages via the mashup principle.
Outsourcing functional commodities: Web majors often externalize their commodities to SaaS services (e.g. Google Apps for collaborating, Salesforce for managing sales personnel, etc.)
Outsourcing technical commodities: Web players also regularly use IaaS and PaaS platforms to host their services (Netflix and Heroku, for example, use Amazon Web Services).
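The mashup principle in the first strand reduces to composing third-party services into one page. In the sketch below, the fetchers are injected stand-ins for real HTTP calls to services such as a maps or profile API; all names are illustrative.

```python
# Mashup sketch: a page is assembled from external services. Injecting the
# fetchers keeps the composition logic independent of any real API.

def build_page(user, fetch_map, fetch_profile):
    profile = fetch_profile(user)
    return {
        "title": f"Welcome {profile['name']}",
        "map": fetch_map(profile["city"]),   # third-party widget, embedded as-is
    }

# Stand-ins for external APIs (a real mashup would call them over HTTP).
def fake_map(city):
    return f"<iframe>map of {city}</iframe>"

def fake_profile(user):
    return {"name": user.title(), "city": "Paris"}
```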
Cost
In-house management: initial outlay for licenses, equipment, staff.
Cloud: pay-per-use; neither investment nor commitment.

Time to Market
In-house management: license purchase, then deployment by the company within a few weeks.
Cloud: self-service subscription, automatically implemented within minutes.

Roadmap / new functionalities
In-house management: designed in the mid term by publishers following feedback from user groups.
Cloud: implemented in the short term depending on what users do with the service.

Rhythm of change
In-house management: often one major release per year.
Cloud: new functionalities on the fly.

Support and updates
In-house management: additional yearly cost.
Cloud: included in the subscription.

Hosting and operating
In-house management: entails building and operating a datacenter by experts.
Cloud: delegated to the Cloud operator.

Physical safety of data
In-house management: data integrity is the responsibility of the company.
Cloud: the major Cloud operators ensure data safety in accordance with the ISO 27001[1] and SSAE 16[2] standards.
[1] ISO 27001: http://en.wikipedia.org/wiki/ISO_27001
[2] SSAE 16 (replacing the SAS 70 Type 2): http://www.ssae-16.com
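The cost line of the table can be made concrete with a back-of-the-envelope break-even calculation between an upfront outlay plus yearly support and pure pay-per-use. All figures are illustrative assumptions, not vendor prices.

```python
# Toy cost comparison: one-off license + yearly support vs pay-as-you-go.
# Every number below is an assumption chosen for illustration.

def in_house_cost(years, outlay=100_000, yearly_support=15_000):
    return outlay + years * yearly_support

def cloud_cost(years, monthly_fee=2_500):
    return years * 12 * monthly_fee

def break_even_year(max_years=20):
    """First year (if any) where cumulative Cloud spend exceeds in-house."""
    for y in range(1, max_years + 1):
        if cloud_cost(y) > in_house_cost(y):
            return y
    return None

# break_even_year() -> 7 under these assumptions: the Cloud wins early on
# (no outlay), while in-house can win over a long horizon.
```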
Housing technical commodities in the Cloud is particularly interesting for Web companies. With the pay-as-you-go model, they can launch online activities with next to no hosting costs. Charges increase progressively as the number of users grows, alongside revenues, so all is well. The Cloud has thus radically changed their launch schedules.
The Amazon Web Services IaaS platform is massively used by Web Giants such as Dropbox, 37signals, Netflix, Heroku...
During the CloudForce 2009 conference in Paris, a Vice-President of Salesforce affirmed that the company did not use an IaaS platform only because such solutions did not exist when the company was created, and that if it were starting today it would certainly choose IaaS.
Who makes it work for them?
The eligibility of the Cloud varies depending both on the type of data you manipulate and on regulatory constraints. Thus:
Banks in Luxembourg are forbidden from storing their data elsewhere than in certified organizations.
Companies working with sensitive data, industrial secrets or patents are reluctant to store them in the Cloud. The Patriot Act[3] in particular pushes companies away from the Cloud: it forces companies registered in the United States to make their databases available upon request by government authorities.
Companies which work with personal data can also be forced to restrict their recourse to the Cloud because of CNIL regulations, compliance with which varies from one Cloud platform to the next (variable implementation of the Safe Harbor Privacy Principles).[4]
[3] http://en.wikipedia.org/wiki/PATRIOT_Act
[4] http://en.wikipedia.org/wiki/International_Safe_Harbor_Privacy_Principle
When there are no such constraints, using the Cloud is possible. And many companies of all sizes and from all sectors have migrated to the Cloud, in the USA as well as in Europe.
Let us describe a case that well illustrates the potential of the Cloud:
In 2011, Vivek Kundra, former CIO at the White House, announced the “Cloud First“ program, which stipulated that all US administrations had to consider the Cloud first and foremost for IT. This decision should be put in context: in the USA there is the “GovCloud“, i.e. Cloud offers suited to administrations, fully respecting their constraints, located on American soil, and isolated from other clients. Such services are offered by Amazon, Google and other providers.[5]
In some companies, it is the mindset that is set against storing data in the Cloud. This reluctance is due to the factors presented above, but also to a lack of confidence (Cloud providers have not yet reached the levels of trust enjoyed by banks) and possibly to an unwillingness to change. The Web Giants are less affected by these last two impediments: they are already well acquainted with the Cloud providers and are open to change.
Cloud addiction?
One should also be careful not to depend too heavily on a single Cloud platform to host critical applications. These platforms are not fail-proof, as recent failures have shown: Microsoft Azure (February 2012), Salesforce (June 2012), Amazon Web Services (April and July 2012). The AWS failures highlighted some customers’ lack of maturity in their use of the Cloud:
Pinterest, Instagram and Heroku, which depended on a single Amazon datacenter, were strongly impacted,
[5] Federal Cloud Computing Strategy, Vivek Kundra, 2011:
http://www.forbes.com/sites/microsoft/2011/02/15/kundra-outlines-cloud-first-policy-for-u-s-government
Netflix used several Amazon datacenters and was thus less affected[6]
(cf. “Design for Failure“, p. 221).
One should note however that such failures create media hype whereas very little is known about the robustness of corporate datacenters. It is therefore difficult to measure the true impact on users.
Here are a few Service Level Agreements that you can compare with those of your company:
Amazon EC2: 99.95% availability per year.
Google Apps: 99.9% availability per year.
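Converting such SLAs into tolerated downtime per year is simple arithmetic, and makes the comparison with your own datacenter's track record concrete:

```python
# Downtime allowed by an availability SLA, over one non-leap year.

HOURS_PER_YEAR = 365 * 24  # 8,760

def max_downtime_hours(availability_percent):
    return (1 - availability_percent / 100) * HOURS_PER_YEAR

print(round(max_downtime_hours(99.95), 2))  # 4.38  (Amazon EC2 figure above)
print(round(max_downtime_hours(99.9), 2))   # 8.76  (Google Apps figure above)
```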
References among Web Giants
A few examples of recourse to the Cloud by Web Giants:
using Amazon Web Services: Heroku, Dropbox, 37Signals, Netflix, Etsy, Foursquare, Voyages SNCF. In fact, Amazon represents 1% of all traffic on the Web;
using Salesforce: Google, LinkedIn;
using Google Apps: Box.net.
In France
A few examples of Cloud use in France:
In industry: Valeo and Treves use Google Apps.
In insurance: Malakoff Méderic uses Google Apps.
[6] Feedback from Netflix on AWS failures:
http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.htm
In the banking sector: most use Salesforce for at least part of their activities.
In the Internet sector: PagesJaunes uses Amazon Web Services.
In the public sector: La Poste uses Google Apps for their mail delivery staff.
How can I make it work for me?
If you are an SME or a VSE, you would probably benefit from externalizing your commodities to the Cloud, for the same reasons as the Web Giants. All the more so as regulatory issues, such as the protection of industrial secrets, should be eased by the emergence of French and European Clouds such as Andromède.
If you are a large company, already well endowed with hardware and IT teams, the benefits of the Cloud can be offset by the cost of change. It can nevertheless be worth studying the question. In any case, you can profit from the Cloud’s agility and pay-as-you-go approach for:
innovative projects: pilot projects, proofs of concept, project incubation, etc.
environments with limited life spans (development, testing, design, etc.).
Related Pattern
Pattern “Build vs. Buy“, p. 19.
Exception!
As stated earlier, regulatory constraints can cut off access to the Cloud.
In some cases, re-internalization is the best solution: when data and user volumes increase spectacularly, it can be cheaper to repatriate applications and build a datacenter on a totally optimized architecture. This type of optimization does, however, typically require highly qualified staff.
Commodity Hardware
Description
Although invisible behind your web browser, millions of servers run day and night to keep the Web available 24/7. There are very few leaks as to numbers, but it is clear that the major Web companies have tens or even hundreds of thousands of machines, as with EC2,[1] and it is even surmised that Google has somewhere around a million.[2] Managing so many machines is not only a technical challenge, it is above all an economic one. Most major players have addressed the problem by using mass-produced equipment, also called “commodity hardware“, the term we will use from here on.
This is one of the reasons which have led the Web Giants to interconnect large numbers of mass-produced machines rather than use a single large system. A single service to a client, a single application, can run on hundreds of machines. Managing hardware this way is known as Warehouse Scale Computing,[3] with hundreds of machines replacing a single server.
Business needs
Web Giants share certain practices, described in various other chapters of this book:[4]
A business model tied to the analysis of massive quantities of data - for example indexing web pages (i.e. approximately 50 billion pages).
One of the most important performance issues is to ensure that query response times stay low.
[1] Source SGI.
[2] Here again it is hard to make estimates.
[3] This concept is laid out in great detail in the very long paper The Data Center as a Computer, we only mention a few of their concepts here. The full text can be found at: http://www.morganclaypool.com/doi/pdfplus/10.2200/S00516ED2V01Y201306CAC024
[4] cf. in particular “Sharding“, p. 179.
Income, e.g. from advertising, is not linked to the number of queries; per-query income is actually very low.[5] Comparatively speaking, the cost per unit using traditional large servers remains too high. The incentive to find the architecture with the lowest cost per transaction is thus very strong.
Lastly, the orders of magnitude of the processing carried out by the Giants are far removed from traditional corporate computing, where until now the number of users was bounded by the number of employees. No machine, however big, is capable of meeting their needs.
In short, these players need scalability at a constant, and low, marginal cost per transaction.
Mass-produced machines vs. high-end servers
When scalability is at issue, there are two main alternatives:
Scale-up, or vertical scaling, consists in using a better-performing machine. This is the alternative most often chosen in the past because it is very simple to implement. Moreover, Moore’s law means that manufacturers regularly offer more powerful machines at constant prices.
Scale-out, or horizontal scaling, consists in pooling the resources of several machines, each of which can be much less powerful. This removes any limit tied to the size of a single machine.
Furthermore, PC components, technologies and architectures show a highly advantageous performance/cost ratio. Their relatively weak processing capacity compared to more efficient architectures such as RISC is compensated for by the lower costs of mass production. A study based on TPC-C[6] results shows that the relative cost per transaction is three times lower with a low-end server than with a top-of-the-line one.
[5] “Early on, there was an emphasis on the dollar per (search) query,“ [Urs] Hoelzle said. “We were forced to focus. Revenue per query is very low.“ http://news.cnet.com/8301-1001_3- 10209580-92.html
[6] Ibid., [3] preceding page.
At the scales implemented by Web Giants - thousands of machines coordinated to execute a single function - other costs become highly prominent: electric power, cooling, space, etc. The cost per transaction must take these various factors into account.
Realizing that has led the Giants to favor horizontal expansion (scale-out) based on commodity hardware.
Who makes it work for them?
Just about all of the Web Giants. Google, Amazon, Facebook, LinkedIn… all currently use x86-type servers and commodity hardware. However, using such components introduces other constraints, and having a “Data Center as a Computer“ entails scaling constraints which differ widely from what most of us think of as datacenters. Let us therefore go into more detail.
Material characteristics which impact programming
Traditional server architecture strives, to the extent allowed by the hardware, to present developers with a “theoretical architecture“: a processor, a central memory containing program and data, and a file system.[7]
Familiar programming, based on variables, function calls, threads and processes, makes this approach necessary.
Large-system architectures come as close to this “theoretical architecture“ as a set of machines in a datacenter is far removed from it.
Machines of the SMP (Symmetric Multi-Processor) type, used for scaling up, now make it possible to use standard programming, with uniform access to the entire memory and all the disks.
[7] This architecture is known as the Von Neumann architecture.
Figure 1 (modified). Source RedPaper 4640, page 34.
As the figures in the diagram show, great efforts are made to ensure that bandwidth and latency are nearly identical between a processor, its memory and its disks, whether they are directly connected, connected to the same processor book,[9] or to different ones. Such NUMA (Non-Uniform Memory Access: accessing nearby memory is faster than accessing memory in a different part of the system) characteristics as remain are confined to the central memory, with latency and bandwidth differences within a 1-to-2 ratio.
[9] A processor book is a compartment which contains processors, memory and in and out connectors, at the first level it is comparable to a main computer board. Major SMP systems are made up of a set of compartments of this sort interconnected through a second board: the midplane.
[Figure 1 depicts an SMP server built from processor books interconnected through a midplane, with I/O drawers and redundant controllers. Orders of magnitude shown: one processor: RAM 256 GB, 100 ns, 76.5 GB/s; one processor book: RAM 1 TB, 133 ns, 46.6 GB/s; the whole server: RAM 8 TB, 39.4 GB/s; disks at every level: 304 TB, 10 ms, up to 50 GB/s.]
Operating systems and middleware like Oracle can take charge of such disparities.
From a scale-out perspective, the program no longer runs on a single large system; instead, a supervising program distributes it over a set of machines. This way of interconnecting commodity-hardware machines gives the developer a vision very different from the “theoretical architecture“.
Figure 2. Source The Data Center As A Computer page 8
L1$: level 1 cache, L2$: level 2 cache.
[Figure 2 depicts a Warehouse Scale Computer: servers (P = processor, with L1$/L2$ caches, local DRAM and disks) grouped into racks behind a rack switch, and racks grouped into a cluster behind a datacenter switch. Orders of magnitude shown: one server: DRAM 16 GB, 100 ns, 20 GB/s; disk 2 TB, 10 ms, 200 MB/s. One local rack (80 servers): DRAM 1 TB, 300 µs, 100 MB/s; disk 160 TB, 11 ms, 100 MB/s. One cluster (30 racks): DRAM 30 TB, 500 µs, 10 MB/s; disk 4.80 PB, 12 ms, 10 MB/s.]
Whenever the network is used to access data on another server, latency increases and bandwidth is divided by a factor of up to 1,000. In addition, it is the network equipment at the head of the datacenter that limits the aggregated bandwidth of all the machines.
In consequence, to optimize access times and throughput within the datacenter, data and processing must be carefully distributed across servers (in particular, data items that are often accessed together should not be spread over several machines). However, operating systems and traditional middleware layers are not designed to work this way, so the distribution must be handled at the application level. This is precisely where sharding[10] strategies come into play.
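A minimal sharding sketch: hash each key to a shard so that a record and all lookups for it stay on one machine, avoiding the network hop the previous paragraph warns about. The in-memory store below is a toy stand-in for real servers.

```python
# Sharding sketch: co-locate all data for one key on one "machine" (here a
# dict per shard) by hashing the key.

def shard_for(key, n_shards):
    # Python's hash() is randomized per process but stable within one run;
    # a production system would use a stable hash such as a digest.
    return hash(key) % n_shards

class ShardedStore:
    def __init__(self, n_shards):
        self.shards = [dict() for _ in range(n_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

With a key such as a user id, every read and write for that user always lands on the same shard, so no query has to cross machines.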
Front-end elements serving Web pages easily accommodate these constraints, given that versioning is not an issue and HTTP requests are easy to distribute over several machines. Other applications, however, must either explicitly manage network exchanges or rely on new, specific middleware layers. Storage solutions of this kind are likewise deployed among the Web Giants using sharding techniques.
Implementing failure resistance
The second significant difference between large systems and Warehouse Scale Computers lies in failure tolerance. For decades, large systems have developed advanced hardware mechanisms to reduce failures as far as possible (RAID, hot-swappable equipment, SAN-level replication, error correction and failover at the memory and I/O levels, etc.). A Warehouse Scale Computer has the opposite characteristics, for two reasons:
Commodity hardware components are less reliable;
the global availability of a system deployed simultaneously across several machines is the product of the availability of each server.[11]
[10] cf. “Sharding“, p. 179.
[11] Thus if each machine has an annual downtime of 9 hours, the availability of 100 servers will be at best 0.999^100 ≈ 0.90, i.e. roughly 36 days of unavailability per year!
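The footnote's arithmetic can be checked directly:

```python
# 9 hours of downtime a year per machine, and the naive system needs all
# 100 machines up at once: availabilities multiply.

HOURS_PER_YEAR = 365 * 24

per_machine = 1 - 9 / HOURS_PER_YEAR      # ~0.99897 availability per server
fleet = per_machine ** 100                # all 100 must be up simultaneously
downtime_days = (1 - fleet) * 365
print(round(downtime_days, 1))            # 35.7 — roughly the 36 days cited
```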
[12] SGI is the result of a merger between Silicon Graphics, Cray and above all of Rackable who had expertise in the field of x86 servers.
[13] http://www.youtube.com/watch?v=Ho1GEyftpmQ
Because of this, the Web Giants consider that the system must be able to function continuously even when some components have failed. Once again, the application layer is responsible for ensuring this failure tolerance (cf. “Design for Failure“, p. 221).
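Application-level failure tolerance often boils down to trying the next replica when one machine is down. A minimal sketch, with replicas modeled as plain callables rather than real remote servers:

```python
# Failover sketch: serve a read from the first replica that answers; only
# give up when every replica has failed.

def read_with_failover(replicas, key):
    last_error = None
    for replica in replicas:
        try:
            return replica(key)
        except ConnectionError as exc:
            last_error = exc          # this machine is down, try the next one
    raise RuntimeError("all replicas failed") from last_error

# Stand-ins for remote machines.
def down(key):
    raise ConnectionError("replica unreachable")

def healthy(key):
    return f"value-for-{key}"
```

Combined with sharded, replicated storage, the service keeps answering as long as at least one replica of each shard survives.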
On what criteria are the machines chosen?
That being said, the machines chosen by the Giants do not always resemble what we think of as PCs, or even the x86 servers of majors such as HP or IBM. Google is certainly the most striking example, as it builds its own machines. Other majors such as Amazon work with more specialized suppliers such as SGI.[12]
The top priority in choosing servers is, of course, the bottom line. Whittling components down to their precise needs, together with the sheer quantity of servers purchased, gives the Web Giants a strong negotiating position. Although verified data is lacking, it is estimated that the cost of a server for them can go as low as $500.
The second priority is electric power consumption. Given the sheer number of servers deployed, power has become a major expense item. Google recently stated that its average consumption was about 260 megawatts, amounting to a bill of approximately $30,000 per hour. The choice of components, as well as the capacity to configure each component’s consumption very precisely, can also yield huge savings.
In sum, even though they contain the same parts you would find in your desktop, these server configurations are a far cry from it. With the exception of a few initiatives such as Facebook's OpenCompute, the finer details are a secret the Giants guard fiercely. The most one can discover is that Google replaced its centralized UPS units with 12V batteries directly connected to the servers.[13]
THE WEB GIANTS
Exception!
There are almost no examples of Web Giants running on any technology other than x86. If we went back in time, we would probably find a "Powered by Sun" logo at Salesforce.[14]
How can I make it work for me?
Downsizing, i.e. replacing central servers with smaller machines, peaked in the 1990s. We are not giving a sales pitch for commodity hardware, even if one does get the feeling that x86 has taken over the business. The choice of commodity hardware goes further, as it transfers the responsibility for scalability and failure resistance to the applications.
For Warehouse Scale Computing, as for the Web Giants, when electricity and investment costs become crucial, it is the only viable solution. For existing software which can run on the resources of a single multiprocessor server, the cost of (re-)developing it as a distributed system and the cost of the hardware must be weighed against each other within the Information System.
The decision to use commodity hardware in your company must be made within the framework of your global architecture: as far as possible, either run what you already have on better-quality machines, or adapt it to migrate (completely) to commodity hardware. In practice, applications designed for distribution, such as Web front ends, will migrate easily. In contrast, highly integrated applications such as software packages necessarily entail specific infrastructure with disk redundancy, which is hardly compatible with a commodity hardware datacenter such as those used by the Web Giants.
[14] > http://techcrunch.com/2008/07/14/salesforce-ditch-remainder-of-sun-hardware
Associated patterns
Distributed computing is essential to using commodity hardware. Patterns such as sharding (cf. "Sharding", p. 179) need to be implemented in the code to be able to migrate to commodity hardware for data storage.
Using a large number of machines also complicates server administration, and patterns such as DevOps need to be adopted (cf. "DevOps", p. 71).
Lastly, the propensity shown by Web Giants to design computers, or rather datacenters, adapted to their needs is obviously linked to their preference for build vs. buy (cf. "Build vs. Buy", p. 19).
Sharding
Description
For any information system, data are an important asset which must be captured, stored and processed reliably and efficiently. While central servers often play the role of data custodian, most Web Giants have adopted a different strategy: sharding, or data distribution.[1]
Sharding describes a set of techniques for distributing data over several machines to ensure architecture scalability.
Business needs
Before detailing implementation, let us say a few words about the needs driving the process. Among Web Giants there are several shared concerns which most are familiar with: storing and analyzing massive quantities of data,[2] strong performance stakes to ensure delays are minimal, scalability[3] and even flexibility needs linked to consultation peaks.[4]
One specificity of the actors facing the issues above deserves emphasis. For Web Giants, revenues are often independent of the quantity of data processed and stem instead from advertising and user subscriptions.[5] They therefore need to keep unit costs per transaction very low. In traditional IT departments, transactions can easily be linked to physical flows (sales, inventory). Such flows make it easy to bill services depending on the number of transactions (conceptually speaking, through a sort of tax). With e-commerce sites, however, browsing the catalog or adding items to a cart does not necessarily generate revenue, because the user can quit the site just before confirming payment.
[1] According to Wikipedia, a database shard is a horizontal partition of data in a database or search server (http://en.wikipedia.org/wiki/Shard_(database_architecture)).
[2] Heightened by Information Systems being opened to the Internet (user behavior analysis, links to social media...).
[3] Scalability is of course tied to a system’s capacity to absorb a bigger load, but more important still is the cost. In other words, a system is scalable if it can handle the additional query without taking more time and if the additional query costs the same amount as the preceding ones (i.e. underlying infrastructure costs must not skyrocket).
[4] Beyond scalability, elasticity is linked to the capacity to have only variable costs unrelated to the load. Which is to say that a system is elastic if, whatever the traffic (10 queries per second or 1000 queries per second), the query price per unit remains the same.
[5] For example, no size limit to e-mail accounts.
THE WEB GIANTS ARCHITECTURE / SHARDING
In sum, the Information Systems of Web Giants must ensure scalability at extremely low marginal costs to uphold their business model.
Sharding to cut costs
To this day, most databases are organized centrally: a single server, possibly with redundancy in active/passive mode for availability. The usual solution for increasing the transaction load is vertical scalability, or scale-up, i.e. buying a more powerful machine (more I/O, more CPUs, more RAM...).

There are limits however to this approach: a single machine, no matter how powerful, cannot alone index the entire Web, for example. Moreover, the all-important question of costs leads to the search for other approaches.
Remember from the last chapter:
A study[6] carried out by engineers at Google shows that as soon as the load exceeds the capacities of a large system, the unit cost for large systems is much higher than with mass-produced machines.[7]
Although calculating per transaction costs is no easy matter and is open to controversy - architecture complexification, network load to be figured into the costs - the majority of Web Giants have opted for commodity hardware.
Sharding is one of the key elements in implementing horizontal scale-up.
[6] The study (http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006) is also summarized in the OCTO blog article: http://blog.octo.com/datacenter-as-a-computer-une-plongee-dans-les-datacenters-des-acteurs-du-cloud.
[7] This is another way of saying "commodity hardware": the machines are not necessarily low-end, but the performance/cost ratio is the highest possible for a given system.
How to shard
In fact there are two ways of partitioning, or sharding, data: vertically or horizontally.

Vertical sharding is the most widely used and consists of isolating concepts in separate databases or tables. For example, deciding to store client lists in one database and their contracts in another.

Horizontal sharding divides the rows of a database table and distributes them across multiple servers. For example, storing clients from A to M on one machine and from N to Z on another. Horizontal sharding is based on a distribution key: the first letter of the name in the example above.[8]

Web Giants have mostly implemented horizontal sharding. Its main advantage is that it is not limited by the number of concepts, as vertical sharding is.
Figure 1: centralized database, vertical partitioning and horizontal partitioning.
[8] In practice, partitioning takes into account the probability of names beginning with a given letter.
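The horizontal split described above can be sketched as a routing function (a toy version; the shard names and clients are hypothetical):

```python
# Minimal sketch of horizontal sharding: records are routed to a shard
# based on a distribution key (here, the first letter of the client name).
SHARDS = {
    "shard-1": [],   # clients A to M
    "shard-2": [],   # clients N to Z
}

def shard_for(name: str) -> str:
    """Pick a shard from the distribution key (first letter of the name)."""
    return "shard-1" if name[0].upper() <= "M" else "shard-2"

for client in ["Ada", "Niels", "Grace", "Zoe"]:
    SHARDS[shard_for(client)].append(client)

print(SHARDS)  # {'shard-1': ['Ada', 'Grace'], 'shard-2': ['Niels', 'Zoe']}
```

Every query must compute the same routing function, which is why the key has to be part of every access path.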
Techniques linked to sharding
Based on their choice of horizontal scale-up, Web Giants have developed specific solutions to meet these challenges, grouped under the acronym NoSQL (Not Only SQL) and sharing the following characteristics:
implementation using mass-produced machines,
data sharding managed at the software level.
While sharding makes it possible to overcome the issues mentioned above, it also entails implementing new techniques.
Managing availability is much more complex. In a centralized system, or one used as such, the system is either available or not, and one simply measures the rate of unavailability. In a sharded system, some data servers can be available while others are not. If the failure of a single server makes the entire system unavailable, the overall availability is the product of the availabilities of each of the data servers, and it drops sharply: if 100 machines are each down 1 day per year, the system would be unavailable nearly 3 months per year.[9] Since a distributed system can remain available despite the failure of one of its data servers, albeit in degraded mode, availability must be measured through two figures: yield, i.e. the proportion of queries answered; and harvest, i.e. the completeness of the responses.[10]
Distribution of the load is usually tailored to data use. A product reference (massively accessed in read mode) won’t raise the same performance issues as a virtual shopping cart (massively accessed in write mode). The replication rate, for example, will be different.
[9] Availability: (364/365)^100 ≈ 76%, i.e. 277 days out of 365, leaving roughly 88 days of unavailability per year.
[10] Thus if, when a server fails, the others ignore the modifications made on that server and then resolve the various modifications once the server reconnects to the cluster, the harvest is smaller: the response is incomplete because it does not integrate the latest changes, but the yield is maintained. The NoSQL solutions developed by the Giants integrate various mechanisms to manage this: data replication over several servers, vector clock algorithms to resolve competing updates when the server reconnects to the cluster. Further details may be found in the following article: http://radlab.cs.berkeley.edu/people/fox/static/pubs/pdf/c18.pdf
Lastly, managing the addition of new servers, and the data repartitioning problems it poses (rebalancing the cluster), are novel issues specific to sharding. Foursquare, for example, was down for 11 hours in October 2010[11] after one of its servers became overloaded, then ran into trouble when connecting the back-up server, which in the end brought down the entire site. Data distribution algorithms such as consistent hashing[12] overcome these problems by limiting the amount of data that must be moved when servers are removed or added.
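The consistent-hashing idea mentioned above can be sketched in a few lines (a toy version with hypothetical server names; production systems also add virtual nodes to smooth the distribution):

```python
import bisect
import hashlib

def h(key: str) -> int:
    """Hash a key onto the ring (stable across runs, unlike built-in hash)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Minimal consistent-hash ring: each key goes to the first node
    clockwise from its hash, so adding or removing one node only moves
    the keys of that node's arc, not the whole dataset."""
    def __init__(self, nodes):
        self.ring = sorted((h(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        hashes = [pos for pos, _ in self.ring]
        i = bisect.bisect(hashes, h(key)) % len(self.ring)
        return self.ring[i][1]

ring = Ring(["server-a", "server-b", "server-c"])
keys = [f"user-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

# Add a fourth server: only the keys on its arc change owner.
bigger = Ring(["server-a", "server-b", "server-c", "server-d"])
moved = sum(before[k] != bigger.node_for(k) for k in keys)
print(f"{moved} of {len(keys)} keys moved")  # roughly a quarter, not all of them
```

With a naive `hash(key) % n_servers` scheme, changing `n_servers` would instead remap almost every key.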
Sharding also means adapting your application architecture:
Queries have to be adapted to take distribution into account so as to avoid inter-shard queries, because the cost of accessing several remote servers is prohibitive. Thus the APIs of such systems limit query possibilities to data within the same shard.

Whether one is using relational or NoSQL databases, models are upended: modeling in such systems is widely limited to key/value, key/document or column-family structures in which the key or row index serves as the basis for partitioning.

Atomicity (the A in ACID) is often restricted so as to avoid atomic updates affecting several shards, and therefore transactions distributed over several machines at high performance cost.
Who makes it work for them?
The implementation of these techniques varies across companies. Some have simply adapted their databases to facilitate sharding. Others have written ad hoc NoSQL solutions by hand. Following the path from SQL to NoSQL, here are a few representative implementations:
[11] For more details on the Foursquare incident: http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/ and the analysis on another blog: http://highscalability.com/blog/2010/10/15/troubles-with-sharding-what-can-we-learn-from-the-foursquare.html
[12] Further details in the following article: http://blog.octo.com/consistent-hashing-ou-l%E2%80%99art-de-distribuer-les-donnees/
Wikipedia
This famous collaborative encyclopedia rests on many distributed MySQL instances and a MemCached memory cache. It is thus an example of sharding implemented with run-of-the-mill components.
Figure 2
The architecture uses master-slave replication to divide the load between reads and writes on the one hand, and partitions the data by wiki and use case on the other. The article text is also offloaded to dedicated instances. The result is MySQL instances holding between 200 and 300 GB of data.
Flickr
The architecture of this photo sharing site is also based on several master and slave MySQL instances (the shards), but arranged here in a replication ring, making it easier to add data servers.
Figure 3
An identifier serves as the partitioning key (usually the photo owner’s ID) which distributes the data over the various servers. When a server fails, entries are redirected to the next server in the loop. Each instance on the loop is also replicated on two slave servers to function in read-only mode if their master server is down.
Facebook

The Facebook architecture is interesting in that it shows the transition from a relational database to an entirely distributed model. Facebook started out using MySQL, a highly efficient open source solution, then implemented a number of extensions to partition the data.
Figure 4
Today, the Facebook architecture has banished all central data storage. Centralized access is managed by the cache (MemCached) or a dedicated service. In this architecture, MySQL serves to feed MemCached with data in key-value form and is no longer queried in SQL. The MySQL replication system, after an extension, is also used to replicate the shards across several datacenters. That being said, its use has very little to do with relational databases: data are accessed only through their key, and there are no joins at this level. Lastly, the structure of the data is taken into account to co-locate data used simultaneously.
Amazon
The Amazon architecture stands out for its more advanced management, in Dynamo, of the loss of one or more datacenters.
Amazon started out in the 1990s with a single Web server and an Oracle database. In 2001 they set up a set of business services with dedicated storage. Alongside databases, two systems use sharding: S3 and Dynamo. S3 is an online storage service for blobs identified by a URL. Dynamo (first used in-house, but recently made available to the public through Amazon Web Services) is a distributed key-value storage system designed to ensure high availability and very fast responses.

In order to enhance availability, several versions of the same dataset can coexist in Dynamo, following the principle of eventual consistency.[13]
Figure 5
[13] There are quorum mechanisms (http://en.wikipedia.org/wiki/Quorum_(distributed_computing)) to arbitrate between availability and consistency.
In read mode, an algorithm such as the vector clock[14] or, as a last resort, the client application will have to resolve any conflicts. There is thus a balance to be found in the degree of replication, to choose the best compromise between resistance to datacenter failure on the one hand and system performance on the other.
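Vector-clock conflict detection can be sketched in a few lines (a minimal version with hypothetical server names, not Dynamo's implementation):

```python
# Minimal vector-clock comparison, as used to detect conflicting
# versions of a value after concurrent writes.
def dominates(a: dict, b: dict) -> bool:
    """True if clock `a` has seen everything `b` has (a >= b component-wise)."""
    return all(a.get(node, 0) >= t for node, t in b.items())

def compare(a, b):
    if dominates(a, b) and dominates(b, a):
        return "equal"
    if dominates(a, b):
        return "a supersedes b"   # b is an ancestor: safe to discard it
    if dominates(b, a):
        return "b supersedes a"
    return "conflict"             # concurrent writes: the client must resolve

v1 = {"server-a": 2, "server-b": 1}
v2 = {"server-a": 2, "server-b": 1, "server-c": 1}
v3 = {"server-a": 3}
print(compare(v1, v2))  # b supersedes a
print(compare(v2, v3))  # conflict
```

When `compare` returns "conflict", neither version descends from the other; this is precisely the case the text says must be resolved at read time.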
LinkedIn

LinkedIn's background is similar to Amazon's: they started in 2003 with a single-database approach, then partitioned for specific businesses, implementing a distributed system similar to Dynamo: Voldemort. But contrary to Dynamo, it is open source. One should also note that indexes and social graphs have always been stored separately at LinkedIn.
Google

Google was the first to publish information on its distributed storage system. Rather than having its roots in databases, it emulates a file system. In the paper[15] on the Google File System (GFS), the authors mention that their choice of commodity hardware was instrumental, given the weaknesses noted in a previous chapter (cf. "Commodity Hardware", p. 167). This distributed file system is used, directly and indirectly, to store Google's data (search index, emails).
Figure 6
Its architecture is based on a centralized metadata server (to guide client applications) and a very large number of data storage systems. The degree of data consistency is lower than that guaranteed by a traditional
[14] The Vector Clock algorithm provides the order in which a given distributed dataset was modified.
[15] http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/fr//papers/gfs-sosp2003.pdf
file system, but this topic alone deserves an entire article. In production, Google uses clusters of several hundred machines, enabling them to store petabytes of data to index.
Exception!
It is however undeniable that a great many sites are grounded in relational database technologies without sharding (or without mentioning it): StackOverflow, Salesforce, Voyages-SNCF, vente-privee.com... It is difficult to draw up an exhaustive list one way or another. We nonetheless believe that sharding has become the standard strategy on data-intensive web sites.
Indeed, the architecture of Salesforce is based on an Oracle database, but it uses the database very differently from the practices of our usual ITs: tables with multiple untyped columns with generic names (col1, col2), a query engine upstream from Oracle to take these specificities into account, etc. Such optimizations show the limits of a purely relational architecture.
In our view, the most striking exception is StackOverflow, whose architecture is based on a single relational SQL Server instance. This site chose an architecture based purely on vertical scalability, with its initial architecture, inspired by Wikipedia, evolving to conform to this strategy. One must also note that the scalability needs of StackOverflow are not necessarily comparable to those of other sites, because its target community (IT engineers) is narrow, and the model favors the quality of contributions over their quantity. Furthermore, choosing a platform under Microsoft license gives them an efficient tool, but one whose costs would certainly become prohibitive in a horizontal scale-up.
How can I make it work for me?
Data distribution is one of the keys that enabled the Web Giants to reach their current size and to provide services that no other architecture is capable of supporting. But make no mistake: it is no easy task. Issues which are easy to resolve in a relational world (joins, data integrity) demand mastering new tools and methods.
Areas which are data-intensive but with limited consistency stakes, as is for example the case with data which can be partitioned, are those where distributed data will be most beneficial.
Offers compatible with Hadoop use these principles and are relevant to BI, particularly for analyzing unstructured data. Concerning transactions, consistency issues are more important, and constraints around access APIs are also a limiting factor, but new offers such as SQLFire by VMware or NuoDB attempt to combine sharding and an SQL interface. These are worth keeping an eye on.
In short, you need to ask yourself which data belong to the same use case (what partitions are possible?) and, for each, what the consequences of loss of data integrity would be. Depending on the answers, you can identify the main architecture features that would enable you, above and beyond sharding, to choose the tool to best meet your needs. More than a magic fix, data partitioning must be considered as a strategy to reach scale-up levels which would be impossible without it.
Associated patterns
Whether you use open source or in-house products depends on your use of data partitioning, as it entails a great deal of fine tuning. The ACID transactional model is also affected by data sharding. The Eventually Consistent pattern offers another vision and solution to meet user needs despite the impacts of sharding; mastering this pattern is very useful for implementing distributed data. Lastly, and most importantly, sharding cannot be dissociated from the commodity hardware choice implemented by the Web Giants.
Sources
• Olivier Mallassi, Datacenter as a Computer : une plongée dans les datacenters des acteurs du cloud, 6 June 2011 (French only):
> http://blog.octo.com/datacenter-as-a-computer-une-plongee-dans-les-datacenters-des-acteurs-du-cloud/
• The size of the World Wide Web (The Internet), daily estimated size of the World Wide Web:
> http://www.worldwidewebsize.com/
• Wikipedia:
> http://en.wikipedia.org/wiki/Shard_(database_architecture)
> http://en.wikipedia.org/wiki/Partition_%28database%29
> http://www.codefutures.com/weblog/database-sharding/2008/06/wikipedias-scalability-architecture.html
• eBay:
> http://www.codefutures.com/weblog/database-sharding/2008/05/database-sharding-at-ebay.html
• Friendster and Flickr:
> http://www.codefutures.com/weblog/database-sharding/2007/09/database-sharding-at-friendster-and.html
• HighScalability:
> http://highscalability.com/
• Amazon:
> http://www.allthingsdistributed.com/
TP vs. BI: the new NoSQL approach
THE WEB GIANTS ARCHITECTURE / TP VS. BI: THE NEW NOSQL APPROACH
Description
In traditional ISs, structured data processing architectures are generally split across two domains. Both of course are grounded in relational databases, but each has its own models and constraints.
On the one hand, Transactional Processing (TP), based on ACID transactions; on the other, Business Intelligence (BI), grounded in fact tables and dimensions.
Web Giants have both developed new tools and come up with new ways of organizing processing to meet these two needs. Distributed storage and processing is widely used in both cases.
Business needs
One recurrent specificity of Web Giants is their need to process data which are only partially structured, or not at all, different from the usual data tables used in management information systems: Web pages for Google, social graphs for Facebook and LinkedIn. A relational model based on two-dimensional tables where one of the dimensions is stable (the number and type of columns) is ill-adapted to this type of need.
Moreover, as we saw in the chapter on sharding (cf. “Sharding“, p. 179), constraints on data volumes and transaction amounts often push Web Giants to partition their data. This overturns the traditional vision of TP where the data are always consistent.
BI solutions, lastly, are usually driven by internal IT decisions. For Web Giants, BI is often the foundation for new services which can be used directly by clients: LinkedIn’s People You May Know, new music releases suggested by sites such as Last.fm,[1] Amazon recommendations, are all services which entail
[1] Hadoop, The Definitive Guide O’Reilly, June, 2009.
manipulating vast quantities of data to provide recommendations to users as quickly as possible.
Who makes it work for them?
The new approach of the Web Giants to TP (Transaction Processing) and BI (Business Intelligence) lies in generic storage and deferred processing whenever possible. The main goal of the underlying storage is simply to absorb huge volumes of queries both redundantly and reliably. We call it "generic" because it is poorer in terms of indexing, data organization and consistency than traditional databases. Processing and analyzing data for queries, as well as consistency management, are offloaded to the software level. The following strategies are implemented.
TP: the ACID constraints limited to what is strictly necessary
The sharding pattern greatly complicates the traditional vision of a single consistent database used for TP. Major players such as Facebook and Amazon have thus adapted their view of transactional data. As stated by the CAP theorem,[2] a given system cannot simultaneously achieve consistency, availability and partition tolerance. First of all, data consistency is no longer permanent but only provided when the user reads the data.
This is known as eventual consistency: it is when the information is read that its integrity is checked, and any differing versions held by the data servers are resolved. Amazon fostered this approach when designing its distributed storage system Dynamo.[3] On a set of N machines, the data are replicated on W of them, along with version stamping. For queries, N-W+1 machines are searched, thereby ensuring that the user gets the latest version.[4] The e-commerce giant chose to reduce data consistency in favor of gains in the availability of its distributed system.
[2] http://en.wikipedia.org/wiki/CAP_theorem
[3] http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
[4] In this way one is always certain of reading the data on at least one of the W machines where the freshest data have been written. For further information, see http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
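The quorum arithmetic described here (write to W of N replicas, read N-W+1) can be checked exhaustively for small N (a sketch with toy values, not Dynamo's actual code):

```python
import itertools

# If a value is written to W of N replicas, any read of N - W + 1 replicas
# must overlap the write set in at least one replica, whichever replicas
# the writer and the reader happened to pick.
N, W = 5, 3
replicas = set(range(N))
R = N - W + 1                                  # read-set size: 3

for written in itertools.combinations(replicas, W):
    for read in itertools.combinations(replicas, R):
        assert set(written) & set(read), "a read missed every fresh replica"

print("every possible read set overlaps every possible write set")
```

The guarantee is purely pigeonhole arithmetic: W + (N - W + 1) > N, so the two sets cannot be disjoint.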
Furthermore, to meet their performance goals, data freshness criteria are no longer uniform, but categorized. Facebook and LinkedIn guarantee real-time freshness for users' own updates: their modifications must be immediately visible to them to ensure user trust in the system. In contrast, global consistency is relaxed: when users sign up for a Facebook group, for example, they immediately see the information appear, but other group members may experience some delay in being notified.[5]
At LinkedIn, services are also categorized. For non-critical services such as retweets, the information is propagated asynchronously,[6] whereas users' modifications of their own data are propagated immediately so as to be instantly visible to them.
Asynchronous processing is what makes it possible for Web Giants to best manage the heavy traffic loads they face. In sum, to guarantee performance and availability, Web Giants tailor their storage systems so that data consistency depends on usage. The goal is not to be consistent at all times, but rather to provide eventual consistency.
BI: the indexation mechanism behind all searches
To provide information on vast quantities of data, Web Giants also tend to pre-calculate indexes, that is, data structures specifically designed to answer user questions. To better understand this point, let us look at the indexes Google has designed for its search engine. Google is foremost in the arena due to the volume of what it indexes: the entire Web.
[5] http://www.infoq.com/presentations/Facebook-Software-Stack
[6] Interview with Yassine Hinnach, Architect at LinkedIn.
At the implementation level, Google uses sharding to store raw data (BigTable column database grounded in the distributed Google File System).[7]
Indexes based on keywords are then produced asynchronously, and are used to answer user queries. The raw data are analyzed with a distributed algorithm, based on the programming model MapReduce.
The process can be divided into two main phases: map, which processes each piece of data identically and in parallel; and reduce, which aggregates the various results into a single final result. The map phase is easily distributed by assigning each machine a different portion of the data to process, as can be seen in Figure 1.
Figure 1
[7] cf. “Sharding“, p. 179.
This technique is highly scalable[8] and makes it possible, for example, for a web crawler to process all the pages visited, establish for each the list of outgoing links, and then aggregate them during the reduce phase to obtain a list of the most-referenced pages. Google has implemented a sequence of MapReduce tasks to generate the indexes for its search engine.[9]
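The crawl example above can be sketched with the MapReduce model (a toy, in-process version; the page names and the corpus are hypothetical):

```python
from collections import Counter
from functools import reduce

# Toy corpus: each page maps to its list of outgoing links.
pages = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html", "b.html"],
}

def map_phase(page_links):
    # One partial count per page; each map call is independent of the
    # others, which is what makes the phase trivially parallelizable.
    return [Counter(links) for links in page_links.values()]

def reduce_phase(partials):
    # Merge the partial counts into one final reference count.
    return reduce(lambda acc, c: acc + c, partials, Counter())

counts = reduce_phase(map_phase(pages))
print(counts)  # Counter({'b.html': 2, 'c.html': 2, 'a.html': 1})
```

In a real cluster, each map call would run on the machine holding its slice of the crawl, and the reduce phase would merge the partial counters shipped over the network.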
This allows them to process huge quantities of data in batch mode. The technique has been widely copied, notably through the Apache Foundation open source project Hadoop.[10]

Hadoop provides both a distributed file system and a framework implementing the MapReduce programming model, directly inspired by Google's research paper. It was then adopted by Yahoo! for indexing, by LinkedIn to prepare its email campaigns, and by Facebook to analyze the various logs generated by their servers... Many firms, including several other Web Giants (eBay, Twitter), use it.[11]
In 2010, Google set up a new indexing process based on event mechanisms.[12] Updates do not happen in real time as with database triggers, but latency (the time between a page's publication and its availability for search) is greatly reduced compared to a batch system based on the MapReduce programming model.
Exception!
All of these examples share a commonality: they target a fairly specific set of needs. Many key Web players also use relational databases for other applications. The "one size fits all" approach of relational databases makes them easier to use but also more limited, notably in terms of scalability. The processes and distributed storage systems described above are only implemented for these players' most heavily used services.
[8] Or scalable, i.e. capable of processing more data if the system is enlarged.
[9] http://research.google.com/archive/mapreduce.html
[10] http://hadoop.apache.org
[11] http://wiki.apache.org/hadoop/PoweredBy
[12] Google Percolator: http://research.google.com/pubs/pub36726.html
How can I make it work for me?
It is certainly in indexing solutions and BI on Big Data that the market is most mature. Around Hadoop, a reliable open source implementation, a large number of support offerings, related tools, re-implementations and commercial repackagings have been developed, based on the same APIs.
Projects based on indexing large quantities of data, or data which are semi-structured or unstructured, are the primary candidates for this type of approach. The main advantage is that raw data can be preserved thanks to much lower storage costs: information is no longer lost through over-hasty aggregation.
In this way the data analysis algorithms producing indexes or reports can also be more easily adjusted over time since they are constantly processing all available data rather than pre-filtered subsets. A switch from relational databases in TP will probably take more time. Various distributed solutions inspired by Web Giants’ technologies have come out under the label NoSQL (Cassandra, Redis).
Other distributed solutions, closer to the crossroads of relational databases and data grids in terms of consistency and APIs, have come out under the name NewSQL (SQLFire, VoltDB). Architectural patterns such as Event Sourcing and CQRS[13] can also help bridge the gap between the two worlds. Their contribution is to make it possible to model transactional data as a flow of events which are both uncorrelated and semi-structured. Building a comprehensive and consistent vision of the data comes afterwards, when the data are disseminated. The Web Giants' models cannot be directly transposed to meet the general TP needs of businesses, and there are many other approaches on the market to overcome traditional database limits.
Associated patterns

This pattern is mainly linked to the Sharding pattern (cf. “Sharding“, p. 179) because, through distributed algorithms, it makes it possible to work on this new type of storage. One should also note here the influence of the Build vs. Buy pattern (cf. “Build vs. Buy“, p. 19), which has led the Web Giants to adopt highly specialized tools to meet their needs.
[13] Command Query Responsibility Segregation.
Big Data Architecture
To better meet their users' needs, the Web Giants do everything they can to reduce their Time to Market. Data in all forms are key to this strategy. They not only serve for technical analyses, but are also business drivers. They are what make it possible to personalise the user experience, more and more often in real time, and above all inform decision making. The Web giants have long understood the importance of data and use them unabashedly. At Google for example, all ideas must come with metrics, all arguments must be based on data, or you will not be heard in the meeting.[1]
Everyone speaks of Big Data, but the Web Giants were the first players involved, or at the very least closely associated. Behind the buzzword lie new challenges, including an especially complicated one: how do you store and process the exponential volume of data generated? There are more connected objects than humans on the planet, and Cisco forecasts over 50 billion sensors by 2020.[2] How do you use all that information?
Time to Action

As shown in the preceding chapter, NoSQL architectures can process and query ever larger amounts of data.
Big Data is usually described by 3 main characteristics, often called the 3Vs:[3]
Volume, the capacity to process terabytes, petabytes, and even exabytes of extracted data
Variety, the capacity to process all data formats, whether structured or not
Velocity, the capacity to process events in real time, or at least as quickly as possible
With architectures of the NoSQL/NewSQL type, as described previously, only the components Variety and Volume were highlighted. Let us now look at how the Web Giants also embrace the third component: Velocity.
[1] http://googlesystem.blogspot.com.au/2005/12/google-ten-golden-rules.html
[2] https://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf
[3] https://en.wikipedia.org/wiki/Big_data
THE WEB GIANTS ARCHITECTURE / BIG DATA ARCHITECTURE
Making data available

We will talk here about double-headed architectures capable of storing and querying data in all forms, processed in batches or in real time. But before broaching this complex subject, let us first take a look at the characteristics and Big Data architecture patterns the Web Giants implement.
A data lake for data
In an information system, the data are distributed over dozens, or even hundreds, of components. They are spread across various sources, some on site, others held by third-party vendors or locked in proprietary software. Having the data is not enough; they must also be instantly accessible. If you do not have the data at hand, it is unlikely you will think of playing around with them. Isolated data is underexploited data: the Allen curve[5] also applies to data!
That is why the Web giants centralise their data in a scalable system where they can be easily queried without any presumptions about how they will be used. Perhaps most of them will not even be used, but that does not matter: the important thing is to have them nearby just in case a new idea emerges.
This type of system, usually based on the Hadoop framework, is commonly called a “data lake“.[6A] It is a distributed storage and processing platform capable of handling ever increasing amounts of data, whatever their nature. On paper, it can be scaled to infinity,[7] both in storage and in processing, and can manage numerous concurrent jobs and tasks linearly thanks to the size of the infrastructure.
An aside

Some also speak of 4Vs or even 5Vs,[4] adding components to the 3Vs mentioned above, such as:
Veracity, the capacity to manage inconsistencies and ambiguities
Value, the capacity to apply differential processing to data depending on the value attributed to them
The latter is without doubt the most debatable, since the main benefit of this type of architecture is that there are no presuppositions as to how the data will be analysed, and therefore no pre-established values.
[4] https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know
[5] https://en.wikipedia.org/wiki/Allen_curve
[6A] https://en.wikipedia.org/wiki/Data_lake
[7] Even if nothing is infinitely scalable: https://www.youtube.com/watch?v=modXC5IWTJI
Immutable data
A data lake can store all types of data; it is up to the user to decide what to use it for. Of all the data it can hold, raw data are particularly interesting: available without changes or alterations, they can be modelled according to user needs.
Immutability drastically reduces manipulation errors:
the data are entered without any transformation, limiting the risk of losing the context or errors in interpretation
the data are stored only once and are never updated, thus limiting manipulation errors and keeping a full record.
Immutable, they can also theoretically[6b] be reused an infinite number of times. The data are not “consumed“ but “used“. In case of errors, bugs or code updates, the processing simply needs to be relaunched to obtain the latest results.
When they are timestamped and sufficiently individualised, such immutable data are also known as “events“.
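A minimal sketch of this replay property, with invented click events: because the raw events are never modified, fixing a buggy job is just a matter of re-running it over the same data.

```python
# Immutable raw events: stored once, never updated (hypothetical click log).
RAW_EVENTS = [
    {"user": "u1", "page": "/home", "ts": 1},
    {"user": "u1", "page": "/cart", "ts": 2},
    {"user": "u2", "page": "/home", "ts": 3},
]

def count_views_v1(events):
    # First version of the job: a bug counts only "/home" views.
    return sum(1 for e in events if e["page"] == "/home")

def count_views_v2(events):
    # Fixed version: count all page views. Because the raw events were
    # kept untouched, re-running the job is enough to correct the results.
    return len(events)

print(count_views_v1(RAW_EVENTS))  # 2  (wrong)
print(count_views_v2(RAW_EVENTS))  # 3  (re-derived from the same raw data)
```

Had the first job "consumed" or aggregated the events in place, the correct figure would have been lost for good.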
Schema on read
Another highly interesting characteristic is in interpreting the data. For a “traditional“ BI ingestion, the data are cleaned up, formatted, and normalised before being ingested. The Web Giants consider that each time data is transformed, part of the context is altered. By storing raw data, it is up to users to decide how to transform them.
Let us take the example of Twitter. Each tweet contains a multitude of information: text, images, videos, links, hashtags. They are timestamped, geographically located, shared, liked... Depending on the system using the data, it must be able to transform them by focusing on the aspect which seems most relevant. An application to map the most recent tweets will probably not have the same angle of approach as one looking for the most shared content.
[6B] In practice, Google uses its data over a period of 30 days, for both volumetric and legal reasons.
This pattern, Schema on read, has several advantages:
It greatly simplifies ingestion, avoids any data loss, and makes it much less expensive to add data to the data lake.
It gives clients flexibility by allowing personalised extraction and transformation depending on needs.
This pattern, joined with the preceding ones, becomes a driver of innovation. It does away with technical barriers to data processing, making it possible to develop new prototypes more and more quickly. The best way to find value in your data is to play around with them!
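The Twitter example above can be sketched as follows (the record fields are invented for illustration): the same raw record stays untouched in the lake, and each consumer applies its own schema at read time.

```python
import json

# Raw, untransformed record as it might land in the lake (hypothetical tweet).
raw = json.dumps({
    "text": "Big Data everywhere #data",
    "hashtags": ["data"],
    "geo": {"lat": 48.85, "lon": 2.35},
    "shares": 42,
    "ts": 1700000000,
})

# Each consumer applies its own schema when it READS, not when data is written.
def map_view(record):
    """A mapping application only cares about position and freshness."""
    d = json.loads(record)
    return {"lat": d["geo"]["lat"], "lon": d["geo"]["lon"], "ts": d["ts"]}

def trend_view(record):
    """A trends application only cares about hashtags and share counts."""
    d = json.loads(record)
    return {"hashtags": d["hashtags"], "shares": d["shares"]}

print(map_view(raw))    # {'lat': 48.85, 'lon': 2.35, 'ts': 1700000000}
print(trend_view(raw))  # {'hashtags': ['data'], 'shares': 42}
```

Neither view required a change to the ingestion pipeline: a third application with a different angle of approach could be added tomorrow against the same raw records.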
[Figure: the data lake. Ingestion of messages & events, raw files, application logs and external data (Open APIs) into non-structured, semi-structured (NoSQL) and structured (e.g. relational) storage; analytical batches, machine learning and flow management run on the lake; publication towards the enterprise DWH, databases, transactional systems, reporting and interactive requests.]

From Big Data to Fast Data

The Web Giants strive to give value to their clients as quickly as possible. Sometimes, and more and more often, offline processing is no longer sufficient for user needs.

In that case, the best way to get value from your data is to interact with them as soon as they are ingested: the data lake as described above only allows you to process data in batch mode. However, between two batch runs, freshly gathered data go unused. Not only do you miss out on their full value but, worse, some data may be outdated before they are even used. The fresher the data, the greater their potential interest.
To process millions or even billions of events per second, two types of technology are used:
Event distributors and collectors such as Flume and Kafka
Tools to process the events in near real time, such as Spark and Storm
More than mere users, the Web Giants take part in creating and sharing these building blocks:
Kafka is a high speed distributed message queue developed by LinkedIn[8]
Storm, originally developed by Twitter, makes it possible to process millions of messages per second[9]
The goal is not to replace the batch processing brick already included in the data lake, but instead to add real time features. This layer is often referred to as the Fast Layer, and the capacity to leverage Big Data for real time processing is known as Fast Data.[10] Real time reduces the Time to Action, so prized by the Web Giants.[11]
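The following toy sketch illustrates the principle of the fast layer, using plain Python stand-ins rather than Kafka or Storm themselves: events are consumed as they arrive and an always-fresh view is updated per event, instead of waiting for the next batch.

```python
from collections import Counter

def event_stream():
    """Stand-in for a distributed log (e.g. a Kafka topic): yields events
    one by one, as they arrive."""
    for user in ["u1", "u2", "u1", "u3", "u1"]:
        yield {"user": user, "action": "click"}

def process(stream):
    """Stand-in for a stream processor (Storm/Spark style): the view is
    updated per event instead of waiting for the next nightly batch."""
    live_counts = Counter()
    for event in stream:
        live_counts[event["user"]] += 1
        # At this point the freshest value is already queryable.
    return live_counts

print(process(event_stream()))  # Counter({'u1': 3, 'u2': 1, 'u3': 1})
```

The real systems add what this sketch omits: partitioning over many machines, replication, and delivery guarantees when a node fails.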
[Figure: the data lake extended with a real-time layer. On the batch side, high-volume data, log files and applications are imported into distributed file storage, with interactive and batch processing and a sandbox; on the real-time side, high-velocity data are ingested into resilient storage, with stateless and stateful processing; both sides publish through APIs and feed the enterprise DWH.]
[8] http://kafka.apache.org/
[9] http://storm.apache.org/
[10] http://www.infoworld.com/article/2608040/big-data/fast-data--the-next-step-after-big-data.html
[11] http://www.datasciencecentral.com/profiles/blogs/time-to-insight-versus-time-to-action
Should the two channels, batch and real time, be treated as distinct or, on the contrary, be unified? In theory, the ideal is to be able to process the entire dataset in both modes, but that is not so simple. There are numerous initiatives, yet most use cases can do without them, and you are unlikely to need any for your ecosystem. The Web Giants advise a batch-oriented architecture if you have no strong latency constraints, or a fully real-time architecture otherwise, but rarely both at once.
Lambda architecture
Lambda architecture is undoubtedly the most widespread response to the need to unify the two approaches. The principle is to process the data in two layers, batch and real time, carrying out the same processes in both channels, then consolidating the results in a third, dedicated layer:
The batch layer precalculates the results based on the complete dataset. It processes raw data and can be regenerated on demand.
The speed layer serves to overcome batch latency by generating real time views which undergo the same processing as in the batch layer. These real time views are continuously updated and the events are overwritten in the process; the views can therefore only be replayed by the batch layer.
The serving layer then indexes both views, batch and real time, and displays them in the form of consolidated output.
Since the raw data are always available in the batch layer, if there are any errors, the output can be regenerated.
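A toy sketch of the three layers (the events and checkpoint are invented for illustration): the same computation runs in both the batch and the speed layer, and the serving layer merges the two views into one consolidated answer.

```python
def batch_layer(all_events, up_to):
    """Precompute a view from the complete dataset, up to a checkpoint.
    Can be regenerated on demand from the raw data."""
    return sum(e["amount"] for e in all_events if e["ts"] <= up_to)

def speed_layer(recent_events, since):
    """Cover the batch latency gap: same computation, but only on
    events arrived after the checkpoint."""
    return sum(e["amount"] for e in recent_events if e["ts"] > since)

def serving_layer(batch_view, realtime_view):
    """Consolidate both views into a single output."""
    return batch_view + realtime_view

events = [{"amount": 10, "ts": 1}, {"amount": 5, "ts": 2}, {"amount": 7, "ts": 3}]
checkpoint = 2  # last timestamp covered by the previous batch run

total = serving_layer(batch_layer(events, checkpoint),
                      speed_layer(events, checkpoint))
print(total)  # 22
```

The sketch also shows where the complexity lies: the "same computation" here is three lines of Python, but in practice it must be written twice, on two very different technology stacks, and kept synchronised.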
[Figure: Lambda architecture. Incoming data (IoT, mobile, social) flow through a distribution layer; the batch layer recomputes batch views (precomputed information) over all data, while the speed layer increments real-time views; the serving layer merges both kinds of view for visualization. Adapted from: Marz, N. & Warren, J. (2013). Big Data. Manning.]
However, few use cases are truly adapted to this type of architecture. It has not yet reached maturity, even among the Web Giants, and is highly complex to implement. More specifically, it entails developing the same processing twice on two types of very different technologies. Doing it once is already difficult enough without having to double the task, especially given that it must all be synchronised.
As an alternative to Lambda architecture, Twitter offers, through Summingbird,[12] an abstraction layer in which computation for both layers can be expressed within a single framework. What you gain in simplicity, however, you lose in flexibility: the usable features are reduced to the intersection of both modes.
Kappa Architecture
LinkedIn has put forward another variant of this model: the Kappa architecture.[13] Its approach is to process all data, old and new, in a single layer, the fast layer, thus removing one side of the complex equation.
It is a way of dividing the streams into small independent steps that are easier to debug, with each step serving as a checkpoint from which unitary processing can be replayed in case of error. Reprocessing data is one of the most complicated challenges with this type of architecture and must be thoroughly thought through from the outset. Because code, formats and data constantly change, processing must be able to integrate the changes continuously, and that is no small matter.
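The Kappa idea can be sketched as follows, with an in-memory list standing in for the distributed log: a single code path handles both live and historical data, and reprocessing is simply replaying the log from the start with the new version of the job.

```python
# Stand-in for the distributed log: the full, ordered history of events.
LOG = [
    {"user": "u1", "amount": 10},
    {"user": "u2", "amount": 5},
    {"user": "u1", "amount": 3},
]

def run_job(log, transform):
    """Single (fast) layer: one code path for both live and historical data.
    Reprocessing = replaying the log from offset 0 with the new code."""
    state = {}
    for event in log:
        key, value = transform(event)
        state[key] = state.get(key, 0) + value
    return state

v1 = run_job(LOG, lambda e: (e["user"], e["amount"]))  # original job: sum amounts
v2 = run_job(LOG, lambda e: (e["user"], 1))            # changed job: count events
print(v1)  # {'u1': 13, 'u2': 5}
print(v2)  # {'u1': 2, 'u2': 1}
```

No second batch codebase was needed to produce v2: the new view was obtained by replaying the same log with the new transform, which is precisely what Kappa trades the Lambda batch layer for.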
[Figure: Kappa architecture. Incoming data (IoT, mobile, social) flow through a distribution layer into a single speed layer that increments real-time views, with replay of the stream for reprocessing; the serving layer exposes the views for visualization and analytical analysis. Adapted from: Marz, N. & Warren, J. (2013). Big Data. Manning.]
[12] https://github.com/twitter/summingbird
[13] http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html
How can I make it work for me?

Whether you have already invested in Business Intelligence or not, leveraging your data is no longer optional, and a data lake type solution has become almost inevitable. More flexible than a data warehouse, it makes it possible to process unstructured data and create models on demand. It does not (yet) replace traditional BI, but it opens up new vistas and possibilities.
Based on open source solutions, mostly around Hadoop and its ecosystem, this central business reference is a staunch ally to make data accessible, whatever their type: managing unstructured data, storing and processing large volumes, all with commodity hardware, which is to say low outlay.
Whatever your business line, the use cases are numerous and varied: from log analysis and security audits to optimising the buying journey, not forgetting data science of course, data lakes are a key component of intelligent user experience design. To go beyond offline processing of your data, add online features to your data lake. Although we do not necessarily recommend implementing Lambda or Kappa architectures, which are too complex for most use cases and not always mature, real-time schemes still offer real advantages and truly open new perspectives. Stay simple!
Data Science
THE WEB GIANTS DATA SCIENCE
Data science now provides technology which is both low cost and methodologically reliable to better use data in information systems. Data science drives business intelligence even deeper by automating data analysis and processing in order to e.g. predict events, behavior patterns, trends or to generate new insights. In what follows we provide an overview of data science, with illustrations taken from some of its most groundbreaking and surprising applications.
Data science is used to extract information from more or less structured data, based on methodologies and expertise developed at the crossroads of IT, statistics, and all business lines involving data.[1] [2]
Practically speaking, solving a data science problem means projecting patterns grounded in data from the past into the future. One speaks of supervised learning when the main issue is forecasting a specific target. When the target has not been specified or labelled data are lacking, detecting patterns is said to be unsupervised. One should note that data science also includes building atemporal patterns and then visualizing their various facets.
Taking the classic example of purchasing histories and pricing in online retail, data science serves to determine whether a client will buy a new product, or what price they would be willing to pay for it; these are two examples of supervised learning in the respective areas of classification and regression. Carving out marketing segments based on behavioral variables, in contrast, is an example of unsupervised learning.
More broadly, data science covers all technology and algorithms used to model, implement and visualize an issue using available data, but also to better understand problems by examining them from several viewpoints to potentially solve them in the future. Machine learning is defined as the algorithmic aspect of data science.
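To fix ideas, here is a deliberately naive Python sketch of the two families (the visit counts, labels and threshold are all invented): a toy nearest-neighbour predictor stands in for supervised learning, and a simple two-way split stands in for clustering into marketing segments.

```python
# Supervised: learn from labelled history (did past clients buy? yes/no).
history = [(1, "no"), (2, "no"), (8, "yes"), (9, "yes")]  # (visits, bought)

def predict_buy(visits):
    """1-nearest-neighbour classifier: a toy stand-in for supervised learning.
    The prediction is the label of the most similar past client."""
    nearest = min(history, key=lambda pair: abs(pair[0] - visits))
    return nearest[1]

# Unsupervised: no labels; group clients by behaviour alone.
def two_segments(values, threshold):
    """Toy stand-in for clustering: split clients into two marketing segments."""
    return {
        "low": [v for v in values if v < threshold],
        "high": [v for v in values if v >= threshold],
    }

print(predict_buy(7))                 # 'yes'  (closest labelled client bought)
print(two_segments([1, 2, 8, 9], 5))  # {'low': [1, 2], 'high': [8, 9]}
```

Real projects of course replace both stand-ins with proper machine learning algorithms, but the division of labour is the same: supervised methods need a labelled target, unsupervised methods discover structure without one.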
[1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM.
[2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the field of statistics“. Bell Labs Statistics Research Report.
Enthusiasm for the discipline is such that today's data scientists must constantly monitor the field to remain on top. Let us seize the occasion to note that in the second half of 2015, OCTO published a Hadoop white book and a book on data science (in French, English translation forthcoming).[3] [4]
Web GiantsAmong the Web Giants, there is strong movement towards unstructured data (e.g. video and sound). These have traditionally been ignored by analytics due to volume constraints and technical barriers to extracting the information. However they are back in fashion with a combination of breakthroughs in neural network science (including the field currently known as deep learning); in technology, with ever more affordable and powerful machines; and lastly with the wide media coverage of a number of futuristic applications.
Groundbreaking work has been going on over the last few years, notably in image and natural language processing, covering both sound and text.
In December, 2014 Microsoft announced the launch of Skype Translator, a real time translation tool for 5 languages, to break down language barriers.[5]
With DeepFace, Facebook announced, in June, 2014 a giant step forward in facial recognition, reaching a precision level of 97%, close to human performance for a similar task.[6]
Google presents similar results with FaceNet in an article dated June, 2015 on facial recognition and clustering.[7]
[3] http://bit.ly/WP-Hadoop2015 (French)
[4] data-science-fondamentaux-et-etudes-de-cas
[5] skype-translator-unveils-the-magic-to-more-people-around-the-world
[6] deepface-closing-the-gap-to-human-level-performance-in-face-verification
[7] http://arxiv.org/pdf/1503.03832.pdf
Such developments in unstructured data processing show that it is now possible to extract value from data hitherto considered out of reach. The key lies in structuring the data:
A raw image is transformed into a face, and then linked to a person. The image's context can also be described in a sentence.[8] The patterns extracted from the images can be reproduced with slight modifications, or blended with other images, such as a famous painting to produce artistic motifs.[9]
Speech can be transcribed as text, and music as notes on a score. Patterns extracted from music make it possible to a certain extent to reproduce a composer or musical genre.
Masses of unstructured texts are transformed into meaning using semantic vectors. Processing natural language becomes a question of algebraic manipulations, facilitating its use by the algorithms of data science.[10] The mainstreaming of bots and personal assistants such as Apple's Siri, Google's Now and Facebook's M attests to our ability to carry out ever more detailed semantic analyses on unstructured text.
The study of brain activity provides clues to identifying signs of illness such as epilepsy or to determining which cerebral patterns correspond to moving one's arm.[11]
Some problems requiring cutting edge expertise are now being handled using data science approaches, from detecting the Higgs boson to searching for dark matter using sky imaging.[12] [13]
Such use cases, often tightly linked to challenges launched by academic circles, have largely contributed to the media frenzy around data science.
Moreover, for the Web Giants, data science has become not only a way to continuously improve internal processes, but also an integral part of the business model. Google products are free because the data generated by the user has value for advertising targeting. Twitter draws a share of its revenue from the combination of advertising and analytics products. Uber is a perfect example of a data-driven company which, in serving as intermediary between the client and the driver, has nothing to sell other than intelligence in creating links.[14] Intermediation services can easily be copied by the competition, but not the intelligence behind the services.
[8] google-stanford-build-hybrid-neural-networks-that-can-explain-photos
[9] inceptionism-going-deeper-into-neural
[10] learning-meaning-behind-words
[11] grasp-and-lift-eeg-detection
[12] kaggle.com/c/higgs-boson
[13] kaggle.com/c/DarkWorlds/data
[14] data-science-disruptors
DATA SCIENCE
A flourishing ecosystem and accessible tools

The standardization of data science came about through the contribution of many tools from the open source world, such as the numerous machine learning and data handling libraries in languages like R and Python,[15] [16] and from the world of Big Data. These open source ecosystems and their dynamic communities have eased access to data science for many an IT engineer or statistician wishing to become a data scientist.
In parallel, tools for data analysis by major publishers, whether oriented statistics or IT, have also evolved towards integrating open source tools or developing their own implementations of machine learning algorithms.[17] Both the open source and proprietary ecosystems are flourishing, mature, and more and more accessible in terms of training and documentation.
Open source is used as much to attract major talent in data science as to provide tools for the community. This strategy is picking up speed, as illustrated by the buzz generated by TensorFlow, an open source deep learning framework for numerical computation published by Google in November 2015.[18] Thanks to highly permissive licensing, these tools are absorbed and improved by the community, transforming them into de facto standards. We have lost count of the tools from the Hadoop ecosystem which were internally developed by the Web Giants (such as Hive and Presto at Facebook, Pig at Yahoo, Storm and Summingbird at Twitter...) and then took on a second life in the open source world.
Platforms for online competitions in data science (such as the most well known kaggle.com or datascience.net in France) have given new, vibrant visibility to the potential of data science. Various Web Giants such as Facebook and major players in distribution and industry quickly understood that this could help them attract the best talent.[19] Many data science competitions propose job interviews as the top prize, in addition to financial awards and certain glory.
[15] four-main-languages-analytics-data-mining-data-science
[16] kdnuggets.com/2015/05/r-vs-python-data-science
[17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R
[18] tensorflow-googles-latest-machine_9
[19] kaggle.com/competitions
The Web Giants swiftly organized to recruit the best data scientists, thus anticipating the value added by interdisciplinary teams specialized in capitalizing on data.[20]
Many, e.g. Google, Facebook and Baidu, have also hired top specialists in machine learning such as Geoffrey Hinton, Yann LeCun and Andrew Ng.[21] [22] [23]
Current challenges in data science

One of the most crucial steps in any data science project is called feature engineering. It consists of extracting the relevant numeric variables that characterize one or several facets of the phenomenon under study: for example, numerically describing user behavior on a web site by calculating how often a given page is accessed, or characterizing an image by the number of contours it contains. Feature engineering is also considered one of the most tedious tasks a data scientist has to carry out. For unstructured data such as images, deep learning has made it possible to automate the procedure, placing the use cases mentioned above within reach. For structured data, the creation and selection of new features to improve prediction remain strongly specific to each particular business. This is an essential part of the alchemy of a good data scientist: feature engineering is still largely carried out manually by the world's best data scientists when working on structured data.[24]
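A minimal sketch of feature engineering on the web-site example above (the session log and feature names are invented): raw page views are turned into the numeric variables a model can actually consume.

```python
# Raw session log (hypothetical): (page visited, seconds spent on it).
session = [("/home", 5), ("/product/42", 30), ("/product/7", 20), ("/cart", 10)]

def engineer_features(session):
    """Turn a raw browsing session into numeric variables for a model."""
    pages = [p for p, _ in session]
    times = [t for _, t in session]
    return {
        "n_pages": len(pages),                                        # breadth
        "n_product_views": sum(p.startswith("/product/") for p in pages),
        "reached_cart": int("/cart" in pages),                        # intent
        "total_seconds": sum(times),                                  # engagement
    }

print(engineer_features(session))
# {'n_pages': 4, 'n_product_views': 2, 'reached_cart': 1, 'total_seconds': 65}
```

Each of these variables encodes a business hypothesis (breadth, intent, engagement); choosing the right ones is exactly the manual, business-specific craft described above.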
How can I make it work for me?Are all the data you produce stored and then readily accessible? What percentage of the data is in fact processed and analyzed? How often? To what extent do you use the available data to measure your processes and orient your actions? How much importance do you attach to recruiting data scientists, data engineers and data architects?
Data science contributes more broadly to the best practices of data driven companies, i.e. those that use the available data both qualitatively and quantitatively to improve all their processes. Answering the few questions above allows you to measure your maturity as concerns data.
[20] the-state-of-data-science
[21] wired.com/2013/03/google_hinton/
[22] facebook.com/yann.lecun/posts/10151728212367143
[23] chinese-search-giant-baidu-hires-man-behind-the-google-brain
[24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/
You have perhaps already used predictive methods based on linear algorithms, such as the logistic regression traditionally found in marketing scoring. Today, the rigorous application of data science methodology gives you control over the complexity inherent in using non-linear algorithms. The underlying compromise in giving up linear algorithms is a loss of capacity to understand and explain predictions, in exchange for more realistic, and therefore more useful, predictions.
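The compromise can be seen on a toy non-linear problem (the XOR of two binary features, a classic textbook case): no linear scorer of the logistic-regression kind can fit it perfectly, while a small decision tree can, at the cost of no longer reading off one explanatory coefficient per feature.

```python
# Toy dataset where the target is non-linear (XOR of two binary features).
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def linear_model(x, w=(1, 1), b=-0.5):
    """A linear scorer (as in logistic regression): easy to explain
    (one weight per feature), but no weights can separate XOR."""
    return int(w[0] * x[0] + w[1] * x[1] + b > 0)

def tree_model(x):
    """A depth-2 decision tree: non-linear, fits XOR exactly, but its
    logic cannot be summarised as a single coefficient per feature."""
    return int(x[0] != x[1])

linear_acc = sum(linear_model(x) == y for x, y in data) / len(data)
tree_acc = sum(tree_model(x) == y for x, y in data) / len(data)
print(linear_acc, tree_acc)  # 0.75 1.0
```

The 0.75 ceiling is not an artefact of the chosen weights: it is the best any linear separator can do on XOR, which is why non-linear models are worth their loss of explainability on such problems.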
How do I get started?

Depending on the nature of your business, you may have unstructured data that deserve a fresh look:

Call center recordings to be transcribed and semanticized, to better understand your customer relations.

Written texts supplied by clients, or emails sent by staff, to be used to categorize complaints and requests and to detect fads and trends.
The takeaway is that in the use cases of most of our clients, as in international competitions, the vast majority concern structured or semi-structured data:

Mapping links between customers and timestamped transactions can bring to light potential fraud, by processing volumes far beyond what is possible manually.

Web logs, captured as far upstream as possible, characterize the customer journeys which lead to a strategic target such as shopping cart abandonment.

Temporal series produced by industrial sensors help prevent problems on assembly lines.

Server logs identify warning signs before a machine breaks down.

Relational data on clients, sales and products form a set of characteristics, including identity, geographic location, behavior patterns and social networks, which are systematically integrated in the 360 models of the examples described above.
Better yet: personalizing your client segments, predicting component failures, improving the performance of your production units, building customer loyalty, forecasting increases in demand and reducing churn are all possible use cases.[25] Data science has become a strategic business asset that you can no longer do without.
[25] kaggle.com/wiki/DataScienceUseCases
Sources

[1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM.
[2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the field of statistics“. Bell Labs Statistics Research Report.
[3] http://bit.ly/WP-Hadoop2015 (French)
[4] data-science-fondamentaux-et-etudes-de-cas
[5] skype-translator-unveils-the-magic-to-more-people-around-the-world
[6] deepface-closing-the-gap-to-human-level-performance-in-face-verification
[7] http://arxiv.org/pdf/1503.03832.pdf
[8] google-stanford-build-hybrid-neural-networks-that-can-explain-photos
[9] inceptionism-going-deeper-into-neural
[10] learning-meaning-behind-words
[11] grasp-and-lift-eeg-detection
[12] kaggle.com/c/higgs-boson
[13] kaggle.com/c/DarkWorlds/data
[14] data-science-disruptors
[15] four-main-languages-analytics-data-mining-data-science
[16] kdnuggets.com/2015/05/r-vs-python-data-science
[17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R
[18] tensorflow-googles-latest-machine_9
[19] kaggle.com/competitions
[20] the-state-of-data-science
[21] wired.com/2013/03/google_hinton/
[22] facebook.com/yann.lecun/posts/10151728212367143
[23] chinese-search-giant-baidu-hires-man-behind-the-google-brain
[24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/
[25] kaggle.com/wiki/DataScienceUseCases
Design for Failure
Description of the pattern

“Everything fails all the time“ is a famous aphorism from Werner Vogels, CTO of Amazon: indeed, it is impossible to plan for all the ways a system can crash, in any layer: an inconsistent administration rule, system resources that are not released after a transaction, a hardware failure, etc.
It is on this simple principle that the Web Giants' architecture is based; it is known as the Design for Failure pattern: software must be able to overcome the failure of any underlying component or infrastructure.
Hardware is never 100% reliable; it is therefore crucial to isolate components and applications (data grids, HDFS...) to guarantee permanent service availability.
At Amazon, for example, it is estimated that 30 hard drives are changed every day per data center. The cost is justified by the nearly constant availability of the site amazon.fr (less than 0.3 s of outage per year), bearing in mind that each minute of outage costs over 50,000 euros in lost sales.
A distinction is generally made between the traditional continuity of service management model and the design for failure model which is characterized by five stages of redundancy:
Stage 1: physical redundancy (network, disk, data center). That is where the traditional model stops.
Stage 2: virtual redundancy. An application is distributed over several identical virtual machines within a VM cluster.
Stage 3: redundancy of the VM clusters (or Availability Zone on AWS). These clusters are organized into clusters of clusters.
Stage 4: redundancy of the clusters of clusters (or Region on AWS). A single supplier manages these regions.
Stage 5: redundancy of Internet suppliers (e.g. AWS and Rackspace) in the highly unlikely event of AWS being completely down.

Of course, you will have understood that the higher the redundancy level, the more the deployment and switch-over mechanisms are automated.
THE WEB GIANTS ARCHITECTURE / DESIGN FOR FAILURE
Applications designed for failure continue to function despite crashes of the system or of connected applications, even if that means degrading functionality for the most recently connected users, or for all users, in order to keep providing an acceptable level of service.
This entails including design for failure in the application engineering, based for example on:
Eventual consistency: instead of systematically seeking consistency on each transaction through often costly mechanisms of the XA[1] type, consistency is ensured eventually, once the failed services are available again.
Graceful degradation (not to be confused with the Web user interface technique of the same name): during sharp spikes in load, performance-costly functionalities are deactivated on the fly.
At Netflix, the streaming service is never interrupted, even when the recommendation system is down, failing or slow: the service is there, no matter what the failure.
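This graceful-degradation idea can be sketched as a simple fallback (the function and title names are invented for illustration, not Netflix's actual code): if the personalised service fails, a precomputed popular list is served instead, and the page never breaks.

```python
def recommendations(user):
    """Hypothetical personalised service that may be down or slow."""
    raise TimeoutError("recommendation cluster unreachable")

# Precomputed, always-available fallback content (invented titles).
POPULAR_TITLES = ["title-a", "title-b", "title-c"]

def homepage(user):
    """Design for Failure: the page degrades instead of breaking.
    If the recommender fails, fall back to a static popular list so the
    core service (streaming) is never interrupted."""
    try:
        return recommendations(user)
    except Exception:
        return POPULAR_TITLES

print(homepage("alice"))  # ['title-a', 'title-b', 'title-c']
```

Production systems wrap this pattern in circuit breakers and timeouts so that a slow dependency cannot exhaust the caller's resources, but the principle is the same: a degraded answer beats no answer.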
Moreover, to achieve this continuity of service, Netflix uses automated testing tools such as Chaos Monkey (recently open-sourced), Latency Monkey and Chaos Gorilla, which check that applications continue to run correctly despite, respectively, random failures of one or several VMs, network latency, and the loss of an Availability Zone.
Netflix thus lives up to its motto: “The best way to avoid failure is to fail constantly“.
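The idea behind such tools can be sketched in a few lines. This is an illustrative toy, not Netflix's actual Chaos Monkey: a random instance is killed, and the test asserts that the redundant service still answers.

```python
import random

class ServicePool:
    """A pool of redundant instances behind a simulated load balancer."""
    def __init__(self, n_instances: int):
        self.instances = {f"vm-{i}": True for i in range(n_instances)}

    def kill_random(self) -> str:
        """Chaos step: terminate one healthy instance at random."""
        victim = random.choice([v for v, up in self.instances.items() if up])
        self.instances[victim] = False
        return victim

    def handle_request(self) -> str:
        """Route around dead instances; fail only on total outage."""
        for vm, up in self.instances.items():
            if up:
                return f"200 OK from {vm}"
        raise RuntimeError("total outage")

pool = ServicePool(3)
pool.kill_random()                        # inject one random failure
response = pool.handle_request()          # the service must still answer
```

Running this "constantly", as the motto suggests, turns failure handling into a permanently exercised code path rather than an untested emergency procedure.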
Who makes it work for them?
Obviously Amazon, which provides the basic AWS building blocks. Obviously Google and Facebook, who communicate frequently on these topics. But also Netflix, SmugMug, Twilio, Etsy, etc.
In France, although some sites have very high availability rates, very few comment on their processes and, to the best of our knowledge, very few are capable of extending their redundancy beyond stage 1 (physical) or stage 2 (virtual machines). Let us nonetheless mention Criteo, Amadeus, Viadeo, and the main telephone operators (SFR, Bouygues, Orange) for their coverage of real-time needs.
[1] Distributed transaction, 2-phase commit.
What about me?
Physical redundancy, rollback plans, Disaster Recovery Plan sites, etc. are not Design for Failure patterns but rather redundancy stages.
Design for Failure entails a change in paradigm, going from “preventing all failures“ to “failure is part of the game“, going from “fear of crashing“ to “analyzing and improving“.
In fact, applications built along the lines of Design for Failure no longer generate such feelings of panic, because all failures are handled as a matter of course; this leaves time for post-mortem analysis and improvements through PDCA.[2] It is, to borrow a term from improv theater, “taking emergencies easy“.
This entails taking action on both a technical and a human level. First of all in application engineering:
The components of an application or application set must be decentralized and made redundant across VMs, Zones and Regions (in the cloud; the same principle applies if you host your own IS), without any shared failure zones. The most complex issue is synchronizing the databases.
All components must be resilient to underlying infrastructure failures.
Applications must support communication breaks and high network latency.
The entire production workflow for these applications has to be automated.
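Tolerating communication breaks and high latency usually comes down to wrappers like the following sketch: a retry loop with a bounded number of attempts and exponential backoff. The function and parameter names are illustrative, not taken from any particular stack.

```python
import time

def call_with_retry(operation, retries: int = 3, base_delay: float = 0.01):
    """Retry a remote call with exponential backoff before giving up."""
    last_error = None
    for attempt in range(retries):
        try:
            return operation()
        except ConnectionError as err:    # transient network failure
            last_error = err
            time.sleep(base_delay * 2 ** attempt)   # back off, then retry
    raise last_error                      # give up: let the caller degrade

# A simulated dependency that fails twice, then recovers.
calls = {"count": 0}
def flaky_service():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("network partition")
    return "payload"

result = call_with_retry(flaky_service)
```

If the retry budget is exhausted, the exception propagates and the caller can fall back on graceful degradation instead of hanging.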
Then, for the organization:
Get out of the A-Team culture (remember: “the last chance at the last moment“) and automate processes to overcome systems failure. At Google, there is 1 systems administrator for over 3000 machines.
[2] Plan-Do-Check-Act, a method for continuous improvement, known as the “Deming Wheel“.
Analyze and fix failures upstream with the Failure Mode and Effects Analysis (FMEA) method, and downstream with post-mortems and PDCA.
Related Patterns
Pattern “Cloud First“, p. 159.
Pattern “Commodity Hardware“, p. 167.
Pattern “DevOps“, p. 71.
Exceptions
For totally disconnected applications with few users or few business challenges, redundancy can be simple or non-existent. Arbitration between the redundancy stages is then carried out using ROI criteria (costs and complexity vs. estimated losses during periods of unavailability).
Sources
• Don MacAskill, How SmugMug survived the Amazonpocalypse, 24 April, 2011:
> http://don.blogs.smugmug.com/2011/04/24/how-smugmug-survived-the-amazonpocalypse
• Scott Gilbertson, Lessons From a Cloud Failure: It’s Not Amazon, It’s You, 25 April, 2011:> http://www.wired.com/business/2011/04/lessons-amazon-cloud-failure
• Krishnan Subramanian, Designing For Failure: Some Key Facts, 26 April, 2011:
> http://www.cloudave.com/11973/designing-for-failure-some-key-facts
The Reactive Revolution
For many years now, concurrent processes have been executed in separate threads. A program is basically a sequence of instructions that runs linearly within a thread. To perform all the requested tasks, a server spawns several threads. But these threads spend most of their time waiting for the result of a network call, a disk read or a database query.
Web giants have moved on to a new model to eliminate such time loss and to increase the number of users per server by reducing latency, improving performance globally and managing peak loads more simply.
The reactive manifesto defines a reactive application around four interrelated pillars: event-driven, responsive, scalable and resilient.
A responsive application is event-driven: it can provide an optimal user experience by making better use of the available computing power and by tolerating errors and failures better, hence its scalability and resilience. But the most powerful concept here is the event-driven orientation; everything else can be seen through this prism.
The reactive model is a development model driven by events.
It is called by a variety of names. It's all a matter of perspective:
event-driven, driven by events
reactive, that reacts to events
push-based application, where data is pushed as it becomes available
Or better yet: the Hollywood principle, summarised by the famous “don’t call us, we’ll call you“
Use cases: when latency matters
This architectural model is very relevant for applications interacting with users in real time.
This includes several use cases like:
Social networks, shared documents and direct communication tools
Financial analysis, pooled information like traffic congestion or public transport, pollution...
Multiplayer games
Multi-channel approaches, mobile application synchronisation
Open or private APIs, when usage is impossible to predict
IoT and index management
Massive user influx such as sport events, sales, TV ads...
And more generally when effectively managing complex algorithms is the issue, e.g. for ticket booking, graph management, the semantic web
One of the crucial elements in all these applications is latency handling. For an application to be responsive and thus usable, users must experience the lowest possible latency.
It’s all about the threading strategy
To put it simply, there are two types of thread:
Hard-threads: real concurrent processes, executed by the different processor cores
Soft-threads: simulated concurrent processes, each given a slice of CPU time in turn
Soft-threads are what allow machines to run many more threads simultaneously than they have cores.
The reactive model aims to eliminate as many soft-threads as possible and rely on hard-threads alone, thereby making more efficient use of modern processors.
To reduce the number of threads, the CPU must be shared not on a time basis but on an event basis. Each event triggers the processing of a piece of code, which must never block, so as to release the CPU as quickly as possible for the next event.
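The model can be sketched with Python's asyncio, one event-loop implementation among many (a simplified illustration, not tied to any particular Web Giant's stack): a single thread keeps a hundred simulated I/O calls in flight because each `await` releases the CPU instead of blocking.

```python
import asyncio

async def fake_io_call(name: str, delay: float) -> str:
    """Simulated network/disk call: awaiting releases the CPU."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def serve() -> list:
    # 100 "requests" in flight at once on a single thread:
    # the event loop interleaves them instead of blocking 100 threads.
    tasks = [fake_io_call(f"req-{i}", 0.01) for i in range(100)]
    return await asyncio.gather(*tasks)

results = asyncio.run(serve())
```

A thread-per-request server would need 100 threads for the same workload; here the total wall-clock time is roughly one 10 ms wait, not one hundred.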
ARCHITECTURE / THE REACTIVE REVOLUTION
Implementing this model means working across all software layers: from operating systems to development languages, by way of frameworks, hardware drivers and databases.
Data structures that eliminate locks are without doubt an important lever for system performance. The new functional data models thus become the best allies of the reactive model.
Among new software making the most buzz, many use an internal reactive model. To name but a few: Redis, Node.js, Storm, Play, Vertx, Axom, and Scala.
The reactive model also responds better to load peaks: it pushes back the limit on the number of simultaneous users imposed by an arbitrary, fixed thread-pool size. Most of the Web Giants have published feedback on their migration to this model: Coursera,[1] Gilt, Groupon, Klout, LinkedIn,[2] Netflix,[3] PayPal, Twitter,[4] Walmart[5] and Yahoo.
Their voices are unanimous: reactive architectures make it possible to offer the best user experience with the highest scalability.
Why now?
“Software gets slower faster than hardware gets faster. “ Niklaus Wirth – 1995
The reactive model is not new. It has been used in all user interface frameworks since the invention of the mouse. Each click or keystroke generates an event.
Even client-side JavaScript uses this model. There are no threads in this language, yet it is possible to have multiple simultaneous AJAX requests. Everything works using callbacks and events.
[1] http://downloads.typesafe.com/website/casestudies/Coursera-Case-Study.pdf
[2] http://engineering.linkedin.com/play/play-framework-async-io-without-thread-pool-and-callback-hell
[3] http://www.infoq.com/presentations/netflix-reactive-rest
[4] https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
[5] http://venturebeat.com/2012/01/24/why-walmart-is-using-node-js/
Current development architectures are the result of a succession of steps and evolutions. Some strong concepts have been introduced and used extensively before being replaced by new ideas. The environment is also changing. The way we respond to it has changed.
User experience has been the driving force of this change: today, who is willing to fill in a form, wait for the page to reload to get feedback (failure/success), and then wait again for the confirmation email? Why not get such information immediately rather than asynchronously?
Have we reached the limits of our systems? Is there still space to be conquered? Performance gains to discover?
In our systems there is a huge untapped reservoir of power. To double the number of users, adding a server will do the trick. But since the advent of mobile, companies have had to handle around 20x more requests: is it reasonable to multiply the number of servers in proportion? And would it be sufficient? Certainly not. It makes more sense to review the architecture so as to harness the power that is already available: there are many more processor cycles left to exploit. When programs spend significant amounts of time waiting for disks, networks or databases, they do not use the server's full potential.
This paradigm is now accessible to everyone, as it is increasingly built into modern development languages. These new development patterns integrate latency and performance management from the very beginning of a project, instead of leaving them as a challenge to overcome when it is too late to change the application architecture.
Applications based on the request/response model (HTTP / SOAP / REST) can tolerate a thread-based model. In contrast, applications based on flows such as JMS or WebSocket have everything to gain from an event-based model.
Unless your application is mostly devoted to calculations, you should start thinking about implementing the reactive approach. The paradigm is compatible with all languages.
Things are moving fast: new frameworks now offer asynchronous APIs and mostly use non-blocking APIs internally; language libraries are also changing, providing classes that make it possible to react to events more simply; and, lastly, the languages themselves are evolving to make it easier to write simple code (closures) or to generate asynchronous code from synchronous code.
In addition, patterns can be set up to manage threadless multitasking scripts:
a generator, which produces elements and pauses for each iteration, until the next invocation
continuation, a closure representing the rest of the processing, executed once the current step completes
coroutine, which makes it possible to pause processing
composition, which makes it possible to sequence processing in the pipeline
async/await, which makes it possible to write asynchronous code in a sequential style
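Two of the patterns above can be sketched in Python, whose generators pause and resume exactly as described (an illustrative example, not drawn from any particular framework): a generator stage, and composition of stages into a lazy pipeline.

```python
# Generator: produces elements and pauses at each iteration,
# until the next value is requested.
def numbers(limit):
    n = 0
    while n < limit:
        yield n          # pause here until the next invocation
        n += 1

# A second stage, itself a generator: resumes only when asked for a value.
def squares(source):
    for n in source:
        yield n * n

# Composition: stages are chained into a lazy pipeline; nothing runs
# until values are pulled from the end of the chain.
pipeline = squares(numbers(5))
result = list(pipeline)   # [0, 1, 4, 9, 16]
```

No thread is created anywhere: the interpreter simply switches between the paused stages as values are pulled through the pipeline.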
In other words, the reactive revolution is underway!
How can I make it work for me?
Reactive architecture is to traditional architecture what NoSQL is to relational databases: a very good alternative when you have reached your limits.
It is all a question of latency and concurrent access: for real-time applications, whether embedded or not, choosing a reactive architecture is justified as soon as volumes increase significantly. So no reactive corporate website, but rather real-time processing and display of IoT data (a vehicle fleet's positions, for example).
The same goes for APIs, whose back-ends must be designed accordingly: if your volume is under control, reactive architecture is overkill. But for an API open to partners, or even a fully open API, a non-blocking architecture must be designed from the outset.
Lastly, on the one hand, using the cloud wisely can help you overcome many of these limits (AWS Lambda, for example); on the other, many software vendors have demonstrated their willingness to produce highly scalable architectures. When choosing a software package, whether SaaS or hosted on-premises, for these use cases, companies must now turn to vendors who have proven mastery of such architectures.
All of these technologies have physical limits. Disk volumes are increasing, but not access time. There are more cores in processors, but frequency has not increased. Memory is increasing, beyond the capacity of garbage collectors. If you are nearing these limits, or will do so in the next few years, reactive architecture is definitely made for you.
Open API
THE WEB GIANTS ARCHITECTURE / OPEN API
Description
The principle behind Open API is to develop and offer services which can be used by a third party without any preconceived ideas as to how they will be used.
Development is thus mainly devoted to the application logic and persistence. The interface and business logic are developed by others, often more specialized in interface technologies and ergonomics, or with other areas of expertise.[1]
The application engine therefore exposes an API,[2] that is, a set of services. The final application is built by composing services, which can include services provided by third parties. This is the case, for example, of HousingMaps.com, a service for visualizing CraigsList advertisements on Google Maps.
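The mashup idea can be sketched as follows; the listing data and the geocoding table are stand-ins for real calls to classified-ads and mapping APIs, and all names here are invented for the illustration.

```python
# Illustrative mashup in the HousingMaps spirit: compose two independent
# services (listings and geocoding) into one application.
def fetch_listings():
    # stand-in for a call to a classified-ads API
    return [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Lyon"}]

def geocode(city):
    # stand-in for a call to a mapping API
    coords = {"Paris": (48.86, 2.35), "Lyon": (45.76, 4.84)}
    return coords[city]

def listings_on_map():
    """The mashup: enrich each listing with coordinates for map display."""
    return [dict(ad, position=geocode(ad["city"])) for ad in fetch_listings()]

pins = listings_on_map()
```

The mashup author never sees either provider's internals: the composition lives entirely at the API level, which is precisely the decoupling Open API aims for.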
The pattern belongs to the broader principles of SOA:[3] decoupling and composition possibilities. For a while, there was a divide between the architecture of Web Giants, generally of the REST[4] type and corporate SOA, mostly based on SOAP.[5] There has been a lot of controversy among bloggers on this opposition between the two architectures. What we believe is that the REST API exposed by Web Giants is just one form of SOA among others.
Web Giants publicly expose their API, thus creating open ecosystems. What this strategy does for them is to:
Generate direct income, by billing the service. Example: Google Maps charges for their service beyond 25,000 transactions per day.
Expand the community, thereby recruiting users. Example: thanks to the apps derived from its platform, Twitter has reached 140 million active users (and 500 million subscribers).
[1] http://www.slideshare.net/kmakice/maturation-of-the-twitter-ecosystem
[2] Application Programming Interface.
[3] Service Oriented Architecture.
[4] Representational State Transfer. [5] Simple Object Access Protocol.
Foster the emergence of new uses of the platform, thereby developing their revenue model. Example: in 2009, Apple noted that application developers wanted to sell not only their applications, but also content for them. The AppStore model was changed to include that possibility.
At times, externalize R&D, then acquire the most talented startups. That is what Salesforce did with Financialforce.com.
Marc Andreessen, creator of Netscape, divides open platforms into three types:
Level 1 - Access API: these platforms allow users to access business applications without providing the user interface. Examples: book searches on Amazon, geocoding on Mappy.
Level 2 - Plug-in API: These platforms integrate applications in the supplier’s user interface. Examples: Facebook apps, Netvibes Widgets.
Level 3 - Runtime Environment: These platforms provide not only the API and the interface, but also the execution environment. Example: AppExchange applications in the Salesforce or iPhone ecosystem.
It is also good to know that Web Giants APIs are accessible in self-service, i.e. you can subscribe directly on the web site without any commercial relations with the provider.
At level 3, you must design a multi-tenant system. The principle is to manage the applications of several businesses on shared infrastructure, finding a balance between pooling and isolation.
The API First pattern is derived from the Open API pattern: its approach is to begin by building an API, then to consume it to build the applications for your end users. The idea is to be on the same level as the ecosystem's users, which means applying to yourself the same architecture principles you are offering your clients: the Eat Your Own Dog Food (EYODF) pattern. Some architects working for Web Giants consider it the best way to build a new platform.
In practice, the API First pattern is an ideal which is not always reached: in recent history, it would seem that it has been applied for Google Maps and Google Wave, two services developed by Lars Rasmussen. And yet it was not applied for Google+, stirring the wrath of many a blogger.
Who makes it work for them?
Pretty much everyone, actually...
References among Web Giants
The Google Maps API is a celebrity: according to ProgrammableWeb.com, it is, alongside Twitter's, one of the APIs most used by websites. It has become the de facto standard for showing objects on a map. It uses authentication (client IDs) to measure a given application's consumption, so as to be able to bill for the service beyond a certain quota.
Twitter’s API is widely used: it offers sophisticated services to access subscriber data, for both reading and writing. One can even use streaming to receive tweet updates in real time. All of the site’s functionalities are accessible via the API, which also makes it possible to delegate the authorization process (using the OAuth protocol), thereby allowing a third-party application to tweet in your name.
In France
The mapping service Mappy offers APIs for geocoding, calculating itineraries, etc., available at api.mappy.com.
With api.orange.com, Orange offers the possibility to send text messages, to geolocate subscribers, etc.
What about me?
You should consider Open API whenever you want to create an ecosystem open to partners or clients, in-house or externally. Such an ecosystem can be open on the Internet or restricted to a single organization. A fairly classic scenario in a business is exposing the employee directory so that their identities can be integrated into applications.
Another familiar case is integrating services exposed by other suppliers (for example a bank consuming the services of an insurance company).
Lastly, a less traditional use is to open a platform for your end clients:
A bank could allow its users to access all of their transactions: see the examples of the AXA Banque and CAStore APIs.
A telephone or energy provider could give their clients access to their current consumption rate.
Related Pattern
Pattern “Device Agnostic“ p. 143
Exception!
Anything requiring a complex workflow.
Real-time IT (aircraft, car, machine tool): in this case service composition can pose performance issues.
Data manipulation posing regulatory issues: channelling critical data between platforms is best avoided.
Sources
• REST (Representational State Transfer) style:
> http://en.wikipedia.org/wiki/Representational_State_Transfer
• SOA> http://en.wikipedia.org/wiki/Service-oriented_architecture
• Book “SOA, Le guide de l’architecte d’un SI agile“ (French only):> http://www.dunod.com/informatique-multimedia/fondements-de-lin- formatique/architectures-logicielles/ouvrages-professionnel/soa-0
• Open platforms according to Marc Andreessen:> http://highscalability.com/scalability-perspectives-3-marc-andreessen- internet-platforms
• Mathieu Lorber, Stéphen Périn, What strategy for your web API? USI 2012 (French only):> http://www.usievents.com/fr/sessions/1052-what-strategy-for-your- web-api?conference_id=11-paris-usi-2012
About OCTO Technology
“We believe that IT transforms our societies. We are fully convinced that major breakthroughs are the result of sharing knowledge and the pleasure of working with others. We are constantly in quest of improvements.
THERE IS A BETTER WAY !“– OCTO Technology Manifest
OCTO Technology specializes in consulting and ICT project creation.
Since 1998, we have been helping our clients build their Information Systems and create the software to transform their firms. We provide expertise on technology, methodology, and Business Intelligence.
At OCTO our clients are accompanied by teams who are passionate about maximizing technology and creativity to rapidly transform their ideas into value: Adeo, Altadis, Asip Santé, Ag2r, Allianz, Amadeus, Axa, Banco Fibra, BNP Fortis, Bouygues, Canal+, Cdiscount, Carrefour, Cetelem, CNRS, Corsair Fly, Danone, DCNS, Generali, GEFCO ING, Itaú, Legal&General, La Poste, Maroc Telecom, MMA, Orange, Pages jaunes, Parkeon, Société Générale, Viadeo, TF1, Thales, etc.
We have grown into an international group with four subsidiaries: Morocco, Switzerland, Brazil and, more recently, Australia.
Since 2007, OCTO Technology has been granted the status of “innovative firm“ by OSEO Innovation.
For four years, from 2011 to 2015, OCTO was awarded 1st or 2nd prize in the Great Place to Work contest for firms with fewer than 500 employees.
ABOUT US
Authors
Erwan Alliaume
David Alia
Philippe Benmoussa
Marc Bojoly
Renaud Castaing
Ludovic Cinquin
Vincent Coste
Mathieu Gandin
Benoît Guillou
Rudy Krol
Benoît Lafontaine
Olivier Malassi
Éric Pantera
Stéphen Périn
Guillaume Plouin
Phillipe Prados
Translated from the French
by Margaret Dunham & Natalie Schmitz
Copyright © November 2012 by OCTO Technology. All rights reserved.
Illustrations
The drawings are by Tonu in collaboration with Luc de Brabandere. They are both active on www.cartoonbase.com, located in Belgium. CartoonBase works mostly with businesses and works to promote the use of cartoons and to encourage greater creativity in graphic art and illustrations of all kinds.
Graphics and design by OCTO Technology,
with the support of Studio CPCR
ISBN 13 : 978-2-9525895-4-3
Price: AUD $32
The Web Giants
Culture – Practices – Architecture
In the US and elsewhere around the world, people are reinventing the way IT is done. These revolutionaries most famously include Amazon, Facebook, Google, Netflix, and LinkedIn. We call them the Web Giants.
This new generation has freed itself from tenets of the past to provide a different approach and radically efficient solutions to old IT problems. Now that these pioneers have shown us the way, we cannot simply maintain the status quo. The Web Giant way of working combines firepower, efficiency, responsiveness, and a capacity for innovation that our competitors will go after if we don’t first.
In your hands is a compilation and structural outline of the Web Giants’ practices, technological solutions, and most salient cultural traits (obsession with measurement, pizza teams, DevOps, open ecosystems, open software, big data and feature flipping).
Written by a consortium of experts from the OCTO community, this book is for anyone looking to understand Web Giant culture. While some of the practices are fairly technical, most of them do not require any IT expertise and are open for exploitation by marketing and product teams, managers, and geeks alike. We hope this will inspire you to be an active part of IT, that driving force that transforms our societies.
THE OBSESSION WITH MEASUREMENT • FLUIDITY OF THE USER EXPERIENCE • ARTISAN CODERS • BUILD VERSUS BUY • CONTRIBUTING TO FREE SOFTWARE • DEVOPS • PIZZA TEAMS • MINIMUM VIABLE PRODUCT • PERPETUAL BETA • A/B TESTING • DEVICE AGNOSTIC • OPEN API AND OPEN ECOSYSTEMS • FEATURE FLIPPING • SHARDING • COMMODITY HARDWARE • TP VERSUS BI: THE NEW NOSQL APPROACH • CLOUD FIRST • DATA SCIENCE • REACTIVE PROGRAMMING • DESIGN THINKING • BIG DATA ARCHITECTURE • BUSINESS PLATFORM
OCTO designs, develops, and implements tailor-made IT solutions and strategic apps
...Differently.
WE WORK WITH startups, public administrations, AND large corporations FOR WHOM IT IS a powerful engine for change.
octo.com - blog.octo.com - web-giants.com