The Web Giants
Culture – Practices – Architecture
AUGMENTED
Table of Contents

Foreword ................................................................ 6
Introduction ............................................................ 9
Culture ................................................................. 11
    The Obsession with Performance Measurement .......................... 13
    Build vs Buy ........................................................ 19
    Enhancing User Experience ........................................... 27
    Code Crafters ....................................................... 33
    Open Source Contribution ............................................ 41
    Sharing Economy Platforms ........................................... 47
Organization ............................................................ 57
    Pizza Teams ......................................................... 59
    Feature Teams ....................................................... 65
    DevOps .............................................................. 71
Practices ............................................................... 85
    Lean Startup ........................................................ 87
    Minimum Viable Product .............................................. 95
    Continuous Deployment ............................................... 105
    Feature Flipping .................................................... 113
    A/B Testing ......................................................... 123
    Design Thinking ..................................................... 129
    Device Agnostic ..................................................... 143
    Perpetual Beta ...................................................... 151
Architecture ............................................................ 157
    Cloud First ......................................................... 159
    Commodity Hardware .................................................. 167
    Sharding ............................................................ 179
    TP vs. BI: the New NoSQL Approach ................................... 193
    Big Data Architecture ............................................... 201
    Data Science ........................................................ 211
    Design for Failure .................................................. 221
    The Reactive Revolution ............................................. 227
    Open API ............................................................ 235
About OCTO Technology ................................................... 243
Authors ................................................................. 245
Foreword

It has become a cliché to start a book, a talk or a preface by stating that the rate of change is accelerating. However, it is true: the world is changing faster, both because of the exponential rate of technology evolution and because of the central role of the user in today's economy. It is also the change Marc Andreessen characterized in his famous blog post as "software is eating the world". Not only is software at the core of the digital economy, but the way software is produced is changing dramatically too. This is not a topic for Web companies alone; it is a revolution that touches all companies. To cope with their changing environment, they need to reinvent themselves as software companies, with new ways of working, of organizing themselves and of producing digital experiences for their customers.

This is why I am so pleased to write the preface to "The Web Giants". I have been using this book intensely since the first French edition came onto the market. I have given copies to colleagues both at Bouygues Telecom and at AXA, and I have made it a permanent reference in my own blogs, talks and writing. Why? Because it is the simplest, most pragmatic and most convincing set of answers to the question: what should one do in this software-infused, technology-enabled, customer-centric, fast-changing 21st century?

This is not a conceptual book, a book about why you should do this or that. It is a beautifully written story about how software and service development is organized in some of the best-run companies in the world. First and foremost, this is a book about practices. The best way to grow change in a complex world is to adopt practices: it is the only way to learn, by doing. These practices are sorted into three categories: culture, organization and architecture; but there is a common logic and a systemic reinforcement between them. Practices are easier to pick up, and they are less intimidating than methodologies or concepts. However, strong will and perseverance are required.
I will not spoil your reading by summarizing what OCTO found when they looked at the most common practices of the most successful software companies in the world. I will rather try to convince you that reading this book is an urgent task for almost everyone, based on four ideas.

The first and foremost idea is that software systems must be built to change constantly. This is equally true for information systems, support systems, and embedded, web or mobile software. What we could call customer engagement platforms are no longer complex systems that one designs and builds, but continuously evolving systems that are grown. This new generation of software systems is the core of the Web Giants. Constant evolution is mandatory to cope with exponential technology change, and it is the only way to co-construct engagement platforms through customer feedback. The unpredictability of usage, especially social usage, means that digital experiences are software processes that can only be crafted through measurement and continuous improvement. This critical change, from software being designed to software being grown, means that all companies that provide digital experiences to their customers must become software companies. A stable software support system could be outsourced, delegated or bought, but a constantly evolving, self-adaptive system becomes a core capability. This capability is deeply intertwined with the business, and its delivery processes and agents are to be valued and respected.

The second key idea is that there is a new way of building such software systems. We face two tremendous challenges: churning out innovations at the rate the market expects, and constantly integrating new features while factoring out older ones, to avoid the suffocation by constant growth that plagued previous generations of software systems. The solution is a combination of open innovation - there are clearly more smart developers outside any company than inside - together with source-level "white box" integration and minimalist "platform" design principles. When all your code needs to be constantly updated to follow changes in the environment, the less you own the better. It is also time to bring source code back from the dark depths of "black box" integration. Open source culture is both about leveraging the treasure trove of what may be found in larger development communities and about mashing up composite applications by weaving source code that one may be proud of. Follow in the footsteps of the Web Giants: code that changes constantly is worth being well written, structured, documented and reviewed by as many eyeballs as possible.

The third idea is another way of saying that "software is eating the world": this book is not about software, it is about a new way of thinking about your company, whichever business you are in.
Not surprisingly, many "known" practices, such as agile development, lean startup, the obsession with measurement or with saving the customer's time - the most precious commodity of the digital age - have found their way into OCTO's list. By reading the practical testimonies from the Web Giants, a new kind of customer-focused organization emerges. Thus, this is a book for everyone, not for geeks only. This is of the utmost importance, since many of the levers of change lie in the hands of stakeholders other than the software developers themselves. For instance, a key requirement for agility is to switch from solution requirements to problem requirements, allowing the solution to be co-developed by cross-functional teams as well as users.

The last idea I would propose is that there is a price to pay for this transformation. There are technologies, tools and practices that you must acquire and learn. DevOps practices, such as continuous delivery or managing infrastructure as code, require mastering a set of tools and building skills: there is no "free lunch". A key set of benefits from the Web Giants' way of working comes from massive automation. This book also
shows some of the most recent technology patterns in the architecture section. Since this list evolves by nature, the most important lesson is to create an environment where "doers" may continuously experiment with the tools of the future, such as massively parallel cloud programming, big data or artificial intelligence. A key consequence is that there is a true difference in efficiency and competitiveness between those who master this set of tools and skills and those who do not. In the world of technology, we often use the word "Barbarians" for newcomers who leverage their software and technology skills to displace incumbents in older industries. This is not a question of mindset (taking legacy companies head-on is an age-old strategy for newcomers) but a matter of capabilities!

As stated earlier, there would be other, more conceptual ways to introduce the key ideas and practices pictured in this book. One could cite the best sources on motivation and collaborative work, such as Daniel Pink: these Web Giants practices reflect the state of the art of managing intrinsic motivation. The same could be said of the best books on lean management and self-organization; the reference to Lean Startup is one of many subtle signs of the influence of the Toyota Way on modern 21st-century forms of organization. Similarly, it would be tempting to invoke complex-systems theory - see Jurgen Appelo and his book "Management 3.0" for instance - to explain why the practices observed and selected by OCTO are the natural answer to the challenges of the increasingly changing and complex world we live in. From a technology perspective, it is striking to see the similarity with the cultural and organizational traits described by Salim Ismail, Michael Malone and Yuri van Geest in their book "Exponential Organizations".
The beauty of this pragmatic approach is that you get almost everything you need to know in a much shorter package, one that is fun and engaging to read.

To conclude this preface, I would advise you to read this book carefully, to share it with your colleagues, your friends and your children - when it is time to think about what it means to do something that matters in this new world. It tells a story about the new way of working that you cannot afford to miss. Some of its messages - measuring everything, learning by doing, loving your code and respecting those who build things - may make the most seasoned manager smile, but times are changing. This is no longer a set of suggested, "nice-to-have" practices, as it might have been ten years ago. It is the standard of web-age software development, and de facto the only way for any company to succeed in the digital world.
Yves Caseau - National Academy of Technologies of France, President of the ICT commission.
Head of Digital at AXA Group
Introduction
Something extraordinary is happening at this very moment; a sort of revolution is underway. Across the Atlantic, as well as in other parts of the world such as France, people are reinventing how to work with information technology. They are Amazon, Facebook, Google, Netflix and LinkedIn, to name but the most famous. This new generation of players has managed to shed old dogmas and examine afresh the issues at hand, coming up with new, radical and efficient solutions to long-standing IT problems.
Computer scientists are well aware that when IT tools are introduced to a trade, the benefits of computerization can only be reaped if business processes are rethought in light of the new potential offered by technology. One trade, however, has thus far mostly managed to avoid upheavals in its own processes: Information Technology itself. Many continued - and still do - to build information systems the way one would build highways or bridges. There is a tendency to forget that the material handled on a daily basis is extremely volatile. From hearing so much about Moore's law,[1] its true meaning is forgotten: what couldn't be done last year is possible today; what cannot be done today will be possible tomorrow. The beliefs and habits of the ecosystem we live in must be challenged at regular intervals. This thought is both terrifying and wonderful.
Now that the pioneers have paved the way, it is time to revisit business processes. The new approaches laid out here offer significant gains in efficiency, proactivity and the capacity for innovation, to be harnessed before the competition pulls the rug out from under your feet. The good news is that the Web Giants are not only paving the way; they espouse the vision of an IT community. They are committed to Open Source principles, communicate their practices openly to appeal to potential recruits, and work in close collaboration with the research community. Their work methods are public knowledge and very accessible to those who care to delve into them.
The aim of this book is to provide a synthesis of practices, technological solutions and the most salient traits of IT culture. Our hope is that it will inspire readers to make contributions to an information age capable of reshaping our world.
This book is designed for both linear and thematic reading. Those who opt for the former may find some repetition.
[1] An empirical law stating that, at a fixed price, computing power roughly doubles every 18 months.
Culture
The Obsession with Performance Measurement .................... 13
Build vs Buy .................................................. 19
Enhancing the User Experience ................................. 27
Code Crafters ................................................. 33
Developing Open Source ........................................ 41
The Obsession with Performance Measurement
Description
In IT, we are all familiar with quotes reminding us of the importance of performance measurement:
That which cannot be measured cannot be improved; without measurement, it is all opinion.
Web Giants have taken this idea to the extreme, and most have developed a strong culture of performance measurement. The structure of their activities leads them in this direction.
These activities often share three characteristics:

- For these companies, IT is the means of production. Their costs are therefore directly correlated to the optimal use of equipment and software: improvements in the number of concurrent users served or in CPU usage translate into rapid ROI.
- Their revenues are directly correlated to the efficiency of the service provided: improvements in conversion rates lead to rapid ROI.
- They are surrounded by computers, and computers are excellent measurement instruments, so they may as well get the most out of them!
Most Web Giants have made a habit of measuring everything: response times, the most visited web pages, the articles (content or sales pages) that work best, the time spent on individual pages...
In short, nothing unusual – at first glance.
But that's not all! They also measure the heat generated by a given CPU, the energy consumption of a transformer, and the average time between two hard disk failures (MTBF, Mean Time Between Failures).[1] This motivates them to build infrastructure that maximizes the energy efficiency of their installations, and these players closely monitor their PUE, or Power Usage Effectiveness. Most importantly, they have learned to base their action plans on this wealth of metrics.
[1] http://storagemojo.com/2007/02/19/googles-disk-failure-experience
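Both metrics just mentioned are simple ratios, and that simplicity is part of why they are so widely tracked. As a rough illustration (the function names and all figures below are invented for the example, not drawn from any Giant's actual data):

```python
def mtbf_hours(total_operating_hours: float, failures: int) -> float:
    """Mean Time Between Failures: cumulative operating time
    divided by the number of failures observed over that time."""
    return total_operating_hours / failures

def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy divided by the
    energy consumed by IT equipment alone. 1.0 is the theoretical ideal;
    every point above it is cooling, power distribution and other overhead."""
    return total_facility_kwh / it_equipment_kwh

# A hypothetical fleet of 1,000 disks running 24/7 for a year, with 25 failures:
fleet_hours = 1000 * 365 * 24
print(mtbf_hours(fleet_hours, 25))   # 350400.0 operating hours per failure
print(pue(1_500_000, 1_000_000))     # 1.5: half a kWh of overhead per IT kWh
```

The point is not the arithmetic but the habit: once such ratios are computed continuously rather than estimated annually, they can drive action plans.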
Part of this trend is A/B testing (see "A/B Testing" on p. 123 for further information), which consists of testing different versions of an application on different client groups. Does A work better than B? The best way to find out remains objective measurement: it yields concrete data that can defy common sense and reveal the limits of armchair expertise, as demonstrated by the website www.abtests.com, which references A/B testing results.
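To make the mechanics concrete, here is a minimal sketch (not any Giant's actual implementation) of the two halves of an A/B test: deterministically bucketing users into groups, and checking whether an observed difference in conversion rate is statistically meaningful via a standard two-proportion z-test. The user id and traffic figures are invented:

```python
import hashlib
import math

def assign_variant(user_id: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user by hashing their id,
    so the same user always sees the same variant."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def z_score(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-proportion z-test on conversion rates;
    |z| > 1.96 is significant at the usual 5% level."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    return (p_a - p_b) / se

# Invented figures: variant A converts 120 of 1,000 visitors, B 150 of 1,000.
print(assign_variant("user-42"))
print(round(z_score(120, 1000, 150, 1000), 2))
```

The deterministic hash is a common design choice: it needs no stored assignment table, yet keeps each user's experience stable across visits.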
In an interview, Yassine Hinnach - then Senior Engineering Manager at LinkedIn - described how LinkedIn teams were encouraged to quickly put any technology designed to boost site performance to the test. Decisions to adopt a given technology are thus made on the basis of observed metrics.

HighScalability.com has published an article presenting Amazon's recipes for success, based on interviews with its CTO. Among the more interesting quotes, the following caught our attention:
Everyone must be able to experiment, learn, and iterate. Position, obedience, and tradition should hold no power.
For innovation to flourish, measurement must rule.[2]
As another example of this approach, here is what Timothy B. Lee, a journalist for Wired and the New York Times, had to say about Google's culture of performance measurement:

"Rather than having intimate knowledge of what their subordinates are doing, Google executives rely on quantitative measurements to evaluate the company's performance. The company keeps statistics on everything - page load times, downtime rates, click-through rates, etc. - and works obsessively to improve these figures. The obsession with data-driven management extends even to the famous free snacks, which are chosen based on careful analysis of usage patterns and survey results."[3]
[2] http://highscalability.com/amazon-architecture
[3] http://arstechnica.com/apple/news/2011/06/fourth-times-a-charm-why-icloud-faces-long-odds.ars
The consequences of this modus operandi run deep. A number of pure players display in their offices the motto “In God we trust. Everything else, we test“. This is more than just a nod to Deming;[4] it is a profoundly pragmatic approach to the issues at hand.
An extreme example of this trend, verging on caricature, is Google's 'Project Oxygen': a team of internal statisticians combed through HR data collected internally - annual performance reviews, feedback surveys, nominations for top-manager awards - and distilled the essence of what makes a good manager down to eight rules. Reading through them, any manager worthy of the name would be struck by how jaw-droppingly obvious they seem. However, the claims were backed with cold, hard data,[5] and that made all the difference!
What about me?

The French are fond of modeling, and are often less pragmatic than their English-speaking counterparts.
Indeed, we believe that this constant, rapid feedback loop - hypothesis, measurement, decision - should be an almost systematic reflex in IT departments, and one that can be put into effect at a moment's notice.
The author of these lines still has painful memories of two four-hour meetings, each with ten attendees, organized to determine whether shifting service-layer requests to HTTP would have a "significant" impact on performance. Ten working days would have been more than enough for a developer to measure it, at a much lower cost.
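The lesson generalizes: a quick measurement beats a long debate. A minimal sketch of such a micro-benchmark might look like this, where the two call paths are hypothetical stubs standing in for a direct in-process call and the same call over HTTP:

```python
import time

def average_ms(fn, iterations: int = 1000) -> float:
    """Average wall-clock execution time of fn, in milliseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    return (time.perf_counter() - start) * 1000 / iterations

# Hypothetical stand-ins: replace these stubs with the real call paths
# under comparison before drawing any conclusion.
def call_direct():
    sum(range(100))

def call_over_http():
    sum(range(100))

print(f"direct: {average_ms(call_direct):.4f} ms")
print(f"http:   {average_ms(call_over_http):.4f} ms")
```

An afternoon spent on such a script produces a number; a meeting produces an opinion.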
OCTO consultants have also discovered, several times over, that applications performed better once the cache that had been added to improve performance was removed! The cure was worse than the disease, and its alleged efficacy had never actually been measured.
Management runs the risk of falling into the trap of believing that "hard data" analysis is a given. It may be a good idea to check regularly that this is indeed the case, and especially that the information gathered is actually used in decision-making.
[4] "In God we trust; all others must bring data", W. Edwards Deming.
[5] Adam Bryant, Google's Quest to Build a Better Boss, The New York Times, March 12, 2011: http://www.nytimes.com/2011/03/13/business/13hire.html
Nevertheless, it cannot be emphasized enough that an ecosystem fostering the application of this information is part of the Web Giants' recipe for success.
Two other practices support this culture of performance metrics:

- Automated tests: the result is either red or green, and no one can argue with that. They also ensure that it is always the same thing being measured.
- Short cycles: to measure - and especially to interpret - the data, one must be able to compare options "all other things being equal". This is crucial. We recently assessed steps taken to improve the performance of an application, but about a dozen other optimizations had been made in the same release. How, then, can efficient optimizations be distinguished from counter-productive ones?
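The "red or green" property comes from assertions: a test either passes or it fails, leaving no room for opinion. A minimal, hypothetical example using Python's built-in unittest framework (the function under test is invented for illustration):

```python
import unittest

def conversion_rate(conversions: int, visitors: int) -> float:
    """Share of visitors who converted; 0.0 when there were no visitors."""
    if visitors == 0:
        return 0.0
    return conversions / visitors

class ConversionRateTest(unittest.TestCase):
    def test_nominal_case(self):
        # Green if the computation is right, red otherwise.
        self.assertAlmostEqual(conversion_rate(3, 100), 0.03)

    def test_no_visitors(self):
        # Edge case: no visitors must not crash with a division by zero.
        self.assertEqual(conversion_rate(0, 0), 0.0)

if __name__ == "__main__":
    unittest.main(argv=["prog"], exit=False)
```

Because the same assertions run unchanged on every build, the suite also guarantees that it is always the same thing being measured from one cycle to the next.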
Build vs Buy
Description
One striking difference between the strategy of the Web Giants and that of more traditional IT departments lies in how they arbitrate between Build and Buy.
The issue is as old as computers themselves: is it better to invest in designing software that best fits your needs, or to use a software package and benefit from the capitalization and R&D of a publisher (or community) that has had all the necessary leisure to master the technological and business details?
Most major firms have gone for the second option and have enshrined maximal use of software packages among their guiding principles, on the view that IT is not one of their core businesses and is therefore better left to professionals.
The major Web companies have tended to do the exact opposite, which makes sense given that IT is precisely their core business, and as such too sensitive to be left in the hands of outsiders. The resulting divergence is thus coherent.
Nonetheless, it is useful to push the analysis one step further, because the Web Giants have other motives too: first, being in control of the development process to ensure it is perfectly adjusted to their needs; second, the cost of scaling up! These concerns exist in other IT departments as well, which means it can be a good idea to look very closely into your software-package decisions.
Finding balanced solutions
On the first point, one of the built-in flaws of software packages is that they are designed for, and around, the needs that arise most often among the publisher's clients.[1] Your needs are thus only a small subset of what the package is built to do. Adopting a software package by definition entails overkill: an overly complex solution, not optimized for your needs, whose price in terms of execution and complexity offsets the savings made by not investing in the design and development of a bespoke application.

[1] We will not dwell here on the fact that you should not stray too far from the standard out-of-the-box software package, as this can be (very) expensive in the long term, especially when new releases come out.
This is particularly striking in the data model of software packages. Much of the model's complexity stems from the fact that the package is optimized for interoperability (a highly standardized conceptual data model, extension tables, low model expressiveness since it is a meta-model...). However, the abstractions and "hyper-genericity" this leads to in the software design have an impact on processing performance.[2]
Moreover, the Web Giants face constraints in terms of volume, transaction speed and number of simultaneous users that push traditional architectures to their limits and, consequently, require fine-tuned optimizations based on observed access patterns. Read-intensive transactions must not be optimized in the same way as those where the stakes lie in write I/O metrics.
In short, to attain such results you have to pop the hood and poke around in the engine, which is not something you can do with a software package (all guarantees are void from the moment you fiddle with the innards).
Because performance is an obsession for the Web Giants, the overhead and limited tuning possibilities of software packages make them quite simply unacceptable.
Costs
The second particularly critical point is, of course, cost at scale. When the number of processors and servers increases, costs rise very quickly, and not always linearly, which makes some line items much more visible. This is true of both business software packages and hardware.
That is precisely one of the arguments that led LinkedIn to gradually replace their Oracle database with an in-house solution, Voldemort.[3] In a similar vein, in 2010 we carried out a study of the main e-commerce
[2] When it is not a case of a cumbersome interface.
[3] Yassine Hinnach, Évolution de l'architecture de LinkedIn, enjeux techniques et organisationnels, USI 2011: http://www.usievents.com/fr/conferences/8-paris-usi-2011/sessions/1007
sites in France: at the time, eight of the ten largest sites (by annual turnover) ran on platforms developed in-house and two used e-commerce software packages.
The Web Giants thus prefer Build to Buy - but not only that. They also make massive use of open source solutions (cf. "Developing open source", p. 41). Linux and MySQL reign supreme in many firms. Development languages and technologies are almost all open source: very little .NET, for example, but rather Java, Ruby, PHP, C(++), Python, Scala... And they do not hesitate to fork other projects: Google, for example, uses a heavily modified Linux kernel.[4] The same is true of one of the main worldwide Global Distribution Systems.
Most technologies making a stir today in the world of high-performance architecture are the result of developments carried out by Web Giants and then opened up to the community: Cassandra, developed by Facebook; Hadoop and HBase, inspired by Google and developed at Yahoo!; Voldemort, by LinkedIn...
This is a way of combining the advantages of software perfectly tailored to your needs with the improvements contributed by the development community - with, as an added bonus, a market of people already trained in the technologies you use.
Coming back to the example of LinkedIn, many of their technologies are grounded in open source solutions:

- Zoie, a real-time indexing and search system based on Lucene.
- Bobo, a faceted search library based on Lucene.
- Azkaban, a batch workflow scheduler to manage Hadoop job dependencies.
- GLU, a deployment framework.
[4] http://lwn.net/Articles/357658
How can I make it work for me?

Does this mean I have to do away with software packages in my IT choices?
Of course not - not for everything. A software package can be the best solution: no one today would dream of redeveloping a payroll system. However, ad hoc development should be considered when the IT tool is key to the success of your business. Figure 1 lays out orientations in terms of strategy.
The other context where specific development can be the right choice is high performance: with companies moving to "full web" solutions, very few business software packages have an architecture able to support the traffic intensity of some websites.
As for infrastructure solutions, open source has become the norm: operating systems and application servers first and foremost, and often databases and message buses as well. Open source solutions are ideally suited to running the platforms of the Web Giants; there is no doubt as to their performance and stability.
One hurdle remains: reluctance on the part of CIOs to forgo the support that comes with commercial software packages. And yet, when you look at what actually happens when there are problems with a commercial technical platform, it is rarely the publisher's support, handsomely paid for, that provides the solution, but rather networks of specialists and help forums on the Internet.
[Figure 1: sourcing strategy by asset type. Innovations and strategic assets - unique, differentiating, perceived as a commercial asset - justify specific development (the "faster" end of the scale). Functions common to all organizations in an industry, perceived as production assets, suit a software package. Functions common to all organizations, perceived as a resource, suit BPO[5] (the "cheaper" end of the scale).]

[5] Business Process Outsourcing.
For application platforms such as databases or message buses, the answer is less clear-cut, because some commercial solutions include functionality that open source alternatives lack. However, if you are pushing an Oracle database into regions where MySQL cannot follow, you have very sophisticated needs - which is not the case in 80% of the contexts we encounter!
Enhancing User Experience
Description
Performance: a must
One conviction shared by the Web Giants is that users' perception of performance is crucial. Performance is directly linked to visitor retention and loyalty: how users feel about a particular service is tied to the speed with which its graphic interface is displayed.
Most people have no interest in software architecture, server power, or the network latency inherent in web-based services. All that matters is the impression of seamlessness.
User-friendliness is no longer negotiable
The Web Giants have fully grasped this and speak of metrics in terms of "the blink of an eye" - in other words, fractions of a second. Their measurements, carried out notably through A/B testing (cf. "A/B Testing", p. 123), are very clear:
- Amazon: a 100 ms increase in latency means a 1% loss in sales.
- Google: a page taking more than 500 ms longer to load loses 20% of its traffic (pages visited).
- Yahoo!: 400 ms more to load means 5 to 9% more abandonment.
- Bing: more than one second to load means a 2.8% loss in advertising income.
How is such performance attained?
In keeping with the Device Agnostic pattern (cf. "Device Agnostic", p. 143), the Web Giants develop either native interfaces or Web interfaces, always aiming to offer the best possible user experience. In both cases, performance as perceived by the user must be maximized.
Native applications
With the iPhone, Apple reintroduced applications developed for a specific device (stopping short of assembler, however) to maximize perceived performance. Thus Java and Flash technologies are banished from the iPhone. The platform also uses visual artifacts: when an app is launched, it displays a snapshot of the view as it appeared when the app was last loaded, to strengthen the impression that start-up is instantaneous, while the actual app loads in the background. On Android, Java applications are executed on a virtual machine optimized for the platform. They can also be written in C to maximize performance.
Generally speaking, there is a consensus around native development, especially on mobile platforms: it must be as tightly linked as possible to the device. Multi-platform technologies such as Java ME, Flash and Silverlight do not directly enhance the user experience and are therefore set aside.
Web applications
Fully loading a Web page usually takes between 4 and 10 seconds (including graphics, JavaScript, Flash, etc.).
It would seem that perceived display latency generally breaks down as roughly 5% server processing and 95% browser processing. Web Giants have therefore taken considerable care to optimize the display of Web pages.
As an illustration, here is a list of the main good practices generally agreed to optimize perceived performance:
It is crucial to cache all static resources (graphics, CSS style sheets, JavaScript scripts, Flash animations, etc.) whenever possible. There are various HTTP cache technologies for this. It is important to become skillful at optimizing the life-cycle of the resources in the cache.
It is also advisable to use a cache network, or Content Delivery Network (CDN) to bring the resources as close as possible to the end user to reduce network latency. We highly recommend that you have cache servers in the countries where the majority of your users live.
Downloading in the background is a way of masking sluggishness in the display of various elements on the page.
A common practice is to use sprites: the principle is to aggregate several images in a single file to limit the number of requests; the needed image is then selected on the fly by the browser (see the Gmail example below).
Having recourse to multiple domain names is a way to maximize parallelization in simultaneous resource loading by the browser. One must bear in mind that browsers are subject to a maximum number of simultaneous requests to the same domain. Yahoo.fr, for example, loads its images from l.yimg.com.
Placing JavaScript resources at the very end of the page to ensure that graphics appear as quickly as possible.
Using minification tools, i.e. removing from the code (JavaScript, HTML, etc.) all characters (line breaks, comments, etc.) that help humans read the code but are not needed to execute it, and shortening function names as much as possible.
Concatenating the various source files, such as JavaScript, into a single file whenever possible.
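The last two practices (minification and concatenation) can be sketched in a few lines. This is a deliberately naive illustration: real tools such as UglifyJS or the Closure Compiler also shorten identifiers and handle string literals and regular expressions safely, which this sketch does not:

```python
import re

def minify_js(source: str) -> str:
    """Naive minifier: strip comments and collapse whitespace.
    Unsafe on string and regex literals; for illustration only."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.S)  # block comments
    source = re.sub(r"//[^\n]*", "", source)               # line comments
    source = re.sub(r"\s+", " ", source)                   # collapse whitespace
    return source.strip()

def bundle(sources: list) -> str:
    """Concatenate several minified sources into a single payload,
    saving one HTTP request per file."""
    return "\n".join(minify_js(src) for src in sources)

payload = bundle([
    "// init\nvar a = 1;",
    "/* helper */\nfunction f(x) { return x + a; }",
])
```

Fewer bytes and fewer requests both shorten the time before the browser can render the page.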
Who makes it work for them?
There are many examples of such practices among Web Giants, e.g. Google, Gmail, Viadeo, GitHub, Amazon, Yahoo!...
References among Web Giants
Google has the most extensive distributed cache network of all Web Giants: the search giant is said to have machines in all major cities, and even a private global network, although corroboration is difficult to come by.
Google Search pushes the real-time user experience to the limits with its “Instant Search“, which loads search results as you type your query. This function stems from formidable technical skill and has aroused the interest of much of the architect community.
Gmail images are reduced to a strict minimum (two sprite images, shown in Figure 1), and the site makes intensive use of caching and loads JavaScript in the background.
Figure 1: Gmail sprite images.
France
Sites using or having used the content delivery network Akamai:
cite-sciences.fr
lemonde.fr
allocine.com
urbandive.com
How can I make it work for me?
The consequences of display latency are the same for in-house applications within any IT department: users get fed up with the application and stop using it. That is to say, this is a pattern which applies perfectly to your own business.
Sources
• Eric Daspet, “Performance des applications Web, quoi faire et pourquoi ?“, USI 2011 (French only):
> http://www.usievents.com/fr/conferences/10-casablanca-usi-2011/sessions/997-performance-des-applications-web-quoi-faire-et-pourquoi
• Articles on Google Instant Search:
> http://highscalability.com/blog/2010/9/9/how-did-google-instant-become-faster-with-5-7x-more-results.html
> http://googleblog.blogspot.com/2010/09/google-instant-behind-scenes.html
Editor’s note: sprites are, by definition, designed for screen display; we are unable to provide a better rendering of this example in print. Thank you for your understanding.
Code Crafters
Description
Today the Web Giants are there to remind us that a career as a developer can be just as prestigious as that of a manager or consultant. Indeed, some of the most striking successes of Silicon Valley have originated with one or several visionary geeks who are passionate about quality code.
When these companies’ products gain in visibility, satisfying an increasing number of users means embracing a virtuous cycle of development quality, without which success can vanish as quickly as it came.
Which is why a software development culture is so important to Web Giants, based on a few key principles:
attracting and recruiting the best programmers,
investing in developer training and allowing them more independence,
gaining their loyalty through workplace attractiveness and payscale,
being uncompromising as to the quality of software development - because quality is non-negotiable.
Implementation
The first challenge the Giants face is thus recruiting the best programmers. They have become masters at the art, which is trickier than it might at first appear.
One test which is often used by the majors is to have the candidates write code. A test Facebook uses is FizzBuzz. This exercise, inspired by a drinking game which some of you might recognize, consists in displaying the numbers from 1 to 100, replacing multiples of 3 with “Fizz“, multiples of 5 with “Buzz“, and multiples of both 3 and 5 with “FizzBuzz“. This little programming exercise weeds out 99.5% of the candidates. Similarly, to be hired by Google, between four and nine technical interviews are necessary.
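For reference, a complete FizzBuzz (counting from 1 to 100, replacing multiples of 3 with “Fizz“, multiples of 5 with “Buzz“ and multiples of both with “FizzBuzz“) fits in a few lines of Python, which is precisely what makes failing it so telling:

```python
def fizzbuzz(n: int) -> str:
    # Multiples of both 3 and 5 (i.e. of 15) must be tested first.
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

print(", ".join(fizzbuzz(n) for n in range(1, 101)))
# → 1, 2, Fizz, 4, Buzz, Fizz, 7, ...
```

The point of the test is not algorithmic difficulty but whether the candidate can turn a trivially clear specification into working code at all.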
Salary is obviously to be taken into account. To have very good developers, you have to be ready to pay the price. At Facebook, Senior Software Engineers are among the best paid employees.
Once programmers have joined your firm, the second challenge is to foster their development and fulfillment, and to enrich their skills. In such companies, programmers are not considered code laborers to be watched over by a manager but instead as key players. The Google model, which encourages developers to devote 20% of their time to R&D projects, is often cited as an example. This practice can give rise to contributions to open-source projects, which provide many benefits to the company (cf. “Open Source Contribution“, p. 41). On the Netflix blog for example, they mention their numerous open source initiatives, notably on Zookeeper and Cassandra. The benefit to Netflix is twofold: its developers gain in notoriety outside the company, while at the same time developing the Netflix platform.
Another key element in developer loyalty is working conditions. The internet provides ample descriptions of the extent to which Web Giants are willing to go to provide a pleasant workplace. The conditions are strikingly different from what one finds in most Tech companies. But that is not all! Netflix, again, has built a culture which strongly focuses on its employees’ autonomy and responsibility. More recently, Valve, a video game publisher, sparked a buzz among developers when they published their Handbook, which describes a work culture that is highly demanding but also propitious to personal fulfillment. 37signals, lastly, with their book Getting Real, lay out their very open practices, often the opposite of what one generally finds in such organizations.
In addition to efforts deployed in recruiting and holding on to programmers, there is also a strong culture of code and software quality. It is this culture that creates the foundations for moving and adapting quickly, all while managing mammoth technological platforms where performance and robustness are crucial. Web Giants are very close to the Software Craftsmanship[1] movement, which promotes a set of values and practices aiming to guarantee top-quality software and to provide as much value as possible to end-users. Within this movement, Google and GitHub have not hesitated to share their coding guidelines[2].
[1] http://manifesto.softwarecraftsmanship.org
[2] http://code.google.com/p/google-styleguide/ and https://github.com/styleguide
How can I make it work for me?
Recruiting: It is important to implement very solid recruitment processes when hiring your programmers. After a first interview to get a sense of the person you wish to recruit, it is essential to have the person code. You can propose a few technical exercises to assess the candidate’s expertise, but it is even more interesting to have them pair-program with one of your developers, to see whether there is a good fit around the project. You can also ask programmers to show their own code, especially what they are most proud of - or most ashamed of. More than the code itself, the discussion around it will yield a wealth of information on the candidate. Also, did they put their code on GitHub? Do they take part in open source projects? If so, you will have representative samples of the code they can produce.
Quality: Offer your developers the context which will allow them to continue producing top-quality software (since that is non-negotiable). Leave them time to write unit tests, to set up the development build you will need for Continuous Deployment (cf. “Continuous Deployment“, p. 105), to work in pairs, to hold design workshops in their business domain, to prototype. The practice which is known to have the most impact on quality is peer code reviewing. This happens all too rarely in our sector.
R&D: Giving your developers the chance to participate in R&D projects in addition to their work is a practice which can be highly profitable. It can generate innovation, contribute to project improvement and, in the case of Open Source, increase your company’s attractiveness for developers. It is also simply a source of motivation for this often neglected group. More and more firms are adopting the principle of Hackathons, popularized by Facebook, which consists in producing working software in one or two days.
Training: Training can be outsourced, but you can also profit from knowledge sharing among in-house developers by e.g. organizing group programming workshops, commonly called “Dojos“.[3] Developers can gather for half a day, around a video projector, to share knowledge and together learn about specific technical issues. It is also a way to share development practices and, within a team, to align on programming standards. Lastly, working on open source projects is also a way of learning about new technologies.
Workplace: Where and how you work are important! Allowing independence, promoting openness and transparency, welcoming mistakes and keeping a sustainable rhythm are all practices that pay off in the long term.
Associated patterns
Pattern “Pizza Teams“, p. 59.
Pattern “DevOps“, p. 71.
Pattern “Continuous Deployment“, p. 105.
Sources
• Company culture at Netflix:
> http://www.slideshare.net/reed2001/culture-1798664
• What every good programmer should know:> http://www.slideshare.net/petegoodliffe/becoming-a-better-programmer
• List of all the programmer positions currently open at Facebook:> http://www.facebook.com/careers/teams/engineering
• The highest salary at Facebook? Senior Software Engineer:
> http://www.businessinsider.com/the-highest-paying-jobs-at-facebook-ranked-2012-5?op=1
[3] http://codingdojo.org/cgi-bin/wiki.pl?WhatIsCodingDojo
• GitHub programming guidelines:> https://github.com/styleguide
• How GitHub grows:> http://zachholman.com/talk/scaling-github
• Open source contributions from Netflix:> http://techblog.netflix.com/2012/07/open-source-at-netflix-by-ruslan.html
• The FizzBuzz test:> http://c2.com/cgi/wiki?FizzBuzzTest
• Getting Real:> http://gettingreal.37signals.com/GR_fra.php
• The Software Craftsmanship manifesto:> http://manifesto.softwarecraftsmanship.org
• The Google blog on tests:> http://googletesting.blogspot.fr
• The Happy Manifesto:> http://www.happy.co.uk/wp-content/uploads/Happy-Manifesto1.pdf
Open Source Contribution
Description
Why is it that Web Giants such as Facebook, Google and Twitter do so much to develop Open Source?
A technological edge is a key to conquering the Web. Whether it be to stand out from the competition by launching new services (remember when Gmail came out with all its storage space at a time when Hotmail was lording it?) or more practically to overcome inherent constraints such as the growth challenge linked to the expansion of their user base. On numerous occasions, Web Giants have pulled through by inventing new technologies.
One would therefore expect their technological mastery, and the asset their code represents, to be carefully shielded from prying eyes. In fact, the widely shared pattern is quite the opposite: Web Giants are not only major consumers of open source technology, they are also major contributors.
The pattern “developing open source“ consists of making public a software tool (library, framework...) developed and used in-house. The code is made available on a public server such as GitHub, with a free license of the Apache type for example, authorizing its use and adaptation by other companies. In this way, the code is potentially open to development by the entire world. Moreover, open source applications are traditionally accompanied by much publicity on the web and during programming conferences.
Who makes it work for them?
There are many examples. Among the most representative is Facebook and its Cassandra database, built to manage massive quantities of data distributed over several servers. It is interesting to note that among current users of Cassandra, one finds other Web Giants, e.g. Twitter and Digg, whereas Facebook has abandoned Cassandra in favor of another open source storage solution - HBase - launched by the company Powerset. With the NoSQL movement, the new foundations of the Web are today massively based on the technologies of the Giants.
Facebook has furthermore opened several frameworks up to the community, such as its HipHop engine which compiles PHP into C++, Thrift, a cross-language services framework, and Open Compute, an open hardware initiative which aims to optimize how datacenters function. But Facebook is not alone.
Google has done the same with its user interface framework GWT, used notably in AdWords. Another example is the Tesseract Optical Character Recognition (OCR) tool, initially developed by HP and then by Google, which opened it up to the community a few years later. Lastly, one cannot name Google without citing Android, its open source operating system for mobile devices, not to mention its numerous scientific publications on storing and processing massive quantities of data. We are referring more particularly to the papers on BigTable and MapReduce which inspired the Hadoop project.
The list could go on and on, so we will end with Twitter and Bootstrap, its very trendy CSS framework for responsive design, and with the excellent Ruby on Rails, extracted from the Basecamp project management software and opened up to the community by 37signals.
Why does it work?
Putting aside ideological considerations, we propose to explore various advantages to be drawn from developing open software.
Open and free does not necessarily equate with price and profit wars. In fact, from one angle, opening up software is a way of nipping competition in the bud for specific technologies. Contributing to Open Source is a way of redefining a given technology sector while ensuring sway over the best available solution. For a long time, Google was the main sponsor of the Mozilla Foundation and its flagship project Firefox, to the tune of 80%: a way to diversify against Microsoft. Let us now turn to our analysis of the three advantages.
Promoting the brand
By opening cutting-edge technology up to the community, Web Giants position themselves as leaders, pioneers. It implicitly communicates a spirit of innovation reigning in their halls, a constant quest for improvements. They show themselves as being able to solve big problems, masters of technological prowess. Delivering a successful Open Source framework says that you solved a common problem faster or better than anyone else. And that, in a way, the problem is now behind you. Done and gone, you’re already moving onto the next. One step ahead of the game.
To share a framework is to make a strong statement, to reinforce the brand. It is a way to communicate an implicit and primal message: “We are the best, don’t you worry.“
And then, to avoid being seen as the new Big Brother, one can’t help feeling that the implied message is also: “We’re open, we’re good guys, fear not“.[2]
Attracting - and keeping - the best
This is an essential aspect which can be fostered by an open source approach. Because “displaying your code“ means showing part of your DNA, your way of thinking, of solving problems - show me your code and I will tell you who you are. It is the natural way of publicizing what exactly goes on in your company: the expertise of your programmers, your quality standards, what your teams work on day by day... A good means to attract “compatible“ coders who will already have been following your company’s projects.
Developing Open Source thus helps you to spot the most dedicated, competent and motivated programmers, and when you hire them you are already sure they will integrate easily into your ecosystem. In a manner of speaking, Open Source is like a huge trial period, open to all.
[2] Google’s motto: “Don’t be evil“
Attracting the best geeks is one thing, hanging on to them is another. On this point, Open Source can be a great way to offer your company’s best programmers a showcase demonstration open to the whole world.
That way they can show their brilliance, within their company and beyond. Promoting Open Source bolsters your programmers’ resumes. It takes into account the Personal Branding needs of your staff, while keeping them happy at work. All programmers want to work in a place where programming is important, within an environment which offers a career path for software engineers. Spoken as a programmer.
Improving quality
Simply “thinking open source“ is already a leap forward in quality: opening up code - a framework - to the community first entails defining its contours, naming it, describing the framework and its aim. That alone is a significant step towards improving the quality of your software because it inevitably leads to breaking it up into modules, giving it structure. It also makes it easier to reuse the code in-house. It defines accountability within the code and even within teams.
It goes without saying that programmers who are aware that their code will be checked (not to mention read by programmers the world over) will think twice before committing an untested method or a hastily assembled piece of code. Beyond making programmers more responsible, feedback from peers outside the company is always useful.
How can I make it work for me?
When properly used, Open Source can be an intelligent way not only to structure your R&D but also to assess programmer performance.
The goal of this paper was to explore the various advantages offered by opening up certain technologies. If you are not quite up to making the jump culturally speaking, or if your information system is not ready yet, it can nonetheless be useful to test the waters with a few simple-to-implement actions.
Depending on the size of your company, launching your very first Open Source project can unfortunately be met with general indifference. We do not all have the powers of communication of Facebook. Beginning by contributing to Open Source projects already underway can be a good initial step for testing the culture within your teams.
Like Google and GitHub, another action which works towards the three advantages laid out here is to write up your programming guidelines and publish them on the web. Another possibility is to encourage your programmers to open a development blog where they can discuss the main issues they have come up against. The Instagram Engineering blog on Tumblr can be a very good source of inspiration.
Sources
• The Facebook developer portal, Open Source projects:
> http://developers.facebook.com/opensource
• Open-Source Projects Released By Google:> http://code.google.com/opensource/projects.html
• The Twitter developer portal, Open Source projects:> http://dev.twitter.com/opensource/projects
• Instagram Engineering Blog:> http://instagram-engineering.tumblr.com
• The rules for writing GitHub code:> http://github.com/styleguide
• A question on Quora: “Why would a big company do open-source projects?“:
> http://www.quora.com/Open-Source/Why-would-a-big-company-do-open-source-projects
Sharing Economy platforms
Description
The principles at work in the platforms of the sharing economy (exponential business platforms) are one of the keys to the success of the Web Giants and of other startups valued at $1 billion (“unicorns“) such as BlablaCar, Cloudera and Social Finance, or at over $10 billion (“decacorns“) such as Uber, AirBnB, Snapchat and Flipkart (List and valuation of the Uni/Deca-corns). These companies are disrupting existing ecosystems, inventing new ones, and wiping out others. And yet “Businesses never die, only business models evolve“ (to learn more, see Philippe Silberzahn, “Relevez le défi de l’innovation de rupture“).

Concerns over the risks of disintermediation are legitimate given that digital technology has led to the development of numerous highly successful “exponential business platforms“ (see the article by Maurice Levy, “Se faire ubériser“).

The article below begins with a recap of what these platforms have in common and then explores the main fundamentals necessary for building or becoming an exponential business platform.
The wonderful world of the “Sharing economy“
There is a continuous stream of newcomers knocking at the door, progressively transforming many sectors of the economy and driving them towards a so-called “collaborative“ economy. Among other goals, this approach strives to develop a new type of relation: Consumer-to-Consumer (C2C). This is true e.g. in the world of consumer loans, where the company LendingHome (Presentation of LendingHome) is built on peer-2-peer lending. Another area of interest is blockchain technology, with the decentralisation and “peer-2-peer-isation“ of money through Bitcoin! What is most striking is that this type of relation can have an impact in unexpected places such as personalised urban car services (e.g. Luxe and Drop Don't Park) and movers (Lugg as an “Uber/Lyft for moving“).

Business platforms such as these favor peer-2-peer relations. They have achieved exponential growth by leveraging the multitudes (for further information, see Nicolas Colin & Henri Verdier, L'âge de la multitude: Entreprendre et gouverner après la révolution numérique). Such models make it possible for very small structures to grow very quickly by generating revenues per employee which can be from 100 to 1000 times higher than
in businesses working in the same sector but which are much larger. The fundamental question is then to know what has enabled some of them to become hits and to grow their popularity, in terms of both community and revenues. What are the ingredients in the mix, and how does one become so rapidly successful?
At this stage, the contextual elements and common ground we discern are:
An often highly regulated market where these platforms appear and then develop by providing new solutions which break away from regulations (for example the obligation for hotels to make at least 10% of their rooms disability friendly, which does not apply to individuals using the AirBnB system).
An as yet unmet need in supply and demand can make it possible to earn a living or to generate additional revenue for a better quality of life (Cf. AirBnB's 2015 communication campaign on the subject) or at the least to share costs (Blablacar). This point in particular raises crucial questions as to the very notion of work, its regulation and the taxation of platforms.
There is strong friction around the client and citizen experience, where the market has yet to provide a response (such as valet parking services in large cities around the world where parking is completely saturated).
A deliberate strategy to not invest in material assets but rather to efficiently embrace the business of creating links between people.
Given this understanding of the context, the 5 main principles we propose to become an exponential business platform are:
Develop your “network lock-in effect“.
Pair up algorithms with the user experience.
Develop trust.
Think user and be rigorous in execution.
Carefully choose your target when you launch platform experiments.
“Network lock-in effect“
The more supply and demand grow and come together, the more indispensable your platform becomes. Indispensable because in the end that is where the best offers are to be found, the best deals, where your friends are.
There is an inflection point where the network of suppliers and users becomes the main asset, the central pillar. Attracting new users is no longer the principal preoccupation. This asset makes it possible to become the reference platform for your segment. This growth can provide a monopoly over its use case, especially if there are exclusive deals that can be obtained through offers valid on your platform only.
It can then extend to offers which follow upon the first (for example Uber's position as an urban mobility platform has led them to diversify into a meal delivery service for restaurants). This is one of the elements which were very quickly theorised in the Lean Startup approach: the virality coefficient.
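A minimal sketch of what the virality coefficient captures (a simplified discrete model, not a formula from the Lean Startup literature itself; k is taken here as the average number of new users each existing user brings in per cycle):

```python
def users_after(initial_users: float, k: float, cycles: int) -> float:
    """Simplified viral-growth model: each cycle, every current user
    brings in k new users on average. k > 1 yields exponential growth;
    k < 1 means growth stalls without other acquisition channels."""
    users = initial_users
    for _ in range(cycles):
        users *= (1 + k)
    return users

# With k = 1.0 the user base doubles every cycle: 100 -> 200 -> 400 -> 800.
```

The network lock-in effect kicks in once this organic loop, rather than paid acquisition, becomes the main source of growth.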
The perfect match: User eXperience & Algorithms
What is crucial in the platform is setting up the perfect match between supply and demand: speed in making connections in time and/or space, lower prices compared to traditional systems, and even services that weren't possible before. For some platforms, the matching algorithms are the core of the operation, delivering on the daily promise of suggesting relevant connections within a few micro-seconds.
The perfect match is a fine-tuned mix between stellar research into the user experience (all the way to the swipe!), often with a mobile-first approach to exploring and offering services, built on advanced algorithms that surface relevant associations. A telling example is the swipe gesture, a uniquely tailored user experience for fast browsing, as in the dating app Tinder.
Trust & security
To get beyond the early adopters and reach the market majority, two elements are critical to the client experience: trust in the platform, and trust towards the other platform users (both consumers and providers).
Who has not experienced stress when reserving one's first AirBnB? Who has not wondered whether Uber would actually be there?
This level of trust conveyed by the platform and its users is so important that it has been one of the key leverage points, as with BlablaCar, whose ride-sharing platform thrived once transactions were handled by the platform itself.
What happens to the confidential data provided to the platform?
You may remember the recent hack of personal data on the “Ashley Madison“ site, affecting 37 million platform users who wanted total discretion (Revelations around the hacking of the Ashley Madison site). Security is thus key to protecting platform transactions, guaranteeing private data and reassuring users.
Think user & excel in execution
Above all it is about realising that what the market and the clients want is not to be found in marketing plans, sales forecasts and key functionalities. The main questions to ask revolve around the triplet Client / Problem / Solution: Do I really have a problem that is worth solving? Is my solution the right one for my client? Will my client buy it? For how much? Use whatever you can to check your hypotheses: interviews, market studies, prototypes...
To succeed, these platforms aim to reach production very quickly, iterating and improving while their competition is still exploring their business plan. It is then a ferocious race between pioneers and copycats, because in this type of race “winner takes all“ (For further reading, see The Second Machine Age, Erik Brynjolfsson & Andrew Mcafee).
Then excellence in execution becomes the other pillar. This operational excellence covers:
the platform itself and the users it “hosts“: active users, quality of the goods on offer, numerous well-rated offers...

the offers mediated by the platform (comments, satisfaction surveys...).
Note in particular the example of AirBnB on the theme of excellence in execution beyond software: the quality of the lodging descriptions, as well as beautiful photos, were a strong differentiator compared to the competition of the time (Craigslist) (A few words on the quality of the photos at AirBnB).
Critical market size
Critical market size is one of the elements which make it possible to rapidly reach a sufficiently strong network effect (speed in reaching a critical size is fundamental to not being overrun by copycats).
Critical market size is made up of two aspects:
Selecting the primary territories for deployment, most often in cities or mega-cities,
Ensuring deployment in other cities in the area, when possible in standardized regulatory contexts.
You must therefore choose cities particularly concerned by your platform's value proposition, where the number of early adopters is high enough to quickly gain traction. Mega-cities in the Americas, Europe and Asia are therefore choice targets for experimental deployments.
Lastly, during the generalisation phase, it is no surprise to see stakeholders deploying massively in the USA (a market which represents 350 million inhabitants, with standardised tax and regulatory environments, despite state and federal differences) or in China (where the Web giants are among the most impressive players, such as: Alibaba, Tencent and Weibo) as well as Russia.
In Europe, cities such as Paris, Barcelona, London, Berlin, etc. are often prime choices for businesses.
What makes it work for them?
As examined above, there are many ingredients for exponentially scalable organizations and business models on the platform model: strong possibilities for employees to self-organise, the User eXperience, continuous experimentation... algorithms (notably intelligent matchmaking), and leveraging one's community.
What about me?

For IT and marketing departments, you can begin your thinking by exploring digital innovations (looking for new uses) that fit in with your business culture (based e.g. on Design Thinking).
In certain domains, this approach can give you access to new markets or to disruption before the competition. A recent example is that of Accor which has entered the market of independent hotels through its acquisition of Fastbooking (Accor gets its hands on Fastbooking).
Still in the area of self-disruption, two main strategies are coming to the fore. The first consists in getting back into the game without shouldering all of the risk, through partnerships or capital investments via incubators. The other strategy, more ambitious and therefore riskier, is to take inspiration from these new approaches to transform the organization from within.
It is then important to examine whether some of these processes can be opened up to transform them into an open platform, thereby leveraging the multitudes.
In the distribution sector for example, the question of positioning and opening up various strategic processes arises: is it a good idea to turn your supply chain into a peer-to-peer platform so that SMEs can become consumers, and not only providers, in the supply chain? Are pharmacies next on the list of programmed uberisations, through stakeholders such as 1001pharmacie.com? In the medical domain, Doctolib.com has just raised €18 million to ensure its development (Doctolib raises funds)...
Associated patterns
Enhancing the user experience
A/B Testing
Feature Flipping
Lean Startup
Sources

• List of unicorns:
> https://www.cbinsights.com/research-unicorn-companies

• Philippe Silberzahn, “Relevez le défi de l’innovation de rupture“, éditions Pearson.

• Article by Maurice Lévy, “Tout le monde a peur de se faire ubériser“:
> http://www.latribune.fr/technos-medias/20141217tribd1e82ceae/tout-le-monde-a-peur-de-se-faire-uberiser-maurice-levy.html

• LendingHome presented on “C’est pas mon idée“:
> http://cestpasmonidee.blogspot.fr/2015/09/lendinghome-part-lassaut-du-credit.html

• Nicolas Colin & Henri Verdier, “L’âge de la multitude“, 2nd edition.

• Ashley Madison hacking:
> http://www.slate.fr/story/104559/ashley-madison-site-rencontres-extraconjugales-hack-adultere

• Erik Brynjolfsson, “The Second Machine Age“.

• Quality of AirBnB photos:
> https://growthhackers.com/growth-studies/airbnb

• “Accor met la main sur Fastbooking“:
> http://www.lesechos.fr/17/04/2015/lesechos.fr/02115417027_accor-met-la-main-sur-fastbooking.htm

• Doctolib raises €18M:
> http://www.zdnet.fr/actualites/doctolib-nouvelle-levee-de-fonds-a-18-millions-d-euros-39826390.htm
Organization
Pizza Teams..................................................................................... 59
Feature Teams ................................................................................ 65
DevOps .......................................................................................... 71
Pizza Teams
Description
What is the right size for a team to develop great software?
Organizational studies have been investigating the issue of team size for several years now. Although answers differ and seem to depend on various criteria such as the nature of tasks to be carried out, the average level, and team diversity, there is consensus on a size of between 5 and 15 members.[1][5] Any fewer than 5 and the team is vulnerable to outside events and lacks creativity. Any more than 12 and communication is less efficient, coherency is lost, there is an increase in free-riding and in power struggles, and the team’s performance drops rapidly the more members there are.
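One way to see why the studies converge on small teams: the number of pairwise communication channels grows quadratically with headcount, a point popularized by Fred Brooks. A quick illustration (illustrative arithmetic only):

```python
# Pairwise communication channels in a team of n people: n * (n - 1) / 2.
# Beyond a dozen members, coordination paths dwarf the headcount itself.

def channels(n):
    return n * (n - 1) // 2

sizes = {n: channels(n) for n in (5, 8, 12, 15)}
# 5 people -> 10 channels; 8 -> 28; 12 -> 66; 15 -> 105
```

Hence the two-pizza heuristic below: a team of about 8 keeps the channel count manageable.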
This is obviously also true in IT. The firm Quantitative Software Management, which specializes in the preservation and analysis of metrics from IT projects, has published some interesting statistics. If you like numbers, I highly recommend their Web site: it is chock-full of information! Based on a sample of 491 projects, QSM measured a loss of productivity and heightened variability as team size increases, with a quite clear break once one reaches 7 people. In correlation, average project duration increases, and development effort skyrockets beyond 15 people.[6]
In a nutshell: if you want speed and quality, cut your team size!
Why are we mentioning such matters in this work devoted to Web Giants? Very simply because they are particularly aware of the importance of team size for project success, and daily deploy techniques to keep size down.
[1] http://knowledge.wharton.upenn.edu/article.cfm?articleid=1501
[2] http://www.projectsatwork.com/article.cfm?ID=227526
[3] http://www.teambuildingportal.com/articles/systems/teamperformance-teamsize
[4] http://math.arizona.edu/~lega/485-585/Group_Dynamics_RV.pdf
[5] http://www.articlesnatch.com/Article/What-Project-Team-Size-Is-Best-/589717
[6] http://www.qsm.com/process_improvement_01.html
In fact the title of this chapter is inspired by the name Amazon gave to this practice:[7] if your team can’t be fed on two pizzas, it is too big: cut people. Admittedly these are American-size pizzas, but that still means about 8 people. Werner Vogels (Amazon VP and CTO) drove the point home with the following quote, which could almost be by Nietzsche:
Small teams are holy.
But Amazon is not alone, far from it.
To illustrate the importance that team dynamics have for Web Giants: Google hired Evan Wittenberg to be manager of Global Leadership Development; the former academic was known, in part, for his work on team size.
The same discipline is applied at Yahoo!, which limits its product teams to between 5 and 10 people in their first year. As for Viadeo, they have adopted the French pizza-size approach, with teams of 5-6 people. In the field of startups, Instagram, Dropbox, Evernote... are known for having kept their development teams as small as possible for as long as possible.
How can I make it work for me?

A small, agile team will always be more efficient than a big, sluggish one: such is the conclusion to be drawn from the accumulated literature on team size.
In the end, you only need to remember it to apply it... and to steer away from linear logic such as: “to go twice as fast, all you need is double the people!“ Nothing could be more wrong!
According to these studies, a team exceeding 15 people should set alarm bells ringing.[8][10]
[7] http://www.fastcompany.com/magazine/85/bezos_4.html
[8] https://speakerdeck.com/u/searls/p/the-mythical-team-month
[9] http://www.3circlepartners.com/news/team-size-matters
[10] http://37signals.com/svn/posts/995-if-youre-working-in-a-big-group-youre-fighting-human-nature
You then have two options:
Fight tooth and nail to prevent the team from growing and, if that fails, adopt the second option;

Split the team up into smaller teams. But think very carefully before you do so, and bear in mind that a team is a group of people motivated by a common goal. Which is the subject of the following chapter, “Feature Teams“.
Feature Teams
Description

In the preceding chapter, we saw that Web Giants pay careful attention to the size of their teams. That is not all they pay attention to concerning teams, however: they also often organize them around functionalities, in what are known as “feature teams“.

A small and versatile team is a key to moving swiftly, and most Web Giants resist multiplying the number of teams devoted to a single product as much as possible.
However, when a product is a hit, a dozen people no longer suffice to scale it up. Even in such a case, team size must remain small to ensure coherence; it is therefore the number of teams which must be increased. This raises the question of how to delimit the perimeter of each.
There are two main options:[1]
Segmenting into “technological“ layers.
Segmenting according to “functionality thread“.
By “functionality thread“ we mean being in a position to deliver independent functionalities from beginning to end, to provide a service to the end user.
In contrast, one can also divide teams along technological layers, with one team per type of technology: typically, the presentation layer, business layer, horizontal foundations, database...
This is generally the organization structure adopted in Information Departments, each group working within its own specialty.
However, whenever Time To Market becomes crucial, organization into technological layers, also known as Component Teams, begins to show its limitations. This is because Time To Market crunches often necessitate Agile or Lean approaches. This means specification, development, and production with the shortest possible cycles, if not on the fly.
[1] There are in truth other possible groupings, e.g. by release, geographic area, user segment or product family. But that would be beyond the scope of the work here; some of the options are dead ends, others can be assimilated to functionality thread divisions.
The trouble with Component Teams is you often find yourself with bottlenecks.

Let us take the example laid out in Figure 1.

Figure 1. Component teams (Front, Back, Exchange and Base) each contributing pieces of functionalities 1 to 5.
The red arrows indicate the first problem. The most important functionalities (functionality 1) are swamping the Front team. The other teams are left producing marginal elements for these functionalities, but nothing can be released until the Front team has finished. There is not much the other teams can do to help (not sharing the same specialty), so they are left twiddling their thumbs or stockpiling less important functionalities (and don’t forget that in Lean, inventory is waste...).
There is worse. Functionality 4 needs all four teams to work together. The trouble is that, in Agile mode, each team individually carries out its own detailed analysis, whereas what is needed here is a detailed impact analysis across the 4 teams. This means that the detailed analysis has to take place upstream, which is precisely what Agile strives to avoid. Similarly, downstream, the work of the 4 teams has to be synchronized for testing, which means waiting for laggards. To limit the impact, task priorities have to be defined for each team in a centralized manner. And little by little, you find yourself with a scheduling department striving to synchronize all the work as best it can, but leaving no room for team autonomy.
In short, you have a waterfall effect upstream in analysis and planning, and a waterfall effect downstream in testing and deployment to production. This type of dynamic is very well described in the work of Craig Larman and Bas Vodde, Scaling Lean & Agile Development.
Feature teams can correct these errors: with each team working on a coherent functional subset - and doing so without having to think about the technology - they are capable of delivering value to the end client at any moment, with little need to call on other teams. This entails having all necessary skills for producing functionalities in each team, which can mean (among others) an architect, an interface specialist, a Web developer, a Java developer, a database expert, and, yes, even someone to run it... because when taken to the extreme, you end up with the DevOps “you build it, you run it“, as described in the next chapter (cf. “DevOps“, p. 71).
But then how do you ensure the technological coherence of the product, if each Java expert in each feature team takes the decisions within their perimeter? This issue is addressed by the principle of community of practice. Peers from each type of specialty get together at regular intervals to exchange on their practices and to agree on technological strategies for the product being produced.
Feature Teams have the added advantage that teams quickly build up business knowledge, which in turn fosters the developers’ commitment to the quality of the final product.
In practice, the method is of course messier than what we have laid out here: defining perimeters is no easy task, team dynamics can be complicated, communities of practice must be fostered... Despite the challenges, this organization method brings true benefits compared to hierarchical structures, and is much more effective and agile.
To come back to our Web Giants, this is the type of organization they tend to favor. Facebook in particular, which communicates a lot about its culture, focuses on teams which bring together all the talents necessary to create a functionality.[2]
[2] http://www.time.com/time/specials/packages/article/0,28804,2036683_2037109_2037111,00.html
It is also the type of structure that Viadeo, Yahoo! and Microsoft[3] have chosen to develop their products.
How can I make it work for me?

Web Giants are not alone in applying the principles of Feature Teams. It is an approach also often adopted by software publishers.
Moreover, Agile is spreading throughout our Information Departments and is starting to be applied to bigger and bigger projects. Once your project reaches a certain size (3-4 teams), Feature Teams are the most effective answer, to the point where some Information Departments naturally turn to that type of pattern.[4]
[3] Michael A. Cusumano and Richard W. Selby. 1997. How Microsoft builds software. Commun. ACM 40, 6 (June 1997), 53-61: http://doi.acm.org/10.1145/255656.255698
[4] http://blog.octo.com/compte-rendu-du-petit-dejeuner-organise-par-octo-et-strator-retour-dexperience-lagilite-a-grande-echelle (French only).
DevOps
Description

The “DevOps“ movement is a call to rethink the divisions common in our organizations, separating development on one hand, i.e. those who write application code (“Devs“), from operations on the other, i.e. those who deploy and run the applications (“Ops“).
Such thinking is certainly as old as IT departments themselves, but it has found renewed life thanks notably to two groups. First there are the agilists, who have removed constraints on the development side and are now capable of providing highly valued software to their clients on a much more frequent basis. Then there are the infrastructure experts and “Prod“ managers at the Web Giants (Amazon, Facebook, LinkedIn...), who have shared their experiences of how they manage the Dev vs. Ops divide.
Beyond the intellectual beauty of the exercise, DevOps is mainly (if not entirely) geared toward reducing the Time To Market (TTM). Obviously, there are other positive effects, but the main priority, all things considered, is this TTM (hardly surprising in the Web industry).
Dev & Ops: differing local concerns but a common goal
Organizational divides notwithstanding, the preoccupations of Development and Operations are indeed distinct and equally laudable:
Figure 1. The “wall of confusion“ between Dev and Ops: local targets and different cultures. Dev seeks to innovate and deliver new functionalities (with quality), in a product culture (software); Ops seeks to rationalize and guarantee that applications run (stability), in a service culture (archiving, supervision, etc.).
Software development seeks heightened responsiveness (notably under pressure from the business and the market): it has to move fast, add new functionalities, reorient work, refactor, upgrade frameworks, deploy for testing across all environments... The very nature of software is to be flexible and adaptable.
In contrast, Operations need stability and standardization.
Stability, because it is often difficult to anticipate the impacts of a given modification to the code, architecture or infrastructure. Switching from a local disk to a storage array can impact response times; a change in code can heavily impact CPU activity, leading to difficulties in capacity planning.
Standardization, because Operations seek to ensure that certain rules (equipment configuration, software versions, network security, log file configuration...) are uniformly followed to ensure the quality of service of the infrastructure.
And yet both groups, Devs and Ops, have a shared objective: to make the system work for the client.
DevOps: capitalizing on Agility
Agility became a buzzword somewhat over ten years ago, its main objective being to reduce constraints in development processes.
The Agile method introduced the notions of “short cycle“, “user feedback“, “Product owner“, i.e. a person in charge of managing the roadmap, setting priorities, etc.
Agility also shook up traditional management structures by including cross-silo teams (developers and operators) and played havoc with administrative departments.
Today, where those barriers have been removed, software development is most often carried out in one- to two-week cycles. The business sees the software evolve during the construction phase.

It is now time to bring people from operations into the following phases:
Provisioning / spinning up environments: in most firms, setting up an environment can take from one to four months (even though environments are now virtualized). This is surprisingly long, especially when the challengers are Amazon or Google.
Deployment: this is without doubt the phase where problems come to a head, as it creates the most instability; agile teams sometimes limit themselves to one deployment per quarter to limit the impacts on production. In order to guarantee system stability, these deployments are often carried out manually; they are therefore lengthy and can introduce errors. In short, they are risky.
Incident resolution and meeting non-functional needs: Production is the other software user. Diagnosis must be fast, the problems and resilience stakes must be explained, and robustness must be taken into account.
DevOps is organized around 3 pillars: infrastructure as code (IaC), continuous delivery, and a culture of cooperation
1. “Infrastructure as Code“ or how to reduce provisioning and environment deployment delays
One of the most visible friction points is in the lack of collaboration between Dev and Ops in deployment phases. Furthermore this is the activity which consumes the most resources: half of production time is thus taken up by deployment and incident management.
Figure 2. Source: Study by Deepak Patil (Microsoft Global Foundation Services) in 2006, via a presentation modified by James Hamilton (Amazon Web Services), http://mvdirona.com/jrh/TalksAndPapers/JamesHamilton_POA20090226.pdf
And although it is difficult to establish general rules, it is highly likely that part of this cost (the 31% segment) could be reduced by automating deployment.

There are many reliable tools available today to automate provisioning and deployment of new environments, ranging from setting up Virtual Machines to software deployment and system configuration.

Figure 3. Classification of the main tools (October 2012):

Bootstrapping - VM instantiation / OS installation (installation of the Operating System): OpenStack, VMWare vCloud, OpenNebula.

System Configuration - deploying and installing the services required for application execution (JVM, application servers...) and configuring these services (logs, ports, rights, etc.): Chef, Puppet, CFEngine.

Command and control - Application Service Orchestration: deploying application code to services (war, php source, ruby...), RDBMS deployment: Capistrano, custom scripts (shell, python…).

The CMDB must reflect both the target configuration and the real-world system configuration.
These tools (each with its own language) can be used to code the infrastructure: to install and configure an HTTP service for the applications, to create the directory trees for log files... The services provided and the associated gains are many:

Guaranteeing replicable and reliable processes (no user interaction, thus removing a source of errors), notably through their capacity to manage versions and rollback operations.
Productivity. One-click deployment rather than a set of manual tasks, thus saving time.
Traceability to quickly understand and explain any failures.
Reducing Time To Recovery: in a worst-case scenario, the infrastructure can be recreated from scratch, which is highly useful in terms of recovery. In keeping with ideas stemming from Recovery Oriented Architecture, resilience can be addressed either by attempting to prevent systems from failing, working on the MTBF (Mean Time Between Failures), or by accelerating repairs, working on the MTTR (Mean Time To Recovery). The second approach, although not always possible to implement, is the least costly. It is also useful in organizations where many environments are needed: such environments are often kept permanently available yet little used, because setting them up takes too long.
Automation is furthermore a way of initiating a change in the collaboration culture between Dev and Ops. This is because automation increases the possibilities for self-service for Dev teams, at the very least over the pre-production environments.
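As a minimal sketch of the principle behind such tools (illustrative Python, not the API of Puppet, Chef or any real tool): the desired state of a machine is declared as data, and an idempotent apply step converges the actual state toward it. Version the declaration and you get replicable runs and tractable rollbacks.

```python
# "Infrastructure as Code" in miniature: desired state is data, and an
# idempotent apply() converges the real state toward it. Resource names
# below are illustrative only.

DESIRED = {
    "packages": {"nginx", "openjdk-11"},
    "directories": {"/var/log/myapp"},
    "services": {"nginx": "running"},
}

def apply(desired, actual):
    """Return the actions needed to converge `actual` toward `desired`,
    mutating `actual` as each action is carried out."""
    actions = []
    for pkg in desired["packages"] - actual["packages"]:
        actions.append(f"install {pkg}")
        actual["packages"].add(pkg)
    for d in desired["directories"] - actual["directories"]:
        actions.append(f"mkdir {d}")
        actual["directories"].add(d)
    for svc, state in desired["services"].items():
        if actual["services"].get(svc) != state:
            actions.append(f"set {svc} -> {state}")
            actual["services"][svc] = state
    return actions

actual = {"packages": {"openjdk-11"}, "directories": set(), "services": {}}
first = apply(DESIRED, actual)   # converges: installs, creates, starts
second = apply(DESIRED, actual)  # already converged: nothing to do
```

Running apply a second time yields no actions: this idempotence is what makes automated provisioning safe to re-run, and what the real tools provide at scale.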
2. Continuous Delivery
Traditionally, in our organizations, the split between Dev and Ops comes to a head during deployment phases, when development hands over (or rather tosses over the wall) its code, which then continues on its long way through the production process.
The following quote from Mary and Tom Poppendieck[1] puts the problem in a nutshell:
How long would it take your organization to deploy a change that involves just one single line of code?
The answer is of course not obvious, but in the end it is here that objectives diverge the most. Development seeks control over part of the infrastructure, for rapid, on-demand deployment to all environments. In contrast, production must see to making environments available, rationalizing costs, and allocating resources (bandwidth, CPU...).
[1] Mary and Tom Poppendieck, Implementing Lean Software Development: From Concept to Cash, Addison-Wesley, 2006.
Ironically, the less one deploys, the more the TTR (Time To Repair) increases, thereby reducing the quality of service to the end client.
Figure 4. Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
In other words, the more changes there are between releases (i.e. the higher the number of changes to the code), the lower the capacity to rapidly fix bugs following deployment, thus increasing TTR - this is the instability ever-dreaded by Ops.
Here again, addressing such waste can reduce the time taken up by Incident Management, as shown in Figure 2.
Figure 5. Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
[Graph: “Size of Deploy vs Incident TTR“ - units of changed code per deploy plotted against TTR in minutes for Sev 1 and Sev 2 incidents. Huge changesets deployed rarely mean a high TTR; tiny changesets deployed often mean a low TTR.]
To finish, Figure 5, taken from a Flickr study, shows how TTR (and therefore the severity of incidents) correlates with the amount of code deployed (and therefore the number of changes to the code).
However, continuous deployment is not easy and requires:
Automation of the deployment and provisioning processes: Infrastructure as Code.

Automation of the software construction and deployment processes. Build automation becomes the construction chain which carries the software from source control to the various environments where it will be deployed. A new build system is thus necessary, including environment management, workflow management to compile source code into binaries more quickly, the creation of documentation and release notes to swiftly understand and fix any failures, and the capacity to distribute testing across agents to reduce delays, all while guaranteeing short cycle times.

Taking these factors into account at the architecture level, and above all respecting the following principle: decouple functionality deployment from code deployment, using patterns such as Feature Flipping (cf. Feature Flipping, p. 113) and dark launches. This of course entails a new level of complexity, but offers the flexibility necessary for this type of continuous deployment.

A culture of measurement with user-oriented metrics. This is not only about measuring CPU consumption; it is also about correlating business and application metrics to understand and anticipate system behavior.
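The decoupling principle above can be made concrete with a minimal sketch of feature flipping (illustrative Python; the flag name and in-memory store are hypothetical - real systems read flags from configuration or a datastore): the code for a feature ships to production dormant and is activated later by flipping a flag, without redeploying.

```python
# Feature flipping in miniature: deployed code contains both paths, and a
# flag (not a redeploy) decides which one runs. Names are illustrative.

FLAGS = {"new_checkout": False}  # shipped off by default

def is_enabled(flag):
    return FLAGS.get(flag, False)

def checkout(cart):
    if is_enabled("new_checkout"):
        return f"new checkout flow ({len(cart)} items)"
    return f"legacy checkout flow ({len(cart)} items)"

before = checkout(["book", "pen"])   # flag off: legacy path
FLAGS["new_checkout"] = True         # activation is a flag flip, not a deploy
after = checkout(["book", "pen"])    # same deployed code, new path
```

This is what lets code be deployed often and in small batches while functionality is released on its own schedule (and rolled back instantly by flipping the flag off).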
3. A culture of collaboration if not an organizational model
These two practices, Infrastructure as Code and Continuous Delivery, can be implemented in traditional organizations (Infrastructure as Code on the Ops side, Continuous Delivery on the Dev side). However, even once development and production each reach a good level of maturity and their local optimum, both will remain hampered by the organizational divide.
This is where the third pillar comes into its own: a culture of collaboration, nay cooperation, with all teams becoming more autonomous rather than throwing problems at each other across the production process. This can mean, for example, giving Dev access to machine logs, providing them with the previous day’s production data so that they can set up the integration environments themselves, or opening up the metrics and monitoring tools (or even displaying the metrics in open spaces)... This brings that much more flexibility to Dev and shares responsibility and information on “what happens in Prod“ - tasks with little added value that Ops would no longer have to shoulder.
The main cultural elements around DevOps could be summarized as follows:
Sharing both technical metrics (response times, number of backups...) as well as business metrics (changes in generated profits...)
Ops is also the software client. This can mean making changes to the software architecture and developments to more easily integrate monitoring tools, to have relevant and useful log files, to help diagnosis (and reduce the TTD, Time To Diagnose). To go further, certain Ops needs should be expressed as user stories in the backlog.
A lean approach [http://blog.octo.com/tag/lean/] (French only), and post-mortems which focus on root causes (the 5 whys) and on implementing countermeasures.
It remains, however, that in this model the existing zones of responsibility (development, software monitoring, datacenter operation, support) are somewhat modified.
Traditional firms give the project team priority. In this model, deployment processes, software monitoring and datacenter management are spread out across several organizations.
Figure 6: Project teams - the software production flow passes from Project Teams through Application Management, Technical Management and the Service Desk before reaching Users. (Source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified.)

Inversely, some stakeholders (especially Amazon) have taken this model very far by proposing multidisciplinary teams in charge of ensuring the service functions - from the client’s perspective (cf. Feature Teams, p. 65). You build it, you run it. In other words, each team is responsible for the business, from Dev to Ops.

Figure 7: Product team - You build it, you run it. A single team owns the business and the software production flow, from products/services (build & run) through production (Service Desk, Infrastructure) to Users. (Source: Cutter IT Journal, Vol. 24, No. 8, August 2011, modified.)
Moreover it is within this type of organization that the notion of self-service takes on a different and fundamental meaning. One then sees one team managing the software and its use and another team in charge of datacenters. The dividing line is farther “upstream“ than is usual, which allows scaling up and ensuring a balance between agility and cost rationalization (e.g. linked to the datacenter architecture). The AWS Cloud is probably the result of this... It is something else altogether, but imagine an organization with product teams and production teams who would jointly offer services (in the sense of ITIL) such as AWS or Google App Engine...
Conclusion

DevOps is thus nothing more than a set of practices to leverage improvements around:
Tools to industrialize the infrastructure and reassure production as to how the infrastructure is used by development. Self-service is a concept hardwired into the Cloud. Public Cloud offerings are mature on the subject, but some offerings (for example VMWare’s) aim to reproduce the same approach internally. Without necessarily reaching such levels of maturity, one can imagine using tools like Puppet, Chef or CFEngine...
Architecture which makes it possible to decouple deployment cycles, to deploy code without deploying all functionalities… (cf. Feature flipping, p. 113 and Continuous Deployment, p.105).
Organizational methods, leading to implementation of Amazon’s “Pizza teams“ patterns (cf. Pizza Teams, p. 59) and You build it, you run it.
Processes and methodologies to render all these exchanges more fluid. How to deploy more often? How to limit risks when deploying progressively? How to apply the “flow“ lessons from Kanban to production? How to rethink the communication and coordination mechanisms at work along the development/operations divide?
In sum, these four strands make it possible to reach the DevOps goals: improve collaboration, trust and objective alignment between development and operations, giving priority to addressing the stickiest issues, summarized in Figure 8.
Figure 8. The DevOps goals: Infrastructure as Code brings faster provisioning and increased deployment reliability; Continuous Delivery brings an improved TTM and faster incident resolution (MTTR); the culture of collaboration drives continuous improvement. Together they yield improved quality of service and operational efficiency.
Sources

• White paper on the DevOps Revolution:
> http://www.cutter.com/offers/devopsrevolution.html
• Wikipedia article:> http://en.wikipedia.org/wiki/DevOps
• Flickr Presentation at the Velocity 2009 conference:> http://velocityconference.blip.tv/file/2284377/
• Definition of DevOps by Damon Edwards:> http://dev2ops.org/blog/2010/2/22/what-is-devops.html
• Article by John Allspaw on DevOps:
> http://www.kitchensoap.com/2009/12/12/devops-cooperation-doesnt-just-happen-with-deployment/

• Article on the share of deployment activities in Operations:
> http://dev2ops.org/blog/2010/4/7/why-so-many-devopsconversations-focus-on-deployment.html

• USI 2009 (French only):
> http://www.usievents.com/fr/conferences/4-usi-2009/sessions/797-quelques-idees-issues-des-grands-du-web-pour-remettre-en-cause-vos-reflexes-d-architectes#webcast_autoplay
Practices
Lean Startup ................................................................................... 87
Minimum Viable Product ................................................................. 95
Continuous Deployment................................................................ 105
Feature Flipping ............................................................................ 113
Test A/B ........................................................................................ 123
Design Thinking ............................................................................ 129
Device Agnostic ............................................................................ 143
Perpetual beta .............................................................................. 151
Lean Startup
Description

Creating a product is a very perilous undertaking. Figures show that 95% of all products and startups perish for want of clients. Lean Startup is an approach to product creation designed to reduce risks and the impact of failures by tackling organizational, business and technical aspects in parallel, through aggressive iteration. It was formalized by Eric Ries and was strongly inspired by Steve Blank’s Customer Development.
Build – Measure – Learn

All products and functionalities start with a hypothesis. The hypothesis can stem from data collected in the field or from a simple intuition. Whatever the underlying reason, the Lean Startup approach aims to:
Consider all ideas as hypotheses, whether they concern marketing or functionalities,
Validate all hypotheses as quickly as possible in the field.
This last point is at the core of the Lean Startup approach. Each hypothesis - whether it comes from business, systems administration or development - must be validated, for quality as well as metrics. Such an approach makes it possible to implement a learning loop for both the product and the client. Lean Startup refuses the approach which consists of developing a product for over a year only to discover that the choices made (in marketing, functionalities, sales) threaten the entire organization. Testing is of the essence.
Figure 1. The Build – Measure – Learn loop: ideas are built into a product, the product is measured, producing data, and the data feed the learning that yields new ideas.
Experiment to validate

Part of the approach is based on the notion of Minimum Viable Product (MVP) (cf. "Minimum Viable Product", p. 95): what is the minimum with which I can validate my hypotheses?
We are not necessarily speaking here of code and products in their technical senses, but rather of any effort that leads to progress on a hypothesis. Anything can be used to test market appetite: a Google Docs questionnaire, a mailing list or a fake functionality. Experimentation, with the lessons it brings, is an invaluable asset in piloting a product, and justifies the implementation of a learning loop.
The measurement obsession

Obviously, experiments must be systematically monitored through full and reliable metrics (cf. "The obsession with performance measurement", p. 13).
A client-centered approach – Go out of the building
Checking metrics and validating quality very often means "leaving the building", as Bob Dorf, co-author of the famous "4 Steps to the Epiphany", puts it.
“Go out of the building“ (GOOB) is at the heart of the preoccupations of Product Managers who practice Lean Startup. Until a hypothesis has been confronted with reality, it remains a supposition. And therefore presents risks for the organization.
“No plan survives first contact with customers“ (Steve Blank) is thus one of the mottoes of Product teams:
Build only the minimum necessary for validating a hypothesis.
GOOB (from face-to-face interviews to continuous deployment).
Learn.
Build, etc.
This approach also allows constant contact with the client, in other words, constant validation of business hypotheses. Zappos, a giant in online shoe sales in the US, is an example of MVP being put into users’ hands at a very early stage. To confront reality and validate that users would be willing to buy shoes online, the future CEO of Zappos took snapshots of the shoes in local stores, thereby creating the inventory for an e-commerce site from scratch. In doing so, and without building cathedrals, he quickly validated that demand was there and that producing the product would be viable.
Piloting with data
Naturally, to grasp user behavior during GOOB sessions, Product Managers meticulously gather data which will help them make the right decision. They also set up tools and processes to collect such data.
The most widely used tools are well known to all: interviews and analytics solutions.
The Lean Startup method makes ferocious use of these indicators to truly pilot the product strategy. On ChooseYourBoss.com,[1] we postulated that users would prefer to connect through LinkedIn or Viadeo: it would spare users from having to set up an account, and spare us the trouble of developing a login system. We therefore built the minimum needed to validate or invalidate this hypothesis, offering three sign-up options: LinkedIn, Viadeo, or a ChooseYourBoss account. The first two options worked well, while the results for the third indicated that a dedicated ChooseYourBoss account was not worth putting into production: users not wishing to use these networks to sign in represented only 11% of visitors to our site. We will therefore abstain, for the time being, from implementing accounts outside of social networks. We went from "informed by data" to "piloted by data".
Who makes it work for them?

IMVU, Dropbox, Heroku, Votizen and Zappos are a few examples of Web products that managed to integrate user feedback at a very early stage of product design. Dropbox, for example, completely overhauled its way of doing things by drastically simplifying the management of synchronized files. Heroku went from a development platform in the Cloud to a Cloud server solution. Examples abound, each more ingenious than the last.
[1] A site for connecting candidates and recruiters.
What about me?

Lean Startup is not a dogma. Above all, it is about realizing that what the market and the clients want is not to be found in your architecture, marketing plans, sales forecasts or key functionalities.
Once you’ve come to that realization, you will start seeing hypotheses everywhere. It all consists in setting up processes for validating hypotheses, without losing sight of the principle of validating minimum functionalities at any given instant t.
Before writing any code, the main questions to ask revolve around the triad Client / Problem / Solution:
Do I really have a problem that deserves to be resolved?
Is my solution the right one for my client?
Will my client buy it? How much?
Use whatever you can to check your hypotheses: interviews, market studies, prototypes...
The next step is to know whether the model you are testing on a small scale is replicable and expandable. How can you get clients to acquire a product they have never heard of? Will they be in a position to understand, use, and profit from your product?

The third and fourth steps revolve around growth: how do you attract clients, and how do you build a company capable of taking on your product and moving it forward?
Contrary to what one might think after reading this chapter, Lean Startup is not an approach reserved for mainstream websites. Innovation through validating hypotheses as quickly as possible and limiting financial investment is obviously logic which can be transposed to any type of information systems project, even in-house. We are convinced that this approach deserves wider deployment to avoid Titanic-type projects which can swallow colossal sums despite providing very little value for users. For more information, you can also consult the sessions on Lean Startup at USI which present the first two stages (www.usievents.com).
Sources

• Running Lean – Ash Maurya
• 4 Steps to the Epiphany – Steve Blank & Bob Dorf:
> http://www.stevenblank.com/books.html
• Startup Genome Project blog:
> http://blog.startupcompass.co/
• The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses – Eric Ries:
> http://www.amazon.com/The-Lean-Startup-Entrepreneurs-Continuous/dp/0307887898
• The Startup Owner's Manual – Steve Blank & Bob Dorf:
> http://www.amazon.com/The-Startup-Owners-Manual-Step-By-Step/dp/0984999302
Minimum Viable Product
Description

A Minimum Viable Product (MVP) is a strategy for product development. Eric Ries, the creator of Lean Startup, who strongly contributed to the elaboration of this approach, gives the following definition:
The minimum viable product is that version of a new product which allows a team to collect the maximum amount of validated learning about customers with the least effort.[1]
In sum, it is a way to quickly develop a minimal product prototype to establish whether the need for it is there, to identify possible markets, and to validate business hypotheses on e.g. income generation.
The interest of the approach is obvious: to more quickly design a product that truly meets market needs, by keeping costs down in two ways:
by reducing TTM:[2] faster means less human effort, therefore less outlay - all else being equal,
and by reducing the functional perimeter: less effort spent on functionalities which have not yet proven their worth to the end user.
In the case of startups, funds usually run low. It is therefore best to test your business plan hypotheses as rapidly as possible - and this is where a MVP shows its worth.
The advantages are well illustrated by Eric Ries’s experience at IMVU.com, an online chatting and 3D avatar website: it took them only six months to create their first MVP, whereas in a previous startup experience it took them almost five years to release their first product - which was questionably viable!
[1] http://www.startuplessonslearned.com/2009/08/minimum-viable-product-guide.html
[2] Time To Market.
Today, 6 months is considered a relatively long delay, and MVPs are often deployed in less.
This is because designing an MVP does not necessarily mean producing code or a sophisticated website, quite the contrary. The goal is to get a feel for the market very early on in the project so as to validate your plans for developing your product or service. This is what is known as a Fail Fast approach.
MVPs allow you to quickly validate your client needs hypotheses and therefore to reorient your product or service accordingly, very early on in your design process. This is known as a “pivot“ in the Lean Startup jargon. Or, if your hypotheses are validated by the MVP run, you must then move on to the next step: implementing the functionality you simulated, creating a proper web site, or simply a marketing page.
An MVP is not only useful for launching a new product: the principle is perfectly applicable for adding new functionalities to a product that already exists. The approach can also be more direct: for example you can ask for user feedback on what functionalities people would like (see Figure 1), at the same time gathering information on how they use your product.
MVPs are particularly relevant when you have no or little knowledge of your market and clients, nor any well defined product vision.
Implementation
An MVP can be extremely simple. For example, Nivi Babak states that "The minimum viable product (MVP) is often an ad on Google. Or a PowerPoint slide. Or a dialog box. Or a landing page. You can often build it in a day or a week."[3] The most minimalist approach is called a Smoke Test, in reference to electronic component testing: one checks that a component functions properly before moving on to the next stage
[3] http://venturehacks.com/articles/minimum-viable-product
of testing (stress tests, etc.) and the fact that in case of failure there is often a great deal of smoke!
The most minimal form of a Smoke Test consists of an advertisement in a major search engine, promoting the qualities of the product you hope to develop. Clicking on the ad sends the visitor to a generally static web page with minimal information but, e.g., suggestive links; the goal is to gather click information, indicative of how interested the client is in the proposed service and of their willingness to buy it. The functionalities laid out behind the links do not have to be operational at this stage! The strict minimum is the ad itself, as this is the first step in gathering information.
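The click-gathering side of such a page boils down to tallying which advertised functionality each visitor clicks on. A minimal sketch (the link names are invented for illustration; a real site would rely on an analytics service):

```ruby
# Tally clicks on the (non-functional) links of a smoke-test landing page.
clicks = Hash.new(0)

# Simulated visits: each entry is the link a visitor clicked.
visits = ['buy_now', 'learn_more', 'buy_now', 'pricing', 'buy_now']
visits.each { |link| clicks[link] += 1 }

clicks['buy_now']   # => 3, a rough signal of willingness to buy
```

However crude, a tally like this is already enough to compare the appeal of functionalities that do not yet exist.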
An early version of the website theleanstartup.com, which applies the principles it preaches (the EYODF pattern),[4] proposed at the very bottom of its home page (the MVP of theleanstartup.com) a very simple dialog box for collecting user needs. There were only two fields to fill in - e-mail address and suggestion for a new functionality - along with the invitation: "What would you like to see on future versions of this website?"
Figure 1. Form for collecting user information on the website theleanstartup.com once the fields are filled in.
In terms of tooling, services such as Google Analytics, Xiti, etc. which track all user actions and browsing characteristics on a given website, are indispensable allies. For example, in the case of a new website functionality to be implemented, it is very simple to add a new tab, menu option, advertisement, and to track user actions with this type of tool.
[4] Eat Your Own Dog Food, i.e. be the consumers of your own services.
Risks...
Beware: the MVP can generate ambiguous results, including false negatives. If an MVP is not sufficiently well thought out, or is badly presented, it can trigger a negative reaction on the targeted clients' part and seem to indicate that the planned product is not viable, whereas in fact it is only a question of iterating to perfect the process and better meet client needs. The point is not to stop at the first whiff of failure: a single step can be all it takes to go from non-viable to viable, i.e. to the MVP itself.
Henry Ford put it very aptly: "If I had asked people what they wanted, they would have said faster horses." Having a product vision can be more than just an option.
Who makes it work for them?

Once again we will mention IMVU (see above), one of the pioneers of Lean Startup, where Eric Ries & Co. tested the MVP concept, more particularly in the field of 3D avatar design. Their website, imvu.com, is an online social medium for 3D avatars, chat rooms and gaming, and has the world's largest catalog of virtual goods, most of which are created by the users themselves.
Let us also return to the example of Dropbox, an online file storage service which has seen its growth skyrocket, all based on an MVP: a demonstration video showcasing a product that did not yet exist. Following the posting of the video, a tidal wave of subscribers brought the beta list sign-ups from 5,000 to 75,000 people in one night, confirming that Dropbox's product vision was indeed solid.
How can I make it work for me?

With the prevalence of e-commerce and social media, the web is now at the heart of businesses' economic development strategies. The MVP strategy can be applied as is to a wide range of projects, whether they stem from the IT department or from Marketing - but do not forget that it can also be applied outside the web.
It can even be applied to purely personal projects. In his reference work Running Lean, Ash Maurya gives the example of applying an MVP (and Lean Startup) to the publication of that self-same book.
Auditing information systems is a major part of our work at OCTO, and we are often faced with innovation projects (community platforms, e-services, online shopping...) that struggle to get to production - releases, say, every six months - and where the release, delayed by one or two years, is often a flop, because the value delivered to users does not correspond to market demand... In the interval, millions of euros will have been swallowed up for a project that finally ends up in the waste bin of the web.
An MVP type approach reduces such risks and associated costs. On the web, delays of that length to release a product cannot be sustained, and competition is not only ferocious but also swift!
Within a business information system, it is hard to see how one could carry out Smoke Tests with advertisements. And yet there too one often finds applications and functionalities which took months to develop, without necessarily being adopted by users in the end... The virtue of Lean Startup and the MVP approach is to center attention on the value added for users, and to better understand their true needs.
In such cases, an MVP can serve to prioritize, with end users, the functionalities to be developed in future versions of your application.
Sources

• Eric Ries, Minimum Viable Product: a guide, Lessons Learned, 3 August 2009:
> http://www.startuplessonslearned.com/2009/08/minimum-viable-product-guide.html
• Eric Ries, Minimum Viable Product, StartupLessonsLearned conference:
> http://www.slideshare.net/startuplessonslearned/minimum-viable-product
• Eric Ries, Venture Hacks interview: "What is the minimum viable product?":
> http://www.startuplessonslearned.com/2009/03/minimum-viable-product.html
• Eric Ries, How DropBox Started As A Minimal Viable Product, 19 October 2011:
> http://techcrunch.com/2011/10/19/dropbox-minimal-viable-product
• Wikipedia, Minimum viable product:
> http://en.wikipedia.org/wiki/Minimum_viable_product
• Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February 2009:
> http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day
• Benoît Guillou, Vincent Coste, Lean Start-up, 29 June 2011, Université du S.I. 2011, Paris (French only):
> http://www.universite-du-si.com/fr/conferences/8-paris-usi-2011/sessions/1012-lean-start-up
• Nivi Babak, What is the minimum viable product?, 23 March 2009:
> http://venturehacks.com/articles/minimum-viable-product
• Geoffrey A. Moore, Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers, 1991 (revised 1996), HarperBusiness, ISBN 0066620022
• Silicon Valley Product Group (SVPG), Minimum Viable Product, 24 August 2011:
> http://www.svpg.com/minimum-viable-product
• Thomas Lissajoux, Mathieu Gandin, Fast and Furious Enough: définissez et testez rapidement votre premier MVP en utilisant des pratiques issues de Lean Startup, Paris Web conference, 15 October 2011 (French only):
> http://www.slideshare.net/Mgandin/lean-startup03-slideshare
• Ash Maurya, Running Lean:
> http://www.runningleanhq.com/
Continuous Deployment
Description

In the chapter "Perpetual beta", p. 151, we will see that the Web Giants improve their products continuously. How do they manage to deliver improvements so frequently, while in some IT departments the least change can take several weeks to reach production?
In most cases, they have implemented a continuous deployment process, which can be done in two ways:
Either entirely automatically - modifications to the code are automatically tested and, if validated, deployed to production.
Or semi-automatically: at any time one can deploy the latest stable code to production in one go. This is known as “one-click deployment“.
Obviously, setting up this pattern entails a certain number of prerequisites.
Why deploy continuously?
The primary motivation behind continuous deployment is to shorten the Time To Market, but it is also a means to test hypotheses, to validate them and, ultimately, to improve the product.

Let us imagine a team which deploys to production on the 1st of every month (which is already frequent for many IT departments):
I have an idea on the 1st.
With a little luck, the developers will be able to implement it in the remaining 30 days.
As planned, it is deployed to production in the monthly release plan on the 1st of the following month.
Data are collected over the next month and indicate that the basic idea needs improvement.
But it will be a month before the new improvement can be implemented, which is to say it takes three months to reach a stabilized functionality.
In this example, it is not development that is slowing things down but in fact the delivery process and the release plan.
Thus continuous deployment shortens the Time To Market but is also a way to accelerate product-improvement cycles.
This improves the famous Lean Startup cycle (cf. “Lean Startup“, p. 87):
Figure 1. The accelerated loop: ideas are coded fast, the code produces data that are measured fast, and the data feed learning that is fast as well.
A few definitions
Many people use “Continuous Delivery“ and “Continuous Deployment“ interchangeably. To avoid any errors in interpretation, here is our definition:
With each commit (or at each time interval), the code is:

Compiled, tested, deployed to an integration environment
=> Continuous Integration

Compiled, tested, delivered to the next team (Tests, Qualification, Production, Ops)
=> Continuous Delivery

Compiled, tested, deployed to production
=> Continuous Deployment
The point here is not to say that Continuous Delivery and Continuous Integration are a waste of time. Quite the contrary, they are essential steps: Continuous Deployment is simply the natural extension of Continuous Delivery, itself the natural extension of Continuous Integration.
What about quality?
One frequent objection to Continuous Deployment is the lack of quality and the fear of delivering an imperfect product, of delivering bugs.
Just as with Continuous Integration, Continuous Deployment is only fully useful if you can be sure of your code at all times. This entails a full array of tests (unit, integration, performance, etc.). Beyond the indispensable unit tests, there is a wide range of automated tests such as:
Integration tests (Fitnesse, Greenpepper, etc.)
GUI tests (Selenium, etc.)
Performance tests (Gatling, OpenSTA, etc.)
Test automation can seem costly, but when the goal is to execute them several times a day (IMVU launches 1 million tests per day), return on investment grows rapidly. Some, such as Etsy, do not hesitate to create and share tools to best meet their testing and automation needs.[1]
Furthermore, when you deploy every day, the size of the deployments is obviously much smaller than when you deploy once a month. In addition, the smaller the deployment, the shorter the Time To Repair, as can be seen in Figure 2.
[1] https://github.com/etsy/deployinator
Figure 2 (modified). Source: http://www.slideshare.net/jallspaw/ops-metametrics-the-currency-you-pay-for-change-4608108
Etsy illustrates well the trust one can have in one's code and in the ability to repair any errors quickly: they do not even bother planning for rollbacks. "We don't roll back code, we fix it." According to one of their employees, the longest it has ever taken them to fix a critical bug was four minutes.
Big changes lead to big problems, little changes lead to little problems.
Who does things this way?

Many of the Web Giants have successfully implemented Continuous Deployment; here are a few of the most representative figures:
Facebook, very aggressive on test automation, deploys twice a day.
Flickr makes massive use of Feature Flipping (cf. “Feature Flipping“, p. 113) to avoid development branches and deploys over ten times daily. A page displays the details of the last deployment: http://code.flickr.com
Etsy (an e-commerce company) has invested hugely in automated tests and deployment tooling, and deploys more than 25 times a day.
IMVU (an online gaming and 3D avatar site) performs over a million tests a day and deploys approximately 50 times.
What about me?

Start by estimating (or, even better, measuring!) the time it takes you and your team to deliver a single line of code through to production - respecting the standard process, of course.
Setting up Continuous Deployment
Creating a "development build" is the first step towards Continuous Deployment.

To go further, you have to ensure that your tests cover most of the software. While some do not hesitate to build their own test frameworks (Netflix initiated the "Chaos Monkey" project, which shuts down servers at random), ready-made frameworks are also available, such as JUnit, Gatling and Selenium. To reduce testing time, IMVU distributes its tests over no fewer than 30 machines. Others use Cloud services such as AWS to instantiate test environments on the fly and run tests in parallel.
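Distributing a test suite, as IMVU does across 30 machines, is essentially a partitioning problem. A toy sketch in Ruby, with threads standing in for machines (the numbers and names are illustrative):

```ruby
tests = (1..12).to_a   # stand-ins for individual test cases
workers = 3

# Split the suite into one slice per worker and run the slices in parallel;
# each "test" here simply reports its own number.
slice_size = (tests.size.to_f / workers).ceil
results = tests.each_slice(slice_size).map do |slice|
  Thread.new { slice.map { |t| "test #{t} passed" } }
end.flat_map(&:value)

results.size   # => 12: every test ran, ideally in a third of the wall-clock time
```

Real tools add scheduling by historical test duration, but the principle of slicing the suite remains the same.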
Once the development build produces sufficiently tested artifacts, it can be expanded to deliver the artifacts to the teams who will deploy the software across the various environments. At this stage, you are already in Continuous Delivery.
The last team can now enrich the build to include deployment tasks. This obviously entails automating various tasks: configuring the environments, deploying the artifacts which constitute the application, migrating the database schemas, and much more. Be very careful with your deployment scripts! They are code and, like all code, must meet quality standards (use of an SCM, testing, etc.).
Forcing Continuous Deployment
A more radical but highly interesting solution is to force the rhythm of release, making it weekly for example, to stir up change.
Associated patterns

Implementing Continuous Deployment necessarily brings several associated patterns, including:
Zero Downtime Deployment, because while an hour of system shut-down isn’t a problem if you release once a month, it can become one if you release every week or every day.
Feature Flipping (see the next chapter, “Feature Flipping“), because regular releases unavoidably entail delivering unfinished functionalities or errors, you must therefore have a way of deactivating problematic functionalities instantaneously or upstream.
DevOps obviously, because Continuous Deployment is one of its pillars (cf. “DevOps“, p. 71).
Sources

• Chuck Rossi, Ship early and ship twice as often, 3 August 2012:
> https://www.facebook.com/notes/facebook-engineering/ship-early-and-ship-twice-as-often/10150985860363920
• Ross Harmess, Flipping out, Flickr Developer Blog, 2 December 2009:
> http://code.flickr.com/blog/2009/12/02/flipping-out
• Chad Dickerson, How does Etsy manage development and operations?, 4 February 2011:
> http://codeascraft.etsy.com/2011/02/04/how-does-etsy-manage-development-and-operations
• Timothy Fitz, Continuous Deployment at IMVU: Doing the impossible fifty times a day, 10 February 2009:
> http://timothyfitz.wordpress.com/2009/02/10/continuous-deployment-at-imvu-doing-the-impossible-fifty-times-a-day
• Jez Humble, Four Principles of Low-Risk Software Releases, 16 February 2012:
> http://www.informit.com/articles/article.aspx?p=1833567
• Fred Wilson, Continuous Deployment, 12 February 2011:
> http://www.avc.com/a_vc/2011/02/continuous-de
Feature Flipping
Description

The "Feature Flipping" pattern allows you to activate or deactivate functionalities directly in production, without having to release new code.
Several terms are used by Web Giants: Flickr and Etsy use “feature flags“, Facebook “gatekeepers“, Forrst “feature buckets“, Lyris Inc. “feature bits“, while Martin Fowler opted for “feature toggles“.
In short, everyone names and implements the pattern in their own way, and yet all of these techniques strive to reach the same goal. In this chapter we will use the term "feature flipping". Successfully implemented in our enterprise app store Appaloosa,[1] this technique has brought many advantages with just a few drawbacks.
Implementation

It is a very simple mechanism: you simply have to condition the execution of the code for a given functionality in the following way:
if Feature.is_enabled('new_feature')
  # do something new
else
  # do same as before
end
The implementation of the is_enabled function will, e.g., query a configuration file or a database to know whether the functionality is activated or not.
You then need an administration console to configure the state of the various flags on the different environments.
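A minimal in-memory version of this mechanism might look as follows (a sketch: a real store would be backed by the configuration file or database mentioned above, with the enable/disable calls driven from the administration console):

```ruby
class Feature
  @flags = {}

  class << self
    # Called from the administration console to change a flag's state.
    def enable(name)
      @flags[name] = true
    end

    def disable(name)
      @flags[name] = false
    end

    # Called from application code; unknown features default to "off".
    def is_enabled(name)
      @flags.fetch(name, false)
    end
  end
end

Feature.enable('new_feature')
Feature.is_enabled('new_feature')   # => true
Feature.disable('new_feature')
Feature.is_enabled('new_feature')   # => false
```

Defaulting unknown flags to "off" is the safer choice: a typo in a flag name hides a feature instead of exposing it.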
Continuous deployment
One of the first advantages of being able to hot-switch functionalities on or off is being able to continuously deliver the application under development. Indeed, one of the first problems faced by organizations implementing continuous delivery is:
[1] cf. appaloosa-store.com
how can one commit regularly to the source repository while guaranteeing application stability and constant production-readiness? For functionalities which cannot be finished in less than a day, committing only once the functionality is done (after a few days) runs contrary to the best practices of continuous integration: the farther apart your commits, the more complicated and risky the merges, with only limited possibilities for transversal refactoring. Given these constraints, there are two choices: "feature branching" or "feature flipping" - in other words, creating a branch in the configuration management tool, or in the code. Each has its fervent partisans; you can find some of the heated debates at: http://jamesmckay.net/2011/07/why-does-martin-fowler-not-understand-feature-branches
Feature Flipping makes it possible for developers to code inside their "ifs", and thus to commit unfinished, non-functional code, as long as it compiles and the tests pass. Other developers can pull the modifications without difficulty, as long as they do not activate the functionalities under development. The code can likewise be deployed to production since, again, the functionality will not be activated. That is where the interest lies: deploying code to production no longer depends on completing all the functionalities under development. Once a functionality is finished, it can be activated by simply changing the status of its flag on the administration console.

This has an added benefit: the activation of a functionality can be timed to coincide with, e.g., an advertising campaign - a way of avoiding mishaps on the day of the release.
Mastering deployment
One of the major gains brought by this pattern is control over deployment: it allows you to activate a functionality with a simple click, and to deactivate it just as easily, thus avoiding the drawn-out and problem-prone rollback processes needed to bring the system back to its N-1 release.
Thus you can very quickly cancel the activation of a functionality if production tests are inconclusive or user feedback is negative.
Unfortunately, things are not quite that simple: you must be very careful with your data and ensure that the model will work with or without the functionality being activated (see the paragraph “Limits and constraints > major modifications“).
Experiment to improve the product
A natural offshoot of feature flipping is that it enables you to activate or deactivate functionalities for specific sub-populations. You can thus test a functionality on a user group and, depending on their response, activate it for all users or scrap it. In this case the code will look something like this:
if Feature.is_enabled_for('new_feature', current_user)
  # do something new
else
  # do same as before
end
You can then use the mechanism to test a functionality's performance by modifying one variable of its implementation for several sub-populations. The resulting metrics will help you determine which implementation performs best. In other words, feature flipping is an ideal tool for carrying out A/B testing (cf. "A/B Testing", p. 123).
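One simple way to expose a feature to a stable sub-population is to hash each user into a bucket (our sketch; this percentage-based scheme is an assumption, not the book's implementation, and the class name is invented):

```ruby
require 'zlib'

class RolloutFeature
  # Enable the feature for a fixed percentage of users. Hashing the user id
  # (rather than drawing at random) keeps each user in the same bucket on
  # every request, so the experience stays consistent during the test.
  def self.enabled_for?(name, user_id, percentage)
    bucket = Zlib.crc32("#{name}:#{user_id}") % 100
    bucket < percentage
  end
end

RolloutFeature.enabled_for?('new_checkout', 42, 10)  # stable across calls
```

Raising the percentage from 10 to 100 then turns the experiment into a full rollout without touching the application code.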
Provide custom-made products
In some cases, it can be interesting to let clients choose for themselves. Take the example of attachments in Gmail: by default, the interface proposes a number of advanced functionalities (drag and drop, multiple uploads) which the user can deactivate with a simple click in case of dysfunction.
Conversely, you can offer users an "enhanced" mode: the "labs" features (Gmail) are a telling example of feature flipping implementation.
To do so, all you have to do is to propose an interface where users can control the activation/deactivation of certain functionalities (self service).
Managing billable functionalities
Activating paying functionalities with various levels of service can be complicated to implement, and entails conditional code of the following type:
if current_user.current_plan == ‘enterprise’ || current_user.current_plan == ‘advanced’
Let us say that some “special“ firms are paying for the basic plan but you want to give them access to all functionalities.
Or imagine that a given functionality was included in the “advanced“ plan until two months ago, when marketing decided that it should only be included in the “enterprise“ plan... except for clients who subscribed more than two months earlier.
You can use feature flipping to avoid having to manage such exceptions in the code. You just need to condition activation of the features when a client subscribes. When users subscribe to the enterprise plan, the functionalities X, Y and Z are activated. You can then very easily manage exceptions in the administration interface.
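A sketch of this data-driven approach follows; `Account` and `PLAN_FEATURES` are hypothetical names, not taken from any real billing system:

```ruby
# Features are granted to the account when it subscribes, so plan rules
# and marketing exceptions live in data, not in "if plan ==" code.
PLAN_FEATURES = {
  'basic'      => ['x'],
  'advanced'   => ['x', 'y'],
  'enterprise' => ['x', 'y', 'z']
}

class Account
  attr_reader :features

  def initialize(plan)
    # Snapshot the plan's features at subscription time: a later change
    # in marketing policy does not silently revoke what a client has.
    @features = PLAN_FEATURES.fetch(plan).dup
  end

  # Exceptions ("special" firms, grandfathered clients) are granted
  # from the administration interface, not hard-coded.
  def grant(feature)
    @features << feature unless @features.include?(feature)
  end

  def feature?(name)
    @features.include?(name)
  end
end

special = Account.new('basic')   # a "special" firm on the basic plan
special.grant('z')
special.feature?('z')            # => true
```

The application code then only ever asks “does this account have feature z?“, never “which plan is this account on?“.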
Graceful degradation
Some functionalities are more crucial to business than others. When scaling up, it is a good idea to favor certain functionalities over others. Unfortunately, it is difficult to ask your software or server to give priority to anything related to billing over the display of summary graphs... unless the graph display functionality is feature flipped.
We have already mentioned the importance of metrics (cf. “The obsession with performance measurement“, p. 13). Once your metrics are set up, it becomes trivial to flip functions accordingly. For example: “If the average response time for displaying the graph exceeds 10 seconds over a period of 3 minutes, then deactivate the feature“.
This allows you to progressively degrade website features in order to maintain a satisfying experience for the users of the core business functionalities. This is akin to the “circuit breaker“ pattern (described in the book “Release It!“ by Michael Nygard), which makes it possible to short-circuit a functionality if an external service is down.
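Such a metric-driven flip can be sketched as follows; the class name, threshold and window size are illustrative assumptions:

```ruby
# Metric-driven flipping: deactivate a non-critical feature when its
# average response time over a sliding window exceeds a threshold.
class DegradableFeature
  def initialize(threshold_seconds:, window_size:)
    @threshold = threshold_seconds
    @window = window_size
    @samples = []
  end

  def record(response_time_seconds)
    @samples << response_time_seconds
    @samples.shift while @samples.size > @window  # keep a sliding window
  end

  def enabled?
    return true if @samples.empty?
    average = @samples.sum / @samples.size.to_f
    average <= @threshold  # flip the feature OFF when latency degrades
  end
end

graphs = DegradableFeature.new(threshold_seconds: 10, window_size: 3)
graphs.record(2.0)
graphs.enabled?   # => true
graphs.record(14.0)
graphs.record(15.0)
graphs.record(16.0)
graphs.enabled?   # => false: the last 3 samples average 15s, above 10s
```

In a real system the window would be time-based (“over a period of 3 minutes“) and fed by your monitoring pipeline, but the decision logic stays this simple.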
Limits and constraints
As noted above, all you need to implement feature flipping is an “if“. However, like with any development, this can easily become a new source of complexity if you do not take the necessary precautions.
1. One “if“ = two tests

Automated tests are still the best way to check that your software is working as it should. In the case of feature flipping, you will need at least two tests: one with the feature flipped ON (activated) and one with the feature flipped OFF (deactivated).

In development, one often forgets to test the OFF state, even though this is what your clients will see as long as the feature is not ON. Therefore, once more, applying TDD[2] is a good solution: tests written in the initial development phases guarantee that the OFF behavior is covered.
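The rule can be illustrated with a pair of plain assertions; `FLAGS` and `greeting` are hypothetical stand-ins for your toggle store and the flipped code path:

```ruby
# "One if = two tests": every flag needs one test with it OFF and one
# with it ON.
FLAGS = {}

def greeting
  FLAGS['new_greeting'] ? 'Hello from the new feature!' : 'Hello!'
end

# Test 1: flag OFF, clients must still get the old behaviour.
FLAGS['new_greeting'] = false
raise 'OFF case broken' unless greeting == 'Hello!'

# Test 2: flag ON, the new behaviour takes over.
FLAGS['new_greeting'] = true
raise 'ON case broken' unless greeting == 'Hello from the new feature!'
```

In a real test suite these would be two separate test cases, but the point stands: every branch of the flag needs its own specification.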
2. Clean up!
Extensive use of feature flipping can lead to an accumulation of “ifs“, making it more and more difficult to manage the code. Remember that for some functionalities, flipping is only useful for ensuring continuous deployment.
For all functionalities that should never again need to be deactivated (free/optional functionalities which will never be degraded as they are critical from a functional perspective), it is important to delete the “ifs“ to lighten the code and keep it serviceable.
You should therefore set aside some time following deployment to production to “clean up“. Like all code refactoring tasks, it is all the easier the more regularly you do it.
[2] Test Driven Development
3. Major modifications (e.g. changing your relational model)
Some functionalities entail major changes in the code and data model. Let us take the example of a Person table containing an Address field. To meet new needs, you decide to split the table as follows:
To manage cases like this, here is a strategy you can implement:

1. Add the Address table (so that the base contains both the Address column AND the Address table). Nothing changes for the applications: they continue querying the old column.

2. Modify your existing applications so that they use the new table.

3. Migrate the existing data and delete the now unused column.

At this point, the application has usually changed little for the user, but relies on the new data model.

4. You can then start developing new functionalities based on your new data model, using feature flipping.

The strategy is relatively simple, but it entails downtime for the various releases (phases 2 and 4).
Other techniques make it possible to maintain several versions of your data model in parallel, in keeping with the “zero downtime deployment“ pattern: you update your relational schema without impacting the availability of the applications using it, relying on various types of scripts (expansion and contraction scripts), triggers to synchronize the data, or even views that expose the data to the applications through an abstraction layer.
Before:
Person (ID, Last name, First name, Address)

After:
Person (ID, Last name, First name)
Address (ID, Person_ID, Street, Post_Code, Town, Country)
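The “expand“ phase of such a zero-downtime strategy can be sketched with a dual write: while the old column and the new table coexist, the application keeps both up to date. Everything below (in-memory hashes standing in for real tables) is illustrative:

```ruby
# Sketch of the "expand" phase of a zero-downtime migration: while the
# legacy Address column and the new Address table coexist, the
# application writes to both, so either data model stays consistent.
class PersonRepository
  attr_reader :old_rows, :new_addresses

  def initialize
    @old_rows = {}       # legacy model: address embedded in Person
    @new_addresses = {}  # new model: separate Address table
  end

  def save(id, name, street:, town:)
    # Dual write: keep the legacy column up to date...
    @old_rows[id] = { name: name, address: "#{street}, #{town}" }
    # ...and populate the new table, so the cut-over (and the later
    # "contract" phase that drops the column) needs no downtime.
    @new_addresses[id] = { person_id: id, street: street, town: town }
  end
end

repo = PersonRepository.new
repo.save(1, 'Ada', street: '1 Main St', town: 'Paris')
repo.old_rows[1][:address]      # => "1 Main St, Paris"
repo.new_addresses[1][:street]  # => "1 Main St"
```

In practice the dual write is often done by database triggers rather than application code, but the principle is the same: both models stay valid until every reader has moved over.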
Changes to one’s relational model are much less frequent than changes to code, but they are complex, and have to be planned well in advance and managed very carefully. NoSQL (Not Only SQL) databases are much more flexible as concerns the data model, so they can also be an interesting option.
Who makes it work for them?

It works for us, even though we are not (yet!) Web Giants.
In the framework of our Appaloosa project we successfully implemented the various patterns described in this article.
For the Web Giants, their size and the constraints of multi-site deployment and big data migrations leave them no choice but to implement such mechanisms. Among the most famous are Facebook, Flickr and Lyris Inc. Closer to home are Meetic, the Bibliothèque nationale de France and Viadeo, with the latter being particularly insistent on code clean-up, only leaving flippers in production for a few days.
And anyone who practices continuous deployment (cf. “Continuous Deployment“, p. 105) applies, in one way or another, the feature flipping pattern.
How can I make it work for me?

There are various ready-made implementations in different languages, such as the rollout gem in Ruby and the feature flipper in Grails, but the pattern is so easy to implement that we recommend you design your own version, tailored to your specific needs.
There are multiple benefits and possible uses, so if you need to progressively deploy functionalities, or carry out user group tests, or deploy continuously, then get started!
Sources• Flickr Developer Blog:> http://code.flickr.com/blog/2009/12/02/flipping-out
• Summary of the Flickr session at Velocity 2010:> http://theagileadmin.com/2010/06/24/velocity-2010-always-ship-trunk
• Quora Questions on Facebook:> http://www.quora.com/Facebook-Engineering/How-does-Facebooks-Gatekeeper-service-work

• Forrst Engineering Blog:> http://blog.forrst.com/post/782356699/how-we-deploy-new-features-on-forrst

• Lyris Inc. slides on Slideshare:> http://www.slideshare.net/eriksowa/feature-bits-at-devopsdays-2010-us

• Lyris Inc. talk at Devopsdays 2010:> http://www.leanssc.org/files/201004/videos/20100421_Sowa_EnabilingFlowWithinAndAcrossTeams/20100421_Sowa_EnabilingFlowWithinAndAcrossTeams.html

• Lyris Inc. whitepaper:> http://atlanta2010.leanssc.org/wp-content/uploads/2010/04/Lean_SSC_2010_Proceedings.pdf

• Interview with Ryan King from Twitter:> http://nosql.mypopescu.com/post/407159447/cassandra-twitter-an-interview-with-ryan-king
• Blog post by Martin Fowler:> http://martinfowler.com/bliki/FeatureToggle.html
• Blog 99designs:> http://99designs.com/tech-blog/blog/2012/03/01/feature-flipping
A/B Test
THE WEB GIANTS PRACTICES / A/B TEST
Description

A/B Testing is a product development method used to test a given functionality’s effectiveness. You can, for example, test a marketing campaign sent by e-mail, a home page, an advertising insert or a payment method.
This test strategy allows you to compare several variants of a single object: the subject line of an e-mail, or the contents of a web page. Like any test designed to measure performance, A/B Testing can only be carried out in an environment capable of measuring an action’s success. Take the example of an e-mail subject line: the test must measure how many times the message was opened, to determine which subject was most compelling. For web pages, you look at click-through rates; for payments, conversion rates.
Implementation
The method itself is relatively simple. You have variants of an object which you want to test on various user subsets. Once you have determined the best variant, you open it to all users.
A piece of cake? Not quite.
The first question must be the nature of the variation: where do you set the cursor between micro-optimization and major overhaul? It all depends on where you are on the learning curve. If you are in the client exploration phase (cf. “Minimum Viable Product“, p. 95, “Lean Startup“, p. 87), A/B Testing can completely change the version tested. For example, you can set up two home pages with different marketing messages, layouts and graphics, to see user reactions to both. If you are farther along in your project, where a 1% variation in a conversion goal makes a difference, variations can be more subtle (size, color, placement, etc.).
The second question is your segmentation. How will you define the various sub-sets? There is no magic recipe, but there is a fundamental rule: the segmentation criterion must have no influence on the results of the experiment (A/B Testing means a single variable). You can use a very basic attribute such as subscription date or alphabetical order, as long as it does not affect the results.
The third question is when to stop. How do you know when you have enough responses to generalize the results of the experiment? It all depends on how much traffic you are able to generate, on how complex your experiment is and the difference in performance across your various samplings. In other words, if traffic is low and results are very similar, the test will have to run for longer. The main tools available on the market (Google Website Optimizer, Omniture Test&Target, Optimizely) include methods for determining if your tests are significant. If you manage your tests manually, you should brush up on statistics and sampling principles. There are also websites to calculate significance levels for you.[1]
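The statistic behind most of these significance calculators is a simple two-proportion z-test, sketched below; the conversion figures are made-up sample data:

```ruby
# Two-proportion z-test: is the difference in conversion rate between
# variants A and B larger than random sampling noise would explain?
def ab_z_score(conv_a, n_a, conv_b, n_b)
  p_a = conv_a.to_f / n_a
  p_b = conv_b.to_f / n_b
  p_pool = (conv_a + conv_b).to_f / (n_a + n_b)
  standard_error = Math.sqrt(p_pool * (1 - p_pool) * (1.0 / n_a + 1.0 / n_b))
  (p_b - p_a) / standard_error
end

# Variant A: 200 conversions out of 1,000; variant B: 260 out of 1,000.
z = ab_z_score(200, 1000, 260, 1000)   # z is roughly 3.19
significant = z.abs > 1.96             # => true at the 95% confidence level
```

Notice how the formula captures the intuitions in the text: a small traffic volume (small n) or a small difference between the rates both shrink z, which is why such tests must run longer.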
Let us now turn to two pitfalls to be avoided when you start A/B Testing. First, looking at performance tests from the perspective of a single goal can be misleading. Given that the test changes the user experience, you must also monitor your other business objectives. By changing the homepage of a web site for example, you will naturally monitor your subscription rate, without forgetting to look at payment performance.
The other pitfall is to offer a different experience to a single group over time. The solution you implement must be absolutely consistent for the duration of the experiment: returning users must be presented with the same experimentation version, both for the relevance of your results and the user experience. Once you have established the best solution, you will then obviously deploy it for all.
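One common way to guarantee this consistency, sketched below under the assumption that users have a stable identifier, is to derive the variant deterministically from the user id and the experiment name (all names are illustrative):

```ruby
require 'digest'

# Deterministic variant assignment: hashing the user id together with
# the experiment name always yields the same variant, so a returning
# user sees a consistent experience for the whole test.
def variant_for(experiment, user_id, variants = ['A', 'B'])
  digest = Digest::MD5.hexdigest("#{experiment}:#{user_id}")
  variants[digest.to_i(16) % variants.size]
end

v1 = variant_for('homepage_test', 'user_42')
v2 = variant_for('homepage_test', 'user_42')
v1 == v2   # => true: same user, same variant, visit after visit
```

Because the experiment name is part of the hash input, the same user can fall into different groups for different experiments, which keeps segmentations independent of each other.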
Who makes it work for them?

We cannot fail to cite the pioneer of A/B Testing: Amazon. Web players on the whole tend to share their experiments. On the Internet you will have no trouble finding examples from Google, Microsoft, Netflix, Zynga, Flickr, eBay, and many others, with at times surprising results. The site www.abtests.com lists various experiments.
[1] http://visualwebsiteoptimizer.com/ab-split-significance-calculator
How can I make it work for me?

A/B Testing is above all a right to experiment. Adopting a learning stance, with hypotheses about the results from the outset and a clear modus operandi, is a source of motivation for product teams. Linking the tests to performance is a way to set up data-driven product management.
It is relatively simple to set up A/B Testing (although you do need to maintain a certain hygiene in your practices). Google Website Optimizer, to mention but one tool, hooks up directly to Google Analytics. For a reasonable outlay, you can give your teams the means to objectively measure the impact of their actions on the end-product.
Sources
• 37signals, A/B Testing on the signup page:> http://37signals.com/svn/posts/1525-writing-decisions-headline-tests-on-the-highrise-signup-page
• Tim Ferriss:> http://www.fourhourworkweek.com/blog/2009/08/12/google-website-optimizer-case-study
• Wikipedia:> http://en.wikipedia.org/wiki/A/B_testing
Design Thinking
THE WEB GIANTS CULTURE / DESIGN THINKING
Description
In their daily quest for more connection with users, businesses are beginning to realise that these “users“, “clients“, and other “collaborators“ are first and foremost human beings. Emerging behaviour patterns, spawned by new possibilities opened up by technology, are changing consumer needs and their brand loyalties.
The web giants were among the first to adopt an approach that involves all stakeholders in the creation of a product, and therefore everyone concerned by the user experience of a given service. The way Designers have appropriated their work tools is ideal for qualifying an innovative need. Reconsidering Design has thus become a key issue: it is essential for any Organization that wishes to change and innovate to question its business culture, and to dare to go as far as disruption.
Born in the 1950s and more recently formalised by the agency IDEO,[1] Design Thinking was developed at Stanford University in the USA as well as at the University of Toronto in Canada, before making a significant impact on Silicon Valley, to the extent that it has become an approach assimilated by all major web businesses and startups. It then spread to the rest of the English-speaking world, and then to all of Europe.
“Design thinking is a human-centered approach to innovation that draws from the designer’s toolkit to integrate the needs of people, the possibilities of technology, and the requirements for business success.“

Tim Brown, IDEO
[1] > http://www.wired.com/insights/2014/04/origins-design-thinking/
A new vision of Design
Emergence of a strategic asset
First of all one must reconsider the word Design itself, to understand its deeper, almost etymological, meaning. And therefore recognise that when you speak of Design, it means that you want to give significance to something, whether a product, a service or an Organization.
In fact, Design is whenever you want to “give meaning“ to something. A far cry from the simple representation, aesthetic or merely practical, of a product.
“Great design is not something anybody has traditionally expected from Google“ – TheVerge
Several web giants became aware of the strategic relevance of “operational“ Design before implementing Design Thinking more fully.[2] This is the case for Google which, in 2011,[3] published a strong strategic vision for Design, going beyond a purely “full metrics“ approach (systematic A/B Testing, incremental feedback without direct user input...).
[2] > http://www.forbes.com/sites/darden/2012/05/01/designing-for-growth-apple-does-it-so-can-you/
[3] > http://www.theverge.com/2013/1/24/3904134/google-redesign-how-larry-page-engineered-beautiful-revolution
[Diagram: DESIGN combines MEANING (the why) and CONCEPTION (the how).]
Today, there are even Designers behind the creation of various web giants, such as AirBnB.[4] Some go so far as to consider Design the main asset in their global business strategy (Pinterest, and the various “Design Disruptors“).
The first step to implementing a strategic Design is to create an environment which fosters the expression of different opinions around the role of Design within the company. This is how you avoid conflation between operational, cognitive and strategic aspects.
[4] > http://www.centrodeinnovacionbbva.com/en/news/airbnb-design-thinking-success-story
[Diagram: the four steps of Design maturity:
Step 1: companies that do not use design.
Step 2: companies that use design for styling and appearance.
Step 3: companies that integrate design into the development process.
Step 4: companies that consider design a key strategic element.
Alongside: Emotional Design, Interaction Design and Strategic Design, producing experiences that are Delightful, Usable and Meaningful.]
Designing the experience: a dialog between users and professionals
“Design is human. It’s not about ‘is it pretty,’ but about the connection it creates between a product and our lives.“ – Jenny Arden, Design Manager, AirBnB
Design establishes a strong bond between the user and the designer: the designer offers a service and promises an experience, after which the user qualifies that experience through feedback, negative or positive, which can build loyalty towards the designer. It is this relationship that creates strong business value.
Such engagement can be seen on social networks (LinkedIn, Facebook, Pinterest, Twitter…), and therefore largely among the web giants and, by extension, across all desirable digital services.
It is the Design process which materialises this relationship; the shared history between the brand, its product, or the service behind the product and users.
“When people can build an identity with your product over time, they form a natural loyalty to it.“ Soleio Cuervo, Head of Design, Dropbox
Specialists of this precious relationship have since emerged, in the form of labs or other specialised Organizations (Google Ventures, IBM), working to optimise this new balance.
Design thinking
The working hypothesis
Design thinking entails understanding needs, and makes it possible to create tailored and adequate solutions for any problem that comes up. This means taking an interest in fellow humans in the most open, compassionate way possible.
Innovation appears at the balance point between three factors:
What is viable from a business perspective, in line with the business model.
What is technologically feasible, neither obsolete nor too far ahead of its time.
And, lastly, what is desirable: the human factor and what users take away from the experience.
The specificity of the process lies in its ability to address a problem through unprecedented collaboration between all stakeholders: from the “creators“ (those who drive the business strategy, for example the company) to the “users“ whoever they may be (in-house and external, direct and indirect).
[Diagram: INNOVATION sits at the intersection of Business (viable), Technology (doable) and Humans (desirable), together with Responsibility.]
Methodological approach
The methodological translation of the Design Thinking approach is a series of steps where the goal is to provide structure for innovation by optimising the analytical and intuitive aspects of a problem.
[Diagram, after Rotman: Design Thinking as a 50/50 mix bridging the fundamental predilection gap between 100% reliability and 100% validity.]
The approach unfolds in three main phases:
Inspiration or Discovery: learning to examine a problem or request; understanding and observing people and their habits; getting a feel for emerging wishes and needs.
Ideating or Defining: making sense of the discoveries to arrive at a concept or vision; establishing the business and technology possibilities and prototyping the target innovations as quickly as possible.
Implementing or Delivering: materialising and testing to maximise feedback on the innovation so as to swiftly make adjustments.
More precisely, these phases are often broken down into several steps to anchor the methodology. The number and nature of the steps vary depending on who is implementing them. Below are the 5+1 steps suggested by the Stanford Institute of Design and adopted by IDEO:[5]
Empathy: Begin by understanding the people who will be impacted by your product or service. This has to do with contacts, interviews, relations. It is the choice of rediscovering the demand environment. The mandate is openness, curiosity, and not formalisation.
Definition: It is the formalisation of a concept bearing on all the elements discovered during the first step. It is based on real needs, driven by potential clients rather than the company's context.
Ideation: This is the step where ideas are generated. This phase of optimism encourages all possible ideas emerging from the previously discovered concepts. Exercises and Design workshops can serve to focus on specific aspects and see what intentions are possible. Little by little, ideas are grouped together, refined, completed, and given more specific meaning.
Prototyping: Then comes the moment for materialisation, for moving on to the “how“. Here the problems are represented more concretely, to draw out potential. Speed is of the essence, especially in making mistakes so as to quickly reposition. Simple materials are used such as cardboard, putty...
Testing: It is then time to test the prototype, with potential users, to ensure its feasibility and check that it is a cultural fit for your brand. Sparked interest is proof that the prototype is a solution in tune with a user need.
Lastly, let us add evolution: The results from the preceding phases should be a new starting point for researching the best way to create value around a given need. One thus understands that the implementation of the Design approach does not end once the process has started, because it forces you to systematically evolve what you already have.
[5] > https://dschool.stanford.edu/sandbox/groups/designresources/wiki/36873/attachments/74b3d/ModeGuideBOOTCAMP2010L.pdf?sessionID=c2bb722c7c1ad51462291013c0eeb6c47f33e564
[Diagram: the five steps: Empathize, Define, Ideate, Prototype, Test.]
Some of the steps can be repeated, adjusted, refined, added to. New ideas are born out of tests: following prototyping for example, other types of potential clients can emerge... And this happens in a context of iteration, co-creation, sometimes without any hierarchy, and with a sufficiently optimistic mindset to accept any failures.
Design vs. Tech[6]

Design is currently such a major driver for the web giants that questions arise as to whether technology remains the crucial strategic element.
“Choices are made in the front-end of everything.“ – Scott Belsky, Behance
Indeed, the beneficial effects of Moore’s law are diminishing while, at the same time, users are gaining in maturity, to the point of being increasingly involved in defining the interface that suits them perfectly.
[6] http://www.kpcb.com/blog/design-in-tech-report-2015
[Diagram: Why are Tech Companies Acquiring Design Agencies?
The old way of thinking: the solution to every new problem in tech has been simple: more tech; a better experience was made with a faster CPU or more memory.
The new way of thinking: Moore’s Law no longer cuts it as the key path to a happier customer.
(modified from the Design in Tech presentation, John Maeda, KPCB partner)]
Moreover, the new generations of users no longer consider possibilities driven by technology as innovation breakthroughs but rather as basic expectations (it is normal for technology to open up new possibilities). Thus it is Design which makes the difference in what clients buy and the brands they are loyal to.
Noting this trend, many web giants started buying up companies specialised in Design from 2010 onwards.
[Chart: #DesignInTech M&A activity: the number of designer-co-founded tech companies acquired, from 2005 to the present. Mobile was the inflection point for #DesignInTech. Companies shown include Flickr, Android, YouTube, Vimeo, Fab, LevelMoney, Polar, Ultravisual, WillCall, Beats, Readmill, Simple, Sold, Tumblr, Pulse, Mailbox, Foodspotting, Forrst, Behance, Acrylic Software, Slideshare, Instagram, OMGPOP, Posterous, Gowalla, Hunch, Push Pop Press, Daytum, about.me, Songza and Mint, with two acquisitions marked at $1.0B and $1.65B. (modified from the Design in Tech presentation, John Maeda, KPCB partner)]
How can I make it work for me?

The crucial step is to evolve your company into a Design-centric Organization:
The strategy is to promote full integration of Design Thinking in your company:[7]
Design-centric leaders, who consider Design as a structural cultural edge both within their company and in the expression of their values (Products, services, expert advice, quality of product code...).
Embracing the Design culture: the development of the business culture is systematically informed by values of empathy rather than organic growth; the user experience (UX) is the most important benchmark; and the goal is to provide high-quality client experiences (CX) with true value.
The Design thought process: Design Thinking and its implementation are a given in the company mindset, so teams concentrate on the opportunities in problems rather than on project opportunities. Several implementation vectors can serve to promote this mindset:
The acquisition of talent, i.e. incorporating designers (IBM).
Calling upon consultants for help with issues which go beyond methodology.
Assimilation, by integrating Design studios and coaches.[8]
A structure built around Design. Companies are organised around attracting talent and co-leaders for each position to encourage each other to create initiatives, an integral part of their responsibilities.
Globally speaking, 10% of the 125 richest companies in the USA have top managers or CEOs with a Design background. Alongside the web giants, one notes that the CEO of Nike is a Designer; Apple is the only company to have an SVP for Design.
Adapting will take time, especially as most have yet to realise the relevance of Design. Getting help from Designers or UX specialists familiar with the approach is necessary for sharing these new tools and then putting them into operation.
[7] https://hbr.org/2015/09/design-as-strategy[8] http://www.wired.com/2015/05/consulting-giant-mckinsey-bought-top-design-firm/
[Chart: proportion of companies at each level in 2003 vs 2007: companies that do not use design; companies that use design for styling and appearance; companies that integrate design into the development process; companies that consider design a key strategic element.]
Who makes it work for them?

While the GAFAM, NATU and other web giants have been following this strategy since 2010, today all sectors refer, directly or indirectly, to Design Thinking in their quest for an optimal client experience.[9]
Among concrete examples, we will mention the following:
On the point of disappearing after multiple failures, AirBnB managed to turn themselves around thanks to Design Thinking.[10]
Exploration of aggregated services, proposed and tested by Uber in partnership with Google.[11]
Still at Uber, the Design Thinking approach underlies the entire internal structure of the company.[12]
With the same goal, IBM restructured its Organization through a Design transition.[13]
At Dropbox, Design Thinking is ubiquitous, both in its products and in its internal structure.[14] [15]

More precisely, one can describe strong involvement in Strategic Design as:
Implementation in several stages (from visual Design to strategic Design) at Google, Apple, Facebook, Dropbox, Twitter, Netflix, Salesforce, Amazon…
An overarching Design-centric strategy at Pinterest, AirBnB, Google Ventures, Coursera, Etsy, Uber, and most FinTechs.
[9] > http://blog.invisionapp.com/product-design-documentary-design-disruptors/
[10] > https://www.youtube.com/watch?v=RUEjYswwWPY
[11] > http://www.happinessmakers.com/knowledge/2015/11/29/inside-ubers-design-thinking
[12] > http://talks.ui-patterns.com/videos/applying-design-thinking-at-the-organizational-level-uber-amritha-prasad
[13] > http://www.fastcodesign.com/3028271/ibm-invests-100-million-to-expand-design-business
[14] > https://twitter.com/intercom/status/614537634833137664
[15] > http://designerfund.com/bridge/day-in-the-life-rasmus-andersson/
Associated patterns
• Pattern “Enhancing the user experience“, p. 27
• Pattern “Lean Startup“, p. 87
Sources
• Evolution of Design Thinking: special issue of the Harvard Business Review:
> https://hbr.org/archive-toc/BR1509?cm_sp=Magazine%20Archive-_-Links-_-Previous%20Issues
> http://stanfordbusiness.tumblr.com/post/129579353544/how-design-thinking-can-help-drive-relevancy-in

• The example of AirBnB:
> https://growthhackers.com/growth-studies/airbnb
> https://www.youtube.com/watch?v=RUEjYswwWPY

• Methodology:
> https://www.ideo.com/images/uploads/thoughts/IDEO_HBR_Design_Thinking.pdf
> https://www.rotman.utoronto.ca/Connect/RotmanAdvantage/CreativeMethodology.aspx
> http://www.gv.com/sprint/

• Design Value:
> http://www.dmi.org/default.asp?page=DesignDrivesValue#.VW6gfEycSdQ.twitter
> Design-Driven Innovation: Why it Matters for SME Competitiveness, White Paper, Circa Group

• Design in Tech:
> http://www.kpcb.com/blog/design-in-tech-report-2015
Device Agnostic
THE WEB GIANTS PRACTICES / DEVICE AGNOSTIC
Description

For the Web Giants, user-friendliness is no longer open to debate: it is non-negotiable.
As early as 2003, the Web 2.0 manifesto pleaded in favor of the “Rich User Experience“, and today, anyone working in the world of the Web knows the importance of providing the best possible user interface. It is held to be a crucial factor in winning market shares.
In addition to demanding a high-quality user experience, people want to access their applications anywhere, anytime, in all contexts of their daily lives. A distinction is thus generally made between situations where one is sedentary (e.g. at the office), nomadic (e.g. waiting in an airport terminal) or mobile (e.g. walking down the street).
These situations are currently linked to various types of equipment, or devices. Simply put, one can distinguish between:
Desktop computers for sedentary use.
Laptops and tablets for nomadic use.
Smartphones for mobile use.
The Device Agnostic pattern means doing one’s utmost to offer the best user experience possible whatever the situation and device.
One of the first companies to develop this type of pattern was Apple with its iTunes ecosystem. In fact, Apple first made music accessible on PC/Mac and iPod, then on the iPhone and iPad. Thus they have covered the three use situations. In contrast, Apple does not fully apply the pattern as their music is not accessible on Android or Windows Phone.
To implement this approach, it can be necessary to offer as many interfaces as there are use situations. Indeed, a generic interface of the one-size-fits-all type does not allow for optimal use on computers, tablets, smartphones, etc.
The solution adopted by many of Web Giants is to invest in developing numerous interfaces, applying the pattern API first (cf. “Open API“, p. 235). Here the principle is for the application architecture to be based on a generic API, with the various interfaces then being directly developed by the company, or indirectly through the developer and partner ecosystem based on the API.
To get the most out of each device, it is becoming ever more difficult to use only Web interfaces, because they do not give access to functionalities specific to a given device (push notifications, photo and video capture, accelerometer, etc.). Users also get an impression of lag, because the entire content has to be loaded up front,[1] whereas native applications need to load nothing, or only a few XML or JSON resources.
“I’d love to build one version of our App that could work everywhere. Instead, we develop separate native versions for Windows, Mac, Desktop Web, iOS, Android, BlackBerry, HP WebOS and (coming soon) Windows Phone 7. We do it because the results are better and, frankly, that’s all-important. We could probably save 70% of our development budget by switching to a single, cross-platform client, but we would probably lose 80% of our users.“

Phil Libin, CEO of Evernote (January 2011)
However, things are changing with HTML5, which works in offline mode and provides enough resources for the many applications that do not need GPS or an accelerometer. In sum, Web companies take two approaches: some use only native applications, such as Evernote; others take a hybrid approach, using HTML5 content embedded in a native application which then becomes a simple empty shell, capable only of receiving push notifications. This is notably the case of Gmail, Google+ and Facebook for iPhone. One of the benefits of this approach is enhanced visibility in the App Stores where users go for their applications.
The hybrid pattern is thus a good compromise: companies can reuse the same HTML5 code across a variety of devices and still distribute the application through an app store on iOS, Android, Windows Phone and, soon, Mac and Windows.
[1] This frontloading can be optimized (cf. “Enhancing the user experience“, p. 27) but there are no miracles…
THE WEB GIANTS PRACTICES / DEVICE AGNOSTIC
Who makes it work for them?
There are many examples of the Device Agnostic pattern implemented among the Web Giants. Among others:
In the category of exclusively native applications: Evernote, Twitter, Dropbox, Skype, Amazon, Facebook.
In the category of hybrid applications: Gmail, Google+.
References among Web Giants
Facebook proposes:
A Web interface for PC/Mac: www.facebook.com.
A Web interface for Smartphones: m.facebook.com.
Embedded mobile interfaces for iPad, iPhone, Android, Windows Phone, BlackBerry, PalmOS.
A text message interface to update one’s status and receive notifications of friend updates.
An email interface to update one’s status.
In addition, several embedded interfaces for Mac and PC are offered by third parties such as Seesmic and Twhirl.
Twitter stands out from the other Web Giants in that it is their ecosystem which does the implementing for them (cf. “Open API“, p. 235). Many Twitter graphic interfaces were in fact created by third parties: TweetDeck, Tweetie for Mac and PC, Twitterrific, Twidroid for smartphones... So much so that, for a time, Twitter’s own web interface was considered user-unfriendly, and many preferred the interfaces produced by the ecosystem. Twitter has since been overhauling its interfaces.
In France
The Device Agnostic pattern is found among major media groups. For example, Le Monde offers:
A Web interface for PC/Mac: www.lemonde.fr
A Web interface for Smartphones: mobile.lemonde.fr
Hybrid mobile interfaces for iPhone, Android, Windows Phone, Blackberry, PalmOS, Nokia OVI, Bada
An interface for iPad
It is also found in services with high consultation rates such as banking. For example, the Crédit Mutuel proposes:
A Web interface for PC/Mac: www.creditmutuel.fr
A redirect service for all types of device: m.cmut.fr
A Web interface for Smartphones: mobi.cmut.fr
A Web interface for tablets: mini.cmut.fr
A WAP interface: wap.cmut.fr
A simplified Java interface for low technology phones
Embedded mobile interfaces for iPad, iPhone, Android, Windows Phone, BlackBerry
An interface for iPad.
How can I make it work for me?
The pattern is useful for any B2C service where anywhere, anytime access is important.
If your budget is limited, you can implement the mobile application most used by your target clients, and propose an open API in the hope that others will develop interfaces for the remaining devices.
Associated patterns
The Open API or open ecosystem pattern, p. 235.
The Enhancing the User Experience pattern, p. 27.
Exception!
As mentioned earlier, this pattern is limited only by the budget required for its implementation.
Sources
• Rich User Experiences, Web 2.0 manifesto, Tim O’Reilly:
> http://oreilly.com/Web2/archive/what-is-Web-20.html
• Four Lessons From Evernote’s First Week On The Mac App Store, Phil Libin:
> http://techcrunch.com/2011/01/19/evernote-mac-app-store
Perpetual beta
Description
Before introducing perpetual beta, we must revisit a classic pattern from the open source world:
Release early, release often.
The principle behind this pattern consists in regularly releasing code to the community to get continuous feedback on your product from programmers, testers, and users. The practice is described in Eric Steven Raymond’s 1999 work “The Cathedral and the Bazaar“. It is in keeping with the short-iteration principle of agile methods.
The principle of perpetual beta was introduced in the Web 2.0 manifesto written by Tim O’Reilly where he writes:
Users must be treated as co-developers, in a reflection of open source development practices (...).The open source dictum, ‘release early and release often’, in fact has morphed into an even more radical position, ‘the perpetual beta’, in which the product is developed in the open, with new features slipstreamed in on a monthly, weekly, or even daily basis.
The term “perpetual beta“ refers to the fact that an application is never finalized but is constantly evolving: there are no real releases of new versions. Working this way is obviously in line with the logic of “Continuous Delivery“ (cf. “Continuous Deployment“, p. 105).
This constant evolution is possible because these are online services rather than shipped software:
In the case of software, version management usually follows a roadmap with publication milestones: releases. These releases are spread out over time for two reasons: the time it takes to deploy versions to users, and the need to ensure maintenance and support for each version in the field. Providing
support, security updates and ongoing maintenance for several versions of a single program is a nightmare, and a costly one. Take the example of Microsoft: at one point the Redmond-based publisher had to manage changes to Windows XP, Vista and 7 simultaneously. Imagine three engineering teams all working on the same software: a terrible waste of energy, and a major burden for any company lacking Microsoft’s resources. This syndrome is known as “version perversion“.
In the context of online services, only one version of the application needs to be managed. Furthermore, since the Web Giants themselves deploy and host their applications, users benefit from updates without having to manage any software deployment.
New functionalities appear on the fly, to be “happily“ discovered by users, who thus learn to use new functions progressively. Generally speaking, backward compatibility is well managed (with a few exceptions, such as Gmail’s offline support when Google Gears was abandoned). This model is widely applied by Cloud Computing players.
The “customer-driven roadmap“ is a complementary and virtuous feature of the perpetual beta (cf. “Lean Startup“, p. 87). Since the Web Giants run the production platform, they can measure use of their software in fine detail, and thereby the success of each new functionality. As mentioned previously, the Giants follow metrics very closely. So closely, in fact, that we have devoted a chapter to the subject (cf. “The obsession with performance measurement“, p. 13).
More classically, running the production platform provides opportunities to launch surveys among various target populations to get user feedback.
To apply the perpetual beta pattern, you must have the means to carry out regular deployments. The prerequisites are:
implementing automatic software builds,
practicing Continuous Delivery,
ensuring you can roll back in case of trouble.
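The rollback prerequisite can be sketched with a common release-directory convention: each deploy goes into its own directory, and a “current“ pointer (here a plain file; in practice often a symlink) names the live release, so rolling back is just re-pointing it. The layout is illustrative, not a specific tool's.

```python
# Toy release + rollback scheme: deploys never overwrite each other, so
# reverting is instantaneous and needs no rebuild.
import os
import tempfile

def deploy(root, release, history):
    os.makedirs(os.path.join(root, "releases", release), exist_ok=True)
    history.append(release)
    with open(os.path.join(root, "current"), "w") as f:
        f.write(release)                    # point "current" at the new release

def rollback(root, history):
    if len(history) < 2:
        raise RuntimeError("nothing to roll back to")
    history.pop()                           # discard the broken release
    with open(os.path.join(root, "current"), "w") as f:
        f.write(history[-1])                # re-point at the previous one

def current(root):
    with open(os.path.join(root, "current")) as f:
        return f.read()

if __name__ == "__main__":
    root, history = tempfile.mkdtemp(), []
    deploy(root, "v1", history)
    deploy(root, "v2", history)
    rollback(root, history)                 # v2 misbehaves: back to v1
    print(current(root))                    # v1
```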
There is some controversy around the perpetual beta: some clients equate “beta“ with an unfinished product and believe that services following this pattern are not reliable enough to count on. This has led some service operators to remove the “beta“ label from their site, albeit without changing their practices.
Who makes it work for them?
The reference here is Gmail, which sported the “beta“ label until 2009 (with the tongue-in-cheek “back to beta“ option added later).
It is a practice implemented by many Web Giants: Facebook, Amazon, Twitter, Flickr, Delicious, etc.
A good illustration of perpetual beta is provided by Gmail Labs: these are small, self-contained functionalities which users can choose to activate or not. Depending on the adoption rate, Google then decides whether or not to integrate them into the standard version of the service (cf. “Feature Flipping“, p. 113).
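The Gmail Labs mechanism of opt-in features whose adoption is then measured can be sketched as follows (a toy model; Gmail's actual implementation is not public):

```python
# Feature-flipping sketch: optional features a user can switch on, plus an
# adoption rate the operator reads before promoting a feature to everyone.

class FeatureFlags:
    def __init__(self, features):
        self.opted_in = {f: set() for f in features}   # feature -> users

    def enable(self, user, feature):
        self.opted_in[feature].add(user)

    def is_active(self, user, feature):
        return user in self.opted_in.get(feature, set())

    def adoption_rate(self, feature, total_users):
        return len(self.opted_in[feature]) / total_users

if __name__ == "__main__":
    flags = FeatureFlags(["offline_mode"])
    flags.enable("alice", "offline_mode")
    print(flags.is_active("alice", "offline_mode"))   # True
    print(flags.adoption_rate("offline_mode", 4))     # 0.25
```

A high enough adoption rate would argue for folding the feature into the standard service; a low one for retiring it.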
In France, the following services display, or have displayed, the beta logo on their home page:
urbandive.com: a street-view navigation service by Pages Jaunes,
sen.se: a service for storing and analyzing personal data.
Associated patterns
Pattern “Continuous Deployment“, p. 105.
Pattern “Test A/B“, p. 123.
Pattern “The obsession with performance measurement“, p. 13.
Exception!
Some Web Giants still choose to keep multiple versions up and running simultaneously. Maintaining several versions of an API is particularly relevant, as it saves developers from being forced to update their code every time a new version of the API is released (cf. “Open API“, p. 235).
The Amazon Web Services API is a good example.
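Keeping several API versions live can be sketched as version-prefixed routing: requests carry a version in their path and are dispatched to the matching handler, so old clients keep working after a new version ships. Paths and payloads below are invented for illustration.

```python
# Versioned-API sketch: /v1/ and /v2/ coexist; v1 clients are untouched by
# the v2 contract change.

def handle_v1(params):
    return {"name": params["name"]}                 # original contract

def handle_v2(params):
    # v2 renamed a field and added one; only v2 clients see this shape.
    return {"full_name": params["name"], "since": "v2"}

ROUTES = {"/v1/user": handle_v1, "/v2/user": handle_v2}

def dispatch(path, params):
    handler = ROUTES.get(path)
    if handler is None:
        return {"error": 404}
    return handler(params)
```

Retiring `/v1/` then becomes an explicit, announced decision rather than a side effect of every release.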
Sources
• Tim O’Reilly, What Is Web 2.0?, September 30, 2005:
> http://oreilly.com/pub/a/web2/archive/what-is-web-20.html
• Eric Steven Raymond, The Cathedral and the Bazaar:> http://www.catb.org/~esr/writings/cathedral-bazaar/cathedral-bazaar/
Architecture
Cloud First ............................ 159
Commodity Hardware ..................... 167
Sharding ............................... 179
TP vs. BI: the new NoSQL approach ...... 193
Big Data Architecture .................. 201
Data Science ........................... 211
Design for Failure ..................... 219
The Reactive Revolution ................ 225
Open API ............................... 233
Cloud First
Description
As we saw in the description of the “Build vs. Buy“ pattern (cf. “Build vs. Buy“, p. 19), the Web Giants favor specific developments so as to control their tools from end to end, whereas many companies instead use software packages, considering software tools to be commodities.[1]
Although the Web Giants, like startups, prefer to develop critical applications in-house, they do at times have recourse to third-party commodities. In such cases, they apply the commodity logic to the fullest by choosing to completely outsource the service to the Cloud.
By favoring services in the Cloud, the Web Giants, again like startups, take a very pragmatic stance: profiting speedily from the best innovations of their peers, with an easy-to-use purchase model, so as to focus their efforts on their business strengths. This model can inspire any company wishing to move fast and reduce investment costs to win market share.
Why favor the Cloud for commodities? The table below lays out the advantages.
The Cloud approach can be divided into three main strands:
Using APIs and Mashups: Web Giants massively call upon services developed by Cloud companies (Google Maps, user identification on Facebook, payment with PayPal, statistics with Google Analytics, etc.) and integrate them in their own pages via the mashup principle.
Outsourcing functional commodities: Web majors often externalize their commodities to SaaS services (e.g. Google Apps for collaborating, Salesforce for managing sales personnel, etc.)
Outsourcing technical commodities: Web players also regularly use IaaS and PaaS platforms to host their services (Netflix and Heroku, for example, use Amazon Web Services).
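The mashup principle in the first strand reduces to composing third-party services into one page. In the sketch below, the fetchers are injected stand-ins for real HTTP calls to services such as a maps or profile API; all names are illustrative.

```python
# Mashup sketch: a page is assembled from external services. Injecting the
# fetchers keeps the composition logic independent of any real API.

def build_page(user, fetch_map, fetch_profile):
    profile = fetch_profile(user)
    return {
        "title": f"Welcome {profile['name']}",
        "map": fetch_map(profile["city"]),   # third-party widget, embedded as-is
    }

# Stand-ins for external APIs (a real mashup would call them over HTTP).
def fake_map(city):
    return f"<iframe>map of {city}</iframe>"

def fake_profile(user):
    return {"name": user.title(), "city": "Paris"}
```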
Cost
In-house management: initial outlay for licenses, equipment, staff.
Cloud: pay-per-use; neither investment nor commitment.

Time to Market
In-house management: license purchase, then deployment by the company within a few weeks.
Cloud: self-service subscription, automatically implemented within minutes.

Roadmap / new functionalities
In-house management: designed in the mid term by publishers following feedback from user groups.
Cloud: implemented in the short term depending on what users do with the service.

Rhythm of change
In-house management: often one major release per year.
Cloud: new functionalities on the fly.

Support and updates
In-house management: additional yearly cost.
Cloud: included in the subscription.

Hosting and operating
In-house management: entails building and operating a datacenter by experts.
Cloud: delegated to the Cloud operator.

Physical safety of data
In-house management: data integrity is the responsibility of the company.
Cloud: the major Cloud operators ensure data safety in accordance with the ISO 27001[1] and SSAE 16[2] standards.
[1] ISO 27001: http://en.wikipedia.org/wiki/ISO_27001
[2] SSAE 16 (replacing the SAS 70 Type 2): http://www.ssae-16.com
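The cost line of the table can be made concrete with a back-of-the-envelope break-even calculation between an upfront outlay plus yearly support and pure pay-per-use. All figures are illustrative assumptions, not vendor prices.

```python
# Toy cost comparison: one-off license + yearly support vs pay-as-you-go.
# Every number below is an assumption chosen for illustration.

def in_house_cost(years, outlay=100_000, yearly_support=15_000):
    return outlay + years * yearly_support

def cloud_cost(years, monthly_fee=2_500):
    return years * 12 * monthly_fee

def break_even_year(max_years=20):
    """First year (if any) where cumulative Cloud spend exceeds in-house."""
    for y in range(1, max_years + 1):
        if cloud_cost(y) > in_house_cost(y):
            return y
    return None

# break_even_year() -> 7 under these assumptions: the Cloud wins early on
# (no outlay), while in-house can win over a long horizon.
```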
Housing technical commodities in the Cloud is particularly interesting for Web companies. With the pay-as-you-go model, they can launch online activities with next to no hosting costs. Charges increase progressively as the number of users grows, alongside revenues, so all is well. The Cloud has thus radically changed their launch schedules.
The Amazon Web Services IaaS platform is massively used by Web Giants such as Dropbox, 37signals, Netflix, Heroku...
During the CloudForce 2009 conference in Paris, a Vice-President of Salesforce affirmed that the company did not use an IaaS platform only because such solutions did not exist when the company was created, and that if it were starting today it would certainly choose IaaS.
Who makes it work for them?
The eligibility of the Cloud varies depending both on the type of data you manipulate and on regulatory constraints. Thus:
Banks in Luxembourg are forbidden from storing their data elsewhere than in certified organizations.
Companies working with sensitive data, industrial secrets or patents are reluctant to store them in the Cloud. The Patriot Act[3] in particular pushes companies away from the Cloud: it forces companies registered in the United States to make their databases available upon request by government authorities.
Companies which work with personal data can also be forced to restrict their recourse to the Cloud because of CNIL regulations, compliance with which varies from one Cloud platform to the next (variable implementation of the Safe Harbor Privacy Principles).[4]
[3] http://en.wikipedia.org/wiki/PATRIOT_Act
[4] http://en.wikipedia.org/wiki/International_Safe_Harbor_Privacy_Principle
When there are no such constraints, using the Cloud is possible. And many companies of all sizes and from all sectors have migrated to the Cloud, in the USA as well as in Europe.
Let us describe a case that well illustrates the potential of the Cloud:
In 2011, Vivek Kundra, former CIO at the White House, announced the “Cloud First“ program, which stipulated that all US administrations had to consider the Cloud first and foremost for IT. This decision should be put in context: in the USA there is the “GovCloud“, i.e. Cloud offers suited to administrations, fully respecting their constraints, located on American soil, and isolated from other clients. Such services are offered by Amazon, Google and other providers.[5]
In some companies, it is the mindset that is set against storing data in the Cloud. This reluctance is due to the factors presented above, but also to a lack of confidence (Cloud providers have not yet reached the levels of trust enjoyed by banks) and possibly to an unwillingness to change. The Web Giants are less affected by these last two impediments: they are already well acquainted with the Cloud providers and are open to change.
Cloud addiction?
One should also be careful not to depend too heavily on a single Cloud platform to host critical applications. These platforms are not fail-proof, as recent failures have shown: Microsoft Azure (February 2012), Salesforce (June 2012), Amazon Web Services (April and July 2012). The AWS failures highlighted some customers’ lack of maturity in their use of the Cloud:
Pinterest, Instagram and Heroku, which depended on a single Amazon datacenter, were strongly impacted,
[5] Federal Cloud Computing Strategy, Vivek Kundra, 2011:
http://www.forbes.com/sites/microsoft/2011/02/15/kundra-outlines-cloud-first-policy-for-u-s-government
Netflix used several Amazon datacenters and was thus less affected[6]
(cf. “Design for Failure“, p. 221).
One should note however that such failures create media hype whereas very little is known about the robustness of corporate datacenters. It is therefore difficult to measure the true impact on users.
Here are a few Service Level Agreements that you can compare with those of your company:
Amazon EC2: 99.95% availability per year.
Google Apps: 99.9% availability per year.
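Converting such SLAs into tolerated downtime per year is simple arithmetic, and makes the comparison with your own datacenter's track record concrete:

```python
# Downtime allowed by an availability SLA, over one non-leap year.

HOURS_PER_YEAR = 365 * 24  # 8,760

def max_downtime_hours(availability_percent):
    return (1 - availability_percent / 100) * HOURS_PER_YEAR

print(round(max_downtime_hours(99.95), 2))  # 4.38  (Amazon EC2 figure above)
print(round(max_downtime_hours(99.9), 2))   # 8.76  (Google Apps figure above)
```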
References among Web Giants
A few examples of recourse to the Cloud by Web Giants:
using Amazon Web Services: Heroku, Dropbox, 37Signals, Netflix, Etsy, Foursquare, Voyages SNCF. In fact, Amazon represents 1% of all traffic on the Web;
using Salesforce: Google, LinkedIn;
using Google Apps: Box.net.
In France
A few examples of Cloud use in France:
In industry: Valeo and Treves use Google Apps.
In insurance: Malakoff Méderic uses Google Apps.
[6] Feedback from Netflix on AWS failures:
http://techblog.netflix.com/2011/04/lessons-netflix-learned-from-aws-outage.htm
In the banking sector: most use Salesforce for at least part of their activities.
In the Internet sector: PagesJaunes uses Amazon Web Services.
In the public sector: La Poste uses Google Apps for their mail delivery staff.
How can I make it work for me?
If you are an SME or a VSE, you would probably benefit from externalizing your commodities to the Cloud, for the same reasons as the Web Giants. All the more so as regulatory issues, such as the protection of industrial secrets, should be eased by the emergence of French and European Clouds such as Andromède.
If you are a large company, already well endowed with hardware and IT teams, the benefits of the Cloud can be offset by the cost of change. It can nevertheless be worth studying the question. In any case, you can profit from the Cloud’s agility and pay-as-you-go approach for:
innovative projects: pilot projects, proofs of concept, project incubation, etc.
environments with limited life spans (development, testing, design, etc.).
Related Pattern
Pattern “Build vs. Buy“, p. 19.
Exception!
As stated earlier, regulatory constraints can cut off access to the Cloud.
In some cases, re-internalization is the best solution: when data and user volumes increase spectacularly, it can be cheaper to repatriate applications and build a datacenter on a totally optimized architecture. This type of optimization does, however, typically require highly qualified staff.
Commodity Hardware
Description
Although invisible behind your web browser, millions of servers run day and night to keep the Web available 24/7. There are very few leaks as to numbers, but it is clear that the major Web companies have tens or even hundreds of thousands of machines, as with EC2,[1] and it is even surmised that Google has somewhere around a million.[2] Managing so many machines is not only a technical challenge, it is above all an economic one. Most major players have addressed the problem by using mass-produced equipment, also called “commodity hardware“, the term we will use from here on.
This is one of the reasons which have led the Web Giants to interconnect large numbers of mass-produced machines rather than use a single large system. A single service to a client, a single application, can run on hundreds of machines. Managing hardware this way is known as Warehouse Scale Computing,[3] with hundreds of machines replacing a single server.
Business needs
Web Giants share certain practices, described in various other chapters of this book:[4]
A business model tied to the analysis of massive quantities of data - for example indexing web pages (i.e. approximately 50 billion pages).
One of the most important performance issues is to ensure that query response times stay low.
[1] Source SGI.
[2] Here again it is hard to make estimates.
[3] This concept is laid out in great detail in the very long paper The Data Center as a Computer, we only mention a few of their concepts here. The full text can be found at: http://www.morganclaypool.com/doi/pdfplus/10.2200/S00516ED2V01Y201306CAC024
[4] cf. in particular “Sharding“, p. 179.
Income, e.g. from advertising, is not linked to the number of queries; per-query income is actually very low.[5] Comparatively speaking, the cost per unit using traditional large servers remains too high. The incentive to find the architecture with the lowest cost per transaction is thus very strong.
Lastly, the orders of magnitude of the processing carried out by the Giants are far removed from traditional corporate computing, where until now the number of users was bounded by the number of employees. No machine, however big, is capable of meeting their needs.
In short, these players need scalability at a constant, and low, marginal cost per transaction.
Mass-produced machines vs. high-end servers
When scalability is at issue, there are two main alternatives:
Scale-up, or vertical scaling, consists in using a better-performing machine. This is the alternative most often chosen in the past because it is very simple to implement. Moreover, Moore’s law means that manufacturers regularly offer more powerful machines at constant prices.
Scale-out, or horizontal scaling, consists in pooling the resources of several machines, each of which can be much less powerful. This removes any limit tied to the size of a single machine.
Furthermore, PC components, technologies and architectures show a highly advantageous performance/cost ratio. Their relatively weak processing capacity compared to more efficient architectures such as RISC is compensated for by the lower costs of mass production. A study based on TPC-C[6] results shows that the relative cost per transaction is three times lower with a low-end server than with a top-of-the-line one.
[5] “Early on, there was an emphasis on the dollar per (search) query,“ [Urs] Hoelzle said. “We were forced to focus. Revenue per query is very low.“ http://news.cnet.com/8301-1001_3- 10209580-92.html
[6] Ibid., [3] preceding page.
At the scales implemented by Web Giants - thousands of machines coordinated to execute a single function - other costs become highly prominent: electric power, cooling, space, etc. The cost per transaction must take these various factors into account.
Realizing that has led the Giants to favor horizontal expansion (scale-out) based on commodity hardware.
Who makes it work for them?
Just about all of the Web Giants. Google, Amazon, Facebook, LinkedIn… all currently use x86-type servers and commodity hardware. However, using such components introduces other constraints, and having a “Data Center as a Computer“ entails scaling constraints which differ widely from what most of us think of as datacenters. Let us therefore go into more detail.
Material characteristics which impact programming
Traditional server architecture strives, to the extent allowed by the hardware, to present developers with a “theoretical architecture“: a processor, a central memory containing program and data, and a file system.[7]
Familiar programming, based on variables, function calls, threads and processes, makes this approach necessary.
Large-system architectures come as close to this “theoretical architecture“ as a set of machines in a datacenter is far removed from it.
Machines of the SMP (Symmetric Multi-Processor) type, used for scaling up, now make it possible to use standard programming, with uniform access to the entire memory and all the disks.
[7] This architecture is known as the Von Neumann architecture.
Figure 1 (modified). Source RedPaper 4640, page 34.
As the figures in the diagram show, great efforts are made to ensure that bandwidth and latency are nearly identical between a processor, its memory and its disks, whether they are directly connected, connected to the same processor book,[9] or to different ones. Such NUMA (Non-Uniform Memory Access: accessing nearby memory is faster than accessing memory in a different part of the system) characteristics as remain are confined to the central memory, with latency and bandwidth differences within a 1-to-2 ratio.
[9] A processor book is a compartment which contains processors, memory and in and out connectors, at the first level it is comparable to a main computer board. Major SMP systems are made up of a set of compartments of this sort interconnected through a second board: the midplane.
[Figure 1 depicts an SMP server built from processor books interconnected through a midplane, with I/O drawers and redundant controllers. Orders of magnitude shown: one processor: RAM 256 GB, 100 ns, 76.5 GB/s; one processor book: RAM 1 TB, 133 ns, 46.6 GB/s; the whole server: RAM 8 TB, 39.4 GB/s; disks at every level: 304 TB, 10 ms, up to 50 GB/s.]
Operating systems and middleware like Oracle can take charge of such disparities.
From a scale-out perspective, the program no longer runs on a single large system; instead, a supervising program distributes it over a set of machines. This way of interconnecting commodity-hardware machines gives the developer a vision very different from the “theoretical architecture“.
Figure 2. Source The Data Center As A Computer page 8
L1$: level 1 cache, L2$: level 2 cache.
[Figure 2 depicts a Warehouse Scale Computer: servers (P = processor, with L1$/L2$ caches, local DRAM and disks) grouped into racks behind a rack switch, and racks grouped into a cluster behind a datacenter switch. Orders of magnitude shown: one server: DRAM 16 GB, 100 ns, 20 GB/s; disk 2 TB, 10 ms, 200 MB/s. One local rack (80 servers): DRAM 1 TB, 300 µs, 100 MB/s; disk 160 TB, 11 ms, 100 MB/s. One cluster (30 racks): DRAM 30 TB, 500 µs, 10 MB/s; disk 4.80 PB, 12 ms, 10 MB/s.]
Whenever the network is used to access data on another server, latency increases and bandwidth is divided by a factor of up to 1,000. In addition, it is the network equipment at the head of the datacenter that limits the aggregated bandwidth of all the machines.
In consequence, to optimize access times and throughput within the datacenter, data and processing must be carefully distributed across servers (in particular, data items that are often accessed together should not be spread over several machines). However, operating systems and traditional middleware layers are not designed to work this way, so the distribution must be handled at the application level. This is precisely where sharding[10] strategies come into play.
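A minimal sharding sketch: hash each key to a shard so that a record and all lookups for it stay on one machine, avoiding the network hop the previous paragraph warns about. The in-memory store below is a toy stand-in for real servers.

```python
# Sharding sketch: co-locate all data for one key on one "machine" (here a
# dict per shard) by hashing the key.

def shard_for(key, n_shards):
    # Python's hash() is randomized per process but stable within one run;
    # a production system would use a stable hash such as a digest.
    return hash(key) % n_shards

class ShardedStore:
    def __init__(self, n_shards):
        self.shards = [dict() for _ in range(n_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

With a key such as a user id, every read and write for that user always lands on the same shard, so no query has to cross machines.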
Front-end elements serving Web pages easily accommodate these constraints, given that versioning is not an issue and HTTP requests are easy to distribute over several machines. Other applications, however, must either explicitly manage network exchanges or rely on new, specific middleware layers. Storage solutions of this kind are likewise deployed among the Web Giants using sharding techniques.
Implementing failure resistance
The second significant difference between large systems and Warehouse Scale Computers lies in failure tolerance. For decades, large systems have developed advanced hardware mechanisms to reduce failures as far as possible (RAID, hot-swappable equipment, SAN-level replication, error correction and failover at the memory and I/O levels, etc.). A Warehouse Scale Computer has the opposite characteristics, for two reasons:
Commodity hardware components are less reliable;
the global availability of a system deployed simultaneously across several machines is the product of the availability of each server.[11]
[10] cf. “Sharding“, p. 179.
[11] Thus if each machine has an annual downtime of 9 hours, the availability of 100 servers will be at best 0.999^100 ≈ 0.90, i.e. roughly 36 days of unavailability per year!
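The footnote's arithmetic can be checked directly:

```python
# 9 hours of downtime a year per machine, and the naive system needs all
# 100 machines up at once: availabilities multiply.

HOURS_PER_YEAR = 365 * 24

per_machine = 1 - 9 / HOURS_PER_YEAR      # ~0.99897 availability per server
fleet = per_machine ** 100                # all 100 must be up simultaneously
downtime_days = (1 - fleet) * 365
print(round(downtime_days, 1))            # 35.7 — roughly the 36 days cited
```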
[12] SGI is the result of a merger between Silicon Graphics, Cray and above all of Rackable who had expertise in the field of x86 servers.
[13] http://www.youtube.com/watch?v=Ho1GEyftpmQ
Because of this, the Web Giants consider that the system must be able to function continuously even when some components have failed. Once again, the application layer is responsible for ensuring this failure tolerance (cf. “Design for Failure“, p. 221).
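Application-level failure tolerance often boils down to trying the next replica when one machine is down. A minimal sketch, with replicas modeled as plain callables rather than real remote servers:

```python
# Failover sketch: serve a read from the first replica that answers; only
# give up when every replica has failed.

def read_with_failover(replicas, key):
    last_error = None
    for replica in replicas:
        try:
            return replica(key)
        except ConnectionError as exc:
            last_error = exc          # this machine is down, try the next one
    raise RuntimeError("all replicas failed") from last_error

# Stand-ins for remote machines.
def down(key):
    raise ConnectionError("replica unreachable")

def healthy(key):
    return f"value-for-{key}"
```

Combined with sharded, replicated storage, the service keeps answering as long as at least one replica of each shard survives.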
On what criteria are the machines chosen?
That being said, the machines chosen by the Giants do not always resemble what we think of as PCs, or even the x86 servers of majors such as HP or IBM. Google is certainly the most striking example, as it builds its own machines. Other majors such as Amazon work with more specialized suppliers such as SGI.[12]
The top priority in choosing servers is, of course, the bottom line. Whittling components down to their precise needs, together with the sheer quantity of servers purchased, gives the Web Giants a strong negotiating position. Although verified data is lacking, it is estimated that the cost of a server for them can go as low as $500.
The second priority is electric power consumption. Given the sheer number of servers deployed, power has become a major expense item. Google recently stated that its average consumption was about 260 megawatts, amounting to a bill of approximately $30,000 per hour. The choice of components, as well as the capacity to configure each component’s consumption very precisely, can also yield huge savings.
In sum, even though they contain the same parts you would find in your desktop, these server configurations are a far cry from it. With the exception of a few initiatives such as Facebook's OpenCompute, the finer details are a secret the Giants guard fiercely. The most one can discover is that Google replaced its centralized UPS units with 12V batteries directly connected to the servers.[13]
THE WEB GIANTS
Exception!
There are almost no examples of Web Giants running on any technology other than x86. If we went back in time, we would probably find a "Powered by Sun" logo at Salesforce.[14]
How can I make it work for me?
Downsizing, i.e. replacing central servers with smaller machines, peaked in the 1990s. We are not giving a sales pitch for commodity hardware, even if one does get the feeling that x86 has taken over the business. The choice of commodity hardware goes further, as it transfers the responsibility for scalability and failure resistance to the applications.
For Warehouse Scale Computing, as for the Web Giants, when electricity and investment costs become crucial, it is the only viable solution. For existing software which can run on the resources of a single multiprocessor server, the cost of (re-)developing it as a distributed system and the cost of the hardware must be weighed against each other within the Information System.
The decision to use commodity hardware in your company must be made within the framework of your global architecture: as far as possible, either run what you already have on better-quality machines, or adapt it to migrate (completely) to commodity hardware. In practice, applications designed for distribution, such as Web front ends, will migrate easily. In contrast, highly integrated applications such as software packages necessarily entail specific infrastructure with disk redundancy, which is hardly compatible with a commodity hardware datacenter such as those used by the Web Giants.
[14] > http://techcrunch.com/2008/07/14/salesforce-ditch-remainder-of-sun-hardware
Associated patterns
Distributed computing is essential to using commodity hardware. Patterns such as sharding (cf. "Sharding", p. 179) need to be implemented in the code to be able to migrate to commodity hardware for data storage.
Using a large number of machines also complicates server administration, and patterns such as DevOps need to be adopted (cf. "DevOps", p. 71).
Lastly, the propensity shown by Web Giants to design computers, or rather datacenters, adapted to their needs is obviously linked to their preference for build vs. buy (cf. "Build vs. Buy", p. 19).
Sharding
Description
For any information system, data are an important asset which must be captured, stored and processed reliably and efficiently. While central servers often play the role of data custodian, most Web Giants have adopted a different strategy: sharding, or data distribution.[1]
Sharding describes a set of techniques for distributing data over several machines to ensure architecture scalability.
Business needs
Before detailing implementation, let us say a few words about the needs driving the process. Among Web Giants there are several shared concerns which most are familiar with: storing and analyzing massive quantities of data,[2] strong performance stakes to ensure delays are minimal, scalability[3] and even flexibility needs linked to consultation peaks.[4]
One specificity of the actors facing the issues above deserves emphasis. For Web Giants, revenues are often independent of the quantity of data processed and stem instead from advertising and user subscriptions.[5] They therefore need to keep unit costs per transaction very low. In traditional IT departments, transactions can easily be linked to physical flows (sales, inventory). Such flows make it easy to bill services depending on the number of transactions (conceptually speaking, through a sort of tax). With e-commerce sites, however, browsing the catalog or adding items to a cart does not necessarily generate revenue, because the user can quit the site just before confirming payment.
[1] According to Wikipedia, a database shard is a horizontal partition of data in a database or search server (http://en.wikipedia.org/wiki/Shard_(database_architecture)).
[2] Heightened by Information Systems being opened to the Internet (user behavior analysis, links to social media...).
[3] Scalability is of course tied to a system’s capacity to absorb a bigger load, but more important still is the cost. In other words, a system is scalable if it can handle the additional query without taking more time and if the additional query costs the same amount as the preceding ones (i.e. underlying infrastructure costs must not skyrocket).
[4] Beyond scalability, elasticity is linked to the capacity to have only variable costs unrelated to the load. Which is to say that a system is elastic if, whatever the traffic (10 queries per second or 1000 queries per second), the query price per unit remains the same.
[5] For example, no size limit to e-mail accounts.
THE WEB GIANTS ARCHITECTURE / SHARDING
In sum, the Information Systems of Web Giants must ensure scalability at extremely low marginal costs to uphold their business model.
Sharding to cut costs
To this day, most databases are organized centrally: a single server, possibly with redundancy in active/passive mode for availability. The usual solution for increasing the transaction load is vertical scalability, or scale-up, i.e. buying a more powerful machine (more I/O, more CPUs, more RAM...).

There are limits however to this approach: a single machine, no matter how powerful, cannot alone index the entire Web, for example. Moreover, the all-important question of costs leads to the search for other approaches.
Remember from the last chapter:
A study[6] carried out by engineers at Google shows that as soon as the load exceeds the capacities of a large system, the unit cost for large systems is much higher than with mass-produced machines.[7]
Although calculating per transaction costs is no easy matter and is open to controversy - architecture complexification, network load to be figured into the costs - the majority of Web Giants have opted for commodity hardware.
Sharding is one of the key elements in implementing horizontal scale-up.
[6] The study (http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006) is also summarized in the OCTO blog article: http://blog.octo.com/datacenter-as-a-computer-une-plongee-dans-les-datacenters-des-acteurs-du-cloud.
[7] This is another way of saying "commodity hardware": the machines are not necessarily low-end, but the performance/cost ratio is the highest possible for a given system.
How to shard
In fact there are two ways of partitioning, or sharding, data: vertically or horizontally.

Vertical sharding is the most widely used and consists of isolating concepts in separate databases or tables. For example, deciding to store client lists in one database and their contracts in another.

Horizontal sharding divides the rows of a database table and distributes them across multiple servers. For example, storing clients from A to M on one machine and from N to Z on another. Horizontal sharding is based on a distribution key: the first letter of the name in the example above.[8]

Web Giants have mostly implemented horizontal sharding. Its main advantage is that it is not limited by the number of concepts, as vertical sharding is.
Figure 1: centralized database, vertical partitioning and horizontal partitioning.
[8] In practice, partitioning takes into account the probability of names beginning with a given letter.
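The horizontal split described above can be sketched as a routing function (a toy version; the shard names and clients are hypothetical):

```python
# Minimal sketch of horizontal sharding: records are routed to a shard
# based on a distribution key (here, the first letter of the client name).
SHARDS = {
    "shard-1": [],   # clients A to M
    "shard-2": [],   # clients N to Z
}

def shard_for(name: str) -> str:
    """Pick a shard from the distribution key (first letter of the name)."""
    return "shard-1" if name[0].upper() <= "M" else "shard-2"

for client in ["Ada", "Niels", "Grace", "Zoe"]:
    SHARDS[shard_for(client)].append(client)

print(SHARDS)  # {'shard-1': ['Ada', 'Grace'], 'shard-2': ['Niels', 'Zoe']}
```

Every query must compute the same routing function, which is why the key has to be part of every access path.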
Techniques linked to sharding
Based on their choice of horizontal scale-up, Web Giants have developed specific solutions to meet these challenges, grouped under the acronym NoSQL (Not Only SQL) and sharing the following characteristics:
implementation using mass-produced machines,
data sharding managed at the software level.
While sharding makes it possible to overcome the issues mentioned above, it also entails implementing new techniques.
Managing availability is much more complex. In a centralized system, or one used as such, the system is either available or not, and one simply measures the rate of unavailability. In a sharded system, some data servers can be available while others are not. If the failure of a single server makes the entire system unavailable, the overall availability is the product of the availabilities of each of the data servers, and it drops sharply: if 100 machines are each down 1 day per year, the system would be unavailable nearly 3 months per year.[9] Since a distributed system can remain available despite the failure of one of its data servers, albeit in degraded mode, availability must be measured through two figures: yield, i.e. the proportion of queries answered; and harvest, i.e. the completeness of the responses.[10]
Distribution of the load is usually tailored to data use. A product reference (massively accessed in read mode) won’t raise the same performance issues as a virtual shopping cart (massively accessed in write mode). The replication rate, for example, will be different.
[9] Availability: (364/365)^100 ≈ 76%, i.e. 277 days out of 365, leaving roughly 88 days of unavailability per year.
[10] Thus if, when a server fails, the others ignore the modifications made on that server and then resolve the various modifications once the server reconnects to the cluster, the harvest is smaller: the response is incomplete because it does not integrate the latest changes, but the yield is maintained. The NoSQL solutions developed by the Giants integrate various mechanisms to manage this: data replication over several servers, vector clock algorithms to resolve competing updates when the server reconnects to the cluster. Further details may be found in the following article: http://radlab.cs.berkeley.edu/people/fox/static/pubs/pdf/c18.pdf
Lastly, managing the addition of new servers, and the data repartitioning problems it poses (rebalancing the cluster), are novel issues specific to sharding. Foursquare, for example, was down for 11 hours in October 2010[11] after one of its servers became overloaded, then ran into trouble when connecting the back-up server, which in the end brought down the entire site. Data distribution algorithms such as consistent hashing[12] overcome these problems by limiting the amount of data that must be moved when servers are removed or added.
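The consistent-hashing idea mentioned above can be sketched in a few lines (a toy version with hypothetical server names; production systems also add virtual nodes to smooth the distribution):

```python
import bisect
import hashlib

def h(key: str) -> int:
    """Hash a key onto the ring (stable across runs, unlike built-in hash)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Minimal consistent-hash ring: each key goes to the first node
    clockwise from its hash, so adding or removing one node only moves
    the keys of that node's arc, not the whole dataset."""
    def __init__(self, nodes):
        self.ring = sorted((h(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        hashes = [pos for pos, _ in self.ring]
        i = bisect.bisect(hashes, h(key)) % len(self.ring)
        return self.ring[i][1]

ring = Ring(["server-a", "server-b", "server-c"])
keys = [f"user-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

# Add a fourth server: only the keys on its arc change owner.
bigger = Ring(["server-a", "server-b", "server-c", "server-d"])
moved = sum(before[k] != bigger.node_for(k) for k in keys)
print(f"{moved} of {len(keys)} keys moved")  # roughly a quarter, not all of them
```

With a naive `hash(key) % n_servers` scheme, changing `n_servers` would instead remap almost every key.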
Sharding also means adapting your application architecture:
Queries have to be adapted to take distribution into account so as to avoid inter-shard queries, because the cost of accessing several remote servers is prohibitive. Thus the APIs of such systems limit query possibilities to data within the same shard.

Whether one is using relational or NoSQL databases, models are upended: modeling in such systems is widely limited to key/value, key/document or column-family structures in which the key or row index serves as the basis for partitioning.

Atomicity (the A in ACID) is often restricted so as to avoid atomic updates affecting several shards, and therefore transactions distributed over several machines at high performance cost.
Who makes it work for them?
The implementation of these techniques varies across companies. Some have simply adapted their databases to facilitate sharding. Others have written ad hoc NoSQL solutions by hand. Following the path from SQL to NoSQL, here are a few representative implementations:
[11] For more details on the Foursquare incident: http://blog.foursquare.com/2010/10/05/so-that-was-a-bummer/ and the analysis on another blog: http://highscalability.com/blog/2010/10/15/troubles-with-sharding-what-can-we-learn-from-the-foursquare.html
[12] Further details in the following article: http://blog.octo.com/consistent-hashing-ou-l%E2%80%99art-de-distribuer-les-donnees/
Wikipedia
This famous collaborative encyclopedia rests on many distributed MySQL instances and a MemCached memory cache. It is thus an example of sharding implemented with run-of-the-mill components.
Figure 2
The architecture uses master-slave replication to divide the load between reads and writes on the one hand, and partitions the data by wiki and use case on the other. The article text is also offloaded to dedicated instances. The result is MySQL instances holding between 200 and 300 GB of data.
Flickr
The architecture of this photo sharing site is also based on several master and slave MySQL instances (the shards), but arranged here in a replication ring, making it easier to add data servers.
Figure 3
An identifier serves as the partitioning key (usually the photo owner’s ID) which distributes the data over the various servers. When a server fails, entries are redirected to the next server in the loop. Each instance on the loop is also replicated on two slave servers to function in read-only mode if their master server is down.
Facebook

The Facebook architecture is interesting in that it shows the transition from a relational database to an entirely distributed model. Facebook started out using MySQL, a highly efficient open source solution, then implemented a number of extensions to partition the data.
Figure 4
Today, the Facebook architecture has banished all central data storage. Centralized access is managed by the cache (MemCached) or a dedicated service. In this architecture, MySQL serves to feed MemCached with data in key-value form and is no longer queried in SQL. The MySQL replication system, after an extension, is also used to replicate the shards across several datacenters. That being said, its use has very little to do with relational databases: data are accessed only through their key, and there are no joins at this level. Lastly, the structure of the data is taken into account to co-locate data used simultaneously.
Amazon
The Amazon architecture stands out for its more advanced management, in Dynamo, of the loss of one or more datacenters.
Amazon started out in the 1990s with a single Web server and an Oracle database. In 2001 they set up a set of business services with dedicated storage. Alongside databases, two systems use sharding: S3 and Dynamo. S3 is an online storage service for blobs identified by a URL. Dynamo (first used in-house, but recently made available to the public through Amazon Web Services) is a distributed key-value storage system designed to ensure high availability and very fast responses.

In order to enhance availability, several versions of the same dataset can coexist in Dynamo, following the principle of eventual consistency.[13]
Figure 5
[13] There are quorum mechanisms (http://en.wikipedia.org/wiki/Quorum_(distributed_computing)) to arbitrate between availability and consistency.
In read mode, an algorithm such as the vector clock[14] or, as a last resort, the client application will have to resolve any conflicts. There is thus a balance to be found in the degree of replication, to choose the best compromise between resistance to datacenter failure on the one hand and system performance on the other.
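Vector-clock conflict detection can be sketched in a few lines (a minimal version with hypothetical server names, not Dynamo's implementation):

```python
# Minimal vector-clock comparison, as used to detect conflicting
# versions of a value after concurrent writes.
def dominates(a: dict, b: dict) -> bool:
    """True if clock `a` has seen everything `b` has (a >= b component-wise)."""
    return all(a.get(node, 0) >= t for node, t in b.items())

def compare(a, b):
    if dominates(a, b) and dominates(b, a):
        return "equal"
    if dominates(a, b):
        return "a supersedes b"   # b is an ancestor: safe to discard it
    if dominates(b, a):
        return "b supersedes a"
    return "conflict"             # concurrent writes: the client must resolve

v1 = {"server-a": 2, "server-b": 1}
v2 = {"server-a": 2, "server-b": 1, "server-c": 1}
v3 = {"server-a": 3}
print(compare(v1, v2))  # b supersedes a
print(compare(v2, v3))  # conflict
```

When `compare` returns "conflict", neither version descends from the other; this is precisely the case the text says must be resolved at read time.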
LinkedIn

LinkedIn's background is similar to Amazon's: they started in 2003 with a single-database approach, then partitioned for specific businesses, implementing a distributed system similar to Dynamo: Voldemort. But contrary to Dynamo, it is open source. One should also note that indexes and social graphs have always been stored separately at LinkedIn.
Google

Google was the first to publish information on its distributed storage system. Rather than having its roots in databases, it emulates a file system. In the paper[15] on the Google File System (GFS), the authors mention that their choice of commodity hardware was instrumental, given the weaknesses noted in a previous chapter (cf. "Commodity Hardware", p. 167). This distributed file system is used, directly and indirectly, to store Google's data (search index, emails).
Figure 6
Its architecture is based on a centralized metadata server (to guide client applications) and a very large number of data storage systems. The degree of data consistency is lower than that guaranteed by a traditional
[14] The Vector Clock algorithm provides the order in which a given distributed dataset was modified.
[15] http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/fr//papers/gfs-sosp2003.pdf
file system, but this topic alone deserves an entire article. In production, Google uses clusters of several hundred machines, enabling them to store petabytes of data to index.
Exception!
It is however undeniable that a great many sites are grounded in relational database technologies without sharding (or without mentioning it): StackOverflow, Salesforce, Voyages-SNCF, vente-privee.com... It is difficult to draw up an exhaustive list one way or another. We nonetheless believe that sharding has become the standard strategy on data-intensive web sites.
Indeed, the architecture of Salesforce is based on an Oracle database, but it uses the database very differently from the practices of our usual ITs: tables with multiple untyped columns with generic names (col1, col2), a query engine upstream from Oracle to take these specificities into account, etc. Such optimizations show the limits of a purely relational architecture.
In our view, the most striking exception is StackOverflow, whose architecture is based on a single relational SQL Server instance. This site chose an architecture based purely on vertical scalability, with its initial architecture, inspired by Wikipedia, evolving to conform to this strategy. One must also note that the scalability needs of StackOverflow are not necessarily comparable to those of other sites, because its target community (IT engineers) is narrow, and the model favors the quality of contributions over their quantity. Furthermore, choosing a platform under Microsoft license gives them an efficient tool, but one whose costs would certainly become prohibitive in a horizontal scale-up.
How can I make it work for me?
Data distribution is one of the keys that enabled the Web Giants to reach their current size and to provide services that no other architecture is capable of supporting. But make no mistake: it is no easy task. Issues which are easy to resolve in a relational world (joins, data integrity) demand mastering new tools and methods.
Areas which are data-intensive but with limited consistency stakes, as is for example the case with data which can be partitioned, are those where distributed data will be most beneficial.
Offers compatible with Hadoop use these principles and are relevant to BI, particularly for analyzing unstructured data. Concerning transactions, consistency issues are more important, and constraints around access APIs are also a limiting factor, but new offers such as SQLFire by VMware or NuoDB attempt to combine sharding and an SQL interface. These are worth keeping an eye on.
In short, you need to ask yourself which data belong to the same use case (what partitions are possible?) and, for each, what the consequences of loss of data integrity would be. Depending on the answers, you can identify the main architecture features that would enable you, above and beyond sharding, to choose the tool to best meet your needs. More than a magic fix, data partitioning must be considered as a strategy to reach scale-up levels which would be impossible without it.
Associated patterns
Whether you use open source or in-house products depends on your use of data partitioning, as it entails a great deal of fine tuning. The ACID transactional model is also affected by data sharding. The Eventually Consistent pattern offers another vision and solution to meet user needs despite the impacts of sharding; mastering this pattern is very useful for implementing distributed data. Lastly, and most importantly, sharding cannot be dissociated from the commodity hardware choice implemented by the Web Giants.
Sources
• Olivier Mallassi, Datacenter as a Computer : une plongée dans les datacenters des acteurs du cloud, 6 June 2011 (French only):
> http://blog.octo.com/datacenter-as-a-computer-une-plongee-dans-les-datacenters-des-acteurs-du-cloud/
• The size of the World Wide Web (The Internet), daily estimated size of the World Wide Web:
> http://www.worldwidewebsize.com/
• Wikipedia:
> http://en.wikipedia.org/wiki/Shard_(database_architecture)
> http://en.wikipedia.org/wiki/Partition_%28database%29
> http://www.codefutures.com/weblog/database-sharding/2008/06/wikipedias-scalability-architecture.html
• eBay:
> http://www.codefutures.com/weblog/database-sharding/2008/05/database-sharding-at-ebay.html
• Friendster and Flickr:
> http://www.codefutures.com/weblog/database-sharding/2007/09/database-sharding-at-friendster-and.html
• HighScalability:
> http://highscalability.com/
• Amazon:
> http://www.allthingsdistributed.com/
TP vs. BI: the new NoSQL approach
THE WEB GIANTS ARCHITECTURE / TP VS. BI: THE NEW NOSQL APPROACH
Description
In traditional ISs, structured data processing architectures are generally split across two domains. Both of course are grounded in relational databases, but each has its own models and constraints.
On the one hand, Transactional Processing (TP), based on ACID transactions; on the other, Business Intelligence (BI), grounded in fact tables and dimensions.
Web Giants have both developed new tools and come up with new ways of organizing processing to meet these two needs. Distributed storage and processing is widely used in both cases.
Business needs
One recurrent specificity of Web Giants is their need to process data which are only partially structured, or not at all, different from the usual data tables used in management information systems: Web pages for Google, social graphs for Facebook and LinkedIn. A relational model based on two-dimensional tables where one of the dimensions is stable (the number and type of columns) is ill-adapted to this type of need.
Moreover, as we saw in the chapter on sharding (cf. “Sharding“, p. 179), constraints on data volumes and transaction amounts often push Web Giants to partition their data. This overturns the traditional vision of TP where the data are always consistent.
BI solutions, lastly, are usually driven by internal IT decisions. For Web Giants, BI is often the foundation for new services which can be used directly by clients: LinkedIn’s People You May Know, new music releases suggested by sites such as Last.fm,[1] Amazon recommendations, are all services which entail
[1] Hadoop, The Definitive Guide O’Reilly, June, 2009.
manipulating vast quantities of data to provide recommendations to users as quickly as possible.
Who makes it work for them?
The new approach of the Web Giants to TP (Transaction Processing) and BI (Business Intelligence) lies in generic storage and deferred processing whenever possible. The main goal of the underlying storage is simply to absorb huge volumes of queries both redundantly and reliably. We call it "generic" because it is poorer in terms of indexing, data organization and consistency than traditional databases. Processing and analyzing data for queries, as well as consistency management, are offloaded to the software level. The following strategies are implemented.
TP: the ACID constraints limited to what is strictly necessary
The sharding pattern greatly complicates the traditional vision of a single consistent database used for TP. Major players such as Facebook and Amazon have thus adapted their view of transactional data. As stated by the CAP theorem,[2] a given system cannot simultaneously achieve consistency, availability and partition tolerance. First of all, data consistency is no longer permanent but only provided when the user reads the data.
This is known as eventual consistency: it is when the information is read that its integrity is checked, and any differing versions held by the data servers are resolved. Amazon fostered this approach when designing its distributed storage system Dynamo.[3] On a set of N machines, the data are replicated on W of them, along with version stamping. For queries, N-W+1 machines are searched, thereby ensuring that the user gets the latest version.[4] The e-commerce giant chose to reduce data consistency in favor of gains in the availability of its distributed system.
[2] http://en.wikipedia.org/wiki/CAP_theorem
[3] http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
[4] In this way one is always certain of reading the data on at least one of the W machines where the freshest data have been written. For further information, see http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
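The quorum arithmetic described here (write to W of N replicas, read N-W+1) can be checked exhaustively for small N (a sketch with toy values, not Dynamo's actual code):

```python
import itertools

# If a value is written to W of N replicas, any read of N - W + 1 replicas
# must overlap the write set in at least one replica, whichever replicas
# the writer and the reader happened to pick.
N, W = 5, 3
replicas = set(range(N))
R = N - W + 1                                  # read-set size: 3

for written in itertools.combinations(replicas, W):
    for read in itertools.combinations(replicas, R):
        assert set(written) & set(read), "a read missed every fresh replica"

print("every possible read set overlaps every possible write set")
```

The guarantee is purely pigeonhole arithmetic: W + (N - W + 1) > N, so the two sets cannot be disjoint.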
Furthermore, to meet their performance goals, data freshness criteria are no longer uniform, but categorized. Facebook and LinkedIn guarantee real-time freshness for users' own updates: their modifications must be immediately visible to them to ensure user trust in the system. In contrast, global consistency is relaxed: when users sign up for a Facebook group, for example, they immediately see the information appear, but other group members may experience some delay in being notified.[5]
At LinkedIn, services are also categorized. For non-critical services such as retweets, the information is propagated asynchronously,[6] whereas users' modifications of their own data are propagated immediately so as to be instantly visible to them.
Asynchronous processing is what makes it possible for Web Giants to best manage the heavy traffic loads they face. In sum, to guarantee performance and availability, Web Giants tailor their storage systems so that data consistency depends on usage. The goal is not to be consistent at all times, but rather to provide eventual consistency.
BI: the indexation mechanism behind all searches
To provide information on vast quantities of data, Web Giants also tend to pre-calculate indexes, that is, data structures specifically designed to answer user questions. To better understand this point, let us look at the indexes Google has designed for its search engine. Google is foremost in the arena due to the volume of what it indexes: the entire Web.
[5] http://www.infoq.com/presentations/Facebook-Software-Stack
[6] Interview with Yassine Hinnach, Architect at LinkedIn.
At the implementation level, Google uses sharding to store raw data (BigTable column database grounded in the distributed Google File System).[7]
Indexes based on keywords are then produced asynchronously, and are used to answer user queries. The raw data are analyzed with a distributed algorithm, based on the programming model MapReduce.
The process can be divided into two main phases: map, which processes each piece of data identically and in parallel; and reduce, which aggregates the various results into a single final result. The map phase is easily distributed by assigning each machine a different portion of the data to process, as can be seen in Figure 1.
Figure 1
[7] cf. “Sharding“, p. 179.
This technique is highly scalable[8] and makes it possible, for example, for a web crawler to process all the pages visited, establish for each the list of outgoing links, and then aggregate them during the reduce phase to obtain a list of the most-referenced pages. Google has implemented a sequence of MapReduce tasks to generate the indexes for its search engine.[9]
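The crawl example above can be sketched with the MapReduce model (a toy, in-process version; the page names and the corpus are hypothetical):

```python
from collections import Counter
from functools import reduce

# Toy corpus: each page maps to its list of outgoing links.
pages = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html", "b.html"],
}

def map_phase(page_links):
    # One partial count per page; each map call is independent of the
    # others, which is what makes the phase trivially parallelizable.
    return [Counter(links) for links in page_links.values()]

def reduce_phase(partials):
    # Merge the partial counts into one final reference count.
    return reduce(lambda acc, c: acc + c, partials, Counter())

counts = reduce_phase(map_phase(pages))
print(counts)  # Counter({'b.html': 2, 'c.html': 2, 'a.html': 1})
```

In a real cluster, each map call would run on the machine holding its slice of the crawl, and the reduce phase would merge the partial counters shipped over the network.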
This allows them to process huge quantities of data in batch mode. The technique has been widely copied, notably through the Apache Foundation open source project Hadoop.[10]

Hadoop provides both a distributed file system and a framework implementing the MapReduce programming model, directly inspired by Google's research paper. It was then adopted by Yahoo! for indexing, by LinkedIn to prepare its email campaigns, and by Facebook to analyze the various logs generated by their servers... Many firms, including several other Web Giants (eBay, Twitter), use it.[11]
In 2010, Google set up a new indexing process based on event mechanisms.[12] Updates do not happen in real time as with database triggers, but latency (the time between a page's publication and its availability for search) is greatly reduced compared to a batch system based on the MapReduce programming model.
Exception!
All of these examples share a commonality: they target a fairly specific set of needs. Many key Web players also use relational databases for other applications. The "one size fits all" approach of relational databases makes them easier to use but also more limited, notably in terms of scalability. The processes and distributed storage systems described above are only implemented for these players' most heavily used services.
[8] Or scalable, i.e. capable of processing more data if the system is enlarged.
[9] http://research.google.com/archive/mapreduce.html
[10] http://hadoop.apache.org
[11] http://wiki.apache.org/hadoop/PoweredBy
[12] Google Percolator: http://research.google.com/pubs/pub36726.html
How can I make it work for me?
It is certainly in indexing solutions and BI on Big Data that the market is most mature. Around Hadoop, a reliable open source implementation, a large number of support offerings, related tools, re-implementations and commercial repackagings have been developed, based on the same APIs.
Projects based on indexing large quantities of data, or data which are semi-structured or unstructured, are the primary candidates for this type of approach. The main advantage is that raw data can be preserved thanks to much lower storage costs: information is no longer lost through over-hasty aggregation.
In this way the data analysis algorithms producing indexes or reports can also be more easily adjusted over time since they are constantly processing all available data rather than pre-filtered subsets. A switch from relational databases in TP will probably take more time. Various distributed solutions inspired by Web Giants’ technologies have come out under the label NoSQL (Cassandra, Redis).
Other distributed solutions, closer to the crossroads of relational databases and data grids in terms of consistency and APIs, have come out under the name NewSQL (SQLFire, VoltDB). Architectural patterns such as Event Sourcing and CQRS[13] can also help bridge the gap between the two worlds. Their contribution is to make it possible to model transactional data as a flow of events which are both uncorrelated and semi-structured. Building a comprehensive and consistent vision of the data comes afterwards, when the data are disseminated. The Web Giants' models cannot be directly transposed to meet the general TP needs of businesses, and there are many other approaches on the market to overcome traditional database limits.
Associated patterns

This pattern is mainly linked to the Sharding pattern (cf. “Sharding“, p. 179) because, through distributed algorithms, it makes it possible to work on this new type of storage. One should also note here the influence of the Build vs. Buy pattern (cf. “Build vs. Buy“, p. 19), which has led the Web Giants to adopt highly specialized tools to meet their needs.
[13] Command Query Responsibility Segregation.
Big Data Architecture
To better meet their users' needs, the Web Giants do everything they can to reduce their Time to Market. Data in all forms are key to this strategy. They not only serve for technical analyses, but are also business drivers. They are what make it possible to personalise the user experience, more and more often in real time, and above all inform decision making. The Web giants have long understood the importance of data and use them unabashedly. At Google for example, all ideas must come with metrics, all arguments must be based on data, or you will not be heard in the meeting.[1]
Everyone speaks of Big Data, but the Web Giants were the first players involved, or at the very least closely associated. Behind the buzzword lie new challenges, including an especially complicated one: how do you store and process the exponential volume of data generated? There are more connected objects than humans on the planet, and Cisco forecasts over 50 billion sensors by 2020.[2] How do you use all that information?
Time to Action

As shown in the preceding chapter, NoSQL architectures can process and query ever larger amounts of data.
Big Data is usually described by 3 main characteristics, often called the 3Vs:[3]
Volume, the capacity to process terabytes, petabytes, and even exabytes of extracted data
Variety, the capacity to process all data formats, whether structured or not
Velocity, the capacity to process events in real time, or at least as quickly as possible
With architectures of the NoSQL/NewSQL type, as described previously, only the components Variety and Volume were highlighted. Let us now look at how the Web Giants also embrace the third component: Velocity.
[1] http://googlesystem.blogspot.com.au/2005/12/google-ten-golden-rules.html
[2] https://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf
[3] https://en.wikipedia.org/wiki/Big_data
THE WEB GIANTS ARCHITECTURE / BIG DATA ARCHITECTURE
Making data available

We will talk here about double-headed architectures capable of storing and querying data in all forms, processed in batches or in real time. But before broaching this complex subject, let us first take a look at the characteristics and Big Data architecture patterns the Web Giants implement.
A data lake for data
In an information system, the data are distributed over dozens, or even hundreds, of components. They are spread across various sources, some on site, others held by third-party vendors or locked in proprietary software. Having the data is not enough; they must also be instantly accessible. If you do not have the data at hand, it is unlikely you will think of playing around with them. Isolated data is underexploited data: the Allen curve[5] also applies to data!
That is why the Web giants centralise their data in a scalable system where they can be easily queried without any presumptions about how they will be used. Perhaps most of them will not even be used, but that does not matter: the important thing is to have them nearby just in case a new idea emerges.
This type of system, usually based on the Hadoop framework, is commonly called a “data lake“.[6A] It is a distributed storage and processing platform capable of handling ever increasing amounts of data, whatever their nature. On paper, it can be scaled to infinity,[7] both in storage and in processing, and can manage numerous concurrent jobs and tasks linearly thanks to the size of the infrastructure.
An aside

Some also speak of 4Vs or even 5Vs,[4] adding components to the 3Vs mentioned above, such as:
Veracity, the capacity to manage inconsistencies and ambiguities
Value, the capacity to apply differential processing to data depending on the value attributed to them
The latter is without doubt the most debatable, since the main benefit of this type of architecture is that there are no presuppositions as to how the data will be analysed, and therefore no pre-established values.
[4] https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know
[5] https://en.wikipedia.org/wiki/Allen_curve
[6A] https://en.wikipedia.org/wiki/Data_lake
[7] Even if nothing is infinitely scalable: https://www.youtube.com/watch?v=modXC5IWTJI
Immutable data
A data lake can store all types of data; it is up to the user to decide what to use it for. Of all the data it can hold, raw data are particularly interesting: available without changes or alterations, they can be modelled according to user needs.
Immutability drastically reduces manipulation errors:
the data are entered without any transformation, limiting the risk of losing the context or errors in interpretation
the data are stored only once and are never updated, thus limiting manipulation errors and keeping a full record.
Immutable, they can also theoretically[6b] be reused an infinite number of times. The data are not “consumed“ but “used“. In case of errors, bugs or code updates, the processing simply needs to be relaunched to obtain the latest results.
When they are timestamped and sufficiently individualised, such immutable data are also known as “events“.
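A minimal sketch of this replay property, with invented click events: because the raw events are never modified, fixing a buggy job is just a matter of re-running it over the same data.

```python
# Immutable raw events: stored once, never updated (hypothetical click log).
RAW_EVENTS = [
    {"user": "u1", "page": "/home", "ts": 1},
    {"user": "u1", "page": "/cart", "ts": 2},
    {"user": "u2", "page": "/home", "ts": 3},
]

def count_views_v1(events):
    # First version of the job: a bug counts only "/home" views.
    return sum(1 for e in events if e["page"] == "/home")

def count_views_v2(events):
    # Fixed version: count all page views. Because the raw events were
    # kept untouched, re-running the job is enough to correct the results.
    return len(events)

print(count_views_v1(RAW_EVENTS))  # 2  (wrong)
print(count_views_v2(RAW_EVENTS))  # 3  (re-derived from the same raw data)
```

Had the first job "consumed" or aggregated the events in place, the correct figure would have been lost for good.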
Schema on read
Another highly interesting characteristic is in interpreting the data. For a “traditional“ BI ingestion, the data are cleaned up, formatted, and normalised before being ingested. The Web Giants consider that each time data is transformed, part of the context is altered. By storing raw data, it is up to users to decide how to transform them.
Let us take the example of Twitter. Each tweet contains a multitude of information: text, images, videos, links, hashtags. They are timestamped, geographically located, shared, liked... Depending on the system using the data, it must be able to transform them by focusing on the aspect which seems most relevant. An application to map the most recent tweets will probably not have the same angle of approach as one looking for the most shared content.
[6B] In practice, Google uses its data over a period of 30 days, for both volumetric and legal reasons.
This pattern, Schema on read, has several advantages:
It greatly simplifies ingestion, avoids any data loss, and makes it much less expensive to add data to the data lake.
It gives clients flexibility by allowing personalised extraction and transformation depending on needs.
This pattern, joined with the preceding ones, becomes a driver of innovation. It does away with technical barriers to data processing, making it possible to develop new prototypes more and more quickly. The best way to find value in your data is to play around with them!
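The Twitter example above can be sketched as follows (the record fields are invented for illustration): the same raw record stays untouched in the lake, and each consumer applies its own schema at read time.

```python
import json

# Raw, untransformed record as it might land in the lake (hypothetical tweet).
raw = json.dumps({
    "text": "Big Data everywhere #data",
    "hashtags": ["data"],
    "geo": {"lat": 48.85, "lon": 2.35},
    "shares": 42,
    "ts": 1700000000,
})

# Each consumer applies its own schema when it READS, not when data is written.
def map_view(record):
    """A mapping application only cares about position and freshness."""
    d = json.loads(record)
    return {"lat": d["geo"]["lat"], "lon": d["geo"]["lon"], "ts": d["ts"]}

def trend_view(record):
    """A trends application only cares about hashtags and share counts."""
    d = json.loads(record)
    return {"hashtags": d["hashtags"], "shares": d["shares"]}

print(map_view(raw))    # {'lat': 48.85, 'lon': 2.35, 'ts': 1700000000}
print(trend_view(raw))  # {'hashtags': ['data'], 'shares': 42}
```

Neither view required a change to the ingestion pipeline: a third application with a different angle of approach could be added tomorrow against the same raw records.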
[Figure: the data lake. Ingestion of messages & events, raw files, application logs and external data (Open APIs) into non-structured, semi-structured (NoSQL) and structured (e.g. relational) storage; analytical batches, machine learning and flow management run on the lake; publication towards the enterprise DWH, databases, transactional systems, reporting and interactive requests.]

From Big Data to Fast Data

The Web Giants strive to give value to their clients as quickly as possible. Sometimes, and more and more often, offline processing is no longer sufficient for user needs.

In that case, the best way to get value from your data is to interact with them as soon as they are ingested: the data lake as described above only allows you to process data in batch mode. However, between two batch runs, freshly gathered data go unused. Not only do you miss out on their full value but, worse, some data may be outdated before they are even used. The fresher the data, the greater their potential interest.
To process millions or even billions of events per second, two types of technology are used:
Event distributors and collectors such as Flume and Kafka
Tools to process the events in near real time, such as Spark and Storm
More than mere users, the Web Giants take part in creating and sharing these building blocks:
Kafka is a high speed distributed message queue developed by LinkedIn[8]
Storm, originally developed by Twitter, makes it possible to process millions of messages per second[9]
The goal is not to replace the batch processing brick already included in the data lake, but instead to add real time features. This layer is often referred to as the Fast Layer, and the capacity to leverage Big Data for real time processing is known as Fast Data.[10] Real time reduces the Time to Action, so prized by the Web Giants.[11]
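The following toy sketch illustrates the principle of the fast layer, using plain Python stand-ins rather than Kafka or Storm themselves: events are consumed as they arrive and an always-fresh view is updated per event, instead of waiting for the next batch.

```python
from collections import Counter

def event_stream():
    """Stand-in for a distributed log (e.g. a Kafka topic): yields events
    one by one, as they arrive."""
    for user in ["u1", "u2", "u1", "u3", "u1"]:
        yield {"user": user, "action": "click"}

def process(stream):
    """Stand-in for a stream processor (Storm/Spark style): the view is
    updated per event instead of waiting for the next nightly batch."""
    live_counts = Counter()
    for event in stream:
        live_counts[event["user"]] += 1
        # At this point the freshest value is already queryable.
    return live_counts

print(process(event_stream()))  # Counter({'u1': 3, 'u2': 1, 'u3': 1})
```

The real systems add what this sketch omits: partitioning over many machines, replication, and delivery guarantees when a node fails.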
[Figure: the data lake extended with a real-time layer. On the batch side, high-volume data, log files and applications are imported into distributed file storage, with interactive and batch processing and a sandbox; on the real-time side, high-velocity data are ingested into resilient storage, with stateless and stateful processing; both sides publish through APIs and feed the enterprise DWH.]
[8] http://kafka.apache.org/
[9] http://storm.apache.org/
[10] http://www.infoworld.com/article/2608040/big-data/fast-data--the-next-step-after-big-data.html
[11] http://www.datasciencecentral.com/profiles/blogs/time-to-insight-versus-time-to-action
Should the two channels, batch and real time, be treated as distinct or, on the contrary, be unified? In theory, the ideal is to be able to process the entire dataset in both modes, but that is not so simple. There are numerous initiatives, yet most use cases can do without them, and you are unlikely to need any for your ecosystem. The Web Giants advise a batch-oriented architecture if you have no strong latency constraints, or a fully real-time architecture otherwise, but rarely both at once.
Lambda architecture
Lambda architecture is undoubtedly the most widespread response to the need to unify the two approaches. The principle is to process the data in two layers, batch and real time, carrying out the same processes in both channels, then consolidating the results in a third, dedicated layer:
The batch layer precalculates the results based on the complete dataset. It processes raw data and can be regenerated on demand.
The speed layer serves to overcome batch latency by generating real time views which undergo the same processing as in the batch layer. These real time views are continuously updated and the events are overwritten in the process; the views can therefore only be replayed by the batch layer.
The serving layer then indexes both views, batch and real time, and displays them in the form of consolidated output.
Since the raw data are always available in the batch layer, if there are any errors, the output can be regenerated.
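A toy sketch of the three layers (the events and checkpoint are invented for illustration): the same computation runs in both the batch and the speed layer, and the serving layer merges the two views into one consolidated answer.

```python
def batch_layer(all_events, up_to):
    """Precompute a view from the complete dataset, up to a checkpoint.
    Can be regenerated on demand from the raw data."""
    return sum(e["amount"] for e in all_events if e["ts"] <= up_to)

def speed_layer(recent_events, since):
    """Cover the batch latency gap: same computation, but only on
    events arrived after the checkpoint."""
    return sum(e["amount"] for e in recent_events if e["ts"] > since)

def serving_layer(batch_view, realtime_view):
    """Consolidate both views into a single output."""
    return batch_view + realtime_view

events = [{"amount": 10, "ts": 1}, {"amount": 5, "ts": 2}, {"amount": 7, "ts": 3}]
checkpoint = 2  # last timestamp covered by the previous batch run

total = serving_layer(batch_layer(events, checkpoint),
                      speed_layer(events, checkpoint))
print(total)  # 22
```

The sketch also shows where the complexity lies: the "same computation" here is three lines of Python, but in practice it must be written twice, on two very different technology stacks, and kept synchronised.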
[Figure: Lambda architecture. Incoming data (IoT, mobile, social) flow through a distribution layer; the batch layer recomputes batch views (precomputed information) over all data, while the speed layer increments real-time views; the serving layer merges both kinds of view for visualization. Adapted from: Marz, N. & Warren, J. (2013). Big Data. Manning.]
However, few use cases are truly adapted to this type of architecture. It has not yet reached maturity, even among the Web Giants, and is highly complex to implement. More specifically, it entails developing the same processing twice on two types of very different technologies. Doing it once is already difficult enough without having to double the task, especially given that it must all be synchronised.
As an alternative to Lambda architecture, Twitter offers, through Summingbird,[12] an abstraction layer in which computation for both layers can be expressed within a single framework. What you gain in simplicity, however, you lose in flexibility: the usable features are reduced to the intersection of both modes.
Kappa Architecture
LinkedIn has put forward another variant of this model: the Kappa architecture.[13] Its approach is to process all data, old and new, in a single layer, the fast layer, thus removing one side of the complex equation.
It is a way of dividing the streams into small independent steps that are easier to debug, with each step serving as a checkpoint from which unitary processing can be replayed in case of error. Reprocessing data is one of the most complicated challenges with this type of architecture and must be thoroughly thought through from the outset. Because code, formats and data constantly change, processing must be able to integrate the changes continuously, and that is no small matter.
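The Kappa idea can be sketched as follows, with an in-memory list standing in for the distributed log: a single code path handles both live and historical data, and reprocessing is simply replaying the log from the start with the new version of the job.

```python
# Stand-in for the distributed log: the full, ordered history of events.
LOG = [
    {"user": "u1", "amount": 10},
    {"user": "u2", "amount": 5},
    {"user": "u1", "amount": 3},
]

def run_job(log, transform):
    """Single (fast) layer: one code path for both live and historical data.
    Reprocessing = replaying the log from offset 0 with the new code."""
    state = {}
    for event in log:
        key, value = transform(event)
        state[key] = state.get(key, 0) + value
    return state

v1 = run_job(LOG, lambda e: (e["user"], e["amount"]))  # original job: sum amounts
v2 = run_job(LOG, lambda e: (e["user"], 1))            # changed job: count events
print(v1)  # {'u1': 13, 'u2': 5}
print(v2)  # {'u1': 2, 'u2': 1}
```

No second batch codebase was needed to produce v2: the new view was obtained by replaying the same log with the new transform, which is precisely what Kappa trades the Lambda batch layer for.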
[Figure: Kappa architecture. Incoming data (IoT, mobile, social) flow through a distribution layer into a single speed layer that increments real-time views, with replay of the stream for reprocessing; the serving layer exposes the views for visualization and analytical analysis. Adapted from: Marz, N. & Warren, J. (2013). Big Data. Manning.]
[12] https://github.com/twitter/summingbird
[13] http://radar.oreilly.com/2014/07/questioning-the-lambda-architecture.html
How can I make it work for me?

Whether you have already invested in Business Intelligence or not, leveraging your data is no longer optional, and a data lake type solution has become almost inevitable. More flexible than a data warehouse, it makes it possible to process unstructured data and create models on demand. It does not (yet) replace traditional BI, but it opens up new vistas and possibilities.
Based on open source solutions, mostly around Hadoop and its ecosystem, this central business reference is a staunch ally to make data accessible, whatever their type: managing unstructured data, storing and processing large volumes, all with commodity hardware, which is to say low outlay.
Whatever your business line, the use cases are numerous and varied: from log analysis and security audits to optimising the buying journey, not forgetting data science of course, data lakes are a key component of intelligent user experience design. To go beyond offline processing of your data, add online features to your data lake. Although we do not necessarily recommend implementing Lambda or Kappa architectures, which are too complex for most use cases and not always mature, real-time schemes still offer real advantages and truly open new perspectives. Stay simple!
Data Science
THE WEB GIANTS DATA SCIENCE
Data science now provides technology which is both low cost and methodologically reliable to better use data in information systems. Data science drives business intelligence even deeper by automating data analysis and processing in order to e.g. predict events, behavior patterns, trends or to generate new insights. In what follows we provide an overview of data science, with illustrations taken from some of its most groundbreaking and surprising applications.
Data science is used to extract information from more or less structured data, based on methodologies and expertise developed at the crossroads of IT, statistics, and all business lines involving data.[1] [2]
Practically speaking, solving a data science problem means projecting patterns grounded in data from the past into the future. One speaks of supervised learning when the main issue is forecasting a specific target. When the target has not been specified or labelled data are lacking, detecting patterns is said to be unsupervised. One should note that data science also includes building atemporal patterns and then visualizing their various facets.
Taking the classic example of purchasing histories and pricing in online retail, data science serves to determine whether a client will buy a new product, or what price they would be willing to pay for it; these are two examples of supervised learning in the respective areas of classification and regression. Carving out marketing segments based on behavioral variables, in contrast, is an example of unsupervised learning.
More broadly, data science covers all technology and algorithms used to model, implement and visualize an issue using available data, but also to better understand problems by examining them from several viewpoints to potentially solve them in the future. Machine learning is defined as the algorithmic aspect of data science.
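To fix ideas, here is a deliberately naive Python sketch of the two families (the visit counts, labels and threshold are all invented): a toy nearest-neighbour predictor stands in for supervised learning, and a simple two-way split stands in for clustering into marketing segments.

```python
# Supervised: learn from labelled history (did past clients buy? yes/no).
history = [(1, "no"), (2, "no"), (8, "yes"), (9, "yes")]  # (visits, bought)

def predict_buy(visits):
    """1-nearest-neighbour classifier: a toy stand-in for supervised learning.
    The prediction is the label of the most similar past client."""
    nearest = min(history, key=lambda pair: abs(pair[0] - visits))
    return nearest[1]

# Unsupervised: no labels; group clients by behaviour alone.
def two_segments(values, threshold):
    """Toy stand-in for clustering: split clients into two marketing segments."""
    return {
        "low": [v for v in values if v < threshold],
        "high": [v for v in values if v >= threshold],
    }

print(predict_buy(7))                 # 'yes'  (closest labelled client bought)
print(two_segments([1, 2, 8, 9], 5))  # {'low': [1, 2], 'high': [8, 9]}
```

Real projects of course replace both stand-ins with proper machine learning algorithms, but the division of labour is the same: supervised methods need a labelled target, unsupervised methods discover structure without one.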
[1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM.
[2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the field of statistics“. Bell Labs Statistics Research Report.
Enthusiasm for the discipline is such that today's data scientists must constantly monitor the field to remain on top. Let us seize the occasion to note that in the second half of 2015, OCTO published a Hadoop white book and a book on data science (in French, English translation forthcoming).[3] [4]
Web GiantsAmong the Web Giants, there is strong movement towards unstructured data (e.g. video and sound). These have traditionally been ignored by analytics due to volume constraints and technical barriers to extracting the information. However they are back in fashion with a combination of breakthroughs in neural network science (including the field currently known as deep learning); in technology, with ever more affordable and powerful machines; and lastly with the wide media coverage of a number of futuristic applications.
Groundbreaking work has been going on over the last few years, notably in image and natural language processing, covering both sound and text.
In December, 2014 Microsoft announced the launch of Skype Translator, a real time translation tool for 5 languages, to break down language barriers.[5]
With DeepFace, Facebook announced, in June, 2014 a giant step forward in facial recognition, reaching a precision level of 97%, close to human performance for a similar task.[6]
Google presents similar results with FaceNet in an article dated June, 2015 on facial recognition and clustering.[7]
[3] http://bit.ly/WP-Hadoop2015 (French)
[4] data-science-fondamentaux-et-etudes-de-cas
[5] skype-translator-unveils-the-magic-to-more-people-around-the-world
[6] deepface-closing-the-gap-to-human-level-performance-in-face-verification
[7] http://arxiv.org/pdf/1503.03832.pdf
Such developments in unstructured data processing show that it is now possible to extract value from data hitherto considered out of reach. The key lies in structuring the data:
A raw image is transformed into a face, and then linked to a person. The image's context can also be described in a sentence.[8] The patterns extracted from the images can be reproduced with slight modifications, or blended with other images, such as a famous painting to produce artistic motifs.[9]
Speech can be transcribed as text, and music as notes on a score. Patterns extracted from music make it possible to a certain extent to reproduce a composer or musical genre.
Masses of unstructured texts are transformed into meaning using semantic vectors. Processing natural language becomes a question of algebraic manipulations, facilitating its use by the algorithms of data science.[10] The mainstreaming of bots and personal assistants such as Apple's Siri, Google's Now and Facebook's M attests to our ability to carry out ever more detailed semantic analyses on unstructured text.
The study of brain activity provides clues to identifying signs of illness such as epilepsy or to determining which cerebral patterns correspond to moving one's arm.[11]
Some problems requiring cutting edge expertise are now being handled using data science approaches, from detecting the Higgs boson to searching for dark matter using sky imaging.[12] [13]
Such use cases, often tightly linked to challenges launched by academic circles, have largely contributed to the media frenzy around data science.
Moreover, for the Web Giants, data science has become not only a way to continuously improve internal processes, but also an integral part of the business model. Google products are free because the data generated by the user has value for advertising targeting. Twitter draws a share of its revenue from the combination of advertising and analytics products. Uber is a perfect example of a data-driven company which, in serving as intermediary between the client and the driver, has nothing to sell other than intelligence in creating links.[14] Intermediation services can easily be copied by the competition, but not the intelligence behind the services.
[8] google-stanford-build-hybrid-neural-networks-that-can-explain-photos
[9] inceptionism-going-deeper-into-neural
[10] learning-meaning-behind-words
[11] grasp-and-lift-eeg-detection
[12] kaggle.com/c/higgs-boson
[13] kaggle.com/c/DarkWorlds/data
[14] data-science-disruptors
DATA SCIENCE
A flourishing ecosystem and accessible tools

The standardization of data science came about through the contribution of many tools from the open source world, such as the numerous machine learning and data handling libraries in languages like R and Python,[15] [16] and from the world of Big Data. These open source ecosystems and their dynamic communities have eased access to data science for many an IT engineer or statistician wishing to become a data scientist.
In parallel, tools for data analysis by major publishers, whether oriented statistics or IT, have also evolved towards integrating open source tools or developing their own implementations of machine learning algorithms.[17] Both the open source and proprietary ecosystems are flourishing, mature, and more and more accessible in terms of training and documentation.
Open source is used as much to attract major talent in data science as to provide tools for the community. This strategy is picking up speed, as illustrated by the buzz generated by TensorFlow, an open source deep learning framework for numerical computation published by Google in November 2015.[18] Thanks to highly permissive licensing, these tools are absorbed and improved by the community, transforming them into de facto standards. We have lost count of the tools from the Hadoop ecosystem which were internally developed by the Web Giants (such as Hive and Presto at Facebook, Pig at Yahoo, Storm and Summingbird at Twitter...) and then took on a second life in the open source world.
Platforms for online competitions in data science (such as the most well known kaggle.com or datascience.net in France) have given new, vibrant visibility to the potential of data science. Various Web Giants such as Facebook and major players in distribution and industry quickly understood that this could help them attract the best talent.[19] Many data science competitions propose job interviews as the top prize, in addition to financial awards and certain glory.
[15] four-main-languages-analytics-data-mining-data-science
[16] kdnuggets.com/2015/05/r-vs-python-data-science
[17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R
[18] tensorflow-googles-latest-machine_9
[19] kaggle.com/competitions
The Web Giants swiftly organized to recruit the best data scientists, thus anticipating the value added by interdisciplinary teams specialized in capitalizing on data.[20]
Many, e.g. Google, Facebook and Baidu, have also hired top specialists in machine learning such as Geoffrey Hinton, Yann LeCun and Andrew Ng.[21] [22] [23]
Current challenges in data science

One of the most crucial steps in any data science project is called feature engineering. It consists of extracting the relevant numeric variables that characterize one or several facets of the phenomenon under study: for example, numerically describing user behavior on a web site by calculating how often a given page is accessed, or characterizing an image by the number of contours it contains. Feature engineering is also considered one of the most tedious tasks a data scientist has to carry out. For unstructured data such as images, deep learning has made it possible to automate the procedure, placing the use cases mentioned above within reach. For structured data, the creation and selection of new features to improve prediction remain strongly specific to each particular business. This is an essential part of the alchemy of a good data scientist: feature engineering is still largely carried out manually by the world's best data scientists when working on structured data.[24]
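A minimal sketch of feature engineering on the web-site example above (the session log and feature names are invented): raw page views are turned into the numeric variables a model can actually consume.

```python
# Raw session log (hypothetical): (page visited, seconds spent on it).
session = [("/home", 5), ("/product/42", 30), ("/product/7", 20), ("/cart", 10)]

def engineer_features(session):
    """Turn a raw browsing session into numeric variables for a model."""
    pages = [p for p, _ in session]
    times = [t for _, t in session]
    return {
        "n_pages": len(pages),                                        # breadth
        "n_product_views": sum(p.startswith("/product/") for p in pages),
        "reached_cart": int("/cart" in pages),                        # intent
        "total_seconds": sum(times),                                  # engagement
    }

print(engineer_features(session))
# {'n_pages': 4, 'n_product_views': 2, 'reached_cart': 1, 'total_seconds': 65}
```

Each of these variables encodes a business hypothesis (breadth, intent, engagement); choosing the right ones is exactly the manual, business-specific craft described above.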
How can I make it work for me?Are all the data you produce stored and then readily accessible? What percentage of the data is in fact processed and analyzed? How often? To what extent do you use the available data to measure your processes and orient your actions? How much importance do you attach to recruiting data scientists, data engineers and data architects?
Data science contributes more broadly to the best practices of data driven companies, i.e. those that use the available data both qualitatively and quantitatively to improve all their processes. Answering the few questions above allows you to measure your maturity as concerns data.
[20] the-state-of-data-science
[21] wired.com/2013/03/google_hinton/
[22] facebook.com/yann.lecun/posts/10151728212367143
[23] chinese-search-giant-baidu-hires-man-behind-the-google-brain
[24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/
You have perhaps already used predictive methods based on linear algorithms, such as the logistic regression traditionally found in marketing scoring. Today, the rigorous application of data science methodology gives you control over the complexity inherent in using non-linear algorithms. The underlying compromise in giving up linear algorithms is a loss of capacity to understand and explain predictions, in exchange for more realistic, and therefore more useful, predictions.
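The compromise can be seen on a toy non-linear problem (the XOR of two binary features, a classic textbook case): no linear scorer of the logistic-regression kind can fit it perfectly, while a small decision tree can, at the cost of no longer reading off one explanatory coefficient per feature.

```python
# Toy dataset where the target is non-linear (XOR of two binary features).
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def linear_model(x, w=(1, 1), b=-0.5):
    """A linear scorer (as in logistic regression): easy to explain
    (one weight per feature), but no weights can separate XOR."""
    return int(w[0] * x[0] + w[1] * x[1] + b > 0)

def tree_model(x):
    """A depth-2 decision tree: non-linear, fits XOR exactly, but its
    logic cannot be summarised as a single coefficient per feature."""
    return int(x[0] != x[1])

linear_acc = sum(linear_model(x) == y for x, y in data) / len(data)
tree_acc = sum(tree_model(x) == y for x, y in data) / len(data)
print(linear_acc, tree_acc)  # 0.75 1.0
```

The 0.75 ceiling is not an artefact of the chosen weights: it is the best any linear separator can do on XOR, which is why non-linear models are worth their loss of explainability on such problems.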
How do I get started?

Depending on the nature of your business, you may have unstructured data that deserve a fresh look:

Call center recordings to be transcribed and semanticized, to better understand your customer relations.

Written texts supplied by clients, or emails sent by staff, to be used to categorize complaints and requests and to detect fads and trends.
The takeaway is that in the use cases of most of our clients, as in international competitions, the vast majority concern structured or semi-structured data:

Mapping links between customers and timestamped transactions can bring to light potential fraud, by processing volumes far beyond what is possible manually.

Web logs, captured as far upstream as possible, characterize the customer journeys which lead to a strategic target such as shopping cart abandonment.

Temporal series produced by industrial sensors help prevent problems on assembly lines.

Server logs identify warning signs before a machine breaks down.

Relational data on clients, sales and products form a set of characteristics, including identity, geographic location, behavior patterns and social networks, which are systematically integrated in the 360 models of the examples described above.
Better yet: personalizing your client segments, predicting component failures, improving the performance of your production units, building customer loyalty, forecasting increases in demand and reducing churn are all possible use cases.[25] Data science has become a strategic business asset that you can no longer do without.
[25] kaggle.com/wiki/DataScienceUseCases
Sources

[1] Dhar V. 2013. “Data science and prediction“. Communications of the ACM.
[2] Cleveland WS. 2001. “Data science: an action plan for expanding the technical area of the field of statistics“. Bell Labs Statistics Research Report.
[3] http://bit.ly/WP-Hadoop2015 (French)
[4] data-science-fondamentaux-et-etudes-de-cas
[5] skype-translator-unveils-the-magic-to-more-people-around-the-world
[6] deepface-closing-the-gap-to-human-level-performance-in-face-verification
[7] http://arxiv.org/pdf/1503.03832.pdf
[8] google-stanford-build-hybrid-neural-networks-that-can-explain-photos
[9] inceptionism-going-deeper-into-neural
[10] learning-meaning-behind-words
[11] grasp-and-lift-eeg-detection
[12] kaggle.com/c/higgs-boson
[13] kaggle.com/c/DarkWorlds/data
[14] data-science-disruptors
[15] four-main-languages-analytics-data-mining-data-science
[16] kdnuggets.com/2015/05/r-vs-python-data-science
[17] Why-is-SAS-insufficient-to-become-a-data-scientist-Why-need-to-learn-Python-or-R
[18] tensorflow-googles-latest-machine_9
[19] kaggle.com/competitions
[20] the-state-of-data-science
[21] wired.com/2013/03/google_hinton/
[22] facebook.com/yann.lecun/posts/10151728212367143
[23] chinese-search-giant-baidu-hires-man-behind-the-google-brain
[24] http://blog.kaggle.com/2014/08/01/learning-from-the-best/
[25] kaggle.com/wiki/DataScienceUseCases
Design for Failure
Description of the pattern

“Everything fails all the time“ is a famous aphorism from Werner Vogels, CTO of Amazon: indeed, it is impossible to plan for all the ways a system can crash, in any layer: an inconsistent administration rule, system resources that are not released after a transaction, a hardware failure, etc.
It is on this simple principle that the Web Giants' architecture is based; it is known as the Design for Failure pattern: software must be able to overcome the failure of any underlying component or infrastructure.
Hardware is never 100% reliable; it is therefore crucial to isolate components and applications (data grids, HDFS...) to guarantee permanent service availability.
At Amazon, for example, it is estimated that 30 hard drives are changed every day per data center. The cost is justified by the nearly constant availability of the site amazon.fr (less than 0.3 s of outage per year), bearing in mind that each minute of outage costs over 50,000 euros in lost sales.
A distinction is generally made between the traditional continuity of service management model and the design for failure model which is characterized by five stages of redundancy:
Stage 1: physical redundancy (network, disk, data center). That is where the traditional model stops.
Stage 2: virtual redundancy. An application is distributed over several identical virtual machines within a VM cluster.
Stage 3: redundancy of the VM clusters (or Availability Zone on AWS). These clusters are organized into clusters of clusters.
Stage 4: redundancy of the clusters of clusters (or Region on AWS). A single supplier manages these regions.
Stage 5: redundancy of Internet suppliers (e.g. AWS and Rackspace) in the highly unlikely event of AWS being completely down.

Of course, you will have understood that the higher the redundancy level, the more the deployment and switch-over mechanisms are automated.
THE WEB GIANTS ARCHITECTURE / DESIGN FOR FAILURE
Applications designed for failure continue to function despite crashes of the system or of connected applications, even if that means degrading functionality for the most recently connected users, or for all users, in order to keep providing an acceptable level of service.
This entails including design for failure in the application engineering, based for example on:
Eventual consistency: instead of systematically seeking consistency on each transaction through often costly mechanisms of the XA[1] type, consistency is ensured eventually, once the failed services are available again.
Graceful degradation (not to be confused with the Web user interface technique of the same name): during sharp spikes in load, performance-costly functionalities are deactivated on the fly.
At Netflix, the streaming service is never interrupted, even when the recommendation system is down, failing or slow: the service is there, no matter what the failure.
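This graceful-degradation idea can be sketched as a simple fallback (the function and title names are invented for illustration, not Netflix's actual code): if the personalised service fails, a precomputed popular list is served instead, and the page never breaks.

```python
def recommendations(user):
    """Hypothetical personalised service that may be down or slow."""
    raise TimeoutError("recommendation cluster unreachable")

# Precomputed, always-available fallback content (invented titles).
POPULAR_TITLES = ["title-a", "title-b", "title-c"]

def homepage(user):
    """Design for Failure: the page degrades instead of breaking.
    If the recommender fails, fall back to a static popular list so the
    core service (streaming) is never interrupted."""
    try:
        return recommendations(user)
    except Exception:
        return POPULAR_TITLES

print(homepage("alice"))  # ['title-a', 'title-b', 'title-c']
```

Production systems wrap this pattern in circuit breakers and timeouts so that a slow dependency cannot exhaust the caller's resources, but the principle is the same: a degraded answer beats no answer.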
Moreover, to achieve this continuity of service, Netflix uses automated testing tools such as Chaos Monkey (recently open-sourced), Latency Monkey and Chaos Gorilla, which check that applications continue to run correctly despite, respectively, random failures of one or several VMs, network latency, and the loss of an Availability Zone.
Netflix thus lives up to its motto: “The best way to avoid failure is to fail constantly“.
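The idea behind such tools can be sketched in a few lines. This is an illustrative toy, not Netflix's actual Chaos Monkey: a random instance is killed, and the test asserts that the redundant service still answers.

```python
import random

class ServicePool:
    """A pool of redundant instances behind a simulated load balancer."""
    def __init__(self, n_instances: int):
        self.instances = {f"vm-{i}": True for i in range(n_instances)}

    def kill_random(self) -> str:
        """Chaos step: terminate one healthy instance at random."""
        victim = random.choice([v for v, up in self.instances.items() if up])
        self.instances[victim] = False
        return victim

    def handle_request(self) -> str:
        """Route around dead instances; fail only on total outage."""
        for vm, up in self.instances.items():
            if up:
                return f"200 OK from {vm}"
        raise RuntimeError("total outage")

pool = ServicePool(3)
pool.kill_random()                        # inject one random failure
response = pool.handle_request()          # the service must still answer
```

Running this "constantly", as the motto suggests, turns failure handling into a permanently exercised code path rather than an untested emergency procedure.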
Who makes it work for them?
Obviously Amazon, which provides the basic AWS building blocks. Obviously Google and Facebook, who communicate frequently on these topics. But also Netflix, SmugMug, Twilio, Etsy, etc.
In France, although some sites have very high availability rates, very few comment on their processes and, to the best of our knowledge, very few are capable of extending their redundancy beyond stage 1 (physical) or stage 2 (virtual machines). Let us nonetheless mention Criteo, Amadeus, Viadeo, and the main telephone operators (SFR, Bouygues, Orange) for their coverage of real-time needs.
[1] Distributed transaction, 2-phase commit.
What about me?
Physical redundancy, rollback plans, Disaster Recovery Plan sites, etc. are not Design for Failure patterns but rather redundancy stages.
Design for Failure entails a change in paradigm, going from “preventing all failures“ to “failure is part of the game“, going from “fear of crashing“ to “analyzing and improving“.
In fact, applications built along the lines of Design for Failure no longer generate such feelings of panic, because all failures are handled as a matter of course; this leaves time for post-mortem analysis and improvements through PDCA.[2] It is, to borrow a term from improv theater, “taking emergencies easy“.
This entails taking action on both a technical and a human level. First of all in application engineering:
The components of an application or application set must be decentralized and made redundant across VMs, Zones and Regions (in the cloud; the same principle applies if you host your own IS), without any shared failure zones. The most complex issue is synchronizing the databases.
All components must be resilient to underlying infrastructure failures.
Applications must support communication breaks and high network latency.
The entire production workflow for these applications has to be automated.
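Tolerating communication breaks and high latency usually comes down to wrappers like the following sketch: a retry loop with a bounded number of attempts and exponential backoff. The function and parameter names are illustrative, not taken from any particular stack.

```python
import time

def call_with_retry(operation, retries: int = 3, base_delay: float = 0.01):
    """Retry a remote call with exponential backoff before giving up."""
    last_error = None
    for attempt in range(retries):
        try:
            return operation()
        except ConnectionError as err:    # transient network failure
            last_error = err
            time.sleep(base_delay * 2 ** attempt)   # back off, then retry
    raise last_error                      # give up: let the caller degrade

# A simulated dependency that fails twice, then recovers.
calls = {"count": 0}
def flaky_service():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("network partition")
    return "payload"

result = call_with_retry(flaky_service)
```

If the retry budget is exhausted, the exception propagates and the caller can fall back on graceful degradation instead of hanging.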
Then, for the organization:
Get out of the A-Team culture (remember: “the last chance at the last moment“) and automate processes to overcome systems failure. At Google, there is 1 systems administrator for over 3000 machines.
[2] Plan-Do-Check-Act, a method for continuous improvement, known as the “Deming Wheel“.
Analyze and fix failures upstream with the Failure Mode and Effects Analysis (FMEA) method, and downstream with post-mortems and PDCA.
Related Patterns
Pattern “Cloud First“, p. 159.
Pattern “Commodity Hardware“, p. 167.
Pattern “DevOps“, p. 71.
Exceptions
For totally disconnected applications with few users or few business challenges, redundancy can be simple or non-existent. Arbitration between the redundancy stages is then carried out using ROI criteria (costs and complexity vs. estimated losses during periods of unavailability).
Sources
• Don MacAskill, How SmugMug survived the Amazonpocalypse, 24 April, 2011:
> http://don.blogs.smugmug.com/2011/04/24/how-smugmug-survived-the-amazonpocalypse
• Scott Gilbertson, Lessons From a Cloud Failure: It’s Not Amazon, It’s You, 25 April, 2011:> http://www.wired.com/business/2011/04/lessons-amazon-cloud-failure
• Krishnan Subramanian, Designing For Failure: Some Key Facts, 26 April, 2011:
> http://www.cloudave.com/11973/designing-for-failure-some-key-facts
The Reactive Revolution
For many years now, concurrent processes have been executed in separate threads. A program is basically a sequence of instructions that runs linearly within a thread. To perform all the requested tasks, a server spawns several threads. But these threads spend most of their time waiting for the result of a network call, a disk read or a database query.
Web giants have moved on to a new model to eliminate such time loss and to increase the number of users per server by reducing latency, improving performance globally and managing peak loads more simply.
The reactive manifesto defines a reactive application around four interrelated pillars: event-driven, responsive, scalable and resilient.
A responsive application is event-driven: it can provide an optimal user experience by making better use of the available computing power and by tolerating errors and failures better, hence its scalability and resilience. But the most powerful concept here is the event-driven orientation; everything else can be seen through this prism.
The reactive model is a development model driven by events.
It is called by a variety of names. It's all a matter of perspective:
event-driven, driven by events
reactive, that reacts to events
push-based application, where data is pushed as it becomes available
Or better yet: the Hollywood principle, summarised by the famous “don’t call us, we’ll call you“
Use cases: when latency matters
This architectural model is very relevant for applications interacting with users in real time.
This includes several use cases like:
Social networks, shared documents and direct communication tools
Financial analysis, pooled information like traffic congestion or public transport, pollution...
Multiplayer games
Multi-channel approaches, mobile application synchronisation
Open or private APIs, when usage is impossible to predict
IoT and index management
Massive user influx such as sport events, sales, TV ads...
And more generally when effectively managing complex algorithms is the issue, e.g. for ticket booking, graph management, the semantic web
One of the crucial elements in all these applications is latency handling. For an application to be responsive and thus usable, users must experience the lowest possible latency.
It’s all about the threading strategy
To put it simply, there are two types of thread:
Hard-threads: real concurrent processes, executed by the different processor cores
Soft-threads: simulated concurrent processes, each given a slice of CPU time in turn
Soft-threads are what allow machines to run many more threads simultaneously than they have cores.
The reactive model aims to eliminate as many soft-threads as possible and rely on hard-threads alone, thereby making more efficient use of modern processors.
To reduce the number of threads, the CPU must be shared not on a time basis but on an event basis. Each event triggers the processing of a piece of code, which must never block, so as to release the CPU as quickly as possible for the next event.
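The model can be sketched with Python's asyncio, one event-loop implementation among many (a simplified illustration, not tied to any particular Web Giant's stack): a single thread keeps a hundred simulated I/O calls in flight because each `await` releases the CPU instead of blocking.

```python
import asyncio

async def fake_io_call(name: str, delay: float) -> str:
    """Simulated network/disk call: awaiting releases the CPU."""
    await asyncio.sleep(delay)
    return f"{name} done"

async def serve() -> list:
    # 100 "requests" in flight at once on a single thread:
    # the event loop interleaves them instead of blocking 100 threads.
    tasks = [fake_io_call(f"req-{i}", 0.01) for i in range(100)]
    return await asyncio.gather(*tasks)

results = asyncio.run(serve())
```

A thread-per-request server would need 100 threads for the same workload; here the total wall-clock time is roughly one 10 ms wait, not one hundred.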
ARCHITECTURE / THE REACTIVE REVOLUTION
Implementing this model means working across all software layers: from operating systems to development languages, by way of frameworks, hardware drivers and databases.
Data structures that eliminate locks are without doubt an important lever for system performance. The new functional data models thus become the best allies of the reactive model.
Among new software making the most buzz, many use an internal reactive model. To name but a few: Redis, Node.js, Storm, Play, Vertx, Axom, and Scala.
The reactive model also responds better to load peaks: it pushes back the limit on the number of simultaneous users imposed by an arbitrary, fixed thread-pool size. Most of the Web Giants have published feedback on their migration to this model: Coursera,[1] Gilt, Groupon, Klout, LinkedIn,[2] Netflix,[3] PayPal, Twitter,[4] Walmart[5] and Yahoo.
Their voices are unanimous: reactive architectures make it possible to offer the best user experience with the highest scalability.
Why now?
“Software gets slower faster than hardware gets faster. “ Niklaus Wirth – 1995
The reactive model is not new. It has been used in all user interface frameworks since the invention of the mouse. Each click or keystroke generates an event.
Even client-side JavaScript uses this model. There are no threads in this language, yet it is possible to have multiple simultaneous AJAX requests. Everything works using callbacks and events.
[1] http://downloads.typesafe.com/website/casestudies/Coursera-Case-Study.pdf
[2] http://engineering.linkedin.com/play/play-framework-async-io-without-thread-pool-and-callback-hell
[3] http://www.infoq.com/presentations/netflix-reactive-rest
[4] https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
[5] http://venturebeat.com/2012/01/24/why-walmart-is-using-node-js/
Current development architectures are the result of a succession of steps and evolutions. Some strong concepts have been introduced and used extensively before being replaced by new ideas. The environment is also changing. The way we respond to it has changed.
User experience has been the driving force of this change: today, who is willing to fill in a form, wait for the page to reload to get feedback (failure/success), and then wait again for the confirmation email? Why not get such information immediately rather than asynchronously?
Have we reached the limits of our systems? Is there still space to be conquered? Performance gains to discover?
In our systems there is a huge untapped reservoir of power. To double the number of users, adding a server will do the trick. But since the advent of mobile, companies have had to handle around 20x more requests: is it reasonable to multiply the number of servers in proportion? And would it be sufficient? Certainly not. It makes more sense to review the architecture so as to harness the power that is already available: there are many more processor cycles left to exploit. When programs spend significant amounts of time waiting for disks, networks or databases, they do not use the server's full potential.
This paradigm is now accessible to everyone, as it is increasingly built into modern development languages. These new development patterns integrate latency and performance management from the very beginning of a project, instead of leaving them as a challenge to overcome when it is too late to change the application architecture.
Applications based on the request/response model (HTTP / SOAP / REST) can tolerate a thread-based model. In contrast, applications based on flows such as JMS or WebSocket have everything to gain from an event-based model.
Unless your application is mostly devoted to calculations, you should start thinking about implementing the reactive approach. The paradigm is compatible with all languages.
Things are moving fast: new frameworks now offer asynchronous APIs and mostly use non-blocking APIs internally; language libraries are also changing, providing classes that make it possible to react to events more simply; and, lastly, the languages themselves are evolving to make it easier to write simple code (closures) or to generate asynchronous code from synchronous code.
In addition, patterns can be set up to manage threadless multitasking scripts:
a generator, which produces elements and pauses for each iteration, until the next invocation
continuation, a closure representing the rest of the processing, executed once the current step completes
coroutine, which makes it possible to pause processing
composition, which makes it possible to sequence processing in the pipeline
async/await, which makes it possible to write asynchronous code in a sequential style
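Two of the patterns above can be sketched in Python, whose generators pause and resume exactly as described (an illustrative example, not drawn from any particular framework): a generator stage, and composition of stages into a lazy pipeline.

```python
# Generator: produces elements and pauses at each iteration,
# until the next value is requested.
def numbers(limit):
    n = 0
    while n < limit:
        yield n          # pause here until the next invocation
        n += 1

# A second stage, itself a generator: resumes only when asked for a value.
def squares(source):
    for n in source:
        yield n * n

# Composition: stages are chained into a lazy pipeline; nothing runs
# until values are pulled from the end of the chain.
pipeline = squares(numbers(5))
result = list(pipeline)   # [0, 1, 4, 9, 16]
```

No thread is created anywhere: the interpreter simply switches between the paused stages as values are pulled through the pipeline.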
In other words, the reactive revolution is underway!
How can I make it work for me?
Reactive architecture is to traditional architecture what NoSQL is to relational databases: a very good alternative when you have reached your limits.
It is all a question of latency and concurrent access: for real-time applications, whether embedded or not, choosing a reactive architecture is justified as soon as volumes increase significantly. So no reactive corporate website, but rather real-time processing and display of IoT data (a vehicle fleet's positions, for example).
The same goes for APIs, whose back-ends must be designed accordingly: if your volume is under control, reactive architecture is overkill. But for an API open to partners, or even a fully open API, a non-blocking architecture must be designed from the outset.
Lastly, on the one hand, using the cloud wisely can help you overcome many of these limits (AWS Lambda, for example); on the other, many software vendors have demonstrated their willingness to produce highly scalable architectures. When choosing a software package, whether SaaS or hosted on-premises, for these use cases, companies must now turn to vendors who have proven mastery of such architectures.
All of these technologies have physical limits. Disk volumes are increasing, but not access time. There are more cores in processors, but frequency has not increased. Memory is increasing, beyond the capacity of garbage collectors. If you are nearing these limits, or will do so in the next few years, reactive architecture is definitely made for you.
Open API
THE WEB GIANTS ARCHITECTURE / OPEN API
Description
The principle behind Open API is to develop and offer services which can be used by a third party without any preconceived ideas as to how they will be used.
Development is thus mainly devoted to the application logic and persistence. The interface and business logic are developed by others, often more specialized in interface technologies and ergonomics, or with other areas of expertise.[1]
The application engine therefore exposes an API,[2] that is, a set of services. The final application is built by composing services, which can include services provided by third parties. This is the case, for example, of HousingMaps.com, a service for visualizing CraigsList advertisements on Google Maps.
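The mashup idea can be sketched as follows; the listing data and the geocoding table are stand-ins for real calls to classified-ads and mapping APIs, and all names here are invented for the illustration.

```python
# Illustrative mashup in the HousingMaps spirit: compose two independent
# services (listings and geocoding) into one application.
def fetch_listings():
    # stand-in for a call to a classified-ads API
    return [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Lyon"}]

def geocode(city):
    # stand-in for a call to a mapping API
    coords = {"Paris": (48.86, 2.35), "Lyon": (45.76, 4.84)}
    return coords[city]

def listings_on_map():
    """The mashup: enrich each listing with coordinates for map display."""
    return [dict(ad, position=geocode(ad["city"])) for ad in fetch_listings()]

pins = listings_on_map()
```

The mashup author never sees either provider's internals: the composition lives entirely at the API level, which is precisely the decoupling Open API aims for.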
The pattern belongs to the broader principles of SOA:[3] decoupling and composition possibilities. For a while, there was a divide between the architecture of Web Giants, generally of the REST[4] type and corporate SOA, mostly based on SOAP.[5] There has been a lot of controversy among bloggers on this opposition between the two architectures. What we believe is that the REST API exposed by Web Giants is just one form of SOA among others.
Web Giants publicly expose their API, thus creating open ecosystems. What this strategy does for them is to:
Generate direct income, by billing the service. Example: Google Maps charges for their service beyond 25,000 transactions per day.
Expand the community, thereby recruiting users. Example: thanks to the apps derived from its platform, Twitter has reached 140 million active users (and 500 million subscribers).
[1] http://www.slideshare.net/kmakice/maturation-of-the-twitter-ecosystem
[2] Application Programming Interface.
[3] Service Oriented Architecture.
[4] Representational State Transfer. [5] Simple Object Access Protocol.
Foster the emergence of new uses of the platform, thereby developing their revenue model. Example: in 2009, Apple noted that application developers wanted to sell not only their applications, but also content for them. The AppStore model was changed to include that possibility.
At times, externalize R&D, then acquire the most talented startups. That is what Salesforce did with Financialforce.com.
Marc Andreessen, creator of Netscape, divides open platforms into three types:
Level 1 - Access API: these platforms allow users to access business applications without providing the user interface. Examples: book searches on Amazon, geocoding on Mappy.
Level 2 - Plug-in API: These platforms integrate applications in the supplier’s user interface. Examples: Facebook apps, Netvibes Widgets.
Level 3 - Runtime Environment: These platforms provide not only the API and the interface, but also the execution environment. Example: AppExchange applications in the Salesforce or iPhone ecosystem.
It is also good to know that Web Giants APIs are accessible in self-service, i.e. you can subscribe directly on the web site without any commercial relations with the provider.
At level 3, you must design a multi-tenant system. The principle is to manage the applications of several businesses on shared infrastructure, finding a balance between pooling and isolation.
The API First pattern is derived from the Open API pattern: its approach is to begin by building an API, then to consume it to build the applications for your end users. The idea is to be on the same level as the ecosystem's users, which means applying to yourself the same architecture principles you are offering your clients: the Eat Your Own Dog Food (EYODF) pattern. Some architects working for Web Giants consider it the best way to build a new platform.
In practice, the API First pattern is an ideal which is not always reached: in recent history, it would seem that it has been applied for Google Maps and Google Wave, two services developed by Lars Rasmussen. And yet it was not applied for Google+, stirring the wrath of many a blogger.
Who makes it work for them?
Pretty much everyone, actually...
References among Web Giants
The Google Maps API is a celebrity: according to ProgrammableWeb.com, it is, alongside Twitter's, one of the APIs most used by websites. It has become the de facto standard for showing objects on a map. It uses authentication (client IDs) to measure a given application's consumption, so as to be able to bill for the service beyond a certain quota.
Twitter’s API is widely used: it offers sophisticated services to access subscriber data, for both reading and writing. One can even use streaming to receive tweet updates in real time. All of the site’s functionalities are accessible via the API, which also makes it possible to delegate the authorization process (using the OAuth protocol), thereby allowing a third-party application to tweet in your name.
In France
The mapping service Mappy offers APIs for geocoding, calculating itineraries, etc., available at api.mappy.com.
With api.orange.com, Orange offers the possibility to send text messages, to geolocate subscribers, etc.
What about me?
You should consider Open API whenever you want to create an ecosystem open to partners or clients, in-house or externally. Such an ecosystem can be open on the Internet or restricted to a single organization. A fairly classic scenario in a business is exposing the employee directory so that their identities can be integrated into applications.
Another familiar case is integrating services exposed by other suppliers (for example a bank consuming the services of an insurance company).
Lastly, a less traditional use is to open a platform for your end clients:
A bank could allow its users to access all of their transactions: see the examples of the AXA Banque and CAStore APIs.
A telephone or energy provider could give their clients access to their current consumption rate.
Related Pattern
Pattern “Device Agnostic“ p. 143
Exception!
Anything requiring a complex workflow.
Real-time IT (aircraft, car, machine tool): in this case service composition can pose performance issues.
Data manipulation posing regulatory issues: channelling critical data between platforms is best avoided.
Sources
• REST (Representational State Transfer) style:
> http://en.wikipedia.org/wiki/Representational_State_Transfer
• SOA> http://en.wikipedia.org/wiki/Service-oriented_architecture
• Book “SOA, Le guide de l’architecte d’un SI agile“ (French only):> http://www.dunod.com/informatique-multimedia/fondements-de-lin- formatique/architectures-logicielles/ouvrages-professionnel/soa-0
• Open platforms according to Marc Andreessen:> http://highscalability.com/scalability-perspectives-3-marc-andreessen- internet-platforms
• Mathieu Lorber, Stéphen Périn, What strategy for your web API? USI 2012 (French only):> http://www.usievents.com/fr/sessions/1052-what-strategy-for-your- web-api?conference_id=11-paris-usi-2012
About OCTO Technology
“We believe that IT transforms our societies. We are fully convinced that major breakthroughs are the result of sharing knowledge and the pleasure of working with others. We are constantly in quest of improvements.
THERE IS A BETTER WAY !“– OCTO Technology Manifest
OCTO Technology specializes in consulting and ICT project creation.
Since 1998, we have been helping our clients build their Information Systems and create the software to transform their firms. We provide expertise on technology, methodology, and Business Intelligence.
At OCTO our clients are accompanied by teams who are passionate about maximizing technology and creativity to rapidly transform their ideas into value: Adeo, Altadis, Asip Santé, Ag2r, Allianz, Amadeus, Axa, Banco Fibra, BNP Fortis, Bouygues, Canal+, Cdiscount, Carrefour, Cetelem, CNRS, Corsair Fly, Danone, DCNS, Generali, GEFCO ING, Itaú, Legal&General, La Poste, Maroc Telecom, MMA, Orange, Pages jaunes, Parkeon, Société Générale, Viadeo, TF1, Thales, etc.
We have grown into an international group with four subsidiaries: Morocco, Switzerland, Brazil and, more recently, Australia.
Since 2007, OCTO Technology has been granted the status of “innovative firm“ by OSEO Innovation.
For four years, from 2011 to 2015, OCTO was awarded 1st or 2nd prize in the Great Place to Work contest for firms with fewer than 500 employees.
ABOUT US
Authors
Erwan Alliaume
David Alia
Philippe Benmoussa
Marc Bojoly
Renaud Castaing
Ludovic Cinquin
Vincent Coste
Mathieu Gandin
Benoît Guillou
Rudy Krol
Benoît Lafontaine
Olivier Malassi
Éric Pantera
Stéphen Périn
Guillaume Plouin
Phillipe Prados
Translated from the French
by Margaret Dunham & Natalie Schmitz
Copyright © November 2012 by OCTO Technology. All rights reserved.
Illustrations
The drawings are by Tonu in collaboration with Luc de Brabandere. They are both active on www.cartoonbase.com, located in Belgium. CartoonBase works mostly with businesses and works to promote the use of cartoons and to encourage greater creativity in graphic art and illustrations of all kinds.
Graphics and design by OCTO Technology,
with the support of Studio CPCR
ISBN 13 : 978-2-9525895-4-3
Price: AUD $32
The Web Giants
Culture – Practices – Architecture
In the US and elsewhere around the world, people are reinventing the way IT is done. These revolutionaries most famously include Amazon, Facebook, Google, Netflix, and LinkedIn. We call them the Web Giants.
This new generation has freed itself from tenets of the past to provide a different approach and radically efficient solutions to old IT problems. Now that these pioneers have shown us the way, we cannot simply maintain the status quo. The Web Giant way of working combines firepower, efficiency, responsiveness, and a capacity for innovation that our competitors will go after if we don’t first.
In your hands is a compilation and structural outline of the Web Giants’ practices, technological solutions, and most salient cultural traits (obsession with measurement, pizza teams, DevOps, open ecosystems, open software, big data and feature flipping).
Written by a consortium of experts from the OCTO community, this book is for anyone looking to understand Web Giant culture. While some of the practices are fairly technical, most of them do not require any IT expertise and are open for exploitation by marketing and product teams, managers, and geeks alike. We hope this will inspire you to be an active part of IT, that driving force that transforms our societies.
THE OBSESSION WITH MEASUREMENT • FLUIDITY OF THE USER EXPERIENCE • ARTISAN CODERS • BUILD VERSUS BUY • CONTRIBUTING TO FREE SOFTWARE • DEVOPS • PIZZA TEAMS • MINIMUM VIABLE PRODUCT • PERPETUAL BETA • A/B TESTING • DEVICE AGNOSTIC • OPEN API AND OPEN ECOSYSTEMS • FEATURE FLIPPING • SHARDING • COMMODITY HARDWARE • TP VERSUS BI: THE NEW NOSQL APPROACH • CLOUD FIRST • DATA SCIENCE • REACTIVE PROGRAMMING • DESIGN THINKING • BIG DATA ARCHITECTURE • BUSINESS PLATFORM
OCTO designs, develops, and implements tailor-made IT solutions and strategic apps
...Differently.
WE WORK WITH startups, public administrations, AND large corporations FOR WHOM IT IS a powerful engine for change.
octo.com - blog.octo.com - web-giants.com