A talk I gave at SDForum's Software Architecture and Modeling SIG.
The Evolving Architecture of the Web
William Grosso, Twofish
In the past few years, the world of web applications has changed dramatically. Some of the changes have been obvious and fully chronicled: the rise of rich internet applications, the reliance on advertising models…
But while the world has noted these user-facing changes, another, more profound, shift has been occurring with less fanfare: Web applications have become deeply linked, not just in the HTML but at the service level. As a result, web applications are growing exponentially more complex and functional, and new opportunities are rising for service providers AND application developers.
In this talk, I'll give an overview of the emerging architecture of the web, talk about how its evolution is changing the breadth of what application providers can offer, and dive into some real lessons from building out an internet-scale virtual economy service.
AKA
Andreessen, Berners-Lee, Bray, et al. were somewhat wrong in the large and in the long term, but they built some cool stuff and, man, did they change the world.
AKA
Disintermediation and you
(great web page note: http://www.saffo.com/essays/disinteremediation.php and http://en.wikipedia.org/wiki/Disintermediation for background)
Outline
• Who Am I / Standard Caveats
• The first 10 years of the web
• The second 10 years of the web
• Where we’re going from here
• Concrete Predictions
Who Am I?
End of Year Talk
• Traditionally a time to be a little “forward thinking”
• Goal is to take a step back and look at large-scale trends
• Goal is also to be somewhat provocative
– “I would rather be vaguely correct than precisely wrong” – John Maynard Keynes
– “The references are not without merit” – George Santayana
I’m Predicting the Future of the Web
• This is a seriously wacky thing to do
– 2 billion users
– At least 10 million people coding various bits and pieces
– Nobody really has a handle on what it is or what it’s used for.
• And most of you got in for free anyway
• “Opinions are like nipples. Everyone has two.” – Brian Goncher.
Historically, predicting the net hasn’t been easy ….
My Employer is Not To Blame
• Twofish is a great company
– We make software to support virtual economies (SaaS backend play)
– I’m the CTO
• You ain’t the customer, I ain’t here to sell you nothing, and this is mostly my opinion
– “And besides, the wench is dead.” – Christopher Marlowe.
Interrupt at Any Time
• A lot of this talk is historical in nature.
– I strongly believe that predictions should be grounded in trends
– And accurately spotting trends requires paying attention to what has happened.
• You guys were here too
– Stop me if I get something wrong!
It Started in 1990
Wikipedia:
• “In 1989, Berners-Lee and CERN data systems engineer Robert Cailliau each submitted separate proposals for an Internet-based hypertext system providing similar functionality. The following year, they collaborated on a joint proposal, the WorldWideWeb (W3) project, which was accepted by CERN.”
• “The first publicly available description of HTML was a document called HTML Tags, first mentioned on the Internet by Berners-Lee in late 1991”
Design Center of Original Web
• Better publishing
• Easier way to get scientific papers
• You go to a web site, see a list of things you can read, and choose one.
– The document is written in an SGML-defined markup language (HTML)
– The browser renders it
1993
• Nothing much happened (apparently, the world was busy playing with CD-ROMs)
• Well, okay, one thing happened.
– Adobe defined PDF 1.0
1994
• Netscape founded
• W3C founded
• SSL 2.0 released
– SSL 1.0 was never released
– SSL 2.0 had huge security holes that made it unusable for serious security needs.
• Cookies implemented in Navigator, though not announced to world
• Apache Group begins working on Apache Server
1995
• HTML 2.0 was published as IETF RFC 1866
– Not the W3C – it had been created in 1994, but didn’t own the HTML standard yet.
• Java released
• Java Applets released
• First wiki released by Ward Cunningham
• Keynote Systems founded (monitoring websites became a viable business)
• PHP created (first web-native language)
• Amazon.com launched (but was only about books)
• Squid-proxy spun out of Harvest cache daemon
• Netscape releases first version of JavaScript in Navigator 2
– Named after “Java” because Java was hot.
– Not supported in Internet Explorer until version 3 (1996)
• CGI defined and implemented
1996
• SSL 3.0 released. For the first time, the internet had secure browsing functionality.
• HTML gains tables and imagemaps.
• FastCGI defined and implemented
• World learns about cookies.
• CSS Level 1 becomes a W3C recommendation
• Flash 1 released by Macromedia
• Amazon launches Amazon Affiliates
Later on, this article goes on to talk about how to disable cookies in Navigator 3. The Federal Trade Commission had a series of hearings on cookies in 1997.
1997
• ActiveX controls introduced by Microsoft
• HTML 3.2 was published as a W3C Recommendation. It was the first version developed and standardized exclusively by the W3C.
• WAP 1.0 defined
• HTML 4.0 was published as a W3C Recommendation.
1998
• “Open Source” coined as a term
– Note that GNU had been around since the 1980s, Linux since the early 1990s, and the Apache Group since 1994
• Microsoft introduces the XMLHTTP object (the precursor of XMLHttpRequest) but nobody really notices
• CSS Level 2 becomes a W3C recommendation
• XML 1.0 defined
• Akamai founded
• Netscape Navigator open-sourced
Believe it or not, XML began as an attempt to add semantics to the web
1999
• HTML 4.01 was published as a W3C Recommendation.
• Apache Software Foundation legally incorporated from Apache Group
• Dave Winer publishes XML-RPC specification
• RSS 0.91 created
• RDF becomes a W3C recommendation
• Blogger launched (acquired by Google in 2003)
• “Information Rules” published
2000
• Flash 5 released with ActionScript 1.0 (essentially JavaScript)
• Struts launched
– First serious MVC platform
• XHTML becomes a W3C recommendation
Architecture of Sites
Architecture of Internet
• Content wells
• Sites mostly inward-facing
– This is why we needed search engines so much
What Really Happened
• High speed innovation on standards (especially front-end)
– “Internet Time”
• High speed changes in design standards
• Cultural Assimilation
– The idea of a website entered the popular culture
– People got used to cookies and search engines and JavaScript on pages and bookmarking and ….
• The Browser Wars
– By the end of the decade, Netscape was just plain dead
• The Rise of Open Source
• Publishing and Subscriptions as the Central Business Model
– Most of the innovation was around display technology, identifying users (and security) and server-side infrastructure
– Sophisticated client functionality mostly speculative
[Figure panels: 1997, 2000, Now]
By 2001, Melanie Griffith’s Internet Company Was Already Gone
Key Technology Lessons
• HTTP enabled very loose client-server coupling.
– This made rapid changes in HTML possible.
– Enabled the creation of many different server platforms
• An empirical observation on internet technology adoption.
– From initial splash to high profile adoption is less than 2 years for all known successful internet technology standards.
– From high profile adoption to large scale adoption is less than 3 years for all known successful internet technology standards.
Foreshadowing
• In the 2000s, the things that changed were completely different
– Massive changes in how the web was used
– Massive changes in the underlying computational substrate
– Massive changes in the power of the client machines
2001
• Wikipedia founded on idea of collaborative editing of an encyclopedia.
• Tim Berners-Lee publishes an article in Scientific American on the semantic web
Nothing much else happened because of the dot-com crash. Lots of interesting changes in the years to follow, though.
2002
• Google launches their first API (for searching)
• Friendster and LinkedIn founded
• Fielding’s ACM paper (“Principled Design of the Modern Web Architecture”) kickstarts REST ideology.
– Subset of his thesis, made more readable.
2003
• SOAP becomes a W3C recommendation
• Laszlo 1.0 (later OpenLaszlo) released
• Memcached 1.0 released
– LiveJournal stack starts to head out into the world
2004
• First Web 2.0 conference held
• Firefox 1.0 released
• Facebook launched
• Microformats.org launched
• Dojo Toolkit founded
• Podcasting starts to catch on
• WhatWG founded and starts work on HTML 5
• OWL becomes a W3C recommendation
• Flex 1.0 released
Somewhere Between 2001 and 2004
• Online advertising became the dominant business model on the web
• Google became the number 1 search engine
• CSS became the preferred way to do layout
– Still only partially adopted in 2008 though
2005
• The term “Ajax” coined
• “User Generated Content” coined and enters mainstream usage
• “Vertical Search Market quickly becomes crowded” (http://www.ecommercetimes.com/story/41982.html)
• Ruby on Rails hits 1.0
• “Mashups” briefly become all the rage (before settling into “Add a Map to your App”)
• Dutch Police accidentally find a botnet with 1.5 million PCs
– http://www.securityfocus.com/brief/19
2006
• Prototype (JS toolkit) started
• Amazon EC2 enters beta
• Mashery founded
• First version of Facebook API released
• Focus on collective intelligence
– Wisdom of crowds / Prediction markets / GATE and LingPipe
• CafePress brought down by botnet attack
– http://camelsnose.wordpress.com/2007/01/07/al-firdaws-cyberspace-terrorists-or-script-kiddies/
– http://irregulartimes.com/index.php/archives/2007/01/05/umran-javed-cafepress/#comment-254171
– http://www.darknet.org.uk/2007/02/cafepresscom-under-heavy-ddos-attack/
2007
• iPhone released
– “The Mobile Web” becomes a reality with Safari browser
• Amazon releases the Kindle
• “Social Graph” coined
• Facebook Platform launched
• OpenSocial founded
• Adobe and Yahoo put ads in PDF
2008
• Facebook hits 50,000 widgets
• Microsoft launches Live Mesh
• Yahoo launches BOSS and SearchMonkey
• Twitter becomes stable with 6M users
• The great financial collapse
Somewhere Between 2004 and 2008
• Chat becomes an important feature of many major websites
• The “danga stack” wins for the consumer internet
Architecture of Sites Is Complex
• DB is now MySQL, sharded and with master-slave replication
• DB guarded by a huge farm of memcached servers
• Asynchronous worker processes handling most tasks in background
• Goal is to horizontally scale out
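The read path of the stack described above can be sketched in a few lines. This is a minimal illustration, not the code of any real site: plain dictionaries stand in for the MySQL shards and the memcached farm, and the class names `ShardedStore` and `CachedStore` are hypothetical.

```python
import zlib


class ShardedStore:
    """Toy stand-in for a set of database shards (plain dicts here)."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]
        self.reads = 0  # counts reads that actually reach a "database"

    def _shard_for(self, key):
        # A stable hash, so the same key always maps to the same shard.
        index = zlib.crc32(key.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def get(self, key):
        self.reads += 1
        return self._shard_for(key).get(key)

    def put(self, key, value):
        self._shard_for(key)[key] = value


class CachedStore:
    """Cache-aside reads: try the cache first, fall back to the database."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # stand-in for the memcached farm

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.store.put(key, value)
        self.cache.pop(key, None)  # invalidate the cached copy on write
```

Adding shards or cache nodes is what "horizontally scale out" means here: capacity grows by adding machines rather than by buying a bigger one.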
Architecture of Facebook Application
• Like standard app, but more complex
[Diagram: Browser; FB Proxy; Standard Web App Goo (proxied by FB, and calling FB for social graph data); Social graph API; FBML; FBJS; FQL; the WS API]
Emergence of Backend Services
• Want fast binary content? Use a CDN
• Want to store data robustly? You should probably use the cloud
• Sending email? Use something like ExactTarget.
• Taking money? That’s PayFlowPro as a web service
• Doing geographic customization? Call out to Quova
• Doing fraud detection? Call out to iovation
• Need software license management for your SaaS play? That’s Zuora.
• Need search functionality? Use BOSS
• Need some collaborative filtering? Aggregate Knowledge does that
• And so on
– With the exception of advertising, and some trivial mashups, almost all the high-value integration with “other sites” occurs at the server level
Architecture of Current Internet
[Diagram: Browser; Social / Container Proxy; Standard Web App Goo (app servers, internal services, memcached servers, db servers all running on EC2); High Value Specialized Services run by other companies]
Semi-Blatant Plug
• Want a virtual economy, including currencies and micro-transactions? Use Twofish
– We fit nicely into the right hand side
– But we also use a bunch of those services
[Diagram: Browser; Social / Container Proxy; Standard Web App Goo (app servers, internal services, memcached servers, db servers all running on EC2); High Value Specialized Services run by other companies, which are often layered on top of each other]
The Fabric Changed, Web Applications Didn’t
• Users expect more interactivity. Otherwise, slow change in visual design. Rate of change slowed dramatically.
• Frameworks. Cost of doing business dropped dramatically. Infrastructure changed significantly.
• Hardware, operating systems, tubes. Dramatic change. From virtualization to cores to RAM to O(1) schedulers to ubiquitous bandwidth to ….
CSS is a Huge Learning Point
• Nobody wants to admit it, but HTML 4 wasn’t that bad
– You could get a lot done and it had a pretty easy initial learning curve
– If you can get stuff done already in a widely available technology with a low learning curve, new stuff has a huge hurdle in front of it.
• CSS still isn’t ubiquitous (not even close)
It’s Much Easier to Build Smaller Services and Connect Them
• By and large, people aren’t moving lots of data back and forth between websites
• Instead, they’re calling APIs
• This is huge and important. There are two basic forms of “information collaboration”
– I can give you the data
– Or I can answer your questions
The Web Is Moving to Question Answering, Not Information Sharing
• Easier to build, easier to scale, and easier to monetize
• Google pioneered this with their search API in 2002
– They didn’t give you the index
– They answered very specific questions
– And they rate limited people, so that you couldn’t reconstruct the underlying information from the answers
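The difference between "giving you the data" and "answering your questions" can be made concrete with a small sketch. Everything here is illustrative: the class `QuestionAnsweringService`, the exception `QuotaExceeded`, and the quota value are assumptions for the example, not the interface of Google's API or of any other real service.

```python
class QuotaExceeded(Exception):
    """Raised when a client has used up its allowance of questions."""


class QuestionAnsweringService:
    """Answers narrow questions over a private index, with a per-client quota."""

    def __init__(self, index, daily_quota):
        self._index = index            # term -> set of document ids; never exported whole
        self._daily_quota = daily_quota
        self._used = {}                # client id -> questions asked today

    def search(self, client_id, term):
        used = self._used.get(client_id, 0)
        if used >= self._daily_quota:
            raise QuotaExceeded(client_id)
        self._used[client_id] = used + 1
        # Only the answer to this one question leaves the service.
        return sorted(self._index.get(term, set()))

    def reset_quotas(self):
        # A real service would do this on a schedule (for example, daily).
        self._used.clear()
```

The quota is what keeps a client from walking the whole index one query at a time, which is why this model is easier to monetize than handing over the data.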
Architectures Becoming … Even More Complicated
• The number of moving parts is dizzying, even internally
• When you layer in the across-the-web services, it is astounding
– The failure modes for software are getting more complex
– Product management isn’t keeping up!
Amount of New Code Per App Decreasing Exponentially
1100 lines of code
The Semantic Web Is Doomed
HTML 5 Will Never Be Widely Available
Jakob Nielsen was right, but off in his timing.
The Client Will Remain (mostly) Thin
• Too many different devices, too many distinct display modalities
– Web browsers (5 or 6 now?)
– Game consoles
– Special purpose web devices
– IP TV
– Phones
• Logic is going to mostly remain on the servers
SOAP Will Win Over REST
• That’s unnecessarily antagonistic.
– REST has its place.
• But SOAP will become ubiquitous.
– The need for complex datatypes and interoperability will drive SOAP to the heart of internet-scale computing.
Almost Everything Else Layered on Top of SOAP will Die
Long Value Chains in Building Web Apps
• We’ve already seen the first level of this
– That was the slide listing some of the functionality that is currently outsourced to other data centers
• Those services rely on other services, which rely on other services.
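One consequence of long chains is worth a quick back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that every service in the chain fails independently and has the same availability; real services are correlated and uneven, so treat the numbers as a rough intuition only. The function name `chain_availability` is hypothetical.

```python
def chain_availability(per_service_availability, depth):
    """Availability of a request that needs every one of `depth` services.

    Assumes independent failures and identical availability per service,
    so the request succeeds only if all of them are up at once.
    """
    if not 0.0 <= per_service_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return per_service_availability ** depth


if __name__ == "__main__":
    for depth in (1, 3, 10):
        print(depth, chain_availability(0.999, depth))
```

Under these assumptions, a chain of ten "three nines" services is up roughly 99% of the time, which is one reason failure handling gets harder as value chains get longer.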
A Web of Search Engines
• The ubiquity of high quality search engines
– From Lucene to BOSS
• The growing availability of text analysis engines
– From GATE to OpenCalais
• The utter ease of UI layer creation
– And the ease of deployment onto the cloud
• Google’s market share for search on the web will steadily decrease from now on.
– Though the standard measurements may not show it.
OpenID Will Become Ubiquitous
• Logins? That’s a function of identity
– All the major web apps already support OpenID.
• The arguments for accepting OpenID are compelling for small web apps
– After that, it’s a question of how the 5 biggest OpenID players will form a consortium
An Internet Scale Message Queue Framework Will Emerge
• We’ve got layers and layers
• Cross-internet calls for value added services
– Standard web architectures are already building the backends with asynchronous services
– Polling, asking for results later, and background uploading are common patterns
• Amazon’s SQS is interesting here …
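The "submit now, ask for results later" pattern mentioned above can be sketched in-process. This is a minimal illustration using Python's standard `queue` module; the class `AsyncJobService` and its methods are hypothetical names, and a real deployment would put the queue and the workers on separate machines.

```python
import itertools
import queue


class AsyncJobService:
    """Accepts work immediately, processes it later, and lets callers poll."""

    def __init__(self, handler):
        self._handler = handler          # function that does the actual work
        self._pending = queue.Queue()    # jobs waiting for a worker
        self._results = {}               # job id -> finished result
        self._ids = itertools.count(1)

    def submit(self, payload):
        """Enqueue a job and return its id without waiting for the result."""
        job_id = next(self._ids)
        self._pending.put((job_id, payload))
        return job_id

    def poll(self, job_id):
        """Return ("done", result) if finished, otherwise ("pending", None)."""
        if job_id in self._results:
            return ("done", self._results[job_id])
        return ("pending", None)

    def work_once(self):
        """Process one queued job; return False if the queue was empty."""
        try:
            job_id, payload = self._pending.get_nowait()
        except queue.Empty:
            return False
        self._results[job_id] = self._handler(payload)
        return True
```

The caller never blocks on the slow step: it gets a job id back at once and checks on it later, which is the shape an internet-scale queue would have to support across company boundaries.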
Frameworks Like Grails Will Become Dominant in the Java Universe
• And then dominant everywhere
• Marries Rails-like productivity with Java robustness and back-end libraries
• In the coming world of interop and disintermediated chains, this is a huge winner
There Will Be A
• A billion dollar (valuation) company
– With under 10K lines of non-marketing code
• And no data center
– And under 50 employees
» By 2012
It’s not a billion dollar company, and it has too much code, but ….
Additional Reading