scale

Excellent Presentation on Scalability and MySQL Proxy

I briefly attended a webinar on the use of MySQL Proxy last week called "Designing Scalable Architectures with MySQL Proxy." I say briefly because about 5 minutes in I was called away for an "emergency." So, I'm very happy to see that this absolutely excellent presentation and all its related information has been posted on the web. Here is the starting point, the presentation itself:

http://datacharmer.org/presentations/webinar_proxy_jul2008/mysql_proxy_scalable_arch.pdf

That was a very entertaining slide deck.  Here is a nice aggregation of links to the information as well.

If you are into MySQL, MySQL Proxy, and scalability this is a DO NOT miss.

Enjoy!

Instead of an Article, A Response to a Scalability Article

Nati Shalom has again posted a thoughtful and excellent blog post:

Twitter as a scalability case study

 

So, I just put my comment right there on his blog and it's probably easier to read it there in-line. Here is a re-post of my response though, since it was practically a blog post anyway. It doesn't really stand totally on its own, so you should check out Nati's post first.

A wise article. Thank you. My thoughts after reading it, some directly and some tangentially related...

Slowly but surely people are catching on that loosely coupled, asynchronous-capable systems and software architectures are critical to building truly scalable systems. Twitter even knew this early on and tried to adjust many times but, for reasons unknown to me, still made (and makes) some rather odd choices about their systems and software architecture; their MySQL usage, for example. Funny thing is, the body of work needed to build these sites already exists. People just keep focusing on the wrong things, like language wars, or putting the individual problems into overly broad problem domains, or applying the wrong solutions altogether.

Today's languages and frameworks can take some of the sting out of developing and scaling an application, to a point. But once an application moves beyond any significant traffic level, problems inevitably arrive that lay bare all the bad choices that came before. The people who really know how to address and fix those problems are few and far between. The people who know how to avoid those problems from the very beginning, based on their experience, are even more rare. The companies who have those rare people on staff and actually listen to them almost don't exist at all.

I think, in business, the definition of insanity is doing the same thing over and over and expecting a different result. If so, we're seeing industry insanity around the concepts of designing web based applications that will scale. People just keep making the same mistakes again and again.

I've been thinking about all this in terms of site traffic for the average implementation of a systems and software architecture underlying a web application that might grow on today's terms. I think the mapping from page views per month to a description looks loosely like this.

0 - 100,000 page views = micro site
100,000 - 1,000,000 = small site - troubles start here
1,000,000 - 100,000,000 = large site - troubles magnify dramatically here. OMG Rewrite!
100,000,000 - 1 billion = very large site - nothing works that used to work because it just can't because your system died a tragic death
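For anyone who wants to play with those bands, here is a minimal Python sketch of the same mapping. The function name `classify_site` and the choice to treat each upper bound as exclusive (so exactly 100,000 page views counts as a small site) are my own assumptions; the list above doesn't say which side the boundaries fall on.

```python
# Thresholds are taken from the list above; names and boundary handling
# (upper bound exclusive) are illustrative assumptions.
TIERS = [
    (100_000, "micro site"),
    (1_000_000, "small site"),
    (100_000_000, "large site"),
    (1_000_000_000, "very large site"),
]


def classify_site(page_views_per_month: int) -> str:
    """Return the rough tier label for a monthly page-view count."""
    if page_views_per_month < 0:
        raise ValueError("page views cannot be negative")
    for upper_bound, label in TIERS:
        if page_views_per_month < upper_bound:
            return label
    # The list stops at 1 billion, so anything larger is off the chart.
    return "beyond very large site"


if __name__ == "__main__":
    for views in (50_000, 750_000, 5_000_000, 500_000_000):
        print(f"{views:>13,} page views/month -> {classify_site(views)}")
```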

Each of these requires certain skills and knowledge to build to. But all of them can be handled if planned for well up front. It's commonly held that premature scaling is the root of all evil (and cost overruns). It's just not true. Designing the systems and software architecture of your site to be a very large site doesn't require the capital expense it once did. It's just a technology architecture problem, and it requires a broad range of skills to solve.

Anyway, I've officially rambled on and I want ice cream so bye!

Thanks for reading ProductionScale!

 

Cloud Computing: Get Your Head in the Clouds

Almost every day recently I find myself explaining Cloud Computing to different people at all levels and roles in various organizations. So, I decided to take a stab at it from my point of view. The challenge in explaining cloud computing is that there is more than one answer to the "what is it?" question. The field is evolving rapidly and everyone wants a piece now. This article attempts to define cloud computing and break it down to its most important components in the context of the business use case. This article is for the potential cloud computing consumer. So, what is cloud computing?

Cloud Computing (Figure 1.0) is a commercial extension of computing resources like computation cycles and storage offered as a metered service similar to a physical public utility like electricity, water, natural gas, or the telephone network. It enables a computing system to acquire or release computing resources on demand in a manner such that the loss of any one component of the system will not cause total system failure. Cloud computing also allows the deployment of software applications into an environment running the necessary technology stack for the purposes of development, staging, or production of a software application. It does all this in a way that minimizes the necessary interaction with the underlying layers of the technology stack. In this way cloud computing obfuscates much of the complexity that underlies Software as a Service (SaaS) or batch computing software applications. To explain better though, let's simplify that and break this definition down into its constituent parts.

More compactly stated, cloud computing is a commercial extension of utility computing that enables scalable, elastic, highly available deployment of software applications while minimizing the level of detailed interaction with the underlying technology stack itself.


Definitions of Necessary Terms to clarify and define cloud computing:

Utility Computing - The combination of computing resources as a metered service in a way similar to a physical public utility.

Scalability - The ability of a computing system to grow relatively easily in response to increased demand

Elasticity - The ability of a system to dynamically acquire or release compute resources on-demand

Highly Available - Systems designed such that the loss of any one component of a system will not result in system failure

Deployment - Placing your software application into a technology stack in a running environment for development, testing, or production

Software Application - An arrangement of programming code designed to achieve some specific purpose

Technology Stack - The Hardware and Software layers underlying a given software application. Figure 1.1.

[Figure 1.1: Generic technology stack]

Now we know what cloud computing is and that's great. But, I'll let you in on a little secret. For most people, it just doesn't matter! Remember, we're talking about cloud computing in the context of business. We're not talking about buzz words, technobabble, or hope. Let's get to the really important question. How can all this cloud computing alphabet soup help my business or my idea flourish? Starting again with that question and keeping it in mind at all times, you and your technology team can make pragmatic decisions over time. You may have heard it before, and it bears repeating, that in most companies there is no place in day-to-day business for technology for technology's sake alone. Technology, cloud related or otherwise, has to make good business sense to make any sense at all.

To decide if cloud computing can help your idea or business flourish you have to decide what type of service provider you need. There are three distinct sub-areas of cloud computing. They are IaaS, PaaS, and SaaS. We will define and discuss these in turn.

Infrastructure as a Service (IaaS)

IaaS clouds make it very easy and affordable to provision resources such as servers, connections, storage, and related tools necessary to build an application environment from scratch on-demand. IaaS clouds are the underlying infrastructure of PaaS and SaaS clouds. A common characteristic of IaaS clouds is that they are more complex to work with but with that complexity comes a high degree of flexibility. So, these are generally lower level services in the grand scheme of things; not in a derogatory sense of course. You'll be dealing with virtual machines, operating systems, patches, and various other issues. You'll likely require some specialized help to make it all work well.

Some current examples of these types of services are listed below (some of these are hybrids too, but I put them where they most belong in my opinion).

Amazon Web Services - Extremely flexible Build your own w/ many add-ons
VMWare - Build your own
Elastra - Up-and-comer, build and manage your own IaaS
3Tera - Sexy GUI based IaaS/PaaS building tools
Xen - Build your own
XCalibre - Very interesting and can do Linux or Windows
Nirvanix - All about cloud storage, very interesting subset similar to Amazon S3
EngineYard - Rails only Build your own
Joyent - Build your own on Solaris w/ Java/PHP/Rails/Python

The number one benefit of such services is rapid provisioning. You will not have to wait days, weeks, or months for new servers. In some cases, you can have them in minutes! In fact, it's so easy to provision that it's easier to just throw away "broken" servers and replace them with new instances in most cases. All the details of provisioning, racking, stacking, cabling, and more are completely abstracted away from you.

The developers of applications for such systems will often need to adjust things to accommodate the IaaS cloud. It can also be somewhat difficult to move from one cloud to another in some cases, though less so with IaaS clouds than with the PaaS clouds we will discuss next.

Billing for these services is usually incremental by use and can get complex with tiered on-demand pricing that can be difficult to track in real time. Pricing is usually well defined but can be rather difficult to forecast in some cases. It can vary to the minute depending on levels of use, tiers of service, and other interesting combinations. Now, on to the second type of cloud computing model that's important in the context of this article.
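As a quick aside on the tiered, usage-based billing described for IaaS clouds, here is a rough Python sketch of why it gets hard to forecast. Every tier size and rate below is invented purely for illustration and does not reflect any real provider's price list; the function name `tiered_cost_cents` is likewise hypothetical.

```python
# HYPOTHETICAL price tiers: (units in this tier, price in cents per unit).
# A size of None means "everything beyond the previous tiers".
PRICE_TIERS = [
    (1_000, 10),  # first 1,000 instance-hours at 10 cents each
    (9_000, 8),   # next 9,000 instance-hours at 8 cents each
    (None, 6),    # all remaining instance-hours at 6 cents each
]


def tiered_cost_cents(units: int) -> int:
    """Price metered usage by walking through the tiers in order."""
    if units < 0:
        raise ValueError("usage cannot be negative")
    remaining = units
    total = 0
    for size, rate in PRICE_TIERS:
        if remaining <= 0:
            break
        used = remaining if size is None else min(remaining, size)
        total += used * rate
        remaining -= used
    return total


if __name__ == "__main__":
    for hours in (500, 2_000, 12_000):
        print(f"{hours:>6,} instance-hours -> ${tiered_cost_cents(hours) / 100:,.2f}")
```

The effective price per unit changes as usage crosses each boundary, which is why a forecast depends on predicting the usage level itself.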

Platform as a Service (PaaS)

PaaS clouds are designed, often within IaaS clouds, by experts to make the deployment and scalability of your application trivial and your costs incremental and reasonably predictable. Here are a few of the choice Application Stack Cloud Providers (ASCP) in this space today.

Mosso - PHP, .NET, Java, Rails, Python, other?
Google App Engine - Python
SalesForce - Proprietary
Morph - Ruby on Rails
Heroku - Ruby on Rails

There are more and more PaaS clouds sprouting up constantly and rapidly. The number one benefit of such a service is that for very little money, none in some cases, you can launch your application with little effort beyond developing it and possibly some porting work if it's an existing application. Additionally, there will be a large degree of scalability built into your PaaS choice by design, as it is a cloud as defined earlier in the article. Finally, you will more than likely not need to hire a professional systems administrator, as that is part of the service itself. If you are trying to keep your operations staff lean this can be a useful path to follow, assuming your application will capitulate.

The number one down side of choosing a PaaS Cloud provider is that all such services come with various restrictions or trade-offs that may be a non-starter for your project. This is especially true if you already have a pre-existing application that might need to be ported to the PaaS solution you choose. You will need to plan on some porting development time and cost, and it might not be trivial. For example, a particular PHP extension, Rails gem, or operating system tool that your application needs may be unavailable, and you'll have to code around these types of issues because your ability to add things to an Application Platform Cloud will be limited unless it's a custom PaaS.

Billing for these services varies. It can be by the hour, request, CPU cycle, or other creative ways. Some even help you do pass through billing for your customers; like Mosso. But, the defining factor in pricing of Application Platform Clouds is that they generally strive to be robust, simple, and easy to load your application into when you are ready.

Software as a Service (SaaS)

Software as a Service has been around for a while now and actually precedes the newer term Cloud Computing. What's interesting, and the reason to include SaaS in this article, is that Cloud Computing is breathing ever more life into the SaaS model by reducing the costs associated with producing a SaaS application. A couple of well known examples of SaaS are Gmail and Salesforce.

SaaS is not really the ultimate goal of Cloud Computing per se, but it is an important, relevant, and right-now step along the evolution of compute resource management and allocation.

In summary, understanding cloud computing requires some base knowledge and historical review to know where it came from. This will enlighten people that it's not exactly new but that there is a new excitement now due to technology and business convergence. Once you have that base, which you now do, explicitly learning to use the cloud is the next step. Then, you can almost certainly implement your ideas faster, cheaper, and more profitably than people have ever been able to do before. It's a very exciting time to be in computing and in fact, to be on this planet. Welcome to the clouds!

Now THAT is Scale in the Cloud - Animoto goes Viral

Today I saw this at the AWS blog.  It's really an amazing story.

They had 25,000 members on Monday, 50,000 on Tuesday, and 250,000 on Thursday. Their EC2 usage grew as well. For the last month or so they had been using between 50 and 100 instances. On Tuesday their usage peaked at around 400, Wednesday it was 900, and then 3400 instances as of Friday morning.

Now, someone please check my math here but if they ran 3400 compute nodes for one month and only used small instances that would be a $244,800 / month invoice.  They don't mention bandwidth so I can't even venture a guess but let's assume it's probably not a small number.
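Since I asked for a math check, here is the arithmetic as a small Python sketch. The $0.10 per instance-hour rate and the 30-day month are assumptions on my part; they are simply the values that reproduce the $244,800 figure above, and bandwidth is left out entirely.

```python
def monthly_invoice_usd(instances: int,
                        rate_cents_per_hour: int = 10,   # ASSUMED small-instance rate
                        hours_per_month: int = 24 * 30   # ASSUMED 30-day month
                        ) -> float:
    """Compute-only cost for running `instances` nodes nonstop for a month."""
    # Work in whole cents to avoid floating-point rounding surprises.
    return instances * rate_cents_per_hour * hours_per_month / 100


def hourly_cost_usd(instances: int, rate_cents_per_hour: int = 10) -> float:
    """Compute-only cost per hour for `instances` nodes."""
    return instances * rate_cents_per_hour / 100


if __name__ == "__main__":
    print(f"Per hour:  ${hourly_cost_usd(3400):,.2f}")
    print(f"Per month: ${monthly_invoice_usd(3400):,.2f}")
```

Under those assumptions 3400 instances come to $340 per hour, which lands inside the $300 to $500 / hr range.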

While I am very impressed by the scale and scalability, I am also very interested to know if they can actually monetize at a level that provides a good return on a $300 to $500 / hr investment + bandwidth. They built in elasticity so it can shrink as well as grow, but still, the average number of instances this month is going to be up there! Also, I wondered what happens to Animoto when they max out the credit card on their AWS account?

Sources:

Animoto Blog: http://blog.animoto.com/
Amazon Blog: http://aws.typepad.com/aws/2008/04/animoto---scali.html

 

Scalr: Scalr is a fully redundant, self-curing, self-scaling EC2 hosting environment

I was enjoying a few sleepless hours tonight and ran across an interesting news release that a piece of software called Scalr has been open-sourced.

Scalr describes itself as a fully redundant, self-curing, and self-scaling hosting environment on EC2. This is reminiscent of something called WeoCEO that I wrote about briefly a while back.

I twittered today that it must definitely be the year of the cloud and not the year of the Rat.  I think I might be right.  The cloud news releases are pretty much non-stop these days and the rate of innovation is accelerating.  Exciting times and it's about time.

Since it's late and I'm not in a summary typing kind of mood here is the text from the Scalr page as printed and with full credit to the URL just below so you can go see for yourself.

Scalr is a fully redundant, self-curing and self-scaling hosting environment utilizing Amazon's EC2.

It allows you to create server farms through a web-based interface using prebuilt AMI's for load balancers (pound or nginx), app servers (apache, others), databases (mysql master-slave, others), and a generic AMI to build on top of.

The health of the farm is continuously monitored and maintained. When the Load Average on a type of node goes above a configurable threshold a new node is inserted into the farm to spread the load and the cluster is reconfigured. When a node crashes a new machine of that type is inserted into the farm to replace it.

4 AMI's are provided for load balancers, mysql databases, application servers, and a generic base image to customize. Scalr allows you to further customize each image, bundle the image and use that for future nodes that are inserted into the farm. You can make changes to one machine and use that for a specific type of node. New machines of this type will be brought online to meet current levels and the old machines are terminated one by one.

The project is still very young, but we're hoping that by open sourcing it the AWS development community can turn this into a robust hosting platform and give users an alternative to the current fee based services available.

 SOURCE:  http://code.google.com/p/scalr/
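To make the monitoring behavior in that description a bit more concrete, here is a toy Python sketch of the decision rule as I read it: replace crashed nodes, and add a node when load goes over the configured threshold. This is not Scalr's code or API. The function name `reconcile`, the action strings, and the choice to average load across the healthy nodes of one type are all my own assumptions.

```python
from typing import Dict, List, Optional


def reconcile(nodes: Dict[str, Optional[float]], load_threshold: float) -> List[str]:
    """Decide what a farm monitor would do for ONE node type.

    `nodes` maps a node name to its current load average,
    or to None if the node has crashed (hypothetical representation).
    """
    actions: List[str] = []
    healthy_loads: List[float] = []

    for name, load in nodes.items():
        if load is None:
            # "When a node crashes a new machine of that type is inserted."
            actions.append(f"replace {name}")
        else:
            healthy_loads.append(load)

    if healthy_loads:
        average_load = sum(healthy_loads) / len(healthy_loads)
        if average_load > load_threshold:
            # Load above the configurable threshold: grow the farm by one.
            actions.append("add node")

    return actions


if __name__ == "__main__":
    farm = {"app1": 3.2, "app2": None, "app3": 4.1}
    print(reconcile(farm, load_threshold=2.0))
```

A real system would also have to reconfigure the cluster after each change and avoid flapping; this sketch only covers the trigger conditions.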

 I need a fully staffed lab to keep up with loading, deploying, testing, and applying all the things coming onto the market lately.  If anyone is ready to volunteer to fund the Cloud Computing Innovation and Business Application Labs TM (CCIBAL) just let me know and we'll get right on it.