An Engineer

An Instance of Perspective

Google App Engine vs. Amazon Web Services

We use Amazon’s S3 storage service here at Phanfare and love it. I especially like that while we are leveraging Amazon’s cost position and development budget, we could probably swap out the service for a competitive service or our own service if we really had to.

Google’s new App Engine offering, which gives you a vertically integrated development environment for creating web applications in Python, has pros and cons relative to the Amazon Web Services approach of giving you more industry-standard pieces such as Linux instances (EC2), key-value stores (SimpleDB) and web-service-based filesystems (S3). (If you are not familiar with the offerings, Gartner has a nice summary.)

Amazon’s offering is lower level, closer to the hardware. It will take you longer to get started with Amazon Web Services and require more work to build systems, but the resulting systems will be more extensible (and, my guess, higher performing). For example, if I need to convert video using an obscure codec, I can probably install the appropriate code on an EC2 Linux instance, but there may not be a suitable Python module for Google’s App Engine.

You can build Google’s App Engine on top of Amazon’s EC2 and S3 offerings, but you would have a tough time building Amazon’s web services using Google App Engine. To make the point, the folks at AppDrop are running the open source App Engine SDK on an Amazon EC2 instance.

There is a place for both the Amazon and Google approaches. If you want to create a new web app that requires very little third-party open source software, Google App Engine will get you running faster, especially if you are proficient in Python and have no pre-existing code. The App Engine version might wind up being only your early prototype, but it will let you get to market faster. If you are extending an existing service, have a lot of code, or want to split between in-house and cloud-based infrastructure, as we do at Phanfare (we use only S3), then Amazon is the natural choice.

Personally, I find Amazon’s approach more attractive as we look to build Phanfare. Amazon is creating virtual instances of the industry-standard services that everyone is already building on. I know that if we create services that run on a standard Amazon EC2 Linux instance, we can move them off of Amazon fairly easily. I also like that Amazon has broken down the problem of building scalable systems into separate service pieces that each do one thing very well. Large monolithic systems can get overly complex and unreliable.

Google has developed an environment that nobody is using today. If Google decides that Google App Engine is not strategic for them and discontinues it, it could be catastrophic for me. Sure, I can take the SDK and run it myself like they did at AppDrop, but that won’t guarantee any level of reliability. By contrast, I am pretty sure that Linux is not going away. If we had to find another host for our Linux-based system, it would be easy.

Written by erlichson

April 16, 2008 at 12:29 am

Phanfare now backing up photos and videos to Amazon S3

I am happy to announce that we have moved our backups to Amazon’s Simple Storage Service, known as S3. All current backups go to S3 and we are copying over historical data. We currently have about 20 terabytes at Amazon and will have about 40 terabytes when all the data is moved over.

We also maintain a copy of customer photos and videos on our RAID servers in our NJ datacenter. Amazon promises multi-datacenter redundancy for S3 data, so Phanfare customers now have the peace of mind of knowing that their data is in at least three datacenters, on opposite coasts of the US (NJ and WA).

The natural question is, why did we do it? We did it because we wanted to provide the assurance of off-site backup and because the engineering costs (time and money) in building out something similar to S3 exceed any cost savings we might have realized by managing the storage ourselves in the medium term.

We actually get more redundancy than we had before. Previously, we backed up data on a second set of RAID servers in our NJ datacenter. Those servers were cheaper to operate than Amazon S3, assuming two-year amortization, but they did not provide the same level of geographic or physical redundancy. So for us, using Amazon was not cheaper, but it was better. Once you include the opportunity cost of working on offsite backup instead of on Phanfare’s core products, using Amazon is a definite strategic win for us.
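For readers who want to run the same comparison for their own setup, here is a rough sketch of the arithmetic in Python. It is not our actual cost model: the function names and parameters are made up for illustration, and it deliberately leaves out the engineering time and opportunity cost discussed above, which is what tipped the decision for us.

```python
def owned_monthly_cost(hardware_cost: float,
                       amortization_months: int,
                       monthly_operating_cost: float) -> float:
    """Monthly cost of servers you buy and run yourself.

    The purchase price is spread evenly over the amortization period
    (for example 24 months for two-year amortization), then power,
    space and similar running costs are added on top.
    All parameters are hypothetical inputs you supply yourself.
    """
    if amortization_months <= 0:
        raise ValueError("amortization_months must be positive")
    return hardware_cost / amortization_months + monthly_operating_cost


def hosted_monthly_cost(stored_gb: float,
                        price_per_gb_month: float,
                        monthly_transfer_cost: float = 0.0) -> float:
    """Monthly cost of a pay-per-use storage service.

    Assumes simple per-gigabyte-month pricing plus an optional
    transfer charge; real price sheets may have more components.
    """
    return stored_gb * price_per_gb_month + monthly_transfer_cost


def cheaper_option(owned: float, hosted: float) -> str:
    """Name the cheaper side of the comparison, or report a tie."""
    if owned < hosted:
        return "owned"
    if hosted < owned:
        return "hosted"
    return "tie"
```

Plug in your own hardware quotes and the provider's current price sheet. A result of "owned" only says the hardware is cheaper on paper, which was our situation too.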

For Amazon to actually lower our overall long-term costs, we would need to stop storing the data ourselves and instead just cache hot data. We have competitors that do that and it would be cheaper, but we are not positive it would be better. After all, right now, Amazon does not provide a Service Level Agreement (SLA) or even a phone number to call if you are unhappy with the Amazon web service. I don’t expect that Amazon will ever lose our data, of course, but we would like an SLA before we bet our customers’ data on that.

Amazon’s web services are game-changing, especially for smaller companies. They allow small companies to have a cost position that rivals some of the biggest online competitors. Amazon’s web services also lower the cost of entry for new startups and hence increase competition and foster innovation. Both of these things are good for consumers, and we applaud Amazon for embarking on their ambitious plan of providing storage and compute in the cloud for other companies. I know they are also trying to amortize their own costs of development, but for us it is wonderful. With proper SLAs, we would consider using Amazon’s Elastic Compute Cloud (EC2) too.

EC2 enjoys local area network (LAN) latency and bandwidth to S3 storage, and that would make S3 that much more attractive as primary storage for Phanfare. One of the first rules of building a high-performance system is to keep compute close to the data it operates on. Hence, without using EC2, we would always need to cache data on our side for performance. The latency between NJ and Seattle is too high otherwise.
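To make the "cache hot data on our side" idea concrete, here is a minimal sketch of a read-through cache with least-recently-used eviction, written in Python. It is an illustration only, not how Phanfare is built: the class name, the `fetch_remote` callable (a stand-in for whatever code would pull an object from remote storage such as S3) and the byte-based capacity limit are all assumptions.

```python
from collections import OrderedDict
from typing import Callable


class HotDataCache:
    """Read-through LRU cache in front of a slow remote store (illustrative)."""

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity_bytes: int):
        # fetch_remote is a hypothetical function that retrieves an object
        # from far-away storage; every call pays the long-haul latency.
        self._fetch_remote = fetch_remote
        self._capacity = capacity_bytes
        self._used = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._items:
            # Hot object: serve it locally and mark it most recently used.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Cold object: pay the remote round trip once, then keep a copy.
        self.misses += 1
        data = self._fetch_remote(key)
        self._store(key, data)
        return data

    def _store(self, key: str, data: bytes) -> None:
        if len(data) > self._capacity:
            return  # larger than the whole cache; serve it without caching
        self._items[key] = data
        self._used += len(data)
        # Evict least recently used objects until we fit again.
        while self._used > self._capacity:
            _, evicted = self._items.popitem(last=False)
            self._used -= len(evicted)
```

The ratio of `hits` to `misses` is the number to watch: only the misses pay the cross-country round trip, so the scheme works when a small share of the data gets most of the views.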

If you think about it, Phanfare does for consumers what Amazon does for us. Just as it would be difficult and expensive for a consumer to build a system to store his photos and videos in the cloud, accessible from anywhere and backed up in geographically distributed locations, it would be difficult and expensive for Phanfare to replicate Amazon’s level of web infrastructure.

Written by erlichson

July 12, 2007 at 2:22 pm