An Engineer

An Instance of Perspective

Archive for the ‘amazon’ Category

Amazon Announces Reduced Redundancy Storage (Hint: We don't use it)

with 5 comments

Amazon just announced Reduced Redundancy Storage, designed to provide 99.99% durability. We don’t use that version of Amazon S3. We use the version of Amazon S3 that provides 99.999999999% durability and can sustain the concurrent loss of data in two facilities.

The exciting part of the news for us is not the reduced redundancy storage; it’s that Amazon has finally disclosed that the durability goal for the version of Amazon S3 we use is 99.999999999%.

What does that mean in human terms? Well, Amazon says that even their Reduced Redundancy Storage (RRS) is 400x more reliable than a disk drive. But if you store 10,000 files using RRS, you would expect to lose one each year. Or put another way, the expected lifetime of a file is 10,000 years. But with regular Amazon S3, you would have to store about 100B files to expect to lose one each year.
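The arithmetic behind those figures can be sketched in a few lines. This is only an illustration: it assumes the quoted durability is the yearly probability that a single file survives and that losses are independent, and the function name is made up for the example.

```python
from decimal import Decimal

# Durability figures quoted in the post, kept as strings so Decimal stays exact.
RRS_DURABILITY = "0.9999"
STANDARD_DURABILITY = "0.99999999999"

def files_per_expected_annual_loss(durability: str) -> int:
    """How many files you must store before you expect to lose one per year.

    Assumes `durability` is the annual survival probability of one file.
    """
    annual_loss_probability = Decimal(1) - Decimal(durability)
    return int(Decimal(1) / annual_loss_probability)

rrs_files = files_per_expected_annual_loss(RRS_DURABILITY)
standard_files = files_per_expected_annual_loss(STANDARD_DURABILITY)

print(f"RRS: one expected loss per year for every {rrs_files:,} files")
print(f"Standard S3: one expected loss per year for every {standard_files:,} files")
```

Under the same assumptions, the same number read as years is the expected lifetime of a single file.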

Phanfare is built using the most reliable online storage available and is designed to be the primary copy of your data, far more reliable than anything you can do yourself. That durability drives a significant part of the underlying cost of delivering the service and is one of the reasons we recently raised our prices.

Even online backup services, which sell based on the fear of you losing your data, don’t typically use online storage with the durability of Amazon S3. That is how they get their price down. But they figure if they lose a little data, you have the primary copy anyway and they can just back it up again. Not so with Phanfare. We assume that you are using us as primary storage for your photos and videos.

Written by erlichson

May 19, 2010 at 9:57 am

Freemium did not work for Phanfare

with 51 comments

Fred Wilson of Union Square Ventures is a big proponent of the freemium business model on the internet. He recently reiterated that when it comes to delivering media on the net, freemium is a great way to go. Fred originally endorsed freemium back in March of 2006.

I have a tremendous amount of respect for Fred. I don’t know him well, although we have met a few times. And I read his blog pretty regularly, so I feel like I know his views. I can’t say that I was not enticed by Fred’s arguments. At the time, Phanfare was growing nicely but marketing costs were high. It seemed that if we created a freemium business model and allowed everyone to use Phanfare in some basic form, it would help us prospect for customers willing to go for paid upsells.

But I also had my concerns, which I wrote about in May of 2006 in a post entitled “Why is there no free version of Phanfare.” At the time, I was concerned that there were few network effects to Phanfare and hence the value of having a large community of free users was not that high. I was also concerned that as a differentiated provider, we would be hard pressed to make money with the load of the free users.

But in 2007, we embarked on changing Phanfare to incorporate a free version. The thinking at the time was that our calculus had not considered the cost of marketing, which is lower if you can prospect by attracting people with a free version. We thought having some network effects in Phanfare was important, so we also added social networking. We thought that people would connect to their friends on Phanfare and expose them to the system.

We offered 1GB of storage to free users and unlimited storage to paid users. Classic freemium.

Here is what happened.

  • We saw a surge in registered users.
  • We saw drastically reduced margins. Customers with less than 1GB but paying our full subscription fee were our most profitable customers. With those people at the free level, our margins were down significantly.
  • We lost the ability to effectively use CPC search marketing (Google). When we had a free trial, it was easy to see which clicks were worth paying for. But with freemium, the conversion funnel was so long (an average of nearly a year before the person needed more storage) that any attempt to optimize price per conversion was hopeless. (Try adjusting the shower temperature with a 12-month delay between knob and temperature change and see how it goes for you. In control theory parlance, this is known as introducing a delay in the feedback loop.)
  • We lost our position as a premium provider. People perceived Phanfare as “free” and it was hard to describe ourselves as a paid service for those who care about preserving their full size originals and displaying them in a better way.
  • There were few network effects to Phanfare. People who were our customers did get their friends to register for free accounts, but the rate at which those people became Phanfare participants was very low. We did not have a social network; we registered the audience.
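The shower analogy in the list above is easy to play with in a toy simulation. This is a minimal sketch, not a model of our actual marketing funnel: the gain, temperatures, step count, and function names are all made-up values chosen for illustration. The controller nudges the water toward a target using a reading that is `delay_steps` old.

```python
TARGET = 38.0  # desired temperature (illustrative value)
START = 18.0   # starting temperature (illustrative value)

def simulate(delay_steps, gain=0.5, steps=60):
    """Adjust a value toward TARGET using a reading that is `delay_steps` old."""
    history = [START]
    for t in range(steps):
        # The controller only sees the value as it was `delay_steps` ago.
        observed = history[max(0, t - delay_steps)]
        history.append(history[-1] + gain * (TARGET - observed))
    return history

def worst_error(history):
    """Largest distance from TARGET seen anywhere in the run."""
    return max(abs(TARGET - value) for value in history)

immediate = simulate(delay_steps=0)
delayed = simulate(delay_steps=12)

print(f"no delay: final error {abs(TARGET - immediate[-1]):.6f}")
print(f"12-step delay: worst error {worst_error(delayed):.1f}")
```

With immediate feedback the same gain settles quickly; with the delayed reading the controller keeps turning the knob long after it should have stopped and overshoots badly.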

Another issue is that storage is the driving cost factor for Phanfare. Our page views per byte stored are so low (we are an archival service storing full size originals) that advertising, even if it were welcomed by our user base, would not pay the bills.

Fred Wilson estimated that Facebook might have $0.25 in revenue per user per month. Phanfare could never survive on that.

In the end, freemium is not a good model when the cost of delivering service to free users is high. But more fundamentally, I reiterate my position that freemium is a bad marketing plan for any premium business that hopes to be the differentiated provider.

Freemium makes sense when at least one of the following conditions is true:

  • Free users have zero marginal cost to the company. True for Skype and other P2P services that get participants to volunteer infrastructure.
  • The value of the product to a prospective customer depends on there being a large network. True in dating, Skype, Facebook, eBay and Twitter.
  • The business can be run ad supported. That means that the business has reach and attention that scale along with costs and costs are low enough that ads pay the way. (At Phanfare our page views per byte stored are so low that advertising does not work well). Any ad-supported business can consider paid upsells. At some level they are running freemium. Or put another way, they are using their reach to pitch their own products.

Not a single one of these conditions is true for Phanfare. Moral of the story: trust your instincts.

Written by erlichson

July 8, 2009 at 10:56 am

Kindle iPhone app is a great complement to the Kindle

with 3 comments

Amazon recently released a Kindle iPhone app that allows you to read any book that you have purchased. You can use the app without owning a Kindle or in addition to the Kindle. Where the app really shines is using it in addition to the Kindle.

Reading on the iPhone for long periods of time is not optimal compared to reading on the Kindle. The Kindle screen is larger, the battery life is better and the screen is easy on the eyes, with a reflective display.

But what makes the Kindle app great is that your cell phone is always with you, while your Kindle is likely not. Hence, when you are standing in line or sitting on a train, you can read a few pages of your current book.

When you get back to the Kindle, it synchronizes your place in the book automatically, advancing the page to where you left off. How cool is that?

The other place I find myself using the Kindle iPhone app is in bed. Since the iPhone is backlit, you can read it without turning a light on in the room, allowing you to read in the dark while your wife is trying to fall asleep. I find the iPhone is also easier to hold when lying in bed, being a bit lighter.

The iPhone holds the same place in my reading life as it does in my photographic life. I own a digital SLR and a point and shoot camera, but the convenience of having a wireless camera in my pocket and full access to my entire collection often trumps the quality of the dedicated cameras. Like my Kindle, I would never want to give up my DSLR for an iPhone, but they work very well together.

Also, the Kindle works the way iTunes should work. You can download any book you buy from Amazon as many times as you want. The device knows you have purchased the book. Similarly, an iPod and iTunes should know that you have previously purchased a song and give you access. The player becomes a caching device that gets its personality from the cloud, versus being a static copy of content.

On an iPhone or iPod touch, why not just wirelessly sync my purchased music directly from Apple? Why do I need to manage the device by maintaining my iTunes collection on a computer? Yes, I know that Apple does allow buying music on the iPhone and touch today, but this is not a true wireless sync with the Apple mothership. You still need to sync the song by wire to your computer. And if you lose it, Apple will force you to buy it again.

Amazon is in the Kindle business so they can sell the media (electronic books), so for them, giving you access on other devices is a no-brainer. Apple has traditionally sold media as a way of selling more hardware, so they probably are not quite sure they want you to be listening to your music on non-Apple devices. But I suspect this is changing for Apple. They took the word “Computer” out of their name about a year back and I think they probably believe they are becoming more of a media company.

Written by erlichson

March 10, 2009 at 10:22 am

Posted in amazon, Apple

Surviving an Amazon S3 outage

with 8 comments

We use Amazon S3 to store our 80 terabytes of photos and videos. We like the service and it works well. Yesterday, it went down for nearly 8 hours. And during that time, we were mostly up. Cloud computing is all the rage, but sometimes, the weather is really bad and you can’t see the clouds. We planned for that rainy day. Hence, on a day when Amazon S3 was entirely down, I was at the pool, literally. I will tell you about how we did it.

When users upload photos and videos, we first move them to our own servers. In the background, we send the data to S3. If Amazon S3 goes down, we can buffer data for up to two days before we notice. By buffering, we remove the real time requirements of Amazon S3 being up for our users to upload data. We can’t buffer indefinitely, but we are betting that an Amazon S3 outage longer than 2 days is very rare. We always believed short outages would occur. In fact, this is not the first one.
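The buffering idea can be sketched roughly as follows. This is an illustration under stated assumptions, not our production code: `FlakyRemoteStore` is a made-up stand-in for a remote object store (it is not an S3 client), and `UploadBuffer`, `accept`, and `drain` are hypothetical names.

```python
from collections import deque

class RemoteStoreDown(Exception):
    """Raised by the stand-in remote store while it is unavailable."""

class FlakyRemoteStore:
    """Hypothetical stand-in for a remote object store; not a real client."""
    def __init__(self):
        self.available = True
        self.objects = {}

    def put(self, key, data):
        if not self.available:
            raise RemoteStoreDown(key)
        self.objects[key] = data

class UploadBuffer:
    """Accept uploads locally, then drain them to the remote store later."""
    def __init__(self, remote):
        self.remote = remote
        self.pending = deque()

    def accept(self, key, data):
        # The user's upload succeeds as soon as the data is on our own servers.
        self.pending.append((key, data))

    def drain(self):
        """Send pending uploads in order; stop at the first failure."""
        sent = 0
        while self.pending:
            key, data = self.pending[0]
            try:
                self.remote.put(key, data)
            except RemoteStoreDown:
                break  # leave the item queued and retry on the next pass
            self.pending.popleft()
            sent += 1
        return sent

remote = FlakyRemoteStore()
uploads = UploadBuffer(remote)

remote.available = False                 # simulate an outage
uploads.accept("photo-1.jpg", b"jpeg bytes")
uploads.accept("video-1.mov", b"movie bytes")
sent_during_outage = uploads.drain()     # nothing leaves the buffer

remote.available = True                  # outage over
sent_after_recovery = uploads.drain()    # the backlog is flushed in order
print(sent_during_outage, sent_after_recovery, len(uploads.pending))
```

In a real system the queue would live on disk and be sized for the outage you are willing to ride out, which for us is about two days.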

For serving photos and videos, we act as our own content distribution network (CDN) and cache the hot data. That means that users can view most recent photos and videos, including what was recently uploaded.
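A read-through cache of this kind might look like the sketch below. The least-recently-used eviction policy, the capacity, and the names `HotCache` and `origin_down` are assumptions made for the example, not a description of our actual servers.

```python
from collections import OrderedDict

class HotCache:
    """Keep recently used items locally; fall back to the origin on a miss."""
    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin
        self.items = OrderedDict()

    def put(self, key, data):
        """Store an item (for example a fresh upload) and evict the coldest."""
        self.items[key] = data
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used item

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)     # mark as recently used
            return self.items[key]
        data = self.fetch_from_origin(key)  # raises if the origin is down
        self.put(key, data)
        return data

def origin_down(key):
    """Simulated origin during an outage (hypothetical)."""
    raise ConnectionError(f"origin unavailable for {key}")

cache = HotCache(capacity=2, fetch_from_origin=origin_down)
cache.put("new-upload.jpg", b"fresh")

hot = cache.get("new-upload.jpg")           # served locally, origin untouched
try:
    cache.get("old-album.jpg")              # cold data needs the origin
    cold_served = True
except ConnectionError:
    cold_served = False
print(hot, cold_served)
```

This is why we were "mostly up" rather than fully up: hot and recently uploaded data keeps flowing, while a request for cold data still has to wait for the origin.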

All this caching and buffering is done outside of Amazon. We don’t use Amazon’s compute cloud (EC2) for that. We have considered moving more of our system to Amazon Web Services. It is unfortunate that EC2 was built to require S3 to be up in order for it to run. New instances are loaded from S3. So an S3 outage is correlated with an EC2 outage.

Photo and video sharing services that did not plan for S3 outages were completely down yesterday. We estimate that most of the cost savings for our business comes from outsourcing the storage. While we could save some additional money by using EC2, it is not as dramatic as the S3 savings. Hence, we will have to carefully consider before we put all our eggs in that basket.

Written by erlichson

July 21, 2008 at 9:51 am

A cautionary tale about maintaining data at home

with 11 comments

It should come as no surprise to anyone that I rely on Phanfare to safeguard my photos and videos. They live happily in the cloud, in their original sizes and quality and I access them from wherever I need. I strongly believe in cloud computing. I think personal computers (Windows and Mac OS) are difficult to maintain, overly complicated devices that expose too much complexity to the user.

A personal computer is best as an internet terminal, replaceable with a different computer as needed, provided you install the necessary software. And I believe in the long run most consumers won’t buy general purpose computers. But we live in the here and now.

I am not 100% converted to cloud computing in my personal life yet. There is lots of legacy stuff I set up years ago. At home I have a Mac Pro desktop with 3 drives running the latest version of Leopard. Two of the drives were purchased about 3 years ago at the exact same time: 450 GB Western Digital SATA drives. I installed them in my Mac Pro (the Pro has 4 bays) and set up a software RAID 1 (full mirror).

On my personal RAID I keep my iPhoto library (I sometimes use iPhoto as part of my workflow), my iTunes collection, in-progress iMovie projects and a VMWare Windows XP instance.

Well, I got back from a business trip to find that I had basically lost the whole RAID. The RAID was not mounted. I rebooted and it mounted. I checked on it in Disk Utility and found that one of the drives was marked FAILED and the other was marked S.M.A.R.T. failure, which is an early warning system built into drives telling you the drive is about to fail. The RAID was marked “degraded,” which means not providing redundancy, and some information in the Disk Utility interface recommended that I replace the one drive that was hanging on and move the data ASAP. I tried, but got errors when copying the files.

So I lost all the data. No big deal. The music is on an iPod, although a few months of ITMS purchases are not synched. The photos I care about are all on Phanfare and the VMWare instance is just a standard XP config with MS Office and some other files.

But I was really trying NOT to lose that data. I had a RAID, the drives were fairly new, the home office is climate controlled, the computer is rarely moved, we have smoke alarms and heat sensors and the computer is on a UPS to protect it from vagaries in the power grid. And yet I lost it all.

Moral of the story: Keep your stuff in the cloud. I am going to find a service that will keep my iTunes collection (anyone have experience with mp3tunes.com?) in the cloud. And I am going to finally pull the trigger and stop maintaining personal files like tax records on home servers (that is not my only RAID; the other one is a DELL HW RAID in the basement waiting for a flood).

I tried Jungle Disk and it looks pretty good. Jungle Disk is a SW layer that sits atop Amazon S3 and lets you store your files on S3 and pay only Amazon’s rates for storage and bandwidth. (Note that I don’t think the average consumer needs the complexity of Jungle Disk and personal S3 accounts, but some of the underlying applications I use don’t yet have good enough online services.)

If I can’t manage to keep my data intact at home, I suspect you can’t either and frankly, why try? There is simply no comparison with the type of monitoring, redundancy and security you can get from an online service versus rolling your own in your basement.

Written by erlichson

July 3, 2008 at 11:19 am

Google App Engine vs. Amazon Web Services

with 9 comments

We use Amazon’s S3 storage service here at Phanfare and love it. I especially like that while we are leveraging Amazon’s cost position and development budget, we could probably swap out the service for a competitive service or our own service if we really had to.

Google’s new App Engine offering, which gives you a vertically integrated development environment to create a web application in Python, has pros and cons relative to the Amazon Web Services approach of giving you more industry standard pieces like Linux instances (EC2), key-value stores (SimpleDB) and web-service-based filesystems (S3). (If you are not familiar with the offerings, Gartner has a nice summary)

Amazon’s offering is a lower level offering, closer to the hardware. It will take you longer to get started with Amazon Web Services and require more work to build systems, but the resulting systems will be more extensible (and, my guess, higher performing). For example, if I need to convert video using an obscure codec, I can probably install the appropriate code on an EC2 Linux instance, but there may not be a suitable Python module for Google’s App Engine.

You can build Google’s App Engine on top of Amazon’s EC2 and S3 offerings, but you would have a tough time building Amazon’s web services using Google App Engine. To make the point, the folks at AppDrop are running the open source App Engine SDK on an Amazon EC2 instance.

There is a place for both the Amazon and Google approaches. If you want to create a new web app that requires very little third party open source software, Google App Engine will get you running faster, especially if you are proficient in Python and have no pre-existing code. The Google App solution might just wind up being your early prototype, but will let you get to market faster. If you are extending an existing service, have a lot of code, or want to split between in-house and cloud-based infrastructure, as we do at Phanfare (we use only S3), then Amazon is the natural choice.

Personally, I find Amazon’s approach more attractive as we look to build Phanfare. Amazon is creating virtual instances of industry-standard services that everyone is building. I know that if we create services that run on an Amazon standard Linux EC2 instance, we can move them off of Amazon fairly easily. I also like that Amazon has broken down the problem of building scalable systems into different service pieces that each do one thing very well. Large monolithic systems can get overly complex and unreliable.

Google has developed an environment that nobody is using today. If Google decides that Google App Engine is not strategic for them and discontinues it, it could be catastrophic for me. Sure, I can take the SDK and run it myself like they did at AppDrop, but that won’t guarantee any level of reliability. By contrast, I am pretty sure that Linux is not going away. If we had to find another host for our Linux-based system, it would be easy.

Written by erlichson

April 16, 2008 at 12:29 am

Selling Music in the World of Free

with 8 comments

Although Apple is having good success selling music online, there has long been serious concern within the industry that, with DRM disappearing, the business of selling music is going the way of the dodo bird, as online sales are not keeping up with declining physical sales. Fred Wilson eagerly awaits the day when all music is free via advertiser-supported streams.

DRM is going away. That is clear enough. So where does that leave the music industry? I believe there is an opportunity to provide a service that people will pay for to buy songs. Rather than buying a song and simply getting a single copy, buying a song should make it available to you in perpetuity from any network-connected device you want, at higher and higher resolution as time goes on.

Apple is in the best position to provide a service like this. Here is how it would work. You would buy a song on iTunes from Apple while logged in using your Apple ID. Apple would sync that song without DRM into iTunes. You can do what you want with it. But Apple would also make it so that you can log in from any iPod and get to your full purchased library.

Essentially, the songs you “own” would be part of a hosted library that you could access from any device. When you enter a car, you could log in to the car audio system and get access to your full library (it would be cached on the hard drive in your car stereo). When at a friend’s house, you could log in to iTunes at their computer (or stereo) with your credentials and get access to your songs. Through smart synching and caching, it would appear that your music is available everywhere you want it.

This is a type of service I want to buy a song from. It sells the convenience of a hosted environment. Whether the songs are DRMed or not does not matter (they likely won’t be). Because even if I export a song and give it to a friend, he has only 10% of the experience unless he is also an “owner” of record with Apple.

What is required to fulfill this vision:

  • The provider has to be strong enough to get the licensing deals that would allow this type of sale of music to consumers.
  • The provider has to be able to get the service and synching incorporated in the iPod, the default music player for most people.
  • The provider needs to be able to get the service incorporated in other consumer electronic devices like car stereos and home audio systems, to allow true universal access.
  • Consumers need to believe that the provider behind the sale is not going away.

Few companies satisfy all these criteria. Apple is one of them. Amazon gets pretty close.

By layering service on top of the music, piracy becomes a non-issue. You might be able to copy the music from a friend, but you can’t steal the service. Of course, there might be some sharing of login credentials, but this is much more easily addressed by monitoring simultaneous usage.

This type of music service would finally make owned music a hosted experience like most other consumer apps, while still providing commercial-free music, which many people want. It simultaneously solves the music backup problem as well. I don’t need to back up my music because any iPod I own will automatically have my music, synched wirelessly over the network (we can all dream). And I won’t need a computer to enjoy an iPod, which is welcome because computers are a disaster (we need a good consumer appliance).

Because this service would guarantee that music would be provided at higher quality as time goes on, I might even buy the songs that I already own free and clear from old CDs.

How about the pricing? Could you provide a song for $0.99 and offer to restore it for the person indefinitely into the future? I think you probably could. After all, the music is not streamed, just synched, and you do have the attention of the consumer and can probably sell some advertising at appropriate points in the process (for example, when waiting for your device to synchronize or in the music store).

Would everyone buy songs this way? I think ad-supported music might be bigger, but there is a market out there for a premium version. Like buying a CD versus listening to the radio, this service would provide a better experience for the music you really care about.

Written by erlichson

April 14, 2008 at 4:42 pm

The Golden Age of Hosted Services

with 9 comments

When Mark Heinrich and I started our last company in 1999, we rented space and a network drop at an Exodus data center in Jersey City. We bought servers, racked them, and installed a bunch of equipment at the office, including a phone switch and an email server.

At the time, most small businesses without engineers on staff would just go without infrastructure or hire someone to install a bit locally, probably not properly backed up.

In the last 10 years, a relatively short time, everything has changed. There are now excellent hosted services up and down the food chain of information processing. At Phanfare, we use Datapipe managed hosting to provide us with servers and bandwidth. I have physically been in the data center twice. We use Amazon’s S3 data service to reliably store data (I have never been to their datacenter). We have our email scrubbed of spam by Postini and we just subscribed to a hosted support offering from Parature. Our payment processing is outsourced with eBay’s Payflow Pro product.

Even our accounting is hosted. We scan our invoices, shred everything, and email PDFs to our bookkeeper, whom I have physically met once. She in turn provides us access to our books via Windows terminal services.

Our faxes go to an outsourced fax service. Our phone service is voice over IP, provided by M5. They ran a T1 to our office with DSL backup and provide us with phone service that is so reliable that you can cut our T1 line and the phone call stays up. We bought the Cisco IP phones. We have a support organization in Saint Louis that is on the same phone system, completely transparently.

What is remarkable about all these services is that they are all excellent. We have had minor issues with Datapipe, and sometimes S3 does go down, but by and large, we are happy customers. The age of hosted services has arrived. If you are starting a business today, you can find a good hosted service for just about any software you might consider installing in house: email, blogging, intranet wiki, word processing, file storage, backup, to name just a few.

Some of the services we are using, like Datapipe, seem somewhat primitive compared to today’s hosted offerings. Rather than rent machines by the month that are physically dedicated to us, you can now buy a la carte infrastructure from Amazon or a fully hosted development stack from Google.

What this means is that the barrier to entry to build a business, especially a web-based business, is very low. That means competition is going to be fierce, and it also means that entrepreneurs don’t need VC money to start a web business. With a few thousand dollars and a smart dedicated engineer, you can build a prototype and see if it gains traction. Even monetization is available in hosted form from Google and a variety of other ad networks that will run ads on your site and send you money.

The age of hosted services will also empower and enable groups that have traditionally not had access to high quality IT services. For example, the average public school lives in the dark ages of information processing. But now, there are great opportunities to build a world class hosted solution that a school can create an “instance” on and run the whole school. It also means that we can export one of our greatest resources, information processing and software design, all over the world.

Consumers are beneficiaries as well. Gmail, Google Docs, and Phanfare are all hosted services that can provide a consumer with a portable computing experience that is not tied to a particular computer or place. That is a great benefit to consumers because the personal computer, whether Mac or PC, is really not a consumer device. The PC is a hard-to-manage, hard-to-maintain engineering tool that will eventually let you down in some way or another. But with hosted services, you don’t care because you just move to the next computer and sign in.

Written by erlichson

April 8, 2008 at 10:57 am

Posted in amazon, Apple, General, Phanfare, s3

Phanfare now backing up photos and videos to Amazon S3

with 23 comments

I am happy to announce that we have moved our backups to Amazon’s Simple Storage Service, known as S3. All current backups go to S3 and we are copying over historical data. We currently have about 20 terabytes at Amazon and will have about 40 terabytes when all the data is moved over.

We also maintain a copy of customer photos and videos on our RAID servers in our NJ datacenter. Amazon promises multi-data center redundancy for S3 data, so Phanfare customers now have the peace of mind of knowing that their data is in at least three datacenters, on opposite coasts of the US (NJ and WA).

The natural question is, why did we do it? We did it because we wanted to provide the assurance of off-site backup and because the engineering costs (time and money) in building out something similar to S3 exceed any cost savings we might have realized by managing the storage ourselves in the medium term.

We actually get more redundancy than we had before. Previously, we backed up data on a second set of RAID servers in our NJ datacenter. Those servers were cheaper to operate than Amazon S3 assuming 2-year amortization, but they did not provide the same level of geographic or physical redundancy. So for us, using Amazon was not cheaper, but it was better. Including the opportunity cost of working on Phanfare’s core products versus working on offsite backup, using Amazon is a definite strategic win for us.

To make Amazon actually lower our overall long term costs, we would need to stop storing the data ourselves, instead just caching hot data. We have competitors that do that and it would be cheaper, but we are not positive it would be better. After all, right now, Amazon does not provide a Service Level Agreement (SLA) or even a phone number to call if you are unhappy with the Amazon web service. I don’t expect that Amazon will ever lose our data of course, but we would like an SLA before we bet our customers’ data on that.

Amazon’s web services are game-changing, especially to smaller companies. They allow small companies to have a cost position that rivals some of the biggest online competitors. Amazon’s web services also lower the cost of entry for new startups and hence increase competition and foster innovation. Both these things are good for consumers and we applaud Amazon for embarking on their ambitious plan of providing storage and compute in the cloud for other companies. I know they are also trying to amortize their own costs of development, but for us it is wonderful. With proper SLAs, we would consider using Amazon’s Elastic Compute Cloud (EC2) too.

EC2 enjoys local area network (LAN) latency and bandwidth to S3 storage and that would make S3 that much more attractive as primary storage for Phanfare. One of the first rules of building a high performance system is to keep compute close to the data it operates on, and hence without using EC2, we would always need to cache data on our side for performance. The latency between NJ and Seattle is too long otherwise.

If you think about it, Phanfare does for consumers what Amazon does for us. Just as it would be difficult and expensive for a consumer to build a system to store his photos and videos into the cloud, accessible from anywhere and backed up in geographically distributed locations, it would be difficult and expensive for Phanfare to replicate Amazon’s level of web infrastructure.

Written by erlichson

July 12, 2007 at 2:22 pm