mashraqi

+1.408.FRANKMASH (408.372-6562)
[ This is my personal blog, so all opinions expressed here are mine. I am a product, scalability, operations and monetization advisor, currently employed as Director of Business Operations & Technical Strategy for a top 50 website that delivers billions of page views per month. I was a panelist for the Scaling Up or Out keynote at the MySQL Conference and speak regularly at conferences and user groups. ]
Farhan "Frank" Mashraqi

Saturday, July 05, 2008

Energy Efficient Operations: Some Challenges and Opportunities

Yet more notes from Velocity.

After the break, the next session is Energy Efficient Operations: Some Challenges and Opportunities. Luiz Barroso from Google is the presenter. I arrived a couple of minutes late because I had to pick up my charger.



Server electricity usage in perspective:
  • worldwide electricity usage of servers is around 1% of total electricity consumption.
  • usage doubled between 2000 and 2005
  • could increase by 40%-76% by 2010.
PC energy consumption likely higher:
  • installed base for servers in 2005: 27M
  • installed base for PCs in 2005: 870M
Measuring computing energy efficiency
  • harder for computers than for refrigerators
  • efficiency = work done / energy used = computing speed / power
  • the biggest thing you can do for energy efficiency is write fast code; it can have a really big impact.
  • from measurement standpoint, it is useful to break down the energy efficiency/budget equation
  • breaking it down:
    • efficiency = (work done / energy used in chips) * (energy used in chips / energy provided to servers) * (energy provided to servers / energy entering the building)
    • first: computing efficiency
    • second: server efficiency
    • third: datacenter efficiency or 1/PUE (power usage effectiveness)
Energy efficiency opportunities:
  • datacenter energy efficiency
    • LBNL survey of 24 facilities shows avg PUE of 1.83
  • underutilized data centers
    • wasted power provisioning investment
    • makes cooling and power distribution less efficient
  • server energy efficiency
    • typical server power supplies dissipate 25% of total energy
    • DC-to-DC voltage regulators can lose another 25%
  • computing efficiency
    • servers have poor energy efficiency in their most common usage range
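
As a rough illustration of how the factors in the breakdown multiply, here is a small sketch that plugs in the figures quoted above. It assumes the two 25% losses compound (each stage passes on 75% of what it receives) and that the LBNL average PUE of 1.83 applies. The function names are mine, not from the talk.

```python
def server_efficiency(psu_loss=0.25, regulator_loss=0.25):
    """Fraction of the power delivered to a server that reaches the chips.

    Assumption: the losses compound, i.e. each stage passes on
    (1 - loss) of what it receives.
    """
    return (1.0 - psu_loss) * (1.0 - regulator_loss)


def datacenter_efficiency(pue=1.83):
    """Fraction of the power entering the building that reaches servers (1/PUE)."""
    return 1.0 / pue


def delivered_fraction(pue=1.83, psu_loss=0.25, regulator_loss=0.25):
    """Fraction of the power entering the building that ends up in the chips."""
    return datacenter_efficiency(pue) * server_efficiency(psu_loss, regulator_loss)


if __name__ == "__main__":
    print(f"server efficiency:     {server_efficiency():.1%}")
    print(f"datacenter efficiency: {datacenter_efficiency():.1%}")
    print(f"reaching the chips:    {delivered_fraction():.1%}")
```

Under those assumptions only about 31% of the energy entering the building reaches the chips, and that is before computing efficiency is taken into account.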
Plan for today:
  • datacenter efficiency
    • the power provisioning efficiency: what can you achieve if you utilize all the energy in your data center?
  • two key energy related costs:
    • 10 year energy costs ($9/watt)
    • cost of building a datacenter ($10-22/watt)
  • Facility costs are as important as energy consumption costs
TCO components: Rough cost breakdown: datacenter (28%) hardware (50%) energy (22%)

The cost of building out a datacenter can be larger than the cost of the energy itself.
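
To make the per-watt figures concrete, here is a minimal sketch that scales them to a facility size. The $9/watt and $10-22/watt numbers are the ones quoted above; the 1 MW facility in the example is a hypothetical size I picked for illustration.

```python
def provisioning_costs(critical_watts,
                       energy_per_watt=9.0,
                       build_low_per_watt=10.0,
                       build_high_per_watt=22.0):
    """Return (10-year energy cost, (low, high) build cost) in dollars."""
    energy = critical_watts * energy_per_watt
    build = (critical_watts * build_low_per_watt,
             critical_watts * build_high_per_watt)
    return energy, build


if __name__ == "__main__":
    # Hypothetical 1 MW facility.
    energy, (build_low, build_high) = provisioning_costs(1_000_000)
    print(f"10-year energy: ${energy:,.0f}")
    print(f"build-out:      ${build_low:,.0f} to ${build_high:,.0f}")
```

Even at the low end of the range, building the facility costs more than ten years of energy for it, which is why unused provisioned capacity is so expensive.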

Efficiency provisioning playbook:
  • consolidate workloads into the minimum number of machines needed for peak usage requirements
    • smart scheduling or virtualization help here
  • measure actual power usage of devices
    • nameplates lie!
  • study activity trends and investigate the oversubscription potential
    • the subject of our ISCA 07 article
Six month power monitoring study at Google (ISCA 07)
  • Basic setup
    • model based power monitoring scheme
    • measure usage statistics at rack, PDU and cluster levels
    • 4 different workloads over 5k servers
More servers lead to higher oversubscription potential.
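
One way to see why larger groups offer more headroom is to compare the sum of each server's individual peak with the peak of the combined draw. The sketch below does that for a handful of made-up power traces. The numbers are purely illustrative and are not from the Google study, and the function name is my own.

```python
def oversubscription_potential(traces):
    """Compare per-server peaks with the aggregate peak.

    traces: one list of power readings (watts) per server, all sampled
    at the same instants.
    Returns (sum_of_peaks, peak_of_sum, headroom_fraction).
    """
    sum_of_peaks = sum(max(trace) for trace in traces)
    peak_of_sum = max(sum(samples) for samples in zip(*traces))
    headroom = 1.0 - peak_of_sum / sum_of_peaks
    return sum_of_peaks, peak_of_sum, headroom


if __name__ == "__main__":
    # Hypothetical traces: each server peaks at a different time.
    traces = [
        [200, 150, 120, 130],
        [110, 210, 140, 120],
        [120, 130, 190, 150],
        [130, 120, 140, 205],
    ]
    for n in (2, 4):
        peaks, aggregate, headroom = oversubscription_potential(traces[:n])
        print(f"{n} servers: sum of peaks {peaks} W, "
              f"aggregate peak {aggregate} W, headroom {headroom:.1%}")
```

Because the servers in this toy data do not peak at the same moment, the aggregate peak is well below the sum of individual peaks, and the gap widens as more servers are included.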

Safely oversubscribing power
  • oversubscribe at the datacenter level, not at the server or rack level
  • profile power usage of applications: learn what to expect
  • mix workloads
  • manage overload
    • provision a sizeable 'best effort' workload; victimize it first
    • use applications with QoS stack
    • good news: time constants to react are long
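
The "victimize best effort first" idea can be sketched as a simple load-shedding routine. This is my own toy version, not anything shown in the talk: the `Workload` type, the workload names and the choice to shed the largest best-effort consumers first are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    watts: float
    best_effort: bool


def shed_load(workloads, budget_watts):
    """Pick best-effort workloads to throttle until the draw fits the budget.

    Returns (victim_names, remaining_watts). Only best-effort workloads are
    ever chosen, so remaining_watts can still exceed the budget if shedding
    all of them is not enough.
    """
    total = sum(w.watts for w in workloads)
    victims = []
    # Assumption: shed the largest best-effort consumers first.
    candidates = sorted((w for w in workloads if w.best_effort),
                        key=lambda w: w.watts, reverse=True)
    for w in candidates:
        if total <= budget_watts:
            break
        victims.append(w.name)
        total -= w.watts
    return victims, total


if __name__ == "__main__":
    workloads = [
        Workload("search", 500, best_effort=False),
        Workload("batch-index", 300, best_effort=True),
        Workload("log-crunch", 200, best_effort=True),
        Workload("mail", 400, best_effort=False),
    ]
    print(shed_load(workloads, budget_watts=1150))
```

Since the time constants to react are long, a loop like this would not need to run at high frequency.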
Energy-proportional computing: (an article on this was published in December of last year)
  • look at datacenter as a device you have to lower power for
  • he calls the datacenter: a land-held
  • CPU activity distribution over six months (graph)
    • real production systems don't run full blast all the time.
    • systems run 10% to 50% of their full capacity most of the time.
  • fraction of time these servers are doing nothing is very small.
  • A datacenter and a laptop are indeed different
Characteristics of well designed internet services:
  • high performance and high availability require
    • load balancing and wide data distribution -> no useful idle intervals, lots of low activity intervals
  • example: Google File System:
    • replicas distributed across multiple machines
    • reads are load balanced across replicas; writes need to reach all of them.
Key implications:
  • sleep or power-down strategies are much less useful in servers
  • focus on energy efficiency at peak performance is misguided
Power varies with the amount of activity in a server. Even when a machine is completely idle, it still draws roughly half of its peak power. At 1/3 of peak load, power efficiency is halved.
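
Those two numbers are consistent with a simple linear power model, which is an assumption on my part rather than something stated in the talk: power rises in a straight line from half of peak at idle to full peak at full load. The function names below are hypothetical.

```python
def relative_power(utilization, idle_fraction=0.5):
    """Power draw as a fraction of peak, assuming a linear model."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_fraction + (1.0 - idle_fraction) * utilization


def relative_efficiency(utilization, idle_fraction=0.5):
    """Work done per unit of energy, normalized so that full load is 1.0."""
    if utilization == 0:
        return 0.0  # no work is being done, whatever the power draw
    return utilization / relative_power(utilization, idle_fraction)


if __name__ == "__main__":
    for u in (0.1, 1 / 3, 0.5, 1.0):
        print(f"load {u:.0%}: power {relative_power(u):.0%} of peak, "
              f"efficiency {relative_efficiency(u):.0%} of peak efficiency")
```

At one third of peak load the model draws two thirds of peak power, so efficiency is (1/3) / (2/3) = 1/2. Setting `idle_fraction` to 0 models a machine that draws no power when idle, and its efficiency stays at 1.0 for any non-zero load.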

Energy-proportional computing: (the idea)
  • no work, no power consumed
  • some work, some power consumed
  • lots of work, lots of power consumed
That would be the end of power management software.

What if we could build machines with a wide activity range? He shows a graph.

Based on another graph, the estimated impact of energy proportionality is huge.

Conclusion:
  • write fast code!
    • the software engineer's biggest contribution to energy efficiency
  • consider reduction of all energy-related costs
    • electricity, and datacenter provisioning
Some Google initiatives
  • carbon neutrality
  • 1.6MW solar panel installation in Mtn. View
  • plug-in hybrids (http://rechargeit.org)
One of the best presentations at Velocity.

More publications by Luiz:


Wednesday, June 25, 2008

Werner Vogels: Keynote at Structure 08

Werner Vogels is now giving a keynote. He is the CTO of Amazon.com.

At Structure 10, the whole discussion will be different. We will be talking about different business models. This is a snapshot at the beginning of the movement. He showed an Animoto video presentation that he had created.

What's so special about Animoto? They have no server infrastructure, even though what they do is very compute intensive. When they had 25,000 customers, they were hovering around 50 instances. They launched a Facebook app that allowed you to import photos, create a video and post it back to Facebook. At that point, they started signing up 25,000 customers per hour. They had to increase their instances to more than 5,000. Imagine Animoto going to a VC and asking for money for 5k servers.

Cloud computing is moving the world from capital expenditure (CAPEX) to operating expenditure (OPEX). We are moving to a variable cost model.
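
A quick back-of-the-envelope sketch shows why a variable cost model matters for a spike like Animoto's. The 50 and 5,000 instance counts come from the talk; the shape of the demand curve (a four-hour spike in a 24-hour day) is something I made up for illustration, and no prices are assumed.

```python
def fixed_capacity_hours(demand):
    """Instance-hours paid for if you provision for peak the whole time."""
    return max(demand) * len(demand)


def on_demand_hours(demand):
    """Instance-hours paid for if you only pay for what you use."""
    return sum(demand)


if __name__ == "__main__":
    # Hypothetical day: 20 quiet hours at 50 instances, 4 spike hours at 5,000.
    demand = [50] * 20 + [5000] * 4
    fixed = fixed_capacity_hours(demand)
    variable = on_demand_hours(demand)
    print(f"provisioned for peak: {fixed:,} instance-hours")
    print(f"pay for what you use: {variable:,} instance-hours")
    print(f"ratio: {fixed / variable:.1f}x")
```

With this made-up profile, owning enough capacity for the peak means paying for several times more instance-hours than were actually used.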

Amazon is now in 7 countries with more than 79 million active customer accounts.

The bandwidth used by AWS is far higher than that used by the Amazon store.

Amazon used to be a technology consumer. Now, because of its scale, there is no third-party software left at Amazon. There are more than 1M sellers. They moved from a single application to a platform.

For the first 5-6 years, up until 2001, Amazon was like a traditional site. The challenge was how to keep scaling: how would they make it to the next year? Around 2001, Target came to Amazon and asked to be integrated. At the same time, a number of architecture pieces broke. They then wanted to move to a platform while working on integrating Target.

At times, Amazon was thinking of going back to a mainframe. They wanted to create a very agile environment.

They created an infrastructure where no direct database connections were allowed. Everything must go through a business logic layer.
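
The rule can be pictured as a service object that owns its data store and exposes only business operations. This is a toy sketch of the general pattern, not Amazon's actual design. Every class and method name here is hypothetical.

```python
class _OrderDatabase:
    """Stand-in for a real database; private to the service."""

    def __init__(self):
        self._rows = {}

    def insert(self, order_id, row):
        self._rows[order_id] = row

    def get(self, order_id):
        return self._rows.get(order_id)


class OrderService:
    """Business logic layer: the only way callers can reach order data."""

    def __init__(self):
        self._db = _OrderDatabase()
        self._next_id = 1

    def place_order(self, customer, quantity):
        # Validation lives here, so no caller can bypass it.
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        order_id = self._next_id
        self._next_id += 1
        self._db.insert(order_id, {"customer": customer, "quantity": quantity})
        return order_id

    def get_order(self, order_id):
        row = self._db.get(order_id)
        # Return a copy so callers cannot mutate stored data directly.
        return dict(row) if row is not None else None


if __name__ == "__main__":
    service = OrderService()
    order_id = service.place_order("alice", 2)
    print(service.get_order(order_id))
```

Because callers only see the service interface, the storage behind it can be changed or scaled without touching them.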

Creating the gateway page on Amazon can involve 200+ services.

The 70/30 switch
  • Companies now have to become experts in many areas not related to their business and answer questions like: why is BGP not stable? Why do datacenters go down? These companies are spending up to 70% of their time, energy and dollars on undifferentiated heavy lifting.
  • Only 30% of time, energy and dollars are spent on differentiated value creation.
He is showing a photo of a destroyed datacenter. If your data was in that datacenter, it is gone! Now he is talking about 365 Main, which did 'everything right.' 6 of their 8 diesel generators failed and brought Web 2.0 down. At Amazon, the goal is to survive an entire datacenter failure.

Don't depend on RAID-5 to protect your data.

Peak capacity management is a big issue for companies such as Walmart.com and Target.com that experience seasonal spikes in traffic.

They wanted to cover three areas: compute, messaging and storage. EC2 covers compute; S3, SimpleDB and EC2 PS (persistent storage) cover storage; and SQS is the fabric that holds everything together. EC2 PS still doesn't have a name.

Most data at Amazon was key-value based. There were secondary key accesses. SimpleDB was a complement to S3.
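
To show what "key-value with secondary key accesses" means in practice, here is a tiny in-memory sketch. It is not the SimpleDB or S3 API. The class, its methods and the sample data are all hypothetical.

```python
from collections import defaultdict


class KeyValueStore:
    """Key-value store that also maintains one secondary index."""

    def __init__(self, indexed_field):
        self._items = {}
        self._indexed_field = indexed_field
        self._index = defaultdict(set)  # field value -> set of primary keys

    def put(self, key, item):
        # Remove the stale index entry if this key is being overwritten.
        old = self._items.get(key)
        if old is not None and self._indexed_field in old:
            self._index[old[self._indexed_field]].discard(key)
        self._items[key] = dict(item)
        if self._indexed_field in item:
            self._index[item[self._indexed_field]].add(key)

    def get(self, key):
        """Primary key access."""
        return self._items.get(key)

    def find_by_indexed(self, value):
        """Secondary key access: primary keys whose indexed field equals value."""
        return sorted(self._index.get(value, ()))


if __name__ == "__main__":
    store = KeyValueStore(indexed_field="customer")
    store.put("order-1", {"customer": "alice", "total": 30})
    store.put("order-2", {"customer": "bob", "total": 12})
    store.put("order-3", {"customer": "alice", "total": 7})
    print(store.get("order-2"))
    print(store.find_by_indexed("alice"))
```

The primary lookup is a single dictionary access, while the secondary index trades extra bookkeeping on every write for fast lookups by another field.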

It's easy for companies to spend as much as 70% of their intellectual capacity in scaling.

Infrastructure Services Drivers
  1. Security
  2. Scalability
  3. Availability
  4. Performance
  5. Cost-effectiveness
Next he is showing the example of SmugMug, which relies heavily on Amazon's EC2 and S3. They currently have 600TB of pictures stored in Amazon S3. There were more than 18 billion objects in Amazon S3 as of March 2008.

SmugMug is now venturing into different businesses, where they provide an interface that allows you to store anything. The product is called SmugMug Vault.

Addressing Uncertainty
  • Acquire resources on demand and release them
  • pay for what you use
  • leverage others' core competencies
  • turn fixed cost into variable
What sense does it make to order a lot of hardware when you don't even have a product? You also aren't sure how many customers you'd get.

Get everything from http://aws.amazon.com, you only need a credit card.


Monday, June 23, 2008

Green Data Centers

Next up is Bill Coleman, who is responsible for the B in BEA. He is also credited for his work on Solaris and is currently CEO of Cassatt Corporation. The talk is about Green Data Centers.

  • What we are doing today in data centers is unsustainable. He calls them 'your father's data center'
  • Concerns
    • first is energy cost
    • second is operations cost. IDC says it has gone from 25% to 75%.
  • everything is a lot more complex today than it was 15 years ago.
  • How did we get here? This is a consequence of innovation. In 1990, people were putting networks in data centers. Then came storage, followed by software people who wanted multi-tiered applications. Then came DBAs :)
  • Then came virtualization. Is it the end of IT? We are still doing things as if it were the 1960s. There is no automation involved; everything must be changed physically.
  • We are at the end of the sustainability of data centers as we know them today.
  • Virtualization makes scale a little bit better. All we are doing is pushing back the ends.
  • 1.0 of cloud: I can build a green field application with proprietary
  • 2.0 of cloud: functions of the PC now exist in the cloud. It will still be proprietary.
  • Apple invented the PC but didn't commoditize it.
  • Very low utilization rates. The next phase of cloud computing will offer higher utilization rates.
Thanks, Bill, for the great insight into green data centers.



    © 2006 The Mashraqi's.