(2008-10-08) Schlossnagle CTO Club

Theo Schlossnagle gave a talk to the NY CTO Club about scalability.

Models

  • Enterprise class: five nines (99.999% availability) for 100% of the population. In reality, everyone ends up disappointed

  • Carrier class: five nines for 99.999% of the population. A few are miserable, most are happy

    • example (Amazon): block some portion of new sessions at the load balancer. This needs load-balancing hardware with that feature; it could also be done in a software layer, but that uses the processor that is already the bottleneck
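The session-blocking idea above can be sketched in a few lines. This is only an illustration of the logic, not how Amazon or any load balancer implements it: the class name `SessionGate` and the `shed_fraction` knob are hypothetical, and as the notes say, doing this in software spends CPU on the tier that is already the bottleneck.

```python
import random


class SessionGate:
    """Admit every existing session; turn away a fraction of new ones.

    Hypothetical sketch: a real deployment would do this in the load
    balancer, not in application code.
    """

    def __init__(self, shed_fraction=0.0, rng=None):
        # shed_fraction is an assumed knob: 0.0 admits all new sessions,
        # 1.0 blocks all new sessions.
        self.shed_fraction = shed_fraction
        self.sessions = set()
        self.rng = rng or random.Random()

    def admit(self, session_id):
        # Users who are already in keep getting served (the "most happy" group).
        if session_id in self.sessions:
            return True
        # New arrivals are rejected with probability shed_fraction
        # (the "few miserable" group).
        if self.rng.random() < self.shed_fraction:
            return False
        self.sessions.add(session_id)
        return True
```

Raising `shed_fraction` under load degrades service for some newcomers while keeping it intact for everyone already admitted, which is the carrier-class trade-off.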

Software architecture

  • DB federation: don't give up data integrity across the board (some app will screw up, and you'll get dirty data); pick the specific things to sacrifice

    • pick doc-like fields and put them in CouchDB (or the file system...), keeping the DB as small as possible

  • Typically run at 30% of capacity, and have other hardware available

  • For each page or action, count the number of I/O operations (including log writes, etc.)

  • Logging: learn from your own history (pick the right metrics)

  • Identify the component that needs to scale and build it on 3 boxes from the start: going from 3 to more is much easier

Cloud

  • no real guarantees

  • Eucalyptus (Xen): open source Amazon EC2 clone that you can run yourself

  • Amazon EC2 is great for math-heavy work (e.g. video transcoding)

  • Scaling is not fast enough for Digg: it takes 10 min to get a new machine up and tested, and by that time it's too late
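A back-of-the-envelope check shows why a 10-minute provisioning delay can be too slow. The model below is an assumption, not something from the talk: traffic is taken to grow exponentially with a fixed doubling time, and the function names are hypothetical.

```python
import math


def minutes_until_saturated(utilization, doubling_minutes):
    """Minutes until load reaches 100% of capacity.

    Assumes (hypothetically) that traffic doubles every doubling_minutes.
    utilization is the current fraction of capacity in use, e.g. 0.3.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return doubling_minutes * math.log2(1.0 / utilization)


def can_scale_in_time(utilization, doubling_minutes, provision_minutes=10.0):
    """True if a new machine is ready before the existing ones saturate."""
    return minutes_until_saturated(utilization, doubling_minutes) > provision_minutes
```

Under this model, a site running at 30% of capacity whose traffic doubles every 5 minutes saturates in under 9 minutes, so a machine that takes 10 minutes to come up arrives too late; the headroom has to exist before the spike.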

Chat afterwards

  • Don't automate failover if a false alarm carries a large cost (e.g. in MySQL you may lose some transactions. If that's OK, is it always OK, so that you can architect for it?)

  • MySQL is their most prevalent DB, but they won't use it for important data. For that they'll use PostgreSQL, MS SQL Server, or something else (even Oracle)

