Fail Better: Radical Ideas from the Practice of Cloud Computing with Tom Limoncelli
Distributed or "cloud" computing involves many moving parts, any of which can break or fail. Succeeding in this environment requires embracing failure, not running or hiding from it. Doing so means challenging our instincts with radical ideas. Tom will highlight some of the most radical advice from the new book The Practice of Cloud System Administration.
Topics will include:
-Create resiliency at the most economic level
-Do risky procedures often
-Create a blameless culture to encourage communication and improve system reliability
Attendees will be inspired to think differently about how they build resilient distributed systems and will see how to put these ideas into practice.
Tom Limoncelli
Tom Limoncelli is an internationally recognized author, speaker, and system administrator. His new book, The Practice of Cloud System Administration, launched last year. His past books include Time Management for System Administrators and The Practice of System and Network Administration. In 2005 he shared the USENIX LISA Outstanding Achievement Award with Christine J. Hogan. While at Google he was a Site Reliability Engineer (SRE) for projects such as the web crawler, Blog Search, office IT deployments, and the Ganeti project (http://code.google.com/p/ganeti). Tom works in New York City at Stack Exchange, home of ServerFault.com and StackOverflow.com. He is also a member of the ACM Queue Editorial Board. His column, "Everything Sysadmin," appears in the magazine.