Emil Stolarsky (@emilstolarsky) and Jaime Woo (@jaimewoo), co-founders of @IncidentLabsInc, talk about their experiences running web applications at scale, evolving into SRE roles, communicating SRE concepts across teams, and tips for initial success.
SHOW: 446
SHOW SPONSOR LINKS:
- MongoDB Homepage - The most popular database for modern applications
- MongoDB Atlas - MongoDB-as-a-Service on AWS, Azure and GCP
- Datadog Homepage - Modern Monitoring and Analytics
- Try Datadog yourself by starting a free, 14-day trial today. Listeners of this podcast will also receive a free Datadog T-shirt.
- DivvyCloud - Achieve continuous security & compliance. Request a free trial today!
- DivvyCloud’s 2020 Cloud Misconfigurations Report - Cloud misconfigurations cost enterprises $5 trillion in 2018 and 2019.
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
SHOW NOTES:
- Incident Labs (homepage)
- Ovvy (tool) - On-Call Management
- Google SRE Book
Topic 1 - Welcome to the show. Tell us a little bit about your backgrounds, and some of the experiences that led you to focus on SRE.
Topic 2 - SRE is still an evolving concept, and people are still learning about it. How do you frame a conversation with people about how SRE works? How much is technology-centric and how much is culture/process-centric?
Topic 3 - We’re all living in an unusual time, given the current COVID-19 pandemic. How do you see SRE changing as work environments change (e.g. WFH), or as traffic volume or change rate is dramatically impacted?
Topic 4 - What have you found are successful communication and collaboration models for SREs and their associated teams (or other stakeholders)?
Topic 5 - How well do you find different groups understand the concepts around error budgets and SLOs?
Topic 6 - If people are just now getting started with SRE, what are some early tips (or tools) that you recommend for them to have initial success (or avoid failures)?
FEEDBACK?
- Email: show at thecloudcast dot net
- Twitter: @thecloudcastnet