DGSE - Developing a Google SRE Culture
Many IT organizations experience a disconnect between developers, who focus on agility, and operators, who focus on stability. Site Reliability Engineering (SRE) is how Google bridges the gap between development and operations, while also providing mission-critical production support. In this course, you'll learn the fundamentals and best practices of SRE, the importance of adopting an SRE culture, and how SRE can improve collaboration between IT and business leaders—and help the entire organization succeed.
II. Duration: 1 day
- Explain why SRE is important to the success of an organization’s IT transformation project
- Distinguish between DevOps and SRE
- Articulate the pillars of DevOps
- Explain how SRE practices align to DevOps pillars
- Understand the value SRE can provide to an organization
- Describe the technical and cultural fundamentals of SRE
- Assess organizational SRE maturity level
- Identify where SRE can be applied within the business
- Recognize the skills an SRE needs
- Articulate the different types of SRE team implementations
- Advocate for SRE culture adoption across the organization
IV. Intended Audience
Primary audience: IT leaders and business leaders who are interested in embracing SRE philosophy. Roles include, but are not limited to: CTO, IT director/manager, engineering VP/director/manager.
Secondary audience:
- Other product and IT roles such as operations managers or engineers, software engineers, service managers, or product managers may also find this content useful as an introduction to SRE.
- No formal prerequisites required.
- Recommended pre-reading: Site Reliability Engineering: How Google Runs Production Systems - Chapter 1 Introduction
The course includes presentations, demonstrations, and hands-on labs.
Module 1: Welcome to Developing a Google SRE Culture
- Explain why SRE is important to the success of an organization’s IT transformation project.
Module 2: DevOps, SRE, and Why They Exist
- Distinguish between DevOps and SRE.
- Articulate the pillars of DevOps.
- Explain how SRE practices align to DevOps pillars.
Module 3: SLOs with Consequences
- Understand the value SRE can provide to an organization.
- Describe the technical fundamentals of SRE (SLOs, error budgets, and blameless postmortems).
- Describe the cultural fundamentals of SRE (psychological safety, blamelessness, unified vision, collaboration, and knowledge sharing).
Module 4: Make Tomorrow Better than Today
- Describe the technical fundamentals of SRE (continuous integration/continuous delivery, canarying, and toil automation).
- Describe the cultural fundamentals of SRE (design thinking, prototyping, psychology of change, and resistance to change).
Module 5: Regulate Workload
- Describe the technical fundamentals of SRE (measuring toil and reliability, and monitoring).
- Describe the cultural fundamentals of SRE (goal-setting, transparency, and data-driven decision making).
Module 6: Apply SRE in Your Organization
- Assess your organization’s SRE maturity level.
- Identify where SRE can be applied within your business.
- Recognize the skills an SRE needs.
- Articulate the different types of SRE team implementations.
- Advocate for SRE culture adoption across your organization.
Module 7: Final Assessment
- Assess SRE technical and cultural fundamentals knowledge.