Modern cloud-native and distributed systems operate at a scale and complexity where traditional, reactive SRE practices — based on static thresholds, dashboards, and manual incident response — are no longer sufficient. This programme strengthens the technical depth, architectural judgement, and implementation capability of Senior DevOps Engineers and Architects through a progressive, system-centric model.

All learning is anchored on real microservices-based reference applications implemented in Java (Spring Boot) and .NET (ASP.NET Core). Learners instrument, observe, stress, break, recover, and optimise these services across the programme lifecycle. AI augments — not replaces — SRE decision-making: detecting issues earlier, improving diagnosis, preventing failures, and enabling safe, automated remediation under defined guardrails.

WHO SHOULD ATTEND

Senior DevOps Engineers and Architects responsible for operating, scaling, and stabilising cloud-native systems at production scale
SRE Practitioners who manage monitoring, automation, and CI/CD pipelines and are expected to architect reliability and intelligent automation
Platform Engineers with direct involvement in production incidents, RCA, and post-incident improvement activities

Experience: 3–8+ years required.

PRE-REQUISITES
Must-Have

Experience with Java (Spring Boot) / .NET (ASP.NET Core) services in production
Strong knowledge of AWS or Azure — compute, networking, storage, IAM, monitoring
Practical Kubernetes experience including troubleshooting pod, service, and node-level failures
Experience building and maintaining CI/CD pipelines
Direct production incident experience

Good to Have

Database administration and optimisation
Hands-on proficiency with Prometheus, Jaeger, Grafana, ELK, or Terraform
Microservice architecture knowledge

KEY OUTCOMES

Define business-aligned SLIs and SLOs at application and transaction levels and implement SLI instrumentation within Java / .NET services
Assess solution, database, and infrastructure architectures from performance, scalability, and reliability perspectives
Conduct Fault Vulnerability Analysis (FVA) using historical data, incident patterns, and AI-assisted insights
Design and implement high-fidelity observability architectures enabling AI-driven anomaly detection, signal correlation, and contextual analysis
Design and validate chaos engineering strategies by executing controlled failure scenarios
Participate in production incident RCA leveraging AI-assisted correlation, blast-radius analysis, and intelligent incident summarisation
Define release management and rollback strategies aligned to SLOs, error budgets, and AI-supported risk-based change analysis
Design and implement automation for toil reduction and self-healing with AI-assisted closed-loop remediation under defined guardrails


Advanced AI-Enabled SRE
