Research Article

Resilience by Design: A Deep Dive into Chaos Engineering in Cloud-Native Architectures

Authors

  • Susanta Kumar Sahoo, Independent Researcher, USA

Abstract

Chaos engineering has become a methodological necessity for ensuring resilience in cloud-native architectures, which offer scalability and flexibility but introduce complex interdependencies and unpredictable failure modes. Traditional testing approaches fall short in identifying systemic vulnerabilities, whereas chaos engineering proactively uncovers weaknesses through controlled experimentation. The discipline begins by defining steady-state metrics and formulating hypotheses about system behavior under stress, and only then introduces controlled failures. Implementation in Kubernetes environments relies on specialized tooling, CI/CD integration, and service mesh capabilities, while comprehensive observability through metrics, logs, and traces shows how failures propagate. Beyond technical considerations, successful adoption requires organizational transformation centered on psychological safety, blameless learning, cross-functional ownership, and knowledge-sharing practices. As systems grow more distributed and complex, chaos engineering is shifting from an optional practice to a fundamental discipline within site reliability engineering.

Article information

Journal

Journal of Computer Science and Technology Studies

Volume (Issue)

7 (9)

Pages

525-532

Published

2025-09-12

How to Cite

Susanta Kumar Sahoo. (2025). Resilience by Design: A Deep Dive into Chaos Engineering in Cloud-Native Architectures. Journal of Computer Science and Technology Studies, 7(9), 525-532. https://doi.org/10.32996/jcsts.2025.7.9.60

Keywords:

Resilience Engineering, Fault Injection, Distributed Systems, Observability, Psychological Safety