Gateway Network Partition

· 3 min read
Christopher Kujawa
Principal Software Engineer @ Camunda
  • Documented failure cases for the AsyncSnapshotDirector. This gave me some ideas about where it might make sense to reinstall a partition. Discussed them a bit with @Deepthi.
  • Our automated chaos experiments are still not running. Fixing them will take some time, which I did not have today.
  • Ran a chaos experiment together with @pihme, where we set up a network partition with the gateway.

Correlate Message after failover

· One min read
Christopher Kujawa
Principal Software Engineer @ Camunda
  • Documented failure cases for the engine and stream processor. I think we already handle almost all failure cases I can think of, except problems on reading, which I think can't be handled.
  • Checked what the current issue is with the automated chaos experiments. It seems to be an infra problem. You can check the discussion in #infra. It might be related to Infra-1292.
  • Ran a chaos experiment, where we correlate a message after failover.

First Chaos Day!

· 2 min read
Christopher Kujawa
Principal Software Engineer @ Camunda

First Chaos day 🎉

  • Documented failure cases for the exporter (some already existed, it seems), which gave me a new idea for a ZEP.
  • Introduced Peter to our Chaos Repository, discussed the hypothesis backlog a bit, and reopened the Chaos Trello board, where we will organize ourselves.
  • Ran a chaos experiment, where we put high CPU load on the Leader #6.