Skip to main content

33 posts tagged with "availability"

View All Tags

Disconnect Leader and one Follower

ยท 8 min read
Christopher Kujawa
Chaos Engineer @ Zeebe

Happy new year everyone ๐ŸŽ‰

This time I wanted to verify the following hypothesis Disconnecting Leader and one Follower should not make cluster disruptive (#45). But in order to do that we need to extract the Leader and Follower node for a partition from the Topology. Luckily in December we got an external contribution which allows us to print zbctl status as json. This gives us now more possibilities, since we can extract values much better out of it.

TL;DR The experiment was successful ๐Ÿ‘

Message Correlation after Failover

ยท 4 min read
Christopher Kujawa
Chaos Engineer @ Zeebe

Today I wanted to finally implement an experiment which I postponed for long time, see #24. The problem was that previous we were not able to determine on which partition the message was published, so we were not able to assert that it was published on the correct partition. With this #4794 it is now possible, which was btw an community contribution. ๐ŸŽ‰

Many Job Timeouts

ยท 4 min read
Christopher Kujawa
Chaos Engineer @ Zeebe

In the last game day (on friday 06.11.2020) I wanted to test whether we can break a partition if many messages time out at the same time. What I did was I send many many messages with a decreasing TTL, which all targeting a specific point in time, such that they will all timeout at the same time. I expected that if this happens that the processor will try to time out all at once and break because the batch is to big. Fortunately this didn't happen, the processor was able to handle this.

I wanted to verify the same with job time out's.