Should a configured webhook integration fail for more than 10 times without a successful response, it will be paused until a user enables it again in the Integration settings page.
I don’t think that’s a good idea. Ten failures is not nearly enough. Instead, I would suggest turning it off automatically only if requests have failed for three days straight, or something similar. Your network and API could have issues too.
Imagine the following:
An event is triggered. The event makes an API call to your website, but your website / API currently has an issue. Requests to your website fail → the hook returns "Bad request". So your website being down is enough on its own to pause the webhook. And yes, that does indeed happen. We see your API having trouble quite often, this morning for example.
Getting an email every time a webhook fails is kinda annoying. Please add an option to send these to a different inbox, or only send an email when a webhook gets automatically paused.
Your documentation also states:
awork expects a webhook request to return with a successful response within 30 seconds. Otherwise, the event will be marked as failed and retried for up to 10 times.
Does retrying actually work? I was unable to verify that.
When is a retry attempted? Only when the web server is not reachable at all, or whenever the response code is not 200?
We are releasing a fix for the webhook retries tonight, so that each event is retried up to 5 times, with delays between the retries of 5 sec, 1 min, 5 min, 30 min, and 1 h. On top of that, we add a jitter of 1-30 sec so that not all events are retried at the same time. Only after an event has failed 5 times do we increase the FailureCount of the webhook.
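To make the schedule concrete, here is a minimal Python sketch of how the wait before each retry could be computed. It is an illustration only, not awork's implementation: the names `RETRY_DELAYS` and `retry_delay` are hypothetical, and it assumes the jitter is drawn uniformly and added on top of the base delay.

```python
import random

# Base delays between retries, in seconds: 5 sec, 1 min, 5 min, 30 min, 1 h.
RETRY_DELAYS = [5, 60, 300, 1800, 3600]


def retry_delay(attempt, rng=random):
    """Return the wait in seconds before retry number `attempt` (1-based)."""
    if not 1 <= attempt <= len(RETRY_DELAYS):
        raise ValueError("attempt must be between 1 and 5")
    # Assumption: 1-30 seconds of uniform jitter is added to the base delay
    # so that many failed events are not retried at the same moment.
    return RETRY_DELAYS[attempt - 1] + rng.uniform(1, 30)
```

Under these assumptions the first retry happens 6 to 35 seconds after the failed delivery, and the fifth a little over an hour after the fourth.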
A webhook is deactivated after 10 consecutive failures. A successful API request resets the FailureCount.
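The counting rule can be sketched in a few lines of Python. This is a hedged illustration, not awork's code: the `Webhook` class and its method are hypothetical, and it assumes the webhook is deactivated as soon as the tenth consecutive failure is recorded.

```python
MAX_CONSECUTIVE_FAILURES = 10  # threshold taken from the rule above


class Webhook:
    """Hypothetical model of the webhook's FailureCount handling."""

    def __init__(self):
        self.failure_count = 0
        self.active = True

    def record_event_result(self, delivered):
        """Update state once an event was delivered or ran out of retries."""
        if delivered:
            # A successful request resets the counter.
            self.failure_count = 0
            return
        self.failure_count += 1
        # Assumption: deactivate on the tenth consecutive failure.
        if self.failure_count >= MAX_CONSECUTIVE_FAILURES:
            self.active = False
```

With this model, nine failed events followed by one success leave the webhook active with a count of zero; only ten failed events in a row deactivate it.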
Regarding our API performance: We are aware of the problem and have made major improvements in this area over the last 2 weeks. There were several issues with the way our database handled queries that were really hard to spot, but we finally found the underlying causes. Together with a bunch of other improvements, this made this week far more stable than the previous ones. We will keep investing heavily in performance improvements in the future as well.
I hope this answers your questions.
Best regards
Ian