We must be very careful when calling third-party services, especially if the network call passes through firewalls, VPNs, or other security components. Even if the third party promises an SLA, that promise must not lull us into complacency.
I recommend following the HIME principle (Health, Integrity, Monitoring, Evidence). We should take care of four things:
Health: Our service must remain healthy. Connections must not hang for too long, and neither threads nor memory may be exhausted. The service must remain at least partially functional even if the third party misbehaves.
A typical mistake is not setting a timeout: calls that never return keep holding threads and connections, which gradually overloads our system.
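As a minimal sketch of the timeout rule, assuming Python's standard urllib: the function name `fetch_or_fallback`, the `opener` parameter, and the two-second default are illustrative choices, not a recommendation for any particular service.

```python
import urllib.request


def fetch_or_fallback(url, timeout_s=2.0, fallback=None,
                      opener=urllib.request.urlopen):
    """Call a third-party URL with an explicit timeout.

    Returns the response body as bytes, or `fallback` when the call
    fails or times out, so the caller stays partially functional.
    """
    try:
        # Without `timeout`, urlopen falls back to the global socket
        # default, which is "no timeout" unless it was changed.
        with opener(url, timeout=timeout_s) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both subclasses of OSError.
        return fallback
```

Note that this timeout applies to each blocking socket operation rather than to the call as a whole, so a slow but steadily trickling response can still take longer than `timeout_s` in total.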
Integrity: Data must remain consistent. If we call a third-party service that modifies data and a timeout occurs, we do not know whether the update was applied or not.
A typical mistake is not sending a UUID that identifies the request, or not checking afterwards whether the data was actually changed.
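The UUID idea can be sketched as follows. This assumes the third party deduplicates requests by the ID we send, which has to be confirmed for any real service; `FakePaymentService` and `deposit_with_retry` are hypothetical names used only for illustration.

```python
import uuid


class FakePaymentService:
    """In-memory stand-in for a third party that deduplicates by request ID."""

    def __init__(self):
        self.balance = 0
        self.seen = {}                # request_id -> result already produced
        self.drop_next_reply = False  # simulate a reply lost on the network

    def deposit(self, request_id, amount):
        if request_id in self.seen:
            return self.seen[request_id]   # replay: do not apply twice
        self.balance += amount
        self.seen[request_id] = self.balance
        if self.drop_next_reply:
            self.drop_next_reply = False
            raise TimeoutError("reply lost after the update was applied")
        return self.balance


def deposit_with_retry(service, amount, attempts=3):
    request_id = str(uuid.uuid4())    # one ID per logical operation
    for _ in range(attempts):
        try:
            return service.deposit(request_id, amount)
        except TimeoutError:
            continue                  # outcome unknown: retry with the SAME ID
    raise RuntimeError("third party unavailable, outcome unknown")
```

Because the retry reuses the same ID, a timeout that occurs after the update was applied does not lead to a second update. If the third party offers no such deduplication, the alternative is to query the current state before retrying.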
Monitoring: If the third-party service misbehaves, we should be notified as soon as possible.
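One simple way to get notified early is to count consecutive failures and raise an alert when a threshold is reached. The sketch below is only an illustration: `FailureMonitor`, the threshold, and the `notify` callback are assumptions, and a real setup would more likely rely on an existing monitoring system.

```python
class FailureMonitor:
    """Calls `notify` once when `threshold` consecutive calls have failed."""

    def __init__(self, threshold, notify):
        self.threshold = threshold
        self.notify = notify          # e.g. a function that sends an e-mail
        self.consecutive_failures = 0

    def record(self, success):
        if success:
            self.consecutive_failures = 0   # any success resets the streak
            return
        self.consecutive_failures += 1
        # Alert exactly when the threshold is hit, not on every later failure.
        if self.consecutive_failures == self.threshold:
            self.notify(f"{self.threshold} consecutive third-party failures")
```

The caller would invoke `record(True)` or `record(False)` after every third-party call.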
Evidence: We should log everything: when the service was called, the input data, the result, and the processing time. If we report a problem to the third party, we need to provide a detailed event log.
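A minimal sketch of such an event log, assuming Python's standard logging module and one JSON record per call; the logger name `third_party_calls`, the helper `logged_call`, and the field names are illustrative.

```python
import json
import logging
import time

log = logging.getLogger("third_party_calls")


def logged_call(name, func, payload):
    """Call func(payload) and log one JSON record with input, outcome, duration."""
    t0 = time.perf_counter()
    record = {"service": name, "called_at": time.time(), "input": payload}
    try:
        result = func(payload)
        record.update(status="ok", result=result)
        return result
    except Exception as exc:
        record.update(status="error", error=repr(exc))
        raise                         # the caller still sees the failure
    finally:
        # Runs for both success and failure, so every call leaves a record.
        record["duration_ms"] = round((time.perf_counter() - t0) * 1000, 3)
        log.info(json.dumps(record, default=str))
```

A call then looks like `logged_call("rates", fetch_rates, {"currency": "EUR"})`, where `fetch_rates` stands for whatever function performs the actual request.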
Monitoring and Evidence are easy to set up. We can use a logging system together with the ELK stack (Elasticsearch, Logstash, and Kibana).
Health and Integrity are more complicated, and in-depth testing is very important. We can use Spoiler Proxy to simulate some common network and software errors.
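To illustrate the kind of fault worth testing, the sketch below simulates a third party that accepts a connection but never answers. It is a hand-rolled stand-in built on Python's standard library, not the API of Spoiler Proxy; the names `start_black_hole` and `call_times_out` are made up for this example.

```python
import http.client
import socket


def start_black_hole():
    """Listen on a local port but never accept or answer: a 'stuck' third party."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))     # port 0 lets the OS pick a free port
    server.listen(1)
    return server


def call_times_out(port, timeout_s):
    """Return True if a GET to the local port gives up with a timeout."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout_s)
    try:
        conn.request("GET", "/")
        conn.getresponse()            # blocks until data arrives or timeout
        return False
    except socket.timeout:
        return True
    finally:
        conn.close()
```

A test can start the black hole, call `call_times_out(server.getsockname()[1], 0.3)`, and check that the client gives up after roughly the configured time instead of hanging.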
These rules do not apply only to calls to third-party services. They can also be generalized to calls between services in a microservice architecture.