Resilience

The original Spring Cloud version of PetClinic used Resilience4j to apply a 4-second timeout to calls to the visits service, with a fallback that returned an empty list of visits when the request timed out.

This version of the application removes the Spring Cloud dependencies, so the Resilience4j timeout configuration is replaced with an Istio custom resource.

The file timeouts.yaml configures the equivalent 4s timeout on requests to the visits service, replacing the previous Resilience4j-based implementation.

timeouts.yaml
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: visits
spec:
  hosts:
  - visits-service.default.svc.cluster.local
  http:
  - route:
      - destination:
          host: visits-service.default.svc.cluster.local
    timeout: 4s

Apply the timeout configuration to your cluster:

kubectl apply -f manifests/config/timeouts.yaml
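
To confirm that the resource was accepted, one option is to read the timeout back from the cluster. The following is a minimal sketch: the function name visits_timeout is hypothetical, and it assumes the VirtualService lives in the current namespace under the name visits, as in the manifest above.

```shell
# Hypothetical helper: print the timeout configured on the "visits" VirtualService.
# Assumes the resource was applied to the current namespace.
visits_timeout() {
  kubectl get virtualservice visits -o jsonpath='{.spec.http[0].timeout}'
}
```

Calling visits_timeout after the apply step should print the configured value (4s for the manifest above).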

The fallback logic in PetClinicController.getOwnerDetails was retrofitted to detect the Gateway Timeout (504) response code instead of using a Resilience4j API.

To test this feature, the environment variable DELAY_MILLIS was introduced into the visits service to insert a delay when fetching visits.

Here is how to test the behavior:

  1. Call visits-service directly:

    kubectl exec deploy/sleep -- curl -s "visits-service:8080/pets/visits?petId=8" | jq

    Observe that the call succeeds and returns a list of visits for this pet.

  2. Call the petclinic-frontend endpoint, and note that for each pet, we see a list of visits:

    kubectl exec deploy/sleep -- curl -s petclinic-frontend:8080/api/gateway/owners/6 | jq
    
  3. Edit the visits-service deployment so that the environment variable DELAY_MILLIS is set to "5000" (5 seconds). One way to do this is to open the deployment in an editor, make the change, then save and exit:

    kubectl edit deploy visits-v1
    

    Wait until the new pod has rolled out and become ready.

  4. Once the new visits-service pod reaches Ready status, make the same call again:

    kubectl exec deploy/sleep -- curl -v "visits-service:8080/pets/visits?petId=8"

    Observe the 504 (Gateway Timeout) response this time, because the 5-second delay exceeds the 4-second timeout.

  5. Call the petclinic-frontend endpoint once more, and note that for each pet, the list of visits is empty:

    kubectl exec deploy/sleep -- curl -s petclinic-frontend:8080/api/gateway/owners/6 | jq
    

    That is, the call succeeds, the timeout is caught, and the fallback empty list of visits is returned in its place.

  6. Tail the logs of petclinic-frontend and observe a log message indicating that the fallback was triggered:

    kubectl logs --follow svc/petclinic-frontend
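
For steps 1 and 4 above, it can be convenient to see only the HTTP status code rather than the full response. The following is a small sketch using curl's --write-out option. The function name visits_status is hypothetical, and the default pet ID of 8 simply mirrors the example calls above.

```shell
# Hypothetical helper: print only the HTTP status code returned by visits-service.
# The request is made from the sleep pod, so it passes through the Istio sidecar.
visits_status() {
  kubectl exec deploy/sleep -- curl -s -o /dev/null -w '%{http_code}' \
    "visits-service:8080/pets/visits?petId=${1:-8}"
}
```

With no delay configured, visits_status is expected to report a success code. With DELAY_MILLIS set to "5000", it is expected to report 504.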
    

Restore the original behavior with no delay: edit the visits-v1 deployment again and set the value of DELAY_MILLIS to "0".
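
As a non-interactive alternative to kubectl edit, the delay can also be changed with kubectl set env. The following is a sketch under a few assumptions: the function name set_visits_delay is hypothetical, the deployment is named visits-v1 as in the steps above, and the variable should be set on every container in that deployment.

```shell
# Hypothetical helper: set DELAY_MILLIS on the visits-v1 deployment,
# then block until the updated pod has rolled out.
set_visits_delay() {
  delay_ms="${1:-0}"   # defaults to 0, i.e. no delay
  kubectl set env deploy/visits-v1 DELAY_MILLIS="$delay_ms" &&
    kubectl rollout status deploy/visits-v1
}
```

For example, set_visits_delay 5000 introduces the 5-second delay used in the test, and set_visits_delay 0 restores the original behavior.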

To learn more about resilience features in Istio, see the traffic management section of the Istio documentation.

Let us next turn our attention to security-related configuration.