Observability

Distributed Tracing

The Istio documentation dedicates a page to guiding users on how to propagate trace headers in calls between microservices, in order to support distributed tracing.

In this version of PetClinic, all Spring Boot microservices have been configured to propagate trace headers using micrometer-tracing.

Micrometer Tracing is an elegant solution in that trace header propagation is not coupled to the application logic; instead, it becomes a simple matter of static configuration.

See the application.yaml resource files and the property management.tracing.baggage.remote-fields, which configures the fields to propagate.
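For illustration, the configuration takes roughly this shape (the field list below is a placeholder; the application.yaml files in the repository are authoritative):

management:
  tracing:
    baggage:
      remote-fields:   # illustrative; see the repository for the actual list
        - x-request-id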

To make testing this easier, configure Istio with 100% trace sampling, as follows:

telemetry.yaml
---
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  tracing:
  - providers:
    - name: zipkin
    randomSamplingPercentage: 100.0
  accessLogging:
  - providers:
    - name: envoy

Apply it:

kubectl apply -f manifests/config/telemetry.yaml
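To verify that the resource was created (an optional check):

kubectl get telemetry -n istio-system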

Observe distributed traces

In its samples directory, Istio provides sample deployment manifests for various observability tools, including Zipkin and Jaeger.

Deploy Jaeger to your Kubernetes cluster:

  1. Deploy Jaeger:

    kubectl apply -f istio-1.23.0/samples/addons/jaeger.yaml
    
  2. Wait for the Jaeger pod to be ready (a kubectl wait alternative is sketched after this list):

    kubectl get pod -n istio-system
    

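Alternatively, you can block until the pod reports ready (a convenience; this assumes the addon labels the pod app=jaeger):

kubectl wait --for=condition=ready pod -l app=jaeger -n istio-system --timeout=120s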
Next, let us turn our attention to calling an endpoint that will generate a trace capture, and observe it in the Jaeger dashboard:

  1. Call the petclinic-frontend endpoint that calls both the customers and visits services. Feel free to make multiple requests to generate multiple traces (a simple loop for this is sketched after this list).

    curl -s http://$LB_IP/api/gateway/owners/6 | jq
    
  2. Launch the jaeger dashboard:

    istioctl dashboard jaeger
    
  3. In Jaeger, search for traces involving the services petclinic-frontend, customers, and visits.
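To generate several traces in one go, the loop mentioned in step 1 can be as simple as this (a sketch; it assumes LB_IP is set as before):

for i in $(seq 1 10); do
  curl -s http://$LB_IP/api/gateway/owners/6 > /dev/null
  sleep 1
done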

You should see one or more traces, each with six spans. Click on any one of them to display the full end-to-end request-response flow across all three services.

Distributed Trace Example

Close the Jaeger dashboard.

Exposing metrics

Istio has built-in support for Prometheus as a mechanism for metrics collection.

Each Spring Boot application is configured with a micrometer dependency to expose a scrape endpoint for Prometheus to collect metrics.
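In Spring Boot terms, that means the micrometer-registry-prometheus dependency is on the classpath and the actuator endpoint is exposed, with configuration along these lines (illustrative; the application.yaml files in the repository are authoritative):

management:
  endpoints:
    web:
      exposure:
        include: prometheus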

Call the scrape endpoint and inspect the metrics exposed directly by the Spring Boot application:

kubectl exec deploy/customers-v1 -c istio-proxy -- \
  curl -s localhost:8080/actuator/prometheus

Separately, Envoy collects a variety of metrics, often referred to as RED metrics: Requests, Errors, and Durations.

Inspect the metrics collected and exposed by the Envoy sidecar:

kubectl exec deploy/customers-v1 -c istio-proxy -- \
  curl -s localhost:15090/stats/prometheus

One common metric to note is the counter istio_requests_total:

kubectl exec deploy/customers-v1 -c istio-proxy -- \
  curl -s localhost:15090/stats/prometheus | grep istio_requests_total

Both the application's metrics and Envoy's metrics are aggregated (merged) and exposed on port 15020:

kubectl exec deploy/customers-v1 -c istio-proxy -- \
  curl -s localhost:15020/stats/prometheus

Istio is able to aggregate both scrape endpoints thanks to annotations in each application's pod template specification that communicate the URL of the application's Prometheus scrape endpoint.

For example, the customers service carries Prometheus scrape annotations in its pod template.
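They take roughly this form (the port and path below are assumptions based on Spring Boot actuator defaults; the deployment manifest in the repository is authoritative):

template:
  metadata:
    annotations:
      prometheus.io/scrape: "true"                 # enable scraping for this pod
      prometheus.io/port: "8080"                   # assumed application port
      prometheus.io/path: "/actuator/prometheus"   # assumed scrape path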

Rob Salmond's blog entry on Prometheus provides a nice illustration of how both scrape endpoints are aggregated. For more information on metrics merging and Prometheus, see the Istio documentation.

Send requests to the application

To send a steady stream of requests through the petclinic-frontend application, we use siege. Feel free to use other tools, or a simple bash while loop like the one sketched below.

Run the following siege command to send requests to various endpoints in our application:

siege --concurrent=6 --delay=2 --file=./urls.txt
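If siege is not available, a bash while loop approximates it (a sketch; it assumes urls.txt contains one URL per line):

while true; do
  while read -r url; do
    curl -s -o /dev/null "$url"
  done < urls.txt
  sleep 2
done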

Leave the siege command running.

Open a separate terminal in which to run subsequent commands.

The Prometheus dashboard

Deploy Prometheus to your Kubernetes cluster:

kubectl apply -f istio-1.23.0/samples/addons/prometheus.yaml

Launch the Prometheus dashboard:

istioctl dashboard prometheus

Here are some PromQL queries you can try out that fetch metrics from Prometheus's metrics store:

  1. The number of requests made by petclinic-frontend to the customers service:

    istio_requests_total{source_app="petclinic-frontend",destination_app="customers-service",reporter="source"}
    
  2. A business metric exposed by the application proper: the number of calls to the findPet method:

    petclinic_pet_seconds_count{method="findPet"}
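You can also wrap these counters in PromQL functions; for example, the per-second rate of requests from petclinic-frontend to the customers service over the last five minutes:

rate(istio_requests_total{source_app="petclinic-frontend",destination_app="customers-service",reporter="source"}[5m])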
    

Istio's Grafana metrics dashboards

Istio provides standard service mesh dashboards, built on the metrics collected by Envoy and scraped by Prometheus.

Deploy Grafana:

kubectl apply -f istio-1.23.0/samples/addons/grafana.yaml

Launch the Grafana dashboard:

istioctl dashboard grafana

Navigate to the dashboards section; you will see an Istio folder.

Select the Istio Service Dashboard.

Review it for each of the services petclinic-frontend, vets, customers, and visits.

The dashboard exposes metrics such as the client request volume, client success rate, and client request durations:

Istio service dashboard for the customers service

PetClinic custom Grafana dashboard

The version of PetClinic on which this one is based already included a custom Grafana dashboard.

To import the dashboard into Grafana:

  1. Navigate to "Dashboards"
  2. Click the "New" pulldown button, and select "Import"
  3. Select "Upload dashboard JSON file", and choose the file grafana-petclinic-dashboard.json from the repository's base directory.
  4. Select "Prometheus" as the data source
  5. Finally, click "Import"
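Alternatively, the import can be scripted against Grafana's HTTP API (a sketch; it assumes the dashboard is reachable at localhost:3000 via istioctl dashboard grafana, with the Istio Grafana addon's default anonymous admin access in effect):

# wrap the dashboard JSON in the payload shape the API expects, then POST it
jq '{dashboard: ., overwrite: true}' grafana-petclinic-dashboard.json | \
  curl -s -X POST http://localhost:3000/api/dashboards/db \
    -H 'Content-Type: application/json' --data-binary @-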

The top two panels, showing request latencies and request volumes, are now technically redundant: both are subsumed by the standard Istio dashboards.

Below those panels are custom application metrics, such as the number of owners, pets, and visits created or updated.

Create a new Owner, give an existing owner a new pet, or add a visit for a pet, and watch those counters increment in Grafana.

Kiali

Kiali is a bespoke "console" for Istio Service Mesh. One of its standout features is its visualization of requests making their way through the call graph.

  1. Cancel the currently-running siege command. Relaunch siege, but with a different set of target endpoints:

    siege --concurrent=6 --delay=2 --file=./frontend-urls.txt
    
  2. Deploy Kiali:

    kubectl apply -f istio-1.23.0/samples/addons/kiali.yaml
    
  3. Launch the Kiali dashboard:

    istioctl dashboard kiali
    

    Select the Graph view and the default namespace.

    The flow of requests through the applications call graph will be rendered.

Visualization of traffic flow in Kiali

Kiali also has integrations specific to JVM and Spring Boot applications.