Monitoring and Observability

🎯 Learning Goals

By the end of this lesson, you will be able to:

  • Configure structured logging in FastAPI.
  • Collect metrics (requests, latency, errors) and expose them for monitoring.
  • Add tracing to follow requests across services.
  • Integrate FastAPI with Prometheus and visualize metrics in Grafana.

⚡ Step 1: Logging

Logging is the foundation of observability. In production, logs should be structured and easy to parse.

📄 main.py

import logging
from fastapi import FastAPI

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s"
)

logger = logging.getLogger("taskmanager")

app = FastAPI()

@app.get("/health")
def health_check():
    logger.info("Health check endpoint called")
    return {"status": "ok"}

🔍 What’s happening?

  • Logs include timestamp, severity, logger name, and message.
  • Use logger.info(), logger.error(), etc. in your endpoints.
  • In production, logs can be shipped to ELK (Elasticsearch, Logstash, Kibana) or Loki.
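Structured logs are easiest to ship when each record is a single JSON object. You can get that with the standard library alone; a minimal sketch (the JsonFormatter class name is ours):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("taskmanager")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Health check endpoint called")
```

In a real deployment you would more likely use a library such as python-json-logger, but the idea is the same: one parseable JSON document per log line.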

⚡ Step 2: Metrics

Metrics give quantitative insights (e.g. request counts, latency).

Install prometheus-fastapi-instrumentator:

pip install prometheus-fastapi-instrumentator

📄 main.py

from prometheus_fastapi_instrumentator import Instrumentator

# Attach to the existing `app` from Step 1 (place after `app = FastAPI()`)
instrumentator = Instrumentator().instrument(app).expose(app)

Now your app exposes metrics at /metrics.

Example metrics:

  • http_requests_total → number of requests
  • http_request_duration_seconds → latency histogram
  • http_requests_inprogress → concurrent requests (opt-in: pass should_instrument_requests_inprogress=True to Instrumentator)
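Beyond the built-in metrics, you can register your own with prometheus_client (the library the instrumentator builds on). A sketch of a custom counter; the tasks_created name is illustrative, and the dedicated registry just keeps the example self-contained (in the app you would normally use the default global registry):

```python
from prometheus_client import CollectorRegistry, Counter

# Dedicated registry for the example; omit `registry=` to use the global one
registry = CollectorRegistry()

tasks_created = Counter(
    "tasks_created",                  # exported as tasks_created_total
    "Total number of tasks created",
    registry=registry,
)

# Call .inc() wherever a task is created, e.g. inside a POST /tasks endpoint
tasks_created.inc()
tasks_created.inc()

print(registry.get_sample_value("tasks_created_total"))  # 2.0
```

Counters only go up; for values that can rise and fall (queue depth, open connections) use a Gauge instead.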

⚡ Step 3: Tracing

Tracing lets you follow a request across services. Use OpenTelemetry.

Install:

pip install opentelemetry-api opentelemetry-sdk opentelemetry-instrumentation-fastapi

📄 main.py

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

# Set up a tracer provider that batches spans and prints them to the console
trace.set_tracer_provider(TracerProvider())
span_processor = BatchSpanProcessor(ConsoleSpanExporter())
trace.get_tracer_provider().add_span_processor(span_processor)

# Automatically create a span for every incoming request
FastAPIInstrumentor.instrument_app(app)

🔍 What’s happening?

  • Every request creates a trace span.
  • Spans are exported to console (for demo).
  • In production, export to Jaeger, Zipkin, or Tempo.

⚡ Step 4: Integrating with Prometheus and Grafana

Prometheus

  • Prometheus scrapes metrics from /metrics (its default metrics_path, which matches where the instrumentator exposes them).
  • Configure prometheus.yml:

scrape_configs:
  - job_name: "fastapi"
    static_configs:
      - targets: ["localhost:8000"]

Run Prometheus:

docker run -p 9090:9090 -v "$(pwd)/prometheus.yml":/etc/prometheus/prometheus.yml prom/prometheus

Note: inside the container, localhost refers to the container itself, not your machine. On Docker Desktop, set the scrape target to host.docker.internal:8000 instead.

Grafana

  • Grafana visualizes Prometheus metrics.
  • Run Grafana:

docker run -d -p 3000:3000 grafana/grafana

  • Add Prometheus as a data source (http://localhost:9090).
  • Create dashboards for:
    • Request rate
    • Error rate
    • Latency percentiles
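Assuming the default metric names from Step 2 (and the instrumentator's default status grouping into "2xx", "4xx", "5xx"), typical dashboard queries look like this in PromQL:

```promql
# Request rate (per second, averaged over 5 minutes)
rate(http_requests_total[5m])

# Error rate: share of 5xx responses
sum(rate(http_requests_total{status="5xx"}[5m]))
  / sum(rate(http_requests_total[5m]))

# 95th-percentile latency
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
```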

🧠 Best Practices

  • Use structured logs (JSON) for easier parsing.
  • Collect metrics for requests, latency, and errors.
  • Add tracing to debug distributed systems.
  • Use Prometheus + Grafana for monitoring and visualization.
  • Set up alerts (e.g. high error rate) in Grafana.

🧪 Practice Challenge

  1. Add a custom metric for the number of tasks created.
  2. Configure Grafana to alert when error rate > 5%.
  3. Export traces to Jaeger and visualize request flows.

🧠 Recap

In this lesson, you:

  • Configured logging for FastAPI.
  • Exposed metrics with prometheus-fastapi-instrumentator.
  • Added tracing with OpenTelemetry.
  • Integrated with Prometheus and Grafana for monitoring and visualization.