Monitoring and Observability

🎯 Learning Goals

By the end of this lesson, you will be able to:

  • Configure structured logging in FastAPI.
  • Collect metrics (requests, latency, errors) and expose them for monitoring.
  • Add tracing to follow requests across services.
  • Integrate FastAPI with Prometheus and visualize metrics in Grafana.

⚡ Step 1: Logging

Logging is the foundation of observability. In production, logs should be structured and easy to parse.

📄 main.py

import logging
from fastapi import FastAPI

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s"
)

logger = logging.getLogger("taskmanager")

app = FastAPI()

@app.get("/health")
def health_check():
    logger.info("Health check endpoint called")
    return {"status": "ok"}

🔍 What’s happening?

  • Logs include timestamp, severity, logger name, and message.
  • Use logger.info(), logger.error(), etc. in your endpoints.
  • In production, logs can be shipped to ELK (Elasticsearch, Logstash, Kibana) or Loki.
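Structured logs are easiest to ship when each record is a single JSON object. You can get that with the standard library alone; a minimal sketch (the JsonFormatter class name is ours):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("taskmanager")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Health check endpoint called")
```

In a real deployment you would more likely use a library such as python-json-logger, but the idea is the same: one parseable JSON document per log line.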

⚡ Step 2: Metrics

Metrics give quantitative insights (e.g. request counts, latency).

Install prometheus-fastapi-instrumentator:

pip install prometheus-fastapi-instrumentator

📄 main.py

from prometheus_fastapi_instrumentator import Instrumentator

# Attach to the existing `app` from Step 1 (place after `app = FastAPI()`)
instrumentator = Instrumentator().instrument(app).expose(app)

Now your app exposes metrics at /metrics.

Example metrics:

  • http_requests_total → number of requests
  • http_request_duration_seconds → latency histogram
  • http_requests_inprogress → concurrent requests (opt-in: pass should_instrument_requests_inprogress=True to Instrumentator)
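Beyond the built-in metrics, you can register your own with prometheus_client (the library the instrumentator builds on). A sketch of a custom counter; the tasks_created name is illustrative, and the dedicated registry just keeps the example self-contained (in the app you would normally use the default global registry):

```python
from prometheus_client import CollectorRegistry, Counter

# Dedicated registry for the example; omit `registry=` to use the global one
registry = CollectorRegistry()

tasks_created = Counter(
    "tasks_created",                  # exported as tasks_created_total
    "Total number of tasks created",
    registry=registry,
)

# Call .inc() wherever a task is created, e.g. inside a POST /tasks endpoint
tasks_created.inc()
tasks_created.inc()

print(registry.get_sample_value("tasks_created_total"))  # 2.0
```

Counters only go up; for values that can rise and fall (queue depth, open connections) use a Gauge instead.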

⚡ Step 3: Tracing

Tracing lets you follow a request across services. Use OpenTelemetry.

Install:

pip install opentelemetry-api opentelemetry-sdk opentelemetry-instrumentation-fastapi

📄 main.py

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

# Set up a tracer provider that batches spans and prints them to the console
trace.set_tracer_provider(TracerProvider())
span_processor = BatchSpanProcessor(ConsoleSpanExporter())
trace.get_tracer_provider().add_span_processor(span_processor)

# Automatically create a span for every incoming request
FastAPIInstrumentor.instrument_app(app)

🔍 What’s happening?

  • Every request creates a trace span.
  • Spans are exported to console (for demo).
  • In production, export to Jaeger, Zipkin, or Tempo.

⚡ Step 4: Integrating with Prometheus and Grafana

Prometheus

  • Prometheus scrapes metrics from /metrics (its default metrics_path, which matches where the instrumentator exposes them).
  • Configure prometheus.yml:

scrape_configs:
  - job_name: "fastapi"
    static_configs:
      - targets: ["localhost:8000"]

Run Prometheus:

docker run -p 9090:9090 -v "$(pwd)/prometheus.yml":/etc/prometheus/prometheus.yml prom/prometheus

Note: inside the container, localhost refers to the container itself, not your machine. On Docker Desktop, set the scrape target to host.docker.internal:8000 instead.

Grafana

  • Grafana visualizes Prometheus metrics.
  • Run Grafana:

docker run -d -p 3000:3000 grafana/grafana

  • Add Prometheus as a data source (http://localhost:9090).
  • Create dashboards for:
    • Request rate
    • Error rate
    • Latency percentiles
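Assuming the default metric names from Step 2 (and the instrumentator's default status grouping into "2xx", "4xx", "5xx"), typical dashboard queries look like this in PromQL:

```promql
# Request rate (per second, averaged over 5 minutes)
rate(http_requests_total[5m])

# Error rate: share of 5xx responses
sum(rate(http_requests_total{status="5xx"}[5m]))
  / sum(rate(http_requests_total[5m]))

# 95th-percentile latency
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
```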

🧠 Best Practices

  • Use structured logs (JSON) for easier parsing.
  • Collect metrics for requests, latency, and errors.
  • Add tracing to debug distributed systems.
  • Use Prometheus + Grafana for monitoring and visualization.
  • Set up alerts (e.g. high error rate) in Grafana.

🧪 Practice Challenge

  1. Add a custom metric for the number of tasks created.
  2. Configure Grafana to alert when error rate > 5%.
  3. Export traces to Jaeger and visualize request flows.

🧠 Recap

In this lesson, you:

  • Configured logging for FastAPI.
  • Exposed metrics with prometheus-fastapi-instrumentator.
  • Added tracing with OpenTelemetry.
  • Integrated with Prometheus and Grafana for monitoring and visualization.