CloudQuery can be monitored in two main ways: structured logging and OpenTelemetry tracing.
CloudQuery uses structured logging (in plain-text and JSON formats), which can be analyzed with local tools such as grep and with remote log aggregation tools such as Datadog, or any other log aggregator that supports structured logging.
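As a minimal sketch of local analysis with grep, assuming the log was written in JSON format: the log lines and field names below (`level`, `module`, `table`, `message`) are hypothetical samples made up for illustration, so check them against your own log output before reusing the patterns.

```shell
# Create a throwaway log file with hypothetical JSON log lines.
# Real CloudQuery log entries may use different fields.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
{"level":"info","module":"cli","message":"sync started"}
{"level":"info","module":"aws-src","table":"aws_s3_buckets","message":"table resolver started"}
{"level":"error","module":"aws-src","table":"aws_s3_buckets","message":"access denied"}
{"level":"info","module":"aws-src","table":"aws_ec2_instances","message":"table resolver finished"}
EOF

# Count the error-level entries.
error_count=$(grep -c '"level":"error"' "$log_file")

# Keep only the entries that mention one specific table.
table_lines=$(grep '"table":"aws_s3_buckets"' "$log_file")

echo "errors: $error_count"
echo "$table_lines"

# Clean up the temporary file.
rm -f "$log_file"
```

Matching on exact substrings like this only works when the JSON is written without extra whitespace; for anything more involved, a JSON-aware tool is a safer choice than grep.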
ELT workloads can be long-running, and sometimes it is necessary to understand which calls take the most time so that you can optimize them on the plugin side, ignore them, or split them into a separate workload.
To collect traces you need a collector that supports the OpenTelemetry protocol, such as the OpenTelemetry Collector. You can then use a tool such as Jaeger to visualize and analyze the traces.
To start Jaeger locally you can use Docker:
    docker run -d \
      -e COLLECTOR_OTLP_ENABLED=true \
      -p 16686:16686 \
      -p 4318:4318 \
      jaegertracing/all-in-one:latest
Then add the following to the source spec:
    kind: source
    spec:
      name: "aws"
      path: "cloudquery/aws"
      registry: "cloudquery"
      version: "v22.19.2"
      tables: ["aws_s3_buckets"]
      destinations: ["postgresql"]
      otel_endpoint: "localhost:4318"
      otel_endpoint_insecure: true # this is only in development when running local jaeger
      spec:
After that, you can open http://localhost:16686 to view the traces.