Configuration
#
SetupMake sure you read our Getting Started section first.
#
InstallationCloudQuery History needs a PostgreSQL (version >= 11) instance with TimescaleDB extension (version >= 2.6) installed or a TimescaleDB instance (version >= 12). You can either spawn a local one (usually good for development and local testing) or connect to an existing one. For more install options of TimescaleDB see their guide.
For local, you can use the following docker command:
docker run -d --name localTimeScaleDB -p 5432:5432 -e POSTGRES_PASSWORD=pass timescale/timescaledb:latest-pg12
#
Configuration#
DSN FormatUsing TimescaleDB, the type
parameter in connection
block of config.hcl
should be set to tsdb
.
cloudquery { connection { type = "tsdb" username = "user" password = "pass" host = "localhost" port = 5432 database = "localTimeScaleDB" sslmode = "disable" }}
Alternatively if using the URI DSN format, you can start it with tsdb://
, as such:
cloudquery { connection { dsn = "tsdb://user:[email protected]:5432/localTimeScaleDB?sslmode=disable" }}
This way CloudQuery knows you're pointing to a TimescaleDB instance in the connection block.
#
History SettingsThe history block in our config.hcl
allows us to configure our history preferences. Currently we have the following options:
- Retention [Optional]: defines the retention policy of your data and how long it should exist in days. The default is 7 days.
- Interval [Optional]: Defines how history chunks are split by time, defaults to one chunk per 24 hours. See TimescaleDB docs for more info on this behavior.
- Truncation [Optional]: truncates fetch time by hour, for example if we fetch with TimeTruncation = 1 at 11:25 the fetch date will truncate to 11:00, defaults to 24 hours, which means one set of fetch data per day. Data fetched on the same truncation date will be replaced.
cloudquery { # history block configuration makes cloudquery run in history mode history { // Save data retention for 7 days retention = 7 // Truncate our fetch by 6 hours per fetch truncation = 6 // Tell Timescale to split our chunks in a daily interval (24 hours) interval = 24 } # ... continuation of cloudquery block (connection, providers, etc')}
#
QueryingCloudQuery will create a new schema called history
which holds full historical data for each table. In our public schema you will have views
that fetch the latest fetch data of each table. This allows the previous queries before enabling history mode to work out-of-the-box.
-- Will select all historical fetches of aws_iam_usersSELECT * FROM history.aws_iam_users
-- Will select only the latest fetch of aws_iam_usersSELECT * from aws_iam_users
#
ExamplesSelect all S3 buckets that were fetched yesterday and don't exist today, i.e find deleted buckets.
SELECT * FROM history.aws_s3_buckets WHERE cq_fetch_date BETWEEN now() - interval '2 day' and now() - interval '1 day'EXCEPTSELECT * FROM aws_s3_buckets