Getting Started with AWS

Download and Install#

You can download the precompiled binary from releases, or fetch it from the command line:

curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery
chmod a+x cloudquery

Running#

Init command#

After installing CloudQuery, generate a config.hcl file that describes which cloud provider you want to use and which resources you want CloudQuery to extract, transform, and load (ETL):

cloudquery init aws
# cloudquery init aws gcp # This will generate a config containing aws and gcp providers
# cloudquery init --help # Show all possible auto generated configs and flags

All official and approved community providers are listed at CloudQuery Hub with their respective documentation.

Spawn or connect to a Database#

CloudQuery needs a PostgreSQL database (>11). You can either spawn a local one (usually good for development and local testing) or connect to an existing one.

By default, CloudQuery tries to connect to the database postgres on localhost:5432 with username postgres and password pass. After installing Docker, you can create such a local Postgres instance with:

docker run --name cloudquery_postgres -p 5432:5432 -e POSTGRES_PASSWORD=pass -d postgres
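
The container takes a few seconds to start accepting connections, so a fetch launched immediately after docker run can fail to connect. A minimal sketch of a readiness check follows. It assumes the container name cloudquery_postgres from the command above and relies on pg_isready, which ships in the official postgres image. The function name wait_for_postgres and the retry count are illustrative, not part of CloudQuery.

```shell
# Poll the container until PostgreSQL accepts connections, or give up.
# Usage: wait_for_postgres [container] [max_attempts]
# (hypothetical helper; not a CloudQuery command)
wait_for_postgres() (
    container="${1:-cloudquery_postgres}"
    max_attempts="${2:-30}"
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # pg_isready exits 0 once the server accepts connections
        if docker exec "$container" pg_isready -U postgres >/dev/null 2>&1; then
            echo "postgres ready after ${attempt} attempt(s)"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep 1
    done
    echo "postgres not ready after ${max_attempts} attempts" >&2
    return 1
)
```

Calling wait_for_postgres before the first fetch is one way to avoid a connection error on a freshly started container.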

If you are running postgres at a different location or with different credentials, you need to edit config.hcl - see the Connect to an Existing Database tab.

Authenticate with AWS#

CloudQuery needs to be authenticated with your AWS account in order to fetch information about your cloud setup.

CloudQuery requires only read permissions (we will never make any changes to your cloud setup). Attaching the ReadOnlyAccess policy to the user/role CloudQuery is running as should work for the most part, but you can fine-tune it even more to have read-only access for the specific set of resources that you want CloudQuery to fetch. See also this blogpost.

There are multiple ways to authenticate with AWS, and CloudQuery respects the AWS credential provider chain. This means that CloudQuery looks for credentials in the following order:

  • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables.
  • The credentials and config files in the ~/.aws folder (in that order).
  • IAM roles for AWS compute resources (including EC2 instances, Fargate and ECS containers).

You can find more info about AWS authentication here and here.
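
To see which of these sources is likely to be picked up on a given machine, the priority order above can be approximated in a few lines of shell. This is a simplified sketch, not the AWS SDK's actual logic: the real chain has additional steps and honors overrides such as AWS_SHARED_CREDENTIALS_FILE, which are ignored here. The function name detect_aws_credential_source and the labels it prints are hypothetical.

```shell
# Print the first credential source found, following the order listed above.
# Simplified approximation only; it does not validate the credentials.
detect_aws_credential_source() (
    aws_dir="${HOME}/.aws"
    if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "environment"
    elif [ -f "${aws_dir}/credentials" ]; then
        echo "shared-credentials-file"
    elif [ -f "${aws_dir}/config" ]; then
        echo "shared-config-file"
    else
        # Nothing local: on EC2, ECS or Fargate an IAM role may still apply.
        echo "iam-role-or-none"
    fi
)
```

If the function prints iam-role-or-none on a laptop or workstation, no local credentials were found and a fetch will most likely fail to authenticate.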

CloudQuery can use the credentials from the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables (AWS_SESSION_TOKEN can be optional for some accounts). For information about obtaining credentials, see the AWS guide.

export AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
export AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
export AWS_SESSION_TOKEN={Your AWS session token}

Multi Account/Organization Access#

If you have multiple AWS accounts/organizations, you can follow the steps set out in the cq-provider-aws README.

Fetch Command#

Once config.hcl is generated and you are authenticated with AWS, run the following command to fetch the resources.

cloudquery fetch
# cloudquery fetch --help # Show all possible fetch flags

Exploring and Running Queries#

Once CloudQuery has fetched the resources, you can explore your cloud infrastructure with SQL!

You can use psql to connect to your Postgres instance (change the connection string to match the location and credentials of your database):

psql "postgres://postgres:pass@localhost:5432/postgres?sslmode=disable"
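
When the database lives elsewhere, it can help to assemble the connection string from its parts instead of editing it by hand. The sketch below assumes the standard postgres:// URI layout and uses the local defaults described earlier (user postgres, password pass, localhost:5432, database postgres). The helper name build_pg_uri is hypothetical, and the function does not percent-encode special characters in passwords.

```shell
# Build a PostgreSQL connection URI from its parts.
# Usage: build_pg_uri [user] [password] [host] [port] [database] [sslmode]
# Caveat: values are inserted as-is; special characters are not percent-encoded.
build_pg_uri() (
    user="${1:-postgres}"
    password="${2:-pass}"
    host="${3:-localhost}"
    port="${4:-5432}"
    database="${5:-postgres}"
    sslmode="${6:-disable}"
    printf 'postgres://%s:%s@%s:%s/%s?sslmode=%s\n' \
        "$user" "$password" "$host" "$port" "$database" "$sslmode"
)

# Example: connect to the default local instance
# psql "$(build_pg_uri)"
```

Called without arguments, build_pg_uri prints the same connection string as the local setup uses; pass explicit values to target another host or database.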

Schema and tables for AWS are available in CloudQuery Hub.

A few example queries for AWS:

List ec2_images:#

SELECT * FROM aws_ec2_images;

Find all public-facing AWS load balancers:#

SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';
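
Queries like the ones above can also be run non-interactively, which is convenient for scripts and scheduled checks. The following is a small sketch rather than a CloudQuery feature: the wrapper name run_query is hypothetical, and it assumes psql 12 or newer, since that is when the --csv option was added.

```shell
# Run a single SQL statement and print the result as CSV.
# Usage: run_query <connection-uri> <sql>
# -X skips ~/.psqlrc so local settings do not alter the output.
run_query() (
    uri="$1"
    sql="$2"
    psql -X --csv -c "$sql" "$uri"
)

# Example (assumes the local database from the earlier steps):
# run_query "postgres://postgres:pass@localhost:5432/postgres?sslmode=disable" \
#     "SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';"
```

Redirecting the output to a file gives a CSV snapshot of the result that can be compared between fetches.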

Policy Command#

CloudQuery Policies allow users to write security, governance, cost, and compliance rules, using SQL as the query layer and HCL as the logical layer.

All official and approved community policies are listed on CloudQuery Hub.

Execute a policy#

All official policies are hosted at https://github.com/cloudquery-policies.

cloudquery policy run aws//cis_v1.2.0

Next Steps#

At CloudQuery Hub, you can read more about the CloudQuery AWS provider, including the SQL schema and advanced configuration options.