Getting Started with AWS

Download and Install

You can download a precompiled binary from the releases page, or fetch it from the command line (the example below is for Linux on x86_64):

curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery
chmod a+x cloudquery
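
The URL above points at the linux_x86_64 build, so it can help to confirm that the machine matches before downloading. The following is a minimal sketch, not part of CloudQuery; the function name cq_platform_matches is hypothetical, and it only compares the output of uname against the platform named in the URL.

```shell
# Succeed only when the platform matches the linux_x86_64 asset
# referenced in the download URL. The optional arguments override the
# values reported by uname, which makes the check easy to try out.
# The body runs in a subshell so the helper variables do not leak.
cq_platform_matches() (
  os="${1:-$(uname -s)}"
  arch="${2:-$(uname -m)}"
  [ "$os" = "Linux" ] && [ "$arch" = "x86_64" ]
)
```

For example, running cq_platform_matches || echo "choose another binary from the releases page" prints the hint on any other platform.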

Running

Init command

After installing CloudQuery, generate a cloudquery.yml file that describes which cloud provider you want to use and which resources CloudQuery should extract, transform, and load (ETL):

cloudquery init aws

# cloudquery init aws gcp # This will generate a config containing aws and gcp providers
# cloudquery init --help # Show all possible auto generated configs and flags

All official and approved community providers are listed at CloudQuery Hub with their respective documentation.

Spawn or connect to a Database

CloudQuery needs a PostgreSQL database (>=10). You can either spawn a local one (usually good for development and local testing) or connect to an existing one.

By default, CloudQuery tries to connect to the database postgres on localhost:5432 with username postgres and password pass. After installing Docker, you can create such a local PostgreSQL instance with:

docker run --name cloudquery_postgres -p 5432:5432 -e POSTGRES_PASSWORD=pass -d postgres
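
The container needs a few seconds before it accepts connections, so a script that continues straight away may fail. The following is a minimal sketch of a generic retry helper; retry_until_ok is a hypothetical name, and the usage line assumes the pg_isready tool shipped in the official postgres image.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# The body runs in a subshell, so "exit" only sets the return status.
retry_until_ok() (
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      exit 0
    fi
    # Sleep only when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  exit 1
)
```

With the container started as above, retry_until_ok 30 1 docker exec cloudquery_postgres pg_isready -U postgres waits up to roughly 30 seconds for the database to become ready.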

If you are running postgres at a different location or with different credentials, you need to edit cloudquery.yml - see the Connect to an Existing Database tab.

Authenticate with AWS

CloudQuery needs to be authenticated with your AWS account in order to fetch information about your cloud setup.

Note: CloudQuery requires only read permissions (it never makes any changes to your cloud setup). Attaching the ReadOnlyAccess policy to the user or role that CloudQuery runs as works in most cases, but you can restrict it further to read-only access for the specific set of resources that you want CloudQuery to fetch. See also this blog post.

There are multiple ways to authenticate with AWS, and CloudQuery respects the AWS credential provider chain. This means that CloudQuery tries the following sources, in order, when attempting to authenticate:

  • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables.
  • The credentials and config files in the ~/.aws folder (in that order).
  • IAM roles for AWS compute resources (including EC2 instances, Fargate, and ECS containers).

You can find more info about AWS authentication here and here.
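
To see which of these sources is likely to be picked up on a given machine, the priority order can be sketched as a small shell function. This is a simplified illustration under stated assumptions, not the actual logic of the AWS SDK: the function name aws_credential_source and its output labels are hypothetical, it only checks that the variables or files exist, and it ignores settings such as named profiles.

```shell
# Print the first credential source that appears to be present,
# following the priority order listed above. Presence only: the
# credentials themselves are not validated.
aws_credential_source() (
  aws_dir="${HOME:-}/.aws"
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "environment"
  elif [ -f "${aws_dir}/credentials" ]; then
    echo "credentials-file"
  elif [ -f "${aws_dir}/config" ]; then
    echo "config-file"
  else
    # Nothing found locally; on AWS compute an IAM role may still apply.
    echo "iam-role-or-none"
  fi
)
```

Calling aws_credential_source before a fetch gives a quick hint about where credentials will probably come from.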

CloudQuery can use the credentials from the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN environment variables (AWS_SESSION_TOKEN is needed only when you use temporary credentials). For information about obtaining credentials, see the AWS guide.

export AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
export AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
export AWS_SESSION_TOKEN={Your AWS session token}

Multi Account/Organization Access

If you have multiple AWS accounts or organizations, follow the steps set out in the cq-provider-aws README.

Fetch Command

Once cloudquery.yml is generated and you are authenticated with AWS, run the following command to fetch the resources.

cloudquery fetch
# cloudquery fetch --help # Show all possible fetch flags
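
Because fetch depends on the generated configuration, a script can check that cloudquery.yml exists before running it. The following is a minimal sketch; cq_config_present is a hypothetical helper, and it assumes the file sits in the current directory unless another directory is passed.

```shell
# Succeed only when cloudquery.yml exists in the given directory
# (defaults to the current directory).
cq_config_present() (
  dir="${1:-.}"
  [ -f "${dir}/cloudquery.yml" ]
)
```

For example, cq_config_present && cloudquery fetch runs the fetch only when the configuration file is in place.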

Exploring and Running Queries

Once CloudQuery has fetched the resources, you can explore your cloud infrastructure with SQL!

You can use psql to connect to your PostgreSQL instance (change the connection string to match the location and credentials of your database):

psql "postgres://postgres:pass@localhost:5432/postgres?sslmode=disable"
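
When the database is not running with the defaults described earlier (user postgres, password pass, localhost:5432, database postgres), the connection string has to be assembled from your own values. The following is a minimal sketch; pg_conn_string is a hypothetical helper, and it assumes none of the values contain characters that need percent-encoding in a URL.

```shell
# Build a PostgreSQL connection string; every argument falls back to
# the default described in this guide.
# Usage: pg_conn_string [USER] [PASSWORD] [HOST] [PORT] [DATABASE]
pg_conn_string() (
  user="${1:-postgres}"
  password="${2:-pass}"
  host="${3:-localhost}"
  port="${4:-5432}"
  database="${5:-postgres}"
  printf 'postgres://%s:%s@%s:%s/%s?sslmode=disable\n' \
    "$user" "$password" "$host" "$port" "$database"
)
```

The result can be passed straight to psql, for example psql "$(pg_conn_string cq secret db.internal 5433 inventory)", where all five values are placeholders.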

Schema and tables for AWS are available in CloudQuery Hub.

A few example queries for AWS:

List EC2 images:

SELECT * FROM aws_ec2_images;

Find all public-facing AWS load balancers:

SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';

Policy Command

CloudQuery Policies allow users to write security, governance, cost, and compliance rules, using SQL as the query layer and HCL as the logical layer.

All official and approved community policies are listed on CloudQuery Hub.

Execute a policy

All official policies are hosted at https://github.com/cloudquery-policies.

cloudquery policy run aws//cis_v1.2.0

Next Steps

On CloudQuery Hub, you can read more about the CloudQuery AWS provider, including the SQL schema and advanced configuration.