# Getting Started with AWS
## Download and Install

You can download the precompiled binary from the releases page, or install it from the command line:

### Linux

Precompiled binaries (x86_64):

```bash
curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_x86_64 -o cloudquery
chmod a+x cloudquery
```

Precompiled binaries (arm64):

```bash
curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_linux_arm64 -o cloudquery
chmod a+x cloudquery
```

### macOS

Homebrew:

```bash
brew install cloudquery/tap/cloudquery
# After the initial install you can upgrade the version via:
# brew upgrade cloudquery
```

Precompiled binaries (x86_64):

```bash
curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_darwin_x86_64 -o cloudquery
chmod a+x cloudquery
```

Precompiled binaries (arm64):

```bash
curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_darwin_arm64 -o cloudquery
chmod a+x cloudquery
```

### Windows

CMD:

```
curl -L https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_windows_x86_64.exe -o cloudquery.exe
```

PowerShell:

```powershell
Invoke-WebRequest https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_windows_x86_64.exe -o cloudquery.exe
```
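The release assets follow the pattern `cloudquery_<os>_<arch>`, so a small shell sketch can compute the right download URL for the current machine. This is an illustration only, assuming Linux or macOS; the `uname`-to-asset-name mapping is inferred from the URLs above:

```shell
#!/bin/sh
# Build the download URL for the current OS/architecture.
# Asset naming (cloudquery_<os>_<arch>) is taken from the release URLs above.
OS=$(uname -s | tr '[:upper:]' '[:lower:]')   # "linux" or "darwin"
ARCH=$(uname -m)                              # "x86_64", "arm64", or "aarch64"
case "$ARCH" in
  aarch64) ARCH=arm64 ;;                      # Linux reports arm64 as aarch64
esac
echo "https://github.com/cloudquery/cloudquery/releases/latest/download/cloudquery_${OS}_${ARCH}"
```

Feeding the printed URL to the `curl -L ... -o cloudquery` command above gives a single install snippet that works on either platform.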
## Running

### Init command

After installing CloudQuery, you need to generate a `config.hcl` file that describes which cloud provider you want to use and which resources you want CloudQuery to ETL:

```bash
cloudquery init aws

# cloudquery init aws gcp  # This will generate a config containing aws and gcp providers
# cloudquery init --help   # Show all possible auto-generated configs and flags
```
All official and approved community providers are listed at CloudQuery Hub with their respective documentation.
## Spawn or Connect to a Database

CloudQuery needs a PostgreSQL database (version >11). You can either spawn a local one (usually good for development and local testing) or connect to an existing one.

### Spawn a local database with Docker

By default, CloudQuery will try to connect to the database `postgres` on `localhost:5432`, with username `postgres` and password `pass`. After installing Docker, you can create such a local PostgreSQL instance with:

```bash
docker run --name cloudquery_postgres -p 5432:5432 -e POSTGRES_PASSWORD=pass -d postgres
```

If you are running PostgreSQL at a different location or with different credentials, you need to edit `config.hcl` - see "Connect to an existing database" below.

### Connect to an existing database

CloudQuery connects to the PostgreSQL database defined in the `connection` section of `config.hcl`. Edit this section to configure the location and credentials of your database:

```hcl
cloudquery {
  ...
  connection {
    dsn = "host=localhost user=postgres password=pass database=postgres port=5432"
  }
}
```
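The same `dsn` keyword format also works for a remote database. As a sketch, with a hypothetical host, user, and database name that you would replace with your own:

```hcl
connection {
  # Hypothetical remote PostgreSQL instance; substitute your own values.
  dsn = "host=db.example.com user=cloudquery password=<your-password> database=cloudquery port=5432"
}
```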
## Authenticate with AWS

CloudQuery needs to be authenticated with your AWS account in order to fetch information about your cloud setup.

> **Info:** CloudQuery requires only read permissions (we will never make any changes to your cloud setup). Attaching the `ReadOnlyAccess` policy to the user/role CloudQuery is running as should work for the most part, but you can fine-tune it further to allow read-only access to only the specific set of resources that you want CloudQuery to fetch. See also this blog post.
There are multiple ways to authenticate with AWS, and CloudQuery respects the AWS credential provider chain. This means that CloudQuery attempts authentication in the following order of priority:

1. The `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` environment variables.
2. The `credentials` and `config` files in the `~/.aws` folder (in that order of priority).
3. IAM roles for AWS compute resources (including EC2 instances, Fargate, and ECS containers).

You can find more information about AWS authentication here and here.
### Environment variables

CloudQuery can use the credentials from the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` environment variables (`AWS_SESSION_TOKEN` can be optional for some accounts). For information about obtaining credentials, see the AWS guide.

Linux and macOS:

```bash
export AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
export AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
export AWS_SESSION_TOKEN={Your AWS session token}
```

Windows (CMD):

```
SET AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
SET AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
SET AWS_SESSION_TOKEN={Your AWS session token}
```

Windows (PowerShell):

```powershell
$Env:AWS_ACCESS_KEY_ID={Your AWS Access Key ID}
$Env:AWS_SECRET_ACCESS_KEY={Your AWS secret access key}
$Env:AWS_SESSION_TOKEN={Your AWS session token}
```
### Shared configuration files

CloudQuery can use credentials from your `credentials` and `config` files in the `.aws` directory in your home folder. The contents of these files are practically interchangeable, but CloudQuery will prioritize credentials in the `credentials` file. For information about obtaining credentials, see the AWS guide.

Here are example contents for a `credentials` file:

```ini
[default]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>
```

You can also specify credentials for a different profile, and instruct CloudQuery to use those credentials instead of the default ones. For example:

```ini
[myprofile]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>
```

Then, you can either export the `AWS_PROFILE` environment variable:

```bash
export AWS_PROFILE=myprofile
```

or configure your desired profile in the `local_profile` field of your CloudQuery `config.hcl`:

```hcl
provider "aws" {
  configuration {
    accounts "<account_alias>" {
      local_profile = "myprofile"
    }
    ...
  }
  ...
}
```
### IAM roles for AWS compute resources

CloudQuery can use IAM roles for AWS compute resources (including EC2 instances, Fargate, and ECS containers). If you configured your AWS compute resources with IAM, CloudQuery will use these roles automatically; you don't need to specify additional credentials manually. For more information on configuring IAM, see the AWS docs here and here.
## Multi Account/Organization Access

If you have multiple AWS accounts/organizations, you can follow the steps described in the cq-provider-aws README.
## Fetch Command

Once `config.hcl` is generated and you are authenticated with AWS, run the following command to fetch the resources:

```bash
cloudquery fetch

# cloudquery fetch --help  # Show all possible fetch flags
```
## Exploring and Running Queries

Once CloudQuery has fetched the resources, you can explore your cloud infrastructure with SQL!

### Running psql locally

You can use `psql` to connect to your PostgreSQL instance (you will need to change the connection string to match the location and credentials of your database):

```bash
psql "postgres://postgres:pass@localhost:5432/postgres?sslmode=disable"
```

### Running psql in Docker

If you opted to run the PostgreSQL server in Docker as described above, you can also run `psql` directly from the Docker container instead of installing it on your machine:

```bash
docker exec -it cloudquery_postgres psql -U postgres
```
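Once connected, `psql` meta-commands are handy for discovering what CloudQuery created. For example, `\dt` accepts a pattern, so (assuming the AWS provider's `aws_`-prefixed table naming shown in the queries below) you can list all AWS tables with:

```
\dt aws_*
```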
Schema and tables for AWS are available in CloudQuery Hub.

A few example queries for AWS:

### List ec2_images

```sql
SELECT * FROM aws_ec2_images;
```

### Find all public-facing AWS load balancers

```sql
SELECT * FROM aws_elbv2_load_balancers WHERE scheme = 'internet-facing';
```
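Ordinary SQL aggregations work as well. As a sketch, assuming the load balancer table stores its AWS region in a `region` column (check the schema in CloudQuery Hub for the exact column names), you could count internet-facing load balancers per region:

```sql
SELECT region, count(*) AS lb_count
FROM aws_elbv2_load_balancers
WHERE scheme = 'internet-facing'
GROUP BY region
ORDER BY lb_count DESC;
```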
## Policy Command

CloudQuery Policies allow users to write security, governance, cost, and compliance rules, using SQL as the query layer and HCL as the logical layer.

All official and approved community policies are listed on CloudQuery Hub.

### Execute a policy

All official policies are hosted at https://github.com/cloudquery-policies.

```bash
cloudquery policy run aws//cis_v1.2.0
```
## Next Steps

At CloudQuery Hub, you can read more about the CloudQuery AWS provider, including exploring the SQL schema and advanced configurations.