CloudQuery Cloud

What is CloudQuery Cloud

CloudQuery Cloud is a great way to get started with CloudQuery and to sync data from a source to a destination without deploying your own infrastructure. The integrated OAuth authentication support also makes it much easier to connect to sources and destinations. You only need to select a source and a destination integration, and CloudQuery will take care of the rest.

With CloudQuery Cloud, you can:

Prerequisites

To run a sync, you will need to know where you want to store the data (a database or a data warehouse), and what sources you want CloudQuery to connect to and sync data from.

If you don't have a database available, we recommend trying a managed service such as Neon or Aiven. Both offer a free tier sufficient for your initial experiments.

Creating Your First Sync

First, head to CloudQuery Cloud and sign up for an account. After you sign up, you will be asked to specify a team name. The team name is important if you later decide to create your own integrations. Please note that the team name cannot be changed later.

After you have signed up and created a team, you can create your first sync from the Syncs tab.

Create Sync

Clicking the button will start a wizard that will walk you through the whole process.

Configure a Source

First, you need to define a Source - a configuration of a CloudQuery integration that will define what service CloudQuery will connect to and get the data from. To do this, select a source integration from the list of available sources. In this example, we will use AWS as the source.

Select a source

Authentication

Once you select the source integration, its configuration UI will load. Notice the Setup Guide on the right side that will guide you through the individual configuration options for the specific integration.

Source Configuration

Some integrations support an OAuth connection; others require an API key or an access token. The AWS integration will help you deploy a CloudFormation stack to create an IAM role, which CloudQuery will use to access the AWS data.

Selecting Tables to Sync

Some integrations allow you to select which tables to sync. Integrations with a large number of tables (such as AWS) have a list of services to sync from instead. This makes the configuration easier. Should you require finer granularity, we recommend using self-hosted syncs instead.

Here is a screenshot of the AWS services selection:

Selecting AWS Services

Here is a screenshot of the GitHub tables selection:

Selecting GitHub Tables

Advanced Options

Some integrations allow you to configure additional options for the source that may help you improve the sync performance and timing or its rate-limiting behavior.

Testing the connection

Once the integration is configured, click the Test Connection button to verify that CloudQuery can connect to the source. If the connection is successful, a new Source will be created and you can proceed to the next step. The Source can also be reused to set up another sync later.

Configure a Destination

Next, you need to configure where the data will be stored. For this, you will set up a Destination - a configuration of a CloudQuery integration that will enable CloudQuery to connect to the destination database.

Choose a CloudQuery integration from the list of available destinations. PostgreSQL is a good choice for most use cases and is usually the easiest to get started with.

Select Destination

Once you select the integration, follow the setup guide on the right side to configure the connection to the destination. For most destinations, you will be asked to provide the host name, the database name, and the credentials to authenticate with.

Select Destination

If you use a managed database service, such as Neon or Aiven, the configuration should be straightforward with a connection string. If you deployed your own database in a cloud provider, you may need to make sure the database is accessible from the CloudQuery Cloud IP addresses. See the sidebar on the right in the integration configuration UI for the list of IP addresses.
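As a rough illustration of what such a connection string looks like, the sketch below assembles a PostgreSQL connection URI in the standard libpq form. The helper name, user, password, host, and database name are all made up for this example, and the `sslmode=require` setting is an assumption; use the values your database provider gives you.

```python
from urllib.parse import quote


def postgres_connection_string(user, password, host, dbname, port=5432, sslmode="require"):
    """Build a libpq-style PostgreSQL connection URI (hypothetical helper)."""
    # Percent-encode the parts that may contain reserved characters such as
    # '@' or '/', which would otherwise break the URI structure.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    dbname_enc = quote(dbname, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{dbname_enc}?sslmode={sslmode}"


# Example values only; none of these come from CloudQuery or a real provider.
conn = postgres_connection_string("cq_user", "p@ss/word", "db.example.com", "cloudquery")
print(conn)
```

Encoding the password matters in practice: an unescaped `@` in a password is a common reason a pasted connection string fails the connection test.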

Click Test Connection to save the Destination and proceed to the next step.

Configure Sync Options and Schedule

On the last page, you can configure the sync name, schedule, allocated resources, and incremental syncs.

Schedule

You can configure syncs to run Daily, Weekly, Monthly, or on a custom schedule using cron syntax.

If you configure syncs to run Daily, they will run at midnight every day. Weekly syncs will run at midnight on Mondays, and Monthly syncs will run at midnight on the first day of each month. All times are in UTC.
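For reference, the three presets correspond to the following standard five-field cron expressions, which can serve as a starting point for a custom schedule. The sketch below also computes the next run time for a preset. The `PRESET_CRON` mapping and the `next_run` helper are illustrative names, not part of CloudQuery, and the code only mirrors the behavior described above rather than how the scheduler is implemented.

```python
from datetime import datetime, timedelta, timezone

# Cron fields: minute hour day-of-month month day-of-week (1 = Monday).
PRESET_CRON = {
    "daily": "0 0 * * *",    # midnight every day
    "weekly": "0 0 * * 1",   # midnight on Mondays
    "monthly": "0 0 1 * *",  # midnight on the first day of each month
}


def next_run(preset, now):
    """Return the next midnight-UTC run strictly after `now` (timezone-aware)."""
    if preset not in PRESET_CRON:
        raise ValueError(f"unknown preset: {preset}")
    now = now.astimezone(timezone.utc)
    # The earliest possible run is the first midnight after `now`.
    candidate = (now + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)
    while True:
        if preset == "daily":
            return candidate
        if preset == "weekly" and candidate.weekday() == 0:  # Monday
            return candidate
        if preset == "monthly" and candidate.day == 1:
            return candidate
        candidate += timedelta(days=1)


now = datetime(2024, 5, 15, 13, 30, tzinfo=timezone.utc)  # a Wednesday
for name in PRESET_CRON:
    print(name, PRESET_CRON[name], next_run(name, now).isoformat())
```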

Allocated Resources

You can choose to allocate vCPU and vRAM resources. Some integrations may need more vRAM or vCPU to run efficiently. The default values are usually sufficient for most use cases.

Incremental Sync

Some source integrations support incremental syncs on some of their tables. Incremental syncs fetch only the data that changed since the last sync. Such tables are marked as "incremental" in the source integration table documentation.

This option is on by default, which means syncs will generally take less time and move fewer rows. If you need a full refresh of the data, disable this option.
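The difference between an incremental and a full sync can be illustrated with a small, self-contained sketch. This is a conceptual model only, assuming each row carries an `updated_at` timestamp and that the sync remembers a cursor between runs; the `sync` function and the row layout are hypothetical and do not reflect how CloudQuery stores its state.

```python
from datetime import datetime, timezone


def sync(rows, cursor=None, incremental=True):
    """Return (rows_to_write, new_cursor) for one simulated sync run."""
    if incremental and cursor is not None:
        # Incremental: only rows changed since the previous run.
        selected = [r for r in rows if r["updated_at"] > cursor]
    else:
        # First run, or incremental disabled: fetch everything.
        selected = list(rows)
    # Advance the cursor to the newest row seen; keep it if nothing was fetched.
    new_cursor = max((r["updated_at"] for r in selected), default=cursor)
    return selected, new_cursor


def ts(day):
    return datetime(2024, 1, day, tzinfo=timezone.utc)


source = [
    {"id": 1, "updated_at": ts(1)},
    {"id": 2, "updated_at": ts(2)},
    {"id": 3, "updated_at": ts(3)},
]

first, cursor = sync(source)                        # first run fetches all rows
source[0]["updated_at"] = ts(10)                    # row 1 changes afterwards
second, cursor = sync(source, cursor)               # only the changed row
full, _ = sync(source, cursor, incremental=False)   # full refresh

print(len(first), len(second), len(full))
```

The second run moves a single row, while the run with the option disabled moves all three again, which is the trade-off between speed and a complete refresh.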

Run the Sync

After you have configured the sync options, you can either save the sync and run it immediately, or save it and let it run on its schedule. If you run it immediately, the next execution will still happen at the scheduled time.

When the sync is running, you can observe the execution as it happens in the Sync Run Details view. The data is being streamed to your database during this process.

Running Sync

Errors and warnings are displayed in the logs section. Only errors that prevent the full sync are considered breaking. This means that a sync is considered successful even if it has some errors or warnings, for example because some tables were not accessible.

API Access

Cloud syncs can be controlled using the API. To use the API, you will need to generate an API key. For the list of available endpoints, see our API Reference.
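A minimal sketch of preparing an authenticated request is shown below. It rests on several assumptions: that the API key is sent as a bearer token in the `Authorization` header, and that the key is kept in an environment variable named `CLOUDQUERY_API_KEY`. The base URL and path are placeholders, not real CloudQuery endpoints, so take the actual URLs and authentication details from the API Reference. The code only builds the request object and does not send it.

```python
import os
import urllib.request


def build_request(api_key, base_url, path):
    """Build (but do not send) a GET request carrying the API key."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    req = urllib.request.Request(url, method="GET")
    # Assumption: the API accepts the key as a bearer token.
    req.add_header("Authorization", f"Bearer {api_key}")
    req.add_header("Accept", "application/json")
    return req


# Read the key from the environment so it is never hard-coded in scripts.
api_key = os.environ.get("CLOUDQUERY_API_KEY", "example-key")

# Placeholder URL and path for illustration only.
req = build_request(api_key, "https://api.example.com", "/teams/my-team/syncs")
print(req.get_method(), req.full_url)
```

To actually send the request, you could pass `req` to `urllib.request.urlopen` once the URL points at a real endpoint.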

Limitations of Cloud Syncs Compared to Self-hosted

  • Some integrations may not be available in CloudQuery Cloud, because we are still in beta and are continuously adding more integrations.
  • Not all configuration options that are available for self-hosted syncs are available for Cloud Syncs. This is to make the Cloud Syncs easier to use and configure. We will consider adding more options in the future based on your feedback.
  • Docker-based integrations cannot be used with CloudQuery Cloud. This is a limitation during the beta phase and will be removed before general availability.
  • Not all authentication mechanisms that work on self-hosted syncs work with CloudQuery Cloud. We are continuously adding support for more authentication mechanisms.
  • Community and custom integrations cannot be used with CloudQuery Cloud. This is by design and will not be removed in the near future.