Amazon S3 Destination
Write data the right way into Amazon S3 buckets, powered by Query Security Data Pipelines.
Overview
Amazon S3 is one of the most popular cloud object storage services; since its inception in 2006, it has stored multiple exabytes of data for tens of millions of users and organizations. Amazon S3 is also a popular storage engine for backing security data lakes and security data lakehouses, in combination with query engines such as Amazon Athena, TrinoDB, and StarRocks, or as intermediary storage between Amazon Redshift, ClickHouse, Databricks, and/or Snowflake.
The Amazon S3 Destination for Query Security Data Pipelines writes data the right way for these use cases. The data is written in the following ways to aid in schema registration and performant reads:
- All data is written as OCSF-formatted Apache Parquet, partitioned on the `time` field by year, month, day, and hour. File names are seeded with a 16-digit UUID and the datetime to avoid object overwrites.
- Data is compressed with ZStandard (ZSTD), which offers the best balance of compression efficiency and decompression speed for querying the data.
- Data is written into Hive-like partitions (e.g., `source=your_connector_name/event=detection_finding/year=2025/month=08/day=21/hour=17/`), which are easily discovered by AWS Glue and other metadata catalogs and query engines.
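As a concrete illustration of how this layout is consumed, the following is a minimal sketch (not part of the Query product) that reads the Hive-style partitions back with PyArrow. The bucket name, connector name, and event class are placeholders.

```python
# Minimal sketch: read the ZSTD-compressed, Hive-partitioned Parquet layout
# described above with PyArrow. Bucket, source, and event values are placeholders.
import pyarrow.dataset as ds

dataset = ds.dataset(
    "s3://your-security-lake-bucket/",  # placeholder bucket
    format="parquet",
    partitioning="hive",  # discovers source=/event=/year=/month=/day=/hour= columns
)

# Partition columns act as pushdown filters, so only matching objects are scanned.
# The year/month/day/hour keys can be filtered the same way to narrow the time range.
table = dataset.to_table(
    filter=(ds.field("source") == "your_connector_name")
    & (ds.field("event") == "detection_finding")
)
print(table.num_rows)
```

The same partition keys are what AWS Glue crawlers and other metadata catalogs register when they discover the bucket.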
To support writing the data, Query will assume your AWS IAM Role with an IAM Policy that grants permissions to write and read objects and to read bucket metadata. To audit this behavior, the Session Name will contain `_QuerySecurityDataPipelines` in your AWS CloudTrail Management Event logs. Refer to the next section for information on how to set up an AWS IAM Role and Policy.
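For example, the following is a minimal sketch (an illustration, not from the Query documentation) of searching CloudTrail management events for these role assumptions with boto3; the region is a placeholder.

```python
# Minimal sketch: find AssumeRole management events whose session name
# contains _QuerySecurityDataPipelines. The region is a placeholder.
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

paginator = cloudtrail.get_paginator("lookup_events")
for page in paginator.paginate(
    LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "AssumeRole"}]
):
    for event in page["Events"]:
        # CloudTrailEvent is the raw JSON record; the session name appears inside it.
        if "_QuerySecurityDataPipelines" in event.get("CloudTrailEvent", ""):
            print(event["EventTime"], event["EventName"])
```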
Information on the `source` partition
To ensure that there will not be any issues with downstream query engines, your Connector Name (the Source) is converted into a lowercase string in which underscores are the only special characters. All special characters except for whitespace, periods (`.`), and hyphens (`-`) are stripped. Whitespace, periods, and hyphens are replaced with underscores (`_`). Finally, the entire string is lowercased.
For example, `My-Data.Source@2024` will be converted into `my_data_source2024`.
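The sketch below is an illustrative reimplementation of that conversion in Python, not Query's actual code.

```python
# Illustrative reimplementation of the source-name conversion described above.
import re

def to_source_partition(connector_name: str) -> str:
    # Strip all special characters except whitespace, periods, and hyphens.
    cleaned = re.sub(r"[^A-Za-z0-9\s.\-]", "", connector_name)
    # Replace whitespace, periods, and hyphens with underscores.
    underscored = re.sub(r"[\s.\-]", "_", cleaned)
    # Lowercase the entire string.
    return underscored.lower()

print(to_source_partition("My-Data.Source@2024"))  # my_data_source2024
```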
Prerequisites
Before you configure the Destination, create an AWS IAM Role that Query can assume, with an attached IAM Policy that grants permission to write and read objects in your bucket and to read bucket metadata. The role's trust policy should allow Query to assume it and can require an External ID as a pre-shared key. You will also need your bucket's name and its AWS Region when configuring the Destination.
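The following is a minimal boto3 sketch of what such a role and inline policy could look like. The role name, policy name, Query principal account, External ID, bucket name, and the exact S3 actions shown are placeholders or assumptions; use the values and permissions provided for your tenant.

```python
# Minimal sketch of creating an IAM Role and inline Policy for the S3 Destination.
# QUERY_PRINCIPAL, EXTERNAL_ID, BUCKET, and the S3 actions are placeholders/assumptions.
import json
import boto3

BUCKET = "your-security-lake-bucket"                 # placeholder
QUERY_PRINCIPAL = "arn:aws:iam::111111111111:root"   # placeholder: the principal Query assumes from
EXTERNAL_ID = "your-pre-shared-external-id"          # placeholder

iam = boto3.client("iam")

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": QUERY_PRINCIPAL},
        "Action": "sts:AssumeRole",
        "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
    }],
}

# Assumed mapping of "write and read objects and read bucket metadata" to S3 actions.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
    ],
}

role = iam.create_role(
    RoleName="query-s3-destination",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
iam.put_role_policy(
    RoleName="query-s3-destination",
    PolicyName="query-s3-destination-access",
    PolicyDocument=json.dumps(permissions_policy),
)
print(role["Role"]["Arn"])  # use this full ARN as the Role ARN in the Destination settings
```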
Configure an Amazon S3 Destination
If you already have an existing Destination, skip to Step 4.
1. Navigate to the Pipelines feature in the Query Console and select + New Pipeline from the top of the page, as shown below.
Note: If you already have an existing Destination, this toggle will read Manage Destinations (n) instead, as shown below. You can add a new Destination directly from the Destination Manager.

2. Before configuring your pipeline, if a Destination does not already exist, select Create New Destination within the Pipeline creation interface, as shown below.
3. Provide a Destination Name; you can reuse these Destinations across multiple Pipelines. From Platform, select Amazon S3, provide the following parameters, and then select Create (see the sketch after this list for a quick way to sanity-check these values).
   - Role ARN: Your IAM Role ARN created in the prerequisites. Paste in the entire ARN, not just the Role Name.
   - External ID: External IDs are used to prevent the "confused deputy" problem in AWS STS and serve as a pre-shared key, which is helpful when you use one IAM Role for multiple Destinations. There is no minimum length or complexity requirement.
   - Bucket Name: The name of your Amazon S3 bucket. Do not include `s3://`, any paths, or your ARN.
   - Region Name: The AWS Region that the bucket exists in; this is used for creating STS sessions.
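The snippet below is a minimal, optional sketch (not part of the Query Console flow) for sanity-checking the four values above with boto3 before creating the Destination. It only works if the role's trust policy also trusts your own principal for testing, and all values shown are placeholders.

```python
# Minimal sketch: verify the Role ARN, External ID, Bucket Name, and Region Name
# are consistent. Works only if the role also trusts your own principal for testing.
import boto3

ROLE_ARN = "arn:aws:iam::222222222222:role/query-s3-destination"  # placeholder
EXTERNAL_ID = "your-pre-shared-external-id"                       # placeholder
BUCKET = "your-security-lake-bucket"                               # placeholder
REGION = "us-east-1"                                               # placeholder

sts = boto3.client("sts", region_name=REGION)
creds = sts.assume_role(
    RoleArn=ROLE_ARN,
    RoleSessionName="destination-precheck",
    ExternalId=EXTERNAL_ID,
)["Credentials"]

s3 = boto3.client(
    "s3",
    region_name=REGION,
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
s3.head_bucket(Bucket=BUCKET)  # raises an error if the role cannot reach the bucket
print("Role, External ID, bucket, and Region look consistent.")
```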
4. After creating the Destination, you will have a dropdown menu to select the new Destination for your Pipeline.