Dynamic Schema Configuration (Legacy)

Learn how to set up "dynamic schema" platforms within the Query Federated Search platform.

IMPORTANT NOTE: This documentation refers to a legacy capability of the product.

Overview

The Query Federated Search platform has two classifications of Connectors. The first classification is the "static schema" Connector: a pre-built integration with an upstream data source, such as an Endpoint Detection & Response (EDR) tool (e.g., SentinelOne Singularity Platform or CrowdStrike Falcon) or an Identity Provider (IdP) (e.g., Okta or Microsoft Entra ID, formerly Azure Active Directory). You cannot modify these Connectors, as they are purpose-built to cover specific data of interest and to support specific jobs to be done by analysts, hunters, and other security professionals.

The second classification is the "dynamic schema" platform, in which the Query Federated Search Connector provides an interface and translation layer into that platform's own query and search capabilities, and the schema depends on your data rather than being fixed in advance. Examples include data within a Splunk Enterprise Security (ES) index, or data stored within an Amazon Simple Storage Service (S3) bucket that you have onboarded into AWS Glue Data Catalog. In these cases, users must map their data into the Query Data Model to surface the specific Events and Entities to search on; the Connector itself then translates, optimizes, plans, and executes the query and returns the results.

NOTE: As of 1 NOV 2023, users must use a JSON document to define the schema mapping. In the future, this experience will be enhanced to allow users to introspect and map their schemas using a no-code workflow per configured Connector.

Configuring Dynamic Schemas with JSON

To successfully onboard a dynamic schema into the Query Federated Search platform, your data must have the following:

  • Record ID: Any unique identifier such as a UUID, GUID or another value in your data that denotes a singular entry (e.g., an agent/machine UUID, a trace ID or request ID, etc.)
  • Time: Either a Unix/Epoch second timestamp or an ISO 8601 formatted date and time that allows the Query Federated Search platform to properly scope a query. Date and time cannot be in separate rows or keys within your dynamic schema.
  • Entity: At least one Entity (also known as an observable or indicator) within your data set to query on. This can be an IP Address, a Hostname, a User ID, or otherwise. You can map as many as you want.
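
The three requirements above can be checked against a sample record before you write any mapping. The following is a minimal sketch in Python and is not part of the Query platform: the function names (parse_time, check_record) and the sample key names (trace_id, event_time, src_ip) are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def parse_time(value):
    """Parse a Unix/Epoch second timestamp or an ISO 8601 string into a datetime.

    Raises ValueError when the value is neither.
    """
    if isinstance(value, bool):
        raise ValueError("booleans are not timestamps")
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, str):
        text = value.strip()
        # Older Python versions do not accept a trailing "Z" in fromisoformat,
        # so normalize it to an explicit UTC offset first.
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        return datetime.fromisoformat(text)
    raise ValueError(f"unsupported time value: {value!r}")


def check_record(record, id_key, time_key, entity_keys):
    """Return a list of problems found in one sample record (empty if none)."""
    problems = []
    if record.get(id_key) in (None, ""):
        problems.append(f"missing Record ID in '{id_key}'")
    try:
        parse_time(record.get(time_key))
    except (ValueError, TypeError, OverflowError, OSError):
        problems.append(f"'{time_key}' is not an epoch or ISO 8601 timestamp")
    if not any(record.get(key) not in (None, "") for key in entity_keys):
        problems.append("no Entity value found in any of: " + ", ".join(entity_keys))
    return problems


if __name__ == "__main__":
    sample = {
        "trace_id": "abc-123",
        "event_time": "2023-11-01T12:00:00Z",
        "src_ip": "10.0.0.1",
    }
    print(check_record(sample, "trace_id", "event_time", ["src_ip"]))
```

A record that splits date and time across two keys would fail the time check here, which mirrors the requirement that both live in a single value.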

At the very minimum, you will need to define at least one Entity within the OBJECT_OBSERVABLE mapping in the schema mapping JSON. Refer to the field list and example JSON snippet below for information on how to fill out the schema mapping JSON. Leave all other values as the defaults except for the placeholders that are CAPITALIZED_SNAKE_CASE.

  • index_or_table_name: The name of the Index (e.g., in Splunk or Elastic) or the name of the Table or View (e.g., in AWS Glue Data Catalog, Amazon Athena or Snowflake) which contains your data.
  • mapping.name.value: The name of the Key or Column in your index or table, respectively, that contains the data you wish to map to an Entity.
  • mapping.value.path.[0].value: The name of the Key or Column in your index or table, respectively, that contains the data you wish to map to an Entity. In the example snippet this is the same value as mapping.name.value.
  • mapping.type_id.value: The type of Entity you wish to map, enter one of the following:
    • IP_ADDRESS: An IPv4 or IPv6 IP Address.
    • URL: An RFC 3986 Uniform Resource Locator (URL) or Uniform Resource Identifier (URI) string.
    • HOSTNAME: A hostname or simply a (sub-)domain name.
    • FILE_HASH: Any hash to identify a process, a file, or other artifact regardless of hash type. This can be anything from MD5, SHA1, SHA2, SSDEEP or otherwise.
    • EMAIL_ADDRESS: Any email address.
    • FILE_NAME: The name or path of any files within a filesystem.
    • USER_NAME: The name or identifier of a user or other identity principal such as a full name, a proper username, an email address (when used as a username), the User ID, SPN, AWS Access Key ID or AWS ARN of an IAM Role or IAM User.
{
  "OBJECT_OBSERVABLE": [
    {
      "index_or_table_name": "NAME_OF_TABLE_OR_INDEX",
      "mapping": {
        "name": {
          "type": "literal",
          "value": "NAME_OF_COLUMN_OR_KEY"
        },
        "value": {
          "path": [
            {
              "type": "literal",
              "value": "NAME_OF_COLUMN_OR_KEY"
            }
          ],
          "type": "input"
        },
        "type_id": {
          "type": "literal",
          "value": "TYPE_OF_ENTITY"
        }
      },
      "is_enabled": true,
      "relationship": {}
    }
  ]
}
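
If you need to map several columns, generating the document can be less error-prone than editing placeholders by hand. The following Python sketch builds the same structure as the snippet above and rejects entity types that are not in the list. It rests on two assumptions that this page does not confirm: that additional Entities are expressed as additional entries in the OBJECT_OBSERVABLE array, and that every entry uses the default values shown in the snippet. The helper names, the table name vpc_flow_logs, and the column names are hypothetical.

```python
import json

# Entity types listed under mapping.type_id.value on this page.
ENTITY_TYPES = {
    "IP_ADDRESS",
    "URL",
    "HOSTNAME",
    "FILE_HASH",
    "EMAIL_ADDRESS",
    "FILE_NAME",
    "USER_NAME",
}


def observable_entry(index_or_table_name, column_or_key, entity_type):
    """Build one OBJECT_OBSERVABLE entry, keeping the snippet's default values."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type}")
    return {
        "index_or_table_name": index_or_table_name,
        "mapping": {
            "name": {"type": "literal", "value": column_or_key},
            "value": {
                "path": [{"type": "literal", "value": column_or_key}],
                "type": "input",
            },
            "type_id": {"type": "literal", "value": entity_type},
        },
        "is_enabled": True,
        "relationship": {},
    }


def build_schema_mapping(index_or_table_name, columns):
    """Build the full document from a dict of column name -> entity type.

    Assumption: one array entry per mapped column.
    """
    if not columns:
        raise ValueError("at least one Entity must be mapped")
    return {
        "OBJECT_OBSERVABLE": [
            observable_entry(index_or_table_name, column, entity_type)
            for column, entity_type in columns.items()
        ]
    }


if __name__ == "__main__":
    document = build_schema_mapping(
        "vpc_flow_logs",  # hypothetical table name
        {"src_ip": "IP_ADDRESS", "user": "USER_NAME"},  # hypothetical columns
    )
    print(json.dumps(document, indent=2))
```

Validating the entity type up front catches a typo such as IP_ADDR before the JSON is ever uploaded.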

Configuring Dynamic Schemas with Query's Flask-App

Coming Soon.