Subqueries

Overview

Subqueries in FSQL allow you to use the results from one query as filter criteria in another query. This enables sophisticated threat hunting workflows where you need to correlate data across different event types or time windows.

A subquery is essentially a query within a query—the inner query executes first, and its results are used to filter the outer query.

Syntax

The basic syntax for a subquery uses curly braces {} to wrap the inner query:

QUERY <outer_query_fields>
WITH <field> IN {
  QUERY <inner_query_field>
  WITH <inner_query_filters>
  AFTER <inner_query_time>
}
AFTER <outer_query_time>

Key syntax rules:

  • The inner query must be enclosed in curly braces {}
  • The inner query must select only one field
  • The field selected by the inner query must have the same data type as the field it's filtering in the outer query
  • Subqueries can be used with the following operators: =, ==, !=, !==, IN, IIN, >, <, >=, <=

Basic Example

Find all network activity from hosts that have new security alerts:

QUERY #network.(message + time + activity_id + severity_id + status_id)
WITH %hostname IN {
  QUERY detection_finding.device.hostname
  WITH detection_finding.status_id = NEW
}
AFTER 1d

What this does:

  1. The inner query finds all hostnames with new detection findings
  2. The outer query searches network events from those hosts in the last day
  3. Returns network activity details for any hosts with recent alerts

Common Use Cases

1. Investigating Compromised Accounts

Find all authentication activity for users who had failed MFA attempts:

QUERY authentication.**
WITH authentication.user.name IN {
  QUERY authentication.user.name
  WITH authentication.is_mfa = TRUE
  AND authentication.status_id = FAILURE
  AFTER 7d
}
AFTER 1d

This helps you track what accounts with failed MFA are doing—potential indicators of credential stuffing or brute force attacks.

2. Lateral Movement Detection

Find network connections to hosts that experienced suspicious process execution:

QUERY #network.**
WITH #network.dst_endpoint.hostname IN {
  QUERY process_activity.device.hostname
  WITH process_activity.process.name IN 'powershell.exe', 'cmd.exe', 'wmic.exe'
  AND process_activity.process.cmd_line CONTAINS 'hidden'
  AFTER 24hrs
}
AFTER 24hrs

3. Email-Based Threat Hunting

Find all file system activity on hosts that received emails from a suspicious sender:

QUERY file_system_activity.**
WITH file_system_activity.device.hostname IN {
  QUERY email_activity.dst_endpoint.hostname
  WITH email_activity.email.from CONTAINS 'suspicious-domain.com'
  AFTER 7d
}
AFTER 7d

4. Cloud Account Enumeration

Find all AWS API calls made by accounts that had unusual authentication patterns:

QUERY api_activity.**
WITH api_activity.actor.user.uid IN {
  QUERY authentication.user.uid
  WITH authentication.src_endpoint.location.country != 'US'
  AND authentication.activity_id = LOGON
  AND authentication.status_id = SUCCESS
  AFTER 3d
}
AND api_activity.cloud.provider = 'AWS'
AFTER 3d

5. Cascading Security Incidents

Find all processes spawned by executables that were downloaded recently:

QUERY process_activity.**
WITH %hash IN {
  QUERY file_activity.file.hashes.value
  WITH file_activity.activity_id = CREATE
  AND file_activity.file.type_id = REGULAR_FILE
  AFTER 1d
}
AFTER 1d

Nested Subqueries

Subqueries can be nested to any depth, though performance degrades with deeper nesting. Here's an example with two levels of nesting:

QUERY #network.**
WITH #network.src_endpoint.hostname IN {
  QUERY process_activity.device.hostname
  WITH process_activity.actor.user.name IN {
    QUERY authentication.user.name
    WITH authentication.status_id = FAILURE
    AND authentication.is_mfa = TRUE
    AFTER 7d
  }
  AFTER 3d
}
AFTER 1d

What this does:

  1. Innermost query: Find users with failed MFA in the last 7 days
  2. Middle query: Find hosts where those users ran processes in the last 3 days
  3. Outer query: Find network activity from those hosts in the last day

Performance consideration: Each nested level adds latency. For deeply nested queries (3+ levels), consider breaking them into separate queries or using different time windows to reduce data volume.

Working with Different Data Types

IP Addresses

QUERY #network.**
WITH #network.src_endpoint.ip IN {
  QUERY detection_finding.src_endpoint.ip
  WITH detection_finding.severity_id IN HIGH, CRITICAL
  AFTER 24hrs
}
AFTER 24hrs

File Hashes

QUERY process_activity.**
WITH ANY process_activity.process.file.hashes.value IN {
  QUERY detection_finding.evidences.file.hashes.value
  WITH detection_finding.malware.name NOT EMPTY
  AFTER 7d
}
AFTER 1d

Email Addresses

QUERY api_activity.**
WITH api_activity.actor.user.email_addr IN {
  QUERY email_activity.email.to
  WITH email_activity.email.from CONTAINS 'phishing-domain'
  AFTER 3d
}
AFTER 1d

Using Comparison Operators

Find network traffic where bytes transferred exceed the average from baseline activity:

QUERY #network.**
WITH #network.traffic.bytes > {
  SUMMARIZE AVG #network.traffic.bytes
  WITH #network.src_endpoint.hostname = 'prod-server-01'
  AFTER 30d
}
AFTER 1d

Find authentication events with more failed attempts than a threshold:

QUERY authentication.**
WITH authentication.count >= {
  SUMMARIZE MAX authentication.count
  WITH authentication.status_id = SUCCESS
  AFTER 7d
}
AND authentication.status_id = FAILURE
AFTER 1d

Using Negation

Find network activity NOT originating from known-good hosts:

QUERY #network.**
WITH #network.src_endpoint.hostname != {
  QUERY inventory_info.device.hostname
  WITH inventory_info.device.type_id = SERVER
  AND inventory_info.device.is_compliant = TRUE
  AFTER 7d
}
AFTER 1d

Limitations and Best Practices

Limitations

  1. Single field selection: The inner query can only select one field. This query is invalid:

    -- INCORRECT: Cannot select multiple fields
    QUERY authentication.**
    WITH (authentication.user.name, authentication.src_endpoint.ip) IN {
      QUERY detection_finding.(user.name + src_endpoint.ip)
      ...
    }
  2. Data type matching: The inner query field must match the outer query field type:

    -- INCORRECT: Comparing string to IP address
    QUERY #network.**
    WITH #network.src_endpoint.ip IN {
      QUERY authentication.user.name  -- Returns string, not IP
      ...
    }
  3. Supported operators only: Subqueries can only be used with these operators:

    • = (equals)
    • == (case-insensitive equals)
    • != (not equals)
    • !== (case-insensitive not equals)
    • IN (in list)
    • IIN (case-insensitive in list)
    • > (greater than)
    • < (less than)
    • >= (greater than or equal)
    • <= (less than or equal) Operations like CONTAINS, STARTSWITH, ENDSWITH, or ANY/ALL cannot be used with subqueries.
  4. Large result sets: Subqueries returning thousands of results can be slow. Consider using more restrictive filters in the inner query.

Best Practices

  1. Test inner queries first: Always verify your inner query returns the expected results before embedding it:

    -- First, test this separately
    QUERY detection_finding.device.hostname
    WITH detection_finding.status_id = NEW
    AFTER 1d
    
    -- Then use it in the subquery
  2. Use appropriate time windows: The inner query's time range should make sense relative to the outer query:

    -- Good: Inner query has wider time window
    QUERY process_activity.**
    WITH %hostname IN {
      QUERY detection_finding.device.hostname
      WITH detection_finding.severity_id = CRITICAL
      AFTER 7d  -- Look further back for alerts
    }
    AFTER 1d  -- Recent process activity
  3. Add filters to reduce result size: Don't rely solely on the subquery for filtering:

    -- Good: Additional filters in outer query
    QUERY #network.**
    WITH #network.dst_endpoint.port IN 22, 3389, 445  -- RDP, SSH, SMB
    AND #network.src_endpoint.hostname IN {
      QUERY detection_finding.device.hostname
      WITH detection_finding.status_id = NEW
      AFTER 3d
    }
    AFTER 1d
  4. Use entity searches for flexibility: Entity searches (%hostname, %ip, etc.) work well with subqueries:

    QUERY #network.**
    WITH %ip IN {
      QUERY %ip
      WITH detection_finding.severity_id = CRITICAL
      AFTER 7d
    }
    AFTER 1d
  5. Consider data source selection: Use the FROM clause to limit which connectors are searched:

    QUERY #network.**
    WITH %hostname IN {
      QUERY detection_finding.device.hostname
      WITH detection_finding.status_id = NEW
      FROM 'CrowdStrike', 'SentinelOne'
      AFTER 3d
    }
    FROM 'Zeek Logs', 'Palo Alto Firewall'
    AFTER 1d

Troubleshooting

Slow Query Performance

If your subquery is running slowly:

  1. Narrow the inner query time range:

    -- Instead of AFTER 30d
    AFTER 3d  -- Use a shorter window
  2. Add more specific filters to the inner query:

    -- Add severity, status, or other filters
    QUERY detection_finding.device.hostname
    WITH detection_finding.status_id = NEW
    AND detection_finding.severity_id IN HIGH, CRITICAL  -- More specific
  3. Limit the number of connectors searched:

    FROM 'Primary EDR', 'Backup EDR'  -- Instead of searching all connectors

No Results Returned

If your subquery returns no results:

  1. Test the inner query independently to verify it returns data
  2. Check data type compatibility between inner and outer fields
  3. Verify time windows make sense (inner query should look further back or at the same time as outer)
  4. Confirm field paths are correct using the data model reference

Example Debugging Process

-- Step 1: Test inner query
QUERY detection_finding.device.hostname
WITH detection_finding.status_id = NEW
AFTER 7d
LIMIT 10

-- Step 2: Verify hostnames are returned

-- Step 3: Use in full query
QUERY #network.**
WITH %hostname IN {
  QUERY detection_finding.device.hostname
  WITH detection_finding.status_id = NEW
  AFTER 7d
}
AFTER 1d