Normalization and the Query Data Model (QDM)
Search Results Normalization into QDM (Query Data Model)
Query normalizes the results that come from different data sources into a standardized cybersecurity data schema called the QDM (Query Data Model), which is heavily inspired by the Open Cybersecurity Schema Framework (OCSF). OCSF is an open-source industry standard backed by key cybersecurity vendors. It was announced at Black Hat 2022 with a founding coalition that included organizations such as Splunk, AWS, Broadcom, Cloudflare, CrowdStrike, IBM Security, Okta, Palo Alto Networks, Rapid7, Sumo Logic, Tanium, Trend Micro, and Zscaler. For more information on OCSF, please refer to the official OCSF documentation.
Query plays the role of the "Data Broker" by converting data from data source vendors' native formats into QDM. This normalization yields OCSF "Objects", which serve as Query's cybersecurity entities of interest, and OCSF "Event Classes", which describe the activity related to those entities.
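To make the broker role concrete, here is a minimal Python sketch of mapping one vendor-native record into a QDM-style event. It is an illustration only, not Query's actual mapping logic: the vendor field names (agentHostname, agentIp, threatName, createdAt), the function name normalize_edr_alert, the class_name and message output keys, and the use of epoch milliseconds for time are all assumptions. Only the attribute names hostname, ip, and time come from the schema described on this page.

```python
from datetime import datetime, timezone


def normalize_edr_alert(raw: dict) -> dict:
    """Map a hypothetical vendor alert into a QDM-style Security Finding."""
    # Parse the vendor's ISO 8601 timestamp and convert it to UTC.
    created = datetime.fromisoformat(raw["createdAt"]).astimezone(timezone.utc)
    return {
        "class_name": "security_finding",        # assumed output key
        "time": int(created.timestamp() * 1000),  # assumed epoch milliseconds
        "message": raw["threatName"],            # assumed output key
        # The Device object carries the entity attributes.
        "device": {
            "hostname": raw["agentHostname"],
            "ip": raw["agentIp"],
        },
    }


# A made-up vendor record, used only to exercise the function.
raw_alert = {
    "agentHostname": "laptop-01",
    "agentIp": "205.100.1.1",
    "threatName": "Suspicious process",
    "createdAt": "2024-08-29T12:00:00+00:00",
}
event = normalize_edr_alert(raw_alert)
print(event["device"])
```

The point of the sketch is the shape of the result: vendor-specific keys disappear, and the entity (the Device) and the activity (the Security Finding) end up in predictable places regardless of which source produced them.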
Query Data Model
View and browse the QDM at https://schema.query.ai/
Note: Because OCSF is rapidly evolving, Query uses a smaller, stable subset, along with some additional modifications, suited to our cybersecurity Federated Search use cases. Please refer to the official OCSF documentation to understand the broader schema.
OCSF Objects as QDM's Cybersecurity Entities
OCSF Objects have been adopted into Query's data model to represent cybersecurity Entities. Each object has its set of attributes that Query can extract and set from the federated search results coming from multiple disparate data sources. Below are the OCSF Objects in Query's Data Model:
- User: Represents a user. Its attributes include fields like name, email_addr, uid, ...
- Device: Represents an endpoint. Its attributes include fields like hostname, ip, instance_uid, ...
- Network Endpoint: Represents any public/private source/destination of a network connection. Its attributes include fields like hostname, ip, port, svc_name, ...
- Process: Represents a running instance of a launched program. Its attributes include fields like name, pid, parent_process, user, ...
- Email: Represents email metadata such as sender, recipients, and direction. Its attributes include fields like from, to, subject, size, ...
- File: Represents files, folders, links and mounts, including the reputation information, if applicable. Its attributes include fields like name, path, type_id, fingerprints, ...
- URL: Represents the path and reputation of a URL. Its attributes include fields like hostname, path, scheme, port, query_string, ...
- Domain Info: Represents registration information pertaining to a domain. Its attributes include fields like domain, registrar, created_time, modified_time, ...
- Location: Represents geographic location information. Its attributes include fields like coordinates, city, country, ...
QDM Entities extracted from OCSF Objects
As stated above, the Query Federated Search Platform selectively surfaces certain fields from within OCSF objects. The list below shows the normalized Entities you can dispatch searches with. The Query Federated Search Platform translates each search into whichever combination of API calls, SQL statements, KQL statements, or other queries the data sources require, and brings back deduplicated, normalized, and correlated (by parent->child relationships) records that contain that entity.
For instance, if you search for an IP Address 205.100.1.1, Query will dispatch the search to all of the Connectors you have enabled. In this example, you may have SentinelOne (an Endpoint Detection & Response tool), Microsoft Intune and JAMF Pro (both Mobile Device Management tools), and Tégo Threat Feed API (a Threat Intelligence concentrator). Each will bring back different records: related findings, alerts, and potential devices from EDR; onboarded endpoints or servers from MDM; and relevant geolocation, reputation scoring, registrar, and reverse DNS information from CTI tools. This helps Analysts, Detection Engineers, and other users of the Query Federated Search Platform quickly collate, Orient, Decide and Act on the information they're provided with, whether for closing incidents, developing detection content, or otherwise.
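The fan-out described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the platform's implementation: the three connector functions, the record shapes, and the federated_search helper are hypothetical stand-ins, and real Connectors would issue API calls or SQL/KQL statements rather than return canned data.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical connectors: each takes the searched value and returns records.
def edr_connector(ip: str) -> list:
    return [{"kind": "security_finding", "ip": ip, "title": "Suspicious process"}]


def mdm_connector(ip: str) -> list:
    return [{"kind": "device", "ip": ip, "hostname": "laptop-01"}]


def cti_connector(ip: str) -> list:
    # Returns the same record twice so the deduplication step has work to do.
    record = {"kind": "location", "ip": ip, "country": "US"}
    return [record, dict(record)]


def federated_search(value: str, connectors: dict) -> list:
    """Send one search to every connector in parallel and merge the results."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of the connectors it was given.
        batches = list(pool.map(lambda search: search(value), connectors.values()))

    seen, results = set(), []
    for name, batch in zip(connectors, batches):
        for record in batch:
            key = tuple(sorted(record.items()))  # identical records share a key
            if key not in seen:
                seen.add(key)
                results.append({"connector": name, **record})
    return results


connectors = {"edr": edr_connector, "mdm": mdm_connector, "cti": cti_connector}
results = federated_search("205.100.1.1", connectors)
for row in results:
    print(row["connector"], row["kind"])
```

Deduplication here is exact-match only; correlating records by parent->child relationships, as the platform does, is beyond the scope of this sketch.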
As of 29 AUG 2024, the following entities are supported.
- Hostnames (and Domains)
- IP Addresses (IPv4 and IPv6)
- MAC Addresses
- User Names
- Email Addresses
- URL Strings (and URIs)
- File Names
- Hashes (e.g., MD5, SHA1, SHA256, SSDEEP, VHASH, etc.)
- Process Names
- Resource IDs
- Ports
- Subnets (e.g., 192.168.1.0/24 or 2001:0db8:85a3:0000::/64)
- Command Lines (e.g., python3 encrypted.py or ssh user@ubuntu-ip-10-0-0-1)
- Country Code (e.g., US or CN)
- Process ID
- User Agent
- Common Weakness Enumeration (CWE) IDs (e.g., CWE-79)
- Common Vulnerabilities and Exposures (CVE) IDs (e.g., CVE-2024-100251)
- User Credential UID (e.g., AWS User Access Key ID - AKIA0007EXAMPLE0002)
- User ID
- Group Name
- Group ID
- Account Name
- Account ID
- Script Content
Several other Entities are under active development, as is the ability to search on any field within any Object in the future.
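Several of the entity types above have recognizable formats. The following Python sketch guesses an entity type from a raw search string using the standard library's ipaddress and re modules. How the platform itself identifies entity types is not described on this page, so the function name guess_entity_type, the patterns, and the labels are assumptions made purely for illustration.

```python
import ipaddress
import re

# Illustrative patterns only; they cover a handful of the entity types listed.
_PATTERNS = [
    ("CVE ID", re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)),
    ("CWE ID", re.compile(r"CWE-\d+", re.IGNORECASE)),
    ("MAC Address", re.compile(r"(?:[0-9A-F]{2}[:-]){5}[0-9A-F]{2}", re.IGNORECASE)),
    ("Email Address", re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
    # MD5, SHA1, and SHA256 hex digests are 32, 40, and 64 characters long.
    ("Hash", re.compile(r"[0-9a-f]{32}|[0-9a-f]{40}|[0-9a-f]{64}", re.IGNORECASE)),
]


def guess_entity_type(value: str) -> str:
    """Return a best-guess entity label for a search string."""
    value = value.strip()
    try:
        ipaddress.ip_address(value)  # accepts both IPv4 and IPv6
        return "IP Address"
    except ValueError:
        pass
    if "/" in value:
        try:
            ipaddress.ip_network(value, strict=False)  # CIDR notation
            return "Subnet"
        except ValueError:
            pass
    for label, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            return label
    return "Unknown"


for sample in ["205.100.1.1", "192.168.1.0/24", "CWE-79", "CVE-2024-100251"]:
    print(sample, "->", guess_entity_type(sample))
```

Free-form entities such as Command Lines, File Names, or Script Content have no fixed format, so a pattern-based approach like this cannot classify them and returns "Unknown".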
Relevant QDM Event Classes
Query normalizes and correlates the above OCSF Objects' activity information coming from various data sources into the event classes below:
- Security Finding: Security Finding events describe findings, detections, anomalies, alerts and/or actions performed by security products.
- Email Activity: Email Activity events report findings and activities of emails.
- File System Activity: File System Activity events report when a process performs an action on a file or folder.
- Account Change: Account Change events report when specific user account management tasks are performed, such as a user/role being created, changed, deleted, renamed, disabled, enabled, locked out or unlocked.
- Authentication: Authentication events report authentication session activities, such as a user attempting a logon or logoff, successfully or otherwise.
- Authorization: Authorization events report special privileges or groups assigned to a session.
- Entity Management: Entity Management events report activity by a managed client, a micro service, or a user at a management console. The activity can be a create, read, update, and delete operation on a managed entity.
- Network Activity: Network Activity events report network connection and traffic activity.
- HTTP Activity: HTTP Activity events report HTTP connection and traffic information.
- API Activity: API Activity events describe general CRUD (Create, Read, Update, Delete) API activities (e.g., AWS CloudTrail).
Relevant OCSF Event Categories for Query
OCSF groups similar Event Classes into what it calls "Categories". The above Event Classes fall into the below OCSF Categories. Note that Categories are not displayed in the UI and are listed here more for information on Query's schema:
- Findings: Category for any finding events. This includes security_finding events.
- System Activity: Category for any system activity events. This includes file_system_activity events.
- Audit Activity: Category for any audit activity events. This includes account_change, authentication, authorization, and entity_management events.
- Network Activity: Category for any network activity events. This includes network_activity, http_activity, email_activity, and api_activity events.
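The category grouping above can be written down as a simple lookup table. The Python sketch below copies the class and category names from this page; the variable names and the category_of helper are hypothetical and are not part of any Query API.

```python
# Category -> Event Classes, as listed in this section.
CATEGORY_CLASSES = {
    "Findings": ["security_finding"],
    "System Activity": ["file_system_activity"],
    "Audit Activity": [
        "account_change",
        "authentication",
        "authorization",
        "entity_management",
    ],
    "Network Activity": [
        "network_activity",
        "http_activity",
        "email_activity",
        "api_activity",
    ],
}

# Invert the table so an Event Class can be looked up directly.
CLASS_TO_CATEGORY = {
    class_name: category
    for category, class_names in CATEGORY_CLASSES.items()
    for class_name in class_names
}


def category_of(class_name: str):
    """Return the category for an Event Class, or None if it is not listed."""
    return CLASS_TO_CATEGORY.get(class_name)


print(category_of("email_activity"))
```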
Searching by Time
All events in OCSF have three time attributes: time, start_time, and end_time. Query always provides a value for event time. Most systems of record have only one timestamp for events; in these cases, start_time and end_time will be empty. When you change the value of the time picker in the search bar, you're changing a filter on the time field.
Objects in OCSF do not have an association to time. Most systems of record that provide data on objects also lack this association; they respond to queries with information about the current state of the environment. Because of this, time filters are ignored when searching for objects.
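The difference between events and objects can be sketched as follows. This Python example is a simplified model, not the platform's filtering code: the record shapes and the apply_time_filter and to_ms helpers are hypothetical, and representing time as epoch milliseconds is an assumption. Records with a time value are treated as events and filtered; records without one are treated as objects and always kept.

```python
from datetime import datetime, timezone


def to_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def apply_time_filter(records: list, start_ms: int, end_ms: int) -> list:
    """Keep events whose time falls in the window; objects pass through."""
    kept = []
    for record in records:
        if "time" not in record:
            kept.append(record)  # object: the time filter is ignored
        elif start_ms <= record["time"] <= end_ms:
            kept.append(record)  # event inside the selected window
    return kept


records = [
    {"kind": "authentication", "time": to_ms(datetime(2024, 8, 29, 12, tzinfo=timezone.utc))},
    {"kind": "authentication", "time": to_ms(datetime(2024, 7, 1, 12, tzinfo=timezone.utc))},
    {"kind": "device", "hostname": "laptop-01"},  # current state, no timestamp
]
window_start = to_ms(datetime(2024, 8, 1, tzinfo=timezone.utc))
window_end = to_ms(datetime(2024, 9, 1, tzinfo=timezone.utc))
filtered = apply_time_filter(records, window_start, window_end)
print([record["kind"] for record in filtered])
```

Only the time field is consulted, which mirrors the behavior described above: start_time and end_time may be empty, so they are not used for filtering in this sketch.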