TRITON Evaluation: Detection Categories
The evaluation focuses on articulating how detections occur, rather than assigning scores to vendor capabilities.
For the evaluation, we categorize each detection and capture notes about how those detections occur. We organize detections by technique. A technique may have more than one detection if the capability detects it in different ways, and all detections we observe are included in the results. While we make every effort to capture different detections, vendor capabilities may be able to detect procedures in ways that we did not capture. For a detection to be included for a given technique, it must apply to that technique specifically (i.e., a detection that applies to one technique in a Step or Sub-Step does not necessarily apply to all techniques of that Step). We require proof of each detection to be provided to us, but we may not include all detection details in public results, particularly when those details are sensitive.
To determine the appropriate category for a detection, we review the screenshot(s) provided, notes taken during the evaluation, results of follow-up questions to the vendor, and vendor feedback on draft results. We also independently test procedures in a separate lab environment and review open-source tool detections and forensic artifacts. This testing informs what is considered to be a detection for each technique.
After categorizing detections, we calibrate the categories across all vendors to look for discrepancies and ensure the categories are applied consistently. The decision of which category to apply is ultimately based on human analysis and is therefore subject to the discretion and biases inherent in all human analysis, although we make efforts to hedge against these biases by structuring the analysis as described above.
Data Sources
Detections will be tagged with the data source(s) that signify the type of data used to generate the detection. These tags will be used to differentiate and provide more precise descriptions of similar detections (e.g., telemetry from file monitoring versus process command-line arguments). The list of possible data source tags will be calibrated by MITRE after the evaluations are executed.
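To make the tagging concrete, the sketch below shows how a single detection might be recorded with its category and data source tags. The record format, field names, and tag values are hypothetical illustrations, not the schema or tag list actually used in the evaluation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Detection:
    """Hypothetical record for one observed detection of a technique."""
    technique: str                      # technique name as used in the evaluation
    category: str                       # detection category (defined below)
    data_sources: List[str] = field(default_factory=list)  # illustrative tags only
    notes: str = ""

# Two detections of the same technique, differentiated by their data source tags.
detections = [
    Detection("Unauthorized Command Message", "Telemetry",
              data_sources=["Network Traffic"],
              notes="Command seen in traffic to the controller's management port."),
    Detection("Unauthorized Command Message", "Technique",
              data_sources=["Process Monitoring", "Process Command-Line Parameters"],
              notes="Alert names the technique and the command that was issued."),
]
```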
Detection Categories
Not Applicable
The vendor did not have visibility into the system under test. Before the evaluation, the vendor must state which systems they did not deploy a sensor on for Not Applicable to be in scope for the relevant steps.
| Examples |
| --- |
| The vendor product only collects network traffic and is unable to collect host-based data natively. |
None
No data related to the behavior under test is automatically collected, processed, or made available within the capability. If data is available but not directly relevant to the procedure tested, it is categorized as “None.”
| Examples |
| --- |
| No data is collected related to the particular action being performed. |
| An alert fires at the same time that the red team performs the procedure, but it is not related to the technique under test. |
Telemetry
Minimally processed data is collected by the capability showing that event(s) occurred specific to the behavior under test (i.e., showing the procedure/command that was executed). Evidence must show definitively that the behavior occurred (did happen vs. may have happened) and must be related to the execution mechanism, i.e., to what caused the behavior. There is no evidence of complex logic or an advanced rule leading to the data output, and no labeling occurred other than simple field labeling.
| Examples |
| --- |
| A certain command is seen in traffic sent to the management port of a controller. |
General
Processed data specifies that malicious/abnormal event(s) occurred in relation to the behavior under test. No or limited details are provided as to why the action was performed (tactic) or how the action was performed (technique).
| Examples |
| --- |
| An alert is triggered based on a baseline deviation of a process variable, but no information is provided about why the event happened. |
Tactic
Processed data specifies an ATT&CK Tactic or an equivalent level of enrichment to the data collected by the capability. The detection gives the analyst information on the potential intent of the activity or helps answer the question “why would this be done?”
| Examples |
| --- |
| An alert called “Malicious Discovery” is triggered on a series of discovery techniques. The alert has a score indicating the activity is likely malicious, but it does not identify the specific type of discovery performed. |
| An alert describes that persistence occurred but does not specify how persistence was achieved. |
Technique
Processed data specifies an ATT&CK Technique or an equivalent level of enrichment to the data collected by the capability. The detection gives the analyst information on how the action was performed or helps answer the question “what was done” (e.g., Brute Force I/O).
| Examples |
| --- |
| An alert called “Impair Process Control with Unauthorized Command Message” is triggered, describing what command message was issued, why it is malicious, and how it has impaired process control. |
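For a compact view of the main categories defined above, the sketch below summarizes them as a simple enumeration. This is purely an illustrative aid, not an artifact of the evaluation or its published results.

```python
from enum import Enum

class DetectionCategory(Enum):
    """Illustrative summary of the main detection categories."""
    NOT_APPLICABLE = "Not Applicable"  # no sensor deployed on the system under test
    NONE = "None"                      # no relevant data collected or made available
    TELEMETRY = "Telemetry"            # minimally processed evidence the behavior occurred
    GENERAL = "General"                # processed data flags malicious/abnormal activity
    TACTIC = "Tactic"                  # enrichment conveys the potential intent ("why")
    TECHNIQUE = "Technique"            # enrichment conveys how the action was performed
```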
Modifier Detection Types
Configuration Change
The configuration of the capability was changed since the start of the evaluation. This may be done to show that additional data can be collected and/or processed. The “Configuration Change” modifier is subdivided based on the type of change:
- UX – Change was to the user experience and not to the capability's ability to detect behavior. Changes could include display of a certain type of data that was already collected but not visible to the user.
- Detection Logic – Change was to the capability's ability to process information that impacts its ability to detect adversary behavior.
- Data Sources – Change was to the capability's ability to capture a new type of data that impacts its ability to detect adversary behavior.
| Examples |
| --- |
| Data showing that a controller’s program changed state is collected on the backend but not displayed to the end user by default. The vendor changes a backend setting so that “Telemetry” on program state changes is displayed in the user interface. A detection of “Telemetry” with “Configuration Change-UX” would be given for the Change Program State technique. |
| The vendor toggles a setting that displays an additional label of “Discovery” when the foo, foo1, and foo2 discovery commands are executed. A detection of “Tactic” with “Configuration Change-Detection Logic” would be given (as opposed to the detection of “Telemetry” that would have been given before the change). |
| A rule or detection logic is created and applied retroactively, or is later retested to show functionality that exists in the capability. This would be labeled with the “Configuration Change-Detection Logic” modifier. |
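As an illustration of how the modifier and its sub-types might be attached to a detection record, the sketch below extends the hypothetical record format shown earlier. The field names and values are assumptions made for illustration only, not the format used to publish results.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical labels for the Configuration Change sub-types described above.
CONFIG_CHANGE_SUBTYPES = ("UX", "Detection Logic", "Data Sources")

@dataclass
class ModifiedDetection:
    technique: str
    category: str                         # e.g. "Telemetry" or "Tactic"
    config_change: Optional[str] = None   # one of CONFIG_CHANGE_SUBTYPES, if applicable
    data_sources: List[str] = field(default_factory=list)

# First example from the table above: telemetry surfaced through a UX-only change.
example = ModifiedDetection(
    technique="Change Program State",
    category="Telemetry",
    config_change="UX",
)
```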