[Bug] Query validation failing to capture InSet edge case with ip field types #3540

Mikaayenson · 2024-03-26T21:55:28Z

Summary

Related to #3534

When ip fields are used with in_set fields that are type keyword, failures occur in elasticsearch.

For example:

network where destination.address in ("127.0.0.1", "::1")

Our internal validation does not catch this.

The validation issue is a combination of two things:

The type of ip fields are internally set to String here within detection-rules
We don't yet have an ip type_hint to correlate in eql here class TypeHint(Enum):

After fixing those two small updates, we then would run into issues where all ip types can not be compared to type string, which should not occur. Rather the edge case is that the combination of the in operation with mismatched types needs to be handled, most likely in def in_sets, which later performs validation in validate_type.

The dilemma is that If we set correct the type hint for ip fields, the validation we fail as expected for in_set checks, however by field type hint information is already set and will also fail in other locations where this check is relaxed and allows ip / string type comparisons. On the other hand, if we leave type hints as-is, and try to cover this edge case within in_set, the type hint is already

Note: ATM, the only occurance of this within our repos was resolved in 3534.
Note: I observed there are some integrations where fields like destination.ip are type keyword and some are type ip. Look at the integrations-schemas zip to see examples. They are all type ip in the network integration.
Note: To get the error to finally bubble up, we also need to make sure calls to elasticsearch_type_family in integrations doesn't auto convert the type of ip fields to keyword.

Additional Details

The proposed solution is to do two things:

Allow the regular way of validating to occur with no changes to the field type_hint which converts ip's to keywords.
Within the in_set validation, pull the original field value from the schema, and get its actual type which is a type ip. Then use the ip type_hint to validate the fields within the in_set.
Normalize the kql_schema to just schema
Ensure the integration schemas built dont modify the original field type.

I've attached git diff's to help with the implementation.

Detection Rules Git Diff

diff --git a/detection_rules/ecs.py b/detection_rules/ecs.py
index 1418c9647..878e03175 100644
--- a/detection_rules/ecs.py
+++ b/detection_rules/ecs.py
@@ -176,19 +176,24 @@ class KqlSchema2Eql(eql.Schema):
     }
 
     def __init__(self, kql_schema):
-        self.kql_schema = kql_schema
         eql.Schema.__init__(self, {}, allow_any=True, allow_generic=False, allow_missing=False)
+        self.schema = kql_schema
+
 
     def validate_event_type(self, event_type):
         # allow all event types to fill in X:
         #   `X` where ....
         return True
 
+
+    def flatten(self):
+        pass
+
     def get_event_type_hint(self, event_type, path):
         from kql.parser import elasticsearch_type_family
 
         dotted = ".".join(path)
-        elasticsearch_type = self.kql_schema.get(dotted)
+        elasticsearch_type = self.schema.get(dotted)
         es_type_family = elasticsearch_type_family(elasticsearch_type)
         eql_hint = self.type_mapping.get(es_type_family)
 
diff --git a/detection_rules/rule_validators.py b/detection_rules/rule_validators.py
index 6631db83e..eea122ffc 100644
--- a/detection_rules/rule_validators.py
+++ b/detection_rules/rule_validators.py
@@ -214,7 +214,7 @@ class EQLValidator(QueryValidator):
     def text_fields(self, eql_schema: Union[ecs.KqlSchema2Eql, endgame.EndgameSchema]) -> List[str]:
         """Return a list of fields of type text."""
         from kql.parser import elasticsearch_type_family
-        schema = eql_schema.kql_schema if isinstance(eql_schema, ecs.KqlSchema2Eql) else eql_schema.endgame_schema
+        schema = eql_schema.schema if isinstance(eql_schema, ecs.KqlSchema2Eql) else eql_schema.endgame_schema
 
         return [f for f in self.unique_fields if elasticsearch_type_family(schema.get(f)) == 'text']

EQL Git Diff

diff --git a/eql/parser.py b/eql/parser.py
index 8df9cf1..52ebc1e 100644
--- a/eql/parser.py
+++ b/eql/parser.py
@@ -808,6 +808,19 @@ class LarkToEQL(Interpreter):
             if not inner.validate_type(outer):
                 raise self._type_error(inner, outer, error_message)
 
+        if self._elasticsearch_syntax and hasattr(outer, "type_info"):
+            # Check edge case of in_set and ip/string comparison
+            outer_type = outer.type_info
+            type_hint = self._schema.schema.get(str(outer.node), "unknown")
+            if hasattr(self._schema, "type_mapping") and type_hint == "ip":
+                outer.type_info = TypeHint.IP
+                for inner in container:
+                    if not inner.validate_type(outer):
+                        raise self._type_error(inner, outer, error_message)
+
+            # reset the type
+            outer.type_info = outer_type
+
         # This will always evaluate to true/false, so it should be a boolean
         term = ast.InSet(outer.node, [c.node for c in container])
         nullable = outer.nullable or any(c.nullable for c in container)
diff --git a/eql/types.py b/eql/types.py
index 07d6af2..3f09646 100644
--- a/eql/types.py
+++ b/eql/types.py
@@ -7,6 +7,7 @@ class TypeHint(Enum):
 
     Array = "array"
     Boolean = "boolean"
+    IP = "ip"
     Numeric = "number"
     Null = "null"
     Object = "object"

The text was updated successfully, but these errors were encountered:

brokensound77 · 2024-04-01T17:28:01Z

does elasticsearch (or endpoint) support IP_TYPE for in?

Mikaayenson · 2024-04-01T18:31:48Z

@brokensound77 Here's the original thread which I forgot to add. Elasticsearch supports IP_TYPE. Believe the actual is just that the types of the key:value need to match. all IP_TYPE or all keywords etc.

Mikaayenson · 2024-04-01T21:14:17Z

Speaking with @brokensound77 lets try to create an eql.Types.IP = str and check the type name in InSet

Mikaayenson · 2024-04-04T18:51:00Z

As an alternative to subclassing and overriding all the methods from eql, we could monkey patch the in_set function where the original patch suggestions existed.

Here's an example:

in_set Patch

diff --git a/detection_rules/rule_validators.py b/detection_rules/rule_validators.py
index 6631db83e..61a785a81 100644
--- a/detection_rules/rule_validators.py
+++ b/detection_rules/rule_validators.py
@@ -4,7 +4,7 @@
 # 2.0.
 
 """Validation logic for rules containing queries."""
-from functools import cached_property
+from functools import cached_property, wraps
 from typing import List, Optional, Tuple, Union
 
 import eql
@@ -21,6 +21,9 @@ from .rule import (EQLRuleData, QueryRuleData, QueryValidator, RuleMeta,
                    TOMLRuleContents, set_eql_config)
 from .schemas import get_stack_schemas
 
+from eql.parser import _parse as base_parse, LarkToEQL, NodeInfo, TypeHint
+from eql import ast
+
 EQL_ERROR_TYPES = Union[eql.EqlCompileError,
                         eql.EqlError,
                         eql.EqlParseError,
@@ -31,6 +34,42 @@ EQL_ERROR_TYPES = Union[eql.EqlCompileError,
 KQL_ERROR_TYPES = Union[kql.KqlCompileError, kql.KqlParseError]
 
 
+BaseInSetMethod = LarkToEQL.in_set  # preserve the original method for later use
+
+def custom_in_set(self, node):
+    """Override and address the limitations of the eql in_set method."""
+    print("custom_in_set")
+    # return BaseInSetMethod(self, node)
+    outer, container = self.visit(node.child_trees)  # type: (NodeInfo, list[NodeInfo])
+
+    if not outer.validate_type(TypeHint.primitives()):
+        # can't compare non-primitives to sets
+        raise self._type_error(outer, TypeHint.primitives())
+
+    # Check that everything inside the container has the same type as outside
+    error_message = "Unable to compare {expected_type} to {actual_type}"
+    for inner in container:
+        if not inner.validate_type(outer):
+            raise self._type_error(inner, outer, error_message)
+
+    # This will always evaluate to true/false, so it should be a boolean
+    term = ast.InSet(outer.node, [c.node for c in container])
+    nullable = outer.nullable or any(c.nullable for c in container)
+    return NodeInfo(term, TypeHint.Boolean, nullable=nullable, source=node)
+
+def custom_base_parse_decorator(func):
+    """Override and address the limitations of the eql in_set method."""
+    @wraps(func)
+    def wrapper(query, start=None, **kwargs):
+        LarkToEQL.in_set = custom_in_set
+        print("custom_base_parse_decorator")
+        result = func(query, start=start, **kwargs)
+        LarkToEQL.in_set = BaseInSetMethod
+        return result
+    return wrapper
+
+eql.parser._parse = custom_base_parse_decorator(base_parse)
+
 class KQLValidator(QueryValidator):
     """Specific fields for KQL query event types."""

This is much smaller of a change.

Note: There are still other things to do:

Make the change in the new in_set method to fix the issue.
Probably clean up the names and how its written.
May not want this in rule_validators. Not sure where the best place is if not here.

Mikaayenson added the bug Something isn't working label Mar 26, 2024

eric-forte-elastic self-assigned this Mar 26, 2024

eric-forte-elastic mentioned this issue Apr 4, 2024

[Bug] Query validation failing to capture InSet edge case with ip field types #3572

Merged

eric-forte-elastic linked a pull request Apr 4, 2024 that will close this issue

[Bug] Query validation failing to capture InSet edge case with ip field types #3572

Merged

eric-forte-elastic closed this as completed in #3572 May 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug] Query validation failing to capture InSet edge case with ip field types #3540

[Bug] Query validation failing to capture InSet edge case with ip field types #3540

Mikaayenson commented Mar 26, 2024 •

edited

Loading

brokensound77 commented Apr 1, 2024

Uh oh!

Mikaayenson commented Apr 1, 2024

Uh oh!

Mikaayenson commented Apr 1, 2024

Uh oh!

Mikaayenson commented Apr 4, 2024

Uh oh!

[Bug] Query validation failing to capture InSet edge case with ip field types #3540

[Bug] Query validation failing to capture InSet edge case with ip field types #3540

Comments

Mikaayenson commented Mar 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Additional Details

brokensound77 commented Apr 1, 2024

Uh oh!

Mikaayenson commented Apr 1, 2024

Uh oh!

Mikaayenson commented Apr 1, 2024

Uh oh!

Mikaayenson commented Apr 4, 2024

Uh oh!

Mikaayenson commented Mar 26, 2024 •

edited

Loading