Add a function to work around the truncated XML element text property:
- The lxml element text property only returns the first text child node, so the text is cut off at embedded comments.
- Use xpath('text()') and a string join to get the full list of text parts for the entire element.
- Misc whitespace and spelling corrections.
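The truncation described above can be reproduced with a minimal sketch. It makes one loud assumption: it uses the standard library's xml.etree.ElementTree (with comments kept in the tree) as a stand-in so that it runs without lxml installed. The helper names parse_keeping_comments and direct_text and the sample query are hypothetical and not part of this commit. With lxml, the commit's own approach is ''.join(element.xpath('text()')).

```python
import xml.etree.ElementTree as ET


def parse_keeping_comments(xml_text):
    """Parse XML and keep comments as child nodes, as lxml does by default."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(xml_text, parser=parser)


def direct_text(element):
    """Join every direct text part of an element.

    The text before the first child is in element.text; the text after each
    child (including comment nodes) is stored in that child's tail.
    """
    parts = [element.text or '']
    parts.extend(child.tail or '' for child in element)
    return ''.join(parts)


# Hypothetical Select element with a comment embedded in the XPath text.
select = parse_keeping_comments(
    '<Select Path="Security">'
    '*[System[(EventID=4624<!-- logon --> or EventID=4634)]]'
    '</Select>'
)

truncated = select.text     # stops at the embedded comment
full = direct_text(select)  # the whole XPath expression
print(repr(truncated))
print(repr(full))
```

The comment node splits the element's text into two parts, and the plain text property only exposes the first one, which is why joining all direct text parts is needed to recover the full query.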
- logger.warning(f'\'description\' was not given with \'--metadata\', but \'--metadata-full-description\' was given, so description metadata will be inlcuded in full.')
+ logger.warning(f'\'description\' was not given with \'--metadata\', but \'--metadata-full-description\' was given, so description metadata will be included in full.')
+     - The .text property of elements from lxml only includes the first text part and is split by embedded comments, causing <Element>.text to produce truncated/partial text.
+     - Instead of <Element>.text, using <Element>.xpath('text()') is safer and returns a list of all text parts.
+     - Extending the lxml etree ElementBase class was not used due to:
+       - Complexity: one cannot simply extend a class without interacting with the element tree. See: https://lxml.de/element_classes.html.
+       - Extending the class injects a parent node element and shifts the current element with its properties as a child element of the extended element.
+     """
+
+     #return ''.join([t for t in element.itertext()])
+     return ''.join(element.xpath('text()'))
+
+
  def get_queries(xml_query_list):
      """
      Extract elements and metadata from a Windows event XML query list
  # - The Path (log name) might be specified in either the query attributes (along with the query ID) or in the XPath definition.
+ # - The etree.Element class was extended as QueryElement with an all_text property because the standard text property truncates text after any embedded comments.
  # - TODO: It's undetermined whether both the XPath and the query can be set simultaneously and are allowed to differ, or must be consistent.
  # - TODO: Collected comments can get disassociated from the nearby select or suppress statement they annotate, so it's less useful for large/complex query IDs.
  Enumerate combinations of select or suppress sub-query elements and propagate references to the deepest level of event specificity.
  During enumeration, event and provider metadata lookups are done and used to increase the specificity.
-
+
  enum: dict object passed as a reference to add enumerated events and references to.
  s_file: subscription file name to reference
  q_id: Query ID of the XML element
  q_parent_path: Path attribute in the Query element
  q_type: Query type element, either Select or Suppress
  q_xpath: XPath data/text within Select or Suppress element
-
+
  Note, a pseudo-hierarchy of event query specificity, and related reference level, is as follows:
-
+
  Paths:
    Path: required
      Providers
        Provider: optional
          Events:
            Event ID: optional
              (Reference at this level)
-
+
  When query parsing and metadata lookups fail to resolve a specific provider(s) or event ID(s), a single null node is created.
-
+
  NOTE: Query specificity:
  - A query could just select an Event directly from a Path without specifying the Provider.
  - When an Event ID or Level is selected without a Provider, ambiguity results and multiple Providers and Events are in scope.
  - With the Security log Path/Channel, Microsoft-Windows-Security-Auditing is the most common provider writing to this channel.
  - The Path attribute can be specified at the Query level and at the sub-query Select or Suppress level. The deeper Select/Suppress level is assumed to take precedence.
-
+
  NOTE: Query metadata lookup failure reasons:
  - FIXME: Regex to extract event provider, event ID, or level failed to match properly.
  - Query did not select a valid/defined event or provider.
  - Source system metadata extract lacked provider or event manifests.

  - FIXME: This function has become too complex and bloated. Perhaps it should be simplified by refactoring it to use objects to compartmentalise and abstract the complexity.
  """
-
+
  # Assume the sub-level <Select Path=...> or <Suppress Path=...> attribute will take precedence over the parent <Query Path=...> attribute when choosing a channel
  logger.warning(f"Contradicting query Path in '{s_file}'. Query ID {q_id}'s child Path attribute, <{q_type} Path='{q_path}'>, does not match the parent Path attribute, <Query Path='{q_parent_path}'>. The child Path will take precedence.")
-
+
  # XPath extractions and permutations
  q_xpath = q['XPath']
- # NOTE:
+ # NOTE:
  # - itertools.product or the list comprehension building a tuple from nested iterations will return an empty list if one of the iterations/lists input is empty
  # - Avoid this by replacing empty lists with a non-empty list containing the single null/None value to force the product to expand out the empty sets
  # - FIXME: expansion is only based on Providers, Event IDs and Levels and this will fail to expand on queries that use other event attributes such as Keywords, Tasks, Opcodes or Event Data.
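
The itertools.product behaviour noted above, and the None-placeholder workaround, can be illustrated with a minimal sketch. The function name expand_axes and the sample values are hypothetical and chosen for illustration only; this is not the code from the commit.

```python
from itertools import product


def expand_axes(providers, event_ids, levels):
    """Return every (provider, event_id, level) combination.

    An empty axis is replaced by [None] so that it contributes a single
    placeholder value instead of collapsing the whole product to nothing.
    """
    axes = [axis if axis else [None] for axis in (providers, event_ids, levels)]
    return list(product(*axes))


# One empty input list makes the plain product empty.
collapsed = list(product(["Microsoft-Windows-Security-Auditing"], [], [4]))

# With the placeholder, the known values still expand.
expanded = expand_axes(["Microsoft-Windows-Security-Auditing"], [], [4])

print(collapsed)
print(expanded)
```

Using None as the placeholder keeps each tuple the same length regardless of which attributes a query specified, so downstream code can treat None as "not specified" for that attribute.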