Detection Engineering Path
1. Query Fundamentals
2. Signal Isolation
3. Behavioral Aggregation
4. Threat Hunting
5. Detection Rule Development
Beginner
KQL Foundations: Build a Signal-First Mindset
Learn to write deterministic first-pass hunts that remove noise before any heavy analysis.
Learning Outcomes
- Identify the minimum fields needed for triage-ready evidence.
- Use exact filters to isolate security-relevant events.
- Explain why filter-first query flow improves speed and trust.
Core Concepts
Start With Evidence Scope
Every query starts by deciding what event class is relevant. For auth incidents, begin with auth tables and explicit action fields.
auth_events | where Action == "LoginFailed"
If scope is vague, downstream stats become misleading.
Filter Before Shape
Do not project or summarize before removing noise. Filtering first preserves performance and reduces accidental false positives.
auth_events | where Action == "LoginFailed" and ResultCode !in ("KnownNoiseA", "KnownNoiseB")
This keeps your alert logic focused on attacker behavior, not platform chatter.
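A minimal sketch of the filter-first flow, assuming a few hypothetical rows built with datatable. The sample rows and the BadPassword and Ok result codes are illustrative assumptions, not real data; the table and column names follow the lesson's examples.

```kql
// Hypothetical sample rows; swap in your real auth table in practice.
let auth_events = datatable(Timestamp:datetime, User:string, SrcIp:string, Action:string, ResultCode:string)
[
    datetime(2024-05-01 08:00:00), "alice",    "203.0.113.10", "LoginFailed",  "BadPassword",
    datetime(2024-05-01 08:00:05), "svc-sync", "10.0.0.5",     "LoginFailed",  "KnownNoiseA",
    datetime(2024-05-01 08:00:09), "bob",      "203.0.113.10", "LoginSuccess", "Ok",
    datetime(2024-05-01 08:00:12), "carol",    "203.0.113.10", "LoginFailed",  "BadPassword"
];
auth_events
| where Action == "LoginFailed"                        // 1. scope to the event class
| where ResultCode !in ("KnownNoiseA", "KnownNoiseB")  // 2. drop known platform noise
| project Timestamp, User, SrcIp, ResultCode           // 3. shape only what survives
```

On these rows only the alice and carol events remain, so any later shaping step works on two rows instead of four.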
Worked Scenario
SOC Alert Storm at Shift Handover
You inherit 30,000 auth events and need suspicious failures in under 5 minutes.
- Restrict to failed auth only.
- Exclude known benign service accounts if the environment allows.
- Project minimal triage fields to speed analyst review.
auth_events | where Action == "LoginFailed" | project Timestamp, User, SrcIp, Device, Country
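The query above covers the first and third steps. One way the service-account exclusion could be added is sketched below; the ServiceAccounts list, the account names in it, and the sample rows are hypothetical and would come from your own environment's documentation.

```kql
// Hypothetical allowlist of benign service accounts.
let ServiceAccounts = dynamic(["svc-backup", "svc-sync"]);
// Hypothetical sample rows using the lesson's column names.
let auth_events = datatable(Timestamp:datetime, User:string, SrcIp:string, Device:string, Country:string, Action:string)
[
    datetime(2024-05-01 07:58:00), "alice",      "203.0.113.10", "laptop-17", "NL", "LoginFailed",
    datetime(2024-05-01 07:58:30), "svc-backup", "10.0.0.8",     "backup-01", "US", "LoginFailed",
    datetime(2024-05-01 07:59:10), "bob",        "203.0.113.10", "laptop-22", "NL", "LoginFailed",
    datetime(2024-05-01 07:59:40), "carol",      "192.0.2.44",   "desk-03",   "US", "LoginSuccess"
];
auth_events
| where Action == "LoginFailed"                    // restrict to failed auth only
| where User !in (ServiceAccounts)                 // exclude known benign service accounts
| project Timestamp, User, SrcIp, Device, Country  // minimal triage fields
```

Keeping the exclusion in a named list makes it easy to review and to remove when the environment does not allow it.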
Common Mistakes
- Starting with a broad contains match against raw message fields.
- Returning every column, which slows triage and hides signal.
- Skipping explicit action/result filters and mixing unrelated data.
Detection Context
Auth failures alone are weak signals. Combine burst rate, user spread, source concentration, and time window to isolate attacker behavior.
Password spraying pattern: few attempts per user, many targeted accounts, often a concentrated source. Map query shape to adversary tradecraft.
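Threat model mapping: MITRE ATT&CK T1110.003 (Brute Force: Password Spraying); T1078 (Valid Accounts) applies once a sprayed credential succeeds.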
Query Performance Tradeoffs
Operator      Cost      Notes
==            fast      Best for exact deterministic filters.
startswith    medium    Good for controlled naming patterns.
contains      slow      Broad matching can increase scan cost and noise.
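To make the difference in match breadth concrete, the sketch below applies all three operators to a handful of hypothetical account names. The names are invented for illustration, and the example shows what each operator matches, not how long it takes.

```kql
// Hypothetical account names chosen to show how far each operator reaches.
let users = datatable(User:string)
[
    "admin",
    "admin-ops",
    "sysadmin",
    "badminton-club"
];
users
| extend
    Exact    = User == "admin",          // true for "admin" only
    Prefix   = User startswith "admin",  // true for "admin" and "admin-ops"
    Anywhere = User contains "admin"     // true for all four rows
```

Kusto also provides has, which matches whole indexed terms and is generally cheaper than contains, so it may be worth trying before falling back to a substring search.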
Analyst Tip
If your first query returns more than 5,000 rows, you likely have not isolated signal. Tighten filters before aggregating.
Detection Strategy
- Set a failure-rate threshold over a fixed time window.
- Require distinct-user spread to catch spray behavior.
- Constrain by source concentration and high-signal entities.
- Document trusted infrastructure exclusions for false-positive control.
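The four points above can be sketched as a single query. Everything named here is an assumption for illustration: the sample rows, the TrustedSources list, the 10-minute window, and the MinFailed and MinVictims values, which are sized for the tiny sample and are not recommended production thresholds.

```kql
// Hypothetical sample data using the lesson's column names.
let auth_events = datatable(Timestamp:datetime, User:string, SrcIp:string, Action:string)
[
    datetime(2024-05-01 08:00:10), "alice",    "203.0.113.10", "LoginFailed",
    datetime(2024-05-01 08:01:20), "bob",      "203.0.113.10", "LoginFailed",
    datetime(2024-05-01 08:02:30), "carol",    "203.0.113.10", "LoginFailed",
    datetime(2024-05-01 08:03:40), "dave",     "203.0.113.10", "LoginFailed",
    datetime(2024-05-01 08:01:00), "svc-sync", "10.0.0.5",     "LoginFailed",
    datetime(2024-05-01 08:02:00), "svc-sync", "10.0.0.5",     "LoginFailed",
    datetime(2024-05-01 08:03:00), "svc-sync", "10.0.0.5",     "LoginFailed",
    datetime(2024-05-01 08:04:00), "erin",     "192.0.2.44",   "LoginFailed"
];
let TrustedSources = dynamic(["10.0.0.5"]);  // documented trusted-infrastructure exclusions
let WindowSize = 10m;                        // fixed evaluation window
let MinFailed = 3;                           // failure threshold per window (sample-sized)
let MinVictims = 3;                          // distinct-user spread threshold (sample-sized)
auth_events
| where Action == "LoginFailed"
| where SrcIp !in (TrustedSources)
| summarize Failed = count(), Victims = dcount(User) by SrcIp, WindowStart = bin(Timestamp, WindowSize)
| where Failed >= MinFailed and Victims >= MinVictims
| project WindowStart, SrcIp, Failed, Victims
```

On the sample rows only 203.0.113.10 should pass: the trusted source is excluded up front, and 192.0.2.44 has a single failure against a single user.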
Production Detection Rule
auth_events | where Action == "LoginFailed" | project Timestamp, User, SrcIp, Device, Country
Intermediate
KQL Aggregation: Convert Logs Into Behaviors
Use summarize and distinct counts to detect spread-based attacks like password spray.
Learning Outcomes
- Differentiate event pressure from victim spread.
- Rank entities by operational risk, not just volume.
- Choose grouping keys that preserve attacker behavior.
Core Concepts
Pressure vs Spread
count() shows how noisy activity is; dcount(User) shows how widely a source is targeting identities.
summarize Failed=count(), Victims=dcount(User) by SrcIp
Spray attacks often hide in moderate count but high victim spread.
Risk-First Ranking
Sort by spread first, then pressure. This prioritizes likely coordinated attacks above isolated brute-force noise.
sort by Victims desc, Failed desc
Analysts should investigate high-spread sources early to reduce blast radius.
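A small sketch of pressure versus spread, assuming two hypothetical sources: one hammering a single account and one touching four accounts once each. The rows are invented for illustration. Note that dcount returns an estimate on large data; on a handful of rows like this it should match the exact distinct count.

```kql
// Hypothetical rows: 198.51.100.7 retries one user, 203.0.113.10 spreads across users.
let auth_events = datatable(User:string, SrcIp:string, Action:string)
[
    "alice", "198.51.100.7", "LoginFailed",
    "alice", "198.51.100.7", "LoginFailed",
    "alice", "198.51.100.7", "LoginFailed",
    "alice", "198.51.100.7", "LoginFailed",
    "alice", "198.51.100.7", "LoginFailed",
    "alice", "198.51.100.7", "LoginFailed",
    "alice", "203.0.113.10", "LoginFailed",
    "bob",   "203.0.113.10", "LoginFailed",
    "carol", "203.0.113.10", "LoginFailed",
    "dave",  "203.0.113.10", "LoginFailed"
];
auth_events
| where Action == "LoginFailed"
| summarize Failed = count(), Victims = dcount(User) by SrcIp
| sort by Victims desc, Failed desc  // spread first, then pressure
```

Sorting by Failed alone would rank 198.51.100.7 first (6 failures, 1 user). Sorting by Victims first surfaces 203.0.113.10 (4 failures, 4 users), which is the spray-shaped source.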
Worked Scenario
Suspected Password Spray Campaign
Security tooling reports many failures, but volume alone is not enough to prioritize.
- Filter to failed login events.
- Aggregate by source with both count and distinct user breadth.
- Sort by victim spread first to reveal true spray behavior.
auth_events | where Action == "LoginFailed" | summarize Failed=count(), Victims=dcount(User) by SrcIp | sort by Victims desc, Failed desc | limit 10
Common Mistakes
- Grouping by too many fields and fragmenting suspicious behavior.
- Using only count() and missing spread-driven abuse.
- Ranking solely by failed volume without distinct-user signal.
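The first mistake is easy to reproduce. In the sketch below, which uses hypothetical rows, adding User to the grouping keys splits one source's activity into one row per user, so the spread signal disappears.

```kql
// Hypothetical rows: one source failing against four different users.
let auth_events = datatable(User:string, SrcIp:string, Action:string)
[
    "alice", "203.0.113.10", "LoginFailed",
    "bob",   "203.0.113.10", "LoginFailed",
    "carol", "203.0.113.10", "LoginFailed",
    "dave",  "203.0.113.10", "LoginFailed"
];
auth_events
| where Action == "LoginFailed"
// Over-grouped: every output row has Failed = 1 and Victims = 1.
| summarize Failed = count(), Victims = dcount(User) by SrcIp, User
// Grouping "by SrcIp" alone would instead return a single row covering all four users.
```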
Production Detection Rule
auth_events | where Action == "LoginFailed" | summarize Failed=count(), Victims=dcount(User) by SrcIp | sort by Victims desc, Failed desc | limit 10
Intermediate
KQL Strings: Precision Matching for Detection
Use strict string operators to reduce false positives in user and host detections.
Learning Outcomes
- Pick exact operators for prefix/suffix patterns.
- Avoid broad substring matching when precise operators exist.
- Explain string operator tradeoffs in triage notes.
Core Concepts
Prefer Exact Intent
Choose startswith or endswith when your hypothesis is position-aware. The contains operator is broader and noisier.
where User startswith "admin"
Precision operators make detections more stable across datasets.
Chain With Context
String checks should be combined with action and time constraints to avoid weak single-signal alerts.
where User startswith "admin" and Action == "LoginFailed" and Timestamp > ago(24h)
Contextual constraints improve confidence and reduce responder fatigue.
Worked Scenario
Privilege Account Abuse Triage
You need to isolate failed logins on privileged naming patterns without matching unrelated strings.
- Use startswith for deterministic account prefix matching.
- Scope by action and recent timeframe.
- Return only response-critical fields.
auth_events | where Action == "LoginFailed" and User startswith "admin" and Timestamp > ago(24h) | project Timestamp, User, SrcIp, Country
Common Mistakes
- Using contains "admin", which also matches names where "admin" appears mid-string.
- Ignoring case normalization considerations in mixed data sources.
- Relying on naming pattern alone without behavior context.
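On the case-normalization point, a short sketch with hypothetical account names: in KQL, startswith ignores case while startswith_cs does not, so mixed-case sources can behave differently depending on which one a rule uses.

```kql
// Hypothetical rows with mixed-case account names from different sources.
let auth_events = datatable(User:string, Action:string)
[
    "admin-ops", "LoginFailed",
    "ADMIN-db",  "LoginFailed",
    "sysadmin",  "LoginFailed",
    "admin-ops", "LoginSuccess"
];
auth_events
| where Action == "LoginFailed"                   // behavior context first
| extend
    PrefixAnyCase  = User startswith "admin",     // true for admin-ops and ADMIN-db
    PrefixSameCase = User startswith_cs "admin",  // true for admin-ops only
    NormalizedUser = tolower(User)                // one way to normalize before grouping
```

Whether ADMIN-db should match is a decision to record in triage notes; the point is to choose the operator deliberately instead of inheriting its default.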
Production Detection Rule
auth_events | where Action == "LoginFailed" and User startswith "admin" and Timestamp > ago(24h) | project Timestamp, User, SrcIp, Country
Intermediate
KQL Time Binning: Detect Burst Windows
Build fixed-window timelines to identify short attack bursts and escalation windows.
Learning Outcomes
- Use bin() for deterministic interval analysis.
- Interpret spikes as pivot points for deeper hunts.
- Produce timeline output analysts can act on quickly.
Core Concepts
Fixed Intervals
bin(Timestamp, 5m) creates consistent buckets so peaks are comparable and auditable.
summarize Failed=count() by bin(Timestamp, 5m)
Inconsistent windows break trend reliability.
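A minimal sketch of how bin() assigns events to fixed buckets, using hypothetical timestamps. Each timestamp is rounded down to the start of its 5-minute interval.

```kql
// Hypothetical failed-login timestamps.
let auth_events = datatable(Timestamp:datetime, Action:string)
[
    datetime(2024-05-01 08:01:00), "LoginFailed",
    datetime(2024-05-01 08:03:30), "LoginFailed",
    datetime(2024-05-01 08:04:59), "LoginFailed",
    datetime(2024-05-01 08:05:00), "LoginFailed",
    datetime(2024-05-01 08:12:10), "LoginFailed"
];
auth_events
| where Action == "LoginFailed"
| summarize Failed = count() by Window = bin(Timestamp, 5m)
| sort by Window asc
```

The first three events land in the 08:00 bucket, the 08:05:00 event starts the 08:05 bucket, and the last falls in 08:10, giving counts of 3, 1, and 1.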
Interval Prioritization
Ranking top windows by volume focuses analysts on likely coordinated attempts first.
sort by Failed desc | limit 10
Top intervals become pivot points for source and user deep dives.
Worked Scenario
Early-Morning Failure Spike
Blue team sees noisy overnight failures and needs to confirm whether the spike is coordinated.
- Scope to failed auth events.
- Bucket by 5-minute windows.
- Rank highest windows and pivot from those intervals.
auth_events | where Action == "LoginFailed" | summarize Failed=count() by bin(Timestamp, 5m) | sort by Failed desc | limit 10
Common Mistakes
- Aggregating by raw timestamp without bin(), creating unusable granularity.
- Comparing windows of different sizes.
- Stopping at timeline output without entity-level pivots.
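One way to avoid the last mistake is to pivot from the busiest window straight into an entity-level view. The sketch below assumes hypothetical rows and uses toscalar to capture the start of the peak 5-minute window before re-querying the same events by source.

```kql
// Hypothetical failed-login rows spread over several windows.
let auth_events = datatable(Timestamp:datetime, User:string, SrcIp:string, Action:string)
[
    datetime(2024-05-01 08:01:00), "alice", "203.0.113.10", "LoginFailed",
    datetime(2024-05-01 08:02:00), "bob",   "203.0.113.10", "LoginFailed",
    datetime(2024-05-01 08:03:00), "carol", "203.0.113.10", "LoginFailed",
    datetime(2024-05-01 08:04:00), "alice", "198.51.100.7", "LoginFailed",
    datetime(2024-05-01 08:21:00), "dave",  "198.51.100.7", "LoginFailed",
    datetime(2024-05-01 08:40:00), "erin",  "192.0.2.44",   "LoginFailed"
];
// Start of the 5-minute window with the most failures.
let PeakStart = toscalar(
    auth_events
    | where Action == "LoginFailed"
    | summarize Failed = count() by Window = bin(Timestamp, 5m)
    | top 1 by Failed desc
    | project Window);
// Entity-level pivot restricted to that window.
auth_events
| where Action == "LoginFailed"
| where Timestamp >= PeakStart and Timestamp < PeakStart + 5m
| summarize Failed = count(), Victims = dcount(User) by SrcIp
| sort by Victims desc, Failed desc
```

Here the peak window starts at 08:00, and the pivot shows 203.0.113.10 responsible for three failures across three users within it.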
Production Detection Rule
auth_events | where Action == "LoginFailed" | summarize Failed=count() by bin(Timestamp, 5m) | sort by Failed desc | limit 10
Advanced
KQL Detection Patterns: From Query to Rule
Turn hunting logic into resilient detections with clear risk thresholds and evidence output.
Learning Outcomes
- Encode behavior-based hypotheses as checks.
- Balance alert precision and miss-risk with threshold design.
- Produce response-ready outputs for SOC playbooks.
Core Concepts
Behavior Hypothesis First
Define attacker behavior before syntax. For spray: one source, many users, repeated failures.
SrcIp with Failed>=30 and Victims>=8
Hypothesis-first logic survives data drift better than pattern memorization.
Threshold Pairing
Use both pressure and spread thresholds to avoid over-alerting on benign failures.
where Failed >= 30 and Victims >= 8
Dual thresholds improve precision while retaining meaningful recall.
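To see why pairing matters, the sketch below applies each threshold alone and then together to three hypothetical, pre-aggregated sources. The per-source numbers are invented for illustration; the 30 and 8 thresholds are the ones used in this lesson.

```kql
// Hypothetical per-source aggregates:
//   198.51.100.7  high pressure, one user (brute force or a stuck client)
//   192.0.2.44    low pressure, many users (for example, shared egress with a few typos)
//   203.0.113.10  high pressure and high spread (spray-shaped)
let per_source = datatable(SrcIp:string, Failed:long, Victims:long)
[
    "198.51.100.7", 120, 1,
    "192.0.2.44",   9,   9,
    "203.0.113.10", 45,  30
];
per_source
| extend
    PressureOnly = Failed >= 30,                  // flags 198.51.100.7 and 203.0.113.10
    SpreadOnly   = Victims >= 8,                  // flags 192.0.2.44 and 203.0.113.10
    Paired       = Failed >= 30 and Victims >= 8  // flags 203.0.113.10 only
```

Either threshold on its own flags two sources; the paired condition leaves only the one that matches the spray hypothesis.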
Worked Scenario
Detection Rule Hardening
A draft spray query alerts too often. You need to harden it for production.
- Keep strict failed-auth filter.
- Aggregate by source with count and distinct-user breadth.
- Apply dual thresholds and output evidence fields only.
auth_events | where Action == "LoginFailed" | summarize Failed=count(), Victims=dcount(User) by SrcIp | where Failed >= 30 and Victims >= 8 | project SrcIp, Failed, Victims
Common Mistakes
- Shipping rules validated on only one sample outcome.
- Ignoring analyst workload when tuning thresholds.
- Returning non-actionable fields that slow escalation.
Detection Context
Threat model mapping: MITRE ATT&CK T1110 (Brute Force); password spraying is sub-technique T1110.003.
Production Detection Rule
auth_events | where Action == "LoginFailed" | summarize Failed=count(), Victims=dcount(User) by SrcIp | where Failed >= 30 and Victims >= 8 | project SrcIp, Failed, Victims