Portfolio · Incident Investigation & Resolution
Incident Case Studies
Real-world investigations, root cause analyses, and remediations from managed IT environments. All client details redacted. Each case documents the diagnostic approach, technical findings, and resolution actions taken.
6 case studies · 100% resolved · Multi-platform coverage · Client details redacted
Featured Investigation
01
Featured · Deepest Technical Investigation · REDACTED
💥
Security Software · Windows Crash Forensics
ESET Agent BEX64 Crash Loop Destabilizing RMM Communications
Multi-server WER analysis · Kernel driver forensics · DEP violation root cause
Fatal buffer overflow crashes of ERAAgent.exe (build 12.4.1124.0) were identified on three servers simultaneously, spanning the DATA, HOST, and DOMAIN CONTROLLER roles. Each crash left ESET's kernel filter drivers loaded but unmanaged, causing the ScreenConnect and ConnectWise Automate agents to flap and flood monitoring with false-positive offline alerts. Windows Error Reporting analysis confirmed identical crash signatures across the machines, pointing to a defective July 2025 agent build as the root cause.
ESET ERAAgent · BEX64 Crash Analysis · WER Forensics · Kernel Filter Drivers · ConnectWise Automate · ScreenConnect · DEP Violation · RMM Reliability
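The cross-machine comparison described in this case can be sketched as a grouping step: collect one record per crash, drop the host name, and check whether every host lands under the same signature. This is a minimal illustration, not the tooling used in the investigation. The `CrashRecord` fields and the `fault_offset` value are hypothetical; only the application name, build, event name, and server roles come from the summary above.

```python
from collections import defaultdict
from typing import NamedTuple


class CrashRecord(NamedTuple):
    host: str          # server role label, as used in the case summary
    app: str           # faulting application name
    version: str       # faulting application version
    event_name: str    # WER problem event name
    fault_offset: str  # hypothetical placeholder, not taken from the case


def hosts_by_signature(records):
    """Group hosts by crash signature (every field except the host)."""
    groups = defaultdict(set)
    for r in records:
        signature = (r.app, r.version, r.event_name, r.fault_offset)
        groups[signature].add(r.host)
    return dict(groups)


# Illustrative records only; the fault offset is invented for the example.
records = [
    CrashRecord("DATA", "ERAAgent.exe", "12.4.1124.0", "BEX64", "0x0000"),
    CrashRecord("HOST", "ERAAgent.exe", "12.4.1124.0", "BEX64", "0x0000"),
    CrashRecord("DOMAIN CONTROLLER", "ERAAgent.exe", "12.4.1124.0", "BEX64", "0x0000"),
]

shared = hosts_by_signature(records)
```

A single key in `shared` that maps to all three roles is what "identical crash signatures across the machines" looks like in this form; several keys would instead suggest independent faults.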
All Case Studies
02
Compliance & Governance · REDACTED
📋
SOC 2 CC7 · NIST 800-171 · Patch Governance
Patch Management Control Effectiveness Assessment
0% compliance discovered · 72-day exposure window · Governance failure
A failed patch cycle on 11/24/2025 went undetected for 72 days, leaving patch compliance at 0% into early 2026. The investigation identified a closed-loop process failure: patching occurred, but the validation, exception-handling, and remediation stages all collapsed. The root cause was a governance gap, not a tooling failure. Full remediation included a segregation-of-duties framework and continuous compliance monitoring.
SOC 2 CC7 · NIST 800-171 · NinjaOne · Patch Compliance · Root Cause Analysis · Segregation of Duties
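The exposure window in this case is simple date arithmetic, shown here as a short sketch. It assumes the 72 days are counted from the failed cycle date itself; the summary does not state the detection date, so `window_end` is derived, not quoted. The `compliance_pct` helper and the device count passed to it are hypothetical.

```python
from datetime import date, timedelta

failed_cycle = date(2025, 11, 24)  # from the case summary
exposure = timedelta(days=72)      # from the case summary

# Assumption: day 0 of the window is the failed patch cycle.
window_end = failed_cycle + exposure


def compliance_pct(patched: int, total: int) -> float:
    """Percentage of devices that are fully patched (0.0 if there are none)."""
    return 0.0 if total == 0 else 100.0 * patched / total


# Hypothetical fleet size, used only to show the 0% case.
rate = compliance_pct(patched=0, total=25)
```

Under that assumption the window closes on 2026-02-04, which is why the 0% figure extends into the new year.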
03
Connectivity Investigation · Disputed Root Cause · REDACTED
📡
Onboarding Incident · Multi-Incident Report
Agent Connectivity Disruption — Post-Onboarding Investigation
ScreenConnect correlation · Disputed Windows Defender attribution · Session interruption during evidence collection
Within 5 days of onboarding, repeated agent connectivity disruptions clustered in a 4–6 PM window. Engineering attributed the issue to Windows Defender, but no supporting evidence aligned with the disruption timestamps. The investigation instead documented a stronger, timestamped correlation with ScreenConnect activity. A second incident occurred during evidence collection: an RDP session was displaced and a reboot followed while supporting data was being gathered.
Agent Connectivity · ScreenConnect · RDP Forensics · RMM Onboarding · Disputed Root Cause · Session Security
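One way to compare the two competing attributions is to measure how many disruptions have a candidate event within a small time tolerance. The sketch below is illustrative only: the function name, the five-minute tolerance, and every timestamp are invented for the example and are not evidence from the case.

```python
from datetime import datetime, timedelta


def fraction_near(disruptions, candidate_events, tolerance):
    """Share of disruptions with at least one candidate event within tolerance."""
    if not disruptions:
        return 0.0
    hits = sum(
        any(abs(d - e) <= tolerance for e in candidate_events)
        for d in disruptions
    )
    return hits / len(disruptions)


# Synthetic timestamps on an arbitrary date, purely for illustration.
day = datetime(2000, 1, 1)
disruptions = [day.replace(hour=16, minute=5),
               day.replace(hour=16, minute=40),
               day.replace(hour=17, minute=20)]
screenconnect_events = [day.replace(hour=16, minute=3),
                        day.replace(hour=16, minute=41),
                        day.replace(hour=17, minute=18)]
defender_events = [day.replace(hour=2, minute=0)]

tolerance = timedelta(minutes=5)
sc_score = fraction_near(disruptions, screenconnect_events, tolerance)
defender_score = fraction_near(disruptions, defender_events, tolerance)
```

A high score for one source and a low score for the other does not prove causation, but it shows which attribution is at least consistent with the timestamps.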
04
Hardware Investigation · Storage Validation · REDACTED
🗄️
VMware ESXi · HPE Smart Array · Storage Forensics
Drive / RAID Controller Alerts Validated as Monitoring Noise
iLO + ESXi SSH + Smart Array CLI · 7-layer investigation · No hardware fault found
Drive-related "Disk Error: red" entries in NinjaOne were triggered alongside memory alerts on an HPE ProLiant Gen10 running ESXi 7.0.3. A 7-step multi-layer investigation traversed the iLO firmware, ESXi SSH, Smart Array CLI, logical drive, and physical disk layers. All storage components were confirmed healthy. The red entries were traced to non-present empty bays and unsupported SMART health lookups on the array LUN, not to a hardware fault.
VMware ESXi · HPE Smart Array · iLO Management · RAID Validation · NinjaOne · SMART Health · Monitoring Noise
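The conclusion of this case amounts to a triage rule: a red entry is only worth chasing if the bay is populated and the device can actually answer a SMART health query. A minimal sketch of that rule follows. The `DiskEntry` fields and the sample entries are hypothetical and do not reflect NinjaOne's data model or the actual host inventory.

```python
from dataclasses import dataclass


@dataclass
class DiskEntry:
    name: str
    bay_populated: bool    # False for a non-present, empty bay
    smart_supported: bool  # False where SMART health lookups are unsupported
    status: str            # "red" or "green", as shown by the monitoring tool


def triage(entry: DiskEntry) -> str:
    """Classify a disk entry as healthy, monitoring noise, or worth investigating."""
    if entry.status != "red":
        return "healthy"
    if not entry.bay_populated:
        return "noise: empty bay"
    if not entry.smart_supported:
        return "noise: SMART lookup unsupported"
    return "investigate"


# Hypothetical entries for illustration only.
entries = [
    DiskEntry("bay-7", bay_populated=False, smart_supported=False, status="red"),
    DiskEntry("array-lun", bay_populated=True, smart_supported=False, status="red"),
    DiskEntry("bay-1", bay_populated=True, smart_supported=True, status="green"),
]
results = {e.name: triage(e) for e in entries}
```

The rule deliberately falls through to "investigate" for any red entry it cannot explain, so a genuine fault is never silently reclassified as noise.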
05
EDR Security · Deployment Governance · REDACTED
🛡️
SentinelOne · EDR Deployment · Policy Enforcement
SentinelOne Agent Misconfiguration Analysis & Remediation
Services hung in stopping state · Tamper protection absent · Wrong console instance
SentinelOne services were found stuck in a "stopping" state across multiple servers, and the condition persisted across reboots, so it was not transient. Local service stop and process termination were permitted, indicating that tamper protection was not enforced. The organization had been onboarded into a new console instance without access being provisioned. The investigation identified incomplete deployment as the root cause, not a product defect.
SentinelOne · EDR Deployment · Tamper Protection · Policy Enforcement · Console Access · Deployment Governance
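The distinction between a reboot-persistent and a transient stuck service can be expressed as a small check over sampled service states. This is a sketch under assumptions: the `(boot_id, state)` sample format is hypothetical, and `STOP_PENDING` is used as the state label because that is how Windows reports a service that is stopping.

```python
def is_reboot_persistent(samples, stuck_state="STOP_PENDING"):
    """True if the stuck state was seen in two or more distinct boot sessions.

    `samples` is an iterable of (boot_id, state) pairs, where boot_id is any
    value that changes on each reboot (a hypothetical field for this sketch).
    """
    boots_with_stuck = {boot_id for boot_id, state in samples if state == stuck_state}
    return len(boots_with_stuck) >= 2


# Illustrative samples only.
persistent = [(1, "STOP_PENDING"), (1, "STOP_PENDING"), (2, "STOP_PENDING")]
transient = [(1, "STOP_PENDING"), (2, "RUNNING")]

persistent_result = is_reboot_persistent(persistent)
transient_result = is_reboot_persistent(transient)
```

Repeated samples within one boot session count only once, so a service that hangs for a long time but recovers after a restart is still treated as transient.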
06
Monitoring Operations · Platform Engineering · REDACTED
📊
NinjaOne · VMware ESXi · Monitoring Policy Engineering
NinjaOne Monitoring Tuning — VMware Host False Positive Reduction
6 targeted policy changes · Phantom drive alerts eliminated · Critical escalation preserved
Following confirmed storage validation on an HPE ProLiant Gen10 / ESXi 7.0.3 host, NinjaOne continued generating drive alerts and holding the device in a chronic "Needs attention" state. Produced a structured 6-action tuning specification for an L2–L3 administrator. The actions include eliminating SMART health false positives, de-emphasizing a known memory warning, converting the uptime alert to maintenance hygiene, and preserving escalation for genuinely critical sensors.
NinjaOne · VMware ESXi · Monitoring Tuning · Alert Noise Reduction · Policy Engineering · SNMP Sensors