Most security teams have the same problem: data volume is growing faster than analyst capacity, and signature-based detection alone is not catching the sophisticated, low-and-slow attacks that matter most. Machine learning promises to help, but production ML in a SOC is genuinely hard to operationalize. The gap between a data science notebook and a running inference pipeline that feeds your SIEM is wider than most blog posts acknowledge.

This series documents the end-to-end build of a working edge AI inference pipeline in a real security lab – not a cloud demo, not a toy dataset. The hardware is four NVIDIA Jetson Nano 4GB Developer Kit nodes. The SIEM is Splunk Enterprise 10.0 with Enterprise Security. The ML toolkit is the Splunk App for Data Science and Deep Learning (DSDL). The detection targets are real security data sources: Zeek connection logs, Splunk Stream DNS telemetry, and Windows Security event logs.

By the end of this four-part series, you will have a working blueprint for:

  • Running GPU-accelerated inference containers on constrained edge hardware
  • Wiring those containers to Splunk DSDL’s native protocol
  • Training Isolation Forest models on real security data
  • Generating scored anomaly events that drive ES correlation rules and notable events

This first post covers the architecture and conceptual foundation. If you are a security architect or senior security engineer who has been curious about operationalizing ML in your SOC without a massive cloud spend, this series is written for you.

Prerequisites

Before following along, you should have:

  • Experience with Splunk ES administration and SPL
  • Familiarity with Docker and Linux system administration
  • A working Splunk Enterprise 10.0 instance with ES installed
  • Basic understanding of supervised and unsupervised machine learning concepts
  • Access to Zeek, DNS, or Windows event log data in Splunk

Step 1 – Understanding the Problem Space

The fundamental challenge with ML in a SOC is not the algorithm – it is the operational pipeline. You need data to flow from your SIEM to the model, predictions to flow back, and the results to be actionable. Most ML-for-security projects fail not because the model is wrong but because the pipeline around it is brittle.

Traditional SIEM-to-ML pipelines have three common failure modes. First, they require a data science team to maintain a separate ML platform that security operations has no visibility into. Second, they use batch processing that introduces latency between detection and response. Third, they generate predictions that have no clear path into the analyst workflow.

The approach in this series addresses all three. DSDL runs inference containers close to your Splunk deployment, predictions feed back into ES as scored events that drive correlation rules, and the entire pipeline is managed through SPL that your Splunk team already understands.

Step 2 – Understanding Edge Inference

An inference service is a long-running web server that has a trained model loaded in memory and answers HTTP requests with predictions. When Splunk sends data to it, the model runs on that data and returns anomaly scores or classification labels. The server stays running between requests – the model is loaded once at startup, not reloaded for each request.

This is conceptually different from batch ML where you export data, run a script, and import results. Inference services are always on and respond in near-real-time, which is what makes them useful in a detection pipeline.
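To make the "loaded once at startup" property concrete, here is a minimal sketch of an inference service using only the Python standard library. It is a hypothetical illustration, not the container built later in this series (that one uses Flask); the threshold "model" and field names are stand-ins.

```python
# Minimal sketch of an inference service: the model is loaded ONCE at
# startup, then the long-running server answers many requests without
# ever reloading it. Purely illustrative; the real build uses Flask.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def load_model():
    # Stand-in for deserializing a trained model from disk at startup.
    return {"threshold": -0.5}

MODEL = load_model()  # loaded once; every request below reuses it

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        # "Score" each value against the already-loaded model.
        results = [{"score": x, "is_anomaly": x < MODEL["threshold"]}
                   for x in body["data"]]
        payload = json.dumps({"results": results}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep the demo quiet
        pass

# In production this would run forever on the inference port, e.g.:
# HTTPServer(("0.0.0.0", 8501), InferenceHandler).serve_forever()
```

The point of the sketch is the shape, not the details: startup cost is paid once, and each request only pays for scoring.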

The edge aspect means running these inference services on hardware that lives in your environment rather than a cloud provider. For a security lab, this has several advantages. Data stays on your network. You maintain full control over model versions. There is no egress cost. And for air-gapped or restricted environments, it is often the only viable approach.

Step 3 – Understanding the Jetson Nano Constraints

The NVIDIA Jetson Nano 4GB Developer Kit is a capable edge AI platform but it has hard constraints that shape every architectural decision in this series.

The hardware runs JetPack 4.6.6, which is the ceiling for this device. JetPack 4.6.6 provides Ubuntu 18.04, Python 3.6.9 natively, and CUDA 10.2. These are not soft constraints – you cannot upgrade JetPack on the original Nano to a newer version.

The practical implications are significant. Kubernetes and K3s will not run on this hardware – the kernel is too old. Flask 2.x will not install – it dropped Python 3.6 support. Large language models and transformer architectures will not fit in 4 GB of unified RAM. The right models for this hardware are small, tabular, scikit-learn-based algorithms. Isolation Forest is the perfect match: it runs entirely on CPU, has a tiny RAM footprint around 50 MB, and is exactly the kind of unsupervised anomaly detector that security data calls for.
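The Isolation Forest fit can be sketched in a few lines of scikit-learn. The feature values below are synthetic stand-ins (not data from this series), but they show the CPU-only train-and-score cycle that runs comfortably in the Nano's memory budget.

```python
# Hedged sketch: why Isolation Forest suits this hardware. It trains and
# scores small tabular data entirely on CPU. Values are synthetic; the
# two columns loosely stand in for features like bytes_out / conn_count.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Baseline "normal" traffic drawn from a tight distribution.
normal = rng.normal(loc=[100.0, 5.0], scale=[10.0, 1.0], size=(500, 2))
# Two rows that sit far outside that baseline.
suspects = np.array([[1000.0, 50.0], [950.0, 40.0]])

model = IsolationForest(n_estimators=100, random_state=0).fit(normal)

scores = model.decision_function(suspects)  # lower = more anomalous
labels = model.predict(suspects)            # -1 = anomaly, 1 = normal
```

Because it is unsupervised, no labeled attack data is required: the model learns the baseline and flags what isolates easily from it.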

Working within these constraints rather than fighting them is what makes the build in this series reliable and repeatable.

Step 4 – Understanding How DSDL Works

The Splunk App for Data Science and Deep Learning (DSDL) is a Splunk app that bridges SPL and Docker containers running inference servers. When you write | fit MLTKContainer algo=isolation_forest in a Splunk search, DSDL serializes your search results as a CSV string, POSTs them to an endpoint on your inference container over HTTPS, and the container returns predictions as another CSV string that DSDL merges back into your Splunk results.

Three things about DSDL’s actual behavior differ from what the documentation implies:

First, DSDL does not send JSON arrays. It sends data as a CSV string inside a JSON wrapper: {"data": "<csv string>", "meta": {"options": {...}, "feature_variables": [...]}}. Custom containers must parse this format exactly or the pipeline silently fails.
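On the container side, unwrapping that format is a two-step parse: JSON first, then the embedded CSV. The sketch below is hedged – the column names and options are illustrative, and the meta structure follows the simplified shape quoted above rather than the full protocol.

```python
# Hedged sketch of parsing DSDL's wire format: a CSV string inside a
# JSON wrapper. Column names and option values are illustrative.
import io
import json
import pandas as pd

# What a (simplified) request body looks like on the wire:
raw = json.dumps({
    "data": "bytes_out,conn_count\n1200,4\n98000,121\n",
    "meta": {
        "options": {"params": {"algo": "isolation_forest"}},
        "feature_variables": ["bytes_out", "conn_count"],
    },
})

payload = json.loads(raw)                       # step 1: JSON wrapper
df = pd.read_csv(io.StringIO(payload["data"]))  # step 2: embedded CSV
features = payload["meta"]["feature_variables"]
X = df[features]                                # model input matrix
```

A container that expects a JSON array of records instead of this nested CSV will deserialize nothing useful – which is exactly the silent failure mode described above.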

Second, DSDL uses Python’s urllib library with a dynamically constructed SSL context. When using self-signed certificates, DSDL fetches only the leaf certificate from the server and builds an SSL context from it – which cannot verify the certificate chain against your CA. The fix is to specify the CA certificate path in DSDL’s docker.conf configuration using the endpoint_cert_filename_or_path key.

Third, DSDL uses a containers.conf file to map model names to container endpoints. Without a [__dev__] stanza in this file, every fit and apply call fails with a blank endpoint error. This requirement is not clearly called out anywhere in Splunk’s official documentation.

Understanding these three behaviors upfront saves hours of debugging.

Step 5 – Designing the Architecture

The as-built architecture for this series has four layers.

The hardware layer consists of four Jetson Nano nodes running JetPack 4.6.6 with Docker. Each node runs one inference container and exposes two ports: 8501 for HTTPS inference requests and 2375 for the Docker TCP API that DSDL uses to manage container lifecycle.

The inference layer is a custom Python Flask application implementing DSDL’s native endpoint protocol. It loads a scikit-learn Isolation Forest model at startup and exposes two routes: /fit for training and /apply for inference. The application is packaged as a Docker image built from NVIDIA’s official nvcr.io/nvidia/l4t-ml:r32.7.1-py3 base image, which is pre-loaded with scikit-learn, numpy, and pandas.

The integration layer is DSDL configured on the Splunk search head with two environments: Environment 1 points to node 1 for Zeek network anomaly detection, and Environment 2 points to node 2 for DNS tunneling detection.

The detection layer consists of Splunk ES correlation rules that consume the anomaly_score, is_anomaly, and anomaly_label fields returned by the inference containers and generate notable events when scores exceed defined thresholds.

Data flows in two directions. From Splunk to the Nanos, the search head POSTs CSV data via HTTPS to port 8501. From the Nanos to Splunk, anomalous events are pushed back to Splunk’s HTTP Event Collector on port 8088 on the indexer cluster.
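The return path needs nothing beyond the standard library. The sketch below builds an HTTP Event Collector (HEC) request; the host, token, and event fields are placeholders, while the /services/collector/event path and the "Splunk &lt;token&gt;" Authorization header follow Splunk's standard HEC interface.

```python
# Hedged sketch: building a Splunk HEC request from a Nano using only
# the standard library. Host, token, and event fields are placeholders.
import json
import urllib.request

def build_hec_request(host, token, event, port=8088):
    body = json.dumps({"event": event, "sourcetype": "edge:anomaly"}).encode()
    return urllib.request.Request(
        f"https://{host}:{port}/services/collector/event",
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
    )

# Example (not sent here); a real caller would also handle TLS trust:
# req = build_hec_request("splunk-idx1.example.local", "<hec-token>",
#                         {"anomaly_score": -0.61, "is_anomaly": 1})
# urllib.request.urlopen(req, timeout=5)
```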

Conclusion

This post established the conceptual foundation for everything that follows. The key takeaways are:

Inference services are always-on web servers that load models at startup and answer HTTP requests with predictions. Edge inference means running these services on local hardware, keeping data in your environment and eliminating cloud dependencies.

The Jetson Nano 4GB is capable edge AI hardware with real constraints. Working within JetPack 4.6.6’s limits requires specific software choices but the hardware is well-matched to tabular security data anomaly detection.

DSDL is more opinionated than it appears. Its CSV wire format, urllib SSL behavior, and containers.conf dependency are all undocumented constraints that require reading the source code to work around correctly.

In Part 2, you will build the Jetson Nano inference container from scratch – Dockerfile, the DSDL-native Flask application, TLS certificate generation, and the deployment workflow for all four nodes.


This post is Part 1 of the Edge AI for SecOps series.

  • Part 2: Building the DSDL-Native Inference Container
  • Part 3: Wiring the Pipeline
  • Part 4: Real Security Data