Network Intrusion Features via Tumbling Time Windows
This article originally used the term "Sliding Time Window". It actually discusses a variant called the "Tumbling Time Window".
Feature creation is one of the first steps toward creating machine learning models for network monitoring or other stream-oriented data processing. We massage independent variables into a form that ML models or other statistical tools can use. This often involves transforming source data through numerical conversion, bucketing, aggregation, and other techniques.
For this project, we'd like to train a machine learning model to detect intrusion events from network traffic. People sometimes try to consume individual events directly as inputs, but an individual network packet does not contain enough context to be useful on its own. A tumbling time window makes it possible to create features with more context than you would get from a single message.
This GitHub repository contains Python code that creates features from Wireshark/tshark packet streams. The program accepts live tshark output or tshark streams created from .pcap files.
Tumbling Time Window
- The number of packets in each window varies.
- The amount of time represented by the window is constant.
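The two properties above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: the function names, the `(timestamp, payload)` tuple shape, and the `start_time` parameter are all assumptions made for the example.

```python
from collections import defaultdict


def tumbling_window_index(timestamp, window_seconds, start_time=0.0):
    """Return the index of the one tumbling window that contains timestamp."""
    return int((timestamp - start_time) // window_seconds)


def group_into_tumbling_windows(packets, window_seconds):
    """Group (timestamp, payload) pairs into fixed-length, non-overlapping windows.

    The packet shape is hypothetical; real tshark records carry far more fields.
    """
    windows = defaultdict(list)
    for timestamp, payload in packets:
        windows[tumbling_window_index(timestamp, window_seconds)].append(payload)
    return dict(windows)


# Four packets, one-second windows.
packets = [(0.2, "a"), (0.9, "b"), (1.1, "c"), (4.5, "d")]
windows = group_into_tumbling_windows(packets, window_seconds=1.0)
for index in sorted(windows):
    print(index, windows[index])
```

Every window spans the same one second, but the packet counts differ, and windows 2 and 3 received no packets at all, so they never appear in the result.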
Window Summary Info as Features
- the number of TCP, UDP, and ARP protocol messages
- the number of TLS, HTTP, SSDP, and SMB2 service messages
- the number of host pairs
- the total bytes transferred
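As a rough sketch, the summary values listed above could be computed for a single window like this. The packet dictionaries, their keys (`src`, `dst`, `length`, `layers`), and the feature names are hypothetical and do not reflect the repository's actual data structures. Treating A to B and B to A as the same host pair is also an assumption.

```python
PROTOCOLS = ("tcp", "udp", "arp")
SERVICES = ("tls", "http", "ssdp", "smb2")


def summarize_window(packets):
    """Reduce the packets of one window to a flat dictionary of features."""
    features = {name + "_count": 0 for name in PROTOCOLS + SERVICES}
    host_pairs = set()
    total_bytes = 0
    for packet in packets:
        # Count each protocol or service layer seen in the packet.
        for layer in packet["layers"]:
            key = layer + "_count"
            if key in features:
                features[key] += 1
        # Assumption: direction is ignored when counting host pairs.
        host_pairs.add(frozenset((packet["src"], packet["dst"])))
        total_bytes += packet["length"]
    features["host_pair_count"] = len(host_pairs)
    features["total_bytes"] = total_bytes
    return features


window = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "length": 60, "layers": ["tcp"]},
    {"src": "10.0.0.2", "dst": "10.0.0.1", "length": 1500, "layers": ["tcp", "tls"]},
    {"src": "10.0.0.3", "dst": "239.255.255.250", "length": 200, "layers": ["udp", "ssdp"]},
]
features = summarize_window(window)
print(features)
```

Each window becomes one row of features, so a stream of packets turns into a table with one row per time window.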
Alternative Window Strategies
- Tumbling Time Window: Fixed-length, non-overlapping windows that advance at a fixed rate. A given window may or may not contain events. Each event falls in exactly one window.
- Hopping Window: The window has a fixed width and advances at a fixed rate, irrespective of events received. This is an overlapping version of the Tumbling Window. An event can be in multiple windows.
- Time-Based Sliding Window: The window holds the events that happened in the last N seconds. It is data triggered: a window is produced every time a new event is received, so each window contains at least one event, the trigger event. An event can be in multiple windows.
- Eviction-Based Sliding Window: The window contains the last N events. The time span covered by the window varies with the event rate.
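The differences between these strategies are easier to see side by side. The sketch below is illustrative only: the names are made up for this example, timestamps are assumed to be integers that arrive in order, and hopping windows are assumed to start at multiples of the hop from time 0.

```python
from collections import deque


def hopping_window_starts(timestamp, width, hop):
    """Return the start of every window [start, start + width) holding timestamp."""
    last = timestamp // hop
    first = max((timestamp - width) // hop + 1, 0)
    return [k * hop for k in range(first, last + 1)]


class TimeSlidingWindow:
    """Keeps the events from the last `seconds` seconds; updated on each event."""

    def __init__(self, seconds):
        self.seconds = seconds
        self.events = deque()

    def add(self, timestamp, payload):
        self.events.append((timestamp, payload))
        # Drop events that are `seconds` or more older than the trigger event.
        while self.events[0][0] <= timestamp - self.seconds:
            self.events.popleft()
        return [item for _, item in self.events]


# Hopping: width 4, hop 2. The event at t=5 lands in two overlapping windows.
print(hopping_window_starts(5, width=4, hop=2))
# With hop equal to width, the hopping window degenerates to a tumbling window.
print(hopping_window_starts(5, width=2, hop=2))

# Time-based sliding: the trigger event is always in the window.
sliding = TimeSlidingWindow(seconds=3)
history = [sliding.add(t, name) for t, name in [(0, "a"), (2, "b"), (3, "c"), (10, "d")]]
print(history)

# Eviction-based sliding: a bounded deque keeps only the last N events.
last_two = deque(maxlen=2)
for name in ["a", "b", "c"]:
    last_two.append(name)
print(list(last_two))
```

Only the tumbling and hopping windows are defined purely by the clock. The two sliding variants are driven by the data, which is why their contents change only when an event arrives.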
References
- Repository:
- Python source code https://github.com/freemansoft/Network-intrusion-dataset-creator This code is 8x faster than the original.
- Other Blogs and Videos:
- Blog: https://joe.blog.freemansoft.com/2021/04/network-intrusion-features-via-sliding.html
- Video: https://youtu.be/b3MaxbAAdDw
- Blog: https://joe.blog.freemansoft.com/2021/04/creating-features-in-python-using.html
- Video: https://youtu.be/jKgGh5a5gFA
- Originating Research
- Research paper the original source code was based on. https://www.researchgate.net/profile/Nadun-Rajasinghe/project/A-customizable-Network-Intrusion-Detection-dataset-creating-framework/attachment/5aff08f8b53d2f63c3ccae32/AS:627686015766528@1526663416701/download/1570426776.pdf?context=ProjectUpdatesLog
- Original Python source repository https://github.com/nrajasin/Network-intrusion-dataset-creator
- Other
- https://www.mikulskibartosz.name/difference-between-tumbling-and-sliding-window/
- https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-window-functions
- https://docs.lenses.io/3.2/sql/streaming/windowing.html