Automatic Push PCAP
Requires FW:6979+
FMADIO Packet Capture systems provide a built-in Push mode that transfers captured PCAP data to a remote system on a regular schedule. An example is pushing 1-minute PCAPs to a remote NFS share or an S3 storage bucket.
Configuration is via configuration scripts located:
If there is no such file above, please copy the basic example from the following location
An example is shown as follows:
Multiple push targets can be specified. There is no real limit, however throughput is affected.
In the above example there are 2 push rules
A) Push all packet data (no filter)
This push target sends all PCAP data to the remote NFS share mounted on
/mnt/remote0
See NFS mount configuration section for details on setting up /mnt/remote0 mounting points.
The filter specified is "FilterBPF=nil", meaning there is no filter, so all traffic is pushed.
B) Push all TCP data from network 192.168.1.0/24
The second example shows pushing all TCP data on the network 192.168.1.0/24 to the specified /mnt/remote0/push/ directory with a PCAP file prefix of "tcp_*"
Note FilterBPF=net 192.168.1.0/24 and tcp
This applies a full BPF (Berkeley Packet Filter https://en.wikipedia.org/wiki/Berkeley_Packet_Filter ) with the filter "net 192.168.1.0/24 and tcp" to the packets before writing them to the location. As a result, only TCP data on that network is written to the /mnt/remote0/push/tcp_*.pcap output files.
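To make the filter semantics concrete, the following is a minimal Python sketch of what "net 192.168.1.0/24 and tcp" selects: a packet passes if it is TCP and either its source or its destination address is inside the network. The `Packet` type and `matches` helper are hypothetical names used only for illustration; they are not part of the FMADIO system, which evaluates the expression with libpcap.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical, simplified view of a packet header."""
    src: str    # source IPv4 address
    dst: str    # destination IPv4 address
    proto: str  # "tcp", "udp", ...


NET = ipaddress.ip_network("192.168.1.0/24")


def matches(pkt: Packet) -> bool:
    """Mimic the BPF expression 'net 192.168.1.0/24 and tcp'."""
    # 'net' is true when either endpoint is inside the network
    in_net = (ipaddress.ip_address(pkt.src) in NET
              or ipaddress.ip_address(pkt.dst) in NET)
    return in_net and pkt.proto == "tcp"


print(matches(Packet("192.168.1.10", "10.0.0.1", "tcp")))  # inside net, TCP
print(matches(Packet("192.168.1.10", "10.0.0.1", "udp")))  # inside net, not TCP
```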
Supported Endpoints
Mode | Description |
linux file | linux file on FMADIO capture system |
NFS | remote NFS mountpoint on FMADIO capture system |
SFTP | remote SSH file system via rclone ( https://rclone.org/sftp/ ) |
FTP | FTP push via rclone ( https://rclone.org/ftp/ ) |
S3 | S3 protocol via rclone ( https://rclone.org/s3/ ) |
Google Drive | Google drive via rclone ( https://rclone.org/drive/ ) |
Digital Ocean | Digital Ocean Spaces via rclone ( https://rclone.org/s3/#digitalocean-spaces ) |
Azure Blob | Microsoft Azure Blob via rclone ( https://rclone.org/azureblob/ ) |
Dropbox | Dropbox via rclone ( https://rclone.org/dropbox/ ) |
Hadoop HDFS | Hadoop file system via rclone ( https://rclone.org/hdfs/ ) |
Ceph | Ceph S3 interface via rclone ( https://rclone.org/s3/ ) |
and many more, see the rclone documentation for full list of endpoints supported
Command Reference
The following describes each option available per push target.
Desc
Required
Provides a human-readable text description for each push target. It is also used in the log file name.
For example, the above push logfiles will go to /mnt/store0/log/push_pcap-all_* which can be helpful for troubleshooting any problems.
Mode
Default (FILE)
Specifies how the output files are written. The available modes are listed in the table below, for example a standard Linux file ("FILE") or rclone ("RCLONE"), which provides multiple endpoints such as FTP, S3, Google Drive, Azure Cloud and many more.
Options
Command | Description |
FILE | Output a regular Linux file. This can be on the local file system or over a remote NFS mount |
RCLONE | Use rclone as the endpoint. Note rclone needs to be set up and configured before the remote push is started. For RCLONE config please see their documentation https://rclone.org/commands/rclone_config/ FMADIO by default stores the config file into
Requires FW:7157+ |
LXC | Writes output to the LXC ring buffer. Location of the ring buffer is the Path variable e.g /opt/fmadio/queue/lxc_ring0 Requires FW:7738+ |
CURL | Pipe the output through CURL, e.g. to FTP or HTTP POST |
Path
Required
Full remote path of the target PCAPs + the leading prefix of the remote output.
The above example uses the "FILE" mode, which specifies a full linux system file path.
Command | Description |
/mnt/remote0/push/all | FILE mode output PCAP files will be written for example as |
gdrive://pcap/all | RCLONE mode output PCAP files are written to the rclone-configured Google Drive endpoint, in the Google Drive directory |
Split
Required
This specifies how to split the incoming PCAP data, either by Bytes or by Time. Following example is splitting by Time
Command | Description |
--split-time <nano seconds> | Splits PCAP data by time. Argument is in nanoseconds. Scientific notation can be used |
--split-byte <bytes> | Splits PCAP data by size. Argument is in bytes. Scientific notation can be used |
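Because the time argument is in nanoseconds, it is easy to be off by a factor of 1000. The following Python sketch shows the arithmetic for a 1-minute split and for a 1e9-byte split. The helper names are hypothetical, and `parse_arg` only illustrates how a scientific-notation value maps to an integer; it is an assumption, not a description of how the FMADIO system parses the argument.

```python
NS_PER_SEC = 1_000_000_000


def split_time_ns(seconds: float) -> int:
    """Convert a split interval in seconds to nanoseconds."""
    return int(round(seconds * NS_PER_SEC))


def parse_arg(text: str) -> int:
    """Interpret a plain or scientific-notation number such as '60e9'."""
    return int(float(text))


one_minute = split_time_ns(60)
print(one_minute)                        # nanoseconds in one minute
print(parse_arg("60e9") == one_minute)   # same value written in scientific notation
print(parse_arg("1e9"))                  # 1e9 bytes (decimal gigabyte, an assumption)
```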
FileName
Required
Specifies how the split filename is encoded. Different downstream applications require specific encodings. If your downstream applications need an encoding not listed, please contact us for support.
Command | Description |
--filename-epoch-sec-startend | writes the sec epoch start/end time as the file name e.g March 21, 2021 1:50:55
|
--filename-epoch-sec | writes the sec epoch start time as the file name. e.g March 21, 2021 1:50:55
|
--filename-epoch-msec | Writes the epoch start time in milliseconds as the file name
e.g for April 22 02:48 GMT
|
--filename-epoch-usec | Writes the epoch start time in microseconds
e.g For April 22 02:48 GMT
|
--filename-epoch-nsec | Writes the epoch start time in nanoseconds
e.g For April 22 02:48 GMT
|
--filename-tstr-HHMM | writes the YYYYMMDD_HHMM style file name. e.g. 2021 Dec 1st 23:50
|
--filename-tstr-HHMMSS | writes the YYYYMMDD_HHMMSS style file name. e.g. 2021 Dec 1st 23:50:59
|
--filename-tstr-HHMMSS_NS | writes the YYYYMMDD_HHMMSS.MSEC.USEC.NSEC style file name.
e.g. 2021 Dec 1st 23:50:59 123456789nsec |
--filename-tstr-HHMMSS_TZ | Writes the filename in Hour Min Sec with a local timezone suffix
e.g 2022 April 22 19:59 CST
|
--filename-strftime <time string> | Generic strftime print
e.g command line
|
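As a rough illustration of how the encodings above differ, the following Python sketch renders only the time portion for a single timestamp (2021 Dec 1st 23:50:59 UTC). It is an approximation built from the descriptions in the table: the dictionary keys are shorthand for the options, and the exact strings the system produces (path prefix, separators, file suffix) are not shown here and may differ.

```python
from datetime import datetime, timezone


def encodings(ts: datetime) -> dict:
    """Render the time portion of a split filename in several styles."""
    # Build an integer nanosecond timestamp (datetime only carries microseconds)
    ns = int(ts.timestamp()) * 1_000_000_000 + ts.microsecond * 1000
    return {
        "epoch-sec": str(ns // 10**9),
        "epoch-msec": str(ns // 10**6),
        "epoch-usec": str(ns // 10**3),
        "epoch-nsec": str(ns),
        "tstr-HHMM": ts.strftime("%Y%m%d_%H%M"),
        "tstr-HHMMSS": ts.strftime("%Y%m%d_%H%M%S"),
    }


start = datetime(2021, 12, 1, 23, 50, 59, tzinfo=timezone.utc)
for name, value in encodings(start).items():
    print(f"{name:12s} {value}")
```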
FilterBPF
Default (nil)
A full libpcap BPF filter can be applied to reduce the total PCAP size or to segment a specific list of PCAPs. The system uses the native libpcap library, so FilterBPF supports everything that tcpdump supports.
The above is an example BPF filter "net 192.168.1.0/24 and tcp". It is a slightly more complicated BPF and shows the flexibility and wide range of options available. Technically there is no limit on the complexity of the BPF filter, but we recommend keeping it as simple as possible to reduce the CPU load.
Command | Description |
FilterBPF | Enter a full tcpdump equivalent BPF filter expression, for example a host filter
|
Decap
Default (true)
In addition to FilterBPF, full packet de-encapsulation is performed by default before the BPF filter is applied. This can, for example, decode VLAN, ERSPAN, GRE tunnels and many more. It allows the BPF filter to be applied to the inner payload instead of the encapsulated output.
Example to disable automatic De-encapsulation
Configuration is a simple boolean type only
Command | Description |
Decap | boolean value of "true" enables Packet De-encapsulation (Default true) |
PipeCmd
Requires FW:7157+
Default (nil)
Pipe commands are processed on a per PCAP split basis before the end transport is applied. Examples to use this are to GZIP or compress files before sending to the endpoint.
This can be any generic stdin/stdout Linux application; gzip and lz4 are current examples. Other options are possible, please contact us for more details.
The above runs gzip with compression level 1 on the split PCAP before sending to the output location. Some examples are shown below
Command | Description |
gzip -c -1 | Run GZIP on split PCAPs with fastest compression mode |
gzip -c -9 | Run GZIP on split PCAPs in maximum compression mode |
lz4 -c | Run LZ4 compression on split PCAPs for fast compression |
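The stdin/stdout contract can be sketched in Python: the split PCAP bytes go in on stdin and whatever the command writes to stdout is what gets sent onward. `run_pipe_cmd` is a hypothetical helper written for this illustration and is not how the capture system itself invokes PipeCmd. The demonstration only runs if a `gzip` binary is on the PATH.

```python
import gzip
import shutil
import subprocess
from typing import Sequence


def run_pipe_cmd(cmd: Sequence[str], data: bytes) -> bytes:
    """Feed data to cmd on stdin and return what it writes to stdout."""
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


if shutil.which("gzip"):
    raw = b"example pcap bytes " * 1000   # stand-in for one PCAP split
    compressed = run_pipe_cmd(["gzip", "-c", "-1"], raw)
    # Round trip to confirm the filter output is valid gzip data
    assert gzip.decompress(compressed) == raw
    print(len(raw), "->", len(compressed), "bytes")
```

Any command that behaves as a pure stdin-to-stdout filter fits the same shape, which is why compressors run with `-c` work here.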
FileSuffix
Requires FW:7157+
Default (nil)
By default the split PCAP filename suffix is .pcap
For most operations that is sufficient, however for more complicated operations such as GZIP compressing with PipeCmd a .pcap.gz file suffix is more appropriate. The following is an example config target that compresses and outputs splits in .pcap.gz file format
The above example pushes gzip-compressed 1-minute PCAP splits to an S3 protocol storage device
Command | Description |
.pcap | Default suffix. |
.pcap.gz | GZIP Compressed PCAP |
.pcap.lz4 | LZ4 compressed PCAP |
Chunked
Requires FW: 7355+
Default: (nil)
Chunked mode is a more optimized processing mode. It increases the aggregate throughput of the Push operation, specifically for network traffic profiles skewed towards small packets.
By default it is disabled.
Example as follows
Setting | Description |
---|---|
true | Enables chunk mode |
FollowStart
Requires FW: 7355+
FollowStart forces the push to start from the beginning of the capture. If it is disabled, the push starts from the current capture position.
Default is "false": push from the current capture position
Example as follows
Setting | Description |
---|---|
false | Push from the current capture position |
true | Push from the start of the currently active capture |
CPU
Requires FW: 7750+
Sets a specific CPU for stream_cat to run on by overriding the default CPU setting. This is helpful when multiple pushes are running in parallel.
Default: the system assigned CPU number for push (typically CPU 23)
Example as follows
Setting | Description |
---|---|
nil | Uses the default system CPU for push operations |
<numeric value> | Literal Numeric value indicating which CPU to run stream_cat on |
Consumer Application Restart
The default behavior of the system is to constantly retry sending data downstream without loss. In some cases it is better to restart the push process and reset the starting send position when the consumer application restarts.
For example, an application shuts down between 1AM and 6AM but the capture process runs 24/7. The application only wants to receive data starting at 6AM, when it starts up.
Another use case is an application that has an error for an unspecified amount of time. The application requires real-time processing, and requires the FMADIO Capture system to send data from the current time without sending old historical data.
Adding the following configuration to the push_realtime.lua config will cause stream_cat to exit if the downstream consumer is unable to process data. FMADIO system scheduler will constantly restart the push process from the current time, until the consumer process starts processing data.
NOTE: Please keep the additional white space at the end of the command.
Analytics Scheduler
In addition to configuring /opt/fmadio/etc/push_pcap.lua, the Analytics scheduler must be configured to specify when the Push operation occurs. This is on the "CONFIG" tab of the FMADIO GUI. An example configuration to push files 24/7:
The "Analytics Engine" field must be exactly the following text.
Screenshot of 24/7 schedule is shown below
Troubleshooting
Logfiles
Configuration problems often occur when setting up the system. The following log files can be used to debug
Monitoring the output can be as follows
In addition, each Push entry has a log file with the following format. The Desc value is described at
https://docs.fmad.io/fmadio-documentation/configuration/automatic-push-pcap#desc
Example output of correct functionality is as follows
Manual Offline mode
In addition to log files, it is sometimes easier to debug via the CLI interface by manually starting the push on specific capture files. This can also be helpful to push historical PCAP files.
This is done with the following CLI command
Replace the capture name with the complete name of the capture. Also ensure the push scheduler has been disabled
Example output of successful offline mode run is shown below.
Note the following repeated status line indicates the push is operating successfully
For problems with a specific push target, check the logfile shown in the above command line, here /mnt/store0/log/push_pcap-all.cur
A good way to debug is to run tail -F /mnt/store0/log/push_pcap-all.cur to monitor it, such as the following
Multiple Push Schedules
Multiple push_pcap schedules can be added to the system, for example
Push A) Realtime 1min push
used for realtime monitoring throughout the day
Push B) End of Day, recon push.
Used for End of Day Recon pushing data to back office systems
This can be achieved with the following steps
1) create a directory for custom analytics schedules
All files in this directory are sym linked to the /opt/fmadio/analytics directory used by the scheduler.
2) Copy the current push_pcap loader and rename it
Create a new push_pcap_eod loader as follows
Then Edit the file to load a different configuration
In this case the config file is called push_pcap_eod.lua
3) Configure the new push_pcap_eod.lua
Configure the new push_pcap_eod.lua file. This requires hand editing of the file, as fmadiocli only operates on the default configuration
4) Enable in the scheduler
In the GUI -> Config page, add the new loader file push_pcap_eod into the schedule
Example is shown below
5) Confirm operation
Log files for (push_pcap_eod) are named
Performance testing
Push performance is critical and subject to multiple factors. The following provides a baseline test of different variables.
Remote Write Performance
A first step is to confirm that writes to the remote file system have sufficient bandwidth. This can be done by running the commands
Example run
The above command writes a 20GB file to the remote file location. In this case it is writing at 678MB/sec (5.4Gbps throughput)
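The conversion between the two figures quoted above is simple arithmetic, sketched here in Python. It assumes decimal units (1 MB = 10^6 bytes, 1 Gbps = 10^9 bits/sec); the function names are made up for this example.

```python
def mbytes_per_sec_to_gbps(mb_per_sec: float) -> float:
    """Convert a write rate in MB/sec to Gbps (decimal units assumed)."""
    return mb_per_sec * 8 / 1000


def seconds_to_write(size_gb: float, mb_per_sec: float) -> float:
    """Time to write size_gb gigabytes at the given rate."""
    return size_gb * 1000 / mb_per_sec


print(f"{mbytes_per_sec_to_gbps(678):.1f} Gbps")   # 5.4 Gbps
print(f"{seconds_to_write(20, 678):.1f} sec for the 20GB test file")
```

Comparing the resulting Gbps figure with the expected capture rate gives a quick indication of whether the remote location can keep up with the push.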
Maintenance
Force PCAP Link Layer Setting
To force the PCAP link layer type to Ethernet use the following CLI command on the PCAP
The symptom of an incorrect link layer type is unusual TCPDUMP output such as the following
After setting the PCAP Link Layer setting using the above command the output is as follows
Which contains the correctly decoded packets.
Examples
Example configuration files for reference
Push to NFS Share with BPF Filter and 1 minute PCAPs
Push to NFS Share with BPF Filter and HHMMSS Timezone
FW: 7936+
Example pushes a single UDP multicast group 1001 at 1 minute snapshots using an Hour Min Sec with Timezone filename.
Push to NFS Share 1min Splits with BPF Filter and LZ4 compression
Example pushes 1min PCAPs with a BPF filter (port 80) and applies LZ4 compression. LZ4 compression is fast and has reasonably good compression rates.
Push to NFS share 1min Splits with BPF Filter and ZSTD compression
Example pushes 1min PCAPs with a BPF filter (port 80) and applies ZSTD compression. ZSTD is a newer compression format with performance close to LZ4 but compression rates close to GZIP.
Push to NFS/CIFS Share 1GB splits
FW: 7936+
Example pushes the raw data to a remote NFS/CIFS (Windows Share) splitting by 1GB file size writing a gzip compressed PCAP file to the remote location.
Push to MAGPACK over FTP
Push to an LXC 24/7
Push to Multiple lxc_ring 24/7
Example pushes different VLAN traffic to separate lxc_rings
Push to AWS S3 Bucket with Compression
Pushing captured PCAP data from the local device to AWS S3 Bucket can be done using the RCLONE support.
Below is an example push_pcap.lua config file for that
This uses gzip to compress the data. Also note we added a PreCapture filter that slices all traffic to the AWS S3 IP address down to 64B. This prevents a runaway explosion of the capture size.
Below is the resulting output in AWS S3
This does require RCLONE S3 Config to be configured before using.
Performance
Performance of push varies by protocol, filter and compression mode. Typically the IO bandwidth of the remote push location is the bottleneck.
Here we are testing against a RAM disk on the local system. This removes the network and remote IO performance from the benchmark, focusing entirely on the FMADIO fetch, filter and compress performance.
Parallel GZIP (pigz) running on an FMADIO 100Gv2 Analytics System (96 CPUs total). Making good use of all the CPUs
XZ compression using 64 CPUs, more utilization than Parallel GZ
Traffic Profile ISP
The dataset we are testing with is a WAN connection that has an ISP-like L2-L7 packet distribution. As it is ISP-like traffic, the data compression rate is not high.
Description | Gbps | Size | Ratio |
---|---|---|---|
raw (not compressed) | 6.9Gbps | 45.5GB | x 1.0 |
lz4 | 2.5Gbps | 41.2GB | x 1.104 |
gzip -1 (fast) | 0.16Gbps | 40.5GB | x 1.123 |
gzip default | 0.15Gbps | 40.2GB | x 1.131 |
pigz 64 CPU (Parallel GZIP with 64 CPUs) | 4.8Gbps | 40.3GB | x 1.129 |
pigz 32 CPU (Parallel GZIP with 32 CPUs) | 4.4Gbps | 40.3GB | x 1.129 |
pigz 8 CPU (Parallel GZIP with 8 CPUs) | 1.04Gbps | 40.3GB | x 1.129 |
zstd default | 0.77Gbps | 39.5GB | x 1.15 |
zstd --fast | 2.4Gbps | 40.3GB | x 1.129 |
xz default | 0.019Gbps | 39.2GB | x 1.160 |
xz 64 CPU (-T 64) | 0.909Gbps | 38.9GB | x 1.17 |
xz 32 CPU (-T 32) | 0.502Gbps | 39.3GB | x 1.157 |
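The Ratio column is consistent with dividing the raw size by the compressed size. A small Python sketch of that calculation, using values from the table above, follows; the function name is made up for this example and small differences in the last digit come from rounding of the published sizes.

```python
def compression_ratio(raw_gb: float, compressed_gb: float) -> float:
    """Compression ratio expressed as raw size divided by compressed size."""
    return raw_gb / compressed_gb


RAW_GB = 45.5  # raw (not compressed) size from the table

for name, size_gb in [("lz4", 41.2), ("gzip -1", 40.5), ("zstd default", 39.5)]:
    print(f"{name:14s} x {compression_ratio(RAW_GB, size_gb):.3f}")
```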
Traffic Profile Finance
The data set tested is a full day's worth of OPRA A+B Feed data; the raw uncompressed data size is just under 1TB. Financial data typically gets a x2 to x3 compression ratio, with xz maxing out at x5.
Description | Gbps | Time | Size | Ratio |
---|---|---|---|---|
raw (not compressed) | 4.29Gbps | 0.6H | 979GB | x 1.0 |
lz4 | 1.65Gbps | 1.3H | 372GB | x 2.629 |
gzip -1 (fast) | 0.341Gbps | 6.3H | 311GB | x 3.14 |
gzip (default) | 0.129Gbps | 16.6H | 266GB | x 3.67 |
pigz 64 CPU (Parallel GZIP) | 3.92Gbps | 0.55H | 264GB | x 3.70 |
pigz 32 CPU (Parallel GZIP) | 3.92Gbps | 0.55H | 264GB | x 3.70 |
pigz 16 CPU (Parallel GZIP) | 1.98Gbps | 1.09H | 264GB | x 3.70 |
pigz 8 CPU (Parallel GZIP) | 0.997Gbps | 2.1H | 264GB | x 3.70 |
zstd default | 1.118Gbps | 1.9H | 255GB | x 3.83 |
zstd --fast | 1.90Gbps | 1.13H | 294GB | x 3.32 |
xz default | 0.012Gbps | 181H | 184GB | x 5.32 |
xz 64 CPU | 0.604Gbps | 3.6H | 184GB | x 5.32 |
xz 32 CPU | 0.372Gbps | 5.8H | 184GB | x 5.32 |
xz 16 CPU | 0.192Gbps | 11.3H | 184GB | x 5.32 |
xz 8 CPU | 0.096Gbps | 22.6H | 184GB | x 5.32 |
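Most of the Time values in the table appear consistent with dividing the raw dataset size by the measured throughput. This is an inference from the numbers, not a documented formula, and it assumes decimal units. The Python sketch below applies it to estimate how long a push of a given dataset would take; `push_hours` is a hypothetical helper name.

```python
def push_hours(raw_gb: float, gbps: float) -> float:
    """Estimated hours to process raw_gb gigabytes at gbps throughput."""
    seconds = raw_gb * 8 / gbps
    return seconds / 3600


RAW_GB = 979  # raw OPRA dataset size from the table

for name, gbps in [("lz4", 1.65), ("pigz 64 CPU", 3.92), ("xz default", 0.012)]:
    print(f"{name:12s} {push_hours(RAW_GB, gbps):7.2f} H")
```

An estimate like this helps decide whether a compression mode can finish within the available window, for example an End of Day push that must complete before the next session.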