Automatic Push PCAP
Requires FW:6979+
FMADIO Packet Capture systems provide a built-in Push mode that transfers captured PCAP data to a remote system on a regular schedule. For example, 1-minute PCAPs can be pushed to a remote NFS share or an S3 storage bucket.
Configuration is via configuration scripts located:
If there is no such file above, please copy the basic example from the following location
An example is shown as follows:
Multiple push targets can be specified. There is no hard limit on the number of targets, however throughput is affected.
In the above example there are 2 push rules
A) Push all packet data (no filter)
This push target sends all PCAP data to the remote NFS share mounted on /mnt/remote0
See NFS mount configuration section for details on setting up /mnt/remote0 mounting points.
The filter is specified as "FilterBPF=nil", meaning there is no filter, so all traffic is pushed.
B) Push all TCP data from network 192.168.1.0/24
The second example shows pushing all TCP data on the network 192.168.1.0/24 to the specified /mnt/remote0/push/ directory with a PCAP file prefix of "tcp_*"
Note FilterBPF=net 192.168.1.0/24 and tcp
This applies a full BPF (Berkeley Packet Filter https://en.wikipedia.org/wiki/Berkeley_Packet_Filter ) with the filter "net 192.168.1.0/24 and tcp" to the packets before writing them to the location. As a result only TCP data on the 192.168.1.0/24 network is written to the /mnt/remote0/push/tcp_*.pcap output files.
Supported Endpoints
| Mode | Description |
| --- | --- |
| linux file | linux file on FMADIO capture system |
| NFS | remote NFS mountpoint on FMADIO capture system |
| SFTP | remote SSH file system via rclone ( https://rclone.org/sftp/ ) |
| FTP | FTP push via rclone ( https://rclone.org/ftp/ ) |
| S3 | S3 protocol via rclone ( https://rclone.org/s3/ ) |
| Google Drive | Google Drive via rclone ( https://rclone.org/drive/ ) |
| Digital Ocean | Digital Ocean Spaces via rclone ( https://rclone.org/s3/#digitalocean-spaces ) |
| Azure Blob | Microsoft Azure Blob via rclone ( https://rclone.org/azureblob/ ) |
| Dropbox | Dropbox via rclone ( https://rclone.org/dropbox/ ) |
| Hadoop HDFS | Hadoop file system via rclone ( https://rclone.org/hdfs/ ) |
| Ceph | Ceph S3 interface via rclone ( https://rclone.org/s3/ ) |
and many more, see the rclone documentation for full list of endpoints supported
Command Reference
The following describes each option available per push target.
Desc
Required
Provides a human readable text description for each push target. It is also used in the log file name.
For example, the log files for the above push go to /mnt/store0/log/push_pcap-all_* which can be helpful when troubleshooting.
Mode
Default (FILE)
Specifies how the output files are written. The two main modes are the standard linux file mode "FILE" and "RCLONE", which provides multiple end points such as FTP, S3, Google Drive, Azure Cloud and many more. The LXC and CURL modes are also available and are listed under Options below.
Options
Command
Description
FILE
Output a regular linux file. This can be on the local file system or over a remote NFS mount
RCLONE
Use rclone as the end point file. Note rclone needs to be set up and configured before remote push is started.
For RCLONE config please see their documentation https://rclone.org/commands/rclone_config/
FMADIO by default stores the config file in /opt/fmadio/etc/rclone.conf
Requires FW:7157+
LXC
Writes output to the LXC ring buffer. Location of the ring buffer is the Path variable e.g
/opt/fmadio/queue/lxc_ring0
Requires FW:7738+
CURL
Pipe down a CURL pipe, e.g. to FTP or HTTP POST
Path
Required
Full remote path of the target PCAPs + the leading prefix of the remote output.
The above example uses the "FILE" mode, which specifies a full linux system file path.
| Path | Description |
| --- | --- |
| /mnt/remote0/push/all | FILE mode. Output PCAP files will be written, for example, as /mnt/remote0/push/all_20210101_010101.cap |
| gdrive://pcap/all | RCLONE mode. Output PCAP files will be written to the rclone configured Google Drive endpoint, into the Google Drive directory /pcap/ |
Split
Required
This specifies how to split the incoming PCAP data, either by Bytes or by Time. Following example is splitting by Time
| Command | Description |
| --- | --- |
| `--split-time <nano seconds>` | Splits PCAP data by time. The argument is in nanoseconds. Scientific notation can be used |
| `--split-byte <bytes>` | Splits PCAP data by size. The argument is in bytes. Scientific notation can be used |
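The split arguments are plain numbers, so it can help to compute them rather than type long strings of zeros. The following is a minimal Python sketch of that arithmetic only. The helper names are hypothetical, and treating 1GB as 10^9 bytes is an assumption of this sketch, not something the system is confirmed to require.

```python
NS_PER_SEC = 1_000_000_000      # nanoseconds in one second
BYTES_PER_GB = 1_000_000_000    # assumption: decimal gigabytes (1e9 bytes)


def split_time_ns(seconds: float) -> int:
    """Convert a split interval in seconds to the nanosecond argument."""
    return int(round(seconds * NS_PER_SEC))


def split_bytes(gigabytes: float) -> int:
    """Convert a split size in decimal gigabytes to the byte argument."""
    return int(round(gigabytes * BYTES_PER_GB))


one_minute = split_time_ns(60)
one_gb = split_bytes(1)

# 60e9 in scientific notation is the same value as the 1 minute integer.
assert one_minute == int(60e9)

print(f"--split-time {one_minute}")  # 1 minute splits
print(f"--split-byte {one_gb}")      # 1GB splits
```

A 1 minute split is therefore 60e9 nanoseconds, and a 1GB split is 1e9 bytes under the decimal assumption above.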
FileName
Required
Specifies how the split filename is encoded. Different downstream applications require specific encodings. If your downstream applications need an encoding not listed, please contact us for support.

| Command | Description | Example |
| --- | --- | --- |
| `--filename-epoch-sec-startend` | Writes the epoch start/end time in seconds as the file name | March 21, 2021 1:50:55 gives 1616334655-1616334755.pcap |
| `--filename-epoch-sec` | Writes the epoch start time in seconds as the file name | March 21, 2021 1:50:55 gives 1616334655.pcap |
| `--filename-epoch-msec` | Writes the epoch start time in milliseconds as the file name | April 22 02:48 GMT gives fmadio_1650595712592.pcap |
| `--filename-epoch-usec` | Writes the epoch start time in microseconds | April 22 02:48 GMT gives fmadio_1650585598007301.pcap |
| `--filename-epoch-nsec` | Writes the epoch start time in nanoseconds | April 22 02:48 GMT gives fmadio_1650585598007301462.pcap |
| `--filename-tstr-HHMM` | Writes the YYYYMMDD_HHMM style file name | 2021 Dec 1st 23:50 gives 20211201_2350.pcap |
| `--filename-tstr-HHMMSS` | Writes the YYYYMMDD_HHMMSS style file name | 2021 Dec 1st 23:50:59 gives 20211201_235059.pcap |
| `--filename-tstr-HHMMSS_NS` | Writes the YYYYMMDD_HHMMSS.MSEC.USEC.NSEC style file name | 2021 Dec 1st 23:50:59 123456789nsec gives 20211201_235059.123.456.789.pcap |
| `--filename-tstr-HHMMSS_TZ` | Writes the filename in Hour Min Sec with a local timezone suffix | 2022 April 22 19:59 CST gives fmadio__2022-04-21_19:59:58-04:00.pcap |
| `--filename-strftime <time string>` | Generic strftime print | Command line --filename-strftime "%Y%m%d%H%M%S" gives fmadio__20220421224832.pcap |
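To make the encodings concrete, the sketch below reproduces three of them in Python from a nanosecond start timestamp. It is an illustration only and not the stream_cat implementation: the function names are hypothetical, any file prefix from Path is omitted, and the time string is rendered in UTC as an assumption.

```python
from datetime import datetime, timezone

NS_PER_SEC = 1_000_000_000


def filename_epoch_sec(start_ns: int) -> str:
    """Epoch start time in whole seconds (style of --filename-epoch-sec)."""
    return f"{start_ns // NS_PER_SEC}.pcap"


def filename_epoch_msec(start_ns: int) -> str:
    """Epoch start time in milliseconds (style of --filename-epoch-msec)."""
    return f"{start_ns // 1_000_000}.pcap"


def filename_tstr_hhmmss(start_ns: int) -> str:
    """YYYYMMDD_HHMMSS string; rendering in UTC is an assumption here."""
    dt = datetime.fromtimestamp(start_ns // NS_PER_SEC, tz=timezone.utc)
    return dt.strftime("%Y%m%d_%H%M%S") + ".pcap"


# Start time taken from the epoch-sec example: 1616334655 seconds.
start_ns = 1616334655 * NS_PER_SEC

print(filename_epoch_sec(start_ns))    # 1616334655.pcap
print(filename_epoch_msec(start_ns))   # 1616334655000.pcap
print(filename_tstr_hhmmss(start_ns))  # 20210321_135055.pcap
```

When a downstream application parses the file name, matching its expected encoding here avoids a rename step after the push.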
FilterBPF
Default (nil)
A full libpcap BPF filter can be applied to reduce the total PCAP size or to segment traffic into a specific list of PCAPs. The system uses the native libpcap library, so everything that tcpdump supports FilterBPF also supports.
The above is an example BPF filter "net 192.168.1.0/24 and tcp". It is a slightly more complicated BPF and shows the flexibility and wide range of options available. Technically there is no limit on the complexity of the BPF filter, but we recommend keeping it as simple as possible to reduce the CPU load.

| Command | Description |
| --- | --- |
| FilterBPF | Enter a full tcpdump equivalent BPF filter expression. Example host filter: FilterBPF="host 192.168.1.1" |
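As a way to reason about what the filter "net 192.168.1.0/24 and tcp" selects, the following Python sketch expresses the same condition as a plain predicate. It is only an illustration of the matching logic (either address inside the network, and the IP protocol is TCP). It does not use libpcap, and the function name and arguments are hypothetical.

```python
import ipaddress

NET = ipaddress.ip_network("192.168.1.0/24")
IPPROTO_TCP = 6  # IANA protocol number for TCP


def matches_net_and_tcp(src_ip: str, dst_ip: str, ip_proto: int) -> bool:
    """True when source or destination is in NET and the protocol is TCP."""
    in_net = (
        ipaddress.ip_address(src_ip) in NET
        or ipaddress.ip_address(dst_ip) in NET
    )
    return in_net and ip_proto == IPPROTO_TCP


print(matches_net_and_tcp("192.168.1.10", "10.0.0.1", 6))   # TCP from the network
print(matches_net_and_tcp("10.0.0.1", "192.168.1.10", 17))  # UDP to the network
print(matches_net_and_tcp("10.0.0.1", "10.0.0.2", 6))       # TCP outside the network
```

Only the first packet would be pushed by that rule. The other two fail either the protocol test or the network test.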
Decap
Default (true)
In addition to FilterBPF, full packet de-encapsulation is performed by default before the BPF filter is applied. This can decode, for example, VLAN, ERSPAN, GRE tunnels and many more. It allows the BPF filter to be applied to the inner payload instead of the encapsulated output.
Example to disable automatic De-encapsulation
Configuration is a simple boolean type only
| Command | Description |
| --- | --- |
| Decap | Boolean value. "true" enables packet de-encapsulation (default true) |
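To show why de-encapsulation matters for filtering, the sketch below strips 802.1Q VLAN tags from an Ethernet frame so the inner EtherType and payload become visible. It is a simplified illustration covering VLAN tags only, not the FMADIO de-encapsulation code, and the function name is hypothetical.

```python
import struct

ETH_P_8021Q = 0x8100  # EtherType of an 802.1Q VLAN tag
ETH_P_IPV4 = 0x0800   # EtherType of IPv4


def strip_vlan(frame: bytes):
    """Return (ethertype, payload) after skipping any 802.1Q VLAN tags."""
    # Bytes 0-11 are destination and source MAC, bytes 12-13 the EtherType.
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    offset = 14
    while ethertype == ETH_P_8021Q:
        # A tag is a 2 byte TCI followed by the 2 byte inner EtherType.
        ethertype = struct.unpack_from("!H", frame, offset + 2)[0]
        offset += 4
    return ethertype, frame[offset:]


# Build a VLAN 100 tagged frame carrying a placeholder IPv4 payload.
macs = bytes(12)
tagged = macs + struct.pack("!HHH", ETH_P_8021Q, 100, ETH_P_IPV4) + b"payload"

ethertype, payload = strip_vlan(tagged)
print(hex(ethertype), payload)
```

With the tag removed, a filter written against the inner packet sees the IPv4 EtherType directly instead of the VLAN tag.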
PipeCmd
Requires FW:7157+
Default (nil)
Pipe commands are processed on a per PCAP split basis before the end transport is applied. Typical uses are to GZIP or otherwise compress files before sending them to the endpoint.
The command can be any generic linux application that reads stdin and writes stdout; gzip and lz4 are current examples. Other options are possible, please contact us for more details.
The above runs gzip with compression level 1 on the split PCAP before sending it to the output location. Some examples are shown below.
| Command | Description |
| --- | --- |
| gzip -c -1 | Run GZIP on split PCAPs with fastest compression mode |
| gzip -c -9 | Run GZIP on split PCAPs in maximum compression mode |
| lz4 -c | Run LZ4 compression on split PCAPs for fast compression |
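The stdin/stdout contract of a pipe command can be tried locally before it is put into the config. The following Python sketch feeds bytes to a command on stdin and collects stdout, which is the general shape of the contract described above. It is not how the push process invokes the command internally, the helper name is hypothetical, and it assumes the gzip binary is on the PATH.

```python
import gzip
import subprocess


def pipe_through(cmd, data: bytes) -> bytes:
    """Write data to the command's stdin and return what it writes to stdout."""
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


# Stand-in for one PCAP split; real splits are binary PCAP data.
split = b"\x00" * 4096

compressed = pipe_through(["gzip", "-c", "-1"], split)

# The output is a valid gzip stream that decompresses to the original bytes.
assert gzip.decompress(compressed) == split
print(len(split), "->", len(compressed), "bytes")
```

Any command that behaves this way, reading the whole split on stdin and writing the transformed bytes to stdout, fits the same pattern.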
FileSuffix
Requires FW:7157+
Default (nil)
By default the split PCAP filename suffix is .pcap
For most operations that is sufficient, however for more complicated operations such as GZIP compressing with PipeCmd, a .pcap.gz file suffix is more appropriate. The following is an example config target that compresses and outputs splits in .pcap.gz file format
The above example pushes gzip compressed 1-minute PCAP splits to an S3 protocol storage device
| Suffix | Description |
| --- | --- |
| .pcap | Default suffix |
| .pcap.gz | GZIP compressed PCAP |
| .pcap.lz4 | LZ4 compressed PCAP |
Chunked
Requires FW: 7355+
Default: (nil)
Chunked mode is a more optimized processing mode. It increases the aggregate throughput of the Push operation, specifically for network traffic profiles skewed towards small packets.
By default it is disabled.
Example as follows
| Value | Description |
| --- | --- |
| true | Enables chunk mode |
FollowStart
Requires FW: 7355+
FollowStart forces the push to start from the beginning of the capture. If it is disabled, the push starts from the current capture position.
Default is "false", which pushes from the current capture position
Example as follows
| Value | Description |
| --- | --- |
| false | Push from the current capture position |
| true | Push from the start of the currently active capture |
CPU
Requires FW: 7750+
Sets a specific CPU for stream_cat to run on by overriding the default CPU setting. This is helpful when multiple pushes are running in parallel.
Default: the system assigned CPU number for push (typically CPU 23)
Example as follows
| Value | Description |
| --- | --- |
| nil | Uses the default system CPU for push operations |
| `<numeric value>` | Literal numeric value indicating which CPU to run stream_cat on |
Consumer Application Restart
The default behavior of the system is to constantly retry sending data downstream without loss. In some cases it is better to restart the push process and reset the start sending position when the consumer application restarts.
One example is an application that shuts down between 1AM and 6AM while the capture process runs 24/7. The application only wants to receive data from 6AM onwards, when it starts up.
Another use case is an application that is in an error state for an unspecified amount of time. The application requires real-time processing, so it needs the FMADIO Capture system to send data from the current time, without sending old historical data.
Adding the following configuration to the push_realtime.lua config will cause stream_cat to exit if the downstream consumer is unable to process data. FMADIO system scheduler will constantly restart the push process from the current time, until the consumer process starts processing data.
NOTE: Please keep the additional white space at the end of the command.
Analytics Scheduler
In addition to configuring /opt/fmadio/etc/push_pcap.lua, the Analytics scheduler must be configured to specify when the Push operation occurs. This is on the "CONFIG" tab of the FMADIO GUI. An example configuration to push files 24/7:
The "Analytics Engine" field must be exactly the following text.
Screenshot of 24/7 schedule is shown below
Troubleshooting
Logfiles
Configuration problems often occur when setting up the system. The following log files can be used to debug them
Monitoring the output can be as follows
In addition, each Push entry has a log file with the following format. The Desc value is described at https://docs.fmad.io/fmadio-documentation/configuration/automatic-push-pcap#desc
Example output of correct functionality is as follows
Manual Offline mode
In addition to log files, it is sometimes easier to debug via the CLI by manually starting the push on specific capture files. This can also be helpful for pushing historical PCAP files.
This is done using the following CLI command
Replace the capture name with the complete name of the capture. Also ensure the push scheduler has been disabled
Example output of successful offline mode run is shown below.
Note the following repeated status line indicates the push is operating successfully
For problems with a specific push target, check the logfile shown in the above command line, here /mnt/store0/log/push_pcap-all.cur
A good way to debug is to run tail -F /mnt/store0/log/push_pcap-all.cur and monitor the output, such as the following
Multiple Push Schedules
Multiple push_pcap schedules can be added to the system, for example
Push A) Realtime 1min push, used for realtime monitoring throughout the day
Push B) End of Day recon push, used for End of Day recon, pushing data to back office systems
This can be achieved with the following steps
1) Create a directory for custom analytics schedules
All files in this directory are sym linked to the /opt/fmadio/analytics directory used by the scheduler.
2) Copy the current push_pcap loader and rename it
Create a new push_pcap_eod loader as follows
Then Edit the file to load a different configuration
In this case the config file is called push_pcap_eod.lua
3) Configure the new push_pcap_eod.lua
Configure the new push_pcap_eod.lua file. This requires hand editing of the file, as fmadiocli only operates on the default configuration
4) Enable in the scheduler
In the GUI -> Config page, add the new loader file push_pcap_eod to the schedule
Example is shown below
5) Confirm operation
Log files for (push_pcap_eod) are named
Performance testing
Push performance is critical and subject to multiple factors. The following provides a baseline test of different variables.
Remote Write Performance
A first step is to confirm that writing to the remote file system has sufficient bandwidth. This is done by running the following commands
Example run
The above command writes a 20GB file to the remote file location. In this case it is writing at 678MB/sec (5.4Gbps throughput)
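The conversion between the MB/sec figure reported by the write test and the Gbps figure used elsewhere on this page is simple arithmetic, sketched below in Python. The helper names are hypothetical, and decimal units (1MB = 10^6 bytes, 1GB = 10^9 bytes) are an assumption of this sketch.

```python
def mb_per_sec_to_gbps(mb_per_sec: float) -> float:
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_sec * 8 / 1000


def write_seconds(size_gb: float, mb_per_sec: float) -> float:
    """Approximate time in seconds to write size_gb at the given rate."""
    return size_gb * 1000 / mb_per_sec


rate = 678  # MB/sec, the figure from the example run above

print(f"{mb_per_sec_to_gbps(rate):.1f} Gbps")        # 5.4 Gbps
print(f"{write_seconds(20, rate):.1f} sec for 20GB")  # 29.5 sec for 20GB
```

If the converted figure is below the expected capture rate, the remote file system is likely to be the limiting factor for the push.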
Maintenance
Force PCAP Link Layer Setting
To force the PCAP link layer type to Ethernet use the following CLI command on the PCAP
The symptom of an incorrect PCAP link layer setting is unusual tcpdump output such as the following
After setting the PCAP Link Layer setting using the above command the output is as follows
Which contains the correctly decoded packets.
Examples
Example configuration files for reference
Push to NFS Share with BPF Filter and 1 minute PCAPs
Push to NFS Share with BPF Filter and HHMMSS Timezone
FW: 7936+
Example pushes a single UDP multicast group 1001 at 1 minute snapshots using an Hour Min Sec with Timezone filename.
Push to NFS Share 1min Splits with BPF Filter and LZ4 compression
Example pushes 1min PCAPs with a BPF filter (port 80) and applies LZ4 compression. LZ4 compression is fast with reasonably good compression rates.
Push to NFS share 1min Splits with BPF Filter and ZSTD compression
Example pushes 1min PCAPs with a BPF filter (port 80) and applies ZSTD compression. ZSTD is a newer compression format with performance close to LZ4 and compression rates close to GZIP.
Push to NFS/CIFS Share 1GB splits
FW: 7936+
Example pushes the raw data to a remote NFS/CIFS (Windows Share) splitting by 1GB file size writing a gzip compressed PCAP file to the remote location.
Push to MAGPACK over FTP
Push to an LXC 24/7
Push to Multiple lxc_ring 24/7
Example pushes different VLAN traffic to separate lxc_rings
Push to AWS S3 Bucket with Compression
Pushing captured PCAP data from the local device to AWS S3 Bucket can be done using the RCLONE support.
Below is an example push_pcap.lua config file for that
This uses gzip to compress the data. Also note we added a PreCapture filter that slices all traffic to the AWS S3 IP address to 64B. This prevents a runaway explosion in capture size.
Below is the resulting output in AWS S3
This requires the RCLONE S3 config to be set up before use.
Performance
Performance of push varies by protocol, filter and compression mode. The IO bandwidth of the remote push location is typically the bottleneck.
Here we are testing against a RAM disk on the local system. This removes the network and remote IO performance from the benchmark, focusing entirely on the FMADIO fetch, filter and compress performance.
Parallel GZIP (pigz) running on an FMADIO 100Gv2 Analytics System (96 CPUs total), making good use of all the CPUs
XZ compression using 64 CPUs, with more utilization than Parallel GZIP
Traffic Profile ISP
The dataset we are testing with is a WAN connection that has an ISP like L2-L7 packet distribution. As it is ISP like traffic, the data compression rate is not high.

| Compression | Throughput | Output size | Compression ratio |
| --- | --- | --- | --- |
| raw (not compressed) | 6.9Gbps | 45.5GB | x 1.0 |
| lz4 | 2.5Gbps | 41.2GB | x 1.104 |
| gzip -1 (fast) | 0.16Gbps | 40.5GB | x 1.123 |
| gzip default | 0.15Gbps | 40.2GB | x 1.131 |
| pigz 64 CPU (Parallel GZIP with 64 CPUs) | 4.8Gbps | 40.3GB | x 1.129 |
| pigz 32 CPU (Parallel GZIP with 32 CPUs) | 4.4Gbps | 40.3GB | x 1.129 |
| pigz 8 CPU (Parallel GZIP with 8 CPUs) | 1.04Gbps | 40.3GB | x 1.129 |
| zstd default | 0.77Gbps | 39.5GB | x 1.15 |
| zstd --fast | 2.4Gbps | 40.3GB | x 1.129 |
| xz default | 0.019Gbps | 39.2GB | x 1.160 |
| xz 64 CPU (-T 64) | 0.909Gbps | 38.9GB | x 1.17 |
| xz 32 CPU (-T 32) | 0.502Gbps | 39.3GB | x 1.157 |
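The ratio figures for the ISP profile follow from dividing the raw size by the compressed size. The short Python sketch below reproduces two of them from the sizes listed for this profile. The function name is hypothetical, and small differences on other rows are expected because the listed sizes are rounded to 0.1GB.

```python
RAW_GB = 45.5  # raw (not compressed) size of the ISP profile dataset


def compression_ratio(raw_gb: float, compressed_gb: float) -> float:
    """Ratio of raw size to compressed size; higher means better compression."""
    return raw_gb / compressed_gb


for name, size_gb in [("lz4", 41.2), ("gzip -1 (fast)", 40.5)]:
    print(f"{name}: x {compression_ratio(RAW_GB, size_gb):.3f}")
# lz4: x 1.104
# gzip -1 (fast): x 1.123
```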
Traffic Profile Finance
The data set tested is a full day's worth of the OPRA A+B feed; the raw uncompressed data size is just under 1TB. Financial data typically gets a x2 to x3 compression ratio, with xz maxing out at x5.

| Compression | Throughput | Time | Output size | Compression ratio |
| --- | --- | --- | --- | --- |
| raw (not compressed) | 4.29Gbps | 0.6H | 979GB | x 1.0 |
| lz4 | 1.65Gbps | 1.3H | 372GB | x 2.629 |
| gzip -1 (fast) | 0.341Gbps | 6.3H | 311GB | x 3.14 |
| gzip (default) | 0.129Gbps | 16.6H | 266GB | x 3.67 |
| pigz 64 CPU (Parallel GZIP) | 3.92Gbps | 0.55H | 264GB | x 3.70 |
| pigz 32 CPU (Parallel GZIP) | 3.92Gbps | 0.55H | 264GB | x 3.70 |
| pigz 16 CPU (Parallel GZIP) | 1.98Gbps | 1.09H | 264GB | x 3.70 |
| pigz 8 CPU (Parallel GZIP) | 0.997Gbps | 2.1H | 264GB | x 3.70 |
| zstd default | 1.118Gbps | 1.9H | 255GB | x 3.83 |
| zstd --fast | 1.90Gbps | 1.13H | 294GB | x 3.32 |
| xz default | 0.012Gbps | 181H | 184GB | x 5.32 |
| xz 64 CPU | 0.604Gbps | 3.6H | 184GB | x 5.32 |
| xz 32 CPU | 0.372Gbps | 5.8H | 184GB | x 5.32 |
| xz 16 CPU | 0.192Gbps | 11.3H | 184GB | x 5.32 |
| xz 8 CPU | 0.096Gbps | 22.6H | 184GB | x 5.32 |
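For planning an end of day push, the time figures in the Finance profile can be approximated from the raw data size and the throughput. The Python sketch below assumes the throughput is measured against the raw (uncompressed) input and uses decimal units. Both are assumptions of this sketch, and the function name is hypothetical.

```python
RAW_GB = 979  # raw size of the Finance profile dataset


def hours_to_push(raw_gb: float, gbps: float) -> float:
    """Approximate hours to process raw_gb of input at gbps throughput."""
    seconds = raw_gb * 8 / gbps
    return seconds / 3600


print(f"lz4:        {hours_to_push(RAW_GB, 1.65):.1f}H")   # 1.3H
print(f"xz default: {hours_to_push(RAW_GB, 0.012):.0f}H")  # 181H
```

The same calculation with a site's own daily capture volume gives a rough idea of whether a given compression mode can finish inside the available window.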