[netsa-tools-discuss] Any known issues with memory use and python plugins?

Markland, Matthew W. (Matt), M.S. Markland.Matthew at mayo.edu
Wed Feb 7 16:53:23 EST 2018


All:

I'm working on moving a workflow from SiLK 3.11.0.1 to SiLK 3.16 and have run into some interesting memory usage numbers. I have a large deduped file (1.6 GB compressed) on which I'm running rwuniq with a custom python plugin that adds two fields to the output. The python plugin is not doing any real computation per se (i.e. no heavy math). What tipped me off is that the rwuniq output under SiLK 3.16 contains entries with addresses that are not in the dedupe file, and those entries also appear malformed.

A typical invocation of rwuniq looks like this:

/usr/bin/time /home/users/software/silk/bin/rwuniq --python-file=../scripts/get_mseconds.py --values=Records,Packets,Bytes,start_mseconds,end_mseconds --fields=1-5,8 --timestamp-format=iso,local --ipv6-policy=asv4 --delimited=, 2018020214.dedupe > 2018020214.summary
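
One way to locate the malformed entries in a summary file produced by a command like the one above is a small checker along these lines. This is only a sketch under assumptions: it assumes the first line of the output is the column title row, that the first two columns are the source and destination addresses (fields 1 and 2), and that all addresses are IPv4 because of --ipv6-policy=asv4. The function name and the sample rows are made up for illustration and are not taken from the real data.

```python
import csv
import ipaddress


def find_malformed_rows(lines, ip_columns=(0, 1)):
    """Return (line_number, reason) pairs for rows that look malformed.

    Assumes comma-delimited text whose first line is a title row and
    whose columns listed in ip_columns should hold IPv4 addresses.
    """
    problems = []
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:
        return problems
    expected = len(header)
    for line_number, row in enumerate(reader, start=2):
        # A row with the wrong number of fields is flagged immediately.
        if len(row) != expected:
            problems.append(
                (line_number, f"expected {expected} fields, got {len(row)}")
            )
            continue
        for col in ip_columns:
            try:
                ipaddress.IPv4Address(row[col].strip())
            except ValueError:
                problems.append(
                    (line_number, f"column {col} is not an IPv4 address: {row[col]!r}")
                )
                break
    return problems


# Hypothetical sample; the column titles and values are illustrative only.
sample = [
    "sIP,dIP,sPort,dPort,protocol,flags,Records,Packets,Bytes,start_mseconds,end_mseconds",
    "10.0.0.1,10.0.0.2,1234,80,6,S,1,2,120,0,0",
    "10.0.0.1,garbage,1234,80,6,S,1,2,120,0,0",
    "10.0.0.1,10.0.0.2,1234",
]

for line_number, reason in find_malformed_rows(sample):
    print(f"line {line_number}: {reason}")
```

For a real file, the same function can be given an open file object, for example `find_malformed_rows(open("2018020214.summary", newline=""))`. It only reports rows that fail to parse; it does not check whether an address actually occurs in the dedupe file.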


I'm running on a 16-core machine with plenty of RAM. When I run I see the following "Max Resident Memory" numbers as provided by /usr/bin/time:

3.11.0.1:   842700
3.16:      5577576

Seeing this, I'm guessing the mangled output is due to hitting some sort of 4 GB limit. If I don't use the python plugin, I see this:

3.16:  2787624

Which still doesn't seem to be stellar in my mind, but at least it doesn't mangle the output.
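
To put the three figures side by side with the suspected 4 GB limit, the arithmetic can be sketched as follows. The unit is an assumption: the message does not state it, and this sketch assumes /usr/bin/time is GNU time, which reports maximum resident set size in kilobytes. The labels are taken from the numbers quoted in this message.

```python
KIB_PER_GIB = 1024 * 1024

# Values as reported by /usr/bin/time, assumed to be in kilobytes.
measurements_kib = {
    "3.11.0.1": 842700,
    "3.16": 5577576,
    "3.16 without plugin": 2787624,
}


def kib_to_gib(kib):
    """Convert a size in KiB to GiB."""
    return kib / KIB_PER_GIB


for label, kib in measurements_kib.items():
    gib = kib_to_gib(kib)
    side = "above" if gib > 4 else "below"
    print(f"{label}: {gib:.2f} GiB ({side} 4 GiB)")
```

Under that assumption, only the 3.16 run with the plugin exceeds 4 GiB. That is consistent with the guess about a 4 GB limit, but it does not confirm it.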

Looking at the config.log file for the 3.16 build I don't see anything that would appear to change its behavior like this. It was built with gcc 4.8.5 on CentOS (I believe).

I'm thinking that I may not need the python plugin going forward (more experiments to run), but I wanted to see if anyone else has seen memory usage like this.

Thanks for your time!

Matt

-- 
Matthew Markland | Sr Analyst/Programmer | Information Technology | 507-538-5493 | markland.matthew at mayo.edu
Mayo Clinic | 200 First Street SW | Rochester, MN 55905 | www.mayoclinic.org