[netsa-tools-discuss] Setting up yaf, super_mediator and silk (on FreeBSD)

andreas scherrer ascherrer at gmail.com
Sat Nov 19 17:28:09 EST 2016


Hi Emily

Thank you for this valuable input again. You were spot on, it seems. I 
have added "filedaemon" to the mix on the collector and now the length 
mismatch messages are gone.
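
For anyone hitting the same "length mismatch" message: one rough way to
check whether a file was picked up before it was completely written is
to walk its IPFIX message headers. The following is only a sketch under
assumptions, not code from YAF or super_mediator. The function name
scan_ipfix is made up, and it assumes the file is a plain sequence of
IPFIX messages, each starting with the 16-byte header from RFC 7011
(big-endian version 0x000a followed by the total message length).

```python
import struct

IPFIX_VERSION = 0x000A   # version field of an IPFIX message header (RFC 7011)
HEADER_LEN = 16          # version, length, export time, sequence, domain id


def scan_ipfix(data):
    """Walk the IPFIX messages in `data` (bytes).

    Returns (complete_messages, problem). `problem` is None when the
    buffer ends exactly on a message boundary, otherwise a short
    description of the first inconsistency found.
    """
    offset = 0
    count = 0
    while offset < len(data):
        remaining = len(data) - offset
        if remaining < HEADER_LEN:
            return count, "truncated header at offset %d" % offset
        # Only the first two header fields are needed for framing.
        version, length = struct.unpack_from("!HH", data, offset)
        if version != IPFIX_VERSION:
            return count, "unexpected version 0x%04x at offset %d" % (version, offset)
        if length < HEADER_LEN:
            return count, "invalid length %d at offset %d" % (length, offset)
        if length > remaining:
            return count, "message needs %d bytes, only %d left" % (length, remaining)
        offset += length
        count += 1
    return count, None
```

Calling scan_ipfix(open(path, "rb").read()) on a file that is still
being written would be expected to report a message that needs more
bytes than are left, which mirrors the "buffer has X, read Y" wording.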

Unfortunately I am not able to upgrade to 1.4 easily. Also I think it is 
some sort of MySQL bug to return the error code "0" and no error 
message. All the documentation I found claims that error 1148 should be 
returned...

From what I see in the "mysql_option" enum in my mysql.h (line 159 on
my system) I believe it should be possible to set the
"MYSQL_OPT_LOCAL_INFILE" option (via mysql_options) before calling
mysql_real_connect; this would be analogous to what is currently done
for "MYSQL_OPT_RECONNECT" in super_mediator's mdExporterAddMySQLInfo.
Maybe that would help in my case?

Anyway, I have followed the idea presented in [9] (section "Manual 
Import into MySQL Database") and set up a script that is loading the 
data into MySQL for now. That works like a charm.
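
A minimal sketch of what such a loader step could look like in Python
follows. It is an illustration, not the script mentioned above: the
helper names split_record and load_statement are made up, the file
name in the usage note is hypothetical, and the six-field layout is
only inferred from the example "dns file" line quoted further down.
The generated statement still has to be run through a MySQL client,
and LOAD DATA LOCAL is only accepted when local infile is enabled on
both client and server.

```python
def split_record(line, delimiter="|", expected_fields=6):
    """Split one line of a super_mediator text file and check its width.

    expected_fields=6 is an assumption based on the sample line in this
    thread (two timestamps, type, name, hit count, value).
    """
    fields = line.rstrip("\n").split(delimiter)
    if len(fields) != expected_fields:
        raise ValueError(
            "expected %d fields, got %d" % (expected_fields, len(fields))
        )
    return fields


def load_statement(path, table="dns_dedup", delimiter="|"):
    """Build a LOAD DATA LOCAL INFILE statement for one text file."""
    # Keep quoting trivial: refuse anything that would need escaping.
    for value in (path, table, delimiter):
        if any(ch in value for ch in "'`\\"):
            raise ValueError("refusing to quote %r" % value)
    return (
        "LOAD DATA LOCAL INFILE '%s' INTO TABLE `%s` "
        "FIELDS TERMINATED BY '%s'" % (path, table, delimiter)
    )
```

For example, load_statement("/var/spool/silk/dns/dns.txt") yields a
statement that could be passed to the mysql client with -e, after
checking a few lines of the file with split_record.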

So I am up and running, I guess!

Again, thank you for all your help and patience.


Best regards
andreas

[9] https://tools.netsa.cert.org/super_mediator/sm_guide.html

On 16.11.16 22:14 , Emily Sarneso wrote:
> Andreas,
>
> I believe the length mismatch might be happening because rwsender is stealing the file before YAF is finished writing to it.  rwsender doesn’t honor YAF’s locking system.  filedaemon is a tool that is installed with YAF that will respect YAF’s locking system and can move the files from YAF’s output directory to rwsender’s incoming directory when the lock is removed. Try:
>
> filedaemon --in '/data/yaf/yaf*' --next-dir=/var/spool/silk/destination --lock
>
> I’m still not sure why you are having issues uploading to the database.  I doubt it’s related to the length mismatch issue.  There was a bug in 1.3.0 (fixed in 1.4.0) related to uploading files to the database when the MYSQL_TABLE was not set in the configuration, but I tested your exact configuration and it appears to be working for me with 1.3.0.
>
> Try the filedaemon trick and let me know if that fixes that issue.  If that doesn’t fix the database loading problem, I might have you try and upgrade to 1.4.0 if possible.
>
> Thanks,
>
>
> Emily
>
>
>
>
>
>
>> On Nov 14, 2016, at 5:49 PM, andreas scherrer <ascherrer at gmail.com> wrote:
>>
>> Dear Emily
>>
>> Thank you so much for bearing with me!
>>
>> I guess I failed to provide all the required information once more.
>>
>> I am (trying to) use the "LAST_SEEN" option already. Here's my "DNS_DEDUP" block:
>>
>> DNS_DEDUP
>>   MAX_HIT_COUNT 500
>>   LAST_SEEN
>> DNS_DEDUP END
>>
>> And according to what I see in the files, this seems to be working. This is an example line from a "dns file" that super_mediator wrote:
>>
>> -----
>> 2016-11-13 21:45:43.862|2016-11-13 21:45:43.862|1|googleapis.l.google.com.|1|172.217.19.10
>> -----
>>
>> For the sake of it, I have also removed the "LAST_SEEN" option from the SM config and dropped/recreated the table using "--dns-dedup". The result was the same: "dns: Error loading data 0:".
>>
>> Is there a way to get more information about (the root cause of) this error?
>>
>> super_mediator is polling the directory where rwreceiver is saving the files; so I did update the configuration to use a duplicate destination. That configuration does work (I see files "popping up" in the duplicate dir and super_mediator is reading (and removing) them). But the same "length mismatch" messages keep appearing in the log.
>>
>> This is somewhat consistent with my feeling that the error even pops up when I stop super_mediator for a short while, let rwreceiver "accumulate" some files, then stop rwreceiver and start super_mediator.
>>
>> I wonder if the two issues are related (i.e. the database load fails because the file is corrupted?).
>>
>>
>> Best regards
>> andreas
>>
>> Ps.: I have sanitized my "full" config and uploaded it to dropbox [7] if this helps; I did not want to "pollute" the list with this.
>>
>> [7] https://dl.dropboxusercontent.com/u/45790405/yaf_silk_sm.tar.gz
>>
>> On 14.11.16 17:23 , Emily Sarneso wrote:
>>> Hello Andreas,
>>>
>>> To create the default “dns_dedup” table, please use the --dns-dedup option to super_table_creator.  The option name changed in a recent release and that particular tutorial hasn’t been updated yet.  I believe the reason why you are getting the database load error is that you used the --dedup-last-seen option, which creates a slightly different table.  If you want to use the LAST_SEEN option, you must add the following block to the super_mediator.conf:
>>>
>>> DNS_DEDUP
>>>    LAST_SEEN
>>> DNS_DEDUP END
>>>
>>> The LAST_SEEN option is described in the super_mediator.conf man page.
>>>
>>> The “IPFIX Message length mismatch (buffer has 3997, read 8899)” error usually means that the file is incomplete or truncated in some way.  Do you have super_mediator polling rwreceiver’s incoming directory?  I’m wondering if super_mediator is picking up the file before rwreceiver is finished writing to it.  You may want to try adding a --duplicate-destination option to rwreceiver and having super_mediator pick up the file from the duplicate destination.
>>>
>>>
>>> Emily
>>>
>>>
>>>
>>>
>>> --------------------
>>> Emily Sarneso
>>> CMU/SEI/CERT
>>> ecoff at cert.org
>>> (412) 268-6313
>>>
>>>
>>>
>>>
>>>
>>>> On Nov 12, 2016, at 6:04 PM, andreas scherrer <ascherrer at gmail.com> wrote:
>>>>
>>>> Hi Emily
>>>>
>>>> As indicated, I am now trying to have super_mediator update a MySQL database. The relevant excerpt from super_mediator.conf is the following:
>>>>
>>>> -----
>>>> #dedup process
>>>> EXPORTER TEXT "dns"
>>>>  APPLICATION == 53
>>>>  PATH "/var/spool/silk/dns/dns"
>>>>  DELIMITER "|"
>>>>  ROTATE 120
>>>>  DNS_DEDUP_ONLY
>>>>  LOCK
>>>>  MYSQL_USER "mediator"
>>>>  MYSQL_PASSWORD "youmeeverybody"
>>>>  MYSQL_TABLE "dns_dedup"
>>>>  MYSQL_DATABASE "smediator"
>>>> EXPORTER END
>>>> -----
>>>>
>>>> The table "dns_dedup" exists and was created by using
>>>>
>>>> -----
>>>> super_table_creator --name mediator \
>>>>   --pass=<SuperSecretPassword> \
>>>>   --database=smediator \
>>>>   --dedup-last-seen
>>>> -----
>>>>
>>>> (Note: [5] states that
>>>>   "super_table_creator --name mediator \
>>>>   --pass=<SuperSecretPassword> \
>>>>   --database=smediator --dedup"
>>>> should be used, but that gave me an error (missing argument to "dedup" or something similar).)
>>>>
>>>> The user "mediator" has the required privileges on the database:
>>>>
>>>> -----
>>>> mysql> show grants;
>>>> +------------------------------------------------------------------------------+
>>>> | Grants for mediator@localhost                                                |
>>>> +------------------------------------------------------------------------------+
>>>> | GRANT USAGE ON *.* TO 'mediator'@'localhost' IDENTIFIED BY PASSWORD <secret> |
>>>> | GRANT ALL PRIVILEGES ON `smediator`.* TO 'mediator'@'localhost'              |
>>>> +------------------------------------------------------------------------------+
>>>> 2 rows in set (0.00 sec)
>>>> -----
>>>>
>>>> But, unfortunately, the table stays empty and the log (set to DEBUG) only shows the following:
>>>>
>>>> -----
>>>> dns: Error loading data 0:
>>>> -----
>>>>
>>>> The files on disk however are there and contain data. I also tried to manually load one of the files using "mysqlimport" and this worked.
>>>>
>>>> Any idea what might be going wrong or how to debug this?
>>>>
>>>> When looking at the log, I also saw quite a few lines like the following:
>>>>
>>>> -----
>>>> Ignoring Packet: IPFIX Message length mismatch (buffer has 18507, read 33478)
>>>> -----
>>>>
>>>> And eventually super_mediator even crashed:
>>>>
>>>> -----
>>>> super_mediator terminating on error: IPFIX Message length mismatch (buffer has 3997, read 8899)
>>>> -----
>>>>
>>>> I do not believe that this is directly related but I guess it indicates that I (still) have some issues in my config...
>>>>
>>>> On [6] I also saw that you are referring to a bug when importing DNS into MySQL. Could that be related? I am running super_mediator v1.3.0... (which is the latest version available as a package on FreeBSD)
>>>>
>>>> $ super_mediator --version
>>>> super_mediator version 1.3.0
>>>>
>>>>
>>>> Again, any help is greatly appreciated
>>>> andreas
>>>>
>>>> [5] https://tools.netsa.cert.org/yaf/libyaf/yaf_sm_silk.html
>>>> [6] https://tools.netsa.cert.org/confluence/x/E4Dz
>>>>
>>>>
>>>> On 9.11.16 20:00 , Emily Sarneso wrote:
>>>>> Hi Andreas,
>>>>>
>>>>> Thank you for the details.  Using 2 machines is a typical setup configuration.  First, I want to note that YAF exports IPFIX data.  The --silk option in YAF only [slightly] changes how some of the IPFIX data is exported.  Super_mediator expects IPFIX data from YAF.  Flowcap/rwflowpack take the IPFIX data and convert it into SiLK data, which is an entirely different data format.  Once the data is in SiLK format, super_mediator will not be able to read it and all of the DPI/payload data is dropped. Therefore, super_mediator must exist between YAF and flowcap/rwflowpack.
>>>>>
>>>>> If you only have one sensor and both the sensor and analysis machines are within the same network, I would suggest not using rwsender and rwreceiver.  You could run YAF on the sensor and super_mediator and rwflowpack on the analysis machines and they all would communicate via TCP.  If that doesn’t work for you, I would suggest the following configuration:
>>>>>
>>>>> On the collector machine run YAF exporting via TCP to super_mediator and flowcap.  The super_mediator configuration file would look something like this:
>>>>>
>>>>> COLLECTOR TCP "yaf"
>>>>>  PORT 18000
>>>>>  HOST localhost
>>>>> COLLECTOR END
>>>>>
>>>>> EXPORTER TCP "silk"
>>>>>  PORT 18001
>>>>>  HOST localhost
>>>>>  FLOW_ONLY
>>>>> EXPORTER END
>>>>>
>>>>> EXPORTER TEXT "dns"
>>>>>  APPLICATION == 53
>>>>>  FIELDS stime,etime,sip,dip,sport,dport,protocol,vlanint,DPI
>>>>>  PATH "/data/dns/yafdns"
>>>>>  DPI_ONLY
>>>>>  ROTATE 120
>>>>>  LOCK
>>>>>  GZIP
>>>>>  MOVE "/var/rwsender/incoming"
>>>>> EXPORTER END
>>>>>
>>>>> If you are only interested in DNS data, I would suggest setting --plugin-opts=53 or if you are using the yaf config file:
>>>>>
>>>>> DPI_PLUGIN = {name = "/usr/local/lib/yaf/dpacketplugin.la", options="53",
>>>>>                         conf="/usr/local/etc/yafDPIRules.conf"}
>>>>>
>>>>> In the sensor.conf file, your probe block should look something like this:
>>>>>
>>>>> probe S0 ipfix
>>>>>     protocol tcp
>>>>>     listen-on-port 18001
>>>>>     interface-values vlan
>>>>> end probe
>>>>>
>>>>> With the above configuration, super_mediator will forward flow data to flowcap. flowcap will generate SiLK incremental files in the destination directory.  Additionally, super_mediator will write flow records that contain DNS data to /data/dns.  After 120 seconds it will close the file it is writing to, GZIP it, and move it to the rwsender incoming directory. rwsender will poll that directory for both the SiLK files and the DNS files created by super_mediator and transfer them to 2 rwreceivers on the analysis machine.  The rwsender command will look similar to this:
>>>>>
>>>>> /usr/local/sbin/rwsender --ident=sender --mode=server --incoming-dir=/var/rwsender/incoming --processing-dir=/var/rwsender/processing --error-dir=/var/rwsender/error --server-port=19873 --client-ident=receive1 --client-ident=receive2 --filter=receive1:yaf2dns --filter=receive2:_S0 --log-dest=syslog
>>>>>
>>>>> This will transmit the DNS files to rwreceiver “receive1” and the SiLK files to “receive2”.
>>>>>
>>>>> On your analysis machine you can have 2 rwreceivers:
>>>>>
>>>>> /usr/local/sbin/rwreceiver --identifier=receive1 --mode=client --destination-dir=/data/dns --server-address=sender:localhost:19873 --log-dest=syslog --post-command='gunzip -d %s'
>>>>> /usr/local/sbin/rwreceiver --identifier=receive2 --mode=client --destination-dir=/data/silk/incoming --server-address=sender:localhost:19873 --log-dest=syslog
>>>>>
>>>>> The post-command on the first rwreceiver will decompress the files.
>>>>>
>>>>> From there you will run rwflowpack with the incoming directory as the second rwreceiver’s destination-directory [/data/silk/incoming].  The --input-mode should be set to fcfiles.
>>>>>
>>>>>
>>>>> The other option is to have YAF write to IPFIX files and transport those via rwsender/rwreceiver to super_mediator on the analysis machine.  super_mediator could poll the incoming directory and write the DNS data to files as well as forward the flow data to rwflowpack.
>>>>>
>>>>> Unfortunately there is no --become-user option for super_mediator.  I will add this feature to the next super_mediator release.
>>>>>
>>>>> I know that’s a lot of information.  I hope this helps.  Please let me know if you have any other questions.
>>>>>
>>>>> Emily
>>>>>
>>>>>
>>>>>
>>>>>> On Nov 8, 2016, at 6:27 PM, andreas scherrer <ascherrer at gmail.com> wrote:
>>>>>>
>>>>>> Hi Emily
>>>>>>
>>>>>> Thank you very much for the quick and precise answer. Unfortunately I am still struggling with my setup; that is probably because I oversimplified in my question.
>>>>>>
>>>>>> Actually I have two machines, a collector and an analysis machine, and I (try to) use rwsender/rwreceiver. Basically I am trying to add super_mediator to the mix which is described in figure 1.4 (page 12) in [2].
>>>>>>
>>>>>> I believe this should work as follows:
>>>>>> a) yaf running on collector, producing --silk data
>>>>>> b) flowcap running on collector getting yaf data (following your advice I have switched to TCP localhost) and writing to a file/directory (i.e. producing a silk file)
>>>>>> c) rwsender waiting for rwreceiver to fetch the file
>>>>>> d) rwreceiver running on analysis machine fetches file (still in silk format)
>>>>>> e) super_mediator on analysis machine reads the (silk) file and produces multiple outputs: an IPFIX file for rwflowpack as well as DPI output (be it a file or load it into a database)
>>>>>> f) rwflowpack reads the IPFIX file that super_mediator produced
>>>>>>
>>>>>> It seems that I get something conceptually wrong here, but I do not seem to grok it.
>>>>>>
>>>>>> What I (now) see is super_mediator complaining:
>>>>>> "C1: Ignoring Packet: Illegal IPFIX Message version 0xdead; input is probably not an IPFIX Message stream."
>>>>>>
>>>>>> => yes, it is not IPFIX, it is silk... Kind of the inverse issue I had before :(
>>>>>>
>>>>>> For example, [3] uses the "--silk" option when yaf is started and sending data to super_mediator (although not via file, but via TCP). This indicates that super_mediator can handle/read data that was produced by yaf using --silk, no? On the other hand, [4] and other sources seem to indicate that super_mediator only handles IPFIX. At least [4] states (section "Collector Block", title "PATH file path") "...should be the name of the IPFIX file to read and process.".
>>>>>>
>>>>>> I want to do DNS (and maybe HTTP) logging. Does that mean that I need --silk or could I also do this with IPFIX "only"? This should be possible without --silk, right?
>>>>>>
>>>>>> Does it mean I need to export the payload (in/with yaf)? I guess so, otherwise super_mediator (running on a different machine!) will not have the required information?
>>>>>>
>>>>>> Hopefully it is possible to make sense of my text here! Any help/nudge in the right direction is greatly appreciated.
>>>>>>
>>>>>> On a side note: is there an option to run super_mediator as a different user (like --become-user for yaf)? I am not sure if/how I would automatically start super_mediator as non-root at boot time without such a switch.
>>>>>>
>>>>>> [2] http://tools.netsa.cert.org/silk/silk-install-handbook.pdf
>>>>>> [3] http://tools.netsa.cert.org/yaf/libyaf/yaf_sm_silk.html
>>>>>> [4] http://tools.netsa.cert.org/super_mediator/super_mediator.conf.html
>>>>>>
>>>>>> On 8.11.16 14:52 , Emily Sarneso wrote:
>>>>>>> Hello Andreas,
>>>>>>>
>>>>>>> Thank you for your interest in our tools.  In order to have rwflowpack poll a directory for IPFIX files, make sure the input mode (--input-mode) is set to stream.  I suspect that it is currently set to fcfiles or respool as the log message indicates that it is expecting SiLK data, not IPFIX data.
>>>>>>>
>>>>>>> Alternatively, you can export IPFIX via TCP/UDP from super_mediator (instead of writing IPFIX to files and having rwflowpack poll a directory) by using the following EXPORTER block:
>>>>>>>
>>>>>>> EXPORTER UDP "S0"
>>>>>>> HOST localhost
>>>>>>> PORT 18001
>>>>>>> FLOW_ONLY
>>>>>>> EXPORTER END
>>>>>>>
>>>>>>> There is a good tutorial for configuring the tools here: http://tools.netsa.cert.org/yaf/libyaf/yaf_sm_silk.html
>>>>>>>
>>>>>>>> BTW: when I change "export_payload" to yes, I get the following error from yaf: "yaf terminating on error: End of message. Overrun on variable-length encode (need 2051 bytes, 402 available)"
>>>>>>>
>>>>>>> When exporting via UDP, the IPFIX message must be less than the MTU (typically 1420). When you enable export payload and max-payload is set to 2048, you are overrunning the available space in the IPFIX message.  I would suggest using TCP between YAF and super_mediator if you would like to export payload and DPI data.  Alternatively, you can reduce the max-payload OR max-export options in YAF to reduce the amount of data you are exporting. However, even if you reduce the max-payload or max-export options you may run into this same error message when exporting DPI data.  If you are required to use UDP, you should also change the “limit total” value in the yafDPIRules.conf to set a limit on the DPI data that YAF will export.
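
The "need 2051 bytes" in the error is consistent with a max-payload of
2048: in IPFIX (RFC 7011, section 7) a variable-length field shorter
than 255 octets carries a 1-octet length prefix, while longer content
is preceded by a 3-octet prefix (a 255 marker plus a 2-octet length).
A small sketch of that arithmetic, with a made-up function name:

```python
def varlen_encoded_size(content_len):
    """Octets needed to encode a variable-length IPFIX field (RFC 7011, sec. 7)."""
    if content_len < 0 or content_len > 0xFFFF:
        raise ValueError("content length must fit in 16 bits")
    # Short form: 1 length octet; long form: 0xFF marker plus 2 length octets.
    prefix = 1 if content_len < 255 else 3
    return prefix + content_len


print(varlen_encoded_size(2048))  # 2048 octets of payload plus the 3-octet prefix
```

So a single full payload field already exceeds what fits in one UDP
datagram of typical MTU size, before any flow or DPI fields are added.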
>>>>>>>
>>>>>>> Hope that helps.  Please let us know if you run into any other problems.
>>>>>>>
>>>>>>> Emily
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --------------------
>>>>>>> Emily Sarneso
>>>>>>> CMU/SEI/CERT
>>>>>>> ecoff at cert.org
>>>>>>> (412) 268-6313
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On Nov 7, 2016, at 6:23 PM, andreas scherrer <ascherrer at gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> I am trying to set up yaf to collect flows directly from an interface (which only sees VLAN tagged traffic) and forward it to super_mediator to process the DPI information and forward the flow to rwflowpack/SiLK.
>>>>>>>>
>>>>>>>> Unfortunately rwflowpack does not seem to be happy with what it gets from super_mediator:
>>>>>>>>
>>>>>>>> -----
>>>>>>>> rwflowpack[69512]: File does not appear to be a SiLK data file '<filename>'
>>>>>>>> -----
>>>>>>>>
>>>>>>>> I saw on [1] that the header of a SiLK file should have "0xDEADBEEF" at the beginning. That does not seem to be the case for my files...
>>>>>>>>
>>>>>>>> -----
>>>>>>>> $ hexdump <filename>.med| head -1
>>>>>>>> 0000000 0a00 0001 2158 8905 0000 0000 0000 0000
>>>>>>>> -----
>>>>>>>>
>>>>>>>> That seems to be consistent.
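
Reading the hexdump line above more closely suggests what the file
actually is. The sketch below is an illustration under assumptions: it
assumes hexdump's default two-byte display on a little-endian host (so
each printed word is byte-swapped), and the helper names swap_words and
describe are made up. The SiLK magic is the 0xDEADBEEF value from [1];
the IPFIX check uses the version and length fields of the RFC 7011
message header.

```python
import struct

SILK_MAGIC = b"\xde\xad\xbe\xef"   # start of a SiLK file header, per [1]
IPFIX_VERSION = 0x000A             # IPFIX message header version (RFC 7011)


def swap_words(hexdump_words):
    """Turn hexdump's two-byte words back into the raw byte sequence.

    Assumes the words were printed on a little-endian host, so "0a00"
    stands for the bytes 00 0a in the file.
    """
    raw = bytearray()
    for word in hexdump_words.split():
        raw += struct.pack("<H", int(word, 16))
    return bytes(raw)


def describe(header):
    """Guess the file type from the first bytes of a file."""
    if header[:4] == SILK_MAGIC:
        return "silk"
    if len(header) >= 4:
        version, length = struct.unpack_from("!HH", header)
        if version == IPFIX_VERSION:
            return "ipfix (message length %d)" % length
    return "unknown"


print(describe(swap_words("0a00 0001 2158 8905 0000 0000 0000 0000")))
```

Under those assumptions the dump decodes to an IPFIX message header
(version 10), which fits rwflowpack rejecting the file as "not a SiLK
data file".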
>>>>>>>>
>>>>>>>> yaf is running with the following configuration (yaf.init file):
>>>>>>>>
>>>>>>>> -----
>>>>>>>> input = {inf = "re1", type="pcap"}
>>>>>>>> UDP_LOCAL_EXPORT = {host = "localhost", port = "9901", protocol="udp"}
>>>>>>>> output = UDP_LOCAL_EXPORT
>>>>>>>> decode = {gre = false, ip4_only=false, ip6_only=false, nofrag=false}
>>>>>>>> export = {silk = true, mac = true}
>>>>>>>> applabel = true
>>>>>>>> applabel_rules = "/usr/local/etc/yafApplabelRules.conf"
>>>>>>>> maxpayload = 2048
>>>>>>>> export_payload = false
>>>>>>>> udp_uniflow = 53
>>>>>>>> DPI_PLUGIN = {name = "/usr/local/lib/yaf/dpacketplugin.so",
>>>>>>>>           conf="/usr/local/etc/yafDPIRules.conf"}
>>>>>>>> DHCP_PLUGIN = {name = "/usr/local/lib/yaf/dhcp_fp_plugin.so"}
>>>>>>>> plugin = {DPI_PLUGIN, DHCP_PLUGIN}
>>>>>>>> PCAP_EXPORT = {path = "/tmp/pcap", maxpcap=25, pcap_timer=300, meta="/tmp/meta"}
>>>>>>>> log = {spec = "/tmp/yaflog.log", level="debug"}
>>>>>>>> -----
>>>>>>>>
>>>>>>>> And started using the following command line:
>>>>>>>>
>>>>>>>> -----
>>>>>>>> yaf -c /usr/local/etc/yaf.init --become-user foo --become-group foo
>>>>>>>> -----
>>>>>>>>
>>>>>>>> BTW: when I change "export_payload" to yes, I get the following error from yaf: "yaf terminating on error: End of message. Overrun on variable-length encode (need 2051 bytes, 402 available)"
>>>>>>>>
>>>>>>>> super_mediator is running with the following config (file):
>>>>>>>>
>>>>>>>> -----
>>>>>>>> COLLECTOR UDP
>>>>>>>> PORT 9901
>>>>>>>> COLLECTOR END
>>>>>>>>
>>>>>>>> EXPORTER FILEHANDLER "S0"
>>>>>>>> PATH "/var/spool/silk/destination/"
>>>>>>>> ROTATE 10
>>>>>>>> FLOW_ONLY
>>>>>>>> EXPORTER END
>>>>>>>>
>>>>>>>> LOGLEVEL DEBUG
>>>>>>>>
>>>>>>>> LOG "/var/log/super_mediator.log"
>>>>>>>>
>>>>>>>> PIDFILE "/var/run/super_mediator.pid"
>>>>>>>> -----
>>>>>>>>
>>>>>>>> And is started with the following command line:
>>>>>>>>
>>>>>>>> -----
>>>>>>>> super_mediator -c /usr/local/etc/super_mediator.conf
>>>>>>>> -----
>>>>>>>>
>>>>>>>> I am doing this on FreeBSD (10.x).
>>>>>>>>
>>>>>>>> $ yaf --version
>>>>>>>> yaf version 2.8.4
>>>>>>>> $ super_mediator --version
>>>>>>>> super_mediator version 1.3.0
>>>>>>>>
>>>>>>>>
>>>>>>>> Any hint would be greatly appreciated!
>>>>>>>> andreas
>>>>>>>>
>>>>>>>> Ps.: My set up is working *without* super_mediator (meaning sending data directly from yaf to rwflowpack)
>>>>>>>>
>>>>>>>> [1] https://tools.netsa.cert.org/silk/faq.html#file-header

