[netsa-tools-discuss] flowdata storage scattered

Mark Thomas mthomas at cert.org
Wed May 13 13:57:31 EDT 2015


Stefan-

Hello.  Thank you for your interest in SiLK and for your question.

The short answer to your question is that there is no built-in
support for storing flow records by their end times.  SiLK was
designed for routers and flow generators that use an active timeout
not greater than 3600 seconds.

It appears to be relatively easy to modify the source code of
rwflowpack to store each record by its end time.  You must make
certain that rwflowpack uses particular file types when it stores
records to disk---more on this below.

Also, I must warn you that I do not know what sort of ripple effects
may result from changing rwflowpack to store a record by its end time.
Some tools (such as rwcount) work better when the flow records
appear in generally time-sorted order.

To make the change, find the section in rwflowpack that reads:

    /* Determine the hour this data is associated with.  This is a UTC
     * value expressed in milliseconds since the unix epoch, rounded
     * (down) to the hour. */
    key.time_stamp = rwRecGetStartTime(rwrec);
    key.time_stamp -= key.time_stamp % 3600000;

and change "rwRecGetStartTime" to "rwRecGetEndTime".
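
As a rough illustration of what that one-word change does, here is a
standalone sketch (not code from SiLK) that computes the hourly bucket
both ways.  The function names are hypothetical, and the sketch
assumes the end time is simply the start time plus the elapsed time,
both in milliseconds.

```c
#include <stdint.h>

#define MS_PER_HOUR INT64_C(3600000)

/* Round a UTC timestamp in milliseconds since the UNIX epoch down to
 * the start of its hour, as the rwflowpack snippet does. */
int64_t round_down_to_hour(int64_t ms)
{
    return ms - (ms % MS_PER_HOUR);
}

/* Hypothetical helper: bucket chosen from the start time (the
 * behavior quoted from rwflowpack). */
int64_t hour_bucket_by_start(int64_t start_ms)
{
    return round_down_to_hour(start_ms);
}

/* Hypothetical helper: bucket chosen from the end time, assuming
 * end = start + elapsed. */
int64_t hour_bucket_by_end(int64_t start_ms, uint32_t elapsed_ms)
{
    return round_down_to_hour(start_ms + (int64_t)elapsed_ms);
}
```

For a flow that starts five seconds into hour 10 and lasts two hours,
the first helper selects hour 10 and the second selects hour 12.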

Even if you make this change, SiLK is going to have difficulties
with the records from your Juniper routers.

The in-memory representation of a SiLK Flow record includes the
start time and the duration (or elapsed time) and not the end time.
The start time is represented as milliseconds since the UNIX epoch
and it is stored in a 64-bit integer.  The elapsed time is
represented as milliseconds and it is stored in a 32-bit unsigned
integer for a maximum duration of 49.7 days.
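
The 49.7-day figure follows from the width of the field: 2^32 - 1
milliseconds divided by 86,400,000 milliseconds per day.  The sketch
below checks that arithmetic and shows that a flow lasting two months
would not fit.  The function names are hypothetical and are not part
of SiLK.

```c
#include <stdbool.h>
#include <stdint.h>

#define MS_PER_DAY 86400000.0

/* Hypothetical helper: true if a duration in milliseconds fits in a
 * 32-bit unsigned elapsed-time field. */
bool elapsed_fits_u32(uint64_t elapsed_ms)
{
    return elapsed_ms <= UINT32_MAX;
}

/* Longest duration a 32-bit millisecond field can hold, in days:
 * 4294967295 / 86400000, roughly 49.7. */
double max_elapsed_days(void)
{
    return (double)UINT32_MAX / MS_PER_DAY;
}
```

A 49-day flow (4,233,600,000 ms) passes the check, while a 60-day flow
(5,184,000,000 ms) does not.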

The on-disk format of a SiLK Flow record also uses the start time
and the duration.  Some of the IPv4 disk formats use 22 bits for the
duration, and flows whose durations are longer than 69.9 minutes
cannot be stored.  Those same disk formats store the start time in a
22-bit field as a millisecond offset from the time in the file's
header.  For a 64-bit start time and a 32-bit duration, make sure
you are using either an IPv6 file format (FT_RWIPV6ROUTING or
FT_RWIPV6) or the FT_RWGENERIC IPv4 file format.
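
To make the 22-bit limits concrete: 2^22 milliseconds is 4,194,304 ms,
or about 69.9 minutes.  The sketch below is a simplified model, not
SiLK's actual packing code.  The function name is hypothetical, and it
assumes the start-time offset is an unsigned value measured from the
hour in the file's header.  Under that assumption, a record filed by
its end time whose start time falls before the file's hour could not
be encoded in such a format.

```c
#include <stdbool.h>
#include <stdint.h>

#define PACKED_BITS  22
#define PACKED_LIMIT (UINT64_C(1) << PACKED_BITS)   /* 4194304 ms */

/* Hypothetical check: can this record be held by a format that keeps
 * both the start-time offset and the duration in 22-bit millisecond
 * fields?  Assumes the offset is unsigned, so a start time earlier
 * than the file's hour cannot be represented. */
bool fits_22bit_format(int64_t file_hour_ms, int64_t start_ms,
                       uint64_t elapsed_ms)
{
    if (start_ms < file_hour_ms) {
        return false;
    }
    uint64_t offset_ms = (uint64_t)(start_ms - file_hour_ms);
    return offset_ms < PACKED_LIMIT && elapsed_ms < PACKED_LIMIT;
}
```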

I hope that answers your question.

-Mark



On Wed, 13 May 2015 17:09:51 +0200, Stefan Gundlach wrote:

> Hello! Is it possible to let rwflowpack store the data based on the *end time* of the flow?
>
> As stated in the FAQ (https://tools.netsa.cert.org/silk/faq.html#rwfilter-time):
>
>   "note that the repository stores flow records by their start-time"
>
> Our routers (Juniper) report some rather long-running flows (> 2
> months), for example VPN sessions. To gather the recent parts of that
> traffic with rwfilter a lot of disk space has to be searched. This is
> not very efficient...
>
> Any suggestions?
>
> Stefan Gundlach 
> getit, Dortmund, Germany

