SysFlow Python API Reference

SysFlow Reader API

class sysflow.reader.FlattenedSFReader(filename, retEntities=False)

FlattenedSFReader

This class loads a raw sysflow file, and links all Entities (header, process, container, files) with the current flow or event in the file. As a result, the user does not have to manage this information. This class supports the python iterator design pattern. Example Usage:

from sysflow.reader import FlattenedSFReader
from sysflow.objtypes import ObjectTypes
import sysflow.utils as utils

reader = FlattenedSFReader(trace)  # trace is the path to a sysflow file
head = 20 # max number of records to print
# the tuple order follows the Iterator section of this class
for i, (objtype, header, pod, cont, pproc, proc, files, evt, flow) in enumerate(reader):
    exe = proc.exe if proc else ''
    pid = proc.oid.hpid if proc else ''
    evflow = evt or flow
    tid = evflow.tid if evflow else ''
    opFlags = utils.getOpFlagsStr(evflow.opFlags) if evflow else ''
    sTime = utils.getTimeStr(evflow.ts) if evflow else ''
    eTime = utils.getTimeStr(evflow.endTs) if flow else ''
    ret = evflow.ret if evt else ''
    res1 = ''
    if objtype == ObjectTypes.FILE_FLOW or objtype == ObjectTypes.FILE_EVT:
        res1 = files[0].path
    elif objtype == ObjectTypes.NET_FLOW:
        res1 = utils.getNetFlowStr(flow)
    numBReads = evflow.numRRecvBytes if flow else ''
    numBWrites = evflow.numWSendBytes if flow else ''
    res2 = files[1].path if files and files[1] else ''
    cont = cont.id if cont else ''
    print("|{0:30}|{1:9}|{2:26}|{3:26}|{4:30}|{5:8}|{6:8}|".format(exe, opFlags, sTime, eTime, res1, numBReads, numBWrites))
    if i + 1 >= head:  # stop once head records have been printed
        break
Parameters:
  • filename (str) – the name of the sysflow file to be read.

  • retEntities (bool) – If True, the reader also returns entity objects by themselves as they are seen in the sysflow file. For those records, all other objects in the tuple are set to None.

Iterator

Reader returns a tuple of objects in the following order:

objtype (sysflow.objtypes.ObjectTypes) The type of entity or flow returned.

header (sysflow.entity.SFHeader) The header entity of the file.

pod (sysflow.entity.Pod) The pod associated with the flow/evt, or None if no pod.

cont (sysflow.entity.Container) The container associated with the flow/evt, or None if no container.

pproc (sysflow.entity.Process) The parent process associated with the flow/evt.

proc (sysflow.entity.Process) The process associated with the flow/evt.

files (tuple of sysflow.entity.File) Any files associated with the flow/evt.

evt (sysflow.event.{ProcessEvent,FileEvent}) If the record is an event, it will be returned here. Otherwise this variable will be None. objtype will indicate the type of event.

flow (sysflow.flow.{NetworkFlow,FileFlow}) If the record is a flow, it will be returned here. Otherwise this variable will be None. objtype will indicate the type of flow.
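
As a rough illustration of consuming this tuple, the sketch below tallies records by object type. The helper name count_by_type is hypothetical and not part of the SysFlow API; it accepts any iterable that yields tuples in the order listed above, so a FlattenedSFReader should be usable in place of the stand-in list.

```python
from collections import Counter


def count_by_type(records):
    """Tally records by their first tuple element (objtype).

    `records` is any iterable yielding 9-tuples in the documented order:
    (objtype, header, pod, cont, pproc, proc, files, evt, flow).
    """
    counts = Counter()
    for objtype, _header, _pod, _cont, _pproc, _proc, _files, _evt, _flow in records:
        counts[objtype] += 1
    return counts


# Stand-in records for demonstration only; with the real library one would
# pass FlattenedSFReader('trace.sf') instead (the file name is a placeholder).
sample = [
    ("FILE_FLOW", None, None, None, None, None, None, None, object()),
    ("PROC_EVT", None, None, None, None, None, None, object(), None),
    ("FILE_FLOW", None, None, None, None, None, None, None, object()),
]
print(count_by_type(sample))
```

Unpacking all nine positions by name, even the unused ones, makes a mismatch between the tuple shape and the loop fail loudly rather than silently shifting fields.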

getProcess(oid)

Returns a Process Object given a process object id.

Parameters:

oid (sysflow.type.OID) – the object id of the Process Object requested

Return type:

sysflow.entity.Process

Returns:

the desired process object or None if no process object is available.
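
A possible use of getProcess is walking up a process tree. The sketch below is only an illustration: the helper process_ancestry is hypothetical, and it assumes each process object carries its parent's object id in an attribute named poid, which is not documented in this reference. A small stand-in reader is used so the example runs without a trace file.

```python
from types import SimpleNamespace


def process_ancestry(reader, proc, max_depth=16):
    """Collect executables from `proc` up through its ancestors.

    Assumes each process exposes `exe` and a parent object id `poid`
    (the `poid` attribute name is an assumption), and relies on
    reader.getProcess(oid) returning None when no process is available.
    """
    chain = []
    while proc is not None and len(chain) < max_depth:
        chain.append(proc.exe)
        parent_oid = getattr(proc, "poid", None)
        proc = reader.getProcess(parent_oid) if parent_oid is not None else None
    return chain


class FakeReader:
    """Minimal stand-in for a reader, used only to exercise the helper."""

    def __init__(self, table):
        self._table = table

    def getProcess(self, oid):
        return self._table.get(oid)


init = SimpleNamespace(exe="/sbin/init", oid="o1", poid=None)
bash = SimpleNamespace(exe="/bin/bash", oid="o2", poid="o1")
scp = SimpleNamespace(exe="/usr/bin/scp", oid="o3", poid="o2")
fake = FakeReader({"o1": init, "o2": bash, "o3": scp})
print(process_ancestry(fake, scp))
```

The max_depth bound is a defensive choice so that an unexpected cycle in parent links cannot loop forever.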

class sysflow.reader.NestedNamespace(**kwargs)
class sysflow.reader.SFReader(filename)

SFReader

This class loads a raw sysflow file, and returns each entity/flow one by one. It is the user’s responsibility to link the related objects together through the OID. This class supports the python iterator design pattern. Example Usage:

from sysflow.reader import SFReader

reader = SFReader("./sysflowfile.sf")
for name, sf in reader:
    if name == "sysflow.entity.SFHeader":
        pass  # do something with the header object
    elif name == "sysflow.entity.Container":
        pass  # do something with the container object
    elif name == "sysflow.entity.Process":
        pass  # do something with the Process object
    # ... handle the remaining object types as needed
Parameters:

filename (str) – the name of the sysflow file to be read.
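
Since SFReader leaves linking to the caller, one approach is to keep a lookup table of entities keyed by object id. The following is a minimal sketch under stated assumptions: index_processes is a hypothetical helper, and the key is built from oid.hpid and oid.createTS, where createTS is an assumed attribute name that does not appear in this reference. Stand-in objects replace real records so the example is self-contained.

```python
from types import SimpleNamespace


def index_processes(records):
    """Build a lookup table of Process entities keyed by object id.

    `records` yields (name, obj) pairs in the way SFReader is described
    to. The key combines oid.hpid and oid.createTS; `createTS` is an
    assumed attribute name.
    """
    table = {}
    for name, obj in records:
        if name == "sysflow.entity.Process":
            table[(obj.oid.hpid, obj.oid.createTS)] = obj
    return table


# Stand-in records; a real run would iterate over SFReader(...) instead.
records = [
    ("sysflow.entity.SFHeader", SimpleNamespace(version=1)),
    ("sysflow.entity.Process",
     SimpleNamespace(exe="/bin/bash", oid=SimpleNamespace(hpid=100, createTS=1))),
    ("sysflow.entity.Process",
     SimpleNamespace(exe="/usr/bin/scp", oid=SimpleNamespace(hpid=101, createTS=2))),
]
procs = index_processes(records)
print(procs[(101, 2)].exe)
```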

SysFlow Formatter API

class sysflow.formatter.SFFormatter(reader, defs=[])

SFFormatter

This class takes a FlattenedSFReader, and exports SysFlow as either JSON, CSV or Pretty Print. Example Usage:

reader = FlattenedSFReader(trace, False)
formatter = SFFormatter(reader)
fields=args.fields.split(',') if args.fields else None
if args.output == 'json':
    if args.file is not None:
        formatter.toJsonFile(args.file, fields=fields)
    else:
        formatter.toJsonStdOut(fields=fields)
elif args.output == 'csv' and args.file is not None:
    formatter.toCsvFile(args.file, fields=fields)
elif args.output == 'str':
    formatter.toStdOut(fields=fields)
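Parameters:
  • reader (sysflow.reader.FlattenedSFReader) – the reader for the sysflow file to be exported.

  • defs (list) – an optional list of definition file paths (empty by default).
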
applyFuncJson(func, fields=None, expr=None)

Enables a delegate function to be applied to each JSON record read.

Parameters:
  • func (function) – delegate function of the form func(str)

  • fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields

  • expr (str) – a sfql filter expression
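
To make the func(str) delegate form concrete, here is a small sketch of a delegate that parses each string it receives and collects the result. It assumes each call passes one JSON-encoded record; make_collector is a hypothetical helper, and the sample record with a proc.exe key is made up for the demonstration.

```python
import json


def make_collector():
    """Return (delegate, records).

    The delegate parses each JSON string it receives and appends the
    resulting object to `records`.
    """
    records = []

    def delegate(record_str):
        records.append(json.loads(record_str))

    return delegate, records


delegate, records = make_collector()
# With the real library, the delegate would be handed to the formatter:
#   formatter.applyFuncJson(delegate, fields=['proc.exe'])
# Here it is called directly with a made-up record.
delegate('{"proc.exe": "/usr/bin/scp"}')
print(records)
```

Returning the list alongside the delegate keeps state out of module globals, which makes it easy to run several collectors side by side.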

enableAllFields()

Enables all available fields to be added to the output by default.

enableK8sEventFields()

Enables fields related to k8s events to be added to the output by default.

enablePodFields()

Enables fields related to pods to be added to the output by default.

getFields()

Returns a list with available SysFlow fields and their descriptions.

toCsvFile(path, fields=None, header=True, expr=None)

Writes SysFlow to CSV file.

Parameters:
  • path (str) – the full path of the output file.

  • fields (list) – a list of the SysFlow fields to be exported in the CSV. See formatter.py for a list of fields

  • header (bool) – whether to write a header row to the CSV file.

  • expr (str) – a sfql filter expression

toDataframe(fields=None, expr=None)

Exports SysFlow records as a dataframe.

Parameters:
  • fields (list) – a list of the SysFlow fields to be exported in the dataframe. See formatter.py for a list of fields

  • expr (str) – a sfql filter expression

toJson(fields=None, flat=False, expr=None)

Writes SysFlow as JSON object.

Parameters:
  • fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields

  • expr (str) – a sfql filter expression

  • flat (bool) – specifies whether the JSON output should be flattened
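
This reference does not spell out what a flattened record looks like. One common interpretation, shown here purely as an assumption, is that nested objects are collapsed into a single level with dotted keys, in the spirit of field names such as proc.exe. The flatten helper below is hypothetical and is not the formatter's code.

```python
def flatten(record, prefix=""):
    """Collapse nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


nested = {"type": "FF", "proc": {"exe": "/usr/bin/scp", "pid": 42}}
print(flatten(nested))
```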

toJsonFile(path, fields=None, flat=False, expr=None)

Writes SysFlow to JSON file.

Parameters:
  • path (str) – the full path of the output file.

  • fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields

  • expr (str) – a sfql filter expression

  • flat (bool) – specifies whether the JSON output should be flattened

toJsonStdOut(fields=None, flat=False, expr=None)

Writes SysFlow as JSON to stdout.

Parameters:
  • fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields

  • expr (str) – a sfql filter expression

  • flat (bool) – specifies whether the JSON output should be flattened

toStdOut(fields=['ts_uts', 'type', 'proc.exe', 'proc.args', 'pproc.pid', 'proc.pid', 'proc.tid', 'opflags', 'res', 'flow.rbytes', 'flow.wbytes', 'container.id'], pretty_headers=True, showindex=True, expr=None)

Writes SysFlow as a tabular pretty print form to stdout.

Parameters:
  • fields (list) – a list of the SysFlow fields to be printed. See formatter.py for a list of fields

  • pretty_headers (bool) – print table headers in pretty format.

  • showindex (bool) – show record number.

  • expr (str) – a sfql filter expression

SysFlow Object Types

class sysflow.objtypes.ObjectTypes(value)

ObjectTypes

Enumeration representing each of the object types:

HEADER = 0, CONT = 1, PROC = 2, FILE = 3, PROC_EVT = 4, NET_FLOW = 5, FILE_FLOW = 6, FILE_EVT = 7, PROC_FLOW = 8, POD = 9, K8S_EVT = 10
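
The enumeration can be used to tell flows, events and entities apart. The sketch below uses a local stand-in enum (ObjectTypesSketch, a hypothetical name) that copies the values listed above so it runs without the library; the grouping by name suffix is an illustration, not a classification provided by the API.

```python
from enum import Enum


class ObjectTypesSketch(Enum):
    """Local stand-in that mirrors the values listed above."""
    HEADER = 0
    CONT = 1
    PROC = 2
    FILE = 3
    PROC_EVT = 4
    NET_FLOW = 5
    FILE_FLOW = 6
    FILE_EVT = 7
    PROC_FLOW = 8
    POD = 9
    K8S_EVT = 10


def kind(objtype):
    """Classify a member as 'flow', 'event' or 'entity' from its name suffix."""
    if objtype.name.endswith("_FLOW"):
        return "flow"
    if objtype.name.endswith("_EVT"):
        return "event"
    return "entity"


print(ObjectTypesSketch(6).name, kind(ObjectTypesSketch(6)))
```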

SysFlow Utils API

sysflow.utils.getEnvStr(env)

Converts an array of environment variables into a string representation.

Parameters:

env (str[]) – An array of environment variables.

Return type:

str

Returns:

A concatenated string representation of the environment variables array.

sysflow.utils.getIpIntStr(ipInt)

Converts an IP address in host order integer to a string representation.

Parameters:

ipInt (int) – an IP address integer

Return type:

str

Returns:

A string representation of the IP address

sysflow.utils.getNetFlowStr(nf)

Converts a NetworkFlow into a string representation.

Parameters:

nf (sysflow.schema_classes.SchemaClasses.sysflow.flow.NetworkFlowClass) – a NetworkFlow object.

Return type:

str

Returns:

A string representation of the NetworkFlow in form (sip:sport-dip:dport).

sysflow.utils.getOpFlags(opFlags)

Converts a sysflow operations flag bitmap into a set representation.

Parameters:

opFlags (int) – An operations bitmap from a flow or event.

Return type:

set

Returns:

A set representation of the operations bitmap.
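
The idea of turning a flag bitmap into a set can be sketched in a few lines. The flag names and bit positions in EXAMPLE_FLAGS are hypothetical and chosen only for illustration; they are not taken from the SysFlow specification, and decode_bitmap is not the library's implementation.

```python
# Hypothetical flag table: names and bit positions are illustrative only.
EXAMPLE_FLAGS = {"CLONE": 1 << 0, "EXEC": 1 << 1, "EXIT": 1 << 2}


def decode_bitmap(bitmap, table=EXAMPLE_FLAGS):
    """Return the set of flag names whose bit is set in `bitmap`."""
    return {name for name, bit in table.items() if bitmap & bit}


print(sorted(decode_bitmap(0b101)))
```

A string form, like the one getOpFlagsStr is described as returning, could then be derived by joining the set members in a fixed order.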

sysflow.utils.getOpFlagsStr(opFlags)

Converts a sysflow operations flag bitmap into a string representation.

Parameters:

opFlags (int) – An operations bitmap from a flow or event.

Return type:

str

Returns:

A string representation of the operations bitmap.

sysflow.utils.getOpStr(opFlags)

Converts a sysflow operations bitmap into a string representation.

Parameters:

opFlags (int) – An operations bitmap from a flow or event.

Return type:

str

Returns:

A string representation of the operations bitmap.

sysflow.utils.getOpenFlags(openFlags)

Converts a sysflow open modes flag bitmap into a set representation.

Parameters:

openFlags (int) – An open modes bitmap from a flow or event.

Return type:

set

Returns:

A set representation of the open modes bitmap.

sysflow.utils.getTimeStr(ts)

Converts a nanosecond timestamp into a string representation.

Parameters:

ts (int) – A timestamp in nanoseconds since the epoch.

Return type:

str

Returns:

A string representation of the timestamp in %m/%d/%YT%H:%M:%S.%f format.

sysflow.utils.getTimeStrIso8601(ts)

Converts a nanosecond timestamp into a string representation in UTC time zone.

Parameters:

ts (int) – A timestamp in nanoseconds since the epoch.

Return type:

str

Returns:

A string representation of the timestamp in ISO 8601 format.
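
For readers who want to see what the two output formats look like, the sketch below formats a nanosecond epoch value with the standard library only. It is not the library's implementation: format_ns_timestamp and iso_ns_timestamp are hypothetical names, and UTC is used for both so the result is deterministic, whereas the time zone used by getTimeStr is not stated in this reference.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def _to_datetime(ts_ns):
    """Convert nanoseconds since the epoch to a UTC datetime (microsecond precision)."""
    secs, nanos = divmod(ts_ns, NANOS_PER_SECOND)
    return datetime.fromtimestamp(secs, tz=timezone.utc).replace(microsecond=nanos // 1000)


def format_ns_timestamp(ts_ns, fmt="%m/%d/%YT%H:%M:%S.%f"):
    """Format a nanosecond timestamp using the pattern documented for getTimeStr."""
    return _to_datetime(ts_ns).strftime(fmt)


def iso_ns_timestamp(ts_ns):
    """Format a nanosecond timestamp as an ISO 8601 string in UTC."""
    return _to_datetime(ts_ns).isoformat()


ts = 1_600_000_000_123_456_789
print(format_ns_timestamp(ts))  # 09/13/2020T12:26:40.123456
print(iso_ns_timestamp(ts))     # 2020-09-13T12:26:40.123456+00:00
```

Splitting seconds and nanoseconds with integer arithmetic avoids the rounding that a floating-point division by 1e9 would introduce; the last three nanosecond digits are dropped because datetime only stores microseconds.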

SysFlow Graphlet API

class sysflow.graphlet.Edge(n1, n2, label)

Edge

This class represents a graph edge, and acts as a super class for specific edges.

Parameters:

edge (sysflow.Edge) – an abstract edge object.

class sysflow.graphlet.EvtEdge(n1, n2, label)

EvtEdge

This class represents a graph event edge. It is used for sysflow event objects and subclasses Edge.

Parameters:

evtedge (sysflow.EvtEdge) – an edge object representing a sysflow evt.

class sysflow.graphlet.FileFlowNode(oid, exe, args)

FileFlowNode

This class represents a fileflow node.

Parameters:

ff (sysflow.FileFlow) – a fileflow node object.

class sysflow.graphlet.FlowEdge(n1, n2, label)

FlowEdge

This class represents a graph flow edge. It is used for sysflow flow objects and subclasses Edge.

Parameters:

flowedge (sysflow.FlowEdge) – an edge object representing a sysflow flow.

class sysflow.graphlet.Graphlet(path, expr=None, defs=[])

Graphlet

This class takes a path pointing to a sysflow trace or a directory containing sysflow traces.

Example Usage:

# basic usage
g1 = Graphlet('data/')
g1.view()

# filtering and enrichment with policies
ioc1 = 'proc.exe = /usr/bin/scp'
g1 = Graphlet('data/', ioc1, ['policies/ttps.yaml'])
g1.view()
Parameters:

graphlet (sysflow.Graphlet) – A compact provenance graph representation based on sysflow traces.

associatedMitigations(oid=None)

Returns a dataframe containing the set of MITRE mitigations associated with TTPs annotated in the graph.

Parameters:

oid (object ID string) – a node ID filter.

bt(cond, prune=True, label=None)

Performs a backward traversal on the graph from nodes matching a condition.

Example Usage:

def passwd(df):
    return len(df[(df['file.path'].str.contains('passwd'))]) > 0

cond = lambda n: passwd(n.df())
g.bt(cond, prune=True, label='discovery').view()

Parameters:
  • cond (a lambda predicate that receives a node object as argument and returns True or False.) – a lambda describing a predicate over node properties.

  • prune (boolean) – if true, nodes outside the dominance paths of matched nodes are pruned.

  • label (string) – if set, add a label to nodes in the dominance paths of matched nodes.
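
The general idea of a backward traversal with pruning can be shown on a plain edge list. The sketch below is not Graphlet's implementation and uses simple ancestor reachability rather than whatever notion of dominance paths the library applies; backward_traversal and the toy node names are hypothetical.

```python
def backward_traversal(edges, cond):
    """Return every node that can reach a node matching `cond`,
    including the matching nodes themselves.

    `edges` is an iterable of (src, dst) pairs.
    """
    parents = {}
    nodes = set()
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
        nodes.update((src, dst))
    # Start from the matching nodes and walk edges in reverse.
    stack = [n for n in nodes if cond(n)]
    seen = set(stack)
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


edges = [("init", "bash"), ("bash", "scp"), ("bash", "cat"), ("cat", "/etc/passwd")]
kept = backward_traversal(edges, lambda n: "passwd" in n)
print(sorted(kept))
```

In this toy graph, pruning would drop scp because it has no path to the matched node, which is the effect the prune flag is described as having.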

compare(other, withoid=False, peek=True, peeksize=3, flows=True, ttps=False)

Compares the graph to another graph (using a simple graph difference), returning a graph slice.

Parameters:
  • other (Graphlet) – the other graphlet to compare against.

  • withoid (boolean) – indicates whether to show the node ID.

  • peek (boolean) – indicates whether to show details about the records associated with the nodes.

  • peeksize (integer) – the number of node records to show.

  • flows (boolean) – indicates whether to show flow nodes.

  • ttps (boolean) – indicates whether to show tags.
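
A simple graph difference can be pictured as a set difference over edges. The sketch below illustrates that idea on plain tuples; graph_difference is a hypothetical helper, and the exact difference computed by compare may take more node and edge attributes into account.

```python
def graph_difference(a_edges, b_edges):
    """Return (nodes, edges) that appear in graph A's edges but not in graph B's."""
    only_edges = set(a_edges) - set(b_edges)
    only_nodes = {node for edge in only_edges for node in edge}
    return only_nodes, only_edges


baseline = [("init", "bash"), ("bash", "ls")]
observed = [("init", "bash"), ("bash", "ls"), ("bash", "scp")]
nodes, diff = graph_difference(observed, baseline)
print(sorted(nodes), sorted(diff))
```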

countermeasures(oid=None)

Returns a dataframe containing the set of MITRE D3FEND defenses associated with TTPs annotated in the graph.

Parameters:

oid (object ID string) – a node ID filter.

data(oid=None)

Returns a dataframe containing the underlying data (sysflow records) of the graph.

Parameters:

oid (object ID string) – a node ID filter.

df(oid=None)

Returns a dataframe containing a summary of the graph node IDs and process metadata associated with them.

Parameters:

oid (object ID string) – a node ID filter.

findPaths(source, sink, prune=True, label=None)

Finds paths from source to sink nodes matching conditions.

Example Usage:

def scp(df):
    return len(df[(df['proc.exe'].str.contains('scp'))]) > 0

source = lambda n: scp(n.df())

def passwd(df):
    return len(df[(df['file.path'].str.contains('passwd'))]) > 0

sink = lambda n: passwd(n.df())
g.findPaths(source, sink, prune=True, label='exfil').view()

Parameters:
  • source (a lambda predicate that receives a node object as argument and returns True or False.) – a lambda describing a predicate over node properties.

  • sink (a lambda predicate that receives a node object as argument and returns True or False.) – a lambda describing a predicate over node properties.

  • prune (boolean) – if true, nodes outside the paths connecting matched nodes are pruned.

  • label (string) – if set, add a label to nodes in the paths connecting matched nodes.
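
One way to think about source-to-sink path finding is as the intersection of two reachability sets: nodes reachable forward from a source and nodes from which a sink is reachable. The sketch below applies that idea to a plain edge list. It is an illustration under that assumption, with hypothetical helper names, and is not taken from the Graphlet code.

```python
def reachable(adjacency, starts):
    """Return all nodes reachable from `starts` by following `adjacency`."""
    seen = set(starts)
    stack = list(starts)
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def nodes_on_paths(edges, source, sink):
    """Return nodes lying on some path from a source match to a sink match."""
    children, parents, nodes = {}, {}, set()
    for src, dst in edges:
        children.setdefault(src, set()).add(dst)
        parents.setdefault(dst, set()).add(src)
        nodes.update((src, dst))
    forward = reachable(children, [n for n in nodes if source(n)])
    backward = reachable(parents, [n for n in nodes if sink(n)])
    return forward & backward


edges = [("init", "bash"), ("bash", "scp"), ("scp", "/etc/passwd"), ("bash", "cat")]
on_path = nodes_on_paths(edges, lambda n: n == "scp", lambda n: "passwd" in n)
print(sorted(on_path))
```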

ft(cond, prune=True, label=None)

Performs a forward traversal on the graph from nodes matching a condition.

Example Usage:

def scp(df):
    return len(df[(df['proc.exe'].str.contains('scp'))]) > 0

cond = lambda n: scp(n.df())
g.ft(cond, prune=True, label='remotecopy').view()

Parameters:
  • cond (a lambda predicate that receives a node object as argument and returns True or False.) – a lambda describing a predicate over node properties.

  • prune (boolean) – if true, nodes outside the dominated paths of matched nodes are pruned.

  • label (string) – if set, add a label to nodes in the dominated paths of matched nodes.

intersection(other, withoid=False, peek=True, peeksize=3, flows=True, ttps=False)

Computes the intersection of a graph with another graph, returning the graph corresponding to the intersection of the two graphs.

Parameters:

other (Graphlet) – the other graphlet to intersect with.

mitigations(oid=None, details=False)

Returns a dataframe containing the summary set of MITRE mitigations associated with TTPs annotated in the graph.

Parameters:

oid (object ID string) – a node ID filter.

tags(oid=None)

Returns a dataframe containing the set of (enrichment) tags in the graph.

Parameters:

oid (object ID string) – a node ID filter.

ttps(oid=None, details=False)

Returns a dataframe containing the set of MITRE TTP tags in the graph (e.g., as enriched by the ttps.yaml policy provided with the SysFlow processor).

Parameters:
  • oid (object ID string) – a node ID filter.

  • details (boolean) – indicates whether to include complete TTP metadata in the dataframe.

view(withoid=False, peek=True, peeksize=3, flows=True, ttps=False)

Visualizes the graph in dot format.

Parameters:
  • withoid (boolean) – indicates whether to show the node ID.

  • peek (boolean) – indicates whether to show details about the records associated with the nodes.

  • peeksize (integer) – the number of underlying sysflow records to show in the node.

  • flows (boolean) – indicates whether to show flow nodes.

  • ttps (boolean) – indicates whether to show tags.

class sysflow.graphlet.NetFlowNode(oid, exe, args)

NetFlowNode

This class represents a netflow node.

Parameters:

nf (sysflow.NetFlow) – a netflow node object.

class sysflow.graphlet.Node(oid)

Node

This class represents a graph node, and acts as a super class for specific nodes.

Parameters:

node (sysflow.Node) – an abstract node object.

class sysflow.graphlet.ProcessNode(oid, exe, args, uid, user, gid, group, tty)

ProcessNode

This class represents a process node.

Parameters:

proc (sysflow.ProcessNode) – a process node object.

SysFlow QL API

class sysflow.sfql.SfqlInterpreter(query: str | None = None, paths: list = [], inputs: list = [])

SfqlInterpreter

This class takes a sfql expression (and optionally a file containing a library of lists and macros) and produces a predicate expression that can be matched against sysflow records. Example Usage:

# using 'filter' to filter the input stream
reader = FlattenedSFReader('trace.sf')
interpreter = SfqlInterpreter()
query = '- sfql: type = FF'
for r in interpreter.filter(reader, query):
    print(r)
Parameters:

interpreter (sysflow.SfqlInterpreter) – An interpreter for executing sfql expressions.

compile(query: str | None = None, paths: list = [], inputs: list = [])

Compile sfql into a predicate expression to match sysflow records.

Parameters:
  • query (str) – sfql query.

  • paths (list) – a list of paths to files containing sfql list and macro definitions.

  • inputs (list) – a list of input streams from where to read sfql list and macro definitions.

enrich(t: T)

Process flattened sysflow record t based on policies.

evaluate(t: T, query: str | None = None, paths: list = []) -> bool

Evaluate sfql expression against flattened sysflow record t.

Parameters:
  • t – individual flattened sysflow record

  • query (str) – sfql query.

  • paths (list) – a list of paths to files containing sfql list and macro definitions.

filter(reader, query: str | None = None, paths: list = [])

Filter iterable reader according to sfql expression.

Parameters:
  • reader (FlattenedSFReader) – sysflow reader

  • query (str) – sfql query.

  • paths (list) – a list of paths to files containing sfql list and macro definitions.
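
One way to picture how evaluate and filter relate, offered as an assumption rather than a description of the implementation, is that filtering yields exactly the records for which evaluation returns True. The sketch below expresses that with a generator; filter_with_evaluate and FakeInterpreter are hypothetical, and the fake treats the query as a plain substring instead of parsing sfql.

```python
def filter_with_evaluate(interpreter, records, query):
    """Yield each record for which interpreter.evaluate(record, query) is true."""
    for record in records:
        if interpreter.evaluate(record, query):
            yield record


class FakeInterpreter:
    """Stand-in interpreter: treats the query as a substring to look for."""

    def evaluate(self, t, query=None, paths=()):
        return query is not None and query in t


records = ["type=FF path=/etc/passwd", "type=NF", "type=FF path=/tmp/x"]
matches = list(filter_with_evaluate(FakeInterpreter(), records, "type=FF"))
print(matches)
```

Because the helper is a generator, records are tested lazily as the caller consumes them, which suits large traces.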

getAttributes()

Return list of attributes supported by sfql.

class sysflow.sfql.SfqlMapper