SysFlow Python API Reference¶
SysFlow Reader API¶
- class sysflow.reader.FlattenedSFReader(filename, retEntities=False)¶
FlattenedSFReader
This class loads a raw sysflow file, and links all Entities (header, process, container, files) with the current flow or event in the file. As a result, the user does not have to manage this information. This class supports the python iterator design pattern. Example Usage:
    from sysflow.reader import FlattenedSFReader
    from sysflow.objtypes import ObjectTypes
    import sysflow.utils as utils

    reader = FlattenedSFReader(trace)  # trace: path to a sysflow file
    head = 20  # max number of records to print
    # the tuple is unpacked in the order listed under "Iterator" below
    for i, (objtype, header, pod, cont, pproc, proc, files, evt, flow) in enumerate(reader):
        exe = proc.exe if proc else ''
        pid = proc.oid.hpid if proc else ''
        evflow = evt or flow
        tid = evflow.tid if evflow else ''
        opFlags = utils.getOpFlagsStr(evflow.opFlags) if evflow else ''
        sTime = utils.getTimeStr(evflow.ts) if evflow else ''
        eTime = utils.getTimeStr(evflow.endTs) if flow else ''
        ret = evflow.ret if evt else ''
        res1 = ''
        if objtype == ObjectTypes.FILE_FLOW or objtype == ObjectTypes.FILE_EVT:
            res1 = files[0].path
        elif objtype == ObjectTypes.NET_FLOW:
            res1 = utils.getNetFlowStr(flow)
        numBReads = evflow.numRRecvBytes if flow else ''
        numBWrites = evflow.numWSendBytes if flow else ''
        res2 = files[1].path if files and files[1] else ''
        cont = cont.id if cont else ''
        print("|{0:30}|{1:9}|{2:26}|{3:26}|{4:30}|{5:8}|{6:8}|".format(exe, opFlags, sTime, eTime, res1, numBReads, numBWrites))
        if i == head:
            break
- Parameters:
filename (str) – the name of the sysflow file to be read.
retEntities (bool) – If True, the reader also returns entity objects on their own as they are seen in the sysflow file. In this case, all other objects in the returned tuple are set to None.
- Iterator
Reader returns a tuple of objects in the following order:
objtype (sysflow.objtypes.ObjectTypes) – The type of entity or flow returned.
header (sysflow.entity.SFHeader) – The header entity of the file.
pod (sysflow.entity.Pod) – The pod associated with the flow/evt, or None if no pod.
cont (sysflow.entity.Container) – The container associated with the flow/evt, or None if no container.
pproc (sysflow.entity.Process) – The parent process associated with the flow/evt.
proc (sysflow.entity.Process) – The process associated with the flow/evt.
files (tuple of sysflow.entity.File) – Any files associated with the flow/evt.
evt (sysflow.event.{ProcessEvent,FileEvent}) – If the record is an event, it will be returned here. Otherwise this variable will be None. objtype will indicate the type of event.
flow (sysflow.flow.{NetworkFlow,FileFlow}) – If the record is a flow, it will be returned here. Otherwise this variable will be None. objtype will indicate the type of flow.
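As a minimal sketch of consuming the tuple described above, the helper below unpacks one record in the documented order and picks whichever of evt or flow is set. The `summarize` helper and the `SimpleNamespace` stand-ins are hypothetical and exist only so the snippet runs without a trace file; with a real reader, each item yielded by FlattenedSFReader would be passed in instead.

```python
from types import SimpleNamespace


def summarize(record):
    """Unpack one flattened record (documented 9-item order) into a small dict."""
    objtype, header, pod, cont, pproc, proc, files, evt, flow = record
    evflow = evt or flow  # pick whichever of the two is present
    return {
        "objtype": objtype,
        "exe": proc.exe if proc else "",
        "container": cont.id if cont else "",
        "tid": evflow.tid if evflow else "",
        "kind": "event" if evt else "flow" if flow else "entity",
    }


# Hypothetical stand-in objects; real records come from iterating the reader.
proc = SimpleNamespace(exe="/usr/bin/scp")
flow = SimpleNamespace(tid=4242)
record = ("FILE_FLOW", None, None, None, None, proc, (None, None), None, flow)
summary = summarize(record)
print(summary)
```

Guarding every attribute access with a None check keeps the same helper usable when retEntities is enabled and most tuple slots are None.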
- getProcess(oid)¶
Returns a Process Object given a process object id.
- Parameters:
oid (sysflow.type.OID) – the object id of the Process Object requested
- Return type:
sysflow.entity.Process
- Returns:
the desired process object or None if no process object is available.
- class sysflow.reader.NestedNamespace(**kwargs)¶
- class sysflow.reader.SFReader(filename)¶
SFReader
This class loads a raw sysflow file, and returns each entity/flow one by one. It is the user’s responsibility to link the related objects together through the OID. This class supports the python iterator design pattern. Example Usage:
    from sysflow.reader import SFReader

    reader = SFReader("./sysflowfile.sf")
    for name, sf in reader:
        if name == "sysflow.entity.SFHeader":
            pass  # do something with the header object
        elif name == "sysflow.entity.Container":
            pass  # do something with the container object
        elif name == "sysflow.entity.Process":
            pass  # do something with the Process object
        # ....
- Parameters:
filename (str) – the name of the sysflow file to be read.
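Because SFReader leaves the linking of related objects to the caller, one possible approach is to index processes by their OID as they are read. The sketch below assumes that an OID exposes `hpid` and `createTS` fields and that a process carries its parent's OID as `poid`; only `hpid` appears elsewhere in this reference, so the other names are assumptions. The `SimpleNamespace` records are hypothetical stand-ins for what the reader would yield.

```python
from types import SimpleNamespace


def oid_key(oid):
    """Turn an OID-like object into a hashable key (field names are assumed)."""
    return (oid.hpid, oid.createTS)


def index_processes(records):
    """Build an OID -> process table from (name, object) pairs."""
    table = {}
    for name, obj in records:
        if name == "sysflow.entity.Process":
            table[oid_key(obj.oid)] = obj
    return table


def parent_of(proc, table):
    """Look up the parent process, or return None if it is unknown."""
    poid = getattr(proc, "poid", None)
    return table.get(oid_key(poid)) if poid else None


# Hypothetical stand-ins for what an SFReader would yield.
init_oid = SimpleNamespace(hpid=1, createTS=100)
bash_oid = SimpleNamespace(hpid=77, createTS=200)
records = [
    ("sysflow.entity.SFHeader", SimpleNamespace(version=1)),
    ("sysflow.entity.Process", SimpleNamespace(oid=init_oid, poid=None, exe="/sbin/init")),
    ("sysflow.entity.Process", SimpleNamespace(oid=bash_oid, poid=init_oid, exe="/bin/bash")),
]
table = index_processes(records)
parent = parent_of(table[oid_key(bash_oid)], table)
print(parent.exe)
```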
SysFlow Formatter API¶
- class sysflow.formatter.SFFormatter(reader, defs=[])¶
SFFormatter
This class takes a FlattenedSFReader and exports SysFlow as JSON, CSV, or pretty-printed text. Example Usage:
    from sysflow.reader import FlattenedSFReader
    from sysflow.formatter import SFFormatter

    # trace: path to a sysflow file; args: parsed command-line arguments
    reader = FlattenedSFReader(trace, False)
    formatter = SFFormatter(reader)
    fields = args.fields.split(',') if args.fields else None
    if args.output == 'json':
        if args.file is not None:
            formatter.toJsonFile(args.file, fields=fields)
        else:
            formatter.toJsonStdOut(fields=fields)
    elif args.output == 'csv' and args.file is not None:
        formatter.toCsvFile(args.file, fields=fields)
    elif args.output == 'str':
        formatter.toStdOut(fields=fields)
- Parameters:
reader (sysflow.reader.FlattenedSFReader) – A reader representing the sysflow file being read.
defs (list) – A list of paths to filter definitions.
- applyFuncJson(func, fields=None, expr=None)¶
Enables a delegate function to be applied to each JSON record read.
- Parameters:
func (function) – delegate function of the form func(str)
fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields
expr (str) – a sfql filter expression
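A delegate of the form func(str) might look like the following minimal sketch, which parses each JSON record and collects one field. The records here are hand-written stand-ins, and the flat `proc.exe` key is an assumption about the record layout rather than something this reference specifies.

```python
import json

collected = []


def collect_exe(record_json):
    """Delegate of the form func(str): parse one JSON record and keep a field."""
    record = json.loads(record_json)
    collected.append(record.get("proc.exe"))


# Stand-in driver: feeds hand-written JSON strings to the delegate.
sample_records = [
    json.dumps({"type": "FF", "proc.exe": "/usr/bin/scp"}),
    json.dumps({"type": "FF", "proc.exe": "/bin/bash"}),
]
for rec in sample_records:
    collect_exe(rec)
print(collected)
```

With a real formatter, the same delegate would presumably be passed as the first argument, for example `formatter.applyFuncJson(collect_exe, fields=['type', 'proc.exe'])`.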
- enableAllFields()¶
Enables all available fields to be added to the output by default.
- enableK8sEventFields()¶
Enables fields related to k8s events to be added to the output by default.
- enablePodFields()¶
Enables fields related to pods to be added to the output by default.
- getFields()¶
Returns a list with available SysFlow fields and their descriptions.
- toCsvFile(path, fields=None, header=True, expr=None)¶
Writes SysFlow to CSV file.
- Parameters:
path (str) – the full path of the output file.
fields (list) – a list of the SysFlow fields to be exported in the CSV. See formatter.py for a list of fields
expr (str) – a sfql filter expression
- toDataframe(fields=None, expr=None)¶
Exports SysFlow records as a dataframe.
- Parameters:
fields (list) – a list of the SysFlow fields to be exported in the dataframe. See formatter.py for a list of fields
expr (str) – a sfql filter expression
- toJson(fields=None, flat=False, expr=None)¶
Writes SysFlow as JSON object.
- Parameters:
fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields
expr (str) – a sfql filter expression
flat (bool) – specifies whether the JSON output should be flattened
- toJsonFile(path, fields=None, flat=False, expr=None)¶
Writes SysFlow to JSON file.
- Parameters:
path (str) – the full path of the output file.
fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields
expr (str) – a sfql filter expression
flat (bool) – specifies whether the JSON output should be flattened
- toJsonStdOut(fields=None, flat=False, expr=None)¶
Writes SysFlow as JSON to stdout.
- Parameters:
fields (list) – a list of the SysFlow fields to be exported in JSON. See formatter.py for a list of fields
expr (str) – a sfql filter expression
flat (bool) – specifies whether the JSON output should be flattened
- toStdOut(fields=['ts_uts', 'type', 'proc.exe', 'proc.args', 'pproc.pid', 'proc.pid', 'proc.tid', 'opflags', 'res', 'flow.rbytes', 'flow.wbytes', 'container.id'], pretty_headers=True, showindex=True, expr=None)¶
Writes SysFlow as a tabular pretty print form to stdout.
- Parameters:
fields (list) – a list of the SysFlow fields to be printed. See formatter.py for a list of fields
pretty_headers (bool) – print table headers in pretty format.
showindex (bool) – show record number.
expr (str) – a sfql filter expression
SysFlow Object Types¶
- class sysflow.objtypes.ObjectTypes(value)¶
ObjectTypes
- Enumeration representing each of the object types:
HEADER = 0, CONT = 1, PROC = 2, FILE = 3, PROC_EVT = 4, NET_FLOW = 5, FILE_FLOW = 6, FILE_EVT = 7, PROC_FLOW = 8, POD = 9, K8S_EVT = 10
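To illustrate how these values can drive dispatch, the sketch below mirrors the enumeration locally and groups the members into entities, events, and flows. `ObjType` and `category` are hypothetical names for this example; the grouping is inferred from the member names and is not part of the library.

```python
from enum import IntEnum


class ObjType(IntEnum):
    """Local mirror of the values listed above, for illustration only."""
    HEADER = 0
    CONT = 1
    PROC = 2
    FILE = 3
    PROC_EVT = 4
    NET_FLOW = 5
    FILE_FLOW = 6
    FILE_EVT = 7
    PROC_FLOW = 8
    POD = 9
    K8S_EVT = 10


FLOW_TYPES = {ObjType.NET_FLOW, ObjType.FILE_FLOW, ObjType.PROC_FLOW}
EVENT_TYPES = {ObjType.PROC_EVT, ObjType.FILE_EVT, ObjType.K8S_EVT}


def category(objtype):
    """Classify an object type (member or raw integer) by record kind."""
    objtype = ObjType(objtype)
    if objtype in FLOW_TYPES:
        return "flow"
    if objtype in EVENT_TYPES:
        return "event"
    return "entity"


print(category(ObjType.NET_FLOW), category(7), category(ObjType.POD))
```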
SysFlow Utils API¶
- sysflow.utils.getEnvStr(env)¶
Converts an array of environment variables into a string representation.
- Parameters:
env (str[]) – An array of environment variables.
- Return type:
str
- Returns:
A concatenated string representation of the environment variables array.
- sysflow.utils.getIpIntStr(ipInt)¶
Converts an IP address given as a host-order integer into its string representation.
- Parameters:
ipInt – an IP address integer
- Return type:
str
- Returns:
A string representation of the IP address
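The conversion can be sketched with the standard library alone. This is not the library's implementation; `ip_int_to_str` is a hypothetical helper, and the assumption that host order means little-endian holds only on machines such as x86.

```python
import ipaddress


def ip_int_to_str(ip_int, byteorder="little"):
    """Render a 32-bit integer as a dotted-quad IPv4 string.

    byteorder states how the four address octets are laid out in the integer;
    "little" corresponds to host order on x86-style machines (an assumption).
    """
    octets = ip_int.to_bytes(4, byteorder)
    # IPv4Address accepts 4 packed bytes, most significant octet first.
    return str(ipaddress.IPv4Address(octets))


print(ip_int_to_str(0x0100007F))          # little-endian layout
print(ip_int_to_str(0x7F000001, "big"))   # network (big-endian) layout
```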
- sysflow.utils.getNetFlowStr(nf)¶
Converts a NetworkFlow into a string representation.
- Parameters:
nf (sysflow.schema_classes.SchemaClasses.sysflow.flow.NetworkFlowClass) – a NetworkFlow object.
- Return type:
str
- Returns:
A string representation of the NetworkFlow in form (sip:sport-dip:dport).
- sysflow.utils.getOpFlags(opFlags)¶
Converts a sysflow operations flag bitmap into a set representation.
- Parameters:
opFlags (int) – An operations bitmap from a flow or event.
- Return type:
set
- Returns:
A set representation of the operations bitmap.
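Decoding a bitmap into a set amounts to testing each known bit. The sketch below uses a hypothetical flag table, since the real bit assignments are defined by the SysFlow library and are not listed in this reference.

```python
# Hypothetical flag table; the real bit assignments live in the SysFlow library.
OP_FLAGS = {
    "CLONE": 1 << 0,
    "EXEC": 1 << 1,
    "EXIT": 1 << 2,
    "OPEN": 1 << 3,
    "READ": 1 << 4,
    "WRITE": 1 << 5,
}


def op_flags_to_set(bitmap):
    """Return the names of all flags whose bit is set in the bitmap."""
    return {name for name, bit in OP_FLAGS.items() if bitmap & bit}


def op_flags_to_str(bitmap):
    """Return the set flags as a sorted, comma-separated string."""
    return ",".join(sorted(op_flags_to_set(bitmap)))


print(op_flags_to_set(0b010010))
print(op_flags_to_str(0b010010))
```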
- sysflow.utils.getOpFlagsStr(opFlags)¶
Converts a sysflow operations flag bitmap into a string representation.
- Parameters:
opFlags (int) – An operations bitmap from a flow or event.
- Return type:
str
- Returns:
A string representation of the operations bitmap.
- sysflow.utils.getOpStr(opFlags)¶
Converts sysflow operations into a string representation.
- Parameters:
opFlags (int) – An operations bitmap from a flow or event.
- Return type:
str
- Returns:
A string representation of the operations bitmap.
- sysflow.utils.getOpenFlags(openFlags)¶
Converts a sysflow open modes flag bitmap into a set representation.
- Parameters:
openFlags (int) – An open modes bitmap from a flow or event.
- Return type:
set
- Returns:
A set representation of the open modes bitmap.
- sysflow.utils.getTimeStr(ts)¶
Converts a nanosecond timestamp into a string representation.
- Parameters:
ts (int) – A nanosecond epoch.
- Return type:
str
- Returns:
A string representation of the timestamp in %m/%d/%YT%H:%M:%S.%f format.
- sysflow.utils.getTimeStrIso8601(ts)¶
Converts a nanosecond timestamp into a string representation in UTC time zone.
- Parameters:
ts (int) – A nanosecond epoch.
- Return type:
str
- Returns:
A string representation of the timestamp in ISO 8601 format.
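The two timestamp formats described above can be reproduced with the standard library, as in this minimal sketch. The helper names are hypothetical, and both helpers use UTC so the output is deterministic; whether getTimeStr itself uses UTC or local time is not stated in this reference.

```python
from datetime import datetime, timezone


def ns_to_datetime(ts):
    """Convert nanoseconds since the epoch into a UTC datetime (microsecond precision)."""
    secs, nanos = divmod(ts, 1_000_000_000)
    return datetime.fromtimestamp(secs, tz=timezone.utc).replace(microsecond=nanos // 1000)


def time_str(ts):
    """Format in the %m/%d/%YT%H:%M:%S.%f layout."""
    return ns_to_datetime(ts).strftime("%m/%d/%YT%H:%M:%S.%f")


def time_str_iso8601(ts):
    """Format as an ISO 8601 string with a UTC offset."""
    return ns_to_datetime(ts).isoformat()


print(time_str(1_500_000_000))
print(time_str_iso8601(1_500_000_000))
```

Splitting seconds and nanoseconds with integer arithmetic avoids the rounding that a float division by 1e9 would introduce for large timestamps.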
SysFlow Graphlet API¶
- class sysflow.graphlet.Edge(n1, n2, label)¶
Edge
This class represents a graph edge, and acts as a super class for specific edges.
- Parameters:
edge (sysflow.Edge) – an abstract edge object.
- class sysflow.graphlet.EvtEdge(n1, n2, label)¶
EvtEdge
This class represents a graph event edge. It is used for sysflow event objects and subclasses Edge.
- Parameters:
evtedge (sysflow.EvtEdge) – an edge object representing a sysflow evt.
- class sysflow.graphlet.FileFlowNode(oid, exe, args)¶
FileFlowNode
This class represents a fileflow node.
- Parameters:
ff (sysflow.FileFlow) – a fileflow node object.
- class sysflow.graphlet.FlowEdge(n1, n2, label)¶
FlowEdge
This class represents a graph flow edge. It is used for sysflow flow objects and subclasses Edge.
- Parameters:
flowedge (sysflow.FlowEdge) – an edge object representing a sysflow flow.
- class sysflow.graphlet.Graphlet(path, expr=None, defs=[])¶
Graphlet
This class takes a path pointing to a sysflow trace or a directory containing sysflow traces.
Example Usage:
    # basic usage
    g1 = Graphlet('data/')
    g1.view()

    # filtering and enrichment with policies
    ioc1 = 'proc.exe = /usr/bin/scp'
    g1 = Graphlet('data/', ioc1, ['policies/ttps.yaml'])
    g1.view()
- Parameters:
graphlet (sysflow.Graphlet) – A compact provenance graph representation based on sysflow traces.
- associatedMitigations(oid=None)¶
Returns a dataframe containing the set of MITRE mitigations associated with TTPs annotated in the graph.
- Parameters:
oid (object ID string) – a node ID filter.
- bt(cond, prune=True, label=None)¶
Performs a backward traversal on the graph from nodes matching a condition.
- Example Usage:
    def passwd(df):
        return len(df[(df['file.path'].str.contains('passwd'))]) > 0

    cond = lambda n: passwd(n.df())
    g.bt(cond, prune=True, label='discovery').view()
- Parameters:
cond (a lambda predicate that receives a node object as argument and returns True or False) – a lambda describing a predicate over node properties.
prune (boolean) – if true, nodes outside the dominance paths of matched nodes are pruned.
label (string) – if set, add a label to nodes in the dominance paths of matched nodes.
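The idea of a pruning backward traversal can be sketched on a plain dictionary graph. This is not the Graphlet implementation: it uses simple ancestor reachability rather than dominance paths, and the node names and the `backward_traversal` helper are hypothetical.

```python
def backward_traversal(parents, cond):
    """Return the nodes matching cond plus every ancestor reachable from them."""
    keep = set()
    stack = [node for node in parents if cond(node)]
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(parents.get(node, ()))
    return keep


# Toy process tree, stored as child -> list of parents.
parents = {
    "init": [],
    "sshd": ["init"],
    "bash": ["sshd"],
    "cat /etc/passwd": ["bash"],
    "cron": ["init"],
}
kept = backward_traversal(parents, lambda n: "passwd" in n)
print(sorted(kept))
```

Nodes that are not in the returned set (here, `cron`) are the ones a pruning traversal would drop.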
- compare(other, withoid=False, peek=True, peeksize=3, flows=True, ttps=False)¶
Compares the graph to another graph (using a simple graph difference), returning a graph slice.
- Parameters:
withoid (boolean) – indicates whether to show the node ID.
peek (boolean) – indicates whether to show details about the records associated with the nodes.
peeksize (integer) – the number of node records to show.
flows (boolean) – indicates whether to show flow nodes.
ttps (boolean) – indicates whether to show tags.
- countermeasures(oid=None)¶
Returns a dataframe containing the set of MITRE d3fend defenses associated with TTPs annotated in the graph.
- Parameters:
oid (object ID string) – a node ID filter.
- data(oid=None)¶
Returns a dataframe containing the underlying data (sysflow records) of the graph.
- Parameters:
oid (object ID string) – a node ID filter.
- df(oid=None)¶
Returns a dataframe containing a summary of the graph node IDs and process metadata associated with them.
- Parameters:
oid (object ID string) – a node ID filter.
- findPaths(source, sink, prune=True, label=None)¶
Finds paths from source to sink nodes matching conditions.
- Example Usage:
    def scp(df):
        return len(df[(df['proc.exe'].str.contains('scp'))]) > 0

    source = lambda n: scp(n.df())

    def passwd(df):
        return len(df[(df['file.path'].str.contains('passwd'))]) > 0

    sink = lambda n: passwd(n.df())
    g.findPaths(source, sink, prune=True, label='exfil').view()
- Parameters:
source (a lambda predicate that receives a node object as argument and returns True or False) – a lambda describing a predicate over the properties of source nodes.
sink (a lambda predicate that receives a node object as argument and returns True or False) – a lambda describing a predicate over the properties of sink nodes.
prune (boolean) – if true, nodes outside the paths connecting matched nodes are pruned.
label (string) – if set, add a label to nodes in the paths connecting matched nodes.
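A source-to-sink search of this kind can be sketched as a depth-first enumeration of simple paths. The `find_paths` helper and the toy graph are hypothetical, and the sketch makes no claim about how Graphlet computes or prunes its paths.

```python
def find_paths(children, source, sink):
    """Enumerate simple paths from nodes matching source to nodes matching sink."""
    paths = []

    def walk(node, path):
        path = path + [node]
        if sink(node):
            paths.append(path)
        for nxt in children.get(node, ()):
            if nxt not in path:  # do not revisit nodes already on this path
                walk(nxt, path)

    for start in children:
        if source(start):
            walk(start, [])
    return paths


# Toy graph, stored as node -> list of successors.
children = {
    "sshd": ["bash"],
    "bash": ["scp", "ls"],
    "scp": ["/etc/passwd"],
    "ls": [],
    "/etc/passwd": [],
}
paths = find_paths(children, lambda n: n == "scp", lambda n: "passwd" in n)
print(paths)
```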
- ft(cond, prune=True, label=None)¶
Performs a forward traversal on the graph from nodes matching a condition.
- Example Usage:
    def scp(df):
        return len(df[(df['proc.exe'].str.contains('scp'))]) > 0

    cond = lambda n: scp(n.df())
    g.ft(cond, prune=True, label='remotecopy').view()
- Parameters:
cond (a lambda predicate that receives a node object as argument and returns True or False) – a lambda describing a predicate over node properties.
prune (boolean) – if true, nodes outside the dominated paths of matched nodes are pruned.
label (string) – if set, add a label to nodes in the dominated paths of matched nodes.
- intersection(other, withoid=False, peek=True, peeksize=3, flows=True, ttps=False)¶
Computes the intersection of a graph with another graph, returning the graph corresponding to the intersection of the two graphs.
- Parameters:
other (Graphlet) – the other graphlet to compute the intersection.
- mitigations(oid=None, details=False)¶
Returns a dataframe containing the summary set of MITRE mitigations associated with TTPs annotated in the graph.
- Parameters:
oid (object ID string) – a node ID filter.
- tags(oid=None)¶
Returns a dataframe containing the set of (enrichment) tags in the graph.
- Parameters:
oid (object ID string) – a node ID filter.
- ttps(oid=None, details=False)¶
Returns a dataframe containing the set of MITRE TTP tags in the graph (e.g., as enriched by the ttps.yaml policy provided with the SysFlow processor).
- Parameters:
oid (object ID string) – a node ID filter.
details (boolean) – indicates whether to include complete TTP metadata in the dataframe.
- view(withoid=False, peek=True, peeksize=3, flows=True, ttps=False)¶
Visualizes the graph in dot format.
- Parameters:
withoid (boolean) – indicates whether to show the node ID.
peek (boolean) – indicates whether to show details about the records associated with the nodes.
peeksize (integer) – the number of underlying sysflow records to show in the node.
flows (boolean) – indicates whether to show flow nodes.
ttps (boolean) – indicates whether to show tags.
- class sysflow.graphlet.NetFlowNode(oid, exe, args)¶
NetFlowNode
This class represents a netflow node.
- Parameters:
nf (sysflow.NetFlow) – a netflow node object.
- class sysflow.graphlet.Node(oid)¶
Node
This class represents a graph node, and acts as a super class for specific nodes.
- Parameters:
node (sysflow.Node) – an abstract node object.
- class sysflow.graphlet.ProcessNode(oid, exe, args, uid, user, gid, group, tty)¶
ProcessNode
This class represents a process node.
- Parameters:
proc (sysflow.ProcessNode) – a process node object.
SysFlow QL API¶
- class sysflow.sfql.SfqlInterpreter(query: str | None = None, paths: list = [], inputs: list = [])¶
SfqlInterpreter
This class takes a sfql expression (and optionally a file containing a library of lists and macros) and produces a predicate expression that can be matched against sysflow records. Example Usage:
    # using 'filter' to filter the input stream
    reader = FlattenedSFReader('trace.sf')
    interpreter = SfqlInterpreter()
    query = '- sfql: type = FF'
    for r in interpreter.filter(reader, query):
        print(r)
- Parameters:
interpreter (sysflow.SfqlInterpreter) – An interpreter for executing sfql expressions.
- compile(query: str | None = None, paths: list = [], inputs: list = [])¶
Compile sfql into a predicate expression to match sysflow records.
- Parameters:
query (str) – sfql query.
paths (list) – a list of paths to files containing sfql list and macro definitions.
inputs (list) – a list of input streams from where to read sfql list and macro definitions.
- enrich(t: T)¶
Process flattened sysflow record t based on policies.
- evaluate(t: T, query: str | None = None, paths: list = []) → bool¶
Evaluate sfql expression against flattened sysflow record t.
- Parameters:
t – an individual flattened sysflow record
query (str) – sfql query.
paths (list) – a list of paths to files containing sfql list and macro definitions.
- filter(reader, query: str | None = None, paths: list = [])¶
Filter iterable reader according to sfql expression.
- Parameters:
reader (FlattenedSFReader) – sysflow reader
query (str) – sfql query.
paths (list) – a list of paths to files containing sfql list and macro definitions.
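Conceptually, filtering compiles an expression into a predicate and then yields only the records that satisfy it. The sketch below handles just a single `field = value` equality over dictionary records; it is a toy subset for illustration, not the sfql grammar, and the helper names and record layout are assumptions.

```python
def compile_equals(expr):
    """Compile a single 'field = value' expression into a predicate over dicts."""
    field, _, value = expr.partition("=")
    field, value = field.strip(), value.strip()
    return lambda record: str(record.get(field)) == value


def filter_records(records, expr):
    """Lazily yield the records that satisfy the compiled predicate."""
    pred = compile_equals(expr)
    return (r for r in records if pred(r))


# Hand-written stand-ins for flattened records.
records = [
    {"type": "FF", "proc.exe": "/usr/bin/scp"},
    {"type": "FF", "proc.exe": "/usr/bin/curl"},
]
matches = list(filter_records(records, "proc.exe = /usr/bin/scp"))
print(matches)
```

Compiling the expression once and reusing the predicate for every record mirrors the split between compile and filter in the interface above.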
- getAttributes()¶
Return list of attributes supported by sfql.
- class sysflow.sfql.SfqlMapper¶