tshark + Elasticsearch

Project to build Virtual Machine  https://github.com/H21lab/tsharkVM 

The native tshark JSON and Elasticsearch export enables to push the decoded data into Elasticsearch cluster.

tshark can be used in this way as monitoring probe to push the data into Elasticsearch cluster which enables:

Possible architecture:

Here is a github project to built VM demonstrating the approach. 

Usage

Here is documented short guide how to use it. 

Let's start with documentation:

man tshark (version 3.0.5)


-T  ek|fields|json|jsonraw|pdml|ps|psml|tabs|textSet the format of the output when viewing decoded packet data.  The options are one of:
ek Newline delimited JSON format for bulk import into Elasticsearch.  It can be used with -j or -J including the JSON filter or with -x to include raw hex-encoded packet data.  If -P is specified it will print the packet summary only, with both -P and -V it will print the packet summary and packet details. If neither -P or -V are used it will print the packet details only.  Example of usage to import data into Elasticsearch:
tshark -T ek -j "http tcp ip" -P -V -x -r file.pcap > file.jsoncurl -H "Content-Type: application/x-ndjson" -XPOST http://elasticsearch:9200/_bulk --data-binary "@file.json"
Elastic requires a mapping file to be loaded as template for packets-* index in order to convert wireshark types to elastic types. This file can be auto-generated with the command "tshark-G elastic-mapping". Since the mapping file can be huge, protocols can be selected by using the option --elastic-mapping-filter:
tshark -G elastic-mapping --elastic-mapping-filter ip,udp,dns
fields The values of fields specified with the -e option, in a form specified by the -E option.  For example,
tshark -T fields -E separator=, -E quote=d
would generate comma-separated values (CSV) output suitable for importing into your favorite spreadsheet program.
json JSON file format.  It can be used with -j or -J including the JSON filter or with -x option to include raw hex-encoded packet data.  Example of usage:
tshark -T json -r file.pcaptshark -T json -j "http tcp ip" -x -r file.pcap
jsonraw JSON file format including only raw hex-encoded packet data.  It can be used with -j including or -J the JSON filter option.  Example of usage:
tshark -T jsonraw -r file.pcaptshark -T jsonraw -j "http tcp ip" -x -r file.pcap
...
-j  <protocol match filter>Protocol match filter used for ek|json|jsonraw|pdml output file types.  Parent node containing multiple child nodes is only included, if the name is found in the filter.
Example: tshark -j "ip ip.flags text"
-J  <protocol match filter>Protocol top level filter used for ek|json|jsonraw|pdml output file types.  Parent node containing multiple child nodes is included with all children.
Example: tshark -J "http tcp"
...
--no-duplicate-keys      If -T json is specified, merge duplicate keys in an object into a single key with as value a json array containing all values

NOTE: The -T ek is de-duplicated by default in the latest wireshark code. This is required for Elasticsearch 6.0 and higher due the strict duplicate checking. The use of switch  --no-duplicate-keys should be used based on the json parsers. Without this switch, the json generates also duplicated values. This has been described in wireshark bug 12958.


Here follows the examples of generated outputs.

Example of the thsark JSON output (-T json)

# json example by using top-level protocol filter

./tshark -T json -J ip -r siqtran.pcap[  {    "_index": "packets-2017-02-14",    "_type": "pcap_file",    "_score": null,    "_source": {      "layers": {        "frame": {          "filtered": "frame"        },        "sll": {          "filtered": "sll"        },        "ip": {          "ip.version": "4",          "ip.hdr_len": "20",          "ip.dsfield": "0x00000000",          "ip.dsfield_tree": {            "ip.dsfield.dscp": "0",            "ip.dsfield.ecn": "0"          },          "ip.len": "276",          "ip.id": "0x00000000",          "ip.flags": "0x00000000",          "ip.flags_tree": {            "ip.flags.rb": "0",            "ip.flags.df": "0",            "ip.flags.mf": "0"          },          "ip.frag_offset": "0",          "ip.ttl": "64",          "ip.proto": "132",          "ip.checksum": "0x00007ad6",          "ip.checksum.status": "2",          "ip.src": "127.0.0.1",          "ip.addr": "127.0.0.1",          "ip.src_host": "127.0.0.1",          "ip.host": "127.0.0.1",          "ip.dst": "127.0.0.1",          "ip.addr": "127.0.0.1",          "ip.dst_host": "127.0.0.1",          "ip.host": "127.0.0.1",          "Source GeoIP: Unknown": "",          "Destination GeoIP: Unknown": ""        },        "sctp": {          "filtered": "sctp"        },        "m3ua": {          "filtered": "m3ua"        },        "sccp": {          "filtered": "sccp"        },        "tcap": {          "filtered": "tcap"        },        "gsm_map": {          "filtered": "gsm_map"        },        "gsm_sms": {          "filtered": "gsm_sms"        }      }    }  }]

# json example by using protocol filter

./tshark -T json -j "dns text" -r dns_trace.pcap[  {    "_index": "packets-2016-06-23",    "_type": "pcap_file",    "_score": null,    "_source": {      "layers": {        "frame": {          "filtered": "frame"        },        "eth": {          "filtered": "eth"        },        "ip": {          "filtered": "ip"        },        "udp": {          "filtered": "udp"        },        "dns": {          "dns.id": "0x0000a77c",          "dns.flags": {            "filtered": "dns_flags"          },          "dns.count.queries": "1",          "dns.count.answers": "0",          "dns.count.auth_rr": "0",          "dns.count.add_rr": "0",          "Queries": {            "clients6.google.com: type A, class IN": {              "dns.qry.name": "clients6.google.com",              "dns.qry.name.len": "19",              "dns.count.labels": "3",              "dns.qry.type": "1",              "dns.qry.class": "0x00000001"            }          }        }      }    }  }]

# json example of the fields filter

./tshark -T json -e "ip.dst" -e "ip.src" -r /test_http.pcap[  {    "_index": "packets-2016-06-30",    "_type": "pcap_file",    "_score": null,    "_source": {      "layers": {        "ip.dst": ["10.0.0.1"],        "ip.src": ["10.0.0.2"]      }    }  }]

Example of Elasticsearch output (-T ek)

 # by using json protocol filter

./tshark -T ek -j "dns text" -r dns_trace.pcap{"index" : {"_index": "packets-2016-06-23", "_type": "pcap_file", "_score": null}}{"timestamp" : "1439587238915", "layers" : {"frame": {"filtered": "frame"},"eth": {"filtered": "eth"},"ip": {"filtered": "ip"},"udp": {"filtered": "udp"},"dns": {"dns_dns_id": "0x0000a77c","dns_dns_flags": "","filtered": "dns_flags","dns_dns_count_queries": "1","dns_dns_count_answers": "0","dns_dns_count_auth_rr": "0","dns_dns_count_add_rr": "0","dns_text": "Queries","text_text": "clients6.google.com: type A, class IN","text_dns_qry_name": "clients6.google.com","text_dns_qry_name_len": "19","text_dns_count_labels": "3","text_dns_qry_type": "1","text_dns_qry_class": "0x00000001"}}}

# by using the fields filter

./tshark -T ek -e "ip.dst" -e "ip.src" -r /test_http.pcap{"index" : {"_index": "packets-2016-06-30", "_type": "pcap_file", "_score": null}}{"timestamp" : "1438233730468", "layers" : {"ip_dst": ["10.0.0.1"],"ip_src": ["10.0.0.2"]}}

Example of Kibana dashboards

GSM MAP split by Operation Codes and Calling GT

To store the whole packets into Elasticsearch, the following could be used.
However for more complex operations logstash should be considered.


./tshark -T ek -x -r ~/e/_Downloads/test_dns.pcap > ~/Downloads/file.jsoncurl -H "Content-Type: application/x-ndjson" -XPOST http://elasticsearch:9200/_bulk --data-binary @file.json

To reconstruct the pcap back from Elasticsearch

# past the frame_raw into text file using vi editor


vi test.hexxxd -r -p test.hex | od -Ax -tx1 | text2pcap - test.pcap

REFERENCES and UPDATES: