Notebooks/Example - Linux-Windows-Office Investigation.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Title: Sample Investigation \n",
    "## Linux, Windows, Network and Office data\n",
    "**Notebook Version:** 1.0<br>\n",
    "**Python Version:** Python 3.6 (including Python 3.6 - AzureML)<br>\n",
    "**Required Packages**: kqlmagic, msticpy, pandas, numpy, matplotlib, networkx, ipywidgets, ipython, scikit_learn, dnspython, ipwhois, folium, maxminddb_geolite2<br>\n",
    "**Platforms Supported**:\n",
    "- Azure Notebooks Free Compute\n",
    "- Azure Notebooks DSVM\n",
    "- OS Independent\n",
    "\n",
    "**Data Sources Required**:\n",
    "- Log Analytics - SecurityAlert, SecurityEvent (EventIDs 4688 and 4624/25), AuditLog_CL (Linux Auditd), OfficeActivity, AzureNetworkAnalytics_CL, Heartbeat\n",
    "- (Optional) - VirusTotal (with API key)\n",
    "\n",
    "**Contact:** ianhelle@microsoft.com\n",
    "\n",
    "## Description:\n",
    "This is an example notebook demonstrating techniques to trace the path of an attacker in an organization. Most of the steps use relatively simple _Log Analytics_ queries but it also includes a few advanced procedures such as:\n",
    "- Unpacking and decoding Linux Audit logs\n",
    "- Clustering\n",
    "\n",
    "From an initial alert (or suspect IP address) examine activity on a Linux host, a Windows and Office subscription.\n",
    "Discover malicious activity related to the ip address in each of these. \n",
    "\n",
    "The notebook is intended to illustrate the kinds of steps and data query and analysis that you might do in a real hunt or investigation.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='toc'></a>\n",
    "## Table of Contents\n",
    "- [Setup and Authenticate](#setup)\n",
    "\n",
    "- [Get Alerts List](#getalertslist)\n",
    "- [Choose an Alert to investigate](#enteralertid)\n",
    "  - [Extract Properties and entities from alert](#extractalertproperties)\n",
    "  - [Basic IP Checks](#basic_ip_checks)\n",
    "    - [Check the IP Address for known C2 addresses](#check_ip_ti)\n",
    "- [See What's going on on the Affected Host - Linux](#alerthost)\n",
    "  - [Event Types collected](#linux_event_types)\n",
    "  - [Failure Events](#linux_failure_events)\n",
    "  - [Extract IPs from all Events](#linux_extract_ips)\n",
    "  - [Get Logins with IP Address Recorded](#linux_login_ips)\n",
    "  - [What's happening in these sessions?](#linux_login_sessions)\n",
    "  - [Find Distinctive Process Patterns - Clustering](#linux_proc_cluster)\n",
    "- [Alert Host Network Data](#alert_host_net)\n",
    "  - [Check Communications with Other Hosts](#comms_to_other_hosts)\n",
    "  - [GeoLocation Mapping](#geomap_lx_ips)\n",
    "  - [Have any other hosts been communicating with this address(es)?](#other_hosts_to_ips)\n",
    "- [Other Hosts Communicating with IP](#other_host_investigate)\n",
    "  - [Check Host Logons](#host_logons)\n",
    "  - [Examine a Logon Session](#examine_win_logon_sess)\n",
    "  - [Unusual Processes on Host - Clustering](#process_clustering)\n",
    "  - [Processes for Selected LogonId](#process_session)\n",
    "  - [Other Events on the Host](#other_win_events)\n",
    "- [Office 365 Activity](#o365)\n",
    "- [Summary](#summary)\n",
    "- [Appendices](#appendices)\n",
    "  - [Saving data to Excel](#appendices)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='setup'></a>[Contents](#toc)\n",
    "# Setup\n",
    "\n",
    "Make sure that you have installed packages specified in the setup (uncomment the lines to execute)\n",
    "\n",
    "## Install Packages\n",
    "The first time this cell runs for a new Azure Notebooks project or local Python environment it will take several minutes to download and install the packages. In subsequent runs it should run quickly and confirm that package dependencies are already installed. Unless you want to upgrade the packages you can feel free to skip execution of the next cell.\n",
    "\n",
    "If you see any import failures (```ImportError```) in the notebook, please re-run this cell and answer 'y', then re-run the cell where the failure occurred.\n",
    "\n",
    "Note you may see some warnings about package incompatibility with certain packages. This does not affect the functionality of this notebook but you may need to upgrade the packages producing the warnings to a more recent version."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "import warnings\n",
    "\n",
    "warnings.filterwarnings(\"ignore\",category=DeprecationWarning)\n",
    "\n",
    "MIN_REQ_PYTHON = (3,6)\n",
    "if sys.version_info < MIN_REQ_PYTHON:\n",
    "    print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')\n",
    "    print('or later is selected as the active kernel.')\n",
    "    sys.exit(\"Python %s.%s or later is required.\\n\" % MIN_REQ_PYTHON)\n",
    "\n",
    "# Package Installs - try to avoid if they are already installed\n",
    "try:\n",
    "    import msticpy.sectools as sectools\n",
    "    import Kqlmagic\n",
    "    from dns import reversename, resolver\n",
    "    from ipwhois import IPWhois\n",
    "    import folium\n",
    "    \n",
    "    print('If you answer \"n\" this cell will exit with an error in order to avoid the pip install calls,')\n",
    "    print('This error can safely be ignored.')\n",
    "    resp = input('msticpy and Kqlmagic packages are already loaded. Do you want to re-install? (y/n)')\n",
    "    if resp.strip().lower() != 'y':\n",
    "        sys.exit('pip install aborted - you may skip this error and continue.')\n",
    "except ImportError:\n",
    "    pass\n",
    "\n",
    "print('\\nPlease wait. Installing required packages. This may take a few minutes...')\n",
    "!pip install git+https://github.com/microsoft/msticpy --upgrade --user\n",
    "!pip install Kqlmagic --no-cache-dir --upgrade --user\n",
    "\n",
    "# Additional packages used in this notebook.\n",
    "!pip install dnspython --upgrade \n",
    "!pip install ipwhois --upgrade \n",
    "!pip install folium --upgrade \n",
    "\n",
    "# Uncomment to refresh the maxminddb database\n",
    "# !pip install maxminddb-geolite2 --upgrade \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Imports\n",
    "import sys\n",
    "MIN_REQ_PYTHON = (3,6)\n",
    "if sys.version_info < MIN_REQ_PYTHON:\n",
    "    print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')\n",
    "    print('or later is selected as the active kernel.')\n",
    "    sys.exit(\"Python %s.%s or later is required.\\n\" % MIN_REQ_PYTHON)\n",
    "\n",
    "import numpy as np\n",
    "from IPython import get_ipython\n",
    "from IPython.display import display, HTML, Markdown\n",
    "import ipywidgets as widgets\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "sns.set()\n",
    "import networkx as nx\n",
    "\n",
    "import pandas as pd\n",
    "pd.set_option('display.max_rows', 100)\n",
    "pd.set_option('display.max_columns', 50)\n",
    "pd.set_option('display.max_colwidth', 100)\n",
    "\n",
    "import msticpy.sectools as sectools\n",
    "import msticpy.nbtools as mas\n",
    "import msticpy.nbtools.kql as qry\n",
    "import msticpy.nbtools.nbdisplay as nbdisp\n",
    "\n",
    "WIDGET_DEFAULTS = {'layout': widgets.Layout(width='95%'),\n",
    "                   'style': {'description_width': 'initial'}}\n",
    "display(HTML(mas.util._TOGGLE_CODE_PREPARE_STR))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Copy Last Query to Clipboard\n",
    "This section adds an IPython magic command 'la'. The magic is used as follows:\n",
    "\n",
    "```%la [pythonvar|string]```\n",
    "\n",
    "If used with no arguments it will copy the last KQL query to clipboard and display a link to take you to the Sentinel/Log Analytics portal. If the argument is a variable the value of the variable is copied, otherwise the string is copied.\n",
    "\n",
    "When using the **%%la** cell magic form the entire cell is copied to the clipboard.\n",
    "\n",
    "The URL uses the current config settings for workspace and subscription.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "from IPython.core.magic import (register_line_magic, register_cell_magic,\n",
    "                                register_line_cell_magic)\n",
    "from collections import OrderedDict\n",
    "\n",
    "def copy_to_clipboard(copy_text):\n",
    "    pd.DataFrame([copy_text]).to_clipboard(index=False,header=False)\n",
    "    \n",
    "LA_URL=('https://ms.portal.azure.com/'\n",
    "        '?feature.showassettypes=Microsoft_Azure_Security_Insights_SecurityInsightsDashboard'\n",
    "        '#blade/Microsoft_Azure_Security_Insights/MainMenuBlade/6'\n",
    "        '/subscriptionId/{sub_id}'\n",
    "        '/resourceGroup/{ws_id}'\n",
    "        '/workspaceName/{ws_name}')\n",
    "\n",
    "@register_line_cell_magic\n",
    "def la(line, cell=None):\n",
    "    KQL_MAGIC_RESULT = '_kql_raw_result_'\n",
    "    \n",
    "    #import pdb; pdb.set_trace()\n",
    "    if not cell and not line or line.strip() == KQL_MAGIC_RESULT:\n",
    "        if KQL_MAGIC_RESULT in globals():\n",
    "            copy_to_clipboard(globals()[KQL_MAGIC_RESULT].query)\n",
    "            print('Last kql query copied to clipboard.')\n",
    "    elif line and cell is None:\n",
    "        if line in globals():\n",
    "            copy_to_clipboard(globals()[line])\n",
    "            print(f'Value of {line} copied to clipboard.')\n",
    "        elif line in locals():\n",
    "            copy_to_clipboard(locals()[line])\n",
    "            print(f'Value of {line} copied to clipboard.')\n",
    "        else:\n",
    "            copy_to_clipboard(line)\n",
    "            print(f'Copied to clipboard.')\n",
    "    else:\n",
    "        copy_to_clipboard(cell)\n",
    "        print(f'Copied to clipboard.')\n",
    "    \n",
    "    url = LA_URL # TODO .format(sub_id=subscription_id,\n",
    "#                         ws_id=resource_group, \n",
    "#                         ws_name=workspace_name)\n",
    "    return HTML(f'<a target=\"_new\" href=\"{url}\">Open Log Analytics</a>')\n",
    "del la\n",
    "\n",
    "# Create an observation collector list\n",
    "from collections import namedtuple\n",
    "Observation = namedtuple('Observation', ['caption', 'description', 'item', 'link'])\n",
    "observation_list = OrderedDict()\n",
    "def display_observation(observation):\n",
    "    display(Markdown(f'### {observation.caption}'))\n",
    "    display(Markdown(observation.description))\n",
    "    display(Markdown(f'[Go to details](#{observation.link})'))\n",
    "    display(observation.item)\n",
    "\n",
    "def add_observation(observation):\n",
    "    observation_list[observation.caption] = observation\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": [
     "remove"
    ]
   },
   "source": [
    "### Get WorkspaceId\n",
    "To find your Workspace Id go to [Log Analytics](https://ms.portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.OperationalInsights%2Fworkspaces). Look at the workspace properties to find the ID."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true,
    "tags": [
     "todo"
    ]
   },
   "outputs": [],
   "source": [
    "import os\n",
    "from msticpy.nbtools.wsconfig import WorkspaceConfig\n",
    "ws_config_file = 'config.json'\n",
    "\n",
    "try:\n",
    "    ws_config = WorkspaceConfig(ws_config_file)\n",
    "    display(Markdown(f'Read Workspace configuration from local config.json for workspace **{ws_config[\"workspace_name\"]}**'))\n",
    "    for cf_item in ['tenant_id', 'subscription_id', 'resource_group', 'workspace_id', 'workspace_name']:\n",
    "        display(Markdown(f'**{cf_item.upper()}**: {ws_config[cf_item]}'))\n",
    "    WORKSPACE_ID = ws_config['workspace_id']\n",
    "except:\n",
    "    WORKSPACE_ID = None\n",
    "    display(Markdown('**Workspace configuration not found.**\\n\\n'\n",
    "                     'Please go to your Log Analytics workspace, copy the workspace ID and paste here. '\n",
    "                     'Or read the workspace_id from the config.json in your Azure Notebooks project.'))\n",
    "    ws_config = None\n",
    "    ws_id = mas.GetEnvironmentKey(env_var='WORKSPACE_ID',\n",
    "                              prompt='Please enter your Log Analytics Workspace Id:')\n",
    "    ws_id.display()\n",
    "    \n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Authenticate to Log Analytics\n",
    "If you are using user/device authentication, run the following cell. \n",
    "- Click the 'Copy code to clipboard and authenticate' button.\n",
    "- This will pop up an Azure Active Directory authentication dialog (in a new tab or browser window). The device code will have been copied to the clipboard. \n",
    "- Select the text box and paste (Ctrl-V/Cmd-V) the copied value. \n",
    "- You should then be redirected to a user authentication page where you should authenticate with a user account that has permission to query your Log Analytics workspace.\n",
    "\n",
    "Use the following syntax if you are authenticating using an Azure Active Directory AppId and Secret:\n",
    "```\n",
    "%kql loganalytics://tenant(aad_tenant).workspace(WORKSPACE_ID).clientid(client_id).clientsecret(client_secret)\n",
    "```\n",
    "instead of\n",
    "```\n",
    "%kql loganalytics://code().workspace(WORKSPACE_ID)\n",
    "```\n",
    "\n",
    "Note: you may occasionally see a JavaScript error displayed at the end of the authentication - you can safely ignore this.<br>\n",
    "On successful authentication you should see a ```popup schema``` button."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "todo"
    ]
   },
   "outputs": [],
   "source": [
    "if not WORKSPACE_ID:\n",
    "    try:\n",
    "        WORKSPACE_ID = ws_id.value\n",
    "    except NameError:\n",
    "        raise ValueError('No workspace Id.')\n",
    "\n",
    "mas.kql.load_kql_magic()\n",
    "%kql loganalytics://code().workspace(WORKSPACE_ID)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": [
     "todo"
    ]
   },
   "source": [
    "<a id='getalertslist'></a>[Contents](#toc)\n",
    "# Get Alerts List\n",
    "\n",
    "Specify a time range to search for alerts. One this is set run the following cell to retrieve any alerts in that time window.\n",
    "You can change the time range and re-run the queries until you find the alerts that you want."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "alert_q_times = mas.QueryTime(units='day', max_before=20, before=2, max_after=1)\n",
    "alert_q_times.display()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "alert_counts = qry.list_alerts_counts(provs=[alert_q_times])\n",
    "alert_list = qry.list_alerts(provs=[alert_q_times])\n",
    "print(len(alert_counts), ' distinct alert types')\n",
    "print(len(alert_list), ' distinct alerts')\n",
    "display(HTML('<h2>Top alerts</h2>'))\n",
    "alert_counts.head(10) # remove '.head()'' to see the full list grouped by AlertName"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='enteralertid'></a>[Contents](#toc)\n",
    "# Choose Alert to Investigate\n",
    "Pick an alert from a list of retrieved alerts.\n",
    "\n",
    "This section extracts the alert information and entities into a SecurityAlert object allowing us to query the properties more reliably. \n",
    "\n",
    "In particular, we use the alert to automatically provide parameters for queries and UI elements.\n",
    "Subsequent queries will use properties like the host name and derived properties such as the OS family (Linux or Windows) to adapt the query. Query time selectors like the one above will also default to an origin time that matches the alert selected.\n",
    "\n",
    "The alert view below shows all of the main properties of the alert plus the extended property dictionary (if any) and JSON representations of the Entity."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Select alert from list\n",
    "As you select an alert, the main properties will be shown below the list.\n",
    "\n",
    "Use the filter box to narrow down your search to any substring in the AlertName."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "security_alert = None\n",
    "def show_full_alert(selected_alert):\n",
    "    global security_alert\n",
    "    security_alert = mas.SecurityAlert(alert_select.selected_alert)\n",
    "    mas.disp.display_alert(security_alert, show_entities=True)\n",
    "alert_select = mas.AlertSelector(alerts=alert_list, action=show_full_alert)\n",
    "alert_select.display()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Check alert for IP addresses not contained in entities\n",
    "Additional IP addresses found in alert are shown below.\n",
    "\n",
    "If you have others to check (e.g. from Threat Intel data) that\n",
    "you think may be associated with the same investigation, add them here (delimited by commas)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# Extract any additional IP entities\n",
    "ioc_extractor = sectools.IoCExtract()\n",
    "new_entities = ioc_extractor.extract(src=str(security_alert))\n",
    "\n",
    "addl_ip_addrs = ', '.join(new_entities.get('ipv4', []))\n",
    "\n",
    "if (not [e for e in security_alert.entities if isinstance(e, mas.IpAddress)] and\n",
    "    not addl_ip_addrs):\n",
    "    print('WARNING: Alert has no IpAddress entities.')\n",
    "    print()\n",
    "\n",
    "\n",
    "print('Additional IP addresses')\n",
    "ip_wgt = widgets.Text(description='IP Addresses:', value=addl_ip_addrs, **WIDGET_DEFAULTS)\n",
    "display(ip_wgt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='basic_ip_checks'></a>[Contents](#toc)\n",
    "## Basic IP Checks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Geo-mapping function definition"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "from msticpy.sectools.geoip import GeoLiteLookup\n",
    "iplocation = GeoLiteLookup()\n",
    "\n",
    "alert_ip_entities = [e for e in security_alert.entities if isinstance(e, mas.IpAddress)]\n",
    "for ip_entity in alert_ip_entities:\n",
    "    if 'Location' not in ip_entity or not ip_entity.Location:\n",
    "        print(ip_entity)\n",
    "        iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "\n",
    "if ip_wgt.value:\n",
    "    ip_str_list = [ip.strip() for ip in ip_wgt.value.split(',') if ip]\n",
    "    if ip_str_list:\n",
    "        _, ip_entities = iplocation.lookup_ip(ip_addr_list=ip_str_list)\n",
    "        alert_ip_entities = alert_ip_entities + ip_entities\n",
    "\n",
    "import folium\n",
    "from folium.plugins import MarkerCluster\n",
    "from numbers import Number\n",
    "import warnings\n",
    "\n",
    "def create_ip_map():\n",
    "    folium_map = folium.Map(zoom_start=7, tiles=None, width='100%', height='100%')\n",
    "    folium_map.add_tile_layer(name='IPEvents')\n",
    "    return folium_map\n",
    "\n",
    "def add_ip_cluster(folium_map, ip_entities, alert=None, **icon_props):\n",
    "    if not folium_map:\n",
    "        folium_map = create_ip_map()\n",
    "    \n",
    "    for ip_entity in ip_entities:\n",
    "        if not (isinstance(ip_entity.Location.Latitude, Number) and\n",
    "                isinstance(ip_entity.Location.Longitude, Number)):\n",
    "            warnings.warn(\"Invalid location information for IP: \" + ip_entity.Address,\n",
    "                          RuntimeWarning)\n",
    "            continue\n",
    "        loc_props = ', '.join([f'{key}={val}' for key, val in \n",
    "                               ip_entity.Location.properties.items() if val])\n",
    "        popup_text = \"{loc_props}<br>{IP}\".format(IP=ip_entity.Address,\n",
    "                                                  loc_props=loc_props)\n",
    "        tooltip_text = '{City}, {CountryName}'.format(**ip_entity.Location.properties)\n",
    "        if alert:\n",
    "            popup_text = f'{popup_text}<br>{alert.AlertName}'\n",
    "        if ip_entity.AdditionalData:\n",
    "            addl_props = ', '.join([f'{key}={val}' for key, val in \n",
    "                                    ip_entity.AdditionalData.items() if val])\n",
    "            popup_text = f'{popup_text}<br>{addl_props}'\n",
    "            tooltip_text = f'{tooltip_text}, {addl_props}'\n",
    "        marker = folium.Marker(\n",
    "            location = [ip_entity.Location.Latitude, ip_entity.Location.Longitude],\n",
    "            popup=popup_text,\n",
    "            tooltip=tooltip_text,\n",
    "            icon=folium.Icon(**icon_props)\n",
    "        )\n",
    "        marker.add_to(folium_map)\n",
    "\n",
    "    return folium_map\n",
    "\n",
    "ip_loc_map = create_ip_map()\n",
    "icon_props = {'color': 'red', 'icon': 'crosshairs', 'prefix': 'fa'}\n",
    "ip_loc_map = add_ip_cluster(folium_map=ip_loc_map,\n",
    "                            ip_entities=alert_ip_entities,\n",
    "                            alert=security_alert,\n",
    "                            **icon_props)\n",
    "display(HTML('<h3>Location of IP Address in alert</h3>'))\n",
    "display(ip_loc_map)\n",
    "\n",
    "add_observation(Observation(caption='Alert IPs Location', \n",
    "                            description='Geolocation of alert IPs',\n",
    "                            item=alert_ip_entities,\n",
    "                            link='basic_ip_checks'))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Reverse IP and WhoIs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# reverse DNS lookup\n",
    "from dns import reversename, resolver\n",
    "from ipwhois import IPWhois\n",
    "for src_ip_entity in alert_ip_entities:\n",
    "    print('IP:', src_ip_entity.Address)\n",
    "    print('-'*50)\n",
    "    \n",
    "    print('Reverse Name Lookup.')\n",
    "    rev_name = reversename.from_address(src_ip_entity.Address)\n",
    "    \n",
    "    print(rev_name)\n",
    "    try:\n",
    "        rev_dns = str(resolver.query(rev_name, 'PTR'))\n",
    "        display(rev_dns)\n",
    "    except:\n",
    "        print('No reverse addr result')\n",
    "        pass\n",
    "\n",
    "    print('\\nWhoIs Lookup.')\n",
    "    whois = IPWhois(src_ip_entity.Address)\n",
    "    whois_result = whois.lookup_whois()\n",
    "    if whois_result:\n",
    "        display(whois_result)\n",
    "    else:\n",
    "        print('No whois result')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='check_ip_ti'></a>[Contents](#toc)\n",
    "### Check the IP Address for known C2 addresses"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Lookup in BYOTI or VT (or IPRep)\n",
    "fake_ip = '203.0.113.5'\n",
    "ti_query = r'''\n",
    "ThreatIntelSample_CL\n",
    "| where NetworkIP_s == '{ip}'\n",
    "| project\n",
    "TimeGenerated,\n",
    "ExternalIndicatorId_s,\n",
    "ThreatType_s,\n",
    "Description_s,\n",
    "Active_s,\n",
    "TrafficLightProtocolLevel_s,\n",
    "ConfidenceScore_s,\n",
    "ThreatSeverity_s,\n",
    "ExpirationDateTime_t,\n",
    "IndicatorId_s,\n",
    "NetworkIP_s,\n",
    "Type\n",
    "'''.format(ip=fake_ip)\n",
    "%kql -query ti_query\n",
    "ti_query_df = _kql_raw_result_.to_dataframe()\n",
    "if len(ti_query_df) > 0:\n",
    "    display(ti_query_df.T)\n",
    "    \n",
    "    add_observation(Observation(caption='Threat Intel Report on IP(s)', \n",
    "                            description='Threat intelligence report found on alert host ext IP',\n",
    "                            item=ti_query_df,\n",
    "                            link='check_ip_ti'))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='alerthost'></a>[Contents](#toc)\n",
    "# See What's going on on the Affected Host - Linux"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "host1_q_times = mas.QueryTime(label='Set time bounds for alert host - at least 1hr either side of the alert',\n",
    "                           units='hour', max_before=48, before=2, after=1, \n",
    "                           max_after=24, origin_time=security_alert.StartTimeUtc)\n",
    "host1_q_times.display()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Linux test\n",
    "linux_events = r'''\n",
    "AuditLog_CL\n",
    "| where Computer has '{hostname}'\n",
    "| where TimeGenerated >= datetime({start})\n",
    "| where TimeGenerated <= datetime({end})\n",
    "| extend mssg_parts = extract_all(@\"type=(?P<type>[^\\s]+)\\s+msg=audit\\((?P<mssg_id>[^)]+)\\):\\s+(?P<mssg>[^\\r]+)\\r?\", dynamic(['type', 'mssg_id', 'mssg']), RawData)\n",
    "| extend mssg_type = tostring(mssg_parts[0][0]), mssg_id = tostring(mssg_parts[0][1])\n",
    "| project TenantId, TimeGenerated, Computer, mssg_type, mssg_id, mssg_parts\n",
    "| extend mssg_content = split(mssg_parts[0][2],' ')\n",
    "| extend typed_mssg = pack(mssg_type, mssg_content)\n",
    "| summarize AuditdMessage = makelist(typed_mssg) by TenantId, TimeGenerated, Computer, mssg_id\n",
    "'''.format(start=host1_q_times.start, end=host1_q_times.end,\n",
    "           hostname=security_alert.hostname)\n",
    "print('getting data...')\n",
    "%kql -query linux_events\n",
    "linux_events_df = _kql_raw_result_.to_dataframe()\n",
    "print(f'{len(linux_events_df)} raw auditd mssgs downloaded')\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "import codecs\n",
    "from datetime import datetime\n",
    "encoded_params = {'EXECVE': {'a0', 'a1', 'a2', 'a3', 'arch'},\n",
    "                  'PROCTITLE': {'proctitle'},\n",
    "                  'USER_CMD': {'cmd'}}\n",
    "\n",
    "def unpack_auditd(audit_str):\n",
    "    event_dict = {}\n",
    "    for record in audit_str:\n",
    "        \n",
    "        for rec_key, rec_val in record.items():\n",
    "            rec_dict = {}\n",
    "            encoded_fields_map = encoded_params.get(rec_key, None)\n",
    "            for rec_item in rec_val:\n",
    "                rec_split = rec_item.split('=', maxsplit=1)\n",
    "                if len(rec_split) == 1:\n",
    "                    rec_dict[rec_split[0]] = None\n",
    "                    continue\n",
    "                if (not encoded_fields_map or rec_split[1].startswith('\"') or\n",
    "                        rec_split[0] not in encoded_fields_map):\n",
    "                    field_value = rec_split[1].strip('\\\"')\n",
    "                else:\n",
    "                    try:\n",
    "                        field_value = codecs.decode(rec_split[1], 'hex').decode('utf-8')\n",
    "                    except:\n",
    "                        field_value = rec_split[1]\n",
    "                        print(rec_val)\n",
    "                        print('ERR:', rec_key, rec_split[0], rec_split[1], type(rec_split[1]))\n",
    "                rec_dict[rec_split[0]] = field_value\n",
    "            event_dict[rec_key] = rec_dict\n",
    "        \n",
    "    return event_dict\n",
    "\n",
    "USER_START = {'pid': 'int', 'uid': 'int', 'auid': 'int', \n",
    "              'ses': 'int', 'msg': None, 'acct': None, 'exe': None, \n",
    "              'hostname': None, 'addr': None, 'terminal': None, \n",
    "              'res': None}\n",
    "FIELD_DEFS = {'SYSCALL': {'success': None, 'ppid': 'int', 'pid': 'int', \n",
    "                          'auid': 'int', 'uid': 'int', 'gid': 'int',\n",
    "                          'euid': 'int', 'egid': 'int', 'ses': 'int',\n",
    "                          'exe': None, 'com': None},\n",
    "              'CWD': {'cwd': None},\n",
    "              'PROCTITLE': {'proctitle': None},\n",
    "              'LOGIN': {'pid': 'int', 'uid': 'int', 'tty': None, 'old-ses': 'int', \n",
    "                        'ses': 'int', 'res': None},\n",
    "              'EXECVE': {'argc': 'int', 'a0': None, 'a1': None, 'a2': None},\n",
    "              'USER_START': USER_START,\n",
    "              'USER_END': USER_START,\n",
    "              'CRED_DISP': USER_START,\n",
    "              'USER_ACCT': USER_START,\n",
    "              'CRED_ACQ': USER_START,\n",
    "              'USER_CMD': {'pid': 'int', 'uid': 'int', 'auid': 'int', \n",
    "                           'ses': 'int', 'msg': None, 'cmd': None,\n",
    "                           'terminal': None, 'res': None},\n",
    "             }\n",
    "\n",
    "def extract_event(message_dict):\n",
    "    if 'SYSCALL' in message_dict:\n",
    "        proc_create_dict = {}\n",
    "        for mssg_type in ['SYSCALL', 'CWD', 'EXECVE', 'PROCTITLE']:\n",
    "            if (not mssg_type in message_dict or\n",
    "                    not mssg_type in FIELD_DEFS) :\n",
    "                continue\n",
    "            for fieldname, conv in FIELD_DEFS[mssg_type].items():\n",
    "                value = message_dict[mssg_type].get(fieldname, None)\n",
    "                if not value:\n",
    "                    continue\n",
    "                if conv:\n",
    "                    if conv == 'int':\n",
    "                        value = int(value)\n",
    "                        if value == 4294967295:\n",
    "                            value = -1\n",
    "                proc_create_dict[fieldname] = value\n",
    "            if mssg_type == 'EXECVE':\n",
    "                args = int(proc_create_dict.get('argc', 1))\n",
    "                arg_strs = []\n",
    "                for arg_idx in range(0, args):\n",
    "                    arg_strs.append(proc_create_dict.get(f'a{arg_idx}', ''))\n",
    "                    \n",
    "                proc_create_dict['cmdline'] = ' '.join(arg_strs)\n",
    "        return 'SYSCALL', proc_create_dict\n",
    "    else:\n",
    "        event_dict = {}                                            \n",
    "        for mssg_type, mssg in message_dict.items():\n",
    "            if mssg_type in FIELD_DEFS:\n",
    "                for fieldname, conv in FIELD_DEFS[mssg_type].items():\n",
    "                    value = message_dict[mssg_type].get(fieldname, None)\n",
    "                    if conv:\n",
    "                        if conv == 'int':\n",
    "                            value = int(value)\n",
    "                            if value == 4294967295:\n",
    "                                value = -1\n",
    "                    event_dict[fieldname] = value\n",
    "            else:\n",
    "                \n",
    "                event_dict.update(message_dict[mssg_type])\n",
    "        return list(message_dict.keys())[0], event_dict\n",
    "\n",
    "    \n",
    "def move_cols_to_front(df, column_count):\n",
    "    \"\"\"Reorder columms to put the last column count cols to front.\"\"\"\n",
    "    return df[list(df.columns[-column_count:]) + list(df.columns[:-column_count])]\n",
    "\n",
    "\n",
    "def extract_events_to_df(data, event_type=None, verbose=False):\n",
    "    \n",
    "    if verbose:\n",
    "        start_time = datetime.utcnow()\n",
    "        print(f'Unpacking auditd messages for {len(data)} events...')\n",
    "    tmp_df = (data.apply(lambda x: extract_event(unpack_auditd(x.AuditdMessage)), \n",
    "                         axis=1, result_type='expand')\n",
    "                  .rename(columns={0: 'EventType', \n",
    "                                   1: 'EventData'})\n",
    "                  )\n",
    "    # if only one type of event is requested\n",
    "    if event_type:\n",
    "        tmp_df = tmp_df[tmp_df['EventType'] == event_type]\n",
    "        if verbose:\n",
    "            print(f'Event subset = ', event_type, ' (events: {len(tmp_df)})')\n",
    "    \n",
    "    if verbose:\n",
    "        print('Building output dataframe...')\n",
    "        \n",
    "    tmp_df = (tmp_df.apply(lambda x: pd.Series(x.EventData), axis=1)\n",
    "              .merge(tmp_df[['EventType']], left_index=True, right_index=True)\n",
    "              .merge(data.drop(['AuditdMessage'], axis=1), \n",
    "                 how='inner', left_index=True, right_index=True)\n",
    "              .dropna(axis=1, how='all'))\n",
    "    \n",
    "    if verbose:\n",
    "        print('Fixing timestamps...')\n",
    "        \n",
    "    # extract real timestamp from mssg_id\n",
    "    tmp_df['TimeStamp'] = (tmp_df.apply(lambda x:\n",
    "                                        datetime.utcfromtimestamp(float(x.mssg_id.split(':')[0])),\n",
    "                                        axis=1))\n",
    "    tmp_df = (tmp_df.drop(['TimeGenerated'], axis=1)\n",
    "                    .rename(columns={'TimeStamp': 'TimeGenerated'})\n",
    "                    .pipe(move_cols_to_front, column_count=5))\n",
    "    if verbose:\n",
    "        print(f'Complete. {len(tmp_df)} output rows', end=' ')\n",
    "        delta = datetime.utcnow() - start_time\n",
    "        print(f'time: {delta.seconds + delta.microseconds/1_000_000} sec')\n",
    "        \n",
    "    return tmp_df\n",
    "\n",
    "\n",
    "def get_event_subset(data, event_type):\n",
    "    return (data[data['EventType'] == event_type]\n",
    "             .dropna(axis=1, how='all')\n",
    "             .infer_objects())\n",
    "\n"
   ]
  },
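  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The timestamp fix in `extract_events_to_df` relies on the first part of `mssg_id` being the event time in epoch seconds. The sketch below illustrates that parsing step, together with the unset-ID sentinel handled in `extract_event`. The sample `mssg_id` value is made up for illustration, and the `<epoch seconds>:<serial>` layout is an assumption based on how the code above splits the field.\n",
    "\n",
    "```python\n",
    "from datetime import datetime\n",
    "\n",
    "# Hypothetical mssg_id value, not taken from real data\n",
    "sample_mssg_id = '1546300800.123:4567'\n",
    "epoch_secs, serial = sample_mssg_id.split(':')\n",
    "event_time = datetime.utcfromtimestamp(float(epoch_secs))\n",
    "print(event_time, serial)\n",
    "\n",
    "# auditd reports an unset 32-bit ID (for example auid) as 2**32 - 1\n",
    "UNSET_ID = 2**32 - 1\n",
    "raw_auid = '4294967295'\n",
    "auid = -1 if int(raw_auid) == UNSET_ID else int(raw_auid)\n",
    "print(auid)\n",
    "```"
   ]
  },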
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "linux_events_all = extract_events_to_df(linux_events_df, verbose=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='linux_event_types'></a>[Contents](#toc)\n",
    "### Event Types collected"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "sns.set()\n",
    "(linux_events_all[['EventType', 'TimeGenerated']]\n",
    "     .groupby('EventType').count().rename(columns={'TimeGenerated': 'EventCount'})\n",
    "     .sort_values('EventCount', ascending=True)\n",
    "     .plot.barh(logx=True, figsize=(12,6)));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "View events by Type - Process (SYSCALL) and Login events are covered in more detail below.\n",
    "Use this to look at some of the rarer event types to see anything unusual."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "from ipywidgets import interactive\n",
    "\n",
    "items = sorted(linux_events_all['EventType'].unique().tolist())\n",
    "\n",
    "def view(x=''):\n",
    "    display(linux_events_all[linux_events_all['EventType']==x]\n",
    "            .drop(['EventType', 'TenantId', 'Computer', 'mssg_id'], axis=1)\n",
    "            .dropna(axis=1, how='all'))\n",
    "\n",
    "w = widgets.Select(options=items, description='Select Event Type', **WIDGET_DEFAULTS)\n",
    "interactive(view, x=w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Extract Individual Event Types"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "lx_proc_create = get_event_subset(linux_events_all,'SYSCALL')\n",
    "print(f'{len(lx_proc_create)} Process Create Events')\n",
    "\n",
    "lx_login = (get_event_subset(linux_events_all, 'LOGIN')\n",
    "        .merge(get_event_subset(linux_events_all, 'CRED_ACQ'), \n",
    "               how='inner',\n",
    "               left_on=['old-ses', 'pid', 'uid'], \n",
    "               right_on=['ses', 'pid', 'uid'],\n",
    "               suffixes=('', '_cred')).drop(['old-ses','TenantId_cred', \n",
    "                                             'Computer_cred'], axis=1)\n",
    "        .dropna(axis=1, how='all'))\n",
    "print(f'{len(lx_login)} Login Events')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='linux_failure_events'></a>[Contents](#toc)\n",
    "### Failure Events\n",
    "Can sometimes tell us about attempts to probe around the system that haven't quite worked.\n",
    "Login failures will show up here as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "lx_fail_events = (linux_events_all[linux_events_all['res'] == \"failed'\"]\n",
    "                    .drop(['TenantId', 'mssg_id'], axis=1)\n",
    "                    .dropna(axis=1, how='all'))\n",
    "if len(lx_fail_events) > 0:\n",
    "    display(lx_fail_events)\n",
    "    add_observation(Observation(caption='Failure events on Linux host.',\n",
    "                               description='One or more failure events detected on host.',\n",
    "                               item=lx_fail_events,\n",
    "                               link='linux_failure_events'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='linux_extract_ips'></a>[Contents](#toc)\n",
    "### Extract IPs from all Events"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "# Search all events for addr with an IPAddress. \n",
    "# Drop duplicates and localhost and return list\n",
    "events_with_ips = (linux_events_all[['EventType','addr']]\n",
    "                   [linux_events_all['addr'].str.contains('\\.', na=False)]\n",
    "                   .drop_duplicates())\n",
    "display(events_with_ips)\n",
    "host_ext_ips = list(events_with_ips['addr'].drop_duplicates().to_dict().values())\n",
    "if '127.0.0.1' in host_ext_ips:\n",
    "    host_ext_ips.remove('127.0.0.1')\n",
    "display(host_ext_ips)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='linux_login_ips'></a>[Contents](#toc)\n",
    "### Get Logins with IP Address Recorded"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "logins_with_ips = (lx_login[lx_login['addr'] != '?']\n",
    "                   [['Computer', 'TimeGenerated','pid', 'ses', \n",
    "                     'acct', 'addr', 'exe', 'hostname', 'msg',\n",
    "                     'res_cred', 'ses_cred', 'terminal']])\n",
    "if len(logins_with_ips) > 0:\n",
    "    display(logins_with_ips)\n",
    "    add_observation(Observation(caption='Login events with source Ip addresses',\n",
    "                                description=f'{len(logins_with_ips)} logins with external addresses',\n",
    "                                item=logins_with_ips,\n",
    "                                link='linux_login_ips'))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='linux_login_sessions'></a>[Contents](#toc)\n",
    "### What's happening in these sessions?\n",
    "If there are a lot of events here try the [Process Clustering](#linux_proc_cluster) section below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# copy/replace this value with the ses/SubjectLogonId value\n",
    "\n",
    "\n",
    "items = sorted(lx_login[lx_login['addr'] != '?']['ses'].unique().tolist())\n",
    "\n",
    "def view(x=''):\n",
    "    procs = (lx_proc_create[lx_proc_create['ses']==x]\n",
    "                [['TimeGenerated', 'exe','cmdline', 'pid','cwd']])\n",
    "    display(Markdown(f'{len(procs)} process events'))\n",
    "    display(procs)\n",
    "\n",
    "w = widgets.Select(options=items, description='Select Session', **WIDGET_DEFAULTS)\n",
    "interactive(view, x=w)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Convert Auditd to Windows-like events"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "lx_to_proc_create = {'acct': 'SubjectUserName',\n",
    "                     'uid': 'SubjectUserSid',\n",
    "                     'user': 'SubjectUserName',\n",
    "                     'ses': 'SubjectLogonId',\n",
    "                     'pid': 'NewProcessId',\n",
    "                     'exe': 'NewProcessName',\n",
    "                     'ppid': 'ProcessId',\n",
    "                     'cmdline': 'CommandLine',}\n",
    "\n",
    "proc_create_to_lx = {'SubjectUserName': 'acct',\n",
    "                     'SubjectUserSid': 'uid',\n",
    "                     'SubjectUserName': 'user',\n",
    "                     'SubjectLogonId': 'ses',\n",
    "                     'NewProcessId': 'pid',\n",
    "                     'NewProcessName': 'exe',\n",
    "                     'ProcessId': 'ppid',\n",
    "                     'CommandLine': 'cmdline',}\n",
    "\n",
    "lx_to_logon = {'acct': 'SubjectUserName',\n",
    "               'auid': 'SubjectUserSid',\n",
    "               'user': 'TargetUserName',\n",
    "               'uid': 'TargetUserSid',\n",
    "               'ses': 'TargetLogonId',\n",
    "               'exe': 'LogonProcessName',\n",
    "               'terminal': 'LogonType',\n",
    "               'msg': 'AuthenticationPackageName',\n",
    "               'res': 'Status',\n",
    "               'addr': 'IpAddress',\n",
    "               'hostname': 'WorkstationName',}\n",
    "\n",
    "logon_to_lx = {'SubjectUserName': 'acct',\n",
    "               'SubjectUserSid': 'auid',\n",
    "               'TargetUserName': 'user',\n",
    "               'TargetUserSid': 'uid',\n",
    "               'TargetLogonId': 'ses',\n",
    "               'LogonProcessName': 'exe',\n",
    "               'LogonType': 'terminal',\n",
    "               'AuthenticationPackageName': 'msg',\n",
    "               'Status': 'res',\n",
    "               'IpAddress': 'addr',\n",
    "               'WorkstationName': 'hostname',}\n",
    "\n",
    "lx_proc_create_trans = lx_proc_create.rename(columns=lx_to_proc_create)\n",
    "lx_login_trans = lx_login.rename(columns=lx_to_logon)"
   ]
  },
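  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The hand-written reverse dictionaries above can also be derived from the forward ones. The sketch below uses a small made-up DataFrame (the column values are hypothetical, as are the names `lx_to_win` and `win_to_lx`) to show `DataFrame.rename` with such a mapping and a derived inverse. Where several auditd fields share one Windows name, the inverse can only keep one of them.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Subset of the auditd-to-Windows column mapping, for illustration only\n",
    "lx_to_win = {'ses': 'SubjectLogonId', 'pid': 'NewProcessId', 'exe': 'NewProcessName'}\n",
    "# Invert the mapping; this is only lossless when the values are unique\n",
    "win_to_lx = {win: lx for lx, win in lx_to_win.items()}\n",
    "\n",
    "sample = pd.DataFrame({'ses': [3], 'pid': [1234], 'exe': ['/bin/ls']})\n",
    "renamed = sample.rename(columns=lx_to_win)\n",
    "print(list(renamed.columns))\n",
    "print(list(renamed.rename(columns=win_to_lx).columns))\n",
    "```"
   ]
  },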
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='linux_proc_cluster'></a>[Contents](#toc)\n",
    "## Find Distinctive Process Patterns - Clustering"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "                   \n",
    "print('analyzing data...')\n",
    "from msticpy.sectools.eventcluster import dbcluster_events, add_process_features\n",
    "\n",
    "feature_procs_h1 = add_process_features(input_frame=lx_proc_create_trans,\n",
    "                                        path_separator=security_alert.path_separator)\n",
    "\n",
    "\n",
    "# you might need to play around with the max_cluster_distance parameter.\n",
    "# decreasing this gives more clusters.\n",
    "(clus_events, dbcluster, x_data) = dbcluster_events(data=feature_procs_h1,\n",
    "                                                    cluster_columns=['commandlineTokensFull', \n",
    "                                                                     'pathScore',\n",
    "                                                                    'SubjectUserSid'],\n",
    "                                                    time_column='TimeGenerated',\n",
    "                                                    max_cluster_distance=0.0001)\n",
    "print('Number of input events:', len(feature_procs_h1))\n",
    "print('Number of clustered events:', len(clus_events))\n",
    "(clus_events.sort_values('TimeGenerated')[['TimeGenerated', 'LastEventTime',\n",
    "                                           'NewProcessName', 'CommandLine', \n",
    "                                           'ClusterSize', 'commandlineTokensFull',\n",
    "                                           'SubjectLogonId', 'SubjectUserSid',\n",
    "                                           'pathScore', 'isSystemSession']]\n",
    "    .sort_values('ClusterSize', ascending=True));"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "def view(x=''):\n",
    "    procs = (clus_events[clus_events['SubjectLogonId']==x]\n",
    "            [['TimeGenerated', 'NewProcessName','CommandLine', \n",
    "              'NewProcessId', 'SubjectUserSid', 'cwd', 'ClusterSize', 'SubjectLogonId']])\n",
    "    display(Markdown(f'{len(procs)} process events'))\n",
    "    display(procs)\n",
    "\n",
    "w = widgets.Select(options=items, description='Select Session to view', **WIDGET_DEFAULTS)\n",
    "interactive(view, x=w)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "selected_session = w.value\n",
    "add_observation(Observation(caption='Suspicious Process Session on Linux Host.',\n",
    "                            description='Attempt to download and run script + recon cmds.',\n",
    "                            item = clus_events.query('SubjectLogonId == @selected_session & ClusterSize < 3'),\n",
    "                            link='linux_proc_cluster'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='alert_host_net'></a>[Contents](#toc)\n",
    "# Alert Host Network Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get the IP Address of the Source Host"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "host_entities = [e for e in security_alert.entities if isinstance(e, mas.Host)]\n",
    "if len(host_entities) == 1:\n",
    "    alert_host_entity = host_entities[0]\n",
    "    host_name = alert_host_entity.HostName\n",
    "    resource = alert_host_entity.AzureID\n",
    "else:\n",
    "    host_name = None\n",
    "    alert_host_entity = None\n",
    "    print('Error: Could not determine host entity from alert. Please type the hostname below')\n",
    "txt_wgt = widgets.Text(value=host_name, description='Confirm Source Host name:', **WIDGET_DEFAULTS)\n",
    "display(txt_wgt)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "             \n",
    "\n",
    "print('Looking for IP addresses of ', txt_wgt.value)\n",
    "aznet_query = '''\n",
    "AzureNetworkAnalytics_CL \n",
    "| where VirtualMachine_s has \\'{host}\\'\n",
    "| where ResourceType == 'NetworkInterface'\n",
    "| top 1 by TimeGenerated desc\n",
    "| project PrivateIPAddresses = PrivateIPAddresses_s, \n",
    "    PublicIPAddresses = PublicIPAddresses_s\n",
    "'''.format(host=txt_wgt.value)\n",
    "%kql -query aznet_query\n",
    "az_net_df = _kql_raw_result_.to_dataframe()\n",
    "\n",
    "\n",
    "oms_heartbeat_query = '''\n",
    "Heartbeat\n",
    "| where Computer has \\'{host}\\'\n",
    "| top 1 by TimeGenerated desc nulls last\n",
    "| project ComputerIP, OSType, OSMajorVersion, OSMinorVersion, ResourceId, RemoteIPCountry, \n",
    "RemoteIPLatitude, RemoteIPLongitude, SourceComputerId\n",
    "'''.format(host=txt_wgt.value)\n",
    "%kql -query oms_heartbeat_query\n",
    "oms_heartbeat_df = _kql_raw_result_.to_dataframe()\n",
    "display(oms_heartbeat_df[['ComputerIP']])\n",
    "display(az_net_df)\n",
    "\n",
    "print('getting data...')\n",
    "# Get the host entity and add this IP and system info to the \n",
    "try:\n",
    "    if not inv_host_entity:\n",
    "        inv_host_entity = mas.Host()\n",
    "        inv_host_entity.HostName = host_name\n",
    "except NameError:\n",
    "    inv_host_entity = mas.Host()\n",
    "    inv_host_entity.HostName = host_name\n",
    "\n",
    "def convert_to_ip_entities(ip_str):\n",
    "    ip_entities = []\n",
    "    if ip_str:\n",
    "        if ',' in ip_str:\n",
    "            addrs = ip_str.split(',')\n",
    "        elif ' ' in ip_str:\n",
    "            addrs = ip_str.split(' ')\n",
    "        else:\n",
    "            addrs = [ip_str]\n",
    "        for addr in addrs:\n",
    "            ip_entity = mas.IpAddress()\n",
    "            ip_entity.Address = addr.strip()\n",
    "            iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "            ip_entities.append(ip_entity)\n",
    "    return ip_entities\n",
    "\n",
    "# Add this information to our inv_host_entity\n",
    "if len(az_net_df) == 1:\n",
    "    priv_addr_str = az_net_df['PrivateIPAddresses'].loc[0]\n",
    "    inv_host_entity.properties['private_ips'] = convert_to_ip_entities(priv_addr_str)\n",
    "\n",
    "    pub_addr_str = az_net_df['PublicIPAddresses'].loc[0]\n",
    "    inv_host_entity.properties['public_ips'] = convert_to_ip_entities(pub_addr_str)\n",
    "\n",
    "retrieved_address = [ip.Address for ip in inv_host_entity.properties['public_ips']]\n",
    "if len(oms_heartbeat_df) == 1:\n",
    "    if oms_heartbeat_df['ComputerIP'].loc[0]:\n",
    "        oms_address = oms_heartbeat_df['ComputerIP'].loc[0]\n",
    "        if oms_address not in retrieved_address:\n",
    "            ip_entity = mas.IpAddress()\n",
    "            ip_entity.Address = oms_address\n",
    "            iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "            inv_host_entity.properties['public_ips'].append(ip_entity)\n",
    "        \n",
    "    inv_host_entity.OSFamily = oms_heartbeat_df['OSType'].loc[0]\n",
    "    inv_host_entity.AdditionalData['OSMajorVersion'] = oms_heartbeat_df['OSMajorVersion'].loc[0]\n",
    "    inv_host_entity.AdditionalData['OSMinorVersion'] = oms_heartbeat_df['OSMinorVersion'].loc[0]\n",
    "    inv_host_entity.AdditionalData['SourceComputerId'] = oms_heartbeat_df['SourceComputerId'].loc[0]\n",
    "\n",
    "print('Updated Host Entity\\n')\n",
    "print(inv_host_entity)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='comms_to_other_hosts'></a>[Contents](#toc)\n",
    "## Check Communications with Other Hosts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "# Azure Network Analytics Base Query\n",
    "az_net_analytics_query =r'''\n",
    "AzureNetworkAnalytics_CL \n",
    "| where SubType_s == 'FlowLog'\n",
    "| where FlowStartTime_t >= datetime({start})\n",
    "| where FlowEndTime_t <= datetime({end})\n",
    "| project TenantId, TimeGenerated, \n",
    "    FlowStartTime = FlowStartTime_t, \n",
    "    FlowEndTime = FlowEndTime_t, \n",
    "    FlowIntervalEndTime = FlowIntervalEndTime_t, \n",
    "    FlowType = FlowType_s,\n",
    "    ResourceGroup = split(VM_s, '/')[0],\n",
    "    VMName = split(VM_s, '/')[1],\n",
    "    VMIPAddress = VMIP_s, \n",
    "    PublicIPs = extractall(@\"([\\d\\.]+)[|\\d]+\", dynamic([1]), PublicIPs_s),\n",
    "    SrcIP = SrcIP_s,\n",
    "    DestIP = DestIP_s,\n",
    "    ExtIP = iif(FlowDirection_s == 'I', SrcIP_s, DestIP_s),\n",
    "    L4Protocol = L4Protocol_s, \n",
    "    L7Protocol = L7Protocol_s, \n",
    "    DestPort = DestPort_d, \n",
    "    FlowDirection = FlowDirection_s,\n",
    "    AllowedOutFlows = AllowedOutFlows_d, \n",
    "    AllowedInFlows = AllowedInFlows_d,\n",
    "    DeniedInFlows = DeniedInFlows_d, \n",
    "    DeniedOutFlows = DeniedOutFlows_d,\n",
    "    RemoteRegion = AzureRegion_s,\n",
    "    VMRegion = Region_s\n",
    "| extend AllExtIPs = iif(isempty(PublicIPs), pack_array(ExtIP), \n",
    "                         iif(isempty(ExtIP), PublicIPs, array_concat(PublicIPs, pack_array(ExtIP)))\n",
    "                         )\n",
    "| project-away ExtIP\n",
    "| mvexpand AllExtIPs\n",
    "{where_clause}\n",
    "'''\n",
    "\n",
    "ip_q_times = mas.QueryTime(label='Set time bounds for network queries',\n",
    "                           units='hour', max_before=48, before=10, after=5, \n",
    "                           max_after=24, origin_time=security_alert.StartTimeUtc)\n",
    "ip_q_times.display()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query Flows by Host IP Addresses"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "                   \n",
    "\n",
    "all_alert_host_ips = inv_host_entity.private_ips + inv_host_entity.public_ips\n",
    "host_ips = {'\\'{}\\''.format(i.Address) for i in all_alert_host_ips}\n",
    "alert_host_ip_list = ','.join(host_ips)\n",
    "\n",
    "az_ip_where = f'''\n",
    "| where (VMIPAddress in ({alert_host_ip_list}) \n",
    "        or SrcIP in ({alert_host_ip_list}) \n",
    "        or DestIP in ({alert_host_ip_list}) \n",
    "        ) and \n",
    "    (AllowedOutFlows > 0 or AllowedInFlows > 0)'''\n",
    "print('getting data...')\n",
    "az_net_query_byip = az_net_analytics_query.format(where_clause=az_ip_where,\n",
    "                                                  start = ip_q_times.start,\n",
    "                                                  end = ip_q_times.end)\n",
    "\n",
    "net_default_cols = ['FlowStartTime', 'FlowEndTime', 'VMName', 'VMIPAddress', \n",
    "                'PublicIPs', 'SrcIP', 'DestIP', 'L4Protocol', 'L7Protocol',\n",
    "                'DestPort', 'FlowDirection', 'AllowedOutFlows', \n",
    "                'AllowedInFlows']\n",
    "\n",
    "%kql -query az_net_query_byip\n",
    "az_net_comms_df = _kql_raw_result_.to_dataframe()\n",
    "az_net_comms_df[net_default_cols]"
   ]
  },
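  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `where` clause above is built by quoting each host address and joining the results with commas. The standalone sketch below shows that step with made-up addresses. Sorting the set is an addition here, made only so that the generated text is stable from run to run.\n",
    "\n",
    "```python\n",
    "# Hypothetical addresses, including a duplicate that the set removes\n",
    "addresses = ['10.0.0.4', '203.0.113.7', '10.0.0.4']\n",
    "quoted = sorted({f\"'{addr}'\" for addr in addresses})\n",
    "ip_list = ','.join(quoted)\n",
    "\n",
    "where_clause = f'| where SrcIP in ({ip_list}) or DestIP in ({ip_list})'\n",
    "print(where_clause)\n",
    "```\n",
    "\n",
    "If the host entity has no addresses, the list is empty and the resulting `in ()` clause is unlikely to be a valid query, so it is worth checking the list before running the cell."
   ]
  },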
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"net_flow_graphs\"></a>\n",
    "### Flow Time and Protocol Distribution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "import warnings\n",
    "\n",
    "                   \n",
    "\n",
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter(\"ignore\")\n",
    "    \n",
    "    az_net_comms_df['TotalAllowedFlows'] = az_net_comms_df['AllowedOutFlows'] + az_net_comms_df['AllowedInFlows']\n",
    "    sns.catplot(x=\"L7Protocol\", y=\"TotalAllowedFlows\", col=\"FlowDirection\", data=az_net_comms_df)\n",
    "    sns.relplot(x=\"FlowStartTime\", y=\"TotalAllowedFlows\", \n",
    "                col=\"FlowDirection\", kind=\"line\", \n",
    "                hue=\"L7Protocol\", data=az_net_comms_df).set_xticklabels(rotation=50)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Isolated SSH traffic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "az_net_comms_df.query('FlowDirection == \\'I\\' & L7Protocol == \\'ssh\\'')[net_default_cols]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Seems suspicious, so Record findings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "ext_ip_list = az_net_comms_df.query('FlowDirection == \\'I\\' & L7Protocol == \\'ssh\\'')['AllExtIPs'].tolist()\n",
    "\n",
    "for ip in ext_ip_list:\n",
    "    if not ip:\n",
    "        continue\n",
    "    # Check IP is not already in our list of entities\n",
    "    if ip in [curr_ip.Address for curr_ip in alert_ip_entities]:\n",
    "        continue\n",
    "    ip_entity = mas.IpAddress(Address=ip)\n",
    "    iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "    \n",
    "    alert_ip_entities.append(ip_entity)\n",
    "    \n",
    "add_observation(Observation(caption='Outlier SSH session on Linux Host.',\n",
    "                            description='''Plot of in/out flows shows unexpected ssh inbound. \n",
    "Ip Address confirmed as logon source for SSH.''',\n",
    "                            item = az_net_comms_df.query('FlowDirection == \\'I\\' & L7Protocol == \\'ssh\\''),\n",
    "                            link='net_flow_graphs'))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Interactive Flow Timeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "nbdisp.display_timeline(data=az_net_comms_df.query('AllowedOutFlows > 0'),\n",
    "                         overlay_data=az_net_comms_df.query('AllowedInFlows > 0'),\n",
    "                         alert=security_alert, title='Network Flows (out=blue, in=green)',\n",
    "                         time_column='FlowStartTime',\n",
    "                         source_columns=['FlowType', 'AllExtIPs', 'L7Protocol', 'FlowDirection'],\n",
    "                         height=300)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='geomap_lx_ips'></a>[Contents](#toc)\n",
    "## GeoLocation Mapping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "ip_locs_in = set()\n",
    "ip_locs_out = set()\n",
    "for _, row in az_net_comms_df.iterrows():\n",
    "    ip = row.AllExtIPs\n",
    "\n",
    "    if ip in ip_locs_in or ip in ip_locs_out or not ip:\n",
    "        continue\n",
    "    ip_entity = mas.IpAddress(Address=ip)\n",
    "    iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "    if not ip_entity.Location:\n",
    "        continue\n",
    "    ip_entity.AdditionalData['protocol'] = row.L7Protocol\n",
    "    if row.FlowDirection == 'I':\n",
    "        ip_locs_in.add(ip_entity)\n",
    "    else:\n",
    "        ip_locs_out.add(ip_entity)\n",
    "\n",
    "flow_map = create_ip_map()\n",
    "display(HTML('<h3>External IP Addresses communicating with host</h3>'))\n",
    "display(HTML('Numbered circles indicate multiple items - click to expand'))\n",
    "display(HTML('Location markers: Blue = outbound, Purple = inbound, Green = Host'))\n",
    "\n",
    "icon_props = {'color': 'green'}\n",
    "flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                            ip_entities=inv_host_entity.public_ips,\n",
    "                            **icon_props)\n",
    "icon_props = {'color': 'blue'}\n",
    "flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                            ip_entities=ip_locs_out,\n",
    "                            **icon_props)\n",
    "icon_props = {'color': 'purple'}\n",
    "flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                            ip_entities=ip_locs_in,\n",
    "                            **icon_props)\n",
    "\n",
    "display(flow_map)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Look at 'Denied' Flows - who's trying to get in from where?\n",
    "#### Optional and can take a long time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# Comment this out to run automatically\n",
    "if False:\n",
    "    az_ip_where = f'''\n",
    "    | where (VMIPAddress in ({alert_host_ip_list}) \n",
    "            or SrcIP in ({alert_host_ip_list}) \n",
    "            or DestIP in ({alert_host_ip_list}) \n",
    "            )'''\n",
    "\n",
    "    az_net_query_byip = az_net_analytics_query.format(where_clause=az_ip_where,\n",
    "                                                      start = ip_q_times.start,\n",
    "                                                      end = ip_q_times.end)\n",
    "    %kql -query az_net_query_byip\n",
    "    az_net_comms_all_df = _kql_raw_result_.to_dataframe()\n",
    "\n",
    "    ip_all = set()\n",
    "    ip_locs_in_allow = set()\n",
    "    ip_locs_out_allow = set()\n",
    "    ip_locs_in_deny = set()\n",
    "    ip_locs_out_deny = set()\n",
    "    for _, row in az_net_comms_all_df.iterrows():\n",
    "        if not row.PublicIPs:\n",
    "            continue\n",
    "        for ip in row.PublicIPs:\n",
    "            if ip in ip_all:\n",
    "                continue\n",
    "            ip_all.add(ip)\n",
    "            ip_entity = mas.IpAddress(Address=ip)\n",
    "            iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "            if not ip_entity.Location:\n",
    "                print(\"No location information for IP: \", ip)\n",
    "                continue\n",
    "            ip_entity.AdditionalData['protocol'] = row.L7Protocol\n",
    "            if row.FlowDirection == 'I':\n",
    "                if row.AllowedInFlows > 0:\n",
    "                    ip_locs_in_allow.add(ip_entity)\n",
    "                elif row.DeniedInFlows > 0:\n",
    "                    ip_locs_in_deny.add(ip_entity)\n",
    "            else:\n",
    "                if row.AllowedOutFlows > 0:\n",
    "                    ip_locs_out_allow.add(ip_entity)\n",
    "                elif row.DeniedOutFlows > 0:\n",
    "                    ip_locs_out_deny.add(ip_entity)\n",
    "\n",
    "    flow_map = create_ip_map()\n",
    "    display(HTML('<h3>External IP Addresses Blocked and Allowed communicating with host</h3>'))\n",
    "    display(HTML('Numbered circles indicate multiple items - click to expand.'))\n",
    "    display(HTML('Location markers: Blue = outbound, Purple = inbound, Red = in denied, Cyan = out denied.'))\n",
    "\n",
    "\n",
    "    icon_props = {'color': 'purple'}\n",
    "    flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                                ip_entities=ip_locs_in_allow,\n",
    "                                **icon_props)\n",
    "    icon_props = {'color': 'blue'}\n",
    "    flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                                ip_entities=ip_locs_out_allow,\n",
    "                                **icon_props)\n",
    "    icon_props = {'color': 'red'}\n",
    "    flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                                ip_entities=ip_locs_in_deny,\n",
    "                                **icon_props)\n",
    "    icon_props = {'color': 'cyan'}\n",
    "    flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                                ip_entities=ip_locs_out_deny,\n",
    "                                **icon_props)\n",
    "    display(flow_map)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### DNS Activity Includes any of these IPs?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "dns_query =r'''\n",
    "DnsEvents\n",
    "| where ClientIP in ({ip_list})\n",
    "'''.format(ip_list=', '.join([f'\\'{ip.Address}\\'' for ip in alert_ip_entities]))\n",
    "\n",
    "%kql -query dns_query\n",
    "dns_df = _kql_raw_result_.to_dataframe()\n",
    "dns_df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='other_hosts_to_ips'></a>[Contents](#toc)\n",
    "## Have any other hosts been communicating with this address(es)?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "ip_q_times = mas.QueryTime(units='day', max_before=10, before=3, after=1, max_after=10, origin_time=security_alert.StartTimeUtc)\n",
    "ip_q_times.display()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "alert_ips = {'\\'{}\\''.format(i.Address) for i in alert_ip_entities}\n",
    "alert_host_ip_list = ','.join(alert_ips)\n",
    "\n",
    "az_ip_where = f'| where AllExtIPs in ({alert_host_ip_list})'\n",
    "\n",
    "az_net_query_by_pub_ip = az_net_analytics_query.format(where_clause=az_ip_where,\n",
    "                                                       start = ip_q_times.start,\n",
    "                                                       end = ip_q_times.end)\n",
    "print('getting data...')\n",
    "%kql -query az_net_query_by_pub_ip\n",
    "az_net_ext_comms_df = _kql_raw_result_.to_dataframe()\n",
    "az_net_ext_comms_df[net_default_cols]\n",
    "\n",
    "# az_net_ext_comms_df.groupby(['VMName', 'L7Protocol'])['AllowedOutFlows','AllowedInFlows','DeniedInFlows','DeniedOutFlows'].sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "inv_host_ips = [ent.Address for ent in inv_host_entity.private_ips]\n",
    "inv_host_ips += [ent.Address for ent in inv_host_entity.public_ips]\n",
    "\n",
    "alert_ips = [ip.Address for ip in alert_ip_entities]\n",
    "\n",
    "known_ips = inv_host_ips + alert_ips\n",
    "\n",
    "# Ips can be in one of 4 columns!\n",
    "def find_new_ips(known_ips, row):\n",
    "    new_ips = set()\n",
    "    if row.VMIPAddress and row.VMIPAddress not in known_ips:\n",
    "        new_ips.add(row.VMIPAddress)\n",
    "    if row.SrcIP and row.SrcIP not in known_ips:\n",
    "        new_ips.add(row.SrcIP)\n",
    "    if row.DestIP and row.DestIP not in known_ips:\n",
    "        new_ips.add(row.DestIP)\n",
    "    if row.PublicIPs:\n",
    "        for pub_ip in row.PublicIPs:\n",
    "            if pub_ip not in known_ips:\n",
    "                new_ips.add(pub_ip)\n",
    "    if new_ips:            \n",
    "        return list(new_ips)\n",
    "\n",
    "new_ips_all = az_net_ext_comms_df.apply(lambda x: find_new_ips(known_ips, x), axis=1).dropna()\n",
    "new_ips = set()\n",
    "for ip in [ip for item in new_ips_all for ip in item]:\n",
    "    new_ips.add(ip)\n",
    "display(Markdown(f'#### {len(new_ips)} unseen IP Address found in this data: {list(new_ips)}'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Note you should re-run this section for each new IP Address found to determine who it belongs to"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "items = list(new_ips)\n",
    "ip_w = widgets.Select(options=items, \n",
    "                   description='Select ip address to search for',\n",
    "                   value=items[0] if items else None,\n",
    "                   **WIDGET_DEFAULTS)\n",
    "display(ip_w)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "vm_ip = ip_w.value\n",
    "aznet_query = '''\n",
    "AzureNetworkAnalytics_CL \n",
    "| where PrivateIPAddresses_s has \\'{vm_ip}\\'\n",
    "| where ResourceType == 'NetworkInterface'\n",
    "| top 1 by TimeGenerated desc\n",
    "| project PrivateIPAddresses = PrivateIPAddresses_s, \n",
    "    PublicIPAddresses = PublicIPAddresses_s,\n",
    "    VirtualMachine = VirtualMachine_s\n",
    "| extend Host = split(VirtualMachine, '/')[-1]\n",
    "'''.format(vm_ip=vm_ip)\n",
    "%kql -query aznet_query\n",
    "az_net_df = _kql_raw_result_.to_dataframe()\n",
    "if len(az_net_df) > 0:\n",
    "    host_name = az_net_df['Host'].at[0]\n",
    "\n",
    "oms_heartbeat_query = '''\n",
    "Heartbeat\n",
    "| where ComputerIP == \\'{vm_ip}\\'\n",
    "| top 1 by TimeGenerated desc nulls last\n",
    "| project Computer, ComputerIP, OSType, OSMajorVersion, OSMinorVersion, ResourceId, RemoteIPCountry, \n",
    "RemoteIPLatitude, RemoteIPLongitude, SourceComputerId\n",
    "'''.format(vm_ip=vm_ip)\n",
    "%kql -query oms_heartbeat_query\n",
    "oms_heartbeat_df = _kql_raw_result_.to_dataframe()\n",
    "if len(oms_heartbeat_df) > 0:\n",
    "    host_name = oms_heartbeat_df['Computer'].at[0]\n",
    "    \n",
    "\n",
    "# Get the host entity and add this IP and system info to the \n",
    "try:\n",
    "    if not victim_host_entity:\n",
    "        victim_host_entity = mas.Host()\n",
    "        victim_host_entity.HostName = host_name\n",
    "except NameError:\n",
    "    victim_host_entity = mas.Host()\n",
    "    victim_host_entity.HostName = host_name\n",
    "\n",
    "def convert_to_ip_entities(ip_str):\n",
    "    ip_entities = []\n",
    "    if ip_str:\n",
    "        if ',' in ip_str:\n",
    "            addrs = ip_str.split(',')\n",
    "        elif ' ' in ip_str:\n",
    "            addrs = ip_str.split(' ')\n",
    "        else:\n",
    "            addrs = [ip_str]\n",
    "        for addr in addrs:\n",
    "            ip_entity = mas.IpAddress()\n",
    "            ip_entity.Address = addr.strip()\n",
    "            iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "            ip_entities.append(ip_entity)\n",
    "    return ip_entities\n",
    "\n",
    "# Add this information to our inv_host_entity\n",
    "retrieved_pub_addresses = []\n",
    "if len(az_net_df) == 1:\n",
    "    priv_addr_str = az_net_df['PrivateIPAddresses'].loc[0]\n",
    "    victim_host_entity.properties['private_ips'] = convert_to_ip_entities(priv_addr_str)\n",
    "\n",
    "    pub_addr_str = az_net_df['PublicIPAddresses'].loc[0]\n",
    "    victim_host_entity.properties['public_ips'] = convert_to_ip_entities(pub_addr_str)\n",
    "    retrieved_pub_addresses = [ip.Address for ip in victim_host_entity.properties['public_ips']]\n",
    "    \n",
    "if len(oms_heartbeat_df) == 1:\n",
    "    if oms_heartbeat_df['ComputerIP'].loc[0]:\n",
    "        oms_address = oms_heartbeat_df['ComputerIP'].loc[0]\n",
    "        if oms_address not in retrieved_address:\n",
    "            ip_entity = mas.IpAddress()\n",
    "            ip_entity.Address = oms_address\n",
    "            iplocation.lookup_ip(ip_entity=ip_entity)\n",
    "            inv_host_entity.properties['public_ips'].append(ip_entity)\n",
    "        \n",
    "    victim_host_entity.OSFamily = oms_heartbeat_df['OSType'].loc[0]\n",
    "    victim_host_entity.AdditionalData['OSMajorVersion'] = oms_heartbeat_df['OSMajorVersion'].loc[0]\n",
    "    victim_host_entity.AdditionalData['OSMinorVersion'] = oms_heartbeat_df['OSMinorVersion'].loc[0]\n",
    "    victim_host_entity.AdditionalData['SourceComputerId'] = oms_heartbeat_df['SourceComputerId'].loc[0]\n",
    "\n",
    "print(f'Found New Host Entity {victim_host_entity.HostName}\\n')\n",
    "print(victim_host_entity)\n",
    "\n",
    "add_observation(Observation(caption=f'Second victim host identified {victim_host_entity.HostName}',\n",
    "                            description='Description of host entity shown in attachment.',\n",
    "                            item=victim_host_entity,\n",
    "                            link='other_hosts_to_ips'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "sns.set()\n",
    "from matplotlib import MatplotlibDeprecationWarning\n",
    "warnings.simplefilter(\"ignore\", category=MatplotlibDeprecationWarning)\n",
    "\n",
    "ip_graph = nx.DiGraph(id='IPGraph')\n",
    "\n",
    "def add_vm_node(graph, host_entity):\n",
    "    vm_name = host_entity.HostName\n",
    "    vm_ip = host_entity.private_ips[0].Address\n",
    "    vm_desc = f'{host_entity.HostName}\\n{row.ResourceGroup}, {row.VMRegion}'\n",
    "    ip_graph.add_node(vm_ip, name=vm_name, description=vm_desc,\n",
    "                      node_type='host')\n",
    "\n",
    "for ip_entity in alert_ip_entities:\n",
    "    if 'Location' in ip_entity:\n",
    "        ip_desc = f'{ip_entity.Address}\\n{ip_entity.Location.City}, {ip_entity.Location.CountryName}'\n",
    "    else:\n",
    "        ip_desc = 'unknown location'\n",
    "    ip_graph.add_node(ip_entity.Address, name=ip_entity.Address, description=ip_desc, node_type='ip')\n",
    "\n",
    "add_vm_node(ip_graph, inv_host_entity)\n",
    "add_vm_node(ip_graph, victim_host_entity)\n",
    "\n",
    "\n",
    "def add_edges(graph, row): \n",
    "    dest_ip = row.DestIP if row.DestIP else row.VMIPAddress\n",
    "    if row.SrcIP:\n",
    "        src_ip = row.SrcIP\n",
    "        ip_graph.add_edge(src_ip, dest_ip)\n",
    "    else:\n",
    "        for ip in row.PublicIPs:\n",
    "            src_ip = ip\n",
    "            ip_graph.add_edge(src_ip, dest_ip)\n",
    "\n",
    "    \n",
    "# Add edges from network data\n",
    "az_net_ext_comms_df.apply(lambda x: add_edges(ip_graph, x),axis=1)\n",
    "\n",
    "src_node = [n for (n, node_type) in\n",
    "            nx.get_node_attributes(ip_graph, 'node_type').items()\n",
    "            if node_type == 'ip']\n",
    "vm_nodes = [n for (n, node_type) in\n",
    "            nx.get_node_attributes(ip_graph, 'node_type').items()\n",
    "            if node_type == 'host']\n",
    "\n",
    "# now draw them in subsets  using the `nodelist` arg\n",
    "plt.rcParams['figure.figsize'] = (10, 10)\n",
    "plt.margins(x=0.3, y=0.3)\n",
    "plt.title('Comms between hosts and suspect IPs')\n",
    "pos = nx.circular_layout(ip_graph)\n",
    "nx.draw_networkx_nodes(ip_graph, pos, nodelist=src_node,\n",
    "                       node_color='red', alpha=0.5, node_shape='o')\n",
    "nx.draw_networkx_nodes(ip_graph, pos, nodelist=vm_nodes,\n",
    "                       node_color='green', alpha=0.5, node_shape='s',\n",
    "                       s=400)\n",
    "nlabels = nx.get_node_attributes(ip_graph, 'description')\n",
    "nx.relabel_nodes(ip_graph, nlabels)\n",
    "nx.draw_networkx_labels(ip_graph, pos, nlabels, font_size=15)\n",
    "nx.draw_networkx_edges(ip_graph, pos, alpha=0.5, arrows=True, arrowsize=20);\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='other_host_investigate'></a>[Contents](#toc)\n",
    "# Other Hosts Communicating with IP"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='host_logons'></a>[Contents](#toc)\n",
    "## Check Host Logons"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "from msticpy.nbtools.query_defns import DataFamily, DataEnvironment\n",
    "params_dict = {}\n",
    "params_dict.update(security_alert.query_params)\n",
    "params_dict['host_filter_eq'] = f'Computer has \\'{victim_host_entity.HostName}\\''\n",
    "params_dict['host_filter_neq'] = f'Computer !has \\'{victim_host_entity.HostName}\\''\n",
    "params_dict['host_name'] = victim_host_entity.HostName\n",
    "if victim_host_entity.OSFamily == 'Linux':\n",
    "    params_dict['data_family'] = DataFamily.LinuxSecurity\n",
    "    params_dict['path_separator'] = '/'\n",
    "else:\n",
    "    params_dict['data_family'] = DataFamily.WindowsSecurity\n",
    "    params_dict['path_separator'] = '\\\\'\n",
    "\n",
    "# set the origin time to the time of our alert\n",
    "logon_query_times = mas.QueryTime(units='day', origin_time=security_alert.origin_time,\n",
    "                                  before=5, after=1, max_before=20, max_after=20)\n",
    "logon_query_times.display()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "from msticpy.sectools.eventcluster import dbcluster_events, add_process_features, _string_score\n",
    "\n",
    "host_logons = qry.list_host_logons(provs=[logon_query_times], **params_dict)\n",
    "\n",
    "\n",
    "if len(host_logons) > 0:\n",
    "    logon_features = host_logons.copy()\n",
    "    logon_features['AccountNum'] = host_logons.apply(lambda x: _string_score(x.Account), axis=1)\n",
    "    logon_features['LogonIdNum'] = host_logons.apply(lambda x: _string_score(x.TargetLogonId), axis=1)\n",
    "    logon_features['LogonHour'] = host_logons.apply(lambda x: x.TimeGenerated.hour, axis=1)\n",
    "\n",
    "    # you might need to play around with the max_cluster_distance parameter.\n",
    "    # decreasing this gives more clusters.\n",
    "    (clus_logons, _, _) = dbcluster_events(data=logon_features, time_column='TimeGenerated',\n",
    "                                           cluster_columns=['AccountNum',\n",
    "                                                            'LogonType'],\n",
    "                                           max_cluster_distance=0.0001)\n",
    "    %matplotlib inline\n",
    "    plt.rcParams['figure.figsize'] = (12, 4)\n",
    "    clus_logons.plot.barh(x=\"Account\", y=\"ClusterSize\")\n",
    "\n",
    "    display(Markdown(f'Number of input events: {len(host_logons)}'))\n",
    "    display(Markdown(f'Number of clustered events: {len(clus_logons)}'))\n",
    "    display(Markdown('#### Distinct host logon patterns'))\n",
    "    clus_logons.sort_values('TimeGenerated')\n",
    "    nbdisp.display_logon_data(clus_logons)\n",
    "else:\n",
    "    display(Markdown('No logon events found for host.'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='victim2_logon_types'></a>\n",
    "### Classification of Logon Types by Account"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "display(Markdown('### Counts of logon events by logon type.'))\n",
    "display(Markdown('Min counts for each logon type highlighted.'))\n",
    "logon_by_type = (host_logons[['Account', 'LogonType', 'EventID']]\n",
    "                .groupby(['Account','LogonType']).count().unstack()\n",
    "                .fillna(0)\n",
    "                .style\n",
    "                .background_gradient(cmap='viridis', low=.5, high=0)\n",
    "                .format(\"{0:0>3.0f}\"))\n",
    "display(logon_by_type)\n",
    "key = 'logon type key = {}'.format('; '.join([f'{k}: {v}' for k,v in mas.nbdisplay._WIN_LOGON_TYPE_MAP.items()]))\n",
    "display(Markdown(key))\n",
    "\n",
    "display(Markdown('### Logon Timeline.'))\n",
    "nbdisp.display_timeline(data=host_logons,\n",
    "                         overlay_data=host_logons.query('LogonType == 10'),\n",
    "                         alert=security_alert, \n",
    "                         source_columns=['Account', 'LogonType', 'TimeGenerated'],\n",
    "                         title='All Host Logons (RDP Logons in green)')\n",
    "\n",
    "add_observation(Observation(caption='RDP Logons seen for victim #2',\n",
    "                            description='Logons by logon type.',\n",
    "                            item=logon_by_type,\n",
    "                            link='victim2_logon_types'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hidden": true
   },
   "source": [
    "<a id='failed_logons'></a>[Contents](#toc)\n",
    "## Check for Failed Logons"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hidden": true,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "failedLogons = qry.list_host_logon_failures(provs=[logon_query_times], **params_dict)\n",
    "if failedLogons.shape[0] == 0:\n",
    "    display(print('No logon failures recorded for this host between {security_alert.start} and {security_alert.start}'))\n",
    "else:\n",
    "    display(failedLogons)\n",
    "    add_observation(Observation(caption='Logon failures seen for victim #2',\n",
    "                            description=f'{len(failedLogons)} Logons seen.',\n",
    "                            item=failedLogons,\n",
    "                            link='failed_logons'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='examine_win_logon_sess'></a>[Contents](#toc)\n",
    "## Examine a Logon Session"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Select a Logon ID To Examine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "import re\n",
    "dist_logons = clus_logons.sort_values('TimeGenerated')[['TargetUserName', 'TimeGenerated', \n",
    "                                                        'LastEventTime', 'LogonType', \n",
    "                                                        'ClusterSize']]\n",
    "items = dist_logons.apply(lambda x: (f'{x.TargetUserName}:    '\n",
    "                                     f'(logontype={x.LogonType})   '\n",
    "                                     f'timerange={x.TimeGenerated} - {x.LastEventTime}    '\n",
    "                                     f'count={x.ClusterSize}'),\n",
    "                          axis=1).values.tolist()\n",
    "def get_selected_logon_cluster(selected_item):\n",
    "    acct_match = re.search(r'(?P<acct>[^:]+):\\s+\\(logontype=(?P<l_type>[^)]+)', selected_item)\n",
    "    if acct_match:\n",
    "        acct = acct_match['acct']\n",
    "        l_type = int(acct_match['l_type'])\n",
    "        return host_logons.query('TargetUserName == @acct and LogonType == @l_type')\n",
    "\n",
    "def get_selected_logon(selected_item):\n",
    "    logon_list_regex = r'''\n",
    "(?P<acct>[^:]+):\\s+\n",
    "\\(logontype=(?P<logon_type>[^)]+)\\)\\s+\n",
    "\\(timestamp=(?P<time>[^)]+)\\)\\s+\n",
    "logonid=(?P<logonid>[0-9a-fx)]+)\n",
    "'''\n",
    "    acct_match = re.search(logon_list_regex, selected_item, re.VERBOSE)\n",
    "    if acct_match:\n",
    "        acct = acct_match['acct']\n",
    "        logon_type = int(acct_match['logon_type'])\n",
    "        time_stamp = pd.to_datetime(acct_match['time'])\n",
    "        logon_id = acct_match['logonid']\n",
    "        return host_logons.query('TargetUserName == @acct and LogonType == @logon_type'\n",
    "                                 ' and TargetLogonId == @logon_id')\n",
    "    \n",
    "logon_wgt = mas.SelectString(description='Select logon cluster to examine', \n",
    "                             item_list=items, height='200px', width='100%', auto_display=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "selected_logon_cluster = get_selected_logon_cluster(logon_wgt.value)\n",
    "\n",
    "def view_logon(x=''):\n",
    "    global selected_logon\n",
    "    selected_logon = get_selected_logon(x)\n",
    "    display(get_selected_logon(x))\n",
    "    \n",
    "\n",
    "items = selected_logon_cluster.sort_values('TimeGenerated').apply(lambda x: (f'{x.TargetUserName}:    '\n",
    "                                        f'(logontype={x.LogonType})   '\n",
    "                                        f'(timestamp={x.TimeGenerated})    '\n",
    "                                        f'logonid={x.TargetLogonId}'),\n",
    "                             axis=1).values.tolist()\n",
    "w = widgets.Select(options=items, description='Select logon instance to examine', **WIDGET_DEFAULTS)\n",
    "    \n",
    "interactive(view_logon, x=w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='process_clustering'></a>[Contents](#toc)\n",
    "## Unusual Processes on Host - Clustering\n",
    "Sometimes you don't have a source process to work with. Other times it's just useful to see what else is going on on the host. This section retrieves all processes on the host within the time bounds\n",
    "set in the query times widget.\n",
    "\n",
    "You can display the raw output of this by looking at the *processes_on_host* dataframe. Just copy this into a new cell and hit Ctrl-Enter.\n",
    "\n",
    "Usually though, the results return a lot of very repetitive and unintersting system processes so we attempt to cluster these to make the view easier to negotiate. \n",
    "To do this we process the raw event list output to extract a few features that render strings (such as commandline)into numerical values. The default below uses the following features:\n",
    "- commandLineTokensFull - this is a count of common delimiters in the commandline \n",
    "  (given by this regex r'[\\s\\-\\\\/\\.,\"\\'|&:;%$()]'). The aim of this is to capture the commandline structure while ignoring variations on what is essentially the same pattern (e.g. temporary path GUIDs, target IP or host names, etc.)\n",
    "- pathScore - this sums the ordinal (character) value of each character in the path (so /bin/bash and /bin/bosh would have similar scores).\n",
    "- isSystemSession - 1 if this is a root/system session, 0 if anything else.\n",
    "\n",
    "Then we run a clustering algorithm (DBScan in this case) on the process list. The result groups similar (noisy) processes together and leaves unique process patterns as single-member clusters."
   ]
  },
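  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The feature extraction and clustering steps described above can be sketched on toy data, as in the next cell. This is an illustrative sketch only, not the msticpy implementation: the toy process rows, the `toy_` variable names, the `eps` and `min_samples` values, and the use of the 'root' account as the system-session test are all assumptions made for the example. It assumes scikit-learn is installed (it is listed in the required packages).\n",
    "\n",
    "The first two toy rows differ only in a temporary path name, so they produce identical feature values and fall into the same cluster; the third row has a different structure and forms its own single-member cluster."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch only (hypothetical toy data, not the msticpy implementation)\n",
    "import re\n",
    "import pandas as pd\n",
    "from sklearn.cluster import DBSCAN\n",
    "\n",
    "# Delimiter pattern quoted in the description above\n",
    "TOY_DELIMS = re.compile(r'[\\s\\-\\\\/\\.,\"\\'|&:;%$()]')\n",
    "\n",
    "toy_procs = pd.DataFrame({\n",
    "    'NewProcessName': ['/bin/bash', '/bin/bash', '/usr/bin/wget'],\n",
    "    'CommandLine': ['bash -c ls /tmp/a1', 'bash -c ls /tmp/b2', 'wget http://example.com/x.sh'],\n",
    "    'Account': ['root', 'root', 'user1'],\n",
    "})\n",
    "\n",
    "# Count of delimiters in the command line (captures structure, ignores variable parts)\n",
    "toy_procs['tokens'] = toy_procs['CommandLine'].apply(lambda c: len(TOY_DELIMS.findall(c)))\n",
    "# Sum of character ordinals in the process path\n",
    "toy_procs['path_score'] = toy_procs['NewProcessName'].apply(lambda p: sum(ord(ch) for ch in p))\n",
    "# 1 for a root/system session, 0 otherwise (assumption: 'root' marks a system session)\n",
    "toy_procs['is_system'] = (toy_procs['Account'] == 'root').astype(int)\n",
    "\n",
    "toy_features = toy_procs[['tokens', 'path_score', 'is_system']].to_numpy(dtype=float)\n",
    "# min_samples=1 lets a unique process pattern form its own single-member cluster\n",
    "toy_procs['ClusterId'] = DBSCAN(eps=0.5, min_samples=1).fit_predict(toy_features)\n",
    "toy_procs"
   ]
  },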
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# Calculate time range based on the logons from previous section\n",
    "logon_time = selected_logon_cluster['TimeGenerated'].min()\n",
    "last_logon_time = selected_logon_cluster['TimeGenerated'].max()\n",
    "time_diff = int((last_logon_time - logon_time).total_seconds() / (60 * 60) + 2)\n",
    "\n",
    "# set the origin time to the time of our alert\n",
    "proc_query_times = mas.QueryTime(units='hours', origin_time=logon_time,\n",
    "                           before=1, after=time_diff, max_before=20, max_after=20)\n",
    "proc_query_times.display()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "from msticpy.sectools.eventcluster import dbcluster_events, add_process_features\n",
    "print('Getting process events...', end='')\n",
    "processes_on_host = qry.list_processes(provs=[proc_query_times], **params_dict)\n",
    "print('done')\n",
    "print('Clustering...', end='')\n",
    "feature_procs = add_process_features(input_frame=processes_on_host,\n",
    "                                     path_separator=params_dict['path_separator'])\n",
    "\n",
    "feature_procs['accountNum'] = feature_procs.apply(lambda x: _string_score(x.Account), axis=1)\n",
    "# you might need to play around with the max_cluster_distance parameter.\n",
    "# decreasing this gives more clusters.\n",
    "(clus_events, dbcluster, x_data) = dbcluster_events(data=feature_procs,\n",
    "                                                    cluster_columns=['commandlineTokensFull', \n",
    "                                                                     'pathScore',\n",
    "                                                                     'accountNum',\n",
    "                                                                     'isSystemSession'],\n",
    "                                                    max_cluster_distance=0.0001)\n",
    "print('done')\n",
    "print('Number of input events:', len(feature_procs))\n",
    "print('Number of clustered events:', len(clus_events))\n",
    "(clus_events.sort_values('TimeGenerated')[['TimeGenerated', 'LastEventTime',\n",
    "                                          'NewProcessName', 'CommandLine', \n",
    "                                          'ClusterSize', 'commandlineTokensFull',\n",
    "                                          'pathScore', 'isSystemSession']]\n",
    "    .sort_values('ClusterSize', ascending=False))\n",
    "print('done')\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### View processes used in login session"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "selected_logon_cluster = get_selected_logon_cluster(logon_wgt.value)\n",
    "\n",
    "def view_logon_sess(x=''):\n",
    "    global selected_logon\n",
    "    selected_logon = get_selected_logon(x)\n",
    "    display(selected_logon)\n",
    "    logonId = selected_logon['TargetLogonId'].iloc[0]\n",
    "    sess_procs = (processes_on_host.query('TargetLogonId == @logonId | SubjectLogonId == @logonId')\n",
    "                                          [['NewProcessName', 'CommandLine', 'TargetLogonId']]\n",
    "                  .drop_duplicates())\n",
    "    display(sess_procs)\n",
    "    \n",
    "\n",
    "items = selected_logon_cluster.sort_values('TimeGenerated').apply(lambda x: (f'{x.TargetUserName}:    '\n",
    "                                        f'(logontype={x.LogonType})   '\n",
    "                                        f'(timestamp={x.TimeGenerated})    '\n",
    "                                        f'logonid={x.TargetLogonId}'),\n",
    "                             axis=1).values.tolist()\n",
    "sess_w = widgets.Select(options=items, description='Select logon instance to examine', **WIDGET_DEFAULTS)\n",
    "    \n",
    "interactive(view_logon_sess, x=sess_w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Save Selected Session as Observation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "if selected_logon is not None:\n",
    "    display(Markdown('**Attacker Logon Session selected**'))\n",
    "    display(selected_logon)\n",
    "    logonid = selected_logon['TargetLogonId'].iloc[0]\n",
    "    logon_time = selected_logon['TimeGenerated'].iloc[0]\n",
    "    subj_account = mas.Account(src_event=selected_logon.iloc[0], role='subject')\n",
    "    tgt_account = mas.Account(src_event=selected_logon.iloc[0], role='target')\n",
    "    logon_session = mas.HostLogonSession(src_event=selected_logon.iloc[0])\n",
    "    logon_session.Account = tgt_account\n",
    "    logon_session.SessionId = logonid\n",
    "    logon_session.Host = inv_host_entity\n",
    "    display(Markdown('**Entities:**'))\n",
    "    print('Subject Account:\\n', subj_account)\n",
    "    print('Target Account Session:\\n', logon_session)\n",
    "    \n",
    "    add_observation(Observation(caption='Logon session identified for attacker IP',\n",
    "                            description=f'Logon session for account {logon_session.Account.Name}',\n",
    "                            item=logon_session,\n",
    "                            link='examine_win_logon_sess'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='process_session'></a>[Contents](#toc)\n",
    "### Processes for Selected LogonId"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "logonId = selected_logon['TargetLogonId'].iloc[0]\n",
    "sess_procs = (processes_on_host.query('TargetLogonId == @logonId | SubjectLogonId == @logonId')\n",
    "                                          [['TimeGenerated', 'NewProcessName', 'CommandLine']])\n",
    "\n",
    "display(sess_procs)\n",
    "add_observation(Observation(caption='Attacker commands on Victim 2',\n",
    "                            description=f'Processes run in Attacker session',\n",
    "                            item=sess_procs,\n",
    "                            link='process_session'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clustered Version of Previous Query - collapsing duplicates"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "display(clus_events.query('TargetLogonId == @logonId | SubjectLogonId == @logonId')\n",
    "        [['TimeGenerated', 'NewProcessName', 'CommandLine', 'ClusterSize']])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Optional (for the curious) - View clustering stats"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "# change False to True in the if statement to see this\n",
    "if False:\n",
    "    proc_plot = sns.catplot(y=\"processName\", x=\"commandlineTokensFull\", \n",
    "                            data=feature_procs.sort_values('processName'),\n",
    "                            kind='box', height=10)\n",
    "    proc_plot.fig.suptitle('All Processes - Variability of Commandline Tokens', x=1, y=1)\n",
    "\n",
    "    plt.rcParams['figure.figsize'] = (5, 15)\n",
    "    clus_plot = clus_events[['processName', \n",
    "                             'ClusterId', \n",
    "                             'ClusterSize']].groupby(['processName', \n",
    "                                                      'ClusterId']).sum().plot.barh()\n",
    "    plt.title('Clustered Processes - cluster size of each command line pattern');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='other_win_events'></a>[Contents](#toc)\n",
    "## Other Events on the Host"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "all_events_base_qry = '''\n",
    "SecurityEvent\n",
    "| where Computer =~ '{host}'\n",
    "| where TimeGenerated >= datetime({start})\n",
    "| where TimeGenerated <= datetime({end})\n",
    "| where {where_filter}\n",
    "'''\n",
    "all_events_qry = all_events_base_qry.format(host=params_dict['host_name'],\n",
    "                                            start=proc_query_times.start,\n",
    "                                            end=proc_query_times.end,\n",
    "                                            where_filter='EventID != 4688 and EventID != 4624')\n",
    "\n",
    "%kql -query all_events_qry\n",
    "all_events_df = _kql_raw_result_.to_dataframe()\n",
    "\n",
    "# Count events per account and activity type once, then display and record the result\n",
    "event_counts = (all_events_df[['Account', 'Activity', 'TimeGenerated']]\n",
    "                .groupby(['Account', 'Activity']).count())\n",
    "display(event_counts)\n",
    "\n",
    "add_observation(Observation(caption='System account modifications during attack.',\n",
    "                            description='Count of event types seen on system',\n",
    "                            item=event_counts,\n",
    "                            link='other_win_events'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# Convert the EventData XML of each event into a dictionary and use it to\n",
    "# populate matching empty columns in the DataFrame from the previous query\n",
    "import xml.etree.ElementTree as ET\n",
    "from xml.etree.ElementTree import ParseError\n",
    "\n",
    "SCHEMA = 'http://schemas.microsoft.com/win/2004/08/events/event'\n",
    "\n",
    "\n",
    "def parse_event_data(row):\n",
    "    try:\n",
    "        xdoc = ET.fromstring(row.EventData)\n",
    "        col_dict = {elem.attrib['Name']: elem.text for elem in xdoc.findall(f'{{{SCHEMA}}}Data')}\n",
    "        # Where the row has an empty column of the same name, assign the value\n",
    "        # to it and drop the key from the returned dictionary\n",
    "        reassigned = set()\n",
    "        for k, v in col_dict.items():\n",
    "            if k in row and not row[k]:\n",
    "                row[k] = v\n",
    "                reassigned.add(k)\n",
    "        for k in reassigned:\n",
    "            col_dict.pop(k)\n",
    "        return col_dict\n",
    "    except (ParseError, TypeError):\n",
    "        # Malformed XML, or EventData is missing (None/NaN) for this row\n",
    "        return None\n",
    "\n",
    "\n",
    "all_events_df['EventProperties'] = all_events_df.apply(parse_event_data, axis=1)\n"
   ]
  },
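  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Sketch: unpacking `EventData` XML into a dictionary\n",
    "The parsing step above can be tried in isolation. This is a minimal sketch, assuming each `EventData` value is an XML fragment whose `Data` children carry a `Name` attribute and live in the Windows event namespace. The sample fragment and its values are made up for illustration and are not taken from the workspace.\n",
    "```\n",
    "import xml.etree.ElementTree as ET\n",
    "\n",
    "SCHEMA = 'http://schemas.microsoft.com/win/2004/08/events/event'\n",
    "# Hypothetical sample fragment - real EventData will contain different fields\n",
    "sample_xml = (f'<EventData xmlns=\"{SCHEMA}\">'\n",
    "              '<Data Name=\"TargetUserName\">demo_user</Data>'\n",
    "              '<Data Name=\"SubjectLogonId\">0x3e7</Data>'\n",
    "              '</EventData>')\n",
    "\n",
    "xdoc = ET.fromstring(sample_xml)\n",
    "# ElementTree addresses namespaced tags as '{namespace}tag'\n",
    "props = {elem.get('Name'): elem.text\n",
    "         for elem in xdoc.findall(f'{{{SCHEMA}}}Data')}\n",
    "print(props)\n",
    "```\n",
    "The doubled braces in the f-string produce the literal `{` and `}` that surround the namespace URI."
   ]
  },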
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='o365'></a>[Contents](#toc)\n",
    "# Office 365 Activity"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "# set the origin time to the time of our alert\n",
    "o365_query_times = mas.QueryTime(units='hours', origin_time=security_alert.origin_time,\n",
    "                           before=1, after=10, max_before=20, max_after=20)\n",
    "o365_query_times.display()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Execute queries to get the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "print('Running queries...', end=' ')\n",
    "# Queries\n",
    "ad_changes_query = '''\n",
    "OfficeActivity\n",
    "| where TimeGenerated >= datetime({start})\n",
    "| where TimeGenerated <= datetime({end})\n",
    "| where RecordType == 'AzureActiveDirectory'\n",
    "| where Operation in ('Add service principal.',\n",
    "                      'Change user password.', \n",
    "                      'Add user.', \n",
    "                      'Add member to role.')\n",
    "| where UserType == 'Regular' \n",
    "| project OfficeId, TimeGenerated, Operation, OrganizationId, \n",
    "          OfficeWorkload, ResultStatus, OfficeObjectId, \n",
    "          UserId = tolower(UserId), ClientIP, ExtendedProperties\n",
    "'''.format(start = o365_query_times.start, end=o365_query_times.end)\n",
    "%kql -query ad_changes_query\n",
    "ad_changes_df = _kql_raw_result_.to_dataframe()\n",
    "\n",
    "\n",
    "office_ops_query = '''\n",
    "OfficeActivity\n",
    "| where TimeGenerated >= datetime({start})\n",
    "| where TimeGenerated <= datetime({end})\n",
    "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n",
    "| extend UserAgent = extractjson(\"$[0].Value\", ExtendedProperties, typeof(string))\n",
    "| union (\n",
    "    OfficeActivity \n",
    "    | where TimeGenerated >= datetime({start})\n",
    "    | where TimeGenerated <= datetime({end})\n",
    "    | where RecordType !in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n",
    ")\n",
    "| where UserType == 'Regular'\n",
    "'''.format(start = o365_query_times.start, end=o365_query_times.end)\n",
    "%kql -query office_ops_query\n",
    "office_ops_df = _kql_raw_result_.to_dataframe()\n",
    "\n",
    "office_ops_summary_query = '''\n",
    "let timeRange=ago(30d);\n",
    "let officeAuthentications = OfficeActivity\n",
    "| where TimeGenerated >= timeRange\n",
    "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n",
    "| extend UserAgent = extractjson(\"$[0].Value\", ExtendedProperties, typeof(string))\n",
    "| where Operation == \"UserLoggedIn\";\n",
    "officeAuthentications\n",
    "| union (\n",
    "    OfficeActivity \n",
    "    | where TimeGenerated >= timeRange\n",
    "    | where RecordType !in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n",
    ")\n",
    "| where UserType == 'Regular'\n",
    "| extend RecordOp = strcat(RecordType, '-', Operation)\n",
    "| summarize OpCount=count() by RecordType, Operation, UserId, UserAgent, ClientIP, bin(TimeGenerated, 1h)\n",
    "// render timeline\n",
    "'''.format(start = o365_query_times.start, end=o365_query_times.end)\n",
    "%kql -query office_ops_summary_query\n",
    "office_ops_summary_df = _kql_raw_result_.to_dataframe()\n",
    "# %kql -query office_ops_query\n",
    "# office_ops_df = _kql_raw_result_.to_dataframe()\n",
    "\n",
    "office_logons_query = '''\n",
    "let timeRange=ago(30d);\n",
    "let officeAuthentications = OfficeActivity\n",
    "| where TimeGenerated >= timeRange\n",
    "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n",
    "| extend UserAgent = extractjson(\"$[0].Value\", ExtendedProperties, typeof(string))\n",
    "| where Operation == \"UserLoggedIn\";\n",
    "let lookupWindow = 1d;\n",
    "let lookupBin = lookupWindow / 2.0; \n",
    "officeAuthentications | project-rename Start=TimeGenerated\n",
    "| extend TimeKey = bin(Start, lookupBin)\n",
    "| join kind = inner (\n",
    "    officeAuthentications\n",
    "    | project-rename End=TimeGenerated\n",
    "    | extend TimeKey = range(bin(End - lookupWindow, lookupBin), bin(End, lookupBin), lookupBin)\n",
    "    | mvexpand TimeKey to typeof(datetime)\n",
    ") on UserAgent, TimeKey\n",
    "| project timeSpan = End - Start, UserId, ClientIP , UserAgent , Start, End\n",
    "| summarize dcount(ClientIP) by  UserAgent\n",
    "| where dcount_ClientIP > 1\n",
    "| join kind=inner (  \n",
    "officeAuthentications\n",
    "| summarize minTime=min(TimeGenerated), maxTime=max(TimeGenerated) by UserId, UserAgent, ClientIP\n",
    ") on UserAgent\n",
    "'''\n",
    "%kql -query office_logons_query\n",
    "office_logons_df = _kql_raw_result_.to_dataframe()\n",
    "\n",
    "print('done.')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id='o365_match_ip'></a>\n",
    "### Do any of the alert IP addresses appear in Office activity?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Check whether any of the alert IP addresses appear in the Office activity\n",
    "\n",
    "\n",
    "for ip in alert_ip_entities:\n",
    "    # Filter once per IP and build the summary from the filtered rows\n",
    "    susp_o365_activities = office_ops_df[office_ops_df['ClientIP'] == ip.Address]\n",
    "    susp_o365_summ = (susp_o365_activities\n",
    "                         [['OfficeId', 'UserId', 'RecordType', 'Operation']]\n",
    "                         .groupby(['UserId', 'RecordType', 'Operation']).count()\n",
    "                         .rename(columns={'OfficeId': 'OperationCount'}))\n",
    "    \n",
    "    display(Markdown(f'### Activity for {ip.Address}'))\n",
    "    \n",
    "    if len(susp_o365_summ) > 0:\n",
    "        display(susp_o365_summ)\n",
    "    \n",
    "        add_observation(Observation(caption=f'O365 activity from suspected attacker IP {ip.Address}',\n",
    "                                    description='Summarized operation count for each user/service/operation type',\n",
    "                                    item=susp_o365_summ,\n",
    "                                    link='o365_match_ip'))\n",
    "    else:\n",
    "        display(Markdown('No activity detected'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "for susp_ip in [ip.Address for ip in alert_ip_entities]:\n",
    "\n",
    "    display(Markdown(f'### Timeline of operations originating from suspect IP Address: {susp_ip}'))\n",
    "    display(Markdown(f'**{susp_ip}**'))\n",
    "    suspect_ip_ops = office_ops_df[office_ops_df['ClientIP'] == susp_ip]\n",
    "    if len(suspect_ip_ops) == 0:\n",
    "        display(Markdown('No activity detected'))\n",
    "        continue\n",
    "    sel_op_type='FileDownloaded'\n",
    "    nbdisp.display_timeline(data=suspect_ip_ops, title=f'Operations from {susp_ip} (all=blue, {sel_op_type}=green)',\n",
    "                             overlay_data=suspect_ip_ops.query('Operation == @sel_op_type'),\n",
    "                            source_columns=['UserId', 'RecordType', 'Operation'])\n",
    "    \n",
    "    # Uncomment line below to see all activity\n",
    "    # display(suspect_ip_ops.sort_values('TimeGenerated', ascending=True).head())\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"o356_high_freq\"></a>\n",
    "### Look for high-frequency operations - like automated or bulk uploads/downloads\n",
    "#### Anything above or approaching 1 operation/sec is likely an automated or bulk operation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "timed_slice_ops = office_ops_df[['RecordType', 'TimeGenerated', 'Operation',\n",
    "       'OrganizationId', 'UserType', 'OfficeWorkload',\n",
    "       'ResultStatus', 'OfficeObjectId', 'UserId', 'ClientIP', 'Start_Time']]\n",
    "timed_slice_ops2 = timed_slice_ops.set_index('TimeGenerated')\n",
    "\n",
    "# Count operations per user/IP/record type/operation in 10-second windows\n",
    "# and keep windows with more than 10 operations (more than 1 operation/sec)\n",
    "hi_freq_ops = (timed_slice_ops2[['UserId', 'ClientIP', 'Operation', 'RecordType']]\n",
    "                .groupby(['UserId', 'ClientIP', 'RecordType', 'Operation']).resample('10S').count()\n",
    "                .query('RecordType > 10')\n",
    "                .drop(['ClientIP', 'UserId', 'RecordType'], axis=1)\n",
    "                .assign(OpsPerSec = lambda x: x.Operation / 10)\n",
    "                .rename(columns={'Operation': 'Operation Count'}))\n",
    "\n",
    "if len(hi_freq_ops) > 0:\n",
    "    display(hi_freq_ops)\n",
    "    add_observation(Observation(caption='O365 bulk/high-frequency operations seen',\n",
    "                                description='Summarized operation count for bulk actions',\n",
    "                                item=hi_freq_ops,\n",
    "                                link='o356_high_freq'))"
   ]
  },
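  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Sketch: counting operations in fixed time windows\n",
    "The high-frequency check above relies on grouping events and resampling them into 10-second windows. The following is a minimal sketch of that idea on synthetic data; the user names, timestamps and the `OperationCount` column are invented for illustration and do not come from `office_ops_df`.\n",
    "```\n",
    "import pandas as pd\n",
    "\n",
    "# Synthetic events: one user makes 30 downloads in 15 seconds,\n",
    "# another makes 3 downloads spread over 5 minutes\n",
    "bulk_times = pd.date_range('2019-01-01 10:00:00', periods=30, freq='500ms')\n",
    "slow_times = pd.date_range('2019-01-01 10:00:00', periods=3, freq='150s')\n",
    "ops = pd.DataFrame({\n",
    "    'TimeGenerated': list(bulk_times) + list(slow_times),\n",
    "    'UserId': ['bulk_user'] * 30 + ['slow_user'] * 3,\n",
    "    'Operation': 'FileDownloaded',\n",
    "})\n",
    "\n",
    "# Count operations per user in 10-second windows\n",
    "counts = (ops.set_index('TimeGenerated')\n",
    "             .groupby('UserId')['Operation']\n",
    "             .resample('10s')\n",
    "             .count()\n",
    "             .rename('OperationCount')\n",
    "             .reset_index())\n",
    "counts['OpsPerSec'] = counts['OperationCount'] / 10\n",
    "\n",
    "# Keep only windows averaging more than 1 operation per second\n",
    "print(counts[counts['OperationCount'] > 10])\n",
    "```\n",
    "Only windows from `bulk_user` can pass the filter here, because `slow_user` never has more than one operation in any window."
   ]
  },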
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Other Background Data for O365"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "display(Markdown('### IPs and User Agents - frequency of use'))\n",
    "office_ops_df['UserId'] = office_ops_df['UserId'].str.lower()\n",
    "display(Markdown('Distinct IPs by num of operations'))\n",
    "display(office_ops_df[['ClientIP', 'Operation']].groupby(['ClientIP']).count())\n",
    "display(Markdown('Distinct UserAgents by num of operations'))\n",
    "office_ops_df[['UserAgent', 'Operation']].groupby(['UserAgent']).count()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "off_ip_locs = (office_ops_df[['ClientIP']]\n",
    "                   .drop_duplicates()\n",
    "                   .apply(lambda x: \n",
    "                          iplocation.lookup_ip(ip_address=x.ClientIP)[1]\n",
    "                          if x.ClientIP and x.ClientIP != '<null>' else None, axis=1)\n",
    "                   .tolist())\n",
    "ip_locs = [ip_list[0] for ip_list in off_ip_locs if ip_list]\n",
    "    \n",
    "flow_map = create_ip_map()\n",
    "display(HTML('<h3>External IP Addresses seen in Office Activity</h3>'))\n",
    "display(HTML('Numbered circles indicate multiple items - click to expand.'))\n",
    "\n",
    "\n",
    "icon_props = {'color': 'purple'}\n",
    "flow_map = add_ip_cluster(folium_map=flow_map,\n",
    "                            ip_entities=ip_locs,\n",
    "                            **icon_props)\n",
    "display(flow_map)   "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter(\"ignore\")\n",
    "    display(Markdown('### Change in rate of Activity Class (RecordType) and Operation'))\n",
    "    sns.relplot(data=office_ops_summary_df, x='TimeGenerated', y='OpCount', kind='line', aspect=2, \n",
    "                hue='RecordType')\n",
    "    sns.relplot(data=office_ops_summary_df.query('RecordType == \"SharePointFileOperation\"'), \n",
    "                x='TimeGenerated', y='OpCount', hue='Operation', kind='line', aspect=2)\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter(\"ignore\")\n",
    "    display(Markdown('### Identify Users/IPs with largest operation count'))\n",
    "    office_ops_summary_df['UserId'] = office_ops_summary_df['UserId'].str.lower()\n",
    "\n",
    "    sns.catplot(data=office_ops_summary_df, x='UserId', y='OpCount', \n",
    "                hue='Operation', aspect=2).set_xticklabels(rotation=30)\n",
    "    # Display explicitly: an expression inside a with block is not rendered automatically\n",
    "    display(office_ops_summary_df.pivot_table('OpCount', index=['ClientIP', 'UserId'], \n",
    "                                              columns='Operation').style.bar(color='orange', align='mid'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Extract distinctive events from O365 Operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "todo"
    ]
   },
   "outputs": [],
   "source": [
    "\n",
    "\n",
    "from msticpy.sectools.eventcluster import (dbcluster_events, \n",
    "                                           add_process_features, \n",
    "                                           char_ord_score,\n",
    "                                           token_count,\n",
    "                                           delim_count)\n",
    "\n",
    "feature_office_ops = office_ops_df.copy()\n",
    "feature_office_ops['ip_num'] = feature_office_ops.apply(lambda x: char_ord_score(x, 'ClientIP'), axis=1)\n",
    "feature_office_ops['ua_tokens'] = feature_office_ops.apply(lambda x: char_ord_score(x, 'UserAgent'), axis=1)\n",
    "feature_office_ops['oid_tokens'] = feature_office_ops.apply(lambda x: char_ord_score(x, 'OfficeObjectId'), axis=1)\n",
    "\n",
    "# you might need to play around with the max_cluster_distance parameter.\n",
    "# decreasing this gives more clusters.\n",
    "(clustered_ops, dbcluster, x_data) = dbcluster_events(data=feature_office_ops,\n",
    "                                                      cluster_columns=['ip_num', \n",
    "                                                                     'ua_tokens', \n",
    "                                                                     'oid_tokens'],\n",
    "                                                      time_column='TimeGenerated',\n",
    "                                                      max_cluster_distance=0.0001)\n",
    "print('Number of input events:', len(feature_office_ops))\n",
    "print('Number of clustered events:', len(clustered_ops))\n",
    "(clustered_ops[['TimeGenerated', 'RecordType',\n",
    "                'Operation', 'UserId', 'UserAgent', 'ClusterSize',\n",
    "                'OfficeObjectId']]\n",
    "    .query('ClusterSize <= 2')\n",
    "    .sort_values('ClusterSize', ascending=True))"
   ]
  },
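  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Sketch: why small clusters point to distinctive events\n",
    "The cell above treats the clustering helper as a black box. As a rough illustration of the idea, and not a description of msticpy's implementation, the sketch below runs scikit-learn's `DBSCAN` directly on made-up numeric features. The `eps` value is assumed to play a role comparable to `max_cluster_distance`, and the feature values are synthetic.\n",
    "```\n",
    "import numpy as np\n",
    "from sklearn.cluster import DBSCAN\n",
    "\n",
    "# Synthetic features: 20 near-identical events plus 2 unusual ones\n",
    "rng = np.random.RandomState(0)\n",
    "common = 0.5 + rng.normal(scale=0.001, size=(20, 2))\n",
    "rare = np.array([[0.1, 0.9], [0.9, 0.1]])\n",
    "features = np.vstack([common, rare])\n",
    "\n",
    "# min_samples=1 puts every point in some cluster, so unusual events\n",
    "# end up as clusters of size 1 instead of being labelled as noise\n",
    "labels = DBSCAN(eps=0.05, min_samples=1).fit_predict(features)\n",
    "ids, sizes = np.unique(labels, return_counts=True)\n",
    "for cluster_id, size in zip(ids, sizes):\n",
    "    print(f'cluster {cluster_id}: {size} event(s)')\n",
    "```\n",
    "Filtering on a small cluster size, as the `ClusterSize <= 2` query does above, then surfaces the events that have few or no close neighbours."
   ]
  },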
  {
   "cell_type": "markdown",
   "metadata": {
    "hidden": true
   },
   "source": [
    "<a id='summary'></a>[Contents](#toc)\n",
    "# Summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "for observation in observation_list.values():\n",
    "    display_observation(observation)\n",
    "    "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hidden": true
   },
   "source": [
    "<a id='appendices'></a>[Contents](#toc)\n",
    "# Appendices"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Available DataFrames"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "print('List of current DataFrames in Notebook')\n",
    "print('-' * 50)\n",
    "current_vars = list(locals().keys())\n",
    "for var_name in current_vars:\n",
    "    if isinstance(locals()[var_name], pd.DataFrame) and not var_name.startswith('_'):\n",
    "        print(var_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "heading_collapsed": true,
    "tags": [
     "todo"
    ]
   },
   "source": [
    "## Saving Data to Excel\n",
    "To save the contents of a pandas DataFrame to an Excel spreadsheet,\n",
    "use the following syntax:\n",
    "```\n",
    "with pd.ExcelWriter('myWorksheet.xlsx') as writer:\n",
    "    my_data_frame.to_excel(writer, sheet_name='Sheet1')\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "hide_input": false,
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {
    "height": "318.996px",
    "width": "320.994px"
   },
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {
    "height": "calc(100% - 180px)",
    "left": "10px",
    "top": "150px",
    "width": "270px"
   },
   "toc_section_display": true,
   "toc_window_display": true
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "position": {
    "height": "406.193px",
    "left": "1468.4px",
    "right": "20px",
    "top": "120px",
    "width": "456.572px"
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}