cloudflare/Azure-Sentinel
Publicmirrored fromhttps://github.com/cloudflare/Azure-Sentinel
Notebooks/Guided Hunting - Office365-Exploring.ipynb
1402lines · modecode
unknown
| 1 | { |
| 2 | "cells": [ |
| 3 | { |
| 4 | "cell_type": "markdown", |
| 5 | "metadata": {}, |
| 6 | "source": [ |
| 7 | "# Title: Office 365 Explorer\n", |
| 8 | "**Notebook Version:** 1.0<br>\n", |
| 9 | "**Python Version:** Python 3.6 (including Python 3.6 - AzureML)<br>\n", |
| 10 | "**Required Packages**: kqlmagic, msticpy, pandas, numpy, matplotlib, seaborn, networkx, ipywidgets, ipython, scikit_learn, folium, maxminddb_geolite2, holoviews<br>\n", |
| 11 | "**Platforms Supported**:\n", |
| 12 | "- Azure Notebooks Free Compute\n", |
| 13 | "- Azure Notebooks DSVM\n", |
| 14 | "- OS Independent\n", |
| 15 | "\n", |
| 16 | "**Data Sources Required**:\n", |
| 17 | "- Log Analytics - OfficeActivity, IPLocation, Azure Network Analytics\n", |
| 18 | "\n", |
| 19 | "## Description:\n", |
| 20 | "Brings together a series of queries and visualizations to help you investigate the security status of Office 365 subscription and individual user activities.\n", |
| 21 | "- The first section focuses on Tenant-Wide data queries and analysis\n", |
| 22 | "- The second section allows you to focus on individial accounts and examine them for any suspicious activity.\n", |
| 23 | "\n", |
| 24 | "This notebook is intended to be illustrative of the types of data available in Office 365 Activity data and how to query and use them. It is not meant to be used as a prescriptive guide to how to navigate through the data. Feel free to experiment and submit anything interesting you find to the community.\n", |
| 25 | "\n" |
| 26 | ] |
| 27 | }, |
| 28 | { |
| 29 | "cell_type": "markdown", |
| 30 | "metadata": {}, |
| 31 | "source": [ |
| 32 | "<a id=\"contents\"></a>\n", |
| 33 | "# Table of Contents\n", |
| 34 | "- [Setup and Authenticate](#setup)\n", |
| 35 | "- [Office 365 Activity](#o365)\n", |
| 36 | " - [Tenant-wide Information](#tenant_info)\n", |
| 37 | " - [AAD Operations - Account Modifications](#aad_ops)\n", |
| 38 | " - [Logon Anomalies](#logon_anomalies)\n", |
| 39 | " - [Activity Summary](#activity_summary)\n", |
| 40 | " - [Variability of IP Address for users](#ip_variability)\n", |
| 41 | " - [Accounts with multiple IPs and Geolocations](#acct_multi_geo)\n", |
| 42 | " - [User Logons with > N IP Address](#acct_multi_ips)\n", |
| 43 | " - [Operation Types by Location and IP](#ip_op_matrix)\n", |
| 44 | " - [Geolocation Map of Client IPs](#geo_map_tenant)\n", |
| 45 | " - [Distinct User Agent Strings in Use](#distinct_uas)\n", |
| 46 | " - [Graphical Activity Timeline](#op_timeline)\n", |
| 47 | " - [Users With largest Activity Type Count](#user_activity_counts)\n", |
| 48 | " - [Office User Investigation](#o365_user_inv)\n", |
| 49 | " - [Activity Summary](#user_act_summary)\n", |
| 50 | " - [Operation Breakdown for User](#user_op_count)\n", |
| 51 | " - [IP Count for Different User Operations](#user_ip_counts)\n", |
| 52 | " - [Activity Timeline](#user_act_timeline)\n", |
| 53 | " - [User IP GeoMap](#user_geomap)\n", |
| 54 | " - [Check for User IPs in Azure Network Flow Data](#ips_in_azure)\n", |
| 55 | " - [Rare Combinations of Country/UserAgent/Operation Type](#o365_cluster)\n", |
| 56 | "- [Appendices](#appendices)\n", |
| 57 | " - [Saving data to Excel](#appendices)\n" |
| 58 | ] |
| 59 | }, |
| 60 | { |
| 61 | "cell_type": "markdown", |
| 62 | "metadata": {}, |
| 63 | "source": [ |
| 64 | "<a id='setup'></a>[Contents](#contents)\n", |
| 65 | "# Setup\n", |
| 66 | "\n", |
| 67 | "Make sure that you have installed packages specified in the setup (uncomment the lines to execute)\n", |
| 68 | "\n", |
| 69 | "## Install Packages\n", |
| 70 | "The first time this cell runs for a new Azure Notebooks project or local Python environment it will take several minutes to download and install the packages. In subsequent runs it should run quickly and confirm that package dependencies are already installed. Unless you want to upgrade the packages you can feel free to skip execution of the next cell.\n", |
| 71 | "\n", |
| 72 | "If you see any import failures (```ImportError```) in the notebook, please re-run this cell and answer 'y', then re-run the cell where the failure occurred.\n", |
| 73 | "\n", |
| 74 | "Note you may see some warnings about package incompatibility with certain packages. This does not affect the functionality of this notebook but you may need to upgrade the packages producing the warnings to a more recent version." |
| 75 | ] |
| 76 | }, |
| 77 | { |
| 78 | "cell_type": "code", |
| 79 | "execution_count": null, |
| 80 | "metadata": {}, |
| 81 | "outputs": [], |
| 82 | "source": [ |
| 83 | "import sys\n", |
| 84 | "import warnings\n", |
| 85 | "\n", |
| 86 | "warnings.filterwarnings(\"ignore\",category=DeprecationWarning)\n", |
| 87 | "\n", |
| 88 | "MIN_REQ_PYTHON = (3,6)\n", |
| 89 | "if sys.version_info < MIN_REQ_PYTHON:\n", |
| 90 | " print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')\n", |
| 91 | " print('or later is selected as the active kernel.')\n", |
| 92 | " sys.exit(\"Python %s.%s or later is required.\\n\" % MIN_REQ_PYTHON)\n", |
| 93 | "\n", |
| 94 | "# Package Installs - try to avoid if they are already installed\n", |
| 95 | "try:\n", |
| 96 | " import msticpy.sectools as sectools\n", |
| 97 | " import Kqlmagic\n", |
| 98 | " from dns import reversename, resolver\n", |
| 99 | " from ipwhois import IPWhois\n", |
| 100 | " import folium\n", |
| 101 | " \n", |
| 102 | " print('If you answer \"n\" this cell will exit with an error in order to avoid the pip install calls,')\n", |
| 103 | " print('This error can safely be ignored.')\n", |
| 104 | " resp = input('msticpy and Kqlmagic packages are already loaded. Do you want to re-install? (y/n)')\n", |
| 105 | " if resp.strip().lower() != 'y':\n", |
| 106 | " sys.exit('pip install aborted - you may skip this error and continue.')\n", |
| 107 | " else:\n", |
| 108 | " print('After installation has completed, restart the current kernel and run '\n", |
| 109 | " 'the notebook again skipping this cell.')\n", |
| 110 | "except ImportError:\n", |
| 111 | " pass\n", |
| 112 | "\n", |
| 113 | "print('\\nPlease wait. Installing required packages. This may take a few minutes...')\n", |
| 114 | "!pip install git+https://github.com/microsoft/msticpy --upgrade --user\n", |
| 115 | "!pip install Kqlmagic --no-cache-dir --upgrade --user\n", |
| 116 | "!pip install seaborn --upgrade --user\n", |
| 117 | "!pip install holoviews --upgrade --user\n", |
| 118 | "!pip install dnspython --upgrade --user \n", |
| 119 | "!pip install ipwhois --upgrade --user \n", |
| 120 | "!pip install folium --upgrade --user\n", |
| 121 | "\n", |
| 122 | "# Uncomment to refresh the maxminddb database\n", |
| 123 | "# !pip install maxminddb-geolite2 --upgrade \n", |
| 124 | "\n", |
| 125 | "print('To ensure that the latest versions of the installed libraries '\n", |
| 126 | " 'are used, please restart the current kernel and run '\n", |
| 127 | " 'the notebook again skipping this cell.')" |
| 128 | ] |
| 129 | }, |
| 130 | { |
| 131 | "cell_type": "code", |
| 132 | "execution_count": null, |
| 133 | "metadata": { |
| 134 | "scrolled": true |
| 135 | }, |
| 136 | "outputs": [], |
| 137 | "source": [ |
| 138 | "# Imports\n", |
| 139 | "import sys\n", |
| 140 | "import warnings\n", |
| 141 | "\n", |
| 142 | "MIN_REQ_PYTHON = (3,6)\n", |
| 143 | "if sys.version_info < MIN_REQ_PYTHON:\n", |
| 144 | " print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')\n", |
| 145 | " print('or later is selected as the active kernel.')\n", |
| 146 | " sys.exit(\"Python %s.%s or later is required.\\n\" % MIN_REQ_PYTHON)\n", |
| 147 | "\n", |
| 148 | "import numpy as np\n", |
| 149 | "from IPython import get_ipython\n", |
| 150 | "from IPython.display import display, HTML, Markdown\n", |
| 151 | "import ipywidgets as widgets\n", |
| 152 | "\n", |
| 153 | "import matplotlib.pyplot as plt\n", |
| 154 | "import seaborn as sns\n", |
| 155 | "sns.set()\n", |
| 156 | "import networkx as nx\n", |
| 157 | "\n", |
| 158 | "import pandas as pd\n", |
| 159 | "pd.set_option('display.max_rows', 100)\n", |
| 160 | "pd.set_option('display.max_columns', 50)\n", |
| 161 | "pd.set_option('display.max_colwidth', 100)\n", |
| 162 | "\n", |
| 163 | "import msticpy.sectools as sectools\n", |
| 164 | "import msticpy.nbtools as mas\n", |
| 165 | "import msticpy.nbtools.kql as qry\n", |
| 166 | "import msticpy.nbtools.nbdisplay as nbdisp\n", |
| 167 | "\n", |
| 168 | "# Some of our dependencies (networkx) still use deprecated Matplotlib\n", |
| 169 | "# APIs - we can't do anything about it so suppress them from view\n", |
| 170 | "from matplotlib import MatplotlibDeprecationWarning\n", |
| 171 | "warnings.simplefilter(\"ignore\", category=MatplotlibDeprecationWarning)\n", |
| 172 | "\n", |
| 173 | "WIDGET_DEFAULTS = {'layout': widgets.Layout(width='95%'),\n", |
| 174 | " 'style': {'description_width': 'initial'}}\n", |
| 175 | "display(HTML(mas.util._TOGGLE_CODE_PREPARE_STR))\n", |
| 176 | "HTML('''\n", |
| 177 | " <script type=\"text/javascript\">\n", |
| 178 | " IPython.notebook.kernel.execute(\"nb_query_string='\".concat(window.location.search).concat(\"'\"));\n", |
| 179 | " </script>\n", |
| 180 | " ''');" |
| 181 | ] |
| 182 | }, |
| 183 | { |
| 184 | "cell_type": "markdown", |
| 185 | "metadata": { |
| 186 | "tags": [ |
| 187 | "remove" |
| 188 | ] |
| 189 | }, |
| 190 | "source": [ |
| 191 | "### Get WorkspaceId\n", |
| 192 | "To find your Workspace Id go to [Log Analytics](https://ms.portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.OperationalInsights%2Fworkspaces). Look at the workspace properties to find the ID." |
| 193 | ] |
| 194 | }, |
| 195 | { |
| 196 | "cell_type": "code", |
| 197 | "execution_count": null, |
| 198 | "metadata": { |
| 199 | "tags": [ |
| 200 | "todo" |
| 201 | ] |
| 202 | }, |
| 203 | "outputs": [], |
| 204 | "source": [ |
| 205 | "import os\n", |
| 206 | "from msticpy.nbtools.wsconfig import WorkspaceConfig\n", |
| 207 | "ws_config_file = 'config.json'\n", |
| 208 | "\n", |
| 209 | "WORKSPACE_ID = None\n", |
| 210 | "TENANT_ID = None\n", |
| 211 | "try:\n", |
| 212 | " ws_config = WorkspaceConfig(ws_config_file)\n", |
| 213 | " display(Markdown(f'Read Workspace configuration from local config.json for workspace **{ws_config[\"workspace_name\"]}**'))\n", |
| 214 | " for cf_item in ['tenant_id', 'subscription_id', 'resource_group', 'workspace_id', 'workspace_name']:\n", |
| 215 | " display(Markdown(f'**{cf_item.upper()}**: {ws_config[cf_item]}'))\n", |
| 216 | " \n", |
| 217 | " if ('cookiecutter' not in ws_config['workspace_id'] or\n", |
| 218 | " 'cookiecutter' not in ws_config['tenant_id']):\n", |
| 219 | " WORKSPACE_ID = ws_config['workspace_id']\n", |
| 220 | " TENANT_ID = ws_config['tenant_id']\n", |
| 221 | "except:\n", |
| 222 | " pass\n", |
| 223 | "\n", |
| 224 | "if not WORKSPACE_ID or not TENANT_ID:\n", |
| 225 | " display(Markdown('**Workspace configuration not found.**\\n\\n'\n", |
| 226 | " 'Please go to your Log Analytics workspace, copy the workspace ID'\n", |
| 227 | " ' and/or tenant Id and paste here.<br> '\n", |
| 228 | " 'Or read the workspace_id from the config.json in your Azure Notebooks project.'))\n", |
| 229 | " ws_config = None\n", |
| 230 | " ws_id = mas.GetEnvironmentKey(env_var='WORKSPACE_ID',\n", |
| 231 | " prompt='Please enter your Log Analytics Workspace Id:', auto_display=True)\n", |
| 232 | " ten_id = mas.GetEnvironmentKey(env_var='TENANT_ID',\n", |
| 233 | " prompt='Please enter your Log Analytics Tenant Id:', auto_display=True)" |
| 234 | ] |
| 235 | }, |
| 236 | { |
| 237 | "cell_type": "markdown", |
| 238 | "metadata": {}, |
| 239 | "source": [ |
| 240 | "### Authenticate to Log Analytics\n", |
| 241 | "If you are using user/device authentication, run the following cell. \n", |
| 242 | "- Click the 'Copy code to clipboard and authenticate' button.\n", |
| 243 | "- This will pop up an Azure Active Directory authentication dialog (in a new tab or browser window). The device code will have been copied to the clipboard. \n", |
| 244 | "- Select the text box and paste (Ctrl-V/Cmd-V) the copied value. \n", |
| 245 | "- You should then be redirected to a user authentication page where you should authenticate with a user account that has permission to query your Log Analytics workspace.\n", |
| 246 | "\n", |
| 247 | "Use the following syntax if you are authenticating using an Azure Active Directory AppId and Secret:\n", |
| 248 | "```\n", |
| 249 | "%kql loganalytics://tenant(aad_tenant).workspace(WORKSPACE_ID).clientid(client_id).clientsecret(client_secret)\n", |
| 250 | "```\n", |
| 251 | "instead of\n", |
| 252 | "```\n", |
| 253 | "%kql loganalytics://code().workspace(WORKSPACE_ID)\n", |
| 254 | "```\n", |
| 255 | "\n", |
| 256 | "Note: you may occasionally see a JavaScript error displayed at the end of the authentication - you can safely ignore this.<br>\n", |
| 257 | "On successful authentication you should see a ```popup schema``` button." |
| 258 | ] |
| 259 | }, |
| 260 | { |
| 261 | "cell_type": "code", |
| 262 | "execution_count": null, |
| 263 | "metadata": { |
| 264 | "tags": [ |
| 265 | "todo" |
| 266 | ] |
| 267 | }, |
| 268 | "outputs": [], |
| 269 | "source": [ |
| 270 | "if not WORKSPACE_ID or not TENANT_ID:\n", |
| 271 | " try:\n", |
| 272 | " WORKSPACE_ID = ws_id.value\n", |
| 273 | " TENANT_ID = ten_id.value\n", |
| 274 | " except NameError:\n", |
| 275 | " raise ValueError('No workspace or Tenant Id.')\n", |
| 276 | "\n", |
| 277 | "mas.kql.load_kql_magic()\n", |
| 278 | "%kql loganalytics://code().tenant(TENANT_ID).workspace(WORKSPACE_ID)\n" |
| 279 | ] |
| 280 | }, |
| 281 | { |
| 282 | "cell_type": "code", |
| 283 | "execution_count": null, |
| 284 | "metadata": {}, |
| 285 | "outputs": [], |
| 286 | "source": [ |
| 287 | "%kql search * | summarize RowCount=count() by Type | project-rename Table=Type\n", |
| 288 | "la_table_set = _kql_raw_result_.to_dataframe()\n", |
| 289 | "table_index = la_table_set.set_index('Table')['RowCount'].to_dict()\n", |
| 290 | "display(Markdown('Current data in workspace'))\n", |
| 291 | "display(la_table_set.T)" |
| 292 | ] |
| 293 | }, |
| 294 | { |
| 295 | "cell_type": "markdown", |
| 296 | "metadata": {}, |
| 297 | "source": [ |
| 298 | "<a id='o365'></a>[Contents](#contents)\n", |
| 299 | "# Office 365 Activity" |
| 300 | ] |
| 301 | }, |
| 302 | { |
| 303 | "cell_type": "markdown", |
| 304 | "metadata": {}, |
| 305 | "source": [ |
| 306 | "### Log Analytics Queries" |
| 307 | ] |
| 308 | }, |
| 309 | { |
| 310 | "cell_type": "code", |
| 311 | "execution_count": null, |
| 312 | "metadata": {}, |
| 313 | "outputs": [], |
| 314 | "source": [ |
| 315 | "if ('OfficeActivity' not in table_index or\n", |
| 316 | " table_index['OfficeActivity'] == 0):\n", |
| 317 | " display(Markdown('<font color=\"red\"><h2>Warning. Office Data not available.</h2></font><br>'\n", |
| 318 | " 'Either Office 365 data has not been imported into the workspace or'\n", |
| 319 | " ' the OfficeActivity table is empty.<br>'\n", |
| 320 | " 'This workbook is not useable with the current workspace.'))" |
| 321 | ] |
| 322 | }, |
| 323 | { |
| 324 | "cell_type": "code", |
| 325 | "execution_count": null, |
| 326 | "metadata": {}, |
| 327 | "outputs": [], |
| 328 | "source": [ |
| 329 | "from msticpy.sectools.geoip import GeoLiteLookup\n", |
| 330 | "iplocation = GeoLiteLookup()\n", |
| 331 | "\n", |
| 332 | "# Queries\n", |
| 333 | "ad_changes_query = '''\n", |
| 334 | "OfficeActivity\n", |
| 335 | "| where TimeGenerated >= datetime({start})\n", |
| 336 | "| where TimeGenerated <= datetime({end})\n", |
| 337 | "| where RecordType == 'AzureActiveDirectory'\n", |
| 338 | "| where Operation in ('Add service principal.',\n", |
| 339 | " 'Change user password.', \n", |
| 340 | " 'Add user.', \n", |
| 341 | " 'Add member to role.')\n", |
| 342 | "| where UserType == 'Regular' \n", |
| 343 | "| project OfficeId, TimeGenerated, Operation, OrganizationId, \n", |
| 344 | " OfficeWorkload, ResultStatus, OfficeObjectId, \n", |
| 345 | " UserId = tolower(UserId), ClientIP, ExtendedProperties\n", |
| 346 | "'''\n", |
| 347 | "\n", |
| 348 | "\n", |
| 349 | "office_ops_query = '''\n", |
| 350 | "OfficeActivity\n", |
| 351 | "| where TimeGenerated >= datetime({start})\n", |
| 352 | "| where TimeGenerated <= datetime({end})\n", |
| 353 | "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n", |
| 354 | "| extend UserAgent = extractjson(\"$[0].Value\", ExtendedProperties, typeof(string))\n", |
| 355 | "| union (\n", |
| 356 | " OfficeActivity \n", |
| 357 | " | where TimeGenerated >= datetime({start})\n", |
| 358 | " | where TimeGenerated <= datetime({end})\n", |
| 359 | " | where RecordType !in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n", |
| 360 | ")\n", |
| 361 | "| where UserType == 'Regular'\n", |
| 362 | "'''\n", |
| 363 | "\n", |
| 364 | "\n", |
| 365 | "office_ops_summary_query = '''\n", |
| 366 | "let timeRange=ago(30d);\n", |
| 367 | "let officeAuthentications = OfficeActivity\n", |
| 368 | "| where TimeGenerated >= timeRange\n", |
| 369 | "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n", |
| 370 | "| extend UserAgent = extractjson(\"$[0].Value\", ExtendedProperties, typeof(string))\n", |
| 371 | "| where Operation == \"UserLoggedIn\";\n", |
| 372 | "officeAuthentications\n", |
| 373 | "| union (\n", |
| 374 | " OfficeActivity \n", |
| 375 | " | where TimeGenerated >= timeRange\n", |
| 376 | " | where RecordType !in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n", |
| 377 | ")\n", |
| 378 | "| where UserType == 'Regular'\n", |
| 379 | "| extend RecordOp = strcat(RecordType, '-', Operation)\n", |
| 380 | "| summarize OperationCount=count() by RecordType, Operation, UserId, UserAgent, ClientIP, bin(TimeGenerated, 1h)\n", |
| 381 | "// render timeline\n", |
| 382 | "'''\n", |
| 383 | "\n", |
| 384 | "\n", |
| 385 | "office_logons_byua_query = '''\n", |
| 386 | "let end = datetime({end});\n", |
| 387 | "let threshold={threshold};\n", |
| 388 | "let start = end - 1d;\n", |
| 389 | "let hist_start = start - 30d;\n", |
| 390 | "let hist_end = end;\n", |
| 391 | "let officeAuthentications = OfficeActivity\n", |
| 392 | "| where TimeGenerated >= hist_start\n", |
| 393 | "| where TimeGenerated <= hist_end\n", |
| 394 | "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n", |
| 395 | "| extend UserAgent = extractjson(\"$[0].Value\", ExtendedProperties, typeof(string))\n", |
| 396 | "| where Operation == \"UserLoggedIn\";\n", |
| 397 | "let lookupWindow = end - start;\n", |
| 398 | "let lookupBin = lookupWindow / 2.0; \n", |
| 399 | "officeAuthentications \n", |
| 400 | "| project-rename Start = TimeGenerated\n", |
| 401 | "| extend TimeKey = bin(Start, lookupBin)\n", |
| 402 | "| join kind = inner (\n", |
| 403 | " officeAuthentications\n", |
| 404 | " | project-rename End = TimeGenerated\n", |
| 405 | " | extend TimeKey = range(bin(End - lookupWindow, lookupBin), bin(End, lookupBin), lookupBin)\n", |
| 406 | " | mvexpand TimeKey to typeof(datetime)\n", |
| 407 | ") on UserAgent, TimeKey\n", |
| 408 | "| project timeSpan = End - Start, UserId, ClientIP , UserAgent , Start, End\n", |
| 409 | "| summarize Count_ClientIP = dcount(ClientIP) by UserId\n", |
| 410 | "| where Count_ClientIP > threshold\n", |
| 411 | "| join kind=inner ( \n", |
| 412 | " officeAuthentications\n", |
| 413 | " | summarize minTime=min(TimeGenerated), maxTime=max(TimeGenerated) by UserId, UserAgent, ClientIP\n", |
| 414 | ") on UserAgent\n", |
| 415 | "'''\n", |
| 416 | "\n", |
| 417 | "office_logons_byuser_query = '''\n", |
| 418 | "let end = datetime({end});\n", |
| 419 | "let start = datetime({start});\n", |
| 420 | "let threshold={threshold};\n", |
| 421 | "let officeAuthentications = OfficeActivity\n", |
| 422 | "| where TimeGenerated >= start\n", |
| 423 | "| where TimeGenerated <= end\n", |
| 424 | "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\")\n", |
| 425 | "| extend UserAgent = extractjson(\"$[0].Value\", ExtendedProperties, typeof(string))\n", |
| 426 | "| where Operation == \"UserLoggedIn\";\n", |
| 427 | "let lookupWindow = 1d;\n", |
| 428 | "let lookupBin = lookupWindow / 2.0; \n", |
| 429 | "officeAuthentications \n", |
| 430 | "| project-rename Start = TimeGenerated\n", |
| 431 | "| extend TimeKey = bin(Start, lookupBin)\n", |
| 432 | "| join kind = inner (\n", |
| 433 | " officeAuthentications\n", |
| 434 | " | project-rename End = TimeGenerated\n", |
| 435 | " | extend TimeKey = range(bin(End - lookupWindow, lookupBin), bin(End, lookupBin), lookupBin)\n", |
| 436 | " | mvexpand TimeKey to typeof(datetime)\n", |
| 437 | ") on UserId, TimeKey\n", |
| 438 | "| project timeSpan = End - Start, UserId, ClientIP , UserAgent, Start, End\n", |
| 439 | "| summarize Count_ClientIP = dcount(ClientIP) by UserId\n", |
| 440 | "| where Count_ClientIP > threshold\n", |
| 441 | "| join kind=inner ( \n", |
| 442 | " officeAuthentications\n", |
| 443 | " | summarize minTime=min(TimeGenerated), maxTime=max(TimeGenerated) by UserId, UserAgent, ClientIP\n", |
| 444 | ") on UserId\n", |
| 445 | "'''\n", |
| 446 | "\n", |
| 447 | "# %kql -query office_logons_query\n", |
| 448 | "# office_logons_df = _kql_raw_result_.to_dataframe()\n", |
| 449 | "\n", |
| 450 | "#\n", |
| 451 | "# Description: New user agents associated with a clientIP for sharepoint file uploads/downloads. \n", |
| 452 | "#\n", |
| 453 | "# DataSource: #OfficeActivity\n", |
| 454 | "#\n", |
| 455 | "# Techniques: #Exfiltration\n", |
| 456 | "#\n", |
| 457 | "new_user_agents = '''\n", |
| 458 | "let end = datetime({end});\n", |
| 459 | "let start = datetime({end});\n", |
| 460 | "let hist_start = start - 30d;\n", |
| 461 | "let hist_end = start;\n", |
| 462 | "let historicalUA =\n", |
| 463 | "OfficeActivity\n", |
| 464 | "| where TimeGenerated >= hist_start\n", |
| 465 | "| where TimeGenerated <= hist_end\n", |
| 466 | "| where UserType == 'Regular'\n", |
| 467 | "| summarize op_count = count() by UserId, UserAgent, RecordType, Operation;\n", |
| 468 | "let recentUA = OfficeActivity\n", |
| 469 | "| where TimeGenerated >= start\n", |
| 470 | "| where TimeGenerated <= end\n", |
| 471 | "| where UserType == 'Regular'\n", |
| 472 | "| summarize op_count = count() by UserId, UserAgent, RecordType, Operation;\n", |
| 473 | "recentUA | join kind=leftanti (\n", |
| 474 | " historicalUA \n", |
| 475 | ") on UserId, UserAgent\n", |
| 476 | "| where not(isempty(UserId))\n", |
| 477 | "'''\n", |
| 478 | "\n", |
| 479 | "user_logon_anom_query = '''\n", |
| 480 | "let LogonEvents=() {{\n", |
| 481 | "let logonFail=OfficeActivity\n", |
| 482 | "| where TimeGenerated >= datetime({start})\n", |
| 483 | "| where TimeGenerated <= datetime({end})\n", |
| 484 | "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\") and ResultStatus =~ \"Failed\"\n", |
| 485 | "| project TimeGenerated, AccountName=split(UserId, \"@\").[0], AccountDomain = iff(RecordType == \"AzureActiveDirectoryAccountLogon\",UserDomain,split(UserId, \"@\").[1]), UserId, IpAddress=ClientIP, OrganizationId, \n", |
| 486 | "ActionType=\"LogonFailure\";\n", |
| 487 | "let logonSuccess=OfficeActivity\n", |
| 488 | "| where TimeGenerated >= datetime({start})\n", |
| 489 | "| where TimeGenerated <= datetime({end})\n", |
| 490 | "| where RecordType in (\"AzureActiveDirectoryAccountLogon\", \"AzureActiveDirectoryStsLogon\") and ResultStatus =~ \"Succeeded\"\n", |
| 491 | "| project TimeGenerated, AccountName=split(UserId, \"@\").[0], AccountDomain = iff(RecordType == \"AzureActiveDirectoryAccountLogon\",UserDomain,split(UserId, \"@\").[1]), UserId, IpAddress=ClientIP, OrganizationId, \n", |
| 492 | "ActionType=\"Logon\";\n", |
| 493 | " logonFail | union logonSuccess}}; \n", |
| 494 | "let logonSummary =\n", |
| 495 | " LogonEvents \n", |
| 496 | "| summarize count() by ActionType, IpAddress, tostring(AccountName), tostring(AccountDomain), UserId, OrganizationId, bin(TimeGenerated, 1m); \n", |
| 497 | "let logon_success = logonSummary | where ActionType == \"Logon\";\n", |
| 498 | "let logon_fail = logonSummary | where ActionType == \"LogonFailure\";\n", |
| 499 | "logon_fail | join kind = leftouter (logon_success) on IpAddress\n", |
| 500 | "| project TimeGenerated, IpAddress, failCount=count_, AccountName, OrganizationId, UserId, successCount=count_1 \n", |
| 501 | "| extend successRate = 1.0*successCount/(successCount+failCount)\n", |
| 502 | "| project TimeGenerated, IpAddress, AccountName, successRate, failCount, successCount, UserId, OrganizationId\n", |
| 503 | "'''" |
| 504 | ] |
| 505 | }, |
| 506 | { |
| 507 | "cell_type": "code", |
| 508 | "execution_count": null, |
| 509 | "metadata": {}, |
| 510 | "outputs": [], |
| 511 | "source": [ |
| 512 | "# set the origin time to the time of our alert\n", |
| 513 | "o365_query_times = mas.QueryTime(units='hours',\n", |
| 514 | " before=24, after=1, max_before=60, max_after=20)\n", |
| 515 | "o365_query_times.display()" |
| 516 | ] |
| 517 | }, |
| 518 | { |
| 519 | "cell_type": "markdown", |
| 520 | "metadata": {}, |
| 521 | "source": [ |
| 522 | "<a id='tenant_info'></a>[Contents](#contents)\n", |
| 523 | "## Tenant-wide Information" |
| 524 | ] |
| 525 | }, |
| 526 | { |
| 527 | "cell_type": "markdown", |
| 528 | "metadata": {}, |
| 529 | "source": [ |
| 530 | "<a id='activity_summary'></a>[Contents](#contents)\n", |
| 531 | "### Summary of O365 Activity Types\n", |
| 532 | "#### <font color=\"red\">Warning this query can be time consuming for large O365 subscriptions</font>" |
| 533 | ] |
| 534 | }, |
| 535 | { |
| 536 | "cell_type": "code", |
| 537 | "execution_count": null, |
| 538 | "metadata": {}, |
| 539 | "outputs": [], |
| 540 | "source": [ |
| 541 | "print('Getting data...', end=' ')\n", |
| 542 | "o365_query = office_ops_summary_query.format(start = o365_query_times.start, \n", |
| 543 | " end=o365_query_times.end)\n", |
| 544 | "%kql -query o365_query\n", |
| 545 | "office_ops_summary_df = _kql_raw_result_.to_dataframe()\n", |
| 546 | "print('done.')\n", |
| 547 | "(office_ops_summary_df\n", |
| 548 | " .assign(UserId = lambda x: x.UserId.str.lower())\n", |
| 549 | " .groupby(['RecordType', 'Operation'])\n", |
| 550 | " .aggregate({'ClientIP': 'nunique',\n", |
| 551 | " 'UserId': 'nunique',\n", |
| 552 | " 'OperationCount': 'sum'}))" |
| 553 | ] |
| 554 | }, |
| 555 | { |
| 556 | "cell_type": "markdown", |
| 557 | "metadata": {}, |
| 558 | "source": [ |
| 559 | "<a id='ip_variability'></a>[Contents](#contents)\n", |
| 560 | "### Variability of IP Address for users" |
| 561 | ] |
| 562 | }, |
| 563 | { |
| 564 | "cell_type": "code", |
| 565 | "execution_count": null, |
| 566 | "metadata": { |
| 567 | "scrolled": true |
| 568 | }, |
| 569 | "outputs": [], |
| 570 | "source": [ |
| 571 | "unique_ip_op_ua = (office_ops_summary_df.assign(UserId = lambda x: x.UserId.str.lower())\n", |
| 572 | " .groupby(['UserId', 'Operation'])\n", |
| 573 | " .aggregate({'ClientIP': 'nunique', 'OperationCount': 'sum'})).reset_index()\n", |
| 574 | "\n", |
| 575 | "user_ip_op = sns.catplot(x=\"ClientIP\", y=\"UserId\", hue='Operation', data=unique_ip_op_ua, height=5, aspect=2)\n", |
| 576 | "user_ip_op.fig.suptitle('Variability of IP Address Usage by user');" |
| 577 | ] |
| 578 | }, |
| 579 | { |
| 580 | "cell_type": "markdown", |
| 581 | "metadata": {}, |
| 582 | "source": [ |
| 583 | "<a id='aad_ops'></a>\n", |
| 584 | "### AAD Operations Changes to users and groups\n", |
| 585 | "\n", |
| 586 | "#### <font color=\"red\">WARNING: due to recent changes in data format this query is not currently functional</font>" |
| 587 | ] |
| 588 | }, |
| 589 | { |
| 590 | "cell_type": "code", |
| 591 | "execution_count": null, |
| 592 | "metadata": {}, |
| 593 | "outputs": [], |
| 594 | "source": [ |
| 595 | "print('Getting data...', end=' ')\n", |
| 596 | "o365_query = ad_changes_query.format(start = o365_query_times.start, \n", |
| 597 | " end=o365_query_times.end)\n", |
| 598 | "%kql -query o365_query\n", |
| 599 | "ad_changes_df = _kql_raw_result_.to_dataframe()\n", |
| 600 | "print('done.')\n", |
| 601 | "ad_changes_df[['TimeGenerated', 'Operation', \n", |
| 602 | " 'OfficeWorkload', 'ResultStatus', 'OfficeObjectId', 'UserId',\n", |
| 603 | " 'ClientIP']]" |
| 604 | ] |
| 605 | }, |
| 606 | { |
| 607 | "cell_type": "markdown", |
| 608 | "metadata": {}, |
| 609 | "source": [ |
| 610 | "<a id='logon_anomalies'></a>[Contents](#contents)\n", |
| 611 | "### Logon Anomalies\n", |
| 612 | "Logon failures from an ipaddress that then succeed.\n", |
| 613 | "\n", |
| 614 | "#### <font color=\"red\">WARNING: due to recent changes in data format this query is not currently functional</font>" |
| 615 | ] |
| 616 | }, |
| 617 | { |
| 618 | "cell_type": "code", |
| 619 | "execution_count": null, |
| 620 | "metadata": {}, |
| 621 | "outputs": [], |
| 622 | "source": [ |
| 623 | "print('Getting data...', end=' ')\n", |
| 624 | "o365_query = user_logon_anom_query.format(start = o365_query_times.start, \n", |
| 625 | " end=o365_query_times.end)\n", |
| 626 | "%kql -query o365_query\n", |
| 627 | "user_logon_anom_df = _kql_raw_result_.to_dataframe()\n", |
| 628 | "print('done.')\n", |
| 629 | "user_logon_anom_df.sort_values('failCount')" |
| 630 | ] |
| 631 | }, |
| 632 | { |
| 633 | "cell_type": "code", |
| 634 | "execution_count": null, |
| 635 | "metadata": {}, |
| 636 | "outputs": [], |
| 637 | "source": [ |
| 638 | "office_ops_summary_df" |
| 639 | ] |
| 640 | }, |
| 641 | { |
| 642 | "cell_type": "markdown", |
| 643 | "metadata": {}, |
| 644 | "source": [ |
| 645 | "<a id='acct_multi_geo'></a>[Contents](#contents)\n", |
| 646 | "### Accounts with multiple IPs and Geolocations" |
| 647 | ] |
| 648 | }, |
| 649 | { |
| 650 | "cell_type": "code", |
| 651 | "execution_count": null, |
| 652 | "metadata": {}, |
| 653 | "outputs": [], |
| 654 | "source": [ |
| 655 | "restrict_cols = ['RecordType', 'TimeGenerated', 'Operation',\n", |
| 656 | " 'UserId', 'ClientIP', 'UserAgent']\n", |
| 657 | "office_ops_summary = office_ops_summary_df[restrict_cols].assign(UserId = lambda x: x.UserId.str.lower())\n", |
| 658 | "unique_ip_op_ua['ClientIPCount'] = unique_ip_op_ua['ClientIP']\n", |
| 659 | "office_ops_merged = pd.merge(unique_ip_op_ua.query('ClientIP > 1').drop(columns='ClientIP'), \n", |
| 660 | " office_ops_summary,\n", |
| 661 | " on=['UserId', 'Operation'])\n", |
| 662 | "\n", |
| 663 | "client_ips = office_ops_merged.query('ClientIP != \"<null>\" & ClientIP != \"\"')['ClientIP'].drop_duplicates().tolist()\n", |
| 664 | "ip_entities = []\n", |
| 665 | "for ip in client_ips:\n", |
| 666 | " ip_entity = mas.IpAddress(Address=ip)\n", |
| 667 | " iplocation.lookup_ip(ip_entity=ip_entity)\n", |
| 668 | " ip_dict = {'Address': ip_entity.Address}\n", |
| 669 | " ip_dict.update(ip_entity.Location.properties)\n", |
| 670 | " ip_entities.append(pd.Series(ip_dict))\n", |
| 671 | "\n", |
| 672 | "ip_locs_df = pd.DataFrame(data=ip_entities)\n", |
| 673 | "ip_locs_df\n", |
| 674 | "\n", |
| 675 | "office_ops_summary_ip_loc = pd.merge(office_ops_merged, \n", |
| 676 | " ip_locs_df, left_on='ClientIP', \n", |
| 677 | " right_on='Address', how='left')\n", |
| 678 | "\n", |
| 679 | "(office_ops_summary_ip_loc.groupby(['UserId', 'CountryCode', 'City'])\n", |
| 680 | " .aggregate({'ClientIP': 'nunique', 'OperationCount': 'sum'})).reset_index()" |
| 681 | ] |
| 682 | }, |
| 683 | { |
| 684 | "cell_type": "markdown", |
| 685 | "metadata": {}, |
| 686 | "source": [ |
| 687 | "<a id='acct_multi_ips'></a>[Contents](#contents)\n", |
| 688 | "### User Logons where User has logged on from > N IP Address in period" |
| 689 | ] |
| 690 | }, |
| 691 | { |
| 692 | "cell_type": "code", |
| 693 | "execution_count": null, |
| 694 | "metadata": {}, |
| 695 | "outputs": [], |
| 696 | "source": [ |
| 697 | "th_wgt = widgets.IntSlider(value=1, min=1, max=50, step=1, description='Set IP Count Threshold', **WIDGET_DEFAULTS)\n", |
| 698 | "th_wgt" |
| 699 | ] |
| 700 | }, |
| 701 | { |
| 702 | "cell_type": "code", |
| 703 | "execution_count": null, |
| 704 | "metadata": {}, |
| 705 | "outputs": [], |
| 706 | "source": [ |
| 707 | "print('Getting data...', end=' ')\n", |
| 708 | "o365_query = office_logons_byuser_query.format(start = o365_query_times.start, \n", |
| 709 | " end=o365_query_times.end,\n", |
| 710 | " threshold=th_wgt.value)\n", |
| 711 | "%kql -query o365_query\n", |
| 712 | "office_logons_byuser_df = _kql_raw_result_.to_dataframe()\n", |
| 713 | "print('done.')\n", |
| 714 | "office_logons_byuser_df\n" |
| 715 | ] |
| 716 | }, |
| 717 | { |
| 718 | "cell_type": "markdown", |
| 719 | "metadata": {}, |
| 720 | "source": [ |
| 721 | "<a id='ip_op_matrix'></a>[Contents](#contents)\n", |
| 722 | "### Matrix of Selected Operation Types by Location and IP" |
| 723 | ] |
| 724 | }, |
| 725 | { |
| 726 | "cell_type": "code", |
| 727 | "execution_count": null, |
| 728 | "metadata": {}, |
| 729 | "outputs": [], |
| 730 | "source": [ |
| 731 | "print('Getting data...', end=' ')\n", |
| 732 | "o365_query = office_ops_query.format(start=o365_query_times.start, \n", |
| 733 | " end=o365_query_times.end)\n", |
| 734 | "%kql -query o365_query\n", |
| 735 | "office_ops_df = _kql_raw_result_.to_dataframe()\n", |
| 736 | "print('done.') \n", |
| 737 | "\n", |
| 738 | "# Get Locations for distinct IPs\n", |
| 739 | "client_ips = office_ops_df.query('ClientIP != \"<null>\" & ClientIP != \"\"')['ClientIP'].drop_duplicates().tolist()\n", |
| 740 | "ip_entities = []\n", |
| 741 | "for ip in client_ips:\n", |
| 742 | " ip_entity = mas.IpAddress(Address=ip)\n", |
| 743 | " iplocation.lookup_ip(ip_entity=ip_entity)\n", |
| 744 | " ip_dict = {'Address': ip_entity.Address}\n", |
| 745 | " ip_dict.update(ip_entity.Location.properties)\n", |
| 746 | " ip_entities.append(pd.Series(ip_dict))\n", |
| 747 | "\n", |
| 748 | "ip_locs_df = pd.DataFrame(data=ip_entities)\n", |
| 749 | "\n", |
| 750 | "# Get rid of unneeded columns\n", |
| 751 | "restrict_cols = ['OfficeId', 'RecordType', 'TimeGenerated', 'Operation',\n", |
| 752 | " 'OrganizationId', 'UserType', 'UserKey', 'OfficeWorkload',\n", |
| 753 | " 'ResultStatus', 'OfficeObjectId', 'UserId', 'ClientIP','UserAgent']\n", |
| 754 | "office_ops_restr = office_ops_df[restrict_cols]\n", |
| 755 | "\n", |
| 756 | "# Merge main DF with IP location data\n", |
| 757 | "office_ops_locs = pd.merge(office_ops_restr, ip_locs_df, how='right', left_on='ClientIP', right_on='Address',\n", |
| 758 | " indicator=True)\n", |
| 759 | "\n", |
| 760 | "limit_op_types = ['FileDownloaded', 'FileModified','FileUploaded',\n", |
| 761 | " 'UserLoggedIn','UserLoginFailed','Add member to role.',\n", |
| 762 | " 'Add user.','Change user password.', 'Update user.']\n", |
| 763 | "\n", |
| 764 | "office_ops_locs = office_ops_locs[office_ops_locs.Operation.isin(limit_op_types)]\n", |
| 765 | "\n", |
| 766 | "# Calculate operations grouped by location and operation type\n", |
| 767 | "cm = sns.light_palette(\"yellow\", as_cmap=True)\n", |
| 768 | "country_by_op_count = (office_ops_locs[['Operation', 'RecordType', 'CountryCode', 'City']]\n", |
| 769 | " .groupby(['CountryCode', 'City', 'Operation'])\n", |
| 770 | " .count())\n", |
| 771 | "display(country_by_op_count.unstack().fillna(0).rename(columns={'RecordType':'OperationCount'})\n", |
| 772 | " .style.background_gradient(cmap=cm))\n", |
| 773 | "\n", |
| 774 | "# Group by Client IP, Country, Operation\n", |
| 775 | "clientip_by_op_count = (office_ops_locs[['ClientIP', 'Operation', 'RecordType', 'CountryCode']]\n", |
| 776 | " .groupby(['ClientIP', 'CountryCode', 'Operation'])\n", |
| 777 | " .count())\n", |
| 778 | "\n", |
| 779 | "(clientip_by_op_count.unstack().fillna(0).rename(columns={'RecordType':'OperationCount'})\n", |
| 780 | " .style.background_gradient(cmap=cm))" |
| 781 | ] |
| 782 | }, |
| 783 | { |
| 784 | "cell_type": "markdown", |
| 785 | "metadata": {}, |
| 786 | "source": [ |
| 787 | "<a id='geo_map_tenant'></a>[Contents](#contents)\n", |
| 788 | "### Geolocation Map of Client IPs" |
| 789 | ] |
| 790 | }, |
| 791 | { |
| 792 | "cell_type": "code", |
| 793 | "execution_count": null, |
| 794 | "metadata": {}, |
| 795 | "outputs": [], |
| 796 | "source": [ |
| 797 | "from msticpy.nbtools.foliummap import FoliumMap\n", |
| 798 | "folium_map = FoliumMap()\n", |
| 799 | "\n", |
| 800 | "def get_row_ip_loc(row):\n", |
| 801 | " try:\n", |
| 802 | " _, ip_entity = iplocation.lookup_ip(ip_address=row.ClientIP)\n", |
| 803 | " return ip_entity\n", |
| 804 | " except ValueError:\n", |
| 805 | " return None\n", |
| 806 | " \n", |
| 807 | "off_ip_locs = (office_ops_df[['ClientIP']]\n", |
| 808 | " .drop_duplicates()\n", |
| 809 | " .apply(get_row_ip_loc, axis=1)\n", |
| 810 | " .tolist())\n", |
| 811 | "ip_locs = [ip_list[0] for ip_list in off_ip_locs if ip_list]\n", |
| 812 | " \n", |
| 813 | "display(HTML('<h3>External IP Addresses seen in Office Activity</h3>'))\n", |
| 814 | "display(HTML('Numbered circles indicate multiple items - click to expand.'))\n", |
| 815 | "\n", |
| 816 | "\n", |
| 817 | "icon_props = {'color': 'purple'}\n", |
| 818 | "folium_map.add_ip_cluster(ip_entities=ip_locs,\n", |
| 819 | " **icon_props)\n", |
| 820 | "display(folium_map.folium_map)" |
| 821 | ] |
| 822 | }, |
| 823 | { |
| 824 | "cell_type": "markdown", |
| 825 | "metadata": {}, |
| 826 | "source": [ |
| 827 | "<a id='distinct_uas'></a>[Contents](#contents)\n", |
| 828 | "### Distinct User Agent Strings in Use\n", |
| 829 | "\n", |
| 830 | "#### <font color=\"red\">WARNING: due to recent changes in data format this query is not currently functional</font>" |
| 831 | ] |
| 832 | }, |
| 833 | { |
| 834 | "cell_type": "code", |
| 835 | "execution_count": null, |
| 836 | "metadata": {}, |
| 837 | "outputs": [], |
| 838 | "source": [ |
| 839 | "\n", |
| 840 | "display(Markdown('### IPs and User Agents - frequency of use'))\n", |
| 841 | "display(Markdown('Distinct UserAgents by num of operations'))\n", |
| 842 | "office_ops_df[['UserAgent', 'Operation']].groupby(['UserAgent']).count().rename(columns={'Operation':'OpCount'})\n" |
| 843 | ] |
| 844 | }, |
| 845 | { |
| 846 | "cell_type": "markdown", |
| 847 | "metadata": {}, |
| 848 | "source": [ |
| 849 | "<a id='op_timeline'></a>[Contents](#contents)\n", |
| 850 | "### Graphical Activity Timeline" |
| 851 | ] |
| 852 | }, |
| 853 | { |
| 854 | "cell_type": "code", |
| 855 | "execution_count": null, |
| 856 | "metadata": {}, |
| 857 | "outputs": [], |
| 858 | "source": [ |
| 859 | "with warnings.catch_warnings():\n", |
| 860 | " warnings.simplefilter(\"ignore\")\n", |
| 861 | " display(Markdown('### Change in rate of Activity Class (RecordType) and Operation'))\n", |
| 862 | " sns.relplot(data=office_ops_summary_df, x='TimeGenerated', y='OperationCount', kind='line', aspect=2, \n", |
| 863 | " hue='RecordType')\n", |
| 864 | " sns.relplot(data=office_ops_summary_df.query('RecordType == \"SharePointFileOperation\"'), \n", |
| 865 | " x='TimeGenerated', y='OperationCount', hue='Operation', kind='line', aspect=2)\n", |
| 866 | "\n" |
| 867 | ] |
| 868 | }, |
| 869 | { |
| 870 | "cell_type": "markdown", |
| 871 | "metadata": {}, |
| 872 | "source": [ |
| 873 | "<a id='user_activity_counts'></a>[Contents](#contents)\n", |
| 874 | "### Users With largest Activity Type Count" |
| 875 | ] |
| 876 | }, |
| 877 | { |
| 878 | "cell_type": "code", |
| 879 | "execution_count": null, |
| 880 | "metadata": {}, |
| 881 | "outputs": [], |
| 882 | "source": [ |
| 883 | "with warnings.catch_warnings():\n", |
| 884 | " warnings.simplefilter(\"ignore\")\n", |
| 885 | " display(Markdown('### Identify Users/IPs with largest operation count'))\n", |
| 886 | " office_ops = office_ops_summary_df.assign(Account=lambda x: \n", |
| 887 | " (x.UserId.str.extract('([^@]+)@.*', expand=False)).str.lower())\n", |
| 888 | "\n", |
| 889 | " limit_op_types = ['FileDownloaded', 'FileModified','FileUploaded',\n", |
| 890 | " 'UserLoggedIn','UserLoginFailed','Add member to role.',\n", |
| 891 | " 'Add user.','Change user password.', 'Update user.']\n", |
| 892 | " office_ops = office_ops[office_ops.Operation.isin(limit_op_types)]\n", |
| 893 | " \n", |
| 894 | " sns.catplot(data=office_ops, y='Account', x='OperationCount', \n", |
| 895 | " hue='Operation', aspect=2)\n", |
| 896 | " display(office_ops.pivot_table('OperationCount', index=['Account'], \n", |
| 897 | " columns='Operation').style.bar(color='orange', align='mid'))" |
| 898 | ] |
| 899 | }, |
| 900 | { |
| 901 | "cell_type": "code", |
| 902 | "execution_count": null, |
| 903 | "metadata": {}, |
| 904 | "outputs": [], |
| 905 | "source": [ |
| 906 | "new_df = office_ops_df[['OfficeId', 'RecordType', 'TimeGenerated', 'Operation',\n", |
| 907 | " 'OrganizationId', 'UserType', 'UserKey', 'OfficeWorkload',\n", |
| 908 | " 'ResultStatus', 'OfficeObjectId', 'UserId', 'ClientIP','UserAgent']]\n", |
| 909 | "pd.merge(new_df, ip_locs_df, how='left', left_on='ClientIP', right_on='Address')" |
| 910 | ] |
| 911 | }, |
| 912 | { |
| 913 | "cell_type": "markdown", |
| 914 | "metadata": {}, |
| 915 | "source": [ |
| 916 | "<a id='o365_user_inv'></a>[Contents](#contents)\n", |
| 917 | "## Office User Investigation" |
| 918 | ] |
| 919 | }, |
| 920 | { |
| 921 | "cell_type": "code", |
| 922 | "execution_count": null, |
| 923 | "metadata": {}, |
| 924 | "outputs": [], |
| 925 | "source": [ |
| 926 | "# set the origin time to the time of our alert\n", |
| 927 | "o365_query_times_user = mas.QueryTime(units='days',\n", |
| 928 | " before=10, after=1, max_before=60, max_after=20, auto_display=True)\n" |
| 929 | ] |
| 930 | }, |
| 931 | { |
| 932 | "cell_type": "code", |
| 933 | "execution_count": null, |
| 934 | "metadata": {}, |
| 935 | "outputs": [], |
| 936 | "source": [ |
| 937 | "distinct_users = office_ops_df[['UserId']].sort_values('UserId')['UserId'].str.lower().drop_duplicates().tolist()\n", |
| 938 | "distinct_users\n", |
| 939 | "user_select = mas.SelectString(description='Select User Id', item_list=distinct_users, auto_display=True)\n", |
| 940 | " # (items=distinct_users)" |
| 941 | ] |
| 942 | }, |
| 943 | { |
| 944 | "cell_type": "markdown", |
| 945 | "metadata": {}, |
| 946 | "source": [ |
| 947 | "<a id='user_act_summary'></a>[Contents](#contents)\n", |
| 948 | "### Activity Summary" |
| 949 | ] |
| 950 | }, |
| 951 | { |
| 952 | "cell_type": "code", |
| 953 | "execution_count": null, |
| 954 | "metadata": {}, |
| 955 | "outputs": [], |
| 956 | "source": [ |
| 957 | "# Provides a summary view of a given account's activity\n", |
| 958 | "# For use when investigating an account that has been identified as having associated suspect activity or been otherwise compromised. \n", |
| 959 | "# All office activity by UserName using UI to set Time range\n", |
| 960 | "# Tags: #Persistence, #Discovery, #Lateral Movement, #Collection\n", |
| 961 | "\n", |
| 962 | "user_activity_query = '''\n", |
| 963 | "OfficeActivity\n", |
| 964 | "| where TimeGenerated >= datetime({start})\n", |
| 965 | "| where TimeGenerated <= datetime({end})\n", |
| 966 | "| where UserKey has \"{user}\" or UserId has \"{user}\"\n", |
| 967 | "'''\n", |
| 968 | "print('Getting data...', end=' ')\n", |
| 969 | "o365_query = user_activity_query.format(start=o365_query_times_user.start, \n", |
| 970 | " end=o365_query_times_user.end,\n", |
| 971 | " user=user_select.value)\n", |
| 972 | "%kql -query o365_query\n", |
| 973 | "user_activity_df = _kql_raw_result_.to_dataframe()\n", |
| 974 | "print('done.')\n", |
| 975 | "user_activity_df" |
| 976 | ] |
| 977 | }, |
| 978 | { |
| 979 | "cell_type": "markdown", |
| 980 | "metadata": {}, |
| 981 | "source": [ |
| 982 | "<a id='user_op_count'></a>[Contents](#contents)\n", |
| 983 | "### Operation Breakdown for User" |
| 984 | ] |
| 985 | }, |
| 986 | { |
| 987 | "cell_type": "code", |
| 988 | "execution_count": null, |
| 989 | "metadata": {}, |
| 990 | "outputs": [], |
| 991 | "source": [ |
| 992 | "my_df = (user_activity_df[['OfficeId', 'RecordType', 'TimeGenerated', 'Operation',\n", |
| 993 | " 'ResultStatus', 'UserId', 'ClientIP','UserAgent']]\n", |
| 994 | " .groupby(['Operation', 'ResultStatus', 'ClientIP'])\n", |
| 995 | " .aggregate({'OfficeId': 'count'})\n", |
| 996 | " .rename(columns={'OfficeId': 'OperationCount', 'ClientIP': 'IPCount'})\n", |
| 997 | " .reset_index())\n", |
| 998 | "sns.catplot(x='OperationCount', y=\"Operation\", hue=\"ClientIP\", jitter=False, data=my_df, aspect=2.5);\n" |
| 999 | ] |
| 1000 | }, |
| 1001 | { |
| 1002 | "cell_type": "markdown", |
| 1003 | "metadata": {}, |
| 1004 | "source": [ |
| 1005 | "<a id='user_ip_counts'></a>[Contents](#contents)\n", |
| 1006 | "### IP Count for Different User Operations " |
| 1007 | ] |
| 1008 | }, |
| 1009 | { |
| 1010 | "cell_type": "code", |
| 1011 | "execution_count": null, |
| 1012 | "metadata": {}, |
| 1013 | "outputs": [], |
| 1014 | "source": [ |
| 1015 | "my_df2 = (user_activity_df[['OfficeId', 'RecordType', 'TimeGenerated', 'Operation',\n", |
| 1016 | " 'ResultStatus', 'UserId', 'ClientIP','UserAgent']]\n", |
| 1017 | " .groupby(['Operation'])\n", |
| 1018 | " .aggregate({'OfficeId': 'count', 'ClientIP': 'nunique'})\n", |
| 1019 | " .rename(columns={'OfficeId': 'OperationCount', 'ClientIP': 'IPCount'})\n", |
| 1020 | " .reset_index())\n", |
| 1021 | "sns.barplot(x='IPCount', y=\"Operation\", data=my_df2);" |
| 1022 | ] |
| 1023 | }, |
| 1024 | { |
| 1025 | "cell_type": "markdown", |
| 1026 | "metadata": {}, |
| 1027 | "source": [ |
| 1028 | "<a id='user_act_timeline'></a>[Contents](#contents)\n", |
| 1029 | "### Activity Timeline" |
| 1030 | ] |
| 1031 | }, |
| 1032 | { |
| 1033 | "cell_type": "code", |
| 1034 | "execution_count": null, |
| 1035 | "metadata": {}, |
| 1036 | "outputs": [], |
| 1037 | "source": [ |
| 1038 | "nbdisp.display_timeline(data=user_activity_df,\n", |
| 1039 | " title='Office Operations',\n", |
| 1040 | " source_columns=['OfficeWorkload', 'Operation', 'ClientIP', 'ResultStatus'],\n", |
| 1041 | " height=200)" |
| 1042 | ] |
| 1043 | }, |
| 1044 | { |
| 1045 | "cell_type": "markdown", |
| 1046 | "metadata": {}, |
| 1047 | "source": [ |
| 1048 | "<a id='user_geomap'></a>[Contents](#contents)\n", |
| 1049 | "### User IP GeoMap" |
| 1050 | ] |
| 1051 | }, |
| 1052 | { |
| 1053 | "cell_type": "code", |
| 1054 | "execution_count": null, |
| 1055 | "metadata": {}, |
| 1056 | "outputs": [], |
| 1057 | "source": [ |
| 1058 | "def get_row_ip_loc(row):\n", |
| 1059 | " try:\n", |
| 1060 | " _, ip_entity = iplocation.lookup_ip(ip_address=row.ClientIP)\n", |
| 1061 | " return ip_entity\n", |
| 1062 | " except ValueError:\n", |
| 1063 | " return None\n", |
| 1064 | " \n", |
| 1065 | "from msticpy.nbtools.foliummap import FoliumMap\n", |
| 1066 | "folium_map = FoliumMap()\n", |
| 1067 | "off_ip_locs = (user_activity_df[['ClientIP']]\n", |
| 1068 | " .drop_duplicates()\n", |
| 1069 | " .apply(get_row_ip_loc, axis=1)\n", |
| 1070 | " .tolist())\n", |
| 1071 | "ip_locs = [ip_list[0] for ip_list in off_ip_locs if ip_list]\n", |
| 1072 | " \n", |
| 1073 | "display(HTML('<h3>External IP Addresses seen in Office Activity</h3>'))\n", |
| 1074 | "display(HTML('Numbered circles indicate multiple items - click to expand.'))\n", |
| 1075 | "\n", |
| 1076 | "\n", |
| 1077 | "icon_props = {'color': 'purple'}\n", |
| 1078 | "folium_map.add_ip_cluster(ip_entities=ip_locs,\n", |
| 1079 | " **icon_props)\n", |
| 1080 | "display(folium_map.folium_map)" |
| 1081 | ] |
| 1082 | }, |
| 1083 | { |
| 1084 | "cell_type": "markdown", |
| 1085 | "metadata": {}, |
| 1086 | "source": [ |
| 1087 | "<a id='ips_in_azure'></a>[Contents](#contents)\n", |
| 1088 | "### Check for User IPs in Azure Network Flow Data\n", |
| 1089 | "The full data is available in the Dataframe ```az_net_query_byip```" |
| 1090 | ] |
| 1091 | }, |
| 1092 | { |
| 1093 | "cell_type": "code", |
| 1094 | "execution_count": null, |
| 1095 | "metadata": {}, |
| 1096 | "outputs": [], |
| 1097 | "source": [ |
| 1098 | "if ('AzureNetworkAnalytics_CL' not in table_index or\n", |
| 1099 | " table_index['AzureNetworkAnalytics_CL'] == 0):\n", |
| 1100 | " display(Markdown('<font color=\"red\"><h2>Warning. Azure network flow data not available.</h2></font><br>'\n", |
| 1101 | " 'This section of the notebook is not useable with the current workspace.'))\n", |
| 1102 | " \n", |
| 1103 | "# Azure Network Analytics Base Query\n", |
| 1104 | "az_net_analytics_query =r'''\n", |
| 1105 | "AzureNetworkAnalytics_CL \n", |
| 1106 | "| where SubType_s == 'FlowLog'\n", |
| 1107 | "| where FlowStartTime_t >= datetime({start})\n", |
| 1108 | "| where FlowEndTime_t <= datetime({end})\n", |
| 1109 | "| project TenantId, TimeGenerated, \n", |
| 1110 | " FlowStartTime = FlowStartTime_t, \n", |
| 1111 | " FlowEndTime = FlowEndTime_t, \n", |
| 1112 | " FlowIntervalEndTime = FlowIntervalEndTime_t, \n", |
| 1113 | " FlowType = FlowType_s,\n", |
| 1114 | " ResourceGroup = split(VM_s, '/')[0],\n", |
| 1115 | " VMName = split(VM_s, '/')[1],\n", |
| 1116 | " VMIPAddress = VMIP_s, \n", |
| 1117 | " PublicIPs = extractall(@\"([\\d\\.]+)[|\\d]+\", dynamic([1]), PublicIPs_s),\n", |
| 1118 | " SrcIP = SrcIP_s,\n", |
| 1119 | " DestIP = DestIP_s,\n", |
| 1120 | " ExtIP = iif(FlowDirection_s == 'I', SrcIP_s, DestIP_s),\n", |
| 1121 | " L4Protocol = L4Protocol_s, \n", |
| 1122 | " L7Protocol = L7Protocol_s, \n", |
| 1123 | " DestPort = DestPort_d, \n", |
| 1124 | " FlowDirection = FlowDirection_s,\n", |
| 1125 | " AllowedOutFlows = AllowedOutFlows_d, \n", |
| 1126 | " AllowedInFlows = AllowedInFlows_d,\n", |
| 1127 | " DeniedInFlows = DeniedInFlows_d, \n", |
| 1128 | " DeniedOutFlows = DeniedOutFlows_d,\n", |
| 1129 | " RemoteRegion = AzureRegion_s,\n", |
| 1130 | " VMRegion = Region_s\n", |
| 1131 | "| extend AllExtIPs = iif(isempty(PublicIPs), pack_array(ExtIP), \n", |
| 1132 | " iif(isempty(ExtIP), PublicIPs, array_concat(PublicIPs, pack_array(ExtIP)))\n", |
| 1133 | " )\n", |
| 1134 | "| project-away ExtIP\n", |
| 1135 | "| mvexpand AllExtIPs\n", |
| 1136 | "{where_clause}\n", |
| 1137 | "'''\n", |
| 1138 | "\n", |
| 1139 | "# Build the query parameters\n", |
| 1140 | "all_user_ips = user_activity_df['ClientIP'].drop_duplicates().tolist()\n", |
| 1141 | "all_user_ips = [ip for ip in all_user_ips if ip and ip != '<null>']\n", |
| 1142 | "ip_list = ','.join(['\\'{}\\''.format(i) for i in all_user_ips])\n", |
| 1143 | "\n", |
| 1144 | "az_ip_where = f'''\n", |
| 1145 | "| where (AllExtIPs in ({ip_list}) \n", |
| 1146 | " or SrcIP in ({ip_list}) \n", |
| 1147 | " or DestIP in ({ip_list}) \n", |
| 1148 | " ) and \n", |
| 1149 | " (AllowedOutFlows > 0 or AllowedInFlows > 0)'''\n", |
| 1150 | "print('getting data...')\n", |
| 1151 | "az_net_query_byip = az_net_analytics_query.format(where_clause=az_ip_where,\n", |
| 1152 | " start=o365_query_times_user.start,\n", |
| 1153 | " end=o365_query_times_user.end)\n", |
| 1154 | "\n", |
| 1155 | "net_default_cols = ['FlowStartTime', 'FlowEndTime', 'VMName', 'VMIPAddress', \n", |
| 1156 | " 'PublicIPs', 'SrcIP', 'DestIP', 'L4Protocol', 'L7Protocol',\n", |
| 1157 | " 'DestPort', 'FlowDirection', 'AllowedOutFlows', \n", |
| 1158 | " 'AllowedInFlows']\n", |
| 1159 | "\n", |
| 1160 | "%kql -query az_net_query_byip\n", |
| 1161 | "az_net_comms_df = _kql_raw_result_.to_dataframe()\n", |
| 1162 | "az_net_comms_df[net_default_cols]\n", |
| 1163 | "\n", |
| 1164 | "import warnings\n", |
| 1165 | "\n", |
| 1166 | "with warnings.catch_warnings():\n", |
| 1167 | " warnings.simplefilter(\"ignore\")\n", |
| 1168 | " \n", |
| 1169 | " az_net_comms_df['TotalAllowedFlows'] = az_net_comms_df['AllowedOutFlows'] + az_net_comms_df['AllowedInFlows']\n", |
| 1170 | " sns.catplot(x=\"L7Protocol\", y=\"TotalAllowedFlows\", col=\"FlowDirection\", data=az_net_comms_df)\n", |
| 1171 | " sns.relplot(x=\"FlowStartTime\", y=\"TotalAllowedFlows\", \n", |
| 1172 | " col=\"FlowDirection\", kind=\"line\", \n", |
| 1173 | " hue=\"L7Protocol\", data=az_net_comms_df).set_xticklabels(rotation=50)\n", |
| 1174 | "\n", |
| 1175 | "cols = ['VMName', 'VMIPAddress', 'PublicIPs', 'SrcIP', 'DestIP', 'L4Protocol',\n", |
| 1176 | " 'L7Protocol', 'DestPort', 'FlowDirection', 'AllExtIPs', 'TotalAllowedFlows']\n", |
| 1177 | "flow_index = az_net_comms_df[cols].copy()\n", |
| 1178 | "def get_source_ip(row):\n", |
| 1179 | " if row.FlowDirection == 'O':\n", |
| 1180 | " return row.VMIPAddress if row.VMIPAddress else row.SrcIP\n", |
| 1181 | " else:\n", |
| 1182 | " return row.AllExtIPs if row.AllExtIPs else row.DestIP\n", |
| 1183 | " \n", |
| 1184 | "def get_dest_ip(row):\n", |
| 1185 | " if row.FlowDirection == 'O':\n", |
| 1186 | " return row.AllExtIPs if row.AllExtIPs else row.DestIP\n", |
| 1187 | " else:\n", |
| 1188 | " return row.VMIPAddress if row.VMIPAddress else row.SrcIP\n", |
| 1189 | "\n", |
| 1190 | "flow_index['source'] = flow_index.apply(get_source_ip, axis=1)\n", |
| 1191 | "flow_index['target'] = flow_index.apply(get_dest_ip, axis=1)\n", |
| 1192 | "flow_index['value'] = flow_index['L7Protocol']\n", |
| 1193 | "\n", |
| 1194 | "cm = sns.light_palette(\"green\", as_cmap=True)\n", |
| 1195 | "with warnings.catch_warnings():\n", |
| 1196 | " warnings.simplefilter(\"ignore\")\n", |
| 1197 | " display(flow_index[['source', 'target', 'value', 'L7Protocol', \n", |
| 1198 | " 'FlowDirection', 'TotalAllowedFlows']]\n", |
| 1199 | " .groupby(['source', 'target', 'value', 'L7Protocol', 'FlowDirection'])\n", |
| 1200 | " .sum().unstack().style.background_gradient(cmap=cm))\n", |
| 1201 | "\n", |
| 1202 | "nbdisp.display_timeline(data=az_net_comms_df.query('AllowedOutFlows > 0'),\n", |
| 1203 | " overlay_data=az_net_comms_df.query('AllowedInFlows > 0'),\n", |
| 1204 | " title='Network Flows (out=blue, in=green)',\n", |
| 1205 | " time_column='FlowStartTime',\n", |
| 1206 | " source_columns=['FlowType', 'AllExtIPs', 'L7Protocol', 'FlowDirection'],\n", |
| 1207 | " height=300)" |
| 1208 | ] |
| 1209 | }, |
| 1210 | { |
| 1211 | "cell_type": "markdown", |
| 1212 | "metadata": {}, |
| 1213 | "source": [ |
| 1214 | "<a id='o365_cluster'></a>[Contents](#contents)\n", |
| 1215 | "## Rare Combinations of Country/UserAgent/Operation Type\n", |
| 1216 | "The dataframe below lists combinations in the time period that had less than 3 instances. This might help you to spot relatively unusual activity." |
| 1217 | ] |
| 1218 | }, |
| 1219 | { |
| 1220 | "cell_type": "code", |
| 1221 | "execution_count": null, |
| 1222 | "metadata": {}, |
| 1223 | "outputs": [], |
| 1224 | "source": [ |
| 1225 | "\n", |
| 1226 | "from msticpy.sectools.eventcluster import (dbcluster_events, \n", |
| 1227 | " add_process_features, \n", |
| 1228 | " char_ord_score,\n", |
| 1229 | " token_count,\n", |
| 1230 | " delim_count)\n", |
| 1231 | "\n", |
| 1232 | "restrict_cols = ['OfficeId', 'RecordType', 'TimeGenerated', 'Operation',\n", |
| 1233 | " 'OrganizationId', 'UserType', 'UserKey', 'OfficeWorkload',\n", |
| 1234 | " 'ResultStatus', 'OfficeObjectId', 'UserId', 'ClientIP','UserAgent']\n", |
| 1235 | "feature_office_ops = office_ops_df[restrict_cols]\n", |
| 1236 | "feature_office_ops = ( pd.merge(feature_office_ops, \n", |
| 1237 | " ip_locs_df, how='left', \n", |
| 1238 | " left_on='ClientIP', right_on='Address')\n", |
| 1239 | " .fillna(''))\n", |
| 1240 | "\n", |
| 1241 | "# feature_office_ops = office_ops_df.copy()\n", |
| 1242 | "\n", |
| 1243 | "feature_office_ops['country_num'] = feature_office_ops.apply(lambda x: char_ord_score(x, 'CountryCode') if x.CountryCode else 0, axis=1)\n", |
| 1244 | "feature_office_ops['ua_tokens'] = feature_office_ops.apply(lambda x: char_ord_score(x, 'UserAgent'), axis=1)\n", |
| 1245 | "feature_office_ops['user_num'] = feature_office_ops.apply(lambda x: char_ord_score(x, 'UserId'), axis=1)\n", |
| 1246 | "feature_office_ops['op_num'] = feature_office_ops.apply(lambda x: char_ord_score(x, 'Operation'), axis=1)\n", |
| 1247 | "\n", |
| 1248 | "# you might need to play around with the max_cluster_distance parameter.\n", |
| 1249 | "# decreasing this gives more clusters.\n", |
| 1250 | "(clustered_ops, dbcluster, x_data) = dbcluster_events(data=feature_office_ops,\n", |
| 1251 | " cluster_columns=['country_num',\n", |
| 1252 | " 'op_num',\n", |
| 1253 | " 'ua_tokens'],\n", |
| 1254 | " time_column='TimeGenerated',\n", |
| 1255 | " max_cluster_distance=0.0001)\n", |
| 1256 | "print('Number of input events:', len(feature_office_ops))\n", |
| 1257 | "print('Number of clustered events:', len(clustered_ops))\n", |
| 1258 | "display(Markdown('#### Rarest combinations'))\n", |
| 1259 | "display(clustered_ops[['TimeGenerated', 'RecordType',\n", |
| 1260 | " 'Operation', 'UserId', 'UserAgent', 'ClusterSize',\n", |
| 1261 | " 'OfficeObjectId', 'CountryName']]\n", |
| 1262 | " .query('ClusterSize <= 2')\n", |
| 1263 | " .sort_values('ClusterSize', ascending=True))\n", |
| 1264 | "display(Markdown('#### Most common operations'))\n", |
| 1265 | "display((clustered_ops[['RecordType', 'Operation', 'ClusterSize']]\n", |
| 1266 | " .sort_values('ClusterSize', ascending=False)\n", |
| 1267 | " .head(10)))" |
| 1268 | ] |
| 1269 | }, |
| 1270 | { |
| 1271 | "cell_type": "markdown", |
| 1272 | "metadata": { |
| 1273 | "hidden": true |
| 1274 | }, |
| 1275 | "source": [ |
| 1276 | "<a id='appendices'></a>[Contents](#contents)\n", |
| 1277 | "# Appendices" |
| 1278 | ] |
| 1279 | }, |
| 1280 | { |
| 1281 | "cell_type": "markdown", |
| 1282 | "metadata": {}, |
| 1283 | "source": [ |
| 1284 | "## Available DataFrames" |
| 1285 | ] |
| 1286 | }, |
| 1287 | { |
| 1288 | "cell_type": "code", |
| 1289 | "execution_count": null, |
| 1290 | "metadata": { |
| 1291 | "scrolled": true |
| 1292 | }, |
| 1293 | "outputs": [], |
| 1294 | "source": [ |
| 1295 | "print('List of current DataFrames in Notebook')\n", |
| 1296 | "print('-' * 50)\n", |
| 1297 | "current_vars = list(locals().keys())\n", |
| 1298 | "for var_name in current_vars:\n", |
| 1299 | " if isinstance(locals()[var_name], pd.DataFrame) and not var_name.startswith('_'):\n", |
| 1300 | " print(var_name)" |
| 1301 | ] |
| 1302 | }, |
| 1303 | { |
| 1304 | "cell_type": "markdown", |
| 1305 | "metadata": { |
| 1306 | "heading_collapsed": true, |
| 1307 | "tags": [ |
| 1308 | "todo" |
| 1309 | ] |
| 1310 | }, |
| 1311 | "source": [ |
| 1312 | "## Saving Data to Excel\n", |
| 1313 | "To save the contents of a pandas DataFrame to an Excel spreadsheet\n", |
| 1314 | "use the following syntax\n", |
| 1315 | "```\n", |
| 1316 | "writer = pd.ExcelWriter('myWorksheet.xlsx')\n", |
| 1317 | "my_data_frame.to_excel(writer,'Sheet1')\n", |
| 1318 | "writer.save()\n", |
| 1319 | "```" |
| 1320 | ] |
| 1321 | } |
| 1322 | ], |
| 1323 | "metadata": { |
| 1324 | "hide_input": false, |
| 1325 | "kernelspec": { |
| 1326 | "name": "python36", |
| 1327 | "display_name": "Python 3.6", |
| 1328 | "language": "python" |
| 1329 | }, |
| 1330 | "language_info": { |
| 1331 | "codemirror_mode": { |
| 1332 | "name": "ipython", |
| 1333 | "version": 3 |
| 1334 | }, |
| 1335 | "file_extension": ".py", |
| 1336 | "mimetype": "text/x-python", |
| 1337 | "name": "python", |
| 1338 | "nbconvert_exporter": "python", |
| 1339 | "pygments_lexer": "ipython3", |
| 1340 | "version": "3.7.1" |
| 1341 | }, |
| 1342 | "toc": { |
| 1343 | "base_numbering": 1, |
| 1344 | "nav_menu": { |
| 1345 | "height": "318.996px", |
| 1346 | "width": "320.994px" |
| 1347 | }, |
| 1348 | "number_sections": false, |
| 1349 | "sideBar": true, |
| 1350 | "skip_h1_title": false, |
| 1351 | "title_cell": "Table of Contents2", |
| 1352 | "title_sidebar": "Contents", |
| 1353 | "toc_cell": false, |
| 1354 | "toc_position": { |
| 1355 | "height": "calc(100% - 180px)", |
| 1356 | "left": "10px", |
| 1357 | "top": "150px", |
| 1358 | "width": "351px" |
| 1359 | }, |
| 1360 | "toc_section_display": true, |
| 1361 | "toc_window_display": false |
| 1362 | }, |
| 1363 | "varInspector": { |
| 1364 | "cols": { |
| 1365 | "lenName": 16, |
| 1366 | "lenType": 16, |
| 1367 | "lenVar": 40 |
| 1368 | }, |
| 1369 | "kernels_config": { |
| 1370 | "python": { |
| 1371 | "delete_cmd_postfix": "", |
| 1372 | "delete_cmd_prefix": "del ", |
| 1373 | "library": "var_list.py", |
| 1374 | "varRefreshCmd": "print(var_dic_list())" |
| 1375 | }, |
| 1376 | "r": { |
| 1377 | "delete_cmd_postfix": ") ", |
| 1378 | "delete_cmd_prefix": "rm(", |
| 1379 | "library": "var_list.r", |
| 1380 | "varRefreshCmd": "cat(var_dic_list()) " |
| 1381 | } |
| 1382 | }, |
| 1383 | "position": { |
| 1384 | "height": "406.193px", |
| 1385 | "left": "1468.4px", |
| 1386 | "right": "20px", |
| 1387 | "top": "120px", |
| 1388 | "width": "456.572px" |
| 1389 | }, |
| 1390 | "types_to_exclude": [ |
| 1391 | "module", |
| 1392 | "function", |
| 1393 | "builtin_function_or_method", |
| 1394 | "instance", |
| 1395 | "_Feature" |
| 1396 | ], |
| 1397 | "window_display": false |
| 1398 | } |
| 1399 | }, |
| 1400 | "nbformat": 4, |
| 1401 | "nbformat_minor": 2 |
| 1402 | } |