Incident Response in the Cloud Part 2 – Azure


Last major content update: 6 May, 2021


This is the follow-on article to Incident Response in the Cloud – Part 1 where we will expand on practical application in Microsoft Azure.

Please note, Microsoft has several hundred cloud service offerings so this is not even close to a comprehensive Cloud IR plan. However, it is a good place to start for standard Azure Compute investigations.

Please note this content will be supplemented and modified over time.

As previously mentioned, I am not compensated by anyone or anything referenced in this article; everything is here because it is my opinion and I would encourage you to add your thoughts, recommendations, or even better…disagree, and post your opinion on LinkedIn. We all learn through debate and I would love to hear your ideas.


Tenant – A globally unique Azure AD instance, ending with, representing an organizational entity

Management Group – A logical grouping to allow multi-subscription control. If the organizational entity has multiple AZ subscriptions, this would be the level that one could apply things like global corporate governance/compliance

Subscription – A high-level, multi-region-capable Azure logical grouping/container that can have a number of child groups or resources within it, but will only be associated with one AD tenant.

Resource Group (RG) – This is the common container for resources themselves. For example, one may put VNets, VMs, gateways, NICs, etc. in a single resource group.

Network Security Group (NSG) – Equivalent to a network-layer firewall; you can picture it as a set of ACL rules. Good for controlling traffic between subnets. For externally-facing assets, an AZ Firewall, AZ Gateway, and/or an Application Gateway are better choices, as they add capabilities beyond basic L3/L4 filtering, such as NAT’ing.
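The hierarchy above (subscription, resource group, resource) is visible in every Azure resource ID, which you will see constantly in the logs later in this article. Below is a minimal Python sketch that splits a resource ID into those parts. The function name and the all-zero subscription GUID are made up for illustration; the ID layout is the one used in the Microsoft sample logs.

```python
def parse_resource_id(resource_id):
    """Split an Azure resource ID into subscription, resource group, provider, type, and name."""
    parts = [p for p in resource_id.strip("/").split("/") if p]
    lowered = [p.lower() for p in parts]  # logs mix upper and lower case segment names

    def value_after(segment, offset=1):
        # Return the element that follows a named segment, or None if it is absent.
        if segment in lowered and lowered.index(segment) + offset < len(parts):
            return parts[lowered.index(segment) + offset]
        return None

    return {
        "subscription": value_after("subscriptions"),
        "resource_group": value_after("resourcegroups"),
        "provider": value_after("providers"),
        "resource_type": value_after("providers", 2),
        "name": value_after("providers", 3),
    }


# Hypothetical ID: the subscription GUID is a placeholder.
example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourcegroups/myResourceGroup/providers/Microsoft.Network"
    "/networkSecurityGroups/myNSG"
)
info = parse_resource_id(example_id)
print(info)
```

This is handy when you are grouping thousands of log records by resource group or by resource type during scoping.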

“What the heck is Azure <blank> service?!”

By the time you learn about a new Azure service, there will be a newer Azure service. The bright side is that if you need to understand anything related to Azure, a simple Web search will immediately hit on Microsoft’s documentation. My deepest thanks to them for making clear overviews, walkthroughs, and (my personal favorite) amazing free hands-on training [Example of their excellent training].

Search terms for whichever Azure service’s documentation: <your term>

Search terms for free practical application training: <the thing you want to learn (e.g. “Network Security Group”)>

Enterprise Discovery

If you read my previous article you know I am annoyed at lazy IT architects that never create network architectural diagrams unless you waterboard them. (You are going to have to hold me under until I surrender before I will write them too.)

I also mentioned in the previous article that a company’s network design is like a snowflake; each of them is often a unique implementation so it is a little more difficult to make educated guesses than it would be for an on-premise environment.

And here is where you will be dazzled and amazed by Azure’s automated discovery.

Network Watcher

Network Watcher performs several functions and it would be worth your time to become very familiar with it. It is your source to enable netflow, pcap, and my favorite, the Network Security Group (NSG) Diagnostic tool. That last one will show you all the NSG rules (think “firewall”) that your data will flow through.
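To make sense of what the NSG Diagnostic tool reports, it helps to remember that NSG rules are evaluated in priority order (lowest number first) and processing stops at the first match. The following Python sketch models only that behavior. It is a simplification under stated assumptions: it matches on direction and a single destination port, and ignores protocol, address prefixes, service tags, and port ranges. The rule names other than the default DenyAllInBound are made up.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    priority: int   # lower number = evaluated first
    direction: str  # "Inbound" or "Outbound"
    port: str       # a single port such as "443", or "*" for any
    access: str     # "Allow" or "Deny"


def evaluate(rules, direction, port):
    """Return the first rule, in ascending priority order, that matches direction and port."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.direction == direction and rule.port in ("*", str(port)):
            return rule
    return None


rules = [
    Rule("DenyAllInBound", 65500, "Inbound", "*", "Deny"),  # Azure's default catch-all
    Rule("Allow-HTTPS", 100, "Inbound", "443", "Allow"),    # hypothetical custom rule
    Rule("Deny-RDP", 200, "Inbound", "3389", "Deny"),       # hypothetical custom rule
]

for port in (443, 3389, 22):
    hit = evaluate(rules, "Inbound", port)
    print(port, hit.name, hit.access)
```

Port 22 falls through to the default deny because no custom rule mentions it, which is the kind of answer the diagnostic tool gives you for real traffic.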

Architecture Diagram

If you navigate to the “Security Groups” page you will find “Resource Visualizer” on the left blade.

Network Forensics

We spend a lot of time in DFIR performing data manipulation prior to any data analysis. However, tools for log aggregation, normalization, and robust querying already exist within several Azure native security tools, so I will include a breakdown of those first.

Additionally, as mentioned in Part I, I prefer to follow Phil Hagen’s FOR572 SANS Institute course, in which he focuses on the following network data sources:
1. Logs
2. NetFlow
3. Full Packet Capture (FPC)

Security Tools

Microsoft Defender for Cloud – Alerting, security assessment, and compliance of cloud resources

Microsoft Sentinel – Their SIEM/SOAR.

Azure Monitor – Telemetry-focused analysis and console for metrics or logs.

Microsoft Information Protection – M365 DLP solution

Microsoft 365 Defender – This rolled up several legacy security consoles and is their single pane of glass for M365, the Defender suite, and the Exchange Online stuff. M365 security tools and logging are an article all by themselves and will not be covered here.

Vendor Tools – There are numerous third-party tools that can be extensively leveraged during an Incident Response. If the cloud environment belongs to a mid-size or larger company, it is highly likely to have a third-party vendor’s comprehensive network security visibility/compliance tool such as Palo Alto’s Prisma Cloud.


Microsoft refers to these data sources as “platform logs”. Their objective is to provide logs that can be ingested by tools like AZ Monitor in order to perform fast queries using Log Analytics, or exported with AZ Event Hub for event action or forwarding to a SIEM. All platform logs can also be downloaded from the portal or the AZ CLI as CSV or JSON.

Resource Logs

 Resource logs capture operations performed IN or BY a cloud resource. These are not turned on by default, but should be. Resource log schema varies based on the resource type that it covers.

You will have to follow these instructions to activate them for export or analytics.


Below are two examples of Resource logs created by the Microsoft team. You can find a breakdown of the fields here.
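Because the resource log schema changes with the resource type, a triage script has to branch on the "category" field. Here is a minimal Python sketch that handles the two Application Gateway categories sampled in this section and URL-decodes the request URI so payloads are readable. The function name is made up, the client IP is a documentation-range placeholder, and the records are trimmed to the fields the sketch uses.

```python
from collections import Counter
from urllib.parse import unquote


def triage(record):
    """Reduce one Application Gateway resource log record to a short summary."""
    props = record.get("properties", {})
    category = record.get("category")
    uri = unquote(str(props.get("requestUri", "")))  # decode %3C, %22, etc.
    if category == "ApplicationGatewayAccessLog":
        return {"kind": "access", "time": record.get("time"),
                "client": props.get("clientIP"),  # capital "IP" in the access log sample
                "uri": uri, "detail": props.get("httpStatus")}
    if category == "ApplicationGatewayFirewallLog":
        return {"kind": "waf", "time": record.get("time"),
                "client": props.get("clientIp"),  # lower-case "p" in the firewall log sample
                "uri": uri,
                "detail": f'{props.get("action")} by rule {props.get("ruleId")}'}
    return None  # other resource types use other schemas


# Trimmed records; 203.0.113.7 is a made-up client address.
records = [
    {"time": "2017-04-26T19:27:38Z", "category": "ApplicationGatewayAccessLog",
     "properties": {"clientIP": "203.0.113.7", "httpMethod": "GET",
                    "requestUri": "/phpmyadmin/scripts/setup.php", "httpStatus": 404}},
    {"time": "2017-03-20T15:52:09.1494499Z", "category": "ApplicationGatewayFirewallLog",
     "properties": {"clientIp": "203.0.113.7", "ruleId": "941320", "action": "Blocked",
                    "requestUri": "/?a=%3Cscript%3Ealert(%22Hello%22);%3C/script%3E"}},
]

summaries = [s for s in map(triage, records) if s]
hits_per_client = Counter(s["client"] for s in summaries)
for summary in summaries:
    print(summary)
```

Counting hits per client across both categories is a quick way to spot a scanner that probed for admin pages and then tripped the WAF.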

AZ Application Gateway log

{
    "operationName": "ApplicationGatewayAccess",
    "time": "2017-04-26T19:27:38Z",
    "category": "ApplicationGatewayAccessLog",
    "properties": {
        "instanceId": "appgw_1",
        "clientIP": "",
        "httpMethod": "GET",
        "requestUri": "/phpmyadmin/scripts/setup.php",
        "userAgent": "-",
        "httpStatus": 404,
        "httpVersion": "HTTP/1.0",
        "receivedBytes": 65,
        "sentBytes": 553,
        "timeTaken": 205,
        "sslEnabled": "off",
        "sslCipher": "",
        "sslProtocol": "",
        "serverRouted": "",
        "serverStatus": "200",
        "serverResponseLatency": "0.023",
        "host": ""
    }
}

AZ Application Firewall Log

{
  "resourceId": "/SUBSCRIPTIONS/{subscriptionId}/RESOURCEGROUPS/{resourceGroupName}/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/{applicationGatewayName}",
  "operationName": "ApplicationGatewayFirewall",
  "time": "2017-03-20T15:52:09.1494499Z",
  "category": "ApplicationGatewayFirewallLog",
  "properties": {
    "instanceId": "ApplicationGatewayRole_IN_0",
    "clientIp": "",
    "clientPort": "4835",
    "requestUri": "/?a=%3Cscript%3Ealert(%22Hello%22);%3C/script%3E",
    "ruleSetType": "OWASP",
    "ruleSetVersion": "3.0",
    "ruleId": "941320",
    "message": "Possible XSS Attack Detected - HTML Tag Handler",
    "action": "Blocked",
    "site": "Global",
    "details": {
      "message": "Warning. Pattern match \"<(a|abbr|acronym|address|applet|area|audioscope|b|base|basefront|bdo|bgsound|big|blackface|blink|blockquote|body|bq|br|button|caption|center|cite|code|col|colgroup|comment|dd|del|dfn|dir|div|dl|dt|em|embed|fieldset|fn|font|form|frame|frameset|h1|head|h ...\" at ARGS:a.",
      "data": "Matched Data: <script> found within ARGS:a: <script>alert(\\x22hello\\x22);</script>",
      "file": "rules/REQUEST-941-APPLICATION-ATTACK-XSS.conf",
      "line": "865"
    },
    "hostname": "",
    "transactionId": "AYAcUqAcAcAcAcAcASAcAcAc"
  }
}

Activity Logs

Activity logs are subscription-level events which capture write operations (e.g. PUT, POST, DELETE) performed TO a cloud resource. That does not mean the Activity logs are logging the inbound request on a web app (i.e. access logging), but rather every request to manipulate the state of an AZ resource, such as a VM start/stop or the modification of the resource itself.
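When you export Activity logs as JSON, most of the investigative value sits in a handful of fields. The Python sketch below reduces one event to who/what/where/when and flags state-changing operations. Field names follow the Microsoft sample record in this section; the function names, caller address, and IP are made up, and the event is trimmed to the fields used.

```python
def summarize_activity(event):
    """Pull the who/what/where/when fields out of one Activity log event."""
    claims = event.get("claims", {})

    def value(field):
        # Several fields are objects of the form {"value": ..., "localizedValue": ...}.
        item = event.get(field, {})
        return item.get("value") if isinstance(item, dict) else item

    return {
        "when": event.get("eventTimestamp"),
        "who": event.get("caller"),
        "source_ip": claims.get("ipaddr"),
        "operation": value("operationName"),
        "status": value("status"),
        "resource": event.get("resourceId"),
    }


def is_state_change(summary):
    """True when the operation name ends in a write, delete, or action verb."""
    operation = (summary["operation"] or "").lower()
    return operation.endswith(("/write", "/delete", "/action"))


# Trimmed event; caller and ipaddr are made-up values.
event = {
    "caller": "analyst@example.com",
    "claims": {"ipaddr": "198.51.100.20", "name": "Rob Robertson"},
    "eventTimestamp": "2018-01-29T20:42:31.3810679Z",
    "operationName": {"value": "Microsoft.Network/networkSecurityGroups/write"},
    "status": {"value": "Succeeded"},
    "resourceId": "/subscriptions/<subscription ID>/resourcegroups/myResourceGroup"
                  "/providers/Microsoft.Network/networkSecurityGroups/myNSG",
}

summary = summarize_activity(event)
print(summary, is_state_change(summary))
```

A successful NSG write from an unfamiliar source IP, as in this trimmed event, is exactly the sort of record worth pulling to the top of your timeline.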

Azure portal --> All services --> search for Activity Log --> scroll down on the left-hand blade --> "the log source"

Direct hyperlink if you are signed into your AZ Portal


This is an example Activity log created by the Microsoft team. You can find a breakdown of the fields here. As you can see, fields like “authorization” and “caller”, along with the “name” and “ipaddr” claims, can be extremely valuable to your investigation.

{
    "authorization": {
        "action": "Microsoft.Network/networkSecurityGroups/write",
        "scope": "/subscriptions/<subscription ID>/resourcegroups/myResourceGroup/providers/Microsoft.Network/networkSecurityGroups/myNSG"
    },
    "caller": "[email protected]",
    "channels": "Operation",
    "claims": {
        "aud": "",
        "iss": "",
        "iat": "1234567890",
        "nbf": "1234567890",
        "exp": "1234567890",
        "_claim_names": "{\"groups\":\"src1\"}",
        "_claim_sources": "{\"src1\":{\"endpoint\":\"\"}}",
        "": "1",
        "aio": "A3GgTJdwK4vy7Fa7l6DgJC2mI0GX44tML385OpU1Q+z+jaPnFMwB",
        "": "rsa,mfa",
        "appid": "355249ed-15d9-460d-8481-84026b065942",
        "appidacr": "2",
        "": "10845a4d-ffa4-4b61-a3b4-e57b9b31cdb5",
        "e_exp": "262800",
        "": "Robertson",
        "": "Rob",
        "ipaddr": "",
        "name": "Rob Robertson",
        "": "f409edeb-4d29-44b5-9763-ee9348ad91bb",
        "onprem_sid": "S-1-5-21-4837261184-168309720-1886587427-18514304",
        "puid": "18247BBD84827C6D",
        "": "user_impersonation",
        "": "b-24Jf94A3FH2sHWVIFqO3-RSJEiv24Jnif3gj7s",
        "": "1114444b-7467-4144-a616-e3a5d63e147b",
        "": "[email protected]",
        "": "[email protected]",
        "uti": "IdP3SUJGtkGlt7dDQVRPAA",
        "ver": "1.0"
    },
    "correlationId": "b5768deb-836b-41cc-803e-3f4de2f9e40b",
    "eventDataId": "d0d36f97-b29c-4cd9-9d3d-ea2b92af3e9d",
    "eventName": {
        "value": "EndRequest",
        "localizedValue": "End request"
    },
    "category": {
        "value": "Administrative",
        "localizedValue": "Administrative"
    },
    "eventTimestamp": "2018-01-29T20:42:31.3810679Z",
    "id": "/subscriptions/<subscription ID>/resourcegroups/myResourceGroup/providers/Microsoft.Network/networkSecurityGroups/myNSG/events/d0d36f97-b29c-4cd9-9d3d-ea2b92af3e9d/ticks/636528553513810679",
    "level": "Informational",
    "operationId": "04e575f8-48d0-4c43-a8b3-78c4eb01d287",
    "operationName": {
        "value": "Microsoft.Network/networkSecurityGroups/write",
        "localizedValue": "Microsoft.Network/networkSecurityGroups/write"
    },
    "resourceGroupName": "myResourceGroup",
    "resourceProviderName": {
        "value": "Microsoft.Network",
        "localizedValue": "Microsoft.Network"
    },
    "resourceType": {
        "value": "Microsoft.Network/networkSecurityGroups",
        "localizedValue": "Microsoft.Network/networkSecurityGroups"
    },
    "resourceId": "/subscriptions/<subscription ID>/resourcegroups/myResourceGroup/providers/Microsoft.Network/networkSecurityGroups/myNSG",
    "status": {
        "value": "Succeeded",
        "localizedValue": "Succeeded"
    },
    "subStatus": {
        "value": "",
        "localizedValue": ""
    },
    "submissionTimestamp": "2018-01-29T20:42:50.0724829Z",
    "subscriptionId": "<subscription ID>",
    "properties": {
        "statusCode": "Created",
        "serviceRequestId": "a4c11dbd-697e-47c5-9663-12362307157d",
        "responseBody": "",
        "requestbody": ""
    },
    "relatedEvents": []
}

Azure AD Logs

Azure AD logs capture AD Sign-ins, user changes, and flagged activity.

Azure portal --> All services --> search for Azure Active Directory --> scroll down on the left-hand blade --> Monitoring --> “the log source below”

You can drill down to the following AD log subsets in the AZ portal:

AD Audit Logs

AD Audit Logs capture AD tenant-based information such as user, group, credential and application changes.

Direct hyperlink if you are signed into your AZ Portal

AD Sign-in Logs [Interactive]

AD Sign-in Logs [Interactive] captures user activity and status, such as MFA use, logon success/failures, destination, and IP geolocation. If you click on the details pane for any specific entry, you can gain more info, such as logon failure reason.
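If you export the sign-in logs as JSON, a quick first pass is to group failed sign-ins by source IP to spot password spraying. The Python sketch below does that. It assumes the export uses the Microsoft Graph sign-in field names (userPrincipalName, ipAddress, and a status object whose errorCode is 0 on success); check your own export before relying on it. The users, IPs, and function name are made up.

```python
from collections import defaultdict


def failed_signins_by_ip(signins):
    """Map each source IP to the sorted list of accounts that failed to sign in from it."""
    failures = defaultdict(set)
    for entry in signins:
        status = entry.get("status") or {}
        if status.get("errorCode", 0) != 0:  # assumption: 0 means the sign-in succeeded
            failures[entry.get("ipAddress", "unknown")].add(
                entry.get("userPrincipalName", "unknown"))
    return {ip: sorted(users) for ip, users in failures.items()}


# Made-up records trimmed to the fields used above.
signins = [
    {"createdDateTime": "2021-05-01T10:00:00Z", "userPrincipalName": "john.smith@example.com",
     "ipAddress": "198.51.100.20", "status": {"errorCode": 50126}},
    {"createdDateTime": "2021-05-01T10:00:05Z", "userPrincipalName": "jerry.smith@example.com",
     "ipAddress": "198.51.100.20", "status": {"errorCode": 50126}},
    {"createdDateTime": "2021-05-01T10:02:00Z", "userPrincipalName": "john.smith@example.com",
     "ipAddress": "203.0.113.5", "status": {"errorCode": 0}},
]

suspects = failed_signins_by_ip(signins)
print(suspects)
```

One IP failing against many different accounts in a short window is usually worth a closer look than one user fat-fingering a password.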

Direct hyperlink if you are signed into your AZ Portal

AD Provisioning Logs (Preview)

AD Provisioning Logs capture user/group provisioning in 3rd party applications, such as ServiceNow. This may be a good data source to identify lateral movement or establishment of a footprint.

Direct hyperlink if you are signed into your AZ Portal


NSG Flow Logs can be turned on within Network Watcher and pointed to a storage location of your choice. You will be required to:

  1. register Microsoft.Insights provider
    1. Azure portal –> All services –> search for Resource Providers –> microsoft.insights
  2. create a storage account in which to drop the logs
    1. Azure portal –> All services –> search for Storage –> Storage Account
  3. create a Log Analytics workspace to view them
    1. Azure portal –> All services –> search for Log Analytics Workspaces
  4. activate NSG flow logging
    1. Azure portal –> All services –> search for Network Watcher –> NSG Flow Logs

If you need to pivot in the data set quickly, I recommend using Power BI, which can give you things like Top Talkers and directionality.
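If Power BI is not available, Top Talkers and directionality can also be pulled straight from the raw records. NSG flow logs are stored as JSON, and each flow is a comma-separated tuple. The Python sketch below parses version 2 tuples (timestamp, source IP, destination IP, source port, destination port, protocol, direction, decision, flow state, then packet and byte counters in each direction). The class and function names are made up, and the sample tuples use invented addresses.

```python
from collections import Counter
from typing import NamedTuple


class Flow(NamedTuple):
    timestamp: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str    # T = TCP, U = UDP
    direction: str   # I = inbound, O = outbound
    decision: str    # A = allowed, D = denied
    state: str       # B = begin, C = continuing, E = end (empty in version 1)
    bytes_out: int   # bytes source -> destination
    bytes_back: int  # bytes destination -> source


def parse_tuple(raw):
    """Parse one flow tuple; missing or empty counters are treated as 0."""
    f = raw.split(",")
    f += [""] * (13 - len(f))  # version 1 tuples stop after the decision field

    def num(text):
        return int(text) if text else 0

    return Flow(int(f[0]), f[1], f[2], int(f[3]), int(f[4]),
                f[5], f[6], f[7], f[8], num(f[10]), num(f[12]))


def top_talkers(flows):
    """Total bytes (both directions) per (source, destination) pair, largest first."""
    totals = Counter()
    for flow in flows:
        totals[(flow.src_ip, flow.dst_ip)] += flow.bytes_out + flow.bytes_back
    return totals.most_common()


# Made-up tuples in the version 2 layout.
raw_tuples = [
    "1611943565,10.0.0.4,203.0.113.10,40048,443,T,O,A,B,,,,",
    "1611943577,10.0.0.4,203.0.113.10,40048,443,T,O,A,E,10,1673,20,22015",
    "1611943577,10.0.0.5,203.0.113.10,38196,443,T,O,A,E,98,6278,110,153235",
    "1611943580,198.51.100.9,10.0.0.4,51000,3389,T,I,D,B,,,,",
]
flows = [parse_tuple(t) for t in raw_tuples]
talkers = top_talkers(flows)
denied_inbound = [f for f in flows if f.direction == "I" and f.decision == "D"]
print(talkers[0], len(denied_inbound))
```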

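Example Flow log

Note that the path, filename, and record layout shown below follow the AWS VPC Flow Logs format, and the source and destination addresses are missing from the sample rows. Azure NSG flow logs are written to the storage account as JSON, with each flow recorded as a comma-separated tuple.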

Path format: …\<region>\<YYYY>\<MM>\<DD>

Filename: (.gz)

<number>_vpcflowlogs_<region>_fl-<flow number>_<YYYYMMDD>T<UTC>Z_<hex>.log

Log Example

version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status

2 742822827437 eni-03aec391810141a74 40048 443 6 10 1673 1611943565 1611943577 ACCEPT OK
2 742822827437 eni-03aec391810141a74 443 40050 6 20 22015 1611943565 1611943577 ACCEPT OK
2 742822827437 eni-03aec391810141a74 38196 443 6 98 6278 1611943565 1611943577 ACCEPT OK
2 742822827437 eni-03aec391810141a74 443 40042 6 110 153235 1611943565 1611943577 ACCEPT OK

Full Packet Capture (FPC)

Network Watcher – FPC between Network Security Groups

Azure portal –> All services –> search for Network Watcher –> Network Diagnostic Tools –> Packet Capture

Follow these instructions to start/stop/download/delete the captures


Azure portal –> All services –> search for Network Watcher –> Logs –> Diagnostic Logs

Enable Network Watcher’s diagnostic log to check system metrics. Instructions are here.

Host Forensics

I felt there was a lack of concise, publicly-available information on this part so I wanted to get the basics out there and expand on the steps over time. However, these options are evolving quickly so if someone has better information please feel free to let me know!

Acquiring an Azure VM disk image

Make sure you understand the volume encryption question before you try a lot of this.

These are some of your available choices:

  • Use libcloudforensics
  • Use a DFIR tool
    • many of them do not see a difference with a cloud volume
  • Get admin credentials from the cloud admin
    • Use your credentials to perform the following steps
  • Have the cloud admin get a snapshot and send it to you
    • Have the admin follow the steps above.
    • Make sure the cloud admin understands you are not going to use this disk to create a clone in the same way system administrators would typically create clones. You only want a full snapshot of the disk made available to your DFIR subscription/resource group/etc.
  • Acquire the VHD using Azure Storage Explorer
    • AZ Storage Explorer is an amazing tool built by Microsoft. It can download full disk images as a VHD using a Microsoft purpose-built protocol optimized for transfer speed.
    • Storage Explorer to Manage Disks
  • Peer your DFIR resource group to the infected one
    • Follow the instructions in the below section, “Extra Stuff –> How to Peer your Forensic VNet to the Bad VNet”
    • ensure there are no overlapping IP ranges
    • make sure you’ve got strong traffic controls
    • If the route is open, you should be able to use your DFIR cloud workstation as if it is in your subnet. (I have not tried this one yet.)


For Alert-to-HostDataAcquisition, I recommend reading Microsoft’s “Get-ForensicVM” workflow. Or consider an XDR/DFIR vendor tool.

Exporting the VHD from the cloud

I highly recommend performing cloud DFIR using a cloud forensic workstation. The amount of time and effort when forcing cloud resources into the standard on-prem DFIR analysis workflow can be staggering. Moreover, this will be a time-consuming copy unless you have ExpressRoute Premium.
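To get a feel for how time-consuming that copy can be, here is a back-of-the-envelope Python sketch. The 128 GB disk size is a hypothetical example, and the 0.8 efficiency factor is my assumption to account for protocol overhead and link contention, not a measured figure.

```python
def transfer_hours(size_gb, link_mbps, efficiency=0.8):
    """Estimate hours needed to move size_gb (decimal GB) over a link of link_mbps."""
    bits = size_gb * 8 * 1000 ** 3                      # gigabytes -> bits
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits / usable_bits_per_second / 3600


# Hypothetical 128 GB VHD over three different links.
for mbps in (100, 1000, 10000):
    print(f"{mbps:>6} Mbps: {transfer_hours(128, mbps):.2f} hours")
```

At 100 Mbps a single 128 GB image already costs a good part of a working day, and most investigations involve more than one disk.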

The following choices assume you 1) want to do the forensic analysis on-prem and 2) have not (yet) created and sync’d an Azure File Server between your cloud DFIR environment and your on-prem forensic workstations.

  • Best choice – Azure Storage Explorer
    • Read section, “Copy a Managed Disk“. My recommendation is to get administrative credentials at the subscription level; it is a good balance between ease of data access and security. With subscription-level access you can use AZ Storage Explorer to simply download the VHD, provided it is not currently attached to a VM. If the VHD is still attached to a VM, you should use the snapshot workflow. With that same access, you can also download the platform logs if they are being archived to an AZ storage account within that subscription, which is very likely if the company is storing the logs at all.
    • Use a Shared Access Signature for access control. Follow the SAS creation instructions in section, “Generate a Download URL”
  • Another option – Download VHD (custom URL)
    • Snapshot the VHD – Windows or Linux
    • Forensicator’s decision point on whether to power down the VM or not.
    • Use a Shared Access Signature for access control. Follow the SAS creation instructions in section, “Generate a Download URL”
  • Slowest option – Shipping it
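Before handing a SAS download URL to anyone, it is worth checking what it actually grants. The Python sketch below reads the expiry (se) and permissions (sp) parameters from a SAS URL and reports whether it is read-only and still valid. The URL, storage account name, and function name are made up, and the sketch assumes the expiry is written as YYYY-MM-DDTHH:MM:SSZ, which is only one of the formats a SAS may use.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def inspect_sas(url, now=None):
    """Report expiry and permissions carried in a SAS URL's query string."""
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlsplit(url).query)
    expiry_raw = params.get("se", [""])[0]
    permissions = params.get("sp", [""])[0]
    expiry = None
    if expiry_raw:
        # Assumption: expiry uses the full UTC timestamp form.
        expiry = datetime.strptime(expiry_raw, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {
        "expires": expiry,
        "expired": expiry is not None and expiry <= now,
        "permissions": permissions,
        "read_only": permissions == "r",
        "has_signature": "sig" in params,
    }


# Made-up URL; the signature is a placeholder, not a real token.
sas_url = ("https://examplestorage.blob.core.windows.net/vhds/disk.vhd"
           "?sv=2020-08-04&se=2021-05-07T00%3A00%3A00Z&sr=b&sp=r&sig=REDACTED")
report = inspect_sas(sas_url, now=datetime(2021, 5, 6, 12, 0, tzinfo=timezone.utc))
print(report)
```

A short expiry and read-only permission are what you want for evidence transfer; anything with write or delete permission deserves a second look.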


For automation of your acquired data –> your on-prem file server, I recommend creating an Azure File server (NFS or SMB) and using Azure File Sync. Cold tier for both your “DFIR Tools/Binaries” and the “Closed cases”; hot tier for your active cases.

Make a copy of the VHD

After you have the VM’s disk in your cloud storage somewhere, follow only the steps in section, “Copy a Disk” to create a VHD copy.
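The steps above do not include a verification step, but if you want to confirm that a downloaded or copied VHD matches the original, hashing both files is the usual approach. This is a minimal Python sketch that hashes in chunks so a multi-gigabyte image never has to fit in memory. The file names are made up, and the demo uses tiny temporary files in place of real VHDs.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo with two small stand-in files instead of real disk images.
with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "original.vhd")
    working_copy = os.path.join(workdir, "copy.vhd")
    for path in (original, working_copy):
        with open(path, "wb") as handle:
            handle.write(b"abc")
    original_hash = sha256_file(original)
    copy_hash = sha256_file(working_copy)

print(original_hash == copy_hash)
```

Record both digests in your case notes at acquisition time so later copies can be checked against them.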

Data Analysis
  • Attach the VHD to your cloud forensic workstation but do not mount it. If on-prem, use your typical analysis workflow on the VHD you exported.
  • If possible, for cloud DFIR workstations I recommend separate disks. However, do some research to determine if they are actually separate physical disks. If they are logical, the performance increase might not be worth the cost. My preference:
    • Workstation Disks
      • PremiumSSD = OS and hash library
      • PremiumSSD = Cases disk
      • Ultra Disk = forensic tool I/O temp cache
    • Azure Files (File Server)
      • NFS/SMB share #1 (hot tier) – DFIR team import/export for active cases
      • NFS/SMB share #2 (cold tier) – uninstalled DFIR tools/binaries archive
      • NFS/SMB share #3 (cold/archive tier) – archived closed cases

What if I have a large number of files or a bunch of images?
Dataset | Network bandwidth | Solution to use
Large dataset | Low-bandwidth network or direct connectivity to on-premises storage is limited by organization policies | Azure Import/Export for export; Data Box Disk or Data Box for import where supported; otherwise use Azure Import/Export
Large dataset | High-bandwidth network: 1 gigabit per second (Gbps) – 100 Gbps | AZCopy for online transfers; or to import data, Azure Data Factory, Azure Data Box Edge, or Azure Data Box Gateway
Large dataset | Moderate-bandwidth network: 100 megabits per second (Mbps) – 1 Gbps | Azure Import/Export for export or Azure Data Box family for import where supported
Small dataset: a few GBs to a few TBs | Low to moderate-bandwidth network: up to 1 Gbps | If transferring only a few files, use Azure Storage Explorer, Azure portal, AZCopy, or AZ CLI

Extra Stuff

All of this was gleaned from the excellent Microsoft Learn practice sandboxes or by running it on my own AZ subscription.

How to Navigate in AZ CLI

Unfortunately, a full AZ PowerShell, AZ CLI, and Bash AZ tutorial is outside the scope of this article. But here are a few pointers that may help.

(NOTE: the “-h” switch is a scoped help menu)

How to find all the Azure commands (Powershell)

This will display all the “get” commands for Azure AD.

Change it to “get*az*” and it will show ALL of the AZ get commands.

 > get-command get*azad*


CommandType     Name                                     Version    Source
-----------     ----                                     -------    ------
Alias           Get-AzADServicePrincipalCredential       3.4.1      Az.Resources
Cmdlet          Get-AzADAppCredential                    3.4.1      Az.Resources
Cmdlet          Get-AzADApplication                      3.4.1      Az.Resources
Cmdlet          Get-AzADGroup                            3.4.1      Az.Resources
Cmdlet          Get-AzADGroupMember                      3.4.1      Az.Resources
Cmdlet          Get-AzADServicePrincipal                 3.4.1      Az.Resources
Cmdlet          Get-AzADSpCredential                     3.4.1      Az.Resources
Cmdlet          Get-AzADUser                             3.4.1      Az.Resources
Cmdlet          Get-AzAdvisorConfiguration               1.1.1      Az.Advisor
Cmdlet          Get-AzAdvisorRecommendation              1.1.1      Az.Advisor

Drilling down to “user” stuff

> get-command *azaduser*


CommandType    Name          Version    Source
-----------    ----          -------    ------
Alias        Set-AzADUser    3.4.1    Az.Resources
Cmdlet       Get-AzADUser    3.4.1    Az.Resources
Cmdlet       New-AzADUser    3.4.1    Az.Resources
Cmdlet       Remove-AzADUser 3.4.1    Az.Resources
Cmdlet       Update-AzADUser 3.4.1    Az.Resources

Want to list the users? (I only have one user)

> get-azaduser


UserPrincipalName :
ObjectType : User
UsageLocation : US
GivenName : John
Surname : Smith
AccountEnabled : True
MailNickname :
Mail :
DisplayName : John Smith
Id : 543dfbdf-d3d5-3a4d-7cde-645ea5dc22fc
Type : Member

What other query-able information does the object have?

> get-azaduser | get-member -type property

OUTPUT (of my instance, anyway)

   TypeName: Microsoft.Azure.Commands.ActiveDirectory.PSADUser

Name              MemberType Definition
----              ---------- ----------
AccountEnabled    Property   System.Nullable[bool] AccountEnabled {get;set;}
DisplayName       Property   string DisplayName {get;set;}
GivenName         Property   string GivenName {get;set;}
Id                Property   string Id {get;set;}
Mail              Property   string Mail {get;set;}
MailNickname      Property   string MailNickname {get;set;}
ObjectType        Property   string ObjectType {get;}
Surname           Property   string Surname {get;set;}
Type              Property   string Type {get;set;}
UsageLocation     Property   string UsageLocation {get;set;}
UserPrincipalName Property   string UserPrincipalName {get;set;}

Need only one of the user properties? (I added a few fake users to make it less boring)

> get-azaduser | select DisplayName


John Smith
Jerry Smith
Daniel Bucket
Terrance Atwater

What commands are available to get a volume snapshot

> get-command *snapshot*

CommandType     Name                                         Version    Source
-----------     ----                                         -------    ------
Function        New-VirtualDiskSnapshot                Storage
Cmdlet          Get-AzSnapshot                               4.1.0      Az.Compute
Cmdlet          Get-AzWebAppSnapshot                         1.9.0      Az.Websites
Cmdlet          Grant-AzSnapshotAccess                       4.1.0      Az.Compute
Cmdlet          New-AzSnapshot                               4.1.0      Az.Compute
Cmdlet          New-AzSnapshotConfig                         4.1.0      Az.Compute
Cmdlet          New-AzSnapshotUpdateConfig                   4.1.0      Az.Compute
Cmdlet          Remove-AzSnapshot                            4.1.0      Az.Compute
Cmdlet          Restore-AzWebAppSnapshot                     1.9.0      Az.Websites
Cmdlet          Revoke-AzSnapshotAccess                      4.1.0      Az.Compute
Cmdlet          Set-AzSnapshotDiskEncryptionKey              4.1.0      Az.Compute
Cmdlet          Set-AzSnapshotImageReference                 4.1.0      Az.Compute
Cmdlet          Set-AzSnapshotKeyEncryptionKey               4.1.0      Az.Compute
Cmdlet          Set-AzSnapshotUpdateDiskEncryptionKey        4.1.0      Az.Compute
Cmdlet          Set-AzSnapshotUpdateKeyEncryptionKey         4.1.0      Az.Compute
Cmdlet          Update-AzSnapshot                            4.1.0      Az.Compute

This all works in Bash too

List VM status

> az vm list --output table


Name       ResourceGroup   Location   Zones
---------- --------------- ---------- -------
webServer1 TEST-RG2-EASTUS eastus
webServer2 TEST-RG2-EASTUS eastus

With extra details

>az vm list --output table --show-details


Name       ResourceGroup    PowerState   PublicIps  Fqdns   Location   Zones
---------- --------------- ------------ ----------- ------- ---------- -------
webServer1 TEST-RG2-EASTUS VM running                       eastus
webServer2 TEST-RG2-EASTUS VM running                       eastus

You can grep the output

az ad user list


    "accountEnabled": true,
    "createdDateTime": "2020-03-31T16:00:44Z",
    "displayName": "Mike Test",
    "streetAddress": null,
    "userPrincipalName": "[email protected]",
    "userState": null,
    "userStateChangedOn": null,
    "userType": "Member"

With grep

> az ad user list | grep street


"streetAddress": null,
"streetAddress": null,

How to Use Variables in AZ CLI to speed up your typing

Use the Azure CLI

Populating the NIC ID into a variable

For any AZ command, use the “list” function which should produce the JSON output of that object. From there, look at the parent/children relationship of the list and query that value like the example below.
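Working out the parent/child path by eye gets tedious on deeply nested output. The Python sketch below walks saved JSON output and prints the path to every occurrence of a key, written in the bracket-and-dot style that --query expects. The function name is made up, and the sample is a trimmed, assumed shape of "az vm list-ip-addresses" output with an invented IP; run it against your own saved output.

```python
import json


def find_paths(node, key, prefix=""):
    """Return the distinct query-style paths at which `key` appears in parsed JSON."""
    paths = []
    if isinstance(node, dict):
        for name, child in node.items():
            path = f"{prefix}.{name}" if prefix else name
            if name == key:
                paths.append(path)
            paths.extend(find_paths(child, key, path))
    elif isinstance(node, list):
        for child in node:
            # "[]" stands for "every element of this list" in the query path.
            paths.extend(find_paths(child, key, f"{prefix}[]"))
    return list(dict.fromkeys(paths))  # de-duplicate while keeping order


# Trimmed, assumed output shape; the address is a made-up value.
saved_output = json.loads("""
[
  {
    "virtualMachine": {
      "name": "nva",
      "network": {
        "privateIpAddresses": ["10.0.2.4"],
        "publicIpAddresses": [
          {"name": "nva-ip", "ipAddress": "203.0.113.5"}
        ]
      }
    }
  }
]
""")

paths = find_paths(saved_output, "ipAddress")
print(paths)
```

The printed path can be pasted into --query as a starting point and then tightened as needed.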

> NICID=$(az vm nic list \
--resource-group <MyResourceGroup> \
--vm-name <MyVM> \
--query "[].{id:id}" \
--output tsv)


> NICNAME=$(az vm nic show \
--resource-group <MyResourceGroup> \
--vm-name <MyVM> \
--nic $NICID \
--query "{name:name}" --output tsv)


> echo $NICID

/subscriptions/<GUID>/resourceGroups/MyResourceGroup/providers/Microsoft.Network/networkInterfaces/<NIC ID>

 or get the PublicIP of an NVA that was created:

> NVAIP="$(az vm list-ip-addresses \
--resource-group <MyResourceGroup> \
--name nva \
--query "[].virtualMachine.network.publicIpAddresses[*].ipAddress" \
--output tsv)"


Then you can remotely fire commands like this:

IP forwarding

ssh -t -o StrictHostKeyChecking=no \
azureuser@$NVAIP 'sudo sysctl \
-w net.ipv4.ip_forward=1; exit;'

checking routing

ssh -t -o StrictHostKeyChecking=no \
azureuser@$PUBLICIP 'traceroute <vm name> \
--type=icmp; exit'


traceroute to (, 64 hops max 0.815ms 0.422 0.396ms 1.211ms 1.031ms 1.119ms
Connection to closed.

Azure Bastion (cloud jumpboxes)

Is the client using an AZ Bastion host?

The Bastion should be listed in:

AZ portal --> AZ Bastions

“Current Session Monitoring” allows you to view sessions and force disconnects.

To disconnect a session:

AZ portal --> Resource groups --> RG Name --> VNet --> Sessions

AZ Bastion audit logs may be turned on.

Find out where their log storage account is pointed and look for the Insight logs.

Example of a Bastion host JSON audit log.

   "userAgent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36",
   "message":"Successfully Connected.",

Getting an Endpoint’s “Effective Route”

This is useful if you need to understand the routing path or if you want to see if traffic was allowed to/from an endpoint. If you look at the NIC itself, you can see its “Effective route”

AZ portal --> Routing Tables --> Effective route 


w/o “service endpoint” enabled

Default Active          VNet
Default Active         Internet
Default Active           None
Default Active       None
Default Active       None

w/ “service endpoint” enabled

Default Active      VNet
Default Active      Internet
Default Active       None
Default Active   None
Default Active   None
Default Active, 10 more VirtualNetworkServiceEndpoint
Default Active, 9 more VirtualNetworkServiceEndpoint

How to Peer your Forensic VNet to the Bad VNet

These instructions use the Azure CLI, which is available from the Azure Portal or as a local plugin. Before trying to “one-off” this, I recommend scripting some of the variables below for future use. Again, huge thanks to the Microsoft team for creating such excellent training; what follows comes from their general VNet peering training.

Create the peering connections

Get credentials for the target VNet and make sure the IP ranges do not overlap
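Overlap is easy to check before you touch either VNet. This Python sketch compares the address prefixes of your DFIR VNet against those of the target VNet using the standard ipaddress module. The function name and all prefixes shown are made up; substitute the address spaces from both VNets.

```python
import ipaddress


def overlapping(dfir_prefixes, target_prefixes):
    """Return every (dfir, target) prefix pair whose address ranges overlap."""
    clashes = []
    for ours in dfir_prefixes:
        for theirs in target_prefixes:
            if ipaddress.ip_network(ours).overlaps(ipaddress.ip_network(theirs)):
                clashes.append((ours, theirs))
    return clashes


# Made-up address spaces for illustration.
dfir_vnet = ["10.10.0.0/16"]
target_vnet = ["10.10.4.0/24", "192.168.0.0/24"]

clashes = overlapping(dfir_vnet, target_vnet)
print(clashes or "No overlap, safe to peer")
```

If any pair comes back, re-address the DFIR VNet first; it is far easier to change your own forensic range than the victim's production network.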

From the 1st side

az network vnet peering create \
--name <Your DfirVnet>-To-<TargetVNet> \
--remote-vnet <TargetVNet> \
--resource-group $rg \
--vnet-name <Your DfirVnet> \
--allow-vnet-access

From the 2nd side

az network vnet peering create \
--name <TargetVNet>-To-<Your DfirVnet> \
--remote-vnet <Your DfirVnet> \
--resource-group <their RG> \
--vnet-name <Their Vnet name> \
--allow-vnet-access

Check the peering connection

az network vnet peering list \
--resource-group $rg \
--vnet-name <Your DfirVnet> \
--output table


AllowForwardedTraffic AllowGatewayTransit AllowVirtualNetworkAccess Name PeeringState ProvisioningState ResourceGroup UseRemoteGateways
----------------------- -----
False False True DfirVnet-To-<TargetVNet> Connected Succeeded $rg False
False False True <TargetVNet>-To-DfirVnet Connected Succeeded $rg False

Check the peering from the target VNet’s side:

az network vnet peering list \
--resource-group $rg \
--vnet-name <TargetVNet> \
--output table


AllowForwardedTraffic AllowGatewayTransit AllowVirtualNetworkAccess Name PeeringState ProvisioningState ResourceGroup UseRemoteGateways
----------------------- -----
False False True <TargetVNet>-To-DfirVnet Connected Succeeded $rg False
False False True DfirVnet-To-<TargetVnet> Connected Succeeded $rg False

Check the NIC route

az network nic show-effective-route-table \
--resource-group $rg \
--name <DfirVM NIC> \
--output table


Source   State Address Prefix Next Hop Type Next Hop IP
-------- ------- ---------------- ----------------- -------------
Default  Active <cidr> VnetLocal
Default  Active <cidr> VNetPeering
Default  Active <cidr> Internet
Default  Active <cidr> None
Default  Active <cidr> None
Default  Active <cidr> None
Default  Active <cidr> None
Default  Active <cidr> None
Default  Active <cidr> VNetGlobalPeering
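When reading an effective route table, the route that wins for a given destination is the one with the longest matching prefix. The Python sketch below applies that rule to a small table so you can predict the next hop for a suspect address. The function name and prefixes are made up (the output above only shows placeholders), and the sketch ignores the tie-break between user-defined, BGP, and system routes that share the same prefix.

```python
import ipaddress


def next_hop(routes, destination):
    """Return the next hop type of the longest-prefix route that contains destination."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes
        if address in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Made-up prefixes paired with next hop types like those in the table above.
routes = [
    ("10.1.0.0/16", "VnetLocal"),
    ("10.2.0.0/16", "VNetPeering"),
    ("0.0.0.0/0", "Internet"),
    ("10.0.0.0/8", "None"),
]

for target in ("10.1.0.4", "10.2.0.5", "10.9.9.9", "8.8.8.8"):
    print(target, "->", next_hop(routes, target))
```

A next hop of None means the traffic is dropped, so a peered range showing up as VNetPeering rather than None is the confirmation you are looking for.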

Check the routing by SSH’ing into the VMs. Confirm SSH/RDP is open first, of course. Read the “How to Build a Cloud Forensic Workstation” section below to get an idea of traffic rule creation.

In PowerShell:

test-netconnection -computername <target host> -port <port>

(Back to your AZ CLI) List the VM’s and routes

az vm list \
--resource-group <RG. Or the var you assigned $rg> \
--query "[*].{Name:name, PrivateIP:privateIps, PublicIP:publicIps}" \
--show-details \
--output table


Name PrivateIP PublicIP
----------- ----------- --------------
DfirVM1 <PrivateIP> <PublicIP>

Assign the public IP to a variable for ease of use

PUBLICIP="$(az vm list-ip-addresses \
--resource-group $rg \
--name <DfirVM name> \
--query "[].virtualMachine.network.publicIpAddresses[*].ipAddress" \
--output tsv)"

Check the SSH connection

ssh -o StrictHostKeyChecking=no <USERNAME>@<IP>

Then SSH from that one, to another (on the private IP) and see if it works

How To Create a Site-To-Site VPN


How to Build a Cloud Forensic Workstation

Build the DFIR VNet

List and choose the location for the VNet

>az account list-locations -o table


DisplayName           Latitude    Longitude    Name
--------------------  ----------  -----------  ------------------
East Asia             22.267      114.188      eastasia
Southeast Asia        1.283       103.833      southeastasia
Central US            41.5908     -93.6208     centralus
East US               37.3719     -79.8164     eastus
East US 2             36.6681     -78.3889     eastus2
West US               37.783      -122.417     westus
North Central US      41.8819     -87.6278     northcentralus
South Central US      29.4167     -98.5        southcentralus
North Europe          53.3478     -6.2597      northeurope
West Europe           52.3667     4.9          westeurope
Japan West            34.6939     135.5022     japanwest
Japan East            35.68       139.77       japaneast
Brazil South          -23.55      -46.633      brazilsouth
Australia East        -33.86      151.2094     australiaeast
Australia Southeast   -37.8136    144.9631     australiasoutheast
South India           12.9822     80.1636      southindia
Central India         18.5822     73.9197      centralindia
West India            19.088      72.868       westindia
Jio India West        22.470701   70.05773     jioindiawest
Canada Central        43.653      -79.383      canadacentral
Canada East           46.817      -71.217      canadaeast
UK South              50.941      -0.799       uksouth
UK West               53.427      -3.084       ukwest
West Central US       40.890      -110.234     westcentralus
West US 2             47.233      -119.852     westus2
Korea Central         37.5665     126.9780     koreacentral
Korea South           35.1796     129.0756     koreasouth
France Central        46.3772     2.3730       francecentral
France South          43.8345     2.1972       francesouth
Australia Central     -35.3075    149.1244     australiacentral
Australia Central 2   -35.3075    149.1244     australiacentral2
UAE Central           24.466667   54.366669    uaecentral
UAE North             25.266666   55.316666    uaenorth
South Africa North    -25.731340  28.218370    southafricanorth
South Africa West     -34.075691  18.843266    southafricawest
Switzerland North     47.451542   8.564572     switzerlandnorth
Switzerland West      46.204391   6.143158     switzerlandwest
Germany North         53.073635   8.806422     germanynorth
Germany West Central  50.110924   8.682127     germanywestcentral
Norway West           58.969975   5.733107     norwaywest
Norway East           59.913868   10.752245    norwayeast
Brazil Southeast      -22.90278   -43.2075     brazilsoutheast
West US 3             33.448376   -112.074036  westus3

Create variables for RG and Location, then create the RG

This example uses a Resource Group named “DfirResourceGroup”. When you assign address ranges to the DFIR VNet later, don’t overlap your target RG’s IP range like I do below…
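If you want to sanity-check two address ranges before peering, the overlap test is simple arithmetic: two CIDR blocks overlap exactly when they are identical under the shorter of the two prefixes. Below is a minimal sketch in plain POSIX shell. The helper names (ip_to_int, cidr_overlap) and the sample ranges are hypothetical and are not part of the AZ CLI; it handles IPv4 only and does no input validation.

```shell
# Hypothetical helpers (not part of the AZ CLI): IPv4 only, no input validation.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
)

# Print "overlap" if two CIDR blocks share any address, otherwise "disjoint".
cidr_overlap() (
  a_ip=${1%/*}; a_len=${1#*/}
  b_ip=${2%/*}; b_len=${2#*/}
  # Two blocks overlap exactly when they match under the shorter prefix.
  len=$a_len
  if [ "$b_len" -lt "$len" ]; then len=$b_len; fi
  mask=$(( (4294967295 << (32 - $len)) & 4294967295 ))
  a=$(( $(ip_to_int "$a_ip") & $mask ))
  b=$(( $(ip_to_int "$b_ip") & $mask ))
  if [ "$a" -eq "$b" ]; then echo overlap; else echo disjoint; fi
)

cidr_overlap 10.0.0.0/16 10.0.1.0/24   # prints "overlap"
cidr_overlap 10.0.0.0/16 10.1.0.0/16   # prints "disjoint"
```

Run it against the target VNet's prefix and the prefix you plan to give the DFIR VNet; anything other than "disjoint" means the peering will be rejected or the routes will be ambiguous.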

> rg=DfirResourceGroup

> location=<location you want>

> az group create \
--name $rg \
--location $location


az group list -o table

Name                                Location        Status
----------------------------------  --------------  ---------
DfirResourceGroup                   southcentralus  Succeeded
NetworkWatcherRG                    southcentralus  Succeeded

Create the DFIR VNet and subnet.

This one also creates a subnet, which I recommend if you ever intend to peer an infected VNet to your DFIR VNet. This way you can control the traffic between your forensic workstations and the infected endpoints.

> az network vnet create \
--resource-group $rg \
--name DfirVnet \
--address-prefix <DFIR VNet CIDR> \
--subnet-name DfirSubnet1 \
--subnet-prefix <DFIR subnet CIDR> \
--location $location

Verify the network and subnet were created

$ az network vnet list -o table

Name      ResourceGroup      Location        NumSubnets  Prefixes     DnsServers    DDOSProtection
--------  -----------------  --------------  ----------  -----------  -----------  ---------------
DfirVnet  DfirResourceGroup  southcentralus  1                   False

$ az network vnet subnet list -g $rg --vnet-name DfirVnet -o table

AddressPrefix  Name         Priv..Pol..  Priv..Serv..Pol..  ProvisioningState  RG
-------------  -----------  -----------  -----------------  -----------------  -----------------
<cidr>         DfirSubnet1  Enabled      Enabled            Succeeded          DfirResourceGroup

Create the Network Security Group (NSG)

az network nsg create \
--resource-group $rg \
--name DfirNsg


~$ az network nsg list -o table
Location        Name     ProvisioningState    ResourceGroup      ResourceGuid
--------------  -------  -------------------  -----------------  ----------------------
southcentralus  DfirNsg  Succeeded            DfirResourceGroup  <resource GUID>

Creating NSG rules (allow SSH)

Create whatever rules your environment needs. These two are just examples.

az network nsg rule create \
--resource-group $rg \
--nsg-name DfirNsg \
--name AllowSSHRule \
--direction Inbound \
--priority 100 \
--source-address-prefixes <source IP range> \
--destination-address-prefixes <dest IP range> \
--destination-port-ranges 22 \
--access Allow \
--protocol Tcp \
--description "Allow inbound SSH"

Hardening – Creating Deny Rules

> az network nsg rule create \
--resource-group $rg \
--nsg-name DfirNsg \
--name Deny_Internet \
--direction Outbound \
--priority 200 \
--source-address-prefixes <DfirSubnet CIDR> \
--source-port-ranges '*' \
--destination-address-prefixes Internet \
--destination-port-ranges '*' \
--access Deny \
--protocol '*' \
--description "Deny access to Internet"

NOTE: This is a good time to mention the basic command to view NSG rules

$ az network nsg rule list \
--resource-group $rg \
--nsg-name DfirNsg \
-o table


[I abbreviated for readability and removed the Application Security Group (ASG) header info since it is not relevant in this example]

Name          ResourceGroup      Pri  SrcPorts SrcAddrPref Access Prot Dir DstPorts  DstAddrPref  
-------------  ---------------- 
AllowSSHRule  DfirResourceGroup  100  * Allow Tcp Inbound 80  None
Deny_Internet DfirResourceGroup  200  * Deny  *   Outbound 80 *           None

Create a Storage account

> az storage account create \
--resource-group $rg \
--name <DfirStorageAccount> \
--sku <e.g. Standard_LRS>

Store the primary key in a variable

> STORAGEKEY=$(az storage account keys list \
-g $rg \
--account-name <DfirStorageAccount> \
--query "[0].value" \
--output tsv)

Create a file share

> az storage share create \
--account-name <DfirStorageAccount> \
--account-key $STORAGEKEY \
--name "dfirfileshare"


$ az storage share list --account-key $STORAGEKEY --account-name <DfirStorageAccount> -o table

Name           Quota    Last Modified
-------------  -------  -------------------------
dfirfileshare  5120     2021-05-06T20:00:54+00:00

Restrict file share access to only your network and only on the AZ backbone (no Internet access)

# This assigns the Microsoft.Storage endpoint to the subnet 

az network vnet subnet update \
    --vnet-name DfirVnet \
    --resource-group $rg \
    --name DfirSubnet1 \
    --service-endpoints Microsoft.Storage

# This is an explicit deny making the storage account inaccessible

az storage account update \
--resource-group $rg \
--name <DfirStorageAccount> \
--default-action Deny

# This is an explicit accept for only your vnet

az storage account network-rule add \
--resource-group $rg \
--account-name <DfirStorageAccount> \
--vnet-name <your vnet> \
--subnet <your subnet>

Create your forensic VM

Will expand this section later to include AZ CLI-based VM creation. For now, pick the VM that applies to your use case.

 AZ portal --> Virtual Machines 

Check the forensic VM (or use the watch command next)

az vm list \
--resource-group $rg \
--show-details \
--query "[*].{Name:name, Provisioned:provisioningState, Power:powerState}" \
--output table


Name         Provisioned   Power
----------   ------------- ----------
DfirVM1      Succeeded     VM running

Or watch the VM’s get spun up

watch -d -n 5 "az vm list \
--resource-group $rg \
--show-details \
--query '[*].{Name:name, ProvisioningState:provisioningState, PowerState:powerState}' \
--output table"

How To Leverage CosmosDB for DFIR



All screenshots and quotes originate from or my personal Azure cloud subscription.

Container Forensics

Last major content update: 13 April, 2021


The target audience for this article is seasoned forensicators that have a beginner/intermediate knowledge of containers. If you need to brush up on the basics, I recommend the following:

On to the forensicating part…

Due to their self-healing nature and the fact that containers are usually running something business-critical, be prepared for the very high likelihood that containers will be terminated before you get to the forensic part. Containers are a newer IT paradigm in which diagnosing a problem takes longer than just blowing it all away and starting new containers from the gold images. Consequently, IT admins often do not bother with diagnosis. Moreover, many of the health/status tools automatically identify and kill malfunctioning containers. All of that sucks for the forensic examiner, but is unquestionably valuable to the business. Even worse, containers are ephemeral and do not have inherently persistent storage, so unless the logs are written to an external location you will not be able to quickly diagnose a root cause once the container has been terminated.

With those points in mind, let us address the obvious question, “If containers heal themselves, and the IT folks remediate by cleanly refreshing them, why bother learning container forensics?” Because there are use cases that require investigation, even if you might have to do it with one forensic arm tied behind your back. Some examples are a rogue IT team member running a coin mining pod, ransomware that has broken out of container isolation, or a new zero-day vulnerability for which we are woefully unprepared (that one doesn’t happen, right?). In summary, and as mentioned in previous articles, knowledge and preparation are key, and the middle of a time-sensitive incident response is not the time to be learning how containers function.

A couple disclaimers:

  • I am not an expert in container architecture so if you think I am wrong about anything, please comment below and I will correct it.
  • Almost none of this information represents original research on my part, but rather consolidating others’ excellent research into a guide of sorts. If you see anything that I failed to cite please let me know at [email protected] and I will fix it immediately.
  • Although I am adding content for AWS and Azure, I have not had much experience in GCP so I apologize, both to the reader and to Google….who actually created Kubernetes. The good part is a lot of this will apply to GCP as well.


  • Docker – a popular container format
  • Image – the container’s gold copy
  • Manifest – the container’s configuration information (how many layers, size, OS, etc.) This is the description of the docker image. Below is an example of a Docker Manifest taken from here.
    {
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": {
            "mediaType": "application/vnd.docker.container.image.v1+json",
            "size": 7023,
            "digest": "sha256:b5b2b2c507a0944348e0303114d8d93aaaa081732b86451d9bce1f432a537bc7"
        },
        "layers": [
            {
                "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                "size": 32654,
                "digest": "sha256:e692418e4cbaf90ca69d05a66403747baa33ee08806650b51fab815ad7fc331f"
            },
            {
                "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                "size": 16724,
                "digest": "sha256:3c3a4604a545cdc127456d94e421cd355bca5b528f4a9c1905b15da2eb4a4c6b"
            },
            {
                "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                "size": 73109,
                "digest": "sha256:ec4b8955958665577945c89419d1af06b5f7636b4ac3da7f12184802ad867736"
            }
        ]
    }


  • Kubernetes – a popular orchestration framework that automates container operations
  • Pod – a Kubernetes term for one or more containers that share the same configuration, networking, etc. K8S assigns a single DNS to a Pod. It helps if you notice Docker’s emblem is a whale and a group of whales is a pod.
  • Deployment / Replica Set – The Deployment object is your desired pod state. It manages the number of pod replicas. In most cases, you are only going to see containers/pods get created through a “kind: Deployment” object, and not directly through a “kind: Pod“.
    To ease the confusion, the hierarchy is….
    Deployment –> Replica Sets –> Pods –> Container
  • The Service – Although pods on the same host can see each other, this is the REST object that provides pod/pod and pods/non-pod interaction. For example, a front-end pod to a backend pod, or a pod to the load balancer, persistent storage, database, etc. Like most things in K8S, the Service is a REST object and will have its own IP/DNS. Moreover, since the point of all this is autonomous orchestration of groups of containers, K8S will build/destroy the containers as needed; that means the IP’s and container states are not persistent. That is what the Service does; it creates a stable IP for networking purposes.

This is not a complete list and will be supplemented over time. Please consider providing your comments via Twitter or LinkedIn at the links below. For such a new technology, any input is most welcome!

Yaml files

YAML is the chosen format for Kubernetes configuration files. As far as I know, there is no fixed path. They can even be piped objects. However, these are the files you should request from the IT team as they contain the number of pods, image types, ports, etc.

Jumping over to Docker for a minute, it is worthwhile to note you will also see YAML files used in the implementation for “Docker compose”. While “Docker build” may be the method to create a docker image from some baseline (as well as add whatever the container needs for the specific use case), “Docker compose” is the current best practice, as it can read a Dockerfile, as well as the build files of other relevant containers (db backend, service objects, etc.), and spin up the whole stack.

Back to YAML now. An in-depth description of YAML is outside the scope of this article. However, it is worth understanding two points. First, YAML is a superset of JSON, so you may see the system administrators using JSON files. Second, YAML has only two structures, Lists and Maps, so if it does not look like a list it is a map. The third equally important point of our two points is that YAML is nest-able.

The most important name-value pair in these YAML files is the “kind” object. For example, the one below is labeled “kind: Deployment.” That will create a deployment object, which deploys and manages the pods. If that field says “kind: pod“, that would instead create a pod.

NOTE: You may want to look at any command line executions that are baked directly into the creation. See “Pod Manifest (Example with Command Execution)” below.

kubectl apply -f ./<my-manifest>.yaml

Pod Manifest (Web App Example)

A pod manifest is a YAML document that contains the startup parameters and application defaults for when a web application is launched.

Here’s an example manifest for a web server pod named nginx-demo.

apiVersion: v1
kind: Pod
metadata:
  name: nginx-demo
  labels:
    role: myrole
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - name: web
          containerPort: 80
          protocol: TCP


Pod Manifest (Example with Command Execution)

This is a pod manifest that has command line executions baked in.

apiVersion: v1
kind: Pod
metadata:
  name: command-demo
  labels:
    purpose: demonstrate-command
spec:
  containers:
  - name: command-demo-container
    image: debian
    command: ["printenv"]
  restartPolicy: OnFailure


Deployment Manifest (Web App Example)

Here is a manifest that deploys a web app replica set.

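apiVersion: apps/v1
kind: Deployment
metadata:
  name: mywebapp-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80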


Command line history

Commands sent to the orchestrator (Kubernetes) and containers (Docker) will come from a source workstation. So the originating source of the system administrator’s command line history (e.g. bash history, PowerShell Get-History, etc) should be in-scope.

You should also obtain copies of the system administrator’s locally-stored copies of the YAML files that were initially used to set up the containers and orchestration.

Also, if you believe the maliciousness does not originate from the host straight to its child containers, you may want to assign someone to the network forensics portion. See Incident Response in the Cloud – Part I –> Detection –> Network Forensics

The Service Definition

As stated previously, the Service is the piece that allows the specific interaction between pods and between pod/non-pod resources, like an external load balancer, persistent storage, database, etc. The Service is a REST object and will have its own IP/DNS.

This is an example of the creation (via “POST”) of a Kubernetes Service named “myservice”, which will get an assigned IP and will map incoming traffic from TCP 80 to the pod’s specified listening port, TCP 9376.

apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  selector:
    app: myservice
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376


Cloud Orchestration logs

These are some of the more useful commands that cloud providers have created in order to interact with the containers.




Application logs

Container application logs (Az container logs) can be viewed in the command line

az container logs --resource-group myResourceGroup --name mycontainer
Traceback (most recent call last):
File "", line 11, in
urllib.request.urlretrieve (sys.argv[1], "foo.txt")
File "/usr/local/lib/python3.6/urllib/", line 248, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/local/lib/python3.6/urllib/", line 223, in urlopen
return, data, timeout)
File "/usr/local/lib/python3.6/urllib/", line 532, in open
response = meth(req, response)
File "/usr/local/lib/python3.6/urllib/", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/local/lib/python3.6/urllib/", line 570, in error
return self._call_chain(args)
File "/usr/local/lib/python3.6/urllib/", line 504, in _call_chain
result = func(args)
File "/usr/local/lib/python3.6/urllib/", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

From: Microsoft Azure container logs reference

Container startup logs

Explained further in the forensic workflow section. Essentially, this command can direct the container’s stdout and stderr to you.

az container attach --resource-group myResourceGroup --name mycontainer

Docker Desktop (WSL 2)

The data sources below capture interactions between:

1) the Windows host –> Docker Desktop (a MobyLinuxVM using WSL 2.)
2) The Docker Desktop VM –> Docker containers
3) NOT the Docker container commands (AFAIK), which are likely the most relevant. For that data source, please refer to the subsequent sections of this article.

Be advised, those paths are reconfigurable by the system administrator.

> The Docker Desktop host VM (.vhdx)

> Logs, Cache, BlobStorage, Leveldb, Cookies, etc 
     C:\Users\<username>\AppData\Roaming\Docker Desktop

> POSTs to WSL2 VM (MobyLinuxVM running Docker Desktop)

Data Volumes

As mentioned previously, containers do not have inherently persistent storage. It is necessary to either ask the system administrator for the mount points or look for the “Mounts” section in the output of “docker inspect <container>“.

Mounts come in three flavors

  • Volumes
    • For Linux, they should live in /var/lib/docker/volumes
    • For Windows, (i.e. Docker Desktop) please refer to the specific Docker Desktop (WSL 2) section. But essentially, the volumes should live within its “host” ext4 file system on the vhdx volume.
  • Bind mounts
    • These can exist anywhere. Look for them with the Docker inspect command above
  • tmpfs mounts
    • These are in the host RAM, so acquire it. Especially considering these mounts can be through named pipes as well.


Given the reliance on the REST API it may be worthwhile to perform a network capture on the host VM, if there is a suspicion of malicious traffic. I have not tried this myself, but I believe you should be able to do this through the use of a “sidecar.”
Another possibility is ksniff, a pretty inventive tool from Eldad Rudich, that uses kubectl to upload tcpdump to your container, then pushes the output to your Wireshark instance.

DFIR Workflow

Please be advised, this will be supplemented/edited whenever I have time. Hopefully, the quick workflow below is, at least, a small amount of help during your cloud response.

Urgent Containment Needed?

Quarantine the container if possible.

Docker containers can be paused/suspended with “docker pause“. On Linux, this leverages the freezer cgroup and the processes will not be aware or able to control it. On Windows, the container can be paused if it is using Hyper-V.

Jonathan Seawright astutely pointed out the container provider may also be able to quarantine a container with a firewall isolation policy. I would add the host EDR tools may have some capability to quarantine the host itself.

It is important to note, these are capabilities you should understand in your environment before employing, as any quarantine attempts, whether at the container or host level, have the potential for instability. Additionally, Docker creates its own iptables and config entries so it is exceedingly difficult to quarantine a container with just a simple command line.

If you cannot pause the container, I recommend stopping it, which will likely kill it. Reference SIGTERM / SIGKILL in the next section’s command list

Not an Urgent Containment

Analyze for Container Escape

StackRox did a short, but outstanding, blog post on Container Escape and I recommend reading it in-full here.

They recommend you review the container’s deployment.yaml to see if the container was launched with any of the flags below. If these flags are present in the launch, then you may want to image the host. A good next step would be to follow the instructions provided by Jonathan Greig in his excellent article on post-host-imaged container analysis.

If the flags below are not present, then you can limit your analysis to the container processes.

  • --cap-add *
  • --device *
  • --device-cgroup-rule
  • --ipc system
  • --mount /
  • --pid system
  • --privileged
  • --security-opt
  • --volume /
  • --volumes-from
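If the administrator can hand you the launch script or an exported shell history, a quick grep will surface any command that used the flags listed above. This is only a rough sketch: the function name find_escape_flags and the sample file are hypothetical, and it matches Docker-style long flags only (not the short -v form of --volume). A Kubernetes manifest expresses the same ideas with different keys (for example, privileged: true under securityContext), so it would need its own pattern.

```shell
# Hypothetical helper: print every line (with its line number) of a launch
# script or exported command history that carries one of the flags above.
find_escape_flags() {
  grep -n -E -e '--(cap-add|device|device-cgroup-rule|ipc|mount|pid|privileged|security-opt|volume|volumes-from)([= ]|$)' "$1"
}

# Demo on a throwaway sample file
sample=$(mktemp)
cat > "$sample" <<'EOF'
docker run -d --name web -p 80:80 nginx
docker run -d --name miner --privileged --cap-add SYS_ADMIN alpine sleep 1d
EOF
find_escape_flags "$sample"   # only line 2 is reported
rm -f "$sample"
```

A hit is not proof of an escape; it only tells you the container had the means, which is the point at which imaging the host becomes worth the time.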

No Container Escape
  • Acquire relevant data sources referenced in the “Artifacts” section above and/or in the Azure/AWS sections that follow
  • Suspend/pause the container using
    • docker container pause <container name>
    • This will keep the memory/differencing intact.
  • Commit the container’s changes to a new image for spinning up in your forensic environment.
    • docker commit $CONTAINER_ID <new image name>
  • [OPTIONAL 1] Remove the misbehaving container from the orchestrator. Kubectl has a couple of ways to do this. Recommend deferring to the business unit’s system administrator or referring to the K8S documents.
  • Launch the new container in an isolated forensic environment for analysis (e.g. your AWS forensic VPC, your Azure forensic Resource Group, or your on-prem forensic lab environment.)
  • [OPTIONAL 2] Consider going old school on the containers and comparing a gold image hash set to the container you are about to acquire.
    • Get a known hash set
      • Spin up a clean container or run the hash command (below) against known clean containers. Make sure to use the original deployment config files. I recommend getting the exact steps from the system administrator.
    • Get the hashes from the suspicious container
    • Compare the hashes
      • Run the below command on the gold containers and the infected containers, then remove all the common files. Or load the known hash set into your forensic tool of choice and decimate the data set. (note the parenthesis)
  • [OPTIONAL 3] As mentioned in the Container Escape section above, if you want to pursue an alternate workflow and you have the time to get an image of the full host, you can follow the instructions provided by Jonathan Greig in his excellent article on host-based container analysis. I have also included some useful commands in the sections that follow.

docker exec <containerID> sh -c \
  "find . -type f -print0 | \
  xargs -0 md5sum" > <hash list file>
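Once you have one hash list from a gold container and one from the suspicious container, the "remove all the common files" step can be done with standard tools. The sketch below assumes both files are in md5sum's usual "<hash>  <path>" format and were generated from the same starting directory; the function name diff_hashes is hypothetical.

```shell
# Hypothetical helper: list entries in the suspect hash file that do not
# appear verbatim in the gold hash file (new files or changed content).
# Usage: diff_hashes <gold hash list> <suspect hash list>
diff_hashes() (
  gold_sorted=$(mktemp)
  suspect_sorted=$(mktemp)
  # comm needs both inputs sorted the same way, so pin the locale.
  LC_ALL=C sort "$1" > "$gold_sorted"
  LC_ALL=C sort "$2" > "$suspect_sorted"
  # -13 suppresses lines unique to the gold list and lines common to both.
  LC_ALL=C comm -13 "$gold_sorted" "$suspect_sorted"
  rm -f "$gold_sorted" "$suspect_sorted"
)
```

What remains is the short list of files worth pulling out of the container first. Files that exist only in the gold image (deleted from the suspect container) are not shown by this sketch; swapping the two arguments lists those.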



Container Host Forensics

Microsoft did a great job at explaining Azure container commands here. The forensically-useful points are below.

Container Startup Diagnostics

This attaches your local console to the container’s stdout and stderr streams so you can read the ongoing logs.

az container attach --resource-group myResourceGroup --name <mycontainer>

Example Output

Container 'mycontainer' is in state 'Unknown'…
Container 'mycontainer' is in state 'Waiting'…
Container 'mycontainer' is in state 'Running'…
(count: 1) (last timestamp: 2019-03-21 19:42:39+00:00) pulling image "<imagename>:latest"
Container 'mycontainer' is in state 'Running'…
(count: 1) (last timestamp: 2019-03-21 19:42:39+00:00) pulling image "<imagename>:latest"
(count: 1) (last timestamp: 2019-03-21 19:42:52+00:00) Successfully pulled image "<imagename>:latest"
(count: 1) (last timestamp: 2019-03-21 19:42:55+00:00) Created container
(count: 1) (last timestamp: 2019-03-21 19:42:55+00:00) Started container

Container Startup Events

az container show --resource-group myResourceGroup --name <mycontainer>

Example output:

{
  "containers": [
    {
      "command": null,
      "environmentVariables": [],
      "image": "<aci-myimage>",
      ...
        "events": [
          {
            "count": 1,
            "firstTimestamp": "2019-03-21T19:46:22+00:00",
            "lastTimestamp": "2019-03-21T19:46:22+00:00",
            "message": "pulling image \"<aci-myimage>\"",
            "name": "Pulling",
            "type": "Normal"
          },
          {
            "count": 1,
            "firstTimestamp": "2019-03-21T19:46:28+00:00",
            "lastTimestamp": "2019-03-21T19:46:28+00:00",
            "message": "Successfully pulled image \"<aci-myimage>\"",
            "name": "Pulled",
            "type": "Normal"
          },
          {
            "count": 1,
            "firstTimestamp": "2019-03-21T19:46:31+00:00",
            "lastTimestamp": "2019-03-21T19:46:31+00:00",
            "message": "Created container",
            "name": "Created",
            "type": "Normal"
          },
          {
            "count": 1,
            "firstTimestamp": "2019-03-21T19:46:31+00:00",
            "lastTimestamp": "2019-03-21T19:46:31+00:00",
            "message": "Started container",
            "name": "Started",
            "type": "Normal"
          }
        ],
        "previousState": null,
        "restartCount": 0
      },
      "name": "<mycontainer>",
      "ports": [
        {
          "port": 80,
          "protocol": null
        }
      ],
      ...
    }
  ],
  ...
}

Container Interactive Session

If you want to query the container in a non-forensically-sound manner, you can launch an interactive session. Consider then acquiring the data points enumerated by Sandfly Security.

# Launch a shell in the container
az container exec \
--resource-group <my-resourcegroup> \
--name mycontainer \
--exec-command "/bin/sh"

(Or you can execute a single command from right here, similar to the hash command in the previous section.)

Check the container metrics using Azure Monitor.

Here is a list of the currently available container metrics. Be advised, AZ Monitor’s default retention is 93 days.

AZ Monitor is set to provide the averages over all the containers in a container group, but you can get the individual container metrics by using an Azure “dimension.”

You can also use the Azure CLI to get the instance-specific container metrics. Microsoft provides the following instructions to do so.

Below is an example of container metrics by container group (not by individual container). To get the individual container metrics, just add --dimension containerName.

CPU metrics

CONTAINER_ID=$(az container show \
--resource-group <my azure resource group> \
--name <mycontainer> \
--query id \
--output tsv)

az monitor metrics list \
--resource $CONTAINER_ID \
--metric CPUUsage \
--output table

Example Output:

Timestamp Name Average
2018-08-20 21:39:00 CPU Usage
2018-08-20 21:40:00 CPU Usage
2018-08-20 21:41:00 CPU Usage
2018-08-20 21:42:00 CPU Usage
2018-08-20 21:43:00 CPU Usage 0.375
2018-08-20 21:44:00 CPU Usage 0.875
2018-08-20 21:45:00 CPU Usage 1
2018-08-20 21:46:00 CPU Usage 3.625
2018-08-20 21:47:00 CPU Usage 1.5
2018-08-20 21:48:00 CPU Usage 2.75
2018-08-20 21:49:00 CPU Usage 1.625
2018-08-20 21:50:00 CPU Usage 0.625
2018-08-20 21:51:00 CPU Usage 0.5
2018-08-20 21:52:00 CPU Usage 0.5
2018-08-20 21:53:00 CPU Usage 0.5
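
When the table runs to hundreds of rows, it can help to filter for the samples that stand out. The awk one-liner below is a rough sketch that assumes the table layout shown above, with the Average as the last column; the function name flag_spikes and the threshold are hypothetical and should be tuned to the workload's normal baseline.

```shell
# Hypothetical helper: read "az monitor metrics list --output table" text on
# stdin and print "<date> <time> <average>" for samples above a threshold.
# Header, separator, and empty-average rows are skipped because their last
# field is not a number.
flag_spikes() {
  awk -v thr="$1" '$NF ~ /^[0-9.]+$/ && $NF + 0 > thr { print $1, $2, $NF }'
}

# Usage (threshold of 2 is an arbitrary example):
#   az monitor metrics list --resource $CONTAINER_ID --metric CPUUsage \
#     --output table | flag_spikes 2
```

Against the example output above, a threshold of 2 would leave only the 21:46 and 21:48 samples, which gives you a time window to line up against the container logs and events.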

Memory metrics

az monitor metrics list \
--resource $CONTAINER_ID \
--metric MemoryUsage \
--output table

Example Output:

Timestamp Name Average
2018-08-20 21:43:00 Memory Usage
2018-08-20 21:44:00 Memory Usage 0.0
2018-08-20 21:45:00 Memory Usage 15917056.0
2018-08-20 21:46:00 Memory Usage 16744448.0
2018-08-20 21:47:00 Memory Usage 16842752.0
2018-08-20 21:48:00 Memory Usage 17190912.0
2018-08-20 21:49:00 Memory Usage 17506304.0
2018-08-20 21:50:00 Memory Usage 17702912.0
2018-08-20 21:51:00 Memory Usage 17965056.0
2018-08-20 21:52:00 Memory Usage 18509824.0
2018-08-20 21:53:00 Memory Usage 18649088.0
2018-08-20 21:54:00 Memory Usage 18845696.0
2018-08-20 21:55:00 Memory Usage 19181568.0

More Useful DFIR Commands

Kubernetes & Docker (also AWS & Azure for that matter) use a base command with options. They can be called with --help for additional instructions.


Get commands with basic output

# List all services in the namespace
kubectl get services

# List all pods in all namespaces
kubectl get pods --all-namespaces

# List all pods in the current namespace, with more details 
kubectl get pods -o wide

# List a particular deployment
kubectl get deployment <my-deployment>

# List all pods in the namespace
kubectl get pods

# Get a pod's YAML
kubectl get pod my-pod -o yaml

# Describe commands with verbose output
kubectl describe nodes <my-node>
kubectl describe pods <my-pod>

# List Services Sorted by Name
kubectl get services --sort-by=.metadata.name

# List pods Sorted by Restart Count
kubectl get pods --sort-by='.status.containerStatuses[0].restartCount'

# List PersistentVolumes sorted by capacity
kubectl get pv --sort-by=.spec.capacity.storage

# Get the version label of all pods with label app=cassandra
kubectl get pods --selector=app=cassandra -o \
jsonpath='{.items[*].metadata.labels.version}'

# Retrieve the value of a key with dots, e.g. 'ca.crt'
kubectl get configmap <myconfig> \
-o jsonpath='{.data.ca\.crt}'

# Get all worker nodes (use a selector to exclude results that have a label
# named 'node-role.kubernetes.io/control-plane')
kubectl get node --selector='!node-role.kubernetes.io/control-plane'

# Get all running pods in the namespace
kubectl get pods --field-selector=status.phase=Running

# Get ExternalIPs of all nodes
kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}'

# List Names of Pods that belong to Particular RC
# "jq" command useful for transformations that are too complex for jsonpath
sel=${$(kubectl get rc my-rc --output=json | jq -j '.spec.selector | to_entries | .[] | "\(.key)=\(.value),"')%?}
echo $(kubectl get pods --selector=$sel --output=jsonpath={.items..metadata.name})

# Show labels for all pods (or any other Kubernetes object that supports labelling)
kubectl get pods --show-labels

# Check which nodes are ready
JSONPATH='{range .items[*]}{@.metadata.name}:{range @.status.conditions[*]}{@.type}={@.status};{end}{end}' \
&& kubectl get nodes -o jsonpath="$JSONPATH" | grep "Ready=True"

# Output decoded secrets without external tools
kubectl get secret my-secret -o go-template='{{range $k,$v := .data}}{{"### "}}{{$k}}{{"\n"}}{{$v|base64decode}}{{"\n\n"}}{{end}}'

# List all Secrets currently in use by a pod
kubectl get pods -o json | jq '.items[].spec.containers[].env[]?.valueFrom.secretKeyRef.name' | grep -v null | sort | uniq

# List all containerIDs of initContainer of all pods
# Helpful when cleaning up stopped containers, while avoiding removal of initContainers.
kubectl get pods --all-namespaces -o jsonpath='{range .items[*].status.initContainerStatuses[*]}{.containerID}{"\n"}{end}' | cut -d/ -f3

# List Events sorted by timestamp
kubectl get events --sort-by=.metadata.creationTimestamp

# Compares the current state of the cluster against the state that the cluster would be in if the manifest was applied.
kubectl diff -f ./<my-manifest>.yaml

# Produce a period-delimited tree of all keys returned for nodes. Helpful when locating a key within a complex nested JSON structure
kubectl get nodes -o json | jq -c 'path(..)|[.[]|tostring]|join(".")'

# Produce a period-delimited tree of all keys returned for pods, etc
kubectl get pods -o json | jq -c 'path(..)|[.[]|tostring]|join(".")'

Adapted from the official Kubernetes Cheat Sheet


For this section, I thought it would be more useful if I pulled down a Docker image and ran some commands. Hopefully, that will help you understand the commands and outputs later. I have commented the commands as well.

The specific steps below cover:

  • reviewing the locally-stored images on the host
  • checking if any containers are running
  • starting a web app container and binding it to TCP 80
  • noting the SHA256 of the container
  • reading the access logs of the newly-running container
  • executing “ps -a” in the container from the host
  • opening a shell in the container and running commands from inside the container

>> docker images
# the images that are available to use
# recommend noting any image that is suspiciously different
REPOSITORY                       TAG       IMAGE ID       CREATED       SIZE
dfwforensics/docker101tutorial   latest    a7dc23f996e8   3 hours ago   27.9MB
docker101tutorial                latest    a7dc23f996e8   3 hours ago   27.9MB
alpine/git                       latest    a939554ad0d0   4 weeks ago   25.1MB

>> docker ps
# check for running containers. None are running on this system yet



>> docker run -dp 80:80 docker101tutorial
# this runs the image above in the background (-d) and publishes the container's port 80 on the host's TCP port 80 (-p).
# if the image does not exist locally, docker will download it.
# output is the full ID of the new container (not the sha256 of the image)


>> docker ps
# after the container is running you can query its summary information

CONTAINER ID   IMAGE               COMMAND                  CREATED          STATUS          PORTS                NAMES
cdb9128363a9   docker101tutorial   "/docker-entrypoint.…"   17 minutes ago   Up 17 minutes   0.0.0.0:80->80/tcp   <containername>

>> docker inspect <mycontainer>
# use "inspect" to read a running container's metadata.
# "inspect" also works on an image, which is useful for a suspicious image; the output below is from the image

        "Id": "sha256:a7dc23f996e88f5787e8dc2112348fb022bd0a25f91644c92bcd50816dd24b6a",
        "RepoTags": [
        "RepoDigests": [
        "Parent": "",
        "Comment": "buildkit.dockerfile.v0",
        "Created": "2021-03-25T17:15:37.0510086Z",
        "Container": "",
        "ContainerConfig": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": null,
            "Cmd": null,
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        "DockerVersion": "",
        "Author": "",
        "Config": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "80/tcp": {}
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
            "Cmd": [
                "daemon off;"
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": [
            "Labels": {
                "maintainer": "NGINX Docker Maintainers <[email protected]>"
            "StopSignal": "SIGQUIT"
        "Architecture": "amd64",
        "Os": "linux",
        "Size": 27941754,
        "VirtualSize": 27941754,
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/u95ewzojeausn0e21kdz4dtu9/diff:/var/lib/docker/overlay2/221ca0b88a17dfdfc3f71c8e552a1aefd9213399d38284dde5e2f38b91465f5b/diff:/var/lib/docker/overlay2/d32278f833945f7e0e0a3cf752f631f1594515bde5b4cd426a5e39b7cd1d477a/diff:/var/lib/docker/overlay2/d0c54430881927715c04507b8da79dff77bedec352418b02c30f4839833f7868/diff:/var/lib/docker/overlay2/6c2981beeab6ea01fb679afece6b904fdac5bb6e2e270044521bf9bab3dc3d1e/diff:/var/lib/docker/overlay2/2bc0eee3896be31ea7c4ddc755f8f3fe5a077fec1db928d77899f19ecabc7d09/diff:/var/lib/docker/overlay2/0a09fdf0301e002ee7849200c716ab3cd19ad379041cb6c7d39892aff643cf8e/diff",
                "MergedDir": "/var/lib/docker/overlay2/8j883dxj8a4hvd9lqnmj7ixg0/merged",
                "UpperDir": "/var/lib/docker/overlay2/8j883dxj8a4hvd9lqnmj7ixg0/diff",
                "WorkDir": "/var/lib/docker/overlay2/8j883dxj8a4hvd9lqnmj7ixg0/work"
            "Name": "overlay2"
        "RootFS": {
            "Type": "layers",
            "Layers": [
        "Metadata": {
            "LastTagTime": "2021-03-25T17:29:34.297585Z"

>> docker logs cdb9128363a9 #note the container ID
# pulling this (web app) container's logs after browsing the site a bit

- - [25/Mar/2021:20:20:59 +0000] "GET / HTTP/1.1" 200 8713 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
- - [25/Mar/2021:20:20:59 +0000] "GET /assets/stylesheets/application.adb8469c.css HTTP/1.1" 200 76332 "http://localhost/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
- - [25/Mar/2021:20:20:59 +0000] "GET /assets/stylesheets/application-palette.a8b3c06d.css HTTP/1.1" 200 38773 "http://localhost/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
- - [25/Mar/2021:20:20:59 +0000] "GET /assets/fonts/material-icons.css HTTP/1.1" 200 873 "http://localhost/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
- - [25/Mar/2021:20:20:59 +0000] "GET /assets/javascripts/modernizr.86422ebf.js HTTP/1.1" 200 7296 "http://localhost/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
...
- - [25/Mar/2021:20:21:04 +0000] "GET /tutorial/persisting-our-data/items-added.png HTTP/1.1" 200 63754 "http://localhost/tutorial/persisting-our-data/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
- - [25/Mar/2021:20:21:04 +0000] "GET /tutorial/persisting-our-data/dashboard-open-cli-ubuntu.png HTTP/1.1" 200 170038 "http://localhost/tutorial/persisting-our-data/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
- - [25/Mar/2021:20:21:05 +0000] "GET /tutorial/using-bind-mounts/ HTTP/1.1" 200 18697 "http://localhost/tutorial/persisting-our-data/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"
- - [25/Mar/2021:20:21:05 +0000] "GET /tutorial/using-bind-mounts/updated-add-button.png HTTP/1.1" 200 21838 "http://localhost/tutorial/using-bind-mounts/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36" "-"

 >> docker exec cdb9128363a9 ps -a
# running a command in the container

    1 root      0:00 nginx: master process nginx -g daemon off;
   33 nginx     0:00 nginx: worker process
   34 nginx     0:00 nginx: worker process
   35 nginx     0:00 nginx: worker process
   47 nginx     0:00 nginx: worker process
   48 nginx     0:00 nginx: worker process
   55 root      0:00 ps -a

>> docker ps
# starting a remote shell of the container - first, get the container info

CONTAINER ID   IMAGE               COMMAND                  CREATED          STATUS          PORTS                NAMES
cdb9128363a9   docker101tutorial   "/docker-entrypoint.…"   30 minutes ago   Up 30 minutes   0.0.0.0:80->80/tcp   <containername>

>> docker exec -it <containername> /bin/sh
# starting a remote shell of the container - second, start the shell

/ #


/ # ps -a
# You are now in the container's shell

    1 root      0:00 nginx: master process nginx -g daemon off;
   33 nginx     0:00 nginx: worker process
   34 nginx     0:00 nginx: worker process
   35 nginx     0:00 nginx: worker process
   47 nginx     0:00 nginx: worker process
   48 nginx     0:00 nginx: worker process
   61 root      0:00 /bin/sh
   68 root      0:00 /bin/sh
   75 root      0:00 ps -a

/ # pwd


/ # whoami


/home # more /etc/passwd


/home # more /etc/shadow

Docker Command Help Menu

The full command list

>> docker

Usage:  docker [OPTIONS] COMMAND

A self-sufficient runtime for containers

Options:
      --config string      Location of client config files (default
  -c, --context string     Name of the context to use to connect to the
                           daemon (overrides DOCKER_HOST env var and
                           default context set with "docker context use")
  -D, --debug              Enable debug mode
  -H, --host list          Daemon socket(s) to connect to
  -l, --log-level string   Set the logging level
                           (default "info")
      --tls                Use TLS; implied by --tlsverify
      --tlscacert string   Trust certs signed only by this CA (default
      --tlscert string     Path to TLS certificate file (default
      --tlskey string      Path to TLS key file (default
      --tlsverify          Use TLS and verify the remote
  -v, --version            Print version information and quit

Management Commands:
  app*        Docker App (Docker Inc., v0.9.1-beta3)
  builder     Manage builds
  buildx*     Build with BuildKit (Docker Inc., v0.5.1-docker)
  config      Manage Docker configs
  container   Manage containers
  context     Manage contexts
  image       Manage images
  manifest    Manage Docker image manifests and manifest lists
  network     Manage networks
  node        Manage Swarm nodes
  plugin      Manage plugins
  scan*       Docker Scan (Docker Inc., v0.5.0)
  secret      Manage Docker secrets
  service     Manage services
  stack       Manage Docker stacks
  swarm       Manage Swarm
  system      Manage Docker
  trust       Manage trust on Docker images
  volume      Manage volumes

Commands:
  attach      Attach local standard input, output, and error streams to a running container
  build       Build an image from a Dockerfile
  commit      Create a new image from a container's changes
  cp          Copy files/folders between a container and the local filesystem
  create      Create a new container
  diff        Inspect changes to files or directories on a container's filesystem
  events      Get real time events from the server
  exec        Run a command in a running container
  export      Export a container's filesystem as a tar archive
  history     Show the history of an image
  images      List images
  import      Import the contents from a tarball to create a filesystem image
  info        Display system-wide information
  inspect     Return low-level information on Docker objects
  kill        Kill one or more running containers
  load        Load an image from a tar archive or STDIN
  login       Log in to a Docker registry
  logout      Log out from a Docker registry
  logs        Fetch the logs of a container
  pause       Pause all processes within one or more containers
  port        List port mappings or a specific mapping for the container
  ps          List containers
  pull        Pull an image or a repository from a registry
  push        Push an image or a repository to a registry
  rename      Rename a container
  restart     Restart one or more containers
  rm          Remove one or more containers
  rmi         Remove one or more images
  run         Run a command in a new container
  save        Save one or more images to a tar archive (streamed to STDOUT by default)
  search      Search the Docker Hub for images
  start       Start one or more stopped containers
  stats       Display a live stream of container(s) resource usage statistics
  stop        Stop one or more running containers
  tag         Create a tag TARGET_IMAGE that refers to SOURCE_IMAGE
  top         Display the running processes of a container
  unpause     Unpause all processes within one or more containers
  update      Update configuration of one or more containers
  version     Show the Docker version information
  wait        Block until one or more containers stop, then print their exit codes

Run 'docker COMMAND --help' for more information on a command.

AWS / Azure / GCP

I recommend either looking at the examples in previous sections or simply performing a web search for “<cloudProviderName> AND <commandYouWant>”. The online documentation is very good for all of these cloud providers. Additionally, once you understand the AWS/AZ/GCP command syntax, you will find the commands are closely aligned with the standard Docker/K8S syntax.

SIFT Docker Image

Update 1 Apr 21: I will leave the previously posted information below, but as expected, Digital Sleuth made a better Docker image with greatly expanded capabilities compared to mine. I highly recommend you use that one. Simply clone that GitHub repository, navigate to the local folder that contains it, and launch “docker compose up -d”. That will give you a fully SIFT’d Docker container, along with a SIFT subnet, GUI, SSH, and mounting capabilities. So…pretty much exactly like my Docker image but cleaner. And smaller. And…better.
If you do it right, the in-progress install will look like this:

>> cd .\sift-docker\
>> ls

    Directory: ..\sift-docker

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d-----          4/1/2021   9:31 AM                .git
-a----          4/1/2021   9:31 AM            485 docker-compose.yaml
-a----          4/1/2021   9:31 AM           2335 Dockerfile.bionic
-a----          4/1/2021   9:31 AM           2334 Dockerfile.focal
-a----          4/1/2021   9:31 AM          35149 LICENSE
-a----          4/1/2021   9:31 AM            945

>> docker compose up -d
[+] Building 628.0s (3/4)
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 73B                                                                                                                                        0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => [internal] load metadata for                                                                                                 3.5s
 => [1/1] FROM                                         624.5s
 => => resolve                                           0.0s
 => => sha256:b2055b333b029de2d76a27fbf5ee74e5ce5479467c4b7790b2d1d832a9c9515c 583.01MB / 1.50GB                                                                         624.5s
 => => sha256:728042ecb84323afa057d4d19a3c2ab6d9015829a15ecf49d458c4dbd542ffa4 111B / 111B                                                                                 0.4s
 => => sha256:6c57dfb413adca4e10aa47ee56248e7c131905511d1f46275b3b13656ad829cf 1.36kB / 1.36kB                                                                             0.0s
 => => sha256:e1b967990adc6e002937671ad01ffba85e51c71bcac3dd9fa3aae0e181790658 7.55kB / 7.55kB                                                                             0.0s

My SIFT docker image

Below is a link to a SIFT Docker image that you can download and play with. Please feel free to try it out. Just load Docker on your Linux workstation with a simple apt install, or if you have WSL 2 enabled, install Docker for Windows. Be advised, this SIFT image is not production-tested or SANS-authorized. Also, DigitalSleuth is working on a SIFT image and it will almost certainly be better than mine, so if that is available, use that one.

To start, just Google “docker pull” and follow the quick instructions. Then, “docker run <the other stuff you Googled>” and you will be up and running. At that point “docker exec -it <container> /bin/bash” will give you an interactive shell in the container and you can work with your data just like you normally would in SIFT. Keep in mind, the value here is scalability, small size, compartmentalization, and a clean instance for every case. The downside is that a container’s storage does not persist by default, so you should map a storage location (a volume or bind mount) if that is something you need. Like every other DFIR workflow task, it is only hard the first time you do it.

Once you have the container running, just copy some files to it and you are good to go. Copying files is easy and there are five different Google-able ways to do that. If you are using Docker for Windows, the easiest way is here. You can also access the container directly on your host. If you are running Docker under WSL, just look for the hidden share “\\wsl$” in Windows Explorer.



Riadi, Imam & Umar, Rusydi & Sugandi, Andi. (2020). Web Forensic on Kubernetes Cluster Services Using Grr Rapid Response Framework. International Journal of Scientific & Technology Research. 9. 3484-3488.

Grieg, Jonathan (2021). Container Forensics with Docker Explorer. OSDFIR.

StackRox. (2017). Forensics in the age of containers. Accessed from:

Ashnik. Container Forensics. Accessed from:

Microsoft. (2019). Retrieve container logs and events in Azure Container Instances. Accessed from:

Unknown author. (2020). Kubernetes Documentation. Accessed from:

Burns, Brendan. (2015). A Technical Overview of Kubernetes – CoreOS Fest 2015. Accessed from:

Incident Response in the Cloud – Part 1


I recently decided I wanted to give back to the community. I was going to start a band, but I cannot sing and nobody likes the clarinet so I decided to write things about Digital Forensics and Incident Response (DFIR).

Well, months later, after that “give-back” decision, and with very little completed, I now have an even deeper respect for anyone in the DFIR community who has dedicated time to helping all of us. I am amazed they can hold a job, have a personal life, publish articles, and make tools at the same time. No need to name them; we all know who you are and are profoundly grateful….as you can tell by the one million views, two “likes”, and one comment on the things you post.

Since I cannot find the time to write up the article in full, and because there is a general lack of information on the subject of Cloud Forensics and Incident Response (IR), I am going to start publishing it in pieces. I admit, the layout below is a departure from the standard industry definition of the Incident Response Process. I just found it easier to organize this particular article this way.

We will start with the general location and description of Cloud IR data sources, then later follow it with articles on practical application in Amazon AWS and Microsoft Azure.

Please note this content will be supplemented and modified over time.

Also, to save time I will shamelessly borrow and cite from the leaders in our field. I am not compensated by anyone or anything referenced in this article; everything is there because it is my opinion and I would encourage you to add your thoughts, or even better…disagree and post your opinion in the comments. I firmly believe that we need more debate in this field.


The good news…even if you are an absolute cloud-novice you can be successful on your very first cloud incident response. You will find that a simple Web search for any of the terms below will immediately hit on Amazon and Microsoft’s easy-to-read documentation, which is substantially better than the indecipherable hieroglyphs that software developers have produced for decades.

Incident Response Plan

Create an Incident Response (IR) Plan. IR Plans are not in-scope for this article but it is worth mentioning that you will likely fail if you have no plan at all. It is further recommended you have a specific section within your IR playbook covering cloud response, as it is not the same as on-prem incident response.

Environmental Discovery

Data Flow Diagrams

If possible, try to catch the unicorn… architectural diagrams. Just be aware that if the organization does not enforce the creation of diagrams then it is unlikely they will exist. In terms of cloud architecture, a company’s network design is like a snowflake; each is often a unique implementation, so it is a little more difficult to make educated guesses than it would be for an on-premises environment. Luckily, AWS and Azure have automated discovery that may help you home in on the pieces you need and not look like a cat chasing a laser pointer. Lastly, before you assume a data source is not available in your SIEM, do a quick (and very general) keyword search. Often the data source has already been added, or it was added but incorrectly referenced or indexed.

Amazon (AWS)

AWS Network Manager – to view the network topology
AWS Config – monitors changes to the account’s system configuration

Azure (AZ)

Network Watcher – to view the network topology


Network Forensics

There are a number of great DFIR training sources. However, it is my opinion that SANS Institute is the gold standard. So for network forensics I prefer to follow the general workflow from Phil Hagen’s outstanding FOR572 course, in which he focuses on the primacy of the following network data sources:
1. Logs
2. NetFlow
3. Full Packet Capture (FPC)

I would like to add one more source…
4. Vendor Tools – we will cover EDR tools later, but this is more of a focus on vendor network tools that provide data under one of the three aforementioned data sources.


NOTE: If cloud monitoring tools are enabled and configured (e.g. AWS GuardDuty, MS Sentinel, Azure Identity Protection, etc.), I recommend you spend your time gaining access to those tools rather than trying to pull the raw log sources listed below. Certainly try to aggregate your log visibility later, but the middle of an IR is not the time to try to add new sources into your SIEM.


Thankfully Amazon likes to standardize the naming structure due to the overwhelming number of AWS products and offerings.

CloudTrail – operations performed TO a cloud resource. These are the API logs and nearly every AWS resource interaction is an API call. So if you only have time to focus on one data source this is a good one.

CloudFront – this is the AWS method to push your data to the closest edge of your clients (i.e. a CDN service). But for DFIR, CloudFront is where you likely will find access logging (also look at the Elastic Load Balancers, which may be capturing access logs as well as producing flow logs).

CloudWatch – health and status, alarms/alerts. Your enterprise may be aggregating other log sources (like CloudTrail) here as well. Additionally, this is also a good metrics data source for suspected coinmining.

GuardDuty – cloud-specific automated threat detection and alerting. Really well done product if it is configured.

Detective – Automated analysis and consoles. Sort of a network-meets-endpoint EDR tool.


Microsoft calls the first three “platform logs”

Azure AD logs – User-based logs. Azure AD Sign-ins, changes, and flagged activity

Activity log – Resource-based logs. Write operations (e.g. PUT, POST, DELETE) performed TO an Azure resource (e.g. start/stop a VM, create a webapp, delete an AZ storage account, etc.).

Resource log – Also resource-based logs, but these are operations performed IN or BY a cloud resource. This is often host-based logging that is shipped to the AZ control plane for easier viewing in the Azure admin consoles.

AZ Information Protection – M365 DLP solution

AZ Monitor – Automated analysis and console

MS Defender for Cloud – (formerly AZ Security Center) – A CSPM/CWPP solution with vulnerability assessment, remediation, and alerting capability. There is a lot more DFIR value in the MS Defender line of products and I plan on covering them in another article.

MS Sentinel – There is a lot of overlap with MS Azure security tools but essentially, this is the SIEM/SOAR.

NetFlow (NF)


VPC Flow Logs – NetFlow between the network interfaces.

Ask the admin if these logs are enabled and if they are going to 1) CloudWatch or 2) an S3 bucket (as .gz files).

The default AWS log format is:

${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status}
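
Because a default-format record is just 14 space-separated fields in the order shown above, standard text tools are enough for a first pass. The following is a minimal sketch, not a definitive parser: the helper name flow_hits, the ./flowlogs path, and the sample records are assumptions made up for illustration (the IPs come from documentation ranges), and it only applies to the default format, not a custom one.

```shell
#!/bin/sh
# Sketch: filter default-format VPC Flow Log records for a suspect IP.
# Field positions follow the default format shown above:
#   4=srcaddr 5=dstaddr 6=srcport 7=dstport 8=protocol 10=bytes 13=action
# flow_hits is a hypothetical helper name; it reads records on stdin.
flow_hits() {
    awk -v ip="$1" '
        $4 == ip || $5 == ip {
            printf "%s:%s -> %s:%s proto=%s bytes=%s %s\n", $4, $6, $5, $7, $8, $10, $13
        }'
}

# Usage against downloaded .gz exports (the path is an assumption):
#   gzip -dc ./flowlogs/*.log.gz | flow_hits 203.0.113.50

# Self-contained demo with made-up records.
printf '%s\n' \
    '2 123456789012 eni-0abc 203.0.113.50 10.0.1.5 44321 22 6 20 4249 1418530010 1418530070 ACCEPT OK' \
    '2 123456789012 eni-0abc 10.0.1.5 198.51.100.7 443 51515 6 10 840 1418530010 1418530070 REJECT OK' |
    flow_hits 203.0.113.50
```

Matching on exact field values rather than grepping the whole line avoids false hits where the IP string happens to appear inside a longer address.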


Network Watcher – NetFlow between Network Security Groups

Full Packet Capture (FPC)


Traffic Mirroring – Can mirror packets (VXLAN format) to another VPC NIC where you have a listener set up, such as the Cloud IR VPC that you have already proactively set up with a Zeek (formerly Bro), Wireshark, or tcpdump instance (hint…hint).

If Traffic Mirroring is not turned on and the infection is ongoing, have your cloud admin go to the AWS console and follow these instructions


Network Watcher – FPC between Network Security Groups (NSG)

Vendor Tools

Most mid-to-large size corporate environments have at least one EDR tool in their baselines that can provide useful IR data. The cloud offers a significant assortment of 3rd party solutions that may already be deployed in your enterprise, like Cisco StealthWatch Cloud or SolarWinds for netflow, or Niksun NetDetector for FPC. This is where your enterprise discovery efforts pay off.

Host Forensics

I am a fan of FOR500 and FOR508. Rob Lee and his fellow instructors have forgotten more about host forensics than I will ever know. Rob even invented the term “DFIR”. I have not invented anything. I thought I invented the frisbee when I was six years old. But my dad said I did not…the bellwether to a long life of disappointment.

Triage Data Collectors
Endpoint Detection and Response (EDR) tools

Do you have an endpoint security tool in your environment (e.g. Cortex XDR, Tanium, FireEye HX, SentinelOne, CrowdStrike Falcon, Cyber Triage, etc.)? Most of the time, actionable intelligence can be obtained with a triage data acquisition. EDR tools allow you to focus on rapid acquisition of only the data you need. If you do not have an EDR tool on your endpoints, I recommend starting with Kroll’s free triage tool, KAPE, created by DFIR Jedi, Eric Zimmerman. Specifically, I recommend picking a use case from the SANS FOR500 and FOR508 posters (e.g. Evidence of Execution, File Download, etc.) and acquiring those artifacts first.

Incidentally useful tools in your baseline

Do not forget, incidental and “smoking gun” data points are often found in one of the non-DFIR enterprise tools (e.g. security compliance tools like Qualys, or antivirus tools like Symantec or McAfee)

Host Logs


Not in-scope for this article, but the main path for modern Windows computers is %WINDIR%\system32\winevt\Logs\* . Keep in mind there are many other log sources on a Windows machine beyond the event logs. It would be worthwhile to ask the server admins if there is any log forwarding, or if the logs are in a non-standard location. For example, a commonly overlooked issue when acquiring server logs is that architects often put the primary application’s log (and sometimes the swapfile) on a separate partition. So if you are grabbing logs from the C volume, you may not notice that the E volume contains GBs of application logs.


The main location is /var/log. I had a list of commands I liked, based on previous experience. Then I found this outstanding document written by Sandfly Security, which I recommend you provide to the cloud sys admin. Have them push the output of each command to an individual text file, named after the command they ran. Then, add those text files to the specific hostname folder under the main greppable folder structure you created for the incident. For example:
Folder:IncidentXYZ –> subfolder:Hostname1 –> filename:<PsAuxf>.txt

For more information about the grep folder, see “Lessons Learned”
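
The one-file-per-command convention above could be sketched as a small wrapper the admin runs on each host. This is a minimal sketch under stated assumptions, not a vetted collection script: the helper name collect, the IncidentXYZ root under the temp directory, and the two example commands are all made up for illustration. Substitute the commands from the document you hand the admin.

```shell
#!/bin/sh
# Sketch: run a command and save its output to a text file named after
# the command, under <incident>/<hostname>/ (hypothetical layout).
collect() {
    outdir=$1
    shift
    # Build a filesystem-safe file name from the command line,
    # e.g. "ps auxf" becomes "ps_auxf.txt".
    name=$(printf '%s' "$*" | tr -c 'A-Za-z0-9' '_')
    mkdir -p "$outdir" || return 1
    # Capture stdout and stderr so failed commands are documented too.
    "$@" > "$outdir/$name.txt" 2>&1
}

# Example run; IncidentXYZ is a placeholder incident folder.
dest="${TMPDIR:-/tmp}/IncidentXYZ/$(uname -n)"
collect "$dest" uname -a
collect "$dest" id
ls "$dest"
```

Naming each file after the command keeps the provenance of every line obvious once the folder is copied into the incident's grep folder.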

SANS also has a wealth of knowledge in this reference document

Containment, Eradication, and Re-Discovery

Not in-scope yet. Might add some things here in a future article.

Post-Incident Activity

Also not in-scope yet. Stop being greedy

My Lessons Learned

Here are a few recommendations taken from my own failures:

  • Enlist the sys admins. The most time-consuming piece of IR is the data collection. Time is working against you so try not to single-handedly acquire all the data sources; they exist in too many places for one person to handle. Draft the local administrators into the incident response (through their manager’s directive). The admins are probably better at querying/exporting their own logs than you are. Often, you can even request they provide the logs in a specific format and decimated to only what is relevant to you (but still with the widest net possible); most admins are pretty capable with command line tools. Now you can work with 1 GB of logs/netflow/pcap, rather than the 100GB they would send you.
  • Use the telephone. While we are on the topic of slow response, become an innovator in your organization and call people on the telephone thing. You can waste hours, and often days, waiting for email replies from telephobic people, before you even start querying logs.
  • Make a grep folder. If the IR is big enough, you are going to get a flood of data sources over hours or days. Consider making a central folder of all text-based logs and dumping each log into a subfolder by the source name/hostname (e.g. <Incident #>\Host-Based Logs\<hostname 1>). Later, you can grep it all at once or use a SIEM to search it, like SOF-ELK or a local Splunk instance. The main value I have found with this method is that you can rapidly grep the data and every hit has its source automatically listed in the path next to it, which matters when all the data you get is labeled “access.log1, access.log1[1]…..”.
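
The grep-folder idea can be sketched as follows. This is only an illustration under assumptions: the folder names, host names, log lines, and the grep_incident helper are made up, and the IP addresses come from documentation ranges.

```shell
#!/bin/sh
# Sketch: one subfolder per source, then a single recursive grep.
# -r recurse, -H always print the file path, -n print the line number,
# -i ignore case, -a treat binary-looking files as text, -E extended regex.
grep_incident() {
    # $1 = root of the grep folder, $2 = extended regex of indicators
    grep -rHniaE -- "$2" "$1"
}

# Demo with fabricated data in a scratch directory.
root="$(mktemp -d)/Incident-0001"
mkdir -p "$root/Host-Based Logs/web01" "$root/Host-Based Logs/web02"
printf '%s\n' '203.0.113.50 - - "GET /login HTTP/1.1" 200' > "$root/Host-Based Logs/web01/access.log"
printf '%s\n' '198.51.100.7 - - "GET / HTTP/1.1" 200' > "$root/Host-Based Logs/web02/access.log"

# Both files are named access.log, yet each hit names its host via the path.
grep_incident "$root" '203\.0\.113\.50|evil\.example'
```

Because the hostname lives in the path, identically named files from different hosts never collide, and the source of each hit comes along for free.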

Folder Structure for DFIR artifacts

A final lesson learned to finish this article. You will likely get a lot of files in various forms but the main one will be gzips. Simply using something like 7-zip to “unzip all” will almost certainly result in overwriting a lot of your files (e.g. multiple hosts with “access.log” or “bash.history”, etc.). And although normalizing all your files to a common format should be a step you take, in reality it is often too time-consuming to tailor a script for every incident response. So here is a quick command line to grep the gzips without unzipping them to disk.

find . -name "*.gz" -type f -exec sh -c 'zcat "$1" | grep -iaE "<malicious IP>|<URL>"' sh {} \;

^^ edited this to allow for piped exec'ing and grepping of each file rather than across the whole output.