
- Detecting OpenClaw/Clawbot with SentinelOne: The Challenge of Blocking
A huge thank you to my dearest friend Jeremy Jethro, who created this comprehensive script and the detection rule in SentinelOne.

Hi everyone, if you've been following the cybersecurity landscape lately, you've probably heard whispers about OpenClaw (also known as Clawbot or Moltbot). And if you're in IT security, you're likely dealing with requests to detect and block it right now.

---------------------------------------------------------------------------------------------------------

What is OpenClaw/Clawbot?

OpenClaw is an AI-powered autonomous agent that runs on employees' machines. Think of it as an AI assistant that can interact with your computer, execute commands, access files, and perform actions on behalf of users. While it might sound useful in theory, it's become a significant security concern for organizations worldwide. The agent runs as a persistent background process, often integrating with various services and APIs, and can authenticate with external platforms like Google, Slack, and Discord.

From a security perspective, this creates multiple risk vectors:

Unauthorized data access - the agent can potentially access sensitive files and communications
Shadow IT concerns - users installing it without IT approval
Compliance violations - automated actions that bypass security controls
Data exfiltration risks - the agent's ability to send data to external services

Organizations are particularly concerned because OpenClaw operates with broad permissions and can persist on systems even after users think they've removed it.

---------------------------------------------------------------------------------------------------------

My SentinelOne Detection Journey

As you all know, I'm a huge SentinelOne fan.
I've created a complete article series on leveraging SentinelOne for advanced threat detection - if you want to check out that series:

https://www.cyberengage.org/courses-1/mastering-sentinelone%3A-a-comprehensive-guide-to-deep-visibility%2C-threat-hunting%2C-and-advanced-querying%22

Given my experience with SentinelOne, I naturally started working on custom detection rules for OpenClaw. And let me tell you, this one has been challenging.

The Detection Challenge: It's Not That Simple

Here's where things get complicated. We're facing some serious challenges with blocking OpenClaw for several clients. The core issue is that OpenClaw runs as a node process, and this is where the limitations kick in. If we issue a quarantine command in SentinelOne, it will remove node - which could break other legitimate applications that depend on it. This isn't like blocking a standalone malicious executable with SentinelOne; it's a dependency issue that could have widespread impact on production systems.

The Persistence Problem

Here's the really frustrating part: even after users uninstall OpenClaw, the claw process remains in startup and keeps attempting to authenticate and run via script. I've confirmed this across multiple endpoints. Users go through the uninstall process, think they're done, and the process just keeps running in the background, trying to authenticate and execute.
The script is typically located at:

/opt/homebrew/bin/node /opt/homebrew/lib/node_modules/clawbot/dist/entry.js

---------------------------------------------------------------------------------------------------------

STAR custom rule you can use for detection in SentinelOne:

(event.type = 'Process Creation' and
  (
    (src.process.cmdline contains "clawd" || tgt.process.cmdline contains "clawd" || osSrc.process.cmdline contains "clawd")
    OR (src.process.cmdline contains "openclaw" || tgt.process.cmdline contains "openclaw" || osSrc.process.cmdline contains "openclaw")
    OR (src.process.cmdline contains "moltbot" || tgt.process.cmdline contains "moltbot" || osSrc.process.cmdline contains "moltbot")
  )
OR
  (src.process.image.path contains ".clawdbot/" || src.process.parent.image.path contains ".clawdbot/" || task.path contains ".clawdbot/" || tgt.file.path contains ".clawdbot/" || tgt.file.oldPath contains ".clawdbot/" || tgt.process.image.path contains ".clawdbot/" || module.path contains ".clawdbot/" || osSrc.process.activeContent.path contains ".clawdbot/" || osSrc.process.image.path contains ".clawdbot/" || osSrc.process.parent.image.path contains ".clawdbot/" || src.process.activeContent.path contains ".clawdbot/" || tgt.process.activeContent.path contains ".clawdbot/")
OR
  (src.process.parent.publisher = "" or osSrc.process.parent.publisher = ""))

---------------------------------------------------------------------------------------------------------

Current Detection and Remediation Approach

What We Can Do in SentinelOne

I've created a custom rule to detect OpenClaw installations and processes. The good news: we can detect it. The bad news: automated quarantine is risky. For alerts like OpenClaw.dmg, we can issue a quarantine command to remove the installer.
However, for active installations where OpenClaw is already running as part of the node ecosystem, the quarantine action will:

Disrupt the running process
NOT fully remove it
Potentially break other node-dependent applications

---------------------------------------------------------------------------------------------------------

The Manual Removal Path

Because of these limitations, we use a script to manually remove OpenClaw from endpoints via MDM (Mobile Device Management).

Important finding: even if users have removed Clawdbot manually, you must ensure they:

Check the launchd process - the agent registers itself as a launch daemon
Remove the plist file from launchd - this is what makes it persistent across reboots
Remove the entry/script from Homebrew - otherwise it will remain installed

The plist files are typically found at locations like:

~/Library/LaunchAgents/bot.molt.gateway.plist
~/Library/LaunchAgents/com.openclaw.gateway.plist
~/Library/LaunchAgents/com.clawdbot.gateway.plist
~/Library/LaunchAgents/com.moltbot.gateway.plist

---------------------------------------------------------------------------------------------------------

The Client Landscape

As an MSSP, every client wants to know about OpenClaw, and a significant percentage want to block it immediately. The requests are coming in fast, and the pressure is on. It's possible to block it, but as I mentioned, the way the quarantine action happens means it will disrupt the running process but not remove it cleanly.

---------------------------------------------------------------------------------------------------------

The Removal Script

I'm sharing the removal script developed for OpenClaw remediation. This script handles everything - killing processes, removing applications, cleaning up LaunchAgents, removing user data, and removing CLI binaries and Homebrew installations.
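To make the manual steps concrete, here is a minimal sketch of the launchd/Homebrew cleanup portion. The plist paths and the node_modules location are the ones quoted in this article; everything else (function names, the kill pattern) is my own assumption - test on a lab endpoint before deploying anything like this via MDM:

```shell
# Minimal sketch of the manual OpenClaw cleanup steps described above.
# Plist paths and the Homebrew module path come from the article;
# function names and the kill pattern are assumptions.

# Step 1 (run on a real endpoint): stop the agent processes.
kill_openclaw_processes() {
    pkill -f '/node_modules/clawbot/' 2>/dev/null || true
}

# Steps 2-3: unload the launchd persistence and remove the Homebrew module.
cleanup_openclaw() {
    for plist in \
        "$HOME/Library/LaunchAgents/bot.molt.gateway.plist" \
        "$HOME/Library/LaunchAgents/com.openclaw.gateway.plist" \
        "$HOME/Library/LaunchAgents/com.clawdbot.gateway.plist" \
        "$HOME/Library/LaunchAgents/com.moltbot.gateway.plist"
    do
        if [ -f "$plist" ]; then
            launchctl unload "$plist" 2>/dev/null || true  # no-op off macOS
            rm -f "$plist"
        fi
    done
    # Remove the Homebrew-installed module (where dist/entry.js lives)
    rm -rf /opt/homebrew/lib/node_modules/clawbot
}

# Usage on an endpoint:
#   kill_openclaw_processes && cleanup_openclaw
```

Note that killing the process alone is not enough - as discussed above, the launchd plist will simply respawn it on the next boot, which is why the plist removal comes before anything else matters.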
Note: I'm still in the testing phase with some of the SentinelOne quarantine approaches, but this script has been working reliably for manual removal. Just save the shared .txt file as a script before running it.

---------------------------------------------------------------------------------------------------------

Security Reminder

If OpenClaw was connected to external services, users should manually revoke OAuth tokens at:

Google: https://myaccount.google.com/permissions
Slack: https://slack.com/apps/manage
Discord: User Settings > Authorized Apps

---------------------------------------------------------------------------------------------------------

What's Next?

I'm continuing to refine the SentinelOne detection rules and exploring safer quarantine approaches that won't impact legitimate node processes. If you're dealing with OpenClaw in your environment, I'd love to hear about your approach.

Stay secure

--------------------------------------------------Dean------------------------------------------------------
- Google Takeout: The Quiet Data Exit Nobody Talks About
Let’s talk about one of the most underestimated data exfil paths in Google Workspace. Not malware. Not OAuth abuse. Not a compromised token. Just… Google Takeout.

Most people think of Takeout as a harmless “download my data” feature. And to be fair, that was the original idea. But from a security and forensics perspective, Takeout is a built-in data export mechanism that works surprisingly well — maybe too well.

What Is Google Takeout (Really)?

Google Takeout, also called “Download Your Data”, allows a user to export all the data associated with their Google account into an archive. This includes:

Gmail
Google Drive
Calendar
Contacts
Sites
And many other Workspace services

Originally, Takeout existed to make Google feel more transparent and user-friendly: “Your data belongs to you — take it with you.” For example:

Moving from a free Gmail account to Google Workspace
Leaving an organization
Personal backups

All valid use cases. The problem? Takeout is enabled by default — even for new organizations, enterprise licenses, and security-conscious environments. You can disable it, but you have to do so explicitly.

-------------------------------------------------------------------------------------------------------------

Why Takeout Is a Risk in Enterprises

Here’s where the threat model changes. In Google Workspace:

Any user can export their own data
Group Owners can export entire group content, including email
Data can be exported outside Google’s ecosystem

That last point matters a lot. Because Takeout doesn’t just download data into Google Drive — it can push data directly to:

Dropbox
OneDrive
Box
Other third-party cloud storage providers

From an investigation standpoint, that’s terrifying.
Once data leaves Workspace and lands in a third-party cloud: You may have zero visibility You may have zero access You may not even know what was exported ------------------------------------------------------------------------------------------------------------- What a Takeout Export Looks Like for a User From the user’s perspective, the process is almost boringly simple. They go to: https://takeout.google.com/ From there: They select which services they want to export Choose how the export should be packaged (single archive or multiple ZIP files) Choose how the data should be delivered Most users stick with the default: Email notification with a download link But again — exporting to external storage is just a few clicks away. ------------------------------------------------------------------------------------------------------------- Timing Matters: Takeout Is Not Instant One thing that helps defenders (a little) is that Takeout isn’t immediate. Exports are processed in the background. The time depends on: How many services are selected How much data exists in each service Users can monitor progress in “Manage your exports” , where they’ll also see a history of previous exports. From an IR perspective, this delay gives you a narrow window: To detect To respond To disable access before completion But only if you’re looking. ------------------------------------------------------------------------------------------------------------- What Actually Gets Logged (And What Doesn’t) This is where things get subtle. Google Workspace has a dedicated Takeout Audit Log . That’s good news. 
The log records: Which user initiated a Takeout export When it started Which services were included The IP address used When the export finished packaging What it does not log: Whether the user downloaded the data Whether the data was accessed after packaging Whether data was successfully imported into a third-party cloud Once you see the “export completed” event, you should assume: The data is gone. Especially if the destination was external storage. ------------------------------------------------------------------------------------------------------------- Important Forensics Gotcha: No API Access Here’s a big one that catches teams off guard. Takeout Audit Logs are NOT available via the Google Workspace API. That means: If you rely only on API-based log collection If your SIEM pipeline pulls Workspace logs via API You will miss Takeout activity entirely . This is one of the few highly forensically relevant logs that requires: Manual Admin Console access Or native Workspace log review The IP address in this log becomes extremely valuable, because it’s often the only reliable pivot point to correlate: Login events OAuth activity Drive access Suspicious sessions ------------------------------------------------------------------------------------------------------------- Where the Data Goes After Packaging Once Takeout finishes building the archive, users can: Download it directly Access it via Google Drive Or let it be pushed to third-party storage If the archive lands in Google Drive: Access to the ZIP files is logged in Drive Audit Logs If it goes to external storage: Logging ends at “export completed” At that point, Workspace visibility stops. ------------------------------------------------------------------------------------------------------------- Customer Takeout: When Admins Export Everything Now let’s talk about the nuclear option . Google Workspace also supports Customer Takeout , which allows a Super Admin to export all data in the organization . 
This includes: User data Vault data Data under legal hold Data subject to retention rules This is powerful — and dangerous. https://support.google.com/a/answer/14339894?visit_id=01769771249424-8980970382637730314&rd=1 ------------------------------------------------------------------------------------------------------------- Restrictions (And Why They Exist) Google doesn’t let just anyone do this. To perform Customer Takeout: You must be a Super Admin MFA must be enabled Workspace must be older than 30 days Organization must have less than 1000 users These restrictions exist for good reason — but if a threat actor compromises an admin account that meets these conditions, Customer Takeout becomes a single-click mass exfiltration tool . ------------------------------------------------------------------------------------------------------------- The Big Picture: Why Takeout Matters in DFIR Takeout isn’t flashy. It doesn’t trigger AV alerts. It doesn’t bypass MFA. It doesn’t exploit anything. And that’s exactly why it works. From an attacker’s perspective: It’s legitimate It’s built-in It’s trusted It’s quiet From a defender’s perspective: Logging is limited API visibility is missing Exfil can be complete before alarms go off ------------------------------------------------------------------------------------------------------------- Final Thoughts If you’re defending or investigating Google Workspace environments, Takeout needs to be part of your mental threat model. Not because it’s malicious by design — but because it doesn’t need to be . All it requires is: Access Time And a user (or admin) clicking a few buttons ------------------------------------------------Dean-------------------------------------------------------
- Investigating Data Exposure in Google Drive
If you’ve worked in Google Workspace long enough, you already know this truth: Google Drive is where data leaks love to happen. Not always malicious. Sometimes it’s just:

“Oops, shared it publicly”
“Oops, shared it with the wrong domain”
“Oops, didn’t realize Anyone with the link means literally anyone”

So when data exposure happens, we usually care about two questions:

What happened to the file?
Can we still access or recover it?

That’s where Google Drive investigation tools come in.

-------------------------------------------------------------------------------------------------------------

Tool 1: Google Drive Log Events (Your Timeline, Not Your Files)

Think of the Drive log events as your CCTV footage, not the evidence locker.

What it’s good at:

Showing who did what
Showing when it happened
Showing permission changes
Near real-time visibility (usually within minutes)

What it’s not good at:

Accessing files
Showing file contents
Tracking anonymous viewers or downloads

Key Things to Know About Drive Log Events

Let’s break this down simply:

Keeps 6 months of history
Logs actions like file creation, sharing changes, permission updates, and deletions
Does NOT give you the file itself
CSV export is limited to 100,000 rows
Unauthenticated access is only logged for editing
Viewing or downloading by anonymous users? ❌ Not logged

So if a file was publicly shared and downloaded 1,000 times anonymously — the audit log will not tell you that. Painful, but important to know upfront.

Where Are Drive Log Events Now?

Earlier, Drive log events lived in their own section. That changed. Today, Drive log events live inside the investigation tool in the Google Admin Console. Inside the investigation tool, you can:

Filter events
Use AND / OR logic
Group by fields (user, document, event type)
Search using partial matches

One warning though: ⚠️ even though logs are generated quickly, some events can lag up to 12 hours before showing up.
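As a quick triage aid, an exported Drive log CSV can be filtered for sharing changes before you start clicking around the console. A minimal sketch — the column layout and values below are entirely hypothetical; map the field positions and visibility strings to the headers in your actual export:

```shell
# Demo input with a HYPOTHETICAL schema: actor,doc_title,prior_visibility,visibility
# (real Drive log exports have different headers -- adjust the column numbers)
cat > /tmp/drive_log_export.csv <<'EOF'
actor,doc_title,prior_visibility,visibility
alice@corp.com,Q3-financials,private,shared_externally
bob@corp.com,notes,private,private
EOF

# Flag rows where access widened: prior visibility "private", now external.
# $3 = prior_visibility, $4 = visibility in this demo layout.
awk -F',' 'NR > 1 && $3 == "private" && $4 == "shared_externally" {
    printf "EXPOSED: %s by %s\n", $2, $1
}' /tmp/drive_log_export.csv
```

This is exactly the "Prior Visibility vs Visibility" comparison described later in this article, just automated over the 100,000-row CSV exports instead of eyeballed in the UI.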
------------------------------------------------------------------------------------------------------------ Tool 2: Google Vault (This Is Where the Files Live) If Audit Logs are the timeline, Vault is the evidence room . Vault is what you use when: You actually need the document A file was deleted A user “accidentally” removed something important But Vault comes with conditions. What Vault Can Do Access files in user Drives Access deleted files Apply holds Enforce retention rules What Vault Cannot Do Give you an audit trail Tell you who did what and when It’s access, not visibility. Deletion Timelines (This Matters a LOT) Here’s the reality of deleted files: When a user deletes a file → it goes to Trash Trash keeps files for 30 days Once removed from Trash: Admins have 25 more days to recover (without Vault) With Vault, recovery can extend further Custom retention rules or holds = files stay longer If Vault is enabled, you can often recover files without restoring the user account . ------------------------------------------------------------------------------------------------------------ Alternate File Recovery Scenarios (The “Oh No” Cases) Case 1: Active User Deleted Files Trash keeps files for 30 days After Trash deletion: Admins have 25 days to restore Restore options: Original location Shared Drive No Vault license? After 25 days — game over. Case 2: Deleted User Account This one catches teams off guard. Deleted user accounts can be restored for 20 days Files can only be recovered if: The user account is restored first Or Vault is used Ownership transfer is another option: Files move to another user’s Drive Again — Vault makes life easier here. 
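The retention windows above can be turned into concrete deadlines during triage. A small sketch using GNU date — the 30- and 55-day offsets follow the Trash and admin-recovery windows described above, and do not account for Vault holds or custom retention rules:

```shell
# Compute recovery deadlines from the date a user deleted a file:
# 30 days in Trash, then 25 more days of admin-side recovery (55 total).
DELETED="2024-01-10"                                # example deletion date
trash_deadline=$(date -d "$DELETED +30 days" +%F)   # last day recoverable from Trash
admin_deadline=$(date -d "$DELETED +55 days" +%F)   # last day for admin restore
echo "Recoverable from Trash until: $trash_deadline"
echo "Admin-recoverable until:      $admin_deadline"
```

Handy during an IR call when someone asks "is it already game over?" — if today is past the second date and there's no Vault license, it is.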
------------------------------------------------------------------------------------------------------------

Exporting Drive Logs

------------------------------------------------------------------------------------------------------------

Important Fields You’ll Actually Use During an Investigation

Let’s translate the useful ones into investigator language:

Document ID

This is gold.

Unique across all Google Workspace tenants
Same ID you see in the document URL
Perfect for matching phishing URLs to actual Drive files

Owner vs Actor

Owner: who owns the file
Actor (User field): who performed the action

These are often not the same person.

Visibility & Prior Visibility

This tells the real story.

Prior Visibility → what access looked like before
Visibility → what access looks like now

This is how you catch:

Private → Public changes
Internal → External sharing
Domain-wide exposure

IP Address

Extremely useful for:

Geo anomalies
Impossible travel
Correlation with other Workspace logs

------------------------------------------------------------------------------------------------------------

Final Thoughts (The Big Picture)

When investigating Google Drive exposure, remember:

Drive log events tell you the story
Vault gives you the evidence
Timing is everything
Anonymous access visibility is limited
Exports are clunky but necessary

Drive investigations are rarely about one log or one tool — it’s about stitching the timeline from the log events together with the evidence from Vault.

---------------------------------------------Dean-----------------------------------------------------------
- Velociraptor Service Not Working? Use This Task Scheduler Method Instead
As you guys remember, I have created a complete series on Velociraptor. If you haven't checked it out yet, do check it out - link below.

https://www.cyberengage.org/courses-1/mastering-velociraptor%3A-a-comprehensive-guide-to-incident-response-and-digital-forensics

Now, why am I here again? Because I recently tried to install the latest version of Velociraptor on my laptop and ran into some issues. Well, not exactly "issues" - I'd say it's more like changes in how things work now.

------------------------------------------------------------------------------------------------------------

What Has Changed in the Latest Version?

Issue #1: Client Config No Longer Auto-Generated

Earlier, when you generated the server config file, the client config file used to get automatically generated too. That doesn't happen anymore! So now you need to manually generate the client config file using this command:

velociraptor-v0.75.1-windows-amd64.exe --config server.config.yaml config client > client.config.yaml

This is only for Windows. I'll let you know if something changes for Linux in future articles.

Issue #2: Windows Service Doesn't Work Properly

Now here's the bigger problem I faced. I'm not sure why, but on my laptop I could not run Velociraptor as a Windows service properly. What happened was:

I installed Velociraptor as a service ✅
The service showed "RUNNING" status ✅
But when I closed the terminal, the console stopped working ❌
Browser showed "Site not reachable" ❌

I checked a lot, tried to find solutions everywhere, but didn't find any proper fix. The built-in service install command just wasn't working the way it should. So I came up with another solution that works perfectly for both server and client!

The Solution: Task Scheduler + VBScript Method

Instead of fighting with Windows Services, we're going to use Task Scheduler with VBScript.
This method:

✅ Runs Velociraptor completely hidden (no terminal window)
✅ Starts automatically on boot/login
✅ Works reliably every single time
✅ Easy to set up and manage

Let me show you how to set up both server and client.

Part 1: Setting Up Velociraptor SERVER

Step 1: Generate Server Config (if you haven't already)

cd C:\Users\YourUsername\Downloads
velociraptor-v0.75.1-windows-amd64.exe config generate -i

If you want to see what comes next in this step, check out the article above. Follow the prompts to create your server.config.yaml file.

Step 2: Create VBScript to Run Server Hidden

Open Notepad and create a new file:

notepad start-velociraptor-server-hidden.vbs

Paste this code:

Set WshShell = CreateObject("WScript.Shell")
WshShell.Run "cmd /c cd C:\Users\YourUsername\Downloads && velociraptor-v0.75.1-windows-amd64.exe --config server.config.yaml frontend", 0, False

Important: Replace C:\Users\YourUsername\Downloads with your actual path! Save and close Notepad.

Step 3: Set Up Task Scheduler for Server

Now let's make it auto-start:

Press Windows Key + R, type taskschd.msc, and press Enter
Click "Create Basic Task" on the right side
Name: Velociraptor Server
Description: Runs Velociraptor server on login
Click Next
Trigger: Select "When I log on", click Next
Action: Select "Start a program", click Next
Program/script: Browse and select your VBS file: C:\Users\YourUsername\Downloads\start-velociraptor-server-hidden.vbs
Click Next
Check "Open the Properties dialog" and click Finish

In the Properties dialog:

Go to the "General" tab and check "Run with highest privileges"
Go to the "Settings" tab and uncheck "Stop the task if it runs longer than"
Click OK

Step 4: Test Your Server

You can manually start the task to test it:

schtasks /run /tn "Velociraptor Server"

Wait 10-15 seconds, then open your browser and go to https://<your-server>:<GUI port>. You should see the Velociraptor login page! No terminal window anywhere - it's running completely hidden in the background.
-----------------------------------------------------------------------------------------------------

Part 2: Setting Up Velociraptor CLIENT

Step 1: Generate Client Config

From your server machine, generate the client config:

cd C:\Users\YourUsername\Downloads
velociraptor-v0.75.1-windows-amd64.exe --config server.config.yaml config client > client.config.yaml

Copy this client.config.yaml file to the client machine (the laptop/computer you want to monitor).

Step 2: Create VBScript to Run Client Hidden

On the client machine, open Notepad:

notepad start-velociraptor-client-hidden.vbs

Paste this code:

Set WshShell = CreateObject("WScript.Shell")
WshShell.Run "cmd /c cd C:\Users\YourUsername\Downloads && velociraptor-v0.75.1-windows-amd64.exe --config client.config.yaml client", 0, False

Important: Replace the path with your actual path! Save and close.

Step 3: Set Up Task Scheduler for Client

This is slightly different from the server because we want the client to run even before anyone logs in:

Press Windows Key + R, type taskschd.msc, and press Enter
Click "Create Basic Task"
Name: Velociraptor Client
Description: Runs Velociraptor client on system startup
Click Next
Trigger: Select "When the computer starts" ⚠️ (Important!), click Next
Action: Select "Start a program", click Next
Program/script: Browse and select your VBS file: C:\Users\YourUsername\Downloads\start-velociraptor-client-hidden.vbs
Click Next
Check "Open the Properties dialog" and click Finish

In the Properties dialog:

Go to the "General" tab
Select "Run whether user is logged on or not" ⚠️ (Important!)
Check "Run with highest privileges"
Check "Hidden"
Go to the "Settings" tab and uncheck "Stop the task if it runs longer than"
Click OK
Enter your Windows password when prompted

Step 4: Test Your Client

Manually start the client task:

schtasks /run /tn "Velociraptor Client"

Wait about 30 seconds, then check your Velociraptor server web interface. You should see the new client appear in your client list!
-----------------------------------------------------------------------------------------------------

Quick Command Line Method (For Advanced Users)

If you prefer using the command line instead of the GUI, here are the commands:

For Server:

schtasks /create /tn "Velociraptor Server" /tr "C:\Path\To\start-velociraptor-server-hidden.vbs" /sc onlogon /rl highest

For Client:

schtasks /create /tn "Velociraptor Client" /tr "C:\Path\To\start-velociraptor-client-hidden.vbs" /sc onstart /ru SYSTEM /rl highest

-----------------------------------------------------------------------------------------------------

Why This Method is Better

Let me break down why I prefer this method over the built-in Windows Service:

It actually works! - no more "site not reachable" issues
Completely hidden - no annoying terminal windows
Auto-starts reliably - works every time after reboot
Easy to manage - use the Task Scheduler GUI to start/stop/disable
Works for both server and client - one consistent method

Server vs Client Differences

Feature    Server                     Client
Trigger    When I log on              When the computer starts
Run as     Current user               SYSTEM account
Purpose    Runs while you're working  Runs always, even offline

The client setup ensures it:

✅ Runs even before anyone logs in
✅ Keeps running if the user logs out
✅ Survives reboots automatically
✅ Keeps trying to reconnect even when offline

-----------------------------------------------------------------------------------------------------

Want to stop Velociraptor?

Open Task Scheduler
Find the task (Velociraptor Server or Client)
Right-click → Disable

Want to completely remove it?
# Stop and delete the scheduled tasks schtasks /end /tn "Velociraptor Server" schtasks /delete /tn "Velociraptor Server" /f schtasks /end /tn "Velociraptor Client" schtasks /delete /tn "Velociraptor Client" /f # Kill any running processes taskkill /F /IM velociraptor-v0.75.1-windows-amd64.exe ----------------------------------------------------------------------------------------------------- Final Thoughts Look, I know the official documentation says to use service install, but in my experience on Windows, it just doesn't work reliably. The Task Scheduler method might seem like a workaround, but honestly, it's more reliable and easier to troubleshoot. I've been running Velociraptor this way for a while now, and it's been rock solid. No issues, no headaches, just works! If you guys have any questions or run into any issues, drop a comment below. I'm always happy to help! And if this helped you, don't forget to check out my complete Velociraptor series for more tips and tricks! https://www.cyberengage.org/courses-1/mastering-velociraptor%3A-a-comprehensive-guide-to-incident-response-and-digital-forensics Happy hunting! 🦖 ----------------------------------------------Dean---------------------------------------------------
- Setting Up Velociraptor for Forensic Analysis in a Home Lab
Velociraptor is a powerful tool for incident response and digital forensics, capable of collecting and analyzing data from multiple endpoints. In this guide, I’ll walk you through the setup of Velociraptor in a home lab environment using one main server (which will be my personal laptop) and three client machines: one Windows 10 system, one Windows Server, and an Ubuntu 22.04 version. Important Note: This setup is intended for forensic analysis in a home lab, not for production environments. If you're deploying Velociraptor in production, you should enable additional security features like SSO and TLS as per the official documentation. Prerequisites for Setting Up Velociraptor Before we dive into the installation process, here are a few things to keep in mind: I’ll be using one laptop as the server (where I will run the GUI and collect data) and another laptop for the three clients. Different executables are required for Windows and Ubuntu , but you can use the same client.config.yaml file for configuration across these systems. Ensure that your server and client machines can ping each other. If not, you might need to create a rule in Windows Defender to allow ICMP (ping) traffic. In my case, I set up my laptop as the server and made sure all clients could ping me and vice versa. I highly recommend installing WSL (Windows Subsystem for Linux) , as it simplifies several steps in the process, such as signature verification. If you’re deploying in production, remember to go through the official documentation to enable SSO and TLS. Now, let's get started with the installation! Download and Verify Velociraptor First, download the latest release of Velociraptor from the GitHub Releases page . Make sure you also download the .sig file for signature verification . This step is crucial because it ensures the integrity of the executable and verifies that it’s from the official Velociraptor source. 
To verify the signature, follow these steps (in WSL). First import the Velociraptor signing key, then verify:

gpg --search-keys 0572F28B4EF19A043F4CBBE0B22A7FB19CB6CFA1

Press 1 to import the key. Then:

gpg --verify velociraptor-v0.72.4-windows-amd64.exe.sig

It's important to do this to ensure that the file you're downloading is legitimate and hasn't been tampered with.

Step-by-Step Velociraptor Installation

Step 1: Generate Configuration Files

Once you've verified the executable, proceed with generating the configuration files. In the Windows command prompt, execute:

velociraptor-v0.72.4-windows-amd64.exe -h

To generate the configuration files, use:

velociraptor-v0.72.4-windows-amd64.exe config generate -i

This will prompt you to specify several details, including the datastore directory, SSL options, and frontend settings. Here's what I used for my server setup:

Datastore directory: E:\Velociraptor
SSL: Self-Signed SSL
Frontend DNS name: localhost
Frontend port: 8000
GUI port: 8889
WebSocket comms: Yes
Registry writeback files: Yes
DynDNS: None
GUI user: admin (enter password)
Path of log directory: E:\Velociraptor\Logs (make sure the log directory exists; if not, create it)

Velociraptor will then generate two files:

server.config.yaml (for the server)
client.config.yaml (for the clients)

During testing, it appears that a few changes have been made. If only the server YAML file is generated and not the client YAML file, run the following command to generate the client YAML file:

velociraptor-v0.72.4-windows-amd64.exe --config server.config.yaml config client > client.config.yaml

Step 2: Configure the Server

After generating the configuration files, you'll need to start the server. In the command prompt, run:

velociraptor-v0.72.4-windows-amd64.exe --config server.config.yaml gui

This command will open the Velociraptor GUI in your default browser. If it doesn't open automatically, navigate to https://127.0.0.1:8889/ manually. Enter your admin credentials (username and password) to log in.

Important: Keep the command prompt open while the GUI is running.
If you close the command prompt, Velociraptor will stop working, and you’ll need to restart the service.

Step 3: Run Velociraptor as a Service
To avoid manually starting Velociraptor every time, I recommend running it as a service. This way, even if you close the command prompt, Velociraptor will continue running in the background. To install Velociraptor as a service, use the following command:
velociraptor-v0.72.4-windows-amd64.exe --config server.config.yaml service install
You can then go to the Windows Services app and ensure that the Velociraptor service is set to start automatically.

Step 4: Set Up Client Configuration
Now that the server is running, we’ll configure the clients to connect to the server. Before that, you’ll need to modify the client.config.yaml file to include the server’s IP address so the clients can connect.
Note: Since I am running the server on localhost, I will not change the IP in the configuration file. If you are running the server anywhere else, do change it.

Setting Up Velociraptor Client on Windows
For Windows, you can use the same Velociraptor executable that you used for the server setup. The key difference is that instead of using the server.config.yaml, you’ll need to use the client.config.yaml file generated during the server configuration process.
Step 1: Running the Velociraptor Client
Use the following command to run Velociraptor as a client on Windows:
velociraptor-v0.72.4-windows-amd64.exe --config client.config.yaml client -v
This will configure Velociraptor to act as a client and start sending forensic data to the server.
Step 2: Running Velociraptor as a Service
If you want to make the client persistent (so that Velociraptor automatically runs on startup), you can install it as a service. The command to do this is:
velociraptor-v0.72.4-windows-amd64.exe --config client.config.yaml service install
By running this, Velociraptor will be set up as a Windows service.
Although this step is optional, it can be helpful for persistence in environments where continuous monitoring is required.

Setting Up Velociraptor Client on Ubuntu
For Ubuntu, the process is slightly different since the Velociraptor executable for Linux needs to be downloaded and permissions adjusted before it can be run. Follow these steps for the setup:
Step 1: Download the Linux Version of Velociraptor
Head over to the Velociraptor GitHub releases page and download the appropriate AMD64 version for Linux.
Step 2: Make the Velociraptor Executable
Once downloaded, you need to make sure the file has execution permissions. Check if it does using:
ls -lha
If it doesn’t, modify the permissions with:
sudo chmod +x velociraptor-v0.72.4-linux-amd64
Step 3: Running the Velociraptor Client
Now that the file is executable, run Velociraptor as a client using the command below (with the correct config file):
sudo ./velociraptor-v0.72.4-linux-amd64 --config client.config.yaml client -v

Common Error Fix: Directory Creation
You may encounter an error when running Velociraptor because certain directories needed for the writeback functionality may not exist. Don’t worry—this is an easy fix. The error message will specify which directories are missing. For example, in my case, the error indicated that writeback permission was missing. I resolved this by creating the required file and setting its ownership (replace <user> with your actual username):
sudo touch /etc/velociraptor.writeback.yaml
sudo chown <user>: /etc/velociraptor.writeback.yaml
After creating the necessary directories or files, run the Velociraptor client command again, and it should configure successfully.

Step 4: Running Velociraptor as a Service on Ubuntu
Like in Windows, you can also make Velociraptor persistent on Ubuntu by running it as a service. Follow these steps:
1. Create a Service File
sudo nano /etc/systemd/system/velociraptor.service
2.
Add the Following Content
[Unit]
Description=Velociraptor Client Service
After=network.target

[Service]
ExecStart=/path/to/velociraptor-v0.72.4-linux-amd64 --config /path/to/your/client.config.yaml client
Restart=always
User=<user>

[Install]
WantedBy=multi-user.target

Make sure to replace <user> and the paths with your actual user and file locations.
3. Reload Systemd
sudo systemctl daemon-reload
4. Enable and Start the Service
sudo systemctl enable velociraptor
sudo systemctl start velociraptor
Step 5: Verify the Service Status
You can verify that the service is running correctly with the following command:
sudo systemctl status velociraptor

Conclusion
That's it! You’ve successfully configured Velociraptor clients on both Windows and Ubuntu systems. Whether you decide to run Velociraptor manually or set it up as a service, you now have the flexibility to collect forensic data from your client machines and analyze it through the Velociraptor server. In the next section, we'll explore the Velociraptor GUI interface, diving into how you can manage clients, run hunts, and collect forensic data from the comfort of the web interface.

Akash Patel
- Tracking User Account and OAuth in Google Workspace (Without Losing Your Sanity)
If you’ve ever had to investigate a Google Workspace account takeover, you already know one thing: it’s not about one log — it’s about connecting multiple logs and understanding how Google thinks.

The Two Logs You Must Know
When it comes to tracking user behavior (and especially account compromise), there are two core log types you’ll always come back to:
Admin log events
User log events (previously separated into two logs: the Login Audit Log and the User Accounts Audit Log)
Think of these as different camera angles. One log alone never tells the full story — but together, they usually do.

Log Retention: The 6-Month Trap
By default, Google Workspace retains these logs for six months. And here’s the annoying part:
You cannot extend retention inside the Admin Console
There is no “keep logs longer” checkbox
If you want long-term visibility (and you absolutely should), the only solution is to:
Export logs to Google Cloud Logging
Configure extended retention there
Google Cloud allows log storage for up to 10 years, which is a lifesaver for compliance, threat hunting, and delayed investigations.

Log Lag Time: Why “Too Early” Is a Real Problem
One thing that trips up a lot of investigators is log availability delay. Each of these logs has a different lag time before events become searchable. And that lag time should be treated as the minimum waiting period, not a guarantee. So if you search immediately after an incident and think, “This doesn’t make sense…” …it probably doesn’t — yet.
Rule of thumb: Never rely on searches run sooner than the documented log lag times. Some events just arrive late.

Admin log events: Start Here for Admin Compromise
Admin log events are your go-to log for anything that happens inside the Google Admin Console. They track:
Admin actions
Configuration changes
Policy updates
Organization-wide modifications
If you suspect an admin account compromise, don’t overthink it — this is the first log you check.
It tells you exactly what changes were made and by which admin account. User log events: Where the Action Is The User log events is where most account takeover investigations spend their time. This log captures: Successful and failed logins Re-authentication prompts MFA changes Security challenges triggered by Google It doesn’t just tell you that someone logged in — it tells you how , why , and under what conditions . Understanding Login Types (This Matters) Each login event includes a Login Type , which explains how the authentication happened. Some common ones you’ll see: Google Password – Standard username + password login ReAuth – Google forced the user to re-authenticate SAML – Login via SSO Exchange – OAuth or existing token-based session Unknown – Login occurred using an unidentified method (always worth a closer look ) When you’re hunting suspicious activity, “Unknown” and unusual patterns in login types are often gold. Warning Icons = Pay Attention In the User log events, some events show a warning icon . These usually indicate unusual or suspicious logins , such as: New IP addresses Unfamiliar locations Behavior Google flags as risky Instead of scrolling endlessly, a smart approach is to hunt by event type . Login Event Types Investigators Care About Here are some high-value event types you should always keep an eye on: 2-step verification disabled – Big red flag Account password change – Especially if unexpected Failed login – Useful for brute-force patterns Government-backed attack – Google explicitly flagged a known threat actor Leaked password – Password found in credential dumps Suspicious login – Unusual characteristics detected Out-of-domain email forwarding enabled – Common data exfil trick User suspended – Often triggered by Google due to abuse or compromise Important note: Some details (like why a login failed) are not visible in the Admin Console and require pulling logs via the API. 
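Once these events are exported (from the UI or the API), hunting by event type is easy to script. Below is a minimal Python sketch; the event names simply mirror the Admin Console labels listed above, and the record layout is illustrative rather than the exact log schema:

```python
# Sketch: flag high-value login events in exported user log data.
# Event names mirror the Admin Console labels; the API uses its own
# identifiers, so treat these strings as illustrative.

HIGH_VALUE_EVENTS = {
    "2-step verification disabled",
    "Account password change",
    "Failed login",
    "Government-backed attack",
    "Leaked password",
    "Suspicious login",
    "Out-of-domain email forwarding enabled",
    "User suspended",
}

def flag_events(events):
    """Return only the events an investigator should triage first."""
    return [e for e in events if e.get("event_name") in HIGH_VALUE_EVENTS]

exported = [
    {"event_name": "Successful login", "user": "akash@example.com"},
    {"event_name": "Suspicious login", "user": "akash@example.com"},
    {"event_name": "Leaked password", "user": "dean@example.com"},
]

for hit in flag_events(exported):
    print(hit["user"], "->", hit["event_name"])
```

The same filter works unchanged whether the records come from a GSheet export converted to rows or from JSON pulled via the API.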
-------------------------------------------------------------------------------------------------------------
OAuth
Let’s be honest — OAuth sounds way more complicated than it actually is. At its core, OAuth is just a permission slip. Instead of giving an app your username and password (which is a terrible idea), OAuth lets you say: “Hey, this app can read my emails, but nothing else.” That’s it. That’s the magic.

So What Exactly Is OAuth?
OAuth is an authorization mechanism — not authentication.
It doesn’t prove who you are
It proves what an app is allowed to do
When an application wants to access your data through an API (emails, Drive files, contacts, calendar, etc.), OAuth sits in the middle and asks you for permission. If you say yes, the app gets a token. That token is like a digital key that says: “This app is allowed to access these specific things, on behalf of this user.” No password sharing. No repeated logins. Cleaner and safer.

Why OAuth Exists (And Why Everyone Uses It)
Imagine if every app you used asked for your Gmail password. Nightmare. OAuth solves a few big problems:
You don’t have to re-authenticate every time
Apps never see your actual credentials
Access can be limited (scope-based)
Tokens can be revoked anytime
That’s why OAuth is everywhere — Google Workspace, Microsoft, GitHub, Slack, Twitter (X), basically everything modern.

OAuth in Google Workspace (What Users Actually See)
Inside Google Workspace, OAuth usually shows up as that familiar screen:
“This app wants access to your: Gmail, Drive files, Contacts”
That list? Those are called scopes. Scopes define exactly what the app can touch. Nothing more. Once the user clicks Allow, Google generates an OAuth token, and the app can start making API calls using that token.
Important point: OAuth is enabled by default in Google Workspace unless admins restrict it.

Where Things Go Wrong: OAuth Abuse
Here’s the problem — OAuth is secure, but humans are optimistic.
Threat actors figured out something clever: “Why steal passwords when we can just ask nicely?”

The Basic OAuth Attack Chain
Attacker creates a malicious app
Victim gets a phishing email with a link
Victim clicks → sees a legit Google OAuth screen
Victim clicks Allow
Attacker now has access — no password needed
No malware. No credential theft. No MFA bypass required. Just consent.

Why Threat Actors Love OAuth
OAuth attacks are attractive because:
No credentials to steal
MFA doesn’t stop it
Looks completely legitimate
Uses official Google infrastructure
And the scariest part? OAuth does NOT give attackers more access than the user already has — but that’s usually more than enough.

Detecting OAuth Abuse in Google Workspace
Google Workspace actually gives us solid visibility here.
OAuth log events show:
Which user authorized which app
Application ID
Scopes granted
API activity performed using the token
If you pull these logs via API, you get even more gold:
Source of the request
Which Workspace service was accessed
How much data was returned
Client type and product bucket
This is huge for investigations and retroactive analysis.

Killing the Access: Revoking OAuth Tokens
A few important things defenders should know:
Changing a user’s password revokes OAuth tokens
IMAP tokens can take up to an hour to expire
Admins can:
Review all third-party apps
See who authorized them
Block apps org-wide
In the Admin Console, you can quickly identify sketchy apps by:
Unusual scopes
Non-verified apps
Excessive permissions
Block once — and it impacts the whole org.

The Big Takeaway
OAuth isn’t insecure. Blind trust is. OAuth attacks succeed because:
Users trust the Google consent screen
App names look legitimate
No passwords are involved (so alarms don’t go off)
Defenders need to:
Monitor token audit logs
Restrict third-party apps
Educate users that “Allow” is a powerful action
Because sometimes, clicking Allow is worse than typing your password.
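To make the "scopes are the permission slip" idea concrete, here is a small sketch that reviews the scopes granted in a token-authorization event and flags overly broad ones. The two scope URLs are real Google OAuth scopes, but the event layout is a simplified illustration, not the exact log schema:

```python
# Sketch: flag overly broad OAuth scopes in an authorization event.
# The scope URLs are genuine Google API scopes; the event dict is a
# simplified illustration of what the token log records.

BROAD_SCOPES = {
    "https://mail.google.com/",               # full Gmail access
    "https://www.googleapis.com/auth/drive",  # full Drive access
}

def risky_scopes(event):
    """Return the subset of granted scopes considered high-risk."""
    return sorted(set(event.get("scopes", [])) & BROAD_SCOPES)

event = {
    "user": "akash@example.com",
    "app": "Totally Legit PDF Viewer",   # hypothetical app name
    "scopes": [
        "https://www.googleapis.com/auth/userinfo.email",
        "https://mail.google.com/",
    ],
}

for scope in risky_scopes(event):
    print(f"{event['app']} granted broad scope: {scope}")
```

In a real hunt you would run this across every token event in the window and sort apps by how many users granted them broad scopes.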
------------------------------------------------------------------------------------------------------------- Final Thoughts If there’s one takeaway here, it’s this: Understand: What each log shows When data becomes available Which events actually matter Once you get comfortable with these logs, Google Workspace investigations stop feeling messy — and start feeling methodical. ------------------------------------------Dean--------------------------------------------------------------
- Email Log Search in Google Workspace – What You Can (and Can’t) See
Now let’s talk about Email Log Search, because this is one of the most commonly used (and misunderstood) tools when you’re investigating phishing, mailbox compromise, or suspicious inbound email. If a user reports: “I got a weird email” — this is usually where you end up first.

First thing to understand: the 30‑day rule
Google stores email transaction logs differently depending on how old the email is. This affects what you can search, how you can search, and what results you’ll get. Think of it as two different worlds:

Emails within the last 30 days
This is the "easy mode."
✅ No strict search parameters required
✅ You can search:
Sender
Recipient
IP address
Message ID
Google Groups email
Results limited to 1000 messages (screen + CSV export)
Near real-time (usually minutes, sometimes up to 24 hours lag)
This is where you do fast phishing triage.

Emails older than 30 days
This is where things get restrictive.
❌ You cannot search by Google Group email
❌ You must search using: Gmail recipient or Message ID
✅ No limit on historical depth (can go back years)
Still limited to 1000 results per search
Only post‑delivery details are available
Full delivery history is gone
In other words: you can search forever—but only if you already know exactly what you’re looking for.

-------------------------------------------------------------------------------------------------------
Where Email Log Search lives now
Google recently moved and upgraded the interface. You’ll now find it under:
Admin Console → Menu → Reporting → Email Log Search
This newer interface gives you:
Better filtering
Faster searches
Cleaner drill-down into message details
If you already have a Message ID, always use it. It’s the fastest and cleanest way to get results.

-------------------------------------------------------------------------------------------------------
What Email Transaction Logs actually show you
Email logs tell you about mail flow, not mailbox content.
You can see:
Inbound and outbound messages
SMTP path details
Sender IP addresses
Delivery status
Whether the message was: Delivered, Quarantined, or Rejected
In the recipient details, you can also see the current state of the email inside the mailbox. That’s extremely useful when:
A phishing email was delivered
Some users opened it
Others haven’t yet
This helps you decide whether to pull the email from inboxes immediately.

-------------------------------------------------------------------------------------------------------
Using Email Log Search for phishing investigations
This is one of the strongest use cases. Typical workflow:
User reports a phishing email
You grab: Message ID, sender address, sender IP
Search Email Log Search
Identify:
How many users received it
Whether variations were used
If multiple emails came from the same SMTP server
Even if the attacker rotated sender addresses, the IP often stays the same, which makes correlation easier.
Limitation: this method is most effective within 30 days of delivery.

-------------------------------------------------------------------------------------------------------
Quarantined and blocked email (the invisible stuff)
Here’s a really important thing many admins miss: some emails never reach user mailboxes at all. Google’s Gmail gateway evaluates messages before they enter Workspace storage. If an email is blocked at this stage:
❌ It will not appear in Vault
❌ It will not appear in Quarantine
❌ It cannot be recovered
It will only appear in Email Log Search.

Attachment-based blocking
Gmail automatically blocks certain attachment types, including:
Executables
Scripts
Certain archive contents
This also applies when:
The file is inside a ZIP
The ZIP is not password-protected
Google will even attempt to brute-force common ZIP passwords like:
infected
malware
If it can open the archive and finds a blocked file type, the email is rejected.
No notification is sent to: The user The admin The sender Email Log Search is the only place you’ll ever see it. ------------------------------------------------------------------------------------------------------- Why this matters during investigations During IR, you’re often asked: “Did anyone receive this email?” “Was it delivered or blocked?” “Can we retrieve it?” Email Log Search helps you answer all three —but you must understand its limits. Once Gmail blocks an email at the gateway: It never becomes evidence you can collect. You can only prove that it was blocked . ------------------------------------------------------------------------------------------------------- Final takeaway Email Log Search is: Excellent for phishing response Powerful for mail flow analysis Extremely time-sensitive But it is not a mailbox forensics tool. Think of it as your email traffic CCTV —it tells you what passed through the door , not what’s stored inside the room. Used correctly, it’s one of the most valuable tools in Google Workspace investigations. ------------------------------------------Dean--------------------------------------------------------
- Pulling Google Workspace Logs via API
Let me be honest upfront: this setup looks scary the first time you see it. Google makes you jump back and forth between Google Cloud Console and Google Workspace Admin , and it feels like you’re doing something wrong the entire time. You’re not. That’s just how Google designed it. Once you understand the full flow , everything suddenly clicks. This walkthrough assumes: You are a Google Workspace Super Admin You want to collect audit / activity logs using the Admin SDK – Reports API ------------------------------------------------------------------------------------------------------------ Big picture first (so you don’t get lost) You will work in two places : Google Cloud Console Create a project Enable APIs Create a service account Google Workspace Admin Console Trust that service account using domain-wide delegation Google Workspace itself does not have service accounts . That’s why Google Cloud is involved at all. We’re basically borrowing Google Cloud’s identity system to talk to Workspace. ----------------------------------------------------------------------------------------------------------- Step 1: Create a Google Cloud Project (same org as Workspace) Start here: 👉 https://console.cloud.google.com Click the project dropdown in the top bar Select New Project Set: Project name: workspace-log-collection Organization: must be the same org as your Workspace tenant Click Create That’s it. Important thing to understand: you are not deploying servers, VMs, or storage. This project is just a container to hold APIs and a service account. Step 2: Enable the required APIs (this is mandatory) Google locks everything by default, so we have to explicitly enable what we need. Inside your new project: Go to APIs & Services → Library Search for and enable: Admin SDK API (this is the key one) Optional (only if you plan to query these later): Google Drive API Gmail API Calendar API For audit and activity logs , Admin SDK alone is enough . 
If this API is not enabled, your script will fail even if every permission looks perfect.

Step 3: Configure OAuth Consent Screen (yes, even for service accounts)
This step confuses almost everyone. Even though we’re using a service account, Google still requires an OAuth consent screen to exist.
Go to APIs & Services → OAuth consent screen
Choose Internal
You only see this option because the project is under a Workspace org
Fill in the basics:
App name: Workspace Log Collector
User support email: your admin email
Developer contact email: your admin email
Click Save and Continue
On the Scopes page → just click Save and Continue
Finish
You do not need to publish the app externally. Think of this as telling Google: “Yes, this project is allowed to request Workspace APIs.”

Step 4: Create the Service Account
Now we create the identity that will actually pull logs.
Go to IAM & Admin → Service Accounts
Click Create Service Account
Set:
Name: workspace-log-reader
Click Create and Continue
Skip role assignment (no GCP roles required)
Click Done
At this point, the service account exists—but it can’t do anything yet.

Step 5: Enable Domain-Wide Delegation (critical step)
This is where most people miss a checkbox and everything breaks.
Click the service account you just created
Open the Details tab
Click Show domain-wide delegation
Check Enable Google Workspace Domain-wide Delegation
Save
Now copy the Client ID. You’ll need it immediately.
This setting allows the service account to act on behalf of users in your domain—but only for scopes you explicitly allow.

Step 6: Trust the Service Account in Google Workspace
Now we jump back to Workspace.
Go to Google Admin Console 👉 https://admin.google.com Navigate to: Security → API controls → Domain-wide delegation Click Add new Enter: Client ID : (from the service account) OAuth scopes : https://www.googleapis.com/auth/admin.reports.audit.readonly https://www.googleapis.com/auth/admin.reports.usage.readonly Click Authorize This is the trust handshake between Workspace and Google Cloud. Without this step, every API call will be denied. Step 7: Create and download a Service Account key You’ll need credentials for your script or tool. Go back to Google Cloud Console → Service Accounts Select your service account Open Keys → Add key → Create new key Choose JSON Download the file ⚠️ This JSON file is effectively a password . Store it securely. Step 8: Using the Service Account to pull logs When you actually query the Admin SDK API: Authenticate using the JSON key Enable domain-wide delegation Impersonate a Workspace admin user (very important) Example conceptually: Delegated user: admin@yourdomain.com API: Admin SDK – Reports API Logs belong to the domain , not the service account, which is why impersonation is required. ----------------------------------------------------------------------------------------------------------- Why investigators like this method Once this is set up, you can: Pull all Workspace logs in JSON Avoid UI export limits Build repeatable, defensible evidence collection Feed logs directly into SIEMs, timelines, or DFIR tooling ----------------------------------------------------------------------------------------------------------- Final thought Yes, the setup feels painful the first time. But once it’s done, you’ve essentially built a forensic-grade log pipeline for Google Workspace—and that’s incredibly powerful during incident response. After the first run, most analysts say the same thing: “Oh… that actually wasn’t that bad.” ------------------------------------------------------------Dean----------------------------------------
- Collecting Evidence from Google Workspace
Let’s talk about something that often comes up during Google Workspace investigations: how do we actually collect logs and evidence properly? If you’ve ever worked an incident involving Google Workspace, you already know that the platform gives you a lot of data—but not all of it is equally easy to collect or analyze.

Broadly speaking, there are two main ways to collect evidence from Google Workspace:
Using the Workspace Admin interface (UI)
Using the Workspace Admin SDK / APIs
On paper, both give you access to similar information. In reality, they behave quite differently—and those differences really matter during forensic analysis. Let’s break this down in a simple, practical way.

-------------------------------------------------------------------------------------------------------------
Option 1: Using the Google Workspace Admin Interface
The Admin interface is usually where everyone starts—and honestly, it’s not a bad place to begin. It gives you a visual and human-friendly way to explore logs. You can click through different sections, filter events, and clearly see what’s going on.
This is especially useful when:
You’re doing a quick triage
You need to show evidence to a manager, legal team, or client
You want to visually confirm suspicious activity
The downside? All the useful data is scattered across different screens. If you want to investigate a full Workspace compromise, you’ll likely need to:
Jump between Drive logs
Check login and authentication activity
Review OAuth and third‑party app access
Inspect Admin console changes
Each of these lives in a different place. That means a lot of clicking, filtering, exporting, and repeating the process again and again. It works—but it’s slow.

Export limitations
There are a few important limitations to keep in mind:
You can only export 10,000 or 100,000 events per search, depending on the log type.
If you exceed that limit, you must split your search into smaller time ranges Logs are exported only as Google Sheets (GSheet) from the UI You can later convert those sheets into CSV, but it’s an extra step—and not ideal if you’re planning to ingest logs into a SIEM or timeline tool. ------------------------------------------------------------------------------------------------------------- Option 2: Collecting Logs via the Workspace Admin SDK (API) Now this is where things get really interesting for forensic work. The Workspace Admin SDK allows you to collect logs programmatically using API calls. Once set up, this becomes the fastest and most consistent way to gather evidence. Yes, the initial setup takes some effort—you’ll need: A Service Account The right Workspace permissions Some basic scripting knowledge But once that’s done, everything becomes repeatable and scalable. Types of reports you can collect Using the API, you can pull two main types of reports: 1. Activity Reports These tell you what actually happened across Workspace services , including: Google Drive activity Authentication and login events OAuth and third‑party application access Admin console changes These are gold during investigations because they help you track changes, abuse, and attacker actions . 2. Usage Reports These focus more on how user accounts are being used over time. They’re great for spotting anomalies or misuse patterns. Why investigators prefer API logs There are several big advantages here: No event limits like the UI exports Logs are returned in JSON format , which is perfect for: SIEM ingestion Timeline creation Custom parsing and analysis All timestamps are in UTC , which avoids time zone confusion Collection can be fully scripted , ensuring consistency every time In short: if you’re doing a serious investigation, the API approach is hard to beat. 
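Conceptually, each API pull is just an authenticated HTTPS GET against the Reports API activities endpoint. Here is a sketch that only builds the request URL; authentication via the service account (and pagination of large result sets) is deliberately omitted, and in practice a client library handles both. Parameter names follow the public Reports API reference:

```python
# Sketch: build the Admin SDK Reports API "activities.list" request URL.
# Authentication (service account + domain-wide delegation) is omitted;
# a Google API client library normally handles it along with pagination.
from urllib.parse import urlencode

BASE = "https://admin.googleapis.com/admin/reports/v1/activity/users"

def activities_url(user_key, application, **params):
    """user_key: 'all' or a specific email; application: e.g. 'login', 'drive', 'token'."""
    url = f"{BASE}/{user_key}/applications/{application}"
    if params:
        url += "?" + urlencode(params)
    return url

# All login events, up to 1000 per page, returned as JSON with UTC timestamps:
print(activities_url("all", "login", maxResults=1000))
```

This is why the API route scales: changing one string switches you from login events to Drive or token activity, with no per-screen clicking.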
-------------------------------------------------------------------------------------------------------------
Option 3: Sending Google Workspace Logs to Google Cloud Logging
There’s a third option that often gets overlooked—but it’s extremely powerful. Google Workspace can send certain logs directly to Google Cloud Logging. This allows you to:
Retain logs for a much longer period
Query them using Cloud Log Explorer
Correlate Workspace logs with other Google Cloud activity
You must enable this sharing, which is disabled by default.

The catch
Not all Workspace logs are sent to Google Cloud. Only five log types are forwarded—and while these are some of the most valuable ones for investigations, they don’t always tell the full story. For example:
Email transit and email access logs are not included
You cannot customize which logs are sent
Google decides what gets forwarded—you only choose whether forwarding is enabled or not
So while this method is fantastic for long‑term visibility, it should be seen as a complement, not a replacement, for API‑based collection.

-------------------------------------------------------------------------------------------------------------
Permissions: A Common Roadblock
If you try to search Workspace logs in Google Cloud and run into permission errors—don’t panic. This usually means your account doesn’t have enough rights to query logs. https://docs.cloud.google.com/logging/docs/access-control
The fix is simple:
Go to IAM & Admin in Google Cloud
Grant the appropriate role (typically Logging Admin or equivalent)
Once that’s done, Log Explorer will start behaving as expected.

Log Explorer Example: Querying Workspace Logs in Google Cloud
When Workspace logs arrive in Google Cloud, they are spread across a few service names.
To search them together, you can use a query like this in Log Explorer : protoPayload.serviceName = ( "admin.googleapis.com" OR "cloudidentity.googleapis.com" OR "login.googleapis.com" OR "oauth2.googleapis.com" ) Example: protoPayload.serviceName = ( "login.googleapis.com" ) One important thing to remember: you need to be viewing logs at the root organization level in Google Cloud. ------------------------------------------------------------------------------------------------------------- Final Thoughts If we simplify everything: Admin UI → great for quick checks and visual walkthroughs Admin SDK / API → best for fast, consistent, forensic‑grade evidence collection Google Cloud Logging → excellent for long‑term retention and centralized querying In real investigations, the strongest approach is usually a combination of all three . ------------------------------------------Dean-------------------------------------------------------------
- Understanding Google Workspace Structure from a Cloud Forensics Lens
In this new series, we'll be diving deep into investigation and forensics within Google Workspace (the Google ecosystem). So tighten your seatbelt—let's go!

When diving into cloud forensics—especially in Google Workspace—there’s a lot more to unravel than just user credentials or login timestamps. One of the most overlooked but crucial areas is how permissions are managed within the environment. Let's break down two key building blocks of Google Workspace that matter a lot when you're investigating suspicious account behavior or responding to an incident:
👉 Organizational Units (OUs)
👉 Groups

Why OUs and Groups Matter in Forensics
Google Workspace has its own authentication and identity system, sure—but when you're trying to understand how and why a user had access to certain data or features, you need to look beyond just login logs. That’s where Organizational Units (OUs) and Groups come in. These two are the backbone of how permissions are structured and managed in Workspace. And guess what? They can be used independently, so knowing how each works is essential for tracing how permissions are applied—or misapplied.

----------------------------------------------------------------------------------------------------------
Organizational Units (OUs): Think Department Bins
Let’s start with Organizational Units. Think of them like folders or containers that you put users into based on department, location, or job role. Every user account must belong to one—and only one—OU. From an investigation perspective, this helps narrow things down: if you know the user’s OU, you don’t have to search other units. Also, OUs can be nested—meaning you can have child units inside parent ones. So a user could be in a sub-OU deep in the hierarchy, but they’ll still inherit permissions from the OUs above them. This inheritance is something to watch closely during an investigation.
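Because a user inherits from every parent OU, tracing effective settings is just a walk down the OU path, with deeper OUs overriding their ancestors. A toy sketch, with made-up OU paths and setting names:

```python
# Sketch: a user's effective settings are merged from every ancestor OU.
# OU paths and setting names are invented for illustration.

ou_settings = {
    "/":               {"drive_sharing": "internal_only"},
    "/Engineering":    {"less_secure_apps": "blocked"},
    "/Engineering/QA": {"drive_sharing": "open"},  # overrides the root setting
}

def effective_settings(ou_path):
    """Merge settings from the root down to the user's OU (deepest wins)."""
    merged = {}
    parts = [p for p in ou_path.split("/") if p]
    paths = ["/"] + ["/" + "/".join(parts[: i + 1]) for i in range(len(parts))]
    for p in paths:
        merged.update(ou_settings.get(p, {}))
    return merged

print(effective_settings("/Engineering/QA"))
```

Note how a user in /Engineering/QA ends up with open Drive sharing from their own OU plus a blocked-apps policy inherited from /Engineering — exactly the kind of "access granted from higher up the chain" you need to trace.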
Forensic Tip: OU Inheritance Can Create Hidden Access If a user is in a deeply nested OU, don't forget to trace all the inherited settings and permissions. You might find that access was granted not directly, but from higher up the chain. ---------------------------------------------------------------------------------------------------------- One User, Many Groups — And Even More Permissions Unlike OUs, where a user can only belong to one, a single user account in Google Workspace can be part of multiple groups at the same time. But here's where things get interesting—and complicated: groups can contain other groups. So if User A is in Group X, and Group X is inside Group Y, then User A indirectly inherits all permissions from Group Y too. This is what we call inherited groups, and it's an important concept for anyone doing incident response or auditing permissions. Forensic Insight: Inherited Groups = Inherited Risk Let's say you have a group called "IT Users". It's a member of both the "Log Access" and "Vault Access" groups. That means everyone in IT Users also inherits access to logs and vault data—even if that wasn't the original intention. This kind of setup is handy for streamlining permissions—but it can also accidentally over-provision users, which is something DFIR teams always need to watch out for. Using Groups Smartly Groups aren't just for permissions. You can also use them for: Feature access control Mailing lists Managing shared resources (like calendars, drives, etc.) Think of it like Microsoft's Security Groups and Distribution Groups in Active Directory. In large organizations, using groups makes onboarding and permissioning way easier. You can just drop a new user into the right group and boom—they've got the correct access in seconds. But this simplicity can be dangerous if you don't track what each group actually allows. 
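Resolving "who actually has access" through nested groups is just a graph walk. Here's a minimal sketch of that logic, using the "IT Users" / "Log Access" / "Vault Access" example above — the membership table and user names are invented for illustration, not real Workspace data:

```python
from collections import deque

# Hypothetical directory snapshot: group -> direct members (users or other groups).
GROUPS = {
    "IT Users": {"akash", "dean"},
    "Log Access": {"IT Users"},             # IT Users is nested inside Log Access
    "Vault Access": {"IT Users", "priya"},  # ...and inside Vault Access
}

def effective_members(group):
    """Resolve direct AND inherited membership by walking nested groups."""
    users, queue, seen = set(), deque([group]), set()
    while queue:
        g = queue.popleft()
        if g in seen:              # guard against group-membership cycles
            continue
        seen.add(g)
        for member in GROUPS.get(g, ()):
            if member in GROUPS:
                queue.append(member)   # nested group: expand it too
            else:
                users.add(member)      # plain user account
    return users
```

So `effective_members("Vault Access")` returns akash and dean as well as priya — exactly the over-provisioning risk described above: membership granted through "IT Users" is easy to miss if you only look at direct members.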
Real-World Use: Google Drive Sharing Groups Imagine this: you've got three groups set up for Google Drive sharing: Internal Sharing Only Sharing to Trusted Domains Open Sharing (Anyone outside the org) During a data breach, it's so much easier to identify which group allowed risky sharing if these types of groups are clearly defined. You could simply yank a user out of the "Open Sharing" group, and the exfiltration risk goes down instantly. This logic applies not just to Drive—but to all services in Google Workspace. Group Roles: Owner, Manager, Member Every group in Workspace has three roles: Owner: full control—can add/remove members, change settings, etc. Manager: can manage members, sometimes limited in changing settings Member: just a regular part of the group Forensics Tip: The group owner isn't always an IT admin. It could be a team lead, project manager, or anyone else. That means non-IT staff could be controlling access to sensitive groups, so always check who owns what. Using Admin Console to Inspect Groups Google makes it a bit easier to investigate with features like Inspect Groups, available in the Admin Console. With this, you can: See all groups assigned to a user Know whether group membership is direct or inherited Check which users belong to a specific group For example: you might find that Akash is directly added to the "Vault Access" group but indirectly added to the "IT Users" group through another group membership. That tells you how permissions were layered onto Akash's account. Feature Alert: Only in Enterprise Plus or Cloud Identity Premium Here's the catch: this level of detail—especially Inspect Groups and dynamic visibility—requires either: the Enterprise Plus edition of Google Workspace, or the Cloud Identity Premium add-on If you're in a budget-conscious environment, the add-on gives you solid forensic capabilities without needing the full Enterprise tier. 
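The direct-versus-inherited distinction that Inspect Groups surfaces can also be reproduced offline from a membership export. A rough sketch, assuming an invented membership table (the group and user names are placeholders, not real Admin SDK output):

```python
# Hypothetical membership table: group -> set of direct members (users or groups).
GROUPS = {
    "Vault Access": {"akash"},
    "helpdesk-team": {"akash"},
    "IT Users": {"helpdesk-team"},   # IT Users contains the helpdesk-team group
}

def membership_report(user):
    """Label each group a user belongs to as 'direct' or 'inherited'."""
    report = {}
    for group, members in GROUPS.items():
        if user in members:
            report[group] = "direct"
    changed = True
    while changed:   # propagate: a group containing an already-reached group grants inherited access
        changed = False
        for group, members in GROUPS.items():
            if group in report:
                continue
            if any(m in report for m in members):
                report[group] = "inherited"
                changed = True
    return report
```

Running `membership_report("akash")` flags "Vault Access" as direct and "IT Users" as inherited — the same layering the Inspect Groups example above describes, without needing Enterprise Plus to prototype the logic.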
---------------------------------------------------------------------------------------------------------- Final Thoughts: Groups = Power and Risk Groups are incredibly powerful, but also easy to overlook during forensic reviews. Always: Map direct vs inherited permissions Watch for non-admin group owners Audit who’s in which group and why Use features like Inspect Groups if your license supports it Getting this right can help you detect, contain, and respond to incidents faster and smarter . ----------------------------------------------Dean----------------------------------------------------- Stay with me—things are going to get more interesting in the upcoming articles!
- Let’s Go Practical: Working with NetFlow Using nfdump Tools
Enough theory. Now let’s actually touch NetFlow data . If you’re doing DFIR, threat hunting, or even basic network investigations, one toolkit you must be comfortable with is the nfdump suite. This suite gives you three extremely important tools: nfcapd – the collector nfpcapd – the pcap-to-NetFlow converter nfdump – the analysis engine ----------------------------------------------------------------------------------------------------------- nfcapd: The NetFlow Collector (Where Everything Starts) nfcapd is a daemon, not a one-time command. Its job is simple: listen on a UDP port receive NetFlow data from trusted exporters (routers, firewalls, switches) write that data to disk in a compact binary format It supports: NetFlow v5, v7, v9 IPFIX sFlow So regardless of vendor or flow standard, nfcapd usually has you covered. How Much Storage Do You Actually Need? This is one of the first questions everyone asks. A rough rule of thumb: ~1 MB of NetFlow data for every 2 GB of network traffic Is this perfect? No. Is it useful for planning? Yes. Your actual numbers will depend on: number of flows traffic patterns sampling exporter behavior But it’s a good starting point when designing storage. How nfcapd Stores Data (And Why It Matters) When nfcapd writes flow data, it uses a very clean naming scheme: nfcapd.YYYYMMDDhhmm Example: nfcapd.201302262305 Why this matters: files sort naturally by time no database needed easy scripting easy forensic timelines By default, nfcapd rotates files every 5 minutes. That means: 288 files per exporter per day predictable storage growth easy time slicing during investigations ----------------------------------------------------------------------------------------------------------- Bonus Feature: Flow Forwarding (-R Option) One very underrated feature of nfcapd is flow forwarding. You can collect NetFlow and forward it to another collector at the same time. 
Example scenario: local collection for DFIR central collection for SOC visibility Example command: nfcapd -p 1025 -w -D -R 10.0.0.1/1025 \ -n router,10.0.0.2,/var/local/flows/routerlogs Command breakdown: nfcapd - NetFlow capture daemon -p 1025 - Listen on port 1025 for incoming NetFlow packets -w - Align file rotation to the next interval (e.g., start at the top of the hour) -D - Run as a daemon (background process) -R 10.0.0.1/1025 - Act as a repeater/forwarder: send received flows to IP 10.0.0.1 on port 1025 -n router,10.0.0.2,/var/local/flows/routerlogs - Define an identification string: router - Identifier name for this source 10.0.0.2 - Expected source IP address /var/local/flows/routerlogs - Directory where flow files will be stored In summary: This command starts a NetFlow collector that listens on port 1025, stores flow data from router at 10.0.0.2 into /var/local/flows/routerlogs, and simultaneously forwards the data to another collector at 10.0.0.1/1025. It runs in the background as a daemon. This is extremely useful in larger environments. nfpcapd: Turning PCAPs into NetFlow Now this is where DFIR people should pay attention. nfpcapd lets you take a pcap file and convert it into NetFlow-style records. Why does this matter? Because parsing large pcaps is: slow CPU-heavy painful at scale NetFlow-based analysis is orders of magnitude faster. So the smart workflow is: Convert pcap → NetFlow Hunt quickly using NetFlow Go back to full pcap only where needed Example: nfpcapd -r bigFlows.pcap -l /mnt/c/Users/Akash/Downloads/ This step alone can save hours or days in an investigation. ----------------------------------------------------------------------------------------------------------- nfdump: Where the Real Analysis Happens Once flows are collected (or converted), this is where we start asking questions. nfdump is a command-line NetFlow analysis tool. 
It: reads nfcapd binary files applies filters summarizes results responds very fast — even on huge datasets Important point: nfdump does not magically find “bad traffic”. Its power comes from: how you ask questions how you refine hypotheses how you chain queries together This is investigative work, not alert-driven work. ----------------------------------------------------------------------------------------------------------- Reading NetFlow Data with nfdump You can read: a single file or an entire directory tree Reading a Single File nfdump -r /mnt/c/Users/Akash/Downloads/nfcapd.201302262305 This reads: flows from one exporter for a specific 5-minute window Perfect for targeted investigations. Reading a Directory (Much More Common) nfdump -R /mnt/c/Users/Akash/Downloads/test/ This tells nfdump: recursively walk the directory read all NetFlow files inside This is how you analyze: days weeks months of traffic ----------------------------------------------------------------------------------------------------------- Building Real Investigative Queries Let’s look at a realistic example. Goal: Find internal systems that accessed internet web servers without using the corporate proxy. Conditions: traffic passed through the internet-facing router destination ports 80 or 443 exclude the proxy IP 172.0.1.1 specific 24-hour window limit the output to the first 10 matching records Command: nfdump -R /mnt/c/Users/Akash/Downloads/test/ \ -t '2026/01/12.12:00:00-2026/01/13.12:00:00' \ -c 10 'proto tcp and (dst port 80 or dst port 443) and not src host 172.0.1.1' This is classic NetFlow hunting: scoped fast hypothesis-driven From here, you pivot: which hosts? how often? how much data? where did they connect? ----------------------------------------------------------------------------------------------------------- 1. line Output (Default, Lightweight) This is the default view and the one you’ll see most often when you’re doing quick scoping. 
It shows: start and end time source and destination IPs ports protocol bytes and packets Example: nfdump -R /mnt/c/Users/Akash/Downloads/test -o line host 172.16.128.169 This is perfect when you’re asking: “Is this IP even talking on my network?” Fast. Minimal. No noise. ----------------------------------------------------------------------------------------------------------- 2. long Output (Adds TCP Flags) The long format builds on line and adds: TCP flags Type of Service (ToS) Example: nfdump -R /mnt/c/Users/Akash/Downloads/test -o long 'proto tcp and port 445' Why this matters: TCP flags tell a story SYN-only traffic looks very different from established sessions RST storms, half-open connections, or scanning behavior start to stand out Important reminder: Each line is unidirectional. A normal bidirectional conversation: client → server server → client …will always appear as two separate flow records. This trips people up early on. ----------------------------------------------------------------------------------------------------------- 3. extended Output (Adds Statistics) This is where things get interesting. The extended format adds derived values , calculated at query time: packets per second bits per second bytes per packet Example: nfdump -R /mnt/c/Users/Akash/Downloads/test -o extended 'proto tcp and port 445' These values help you distinguish: interactive shells (low & slow) file transfers (fast ramp-up, steady throughput) dormant C2 channels (tiny but persistent) None of this data is stored explicitly — it’s derived — but it’s incredibly useful for behavioral analysis. ----------------------------------------------------------------------------------------------------------- IPv6 Note (Important but Often Missed) nfdump fully supports IPv6 , but truncates addresses by default for readability. If you want full IPv6 visibility, use: line6 long6 extended6 Same formats — just IPv6-aware. 
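The derived values that the extended output adds are simple arithmetic over the stored flow fields. A minimal sketch of that math (my own illustration of the calculation, not nfdump's actual source):

```python
def derived_stats(num_bytes, packets, duration_s):
    """Query-time values like nfdump's extended output: pps, bps, bytes/packet."""
    if duration_s <= 0:
        duration_s = 0.001   # sub-millisecond flows: avoid division by zero
    return {
        "pps": packets / duration_s,                    # packets per second
        "bps": num_bytes * 8 / duration_s,              # bits per second
        "bpp": num_bytes / packets if packets else 0,   # bytes per packet
    }
```

For example, a flow of 1,000,000 bytes in 1,000 packets over 10 seconds works out to 100 pps, 800,000 bps, and 1,000 bytes per packet — a profile closer to a steady file transfer than to an interactive shell, which is exactly the behavioral distinction described above.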
----------------------------------------------------------------------------------------------------------- Practical Hunt: Finding Patient Zero Using NetFlow Now let’s do real hunting, not theory. Goal: Identify internal hosts communicating with this C2. Step 1: First Hits of the Day Start with a known NetFlow file. Ask: “Who talked to this IP first today?” nfdump -R /mnt/c/Users/Akash/Downloads/test -O tstart -c 5 'proto tcp and dst port 8014 and host 172.16.128.169' We see the first hit of the day. That’s early — but maybe not early enough. Step 2: Expand the Time Window (Overnight) If the first hit isn’t at the beginning of the capture window, that’s a signal. So we expand: (overnight window) nfdump -R /mnt/c/Users/Akash/Downloads/test -t '2013/02/26.23:00:00-2013/02/26.23:10:59' -O tstart -c 1 'proto tcp and dst port 8014 and host 172.16.128.169' Step 3: What Else Did Patient Zero Do? Now we pivot. Same time window, but focus on the internal host itself: nfdump -R /mnt/c/Users/Akash/Downloads/test -t '2013/02/26.23:00:00-2013/02/26.23:10:59' -O tstart 'host 172.16.128.169' This answers: what happened before C2? was there a download? was there lateral movement? did anything precede the UDP traffic? Step 4: Infrastructure Expansion Using ASN Analysis Take the address involved and look up its ASN using WHOIS: whois 172.16.128.169 | grep AS Step 5: Hunt the “Internet Neighborhood” If the attacker uses one provider, they may use more infrastructure in the same ASN. 
So we ask: “Who talked to this ASN all month?” nfdump -q -R /mnt/c/Users/Akash/Downloads/test -o 'fmt:$sa $da' 'dst as 36351' | sort | uniq What this gives you: $sa → source IP (internal) $da → destination IP (external) Deduplicated list of unique communications Viewing Minimal Samples for Orientation Sometimes you just want a quick sanity check: nfdump -R /mnt/c/Users/Akash/Downloads/test -t '2013/02/26.23:00:00-2013/02/26.23:10:59' -O tstart -c 1 Or inspecting a single file: nfdump -r /mnt/c/Users/Akash/Downloads/test/nfcapd.201302262305 -O tstart -c 5 These commands are underrated — they help you: Validate time ranges Confirm exporter behavior Avoid wrong assumptions early ------------------------------------------------------------------------------------------------------------ Why Aggregation Changes Everything Because flows are split across files, one real-world connection may appear as many records. By default, nfdump aggregates using five key values: Source IP Destination IP Protocol Source Port Destination Port Flows sharing these values are merged into a single logical event. Detecting Port Scanning with Custom Aggregation Port scanners behave differently: Source port changes constantly Target port stays fixed By dropping the source port from the aggregation key, a scanner’s thousands of short flows collapse into a single record with an enormous flow count: nfdump -q -R /mnt/c/Users/Akash/Downloads/test -O bytes -A srcip,proto,dstport -o 'fmt: $sa -> $pr $dp $byt $fl' This answers: Who is consuming the most bandwidth Which protocol and port How many flows were involved Great for: Data exfiltration hunting Rogue services Abnormal internal behavior Using “TopN” Statistics for Threat Hunting Most engineers use TopN for bandwidth. Investigators use it differently. Syntax: -s statistic[:p][/orderby] Example: nfdump -R /mnt/c/Users/Akash/Downloads/test/ -s ip/bytes -s dstport:p/bytes -n 5 Why this matters: Identify staging systems (high outbound bytes) Detect scanners (high flow counts) Separate TCP vs UDP behavior with :p TopN becomes powerful only when driven by intelligence, not curiosity. 
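The aggregation idea above is easy to see in a few lines of code. This is a toy model with invented flow records, not nfdump internals — it just shows how the default five-tuple key merges split records, and how dropping the source port (like -A srcip,proto,dstport) collapses a scanner's flows:

```python
from collections import defaultdict

# Invented flow records: (src_ip, dst_ip, proto, src_port, dst_port, bytes)
records = [
    ("10.0.0.5", "198.51.100.7", "tcp", 49152, 443, 1500),
    ("10.0.0.5", "198.51.100.7", "tcp", 49152, 443, 2500),  # same connection, split across files
    ("10.0.0.9", "198.51.100.7", "tcp", 50001, 443, 300),
]

# Default aggregation: the classic five-tuple key merges split records.
agg = defaultdict(lambda: {"bytes": 0, "flows": 0})
for src, dst, proto, sport, dport, nbytes in records:
    key = (src, dst, proto, sport, dport)
    agg[key]["bytes"] += nbytes
    agg[key]["flows"] += 1

# Custom aggregation: drop the source port, so a scanner rotating
# source ports against one service piles up as a high flow count.
scan_view = defaultdict(int)
for src, dst, proto, sport, dport, nbytes in records:
    scan_view[(src, proto, dport)] += 1
```

Here the two split records from 10.0.0.5 merge into one logical event of 4,000 bytes across 2 flows — the same merging nfdump performs before it ever prints a line.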
------------------------------------------------------------------------------------------------------------ Final Thoughts nfdump isn’t flashy. It doesn’t decrypt payloads. It doesn’t show malware strings. But when used correctly, it tells you: Who talked For how long How often And how much data moved In real investigations, that context is often enough to confirm compromise, scope incidents, and prioritize response. ----------------------------------------------Dean-------------------------------------------------------
- Where NetFlow Either Shines or Struggles
Let’s talk about where NetFlow either becomes incredibly powerful… or painfully slow. Most NetFlow analysis is done through a GUI: browser-based consoles or thin clients that are basically a browser wrapped with authentication, branding, and access control. Nothing wrong with that — in fact, it makes a lot of sense. In most deployments, the GUI or console is hosted close to the storage layer, or on the same system entirely. That design choice is intentional. When analysts start querying months or years of NetFlow data, you do not want that traffic flying across the network. Keeping compute, storage, and analysis close together reduces latency and prevents unnecessary network load. ------------------------------------------------------------------------------------------------------------- Performance: The Real Bottleneck Nobody Plans For In commercial NetFlow products, the number of concurrent users is usually limited by: hardware capacity performance thresholds licensing In open-source setups, licensing disappears — but performance absolutely does not. Here’s the reality: even a handful of analysts clicking around dashboards can place massive load on the system. Drilling down into NetFlow data is extremely I/O-intensive. Multiple users querying long time ranges at the same time can quickly: saturate disk I/O spike CPU usage increase memory pressure and even introduce network congestion Out of all NetFlow components — exporter, collector, storage, analysis — the GUI or analysis console is by far the most resource-hungry. And historical searches make it worse. ------------------------------------------------------------------------------------------------------------- Storage Is Not Optional — It’s the Strategy Long-term NetFlow analysis only works if all records remain available locally to the analysis server. That means: ever-growing storage constant monitoring planned scaling Storage decisions are usually dictated by the analysis software itself. 
Most tools manage their own storage backend because the UI, queries, and analyst workflows depend on it. This isn’t something you “figure out later”. If storage is under-provisioned, performance will suffer — and data will be lost. ------------------------------------------------------------------------------------------------------------- Network Teams vs DFIR Teams: Very Different Needs This is where things get interesting. Network Engineering Teams They usually care about: near real-time NetFlow bandwidth usage link saturation uptime and performance For them, recent data (days or weeks) is the priority. Long-term historical NetFlow? Rarely critical. DFIR & Security Teams Completely different mindset. Incident responders want: maximum retention historical visibility the ability to look back in time Why? Because breach discovery is slow. That’s why security teams often deploy their own NetFlow infrastructure, separate from network engineering. It allows: long-term retention forensic-grade investigations zero impact on production network tooling With this model, security teams can identify: command-and-control traffic beaconing behavior suspicious outbound communications …even months or years after the initial compromise. Most IT departments simply cannot afford to retain data at that scale — but security teams often must. ------------------------------------------------------------------------------------------------------------- How NetFlow Data Is Stored (And Why It Matters) There’s no single standard here. Commercial tools usually rely on databases Open-source tools often use: binary formats ASCII formats or search-optimized document stores Some tools allow multiple formats to coexist so the same dataset can be analyzed with different tools. 
File-based storage has one big advantage: accessibility If the data is stored as files, organizations can: reuse the data analyze it with multiple tools adapt as requirements change For some teams, the choice of NetFlow platform is driven less by dashboards and more by how easily the data can be reused later. ------------------------------------------------------------------------------------------------------------- NetFlow Is Powerful — But Not Magic Let’s be honest. NetFlow does not contain payloads. There is no content. That means analysts often operate on reasonable assumptions , not absolute proof. Example: Seeing TCP/80 traffic does not guarantee HTTP. Without PCAP, proxy logs, or host artifacts, that conclusion is still a hypothesis. But in incident response, educated hypotheses are normal — as long as we constantly look for evidence that disproves them. This is where correlation matters: IDS alerts proxy logs endpoint telemetry protocol errors NetFlow rarely works alone. ------------------------------------------------------------------------------------------------------------- Baselines Turn Guesswork into Hunting One way to reduce uncertainty is baselining. If: 95% of engineering traffic normally goes to 20 autonomous systems and a new AS suddenly appears in the top traffic list That’s worth investigating. Same idea for: known botnet infrastructure traffic redirection services suspicious hosting providers Even without payloads, patterns matter. ------------------------------------------------------------------------------------------------------------- In a perfect world, we’d answer questions like: Did the attacker exfiltrate data? What tools were transferred? Which credentials were used? How long was access maintained? Who was behind the attack? In reality, limited data retention, encryption, and undocumented protocols make that difficult. NetFlow won’t answer everything. 
But combined with: protocol knowledge baselines timing analysis throughput patterns directionality …it allows analysts to make informed, defensible conclusions even when full packet capture is unavailable. ------------------------------------------------------------------------------------------------------------- Final Thought Yes, there’s a lot of theory here. And that’s intentional. Because the next article will be practical. So… tighten your seatbelts — we’re about to get hands-on. ----------------------------------------------Dean----------------------------------------------------------



