
Data Collection (Key Directories) in Digital Forensics for Linux

In digital forensics, it’s essential to follow the order of volatility to gather data effectively. The accepted standard, outlined in RFC 3227, is as follows:


  1. Memory

  2. Swap File

  3. History Files

  4. Network Data

    • Routing Table

    • ARP Cache

    • Network Connections

  5. Running Processes

  6. Temporary Filesystems

  7. Disk Data

    • Key Files (Audit Logs, Accounts, etc.)

    • Disk Images

  8. Physical Configuration

  9. Backup Data


Key Principles for Investigations

Attackers will leave traces, and some behavior can be predictable. Keeping these principles in mind helps guide the investigation:


  • Attackers need to gain access and run applications, which means user accounts and login data may hold evidence.

  • They are likely to start with shell access, so shell history is important.

  • If they compile exploits or shell scripts to avoid shell history, you can find evidence in the text editor history.

  • Attackers need to be able to communicate with the target system, so analysing networking data is very important.

  • Attackers generally want to get back in if you reboot, so look for common persistence.

  • As they move through the file system, they will change things. This can range from modifying existing files to implanting a backdoor through to staging data for exfiltration.


-------------------------------------------------------------------------------------------------------------

Common Attack Methods

  • Brute Force

  • Creating New Accounts

  • Adding Accounts to Groups

  • SSH Keys

  • Sudo Rights



Key Data to Check

Authentication Logs

Check auth.log, secure, utmp, wtmp, and btmp for failed and successful logins.

/etc/passwd

Review user accounts, modification times, and shell access.

/etc/shadow

Check for unexpected accounts and modification times.

/etc/group

Review group membership for privileged groups (wheel, sudo, adm).

/etc/sudoers

Validate modification times and check for users with excessive privileges.

/etc/sudoers.d/

Same as /etc/sudoers—attackers prefer this location as it survives system updates.
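
To spot-check modification times across these account and privilege files in one pass, here is a minimal sketch using GNU stat. Remember that attackers can alter timestamps with touch, so treat mtimes as leads rather than proof:

stat -c '%y %n' /etc/passwd /etc/shadow /etc/group /etc/sudoers /etc/sudoers.d/* 2>/dev/null  # mtime and path per file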


SSH Keys

  • /home/(username)/.ssh/ and /root/.ssh/ are the default SSH key locations.

  • known_hosts helps identify lateral movement.

  • authorized_keys shows evidence of backdoor access.
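
To enumerate every authorized_keys file quickly during triage, a minimal sketch (assuming home directories live under /home):

find /home /root -type f -name 'authorized_keys' -exec ls -la {} \; 2>/dev/null  # owner, permissions, and mtime per key file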


-------------------------------------------------------------------------------------------------------------

History Files to Investigate


Key Data to Check

.bash_history

Commands issued in the Bash shell. Other shells may store history elsewhere; the actual location of this file is held in the $HISTFILE variable.

.lesshst

A record of any searches or shell commands issued while running less. This can show users or attackers searching through files for specific strings or, in the case of restricted shells, attempting shell escapes.

.viminfo

This file contains the command line, search string and input-line history from any vi or vim invocations. It also contains references to file locations, buffer lists and key variables.

.mysql_history

Any command-line MySQL activity is stored in this file.


Other Potential History Files

  • .python_history

  • .gdb_history

  • .wget-hsts

  • .local/share/nano/search_history


-------------------------------------------------------------------------------------------------------------


Alternative Shells and History Files

Bash is the default shell for most Linux distributions like Ubuntu, CentOS, RHEL, and Amazon Linux. However, certain distributions such as Kali and Parrot default to Zsh, and users can easily configure their systems to use different shells.


Key IR Point: Shell History

Each shell stores its history file in different locations and behaves differently, which impacts how you interact with it during live response:


  • Zsh History: Stored in ~/.zsh_history. This can be overlooked in automated responses.

    • The history command in Zsh is aliased to fc -l 1, which prevents some standard history switches from working as expected.


    • You can make the history file display timestamps on a live system with any of the following commands:

      • fc -lf

      • fc -li 100

      • \history -E

      • \history -I


Note: Timestamps in history files are not forensically sound, and entries from previous sessions may have the timestamp of the first session.

You can find out more about the command with run-help fc in a terminal.
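
When reviewing ~/.zsh_history offline, entries written with the EXTENDED_HISTORY option take the form : <epoch>:<duration>;<command>. A minimal sketch for converting the epoch prefixes into readable timestamps (gawk is assumed for strftime(), and the mounted-image path is hypothetical):

# print a human-readable timestamp followed by the command for each extended entry
gawk -F'[:;]' '/^: [0-9]+/ { print strftime("%Y-%m-%d %H:%M:%S", $2), substr($0, index($0, ";") + 1) }' /mnt/image/root/.zsh_history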


-------------------------------------------------------------------------------------------------------------


Networking Data Collection in IR

Networking analysis is critical in detecting unauthorized or malicious activity. Here are key points and common signs to look for:

Attacker Signs

  • Unusual ports in use.

  • Long-lasting connections.

  • Unexpected processes making network connections.


Key Networking Files

/etc/hosts

Contains local IP resolution data. Attackers may modify this file to reroute hostnames or disguise C2 IP addresses.

/etc/resolv.conf and /etc/systemd/resolved.conf

Check the DNS resolution configurations for suspicious changes, especially invalid nameservers.


Useful Networking Commands

lsof -Pni

(Live response only) Lists files with open network connections.

netstat -nap

(Live response only) Displays network connection data.

route

(Live response only) Displays the kernel routing table.

arp -a

(Live response only) Returns the ARP cache on the system.
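
Note that the net-tools utilities (netstat, route, arp) are missing from many modern distributions. The iproute2 equivalents below cover the same ground during live response:

ss -pant        # network connections with owning processes
ip route show   # kernel routing table
ip neigh show   # ARP/neighbour cache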


-------------------------------------------------------------------------------------------------------------


Running Processes

When reviewing running processes, certain signs can indicate an attacker’s presence:

Attacker Signs

  • Unusual processes or unexpected arguments.

  • Processes running with unexpected privileges.


Process Investigation

  • Use the ps auxww command to list processes. Review start times and validate that the user account and command-line arguments are legitimate.

  • Inspect the /proc directory for more details on running processes. This directory can stand in for a memory dump in live response situations; see the sketch below.
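
A minimal sketch for pulling the key /proc artefacts for a single suspicious process (the PID is hypothetical):

PID=1337                                # hypothetical suspicious process
ls -l /proc/$PID/exe /proc/$PID/cwd     # deleted binaries appear as '(deleted)'
tr '\0' ' ' < /proc/$PID/cmdline; echo  # command line is NUL-separated on disk
ls -l /proc/$PID/fd 2>/dev/null         # open files and network sockets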


-------------------------------------------------------------------------------------------------------------


Common Persistence Techniques

Attackers often leave backdoors to maintain persistence on the system. Here are common techniques and what to check for during IR:


Creating New Accounts & SSH Access

Validate user accounts and modification times, especially for unexpected or malicious SSH keys located in /home/(username)/.ssh/ or /root/.ssh/.


Scheduled Tasks

Check crontab and the following directories for suspicious scheduled tasks:


  • /etc/cron.d/

  • /etc/cron.hourly/

  • /etc/cron.daily/

  • /etc/cron.weekly/

  • /etc/cron.monthly/


Look for tasks modified during the incident timeframe.
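
A quick way to surface recently changed entries is to list everything cron-related sorted by modification time; a minimal sketch (user crontabs typically sit under /var/spool/cron, though the exact path varies by distribution):

ls -lat /etc/crontab /etc/cron.d/ /etc/cron.hourly/ /etc/cron.daily/ /etc/cron.weekly/ /etc/cron.monthly/ /var/spool/cron/ 2>/dev/null  # newest first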


Start-Up Scripts

Review startup scripts, which vary based on the operating system. Key locations include:


  • /etc/init.d/

  • /etc/rc(x).d/

  • /etc/systemd/system/

  • /usr/lib/systemd/user/

  • /usr/lib/systemd/system/


Useful commands:


  • systemctl list-unit-files: Default on CentOS & Ubuntu.

  • chkconfig --list: For System V systems.


Modified Binaries

Attackers sometimes modify or replace binaries with their own code. While it’s not feasible to check every file during triage, focus on files with suspicious modification times or altered permissions. Linux does not strongly enforce digital signatures for ELF or SO files, making them prime targets for attackers.


Hidden Files

Hidden files are another method attackers use to conceal backdoors. Files with a leading . in their name, such as .evil, are hidden from default views. Attackers may even create folders named ... to further disguise their presence.

Use the find command to search for hidden directories:


find / -type d -iname '.*' -exec ls -alht {} \; 2>/dev/null

This command can be used for both live response and analysis of disk images.


-------------------------------------------------------------------------------------------------------------


Validating SSH Access

Attacker Signs:

  • Brute force attacks on the SSH service

  • Creation of new SSH keys to maintain access

  • Modifying the authorized_keys file to add malicious entries

  • Exploits targeting SSH configuration files


Key Steps for Responders:

  • Logs: Investigate SSH logs for failed login attempts followed by successful logins, a common sign of brute force attempts. Look in files such as /var/log/auth.log or /var/log/secure for SSH activity.

  • Shell History: Look for any SSH-related commands in user histories, which can indicate remote access activity. The use of scp (secure copy protocol) could signal data exfiltration.

  • SSH Folders: Check for modifications in the SSH configuration files and authorized_keys to ensure no unauthorized keys were added. Compare these files to backups, if available, or check modification times to identify suspicious changes.

  • Configuration Validation: Inspect SSH configuration files, such as /etc/ssh/sshd_config, for any unusual modifications that might have been made to weaken security.


-------------------------------------------------------------------------------------------------------------


Checking for File Modifications

Attacker Signs:

  • Creation of new users (with malicious intent)

  • Anti-forensic activities like deleting shell history files or hiding files

  • Staging data for exfiltration in archives (e.g., .tar, .zip files)

  • Leaving backdoor files or scripts for persistence


Key Commands for Responders:

  • Detecting Large Files: Look for large files (often staged for exfiltration). Example command:


find / -type f -size +1G 2>/dev/null

  • Suspicious Files in /dev: The /dev folder should only contain device files or symbolic links. Regular files in this directory could indicate tampering:


find /dev -type f 2>/dev/null

  • File Modification Timestamps: Search for files modified during the incident window to detect changes in binaries, scripts, or configuration files:


find / -type f -newermt YYYY-MM-DD ! -newermt YYYY-MM-DD 2>/dev/null

This template searches for files modified within a given window; replace the placeholders with the start and end dates of the incident.
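
For example, with a hypothetical incident window of 1-8 March 2024:

find / -type f -newermt 2024-03-01 ! -newermt 2024-03-08 2>/dev/null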


Additionally, during live response, it is good practice to validate the packages installed on the system:

Package Verification: Validate the integrity of installed packages using these commands:

  • CentOS:

rpm -Va

  • Ubuntu:

debsums -c  # (Install debsums if necessary)

-------------------------------------------------------------------------------------------------------------


Log Data Collection


Primary Logs to Examine:

Apache/httpd Logs

Look for unusual requests, especially ones that may indicate scanning or exploitation attempts.

Audit Logs

These capture all system-level events, and you can check for unusual file accesses, command executions, or authentication failures.

/var/log/secure

Focus on sudo usage, SSH authentications, and any failed login attempts.

/var/log/messages

Check for system errors, warnings, and notifications that may indicate misconfigurations or exploits.

/var/log/auth.log

Focus on user authentication attempts, including both failed and successful ones.


Quick Wins (Log-Based Indicators):

Sudo Use and Command Execution:

grep 'sudo' /var/log/secure

User Authentication:

grep 'Failed password' /var/log/auth.log

Unusual Notifications or Warnings:

grep 'warning' /var/log/messages

Audit Logs for Commands Issued:

ausearch -m execve
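
ausearch can also be bounded to the incident window, with numeric fields resolved to names; a minimal sketch (the dates are hypothetical):

ausearch -m execve -ts 03/01/2024 00:00:00 -te 03/08/2024 00:00:00 -i  # -i interprets UIDs and syscall numbers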


Additional Logs to Consider:

Mail Logs

/var/log/maillog or /var/log/mail.log

Identify whether malicious actors are sending spam or phishing emails.

Firewalld Logs

/var/log/firewalld

Look for changes or violations in firewall rules.

IPTables Logs

/var/log/syslog

Look for unexpected firewall rule modifications.

UFW Logs (Uncomplicated Firewall on Ubuntu)

/var/log/ufw.log

Look for sudden allow/deny actions that are unusual for the environment.

Samba Logs

/usr/local/samba/var/smbd.log or /var/log/samba/smbd.log

Useful if you're in a mixed Windows/Linux environment, especially for tracking lateral movement via file shares.

-------------------------------------------------------------------------------------------------------------


Key Considerations for Log Collection:

  1. Establish Collection Requirements:

    • Determine whether to collect just logs, memory, or even full disk images, depending on the attack scope.

  2. Live or Dead Collection:

    • Live Response: Can gather data directly from an active system.

    • Dead Collection: Query mounted disk images, but note that live system data (e.g., /proc and /var/run) won’t be available.

  3. Prepare for Errors:

    • Account for system-specific idiosyncrasies, like file permissions or missing directories. Root access is often required for collecting critical files like /etc/shadow.

  4. Log the Collection Process:

    • Document everything collected to maintain context and establish proper documentation for later analysis or legal cases.

  5. Timestamp the Collection:

    • Capture exact times during collection to reconcile events across logs and understand the sequence of attacker activities.

  6. Hash Collected Data:

    • Hash all collected files to ensure their integrity for further forensic investigation or legal use.
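
Points 5 and 6 translate directly into a couple of commands at the end of a collection run; a minimal sketch, assuming the /tmp/evidence output directory used in the next section. Writing the hash list outside the evidence directory avoids hashing the list itself:

date -u +'%Y-%m-%dT%H:%M:%SZ' > /tmp/evidence/collection_time.txt       # record the collection time in UTC
find /tmp/evidence -type f -exec sha256sum {} + > /tmp/evidence.sha256  # hash every collected file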


-------------------------------------------------------------------------------------------------------------


Live Response Commands:

scp/ssh for Remote Collection:

  • Copy collection script to a remote machine:

scp -i sshkey script.sh root@hostname:/tmp/script.sh

  • Make the script executable:

ssh -i sshkey root@hostname chmod +x /tmp/script.sh

  • Run the script to collect evidence:

ssh -i sshkey root@hostname '/tmp/script.sh /tmp/evidence'

  • Retrieve the evidence (note the -r flag, since /tmp/evidence is a directory):

scp -r -i sshkey root@hostname:/tmp/evidence/ ./

-------------------------------------------------------------------------------------------------------------


This method can be incorporated into a loop to automate the collection process from multiple hosts. Make sure you test your script extensively before deployment and document it thoroughly for transparency and reproducibility.
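
A minimal sketch of such a loop, assuming a hosts.txt file listing one target hostname per line:

# push the script to each host, run it, and pull the evidence back
while read -r host; do
  scp -i sshkey script.sh "root@${host}:/tmp/script.sh"
  ssh -i sshkey "root@${host}" 'chmod +x /tmp/script.sh && /tmp/script.sh /tmp/evidence'
  scp -r -i sshkey "root@${host}:/tmp/evidence" "./evidence_${host}"
done < hosts.txt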


A basic triage script template is available on GitHub and can be customized to fit your IR process.


-------------------------------------------------------------------------------------------------------------


Conclusion:

Implementing a well-structured log collection process, including hashing and timestamping data, ensures the integrity and accuracy of evidence. Live and remote collection strategies, combined with error handling and proper documentation, are essential to streamline incident response, minimize the risk of data tampering, and support legal or regulatory requirements. As always, extensive testing and preparation of your collection scripts will help mitigate errors during critical moments, making your response faster and more efficient.


Akash Patel
