In an era where cyber threats lurk around every corner and attacks on data centers are increasing, security has become an essential element of every machine that guards users’ data. However, many security offerings are defenseless in the presence of malware. Furthermore, software-based security consumes compute and memory resources that should be allocated to users.
Mellanox BlueField® SmartNIC is an advanced, programmable Ethernet SmartNIC equipped with an array of Arm® processor cores and an integrated Mellanox ConnectX®-5 network controller, and it addresses the problem of securing data centers. It secures data centers while simultaneously enabling users to enjoy the computation resources they were promised. The BlueField SoC at the heart of the SmartNIC runs out-of-band security software in a trusted domain that is separate and isolated from potential malware. As the security software runs on the SmartNIC Arm cores, all server compute resources are made available to users. Using an isolated environment, the SmartNIC is able to securely access application data for introspection, while simultaneously avoiding data tampering by malware, and without leaving a footprint of when and what data was accessed. This innovative design makes BlueField the best solution for malware detection and forensics investigation.
Nowadays, malware is vicious and stealthy, and it can employ hiding techniques to avoid detection by traditional software security offerings. Avoidance is possible due to an inherent problem in the data normally used to detect malware. Typically, a security solution has a data collection phase in which the collected data is used to learn the activity of the malware. In the traditional approach, the data collection phase is based on software that runs on the same machine being inspected. Hence, it may fail to detect an intrusion if the malware tampers with the very data used to detect it. Techniques for hiding from an observer (e.g. detection software that is looking for Indicators of Compromise (IOC)) are referred to as “anti-forensics techniques”. Malware can employ the same techniques to avoid detection by both Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS).
The failure to detect malicious activity may occur at any step of the process. A critical element is data acquisition. If the data used for inspection is unreliable, a detection system may not find any IOC, because any signs of compromise were hidden by the malicious software. Several questions must be asked about the data acquisition method to determine the level of trust: How does the IDS/IPS security application acquire the data? Can malware tamper with the data being acquired by an Intrusion Detection System?
There are several techniques for acquiring data, and several types of data used for the purpose of analysis. The following paragraphs will briefly cover the main methods, the type of data that is relevant to each, and their weaknesses.
Anti-malware software works on files that are persistently stored, also referred to as data-at-rest. The disk can be analyzed by anti-malware software running on the same machine, or externally by another machine that is not compromised. When the disk is analyzed externally and is not encrypted, it’s possible to build the filesystem tree and scan the disk for known IOC. For example, a file found while scanning the disk can be reconstructed for the purpose of computing a hash value. Various online resources can then indicate whether the file is malicious based on its hash value. However, if malware is not stored on the hard disk, it may not leave any footprint on the filesystem, and the anti-malware scanning technique would fail to detect a compromised system.
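The hash-and-lookup step described above can be sketched as follows. This is a minimal illustration rather than any particular scanner's implementation: the `KNOWN_BAD_SHA256` set and the function names are hypothetical, and a real deployment would query a threat-intelligence source instead of a hard-coded set.

```python
import hashlib

# Hypothetical set of known-bad SHA-256 digests (assumption for illustration);
# a real scanner would consult an external reputation service instead.
KNOWN_BAD_SHA256 = {
    hashlib.sha256(b"pretend-malware-payload").hexdigest(),
}


def sha256_of_bytes(data: bytes, chunk_size: int = 1 << 20) -> str:
    """Hash the reconstructed file contents in chunks, as for a large file."""
    digest = hashlib.sha256()
    for offset in range(0, len(data), chunk_size):
        digest.update(data[offset:offset + chunk_size])
    return digest.hexdigest()


def is_known_bad(data: bytes) -> bool:
    """Return True if the file's hash appears in the known-bad set."""
    return sha256_of_bytes(data) in KNOWN_BAD_SHA256
```

A lookup of this kind only works for files that exist on disk, which is exactly the limitation noted above for malware that never touches the filesystem.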
Most attacks leave some footprint on the network. Consider the scenario of stealing secrets from a host machine and sending them to a remote attacker. Detecting such events can reveal which IP address might have performed the attack and what its goal is. Today, most IDS and IPS solutions observe the network for malicious activity. The network data can be collected locally by the same machine or externally (e.g. using a SmartNIC or a switch).
Run-time data provides the best visibility into the system, and there are two approaches to acquiring it: intrusive and non-intrusive to the operating system. The intrusive option refers to privileged software that hooks into events and triggers through functions in the operating system. For example, opening or closing a file or socket would trigger the collection of data on what was opened or closed, and when. Another example is forking a new process: detection software uses the fork and the subsequent execution of a new process as a trigger to detect malicious activity (e.g. is the new process malware? Is the running process expected to fork a new process?). Needless to say, sophisticated malware can manipulate these hooks.
Ideally, we want to be able to collect data that reflects the state of the system and the activity that is happening from three main sources: disk, network, and memory.
Most detection techniques use the network or the disk approach for detecting IOC. Unfortunately, this is insufficient to tackle the challenges of modern malware. Researchers have shown that modern malware has many tricks up its sleeve, and the bar keeps getting higher.
For example, some malware can attack the system without leaving any footprint on the disk, thus hiding its presence and malicious activity from disk-based detection techniques. Malware that uses the network to operate cannot hide completely. However, while the network traffic may contain many signs of compromise, in many cases the traffic is stateless, too voluminous, random, and complex, so even if an IOC is found, it’s not possible to analyse the behavior of the malware involved. To understand the behavior of malware and make sense of the network traffic, a closer look at the run-time environment is needed.
An x-ray view of malware activity requires acquiring data at execution time. Run-time data provides better visibility into events and actions, such as which processes are running, which network connections are open, and which primitives offered by the OS are in use. Run-time data allows for a better understanding of malware behavior, so detection software can more accurately identify malicious activity.
Nonetheless, acquiring such data is challenging. A software-based solution would yield an observer effect, since both the malware and the intrusion detection system run in the same domain and share the same resources. Malware can manipulate the hooks and functions that an intrusion detection system uses to acquire data, resulting in unreliable and compromised data. Thus, rather than using hooks and functions that are subject to alteration by malware, the preferred alternative is to use a secure method to obtain raw data from the host’s physical memory, the arena for run-time execution of the system. Assuming a tamper-resilient method exists to acquire the raw data from the host’s physical memory, it’s possible to reconstruct the state of the system. This includes the kernel memory and code, and user space environments.
The data built from a raw memory dump provides an abstraction for examining and detecting an attack. If an attack were to happen, whether through injecting code, manipulating process memory, forking a new process, or opening a new network connection to a remote attacker, it would manifest as a change in physical memory. The greater the impact, the more artifacts it would leave in memory.
Most forensic investigations combine data from the network with data from the host’s physical memory. The combination of both allows for building an accurate copy of the system’s state. This blog presents a novel approach that allows reliable data acquisition of host physical memory.
In order to detect and analyze malware, the out-of-band device acquires data without providing any indication of when the accesses occur. The hardware-based approach to acquiring data is considered the most reliable and trusted method for malware detection. This reliability stems from modern computer architecture and the way PCI Express (PCIe) devices access host physical memory. In most cases, peripheral devices using the PCIe protocol have Direct Memory Access (DMA) capabilities and can read from and write to host physical memory without causing side effects visible to any software running on the host machine, including malware. An add-in card with a PCIe interface can issue memory read and memory write transactions to host physical memory at raw rates of 8 Gbps (Gen3, 8 GT/s) or 16 Gbps (Gen4, 16 GT/s) per lane.
Figure 1: Intrusion Detection System using PCIe interface to read data from host physical memory
The host physical memory is divided into multiple regions, which are mapped at boot time and include System RAM, IO space, and ROM. For the most part, the data and the areas targeted by a malware attack reside in System RAM, where the kernel and the malware live. For the purpose of data acquisition, the acquisition device issues memory read transactions to acquire the physical pages of the RAM regions. Figure 2 illustrates the memory map of an Ubuntu 16.04 Linux host.
Figure 2: The memory map of an Ubuntu host machine used by an Intrusion Detection System
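On Linux, a memory map of this kind is exposed as text in `/proc/iomem`. As a rough sketch of how an acquisition tool might pick out the System RAM regions to read, the snippet below parses text in that format. The sample addresses and the function name `system_ram_regions` are illustrative assumptions, not taken from the host in Figure 2.

```python
import re

# Sample text in the format of Linux /proc/iomem. The addresses are
# illustrative only; on a real host, read the file with root privileges,
# since unprivileged reads show zeroed addresses.
SAMPLE_IOMEM = """\
00000000-00000fff : Reserved
00001000-0009d7ff : System RAM
000a0000-000bffff : PCI Bus 0000:00
000f0000-000fffff : System ROM
00100000-bffdffff : System RAM
  01000000-01a02fff : Kernel code
100000000-23fffffff : System RAM
"""

LINE_RE = re.compile(r"^([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.+)$")


def system_ram_regions(iomem_text):
    """Return (start_address, length_in_bytes) for each top-level System RAM entry."""
    regions = []
    for line in iomem_text.splitlines():
        # Indented lines are sub-regions (e.g. "Kernel code"); skip them.
        if line.startswith(" "):
            continue
        match = LINE_RE.match(line)
        if match and match.group(3).strip() == "System RAM":
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)  # end address is inclusive
            regions.append((start, end - start + 1))
    return regions
```

Restricting reads to these ranges avoids touching IO space and ROM, which hold no run-time state of interest for the analysis described here.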
The transaction travels from the PCIe add-in card through the PCIe link to the memory controller, which in turn provides access to the physical memory. As depicted in Figure 1, this path does not involve any software running on the host and is hidden from the malware. Unlike software-based solutions, such a solution does not violate forensics requirements by running new software on the host machine under investigation. The next requirement is that the acquired data be consistent so that it can be analyzed. For example, consider accessing two pages in host physical memory where one page points to the other: if the page being pointed to changes its physical address, the data acquisition tool would read the wrong page in memory. This risk exists for any tool acquiring physical memory, regardless of whether it’s hardware- or software-based. The longer it takes to acquire the memory, the higher the likelihood of inconsistencies. The shorter the acquisition time, the fewer changes can occur, which increases the likelihood of reliable data.
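One possible way to gauge such inconsistencies, offered here as an illustration and not as the method used in the work described in this post, is to re-read each page after the first pass and flag pages whose contents changed in between. The `read_page` callable and the `ChangingMemory` simulator below are hypothetical stand-ins for a real acquisition path.

```python
import hashlib


def acquire_with_check(read_page, page_numbers):
    """Read each page once, then re-read and flag pages whose hash changed.

    read_page(n) -> bytes is a hypothetical stand-in for a DMA page read.
    """
    snapshot = {n: read_page(n) for n in page_numbers}
    first_hashes = {n: hashlib.sha256(data).digest() for n, data in snapshot.items()}
    changed = [
        n for n in page_numbers
        if hashlib.sha256(read_page(n)).digest() != first_hashes[n]
    ]
    return snapshot, changed


class ChangingMemory:
    """Simulated memory in which one page is modified after its first read."""

    def __init__(self, pages, volatile_page):
        self.pages = dict(pages)
        self.volatile_page = volatile_page
        self.reads = {}

    def read_page(self, n):
        self.reads[n] = self.reads.get(n, 0) + 1
        data = self.pages[n]
        # Simulate the host changing this page between the two passes.
        if n == self.volatile_page and self.reads[n] > 1:
            data = b"\xff" + data[1:]
        return data
```

Pages reported in `changed` were modified during acquisition, so structures that span them deserve extra scrutiny; a shorter acquisition window should shrink that list.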
Here again, hardware-based approaches outperform their software counterparts thanks to their superior speed. For instance, acquiring 64GB of RAM can take several minutes using software tools. A PCIe add-in card operating at Gen4 rates acquires data at 16Gbps per lane, so a device with 16 PCIe lanes connected to a host machine allows for data acquisition at up to 32GB/s of raw link bandwidth.
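The arithmetic behind these figures can be checked with a short calculation. This is a back-of-the-envelope sketch using the raw per-lane rates quoted above; the function names are illustrative, and the result is a lower bound on acquisition time because it ignores protocol overhead and memory-controller limits. Passing `encoding_efficiency=128/130` accounts for the 128b/130b line encoding used by PCIe Gen3 and Gen4.

```python
# Raw per-lane signalling rates in gigabits per second, as quoted in the text.
LANE_RATE_GBPS = {"gen3": 8.0, "gen4": 16.0}


def link_bandwidth_gbytes_per_s(generation, lanes, encoding_efficiency=1.0):
    """Raw link bandwidth in gigabytes per second (8 bits per byte)."""
    return LANE_RATE_GBPS[generation] * lanes * encoding_efficiency / 8.0


def min_acquisition_seconds(ram_gbytes, generation, lanes, encoding_efficiency=1.0):
    """Lower bound on the time needed to read ram_gbytes over the link."""
    return ram_gbytes / link_bandwidth_gbytes_per_s(generation, lanes, encoding_efficiency)
```

Under these assumptions, a Gen4 x16 link could in principle move 64GB in about two seconds, compared with the several minutes cited for software tools.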
Atamli et al. investigated the suitability of the BlueField SmartNIC for live-memory forensics. BlueField SmartNIC is an advanced, programmable Ethernet SmartNIC equipped with an array of Arm processors and an integrated Mellanox ConnectX-5 network controller. It supports PCIe Gen3 and Gen4 and offers speedy access to host physical memory. For that investigation, a variant of the BlueField SmartNIC with 8 lanes was used. Atamli et al. extended the Volatility memory forensics framework to support live memory forensics from the BlueField SmartNIC. Volatility is a well-known open source framework used by malware researchers, forensics investigators, and incident response personnel, and it works with memory files. Given a memory image file, Volatility uses a Python application (vol.py) to extract information such as the process list, network connections, and kernel modules, which helps forensics investigators understand the footprint of the malware and its behavior. The framework allows developers and investigators to analyse host machines by looking at a dump of the memory. The new extension developed by Atamli et al. enables the Volatility framework, running on the BlueField SmartNIC Arm cores, to analyse malware in host physical memory. This allows live memory analysis by acquiring segments of the physical memory on demand. It’s important to note that in its normal mode Volatility works with memory files that can reach 64GB or 128GB. The extension instead acquires only the data needed for a specific purpose, such as building the process list.
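The idea of on-demand acquisition can be illustrated with a small sketch. This is not the extension by Atamli et al. and does not use Volatility's API: `OnDemandMemory`, `read_page`, and the simulated host memory are all hypothetical names introduced for illustration. The point is only that an analysis touching a few structures fetches a few pages rather than a full dump.

```python
PAGE_SIZE = 4096


class OnDemandMemory:
    """Fetches only the pages that are actually requested, caching each one."""

    def __init__(self, read_page):
        # read_page(page_number) -> bytes of length PAGE_SIZE; a hypothetical
        # stand-in for a DMA read of one host physical page.
        self._read_page = read_page
        self._cache = {}

    def read(self, address, length):
        """Return `length` bytes starting at physical `address`."""
        out = bytearray()
        end = address + length
        while address < end:
            page_no, offset = divmod(address, PAGE_SIZE)
            if page_no not in self._cache:
                self._cache[page_no] = self._read_page(page_no)
            take = min(PAGE_SIZE - offset, end - address)
            out += self._cache[page_no][offset:offset + take]
            address += take
        return bytes(out)

    @property
    def bytes_fetched(self):
        """Total bytes pulled from the host so far."""
        return len(self._cache) * PAGE_SIZE


# Simulated 1 MiB of "host memory" for demonstration purposes.
FAKE_HOST_MEMORY = bytes(i % 251 for i in range(1 << 20))


def fake_read_page(page_no):
    start = page_no * PAGE_SIZE
    return FAKE_HOST_MEMORY[start:start + PAGE_SIZE]
```

The page cache in this sketch trades freshness for fewer reads; a live-analysis tool would need to decide when cached pages must be refreshed, since host memory keeps changing.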
The new Volatility plugin connects to a memory access SDK that exposes the BlueField DMA capabilities. The SDK provides different ways of accessing memory to allow fast access and lower latency when acquiring data. The BlueField SmartNIC on-board memory allows copying data from host physical memory and analyzing it locally on the Arm cores, without the fear that it will be modified by the host. The following video demonstrates the Volatility framework running on BlueField Arm cores.
Attacks are getting stealthier and more complex, while current detection and prevention techniques lag far behind. Hardware-assisted data acquisition is considered the most reliable and trusted method to acquire data for analysis. BlueField enables hardware-assisted memory acquisition for securing today’s servers. It enables out-of-band intrusion detection and forensics investigation. When authorised, it allows speedy access to host physical memory while guarding security applications, such as an IDS, in an isolated environment. BlueField facilitates forensics investigations, incident response, malware detection, and intrusion detection systems.