Dead code elimination for PHP using the Tombs extension

October 2021, Jonathan Gruber

At work, I maintain a large PHP codebase. Currently, we are in the process of moving old user interfaces of this project to a more modern React frontend. This step enables the team to reduce technical debt by removing problematic code that is no longer needed. Deleting backend code is scary, however. If you remove a piece of code without being sure that it is, in fact, unused, you are in trouble.

Thankfully, finding dead code becomes straightforward with a concept called tombstones. Because it observes the program at runtime, this approach also catches dynamic function calls, which static code analysis cannot always resolve. The idea is simple:

  • add a tomb for every function in the program
  • whenever a function is called, vacate its tomb
  • after a reasonable amount of time, only the tombs of dead functions remain (provided that all code paths are followed eventually)
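To make the idea concrete, here is a toy model in plain shell. It is only a sketch of the concept, not how the extension works internally: the graveyard is a text file with one function name per line, and the calls are a second file of the same shape. The helper name remaining_tombs and both input files are made up for illustration.

```shell
# Toy model of the tombstone idea (hypothetical helper, not part of Tombs).
# $1: file listing every function in the program, one name per line (the graveyard)
# $2: file listing every function that was observed being called
# Prints the functions whose tombs are still populated, sorted by name.
remaining_tombs() {
  all_functions=$1
  called_functions=$2
  # -F: fixed strings, -x: match whole lines, -v: invert the match,
  # so only names that never appear in the call list survive.
  grep -Fxv -f "$called_functions" "$all_functions" | sort
}
```

Calling remaining_tombs with a list of all functions and a list of called functions prints the candidates for deletion. The longer the call list has been collected, the more trustworthy the remainder.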

There is a PHP extension called tombs by krakjoe that implements this mechanism. It uses Zend hooks to populate a graveyard at runtime where a tomb is created for every function the PHP engine constructs. It requires PHP 7.1 or later and a Unix-like OS. Let’s see how to install and use this extension.

Setup

The installation steps are as follows:

git clone https://github.com/krakjoe/tombs.git
cd tombs/
phpize
./configure
make
make install

If you are using the official PHP Docker image as we do, the setup is even easier, thanks to the helper scripts that come with this image. Add the following to your Dockerfile:

# Adjust paths to your needs
WORKDIR /tmp
RUN git clone https://github.com/krakjoe/tombs.git \
    && docker-php-ext-configure ./tombs \
    && docker-php-ext-install ./tombs \
    && rm -r ./tombs

# See below for configuration
COPY ./deployment/tombs.ini /usr/local/etc/php/conf.d/tombs.ini

After that, the extension can be loaded and configured in the tombs.ini file. The number of slots needs to be greater than the maximum number of functions in your application. Refer to the project’s README to learn more about the other configuration keys.

; /usr/local/etc/php/conf.d/tombs.ini
zend_extension  = tombs.so
tombs.slots     = 10000
tombs.strings   = 32M
tombs.socket    = /tmp/zend.tombs.socket
tombs.dump      = /tmp/zend.tombs.dump
tombs.namespace = "App"
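To pick a value for tombs.slots, a rough count of the functions in the codebase can help. The following is a minimal sketch, assuming a grep that supports --include and -o (GNU and BSD grep do, plain POSIX grep does not). The helper name estimate_function_count is hypothetical. It counts every occurrence of the word "function" in *.php files, including closures and mentions in comments, so it tends to overestimate, which is the safe direction here.

```shell
# Rough estimate of how many functions a codebase declares (hypothetical helper).
# $1: directory to scan recursively
# Counts each occurrence of the whole word "function" in *.php files.
estimate_function_count() {
  # -r: recurse, -w: whole-word match, -o: print every match on its own line
  grep -rwo --include='*.php' 'function' "$1" | wc -l
}

# Usage (the path is an example):
#   estimate_function_count /var/www/src
```

Treat the result as a starting point and leave generous headroom when setting tombs.slots.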

If all went well, you should see output similar to the following when you run php -v:

$ php -v
PHP 8.0.12 (cli) (built: Oct 20 2021 16:06:03) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.12, Copyright (c) Zend Technologies
    with Tombs v0.0.3-dev, Copyright (c) 2020, by krakjoe # <<<

PHP OPcache preloading

After initial testing, I noticed that Tombs does not seem to work together with PHP preloading. This mechanism loads your code into the opcache once, at engine startup, to improve the performance of subsequent requests. Consequently, I had to disable preloading while we were collecting data. Fortunately, preloading has no significant impact on the performance of our application anyway.

Usage

Provided the extension is loaded correctly and the application is running, you can read the current graveyard state at any time by connecting to the configured socket. For example, socat can accomplish this task:

socat - UNIX-CONNECT:/tmp/zend.tombs.socket

A background thread will then return the list of populated tombs. Every tomb object has the following shape:

{
  "location": {
    "file": "/var/www/src/Controller/FooController.php",
    "start": 68,
    "end": 71
  },
  "scope": "App\\Controller\\FooController",
  "function": "listBarAction"
}
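A snapshot of a large application can contain thousands of tombs, so a summary per file is a useful first look. The sketch below assumes the socket output was saved to a file first and that file paths contain no escaped double quotes. The helper name tombs_per_file and the snapshot path are hypothetical. It relies on grep -o, which GNU and BSD grep provide.

```shell
# Count the remaining tombs per source file in a saved snapshot (hypothetical helper).
# $1: file containing the JSON tombs read from the socket
# Prints "<count> <file>" lines, files with the most tombs first.
tombs_per_file() {
  # Pull out every "file": "..." pair, strip the key and quotes, then tally.
  grep -o '"file": *"[^"]*"' "$1" \
    | sed 's/^"file": *"//; s/"$//' \
    | sort | uniq -c | sort -rn
}

# Usage (paths are examples):
#   socat - UNIX-CONNECT:/tmp/zend.tombs.socket > /tmp/tombs.snapshot
#   tombs_per_file /tmp/tombs.snapshot
```

Files that appear near the top with most of their functions still entombed are good candidates for a closer look.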

After a sufficient amount of time, when you are confident that all valid code paths have been executed, the remaining tombs show which functions in your program are likely to be dead. Still, you should verify how plausible the results are before actually deleting code.
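One simple plausibility check is to search the codebase for other mentions of a supposedly dead function, for example in route definitions or string callbacks. This is a sketch of one possible check, not a feature of Tombs, and the helper name references_to is hypothetical. Hits other than the function's own definition suggest that a code path was simply not exercised during the observation period.

```shell
# List every line that mentions a function name as a whole word (hypothetical helper).
# $1: function name taken from a remaining tomb
# $2: directory to search recursively
# Exits non-zero when the name is not mentioned anywhere.
references_to() {
  # -r: recurse, -n: show line numbers, -w: whole-word match, -F: fixed string
  grep -rnwF -- "$1" "$2"
}

# Usage (name and path are examples):
#   references_to listBarAction /var/www/src
```

If the only hit is the definition itself, the function is a stronger candidate for removal.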