Why make (and ymake)

07 May 2021

In recent years many new build systems have emerged to replace traditional make. Although they all have different strengths and weaknesses, one common theme is improving execution performance of the build system.

Traditional multiprocessor make systems are generally efficient at scheduling tasks within a given Makefile. Given a dependency graph, child processes can be executed when their dependencies are satisfied and processor cores are available.
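As a rough illustration of that scheduling rule, here is a minimal Python sketch. It is a simplification rather than how any particular make is implemented: it assumes every task takes one unit of time and runs tasks in rounds, and the target names are hypothetical.

```python
def schedule(deps, cores):
    """Simulate make-style scheduling in rounds.

    deps maps each target to the set of targets it depends on.
    Each round runs at most `cores` targets whose dependencies are done.
    Returns the list of rounds, each a sorted list of targets.
    """
    remaining = {t: set(d) for t, d in deps.items()}
    done = set()
    rounds = []
    while remaining:
        # A target is ready once all of its dependencies have completed.
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("dependency cycle or missing target")
        batch = ready[:cores]  # no more tasks than available cores
        rounds.append(batch)
        for t in batch:
            del remaining[t]
        done.update(batch)
    return rounds

# Hypothetical graph: two objects feed one executable.
deps = {
    "a.obj": set(),
    "b.obj": set(),
    "app.exe": {"a.obj", "b.obj"},
}
```

With four cores the two objects compile in the same round and the link follows; with one core the same graph takes three rounds.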

Unfortunately, this breaks down where multiple Makefiles are used across different directories or modules. The choice that a build system maintainer faces becomes:

However, there's no reason that this choice is inevitable. If the full build can construct a single global dependency graph, but each directory still contains a partial dependency graph, a directory build can operate as well as traditional make and a full build can operate as efficiently as a newer build system. YMAKE is an attempt to build such a system, and to do so without requiring any changes to per-directory Makefiles.

Specifying the global graph requires a new syntax to logically graft a child directory into the dependency graph. The remainder of Makefile syntax does not need to change, and conditionals can be used to fall back to recursive navigation if a global dependency graph is not supported. This means developers do not need to install new build tools to compile, so the build still benefits from the ubiquity of make; having YMAKE simply makes the existing build system faster.

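DIRS=dir_a \
     dir_b

!IFDEF _YMAKE_VER
# YMAKE: graft each directory in DIRS into the global dependency graph
all[dirs target=all]: $(DIRS)
!ELSE
# Other make tools: fall back to recursing into each directory in turn
all:
    for %%i in ($(DIRS)) do cd %%i && make && cd ..
!ENDIF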

Rules within Makefiles are unchanged. A regular rule in a Makefile refers to relative file paths, for both targets and prerequisites. Building a global graph can be done by calculating the full path of each object relative to the Makefile referencing it, and building the graph against those full paths.
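To make the path calculation concrete, here is a small Python sketch, not YMAKE's actual code. It assumes rules have already been parsed into dictionaries, uses forward-slash paths for readability, and borrows the directory names from the example above; the file names are hypothetical.

```python
import posixpath

def merge_rules(makefiles):
    """Merge per-directory rules into one graph keyed by normalized paths.

    makefiles maps a directory (relative to the tree root) to that
    directory's rules. Each rule maps a target to its prerequisites,
    written relative to the directory as in an ordinary Makefile.
    """
    graph = {}
    for directory, rules in makefiles.items():
        for target, prereqs in rules.items():
            # Resolve each path against the directory of its Makefile.
            full_target = posixpath.normpath(posixpath.join(directory, target))
            full_prereqs = {
                posixpath.normpath(posixpath.join(directory, p)) for p in prereqs
            }
            graph.setdefault(full_target, set()).update(full_prereqs)
    return graph

# Hypothetical tree: dir_b links against a library built in dir_a.
makefiles = {
    "dir_a": {"common.lib": ["common.obj"]},
    "dir_b": {"tool.exe": ["tool.obj", "../dir_a/common.lib"]},
}
graph = merge_rules(makefiles)
```

After merging, the prerequisite written as `../dir_a/common.lib` in dir_b and the target written as `common.lib` in dir_a resolve to the same node, which is what joins the two partial graphs together.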

In addition to more efficient scheduling of jobs, this also means many fewer child processes. A conventional recursive make-based build system needs to invoke different instances of make for each makefile; here those processes are avoided. This is particularly significant on Windows where process creation is relatively expensive.

Time in seconds, 4 core i5    NMAKE (5 child processes)    YMAKE (5 child processes)
Full build (wall time)        14.58                        9.55
Full build (CPU time)         41.31                        36.08
Inferred CPU utilization      70.8%                        94.4%
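The table does not spell out how the inferred utilization was derived. One plausible reading, assumed here, is CPU time divided by the capacity available during the build (wall time multiplied by the number of cores). The sketch below applies that reading to the NMAKE full build; the function name is hypothetical.

```python
def inferred_utilization(cpu_seconds, wall_seconds, cores):
    """Fraction of total core capacity in use during the build.

    Assumes utilization = CPU time / (wall time * core count).
    """
    return cpu_seconds / (wall_seconds * cores)

# NMAKE full build from the table above, on a 4 core machine.
nmake_full = inferred_utilization(41.31, 14.58, 4)
print(f"{nmake_full:.1%}")  # prints 70.8%
```

Under this assumption the NMAKE figure in the table is reproduced. Other rows may not reproduce exactly from the rounded timings shown.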

NMAKE is not a natively multi-process build system, so these results used a separate process to schedule up to 5 child directories to be compiled at once. Unfortunately, the consequence is that single directory builds can only utilize a single child process. YMAKE addresses this, but note that other make systems such as GNU make address it too. Thus, single directory results primarily illustrate the limitations of NMAKE rather than the capabilities of YMAKE. In addition, single directory builds are frequently too small for any build engine to keep all cores busy, because there are more processor cores than schedulable tasks. With those caveats, single directory times are presented for completeness:

Time in seconds, 4 core i5      NMAKE (1 child process)    YMAKE (5 child processes)
Single directory (wall time)    1.14                       0.52
Single directory (CPU time)     1.21                       1.18
Inferred CPU utilization        26.5%                      54.8%

Having a global graph with per-directory Makefiles also simplifies the task of rebuilding a directory when dependencies have changed. Rather than needing to manually build different directories, a developer can build the targets in a single directory by evaluating the global graph and building any targets needed to build the requested directory. This requires specifying the global Makefile to build from and the target directory to build:

c:\src\github\yori\edit>ymake -f ..\Makefile .
mod_edit.obj
options.c
about.c
edit.c
yedit.exe
edit.c
builtins.lib
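The selection step can be sketched in Python as a walk over the global graph. This is an illustrative sketch rather than YMAKE's implementation: it assumes the graph is a dictionary from full target path to prerequisite paths, and all file names are hypothetical.

```python
def targets_for_directory(graph, directory):
    """Return every target that must be considered to build `directory`.

    graph maps a full target path to the set of full prerequisite paths.
    The result holds the targets inside the directory plus, transitively,
    every prerequisite of those targets that itself has a rule.
    """
    prefix = directory.rstrip("/") + "/"
    pending = [t for t in graph if t.startswith(prefix)]
    needed = set()
    while pending:
        target = pending.pop()
        if target in needed:
            continue
        needed.add(target)
        # Only prerequisites with their own rule need building;
        # plain source files are leaves of the graph.
        pending.extend(p for p in graph.get(target, ()) if p in graph)
    return needed

# Hypothetical global graph spanning three directories.
graph = {
    "dir_a/common.lib": {"dir_a/common.obj"},
    "dir_a/common.obj": {"dir_a/common.c"},
    "dir_b/tool.exe": {"dir_b/tool.obj", "dir_a/common.lib"},
    "dir_b/tool.obj": {"dir_b/tool.c"},
    "dir_c/other.exe": {"dir_c/other.obj"},
    "dir_c/other.obj": {"dir_c/other.c"},
}
needed = targets_for_directory(graph, "dir_b")
```

Requesting dir_b pulls in the library from dir_a that it links against, while nothing in dir_c is touched.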

Because the global graph takes more time to parse, this is slower than the single directory Makefile results above, but not prohibitively so:

Time in seconds, 4 core i5      YMAKE (global graph)    YMAKE (single Makefile)
Single directory (wall time)    0.70                    0.52
Single directory (CPU time)     1.37                    1.18
Inferred CPU utilization        48.9%                   54.8%

The build system in Yori uses preprocessor child processes to determine the capabilities of the compiler version and tailor compilation options, similar to autoconf. Unfortunately, this means that by default autoconf-style probes run on each invocation of make. In addition to build scheduling improvements, YMAKE allows these probes to be cached, so that repeated compilations do not need to check the environment. This cache is built using a hash of the environment, so if the compilation environment changes, the cache will be ignored automatically.
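The idea of keying a cache on a hash of the environment can be sketched as follows. This is an in-memory illustration under assumptions, not YMAKE's cache format: the choice of SHA-256, the class and function names, and the environment values are all hypothetical.

```python
import hashlib
import json

def environment_key(env):
    """Hash the environment so any change produces a different key."""
    canonical = json.dumps(sorted(env.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

class ProbeCache:
    """In-memory stand-in for a probe cache keyed by environment hash."""

    def __init__(self):
        self._results = {}

    def run(self, env, probe_name, probe):
        key = (environment_key(env), probe_name)
        if key not in self._results:
            # Cache miss: run the expensive probe and remember the result.
            self._results[key] = probe()
        return self._results[key]

calls = []

def probe_compiler():
    """Hypothetical probe; records each time it actually executes."""
    calls.append("probe")
    return "feature-supported"

cache = ProbeCache()
env = {"PATH": "C:\\tools", "INCLUDE": "C:\\sdk\\include"}
cache.run(env, "compiler", probe_compiler)      # runs the probe
cache.run(env, "compiler", probe_compiler)      # served from the cache
changed = dict(env, PATH="C:\\other")
cache.run(changed, "compiler", probe_compiler)  # new key, probe runs again
```

Because the key changes whenever the environment does, stale entries are never consulted and no explicit invalidation step is needed.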

Time in seconds, 4 core i5                    YMAKE (no cache)    YMAKE (cached probes)
Single directory (local graph, wall time)     0.52                0.32
Single directory (local graph, CPU time)      1.18                0.91
Inferred CPU utilization                      54.8%               72.0%
Single directory (global graph, wall time)    0.70                0.39
Single directory (global graph, CPU time)     1.37                1.00
Inferred CPU utilization                      48.9%               64.1%

Source for YMAKE is available on GitHub and binaries are distributed as part of Yori.