Experiments with binary trees in C
Introduction
This project records my experiments with various data structures such as binary heaps, red-black trees, etc. I have done my best to make the implementations “clean” under static and run-time analysis. A simple interpreter provides a framework for testing each data structure. An example application finds the intersections of line segments.
Contents
- binaryHeap.h contains the public interface for a binary heap type.
- list.h contains the public interface for a doubly linked list type.
- stack.h contains the public interface for a stack type.
- bsTree.h contains the public interface for a balanced tree type based on the pseudo-code in the paper by Andersson. See: “Balanced Search Trees Made Simple”, Arne Andersson, Proc. Workshop on Algorithms and Data Structures, pp. 60-71, 1993.
- redblackTree.h contains the public interface for a red-black tree type. See: “A dichromatic framework for balanced trees”, L. J. Guibas and R. Sedgewick, Proc. 19th Annual Symposium on Foundations of Computer Science, October, 1978. This implementation is a modified version of the macros in tree.h by Niels Provos.
- redblackCache.h contains the public interface for a cache implemented with a red-black tree.
- trbTree.h contains the public interface for a threaded red-black tree type.
- skipList.h contains the public interface for a skip list type. This implementation is based on jsw_slib.c by Julienne Walker. See: “Skip Lists: A Probabilistic Alternative to Balanced Trees”, William Pugh, Commun. ACM, June 1990, Vol. 33, No. 6, pp. 668-676. Walker’s source code appears to no longer be available on the interwebs. A modified version is used by speedtables.
- splayTree.h contains the public interface for a splay tree type. See: “Self-Adjusting Binary Search Trees”, D. D. Sleator and R. E. Tarjan, Journal of the Association for Computing Machinery, Vol. 32, No. 3, July 1985, pp. 652-686. This implementation is a modified version of the macros in tree.h by Niels Provos. A splay tree has the advantage that the most recently accessed entry is moved to the root of the tree. On the other hand, when entries are accessed in sorted order, the tree is rearranged into a list-like structure with a corresponding increase in access time from \(\mathcal{O}(\lg N)\) to \(\mathcal{O}(N)\). This splay tree implementation does not automatically rebalance the tree but does provide a splayTreeBalance() function to rebalance the tree when called by the user. See, for example, test/00/t0068a.sh.
- splayCache.h contains the public interface for a cache implemented with a splay tree.
- swTree.h contains the public interface for a binary tree that implements the rebalancing algorithm of Stout and Warren: “A simple algorithm is given which takes an arbitrary binary search tree and rebalances it to form another of optimal shape, using time linear in the number of nodes and only a constant amount of space (beyond that used to store the initial tree). This algorithm is therefore optimal in its use of both time and space.” See: “Tree Rebalancing in Optimal Time and Space”, Q. F. Stout and B. L. Warren, Communications of the ACM, September 1986, Volume 29, Number 9, pp. 902-908.
- sgTree.h contains the public interface for a binary tree that implements the scapegoat self-balancing binary search tree described by Galperin and Rivest. See: “Scapegoat Trees”, Igal Galperin and Ronald L. Rivest.
Design of the interpreter
The interpreter for each data structure contains:
- interp_lex.l, interp_yacc.y and interp_ex.c implement a modified version of a simple interpreter shown in “A Compact Guide to Lex & Yacc” by Thomas Niemann.
- interp_wrapper.h defines the operations that may or may not be implemented for each data structure. For example, create and destroy, operating on the entire data structure, or insert, delete, push, pop etc., operating on the entries in the data structure (a sketch of this idea follows this list).
- interp_callbacks.c defines interpreter call-back functions installed by the create operation. These call-back functions implement interpreter memory handling, a comparison operator and debugging messages.
- An implementation of the interpreter wrapper functions for each data structure. For example, binaryHeap_wrapper.c.
- A header file for the public interface of the data structure. For example, binaryHeap.h.
- An implementation of the data structure. For example, binaryHeap.c.
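As a rough illustration of this idea, the set of optional operations can be thought of as a table of function pointers, with operations that a particular data structure does not support left as NULL. The sketch below is an assumption for illustration only; the name wrapperOps_t and its fields are not the actual interp_wrapper.h interface.

/* Hypothetical sketch only: a table of optional interpreter operations.
 * A data structure with no stack-like behaviour would leave push and pop
 * as NULL and the interpreter would report those operations as unsupported. */
typedef struct wrapperOps_t
{
  void *(*create)(void);                  /* operate on the whole structure */
  void  (*destroy)(void *ds);
  void *(*insert)(void *ds, void *entry); /* operate on individual entries */
  void *(*delete)(void *ds, void *entry);
  void *(*push)(void *ds, void *entry);
  void *(*pop)(void *ds);
} wrapperOps_t;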
Notes on the data structure implementation:
- User data is passed to the library as a void pointer to an entry type defined by the user in interp_data.h. That type may be a simple key, a [key, value] pair etc. For simplicity with these examples, I have used size_t as the entry key only, with no value.
- The data structure implementation does not directly declare static or global variable data. If the duplicateEntry() and/or deleteEntry() call-back functions are defined, then they will be called when appropriate. (The former is called to replace an existing entry.)
- The comparison function used by insert, remove, find etc. looks at the contents of the user-defined type, not the value of the pointer (see the sketch after this list). This is simpler but has some disadvantages:
  - it requires repeated searches for entries
  - it does no validity checking on non-NULL pointers passed, for example, checking whether the pointer has been freed. That is up to the caller.
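As a minimal sketch of such a comparison call-back, assuming a key-only entry of type size_t as in these examples, the function below compares the keys that the two void pointers refer to rather than the pointer values themselves. The name compareEntry and the negative/zero/positive return convention are illustrative assumptions, not the interp_callbacks.c interface.

#include <stddef.h>

/* Hypothetical comparison call-back: dereference the void pointers and
 * compare the size_t keys they point to, not the pointer values. */
int compareEntry(const void *a, const void *b)
{
  const size_t ka = *(const size_t *)a;
  const size_t kb = *(const size_t *)b;
  if (ka < kb)
    {
      return -1;
    }
  if (ka > kb)
    {
      return 1;
    }
  return 0;
}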
Porting
Known porting problems:
- The interpreter assumes sizeof(size_t) == sizeof(void*) (see the check sketched below).
- Small differences in floating point results between compiling for 64-bit and 32-bit systems.
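A compile-time check of the first assumption can be written as a static assertion. This is a minimal sketch, not part of the project sources:

#include <stddef.h>

/* Fails to compile on any platform where a size_t cannot hold a
 * pointer-sized value, which is the assumption made by the interpreter. */
_Static_assert(sizeof(size_t) == sizeof(void *),
               "interpreter assumes sizeof(size_t) == sizeof(void*)");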
Building
This project uses GNU make to build executables. The generic project rules are in make.rules. A top-level Makefile includes a .mk makefile from each source directory. For example, here is the top-level Makefile:
SOURCE_DIRS=src docs
include make.rules
and here is src/list/list.mk:
list_PROGRAMS := list_interp
PROGRAMS += $(list_PROGRAMS)
VPATH += src/list
list_interp_C_SOURCES := list.c list_wrapper.c
list_interp_STATIC_LIBRARIES := interp.a
$(call add_extra_CFLAGS_macro,$(list_interp_C_SOURCES),-Isrc/interp)
The make.rules file defines compiler options for PORT=debug (debugging), PORT=yacc (flex/bison debugging), PORT=sanitize (address sanitiser), PORT=coverage, PORT=profile etc. The default options are for the release build.
Comments:
- I only suppress sanitizer and analyzer warnings from the generated interpreter files interp_lex.c and interp_yacc.c. However, compiling with -Wanalyzer-too-complex shows that -fanalyzer often bails out for other source files, for example with calls to the interpreter call-back functions like binaryHeap->debug(...).
- The generated file interp_lex.c gives null dereference warnings when compiled with -fanalyzer.
- The generated file interp_yacc.c gives run-time warnings when compiled with -fsanitize=bounds-strict.
- The address sanitiser build appears to give false positives when compiling with -O2 and -O3 optimisation.
Build the HTML version of the Doxygen documentation with:
make docs
Testing
I maintain librj with Peter Miller’s aegis source code control software. Each change is expected to pass the shell scripts in the test directory. These shell scripts run an interpreter test script for the data structure being tested. In addition, each test script is expected to run without warnings under the valgrind emulator or with a -fsanitize build. Run the test scripts under valgrind by, for example:
VALGRIND_CMD="valgrind --leak-check=full" sh test/00/t0010a.sh
Initial tests were hard-coded in C. Next, I implemented a Python wrapper for
the list class. The result was cumbersome and not valgrind clean. I now use a
modified version of a simple interpreter shown in “A Compact Guide to Lex &
Yacc” by Thomas Niemann.
The interpreter is built from the files src/interp/interp_lex.l and src/interp/interp_yacc.y with implementation details in src/interp/interp_ex.c. The directory for each data structure contains a “wrapper” file that is a “shim” between the implementation of the data structure and the interpreter. The following script shows the syntax for a test of redblackTree_interp:
# Test file
t=time(0);
l=create();
"Insert";
for(x=0; x<30; x=x+1;)
{
print x;
insert(l, x);
}
x = rand(1000);
"x=";print x;
insert(l, x);
x=depth(l);
x = 1234;
p = find(l, x);
if (p == 0)
{
"Didn't find "; print x;
}
else
{
print *p;
}
walk(l, show);
x=2;
p=find(l, x);
remove(l, &x);
x=0;
p=min(l);
while( x<size(l) ) {
x=x+1;
print *p;
p=next(l, p);
}
destroy(l);
uprint time(t);
exit;
Notes:
- 26 integer variables named a to z are allowed.
- Strings are contained within double quotes.
- # starts a comment.
- The interpreter uses size_t as the entry type. I assume void* and size_t have the same size.
- The & and * operators work as for C.
- print prints a signed value and uprint prints an unsigned value.
- rand(100) returns a random integer in the range 0 to 99.
- Failing to put a ; after the third statement in a for() is a syntax error.
Test coverage is determined with GNU gcov and the PORT=coverage build. For example, for list_t, test.in is from test/00/t0055a.sh:
$ make PORT=coverage list_interp
$ bin/list_interp < test.in
$ gcov -o obj -s list src/list/list.c
$ less list.c.gcov
Running all the hand-coded tests gave a test coverage of 85.53% for src/list/list.c and 92.72% for src/redblackTree/redblackTree.c.
To-do
- Add a memory pool to the test interpreter
- Re-design the interpreter to enable unwinding of the expression tree after a syntax error.
- Test coverage could be improved by modifying the test interpreter so that the test can choose to fail memory allocation (a possible approach is sketched below).
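A minimal sketch of one way the last item might work, assuming the interpreter routes entry allocation through an installable call-back; the names allocEntry and failAfter are hypothetical and not part of the current sources:

#include <stdlib.h>

/* Hypothetical allocation call-back: returns NULL (a simulated allocation
 * failure) once a test-controlled countdown reaches zero. */
static long failAfter = -1;     /* -1 means never fail */

void *allocEntry(size_t size)
{
  if (failAfter == 0)
    {
      return NULL;              /* simulate allocation failure */
    }
  if (failAfter > 0)
    {
      failAfter = failAfter - 1;
    }
  return malloc(size);
}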
Performance comparison of binary tree implementations
The interpreters for the bsTree_t, redblackTree_t, sgTree_t, skipList_t, splayTree_t and swTree_t implementations were benchmarked with the benchmark.sh shell script. The PC operating system was 6.1.12-200.fc37.x86_64 GNU/Linux. The CPU was an Intel i7-7700K 4.20GHz with 16GiB RAM. The compiler was Red Hat GCC 12.2.1-4 with options -O3 -flto. The valgrind version was 3.20.0. The binary files were stripped of debugging symbols. A simple test program shows sizeof(uintptr_t)==8. The preamble of the output from cg_annotate running on a cachegrind output file is:
I1 cache: 32768 B, 64 B, 8-way associative
D1 cache: 32768 B, 64 B, 8-way associative
LL cache: 8388608 B, 64 B, 16-way associative
The shell script benchmark_plot.sh extracts and plots the results. The benchmark results are plotted on log-log axes, with the x-axis being the number of keys inserted, N, and the y-axis being the result divided by N.
Performance comparison with random keys inserted
This section compares the performance of each tree implementation when the
keys inserted and deleted are both y=rand(k)
. The benchmark results for
execution time are shown in the following plot:
The benchmark results for memory allocation are shown in the following plot:
The benchmark results for instruction references are shown in the following plot:
The benchmark results for data reads are shown in the following plot:
The benchmark results for data writes are shown in the following plot:
Performance comparison with sorted keys inserted
This section compares the performance of each tree implementation when the
keys inserted are sorted, y=x
, and random keys are deleted, y=rand(k)
.
The benchmark results for execution time are shown in the following plot:
The benchmark results for memory allocation are shown in the following plot:
The benchmark results for instruction references are shown in the following plot:
The benchmark results for data reads are shown in the following plot:
The benchmark results for data writes are shown in the following plot:
Comments on the performance comparison
- I have not attempted to extract the contribution of the interpreter (particularly interpRand()) to these test results. Scanning the output of gprof suggests that the interpreter contribution can be neglected for tests with more than 10000 entries.
- These tests do not consider the effect of a larger “real-world” executable with more I cache misses.
- The binary tree implementations variously use recursive and non-recursive insert and/or remove operations.
- For inserting random keys, the “knee” in execution time at about N=100000 is most likely due to the size of the LL cache.
- The values of depth_factor for the swTree_t and alpha for the sgTree_t have been set to values that avoid regular rebalancing of these trees with insertion of random keys. Rebalancing these trees is an \( \mathcal{O}(N)\) operation that must be done sufficiently rarely that the amortised cost is \( \mathcal{O}(\lg N)\). The Galperin and Rivest paper shows results for \( 0.55 \le \alpha \le 0.75 \). \( \alpha = 0.4 \) gives good results with random keys but fails with sorted inserted keys.
- I do not rebalance the splayTree_t. It appears that this is unnecessary with the random and sorted data used in this benchmark.
- I do not attempt to pack the tree structures. For example, the single-bit redblackTreeColour_e occupies 4 bytes and the redblackTreeNode_t is aligned so that it occupies 40 bytes (see the sketch after this list).
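The following sketch illustrates that padding. The type and field names are illustrative assumptions rather than the actual redblackTree.c declarations; the point is only that, on a typical 64-bit system, a 4-byte enum stored alongside 8-byte pointers leaves 4 bytes of padding, so the node occupies 40 bytes.

#include <stdio.h>

/* Hypothetical node layout: the three tree pointers and the entry pointer
 * take 32 bytes, the enum takes 4 bytes, and aligning the struct to 8 bytes
 * adds 4 bytes of padding, giving sizeof == 40 on a typical 64-bit system. */
typedef enum { rbRed = 0, rbBlack = 1 } rbColour_e;

typedef struct rbNode_t
{
  struct rbNode_t *left;
  struct rbNode_t *right;
  struct rbNode_t *parent;
  void *entry;
  rbColour_e colour;            /* one bit of information in 4 bytes */
} rbNode_t;

int main(void)
{
  printf("sizeof(rbNode_t) = %zu\n", sizeof(rbNode_t));
  return 0;
}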
An example: Finding line segment intersections
intersectList_t
implements the algorithm for finding intersections of
line segments from Chapter 2 of “Computational Geometry: Algorithms and
Applications”, Second Edition, De Berg et al. Springer-Verlag,
ISBN 3-540-65620-0.
Given a list of line segments, the endpoints are stored as “events” in the event queue, a balanced tree ordered by position. The “upper” endpoints are stored with a list of segments sharing that upper endpoint. Events are considered in order. Segments that lie on the current event sweep line are stored in the status tree, a balanced tree ordered by intersection with the sweep line and segment slope. The status tree is initially empty.
For each event:
- Find all segments stored in the status tree that contain the event; they are adjacent. Let lower(event) denote the subset of segments found whose lower endpoint is event, let interior(event) denote the subset of segments found that contain event in their interior, and let upper(event) denote the subset of segments whose upper endpoint is event. Note that, by design, segments in upper(event) cannot appear in the status tree since they have not been encountered yet.
- If upper(event)+lower(event)+interior(event) contains more than one segment then create an intersection point for event.
- Delete the segments in lower(event)+interior(event) from the status tree.
- Insert the segments in upper(event)+interior(event) into the status tree. The order of the segments in the status tree should correspond to the order in which they are intersected by a sweep line just below event. If there is a horizontal segment, it comes last among all segments containing event. (Deleting and re-inserting the segments of interior(event) reverses their order.)
- If upper(event)+interior(event) is empty then:
  - let sl and sr be the left and right neighbours of event in the status tree
  - check for a new event point at the intersection of sl and sr
  otherwise:
  - let s' be the leftmost segment of upper(event)+interior(event) in the tree and sl be the left neighbour of s' in the tree
  - check for a new event point at the intersection of s' and sl
  - let s'' be the rightmost segment of upper(event)+interior(event) in the tree and sr be the right neighbour of s'' in the tree
  - check for a new event point at the intersection of s'' and sr.
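Each “check for a new event point” step reduces to intersecting a pair of line segments. The following is a self-contained sketch of that pairwise test under simplifying assumptions (it is not the intersectList_t code, and it deliberately ignores collinear overlap and floating-point tolerance):

#include <stdio.h>
#include <stdbool.h>

typedef struct { double x, y; } point_t;
typedef struct { point_t a, b; } segment_t;

/* Compute the intersection of segments s and t, if any, using the parametric
 * form a + u*(b - a). Parallel and collinear segments are reported as not
 * intersecting in this simplified sketch. */
bool segmentIntersection(segment_t s, segment_t t, point_t *p)
{
  double rx = s.b.x - s.a.x, ry = s.b.y - s.a.y;
  double qx = t.b.x - t.a.x, qy = t.b.y - t.a.y;
  double denom = rx * qy - ry * qx;        /* cross product of directions */
  if (denom == 0.0)
    {
      return false;                        /* parallel or collinear */
    }
  double dx = t.a.x - s.a.x, dy = t.a.y - s.a.y;
  double u = (dx * qy - dy * qx) / denom;  /* position along s */
  double v = (dx * ry - dy * rx) / denom;  /* position along t */
  if (u < 0.0 || u > 1.0 || v < 0.0 || v > 1.0)
    {
      return false;                        /* crossing lies off a segment */
    }
  p->x = s.a.x + u * rx;
  p->y = s.a.y + u * ry;
  return true;
}

int main(void)
{
  segment_t s = { { 0.0, 0.0 }, { 2.0, 2.0 } };
  segment_t t = { { 0.0, 2.0 }, { 2.0, 0.0 } };
  point_t p;
  if (segmentIntersection(s, t, &p))
    {
      printf("intersection at (%g, %g)\n", p.x, p.y);
    }
  return 0;
}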
The test program for intersectList_t, in file intersectList_test.c, reads and writes PostScript files. The test program intersectList_random_test.c generates a random list of line segments and writes a PostScript file. The following figure shows an example of the output from intersectList_random_test.c:
About this page
This page was generated by the Jekyll static site generator using the Poole theme by Mark Otto. The equations were rendered by MathJax. The original favicon image is here.