2017-08-29

This post is part of a series of more technically oriented posts about the inner workings and design choices of Top ! I'm writing this as a way to think aloud, to analyze the successes and failures of my development process, and to remember things. If you're not interested in programming you can save time by gently skipping over this one. But if you like technical conversations and monospace fonts, maybe you will enjoy what's going on here...



In a previous post I mentioned the fact that, although Top ! is being developed in a handmade style, it still has some dependencies on external libraries, the biggest (and most cumbersome) being Cairo. Cairo is an open source 2D vector graphics library that you can find at www.cairographics.org. It has an easy to use API and does the job for painting simple images. It served well as a graphics backend when I started to write the GUI system of Top !, as it allowed me to dive right into my application-specific logic rather than having to first write a vector graphics engine from scratch. So in a sense it was a good option to get started.

However, it's beginning to show its limits in the context of an immediate mode GUI and I intend to replace it in the near future. Here I expand a bit on the work overhead involved in managing the dependencies of Top !, and the performance issues I encountered with Cairo.

Embedding Cairo dependencies

Cairo is quite complete and supports many backends and options. I only use a limited subset of its functionality, exposed to Top ! through a very simple wrapper. You can turn off unused options when building it manually, so I passed the following flags to the configure script :

./configure --disable-xlib --disable-xlib-xrender --disable-quartz --disable-xcb --disable-cairoscript --disable-postscript --disable-pdf --disable-svg --disable-png --disable-xcb-shm
The script complains that you really really should enable these features, but does not tell you why.

Now when we build Top ! it's linked against our installed lib, but when moved to another computer and loaded, it will first try to load the dynamic library from an absolute path stored in the executable. You can see the dynamic libraries an executable depends on by running the following command :

otool -L Top.app/Contents/MacOS/top
which produced, among other (system) libraries, the following line :
/usr/local/lib/libcairo.2.dylib (compatibility version 11509.0.0, current version 11509.0.0)
Since we can't make the assumption that Cairo will be installed on a user's system, and since we want to use the lightweight custom build of the library, this dependency has to be embedded in the application bundle. We can change the path stored in the executable to a path relative to the executable by running the following command :
install_name_tool -change /usr/local/lib/libcairo.2.dylib @loader_path/../Resources/libcairo.2.dylib Top.app/Contents/MacOS/top
We then have to copy our custom build of libcairo.2.dylib into Top.app/Contents/Resources. When a user runs Top.app, the library will be searched first in the Resources directory of the bundle and our version will be found. However, Cairo itself depends on a few other libraries, which themselves depend on other libraries, and we have to embed them all ! Our compiled version actually depends on 7 other libraries.

So I wrote a simple script to print the dependencies recursively :

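#!/bin/bash
# Print the non-system dylib dependencies of the file given as first argument, recursively.
function pdeps {
    if [ -n "$1" ] ; then echo "$1" ; else return ; fi
    # Keep only paths containing /local/, which drops system libraries and version info.
    otool -L "$1" | grep -o '.*/local/.*\.dylib' | while read -r dep ; do
        # otool also lists the file itself, so skip it to avoid infinite recursion.
        if [ "$dep" != "$1" ] ; then pdeps "$dep" ; fi
    done
}
pdeps "$1"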
The script simply runs otool recursively to get the dependencies of the executable given as first argument. grep is used to filter out system libraries and version information. I then use the following script to embed the libraries installed on my development machine in the application bundle :
#!/bin/bash
scripts/printdeps Top.app/Contents/MacOS/top | sort | uniq | while read line ; do
    base=$(basename $line)
    cp $line Top.app/Contents/Resources/$base
    install_name_tool -change "$line" "@loader_path/../Resources/$base" Top.app/Contents/MacOS/top
    scripts/printdeps $line | sort | uniq | while read dep ; do
        depbase=$(basename $dep)
        install_name_tool -change "$dep" "@loader_path/../Resources/$depbase" Top.app/Contents/Resources/$base
    done
done
It runs the previous script on the top executable and pipes its output to sort and uniq to eliminate duplicates. For each dependency, it copies the dylib file to the Resources folder of the application bundle, changes the load path stored in the application's executable to a relative path, and also changes the load paths of the dylib file to be relative paths. This script can then be called from the makefile of Top !

After these steps I could move the executable to another computer and run the application without having to install all the dependencies. But, there was another issue : the dynamic link editor complained about a missing symbol ___sincos_stret !

It turned out that this is a clang optimization happening when the code calls sin and cos functions with the same argument : these calls are replaced by a call to __sincos_stret, which computes the sine and cosine together and returns them in a struct. Problem is, this symbol does not exist on OSX 10.8, which happened to be the OS version of my 'test' machine, and appeared on OSX 10.9, which was the OS version of my development machine.

The solution to this issue was to re-compile all the libraries with minimum macosx version set to 10.8, which prevents the use of the ___sincos_stret optimization. It can be done by invoking the configure script this way :

CC="clang -mmacosx-version-min=10.8" ./configure
or, for libraries installed via macports, by adding these lines to /opt/local/etc/macports/macports.conf :
mac_osx_deployment_target 10.8
MAC_OSX_DEPLOYMENT_TARGET 10.8
The application could then be run on my test machine. Phew ! Although this work only has to be done once, it points out the presence of potential dependency issues between different libraries and OS versions, which might arise over the course of the development of Top !

Performance issues

The GUI of Top ! updates on every user input, as well as at a fixed rate if no input is present, to handle counters, progress bars, matrix automation etc. The GUI is in immediate mode, meaning it's composed of function calls defining widgets which test the current input event, update their visible aspect and return some info, depending on the type of widget. For instance, a button would be updated as follows :

if(GuiDoTextButton(gui, "start", MakeRect(x, y, width, height), "Start"))
{
    // start the currently selected cue
}
There is very little retained state : some information may be cached by the widgets to speed up refreshing, but there is no need to explicitly synchronize state between widgets and application, or even to know which state is cached. The button is only passed a GUI context, a unique identifier, its visible/clickable rectangle, and its text label. It returns true if the user has clicked the button since the last GUI update cycle, and false otherwise.
(You can find more about the idea of immediate mode GUI here)
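As an illustration of how such a call can work internally, here is a minimal sketch of an immediate mode button. Only the GuiDoTextButton and MakeRect names and the argument order come from the snippet above ; the GuiContext fields, the RectContains helper and the click logic are assumptions made for this example, and the real implementation in Top ! may differ in every detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct { float x, y, w, h; } Rect;

/* Hypothetical context: a snapshot of the input for the current update
   cycle, plus the only retained state, the id of the pressed widget. */
typedef struct {
    float mouse_x, mouse_y;
    bool mouse_down;        /* mouse button state for this cycle */
    bool mouse_was_down;    /* mouse button state at the previous cycle */
    const char *active_id;  /* widget that captured the press, or NULL */
} GuiContext;

Rect MakeRect(float x, float y, float w, float h)
{
    Rect r = { x, y, w, h };
    return r;
}

bool RectContains(Rect r, float x, float y)
{
    return x >= r.x && x < r.x + r.w && y >= r.y && y < r.y + r.h;
}

/* Returns true once, on the cycle where a click on the button completes. */
bool GuiDoTextButton(GuiContext *gui, const char *id, Rect rect, const char *label)
{
    assert(gui && id && label);

    bool hovered  = RectContains(rect, gui->mouse_x, gui->mouse_y);
    bool pressed  = gui->mouse_down && !gui->mouse_was_down;
    bool released = !gui->mouse_down && gui->mouse_was_down;
    bool clicked  = false;

    if (hovered && pressed) {
        gui->active_id = id;    /* this button captures the press */
    }
    if (released && gui->active_id && strcmp(gui->active_id, id) == 0) {
        clicked = hovered;      /* a click needs press and release on the button */
        gui->active_id = NULL;
    }

    /* A real widget would also emit drawing commands for its rectangle
       and label here; this sketch only handles the input side. */
    (void)label;
    return clicked;
}
```

The application simply calls the function on every update cycle and reacts when it returns true, so there is no callback to register and no widget object to keep in sync.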

The GUI doesn't directly issue draw calls, but rather fills a buffer with drawing commands, which are later passed to the renderer. The GUI also maintains a list of dirty rectangles, and each widget can add dirty rectangles to signal to the GUI system that it has been interacted with and needs to be redrawn. Drawing can then be clipped to the dirty areas to avoid unnecessary redrawing on each update cycle.
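The following sketch shows one possible shape for such a command buffer with dirty rectangles. All the names (GuiFrame, PushCommand, MarkDirty, CountCommandsToDraw) and the fixed-size arrays are assumptions chosen for the example, and the renderer is reduced to counting the commands it would have to execute.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int x, y, w, h; } Rect;

typedef enum { CMD_FILL_RECT, CMD_TEXT } DrawCommandKind;
typedef struct { DrawCommandKind kind; Rect bounds; } DrawCommand;

enum { MAX_COMMANDS = 256, MAX_DIRTY = 32 };

/* Hypothetical per-cycle state: recorded commands and dirty areas. */
typedef struct {
    DrawCommand commands[MAX_COMMANDS];
    int command_count;
    Rect dirty[MAX_DIRTY];
    int dirty_count;
} GuiFrame;

void BeginFrame(GuiFrame *f)
{
    f->command_count = 0;
    f->dirty_count = 0;
}

/* Widgets record what they want drawn instead of drawing immediately. */
void PushCommand(GuiFrame *f, DrawCommandKind kind, Rect bounds)
{
    assert(f->command_count < MAX_COMMANDS);
    f->commands[f->command_count].kind = kind;
    f->commands[f->command_count].bounds = bounds;
    f->command_count++;
}

/* Widgets flag the area that changed since the last cycle. */
void MarkDirty(GuiFrame *f, Rect r)
{
    assert(f->dirty_count < MAX_DIRTY);
    f->dirty[f->dirty_count++] = r;
}

bool RectsIntersect(Rect a, Rect b)
{
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

/* Renderer side: only commands overlapping a dirty rectangle need work. */
int CountCommandsToDraw(const GuiFrame *f)
{
    int count = 0;
    for (int i = 0; i < f->command_count; i++) {
        for (int j = 0; j < f->dirty_count; j++) {
            if (RectsIntersect(f->commands[i].bounds, f->dirty[j])) {
                count++;
                break;  /* count each command once */
            }
        }
    }
    return count;
}
```

A real renderer would additionally set the clip region to the dirty rectangles before executing the selected commands, so that partially overlapping widgets are only repainted where needed.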

This strategy kept the time spent to refresh the GUI acceptable, but the amount of time needed to actually draw is still high compared to the basic needs of the GUI (i.e. rounded rectangles, some text labels, and some color gradients). At first the hotspots were clearly in the text rendering and gradient creation functions, so as these elements don't change frequently in the GUI, I decided to cache them as images and just paint the image when it exists and doesn't need to be updated.
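The caching idea can be sketched as follows, assuming a per-widget cache entry that remembers which text the image was rendered from. CachedLabel, RenderTextToImage and GetLabelImage are hypothetical names, and the expensive rendering is replaced by a counter so that the effect of the cache is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { CACHE_TEXT_MAX = 64 };

/* Hypothetical cache entry owned by a widget. */
typedef struct {
    bool valid;
    char text[CACHE_TEXT_MAX];  /* content the cached image was rendered from */
    int image_id;               /* stand-in for a handle to the cached image */
} CachedLabel;

int render_count = 0;           /* how many times the expensive path ran */

/* Stand-in for the expensive text rendering; returns a new image handle. */
int RenderTextToImage(const char *text)
{
    (void)text;
    return ++render_count;
}

/* Returns the cached image, re-rendering only if the text has changed. */
int GetLabelImage(CachedLabel *cache, const char *text)
{
    assert(cache && text);

    if (!cache->valid || strcmp(cache->text, text) != 0) {
        cache->image_id = RenderTextToImage(text);
        strncpy(cache->text, text, CACHE_TEXT_MAX - 1);
        cache->text[CACHE_TEXT_MAX - 1] = '\0';
        cache->valid = true;
    }
    return cache->image_id;
}
```

On most update cycles the label is unchanged, so the widget only pays for a string comparison and an image paint instead of a full text layout.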

Still, the cairo_paint, cairo_fill, cairo_stroke and the final commit to the screen seemed to take too much time. That was not very surprising, given that I used the cairo image backend (that is, the software renderer).
Cairo supports a Quartz backend. Quartz is the native 2D graphics API for OSX, so one could imagine that this backend would be faster than the pure software renderer. But it appears to be surprisingly slower depending on the display, due to Cairo not being very good at handling different color spaces, which can incur extra conversion overhead. Moreover, if you touch the coordinate system of Quartz before drawing, even for transformations as simple as flipping an axis, it will take a slow path and resample the whole surface, rather than using an accelerated version of its drawing functions. It will also hit that slow path if you translate the current drawing position by a non pixel-aligned quantity, or if you set any scale factor different from 1 for one of the axes.
Cairo also has an experimental OpenGL backend, but it is not documented and seems not to be supported on OSX...
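The conditions described above can be written down as a small predicate on an affine transform. This is only a restatement of the observations in this post, not documented Quartz behavior, and the Affine type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 2D affine transform:
   x' = a*x + c*y + tx ;  y' = b*x + d*y + ty */
typedef struct { double a, b, c, d, tx, ty; } Affine;

bool IsWholeNumber(double v)
{
    return v == (double)(long long)v;
}

/* True if the transform is a pure pixel-aligned translation, i.e. no flip,
   no scale factor different from 1, no rotation, and integer offsets. */
bool StaysOnFastPath(Affine m)
{
    bool unit_scale_no_flip =
        m.a == 1.0 && m.d == 1.0 && m.b == 0.0 && m.c == 0.0;
    return unit_scale_no_flip && IsWholeNumber(m.tx) && IsWholeNumber(m.ty);
}

/* Round a coordinate to the nearest whole pixel (halves away from zero). */
double SnapToPixel(double v)
{
    assert(v > -1e15 && v < 1e15);  /* keep the integer cast in range */
    return (double)(long long)(v < 0.0 ? v - 0.5 : v + 0.5);
}
```

Snapping translations is cheap, but flips and scale factors cannot be rounded away, so the GUI's own coordinate conventions would have to match the backend's.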

I thought I could write a Quartz wrapper that would handle the transformation and colorspace conversion issues, and to get some idea of what I could expect, I wrote a minimal wrapper for a test application. This application draws random gradient-coloured rounded rectangles and measures the average time needed to perform this operation.
Starting with a hundred rectangles of 200x100 pixels, and increasing the width, gives these results :

Width (px)    Quartz average time (s)    Cairo average time (s)
100           0.026047                   0.032158
200           0.041196                   0.053368
400           0.079882                   0.098137
800           0.14968                    0.179297
1600          0.284872                   0.316584

When keeping the dimension constant (200x200) and increasing the number of rectangles, the results follow the same pattern :

Rectangle count    Quartz average time (s)    Cairo average time (s)
100                0.041843                   0.054118
200                0.080495                   0.103502
400                0.161196                   0.200229
800                0.316555                   0.393465
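To compare the two columns, the fraction of time saved by the Quartz wrapper relative to Cairo can be computed for each row as 1 - quartz/cairo and then averaged. The sketch below does this with the measurements copied from the two tables above ; the function and array names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of time saved by Quartz relative to Cairo for one measurement. */
double TimeSaved(double quartz_s, double cairo_s)
{
    assert(cairo_s > 0.0);
    return 1.0 - quartz_s / cairo_s;
}

/* Average of the per-row savings over n measurements. */
double MeanTimeSaved(const double *quartz_s, const double *cairo_s, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += TimeSaved(quartz_s[i], cairo_s[i]);
    }
    return sum / (double)n;
}

/* Measurements copied from the tables above (seconds). */
const double quartz_by_width[] = { 0.026047, 0.041196, 0.079882, 0.14968, 0.284872 };
const double cairo_by_width[]  = { 0.032158, 0.053368, 0.098137, 0.179297, 0.316584 };
const double quartz_by_count[] = { 0.041843, 0.080495, 0.161196, 0.316555 };
const double cairo_by_count[]  = { 0.054118, 0.103502, 0.200229, 0.393465 };
```

Calling MeanTimeSaved(quartz_by_width, cairo_by_width, 5) gives about 0.17, and MeanTimeSaved(quartz_by_count, cairo_by_count, 4) gives about 0.21. The individual rows range from about 10% (1600 px wide rectangles) to about 23%, so the advantage seems to shrink as the rectangles get larger.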

Future plans to replace Cairo

The Quartz wrapper gives a performance gain of roughly 20%. Changing the coordinate system of the GUI to match Quartz, and rounding all displacements to integer coordinates would have been feasible (although rather painful), but it seems difficult to avoid scaling at some point, given we need to oversample our images by a factor of 2 to get a nice look on Retina displays. Quartz is still mainly software rendered and would not benefit from acceleration. Moreover, it would not be portable, and the problem would arise again for future ports. That's why for now, rather than spending time to complete my simple Quartz wrapper and adapt the GUI, I fell back to Cairo. But I strongly feel that I will eventually have to switch to a more customizable and optimizable solution with fewer dependencies.

The obvious candidate for now would be OpenGL, which is accelerated and portable, and present on virtually any system I could think to target. But I will have to write a simple 2D graphics API on top of it. Given the primitives I need are rather simple (rounded rectangles, some text, some color gradients), it seems like the way to go, but for now I simply have not had the time to explore much in that direction, being busier developing the specific parts of Top ! So, this story is to be continued !