Thursday, April 23, 2009

teeterl's performance; more than modest

Every compiler or runtime system should come with an all-encompassing test suite. Not so for teeterl. There used to be no test suite at all, and the only indication that the system was working was that it could compile itself and produce a still-working copy. Many aspects of the system were thus left out of the spotlight. Binary construction and matching is one example, because the compiler does little of either when producing code.

Then teeterl was released to the public and I decided to clean it up a bit, to expel the many bugs still haunting dark corners of the system. The OTP test_server and emulator test suites came in handy here.

Today teeterl runs many test suites from the OTP emulator test set. Of course, many test suites turned out not to be relevant to teeterl, such as those for dynamic driver loading or port commands; these are implemented in teeterl in a very unorthodox way. Others, such as the binary and list manipulation tests, helped weed out a few overlooked bugs.

One of the test suites was estone_SUITE. It is a performance test suite, and the rest of the post is devoted to its output for teeterl. I would not have published the results had Richard Feynman's address on cargo cult science not convinced me to do so. So here are all the results of the estone tests, even those that do not speak in teeterl's favor at all.


Run #1. teeterl (not optimized, i.e. built with the -ggdb option to gcc)

**** CPU speed UNKNOWN MHz ****
**** Total time 106.752 seconds ****
**** ESTONES = 9612 ****

    Title                            Millis        Estone

list manipulation                    2522            602
small messages                       38336           80
medium messages                      38806           156
huge messages                        860             576
pattern matching                     1234            628
traverse                             1673            296
Work with large dataset              608             459
Work with large local dataset        441             633
Alloc and dealloc                    184             672
Bif dispatch                         244             3176
Binary handling                      2698            183
Generic server (with timeout)        16953           148
Small Integer arithmetics            750             372
Float arithmetics                    141             219
Function calls                       631             1228
Timers                               672             184

Run #2. teeterl (fully optimized, -O3 -fast options to gcc)

**** CPU speed UNKNOWN MHz ****
**** Total time 82.6701 seconds ****
**** ESTONES = 12654 ****

    Title                            Millis        Estone

list manipulation                    1826            831
small messages                       28284           109
medium messages                      28563           212
huge messages                        623             796
pattern matching                     937             827
traverse                             1061            467
Work with large dataset              607             459
Work with large local dataset        763             365
Alloc and dealloc                    127             972
Bif dispatch                         184             4210
Binary handling                      5901            84
Generic server (with timeout)        12354           203
Small Integer arithmetics            459             607
Float arithmetics                    100             309
Function calls                       398             1946
Timers                               481             257

Run #3. Erlang/OTP, emulator version 5.6.5

**** CPU speed UNKNOWN MHz ****
**** Total time 2.697552 seconds ****
**** ESTONES = 69058 ****

    Title                            Millis        Estone

list manipulation                    292             5204
small messages                       594             5218
medium messages                      1068            5687
huge messages                        315             1573
pattern matching                     104             7456
traverse                             243             2043
Work with large dataset              -490            -569
Work with large local dataset        -634            -439
Alloc and dealloc                    56              2227
Bif dispatch                         36              21695
Binary handling                      228             2171
Generic server (with timeout)        461             5447
Small Integer arithmetics            127             2192
Float arithmetics                    17              1820
Function calls                       118             6572
Timers                               163             761
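As a sanity check on the three tables above, the ESTONES total of each run can be compared with the sum of its Estone column. The following is a minimal sketch in Python; the variable names are mine, and the numbers are simply copied from the tables.

```python
# Estone columns copied row by row from the three runs above,
# paired with the ESTONES total each run reported.
RUNS = {
    "Run #1 teeterl, not optimized": (
        [602, 80, 156, 576, 628, 296, 459, 633,
         672, 3176, 183, 148, 372, 219, 1228, 184],
        9612,
    ),
    "Run #2 teeterl, optimized": (
        [831, 109, 212, 796, 827, 467, 459, 365,
         972, 4210, 84, 203, 607, 309, 1946, 257],
        12654,
    ),
    "Run #3 Erlang/OTP": (
        [5204, 5218, 5687, 1573, 7456, 2043, -569, -439,
         2227, 21695, 2171, 5447, 2192, 1820, 6572, 761],
        69058,
    ),
}

# Map each run to (sum of its Estone column, reported ESTONES total).
totals = {name: (sum(scores), reported) for name, (scores, reported) in RUNS.items()}

for name, (computed, reported) in totals.items():
    print(f"{name:32s} sum={computed:6d} reported={reported:6d}")
```

The sums match the reported totals in all three runs. For Run #3 that means the two negative "large dataset" rows are counted as they are, so they pull the Erlang/OTP total down rather than being skipped.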


It looks like teeterl is far slower than out-of-the-box Erlang/OTP even with all C compiler optimizations enabled. The results are especially startling for message passing: teeterl takes nearly 50x longer to pass small messages and about 27x longer for medium ones. The generic server and binary handling tests are about 26x slower as well, while for most other tests the figure is between 2x and 9x. Something is very, very wrong with teeterl's message passing. One possibility is that a process yields to the scheduler every time it sends a message. This should not be done.
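The per-test slowdown can be recomputed from the Millis columns of Run #2 and Run #3. Here is a minimal sketch in Python; the names `RUN2_TEETERL`, `RUN3_OTP` and `slowdown` are my own, and I leave out the two "large dataset" rows because Run #3 reports negative times for them, which makes a ratio meaningless.

```python
# Millis columns copied from Run #2 (teeterl, optimized) and Run #3 (Erlang/OTP).
# The two "large dataset" rows are omitted: Run #3 reports negative times there.
RUN2_TEETERL = {
    "list manipulation": 1826,
    "small messages": 28284,
    "medium messages": 28563,
    "huge messages": 623,
    "pattern matching": 937,
    "traverse": 1061,
    "Alloc and dealloc": 127,
    "Bif dispatch": 184,
    "Binary handling": 5901,
    "Generic server (with timeout)": 12354,
    "Small Integer arithmetics": 459,
    "Float arithmetics": 100,
    "Function calls": 398,
    "Timers": 481,
}

RUN3_OTP = {
    "list manipulation": 292,
    "small messages": 594,
    "medium messages": 1068,
    "huge messages": 315,
    "pattern matching": 104,
    "traverse": 243,
    "Alloc and dealloc": 56,
    "Bif dispatch": 36,
    "Binary handling": 228,
    "Generic server (with timeout)": 461,
    "Small Integer arithmetics": 127,
    "Float arithmetics": 17,
    "Function calls": 118,
    "Timers": 163,
}


def slowdown(teeterl_ms, otp_ms):
    """Return {test: teeterl time / OTP time} for tests present in both runs."""
    return {
        name: teeterl_ms[name] / otp_ms[name]
        for name in teeterl_ms
        if name in otp_ms
    }


ratios = slowdown(RUN2_TEETERL, RUN3_OTP)

# Print the tests from worst to best ratio.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:32s} {ratio:5.1f}x")
```

Sorting by ratio puts small messages at the top and huge messages at the bottom, which fits the suspicion that the cost is paid per send rather than per byte.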

I do not wear hats, but if I did I would definitely take mine off to the Erlang/OTP team, who produced such a fast system.
