RichardUK
Posts: 235
Joined: Fri Jun 01, 2012 5:12 pm

Why do I not see 2x speed up in sysbench?

Wed Jun 26, 2019 10:44 pm

The blog post says "yielding performance increases over Raspberry Pi 3B+ of between two and four times, depending on the benchmark." when talking about moving to the newer CPU arch. And yet testing the CPU alone with sysbench shows only a 15% speed-up. confused.com :?

Is sysbench that bad???

I have an RPi4 and an RPi3 B+ overclocked to 1.5 GHz, the same as the RPi4. Both have heat sinks and fans.

For the RPi4 I get...

Code:

sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          62.9297s
    total number of events:              10000
    total time taken by event execution: 251.6929
    per-request statistics:
         min:                                 24.43ms
         avg:                                 25.17ms
         max:                                 72.31ms
         approx.  95 percentile:              25.49ms

Threads fairness:
    events (avg/stddev):           2500.0000/20.21
    execution time (avg/stddev):   62.9232/0.00

For the RPi 3 I get...

Code:

sysbench 0.4.12:  multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 4

Doing CPU performance benchmark

Threads started!
Done.

Maximum prime number checked in CPU test: 20000


Test execution summary:
    total time:                          73.4124s
    total number of events:              10000
    total time taken by event execution: 293.5897
    per-request statistics:
         min:                                 28.56ms
         avg:                                 29.36ms
         max:                                 86.87ms
         approx.  95 percentile:              31.14ms

Threads fairness:
    events (avg/stddev):           2500.0000/86.26
    execution time (avg/stddev):   73.3974/0.02

Just 11 seconds more: 73 seconds compared to 62 seconds. That's about a 15% difference.
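
As a quick sanity check on that figure, here is a minimal sketch that recomputes the gap from the two "total time" values in the output above. The variable names are made up for illustration; only the two timings come from the sysbench runs.

```python
# Total times taken from the two sysbench runs above (seconds).
rpi4_total_s = 62.9297
rpi3_total_s = 73.4124

# Speed-up of the RPi4 relative to the RPi3 B+ (higher is better).
speedup = rpi3_total_s / rpi4_total_s

# Fraction of wall-clock time saved by the RPi4.
time_saved = 1 - rpi4_total_s / rpi3_total_s

print(f"speed-up:   {speedup:.2f}x")
print(f"time saved: {time_saved:.1%}")
```

Depending on which of the two numbers is quoted, the difference comes out a little above or a little below 15%, which is still nowhere near 2x.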

wren
Posts: 78
Joined: Mon May 28, 2018 9:06 pm

Re: Why do I not see 2x speed up in sysbench?

Wed Jun 26, 2019 11:00 pm

If you want something thorough, use Phoronix Test Suite.
https://phoronix-test-suite.com/?k=downloads

RichardUK
Posts: 235
Joined: Fri Jun 01, 2012 5:12 pm

Re: Why do I not see 2x speed up in sysbench?

Thu Jun 27, 2019 9:24 am

Ta, will give that a go. :)

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 24175
Joined: Sat Jul 30, 2011 7:41 pm

Re: Why do I not see 2x speed up in sysbench?

Thu Jun 27, 2019 9:27 am

Different tests will give wildly different results. Some operations are a lot faster, some are not. On average, it is about 2 times faster. But include things like network speed or USB access speed, and it's a LOT faster. So it is very dependent on the test.
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
“I think it’s wrong that only one company makes the game Monopoly.” – Steven Wright

ejolson
Posts: 3826
Joined: Tue Mar 18, 2014 11:47 am

Re: Why do I not see 2x speed up in sysbench?

Thu Jun 27, 2019 11:26 am

RichardUK wrote:
Wed Jun 26, 2019 10:44 pm
The blog post says "yielding performance increases over Raspberry Pi 3B+ of between two and four times, depending on the benchmark." when talking about moving to the newer CPU arch. And yet testing the CPU alone with sysbench shows only a 15% speed-up. confused.com :?

There should probably be a sticky on the forum about why sysbench runs so slow on all models of Pi from the 2B version 2 to the 4B.

Sysbench was intended to be a comprehensive benchmarking framework. Unfortunately, the simple code which started out as a placeholder for the CPU tests was never replaced by something meaningful. As a result, the sysbench CPU test only checks how fast the processor can perform 64-bit division. The problems with such a test are:
  • Few real-world programs spend significant time if any performing 64-bit division.
  • When the 64-bit ARM Cortex-A53 and A72 CPUs run in 32-bit mode as is currently the case with Raspbian, 64-bit division needs to be emulated using relatively slow software techniques.
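
To make the point about division concrete, here is a rough sketch of the kind of trial-division loop the sysbench 0.4.x CPU test is generally understood to run for each event. Treat it as a paraphrase under assumptions, not the actual sysbench source: the name count_primes is made up, and Python's arbitrary-precision integers do not reproduce the 64-bit C arithmetic, only the shape of the work (one remainder operation per inner iteration and very little else).

```python
import math


def count_primes(max_prime: int) -> int:
    """Count the primes p with 3 <= p < max_prime by naive trial division.

    Hypothetical helper that mirrors the structure of the benchmark loop:
    almost all of the time goes into the remainder operation below.
    """
    found = 0
    for candidate in range(3, max_prime):
        limit = math.isqrt(candidate)
        divisor = 2
        while divisor <= limit:
            # The remainder (division) is the hot operation in this loop.
            if candidate % divisor == 0:
                break
            divisor += 1
        if divisor > limit:
            # No divisor found up to sqrt(candidate), so it is prime.
            found += 1
    return found


if __name__ == "__main__":
    # 20000 matches the "Maximum prime number checked" value in the runs above.
    print(count_primes(20000))
```

A loop like this never touches the caches, the memory system or NEON, so it says very little about how the rest of the chip has improved.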
There is one more difficulty that frequently complicates speed comparisons on the Raspberry Pi: by default, the system C compiler generates binaries that are backward compatible with the ARMv6 used on the original Pi and current Zero models. Not using the newer 32-bit and NEON instructions further reduces the apparent performance. It appears that even the -mtune=native -march=native flags are not sufficient, and that flags explicitly specifying the A72, for example, need to be added for sensible performance measurements. In particular, this means the automatic build scripts for the Phoronix benchmarks don't work correctly under Raspbian.

At the same time, it is likely that even when running suboptimal ARMv6-compatible binaries, the CPU of the 4B is more than twice as fast as that of the 3B+. For example, according to this Pi Pie Chart program:
  • The B has Pi ratio 1
  • The 3B+ has Pi ratio 12.4
  • The 4B has Pi ratio 26.8 (may increase with further optimisation)
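
For anyone who wants the 4B versus 3B+ comparison as a single number, here is a small sketch that divides the two ratios listed above. The dictionary and variable names are only illustrative; the three figures are the ones from the list.

```python
# Pi ratios from the list above, relative to the original model B.
pi_ratio = {"B": 1.0, "3B+": 12.4, "4B": 26.8}

# How much faster the 4B is than the 3B+ on this particular benchmark.
speedup_4b_vs_3b_plus = pi_ratio["4B"] / pi_ratio["3B+"]

print(f"4B vs 3B+: {speedup_4b_vs_3b_plus:.2f}x")
```

On these numbers the 4B comes out a bit more than twice as fast as the 3B+, which is in line with the low end of the blog post's claim.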
If you download the pichart program and find compiler settings that lead to faster computational speeds, I would be very happy to see the results, either posted here or in the Pi Pie Chart thread.

okenido
Posts: 57
Joined: Thu Aug 02, 2018 11:47 am

Re: Why do I not see 2x speed up in sysbench?

Thu Jun 27, 2019 1:57 pm

The new CPU IS somewhere between 2 and 3 times faster, because of its higher frequency, bigger pipeline, better branch prediction, bigger caches... but it's hard to measure with benchmarks.

For example, the big cache advantage may only show up in applications with large data sets and a large code base (most apps nowadays). A benchmark that doesn't exercise the cache may miss that completely.

A CPU with NEON instructions may appear way more powerful in a benchmark that detects and uses them, while it won't show up as a performance gain in your application unless you deliberately optimize your code for those instructions.

and the list goes on :)
