An nbench challenge
I'm going to put together a blog post fairly soon comparing our performance with various other single-board computers. On the CPU side, I'm intending to use nbench, as results for other systems are widely available. I don't want to sell Raspberry Pi short, so I'm appealing for help to find the build options and compiler choice which yield the best results using the Raspbian wheezy image.
I'm primarily interested in two configurations: stock (arm_freq=700, core_freq=250, sdram_freq=400) and maximum (arm_freq=1000, core_freq=500, sdram_freq=400), but it would also be interesting to see the effect of setting sdram_freq=500. If you're not prepared to overvolt your board, you may wish to only submit results for the "stock" configuration. As a starting point, you can find Jesse's Arch Linux results here.
As an incentive, I'm offering a shiny pibow to the fastest reproducible result for the "stock" configuration.
-
- Posts: 406
- Joined: Sun Nov 20, 2011 11:37 am
Re: An nbench challenge
Hi Eben,
I haven't used nbench before - I will have a go later.
In the meantime, I thought I'd share the results of a simple benchmarking I did, comparing the Squeezed Puppy release with Arch in a couple of configurations.
I ran this command to do a very simple benchmark (from a console, not within X):
Code: Select all
time echo "scale=2000;4*a(1)" | bc -l
With the over-voltage overclock (over_voltage=6, arm_freq=1000, core_freq=500, sdram_freq=500), the "real" time was:
15.774 secs.
With the "safe" overclock, (arm - 850, sdram - 500, gpu - 250), I got:
22.339s
With no overclocking:
22.464s
As a quick comparison, I ran the test on my Arch installation (this is using the same "safe" overclock) and got:
20.489s
glxgears test
Puppy
"Safe" overclock: best reading was 19.170 fps
No overclock: 19.077 fps
Over-voltage: 26.994 fps
Over-voltage with swapon - 29.103
Arch result (This was in the "safe" overclock settings, with swapon):
31.844 fps
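To put the bc timings above on a common scale, here is a minimal Python sketch that turns them into speedup factors relative to the no-overclock Puppy run. The labels and the `speedup` helper are my own naming, and the clock ratio assumes the no-overclock run was at the stock 700 MHz, which the post does not state explicitly.

```python
# Wall-clock "real" times in seconds, copied from the bc runs above.
timings = {
    "Puppy, no overclock": 22.464,
    "Puppy, safe overclock": 22.339,
    "Puppy, over-voltage (arm_freq=1000)": 15.774,
    "Arch, safe overclock": 20.489,
}

baseline = timings["Puppy, no overclock"]


def speedup(seconds, reference=baseline):
    """Return how many times faster a run is than the reference run."""
    return reference / seconds


for label, seconds in timings.items():
    print(f"{label:38s} {seconds:7.3f}s  x{speedup(seconds):.2f}")

# Ratio of ARM clocks between the over-voltage and stock settings
# (assumption: the no-overclock run used arm_freq=700).
clock_ratio = 1000 / 700
print(f"ARM clock ratio: x{clock_ratio:.2f}")
```

If the assumption holds, the over-voltage speedup can be compared directly against the ARM clock ratio to see how closely this test tracks CPU frequency.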
Re: An nbench challenge
Let's start with this as a base. Here are my Raspbian results.
Code: Select all
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 200.4 : 5.14 : 1.69
STRING SORT : 31.472 : 14.06 : 2.18
BITFIELD : 8.8785e+07 : 15.23 : 3.18
FP EMULATION : 45.509 : 21.84 : 5.04
FOURIER : 2056.4 : 2.34 : 1.31
ASSIGNMENT : 2.3939 : 9.11 : 2.36
IDEA : 669.29 : 10.24 : 3.04
HUFFMAN : 414.53 : 11.49 : 3.67
NEURAL NET : 3.1213 : 5.01 : 2.11
LU DECOMPOSITION : 72.68 : 3.77 : 2.72
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 11.448
FLOATING-POINT INDEX: 3.534
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc-4.7.real
libc : libc-2.13.so
MEMORY INDEX : 2.539
INTEGER INDEX : 3.121
FLOATING-POINT INDEX: 1.960
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
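For anyone wondering how the summary lines relate to the per-test columns: the numbers in this table look consistent with each composite index being a geometric mean of a group of per-test indices. The sketch below checks that against the table above. The grouping of tests into "memory", "integer" and "floating-point" is my assumption about how nbench splits them, not something stated in this thread.

```python
import math


def geometric_mean(values):
    """Geometric mean: exp of the average of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Per-test indices copied from the table above: (old index, new index).
indices = {
    "NUMERIC SORT": (5.14, 1.69),
    "STRING SORT": (14.06, 2.18),
    "BITFIELD": (15.23, 3.18),
    "FP EMULATION": (21.84, 5.04),
    "FOURIER": (2.34, 1.31),
    "ASSIGNMENT": (9.11, 2.36),
    "IDEA": (10.24, 3.04),
    "HUFFMAN": (11.49, 3.67),
    "NEURAL NET": (5.01, 2.11),
    "LU DECOMPOSITION": (3.77, 2.72),
}

# Assumed grouping of tests into composite indices (not stated in the thread).
FLOATING_POINT = ["FOURIER", "NEURAL NET", "LU DECOMPOSITION"]
ORIGINAL_INTEGER = [name for name in indices if name not in FLOATING_POINT]
LINUX_MEMORY = ["STRING SORT", "BITFIELD", "ASSIGNMENT"]
LINUX_INTEGER = ["NUMERIC SORT", "FP EMULATION", "IDEA", "HUFFMAN"]

OLD, NEW = 0, 1


def composite(tests, column):
    """Geometric mean of one index column over the given tests."""
    return geometric_mean([indices[name][column] for name in tests])


results = {
    "original INTEGER INDEX": composite(ORIGINAL_INTEGER, OLD),
    "original FLOATING-POINT INDEX": composite(FLOATING_POINT, OLD),
    "linux MEMORY INDEX": composite(LINUX_MEMORY, NEW),
    "linux INTEGER INDEX": composite(LINUX_INTEGER, NEW),
    "linux FLOATING-POINT INDEX": composite(FLOATING_POINT, NEW),
}

for name, value in results.items():
    print(f"{name:32s} {value:.3f}")
```

The recomputed values will differ slightly from the printed summary because the per-test indices in the table are already rounded to two decimals.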
Re: An nbench challenge
Here's my results for the stock settings on Raspbian:
nbench_stock_rasbian
This is with over-clocking to 1000. Sdram speed was 400 (RPi would not boot when set to 500).
nbench_1ghz
Re: An nbench challenge
Here's the results when sdram increased to 450
1ghz_sdram_450
Re: An nbench challenge
Oh, I tried again at 500 for sdram and it booted. Here's the output.
sdram_500
Re: An nbench challenge
Can you share the compiler flags you used to get these results, and maybe the resulting binaries? Some initial thoughts:
- gcc 4.7 seems to make a significant difference versus 4.6
- We do well versus Cortex A8 on floating point, less well on memory and integer
- PIstolero
- Posts: 101
- Joined: Mon Jul 23, 2012 6:28 am
- Location: paradise city, where the grass is green and the girls are pretty
Re: An nbench challenge
Still waiting for my Pi, so I've done the benchmark in QEMU (ARM emulator) on a notebook with a Core 2 Duo T5600, 4GB, openSUSE 12.1 64-bit and QEMU 1.1.1 compiled from source, booting the Raspbian image. The virtual Raspbian has 192MB RAM.
Good to know that my real RPI will be a lot faster
Code: Select all
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 122.36 : 3.14 : 1.03
STRING SORT : 13.466 : 6.02 : 0.93
BITFIELD : 4.4703e+07 : 7.67 : 1.60
FP EMULATION : 21.81 : 10.47 : 2.41
FOURIER : 422.14 : 0.48 : 0.27
ASSIGNMENT : 2.5424 : 9.67 : 2.51
IDEA :
** WARNING: The current test result is NOT 95 % statistically certain.
** WARNING: The variation among the individual results is too large.
: 192.04 : 2.94 : 0.87
HUFFMAN :
** WARNING: The current test result is NOT 95 % statistically certain.
** WARNING: The variation among the individual results is too large.
: 230.33 : 6.39 : 2.04
NEURAL NET :
** WARNING: The current test result is NOT 95 % statistically certain.
** WARNING: The variation among the individual results is too large.
: 0.45295 : 0.73 : 0.31
LU DECOMPOSITION :
** WARNING: The current test result is NOT 95 % statistically certain.
** WARNING: The variation among the individual results is too large.
: 15.395 : 0.80 : 0.58
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 5.985
FLOATING-POINT INDEX: 0.653
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc version 4.6.3 (Debian 4.6.3-8+rpi1)
libc : libc-2.13.so
MEMORY INDEX : 1.553
INTEGER INDEX : 1.451
FLOATING-POINT INDEX: 0.362
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
Re: An nbench challenge
eben wrote:Can you share the compiler flags you used to get these results, and maybe the resulting binaries? Some initial thoughts:
- gcc 4.7 seems to make a significant difference versus 4.6
- We do well versus Cortex A8 on floating point, less well on memory and integer
Okay, I just ran with the gcc that came with my downloaded Raspbian (4.6.3).
I didn't pass any flags, just ran "make".
I am happy to run some tests again, if you can give me idiot-proof instructions
I am now doing an apt-get install gcc-4.7
Re: An nbench challenge
I don't wish to toss a spanner in the works here, but haven't sufficient Pi already been purchased by people who have misunderstood its purpose - i.e. it's a low cost computer to facilitate education in schools, not a firebreathing multicore desktop replacement?
Surely posting benchmarks on steroids might be counterproductive, especially regarding the "I never bother to RTFM" crowd. By all means benchmark it in a series of real-world situations, but I feel that wringing the snot out of it (and possibly invalidating the warranty) is totally missing the point.
Re: An nbench challenge
Maybe worth adding
-mcpu=arm1176jzf-s -mtune=arm1176jzf-s
to the CFLAGS in the Makefile, and trying -O2 and -Os instead of -O3.
In case you hadn't guessed, I'm travelling without a Raspberry Pi, otherwise I'd be playing with this myself as well
Re: An nbench challenge
gritz wrote:Surely posting benchmarks on steroids might be counterproductive, especially regarding the "I never bother to RTFM" crowd. By all means benchmark it in a series of real-world situations, but I feel that wringing the snot out of it (and possibly invalidating the warranty) is totally missing the point.
The aim is to bring some data to bear on some of the claims being made by vendors of other low-cost computing devices (particularly the various ~$70 Android stick computers). There's a lazy assumption that a Cortex-class processor will automatically wipe the floor with an ARM11, which isn't really borne out either by benchmarks or real-world performance measurements.
You might conclude from my interest in performance at 1GHz, and from the fact that we've posted videos showing our overclocked performance, that we are currently investigating the possibility of running the chip at that speed without invalidating the warranty.
Re: An nbench challenge
gritz wrote: ... I feel that wringing the snot out of it (and possibly invalidating the warranty) is totally missing the point.
But it's fun ...
@Eben - yes, I've just been having a look at the -O3 flag and wondering about -Ofast?
I'll try the options you mentioned.
mark
Re: An nbench challenge
I've tried the two suggested modes, all with stock settings and all built with gcc-4.7:
CFLAGS = -s -static -Wall -O3 -fomit-frame-pointer -mcpu=arm1176jzf-s -mtune=arm1176jzf-s
CFLAGS = -s -static -Wall -O2 -Os -fomit-frame-pointer -mcpu=arm1176jzf-s -mtune=arm1176jzf-s
-O3 does better across the board.
edit: Tried -Ofast as well. Pretty close to -O3, better in some worse in some.
CFLAGS = -s -static -Wall -Ofast -fomit-frame-pointer -mcpu=arm1176jzf-s -mtune=arm1176jzf-s
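One thing worth noting about the second CFLAGS line above: when GCC is given several -O options, the last one is the one that takes effect, so "-O2 -Os" measures -Os only. To test each level separately, a small sketch like the following generates one build command per level. It assumes the nbench Makefile lets CC and CFLAGS be overridden on the make command line and that the target is called nbench; `make_command` is a hypothetical helper name.

```python
import shlex

# Flags shared by every build in the post above.
COMMON_BEFORE = ["-s", "-static", "-Wall"]
COMMON_AFTER = [
    "-fomit-frame-pointer",
    "-mcpu=arm1176jzf-s",
    "-mtune=arm1176jzf-s",
]


def make_command(opt_level, compiler="gcc-4.7"):
    """Build a shell-quoted make invocation for a single optimisation level.

    Assumption: the Makefile honours CC and CFLAGS given on the command line.
    """
    cflags = " ".join(COMMON_BEFORE + [opt_level] + COMMON_AFTER)
    return shlex.join(
        ["make", f"CC={compiler}", f"CFLAGS={cflags}", "clean", "nbench"]
    )


# One command per level, so no level silently overrides another.
for level in ["-O2", "-Os", "-O3", "-Ofast"]:
    print(make_command(level))
```

Each printed line can be pasted into a shell in the nbench source directory, followed by ./nbench to collect the numbers for that level.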
Re: An nbench challenge
Code: Select all
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 200.4 : 5.14 : 1.69
STRING SORT : 31.472 : 14.06 : 2.18
BITFIELD : 8.8785e+07 : 15.23 : 3.18
FP EMULATION : 45.509 : 21.84 : 5.04
FOURIER : 2056.4 : 2.34 : 1.31
ASSIGNMENT : 2.3939 : 9.11 : 2.36
IDEA : 669.29 : 10.24 : 3.04
HUFFMAN : 414.53 : 11.49 : 3.67
NEURAL NET : 3.1213 : 5.01 : 2.11
LU DECOMPOSITION : 72.68 : 3.77 : 2.72
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 11.448
FLOATING-POINT INDEX: 3.534
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc-4.7.real
libc : libc-2.13.so
MEMORY INDEX : 2.539
INTEGER INDEX : 3.121
FLOATING-POINT INDEX: 1.960
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
Code: Select all
CFLAGS = -s -static -Wall -Ofast -funroll-loops -fomit-frame-pointer -march=armv6 -mfpu=vfp -mfloat-abi=hard
Re: An nbench challenge
@eppe - are you sure those numbers are right? They seem identical to your baseline at the top of the page.
Re: An nbench challenge
I've had a go with the arm1176jzf-s flags and -O2 (GCC 4.7)
Looks like I should run it with -O3 as well, as per khh's points above. I'll run that tomorrow!
arm flags and -O2
Re: An nbench challenge
My test results (without booting into X), compiled with gcc-4.7 and run with no overvolt. The CFLAGS and clocks are listed after the results:
Code: Select all
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 312.96 : 5.14 : 1.69
STRING SORT : 44.147 : 14.06 : 2.18
BITFIELD : 9.8046e+07 : 15.23 : 3.18
FP EMULATION : 61.606 : 21.84 : 5.04
FOURIER : 3250.4 : 2.34 : 1.31
ASSIGNMENT : 4.1339 : 9.11 : 2.36
IDEA : 924.83 : 10.24 : 3.04
HUFFMAN : 569.97 : 11.49 : 3.67
NEURAL NET : 4.5010 : 5.01 : 2.11
LU DECOMPOSITION : 123.28 : 3.77 : 2.72
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 16.073
FLOATING-POINT INDEX: 9.552
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc-4.7.real
libc : libc-2.13.so
MEMORY INDEX : 3.524
INTEGER INDEX : 4.420
FLOATING-POINT INDEX: 3.079
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
Code: Select all
CFLAGS = -s -static -Wall -O3 -fomit-frame-pointer -funroll-loops -Wno-write-strings -Wno-sign-compare -mfloat-abi=hard -mfpu=vfp -mcpu=arm1176jzf-s -mtune=arm1176jzf-s -march=armv6zk
Code: Select all
arm_freq=950
sdram_freq=500
core_freq=500
Re: An nbench challenge
Best I can get so far (after 2 hours of twiddling) on a stock Raspberry Pi, i.e. no overclocking:
Code: Select all
pi@welham ~/nbench-byte-2.2.3 $ ./nbench
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 219.84 : 5.64 : 1.85
STRING SORT : 32.083 : 14.34 : 2.22
BITFIELD : 7.3163e+07 : 12.55 : 2.62
FP EMULATION : 44.582 : 21.39 : 4.94
FOURIER : 2299.2 : 2.61 : 1.47
ASSIGNMENT : 2.6495 : 10.08 : 2.61
IDEA : 686.54 : 10.50 : 3.12
HUFFMAN : 417.5 : 11.58 : 3.70
NEURAL NET : 3.202 : 5.14 : 2.16
LU DECOMPOSITION : 80.64 : 4.18 : 3.02
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 11.500
FLOATING-POINT INDEX: 3.830
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc-4.7
libc : libc-2.13.so
MEMORY INDEX : 2.478
INTEGER INDEX : 3.204
FLOATING-POINT INDEX: 2.124
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
pi@welham ~/nbench-byte-2.2.3 $
Re: An nbench challenge
Correction: these are my current best results (two runs).
Code: Select all
pi@welham ~/nbench-byte-2.2.3 $ ./nbench
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 221.2 : 5.67 : 1.86
STRING SORT : 32.002 : 14.30 : 2.21
BITFIELD : 7.2979e+07 : 12.52 : 2.61
FP EMULATION : 44.427 : 21.32 : 4.92
FOURIER : 2292.8 : 2.61 : 1.46
ASSIGNMENT : 2.6616 : 10.13 : 2.63
IDEA : 686.55 : 10.50 : 3.12
HUFFMAN : 417.33 : 11.57 : 3.70
NEURAL NET : 3.2018 : 5.14 : 2.16
LU DECOMPOSITION : 81.04 : 4.20 : 3.03
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 11.503
FLOATING-POINT INDEX: 3.833
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc-4.7
libc : libc-2.13.so
MEMORY INDEX : 2.477
INTEGER INDEX : 3.206
FLOATING-POINT INDEX: 2.126
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
pi@welham ~/nbench-byte-2.2.3 $ ./nbench
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 221.8 : 5.69 : 1.87
STRING SORT : 32.03 : 14.31 : 2.22
BITFIELD : 7.3253e+07 : 12.57 : 2.62
FP EMULATION : 44.476 : 21.34 : 4.92
FOURIER : 2301 : 2.62 : 1.47
ASSIGNMENT : 2.6596 : 10.12 : 2.62
IDEA : 686.54 : 10.50 : 3.12
HUFFMAN : 417.66 : 11.58 : 3.70
NEURAL NET : 3.2015 : 5.14 : 2.16
LU DECOMPOSITION : 80.296 : 4.16 : 3.00
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 11.516
FLOATING-POINT INDEX: 3.825
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc-4.7
libc : libc-2.13.so
MEMORY INDEX : 2.480
INTEGER INDEX : 3.209
FLOATING-POINT INDEX: 2.122
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
pi@welham ~/nbench-byte-2.2.3 $
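Since the prize is for a reproducible result, it may help to compare runs mechanically rather than by eye. Below is a minimal sketch that pulls the summary indices out of saved nbench output and reports the run-to-run spread. The sample strings are trimmed copies of the two runs above; `parse_indices` and `relative_spread` are names I made up for illustration.

```python
import re

# Matches summary lines such as "INTEGER INDEX : 11.503".
INDEX_RE = re.compile(
    r"^(MEMORY|INTEGER|FLOATING-POINT) INDEX\s*:\s*([\d.]+)\s*$", re.MULTILINE
)


def parse_indices(report):
    """Split a report at the Linux banner and read the indices on each side."""
    original, _, linux = report.partition("LINUX DATA BELOW")
    return {
        "original": {name: float(v) for name, v in INDEX_RE.findall(original)},
        "linux": {name: float(v) for name, v in INDEX_RE.findall(linux)},
    }


def relative_spread(values):
    """(max - min) / min, as a fraction."""
    return (max(values) - min(values)) / min(values)


run_1 = """\
INTEGER INDEX : 11.503
FLOATING-POINT INDEX: 3.833
==============================LINUX DATA BELOW===============================
MEMORY INDEX : 2.477
INTEGER INDEX : 3.206
FLOATING-POINT INDEX: 2.126
"""

run_2 = """\
INTEGER INDEX : 11.516
FLOATING-POINT INDEX: 3.825
==============================LINUX DATA BELOW===============================
MEMORY INDEX : 2.480
INTEGER INDEX : 3.209
FLOATING-POINT INDEX: 2.122
"""

runs = [parse_indices(run_1), parse_indices(run_2)]
for section in ("original", "linux"):
    for name in runs[0][section]:
        values = [run[section][name] for run in runs]
        spread = relative_spread(values) * 100
        print(f"{section:8s} {name:14s} {values}  spread {spread:.2f}%")
```

A small spread between runs is a reasonable sign that the result is repeatable; what counts as small enough is a judgment call for whoever is scoring.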
Re: An nbench challenge
portets wrote:My test results (without booting into X):
Code: Select all
MEMORY INDEX : 3.524
INTEGER INDEX : 4.420
FLOATING-POINT INDEX: 3.079
compiled with gcc-4.7 and:
Code: Select all
CFLAGS = -s -static -Wall -O3 -fomit-frame-pointer -funroll-loops -Wno-write-strings -Wno-sign-compare -mfloat-abi=hard -mfpu=vfp -mcpu=arm1176jzf-s -mtune=arm1176jzf-s -march=armv6zk
clocks were:
Code: Select all
arm_freq=950
sdram_freq=500
core_freq=500
with no overvolt.
Impressive.
Re: An nbench challenge
Nobody yet appears to have matched the stock-clocked results that Dom reported on the first Raspbian boots in April: http://pastebin.com/2NZqH2yY …
You cannot use -Ofast for benchmarks without proof that it does not change their computation results. The option violates the relevant standards in ways that can break valid programs, which is why it is not on by default. You might as well prelink a library that lies about how much time is elapsing, or use a compiler that detects benchmarks and converts them to no-ops.
I see very little effect from the various optimizations that are not already implied by -O3, and many of them improve some tests and worsen others. So why not keep it simple and pay attention to bigger factors instead, as in the first block below?
Incidentally, the best I could manage on a Pentium II 300 is in the second block below. On this basis, you could revise your PC comparison up quite a bit.
Code: Select all
fbset -g 704 400 704 400 16
echo on |sudo tee /proc/dwc_sof/SOF_reduction
make CC="gcc-4.7" CFLAGS="-s -static -O3 -funroll-loops" clean nbench
./nbench
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 226.2 : 5.80 : 1.91
STRING SORT : 33.28 : 14.87 : 2.30
BITFIELD : 9.3479e+07 : 16.03 : 3.35
FP EMULATION : 48.222 : 23.14 : 5.34
FOURIER : 2373.8 : 2.70 : 1.52
ASSIGNMENT : 2.9115 : 11.08 : 2.87
IDEA : 701.95 : 10.74 : 3.19
HUFFMAN : 437.9 : 12.14 : 3.88
NEURAL NET : 3.7924 : 6.09 : 2.56
LU DECOMPOSITION : 82.12 : 4.25 : 3.07
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 12.445
FLOATING-POINT INDEX: 4.121
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9+
C compiler : gcc-4.7
libc : libc-2.13.so
MEMORY INDEX : 2.809
INTEGER INDEX : 3.349
FLOATING-POINT INDEX: 2.285
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
Code: Select all
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 114.87 : 2.95 : 0.97
STRING SORT : 9.5392 : 4.26 : 0.66
BITFIELD : 5.0838e+07 : 8.72 : 1.82
FP EMULATION : 17.152 : 8.23 : 1.90
FOURIER : 3209.7 : 3.65 : 2.05
ASSIGNMENT : 2.9528 : 11.24 : 2.91
IDEA : 508.01 : 7.77 : 2.31
HUFFMAN : 239.62 : 6.64 : 2.12
NEURAL NET : 3.7654 : 6.05 : 2.54
LU DECOMPOSITION : 111.84 : 5.79 : 4.18
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 6.560
FLOATING-POINT INDEX: 5.039
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU : GenuineIntel Pentium II (Klamath) 300MHz
L2 Cache : 512 KB
OS : Linux 3.1.0-7.fc16.i686
C compiler : gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC)
libc : libc-2.14.90.so
MEMORY INDEX : 1.519
INTEGER INDEX : 1.732
FLOATING-POINT INDEX: 2.795
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
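On the -Ofast point above: in GCC, -Ofast turns on -ffast-math, which (among other things) lets the compiler reorder floating-point arithmetic as if it were associative. It is not, and the tiny Python sketch below shows the kind of difference that reordering can introduce. This only illustrates the underlying arithmetic; it does not show what GCC actually does to nbench.

```python
# Floating-point addition is not associative: grouping changes the rounding.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # evaluated in source order
right = a + (b + c)  # the same sum after reassociation

print(left)
print(right)
print("identical:", left == right)
```

The two sums differ in the last bit, which is why a benchmark built with such options needs its outputs checked before its timings are compared with a standards-conforming build.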
Re: An nbench challenge
Got slightly better stats on most tests with an added -funroll-loops
CFLAGS = -s -static -Wall -O3 -fomit-frame-pointer -funroll-loops -mcpu=arm1176jzf-s -mtune=arm1176jzf-s -mfpu=vfp -mfloat-abi=hard
Re: An nbench challenge
eben wrote:In case you hadn't guessed, I'm travelling without a Raspberry Pi
Sacrilege! I'd assumed you'd have one surgically implanted by now...
-
- Raspberry Pi Engineer & Forum Moderator
- Posts: 7337
- Joined: Wed Aug 17, 2011 7:41 pm
- Location: Cambridge
Re: An nbench challenge
[I'm not entering the challenge. Feel free to use any information here to help]
I'm using the two numbers following:
==========================ORIGINAL BYTEMARK RESULTS==========================
(not quite sure what the difference is between these and the lower numbers below, but they all seem to scale up and down as expected)
On stock frequency. Using:
Code: Select all
CFLAGS = -s -static -Wall -O3 -fomit-frame-pointer -funroll-loops -mcpu=arm1176jzf-s -mtune=arm1176jzf-s -mfpu=vfp -mfloat-abi=hard
INTEGER INDEX : 11.059
FLOATING-POINT INDEX: 3.441
Disable framebuffer (I used tvservice -o):
INTEGER INDEX : 11.199
FLOATING-POINT INDEX: 3.595
Also add disable_l2cache_writealloc=1 to config.txt
INTEGER INDEX : 11.219
FLOATING-POINT INDEX: 3.635
Also build with gcc 4.7.1
INTEGER INDEX : 11.505
FLOATING-POINT INDEX: 3.847
Also built with SOF reduction (echo on | sudo tee /proc/dwc_sof/SOF_reduction)
INTEGER INDEX : 12.081
FLOATING-POINT INDEX: 4.179
And also with overclock (arm_freq=1000, core_freq=500, sdram_freq=500)
INTEGER INDEX : 17.039
FLOATING-POINT INDEX: 5.882
Also with kernel_cutdown.img:
INTEGER INDEX : 17.676
FLOATING-POINT INDEX: 6.247
Code: Select all
BYTEmark* Native Mode Benchmark ver. 2 (10/95)
Index-split by Andrew D. Balsa (11/97)
Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
TEST : Iterations/sec. : Old Index : New Index
: : Pentium 90* : AMD K6/233*
--------------------:------------------:-------------:------------
NUMERIC SORT : 354.16 : 9.08 : 2.98
STRING SORT : 47.983 : 21.44 : 3.32
BITFIELD : 1.0657e+08 : 18.28 : 3.82
FP EMULATION : 67.199 : 32.25 : 7.44
FOURIER : 3649.9 : 4.15 : 2.33
ASSIGNMENT : 4.6711 : 17.77 : 4.61
IDEA : 1003.1 : 15.34 : 4.56
HUFFMAN : 621.12 : 17.22 : 5.50
NEURAL NET : 5.104 : 8.20 : 3.45
LU DECOMPOSITION : 138.28 : 7.16 : 5.17
==========================ORIGINAL BYTEMARK RESULTS==========================
INTEGER INDEX : 17.676
FLOATING-POINT INDEX: 6.247
Baseline (MSDOS*) : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
==============================LINUX DATA BELOW===============================
CPU :
L2 Cache :
OS : Linux 3.1.9-cutdown+
C compiler : gcc-4.7
libc : libc-2.13.so
MEMORY INDEX : 3.880
INTEGER INDEX : 4.856
FLOATING-POINT INDEX: 3.465
Baseline (LINUX) : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
* Trademarks are property of their respective holder.
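To see how much each of the cumulative tweaks in this post contributes, here is a short sketch that computes the gain of every step over the previous one and over the starting point. The step labels are my shorthand for the changes listed above; the index values are copied from the post.

```python
# (label, ORIGINAL BYTEMARK integer index, floating-point index), in order.
steps = [
    ("baseline CFLAGS, stock clocks", 11.059, 3.441),
    ("+ tvservice -o", 11.199, 3.595),
    ("+ disable_l2cache_writealloc=1", 11.219, 3.635),
    ("+ gcc 4.7.1", 11.505, 3.847),
    ("+ SOF reduction", 12.081, 4.179),
    ("+ overclock (1000/500/500)", 17.039, 5.882),
    ("+ kernel_cutdown.img", 17.676, 6.247),
]


def percent_gain(new, old):
    """Percentage improvement of new over old."""
    return (new / old - 1.0) * 100.0


base_int, base_fp = steps[0][1], steps[0][2]
prev_int, prev_fp = base_int, base_fp
for label, integer, fp in steps:
    print(
        f"{label:32s} "
        f"int {percent_gain(integer, prev_int):+6.2f}% step, "
        f"{percent_gain(integer, base_int):+6.2f}% total | "
        f"fp {percent_gain(fp, prev_fp):+6.2f}% step, "
        f"{percent_gain(fp, base_fp):+6.2f}% total"
    )
    prev_int, prev_fp = integer, fp

# Gain available without touching the clocks (baseline -> SOF reduction).
stock_int_gain = percent_gain(steps[4][1], base_int)
stock_fp_gain = percent_gain(steps[4][2], base_fp)
```

The last two values isolate the improvement from the software-only changes, which is the part relevant to the "stock" configuration in the challenge.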