rmike
Posts: 41
Joined: Mon Aug 22, 2011 10:50 am

Re: Floating point performance?

Sun Aug 28, 2011 11:32 am

Hi,

Are there any good sites or info about the floating point (double precision) performance of the CPU core? I read about a VFP unit inside the ARM but did not see any clear figures.

Michael

johnreed
Posts: 6
Joined: Sat Sep 03, 2011 10:55 am

Re: Floating point performance?

Sat Sep 03, 2011 12:04 pm

Hello,

I would like to test it myself, but I don't have a board yet. It would be really great if you could run
http://www.netlib.org/benchmar.....npackc.new
on the alpha board just to test the result.

Thanks!!

liz
Raspberry Pi Foundation Employee & Forum Moderator
Posts: 5202
Joined: Thu Jul 28, 2011 7:22 pm

Re: Floating point performance?

Sat Sep 03, 2011 12:09 pm

It doesn't have double precision floating point acceleration. Single only, I'm afraid. It's the ARM 1176 VFP floating point unit - I direct you to ARM's documentation on the thing to find out more!
Director of Communications, Raspberry Pi

rmike
Posts: 41
Joined: Mon Aug 22, 2011 10:50 am

Re: Floating point performance?

Sat Sep 03, 2011 6:26 pm

Hi Liz,
"It doesn't have double precision floating point acceleration"
Are you sure?

Because this site http://www.arm.com/products/pr.....-point.php, linked from
http://www.arm.com/products/pr.....rm1176.php (Specifications), states:
"ARM Floating Point architecture (VFP) provides hardware support for floating point operations in half-, single- and double-precision floating point arithmetic"

Michael

eben
Raspberry Pi Engineer & Forum Moderator
Posts: 85
Joined: Sun Jul 17, 2011 11:54 am

Re: Floating point performance?

Sat Sep 03, 2011 7:38 pm

Heh - free double-precision floating point! My bad.

liz
Raspberry Pi Foundation Employee & Forum Moderator
Posts: 5202
Joined: Thu Jul 28, 2011 7:22 pm

Re: Floating point performance?

Sat Sep 03, 2011 8:16 pm

Yup - his bad. :) Remind me to double check when he next tells me anything!

Gert van Loo
Posts: 2486
Joined: Tue Aug 02, 2011 7:27 am

Re: Floating point performance?

Sun Sep 04, 2011 9:26 am

Maybe I should give Liz my mobile phone number in case there are more of these difficult hardware questions :-). I just double-checked the 1176 hardware technical reference and implementation guide. Yep, it has double-precision floating point as well as hardware divide and square root. It is fully IEEE 754 compatible.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23915
Joined: Sat Jul 30, 2011 7:41 pm

Re: Floating point performance?

Sun Sep 04, 2011 9:34 am

And here are the results of running the linked linpack benchmark, compiled with gcc -O3:

Enter array size (q to quit) [200]:
Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

Reps  Time(s)   DGEFA   DGESL  OVERHEAD      KFLOPS
----------------------------------------------------
   2     0.53  92.45%   1.89%     5.66%    5493.333
   4     1.07  92.52%   2.80%     4.67%    5385.621
   8     2.12  92.45%   2.36%     5.19%    5466.003
  16     4.24  92.45%   2.83%     4.72%    5438.944
  32     8.49  92.11%   2.71%     5.18%    5459.213
  64    16.98  92.05%   2.89%     5.06%    5452.440
Principal Software Engineer at Raspberry Pi (Trading) Ltd.
Contrary to popular belief, humorous signatures are allowed. Here's an example...
“I think it’s wrong that only one company makes the game Monopoly.” – Steven Wright

rmike
Posts: 41
Joined: Mon Aug 22, 2011 10:50 am

Re: Floating point performance?

Sun Sep 04, 2011 9:39 am

Hi,

thanks for the clarification.
So - has any of the developers already run the short source code (http://www.netlib.org/benchmar.....npackc.new) mentioned above?

Michael

EDIT: my posting came too late... Thank you for the figures!

jasonl
Posts: 21
Joined: Tue Aug 30, 2011 8:26 am

Re: Floating point performance?

Sun Sep 04, 2011 10:22 pm

Just for giggles I ran the linpackc benchmark on my Sheevaplug running Debian Squeeze, which has an ARMv5 core with no floating point hardware:

Enter array size (q to quit) [200]:
Memory required: 315K.

LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

Reps  Time(s)   DGEFA   DGESL  OVERHEAD      KFLOPS
----------------------------------------------------
   4     0.50  92.00%   2.00%     6.00%   11687.943
   8     0.99  91.92%   2.02%     6.06%   11813.620
  16     2.00  92.00%   3.50%     4.50%   11504.363
  32     4.00  93.00%   2.75%     4.25%   11474.326
  64     8.02  92.64%   2.24%     5.11%   11549.715
 128    16.05  92.02%   2.74%     5.23%   11557.309


Comparing apples and oranges, I'm sure - and I'm assuming the KFLOPS figure is the one to look at... But shouldn't the R-Pi have faster floating point performance thanks to the dedicated hardware?

Maybe the floating point hardware isn't being utilised on the alpha board - or this version of GCC doesn't know about it. The R-Pi appears to be half as fast as the Sheevaplug, whose processor runs at 1.2 GHz compared to the R-Pi's 700 MHz. Another data point: the Intel Atom in my Dell Mini 9 reports around 174,000 KFLOPS, and does 2048 iterations in just over 17 s.
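
One rough way to take the clock difference out of this comparison is to divide each KFLOPS figure by the core clock. The sketch below uses the longest run from each of the two tables above and the clock speeds quoted in this post; the helper name is made up for illustration, and per-clock figures across different cores are only a crude guide.

```python
# Clock-normalised comparison of the two LINPACK runs quoted above.
# KFLOPS values are the longest run in each table; clock speeds are the
# ones stated in the post (700 MHz R-Pi alpha board, 1.2 GHz Sheevaplug).

def kflops_per_mhz(kflops: float, clock_mhz: float) -> float:
    """Return LINPACK throughput per MHz of core clock (hypothetical helper)."""
    return kflops / clock_mhz

rpi = kflops_per_mhz(5452.440, 700.0)            # built with gcc -O3 only
sheevaplug = kflops_per_mhz(11557.309, 1200.0)   # ARMv5, no FPU

print(f"R-Pi:       {rpi:.2f} KFLOPS/MHz")
print(f"Sheevaplug: {sheevaplug:.2f} KFLOPS/MHz")
print(f"R-Pi relative to Sheevaplug, per clock: {rpi / sheevaplug:.2f}")
```

On these figures the R-Pi build is still behind the Sheevaplug after dividing by clock speed, which fits the suspicion that the -O3 binary was not using the VFP at all.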

Edit:
Just found this - which looks relevant:
http://wiki.debian.org/ArmHard.....ent_Status
Looks like the current Debian armel port doesn't support VFP yet, so it may mean that certain software or libraries for the R-Pi may need to be specially compiled to take advantage of the VFP.

Blars
Posts: 88
Joined: Sun Aug 28, 2011 3:22 am

Re: Floating point performance?

Sun Sep 04, 2011 11:11 pm

Debian has an armhf port being worked on to support hardware floating point on ARM. Unfortunately for the R-Pi, it assumes ARMv7 features are available, so it won't work on the R-Pi. To use the hardware floating point on Debian ARM, you would need to create yet another ARM port and add all that infrastructure (buildd and porter machines that can handle recompiles of large software in reasonable time, disk space and bandwidth on distribution machines, and mainly people to maintain it).

jasonl
Posts: 21
Joined: Tue Aug 30, 2011 8:26 am

Re: Floating point performance?

Mon Sep 05, 2011 12:13 am

Or would it be OK just to supply the relevant GCC switches for certain critical packages?
For the linpack benchmark above, wouldn't compiling it with gcc -mfloat-abi=hard -O3 work? Or would glibc and all the other support libraries need recompiling too?
Ref:
http://wiki.debian.org/ArmHard.....nt_options
If Debian isn't compiling in VFP support, then maybe one of the other distros with alpha boards (e.g. Fedora) would? I see working FPU support as pretty critical, especially for people wishing to do any multimedia encoding on the R-Pi.

rmike
Posts: 41
Joined: Mon Aug 22, 2011 10:50 am

Re: Floating point performance?

Mon Sep 05, 2011 7:01 am

Hi,
"so it may mean that certain software or libraries for the R-Pi may need to be specially compiled to take advantage of the VFP."
It would of course be nice to have a cross-compile environment with everything optimized for a certain platform, like the Gentoo or LFS people do.

Found some links and hope they are worth mentioning:

ARM status @ Gentoo
http://dev.gentoo.org/~armin76...../chost.xml

http://cross-lfs.org/view/clfs.....edded/arm/

Michael

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23915
Joined: Sat Jul 30, 2011 7:41 pm

Re: Floating point performance?

Mon Sep 05, 2011 8:17 am

I just compiled as follows

gcc -O3 -mfloat-abi=softfp

and got the following much improved results...

Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

Reps  Time(s)   DGEFA   DGESL  OVERHEAD      KFLOPS
----------------------------------------------------
   8     0.51  90.20%   3.92%     5.88%   22888.889
  16     1.02  89.22%   4.90%     5.88%   22888.889
  32     2.05  90.24%   3.41%     6.34%   22888.889
  64     4.08  91.42%   2.94%     5.64%   22829.437
 128     8.16  91.54%   2.94%     5.51%   22799.827
 256    16.31  91.35%   2.76%     5.89%   22903.800
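
To put a number on "much improved", the ratios between the runs posted so far can be worked out directly. This is only a sketch over the figures already in this thread (the longest run of each table); the dictionary keys and the helper name are illustrative.

```python
# Speed-up from rebuilding the same LINPACK source with -mfloat-abi=softfp.
# All values are KFLOPS from the longest run of each table in this thread.

RESULTS_KFLOPS = {
    "rpi_O3": 5452.440,           # R-Pi, gcc -O3
    "rpi_O3_softfp": 22903.800,   # R-Pi, gcc -O3 -mfloat-abi=softfp
    "sheevaplug": 11557.309,      # Sheevaplug, ARMv5 at 1.2 GHz, no FPU
}

def speedup(new_kflops: float, old_kflops: float) -> float:
    """Return how many times faster the new result is than the old one."""
    return new_kflops / old_kflops

vfp_gain = speedup(RESULTS_KFLOPS["rpi_O3_softfp"], RESULTS_KFLOPS["rpi_O3"])
vs_sheevaplug = speedup(RESULTS_KFLOPS["rpi_O3_softfp"], RESULTS_KFLOPS["sheevaplug"])

print(f"softfp build vs plain -O3 build on the R-Pi: {vfp_gain:.1f}x")
print(f"softfp build on the R-Pi vs Sheevaplug:      {vs_sheevaplug:.1f}x")
```

So the softfp build is roughly four times faster than the plain -O3 build on the same board, and roughly twice as fast as the Sheevaplug despite the lower clock.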

rmike
Posts: 41
Joined: Mon Aug 22, 2011 10:50 am

Re: Floating point performance?

Tue Sep 06, 2011 8:34 pm

Hi,

could somebody run a whetstone benchmark?
That benchmark also targets (trigonometric) lib functions.
For the source, see: http://www.rowley.co.uk/arm/wh.....t_dhry.zip

Michael

rmike
Posts: 41
Joined: Mon Aug 22, 2011 10:50 am

Re: Floating point performance?

Sat Sep 24, 2011 7:45 am

Hi,

any new findings about Whetstone performance?
Here are some results for other CPUs and platforms:
http://www.roylongbottom.org.u.....tstone.htm
http://processors.wiki.ti.com/.....Benchmarks
http://www.rowley.co.uk/arm/ar....._bench.htm

Michael

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23915
Joined: Sat Jul 30, 2011 7:41 pm

Re: Floating point performance?

Sat Sep 24, 2011 12:38 pm

Sorry, I hadn't seen the post about Whetstones. I'll try to do that this weekend.

Chris Tyler
Posts: 70
Joined: Thu Jul 28, 2011 12:16 pm

Re: Floating point performance?

Sun Sep 25, 2011 2:22 am

There are two separate issues with FPU support:

(a) Using the FPU for math. You can enable this with compiler options, and you can mix and match between code that uses and does not use the FPU (or code that autodetects the FPU).

(b) Using the FPU to pass function arguments. The "softfp" ABI variant passes function arguments via CPU registers only, and is needed where the architecture level does not guarantee the presence of an FPU. The "hardfp" variant passes function arguments via FPU registers where appropriate, but obviously won't work without an FPU present. Hardfp and softfp code cannot be mixed: you can't have a softfp binary call a hardfp library, or vice versa, so initially bootstrapping hardfp is almost as much work as porting to an entirely new architecture. The significance of hardfp is that it avoids CPU-to-FPU register transfers, which can take 20+ cycles.
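
The two issues can be kept apart with a small table. The sketch below is a toy model, not anything GCC or the linker actually runs: it lists the three values GCC accepts for -mfloat-abi on ARM (soft, softfp, hard) and treats two pieces of code as link-compatible only when they pass floating point arguments the same way. The names FLOAT_ABIS and can_link are made up for illustration, and "core registers" is shorthand that ignores arguments spilling onto the stack.

```python
# Toy model of the ARM float ABI variants selected by GCC's -mfloat-abi.
#   uses_fpu: may the compiler emit FPU instructions for the arithmetic?  (issue a)
#   fp_args:  where floating point arguments travel in a function call.   (issue b)
FLOAT_ABIS = {
    "soft":   {"uses_fpu": False, "fp_args": "core registers"},
    "softfp": {"uses_fpu": True,  "fp_args": "core registers"},
    "hard":   {"uses_fpu": True,  "fp_args": "FPU registers"},
}

def can_link(caller_abi: str, callee_abi: str) -> bool:
    """Code can call across a boundary only if both sides pass FP arguments the same way."""
    return FLOAT_ABIS[caller_abi]["fp_args"] == FLOAT_ABIS[callee_abi]["fp_args"]

for caller in FLOAT_ABIS:
    for callee in FLOAT_ABIS:
        verdict = "ok" if can_link(caller, callee) else "incompatible"
        print(f"{caller:>6} -> {callee:<6} {verdict}")
```

In this model a softfp benchmark can still call a soft-float libc, which is why a single program can be rebuilt with softfp on an otherwise soft-float system, while anything built as hard needs every library it touches rebuilt the same way.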

In Fedora ARM, we have a softfp build (armv5tel) and are working on a hardfp build. Most hardfp efforts, including Fedora, have focused on armv7, since an FPU is guaranteed to be present at that architecture level. However, the Raspberry Pi has an armv6 CPU, and thus armv7hl (armv7 + hardfp + little-endian; some distros call this armv7hf) builds will not run on it.

The good news: We've had some initial conversations about doing a Fedora armv6hl build (which is pretty much specific to the Pi) once the armv7hl work is completed (soon!).

liz
Raspberry Pi Foundation Employee & Forum Moderator
Posts: 5202
Joined: Thu Jul 28, 2011 7:22 pm

Re: Floating point performance?

Sun Sep 25, 2011 4:56 am

Having a fully hardened Fedora by default would be a fantastic thing - it'd knock a lot of potential problems right on the head. Thanks very much for the update, Chris.

tufty
Posts: 1456
Joined: Sun Sep 11, 2011 2:32 pm

Re: Floating point performance?

Sun Sep 25, 2011 5:44 am

Liz - what Chris is talking about isn't what's generally referred to as "hardened" Linux, which is all about security. What he's talking about is using the somewhat faster (if you have an FPU) hardfp ARM ABI variant, which passes floating point arguments backwards and forwards in the FPU registers rather than passing them in CPU registers (or on the stack) and then loading them into the FPU later.

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23915
Joined: Sat Jul 30, 2011 7:41 pm

Re: Floating point performance?

Sun Sep 25, 2011 4:46 pm

Quote from rmike on September 6, 2011, 21:34:
"Could somebody run a whetstone benchmark? That benchmark also targets (trigonometric) lib functions. For the source, see: http://www.rowley.co.uk/arm/wh.....t_dhry.zip"

I ran the Whetstone and Dhrystone tests from that site, but I'm not convinced by the results. (I wasn't sure what value of HZ to use for the Dhrystone test; in the end I went with HZ = sysconf(_SC_CLK_TCK).)

General compile flags

gcc -lm -mfloat-abi=softfp ...

Results

jamesh@raspi-jamesh:~/projects/whetstone$ ./whetstone

Loops: 1000, Iterations: 1, Duration: 5 sec.
C Converted Double Precision Whetstones: 20.0 MIPS

jamesh@raspi-jamesh:~/projects/whetstone$ ./dhrystone
Microseconds for one run through Dhrystone: 4.1
Dhrystones per Second: 246305.4

rmike
Posts: 41
Joined: Mon Aug 22, 2011 10:50 am

Re: Floating point performance?

Sun Sep 25, 2011 5:24 pm

jamesh,
thank you for testing.
Comparing with the results for a Philips LPC2129 running at 60 MHz shows roughly the same result as the raspi once scaled for clock speed: http://www.rowley.co.uk/arm/ar....._bench.htm
What makes me wonder is that the LPC2129 has, AFAIK, no hardware FPU at all...

Michael

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23915
Joined: Sat Jul 30, 2011 7:41 pm

Re: Floating point performance?

Sun Sep 25, 2011 5:38 pm

Ahh. Forgot to turn optimisations on. New results look a bit better!

Now building with gcc -lm -O3 -mfloat-abi=softfp -o whetstone whetstone.c

jamesh@raspi-jamesh:~/projects/whetstone$ ./whetstone

Loops: 1000, Iterations: 1, Duration: 3 sec.
C Converted Double Precision Whetstones: 33.3 MIPS

jamesh@raspi-jamesh:~/projects/whetstone$ ./dhrystone
Microseconds for one run through Dhrystone: 1.2
Dhrystones per Second: 809061.5
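
Dhrystone scores are often quoted as DMIPS, which is the Dhrystones-per-second figure divided by 1757 (the conventional score of the VAX 11/780 reference machine). The sketch below applies that to the two results posted in this thread and divides by the 700 MHz clock mentioned earlier. The function names are illustrative, and since the HZ value used for timing was uncertain, the outputs inherit that uncertainty.

```python
# Convert the Dhrystone results posted in this thread to DMIPS and DMIPS/MHz.
VAX_11_780_DHRYSTONES_PER_SEC = 1757.0   # conventional 1 DMIPS reference score
RPI_CLOCK_MHZ = 700.0                    # clock speed quoted earlier in the thread

def dmips(dhrystones_per_sec: float) -> float:
    """Return the score relative to the VAX 11/780 reference."""
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC

def dmips_per_mhz(dhrystones_per_sec: float, clock_mhz: float) -> float:
    """Return DMIPS normalised by core clock."""
    return dmips(dhrystones_per_sec) / clock_mhz

RUNS = {
    "without -O3": 246305.4,
    "with -O3": 809061.5,
}

for label, score in RUNS.items():
    print(f"{label:>12}: {dmips(score):6.1f} DMIPS, "
          f"{dmips_per_mhz(score, RPI_CLOCK_MHZ):.2f} DMIPS/MHz")
```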

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23915
Joined: Sat Jul 30, 2011 7:41 pm

Re: Floating point performance?

Sun Sep 25, 2011 5:41 pm

In fact, the run time is so short that the timing is a bit inaccurate. Increasing iterations gives

jamesh@raspi-jamesh:~/projects/whetstone$ ./whetstone

Loops: 1000, Iterations: 10, Duration: 24 sec.
C Converted Double Precision Whetstones: 41.7 MIPS
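
The effect of the short run time can be estimated from the printed lines alone. The sketch below assumes the benchmark credits 100 thousand Whetstone instructions per loop per iteration and times the run in whole seconds; that assumption is not taken from the source file, but it does reproduce the three scores reported in this thread. The function names are illustrative.

```python
# Reproduce the reported Whetstone scores and estimate the timing error,
# assuming 100 KWIPS of work per loop and a timer with 1-second resolution.

def whetstone_mips(loops: int, iterations: int, duration_s: float) -> float:
    """Return MIPS for a run of loops * iterations passes taking duration_s seconds."""
    kips = 100.0 * loops * iterations / duration_s
    return kips / 1000.0

def mips_bounds(loops: int, iterations: int, duration_s: float,
                resolution_s: float = 1.0) -> tuple:
    """Return (low, high) MIPS if the true run time is within +/- resolution_s."""
    low = whetstone_mips(loops, iterations, duration_s + resolution_s)
    high = whetstone_mips(loops, iterations, duration_s - resolution_s)
    return low, high

# (loops, iterations, reported duration in seconds) from the three runs posted here
RUNS = [(1000, 1, 5), (1000, 1, 3), (1000, 10, 24)]

for loops, iterations, duration in RUNS:
    score = whetstone_mips(loops, iterations, duration)
    low, high = mips_bounds(loops, iterations, duration)
    print(f"{iterations:>2} iteration(s), {duration:>2} s: "
          f"{score:.1f} MIPS (between {low:.1f} and {high:.1f})")
```

Under that assumption the 3-second run could be anywhere between 25 and 50 MIPS, while the 24-second run narrows the range to roughly 40 to 43.5 MIPS, which is why the longer run is the one to trust.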

jamesh
Raspberry Pi Engineer & Forum Moderator
Posts: 23915
Joined: Sat Jul 30, 2011 7:41 pm

Re: Floating point performance?

Mon Sep 26, 2011 10:14 am

I've added the results to the Wiki

http://elinux.org/RaspberryPiPerformance
