LTolledo
Posts: 1963
Joined: Sat Mar 17, 2018 7:29 am
Location: Anime Heartland

Re: Best Raspberry Pi Alternatives 2019

Tue Oct 08, 2019 9:54 am

Hmmm.... as a "collector" me got interested one time at acquiring the Jetson Nano Developer kit... as its price is almost the same as a RPi4B-2G set combo here.

I may be able to persuade the CFO for some budget.... so just keeping my fingers crossed and hopes up :D
"Don't come to me with 'issues' for I don't know how to deal with those
Come to me with 'problems' and I'll help you find solutions"

Some people be like:
"Help me! Am drowning! But dont you dare touch me nor come near me!"

jcyr
Posts: 357
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Best Raspberry Pi Alternatives 2019

Tue Oct 08, 2019 2:52 pm

LTolledo wrote:
Tue Oct 08, 2019 9:54 am
Hmmm.... as a "collector" me got interested one time at acquiring the Jetson Nano Developer kit... as its price is almost the same as a RPi4B-2G set combo here.

I may be able to persuade the CFO for some budget.... so just keeping my fingers crossed and hopes up :D
Mine arrives tomorrow. It'll be interesting to see how well my x86 Ubuntu numerical (non-video) applications, which use power-hungry video cards, port to this integrated ARM-based SBC.
It's um...uh...well it's kinda like...and it's got a bit of...

ejolson
Posts: 3588
Joined: Tue Mar 18, 2014 11:47 am

Re: Best Raspberry Pi Alternatives 2019

Tue Oct 08, 2019 4:00 pm

jcyr wrote:
Tue Oct 08, 2019 2:52 pm
Mine arrives tomorrow. It'll be interesting to see how well my x86 Ubuntu numerical (non-video) applications, which use power-hungry video cards, port to this integrated ARM-based SBC.
It's worth noting that the GM20B Maxwell GPU in the Nano is primarily designed for machine-learning workloads. In particular, the peak floating-point performance is

GPU          FP16  FP32  FP64
Jetson Nano   472   236   7.4 GFLOPS
which makes the Nano's GPU slower at double-precision than the Cortex-A72 CPUs on the Pi 4B. Of course the Nano also has some ARM CPUs. I wonder how close they are in speed?
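
As a rough cross-check, the figures in that table can be reproduced with simple peak-rate arithmetic. The sketch below rests on assumptions that are not stated in this thread: 128 CUDA cores, a 921.6 MHz GPU clock, one fused multiply-add (2 FLOPs) per core per cycle for FP32, FP16 at twice the FP32 rate, and FP64 at 1/32 of it. The Pi 4B side uses the four cores at 1.5 GHz mentioned elsewhere in the thread and only asks how many FP64 FLOPs per cycle per core the CPU would need to match the GPU.

```python
# Peak-rate arithmetic for the Nano's GPU (all hardware figures are assumptions).
CUDA_CORES = 128          # assumed CUDA core count
GPU_CLOCK_GHZ = 0.9216    # assumed maximum GPU clock

fp32 = CUDA_CORES * GPU_CLOCK_GHZ * 2   # one FMA = 2 FLOPs per core per cycle
fp16 = fp32 * 2                         # assumed 2x FP32 rate
fp64 = fp32 / 32                        # assumed 1/32 FP32 rate

# Pi 4B: four Cortex-A72 cores at 1.5 GHz.
PI_CORES = 4
PI_CLOCK_GHZ = 1.5

# FP64 FLOPs per cycle per core the CPU needs to equal the GPU's FP64 peak.
break_even = fp64 / (PI_CORES * PI_CLOCK_GHZ)

print(f"FP16 {fp16:.0f}  FP32 {fp32:.0f}  FP64 {fp64:.1f} GFLOPS")
print(f"CPU break-even: {break_even:.2f} FP64 FLOPs per cycle per core")
```

Under these assumptions the CPU only has to sustain a little over one double-precision FLOP per cycle per core to beat the GPU's FP64 peak, which is consistent with the claim above. These are peak numbers, not measured ones.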

If you are able, I would be very interested to compare the relative performance of the ARM CPUs on the Nvidia Jetson Nano to the Raspberry Pi 4B by running this Pi pie chart program on the Nano after it arrives.

jcyr
Posts: 357
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Best Raspberry Pi Alternatives 2019

Tue Oct 08, 2019 4:55 pm

ejolson wrote:
Tue Oct 08, 2019 4:00 pm
It's worth noting that the GM20B Maxwell GPU in the Nano is primarily designed for machine-learning workloads. In particular, the peak floating-point performance is

GPU          FP16  FP32  FP64
Jetson Nano   472   236   7.4 GFLOPS
which makes the Nano's GPU slower at double-precision than the Cortex-A72 CPUs on the Pi 4B. Of course the Nano also has some ARM CPUs. I wonder how close they are in speed?

If you are able, I would be very interested to compare the relative performance of the ARM CPUs on the Nvidia Jetson Nano to the Raspberry Pi 4B by running this Pi pie chart program on the Nano after it arrives.
The algorithms I've developed in CUDA are cryptographic and use integer math. Does the test you suggest offload computation to the CUDA cores? I'm not that reliant on the performance of the ARM cores.

ejolson
Posts: 3588
Joined: Tue Mar 18, 2014 11:47 am

Re: Best Raspberry Pi Alternatives 2019

Tue Oct 08, 2019 5:41 pm

jcyr wrote:
Tue Oct 08, 2019 4:55 pm
The algorithms I've developed in CUDA are cryptographic and use integer math. Does the test you suggest offload computation to the CUDA cores? I'm not that reliant on the performance of the ARM cores.
I thought about making a GPU-accelerated version, but never did--maybe over Christmas vacation.

Currently the Pi pie chart programs are OpenMP only and do not offload anything to the GPU. Thus, the resulting pie chart would only compare the quad-core Cortex-A57 on the Nano to the quad-core Cortex-A72 on the 4B. Although the main point of the Nano is having a CUDA-enabled GPU, comparing the ARM cores is still a little bit interesting.

I don't have much experience using CUDA for integer arithmetic. How much faster do your CUDA-accelerated encryption routines perform compared to equivalent CPU versions?

alnaseh
Posts: 63
Joined: Thu Jun 23, 2016 5:12 am

Re: Best Raspberry Pi Alternatives 2019

Tue Oct 08, 2019 6:13 pm

In my case I managed to get 140 ms glass-to-glass latency at the maximum resolution of 3280x2464 and 21 fps using the newer H.265 codec. They have very tight integration with GStreamer.

jcyr
Posts: 357
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Best Raspberry Pi Alternatives 2019

Tue Oct 08, 2019 6:15 pm

ejolson wrote:
Tue Oct 08, 2019 5:41 pm
I thought about making a GPU-accelerated version, but never did--maybe over Christmas vacation.

Currently the Pi pie chart programs are OpenMP only and do not offload anything to the GPU. Thus, the resulting pie chart would only compare the quad-core Cortex-A57 on the Nano to the quad-core Cortex-A72 on the 4B. Although the main point of the Nano is having a CUDA-enabled GPU, comparing the ARM cores is still a little bit interesting.

I don't have much experience using CUDA for integer arithmetic. How much faster do your CUDA-accelerated encryption routines perform compared to equivalent CPU versions?
Sure, I'll run your benchmark as soon as I get some time. I expect the Nano will be slower than the Pi 4: an older A57 at 1.42 GHz vs. an A72 at 1.5 GHz.

As for CUDA acceleration using parallelism, I can only speak for an 8-thread Intel i7 with an Nvidia GTX 1060, where at least 2 orders of magnitude improvement is achieved over the i7 CPU alone. I don't expect this level of performance on the Nano, which has one tenth the number of CUDA cores running at a lower clock speed. The 1060 also has 6 GB of GDDR5 memory on a 192-bit bus that I could take advantage of, which will not be available on the Nano.
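
To make that gap concrete, here is a small back-of-the-envelope sketch. Only the 192-bit bus comes from the post above. The core counts (1280 and 128), the GTX 1060's 8 Gbit/s per-pin GDDR5 data rate, and the Nano's 64-bit LPDDR4 at 3.2 Gbit/s per pin are assumed spec-sheet values, so treat the output as an estimate rather than a measurement.

```python
def bandwidth_gb_s(bus_width_bits, gbit_per_pin):
    """Peak memory bandwidth in GB/s: bytes per transfer times per-pin data rate."""
    return bus_width_bits / 8 * gbit_per_pin

# Assumed spec-sheet values (only the 192-bit bus is from the thread).
GTX1060_CORES, NANO_CORES = 1280, 128
gtx_bw = bandwidth_gb_s(192, 8.0)    # GDDR5, assumed 8 Gbit/s per pin
nano_bw = bandwidth_gb_s(64, 3.2)    # LPDDR4, assumed 64-bit at 3.2 Gbit/s per pin

core_ratio = GTX1060_CORES / NANO_CORES

print(f"CUDA core ratio:  {core_ratio:.0f}x")
print(f"Memory bandwidth: {gtx_bw:.1f} vs {nano_bw:.1f} GB/s "
      f"({gtx_bw / nano_bw:.1f}x)")
```

The Nano's memory is also shared with the ARM cores, so the bandwidth actually available to CUDA kernels is likely lower than this peak.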

Gavinmc42
Posts: 3758
Joined: Wed Aug 28, 2013 3:31 am

Re: Best Raspberry Pi Alternatives 2019

Wed Oct 09, 2019 1:29 am

I've been ignoring the main guys while living in Piland.
Interesting "Fanless" embedded stuff coming from Intel and AMD.
Some big heatsinks though :D
They say for embedded Display market etc?
Even triple HDMI boards, I wonder why?

Is doe300 still around and working on OpenCL for Pi4/VC6?
I'm dancing on Rainbows.
Raspberries are not Apples or Oranges

jcyr
Posts: 357
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Best Raspberry Pi Alternatives 2019

Wed Oct 09, 2019 3:05 am

Gavinmc42 wrote:
Wed Oct 09, 2019 1:29 am
Is doe300 still around and working on OpenCL for Pi4/VC6?
Apparently not...VC4 only.

jcyr
Posts: 357
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Best Raspberry Pi Alternatives 2019

Thu Oct 10, 2019 12:42 am

ejolson wrote:
Tue Oct 08, 2019 4:00 pm
If you are able, I would be very interested to compare the relative performance of the ARM CPUs on the Nvidia Jetson Nano to the Raspberry Pi 4B by running this Pi pie chart program on the Nano after it arrives.
Nice! Familiar 64-bit Ubuntu on Jetson.

piechart results:

Jetson-nano

jcyr@Jetson:~/pichart-30$ ./pichart-openmp -t jetson-nanno -n 4
pichart -- Raspberry Pi Performance OPENMP version 30

Prime Sieve          P=14630843 Workers=4 Sec=0.62325 Mops=1499.12
Merge Sort           N=16777216 Workers=4 Sec=1.55443 Mops=259.036
Fourier Transform    N=4194304 Workers=4 Sec=1.17645 Mflops=392.173
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.713005 Mflops=4517.82

The jetson-nanno has Raspberry Pi ratio=25.5737
Making pie charts...done.
jcyr@Jetson:~/pichart-30$
PI4B+

pi@raspberrypi:~/pichart-30 $ ./pichart-openmp -t pi4b+ -n 4            
pichart -- Raspberry Pi Performance OPENMP version 30

Prime Sieve          P=14630843 Workers=4 Sec=0.54815 Mops=1704.51
Merge Sort           N=16777216 Workers=4 Sec=1.51346 Mops=266.049
Fourier Transform    N=4194304 Workers=4 Sec=1.70926 Mflops=269.926
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.620327 Mflops=5192.79

The pi4b+ has Raspberry Pi ratio=25.0723
Making pie charts...done.
pi@raspberrypi:~/pichart-30 $
Odd thing about your benchmark... it sometimes thinks the Jetson has 8 ARM cores?

ejolson
Posts: 3588
Joined: Tue Mar 18, 2014 11:47 am

Re: Best Raspberry Pi Alternatives 2019

Thu Oct 10, 2019 2:19 am

jcyr wrote:
Thu Oct 10, 2019 12:42 am
Odd thing about your benchmark... it sometimes thinks the Jetson has 8 ARM cores?
The Pi chart program knows there are four cores, but tests various numbers of worker threads including overprovisioning with double the available cores and reports the fastest time.

Thanks for running the tests. The results are surprisingly similar and yet different.
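
To put numbers on "similar and yet different", the per-test ratios can be worked out directly from the two four-worker runs posted above. The sketch below just parses those output lines. The regular expression and the helper name `rates` are illustrative choices and not part of pichart.

```python
import re

# Four-worker results copied from the two runs posted in this thread.
NANO = """\
Prime Sieve          P=14630843 Workers=4 Sec=0.62325 Mops=1499.12
Merge Sort           N=16777216 Workers=4 Sec=1.55443 Mops=259.036
Fourier Transform    N=4194304 Workers=4 Sec=1.17645 Mflops=392.173
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.713005 Mflops=4517.82
"""
PI4B = """\
Prime Sieve          P=14630843 Workers=4 Sec=0.54815 Mops=1704.51
Merge Sort           N=16777216 Workers=4 Sec=1.51346 Mops=266.049
Fourier Transform    N=4194304 Workers=4 Sec=1.70926 Mflops=269.926
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.620327 Mflops=5192.79
"""

# Test name, then two or more spaces, then the trailing Mops=/Mflops= figure.
LINE = re.compile(r"^(.+?)\s{2,}.*\b(?:Mops|Mflops)=([\d.]+)$")

def rates(text):
    """Map each test name to its reported Mops or Mflops value."""
    out = {}
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            out[m.group(1)] = float(m.group(2))
    return out

nano, pi4b = rates(NANO), rates(PI4B)
ratios = {name: nano[name] / pi4b[name] for name in pi4b}

for name, r in ratios.items():
    print(f"{name:<18} Nano/Pi 4B = {r:.2f}")
```

A ratio below 1 means the Pi 4B was faster on that test and a ratio above 1 means the Nano was faster. On these figures only the Fourier transform comes out above 1.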

jcyr
Posts: 357
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: Best Raspberry Pi Alternatives 2019

Thu Oct 10, 2019 3:53 am

ejolson wrote:
Thu Oct 10, 2019 2:19 am
The Pi chart program knows there are four cores, but tests various numbers of worker threads including overprovisioning with double the available cores and reports the fastest time.
Ah, ok. The merge sort and Fourier tests ran faster with 8 threads.

jcyr@Jetson:~/pichart-30$ ./pichart-openmp -t jetson-nanno
pichart -- Raspberry Pi Performance OPENMP version 30

Prime Sieve          P=14630843 Workers=4 Sec=0.623114 Mops=1499.45
Merge Sort           N=16777216 Workers=8 Sec=1.22402 Mops=328.959
Fourier Transform    N=4194304 Workers=8 Sec=0.917132 Mflops=503.061
Lorenz 96            N=32768 K=16384 Workers=4 Sec=0.707082 Mflops=4555.66

The jetson-nanno has Raspberry Pi ratio=28.9537
Making pie charts...done.
These are all compute-bound tests. I fail to understand why 8 threads would be faster than 4 on a 4-core processor.

ejolson
Posts: 3588
Joined: Tue Mar 18, 2014 11:47 am

Re: Best Raspberry Pi Alternatives 2019

Thu Oct 10, 2019 5:18 am

jcyr wrote:
Thu Oct 10, 2019 3:53 am
Ah, ok. The merge sort and Fourier tests ran faster with 8 threads.

These are all compute-bound tests. I fail to understand why 8 threads would be faster than 4 on a 4-core processor.
My suspicion is that overprovisioning results in faster execution times because it lets the Linux scheduler keep all cores busy by time slicing when the parcels of work aren't quite the same size and so wouldn't otherwise divide the problem evenly among the cores. Generally, however, it's a bit mysterious.

In this case it looks like the overall improvement is a bit over 10 percent: the Raspberry Pi ratio goes from 25.57 with four workers to 28.95 with automatic selection.
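
The time-slicing suspicion can be illustrated with a toy model. This is an idealised sketch, not a measurement and not how pichart or the Linux scheduler work in detail: each parcel of work is one thread, a thread can use at most one core, and the cores are shared perfectly fairly among the threads still running. The parcel sizes are made up for illustration.

```python
def makespan(parcels, cores=4):
    """Finish time when every parcel is one thread and `cores` cores are
    shared fairly among the threads that are still running."""
    remaining = sorted(parcels)
    elapsed = 0.0
    while remaining:
        # Each running thread gets a full core, or an equal share if oversubscribed.
        rate = min(1.0, cores / len(remaining))
        step = remaining[0] / rate          # time until the smallest parcel finishes
        elapsed += step
        done = step * rate                  # work each running thread completed
        remaining = [w - done for w in remaining[1:] if w - done > 1e-12]
    return elapsed

# Hypothetical uneven split of 8 units of work into 4 parcels...
four = [3, 2, 2, 1]
# ...and the same work with every parcel cut in half (8 threads on 4 cores).
eight = [w / 2 for w in four for _ in range(2)]

print("4 threads:", makespan(four))
print("8 threads:", makespan(eight))
print("perfect split:", sum(four) / 4)
```

In this model four threads finish only when the largest parcel is done, while eight threads let the scheduler hand idle core time to the halves of the large parcels, so the total time moves closer to the perfect split. Context-switch and cache costs are ignored, which is one reason real results may differ.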

If you still have the time, it would be interesting to install gcc version 6.x to see whether the merge sort runs significantly faster, as it does on the Pi when using the older compiler.
