ejolson
Posts: 3674
Joined: Tue Mar 18, 2014 11:47 am

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 6:38 am

Paeryn wrote:
Tue Dec 04, 2018 5:23 am
ejolson wrote:
Tue Dec 04, 2018 4:38 am
jahboater wrote:
Tue Dec 04, 2018 4:29 am
On a Pi3B+ (gcc 8.2) it is:-

Code: Select all

$ gcc -O3 fibo.c -o fibo
$ time ./fibo | head -c 32
10727395641800477229364813596225
real	0m15.430s
user	0m15.416s
sys	0m0.016s
$ time ./fibo | tail -c 32
4856539211500699706378405156269

real	0m15.500s
user	0m15.496s
sys	0m0.007s
That's four times faster but still just one core! Is your Pi 3B+ running in 32-bit mode or 64-bit mode? Thanks for the additional timing.
I just tried it on my stock 3B under Raspbian, compiling with gcc-6.3 gave me 39s, gcc-8.1 gave me 38s. It didn't throttle the speed back whilst running either.

Code: Select all

pi@rpi3:~/Programming/asm/fibo $ time ./fibo8.1 | head -c 32
10727395641800477229364813596225
real    0m38.549s
user    0m38.503s
sys     0m0.047s
That seems a bit slow relative to the 3B+ result. Did you remember the -O3 option? That made a significant difference for me on the Zero.

I wonder whether you are building compatible ARMv6 binaries rather than more efficient ARMv7 executables for the 3B. For example, one needs to explicitly set -march and -mtune when using the default C compiler on Raspbian.

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 6:41 am

In 64-bit mode on an Odroid C2 at 1.68GHz (same Cortex-A53 as a pi3)

Code: Select all

odroid@odroid:~$ time ./fibo | head -c 32
10727395641800477229364813596225
real	0m6.241s
user	0m6.220s
sys	0m0.010s
odroid@odroid:~$ time ./fibo | tail -c 32
4856539211500699706378405156269

real	0m6.490s
user	0m6.470s
sys	0m0.010s
odroid@odroid:~$ 
Correcting for the clock speed difference 1.68/1.4 * 6.24 = 7.488 seconds and 7.788 seconds.
Last edited by jahboater on Tue Dec 04, 2018 6:46 am, edited 1 time in total.

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 6:46 am

ejolson wrote:
Tue Dec 04, 2018 6:38 am
I wonder whether you are building compatible ARMv6 binaries rather than more efficient ARMv7 executables for the 3B. For example, one needs to explicitly set -march and -mtune when using the default C compiler on Raspbian.
It should be ARMv8 for the Pi3[+]

Code: Select all

$ gcc -O3 -march=armv8-a -mtune=cortex-a53 -o fibo fibo.c
$ time ./fibo | head -c 32
10727395641800477229364813596225
real	0m15.436s
user	0m15.401s
sys	0m0.037s
Last edited by jahboater on Tue Dec 04, 2018 7:00 am, edited 2 times in total.

ejolson
Posts: 3674
Joined: Tue Mar 18, 2014 11:47 am

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 6:56 am

jahboater wrote:
Tue Dec 04, 2018 6:41 am
In 64-bit mode on an Odroid C2 at 1.68GHz (same Cortex-A53 as a pi3)

Code: Select all

odroid@odroid:~$ time ./fibo | head -c 32
10727395641800477229364813596225
real	0m6.241s
user	0m6.220s
sys	0m0.010s
odroid@odroid:~$ time ./fibo | tail -c 32
4856539211500699706378405156269

real	0m6.490s
user	0m6.470s
sys	0m0.010s
odroid@odroid:~$ 
Correcting for the clock speed difference 1.68/1.4 * 6.24 = 7.488 seconds and 7.788 seconds.
Maybe this is one of those real-world problems where twice as many bits goes twice as fast.

It will be interesting to see whether the Fibonacci programs written in BASIC and the other C code in development experience similar performance gains when running on 64-bit mode or not. It's tempting to try and double performance again using some parallel processing.
Last edited by ejolson on Tue Dec 04, 2018 6:59 am, edited 1 time in total.

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 6:59 am

ejolson wrote:
Tue Dec 04, 2018 6:56 am
Maybe this is one of those real-world problems where twice as many bits goes twice as fast.
Well there is quite a lot of 64-bit arithmetic going on (unsigned long long), so its reasonable that its a lot faster.

ejolson
Posts: 3674
Joined: Tue Mar 18, 2014 11:47 am

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 8:08 am

jahboater wrote:
Tue Dec 04, 2018 6:59 am
Well there is quite a lot of 64-bit arithmetic (unsigned long long) going on, so its reasonable that its a lot faster.
Agreed. It should be possible to switch the code to use 32-bit integers and a smaller base. Some debugging might be needed as well as tuning. However, even when the processor is in 32-bit mode, I'm not sure such modifications would make the program run faster.

plugwash
Forum Moderator
Forum Moderator
Posts: 3462
Joined: Wed Dec 28, 2011 11:45 pm

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 8:54 am

ejolson wrote:
Tue Dec 04, 2018 8:08 am
jahboater wrote:
Tue Dec 04, 2018 6:59 am
Well there is quite a lot of 64-bit arithmetic (unsigned long long) going on, so its reasonable that its a lot faster.
Agreed. It should be possible to switch the code to use 32-bit integers and a smaller base. Some debugging might be needed as well as tuning. However, even when the processor is in 32-bit mode, I'm not sure such modifications would make the program run faster.
Right, ultimately it probably doesn't matter matter much whether you split it into 64-bit chunks and the compiler splits those 64-bit chunks into 32-bit chunks or you split it into 32-bit chunks yourself. Running the CPU in 32-bit mode means you only get to use half of the ALU width.

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 10:01 am

For amusement I tried:
-ftree-vectorize -fopt-info-vec-missed
but it printed out pages of reasons why it couldn't use SIMD and was no faster.

Also:
-mneon-for-64bits
which does some of the 64-bit integer arithmetic with NEON - didn't help.

User avatar
Paeryn
Posts: 2698
Joined: Wed Nov 23, 2011 1:10 am
Location: Sheffield, England

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 11:56 am

Ah, found out why it was running so slow on mine. My gcc-8.1 is slightly mis-configured (I cross compiled it on my desktop) and doesn't link properly so I was using gcc-6.3 to do the final linking phase. I've just manually ran the link command from gcc-8.1 after manually fixing it and now I get 17.7s which is near enough what jahboater got taking in to account the 200MHz difference.

Code: Select all

pi@rpi3:~/Programming/asm/fibo $ time ./fibo8.1 | head -c32
10727395641800477229364813596225
real    0m17.752s
user    0m17.740s
sys     0m0.012s
Might take the opportunity to update to gcc-8.2 (correctly this time).
She who travels light — forgot something.

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 1:46 pm

Paeryn wrote: Maybe this is one of those real-world problems where twice as many bits goes twice as fast.
Yes something like this would be one of the very few rare examples where 64-bit has an advantage. Unfortunately for 99.9% of things there is no advantage to 64-Bit.

This is also why benchmarks that implement a lot of looped 64-bit and 128 bit math make 64-bit systems look a lot faster even though real world usage shows that they are a little slower on the same hardware (at least for ARM).

I did a good number of real world benchmarks on the RPi 3B when I first got it to see what the performance difference is between the two modes, both bare-metal gaming (with performance monitoring game engines) and under 32 and 64-bit Linux.
It will be interesting to see whether the Fibonacci programs written in BASIC and the other C code in development experience similar performance gains when running on 64-bit mode or not. It's tempting to try and double performance again using some parallel processing.
Yes it would be interesting. Though I will not be the one to find out. The BBC BASIC version is stuck in the 32-bit world as RISC OS is ARM only and is tied to the 32-bit ISA (a good thing), and I only use 32-bit Linux so the I can only try in 32-bit for the FreeBASIC version.

Though for the BBC BASIC version we are stuck with 32-bit integers, so I will have to convert the bases downward to fit 32-bit, freeBASIC has a 64-bit integer type though.

BTW: Your version has a little dificulty compiling on gcc 4.7.4, though it gets there (after an hour compile time).
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 2:24 pm

Paeryn wrote:
Tue Dec 04, 2018 11:56 am
Ah, found out why it was running so slow on mine. My gcc-8.1 is slightly mis-configured (I cross compiled it on my desktop) and doesn't link properly so I was using gcc-6.3 to do the final linking phase. I've just manually ran the link command from gcc-8.1 after manually fixing it and now I get 17.7s which is near enough what jahboater got taking in to account the 200MHz difference.

Code: Select all

pi@rpi3:~/Programming/asm/fibo $ time ./fibo8.1 | head -c32
10727395641800477229364813596225
real    0m17.752s
user    0m17.740s
sys     0m0.012s
Might take the opportunity to update to gcc-8.2 (correctly this time).
Thank you for that! I was thinking they had sent me a prototype Pi4B by mistake :)

I build gcc with this script actually on the 3B, it does everything including the download, but asks before the final install.
Takes about 6 hours or so but just leave it running overnight. And have around 8-10GB of free disk space.

Code: Select all

#!/bin/bash

#
#  This is the new GCC version to install.
#
VERSION=8.2.0

#
#  For the Pi or any computer with less than 2GB of memory.
#
if [ -f /etc/dphys-swapfile ]; then
  sudo sed -i 's/^CONF_SWAPSIZE=[0-9]*$/CONF_SWAPSIZE=1024/' /etc/dphys-swapfile
  sudo /etc/init.d/dphys-swapfile restart
fi

if [ -d gcc-$VERSION ]; then
  cd gcc-$VERSION
  rm -rf obj
else
  wget ftp://ftp.fu-berlin.de/unix/languages/gcc/releases/gcc-$VERSION/gcc-$VERSION.tar.xz
  tar xf gcc-$VERSION.tar.xz
  cd gcc-$VERSION
  contrib/download_prerequisites
fi
mkdir -p obj
cd obj

#
#  Now run the ./configure which MUST be checked/edited beforehand.
#  Uncomment the section below depending on your platform.  You may build
#  on a Pi3 for a target Pi Zero by uncommenting the Pi Zero section.
#  To alter the target directory set --prefix=<dir>
#

# Pi3+, Pi3, and new Pi2
../configure --enable-languages=c,c++ --with-cpu=cortex-a53 \
  --with-fpu=neon-fp-armv8 --with-float=hard --build=arm-linux-gnueabihf \
  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

# Pi Zero's
#../configure --enable-languages=c,c++ --with-cpu=arm1176jzf-s \
#  --with-fpu=vfp --with-float=hard --build=arm-linux-gnueabihf \
#  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

# x86_64
#../configure --disable-multilib --enable-languages=c,c++ --enable-checking=no

# Odroid-C2 Aarch64
#../configure --enable-languages=c,c++ --with-cpu=cortex-a53 --enable-checking=no

# Old Pi2
#../configure --enable-languages=c,c++ --with-cpu=cortex-a7 \
#  --with-fpu=neon-vfpv4 --with-float=hard --build=arm-linux-gnueabihf \
#  --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf --enable-checking=no

#
#  Now build GCC which will take a long time.  This could range from
#  4.5 hours on a Pi3B+ up to about 50 hours on a Pi Zero.  It can be
#  left to complete overnight (or over the weekend for a Pi Zero :-)
#  The most likely causes of failure are lack of disk space, lack of
#  swap space or memory, or the wrong configure section uncommented.
#  The new compiler is placed in /usr/local/bin, the existing compiler remains
#  in /usr/bin and may be used by giving its version gcc-6 (say).
#
if make -j `nproc`
then
  echo
  read -p "Do you wish to install the new GCC (y/n)? " yn
  case $yn in
   [Yy]* ) sudo make install ;;
	   * ) exit ;;
  esac
fi
Last edited by jahboater on Tue Dec 04, 2018 2:49 pm, edited 3 times in total.

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 2:27 pm

DavidS wrote:
Tue Dec 04, 2018 1:46 pm
BTW: Your version has a little dificulty compiling on gcc 4.7.4, though it gets there (after an hour compile time).
I think there might be something wrong there:

Code: Select all

pi@pi3:~ $ time gcc -O3 fibo.c -o fibo

real	0m1.215s
user	0m1.109s
sys	0m0.106s
1.2 seconds on a Pi3+. Is your disk very slow perhaps?

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 3:13 pm

jahboater wrote:
Tue Dec 04, 2018 2:27 pm
DavidS wrote:
Tue Dec 04, 2018 1:46 pm
BTW: Your version has a little dificulty compiling on gcc 4.7.4, though it gets there (after an hour compile time).
I think there might be something wrong there:

Code: Select all

pi@pi3:~ $ time gcc -O3 fibo.c -o fibo

real	0m1.215s
user	0m1.109s
sys	0m0.106s
1.2 seconds on a Pi3+. Is your disk very slow perhaps?
Takes only a couple seconds on Raspbian with gcc v6.3.0. I generally do not bother with newer versions of gcc than are in the Repo unless I have no choice, the gcc takes to long to compile itself.

My disk is a high speed USB 3.0 500GB WD SSD in RISC OS, so it is not the disk.

Though gcc 4.7.4 throws a ton of warnings when compiling, and I always compile with the -Wall option in gcc.

I do also wonder what the run time would be compiled with tcc, as tcc does not do much for optimization.
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 3:29 pm

@ejolson:
On my 3B+ running at 600MHz high speed (arm_freq=600, arm_freq_min=500) running Raspbian, using gcc 6.3.0, here is the output:

Code: Select all

~/CPlay $ time gcc -O3 -Wall -ffast-math -o ./EJFibbo EJFibbo.c 
EJFibbo.c: In function ‘fmalloc’:
EJFibbo.c:28:13: warning: pointer targets in initialization differ in signedness [-Wpointer-sign]
     char *r=bp; bp+=p; return r;
             ^~
EJFibbo.c: In function ‘newbn’:
EJFibbo.c:74:9: warning: unused variable ‘i’ [-Wunused-variable]
     int i;
         ^
At top level:
EJFibbo.c:39:12: warning: ‘cnt’ defined but not used [-Wunused-variable]
 static int cnt=0;
            ^~~

real	0m1.924s
user	0m1.815s
sys	0m0.106s
~/CPlay $ time ./EJFibbo | head -c 32
10727395641800477229364813596225
real	0m57.778s
user	0m57.730s
sys	0m0.032s
~/CPlay $ 
So a little faster than you got on the Zero, which I assume is running at 1000MHz?

As this is single threaded it shows how much faster a single core is in the ARM Cortex-A53 over the ARM1176JZF-S, as running much slower the A53 is faster. :) .

Also gcc 4.7.4 throws a lot more warning, a lot more.

I had to clock my old RPi 3B down this far in order to keep it from throwing false complaints about low voltage (I verified that the voltage is always between 5.0 and 5.12 volts on the board, so definitely false warnings).
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 3:59 pm

I am now working on the FreeBASIC implementation of the version provided by ejolson.
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 5:27 pm

I am about half way through the hand conversion, and the differences between the languages is notable. I could not say which is better just looking at it so far.

Once I get far enough to attempt to compile it it will be interesting to see what mistakes I have made.

Also the lack of use of any comments by the original author makes it interesting to read the first two times.

I will admit that so far I feel that C is a better option for this particular application. Even if tcc refuses to compile it (still trying to figure that one out).
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

ejolson
Posts: 3674
Joined: Tue Mar 18, 2014 11:47 am

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 6:08 pm

DavidS wrote:
Tue Dec 04, 2018 3:29 pm
@ejolson:
On my 3B+ running at 600MHz high speed (arm_freq=600, arm_freq_min=500) running Raspbian, using gcc 6.3.0, here is the output:

Code: Select all

~/CPlay $ time gcc -O3 -Wall -ffast-math -o ./EJFibbo EJFibbo.c 
EJFibbo.c: In function ‘fmalloc’:
EJFibbo.c:28:13: warning: pointer targets in initialization differ in signedness [-Wpointer-sign]
     char *r=bp; bp+=p; return r;
             ^~
EJFibbo.c: In function ‘newbn’:
EJFibbo.c:74:9: warning: unused variable ‘i’ [-Wunused-variable]
     int i;
         ^
At top level:
EJFibbo.c:39:12: warning: ‘cnt’ defined but not used [-Wunused-variable]
 static int cnt=0;
            ^~~

real	0m1.924s
user	0m1.815s
sys	0m0.106s
~/CPlay $ time ./EJFibbo | head -c 32
10727395641800477229364813596225
real	0m57.778s
user	0m57.730s
sys	0m0.032s
~/CPlay $ 
So a little faster than you got on the Zero, which I assume is running at 1000MHz?

As this is single threaded it shows how much faster a single core is in the ARM Cortex-A53 over the ARM1176JZF-S, as running much slower the A53 is faster. :) .

Also gcc 4.7.4 throws a lot more warning, a lot more.

I had to clock my old RPi 3B down this far in order to keep it from throwing false complaints about low voltage (I verified that the voltage is always between 5.0 and 5.12 volts on the board, so definitely false warnings).
That's correct. The Zero is running at 1000MHz.

It looks like a little code cleaning is in order. The subroutines for the big number multiply were originally developed for the Sphere Online Judge Pentium III 733MHz cluster using gcc 4.1.1 or earlier. There are now a few minor changes I plan to make, but nothing that should change the computation or impact the execution speed.

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 7:32 pm

Just out of curiosity do you know why tcc says that sqrt and log10 are undefined? The math.h standard header is included, which is required to define those.

Not really a big thing as gcc is the compiler of choice for this, though would be nice to know.

Now to finish up the conversion. Been an interesting process thus far.
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 10:21 pm

Ok I am working on the BBC BASIC V version until i can find information on why the FreeBASIC version will not compile (no error messages, compiler verified by compiling many example programs, two versions of compiler on two OS's).

I have learned that I do not like the way pointers are shohorned into FreeBASIC, it is not very natural, I much prefer the way that it is done in BBC BASIC, very natural. I thought that the FreeBASIC version would be a breeze because of one feature that BBC BASIC V does not have, explicit structured datatypes (though it can still access structured datatypes with indirection, in a way similar to how it is done with structured types in other lanugages).

Though it apears that I will have better luck with BBC BASIC V. I use BBC BASIC V every day, and I have done very little with structured BASIC modeled after MS-QuickBASiC/AmigaBASIC since I still heavily used Amiga OS 1.3. I think a big part of my problem is lack of familiarity with the weard way that MS type structured procedural BASIC does things.

though of to punching out the BBC BASIC V version, and compiling it.
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

ejolson
Posts: 3674
Joined: Tue Mar 18, 2014 11:47 am

Re: Why Avoid BASIC on RPi?

Tue Dec 04, 2018 11:45 pm

DavidS wrote:
Tue Dec 04, 2018 5:27 pm
Also the lack of use of any comments by the original author makes it interesting to read the first two times.
Sorry about the lack of comments. There is a reasonable discussion of Karatsuba multiplication on Wikipedia along with pseudocode that should help with how multbnk works. Note that when either factor consists of 49 or less base 1000000000 digits, then multbn is called to perform the usual O(n^2) multiplication algorithm learned in school.

I have modified the code to remove the warnings you noticed, changed the names of two variables for consistency and further added code to the copybn routine to zero out unused memory when copying a smaller number over a bigger number. While the numbers only get bigger when computing the nth term in the Fibonacci series, not zeroing the memory could lead to incorrect results if the same code were later reused for a different purpose.

I have updated the code provided in this post to contain the improved version of the code.

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Wed Dec 05, 2018 1:34 am

I have tidied it up some more.
It now compiles cleanly with gcc -O3 -Wall -Wextra -Wconversion ....

Code: Select all

/*  fibonacci.c -- Compute the nth Fibonacci Number
	Written December 1, 2018 by Eric Olson

	This program demonstrates the expressiveness of C as measured
	by explicitly coding the Karatsuba multiplication algorithm for
	big-number arithmetic and then using the doubling formulas

		F(2k) = F(k)[2F(k+1)-F(k)]
	  F(2k+1) = F(k+1)^2+F(k)^2
	  F(2k+2) = F(k+1)[2F(k)+F(k+1)]

	to compute the nth Fibonacci number.  Note that n is specified
	in the first line of the main routine.

	Version 2:  Minor changes to fix compiler warnings and zero the
	unused memory in the copybn routine.
*/

#include <stdio.h>
#include <math.h>
#include <string.h>
#include <stdlib.h>

typedef unsigned long long ull;
static const ull base=1000000000ULL;
static const int bexp=9;

static unsigned char *bnbuf, *bp;

static inline void *fmalloc(const size_t p){
	unsigned char * const r = bp;
	bp+=p;
	return r;
}
static inline void ffree(void * const p){
	bp=p;
}

typedef struct {
	int n;
	ull *d;
} bignum;

static ull atolln(char const * const p, const int n){
	ull e = 1, s = 0;
	for(int i=n-1; i>=0; --i, e*=10 ){
		s+=(ull)(p[i]-'0')*e;
	}
	return s;
}

static void pbn(const bignum x){
	char fmt[8];
	sprintf(fmt,"%%0%dLu",bexp);
	if(!x.n){
		printf("0\n");
		return;
	}
	for(int i=x.n-1;i>=0;i--){
		if(i==x.n-1) printf("%Lu",x.d[i]);
		else printf(fmt,x.d[i]);
	}
	printf("\n");
}

static inline bignum trmbn(bignum x){
	if(!x.n)
	  return x;
	int i;
	for(i=x.n-1;i>=0;i--)
	  if(x.d[i])
		break;
	x.n=i+1;
	return x;
}

static inline bignum newbn(const int digits){
	bignum x;
	const size_t len = (size_t)digits * sizeof(ull);
	x.d=fmalloc(len);
	memset(x.d,0,len);
	x.n=0;
	return x;
}

static bignum atobn(char const * const p, const int digits){
	int i;
	bignum x=newbn(digits);
	int n=(int)strlen(p);
	if(!n) return x;
	for(i=n-bexp;i>=0;i-=bexp){
		x.d[x.n++]=atolln(&p[i],bexp);
	}
	if(i+bexp>0) {
		x.d[x.n++]=atolln(&p[0],i+bexp);
	}
	return x;
}

static inline void delbn(bignum a){
	ffree(a.d);
}

static bignum addbn(bignum a,const bignum b){
	ull c;
	bignum x;
	int i,n;
	n=a.n>b.n?a.n:b.n;
	x=newbn(n+1);
	for(i=0,c=0;i<n;i++){
		x.d[i]=c;
		if(i<a.n) c+=a.d[i];
		if(i<b.n) c+=b.d[i];
		if(c>=base) {
			x.d[i]=c-base;
			c=1;
		} else {
			x.d[i]=c;
			c=0;
		}
	}
	if(c)
	  x.d[n++]=c;
	x.n=n;
	return x;
}

static void subbn2( bignum a, const bignum b ){
	int i;
	if(!b.n) return;
	char * const c = fmalloc((size_t)a.n+1);
	for(i=0,c[0]=0;i<b.n;i++){
		if(a.d[i]<b.d[i]+c[i]) {
			a.d[i]+=base;
			c[i+1]=1;
		} else {
			c[i+1]=0;
		}
	}
	for(i=0;i<b.n;i++){
		a.d[i]-=b.d[i]+c[i];
	}
	a.d[b.n]-=c[b.n];
	ffree(c);
	return;
}

static bignum mulbn( bignum a, const bignum b ){
	if(!a.n||!b.n) {
	  return newbn(1);
	}
	int k,j;
	ull c;
	bignum x = newbn(a.n+b.n+1);
	x.n=a.n+b.n-1;
	for(int i=0;i<a.n;i++) {
		for(j=0;j<b.n;j++){
			x.d[i+j]+=a.d[i]*b.d[j];
		}
		if((a.n-i)%50==1){
			for(k=0;k<=x.n;k++){
				if(x.d[k]>=base){
					c=x.d[k]/base;
					x.d[k]%=base;
					x.d[k+1]+=c;
				}
			}
		}
	}
	if(x.d[x.n])
	  x.n++;
	return x;
}

static bignum mulbnk( bignum a, const bignum b ){
	int M,m,i,n;
	ull c;
	bignum a0,a1,b0,b1;
	bignum z2,z1a,z1b,z1,z0;
	if(!a.n||!b.n) {
		return newbn(1);
	}
	bignum x = newbn(a.n+b.n+1);
	if(a.n>b.n)
	  { M=a.n; m=b.n; }
	else
	  { M=b.n; m=a.n; }
	if(m<49)
	  return mulbn(a,b);
	n=M/2;
	a0.d=&a.d[0]; a0.n=n; if(a.n<a0.n) a0.n=a.n;
	b0.d=&b.d[0]; b0.n=n; if(b.n<b0.n) b0.n=b.n;
	a1.d=&a.d[n]; a1.n=a.n-n; if(a1.n<0) a1.n=0;
	b1.d=&b.d[n]; b1.n=b.n-n; if(b1.n<0) b1.n=0;
	z0=mulbnk(a0,b0);
	z2=mulbnk(a1,b1);
	z1a=addbn(a1,a0);
	z1b=addbn(b1,b0);
	z1=mulbnk(z1a,z1b);
	subbn2(z1,z0);
	subbn2(z1,z2);
	z1=trmbn(z1);
	memcpy(x.d,z0.d,(size_t)(x.n=z0.n)*sizeof(ull));
	if(z1.n){
		int k;
		for(i=0,c=0;i<z1.n;i++){
			x.d[k=n+i]+=z1.d[i]+c;
			if(x.d[k]>=base) {
				x.d[k]-=base;
				c=1;
			} else c=0;
		}
		k=n+i;
		while(c){
			x.d[k]+=c;
			if(x.d[k]>=base) {
				x.d[k]-=base;
				c=1;
			} else c=0;
			k++;
		}
		if(x.n<k) x.n=k;
	}
	if(z2.n){
		int n2=2*n,k;
		for(i=0,c=0;i<z2.n;i++){
			x.d[k=n2+i]+=z2.d[i]+c;
			if(x.d[k]>=base) {
				x.d[k]-=base;
				c=1;
			} else c=0;
		}
		k=n2+i;
		while(c){
			x.d[k]+=c;
			if(x.d[k]>=base) {
				x.d[k]-=base;
				c=1;
			} else c=0;
			k++;
		}
		if(x.n<k) x.n=k;
	}
	delbn(z1); delbn(z1b); delbn(z1a); delbn(z2); delbn(z0);
	return x;
}

static void copybn( bignum *a, bignum b ){
	b=trmbn(b);
	if(a->n>b.n) bzero(a->d+b.n,sizeof(ull)*(size_t)(a->n-b.n));
	memcpy(a->d,b.d,sizeof(ull)*(size_t)(a->n=b.n));
}

static int digits;

static int fibo(int n, bignum *a, bignum *b){
	if(!n) {
		*a=atobn("0",digits);
		*b=atobn("1",digits);
		return 0;
	}
	int m=2*fibo(n>>1,a,b)+n%2;
	bignum ta=*a, tb=*b;
	bignum taa=mulbnk(ta,ta);
	bignum tbb=mulbnk(tb,tb);
	bignum taapbb=addbn(taa,tbb);
	if(n%2){
		// [a,b]=[a*a+b*b,b*(2*a+b)]
		bignum t2a=addbn(ta,ta);
		bignum t2apb=addbn(t2a,tb);
		bignum tbL2apbR=mulbnk(tb,t2apb);
		copybn(a,taapbb); copybn(b,tbL2apbR);
		delbn(tbL2apbR); delbn(t2apb); delbn(t2a);
	} else {
		// [a,b]=[a*(b*2-a),a*a+b*b]
		bignum t2bma=addbn(tb,tb); subbn2(t2bma,ta);
		bignum taL2bmaR=mulbnk(ta,t2bma);
		copybn(a,taL2bmaR); copybn(b,taapbb);
		delbn(taL2bmaR); delbn(t2bma);
	}
	delbn(taapbb); delbn(tbb); delbn(taa);
	return m;
}

#define PHI ((1+sqrt(5.0))/2)
#define MFACT 12U

int main(){
	const int n = 4784969;
	digits = (int)((double)n * log10(PHI) / bexp + 4);
	bnbuf=malloc((size_t)digits*MFACT*sizeof(ull));
	if(!bnbuf){
		fprintf(stderr,"Out of memory!\n");
		return 1;
	}
	bp=bnbuf;
	bignum a,b;
	if(n<2)
		printf("%d\n",n);
	else {
		fibo(n-1,&a,&b);
		pbn(b);
	}
	return 0;
}
Last edited by jahboater on Wed Dec 05, 2018 8:09 am, edited 2 times in total.

User avatar
DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA
Contact: Website

Re: Why Avoid BASIC on RPi?

Wed Dec 05, 2018 2:32 am

ejolson wrote: Sorry about the lack of comments. There is a reasonable discussion of Karatsuba multiplicationon Wikipedia along with pseudocode that should help with how multbnk works. Note that when either factor consists of 49 or less base 1000000000 digits, then multbn is called to perform the usual O(n^2) multiplication algorithm learned in school.

I have modified the code to remove the warnings you noticed, changed the names of two variables for consistency and further added code to the copybn routine to zero out unused memory when copying a smaller number over a bigger number. While the numbers only get bigger when computing the nth term in the Fibonacci series, not zeroing the memory could lead to incorrect results if the same code were later reused for a different purpose.

I have updated the code provided in this postto contain the improved version of the code.
Nice, even compiles cleanly on RISC OS, in a reasobable time period.

Works out as i just got done the need to items of today, so I have a little time to throw at it tonight. Then I can continue tomorrow afternoon when I get home from the Dentist.
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

ejolson
Posts: 3674
Joined: Tue Mar 18, 2014 11:47 am

Re: Why Avoid BASIC on RPi?

Wed Dec 05, 2018 4:53 am

jahboater wrote:
Wed Dec 05, 2018 1:34 am
I have tidied it up some more.
Are you sure this line

Code: Select all

unsigned char * const r = bp;
is correct? Why did you add const and what does it mean here?

jahboater
Posts: 4759
Joined: Wed Feb 04, 2015 6:38 pm

Re: Why Avoid BASIC on RPi?

Wed Dec 05, 2018 6:36 am

ejolson wrote:
Wed Dec 05, 2018 4:53 am
jahboater wrote:
Wed Dec 05, 2018 1:34 am
I have tidied it up some more.
Are you sure this line

Code: Select all

unsigned char * const r = bp;
is correct? Why did you add const and what does it mean here?
That one was just habit, it will work fine without.
It means what it says: "r" does not, and must not, change, so its good to tell the compiler that.
That is different from "const char * r =" which says that the "pointed to" object will not change.
In general, declaring things immutable makes code both faster and safer.
Though this function is so small and simple it probably doesn't make any difference to the speed.

ejolson
Posts: 3674
Joined: Tue Mar 18, 2014 11:47 am

Re: Why Avoid BASIC on RPi?

Wed Dec 05, 2018 2:13 pm

DavidS wrote:
Tue Dec 04, 2018 10:21 pm
I have learned that I do not like the way pointers are shohorned into FreeBASIC, it is not very natural, I much prefer the way that it is done in BBC BASIC, very natural. I thought that the FreeBASIC version would be a breeze because of one feature that BBC BASIC V does not have, explicit structured datatypes (though it can still access structured datatypes with indirection, in a way similar to how it is done with structured types in other lanugages).
In C the idiomatic way to code dynamically sized arrays is with pointers. In particular, the bignum structure and the custom memory allocators fmalloc and ffree could likely be replaced by dynamically sized arrays in Basic without pointers to create a less C-like translation. While this might result in slightly more copying during the divide and conquer part of the Karatsuba algorithm, the resulting clarity may be worthwhile and the performance difference unknown.

Return to “Off topic discussion”