AMcS
Posts: 184
Joined: Sun Jan 06, 2013 11:23 am
Location: Dublin, Ireland

Re: Trying to get an algorithm to run faster on RISC than wh

Mon Apr 15, 2013 8:35 pm

NigelJK wrote:Just to add some spice to the mix. The ARM uses about 2 watts of power, my current AMD 64bit cpu is rated at a (low) 60 watts! i7's (even the eco versions) start at around 45 watts and proceed up to 130 watts.
While that's very true (and has probably got the ARM to where it is now) I must admit I'd happily see the ARM use a few more watts and run two or more times faster - but hey that's just me :D

AMcS
Posts: 184
Joined: Sun Jan 06, 2013 11:23 am
Location: Dublin, Ireland

Re: Trying to get an algorithm to run faster on RISC than wh

Mon Apr 15, 2013 9:03 pm

jamesh wrote:
AMcS wrote: Agreed - I'd be very surprised at VC++ having an x250+ speed improvement without some radical re-arrangement of the output (compared to the source) being done.
jamesh wrote:That level of speed increase is more likely due to memory caching improving, rather than a simple change in the code. Keeping stuff in L1 or L2 cache vs fetching from main memory can make a colossal difference, esp. when the code is very memory intensive. This also makes direct code comparisons very difficult, as it's complicated to predict what will and won't be in cache, esp. in a multithreaded system.
I have no problem accepting that a larger cache would improve performance - but here's the thing - on Pi switching from Non-optimised to optimised GCC improved the performance by nearly 3 times but on the PC switching from non-optimised to optimised VC++ improved performance by 259 times. Does this mean GCC's optimisation is rubbish or that VC++'s is "magic"?
jamesh wrote:As to use of Assembler vs high level languages - we've had many arguments over this in other threads.
Not arguments James, I'd characterise them as good natured discussions !
jamesh wrote:I'm of the opinion that compilers nowadays are so good that only a very limited subset of engineers can actually write more efficient code than a compiler, and when they do, that code is almost incomprehensible to anyone else. Take a look at optimised output from a compiler - it's really difficult to see what is going on. So assembler should only be used where absolutely necessary, and those circumstances are very rare indeed.
Actually I'd not object to that (and you'll find a similar response from me to GavinW further up). The thing is RISC OS is in an "odd" position. It is largely written in assembler and because of that it could tentatively be argued that a lot of the "hard" work has already been done.

The "speed critical" parts I think still need to be assembler - but the other parts can easily (and should be) written in high level languages (in fact some of the applications that come as part of RISC OS are).

Recoding RISC OS as a whole into purely "C" (say) would require a lot of work - and would, I feel, chip away at the advantages it does confer (i.e., smaller memory footprint, faster performance, fast boot times, greater UI responsiveness) - if they went would RISC OS offer *anything* the very capable Linux alternatives don't? (I suspect not) and that's why I'd argue Assembler use in RISC OS is a "special case".

Burngate
Posts: 6100
Joined: Thu Sep 29, 2011 4:34 pm
Location: Berkshire UK Tralfamadore
Contact: Website

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Apr 16, 2013 8:42 am

jamesh wrote:... Take a look at optimised output from a compiler - it's really difficult to see what is going on...
I don't know much about compilers - has the output still got comments in it? If not ...
Surely what makes high level language code easier to read is the comments, as well as the formatting - indentations, and so on.
Assembler can have both of those if you want

Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Apr 16, 2013 8:45 pm

Burngate wrote:..Surely what makes high level language code easier to read is the comments, as well as the formatting - indentations, and so on..
The fact that high level languages are closer to human language is what makes them easier to work with I've always thought.
“In the modern age, to call a man unelectable means he cannot be bought”

AMcS
Posts: 184
Joined: Sun Jan 06, 2013 11:23 am
Location: Dublin, Ireland

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Apr 16, 2013 9:13 pm

Markodius wrote:
Burngate wrote:..Surely what makes high level language code easier to read is the comments, as well as the formatting - indentations, and so on..
The fact that high level languages are closer to human language is what makes them easier to work with I've always thought.
Markodius, while not disagreeing with you, there is more to this!

Probably one of the easiest high level languages to read is BASIC - but funnily enough many on these fora wouldn't ever even mention it by name - they'd prefer something a tad less high level - like "C" - as our American friends say "go figure..."

In a more serious vein, not all high level languages are easy to read (APL anyone, or do the multitudinous brackets of LISP grab you?)

To me a high level language is simply one where each high level instruction gets converted to a larger number of simpler byte code, assembly code or machine code instructions. You get more bangs for yer buck. Usually the designers helpfully also try to use words you recognise for those high level commands - helpful that - but not mandatory (I refer you back to APL...)

To illustrate the point at its very simplest:

PRINT "A" in BASIC is obvious (ok, you need a honking great big 64K of BBC BASIC ARM code to interpret it - but it's just one line of code for the user).

The same thing in ARM assembler might be:

Code: Select all

MOV R0,#ASC("A")   ; R0 = character code of "A"
SWI "OS_WriteC"    ; write the character in R0 to the screen
MOV PC,R14         ; return to the caller
Not only a bit less readable - but there's more of it - your single BASIC line requires three ARM instructions (so one high level - gives you (in this case) three low level assembler ones).
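The same one-to-many expansion is easy to see in any language that compiles to byte code. As a minimal sketch (Python is used here purely for illustration and is not part of the original discussion; `say_a` is a made-up name), the standard `dis` module lists the byte code instructions behind a single high level print statement:

```python
import dis


def say_a():
    # One high level statement, roughly the equivalent of PRINT "A" in BASIC.
    print("A")


# Collect the lower level byte code instructions the statement compiles to.
instructions = list(dis.get_instructions(say_a))
instruction_count = len(instructions)

for ins in instructions:
    print(ins.opname, ins.argrepr)

print("one statement ->", instruction_count, "byte code instructions")
```

The exact instruction names and the count vary between Python versions, but there are always several of them for the one line of source.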

As to using commands that are human like, yes I'd agree, but only up to a point. Low level languages can also use commands that are fairly obvious (in ARM assembler, I'll give you one guess what ADD does). Assemblers DON'T have to use one, two or three character codes, so why, folks, do they not use more characters? It's the 21st century after all...

As an aside, sometimes you can have a laugh with this stuff. Many moons ago I was using a Z80 based system which had an in-memory BASIC (in short you could change it - something us BBC BASIC using types would find distressing). I did a simple dump of the command table in the BASIC and found that I could change BASIC commands to anything else, so long as the new command wasn't longer than the original.

This (given my dubious sense of humour) led inevitably to

VOMIT "Hello World"

I didn't say it would be funny did I ? :D
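For the curious, the trick can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the real Z80 BASIC's table layout isn't described above, so the zero-terminated `table`, the space padding and the `rename_keyword` helper are all hypothetical. The one rule taken from the story is that the new name must not be longer than the old one, because the table is patched in place.

```python
def rename_keyword(table: bytearray, old: str, new: str) -> None:
    """Overwrite a keyword in place; the table must not change size."""
    if len(new) > len(old):
        raise ValueError("replacement must not be longer than the original")
    start = table.find(old.encode("ascii"))
    if start < 0:
        raise KeyError(old)
    # Pad shorter names with spaces so every later entry stays where it was.
    padded = new.encode("ascii").ljust(len(old), b" ")
    table[start:start + len(old)] = padded


# Hypothetical keyword table: names separated by zero bytes.
table = bytearray(b"PRINT\x00GOTO\x00INPUT\x00")
rename_keyword(table, "PRINT", "VOMIT")
print(table)
```

A longer replacement is rejected rather than written, since it would overrun the next entry in the table.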

Other than this no other high level languages were hurt in the preparation of this posting... please carry on!

Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Apr 16, 2013 10:58 pm

Sure, that all seems to make sense - even the hello world snippet! I was taught (decades ago) that as time progressed computer programming languages would all evolve into more human like communications. Sadly that didn't happen and now we seem to be going in reverse. It's true that most programmers roll their eyes at the mere mention of BASIC dialects. I've always supposed that was because the B in BASIC stands for beginners. Daft shame really.
“In the modern age, to call a man unelectable means he cannot be bought”

timrowledge
Posts: 1290
Joined: Mon Oct 29, 2012 8:12 pm
Location: Vancouver Island
Contact: Website

Re: Trying to get an algorithm to run faster on RISC than wh

Wed Apr 17, 2013 5:41 am

The point behind the high level language thing is simply that it allows the programmer to concentrate on the problem and not so much the details of making the code work. It's like controls for modern vehicles; you no longer have to manually adjust the spark timing and fuel-air mixture for your car engine, no longer need to manually pump a little oil as you drive, etc. You can - barring interruptions from cellphones and DVD players and children - concentrate on where you need to go and how to get the car to go there. In the future one may be able to hand over the boring grunt work part of driving and navigating to an automated system too.

I write a depressingly large amount of C code because it has become a lingua-franca with reasonably effective tools to produce reasonably effective programs - for some important purposes. For actually complex things I use Smalltalk because it frees me from a lot of pointless faffing around with allocating and freeing memory, querying types and using error-prone switch or if-then statements, worrying about stuff that is irrelevant to my needs and, oh yes, provides useful tools for coding, debugging, testing, refactoring, analysing and understanding. Is Smalltalk as fast as C or assembler for a trivial loop test? Nope. Does that usually matter in the context of a real application doing something complex? Not often. Which is more useful - a racing Kart that you must do every bit of maintenance for yourself, or a GT estate car that hardly ever needs attention? Depends on your needs, obviously.

Lisp has similar attributes mixed in with the incomprehensible (to me) brackets and syntax. Functional languages take a different approach as to what to automate and what to leave to the code writer. APL was designed by Satan personally as a way to punish IBMers. Java was designed by people that thought it would be fun to see how many people they could get to work standing on their head in manure.

Assembler programming is, as my old friend Bob's song says, "kinda like construction work, with a toothpick for a tool" Lyrics - http://www.lipwalklyrics.com/lyrics/631 ... flame.html Sung performance - http://www.prometheus-music.com/audio/eternalflame.mp3
Making Smalltalk on ARM since 1986; making your Scratch better since 2012

DexOS
Posts: 876
Joined: Wed May 16, 2012 6:32 pm
Contact: Website

Re: Trying to get an algorithm to run faster on RISC than wh

Wed Apr 17, 2013 3:40 pm

But it's very easy to have the best of both worlds: for example, the ease of BASIC, just as portable, with the speed and size of assembly.
First the inc (this is the bit that changes for each platform or OS):
DBasic_L.inc

Code: Select all

;============================================================
; DexBasic is based on an idea by rCX, for a fasm macro basic.
; The ease of basic, with the power of ASM.
;
; Code input by:
;   Dex
;   rCX
;   Steve
;   TonyMac
;
; This include is for x86 32BIT Linux.
;
;============================================================

;=======================================================  ;
;  File header.                                           ;
;=======================================================  ; 
format ELF executable                                     ;
entry  start                                              ;
                                                          ;
;=======================================================  ;
;  CLS.                                                   ;
;=======================================================  ; 
macro CLS                                                 ;
{                                                         ;
local .Done                                               ;
local .a                                                  ;
local .b                                                  ;
                                                          ;
        mov     eax,4                                     ; Print function
        mov     ebx,1                                     ;
        mov     ecx,.a                                    ;
        mov     edx,.b                                    ;
        int     80h                                       ;
        jmp     .Done                                     ;
                                                          ;
.a db 1bh, "[2J",1bh, "[01;01H"                           ; clear screen.
.b = $-.a                                                 ;
.Done:                                                    ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  PRINT.                                                 ;
;=======================================================  ; 
macro PRINT String{                                       ;
local .Done                                               ;
local .a                                                  ;
local .b                                                  ;
                                                          ;
        mov     eax,4                                     ; Print function
        mov     ebx,1                                     ;
        mov     ecx,.a                                    ;
        mov     edx,.b                                    ;
        int     80h                                       ;
        jmp     .Done                                     ;
                                                          ;
.a db String,0xa                                          ;
.b = $-.a                                                 ;
.Done:                                                    ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  LOCATE.                                                ;
;=======================================================  ; 
macro LOCATE col,row                                      ;
{                                                         ;
local .Done                                               ;
local .CursorX                                            ;
local .CursorY                                            ;
local .ColumnRow                                          ;
local .ColumnRowSize                                      ;
        pushad                                            ;
        mov     dh, 10                                    ;
                                                          ;
        mov     ax, row                                   ; AX = character row
        and     ax, 0FFh                                  ;
        div     dh                                        ; divide by 10
        add     ax, "00"                                  ; convert both digits to ASCII
        mov     [.CursorX], ax                            ;
                                                          ;
        mov     ax, col                                   ; AX = character column
        and     ax, 0FFh                                  ;
        div     dh                                        ; divide by 10
        add     ax, "00"                                  ;
        mov     [.CursorY], ax                            ;
                                                          ;
        mov     eax,4                                     ; Print function
        mov     ebx,1                                     ;
        mov     ecx,.ColumnRow                            ;
        mov     edx,.ColumnRowSize                        ;
        int     80h                                       ;
        jmp     .Done                                     ;
                                                          ;
                                                          ;
.ColumnRow      db      1bh, "["                          ; cursor positioning sequence
.CursorX        dw      0                                 ;
                db      ";"                               ;
.CursorY        dw      0                                 ;
                db      "H"                               ;
.ColumnRowSize = $-.ColumnRow                             ;
.Done:                                                    ;
        popad                                             ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  COLOR.                                                 ;
;=======================================================  ; 
macro COLOR NewColor                                      ;
{                                                         ;
local .Done                                               ;
local .Done1                                              ;
local .black                                              ;
local .red                                                ;
local .green                                              ;
local .yellow                                             ;
local .blue                                               ;
local .magenta                                            ;
local .cyan                                               ;
local .white                                              ;
local .Size1                                              ;
local .Size2                                              ;
                                                          ;
        mov     al,NewColor                               ;
        cmp     al,15                                     ;
        ja      .Done                                     ;
        cmp     al,15                                     ;
        jne     @f                                        ;
        mov     ecx,.white                                ;
        jmp     .Done1                                    ;
@@:                                                       ;
        cmp     al,0                                      ;
        jne     @f                                        ;
        mov     ecx,.black                                ;
        jmp     .Done1                                    ;
@@:                                                       ;
        cmp     al,11                                     ;
        jne     @f                                        ;
        mov     ecx,.cyan                                 ;
        jmp     .Done1                                    ;
@@:                                                       ;
        cmp     al,12                                     ;
        jne     @f                                        ;
        mov     ecx,.red                                  ;
        jmp     .Done1                                    ;
@@:                                                       ;
        cmp     al,14                                     ;
        jne     @f                                        ;
        mov     ecx,.yellow                               ;
        jmp     .Done1                                    ;
@@:                                                       ;
        cmp     al,10                                     ;
        jne     @f                                        ;
        mov     ecx,.green                                ;
        jmp     .Done1                                    ;
@@:                                                       ;
        cmp     al,13                                     ;
        jne     @f                                        ;
        mov     ecx,.magenta                              ;
        jmp     .Done1                                    ;
@@:                                                       ;
        cmp     al,9                                      ;
        jne     @f                                        ;
        mov     ecx,.blue                                 ;
        jmp     .Done1                                    ;
@@:                                                       ;
        jmp     .Done                                     ;
.Done1:                                                   ;
        mov     eax,4                                     ; Print function
        mov     ebx,1                                     ;
        mov     edx,.Size2                                ;
        int     80h                                       ;
        jmp     .Done                                     ;
                                                          ;
                                                          ;
.black    db 1bh, "[0;30;49m"                             ;
.red      db 1bh, "[0;31;49m"                             ;
.green    db 1bh, "[0;32;49m"                             ;
.yellow   db 1bh, "[0;33;49m"                             ;
.blue     db 1bh, "[0;34;49m"                             ;
.magenta  db 1bh, "[0;35;49m"                             ;
.cyan     db 1bh, "[0;36;49m"                             ;
.white    db 1bh, "[0;37;49m"                             ;
.Size1    db 1bh, "[0;39;49m"                             ;
.Size2 = $-.Size1                                         ;
.Done:                                                    ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  GOTO.                                                  ;
;=======================================================  ;
Macro GOTO _op1                                           ;
{                                                         ;
        jmp     _op1                                      ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  GOSUB.                                                 ;
;=======================================================  ;
Macro GOSUB _subname                                      ;
{                                                         ;
        call     _subname                                 ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  RETURN.                                                ;
;=======================================================  ;
Macro RETURN                                              ;
{                                                         ;
        ret                                               ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  SLEEP.                                                 ;
;=======================================================  ;
macro SLEEP                                               ;
{                                                         ;
        pushad                                            ;
        mov     eax,3                                     ;
        xor     ebx,ebx                                   ;
        mov     ecx,buffer                                ; ECX = address of the 1-byte buffer
        mov     edx,1                                     ; EDX = number of bytes to read
        int     80h                                       ;
        popad                                             ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  END.                                                   ;
;=======================================================  ; 
macro END                                                 ;
{                                                         ;
local .Done                                               ;
local .restore                                            ;
local .Size2                                              ;
        pushad                                            ;
        mov     ecx,.restore                              ;
        mov     eax,4                                     ; Print function
        mov     ebx,1                                     ;
        mov     edx,.Size2                                ;
        int     80h                                       ;
                                                          ;
        mov     eax,1                                     ; Exit function
        xor     ebx,ebx                                   ;
        int     80h                                       ;
        jmp     .Done                                     ;
                                                          ;
.restore db 1bh, "[0;39;49m"                              ;
.Size2 = $-.restore                                       ;
.Done:                                                    ;
        popad                                             ;
}                                                         ;
                                                          ;
;=======================================================  ;
;  START of Program.                                      ;
;=======================================================  ;
segment readable writeable executable                     ;
buffer rb 1                                               ;
start:                                                    ;         
The program itself stays the same on every platform, apart from which .inc it includes:
Hello_world.asm

Code: Select all

include 'DBasic_L.inc'
CLS
COLOR  11
LOCATE 2,1
PRINT "This app is written in Macro Basic, for Linux "
COLOR  12
LOCATE 2,2
PRINT "With the ease of Basic and the power of ASM "
COLOR  15
LOCATE 2,3
PRINT "It uses the basic commands:"
PRINT " "
PRINT "    CLS"
PRINT "    SCREEN"
PRINT "    COLOR"
PRINT "    LOCATE"
PRINT "    PRINT"
PRINT "    GOSUB"
PRINT "    RETURN"
PRINT "    SLEEP"
PRINT "    END"
PRINT " "
GOSUB TestSub
SLEEP
END

TestSub:
PRINT "  Press any key to quit."
RETURN
You could very easily port it to RISC OS.
It can also be run bare metal on the Pi.
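For anyone wondering what the macros actually send to the terminal: CLS, LOCATE and COLOR each just write an ANSI escape sequence to standard output. A minimal Python sketch of the same byte strings follows. The function names `cls`, `locate` and `color` are made up for illustration, and the 0 to 99 limit in `locate` is an assumption that mirrors the divide-by-10 trick in the macro, which only yields two sensible digits for values up to 99.

```python
import sys

ESC = "\x1b"  # the 1bh byte used throughout DBasic_L.inc

# BASIC colour number -> ANSI foreground code, as listed in the COLOR macro.
ANSI_FOREGROUND = {
    0: 30,   # black
    12: 31,  # red
    10: 32,  # green
    14: 33,  # yellow
    9: 34,   # blue
    13: 35,  # magenta
    11: 36,  # cyan
    15: 37,  # white
}


def cls() -> str:
    """Clear the screen, then home the cursor to row 1, column 1."""
    return f"{ESC}[2J{ESC}[01;01H"


def locate(col: int, row: int) -> str:
    """Cursor positioning: the sequence carries the row first, then the column."""
    for value in (col, row):
        if not 0 <= value <= 99:
            raise ValueError("only two-digit positions are handled in this sketch")
    return f"{ESC}[{row:02d};{col:02d}H"


def color(number: int) -> str:
    """Return the colour sequence, or nothing for numbers the macro skips."""
    code = ANSI_FOREGROUND.get(number)
    if code is None:
        return ""
    return f"{ESC}[0;{code};49m"


if __name__ == "__main__":
    sys.stdout.write(cls() + color(11) + locate(2, 1) + "Hello from the sketch\n")
    sys.stdout.write(f"{ESC}[0;39;49m")  # restore default colours, as END does
```

Porting to another platform then comes down to replacing whatever does the writing, which is the job the per-platform .inc does in the macro version.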

Batteries not included, Some assembly required.

Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Wed Apr 17, 2013 6:24 pm

timrowledge wrote:You no longer have to manually adjust the spark timing and fuel-air mixture for your car engine
So that's what I've been doing wrong.. Hi Tim. I regret having given you the equivalent of a broadside when I last visited. It might please you to know that I did take a very cursory look at Squeak under Raspbian and whilst it's probably not something I'd use in anger it looks like a competent tool for producing 2d games and stuff that will appeal to the younger generation. So.. can I have my cannonballs and grapeshot back please?
“In the modern age, to call a man unelectable means he cannot be bought”

timrowledge
Posts: 1290
Joined: Mon Oct 29, 2012 8:12 pm
Location: Vancouver Island
Contact: Website

Re: Trying to get an algorithm to run faster on RISC than wh

Fri Apr 19, 2013 6:52 pm

Sure, Markodius, you can have your balls back…

If you have another, more powerful, machine you might like to spend a short while trying Squeak on it to get a different view. Right now Squeak on Pi is a bit annoyingly slow for various reasons not worth wittering on about here. I'm working on them as part of the project to speed up Scratch; hopefully significant improvements coming soon-ish.

It's not just for kids or games though. Seaside is a very serious web server written for Squeak. I've worked on a realtime OS written in Squeak, for special purpose (but still small ARM cpu) hardware. eToys (which actually *is* for kids) is a Squeak system, as is a lot of the OLPC code and of course Scratch. YouTube has many videos worth perusing; search for 'squeak tutorial' or 'alan kay demo' for example.
Making Smalltalk on ARM since 1986; making your Scratch better since 2012

Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Sat Apr 20, 2013 1:58 am

timrowledge wrote:Sure, Markodius, you can have your balls back…
Most gracious of you Tim.
“In the modern age, to call a man unelectable means he cannot be bought”

AMcS
Posts: 184
Joined: Sun Jan 06, 2013 11:23 am
Location: Dublin, Ireland

Re: Trying to get an algorithm to run faster on RISC than wh

Sat Apr 20, 2013 12:19 pm

DexOS wrote:But its very easy to have the best of both worlds, example, ease of basic and just as portable and the speed and size of assembly.
First the inc (this is the bit that changes for all platform or os)
DBasic_L.inc


You could very easy port it to riscos.
It also can be run bare metal on the pi
Hi DexOs, would this not require fasm to be ported to RISC OS first - if so is there a plan to do this ? Or can an existing RISC OS assembler be used?

DexOS
Posts: 876
Joined: Wed May 16, 2012 6:32 pm
Contact: Website

Re: Trying to get an algorithm to run faster on RISC than wh

Sat Apr 20, 2013 11:00 pm

AMcS wrote:
Hi DexOs, would this not require fasm to be ported to RISC OS first - if so is there a plan to do this ? Or can an existing RISC OS assembler be used?
Fasmarm is being ported to ARM, including a RISC OS port, but in the meantime you could assemble the code on x86 Linux or Windows. It's also possible to assemble stuff from a web browser, by having fasmarm on a server.
Here's an example of a hello world app for RISC OS that's coded with fasmarm:

Code: Select all

format binary as ''

OS_Write0 = 0x02    ; Define the two system calls
OS_Exit   = 0x11    ; used in this code
                    ;
org 0x8000          ; org (may not be needed?)
use32               ; use 32bit
mov r0,r0           ; Decompression code call
mov r0,r0           ; Self-relocation code call
mov r0,r0           ; Zero initialisation code call
bl  start           ; Program entry call
swi OS_Exit         ; Fall-out trap to force exit
dw  0x40            ; Read-only area size (header)
dw  0x20            ; Read-write area size (code)
dw  0               ; Debug area size
dw  0               ; Zero initialisation size
dw  0               ; Debug type
dw  0x8000          ; Current base of absolute
dw  0               ; Workspace required
dw  32              ; Flag software as 32 bit PCR okay
dw  0               ; Data base address when linked
dw  0               ; Reserved header (should be zero)
dw  0               ; Reserved header (should be zero)
message:
db "Hello World! :-)",0xa ,0
align 4
start:
adr r0,message      ; Pointer to message
swi OS_Write0       ; OS call writes until null byte
mov r0,0            ; Define return code
swi OS_Exit         ; And exit.                               
As you can see, I need to add my own header, as fasmarm does not output the RISC OS file format.
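To make the hand-built header a little more concrete, here is a rough Python sketch that packs the same sixteen words. It rests on stated assumptions: the field meanings are copied from the comments in the listing rather than checked against the RISC OS file format documentation, fasmarm's `dw` is assumed to emit a 32-bit little-endian word, and the names `encode_bl` and `build_header` are invented for this example. The instruction encodings used are the standard ARM ones.

```python
import struct

BASE = 0x8000                     # load address from the `org` line
MOV_R0_R0 = 0xE1A00000            # ARM encoding of `mov r0,r0` (a no-op)
SWI_OS_EXIT = 0xEF000000 | 0x11   # `swi OS_Exit`, with OS_Exit = 0x11 as in the listing


def encode_bl(from_addr: int, to_addr: int) -> int:
    """Encode an always-executed ARM `bl`; the offset is in words, relative to PC + 8."""
    offset_words = (to_addr - (from_addr + 8)) >> 2
    return 0xEB000000 | (offset_words & 0x00FFFFFF)


def build_header(start_addr: int) -> bytes:
    """Pack the five instruction words and eleven data words from the listing."""
    words = [
        MOV_R0_R0, MOV_R0_R0, MOV_R0_R0,     # three unused call slots
        encode_bl(BASE + 0x0C, start_addr),  # bl start
        SWI_OS_EXIT,                         # fall-out trap
        0x40, 0x20, 0, 0, 0,                 # sizes and debug type, as in the listing
        BASE, 0, 32, 0, 0, 0,                # base, workspace, 32-bit flag, reserved
    ]
    return struct.pack("<16I", *words)


MESSAGE = b"Hello World! :-)\n\x00"
HEADER_SIZE = 16 * 4
padded_message_len = (len(MESSAGE) + 3) & ~3   # what `align 4` does
start_addr = BASE + HEADER_SIZE + padded_message_len
header = build_header(start_addr)

print(f"header is {len(header):#x} bytes, start label at {start_addr:#x}")
```

One thing the sketch makes visible is that five instructions plus eleven data words come to 64 bytes, which matches the 0x40 given in the listing as the header size.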
Batteries not included, Some assembly required.

AMcS
Posts: 184
Joined: Sun Jan 06, 2013 11:23 am
Location: Dublin, Ireland

Re: Trying to get an algorithm to run faster on RISC than wh

Mon Apr 22, 2013 7:10 pm

DexOS wrote:
Fasmarm is being ported to arm, including a riscos port, but in the mean time you could assembly the code on x86 linux or windows, its also possible to assembly stuff from a web browser, by having fasmarm on a server.
Heres a example of a hello world app for riscos that's coded with fasmarm

Code: Select all

format binary as ''

OS_Write0 = 0x02    ; Define the two system calls
OS_Exit   = 0x11    ; used in this code
                    ;
org 0x8000          ; org (may not be needed?)
use32               ; use 32bit
mov r0,r0           ; Decompression code call
mov r0,r0           ; Self-relocation code call
mov r0,r0           ; Zero initialisation code call
bl  start           ; Program entry call
swi OS_Exit         ; Fall-out trap to force exit
dw  0x40            ; Read-only area size (header)
dw  0x20            ; Read-write area size (code)
dw  0               ; Debug area size
dw  0               ; Zero initialisation size
dw  0               ; Debug type
dw  0x8000          ; Current base of absolute
dw  0               ; Workspace required
dw  32              ; Flag software as 32 bit PCR okay
dw  0               ; Data base address when linked
dw  0               ; Reserved header (should be zero)
dw  0               ; Reserved header (should be zero)
message:
db "Hello World! :-)",0xa ,0
align 4
start:
adr r0,message      ; Pointer to message
swi OS_Write0       ; OS call writes until null byte
mov r0,0            ; Define return code
swi OS_Exit         ; And exit.
As you can see, I need to add my own header, as FASMARM does not output the RISC OS file format.
Thanks for that DexOS, it certainly looks interesting. There'd be a few differences for me to get used to, but it is sufficiently close to BBC BASIC's ARM assembler that it would not be overly onerous to use FASMARM (yet another programming option - always a good thing...). It may well provide additional functionality if your "BASIC" demo is anything to go by.

Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Mon Apr 22, 2013 10:10 pm

Hang on chaps! Why (short sentences and large pictures please!) would one use fasmarm when one could use native assembler?
“In the modern age, to call a man unelectable means he cannot be bought”


Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Apr 23, 2013 5:41 am

Thanks for answering pygmy_giant but sadly I don't really understand and if I don't understand a one word sentence then it will probably be a waste of time to elucidate. I did however read the Wiki page re Macro which possibly sheds a little light but it's almost definitely not something I'm going to be able to absorb let alone use. Thanks anyway.
“In the modern age, to call a man unelectable means he cannot be bought”

AMcS
Posts: 184
Joined: Sun Jan 06, 2013 11:23 am
Location: Dublin, Ireland

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Apr 23, 2013 8:59 pm

Markodius wrote:Thanks for answering pygmy_giant but sadly I don't really understand and if I don't understand a one word sentence then it will probably be a waste of time to elucidate. I did however read the Wiki page re Macro which possibly sheds a little light but it's almost definitely not something I'm going to be able to absorb let alone use. Thanks anyway.
Let's suppose you wanted to print an "A" in ARM assembler.

You would define the following as a macro:

Code: Select all

MOV R0,#65  \\65 is the ASCII code for the letter "A"
SWI "OS_WriteC"
Now you give the macro a name "printA" - then whenever you need it you put "printA" into your assembly code and the assembler magically transforms it into:

Code: Select all

MOV R0,#65  \\65 is the ASCII code for the letter "A"
SWI "OS_WriteC"
What DexOS was doing was, in effect, stringing together many macros that resemble BASIC, although ultimately the output is still ARM assembler (FASMARM in this case).

If you were using BBC BASIC ARM Assembler you could "form" macros in a different way (using PROC and FN) - the outcome would be the same - a single simple command that "magically" gets transformed into a useful block of ARM code.
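The same "a name expands into a block of code" idea exists in C's preprocessor, so here is a small sketch for anyone more at home in C. It is only an analogy: `PRINT_A`, `PRINT_CHAR`, `out_buf` and `write_char` are invented names, and the buffer stands in for the screen so the result is easy to check. FASMARM and BBC BASIC each have their own macro syntax, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* A small text buffer standing in for the screen. */
typedef struct {
    char   data[64];
    size_t len;
} out_buf;

/* Append one character, keeping the buffer null terminated. */
void write_char(out_buf *out, char c)
{
    assert(out != NULL && out->len < sizeof out->data - 1);
    out->data[out->len++] = c;
    out->data[out->len] = '\0';
}

/* Wherever PRINT_A(x) appears, the preprocessor substitutes the text on
   the right before the compiler proper ever sees it. */
#define PRINT_A(out)        write_char((out), 65)   /* 65 is ASCII 'A' */

/* Macros can also take parameters, which is how a handful of them can
   start to look like a small BASIC-style language. */
#define PRINT_CHAR(out, c)  write_char((out), (c))
```

Writing `PRINT_A(&buf);` in the source is then exactly the same as writing `write_char((&buf), 65);` by hand; the macro only saves typing and makes the intent clearer.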

Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Apr 23, 2013 11:48 pm

Oh right! I DO understand! Thanks for clearing that up AMcS - I was thinking (slowly, after reading the Wiki) that maybe using fasmarm one could compile code for different processor architectures and OS's using a bunch of switches. Trust me to wander off the beaten path and fall off a cliff.
“In the modern age, to call a man unelectable means he cannot be bought”

DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA

Re: Trying to get an algorithm to run faster on RISC than wh

Thu May 16, 2013 3:28 am

Norcroft C is a lot better at optimizing for the ARMv6 than GCC on RISC OS. Yes, Norcroft C is older, though it has been updated, and these updates include very good optimization for the ARMv6.

Also, your implementation uses the C standard library for output. It would probably be better to use a bit of inline assembly to call SWI 0x0002 (OS_Write0) directly to write out your C-style null-terminated string, and to write your own conversion routines. Unfortunately all of the forms of printf() in the C stdlib are very inefficient at best, and the versions in RISC OS are a little slower than most.

Though, as with any OS, always avoid language-provided standard libraries if you want speed. That goes for Linux, Plan 9, BSD, RISC OS, GEOS, QDOS, Windows, etc, etc.
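To make the "write your own conversion routines" suggestion a bit more concrete, here is a minimal sketch of an unsigned-to-decimal routine that avoids printf(). `u32_to_dec` is a made-up name, and the routine only fills a buffer. On RISC OS the resulting null-terminated string could then be handed to OS_Write0; how you issue the SWI from C depends on your compiler, so that part is left out. No claim is made here about how much faster this is than printf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert 'value' to decimal text in 'buf' and null terminate it.
   'buf' must have room for at least 11 bytes (10 digits plus the null).
   Returns the number of digits written. */
size_t u32_to_dec(uint32_t value, char *buf)
{
    char   tmp[10];
    size_t n = 0;
    size_t i;

    assert(buf != NULL);

    /* Peel off digits from the least significant end. */
    do {
        tmp[n++] = (char)('0' + (value % 10u));
        value /= 10u;
    } while (value != 0);

    /* The digits came out backwards, so reverse them into the caller's buffer. */
    for (i = 0; i < n; i++) {
        buf[i] = tmp[n - 1 - i];
    }
    buf[n] = '\0';
    return n;
}
```

For example, `u32_to_dec(1234, buf)` leaves the string "1234" in `buf` and returns 4.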
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers

Markodius
Posts: 134
Joined: Fri Jan 04, 2013 11:14 pm

Re: Trying to get an algorithm to run faster on RISC than wh

Thu May 16, 2013 11:34 pm

Welcome back online DavidS 8-)
“In the modern age, to call a man unelectable means he cannot be bought”

DavidS
Posts: 4334
Joined: Thu Dec 15, 2011 6:39 am
Location: USA

Re: Trying to get an algorithm to run faster on RISC than wh

Tue Sep 24, 2013 10:34 am

OK, I finally took the time to read this thread all of the way through, and here are just some thoughts.

************************************************************************************************


Some time ago I did some experiments with the relative speed of code compiled with the Linux version of GCC (using recompiled RISC OS libs) and Norcroft C. The results, running on RISC OS in all cases, consistently showed Norcroft produced faster code. My notes indicate that on average the test cases ran 1.17 times faster with Norcroft than with Linux GCC. This is all on the RPi, with GCC at -O3.

Unfortunately I forgot to note all of the details. I only noted the results, and naming the programs GccBench0, GccBench1, GccBench2, GccBench3, etc and NorBench0, NorBench1, NorBench2, NorBench3, etc does not tell me what code I used :-( . Maybe at some point I will dig up the code, or maybe not.



AMcS wrote: I'd largely accept that but would, with due caution, point out that only one language on Raspberry Pi allowed it to better the performance of a fairly modern PC (i5) clocked at nearly 4 times faster than the Pi and that was ARM Assembler. The lure is speed - the reason RISC OS is competitive and actually looks good and performs well on Pi is because it is largely assembler and is fairly frugal.

Would we even be having this discussion (or would RISC OS even be on Pi) if that were not the case?

The issues of debugging and maintenance, I believe, could be addressed with appropriate tools and methods. For example the issue with the ARM Hanoi program could possibly have been addressed more easily if there was a convenient way to step through the code, examine/change register values and trace execution. If we leveraged the computer to do the work - fixing ARM code (or indeed ANY language) becomes more practical. [Yes I know we could insert breakpoints with *BREAKSET - but in a program of any length that would get tedious... there has to be an easier way]
Hmm.

And about:
*MemoryI
*MemoryA
*Memory
*BreakSet
*BreakList
*BreakClr
*Continue
*InitStore
*ShowRegs

It does not get much simpler. I guess you could probably hack together a debugger that does single stepping, though using breakpoints is so much better, and could catch these kinds of errors fairly easily. And it is less tedious to use breakpoints than any other method of watching code execute, as other methods require single stepping or interpreter tricks.
AMcS wrote:I did take a very quick look (on Google) for Acorn C/Norcroft compiler switches/options, but couldn't find anything obvious.
Did you look in the C,C++ book (C_C++/PDF) that comes in the documentation with the compiler? Try the section "#pragma directives" under "Extra Features". Other than that I do not know of any optimization switches. Yeah, I just double-checked the command line options and there is nothing there, so the pragmas are all we have.
Markodius wrote: Welcome back online DavidS
Glad to be back.
RPi = The best ARM based RISC OS computer around
More than 95% of posts made from RISC OS on RPi 1B/1B+ computers. Most of the rest from RISC OS on RPi 2B/3B/3B+ computers
