John_Spikowski
Posts: 1557
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA

Re: The Rust debate.

Thu Oct 03, 2019 12:15 pm

Code:

PRINT FORMAT("%0.f",0xFFFF * 0xFFFF),"\n"

pi@RPi4B:~/sbrt/examples $ scriba 32math.sb
4294836225
pi@RPi4B:~/sbrt/examples $ 
What are you expecting as a result?

sal55
Posts: 59
Joined: Sat Sep 21, 2019 7:15 pm

Re: The Rust debate.

Thu Oct 03, 2019 12:20 pm

jahboater wrote:
Thu Oct 03, 2019 11:42 am
sal55 wrote:
Thu Oct 03, 2019 11:29 am
Whereas more modern languages that have ranges of fixed-width types have standardized on widths of 8, 16, 32 and 64 bits, including Rust.
Precisely what C did 20 years ago ...

C99 introduced exact width types and others such as int_fast and int_least.
They may have introduced them, but nobody uses them! At least, they seem to be rare in open-source code I've looked at.
Forget about what they may map onto, it is of no interest to the programmer.
(sure int64_t may be slow on a 32-bit platform, but it works perfectly. There is no need for the code to be aware.)
This is it, they may map to the poorly defined char, short, int, long, long long (notice FIVE types which normally represent FOUR sizes of 8, 16, 32, 64).

'uint64_t' might map to 'unsigned long long int' on Windows, or 'unsigned long int' on Linux. This is important if you want to know whether to use "%lu" or "%llu" when printing its value (or you have to use PRIu64 or whatever). It is little help when trying to call 'sprintf' as a foreign function from another language.

It's also important when your code, for example, might use int64_t*, but you want to call a library function that needs 'long int*' or 'long long int*'. It's still a mess!
CHAR_BIT is fixed at 8 (when did you last see a computer with 6 or 9 bit bytes?)
CHAR_BIT is at least 8 bits; POSIX implementations mandate 8 bits. But there are still special processors that don't have an 8-bit byte, so it either needs emulating, or CHAR_BIT for that device will be 16 or 24 or 32 or 37.
Yes the older types such as long may change between 32 and 64 bit platforms, which is sometimes useful.
When is that? I've just glanced at the ScriptBasic sources (it was mentioned a few posts above) and they seem to use 'long' extensively. Which means that when long is 64 bits, and the hardware is 32 bits, it will now be wasting time doing unnecessary 64-bit ops. (And if it expects long to be 64 bits, it will fail when it's only 32.)

PeterO
Posts: 5073
Joined: Sun Jul 22, 2012 4:14 pm

Re: The Rust debate.

Thu Oct 03, 2019 12:28 pm

sal55 wrote:
Thu Oct 03, 2019 12:20 pm
They may have introduced them, but nobody uses them!
And with opinion stated as fact you lose all credibility, I'm afraid!
PeterO
Discoverer of the PI2 XENON DEATH FLASH!
Interests: C,Python,PIC,Electronics,Ham Radio (G0DZB),1960s British Computers.
"The primary requirement (as we've always seen in your examples) is that the code is readable. " Dougie Lawson

jahboater
Posts: 4769
Joined: Wed Feb 04, 2015 6:38 pm

Re: The Rust debate.

Thu Oct 03, 2019 12:36 pm

sal55 wrote:
Thu Oct 03, 2019 12:20 pm
When is that? I've just glanced at the ScriptBasic sources (it was mentioned a few posts above) and they seem to use 'long' extensively. Which means that when long is 64 bits, and the hardware is 32 bits, it will now be wasting time doing unnecessary 64-bit ops. (And if it expects long to be 64 bits, it will fail when it's only 32.)
No.
On all the platforms around here, sizeof(long) == sizeof(char*)

So in 32-bit mode, long is 32-bits. Try it on the Pi.
In 64-bit mode, long is 64-bits. Try it on your Intel PC or the Pi with Gentoo

There is no hard guarantee however, so use the exact width types.
"long" is best avoided in general. It's a bit easier to read than ptrdiff_t when you are subtracting pointers (you cannot use int of course). But that's about all I can think of.

rpdom
Posts: 15412
Joined: Sun May 06, 2012 5:17 am
Location: Chelmsford, Essex, UK

Re: The Rust debate.

Thu Oct 03, 2019 12:44 pm

jahboater wrote:
Thu Oct 03, 2019 11:42 am
when did you last see a computer with 6 or 9 bit bytes?
About 14 years ago when I was working on one.

jahboater
Posts: 4769
Joined: Wed Feb 04, 2015 6:38 pm

Re: The Rust debate.

Thu Oct 03, 2019 2:10 pm

sal55 wrote:
Thu Oct 03, 2019 12:20 pm
POSIX implementations mandate 8 bits.
Yes, that's what I thought.
And it's nearly 40 years since I used a machine without 8-bit bytes.
:) I am sure Peter O has some working computers in his museum with some other sizes :)

On a related note, I believe there are no longer any known computers where the integer arithmetic is not twos-complement. C++14 and the forthcoming C2x say that signed integer overflow is now a defined operation.
Like your 37-bit bytes, I shan't worry about making my code portable to sign-magnitude or ones complement hardware!

sal55
Posts: 59
Joined: Sat Sep 21, 2019 7:15 pm

Re: The Rust debate.

Thu Oct 03, 2019 2:15 pm

PeterO wrote:
Thu Oct 03, 2019 12:28 pm
sal55 wrote:
Thu Oct 03, 2019 12:20 pm
They may have introduced them, but nobody uses them!
And with opinion stated as fact you lose all credibility, I'm afraid!
PeterO
And you snipped the bit where I qualified that remark. But if I google for "C tutorials", then the first hit doesn't mention those "int32_t" types, only the usual char, short, int, long. Neither did the 2nd. Or the 3rd. Or the 4th. Or a random 5th link.

With the few open source C projects on my PC, Lua doesn't use them (but tries to be c89 compatible); neither does Tiny C; Python mainly does (but also a mix of ordinary types, plus its own types); Sqlite3 (250Kloc) doesn't; the stb_image library mainly doesn't, especially for its API.

My point was, there is a considerable amount of C code and headers and APIs that doesn't use the stdint.h types, and it is necessary to deal with that. Partly that was because many projects used their own schemes to get around lack of those standards (even though stdint.h is at least 20 years old). The fact remains that you will come across int and long and int32_t - will int32_t* be compatible with int* or long*?

(ETA: this still leaves the print-format problem unresolved, especially calling printf-family functions from another language. What format code to use for your genuine (ie. not bolted-on to int or long) int32 and int64 types?)
Last edited by sal55 on Thu Oct 03, 2019 2:35 pm, edited 2 times in total.

sal55
Posts: 59
Joined: Sat Sep 21, 2019 7:15 pm

Re: The Rust debate.

Thu Oct 03, 2019 2:30 pm

jahboater wrote:
Thu Oct 03, 2019 12:36 pm
No.
On all the platforms around here, sizeof(long) == sizeof(char*)

So in 32-bit mode, long is 32-bits. Try it on the Pi.
In 64-bit mode, long is 64-bits. Try it on your Intel PC or the Pi with Gentoo
I've avoided dealing with 'long' for years. But some tests (on the same PC) give the following for sizeof(long):

Code:

         32-bit  64-bit
Linux    4       8
Windows  4       4
So the size changes between 32-bit and 64-bit Linux; and between Windows and 64-bit Linux.

This doesn't change my point that if 'long' in that ScriptBasic app didn't need to be 64 bits, then it will be so unnecessarily on a 64-bit Linux.
There is no hard guarantee however, so use the exact width types.
There is still the risk of coming across a library that takes 'long*', although just a plain 'long' can also be a problem.

jahboater
Posts: 4769
Joined: Wed Feb 04, 2015 6:38 pm

Re: The Rust debate.

Thu Oct 03, 2019 2:40 pm

Sal55,

The common memory model for 64-bits is LP64 - that is Longs and Pointers are 64-bits.
For 32-bit mode, the memory model should be ILP32 that is Integers, Longs and Pointers are all 32-bits.

Windows uses LLP64 which is Long Long and Pointers are 64-bits
Long remains 32-bits the same as int. Pretty daft but it's a Windows issue from long ago (because they had a very commonly used #define from the 32-bit days). I am sure int64_t etc work as expected for Windows.

Linux sensibly does the right thing and uses LP64.

If this sort of thing is important for you then the compiler defines __LP64__ when it's in effect.

jahboater
Posts: 4769
Joined: Wed Feb 04, 2015 6:38 pm

Re: The Rust debate.

Thu Oct 03, 2019 2:52 pm

plugwash,
plugwash wrote:
Thu Oct 03, 2019 12:00 pm
Unfortunately the C99 typedefs don't fully solve the problem because of C's boneheaded promotion and overflow rules, consider for example.
I thought that was to accommodate processors which cannot do arithmetic on types smaller than int - like the Pi's ARM CPUs. The Pi can load a byte or half word (16 bits) with ldrb or ldrh, but after that, arithmetic is done with full 32-bit wide registers. Just like the C promotion model.

Not needed on x86 of course, which can fully do arithmetic on 8 or 16 bit registers, including setting the flags correctly.
However if int is 32 bits then this code has undefined behavior.
I thought all unsigned arithmetic was defined. I don't have my copy of C18 to hand which has a definitive list of all the undefined behaviors in C.

jahboater
Posts: 4769
Joined: Wed Feb 04, 2015 6:38 pm

Re: The Rust debate.

Thu Oct 03, 2019 3:01 pm

Sal55,
sal55 wrote:
Thu Oct 03, 2019 2:15 pm
(ETA: this still leaves the print-format problem unresolved, especially calling printf-family functions from another language. What format code to use for your genuine (ie. not bolted-on to int or long) int32 and int64 types?)
You also have the problem of 64-bit literals.
In 64-bit modes the suffix is L, whereas in 32-bit mode, and on Windows, it's probably LL.
In C of course it's never an issue. For a 64-bit literal, just use INT64_C(123456789) and it will always be correct.

I can see your point about calling printf from another language.
Not something I have done; it's easier to write everything in C.
Or I suppose any sensible host language will have decent formatted printing, why do you have to call C?
Last edited by jahboater on Thu Oct 03, 2019 3:33 pm, edited 2 times in total.

plugwash
Forum Moderator
Posts: 3463
Joined: Wed Dec 28, 2011 11:45 pm

Re: The Rust debate.

Thu Oct 03, 2019 3:06 pm

jahboater wrote:
Thu Oct 03, 2019 2:52 pm
I thought all unsigned arithmetic was defined.
The problem is if int is larger than 16 bits then the code snippet isn't unsigned arithmetic, it's conversion to int, followed by signed arithmetic, followed by conversion back to uint16_t .

jahboater
Posts: 4769
Joined: Wed Feb 04, 2015 6:38 pm

Re: The Rust debate.

Thu Oct 03, 2019 3:27 pm

plugwash wrote:
Thu Oct 03, 2019 3:06 pm
jahboater wrote:
Thu Oct 03, 2019 2:52 pm
I thought all unsigned arithmetic was defined.
The problem is if int is larger than 16 bits then the code snippet isn't unsigned arithmetic, it's conversion to int, followed by signed arithmetic, followed by conversion back to uint16_t .
OK, if the original operand fits in a signed int then that's what it is promoted to.
Even though it would also fit in an unsigned int which would be more sensible!

It won't be undefined behavior for very long. Two's complement is now deemed to be universal.

Heater
Posts: 13660
Joined: Tue Jul 17, 2012 3:02 pm

Re: The Rust debate.

Thu Oct 03, 2019 3:38 pm

This discussion of what C does where makes me think that the Rust folks have got a good idea!

It underlines the fact that a group of people cannot create bug-free C code. Most of them only have a vague idea how it works, and even if they do, it's so easy to make mistakes or miss those UBs. That's before we even start thinking about memory and thread safety.

Don't get me wrong, I love C, warts and all. Even if it is more of a random number generator than a high level programming language. For all kind of reasons I won't go into just yet.

Is it time for a "C Debate" thread?

Aside: I'm not sure of course but surely the fact that 16 bit ints get promoted to 32 bit whilst doing the arithmetic has no effect on the final result. I would expect that promotion to be unobservable. Or are you really saying that according to the standard that 16 bit by 16 bit multiply with 16 bit result does not have to produce the correct result? Is it really UB?
Memory in C++ is a leaky abstraction .

jcyr
Posts: 446
Joined: Sun Apr 23, 2017 1:31 pm
Location: Atlanta

Re: The Rust debate.

Thu Oct 03, 2019 3:48 pm

Heater wrote:
Thu Oct 03, 2019 3:38 pm
Don't get me wrong, I love C, warts and all. Even if it is more of a random number generator than a high level programming language.
I don't think C is considered a high level language anymore. In fact I've recently seen it referred to as the 'new assembler'! That analogy seems accurate since the effective use of C requires a fair bit of awareness of the underlying architecture.
It's um...uh...well it's kinda like...and it's got a bit of...

Heater
Posts: 13660
Joined: Tue Jul 17, 2012 3:02 pm

Re: The Rust debate.

Thu Oct 03, 2019 4:05 pm

C has been referred to as "portable assembler" and such for as long as I can remember. Which is longer than I can remember, err, if you see what I mean.

I hold that up as praise rather than damnation. I think C is perfect for what it is supposed to be. No more no less.

paddyg
Posts: 2394
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: The Rust debate.

Thu Oct 03, 2019 5:01 pm

Anyway. Returning to the subject of this thread: after pondering why Rust bundles so much into its executables I read this post which gives a good justification (to avoid dependency issues etc) and also what to do if you really need to make it smaller. By adding additional flags to the cargo command the file drops from 2.5MB to 15k (10k after strip)

Compiling very large files is still going to be an issue and take a long time, and I think compile time is a problem generally for Rust, that the community is aware of and is trying to improve. (@sal55 I noticed nim on your comparison sheet - have you done anything serious with that?)
Last edited by paddyg on Thu Oct 03, 2019 5:02 pm, edited 1 time in total.
also https://groups.google.com/forum/?hl=en-GB&fromgroups=#!forum/pi3d

sal55
Posts: 59
Joined: Sat Sep 21, 2019 7:15 pm

Re: The Rust debate.

Thu Oct 03, 2019 5:02 pm

jahboater wrote:
Thu Oct 03, 2019 3:27 pm
plugwash wrote:
Thu Oct 03, 2019 3:06 pm
jahboater wrote:
Thu Oct 03, 2019 2:52 pm
I thought all unsigned arithmetic was defined.
The problem is if int is larger than 16 bits then the code snippet isn't unsigned arithmetic, it's conversion to int, followed by signed arithmetic, followed by conversion back to uint16_t .
OK, if the original operand fits in a signed int then that's what it is promoted to.
Even though it would also fit in an unsigned int which would be more sensible!
Rust has its own odd behaviours:

Code:

    let a:u8=255;
    let b:u8=1;
Here, a+b and a+1 both give a result of 0, even if assigning the result to a wider variable. In C, everything is promoted to 'int' (typically 32 bits), the result is 256 if using it at u16 or wider, or 0 if stored back into u8. But then, you get the same problem adding 0xFFFFFFFF and 1, if wanting to use a 64-bit result of 0x100000000.

(This is in Rust release mode; in debug mode it will fail with an overflow error.)

It seems to me that this makes Rust lower level than C, as well as much stricter, as every expression will be of exactly 8, 16, 32 or 64 bits, and both operands of a binary op will always be the same width.

The type of a literal constant like 1 seems to be adjusted to that of the other operand, so can be u8, u32 or u16, unless both are constants, when the wider type is used. However, expressions such as 2000000000+2000000000 or 1<<62 overflow; you have to use 1u64<<62, as the width of a literal seems to be capped at u32.

So still a little messy. (My own languages are 64-bits, since I thought all hardware was now, and those cans are kicked down the road far enough they will give the expected results on all these examples without needing to do anything special.)

Getting back to C: if you were to draw up a chart of the 64 combinations of i8/i16/i32/i64 and u8/u16/u32/u64, as to whether the binary operation is done as signed or unsigned, then the results will not have the regular pattern that you might expect, partly due to the discontinuity between 32-bit and 64-bit types. I think Rust solves this at least, by not allowing such mixed arithmetic!
It won't be undefined behavior for very long. Two's complement is now deemed to be universal.
Too many high-end C compilers rely on undefined behaviour for them to be able to do their optimisations.

John_Spikowski
Posts: 1557
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA

Re: The Rust debate.

Thu Oct 03, 2019 5:13 pm

For me the two biggest reasons I use C are portability and the extensive code / library base that comes standard with Linux essential development tools.

paddyg
Posts: 2394
Joined: Sat Jan 28, 2012 11:57 am
Location: UK

Re: The Rust debate.

Thu Oct 03, 2019 5:17 pm

@sal55, Rust might have a slight inconsistency there but

Code:

let a:u64 = 1 << 62;
works OK on my laptop

jahboater
Posts: 4769
Joined: Wed Feb 04, 2015 6:38 pm

Re: The Rust debate.

Thu Oct 03, 2019 5:19 pm

sal55 wrote:
Thu Oct 03, 2019 5:02 pm
Too many high-end C compilers rely on undefined behaviour for them to be able to do their optimisations.
What happens is this:
Undefined behavior cannot happen in a correct C program.
Therefore if a compiler can detect the UB, it's free to delete the code.
This is commonly seen with badly designed a-priori checks for signed overflow.

if( a + b < a ) printf("overflow!");

The compiler will simply delete the whole thing, since a + b can never be less than a in a correct program.

John_Spikowski
Posts: 1557
Joined: Wed Apr 03, 2019 5:53 pm
Location: Anacortes, WA USA

Re: The Rust debate.

Thu Oct 03, 2019 5:33 pm

Sure it can. A and B being negative numbers then added then compared.

Heater
Posts: 13660
Joined: Tue Jul 17, 2012 3:02 pm

Re: The Rust debate.

Thu Oct 03, 2019 5:45 pm

sal55,
Rust has its own odd behaviours:
Maybe. But not the one you are incorrectly pointing out below.
Here, a+b and a+1 both give a result of 0, even if assigning the result to a wider variable.
No, they don't. And you can't assign the result to a wider variable:

Code:

    let a: u8 = 255;
    let b: u8 = 1;
    let c: u32 = a + b;
    print!("{}", c);
Results in:

Code:

$ cargo run --release
   Compiling test v0.1.0 (/home/pi/test)
error[E0308]: mismatched types
   --> src/main.rs:102:18
    |
102 |     let c: u32 = a + b;
    |                  ^^^^^
    |                  |
    |                  expected u32, found u8
    |                  help: you can convert an `u8` to `u32`: `(a + b).into()`

error: aborting due to previous error
But you can do this:

Code:

    let a: u8 = 255;
    let b: u8 = 1;
    let c: u32 = (a + b).into();        // Gives 0    
    println!("{}", c);
    let c: u32 = a as u32 + b as u32;   // Gives 256    
    println!("{}", c);
You just have to say what you want.
(This is in Rust release mode; in debug mode it will fail with an overflow error.)
You can have overflow checking in any build mode. Configure it in Cargo.toml or on the rustc command options:

Code:

[profile.release]
overflow-checks = false   # the release default; set to true to keep the checks
So still a little messy.
Not messy at all. As you see.
Too many high-end C compilers rely on undefined behaviour for them to be able to do their optimisations.
I don't believe so. One is expected not to use UB.

But if one is using UB, the optimizer is of course at liberty to do what it likes, like deleting all your code and returning any random numbers, crashing your program or whatever.

If you are lucky the compiler will warn you of UBs. If it can detect them.

sal55
Posts: 59
Joined: Sat Sep 21, 2019 7:15 pm

Re: The Rust debate.

Thu Oct 03, 2019 6:07 pm

paddyg wrote:
Thu Oct 03, 2019 5:17 pm
@sal55, Rust might have a slight inconsistency there but

Code:

let a:u64 = 1 << 62;
works OK on my laptop
I used:

Code:

println!("{}", 1<<62);
for my examples, to avoid influencing the result.

sal55
Posts: 59
Joined: Sat Sep 21, 2019 7:15 pm

Re: The Rust debate.

Thu Oct 03, 2019 6:27 pm

Heater wrote:
Thu Oct 03, 2019 5:45 pm
Here, a+b and a+1 both give a result of 0, even if assigning the result to a wider variable.
No, they don't.

Code:

fn main() {
    let a=255 as u8;
    let b=1 as u8;
    let c=255 as u16;
    let d=1 as u16;

    println!("{}",a+b);
    println!("{}",c+d);
}
This gives results of 0 and 256, even though both calculations are 255+1.
But you can do this:

Code:

    let a: u8 = 255;
    let b: u8 = 1;
    let c: u32 = (a + b).into();        // Gives 0    
    println!("{}", c);
    let c: u32 = a as u32 + b as u32;   // Gives 256    
    println!("{}", c);
You just have to say what you want.

Not messy at all. As you see.
Well, it is a bit messy and fiddly. It used to be just in assembly that you had to specify one of ADD AL,BL, ADD AX,BX, ADD EAX,EBX, or ADD RAX,RBX; you don't expect that in an HLL!

You don't want to worry about intermediate overflows, not until a result has to fit into a specific width (or you are overflowing the machine word size anyway).

ETA: just noticed a couple of anti-patterns in your example, first, the re-use of the 'c' identifier (apparently the same identifier can be re-used in the same scope for different purposes and even with different types).

Second, this one:

Code:

   let c: T = a as T + b as T;
Notice the type T occurs 3 times. If the first T is changed here, the others must be changed too, with possible changes of behaviour. (I'm interested in language design; I notice such things. Not that I will ever use Rust as it seems like trying to code with both hands tied behind your back, even if compilation was blazing fast.)
