This is writing sizeof(char) (== 1 almost everywhere) zero to address zero. It is not using a NULL macro or other predefined symbol.
In the real world, this would generally write a byte to address 0000:0000, leading to UB because it would fuck up the divide-by-zero IV.
PS: I used Borland C++ 3.1, Microsoft C++ 3.x and 4.5x, Watcom, and early GNU.
https://c-faq.com/null/null2.html
https://c-faq.com/null/machexamp.html
Actual ways to do what you want to do are described in
https://c-faq.com/null/accessloc0.html
but technically speaking the pointer with a constant zero assigned to it _is_ a null pointer (which can be implemented as whatever bit pattern), independent of the preprocessor macro.
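To make that concrete, here is a minimal sketch. The function name is made up for illustration. It checks that a pointer initialised from the constant 0 compares equal to NULL, whatever bit pattern the implementation uses internally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a pointer initialised from the
   integer constant 0 behaves as a null pointer. The C standard requires
   this even if the null pointer is not all-bits-zero in memory. */
int zero_constant_is_null(void)
{
    char *p = 0;     /* constant 0 in a pointer context */
    char *q = NULL;  /* the macro spelling of the same thing */

    return p == q && !p;
}
```

Nothing here dereferences the pointer, so there is no undefined behaviour involved; it only shows the conversion rule.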
Here in godbolt, clang compiling C simply deletes the code in the function from the null pointer dereference onward.
https://godbolt.org/z/9aqWPazsP
> This is writing sizeof(char) (== 1 almost everywhere)
1 everywhere. sizeof's unit is "how many chars". For instance, there was a Cray machine that could only access 64-bit words. sizeof(char) is still 1, with 64-bit chars.
> zero to address zero. It is not using a NULL macro or other predefined symbol.
NULL is commonly defined as the literal 0 (C implementations may also define it as (void *)0).
sizeof char is 1 by definition everywhere.
/pedantic
Parentheses are required around char because it's a type.
/pedantic
sizeof is an operator in C, and applied to an expression it does not need parentheses any more than the pointer operator * does (a type name such as char does need them). It is true that programmers frequently think of it as a function and use parentheses.
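For anyone following the sizeof back-and-forth, a small sketch of both spellings. The function names are only for illustration. A type name needs parentheses, an expression does not, and either way the result for char is 1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Type name operand: the parentheses are part of the syntax. */
size_t size_of_char_type(void)
{
    return sizeof(char);
}

/* Expression operand: no parentheses needed. */
size_t size_of_char_object(void)
{
    char c = 'x';
    return sizeof c;
}

/* The number of bits in that one "char unit" is what varies. */
int bits_per_char(void)
{
    return CHAR_BIT;
}
```

So "1 everywhere" refers to sizeof(char), while CHAR_BIT is the part that can differ between machines.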
-----
unless initial property is start of dynamic operation, in which case, holding almost anywhere begins at the first operation after the start of the dynamic operation. process / lambda / epsilon calculi is just symbolic math. address 0 static, everything else dynamic.
per math, dimension N is static, to be able to "change things up" in dimension N, need to be almost everywhere higher than dimension N. Edge cases are weird in any dimension. Guess why logicians just do the equivalent of C's !0
(cast classic logic) A=1 (cast boolean logic) B=0
C statement !(!B == A) holds everywhere, and almost everywhere depends on how you read the C spec to interpret A & B.
My first kernel was 1.0.9 released alongside Slackware 2.0, offering initial support for IDE CD-ROM drives and experimental support for ELF files, by the way.
The literal 0 is treated specially, so this could indeed be one of those 'turns into a weird bit pattern NULL pointers', if such a thing existed in the wild anymore.
But you're correct in that there probably haven't been any since the turn of the century or whenever the last Univac mainframes got turned off.
execl takes a variable-length, null-pointer-terminated list of character pointer arguments, and is correctly called like this:
execl("/bin/sh", "sh", "-c", "date", (char *)0);
Because execl is a variadic function, it cannot take advantage of a prototype to tell the compiler that one of its arguments needs to be treated as a pointer context.
But yes, the interrupt table was my first thought when reading the headline.
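On the execl point above, a sketch of why the sentinel has to be a real pointer. count_args is a hypothetical stand-in, not a library function. It walks its variadic arguments the way an execl-style function would, reading each one back as a char pointer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts string arguments up to a null-pointer
   sentinel, using the same calling convention as execl's argument list. */
int count_args(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        /* Each variadic argument is fetched as a char pointer, so the
           caller has to pass a pointer; a bare 0 would be passed as int. */
        s = va_arg(ap, char *);
    }
    va_end(ap);
    return n;
}
```

Called as count_args("sh", "-c", "date", (char *)0), this counts three strings. Without the cast, the last argument would be an int, which need not have the same size or representation as a null char pointer.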
A byte is CHAR_BIT bits, where CHAR_BIT >= 8. (It's exactly 8 on most implementations; DSPs are the most common exception).
short and int are both required to be at least 16 bits wide. It's possible for int to be 1 byte (sizeof (int) == 1), but only if CHAR_BIT >= 16.
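A quick way to see those minimums on a given implementation is sketched below. The function names are made up, and note that sizeof times CHAR_BIT counts storage bits, which could in principle include padding bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Storage size of short in bits (an upper bound on its width). */
size_t storage_bits_short(void)
{
    return sizeof(short) * CHAR_BIT;
}

/* Storage size of int in bits (an upper bound on its width). */
size_t storage_bits_int(void)
{
    return sizeof(int) * CHAR_BIT;
}
```

On a typical desktop this gives 16 and 32, but only the "at least 16" part is guaranteed.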
If I'm being pedantic, I might add something like
#include <limits.h>
#if CHAR_BIT != 8
#error "This code assumes 8-bit char"
#endif
But realistically, if I'm using headers defined by either POSIX or Windows, that's probably enough of a guarantee. (Though I'd still use CHAR_BIT rather than 8 to refer to the number of bits in a byte.)
As for why it's address 0, well, it has to go somewhere, every machine has a CPU so everyone needs an interrupt table even if they don't have much memory. And when memory was precious there was no sense wasting even one byte of it; 0 was a real address on your physical memory chip, so why not use it just like any other?
(The fact that it's "address 0" for "division by 0" is just coincidence as far as I can see; division by 0 just happens to be the first kind of possible CPU interrupt. Perhaps it was the most common one?)
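For reference, the address arithmetic behind this is sketched below, assuming real-mode x86. The function names are made up. Each interrupt vector entry is 4 bytes (a 16-bit offset followed by a 16-bit segment), so vector 0, the divide error, lands at linear address 0.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Linear address of the table entry for a given interrupt vector,
   assuming the table sits at 0000:0000 with 4-byte entries. */
uint32_t ivt_entry_address(uint8_t vector)
{
    return linear_address(0x0000, (uint16_t)(vector * 4u));
}
```

This only computes addresses; it does not touch memory, so it is safe to run on a modern OS.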
From the numerous responses here, it's clear that people interpret my question as about how the hardware itself works, which isn't at all what I was asking about; I'm aware of how stuff like this works at the assembly level, but my understanding was that in C and C++, trying to write arbitrarily to "special" addresses like that would be considered undefined behavior (often resulting in segfaults). When I read the comment I responded to above, it surprised me, so I wanted to check whether I understood what was said correctly. It's honestly kind of confusing to me that so many people seem very upset by the idea that a stranger on the internet might have a misconception about how hardware abstractions are exposed via compiled code to the point that they feel the need to explain in detail how hardware works but not actually answer the question I asked.
DEC provided the necessary hardware MMU to do actual real-time multi-processing/multi-user access in a feasible, practical manner.
They're not saying this is, like, a portable standard way to handle division by zero in C++. You're right that it would be undefined behaviour under the standard (but a C++ compiler for real-mode x86 would be expected to support it, at least implicitly; obviously this specific case is not particularly useful, but C++ is used in embedded settings and setting a custom interrupt handler is something its users want and expect).
A decent, well-behaved language would do some kind of structured error handling on divide by zero, like throwing an exception. IMO that includes any C++ compiler worth bothering with (though again the standard makes it undefined behaviour so it's possible that some compilers don't). But, the way the runtime of such a decent C++ compiler would actually implement that would be by setting up an interrupt handler for the divide by zero interrupt (that would contain code to construct the exception etc.), and by performing this write to address 0 you're overwriting (the pointer to) that interrupt handler. So, this line of code would cause your program to behave (almost certainly) badly on the next division by zero, even if you were using a well-behaved C++ compiler that normally handled division by zero gracefully.
(OTOH with a maliciously pedantic C++ compiler that division by zero would already be undefined behaviour, so in practice, since most C++ compilers tend to be maliciously pedantic, you might be no worse off than you were before that line).
The original post you replied to was just talking about the somewhat interesting details of what would actually happen because of the quirks of what these addresses are used for on that hardware (e.g. the fact that address 0 is supposed to contain a pointer to the handler, so by setting it to 0 you cause the CPU to start executing the interrupt handler table as code, is kind of interesting - not as a point about C++, but as a point about funny emergent behaviour of hardware), not about what this is specified as doing or the normal way of doing things in C++. I don't know why you were downvoted.
What got missed, though, is that there has to be an "unused"/"reserved" bit(s) space in order for things to run without requiring additional specific hardware operations.
Modern CPUs with virtual memory mean the question is a lot more complicated. Every process in a modern OS gets its own address space, so you can write to 0 but it could go anywhere (even virtualized to disk), and all the actual hardware is not directly accessible (you must go through the OS).
I'm not sure I'd call this ability "useful" except if you're writing an operating system. This is a vast simplification, but when your computer boots it's effectively in a mode that allows reading/writing to anywhere. The OS kernel has direct access to all the hardware and then it limits access when running user processes.
The address can be changed with the LIDT instruction and operating systems nowadays will just put it wherever, but for backward compatibility it is expected to still be at 0000:0000 (not sure how this is handled nowadays in UEFI, but it should still be possible to set it up that way).
And yes, some addresses are special. (AFAIK, on all current mainstream architectures.) This is the expected way to set those signal handlers, output (and input) data, configure devices, etc.
That said, there are some gotchas on using specific addresses in C. AFAIK none apply to x86, but it's something you usually do in assembly.
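One such gotcha, as a hedged sketch: when you build a pointer from a raw address in C, you generally want it volatile so the compiler does not drop or reorder the access. poke8 and peek8 are hypothetical names, and the integer-to-pointer conversion is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store one byte at a raw address. The volatile
   qualifier tells the compiler the store has to actually happen. */
void poke8(uintptr_t address, uint8_t value)
{
    volatile uint8_t *reg = (volatile uint8_t *)address;
    *reg = value;
}

/* Hypothetical helper: load one byte from a raw address. */
uint8_t peek8(uintptr_t address)
{
    const volatile uint8_t *reg = (const volatile uint8_t *)address;
    return *reg;
}
```

On a hosted OS you can only try this against the address of an ordinary variable, e.g. poke8((uintptr_t)&cell, 0x5A); on bare metal the address would come from the hardware manual instead.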