<https://www.rfc-editor.org/old/EOLstory.txt>
Note this does not apply to file formats (except for RFCs).
And for what it's worth, the actual C standard library tends to be used fairly rarely, especially if you consider the malloc/free interface to be part of the system library rather than the C standard library. The C stdio functionality, for example, is extremely underpowered compared to the capabilities of all major operating systems' I/O libraries, and so most applications (even those written entirely in C) will choose to avoid the C standard library and instead use the more direct primitives of the system API layer.
Runtime libraries for C/C++ provide two general sets of stuff: the stuff mandated for the Standard C and Standard C++ libraries, and the stuff that is needed by the basic mechanics of the language.
The former is everything from abort() to wscanf(). The latter is a bunch of internal functions, calls to which the compiler inserts in order to do stuff. This is basically the split nowadays between UCRT and VCRUNTIME.
In the days of programming targeting the 80486SX without an 80487 present, for instance, every piece of floating-point arithmetic was not a machine instruction but a call to a runtime library routine that did the floating-point operation longhand using non-FPU instructions. Other runtime functionality over the years has included doing 32-bit or 64-bit integer arithmetic on 16-bit and 32-bit architectures where this was not a native word size, functions to do stack checking in the function perilogue, and functions to do C++ run-time type checking and exception processing.
This pattern is followed by other (compiled) programming languages. Naturally, the programming languages do not necessarily have any relation to the Standard C or Standard C++ libraries, nor do they generate code that needs the same helper functions for stuff as C/C++ code does. (But the situation is complicated by the POSIX API and the old C language bindings for the MS-DOS system call API, some of which another programming language might also allow program code to use.)
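To make the helper-call idea concrete, a hedged sketch: a plain 64-bit multiplication in C which, on a 32-bit target without a native 64-bit multiply, a compiler typically lowers to a call into its runtime support library (GCC names this particular helper __muldi3, in libgcc; other compilers use their own names).

    /* 64-bit multiply in source code; on some 32-bit targets the
       compiler emits a call to a runtime helper such as libgcc's
       __muldi3 instead of a single machine instruction. */
    unsigned long long scale(unsigned long long a, unsigned long long b)
    {
        return a * b;
    }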
For example, the C runtime has a notion of what a "string" is: its binary layout in memory and the conventions around it (e.g. an array of UTF-8 bytes terminated by a NUL).
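A minimal sketch of that convention, assuming nothing beyond standard C: the length is not stored anywhere, so it has to be rediscovered by scanning for the terminator, which is all strlen() does.

    #include <stddef.h>

    /* The C string convention: a contiguous byte array whose end is
       marked by a 0 byte. Equivalent in spirit to strlen(). */
    size_t my_strlen(const char *s)
    {
        const char *p = s;
        while (*p != '\0')
            p++;
        return (size_t)(p - s);
    }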
A runtime can be very thin or very complex. The .NET and Java runtimes are massive by comparison, to the point that they generally JIT the intermediate language to produce executable code (whether ahead of time or on the fly). Go's runtime has its own notion of threading built on top of the system notion of threads.
A self-contained static binary embeds any runtime implementations it needs into its own binary so it is still using runtime facilities but needs no external libraries.
A standalone or "bare" program can mean one that is built using only syscall primitives. Of course that can be taken further: you can build a true baremetal program that is designed to be copied into memory by the bootloader so it runs without a kernel or OS underneath it. This is, after all, what an OS kernel is: just code built such that the bootloader can jump to a fixed (or designated in metadata) address, handing off a pointer to info about the hardware (such as a DeviceTree) in memory and that's it.
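For a concrete sense of the "only syscall primitives" case, here is a minimal sketch for x86-64 Linux specifically; the syscall numbers (1 = write, 60 = exit) and register conventions are Linux ABI details, and -nostdlib keeps any runtime from being linked in.

    /* build: cc -nostdlib -static -o bare bare.c   (x86-64 Linux only) */

    static long sys_write(int fd, const void *buf, unsigned long len)
    {
        long ret;
        __asm__ volatile ("syscall"
                          : "=a"(ret)
                          : "0"(1L), "D"((long)fd), "S"(buf), "d"(len)
                          : "rcx", "r11", "memory");   /* 1 = SYS_write */
        return ret;
    }

    static void sys_exit(int code)
    {
        __asm__ volatile ("syscall"
                          :
                          : "a"(60L), "D"((long)code)  /* 60 = SYS_exit */
                          : "rcx", "r11");
        for (;;) {}   /* not reached */
    }

    void _start(void)   /* no C runtime, so no main(): the linker's default entry */
    {
        sys_write(1, "hello\n", 6);
        sys_exit(0);
    }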
In the early PC days, BIOS was basically a set of functions built into the hardware (or, more often, flashed onto EEPROM). More or less a minimal sort of runtime + device drivers that knew how to read keyboard input, print characters to the screen, etc.
Almost everything is built on abstractions. In modern systems EFI or equivalents is a form of runtime + device drivers for early boot and the kernel. The kernel forms that for userspace. And a userspace language runtime can be something like a mini-OS for the code it runs. Going the other direction CPUs themselves are much more like a collection of networked PCs than you might expect.
It wasn't until fairly recently that the C runtime was stably shipped with Windows. Previously you had to install the correct version of the C library alongside your application.
Which is called from what, if not C? Does Windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
And you can of course use non-C languages to call the Win32 API, or even call it directly from assembly code.
Is that a supported/official API though? On Linux you "can" put your arguments in registers and trigger the system call interrupt directly, and I think Go programs even do this, but it's not the official interface and they reserve the right to break your program in future updates, at least in theory.
Of course, the Linux solution results in some weirdness, especially because specs like POSIX cover the C API, not the syscall ABI. setuid() at the libc layer is specced as changing the UID for all threads in a process. The Linux setuid() syscall only changes the current thread[1], and it's up to the C library to do some absolute magic to then propagate that to all other threads. Which made things difficult for things not using the C library, like Go (https://github.com/golang/go/issues/1435). But that's still not an argument that the supported interface is the C library - the kernel advertises the interface it exposes via the syscall ABI, and will retain that functionality, and if you want POSIX compatibility then you get it from somewhere else.
[1] In Linux, a thread is just a very slightly special case of a process
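A hedged sketch of that distinction (Linux-specific, and only meaningful in a process that starts with the privileges to change its UID):

    /* Linux only. setuid() via glibc changes the UID of the whole
       process (glibc broadcasts the change to every thread
       internally); the raw syscall changes only the calling thread. */
    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/syscall.h>

    void drop_privileges_process_wide(uid_t uid)
    {
        setuid(uid);               /* glibc wrapper: all threads */
    }

    void drop_privileges_this_thread_only(uid_t uid)
    {
        syscall(SYS_setuid, uid);  /* kernel: current thread only */
    }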
For instance, Delphi had a period of popularity for Windows application development, and AFAIK it has always used its own runtime library which is completely independent of the C runtime.
Go does not trigger low-level system call interrupts on Windows. (It does that on Linux, but Windows syscall numbers are not stable even across minor Windows updates, so if Go did that, its Windows binaries would be incredibly fragile.)
On Windows NT, Go uses the userspace wrappers provided in Windows system libraries such as NTDLL.DLL and KERNEL32.DLL. But those too are entirely separate from the C runtime.
Basically on Linux the syscalls are the equivalent of Win32 except much narrower in scope.
But it is sometimes required to do things properly.
The Win32 API doesn't even use the "C" calling convention. C is just another language to Windows, and the standard C library is a cross-platform library for C. You could also write C code on classic Mac OS, and it had its own API as well, but styled more for Pascal.
The OS and C being closely related is not universal across all operating systems, it's just a Unix thing.
A prominent example is Delphi[1]. At work our primary application is a 20 year old Delphi Win32 application, which we ship new features in weekly.
Delphi does not rely on the C runtime, instead having its own system library which interfaces with the Win32 API that gets compiled in.
In the UNIX world there is this strange notion that the C language is somehow special and that the OS itself should provide its runtime (a single, global version of it) for every program, even those written in other languages, to use when interacting with the OS, but... it's just silly.
> Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
No it doesn't. That logic belongs in the OS-specific layer in the runtimes/standard libraries of the implementations of the different programming languages. They may decide to re-use each other's libraries, of course, or they may decide not to.
Well sure but you have to define it somewhere. At some point there's an interface where something that's part of the application asks something that's part of the OS to do something, and that interface had better be stable and well-specified. If you really want you can use a different interface from your C ABI, sure, but given that, like it or not, most of windows is written in C (or in C++ but using C linkage between component boundaries), what do you gain?
It's defined, and well-specified.
> your C ABI
Which is a C ABI. Borland's Turbo C and C++Builder used a different ABI than Microsoft's C compiler did. GCC for Windows used to use a third, entirely different ABI as well. The ABI is not part of the language definition, you see.
> most of windows is written in C
And compiled with a very specific C compiler that used a particular ABI. That only means that you need to follow it when you call into the OS, sure, but not that you have to stick to it anywhere else — and indeed, most implementations of many programming languages on Windows didn't; they invented and used their own ABIs.
Sure, you can do that. Userspace code can use any ABI it wants, or none. But again, why, what do you gain?
And regardless of whether it's "the" ABI or merely "a" ABI, that ABI presumably has a representation for strings and allows passing them around - and while you certainly could use a different representation in your program (or in the OS internals) and transform strings back and forth when calling the OS (or when receiving calls from userspace), you probably don't want to. At which point we're back at needing a way to write strings in an in-memory format to OS-standard files in the filesystem.
Performance? Codegen simplicity? Why, again, must one use the syscall ABI for anything that is not a syscall?
> that ABI presumably has a representation for strings and allows passing them around
In this particular case, the API operates with binary buffers, not text strings. Sure, you can go the VMS way, or even the IBM way, and turn files from binary blobs into arrays of fixed-length records (that's why C's fwrite/fread have both num and sz arguments: some OSes literally can't write data any other way).
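To make the fwrite() point concrete, a small sketch: the separate size and count arguments map naturally onto record-oriented writes.

    /* fwrite's size/count split fits OSes whose files are arrays of
       fixed-length records: "write nrec records of sizeof(struct rec)
       bytes each", not "write N bytes". */
    #include <stdio.h>

    struct rec { char name[16]; int balance; };

    size_t save(const struct rec *recs, size_t nrec, FILE *fp)
    {
        return fwrite(recs, sizeof(struct rec), nrec, fp);
    }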
> At which point we're back at needing a way to write strings in an in-memory format to OS-standard files in the filesystem.
Yes? Some text editors converted LFs to NULs to work on the text in memory, and then converted the NULs back into LFs on writing to disk (IIRC). Neither emacs nor vi stores text in memory the way it's laid out in the file; they translate it when writing to disk.
Again, why do you want the OS to get involved into any of this? It's not the OS's job, period, stop trying to make the world an even worse place.
If you can figure out an ABI that gives you significant advantages, sure, knock yourself out. But given that you're going to have to implement the syscall one anyway, if there's no compelling reason to use a different one then why make things more complicated?
> Again, why do you want the OS to get involved into any of this? It's not the OS's job, period
Again, why have a filesystem if you're not going to have any standardised structure for how to use it? Why have an OS at all if you're not going to give programs ways to interact with each other? The OS owns the filesystem, it should also define how it's used.
> stop trying to make the world an even worse place
Right back at you.
For reference, Unix has no API other than bytes either.
So it's "specific to" almost all programming languages in actual use. That's a rather esoteric point.
> For reference, Unix has no API other than bytes either.
Unix does offer an API for writing C-standard in-memory text strings to Unix-standard on-disk text files, it just happens to be the same one as the API for writing in-memory binary strings to on-disk binary files.
Why on bloody Earth should a presumably general-purpose OS provide a special API for dealing with the internal representation of some data structure in a (particular) implementation of a (particular) programming language?
Besides, it doesn't offer such an API anyhow; you need to take care to manually pass the result of a strlen() call instead of sizeof()'s as the value for the len parameter of a write() call, otherwise a NUL-terminator will get written into the file as well.
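Illustrating that footgun with the real write(2) interface:

    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello\n";

        write(STDOUT_FILENO, msg, strlen(msg)); /* correct: 6 bytes of text */
        write(STDOUT_FILENO, msg, sizeof msg);  /* wrong: 7 bytes; the NUL
                                                   terminator is written too */
        return 0;
    }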
And C says nothing about what constitutes a line break, by the way. Nor does it have any concept of a "line", or any utilities for working with lines specifically, it only knows of strings, and that's all. The concept of "text line" is POSIX.
Because the purpose of the OS is to facilitate applications (and, on the other end, facilitate hardware), and those applications tend to have a need to process text in-memory and then store it on the filesystem?
All of this is left to the user space to sort out, just as it is on Linux, so I am not entirely sure why you demand Windows to do more for you than Linux does.
If you are writing a developer suite, whether you're Delphi developing for MS-DOS or Microsoft developing for the Apple II, you already have an idea of how things should work, because you have the reference book for the platform, not just for the compiler/language. The assumption was never that the OS provides an abstraction for text: in those days, everyone just implemented it from scratch ("code page" comes from literal pages of code, where each character had a well-defined byte value).

This is manifested in command-line handling on Windows: the platform convention is that the command line is just one flat string, and the C runtime determines how to chop it up into arguments (MSVC and Intel C have historically disagreed heavily here), as the sketch below illustrates. Windows's CRLF convention only looks like an aberration because Unix-derived designs took over the world: macOS is Unix, Linux was inspired by Unix, *BSD is Unix-derived.
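A hedged sketch using two real Win32 entry points, GetCommandLineW() and CommandLineToArgvW(): the OS hands every process one flat command-line string, and turning it into argv is strictly a userspace affair.

    /* Windows gives a process its command line as one flat string;
       argv is a fiction created in userspace. CommandLineToArgvW()
       is one splitter (in shell32); C runtimes ship their own, with
       historically divergent quoting rules. */
    #include <windows.h>
    #include <shellapi.h>   /* CommandLineToArgvW; link with shell32.lib */

    int run(void)
    {
        LPWSTR raw = GetCommandLineW();   /* the flat string, verbatim */
        int argc;
        LPWSTR *argv = CommandLineToArgvW(raw, &argc);
        if (argv == NULL)
            return 1;
        /* ... use argc/argv ... */
        LocalFree(argv);                  /* caller frees the array */
        return 0;
    }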
The real fragmentation is not CRLF but the transition to system level UTF-16 support, involving all sorts of macros and duplicating almost every OS API function into FooW() and FooA() variants.
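Roughly the pattern in the Windows headers (a simplified sketch, not the literal header contents):

    /* Every such API exists twice, and a macro selects the variant
       according to whether UNICODE is defined at compile time. */
    #ifdef UNICODE
    #define CreateFile CreateFileW   /* UTF-16 (wide) strings */
    #else
    #define CreateFile CreateFileA   /* "ANSI"/code-page strings */
    #endif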
By "recently" you mean Win95? MSVCRT.DLL has been there for at least that long.
https://devblogs.microsoft.com/oldnewthing/20140411-00/?p=12...
https://learn.microsoft.com/en-us/cpp/windows/universal-crt-...
Current versions of the OS ship with functions in MSVCRT.DLL that weren't in the last VC6 version, such as the updated C++ exception handler (__CxxFrameHandler4). AFAIK, there is no redistributable version of it, it's unique to the OS.
The C standard library is definitely not part of Windows.
It is now with the Universal C runtime, introduced in Windows 10, which is ironically written in C++ with extern "C" { ... }
On non UNIX clones, including Windows, it has always been the role of commercial C compilers to provide the C standard library on top of the actual C APIs.
The C runtime library being part of the OS is an accidental thing in Unix; the 16-bit and 32-bit Windows APIs don't even use a C-compatible ABI (a Pascal-compatible one is used instead).
But in Unix “\n” is a single byte, and in DOS it is 2. So they introduced text and binary modes for files on DOS. Behind the scenes the library will handle the extra byte. This is not necessary in Unix.
I used to have to be careful about importing files to DOS. Did the file come from Unix?
I think you are talking about the carriage-return/linefeed pair (CRLF, or \r\n).
These control codes go back to line printers. Linefeed advances the paper one line and carriage return moves the print head to the left.
In binary mode. In text mode if you printf(“Hello World\n”) you get CRLF because that’s how text works on DOS. Unix had the convention of only requiring the LF for text. And Unix didn’t have text/binary modes. That’s the compatibility hack on DOS.
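A small sketch of the text/binary split in C stdio: on DOS/Windows the two files below end up with different bytes (CRLF vs. LF), while on Unix they are byte-identical.

    #include <stdio.h>

    int main(void)
    {
        FILE *t = fopen("demo.txt", "w");  /* text mode: "\n" written as CRLF on DOS/Windows */
        FILE *b = fopen("demo.bin", "wb"); /* binary mode: bytes pass through untranslated */
        if (!t || !b)
            return 1;
        fputs("Hello World\n", t);
        fputs("Hello World\n", b);
        fclose(t);
        fclose(b);
        return 0;
    }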
>These control codes go back to line printers.
Back to teletypes even. Believe me, I go back to line printers.
Note that printf(), which you use in your example, is a C library function that writes to a predefined text-mode stream. So it follows the same rules.
I wasn't able to dig up the source code of a vintage DOS compiler's C library in a few minutes of looking, so I can't prove it right now, but this section of the C standard (7.21.2 - Streams) hints that my recollection is correct:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#p...
*(On systems where the char type is one byte, of course, which is the case for DOS C compilers.)
Sending a carriage return and then a linefeed to a Teletype Model 33 and then printing works fine. Doing them in the opposite order, if the carriage is to the right of the page, will result in the linefeed (platen rotation) happening quickly, then the carriage starting to move left toward the beginning of the next line, and the next printed character printing wherever the carriage happens to be at the time - not yet at the left. So you will be missing a character at the beginning of the line: it gets printed between the two lines, in some unexpected column.
I have run into (in my mind, "hipster") code where the programmer for some reason reversed the order of CR and LF.
Text inside a computer doesn't need any of that just to signal a newline. UNIX chose to use a single line feed character as a line separator because there was no good reason to use two. Classic Mac OS chose a single carriage return for similar reasons. Anything going out to a printer or teletype would run through a device driver that would turn the newline character into whatever the device expects.
Windows copied DOS which copied CP/M which was a very basic program loader for 8-bit machines and didn't really have "drivers" like we think of them today. I'm guessing here, but I imagine they chose the teletype combo because that's what most serial printers understood and printing was a major use case for those machines. That was probably the right choice for CP/M, but I can't imagine Microsoft would choose it if they were developing Windows from scratch today.
It actually long predates DOS
C stdio is descended from Mike Lesk's "portable I/O package" (original release circa 1973). Bell Labs ported their C compiler from Unix to Honeywell GCOS and IBM S/370 mainframes. Mainframes handle text files very differently from how Unix systems do; it is much more complex than simply changing the newline character. So in Lesk's package, the mode parameter to copen() specified whether the file was text or binary. copen() was renamed to fopen(), the character indicating binary mode was changed from "i" to "b", and hence stdio's text/binary distinction.
stdio has always had the text-vs-binary file distinction; on some platforms (such as Unix) it has always been a no-op, on others it hasn't.
https://archive.org/details/lesk-iolib