<https://www.rfc-editor.org/old/EOLstory.txt>
Note this does not apply to file formats (except for RFCs).
And for what it's worth, the actual C standard library tends to be used fairly rarely, especially if you consider the malloc/free interface to be part of the system library rather than the C standard library. The C stdio functionality, for example, is extremely underpowered compared to the capabilities of all major operating systems' I/O libraries, and so most applications (even those written entirely in C) will choose to avoid the C standard library and instead use the more direct primitives of the system API layer.
Runtime libraries for C/C++ provide two general sets of stuff: the stuff mandated for the Standard C and Standard C++ libraries, and the stuff that is needed by the basic mechanics of the language.
The former is everything from abort() to wscanf(). The latter is a bunch of internal functions, calls to which the compiler inserts in order to do stuff. This is basically the split nowadays between UCRT and VCRUNTIME.
In the days of programming targeting the 80486SX without an 80487 present, for instance, every piece of floating-point arithmetic was not a machine instruction but a call to a runtime library routine that did the floating-point operation longhand using non-FPU instructions. Other runtime functionality over the years has included doing 32-bit or 64-bit integer arithmetic on 16-bit and 32-bit architectures where this was not a native word size, functions to do stack checking in the function perilogue, and functions to do C++ run-time type checking and exception processing.
This pattern is followed by other (compiled) programming languages. Naturally, the programming languages do not necessarily have any relation to the Standard C or Standard C++ libraries, nor do they generate code that needs the same helper functions for stuff as C/C++ code does. (But the situation is complicated by the POSIX API and the old C language bindings for the MS-DOS system call API, some of which another programming language might also allow program code to use.)
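To make the helper-call idea concrete, a hedged sketch: a plain 64-bit multiplication in C which, on a 32-bit target without a native 64-bit multiply, a compiler typically lowers to a call into its runtime support library (GCC names this particular helper __muldi3, in libgcc; other compilers use their own names).

    /* 64-bit multiply in source code; on some 32-bit targets the
       compiler emits a call to a runtime helper such as libgcc's
       __muldi3 instead of a single machine instruction. */
    unsigned long long scale(unsigned long long a, unsigned long long b)
    {
        return a * b;
    }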
For example, the C runtime has a notion of what a "string" is: its binary layout in memory and the conventions around it (e.g. an array of UTF-8 bytes terminated by a NUL).
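A minimal sketch of that convention, assuming nothing beyond standard C: the length is not stored anywhere, so it has to be rediscovered by scanning for the terminator, which is all strlen() does.

    #include <stddef.h>

    /* The C string convention: a contiguous byte array whose end is
       marked by a 0 byte. Equivalent in spirit to strlen(). */
    size_t my_strlen(const char *s)
    {
        const char *p = s;
        while (*p != '\0')
            p++;
        return (size_t)(p - s);
    }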
A runtime can be very thin or very complex. The .NET and Java runtimes are massive by comparison, to the point that they generally JIT the intermediate language to produce executable code (whether ahead of time or on the fly). Go's runtime has its own notion of threading built on top of the system notion of threads.
A self-contained static binary embeds any runtime implementations it needs into its own binary so it is still using runtime facilities but needs no external libraries.
A standalone or "bare" program can mean one that is built using only syscall primitives. Of course that can be taken further: you can build a true baremetal program that is designed to be copied into memory by the bootloader so it runs without a kernel or OS underneath it. This is, after all, what an OS kernel is: just code built such that the bootloader can jump to a fixed (or designated in metadata) address, handing off a pointer to info about the hardware (such as a DeviceTree) in memory and that's it.
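For a concrete sense of the "only syscall primitives" case, here is a minimal sketch for x86-64 Linux specifically; the syscall numbers (1 = write, 60 = exit) and register conventions are Linux ABI details, and -nostdlib keeps any runtime from being linked in.

    /* build: cc -nostdlib -static -o bare bare.c   (x86-64 Linux only) */

    static long sys_write(int fd, const void *buf, unsigned long len)
    {
        long ret;
        __asm__ volatile ("syscall"
                          : "=a"(ret)
                          : "0"(1L), "D"((long)fd), "S"(buf), "d"(len)
                          : "rcx", "r11", "memory");   /* 1 = SYS_write */
        return ret;
    }

    static void sys_exit(int code)
    {
        __asm__ volatile ("syscall"
                          :
                          : "a"(60L), "D"((long)code)  /* 60 = SYS_exit */
                          : "rcx", "r11");
        for (;;) {}   /* not reached */
    }

    void _start(void)   /* no C runtime, so no main(): the linker's default entry */
    {
        sys_write(1, "hello\n", 6);
        sys_exit(0);
    }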
In the early PC days, BIOS was basically a set of functions built into the hardware (or, more often, flashed onto EEPROM). More or less a minimal sort of runtime + device drivers that knew how to read keyboard input, print characters to the screen, etc.
Almost everything is built on abstractions. In modern systems EFI or equivalents is a form of runtime + device drivers for early boot and the kernel. The kernel forms that for userspace. And a userspace language runtime can be something like a mini-OS for the code it runs. Going the other direction CPUs themselves are much more like a collection of networked PCs than you might expect.
It wasn't until fairly recently that the C runtime was stably shipped with Windows. Previously you had to install the correct version of the C library alongside your application.
Which is called from what, if not C? Does Windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
And you can of course use non-C languages to call the Win32 API, or even call it directly from assembly code.
Is that a supported/official API though? On Linux you "can" put your arguments in registers and trigger the system call interrupt directly, and I think Go programs even do this, but it's not the official interface and they reserve the right to break your program in future updates, at least in theory.
Of course, the Linux solution results in some weirdness, especially because specs like POSIX cover the C API, not the syscall ABI. setuid() at the libc layer is specced as changing the UID for all threads in a process. The Linux setuid() syscall only changes the current thread[1], and it's up to the C library to do some absolute magic to then propagate that to all other threads. Which made things difficult for things not using the C library, like Go (https://github.com/golang/go/issues/1435). But that's still not an argument that the supported interface is the C library - the kernel advertises the interface it exposes via the syscall ABI, and will retain that functionality, and if you want POSIX compatibility then you get it from somewhere else.
[1] In Linux, a thread is just a very slightly special case of a process
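A hedged sketch of that distinction (Linux-specific, and only meaningful in a process that starts with the privileges to change its UID):

    /* Linux only. setuid() via glibc changes the UID of the whole
       process (glibc broadcasts the change to every thread
       internally); the raw syscall changes only the calling thread. */
    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/syscall.h>

    void drop_privileges_process_wide(uid_t uid)
    {
        setuid(uid);               /* glibc wrapper: all threads */
    }

    void drop_privileges_this_thread_only(uid_t uid)
    {
        syscall(SYS_setuid, uid);  /* kernel: current thread only */
    }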
For instance, Delphi had a period of popularity for Windows application development, and AFAIK it has always used its own runtime library which is completely independent of the C runtime.
Go does not trigger low-level system call interrupts on Windows. (It does that on Linux, but Windows syscall numbers are not stable even across minor Windows updates, so if Go did that, its Windows binaries would be incredibly fragile.)
On Windows NT, Go uses the userspace wrappers provided in Windows system libraries such as NTDLL.DLL and KERNEL32.DLL. But those too are entirely separate from the C runtime.
Basically on Linux the syscalls are the equivalent of Win32 except much narrower in scope.
But it is sometimes required to do things properly.
The Win32 API doesn't even use the "C" calling convention. C is just another language to Windows, and the standard C library is a cross-platform library for C. You could also write C code on classic Mac OS, and it had its own API as well, but styled more for Pascal.
The OS and C being closely related is not universal across all operating systems, it's just a Unix thing.
A prominent example is Delphi[1]. At work our primary application is a 20 year old Delphi Win32 application, which we ship new features in weekly.
Delphi does not rely on the C runtime, instead having its own system library which interfaces with the Win32 API that gets compiled in.
In the UNIX world there is this strange notion that the C language is somehow special and that the OS itself should provide its runtime (a single, global version of it) for every program, even those written in other languages, to use when interacting with the OS, but... it's just silly.
> Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
No it doesn't. That logic belongs in the OS-specific layer in the runtimes/standard libraries of the implementations of the different programming languages. They may decide to re-use each other's libraries, of course, or they may decide not to.
Well sure but you have to define it somewhere. At some point there's an interface where something that's part of the application asks something that's part of the OS to do something, and that interface had better be stable and well-specified. If you really want you can use a different interface from your C ABI, sure, but given that, like it or not, most of windows is written in C (or in C++ but using C linkage between component boundaries), what do you gain?
It's defined, and well-specified.
> your C ABI
Which is a C ABI. Borland's Turbo C and C++Builder used a different ABI than Microsoft's C compiler did. GCC for Windows used to use a third, entirely different ABI as well. The ABI is not part of the language definition, you see.
> most of windows is written in C
And compiled with a very specific C compiler that used a particular ABI. That only means that you need to follow it when you call into the OS, sure, but not that you have to stick to it anywhere else — and indeed, most implementations of many programming languages on Windows didn't; they invented and used their own ABIs.
Sure, you can do that. Userspace code can use any ABI it wants, or none. But again, why, what do you gain?
And regardless of whether it's "the" ABI or merely "a" ABI, that ABI presumably has a representation for strings and allows passing them around - and while you certainly could use a different representation in your program (or in the OS internals) and transform strings back and forth when calling the OS (or when receiving calls from userspace), you probably don't want to. At which point we're back at needing a way to write strings in an in-memory format to OS-standard files in the filesystem.
Performance? Codegen simplicity? Why, again, must one use the syscall ABI for anything that is not a syscall?
> that ABI presumably has a representation for strings and allows passing them around
In this particular case, the API operates with binary buffers, not text strings. Sure, you can go the VMS way, or even the IBM way, and turn files from binary blobs into arrays of fixed-length records (that's why C's fwrite/fread have both num and sz arguments: some OSes literally can't write data any other way).
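To make the fwrite() point concrete, a small sketch: the separate size and count arguments map naturally onto record-oriented writes.

    /* fwrite's size/count split fits OSes whose files are arrays of
       fixed-length records: "write nrec records of sizeof(struct rec)
       bytes each", not "write N bytes". */
    #include <stdio.h>

    struct rec { char name[16]; int balance; };

    size_t save(const struct rec *recs, size_t nrec, FILE *fp)
    {
        return fwrite(recs, sizeof(struct rec), nrec, fp);
    }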
> At which point we're back at needing a way to write strings in an in-memory format to OS-standard files in the filesystem.
Yes? Some text editors converted LFs to NULs to work on the text in memory, and then converted the NULs back into LFs on writing to disk (IIRC). Neither emacs nor vi stores text in memory the way it's laid out in the file; they translate it when writing to disk.
Again, why do you want the OS to get involved into any of this? It's not the OS's job, period, stop trying to make the world an even worse place.
If you can figure out an ABI that gives you significant advantages, sure, knock yourself out. But given that you're going to have to implement the syscall one anyway, if there's no compelling reason to use a different one then why make things more complicated?
> Again, why do you want the OS to get involved into any of this? It's not the OS's job, period
Again, why have a filesystem if you're not going to have any standardised structure for how to use it? Why have an OS at all if you're not going to give programs ways to interact with each other? The OS owns the filesystem, it should also define how it's used.
> stop trying to make the world an even worse place
Right back at you.
For reference, Unix has no API other than bytes either.
So it's "specific to" almost all programming languages in actual use. That's a rather esoteric point.
> For reference, Unix has no API other than bytes either.
Unix does offer an API for writing C-standard in-memory text strings to Unix-standard on-disk text files, it just happens to be the same one as the API for writing in-memory binary strings to on-disk binary files.
Why on bloody Earth should a presumably general-purpose OS provide a special API for dealing with the internal representation of some data structure in a (particular) implementation of a (particular) programming language?
Besides, it doesn't offer such an API anyhow; you need to take care to manually pass the result of a strlen() call instead of sizeof()'s as the value for the len parameter of a write() call, otherwise a NUL-terminator will get written into the file as well.
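Illustrating that footgun with the real write(2) interface:

    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        const char msg[] = "hello\n";

        write(STDOUT_FILENO, msg, strlen(msg)); /* correct: 6 bytes of text */
        write(STDOUT_FILENO, msg, sizeof msg);  /* wrong: 7 bytes; the NUL
                                                   terminator is written too */
        return 0;
    }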
And C says nothing about what constitutes a line break, by the way. Nor does it have any concept of a "line", or any utilities for working with lines specifically, it only knows of strings, and that's all. The concept of "text line" is POSIX.
Because the purpose of the OS is to facilitate applications (and, on the other end, facilitate hardware), and those applications tend to have a need to process text in-memory and then store it on the filesystem?
All of this is left to the user space to sort out, just as it is on Linux, so I am not entirely sure why you demand Windows to do more for you than Linux does.
If you are writing a developer suite, whether you're Delphi developing for MS-DOS or Microsoft developing for the Apple II, you already have an idea of how things should work, because you have the reference book for the platform, not just for the compiler/language. The assumption was never that the OS provides an abstraction for text: in those days, everyone just implemented it from scratch ("code page" comes from literal pages of code, where each character had a well-defined byte value).

This is manifested in command-line handling on Windows: the platform convention is that the command line is just one flat string, and the C runtime determines how to chop it up into arguments (MSVC and Intel C have historically disagreed heavily here), as the sketch below illustrates. Windows's CRLF convention only looks like an aberration because Unix-derived designs took over the world: macOS is Unix, Linux was inspired by Unix, *BSD is Unix-derived.
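A hedged sketch using two real Win32 entry points, GetCommandLineW() and CommandLineToArgvW(): the OS hands every process one flat command-line string, and turning it into argv is strictly a userspace affair.

    /* Windows gives a process its command line as one flat string;
       argv is a fiction created in userspace. CommandLineToArgvW()
       is one splitter (in shell32); C runtimes ship their own, with
       historically divergent quoting rules. */
    #include <windows.h>
    #include <shellapi.h>   /* CommandLineToArgvW; link with shell32.lib */

    int run(void)
    {
        LPWSTR raw = GetCommandLineW();   /* the flat string, verbatim */
        int argc;
        LPWSTR *argv = CommandLineToArgvW(raw, &argc);
        if (argv == NULL)
            return 1;
        /* ... use argc/argv ... */
        LocalFree(argv);                  /* caller frees the array */
        return 0;
    }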
The real fragmentation is not CRLF but the transition to system level UTF-16 support, involving all sorts of macros and duplicating almost every OS API function into FooW() and FooA() variants.
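Roughly the pattern in the Windows headers (a simplified sketch, not the literal header contents):

    /* Every such API exists twice, and a macro selects the variant
       according to whether UNICODE is defined at compile time. */
    #ifdef UNICODE
    #define CreateFile CreateFileW   /* UTF-16 (wide) strings */
    #else
    #define CreateFile CreateFileA   /* "ANSI"/code-page strings */
    #endif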
By "recently" you mean Win95? MSVCRT.DLL has been there for at least that long.
https://devblogs.microsoft.com/oldnewthing/20140411-00/?p=12...
https://learn.microsoft.com/en-us/cpp/windows/universal-crt-...
Current versions of the OS ship with functions in MSVCRT.DLL that weren't in the last VC6 version, such as the updated C++ exception handler (__CxxFrameHandler4). AFAIK, there is no redistributable version of it, it's unique to the OS.
The C standard library is definitely not part of Windows.
It is now with the Universal C runtime, introduced in Windows 10, which is ironically written in C++ with extern "C" { ... }
On non UNIX clones, including Windows, it has always been the role of commercial C compilers to provide the C standard library on top of the actual C APIs.
The C runtime library being part of the OS is an accidental thing in Unix; the 16-bit and 32-bit Windows APIs don't even use a C-compatible ABI (a Pascal-compatible one is used instead).
But in Unix “\n” is a single byte, and in DOS it is 2. So they introduced text and binary modes for files on DOS. Behind the scenes the library will handle the extra byte. This is not necessary in Unix.
I used to have to be careful about importing files to DOS. Did the file come from Unix?
I think you are talking about the carriage-return/linefeed pair (CRLF, or \r\n).
These control codes go back to line printers. Linefeed advances the paper one line and carriage return moves the print head to the left.
In binary mode. In text mode if you printf(“Hello World\n”) you get CRLF because that’s how text works on DOS. Unix had the convention of only requiring the LF for text. And Unix didn’t have text/binary modes. That’s the compatibility hack on DOS.
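A small sketch of the text/binary split in C stdio: on DOS/Windows the two files below end up with different bytes (CRLF vs. LF), while on Unix they are byte-identical.

    #include <stdio.h>

    int main(void)
    {
        FILE *t = fopen("demo.txt", "w");  /* text mode: "\n" written as CRLF on DOS/Windows */
        FILE *b = fopen("demo.bin", "wb"); /* binary mode: bytes pass through untranslated */
        if (!t || !b)
            return 1;
        fputs("Hello World\n", t);
        fputs("Hello World\n", b);
        fclose(t);
        fclose(b);
        return 0;
    }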
>These control codes go back to line printers.
Back to teletypes even. Believe me, I go back to line printers.
Note that printf(), which you use in your example, is a C library function that writes to a predefined text-mode stream. So it follows the same rules.
I wasn't able to dig up the source code of a vintage DOS compiler's C library in a few minutes of looking, so I can't prove it right now, but this section of the C standard (7.21.2 - Streams) hints that my recollection is correct:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#p...
*(On systems where the char type is one byte, of course, which is the case for DOS C compilers.)
Sending a carriage return and then a linefeed to a Teletype Model 33 and then printing works fine. Doing them in the opposite order, if the carriage is to the right of the page, will result in the linefeed (platen rotation) happening quickly, then the carriage starting to move left toward the beginning of the next line, and the next printed character printing wherever the carriage happens to be at the time - not yet at the left. So you will be missing a character at the beginning of the line: it gets printed between the two lines, in some unexpected column.
I have run into (in my mind, "hipster") code where the programmer for some reason reversed the order of CR and LF.
Text inside a computer doesn't need any of that just to signal a newline. UNIX chose to use a single line feed character as a line separator because there was no good reason to use two. Classic Mac OS chose a single carriage return for similar reasons. Anything going out to a printer or teletype would run through a device driver that would turn the newline character into whatever the device expects.
Windows copied DOS which copied CP/M which was a very basic program loader for 8-bit machines and didn't really have "drivers" like we think of them today. I'm guessing here, but I imagine they chose the teletype combo because that's what most serial printers understood and printing was a major use case for those machines. That was probably the right choice for CP/M, but I can't imagine Microsoft would choose it if they were developing Windows from scratch today.
It actually long predates DOS
C stdio is descended from Mike Lesk's "portable I/O package" (original release circa 1973). Bell Labs ported their C compiler from Unix to Honeywell GCOS and IBM S/370 mainframes. Mainframes handle text files very differently from how Unix systems do; it is much more complex than simply changing the newline character. So in Lesk's package, the mode parameter to copen() specified whether the file was text or binary. copen() was renamed to fopen(), the character indicating binary mode was changed from "i" to "b", and hence stdio's text/binary distinction.
stdio has always had the text-vs-binary file distinction; on some platforms (such as Unix) it has always been a no-op, on others it hasn't.
https://archive.org/details/lesk-iolib