chasil 3 months ago

This goes back further.

Teletype machines needed a delay to move the printing apparatus back to the beginning of a line. The two characters provided that delay.

I never used one of these; I was too young.

https://en.m.wikipedia.org/wiki/Teletype_Model_33

Edit: first Google hit:

https://www.revk.uk/2022/02/crlf-has-long-history.html?m=1

  • Stratoscope 3 months ago

    I learned to program in high school on a Teletype Model 33 ASR that dialed into a timesharing service. This was the version with a paper tape punch and reader.

    Dial-up time was very expensive, somewhere around $50/hour in today's dollars.

    So we would punch our code onto a paper tape, then print it out locally to review it, and finally dial in, run the paper tape through and wait for the printout from our program, and immediately disconnect.

    In theory, CR LF should have been enough at the end of a line, but we found that sometimes, especially after a long line, the next line would start printing before the paper had fully advanced.

    To play it safe, we would punch CR LF RUBOUT. RUBOUT was a character with all eight holes punched in the tape, i.e. 0xFF. By convention, this code was ignored by any system that received it.

    The intended use of RUBOUT was to correct an error you made when punching a tape. You would back up the tape as many characters as you got wrong, then punch that many RUBOUTs, and continue from there. Because RUBOUT had every hole punched out, it would erase any error.

    And it also served to give the paper a little more time to advance when we punched CR LF RUBOUT.

    Maybe our machine was in need of some oiling or adjustment. The Model 33 was designed as a light-duty machine, unlike the more rugged Models 28 and 35.

    • Animats 3 months ago

      > Maybe our machine was in need of some oiling or adjustment. The Model 33 was designed as a light-duty machine, unlike the more rugged Models 28 and 35.

      Yes. The spec for the Model 33 says that it is rated for one year of continuous operation without lubrication, three years with lubrication.

      The models 14, 15, 28, and 35 were designed for a much longer life. I have some Model 14 machines coming up on a century and still working. But there is a price to be paid in maintenance and lubrication. Those machines are totally repairable - you can take them completely apart and put them back together. I've overhauled five of the older machines. There are over 600 oiling points, and you need both oil and two different greases. Nobody would tolerate an office machine today that required such maintenance.

      Think about that when you look at iFixit scores.

drdec 3 months ago

> This choice was designed to spread the pain equally among all operating systems of the day; each has to translate to and from the CR LF convention when text was transferred across the network.

On the one hand, this seems clever and fair. On the other hand, this is why we can't have nice things.

  • Someone 3 months ago

    According to the article, it wasn’t even true. FTA:

    “Early operating system designers had to adopt some "end-of-line" convention using CR and LF; some used LF, some used CR, and some used a two-octet sequence: LF CR or CR LF.”

    ⇒ OSes that used CR LF didn’t have to do that translation.

gary_0 3 months ago

Nowadays I see in-flight EOL normalization as Considered Harmful; instead I ensure that routines dealing with text buffers treat both CRLF and LF as the same. This has Just Worked without any issues in the software I've written with this approach. (Occasionally this requires wrapping/replacing library code that only uses the EOL convention of the host system.)

I have encountered numerous issues with EOLs being changed in transit or on disk, so I always make sure to open/transfer files in "binary mode", since both Windows and Linux builds will run the same EOL-agnostic code.
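
A minimal sketch of the EOL-agnostic approach described above, in Python. The names `split_lines` and `read_lines` are hypothetical, chosen here for illustration, and the choice to leave a lone CR untouched is an assumption rather than something the comment specifies:

```python
import re

# Matches either line ending; CRLF is listed first so the pair is
# consumed as one separator rather than leaving a stray CR behind.
_EOL = re.compile(rb"\r\n|\n")


def split_lines(data: bytes) -> list[bytes]:
    """Split a buffer into lines, treating CRLF and LF as the same."""
    lines = _EOL.split(data)
    # A trailing EOL yields one empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


def read_lines(path: str) -> list[bytes]:
    """Read a file in binary mode so no EOL translation happens on any OS."""
    with open(path, "rb") as f:
        return split_lines(f.read())
```

The buffer itself is never rewritten, so the original bytes can still be written back out unchanged; only the code that walks the lines ignores the difference.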

ktpsns 3 months ago

I haven't seen end-of-line conversion problems (or Unicode BOM problems) for decades. My guess is that software quality has improved. MS Notepad was a notable tool that always mucked around with unix-style line endings. I commonly used the dos2unix and unix2dos utilities in the early 2000s, in particular on dual-boot computers. This was also the time when UTF-8 was not yet so widespread, but that is another topic ;-)

  • yjftsjthsd-h 3 months ago

    I can easily believe that it's gotten better, but I've hit it within the last 6 months; Microsoft Azure DevOps git[0] has a web editor that defaults to Windows-style EOL, and if you use it to deploy files to boxes running a Linux distro then some tools will completely break on config files that aren't using unix-style EOL. Ask me how I know. That was fun to fix. For bonus points, some programs on Linux are compatible with either line ending -_-

    [0] I expect this isn't its real name, but it's MS so that's a lost cause.

  • don-code 3 months ago

    Sad to say, I filed a bug in an internal tool just two weeks ago relating to how it processes Windows linebreaks. The tool is written using modern languages (Python 3.11) and frameworks (FastAPI).