Honestly I had no idea what ctrl+d even did, I just knew it was a convenient way for me to close all the REPL programs I use. The fact that it is similar to pressing enter really surprised me, so I wanted to share this knowledge with you :)

  • Arthur Besse@lemmy.ml
    5 days ago

    Note: for novices reading along at home, the notation ^X means hold down the ctrl key and type x (without shift).

    ctrl-a through ctrl-z will send ASCII characters 1 through 26, which are called control characters (because they’re for controlling things, and also because you can type them by holding down the control key).
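    To make the letter-to-number mapping concrete, here is a small sketch in bash. The function name ctrl_code is made up for illustration, and it assumes the usual convention that holding ctrl keeps only the low five bits of the letter's ASCII code.

    ```shell
    # Hypothetical helper: print the ASCII code that ctrl+<letter> sends.
    ctrl_code() {
      local code
      # printf '%d' with a leading quote prints the character's numeric code.
      code=$(printf '%d' "'$1")
      # Keep only the low five bits (assumed ctrl convention: code & 0x1f).
      echo $(( code & 31 ))
    }

    ctrl_code d   # prints 4  (EOT)
    ctrl_code v   # prints 22 (SYN)
    ```

    Upper and lower case give the same answer, because the bit that distinguishes them is one of the bits that gets cleared.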

    ^D is the EOF character.

    Nope, Chuck Testa: there is no EOF character. *

    ^D (“D” being the fourth letter of the alphabet) sends ASCII character 4, which (as you can see in man ascii) is called EOT or “end of transmission”.

    $ stty -a | grep eof
    intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
    $ man stty |grep -A1 eof |head -n2
           eof CHAR
                  CHAR will send an end of file (terminate the input)
    

    What this means is that the character specified after eof (by default ^D, aka EOT) is configured to be intercepted (by the tty driver) and, instead of that character being sent to the process reading standard input, the kernel will “send an end of file (terminate the input)”.

    (*)

    One could also say there is an EOF character, but what it is can be configured on a per-tty basis.

    By default the EOF character is EOT, a control character, but it could be set to any character. For instance: run stty eof x and now, in that terminal, “x” (by itself, without the control key) will be the EOF character and will behave exactly as ^D did before.

    But “send an end of file” does not mean sending any character to the reading process: as the blog post explains, it actually (counterintuitively) means flushing the buffer - meaning, causing the read syscall to return with whatever is in the buffer currently.

    It is confusing that this functionality is called eof, and the stty man page description of it is even more confusing, because it really does flush the contents of the buffer to the reader even when the line buffer is not empty, in which case it is not indicating end-of-file at all!

    You can confirm this is happening by running cat and typing a few characters and then hitting ^D. (cat will echo those characters, even though you have not hit enter yet.)

    Or, you can pipe cat into pv and see that ^D also causes pv to receive the buffer contents prior to hitting enter.
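    If you tried stty eof x earlier, that setting sticks around until you change it back. Here is a rough sketch of a wrapper that sets a temporary EOF character for one command and then restores the old settings. The name with_eof_char is hypothetical; stty -g (print the settings in a form stty can read back) is the only extra feature it relies on.

    ```shell
    # Hypothetical helper: run a command with a temporary EOF character,
    # then restore the previous terminal settings.
    with_eof_char() {
      local ch=$1 saved status
      shift
      # stty acts on the terminal attached to stdin, so give up if there is none.
      if ! [ -t 0 ]; then
        echo "stdin is not a terminal" >&2
        return 1
      fi
      saved=$(stty -g)    # save the current settings
      stty eof "$ch"      # make $ch the EOF character
      "$@"                # run the command
      status=$?
      stty "$saved"       # put the original settings back
      return "$status"
    }
    ```

    For example, with_eof_char x cat runs cat with “x” as the EOF character, and ^D goes back to normal once cat exits.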

    I guess unix calls this eof because this function is most often used to flush an empty buffer, which is how you “send an end of file” to the reader.

    The empty-read-means-EOF semantics are documented, among other places, in the man page for the read() syscall (man read):

    On success, the number of bytes read is returned (zero indicates end of file)
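    You can see the same rule from inside a shell script, without a terminal involved. In this sketch (count_lines is a made-up name), bash's read builtin returns a non-zero status once there is no more input, which ends the loop:

    ```shell
    # Hypothetical helper: count how many lines arrive before end of input.
    count_lines() {
      local n=0 line
      # The loop ends when read fails, which happens at end of input.
      while IFS= read -r line; do
        n=$(( n + 1 ))
      done
      echo "$n"
    }

    printf 'one\ntwo\n' | count_lines   # prints 2
    count_lines </dev/null              # prints 0
    ```

    Nothing in either input marks the end. The loop stops simply because a read came back empty.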

    If you want ^D to send an actual EOT character through to the reading process, you can escape it using the confusingly-named lnext function, which is by default triggered by the ^V control character (aka SYN, “synchronous idle”, ASCII character 22 - note V is the 22nd letter of the alphabet):

    $ man stty|grep lnext -A1
           * lnext CHAR
                  CHAR will enter the next character quoted
    $ stty -a|grep lnext
    werase = ^W; lnext = ^V; discard = ^O; min = 1; time = 0;
    

    Try it: you can type echo " and then ^V and ^D and then "|xxd (and then enter) and you will see that this is sending ASCII character 4.

    You can also send it with echo -e '\x04'. Note that the EOT character does not terminate bash:

    $ echo -e '\x04\necho see?'|xxd
    00000000: 040a 6563 686f 2073 6565 3f0a            ..echo see?.
    $ echo -e '\x04\necho see?'|bash
    bash: line 1: $'\004': command not found
    see?
    

    As you can see, it instead interprets it as a command.

    (Control characters are perfectly cromulent filenames btw...)
    $ echo -e '#!/bin/bash\necho lmao' > ~/.local/bin/$(echo -en '\x04')
    $ chmod +x ~/.local/bin/$(echo -en '\x04')
    $ echo -e '\x04\necho see?'|bash
    lmao
    see?
    
    • mina86@lemmy.wtf
      3 days ago

      Which is why I haven’t written ‘EOF character’, ‘EOT’ or ‘EOT character’. Neither have I claimed that the \x04 character is interpreted by the shell as end of file.

      Edit: Actually, I did say ‘EOF character’ originally (though I still haven’t claimed that it sends an EOF character to the program). I’ve updated the comment to clear things up more.