We should totally have a way to render HTML-style links in terminal emulators.
Lots of terminal emulators already have a way to recognize URLs in free text so that you can visit them (ctrl-click in gnome-terminal, an option on the dropdown menu in both gnome-terminal and Konsole), which is useful. But if you're chatting over XMPP with someone who's using HTML (like HipChat), they're probably going to embed links from time to time. And they expect that when they say:
We'll go from my house to Facebook at around 10:00.
where "my house" is linked to http://maps.google.com/maps?client=ubuntu&channel=fs&oe=utf-8&q=1456+Edgewood+Dr.,+Palo+Alto,+CA&um=1&ie=UTF-8&hq=&hnear=0x808fbb0d745a65e9:0xfde4b05151805922,1456+Edgewood+Dr,+Palo+Alto,+CA+94301&gl=us&sa=X&ei=bZNCUbbuI62g4AOPyYDQDw&ved=0CDAQ8gEwAA and "Facebook" is linked to http://maps.google.com/maps?q=facebook&hl=en&sll=37.45392,-122.13933&sspn=0.007495,0.013797&gl=us&hq=facebook&t=m&z=16, then you will see something like
We'll go from my house to Facebook at around 10:00.
and not
We'll go from http://maps.google.com/maps?client=ubuntu&channel=fs&oe =utf-8&q=1456+Edgewood+Dr.,+Palo+Alto,+CA&um=1&ie=UTF-8&hq=&hnear=0x80 8fbb0d745a65e9:0xfde4b05151805922,1456+Edgewood+Dr,+Palo+Alto,+CA+9430 1&gl=us&sa=X&ei=bZNCUbbuI62g4AOPyYDQDw&ved=0CDAQ8gEwAAmy house to
http://maps.google.com/maps?q=facebook&hl=en&sll=37.45392,-122.13933&s spn=0.007495,0.013797&gl=us&hq=facebook&t=m&z=16 Facebook at around
10:00.
which is pretty annoying and hard to read.
ESC [ _ and ESC [ U < url >
We define two new ANSI-compatible escape sequences, which ought to be proposed for the next version of ECMA-48 and the corresponding ISO standard:
CSI _, or ESC [ _, "LNK": to indicate the beginning of a
hyperlink to an URL. All characters produced before the next URL
sequence form part of the link.
CSI U, or ESC [ U, "URL": to indicate the end of the link text
begun by LNK. This sequence is followed by the unencoded text of
the URL to which clicking the link should take the user, preceded by
"<" and terminating before the next ">" or " ", which is part of the
escape sequence and not displayed.
As an example, a link labeled "Canonical Hackers" linking to
http://canonical.org/ could be represented as follows, with ESC
representing the ASCII escape character:
ESC[_Canonical HackersESC[U<http://canonical.org/>
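A program that wanted to emit such a link might do something like the following. This is only a sketch of the proposed encoding; the function name make_link is made up for illustration, and rejecting URLs that contain a space or ">" is my assumption about how to handle the terminator rule, since such a URL would be cut short by the receiving terminal.

```python
ESC = "\x1b"  # ASCII character 27


def make_link(text: str, url: str) -> str:
    """Wrap text in the proposed LNK and URL sequences (hypothetical helper)."""
    # The URL ends at the next ">" or " ", so neither may appear inside it.
    if " " in url or ">" in url:
        raise ValueError("URL must not contain a space or '>'")
    return f"{ESC}[_{text}{ESC}[U<{url}>"


if __name__ == "__main__":
    # repr() makes the escape characters visible instead of sending them raw.
    print(repr(make_link("Canonical Hackers", "http://canonical.org/")))
```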
Above and beyond the typographical possibilities of whitespace, we
already have boldface, underlining, and different colors in our
terminal emulators. These are produced by "escape sequences",
invisible sequences of characters typically beginning with the ESC
character (character 27) followed by a "[", which change the state of
the terminal so that subsequently displayed characters will have a
different effect. The ESC[ sequence is called "CSI", "Control
Sequence Introducer".
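For concreteness, here is roughly how a program produces boldface, underlining, and color today. The codes used (1 for bold, 4 for underline, 31 for red foreground, 0 for reset) are the standard "select graphic rendition" parameters; the helper names sgr and styled are just for this sketch.

```python
ESC = "\x1b"     # ASCII character 27
CSI = ESC + "["  # Control Sequence Introducer


def sgr(*codes: int) -> str:
    """Build a CSI ... m sequence from numeric rendition codes."""
    return CSI + ";".join(str(c) for c in codes) + "m"


def styled(text: str, *codes: int) -> str:
    """Apply the given codes to text, then reset so the state does not leak."""
    return sgr(*codes) + text + sgr(0)


if __name__ == "__main__":
    print(styled("bold and underlined", 1, 4))
    print(styled("red", 31))
```

Note that the reset at the end is the program's responsibility; the terminal keeps whatever state it was last put in.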
So I think we should define an escape sequence to make the following (or preceding?) sequence of characters into a clickable link to a given URL.
It would be desirable if the escape sequence degraded into something that was human-readable when displayed on terminals that didn't support it. This is only possible to a limited extent, since programs that try to be aware of screen layout will necessarily be confused about where the text is on the screen, but it is still somewhat possible.
ANSI escape sequences can't contain sequences of arbitrary characters,
while URLs can contain most characters. Consequently the escape
sequences for setting the terminal title aren't ANSI escape sequences:
ESC]0;new title^G, where ^G is the BEL character control-G
(character 7) and can also be ESC\, and 0 can be replaced with
1 or 2. The ESC] sequence is known as "OSC" or "Operating
System Command".
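As a sketch of the title sequence just described (the function name and its parameters are made up for illustration):

```python
ESC = "\x1b"
BEL = "\x07"     # control-G, character 7
ST = ESC + "\\"  # the alternative terminator, ESC\


def title_sequence(title: str, selector: int = 0, use_st: bool = False) -> str:
    """Build ESC]<selector>;<title> followed by ^G or ESC\\."""
    if selector not in (0, 1, 2):
        raise ValueError("selector must be 0, 1, or 2")
    terminator = ST if use_st else BEL
    return f"{ESC}]{selector};{title}{terminator}"


if __name__ == "__main__":
    # Printed with repr() so that running this does not change your title.
    print(repr(title_sequence("new title")))
```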
xterm's parsing of the above sequence seems to consume anything
following ESC] until the next ^G or ESC\, even if it doesn't
start with a digit, which is pretty unfriendly. This suggests that if
we wanted to use the same ESC] introduction sequence, incompatible
xterm implementations would eat the URL if we terminated it with the
same ^G, and other things as well if we used a different terminator.
Even if most people are now using other terminal emulators, this seems
like a compatibility trap.
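To illustrate the trap, here is a rough model of a terminal that discards everything from ESC] up to the next ^G or ESC\. It is not xterm's actual parser, just a sketch under the assumption that the behavior is as described above; the function name is made up.

```python
ESC = "\x1b"
BEL = "\x07"
ST = ESC + "\\"


def shown_by_osc_swallower(data: str) -> str:
    """Return what stays visible if ESC] ... (^G or ESC\\) is always discarded."""
    out = []
    i = 0
    while i < len(data):
        if data.startswith(ESC + "]", i):
            # Look for whichever terminator comes first after the introducer.
            candidates = [(pos, len(term))
                          for term in (BEL, ST)
                          for pos in [data.find(term, i + 2)]
                          if pos != -1]
            if not candidates:
                break  # no terminator: the rest of the output is eaten
            pos, length = min(candidates)
            i = pos + length
        else:
            out.append(data[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    same = "Canonical" + ESC + "]http://canonical.org/" + BEL + " is up"
    other = "Canonical" + ESC + "]http://canonical.org/> is up\nmore output"
    print(repr(shown_by_osc_swallower(same)))   # URL eaten, rest survives
    print(repr(shown_by_osc_swallower(other)))  # everything after the link text eaten
```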
Putting the link escape sequence after rather than before the text to be marked up offers a certain kind of safety; it's quite easy for an unterminated color escape sequence to set the next few pages of output to white on white, or flashing, or underlined, or whatnot. It also probably improves matters on terminals that don't yet understand the escape sequence, as they could see "Canonicalhttp://canonical.org/" rather than "http://canonical.orgCanonical". Since you presumably need two escape sequences anyway to indicate the boundary of the link, you might as well put the URL in the final one. So you have one escape sequence to indicate "beginning of link" and another one to indicate "end of link; URL is xyz".
For reasons of tradition and URL-safety, I would like the delimiters
(particularly in the gracefully degraded form) to be <>. Ideally
the beginning and ending sequences would also have a pleasing visual
and memorable symmetry, and would disappear from view entirely in old
terminals.
(In the following, ESC means the ASCII character 27, escape.)
This suggests using one of the ASCII matching delimiter pairs
[](){}<>`'. [] are right out, since they're already in use.
ESC( and ESC) cause rxvt to swallow the following character; I'm
not sure what their meaning is, but they seem to be used.
Treating `' as a matched pair is sadly out of style, due to the
unfortunate but now nearly universal adoption from Microsoft Windows
fonts of a vertical '.
ESC{ and ESC} disappear in rxvt and xterm, render as literal ESC
character glyphs in gnome-terminal, overlaid on top of the {} in my
font, and disappear in konsole while producing warning messages on its
stderr. ESC< and ESC> disappear in rxvt; the second disappears in
gnome-terminal, while the first renders as a literal ESC character
glyph; they disappear in konsole. This suggests that perhaps ESC<
and ESC> are already taken.
A little looking around suggests that ESC> is "set numeric keypad
mode" aka DECKPNM on VT100s, ESC< is "exit VT52 mode" (that is,
enter ANSI mode) on VT52s, while ESC( and ESC) are used by VT100s
to change character sets (setaltg0, setaltg1, etc.).
ESC} on VT100s is "invoke the G2 character set", according to
http://rtfm.etla.org/xterm/ctlseq.html, although that's ignored in
xterm and probably all other modern terminal emulators.
So this suggests the following syntax:
ESC{Canonical.ESC}<http://canonical.org/>
Actually, though, you could use valid ANSI escape sequences instead of
ESC{ and ESC}. I wanted to use ESC[a for the "begin link"
sequence (like <a>), but rxvt already uses it for a nonstandard
"move right" sequence, an alternative spelling of ESC[C. (See
rxvt-2.6.4/src/command.c:2652, inside process_csi_seq.) The full
set of CSI-ending codes handled by rxvt seems to be
iAeBCaDEFG`dHfIZJK@LMXPT^ScmnrshltgW, which is to say,
@ABCDEFGHIJKLMPSTWXZ^`acdefghilmnrst.
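The sorting above, and the search for a free final character, can be redone mechanically. The sketch below assumes that CSI final characters run from "@" to "~" (ASCII 64 to 126) and only checks against rxvt's list; it says nothing about what other terminals or the standards already assign.

```python
# CSI-ending codes that rxvt 2.6.4 seems to handle, in source order.
rxvt_finals = "iAeBCaDEFG`dHfIZJK@LMXPT^ScmnrshltgW"

# The same set in ASCII order.
sorted_finals = "".join(sorted(set(rxvt_finals)))

# Every possible final character, minus the ones rxvt uses.
all_finals = {chr(c) for c in range(64, 127)}
unused = "".join(sorted(all_finals - set(rxvt_finals)))

if __name__ == "__main__":
    print(sorted_finals)
    print(unused)
```

Both "_" and "U" turn up in the unused list, while "a" does not.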
Argh, so what to use for the other escape sequence? Wikipedia says:
For two character sequences, the second character is in the range ASCII 64 to 95 (@ to _). However, most of the sequences are more than two characters, and start with the characters ESC and [ (left bracket). This sequence is called CSI for Control Sequence Introducer (or Control Sequence Initiator). The final character of these sequences is in the range ASCII 64 to 126 (@ to ~).
This suggests that we could use "ESC[_", which konsole reports as an "Undecodable sequence" and drops, rxvt apparently drops (and isn't in rxvt's list of CSI codes), xterm and screen drop, and gnome-terminal displays literally. "ESC[_" is nice and mnemonic: links are normally underlined.
ESC[U would work for the "end link, begin URL" sequence, which could
be followed by the URL wrapped in <>. If the URL is specified to
end at the next space or >, then this sequence would be unlikely to
inadvertently gobble up a large quantity of text when random data is
sent to the terminal that randomly happens to include ESC[U. So
that would give us:
ESC[_Canonical.ESC[U<http://canonical.org/>
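On the receiving side, a terminal (or a filter in front of one) would need to pick the link text and URL out of the stream. The following is a minimal sketch using a regular expression, not a real terminal parser: it ignores other escape sequences, nesting, and partial input, and all the function names are made up. It also shows what a terminal that silently drops both sequences would display.

```python
import re

ESC = "\x1b"

# ESC[_ <link text> ESC[U < <url> then ">" or " ", which is consumed.
LINK_RE = re.compile(r"\x1b\[_(.*?)\x1b\[U<([^> ]*)[> ]", re.DOTALL)


def extract_links(data: str) -> list:
    """Return (text, url) pairs for every complete link in data."""
    return [(m.group(1), m.group(2)) for m in LINK_RE.finditer(data)]


def link_text_only(data: str) -> str:
    """What a link-aware terminal would draw: the text, without the URL."""
    return LINK_RE.sub(lambda m: m.group(1), data)


def degraded(data: str) -> str:
    """What a terminal that drops both unknown sequences would draw."""
    return data.replace(ESC + "[_", "").replace(ESC + "[U", "")


if __name__ == "__main__":
    sample = "see " + ESC + "[_Canonical." + ESC + "[U<http://canonical.org/> now"
    print(extract_links(sample))
    print(link_text_only(sample))
    print(degraded(sample))
```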
You could write your own terminal emulator to support links, but it probably makes more sense to implement the feature in existing terminal-emulation software. The popular free-software terminal emulators are tmux, screen, gnome-terminal, konsole, rxvt, xterm, Emacs, and whatever Apple ships; support in ncurses may also be necessary before much application software can use it.
I looked through the available file on my Ubuntu box to see what other terminal emulators there are. The relevant popularity metrics from http://popcon.debian.org/main/by_vote are:
#rank name inst vote old recent no-files (maintainer)
34 libncurses5 137051 119786 7528 9710 27 (Craig Small)
448 libvte9 65767 30960 26691 8033 83 (Debian Gnome Maintainers)
499 gnome-terminal 54405 27886 21381 5118 20 (Debian Gnome Maintainers)
771 xterm 77540 17077 50118 10317 28 (Debian X Strike Force)
918 screen 45241 12867 30602 1758 14 (Axel Beckert)
1078 libvte-2.90-9 27674 9519 11363 5591 1201 (Debian Gnome Maintainers)
1182 konsole 16489 8229 6662 1590 8 (Debian Qt/kde Maintainers)
1434 emacsen-common 28896 5699 19961 2805 431 (Rob Browning)
1609 xfce4-terminal 9368 4110 4360 896 2 (Debian Xfce Maintainers)
2109 tmux 6908 2154 4244 508 2 (Karl Ferdinand Ebert)
2285 lxterminal 5290 1803 2946 541 0 (Debian Lxde Maintainers)
2739 terminator 2158 1274 764 119 1 (Nicolas Valcárcel Scerpella)
2766 yakuake 2324 1242 972 110 0 (Ana Beatriz Guerrero Lopez)
3073 guake 1499 972 449 78 0 (Sylvestre Ledru)
4913 tilda 724 324 367 33 0 (Davide Truffa)
4920 eterm 1189 322 787 80 0 (Debian Qa Group)
5130 terminal.app 830 294 509 26 1 (Debian Gnustep Maintainers)
5289 rxvt 2032 274 1634 124 0 (Jan Christoph Nordholz)
6432 aterm 972 180 761 31 0 (Debian Qa Group)
6948 cutecom 796 148 596 50 2 (Roman I Khimov)
7207 gtkterm 781 135 619 27 0 (Sebastien Bacher)
7299 sakura 299 132 155 12 0 (Andrew Starr-bochicchio)
7376 ajaxterm 251 128 116 7 0 (Julien Valroff)
7855 mrxvt 452 110 320 22 0 (Jan Christoph Nordholz)
7768 picocom 535 113 378 43 1 (Matt Palmer)
7975 mlterm 628 106 448 73 1 (Kenshi Muto)
8386 fbterm 1392 95 1097 199 1 (Nobuhiro Iwamatsu)
8947 roxterm 470 82 79 5 304 (Tony Houghton)
10260 pterm 345 59 231 55 0 (Colin Watson)
10458 jfbterm 1142 56 1007 78 1 (Debian Qa Group)
10771 kterm 560 52 497 11 0 (Ishikawa Mutsumi)
13843 evilvte 117 27 87 3 0 (Wen-yen Chuang)
14563 microcom 187 24 150 13 0 (Alexander Reichle-schmehl)
14898 xvt 212 23 177 11 1 (Sam Hocevar)
15979 termit 71 19 49 3 0 (Thomas Koch)
19919 vala-terminal 44 10 32 2 0 (Debian Freesmartphone.org Team)
24553 xiterm+thai 36 5 28 3 0 (Neutron Soutmun)
24634 bogl-bterm 41 4 32 5 0 (Samuel Thibault)
25809 pyqonsole 31 4 24 3 0 (Alexandre Fayolle)
45654 libterm-vt102-perl 5 0 5 0 0 (Debian Perl Group)
(I thought Text::CharWidth might be relevant, but it doesn't seem to handle escape sequences anyway.)
Now, libvte9 or libvte-2.90-9 is the actual terminal emulator library that powers a number of the above terminal emulators, at least gnome-terminal, sakura, xfce4-terminal, vala-terminal, tilda, lxterminal, gtkterm, and evilvte. But from looking at the code of gnome-terminal and xfce4-terminal, each separate application built on top of libvte would probably have to write some code to handle clicks on link regions.
It seems that once you add the feature to libncurses5 (in the termcap, at least), libvte9, and gnome-terminal, it's available to 20% of users of Debian and similar systems; if you add it to xterm, you get another 12%, although it's perhaps dubious whether upstream xterm will accept such a patch; adding the feature to screen makes it available to another 9%, although some of them will still be using other terminals (such as MacOS X terminal) to connect to their screen; konsole gets another 6%; emacs ansi-color.el 4%; xfce4-terminal 3%; and tmux 1.6%.
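The percentages above can be reproduced from the vote column of the table. The denominator is my assumption: I use the libncurses5 install count (137051) as a stand-in for the total number of reporting systems, which the table itself does not give.

```python
# "vote" column from the popcon table above.
votes = {
    "gnome-terminal": 27886,
    "xterm": 17077,
    "screen": 12867,
    "konsole": 8229,
    "emacsen-common": 5699,
    "xfce4-terminal": 4110,
    "tmux": 2154,
}

# Assumption: libncurses5's "inst" count approximates the number of
# systems reporting, since nearly every system has it installed.
REPORTING_SYSTEMS = 137051


def share(package: str) -> float:
    """Percentage of reporting systems that regularly use the package."""
    return 100 * votes[package] / REPORTING_SYSTEMS


if __name__ == "__main__":
    for name in votes:
        print(f"{name:16} {share(name):5.1f}%")
```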
Some escape sequences of the past have been disabled for security reasons in modern software. However, in general, it's safe to launch arbitrary URLs in a normal browser at the moment, or so we believe; so this should be safe. It may, however, result in terminal users getting rickrolled. It would be useful to have a way to see what the linked URL is before you click on it.