XXX tone down the arrogant pedant attitude
(Edited and considerably expanded from my comments on the orange website.)
What does it mean for something to be a “programming language”? It might seem like a trivial discussion of semantics, but as it turns out, it’s important to care for semantics in order to have a conversation that conveys information, which is necessary for having culture.
Moreover, the real question here is not one of word definitions, but one of ontology: what kinds of categories exist in the real world? If we attempt to reason with incoherent categories, such as “non-elephant mammals plus mosquitoes”, we cripple our reasoning. Reasoning with such categories, we are likely to believe that mosquitoes probably bear live young and produce milk, even if we haven’t been able to observe these phenomena yet, and be unable to use observations of elephants to reject false hypotheses about mammals.
In the 20th century, after thousands of years of effort, the humans finally succeeded in formalizing the notion of an logical procedure or algorithm in the construct of a computer program, and the implications of this discovery touch on some of the most profound questions about the limits of the knowable, the foundations of mathematics, and even the nature of the physical universe; and subsequent experiments rapidly disproved many long-held philosophical notions about the nature of human thought. The concept of a programming language is emerging as fundamental to this far-reaching intellectual breakthrough.
Dismayingly, a recent article in IEEE Spectrum incorrectly described HTML as a programming language, so it seems to me important to clear up the confusion.
One person eloquently defended the popular confusion as follows:
People don’t care for semantics. If you write it and it causes a PC to do something (even if it’s just to show a website), it’s a programming language. Doesn’t need to have branching, variables, etc... (not to mention that HTML includes JS and CSS)
print "hello world"
,<b>hello world</b>
, same difference.
This seems to amount to the position “my ignorance is as good as your knowledge”.
Unfortunately, this proposed definition of programming languages includes CSV, JPEG, Word, and URLs. Does that mean that anyone who snaps a photo on Instagram is a programmer? This does not seem to be in accordance with the usual meaning of the word “programmer”. Is it “programming” to write a dunning letter in Word? It seems to me like a qualitatively different activity.
Upon being presented with this argument, the misguided-egalitarian admirer of foolishness responded as follows:
And yet people don’t call them “programming languages”, while they do call HTML that — so that even if HTML is not a programming language, there’s still a not formally expressed difference that people can intuitively grasp with those other things...
You can’t program most things in SQL either (not without some modern extensions that make it Turing complete), but it is still considered a programming language...
So what is it that really distinguishes an ontologically coherent category of “programming languages” from other things, and why are the ignorant so frequently confused into thinking HTML is a programming language? What is this “not formally expressed difference” that “people can intuitively grasp”? And how much can you program in SQL, anyway?
Programming languages are all more or less equivalent; although they have a variety of paradigms, and although they are supplied with a variety of I/O facilities, you can program more or less precisely the same set of computations in all of them. This is called “Turing-completeness”.
Consider the differences between C, Forth, Smalltalk, Prolog, Fractran, Lisp, Brainfuck, Haskell, VHDL, amd64 machine code, Malbolge, Scratch, Octave, Python, and R: some of these do not even have branching or loops or variables, but they are all programming languages; none of them can compute anything that any of the others cannot.
The exception might be things like Turner’s “Total Functional Programming”, which excludes nonterminating computations without, he hopes, excluding much of practical interest; Excel, excluding the macro language, also has this limitation, but in Excel it is more onerous.
By contrast, you can’t compute so much as a polynomial or NAND in HTML. That is to say, you can’t program at all in HTML. So HTML is not a programming language. The same is true of CSV, JPEG, URLs, and Word documents (except insofar as Word documents might include macros).
You might say that HTML includes JS, but this is wrong. English does not include Latin, although, e.g.†, phrases like “inter alia” and “et cetera” can be used in English, and it’s not uncommon for an English sentence to quote a Latin motto. Similarly, HTML does not include JS and CSS, just as it does not include the URL syntax, and Python does not include SQL or HTML; they are six separate languages.
† exempli gratia
It’s fairly uncommon for people to call SQL a “programming language”, but you actually can program quite a bit with SQL; even without the recursive common table expressions alluded to above by the promoter of vulgar misconceptions, you can do surprisingly complex computations with it, and even define functions in the form of views, as long as they are finite. Certainly neither NAND nor evaluating polynomials poses any difficulty for SQL. In fact, here’s an example of arbitrary polynomial evaluation in SQL which works in MySQL, MariaDB, Postgres, Oracle, and SQL Server:
select x, sum(a * power(x, e)) from data group by x;
In SQLite, you easily can evaluate a particular polynomial, or, with a bit more hassle, any polynomial up to some finite degree, but evaluating an arbitrary polynomial would seem to be out of reach without using recursive CTEs. Here’s an example of how to do this for polynomials up to the fourth degree without recursive CTEs:
select x, sum(a * case e when 4 then x*x*x*x
when 3 then x*x*x
when 2 then x*x
when 1 then x
when 0 then 1 end) from data group by x;
The identifier data
provides the polynomial to evaluate and the
points at which to evaluate it; this can be a table or, in most
dialects of SQL, provided directly inline. In SQLite, for example, in
this case evaluating 5x⁴ + 2x + 5 at the points 0, 1, 2, and 3:
select x, sum(a * case e when 4 then x*x*x*x
when 3 then x*x*x
when 2 then x*x
when 1 then x
when 0 then 1 end)
from (select 3 as x union select 0 union select 1 union select 2) xt,
(select 4 as e, 5 as a union
select 0, 5 union
select 1, 2) t
group by x;
This syntax works in other database engines, but in, for example, Postgres, you can instead simply say
select x, sum(a * power(x, e))
from (select 3 as x union values (0), (1), (2)) xt,
(select 4 as e, 5 as a union values (0, 5), (1, 2)) t
group by x;
MySQL (5.5.62, at least) has power
but requires the more verbose
literal data syntax:
select x, sum(a * power(x, e))
from (select 3 as x union select 0 union select 1 union select 2) xt,
(select 4 as e, 5 as a union
select 0, 5 union
select 1, 2) t
group by x;
Note, though, that even the Postgres version of this is somewhat repetitive; to evaluate the polynomial at 100 points, we would have to put 100 numbers and 192 more words into the source code. While this is not as bad as HTML, where we would have to evaluate the polynomial ahead of time (for example, with pencil and paper, or with SQL, or with an actual programming language) it still means that there are things that are easier to program in languages such as Fractran or Malbolge than in SQL — without recursive CTEs, that is.
With recursive CTEs, you can do it compactly, though fairly awkwardly, because recursive CTEs do make SQL Turing-complete:
with recursive redonditos as (select 0 as x union select x+1 from redonditos where x < 99) select x, sum(a * power(x, e)) from (select 4 as e, 5 as a union values (0, 5), (1, 2)) t, redonditos group by x;
This works in Postgres (9.5.14) and, with the change explained earlier, in SQLite 3. MySQL doesn’t support CTEs, at least as of 5.5.62.
Excel without macros, Coq, and GNU MathProg are other languages that can compute large classes of interesting functions but are not Turing-complete.
The first and most important fact about programming languages is the Turing Tarpit Principle: they are all equivalent, in the sense that any of them can compute any algorithm, which is to say, anything any of the others can compute; but that does not mean that it is equally easy in all of them. (“In the Turing Tarpit,” observes Fred Brooks, all computable functions are “possible, but nothing of interest is easy.” “Turing Tarpit” languages like Malbolge perversely elevate this to a design goal, and so it took several years for someone to figure out how to write a working loop in Malbolge.)
In particular, the Halting Problem applies to all of them: you cannot always compute whether a program in them will continue to loop forever on a given input, or whether it will eventually terminate. Turing gave an airtight but somewhat lengthy and mindbending proof of this in 1936.
But a very simple argument that shows that this is at least very difficult is searching for counterexamples to the Collatz conjecture. In Python:
cn = lambda n: 3*n + 1 if n % 2 == 1 else n // 2
(x, n, m, p) = (1, 1, 1, 0)
while True:
if m == 1 or cn(m) == 1:
print("{} → 1 in {} steps".format(x, 2*p if m == 1 else 2*p+1))
(x, n, m, p) = (x+1, x+1, x+1, 0)
else:
(n, m, p) = (cn(n), cn(cn(m)), p+1)
if n == m:
print("{} loops without reaching 1".format(x))
exit()
This produces lines such as the following:
1 → 1 in 0 steps
2 → 1 in 1 steps
3 → 1 in 7 steps
4 → 1 in 2 steps
5 → 1 in 5 steps
6 → 1 in 8 steps
7 → 1 in 16 steps
8 → 1 in 3 steps
9 → 1 in 19 steps
10 → 1 in 6 steps
The Collatz conjecture says that this sequence, where the successor of
a number n is 3n + 1 for odd n and ½n for even n, reaches 1
if you start at any positive n. Lothar Collatz proposed this in
1937, but nobody has found a proof of it yet, nor a disproof. If the
conjecture is true, then the program above will run forever,
continuously printing lines of text, if it isn’t halted externally.
If it is false because there exists a Collatz cycle that doesn’t
include 1, the program will halt by invoking exit()
. If it is false
because it finds a Collatz sequence that increases without bound, it
will instead run forever without printing anything, if it doesn’t run
out of memory first.
Nobody knows which of these three is the case, despite some 80 years of effort from many of the most famous mathematicians and physicists; at least one book has been published on the subject. (It is entitled The Ultimate Challenge.) That is, 80 years of research by many people has not been sufficient to predict the behavior of the 11 lines of code above. And you can translate that code into any programming language. (However, it is known that if there is a counterexample to the Collatz conjecture, it is larger than 87 × 2⁶⁰, so your translation may involve multiple-precision arithmetic and therefore be several pages of code. Python has arbitrary-precision arithmetic built in.)
You cannot translate it into CSV, JPEG, HTML, URLs, Excel without macros, SQL without recursive CTEs, Coq, MathProg, or Turner’s Total Functional Programming. However, you can translate the computation within the loop into most of these, at least with numbers limited to some finite size (such as 128 bits or 256 bits); that is, you can translate this part:
cn = lambda n: 3*n + 1 if n % 2 == 1 else n // 2
if m == 1 or cn(m) == 1:
print("{} → 1 in {} steps".format(x, 2*p if m == 1 else 2*p+1))
(x, n, m, p) = (x+1, x+1, x+1, 0)
else:
(n, m, p) = (cn(n), cn(cn(m)), p+1)
if n == m:
print("{} loops without reaching 1".format(x))
exit()
For example, here is a fairly horrific but apparently correct† translation into SQL (without recursive CTEs), tested in SQLite3 and Postgres:
select case when nextx then x + 1 else x end as x,
case when nextx then x + 1
else (case n % 2 when 1 then 3*n + 1
else n / 2 end)
end as n,
case when nextx then x + 1
else (case newm % 2 when 1 then 3*newm + 1
else newm / 2 end)
end as m,
case when nextx then 0 else p+1 end as p,
case when nextx then 0=1 else n = newm end as exit
from (select (m = 1 or newm = 1) as nextx, x, n, m, p, newm
from (select x, n, m, p,
case m % 2 when 1 then 3*m + 1
else m / 2 end as newm
from (select 16 as x, 4 as n, 1 as m, 2 as p) state) a) b;
From there it is a matter of finding some external way to initialize
the computation and repeat it until it results in your translation of
exit()
. (And, as Turing argued, this is a generalized property of
all programs, not just this Collatz program — as long as you have
infinite memory, any of them can be rendered into an even more limited
form where each execution step is just a lookup in a finite table of
next actions.)
However, even in this limited form, you cannot translate the Collatz program into HTML, CSV, JPEG, or an URL. So, in that way, things like Excel without macros and SQL without recursive CTEs are still programming languages, even though they aren’t Turing-complete, while things like HTML are not.
† Implementing multiple-precision arithmetic in SQL is left as an
exercise to the reader. Maybe you can use DECIMAL
— in Postgres,
select power(2::decimal(30), 64) - (power(2::decimal(30), 64) - 1)
works correctly, but that syntax is a Postgres extension to SQL.
In Postgres 9.5.14 the query
with recursive collatz as
(select 1 as x, 1 as n, 1 as m, 0 as p, 0=1 as exit union
select case when nextx then x + 1 else x end as x,
case when nextx then x + 1
else (case n % 2 when 1 then 3*n + 1
else n / 2 end)
end as n,
case when nextx then x + 1
else (case newm % 2 when 1 then 3*newm + 1
else newm / 2 end)
end as m,
case when nextx then 0 else p+1 end as p,
case when nextx then 0 = 1 else n = newm end as exit
from (select (m = 1 or newm = 1) as nextx, x, n, m, p, newm
from (select x, n, m, p,
case m % 2 when 1 then 3*m + 1 else m / 2 end as newm
from collatz where not exit) a) b
where x < 100)
select * from collatz;
appears to work, producing 1634 rows in 23 ms.
In SQLite 3.11.0, subqueries cannot refer to recursive CTEs, and it's not obvious to me how to reformulate this query to eliminate the subqueries.
We could describe the above program as computing a partial function of the integers, namely how many Collatz iterations it takes to reach 1 from each integer. It’s a partial function in that the program might fail to produce an answer for a given input if that Collatz sequence increases without bound. Rice’s Theorem extends Turing’s proof of the Halting Problem’s incomputability to say that there is no algorithm to prove any nontrivial property of the partial function computed by given Turing machines, and thus by given programs in any Turing-complete programming language. The definition of “nontrivial” is amazingly wide.
There is additionally an issue related to computational complexity: in programming languages it is easy to write a program that will take a very long time to run, even if it terminates.
This means that it is very easy to, in some sense, get ourselves in trouble in any programming language — to produce a program intended to do something in particular, but not be able to tell whether the program does in fact always do that. This leads to the widely-remarked-on phenomenon of buggy software.
No. If we can extract any value judgment from this, it is rather the opposite: it means that programmers are constantly tripping over their own feet attempting to accomplish even seemingly-trivial tasks in their programs. This is the motivation behind the design of non-Turing-complete languages like Coq, GNU MathProg, URLs, HTML, Excel, and SQL (at least until recently): by steering clear of Rice’s Theorem, we can ensure that our constructs behave in predictable ways, can be analyzed in certain ways, and can be changed in predictable ways. In the history of the WWW, this was called the “Principle of Least Power”; HTML was deliberately designed to not be a programming language due to experience with using programming languages like PostScript and TEX to represent documents.
In building any system, you should do as little as possible in programming languages and as much as possible in passive data formats like HTML.
The ongoing struggle to navigate between the Scylla of programming by copy-pasting boilerplate and the Charybdis of Rice’s Theorem has given rise to a fruitful and active area of research that continually expands the set of borderline “programming language” cases; the SQL example above shows one of the most successful cases. However, HTML is nowhere near that borderline. It is a format for passive documents consisting of marked-up text.
I have occasionally heard people saying HTML is a programming language. But they’re just wrong. Presumably most people at some point thought the world was flat; we shouldn’t attempt to explain away their error by saying that they meant something different by “world” or “flat” than we do. Some people think vaccines cause autism; this isn’t actually because they’re using the word “vaccine” or “autism” in a different sense than we are. They’re just wrong. (As for the people who think organic food contains no chemicals, I’m not sure; I think some of them are just wrong, while others are in fact just using a nonstandard definition of “chemical”, meaning something like “pure chemical” or “industrially produced chemical”.)
In the same way, people are just wrong if they think HTML belongs to the set that includes C, Forth, Smalltalk, Prolog, Fractran, Lisp, Brainfuck, Haskell, VHDL, amd64 machine code, Malbolge, Scratch, Octave, Python, and R, rather than to the set that includes CSV, JPEG, Word, and URLs. Similarly, bash clearly belongs to the former set, and Excel (without macros) and SQL arguably do.
In a lot of cases, they seem to be reasoning based on shallow surface features like the use of plain ASCII text files. (You can see several examples of this even in the original thread discussing the issue on the orange website.) This is similar to the neural network that learned to recognize photos of tanks based on whether the photos had been taken on a sunny day or a cloudy day, or the people who think I’m “hacking into systems” when they see me using a terminal emulator — an understandable error from the ignorant and foolish, but not one we should allow to confuse our own thinking.
After all, to an illiterate person, a page full of random letters and spaces looks much the same as a page of a novel or a page from a legal brief, and quite visibly different from, for example, a painting of the Virgin Mary or a pornographic sculpture. We would not therefore assert that the legal brief is “fiction”, nor the page of random letters “legalese”. But an unlettered peasant might make such an error, just as they might assert that the Earth is flat.
This, I think, is the “not formally expressed difference” that “people can intuitively grasp”: if you don’t know how to read it, an editor window full of HTML looks similar to an editor window full of JS, and different from an editor window with a Microsoft Word document in it, because the font has a fixed width and all the letters are the same font size. That’s what people intuitively grasp.
The surprise is to find that the editors of IEEE Spectrum have fallen to such abysmal levels of ignorance.
It is unfortunate to be ignorant, not reproachable. Reproaching someone for their ignorance is contemptible, like reproaching poor people for their poverty or sick people for their sickness. But to spread ignorance is to impede others from escaping that unfortunate situation, and that is reproachable. It is malfeasance as surely as robbery or deliberately infecting people with diseases.