Linda was a coordination language for parallel computing in which you could “in” tuples from a global blackboard (or “rd” them, which is the same thing except that the tuple isn’t removed) and “out” new ones; the tuple space served as a sort of hybrid of communication and storage. Its tuple matching is sort of like Prolog’s, and I think it gave rise to the family of Concurrent Prolog systems. It’s dramatically easier to program than message-passing systems.
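To make the in/rd/out semantics concrete, here’s a minimal single-process sketch in Python (my own illustration, not any real Linda implementation): tuples are Python tuples, None acts as a wildcard field, and “in” blocks until a matching tuple appears.

    import threading

    class TupleSpace:
        def __init__(self):
            self.tuples = []
            self.cond = threading.Condition()

        def out(self, tup):
            with self.cond:
                self.tuples.append(tup)
                self.cond.notify_all()

        def _match(self, pattern):
            for tup in self.tuples:
                if len(tup) == len(pattern) and all(
                        p is None or p == t for p, t in zip(pattern, tup)):
                    return tup

        def in_(self, pattern):            # "in" is a Python keyword
            with self.cond:
                while (tup := self._match(pattern)) is None:
                    self.cond.wait()
                self.tuples.remove(tup)
                return tup

        def rd(self, pattern):             # like in_, but leaves the tuple
            with self.cond:
                while (tup := self._match(pattern)) is None:
                    self.cond.wait()
                return tup

So after space.out(('sum', 3, 4)), a worker’s space.in_(('sum', None, None)) returns ('sum', 3, 4) and removes it, while rd with the same pattern would leave it on the blackboard.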
Many new software systems are built on ØMQ (ZeroMQ) or LevelDB, minimalist designs that combine extreme efficiency with extreme flexibility.
ØMQ is a sort of hybrid of sockets and message-queuing systems like RabbitMQ, one that doesn’t necessarily require a message broker as such. Like message-queuing systems, it has message framing, permits publish-subscribe communication, and can queue messages in RAM until they are processed. Like sockets, its messages are mere strings of bytes (rather than serialized data structures with an associated type system), and producers can connect directly to consumers. To reconcile publish-subscribe with mere strings of bytes, a message can be divided into a key and a value, and a subscription is a byte prefix that is matched against the key, or against the whole message if there is no key/value split.
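For instance, with the pyzmq bindings (the endpoint and key names here are invented for illustration), a prefix subscription looks like this:

    import time
    import zmq

    ctx = zmq.Context()

    pub = ctx.socket(zmq.PUB)
    pub.bind("tcp://127.0.0.1:5556")

    sub = ctx.socket(zmq.SUB)
    sub.connect("tcp://127.0.0.1:5556")
    sub.setsockopt(zmq.SUBSCRIBE, b"sensors/")  # a byte prefix, nothing more

    time.sleep(0.2)  # let the subscription propagate (the "slow joiner" problem)

    pub.send(b"sensors/temp 23.4")   # the leading bytes act as the key
    pub.send(b"log/boot ok")         # dropped: doesn't match the prefix
    print(sub.recv())                # -> b'sensors/temp 23.4'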
LevelDB is sort of a modern replacement for ISAM, built on log-structured merge trees. It stores a set of bytestring keys, each associated with a bytestring value, which may be empty. It provides efficient batched insertions, updates, and deletes, which vanilla ISAM can’t, as well as efficient in-order traversal by key.
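With the plyvel Python bindings (the path and keys are invented for illustration), batched writes and in-order prefix scans look like this:

    import plyvel

    db = plyvel.DB('/tmp/example-db', create_if_missing=True)

    # The whole batch is applied atomically, amortizing the write overhead.
    with db.write_batch() as batch:
        for i in range(1000):
            batch.put(b'user/%04d' % i, b'some value')

    # In-order traversal of every key under a prefix.
    for key, value in db.iterator(prefix=b'user/'):
        pass  # keys arrive in sorted order

    db.close()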
Both LevelDB and ØMQ are one to two orders of magnitude more efficient than the more elaborate traditional systems they can replace: ØMQ can route two or three million messages per second on my laptop, while brokers like OpenMQ manage around a hundred thousand; LevelDB can insert about 14,000 to 300,000 records per second on an SSD, while Postgres manages about 3000.
So it occurs to me that it might be interesting to build a “tuple-space” system which is really a “bytestring space”. Workers would attempt to “in” or “rd” keys with a given prefix, and if successful might “out” others. The bytestring space might be persisted to disk or purely in RAM, and it might be hosted on a single server, sharded across servers, or even replicated across servers. If the ins and outs are transactional, it might even be possible to make it fault-tolerant.
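Here’s a minimal in-process model of what I mean (my sketch, nothing more): keys live in a sorted list, as they would in an LSM-tree, and “in”/“rd” take the first key at or after a given prefix, blocking if there is none yet.

    import bisect
    import threading

    class ByteSpace:
        def __init__(self):
            self.keys = []        # kept sorted, LSM-tree style
            self.values = {}
            self.cond = threading.Condition()

        def out(self, key, value=b''):
            with self.cond:
                if key not in self.values:
                    bisect.insort(self.keys, key)
                self.values[key] = value
                self.cond.notify_all()

        def _find(self, prefix):
            i = bisect.bisect_left(self.keys, prefix)
            if i < len(self.keys) and self.keys[i].startswith(prefix):
                return self.keys[i]

        def rd(self, prefix):     # blocking, non-destructive
            with self.cond:
                while (key := self._find(prefix)) is None:
                    self.cond.wait()
                return key, self.values[key]

        def in_(self, prefix):    # blocking, removes the pair
            with self.cond:
                while (key := self._find(prefix)) is None:
                    self.cond.wait()
                self.keys.remove(key)
                return key, self.values.pop(key)

A worker then loops on key, value = space.in_(b'task/'), does its work, and “out”s results under other prefixes; making that in and out a single atomic step is what the transactional variant would add, and sharding falls out naturally from partitioning the key space.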
Zooming down to the other end of the computational scale, there are truly astonishing amounts of computational power available in tiny, cheap microcontrollers at this point (various 48-MIPS Cortex-M models from NXP, ST, Atmel, and others cost under US$1), and they use a tiny amount of power: in theory an STM32L011x3/4, similar in computational power to a Sun-3 workstation from the 1980s, should be able to run at 16 MIPS for a week on a CR2032 coin cell.
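As a back-of-the-envelope check (my numbers, hedged, from memory of the datasheets): a CR2032 holds roughly 225 mAh, and the STM32L0 parts draw something on the order of 100 µA/MHz running from Flash, so 16 MHz costs about 1.6 mA, and 225 mAh ÷ 1.6 mA ≈ 140 hours, about six days of continuous full-speed execution; any use of the sleep modes stretches that further.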
But it’s difficult to get them to do anything complex because they have very little memory: you might have 4K to 32K of RAM and a somewhat larger amount of Flash. If you can decompose a system into pieces that fit into that RAM, communicate via message-passing, and can manage with somewhat sequential access to the messages, you can do decent computations on these things. The problem is somewhat similar to the one Unix solved with pipes on the PDP-11.
So, in particular, I was thinking that you could hook up external nonvolatile storage, such as a Flash chip, holding a “tuple space”, and have a set of “actors”, each waiting on one or more prefixes. You load a single “actor” into RAM and feed it items from the space until it’s blocked on a prefix that has no existing items; then you context-switch to a different “actor” and repeat the process, hopefully keeping the total number of context switches low enough that you spend most of your time running actors instead of context-switching.
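Here’s a host-side model of that scheduling policy (plain Python, nothing microcontroller-specific; the doubler actor, its prefixes, and in_nowait are all invented for illustration): each actor is a generator that yields the prefix it wants next, and the scheduler keeps feeding it until that prefix runs dry.

    import bisect

    class Space:
        # A non-blocking bytestring space, as sketched above.
        def __init__(self):
            self.keys, self.values = [], {}

        def out(self, key, value=b''):
            if key not in self.values:
                bisect.insort(self.keys, key)
            self.values[key] = value

        def in_nowait(self, prefix):
            # Like "in", but returns None instead of blocking: here,
            # "blocked" means "time to swap in another actor".
            i = bisect.bisect_left(self.keys, prefix)
            if i < len(self.keys) and self.keys[i].startswith(prefix):
                key = self.keys.pop(i)
                return key, self.values.pop(key)

    def doubler(space):
        # An "actor": yields the prefix it waits on, receives (key, value).
        while True:
            key, value = yield b'task/'
            space.out(b'result/' + key[len(b'task/'):],
                      b'%d' % (int(value) * 2))

    def run(space, actor_fns):
        actors = [fn(space) for fn in actor_fns]
        wanted = [a.send(None) for a in actors]   # prime each generator
        progress = True
        while progress:                           # stop when everyone is blocked
            progress = False
            for i, actor in enumerate(actors):    # one "context switch" per actor
                # Feed this actor until its prefix has no items left.
                while (item := space.in_nowait(wanted[i])) is not None:
                    wanted[i] = actor.send(item)
                    progress = True

    space = Space()
    for n in range(4):
        space.out(b'task/%d' % n, b'%d' % n)
    run(space, [doubler])
    print(space.keys)   # -> [b'result/0', b'result/1', b'result/2', b'result/3']

Whether this wins on real hardware depends entirely on locality: each pass over the actors costs a reload from the external Flash per actor, so items under a given prefix need to arrive in bursts for the scheme to pay off.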