Trying to imagine a future of the programming language ecosystem

This text is also available to read here.

The reader may treat the following discussion as technical fantasy fiction, an intellectual exercise, a thought experiment. We assume the reader has some familiarity with programming language theory, design and implementation.

We are trying to imagine a programming system that is syntax-, semantics-, runtime- and platform-agnostic. Such a system would accept programs written in multiple syntaxes and semantics, for multiple platforms and runtimes; it would also allow, up to certain limits, changes to be made in all these aspects, at compile time as well as run time, wherever applicable. The changes could be made programmatically, across the whole stack, from syntax to runtime, in ways that aren’t easily achievable through the traditional means of metaprogramming. Thus, the system would allow syntax, semantics and runtime to be treated as first-class objects, providing tools and mechanisms to operate on them.
With such a programming system available as basic infrastructure, it would be possible to realize the following (this list is not exhaustive):

  • Programmers may choose their own flavor of syntax to view and edit the code, similar to UI skins and themes now. A single project may be using multiple flavors of syntax and semantics for different purposes. In the manner of language-oriented programming, programmers may continuously (in a topological sense) adapt the language to fit the problem domain better, or deal with the complexities more effectively.[^1]
  • When dealing with semantics in a first-class manner, and making changes such that some existing code couldn’t be automatically updated to the new semantics, the system would provide ways to fix that code interactively; this is similar to, or rather a narrow application of, interactive theorem proving. Such a system would make it easier to incrementally rewrite large legacy codebases to benefit from innovations in programming methodology (e.g. borrow checking, capability-based security). It’d also allow the whole ecosystem to keep evolving little by little along with the problem domain and society at large, the cost paid in small efforts amortized over time, instead of bearing the mounting maintenance cost and, when it can no longer be borne, the cost of a rewrite.
  • Not only would it be possible to make significant changes to the runtime at run time, but also to swap out the entire runtime, since schemas representing the entire program state runtime-agnostically would let this be carried out reliably.
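
The runtime-swap idea in the last bullet can be sketched in Python; all class and field names below are invented for illustration, and real runtime state is of course vastly richer than this. The point is that state captured as plain, schema-keyed data (rather than live objects) can be resumed by an entirely different runtime.

```python
# A toy of the runtime-swap idea: program state captured against a schema,
# in a runtime-neutral form, then restored by a different "runtime".
import json

STATE_SCHEMA = ["counter", "pending"]  # the fields a conforming state carries

class RuntimeA:
    def __init__(self):
        self.counter, self.pending = 0, []
    def step(self):
        self.counter += 1
        self.pending.append(self.counter)
    def snapshot(self) -> str:
        # runtime-agnostic: plain data keyed by the schema, not live objects
        return json.dumps({k: getattr(self, k) for k in STATE_SCHEMA})

class RuntimeB:
    """A different runtime that can resume from the same schema'd state."""
    def __init__(self, snapshot: str):
        state = json.loads(snapshot)
        for k in STATE_SCHEMA:
            setattr(self, k, state[k])

a = RuntimeA()
a.step(); a.step()
b = RuntimeB(a.snapshot())   # the "swap": state survives, the runtime changes
print(b.counter, b.pending)  # 2 [1, 2]
```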

[^1]: The seeds of this idea can be found in Common Lisp’s reader macros or what can be done with the low-level primitives SOURCE, REFILL and >IN in Forth. Racket and, to some extent, Guile are among the implementations that are now at the frontier of realizing this idea.

Now we’ll give some examples to better explain the idea that has been presented.

Consider the current situation of code styling: casing of identifiers (camelCase, PascalCase, snake_case, kebab-case and so on), spaces (and how many) vs. tabs, whether braces go on the same line or the next, spacing around operators and punctuation, and so forth. In the absence of stipulated standards or norms, these are matters of personal taste and preference. The system would allow the programmer to treat the entire syntax this way. Two inklings of this being realized: with the partial style-insensitivity of identifiers in Nim, while camelCase is the norm, identifiers may also be accessed using snake_case; and the uniform function call syntax in D, Nim and others allows a function call f(a, b, c) to be written as a.f(b, c). In general, the state of the art in this respect is transpilation, as seen extensively with the many languages targeting ECMAScript, Lua and C. The compiled languages targeting LLVM or other such IRs demonstrate another way.
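
A minimal sketch of this “syntax as a skin” idea in Python, with all names hypothetical: the program is stored as a tiny AST, and two renderers project the same call into two surface styles, echoing Nim’s identifier flexibility and D-style uniform function call syntax.

```python
# A toy AST node for a function call, rendered in two surface syntaxes.
# The stored program is the same; only the projection differs.
from dataclasses import dataclass

@dataclass
class Call:
    func: str   # canonical identifier, stored in snake_case
    args: list  # argument names, for simplicity

def to_camel(name: str) -> str:
    head, *rest = name.split("_")
    return head + "".join(w.capitalize() for w in rest)

def render_classic(call: Call) -> str:
    """f(a, b, c) style, snake_case identifiers."""
    return f"{call.func}({', '.join(call.args)})"

def render_ufcs(call: Call) -> str:
    """a.f(b, c) style (uniform function call syntax), camelCase."""
    recv, *rest = call.args
    return f"{recv}.{to_camel(call.func)}({', '.join(rest)})"

call = Call("read_line", ["stream", "buf", "limit"])
print(render_classic(call))  # read_line(stream, buf, limit)
print(render_ufcs(call))     # stream.readLine(buf, limit)
```

Each programmer would pick a renderer the way one picks an editor theme today; the canonical form underneath never changes.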

Then consider the recent proliferation of configuration languages. Through the lens of the idea here, we can foresee more such syntaxes being designed. Our focus would be on the schema: the meaning of the data represented through such configuration languages. If a proper schema is in place, it doesn’t really matter which syntax is used to represent it, for there are infinitely many ways to represent the same schema.
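
To make the schema-over-syntax point concrete, here is a small Python sketch (the schema and field names are made up): one toy schema gives the data its meaning, and two renderers project the same validated data into different concrete notations.

```python
# One schema, two concrete syntaxes. The schema carries the meaning;
# each renderer is just a projection into a surface notation.
import json

schema = {"host": str, "port": int, "debug": bool}  # toy schema: name -> type

def validate(config: dict) -> dict:
    for key, typ in schema.items():
        assert isinstance(config[key], typ), f"{key} must be {typ.__name__}"
    return config

config = validate({"host": "localhost", "port": 8080, "debug": False})

def render_json(cfg: dict) -> str:
    return json.dumps(cfg, indent=2)

def render_ini(cfg: dict) -> str:
    # an INI-flavored projection of the very same data
    return "\n".join(f"{k} = {str(v).lower() if isinstance(v, bool) else v}"
                     for k, v in cfg.items())

print(render_json(config))
print(render_ini(config))
```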

When dealing with schemas (creating them, updating them, or interoperating with other schemas[^2]), having a schema of schemas, a meta-schema, would help; at that point it would be better described as an ontology. Then, as things progress, there will be multiple ontologies out there, and to deal with them properly, upper or foundational ontologies would be needed, and so on. The analogy we can use here is type stratification: the type of values are 1-types, the type of 1-types are 2-types and so on. Or, over in category theory, (small?) categories can be treated as objects and functors between such categories as morphisms, building a 2-category, and so on.
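
The stratification analogy can be sketched in a few lines of Python (a deliberately naive encoding, with invented names): data is checked against a schema, and the schema itself is checked against a meta-schema, the “schema of schemas”.

```python
# Stratification sketch: one check at the object level (data vs. schema)
# and one at the meta level (schema vs. meta-schema).
ALLOWED_TYPES = {"string", "integer", "boolean"}  # the toy meta-schema

def check_schema(schema: dict) -> None:
    """Meta level: is this a well-formed schema at all?"""
    for field, typ in schema.items():
        assert isinstance(field, str) and typ in ALLOWED_TYPES

def check_value(value, typ: str) -> bool:
    return {"string": str, "integer": int, "boolean": bool}[typ] is type(value)

def check_data(data: dict, schema: dict) -> None:
    """Object level: does this data conform to the schema?"""
    check_schema(schema)  # the 2-level check precedes the 1-level one
    for field, typ in schema.items():
        assert check_value(data[field], typ)

check_data({"name": "ada", "age": 36}, {"name": "string", "age": "integer"})
```

One can keep going: a meta-meta-schema constraining ALLOWED_TYPES itself, and so on up the tower, which is where the ontology vocabulary starts to fit better.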

[^2]: There should already be some work done with XML schemas in this regard. There’s Cambria for JSON schemas.

So, for the system we imagine to be agnostic in the manners described above, we need to agree on the meaning of programs, hence the need for semantic schemas and programming ontologies. As discussed in the previous paragraph, schemas and ontologies are just the beginning; the system would evolve further, attaining higher and higher levels of abstraction, requiring theories to be developed to properly accommodate such evolution. What such higher-level theories would look like is left to the reader to ponder.

Back to syntax: with the system treating it in a first-class manner, we might see reusable syntax libraries, allowing features to be imported on demand; the enabling of syntax could be performed in arbitrarily-nested, lexically-scoped enclosures, perhaps overriding syntactical elements and temporarily repurposing them. Similarly, semantics would benefit from both lexical and dynamic scoping (see Common Lisp for more about this concept). For example: choosing to temporarily use saturation arithmetic instead of wrap-around on overflow[^3]; rounding modes in the floating-point environment; stricter semantics for cryptographic libraries; and so on. As for that legendary commit message in which its author vents about the dismal state of locale handling in C: would treating locales through semantics help?
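
The overflow example can be approximated today with dynamic scoping in miniature; the following Python sketch (all names invented) swaps the arithmetic semantics inside a `with` block and restores the previous semantics on exit, much like Common Lisp’s special variables.

```python
# Dynamically scoped arithmetic semantics: within a `with` block, addition
# saturates at a bound instead of wrapping around.
from contextlib import contextmanager

_semantics = {"overflow": "wrap"}  # the dynamic "environment", Lisp-style

@contextmanager
def overflow(mode: str):
    prev = _semantics["overflow"]
    _semantics["overflow"] = mode
    try:
        yield
    finally:
        _semantics["overflow"] = prev  # restore on scope exit

MAX = 255  # pretend we are doing 8-bit arithmetic

def add(a: int, b: int) -> int:
    total = a + b
    if _semantics["overflow"] == "saturate":
        return min(total, MAX)
    return total % (MAX + 1)  # wrap-around

print(add(250, 10))          # 4   (wraps modulo 256)
with overflow("saturate"):
    print(add(250, 10))      # 255 (clamps at the bound)
print(add(250, 10))          # 4   (previous semantics restored)
```

In the envisioned system this switch would apply to the built-in operators themselves, not to a helper function, which is exactly what operator overloading struggles to express elegantly (see the footnote).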

[^3]: Sure, it can be done with operator overloading, but it would be quite inelegant; it is instructive to imagine how nesting (like an expression containing a further string interpolation inside a string interpolation, but with semantics instead of strings) would look.

Today, the same algorithms need to be reimplemented, again and again, for the standard libraries of newly developed programming languages. With the system envisioned, we would save much time and effort: implement an algorithm once and it becomes easily available to all programs, whatever their syntax and semantics[^4], much as LSP, to some degree, turned its situation from O(mn) into O(m + n).

[^4]: Where the difference in semantics is too big, the cost of an ‘adapter’ between the two semantics would have to be paid; and where that cost can’t be paid, the portions where it is too high would have to be rewritten. But then the work is limited to the specific semantics, instead of being repeated across so many languages.
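
A tiny Python illustration of the adapter cost mentioned in the footnote (the two “semantics” and their representations are invented for the example): the algorithm is written once against a canonical representation, and each diverging semantics pays only a thin translation at the boundary.

```python
# "Implement once" sketch: one shared algorithm, adapters at the boundary.
def total_cents(amounts_in_cents):
    """Shared algorithm, written once against a canonical representation."""
    return sum(amounts_in_cents)

# Semantics A already stores money as integer cents: nothing to adapt.
semantics_a = [199, 250, 51]

# Semantics B stores money as decimal strings: an adapter pays the
# translation cost once, at the boundary, not inside the algorithm.
def adapt_from_b(amounts):
    return [round(float(s) * 100) for s in amounts]

semantics_b = ["1.99", "2.50", "0.51"]

print(total_cents(semantics_a))                # 500
print(total_cents(adapt_from_b(semantics_b)))  # 500
```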

In the preliminary discussion about this idea, we came across some common responses that we find worth elaborating on:

(the infeasibility of) a single universal system (or language, or schema)

Clearly, this didn’t work for natural languages (Esperanto and so on), and we understand it won’t work for programming languages either. To continue the analogy with natural languages, we should note how, when a significant amount of human interaction is possible (i.e. there is no natural barrier as a hindrance), we don’t see sudden jumps between languages; they form a spectrum, transitioning smoothly from one to another. Consider also the formation of pidgins and creoles out of the need to communicate. However many words there are, there usually[^5] aren’t that many concepts to be conveyed, which allows cross-cultural exchange. When we encounter new ideas or concepts, and don’t bother coining new words or phrases for them[^6], we tend to borrow directly from those who already have them. We want to imagine the same kind of dynamism in programming languages and systems. Thus we would have multiple syntaxes, semantics, runtimes, platforms, even systems, each adapted to its respective purposes or domains, while allowing interaction, collaboration and interoperation among them. Standardization wouldn’t help: it would be either too strict and limiting, or too loose and vague. So let systems be developed, and when the need to interoperate arises, let there be agreement on some aspects and divergence in others, like the bridges or adapters between protocols now, but more fundamental in nature. When systems aren’t as rigid as protocols, each of them can evolve and converge in a certain direction, incorporating the best aspects of the others, to the benefit of all.

[^5]: As is apparent from the study of etymology. We aren’t considering the case of artists and thinkers who delve past the frontier of representation or imagination; whether taking advantage of the inherent ambiguity of languages, using literary devices, or coining new words and phrases, they don’t have it easy.

[^6]: We are lamenting the naming of ideas and concepts after people, especially in mathematics. The situation has slightly improved, with people trying to create meaningful names.

intermediate representations or existing platforms (JVM, CLR, BEAM, WebAssembly etc.)

Targeting them causes loss of information about the program. The system needs to preserve all information, losslessly, to achieve what it purports to. Initial implementations of the idea would rely on and target these, but only as a means to bootstrap; eventually we would incorporate (the innards of) all of them, and whatever might be developed in the future as well.

extreme difficulty, requiring enormous amount of effort to realize, or impossible to realize to its fullest generality

Thus the initial disclaimer. We are documenting the idea for posterity. Whether and how it can be realized, we leave to the reader’s consideration. In certain respects, we are trying to contend with the Real, with the implied inherent futility of such an effort. How meaningful this effort would be is, again, left to the reader to think through.

programming languages are tools, instead of one tool trying to do everything, there should be multiple tools suited for different purposes

While we don’t disagree with the intent being conveyed, we find the metaphor somewhat inappropriate. Can languages, cultures, or even civilizations be considered tools? What kind of divine perspective would that require?

Languages are not static entities. As cultures evolve to find richer perspectives, their languages evolve along with them. The system would need to be quite flexible and malleable to deal with this kind of dynamism.

Languages are the mediums we use to convey intent. They shape not only our identity but our worldview as well. They form cultures and communities; thus we feel so strongly about the languages we use. In the idea presented here, we try to move above and beyond the realm of individual languages, to avoid the issues that arise at that level and focus on deeper ones instead. To put it another way, we are considering the entire infrastructure of such toolmaking. Instead of the current state of rigid, inflexible toolmaking, we want to consider the realizability of malleability and conviviality in these tools, from the perspective of end-programmers as well as of the developers of these systems themselves. Forths and Lisps are among the languages that quintessentially embody this ideal. How to bring these benefits to all, and how to further develop the philosophy these languages embody, is what we have considered so far.

We greatly appreciate the readers who have endured reading until the end, taken the time to think through various aspects of the idea presented here, and would join us in refining and developing it further. We look forward to thoughtful, in-depth and critical discussion in this regard.
