RandomThoughts

December 7, 2021

Some thoughts on Haskell and purity.

A friend of mine introduced me to CPS a few weeks ago and this got me interested in Andrew Appel’s Modern Compiler Implementation in ML that implements a compiler for Tiger programming language.

The chapter on Static-single assignment or SSA form was instructive because it offered an alternative perspective on why purity in functional languages matter. SSA is a property of an intermediate representation, that requires that each variable be assigned only once, and be defined before it is used. Isn’t this by definition, a purely functional program? As it turns out, in the case of Haskell, purity has the un-intended consequence of optimization for the code generated by the Haskell compiler. Purity has obvious advantages for a programmer, though it is nice to see that a feature has benefits all along the language pipeline: front-end, abstract syntax tree, intermediate representation and code generation.

Haskell’s purity also, in my opinion, helps in implementing a multi-core runtime and address the M by N concurrency problem. At least one of the consequences is that since the Haskell runtime can schedule threads across various cores, the incidents when a volatile variable has to be refreshed across multiple cores can reduced.Eric Etheridge’s write-up is an excellent resource on the advantages of Haskell for a practical C programmer. Quoting a relevant blurb from that reference, specially about volatile variables how they get used:

C programmers are supposed to declare variables as volatile if they have this behavior. To correctly handle those cases, your compiler has to give up most of its hope of knowing how variables will be updated during execution.

Language war alert

Haskell’s purity is unique among functional programming languages and OCAML, SML and FSharp for example still support mutable refs.

  let nextVal : int ref = ref 0

OCAML had realized this limitation a long while ago and has refused to support multicore, concurrent garbage collector in its core implementation, although, multicore OCAML is in the works.

Some notes on STM in COQ.

COQ, a proof assistant, can be considered to be one of the best software programs written to date. COQ is implemented in OCAML and the reason to bring that up in this discussion is that the implementation is switching over to Software Transactional Memory (similar to hardware transactional memory) for its internal data structures: implementation here. The following snippet highlights the issues that Haskell’s runtime, specifically GHC’ runtime, doesn’t have to deal with, especially the need to deal with mutable variables.

let async_proofs_workers_extra_env = ref [||]

type aast = {
  verbose : bool;
  indentation : int;
  strlen : int;
  mutable expr : vernac_control; (* mutable: Proof using hinted by aux file *)
}

In imperative programs, or programs with mutable variables, the size of the lock is usually either too large or too small; too large a lock can slow an application on a multi-core architecture whereas a smaller lock can create those nasty race conditions that magically disappear when one tries to printf them. Haskell’s STM implementation helps address some of the issues in implementing concurrent programs.

Can we hope for a unified development environment that brings Haskell, COQ together wrapped inside a UI?

An ideal development environment would be a mix of COQ, Haskell and Smalltalk so that I can develop programs that are robust while still enjoying the process, though, I am not sure if that will happen anytime soon. Each programming language has its own build system, Haskell with stack, shake and cabal; OCAML with dune makes it hard to switch programming languages. Speaking of differences in terminologies, the term functor has been used to mean entirely different things be it in C++, OCAML, Haskell and Prolog. Of course, in my opinion, Haskell’s use of functors is closer to term from category theory, therefore a bit more correct.