# Notes on Coq, and the Coq/API: This document contains some random and incomplete information we gathered on while working with Coq's API, as well as some constructive ;) comments and criticisms (marked with _fixme?_). It is our view and it doesn't represent the views of the Coq Development team. For the purposes of this commentary, we use a special-purpose branch https://github.com/ejgallego/coq/tree/ejga-commentary so links to code are stable. ## STM overview Coq 8.5 has a new document model API called STM, which allows the consumer to build Coq documents as a DAG. This enables much finer control over Coq's processing of a document and makes writing alternative frontends quite easier. The current version is functional, but still not stable nor complete. Also, some _technical debt_ and a fair amount of compatibility code seems to be present; the Coq developers expect this kind of code to progressively go away. More details can be found in the ITP 2015 paper https://hal.inria.fr/hal-01135919 ### STM Operation Overview The basic idea of the STM API is to represent a Coq document as a Directed Acyclic Graph, where edges are intended to represent the dependency of each part of the document on previous parts, enabling asynchronous/parallel processing. The basic graph building operation is to `add` a statement to the document; Coq will try to parse it, and if successful, a new node and edges will be created by Coq representing the path from the root of the three to that particular state. Every new node is assigned it an unique _state id_. Things are interesting sometimes: the core of Coq is very stateful and some statements may produce _interesting_ side effects. Document building is separated from document processing; we must tell Coq that we want it to process the document up to a certain node, this operation is named `observe`, and will make Coq try to process the document up to the given _state id_. Using pseudocode, a typical STM workflow of adding and processing one new statement would be: ``` cur_state := Stm.init () while(new_input) { old_state := cur_state; try { cur_state := Stm.add(cur_state, new_input) try { Stm.observe(cur_state); } catch Error (..) { -- go back to the old state cur_state := old_state; -- Tell the STM to we want to edit a node Stm.edit_at(cur_state); } catch ParsingError(..) { -- Notify the user of a parse error. } } ``` The last important piece of the API is the `edit_at` function: it tells the STM that we are editing the tree from a particular node, so all the successors become invalidated. We have used `try ... catch ...` blocks to capture errors, but true Coq error handling is more subtle, as Coq may both generate an exception and call a hook too in order to report errors, command completion, etc... CoqIDE has a debug mode for showing the current document tree, it can be invoked with: ``` $ coqide -debug ``` If dot (graphviz) is installed Coq generates `.pdf` file(s) in `/tmp/` every time a sentence added/removed to/from the document. ### Basic Datatypes The basic datatypes involved are: * [`Stateid.t`](https://github.com/ejgallego/coq/blob/ejga-commentary/lib/stateid.mli) Abstract (private or opaque) type of state denoting a node in the document. This is the main type over which the `STM` works. * [`edit_id`](https://github.com/ejgallego/coq/blob/ejga-commentary/lib/feedback.mli#L30-33) A user-controller integer to identify pre-parsed statements. `edit_id`'s are used for the cases the STM has not generated yet a state id (i.e: parsing errors). * _fixme?_: There also exists an `edit_or_state_id` type, but it will be removed at some point. * [`feedback`](https://github.com/ejgallego/coq/blob/ejga-commentary/lib/feedback.mli#L37-63) This type represents STM feedback. Currently, it includes very varied information. _fixme?_: Related to the previous fixme, it would be great to specialize the feedback type into more granular cases ensuring greater type safety and allowing the removal of the `edit_or_state_id` type. * [`message`](https://github.com/ejgallego/coq/blob/ejga-commentary/lib/feedback.mli#L12-22) _fixme?_: This is the legacy feedback type, which doesn't include information about the particular node of the document it relates to. It is deprecated as should be gradually removed/encapsulated into a compat layer. ### Document Building * [`add`](https://github.com/ejgallego/coq/blob/ejga-commentary/stm/stm.mli#L22-24) `add` builds a document by appending a new node. ```ocaml (* [add ontop check vebose eid s] *) val add : ontop:Stateid.t -> ?newtip:Stateid.t -> ?check:(located_vernac_expr -> unit) -> bool -> edit_id -> string -> Stateid.t * [ `NewTip | `Unfocus of Stateid.t ] ``` It basically parses the input text and returns the new document id. If parsing fails, it will produce an exception and send feedback with the specified `edit_id`. `add` allows modular editing of the document. In the common case, it will return `NewTip`, as the newly added node is a leaf for the tree, however, if editing a document sub-part it will return `Unfocus` when the STM thinks that the sub-document has been closed (usually at QED time). See the `edit_at` and `observe` documentation for more information. * [`edit_at`](https://github.com/ejgallego/coq/blob/ejga-commentary/stm/stm.mli#L40) ```ocaml val edit_at : Stateid.t -> [ `NewTip | `Focus of focus ] ``` The STM document model is stateful due to certain Coq commands having a global effect; thus, node editing ─ like in the case of an error or a user editing the document ─ must be globally signaled with `edit_at`. The argument of `edit_at` is the last known-good state we want to preserve. I will either return `NewTip`, signaling that new adds after the edit will become the new tip of the document, or `Focus` if the edit is "local" to the document, meaning that only depending parts of the graph will be discarded. * [`observe`](https://github.com/ejgallego/coq/blob/ejga-commentary/stm/stm.mli#L45) ```ocaml val observe : Stateid.t -> unit ``` `add` and `edit_at` build the document, `observe` actually processes the document up to the specified `Stateid.t`. A good question is whether the processing is asynchronous or not. The general case is complicated due to Coq having too many commands, but a good rule of thumb is that if you do several `add`s followed by one observe, Coq will try to process the adds in parallel, whereas if the API user does add/observe in sync no parallel processing will take place. In order to enable async processing in CoqIDE, the _go to current point_ command can be used; then, CoqIDE will do a bunch of adds followed by a single observe. See [`stm.ml:async_policy`](https://github.com/ejgallego/coq/blob/ejga-commentary/stm/stm.ml#L1619-1625) for more information. * The STM API includes some more self-explaining functions, such as: ```ocaml val init : unit -> unit val get_current_state : unit -> Stateid.t ``` ### Feedback The main instrument the STM uses to communicate results of Coq processing is the `feeeback` mechanism. Feedback tries to unify all kinds of reporting, from debug and informative messages to error handling. Previously, they were separated in Coq, with "informative" output taken care of by printing functions and error handling mainly done via exceptions. As noted in the coq sources, the current implementation seems a bit in flux due to compatibility constraints. Feedback related operations are split among three modules: * [`Feedback`](https://github.com/ejgallego/coq/blob/ejga-commentary/lib/feedback.mli) Datatype definition and serialization occur here. * [`Pp`](https://github.com/ejgallego/coq/blob/ejga-commentary/lib/pp.mli#L164) Surprisingly, the main functions used by Coq to generate feedback are located here for legacy reasons. The first part of such functions is composed by the old-style `msg_*` functions, used to output messages to `stdout/stderr`. They still do, unless `log_via_feedback()` is called, which will pipe them through a feedback message. The main feedback producer function ─ that is, the way Coq internals generate feedback ─ is: [`feedback`](https://github.com/ejgallego/coq/blob/ejga-commentary/lib/pp.mli#L175-178) ```ocaml val feedback : ?state_id:Feedback.state_id -> ?edit_id:Feedback.edit_id -> ?route:Feedback.route_id -> Feedback.feedback_content -> unit ``` which produces the corresponding feedback. _fixme_: There is a precondition on `feedback` imposing that only one of `state_id` or `edit_id` can be set at call time, this should be fixed soon. Feedback consumers may register with: ```ocaml val set_feeder : (Feedback.feedback -> unit) -> unit ``` It seems that only one consumer can be registered. * [`Stm`](https://github.com/ejgallego/coq/blob/ejga-commentary/stm/stm.mli#L97) Aside from the general feedback mechanism, `Stm` provides finer-grained `Hooks` for some crucial events: ```ocaml val state_computed_hook : (Stateid.t -> in_cache:bool -> unit) Hook.t val parse_error_hook : (Feedback.edit_or_state_id -> Loc.t -> Pp.std_ppcmds -> unit) Hook.t val execution_error_hook : (Stateid.t -> Loc.t -> Pp.std_ppcmds -> unit) Hook.t val unreachable_state_hook : (Stateid.t -> unit) Hook.t val state_ready_hook : (Stateid.t -> unit) Hook.t ``` By default, this functions will just call `feedback` with an appropriate constructor (usually `MsgError`) but ideally their implementation would be made mandatory by the API users and, as noted before, adapted to a specialized `feedback` type. ### More on Feedback A possibility for the future of async STM feedback in Coq could be to have a "dispatch" module where users may register for particular events. Event generation should replace all the `msg_*` functions in `Pp`. Several dispatch modules should be allowed to coexist, and they they should be thread local, but that is another issue. ### Error handling. Due to legacy constraints, error handling and reporting currently has some overlap between the new feedback mechanism and Ocaml exceptions. This is not a big problem, but care must be taken when interacting with the STM API. The approach CoqIDE takes is to wrap all the exceptions, see [`ide_slave.ml`](https://github.com/ejgallego/coq/blob/ejga-commentary/ide/ide_slave.ml#L327) The error hooks in `Feedback` are set by default to send `ErrorMsg` feedback messages. Some of exceptions an STM user is allowed to/must catch are: ```ocaml [in errors.mli] exception UserError of string * std_ppcmds exception AlreadyDeclared of std_ppcmds [in proofs.ml etc...] exception NoSuchProof exception NoCurrentProof exception FailedBullet of t * suggestion ``` ## Settable Vernacular Options Vernacular Options (as `Set Printing All`) are defined and accessed using the [`Goptions`](https://github.com/ejgallego/coq/blob/ejga-commentary/library/goptions.mli) module. An option declared through this interface can perform any arbitrary global action. This seems quite inconvenient and IMVHO should be addressed. Strangely enough, it seems that there is no easy way in the API to query what the status of an option is. ## Other (minor) comments ### VM static initializers `vm.ml` does a lot of static initialization. This is not good for a wide variety of reasons, and very inconvenient for jscoq. ### Declare ML Module `Declare ML Module` is unqualified, which is a bit inconvenient for applications like jscoq that have to manage their own fs layout.