[flang] document C++17 feature usage
Original-commit: flang-compiler/f18@40d84ca20f Reviewed-on: https://github.com/flang-compiler/f18/pull/229 Tree-same-pre-rewrite: false
This commit is contained in:
parent
abe5168bdd
commit
d1cc618f2c
|
@ -0,0 +1,133 @@
|
|||
<!--
|
||||
Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
|
||||
-->
|
||||
|
||||
## C++14/17 features used in f18
|
||||
|
||||
The C++ dialect used in this project is a subset of the
|
||||
standard C++ programming language and library features.
|
||||
We want it to be compatible with the LLVM C++ language
|
||||
subset in use at the time that we integrate with that
|
||||
project.
|
||||
We also want to maximize portability, future-proofing,
|
||||
compile-time error checking, and use of best practices.
|
||||
|
||||
To that end, we have a C++ style guide (q.v.) that lays
|
||||
out the details of how our C++ code should look and gives
|
||||
guidance about feature usage.
|
||||
|
||||
We have chosen to use some features of the recent C++17
|
||||
language standard in f18.
|
||||
The most important of these are:
|
||||
* sum types (discriminated unions) in the form of `std::variant`
|
||||
* product types in the form of `std::tuple`
|
||||
* `std::optional`
|
||||
|
||||
### Sum types
|
||||
|
||||
Fortran is notoriously problematic to lex and parse, as tokenization
|
||||
depends on the state of the partial parse;
|
||||
the language has no reserved words in the sense that C++ does.
|
||||
Fortran parsers implemented with distinct lexing and parsing phases
|
||||
(generated by hand or with tools) need to implement them as
|
||||
coroutines with complicated state, and experience has shown that
|
||||
it's hard to get right and harder to extend as the language evolves.
|
||||
|
||||
Alternatively, with the use of backtracking, one can parse Fortran with
|
||||
a unified lexer/parser.
|
||||
We have chosen to do so because it is simpler and should reduce
|
||||
both initial bugs and long-term maintenance.
|
||||
|
||||
Specifically, f18's parser uses the technique of recursive descent with
|
||||
backtracking.
|
||||
It is constructed as the incremental composition of pure parsing functions
|
||||
that each, when given a context (location in the input stream plus some state),
|
||||
either _succeed_ or _fail_ to recognize some piece of Fortran.
|
||||
On success, they return a new state and some semantic value, and this is
|
||||
usually an instance of a C++ `struct` type that encodes the semantic
|
||||
content of a production in the Fortran grammar.
|
||||
|
||||
This technique allows us to specify both the Fortran grammar and the
|
||||
representation of successfully parsed programs with C++ code
|
||||
whose functions and data structures correspond closely to the productions
|
||||
of Fortran.
|
||||
|
||||
The specification of Fortran uses a form of BNF with alternatives,
|
||||
optional elements, sequences, and lists. Each of these constructs
|
||||
in the Fortran grammar maps directly in the f18 parser to both
|
||||
the means of combining other parsers as alternatives, &c., and to
|
||||
the declarations of the parse tree data structures that represent
|
||||
the results of successful parses.
|
||||
Move semantics are used in the parsing functions to acquire and
|
||||
combine the results of sub-parses into the result of a larger
|
||||
parse.
|
||||
|
||||
To represent nodes in the Fortran parse tree, we need a means of
|
||||
handling sum types for productions that have multiple alternatives.
|
||||
The bounded polymorphism supplied by the C++17 `std::variant` fits
|
||||
those needs exactly.
|
||||
For example, production R502 in Fortran defines the top-level
|
||||
program unit of Fortran as being a function, subroutine, module, &c.
|
||||
The `struct ProgramUnit` in the f18 parse tree header file
|
||||
represents each program unit with a member that is a `std::variant`
|
||||
over the six possibilities.
|
||||
Similarly, the parser for that type in the f18 grammar has six alternatives,
|
||||
each of which constructs an instance of `ProgramUnit` upon the result of
|
||||
parsing a `Module`, `FunctionSubprogram`, and so on.
|
||||
|
||||
Code that performs semantic analysis on the result of a successful
|
||||
parse is typically implemented with overloaded functions.
|
||||
A function instantiated on `ProgramUnit` will use `std::visit` to
|
||||
identify the right alternative and perform the right actions.
|
||||
The call to `std::visit` must pass a visitor that can handle all
|
||||
of the possibilities, and f18 will fail to build if one is missing.
|
||||
|
||||
Were we unable to use `std::variant` directly, we would likely
|
||||
have chosen to implement a local `SumType` replacement; in the
|
||||
absence of C++17's ability of `using` a template parameter pack,
|
||||
it would be less convenient to use.
|
||||
|
||||
The other options for polymorphism in C++ at the level of C++11
|
||||
would be to:
|
||||
* loosen up compile-time type safety and use a unified parse tree node
|
||||
representation with an enumeration type for an operator and generic
|
||||
subtree pointers, or
|
||||
* define the sum types for the parse tree as abstract base classes from
|
||||
which each particular alternative would derive, and then use virtual
|
||||
functions (or the forbidden `dynamic_cast`) to identify alternatives
|
||||
during analysis
|
||||
|
||||
### Product types
|
||||
|
||||
Many productions in the Fortran grammar describe a sequence of various
|
||||
sub-parses.
|
||||
For example, R504 defines the things that may appear in the "specification
|
||||
part" of a subprogram in the order in which they are allowed: `USE`
|
||||
statements, then `IMPORT` statements, and so on.
|
||||
|
||||
The parse tree node that represents such a thing needs to incorporate
|
||||
the representations of those parses, of course.
|
||||
It turns out to be convenient to allow these data members to be anonymous
|
||||
components of a `std::tuple` product type.
|
||||
This type facilitates the automation of code that walks over all of the
|
||||
members in a type-safe fashion and avoids the need to invent and remember
|
||||
needless member names -- the components of a `std::tuple` instance can
|
||||
be identified and accessed in terms of their types, and those tend to be
|
||||
distinct.
|
||||
|
||||
So we use `std::tuple` for such things.
|
||||
It has also been handy for template metaprogramming that needs to work
|
||||
with lists of types.
|
||||
|
||||
### `std::optional`
|
||||
|
||||
This simple little type is used wherever a value might or might not be
|
||||
present.
|
||||
It corresponds directly to the optional elements in the productions
|
||||
of the Fortran grammar.
|
||||
It is also used as a wrapper around a parse tree node type to define the
|
||||
results of the various parsing functions, where presence of a value
|
||||
signifies a successful recognition and absence denotes a failed parse.
|
||||
It is used in data structures in place of nullable pointers to
|
||||
avoid indirection as well as the possible confusion over whether a pointer
|
||||
is allowed to be null.
|
Loading…
Reference in New Issue