[flang] Remove old character "cooking" parser combinators

Remove the parser combinators that handled Fortran comments,
continuations, &c.; they have become obsolete with the use
of the new C++-coded prescanner module.  Clean out members from
ParseState that were used only by cookedNextChar and its sub-parsers.

Original-commit: flang-compiler/f18@41717531e5
Reviewed-on: https://github.com/flang-compiler/f18/pull/11
Tree-same-pre-rewrite: false
peter klausler 2018-02-16 10:41:16 -08:00
parent 3185562e19
commit 7af9dd8736
9 changed files with 74 additions and 488 deletions


@@ -45,9 +45,7 @@ These objects and functions are (or return) the fundamental parsers:
 * `cut` is a trivial parser that always fails silently.
 * `guard(pred)` returns a parser that succeeds if and only if the predicate
   expression evaluates to true.
-* `rawNextChar` returns the next raw character, and fails at EOF.
-* `cookedNextChar` returns the next character after preprocessing, skipping
-  Fortran line continuations and comments; it also fails at EOF
+* `nextChar` returns the next character, and fails at EOF.
 ### Combinators
 These functions and operators combine existing parsers to generate new parsers.
@@ -107,26 +105,15 @@ collect the values that they return.
 These are non-advancing state inquiry and update parsers:
 * `getColumn` returns the 1-based column position.
-* `inCharLiteral` succeeds under `withinCharLiteral` (below).
-* `inFortran` succeeds unless in a preprocessing directive.
 * `inFixedForm` succeeds in fixed form Fortran source.
 * `setInFixedForm` sets the fixed form flag, returning its prior value.
 * `columns` returns the 1-based column number after which source is clipped.
 * `setColumns(c)` sets the column limit and returns its prior value.
-### Monadic Combination
-When parsing depends on the result values of earlier parses, the
-*monadic bind* combinator is available.
-Please try to avoid using it, as it makes automatic analysis of the
-grammar difficult.
-It has the syntax `p >>= f`, and it constructs a parser that matches p,
-yielding some value x on success, then matches the parser returned from
-the function call `f(x)`.
 ### Token Parsers
 Last, we have these basic parsers on which the actual grammar of the Fortran
 is built. All of the following parsers consume characters acquired from
-`cookedNextChar`.
+`nextChar`.
 * `spaces` always succeeds after consuming any spaces or tabs
 * `digit` matches one cooked decimal digit (0-9)
@@ -138,8 +125,6 @@ is built. All of the following parsers consume characters acquired from
   the combinator `>>` or after `/`.)
 * `parenthesized(p)` is shorthand for `"(" >> p / ")"`.
 * `bracketed(p)` is shorthand for `"[" >> p / "]"`.
-* `withinCharLiteral(p)` applies the parser p, tokenizing for
-  CHARACTER/Hollerith literals.
 * `nonEmptyListOf(p)` matches a comma-separated list of one or more
   instances of p.
 * `optionalListOf(p)` is the same thing, but can be empty, and always succeeds.
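
The documentation hunk above describes `nextChar` as the single character source and `parenthesized(p)` as shorthand for `"(" >> p / ")"`. As a rough illustration only, the sketch below rebuilds that shape on a toy state. `MiniState`, `NextChar`, `Exact`, `Digit`, `Parenthesized`, and `ParseParenDigit` are hypothetical names that do not appear in the commit, and the sketch skips no spaces, unlike the real token parsers.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for ParseState: only a cursor over the input.
struct MiniState {
  const char *p;
  const char *limit;
};

// Like nextChar: yields the next character and advances; fails at EOF.
inline std::optional<char> NextChar(MiniState *state) {
  if (state->p >= state->limit) {
    return std::nullopt;
  }
  return *state->p++;
}

// Matches exactly one given character, restoring the cursor on failure.
template<char good> struct Exact {
  using resultType = char;
  static std::optional<char> Parse(MiniState *state) {
    const char *at{state->p};
    std::optional<char> ch{NextChar(state)};
    if (ch && *ch == good) {
      return ch;
    }
    state->p = at;
    return std::nullopt;
  }
};

// Matches one decimal digit (0-9), restoring the cursor on failure.
struct Digit {
  using resultType = char;
  static std::optional<char> Parse(MiniState *state) {
    const char *at{state->p};
    std::optional<char> ch{NextChar(state)};
    if (ch && *ch >= '0' && *ch <= '9') {
      return ch;
    }
    state->p = at;
    return std::nullopt;
  }
};

// Same shape as "(" >> p / ")": match "(", then p, then ")", and
// yield the result of p alone.
template<typename PA> struct Parenthesized {
  using resultType = typename PA::resultType;
  static std::optional<resultType> Parse(MiniState *state) {
    const char *at{state->p};
    if (Exact<'('>::Parse(state)) {
      if (std::optional<resultType> result{PA::Parse(state)}) {
        if (Exact<')'>::Parse(state)) {
          return result;
        }
      }
    }
    state->p = at;  // backtrack as a unit
    return std::nullopt;
  }
};

// Convenience wrapper: parse a parenthesized digit from a string.
inline std::optional<char> ParseParenDigit(const std::string &text) {
  MiniState state{text.data(), text.data() + text.size()};
  return Parenthesized<Digit>::Parse(&state);
}

// Usage: a well-formed input yields the digit; a missing ")" fails.
inline void CheckParenSketch() {
  assert(ParseParenDigit("(7)") == std::optional<char>{'7'});
  assert(!ParseParenDigit("(7").has_value());
}
```

Every parser in the sketch reads through the one `NextChar` function, which mirrors the point of this commit: once the prescanner has cooked the input, there is no separate raw or cooked character path for the combinators to choose between.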


@@ -1179,45 +1179,20 @@ private:
 inline constexpr auto guard(bool truth) { return GuardParser(truth); }
-// rawNextChar is a parser that succeeds if the parsing state is not
+// nextChar is a parser that succeeds if the parsing state is not
 // at the end of its input, returning the next character and
 // advancing the parse when it does so.
-constexpr struct RawNextCharParser {
+constexpr struct NextCharParser {
   using resultType = char;
-  constexpr RawNextCharParser() {}
+  constexpr NextCharParser() {}
   std::optional<char> Parse(ParseState *state) const {
-    if (std::optional<char> ch{state->GetNextRawChar()}) {
-      state->Advance();
-      return ch;
-    }
-    state->PutMessage("end of file");
-    return {};
+    std::optional<char> ch{state->GetNextChar()};
+    if (!ch) {
+      state->PutMessage("end of file");
+    }
+    return ch;
   }
-} rawNextChar;
+} nextChar;
-// If a is a parser, then withinCharLiteral(a) succeeds if a does so, with the
-// parsing state temporarily modified during the recognition of a to
-// signify that the parse is within quotes or Hollerith.
-template<typename PA> class WithinCharLiteral {
-public:
-  using resultType = typename PA::resultType;
-  constexpr WithinCharLiteral(const WithinCharLiteral &) = default;
-  constexpr WithinCharLiteral(const PA &parser) : parser_{parser} {}
-  std::optional<resultType> Parse(ParseState *state) const {
-    bool was{state->inCharLiteral()};
-    std::optional<resultType> result{parser_.Parse(state)};
-    state->set_inCharLiteral(was);
-    return result;
-  }
-private:
-  const PA parser_;
-};
-template<typename PA>
-inline constexpr auto withinCharLiteral(const PA &parser) {
-  return WithinCharLiteral<PA>(parser);
-}
 // If a is a parser for nonstandard usage, extension(a) is a parser that
 // is disabled if strict standard compliance is enforced, and enabled with


@ -1,113 +0,0 @@
#ifndef FORTRAN_CHAR_PARSERS_H_
#define FORTRAN_CHAR_PARSERS_H_
// Defines simple character-level parsers for use by the tokenizing
// parsers in cooked-chars.h.
#include "basic-parsers.h"
#include "parse-state.h"
#include <optional>
namespace Fortran {
namespace parser {
template<char goal> struct ExactRaw {
using resultType = char;
constexpr ExactRaw() {}
constexpr ExactRaw(const ExactRaw &) {}
static std::optional<char> Parse(ParseState *state) {
if (std::optional<char> ch{state->GetNextRawChar()}) {
if (*ch == goal) {
state->Advance();
return ch;
}
}
return {};
}
};
template<char a, char z> struct ExactRawRange {
using resultType = char;
constexpr ExactRawRange() {}
constexpr ExactRawRange(const ExactRawRange &){};
static std::optional<char> Parse(ParseState *state) {
if (std::optional<char> ch{state->GetNextRawChar()}) {
if (*ch >= a && *ch <= z) {
state->Advance();
return ch;
}
}
return {};
}
};
template<char unwanted> struct AnyCharExcept {
using resultType = char;
constexpr AnyCharExcept() {}
constexpr AnyCharExcept(const AnyCharExcept &) {}
static std::optional<char> Parse(ParseState *state) {
if (std::optional<char> ch{state->GetNextRawChar()}) {
if (*ch != unwanted) {
state->Advance();
return ch;
}
}
return {};
}
};
template<char goal> struct SkipPast {
using resultType = Success;
constexpr SkipPast() {}
constexpr SkipPast(const SkipPast &) {}
static std::optional<Success> Parse(ParseState *state) {
while (std::optional<char> ch{state->GetNextRawChar()}) {
state->Advance();
if (*ch == goal) {
return {Success{}};
}
}
return {};
}
};
// Line endings have been previously normalized to simple newlines.
constexpr auto eoln = ExactRaw<'\n'>{};
static inline bool InCharLiteral(const ParseState &state) {
return state.inCharLiteral();
}
constexpr StatePredicateGuardParser inCharLiteral{InCharLiteral};
class RawStringMatch {
public:
using resultType = Success;
constexpr RawStringMatch(const RawStringMatch &) = default;
constexpr RawStringMatch(const char *str, size_t n) : str_{str}, length_{n} {}
std::optional<Success> Parse(ParseState *state) const {
const char *p{str_};
for (size_t j{0}; j < length_ && *p != '\0'; ++j, ++p) {
if (std::optional<char> ch{state->GetNextRawChar()}) {
if (tolower(*ch) != *p) {
return {};
}
state->Advance();
} else {
return {};
}
}
return {Success{}};
}
private:
const char *const str_;
const size_t length_;
};
constexpr RawStringMatch operator""_raw(const char str[], size_t n) {
return RawStringMatch{str, n};
}
} // namespace parser
} // namespace Fortran
#endif // FORTRAN_CHAR_PARSERS_H_


@ -1,214 +0,0 @@
#ifndef FORTRAN_COOKED_CHARS_H_
#define FORTRAN_COOKED_CHARS_H_
// Defines the parser cookedNextChar, which supplies all of the input to
// the next stage of parsing, viz. the tokenization parsers in cooked-tokens.h.
// It consumes the stream of raw characters and removes Fortran comments,
// continuation line markers, and characters that appear in the right margin
// of fixed form source after the column limit. It inserts spaces to
// pad out source card images to fixed form's right margin when necessary.
// These parsers are largely bypassed when the prescanner is used, but still
// serve as the definition of correct character cooking, apart from
// preprocessing and file inclusion, which are not supported here.
#include "basic-parsers.h"
#include "char-parsers.h"
#include "idioms.h"
#include "parse-state.h"
#include <optional>
namespace Fortran {
namespace parser {
constexpr struct FixedFormPadding {
using resultType = char;
static std::optional<char> Parse(ParseState *state) {
if (state->inCharLiteral() && state->inFortran() && state->inFixedForm() &&
state->column() <= state->columns()) {
if (std::optional<char> ch{state->GetNextRawChar()}) {
if (*ch == '\n') {
state->AdvanceColumnForPadding();
return {' '};
}
}
}
return {};
}
} fixedFormPadding;
static inline void IncrementSkippedNewLines(ParseState *state) {
state->set_skippedNewLines(state->skippedNewLines() + 1);
}
constexpr StateUpdateParser noteSkippedNewLine{IncrementSkippedNewLines};
static inline bool InRightMargin(const ParseState &state) {
if (state.inFortran() && state.inFixedForm() &&
state.column() > state.columns() && !state.tabInCurrentLine()) {
if (std::optional<char> ch{state.GetNextRawChar()}) {
return *ch != '\n';
}
}
return false;
}
constexpr StatePredicateGuardParser inRightMargin{InRightMargin};
template<int col> struct AtFixedFormColumn {
using resultType = Success;
constexpr AtFixedFormColumn() {}
constexpr AtFixedFormColumn(const AtFixedFormColumn &) {}
static std::optional<Success> Parse(ParseState *state) {
if (state->inFortran() && state->inFixedForm() && !state->IsAtEnd() &&
state->column() == col) {
return {Success{}};
}
return {};
}
};
template<int col> struct AtColumn {
using resultType = Success;
constexpr AtColumn() {}
constexpr AtColumn(const AtColumn &) {}
static std::optional<Success> Parse(ParseState *state) {
if (!state->IsAtEnd() && state->column() == col) {
return {Success{}};
}
return {};
}
};
static inline bool AtOldDebugLineMarker(const ParseState &state) {
if (state.inFortran() && state.inFixedForm() && state.column() == 1) {
if (std::optional<char> ch{state.GetNextRawChar()}) {
return toupper(*ch) == 'D';
}
}
return false;
}
static inline bool AtDisabledOldDebugLine(const ParseState &state) {
return AtOldDebugLineMarker(state) && !state.enableOldDebugLines();
}
static inline bool AtEnabledOldDebugLine(const ParseState &state) {
return AtOldDebugLineMarker(state) && state.enableOldDebugLines();
}
static constexpr StatePredicateGuardParser atDisabledOldDebugLine{
AtDisabledOldDebugLine},
atEnabledOldDebugLine{AtEnabledOldDebugLine};
constexpr auto skipPastNewLine = SkipPast<'\n'>{} / noteSkippedNewLine;
// constexpr auto rawSpace =
// (ExactRaw<' '>{} || ExactRaw<'\t'>{} ||
// atEnabledOldDebugLine >> rawNextChar) >> ok;
constexpr struct FastRawSpaceParser {
using resultType = Success;
constexpr FastRawSpaceParser() {}
constexpr FastRawSpaceParser(const FastRawSpaceParser &) {}
static std::optional<Success> Parse(ParseState *state) {
if (std::optional<char> ch{state->GetNextRawChar()}) {
if (*ch == ' ' || *ch == '\t' ||
(toupper(*ch) == 'D' && state->column() == 1 &&
state->enableOldDebugLines() && state->inFortran() &&
state->inFixedForm())) {
state->Advance();
return {Success{}};
}
}
return {};
}
} rawSpace;
constexpr auto skipAnyRawSpaces = skipManyFast(rawSpace);
constexpr auto commentBang =
!inCharLiteral >> !AtFixedFormColumn<6>{} >> ExactRaw<'!'>{} >> ok;
constexpr auto fixedComment = AtFixedFormColumn<1>{} >>
((ExactRaw<'*'>{} || ExactRaw<'C'>{} || ExactRaw<'c'>{}) >> ok ||
atDisabledOldDebugLine ||
extension(ExactRaw<'%'>{} /* VAX %list, %eject, &c. */) >> ok);
constexpr auto comment =
(skipAnyRawSpaces >> (commentBang || inRightMargin) || fixedComment) >>
skipPastNewLine;
constexpr auto blankLine = skipAnyRawSpaces >> eoln >> ok;
inline bool InFortran(const ParseState &state) { return state.inFortran(); }
constexpr StatePredicateGuardParser inFortran{InFortran};
inline bool FixedFormFortran(const ParseState &state) {
return state.inFortran() && state.inFixedForm();
}
constexpr StatePredicateGuardParser fixedFormFortran{FixedFormFortran};
inline bool FreeFormFortran(const ParseState &state) {
return state.inFortran() && !state.inFixedForm();
}
constexpr StatePredicateGuardParser freeFormFortran{FreeFormFortran};
constexpr auto lineEnd = comment || blankLine;
constexpr auto skippedLineEnd = lineEnd / noteSkippedNewLine;
constexpr auto someSkippedLineEnds = skippedLineEnd >> skipMany(skippedLineEnd);
constexpr auto fixedFormContinuation = fixedFormFortran >>
someSkippedLineEnds >>
(extension(AtColumn<1>{} >>
(ExactRaw<'&'>{} || // extension: & in column 1
(ExactRaw<'\t'>{} >> // VAX Fortran: tab and then 1-9
ExactRawRange<'1', '9'>{}))) ||
(skipAnyRawSpaces >> AtColumn<6>{} >> AnyCharExcept<'0'>{})) >>
ok;
constexpr auto freeFormContinuation = freeFormFortran >>
((ExactRaw<'&'>{} >> blankLine >> skipMany(skippedLineEnd) >>
skipAnyRawSpaces >> ExactRaw<'&'>{} >> ok) ||
(ExactRaw<'&'>{} >> !inCharLiteral >> someSkippedLineEnds >>
maybe(skipAnyRawSpaces >> ExactRaw<'&'>{}) >> ok) ||
// PGI-only extension: don't need '&' on initial line if it's on later
// one
extension(eoln >> skipMany(skippedLineEnd) >> skipAnyRawSpaces >>
ExactRaw<'&'>{} >> ok));
constexpr auto skippable = freeFormContinuation ||
fixedFormFortran >> (fixedFormContinuation || !inCharLiteral >> rawSpace ||
AtColumn<6>{} >> ExactRaw<'0'>{} >> ok);
char toLower(char &&ch) { return tolower(ch); }
// TODO: skip \\ \n in C mode, increment skipped newline count;
// drain skipped newlines.
constexpr auto slowCookedNextChar = fixedFormPadding ||
skipMany(skippable) >>
(inCharLiteral >> rawNextChar || lineEnd >> pure('\n') ||
rawSpace >> skipAnyRawSpaces >> pure(' ') ||
// TODO: detect and report non-digit in fixed form label field
inFortran >> applyFunction(toLower, rawNextChar) || rawNextChar);
constexpr struct CookedChar {
using resultType = char;
static std::optional<char> Parse(ParseState *state) {
if (state->prescanned()) {
return rawNextChar.Parse(state);
}
return slowCookedNextChar.Parse(state);
}
} cookedNextChar;
static inline bool ConsumedAllInput(const ParseState &state) {
return state.IsAtEnd();
}
constexpr StatePredicateGuardParser consumedAllInput{ConsumedAllInput};
} // namespace parser
} // namespace Fortran
#endif // FORTRAN_COOKED_CHARS_H_


@@ -2,14 +2,13 @@
 #define FORTRAN_GRAMMAR_H_
 // Top-level grammar specification for Fortran. These parsers drive
-// tokenizing and raw character parsers (cooked-tokens.h, cooked-chars.h)
-// to recognize the productions of Fortran and to construct a parse tree.
+// the tokenization parsers in cooked-tokens.h to consume characters,
+// recognize the productions of Fortran, and to construct a parse tree.
 // See parser-combinators.txt for documentation on the parser combinator
 // library used here to implement an LL recursive descent recognizer.
 #include "basic-parsers.h"
-#include "cooked-chars.h"
-#include "cooked-tokens.h"
+#include "token-parsers.h"
 #include "format-specification.h"
 #include "parse-tree.h"
 #include "user-state.h"
@@ -540,7 +539,7 @@ constexpr auto executableConstruct =
 constexpr auto executionPartErrorRecovery = skipMany("\n"_tok) >>
     maybe(label) >> !"END"_tok >> !"ELSE"_tok >> !"CONTAINS"_tok >>
     !"CASE"_tok >> !"TYPE IS"_tok >> !"CLASS"_tok >>
-    !"RANK"_tok >> skipPastNewLine >> construct<ErrorRecovery>{};
+    !"RANK"_tok >> SkipPast<'\n'>{} >> construct<ErrorRecovery>{};
 // R510 execution-part-construct ->
 //        executable-construct | format-stmt | entry-stmt | data-stmt


@@ -30,33 +30,25 @@ public:
   ParseState(const ParseState &that)
     : cooked_{that.cooked_}, p_{that.p_}, limit_{that.limit_},
       column_{that.column_}, messages_{*that.cooked_.allSources()},
-      userState_{that.userState_}, inCharLiteral_{that.inCharLiteral_},
-      inFortran_{that.inFortran_}, inFixedForm_{that.inFixedForm_},
-      enableOldDebugLines_{that.enableOldDebugLines_}, columns_{that.columns_},
+      userState_{that.userState_}, inFixedForm_{that.inFixedForm_},
       enableBackslashEscapesInCharLiterals_{
           that.enableBackslashEscapesInCharLiterals_},
       strictConformance_{that.strictConformance_},
       warnOnNonstandardUsage_{that.warnOnNonstandardUsage_},
       warnOnDeprecatedUsage_{that.warnOnDeprecatedUsage_},
-      skippedNewLines_{that.skippedNewLines_},
-      tabInCurrentLine_{that.tabInCurrentLine_},
-      anyErrorRecovery_{that.anyErrorRecovery_}, prescanned_{that.prescanned_} {
+      anyErrorRecovery_{that.anyErrorRecovery_} {
   }
   ParseState(ParseState &&that)
     : cooked_{that.cooked_}, p_{that.p_}, limit_{that.limit_},
       column_{that.column_}, messages_{std::move(that.messages_)},
       context_{std::move(that.context_)}, userState_{that.userState_},
-      inCharLiteral_{that.inCharLiteral_}, inFortran_{that.inFortran_},
       inFixedForm_{that.inFixedForm_},
-      enableOldDebugLines_{that.enableOldDebugLines_}, columns_{that.columns_},
       enableBackslashEscapesInCharLiterals_{
           that.enableBackslashEscapesInCharLiterals_},
       strictConformance_{that.strictConformance_},
       warnOnNonstandardUsage_{that.warnOnNonstandardUsage_},
       warnOnDeprecatedUsage_{that.warnOnDeprecatedUsage_},
-      skippedNewLines_{that.skippedNewLines_},
-      tabInCurrentLine_{that.tabInCurrentLine_},
-      anyErrorRecovery_{that.anyErrorRecovery_}, prescanned_{that.prescanned_} {
+      anyErrorRecovery_{that.anyErrorRecovery_} {
   }
   ParseState &operator=(ParseState &&that) {
     swap(that);
@@ -87,36 +79,12 @@ public:
     return *this;
   }
-  bool inCharLiteral() const { return inCharLiteral_; }
-  ParseState &set_inCharLiteral(bool yes) {
-    inCharLiteral_ = yes;
-    return *this;
-  }
-  bool inFortran() const { return inFortran_; }
-  ParseState &set_inFortran(bool yes) {
-    inFortran_ = yes;
-    return *this;
-  }
   bool inFixedForm() const { return inFixedForm_; }
   ParseState &set_inFixedForm(bool yes) {
     inFixedForm_ = yes;
     return *this;
   }
-  bool enableOldDebugLines() const { return enableOldDebugLines_; }
-  ParseState &set_enableOldDebugLines(bool yes) {
-    enableOldDebugLines_ = yes;
-    return *this;
-  }
-  int columns() const { return columns_; }
-  ParseState &set_columns(int cols) {
-    columns_ = cols;
-    return *this;
-  }
   bool enableBackslashEscapesInCharLiterals() const {
     return enableBackslashEscapesInCharLiterals_;
   }
@@ -143,13 +111,6 @@ public:
     return *this;
   }
-  int skippedNewLines() const { return skippedNewLines_; }
-  void set_skippedNewLines(int n) { skippedNewLines_ = n; }
-  bool prescanned() const { return prescanned_; } // TODO: always true, remove
-  bool tabInCurrentLine() const { return tabInCurrentLine_; }
   const char *GetLocation() const { return p_; }
   Provenance GetProvenance(const char *at) const {
     return cooked_.GetProvenance(at).LocalOffsetToProvenance(0);
@@ -197,29 +158,18 @@ public:
   bool IsAtEnd() const { return p_ >= limit_; }
-  std::optional<char> GetNextRawChar() const {
-    if (p_ < limit_) {
-      return {*p_};
+  std::optional<char> GetNextChar() {
+    if (p_ >= limit_) {
+      return {};
     }
-    return {};
-  }
-  void Advance() {
-    CHECK(p_ < limit_);
-    if (*p_ == '\n') {
+    char ch{*p_++};
+    ++column_;
+    if (ch == '\n') {
       column_ = 1;
-      tabInCurrentLine_ = false;
-    } else if (*p_ == '\t') {
-      column_ = ((column_ + 7) & -8) + 1;
-      tabInCurrentLine_ = true;
-    } else {
-      ++column_;
     }
-    ++p_;
+    return {ch};
   }
-  void AdvanceColumnForPadding() { ++column_; }
 private:
   // Text remaining to be parsed
   const CookedSource &cooked_;
@@ -232,19 +182,12 @@ private:
   UserState *userState_{nullptr};
-  bool inCharLiteral_{false};
-  bool inFortran_{true};
   bool inFixedForm_{false};
-  bool enableOldDebugLines_{false};
-  int columns_{72};
   bool enableBackslashEscapesInCharLiterals_{true};
   bool strictConformance_{false};
   bool warnOnNonstandardUsage_{false};
   bool warnOnDeprecatedUsage_{false};
-  int skippedNewLines_{0};
-  bool tabInCurrentLine_{false};
   bool anyErrorRecovery_{false};
-  bool prescanned_{true};
   // NOTE: Any additions or modifications to these data members must also be
   // reflected in the copy and move constructors defined at the top of this
   // class definition!
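
The hunks above fold the old `GetNextRawChar()`/`Advance()` pair into one `GetNextChar()` that fetches, advances, and updates the column in a single call. The sketch below mimics that bookkeeping in isolation so the column behavior can be checked by hand. `MiniParseState` and `ColumnAfter` are hypothetical names, and starting the column at 1 is an assumption based on the documented "1-based column position".

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical reduction of ParseState: a cursor plus a column counter.
class MiniParseState {
public:
  // The caller must keep `text` alive while this object is in use.
  explicit MiniParseState(const std::string &text)
    : p_{text.data()}, limit_{text.data() + text.size()} {}

  bool IsAtEnd() const { return p_ >= limit_; }
  int column() const { return column_; }

  // Same shape as the new GetNextChar(): one call both returns the
  // character and advances; a newline resets the column to 1.
  std::optional<char> GetNextChar() {
    if (p_ >= limit_) {
      return {};
    }
    char ch{*p_++};
    ++column_;
    if (ch == '\n') {
      column_ = 1;
    }
    return {ch};
  }

private:
  const char *p_;
  const char *limit_;
  int column_{1};  // assumed initial value (1-based columns)
};

// Consumes up to `count` characters and reports the resulting column.
inline int ColumnAfter(const std::string &text, int count) {
  MiniParseState state{text};
  for (int j{0}; j < count; ++j) {
    if (!state.GetNextChar()) {
      break;  // EOF: the column is left unchanged
    }
  }
  return state.column();
}

// Usage: the column advances per character and resets after '\n'.
inline void CheckColumnSketch() {
  assert(ColumnAfter("ab\ncd", 2) == 3);
  assert(ColumnAfter("ab\ncd", 3) == 1);
}
```

Note that the tab-stop arithmetic of the removed `Advance()` has no counterpart here, matching the new code in the hunk, which treats a tab like any other character for column purposes.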


@@ -1,12 +1,10 @@
-#ifndef FORTRAN_COOKED_TOKENS_H_
-#define FORTRAN_COOKED_TOKENS_H_
+#ifndef FORTRAN_TOKEN_PARSERS_H_
+#define FORTRAN_TOKEN_PARSERS_H_
 // These parsers are driven by the Fortran grammar (grammar.h) to consume
-// the cooked character stream from cookedNextChar (cooked-chars.h) and
-// partition it into a context-sensitive token stream.
+// the prescanned character stream and recognize context-sensitive tokens.
 #include "basic-parsers.h"
-#include "cooked-chars.h"
 #include "idioms.h"
 #include "provenance.h"
 #include <cctype>
@@ -29,7 +27,7 @@ public:
     : predicate_{f}, message_{msg} {}
   std::optional<char> Parse(ParseState *state) const {
     auto at = state->GetLocation();
-    if (std::optional<char> result{cookedNextChar.Parse(state)}) {
+    if (std::optional<char> result{nextChar.Parse(state)}) {
       if (predicate_(*result)) {
         return result;
       }
@@ -68,7 +66,7 @@ public:
   constexpr CharMatch() {}
   static std::optional<char> Parse(ParseState *state) {
     auto at = state->GetLocation();
-    std::optional<char> result{cookedNextChar.Parse(state)};
+    std::optional<char> result{nextChar.Parse(state)};
     if (result && *result != good) {
       result.reset();
     }
@@ -83,7 +81,7 @@ constexpr struct Space {
   using resultType = Success;
   constexpr Space() {}
   static std::optional<Success> Parse(ParseState *state) {
-    std::optional<char> ch{cookedNextChar.Parse(state)};
+    std::optional<char> ch{nextChar.Parse(state)};
     if (ch) {
       if (ch == ' ' || ch == '\t') {
         return {Success{}};
@@ -116,13 +114,13 @@ public:
           continue; // redundant; ignore
         }
       }
-      if (!ch && !(ch = cookedNextChar.Parse(state))) {
+      if (!ch && !(ch = nextChar.Parse(state))) {
         return {};
       }
       if (spaceSkipping) {
         // medial space: 0 or more spaces/tabs accepted, none required
         while (*ch == ' ' || *ch == '\t') {
-          if (!(ch = cookedNextChar.Parse(state))) {
+          if (!(ch = nextChar.Parse(state))) {
             return {};
           }
         }
@@ -191,7 +189,7 @@ struct CharLiteralChar {
   using resultType = Result;
   static std::optional<Result> Parse(ParseState *state) {
     auto at = state->GetLocation();
-    std::optional<char> och{cookedNextChar.Parse(state)};
+    std::optional<char> och{nextChar.Parse(state)};
     if (!och.has_value()) {
       return {};
     }
@@ -203,7 +201,7 @@ struct CharLiteralChar {
     if (ch != '\\' || !state->enableBackslashEscapesInCharLiterals()) {
       return {Result::Bare(ch)};
     }
-    if (!(och = cookedNextChar.Parse(state)).has_value()) {
+    if (!(och = nextChar.Parse(state)).has_value()) {
       return {};
     }
     switch ((ch = *och)) {
@@ -249,13 +247,11 @@ template<char quote> struct CharLiteral {
   using resultType = std::string;
   static std::optional<std::string> Parse(ParseState *state) {
     std::string str;
-    CHECK(!state->inCharLiteral());
     static constexpr auto nextch = attempt(CharLiteralChar{});
     while (std::optional<CharLiteralChar::Result> ch{nextch.Parse(state)}) {
       if (ch->ch == quote && !ch->wasEscaped) {
         static constexpr auto doubled = attempt(CharMatch<quote>{});
         if (!doubled.Parse(state).has_value()) {
-          state->set_inCharLiteral(false);
           return {str};
         }
       }
@@ -286,14 +282,14 @@ struct BOZLiteral {
       return {};
     }
-    auto ch = cookedNextChar.Parse(state);
+    auto ch = nextChar.Parse(state);
     if (!ch) {
       return {};
     }
     if (toupper(*ch) == 'X' && state->strictConformance()) {
       return {};
     }
-    if (baseChar(*ch) && !(ch = cookedNextChar.Parse(state))) {
+    if (baseChar(*ch) && !(ch = nextChar.Parse(state))) {
       return {};
     }
@@ -305,7 +301,7 @@ struct BOZLiteral {
     auto at = state->GetLocation();
     std::string content;
     while (true) {
-      if (!(ch = cookedNextChar.Parse(state))) {
+      if (!(ch = nextChar.Parse(state))) {
         return {};
       }
       if (*ch == quote) {
@@ -319,7 +315,7 @@ struct BOZLiteral {
     if (!shift && !state->strictConformance()) {
       // extension: base allowed to appear as suffix
-      if (!(ch = cookedNextChar.Parse(state)) || !baseChar(*ch)) {
+      if (!(ch = nextChar.Parse(state)) || !baseChar(*ch)) {
         return {};
       }
     }
@@ -395,22 +391,44 @@ struct HollerithLiteral {
       return {};
     }
     std::string content;
-    CHECK(!state->inCharLiteral());
-    state->set_inCharLiteral(true);
     for (auto j = *charCount; j-- > 0;) {
-      std::optional<char> ch{cookedNextChar.Parse(state)};
+      std::optional<char> ch{nextChar.Parse(state)};
       if (!ch || !isprint(*ch)) {
         state->PutMessage(at, "insufficient or bad characters in Hollerith");
-        state->set_inCharLiteral(false);
         return {};
       }
       content += *ch;
     }
-    state->set_inCharLiteral(false);
     return {content};
   }
 };
+struct ConsumedAllInputParser {
+  using resultType = Success;
+  constexpr ConsumedAllInputParser() {}
+  static std::optional<Success> Parse(ParseState *state) {
+    if (state->IsAtEnd()) {
+      return {Success{}};
+    }
+    return {};
+  }
+} consumedAllInput;
+template<char goal>
+struct SkipPast {
+  using resultType = Success;
+  constexpr SkipPast() {}
+  constexpr SkipPast(const SkipPast &) {}
+  static std::optional<Success> Parse(ParseState *state) {
+    while (std::optional<char> ch{state->GetNextChar()}) {
+      if (*ch == goal) {
+        return {Success{}};
+      }
+    }
+    return {};
+  }
+};
 // A common idiom in the Fortran grammar is an optional item (usually
 // a nonempty comma-separated list) that, if present, must follow a comma
 // and precede a doubled colon. When the item is absent, the comma must
@@ -423,4 +441,4 @@ template<typename PA> inline constexpr auto optionalBeforeColons(const PA &p) {
 }
 } // namespace parser
 } // namespace Fortran
-#endif // FORTRAN_COOKED_TOKENS_H_
+#endif // FORTRAN_TOKEN_PARSERS_H_
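
`SkipPast<goal>` moves into this header and now loops directly on `GetNextChar()`; the grammar uses `SkipPast<'\n'>{}` in `executionPartErrorRecovery` to discard the remainder of a bad line. The following is a minimal sketch of that loop on a toy state, not the commit's code: `MiniState` and `RestAfterLine` are hypothetical, and `Success` is redeclared locally as an empty struct for self-containment.

```cpp
#include <cassert>
#include <optional>
#include <string>

struct Success {};  // local stand-in for the parser library's Success

// Hypothetical reduction of ParseState to a bare cursor.
struct MiniState {
  const char *p;
  const char *limit;
  bool IsAtEnd() const { return p >= limit; }
  std::optional<char> GetNextChar() {
    if (p >= limit) {
      return std::nullopt;
    }
    return *p++;
  }
};

// Consumes characters up to and including the first `goal`; fails if
// the input ends first (by which point everything has been consumed).
template<char goal> struct SkipPast {
  using resultType = Success;
  static std::optional<Success> Parse(MiniState *state) {
    while (std::optional<char> ch{state->GetNextChar()}) {
      if (*ch == goal) {
        return Success{};
      }
    }
    return std::nullopt;
  }
};

// Returns the text that remains after discarding one line, or an
// empty optional when the input holds no newline at all.
inline std::optional<std::string> RestAfterLine(const std::string &text) {
  MiniState state{text.data(), text.data() + text.size()};
  if (!SkipPast<'\n'>::Parse(&state)) {
    return std::nullopt;
  }
  return std::string(state.p, state.limit);
}

// Usage: the first line is dropped and parsing could resume after it.
inline void CheckSkipSketch() {
  assert(RestAfterLine("bad stmt\nx = 1\n").value() == "x = 1\n");
  assert(!RestAfterLine("no newline").has_value());
}
```

Because fetching and advancing are now one operation, the loop body needs no separate `Advance()` call, which is the visible difference from the `SkipPast` that was deleted along with char-parsers.h.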


@@ -1,8 +1,5 @@
 // Temporary Fortran front end driver main program for development scaffolding.
-#include "../../lib/parser/basic-parsers.h"
-#include "../../lib/parser/char-buffer.h"
-#include "../../lib/parser/cooked-chars.h"
 #include "../../lib/parser/grammar.h"
 #include "../../lib/parser/idioms.h"
 #include "../../lib/parser/message.h"
@@ -151,13 +148,10 @@ int main(int argc, char *const argv[]) {
   state.set_inFixedForm(fixedForm)
       .set_enableBackslashEscapesInCharLiterals(backslashEscapes)
       .set_strictConformance(standard)
-      .set_columns(columns)
-      .set_enableOldDebugLines(enableOldDebugLines)
       .set_userState(&ustate);
   if (dumpCookedChars) {
-    while (std::optional<char> och{
-               Fortran::parser::cookedNextChar.Parse(&state)}) {
+    while (std::optional<char> och{state.GetNextChar()}) {
       std::cout << *och;
     }
     return 0;


@@ -1,11 +1,3 @@
-#include <cstdlib>
-#include <iostream>
-#include <list>
-#include <optional>
-#include <sstream>
-#include <stddef.h>
-#include <string>
 #include "../../lib/parser/grammar.h"
 #include "../../lib/parser/idioms.h"
 #include "../../lib/parser/indirection.h"
@@ -19,6 +11,13 @@
 #include "../../lib/parser/user-state.h"
 #include "../../lib/semantics/attr.h"
 #include "../../lib/semantics/type.h"
+#include <cstdlib>
+#include <iostream>
+#include <list>
+#include <optional>
+#include <sstream>
+#include <string>
+#include <stddef.h>
 using namespace Fortran;
 using namespace parser;