Flexc++(1) was designed after flex(1) and flex++(1). Like those two programs, flexc++ generates code performing pattern-matching on text, possibly executing actions when certain regular expressions are recognized.
Refer to flexc++(1) for a general overview. This manual page covers the Application Programmer's Interface of classes generated by flexc++, offering the following sections:
The complete list of affected names is:
ActionType_, Leave_, StartCondition_, PostEnum_;
actionType_, continue_, echoCh_, echoFirst_, executeAction_, getRange_, get_, istreamName_, lex_, lop1_, lop2_, lop3_, lop4_, lopf_, matched_, noReturn_, print_, pushFront_, reset_, return_;
d_in_, d_token_, s_finIdx_, s_interactive_, s_maxSizeofStreamStack_, s_nRules_, s_rangeOfEOF_, s_ranges_, s_rf_.
An interactive scanner postpones scanning until an end-of-line character has been received, and then reads all information on the line received so far. Flexc++ supports the %interactive directive, which generates an interactive scanner. Here it is assumed that Scanner is the name of the scanner class generated by flexc++.
Caveat: generating interactive and non-interactive scanners should not be mixed as their class organizations fundamentally differ, and several of the Scanner class's members are only available in the non-interactive scanner. As the Scanner.h file contains the Scanner class's interface, which is normally left untouched by flexc++, flexc++ cannot adapt the Scanner class when requested to change the interactivity of an existing Scanner class. Because of this, support for the --interactive option was discontinued at flexc++'s 1.01.00 release.
The interactive scanner generated by flexc++ has the following characteristics:
- If the token returned by the scanner is not equal to 0 it is returned as the next token;
- Otherwise the next line is retrieved from the input stream passed to the Scanner's constructor (by default std::cin). If this fails, 0 is returned.
- A '\n' character is appended to the just read line, and the scanner's std::istringstream base class object is re-initialized with that line;
- The member lex_ returns the next token.
Here is an example of how such a scanner could be used:
    // scanner generated using 'flexc++ lexer' with lexer containing
    // the %interactive directive
    #include <iostream>
    #include "Scanner.h"

    using namespace std;

    int main()
    {
        Scanner scanner;                // by default: read from std::cin

        while (true)
        {
            cout << "? ";               // prompt at each line

            while (true)                // process all the line's tokens
            {
                int token = scanner.lex();

                if (token == '\n')      // end of line: new prompt
                    break;

                if (token == 0)         // end of input: done
                    return 0;

                                        // process other tokens
                cout << scanner.matched() << '\n';
                if (scanner.matched()[0] == 'q')
                    return 0;
            }
        }
    }
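The characteristics listed above can be tried out without any generated code. The class below is only a sketch and not what flexc++ emits: LineScanner, nextToken and the WORD token value are hypothetical names, and splitting a line into blank-separated words stands in for real rule matching. It merely illustrates how a scanner might refill its std::istringstream base class one line at a time.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an interactive scanner; NOT flexc++ output.
class LineScanner: private std::istringstream
{
    std::istream &d_in;             // source of complete lines
    std::string d_matched;          // text of the most recent token

    public:
        enum { WORD = 256 };        // illustrative token value

        explicit LineScanner(std::istream &in)
        :
            d_in(in)
        {}

        std::string const &matched() const
        {
            return d_matched;
        }

        int lex()
        {
            if (int token = nextToken())    // tokens left on this line
                return token;

            std::string line;
            if (!std::getline(d_in, line))  // no further line: done
                return 0;

            clear();                        // reset the eof/fail state
            str(line + '\n');               // re-initialize with the line
            assert(good());                 // the new buffer is readable
            return nextToken();
        }

    private:
        // Plays the role of lex_: returns 0 once the line is exhausted.
        int nextToken()
        {
            d_matched.clear();

            int ch = peek();
            while (ch == ' ' || ch == '\t') // skip blanks, keep '\n'
            {
                get();
                ch = peek();
            }

            if (ch == traits_type::eof())
                return 0;

            if (ch == '\n')
            {
                d_matched += static_cast<char>(get());
                return '\n';
            }

            while (ch != traits_type::eof() &&
                   !std::isspace(static_cast<unsigned char>(ch)))
            {
                d_matched += static_cast<char>(get());
                ch = peek();
            }
            return WORD;
        }
};
```

Constructing a LineScanner from std::cin and calling lex in a loop like the one in the example above would then show one prompt per input line.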
By default, flexc++ generates a file Scanner.h containing the initial interface of the scanner class performing the lexical scan according to the specifications given in flexc++'s input file. The name of the file that is generated can easily be changed using flexc++'s --class-header option. In this man-page we'll stick to using the default name.
The file Scanner.h is generated only once, unless an explicit request is made to rewrite it (using flexc++'s --force-class-header option).
The provided interface is very light-weight, primarily offering a link to the scanner's base class (see this manpage's sections 8 through 16).
Many of the facilities offered by the scanner class are inherited from the ScannerBase base class. Additional facilities offered by the Scanner class are covered below.
All symbols that are required by the generated scanner class end in an underscore character (e.g., executeAction_). These names should not be redefined. As they are part of the Scanner and ScannerBase classes, their scope is immediately clear and confusion with identically named identifiers elsewhere is unlikely.
Some member functions do not use the underscore convention. These are the scanner class's constructors, or names that are similar or equal to names that have historically been used (e.g., length). Also, some functions offer hooks into the implementation (like preCode). The names of these hook functions don't end in underscores either.
By default (keepCwd == true) the directory that was active when the scanner was constructed is made current again once the scanning process ends (i.e., when lex_(), see below, returns 0).
When keepCwd == false the scanner doesn't reset its working directory to the directory that was current when the scanner was constructed. This may be convenient in situations where a program repeatedly calls switchStream(), followed by full lexical scans of the switched-to streams.
With interactive scanners input stream switching or stacking is not available; switching output streams, however, is.
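The effect of keepCwd can be pictured with a small stand-alone helper. This is a sketch under assumptions, not flexc++'s implementation: CwdKeeper and scanDone are hypothetical names, and std::filesystem (C++17) is used here only because it offers a portable way to query and change the working directory.

```cpp
#include <cassert>
#include <filesystem>

// Hypothetical helper, NOT part of flexc++: remembers the working
// directory at construction time and optionally restores it later.
class CwdKeeper
{
    std::filesystem::path d_start;  // directory active at construction
    bool d_keepCwd;

    public:
        explicit CwdKeeper(bool keepCwd = true)
        :
            d_start(std::filesystem::current_path()),
            d_keepCwd(keepCwd)
        {
            assert(d_start.is_absolute());
        }

        // Intended to be called once the scanning process has ended.
        void scanDone() const
        {
            if (d_keepCwd)
                std::filesystem::current_path(d_start);
        }
};
```

With keepCwd == false nothing is restored, so a program that switches streams repeatedly stays in whatever directory it last moved to.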
The parameter keepCwd is used as with the previous constructor.
This constructor is not available with interactive scanners.
inline int Scanner::lex() { return lex_(); }
Caveat: with interactive scanners the lex function is defined in the generated lex.cc file. Once flexc++ has generated the scanner class header file this scanner class header file isn't automatically rewritten by flexc++. If, at some later stage, an interactive scanner must be generated, then the inline lex implementation must be removed `by hand' from the scanner class header file. Likewise, a lex member implementation (like the above) must be provided `by hand' if a non-interactive scanner is required after first having generated files implementing an interactive scanner.
    int Scanner::lex_()
    {
        ...
        preCode();

        while (true)
        {
            size_t ch = get_();             // fetch next char
            ...
            switch (actionType_(range))     // determine the action
            {
                ... maybe return
            }
            ... no return, continue scanning
            preCode();
        } // while
    }
Displaying is suppressed when the lex.cc file is (re)generated without using this directive. The function actually showing the tokens (ScannerBase::print_) is called from print, which is defined in-line in Scanner.h. Calling ScannerBase::print_ can therefore also easily be made dependent on an option set by the program using the scanner object.
    #ifndef Scanner_H_INCLUDED_
    #define Scanner_H_INCLUDED_

    // $insert baseclass_h
    #include "Scannerbase.h"

    // $insert classHead
    class Scanner: public ScannerBase
    {
        public:
            explicit Scanner(std::istream &in = std::cin,
                                    std::ostream &out = std::cout);

            Scanner(std::string const &infile, std::string const &outfile);

            // $insert lexFunctionDecl
            int lex();

        private:
            int lex_();
            int executeAction_(size_t ruleNr);

            void print();
            void preCode();     // re-implement this function for code that must
                                // be exec'ed before the patternmatching starts

            void postCode(PostEnum_ type);
                                // re-implement this function for code that must
                                // be exec'ed after the rules's actions.
    };

    // $insert scannerConstructors
    inline Scanner::Scanner(std::istream &in, std::ostream &out)
    :
        ScannerBase(in, out)
    {}

    inline Scanner::Scanner(std::string const &infile, std::string const &outfile)
    :
        ScannerBase(infile, outfile)
    {}

    // $insert inlineLexFunction
    inline int Scanner::lex()
    {
        return lex_();
    }

    inline void Scanner::preCode()
    {
        // optionally replace by your own code
    }

    inline void Scanner::postCode(PostEnum_ type)
    {
        // optionally replace by your own code
    }

    inline void Scanner::print()
    {
        print_();
    }

    #endif // Scanner_H_INCLUDED_
By default, flexc++ generates a file Scannerbase.h containing the interface of the base class of the scanner class also generated by flexc++. The name of the file that is generated can easily be changed using flexc++'s --baseclass-header option. In this man-page we use the default name.
The file Scannerbase.h is generated at each new flexc++ run. It contains no user-serviceable or extensible parts. Rewriting can be prevented by specifying flexc++'s --no-baseclass-header option.
begin(StartCondition_::INITIAL);
PostEnum_::END: the function lex_ immediately returns 0 once postCode returns, indicating the end of the input was reached;
PostEnum_::POP: the end of an input stream was reached, and processing continues with the previously pushed input stream. In this case the function lex_ doesn't return, it simply continues processing the previously pushed stream;
PostEnum_::RETURN: the function lex_ immediately returns once postCode returns, returning the next token;
PostEnum_::WIP: the function lex_ has matched a non-returning rule, and continues its rule-matching process.
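As an illustration of what a postCode re-implementation might do with these values, the following stand-alone sketch merely counts them. The enum defined here is a local mock that mirrors the value names listed above (in generated code the enum is supplied by the scanner's base class), and ScanStats with its data members is a hypothetical name introduced for this example only.

```cpp
#include <cassert>
#include <cstddef>

// Local mock mirroring the value names listed above; generated scanners
// receive the real enum from their base class.
enum class PostEnum_ { END, POP, RETURN, WIP };

// Hypothetical bookkeeping class with a postCode-like hook.
struct ScanStats
{
    std::size_t returned = 0;   // rules whose actions returned a token
    std::size_t continued = 0;  // non-returning rules (WIP)
    std::size_t popped = 0;     // input streams that were popped
    bool ended = false;         // end of all input was seen

    void postCode(PostEnum_ type)
    {
        switch (type)
        {
            case PostEnum_::RETURN:
                ++returned;
            break;

            case PostEnum_::WIP:
                ++continued;
            break;

            case PostEnum_::POP:
                ++popped;
            break;

            case PostEnum_::END:
                ended = true;
            break;

            default:            // e.g., an out-of-range cast
                assert(false && "unknown PostEnum_ value");
        }
    }
};
```

In a real scanner the body of the switch would go into Scanner::postCode, which the generated header leaves empty for exactly this kind of user code.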
There are no public constructors. ScannerBase is a base class for the Scanner class generated by flexc++. ScannerBase only offers protected constructors.
This member is not available with interactive scanners.
The current output stream is closed, and output is written to outfilename. If this file already exists, it is rewritten.
This member is not available with interactive scanners.
If outfilename == "-" then the standard output stream is used as the scanner's output medium; if outfilename == "" then the standard error stream is used as the scanner's output medium.
This member is not available with interactive scanners.
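The outfilename convention described above ("-" for the standard output stream, an empty string for the standard error stream, anything else a file that is rewritten) could be sketched as follows. OutSwitcher and switchOutput are hypothetical names; this is not the code flexc++ generates, only one plausible way to express the rule.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper, NOT flexc++'s implementation.
class OutSwitcher
{
    std::ofstream d_file;               // used for named files only
    std::ostream *d_out = &std::cout;   // currently selected stream

    public:
        void switchOutput(std::string const &outfilename)
        {
            if (d_file.is_open())
                d_file.close();         // close the current output file

            if (outfilename == "-")
                d_out = &std::cout;
            else if (outfilename.empty())
                d_out = &std::cerr;
            else
            {
                d_file.open(outfilename);   // truncates an existing file
                d_out = &d_file;            // callers should check its state
            }
        }

        std::ostream &out()
        {
            assert(d_out != nullptr);
            return *d_out;
        }
};
```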
By default (keepCwd == true) the directory that was active when ScannerBase was constructed is made current again once the scanning process ends (i.e., when Scanner::lex_() returns 0).
When keepCwd == false the scanner doesn't reset its working directory to the directory that was current when the ScannerBase was constructed. This may be convenient in situations where a program repeatedly calls switchStream(), followed by full lexical scans of the switched-to streams.
This constructor is not available for interactive scanners.
The parameter keepCwd is used as with the previous constructor.
All member functions ending in an underscore character are for internal use only and should not be called by user-defined members of the Scanner class.
The following members, however, can safely be called by members of the generated Scanner class:
begin(StartCondition_::INITIAL);
    regex-to-match  {
                        if (int ret = memberFunction())
                            return ret;
                    }

The member leave removes the need for constructions like the above. The member leave can be called from within member functions encapsulating actions performed when a regular expression has been matched. It ends lex, returning retValue to its caller. The above rule can now be written like this:
    regex-to-match  memberFunction();

and memberFunction could be implemented as follows:
    void memberFunction()
    {
        if (someCondition())
        {
            // any action, e.g.,
            // switch mini-scanner
            begin(StartCondition_::INITIAL);

            leave(Parser::TOKENVALUE);  // lex returns TOKENVALUE
            // this point is never reached
        }
        pushStream(d_matched);          // switch to the next stream
        // lex continues
    }

The member leave should only be called (usually indirectly, from nested function calls) from actions defined in the scanner's specification file; calling leave outside of this context results in undefined behavior.
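How a call to leave, made from a nested function, can end lex is not spelled out here. One conceivable mechanism is an exception that is thrown by leave and caught in lex; the sketch below assumes exactly that and should not be read as a description of the generated code. MiniScanner, LeaveValue and action are hypothetical names.

```cpp
#include <cassert>

// Hypothetical miniature scanner, NOT flexc++ output. It ASSUMES that
// leave() works by throwing and that lex() catches the thrown value.
class MiniScanner
{
    struct LeaveValue               // hypothetical carrier of retValue
    {
        int value;
    };

    bool d_condition;

    public:
        explicit MiniScanner(bool condition)
        :
            d_condition(condition)
        {}

        int lex()
        {
            try
            {
                action();           // stands in for a rule's action block
                assert(!d_condition);   // leave() was not called
                return 0;
            }
            catch (LeaveValue const &caught)
            {
                return caught.value;    // the value passed to leave()
            }
        }

    private:
        [[noreturn]] void leave(int retValue) const
        {
            throw LeaveValue{ retValue };
        }

        void action()
        {
            if (d_condition)
                leave(42);
            // only reached when leave() was not called
        }
};
```

Under this assumption it is also clear why calling leave outside of an action is problematic: nothing would be in place to catch the thrown value.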
When the <<EOF>> pattern is used in the scanner's specification file popStream is not automatically called. In that case it should explicitly be called in the <<EOF>> pattern's action block, returning 0 when it returns false. E.g.,
    <<EOF>>     {
                    if (not popStream())
                        return 0;
                    cerr << "WIP on " << streamStack().size() <<
                                                        " file(s)\n";
                }
This member is not available with interactive scanners.
This member is not available with interactive scanners.
The StreamStruct itself is a struct having only one documented member: std::string const &pushedName, containing the absolute pathname of the initial and pushed streams.
Internally, the streamStack vector is used as a stack. The pathname of the file that was specified at the scanner's construction time is found at index position 0, and the pathname of the file that's currently being processed is found at streamStack().back(). Switching input streams changes the pathname at streamStack().back(), and if the pathname of a stream cannot be determined (which happens when constructing a scanner from or switching to a std::istream) then the program's current working directory is not altered and pushedName contains (istream).
This member is not available with interactive scanners.
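The bookkeeping described for the stream stack can be mimicked with a plain vector. The following is a simplified mock, not flexc++'s code: StreamStack is a hypothetical class, its StreamStruct stores pushedName as a plain std::string, no files are opened, and the rule that popStream returns false once only the initial stream is left is an assumption made for this sketch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct StreamStruct                 // simplified mock
{
    std::string pushedName;
};

// Hypothetical class, NOT part of flexc++: names only, no real streams.
class StreamStack
{
    std::vector<StreamStruct> d_stack;

    public:
        explicit StreamStack(std::string initialName)
        {
            d_stack.push_back(StreamStruct{ std::move(initialName) });
        }

        void pushStream(std::string name)       // stack a new stream
        {
            d_stack.push_back(StreamStruct{ std::move(name) });
        }

        void switchStream(std::string name)     // replace the current one
        {
            assert(!d_stack.empty());
            d_stack.back().pushedName = std::move(name);
        }

        bool popStream()            // assumed: false if nothing to pop
        {
            if (d_stack.size() <= 1)
                return false;
            d_stack.pop_back();
            return true;
        }

        std::vector<StreamStruct> const &streamStack() const
        {
            return d_stack;
        }
};
```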
All protected data members are for internal use only, allowing lex_ to access them. All of them end in an underscore character.
Flex++ (old) | Flexc++ (new)
lineno()     | lineNr()
YYText()     | matched()
less()       | accept()
Flexc++ generates a file Scannerbase.h defining the scanner class's base class, by default named ScannerBase (which is the name used in this man-page). The base class ScannerBase contains a nested class Input having this interface:
    class Input
    {
        public:
            Input();
            Input(std::istream *iStream, size_t lineNr = 1);
            size_t get();
            size_t lineNr() const;
            size_t nPending() const;
            void setPending(size_t nPending);
            void reRead(size_t ch);
            void reRead(std::string const &str, size_t fmIdx);
            void close();
    };

The members of this class are all required and offer an implementation level between the operations of ScannerBase and flexc++'s actual input file that's being processed.
By default, flexc++ provides the implementation of all of Input's required members. Therefore, in most situations this section of this man-page can safely be ignored.
However, users may define and extend their own Input class, providing flexc++'s base class with their own Input class. To do so flexc++'s rules file must contain the following directives:

    %input-inline = "inline"
    %input-interface = "interface"
    %input-implementation = "sourcefile"

Here, `inline' is the name of a file containing the inline implementations of the class Input's member functions; `interface' is the name of a file containing the class Input's interface, while the non-inline implementations of the class Input's members are provided in `sourcefile'.
By default the class Input is defined in ScannerBase's private section. When providing your own implementation the class Input is declared and defined in ScannerBase's protected section so its members can also be accessed by the derived class Scanner. Moreover, in that situation the (otherwise private) ScannerBase data member Input *d_input is also declared in ScannerBase's protected section.
In the default implementation of the class Input, short one-line implementations of some of its members are defined inline below the interface of the class ScannerBase in the generated Scannerbase.h file. When providing your own implementation these members can also be defined inline by providing their implementations in a file (e.g., "inputinline") which is specified by the %input-inline directive.
When using the %input-inline directive make sure that the defined members are specified as inline Input:: class members inside the class ScannerBase (using the actually used scanner class name). E.g., if the default scanner class is used (see also the %class-name directive), then the inline implementation of the member Input::lineNr could be
    inline size_t ScannerBase::Input::lineNr() const
    {
        return d_lineNr;
    }

If your self-defined Input class does not modify the default implementations of those inline members, then use the default implementations by simply not specifying the %input-inline directive.
Avoid using standard extensions for the files specified at the %input-implementation, %input-inline, and %input-interface directives to prevent confusing the compiler when using program maintenance utilities: the interface file merely specifies the interface of the class Input, like the one shown above, and that interface is then inserted in ScannerBase's class as a nested class. Putting these files in, e.g., a subdirectory input of the scanner's directory, and then specifying, e.g.,
    %input-interface = "input/interface"

nicely separates the interface and implementation of the scanner-class from the user-defined input class.
The file sourcefile specified at the %input-implementation directive contains the standard (non-inline) implementations of members of the user-defined Input class.
All member functions of the self-defined class Input must be defined within the class ScannerBase's class scope. E.g., an implementation of the member Input::reRead(size_t ch) would use the following mold:
    void ScannerBase::Input::reRead(size_t ch)
    {
        // ... your implementation here
    }
When your user-defined Input class requires headers that are not already provided by the class ScannerBase (see the top lines of a generated Scannerbase.h file for an overview of the already included header files), then provide the required #include specifications in the file specified at the %baseclass-preinclude directive (or corresponding option). The generated scanner itself is self-supporting and doesn't need additional headers.
A word of advice: when implementing your own Input class start out with the default class declaration and implementation as found in Scannerbase.h and lex.cc, and modify that default class to the class you need. Note that the interface of your class Input must at least offer the same members and constructors as the default provided class Input, but it may define additional data members and/or member functions if required by your implementation (see also the following two sections for descriptions of the class Input's constructors and required members).
When the lexical scanner generated by flexc++ switches streams using the //include directive (see also section 2. FILE SWITCHING in the flexc++input(7) man page), then the input stream that's currently processed is pushed on an Input stack maintained by ScannerBase, and processing continues at the file named at the //include directive. Once the latter file has been processed, the previously pushed stream is popped off the stack, and processing of the popped stream continues. This implies that Input objects must be `stack-able'. The interface of the class Input must always be designed in such a way that it satisfies this requirement.
The new input stream's line counter is set to lineNr, by default 1.
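What `stack-able' could mean in practice is shown by the following simplified sketch. It is not the default Input class: only get, reRead(size_t) and lineNr are modelled, the atEOF value 0x100 is an assumption of this example, and being stack-able is interpreted here as being copyable, so that an object can be stored in a std::stack and later restored with its line counter and pending characters intact.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <sstream>      // also provides std::istream
#include <stack>
#include <string>

// Simplified, hypothetical Input; NOT flexc++'s default implementation.
class Input
{
    std::deque<unsigned char> d_pending;    // characters to read again
    std::istream *d_in = nullptr;           // not owned
    std::size_t d_lineNr = 1;

    public:
        static constexpr std::size_t atEOF = 0x100; // assumed EOF marker

        Input() = default;

        Input(std::istream *iStream, std::size_t lineNr = 1)
        :
            d_in(iStream),
            d_lineNr(lineNr)
        {}

        std::size_t get()
        {
            std::size_t ch;

            if (!d_pending.empty())         // re-read characters first
            {
                ch = d_pending.front();
                d_pending.pop_front();
            }
            else
            {
                int next = d_in ? d_in->get() : std::char_traits<char>::eof();
                if (next == std::char_traits<char>::eof())
                    return atEOF;
                ch = static_cast<unsigned char>(next);
            }

            assert(ch < atEOF);             // only plain characters here
            if (ch == '\n')
                ++d_lineNr;
            return ch;
        }

        void reRead(std::size_t ch)         // push ch back for re-reading
        {
            if (ch >= atEOF)
                return;
            if (ch == '\n')
                --d_lineNr;
            d_pending.push_front(static_cast<unsigned char>(ch));
        }

        std::size_t lineNr() const
        {
            return d_lineNr;
        }
};

using InputStack = std::stack<Input>;       // suspended input streams
```

Pushing the current object before switching to a new stream, and assigning the stack's top back once the new stream is exhausted, resumes the suspended stream at the position and line number where it was left.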
Flexc++'s default skeleton files are in /usr/share/flexc++.
By default, flexc++ generates the following files:
flexc++(1), flexc++input(7)