cpphs
What is cpphs?
cpphs is a liberalised re-implementation of cpp, the C pre-processor, in Haskell.
Why re-implement cpp? Rightly or wrongly, the C pre-processor is widely used in Haskell source code. It enables conditional compilation for different compilers, different versions of the same compiler, and different OS platforms. It is also occasionally used for its macro language, which can enable certain forms of platform-specific detail-filling, such as the tedious boilerplate generation of instance definitions and FFI declarations. However, there are two problems with cpp, aside from the obvious aesthetic ones:
- For some Haskell systems, notably Hugs on Windows, a true cpp is not available by default.
- Even for the other Haskell systems, the common cpp provided by the gcc 3.x and 4.x series has changed subtly in ways that are incompatible with Haskell's syntax. There have always been problems with, for instance, string gaps, and prime characters in identifiers. These problems are only going to get worse.
This version of the C pre-processor is pretty much feature-complete, and compatible with the -traditional style. It has two main modes:
- conditional compilation only (--nomacro),
- and full macro-expansion (default).
Source language features:
#ifdef | simple conditional compilation |
#if | the full boolean language of defined(), &&, ||, ==, etc. |
#elif | chained conditionals |
#define | in-line definitions (text replacements and macros) |
#undef | in-line revocation of definitions |
#include | file inclusion |
#line | line number directives |
#pragma | cpp pragmas (ignored) |
\\n | line continuations within all # directives |
/**/ | token catenation within a macro definition |
## | ANSI-style token catenation |
# | ANSI-style token stringisation |
__FILE__ | special text replacement for DIY error messages |
__LINE__ | special text replacement for DIY error messages |
__DATE__ | special text replacement |
__TIME__ | special text replacement |
Macro expansion is recursive. Redefinition of a macro name does not generate a warning. Macros can be defined on the command-line with -D just like textual replacements. Macro names are permitted to be Haskell identifiers e.g. with the prime ' and backtick ` characters, which is slightly looser than in C, but they still may not include operator symbols.
Numbering of lines in the output is preserved so that any later processor can give meaningful error messages. When a file is #include'd, cpphs inserts #line directives for the same reason. Numbering should be correct even in the presence of line continuations. If you don't want #line directives in the final output, use the --noline option.
Any syntax error in a cpp directive gives a warning message to stderr. Failure to find a #include'd file also produces a warning to stderr. In both cases, processing continues on the rest of the input.
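As a rough illustration of the source language features listed in this section, here is a small Haskell module that uses conditional compilation, a textual replacement, a macro with an argument, and __LINE__. It is only a sketch: GREETING, SQUARE and compilerName are names invented for the example, and __GLASGOW_HASKELL__ and __HUGS__ are assumed to be supplied by the compiler driver, since cpphs itself defines nothing by default.

```haskell
{-# LANGUAGE CPP #-}
-- A sketch only: GREETING, SQUARE and compilerName are invented names.
module Main (main) where

-- A textual replacement and a macro with one argument.
#define GREETING "hello"
#define SQUARE(x) ((x) * (x))

-- Conditional compilation using the boolean language of #if and #elif.
-- The compiler symbols tested here come from the compiler driver, not cpphs.
#if defined(__GLASGOW_HASKELL__) && !defined(__HUGS__)
compilerName :: String
compilerName = "ghc"
#elif defined(__HUGS__)
compilerName :: String
compilerName = "hugs"
#else
compilerName :: String
compilerName = "unknown"
#endif

main :: IO ()
main = do
  putStrLn (GREETING ++ " from " ++ compilerName)
  -- The macro call expands to ((3 + 1) * (3 + 1)).
  print (SQUARE(3 + 1) :: Int)
  -- The special name below is replaced by the current line number.
  putStrLn ("reached line " ++ show (__LINE__ :: Int))
```

The module avoids primes in macro names and the ## operator so that it should also pass through a -traditional gcc cpp unchanged.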
How do I use it?
Usage: cpphs [ filename | -Dsym | -Dsym=val | -Ipath ]+ [-Ofile]
             [--nomacro] [--noline] [--nowarn] [--pragma] [--strip] [--strip-eol]
             [--text] [--hashes] [--layout] [--unlit] [ --cpp compatopts ]
       cpphs --version
You can give any number of filenames on the command-line. The results are catenated on standard output. (Macro definitions in one file do not carry over into the next.) If no filename is given, cpphs reads from standard input.
Note: if you wish to use cpphs as a replacement for gcc's cpp in conjunction with the ghc compiler then the extra options you need to give to ghc are these:
-cpp -pgmPcpphs -optP--cpp
Options:
-Dsym | define a textual replacement (default value is 1) |
-Dsym=val | define a textual replacement with a specific value |
-Dsym(args)=val | define a macro with arguments |
-Ipath | add a directory to the search path for #include's |
-Ofile | specify a file for output (default is stdout) |
--nomacro | only process #ifdef's and #include's, do not expand macros |
--noline | remove #line droppings from the output |
--nowarn | suppress messages from missing #include files, or #warning |
--pragma | retain #pragma in the output (normally removed) |
--strip | convert traditional C-style comments (not eol //) to whitespace, even outside cpp directives |
--strip-eol | convert modern C-style comments (including /**/ and //) to whitespace, even outside cpp directives |
--hashes | recognise the ANSI # stringise operator, and ## for token catenation, within macros |
--text | treat input as plain text, not Haskell code |
--layout | preserve newlines within macro expansions |
--unlit | unlit literate source code |
--cpp compatopts | accept standard cpp options: -o, -x, -ansi, -traditional, -P, -C, -A, etc |
--version | report version number of cpphs and stop |
There are NO textual replacements defined by default. (Normal cpp usually has definitions for machine, OS, etc. You can easily create a wrapper script if you need these.) The search path is searched in order of the -I options, except that the directory of the calling file, then the current directory, are always searched first. Again, there is no default search path (unless you define one via a wrapper script).
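The search order just described can be modelled in a few lines. This is a hypothetical sketch of the rule as stated, not the code cpphs uses; includeSearchOrder and the example paths are invented for illustration.

```haskell
module Main (main) where

import System.FilePath (takeDirectory)

-- | Directories to try for an #include, in the order described above:
-- the directory of the calling file, then the current directory,
-- then each -I path in the order given on the command line.
includeSearchOrder :: FilePath -> [FilePath] -> [FilePath]
includeSearchOrder callingFile iPaths =
  takeDirectory callingFile : "." : iPaths

main :: IO ()
main =
  -- Models: cpphs -Iinclude -I../shared src/Data/Foo.hs
  mapM_ putStrLn (includeSearchOrder "src/Data/Foo.hs" ["include", "../shared"])
```

For the example call, the directories are listed as src/Data, then the current directory, then the two -I paths.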
Downloads
Current stable version:
cpphs-1.5, release date 2007.06.05
By HTTP:
.tar.gz,
.zip.
Windows binary,
Fedora package,
Gentoo package,
FreeBSD port,
OpenBSD port.
- Fixed some more obscure corner cases, involving parameterised macro expansion within conditionals e.g. #if FOO(BAR,QUUX)
- Internal refactoring, affecting parts of the library API.
Development:
The current darcs repository of cpphs is available at
darcs get http://www.cs.york.ac.uk/fp/darcs/cpphs
(Users on Windows or MacOS filesystems need to use the --partial flag.) The source tree and version history can be browsed on-line through darcsweb.
What's new, over and above the latest stable release?
- New option --strip-eol now strips C eol // comments in addition to /**/.
- Bugfix for cpp directives within {- -} Haskell comments.
Older versions:
cpphs-1.4, release date 2007.04.17
By HTTP:
.tar.gz,
.zip.
- Added a "--pragma" option to retain #pragma in the output.
- Fixed a number of obscure corner cases involving the interaction of multiple features e.g. foo##__LINE__.
- Added the "--nowarn" option.
cpphs-1.3, release date 2006.10.09
By HTTP:
.tar.gz,
.zip,
Windows binary.
- Added a "--cpp" option for drop-in compatibility with standard cpp. It causes cpphs to accept standard cpp flags and translate them to cpphs equivalents. Compatibility options include: -o, -ansi, -traditional, -stdc, -x, -include, -P, -C, -CC, -A. The file behaviour is different too - if two filenames are given on the commandline, then the second is treated as the output location.
- Fixed a corner-case bug in evaluating chained and overlapping #ifdefs.
cpphs-1.2, release date 2006.05.04
By HTTP:
.tar.gz,
.zip,
Windows binary.
- Re-arranged the source files into hierarchical libraries.
- Exposed the library interface as an installable Cabal package, with Haddock documentation.
- Added the --unlit option, for removing literate-style comments.
cpphs-1.1, release date 2005.10.14
By HTTP:
.tar.gz,
.zip.
- Fixed the .cabal way of building cpphs.
- Updated the version number reported by --version (forgotten in 1.0, which still mistakenly reports 0.9).
- No longer throws an error on an empty file.
cpphs-1.0, release date 2005.10.05
By HTTP:
.tar.gz,
.zip.
- Included the cpphs.compat script for argument compatibility with the original cpp.
- Placed quotes around replacements for special macros __FILE__, __DATE__, and __TIME__.
- If no files are specified, read from stdin.
- Ignore #! lines (e.g. in scripts)
- Parse -D commandline options consistently with cpp, i.e. -Dfoo means foo=1
- Fix compatibility with preprocessors like hsc2hs, which use non-cpp directives like #def. They are now passed through to the output with a warning to stderr.
cpphs-0.9, release date 2005.03.17
By HTTP:
.tar.gz,
.zip.
- Bugfix for ghc-6.4 -O: flush output buffer.
cpphs-0.8, release date 2004.11.14
By HTTP:
.tar.gz,
.zip.
- Added the --text option, to signify the input should not be lexed as Haskell. This causes macros to be defined or expanded regardless of their location within comments, string delimiters, etc.
- Shuffled some source files around - there is now a runhugs script to invoke cpphs nicely.
cpphs-0.7, release date 2004.09.01
By HTTP:
.tar.gz,
.zip.
- Enable the __FILE__, __LINE__, __DATE__, and __TIME__ specials, which can be useful for creating DIY error messages.
cpphs-0.6, release date 2004.07.30
By HTTP:
.tar.gz,
.zip.
- Recognise and ignore the #pragma cpp directive.
- Fix beginning-of-file bug, where in --noline mode, a #line cpp directive appeared at the top of the output file.
- Fix chained parenthesised boolean exprs in #if, e.g.
#if ( foo ) && ( bar )
- Fix precedence in chained unparenthesised boolean exprs in #if, e.g.
#if foo && bar || baz && frob
- For better compatibility with cpp, and because otherwise there are certain constructs that cannot be expressed, we no longer permit whitespace in a #define between the symbol name and an opening parenthesis, e.g.
#define f (f' id)
Previously, this was interpreted as a parametrised macro, with arguments in the parens, and no expansion. Now, the space indicates that this is a textual replacement, and the parenthesised expression is in fact the replacement.
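The whitespace rule for #define can be stated as a small decision function. The sketch below is hypothetical and is not the cpphs parser: classifyDefine and DefineKind are invented names, and the function assumes the line begins with #define in the first column.

```haskell
module Main (main) where

import Data.Char (isAlphaNum, isSpace)
import Data.List (stripPrefix)

data DefineKind = TextualReplacement | ParameterisedMacro
  deriving (Eq, Show)

-- | Classify a #define line by what immediately follows the symbol name.
-- Returns Nothing when the line does not start with #define.
classifyDefine :: String -> Maybe DefineKind
classifyDefine line = do
  rest <- stripPrefix "#define" line
  -- Skip the blanks before the name, then the name itself.
  let afterName = dropWhile isNameChar (dropWhile isSpace rest)
  pure $ case afterName of
    ('(' : _) -> ParameterisedMacro   -- parenthesis directly after the name
    _         -> TextualReplacement   -- anything else, including a space
  where
    -- Haskell-style names may contain primes and backticks.
    isNameChar c = isAlphaNum c || c `elem` "_'`"

main :: IO ()
main = do
  print (classifyDefine "#define f (f' id)")    -- Just TextualReplacement
  print (classifyDefine "#define f(x) (f' x)")  -- Just ParameterisedMacro
```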
cpphs-0.5, release date 2004.06.07
By HTTP:
.tar.gz,
.zip.
- Added a --version flag to report the version number.
- Renamed --stringise to --hashes, and use it to turn on ## catenation as well.
- Bugfix for #if 1, previously interpreted as false.
- Bugfix for --noline: it no longer adds extra spurious newlines.
- File inclusion now looks in the directory of the calling file.
- Failure to find an include file is now merely a warning to stderr rather than an error.
- Added a --layout flag. Previously, line continuations in a macro definition were always preserved in the output, permitting use of the Haskell layout rule even inside a macro. The default is now to remove line continuations for conformance with cpp, but the option of using --layout is still possible.
cpphs-0.4, release date 2004.05.19
By HTTP:
.tar.gz,
.zip.
- New flag -Ofile to redirect output
- Bugfix for precedence of ! in #if !False && False
- Bugfix for whitespace permitted between # and if
- Bugfix for #define F "blah"; #include F
cpphs-0.3, release date 2004.05.18
By HTTP:
.tar.gz,
.zip.
Fix recursive macro expansion bug. Added option to strip C comments. Added option to recognise the # stringise operator.
cpphs-0.2, release date 2004.05.15
By HTTP:
.tar.gz,
.zip.
Implements textual replacement and macro expansion.
cpphs-0.1, release date 2004.04.07
By HTTP:
.tar.gz,
.zip.
Initial release: implements conditional compilation and file inclusion only.
Building instructions
To build cpphs, use
hmake cpphs [-package base]
or
ghc --make cpphs [-o cpphs]
or
mv cpphs.hugs cpphs # a simple runhugs script
You will notice that the command-line arguments for cpphs are not the same as for the original cpp. If you want to use cpphs as a completely drop-in replacement for the real cpp, that is, to accept the same arguments, and have broadly the same behaviour in response to them, then use the --cpp compatibility option as the first commandline flag.
Differences from cpp
In general, cpphs is based on the -traditional behaviour, not ANSI C, and has the following main differences from the standard cpp.
General
- The # that introduces any cpp directive must be in the first column of a line (whereas ANSI permits whitespace before the #).
- Generates the #line n "filename" syntax, not the # n "filename" variant.
- C comments are only removed from within cpp directives. They are not stripped from other text. Consider for instance that in Haskell, all of the following are valid operator symbols: /* */ */* However, you can turn on C-comment removal with the --strip option.
- Macros are never expanded within Haskell comments, strings, or character constants, unless you give the --text option to disable lexing the input as Haskell.
- Macros are always expanded recursively, unlike ANSI, which detects and prevents self-recursion. For instance, #define foo x:foo expands foo once only to x:foo in ANSI, but in cpphs it becomes an infinite list x:x:x:x:..., i.e. cpphs does not terminate.
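The difference in recursion behaviour can be seen in a toy model that treats a textual replacement as a list of tokens. This is a simplified sketch under invented names (Defs, expandAnsi, expandAlways), not the algorithm of either preprocessor. The ANSI-style expander refuses to re-expand a name while its own expansion is in progress; the other always re-expands, which for a self-referential definition yields an unbounded token stream (observed lazily here, whereas a real run would not terminate).

```haskell
module Main (main) where

-- | Textual replacements: a name and the tokens it stands for.
type Defs = [(String, [String])]

-- | ANSI-style: a name is left alone while it is already being expanded.
expandAnsi :: Defs -> [String] -> [String]
expandAnsi defs = go []
  where
    go active = concatMap (expandTok active)
    expandTok active tok
      | tok `elem` active = [tok]
      | otherwise = maybe [tok] (go (tok : active)) (lookup tok defs)

-- | Always re-expand, as described for cpphs above.
expandAlways :: Defs -> [String] -> [String]
expandAlways defs = concatMap expandTok
  where
    expandTok tok = maybe [tok] (expandAlways defs) (lookup tok defs)

-- | Models: #define foo x:foo
example :: Defs
example = [("foo", ["x", ":", "foo"])]

main :: IO ()
main = do
  putStrLn (unwords (expandAnsi example ["foo"]))
  -- Only a finite prefix of the unbounded expansion is printed.
  putStrLn (unwords (take 9 (expandAlways example ["foo"])) ++ " ...")
```

The first line printed is x : foo, and the second is the first nine tokens of the endless x : x : x : ... sequence.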
Macro definition language
- Accepts /**/ for token-pasting in a macro definition. However, /* */ (with any text between the open/close comment) inserts whitespace.
- The ANSI ## token-pasting operator is available with the --hashes flag. This is to avoid misinterpreting any valid Haskell operator of the same name.
- Replaces a macro formal parameter with the actual, even inside a string (double or single quoted). This is -traditional behaviour, not supported in ANSI.
- Recognises the # stringisation operator in a macro definition only if you use the --hashes option. (It is an ANSI addition, only needed because quoted stringisation (above) is prohibited by ANSI.)
- Preserves whitespace within a textual replacement definition exactly (modulo newlines), but leading and trailing space is eliminated.
- Preserves whitespace within a macro definition (and trailing it) exactly (modulo newlines), but leading space is eliminated.
- Preserves whitespace within macro call arguments exactly (including newlines), but leading and trailing space is eliminated.
- With the --layout option, line continuations in a textual replacement or macro definition are preserved as line-breaks in the macro call. (Useful for layout-sensitive code in Haskell.)
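The -traditional substitution of formal parameters inside strings, listed above, gives a cheap way to label a value with the expression that produced it. The module below is a sketch with invented names (LABELLED, labelled). Going by the description above, cpphs should turn the string into "1 + 2 = " so that the program prints 1 + 2 = 3, whereas an ANSI preprocessor would leave the string alone and print e = 3.

```haskell
{-# LANGUAGE CPP #-}
-- A sketch only: LABELLED and labelled are invented names.
module Main (main) where

-- The formal parameter e occurs inside the string literal and outside it.
#define LABELLED(e) ("e = " ++ show (e))

labelled :: String
labelled = LABELLED(1 + 2)

main :: IO ()
main = putStrLn labelled
```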
cpphs as a library
You can use cpphs as a library from within a Haskell program. The main interface is in Language.Preprocessor.Cpphs, and Haddock documentation is available for it. To make the library available to your Haskell compiler, you must install the cpphs package using Cabal.
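A minimal sketch of calling the library follows. It assumes a recent release of the cpphs package, in which runCpphs has the type CpphsOptions -> FilePath -> String -> IO String and the options records have the fields used here; the API was refactored between releases, so check the Haddock documentation for the version you have installed. The input text and the file name Example.hs are invented for illustration.

```haskell
-- A sketch only, assuming a recent cpphs release installed via Cabal.
module Main (main) where

import Language.Preprocessor.Cpphs
  ( BoolOptions (..)
  , CpphsOptions (..)
  , defaultBoolOptions
  , defaultCpphsOptions
  , runCpphs
  )

-- Hypothetical input text to be pre-processed.
source :: String
source = unlines
  [ "#ifdef DEBUG"
  , "level = \"debug\""
  , "#else"
  , "level = \"quiet\""
  , "#endif"
  ]

-- Intended to correspond roughly to: cpphs -DDEBUG --noline
-- "Example.hs" is only the file name reported in messages.
preprocess :: String -> IO String
preprocess = runCpphs options "Example.hs"
  where
    options = defaultCpphsOptions
      { defines  = [("DEBUG", "1")]
      , boolopts = defaultBoolOptions { locations = False }
      }

main :: IO ()
main = preprocess source >>= putStr
```

Setting locations to False is meant to suppress #line directives, so the output should contain only the level = "debug" definition and blank lines where the directives were.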
Contacts
I am interested in hearing your feedback on cpphs. Bug reports especially welcome. You can send feature requests too, but I won't guarantee to implement them if they depart much from the ordinary cpp's behaviour. Please mail
Copyright: © 2004-2007 Malcolm Wallace, except for ParseLib (Copyright © 1995 Graham Hutton and Erik Meijer)
License: The library modules in cpphs are distributed under the terms of the LGPL (see file LICENCE-LGPL for more details). If that's a problem for you, contact me to make other arrangements. The application module 'cpphs.hs' itself is GPL (see file LICENCE-GPL).
This software comes with no warranty. Use at your own risk.