Description
Scala has done quite well so far without any preprocessor, but in some situations it would be quite handy to just drop an #ifdef or #include into the source code. Let's resist this temptation (of using cpp) and focus instead on solving the actual problems that we have without adding too much complexity.
Goals
- Conditional compilation which is more fine-grained than conditional source files.
- Well integrated into the compiler: No change to build toolchains required. Positions work normally.
Non-goals
- Lexical macros
- Template expansion
- Advanced predicate language
Status quo in Scala
- Conditional source files
- Code generation
- Various code generation tools in use: Plain Scala code, FMPP, M4, etc.
- https://github.com/sbt/sbt-buildinfo as a lightweight alternative for getting config values into source code
All of these require build tool support. Conditional source files are supported out of the box (for simple cross-versioning in sbt) or are relatively easy to add manually. sbt-buildinfo is also ready to use. Code generation is more difficult to implement. Different projects use various ad-hoc solutions.
Conditional compilation in other languages
C
Using the C preprocessor (cpp):
- Powerful
- Low-level
- Error-prone (macro expansion, hygiene)
- Solves many problems (badly) that Scala doesn't have (e.g. imports, macros)
HTML
- Allows simple conditional processing
- Dangerous errors possible when not supported by tooling (because it appears to be backwards compatible but is really not)
Rust
Built-in conditional compilation:
- Predicates are limited to key==value checks, exists(key), any(ps), all(ps), not(p)
- Configuration options set by the build system (some automatically, like platform and version, others user-definable)
- Keys are not unique (i.e. every key is associated with a set of values)
- 3 ways of conditional compilation:
  - cfg attribute (annotation in Scala) allowed where other attributes are allowed
  - cfg_attr generates attributes conditionally
  - cfg macro includes config values in the source code
- Syntactic processing: Excluded source code must be parseable
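A predicate language of this shape is small enough to model directly. The following is a minimal sketch in Scala, not Rust's or scalac's implementation; the names `Pred`, `KeyEquals`, `Exists`, `AnyOf`, `AllOf`, `Not` and `eval` are hypothetical and chosen only for illustration. It assumes the configuration is a map from each key to its set of values.

```scala
// Hypothetical model of a Rust-style cfg predicate language.
// None of these names come from rustc or scalac.
object CfgPredicates {
  // Keys are not unique: every key is associated with a set of values.
  type Config = Map[String, Set[String]]

  sealed trait Pred
  final case class KeyEquals(key: String, value: String) extends Pred
  final case class Exists(key: String) extends Pred
  final case class AnyOf(ps: List[Pred]) extends Pred
  final case class AllOf(ps: List[Pred]) extends Pred
  final case class Not(p: Pred) extends Pred

  // Evaluates a predicate against a configuration.
  def eval(p: Pred, cfg: Config): Boolean = p match {
    case KeyEquals(k, v) => cfg.getOrElse(k, Set.empty[String]).contains(v)
    case Exists(k)       => cfg.get(k).exists(_.nonEmpty)
    case AnyOf(ps)       => ps.exists(eval(_, cfg))
    case AllOf(ps)       => ps.forall(eval(_, cfg))
    case Not(inner)      => !eval(inner, cfg)
  }
}
```

Because the language has no comparison operators or user-defined functions, evaluation is a plain structural recursion with no need for a typechecker.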
Java
- No preprocessor or conditional compilation support
- static final boolean flags can be used for conditional compilation of well-typed code
- Various preprocessing hacks based on preprocessor tools or conditional comments are used in practice
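The flag-based approach carries over to Scala through constant value definitions. The sketch below is only an illustration with hypothetical names (`Flags`, `Debug`, `Logger`). The point it shows is the limitation: both branches must typecheck, so this technique cannot hide code that does not compile against the current library or language version.

```scala
object Flags {
  // A `final val` with a literal right-hand side and no type annotation
  // is a constant value definition, so uses of it can be inlined.
  final val Debug = false
}

object Logger {
  // Both branches are typechecked. Whether the dead branch is dropped from
  // the bytecode is up to the compiler and is assumed here, not guaranteed.
  def describe(x: Int): String =
    if (Flags.Debug) s"debug: x=$x"
    else s"x=$x"
}
```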
Haskell
Conditional compilation is supported by Cabal:
- Using cpp with macros provided by Cabal for version-specific compilation
Design space
At which level should conditional compilation work?
- Before parsing: This keeps the config language separate from Scala. It is the most powerful option that allows arbitrary pieces of source code to be made conditional (or replaced by config values) but it is also difficult to reason about and can be abused to create very unreadable code.
- After lexing: This option is taken by cpp (at least conceptually by using the same lexer as C, even when implemented by a separate tool). It avoids some of the ugly corner cases of the first option (like being able to make the beginning or end of a comment conditional) while still being very flexible. An implementation for Scala would probably be limited to the default tokenizer state (i.e. no conditional compilation within XML expressions or string interpolation). Tokenization rules do not change very often or very much, so cross-compiling to multiple Scala versions should be easy.
- After parsing: This is the approach taken by Rust. It limits what can be made conditional (e.g. only single methods but not groups of multiple methods with a single directive) and requires valid syntax in all conditional parts. It cannot be used for version-dependent compilation that requires new syntax not supported by the older versions. An additional concern for Scala is the syntax. Using annotations like in Rust is possible but it would break existing Scala conventions that annotations must not change the interpretation of source code. It is also much harder to justify now (rather than from the beginning when designing a new language) because old tools would misinterpret source code that uses this new feature.
- After typechecking: This is too limiting in practice and can already be implemented (either using macros or with Scala's optimizer and compile-time constants, just like in Java).
From my experience of cross-compiling Scala code and using conditional source directories, I think that option 3 is sufficiently powerful for most use cases. However, if we have to add a new syntax for it anyway (instead of using annotations), option 2 is worth considering.
Which features do we need?
Rust's cfg attribute + macro combination looks like a good solution for most cases. I don't expect a big demand for conditional annotations, so we can probably skip cfg_attr. The cfg macro can be implemented as a (compiler-intrinsic) macro in Scala, the attribute will probably require a dedicated syntax.
Sources of config options
Conditions for conditional compilation can be very complex. There are two options where this complexity can be expressed:
- Keep the predicates in the Scala sources simple (e.g. only key==value checks), requiring the additional logic to be put into the build definition.
- Or keep the build definition simple and allow more complexity in the predicates.
I prefer the first option. We already have a standard build tool which allows arbitrary Scala code to be run as part of the build definition. Other build tools have developed scripting support, too. The standalone scalac tool would not have to support anything more than allowing configuration options to be set from the command line. We should consider some predefined options, but even in seemingly simple cases (like the version number) this could quickly lead to a demand for a more complex predicate language.
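To make the command-line side concrete, here is a minimal sketch of turning a list of settings into a configuration. The `key=value` spelling and the names `ConfigOptions` and `parse` are assumptions for illustration; the proposal does not fix a flag syntax. Repeated keys accumulate, matching the set-valued keys described for Rust.

```scala
// Hypothetical parser for "key=value" config settings passed to a compiler.
object ConfigOptions {
  type Config = Map[String, Set[String]]

  // Repeated keys accumulate their values into a set.
  def parse(args: Seq[String]): Config =
    args.foldLeft(Map.empty[String, Set[String]]) { (cfg, arg) =>
      arg.split("=", 2) match {
        case Array(k, v) if k.nonEmpty =>
          cfg.updated(k, cfg.getOrElse(k, Set.empty[String]) + v)
        case _ =>
          throw new IllegalArgumentException(s"expected key=value, got: $arg")
      }
    }
}
```

For example, `ConfigOptions.parse(Seq("binaryVersion=2.13", "feature=a", "feature=b"))` yields one value for `binaryVersion` and two for `feature`.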
Activity
dwijnand commented on Jul 2, 2019
Looks really good, Stefan!
Do you think you could expand a bit on what is meant by Rust's cfg attribute and macro behaviour? Either just describe it or better yet with examples. Thanks!
lrytz commented on Jul 2, 2019
Yes, very nice writeup! Thanks for doing the hard work and not just dumping out some syntax ideas :-)
szeiger commented on Jul 2, 2019
The cfg annotation (or "attribute" in Rust) conditionally enables a piece of code (where an attribute is allowed, e.g. a function definition but not arbitrary places). In Scala it could look similar, with a predicate that checks a config option such as binaryVersion. Config options live in a namespace which is distinct from any regular one in Scala code. These annotations are processed logically after the parser but before the typer (probably not quite so in practice because I expect you'll need to do some typing just to recognize the name cfg) so the disabled versions of the method have to parse but not typecheck.
The cfg macro provides a way to bring config values into Scala terms. Values produced by the macro are expanded into literals at compile time.
szeiger commented on Jul 4, 2019
A possible way to avoid the namer issue (especially at the top level) without too much complexity would be a new annotation-like syntax like @if(...). This would also allow us to avoid the quotes and instead treat all names within the predicate as config names.
lrytz commented on Jul 4, 2019
Could this express, for example
- import a.X else import b.X
- class A extends X else class A extends Y
Do we need / want that? :-)
szeiger commented on Jul 4, 2019
In the scheme with the simple predicate language more complex predicates like binaryVersion > 2.13 need to be split up into a flag that can be checked by the predicate and some code in the build script to compute the flag. Additional operators could be added to the predicate language (but not user-definable).
I don't think normal annotations can be used on imports at the moment but this should be easy to add (especially if we go with an annotation-like special syntax instead of a real annotation).
The macro could replace sbt-buildinfo. We're adding a standard way of defining config variables and passing them to the compiler. I think it makes sense to use this mechanism for reifying them at the term level if we already have it.
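As an illustration of splitting a comparison like binaryVersion > 2.13 into a build-computed flag, the following sketch derives a boolean-valued option from the version string. It is plain Scala that could run inside a build definition; the names `BuildFlags`, `flagsFor` and the `after213` key are hypothetical and not part of the prototype.

```scala
// Hypothetical build-side helper: the ordering logic lives here so that the
// predicate in the source file only needs a key==value check.
object BuildFlags {
  // Parses the leading "major.minor" of a version string such as "2.13".
  private def majorMinor(v: String): (Int, Int) = v.split('.') match {
    case Array(maj, min, _*) => (maj.toInt, min.toInt)
    case _ => throw new IllegalArgumentException(s"unexpected version: $v")
  }

  def flagsFor(binaryVersion: String): Map[String, Set[String]] = {
    val (maj, min) = majorMinor(binaryVersion)
    val newer = maj > 2 || (maj == 2 && min > 13)
    Map(
      "binaryVersion" -> Set(binaryVersion),
      "after213"      -> Set(newer.toString) // checked in source as after213 == "true"
    )
  }
}
```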
lrytz commented on Jul 4, 2019
Thanks!
Can you think of cases where the annotation based syntax would not work well enough? My example above is a superclass, that could be worked around with a type alias. But for example if I want to add a parent conditionally (and not extend anything in the other case), I don't see how that could be done (except making two copies of the entire class).
szeiger commented on Jul 4, 2019
You can always extend AnyRef or Any. This doesn't work anymore if you need to pass arguments to the superclass. You'd have to write two separate versions.
szeiger commented on Jul 5, 2019
Here's my prototype so far: https://github.com/szeiger/scala/tree/wip/preprocessor
I'm not quite happy with the set-based key/value checks. It doesn't feel correct with Scala syntax.
Supporting imports will need a bit of refactoring in the parser. It's not as straightforward to add as I had hoped.
I wanted to try it with collections-compat but discovered that annotations do not work for package objects. This is also a limitation of the parser, so it affects my pseudo-annotations as well. I'm not sure if this is intentional or a bug. Fixing it should be on the same order of difficulty as supporting imports.
Except for these limitations it should be fully functional.
lrytz commented on Jul 8, 2019
The patch has
so supporting annotation ascriptions is planned, right?
szeiger commented on Jul 8, 2019
I assume it's trivial to implement but didn't get around to testing it yet.
szeiger commented on Jul 8, 2019
Looks like the restriction on disallowing annotations in general for package objects is intentional: https://www.scala-lang.org/files/archive/spec/2.13/09-top-level-definitions.html#compilation-units. But since @if is not a real annotation we can special-case it for package objects the same way as for imports.
szeiger commented on Jul 8, 2019
The latest update supports imports, package objects and annotated expressions.
szeiger commented on Jul 9, 2019
Here's a version of scala-collection-compat that does all the conditional compilation with the preprocessor: https://github.com/szeiger/scala-collection-compat/tree/wip/preprocessor-test. This shows the limits of what is possible. In practice I would probably keep 2.13 completely separate but use conditional compilation for the small differences between 2.11 and 2.12.