Skip to content
Snippets Groups Projects
Code owners
Assign users and groups as approvers for specific file changes. Learn more.
Modules.rst 40.38 KiB

Modules

Warning

The functionality described on this page is supported for C and Objective-C. C++ support is experimental.

Introduction

Most software is built using a number of software libraries, including libraries supplied by the platform, internal libraries built as part of the software itself to provide structure, and third-party libraries. For each library, one needs to access both its interface (API) and its implementation. In the C family of languages, the interface to a library is accessed by including the appropriate header files(s):

#include <SomeLib.h>

The implementation is handled separately by linking against the appropriate library. For example, by passing -lSomeLib to the linker.

Modules provide an alternative, simpler way to use software libraries that provides better compile-time scalability and eliminates many of the problems inherent to using the C preprocessor to access the API of a library.

Problems with the current model

The #include mechanism provided by the C preprocessor is a very poor way to access the API of a library, for a number of reasons:

  • Compile-time scalability: Each time a header is included, the compiler must preprocess and parse the text in that header and every header it includes, transitively. This process must be repeated for every translation unit in the application, which involves a huge amount of redundant work. In a project with N translation units and M headers included in each translation unit, the compiler is performing M x N work even though most of the M headers are shared among multiple translation units. C++ is particularly bad, because the compilation model for templates forces a huge amount of code into headers.
  • Fragility: #include directives are treated as textual inclusion by the preprocessor, and are therefore subject to any active macro definitions at the time of inclusion. If any of the active macro definitions happens to collide with a name in the library, it can break the library API or cause compilation failures in the library header itself. For an extreme example, #define std "The C++ Standard" and then include a standard library header: the result is a horrific cascade of failures in the C++ Standard Library's implementation. More subtle real-world problems occur when the headers for two different libraries interact due to macro collisions, and users are forced to reorder #include directives or introduce #undef directives to break the (unintended) dependency.
  • Conventional workarounds: C programmers have adopted a number of conventions to work around the fragility of the C preprocessor model. Include guards, for example, are required for the vast majority of headers to ensure that multiple inclusion doesn't break the compile. Macro names are written with LONG_PREFIXED_UPPERCASE_IDENTIFIERS to avoid collisions, and some library/framework developers even use __underscored names in headers to avoid collisions with "normal" names that (by convention) shouldn't even be macros. These conventions are a barrier to entry for developers coming from non-C languages, are boilerplate for more experienced developers, and make our headers far uglier than they should be.
  • Tool confusion: In a C-based language, it is hard to build tools that work well with software libraries, because the boundaries of the libraries are not clear. Which headers belong to a particular library, and in what order should those headers be included to guarantee that they compile correctly? Are the headers C, C++, Objective-C++, or one of the variants of these languages? What declarations in those headers are actually meant to be part of the API, and what declarations are present only because they had to be written as part of the header file?

Semantic import

Modules improve access to the API of software libraries by replacing the textual preprocessor inclusion model with a more robust, more efficient semantic model. From the user's perspective, the code looks only slightly different, because one uses an import declaration rather than a #include preprocessor directive:

import std.io; // pseudo-code; see below for syntax discussion

However, this module import behaves quite differently from the corresponding #include <stdio.h>: when the compiler sees the module import above, it loads a binary representation of the std.io module and makes its API available to the application directly. Preprocessor definitions that precede the import declaration have no impact on the API provided by std.io, because the module itself was compiled as a separate, standalone module. Additionally, any linker flags required to use the std.io module will automatically be provided when the module is imported [1] This semantic import model addresses many of the problems of the preprocessor inclusion model:

  • Compile-time scalability: The std.io module is only compiled once, and importing the module into a translation unit is a constant-time operation (independent of module system). Thus, the API of each software library is only parsed once, reducing the M x N compilation problem to an M + N problem.
  • Fragility: Each module is parsed as a standalone entity, so it has a consistent preprocessor environment. This completely eliminates the need for __underscored names and similarly defensive tricks. Moreover, the current preprocessor definitions when an import declaration is encountered are ignored, so one software library can not affect how another software library is compiled, eliminating include-order dependencies.
  • Tool confusion: Modules describe the API of software libraries, and tools can reason about and present a module as a representation of that API. Because modules can only be built standalone, tools can rely on the module definition to ensure that they get the complete API for the library. Moreover, modules can specify which languages they work with, so, e.g., one can not accidentally attempt to load a C++ module into a C program.