Automatic JavaScript file bundling and library consumption

Intro

With MediaWiki’s ResourceLoader, JavaScript files run without a module system, so each file has to export properties to the global scope and consume properties that other files have defined there.

This has caused problems both in the development workflow around the project’s JavaScript sources and in the use of JavaScript libraries from the open source ecosystem. Organizing the code, keeping it maintainable, and understanding how it is structured all take a heavy toll on the project’s quality, development and maintenance.

JS code defining global variables all around
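To make the pattern concrete, here is a minimal sketch of two files communicating through the global scope. The file names and the `myExt` namespace are hypothetical, chosen only for illustration; they are not taken from the actual codebase.

```javascript
// Hypothetical file: resources/greeting.js
// "Exports" by attaching a property to a shared global namespace.
globalThis.myExt = globalThis.myExt || {};
globalThis.myExt.greet = function (name) {
  return "Hello, " + name + "!";
};

// Hypothetical file: resources/init.js
// "Imports" by reading the global, silently assuming greeting.js already ran.
globalThis.myExt.message = globalThis.myExt.greet("wiki");
console.log(globalThis.myExt.message);
```

Nothing in the second file states that it depends on the first; that knowledge lives only in the reader's head and in the load order.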

Application JavaScript sources

The order in which the source files need to be loaded, so that all dependencies are satisfied, then has to be specified manually in extension.json, by file name. As such, there are two sources of truth:

The order in which the parts of your program are interpreted is defined outside of the program itself. At the file level, each file must be written with foreknowledge of which files have already been loaded, and that is defined elsewhere.

If you are doing anything mildly complex, you will end up with a big list of JavaScript files that depend on each other. You then have to reconstruct the dependency graph by hand from the list in a JSON file and from how each file uses, defines, or re-defines global variables, all while managing each module’s internal state. That can be quite a headache.

JSON configuration skyrockets as soon as you start doing anything interesting in the frontend
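The ordering problem described above can be reproduced in a few lines. This is a simplified simulation, not ResourceLoader itself: each hypothetical "file" is modeled as a function that reads and writes a shared scope, and `load` runs them in whatever order a manually maintained list says.

```javascript
// Each hypothetical "file" is a function that reads from and writes to a shared scope.
const files = {
  "model.js": (scope) => {
    scope.Model = { name: "page" };
  },
  "view.js": (scope) => {
    // Implicit dependency: assumes model.js has already defined scope.Model.
    scope.View = { title: "View of " + scope.Model.name };
  },
};

// Runs the files in the listed order, like a manually ordered script list.
function load(order) {
  const scope = {};
  for (const name of order) {
    files[name](scope);
  }
  return scope;
}

// Correct order: everything works.
const ok = load(["model.js", "view.js"]);
console.log(ok.View.title); // "View of page"

// Swapped order: the failure only shows up at runtime.
let failure = null;
try {
  load(["view.js", "model.js"]);
} catch (err) {
  failure = err;
}
console.log(failure instanceof TypeError); // true
```

The code in both "files" is identical in the two runs; only the externally defined order changed, which is why these bugs are easy to introduce and hard to spot.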

Changes to the source code, moving lines within a file, refactoring code into other or new files, adding new code that uses other files, removing code, …

All of these become really hard really quickly. In the end, organizing your code, keeping it maintainable, and understanding how it is structured come at a very high cost to the project’s quality, development and maintenance.

Checking that the files are still listed in the right order in the configuration, during and after your changes, is error-prone: forgetting or overlooking an implicit dependency in the manually ordered list leads to obscure runtime bugs.

Libraries

To consume libraries in the front-end code, we currently rely on pulling down npm dependencies or files from a website, manually or by script, and including them in the repository. This is cumbersome, hard to verify, and hard to keep up to date. When needed, it should be possible to specify dependencies with their versions and have them included automatically, from a trustworthy source, in the assets we serve. It should also be easy to check whether the libraries are outdated, and to update them.

Requirements

For developing JavaScript files for the front-end:

Solution

We considered our options, and after discussion among the engineers, we decided that:

You can read more details in the architecture design record “4. Use webpack”, which we wrote when we discussed this in the team.

Sources with ES modules

Why webpack and not <my-favorite-tool>?

We looked at the ecosystem of bundlers at the time, and this is a summary of our evaluation:

We are not married to webpack, nor do we use any webpack-specific features, so we can migrate away from it in the future if it becomes a problem.

Webpack production build

Results

These are some of the benefits we have seen after the introduction of these changes:

source-map-explorer view

Problems

The approach is a bit controversial since MediaWiki extensions usually don’t have build steps, so there was no easy setup for CI and the deployment pipeline to run a build step before running jobs or deploying.

As such, we ended up committing the built sources to the repository. That is fine given the CI step and the pre-commit hook mentioned before, but it has an annoying consequence: every time a patch is merged with the corresponding generated asset (resources/dist/*), any pending patches on Gerrit that also need to regenerate the asset (because they touch sources in src/) will now be in merge conflict with master.

We have discussed this on Phabricator and on wikitech-l:

Sadly, those discussions didn’t lead to any concrete steps. If you think you can help with this issue, we would really appreciate it, as we would like other projects, people and teams to be able to use build steps in their extensions.

Right now, we sidestep the issue with a bot we have configured for the repository that responds to the command rebase (parallel to the recheck command that jenkins-bot responds to). When someone comments with that command, the bot downloads the patch, rebases it on master, runs the build step, and submits a new patch without the conflict.

As an interim solution it works and allows us to move along, but if we want to adopt this process for other projects we would really like to have a more streamlined solution.

Conclusions

This change has worked very well for us, decreasing the cognitive load and allowing us to work more effectively on our JS files. We recommend it if you have many JS files or ResourceLoader modules, and the order and dependencies are causing you headaches.

We hope to work together on standardizing some sort of CI + deploy process so that projects in the MediaWiki ecosystem can use build steps to improve their workflows and leverage powerful tools.