Yarn Plug ‘n’ Play: Time to say goodbye to node_modules?

node_modules — the biggest mystery of computer science

With the release of its next major version, Yarn 2.0 (codenamed Berry), the maintainers are bringing in a lot of new features. One of them tries to get rid of the infamous node_modules directory.

Enter Yarn 2.0’s Plug’n’Play!

Whenever you want to install your application dependencies mentioned in the package.json file, you would run:

yarn install

…and voilà! You get a huge node_modules directory sitting at your application’s root. It is the collective pool of all the packages and libraries that your application and its dependencies need to function. But the sheer size of this directory causes several problems:

  • When trying to access any dependency in the application, it’s Node’s job to search for the files and retrieve them. And this is how Node does that:

“Does this file exist here? No? Let’s look in the parent node_modules then. Does it exist here? Still no? Too bad...”, and it keeps going until the right one is found.

  • Generating this huge directory takes, on average, 70% of the running time of the yarn install command. Even already-installed packages don’t help much, because a diffing pass has to run first.
  • Because the install is essentially an I/O operation that writes a very large number of files, there is almost no room left to optimize it.
  • Even at runtime, Node’s resolution takes time to locate packages and libraries, which slows down the application’s boot time.
  • The design of node_modules doesn’t allow package managers to fully dedupe the installed packages, so some packages end up instantiated multiple times in memory.
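
To make the lookup from the first point concrete, here is a minimal TypeScript sketch of the directories Node probes, in order, when a bare package name is required from a given folder. The function name nodeModulesLookupPaths is hypothetical and the sketch assumes POSIX-style absolute paths. Node’s real resolver does considerably more (package.json fields, file extensions and so on).

```typescript
// Hypothetical helper: lists the node_modules directories that would be
// checked, from the requiring folder up to the filesystem root.
function nodeModulesLookupPaths(fromDir: string): string[] {
  const segments = fromDir.split("/").filter((s) => s.length > 0);
  const candidates: string[] = [];

  // Walk from the deepest folder up to the root ("/").
  for (let depth = segments.length; depth >= 0; depth--) {
    // A folder that is itself named node_modules is skipped, so that
    // ".../node_modules/node_modules" is never probed.
    if (depth > 0 && segments[depth - 1] === "node_modules") continue;

    const prefix = segments.slice(0, depth).join("/");
    candidates.push(prefix === "" ? "/node_modules" : `/${prefix}/node_modules`);
  }
  return candidates;
}

// Every entry is a directory that may have to be checked on disk
// before the package is finally found.
console.log(nodeModulesLookupPaths("/home/me/app/src"));
// [ "/home/me/app/src/node_modules", "/home/me/app/node_modules",
//   "/home/me/node_modules", "/home/node_modules", "/node_modules" ]
```

The deeper the requiring file sits, the more directories may have to be checked before a package is found.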

To overcome these node_modules bottlenecks, Yarn introduced the Plug’n’Play (PnP) mode for managing dependencies.

Although the feature was introduced earlier, Yarn 2.0 makes it the default mode.

The basic idea behind this mode is: since Yarn already knows the dependency tree and installs the packages for your application, why shouldn’t it also take care of locating and retrieving those packages?

So, in PnP mode, the whole node_modules is removed and a single file, called .pnp.js, is generated by yarn. This file contains a map linking a package name and version to a location on the disk, and another map linking a package name and version to its set of dependencies.

It then becomes Yarn’s responsibility to tell Node exactly where to look for the files being required.
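
The two maps can be pictured with a small TypeScript sketch. This is only an illustration of the idea, not the actual format of the generated .pnp.js file. The package names, versions, paths and the resolveToLocation function are all made up for the example.

```typescript
// Map 1: "name@version" -> location on disk (paths are invented).
const packageLocations: Record<string, string> = {
  "my-app@1.0.0": "/projects/my-app/",
  "react@16.12.0": "/yarn-cache/react-16.12.0/",
  "object-assign@4.1.1": "/yarn-cache/object-assign-4.1.1/",
};

// Map 2: "name@version" -> exact versions of its declared dependencies.
const packageDependencies: Record<string, Record<string, string>> = {
  "my-app@1.0.0": { react: "16.12.0" },
  "react@16.12.0": { "object-assign": "4.1.1" },
  "object-assign@4.1.1": {},
};

// Hypothetical resolver: given the package doing the require (the issuer)
// and the requested name, return the folder to load from. No directory
// walking is involved, only two table lookups.
function resolveToLocation(issuer: string, request: string): string {
  const deps = packageDependencies[issuer];
  if (deps === undefined) {
    throw new Error(`Unknown package: ${issuer}`);
  }
  const version = deps[request];
  if (version === undefined) {
    throw new Error(`${issuer} does not list ${request} as a dependency`);
  }
  const location = packageLocations[`${request}@${version}`];
  if (location === undefined) {
    throw new Error(`No known location for ${request}@${version}`);
  }
  return location;
}

console.log(resolveToLocation("my-app@1.0.0", "react"));
// "/yarn-cache/react-16.12.0/"
```

In this sketch, a request for a package the issuer never declared fails immediately instead of being searched for in parent folders.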

  • As only a single text file is generated instead of tens of thousands, installs are now pretty much instantaneous.
  • Installs are more stable and reliable due to reduced I/O operations, which are prone to fail (especially on Windows).
  • Faster application startup, because the Node resolution doesn’t have to iterate over the filesystem hierarchy nearly as much as before (and soon won’t have to do it at all!).

While PnP is the default in Yarn 2.0, you can try out the feature in Yarn 1.x as well. To enable PnP mode, set the installConfig.pnp flag in your package.json:

package.json
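
Assuming the flag is set as described above, the relevant part of package.json would look roughly like this (all other fields of your real file are omitted):

```json
{
  "installConfig": {
    "pnp": true
  }
}
```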

Running yarn install after that removes your node_modules directory and generates a .pnp.js file in its place.

While PnP mode may look like the elusive silver bullet for the node_modules problem, it has raised some eyebrows and divided the community over whether to adopt the mode by default in the new Yarn version or to reject Yarn 2.0 altogether. The main reasons behind this rift are:

  • While PnP mode removes node_modules altogether, some major existing tools, like Flow and React Native, still depend on the node_modules directory. This makes Yarn 2.0 incompatible with these widely used tools.
  • Even if an existing project doesn’t use these tools, migration is not simple and requires additional plugins to make the package resolvers in Webpack and TypeScript PnP-compatible.

Yarn 2.0 does provide a way to disable PnP mode: enable the built-in node-modules plugin by adding the following to your local .yarnrc.yml file before running a fresh yarn install:

.yarnrc.yml
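
In Yarn 2 this is done with the nodeLinker setting, so the file would presumably contain just this line:

```yaml
nodeLinker: "node-modules"
```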

But this ultimately defeats the purpose of making PnP the default so that the community adopts it.

While switching an existing application to PnP mode in Yarn 2.0 has its issues, the mode can readily be adopted when you create an application from scratch. You should go by the motto:

If you can use the Plug’n’Play feature, it is something you want to use in your application.

PnP ships as the default in Yarn 2.0 because it is part of Yarn’s master plan to enable “Zero Install” Node.js applications, with one specific goal:

To make Node projects as stable and fast as possible by removing the main source of entropy from the equation: Yarn itself.

This will be explained later in a separate article.