News and Notes from the Makers of Nexus | Sonatype Blog

Why do I need a binary repository manager?

Written by Ember DeBoer | January 30, 2020

This is an excerpt from Out of the Wild: A Beginner's Guide to Package and Dependency Management, a Sonatype Guide. This is the final installment, focusing on binary lifecycle management and the benefits of binary repository managers in enterprise software development. (Read part one and part two.)

What is a binary repository manager?

A binary repository stores binary files and artifacts generated during software development. Unlike Git and other source code repositories, binary repositories store and manage build artifacts like compiled code, libraries, executables, Docker images, and other output files from the development pipeline. They are often called artifact repositories or binary artifact repositories.

Artifact repositories streamline binary lifecycle management, including binary storage, distribution, management, and security. They are particularly useful in environments that use continuous integration (CI) and continuous delivery (CD) development workflows.

Why do I need a binary repository manager?

Binary repository managers serve a couple of important functions as part of a modern software development life cycle.

First, they provide a local "proxy" repository for the language-specific package repositories and registries we discussed earlier. A proxy repository stores and caches your OSS components locally, so your builds don't download them directly from an online repository every time they run. As stated in our own Repository Management Basics course, this can provide benefits such as:

  • Increasing build performance due to a wider distribution of software and locally available parts.

  • Reducing network bandwidth and dependency on remote repositories.

  • Insulating your company from internet outages of public repositories (Maven Central, npm, etc.), or even removal of an open source component.
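
To make the caching behavior behind these benefits concrete, here is a minimal sketch of a proxy that fetches a component from a remote registry once and serves it locally afterwards. It is an illustration under stated assumptions, not how any particular repository manager is implemented: the names CachingProxy and FakeRegistry and the package coordinate are all hypothetical.

```python
from typing import Callable, Dict


class FakeRegistry:
    """Hypothetical stand-in for a public registry that can go offline."""

    def __init__(self, packages: Dict[str, bytes]) -> None:
        self.packages = dict(packages)
        self.online = True

    def fetch(self, coordinate: str) -> bytes:
        if not self.online:
            raise ConnectionError("remote registry unreachable")
        return self.packages[coordinate]


class CachingProxy:
    """Toy proxy repository: fetch upstream once, then serve from a local cache."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        self._fetch_upstream = fetch_upstream
        self._cache: Dict[str, bytes] = {}
        self.upstream_requests = 0

    def get(self, coordinate: str) -> bytes:
        # Cache hit: no network traffic and no dependency on the remote.
        if coordinate in self._cache:
            return self._cache[coordinate]
        # Cache miss: download once and remember the result.
        self.upstream_requests += 1
        data = self._fetch_upstream(coordinate)
        self._cache[coordinate] = data
        return data


registry = FakeRegistry({"example-lib@1.0.0": b"package bytes"})
proxy = CachingProxy(registry.fetch)

first = proxy.get("example-lib@1.0.0")   # downloaded from the remote registry
registry.online = False                  # simulate an outage or a removed package
second = proxy.get("example-lib@1.0.0")  # still served from the local cache
```

In this sketch the second request succeeds even though the simulated registry is offline, and only one upstream request is ever made, which is the bandwidth and insulation argument in miniature.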

In addition, repository managers serve as a "single source of truth" for the binaries used in your build processes.

At this stage, you may be asking yourself: why can't I just store my binaries where I store my source code? The short answer is that you can. But you probably won't want to once you understand how version or source control tools like Git differ from binary repository managers…

I use a version/source control management repository to store my source code. Why do I need a repository manager for my binaries?

As DZone's Refcard on Using Repository Managers concisely states, "Repository Managers are to binaries what source repositories or VCS (Version Control Systems) are to sources."

Authors Brian Fox and Carlos Sanchez go on to explain that binary files are much larger in size, and need a lot of metadata stored with them, such as their package name, version, license, etc. They also don't need to be diffed or cloned in the way that source code does.

Because of these differences, an artifact repository makes a lot more sense for storing binaries, whether they're the outputs of your build (.zip, .jar, .war, etc.), packages downloaded from an online registry, or Docker images.
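
As a rough illustration of the metadata point, the record a repository might keep alongside a binary could look something like the sketch below. The name, version, and license fields come from the description above; the size and SHA-256 checksum fields, the ArtifactRecord name, and the sample values are assumptions added for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactRecord:
    """Hypothetical metadata stored next to a binary artifact."""

    name: str
    version: str
    license_id: str
    size_bytes: int  # assumption: not mentioned in the article
    sha256: str      # assumption: not mentioned in the article


def describe(name: str, version: str, license_id: str, payload: bytes) -> ArtifactRecord:
    # The binary itself is opaque; what gets searched and compared is its metadata.
    return ArtifactRecord(
        name=name,
        version=version,
        license_id=license_id,
        size_bytes=len(payload),
        sha256=hashlib.sha256(payload).hexdigest(),
    )


record = describe("example-app", "1.0.0", "Apache-2.0", b"compiled bytes")
```

Notice that nothing here diffs the payload: two builds are told apart by their version and checksum, not by comparing their contents line by line as a source control system would.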

This thread on Stack Overflow also provides some clarification on how source code repositories and artifact repositories differ:

"In everyday use, you'd store your source code and its history in a git repository, and store your build artifacts (e.g. the compiled software you want to deliver) in Nexus."

So while proxy repositories are the best way to store open source packages downloaded from online registries, as we mentioned earlier, hosted repositories store your internal build artifacts, including snapshots and releases.
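
The snapshot and release split can be sketched as a simple routing rule. The example below assumes the Maven convention that snapshot versions end in "-SNAPSHOT", and it assumes a policy where releases cannot be overwritten while snapshots can; the repository names and the HostedRepositories class are hypothetical, and a real repository manager makes this policy configurable.

```python
from typing import Dict, Tuple


def target_repository(version: str) -> str:
    # Maven convention: in-progress builds carry a "-SNAPSHOT" suffix.
    return "hosted-snapshots" if version.endswith("-SNAPSHOT") else "hosted-releases"


class HostedRepositories:
    """Toy model of two hosted repositories for internal build artifacts."""

    def __init__(self) -> None:
        self._stores: Dict[str, Dict[Tuple[str, str], bytes]] = {
            "hosted-snapshots": {},
            "hosted-releases": {},
        }

    def deploy(self, name: str, version: str, payload: bytes) -> str:
        repo = target_repository(version)
        key = (name, version)
        # Assumed policy: a published release is immutable, a snapshot is not.
        if repo == "hosted-releases" and key in self._stores[repo]:
            raise ValueError(f"{name} {version} is already released")
        self._stores[repo][key] = payload
        return repo


repos = HostedRepositories()
snapshot_repo = repos.deploy("example-app", "1.1.0-SNAPSHOT", b"build 1")
repos.deploy("example-app", "1.1.0-SNAPSHOT", b"build 2")  # overwriting is allowed
release_repo = repos.deploy("example-app", "1.0.0", b"final build")
```

Deploying "example-app" 1.0.0 a second time would raise an error in this sketch, which reflects why a release repository can act as a trustworthy record of what was actually shipped.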

Lastly, artifact repository managers reduce risk in your build process. We alluded earlier to the risks you open yourself up to when specifying the "latest" version of a particular dependency, or even a version range, in your application-level package manager's manifest. Downloading unvetted versions directly from online registries presents more risk because bad actors are increasingly poisoning the well, injecting malicious code into libraries or removing them altogether.
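
The version range risk can be shown with a small, simplified sketch. It assumes plain dotted numeric versions and a range with an inclusive lower bound and an exclusive upper bound; real package managers have richer rules, and the function names and version numbers here are made up for illustration.

```python
from typing import List, Tuple


def parse(version: str) -> Tuple[int, ...]:
    # Simplification: only handles dotted integers such as "1.2.1".
    return tuple(int(part) for part in version.split("."))


def resolve_range(available: List[str], lower: str, upper: str) -> str:
    """Pick the highest available version v with lower <= v < upper."""
    candidates = [v for v in available if parse(lower) <= parse(v) < parse(upper)]
    if not candidates:
        raise LookupError("no version satisfies the range")
    return max(candidates, key=parse)


def resolve_pinned(available: List[str], pinned: str) -> str:
    """Pick exactly the requested version or fail."""
    if pinned not in available:
        raise LookupError(f"{pinned} is not available")
    return pinned


published = ["1.2.0", "1.2.1"]
before_range = resolve_range(published, "1.0.0", "2.0.0")
before_pin = resolve_pinned(published, "1.2.1")

published.append("1.3.0")  # a new, unvetted release appears upstream

after_range = resolve_range(published, "1.0.0", "2.0.0")
after_pin = resolve_pinned(published, "1.2.1")
```

Here the range silently moves from 1.2.1 to the newly published 1.3.0 on the next build, while the pinned version stays where it was until someone deliberately changes it.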

As Mykel Alvis explained in his Nexus User Conference presentation, a caching repository manager is what makes it possible to insulate yourself from the outages or vulnerabilities that may occur in such cases.


Sources