There are many mirrors of the Central repository out there, but they are mostly under-utilized. I believe this occurs for two reasons:
- Users don't know they exist - it's not easy to find a good source for these URLs and locations.
- Users don't have confidence in the mirrors - they don't know how frequently the mirrors are updated, and they have no easy way to validate that the files match those on Central.
In Nexus 1.3, we have introduced new functionality to solve both of those problems.
Nexus and Repository Mirrors
For each proxy repository, Nexus can now be configured with an ordered list of URLs to use as mirrors of the remote repository. Currently, only one lookup strategy is provided, but the mechanism is extensible and additional strategies will be provided in later versions.
In the current strategy, when Nexus attempts to retrieve an artifact from a proxy repository, it uses the first mirror in the list that is available and not currently blacklisted. Only the artifacts themselves are retrieved from the mirror; the hashes and signatures are retrieved in parallel from the "Canonical Repository" (the master instance; for example, http://repo1.maven.org/maven2 is the Canonical Central repository URL). If the artifact does not match the hashes or is not available on the mirror, Nexus attempts to resolve the artifact directly from the Canonical Repository URL. If a repository mirror is not available, it is blacklisted for 30 minutes and doesn't participate in lookups during that period.
The effect of this strategy is that even if you list 4 mirrors of a repository, only the first one will be used (provided it isn't currently blacklisted). The reason we did this is to reduce the impact on the lookup time (we short-circuit to the Canonical URL instead of crawling all the mirrors for a file that might never be there), and to keep the initial code simple and well tested.
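The strategy described above can be sketched roughly as follows. This is an illustration only, not Nexus source code: the class `MirrorLookupSketch`, the `Fetcher` interface, and all method names are hypothetical. The sketch also assumes that a `.sha1` file holds just the hex digest, that an unreachable host is signalled with an `UncheckedIOException`, and it fetches the checksum sequentially rather than in parallel to keep the code short.

```java
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the "first available mirror" strategy; not Nexus code. */
class MirrorLookupSketch {

    /** Hypothetical transport: empty if the file is missing, UncheckedIOException if the host is down. */
    interface Fetcher {
        Optional<byte[]> fetch(String baseUrl, String path);
    }

    static final Duration BLACKLIST_PERIOD = Duration.ofMinutes(30);

    private final String canonicalUrl;
    private final List<String> mirrors; // ordered as configured
    private final Fetcher fetcher;
    private final Map<String, Instant> blacklistedUntil = new HashMap<>();

    MirrorLookupSketch(String canonicalUrl, List<String> mirrors, Fetcher fetcher) {
        this.canonicalUrl = canonicalUrl;
        this.mirrors = mirrors;
        this.fetcher = fetcher;
    }

    /** Returns the first mirror in the list that is not blacklisted at the given time. */
    Optional<String> selectMirror(Instant now) {
        for (String mirror : mirrors) {
            Instant until = blacklistedUntil.get(mirror);
            if (until == null || !now.isBefore(until)) {
                return Optional.of(mirror);
            }
        }
        return Optional.empty();
    }

    /** Tries one mirror for the artifact, then short-circuits to the canonical URL. */
    Optional<byte[]> retrieve(String path, Instant now) {
        // The checksum always comes from the canonical repository.
        Optional<byte[]> expected = fetcher.fetch(canonicalUrl, path + ".sha1");
        Optional<String> mirror = selectMirror(now);
        if (mirror.isPresent() && expected.isPresent()) {
            String expectedHex = new String(expected.get(), StandardCharsets.UTF_8).trim();
            try {
                Optional<byte[]> artifact = fetcher.fetch(mirror.get(), path);
                if (artifact.isPresent() && sha1Hex(artifact.get()).equalsIgnoreCase(expectedHex)) {
                    return artifact;
                }
                // Missing or mismatching artifact: fall through without blacklisting.
            } catch (UncheckedIOException unreachable) {
                // Mirror is unavailable: skip it for the next 30 minutes.
                blacklistedUntil.put(mirror.get(), now.plus(BLACKLIST_PERIOD));
            }
        }
        // No other mirrors are crawled; go straight to the canonical repository.
        return fetcher.fetch(canonicalUrl, path);
    }

    /** Lower-case hex SHA-1 digest of the given bytes. */
    static String sha1Hex(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-1").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

Note that in this sketch a missing or corrupt file on the mirror does not blacklist it; only an unreachable mirror is blacklisted, which is what lets the second mirror in the list take over until the 30 minutes have passed.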
We believe the addition of this simple functionality will have a dramatic impact on throughput, particularly in areas like Europe where there is a mirror nearby, without exposing you to any outage at the mirror. It also means that even if the mirror closest to you is slightly out of date, this will be transparent, because the latest data is still pulled from the Canonical repo.
The configuration of the mirrors can leverage the newly defined repository metadata, which allows a repository to declare known mirrors of itself, along with details about its contents (snapshots vs. releases, and information about aggregated repositories). There is a new mirror tab in the repository configuration panel. When you open it, you can add and reorder the mirror URLs you would like to use. If the repository publishes the metadata, the Nexus UI will offer you the declared list of mirrors to choose from.
Written by Brian Fox
Brian Fox is a software developer, innovator and entrepreneur. He is an active contributor within the open source development community, most prominently as a member of the Apache Software Foundation and former Chair of the Apache Maven project. As the CTO and co-founder of Sonatype, he is focused on building a platform for developers and DevOps professionals to build high-quality, secure applications with open source components.