Introducing Apache Maven
Chapter 1
Although there are a number of references for Maven online, there is no single, well-written narrative for introducing Maven that can serve as both an authoritative reference and an introduction. What we’ve tried to do with this effort is provide such a narrative coupled with useful reference material.
1.1 Maven… What is it?
The answer to this question depends on your own perspective. The great majority of Maven users are going to call Maven a “build tool”: a tool used to build deployable artifacts from source code. Build engineers and project managers might refer to Maven as something more comprehensive: a project management tool. What is the difference? A build tool such as Ant is focused solely on preprocessing, compilation, packaging, testing, and distribution. A project management tool such as Maven provides a superset of features found in a build tool. In addition to providing build capabilities, Maven can also run reports, generate a web site, and facilitate communication among members of a working team.
A more formal definition of Apache Maven: Maven is a project management tool which encompasses a project object model, a set of standards, a project lifecycle, a dependency management system, and logic for executing plugin goals at defined phases in a lifecycle. When you use Maven, you describe your project using a well-defined project object model, Maven can then apply cross-cutting logic from a set of shared (or custom) plugins.
Don’t let the fact that Maven is a "project management" tool scare you away. If you were just looking for a build tool, Maven will do the job. In fact, the first few chapters of this book will deal with the most common use case: using Maven to build and distribute your project.
1.2. Convention Over Configuration
Convention over configuration is a simple concept. Systems, libraries, and frameworks should assume reasonable defaults. Without requiring unnecessary configuration, systems should "just work". Popular frameworks such as Ruby on Rails and EJB3 have started to adhere to these principles in reaction to the configuration complexity of frameworks such as the initial EJB 2.1 specifications. An illustration of convention over configuration is something like EJB3 persistence: all you need to do to make a particular bean persistent is to annotate that class with @Entity. The framework assumes table and column names based on the name of the class and the names of the properties. Hooks are provided for you to override these default, assumed names if the need arises, but, in most cases, you will find that using the framework-supplied defaults results in a faster project execution.
Maven incorporates this concept by providing sensible default behavior for projects. Without customization, source code is assumed to be in ${basedir}/src/main/java and resources are assumed to be in ${basedir}/src/main/resources. Tests are assumed to be in ${basedir}/src/test, and a project is assumed to produce a JAR file. Maven assumes that you want the compile byte code to ${basedir}/target/classes and then create a distributable JAR file in ${basedir}/target. While this might seem trivial, consider the fact that most Ant-based builds have to define the locations of these directories. Ant doesn’t ship with any built-in idea of where source code or resources might be in a project; you have to supply this information. Maven’s adoption of convention over configuration goes farther than just simple directory locations, Maven’s core plugins apply a common set of conventions for compiling source code, packaging distributions, generating web sites, and many other processes. Maven’s strength comes from the fact that it is "opinionated", it has a defined life-cycle and a set of common plugins that know how to build and assemble software. If you follow the conventions, Maven will require almost zero effort - just put your source in the correct directory, and Maven will take care of the rest.
One side-effect of using systems that follow "convention over configuration" is that end-users might feel that they are forced to use a particular methodology or approach. While it is certainly true that Maven has some core opinions that shouldn’t be challenged, most of the defaults can be customized. For example, the location of a project’s source code and resources can be customized, names of JAR files can be customized, and through the development of custom plugins, almost any behavior can be tailored to your specific environment’s requirements. If you don’t care to follow convention, Maven will allow you to customize defaults in order to adapt to your specific requirements.
1.3 A Common Interface
Before Maven provided a common interface for building software, every single project had someone dedicated to managing a fully customized build system. Developers had to take time away from developing software to learn about the idiosyncrasies of each new project they wanted to contribute to. In 2001, you’d have a completely different approach to building a project like Turbine than you would to building a project like Tomcat. If a new source code analysis tool came out that would perform static analysis on source code, or if someone developed a new unit testing framework, everybody would have to drop what they were doing and figure out how to fit it into each project’s custom build environment. How do you run unit tests? There were a thousand different answers. This environment was characterized by a thousand endless arguments about tools and build procedures. The age before Maven was an age of inefficiency, the age of the "Build Engineer".
Today, most open source developers have used or are currently using Maven to manage new software projects. This transition is less about developers moving from one build tool to another and more about developers starting to adopt a common interface for project builds. As software systems have become more modular, build systems have become more complex, and the number of projects has sky-rocketed. Before Maven, when you wanted to check out a project like Apache ActiveMQ or Apache ServiceMix from Subversion and build it from source, you really had to set aside about an hour to figure out the build system for each particular project. What does the project need to build? What libraries do I need to download? Where do I put them? What goals can I execute in the build? In the best case, it took a few minutes to figure out a new project’s build, and in the worst cases (like the old Servlet API implementation in the Jakarta Project), a project’s build was so difficult it would take multiple hours just to get to the point where a new contributor could edit source and compile the project. These days, you check it out from source, and you run mvn install.
While Maven provides an array of benefits including dependency management and reuse of common build logic through plugins, the core reason why it has succeeded is that it has defined a common interface for building software. When you see that a project like Apache ActiveMQ uses Maven, you can assume that you’ll be able to check it out from source and build it with mvn install without much hassle. You know where the ignition keys goes, you know that the gas pedal is on the right-side, and the brake is on the left.
1.4. Universal Reuse through Maven Plugins
The core of Maven is pretty dumb, it doesn’t know how to do much beyond parsing a few XML documents and keeping track of a lifecycle and a few plugins. Maven has been designed to delegate most responsibility to a set of Maven Plugins which can affect the Maven Lifecycle and offer access to goals. Most of the action in Maven happens in plugin goals which take care of things like compiling source, packaging bytecode, publishing sites, and any other task which need to happen in a build. The Maven you download from Apache doesn’t know much about packaging a WAR file or running JUnit tests; most of the intelligence of Maven is implemented in the plugins and the plugins are retrieved from the Maven Repository. In fact, the first time you ran something like mvn install. with a brand-new Maven installation it retrieved most of the core Maven plugins from the Central Maven Repository. This is more than just a trick to minimize the download size of the Maven distribution, this is behavior which allows you to upgrade a plugin to add capability to your project’s build. The fact that Maven retrieves both dependencies and plugins from the remote repository allows for universal reuse of build logic.
The Maven Surefire plugin is the plugin that is responsible for running unit tests. Somewhere between version 1.0 and the version that is in wide use today someone decided to add support for the TestNG unit testing framework in addition to the support for JUnit. This upgrade happened in a way that didn’t break backwards compatibility. If you were using the Surefire plugin to compile and execute JUnit 3 unit tests, and you upgraded to the most recent version of the Surefire plugin, your tests continued to execute without fail. But, you gained new functionality, if you want to execute unit tests in TestNG you now have that ability. You also gained the ability to run annotated JUnit 4 unit tests. You gained all of these capabilities without having to upgrade your Maven installation or install new software. Most importantly, nothing about your project had to change aside from a version number for a plugin a single Maven configuration file called the Project Object Model (POM).
It is this mechanism that affects much more than the Surefire plugin. Maven has plugins for everything from compiling Java code, to generating reports, to deploying to an application server. Maven has abstracted common build tasks into plugins which are maintained centrally and shared universally. If the state-of-the-art changes in any area of the build, if some new unit testing framework is released or if some new tool is made available, you don’t have to be the one to hack your project’s custom build system to support it. You benefit from the fact that plugins are downloaded from a remote repository and maintained centrally. This is what is meant by universal reuse through Maven plugins.
1.5. Conceptual Model of a "Project"
Maven maintains a model of a project. You are not just compiling source code into bytecode, you are developing a description of a software project and assigning a unique set of coordinates to a project. You are describing the attributes of the project. What is the project’s license? Who develops and contributes to the project? What other projects does this project depend upon? Maven is more than just a "build tool", it is more than just an improvement on tools like make and Ant, it is a platform that encompasses a new semantics related to software projects and software development. This definition of a model for every project enables such features as:
Dependency Management
Because a project is defined by a unique set of coordinates consisting of a group identifier, an artifact identifier, and a version, projects can now use these coordinates to declare dependencies.
Remote Repositories
Related to dependency management, we can use the coordinates defined in the Maven Project Object Model (POM) to create repositories of Maven artifacts.
Universal Reuse of Build Logic
Plugins contain logic that works with the descriptive data and configuration parameters defined in Project Object Model (POM); they are not designed to operate upon specific files in known locations.
Tool Portability / Integration
Tools like Eclipse, NetBeans, and IntelliJ now have a common place to find information about a project. Before the advent of Maven, every IDE had a different way to store what was essentially a custom Project Object Model (POM). Maven has standardized this description, and while each IDE continues to maintain custom project files, they can be easily generated from the model.
Easy Searching and Filtering of Project Artifacts
Tools like Nexus allow you to index and search the contents of a repository using the information stored in the POM.
1.6. Is Maven an alternative to XYZ?
So, sure, Maven is an alternative to Ant, but Apache Ant continues to be a great, widely-used tool. It has been the reigning champion of Java builds for years, and you can integrate Ant build scripts with your project’s Maven build very easily. This is a common usage pattern for a Maven project. On the other hand, as more and more open source projects move to Maven as a project management platform, working developers are starting to realize that Maven not only simplifies the task of build management, it is helping to encourage a common interface between developers and software projects. Maven is more of a platform than a tool, while you could consider Maven an alternative to Ant, you are comparing apples to oranges. "Maven" includes more than just a build tool.
This is the central point that makes all of the Maven vs. Ant, Maven vs. Buildr, Maven vs. Gradle arguments irrelevant. Maven isn’t totally defined by the mechanics of your build system. It isn’t about scripting the various tasks in your build as much as it is about encouraging a set of standards, a common interface, a life-cycle, a standard repository format, a standard directory layout, etc. It certainly isn’t about what format the POM happens to be in (XML vs. YAML vs. Ruby). Maven is much larger than that, and Maven refers to much more than the tool itself. When this book talks of Maven, it is referring to the constellation of software, systems, and standards that support it. Buildr, Ivy, Gradle, all of these tools interact with the repository format that Maven helped create, and you could just as easily use a repository manager like Nexus to support a build written entirely in Ant.
While Maven is an alternative to many of these tools, the community needs to evolve beyond seeing technology as a zero-sum game between unfriendly competitors in a competition for users and developers. This might be how large corporations relate to one another, but it has very little relevance to the way that open source communities work. The headline "Who’s winning? Ant or Maven?" isn’t very constructive. If you force us to answer this question, we’re definitely going to say that Maven is a superior alternative to Ant as a foundational technology for a build; at the same time, Maven’s boundaries are constantly shifting and the Maven community is constantly trying to seek out new ways to become more ecumenical, more inter-operable, more cooperative. The core tenets of Maven are declarative builds, dependency management, repository managers, universal reuse through plugins, but the specific incarnation of these ideas at any given moment is less important than the sense that the open source community is collaborating to reduce the inefficiency of "enterprise-scale builds".
1.7. Comparing Maven with Ant
The authors of this book have no interest in creating a feud between Apache Ant and Apache Maven, but we are also cognizant of the fact that most organizations have to make a decision between the two standard solutions: Apache Ant and Apache Maven. In this section, we compare and contrast the tools.
Ant excels at build process, it is a build system modeled after make with targets and dependencies. Each target consists of a set of instructions which are coded in XML. There is a copy task and a javac task as well as a jar task. When you use Ant, you supply Ant with specific instructions for compiling and packaging your output. Look at the following example of a simple build.xml file:
A Simple Ant build.xml file.
<target name="init">
<target name="compile" depends="init" description="compile the source " >
<target name="dist" depends="compile" description="generate the distribution" >
<!-- Put everything in ${build} into the MyProject-${DSTAMP}.jar file --> <jar jarfile="${dist}/lib/MyProject-${DSTAMP}.jar" basedir="${build}"/>
<target name="clean" description="clean up" > <!-- Delete the ${build} and ${dist} directory trees -->
In this simple Ant example, you can see how you have to tell Ant exactly what to do. There is a compile goal which includes the javac task that compiles the source in the src/main/java directory to the target/classes directory. You have to tell Ant exactly where your source is, where you want the resulting bytecode to be stored, and how to package this all into a JAR file. While there are some recent developments that help make Ant less procedural, a developer’s experience with Ant is in coding a procedural language written in XML.
Contrast the previous Ant example with a Maven example. In Maven, to create a JAR file from some Java source, all you need to do is create a simple pom.xml, place your source code in ${basedir}/src/main/java and then run mvn install. from the command line. The example Maven pom.xml that achieves the same results as the simple Ant file listed in A Simple Ant build.xml file is shown in A Sample Maven pom.xml.
A Sample Maven pom.xml.
That’s all you need in your pom.xml. Running mvn install. from the command line will process resources, compile source, execute unit tests, create a JAR, and install the JAR in a local repository for reuse in other projects. Without modification, you can run mvn site. and then find an index.html file in target/site that contains links to JavaDoc and a few reports about your source code.
Admittedly, this is the simplest possible example project containing nothing more than some source code and producing a simple JAR. It is a project which closely follows Maven conventions and doesn’t require any dependencies or customization. If we wanted to start customizing the behavior, our pom.xml is going to grow in size, and in the largest of projects you can see collections of very complex Maven POMs which contain a great deal of plugin customization and dependency declarations. But, even when your project’s POM files become more substantial, they hold an entirely different kind of information from the build file of a similarly sized project using Ant. Maven POMs contain declarations: "This is a JAR project", and "The source code is in src/main/java". Ant build files contain explicit instructions: "This is project", "The source is in src/main/java", "Run javac against this directory", "Put the results in target/classes", "Create a JAR from the ….", etc. Where Ant had to be explicit about the process, there was something "built-in" to Maven that just knew where the source code was and how it should be processed.
The differences between Ant and Maven in this example are:
- Apache Ant
- Ant doesn’t have formal conventions like a common project directory structure or default behavior. You have to tell Ant exactly where to find the source and where to put the output. Informal conventions have emerged over time, but they haven’t been codified into the product.
- Ant is procedural. You have to tell Ant exactly what to do and when to do it. You have to tell it to compile, then copy, then compress.
- Ant doesn’t have a lifecycle. You have to define goals and goal dependencies. You have to attach a sequence of tasks to each goal manually.
- Apache Maven
- Maven has conventions. It knows where your source code is because you followed the convention. Maven’s Compiler plugin put the bytecode in target/classes, and it produces a JAR file in target.
- Maven is declarative. All you had to do was create a pom.xml file and put your source in the default directory. Maven took care of the rest.
- Maven has a lifecycle which was invoked when you executed mvn install.. This command told Maven to execute a series of sequential lifecycle phases until it reached the install lifecycle phase. As a side-effect of this journey through the lifecycle, Maven executed a number of default plugin goals which did things like compile and create a JAR.
Maven has built-in intelligence about common project tasks in the form of Maven plugins. If you wanted to write and execute unit tests, all you would need to do is write the tests, place them in ${basedir}/src/test/java, add a test-scoped dependency on either TestNG or JUnit, and run mvn test. If you wanted to deploy a web application and not a JAR, all you would need to do is change your project type to war and put your docroot in ${basedir}/src/main/webapp. Sure, you can do all of this with Ant, but you will be writing the instructions from scratch. In Ant, you would first have to figure out where the JUnit JAR file should be. Then you would have to create a classpath that includes the JUnit JAR file. Then you would tell Ant where it should look for test source code, write a goal that compiles the test source to bytecode, and execute the unit tests with JUnit.
Without supporting technologies like antlibs and Ivy (even with these supporting technologies), Ant has the feeling of a c`ustom procedural build. An efficient set of Maven POMs in a project which adheres to Maven’s assumed conventions has surprisingly little XML compared to the Ant alternative. Another benefit of Maven is the reliance on widely-shared Maven plugins. Everyone uses the Maven Surefire plugin for unit testing, and if someone adds support for a new unit testing framework, you can gain new capabilities in your own build by just incrementing the version of a particular Maven plugin in your project’s POM.
The decision to use Maven or Ant isn’t a binary one, and Ant still has a place in a complex build. If your current build contains some highly customized process, or if you’ve written some Ant scripts to complete a specific process in a specific way that cannot be adapted to the Maven standards, you can still use these scripts with Maven. Ant is made available as a core Maven plugin. Custom Maven plugins can be implemented in Ant, and Maven projects can be configured to execute Ant scripts within the Maven project lifecycle.