Tag Archives: github

Publishing an Eclipse p2 composite repository on GitHub Pages

I have already described the process of publishing an Eclipse p2 composite update site.

Well, now that Bintray is shutting down, and SourceForge is quite slow in serving an Eclipse update site, I decided to publish my Eclipse p2 composite update sites on GitHub Pages.

GitHub Pages might not be ideal for serving binaries, and it has a few limitations. However, such limitations (e.g., published sites may be no larger than 1 GB, sites have a soft bandwidth limit of 100 GB per month, and sites have a soft limit of 10 builds per hour) are not that crucial for an Eclipse update site, whose artifacts are not that huge. Moreover, my projects, at least, are not going to serve more than 100 GB per month (unfortunately, I might say 😉).

In this tutorial, I’ll show how to do that, so that you can easily apply this procedure also to your projects!

The procedure is part of the Maven/Tycho build so that it is fully automated. Moreover, the pom.xml and the Ant files can be fully reused in your own projects (just a few properties have to be adapted). The idea is that you can run this Maven build (basically, “mvn deploy”) on any CI server (as long as you have write access to the GitHub repository hosting the update site; more on that later). Thus, you will not depend on the pipeline syntax of a specific CI server (Travis, GitHub Actions, Jenkins, etc.), though, depending on the CI server, you might have to adjust a few minor things.

These are the main points:

The p2 children repositories and the p2 composite repositories will be published with standard Git operations since we publish them in a GitHub repository.

Let’s recap what p2 composite update sites are. Quoting from https://wiki.eclipse.org/Equinox/p2/Composite_Repositories_(new)

As repositories continually grow in size they become harder to manage. The goal of composite repositories is to make this task easier by allowing you to have a parent repository which refers to multiple children. Users are then able to reference the parent repository and the children’s content will transparently be available to them.

In order to achieve this, all published p2 repositories must be available, each one with its own p2 metadata that should never be overwritten. On the contrary, the metadata that we will overwrite will be the one for the composite metadata, i.e., compositeContent.xml and compositeArtifacts.xml.
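To make the idea of composite metadata more concrete, here is a minimal Python sketch that produces the text of a compositeContent.xml file. It only illustrates the shape of the file as I understand it from the p2 documentation; in the build described in this post the file is written by the p2 Ant task, not by code like this. The function name, the site name, and the child locations are made up for the example.

```python
import time
from xml.sax.saxutils import quoteattr


def composite_content_xml(name, children, timestamp_ms=None):
    """Return the text of a compositeContent.xml that lists the given children.

    `children` are locations of child repositories (relative or absolute).
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    lines = [
        "<?xml version='1.0' encoding='UTF-8'?>",
        "<?compositeMetadataRepository version='1.0.0'?>",
        f"<repository name={quoteattr(name)}",
        "    type='org.eclipse.equinox.internal.p2.metadata.repository"
        ".CompositeMetadataRepository'",
        "    version='1.0.0'>",
        "  <properties size='1'>",
        f"    <property name='p2.timestamp' value='{timestamp_ms}'/>",
        "  </properties>",
        f"  <children size='{len(children)}'>",
    ]
    # One <child> element per child repository; this list is what changes
    # (and gets overwritten) every time a new release is added.
    lines += [f"    <child location={quoteattr(child)}/>" for child in children]
    lines += ["  </children>", "</repository>"]
    return "\n".join(lines) + "\n"


# Hypothetical child locations, only for illustration.
print(composite_content_xml(
    "Example Composite Site",
    ["releases/1.0.0.v20210307-2037", "releases/1.1.0.v20210307-2104"],
))
```

The companion file compositeArtifacts.xml has the same structure, with the artifact-repository counterparts of the processing instruction and of the type attribute.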

Directory Structure

I want to be able to serve these composite update sites:

  • the main one collects all the versions
  • a composite update site for each major version (e.g., 1.x, 2.x, etc.)
  • a composite update site for each major.minor version (e.g., 1.0.x, 1.1.x, 2.0.x, etc.)

What I aim at is to have the following paths:

  • releases: in this directory, all p2 simple repositories will be uploaded, each one in its own directory, named after version.buildQualifier, e.g., 1.0.0.v20210307-2037, 1.1.0.v20210307-2104, etc. Your Eclipse users can then use the URL of one of these single update sites to stick to that specific version.
  • updates: in this directory, the metadata for major and major.minor composite sites will be uploaded.
  • root: the main composite update site collecting all versions.

To summarize, we’ll end up with a remote directory structure like the following one
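The layout can be derived mechanically from the qualified versions. The following Python sketch (not part of the build; the function name is made up) computes the directories and composite metadata files described above for a list of released versions. It does not list the content of each simple p2 repository under releases.

```python
def composite_layout(qualified_versions):
    """Return the sorted paths implied by the given qualified versions.

    Each version is expected to look like major.minor.micro.qualifier,
    e.g. 1.0.0.v20210307-2037.
    """
    # The root is itself a composite site collecting all versions.
    paths = {"compositeContent.xml", "compositeArtifacts.xml"}
    for version in qualified_versions:
        major, minor, _micro, _qualifier = version.split(".", 3)
        major_dir = f"updates/{major}.x"
        minor_dir = f"{major_dir}/{major}.{minor}.x"
        # Each release is a simple p2 repository in its own directory.
        paths.add(f"releases/{version}/")
        # Composite metadata for the major and the major.minor sites.
        for composite_dir in (major_dir, minor_dir):
            paths.add(f"{composite_dir}/compositeContent.xml")
            paths.add(f"{composite_dir}/compositeArtifacts.xml")
    return sorted(paths)


for path in composite_layout(["1.0.0.v20210307-2037", "1.1.0.v20210307-2104"]):
    print(path)
```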

Thus, if you want, you can provide these sites to your users (I’m using the URLs that correspond to my example):

  • https://lorenzobettini.github.io/p2composite-github-pages-example-updates for the main global update site: every new version will be available when using this site;
  • https://lorenzobettini.github.io/p2composite-github-pages-example-updates/updates/1.x for all the releases with major version 1: for example, the user won’t see new releases with major version 2;
  • https://lorenzobettini.github.io/p2composite-github-pages-example-updates/updates/1.x/1.0.x for all the releases with major version 1 and minor version 0: the user will only see new releases of the shape 1.0.0, 1.0.1, 1.0.2, etc., but NOT 1.1.0, 1.2.3, 2.0.0, etc.

If you want to change this structure, you have to carefully tweak the ant file we’ll see in a minute.

Building Steps

During the build, before the actual deployment, we’ll have to update the composite site metadata, and we’ll have to do that locally.

The steps that we’ll perform during the Maven/Tycho build are:

  • Clone the repository hosting the composite update site (in this example, https://github.com/LorenzoBettini/p2composite-github-pages-example-updates);
  • Create the p2 repository (with Tycho, as usual);
  • Copy the p2 repository in the cloned repository in a subdirectory of the releases directory (the name of the subdirectory has the same qualified version of the project, e.g., 1.0.0.v20210307-2037);
  • Update the composite update sites information in the cloned repository (using the p2 tools);
  • Commit and push the updated clone to the remote GitHub repository (the one hosting the composite update site).

First of all, in the parent POM, we define the following properties, which of course you need to tweak for your own projects:

It should be clear which properties you need to modify for your project. In particular, the github-update-repo is the URL (with authentication information) of the GitHub repository hosting the composite update site, and the site.label is the label that will be put in the composite metadata.

Then, in the parent POM, we configure in the pluginManagement section all the versions of the plugins we are going to use (see the sources of the example on GitHub).

The most interesting configuration is the one for the tycho-packaging-plugin, where we specify the format of the qualified version:
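The example versions used in this post (e.g., 1.0.0.v20210307-2037) suggest a qualifier made of the letter v followed by a date and a time. As a rough illustration, and assuming that this is indeed the pattern (in the POM it would be expressed as a Java date pattern such as 'v'yyyyMMdd'-'HHmm), this Python sketch builds a qualified version in the same shape. The function and constant names are hypothetical.

```python
from datetime import datetime

# Assumed qualifier pattern, inferred from versions like 1.0.0.v20210307-2037.
QUALIFIER_FORMAT = "v%Y%m%d-%H%M"


def qualified_version(base_version, build_time):
    """Append a timestamp-based build qualifier to major.minor.micro."""
    return f"{base_version}.{build_time.strftime(QUALIFIER_FORMAT)}"


print(qualified_version("1.0.0", datetime(2021, 3, 7, 20, 37)))
# prints 1.0.0.v20210307-2037
```

A qualifier of this shape sorts chronologically, so each build gets its own directory name under releases.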

Moreover, we create a profile release-composite (which we’ll also use later in the POM of the site project), where we disable the standard Maven plugins for install and deploy. Since we are going to release our Eclipse p2 composite update site during the deploy phase, but we are not interested in installing and deploying the Maven artifacts, we skip the standard Maven plugins bound to those phases:

The interesting steps are in the site project, the one with <packaging>eclipse-repository</packaging>. Here we also define the profile release-composite and we use a few plugins to perform the steps involving the Git repository described above (remember that these configurations are inside the profile release-composite, of course in the build plugins section):

Let’s see these configurations in detail. In particular, it is important to understand how the goals of the plugins are bound to the phases of the default lifecycle; remember that on the phase package, Tycho will automatically create the p2 repository and it will do that before any other goals bound to the phase package in the above configurations:

  • with the build-helper-maven-plugin we parse the current version of the project, in particular, we set the properties holding the major and minor versions that we need later to create the composite metadata directory structure; its goal is automatically bound to one of the first phases (validate) of the lifecycle;
  • with the exec-maven-plugin we configure the execution of the Git commands:
    • we clone the Git repository of the update site (with --depth=1 we only get the latest commit in the history; the previous commits are not interesting for our task); this is done in the phase prepare-package, that is, before the p2 repository is created by Tycho; the Git repository is cloned in the output directory target/checkout
    • in the phase verify (that is, after the phase package), we commit the changes (which will be done during the phase package as shown in the following points)
    • in the phase deploy (that is, the last phase that we’ll run on the command line), we push the changes to the Git repository of the update site
  • with the maven-resources-plugin we copy the p2 repository generated by Tycho into the target/checkout/releases directory in a subdirectory with the name of the qualified version of the project (e.g., 1.0.0.v20210307-2037);
  • with the tycho-eclipserun-plugin we create the composite metadata; we rely on the Eclipse application org.eclipse.ant.core.antRunner, so that we can execute the p2 Ant task for managing composite repositories (p2.composite.repository). The Ant tasks are defined in the Ant file packaging-p2composite.ant, stored in the site project. In this file, there are also a few properties that describe the layout of the directories described before. Note that we need to pass a few properties, including the site.label, the directory of the local Git clone, and the major and minor versions that we computed before.
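The ordering described in the points above can be summarized with a small Python model of the lifecycle. It is only a sketch: the phase list is a subset of Maven's default lifecycle, the step descriptions are paraphrased from this post, and the order of the steps inside the phase package is an assumption about the declaration order in the POM.

```python
# Subset of Maven's default lifecycle, in execution order.
PHASES = ["validate", "compile", "test", "prepare-package",
          "package", "verify", "install", "deploy"]

# (step, phase) pairs, listed in the assumed POM declaration order.
BINDINGS = [
    ("parse major/minor version (build-helper-maven-plugin)", "validate"),
    ("git clone --depth=1 into target/checkout (exec-maven-plugin)", "prepare-package"),
    ("create the p2 repository (Tycho)", "package"),
    ("copy the p2 repository into releases (maven-resources-plugin)", "package"),
    ("update the composite metadata (tycho-eclipserun-plugin)", "package"),
    ("git commit (exec-maven-plugin)", "verify"),
    ("git push (exec-maven-plugin)", "deploy"),
]


def steps_run(last_phase):
    """Return the steps executed when the build stops at `last_phase`."""
    limit = PHASES.index(last_phase)
    # sorted() is stable, so steps bound to the same phase keep their order.
    ordered = sorted(BINDINGS, key=lambda binding: PHASES.index(binding[1]))
    return [step for step, phase in ordered if PHASES.index(phase) <= limit]


for step in steps_run("verify"):
    print(step)
```

Stopping at verify runs everything up to the local commit and leaves out the push, while deploy runs all seven steps.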

Keep in mind that in all the above steps, non-existing directories will be automatically created on-demand (e.g., by the maven-resources-plugin and by the p2 Ant tasks). This means that the described process will work seamlessly the very first time when we start with an empty Git repository.

Now, from the parent POM on your computer, it’s enough to run

and the release will be performed. When cloning you’ll be asked for the password of the GitHub repository, and, if not using an SSH agent or a keyring, also when pushing. Again, this depends on the URL of the GitHub repository; you might use an HTTPS URL that relies on the GitHub token, for example.

If you want to make a few local tests before actually releasing, you might stop at the phase verify and inspect the target/checkout to see whether the directories and the composite metadata are as expected.

You might also want to add another execution to the tycho-eclipserun-plugin to add a reference to another Eclipse update site that is required to install your software. The Ant file provides a task for that, p2.composite.add.external that will store the reference into the innermost composite child (e.g., into 1.2.x); here’s an example that adds a reference to the Eclipse main update site:

For example, in my Xtext projects, I use this technique to add a reference to the Xtext update site corresponding to the Xtext version I’m using in that specific release of my project. This way, my update site will be “self-contained” for my users: when using my update site for installing my software, p2 will be automatically able to install also the required Xtext bundles!

Releasing from GitHub Actions

The Maven command shown above can be used to perform a release from your computer. If you want to release your Eclipse update site directly from GitHub Actions, there are a few more things to do.

First of all, we are talking about a GitHub Actions workflow stored and executed in the GitHub repository of your project, NOT in the GitHub repository of the update site. In this example, it is https://github.com/LorenzoBettini/p2composite-github-pages-example.

In such a workflow, we need to push to another GitHub repository. To do that:

  • create a GitHub personal access token (selecting repo);
  • create a secret in the GitHub repository of the project (where we run the GitHub Actions workflow), in this example it is called ACTIONS_TOKEN, with the value of that token;
  • when running the Maven deploy command, we need to override the property github-update-repo by specifying a URL for the GitHub repository with the update site using the HTTPS syntax and the encrypted ACTIONS_TOKEN; in this example, it is https://x-access-token:${{ secrets.ACTIONS_TOKEN }}@github.com/LorenzoBettini/p2composite-github-pages-example-updates;
  • we also need to configure in advance the Git user and email, with some values, otherwise, Git will complain when creating the commit.
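As a rough sketch of what the points above amount to, the following Python snippet builds the authenticated URL and the commands a CI job would run. The URL form, the property github-update-repo, and the profile release-composite come from this post; activating the profile with -P, the Git identity values, and the helper names are assumptions made for the example.

```python
import os


def update_repo_url(token, owner, repo):
    """HTTPS URL of the update site repository, with the token embedded."""
    return f"https://x-access-token:{token}@github.com/{owner}/{repo}"


def release_commands(repo_url, user_name, user_email):
    """Commands to run in CI: set the Git identity, then run the deploy."""
    return [
        ["git", "config", "--global", "user.name", user_name],
        ["git", "config", "--global", "user.email", user_email],
        # -P activates the profile, -D overrides the property from the POM.
        ["mvn", "deploy", "-Prelease-composite",
         f"-Dgithub-update-repo={repo_url}"],
    ]


# Fall back to a placeholder when the (hypothetical) variable is not set.
token = os.environ.get("ACTIONS_TOKEN") or "<token>"
url = update_repo_url(token, "LorenzoBettini",
                      "p2composite-github-pages-example-updates")
for command in release_commands(url, "CI Bot", "ci-bot@example.com"):
    # Mask the token so that it never ends up in the build log.
    print(" ".join(command).replace(token, "***"))
```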

To summarize, these are the interesting parts of the release.yml workflow (see the full version here: https://github.com/LorenzoBettini/p2composite-github-pages-example/blob/master/.github/workflows/release.yml):

The workflow is configured to be executed only when you push to the release branch.

Remember that we are talking about the Git repository hosting your project, not the one hosting your update site.

Final thoughts

With the procedure described in this post, you publish your update sites and the composite metadata during the Maven build, so you never deal manually with the GitHub repository of your update site. However, you can always do that! For example, you might want to remove a release. It’s just a matter of cloning that repository, making your changes (i.e., removing a subdirectory of releases and manually updating the composite metadata accordingly), committing, and pushing. Now and then you might also clean up the history of such a Git repository (the history is not important in this context) by pushing with --force after resetting the Git history. By the way, by tweaking the configurations above you could also do that every time you do a release: just commit with --amend and force-push!

Finally, you could also create an additional GitHub repository for snapshot releases of your update sites, or for milestones, or release candidates.

Happy releasing! 🙂

Caching dependencies in GitHub Actions

I recently started to port all my Java projects from Travis CI to GitHub Actions, since Travis CI changed its pricing model. (I’ll soon also update my book on TDD and Build Automation in that respect.)

I’ve always used caching mechanisms during the builds in Travis CI, to speed up the builds: caching Maven dependencies, especially in big projects, can save a lot of time. In my case, I’m mostly talking about Eclipse plug-in projects, built with Maven/Tycho, where the target platform resolution might have to download a few hundred megabytes. Thus, I wanted to use caching also in GitHub Actions, and there’s an action for that.

In this post, I’ll show my strategies for using the cache, in particular, using different workflows based on different operating systems, which are triggered only on some specific events. I’ll use a very simple example, but I’m using this strategy currently on this Xtext project: https://github.com/LorenzoBettini/edelta, which uses more than 300 MB of dependencies.

The post assumes that you’re already familiar with GitHub Actions.

Warning: Please keep in mind that caches will also be evicted automatically (currently, the documentation says that “caches that are not accessed within the last week will also be evicted”). However, we can still benefit from caches if we are working on a project for a few days in a row.

To experiment with building mechanisms, I suggest you use a very simple example. I’m going to use a simple Maven Java project created with the corresponding Maven archetype: a Java class and a trivial JUnit test. The Java code is not important in this context, and we’ll concentrate on the build automation mechanisms.

The final project can be found here:
https://github.com/LorenzoBettini/github-actions-cache-example.

This is the initial POM for this project:

This is the main workflow file (stored in .github/workflows/maven.yml):

This is a pretty standard workflow for a Java project built with Maven. This workflow runs for every push on any branch and every PR.

Note that we specify to cache the directory where Maven stores all the downloaded artifacts, ~/.m2.

For the cache key, we use the OS where our build is running, a constant string “-m2-” and the result of hashing all the POM files (we’ll see how we rely on this hashing later in this post).

Remember that the cache key will be used in future builds to restore the files saved in the cache. When no cache is found with the given key, the action searches for alternate keys if the restore-keys has been specified. As you see, we specified as the restore key something similar to the actual key: the running OS and the constant string “-m2-” but no hashing. This way, if we change our POMs, the hashing will be different, but if a previous cache exists we can still restore that and make use of the cached values. (See the official documentation for further details.) We’ll then have to download only the new dependencies if any. The cache will then be updated at the end of the successful job.
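The lookup just described can be modeled in a few lines of Python. This is a simplified sketch of the behavior as documented (exact match on the key first, then the most recently created cache matching a restore key as a prefix), not the implementation of the cache action; the function name and the sample keys are made up.

```python
def restore_cache(caches, key, restore_keys=()):
    """Return (restored_key, cache_hit) for the given lookup.

    `caches` is a list of (key, created_at) pairs for the saved caches.
    """
    if any(saved_key == key for saved_key, _ in caches):
        return key, True  # exact match: a cache hit
    for prefix in restore_keys:
        candidates = [(created_at, saved_key)
                      for saved_key, created_at in caches
                      if saved_key.startswith(prefix)]
        if candidates:
            # Partial match: take the most recently created cache.
            return max(candidates)[1], False
    return None, False  # nothing to restore


saved = [("Linux-m2-aaa", 1), ("Linux-m2-bbb", 2), ("Windows-m2-aaa", 3)]
print(restore_cache(saved, "Linux-m2-bbb", ["Linux-m2-"]))
print(restore_cache(saved, "Linux-m2-ccc", ["Linux-m2-"]))
print(restore_cache(saved, "macOS-m2-aaa", ["macOS-m2-"]))
```

The second lookup is the interesting one: the POMs changed, so the exact key is new, but the restore key still gives us the previous Linux cache as a starting point.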

I usually rely on this strategy for the CI of my projects:

  • build every push on any branch using a Linux build environment;
  • build PRs on any branch also in Windows and macOS environments (actually, I wasn’t using Windows with Travis CI since it did not provide Java support on that environment; that’s another advantage of GitHub Actions, which provides Java support also on Windows)

Thus, I have another workflow definition just for PRs (stored in .github/workflows/pr.yml):

Besides the build matrix for OSes, that’s basically the same as the previous workflow. In particular, we use the same strategy for defining the cache key (and restore key). Thus, we have a different cache for each different operating system.

Now, let’s have a look at the documentation of this action:

A workflow can access and restore a cache created in the current branch, the base branch (including base branches of forked repositories), or the default branch. For example, a cache created on the default branch would be accessible from any pull request. Also, if the branch feature-b has the base branch feature-a, a workflow triggered on feature-b would have access to caches created in the default branch (main), feature-a, and feature-b.

Access restrictions provide cache isolation and security by creating a logical boundary between different workflows and branches. For example, a cache created for the branch feature-a (with the base main) would not be accessible to a pull request for the branch feature-b (with the base main).

What does that mean in our scenario? Since the workflow running on Windows and macOS is executed only in PRs, the cache for these two configurations will never be saved for the master branch. In turn, this means that each time we create a new PR, this workflow will have no chance of finding a cache to restore: the branch for the PR is new (so no cache is available for such a branch) and the base branch (typically, “master” or “main”) will have no cache saved for these two OSes. In short, these two configurations start each new PR with no cache and have to download all the Maven dependencies from scratch. Of course, if we don’t immediately merge the PR and we push other commits to the branch of the PR, the subsequent builds for these two OSes will find a saved cache (if the previous builds of the PR succeeded), but, in any case, the first build of each new PR on these two OSes will take more time (actually, much more time in a complex project with lots of dependencies).
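Here is a small Python model of this scenario. It is a deliberate simplification of the access rules quoted above (a cache is visible only from its own branch, the base branch, or the default branch); the branch names and keys are invented for the example.

```python
def find_caches(saved, prefix, current, base=None, default="master"):
    """Return the keys visible from `current` that start with `prefix`.

    `saved` is a list of (branch, key) pairs for the saved caches.
    """
    visible = {branch for branch in (current, base, default) if branch}
    return [key for branch, key in saved
            if branch in visible and key.startswith(prefix)]


saved = [
    ("master", "Linux-m2-abc"),   # saved by the workflow that runs on pushes
    ("pr-1", "Windows-m2-abc"),   # saved by the PR workflow of an older PR
    ("pr-1", "macOS-m2-abc"),
]

# A new PR (branch pr-2, base master) finds the Linux cache only.
print(find_caches(saved, "Linux-m2-", current="pr-2", base="master"))
print(find_caches(saved, "Windows-m2-", current="pr-2", base="master"))

# Once a Windows cache is saved for master, new PRs can restore it.
saved.append(("master", "Windows-m2-abc"))
print(find_caches(saved, "Windows-m2-", current="pr-2", base="master"))
```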

Thus, if we want to benefit from caching also on these two OSes, we need another workflow, running on Windows and macOS, that is triggered when a PR is merged, so that the cache will be stored also for the master branch (actually we could use this strategy also when merging any PR with any base branch, not necessarily the main one).

Here’s this additional workflow (stored in .github/workflows/pr-merge.yml):

Note that we intercept the event push (since a merge of a PR is actually a push) but we have an if statement that enables the workflow only when the commit message contains the string “Merge pull request”, which is the default message when merging a PR on GitHub. In this example, we are only interested in PRs merged into the master branch or into any branch starting with “experiments”, but you can adjust that as you see fit. Furthermore, since this workflow is only meant for updating the Maven dependency cache, we skip the tests (with -DskipTests) so that we save some time (especially in a complex project with lots of tests).
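The condition can be restated as a small Python predicate. This is only a sketch of the logic described in the paragraph above, not the workflow syntax; the function name and the sample commit messages are hypothetical.

```python
def should_update_cache(event_name, branch, commit_message):
    """True when a push looks like a PR merged into a branch we care about."""
    on_watched_branch = branch == "master" or branch.startswith("experiments")
    return (event_name == "push"
            and on_watched_branch
            and "Merge pull request" in commit_message)


print(should_update_cache("push", "master",
                          "Merge pull request #12 from user/feature"))  # True
print(should_update_cache("push", "master", "Fix a typo"))              # False
print(should_update_cache("push", "feature",
                          "Merge pull request #12 from user/feature"))  # False
```

Keep in mind that the check relies on the default commit message: a merge with a customized message, or a merge strategy that produces a different message, would not be detected.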

This way, after the first PR merged, the PR workflows running on Windows and macOS will find a cache (at least as a starting point).

We can also do better than that and avoid running the Maven build if there’s no need to update the cache. Remember that we use the result of hashing all the POM files in our cache key? The rationale is that if our POMs do not change, then we don’t expect our dependencies to change either (as long as we’re not using SNAPSHOT dependencies). Now, in the documentation, we also read

When key matches an existing cache, it’s called a cache hit, […] When key doesn’t match an existing cache, it’s called a cache miss, and a new cache is created if the job completes successfully.

The idea is to skip the Maven step in the above workflow “Updates Cache on Windows and macOS” if we have a cache hit, since we expect that no new dependencies need to be downloaded (our POMs haven’t changed). This is the interesting part to change:

Note that we need to define an id for the cache to intercept the cache hit or miss and the id must match the id in the if statement.

This way, if we have a cache hit the workflow for updating the cache on Windows and macOS will be really fast since it won’t even run the Maven build for updating the cache.

If we change the POM, e.g., switch to JUnit 4.13.1, push the change, create a PR, and merge it, then, the workflow for updating the cache will actually run the Maven build since we have a cache miss: the key of the cache has changed. Of course, we’ll still benefit from the already cached dependencies (and all the Maven plugins already downloaded) and we’ll update the cache with the new dependencies for JUnit 4.13.1.
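The reasoning can be checked with a short Python sketch. The hashing below is a simplified stand-in for what hashFiles does (the exact algorithm of the action may differ), and the POM contents are reduced to one-line strings; the function names are made up.

```python
import hashlib


def cache_key(runner_os, pom_contents):
    """Build a key of the form <os>-m2-<hash of all the POM contents>."""
    digest = hashlib.sha256()
    for content in pom_contents:
        # Combine the per-file hashes into a single hash.
        digest.update(hashlib.sha256(content.encode("utf-8")).digest())
    return f"{runner_os}-m2-{digest.hexdigest()}"


def needs_maven_build(key, saved_keys):
    """On a cache hit the Maven step can be skipped; on a miss it must run."""
    return key not in saved_keys


old_key = cache_key("Windows", ["junit:junit:4.13"])
new_key = cache_key("Windows", ["junit:junit:4.13.1"])
saved_keys = {old_key}

print(needs_maven_build(old_key, saved_keys))  # False: POM unchanged, cache hit
print(needs_maven_build(new_key, saved_keys))  # True: POM changed, cache miss
```

Both keys still share the prefix Windows-m2-, which is why the restore key lets the build with the changed POM start from the old cache.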

Final notes

One might think of intercepting the merge of a PR by using on: pull_request: (as we did in the pr.yml workflow). However, “There’s no way to specify that a workflow should be triggered when a pull request is merged”. In the official forum, you can find a solution based on the “closed” PR event and the inspection of the PR “merged” property. So one might think of updating the pr.yml workflow accordingly and getting rid of the additional pr-merge.yml workflow. However, in my experiments, this solution did not make use of caching, which is the main goal of this post. The symptoms of this problem are the following message when the workflow initially tries to restore a cache:

Warning: Cache service responded with 403

and this message when the cache should be saved:

Unable to reserve cache with key …, another job may be creating this cache.

Another experiment that I tried was to remove the running OS from the cache key, e.g., m2-${{ hashFiles('**/pom.xml') }} instead of ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}, and to use a restore key accordingly, like m2- instead of ${{ runner.os }}-m2-. I was hoping to reuse the same cache across different OS environments. This seems to work for macOS, which seems to be able to reuse the Linux cache. Unfortunately, this does not work for Windows. Thus, I gave up on that solution.

Publishing a Maven Site to GitHub Pages

In this tutorial I’d like to show how to publish a Maven Site to GitHub Pages. You can find several documents on the web about this subject, but I decided to publish one myself because the documents I found either refer to a deprecated plugin (https://github.com/github/maven-plugins/tree/master/github-site-plugin) or show complicated settings, even when using the official Maven plugin that I’m going to use myself in this post, the Apache Maven SCM Publish Plugin, https://maven.apache.org/plugins/maven-scm-publish-plugin/index.html. (Probably the documents I found are somewhat old and refer to an old version of the plugin, which has probably been improved a lot since then.)

I’m going to use a very simple Java example. The final project is available here: https://github.com/LorenzoBettini/maven-site-github-example.

Create the Java Maven project

Let’s create a simple Java project using the archetype (of course, I’m going to use fake names for the groupId and artifactId, but such values should be used consistently from now on):

This will create the my-app folder with the Maven project. Let’s cd into that folder and make sure it compiles fine:

We can also generate the site of this project (in its default shape):

The site will be generated in the target/site folder and can be opened with a browser; for example, let’s open its index.html:

GitHub setup

Let’s publish the Maven project on GitHub; in my case it’s in https://github.com/LorenzoBettini/maven-site-github-example.

Now we have to set up the gh-pages branch in our Git repository. This step is required the first time only. (You might want to have a look at https://help.github.com/en/github/working-with-github-pages/about-github-pages if you’re not familiar with GitHub Pages.)

Let’s prepare the gh-pages branch: this will be a separate branch in the Git repository, we create it as an orphan branch (see also https://maven.apache.org/plugins/maven-scm-publish-plugin/various-tips.html). WARNING: the following commands will wipe out the current contents of the local Git repository, so make sure you don’t have any uncommitted changes! Let’s run the following commands in the main folder of the Git repository:

  1. git checkout --orphan gh-pages to create the branch locally,
  2. rm .git/index ; git clean -fdx to clean the branch content and leave it empty,
  3. echo "It works" > index.html to create an initial site content
  4. git add . && git commit -m "initial site content" && git push origin gh-pages to commit and push to the branch gh-pages

The corresponding web site will be published at the URL corresponding to your GitHub project; in my case it is https://lorenzobettini.github.io/maven-site-github-example. This should show “It works”.

Now we can go back to the master branch:

and remove the index.html we previously created.

Maven POM configuration

Let’s see how to configure our POM to automatically publish the Maven site to GitHub pages.

First, we must specify the distribution management section:

Remember that you will have to use the URL corresponding to your own GitHub project (including your GitHub user id).

Then, we configure the maven-scm-publish-plugin configuration in the pluginManagement section (we configure it there, and then we will call its goals directly from the command line – such a plugin can also be configured to be bound to the Maven lifecycle, but that’s out of the scope of this tutorial, you might want to have a look here in case: https://maven.apache.org/plugins/maven-scm-publish-plugin/usage.html). Note that we specify the gh-pages branch:

Publish the site

Now, publishing the Maven site to GitHub Pages is just a matter of running:

Wait for the plugin to perform its job.

What the plugin does is

  1. It first checks out the contents of the gh-pages branch from GitHub into a local folder (by default, target/scmpublish-checkout);
  2. Then locally staged site content is applied to the check-out, issuing appropriate Git commands to add and delete entries, followed by a commit and a push.

Now the site is on GitHub Pages:

Improve the site contents

Just for demonstration, let’s enrich the site a bit by using the Maven Archetype Site, https://maven.apache.org/archetypes/maven-archetype-site.

We layer this archetype upon our existing project, so we must run it from the directory containing our current Maven project and specify the same groupId and artifactId we specified when we created the Maven project:

The archetype will update our project with a src/site directory containing a few example contents in Markdown, APT, etc. It will also update our POM with some configuration for maven-project-info-reports-plugin and for i18n localization. Moreover, a new skin will be used for the final site, maven-fluido-skin.

You might want to have a look locally by regenerating the site.

Let’s publish the new site, again by running

Now it’s on GitHub Pages (it might take some time for the new website to be updated on GitHub and browser reload might help):

To jump to the published website you could also use the GitHub web interface: click on the “environment” link and then on the “View deployment” button corresponding to the latest pushed commit on the gh-pages branch:

Now you could experiment by adding/changing/removing the contents of the directory src/site and then publish the website again through Maven.

Remember

Your Maven project (e.g., the master branch) contains the sources of the site, while the gh-pages branch on GitHub contains the generated website. You won’t modify the contents of the gh-pages branch manually: the Apache Maven SCM Publish Plugin will do that for you.

Hope you enjoy this tutorial 🙂