I experimented with Copilot “Agent mode”: I told it to create a Maven plugin. (Let me stress that I already have some experience implementing and testing Maven plugins.)
I’ll report my experience, which has been positive in some respects but disappointing in others.
I used the “Claude 3.5 Sonnet” model because that’s the one used in most demos I saw online.
This is the initial prompt:
I want to create a Maven plugin, using the official archetype.
The groupId for the project must be “com.examples” and the artifactId must be “exampleplugin”.
The project must be part of a Git repository, with the standard .gitignore file (excluding typical generated artifacts of a Maven project and also other typical files that must be excluded, including OS typical generated files).
The project must be up-to-date with respect to dependencies and plugins, and it must use Java 21.
I want also to use the Maven wrapper in this project, using the latest version of Maven (but not Maven 4).
I also want the GitHub Actions workflow to build and test the project (including integration tests) for Linux, Windows and macOS.
Is everything clear? Do you have any question?
Here’s the first output, which correctly uses the command for the archetype:
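For reference, a non-interactive invocation of the official plugin archetype typically looks like the sketch below. It is not Copilot's exact output: the `generate_plugin_project` function name is made up for illustration, and the archetype version is left unspecified, so Maven resolves it on its own.

```shell
# Hypothetical helper (not taken from the Copilot session): generates a Maven
# plugin project from the official archetype, using the coordinates requested
# in the prompt. Nothing runs until the function is called.
generate_plugin_project() {
  mvn archetype:generate \
    -DarchetypeGroupId=org.apache.maven.archetypes \
    -DarchetypeArtifactId=maven-archetype-plugin \
    -DgroupId=com.examples \
    -DartifactId=exampleplugin \
    -DinteractiveMode=false
}
```

Calling `generate_plugin_project` in an empty directory should create an `exampleplugin` folder containing the generated POM and sources.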
Let’s continue and see how it goes:
The “.gitignore” ignores too many things: I want to keep Eclipse metadata files and directories.
So, before going on, I specify:
The dot files for project’s metadata of the Java LSP must not be ignored (e.g., .classpath, .project, etc.); .vscode must not be ignored either
And the updated version looks right:
Then, it goes on updating versions and Java version; it also runs the command for the Maven wrapper:
It creates the GitHub Actions workflow and initializes the Git repository:
That’s the summary:
However, I’m not happy with the results.
It only updated the Java and Maven versions, but not the dependencies or plugin versions. The Maven wrapper is installed with version 3.9.6, but the latest version of Maven is 3.9.9.
The GitHub Actions workflow is not perfect either:
```yaml
name: Build and Test

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}

    steps:
    - uses: actions/checkout@v4

    - name: Set up JDK 21
      uses: actions/setup-java@v4
      with:
        java-version: '21'
        distribution: 'temurin'
        cache: 'maven'

    - name: Build and test with Maven
      run: ./mvnw clean verify

    - name: Run integration tests
      run: ./mvnw clean verify -Prun-its
```
It only considers the “main” branch, and there’s no need to run two builds. However, it correctly detected the profile “run-its” for integration tests!
I then wrote this:
The tasks were almost correct.
There are a few things that must be fixed:
– The latest version of Maven (both in the POM and in the wrapper) is not 3.9.6: it is 3.9.9.
– The GitHub Actions workflow only considers the “main” branch (both for push and pr); this is not what I asked: I want to build any branch. Moreover, there’s no need to run Maven twice (as long as integration tests are run).
– Moreover, I asked to update all dependencies and plugins to their latest versions, but that’s not what happened.
As a result, the GitHub Actions workflow was fixed according to my requests.
It also updates the Maven wrapper version and verifies the correct changes in the properties file.
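As a rough sketch of this step, the wrapper can be regenerated for a specific Maven version with the `wrapper:wrapper` goal, and the result checked in the wrapper properties file. The `update_wrapper` function name is hypothetical, and the commands Copilot actually ran may have differed.

```shell
# Hypothetical helper: regenerates the Maven wrapper for a given Maven version
# and prints the distribution URL recorded in the wrapper properties file.
update_wrapper() {
  maven_version="${1:-3.9.9}"   # 3.9.9 is the version mentioned in this post
  mvn wrapper:wrapper -Dmaven="$maven_version" &&
    grep distributionUrl .mvn/wrapper/maven-wrapper.properties
}
```

Running `update_wrapper` from the project root should print a `distributionUrl` line that points at the requested Maven version.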
It also updates a few versions in the POM, reporting:
However, most of the dependencies and plugins are NOT up-to-date at all.
I then wrote:
Please, double check the dependencies and plugins: you have updated them, but some of them are not yet at the latest version.
It seems to understand that it can use the Maven versions plugin to display the dependency updates:
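The versions plugin offers read-only goals for this kind of report. The sketch below assumes they are run through the wrapper from the project root; the `show_updates` function name is made up, and the exact command Copilot used is not reproduced here.

```shell
# Hypothetical helper: lists available updates without modifying the POM.
show_updates() {
  ./mvnw versions:display-dependency-updates \
         versions:display-plugin-updates \
         versions:display-property-updates
}
```

These goals only print a report; the POM is left untouched.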
After running the command, it complains it cannot read the updates from the output (which are actually there) and tries to use another command (which is completely wrong):
I told it about the visible versions:
The versions were actually visible in the output.
And here’s the result:
Unfortunately, when it uses such information to update the POM, it makes a small disaster: most dependencies are not correctly updated, and some versions don’t exist.
I told it about that and, in particular, pointed out that, instead of just displaying the updates, it could have used the proper goal of the Maven versions plugin:
Some versions are still not up-to-date, not to mention that there’s also a wrong version that cannot be resolved. Why don’t you simply use the Maven versions plugin to update dependencies and plugins instead of just displaying their output and parsing it?
But it insists on displaying the updates and then trying to change the versions itself (without succeeding).
So I tried to be clearer:
That only displays the latest versions and then you’ll have to parse the output, which did not work yet. Isn’t there a specific goal to also update automatically all the dependencies and plugins?
Now it understands and uses the proper goal:
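The post does not say which goal Copilot picked. As an illustration, two goals of the versions plugin that rewrite the POM in place are `use-latest-releases` and `update-properties`; the `update_versions` wrapper function below is hypothetical.

```shell
# Hypothetical helper: lets the versions plugin rewrite the POM itself instead
# of parsing the output of the display-* goals.
update_versions() {
  # generateBackupPoms=false avoids leaving pom.xml.versionsBackup files around
  ./mvnw versions:use-latest-releases \
         versions:update-properties \
         -DgenerateBackupPoms=false
}
```

Note that a "release" here essentially means a non-SNAPSHOT version, so pre-releases such as betas may still be selected unless they are excluded explicitly.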
It then detects that some previously updated versions were wrong, and that there are beta versions of some dependencies and plugins; the Maven command it ran has already updated the POM, but it insists on rechecking the versions:
After updating some versions, the test no longer compiles. Copilot did not realize that, so I make it explicit:
The test file contains a compilation error. I guess you updated the corresponding dependency to a too much high version (beta version). It’s better not to use beta versions and instead use the latest version before the beta one.
Now it starts iterating, trying to downgrade some versions. It performs several iterations and asks whether it should continue. I say yes and, at some point, it seems to make the test compile again; however, looking at the versions in the POM, it seems to have lowered some of them almost at random and by too much. Note that, by looking at the Maven Central repository, it is easy to find the latest version of a dependency or plugin that does not contain “beta”.
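The "latest version that is not a beta" rule is simple enough to sketch in a few lines of shell. The `latest_stable` helper, its list of pre-release markers, and the sample version numbers are all assumptions made for illustration; it also relies on `sort -V` (version sort), which is available in GNU coreutils.

```shell
# Hypothetical helper: reads one version per line on stdin and prints the
# highest one that does not look like a pre-release.
latest_stable() {
  grep -Eiv 'alpha|beta|rc|snapshot|-m[0-9]+' | sort -V | tail -n 1
}

# Sample input with made-up version numbers (not taken from the post).
printf '%s\n' 1.2.0 2.0.0-beta-1 1.10.1 2.0.0-rc-1 1.9.0 | latest_stable
# prints 1.10.1
```

The versions plugin can apply a similar filter itself through its documented rule sets (`ignoreVersions`), which is probably the more robust choice in a real build.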
Anyway, after the test is finally compiled again, it decides to run the Maven build to see whether everything works. That’s the first time Copilot proposes to run the Maven build (I would have expected that to be done after creating the files with the first prompt).
Here comes trouble:
The test fails (Google Guice fails when using dependency injection). It looks like Copilot understands there’s a misalignment among the dependencies. In particular, while trying to fix the previous compilation error, it needlessly added extra transitive dependencies with wrong versions, leading to this failure.
It then starts several iterations without being able to fix the test.
Ultimately, I had to revert the changes to the POM: all the previous iterations about version updates were useless and disruptive.
In the end, I was impressed by the first iterations, where Copilot handled most tasks correctly.
However, it clearly cannot handle tasks like version updates or compilation failures (at least, without breaking tests afterwards).
I managed to guide it in some parts because, as I said at the beginning, I know how to implement and test Maven plugins. Without such experience, I would have had to give up very soon.
After all, this was just an experiment. Bootstrapping the project and the Git repository manually with the archetype, updating a few initial versions, and creating the GitHub Actions workflow would have taken me only a few minutes: much less than the time I spent with Copilot 😉
Maybe the “Claude 3.5 Sonnet” model was not the right one? As I said, I saw this is used in most online demos for Copilot agent mode.
I’ll do some further experiments, so stay tuned!