July 2006
[ vmassol ] 21:30, Thursday, 27 July 2006

I'm a bit late reporting on this, but we had a very nice Maven Day in Paris in early July 2006, co-organized by Application-Servers.com and the OSSGTP (a Parisian open source group). Thanks to Improve for sponsoring the event!

We were lucky to have Jason Van Zyl talk about new Maven2 stuff, especially the Repository Manager and the maven.org development platform. Emmanuel Venisse gave a presentation on Continuum, Fabrice Bellingard presented an experience report on implementing Maven in a large company, and I gave a quick presentation of the new quality features in Maven2. See Jean-Laurent's blog entry for more details.

[ vmassol ] 08:19, Tuesday, 18 July 2006

It would be nice if there were a tool that could verify that you have correctly added @since tags for methods added in the current version. It would do this by checking against the previous release.

This tool could be based on Clirr or JDiff for example. It would also have an option to fail the build if there are new methods without a @since tag.
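Here is a rough sketch of the core check, under heavy assumptions: a diff tool such as Clirr or JDiff has already given us the method signatures of the previous release, and we have the Javadoc text of every method in the current version. The names SinceTagCheck and findMissingSince are hypothetical and do not correspond to any existing tool or API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not an existing tool, and not a Clirr or JDiff API.
class SinceTagCheck
{
    /**
     * @param previousMethods signatures present in the previous release
     * @param currentJavadoc signature -> Javadoc text for the current version
     * @return signatures that are new but whose Javadoc has no @since tag
     */
    static List<String> findMissingSince(Set<String> previousMethods,
        Map<String, String> currentJavadoc)
    {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, String> entry : currentJavadoc.entrySet())
        {
            boolean isNew = !previousMethods.contains(entry.getKey());
            String javadoc = entry.getValue();
            boolean hasSince = javadoc != null && javadoc.contains("@since");
            if (isNew && !hasSince)
            {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args)
    {
        Map<String, String> current = new LinkedHashMap<>();
        current.put("void start()", "Starts the container.");
        current.put("void stop(long)", "Stops the container. @since 0.9");
        current.put("void restart()", "Restarts the container.");

        // A build plugin could fail the build when this list is not empty.
        System.out.println(findMissingSince(Set.of("void start()"), current));
    }
}
```

The "fail the build" option would then simply be a matter of reporting an error when the returned list is not empty.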

Do you know if such a tool exists?

[ vmassol ] 13:52, Monday, 17 July 2006

The experience I'm relating here is part of an exploratory refactoring that I'm currently doing on the Cargo code base. Until now we were using Java File objects to represent J2EE archives and container installation and configuration directories. This is OK, but it makes unit testing File operations more complex than it should be: you need to define a location on your local file system where files are read and written, clean up those files afterwards, etc.

Here's a method we had (it expands a JAR file):

    public void expandToPath(String path) throws IOException
    {
        File workDir = new File(path);
        JarInputStream inputStream = getContentAsStream();

        byte[] buffer = new byte[40960];

        ZipEntry entry;
        while ((entry = inputStream.getNextEntry()) != null)
        {
            String entryName = entry.getName();
            entryName = entryName.replace('/', File.separatorChar);

            String outFileName = workDir.getPath() + File.separator + entryName;
            File outFile = new File(outFileName);

            if (outFileName.endsWith("/") || outFileName.endsWith("\\"))
            {
                outFile.mkdirs();
            }
            else
            {
                if (!outFile.getParentFile().exists())
                {
                    outFile.getParentFile().mkdirs();
                }

                if (!outFile.exists())
                {
                    outFile.createNewFile();
                }

                FileOutputStream out = new FileOutputStream(outFile);
                int read;
                while ((read = inputStream.read(buffer)) > 0)
                {
                    out.write(buffer, 0, read);
                }

                out.close();
            }
        }
        inputStream.close();
    }

Here's how I've transformed the method by removing all File operations and instead introducing a FileHandler interface with the following methods, equivalent to the File ones:

  • append(URI, String): appends a suffix to a URI
  • getParent(URI): returns the parent of the URI
  • mkdirs(URI): creates directories for the URI
  • exists(URI): returns true if the URI exists
  • createFile(URI): creates a file
  • getOutputStream(URI): gets an output stream for the passed URI

    public void expandToPath(URI path) throws IOException
    {
        JarInputStream inputStream = getContentAsStream();

        byte[] buffer = new byte[40960];

        ZipEntry entry;
        while ((entry = inputStream.getNextEntry()) != null)
        {
            String entryName = entry.getName();

            URI outFile = getFileHandler().append(path, entryName);

            if (outFile.toString().endsWith("/"))
            {
                getFileHandler().mkdirs(outFile);
            }
            else
            {
                if (!getFileHandler().exists(getFileHandler().getParent(outFile)))
                {
                    getFileHandler().mkdirs(getFileHandler().getParent(outFile));
                }

                if (!getFileHandler().exists(outFile))
                {
                    getFileHandler().createFile(outFile);
                }

                OutputStream out = getFileHandler().getOutputStream(outFile);
                int read;
                while ((read = inputStream.read(buffer)) > 0)
                {
                    out.write(buffer, 0, read);
                }

                out.close();
            }
        }
        inputStream.close();
    }
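
For reference, the FileHandler interface used above could look roughly like the sketch below. The signatures are inferred from the method list and from the calls made in expandToPath(); the checked exceptions and the InMemoryFileHandler class are assumptions made for illustration, not the actual Cargo code. The toy implementation is only there to show how easy the interface is to fake.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch only: signatures are inferred from the calls made in expandToPath().
interface FileHandler
{
    /** Appends a suffix (for example a JAR entry name) to a URI. */
    URI append(URI uri, String suffix);

    /** Returns the parent location of the URI. */
    URI getParent(URI uri);

    /** Creates the directories for the URI. */
    void mkdirs(URI uri) throws IOException;

    /** Returns true if something exists at the URI. */
    boolean exists(URI uri) throws IOException;

    /** Creates an empty file at the URI. */
    void createFile(URI uri) throws IOException;

    /** Opens a stream for writing to the URI. */
    OutputStream getOutputStream(URI uri) throws IOException;
}

// Hypothetical toy implementation that keeps everything in maps.
class InMemoryFileHandler implements FileHandler
{
    private final Set<String> directories = new HashSet<>();
    private final Map<String, ByteArrayOutputStream> files = new HashMap<>();

    // Locations are stored without a trailing slash so "dir/" and "dir" match.
    private static String key(URI uri)
    {
        String s = uri.toString();
        return s.endsWith("/") ? s.substring(0, s.length() - 1) : s;
    }

    public URI append(URI uri, String suffix)
    {
        return URI.create(key(uri) + "/" + suffix);
    }

    public URI getParent(URI uri)
    {
        String s = key(uri);
        return URI.create(s.substring(0, s.lastIndexOf('/')));
    }

    public void mkdirs(URI uri)
    {
        directories.add(key(uri));
    }

    public boolean exists(URI uri)
    {
        return directories.contains(key(uri)) || files.containsKey(key(uri));
    }

    public void createFile(URI uri)
    {
        files.putIfAbsent(key(uri), new ByteArrayOutputStream());
    }

    public OutputStream getOutputStream(URI uri)
    {
        // Opening a stream replaces any previous content, like FileOutputStream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        files.put(key(uri), out);
        return out;
    }
}
```

The toy implementation does not handle edge cases such as the parent of a root location or entry names that are not valid URI characters.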

The interesting part comes now. Because it was a bit hard to create a unit test for the original expandToPath method, nobody had done it. It would have involved passing a test JAR and, more difficult, a target directory where the JAR would be expanded. This is not easy, as the location of this target directory depends on where the tests are executed from, and making it work seamlessly from both a build tool and your IDE is not trivial. Here comes Commons VFS to help us. By implementing the FileHandler interface using VFS, we can now write the following unit test:

    public void testExpandToPath() throws Exception
    {
        URI jarURI = new URI("ram:///test.jar");

        FileObject testJar = VFS.getManager().resolveFile(jarURI.toString());
        ZipOutputStream zos = new ZipOutputStream(testJar.getContent().getOutputStream());
        ZipEntry zipEntry = new ZipEntry("rootResource.txt");
        zos.putNextEntry(zipEntry);
        zos.write("Some content".getBytes());
        zos.closeEntry();
        zos.close();

        DefaultJarArchive jarArchive = new DefaultJarArchive(jarURI);
        jarArchive.setFileHandler(new VFSFileHandler());

        jarArchive.expandToPath(new URI("ram:///test"));

        // Verify that the rootResource.txt file has been correctly expanded
        FileObject rootResource = VFS.getManager().resolveFile("ram:///test/rootResource.txt");
        assertTrue(rootResource.exists());
    }

Notice the use of the "ram:" URI scheme. This is one of the many file systems supported by VFS, and it means that all file operations happen in a virtual file system in memory. Also note that VFS doesn't currently support creating Zip files, so we're using the JDK's ZipOutputStream API. The nice thing is that, as this test operates in memory, there's no need to define a target location on the file system.

The other nice thing is that by introducing VFS to this expandToPath() method it's now possible to expand a JAR to any file system supported by VFS. We could thus expand to an FTP server, to a WebDAV repository, to an HTTP URL, to a remote machine using SSH, etc. All this without changing a line of our code. Nice, isn't it?

[ vmassol ] 09:54, Thursday, 13 July 2006

(Updated 2006-07-14: Added section on discovering modules and added disclaimer at the end)

IntelliJ IDEA has revolutionized the IDE landscape by adding "intelligence" to IDEs. A few days ago I did a thought experiment by asking myself the following question: "how feasible would it be to build a project without knowing any meta-data about it?". In other words, is it possible for a build tool to be intelligent enough to build a project without build files or POMs? Said differently, is it possible to figure out a project's POM automatically? Let's review some typical required meta-data and see how it could be guessed.

Source locations

It is possible to guess where sources are by looking for *.java files (for Java projects; the same applies to other project types). Now we still need to differentiate main sources from test sources, but that's also relatively easy to do. We can check for classes extending JUnit's TestCase, for example, or the TestNG equivalent, or the equivalent for any other well-known testing framework.

Note: An interesting thing here is that to be intelligent we'd need the help of the community to add new rules to the discovery process. For example imagine that a new testing framework appears; we'd need to add it to the Test Discovery Rules. Thus, this type of intelligent build system would need to rely a lot on the community and thus would need to get its data from an online repository that could be edited by the community.
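
The test discovery rules described above could be sketched as a plain list of patterns, which is what would make them easy to extend from a community-edited repository. The class name TestSourceDetector and the two rules below are hypothetical and purely illustrative; a real rule set would need to be much richer.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of "Test Discovery Rules"; the rule list is illustrative only.
class TestSourceDetector
{
    // Each rule is a regular expression suggesting that a source file is a test.
    private static final List<Pattern> RULES = List.of(
        // JUnit 3 style: class extends TestCase
        Pattern.compile("\\bextends\\s+(junit\\.framework\\.)?TestCase\\b"),
        // TestNG style: the source imports TestNG annotations
        Pattern.compile("\\bimport\\s+org\\.testng\\.annotations\\."));

    /** Returns true if at least one rule matches the given Java source text. */
    static boolean looksLikeTest(String javaSource)
    {
        for (Pattern rule : RULES)
        {
            if (rule.matcher(javaSource).find())
            {
                return true;
            }
        }
        return false;
    }
}
```

Supporting a new testing framework would then only require adding one more pattern to the list.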

Dependencies

How do we detect project dependencies? One relatively easy way is to parse the sources that we have found above and find all external imports, then query ibiblio to find matching package names (this information is present in Maven POMs on ibiblio). As for guessing the version, there's no easy magic. A first approach would be to take the latest released version of each dependency we've found.
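
The first half of that idea, collecting the external packages a source file imports, can be sketched as follows. ImportScanner and its parameters are hypothetical names; the sketch ignores static imports and fully qualified class names used inline, and the lookup against ibiblio is not shown.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extracts imported package names from Java source text.
class ImportScanner
{
    // Captures the package part of "import some.package.ClassName;" or "import some.package.*;".
    private static final Pattern IMPORT =
        Pattern.compile("^\\s*import\\s+([\\w.]+?)\\.[\\w*]+\\s*;", Pattern.MULTILINE);

    /**
     * @param javaSource the text of one Java source file
     * @param projectPackagePrefix packages starting with this prefix belong to the project itself
     * @return the sorted set of imported packages that look external
     */
    static Set<String> externalPackages(String javaSource, String projectPackagePrefix)
    {
        Set<String> packages = new TreeSet<>();
        Matcher matcher = IMPORT.matcher(javaSource);
        while (matcher.find())
        {
            String pkg = matcher.group(1);
            // java.* always comes from the JDK; javax.* is kept because some of it does not.
            boolean isJdk = pkg.startsWith("java.");
            boolean isOwn = pkg.startsWith(projectPackagePrefix);
            if (!isJdk && !isOwn)
            {
                packages.add(pkg);
            }
        }
        return packages;
    }
}
```

Each returned package name would then be the key used to search the repository for an artifact providing it.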

Project type

Project types can easily be guessed by looking at some files. For example if a web.xml file is present then it's a WAR project, if an application.xml one is found then it's an EAR project, if a jnlp file is found then it's a JNLP project, etc.
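
As a minimal sketch of this marker-file lookup (PackagingGuesser is a hypothetical name, and both the precedence between markers and the "jar" default are my own assumptions):

```java
import java.util.Set;

// Hypothetical sketch: guesses a project type from marker file names.
class PackagingGuesser
{
    /** @param fileNames simple names of the files found in the project tree */
    static String guessPackaging(Set<String> fileNames)
    {
        // Assumption: application.xml wins over web.xml when both are present.
        if (fileNames.contains("application.xml"))
        {
            return "ear";
        }
        if (fileNames.contains("web.xml"))
        {
            return "war";
        }
        for (String name : fileNames)
        {
            if (name.endsWith(".jnlp"))
            {
                return "jnlp";
            }
        }
        // Assumed default when no marker file is found.
        return "jar";
    }
}
```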

SCM

SCM can easily be guessed by looking for special files on the filesystem of the project. For example we would look for CVS directories for CVS, for .svn directories for Subversion, etc.
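
A minimal sketch of that check, relying on the fact that CVS keeps its metadata in directories named CVS and Subversion in directories named .svn. ScmGuesser is a hypothetical name, and only the project's top-level directory is inspected.

```java
import java.io.File;

// Hypothetical sketch: guesses the SCM from metadata directories.
class ScmGuesser
{
    /** @param projectDir the top-level directory of the checked-out project */
    static String guessScm(File projectDir)
    {
        if (new File(projectDir, ".svn").isDirectory())
        {
            return "svn";
        }
        if (new File(projectDir, "CVS").isDirectory())
        {
            return "cvs";
        }
        return "unknown";
    }
}
```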

Developers

Once we have the SCM URL we can then query the SCM to get the list of all developers.

Project name

The project name could be the name of the top level directory and the version could be set arbitrarily to 1.0. Actually we could even check ibiblio to see if the project is already on ibiblio, get the latest version there and increase the minor number by one as a first order guess. Another strategy would be to query the SCM and look for tags and deduce existing versions by parsing those tags (there are some usual conventions for naming tags so it should be possible to make a good guess).
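
The tag-based strategy could be sketched like this. VersionGuesser is a hypothetical name, and the sketch assumes a deliberately narrow tag convention: names ending in a two-part "major.minor" version such as "cargo-0.8". Anything else, including three-part versions, is ignored.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: deduces the next version from SCM tag names.
class VersionGuesser
{
    // Matches a trailing "major.minor" that is not part of a longer dotted version.
    private static final Pattern TAG = Pattern.compile("(?<![\\d.])(\\d+)\\.(\\d+)$");

    /** Returns the highest tagged version with its minor number increased by one, or "1.0". */
    static String nextVersion(List<String> tagNames)
    {
        int bestMajor = -1;
        int bestMinor = -1;
        for (String tag : tagNames)
        {
            Matcher matcher = TAG.matcher(tag);
            if (!matcher.find())
            {
                continue;
            }
            int major = Integer.parseInt(matcher.group(1));
            int minor = Integer.parseInt(matcher.group(2));
            if (major > bestMajor || (major == bestMajor && minor > bestMinor))
            {
                bestMajor = major;
                bestMinor = minor;
            }
        }
        // Fall back to the arbitrary 1.0 when no tag follows the convention.
        return bestMajor < 0 ? "1.0" : bestMajor + "." + (bestMinor + 1);
    }
}
```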

Modules and artifacts

Discovering the different modules of a project is probably one of the hardest things to do. If you look at different projects in the wild, I believe there are not that many directory structures out there. Maybe 10-15. Thus it should be possible to register knowledge of these structures and let the tool discover which one matches the project at hand most closely. This would also make it possible to deduce the different artifacts that have to be generated. Of course it won't be perfect, as there are projects which generate several artifacts from the same module. Again it's a question of doing 80% of the job and leaving 20% to be done manually.
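
Matching against registered structures could be as simple as scoring each known layout by the fraction of its directories found in the project. The catalogue below is purely illustrative (three made-up entries, not the 10-15 real ones), and LayoutMatcher is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: picks the registered layout that best matches a project.
class LayoutMatcher
{
    // Illustrative catalogue only: layout name -> directories that characterize it.
    private static final Map<String, List<String>> KNOWN_LAYOUTS = new LinkedHashMap<>();

    static
    {
        KNOWN_LAYOUTS.put("maven2-standard",
            List.of("src/main/java", "src/test/java", "src/main/resources"));
        KNOWN_LAYOUTS.put("src-java-and-test", List.of("src/java", "src/test"));
        KNOWN_LAYOUTS.put("flat", List.of("src", "test", "lib"));
    }

    /** @param projectDirs relative paths of the directories found in the project */
    static String closestLayout(Set<String> projectDirs)
    {
        String best = "unknown";
        double bestScore = 0.0;
        for (Map.Entry<String, List<String>> layout : KNOWN_LAYOUTS.entrySet())
        {
            List<String> expected = layout.getValue();
            long hits = expected.stream().filter(projectDirs::contains).count();
            // Score is the fraction of the layout's directories present in the project.
            double score = (double) hits / expected.size();
            if (score > bestScore)
            {
                best = layout.getKey();
                bestScore = score;
            }
        }
        return best;
    }
}
```

A real tool would need a smarter score (for example weighting directories and handling nested modules), but the principle stays the same.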

Additional information

Of course, the information found above is only a set of guesses. In most cases the guesses could be correct, but we would still need to offer a way for the user to edit them and to add any missing information.

Conclusion

I believe it should be possible to create such an intelligent meta-build project, which could be used to generate files for one of the existing build systems such as Maven, Ant, etc. For example it could create an internal POM file on which Maven could then be executed to produce the build results. At a minimum such a tool could be used to convert existing projects to Maven. I wonder how intelligent it could be, but I guess it could go pretty far.
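
To make the "internal POM" idea concrete, here is a small sketch that renders guessed values as a minimal Maven 2 POM. PomWriter is a hypothetical name, values are not XML-escaped, and a real POM would also carry the dependencies, SCM and developer information discussed above.

```java
// Hypothetical sketch: renders guessed metadata as a minimal Maven 2 POM.
class PomWriter
{
    static String toPom(String groupId, String artifactId, String version, String packaging)
    {
        // Note: values are inserted as-is, without XML escaping.
        return "<project>\n"
            + "  <modelVersion>4.0.0</modelVersion>\n"
            + "  <groupId>" + groupId + "</groupId>\n"
            + "  <artifactId>" + artifactId + "</artifactId>\n"
            + "  <version>" + version + "</version>\n"
            + "  <packaging>" + packaging + "</packaging>\n"
            + "</project>\n";
    }

    public static void main(String[] args)
    {
        // Example values only; a real tool would feed in the guessed metadata.
        System.out.print(toPom("org.example", "myproject", "1.0", "war"));
    }
}
```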

Disclaimer: Of course, such a tool would be bad from a conventions standpoint. One of the great strengths of Maven has been to standardize the directory structure of projects. I can go to any Maven project and I know exactly where stuff will be, what will be generated, etc.

Is there other information which you think could be guessed automatically? Can you think of better algorithms to guess some of the information shown above?