Monday, April 21, 2014

Code Refactoring - Dealing with Legacy Code

If you find yourself dealing with legacy code, that probably means you have code that:

  • Doesn't have a lot of tests (maybe none)
  • Is difficult to modify
  • Is not easily understandable
  • Has tricky bits (takes a long time to debug certain areas)
I'm going to use a specific example from an open source codebase to show how to surgically inspect and adapt your code to make it yield to your bidding.

Now, for this example the code is in Java.  I'll follow up at some point with an example in native code.  One major advantage of working in Java is that there are no real compiler variations, which make it very easy to introduce tests.

Beware the long method

This example is a method which fetches album art from a URL and stores it in a local datastore.  The unaltered code is below.

private void saveCoverArt(String path, String url) throws Exception { 
        InputStream input = null; 

        OutputStream output = null;
        HttpClient client = new DefaultHttpClient();

        try {
            HttpConnectionParams.setConnectionTimeout(client.getParams(), 20 * 1000); // 20 seconds
            HttpConnectionParams.setSoTimeout(client.getParams(), 20 * 1000); // 20 seconds
            HttpGet method = new HttpGet(url);

            HttpResponse response = client.execute(method);
            input = response.getEntity().getContent();

            // Attempt to resolve proper suffix.
            String suffix = "jpg";
            if (url.toLowerCase().endsWith(".gif")) {
                suffix = "gif";
            } else if (url.toLowerCase().endsWith(".png")) {
                suffix = "png";
            }

            // Check permissions.
            File newCoverFile = new File(path, "cover." + suffix);
            if (!securityService.isWriteAllowed(newCoverFile)) {
                throw new Exception("Permission denied: " + StringUtil.toHtml(newCoverFile.getPath()));
            }

            // If file exists, create a backup.
            backup(newCoverFile, new File(path, "cover.backup." + suffix));

            // Write file.
            output = new FileOutputStream(newCoverFile);
            IOUtils.copy(input, output);

            MediaFile dir = mediaFileService.getMediaFile(path);

            // Refresh database.
            mediaFileService.refreshMediaFile(dir);
            dir = mediaFileService.getMediaFile(dir.getId());

            // Rename existing cover file if new cover file is not the preferred.
            try {
                File coverFile = mediaFileService.getCoverArt(dir);
                if (coverFile != null && !isMediaFile(coverFile)) {
                    if (!newCoverFile.equals(coverFile)) {
                        coverFile.renameTo(new File(coverFile.getCanonicalPath() + ".old"));
                        LOG.info("Renamed old image file " + coverFile);

                        // Must refresh again.
                        mediaFileService.refreshMediaFile(dir);
                    }
                }
            } catch (Exception x) {
                LOG.warn("Failed to rename existing cover file.", x);
            }

        } finally {
            IOUtils.closeQuietly(input);
            IOUtils.closeQuietly(output);
            client.getConnectionManager().shutdown();
        }
    }

Now, the issue with this code is not that it is functionally wrong. It may work just fine. But, it takes a few reads and eye movement up and down and possibly to other classes to understand what's going on.  The method is too long, and it is violating the Single Responsiblity Principle.


Start by Understanding the Current Behavior

To refactor this, I want to know what it's current behavior is.  The easiest way is by building a test.  I may want to build a few tests to understand all the paths through the code.  I will need to make one change (easily reversible) to remove the private accessor.

A couple of things that I notice that are going to be important are that 1) this is looking for an http based URL, and 2) there are three different file extensions supported for the cover art (.jpg, .gif, and .png).

So, let's start with the first test - a happy case.  


First Test - Download a PNG of Dark Side of the Moon


I don't want to modify the code under test at this point, I merely want to characterize it's behavior.  So, I'll do the naive thing and then adjust as necessary.

The above png is available on Wikipedia, so I can make sure my code is pointing to that url when I download it.  I want to test the saveCoverArt( String path, String url ) method.

Here's my first try at a test:

public void testSaveCoverArtForDarkSideofTheMoonUsingPng() throws Exception 
{
    String savePath = ".";
    String url = "http://upload.wikimedia.org/wikipedia/en/thumb/3/3b/Dark_Side_of_the_Moon.png/220px-Dark_Side_of_the_Moon.png";
    coverArtService.saveCoverArt(savePath, url);
    File f = new File(savePath,"cover.png");
    assertTrue(f.exists());
}

Now, because I have done the naive thing, I get a NullPointerException thrown when the test runs at line 94 of CoverArtService.

CoverArtService is expecting to have something called SecurityService defined.  Upon inspection of the class I see a setter for SecurityService, meaning they have implemented dependency injection.  This is a really important concept for testability.  

Dependency injection allows the implementation of class dependencies to be replaceable.  This means I can test it easily without changing the code.  If you look at the alternative point of view of code without dependency injection, then CoverArtService would have instantiated its own SecurityService either through the new keyword or through a Factory.  

Because I can replace this dependency, I will use a mocking framework to do so.  There are quite a few mocking frameworks for Java.  I'll stick with Mockito, because it is terse and doesn't come with some of the unnecessary baggage of other mocking frameworks.

Here's the test again, now with a service mock.


public void testSaveCoverArtForDarkSideofTheMoonUsingPng() throws Exception 
{
    String savePath = ".";
    String url = "http://upload.wikimedia.org/wikipedia/en/thumb/3/3b/Dark_Side_of_the_Moon.png/220px-Dark_Side_of_the_Moon.png";
    when(securityServiceMock.isWriteAllowed(notNull(File.class))).thenReturn(true);
    coverArtService.saveCoverArt(savePath, url);
    File f = new File(savePath,"cover.png");
    assertTrue(f.exists());
    verify(securityServiceMock).isWriteAllowed(notNull(File.class));
}

First, a bit of explanation about mocks.  There's some weird looking syntax here at the where ... thenReturn.  This allows me to affect the mock object's behavior and make it behave like a stub when the isWriteAllowed method is called.  Later, after the call to coverArtService.saveCoverArt, I want to verify that isWriteAllowed was called because I expected it to be.  If this is not the case, then the test will fail.

Bah!  I got the NullPointerException again!  Except this time, it's moved to line 105 where the MediaFileService is called.  So, I'll use mock again.  At this point, I'll skip ahead and show that I also created a mock for the MediaFile class.
public void testSaveCoverArtForDarkSideofTheMoonUsingPng() throws Exception 
{
    String savePath = ".";
    String url = "http://upload.wikimedia.org/wikipedia/en/thumb/3/3b/Dark_Side_of_the_Moon.png/220px-Dark_Side_of_the_Moon.png";
    MediaFile mediaFileMock = mock(MediaFile.class);
    when(securityServiceMock.isWriteAllowed(notNull(File.class))).thenReturn(true);
    when(mediaFileServiceMock.getMediaFile(savePath)).thenReturn(mediaFileMock);
    coverArtService.saveCoverArt(savePath, url);
    File f = new File(savePath,"cover.png");
    assertTrue(f.exists());
    verify(securityServiceMock).isWriteAllowed(notNull(File.class));
    verify(mediaFileServiceMock).getMediaFile(savePath);
}

Now, the test runs and I am happy.  Except that, it took too long to run.


Running net.sourceforge.subsonic.ajax.CoverArtServiceTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.971 sec

Of course, this is not a true unit test.  This is more like a functional test.  The reason is that I am testing things that don't have anything to do with the logic of the code.  For example, fetching graphics from somewhere online and persisting to file don't actually need to be run in a unit test.  In fact, they shouldn't be.  Since this test took nearly 4 seconds to run, imagine what would happen if I had a hundred of these types of tests.  Goodbye fast feedback.  So, I want to reduce my dependency on I/O.  

And, another thing.  I have only scraped the surface of the behavior of this method.  So, I will add some other types of tests before I start refining the code too much.


Refactor

I have green tests (working tests).  Now, it's time to refactor the code.  If I take a look at the top section of the code, I see that it's really about setting up a connection.  So let's create a new method for this functionality.

Tip #1: Start small.

Besides starting small, I also want to use copy/paste rather than cut/paste.  If I cut/paste, I may end up in a situation where my code doesn't pass tests anymore and that's not the point.  The point is to use the safety net to make sure I'm not making mistakes by getting too far ahead of myself.

So, I will copy out the code needed to setup the connection and extract a method for it.

HttpClient getHttpClient()
{
    HttpClient client = new DefaultHttpClient();
    HttpConnectionParams.setConnectionTimeout(client.getParams(), 20 * 1000);
    HttpConnectionParams.setSoTimeout(client.getParams(), 20 * 1000);
    return client;
}

Now, before removing the equivalent from the original method, compile this.  Okay, now we can move forward.

void saveCoverArt(String path, String url) throws Exception {
        InputStream input = null;
        OutputStream output = null;
        HttpClient client = getHttpClient();

        try {
            HttpGet method = new HttpGet(url);
...

Our code looks slightly cleaner, but we should probably have a strategy moving forward.  Let's take another look at the method overall.  I won't repeat it here, but feel free to scroll back up.

There are a few discrete operations that are happening in this method.  

  1. Image art is being downloaded from an online HTTP based resource.
  2. Validation of the art type (jpg, png, gif).
  3. The art is being persisted to file store after passing security.
  4. The MediaFileService is getting updated if this is a new version of the original cover art.
So, let's use these as a way to refactor this method.

First, it looks like these steps are out of order.  Why is the suffix being validated after we have fetched the online resource?  What happens if the suffix is invalid?  It looks like there is an incorrect assumption in the code that anything that is not a gif or png is a jpg.  So, there are a few things to fix here.  Let's start by extracting the validation step.

Tip #2: Reorder the steps if necessary

I'll create a method for the image type validation.

First, I'm going to create a simple enum to hold the valid types of images.

public enum ImageType
{
    JPG,
    GIF,
    PNG
}

Second, I'll create the method that will return the valid image type.

ImageType validateImageType(String stringToMatch)
{
    for (ImageType imageType : ImageType.values() )
    {
        if (stringToMatch.toUpperCase().endsWith(imageType.name()))
            return imageType;
    }
    return null;
}

Now, we can create a test for this behavior.  


A special note.

Ordinarily, we would create the test first and then drive the behavior.  However, when refactoring code without unit tests, this is often difficult to do.  Instead, we can extract testability from the existing code and create tests where there were none before.  In a later post, I'll show how to create the test first when refactoring.

Okay, here's the test for the success case.

@Test
public void testValidateImageTypeIsMatchedForPngUrl() throws Exception
{
    ImageType type = coverArtService.validateImageType(DARK_SIDE_PNG_URL);
    assertEquals(ImageType.PNG, type);
}

We should also have a test for an invalid value.

@Test
public void testInvalidImageTypeIsMatchedForBmpUrl() throws Exception
{
    ImageType type = coverArtService.validateImageType(DARK_SIDE_BMP_URL);
    assertNull(type);
}

Now, let's put it together.

Here's the updated method.


void saveCoverArt(String path, String url) throws Exception 
{
    ImageType imageType = validateImageType(url);
    if (null == imageType)
    {
        LOG.info("Cover image " + url + " is not a valid image type.");
        return;
    }
    String suffix = imageType.name().toLowerCase();
        
    InputStream input = null;
    OutputStream output = null;
    HttpClient client = getHttpClient();

    try {
        /* getInputStream( client ) */
        HttpGet method = new HttpGet(url);

        HttpResponse response = client.execute(method);
        input = response.getEntity().getContent();

        /* persistCover( path, suffix, input ) */
        File newCoverFile = new File(path, "cover." + suffix);
        if (!securityService.isWriteAllowed(newCoverFile)) {
            throw new Exception("Permission denied: " + StringUtil.toHtml(newCoverFile.getPath()));
        }

        backup(newCoverFile, new File(path, "cover.backup." + suffix));

        // Write file.
        output = new FileOutputStream(newCoverFile);
        IOUtils.copy(input, output);

        MediaFile dir = mediaFileService.getMediaFile(path);

        // Refresh database.
        mediaFileService.refreshMediaFile(dir);
        dir = mediaFileService.getMediaFile(dir.getId());

        // Rename existing cover file if new cover file is not the preferred.
        try {
            File coverFile = mediaFileService.getCoverArt(dir);
            if (coverFile != null && !isMediaFile(coverFile)) {
                if (!newCoverFile.equals(coverFile)) {
                    coverFile.renameTo(new File(coverFile.getCanonicalPath() + ".old"));
                    LOG.info("Renamed old image file " + coverFile);

                    // Must refresh again.
                    mediaFileService.refreshMediaFile(dir);
                }
            }
        } catch (Exception x) {
            LOG.warn("Failed to rename existing cover file.", x);
        }

    } finally {
        IOUtils.closeQuietly(input);
        IOUtils.closeQuietly(output);
        client.getConnectionManager().shutdown();
    }
}


Unit Tests are Faster

Now, because I have refactored the tests you can see a huge difference between the amount of time it took to run the original functional test and the unit tests that are doing string validation.


By doing more refactoring, I can get a much faster test suite focusing on using mocks and removing the external dependencies that are I/O bound.  We are going to trust that I/O works.  We're using the Java standard library after all.

This is (not) the end

I have gone through a simple way to introduce tests and begin to refactor a lengthy method.  However, I'm not done.  In the next post, I'll carry on with this refactor and discover mistakes I have made and correct those too.

...to be continued.

3 comments:

  1. Good one, explained in a simple and effective way

    ReplyDelete
  2. Thanks for starting this "thread". You know, there's a higher-level concept, too, that your current focus on refactoring brings up: seeing any code change, even if made in a single line of a single file, as changes to the entire codebase. And not only changes which are refactoring.

    Generally, people think of refactoring as different from writing new code or fixing defects, because refactoring is precisely NOT to change the run-time behavior of the code, while new code or fixes are.

    But if we recognize that all code changes potentially affect both the dynamic behavior (what the software does) and also the static behavior (e.g. how easy it is to modify, or how many hidden defects there are in it), then we can unify all three types of code changes. And developers can be aware that any change to code may have wide-ranging effects. Motivation for automated whole-system regression testing, including load testing, as well as, of course, static analysis. Developers who internalize the challenge that every change may affect anything -- both the software's dynamic behavior, and the ease or difficulty of further coding by them or others -- may be more motivated to build the automated safety nets that code quality, Agile, and XP methodologies recommend.

    ReplyDelete