Directory walking

A file system is a recursive structure: a directory contains files and other directories. Ignoring symbolic links, a file system can be thought of as a tree, so walking through it can use the same techniques as walking a tree.
There are many ways of doing so: pre-order, in-order and post-order, to mention some. The interesting thing about the approach presented here is that it does not load the whole directory structure into memory, only the listings of the directories on the path to the current file. This keeps the space complexity low and lets the walk cope with some directory changes while it is in progress!

We start with a basic usage example:

package ar.com.kriche.dirwalk;

import java.io.File;
import java.io.IOException;

/**
 * Demonstrates a simple use for directory iteration.
 * In a real case the program will not necessarily call getNextFile()
 * over and over again in a single loop.
 * 
 * @author Kriche
 * 
 */
public class Main {

    // TODO ROOT_DIR_PATH must be set to the path of an existing directory:
    private static final String ROOT_DIR_PATH = null;
    // false: walk the tree once; true: start over after the last file
    private static final boolean LOOP_FOR_EVER = false;

    public static void main(String[] args) throws IOException, InterruptedException {

        DirWalker dirWalker = new DirWalker(ROOT_DIR_PATH, LOOP_FOR_EVER);
        File file;
        long files = 0;
        while ((file = dirWalker.getNextFile()) != null) {
            files++;
            System.out.println("file: " + file.getAbsolutePath());
        }
        System.out.println("total files: " + files);
    }
}

			
The underlying idea for dir walking is simple:
dirwalk (directory) {
	open directory;
	for each entry in directory {
		if entry is a file, just list entry (or do something with it);
		else (it is a directory) dirwalk(entry);
	}
	close directory;
}
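
For reference, the pseudocode above can be written in Java roughly as follows. This is only a minimal sketch: the class name RecursiveDirWalk and the callback parameter are assumptions made for illustration, not part of the code presented in this article. Note that this version visits the whole tree in a single call, which is exactly what the DirWalker class avoids.

```java
import java.io.File;
import java.util.Arrays;
import java.util.function.Consumer;

public class RecursiveDirWalk {

    /**
     * Visits every non-directory entry under dir, depth first, and hands it
     * to action. Hypothetical helper, written only to mirror the pseudocode.
     */
    static void dirwalk(File dir, Consumer<File> action) {
        // "open directory": listFiles() returns null if dir is not a
        // directory, cannot be read or does not exist
        File[] entries = dir.listFiles();
        if (entries == null) {
            return;
        }
        Arrays.sort(entries); // visit entries in a predictable order
        for (File entry : entries) {
            if (entry.isDirectory()) {
                dirwalk(entry, action); // recurse into the subdirectory
            } else {
                action.accept(entry); // list the file (or do something with it)
            }
        }
        // "close directory": nothing to do, the listing is a plain array
    }

    public static void main(String[] args) {
        // assumption: the directory is given as first argument, default "."
        File root = new File(args.length > 0 ? args[0] : ".");
        dirwalk(root, f -> System.out.println("file: " + f.getAbsolutePath()));
    }
}
```

The caller has no way to pause this walk and pick it up later: control only comes back once the recursion has finished.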
			
Here we add some extra complexity: we want to offer the loop functionality, and instead of just listing each file we want to return it to the caller, so that the next call resumes with the following file.
package ar.com.kriche.dirwalk;

import java.io.File;
import java.util.Arrays;

/**
 * Iterates over a directory structure (including subdirectories). Supports
 * structure changes while iterating: files that no longer exist are skipped,
 * and new files are picked up at some point, although it is not guaranteed
 * that this happens during the current iteration.
 *
 * @author Kriche
 *
 */
public class DirWalker {

    private final String dirPath;
    private final boolean loop;
    private File[] files;
    private boolean completed = false;
    private boolean noFileReturned;
    private int nextFile = 0;
    private DirWalker subDirWalker;

    /**
     * @param dirPath must be an existing directory
     * @param loop if true, the walk starts over from the beginning once the
     * last file has been returned
     */
    public DirWalker(String dirPath, boolean loop) {
        this.dirPath = dirPath;
        this.loop = loop;
        init();
    }

    private void init() {
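        // listFiles() returns null if dirPath is not a directory, cannot be
        // read or has been deleted in the meantime; sorting null would throw
        files = new File(dirPath).listFiles();
        if (files != null) {
            Arrays.sort(files);
        }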
        completed = false;
        nextFile = 0;
        subDirWalker = null;
        noFileReturned = true;
    }

    /**
     *
     * @return the next file, or null if there are no files in the whole
     * directory tree or if the walk has finished and loop is false. Skips
     * files that no longer exist. Only returns files that are not directories.
     */
    public File getNextFile() {

        // if we were in a sub dir, we resume from there
        if (subDirWalker != null) {
            File currentFile = subDirWalker.getNextFile();
            if (currentFile == null) {
                // finished with the subdir: carry on as if we never had it!
                subDirWalker = null;
                return getNextFile();
            }
            noFileReturned = false;
            return currentFile;
        }

        if (completed) {

            if (!loop) {
                // completed and no loop, from now on: always null
                return null;
            }

            if (noFileReturned) {
                /*
                 * completed dirwalking but there were no files, return null
                 * even if looping since there is no point in walking forever
                 * in an empty or directories-only tree and it would cause
                 * infinite recursion!
                 * consider the case of having only one directory that only has
                 * one empty directory.
                 */
                init();
                return null;
            }

            // start again, refresh dir in case files changed!
            init();
            if (files == null || files.length == 0) {
                // no files! start again next time, but now return null!
                completed = true;
                return null;
            }
        }

        if (files == null || nextFile == files.length) {
            // no more files in this directory! mark it as completed and let
            // the next call decide whether to stop or to start again
            completed = true;
            return getNextFile();
        }

        File current = files[nextFile++];
        if (!current.exists()) {
            // skip non existing files by moving to the next
            return getNextFile();
        }
        if (current.isDirectory()) {
            // delegates to the subdirwalker
            // never loop otherwise we will never leave the subdir
            // TODO recursive call might fail here if current is deleted just
            // before this recursive call:
            subDirWalker = new DirWalker(current.getAbsolutePath(), false);
            return getNextFile();
        }
        noFileReturned = false;
        return current;
    }

}
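
For comparison, the same "return one file, resume on the next call" behaviour can also be obtained with an explicit stack of pending entries instead of a chain of nested DirWalker instances. The following is only a sketch under some assumptions: the class name StackDirIterator is hypothetical, it implements java.util.Iterator rather than the getNextFile() contract above, and it does not offer the loop functionality.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class StackDirIterator implements Iterator<File> {

    // entries still to be visited; a directory is only listed when popped,
    // so the whole tree is never held in memory at once
    private final Deque<File> pending = new ArrayDeque<>();
    // file to be returned by the following call to next(), or null if done
    private File next;

    public StackDirIterator(File root) {
        pending.push(root);
        advance();
    }

    /** Moves on to the next existing non-directory entry, if any. */
    private void advance() {
        next = null;
        while (next == null && !pending.isEmpty()) {
            File current = pending.pop();
            if (current.isDirectory()) {
                File[] entries = current.listFiles();
                if (entries == null) {
                    continue; // directory vanished or cannot be read: skip it
                }
                Arrays.sort(entries);
                // push in reverse order so the first entry is popped first
                for (int i = entries.length - 1; i >= 0; i--) {
                    pending.push(entries[i]);
                }
            } else if (current.exists()) {
                next = current;
            }
            // entries that no longer exist are silently skipped
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public File next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        File result = next;
        advance();
        return result;
    }
}
```

Because the following file is looked up one step ahead, a file deleted between two calls to next() may still be returned once. Whether that matters depends on the caller.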