Bad to use very large strings? (Java)

Streaming vs not

When you can stream, you can handle files of any size (assuming you really can forget all the data you've already seen). You naturally end up with O(n) time complexity, which is a very good thing, and you don't break by running out of memory.

Streaming is lovely... but doesn't work in every scenario.
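As a rough illustration of the streaming case, here is a minimal sketch that counts the lines containing a given word while holding only one line in memory at a time. The class name StreamingSketch, the method countLinesContaining and the sample input are hypothetical names made up for this example; the task itself (counting matches) is just an assumed stand-in for whatever per-line work you actually need.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example class; not part of the original benchmark.
class StreamingSketch
{
    // Counts the lines containing 'needle'. Only the current line is held
    // in memory, so the size of the input doesn't matter.
    static long countLinesContaining(Reader source, String needle)
    {
        long matches = 0;
        try (BufferedReader reader = new BufferedReader(source))
        {
            String line;
            while ((line = reader.readLine()) != null)
            {
                if (line.contains(needle))
                {
                    matches++;
                }
                // Nothing accumulates: the previous line can be garbage
                // collected as soon as the next one is read.
            }
        }
        catch (IOException e)
        {
            // Rethrown unchecked to keep the sketch's signature simple.
            throw new UncheckedIOException(e);
        }
        return matches;
    }

    public static void main(String[] args)
    {
        // A StringReader stands in for a real file here (an assumption
        // made so the sketch runs without any files on disk).
        Reader source = new StringReader("a test\nnothing here\ntest again\n");
        System.out.println(countLinesContaining(source, "test")); // prints 2
    }
}
```

For a real file you would pass something like a FileReader instead of the StringReader. The point is only that the loop never needs the whole input as one String.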

StringBuilder

As it seems there's been a certain amount of controversy over the StringBuilder advice, here's a benchmark to show the effects. I had to reduce the size of the benchmark in order to get the slow version to even finish in a reasonable time.

Results first, then code. This is a very rough and ready benchmark, but the results are dramatic enough to make the point...

c:\Users\Jon\Test>java Test slow
Building a string of length 120000 without StringBuilder took 21763ms

c:\Users\Jon\Test>java Test fast
Building a string of length 120000 with StringBuilder took 7ms

And the code...

class FakeScanner
{
    private int linesLeft;
    private final String line;

    public FakeScanner(String line, int count)
    {
        linesLeft = count;
        this.line = line;
    }

    public boolean hasNext()
    {
        return linesLeft > 0;
    }

    public String next()
    {
        linesLeft--;
        return line;
    }
}

public class Test
{    
    public static void main(String[] args)
    {
        FakeScanner scanner = new FakeScanner("test", 30000);

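        // Fall back to the slow path if no argument is given,
        // rather than failing with ArrayIndexOutOfBoundsException
        boolean useStringBuilder = args.length > 0 && "fast".equals(args[0]);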

        // Accurate enough for this test
        long start = System.currentTimeMillis();

        String someString;
        if (useStringBuilder)
        {
            StringBuilder builder = new StringBuilder();
            while (scanner.hasNext())
            {
                builder.append(scanner.next());
            }
            someString = builder.toString();
        }
        else
        {
            someString = "";     
            while (scanner.hasNext())
            {
                someString += scanner.next();
            }        
        }
        long end = System.currentTimeMillis();

        System.out.println("Building a string of length " 
                           + someString.length()
                           + (useStringBuilder ? " with" : " without")
                           + " StringBuilder took " + (end - start) + "ms");
    }
}

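Strings in Java are immutable, so every += creates a new String object and copies everything accumulated so far into it. Doing that in a loop makes the total work grow quadratically with the number of appends. Use StringBuilder instead.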
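To put a number on that, here is a back-of-the-envelope sketch that counts how many characters get copied when 30000 four-character pieces are joined with +=. It assumes each += builds a fresh String containing everything so far, and it ignores any JIT or garbage-collection effects. The class name ConcatCostSketch and the method charsCopiedByConcat are hypothetical names made up for this illustration.

```java
// Hypothetical example class; a simplified cost model, not a benchmark.
class ConcatCostSketch
{
    // Total characters copied when 'pieces' strings of 'pieceLength'
    // characters are joined with +=, assuming every += copies the whole
    // result so far plus the new piece into a new String.
    static long charsCopiedByConcat(int pieces, int pieceLength)
    {
        long total = 0;
        long currentLength = 0;
        for (int i = 0; i < pieces; i++)
        {
            currentLength += pieceLength; // length of the new String
            total += currentLength;       // all of it is copied
        }
        return total;
    }

    public static void main(String[] args)
    {
        // Same sizes as the benchmark: 30000 copies of "test"
        System.out.println(charsCopiedByConcat(30000, 4)); // prints 1800060000
        // Length of the final string, for comparison
        System.out.println(30000L * 4);                    // prints 120000
    }
}
```

Under this model the += loop copies about 1.8 billion characters to build a 120000-character string, whereas a StringBuilder appends each character roughly once, plus occasional buffer growth.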
