Tuesday, January 30, 2007

String Concatenation optimization

String concatenation is one of the most common operations, but it is also a pretty expensive one. Used carelessly, it can hurt performance severely. Performance drops drastically if you append strings using '&' or ListAppend() in a loop. I have seen application performance improve by 50-100% just by optimizing string concatenation (though that depends on how much concatenation the app does). So what do you do about it?
The simplest and most efficient way to do these append operations is to use Java's StringBuffer. (I am sure you are already aware of it, but still.. :))
The code would look like

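<cfset sb = createObject("java", "java.lang.StringBuffer")>
<cfloop from=1 to=100 index=i>
<cfset sb.append("something")>
<!--- javacast() picks the String overload; a bare sb.append(i) is ambiguous to CF --->
<cfset sb.append(javacast("string", i))>
</cfloop>
<cfset result=sb.toString()>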

Sometimes I feel that we should have a data structure like this in ColdFusion directly, but then again, what's wrong with using StringBuffer? It's like any other function we would create. Isn't it?

If you are a purist and don't want to use any Java API inside your CF app, there is another simple way to do the same thing: use a ColdFusion array in place of the StringBuffer. Instead of appending each string to the buffer, append it to the array using ArrayAppend(). Once you are done and want the string back, call ArrayToList() with an empty string ("") as the delimiter. The code would look like

<cfset arr = ArrayNew(1)>
<cfloop from=1 to=100 index=i>
<cfset ArrayAppend(arr, "something")>
<cfset ArrayAppend(arr, i)>
</cfloop>
<cfset result=ArrayToList(arr,"")>

This performs much better than concatenation using '&' or ListAppend(), but not as well as StringBuffer, because of the overhead of creating the array object and of each array append operation. ArrayToList() will in any case create a string buffer internally and append the strings to it.
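
If you want to check the difference on your own server, a rough timing sketch like the one below can help. The iteration count and the variable names (iterations, t0, ampMs, sbMs, arrMs) are just placeholders I picked for illustration, and getTickCount() only gives coarse millisecond timings, so treat the numbers as a relative comparison and not as a benchmark.

<cfset iterations = 10000>

<!--- 1. Plain '&' concatenation in a loop --->
<cfset t0 = getTickCount()>
<cfset s = "">
<cfloop from="1" to="#iterations#" index="i">
<cfset s = s & "something" & i>
</cfloop>
<cfset ampMs = getTickCount() - t0>

<!--- 2. java.lang.StringBuffer --->
<cfset t0 = getTickCount()>
<cfset sb = createObject("java", "java.lang.StringBuffer")>
<cfloop from="1" to="#iterations#" index="i">
<cfset sb.append("something")>
<cfset sb.append(javacast("string", i))>
</cfloop>
<cfset s = sb.toString()>
<cfset sbMs = getTickCount() - t0>

<!--- 3. ColdFusion array joined with an empty delimiter --->
<cfset t0 = getTickCount()>
<cfset arr = ArrayNew(1)>
<cfloop from="1" to="#iterations#" index="i">
<cfset ArrayAppend(arr, "something")>
<cfset ArrayAppend(arr, i)>
</cfloop>
<cfset s = ArrayToList(arr, "")>
<cfset arrMs = getTickCount() - t0>

<cfoutput>Ampersand: #ampMs# ms, StringBuffer: #sbMs# ms, Array: #arrMs# ms</cfoutput>

Run it a few times and with different values of iterations, since the first pass also pays for template compilation and JVM warm-up.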

You should use '&' or ListAppend() only when there are just 2-3 strings to concatenate. Otherwise, always use one of the two techniques above.


Ben Nadel said...

One thing that I wanted to point out (that might not be thought of by all) is that arrays are passed by value, whereas Java objects are passed by reference. What does this mean for string concatenation? Not much, unless you have some sort of recursive function or something that builds a string over a bunch of method calls. Using a Java StringBuffer, you can keep passing the buffer object to the methods as an object to write to without worrying about the overhead of object copying.

If you try to do the same with a ColdFusion array, its structure will get duplicated with every method call (passed by value), which will have some performance implications.

So, again, good tip, but here's just another angle to contemplate that might not be obvious.

Rupesh Kumar said...

Nice point Ben ! If you want to build it over some method calls, you should not use ColdFusion array. StringBuffer is your best friend in that case :)
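
To make that concrete, here is a minimal sketch of building a string across recursive calls with a shared StringBuffer. The function name writeLevels and its arguments are made up for this illustration; the point is only that every call writes into the same buffer object, because Java objects are passed by reference.

<cffunction name="writeLevels" returntype="void" output="false">
<cfargument name="buffer" required="true">
<cfargument name="depth" type="numeric" required="true">
<!--- Write into the caller's buffer; no copy of the buffer is made --->
<cfset arguments.buffer.append(javacast("string", "level " & arguments.depth & ";"))>
<cfif arguments.depth gt 1>
<!--- Pass the same buffer one level down --->
<cfset writeLevels(arguments.buffer, arguments.depth - 1)>
</cfif>
</cffunction>

<cfset sb = createObject("java", "java.lang.StringBuffer")>
<cfset writeLevels(sb, 3)>
<cfset result = sb.toString()>

If you passed a ColdFusion array in the same way, each call would work on its own copy, so the function would have to return the array and the caller would have to reassign it.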

crish said...

Hey Rupesh,

this code is not working?

<cfloop from=1 to=100 index=i>
<cfset sb.append("something")>
<cfset sb.append(i)>
<cfset result=sb.toString()>

Rupesh Kumar said...

@Crish, you are right.
There is a small mistake there. Since the StringBuffer class has many overloaded append methods, CF did not know which one to invoke. Replace the line

<cfset sb.append(i)>

with

<cfset sb.append(javacast("String", i))>

and it will work fine.