Thursday, January 10, 2008

DNS lookup Caching in ColdFusion/Java

Two days ago I got an interesting question from our support team. A customer was using CFLDAP to connect to a particular LDAP server, and things were working fine. The problem came when they replaced that LDAP server with a new one and assigned the same DNS name to it. Ideally, any connection made from then on should have gone to the new LDAP server, but that did not happen. ColdFusion started throwing errors and did not work until the ColdFusion server was restarted. Ever seen something similar?

This happened because the IP address for the host name was being cached, so ColdFusion kept trying to connect to the old IP address even though the host name now pointed to a different one. The caching is not limited to CFLDAP; it applies to all other network protocol tags as well, such as CFHTTP, CFFTP and CFFEED. It is actually the JVM that does this caching. When the JVM is asked to resolve a host name, it resolves the name to an IP address and caches the result, which is then used for all subsequent lookups. It also caches negative results: if DNS resolution fails, the failure is cached for a certain period, so any lookup for that host name within that period does not trigger another resolution over the network and immediately returns the failed result.

For more detail on this caching, check out the Javadoc for the InetAddress class.

As per this doc, there are two parameters that control this caching

networkaddress.cache.ttl (default: -1)
Indicates the caching policy for successful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the successful lookup.

A value of -1 indicates "cache forever".

networkaddress.cache.negative.ttl (default: 10)
Indicates the caching policy for unsuccessful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the failure for unsuccessful lookups.

A value of 0 indicates "never cache". A value of -1 indicates "cache forever".
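
To see which values a given JVM is actually running with, the two properties can be read through java.security.Security. The following is only a minimal sketch: the class name DnsCachePolicy and the describe helper are made up for illustration, and treating every negative value as "cache forever" is an assumption of the sketch (the Javadoc only documents -1).

```java
import java.security.Security;

// Hypothetical helper class, not part of ColdFusion or the JDK.
public class DnsCachePolicy {

    // Turn a raw TTL property value into a human-readable description.
    public static String describe(String rawValue) {
        if (rawValue == null || rawValue.trim().isEmpty()) {
            // Property is absent or commented out in java.security.
            return "not set (JVM default applies)";
        }
        int seconds;
        try {
            seconds = Integer.parseInt(rawValue.trim());
        } catch (NumberFormatException e) {
            return "not a valid integer: " + rawValue;
        }
        if (seconds < 0) {
            // Assumption: any negative value is treated like -1.
            return "cache forever";
        }
        if (seconds == 0) {
            return "never cache";
        }
        return "cache for " + seconds + " seconds";
    }

    public static void main(String[] args) {
        String[] names = {
            "networkaddress.cache.ttl",
            "networkaddress.cache.negative.ttl"
        };
        for (String name : names) {
            // Security.getProperty returns null when the property is not set.
            System.out.println(name + ": " + describe(Security.getProperty(name)));
        }
    }
}
```

Running main with the same JRE that ColdFusion uses prints one line per property, which is a quick way to check what the java.security file of that JRE currently specifies.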


Now where do you specify this setting? You can specify it in the <Java_home>/jre/lib/security/java.security file. For a standalone ColdFusion server, that is <ColdFusion_dir>/runtime/jre/lib/security/java.security.

As you can see, networkaddress.cache.ttl caches the result forever by default, so it is configured for best performance. Any change to this means a drop in performance. If you don't want to cache the resolved IP address forever, as is the case here, you need to change networkaddress.cache.ttl to 60 seconds, 300 seconds or any value you find suitable. You would not want to set it to 0, as that means "never cache" and might affect performance significantly.
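
As a rough illustration, the relevant lines in java.security could look like the fragment below once edited. The value 60 is just one of the example values mentioned above, not a recommendation, and the negative TTL is left at the default of 10 described earlier. In the file as shipped, the networkaddress.cache.ttl line may be commented out with a leading #, in which case the # has to be removed for the setting to apply.

```properties
# Cache successful lookups for 60 seconds instead of forever
networkaddress.cache.ttl=60

# Cache failed lookups for 10 seconds (the documented default)
networkaddress.cache.negative.ttl=10
```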

When would you want to change the value of networkaddress.cache.negative.ttl? Mostly when you want to cache negative results for a longer time and thereby improve performance. For example, if you frequently try to connect to a host name that cannot be resolved to any IP address, each of those calls (unless they fall within the same 10-second window) becomes very slow. Increasing this value improves performance, but again you would not want to cache negative results forever.

After you change this setting, you will have to restart the ColdFusion server for this change to take effect.

2 comments:

Anonymous said...

I just got bit by this yesterday.

Here's a thought for the Java and coldfusion developers...why not actually respect the TTL value that was put into the DNS for a reason?

I don't see any way to set things to THAT value -- it's either "cache forever" or "don't cache at all" -- there's no way to make it "consistent" with what should be in the core operating system.

Anonymous said...

THANK YOU. I just took over a coldfusion application that had been experiencing a bug for over 2 months due to this issue. It took me a week to track it down, but once I did, your post helped IMMENSELY.

Thanks a lot!