Using the Railo cluster cache

August 2, 2009 · By Gert Franz · 6 Comments

This is something we have wanted to do for more than a year now: implement a cache that is reachable from all the servers in a cluster. So here we go...

Introduction

The benefits are enormous. First, a little background on clustered caches; I will limit this excursion to EHCache and Memcached. The idea of a clustered cache is to offer shared memory to all servers in a cluster, so that data stored in the cache is available from every server. Let's take an easy example. Our cluster has 5 servers and each of them runs the same applications. Let's say there is only one application, that it runs 20 different queries per request, and that these queries can be cached for 5 minutes. If we assume each server receives 250 different requests in 5 minutes, we can do the math:
Servers                                                      5
Requests/5 Minutes                                         250
Queries per request                                         20
Queries per Server                  = 250 * 20           5,000
Queries per Cluster                 = 250 * 20 * 5      25,000
(This is what hits the database)

In the above case, if we use cachedwithin for each of these 20 queries and store the results in the cluster cache, all of the servers share them and only 5,000 queries instead of 25,000 reach the database. So we save 80% of the queries. This figure is only valid for the above example, of course, and the savings can be even better.
Another example is storing other data, such as components, in this cache and sharing them across the cluster. This functionality has been written and is already in use at two big clients, and we are very happy with how it works.
In addition, all cacheable data can be stored in the cluster cache and shared across the cluster, which again saves a lot of resources. So first check whether the data is in the cluster cache; if it is, get it from there, and if not, create it and store it in the cluster cache. I guess this is a common scenario.

Implementation in Railo

At the moment there are two implementations of the cache in Railo: one for Memcached and one for EHCache. Memcached lets you spread huge amounts of memory across a server farm and store huge amounts of data in it. It is often used in front of MySQL, for instance, to save database resources. Memcached itself lets you use a list of servers to store your data.
EHCache is considered to be one of the fastest implementations of distributed and replicated caches available. Because it is replicated, you can have an EHCache instance on every server in the cluster, and EHCache replicates its contents over RMI so that each local instance holds the same content. You then access it locally, which makes it very fast.
Now in Railo we came up with a new way of defining Caches. You will be able to define as many caches as you see fit and label them. Whether they are distributed or replicated is unimportant. You just define them with a label. Then you can define a default cache which is used if you don't use a label in the functions and tags for storing and getting data from the cache. Here's a screenshot from the plugin that is used to define caches in Railo 3.1:



In this case I have defined a memcached server, which by default runs on port 11211. In order to use the Railo cluster cache we have created an extension that installs three client libraries: a tag library description, a function library description and a plugin for the administrator that allows you to define the caches you need. A further advantage of using tag and function libraries in Railo is that you then have the description of the syntax in the Railo administrator:



The tag and function libraries define the following tags and functions:

<rtl:admincache
   [action="string"]
   [class="string"]
   [custom="struct"]
   [default="boolean"]
   [name="string"]
   [result="string"]>

<rtl:cache
   [action="string"]
   [default="object"]
   [filter="string"]
   [key="string"]
   [name="string"]
   [result="string"]
   [throwonerror="boolean"]
   [timespan="timespan"]
   [value="object"]>


rtlcachedelete(String key,[String name]):void
rtlcacheexists(String key,[String name]):boolean
rtlcacheflush([string filter,[String name]]):number
rtlcacheget(String key,[String name]):object
rtlcacheinfo([String name]):struct
rtlcacheinfoentry(String key,[String name]):struct
rtlcachekeys([string filter,[String name]]):array
rtlcachelist([string filter,[String name]]):struct
rtlcacheset(string key,object value,[timespan timespan,[String name]]):void

The documentation comes with the extension when you purchase it. In the above example the functions and tags have the prefix "RTL". The prefix stems from the client we built this for. But we will create an extension with a general tag name and some general function names similar to the ones for CFML 2011.

How to use the cluster cache?

When you have defined a cluster cache on two servers, you can use it with the tags and functions listed above. You can read and write keys from and to it, delete keys, get some information and flush the cache. What we normally do is this:

<!--- Easy example --->
<cfset myQuery = "">
<cftry>
   <rtl:cache action="get" key="myQuery" result="myQuery">
   <cfset myQuery = evaluate(myQuery)>
   <cfcatch>
      <!--- Might occur if the key isn't in the cache and since I want to save an EXISTS request I surround it with cftry/catch --->
   </cfcatch>
</cftry>
<cfif not isQuery(myQuery)>
   <cfquery name="myQuery" datasource="whatever">
   ...
   </cfquery>
   <rtl:cache action="set" key="myQuery" value="#serialize(myQuery)#" timespan="yourcachetimespan">
</cfif>
Of course there is more to do than the above, but it is sufficient for the explanation. In the above example I haven't used the name attribute, so I invoked the cache that is defined as the default cache. If I did not have a default cache, the above example would throw an error.
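The same pattern can also be written with the functions listed earlier, this time addressing a cache by its label instead of relying on the default. This is only a sketch under a few assumptions: "clusterCache" is a hypothetical label you would have defined in the administrator, the SQL statement is a made-up placeholder, and the 5-minute timespan is taken from the introductory example.

```cfml
<!--- Sketch only: "clusterCache" is a hypothetical cache label --->
<cfset cacheName = "clusterCache">
<cfset cacheKey = "myQuery">

<cfif rtlcacheexists(cacheKey, cacheName)>
   <!--- Values are stored serialized, so turn the string back into a query --->
   <cfset myQuery = evaluate(rtlcacheget(cacheKey, cacheName))>
<cfelse>
   <!--- Hypothetical statement; replace with your own query --->
   <cfquery name="myQuery" datasource="whatever">
   SELECT id, title FROM articles
   </cfquery>
   <!--- Per the signature above, timespan has to be passed before name --->
   <cfset rtlcacheset(cacheKey, serialize(myQuery), createTimeSpan(0, 0, 5, 0), cacheName)>
</cfif>
```

Compared with the cftry variant, this costs one extra EXISTS request per lookup, and the key could still expire between the check and the get, so for production code the try/catch approach may remain the safer choice.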

Future of the cluster cache?

At the moment the memcached implementation can only handle simple values, so we always have to serialize and deserialize data. This works even with components. But we will automate that so that you do not have to care whether the data is complex or not.
In addition Railo only handles single Memcached instances at the moment. So for distributed caches you would need to write the logic yourself. But this is something that we will add to the implementation as well, allowing you then to define lists of Memcached servers.
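Until lists of servers are supported, the do-it-yourself distribution logic mentioned above could look roughly like the following. This is a sketch and not how Railo will implement it: the three cache labels are hypothetical (each would be a single-instance Memcached cache defined in the administrator), and pickCache, shardedSet and shardedGet are illustrative helper names, not part of the extension.

```cfml
<cfscript>
// Hypothetical labels, one per single-instance Memcached cache
cacheNames = ["memcachedA", "memcachedB", "memcachedC"];

// Map a key onto one of the labels: hash() returns a hex string,
// and its first 4 hex digits give a number between 0 and 65535.
function pickCache(key, names) {
   var bucket = inputBaseN(left(hash(arguments.key), 4), 16);
   return arguments.names[(bucket mod arrayLen(arguments.names)) + 1];
}

// Store a value in the cache that the key maps to (serialized, see above)
function shardedSet(key, value, timespan) {
   rtlcacheset(arguments.key, serialize(arguments.value), arguments.timespan,
               pickCache(arguments.key, cacheNames));
}

// Read a value back from the cache that the key maps to
function shardedGet(key) {
   return evaluate(rtlcacheget(arguments.key, pickCache(arguments.key, cacheNames)));
}
</cfscript>
```

Usage would be something like shardedSet("myQuery", myQuery, createTimeSpan(0, 0, 5, 0)) followed later by shardedGet("myQuery"), wrapped in cftry/cfcatch in case the key is missing. Note that with simple modulo distribution most keys map to a different cache as soon as the list of labels changes, so the cache is effectively cold after adding or removing a server.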
One big improvement for Railo 3.2 or 3.3 will be that everything that is cached in Railo can optionally be stored in the cluster cache. You will be able to select in the Railo administrator where CFCACHE stores its data and where queries that are currently kept in the server RAM are stored. It's just a drop-down box listing all defined caches and the server memory. So if you store queries that use the CACHEDWITHIN attribute in the cluster cache, the query saving process is automated completely.
In addition to the above, we will allow users to store the application and session scopes in the cluster cache. This allows you to replicate your sessions without J2EE session management, and you will have ONE application scope that is available across your cluster. More on these features as soon as we have something to present.

Tags: CFML · Extension · Features · Railo 3.1

6 responses so far ↓

  • 1 Seth MacPherson // Aug 2, 2009 at 3:38 AM

    Gert, you and the Railo team just keep impressing me. Thank you for all you've done for our community. I'm excitedly evangelizing every CF'er I encounter about the advantages and brilliance that is Railo.

    Well done.
  • 2 Vince Bonfanti // Aug 3, 2009 at 9:09 PM

    Gert, do you know if this is something that's being addressed by the CFML Advisory Committee? Support for memcache was added to Open BlueDragon almost a year ago (http://groups.google.com/group/openbd/browse_thread/thread/648c81a328ec2663#), and it's a shame that OpenBD and Railo don't support the same syntax.
  • 3 Sean Corfield // Aug 3, 2009 at 9:38 PM

    The cache*() functions were voted on by the committee some time ago and the committee was unanimous that these were vendor-specific for this round of CFML2009.

    With the ease of adding new built-in functions in Railo, we can easily match the CF9 cache functions and any additional ones OpenBD provides.

    Once all three vendors support this the same way, it could be promoted to extended core or even core in CFML2011.
  • 4 Gert Franz // Aug 3, 2009 at 10:19 PM

    Hmmm... I can't quite follow ... The syntax is quite similar. And since it is not yet publicly available we can follow any CFML standard...
    Gert
  • 5 Topper // Feb 7, 2010 at 1:56 AM

    Gert, you legend, - I've been working with implementing Memcache for Teamwork and I just stumbled on the fact that you guys are building advanced caching right in there. I'm going to use this to get teamworkpm.net screaming along.

    Thanks for your excellent work, as always.
  • 6 Gert Franz // Feb 7, 2010 at 11:32 AM

    @Topper

    *blushing* well if, then the team is a legend, especially Micha our CTO. But if you like to use the caching mechanisms, just follow the last two posts about caching we released lately :-)

    Thanks for the praise

    Gert
