In this paper, we propose a new caching policy, referred to as static caching. Assume that we have $n$ documents in the Web server, which are requested at rates $r_1, r_2, \ldots, r_n$. These rates can be measured and estimated from the log files of the server. Then, for any given size of the cache, we put the most requested files in the cache until the cache is full. When a cached file is modified on the disk, it is also updated in the cache. The request rates are recomputed at a fixed frequency, say, every day, and the contents of the cache are changed accordingly, say, every night.
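The fill step described above can be sketched as a simple greedy procedure; the document names, sizes, and rates below are illustrative assumptions, not measurements from any real server:

```python
# Sketch of the static caching policy: cache the most requested
# documents, in decreasing order of request rate, until the cache is full.

def fill_static_cache(documents, cache_size):
    """Greedily select documents by request rate; documents are
    (name, size, rate) triples, cache_size is the total capacity."""
    cache = {}
    used = 0
    # Sort by measured request rate, most popular first.
    for name, size, rate in sorted(documents, key=lambda d: d[2], reverse=True):
        if used + size <= cache_size:
            cache[name] = size
            used += size
    return cache

# Hypothetical documents: (name, size, daily request rate).
docs = [("index.html", 10, 500), ("logo.png", 30, 400),
        ("paper.pdf", 80, 50), ("faq.html", 20, 300)]
print(fill_static_cache(docs, 60))  # the three most requested documents fit
```

Note that this sketch skips a document that does not fit and continues down the ranking; whether to stop at the first non-fitting document or keep scanning is a design choice the policy leaves open.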
Our static approach has several advantages over previous cache management policies. First, it is very simple to implement. There is no need to keep reference counters or other tables about the documents in the cache. The cache does not have to be updated at all during periods of high server load, as opposed to dynamic policies, where cache-update (or garbage-collection) algorithms are invoked frequently, resulting in heavy overhead. Indeed, as all ``missed'' documents are brought into the cache, the cache frequently becomes full. Each time it is full, the removal algorithm is activated to remove some of the documents from the cache in order to make room for the new arrivals. Several techniques have been used to remedy this problem. One is not to bring into the cache documents exceeding a certain size [8]. Another is to use high/low water marks (as in squid): when cache usage reaches the high mark, the removal policy is activated until usage drops to the low mark. The removal policy is therefore activated less often, but the hit rate is lower, since there are fewer documents in the cache on average.
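The water-mark scheme mentioned above can be sketched as follows; this is a minimal illustration of the idea, not squid's actual implementation, and the FIFO eviction order is an assumption made purely for brevity:

```python
# Minimal sketch of high/low water-mark eviction: the removal policy
# runs only when usage crosses the high mark, and then frees space
# all the way down to the low mark, so it is activated infrequently.
from collections import OrderedDict

class WatermarkCache:
    def __init__(self, high_mark, low_mark):
        self.high, self.low = high_mark, low_mark
        self.used = 0
        self.docs = OrderedDict()  # insertion order = arrival order

    def insert(self, name, size):
        self.docs[name] = size
        self.used += size
        if self.used > self.high:        # removal policy activated
            while self.used > self.low:  # evict until the low mark
                _, freed = self.docs.popitem(last=False)  # oldest first
                self.used -= freed

cache = WatermarkCache(high_mark=100, low_mark=60)
for name, size in [("a", 40), ("b", 40), ("c", 40)]:
    cache.insert(name, size)
print(sorted(cache.docs), cache.used)  # eviction overshoots down to 40
```

The sketch makes the trade-off visible: after an eviction burst the cache holds well below its capacity, which is exactly why the average hit rate drops.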
Another source of cache management overhead is the computational cost of removal policies. They usually maintain a (possibly sorted) list of documents in the cache and make removal decisions based on the counters associated with the documents. Thus, their computational time is often $O(n)$ or $O(n \log n)$, where $n$ is the number of documents in the cache. More sophisticated removal policies tend to require more computational time. In this regard, static caching is the best, as the contents of the cache are modified at a low frequency, and the estimation of the request rates can be computed off line.
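The off-line step can be sketched as follows; the log format (one requested URL per line) and the URLs themselves are assumptions for illustration:

```python
# Sketch of the off-line daily recomputation: count requests per
# document over a fixed window and rank documents by popularity.
# The O(n log n) sort runs off line, never under server load.
from collections import Counter

def estimate_rates(log_lines):
    """Request counts over a fixed window serve as rate estimates."""
    return Counter(log_lines)

# Hypothetical excerpt of a server log (one requested URL per line).
log = ["/index.html", "/logo.png", "/index.html", "/faq.html", "/index.html"]
rates = estimate_rates(log)
ranking = [doc for doc, _ in rates.most_common()]  # sorted off line
print(ranking[0])  # most requested document
```

The ranking then drives the nightly cache refill, so the on-line path never touches counters or sorted lists.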
In the next two sections we shall evaluate our static caching policy through trace-driven simulations. To conclude this section, we present a theoretical justification of the optimality of static caching.