Scalable network object caching6154811Abstract A scalable distributed caching system on a network receives a request for a data object from a user. The caching system carries out a locator function that locates a directory cache for the object. The directory cache stores a directory list that identifies the locations of object caches that purport to store copies of the object requested by the user. The object caches on the object directory list are polled, and in response send messages to the cache that received the user request indicating if each object cache stores a copy of the requested object. The receiving cache sends a message requesting a copy of the object to the object cache that sent the message first received by the receiving cache indicating that an object cache stores the requested object. The object cache that sent the first received message then sends a copy of the object to the receiving cache, which stores a copy and then sends a copy to the user. The directory list for the object is then updated by adding the network address of the receiving cache. Outdated copies of objects stored on object caches are deleted in a distributed fashion to maintain the coherence of the cached copies. This is further reinforced by the association of time-to-live parameters with the each copy and each object cache address on directory lists. Claims What is claimed is: Description FIELD OF THE INVENTION
______________________________________
Directory.sub.-- list
directory.sub.-- object.sub.-- address /*network address of the
list object*/
{
/*the addresses of five object caches at which copies of
the list object are stored*/
object.sub.-- cache.sub.-- address.sub.-- 1
object.sub.-- cache.sub.-- address.sub.-- 2
object.sub.-- cache.sub.-- address.sub.-- 3
object.sub.-- cache.sub.-- address.sub.-- 4
object.sub.-- cache.sub.-- address.sub.-- 5
}
}
______________________________________
Each cache that receives an object request from a client has a hash locator function. The input of the hash locator function is the network address of an object. The output of the locator function is the address (or pointer thereto) of the directory cache on which the directory list for the requested object is stored. In this way, the receiving cache can locate the directory cache for any cached object. If the requested object has no directory list (i.e., the locator function does not point to any directory cache), then a request is sent to the original source server for the requested object to be sent to the receiving cache. The receiving cache stores a copy of the requested object and then sends a copy to the client. The receiving cache then starts a directory list for the object that contains the receiving cache address as an object cache. In one embodiment, the receiving cache also stores the directory list for the object. From this example, it is shown that a cache may be a receiving cache (receive a client request), a directory cache (store a directory list for an object), and an object cache (store a copy of an object) at the same time. FIG. 5 shows a receiving cache, cache A 501 receiving a request for a data object (denoted "GET.sub.-- request(Y)" in FIG. 5) that has a network address Y from a client 502. Receiving cache A 501 implements hash locator function L using address Y as an input, and obtaining directory cache address C as an output. This indicates that cache C 503 stores the directory list for object Y. An embodiment of the data structure of the directory list for object Y is as follows:
______________________________________
Directory.sub.-- list
Y /*network address of object Y*/
{
/*the addresses of object caches B, D, E and F, each of
which stores a copy of directory list object Y*/
B
D
E
F
}
}
______________________________________
This shows that copies of data object Y are stored on object caches B 504, D 505, E 506 and F 507. Receiving cache A 601 (FIG. 6) sends a message (denoted "lookup" in FIG. 6) requesting the data object to directory cache C 602. In the embodiment shown in FIG. 6, directory cache C 602 sends a message to receiving cache A 601 (denoted "List[B,D,E,F]" in FIG. 6) that includes the object cache addresses from the directory list. Cache A 601 uses this list to poll the object caches and select one from which to receive object Y. In another embodiment of the present invention, directory cache C 602 does not send a message including directory list addresses to the receiving cache A 601. Rather, directory cache C 602 itself polls the object caches (B-603, D-604, E-605 and F-606 in the embodiment shown in FIG. 6) and asks the object caches to send UDP ping hit or miss messages to receiving cache A 601. Receiving cache A 601 then decides from which object cache to request a copy of object Y, and sends a message to the selected object cache requesting the selected cache to send the copy. The present invention is more efficient than known directory systems that use a non-distributed directory to locate cached copies of a given object. In one known system, a single non-distributed directory residing on one server is consulted for every request for a cached object. The single directory returns at least one address of a cache on which the requested object is stored. This known system is disadvantageous because the single server can act as a bottleneck for all requests for cached objects, thus defeating the objective of caching, i.e., enhanced system performance in quickly and reliably returning a requested object. The network can be excessively burdened by request and response message traffic between clients and the single directory server, as well as by maintenance traffic designed to keep the directory accurate. The load on the single directory server can be heavy, reducing its speed of performance, and making it unavailable for other (non-directory) tasks. Also, a non-distributed directory can be very large, taking up substantial memory resources on the server, and be slow and tedious to search, taking a large amount of processor time. A single directory is also a single point-of-failure for the caching system. If the single directory server fails, the entire caching system becomes inoperative, and all of the memory resources devoted to storing cached copies in object caches is wasted. An obvious partial solution in known systems is to place several copies of the non-distributed directory in several different locations on the network (i.e., on several different servers with different network addresses.) This can help reduce the bottleneck and single-point-of-failure effects of having a single directory server because a client can consult one of many non-distributed directories to obtain the address of an object cache. This helps distribute the load of directory lookup activities. Further, if one directory server fails, another can be consulted. However, this solution can exacerbate the caching system traffic burden on the network, because the directories need to be kept accurate. Hence, a change in the state of a cached copy (e.g., a cached copy becomes out-of-date, or a new object is cached, etc.) requires that update messages be sent to all directory servers, which generate further traffic in the form of acknowledgment messages. This can disadvantageously generate a substantial amount of message traffic, burdening the network. Further, as is the case with the single non-distributed directory, non-distributed directories in known systems having more than one directory must be locked during update, rendering them unavailable. In a large system with rapidly changing object states, updating and locking can occur so frequently as to substantially reduce the availability of a directory for lookup. Also, the problems of absorbing memory and processor resources can be exacerbated by having several copies of a large non-distributed directory. The present invention solves these problems, advantageously providing a distributed directory for the caching system that can be used and maintained more efficiently and economically than known systems. By dividing the directory into directory lists in accordance with the present invention and distributing them across numerous servers, the directory lookup load is advantageously distributed across a larger number of servers than known systems. The problems of bottlenecking and single-points-of-failures is substantially reduced or eliminated. The memory resources required to store the directory are substantially smaller on each directory server than on directory server in non-distributed directory caching systems, freeing the server to store other data besides directory data. Further, the advantageously distributed directly lookup load absorbs far less processor time on any one server, enabling a directory server to quickly and efficiently respond to lookup requests, and carry out other tasks. A further advantage of the present invention is the low level of directory maintenance traffic it requires, thus advantageously reducing the burden on the network compared to known systems. The present invention does not use acknowledgment messages and advantageously never locks directory data for updates, and does not use acknowledgment messages. The search system determines where copies of a requested object are stored, if any. As described above, in accordance with the present invention, a cache receives a request (called a "get request") for an object from a client. FIG. 5 shows an embodiment of a cache system where the client 502 sends a get request to Cache A 501. Upon receiving the request, the receiving cache, Cache A 501, first determines if it stores the requested object itself. If it has the requested object, the requested object is sent to the client 502. If it does not store the requested object, it must locate the object elsewhere. If Cache A 502 does not store the requested object, it executes a directory list locator function L 508. As described above, directory list locator function L 508 operates on the network address of the requested data objects an input and yields address (or pointer thereto) of the cache which stores the directory list for that object as the output. Hence, if cache A 501 receives a request from a client 502 for data object Y, and cache A 501 determines that it does not store object Y, then cache A carries out L 508 to determine the location of the directory list for Y. As shown in FIG. 5, L(Y)=C, meaning that cache C 503 stores the directory list for object Y. In certain embodiments of the present invention, the locator function does not indicate a network address. This indicates that there is no copy of the object stored on a cache within logical reach of the receiving cache carrying out the locator function, or that no copy of the object is stored on any cache. FIG. 13 shows an embodiment of the present invention that illustrates the case where no copy of the object 1300 is stored on a cache 1301, 1303 or 1305 within the logical reach of the caches (1301, 1302, 1303, 1304, 1305, 1306 and 1307) receiving cache 1302 cache is one in which caches are grouped on logical levels (e.g., a first logical level a second logical level as shown in FIG. 13) in a hierarchy of logical levels. A cache (e.g. 1301) on a first logical level of caches 1301, 1302 and 1303) can search for and obtain copies of objects stored on other caches 1302 and 1303 on the same level. To obtain a copy of an object 1308 stored on a cache 1304 on a second level, an object request message is sent to a receiving cache 1305 on the second level, which initiates a search and attempts to obtain a copy of the object from a cache 1304, 1306 or 1307 on the second level. Upon obtaining the copy, a copy is sent to the receiving cache 1302 on the first level. A receiving cache 1302 on the first level in this embodiment chooses to send an object request to a receiving cache 1305 on a second level when the locator function returns a pointer indicating that no cache 1301, 1302 or 1303 logically available to the receiving cache 1302 on the first level stores a directory list for the object. Alternatively, the receiving cache chooses in another embodiment of the present invention to request a copy of the object from the server storing the original version of the object when the pointer returned by the locator function indicates that no cache logically available to the receiving cache stores a directory list for the object. Thus, the hash function L 508 comprises a function that uses an object network address as an input and provides a pointer as an output where the information indicated by the pointer determines the subsequent behavior of the present invention. If the pointer indicates a network address of a directory cache storing a directory list for the object, an object request is sent from the receiving cache to the directory cache. On the other hand, if a pointer indicates no network address, or indicates no network address of a logically available cache (e.g., the address of a cache on the same hierarchical level), then an object request message is sent to a receiving cache on another hierarchical level, or else to the original source server for the object. In one embodiment, a table correlating locator function pointers with information such as directory cache network addresses is entered in a receiving cache by hand by a system administrator. In another embodiment, a message is sent to receiving caches that updates the pointer table. For example, if a new directory list is started for a newly cached object X at a directory cache Z, then a message is sent to a receiving cache with the network address of the object X, the network address of directory cache Z, and a flag indicating that the pointer table is to be updated. The receiving cache receives the message, carries out the hash function on the network address of the object, and correlates the resulting pointer value to the network address of cache Z. Thus, when a request for object X is received by the receiving cache, it carries out the hash locator function on X, obtains a pointer, and correlates the pointer to the network address of cache Z. Then the receiving cache sends a lookup request to cache Z in accordance with the present invention. In the example shown in FIG. 5, cache A 501 has received a get request for an object Y. Cache A 501 has determined that object Y is not stored on itself, and has carried out locator function L 508 on the network address of Y. In this example, L(Y)=C, meaning that cache C 503 stores the directory list for object Y. FIG. 6 shows the system of FIG. 5 after the directory list locator function L 508 has identified cache C as the directory cache for object Y. Cache A 601 then sends a "lookup request" to cache C 602. In one embodiment, the lookup request comprises the address of the requested object Y and the address of the requesting cache, cache A 601. Cache C 602 returns the directory list cache addresses for the requested object Y to cache A 601, which then implements a decision system to select a cache from which to request a copy of the object. In another embodiment, cache C 602 does not send the directory list to cache A 601. Rather, it sends out UDP ping requests to object caches on the directory list for the requested object, along with the network address of cache A 601. The object caches respond with UDP ping hit or UDP ping miss messages to cache A 601. A key advantage of the present invention is its scalability. The efficient scalability of the search function is achieved because the directory is distributed in the form of directory lists across many servers. The correct directory list for a given object is found by the hash locator function implemented in a receiving cache. Thus, a great number of different objects cached on a large number of different servers on the network can be readily and efficiently found by consulting a hash locator function and a short directory list, rather than carrying out a full search of a single non-distributed directory burdened by numerous requests and residing on one server. Once the addresses of caches which have copies of the requested object are determined, a decision system is implemented that determines from which cache the copy of the requested object will be sent. In one embodiment of the present invention, if no cached copy of the requested object exists, then the receiving cache sends a request for the object to the object's original source server. In another embodiment, the receiving cache sends a request for the object to another cache at another hierarchical level of the caching system. As described before, in one embodiment of the present invention, object cache addresses are sent to the receiving cache from the directory cache in response to a lookup request from the receiving cache, in which case the receiving cache polls the object caches. In another embodiment, object cache addresses are not sent to the receiving cache, in which case the directory cache polls the object caches. Either way, UDP ping hit and/or miss messages are sent from the object caches to the receiving cache, which then decides which object cache to ask to send a cached copy of the requested object. An embodiment where the object cache addresses are not sent to the receiving cache is shown in FIG. 7. Cache A 701 has received a request for object Y from the client. Cache A 701 has carried out a directory lookup function L(Y)=C, (not shown) indicating that the directory list for object Y is stored on cache C 702. Receiving cache A 701 sends a lookup message to directory cache C 702 ("lookup(1)"). The directory list for Y in this example stored on cache C 702 indicates that copies of Y are stored on caches E 703 and F 704. Directory cache C 702 sends a UDP ping request to object caches E 703 and F 704. As shown in FIG. 8, Cache A 801 first receives a UDP ping hit response ("UDP.sub.-- ping.sub.-- hit(1)") from cache E 802, and then a UDP ping hit response "(UDP.sub.-- ping.sub.-- hit(2)") from cache F 803. As shown in FIG. 9, cache A 901 then sends a request for a copy of object Y to cache E 902 because the first UDP ping hit received by cache A 901 was from cache E 902. This decision criteria advantageously assures that the object is requested from the cache that is "nearest" to the receiving cache A 901 in a network sense, based upon the lower delay in response time between cache A 901 polling the object caches and receiving cache E's 902 UDP ping hit message, compared to receiving cache F's 903 UDP ping hit message. In response to the request from cache A 901, cache E 902 sends a copy of object Y to cache A 901, which stores a copy and forwards a copy to the requesting client. The directory list for object Y is now inaccurate because it does not reflect that cache A 901 has a copy of object Y. In one embodiment of the present invention, cache A 901 sends a message to cache C to update the directory list. Cache C 903 adds the network address of cache A 901 to the directory list for object Y. Cache A 901 thus becomes an object cache for Y. In another embodiment of the present invention, cache C 702 (FIG. 7) does not send a UDP ping request to all caches whose addresses appear on the directory list for the requested object. Rather, cache C (702) selects a fixed number of addresses, alpha, from the list. Suppose that the directory list for object Y in the example shown in FIG. 7 is B 705, D 706, E 703 and F 704. Cache C 702 chooses alpha to be two, and cache C 702 selects caches E 703 and F 704 from the directory list to which to send UDP ping requests along with the address of cache 801 A. Cache E's 802 (FIG. 8).UDP ping hit message is received first by cache 801 A, so cache 801 A sends a request to cache E ("request(1)", FIG. 9) for the object. A copy of the requested object is sent ("copy(2)") 902 from cache 902 E to cache A 901 (FIG. 9). A copy of the object is stored on cache A 901 and a copy is sent ("copy(3)") to the requesting client 904. Cache A 901 sends a message to cache C 903 to update its directory list for Y by adding the address for cache A 901. Whenever alpha is less than the number of addresses on the directory list, the cache must decide which of the caches from the directory list it will send UDP ping requests. In one embodiment, the cache uses a round robin method to advantageously distribute the load among the caches on the list. In another embodiment, the cache randomly selects alpha caches to which to send a request. In one embodiment of the present invention, the directory cache chooses a single cache on the directory list to which to send a UDP ping request, i.e., alpha is one. For example, suppose cache C 1001 (FIG. 10) chooses cache E 1002. Cache C 1001 sends UDP ping request message along with the address of cache A 1003 to cache E 1002, and cache E 1002 sends a response (either UDP ping hit or UDP ping miss) directly to cache A 1003. If the response is UDP ping hit, then cache A 1003 sends a request for the object to cache 1002 E. If the response is UDP ping miss, then the directory list is inaccurate, and cache 1101 (FIG. 11) A sends a directory delete request to cache C 1102, which deletes cache E's 1103 address from the directory list for the requested object. On the other hand, if cache E fails to respond to a request for the object from cache A, then the cache E or the connection to cache E may be inoperative, and another cache on the directory list must be selected and sent a UDP ping request. When alpha is one, the number of messages needed to retrieve a copy is advantageously reduced to four, not including the message that carries a copy of the requested object from cache C 1001 to cache A 1003. This is far more efficient then the known decision function shown in FIGS. 2, 3 and 4, which requires 2N+1 messages to achieve the same goal. Implementing the decision function in accordance with the present invention for alpha=1 provides good caching performance in a reliable network when the probability of failure of either a connection or a machine is very low, and when cache C 1001 implements an efficient method for distributing the load among all caches. The strategy of updating a directory list as the decision system is implemented represents an "optimistic approach," so called because it operates efficiently for highly reliable networks (i.e., networks with highly reliable machines and connections.) Certain networks are less than thoroughly reliable. An example of such a network is the Internet, where it is difficult or impossible to predict whether a given machine or connection is operating or is unavailable at any given time. Further, an efficient distribution of the load among caches is difficult to obtain because different directory and receiving caches make independent decisions about where to send requests for data objects. In other words, a cache attempting to evenly distribute load can be unaware of how loads are being placed independently on other caches, causing unexpected concentrations data object requests on one or more caches. Choosing alpha=1 can be inefficient because the chosen cache may not be able to send the requested object is because the cache or the connection to the cache is temporarily inoperative. Hence, a "pessimistic approach" strategy is advantageously implemented where the directory list is not updated while the decision function is implemented, except that the appropriate address is added to a directory list whenever a cache receives a new object, or deleted from a directory list if a cache ejects (deletes) an object. When alpha is one and the response to the object request is negative (e.g., a UDP ping miss or no response is received), another cache is chosen, and a whole new set of UDP ping messages exchanged to cause the other cache to send the requested object, if possible. Another risk of choosing alpha=1 is that the directory list is inaccurate and the single chosen cache does not in fact have a copy of the requested object. Choosing alpha>1 alleviates these problems in unreliable networks by introducing redundancy into the process of obtaining a copy of the object, seeking the requested object in more than one possible location. This redundancy also mitigates the effect of having some inaccurate directory entries that are not corrected under the pessimistic approach. In one embodiment of the present invention, alpha is chosen to be a constant, K. In this case, if the number of caches on the directory list is N and N<K, then a UDP ping request is sent to N caches that store a copy of the requested object and K-N caches that do not store a copy, and the efficiency of the present invention approaches that of the known caching systems. If the number of caches on the directory list is N and N>K, then the cache chooses K caches from the directory list and sends each a UDP ping request. The number of messages generated in this case is less than or equal to 2K+3, not including the message that carries the copy of the requested object. This can be efficient in a reliable network for the same reasons discussed above for alpha=1. But in an unreliable network, it fails to take full advantage of the redundancy gained by sending a UDP ping request to all of the caches on the directory list that purport to have copies of the requested object. In accordance with the present invention, the value of alpha is advantageously selected to maximize caching system performance according to the reliability of the network on which the present invention is implemented. In highly reliable networks, alpha is selected to be closer to one. In less reliable networks, alpha is selected to be greater than one, its value being selected to be greater the less reliable the network. However, alpha in unreliable networks should be at most equal to the number of addresses in a directory list. In one embodiment of the present invention implemented on an unreliable network, the size of each directory list is fixed at K addresses, and alpha is set to be equal to K. In another embodiment of the present invention, alpha is set to be the average number of addresses on all directory lists. In yet another embodiment, alpha is set to be the median number of addresses on all directory lists. Under the optimistic approach, a cache sends a directory add request to the cache that stores the directory list for the object only when the cache receives an original copy, i.e., when the cache sends request for the object to the object's original source. Upon receiving the directory add request, the directory cache will add the sending cache's address to the directory list for the given object. Also under the optimistic approach, the cache requesting a copy of a given object will send a directory delete request to the directory cache whenever the requesting cache fails to receive a copy of the requested object from an object cache, deleting the object cache address from the directory list for that object. Also, a directory delete request is sent by an object cache to the appropriate directory cache whenever the object cache ejects the object, whereupon the object cache address is deleted from the directory list. Under the pessimistic approach, a cache sends a directory add request to the directory cache each time the cache receives a new copy of a given object, from whatever source. No directory delete request is sent when the requesting cache fails to obtain a requested copy. Rather, a directory delete request is only sent to the appropriate directory cache whenever the cache storing the object ejects the object. This directory delete request is sent by the ejecting object cache. Upon receiving the delete request, the ejecting object cache address is deleted from the directory list. When a data object is outdated, it must be ejected from a cache where it is stored, as shown in FIG. 12. This is accomplished in accordance with one aspect of the present invention by associating a time-to-live (TTL) parameter with each stored object. The TTL parameter specifies the maximum lifetime of a cached object. In one embodiment, the TTL parameter is a date-time stamp offset a fixed amount of time from the time a copy of the object is received and stored at the cache. When the TTL expires, the object is ejected. The TTL parameter is implemented in known systems, where it can disadvantageously allow outdated copies to persist in caches. This problem is addressed in known systems by broadcasting a ejection message to all caches for a given object when the object becomes outdated (e.g., the original is changed). However, the broadcast of ejection messages generates substantial traffic that can disadvantageously burden a network. The present invention solves this problem by efficiently ejecting outdated cached copies in a distributed fashion, thus avoiding costly broadcast messages. An object may be ejected before the expiration of its TTL in accordance with the present invention by sending an ejection request from the original source of an object to the directory cache for the object whenever the original object is changed. The directory cache then sends ejection messages to all of the object caches on the directory list for the object. The object caches then delete the object, and the directory list itself is deleted from the directory cache. This is shown in FIG. 12. Cache A 1201 receives a request from original source server 1202 to eject all copies of object Y. Cache A 1201 carries out directory locator function L(Y)=C, 1203 showing that the directory list for object Y is stored on directory cache C 1204. Cache A 1201 then sends a directory eject request to cache C 1203. The directory list for Y on cache C 1203 indicates that cached copies of Y are stored on caches E 1205 and F 1206. Upon receiving the directory eject request, cache C 1003 sends an eject request to caches E 1205 and F 1206, which eject their copies of Y. Cache C 1203 then deletes its directory list for Y. In another embodiment, cache A sends a directory eject request to the directory server whenever cache A receives a copy of an object from the object's original source server. This embodiment advantageously requires no ejection message from the original source server. The present invention advantageously eliminates the need to broadcast an eject message to all directory caches as is carried out in certain known caching systems. This substantially reduces the amount of network traffic generated to eject outdated copies of objects, thus more efficiently maintaining the accuracy of the caching system. The distributed ejection scheme implemented in accordance with the present invention advantageously eliminates the need to broadcast an ejection message to all caches as is known in the art. The distributed ejection scheme is more efficient and economical than the broadcast method, as it generates only the network traffic needed to eject the outdated objects. Further, distributed ejection operates as efficiently for large systems as for small systems, and is thus more scalable than the broadcast system, which generates increasingly large and burdensome amounts of traffic as the size of the network increases. The present invention advantageously provides a more efficient and scalable caching system than known caching systems. It provides a distributed directory in the form of directory lists, with each directory list associated with a data object. A directory list is a list of object caches on which a copy of an object is stored. The directory lists for objects are distributed across numerous servers in the network. The location of a directory list for a given object is easily and economically determined from a hash locator function that correlates the network address of the object with a pointer to the network address of the directory cache on which the directory list is stored. A number of object caches from the directory list are polled, and one is selected from which to receive the cached copy. The number of object caches can be advantageously selected in accordance with the present invention to maximize caching system performance in various network environments of different reliabilities. Directory lists are updated in a distributed fashion in accordance with the condition of the network. Outdated objects and directories are ejected from the caching system in an efficient distributed fashion. The present invention can be implemented with greater reliability and less cost than known caching systems. Further, the present invention solves many of the problems associated with known caching systems, as the present invention is less susceptible to bottlenecking, single-points-of-failure, and burdens the network with substantially less caching system network traffic than known systems. Pseudo code embodiments of various aspects of the present invention follow. A client seeking to obtain a copy of a given object sends a get request message to a cache. As shown in FIG. 5, the get request for object Y is sent from the client 502 to cache 501 A. A pseudo code embodiment of a get request is as follows:
______________________________________
GET.sub.-- request(input: url.sub.-- of.sub.-- object) /*url.sub.--
of.sub.-- object is the network
address of the requested object*
. . .
dest.sub.-- IPaddr := HASH.sub.-- TABLE.sub.-- NEIGHBORS
[hash.sub.-- l(url.sub.-- of.sub.-- object)];
/*the above line is the directory locator function, implemented in
this embodiment as a hash function on the network address of
the object*/
if dest.sub.-- IPaddr is equal to my.sub.-- IPaddr
/*does the locator function
point to the directory cache
itself?*/
then lookup (url.sub.-- of.sub.-- object, my.sub.-- IPaddr);
/*if the locator function points
to the directory cache, a
lookup routine is run that
will send a copy to the
requesting client*/
else send (dest.sub.-- IPaddr, lookup (url.sub.-- of.sub.-- object,
my.sub.-- IPaddr));
/*the above line sends a lookup request to the cache other than the
receiving cache on which the directory list is located (the "directory
cache") as indicated by the lookup function*/
loop until (response is equal to UDP.sub.-- ping.sub.-- hit
or response is equal to NO.sub.-- neighbors
or waiting.sub.-- time expired
or number.sub.-- of.sub.-- responses is equal to alpha)
wait (response, neighbor.sub.-- IPaddr);
/*this loop continues to run until either a first UDP ping hit message
is
received, or a message to the effect that no copy of the requested
object is stored on caches with which the receiving cache may directly
communicate, or a given waiting time has expired with no response,
or the number of responses is equal to a predetermined number alpha*/
if response is UDP.sub.-- ping.sub.-- hit
then send(neighbor.sub.-- IPaddr, request(url.sub.-- of.sub.-- object,
my.sub.-- IPaddr));
/*if a UDP ping hit is received, then a message is sent to the sender of
the
UDP ping hit to send the requested object to the receiving cache at
network address my.sub.-- IPaddr*/
else send(home.sub.-- url.sub.-- IPaddr, Get.sub.-- request(url.sub.--
of.sub.-- object,
my.sub.-- IPaddr));
/*if no UDP ping request is received, then the original source of the
requested object at home.sub.-- url.sub.-- IPaddr is asked to send a copy
of
the requested object to the receiving cache*/
. . .
}
______________________________________
The above embodiment of a get request implements a directory lookup request. A pseudo code for a lookup request is as follows:
______________________________________
dir.sub.-- lookup (input: url.sub.-- of.sub.-- object, client.sub.--
IPaddr;
output: list.sub.-- of.sub.-- neighbors)
/*client refers to the sender of the lookup request*/
find.sub.-- URL(url.sub.-- of.sub.-- object);
/*find the directory list associated
with the network address of requested
object*/
if url.sub.-- of.sub.-- object is found
/*if the directory list for the requested
then return (list.sub.-- of.sub.-- neighbors);
object is found, the return the
list of addresses on the
directory list*/
else return (null.sub.-- list);
if optimistic approach
find.sub.-- CLIENT(client.sub.-- IPaddr);
/*under the optimistic approach, the directory list is updated by
adding the address of the client cache if the client cache address
is not already on the list*/
if client.sub.-- IPaddr is not found
add.sub.-- neighbor.sub.-- to.sub.-- list (url.sub.-- of.sub.-- object,
client.sub.-- IPaddr);
}
______________________________________
A pseudo code embodiment of a directory lookup routine that selects alpha addresses from a directory list is as follows:
______________________________________
lookup (input: url.sub.-- of.sub.-- object, requester.sub.-- IPaddr)
dir.sub.-- lookup (url.sub.-- of.sub.-- object, requester.sub.-- IPaddr,
list.sub.-- of neighbors);
if list.sub.-- of.sub.-- neighbors is null.sub.-- list
then send (requester.sub.-- IPaddr, (NO.sub.-- neighbors, my.sub.--
IPaddr));
else
make alpha.sub.-- list by choosing alpha neighbors from list.sub.--
of.sub.-- neighbors;
send (alpha.sub.-- list, UDP.sub.-- ping.sub.-- req (requester.sub.--
IPaddr));
}
______________________________________
A pseudo code embodiment of a UDP ping request/hit/miss routine is as follows:
______________________________________
UDP.sub.-- ping.sub.-- req (input: url.sub.-- of.sub.-- object,
client.sub.-- IPaddr)
/*client is the sender of the UDP ping request*/
.linevert split.
find object url.sub.-- of.sub.-- object in local cache;
if object is found
then send (client.sub.-- IPaddr, UDP.sub.-- ping.sub.-- hit (my.sub.--
IPaddr));
else send (client.sub.-- IPaddr, UDP.sub.-- ping.sub.-- miss (my.sub.--
IPaddr));
______________________________________
An embodiment of pseudo code for the directory add routine is as follows:
______________________________________
dir.sub.-- add (input: url.sub.-- of.sub.-- object, client.sub.--
IPaddr)
/*client refers to a cache to which a copy of an object has
been added*/
find.sub.-- URL (url.sub.-- of.sub.-- object);
if url.sub.-- of.sub.-- object is not found
add.sub.-- new.sub.-- list (url.sub.-- object);
/*start a new directory list*/
find.sub.-- CLIENT (client.sub.-- IPaddr);
/*look for the client network address
on the directory list*/
if client.sub.-- IPaddr is not found
/*if not found, add it to the directory list*/
add.sub.-- neighbor.sub.-- to.sub.-- list (url.sub.-- of.sub.-- object,
client.sub.-- IPaddr);
}
______________________________________
A pseudo code embodiment of a directory delete routine is as follows:
______________________________________
dir.sub.-- del (input: url.sub.-- of.sub.-- object, client.sub.--
IPaddr)
/*client refers to a cache from which a copy of an object has
been deleted*/
find.sub.-- URL (url.sub.-- of.sub.-- object);
/*look for address of object on directory
list*/
if url.sub.-- of.sub.-- object is found
/*if address is found, delete the address
from the directory list*/
find.sub.-- CLIENT (client.sub.-- IPaddr);
if client.sub.-- IPaddr is found
delete.sub.-- neighbor.sub.-- from.sub.-- list (url.sub.-- of.sub.--
object client.sub.-- IPaddr);)
______________________________________
An pseudo code embodiment of an ejection routine is as follows:
______________________________________
eject.sub.-- local.sub.-- copy (input: url.sub.-- of.sub.-- object)
. . .
dest.sub.-- IPaddr := HASH.sub.-- TABLE.sub.-- NEIGHBORS
[hash.sub.-- l(url.sub.-- of.sub.-- object)];
/*the above line carries out the directory locator function as a hash
function to locate the cache on which the directory list for the object
is
stored*/
if dest.sub.-- IPaddr is equal to my.sub.-- IPaddr /*if directory cache
is local cache,
delete directory list on local cache*/
then dir.sub.-- del;(url.sub.-- of.sub.-- object, my.sub.-- IPaddr);
else send (dest.sub.-- IPaddr, dir.sub.-- del(url.sub.-- of.sub.--
object, my.sub.-- IPaddr));
/*otherwise send a directory delete request to the directory cache*/
}
______________________________________
A pseudo code embodiment of a refresh routine that obtains a copy of the object from the original source is as follows:
______________________________________
refresh (input: url.sub.-- of.sub.-- object)
. . .
dest.sub.-- IPaddr := HASH.sub.-- TABLE.sub.-- NEIGHBORS
[hash.sub.-- l(url.sub.-- of.sub.-- object)];
/*directory locator function*/
if optimistic approach /*under optimistic approach, update directory
list
when refresh routine is executed*/
if dest.sub.-- IPaddr is equal to my.sub.-- IPaddr
then dir.sub.-- add(url.sub.-- of.sub.-- object, my.sub.-- IPaddr);
else send (dest.sub.-- IPaddr, dir.sub.-- add(url.sub.-- of.sub.--
object, my.sub.-- IPaddr));
}
______________________________________
A pseudo code embodiment of a routine that tests whether a response is received from a destination and implements the appropriate maintenance strategy is as follows:
______________________________________
wait.sub.-- response (input: status)
. . .
dest.sub.-- IPaddr := HASH.sub.-- TABLE.sub.-- NEIGHBORS
[hash.sub.-- l(url.sub.-- of.sub.-- object)];
/*above line is directory locator function*/
if status is OK and pessimistic approach
/*under the pessimistic
approach, and a copy is
received, then the
directory list is
updated by adding
the address of the
receiving cache*/
if dest.sub.-- IPaddr is equal to my.sub.-- IPaddr
then dir.sub.-- add(url.sub.-- of.sub.-- object, my.sub.-- IPaddr);
else send (dest.sub.-- IPaddr, dir.sub.-- add(url.sub.-- of.sub.--
object, my.sub.-- IPaddr));
if status is not OK and optimistic approach
/*under optimistic approach,
if a copy is not received,
then the directory list is
updated by deleting
the address of the
receiving cache*/
if dest.sub.-- IPaddr is equal to my.sub.-- IPaddr
then dir.sub.-- del(url.sub.-- of.sub.-- object, my.sub.-- IPaddr);
else send (dest.sub.-- IPaddr, dir.sub.-- del(url.sub.-- of.sub.--
object, my.sub.-- IPaddr));
}
______________________________________
A pseudo code embodiment of the distributed ejection scheme in accordance with the present invention is as follows:
______________________________________
dir.sub.-- eject (input: url.sub.-- of.sub.-- object)
dest .sub.-- IPaddr := HASH TABLE.sub.-- NEIGHBORS
[hash.sub.-- l(url.sub.-- of.sub.-- object)];
/*above line is directory lookup function*/
if dest.sub.-- Ipaddr is not equal to my.sub.-- IPaddr /*check local
cache first*/
then send (dest.sub.-- IPaddr, dir.sub.-- eject (url.sub.-- of.sub.--
object));
else
find.sub.-- URL(url.sub.-- object); /*find the directory list*/
if url.sub.-- of.sub.-- object is found
then
send (list.sub.-- of.sub.-- neighbors, eject(url.sub.-- of.sub.--
object));
clear the list.sub.-- of.sub.-- neighbors; /*clear the directory list*/
eject (url.sub.-- of.sub.-- object); /* eject local copy */
}
______________________________________
A pseudo code embodiment of the eject routine is as follows:
______________________________________
eject (input: url.sub.-- of.sub.-- object)
find object url.sub.-- of.sub.-- object in local cache;
if object is found
then remove object url.sub.-- of.sub.-- object from the local cache;
}
______________________________________
The present invention advantageously provides a more efficient system and method for caching objects on a network than know caching systems. The use of distributed methods in accordance with the present invention advantageously provides an scalable way to store, find, obtain and eject cached objects, providing improved network, even as the number of data objects on the network increasing performance while imposing a lower caching traffic burden on the network. The present invention also performs better than known caching system on networks whose data objects change frequently. This is advantageously achieved without the need for locking acknowledgments, and broadcast messages that unduly burden networks in known caching systems.
|
Same subclass Same class Consider this |
||||||||||
