Application of database or data structure (e.g., distributed, multimedia, image)

Automatic retrieval of changed files by a network software agent

6029175

Abstract

An intelligent network agent intercepts transactions between clients and servers to perform Distributed Information Logistics Services (DILS) functions such as automatically retrieving updated files from remote servers and delivering them to local client programs. For example, HTTP clients and HTTPD servers are connectionless and stateless, so a server has no way to update a browser automatically when an HTML document is changed. The invention provides a method to update any number of clients from any number of servers without making any changes to currently existing HTTP clients or HTTPD servers. Furthermore, the invention can provide various other DILS services for clients to reduce latency and communication costs for members of a group with interests in similar objects. For example, the intelligent network agent maintains a cache of objects of interest to the group of clients, a log of changes to the objects, a list of the clients interested in the objects, a list of significant change detection methods for the objects, a list of search specifications for the objects, lists of client notification methods, and lists of general interest specifications for the clients.


Claims

What is claimed is:

1. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software enabling interested parties to request access to said objects for display of accessed ones of said objects, a computer-implemented method of operating at least one of said computers for automatically notifying said interested parties when objects of interest are changed, said computer-implemented method comprising the steps of:

a) accepting from said interested parties specifications of the objects of interest;

b) maintaining in memory a list of the interested parties interested in each of the objects of interest;

c) detecting occurrence of changes in the objects of interest, and in response to detecting the occurrence of a change in an object of interest, determining whether an update notification would then be desirable for each interested party in the list of interested parties interested in the object of interest in which the occurrence of change is detected; and

d) upon determining that an update notification would then be desirable for one of the interested parties in response to detecting the occurrence of change in one of said objects of interest, notifying said one of the interested parties of the occurrence of change in said one of said objects of interest for display of said one of said objects of interest.
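
As an illustration only, steps a) through d) of claim 1 can be sketched as a small in-memory registry. The class name `InterestRegistry`, the `is_desirable` callback, and the example URL are hypothetical and do not come from the claims; the sketch records notifications in a list instead of sending them over a network.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple


class InterestRegistry:
    """Hypothetical registry: maps each object of interest to its interested parties."""

    def __init__(self, is_desirable: Callable[[str, str], bool]) -> None:
        self._parties: Dict[str, Set[str]] = defaultdict(set)
        # Assumed policy hook deciding whether a notification is desirable now.
        self._is_desirable = is_desirable
        # Notifications issued so far, as (party, object_id) pairs.
        self.sent: List[Tuple[str, str]] = []

    def accept(self, party: str, object_id: str) -> None:
        """Steps a) and b): accept a specification and record the interested party."""
        self._parties[object_id].add(party)

    def on_change(self, object_id: str) -> None:
        """Steps c) and d): on a detected change, notify parties for whom it is desirable."""
        for party in sorted(self._parties.get(object_id, ())):
            if self._is_desirable(party, object_id):
                self.sent.append((party, object_id))


# Usage: "bob" is treated as not wanting a notification at this time.
registry = InterestRegistry(is_desirable=lambda party, obj: party != "bob")
registry.accept("alice", "http://example.com/a.html")
registry.accept("bob", "http://example.com/a.html")
registry.on_change("http://example.com/a.html")
```

The desirability test is passed in as a callback because the claim leaves that decision open; dependent claims such as claim 5 supply one possible basis for it.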

2. The method as claimed in claim 1, wherein said interested parties are HTTP-compliant client browsers, said specifications are accepted from said interested parties by following an HTTP client-to-server data transmission protocol, and said one of said interested parties is notified of the occurrence of change in said one of said objects of interest by following an HTTP server-to-client data transmission protocol.

3. The method as claimed in claim 1, which further includes:

in response to accepting a specification of each object of interest, checking a cache memory for said each object of interest, and when said cache memory does not contain said each object of interest, obtaining said each object of interest by transmitting said each object of interest over said network from a computer in said network to said cache memory, and storing said each object of interest in said cache memory; and wherein said one of said objects is fetched from said cache memory for said one of said interested parties for display.

4. The method as claimed in claim 1, which further includes highlighting changes in said one of said objects of interest when displaying said one of said objects of interest.

5. The method as claimed in claim 1, which further includes accepting a desired update notification frequency from each interested party, and using the desired update notification frequency for determining whether an update notification would then be desirable for each interested party in the list of interested parties interested in the object of interest in which the occurrence of change is detected.

6. The method as claimed in claim 1, which further includes recording, for each object of interest, the time when said each object of interest was updated, and inspecting the recorded time when each object of interest was updated for detecting the occurrence of a change in said each object of interest.

7. The method as claimed in claim 1, wherein said specifications include specifications of aggregates of objects, and said method includes determining the objects in the specified aggregates of objects for maintaining in memory the list of the interested parties interested in each of the objects of interest.

8. The method as claimed in claim 1, wherein the specifications include abstract characterizations of attributes of objects, and further including the step of matching said abstract characterizations with attributes of objects stored in said distributed computing system to determine objects that are specified by said abstract characterizations.

9. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for providing client browsers for fetching said objects, a computer-implemented method of operating at least one of said computers for automatically notifying said client browsers when objects of interest are changed, said computer-implemented method comprising the steps of:

a) accepting from each client browser a specification of an object of interest, said specification including a resource locator identifying one of said computers containing said object of interest;

b) maintaining in memory a list of the client browsers interested in said object of interest;

c) checking a cache memory for said object of interest, and when said cache memory does not contain said object of interest, obtaining said object of interest by transmitting said object of interest over said network from said one of said computers containing said object of interest to said cache memory, storing said object of interest in said cache memory, and modifying the resource locator for said object of interest to create a modified resource locator for said object of interest identifying said cache memory as a location for obtaining said object of interest;

d) transmitting the modified resource locator for said object of interest from said cache memory to said each client browser, said each client browser thereafter using said modified locator for said object of interest to fetch said object of interest from said cache memory;

e) detecting occurrence of change in said object of interest, and in response to detecting the occurrence of change in said object of interest, determining whether an update notification would then be desirable for each of the client browsers in the list of client browsers interested in said object of interest; and

f) upon determining that an update notification would then be desirable for one of the client browsers in response to detecting the occurrence of change in said object of interest, notifying said one of the client browsers of the occurrence of change in said object of interest.
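
One way to picture the modified resource locator of step c) is to wrap the original locator in a query parameter of a cache address. This is a sketch under assumptions: the cache address `CACHE_BASE`, the parameter name `src`, and both function names are invented for illustration, and the claim does not prescribe any particular encoding.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address of the cache; not specified in the claims.
CACHE_BASE = "http://cache.example.net/fetch"


def to_cache_locator(original_url: str) -> str:
    """Build a locator that names the cache as the place to fetch the object."""
    return f"{CACHE_BASE}?{urlencode({'src': original_url})}"


def from_cache_locator(cache_url: str) -> str:
    """Recover the original locator so the cache can refetch from the source."""
    return parse_qs(urlsplit(cache_url).query)["src"][0]


original = "http://example.com/docs/a.html?rev=3"
modified = to_cache_locator(original)
```

Keeping the original locator recoverable from the modified one lets the cache go back to the source computer when its copy is missing or stale.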

10. The method as claimed in claim 9, which includes maintaining a list of mechanisms used for notifying client browsers of the occurrence of change in the respective objects of interest.

11. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for providing client browsers for fetching said objects, said object access software using standard resource location protocols for specifying the objects to be fetched, a computer-implemented method of operating at least one of said computers for providing objects from a cache located close to a group of said client browsers, said computer-implemented method comprising the steps of:

a) obtaining objects of interest to said group of client browsers by transmitting said objects of interest over said network to said cache from computers in said distributed computing system containing said objects of interest; and

b) transmitting said objects of interest from said cache to said client browsers in said group of client browsers;

wherein said client browsers are HTTP-compliant client browsers following an HTTP client-to-server data transmission protocol, and said method further includes modifying resource locators for said objects of interest to create modified resource locators identifying said cache as a location for obtaining said objects of interest.

12. The method as claimed in claim 11, wherein said client browsers are HTTP-compliant client browsers following an HTTP client-to-server data transmission protocol, and said method further includes checking said cache for an object of interest identified by a resource locator in a specification received from one of said client browsers in said group of client browsers, and when said cache does not contain the object of interest identified by the resource locator, redirecting said one of said specifications to one of said computers identified by said resource locator as a location for obtaining the object of interest.
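
The cache check and redirect of claim 12 might look like the following sketch. It assumes the redirect is expressed as an HTTP 302 response with a `Location` header, which the claim does not state; the function name `handle_request` and the dictionary-backed cache are likewise hypothetical.

```python
from typing import Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)


def handle_request(cache: Dict[str, bytes], url: str) -> Response:
    """Serve from the cache on a hit; on a miss, redirect the client to the source."""
    body = cache.get(url)
    if body is not None:
        return 200, {}, body
    # Assumption: a 302 redirect sends the browser to the computer named by the locator.
    return 302, {"Location": url}, b""


cache = {"http://example.com/a.html": b"<html>cached</html>"}
hit = handle_request(cache, "http://example.com/a.html")
miss = handle_request(cache, "http://example.com/b.html")
```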

13. The method as claimed in claim 12, wherein said cache stores objects of interest obtained from various ones of said computers in said distributed computing system, and said method includes transmitting the objects of interest from said cache to said client browsers in said group of client browsers instead of transmitting the objects of interest from said various ones of said computers in said distributed computing system to said client browsers in said group of client browsers in order to reduce latency for access to the objects of interest stored in said cache.

14. The method as claimed in claim 12, wherein said cache stores objects of interest obtained from various ones of said computers in said distributed computing system, and said method includes transmitting the objects of interest from said cache to said client browsers in said group of client browsers instead of transmitting the objects of interest from said various ones of said computers in said distributed computing system to said client browsers in said group of client browsers in order to reduce communication access costs for access to the objects of interest stored in said cache.

15. The method as claimed in claim 11, which further includes selecting among alternative sources for objects of interest to be stored in said cache by employing performance, cost or quality statistics on resource accesses from said alternative sources.

16. The method as claimed in claim 15, which further includes collecting and maintaining said performance, cost or quality statistics on resource accesses from said alternative sources.

17. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for providing client browsers for fetching said objects, said object access software using standard resource location protocols for specifying the objects to be fetched, a computer-implemented method of operating at least one of said computers for providing objects from a cache located close to a group of said client browsers, said computer-implemented method comprising the steps of:

a) obtaining objects of interest to said group of client browsers by having said objects of interest transmitted over said network to said cache from computers in said distributed computing system containing said objects of interest; and

b) transmitting said objects of interest from said cache to said client browsers in said group of client browsers;

wherein said method further includes:

c) accepting from said client browsers specifications of objects of interest for which said client browsers desire notification of changes;

d) detecting occurrence of changes in the objects of interest, and in response to detecting the occurrence of a change in an object of interest, determining whether an update notification would then be desirable for each client browser desiring notification of changes in the object of interest in which the occurrence of change is detected; and

e) upon determining that an update notification would then be desirable for one of the client browsers, notifying said one of the client browsers of the occurrence of change in said one of said objects of interest.

18. The method as claimed in claim 17, which further includes:

f) recording a first indication of when said one of said objects of interest was last changed;

g) recording a second indication of when said one of the client browsers was last notified of the occurrence of change in said one of said objects of interest;

h) receiving from said one of the client browsers a specification of a desired update notification frequency; and

i) using said first indication, said second indication, and said specification of said desired update notification frequency to determine that an update notification would then be desirable to said one of said client browsers.
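
Step i) combines three recorded values into one decision. A minimal sketch follows, assuming the desired update notification frequency is expressed as a minimum interval in seconds between notifications; the `Subscription` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    last_changed: float   # first indication: when the object was last changed
    last_notified: float  # second indication: when the browser was last notified
    min_interval: float   # assumed form of the frequency spec: seconds between notices


def notification_desirable(sub: Subscription, now: float) -> bool:
    """True when an unseen change exists and the requested interval has elapsed."""
    has_unseen_change = sub.last_changed > sub.last_notified
    interval_elapsed = (now - sub.last_notified) >= sub.min_interval
    return has_unseen_change and interval_elapsed


sub = Subscription(last_changed=100.0, last_notified=50.0, min_interval=60.0)
```

With these values, a check at time 120 finds an unseen change and 70 seconds elapsed, so a notification is desirable; a check at time 100 finds only 50 seconds elapsed, so it is not.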

19. The method as claimed in claim 17, which further includes recording an indication of changes made to the objects stored in said cache, and providing, to said client browsers in said group of client browsers, summaries of changes made to the objects stored in said cache.

20. A distributed computing system comprising a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for providing client browsers for fetching said objects, said object access software using standard resource location protocols for specifying the objects to be fetched, wherein a plurality of said computers are distributed around the network and programmed to maintain in respective cache memories objects of interest to respective neighboring groups of said client browsers by fetching the objects of interest from various ones of said computers across said network and storing the objects of interest in the cache memories, and transmitting the objects of interest from the cache memories to the respective neighboring groups of said client browsers;

further including means for employing performance, cost or quality statistics of resource accesses to select among alternative sources for the objects of interest stored in said cache memories; and

means for selecting among alternative sources for the objects of interest by optimizing a user objective function trading off communication cost factors against quality of service factors including latency.
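
The trade-off in the last element of claim 20 can be illustrated with a weighted sum. This is only one possible objective function, chosen for brevity; the `SourceStats` fields, the weights, and the sample figures are assumptions, not values from the patent.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SourceStats:
    name: str
    cost_per_fetch: float   # communication cost factor (arbitrary units)
    mean_latency_s: float   # quality-of-service factor: observed latency in seconds


def choose_source(sources: Iterable[SourceStats],
                  cost_weight: float,
                  latency_weight: float) -> SourceStats:
    """Pick the source minimizing an assumed weighted sum of cost and latency."""
    return min(
        sources,
        key=lambda s: cost_weight * s.cost_per_fetch + latency_weight * s.mean_latency_s,
    )


sources = [
    SourceStats("mirror", cost_per_fetch=0.01, mean_latency_s=2.0),
    SourceStats("origin", cost_per_fetch=0.05, mean_latency_s=0.5),
]
latency_sensitive = choose_source(sources, cost_weight=1.0, latency_weight=0.1)
cost_sensitive = choose_source(sources, cost_weight=100.0, latency_weight=0.1)
```

Changing the weights changes the winner: a user who weights cost heavily gets the cheaper but slower mirror, while a user who weights it lightly gets the faster origin.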

21. The distributed computing system as claimed in claim 20, wherein said cache memories are distributed at selected locations around said network, and said client browsers are grouped to be serviced by said cache memories.

22. A distributed computing system comprising a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for providing client browsers for fetching said objects, said object access software using standard resource location protocols for specifying the objects to be fetched, wherein a plurality of said computers are distributed around the network and programmed to maintain in respective cache memories objects of interest to respective neighboring groups of said client browsers by fetching the objects of interest from various ones of said computers across said network and storing the objects of interest in the cache memories, and transmitting the objects of interest from the cache memories to the respective neighboring groups of said client browsers;

further including means for employing performance, cost or quality statistics of resource accesses to select among alternative sources for the objects of interest stored in said cache memories; and

wherein said cache memories are distributed at selected locations around said network, and said client browsers are selectively grouped to be serviced by said cache memories, in order to optimize expected value of an objective function in probability over a distribution of expected accesses of the objects of interest to the client browsers included in said groups of client browsers.

23. A distributed computing system comprising a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for providing client browsers for fetching said objects, said object access software using standard resource location protocols for specifying the objects to be fetched, wherein a plurality of said computers are distributed around the network and programmed to maintain in respective cache memories objects of interest to respective neighboring groups of said client browsers by fetching the objects of interest from various ones of said computers across said network and storing the objects of interest in the cache memories, and transmitting the objects of interest from the cache memories to the respective neighboring groups of said client browsers;

further including means for employing performance, cost or quality statistics of resource accesses to select among alternative sources for the objects of interest stored in said cache memories; and

means associated with one of said cache memories for determining, for each object of interest in said one of said cache memories and for each client browser serviced by said one of said cache memories and being interested in said each object of interest in said one of said cache memories, whether an update notification would be desirable at a specific point in time.

24. A distributed computing system comprising a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for providing client browsers for fetching said objects, wherein a plurality of said computers are distributed around the network and programmed to maintain in respective cache memories objects of interest to respective neighboring groups of said client browsers by fetching the objects of interest from various ones of said computers across said network and storing the objects of interest in the cache memories, and transmitting the objects of interest from the cache memories to the respective neighboring groups of said client browsers,

wherein said distributed computing system includes, for one of said cache memories, means for receiving resource interest specifications from the client browsers serviced from said one of said cache memories, means for maintaining in said one of said cache memories objects that satisfy said resource interest specification, and means for maintaining a list of mechanisms for sending information to the client browsers serviced from said one of said cache memories.

25. The distributed computing system as claimed in claim 24, wherein said distributed computing system further includes, for said one of said cache memories, means for using said lists of mechanisms to send notice of changes in objects of interest to the client browsers serviced from said one of said cache memories.

26. The distributed computing system as claimed in claim 24, further including means for employing performance, cost or quality statistics of resource accesses to select among alternative sources for the objects of interest stored in said cache memories.

27. The distributed computing system as claimed in claim 26, further including means for collecting and maintaining said performance, cost or quality statistics on resource accesses from said alternative sources.

28. The distributed computing system as claimed in claim 24, further including means associated with one of said cache memories for determining, for each object of interest in said one of said cache memories and for each client browser serviced by said one of said cache memories and being interested in said each object of interest in said one of said cache memories, whether an update notification would be desirable at a specific point in time.

29. In a distributed computing system having a network of clients and servers, said clients sending resource requests to said servers, a computer-implemented method of using a resource manager in said distributed computing system for intermediating between said clients and servers, said method including the steps of:

a) said resource manager intercepting said resource requests from said clients;

b) said resource manager fetching objects from servers to satisfy said resource requests from said clients, said objects fetched from said servers having respective resource locators identifying the respective sources of the objects fetched from said servers;

c) said resource manager modifying said resource locators to identify said resource manager as a source for the objects fetched from the servers to satisfy said resource requests from said clients; and

d) said resource manager returning the fetched objects and modified resource locators to said clients.

30. The method as claimed in claim 29, wherein said clients are HTTP-compliant clients, said resource requests are accepted from said clients by following an HTTP client-to-server data transmission protocol, and said resource manager returns the fetched objects and modified resource locators to said clients by following an HTTP server-to-client data transmission protocol.

31. The method as claimed in claim 29, which further includes said resource manager redirecting one of said resource requests to an alternative server expected to produce higher quality information at lower cost.

32. The method as claimed in claim 29, which further includes said resource manager selecting among alternative sources for objects.

33. The method as claimed in claim 32, which further includes said resource manager using information about costs of the alternative sources to select lowest-cost alternative sources for objects.

34. The method as claimed in claim 32, which further includes said resource manager using information about quality and costs of the alternative sources for objects to select among the alternative sources for objects by a trade-off between quality and cost.

35. The method as claimed in claim 32, which further includes said resource manager collecting and maintaining said performance, cost or quality statistics on resource accesses from said alternative sources, and using said statistics on resource accesses for the selecting among the alternative sources for objects.

36. The method as claimed in claim 29, which further includes said resource manager storing in a cache memory objects having respective resource locators that are modified by the resource manager to indicate that the resource manager is a source of the objects, and said resource manager responding to resource requests from said clients by checking said cache memory for objects that satisfy the resource requests, and when said resource manager finds an object in said cache memory that satisfies one of the resource requests, said resource manager transmitting the object from the cache memory to the client requesting the object.

37. The method as claimed in claim 36, which further includes said resource manager redirecting a request from a client for a specified object to a server when the resource manager checks said cache memory for the specified object and the specified object is absent from said cache memory.

38. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of providing the interested parties with summary reports of how objects of interest have changed, said computer-implemented method comprising the steps of:

a) accepting from said interested parties specifications of the objects of interest;

b) maintaining in memory a list of the interested parties interested in each of the objects of interest;

c) maintaining in memory a list of mechanisms for notifying the interested parties of changes in each of the objects of interest;

d) detecting occurrence of changes in the objects of interest, and in response to detecting an occurrence of a change in an object of interest, recording information about the change in the object of interest; and

e) using recorded information about changes in the object of interest that has changed, using the list of interested parties interested in the object that has changed, and using the list of mechanisms for notifying the interested parties of changes in the object of interest that has changed, to provide, to the interested parties interested in the object that has changed, a summary of changes in the object of interest that has changed, said summary of changes including changes occurring since a certain point in time.
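
The summary of step e) can be sketched as a filter over a recorded change log. The `ChangeRecord` type, the one-line summary format, and the sample entries are hypothetical; delivery through the list of notification mechanisms is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ChangeRecord:
    object_id: str
    timestamp: float
    description: str


def summarize_changes(log: Iterable[ChangeRecord], object_id: str, since: float) -> List[str]:
    """Return one line per recorded change to object_id at or after `since`, oldest first."""
    relevant = [r for r in log if r.object_id == object_id and r.timestamp >= since]
    return [f"{r.timestamp:.0f}: {r.description}"
            for r in sorted(relevant, key=lambda r: r.timestamp)]


log = [
    ChangeRecord("a.html", 20.0, "paragraph added"),
    ChangeRecord("a.html", 10.0, "title edited"),
    ChangeRecord("b.html", 15.0, "image replaced"),
]
summary = summarize_changes(log, "a.html", since=15.0)
```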

39. The method as claimed in claim 38, which further includes said computers executing said object-access software to display said object of interest that has changed, and to highlight changes in said object of interest that has changed.

40. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers for automatically notifying said interested parties when objects of interest are changed, said computer-implemented method comprising the steps of:

a) accepting from said interested parties specifications of the objects of interest;

b) maintaining in memory a list of the interested parties interested in each of the objects of interest;

c) maintaining in memory a list of mechanisms for notifying the interested parties of changes in each of the objects of interest;

d) determining when an update notification would be desirable for each interested party in the list of interested parties; and

e) when an update notification would be desirable for one of the interested parties interested in one of the objects of interest, using said list of mechanisms to notify said one of the interested parties of the occurrence of change in said one of the objects of interest.

41. The method as claimed in claim 40, wherein one of said objects of interest contains hyperlinks, and wherein the method includes traversing said hyperlinks to access components of said one of said objects of interest.

42. The method as claimed in claim 40, wherein the object is an aggregate of multiple objects, and wherein the step of determining when an update notification would be desirable for each interested party in the list of interested parties includes determining whether a change has occurred in a specified subset of said multiple objects.

43. The method as claimed in claim 40, wherein the method further includes determining when a change occurs in each of the objects of interest, and determining whether an update notification would be desirable when a change occurs in each of the objects of interest.

44. The method as claimed in claim 40, wherein the method further includes polling a source of an object of interest in order to determine whether an update notification would be desirable.

45. The method as claimed in claim 40, wherein a source of an object of interest produces a change indication when change occurs in the object of interest, and wherein the step of determining whether an update notification would be desirable is responsive to the change indication from the source of the object of interest.

46. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers for notifying a group of said interested parties when a change of significance is made to an object of interest to said group, said computer-implemented method comprising the steps of:

a) accepting from said interested parties specifications of the objects of interest;

b) maintaining in memory a list of the interested parties interested in each of the objects of interest;

c) determining, for each object of interest, whether a change of significance has been made to an object of interest; and

d) upon determining that a change of significance has been made to an object of interest, using the list of interested parties interested in the object of interest that has changed to notify the interested parties interested in the object of interest that has changed.

47. The method as claimed in claim 46, which includes maintaining in memory a list of significant change detection methods for each object of interest, and wherein the list of significant change detection methods for each object of interest is used when a change is detected in said each object of interest to determine whether a change of significance has been made in said each object of interest.
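
A list of significant change detection methods, as in claim 47, could be modeled as a list of predicates over the old and new content. The two detectors below and the rule that any single detector suffices are assumptions made for this sketch; the claim does not say how the methods are combined.

```python
from typing import Callable, List

# A detector compares old and new content and reports whether the change matters.
Detector = Callable[[str, str], bool]


def length_changed_by(threshold: int) -> Detector:
    """Hypothetical detector: the content length changed by at least `threshold` characters."""
    return lambda old, new: abs(len(new) - len(old)) >= threshold


def keyword_appeared(keyword: str) -> Detector:
    """Hypothetical detector: `keyword` is present in the new content but not the old."""
    return lambda old, new: keyword in new and keyword not in old


def is_significant(detectors: List[Detector], old: str, new: str) -> bool:
    """Assumed policy: a change is significant if any listed detector fires."""
    return old != new and any(d(old, new) for d in detectors)


detectors = [length_changed_by(20), keyword_appeared("recall")]
minor = is_significant(detectors, "price list", "price  list")
major = is_significant(detectors, "price list", "price list: recall")
```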

48. The method as claimed in claim 46, wherein one of the interested parties in said group of interested parties makes a change of significance to an object of interest to said group of interested parties, and all other interested parties in said group of interested parties are notified virtually simultaneously of the change of significance made by said one of the interested parties in said group of interested parties.

49. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers for automatically notifying interested parties when a change is made to a component of an object defined as a neighborhood or cluster of semantically related objects, said computer-implemented method comprising the steps of:

a) maintaining in memory a list of interested parties interested in said object defined as a neighborhood or cluster of semantically related objects;

b) determining whether a change has occurred in said object defined as a neighborhood or cluster of semantically related objects;

c) upon determining that a change has occurred in said object defined as a neighborhood or cluster of semantically related objects, notifying interested parties by using said list of interested parties interested in said object defined as a neighborhood or cluster of semantically related objects.

50. The method as claimed in claim 49, which further includes receiving from one of said interested parties a specification of said object defined as a neighborhood or cluster of semantically related objects.

51. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers for automatically notifying interested parties when a change is made to a component of an object defined as a neighborhood or cluster of semantically related objects, said computer-implemented method comprising the steps of:

a) maintaining in memory a list of interested parties interested in said object defined as a neighborhood or cluster of semantically related objects;

b) determining whether a change has occurred in said object defined as a neighborhood or cluster of semantically related objects;

c) upon determining that a change has occurred in said object defined as a neighborhood or cluster of semantically related objects, notifying interested parties by using said list of interested parties interested in said object defined as a neighborhood or cluster of semantically related objects;

which further includes receiving from one of said interested parties a specification of said object defined as a neighborhood or cluster of semantically related objects, wherein said specification is a search specification of key words.
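
By way of non-limiting illustration only, the keyword-specified cluster of claim 51 could be sketched as follows. The class name ClusterWatch, the whitespace-based keyword matching, and the SHA-256 fingerprinting used to detect a change are all assumptions made for this sketch; the claims do not prescribe any of them.

```python
import hashlib

class ClusterWatch:
    """Hypothetical sketch: a cluster is every document containing all key words."""

    def __init__(self, keywords):
        self.keywords = {k.lower() for k in keywords}  # search specification
        self.interested = []                           # list of interested parties
        self.fingerprints = None                       # None until a baseline scan

    def matches(self, text):
        # Assumed matching rule: every key word appears as a whitespace-separated word.
        return self.keywords <= set(text.lower().split())

    def scan(self, documents):
        """documents maps name -> text. Returns the parties to notify, if any."""
        current = {
            name: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for name, text in documents.items()
            if self.matches(text)
        }
        if self.fingerprints is None:
            # The first scan only establishes the baseline; nothing to report yet.
            self.fingerprints = current
            return []
        changed = current != self.fingerprints
        self.fingerprints = current
        return list(self.interested) if changed else []
```

In this sketch a change to any component of the cluster, or a change in cluster membership, counts as a change to the cluster as a whole; documents outside the cluster are ignored.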

52. The method as claimed in claim 51, which further includes maintaining a list of mechanisms for notifying said interested parties, and using said list of mechanisms for notifying said interested parties when notifying the interested parties interested in said object defined as a neighborhood or cluster of semantically related objects.

53. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers based on time-value of information in said objects, said computer-implemented method comprising the steps of:

a) receiving from the interested parties specifications of objects of interest;

b) maintaining in memory a list of the interested parties interested in the objects of interest;

c) evaluating whether a time-value of each object of interest exceeds a threshold to determine whether said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest; and

d) upon determining that said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest, notifying the interested parties interested in said each object of interest.
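
As a non-limiting sketch of steps (c) and (d) of claim 53, the time-value of an object could be modeled as an initial value that decays with age and is compared against a threshold. The exponential half-life model, the TrackedObject fields, and the function names are assumptions introduced here for illustration; the claim itself does not fix any particular time-value calculation.

```python
from dataclasses import dataclass, field

@dataclass
class TrackedObject:
    name: str
    initial_value: float   # assumed information value at the moment of change
    half_life_s: float     # assumed decay half-life, in seconds
    changed_at: float      # time of the change, in seconds
    interested: list = field(default_factory=list)  # interested parties

def time_value(obj, now):
    """Assumed model: value halves every half_life_s seconds after the change."""
    age = max(0.0, now - obj.changed_at)
    return obj.initial_value * 0.5 ** (age / obj.half_life_s)

def parties_to_notify(objects, now, threshold):
    """Return (party, object name) pairs for objects whose time-value exceeds threshold."""
    notices = []
    for obj in objects:
        if time_value(obj, now) > threshold:
            notices.extend((party, obj.name) for party in obj.interested)
    return notices
```

Under this assumed model an object changed long ago falls below the threshold and no notification is issued for it, which is one possible reading of "sufficient information value".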

54. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers based on time-value of information in said objects, said computer-implemented method comprising the steps of:

a) receiving from the interested parties specifications of objects of interest;

b) maintaining in memory a list of the interested parties interested in the objects of interest;

c) evaluating whether a time-value of each object of interest exceeds a threshold to determine whether said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest; and

d) upon determining that said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest, notifying the interested parties interested in said each object of interest;

which further includes maintaining in memory a list of time-value calculations for each of said interested parties, and using said time-value calculations for evaluating whether said each object of interest has sufficient information value to notify each of the interested parties interested in said each object of interest.

55. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers based on time-value of information in said objects, said computer-implemented method comprising the steps of:

a) receiving from the interested parties specifications of objects of interest;

b) maintaining in memory a list of the interested parties interested in the objects of interest;

c) evaluating whether a time-value of each object of interest exceeds a threshold to determine whether said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest; and

d) upon determining that said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest, notifying the interested parties interested in said each object of interest;

which further includes receiving from the interested parties specifications for evaluating the time-value of objects of interest to the interested parties.

56. The method as claimed in claim 55, wherein one of the interested parties specifies a procedure which is executed to evaluate the time-value of one of said objects of interest.

57. The method as claimed in claim 55, wherein one of the interested parties specifies parameters for a function that evaluates the time-value of one of said objects of interest.

58. The method as claimed in claim 55, wherein one of the interested parties specifies knowledge for a knowledge-based system that evaluates the time-value of one of said objects of interest.

59. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers based on time-value of information in said objects, said computer-implemented method comprising the steps of:

a) receiving from said interested parties specifications of objects of interest, and specifications of significant information value for said objects of interest;

b) maintaining in memory a list of interested parties interested in the objects of interest;

c) evaluating said specifications of significant information value for said objects of interest to determine whether each object of interest has sufficient information value to notify the interested parties interested in said each object of interest; and

d) upon determining that said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest, notifying the interested parties interested in said each object of interest.

60. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers based on time-value of information in said objects, said computer-implemented method comprising the steps of:

a) receiving from said interested parties specifications of objects of interest, and specifications of significant information value for said objects of interest;

b) maintaining in memory a list of interested parties interested in the objects of interest;

c) evaluating said specifications of significant information value for said objects of interest to determine whether each object of interest has sufficient information value to notify the interested parties interested in said each object of interest; and

d) upon determining that said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest, notifying the interested parties interested in said each object of interest;

wherein one of said specifications of significant information value for said objects of interest is a time-value calculation specified by one of said interested parties for one of said objects of interest, and said time-value calculation is evaluated to determine whether said one of said objects of interest has sufficient information value to notify said one of said interested parties.

61. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software for enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers based on time-value of information in said objects, said computer-implemented method comprising the steps of:

a) receiving from said interested parties specifications of objects of interest, and specifications of significant information value for said objects of interest;

b) maintaining in memory a list of interested parties interested in the objects of interest;

c) evaluating said specifications of significant information value for said objects of interest to determine whether each object of interest has sufficient information value to notify the interested parties interested in said each object of interest; and

d) upon determining that said each object of interest has sufficient information value to notify the interested parties interested in said each object of interest, notifying the interested parties interested in said each object of interest;

wherein one of said specifications of significant information value for said objects of interest is knowledge specified by one of said interested parties for one of said objects of interest, and said knowledge is evaluated by a knowledge-based system to determine whether said one of said objects of interest has sufficient information value to notify said one of said interested parties.

62. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers being client computers executing object access software for permitting users to view information from said objects, a computer-implemented method of operating at least one of said computers for automatically notifying said client computers when conditions of interest occur with respect to objects of interest, said computer-implemented method comprising the steps of:

a) accepting from each client computer a specification of an object of interest, said specification including a resource locator identifying one of said computers containing said object of interest;

b) maintaining in memory a list of the client computers interested in said object of interest;

c) checking a cache memory for said object of interest, and when said cache memory does not contain said object of interest, obtaining said object of interest by transmitting said object of interest over said network from said one of said computers containing said object of interest to said cache memory, storing said object of interest in said cache memory, and modifying the resource locator for said object of interest to create a modified resource locator for said object of interest identifying said cache memory as a location for obtaining said object of interest;

d) transmitting the modified resource locator for said object of interest from said cache memory to said each client computer, said each client computer thereafter using said modified locator for said object of interest to fetch said object of interest from said cache memory;

e) detecting a condition of interest with respect to said object of interest, and in response to detecting the condition of interest with respect to said object of interest, determining whether a notification of the condition of interest would then be desirable for each of the client computers in the list of client computers interested in said object of interest; and

f) upon determining that a notification of the condition of interest would then be desirable for one of the client computers in response to detecting the condition of interest in said object of interest, notifying said one of the client computers of the occurrence of the condition of interest in said object of interest.
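
Steps (a) through (d) of claim 62 may be illustrated, without limitation, by the following sketch. The class AgentCache, the injected fetch callable, and the form of the modified resource locator (a query parameter named url on a hypothetical /cache path) are assumptions chosen for the example and are not taken from the claims.

```python
from urllib.parse import parse_qs, quote, urlsplit

class AgentCache:
    """Hypothetical sketch of a cache that hands out modified resource locators."""

    def __init__(self, cache_host, fetch):
        self.cache_host = cache_host
        self.fetch = fetch       # callable(locator) -> content, supplied by the caller
        self.store = {}          # original locator -> cached object
        self.interested = {}     # original locator -> list of client computers

    def register(self, client, locator):
        # (b) maintain the list of clients interested in this object
        self.interested.setdefault(locator, []).append(client)
        # (c) fetch from the origin computer only on a cache miss
        if locator not in self.store:
            self.store[locator] = self.fetch(locator)
        # (d) return a locator that identifies the cache as the place to fetch from
        return self.modified_locator(locator)

    def modified_locator(self, locator):
        # Assumed URL layout; the original locator is percent-encoded in full.
        return f"http://{self.cache_host}/cache?url={quote(locator, safe='')}"

    def serve(self, modified):
        # Recover the original locator from the modified one and answer from cache.
        original = parse_qs(urlsplit(modified).query)["url"][0]
        return self.store[original]
```

A client that later presents the modified locator is served from the cache without contacting the computer that originally contained the object.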

63. The method as claimed in claim 62, which includes maintaining a list of mechanisms used for notifying respective client computers of the occurrence of respective conditions of interest in the respective objects of interest.

64. In a distributed computing system having a network of computers linked for accessing objects distributed among said computers, some of said computers executing object access software enabling interested parties to request access to said objects, a computer-implemented method of operating at least one of said computers for automatically notifying said interested parties when conditions of interest occur with respect to objects of interest, said computer-implemented method comprising the steps of:

a) accepting from said interested parties specifications of the objects of interest, and specifications of respective conditions of interest in the objects of interest;

b) for each of the objects of interest, maintaining in memory a respective list of the interested parties and the conditions of interest for the interested parties; and

c) checking whether a respective condition of interest occurs with respect to an object of interest, and when a respective condition of interest is found with respect to an object of interest, determining whether or not notification of the occurrence of the condition of interest to at least one of the interested parties would then be desirable, and when notification of the condition of interest to said at least one of the interested parties would then be desirable, notifying said at least one of the interested parties of the occurrence of the condition of interest with respect to the object of interest.
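
One possible, non-limiting way to picture the per-object lists of claim 64 is a registry that stores, for each object of interest, each interested party together with that party's condition of interest. In the sketch below the names ConditionRegistry, accept, and check, and the representation of conditions and of the "would then be desirable" test as Python callables, are assumptions made for illustration.

```python
class ConditionRegistry:
    """Hypothetical sketch: per-object lists of parties and their conditions."""

    def __init__(self):
        # object name -> list of (party, condition, desirable)
        self.subscriptions = {}

    def accept(self, party, obj_name, condition, desirable=lambda party, now: True):
        # (a), (b): record the party, its condition of interest, and an assumed
        # predicate deciding whether a notification is desirable at a given time.
        self.subscriptions.setdefault(obj_name, []).append((party, condition, desirable))

    def check(self, obj_name, obj_state, now):
        # (c): a party is notified only when its own condition holds and a
        # notification would then be desirable for that party.
        notified = []
        for party, condition, desirable in self.subscriptions.get(obj_name, []):
            if condition(obj_state) and desirable(party, now):
                notified.append(party)
        return notified
```

Keeping the desirability test separate from the condition lets two parties share the same condition of interest while being notified at different times.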

65. The method as claimed in claim 64, which further includes maintaining in memory a list of mechanisms for notifying the interested parties of the conditions of interest with respect to the objects of interest, and when a notification of the condition of interest would then be desirable for one of the interested parties interested in one of the objects of interest, using the list of mechanisms to notify said one of the interested parties of the condition of interest with respect to said one of the objects of interest.

66. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system, said cache memory maintaining information about the changed object of interest; and then

(b) at an appropriate time after said cache memory receives information about the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of an interested party interested in the changed object of interest.

67. The method as claimed in claim 66, wherein software executed by the third computer provides a client browser that is used for fetching specified objects that are distributed among said computers in said distributed computer system.

68. The method as claimed in claim 66, wherein the second computer maintains in the cache memory information about specified objects and also information about objects in specified general classes of objects.

69. The method as claimed in claim 68, wherein the second computer maintains in the cache memory information about the objects in the specified general classes of objects by searching the network of computers for objects that satisfy a search specification.

70. The method as claimed in claim 69, wherein the search specification defines a set of semantically related objects.

71. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system, said cache memory maintaining information about the changed object of interest; and then

(b) at an appropriate time after said cache memory receives information about the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of an interested party interested in the changed object of interest;

wherein the second computer waits a certain amount of time after receipt of information about the changed object of interest before forwarding information about the changed object of interest to the third computer of the interested party interested in the changed object of interest in order to place an appropriate limit on the frequency of transmission of information about the changed object of interest.
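
The waiting behavior recited in claim 71 may be sketched, purely as an illustration, by a forwarder that holds information about a change for a fixed period and coalesces any further changes received during that period. The class name DelayedForwarder, the fixed wait_s period, and the explicit tick call standing in for a timer are assumptions of this sketch.

```python
class DelayedForwarder:
    """Hypothetical sketch: hold change information, forward at most once per wait."""

    def __init__(self, wait_s, send):
        self.wait_s = wait_s   # assumed fixed waiting period, in seconds
        self.send = send       # callable that forwards to the third computer
        self.pending = None    # latest information not yet forwarded
        self.due_at = None     # earliest time the pending information may be sent

    def on_change(self, info, now):
        # The wait starts at receipt of the first change that is not yet forwarded;
        # later changes replace the pending information without extending the wait.
        if self.pending is None:
            self.due_at = now + self.wait_s
        self.pending = info

    def tick(self, now):
        # Forward once the waiting period has elapsed; returns True if sent.
        if self.pending is not None and now >= self.due_at:
            self.send(self.pending)
            self.pending = None
            return True
        return False
```

Because each forward is preceded by a full waiting period, the interested party receives at most one transmission per period regardless of how often the object changes.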

72. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system, said cache memory maintaining information about the changed object of interest; and then

(b) at an appropriate time after said cache memory receives information about the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of an interested party interested in the changed object of interest;

wherein the second computer maintains a list of interested parties that are interested in the changed object of interest, and wherein the second computer also maintains, for each of the interested parties in the list, information specifying when it is appropriate to forward information about the changed object of interest from said cache memory to said each of the interested parties in the list.

73. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system, said cache memory maintaining information about the changed object of interest; and then

(b) at an appropriate time after said cache memory receives information about the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of an interested party interested in the changed object of interest;

wherein the second one of the computers is responsive to a demand from an interested party for information about the changed object of interest by transmitting a summary of changes in the changed object of interest from the cache memory to the interested party demanding information about the changed object of interest.

74. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system, said cache memory maintaining information about the changed object of interest; and then

(b) at an appropriate time after said cache memory receives information about the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of an interested party interested in the changed object of interest;

wherein the second one of the computers maintains in said cache memory a history of a sequence of changes in the changed object of interest, and the second one of the computers is responsive to a demand from an interested party for a summary of changes in the changed object of interest by transmitting information about the sequence of changes in the object of interest from the cache memory to the interested party demanding the summary of changes in the changed object of interest.

75. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at a first one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system;

(b) the second computer maintaining in said cache memory current information about the changed object of interest; and

(c) in response to a demand from an interested party interested in the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of the interested party interested in the changed object of interest.

76. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at a first one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system;

(b) the second computer maintaining in said cache memory current information about the changed object of interest; and

(c) in response to a demand from an interested party interested in the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of the interested party interested in the changed object of interest;

wherein the second computer responds to receipt of information about the changed object of interest from the first computer by notifying the interested party that a change has occurred in the changed object of interest, and then the second computer waits for a demand from the interested party for information that includes changes in the changed object of interest, and in response to a demand from the interested party for information that includes changes in the changed object of interest, the second computer forwards to the interested party information that includes changes in the changed object of interest.

77. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) detecting a change in a changed object of interest residing at a first one of said computers in said distributed computing system, and upon detecting the change in the changed object of interest, the first computer providing information about the changed object of interest to a cache memory of a second one of said computers in said distributed computing system;

(b) the second computer maintaining in said cache memory current information about the changed object of interest; and

(c) in response to a demand from an interested party interested in the changed object of interest, the second computer forwarding information about the changed object of interest from said cache memory to a third one of said computers of the interested party interested in the changed object of interest;

wherein the second computer provides a summary of changes in the changed object to the interested party, and wherein the second computer responds to a demand from the interested party for a version of the changed object by providing a version of the changed object to the interested party.

78. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) maintaining in a cache memory of a first one of said computers information defining a current version of a changed object residing in a second one of said computers and also maintaining in the cache memory information defining at least one prior version of the changed object residing in the second one of said computers;

(b) the first one of said computers responding to a request from an interested party for a summary of changes in the changed object by providing from the cache memory a summary of changes in the changed object; and

(c) the first one of said computers responding to a request from an interested party for a specified version of the changed object by providing from the cache memory the specified version of the changed object.

79. The method as claimed in claim 78, wherein the first one of the computers reconstructs a prior version of the changed object from a log of changes in the cache memory and a current version of the changed object in the cache memory.
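
The reconstruction recited in claim 79 could, for example, be carried out by keeping in the log one reverse delta per change and applying the most recent deltas to the current version. The sketch below assumes line-oriented text objects and uses the standard difflib module to compute the deltas; the function names and the reverse-delta representation are assumptions and are not drawn from the claims.

```python
import difflib

def record_change(old_lines, new_lines):
    """Return a reverse delta: enough to rebuild old_lines from new_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # To undo: replace new_lines[j1:j2] with the old lines old_lines[i1:i2].
            delta.append((j1, j2, old_lines[i1:i2]))
    return delta

def undo_change(new_lines, delta):
    """Apply one reverse delta, last span first so earlier indices stay valid."""
    lines = list(new_lines)
    for j1, j2, old_chunk in reversed(delta):
        lines[j1:j2] = old_chunk
    return lines

def reconstruct(current, log, steps_back):
    """Rebuild the version that existed steps_back changes before the current one."""
    if not 0 <= steps_back <= len(log):
        raise ValueError("steps_back exceeds the length of the change log")
    lines = list(current)
    for delta in reversed(log[len(log) - steps_back:]):
        lines = undo_change(lines, delta)
    return lines
```

With this representation the cache memory needs to hold only the current version in full; each prior version costs only the lines that the corresponding change removed or replaced.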

80. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) at least one of said computers of an interested party maintaining a local cache of objects of interest to the interested party;

(b) said at least one of said computers receiving from another computer in the network a current version of an object of interest to the interested party; and

(c) said at least one of said computers comparing the current version of the object of interest to a version of the object of interest in the local cache to identify changes between the current version of the object of interest and the version of the object of interest in the local cache, and to indicate to the interested party the identified changes between the current version of the object of interest and the version of the object of interest in the local cache;

wherein the network automatically forwards to said at least one of said computers an indication that a change has been made to the object of interest in response to a change being made to the object of interest.
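
Step (c) of claim 80 may be illustrated, without limitation, by a local cache that compares each received current version against the version it already holds and reports the differing lines. The class name LocalCache, the line-by-line comparison, and the use of difflib.ndiff are assumptions made for this sketch.

```python
import difflib

class LocalCache:
    """Hypothetical sketch of a client-side cache that reports changes on receipt."""

    def __init__(self):
        self.versions = {}  # object name -> text of the version held locally

    def receive(self, name, current_text):
        """Store the current version and return the changed lines.

        Lines prefixed "+ " are present only in the current version; lines
        prefixed "- " were present only in the locally cached version.
        """
        previous = self.versions.get(name, "")
        changes = [
            line
            for line in difflib.ndiff(previous.splitlines(), current_text.splitlines())
            if line.startswith(("+ ", "- "))
        ]
        self.versions[name] = current_text
        return changes
```

The returned list is what would be indicated to the interested party; an empty list means the received version is identical to the version in the local cache.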

81. The method as claimed in claim 80, wherein the interested party specifies a criterion used by the network for determining when a change made to the object of interest is sufficiently significant for automatically forwarding to said at least one of said computers an indication that a change has been made to the object of interest in response to a change being made to the object of interest.

82. The method as claimed in claim 80, wherein the indication that a change has been made to the object of interest is the automatic transmission of the current version of the object of interest to said at least one of the computers.

83. The method as claimed in claim 80, wherein the object of interest is transmitted to said at least one of the computers in response to the party of interest being advised that a change has been made to the object of interest and the party of interest requesting to view at least a changed portion of the object of interest.

84. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about changed objects to interested parties that are interested in the information about the changed objects, said computer-implemented method comprising:

(a) at least one of said computers of an interested party maintaining a local cache of objects of interest to the interested party;

(b) said at least one of said computers receiving from another computer in the network a current version of an object of interest to the interested party; and

(c) said at least one of said computers comparing the current version of the object of interest to a version of the object of interest in the local cache to identify changes between the current version of the object of interest and the version of the object of interest in the local cache, and to indicate to the interested party the identified changes between the current version of the object of interest and the version of the object of interest in the local cache;

wherein the network automatically forwards to said at least one of said computers an indication that a change has been made to the object of interest in response to a change being made to the object of interest, and wherein the interested party specifies an update frequency used by the network for limiting the frequency at which the network automatically forwards to said at least one of said computers an indication that a change has been made to the object of interest in response to a change being made to the object of interest.

85. In a distributed computing system having a network of computers linked for transmission of information about objects distributed among said computers, some of said computers executing software enabling interested parties to be provided with information about objects that the interested parties are interested in, a computer-implemented method of operating said computers for providing information about updates to a software product of interest from at least one computer in said distributed computing system to interested parties that are interested in the software updates, said computer-implemented method comprising:

a) maintaining in a memory in said distributed computing system a list of the interested parties interested in the updates to the software product, the list being accessed by interested parties via said distributed computing system to register the current interest of the interested parties in the software product; and

b) distributing information about an update to the software product by accessing the list of the interested parties to obtain an identification of the interested parties that are currently interested in the software product, and using the identification of the interested parties that are currently interested in the software product to distribute information about the update to the software product through the distributed computing system from said at least one of the computers to the interested parties.

86. The method as claimed in claim 85, wherein the update to the software product is automatically downloaded through the distributed computing system from said at least one of the computers to the interested parties that are currently interested in the software product.

87. The method as claimed in claim 85, wherein the update to the software product is downloaded through the distributed computing system from said at least one of the computers to an interested party upon the request of the interested party.

88. The method as claimed in claim 85, wherein the distributed computing system includes the Internet, and the update to the software product is transmitted over the Internet to an interested party.

89. In a computer network having a plurality of addressable sites where network clients can access stored information, a method of change notification, said method comprising the steps of:

a) network clients interested in a particular object sending requests for change notification to a respective site in the network for receiving network client requests for change notification with respect to the particular object;

b) maintaining a list of the network clients from which requests for change notification are received at said respective site; and

c) in response to a change being made in the particular object, said list being inspected to send change notifications to the network clients on said list, whereby the network clients interested in the particular object do not need to poll said respective site to be notified of a change in the particular object.

90. The method of change notification as claimed in claim 89, wherein the object is a commercial product and the method further includes distributing an update of the commercial product to network clients that indicate a desire to obtain the update of the commercial product.

91. The method of change notification as claimed in claim 89, which includes at least one of the network clients on said list sending a request to said respective site to be removed from said list so as not to be notified of a subsequent change in said particular object.

92. The method of change notification as claimed in claim 89, wherein at least one of the network clients interested in said particular object sends a respective specification of a significant change detection method for said particular object for determining whether or not the change in said particular object is of sufficient significance to said at least one of the network clients for a change notification to be sent to said at least one of the network clients, and in response to the change being made in the particular object and upon inspecting said list to send change notifications to the network clients on said list, finding that said at least one of said network clients has sent said respective specification of said significant change detection method for said particular object, and applying said significant change detection method for said particular object to determine whether or not the change having been made in the particular object is of sufficient significance to said at least one of said network clients for a change notification to be sent to said at least one of said network clients.

93. The method of change notification as claimed in claim 89, wherein at least one of the network clients interested in said particular object sends a respective specification of a change notification method to be used for notifying said at least one of the network clients of a change in said particular object, and in response to the change being made in the particular object and upon inspecting said list to send change notifications to the network clients on said list, finding that said at least one of said network clients has sent said respective specification of said change notification method, and using said change notification method for notifying said at least one of said network clients of the change in said particular object.

94. A method of operating a server in a data network to service client requests for access to objects residing at other locations in the data network, said method comprising the steps of:

a) in response to requests from clients in the data network, the server fetching specified objects and delivering the specified objects to the clients in the data network; and

b) in response to an update request from at least one of clients having received at least one of the fetched objects, the server registering said at least one of the clients as desiring update notification with respect to said at least one of the fetched objects, and maintaining a copy of said at least one of the fetched objects in a cache memory, and once a change is made to said at least one of the fetched objects at another location in the data network where said at least one of the fetched objects resides, automatically updating the copy of said at least one of the fetched objects residing in the cache memory without client intervention and notifying said at least one of the clients that said at least one of the fetched objects has changed.

95. The method as claimed in claim 94, wherein the server maintains in the cache memory only those objects that are requested by the clients to be updated automatically.

96. The method as claimed in claim 94, wherein the server receives a request from said at least one of the clients to cancel registration for updates for said at least one of the fetched objects, and in response the server cancels registration of said at least one of the clients for updates for said at least one of the fetched objects.


Description

BACKGROUND OF THE INVENTION

The present invention relates generally to data processing, and more particularly to information retrieval from a local or remote server in a data network or internetwork.

In 1989, researchers at CERN (the European Laboratory for Particle Physics) wanted to develop a better way to give widely dispersed research groups access to shared information. They wanted to develop a system that would enable them to access quickly all types of information via a common interface, removing the need to execute many steps to achieve the final goal. At that time, to read a document or view an image from a remote site often required finding the location of the desired item, making a remote connection to the machine where it resided, and then retrieving it for storage on a local machine. Over the course of a year, the proposal for this project was refined, and work began on the implementation.

During 1992, CERN began publicizing their project as a world-wide web (WWW). People saw what a great idea it was, and began creating their own WWW servers to make their information available to the Internet. Some people also began working on WWW clients, designing easy-to-use interfaces to the WWW. By far the most successful has been the Mosaic browser from the National Center for Supercomputing Applications (NCSA) and its kindred WWW browsers.

Mosaic and its kindred browsers use a computer interface method known as hypertext. Hypertext is text having embedded cross-references that can be followed to obtain related information. At a display terminal, a user can follow a cross-reference by "clicking" a mouse to point to a cross-referenced phrase, causing a related document or index to appear in a "browser window" on the display. The hypertext used by Mosaic and its kindred browsers is defined in a document using Hypertext Markup Language (HTML) created by the Internet Engineering Task Force and described in an Internet draft document by T. Berners-Lee and D. Connolly entitled "Hypertext Markup Language--2.0" (Jun. 16, 1995). Mosaic and its kindred browsers use an application-level protocol called the Hypertext Transfer Protocol (HTTP) when communicating with network file servers to follow hypertext cross-references. HTTP is described in an Internet draft document by T. Berners-Lee, R. T. Fielding, and H. Frystyk Nielsen entitled "Hypertext Transfer Protocol--HTTP/1.0" (Mar. 8, 1995). Servers that recognize HTTP are known as HTTP Daemon (HTTPD) servers.

There are two freely distributable UNIX-based WWW HTTPD server programs that are widely available and adopted by numerous WWW sites, one from NCSA and the other from CERN. According to Netscape Communications Corp., there are now over 3 million WWW users accessing 10,000 Web servers. Currently, there are numerous free and commercial WWW browsers available: NCSA Mosaic, Cello, Viola, Emacs-W3, Lynx, Chimera, MacWeb, WinWeb, OmniWeb, Netscape Navigator from Netscape Communications, BookLink Internetworks, IBM Web Explorer, Netcom NetCruiser, Pipeline, Microsoft Network, Apple's Cyberdog technology, and packages from O'Reilly Associates, Spry, Spyglass, Quarterdeck, Infoseek, Ubique, Quadralay, and SunSoft's HotJava.

BookLink and Netscape Communications are the two most aggressive contenders offering a Web browser. They are start-up companies founded in March and April, 1994, respectively. Netscape released its first browser in October, 1994, and BookLink announced its highly functional browser in June, 1994.

Netscape recently announced its Netsite Communications Server and Commerce Server, as well as a new suite of applications for building a complete electronic commerce system on the Internet. Netscape products feature a built-in proprietary security mechanism that supports Netscape's Secure Sockets Layer (SSL) extension to the HTTP protocol. Based on SSL technology, they offer a secure commercial credit card billing and order processing system. The distinguishing features of the BookLink browser are: speed, multithreading (which allows simultaneous multiple data transfer sessions in an unlimited number of windows), multipaned windows that allow different documents in each window, progressive rendering of graphical elements (e.g., Graphics Interchange Format files--GIFs) so that users don't have to wait to see the completed page, and persistent caching of rendered pages so that there is no delay in redrawing of graphics elements.

The direction of commercial WWW technology development is clearly going toward communication security and faster, better browser displays. However, there has not been a similar degree of development in the area of Distributed Information Logistics Services (DILS). DILS are techniques for reducing human effort, communication costs, and latency in the access by users to information whose value may be time dependent and perishable. Some WWW users have complained about the lack of facilities supporting DILS. For example, the following are quotations from a "www-talk" newsgroup.

Christian Mogensen (mogens@CS.Stanford.EDU) Tue, Jan. 10, 1995 20:43:50+0100

>Is there any way in HTTP for a Server to automatically

>update a page without requiring the user of the client

>to click on anything?

Currently, no. HTTP is connectionless, and that makes it hard to do things after a transaction is completed.

>An example use of this would be if a Client requests a

>stock price page and keeps the page displayed. Now

>suppose the stock price changes. Is there a way

>within HTTP for the Server to update the page

>automatically without requiring the user to click on

>the reload option?

Another way to do this is to provide a stock-ticker application that is initialized by the web client when it receives application/x-stock-ticker data. The browser forks off the special viewer which opens a separate communication channel to the server.

The previously noted use of Expires: xxx header is interesting--I don't think it will work in the described manner until after a few revisions of browser software have passed . . .

jjjones--SIO Technologies Corp. (jjones@helpmt.sio.com) Tue, Jan. 10, 1995 21:44:39+0100

Question:

I know of no client that will automatically refetch a document (if it is the current one on the screen) if the current time surpasses the Expire time specified for the document. Correct?

Marc Salomon (marc@library.ucsf.edu) Tue, Jan. 10, 1995 23:07:50+0100

The problem is that the data changes on the server side and there is no way in the current HTTP model for the server to contact a client to inform them of this. A solution that would be within the http model would be for the server to inform the client that the content of this document changes rapidly and a interval for refreshed. Something like:

C: GET /stock_quote.html HTTP/1.1

S: HTTP/1.1 200 OK

Content-type: text/html

-> Refresh: <time_interval>

Last-modified: Wednesday, Dec. 7, 1994 22:03:38 GMT

Expires: Wednesday, Dec. 7, 1994 22:03:39 GMT

Content-length: 1721

This would allow the client to set a timer and perform a GET on the URL every time_interval so you would have the perception of a dynamically updated document. The updates would be dependent on the alarm clock, of course, instead of any real change in the content of the document, but I think its the closest you are going to get under the current scheme.

marc

Mark J Cox (M.J.Cox@bradford.ac.uk) Tue, Jan. 10, 1995 15:22:05+0100

The HTTP documentation mentions the Expires: header which "Gives the date after which the information ceases to be valid and should be retrieved again" to "allow for the periodic refreshing of displays of volitile [sic.] data".

Some clients are now starting to support the Expires header, but I know of none that will automatically refresh without user intervention. If they do then there is a need for a second header: one that tells the client that the document will expire at some given time--but don't bother getting it again.

End of quotations

Other network communication protocols have similar deficiencies in providing timely information to interested users. For example, Computerworld, Aug. 14, 1995, p. 45, says: "Although pleased with [Lotus] Notes effectiveness in enhancing communication and replacing paper, she has had to deal with cultural issues. For one, the information does not come directly to the user, the user has to open Notes and look for it. . . . "

Therefore, there is a problem of updating a client such as a browser when a page is changed, because all HTTPD servers are stateless and connectionless, so it is impossible for servers to maintain clients' states with the standard server software design and implementation. Changing the behavior of the standard browser and server software, which is now in use by millions of people, would be very costly in terms of ensuring backwards compatibility as the new software is installed.

SUMMARY OF THE INVENTION

The basic objective of the present invention is to provide a software agent for automatically retrieving changed documents previously accessed from network and internetwork servers. Such a software agent will be referred to below as a "Revision Manager."

According to one aspect of the invention, the Revision Manager operates as an intermediary between a client, such as a browser executed at a user's terminal, and a local or remote network server. The Revision Manager is viewed by the network server as a kind of client that fetches documents from the network server. The Revision Manager is viewed by the client as a kind of server that sends these documents to the client's browser for viewing by the user.

In accordance with another aspect of the invention, the Revision Manager maintains a cache system based on client requests, which means it need not cache every page it fetches, but only those pages that a client specifically requests to be updated automatically. Such a Revision Manager is different from a proxy server by: (1) selectively caching documents; (2) not requiring a browser to preselect a proxy server--a browser can communicate with any number of other servers while using the intermediate software agent; (3) updating registered clients' browsers' displays when a document is changed; and (4) deleting cached document files when no client interests are registered, obviating other techniques for garbage collection. An object such as a document will be referred to as "changed" when it is modified (i.e., updated) and also when it is created (i.e., a new object). In the case of a new object, the "changed portion" or "changes" in the object will be a new object itself. In addition, the cache in the Revision Manager can provide a foundation for highly desirable economies of scale in information distribution and dissemination, such as when the Revision Manager is located close to multiple users so that their common information requirements can be served quickly from a shared local cache. Such a local cache obviates repetitive accesses by each user to distant sources that consume significant amounts of long-distance communication. Users employing standard WWW HTTP browsers can automatically receive notification and a view of the most recent version of documents of interest when these documents change. In this fashion, the Revision Manager augments the typical WWW browser with capabilities for accessing a shared cache of automatically updated documents and for responding appropriately and automatically to the change in information within a previously viewed document.
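The selective caching and automatic deletion described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `SelectiveCache`, its method names, and the use of in-memory dictionaries are assumptions made only for illustration. A document is kept only while at least one client has registered interest in it, and it is evicted as soon as the last registration is cancelled, so no separate garbage collection pass is needed.

```python
class SelectiveCache:
    """Hypothetical cache holding only documents that clients asked to track."""

    def __init__(self):
        self._docs = {}      # url -> cached document body
        self._interest = {}  # url -> set of client ids registered for updates

    def register(self, client, url, body):
        """Record a client's interest and store the current document body."""
        self._interest.setdefault(url, set()).add(client)
        self._docs[url] = body

    def cancel(self, client, url):
        """Drop a client's interest; evict the document when nobody is left."""
        clients = self._interest.get(url, set())
        clients.discard(client)
        if not clients:
            self._interest.pop(url, None)
            self._docs.pop(url, None)

    def get(self, url):
        """Return the cached body, or None if the document is not cached."""
        return self._docs.get(url)

    def interested(self, url):
        """Return a copy of the set of clients registered for this document."""
        return set(self._interest.get(url, ()))
```

In this sketch a document fetched for a client that never registers interest is simply never passed to `register`, which mirrors the statement that the Revision Manager need not cache every page it fetches.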

In accordance with a further aspect of the invention, the Revision Manager can notice changes in various sorts of objects by a variety of means and can respond to these changes by notifying interested parties using a variety of appropriate techniques. In addition, each Revision Manager can serve a group of parties that have similar interests with a shared cache of pertinent objects, and multiple Revision Managers can be employed in this manner to optimize the costs and performance of providing timely and high-quality information to many groups over a wide area. Finally, a Revision Manager can be configured to redirect requests for resources to particular ones of several alternative servers to access those expected to produce higher quality information at lower costs.

In a specific embodiment, the operation of the Revision Manager is broken into two basic phases, one corresponding to a "start up" and the other corresponding to "continuing use." In the start-up phase, the user accesses a Revision Manager start-up document encoded as a Hypertext Markup Language (HTML) page on a Revision Manager server. The start-up document requests the user to provide two items of information: (1) a resource the Revision Manager should retrieve for the user and (2) the port number on the user's local machine where the user's browser can be notified to receive the resources retrieved by the Revision Manager for the user. In the continuing-use phase, objects retrieved by the Revision Manager are modified in two ways: (a) a form is appended to the retrieved object and (b) uniform resource locators (URLs) embedded in the object are altered. When the user views the modified retrieved object, the form allows the user to specify whether this is an object of interest and a maximum desirable frequency for notification of subsequent updates to the object of interest. The alteration of embedded URLs is designed to cause subsequent resource requests for the associated resources to be directed to the Revision Manager. If the user activates a hyperlink that has been so modified, the Revision Manager receives the request and decides how best to service the request. In general, it can service the request from a cache that it maintains or it can redirect the request to another server to satisfy the request. In either case, once the resource request is satisfied, the user receives an object modified in the two ways just discussed.

In a specific embodiment, the user may send requests for resources to the Revision Manager by directly supplying the resource locator in the form appended to modified objects previously accessed. In this case, when the user submits the form, the URL is modified to prepend the address of the Revision Manager so that the request is handled by the Revision Manager. The user, alternatively, may use the normal functions of his or her browser to "open" or "retrieve" any object in the typical manner. In such a case, the access request is sent to the user-specified server, which in most cases would not be the Revision Manager itself. In this manner, the user either may direct requests through the Revision Manager to exploit its capabilities for caching, change monitoring, and update notification or may bypass the Revision Manager as appropriate. The relative frequency of these two alternatives is determined entirely at the user's discretion.

The Revision Manager accepts a user input indicating that the user has some interest in monitoring a particular document. We refer to this input as a specification of an "object of interest" and to the user making the specification as "the interested party." In one specific embodiment, the specification of an object of interest consists of two components: (1) the unmodified URL for the object and (2) a check box flag that is toggled on by the user. In addition, in this specific embodiment, the interested party specifies a maximum frequency for receiving notification updates, which the Revision Manager uses to assure update notifications do not occur too often. The Revision Manager also keeps a list of parties interested in particular updates as well as information required to notify each of them appropriately when an update notification is due. In this specific embodiment, the interested party supplies a port number so that update notifications can be conveyed to each through a separate browser window that is opened and displayed automatically for the user. This specific embodiment of the Revision Manager therefore provides: (1) caching of objects of interest to support ready access in case of subsequent repeated requests; (2) translation of resource locators embedded in retrieved resources from their original values to enable subsequent accesses to be directed to the Revision Manager; (3) interception by the Revision Manager of modified resource requests; (4) determination of the best source for supplying the requested resource; (5) redirection of the request to the best supplier of the requested resource, if a resource of sufficient quality is not already present in the cache; (6) spontaneous updating of the cache when objects of interest have changed; and (7) notification of interested parties when objects of interest have changed.
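One way to picture the maximum notification frequency is as a minimum interval, in seconds, between successive notifications to the same party about the same object. The following Python sketch assumes that reading; the class `NotificationThrottle`, its method names, and the explicit `now` timestamp argument are hypothetical and chosen only to keep the example self-contained and deterministic.

```python
class NotificationThrottle:
    """Hypothetical limiter: at most one notice per interval, per party and object."""

    def __init__(self):
        self._min_interval = {}  # (client, url) -> minimum seconds between notices
        self._last_sent = {}     # (client, url) -> time of the last notice sent

    def set_interval(self, client, url, seconds):
        """Store the shortest interval the interested party is willing to accept."""
        self._min_interval[(client, url)] = seconds

    def should_notify(self, client, url, now):
        """Return True and record the time if a notice may be sent at time `now`."""
        key = (client, url)
        last = self._last_sent.get(key)
        interval = self._min_interval.get(key, 0)
        if last is not None and now - last < interval:
            return False  # too soon since the previous notice
        self._last_sent[key] = now
        return True
```

A caller would consult `should_notify` each time a change is detected and skip the notification when it returns False; whether a suppressed change is later delivered is a policy choice the sketch leaves open.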

In a specific embodiment, the Revision Manager receives a URL for a document sent by an off-the-shelf World-Wide Web HTTP browser which has a common client interface (CCI) built in, retrieves the document and, if the user specifies an interest in being alerted on updates to the document, caches the document, and subsequently spontaneously monitors the server to notice if the document has been modified. In the preferred embodiment, moreover, the monitoring is performed by periodically querying the document's server to determine if the document has changed since it was last retrieved. When the document is determined to have been modified, the Revision Manager saves the updated document to its cache and then informs each interested party's registered browser, through a CCI channel, to issue a GET command to the Revision Manager. In this way, the browser of the interested party accesses the modified document from the Revision Manager's cached file and updates its view to correspond with the most recently accessed updated object.
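The poll, update, and notify cycle described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: change detection here compares a SHA-256 digest of the freshly fetched body with that of the cached copy, which is a substitute for whatever query the Revision Manager actually sends to the server, and the `fetch` and `notify` callables are hypothetical stand-ins for the HTTP retrieval and the CCI notification channel.

```python
import hashlib


def digest(body: bytes) -> str:
    """Fingerprint a document body so two versions can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def poll_once(cache, interested, fetch, notify):
    """Run one polling pass over all documents that have registered clients.

    cache:      dict mapping url -> last retrieved body (bytes)
    interested: dict mapping url -> set of client identifiers
    fetch:      callable(url) -> current body (bytes); assumed, e.g. an HTTP GET
    notify:     callable(client, url); assumed, e.g. a message on a CCI channel
    Returns the list of urls whose cached copy was replaced.
    """
    updated = []
    for url, clients in interested.items():
        body = fetch(url)
        if url in cache and digest(cache[url]) == digest(body):
            continue  # unchanged since the last retrieval
        cache[url] = body  # save the updated document first
        for client in sorted(clients):
            notify(client, url)  # then tell each registered browser to re-GET
        updated.append(url)
    return updated
```

The cache is updated before any client is notified, so that a browser responding to the notification retrieves the new version from the cache rather than a stale one.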

In a specific embodiment, the method employed for intercepting a URL includes: (1) use of forms to accept a URL from a user through an HTTP browser, and (2) delivery of the requested document with translated hyperlinks, each of which has the Revision Manager's address as the prefix. This translation of hyperlinks means that when the user next selects a resource for access by activating the hyperlink, the user's request is first directed to the Revision Manager, regardless of the resource's actual location on the network. The delivered document also has a short form attached. This form allows a user to register interest in the currently viewed document and to specify the shortest interval in seconds desired by the user between successive update notifications on the same document. In this specific embodiment, there are two ways to cancel a registration: one is through the short form provided on the returned document, and the other is when the user terminates the browser. Other policies and mechanisms could also be easily supported. In this specific embodiment, other hyperlinks that refer to non-textual multimedia objects are not translated, but are retained intact along with the document's cache file. When the query as to whether a file has been modified is made by the Revision Manager, these multimedia URLs are also queried. Hence, although documents are not modified to cause browsers to access multimedia objects from the Revision Manager, any change made to the document, including its multimedia links, will trigger the browser to retrieve and update the display of the document. Deeper levels of change monitoring can also be adopted as a matter of policy, such as checking whether any of the objects referred to by URLs within a document have changed. 
This kind of monitoring of changes can be carried to multiple levels by a recursive application through linked documents of the change detection mechanism applied at the top level defined by a single document with its embedded hyperlinks.
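The hyperlink translation described above can be sketched in Python as a rewrite of `href` attributes. The prefix `http://rm.example.com/`, the list of multimedia file extensions, and the regular-expression approach are all assumptions made for illustration; the sketch handles only double-quoted absolute URLs and does not resolve relative links.

```python
import re

RM_PREFIX = "http://rm.example.com/"  # hypothetical Revision Manager address
# Assumed set of non-textual multimedia extensions that are left untranslated.
MEDIA_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".mpg", ".au", ".wav")

# Matches href="..." in any letter case, capturing the quoted URL separately.
HREF = re.compile(r'(href\s*=\s*")([^"]+)(")', re.IGNORECASE)


def translate_links(html, prefix=RM_PREFIX):
    """Prefix each textual hyperlink with the Revision Manager's address."""

    def rewrite(match):
        url = match.group(2)
        # Leave multimedia links and already translated links intact.
        if url.lower().endswith(MEDIA_EXTENSIONS) or url.startswith(prefix):
            return match.group(0)
        return match.group(1) + prefix + url + match.group(3)

    return HREF.sub(rewrite, html)
```

Because already prefixed links are skipped, applying `translate_links` to a document twice gives the same result as applying it once, which matters if a cached page is served again through the same path.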

Thus, the Revision Manager provides means for users to inform it of which documents are worthy of monitoring and techniques for alerting users when documents of interest have been updated. The Revision Manager serves as an intermediary between standard HTTP browsers and HTTP servers thereby obviating the need for significant new or different infrastructure. The Revision Manager, further, collects at a site convenient to a group of users a single cache of the most recent versions of documents so that all members of the group can have quick and inexpensive access, while the group as a whole can significantly reduce communication costs. Unlike other HTTP servers, the Revision Manager can spontaneously update its cache to keep its information current. Lastly, the Revision Manager can be configured to provide a user a comprehensive view of documents of interest that have changed since the last time the user looked.

BRIEF DESCRIPTION OF THE DRAWINGS

Other objects and advantages of the invention will become apparent upon reading the following detailed description with reference to the accompanying drawings wherein:

FIG. 1 is a block diagram showing a Revision Manager in a data network;

FIG. 2 is a block diagram showing components of the Revision Manager of FIG. 1;

FIG. 3 is a block diagram illustrating data communication among the Revision Manager, a client browser, and a remote server in the network of FIG. 1;

FIG. 4 is a block diagram illustrating the use of a cache in the Revision Manager;

FIG. 5 is a form that is filled in by a user to specify a document of interest to be monitored by the Revision Manager;

FIG. 6 is a form that is attached to Revision Manager monitored documents and filled in by the user to specify an update interval;

FIG. 7 is a block diagram showing the exchange of data between a client browser, a remote web server, and the cache and various processes of the Revision Manager;

FIG. 8 is a flowchart illustrating the execution of the Revision Manager Daemon process of the Revision Manager;

FIG. 9 is a flowchart of a forked child process spawned by the Revision Manager Daemon process upon receiving a request from a client browser;

FIG. 10 is a flowchart of a procedure used by the Revision Manager Daemon to process the cache of the Revision Manager;

FIG. 11 is a flowchart of a procedure used by the Revision Manager Daemon to process a CGI script;

FIG. 12 is a flowchart for a "RM_route.pl" script processing routine;

FIG. 13 is a flowchart of a procedure for fetching a page of a requested document;

FIG. 14 is a flowchart of a procedure for executing a document option;

FIG. 15 is a flowchart of a procedure for parsing HTML text;

FIG. 16 is a flowchart of a procedure for printing HTML text;

FIG. 17 is a flowchart for a "RM_cacheParse.pl" script processing routine;

FIG. 18 is a flowchart of a Polling Daemon of the Revision Manager;

FIG. 19 is a flowchart of a procedure used by the Polling Daemon to walk through a current directory to build a polling list;

FIG. 20 is a flowchart of a polling action procedure used by the Polling Daemon for fetching documents in the polling list that are of interest to client browsers;

FIG. 21 is a flowchart of a procedure used by the Polling Daemon for checking the status of a response to a request for a document of interest in order to update the Revision Manager cache;

FIG. 22 is a flowchart of a procedure used by the Polling Daemon for checking a client list in order to determine whether to notify clients that the Revision Manager cache has a revised version of a document of interest;

FIG. 23 shows the display screen of a browser in which a user has accessed a Revision Manager;

FIG. 24 shows the display screen of the browser when the user is selecting a CCI option in a file pull-down menu;

FIG. 25 shows the display screen of the browser when the CCI option presents a dialog box;

FIG. 26 shows the display screen of the browser when the Revision Manager presents an interface short form for selecting the options of registering the currently displayed document for update notification, or retrieving another Web page;

FIG. 27 shows the display screen of the browser when the user has entered an update interval of 30 seconds into the interface short form;

FIG. 28 shows the display screen of the browser when the user has entered a new Web page address;

FIG. 29 shows the response of the Revision Manager to the user registering the document for update notification;

FIG. 30 shows the Revision Manager notifying the user of an update to the document;

FIG. 31 is a schematic diagram of a data network employing multiple distributed Revision Managers to optimize access and storage of multiple versions of objects;

FIG. 32 is a schematic diagram of a file server adapted to notify interested parties when changes are made to objects stored in the file server for use in the data network of FIG. 31;

FIG. 33 is a flowchart of a control procedure executed by a processor in the file server of FIG. 32;

FIG. 34 is a schematic diagram of a Revision Manager especially adapted for use in the data network of FIG. 31;

FIG. 35 is a first portion of a flowchart of a procedure executed by a processor in the Revision Manager of FIG. 34 for responding to a request from a client to establish or change update service to the client;

FIG. 36 is a second portion of a flowchart of the procedure executed by the processor in the Revision Manager of FIG. 34 for responding to a request from a client to establish or change update service to the client;

FIG. 37 is a first portion of a flowchart of a procedure executed by the processor in the Revision Manager of FIG. 34 for responding to a search request from a client;

FIG. 38 is a second portion of the flowchart begun in FIG. 37, and also showing an entry point entered when the Revision Manager receives a change notification from a file server as shown in FIG. 32;

FIG. 39 is a third portion of the flowchart begun in FIG. 37, and showing steps for sending an object to a client having issued a search request for the object, or to all interested clients in response to a change notification, when the object is not an object of general interest and thus is not retained in a cache memory of the Revision Manager;

FIG. 40 is a fourth portion of the flowchart begun in FIG. 37, showing steps for sending a new object of general interest to all clients associated with the Revision Manager and being interested in the new object;

FIG. 41 is a fifth portion of the flowchart begun in FIG. 37, and showing steps for deciding whether to report changes in an object to clients interested in the object;

FIG. 42 is a flowchart of a general routine executed by a client when an object is received by the client;

FIG. 43 is a flowchart of an example of a client's significant change detection method performed by the Revision Manager of FIG. 34 for deciding whether to report a change in an object to the client;

FIG. 44 is a flowchart of an example of a client's notification method for notifying the client of a change in an object;

FIG. 45 is a flowchart of background processes performed by the Revision Manager of FIG. 34 for notifying clients interested in being informed when objects are not frequently changed, for monitoring sources of objects to compile current costs, quality, and availability of objects from the sources, and for compiling statistics of network throughput, latency and costs; and

FIG. 46 is a flowchart of a routine for reporting prior versions of an object by stepping backwards through a log of changes to the object.

While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown in the drawings and will be described in detail. It should be understood, however, that it is not intended to limit the invention to the particular forms shown, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the invention as defined by the appended claims.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Turning now to FIG. 1 of the drawings, there is shown a data network including a Revision Manager 1 in accordance with the invention. The Revision Manager 1 is a programmed digital computer linked by standard internet connections (shown as arrows in FIG. 1) to other nodes in the data network. Preferably, the program for the Revision Manager runs on any operating system that supports TCP/IP communications and that has a way to connect to either a wide or local area network as shown in FIG. 1.

As illustrated in FIG. 1, the Revision Manager is connected to function as an intermediary between a number of Mosaic browsers 2, 2a, 2b, or any other CCI capable web browser 3, and a number of remote HTTP servers 4, 4a, 4b. The program for the Revision Manager could run on a dedicated digital computer. Alternatively, the program for the Revision Manager could be one of many different programs or processes executed by a digital computer. For example, the program for the Revision Manager could be executed by the same digital computer that executes a program for a browser, or by the same digital computer that executes the program for a server.

As shown in FIG. 2, the Revision Manager includes three components: a Revision Manager Daemon 6 that is an HTTPD WWW server mechanism; a Polling Daemon 7 that is a cache polling mechanism, and a set of CGI scripts 8 that are services to support Revision Manager Forms. The Revision Manager Daemon 6 provides all HTTPD WWW server functions to any Web browser.

FIG. 3 shows that the Revision Manager acts as an intermediary between a browser client 2 (Mosaic) and a Remote HTTP server 4. In the absence of the Revision Manager 1, the client would send an intended destination URL 9 to the remote HTTP server 4, and the remote HTTP server would provide an intended document return 10 to the browser client 2.

To be provided automatically with updates to a document of interest, however, the browser client sends a GET command 11 to request an HTML document "rm.html" from a Revision Manager 1. The user enters a URL by filling in a form "rm.html" (301 shown in FIG. 5). When the user submits the form, the client 2 sends a POST command to the Daemon (6 in FIG. 4) of the Revision Manager 1. After a series of decisions (as described below with reference to FIGS. 9 to 11), the Revision Manager Daemon then calls either the RM.sub.-- route.pl CGI services (FIGS. 12 to 16) or RM.sub.-- cacheParse.pl CGI services (FIG. 17) to provide the document. RM.sub.-- route.pl takes the URL from the POST command parameter value and issues a GET command 12 to the remote server, which is the destination of the URL. After the document is returned 13, RM.sub.-- route.pl saves the document into a cache file, and then appends a Revision Manager short form to the document to create an altered document. The hyperlinks in the original document are translated to include the Revision Manager's URL as the prefix. Multimedia URLs need not be translated.

The following example shows how a hyperlink is translated. Suppose the original hyperlink is:

http://www.teknowledge.com/company/Patent.html

Then, the translated hyperlink will be:

http://SomeRevisionManager.CompanyName.com:8042/http://www.teknowledge.com/company/Patent.html

where "SomeRevisionManager.CompanyName.com" is the host IP address of the computer on which the Revision Manager is running, and 8042 is the port number the Revision Manager Daemon is using. The details of this mechanism will be discussed below.
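The translation described above can be sketched in a few lines of Python. This is only an illustration, not the script used by the Revision Manager: the function name `translate_hyperlinks`, the regular expression, and the list of multimedia extensions are all assumptions, since the text does not say how multimedia URLs are recognized.

```python
import re

# Revision Manager address taken from the example in the text.
RM_PREFIX = "http://SomeRevisionManager.CompanyName.com:8042/"

# Assumed list of multimedia extensions; the text does not enumerate them.
MEDIA_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".au", ".mpg")

# Matches href="http://..." attributes written with double quotes.
HREF_PATTERN = re.compile(r'(href\s*=\s*")(http://[^"]+)(")', re.IGNORECASE)


def translate_hyperlinks(html, rm_prefix=RM_PREFIX):
    """Prefix each absolute http hyperlink with the Revision Manager URL."""

    def rewrite(match):
        url = match.group(2)
        if url.lower().endswith(MEDIA_EXTENSIONS):
            return match.group(0)  # multimedia URLs are left untouched
        return match.group(1) + rm_prefix + url + match.group(3)

    return HREF_PATTERN.sub(rewrite, html)
```

Applied to a page containing the hyperlink from the example, the function yields the translated hyperlink shown above, so that a later click on the link is routed through the Revision Manager.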

The attached short form lists several options that the Revision Manager offers. The altered document then is returned 14 to the client browser 2. A user uses the attached short form to enable the automatic updating service, which will make the browser automatically display the changed document in a new window after the document has been updated at the remote server site.

Turning now to FIG. 4, the Revision Manager 1 has been expanded to show some of its internal operations and components. RM.sub.-- cacheParse.pl 17 does the same thing as RM.sub.-- route 16, except that it does not get the changed document from the remote server 4; instead, it gets the document from a cached file in a cache 19 on disk. The Revision Manager 1 would also accept as future extension 18 any additional CGI services that might be developed for interfacing with an HTTPD server.

Turning now to FIG. 7, there is an illustration of the data flow to and from the Polling Daemon. As described above, the client browser 2 first communicates with the Revision Manager Daemon to identify a document of interest for which the user would like updates to be sent automatically. The Revision Manager Polling Daemon 7 is spawned by the Revision Manager Daemon at start-up time. The Revision Manager Polling Daemon periodically and spontaneously scans the root directory and any subdirectories of the cache 19 to find any cached documents. Associated with the cache 19 is a set of client files 290. Each document in the cache has its own client file containing a list of its registered clients (interested parties). The Revision Manager Polling Daemon opens the client file of each cached document to find out whether there are still any interested parties. If there is at least one client which is due to be notified in case the document has been changed, it issues a GET command 20 for the document to the remote server with an If-Modified-Since header. The time field is the Last-Modified time from the remote server when the document was previously obtained. If the document has been modified, the remote server 4 will send back the entire updated document 21; otherwise, the remote server just sends back a 304 status code, which means that the file has not been modified. If the file has been modified, the Revision Manager Daemon saves 22 the updated document into the cache file. Then, for each interested party which is due for an update, the Polling Daemon 7 sends a CCI command 23 to the client browser 2 instructing the client browser to open and display the updated document including the appended Revision Manager form. If any client browser of an interested party has exited already, the Polling Daemon 7 deletes this client from the saved client file. If all clients have exited, the document is deleted from the cache.

When a browser receives a CCI command, it automatically will issue a GET command 24 to the Revision Manager Daemon 6 to retrieve the updated document. Since the updated document is in the cache, the Revision Manager Daemon 6 executes 25 the RM.sub.-- cacheParse.pl 27. The Parse HTML Script service 27 gets the document text of the updated document from a file in the cache 19, parses the text and prints it to the Revision Manager Daemon 6. The Revision Manager Daemon 6 prints the text to the client browser 2, so that the client browser displays the text of the updated document in a new window.
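The conditional request issued by the Polling Daemon can be illustrated with a short Python sketch. It only builds the request text and interprets the status code; no network connection is made. The function names and the dictionary used as a cache are hypothetical stand-ins for the cache files described above.

```python
from email.utils import formatdate


def build_conditional_get(host, path, last_modified_epoch):
    """Return the text of a GET request carrying an If-Modified-Since header."""
    # The time field is the Last-Modified time saved when the document was
    # previously obtained, formatted as an HTTP date in GMT.
    stamp = formatdate(last_modified_epoch, usegmt=True)
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        f"If-Modified-Since: {stamp}\r\n"
        "\r\n"
    )


def apply_poll_response(status, body, cache, url):
    """Update the cache on a 200 response; report whether the document changed."""
    if status == 304:   # not modified: keep the cached copy
        return False
    if status == 200:   # modified: the body is the entire updated document
        cache[url] = body
        return True
    raise ValueError(f"unexpected status {status}")
```

A return value of True would correspond to the case in which interested clients that are due for an update are sent a CCI command.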

FIG. 8 shows the starting of the Revision Manager Daemon process. A Revision Manager Daemon is an extended HTTPD server. It differs from an NCSA HTTPD server because it maintains a cache directory referring to cached documents. It also differs from the standard CERN HTTPD server because: first, it does not cache every file it fetches; second, it is not required to be run as a proxy server; and third, it spontaneously updates its cache for documents of interest. The Revision Manager Daemon acts as both a server and a client. It is a server when accepting HTTP requests from browser clients connecting to it, but it acts like a client to the remote servers that it connects to in order to retrieve the documents for its own clients. Moreover, a document returned from a Revision Manager Daemon to the client has a short form attached to it which provides more options and services to its user about the requested document.

As shown in FIG. 8, when a Revision Manager Daemon process starts, like all other HTTPD servers, it reads command line arguments (step 30) and a server configuration file (step 31) in order to customize itself to reflect the configuration of the host system and how it should act as a server in response to a host system such as a client browser. At this stage, it is no different from either NCSA or CERN HTTPD version 3.0 servers.

After initialization, the Revision Manager splits into two processes: an HTTPD server process (parent in step 32) and the Polling Daemon process 7 (a child process). The Polling Daemon process is an infinite polling loop further described below with reference to FIG. 18.

From this point on, the Revision Manager acts differently from both the NCSA and CERN servers. In FIG. 8, the parent process maintains a listening loop (block 33). When a request 34 comes in from a browser and is accepted in step 35, execution forks to a child process 36 to handle the request, and the parent process 37 returns to step 33 to process subsequently-received requests.

The forked child process 36 continues in FIG. 9. The child process first gets the client's IP address in step 38 and saves it to a variable called "client.sub.-- addr". Next, in step 39, the child process reads the headers of the HTTP request. If the request is to get a local file, it returns the file as would a normal HTTPD server. If the request is for a CGI script, then in step 40 the child process examines the method of the request. If the method of the request is POST, then execution of the child process branches to step 41. In step 41, the child process reads the data field of the HTTP request and saves the data into a buffer called "rm.sub.-- data". In the data, there are Revision Manager-specific name-value pairs. They are "url.sub.-- poll," "url.sub.-- get," "port.sub.-- number" and "update.sub.-- interval." "url.sub.-- poll" indicates whether this document needs the updating service or not. In step 42, execution of the child process branches depending on whether this document needs the updating service. If "url.sub.-- poll" is set, indicating that this document needs the updating service, then in step 43, a flag "rm.sub.-- do.sub.-- cache" is set; otherwise, in step 44, the flag "rm.sub.-- do.sub.-- cache" is cleared. A variable "url.sub.-- get" contains the URL absolute address that the Revision Manager needs to get from the remote server. In step 45, the content of a variable "url.sub.-- get" is saved in a variable called "rm.sub.-- url". Also in step 45, "port.sub.-- number" data is loaded into a variable "client.sub.-- port", and "update.sub.-- interval" data is loaded into a variable called "client.sub.-- interval".

The variable "port.sub.-- number" is the CCI number that browsers such as Mosaic use. It is used by Mosaic to communicate with other agents. The variable "update.sub.-- interval" shows how often the user wishes to have the document checked for an update. Because the frequency of modification of a document is uncontrollable by a browser, it is desirable for many users that, no matter how often a document changes, the browser gets its updates on a fixed periodic schedule. In steps 46 and 47, this interval, in the variable "client.sub.-- interval", is limited to a configurable minimum value.
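As a rough illustration of steps 41 to 47, the following Python sketch decodes the form data and loads the variables named above. Underscores replace the ".sub.--" notation of the text. The function name, the use of a dictionary, and the default minimum of 10 seconds are assumptions; the text only states that the minimum is configurable.

```python
from urllib.parse import parse_qs

MIN_INTERVAL_SECONDS = 10  # assumed default for the configurable minimum


def load_rm_variables(rm_data):
    """Decode the POST data field and derive the Revision Manager variables."""
    fields = parse_qs(rm_data)  # blank values are dropped, so they count as unset

    def first(name, default=""):
        return fields.get(name, [default])[0]

    interval = int(first("update_interval", "0"))
    return {
        "rm_do_cache": "url_poll" in fields,         # steps 42 to 44
        "rm_url": first("url_get"),                  # step 45
        "client_port": first("port_number"),         # step 45
        "client_interval": max(interval, MIN_INTERVAL_SECONDS),  # steps 46, 47
    }
```

With this sketch, a submitted interval of 3 seconds is raised to the minimum, while an interval of 30 seconds is kept as entered.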

If in step 40 the child process finds that the method of the incoming request is not "POST", then in step 48 the child process branches depending on whether the method of the request is "GET". If so, then in step 49 the argument of the "GET" command is saved in the variable "rm.sub.-- url," and the rest of Revision Manager variables are not set. If not, then in step 50, the cache flag "rm.sub.-- do.sub.-- cache" is cleared to disable caching. Methods other than "POST" and "GET" (block 50) can readily be defined and implemented to augment the Revision Manager services. After all of the Revision Manager variables have been loaded, the cache processing of the Revision Manager Daemon is performed in step 51.

FIG. 10 shows in detail the cache processing step 51 introduced in FIG. 9. After the method of a request is processed, in step 51 the Revision Manager Daemon examines whether the requested document was cached previously in a local file. This question is answered by converting the variable "rm.sub.-- url" to a full file path name and checking for the existence of this file. First, in step 52, "rm.sub.-- url" is checked to ensure it has valid contents. Next, in step 53, a URL converter translates the URL to a full file path name.

For example, the URL

http://www.teknowledge.com/company/Patent.html

is converted to the filename

/cache-root/http/www.teknowledge.com/company/Patent.html

where "cache-root" is an alias to a directory name that is configured in the server's configuration file.
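A minimal Python sketch of such a URL converter follows. The function name `url_to_cache_path` is hypothetical, and the sketch does not guard against path components such as ".." that a production converter would need to reject.

```python
from urllib.parse import urlsplit

CACHE_ROOT = "/cache-root"  # alias configured in the server's configuration file


def url_to_cache_path(url, cache_root=CACHE_ROOT):
    """Translate an absolute URL to a full file path name under the cache root."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {url!r}")
    # scheme, host, and path become successive directory levels
    return f"{cache_root}/{parts.scheme}/{parts.netloc}{parts.path}"
```

For the URL of the example, the function returns the filename shown above.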

Because the Polling Daemon may attempt to access the same file concurrently with the Revision Manager Daemon, in step 54 a file lock is implemented to synchronize the file access. Locking is achieved by creating a lock file with a filename made by adding the suffix ".lock" to the cache file name. The lock file for the above example would be:

/cache-root/http/www.teknowledge.com/company/Patent.html.lock
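One common way to implement a lock file of this kind is exclusive file creation, sketched below in Python. The function names, the retry count, and the delay are assumptions for illustration; the text does not specify how the lock file is created or how long a process waits.

```python
import os
import time


def acquire_lock(cache_path, retries=5, delay=1.0):
    """Create '<cache_path>.lock' atomically; return its name, or None on timeout."""
    lock_path = cache_path + ".lock"
    for _ in range(retries):
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(delay)  # another process holds the lock; wait and retry
            continue
        os.close(fd)
        return lock_path
    return None


def release_lock(lock_path):
    """Unlock by deleting the lock file."""
    os.remove(lock_path)
```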

If the lock file already exists, then the Polling Daemon waits a few seconds to finish processing and proceeds. A client file is created the same way as a lock file. In the above example, the client file would be:

/cache-root/http/www.teknowledge.com/company/Patent.html.clients

There is one client file per document cache file. Each client file contains information for multiple clients which are registered with the cache file. In this client file, each client's IP address, port number, update interval and current time are saved. The current time is saved as the last time a browser was updated. In particular, in step 55, the Revision Manager Daemon searches the client file for the client's entry, and if the entry is found, the last update time is changed to the current time and no other change is made. If the client entry is not found, the client is added to the client file.
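The client file update of step 55 might look like the following Python sketch. The on-disk layout (one whitespace-separated line per client holding IP address, port number, update interval, and last update time) is an assumption, as is the function name; the text lists the saved fields but not their format.

```python
import os


def register_client(client_file, ip, port, interval, now):
    """Add a client entry, or refresh the last update time of an existing one."""
    entries = []
    if os.path.exists(client_file):
        with open(client_file) as handle:
            entries = [line.split() for line in handle if line.strip()]

    for entry in entries:
        if entry[0] == ip and entry[1] == str(port):
            entry[3] = str(now)  # existing client: only the time changes
            break
    else:
        # no matching entry was found, so the client is added
        entries.append([ip, str(port), str(interval), str(now)])

    with open(client_file, "w") as handle:
        for entry in entries:
            handle.write(" ".join(entry) + "\n")
    return entries
```

A client is identified here by its IP address together with its port number, so a second registration from the same browser leaves the stored update interval unchanged.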

In step 56, the Revision Manager Daemon checks the cache to determine whether the cache file is in the cache. If the cache file is found, then in step 57, a cache lookup flag is set to "FOUND". Otherwise, in step 58, the cache file is created, and in step 59, the cache lookup flag is set to "CREATE". In step 60, an environment variable, "RM.sub.-- CACHE" is set to be the cache filename. If the method of the request is "POST" (step 61) and the cache lookup flag is "FOUND" indicating that the cache file exists (step 62), then in step 63, the selected CGI script is changed from the "RM.sub.-- route.pl" process to the "RM.sub.-- cacheParse.pl" process. Both of these processes are Revision Manager Daemon CGI script processes. Because "RM.sub.-- cacheParse.pl" expects a "GET" command, in step 64 the Revision Manager Daemon changes "POST" to "GET" and creates a "query.sub.-- string" for "RM.sub.-- cacheParse.pl", and in step 65 sets "content.sub.-- type" and "content.sub.-- length" to zero. Finally, in step 66, the Revision Manager Daemon processes the CGI script. The selected CGI script process is "RM.sub.-- route.pl" unless step 63 was performed, changing "RM.sub.-- route.pl" to "RM.sub.-- cacheParse.pl".
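The decisions of steps 56 to 65 can be summarized by the Python sketch below. The two function names are hypothetical, and underscores replace the ".sub.--" notation in the script names. The sketch follows the rule as stated above: only a "POST" request for a document already in the cache is switched to the cache-parsing script.

```python
import os


def lookup_cache(cache_file):
    """Return "FOUND" if the cache file exists; otherwise create it and return "CREATE"."""
    if os.path.exists(cache_file):
        return "FOUND"
    os.makedirs(os.path.dirname(cache_file) or ".", exist_ok=True)
    open(cache_file, "a").close()  # create an empty cache file
    return "CREATE"


def select_cgi_script(method, lookup_flag):
    """Return (script, method) as chosen in steps 61 to 64."""
    if method == "POST" and lookup_flag == "FOUND":
        # the cache-parsing script expects a "GET" command
        return "RM_cacheParse.pl", "GET"
    return "RM_route.pl", method
```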

FIG. 11 shows how a CGI script is processed by the Revision Manager Daemon. Before spawning a child, in step 67 the Revision Manager Daemon sets a group of environment variables needed before an HTTPD server executes any CGI script program. Environment variables include "request.sub.-- method", "query.sub.-- string", "content.sub.-- type" and "content.sub.-- length". "request.sub.-- method" indicates which method is requested by the client. "query.sub.-- string" contains data information for the "GET" method. "content.sub.-- type" and "content.sub.-- length" contain data for the "POST" method. "content.sub.-- type" refers to a CGI Form service (which is one type of CGI script service), and "content.sub.-- length" tells a CGI script how many bytes of data it should read from its standard input channel.

Next, in step 68, a pair of pipes is established for the communication between a child process and its parent process, and the child process 69 is forked and spawned. The child process 69 executes the selected script, and the selected script writes out a "result" document through its standard output. This standard output is the input pipe of the child's parent. There are two CGI scripts defined: "RM.sub.-- route.pl" and "RM.sub.-- cacheParse.pl". As noted above with respect to step 63 of FIG. 10, "RM.sub.-- route.pl" is selected unless step 63 is reached, changing the selected script to "RM.sub.-- cacheParse.pl". Therefore, if "RM.sub.-- route.pl" has been selected, then execution is transferred in step 69 to an entry point 77 of the "RM.sub.-- route.pl" routine beginning in FIG. 12, and if "RM.sub.-- cacheParse.pl" has been selected, then execution is transferred in step 69 to an entry point 137 of the "RM.sub.-- cacheParse.pl" routine beginning in FIG. 17.

If "content.sub.-- length" is not zero, as tested in step 70, then in step 71 the parent process writes out "rm.sub.-- data" to the pipe. The child process reads its standard input to get the contents of "rm.sub.-- data". In step 72 of FIG. 11, the parent process reads the "result" document from the pipe. Then in step 73, the parent process captures the "Last-Modified" time stamp, and saves this time stamp into a cache information file such as:

/cache-root/http/www.teknowledge.com/company/Patent.html.cache.sub.-- info

The document is then sent back to the browser.

One cache information file is established per directory of monitored WWW files. Inside the file, each cached file is represented by one line of data which contains the file name and last-modified time. After saving the cache information file and sending the WWW document back to the client browser, the parent process waits for the child process to exit (step 74), unlocks the lock file (step 75), and exits (step 76).
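The parent and child exchange of FIG. 11 can be approximated in Python with the standard `subprocess` module, which sets up the pipes and spawns the child in one call. This is a sketch under stated assumptions rather than the daemon's code: the child is a small stand-in script (`CHILD_SOURCE`) instead of an actual CGI script, and the function name `run_cgi` is hypothetical.

```python
import os
import subprocess
import sys

# Stand-in for a CGI script: echoes the method and the bytes read from stdin.
CHILD_SOURCE = (
    "import os, sys\n"
    "n = int(os.environ.get('CONTENT_LENGTH') or 0)\n"
    "data = sys.stdin.read(n)\n"
    "sys.stdout.write(os.environ['REQUEST_METHOD'] + ':' + data)\n"
)


def run_cgi(method, rm_data="", query_string=""):
    """Set the CGI environment, run the child, and return its "result" document."""
    env = dict(os.environ)
    env.update({
        "REQUEST_METHOD": method,
        "QUERY_STRING": query_string,
        "CONTENT_TYPE": "application/x-www-form-urlencoded" if rm_data else "",
        "CONTENT_LENGTH": str(len(rm_data)),
    })
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=rm_data,        # written to the child's standard input pipe
        capture_output=True,  # the child's standard output is read by the parent
        text=True,
        env=env,
        check=True,           # also waits for the child to exit
    )
    return completed.stdout
```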

Turning now to FIG. 12, there is shown a flowchart of a portion of the "RM.sub.-- route" CGI script process of the Revision Manager, beginning with an entry point "RM.sub.-- route.pl" 77. "RM.sub.-- route" services the Revision Manager form "rm.html" (301 in FIG. 5) and various short forms attached to Revision Manager-monitored documents. An example of a short form 302 is shown in FIG. 6.

The function of "RM.sub.-- route.pl" is to issue an HTTP request, get the requested document from a remote server, save the document to the cache file if requested by the user, and then translate all hyperlinks contained within the document, except for multimedia hyperlinks. The Revision Manager URL address is attached to the parsed hyperlinks as a prefix. All multimedia hyperlinks are then saved into a file with ".images" appended to the cache file name, such as:

/cache-root/http/www.teknowledge.com/company/Patent.html.images.

A short form is also attached to the document according to its current state. There are four states for short forms: document requested, acknowledgment of document update requested, document update in progress, and document updated. Finally, "RM.sub.-- route.pl" sends the results document to its standard output, which is connected to the Revision Manager Daemon's pipe. The WWW document is then sent back to the client browser by the Revision Manager Daemon.

When "RM.sub.-- route.pl" begins in step 78 of FIG. 12, it gets the input "request.sub.-- method", and inspects the "request method" in step 79 to decide how to access the data. If the "request method" is "POST", then the POST input string is parsed in step 80; otherwise, the "GET" input string is parsed in step 81. According to the CGI protocol, in step 80 a CGI process gets data from its standard input when the command is "POST". The data length is determined by the environment variable "CONTENT.sub.-- LENGTH". If the command is "GET", the process gets its data in step 81 through a command line argument or the environment variable "QUERY.sub.-- STRING". For either command, the data includes values for "port.sub.-- number", "url.sub.-- get", "url.sub.-- poll", "url.sub.-- previous" and "update.sub.-- interval". "Port.sub.-- number" refers to the CCI port number of a client browser, such as Mosaic. The combination of a port number and a host computer's IP address uniquely identifies a client. This also offers a way for the Revision Manager to communicate with the client. "url.sub.-- get" contains the URL requested from a client. "url.sub.-- poll" is a flag to inform the Revision Manager that the user wants to be notified if the contents of the document identified by the given URL have changed. "url.sub.-- previous" remembers the currently displayed document's URL. "update.sub.-- interval" indicates how long the time interval should be between two consecutive updates.
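The CGI input convention described above can be sketched in Python as follows. The function name `read_cgi_fields` is hypothetical, and the sketch reads only from the environment variable "QUERY_STRING" for a "GET" command, leaving out the command line argument alternative.

```python
from urllib.parse import parse_qs

FIELD_NAMES = ("port_number", "url_get", "url_poll", "url_previous", "update_interval")


def read_cgi_fields(environ, stdin):
    """Return the Revision Manager fields from a POST body or a GET query string."""
    if environ.get("REQUEST_METHOD") == "POST":
        # POST: read exactly CONTENT_LENGTH characters from standard input
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the data arrives in the QUERY_STRING environment variable
        raw = environ.get("QUERY_STRING", "")
    parsed = parse_qs(raw, keep_blank_values=True)
    return {name: parsed.get(name, [""])[0] for name in FIELD_NAMES}
```

In an actual CGI process the arguments would be `os.environ` and `sys.stdin`; passing them in explicitly keeps the sketch easy to exercise with an in-memory stream.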

Since update intervals smaller than 10 seconds are generally not feasible in the Internet environment due to bandwidth-induced response delays and tend to flood the network, the update interval is compared to a minimum of 10 seconds in step 82, and if the update interval is less than 10 seconds, the update interval is set to 10 seconds in step 83.

In step 84, the browser port number is obtained from the input data. In step 85, execution branches on the state of the "url.sub.-- poll" flag. If "url.sub.-- poll" is set, then in step 86, the poll document address is assigned to the request address since the request address is the URL to request for a polling service, and in step 87, a poll flag is set ON. Otherwise, if "url.sub.-- poll" is not set, then "url.sub.-- get" is inspected in step 88 to determine whether "url.sub.-- get" is loaded with a value; if so, then a new document is being requested by the user, and in step 89 the new document address is assigned to the request address. Otherwise, if "url.sub.-- poll" is not set and "url.sub.-- get" is empty, then "url.sub.-- previous" contains the requested document, so in step 91 the current document address is assigned to the request address. Since "url.sub.-- poll" is not set in the latter two situations, the "poll" flag is set to be "OFF" in step 90 or 92. In step 93, "RM.sub.-- route.pl" transfers execution to a procedure "RM.sub.-- getPage" to request the HTML document specified by the request address from a remote server.
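The branching of steps 85 to 92 reduces to a three-way choice, sketched below in Python. The function name is hypothetical. The sketch also assumes that the "url_poll" field itself carries the poll document address when it is set; the text calls "url_poll" a flag and does not state where the poll document address is stored.

```python
def choose_request_address(url_poll, url_get, url_previous):
    """Return (request_address, poll_flag) following steps 85 to 92."""
    if url_poll:
        # steps 86 and 87: polling service requested for this address
        return url_poll, True
    if url_get:
        # steps 89 and 90: a new document is being requested
        return url_get, False
    # steps 91 and 92: fall back to the currently displayed document
    return url_previous, False
```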

FIG. 13 shows the steps in the procedure "RM.sub.-- getPage" to request the HTML document from a remote server. In step 94, the procedure receives a URL from its command line argument. In step 95, the procedure sets the "redirect" flag to trigger document fetching.

In step 96, the procedure "RM.sub.-- getPage" begins a routine to fetch a document. First, in step 96, the routine checks the "redirect" flag, and if it is set, in step 97, the procedure connects to the remote server and sends a "GET" command with a complete listing of HTTPD acceptance parameters. In step 98, the procedure then checks for error conditions when a response is received, and if the document retrieval was successful, then in step 99 the procedure saves the document into a buffer by assigning the HTML text to a local variable, and in step 100 the procedure checks whether a redirection response is received. A redirection response is sent by an HTTPD server when a URL no longer points to a valid document and a Location header advises the requester where the new location is. The redirection response is a particular line in the HTML header. A requester that receives this type of response should issue another request with the correct location to get the document, which is done in step 101 by setting the new address to the address in the redirect line in the HTML header, and in step 102 by enabling the redirect flag. If a redirection response is not received, then in step 103 the redirect flag is disabled. After step 102 or step 103, execution returns to step 96 whether or not a redirection occurs. If the remote server returns an error condition, then in step 104 the process prints or otherwise flags an error condition and exits in step 105.
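The fetch loop of FIG. 13 can be sketched in Python as shown below. The sketch assumes that the redirect flag is cleared when no redirection response is received, which the return to step 96 implies. The function name `get_page`, the injected `fetch` callable returning a (status, headers, body) tuple, and the limit on the number of redirections are all illustrative assumptions and do not appear in the text.

```python
def get_page(url, fetch, max_hops=5):
    """Fetch `url`, following redirections; return (final_url, body)."""
    redirect = True  # step 95: set the flag to trigger the first fetch
    body = None
    hops = 0
    while redirect:  # step 96
        status, headers, body = fetch(url)  # step 97
        if status >= 400:  # steps 98, 104, 105
            raise RuntimeError(f"error {status} fetching {url}")
        if status in (301, 302) and "Location" in headers:  # step 100
            url = headers["Location"]  # step 101: adopt the new address
            redirect = True            # step 102
            hops += 1
            if hops > max_hops:  # guard against redirect loops, not in the text
                raise RuntimeError("too many redirections")
        else:
            redirect = False  # no redirection: leave the loop
    return url, body
```

Passing the fetch operation in as a parameter lets the loop be exercised with canned responses before it is connected to a real HTTP client.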

After a valid document is returned, the docu