Access augmentation or optimizing

Information filtering apparatus for selecting predetermined article from plural articles to present selected article to user, and method therefor

5907836

Abstract

An information filtering apparatus for receiving articles, such as texts or images, from information sources to select predetermined articles from the supplied articles to a user has a storage portion, an article retrieving portion, a determining portion and a presentation portion. The storage portion stores retrieving conditions previously specified for each user. The article retrieving portion retrieves the supplied articles to select articles which satisfy the retrieving conditions for each user. The determining portion calculates similarities among the articles selected by the article retrieving portion or the selected articles and other articles to determine relevant articles for each article in accordance with the similarities. The presentation portion adds information of the determined relevant articles to the selected articles to present the articles and information to the user.


Claims

What is claimed is:

1. An information filtering apparatus comprising:

means for receiving articles from information sources;

means for storing retrieval conditions previously specified for a user;

article retrieving means for retrieving the articles received by said receiving means to select articles which match the retrieval conditions stored in said storing means for the user;

determining means for determining relevant articles for each article selected by said article retrieving means by calculating similarities among the articles selected by said article retrieving means or calculating similarities among the selected articles and other articles received by said receiving means;

output means for outputting the selected articles with information of the determined relevant articles; and

means for outputting the selected articles by retrieval conditions specified in a single language by another language calculating similarities among the retrieval conditions replaced into the another language and the articles retrieved by said article retrieving means and written in a different language to select articles in accordance with the similarities.

2. An information filtering apparatus according to claim 1, further comprising means for outputting the selected articles with information of retrieval conditions under which the articles have been selected by said article retrieving means to each article.

3. An information filtering apparatus according to claim 1, further comprising means for generating a summary having a length corresponding to a type of each article selected by said article retrieving means and means for outputting the summary generated by said means for generating to the user.

4. An information filtering apparatus comprising:

means for receiving articles from information sources;

means for storing retrieval conditions previously specified for a user;

article retrieving means for retrieving supplied articles to select articles which match the retrieval conditions for the user so as to present the articles to the user; and

means for generating a summary having a length corresponding to similarities among retrieval conditions and a type of the article selected by said article retrieving means to present the summary to the user.

5. An information filtering apparatus, comprising:

means for receiving articles from information sources;

means for storing a retrieval condition;

first calculating means for calculating first similarities between the retrieval condition stored in said storing means and the articles received by said receiving means;

output means for sorting the articles received by said receiving means based on the first similarities and for extracting at least one of 1) a predetermined number of articles selected from articles having a highest first similarity and 2) articles which have the first similarities greater than a predetermined threshold;

second calculating means for calculating second similarities between the articles extracted by said output means based on a field of the articles including at least one of a first sentence, a first paragraph, and a caption; and

means for grouping the articles extracted by said output means based on the second similarities.

6. An information filtering apparatus, comprising:

means for receiving articles from information sources;

means for storing a retrieval condition;

means for calculating first similarities between the retrieval condition stored in said storing means and the articles received by said receiving means;

output means for sorting the articles received by said receiving means based on the first similarities and for extracting at least one of 1) a predetermined number of the articles selected from articles having a highest first similarity and 2) articles which have the first similarities greater than a predetermined threshold;

output article storage means for storing the articles extracted by said output means;

means for calculating second similarities between the articles stored by said output article storage means and the articles which are extracted by said output means; and

means for grouping the articles stored by said output article storage means and the articles which are extracted by said output means based on the second similarities, the articles grouped having information when the article is received by said receiving means.

7. An information filtering apparatus comprising:

means for receiving articles from information sources;

means for storing retrieval conditions;

means for calculating similarities among retrieval conditions and supplied articles to retrieve articles by a specified number or articles having similarities greater than a predetermined threshold in accordance with the calculated similarities;

means for outputting the retrieved documents to a user; and

means for receiving articles written in a different language, replacing retrieval conditions specified in a single language by another language to calculate similarities with the articles to output the article written in the different language to the user while being mixed with the documents to be output by said outputting means.

8. An information filtering apparatus, comprising:

means for receiving articles from information sources;

means for storing a retrieval condition;

means for calculating similarities between the retrieval condition stored in said storing means and the articles received by said receiving means;

output means for sorting the articles received by said receiving means based on the similarities and for extracting at least one of 1) a predetermined number of the articles selected from articles having a highest similarity and 2) articles which have the similarities greater than a predetermined threshold; and

means for changing at least one of 1) the predetermined threshold and 2) the retrieval conditions based on a total number of the articles extracted by said output means.

9. An information filtering apparatus, comprising:

means for receiving articles from information sources;

means for storing a retrieval condition; and

filtering means for calculating similarities between the retrieval condition stored in said storing means and the articles received by said receiving means using a combination of a character string matching scheme and a word matching scheme; and

means for extracting the articles received by said receiving means based on the similarities calculated by said filtering means.

10. An information filtering method comprising:

a step for receiving articles from information sources;

a step for storing retrieval conditions previously specified for a user;

an article retrieving step for retrieving the supplied articles to select articles which satisfy the retrieval conditions for the user;

a determining step for determining relevant articles for each article in accordance with the similarities by calculating similarities among the articles selected in said article retrieving step or similarities among the selected articles; and

an output step for outputting the articles with information of the determined relevant articles.

11. An information filtering method comprising:

a step for periodically receiving articles from information sources;

a step for calculating similarities among retrieval conditions previously specified by a user and supplied articles;

a step for sorting the articles in a descending order of the similarities calculated in said step and selecting articles by a predetermined number or only articles having similarities greater than a predetermined threshold;

an output article storage step for storing articles output to the user as a result of filtering;

a step for collecting articles stored in said output article storage step and articles supplied this day to calculate similarities among the articles so as to form the articles into groups or making the articles to be related to one another so as to output the articles to the user; and

a step for adding, to each article to be output, information whether the articles are in a group consisting of only articles supplied this day or a group including previous articles.

12. An information filtering method comprising:

a step for receiving articles from information sources;

a step for calculating similarities among retrieving conditions previously specified by a user and supplied articles to retrieve articles by a specified number or articles having similarities greater than a predetermined threshold in accordance with the calculated similarities; and

a step for receiving articles written in a different language, replacing retrieving conditions specified in a single language by another language to calculate similarities with the articles to present the article written in the different language to the user while being mixed with the articles to be presented.

13. An information filtering apparatus, comprising:

means for receiving first articles from information sources;

means for storing retrieval conditions previously specified for a user;

article retrieving means for retrieving the first articles received by said means for receiving in order to select second articles which match the retrieved conditions stored in said means for storing; and

means for outputting the second articles written in a different language by replacing the retrieval condition specified in a first language with the different language and by calculating similarities among the retrieval conditions replaced into the different language and the second articles to select third articles using the similarities among the retrieval conditions.

14. An information filter apparatus according to claim 13, further comprising:

determining means for determining relevant articles from said second articles by calculating at least one of similarities among the second articles and similarities among the second articles and other articles received by said means for receiving; and

output means for outputting the second articles with information about the determined relevant articles.

15. An information filtering apparatus according to claim 13, further comprising:

means for outputting the second articles with information about the retrieval conditions under which the second articles have been selected by said article retrieving means.

16. An information filtering apparatus according to claim 13, further comprising:

means for generating a summary having a length corresponding to a type of an article selected from said second articles; and

means for outputting to the user the summary generated by said means for generating.

17. An information filtering apparatus according to claim 14, further comprising:

means for forming the second articles into groups of articles by relating each of the second articles to one another using the similarities calculated by said determining means.

18. An information filtering apparatus according to claim 13, wherein said means for outputting outputs the third articles in a descending order using the similarities among the retrieval conditions.


Description

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an information filtering apparatus for selecting articles from a great quantity of text articles that are required by a user or that meet requirements and interests of the user so as to periodically present the selected articles to the user, and a method therefor.

2. Description of the Related Art

In recent years, wide use of word processors and computers and that of electronic mail and electronic news through computer networks has resulted in electronic documents being spread rapidly.

As the term "electronic publication" suggests, the contents of newspapers, magazines and books can be expected to be presented generally in electronic form. As a result, the quantity of text information available to people in real time can be expected to grow.

Therefore, there has arisen a requirement for an information filtering system or an information filtering service for selecting a text article from a great quantity of text articles of newspaper and magazines that meets requirements and interests of a user to periodically present the selected article to the user.

Conventional information filtering systems have been arranged to retrieve articles that meet a user profile expressing requirements and interests of a user and to give a presentation of the overall body of the articles while lining up the articles.

Generally, the user profile contains a plurality of topics in which the user is interested.

Moreover, a function called "Relevance Feedback" has been realized with which the availability of the presented article is determined by a user and information about the determination is reflected onto the user profile to improve the adaptability in the ensuing information filtering operations.

However, the conventional systems, having the simple structure such that the selected articles are enumerated so as to be presented to the user, have a problem in that the user cannot easily recognize the relationship between the articles presented this time and those which have previously been presented.

Moreover, the simple presentation of articles performed by the conventional systems lacks information about the topic, and the retrieval conditions of that topic, whose match caused the article to be presented to the user, and lacks information about how other users have read the presented article. Therefore, considerable labor has been required to determine the availability of the article, and consistency cannot easily be maintained.

It is effective for the information filtering system to perform double filtering such that an important article is selected and then an important text in the article is partially selected in view of effectively collecting information from a long article. However, since the conventional system has the structure such that a text having an appropriate length has been mechanically extracted, there arises a problem in that unnecessary information is sometimes included or required information is lacking.

Since the conventional system has the simple structure such that a text to be presented to the user is selected in accordance with similarity of the text supplied from a news source and the retrieval condition, texts even having the same contents are supplied in a disordered manner.

The information filtering system of the foregoing type has been structured under condition that all articles to be supplied from a news source are written in a single language (for example, English) and the system has been designed to be used in only the subject language zone. Therefore, articles obtained from a news source in another language zone cannot be mixed and supplied to the user.

To mix and supply to a user articles from a news source in a certain language zone and articles from a news source in another language zone, a system in which information filtering apparatuses realized in the respective languages are simply provided individually is insufficient. If the information filtering apparatuses are provided individually, the user is required to set retrieval conditions for each information filtering apparatus. Moreover, articles written in different languages but having the same contents sometimes coexist. Therefore, the system simply having individual information filtering apparatuses encounters a problem in that duplicate articles are supplied to a user.

An object of the present invention is to provide an information filtering apparatus capable of giving a presentation of the relevance of articles to be supplied to a user by information filtering and enabling the user to recognize the relevance of the articles, and a method therefor.

SUMMARY OF THE INVENTION

An information filtering apparatus and a method therefor according to the present invention are able to present the relativity among articles when presenting the articles to a user so as to enable the user to easily recognize the relativity among the articles.

The information filtering apparatus and the method therefor according to the present invention enable the user to detect the retrieving conditions satisfied by the presented articles so that the user understands and reliably uses the information filtering system.

The information filtering apparatus and the method therefor according to the present invention enable the length of a summary or an abstract to be presented to a user to be set in accordance with the type of the article so that double filtering is performed efficiently.

The information filtering apparatus and the method therefor according to the present invention enable articles having similar contents to be formed into groups or made to be related to one another before the articles are presented to a user. Thus, the labor required for the user to read the text articles can be reduced considerably.

The information filtering apparatus and the method therefor according to the present invention enable articles supplied from news sources in a plurality of language zones to be mixedly supplied to a user. Thus, a satisfactory retrieving process can be realized to process a variety of articles written in different languages.

The information filtering apparatus and the method therefor according to the present invention permit the retrieving conditions and the threshold of similarities to be dynamically changed so as to always present appropriate articles to a user.

The information filtering apparatus and the method therefor according to the present invention have an improved retrieving process to improve the filtering accuracy and filtering speed.

According to the present invention, there is provided an information filtering apparatus for receiving articles, such as texts and images, from a plurality of information sources to select predetermined articles from the supplied articles to present the selected articles to a user, comprising means for storing retrieving conditions previously specified for each user; article retrieving means for retrieving supplied articles to select articles which meet retrieving conditions for each user; determining means for calculating similarities among articles selected by the article retrieving means or calculating similarities among selected articles and other articles to determine relevant articles for each article; and presentation means for adding information of the determined relevant articles to the selected articles to present information and the articles to the user.

With the information filtering apparatus, the expressions of articles are compared among the articles to calculate the similarities among articles. In accordance with the similarities, the articles to be presented to the user and their relevant articles are determined. Information about the relevant articles is added to information of the body of each article to be presented to the user and supplied to the user. It is preferable that the similarities are calculated among articles supplied this time or among the articles supplied this time and previous articles. As a result, the relationship among the articles selected by the article retrieving means and the relationship among the articles selected this time and articles selected due to the previous filtering operation can be made to be clear. Thus, the relativity among the articles can be notified to the user.
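The determination of relevant articles described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not fix a similarity measure, so cosine similarity over word frequencies is swapped in here, and the names `term_vector`, `cosine`, `relevant_articles` and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Word-frequency vector; an assumed stand-in for comparing 'expressions of articles'."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two frequency vectors (0.0 when either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def relevant_articles(selected, others, threshold=0.3):
    """Map each selected article id to ids of sufficiently similar articles.

    selected: articles chosen this time (id -> text)
    others:   other articles, e.g. previously presented ones (id -> text)
    The threshold of 0.3 is an arbitrary illustrative value.
    """
    pool = {**others, **selected}
    vectors = {aid: term_vector(text) for aid, text in pool.items()}
    result = {}
    for aid in selected:
        scored = [(cosine(vectors[aid], vectors[oid]), oid) for oid in pool if oid != aid]
        result[aid] = [oid for score, oid in sorted(scored, reverse=True) if score >= threshold]
    return result
```

Each selected article can then be presented together with the identifiers (or captions) of the articles listed for it.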

By calculating the similarities among the articles selected by the article retrieving means, existence of duplicated articles is examined. Thus, information of the body of a duplicated article is not presented to the user, and information of only the captions of the duplicated articles is added as information of a relevant article so as to be presented to the user. As a result, presentation to the user of articles supplied from, for example, different information sources and having the same contents can automatically be prevented.
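A minimal sketch of this duplicate handling is given below. It assumes Jaccard overlap of word sets as the similarity measure and a threshold of 0.8, neither of which comes from the text; the field names `caption`, `body` and `related_captions` are likewise hypothetical.

```python
import re


def word_set(text):
    """Set of lowercase words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suppress_duplicates(articles, threshold=0.8):
    """Keep the first copy of each article body; attach captions of later duplicates.

    articles: list of dicts with 'caption' and 'body' keys (assumed layout).
    """
    kept = []
    for art in articles:
        words = word_set(art["body"])
        for k in kept:
            if jaccard(words, k["_words"]) >= threshold:
                # Duplicate: only its caption is carried along as related information.
                k["related_captions"].append(art["caption"])
                break
        else:
            kept.append({**art, "related_captions": [], "_words": words})
    for k in kept:
        del k["_words"]  # drop the helper field before returning
    return kept
```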

According to the present invention, there is provided an information filtering apparatus for receiving articles, such as texts and images, from a plurality of information sources to select predetermined articles from the supplied articles to present the selected articles to a user, comprising means for storing retrieving conditions previously specified for each user; article retrieving means for retrieving supplied articles to select articles which meet retrieving conditions for each user so as to present the articles to the user; and means for adding information of retrieving conditions satisfied by the articles selected by the article retrieving means to each article to present the articles and information to the user. Thus, the ground with which the articles have been selected can be notified to the user.

As a result, the retrieving condition satisfied by the articles which are being presented, such as the topic selected by the user and satisfied by the article which is being presented, can be notified to the user. Therefore, the ground of the presentation of the article can easily be understood by the user, and the user is able to easily determine the usefulness of the article.
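As an illustration of attaching the satisfied retrieving conditions to an article, the sketch below treats each topic as a list of keywords and reports every topic with at least one keyword in the article. The keyword representation and the names `matched_topics` and `present` are assumptions made for the example only.

```python
import re


def matched_topics(article_text, topics):
    """Return names of topics having at least one keyword in the article.

    topics: dict mapping a topic name to a list of keywords (assumed representation).
    """
    words = set(re.findall(r"\w+", article_text.lower()))
    return [name for name, keywords in topics.items()
            if words & {k.lower() for k in keywords}]


def present(article_text, topics):
    """Prefix the article with the topics that caused it to be selected."""
    hits = matched_topics(article_text, topics)
    header = "Matched topics: " + (", ".join(hits) if hits else "none")
    return header + "\n" + article_text
```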

Therefore, a relevance feedback function is further provided in which information whether the articles supplied to the user have been useful for the user is fed back from the user to modify the retrieving conditions while reflecting the information item. Thus, the ground of the selection of the article can effectively be used in the relevance feedback function.
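One way such feedback could modify the retrieving conditions is sketched below. The text does not state an update rule, so a simple Rocchio-style weight adjustment is swapped in; the profile layout (term to weight), the function name `apply_feedback` and the `rate` parameter are all hypothetical.

```python
def apply_feedback(profile, article_terms, useful, rate=0.5):
    """Return a new profile after one piece of user feedback.

    profile:       dict term -> weight (assumed form of the retrieving conditions)
    article_terms: dict term -> frequency in the judged article
    useful:        True if the user found the article useful, else False
    Terms whose weight drops to zero or below are removed from the profile.
    """
    sign = 1.0 if useful else -1.0
    updated = dict(profile)
    for term, freq in article_terms.items():
        weight = updated.get(term, 0.0) + sign * rate * freq
        if weight > 0:
            updated[term] = weight
        else:
            updated.pop(term, None)
    return updated
```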

Presenting to the user how other users have read the articles being presented, in place of the ground of the selection of the article, enables the relevance feedback to be performed with reference to the determinations of other users. Thus, the relevance feedback can effectively be used.

According to the present invention, there is provided an information filtering apparatus for receiving articles, such as texts and images, from a plurality of information sources to select predetermined articles from the supplied articles to present the selected articles to a user, comprising: means for storing retrieving conditions previously specified for each user; article retrieving means for retrieving supplied articles to select articles which meet retrieving conditions for each user so as to present the articles to the user; and means for generating a summary or an abstract having a length corresponding to the type of the article selected by the article retrieving means to present the summary or the abstract to the user.

As a result of the foregoing structure, the summary or the abstract having the length corresponding to the type of the article is generated and presented to the user. Therefore, the ratio of text information which is useful for the user is raised among the text to be presented to the user. As a result, effective information collection can be performed.

It is preferable that the classification of the types of the articles be the difference in the retrieving conditions of the topic satisfied by the article or the difference in the attribute of the article, such as the date of publication. In a case where the user has specified a plurality of topics as the retrieving conditions and priority order is given to the topics, the length of the summary or the abstract can be elongated when the article satisfying the topic having high priority is retrieved. Thus, the ratio of text information useful for the user can be raised.
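The priority-dependent summary length can be illustrated as follows. The mapping from priority to sentence count and the use of leading sentences as the summary are assumptions for this sketch; the text only says that a higher-priority topic yields a longer summary.

```python
import re


def summary_length(topic_priority, base_sentences=1, max_sentences=5):
    """Number of summary sentences for a topic priority (1 = highest priority).

    The linear mapping is illustrative, not taken from the text.
    """
    return max(base_sentences, max_sentences - (topic_priority - 1))


def lead_summary(body, n_sentences):
    """Take the first n sentences of the body as a crude summary."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", body.strip()) if s.strip()]
    return " ".join(sentences[:n_sentences])
```

For example, an article matching the top-priority topic would get up to five leading sentences, while one matching a low-priority topic would get a single sentence.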

According to the present invention, there is provided an information filtering apparatus having means for receiving articles from one or more information sources; means for calculating similarities among retrieving conditions previously specified by a user and the supplied articles; output means for sorting articles in a descending order of the calculated similarities to output articles by a predetermined number or only articles having similarities greater than a predetermined threshold in the descending order in terms of the similarity, comprising means for calculating the similarities among articles output from the output means; and means for forming the articles into groups, making the articles to be related to one another or controlling selection of articles to be output in accordance with the similarities among articles calculated by the means.

According to the present invention, related articles can be formed into groups or made to be related to one another before the articles are presented to the user. In a case where related texts are output in a disordered state, as has been experienced with the conventional structure, the user is required to change the way of thinking. Thus, labor is required to understand the result of the filtering operation. However, the information filtering apparatus according to the present invention enables the related articles to be formed into groups or made to be related to one another when presented to the user. Therefore, the labor required for the user can significantly be reduced.

It is preferable that the similarities with previous articles be obtained as well as the similarities among articles supplied this day to add information indicating whether the article is included in a group consisting of only the articles supplied this day or in a group including previous articles to the article to be output. As a result, the user is able to arrange the relevant articles more efficiently when the user reads the article.
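The grouping of articles and the labeling of each group as consisting only of today's articles or including previous ones can be sketched as below. Single-link grouping over Jaccard word overlap and the threshold of 0.3 are assumptions chosen for brevity; the function names and label strings are hypothetical.

```python
import re


def word_set(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_articles(articles, threshold=0.3):
    """Single-link grouping: articles whose similarity meets the threshold share a group.

    articles: dict id -> text. Returns a sorted list of sorted id lists.
    """
    ids = list(articles)
    words = {i: word_set(articles[i]) for i in ids}
    parent = {i: i for i in ids}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for x in range(len(ids)):
        for y in range(x + 1, len(ids)):
            if jaccard(words[ids[x]], words[ids[y]]) >= threshold:
                parent[find(ids[x])] = find(ids[y])

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())


def label_groups(groups, todays_ids):
    """Tag each group as containing only today's articles or also previous ones."""
    return [(g, "today only" if set(g) <= set(todays_ids) else "includes previous")
            for g in groups]
```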

According to the present invention, there is provided an information filtering apparatus comprising: means for receiving articles, such as texts and images, from a plurality of information sources; means for calculating similarities among retrieving conditions previously specified by a user and the supplied articles to retrieve articles by a specified number or articles having similarity greater than a predetermined threshold in accordance with the calculated similarity; and means for presenting the retrieved documents to the user, wherein articles written in a different language are supplied to replace retrieving conditions specified in a single language by another language to calculate similarities with the articles to present the article written in the different language to the user while being mixed with the documents to be presented by the means.

In the case where the similarities among the articles and the retrieving conditions are calculated, the retrieving conditions specified in a certain language are directly used and the retrieving conditions are replaced into another language so that similarities among the changed retrieving conditions and the articles are calculated. Therefore, the user is able to simultaneously receive news or the like from a plurality of language zones with specified retrieving conditions written in one language. Thus, a satisfactory retrieving function can be realized with respect to various articles written in different languages.
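The idea of using the retrieving conditions directly for one language and replacing them into another language for foreign articles can be sketched as follows. The toy dictionary `EN_TO_DE`, the term-overlap score and all function names are hypothetical; a real system would rely on a full bilingual lexicon and a richer similarity measure.

```python
import re

# Hypothetical toy English-to-German term dictionary, for illustration only.
EN_TO_DE = {
    "bank": ["bank"],
    "interest": ["zins", "zinsen"],
    "rates": ["zinssatz", "zinsen"],
}


def tokens(text):
    return re.findall(r"\w+", text.lower())


def translate_condition(terms, dictionary):
    """Replace each condition term with its dictionary translations (unknown terms are dropped)."""
    out = []
    for term in terms:
        out.extend(dictionary.get(term, []))
    return out


def overlap_score(condition_terms, text):
    """Fraction of distinct condition terms that occur in the text."""
    wanted = set(condition_terms)
    if not wanted:
        return 0.0
    return len(wanted & set(tokens(text))) / len(wanted)


def filter_mixed(condition_terms, articles, dictionaries, threshold=0.3):
    """Score English articles with the original terms and others with translated terms.

    articles: list of (id, language code, text); the condition is assumed to be English.
    Returns ids of articles meeting the threshold, best score first, languages mixed.
    """
    scored = []
    for aid, lang, text in articles:
        terms = condition_terms if lang == "en" else translate_condition(condition_terms, dictionaries[lang])
        score = overlap_score(terms, text)
        if score >= threshold:
            scored.append((score, aid))
    return [aid for score, aid in sorted(scored, key=lambda p: (-p[0], p[1]))]
```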

It is preferable that the apparatus capable of mixedly presenting articles written in different languages to the user has means for calculating the similarities among articles written in different languages to determine articles having similarity greater than a predetermined threshold to be duplicated articles so as to supply only one of the articles to the user. As a result, only one of the articles written in different languages and having the same contents is supplied to the user. Thus, the waste of reading the same article twice can be eliminated. In the foregoing case, it is preferable that a language for the user is previously stored to determine which of the duplicated articles is to be supplied, or that the overall body or a portion of the article written in a language different from the language of the user is translated into the language of the user before the article is supplied to the user.

According to the present invention, there is provided an information filtering apparatus having: means for receiving articles, such as texts and images, from a plurality of information sources; means for calculating similarities among retrieving conditions previously specified by a user and the supplied articles to retrieve articles by a specified number or articles having similarity greater than a predetermined threshold in accordance with the calculated similarity; and means for presenting the retrieved documents to the user, comprising means for changing the threshold of the similarity or the retrieving conditions in accordance with results of retrieval performed by the retrieving means.

According to the present invention, various retrieving conditions or the threshold of the similarities are dynamically changed whenever the retrieval is performed or in accordance with results of plural and successive retrievals. Thus, the retrieving conditions or the threshold of the similarities can be allowed to automatically follow the change in the contents of the article which is being supplied. As a result, an appropriate article can always be presented to the user without a necessity for the user to change the specification of the retrieving conditions.

As the retrieving conditions which are dynamically changed in accordance with the results of retrievals, topics specified by the user or text data bases in which articles to be retrieved are recorded may be employed. It is preferable that the threshold of the similarities be changed in accordance with distribution of the similarities examined over a plurality of articles. As a result, retrieval of documents which are not sufficiently appropriate can be prevented. It is effective to change the retrieving conditions in accordance with the balance of the contract with the user or change the method of displaying the article in accordance with the similarity.
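A threshold that follows the distribution of similarities might look like the sketch below. The rule used here (mean plus k population standard deviations, with a lower bound) is an assumption; the text only states that the threshold changes with the distribution of similarities over a plurality of articles.

```python
import statistics


def adaptive_threshold(similarities, k=1.0, floor=0.1):
    """Threshold derived from the observed similarity distribution.

    Returns mean + k * population standard deviation, never below 'floor'.
    With fewer than two values the floor is returned.
    """
    if len(similarities) < 2:
        return floor
    mean = statistics.mean(similarities)
    spread = statistics.pstdev(similarities)
    return max(floor, mean + k * spread)


def select_articles(scores, k=1.0, floor=0.1):
    """Return ids whose similarity meets the adaptive threshold, best first.

    scores: dict id -> similarity to the retrieving conditions.
    """
    threshold = adaptive_threshold(list(scores.values()), k, floor)
    hits = [aid for aid, s in scores.items() if s >= threshold]
    return sorted(hits, key=lambda aid: -scores[aid])
```

On a day when most articles score low, the threshold drops with them, so a clearly outstanding article is still selected without the user changing any setting.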

According to the present invention, there is provided an information filtering apparatus for receiving articles, such as texts and images, from a plurality of information sources to be presented to a user, comprising means for storing retrieving conditions previously specified for each user; and filtering means for calculating the similarities among the retrieving conditions for each user and the articles by a method formed by combining a plurality of methods of calculating similarities with one another and selecting articles which satisfy the retrieving conditions for each user in accordance with results of the calculations.

According to the present invention, a plurality of methods of calculating similarities are combined with each other to prevent deterioration in the filtering accuracy which has not been prevented by the single method of calculating the similarities. Thus, the filtering accuracy can be improved. It is preferable that the method of calculating the similarities be formed by combining calculations for obtaining similarities by using the occurrence frequency in a character unit match and calculations for obtaining similarities by using the occurrence frequency in a word unit match. The calculations for obtaining similarities by using the occurrence frequency in the character unit match have a possibility that the similarity is calculated including words having completely different meanings. On the other hand, the calculations for obtaining similarities by using the occurrence frequency in the word unit match are free from the foregoing problem. On the contrary, the calculations for obtaining similarities by using the occurrence frequency in the word unit match have a possibility that a word which is not contained in the dictionary for analyzing the morpheme cannot correctly be analyzed and, thus, it is not included in the calculations for obtaining similarities. However, the calculations for obtaining similarities by using the occurrence frequency in the character unit match are free from the foregoing problem. Therefore, by combining the two calculation methods for obtaining similarities, the mutual disadvantages can be compensated and, therefore, the similarity can be calculated more accurately.
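The combination of character-unit and word-unit matching can be illustrated with the sketch below. It assumes character bigram frequencies for the character-unit side, whitespace and punctuation tokenization in place of dictionary-based morphological analysis for the word-unit side, cosine similarity for both, and a weighted average to combine them; none of these specifics come from the text.

```python
import math
import re
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two frequency vectors (0.0 when either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def char_ngrams(text, n=2):
    """Occurrence frequencies of character n-grams (character-unit matching)."""
    s = re.sub(r"\s+", " ", text.lower())
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def word_counts(text):
    """Occurrence frequencies of words (word-unit matching, simplified tokenizer)."""
    return Counter(re.findall(r"\w+", text.lower()))


def combined_similarity(condition, article, weight=0.5):
    """Weighted average of character-unit and word-unit similarity.

    'weight' is the share given to the character-unit score (illustrative default).
    """
    char_sim = cosine(char_ngrams(condition), char_ngrams(article))
    word_sim = cosine(word_counts(condition), word_counts(article))
    return weight * char_sim + (1 - weight) * word_sim
```

With the condition "data base" and the article "database systems", the word-unit score is zero because no whole word matches, while the character-unit score is positive, so the combined score still registers the article.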

To preprocess only the articles required for the filtering process, and to do so at the time of the filtering process, it is preferable that a simple primary retrieval is performed first such that articles including words specified in the retrieving conditions are selected. Then, a preprocess is performed such that the morphemes and the format of the articles selected by the primary retrieval are analyzed. As a result, the time required to complete the filtering process can be shortened and the required storage region can be reduced. Moreover, a mechanism is provided by which words for changing the user profile are extracted from the adaptable documents or non-adaptable documents specified by the user, and the user profile is changed with the extracted words so that it adapts to the requirements and the interests of the user. Thus, the filtering performance can be improved further.
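The two-stage arrangement described in the first part of the above paragraph can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the primary retrieval is assumed to be a plain substring test, and `analyze` is a stand-in for the morpheme and format analysis, which is not reproduced here.

```python
def primary_retrieval(articles: dict, condition_words: set) -> dict:
    """Cheap first pass: keep only articles containing a word from the conditions."""
    return {
        article_id: text
        for article_id, text in articles.items()
        if any(word in text.lower() for word in condition_words)
    }


def analyze(text: str) -> list:
    """Stand-in for the costly analysis step (here: lowercase whitespace split)."""
    return text.lower().split()


def filter_pipeline(articles: dict, condition_words: set) -> dict:
    """Run the costly analysis only on articles that survived the primary retrieval."""
    candidates = primary_retrieval(articles, condition_words)
    return {article_id: analyze(text) for article_id, text in candidates.items()}
```

Because `analyze` is called only for the candidates, articles that contain none of the specified words are never analyzed or stored, which is the source of the time and storage savings mentioned above.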

Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate presently preferred embodiments of the invention and, together with the general description given above and the detailed description of the preferred embodiments given below, serve to explain the principles of the invention.

FIG. 1 is a block diagram showing the overall structure of an information filtering system according to the present invention;

FIG. 2 is a schematic view showing the operational state of the information filtering system shown in FIG. 1;

FIG. 3 is a block diagram showing the structure of an information filtering center provided for the information filtering system according to a first embodiment of the present invention;

FIG. 4 is a flow chart showing a flow of a user profile generating process to be performed in the system according to the first embodiment;

FIG. 5 is a flow chart showing a flow of an article information extracting process to be performed in the system according to the first embodiment;

FIG. 6 is a diagram showing an example of expressions of articles employed in the system according to the first embodiment;

FIG. 7 is a diagram showing another example of expressions of articles employed in the system according to the first embodiment;

FIG. 8 is a flow chart showing a flow of an article retrieving process to be performed in the system according to the first embodiment;

FIG. 9 is a diagram showing a state of supplied articles ranked by the selection process in the system according to the first embodiment;

FIG. 10 is a flow chart showing a flow of an article selection process in the system according to the first embodiment;

FIG. 11 is a diagram showing an example in which top ten articles have been selected in a case where a result of ranking as shown in FIG. 9 has been obtained in the system according to the first embodiment;

FIG. 12 is a diagram showing an example in which articles having similarities greater than 0.86 with the user profile have been selected in the case where a result of ranking as shown in FIG. 9 has been obtained in the system according to the first embodiment;

FIG. 13 is a diagram showing a state where the top portion of a plurality of results of ranking is merged to select articles to be presented to a user in a case where a plurality of retrievals and ranking are performed with respect to one user in the system according to the present invention;

FIG. 14 is a flow chart showing a flow of an article similarity calculating process in the system according to the first embodiment;

FIG. 15 is a diagram showing an example of articles supplied from different information sources in the system according to the first embodiment;

FIG. 16 is a flow chart showing a flow of a presentation information generating process in the system according to the first embodiment;

FIG. 17 is a diagram showing a state where duplicated articles are generated from one press release in the system according to the first embodiment;

FIG. 18 is a diagram showing a state where duplicated articles are generated from one event in the system according to the first embodiment;

FIGS. 19A and 19B are diagrams showing examples of sets of duplicated articles obtained due to calculations for obtaining similarities among articles performed with respect to four articles shown in FIG. 15 in the system according to the first embodiment;

FIG. 20 is a diagram showing an example in which information about omitted duplicated articles is added to information of the body of the article to be presented in the system according to the first embodiment;

FIG. 21 is a diagram showing a display state of information of relevant articles in the system according to the first embodiment;

FIG. 22 is a diagram showing another display state of information of relevant articles in the system according to the first embodiment;

FIG. 23 is a diagram showing another display state of information of relevant articles in the system according to the first embodiment;

FIG. 24 is a flow chart showing a flow of display screen switching process for information of relevant articles in the system according to the first embodiment;

FIG. 25 is a flow chart showing another flow of display screen switching process for information of relevant articles in the system according to the first embodiment;

FIG. 26 is a diagram showing an example in which a list of articles to be presented to a user is displayed together with information of duplicated articles in the case where duplication of articles takes place as shown in FIG. 20 in the system according to the first embodiment;

FIG. 27 is a flow chart showing a flow of an article similarity calculating process in the system according to the first embodiment;

FIGS. 28A and 28B are diagrams showing examples of a set of articles selected by an article selection portion this time and a set of articles presented previously to the user in the system according to the first embodiment;

FIG. 29 is a flow chart showing a flow of presentation information generating process in the system according to the first embodiment;

FIG. 30 is a diagram showing an example in which information of previous relevant articles is added to information of the article supplied this time in the system according to the first embodiment;

FIG. 31 is a diagram showing another example in which information of previous relevant articles is added to information of the article supplied this time in the system according to the first embodiment;

FIG. 32 is a diagram showing an example in which information of previous relevant articles is included in information of the body of the article supplied this time in the system according to the first embodiment;

FIG. 33 is a diagram showing a state where a list of previous articles related to the first sentence shown in FIG. 32 is displayed when the first sentence has been selected in the system according to the first embodiment;

FIG. 34 is a diagram showing an example of display of the body of a relevant article "earthquake off ○○, magnitude 4" shown in FIG. 33 when the relevant article has been selected in the system according to the first embodiment;

FIG. 35 is a flow chart showing another example of the flow of the article similarity calculating process in the system according to the first embodiment;

FIG. 36 is a diagram showing another example of the presentation information generating process in the system according to the first embodiment;

FIG. 37 is a diagram showing another example in which information of the body of an article supplied this time is presented together with information of other relevant articles in the system according to the first embodiment;

FIG. 38 is a diagram showing another example in which information of the body of an article supplied this time is presented together with information of other relevant articles in the system according to the first embodiment;

FIGS. 39A and 39B are diagrams showing an example in which similarities among articles are reflected in the article presenting order in the system according to the first embodiment;

FIG. 40 is a schematic view showing a user profile for use in an information filtering system according to a second embodiment of the present invention;

FIG. 41 is a block diagram showing the structure of an information filtering center in the system according to a second embodiment;

FIG. 42 is a flow chart showing a flow of an article retrieving process in the system according to the second embodiment;

FIG. 43 is a schematic view showing supplied articles which have been ranked in the system according to the second embodiment;

FIG. 44 is a flow chart showing a flow of an article selection process in the system according to the second embodiment;

FIG. 45 is a diagram showing topics and results of retrieving the topics in the system according to the second embodiment;

FIG. 46 is a diagram showing topics and an added information generating process in the system according to the second embodiment;

FIG. 47 is a diagram showing a state in which information of topics satisfied by each article is added to a list of captions of articles selected for a user in the system according to the second embodiment;

FIGS. 48A and 48B are diagrams showing a state where information of the number of articles satisfying each topic is presented to the user in the system according to the second embodiment;

FIG. 49 is a diagram showing a state where summaries or abstracts or the bodies of articles selected for the user are collected for each topic to be presented to the user in the system according to the second embodiment;

FIG. 50 is a diagram showing a state where information about retrieving conditions satisfied by the article is added as header information to be presented to the user in the system according to the second embodiment;

FIGS. 51A and 51B show an example in which the retrieving conditions satisfied in the system according to the second embodiment are stressed and displayed in the article;

FIGS. 52A and 52B show another example in which the retrieving conditions satisfied in the system according to the second embodiment are stressed and displayed in the article;

FIGS. 53A and 53B show another example in which the retrieving conditions satisfied in the system according to the second embodiment are stressed and displayed in the article;

FIG. 54 is a diagram showing a specific example of the retrieving conditions for retrieving documents satisfying a certain topic in the system according to the second embodiment;

FIG. 55 is a diagram showing an example of display of the retrieving conditions to be added to an article retrieved under the retrieving conditions shown in FIG. 54 and presented to the user in the system according to the second embodiment;

FIG. 56 is a diagram showing another example of display of the retrieving conditions to be added to an article retrieved under the retrieving conditions shown in FIG. 54 and presented to the user in the system according to the second embodiment;

FIG. 57 is a flow chart showing another example of the article retrieving process in the system according to the second embodiment;

FIG. 58 is a flow chart showing another example of an added information generating process in the system according to the second embodiment;

FIG. 59 is a table showing the relationship among a plurality of users and articles to be transmitted to the users in the system according to the second embodiment;

FIG. 60 is a diagram showing a state where information about other users who have received the article is added to a list of captions of articles selected for a certain user in the system according to the second embodiment;

FIG. 61 is a diagram showing a state where information about other users who have received the article is added to a list of captions of articles selected for a certain user in the system according to the second embodiment;

FIG. 62 is a diagram showing a state where information about other users who have received the article is, as header information, added to the body of the article to be presented to the user in the system according to the second embodiment;

FIG. 63 is a diagram showing another example of the state where information about other users who have received the article is, as header information, added to the body of the article to be presented to the user in the system according to the second embodiment;

FIG. 64 is a diagram showing an example of display in which relevance feedback information previously performed by a certain user or other users is added to information of the article to be presented this time so as to be presented in the system according to the second embodiment;

FIG. 65 is a diagram showing another example of display in which relevance feedback information previously performed by a certain user or other users is added to information of the article to be presented this time so as to be presented in the system according to the second embodiment;

FIG. 66 is a block diagram showing an information filtering center provided for an information filtering system according to a third embodiment of the present invention;

FIG. 67 is a diagram showing examples of keywords and user profiles expressed with the weights of the keyword in the system according to the third embodiment;

FIG. 68 is a flow chart showing a flow of a summary or abstract generating process in the system according to the third embodiment;

FIG. 69 is a diagram showing an example of topics selected by the user and their priorities in the system according to the third embodiment;

FIG. 70 is a diagram showing a list of articles to be presented to a user who has selected the topics shown in FIG. 69 and topics satisfying the articles in the system according to the third embodiment;

FIG. 71 is a conceptual view showing information of articles to be presented to the user in the system according to the third embodiment;

FIG. 72 is a diagram showing topics selected by the user and their priorities in the system according to the third embodiment;

FIG. 73 is a diagram showing an example of information of articles to be presented to the user at a next filtering process in a case where feedback has been performed in the system according to the third embodiment;

FIG. 74 is a flow chart of a summary or abstract generating process in the system according to the third embodiment;

FIG. 75 is a diagram showing examples of articles selected by the article selection portion in the system according to the third embodiment;

FIG. 76 is a diagram schematically showing another example of information of articles to be presented to the user in the system according to the third embodiment;

FIG. 77 is a diagram showing examples of articles selected to be presented to the user in a case where newspaper publishing companies have been employed as attributes in the system according to the third embodiment;

FIG. 78 is a diagram showing articles to be presented to the user in the case shown in FIG. 77 in the system according to the third embodiment;

FIG. 79 is a diagram showing another example of information of articles to be presented to the user at a next filtering process in a case where feedback has been performed in the system according to the third embodiment;

FIG. 80 is a flow chart showing a flow of a presentation information generating process in the system according to a fourth embodiment;

FIG. 81 is a block diagram showing the structure of an information filtering center provided for the information filtering system according to a fifth embodiment of the present invention;

FIG. 82 is a flow chart showing a flow of a presentation information generating process in the system according to the fifth embodiment;

FIG. 83 is a flow chart showing a flow of an output process of a duplicated article set in the system according to the fifth embodiment;

FIG. 84 is a diagram showing an example of presentation of articles to the user in the system according to the fifth embodiment;

FIG. 85 is a diagram showing an example of presentation of articles to the user in the form of a hyper text in the system according to the fifth embodiment;

FIG. 86 is a diagram showing an example of presentation of articles to the user in the form of a hyper text in the system according to the fifth embodiment;

FIG. 87 is a diagram showing an example of presentation of articles to the user in the form of a hyper text in the system according to the fifth embodiment;

FIG. 88 is a block diagram showing the structure of an information filtering center provided for the information filtering apparatus according to a sixth embodiment of the present invention;

FIG. 89 is a flow chart showing a flow of a text article receiving process in the apparatus according to the sixth embodiment;

FIG. 90 is a flow chart showing a flow of a similarity calculating process in the apparatus according to the sixth embodiment;

FIGS. 91A and 91B are diagrams showing a data format of the retrieving conditions and an example of actual data in the apparatus according to the sixth embodiment;

FIG. 92 is a flow chart showing a flow of a transmission article determining process in the apparatus according to the sixth embodiment;

FIG. 93 is a block diagram showing the function and structure of an apparatus according to a seventh embodiment of the present invention;

FIG. 94 is a flow chart showing a portion of a flow of a duplicated article deleting process in the apparatus according to the seventh embodiment;

FIG. 95 is a flow chart showing a residual portion of the flow of the duplicated article deleting process in the apparatus according to the seventh embodiment;

FIG. 96 is a flow chart showing a flow of an article similarity calculating process in the apparatus according to the seventh embodiment;

FIG. 97 is a block diagram showing the function and structure of an apparatus according to an eighth embodiment of the present invention;

FIG. 98 is a block diagram showing the function and structure of an apparatus according to a ninth embodiment of the present invention;

FIG. 99 is a block diagram showing the function and structure of an apparatus according to a tenth embodiment of the present invention;

FIGS. 100A and 100B are diagrams showing examples of data format of an article to be transmitted in the apparatus according to an eleventh embodiment of the present invention;

FIG. 101 is a block diagram showing the function and structure of an apparatus according to a twelfth embodiment of the present invention;

FIG. 102 is a flow chart showing a flow of a process to be performed by a relevance feedback portion in the apparatus according to the twelfth embodiment;

FIG. 103 is a block diagram showing the structure of an apparatus according to a thirteenth embodiment of the present invention;

FIG. 104 is a diagram showing a state where results of retrieval of topics are lined up in the descending order in terms of the similarity in the apparatus according to a thirteenth embodiment of the present invention;

FIG. 105 is a flow chart showing a process for obtaining the right-hand end of a flat portion of a descending curve of similarities, the similarity at the position and the order of documents in the apparatus according to the thirteenth embodiment;

FIG. 106 is a flow chart showing a process for changing the number of documents to be output within a range in which the number is not larger than a specified number of documents to be output in the apparatus according to the thirteenth embodiment;

FIG. 107 is a flow chart showing a process for changing the number of documents to be output within a range in which the number is not smaller than a specified number of documents to be output in the apparatus according to the thirteenth embodiment;

FIG. 108 is a flow chart showing a process for determining continuation of a fact that the number of retrieved documents is larger than a specified number by a specified number of times in the apparatus according to the thirteenth embodiment;

FIG. 109 is a flow chart showing a process for determining continuation of a fact that the number of retrieved documents is larger than a specified number by a specified number of times in the apparatus according to the thirteenth embodiment;

FIG. 110 is a flow chart showing a process for decreasing the specified number of documents to be output in the apparatus according to the thirteenth embodiment;

FIG. 111 is a flow chart showing a process for increasing the specified number of documents to be output in the apparatus according to the thirteenth embodiment;

FIG. 112 is a flow chart showing a process for reducing the specified threshold of similarities in the apparatus according to the thirteenth embodiment;

FIG. 113 is a flow chart showing a process for enlarging the specified threshold of similarities in the apparatus according to the thirteenth embodiment;

FIG. 114 is a flow chart showing a process for deleting a text data base from a user information storage portion in the apparatus according to the thirteenth embodiment;

FIG. 115 is a flow chart showing another process for deleting a text data base from a user information storage portion in the apparatus according to the thirteenth embodiment;

FIG. 116 is a diagram showing an example of retrieval in the apparatus according to the thirteenth embodiment;

FIG. 117 is a flow chart showing a process for changing the topic in the apparatus according to the thirteenth embodiment;

FIG. 118 is a flow chart showing another process for changing the topic in the apparatus according to the thirteenth embodiment;

FIG. 119 is a diagram showing another example of retrieval in the apparatus according to the thirteenth embodiment;

FIG. 120 is a flow chart showing another process for changing the topic in the apparatus according to the thirteenth embodiment;

FIG. 121 is a flow chart showing another process for changing the topic in the apparatus according to the thirteenth embodiment;

FIG. 122 is a diagram showing another example of retrieval in the apparatus according to the thirteenth embodiment;

FIG. 123 is a flow chart showing a process for changing the threshold of the similarity in accordance with the balance of contraction with the user in the apparatus according to the thirteenth embodiment;

FIG. 124 is a flow chart showing a process for changing the font size of a text to be presented in the apparatus according to the thirteenth embodiment;

FIG. 125 is a diagram showing an example of topics for use in the apparatus according to the thirteenth embodiment;

FIG. 126 is a block diagram showing the function and structure of an apparatus according to a fourteenth embodiment of the present invention;

FIG. 127 is a flow chart showing the overall process in the apparatus according to the fourteenth embodiment;

FIG. 128 is a flow chart showing, in detail, the filtering process shown in FIG. 127;

FIG. 129 is a flow chart showing, in detail, the process for changing the user profile shown in FIG. 127;

FIG. 130 is a flow chart showing, in detail, the document analyzing process shown in FIG. 128;

FIG. 131 is a flow chart showing, in detail, the filtering process shown in FIG. 128;

FIG. 132 is a flow chart showing a specific procedure of the document analyzing process shown in FIG. 128;

FIG. 133 is a flow chart showing the procedure of the format analyzing process shown in FIG. 132;

FIG. 134 is a diagram showing an example of a document to be subjected to the format analyzing process shown in FIG. 132;

FIG. 135 is a diagram showing results of analysis of the format of the document shown in FIG. 134;

FIG. 136 is a diagram showing an example of results of analysis of the morpheme corresponding to the results of the analysis of the format shown in FIG. 135;

FIG. 137 is a flow chart showing a document analyzing process in the apparatus according to a fifteenth embodiment of the present invention;

FIG. 138 is a flow chart showing another example of the process shown in FIG. 131;

FIG. 139 is a flow chart showing a retrieval process in the apparatus according to the eleventh embodiment of the present invention;

FIG. 140 is a flow chart showing a retrieval process in the apparatus according to the twelfth embodiment of the present invention;

FIG. 141 is a flow chart showing a retrieval process in the apparatus according to the thirteenth embodiment of the present invention;

FIG. 142 is a flow chart showing a retrieval process in the apparatus according to the fourteenth embodiment of the present invention;

FIG. 143 is a flow chart showing a retrieval process in the apparatus according to the fifteenth embodiment of the present invention; and

FIG. 144 is a flow chart showing the flow of the overall process in the apparatus according to the fifteenth embodiment.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to the drawings, preferred embodiments of the present invention will now be described.

Referring to FIG. 1, the overall structure of an information filtering system according to the present invention will now be described.

The information filtering system is an information supply system which receives text articles containing texts and images supplied from a plurality of information sources 2, such as newspaper publishing companies, news agencies and publishers, to periodically transmit the text articles to subscribed user terminals 3. The information supply service is realized by an information filtering center 1-1.

The information filtering center 1-1 is realized by one computer system connected, through a communication network, to the plural information sources 2 and the plural subscribed user terminals 3. The information filtering center 1-1 comprises a central processing unit 4 for performing controls and process for performing an information filtering operation, a storage unit 5, such as a semiconductor memory, a magnetic disk or an optical disk, for storing programs and data, a receiving portion 6 for receiving text articles from the information sources 2 through a communication network, such as a communication line or radio waves and a transmission portion 7 for transmitting text articles to the user terminals 3 through the communication network, such as the communication line or radio waves.

Each of the user terminals 3 is an information processing terminal, for example, a personal computer or a work station, and comprises a text information receiving portion 8 for receiving text articles transmitted from the information filtering center 1-1 and a display portion 9 for displaying, on a screen thereof, the supplied text articles.

As shown in FIG. 2, the information filtering center 1-1 stores a kind of retrieval conditions, called a user profile 10, for each user to retrieve articles to be supplied to a subject user in accordance with the user profile 10. The user profile 10 consists of a plurality of topics specified by the user so that an article meeting the topic is retrieved and selected so as to be supplied to the user.

The specific structure of the information filtering center 1-1 will now be described.

First Embodiment

FIG. 3 shows the structure of the information filtering center 1-1 according to a first embodiment of the present invention. Referring to FIG. 3, continuous line arrows indicate flows of data items.

As shown in FIG. 3, the information filtering center 1-1 consists of a user-profile generating portion 11, a user-profile storage portion 12, an article-information extraction portion 13, an article retrieval portion 14, an article selection portion 15, an article similarity calculating portion 16, a presentation information generating portion 17 and an article information storage portion 18. The elements surrounded by a dashed line, that is, the user-profile generating portion 11, the article-information extraction portion 13, the article retrieval portion 14, the article selection portion 15, the article similarity calculating portion 16 and the presentation information generating portion 17 can be realized by, for example, software to be executed by the central processing unit 4 shown in FIG. 1. The user-profile storage portion 12 and the article information storage portion 18 can be realized by the storage unit 5.

The user-profile generating portion 11 analyzes requirements, interests and the like previously specified by each user to generate, for each user, a user profile required to perform the retrieval. The user profiles are stored in the user-profile storage portion 12. The article-information extraction portion 13 extracts information required to perform the retrieval and calculation of the similarity from articles supplied from each of the information sources 2 and stores information in the article information storage portion 18 together with the text article.

The article retrieval portion 14 retrieves the articles supplied from each of the information sources 2 to obtain articles that meet the user profile. In the retrieval process, the similarities between the user profile and the supplied articles are calculated, and the articles are sorted in descending order of similarity. The article selection portion 15 is provided to select the articles to be presented to the user in accordance with a result of the retrieval operation. For example, all articles having a similarity higher than a certain threshold, or a certain number of articles having the highest similarities, are selected.
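The ranking and the two selection rules mentioned above can be sketched as follows. The function names and the representation of the retrieval result as a mapping from article identifier to similarity are assumptions made for illustration.

```python
def rank_articles(similarities: dict) -> list:
    """Sort (article_id, similarity) pairs in descending order of similarity."""
    return sorted(similarities.items(), key=lambda pair: pair[1], reverse=True)


def select_by_threshold(ranked: list, threshold: float) -> list:
    """Select every article whose similarity is higher than the threshold."""
    return [article_id for article_id, similarity in ranked if similarity > threshold]


def select_top_n(ranked: list, n: int) -> list:
    """Select the n articles having the highest similarities."""
    return [article_id for article_id, _ in ranked[:n]]
```

The threshold rule returns a variable number of articles depending on how well the supplied articles match, whereas the top-n rule always returns at most n articles regardless of their absolute similarity.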

The article similarity calculating portion 16 is arranged to detect the similarities among articles by calculating the similarities among the selected articles. The presentation information generating portion 17 generates article information to be presented to the user in accordance with a result of the article selection and a result of the calculation of the similarities of the articles. The article information storage portion 18 stores article information and the result of the calculation of the article similarity. Specific processes will now be described which are performed by the user-profile generating portion 11, the article-information extraction portion 13, the article retrieval portion 14, the article selection portion 15, the article similarity calculating portion 16 and the presentation information generating portion 17.

FIG. 4 shows a flow of the process to be performed by the user-profile generating portion 11.

The user-profile generating portion 11 receives requirements and interests from each user (step S1). The requirements and interests of the user are expressed in a natural language, such as "I want to read articles about ○○ and XX", in the form of a set of keywords that frequently occur in the topic of interest, the keywords being given a priority order or weights, or by a retrieval expression of the kind used in a usual document retrieval operation.

The user-profile generating portion 11 uses a word dictionary, a dictionary of synonyms and the like to perform language processes, such as extraction of a word and development of a synonym (steps S2 and S3). Then, the user-profile generating portion 11 performs conversion into a format that can be retrieved to generate a user profile (steps S4 and S5). The generated user profile is, for each user, stored in the user-profile storage portion 12 so as to be used as a retrieval condition for retrieving the articles.
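The language processes described above can be sketched as follows, under simplifying assumptions: the word dictionary is a plain set, the dictionary of synonyms is a mapping from a word to its synonyms, and the retrievable format is a list of keyword groups in which the members of a group are alternatives. The function name `generate_user_profile` and the data layout are hypothetical.

```python
def generate_user_profile(request: str, word_dictionary: set, synonyms: dict) -> list:
    """Turn a natural-language request into a list of keyword groups.

    Each group holds one extracted keyword together with its synonyms.
    """
    # Word extraction: keep only the words found in the word dictionary.
    tokens = [token.strip(".,!?\"'").lower() for token in request.split()]
    keywords = [token for token in tokens if token in word_dictionary]

    # Synonym development and conversion into the (assumed) retrievable format.
    profile = []
    for keyword in dict.fromkeys(keywords):  # drop repeats, keep first-seen order
        group = sorted({keyword, *synonyms.get(keyword, [])})
        profile.append(group)
    return profile
```

For instance, with a word dictionary containing "semiconductor" and "memory", and "dram" registered as a synonym of "memory", the request "I want to read articles about semiconductor and memory" yields one group for "semiconductor" and one group containing both "dram" and "memory".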

FIG. 5 shows an example of a flow of the process to be performed by the article-information extraction portion 13.

The article-information extraction portion 13 receives an article supplied from the information source (step S11) to subject the article to a morpheme analysis, construction analysis and format analysis by using a dictionary for sentence analysis, a dictionary for information extraction and the like so as to extract the information source, date of supply of the article, frequency information of the components of the document, such as characters and words, position of appearance and information relevant to 5W1H (steps S12 to S14). Then, the article-information extraction portion 13 expresses the article as the set of the extracted information items (step S15). For example, the article is expressed by a vector, the component of which is the frequency of words allowed to appear, or by a 5W1H template having substituted apparent values. Examples of expression of the articles are shown in FIGS. 6 and 7. FIG. 6 shows frequency vectors, the component of each of which is frequency of occurrence (14, 9, 5, 2, 3) of words (semiconductor, memory, friction, depression, production, . . . ) allowed to appear in the article. FIG. 7 shows a template having items consisting of an information source, the number of words, caption, topic, date, place, the subject, main verb and the like.
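The frequency-vector expression of FIG. 6 can be illustrated with a short sketch. The vocabulary below reuses the words named in the text, while the sample sentence, the simple regular-expression tokenizer and the function name frequency_vector are assumptions made for illustration; the morpheme and format analyses of steps S12 to S14 are not reproduced.

```python
from collections import Counter
import re

# Words named in the description of FIG. 6; the order is an assumption.
VOCABULARY = ["semiconductor", "memory", "friction", "depression", "production"]


def frequency_vector(text, vocabulary=VOCABULARY):
    """Express an article as the frequency of each vocabulary word."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    # A Counter returns 0 for words that never appear in the article.
    return [counts[word] for word in vocabulary]


vector = frequency_vector(
    "Semiconductor memory production rose; semiconductor friction eased."
)
```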

The article-information extraction portion 13 expresses the article as described above, and then also performs an indexing process to quickly retrieve articles (step S16). Then, the article-information extraction portion 13 expresses the article by the vector or the template and stores the article and indexing information in the article information storage portion 18 (step S17).

FIG. 8 shows a flow of the process to be performed by the article retrieval portion 14.

The article retrieval portion 14 makes a reference to article information extracted by the article-information extraction portion 13 to retrieve an article that meets the user profile from the supplied articles.

The foregoing operation corresponds to a calculation of the similarities between the user profile and the supplied articles. Depending upon the method of the retrieval, the similarity may be a discrete value, such as "the article meets the user profile" or "the article does not meet the user profile", or a continuous value, in which case articles that meet the user profile more closely are given higher similarity values. The following description covers the case where the similarity is in the form of continuous values.

The article retrieval portion 14 performs the following process with respect to the user profile for each user.

Initially, the article retrieval portion 14 reads the user profile from the user-profile storage portion 12 (step S21). Then, the article retrieval portion 14 substitutes "1" for variable i (step S22), and then calculates the similarity between the i-th article (the first article) and the user profile (step S23). The calculation of the similarity corresponds to a usual retrieval process in which references are made to the expression of the article and the retrieval index stored in the article information storage portion 18.

Then, the article retrieval portion 14 increases the value of the variable i by one, and then examines whether the value of i is larger than the number of the supplied articles (steps S24 and S25). If the value of i is not larger than the number of the supplied articles, the article retrieval portion 14 recognizes that articles whose similarity has not yet been calculated exist. Thus, the article retrieval portion 14 repeats steps S23 to S25 until the value of i becomes larger than the number of the supplied articles. When the calculations of the similarities between all of the supplied articles and the user profile have been completed, that is, when the retrieval process covering all of the supplied articles has been completed, the article retrieval portion 14 sorts the supplied articles in descending order of similarity with the user profile to rank the articles (step S26). A result of the ranking operation is stored in the article information storage portion 18. An example of a result of the ranking operation is shown in FIG. 9.
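The loop of steps S22 to S26 might look like the following sketch. The description does not specify how the similarity is computed, so cosine similarity between frequency vectors is used here purely as an assumption; the function names and the sample vectors are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Assumed similarity measure; the description does not fix one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_articles(profile_vector, article_vectors):
    """Score every supplied article, then sort in descending order of similarity."""
    scored = []
    i = 1  # step S22
    while i <= len(article_vectors):  # step S25: stop once i exceeds the article count
        score = cosine_similarity(profile_vector, article_vectors[i - 1])  # step S23
        scored.append((i, score))
        i += 1  # step S24
    scored.sort(key=lambda pair: pair[1], reverse=True)  # step S26
    return scored


ranking = rank_articles([1, 1, 0], [[0, 0, 3], [2, 2, 0], [1, 0, 0]])
```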

FIG. 10 shows a flow of a process to be performed by the article selection portion 15.

The article selection portion 15 reads, from the article information storage portion 18, the supplied articles retrieved and ranked by the article retrieval portion 14 (step S31) to select an article to be presented to the user (step S32). Information of the article determined to be presented to the user is again stored in the article information storage portion 18.

The selection of the article may be performed such that the number N of articles to be presented to the user is previously determined by the information filtering center 1-1 to present N upper rank articles or that articles having the similarity higher than a certain threshold are presented. FIG. 11 shows an example in which 10 upper rank articles are selected in a state where a result of ranking shown in FIG. 9 has been obtained.

FIG. 12 shows an example in which articles each having a similarity with the user profile of 0.86 or greater have been selected in the state where a result of ranking shown in FIG. 9 has been obtained.
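The two selection rules of FIGS. 11 and 12 can be sketched as below. The ranked list and its similarity values are hypothetical; only the threshold of 0.86 is taken from the description of FIG. 12.

```python
def select_top_n(ranked, n):
    """Select the N upper-rank articles from a list sorted by descending similarity."""
    return ranked[:n]


def select_by_threshold(ranked, threshold):
    """Select the articles whose similarity is equal to or greater than the threshold."""
    return [(article, score) for article, score in ranked if score >= threshold]


# Hypothetical ranking result, already sorted in descending order of similarity.
ranked = [("a1", 0.95), ("a2", 0.90), ("a3", 0.86), ("a4", 0.70)]

top_two = select_top_n(ranked, 2)
above_threshold = select_by_threshold(ranked, 0.86)
```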

FIG. 13 shows an example in which upper portions of results of ranking of a plurality of articles are merged to select articles to be presented to the user in a case where a plurality of retrieval operations and ranking operations have been performed for one user.

In the foregoing case, retrieval operations for three topics as "semiconductor technology", "low price personal computer" and "artificial intelligence" have been performed individually so that articles A1, B1, C1, A2 and B2 have been selected from the higher rank articles as a result of each of the three retrievals.

Articles A1 and A2 are those meeting the topic "semiconductor technology", articles B1 and B2 are those meeting the topic "low price personal computer" and article C1 is an article meeting the topic "artificial intelligence".
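The description does not state how the upper portions of the three rankings are merged. One possibility, shown here only as an assumption, is a round-robin merge that takes the first-ranked article of each topic, then the second-ranked article of each topic, and so on up to a limit. Articles A3, B3 and C2 and the function name merge_rankings are hypothetical.

```python
from itertools import zip_longest


def merge_rankings(rankings, limit):
    """Round-robin merge of per-topic rankings (an assumed merging rule)."""
    merged = []
    for tier in zip_longest(*rankings):
        for article in tier:
            # zip_longest pads shorter rankings with None; skip padding and repeats.
            if article is not None and article not in merged:
                merged.append(article)
    return merged[:limit]


semiconductor_technology = ["A1", "A2", "A3"]
low_price_personal_computer = ["B1", "B2", "B3"]
artificial_intelligence = ["C1", "C2"]

selected = merge_rankings(
    [semiconductor_technology, low_price_personal_computer, artificial_intelligence],
    limit=5,
)
```

With a limit of five, this rule yields the same five articles as the example of FIG. 13, although other merging rules would do so as well.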

As a method of selecting articles, a method may be employed in which a predetermined number of articles are selected as shown in FIG. 11 or a method in which articles each having a similarity greater than a predetermined value are selected, as shown in FIG. 12.

FIG. 14 shows a flow of the process to be performed by the article similarity calculating portion 16.

The article retrieval portion 14 calculates the similarity between the user profile and the article, that is, the article retrieval portion 14 uses the user profile as a retrieval equation to perform a usual retrieval of the articles. On the other hand, the article similarity calculating portion 16 calculates the similarity between articles.

The calculation of the similarity is performed by subjecting the expressions of the articles, for example, as shown in FIGS. 6 and 7, to a comparison and a result of the calculation is stored in the article information storage portion 18.

It is assumed here that a plurality of information sources 2, such as newspaper publishing companies, exist and that articles supplied from different information sources, for example, articles supplied from newspaper publishing company M and those supplied from newspaper publishing company N, are subjected to the process for calculating the similarity between articles.

Although all combinations of the articles supplied from the different information sources may be subjected to the process of calculating the similarity, the following description covers a case of low calculation cost in which only the articles selected by the article selection portion 15 are subjected to the process of calculating the similarity.

That is, the article similarity calculating portion 16 initially reads, from the article information storage portion 18, articles selected by the article selection portion 15 (step S41). Then, the article similarity calculating portion 16 selects articles among the read articles that have been supplied from different information sources to calculate their similarity so as to store a result of the calculation in the article information storage portion 18 (step S42).

A specific example of the calculation for obtaining the similarity among articles will now be described.

FIG. 15 shows examples of articles selected by the article selection portion 15 and supplied from different information sources. FIG. 15 shows a case where four articles A to D are presented to the user.

Articles A and D are those supplied from newspaper publishing company M, article B is an article supplied from newspaper publishing company N and article C is an article supplied from publishing company O.

In this case, the combination of article A and article B, that of article A and article C, that of article B and article C, that of article B and article D and that of article C and article D are subjected to the similarity calculations. Since article A and article D are supplied from the same information source, their similarity is not calculated.
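The rule that only articles from different information sources are compared can be expressed compactly. The following sketch uses the sources given for FIG. 15 and enumerates every pair of articles whose sources differ; the function name cross_source_pairs is hypothetical.

```python
from itertools import combinations

# Sources as described for FIG. 15: A and D from M, B from N, C from O.
sources = {"A": "M", "B": "N", "C": "O", "D": "M"}


def cross_source_pairs(sources):
    """List every pair of articles supplied from different information sources."""
    return [
        (x, y)
        for x, y in combinations(sorted(sources), 2)
        if sources[x] != sources[y]
    ]


pairs = cross_source_pairs(sources)
```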

FIG. 16 shows a flow of a process to be performed by the presentation information generating portion 17.

The presentation information generating portion 17 reads, from the article information storage portion 18, information about the article selected by the article selection portion 15 and the similarity calculated by the article similarity calculating portion 16 (steps S51 and S52).

Then, the presentation information generating portion 17 classifies, as a set of duplicated articles, a set of articles having great similarity and supplied from different information sources (step S53). The duplicated articles are articles of a type about the same event individually produced by a plurality of information sources. Thus, the duplicated articles are those which can be considered to be the articles having the same or substantially the same contents.
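Step S53 might be sketched as a grouping of articles whose pairwise similarity reaches a threshold. The threshold of 0.8, the similarity values and the use of a union-find structure are assumptions made for illustration; the description only requires that articles having great similarity be classified into one set.

```python
def group_duplicates(articles, similarities, threshold=0.8):
    """Group articles linked by a pairwise similarity of at least `threshold`."""
    parent = {article: article for article in articles}

    def find(article):
        # Follow parent links to the representative of the set, compressing the path.
        while parent[article] != article:
            parent[article] = parent[parent[article]]
            article = parent[article]
        return article

    for (x, y), score in similarities.items():
        if score >= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for article in articles:
        groups.setdefault(find(article), []).append(article)
    return sorted(groups.values())


# Hypothetical similarity values for pairs of articles from different sources.
similarities = {
    ("A", "B"): 0.20,
    ("A", "C"): 0.90,
    ("B", "C"): 0.10,
    ("B", "D"): 0.85,
    ("C", "D"): 0.30,
}
duplicate_sets = group_duplicates(["A", "B", "C", "D"], similarities)
```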

Then, the presentation information generating portion 17 selects, from the set of the duplicated articles, one article (or, more generally, N articles) to be presented to the user as a representative article (step S54). The presentation information generating portion 17 then generates information to be presented to the user by, for example, adding information about the articles that have not been selected, as relevant article information, to the contents of the selected article, and then transmits the information item (steps S55 and S56).

Specific examples of the duplicated articles and relevant article information will now be described.

FIG. 17 shows an example in which duplicated articles are derived from one press release. When press release article P describing a certain event has been supplied to newspaper publishing companies M, N and O, each newspaper publishing company edits the press release article P and, for example, adds a comment so that individual articles M, N and O are produced. If articles M, N and O are supplied from respective information sources to the information filtering center 1-1, the articles M, N and O are duplicated articles.

FIG. 18 shows an example in which duplicated articles are produced from one event.

In this example, newspaper publishing companies M, N and O individually collect materials so that articles M, N and O are produced. If the articles are supplied to the information filtering center 1-1, the articles M, N and O are duplicated articles.

Since the information filtering center is provided for the purpose of most effectively causing the user to make an access to required information among a great quantity of information items, it can be considered that it is not preferable that the articles to be presented to the user include duplicated articles. If all of the articles M, N and O are presented to the user in the case, for example, shown in FIG. 18, the user is required to read three articles to obtain information about one event.

To prevent the presentation of duplicated articles, the presentation information generating portion 17 selects one article (or, more generally, N articles) from a set of duplicated articles as a representative article to be presented to the user. An operation to be performed when only one article is selected will now be described.

FIGS. 19A and 19B each show an example of a set of duplicated articles obtained as a result of calculations of four articles shown in FIG. 15 to obtain similarity among articles.

In the foregoing examples, articles A and C and articles B and D have great similarity, thus resulting in two sets of duplicated articles being obtained.

The presentation information generating portion 17 selects one article from the sets of duplicated articles in accordance with a predetermined algorithm.

Assuming that the user or the service center has determined to give priority to newspaper publishing company M, articles A and D, which are supplied from newspaper publishing company M, are presented to the user.

Similarly, a method may be employed in which a press release having the largest quantity of information is given highest priority and is selected.

Another method may be employed in which the article given the highest rank as a result of the retrieval is selected.

The similarity between the user profile and the article is, in the case shown in FIG. 19, such that article C has the greatest similarity in the set of duplicated articles 1 and article D has the greatest similarity in the set of duplicated articles 2. Therefore, the articles C and D are presented to the user.

Another algorithm may be employed in which the longest or shortest article is selected.
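Two of the selection rules described above, priority by information source and priority by similarity with the user profile, can be sketched as follows. The article sources and the two sets of duplicated articles follow the example in the text; the numerical similarities and the function names are hypothetical.

```python
def pick_by_source(group, sources, preferred_source):
    """Prefer the article supplied from the prioritized information source."""
    for article in group:
        if sources[article] == preferred_source:
            return article
    return group[0]  # fall back to the first article if the source is absent


def pick_by_profile_similarity(group, profile_similarity):
    """Prefer the article ranked highest against the user profile."""
    return max(group, key=lambda article: profile_similarity[article])


sources = {"A": "M", "B": "N", "C": "O", "D": "M"}
duplicate_sets = [["A", "C"], ["B", "D"]]
# Hypothetical similarities between the user profile and each article.
profile_similarity = {"A": 0.80, "B": 0.75, "C": 0.90, "D": 0.85}

by_source = [pick_by_source(group, sources, "M") for group in duplicate_sets]
by_rank = [pick_by_profile_similarity(group, profile_similarity) for group in duplicate_sets]
```

Selecting the longest or shortest article would follow the same pattern, with the article length used as the key.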

As a result of the foregoing process, duplicated articles are omitted from candidates of articles to be presented to the user. Information about the omitted duplicated articles is added to information about the body of each article, and then presented to the user.

FIG. 20 shows an example in which information about the duplicated articles, which have been omitted, is added to information about the body of the article.

In this example, the information about the body of the article which is presented to the user and information about articles determined to have the same contents as the foregoing article and supplied from other information sources is supplied as added information. Specifically, the caption, the information source, the number of words of the article and the similarity of the article with the article, the body of which is being presented to the user, are listed up.
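The presented item of FIG. 20 could be modeled as the body of the representative article plus a list of added information entries. The sketch below is an assumption about one convenient data structure; the class names, the dictionary keys, the sample articles and the choice to list the most similar article first are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RelevantInfo:
    caption: str
    source: str
    word_count: int
    similarity: float  # similarity with the article whose body is presented


@dataclass
class Presentation:
    body: str
    relevant: list = field(default_factory=list)


def build_presentation(representative, omitted, similarity_to_representative):
    """Attach information about the omitted duplicated articles to the presented body."""
    relevant = [
        RelevantInfo(
            caption=article["caption"],
            source=article["source"],
            word_count=len(article["body"].split()),
            similarity=similarity_to_representative[article["id"]],
        )
        for article in omitted
    ]
    # Listing the most similar article first is an assumption.
    relevant.sort(key=lambda info: info.similarity, reverse=True)
    return Presentation(body=representative["body"], relevant=relevant)


# Hypothetical articles and similarity values.
representative = {"id": "M", "caption": "Firm exits service business",
                  "source": "M", "body": "The firm said it will exit the business."}
omitted = [
    {"id": "N", "caption": "Company withdraws", "source": "N",
     "body": "The company withdraws from services."},
    {"id": "O", "caption": "Exit announced", "source": "O",
     "body": "An exit was announced today by the firm."},
]
page = build_presentation(representative, omitted, {"N": 0.82, "O": 0.91})
```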

In the foregoing example, although articles each having "○X corporation has pulled out from the service business" have been obtained from three information sources, ○○, △△ and □□ newspaper publishing companies, the article supplied from the ○○ newspaper publishing company is selected so as to be presented to the user.

By adding information about the omitted duplicated articles to information about the body of the article and by presenting the information item, repeated reading of articles having the same contents but supplied from different information sources can be prevented. Moreover, the methods of the different information sources to cover the same event can schematically be detected.

FIG. 21 shows a modification of the presentation method of relevant information shown in FIG. 20.

That is, although the structure shown in FIG. 20 presents relevant information in the form of a text, FIG. 21 has a structure such that the text portion of added information is formed into a hyper text structure to enable an access to the body of the duplicated article to be made.

In this case, the caption of the article in the area of the added information is formed into a button permitting selection by using a mouse device or the like. Thus, the user is able to refer to the body of relevant article 1 by selecting relevant article 1.

FIGS. 22 and 23 show examples in which the body of relevant article 1 is displayed in a case where relevant article 1 has been selected in the case shown in FIG. 21.

When the article "semiconductor conference was", the body of which is shown in FIG. 21, is displayed as in FIG. 22, only information such as its caption is displayed in the area for added information. In its place, the body of relevant article 1 is displayed in the area for the information of the body.

To return the state shown in FIG. 22 to that shown in FIG. 21, the user is required to select a button "semiconductor conference . . . (original article)" in the area for the added information shown in FIG. 22.

In the case shown in FIG. 23, information of the body of relevant article 1 is displayed on a window newly opened while maintaining information displayed in FIG. 21. By employing the foregoing display method, a plurality of duplicated articles can be subjected to a comparison.

The screen is shifted from the state shown in FIG. 21 to the state shown in FIG. 22 as follows in accordance with a flow of a process shown in FIG. 24.

The presentation information generating portion 17 adds information about the relevant information to information about the body of the article to be presented, as shown in FIG. 21 (step S61). If an event of the button of the relevant article being selected occurs, the presentation information generating portion 17 fetches information about the body of the selected relevant article (steps S62 and S63) to display information about the original article in the area for added information and the body of the selected relevant article in the area for information about the body (step S64).

The switch of the screen can be performed under control of the user terminals 3 if the information about the body of the relevant article has been transmitted to the user terminals 3 from the information filtering center 1-1.

The shift of the screen from the state shown in FIG. 21 to the state shown in FIG. 23 is performed as follows in accordance with a flow of a process shown in FIG. 25.

The presentation information generating portion 17 adds information of the relevant article to information about the body of the article to be presented so that the user terminals 3 displays information on the display thereof, as shown in FIG. 21 (step S71). Then, if an event that the button of the relevant article is selected occurs, the presentation information generating portion 17 fetches information about the body of the selected relevant article from the article information storage portion 18 (steps S72 and S73) to display the body of the selected relevant article on the window (step S74).

Also the switch of the screen can be performed under control of the user terminals 3 if the information about the body of the relevant article has been transmitted to the user terminals 3 from the information filtering center 1-1.

The relevant article to be added to the area for added information as shown in FIGS. 20 and 21 may be decreased in accordance with an algorithm similar to that employed to select articles from the set of duplicated articles.

By employing the structure which permits an access from the article, which is representative of the set of duplicated articles and the body of which is displayed, to the body of the other duplicated article as shown in FIGS. 21 to 23, the user is able to selectively read the other duplicated articles if the representative article selected by the information filtering system is not an appropriate article.

Even if the information filtering system has an algorithm to give a priority to and select the article supplied from N Times in accordance with a requirement from the user, an effective result can be obtained in a case where the user requires a press release in place of articles which are supplied from N Times.

Moreover, opinions of a plurality of newspaper publishing companies about one event can be subjected to a comparison.

FIG. 26 shows an example in which a list of articles to be presented to a user is displayed together with information about duplicated articles in a case where articles are duplicated.

In this case, in which four articles exist to be presented to the user, the third article, the content of which is "○X corporation has pulled out from information service business", has two duplicated articles.

Similarities between the user profile and the articles are displayed at the end of the caption of the articles. Moreover, similarities between the original article and the duplicated articles are added to the duplicated articles. It can be said that the added similarities indicate the probabilities that the articles are duplicated articles. The original article is, in this case, the article "○X corporation has pulled out from information service business".

The foregoing description has been given for the process with respect to one user profile.

Since a plurality of users who are supplied with the information filtering service exist in general, the information filtering center holds the user profile for each user to perform the filtering process.

First Modification of the First Embodiment

Examples of the structure of the article similarity calculating portion 16 and that of the presentation information generating portion 17 will now be described. FIG. 27 shows a flow of a process to be performed by the article similarity calculating portion 16.

The article retrieval portion 14 calculates the similarity between the user profile and the article, that is, the article retrieval portion 14 uses the user profile as the retrieving formula to perform a usual retrieving operation in which the articles are subjects of the retrieval. On the other hand, the article similarity calculating portion 16 calculates the similarity between articles.

The calculations for obtaining the similarity are performed by subjecting expressions of articles shown in, for example, FIGS. 6 and 7 to a comparison. A result of the calculations is stored in the article information storage portion 18.

It is assumed here that information of articles obtained by N information filtering operations is stored in the article information storage portion 18.

In an example case where the information filtering service is performed once a day and N is 1, information of articles obtained by the information filtering operation performed yesterday has been stored. The description will mainly cover the case where N=1.

In this system, combinations of articles supplied this time and articles which have been supplied until the previous operation are the subjects of the calculations for obtaining similarity between articles.

Although all combinations of the articles supplied this time and the articles, which have been supplied until the previous operation, may be subjected to the calculations for obtaining the similarity between articles, a lower cost method will now be described in which only the similarities of the combinations of articles selected this time by the article selection portion and articles which have been presented to the user until the previous operation are calculated.

That is, the article similarity calculating portion 16 reads information of articles selected by the article selection portion 15, and then reads, from the article information storage portion 18, information of articles which have been presented to the user until the previous filtering operations (steps S81 and S82). Then, the article similarity calculating portion 16 calculates the similarities of the combinations between the articles selected this time by the article selection portion 15 and the articles which have been presented to the user due to the previous operations so as to store results of the calculations in the article information storage portion 18 (step S83).

FIGS. 28A and 28B show examples of a set of articles selected by the article selection portion 15 this time and a set of articles presented at the previous operation.

In the foregoing case, articles A, B, C and D have been presented to the user at the previous operation, while articles E, F, G and H will be presented this time.

In the foregoing case, the calculations for obtaining similarity are performed such that 4×4=16 combinations, for example, a combination of article A and article E and that of article A and article F, are calculated.

A modification may be employed in which only articles satisfying a predetermined condition are made to be the subjects of the calculations for obtaining similarities.

If only the similarities of articles supplied from one information source are calculated in the cases shown in FIGS. 28A and 28B, the calculations for obtaining similarity of article E supplied from newspaper publishing company M this time are required to be performed with respect to only articles A and B supplied from newspaper publishing company M at the previous operation.
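The 16 combinations and the same-source restriction can be sketched as follows. The sources of articles A, B and E follow the text; the sources assigned to the remaining articles are hypothetical, since the contents of FIGS. 28A and 28B are not reproduced here.

```python
from itertools import product

previous = ["A", "B", "C", "D"]  # presented at the previous operation
current = ["E", "F", "G", "H"]   # selected this time

# Sources of A, B and E follow the text; the remaining sources are hypothetical.
sources = {"A": "M", "B": "M", "C": "N", "D": "O",
           "E": "M", "F": "N", "G": "O", "H": "N"}

# Every combination of one previous article and one current article.
all_pairs = list(product(previous, current))

# Restriction: compare only articles supplied from the same information source.
same_source_pairs = [
    (old, new) for old, new in all_pairs if sources[old] == sources[new]
]
pairs_for_e = [pair for pair in same_source_pairs if pair[1] == "E"]
```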

Another structure may be employed in the cases shown in FIGS. 28A and 28B, in which only articles each having a similarity with the user profile which is larger than a predetermined value are employed as the subjects of the calculations for obtaining similarities.

If only articles each having a similarity with the user profile of 0.8 or greater are made to be the subjects, only combination of article E and article A and that of article G and article A are required to be calculated.

FIG. 29 shows a flow of a process to be performed by the presentation information generating portion 17.

The presentation information generating portion 17 reads, from the article information storage portion 18, information of the articles selected by the article selection portion 15 this time, information of articles presented to the user until the previous operation and the similarity between articles calculated by the article similarity calculating portion 16 (steps S91 to S93). Then, information about the body of the article supplied this time is presented to the user together with information about relevant articles supplied until the previous operation (steps S94 and S95).

FIGS. 30 and 31 show examples in which information about the body is, together with information about the relevant articles supplied previously, presented.

In the example shown in FIG. 30, information of articles about a semiconductor and supplied until yesterday is, as added information, supplied in addition to information about the body as "semiconductor conference was . . . . ". Specifically, captions of articles supplied previously, information sources, the number of words and the similarity with the article presented this time are listed up.

In the foregoing example, an article of ○○ Times dated on the 15-th day is presented this time, while articles dated on the 14-th day and supplied from ○△ Times and ○○ Times are displayed as relevant articles supplied previously.

As shown in FIG. 31, information about articles as "Series: Semiconductor Friction (Part 1)" and "Series: Semiconductor Friction (Part 2)" supplied until yesterday from the newspaper publishing company ○○ is displayed in addition to information about the body of the article as "Series: Semiconductor Friction (Part 3)" presented this time.

FIGS. 21 and 23 showing the first embodiment also show modifications of the examples shown in FIGS. 30 and 31.

That is, also the foregoing system permits a user to make an access to the body of each relevant article supplied previously.

Although the examples shown in FIGS. 21 to 23 have the arrangement in which information about the body and added information are completely separated from each other, a structure may be employed in which information about the previous articles is included in information about the body.

FIG. 32 shows an example in which information about the relevant articles is included in information about the body of the article supplied this time.

In this example, a body of an article as "Earthquake XX off ○○ again activated" dated on the 19-th day is displayed. A portion of the first sentence as "Earthquake XX off ○○ of ○○ prefecture commenced on the 14-th day last month was . . . " is formed into a button which can be selected by a mouse or the like.

If a user selects the button, information about previous articles including information similar to the foregoing article is displayed.

FIG. 33 shows an example in which a list of previous articles considerably relating to the sentence is displayed in a case where the user has selected the first sentence.

In this example, captions of the articles dated on the 14-th day as "Earthquake occurred off ○○, magnitude 4", the information sources, the number of words and similarity with the article supplied this time are listed up.

FIG. 34 shows an example in which the body of the relevant article as "Earthquake occurred off ○○, magnitude 4" is displayed in a case where the user has selected the relevant article above.

Another structure may be employed in which one or more bodies of the relevant articles are displayed as shown in FIG. 34 immediately after the user has selected the first sentence as shown in FIG. 32.

To perform a method in which information about the previous relevant articles is included in information about the body of the article supplied this time as shown in FIG. 32, similarity between each component of the article supplied this time and the previous articles is calculated in place of calculating the similarity between the article supplied this time and previous articles.

As the components of the body, paragraphs, sentences, clauses, phrases and words may be employed.
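Component-level matching, taking sentences as the components, might be sketched as follows. The description does not specify the similarity measure, so word-set overlap (Jaccard similarity) and a threshold of 0.3 are used purely as assumptions; the function names and sample texts are hypothetical.

```python
import re


def words(text):
    """Lower-cased set of the words in a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Assumed similarity measure: size of the overlap divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_sentences(body, previous_articles, threshold=0.3):
    """Map each sentence index to the captions of similar previous articles."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]
    links = {}
    for index, sentence in enumerate(sentences):
        matches = [
            caption
            for caption, text in previous_articles.items()
            if jaccard(words(sentence), words(text)) >= threshold
        ]
        if matches:
            links[index] = matches  # this sentence would be formed into a button
    return links


body = "The earthquake off the coast became active again. Officials urged caution."
previous_articles = {
    "Earthquake occurred off the coast": "An earthquake occurred off the coast.",
}
links = link_sentences(body, previous_articles)
```

Only sentences that obtain at least one match would be formed into a selectable button, which corresponds to the first sentence in the example of FIG. 32.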

As a modification of this, also information about the previous relevant articles may be presented in the form of components of the body in place of the article unit.

For example, only the first paragraph may be displayed in place of displaying the overall body of the relevant article as shown in FIG. 34.

The arrangement in which access from the article presented this time to the previous relevant articles is permitted enables the user to easily follow the progress of an event whose state changes with the lapse of time, and to easily obtain information about a plurality of articles, such as a serialized article.

Moreover, the foregoing structure is effective for the user to again confirm the contents of an article in a case where the user has recalled the previous article.

Second Modification of First Embodiment

Another example of the structure of the article similarity calculating portion 16 and that of the presentation information generating portion 17 will now be described.

FIG. 35 shows a flow of a process to be performed by the article similarity calculating portion 16.

The article retrieval portion 14 calculates the similarity between the user profile and the article, that is, the article retrieval portion 14 uses the user profile as the retrieving formula to perform a usual retrieving operation in which the articles are subjects of the retrieval. On the other hand, the article similarity calculating portion 16 calculates the similarity between articles.

The calculations for obtaining the similarity are performed by subjecting expressions of articles shown in, for example, FIGS. 6 and 7 to a comparison. A result of the calculations is stored in the article information storage portion 18.

In this case, combinations of articles supplied this time are the subjects of the calculations for obtaining similarities among articles.

Although the similarities of all of the supplied articles may be calculated, a lower cost method will now be described in which only the similarities among the articles selected by the article selection portion 15 this time are calculated.

Although the similarities among the articles supplied this time are calculated similarly to the first embodiment, the first embodiment performs the calculations only for articles supplied from different information sources. This modification has no such limitation.

In the case where four articles have been selected by the article selection portion 15, the article similarity calculating portion 16 reads the articles from the article information storage portion 18 (step S101) to calculate the similarities of all combinations of the articles, such as that of article A and article B, that of article A and article C, that of article A and article D and that of article B and article D (step S102).
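The pairwise calculation of step S102 can be sketched as follows. This is a minimal illustration only: the description says the frequency vector "or the like" is compared, and the cosine measure, the whitespace tokenization and the names cosine_similarity and pairwise_similarities are assumptions introduced here, not taken from the specification.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-frequency vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pairwise_similarities(articles: dict[str, str]) -> dict[tuple[str, str], float]:
    """Compute one similarity per unordered pair of selected articles."""
    # Express each article as a word-frequency vector (naive whitespace split).
    vectors = {name: Counter(text.lower().split()) for name, text in articles.items()}
    return {
        (x, y): cosine_similarity(vectors[x], vectors[y])
        for x, y in combinations(sorted(vectors), 2)
    }
```

For four selected articles the function returns six values, one for each unordered pair.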

A structure may be employed in which only articles meeting a predetermined condition are made to be the subjects of the calculations for obtaining similarities.

FIG. 36 shows a flow of a process to be performed by the presentation information generating portion 17.

The presentation information generating portion 17 reads, from the article information storage portion 18, information of articles selected by the article selection portion 15 and similarity between articles calculated by the article similarity calculating portion 16 (steps S111 and S112). The presentation information generating portion 17 presents, to the user, information about the body of the article supplied this time together with information of other relevant articles supplied this time (steps S113 and S114).

FIG. 37 shows an example in which information about the body of the article supplied this time is displayed together with information about the other relevant articles supplied this time.

In the foregoing example, information about articles relating to semiconductors and dated the 15th is supplied as added information in addition to information about the body of the article "semiconductor conference" dated the 15th. As a result, there arises a risk that duplicated articles are unintentionally displayed, as in the first embodiment. In this case, the process for deleting duplicated articles employed in the first embodiment may be performed.

When information about the body of the article "XX Corporation monopoly-controlled share of semiconductors" is read, the article "Semiconductor Conference" is displayed in the area for added information, as shown in FIG. 38.

FIGS. 21 to 23 showing the first embodiment also show modifications of the structures shown in FIGS. 37 and 38.

That is, similarly to the first embodiment, a structure may be employed in which the user is permitted to make a direct access to the bodies of the relevant articles supplied today.

Reflection of Similarity Between Articles onto Presenting Order of Articles

Although the description has been made about the addition of information about relevant articles when each article is presented to a user, the presenting order of articles to the user may also be determined by using the similarity between the articles supplied this time.

FIGS. 39A and 39B show examples in which the similarity between articles is reflected on the article presenting order.

In this example, it is assumed that the user profile is a set of words relating to three different fields, namely "semiconductor technology", "low price personal computer" and "artificial intelligence".

When a retrieval is performed in accordance with the foregoing method, a result of retrieval in which words in three different fields are mixed is obtained, as shown in FIG. 39A.

When, for example, the eight upper-ranked articles, or the articles having a similarity of 0.80 or more with the user profile, are selected and presented to the user in the selected order, the user may unintentionally read the articles in an order such as semiconductor, low price personal computer, artificial intelligence, semiconductor and low price personal computer.

Although reading articles in the order of their similarity with the user profile is sometimes effective, when articles in a plurality of fields are mixed as in the foregoing case, collecting similar articles and displaying them together is considered easier for the user to understand, as shown in FIG. 39B.

In the foregoing example, the three leading articles relate to the semiconductor, the three ensuing articles relate to the low price personal computer and the two remaining articles relate to the artificial intelligence.
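One way to obtain an order like that of FIG. 39B is a greedy grouping, sketched below. The specification does not state how the grouping is performed, so the procedure, the threshold value and the name group_for_presentation are assumptions made for illustration.

```python
from typing import Callable, Hashable, Sequence


def group_for_presentation(
    ranked: Sequence[Hashable],
    similarity: Callable[[Hashable, Hashable], float],
    threshold: float = 0.5,  # hypothetical cut-off for "similar enough"
) -> list:
    """Reorder articles so that mutually similar ones are presented together.

    `ranked` lists article ids in descending similarity with the user profile.
    The best remaining article seeds a group, and every remaining article whose
    similarity with the seed reaches `threshold` is pulled in right after it.
    """
    remaining = list(ranked)
    ordered = []
    while remaining:
        seed = remaining.pop(0)
        ordered.append(seed)
        group = [a for a in remaining if similarity(seed, a) >= threshold]
        ordered.extend(group)
        remaining = [a for a in remaining if a not in group]
    return ordered
```

Within each group the original ranking is preserved, so the article most similar to the user profile still leads its field.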

As described above, the system according to the first embodiment has a structure in which the frequency vector or the like is used to compare the expressions of articles so that the similarities among articles are calculated. In accordance with the similarities, relevant articles relating to the article to be presented to the user are determined. Information about the relevant articles is added to information about the body of the article to be presented to the user and supplied to the user. It is preferable that the similarities are calculated among the articles presented this time, or between the articles supplied this time and previous articles. As a result, the relationship among the articles presented this time and the relationship between the articles presented this time and articles presented by previous filtering operations can be made clear. Thus, the relevance of articles can be displayed to the user.

When existence of duplicated articles is examined by calculating the similarity between articles, the information about the body of the duplicated article is not presented to the user but only information of the caption of the duplicated article can be added as information about relevant articles so as to be presented to the user. As a result, a fact that articles about the same contents obtained from a plurality of different information sources are presented in the duplicated manner can automatically be prevented.

As a result, when a plurality of articles are presented to a user by a single information filtering operation, the relationship among the articles can be clarified and presented to the user. Thus, it can be considered that the user is able to easily understand the articles.

Second Embodiment

A second embodiment of the information filtering system according to the present invention will now be described. The overall structure of the system according to this embodiment is the same as that shown in FIG. 1. A user profile for each user is stored and the user profile is used to retrieve articles. The user profile is a retrieval condition with which articles meeting topics in which the user is interested are retrieved.

FIG. 40 is a conceptual view of the user profile according to the second embodiment.

In this example, a certain user A has selected two topics consisting of "semiconductor technology" and "semiconductor trade". Another user B has selected three topics as "semiconductor trade", "low price personal computer" and "artificial intelligence".

At this time, the user profile of the user A is composed of retrieving conditions for retrieving articles relating to the "semiconductor technology" and retrieving conditions for retrieving articles relating to the "semiconductor trade". Similarly, the user profile of the user B is composed of retrieving conditions for retrieving articles relating to the "semiconductor trade" and retrieving conditions for retrieving articles relating to the "low price personal computer" and retrieving conditions for retrieving articles relating to the "artificial intelligence".

FIG. 41 shows the structure of an information filtering center 1-2 according to the second embodiment. As shown in FIG. 41, the information filtering center 1-2 comprises a user profile generating portion 21, a topic storage portion 22, an article information extracting portion 23, an article retrieving portion 24, an article selection portion 25, an added-information generating portion 26 and an article information storage portion 27. Among the components above, elements each surrounded by a dashed line, that is, the user profile generating portion 21, the article information extracting portion 23, the article retrieving portion 24, the article selection portion 25 and the added-information generating portion 26 can be realized by software which is executed by the central processing unit 14 shown in FIG. 1. The topic storage portion 22 and the article information storage portion 27 can be realized by the storage unit 5.

The user profile generating portion 21 is supplied with requirements and interests of each user. The requirements and interests of the user are expressed in the form of a natural language such as "I want to read articles relating to ○○ and XX", a set of keywords that frequently appear in a topic of interest, such sets given priority order and/or weights, or a retrieving formula for use in a usual document retrieval.

On the other hand, the user profile generating portion 21 performs a language process, such as extraction of words and development of synonyms to perform conversion into a format in which the retrieval is enabled so as to generate a user profile. The user profile for each user is stored in the topic storage portion 22.

The user profile generating portion 21 also has a relevance feedback function in which it receives feedback from the user as to whether the articles supplied to the user were useful, and modifies the retrieving conditions stored in the topic storage portion 22 so that the feedback is reflected.

The article information extracting portion 23 receives articles supplied from information sources and analyzes the morphemes, construction and format of each article so as to extract information relating to 5W1H, for example, the information source of the article, date, information about occurrence frequency of components of the document, such as characters, words and the like, and positions of appearance. The article information extracting portion 23 expresses the articles as a set of extracted information items. For example, the article information extracting portion 23 expresses the article with a vector whose components are the frequencies of the words that appear, or expresses the same by a 5W1H template into which the extracted values are substituted. Examples of expressions of the articles are the same as those according to the first embodiment shown in FIGS. 6 and 7.

The article information extracting portion 23 also performs an indexing process for realizing a quick retrieval of articles. Information of articles extracted by the article information extracting portion 23 is stored in the article information storage portion 27.

Referring to FIG. 42, a flow of a process to be performed by the article retrieving portion 24 will now be described.

The article retrieving portion 24 makes references to the conditions for retrieving the topics stored in the topic storage portion 22 and information about articles extracted by the article information extracting portion 23 to retrieve supplied articles that meet the topics. The foregoing operation corresponds to a calculation of the similarity between the topic and the supplied article. The similarity sometimes is, depending upon the method of the retrieval, formed into discrete values, such as "adapted to the topic" or "not adapted to the topic", or continuous values in such a manner that articles adapted satisfactorily have greater similarities. The description will hereinafter be made about the more usual case where the similarities are formed into continuous values.

The article retrieving portion 24 subjects each topic to the following process.

Initially, the article retrieving portion 24 substitutes 1 for variable i (step S121), and then fetches retrieving conditions of the i-th topic (topic 1) from the topic storage portion 22 (step S122). Then, the article retrieving portion 24 substitutes 1 for variable j (step S123), and then calculates the similarity between topic i (topic 1) and supplied article j (supplied article 1), followed by storing the similarity in the article information storage portion 27 together with information about the satisfied retrieving conditions (step S124). The calculations for obtaining similarities correspond to a usual retrieval process in which references are made to the expression of articles and retrieval indexes stored in the article information storage portion 27.

Then, the article retrieving portion 24 updates the value of the variable j by increasing it by one, and then determines whether the value of j is larger than the number of supplied articles (steps S125 and S126). If the value of j is not larger than the number of the supplied articles, the article retrieving portion 24 determines that articles, the similarities of which have not been calculated, remain, and, thus, repeats steps S124 to S126 until the value of j is made to be larger than the number of the supplied articles. When all of the supplied articles have been subjected to the calculations for obtaining the similarity with the topic i, the article retrieving portion 24 sorts the supplied articles in the descending order in terms of the similarity with the user profile to rank the articles (step S127). A result of the ranking operation is stored in the article information storage portion 27.

Then, the article retrieving portion 24 updates the value of the variable i by increasing it by one, and then determines whether the value of i is larger than the number of all topics (steps S128 and S129). If the value of i is not larger than the number of all topics, the article retrieving portion 24 determines that topics, the similarities of which have not been calculated, remain, and, thus, repeats steps S122 to S129 until the value of i is made to be larger than the number of all topics.
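The double loop of steps S121 to S129 can be sketched as follows. The similarity measure is not fixed by the description, so keyword_overlap (the fraction of a topic's keywords found in an article) is a stand-in assumed here, and the names rank_articles_by_topic and keyword_overlap are hypothetical.

```python
def keyword_overlap(condition: set[str], text: str) -> float:
    """Assumed stand-in similarity: fraction of topic keywords found in the text."""
    words = set(text.lower().split())
    return len(condition & words) / len(condition) if condition else 0.0


def rank_articles_by_topic(
    topics: dict[str, set[str]],
    articles: dict[int, str],
    similarity=keyword_overlap,
) -> dict[str, list[tuple[int, float]]]:
    """Return, for each topic, the supplied articles sorted by descending similarity."""
    rankings = {}
    for topic, condition in topics.items():            # outer loop over i (S121, S128, S129)
        scored = [
            (art_id, similarity(condition, text))      # inner loop over j (S123 to S126)
            for art_id, text in articles.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # ranking (S127)
        rankings[topic] = scored
    return rankings
```

Because every topic is ranked once for all users, the result can be reused for each user who has selected the same topic.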

FIG. 43 is a conceptual view showing supplied articles with respect to the topics ranked by the article retrieving portion 24. Thus, the supplied articles are ranked in topic units.

FIG. 44 shows a flow of a process to be performed by the article selection portion 25.

The article selection portion 25 selects articles to be presented to each user from results of retrieval of the topics obtained by the article retrieving portion 24 and stored in the article information storage portion 27.

That is, the article selection portion 25 substitutes 1 for the variable i (step S131), and then fetches the user profile of user i (user 1) from the topic storage portion 22 (step S132). Then, the article selection portion 25 substitutes 1 for the variable j (step S133), and then fetches a result of retrieval of the topic j (topic 1) of the user i so as to select an article to be presented to the user (steps S134 and S135). As a method of selecting the article, a method may be employed, for example, in which a number N of articles to be presented to the user is previously determined by the information filtering center 1-2 so as to present the N upper-ranked articles. Another method may be employed in which articles having a similarity with the user profile which is larger than a certain threshold are selected. Information of the selected articles is stored in the article information storage portion 27.

Then, the article selection portion 25 updates the value of the variable j by increasing it by one, and then determines whether the value of j is larger than the number of topics specified by the user i (steps S136 and S137). If the value of j is not larger than the number of the specified topics, the article selection portion 25 recognizes that results of retrieval of other topics which have not been selected remain, and, thus, repeats steps S134 to S137 until the value of j is made to be larger than the number of the topics of the user i. When articles with respect to all topics of the user i have been selected, the article selection portion 25 updates the value of the variable i by increasing it by one, and then examines whether the value of i is larger than the number of all users (steps S138 and S139). If the value of i is not larger than the number of all users, the article selection portion 25 recognizes that users, for which articles have not been selected, remain, and, thus, repeats steps S132 to S139 until the value of i is made to be larger than the number of all users.

As a result of the process above, a result of "semiconductor trade", that of "low price personal computer" and that of "artificial intelligence" are fetched for the user who has selected three topics, for example, "semiconductor trade", "low price personal computer" and "artificial intelligence", as shown in FIG. 45. From the upper articles, an article to be presented to the user is selected.
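The selection of steps S131 to S139 can be sketched as follows, covering both selection methods mentioned above (the N upper-ranked articles, or the articles above a threshold). The function names and the data layout are assumptions made for illustration.

```python
from typing import Optional

Ranking = list[tuple[int, float]]  # (article id, similarity), best first


def select_articles(
    ranking: Ranking,
    top_n: Optional[int] = None,
    threshold: Optional[float] = None,
) -> Ranking:
    """Keep articles above `threshold` and/or the `top_n` upper-ranked ones."""
    chosen = ranking
    if threshold is not None:
        chosen = [(art, sim) for art, sim in chosen if sim > threshold]
    if top_n is not None:
        chosen = chosen[:top_n]
    return chosen


def select_for_users(
    user_topics: dict[str, list[str]],
    rankings: dict[str, Ranking],
    top_n: Optional[int] = None,
    threshold: Optional[float] = None,
) -> dict[str, dict[str, Ranking]]:
    """For each user (loop over i) and each of the user's topics (loop over j),
    fetch the topic's ranking and select the articles to present."""
    return {
        user: {t: select_articles(rankings[t], top_n, threshold) for t in topics}
        for user, topics in user_topics.items()
    }
```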

FIG. 46 shows a flow of a process to be performed by the added-information generating portion 26.

The added-information generating portion 26 performs the following process for all users.

Initially, the added-information generating portion 26 substitutes 1 for the variable i (step S141), and then fetches the user profile of the user i (user 1) from the topic storage portion 22 (step S142). Then, the added-information generating portion 26 fetches the articles selected by the article selection portion 25 so as to be presented to the user 1 and information about the retrieving conditions satisfied by the articles (step S143).

Information relating to the retrieving conditions satisfied by the article is information about the topic selected by the user and satisfied by the article and information about satisfied retrieving conditions. The retrieving conditions are conditions needed to be satisfied by the article, for example, the subject of the article or the subject of the action, and expressed by a Boolean expression or a natural language for use in a usual document retrieval or a format which can be processed by the article retrieving portion 24.

Then, the added-information generating portion 26 adds, to the article selected by the article selection portion 25, information relating to the retrieving conditions satisfied by the article to present the information item to the user i (step S144). The added-information generating portion 26 updates the value of variable i by increasing it by one, and then examines whether the value of i is larger than the number of all users (steps S145 and S146). If the value of i is not larger than the number of all users, the added-information generating portion 26 recognizes that users having no added-information remain. Thus, the added-information generating portion 26 repeats steps S142 to S146 until the value of i is made to be larger than the number of all users.

FIG. 47 shows an example of display formed by adding information of the topic satisfied by each article to the caption of the article selected for a certain user and presented to the user. It is assumed here that the user has selected three topics, "semiconductor trade", "low price personal computer" and "artificial intelligence".

In this case, six captions of articles are presented to the user such that three articles meet "semiconductor trade", two articles meet "low price personal computer" and one meets both of "semiconductor trade" and "low price personal computer".

Even if one article meets a plurality of topics as described above, the ground causing the article to be presented is displayed.

In the foregoing case, the values of the similarities between the satisfied topics and the articles, calculated by the article retrieving portion 24 at the time of performing the retrieval, are displayed at the end of each line.

Since an article given number 6 meets two topics, it has two similarities consisting of a value of similarity with "semiconductor trade" of 1.05 and a value of similarity with "low price personal computer" of 0.80.
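A display line of the kind described above can be assembled as in the following sketch. The line layout and the names annotate and format_line are assumptions; only the idea of attaching the satisfied topics and their similarities to each caption comes from the description.

```python
def annotate(selected: dict[str, list[tuple[int, float]]]) -> dict[int, list[tuple[str, float]]]:
    """Invert {topic: [(article id, similarity)]} into {article id: [(topic, similarity)]}."""
    grounds: dict[int, list[tuple[str, float]]] = {}
    for topic, hits in selected.items():
        for art_id, sim in hits:
            grounds.setdefault(art_id, []).append((topic, sim))
    return grounds


def format_line(art_id: int, caption: str, grounds: list[tuple[str, float]]) -> str:
    """Caption followed by every satisfied topic and its similarity (assumed layout)."""
    topics = ", ".join(f"{topic} ({sim:.2f})" for topic, sim in grounds)
    return f"{art_id}. {caption} [{topics}]"
```

An article that satisfies two topics, such as the article given number 6, therefore carries two topic names and two similarity values on its line.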

FIGS. 48A and 48B show examples in which the numbers of articles each of which has satisfied each topic are presented to the user shown in FIG. 47.

In the example shown in FIG. 48A, the numbers of articles which have satisfied each of the topics selected by the user are formed into a table so as to be presented to the user.

Since articles having numbers 1, 2, 3 and 6 shown in FIG. 47 meet "semiconductor trade", 4 is displayed as the number of articles. Similarly, since articles having numbers 4, 5 and 6 shown in FIG. 47 meet "low price personal computer", 3 is displayed as the number of articles. Since no article exists that meets "artificial intelligence" in this case, 0 is displayed as the number of articles.

Since the four articles meeting "semiconductor trade" and the three articles meeting "low price personal computer" include one duplicated article, 6 is presented to the user as the total number of articles.
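The counts of FIG. 48A follow from simple set arithmetic, as the sketch below illustrates. The function name topic_counts and the representation of each topic as a set of article numbers are assumptions made for illustration.

```python
def topic_counts(topic_articles: dict[str, set[int]]) -> tuple[dict[str, int], int]:
    """Return the number of articles per topic and the number of distinct articles."""
    per_topic = {topic: len(ids) for topic, ids in topic_articles.items()}
    # An article meeting several topics is counted once in the total.
    total = len(set().union(*topic_articles.values()))
    return per_topic, total
```

With four articles for one topic, three for another and one article common to both, the total is 4 + 3 - 1 = 6.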

As a modification of the foregoing example, the number of the articles meeting a plurality of topics may be individually counted as has been performed with the article having article No. 6 shown in FIG. 47.

In this case, the number of articles meeting, for example, "semiconductor trade" shown in FIG. 48B is three in terms of the number of the articles meeting only the foregoing topic.

In the example shown in FIG. 48B, information of the number of articles meeting the topics selected by the user is displayed in the form of a Venn diagram.

In this example, a fact is displayed in which three articles respectively having numbers 1, 2 and 3 shown in FIG. 47 meet only "semiconductor trade", two articles having numbers 4 and 5 meet only "low price personal computer" and the article having number 6 meets both of the two topics.
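The regions of such a Venn diagram can be derived by grouping the articles by the exact set of topics they meet, as in the following sketch. The name venn_regions and the data layout are assumptions made for illustration.

```python
def venn_regions(topic_articles: dict[str, set[int]]) -> dict[frozenset, set[int]]:
    """Map each combination of topics to the articles meeting exactly that combination."""
    regions: dict[frozenset, set[int]] = {}
    for art_id in set().union(*topic_articles.values()):
        met = frozenset(t for t, ids in topic_articles.items() if art_id in ids)
        regions.setdefault(met, set()).add(art_id)
    return regions
```

The size of each region gives the number displayed in the corresponding area of the diagram.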

In this example, the relationship between the number of articles meeting the topics and the number of all articles can be made clearer as compared with the example shown in FIG. 48A.

FIG. 49 shows an example of display presented to the user such that summaries, extractions or bodies of articles selected for a certain user are collected in each topic so as to be presented to the user.

The summary is a text formed in such a manner that the body of the original article is processed to enable the user to recognize the gist, while the extraction is a text which is a portion of the body of the original article and which has been extracted without any process.

In this example, three articles relating to "semiconductor trade" are lined up and first displayed, and articles relating to "low price personal computer" follow the foregoing articles.

As described above, topics to which the articles, to be presented to the user, adapt are displayed so that the user recognizes the contents of the articles and determines the articles to be read. Thus, the user is able to efficiently collect information.

FIG. 50 shows an example of display presented to the user such that information relating to the retrieving conditions satisfied by the article is added as header information of the body of the article.

In this case, a fact that the displayed article meets "semiconductor trade" among the topics selected by the user is displayed in the line of the "subject topic".

A fact that the similarity between "semiconductor trade" and the article is 1.32 is displayed below the "subject topic".

Moreover, retrieving conditions employed to retrieve the articles relating to "semiconductor trade" and conditions among the foregoing conditions that have been satisfied by the displayed article are lined up and displayed.

In the body shown in FIG. 50, a portion of the text is emphasis-expressed.

The emphasis-expression is a display in which a portion of a text is emphasized relative to other portions by using an additional symbol, such as an underline, a different font, or characters of a different size or color.

In this example, it is assumed that the retrieving condition set for retrieving articles meeting the topic "semiconductor trade" is a condition that "words, such as semiconductor, IC and procurement, are included in the body".

Since the article meets the foregoing condition, words "semiconductor", "IC" and "procurement" in the first sentence of the body are emphasized to clearly display the foregoing fact.

As a modification of this example, the word, for example, "IC" in the "caption of the article" may be emphasized.

As a result of the emphasis-expression above, the user is able to recognize the ground of the retrieval and presentation of the displayed article.
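The emphasis-expression can be sketched as a text substitution that wraps each matched word in a marker. The asterisk marker stands in for an underline or a font change, and the name emphasize and the whole-word, case-insensitive matching rule are assumptions made for illustration.

```python
import re


def emphasize(text: str, words: list[str], mark: str = "*") -> str:
    """Wrap every whole-word, case-insensitive occurrence of `words` in `mark`."""
    if not words:
        return text
    # Longer expressions first, so that a phrase wins over a word it contains.
    alternatives = "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{mark}{m.group(0)}{mark}", text)
```

The same function could be applied to the caption of the article instead of the body.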

Since the text in the emphasized portion usually contains an important fact, it can be considered that the user is able to recognize the contents of the article by skimming through the article.

This improves the efficiency of operations such as determining the usefulness of the presented article for relevance feedback.

FIGS. 51A, 51B, 52 and 53 show examples in which the satisfied retrieving condition is emphasized in the article so that the usefulness of the article is efficiently determined.

FIG. 51A shows an example of a retrieving condition for retrieving article meeting a topic as "natural language process".

In this example, an article having the body containing language expressions as "natural language process", "NL", "machine translation" and "kana-kanzi conversion" has a high point.

If expressions as "natural language" and "analysis" are allowed to appear in one sentence, the article has a high point.

Moreover, various conditions for retrieving articles are written.
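Retrieving conditions of the two kinds described above (expressions contained in the body, and expressions appearing together in one sentence) can be scored as in the following sketch. The point values, the sentence splitting rule and the name score_article are assumptions; the actual conditions of FIG. 51A are not reproduced here.

```python
import re


def _contains(expression: str, text: str) -> bool:
    """Whole-word, case-insensitive test for an expression."""
    return re.search(rf"\b{re.escape(expression)}\b", text, re.IGNORECASE) is not None


def score_article(
    body: str,
    expressions: list[str],
    cooccurring: list[str],
    expression_points: float = 1.0,     # hypothetical weight per matched expression
    cooccurrence_points: float = 1.0,   # hypothetical weight for the sentence condition
) -> float:
    """Add points for each expression in the body, plus points if all
    `cooccurring` expressions appear together in a single sentence."""
    score = sum(expression_points for e in expressions if _contains(e, body))
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", body):
        if cooccurring and all(_contains(w, sentence) for w in cooccurring):
            score += cooccurrence_points
            break
    return score
```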

FIG. 51B shows an example of an article retrieved by using the retrieving conditions shown in FIG. 51A and presented to the user. Since the foregoing article meets the retrieving conditions as "a language expression as a natural language process is contained in the body", the expression "natural language process" in the article is emphasized. The portion of the sentence containing the emphasized expression "natural language process" is such that "this software does not use natural language process and simple character strings are used to perform the retrieval". Thus, the user is able to quickly understand that the foregoing article does not relate to the natural language process.

Since the user is able to determine that the foregoing article is not needed to be read, the user is able to collect information by reading only articles considered to be useful or efficiently perform relevance feedback.

Also FIGS. 52A and 52B show examples for quickly determining a fact that the article is not useful, similarly to FIGS. 51A and 51B.

In this example, English texts are retrieved. FIG. 52A shows conditions for retrieving a topic "artificial intelligence".

In this example, an article containing the words "artificial" and "intelligence" is arranged to be given a high point.

FIG. 52B shows an example of an article retrieved by using the retrieving conditions shown in FIG. 52A and presented to the user, in which a word "artificial" is emphasized.

The structure shown in FIG. 52B enables t