Method for presenting information on display devices of varying sizes5848406Abstract A method for presenting information on display devices of varying sizes. The method processes a query of a database and determines which tables are most likely to contain information which the user wishes to primarily view. The join relationships between tables of the database are used to determine the priority for displaying tables. Those join relationships which were used more frequently in previous queries are accorded higher weight scores and correspondingly have a greater chance of being prioritized to be displayed on the computer screen. Also, a user may select a particular table to be focused upon. In that situation, the method displays not only the focused table but also the tables with a direct relationship to it. Claims We claim: Description BACKGROUND OF THE INVENTION
______________________________________
Create weighted graph G;
For each column of each table forms node n in graph G, Initialize set
of edges E to NULL;
For each join query J involving tables T.sub.i and T.sub.j begin
if edge e involving T.sub.i and T.sub.j already exists in E
then
add 1 to weight of e
else
create edge e and associate it with T.sub.i and T.sub.j,
set weight of e to be 1
end if
end
______________________________________
The graph thus constructed represents the strength of relationships between columns of tables in the database schema. PRUNING THE GRAPH Once this graph is constructed, the graph is then pruned by dropping all edges whose weight is below some selected threshold value. The threshold value is chosen in an iterative manner so that the largest resulting subset of tables still fits in the available screen space. The resulting graph is processed to choose sets of tables which are connected to each other by at least one edge whose weight exceeds the threshold value. Each set of such connected tables forms what is termed a Table Connected Schema Sub-component (TCSS) of the graph, or simple connected schema sub-component. Notice that the notion of connected sub-components in the graph is not the same as the traditional meaning of the term in graph theory. Here, the fact that individual nodes of the graph belong to a parent table is exploited. Two tables are connected if there is an edge between any of their component nodes (i.e., columns). The mechanism to obtain Table Connected Schema Sub-components is described in the following pseudocode:
______________________________________
For every edge e in G with weight < threshold, delete e from G
Resultant graph is G'
Initialize Remaining Tables {RT} to be set of all tables in metadata
Initialize current-set, {CS} to NULL
repeat forever
begin
if {RT} is empty
then
Output tables in {CS} as the last TCSS
exit loop
end if
t = any table from {RT}
remove t from {RT}
add t to {CS}
for every edge e in G that is associated with any table t' in CS
begin
edge e is associated with tables t' and t.sub.other
add t.sub.other to {CS}
remove e from G
end for
end repeat
______________________________________
An example is shown in FIG. 2. The schema contains 6 tables and each table (R.sub.1 100, R.sub.2 102, R.sub.3 104, R.sub.4 106, R.sub.5 108, R.sub.6 110) contains columns as shown (a.sub.1, a.sub.2 etc.). The columns with edges drawn between them (both solid as well as dotted) correspond to the join paths derived from a set of SQL queries against these tables. For example, there is a join path 114 between R.sub.1 -a.sub.1 and R.sub.2 -a.sub.1. The number above each edge indicates the weight of the edge. For example, join path 112 has weight 114 of "3". The graph corresponding to this schema contains 17 nodes (one for each column in every table in the schema), and 6 edges as drawn in FIG. 2. Given a threshold value of 3, the only three edges which have at least this value are shown as solid lines in the FIG. 2: join path 112, join path 116, and join path 118. The edges with dotted lines are those whose weights are lower than the threshold value. Such a graph yields three connected sub-components: S.sub.1 containing tables R.sub.1 100, R.sub.2 102 and R.sub.4 106; S.sub.2 containing tables R.sub.3 104 and R.sub.5 108; and S.sub.3 containing just table R.sub.6 110. It should be understood that the present invention is not limited to the situation where the database resides solely on one computer, but also encompasses the situation where the database and its tables may be distributed across several computers. Also, the present invention is not limited to where tables to be queried are contained only in one database, but may be found in several databases. By choosing the threshold value appropriately (depending on the weights of the edges and connectivity of nodes in the original graph, as well as the display size), a set of connected schema sub-components can be constructed such that each set of tables fits in one display screen. Notice that these tables are logically connected to each other since they are derived from frequently occurring join paths between them. Considering that the join paths were themselves obtained from a suite of frequently used queries, it is expected that these table groupings represent the information that a user would like to see together. If the initial threshold chosen is too high, there may be many sub-components which contain very few tables together. Therefore the threshold value is reduced until a suitable value is found. Notice also that if no join information is available, there are many sub-components each of which contains just one individual table, i.e. there will be as many TCSSs as tables in the schema. This corresponds to the situation where the preprocessing has been unable to yield natural groupings of tables in the schema. The user may choose to either view the tables one by one or view all the tables together and use traditional methods of navigation (such as scroll bars). FIG. 3 shows how the correct set of tables to be displayed is chosen. The start block 200 indicates that processing begins at block 210. Block 210 sets the threshold to the highest edge weight. Block 220 gets all the TCSS from graph G. Decision block 230 inquires whether the largest TCSS can fit within the size of the computer output display. If it cannot, then block 240 reduces the threshold by 1. Decision block 250 inquires whether the threshold is 0. If it is, processing continues at block 260; else processing returns to block 220. If decision block 230 determines that the largest TCSS can fit within the size of the computer output display, then block 260 displays the tables that are in the TCSS. Processing for this aspect of the present invention then terminates at stop block 270. ADDING FOCUS TO A TABLE For each of the tables displayed, only those columns that participate in joins between the various tables are displayed on the screen. Once a group of related tables is presented to the user, there is an option of making a specific table the focus. Making a table a focus involves displaying all the columns of the given table, and in addition, all other columns (belonging to other tables) that share an edge with columns belonging to the table in focus. This results in displaying all the tables which have columns sharing edges with the columns of the table in focus. The additional tables that are now displayed are those which share edges with the columns of the table in focus--regardless of the weight of the edge involved. In this fashion, a user could navigate the entire schema by making different tables the focus. Of course, if when a particular table is made the focus and too many other associated tables are to be shown, there can be scrolling through the screen or having a second threshold value to prune the number of items being displayed. The idea of making a table the focus is to indicate interest in all other tables that are logically related to this focused table. For example, referring back to FIG. 2, while displaying sub-component S.sub.2 only tables R.sub.3 104 and R.sub.5 108 are shown on the screen. If, now, R.sub.3 104 is made the focus, the additional tables that are displayed alongside the original two tables are R.sub.4 106, R.sub.6 110 and R.sub.1 100, since all these tables have edges in common with columns in R.sub.3 104. FIG. 4 shows this procedure of making table T the focus. The start block 300 indicates that processing begins at block 310. Block 310 gets the graph G. Decision block 320 determines if there are more edges in the graph. If there are no more, then block 330 adds table T to the set of focus tables and processing for this aspect of the invention terminates at stop block 340. If Decision block 320 determines that there are more edges in the graph, then block 350 gets edge E. Next, decision block 360 determines if that edge E involves table T. If it does not, then processing returns to decision block 320. If decision block 360 does determine that edge E involves table T, then block 370 obtains the other table T1 involved in edge E. Processing continues at decision block 380. If Decision block 380 determines that T1 is displayed, then processing returns to decision block 320. If it is not, then block 390 displays T1 and adds T1 to the list as introduced by T. REMOVING FOCUS FROM A TABLE When a table is removed as the focus, all the tables containing edges of weights lower than the threshold value that the table of focus introduced into the display are removed. Since there is no guarantee that the user will remove tables as focus in the reverse order that they were made foci, the order in which tables were made foci is maintained so that they can be undone in the reverse order. This is somewhat similar to hypertext navigation systems, i.e. navigation of a schema in this fashion is akin to a stack of traversals. The first degree neighbors of a table, T, is defined to be all other tables whose columns share an edge with columns from T. This is a schema wherein making a table the focus involves displaying all first degree neighbors of this table. One could extend this notion and display nth order neighbors where n is a tunable parameter. FIG. 5 shows this procedure of removing focus from table T. The start block 400 indicates that processing begins at decision block 410. If decision block 410 determines that the focus-tables do not contain table T, then processing branches to block 450 which will be discussed below. If decision block 410 determines that the focus-tables do contain table T, then processing continues at block 420 which gets the next table T1 from the focus-tables. Decision block 430 determines if table T is the same as table T1. If they are, then processing branches to block 450. If T is not the same as T1, then for each table T2, block 440 removes table T2 from the list of tables being displayed. Block 450 displays the list of tables and processing terminates for this aspect of the invention at the stop block 460. EXAMPLE The following example uses tables R1 510, R2 512, R3 514, R4 516 with the relationships shown in FIG. 6 (these relationships are similar to the ones shown in FIG. 2). Table R1 may assume the following values:
______________________________________
Employee Grade Employee Id
______________________________________
1 354
1 355
1 356
2 320
2 322
2 327
2 328
3 301
3 302
* *
* *
* *
20 1
______________________________________
Table R2 may assume the following values:
______________________________________
Employee Grade Salary
______________________________________
1 $15,000
2 $15,500
3 $15,750
* *
* *
* *
20 $150,000
______________________________________
Table R3 may assume the following values:
______________________________________
County of Employee Home Home Phone
Residence
Employee Id
Address Number
______________________________________
Cuyahoga
1 123 Main Street
(301) 588-8437
* * * *
* * * *
* * * *
Westview
301 345 Dogwood Road
(315) 384-4823
Timberwood
302 2441 Wakefield Drive
(315) 384-4341
Cuyahoga
320 842 Dirger Street
(301) 588-3429
Westview
322 142 Beaumont Blvd.
(315) 384-6893
Westview
327 251 Troy Street
(315) 384-5372
Westview
328 623 Cardinal Road
(315) 384-5672
Westview
354 972 Industrial Street
(315) 384-1592
Timberwood
355 482 Billers Drive
(315) 384-7473
Timberwood
356 848 Rosemont Street
(315) 384-2492
______________________________________
Table R4 may assume the following values:
______________________________________
Employee Name Employee Id
Work Phone Number
______________________________________
B. Williams 1 (315) 575-2543
* * *
* * *
* * *
T. Burke 301 (315) 575-2501
K. Niems 302 (315) 575-2534
J. Peters 320 (315) 575-2502
O. Smith 322 (315) 575-2523
Y. Timms 327 (315) 575-2587
A. Gibson 328 (315) 575-2500
R. Copston 354 (315) 575-2423
K. Harrington 355 (315) 575-2599
N. Sagy 356 (315) 575-2589
______________________________________
A user may request that data from the tables R1, R2, R3, and R4 be sent to the user's PDA screen by specifying the following query: SELECT DISTINCTROW R1.›Employee Grade!, R1.›Employee Id!, R2.Salary, R4.›Employee Name!, R4.›Work Phone Number!, R3.›County of Residence!, R3.›Employee Home Address!, R3.›Home Phone Number!, * FROM ((R1 INNER JOIN R2 ON R1.›Employee Grade!=R2.›Employee Grade!) INNER JOIN R3 ON R1.›Employee Id!=R3.›Employee Id!) INNER JOIN R4 ON (R4.›Employee Id!=R3.›Employee Id!) AND (R1.›Employee Id!=R4.›Employee Id!) ORDER BY R1.›Employee Grade!; The query shown above results in the following four paths each of weight one being generated: (R1.Employee Grade, R2.Employee Grade) (R1.Employee Id, R3.Employee Id) (R3. Employee Id, R4. Employee Id) (R1.Employee Id, R4.Employee Id) The present invention processes the query in accordance with the method disclosed above to display results 580 that will fit within the screen 590 of the PDA 600 as shown in FIG. 7. The preferred embodiment has a SQL parser that, given, a set of queries (Select queries) produces join path sets of the form R.sub.i.A.sub.j,R.sub.k.A.sub.l . . . Each such set contains attributes that occur together in a single query and contribute one edge for every pair in the set. Using the above sets of generated join path information and a schema's metadata (i.e. names of tables and their columns), another program generates the graph describing the path information. Given a threshold (currently an integer value), this program then generates the various sub-components of the graph with table grouping information. There is also a command line interface which allows different tables to be the focus (and subsequently remove them). All the code mentioned above runs on Unix, but the present invention is not limited to this specific software. For example, another embodiment can run on Windows 3.1. Given a schema and a threshold value, the program determines the largest connected sub-component and displays it on the screen. Using mouse interactions, different tables can be made foci (and, later, removed as foci as well) in the manner discussed earlier. FIG. 8 shows another embodiment of the invention. FIG. 8 is a network diagram showing tables placed in separate databases across a local network which is connected to a printer and the Internet. A user on computer 650 may query for information contained in tables contained in database 660 on computer 662 and database 670 on computer 672. This embodiment can use the present invention to prioritize information that is to be sent across a network (such as the Internet 680). In this situation, the present invention can send information to the user on computer 650 which the user is more likely to primarily wish to view. The present invention can also direct the output to a printer 690 located on a local network 700. The embodiments which have been set forth above were for the purpose of illustration and were not intended to limit the invention. It will be appreciated by those skilled in the art that various changes and modifications may be made to the embodiments described in this specification without departing from the spirit and scope of the invention as defined by the appended claims.
|
Same subclass Same class Consider this |
||||||||||
