System and methods for intelligent analytical graphing5375201Abstract An electronic spreadsheet system includes an intelligent analytical graphing module for creating complex analytical graphs from user-supplied information. Methods are described for automatically creating one or more desired analyses on a spread of user-supplied information, all without the user having to modify his or her information or supply (or even understand) the operations necessary for a given analysis. The system includes a preferred interface for receiving selections from the user indicating one or more analyses to be performed on the information, and for displaying analytical graphs for visualizing the result of a selected analysis. Claims What is claimed is: Description COPYRIGHT NOTICE
______________________________________
Button Function
______________________________________
Erase Shortcut for /Edit Erase Block.
copy (CPY) Shortcut for /Edit Copy.
Move (MOV) Shortcut for /Edit Move.
Style (STY)
Shortcut for /Style Use Style.
Align (ALN)
Displays the /Style Alignment menu.
Font (FNT) Displays the /Style Font menu.
Insert (INS)
Shortcut for /Edit Insert.
Delete (DEL)
Shortcut for /Edit Delete.
Fit Shortcut for /Style Block Size Auto Width.
Sum Uses @SUM to total rows, columns, or both.
Format (FMT)
Displays the /Style Numeric Format menu.
CHR Switches to text (character) mode display.
WYS Switches to WYSIWYG display.
BAR Displays more buttons.
______________________________________
For an additional description of the general features and operation of the system 130, see Campbell, M., Quattro Pro 4.0 Handbook, 4th Ed., 1992, the disclosure of which is hereby incorporated by reference. C. Graphing As shown in FIG. 1C, the system 130 includes the graphing module 130 for creating graphs from spreadsheet data. A graph is a visual representation of numeric information, often revealing information not immediately apparent in a table of numbers. Graphs offer a fresh perspective on the data stored in the spreadsheets. Often a graph will uncover a trouble spot or pinpoint the beginning of a new pattern. The user can then return to the spreadsheet data for a look at the data behind such areas. Graphs help both with analyzing past or present data and with visualizing future situations. Just as the user can set up a database to create what-if scenarios, the user can use graphs to help foresee future directions. System 130 provides a spreadsheet interface 200, as shown in FIG. 2B. The interface 200 includes a Graph menu 213 having a plurality of graphing tools. Graph Type lets the user choose the basic graph type desired, such as Line, Bar, XY, Stacked Bar, Pie, Area, Rotated, Bar, Column, High-Low, Text, Bubble, and 3-D graphs. Series lets the user specify up to six blocks of data to plot and one block of data to use as labels for the x-axis; the Series selection includes an Analyze submenu for allowing the user to transform series data (as described hereinbelow). Text lets the user add text to graphs, as well as change the typestyle, size, and color of that text. Customize Series lets the user embellish graphs by changing marker symbols, colors, and fill patterns, and using a different graph type or y-axis scale for an individual series. X-Axis contains options related to the x-axis, including scaling and adjusting tick marks. Y-Axis contains options related to the y-axis, including scaling and tick marks. Overall lets the user add a three-dimensional effect to a graph, set grid options, and change background colors. Insert lets the user paste a graph directly into the spreadsheet. Hide removes a graph from the spreadsheet. Name lets the user store graphs with the current spreadsheet, display a graph previously saved, delete graphs, create a slideshow, and copy graphs to other spreadsheets. View displays the current graph; to return to the Graph menu, the user presses any key except the slash key (/). Pressing / while a graph displays takes the user directly into the Annotator. Fast Graph automatically creates a graph from the block of data the user has specified. Annotate activates the Annotator, where the user can add text, arrows, lines, colors, clip art, and geometric shapes to graphs. The basic procedure for building a graph is as follows. The user selects the type of graph he or she wants to create (for example, a line graph or pie graph). The type of graph the user may choose usually depends on the analysis the user wants to perform and what he or she wants to illustrate. The line graph is the most common business graph. It connects each value in a series with a line. If there are several series, a separate line represents each series. Line graphs plot values from left to right, in the order in which they appear in the spreadsheet. A line graph makes it easy to see dips and rises in a series of numbers, and they are often used to plot data over time to review patterns and predict trends. Next, the spreadsheet data to be plotted are specified. In an exemplary embodiment, the user can specify up to six sets, or series, of values (with a /Graph Series command). Upon selecting 1st Series from the Graph menu, for example, the user is prompted for a block of data. (If a block was previously assigned, it appears as the default and is highlighted.) Here, the user specifies the block of values to plot as the first series. To point out a block with the pointing device 105 (e.g., mouse), the user simply drags from one corner of the block to the opposite corner, then releases. To plot more than one series, the user simply repeats the process, specifying a block for each desired series. In addition to the series to plot on the graph, an X-Axis Series is provided to allow the user to specify a block of labels or values to display along the x-axis as tick labels, to help define the data being plotted; unlike the y-axis, it generally has no numeric meaning and is simply used to place labels, from left to right, in the order in which they appear in the spreadsheet. Group selection is provided to let the user specify a block of values for the system to divide into separate series--either one column or one row per series. In addition to selecting data, the user can embellish the graph with text, arrows, and geometric shapes using Graph annotation tools which are provided. After completing his/her selections, the user "views" the graph by invoking the View command. In response, the system constructs the specified graph and displays it on the screen device 106. Also, at any time while building a graph above, the user can view what he or she has assigned so far by invoking the view command. Once satisfied with the results, the user may print the graph and save it for future use. Analytical Graphing A. Introduction According to the present invention, the spreadsheet system 130 also includes an analytical graphing module. Analytical graphing means the user can analyze spreadsheet data according to desired specifications, then graph the results without changing the spreadsheet. If a spreadsheet contains daily sales totals, for example, but the user want to graph average monthly totals, the system can calculate and graph this information without changing the spreadsheet. Or, if the daily sales vary widely, the system can display a moving average, which smoothes the data points. The system can also show a general trend, if one exists, by fitting the data with a line or an exponential curve. Thus, when the user uses analytical graphing of the present invention, the system calculates new data points and graphs them, all without the user knowing or understanding the theory or mathematics behind the analytical formulas. From the user's perspective, analytical graphing is basically a two-part process. First, the user decides what information is to be analyzed; specifically, the user assigns blocks of data to each desired series, as described above (using /Graph Series command) for basic graphing. Second, the user specifies what relationships (analysis type) to investigate. To accomplish the latter, the system provides a plurality of Analyze commands; each will be described in turn. B. Analysis Types After specifying the first series to analyze, the user selects the type of analysis he or she wants performed. Using the Analyze command (/Graph Series Analyze), the user can analyze spreadsheet data, then the system graphs the results without changing the underlying spreadsheet. In an exemplary embodiment, the following types of analyses are provided: Aggregation, Moving Average, Linear Fit, and Exponential Fit. Aggregation combines multiple data points and plots them as a single point that may be the sum, average, standard deviation, minimum, or maximum of the data. Moving average smoothes fluctuating data points by plotting progressive averages. Linear fit generates a line that best fits the data using simple linear regression. Exponential fit generates a curve to fit data that increases or decreases exponentially. A (result) Table option is also provided to let the user generate a table of results in the spreadsheet. 1. Aggregation Aggregation lets the user aggregate multiple data points and plot them as a single point that may be the sum, average, standard deviation, minimum, or maximum of the data. Plotting aggregates transforms the selected graph series and reveals relationships not immediately apparent in the spreadsheet, such as weekly averages for information recorded daily. In an exemplary embodiment, submenus are provided for indicating (1) Series Period; (2) Aggregation Period; and (3) Transformation (function) type. Each will be described in turn. The Series Period menu command is invoked for indicating what the series values represent (i.e., what period exists) in the series. Since the user will often aggregate a series according to a time period, the Aggregation commands are preferably based on 1 day representing a period of 1. Thus, Days (the default) represents a period of 1; Weeks represents a period of 7; Months represents a period of 30; Quarters represents a period of 90; and Years represents a period of 360 (traditional accounting year). If a series contains weekly information, for example, one would choose Weeks on the menu. For the purposes of aggregation, the system employs the following standard time periods.
______________________________________
Standard time periods
A week has
A month has A quarter has
A year has
______________________________________
7 days 30 days 90 days 360 days
4 weeks 12 weeks 51 weeks
3 months 12 months
4 quarters
______________________________________
An Aggregation Period menu command is also provided for indicating how many data points are combined for each point on the graph. The user may choose among the following types. Weeks (the default) shows weekly data, or aggregates by 7; Months aggregates by 30; Quarters aggregates by 90; and Years aggregates by 360. An Arbitrary option is also provided to let the user specify a particular (user-defined) aggregation period. Thus if one enters 3 (i.e., to show aggregates of 3 or information every three days), the Series Period is set to Days. Finally, the aggregation period set by the user is checked to make sure it is larger than that entered for the series period (so that the system does not, for example, transform monthly data into weekly data). A Function menu command is provided for indicating what transform the system applies to the data. In an exemplary embodiment, for instance, SUM (the default) totals the data (analogous to applying a spreadsheet @SUM function to the data); AVG averages the data (@AVG function); STD calculates a population standard deviation (@STD function); STDS calculates a sample standard deviation (@STDS function); MIN specifies a series minimum (@MIN function); and MAX specifies a series maximum (@MAX function). Those skilled in the art will appreciate that other operations may be employed in a like manner effect desired transformation of the data. All told, aggregation commands may be viewed as handy standards for simple periodic aggregation. For example, when the user sets the Series Period to Days (Series Period Days), Aggregation Period to Months (Aggregation Period Months), and Function to average (Function AVG), the system will average 30 data points, plot the average, then average the next 30 points, plot the average, and so forth. 2. Moving Average Moving Average smoothes fluctuating data points by plotting progressive averages. Starting with the first point in the series, the system calculates and plots the average for a specified previous number of points, called a period. At each following point, the system maintains the specified period: It drops the oldest value, the one farthest from the new point, so the number of points being averaged is always the specified period. The system then calculates and plots the new average, and continues in this manner. To specify the range of influence, the user can indicate a period, which is a number of previous daily totals to average with each point and move on to the next point. If the period is set to 3, for instance, the system starts with the first day's sales and plots this value. For the second average, the system moves to the second day, averages the first and second day's totals, and plots the average. The system moves to the third day and calculates the average of the third, second, and first day's sales, then plots the average. At the fourth day, the system drops the first day's sales and averages the second, third, and fourth day's sales. With Period set to 3, at each day, the system averages the day's sales with the previous two. In this way, each point is tempered by previous points and data is smoothed to show a general trend. The user can apply a simple or weighted moving average by using Period and Weighted commands of the system. Period indicates a number of points to average. Weighted, on the other hand, specifies that the operation is to be Weighted, that is, the system numerically places greater emphasis or weight on recent points (closest to the point being worked on) and less weight on older points (farthest from the most recent point). The default setting is non-weighted (No). 3. Linear Fit Linear Fit generates a line that best fits the data, using simple linear regression. Linear Fit is useful for showing a general trend among fluctuating points; it is also easier for one to grasp regression information in graph form than in a table. After specifying a series (desired range) of data, the user invokes a Linear Fit command (/Graph Series Analyze Linear). In response, the system generates a line that best fits the data. With Linear Fit, the system calculates and plots linear regression information in a line, even if the data does not have a general trend. 4. Exponential Fit Similar to the above, Exponential Fit generates a curve to fit data that increases or decreases exponentially. For this feature to work, all values in the series must be of the same sign, either all negative or all positive. Blank cells in the series are treated as zeros. To fit exponential data with a curve, the user specifies the series containing the data and then chooses an Exponential Fit command (Series Analyze nth Series Exponential Fit). In response, the system generates a curve that best fits the data. 5. Table (Results) The Table command lets the user generate a table of analytical graphing results in the spreadsheet. The system performs the specified analysis before it draws the graph. The user does not have to view the graph in order to see the results. To produce a result table, the user specifies a series (/Graph Series) and an analysis type (Aggregation, Moving Average, Linear Fit, or Exponential Fit), then the user chooses the Table command and specifies a target block for the table. The user can indicate an entire block or specify only one cell as the top cell of the table. If the user specify one cell, the system extends the block downward as needed. The system generates the table and places it at the specified target. The table may be saved, graphed, or otherwise processed as desired. B. Examples Referring now to FIGS. 3-4, exemplary methods and interface of the present invention for analytical graphing will now be illustrated with a sample set of data. As shown in FIG. 3A, an ANALYZE.WQ1 spreadsheet 310 is loaded into the spreadsheet worksurface 310. The spreadsheet includes daily sales totals for a fictitious company, MouseStrap. 1. Aggregation Example Suppose the user wants to see average weekly sales. With a conventional electronic spreadsheet, this would entail a lot of work on the part of the user (redesigning the spreadsheet). According to the present invention, however, methods are provided for the spreadsheet system to automatically graph average weekly sales, all without changing the spreadsheet. With particular reference to FIGS. 3B-D, an exemplary method of the present invention for aggregating the sales information of spreadsheet 310 is illustrated. First, a graph type is selected from a Graph Type palette 320, as shown in FIG. 3B. For aggregation, a line graph works well because it easily shows a progression over time and displays trends. The user may select the Line Graph option with the screen cursor or enter a keyboard equivalent command (e.g., /Graph Graph Type Line). If the user specifies an X-axis series when aggregating a line graph, the system creates X-axis labels by showing every nth point in the X-axis series, where n is the aggregation period. When Aggregation Period is set to Weeks, for example, the system shows every seventh point in the X-axis series as an X-axis label. After specifying a graph type, the user specifies the block containing the data he or she wants to aggregate. The user does this by selecting the 1st Series option with the mouse 105 or keyboard 104 (Series 1st Series) and then enters a range of cells (again by pointing with the mouse or entering with the keyboard). In the example at hand (for the data of spreadsheet 310), the user enters a range of C5 . . . C55. To add horizontal labels along the horizontal (X) axis, the user can specify the block containing data for the X-axis by choosing X-Axis Series and entering a range of cells. To label the X-axis with week ending dates (every seventh day), the user chooses X-Axis Series and enters B5 . . . B55. Next, the user indicates the analytical operation desired for this series of data--aggregation (in this example). Specifically, the user invokes the Aggregation operation using the mouse and/or keyboard (Analyze 1st Series Aggregation). To indicate what the 1st Series values represent, or what period exists in the series, the user sets up the period of the series (choose Series Period). Days is the Series Period default. Since the series values are daily sales (the series has a period of 1), this setting remains unchanged. To indicate by what period the user wants to aggregate the series, one chooses Aggregation Period and then selects a standard aggregation period (or enters an Arbitrary one). Weeks is the Aggregation Period default. Since the user want weekly information (the user want to aggregate by 7), one leaves this setting unchanged. Finally, the transformation operation is set to Average (by choosing Function AVG). After entry of the user selections, the spreadsheet 310 appears as shown in FIG. 3C (with the user choices represented in cascading submenus 325). The user may now proceed to view the graph (e.g., by pressing F10). As shown in FIG. 3D, an aggregation graph 350 is displayed; each point of the graph represents a weekly sales average. Since the aggregation period is 7, every seventh day labels the X-axis. In a preferred embodiment, the system ignores the last aggregate of data if it is not a multiple of the aggregation period. For example, if the aggregation period is set to Weeks and the points are not evenly divisible by 7, the data points that follow the last group of 7 do not show on the graph as an aggregate. 2. Multiple Relationships in One Graph System 130 allows a user to easily view more than one relationship in the same graph, all without redesigning the spreadsheet and without repetitive calculations. If the user aggregates by the same period for each analysis, he or she can view multiple relationships in the same graph. Illustrated in FIGS. 3E-F, this method will now be demonstrated. Suppose that the user wants to view weekly maximum sales in the same graph as weekly averages (as shown in the previous figure). The user would proceed as follows. First, the user adds another series; as shown in FIG. 3E, for example, the user has selected a range of C5 . . . C55 as the second series (e.g., choose Series 2nd Series and enter C5 . . . C55). Now the user proceeds to specify an aggregation analysis for this second series (e.g. choose Analyze 2nd Series Aggregation). In this example, the Series Period is left set to Days, and the Aggregation Period is left set to Weeks. To specify maximum, one chooses the MAX function (Function MAX). At the completion of this step, the user's selections are displayed as shown in the cascading submenus 360 of FIG. 3E. If desired, the user may add legends to the graph using Text Legends commands of the system 130. For example, one may enter the following descriptive labels. Under 1st Series, one may enter "Weekly avg. sales". For the 2nd Series, one may enter "Weekly max. sales". For the X-axis, "Week ending date" may be entered; for the Y-axis, "Sales" is appropriate. After specifying the graph, the user may now view the graph (e.g., by pressing F10). In response, the system 130 constructs and displays the graph 370 of FIG. 3F. As shown, weekly maximum and average sales are displayed in the same graph. Like any graph, the user can use the Graph menus or the Annotator to further customize the graph or return to the spreadsheet for further data entry or modification. 3. Table Example The Table command (Analyze nth Series Table) lets the user display analytical graphing information directly in the spreadsheet. Continuing with the previous example, the Table command will be illustrated for viewing weekly average and maximum sales values in the spreadsheet 310. As described above, the range of C5 . . . C55 has been previously entered for both the first and second series; the former for determining weekly average sales, the latter for determining maximum sales values. To create a table of weekly average and maximum sales, the user proceeds as follows. First, the user selects the table command for the first of the two series (e.g., /Graph Series Analyze 1st Series Table); recall that this range has been previously used to generate the weekly average values. Next, a target or destination in the spreadsheet 310, such as cell D5, is entered; this will serve as the top cell of the block for the table. The user can specify a block or only the top cell; the system extends the block downward as needed. The system now places the weekly averages table in the spreadsheet, as shown by table 391 of FIG. 3G. After completing the first series, the user proceeds to select the table command for the second series (choose 2 nd Series Table). Again, a destination (e.g., cell E5) in the spreadsheet is entered. The system now places the weekly maximums table in the spreadsheet, as shown by table 393 of FIG. 3G. The tables list the calculated weekly averages and weekly maximums. The user can label the tables as desired a title. 4. Moving Average Example Because daily sales vary widely, the user may want to determine a general trend by smoothing data points. By selecting the Moving Average method, the user can plot progressive averages, so each day's sales is influenced by previous day's sales. Illustrated in FIGS. 4A-B, this method will now be demonstrated. In this example, the user wishes to display and compare two moving averages in the same graph as daily sales. Continuing with the previous example (1st Series and 2nd Series are C5 . . . C55, and X-Axis is B5 . . . B55), the user specifies a third series (3rd Series is C5 . . . C55). For the second series, the user selects an analysis type of moving average (choose Analyse 2nd Series Moving Average); for the Period, the user enters 3. This operation is repeated for the third series except the Period is set to 9 (choose 3rd Series Moving Average Period, enter 9). Upon completion, the interface 300 displays the user's selections in cascading submenus 410, as shown in FIG. 4A. The user may now view the graph (e.g., by pressing F10). As shown by the graph 420 of FIG. 4B, the system 130 compares two different moving averages against daily data in one graph; one with a period of 3, the other with a period of 9. Against the fluctuating daily information, a moving average with a period of 3 provides a smoother line through the daily data. A moving average with a period of 9 provides an even smoother effect. Note: The larger the period, the smoother (and less true to the daily totals) is the line formed by Moving Average. As before, the user may apply text, legends, labels, or otherwise. The user may readily apply a weighted moving average to the analysis to place greater emphasis, or weight, on recent points and less weight on older points. To convert one of the above to a weighted moving average, the user simply chooses the Weighted option (Weighted Yes) for the respective series. Although tedious to do by hand, a weighted moving average may be readily performed by the system 130. The general methodology employed by the system is as follows. First, the oldest point (i.e., the point farthest from the point the system is working with) is multiplied by 1, the next multiplied by 2, and so on, until the most recent point is reached; this last point is multiplied by the period specified. For example, with Period set to 3, the oldest point is multiplied by 1, the next one by 2, and the most recent point is multiplied by 3. The weighted moving average is determined by dividing the sum of the weighted points by the sum of the weights (In this example, 1+2+3). 5. Linear Fit Example Linear Fit generates a line that best fits the data, using simple linear regression. Referring to FIGS. 4C-D, this method will be illustrated by fitting daily sales data to a line. Continuing the example (with 1st Series and 2nd Series set to C5 . . . C55, and X-Axis set to B5 . . . B55), the user specifies a Linear Fit analysis for the second series (choose Series Analyze 2nd Series Linear Fit). Upon completion, the interface 300 displays the user's selections in cascading submenus 430, as shown in FIG. 4C. The user may now view the graph (e.g., by pressing F10). As shown by the graph 440 of FIG. 4D, the system 130 plots a line through the daily sales information. Linear Fit calculates and plots linear regression information in a line, even if the data does not have a general trend. To view linear regression information in the spreadsheet, the user may generate a table as previously described in the Table section. Internal Operation Referring now to FIG. 5, a preferred method 500 of the present invention for interpreting information with analytical graphs will now be described. At step 501, a series is defined, for example, by specifying a range of information members (cells); this represents a data series. As previously described, the system of the present invention provides a preferred interface (e.g., menus 360 of FIG. 3E) for receiving the data series (typically, via keyboard and/or mouse input). In a preferred embodiment, the series represents an ordinal set, that is, a first data point, a second data point, a third data point, and the like. This information may be deduced from the relative position of each information cell relative to other information cells of the selected range. The system may, for instance, accept a range of cells which are spatially arranged in the spreadsheet from left to right, where the left most cell is the first member of the set and the right most cell is the last member of the set. Similarly, the system may accept a vertically-oriented range, where the top most cell is the first member of the set and the bottom most cell is the last member of the set. Such a determination is simplified if ranges are restricted to blocks of contiguous cells. Next, the method may proceed to display a selection of available analyses; but first, however, it is desirable to examine the type of information comprising the above-selected range so that only those analyses which are appropriate for the selected information are presented to the user. Thus, at step 502, the system intelligently determines the data type (e.g., from among fixed, scientific, currency, percent, date, time, text, and user-defined data types) for the information selected in the range. For instance, the system may determine that a range includes members which are temporally related (i.e., varying over a period of time); accordingly, the system may suggest an aggregation analysis. Those skilled in the art will appreciate that data types (e.g., date, currency, percent, floating point, integer, and the like) may be readily determined from internal data representation and/or format settings for the information of interest. Also, heuristic information about data types (e.g., a particular range of integers representing 20th century dates) may be provided using known expert system technique (typically in the form of an inference engine with one or more knowledgebases). By examining this information beforehand, the system of the present invention may intelligently guess the analysis desired by the user (or subset of analyses from those available). Alternatively, in a simple embodiment, step 502 may be eliminated, whereby a simple list of available analyses is displayed; in such a case, the user would determine which analysis is appropriate for his or her information set. At step 503, a desired analysis or data transformation is selected. As shown by the menus 360 of FIG. 3E, for example, the system of the present invention includes a user interface dialog to facilitate this selection. If the optional step 502 was performed, then the dialog will preferably display only those analyses which are appropriate for the selected information (as determined from its type). At step 504, the method determines whether any user-supplied parameters are required. In a moving average analysis, for instance, the user may specify whether the analysis is center-weighted. If additional parameters are required (yes at step 504), then the method proceeds to step 505 for retrieving the relevant parameters. If the type of information comprising the range was determined above (at step 502), then the parameters appropriate for display may be intelligently determined. For time-based information, for instance, appropriate parameters may include a display of business weeks, calendar weeks, months, quarters, and the like. As another example, an aggregation performed on business information may be adjusted to take into account bank holidays. At step 506, the method applies the appropriate operation or series of operations necessary to effect the specified analysis. In this manner, the method derives a new (temporary) data set. According to the present invention, this new data set per se is not important to the user; instead, the set serves as a vehicle for constructing the desired analysis. Thus, the derived data set represents temporary or meta-data, which is preferably not retained at the conclusion of the method. Instead, the data is employed to construct the analytical graph for the analysis. Continuing on to step 507, the analytical graph is constructed by plotting the data points of the temporary set, after which the set is discarded. In a preferred embodiment, the meta-data set is only maintained in the background (i.e., not on the active spreadsheet) so that the user is effectively shielded from this added complexity. In this fashion, the user receives what he or she really wants: an analytical graph, which serves as a visual tool for understanding his or her information. If desired, however, the system provides an option for saving the meta-data set in the spreadsheet, as indicated by step 508. This step corresponds to the Table operation illustrated in FIG. 3G. While the invention is described in some detail with specific reference to a single preferred embodiment and certain alternatives, there is no intent to limit the invention to that particular embodiment or those specific alternatives. Thus, the true scope of the present invention is not limited to any one of the foregoing exemplary embodiments but is instead defined by the following claims.
|
Same subclass Same class Consider this |
||||||||||
