Method of query return data analysis for early warning indicators of possible security exposures
6928554
Abstract
System, method and article of manufacture for securing data. Queries are analyzed to detect security violation efforts. In one embodiment, algorithms for detecting selected security violation patterns are implemented. Generally, patterns may be detected prior to execution of a query and following execution of a query. Illustrative patterns include union query analysis, pare down analysis, non-overlapping and others.
Claims
1. A method of providing security with respect to data, comprising:
Description
BACKGROUND OF THE INVENTION
Taken independently, each of the foregoing queries returns a reasonable number of results. Collectively, however, the number of results which satisfy each of the conditions will be significantly smaller, perhaps only one person. Having determined a clinic number for one individual, a user may run any query that returns clinic numbers and any other information, and identify which information corresponds to the one individual. The foregoing is merely one example of how users may exploit conventional databases. A variety of other subversive techniques may be used to bypass security mechanisms in place to protect data contained in databases.
Therefore, there is a need for improved security mechanisms for databases.
SUMMARY OF THE INVENTION
The present invention generally is directed to a method, system and article of manufacture for database security. In one embodiment, a method of providing security with respect to data is provided. One embodiment comprises receiving a query issued against a database by a user; and determining whether a security violation pattern exists based on at least one of: (i) pre-execution comparative analysis of the query with respect to at least one other previously issued query from the user; and (ii) post-execution comparative analysis of results returned from execution of the query and results returned from execution of the at least one other previously issued query.
Another method of providing security with respect to data comprises receiving a plurality of queries from a user; executing the plurality of queries against a database; receiving a subsequent query issued against the database by the user; and based on the plurality of queries and the subsequent query, programmatically determining whether a user effort to access an unauthorized amount of data from the database is identifiable.
Another method of providing security with respect to data comprises receiving a plurality of queries from a user; executing the plurality of queries against a database; receiving a subsequent query issued against the database by the user; executing the subsequent query; and based on the plurality of queries and the subsequent query, programmatically determining whether a user effort to bypass security constraints preventing unique identification of individuals is identifiable. Another method provides for security of data having a particular physical data representation, the method comprising providing a query specification comprising a plurality of logical fields for defining abstract queries; providing mapping rules which map the plurality of logical fields to physical entities of the data; providing security rules; receiving an abstract query issued against the data by a user, wherein the abstract query is defined according to the query specification and is configured with at least one logical field value; and analyzing the abstract query with respect to the at least one previously received abstract query from the user to detect an existence of security violation activity prompting invocation of a security rule. Yet another embodiment provides a computer-readable medium containing instructions which, when executed, perform a security violation identification operation, comprising: receiving a query issued against a database by a user; and determining whether a security violation pattern exists based on at least one of: (i) pre-execution comparative analysis of the query with respect to at least one other previously issued query from the user; and (ii) post-execution comparative analysis of results returned from execution of the query and results returned from execution of the at least one other previously issued query. 
Yet another embodiment provides a computer-readable medium containing security validation instructions which, when executed, perform a security validation operation comprising: receiving a plurality of queries from a user; executing the plurality of queries against a database; receiving a subsequent query issued against the database by the user; executing the subsequent query; and based on the plurality of queries and the subsequent query, programmatically determining whether a user effort to bypass security constraints preventing unique identification of individuals is identifiable.
Still another embodiment provides a computer-readable medium, comprising information stored thereon, the information comprising: a query specification comprising a plurality of logical fields for defining abstract queries; a plurality of mapping rules which map the plurality of logical fields to physical entities of data; a plurality of security rules; and a runtime component executable to perform a security violation activity detection operation in response to receiving an abstract query issued against the data by a user, wherein the abstract query is defined according to the query specification and is configured with at least one logical field value. The security violation activity detection operation comprises receiving an abstract query issued against the data by a user, wherein the abstract query is defined according to the query specification and is configured with at least one logical field value; and analyzing the abstract query with respect to at least one previously received abstract query from the user to detect an existence of security violation activity prompting invocation of a security rule.
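By way of illustration only, the post-execution comparative analysis summarized above might be sketched as follows. The sketch intersects the identifiers returned by a user's new query with those returned by the same user's earlier queries and flags the case in which the combined set narrows to very few individuals, as in the clinic-number example of the background. The class name `ResultHistory`, the method `check`, and the `min_group_size` threshold are hypothetical and do not appear in the specification; an actual embodiment may use different algorithms.

```python
from collections import defaultdict


class ResultHistory:
    """Tracks, per user, the identifiers returned by earlier queries.

    Hypothetical helper: the class name, method name, and threshold are
    assumptions made for illustration only.
    """

    def __init__(self, min_group_size=5):
        self.min_group_size = min_group_size
        self._history = defaultdict(list)  # user -> list of sets of identifiers

    def check(self, user, new_ids):
        """Record new_ids for the user and report whether they look suspicious.

        Returns True when the intersection of new_ids with all of the user's
        earlier result sets is non-empty but smaller than min_group_size.
        """
        new_ids = set(new_ids)
        earlier_sets = self._history[user]

        # Intersect the new results with every earlier result set of this user.
        combined = new_ids
        for earlier in earlier_sets:
            combined = combined & earlier

        # Only a user with prior queries can be narrowing results down.
        suspicious = bool(earlier_sets) and 0 < len(combined) < self.min_group_size

        earlier_sets.append(new_ids)
        return suspicious
```

In such a sketch, a True return value would be the point at which a security measure could be taken, for example logging the event or notifying an administrator. Intersecting against every earlier result set is a deliberately simple choice; a fuller version would likely compare subsets of the history as well.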
BRIEF DESCRIPTION OF THE DRAWINGS
So that the manner in which the above recited features of the present invention are attained and can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments thereof which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
FIG. 1 is one embodiment of a computer system;
FIG. 2A is a logical/physical view of software components of one embodiment of the invention;
FIG. 2B is a logical view of an abstract query and a data repository of abstraction;
FIGS. 3A and 3B are a flowchart illustrating the operation of a runtime component;
FIG. 4 is a flowchart illustrating the operation of a runtime component;
FIG. 5 is a flowchart illustrating the operation of a runtime component to identify and handle non-overlapping conditions using pre-execution analysis;
FIG. 6 is a flowchart illustrating the operation of a runtime component to identify and handle non-overlapping conditions using post-execution results analysis;
FIG. 7 is a flowchart illustrating the operation of a runtime component to identify and handle query union analysis using post-execution results analysis; and
FIG. 8 is a flowchart illustrating the operation of a runtime component to identify and handle pare down analysis using post-execution results analysis.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Introduction
The present invention generally is directed to a system, method and article of manufacture for determining users' unauthorized attempts to access data. In general, analysis is performed on a query prior to execution and/or analysis is performed on results returned by execution of the query.
In one embodiment, the detection of a possible security violation causes one or more security measures to be taken. For example, in one embodiment a user's query is not executed. In another embodiment, the event is logged and/or an administrator is notified of the event. In one embodiment, security features are implemented as part of a logical model of data. The logical model is implemented as a data repository abstraction layer, which provides a logical view of the underlying data repository. In this way, data is made independent of the particular manner in which the data is physically represented. A query abstraction layer is also provided and is based on the data repository abstraction layer. A runtime component performs translation of an abstract query into a form that can be used against a particular physical data representation. However, while the abstraction model described herein provides one or more embodiments of the invention, persons skilled in the art will recognize that the concepts provided herein can be implemented without an abstraction model while still providing the same or similar results. One embodiment of the invention is implemented as a program product for use with a computer system such as, for example, the computer system shown in FIG. 1 and described below. The program(s) of the program product defines functions of the embodiments (including the methods described herein) and can be contained on a variety of signal-bearing media. Illustrative signal-bearing media include, but are not limited to: (i) information permanently stored on non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive); (ii) alterable information stored on writable storage media (e.g., floppy disks within a diskette drive or hard-disk drive); or (iii) information conveyed to a computer by a communications medium, such as through a computer or telephone network, including wireless communications. 
The latter embodiment specifically includes information downloaded from the Internet and other networks. Such signal-bearing media, when carrying computer-readable instructions that direct the functions of the present invention, represent embodiments of the present invention.
In general, the routines executed to implement the embodiments of the invention may be part of an operating system or a specific application, component, program, module, object, or sequence of instructions. The software of the present invention typically is comprised of a multitude of instructions that will be translated by the native computer into a machine-readable format and hence executable instructions. Also, programs are comprised of variables and data structures that either reside locally to the program or are found in memory or on storage devices. In addition, various programs described hereinafter may be identified based upon the application for which they are implemented in a specific embodiment of the invention. However, it should be appreciated that any particular nomenclature that follows is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature.
Physical View of Environment
FIG. 1 depicts a block diagram of a networked system 100 in which embodiments of the present invention may be implemented. In general, the networked system 100 includes a client (e.g., user's) computer 102 (three such client computers 102 are shown) and at least one server 104 (one such server 104 is shown). The client computer 102 and the server computer 104 are connected via a network 126. In general, the network 126 may be a local area network (LAN) and/or a wide area network (WAN). In a particular embodiment, the network 126 is the Internet.
The client computer 102 includes a Central Processing Unit (CPU) 110 connected via a bus 130 to a memory 112, storage 114, an input device 116, an output device 119, and a network interface device 118. The input device 116 can be any device to give input to the client computer 102. For example, a keyboard, keypad, light-pen, touch-screen, track-ball, speech recognition unit, audio/video player, and the like could be used. The output device 119 can be any device to give output to the user, e.g., any conventional display screen. Although shown separately from the input device 116, the output device 119 and input device 116 could be combined. For example, a display screen with an integrated touch-screen, a display with an integrated keyboard, or a speech recognition unit combined with a text-to-speech converter could be used.
The network interface device 118 may be any entry/exit device configured to allow network communications between the client computer 102 and the server computer 104 via the network 126. For example, the network interface device 118 may be a network adapter or other network interface card (NIC).
Storage 114 is preferably a Direct Access Storage Device (DASD). Although it is shown as a single unit, it could be a combination of fixed and/or removable storage devices, such as fixed disc drives, floppy disc drives, tape drives, removable memory cards, or optical storage. The memory 112 and storage 114 could be part of one virtual address space spanning multiple primary and secondary storage devices.
The memory 112 is preferably a random access memory sufficiently large to hold the necessary programming and data structures of the invention. While the memory 112 is shown as a single entity, it should be understood that the memory 112 may in fact comprise a plurality of modules, and that the memory 112 may exist at multiple levels, from high speed registers and caches to lower speed but larger DRAM chips.
Illustratively, the memory 112 contains an operating system 124. Illustrative operating systems, which may be used to advantage, include Linux and Microsoft's Windows®. More generally, any operating system supporting the functions disclosed herein may be used. The memory 112 is also shown containing a browser program 122 that, when executed on CPU 110, provides support for navigating between the various servers 104 and locating network addresses at one or more of the servers 104. In one embodiment, the browser program 122 includes a web-based Graphical User Interface (GUI), which allows the user to display Hyper Text Markup Language (HTML) information. More generally, however, the browser program 122 may be any program (preferably GUI-based) capable of rendering the information transmitted from the server computer 104.
The server computer 104 may be physically arranged in a manner similar to the client computer 102. Accordingly, the server computer 104 is shown generally comprising a CPU 130, a memory 132, and a storage device 134, coupled to one another by a bus 136. Memory 132 may be a random access memory sufficiently large to hold the necessary programming and data structures that are located on the server computer 104. The server computer 104 is generally under the control of an operating system 138, shown residing in memory 132. Examples of the operating system 138 include IBM OS/400®, UNIX, Microsoft Windows®, and the like. More generally, any operating system capable of supporting the functions described herein may be used.
The memory 132 further includes one or more applications 140 and an abstract query interface 146. The applications 140 and the abstract query interface 146 are software products comprising a plurality of instructions that are resident at various times in various memory and storage devices in the computer system 100.
When read and executed by one or more processors 130 in the server 104, the applications 140 and the abstract query interface 146 cause the computer system 100 to perform the steps necessary to execute steps or elements embodying the various aspects of the invention. The applications 140 (and more generally, any requesting entity, including the operating system 138 and, at the highest level, users) issue queries against a database (e.g., databases 156_1 . . . 156_N, collectively referred to as database(s) 156). Illustratively, the databases 156 are shown as part of a database management system (DBMS) in storage 134. The databases 156 are representative of any collection of data regardless of the particular physical representation. By way of illustration, the databases 156 may be organized according to a relational schema (accessible by SQL queries) or according to an XML schema (accessible by XML queries). However, the invention is not limited to a particular schema and contemplates extension to schemas presently unknown. As used herein, the term "schema" generically refers to a particular arrangement of data.
In one embodiment, the queries issued by the applications 140 are defined according to an application query specification 142 included with each application 140. The queries issued by the applications 140 may be predefined (i.e., hard coded as part of the applications 140) or may be generated in response to input (e.g., user input). In either case, the queries (referred to herein as "abstract queries") are composed/executed using logical fields defined by the abstract query interface 146. In particular, the logical fields used in the abstract queries are defined by a data repository abstraction component 148 of the abstract query interface 146. The abstract queries are executed by a runtime component 150 which first transforms the abstract queries into a form consistent with the physical representation of the data contained in the DBMS 154.
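The transformation of an abstract query into a physical form can be illustrated with a minimal sketch, assuming a relational schema and a single physical table. The mapping `FIELD_MAP`, the function `to_sql`, and the table and column names are hypothetical and are not drawn from the specification; they stand in for the logical field definitions of the data repository abstraction component 148 and the translation performed by the runtime component 150.

```python
# Hypothetical mapping of logical field names to physical (table, column) pairs.
FIELD_MAP = {
    "FirstName": ("contact", "f_name"),
    "LastName": ("contact", "l_name"),
    "State": ("contact", "state"),
}

ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">="}


def to_sql(selection, return_fields, field_map=FIELD_MAP):
    """Translate an abstract query into a parameterized SQL string.

    selection: list of (logical_field, operator, value) conditions, ANDed.
    return_fields: list of logical field names to return.
    Assumes, for simplicity, that all referenced fields map to one table.
    """
    tables = set()

    def physical(logical):
        table, column = field_map[logical]  # KeyError if the field is unknown
        tables.add(table)
        return column

    columns = [physical(name) for name in return_fields]
    clauses, params = [], []
    for logical, operator, value in selection:
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {operator!r}")
        clauses.append(f"{physical(logical)} {operator} ?")
        params.append(value)  # values are passed as parameters, never inlined

    if len(tables) != 1:
        raise ValueError("this sketch handles exactly one physical table")

    sql = f"SELECT {', '.join(columns)} FROM {tables.pop()}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

Because the caller refers only to logical names such as `LastName`, the physical table or column could be renamed by changing the mapping alone, which is the independence from the physical representation that the abstraction is intended to provide.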
In one embodiment, the data repository abstraction component 148 is configured with security information 162. For embodiments not based on the abstraction model (or some equivalent thereof), the security information may reside elsewhere. In one embodiment, the security information 162 includes keys associated with one or more fields. Aspects of such keys will be described in more detail below. The runtime component 150 operates to perform various analyses and, in some embodiments, enforce various security features or take other actions according to the results of the analyses performed. Accordingly, the runtime component 150 is shown configured with a security algorithm 151 (which may be representative of a plurality of algorithms), which implements the methods described herein. In general, the security features implemented by the runtime component 150 may be applied to a particular user, a group of users or all users.
In one embodiment, elements of a query are specified by a user through a graphical user interface (GUI). The content of the GUIs is generated by the application(s) 140. In a particular embodiment, the GUI content is hypertext markup language (HTML) content which may be rendered on the client computer systems 102 with the browser program 122. Accordingly, the memory 132 includes a Hypertext Transfer Protocol (http) server process 152 (e.g., a web server) adapted to service requests from the client computer 102. For example, the server process 152 may respond to requests to access the database(s) 156, which illustratively resides on the server 104. Incoming client requests for data from a database 156 invoke an application 140. When executed by the processor 130, the application 140 causes the server computer 104 to perform the steps or elements embodying the various aspects of the invention, including accessing the database(s) 156.
In one embodiment, the application 140 comprises a plurality of servlets configured to build GUI elements, which are then rendered by the browser program 122.
FIG. 1 is merely one hardware/software configuration for the networked client computer 102 and server computer 104. Embodiments of the present invention can apply to any comparable hardware configuration, regardless of whether the computer systems are complicated, multi-user computing apparatus, single-user workstations, or network appliances that do not have non-volatile storage of their own. Further, it is understood that while reference is made to particular markup languages, including HTML, the invention is not limited to a particular language, standard or version. Accordingly, persons skilled in the art will recognize that the invention is adaptable to other markup languages as well as non-markup languages and that the invention is also adaptable to future changes in a particular markup language as well as to other languages presently unknown. Likewise, the http server process 152 shown in FIG. 1 is merely illustrative and other embodiments adapted to support any known and unknown protocols are contemplated.
Logical/Runtime View of Environment
FIGS. 2A-B show an illustrative relational view 200 of components of the invention. The requesting entity (e.g., one of the applications 140) issues a query 202 as defined by the respective application query specification 142 of the requesting entity. The resulting query 202 is generally referred to herein as an "abstract query" because the query is composed according to abstract (i.e., logical) fields rather than by direct reference to the underlying physical data entities in the DBMS 154. As a result, abstract queries may be defined that are independent of the particular underlying data representation used.
In one embodiment, the application query specification 142 may include both criteria used for data selection (selection criteria 204) and an explicit specification of the fields to be returned (return data specification 206) based on the selection criteria 204. An illustrative abstract query corresponding to the abstract query 202 shown in FIG. 2B is shown in Table I below. By way of illustration, the abstract query 202 is defined using XML. However, any other language may be used to advantage.
[Table I: illustrative abstract query 202, defined using XML]
