Risk adjustment tools for analyzing patient electronic discharge records
6266645
Abstract
A system and method are disclosed for examining and effectively managing resource allocation in a health care organization or facility (e.g., a hospital, a hospice, or a nursing home). The disclosed technology relies upon an analysis of the electronic discharge records of a health care organization in a manner that allows extraction of only those records that were generated for patients who have a specified condition (e.g., septic shock, coronary artery disease, auto-immune disease, etc.) or who fall into a particular class based upon resource usage (e.g., length of hospital stay, type of surgery, or quantity and type of pharmaceuticals taken). Note that discharge records often fail to explicitly specify the condition of interest. To accomplish selective extraction, the content of the discharge records is matched against one or more "key explanatory variables" such as a "selection vector," which is a collection of patient codes that implicitly specify the condition of interest.
Description
DRG Admissions Type
76 & 77 Other Resp Surg Complications or Comorbidities
204 Disorders of Pancreas
277 & 278 Cellulitis
316 Renal Failure
144 & 145 Other Circulatory Complications or Comorbidities
15 Transient Ischemic Attack and Precerebral Occlusion
130 & 131 Peripheral Vascular Accident Complications or Comorbidities
294 & 295 Diabetes Adult
395 Red Cell Disorders
24 & 25 Seizures, Headaches, Complications or Comorbidities
403 & 404 Lymphomas, Leukemias
188 & 189 Other Digestive Complications or Comorbidities
202 Cirrhosis, Hepatitis
127 Heart failure with Shock
475 Respiratory Involvement with Ventilator
416 Septicemia Adult
121, 122 & 123 Circulatory, Acute Myocardial Infarction
79 & 80 Respiratory Infection
174 & 175 Gastro-intestinal tract Hemorrhage
488, 489 & 490 HIV
296 & 297 Nutri-Metabolic
182 & 183 Esophagitis Complications or Comorbidities
140 Angina
320 & 321 Kidney Urinary Tract Infection Complications or Comorbidities
138 & 139 Cardiac Arrhythmia Complications or Comorbidities
143 Chest Pain
Again, the DRG codes are provided in the "DRG Guide," 1997 Ed., Medicode, Inc., Salt Lake City, Utah, 1996, which is incorporated herein by reference for all purposes. One approach to defining such selection vectors is set forth in FIG. 2A. As shown there, a process 200 begins at a starting point 201 and then specifies a patient condition to be analyzed in a step 203 (e.g., sepsis, HIV infection, liver cancer, toxemia, etc.). Next, a clinical data set is evaluated at a step 205 to identify a collection of patients that unambiguously have the condition of interest. Clinical data, which is not constrained by the limitations of a coding system such as the ICD-9 codes applied to UB-92 forms, should clearly identify those patients having a particular condition. Preferably, to provide a statistically significant sample, the clinical data set should represent at least about 100 patients. Next, at a step 207, the electronic discharge records for those patients identified from the clinical data are provided for further analysis. Thus, one now has the electronic discharge records for a statistically significant sampling of patients known to have the condition of interest. These electronic records are analyzed to identify various coding combinations that they have in common. This produces one or more patient code combinations (selection vectors) specifying records for patients having the condition of interest. See step 209. This may not be a trivial task, given that thousands of different codes are available for entry into the electronic discharge records. When looking for a combination of codes, the likelihood that any particular combination will randomly occur is extremely small. In a preferred embodiment, the selection vectors are generated after all patient condition codes in the electronic discharge records are ranked according to frequency of occurrence. The ranked list is then analyzed (automatically or manually) with physician guidelines available in the field. 
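The frequency ranking described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rank_codes_by_frequency` and the sample records are hypothetical, and each code is counted at most once per record, which the text does not specify.

```python
from collections import Counter


def rank_codes_by_frequency(records):
    """Return (code, count) pairs, most frequent first.

    `records` is an iterable of code collections, one per discharge record.
    Assumption: a code is counted at most once per record.
    """
    counts = Counter()
    for codes in records:
        counts.update(set(codes))
    # Sort by descending count, then by code so ties are deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical discharge records for patients known (from clinical data)
# to have the condition of interest.
sample_records = [
    ["038.9", "586", "428.0"],
    ["038.9", "518.82"],
    ["790.7", "586"],
]
ranked = rank_codes_by_frequency(sample_records)
for code, count in ranked:
    print(code, count)
```

The ranked list is only the starting point; choosing which combinations become selection vectors still relies on clinical guidelines or a statistical technique, as the text goes on to explain.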
To generate vectors for sepsis or septic shock, one can employ guidelines provided in Bone et al., "American College of Chest Physicians/Society of Critical Care Medicine Consensus Conference: Definitions for sepsis and organ failure and guidelines for the use of innovative therapies in sepsis," Critical Care Medicine, 864-874, June 1992. The patient condition codes comprising the selection vectors can be generated from the list of codes in the clinical sample with the aid of a statistical technique such as analysis of variance, linear regression, logistic regression, CART (classification and regression trees), neural network techniques, entropy mini-max, and SMILES (similarity-metric least squares). Obviously, if an electronic discharge record contains the exact code for the condition of interest, the problem is trivial. However, there is a need for the present invention partially because the code for the condition of interest frequently is not listed in the electronic discharge record. To ferret out those electronic discharge records that apply to patients having the condition of interest but do not list the code for the condition of interest, selection vectors having great specificity for the condition of interest are developed. When these are matched against the electronic discharge records, they should specifically select only those records for patients having the condition of interest. Preferably, selection vectors of this invention have a sensitivity of at least about 80 percent (more preferably at least about 85 percent) and a specificity of at least about 70 percent (more preferably at least about 75 percent). This sensitivity and specificity have the meanings commonly used in statistics. Thus, a septic shock selection vector having a sensitivity of 86 percent, for example, will correctly select the electronic patient profiles of 86 out of every 100 septic shock patients. 
Likewise, a septic shock selection vector having a specificity of 70 percent, for example, will correctly reject the electronic profiles of 70 of every 100 patients who do not have septic shock (for example, patients with sepsis alone), but it may select the profiles of the remaining 30. After the selection vectors have been chosen based upon the analysis of the clinical data and the corresponding discharge records, those selection vectors should be validated by a statistically rigorous process. See step 211. Many such processes are known in the art. Generally, they will use a statistically significant sample of data unrelated to the data employed to generate the selection vectors. Examples of suitable validation techniques include analysis of variance, linear regression, logistic regression, probit and tobit modeling, CART (classification and regression trees), neural network techniques, entropy mini-max, and SMILES. The entropy mini-max process is described in Christensen, Ronald, "Entropy Minimax Multivariate Statistical Modeling-I: Theory," pages 231-277, and the SMILES process is described in U.S. patent application Ser. No. 08/784,206, filed on Jan. 15, 1997, naming Minor et al. as inventors, and entitled "METHOD AND APPARATUS FOR PREDICTING THERAPEUTIC OUTCOMES," both of which are incorporated herein by reference for all purposes. After the selection vector at issue has been validated, the process 200 is completed at a stopping point 213. Typically, the chosen selection vectors are strings of one or more patient condition codes. In one preferred embodiment, the selection vector is simply a string of ICD-9 codes (numerals) covering the combination of interest. Any electronic discharge record found by matching to possess the same combination of patient condition codes is selected. Obviously, the format of the selection vector components should be the same as the format of the components of the electronic discharge record (e.g., key word strings, ICD-9 codes, etc.).
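The matching rule and the sensitivity and specificity measures discussed above can be illustrated with a minimal sketch. It assumes that a vector matches when every one of its codes appears in a record and that codes are compared as exact strings; the function names, record identifiers, and sample data are hypothetical.

```python
def matches(vector, record_codes):
    """True if the record contains every code in the selection vector."""
    return set(vector) <= set(record_codes)


def select(vectors, records):
    """Return ids of records matched by at least one selection vector."""
    return {
        record_id
        for record_id, codes in records.items()
        if any(matches(vector, codes) for vector in vectors)
    }


def sensitivity_specificity(selected, has_condition, all_ids):
    """Compare selected ids with clinically confirmed ids."""
    true_pos = len(selected & has_condition)
    false_neg = len(has_condition - selected)
    negatives = all_ids - has_condition
    true_neg = len(negatives - selected)
    false_pos = len(negatives & selected)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical discharge records and clinically confirmed cases.
records = {
    "r1": ["785.59", "428.0"],
    "r2": ["038.9", "586"],
    "r3": ["790.7", "428.0"],
    "r4": ["038.9", "586", "250.00"],
    "r5": ["540.9"],
    "r6": ["428.0"],
}
confirmed = {"r1", "r2", "r3"}
vectors = [("785.59",), ("038.9", "586")]

selected = select(vectors, records)
sens, spec = sensitivity_specificity(selected, confirmed, set(records))
print(sorted(selected), round(sens, 2), round(spec, 2))
```

In this toy data the vectors miss one confirmed case (r3) and select one unconfirmed record (r4), so both measures come out at two thirds.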
In a preferred embodiment, a match with a selection vector produced as described above not only indicates the presence of a condition but also specifies a limited range of additional resources that may be normally associated with that condition. Examples of the resources at issue include the length of stay at a health care facility, the type of facilities used by the patient (e.g., an intensive care unit), the costs of tests performed on the patient, the total cost of the patient's treatment, etc. By identifying the level of resource usage associated with patients having particular conditions, the selection vectors of the present invention can be employed to classify patients for risk adjustment, guideline monitoring, etc. Another key explanatory variable of patient condition or expected resource allocation or medical outcome is the use of a particular laboratory test, termed a laboratory cost driver. Not surprisingly, laboratory cost drivers often turn out to be, from the clinical perspective, those tests that a typical physician would request to diagnose a condition that she suspects or to gauge the severity of a condition known to exist. Thus, the use of such tests (as indicated in an electronic record) strongly correlates with patient condition and resource usage. While the choice of such laboratory cost drivers as key explanatory variables may seem logical in retrospect, only with rigorous statistical analysis of a large sample of data do some of these laboratory tests reveal themselves as key explanatory variables. To qualify as a key explanatory variable, the laboratory test should strongly correlate with a medical condition. The laboratory cost drivers are chosen to strongly correlate with total resource usage for a given patient. In other words, when a patient record contains a laboratory cost driver, the total cost of treating the patient should typically deviate from the mean by a significant amount.
Often this amount will be far in excess of the true cost of the test that represents the laboratory cost driver. For example, the presence of a $50 test may have, on average, a $500 effect on the total cost of a patient's treatment. From a mathematical perspective, a laboratory cost driver is preferably "statistically significant" (i.e., it maintains a p value of not more than about 0.05 in a regression test performed on a sample of about 100 admissions for a condition) and "economically meaningful" (i.e., it has a beta weight (parameter estimate) in regression of at least about three times the cost of the test itself). As illustrated below, the parameter estimate of a laboratory cost driver may correspond to an expected change in the total cost of treating a patient each time the test associated with the cost driver is performed. Referring now to FIG. 2B, a process 224 for identifying laboratory cost drivers is depicted. The process begins at a starting point 226, and then at a step 228 the condition from which the laboratory cost drivers are to be developed is specified. As mentioned, laboratory cost drivers may be identified by choosing them from a collection of electronic records for patients having the pre-identified condition. To get at such records, one must first identify patients having the condition. This is accomplished at a step 230 by, for example, matching the selection vectors described above with UB-92 forms for a number of patients. In another example, such records are obtained by selecting those records given the appropriate DRG for the condition of interest. The data set for the patients selected in step 230 may then be combined with the discharge records. Regardless of how such records are obtained, they are now evaluated to identify and sum by test type all or at least many of the tests that were conducted for each patient in the data set. See step 232. Thereafter, the various tests are ranked by volume.
Accordingly, the test that is performed more than all other tests is ranked first, the test that is performed second most often is ranked second, and so on. This is depicted in a step 234. Alternatively, the collection of such records is classified according to total resource usage. For example, the records may be ranked according to the length of stay in a hospital, the total test costs, or the total administrative costs for the patients. Then, those records associated with patients who used up the most resources are separated for further analysis. After the ranking, the tests may be filtered, as indicated in a step 236, to focus on those that most likely have a profound effect on resource usage. In one specific example, only the records associated with the top 10 percent of resource usage according to the selected resource category are selected. Also, the list may be analyzed to remove those tests that obviously have no relationship to the condition under consideration. For example, if the specified condition is appendicitis, then cranial X-rays can be disregarded as potential laboratory cost drivers. After an appropriate pool of laboratory cost drivers has been identified, the individual tests are statistically analyzed to quantify the resource usage associated with them. See step 238. Only those tests having a profound impact on resource usage serve as laboratory cost drivers. Process 224 is then concluded at a stopping point 240. In a preferred embodiment, regression analyses are employed to correlate the usage of the resource of interest with various potential laboratory cost drivers. Any conventional regression analysis may be employed. In one specific embodiment, the SAS/STAT software available from the SAS Institute, Inc. of Cary, N.C. may be employed for this purpose. After a laboratory cost driver has been identified, it may be validated by various techniques.
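As a rough sketch of steps 232 through 238, the tallying, ranking, and filtering of tests can be written as below, together with a check of the two criteria stated earlier (a p value of not more than about 0.05 and a beta weight of at least about three times the test cost). The regression itself is not shown; the p value and beta weight are assumed to come from whatever regression package is used. All names and sample data are hypothetical, and comparing the beta weight by magnitude (so that a cost-reducing test can also qualify) is an assumption.

```python
from collections import Counter


def rank_tests_by_volume(admissions, irrelevant=()):
    """Sum tests by type across admissions and rank them by volume.

    `admissions` is an iterable of test-name lists, one list per patient.
    Tests named in `irrelevant` are filtered out (step 236).
    """
    volume = Counter()
    for tests in admissions:
        volume.update(tests)  # every performance of a test counts
    for name in irrelevant:
        volume.pop(name, None)
    return sorted(volume.items(), key=lambda item: (-item[1], item[0]))


def qualifies_as_cost_driver(p_value, beta, test_cost,
                             max_p=0.05, min_ratio=3.0):
    """Apply the significance and economic-meaning thresholds.

    Assumption: the beta weight is compared by magnitude.
    """
    return p_value <= max_p and abs(beta) >= min_ratio * test_cost


# Hypothetical admissions for the condition of interest.
admissions = [
    ["CBC", "CBC", "Gram Stain"],
    ["CBC", "Cranial X-ray"],
    ["Gram Stain"],
]
ranking = rank_tests_by_volume(admissions, irrelevant=("Cranial X-ray",))
print(ranking)
print(qualifies_as_cost_driver(p_value=0.01, beta=500.0, test_cost=50.0))
```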
In one example, validation can be performed by developing and selecting a best explanatory statistical model on 60 percent of records randomly selected from a data set such as that used in step 307 (see FIG. 3A and the associated discussion below). This statistical model will then be applied to the remaining 40 percent of the data sample. Close values for the beta weights in the two models and statistically significant variables in the second model indicate that a valid explanatory model has been identified. The presence of a laboratory cost driver in a patient's electronic profile does not necessarily mean that the patient had the specified condition used to derive the cost driver. Thus, unlike a match with a selection vector, a match with a laboratory cost driver does not necessarily indicate that the patient had the specified condition. Nevertheless, a match to a cost driver does indicate that, statistically, the patient is likely to have deviated from the norm in resource consumption by an amount determined at step 238 of process 224. This is due in part to some association between the test and the underlying condition; a certain fraction of the patients having the test performed have the underlying condition. Various examples of laboratory cost drivers and their associated deviations in resource usage will be set forth below. In general, the laboratory cost drivers find significant value in their ability to predict resource usage.

SELECTION VECTORS AND LABORATORY RESOURCES FOR SEPTIC SHOCK

Septic Shock Selection Vectors

Today, hospitals and other health care organizations are generally reimbursed for a DRG that reflects mainly the principal diagnosis of a patient (the main condition for which the patient was admitted). Many conditions that develop during the course of the patient's stay at the health care organization may not affect reimbursement.
Thus, for example, if a diabetic patient is admitted for appendicitis and during her stay at the hospital she develops septic shock, the hospital is reimbursed only for the cost for treating the appendicitis (with an adjustment for her diabetic condition) but not for the costs associated with treating the septic shock. Unfortunately, septic shock arises relatively commonly during the course of a patient's stay at a health care organization. Thus, health care organizations who treat a relatively high percentage of patients developing septic shock may receive relatively low reimbursement for their actual costs and may have difficulty maintaining financial health. As mentioned above, many hospital discharge records do not directly state that a given patient had a particular condition (septic shock in this case) when, in fact, the patient did have this condition. Identifying those discharge records that do not directly recite septic shock, but which nevertheless represent a patient having septic shock, presents one of the challenges solved by this invention. Most generally, the invention accomplishes this by recognizing that clinically, septic shock arises when one or more bodily systems fail as a result of the toxins produced by an infection. Importantly, that infection must reside in a bodily system or locality that does not form part of the one or more systems that shut down, in order for predictions to be valid. For example, a severe kidney infection may produce enough toxin to poison the respiratory and cardiovascular systems, causing them to shut down. If information in a hospital discharge record reflects this, an appropriately constructed septic shock vector will select that record even though the record does not recite the patient condition code for septic shock. With this in mind, some preferred selection vectors of this invention include patient condition codes for an infection and one or more organ or organ system failures. 
The exact patient condition codes employed in such vectors will depend on the type of illness under study and the coding system used. Selection vectors for records using ICD-9 codes will contain ICD-9 codes for infections and organ system failures. Selection vectors to be applied to records using ICD-10 codes will contain appropriate ICD-10 codes. If an ICD-9 vector generically covers renal failure, it preferably includes most or all of the different ICD-9 codes specifying a renal failure (e.g., code 586.x for acute renal failure, code 997.5 for post operative renal failure, code 593.9 for toxemia, etc.). The following is a list of selection vectors which have been identified and proven to accurately identify those patients who contracted septic shock. As explained above, such selection vectors as applied to a collection of electronic discharge records select only those records of patients having septic shock.
Specific Selection Vectors:
1) Septic shock or toxic shock;
2) Sepsis and organ system failure (central nervous system, heart, coagulation, renal, liver, or lung);
3) Infection and organ system failure (central nervous system, heart, coagulation, renal, liver or lung);
4) Lung infection and other organ failure (central nervous system, heart, coagulation, renal, or liver);
5) Kidney infection and other organ failure (central nervous system, heart, coagulation, liver, or lung);
6) Bacteria and electrolyte imbalance and other organ failure (heart, coagulation, liver, or central nervous system);
7) Bacteria and dysrhythmia and other organ failure (renal, liver, coagulation, central nervous system, or lung);
8) Bacteria and fluid imbalance and other organ failure (central nervous system, coagulation, liver, or lung);
9) Fever of unknown origin and two organ systems failure.
Note that the above list ranks the vectors based upon their ability to accurately identify septic shock.
This ability was determined by comparison against clinical records (each associated with a corresponding electronic discharge record) which unambiguously confirmed the presence or absence of septic shock. Not surprisingly, the first and second vectors specifically recite septic shock, toxic shock, or sepsis. Thus, some electronic records do accurately record these clinical conditions. However, many other records fail to so record this condition. It is these seemingly incomplete records that pose a problem which the present invention addresses. The above-listed selection vectors include components comprising generic patient conditions. As noted, there are many different formats for representing these generic patient conditions. In one specific example, the conditions are represented as strings of ICD-9 codes. The following is a list of ICD-9 codes which code for the generic conditions recited in the above selection vectors.
ICD-9 Codes for Septic Shock
Shock Codes
Septic shock 785.59
Pulmonary shock 518.5
Renal Failure
Acute 586.x
Post OP 997.5
Post trauma 958.5
Post labor 669.3
Toxemia 593.9
Post abortion 634.3 through 639.3 (only codes ending on .3)
Pulmonary Failure
ARDS 518.82
Resp. failure 518.18
On Respirator v46.1
Heart Failure
Fail. w/congestion 428.0
Left ventricle fail. 428.1
Acute sudden fail. 428.9
Cardio-resp. fail. 799.1
Circulatory fail. 799.8
Septic myocarditis 422.92
Toxic myocarditis 422.93
Acute hf & renal fail. 404.93
Acute vent. fail. 391.8
Coagulation Failure
D.I.C. 286.6
Liver Failure
Hepatic fail. 572.8
CNS Failure
Cerebrovascular collapse 437.8
Infections
Septicemia 038.xx
Bacteremia 790.7
Enterobacteremia 038.49
Infect. due to device 996.62
Bacterial Conditions
Specific bacterias 001.x-005.x
008.x-009.x
020.x-041.x
097.x-098.x
100.x-104.x
Note that ".x" represents any possible number for this digit.
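The ".x" notation in the table lends itself to simple pattern matching. The sketch below is one possible reading, not a definitive implementation: each lowercase "x" is taken to stand for exactly one digit, and a range such as 001.x-005.x is taken to cover every three-digit category from 001 through 005. Under this strict reading a two-decimal code such as 038.49 would not fall within 020.x-041.x, so a production matcher might need to be more lenient. The helper names are hypothetical, and entries written with "through" are not handled.

```python
import re


def _wildcard_match(pattern, text):
    """Match `text` against `pattern`, where each 'x' is exactly one digit."""
    regex = "".join(r"\d" if ch == "x" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, text, flags=re.IGNORECASE) is not None


def code_matches(code, pattern):
    """True if an ICD-9 code fits a table entry such as '586.x' or
    a range such as '001.x-005.x' (assumed to range over categories)."""
    if "-" not in pattern:
        return _wildcard_match(pattern, code)
    low, high = pattern.split("-")
    category, _, suffix = code.partition(".")
    low_category, _, low_suffix = low.partition(".")
    high_category = high.partition(".")[0]
    if not (len(category) == 3 and category.isdigit()):
        return False
    in_range = int(low_category) <= int(category) <= int(high_category)
    return in_range and _wildcard_match(low_suffix, suffix)


def record_has(record_codes, patterns):
    """True if any code in the record fits any of the patterns."""
    return any(code_matches(c, p) for c in record_codes for p in patterns)


renal_failure = ["586.x", "997.5", "958.5", "669.3", "593.9"]
print(record_has(["038.9", "586.1"], renal_failure))
print(code_matches("003.1", "001.x-005.x"))
```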
A complete list of ICD-9 codes for the above-listed selection vector components, for a specific embodiment, is provided in the Appendix. Of course, other formats which adequately describe the various patient condition codes may be employed. Among these are ICD-8 codes, ICD-10 codes, CPT-4 codes, ICCS codes, and country-specific codes used in some European data bases. A knowledge of sepsis pathology may suggest additional patient conditions for use in selection vectors. Among these are treatment with specific antibiotics, monoclonal antibody preparations, or tumor necrosis factors; lactate test use; and procalcitonin testing. The patient conditions provided in the above-identified selection vectors were chosen by a statistically rigorous analysis of a large collection of electronic discharge records and associated clinical data. The selection vectors were found to do a remarkably good job of selecting records representing patients who, on average, consumed nearly identical amounts of hospital resources. Referring now to FIG. 3A, the technique employed to identify the above selection vectors is depicted. This technique represents a specific example of the procedure generally depicted in FIG. 2A. As shown in FIG. 3A, a pool of clinical data records 303 is employed. From these records, a subset 305 containing records for only those patients unambiguously contracting septic shock is identified. In this particular example, records subset 305 contains the clinical data for about 400 patients contracting septic shock. Of course, other statistically significant samples could be derived from other sources, so long as it is clear that each of the patients had septic shock. For each of those patients contracting septic shock, associated electronic discharge records 307 are analyzed. Thirty to forty DRGs are represented within records group 305. Of these, DRGs 148, 415, 416 (sepsis), 475, and 483 represented nearly fifty percent of the records.
See the "DRG Guide," 1997 Ed., Medicode, Inc. Salt Lake City, Utah 1996. From ICD-9 codes present in the discharge records of group 309, at least tens of thousands of code combinations are possible. From these combinations, nine combinations were identified which comprise specific septic shock selection vectors of this invention. These combinations were identified by first ranking by frequency all ICD-9 codes appearing in the discharge records 307. The inventor then applied her judgment and knowledge of sepsis and septic shock pathology to identify those combinations that likely specified septic shock implicitly. Of course, other techniques for generating the selection vectors--such as those generation techniques described above--could have been employed. In the end, the selection vectors in group 309 included between one and four ICD-9 codes. As shown, a collection of selection vectors 309 is thereby generated. Together, all nine selection vectors account for 100 percent of the septic shock cases identified from clinical data set 305 and which contained a minimum of three ICD-9 codes, at least one of which was a procedure code. (This last condition is a minimum criteria for meaningful data.) As indicated above, the first vector simply recites the patient condition code for septic shock or toxic shock (ICD-9 code 785.59 in this example). Discharge records having this correct coding accounted for only about 27 percent of the total records for patients known to have septic shock (i.e., 27 percent of the total records in discharge records group 307). Other vectors were necessary to identify the remaining records from within group 307. Generally, these vectors include (1) an infection, possibly of unspecified origin and (2) a system shut down (e.g., renal or heart shut down). In this example, the remaining eight selection vectors identified above capture the remaining 73% of the septic shock cases. 
In essence, the selection vectors other than the first one directly reciting the septic shock patient condition code contain codes that implicitly identify septic shock in a language other than the expected direct language. The selection vectors were validated by applying them to a very large data set of discharge records 311 which included records for patients both with and without septic shock (e.g., patients admitted for heart disease, abortions, and other conditions unrelated to sepsis). In fact, data set 311 included 27 percent of all hospital admissions in the US for the year 1995 (corresponding to millions of hospital admissions). From this large data set, 1000 discharge records were selected for each month of 1995. These 12 sets of 1000 records each (subset 313) were selected by matching the selection vectors within group 309 to the records of data set 311. Thus, it was believed that the records within subset 313 were limited to records for patients that had septic shock. For each of the twelve monthly groups of subset 313, the mean cost of patient treatment and the death rate were extracted. The variation in cost was from about $100 to about $948,000 (corresponding to six standard deviations). Quite surprisingly, it was found that the mean treatment costs of the twelve monthly groups within subset 313 were within about $3,500 of one another. The institutional death rate for septic shock normally varies between about 9 percent and 26 percent. Also surprisingly, it was found that the death rates between the twelve monthly groups of record subset 313 varied within 8 percent. This establishes that the septic shock selection vectors developed as described above correctly select patients of similar cost (resource consumption). More generally, it confirms that the septic shock selection vectors were able to identify patients having quite similar conditions.
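The monthly consistency check described above amounts to grouping the selected records by month, computing each group's mean cost and death rate, and then looking at how far apart the groups are. A minimal sketch follows; the record layout (month, cost, died), the function names, and the sample values are all hypothetical and are not data from the study.

```python
from statistics import mean


def monthly_summary(records):
    """Group (month, cost, died) tuples by month and summarize each group."""
    by_month = {}
    for month, cost, died in records:
        by_month.setdefault(month, []).append((cost, died))
    summary = {}
    for month, rows in by_month.items():
        summary[month] = {
            "mean_cost": mean(cost for cost, _ in rows),
            "death_rate": sum(1 for _, died in rows if died) / len(rows),
        }
    return summary


def spread(summary, key):
    """Difference between the largest and smallest monthly value."""
    values = [group[key] for group in summary.values()]
    return max(values) - min(values)


# Hypothetical records selected by the septic shock vectors.
selected_records = [
    (1, 10000.0, False),
    (1, 30000.0, True),
    (2, 21000.0, False),
    (2, 22000.0, False),
]
summary = monthly_summary(selected_records)
print(spread(summary, "mean_cost"), spread(summary, "death_rate"))
```

A small spread across months, relative to the very wide range of individual costs, is what the text takes as evidence that the vectors select a consistent class of patients.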
Within each of the twelve monthly groups, there were quite large variations in outcome (death rate) and cost (between about $100 and $1,000,000). This variation may be due to different efficiencies of the various hospitals in the US, etc. However, in a large enough mix of US hospitals, the selection vectors of this invention identify septic shock patients having, on average, similar outcomes and similar charges. While the above-listed septic shock selection vectors all identify records for patients who had septic shock, they vary in the types of septic shock that they identify. Septic shock comes in varying degrees of severity, often based upon the difficulty of treating septic shock with antibiotics. For example, sepsis in the kidneys or urinary system is often relatively easy to treat (and hence less severe), while sepsis in the bones, extremities, or closed organs is much more difficult to treat. The various selection vectors identified above may select predominately or exclusively records for patients having a specific type of septic shock (e.g., septic shock associated with an infection of the kidneys). Not surprisingly, the costs of treating septic shock may vary dramatically depending upon the origin of that septic shock. For example, treating septic shock originating with an infection of the kidneys or urinary system costs significantly less, on average, than treating septic shock originating with a bone infection. This results because antibiotic treatment cures urinary system infections much faster, on average, than it cures bone infections. Hence septic shock associated with a urinary system infection can generally be treated less expensively than septic shock associated with a bone infection. As a result, the selection vectors that select one type of septic shock over another type also correspond to different treatment costs. 
It was found during the validation procedure that the first selection vector (containing patient condition codes for septic shock or toxic shock) was associated with an increase in treatment costs of about $2000 on average over the cost of care for sepsis. The remaining eight septic shock selection vectors all similarly predicted a large increase in treatment costs over the cost for simple sepsis. Health care organizations can generally benefit from identifying which types of septic shock their patients have or had. This allows them to adjust risk based upon the types of patients that they typically handle. It also allows the health care organization to intelligently design and implement guidelines for treating septic shock. For example, a hospital may determine that for those patients appearing to have sepsis of the urinary tract, a relatively limited treatment regime can be employed. Appropriate guidelines to this effect could then be monitored using the selection vectors of this invention. That is, by applying a selection vector identifying urinary tract sepsis to the electronic discharge records of a hospital implementing these guidelines, one identifies only those patients having urinary tract sepsis. Patients treated according to the guidelines and identified by the urinary system selection vector should have a relatively low treatment cost under the guidelines. If not, one may assume that the guidelines are not being followed. More generally, the information provided with the selection vectors of this invention provides some insight into which classes of patients are likely to develop sepsis or have actually had septic shock. Such information can be used by the health care organization to evaluate its financial performance. This information may also be used for risk adjustment, whereby health care organizations accepting riskier patients are reimbursed for taking that risk. FIGS. 3B and 3C illustrate one example of how the selection vectors of this invention have been applied to compare competing hospitals. Initially, the electronic discharge records were compared against the above selection vectors to identify a class of similarly severe septic shock patients in each hospital. Prior to this invention, such filtering to identify similar classes of patients across hospitals was either impossible or too laborious to execute. FIG. 3B illustrates the Medicare payment (left bar) and average actual charges incurred (right bar) for the selected class of septic shock patients for each of hospitals 1, 2, 3, and 4. As can be seen, hospital 1 had actual costs far outstripping those of the other three hospitals. Thus, armed with this information, an HMO or other contractor might be disinclined to contract with hospital 1. Similarly, hospital 1 might decide to reassess its treatment procedures for sepsis/septic shock patients. FIG. 3C illustrates the adjusted death rates (adjusted to have similar distribution of risk) for the selected class of septic shock patients for each of the hospitals 1, 2, 3, and 4. As can be seen, hospitals 2 and 4 had death rates far in excess of hospitals 1 and 3. Armed with this information, a patient or a provider organization would likely select hospitals 1 and 3 over hospitals 2 and 4. Hospitals 2 and 4 might reassess their procedures for treating sepsis/septic shock patients. In light of the cost data in FIG. 3B and the outcome data in FIG. 3C, hospital 3 appears to have the best procedures for treating the selected class of septic shock patients.

Sepsis Laboratory Cost Drivers (Laboratory Tests)

While the septic shock selection vectors discussed above do a good job of identifying from hospital discharge records those patients having septic shock, they only partially explain the resources consumed by any given sepsis or septic shock patient.
It has been found that when the septic shock selection vectors are used in conjunction with laboratory cost drivers associated with sepsis therapy, a remarkably good prediction of total resource usage is obtained. These tests likely predict just how severe was the patient's sepsis of septic shock.
Laboratory Test                         Change in Charges per Test
Gram Stain                              $730
Transfusion Cross-Match                 $482
Chemistry Panel 12                      $261
Unspecified Lab Test                    $535
Urinary Microscopy                      -$564
Unspecified Bacteria Test               $189
Serum Magnesium                         $463
Bacterial Sensitivity                   $218
Random Glucose                          $92
A.P.P.T.*                               $516
MIC**                                   $426
Blood Gases                             $603
Leukocyte Differential                  $375
Chemistry Panel 7                       $658
Specimen Collection                     $102
Aerobic or Anaerobic Bacterial Culture  $995
Aerobic Culture                         $498
Complete Blood Count (CBC)              $1,002
*Activated Partial Thromboplastin Time
**Minimum Inhibitory Concentration
Generally, the above tests should be well known to those of skill in the art. For example, the transfusion cross-match test determines whether blood types are compatible for a transfusion. A patient having this laboratory cost driver was thought to need, or actually received, a transfusion. Thus, if the patient had sepsis, it was at least a moderately severe case of sepsis and the cost of treatment went up accordingly (by $482 per cross-match on average). The chemistry profile is a sodium, potassium, chloride electrolyte concentration profile. Urinary microscopy refers to a microscopic examination of a urine sample. Interestingly, for each additional urinary microscopy performed, the charges for an admission actually fall, on average, by $564. As mentioned, urinary infections are often easier (and generally cheaper) to treat. Each serum magnesium test indicates that the treatment cost will go up by an additional $463. Such tests suggest that the patient was going into electrolyte imbalance. The bacterial sensitivity test determines whether the bacteria infecting the patient are sensitive to prescribed antibiotics. The A.P.P.T. test tests for coagulation. The blood gases test tests for CO2 and O2 in the blood. The leukocyte differential test measures the difference in abundance between lymphocytes and granulocytes. The specimen collection covers generic specimens (e.g., wound, blood, urine, etc.). The complete blood count and the anaerobic and aerobic bacterial cultures have the biggest impact on cost.
FIG. 4 graphically depicts the process by which the laboratory cost drivers presented above were derived. This serves as a specific example of the process generally depicted in FIG. 2B. Initially, a data set 330 for about 14,000 admissions for patients having sepsis was obtained. All sepsis admissions were identified by virtue of falling within the sepsis DRG classification (DRG 416).
As explained above, many patients who have been treated for septic shock are never actually classified in DRG 416 because their electronic discharge records either do not recite the ICD-9 code for septic shock or sepsis, or recite it in a field other than the principal diagnosis code field. Of course, those patients classified in the sepsis DRG will assuredly have had sepsis during their treatment. Thus, the patient records considered in this study definitely describe patients who had sepsis. The septic shock vectors described above could also have been used to accurately identify records of septic shock patients not classified into DRG 416.
In this instance, data set 330 included a set of ICCS codes 332 which detailed the laboratory tests employed on the sepsis patients. However, other sources of laboratory test usage may also be suitable for this purpose. The laboratory tests in records 332 were sorted by volume (how often the test was performed in the 14,000 admissions). Then the high volume tests having a logical relation to sepsis were selected. Some high volume tests, such as a hemoglobin test, are simply unrelated to sepsis and could be disregarded. Those laboratory tests passing through this filter were separated into bins 334 by volume. Then the tests in the bins were analyzed in a comparison block 336 for four categories of resource usage: length of hospital stay, ICU use, total test cost, and total admission cost.
Next, a multivariate regression analysis was performed to identify a correlation between tests performed (laboratory cost drivers) and resource usage. In this analysis, the charges for a patient are analyzed as a function of combinations of laboratory cost drivers and sometimes selection vectors. Hundreds of regressions were performed on various laboratory test data with a statistical analysis routine (SAS/STAT Software available from the SAS Institute, Inc. of Cary, N.C.).
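The regression step described above can be illustrated with a minimal sketch. It is not the SAS/STAT procedure used in the study; it is a plain ordinary least squares fit written in pure Python, and the admissions, the intercept of 2000, and the function names are invented for illustration. Only the two per-test effects (+$1,002 for a complete blood count, -$564 for urinary microscopy) are borrowed from the laboratory cost driver table.

```python
def fit_least_squares(counts, charges):
    """Fit charges = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    Solves the normal equations (X'X) b = X'y with Gauss-Jordan elimination.
    """
    X = [[1.0] + [float(c) for c in row] for row in counts]  # leading 1 = intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, charges))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def r_squared(counts, charges, coefs):
    """Share of the variance in charges explained by the fitted expression."""
    predicted = [coefs[0] + sum(b * c for b, c in zip(coefs[1:], row)) for row in counts]
    mean = sum(charges) / len(charges)
    ss_res = sum((y - p) ** 2 for y, p in zip(charges, predicted))
    ss_tot = sum((y - mean) ** 2 for y in charges)
    return 1.0 - ss_res / ss_tot


# Synthetic admissions (hypothetical): (CBC count, urinary microscopy count).
counts = [(0, 0), (1, 0), (2, 1), (0, 2), (3, 1), (1, 3)]
# Charges generated without noise from a made-up intercept of 2000 and
# per-test effects of +1002 (CBC) and -564 (urinary microscopy).
charges = [2000 + 1002 * cbc - 564 * um for cbc, um in counts]

coefs = fit_least_squares(counts, charges)  # [intercept, CBC effect, microscopy effect]
fit = r_squared(counts, charges, coefs)
```

Because the synthetic charges contain no noise, the fit recovers the generating coefficients and the R square is essentially one. With real discharge data the coefficients would be estimates and the R square would fall below one.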
The combinations of tests that best predicted total charges for patient treatment were selected as laboratory cost drivers 338 for a resource usage model. The expression resulting from this analysis has been found to explain 85 to 91 percent of the variation in an individual hospital's charges for patients falling within DRG 416 (sepsis). For comparison, known risk adjustment tools can explain no more than about 42 percent of the variance in charges in such a DRG population.
Expressions for Modeling Health Care Resource Usage
From the above septic shock selection vectors and sepsis laboratory cost drivers, expressions were derived for modeling resource usage for a patient as a function of these vectors and cost drivers. In their simplest forms, these expressions express the total cost of a health care admission as linear functions of the above laboratory cost drivers, assorted well-known generic cost drivers, and any one of the above selection vectors. Such expressions have multiple terms, each composed of a coefficient (parameter) and one of these variables (vector, cost driver, etc.). If the variable is a laboratory cost driver, the value of its term in the expression is the product of its coefficient and the number of times that laboratory test was performed during treatment. Thus, if a laboratory test represented in the expression was not performed during the patient's treatment, then the term does not contribute to the cost of treatment. If the test was performed once, one times the associated coefficient is summed with other terms of the expression. If the test was performed twice, two times the associated coefficient is summed with other terms of the expression, and if the test was performed "n" times, n times the coefficient is summed.
If the variable is a selection vector or a generic binary cost driver, the value of the associated term is (i) zero if the vector or binary cost driver fails to match the patient's electronic profile and (ii) the value of the coefficient itself if the vector or binary cost driver does match the electronic profile. No higher multiples of such a coefficient are possible.
FIGS. 5A and 5B present a list of the terms in an expression developed as described above. The values of the various terms are summed. The following variables may be either zero or one: selection vector ("PRIME VECTOR"), SPEC. UNIT, SURGERY, DIED, BEDS 199, BEDS 299, BEDS 500, NORTHEAST, and NORTH-CENTRAL. In this expression the selection vector is the first septic shock vector identified above (i.e., the vector specifying the patient condition codes for septic shock or toxic shock). If this vector matches the patient's electronic profile, then the variable value is one. If not, then the variable value is zero. When the patient was treated in an intensive care unit or a critical care unit, the SPEC. UNIT variable equals one. When a surgical procedure was performed on the patient, the SURGERY variable equals one. When the patient dies during treatment, the DIED variable equals one. When the patient receives treatment in a hospital having between 0 and 199 beds, the BEDS 199 variable equals one. When the patient receives treatment in a hospital having between 200 and 299 beds, the BEDS 299 variable equals one. When the patient receives treatment in a hospital having more than 500 beds, the BEDS 500 variable equals one. When the patient receives treatment in the Northeastern part of the US (i.e., the New England states, New York, New Jersey, and Pennsylvania), the NORTHEAST variable equals one. And when the patient receives treatment in the North Central part of the US (i.e., Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota and Wisconsin), the NORTH-CENTRAL variable equals one.
All other variables are laboratory cost drivers and may have any non-negative integer value. More specifically, the value of each laboratory cost driver variable is an integer equaling the number of times that the associated laboratory test was performed. The "INTERCEPT" term in the expression is the "y axis" intercept, assuming that the expression takes the form y = INTERCEPT + Σ(PARAMETER*VARIABLE). In this expression, the intercept value is -$10.22. The coefficients have units of US dollars. Thus, if the selection vector is present in a record, $1955 will be subtracted from the cost of treating the patient. If the hospital is located in the Northeastern part of the country, the associated term of the expression contributes $3406.37 to the cost of treating the patient. In the case of a complete blood count laboratory test performed once, the average cost of treating a patient increases by $1001.97. When the complete blood count test is performed twice, the average cost increases by $2003.94, and so on.
Other forms of the above expression may be employed. For example, other selection vectors may be substituted. When this occurs, the coefficients of the selection vector and other expression variables (including the laboratory cost drivers) change, as does the value of the function's intercept. However, for the nine above-listed septic shock selection vectors, the form of the above expression is identical and the parameter values vary only slightly. The above expression (as developed for each of the above-presented nine septic shock selection vectors) explains at least about 85 percent of the variance in patient charges found in the conventional DRG 416 patient categorization.
The following expressions calculate the charges for treating an average sepsis patient (not having septic shock) and an average patient having septic shock. These were derived by a method similar to that employed to derive the above cost expression for septic shock patients.
However, they were normalized for geographic location and hospital size. Thus, the associated parameters do not appear in the expressions.
                  Vector 1       Low Risk
Variable          Septic Shock   Sepsis
Intercept         $5615          $2681
ICU               2977           2950
Surgery           3595           2754
Died              -4301          -1963
Amylase           690            216
Blood gases       576            846
CBC               998            445
Chem. 20          716            619
Creatin. Kinase   -1425          72
Coagulation       -45            230
Creatinine        -596           -107
Culture           -17            511
Glucose           246            91
Gram Stain        2071           -498
Electrolytes      765            880
Magnesium         262            661
Specimen col.     137            99
Urine             -1664          -190
Cross Match       40             848
R square          0.73           0.52
Adj. R square     0.72           0.52
Death Rate        50.9%          7.7%
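As a worked illustration of how such an expression is evaluated, the following minimal sketch sums the intercept and each coefficient times its variable, using the coefficients from the table above. The function name, the dictionary layout, and the example patient are hypothetical, and treating ICU, Surgery, and Died as zero-or-one variables is an assumption carried over from the description of the SPEC. UNIT, SURGERY, and DIED variables.

```python
# Terms assumed to be binary (0 or 1); all other terms are test counts.
BINARY_TERMS = {"ICU", "Surgery", "Died"}

# Coefficients in US dollars, copied from the table above.
VECTOR_1 = {
    "Intercept": 5615, "ICU": 2977, "Surgery": 3595, "Died": -4301,
    "Amylase": 690, "Blood gases": 576, "CBC": 998, "Chem. 20": 716,
    "Creatin. Kinase": -1425, "Coagulation": -45, "Creatinine": -596,
    "Culture": -17, "Glucose": 246, "Gram Stain": 2071, "Electrolytes": 765,
    "Magnesium": 262, "Specimen col.": 137, "Urine": -1664, "Cross Match": 40,
}
LOW_RISK = {
    "Intercept": 2681, "ICU": 2950, "Surgery": 2754, "Died": -1963,
    "Amylase": 216, "Blood gases": 846, "CBC": 445, "Chem. 20": 619,
    "Creatin. Kinase": 72, "Coagulation": 230, "Creatinine": -107,
    "Culture": 511, "Glucose": 91, "Gram Stain": -498, "Electrolytes": 880,
    "Magnesium": 661, "Specimen col.": 99, "Urine": -190, "Cross Match": 848,
}


def predicted_charge(expression, patient):
    """Return INTERCEPT + sum(coefficient * variable) for one patient."""
    total = expression["Intercept"]
    for name, value in patient.items():
        if name in BINARY_TERMS:
            value = 1 if value else 0  # binary terms contribute at most once
        total += expression[name] * value
    return total


# Hypothetical patient: treated in the ICU, two CBCs, three blood gas tests.
patient = {"ICU": 1, "CBC": 2, "Blood gases": 3}
shock_charge = predicted_charge(VECTOR_1, patient)    # 5615 + 2977 + 2*998 + 3*576
sepsis_charge = predicted_charge(LOW_RISK, patient)   # 2681 + 2950 + 2*445 + 3*846
```

A patient with no recorded tests and no binary terms set is predicted at the intercept alone, which is where the difference between the two expressions is most visible.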
The "Vector 1" expression applies only to those patients having electronic discharge records matching the first septic shock selection vector listed above (i.e., records having an ICD-9 code for septic shock or toxic shock). The "Low Risk" expression applies only to those patients having sepsis but not septic shock. Upon inspection of these expressions, it can be seen that hospitals incur significantly lower costs in treating an average sepsis patient than in treating an average septic shock patient. This is evidenced primarily by the more than 100 percent increase (nearly $3000) in the intercept of the septic shock expression over the sepsis expression. Nevertheless, both patient categories fall under DRG 416, and therefore hospitals receive the same reimbursement from Medicare for these patients. The ability to segregate septic shock patients from sepsis patients thus allows an important risk adjustment.
APPLICATIONS EMPLOYING VECTORS AND LABORATORY RESOURCES
In general, the systems of this invention select patients having certain electronic profiles which match a selection vector. The systems accomplish this by determining whether a patient has the collection of patient condition codes found in the selection vector being considered. Each and every one of the codes in the vector must be present in the patient's electronic profile in order for the record to match the vector. A match between the vector and the electronic profile confirms that the patient has the condition (or condition severity) specified by the vector. Laboratory cost drivers may also be employed to aid in the selection process. After the patients have been classified by the selection vector(s) and, if necessary, laboratory cost driver(s), their actual outcomes and/or costs may be determined and compared against similar patients treated at other health care organizations as described above with reference to FIGS. 3B and 3C.
In addition, or alternatively, the system may calculate an adjusted cost and/or adjusted outcome for treating the classified patients. This may be accomplished with an expression such as that illustrated in FIGS. 5A and 5B.
FIG. 6A illustrates some typical inputs and outputs of software implementing the methods/systems of this invention. The illustrated software 601 may run on any suitable computing device such as those described above and illustrated generically in FIG. 1A, and may be stored on any suitable computer readable medium. As shown, software 601 may accept as inputs (a) diagnostic codes (e.g., items 158 and 160 in the FIG. 1B form), (b) procedure codes (e.g., items 162 and 164 in FIG. 1B), (c) laboratory tests performed, (d) patient outcome data (e.g., death or survival), and (e) drugs used in the treatment. Software 601 may also output various classes of patients 603 based upon medical condition and/or severity of the medical condition. For any combination of these classes, software 601 may also output the following information, for example: (a) actual patient outcome, (b) actual treatment cost, (c) actual reimbursement for the treatment, (d) predicted cost adjusted for patient classification, (e) predicted patient outcome adjusted for patient classification, and (f) mean values of any of the foregoing for other comparable health care organizations.
FIG. 6B is a process flow diagram depicting various procedures 600 that might be employed by software 601 or another suitable computer program product. Process 600 begins at a starting point 602 and from there the system provides (1) an appropriate selection vector for a specified condition of interest (step 604) and (2) an appropriate laboratory cost driver (or drivers) for the specified condition (step 606). Then at a step 608, the system analyzes a specified set of electronic patient profiles with the selection vector and laboratory cost driver(s).
This may be accomplished by matching the cost drivers and vector components against the data contained in the electronic patient profiles, although other analysis criteria may be employed. From there, the system selects patient profiles (at a step 610) that match the analysis criteria of step 608. The selected patient profiles should all have the specified medical condition. Thereafter, the system may optionally calculate an adjusted cost of treating the patients at a step 612. At this point, the system may also group the selected patients according to severity, cost, outcome, or other appropriate characteristic. Finally, the system may optionally perform a risk adjustment, at a step 614, based upon the classification or grouping of patients. The process is complete at 616.
The systems of this invention and their outputs may be employed by a health care organization to explain why its costs are higher or lower than the norm. If a health care organization treats sicker than normal patients, the models of this invention can demonstrate this, thereby allowing the organization to justify its costs to an HMO or other entity with which it would like to do business. In addition, this invention can suggest how a health care organization should perform if it implements a particular guideline for a particular class of patients. Such guidelines may cover how an organization tests the patients, the kinds of drugs administered to the patients, when a patient is admitted to an intensive care unit, how early in the treatment patients are tested for drug susceptibility, and how many of these tests are run on a particular patient. In one example, guidelines for patients having severe cases of sepsis may specify that the patients are tested for drug susceptibility relatively early in the process. In this manner, the care provider can provide the correct drug earlier in the treatment process.
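The process flow of FIG. 6B (steps 604 through 614) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented software: the `Profile` class, the function names, the codes placed in the selection vector, and the example profiles are all hypothetical. Selection here uses only the vector, the cost expression is truncated to two laboratory cost drivers from the table of laboratory cost drivers plus the -$10.22 intercept, and expressing the risk adjustment as a ratio of mean actual charges to mean predicted charges is one plausible reading rather than a method stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    patient_id: str
    codes: frozenset      # patient condition codes from the discharge record
    lab_counts: dict      # laboratory test name -> number of times performed
    actual_charge: float  # actual charges for the admission, in US dollars


# Step 604: selection vector (illustrative codes, not one of the nine vectors).
SELECTION_VECTOR = frozenset({"038.9", "785.59"})
# Step 606: laboratory cost drivers, dollars per test performed.
COST_DRIVERS = {"CBC": 1002.0, "Blood Gases": 603.0}
INTERCEPT = -10.22


def select_profiles(profiles):
    """Steps 608 and 610: keep profiles containing every code in the vector."""
    return [p for p in profiles if SELECTION_VECTOR <= p.codes]


def adjusted_cost(profile):
    """Step 612: intercept plus coefficient times count for each cost driver."""
    return INTERCEPT + sum(
        coef * profile.lab_counts.get(test, 0) for test, coef in COST_DRIVERS.items()
    )


def risk_adjustment(selected):
    """Step 614 (assumed form): mean actual charge over mean predicted charge."""
    actual = sum(p.actual_charge for p in selected) / len(selected)
    expected = sum(adjusted_cost(p) for p in selected) / len(selected)
    return actual / expected


profiles = [
    Profile("A", frozenset({"038.9", "785.59"}), {"CBC": 2, "Blood Gases": 1}, 4000.0),
    Profile("B", frozenset({"038.9"}), {"CBC": 1}, 1200.0),
    Profile("C", frozenset({"038.9", "785.59", "584.9"}), {"CBC": 1}, 1500.0),
]
selected = select_profiles(profiles)  # profiles A and C match the vector
ratio = risk_adjustment(selected)     # above 1: charges exceed the prediction
```

A ratio near one would suggest that charges are in line with what the classification predicts; a ratio well above one would prompt the kind of review described for hospital 1 in FIG. 3B.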
One can use the methods of this invention to determine whether patients successfully treated under a proposed or temporary guideline are representative of a large class of patients or are special cases. For example, a guideline for sepsis treatment might specify that patients suspected of having sepsis are immediately given a series of five specific tests. If the outcomes of tests A, B, and D are positive, then the patient is transferred to the intensive care unit. Now if it is found that the patients treated under this guideline have successful outcomes (they cost less and recover more often without complications), the models of this invention may show whether these patients were merely a special case (e.g., the less severe cases of sepsis or cases that are easily treated such as urinary system infections) or were part of a larger class (e.g., all suspected sepsis patients).
A health care organization will want to know how many of the patients successfully treated under its guidelines are actually representative of what the organization would normally see in a given time period. In other words, the organization wants to know how many of these patients it could successfully manage under the guidelines. If the number is large, then the guideline could be broadly applied. If the number is small and limited to a marginal group of patients, then the guideline should be applied to an appropriately smaller group of patients. Without applying the methods of this invention, the health care organization might conclude that its new guideline could be applied to all sepsis patients. If in fact only thirty percent of the patients are of a severity that can be effectively managed according to that guideline, then the health care organization is in for an unpleasant surprise when it broadly implements the guideline. In a simple example, the patients treated under the guidelines could be classified by applying multiple vectors to the patient discharge records.
If the vector (or collection of vectors) that matches covers a wide range of patient classes, then it can be assumed that the guidelines should be applied across all classes. If more limited vectors match the treated patients, then the guidelines should be applied more narrowly.
RANGE OF EMBODIMENTS
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Furthermore, it should be noted that there are alternative ways of implementing both the process and apparatus of the present invention. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.
APPENDIX
CONDITIONS            ICD9 CODES INCLUDED   CODE EXPLANATION
Septicemia            003.1                 Salmonella septicemia
                      038.xx                Septicemia
                      639.0                 Septicemia due to childbirth
                      999.3                 Septicemia due to infusion
Bacterial infection   031.8                 B. bronchiseptica
                      033.1                 Bordetella parapertussis
                      036.0                 Meningococcus
                      040.xx                Other bacterial diseases
                      041.xx                Bacterial infections in conditions classified elsewhere and of unspecified site
                      998.5                 Post op infection
                      Procedure code 99.21  Infusion of antibodies
Site-specified        042.1                 HIV causing other specific infections
infections, except    320.xx                Bacterial meningitis
lungs or kidneys      322.xx                Unspecified meningitis
                      440.24                Atherosclerosis with gangrene
                      540.x                 Acute appendicitis
                      567.0 and 567.2       Peritonitis, infectious
                      577.0                 Acute pancreatitis
                      996.6x                Infection or inflammation due to implant
                      998.5                 Post op. infection
Kidney infection      590.xx                Infection of kidney
                      595.0                 Acute cystitis
                      599.0                 Bacteriuria
Lung infections       481.xx                Pneumococcal pneumonia
                      482.xx                Other bacterial pneumonia
                      484.3                 Pneumonia in whooping cough
                      507.0                 Pneumonia due to inhalation of food or vomitus
                      513.0                 Lung abscess
Fever                 780.6                 Fever of unknown origin
Disorders of          276.2                 Acidosis
electrolytes          276.3                 Alkalosis
                      276.4                 Mixed acid-base disorder
                      276.7                 Hyperkalemia
                      276.8                 Hypokalemia
                      276.9                 Not classified
Fluid disorders       276.0                 Hyperosmolality
                      276.5                 Volume depletion
                      276.6                 Fluid overload
Dysrhythmias          427.0                 Tachycardia
                      427.1                 V. Tach.
                      427.3x                Atrial fibrillation
                      427.4x                Ventricular fibrillation
                      427.6x                Premature beats
                      427.8x                Other dysrhythmias
                      427.9                 Other rhythm disorders
Shock                 785.59                Septic shock
                      998.0                 Post operative shock, endotoxic, hypovolemic, septic
Toxic shock           040.89                Toxic shock syndrome
                      639.5                 Shock due to sepsis in childbirth
Renal failure         586                   Renal failure unspecified
                      584.xx                Acute renal failure
                      996.73                Complic. of renal dialysis
                      997.5                 Renal failure, tubular necrosis, anuria due to complications
                      788.5                 Oliguria/anuria
                      403.91                Hypertension with renal failure
                      Procedure code 39.65  Hemo-dialysis
                      Procedure code 39.95  ECMO
Pulmonary failure     518.5                 Pulmonary insufficiency following trauma and surgery
                      518.81                Respiratory failure
                      518.82                ARDS
                      799.1                 Cardio-respiratory failure
                      V44.0                 Tracheostomy
                      V461                  Respirator
                      Procedure code 31.1   Tracheostomy
                      Procedure code 96.7x  Continuous mechanical ventilation
                      Procedure code 39.66  Percutaneous cardio-pulm. bypass
Hepatic failure       997.4                 Hepatic failure
                      570.x                 Hepatic failure, coma
                      572.2                 Hepatic coma
                      Procedure code 50.92  Hemo-dialysis for hepatic assistance
CNS failure           293.xx                Transient organic psychosis
                      434.xx                Cerebral infarct
                      310.xx                Organic brain syndrome
                      997.0                 Anoxic brain damage
                      996.2                 Mechanical complication of nervous system device
                      780.02                Transient alteration of awareness
Cardiac failure       227.5                 Cardiac arrest
                      428.x                 Heart failure
                      799.1                 Cardio-respiratory failure
                      992.1                 Heart syncope
                      997.1                 Heart syncope with arrest due to complications
                      585.5                 Cardiac shock
Coagulopathy          286.6                 DIC
                      997.2                 Phlebitis or thrombophlebitis due to complications
                      286.7                 Acquired coagulation factor deficiency
                      286.9                 Unspec. coag. defect
                      287.5                 Thrombocytopenia unspecified
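The appendix lists some codes with wildcard positions (e.g., 038.xx or 427.3x). The following minimal sketch shows one way such entries might be matched against the codes in a discharge record. It rests on several assumptions that the appendix does not state: each trailing "x" is taken to stand for zero or one digit, a condition is considered present when any one of its listed codes appears in the record, and a vector is treated as a list of conditions that must all be present. The function names and the example vector are hypothetical, and only three conditions from the appendix are reproduced.

```python
import re

# A few conditions copied from the appendix; values are ICD-9 code patterns.
CONDITIONS = {
    "Septicemia": ["003.1", "038.xx", "639.0", "999.3"],
    "Shock": ["785.59", "998.0"],
    "Dysrhythmias": ["427.0", "427.1", "427.3x", "427.4x", "427.6x", "427.8x", "427.9"],
}


def pattern_to_regex(pattern):
    """Compile an appendix entry, treating each trailing 'x' as an optional digit."""
    head = pattern.rstrip("x")
    n_wild = len(pattern) - len(head)
    if n_wild == 0:
        return re.compile(re.escape(pattern))
    if head.endswith("."):
        # Every decimal position is a wildcard, so the decimal part is optional.
        return re.compile(re.escape(head[:-1]) + r"(\.\d{0,%d})?" % n_wild)
    return re.compile(re.escape(head) + r"\d{0,%d}" % n_wild)


def has_condition(record_codes, condition):
    """True if any code in the record matches any pattern listed for the condition."""
    regexes = [pattern_to_regex(p) for p in CONDITIONS[condition]]
    return any(rx.fullmatch(code) for rx in regexes for code in record_codes)


def matches_vector(record_codes, vector):
    """True only if every condition in the vector is present in the record."""
    return all(has_condition(record_codes, condition) for condition in vector)


# Hypothetical vector: septicemia together with shock.
vector = ["Septicemia", "Shock"]
record_a = {"038.9", "785.59", "276.2"}  # septicemia and septic shock codes
record_b = {"038.9"}                     # septicemia only
match_a = matches_vector(record_a, vector)
match_b = matches_vector(record_b, vector)
```

Compiling the patterns once per condition, rather than on every call, would be the natural refinement when screening a large set of discharge records.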
