OPERATOR INTERFACE (E.G., GRAPHICAL USER INTERFACE)

Virtual models of complex systems

6983227

Abstract

Computer-based virtual models of complex systems, together with integrated systems and methods, provide a development and execution framework for visual modeling and dynamic simulation of said models. The virtual models can be used for analysis, monitoring, or control of the operation of the complex systems modeled, as well as for information retrieval. More particularly, the virtual models in the present implementation relate to biological complex systems. In the current implementation the virtual models comprise building blocks representing physical, chemical, or biological processes, the pools of entities that participate in those processes, a hierarchy of compartments representing time-intervals or the spatial and/or functional structure of the complex system in which said entities are located and said processes take place, and the description of the composition of those entities. The building blocks encapsulate in different layers the information, data, and mathematical models that characterize and define each virtual model, and a plurality of methods is associated with their components. The models are built by linking instances of the building blocks in a predefined way, which, when integrated by the methods provided in this invention, result in multidimensional networks of pathways. A number of functions and graphical interfaces can be selected for said instances of building blocks, to extract in various forms the information contained in said models. Those functions include: a) on-the-fly creation of displays of interactive multidimensional networks of pathways, according to user selections; b) dynamic quantitative simulations of selected networks; and c) complex predefined queries based on the relative position of pools of entities in the pathways, the role that the pools play in different processes, the location in selected compartments, and/or the structural components of the entities of those pools.
The system integrates inferential control with quantitative and scaled simulation methods, and provides a variety of alternatives to deal with complex dynamic systems and with incomplete and constantly evolving information and data.


Claims

What is claimed is:

1. A computer readable medium or media comprising a virtual model of a complex system implementable in a computer system, comprising a plurality of model elements and program instructions, wherein: i) the model elements include first elements representing pools of any number of entities of a given type, state and/or compartment, and a plurality of second elements representing processes of different types in which the pools of entities participate; ii) there are a plurality of input, regulator and output references or links between a plurality of said first and second elements; and iii) the program instructions comprise functions to integrate in the computer system said first and second elements according to said references or links, resulting in a network of one or more pathways, at least one of said first or second elements acting as junctions where different pathways merge or different pathways branch, or both, the network representing the topology of the interactions detected between the components of the complex system.
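The integration step recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the `Process` class, the `integrate` and `junctions` functions, and the example pool and process names are all hypothetical. The sketch assumes that input and regulator links point from a pool to a process, that output links point from a process to a pool, and that a junction is any element with more than one predecessor (merge) or more than one successor (branch).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Process:
    """A 'second element': one process with typed links to pools of entities."""
    name: str
    inputs: list = field(default_factory=list)
    regulators: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def integrate(processes):
    """Merge all links into one directed network (assumed link directions:
    pool -> process for inputs and regulators, process -> pool for outputs)."""
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for proc in processes:
        for pool in proc.inputs + proc.regulators:
            successors[pool].add(proc.name)
            predecessors[proc.name].add(pool)
        for pool in proc.outputs:
            successors[proc.name].add(pool)
            predecessors[pool].add(proc.name)
    return dict(successors), dict(predecessors)


def junctions(successors, predecessors):
    """Return elements where pathways merge (>1 predecessor) or branch (>1 successor)."""
    found = {}
    for node in set(successors) | set(predecessors):
        merge = len(predecessors.get(node, ())) > 1
        branch = len(successors.get(node, ())) > 1
        if merge or branch:
            found[node] = {"merge": merge, "branch": branch}
    return found


# Hypothetical three-process model, for illustration only.
processes = [
    Process("binding", inputs=["ligand", "receptor"], outputs=["complex"]),
    Process("phosphorylation", inputs=["substrate"], regulators=["complex"],
            outputs=["substrate-P"]),
    Process("internalization", inputs=["complex"], outputs=["endosomal-complex"]),
]
successors, predecessors = integrate(processes)
junction_map = junctions(successors, predecessors)
```

In this toy model the pool `complex` is a branch point because it regulates one process and is consumed by another, while `binding` and `phosphorylation` are merge points.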

2. The computer readable medium or media of claim 1, the program instructions further comprising functions to generate visual representations of said network of pathways on the display means of said computer system.

3. The computer readable medium or media of claim 1, wherein said program instructions further comprise functions to generate on-the-fly visual representations of subsets of networks of pathways which meet user-defined criteria, including the selection of at least one of said first or second elements as starting nodes or a direction upstream or downstream from said starting nodes, or any combination thereof.

4. The computer readable medium or media of claim 1, wherein said program instructions further comprise functions to display visual representations of said elements on the display means of said computer system, and navigation functions associated with said visual representations, enabling interactive navigation between related elements.

5. The computer readable medium or media of claim 1, wherein at least one of said regulator references or links represents an activator, inhibitor, catalyst, ligand, agonist, antagonist, repressor, or promoter.

6. The computer readable medium or media of claim 1, wherein a plurality of said elements are classified according to a semantically meaningful hierarchy.

7. The computer readable medium or media of claim 1, further comprising an inference engine capable of data type reasoning over instances of said elements or over the values of properties or attributes of said instances.

8. The computer readable medium or media of claim 1 wherein said program instructions comprise chaining functions to establish downstream or upstream relationships between chains of first or second elements, or any combination thereof, identified based on the sequential order of said first and second elements as defined by said references or links.

9. The computer readable medium or media of claim 1 wherein said program instructions further comprise query functions in reference to one or more of said first or second elements, or any combination thereof, including criteria based on the upstream or downstream position within said network, or both, of any first or second element in relation to said one or more reference elements.

10. The computer readable medium or media of claim 1, wherein said program instructions further comprise query functions to identify a downstream set of any number of said first or second elements, or any combination thereof, of said virtual model which represents a set of pools of entities or processes, or any combination thereof, of said complex system which are putatively affected by manipulating at least one other of said entities or processes represented by a reference query set of at least one of said first or second elements.

11. The computer readable medium or media of claim 1, wherein said program instructions further comprise query functions to identify an upstream set of any number of said first or second elements, or any combination thereof, of said virtual model which represents a set of pools of entities or processes of said complex system which are putative targets for manipulation to achieve any desired outcome, including affecting entities or processes represented by a reference query set of at least one of said first or second elements.
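The downstream and upstream queries of claims 10 and 11 can be read as reachability searches over the integrated network. The following is a sketch under that assumption only. The `DOWNSTREAM` table, the element names, and the functions `invert`, `reachable`, `downstream_set` and `upstream_set` are hypothetical and do not come from the patent.

```python
from collections import deque

# Hypothetical toy network: each key points to the elements directly downstream of it.
DOWNSTREAM = {
    "ligand": ["binding"],
    "receptor": ["binding"],
    "binding": ["complex"],
    "complex": ["kinase-activation"],
    "kinase-activation": ["active-kinase"],
    "active-kinase": ["transcription"],
    "transcription": ["mRNA"],
}


def invert(edges):
    """Reverse every link so that a downstream search becomes an upstream search."""
    reverse = {}
    for source, targets in edges.items():
        for target in targets:
            reverse.setdefault(target, []).append(source)
    return reverse


def reachable(edges, query_set):
    """Breadth-first search; returns every element reachable from the query set,
    excluding the query elements themselves."""
    query_set = set(query_set)
    seen = set(query_set)
    queue = deque(query_set)
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - query_set


def downstream_set(query_set):
    """Elements putatively affected by manipulating the query set."""
    return reachable(DOWNSTREAM, query_set)


def upstream_set(query_set):
    """Elements that are putative targets for influencing the query set."""
    return reachable(invert(DOWNSTREAM), query_set)


affected = downstream_set({"complex"})
targets = upstream_set({"mRNA"})
```

A single search routine serves both query directions because the upstream query is the downstream query on the reversed network.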

12. The computer readable medium or media of claim 1, the model elements further comprising a plurality of entity elements, each comprising descriptions of the common structure, composition or state, or any combination thereof, of the entities represented by one or more of said pools of entities.

13. The computer readable medium or media of claim 12, wherein said program instructions comprise query functions including criteria based on the structure, composition or state described in the entity elements.

14. The computer readable medium or media of claim 1, the model elements further comprising a plurality of organizing elements representing compartments, including space, time-interval, or function compartments, or any combination thereof, organized in one or more levels of a hierarchy, each level representing a respective level of complexity, wherein said first and second elements have relationships with their corresponding organizing elements in the hierarchy.
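One possible way to represent the hierarchy of organizing elements in claim 14 is a parent table plus a location table, sketched below. The compartment names, the element names, and the functions `lineage`, `level` and `located_in` are hypothetical. Only spatial compartments are shown, although the claim also covers time-interval and function compartments.

```python
# Hypothetical compartment hierarchy: each compartment maps to its parent.
PARENT = {
    "cell": None,
    "membrane": "cell",
    "cytoplasm": "cell",
    "nucleus": "cell",
    "nucleolus": "nucleus",
}

# Hypothetical assignment of first and second elements to compartments.
LOCATION = {
    "receptor": "membrane",
    "kinase": "cytoplasm",
    "transcription": "nucleus",
    "rRNA synthesis": "nucleolus",
}


def lineage(compartment):
    """Return the compartment followed by all of its enclosing compartments."""
    chain = []
    while compartment is not None:
        chain.append(compartment)
        compartment = PARENT[compartment]
    return chain


def level(compartment):
    """Depth of a compartment in the hierarchy (the root is level 0)."""
    return len(lineage(compartment)) - 1


def located_in(compartment):
    """Elements located in the compartment or in any compartment nested inside it."""
    return {element for element, home in LOCATION.items()
            if compartment in lineage(home)}


nuclear_elements = located_in("nucleus")
```

Resolving membership through the lineage means a query on `nucleus` also returns elements assigned to `nucleolus`, which is one reading of elements having relationships with their corresponding organizing elements in the hierarchy.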

15. The computer readable medium or media of claim 14 wherein said program instructions further comprise query functions in reference to one or more of said first and second elements selected as reference elements, including criteria based on the upstream or downstream position within said network of any first or second element, the types of its references or links, its relationship with any organizing element, or any combination thereof.

16. The computer readable medium or media of claim 14, wherein said complex system is a biological system, and said organizing elements comprise elements to represent compartments of biological organization.

17. The computer readable medium or media of claim 16, wherein the biological organization represented by said organizing elements comprises organization at the molecular assembly, reaction cascade, subcellular, cellular or multi-cellular levels, or any combination thereof, in one or more levels of a hierarchy.

18. The computer readable medium or media of claim 16, wherein said organizing elements comprise one or more elements to represent different stages during cell activation, cell cycle, apoptosis, differentiation, disease or life cycle, or any combination thereof, organized in one or more levels of a hierarchy.

19. The computer readable medium or media of claim 1, wherein said virtual model represents a biological cellular system applicable to model the transduction of signals provided by ligands in their external environment to the interior of the cell, resulting in the execution of specific functions.

20. A method for predicting pathways affected by a perturbation comprising applying the virtual model and program instructions of the computer readable medium or media of claim to identifying the signal cascades which occur through the pathways when stimuli are introduced to the complex system and dynamically generating results during execution of said model in a computer system.

21. The computer readable medium or media of claim 1, wherein at least one of said elements implements a lumped parameter system, black-box system, encapsulation or aggregation.

22. The computer readable medium or media of claim 1 wherein a plurality of said second elements comprise stoichiometric coefficients relating said inputs and outputs.

23. The computer readable medium or media of claim 1 wherein: a) a plurality of said second elements comprise rate variables representing quantitative or semi-quantitative rates of conversion of inputs into outputs; and b) said program instructions comprise simulation functions to infer or compute, or any combination thereof, the value of said rate variables when simulating the behavior of said complex system.

24. The computer readable medium or media of claim 23 wherein: a) said rate variables are of a plurality of different types, representing different types of processes; and b) said simulation functions comprise a plurality of different types of functions to infer or compute, or any combination thereof, the value of the corresponding different types of rate variables.
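Claims 22 to 24 combine stoichiometric coefficients with rate variables of different types. A minimal sketch of how such a model might be stepped forward in time is given below. The mass-action and Michaelis-Menten rate laws, the forward-Euler integration, and all names and parameter values are assumptions chosen for illustration. The claims do not prescribe any of them.

```python
def mass_action(k):
    """Rate type 1 (assumed): k times the product of input concentrations
    raised to their stoichiometric coefficients."""
    def rate(conc, inputs):
        value = k
        for pool, coeff in inputs.items():
            value *= conc[pool] ** coeff
        return value
    return rate


def michaelis_menten(vmax, km):
    """Rate type 2 (assumed): saturable conversion of a single input pool."""
    def rate(conc, inputs):
        substrate = conc[next(iter(inputs))]
        return vmax * substrate / (km + substrate)
    return rate


# Hypothetical processes; the dict values are stoichiometric coefficients.
PROCESSES = [
    {"inputs": {"A": 1, "B": 1}, "outputs": {"AB": 1}, "rate": mass_action(0.5)},
    {"inputs": {"AB": 1}, "outputs": {"P": 2}, "rate": michaelis_menten(1.0, 0.5)},
]


def step(conc, processes, dt):
    """One forward-Euler step: compute every rate variable from the current
    state, then apply the stoichiometry-weighted changes together."""
    delta = {pool: 0.0 for pool in conc}
    for proc in processes:
        v = proc["rate"](conc, proc["inputs"])
        for pool, coeff in proc["inputs"].items():
            delta[pool] -= coeff * v * dt
        for pool, coeff in proc["outputs"].items():
            delta[pool] += coeff * v * dt
    return {pool: conc[pool] + delta[pool] for pool in conc}


def simulate(conc, processes, dt=0.01, steps=1000):
    for _ in range(steps):
        conc = step(conc, processes, dt)
    return conc


initial = {"A": 1.0, "B": 0.8, "AB": 0.0, "P": 0.0}
final = simulate(initial, PROCESSES)
```

Because each process carries its own rate function, adding a further process type only requires another rate-law factory. The stepping loop stays unchanged.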

25. The computer readable medium or media of claim 1, wherein said complex system is a biological system and a plurality of the entities represented are of at least one of the following types: cell, gene, mRNA, protein or any assembly, complex or combination thereof.

26. The computer readable medium or media of claim 25, wherein a plurality of said elements further comprise links to a plurality of databases, the model providing a mechanism for integration of data in said databases.

27. The computer readable medium or media of claim 25, wherein said program instructions further comprise query functions in reference to a set of one or more of said first or second elements, or any combination thereof, selected as reference, including criteria based on the upstream or downstream position within said network, or both, of any first or second element, or the types of its references or links, or any combination thereof, the result set representing any number of pools of entities or processes which are putative targets for or affected by manipulation of said biological system.

28. The computer readable medium or media of claim 27, wherein said query functions identify an upstream set of any number of said first or second elements, or any combination thereof, representing pools of entities or processes of said biological system which are putative targets for manipulation to achieve any desired outcome, including affecting the expression of any gene(s) or receptor(s), or the secretion of any substance(s), or any combination thereof, wherein the one or more pools of entities or processes which are biomarkers for the outcome are represented by the reference query set.

29. A method for identifying desirable targets for screening agents or combination of agents to be further tested for the prevention or treatment of adverse conditions on a human or any other living organism, or any part thereof, comprising applying the virtual model and program instructions of the computer readable medium or media of claim 28 in a computer system to identify pools of entities or processes, represented by said upstream set, which are putative targets for manipulation in said human or other living organism, or any part thereof, to achieve any desired outcome known to affect the prevention or treatment of said adverse conditions, wherein said reference query set represents the one or more pools of entities or processes which are biomarkers for the desired outcome.

30. A method for designing strategies for controlling the expansion or further differentiation of progenitor or other precursor living cells in a cell culture or reactor to produce outcome cells of one or more types having characteristic phenotypes, comprising: i) applying the virtual model and program instructions of the computer readable medium or media of claim 28 in a computer system to identify the upstream set of pools of entities or processes which are putative targets for putative inducing agents, the reference query set representing the pools of entities or processes which are biomarkers for the one or more phenotypes of the desired outcome cells; and ii) applying one or more of said identified targets for designing a strategy for controlling the expansion or further differentiation of said precursor cells in said cell culture medium.

31. A computerized storage and retrieval system of biological information comprising the virtual model and instructions of the computer readable medium or media of claim 27.

32. A method for identifying or characterizing candidates for agent development, including drugs or any other preventive, diagnostic, prognostic or therapeutic agents or procedures, or any combination thereof, comprising applying the virtual model and program instructions of the computer readable medium or media of claim 27 in a computer system to determine the pools of entities or processes in said biological system which are putatively affected by applying said at least one candidate agent or involved in attaining a desired outcome, wherein the model of the applicable biological system includes at least one first or second element representing at least one agent or its interaction with the biological system or a biomarker for the desired outcome in said biological system.

33. The method of claim 32, wherein the virtual model and program instructions are applied to identifying any number of first or second elements which are downstream of the at least one first or second element representing in said virtual model at least one agent or its interaction with the biological system, and thereby identifying pools of entities or processes in said biological system which are putatively affected by said at least one agent acting upon said biological system.

34. The method of claim 32, wherein the virtual model and program instructions are applied to: i) selecting a set of two or more of said first or second elements representing two or more agents or their interactions with their targets, ii) identifying a shared set of any number of first or second elements which are shared by the pathways downstream from the first or second elements of the selected set, and iii) identifying any outcome which would be affected by changes in the pools of entities or processes represented by the shared set as a result of manipulating the pools of entities or processes represented by the selected set, and thereby identifying or characterizing putative combined therapy regimes or harmful side effects of the combination of two or more selected agents.

35. The method of claim 32, wherein the virtual model and program instructions are applied to identifying any number of first and second elements in said virtual model which are upstream of the at least one first or second element representing in said virtual model at least one pool of entities or process which are biomarkers for a desired outcome, and thereby identifying upstream candidate target entities or processes which when targeted by an agent would influence said desired outcome in said biological system.

36. The method of claim 32, wherein the virtual model and program instructions are applied to identifying common first or second elements in said virtual model which are upstream of a plurality of first or second elements representing pools of entities or processes which are biomarkers for at least one desired outcome, and thereby identifying upstream candidate target entities or processes which when targeted by an agent would simultaneously influence the plurality of biomarkers for the at least one desired outcome.
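The shared downstream set of claim 34 and the common upstream elements of claim 36 can both be sketched as intersections of reachability sets. The edge list, the agent and biomarker names, and the functions below are hypothetical. The sketch assumes that "shared" and "common" mean present in the downstream (or upstream) closure of every selected element.

```python
# Hypothetical directed links (source, target) of an integrated network.
EDGES = [
    ("drug-A", "kinase-1"), ("drug-B", "kinase-2"),
    ("kinase-1", "TF-X"), ("kinase-2", "TF-X"),
    ("kinase-1", "marker-1"),
    ("TF-X", "marker-2"), ("TF-X", "marker-3"),
]


def adjacency(edges, reverse=False):
    """Build an adjacency table, optionally with every link reversed."""
    adj = {}
    for source, target in edges:
        if reverse:
            source, target = target, source
        adj.setdefault(source, set()).add(target)
    return adj


def closure(adj, start):
    """All elements reachable from start by following links (depth-first)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def shared_downstream(edges, selected):
    """Elements lying downstream of every selected element (claim 34, step ii)."""
    adj = adjacency(edges)
    return set.intersection(*(closure(adj, s) for s in selected))


def common_upstream(edges, biomarkers):
    """Elements lying upstream of every biomarker element (claim 36)."""
    adj = adjacency(edges, reverse=True)
    return set.intersection(*(closure(adj, b) for b in biomarkers))


combined_effects = shared_downstream(EDGES, ["drug-A", "drug-B"])
candidate_targets = common_upstream(EDGES, ["marker-1", "marker-2"])
```

Here the two hypothetical agents converge on `TF-X` and its two markers, while only `kinase-1` and `drug-A` sit upstream of both `marker-1` and `marker-2`.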

37. A method for identifying putative effects of an agent or combination of agents on a human or other living organism, or any part thereof, the agents including environmental agents, drugs or any combination thereof, comprising applying the virtual model and program instructions of the computer readable medium or media of claim 27 in a computer system to identify downstream pools of entities or processes which could be affected by exposing said human or other living organism, or any part thereof, to said agent or combination of agents, wherein the reference set of one or more first or second elements represents the agent or combination of agents to be tested and the interactions of the agents with their targets in said human or other living organism, or any part thereof.

38. The computer readable medium or media of claim 25, wherein: a) a plurality of said first elements comprise first variables, representing the quantities or concentrations of the entities of said pools; and b) said program instructions comprise simulation functions to infer or compute, or any combination thereof, the values of said first variables when simulating the quantitative behavior of said complex system.

39. The computer readable medium or media of claim 38, wherein: a) a plurality of said second elements comprise second variables representing the rates of conversion of inputs into outputs represented by said input and output references or links; and b) said program instructions comprise simulation functions to infer or compute, or any combination thereof, the values of said second variables when simulating the quantitative behavior of said complex system.

40. The computer readable medium or media of claim 39, the virtual model further comprising a plurality of organizing elements representing compartments, including space, time-interval, or function compartments, organized in one or more levels of a hierarchy, each level representing a respective level of complexity, wherein said first and second elements have relationships with their corresponding organizing elements in the hierarchy.

41. The computer readable medium or media of claim 40, wherein said organizing elements comprise one or more elements to represent compartments of biological organization at the molecular assembly, reaction cascade, subcellular, cellular or multi-cellular levels, or any combination thereof, in one or more levels of a hierarchy.

42. The computer readable medium or media of claim 40, wherein said organizing elements comprise a plurality of elements representing different stages of the cell, including phases of the cell cycle, apoptosis, differentiation, disease or life cycle, or any combination thereof, in one or more levels of a hierarchy.

43. A method for simulating physiological or pathological states of a cell by using the computer readable medium or media of claim 40 in a computer system, comprising:

a) implementing a virtual model of the cell comprising one or more first or second elements representing biomarkers for one or more phenotypes of one or more states of the cell, wherein one or more of said elements have a relationship with one or more of said organizing elements;

b) setting the values of one or more sets of parameters or initial conditions of said virtual model; and

c) executing the simulation functions over the variables of said virtual model to simulate a state or a succession of states of the cell.

44. The computer readable medium or media of claim 39 applicable to testing different model scenarios, wherein said program instructions comprise functions to assign to at least one selected variable or parameter of said virtual model a plurality of initial values representative of different model scenarios.

45. The computer readable medium or media of claim 39, wherein said program instructions comprise query functions for finding an upstream set of any number of said elements representing entities or processes of said biological system which are putative targets for manipulation to achieve any desired outcome, including affecting the expression of any gene(s) or receptor(s), or the secretion of any substance(s), the pools of entities or processes which are biomarkers for the outcome represented by a reference query set of any number of said elements.

46. The computer readable medium or media of claim 39, wherein said program instructions comprise query functions for finding a downstream set of any number of said elements representing entities or processes of said complex system which are putatively affected by the manipulation of entities or processes represented by a reference query set of any number of said elements.

47. A method for predicting putative beneficial or toxic effects of an agent or combination of agents on a human or other living organism, or any part thereof, the agents including drugs or any other preventive, diagnostic, prognostic or therapeutic agents or procedures, by applying the virtual model and program instructions of the computer readable medium or media of claim 39 in a computer system, comprising:

a) implementing a virtual model representing said human or other living organism, or any part thereof, which comprises one or more sets of first and second elements representing the agent or combination of agents to be tested and the interactions of the agents with their targets in said human or other living organism, or any part thereof;

b) executing said virtual model under various sets of conditions to simulate the effects of manipulating the agent or combination of agents on the quantitative behavior of said human or other living organism, or any part thereof; and

c) identifying a set of any number of said first or second elements of said virtual model which represent pools of entities or processes of said human or other living organism, or any part thereof, which are putatively affected by exposing said human or other living organism, or any part thereof, to said agent or combination of agents, thereby identifying the beneficial or toxic effects of the agent or combination of agents.

48. A method for predicting putative targets for manipulation for the prevention or treatment of adverse conditions on a human or any other living organism, or any part thereof, by applying the virtual model and program instructions of the computer readable medium or media of claim 39 in a computer system, comprising:

a) implementing in said computer system a virtual model of said human or other living organism, or any part thereof;

b) identifying, in reference to a query set, an upstream set of any number of said first or second elements of said virtual model representing pools of entities or processes which are putative targets for manipulation to achieve any desired outcome in said human or other living organism, or any part thereof, known to affect the prevention or treatment of said adverse conditions, including affecting the expression of any gene(s) or receptor(s), or the secretion of any substance(s), wherein the reference query set of one or more of said first or second elements represents pools of entities or processes which are biomarkers for the desired outcome; and

c) executing said virtual model under various sets of conditions to simulate the effects of manipulating the putative targets on achieving the desired outcome, and thereby characterizing the putative targets.

49. A method for identifying one or more components of a cell as putative targets for interaction with one or more agents, the agents including drugs or any other preventive, diagnostic, prognostic or therapeutic agent or procedure, by applying the virtual model and program instructions of the computer readable medium or media of claim 39 in a computer system, comprising:

a) implementing a virtual model of the cell comprising first and second elements representing components characteristic of a phenotype of the cell;

b) executing the simulation functions over the variables of said elements to simulate a first state of the cell;

c) perturbing the virtual model by deleting one or more elements thereof, changing the amount or concentration of one or more first elements thereof or modifying one or more simulation functions or the relationships between one or more elements thereof, or any combination thereof;

d) executing the simulation functions over the variables of said elements after said perturbation of the virtual model to simulate a second state of the cell; and

e) comparing said first and second simulated states of the virtual model to identify the effect of said perturbation on the state of the virtual model, and thereby identifying one or more components of said cell as desirable putative targets for interaction with one or more agents.
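Steps a) to e) of claim 49 can be sketched as a loop that simulates a baseline state, deletes one element at a time, simulates again, and compares the results. Everything in the sketch is an assumption made for illustration: the first-order kinetics, the forward-Euler integration, the pool names, the rate constants, and the choice of deleting processes as the only kind of perturbation.

```python
def simulate(rates, initial, dt=0.01, steps=2000):
    """Forward-Euler simulation of first-order conversions.
    rates maps (source_pool, target_pool) -> rate constant."""
    conc = dict(initial)
    for _ in range(steps):
        # Compute all fluxes from the current state before updating any pool.
        flux = {edge: k * conc[edge[0]] for edge, k in rates.items()}
        for (source, target), v in flux.items():
            conc[source] -= v * dt
            conc[target] += v * dt
    return conc


# Hypothetical model: one precursor feeding a product and a competing byproduct.
BASE_RATES = {
    ("precursor", "intermediate"): 1.0,
    ("intermediate", "product"): 0.5,
    ("intermediate", "byproduct"): 0.5,
}
INITIAL = {"precursor": 1.0, "intermediate": 0.0, "product": 0.0, "byproduct": 0.0}

# Step b): simulate the first (unperturbed) state.
first_state = simulate(BASE_RATES, INITIAL)

# Steps c) to e): delete one process at a time, re-simulate, and compare.
effects = {}
for deleted in BASE_RATES:
    perturbed = {edge: k for edge, k in BASE_RATES.items() if edge != deleted}
    second_state = simulate(perturbed, INITIAL)
    effects[deleted] = second_state["product"] - first_state["product"]

# Rank deletions by how strongly they change the amount of product.
ranked = sorted(effects, key=lambda edge: abs(effects[edge]), reverse=True)
```

In this toy model, deleting the competing byproduct process raises the product level, which would flag that process as a putative target if more product were the desired outcome.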

50. The computer readable medium or media of claim 1 further enabling quantitative, semi-quantitative or mixed type simulations of said virtual models in said computer system, wherein:

a) mathematical models characterizing the quantitative behavior of said complex system comprise a number of quantitative or semi-quantitative variables or parameters, or any combination thereof, distributed among said first and second elements, said variables or parameters representing characteristics of the components they represent; and

b) said program instructions further comprise inference or simulation functions, or any combination thereof, for computing the values of said variables during the execution of the model.

51. The computer readable medium or media of claim 50 applicable to testing different model scenarios, wherein said inference or simulation functions comprise functions for assigning to at least one of said variables or parameters of said model a plurality of initial values representative of different model scenarios.
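The scenario testing of claim 51 amounts to running the same model from several initial values of one variable. A small sketch follows. The `run` function, its first-order kinetics, and the scenario names and values are hypothetical.

```python
def run(initial_substrate, k=0.3, dt=0.1, steps=100):
    """Simulate first-order conversion of substrate to product (assumed kinetics)."""
    substrate, product = initial_substrate, 0.0
    for _ in range(steps):
        v = k * substrate
        substrate -= v * dt
        product += v * dt
    return {"substrate": substrate, "product": product}


# Hypothetical scenarios: a plurality of initial values for one variable.
SCENARIOS = {"low": 0.1, "normal": 1.0, "high": 10.0}

results = {name: run(value) for name, value in SCENARIOS.items()}
```

Keeping the scenarios in a table separate from the model makes it straightforward to add further cases without touching the simulation code.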

52. The computer readable medium or media of claim 50 wherein said program instructions comprise functions to identify common upstream pools of entities or processes, or any combination thereof, which putatively exert influences on a plurality of selected pools of entities or processes, or any combination thereof, based on the sequential ordering of the first and second elements as defined by said references or links.

53. The computer readable medium or media of claim 50 wherein said program instructions comprise functions to identify downstream pools of entities or processes, or any combination thereof, influenced by one or more selected pools of entities or processes, or any combination thereof, based on the sequential ordering of the first and second elements as defined by said references or links.

54. The computer readable medium or media of claim 50 applicable in a computer system comprising monitoring capabilities for monitoring the operation of said complex system in conjunction with the virtual model, comprising means for mapping one or more monitored variables of any component of said complex system to the corresponding variables which represent them in said virtual model.

55. The computer readable medium or media of claim 50 applicable in a computer system further comprising controlling means for developing strategies or generating instructions for controlling the operation of said complex system based on the simulation of said virtual model by modifying any component of said complex system represented by any of said variables of said elements of the model.

56. The computer readable medium or media of claim 50, wherein said complex system is a biological system and a plurality of the entities represented are of at least one of the following types: cell, gene, mRNA, protein or any assembly, complex or combination thereof.

57. A method for predicting a function of a component of a complex biological system, intrinsic or added to the system, comprising applying the virtual model and program instructions of the computer readable medium or media of claim 56 in a computer system to simulate perturbations of said component in said virtual model for testing the function of said component in said biological system, wherein the virtual model of the applicable biological system comprises at least one first or second element representing the component or its interactions with one or more components of said biological system.

58. A method for identifying or characterizing candidates for agent development, including drugs or any other preventive, diagnostic, prognostic or therapeutic agents or procedures, or any combination thereof, comprising applying the virtual model and program instructions of the computer readable medium or media of claim 56 in a computer system to identify the pools of entities or processes which are putatively affected by applying at least one candidate agent or involved in attaining a desired outcome in said biological system, wherein the model of the applicable biological system includes at least one first or second element representing said at least one agent or its interaction with the biological system or at least one biomarker for the desired outcome in said biological system.

59. The method of claim 58 wherein the virtual model and program instructions are applied to simulating the effects of manipulating at least one first or second element in said virtual model, representing said at least one agent or its one or more interactions with its one or more targets on said biological system, on any number of downstream first or second elements, which represent the pools of entities or processes putatively affected by said at least one agent, and thereby predicting putative beneficial or harmful effects of said at least one agent.

60. The method of claim 58 wherein the virtual model and program instructions are applied to simulating the effects of manipulating any combination of two or more first or second elements in said virtual model, representing a plurality of agents or their interactions with their targets, on a downstream set of any number of first or second elements which are shared by the downstream pathways from said plurality of first or second elements, the downstream set representing the entities or processes of said complex system affected by said combination, and thereby predicting putative effects of combined therapy regimes or harmful side effects for said combination of agents.

61. A method for identifying or characterizing appropriate applications for an agent, including drugs or any other preventive, diagnostic, prognostic or therapeutic agents or procedures, or any combination thereof, comprising applying the virtual model and program instructions of the computer readable medium or media of claim 56 in a computer system to simulate and evaluate the effects of the agent on the biological system of interest at the subcellular, cellular, or multi-cellular level, or any combination thereof.

62. A method for identifying one or more components of a cell as putative targets for interaction with one or more agents, the agents including drugs or any other preventive, diagnostic, prognostic or therapeutic agent or procedure, by applying the virtual model and program instructions of the computer readable medium or media of claim 56 in a computer system, comprising:

a) implementing a virtual model of the cell comprising first and second elements representing components characteristic of a phenotype of the cell;

b) setting the parameters or initial conditions of said virtual model to correlate said phenotype to the state of the cell;

c) perturbing the virtual model by deleting one or more elements thereof, changing the parameters or initial conditions of one or more elements thereof or modifying one or more simulation functions or the relationships between one or more elements thereof; and

d) executing the simulation functions over the perturbed virtual model to determine whether said perturbation causes a desired transition of said cell from one phenotype to another, and thereby identifying one or more components of said cell as desirable putative targets for interaction with one or more agents.

63. A method for identifying one or more components of a cell as putative targets for interaction with one or more agents, the agents including drugs or any other preventive, diagnostic, prognostic or therapeutic agent or procedure, by applying the virtual model and program instructions of the computer readable medium or media of claim 56 in a computer system, comprising:

a) implementing a virtual model of the cell comprising first and second elements representing components believed to be intrinsic to a phenotype of the cell;

b) generating one or more expanded or perturbed virtual models by inferring new or modified first and second elements and their relationships to any of the elements of the virtual model subsystem using experimental data;

c) determining parameter values or constraining the virtual models by (i) sampling a set of one or more initial, expanded or perturbed virtual models and parameter values, (ii) simulating said virtual models by executing the simulation functions over the variables of the virtual models, and (iii) determining the initial values of variables and parameters which optimally fit a given set or sets of experimental data; and

d) comparing a plurality of simulated states of the initial or expanded or perturbed virtual models to identify the effect of said expansion or perturbation on the state of the virtual model, and thereby identifying the pertaining new or modified first or second elements representing the one or more targets for interaction with one or more agents.

64. A method for designing or implementing a strategy for controlling the expansion or further differentiation of progenitor or other precursor living cells in a cell culture or reactor to produce outcome cells of one or more types having characteristic phenotypes, comprising applying the virtual model and program instructions of the computer readable medium or media of claim 56 in a computer system by:

a) implementing a virtual model of the cellular system which comprises at least one first or second element representing at least one putative inducing agent or its interaction with at least one target in said precursor cell, or at least one pool of entities or process which is to be induced or modified to embody a characteristic phenotype of at least one outcome cell type, or any combination thereof;

b) executing said virtual model to simulate the putative effects of manipulating said one or more putative inducing agents in achieving said desired phenotypes of said outcome cell types; and

c) designing or implementing a strategy for controlling the expansion or further differentiation of said precursor cells in said cell culture medium or reactor based on one or more of said manipulations which result in the desired phenotypes.

65. A method for designing or implementing a strategy for controlling the production in a living cellular system of one or more substances for preventive, therapeutic or diagnostic uses by applying the virtual model and program instructions of the computer readable medium or media of claim 56 in a computer system, wherein the living cellular system comprises the producing cells in a cell culture, reactor or a microorganism, plant or animal living environment, comprising:

a) implementing a virtual model of the living cellular system which includes one or more first or second elements representing one or more pools of substances to be produced or their production processes within said producing cells;

b) executing said virtual model and any number of perturbations of said virtual model to simulate the putative effectiveness of said living cellular system and any number of manipulations of said living cellular system in achieving the desired production of said substances by said living cellular system; and

c) designing or implementing a strategy for controlling the production of said substances in said living cellular system based on the simulation of said virtual model or any number of said perturbations which result in the desired level of production of said one or more substances.

66. The computer readable medium or media of claim 1 wherein a plurality of said first elements comprise at least one input or output reference or link to second elements or their components, representing respectively an input to the represented pool of entities from the represented process or a contribution by the represented pool of entities to the represented process.

67. The computer readable medium or media of claim 66 wherein said first elements comprise state variables and said program instructions comprise simulation functions to infer or compute, or any combination thereof, the value of any of said state variables over said inputs or outputs, or both.

68. The computer readable medium or media of claim 67 wherein said second elements comprise rate variables representing the rate of conversion of said inputs into said outputs and said program instructions comprise simulation functions to infer or compute, or any combination thereof, the value of any of said rate variables.

69. A computer system for implementing the virtual model and instructions of the computer readable medium or media of claim 1.

70. A method for implementing a virtual model of a complex system in a computer system comprising defining or selecting, or any combination thereof:

a) a plurality of first and second variables, the first variables representing quantitatively, semi-quantitatively, or any combination thereof, pools of any number of entities of a given type, state and/or compartment, and the different types of second variables representing quantitatively, semi-quantitatively, or any combination thereof, the rates of different types of processes of said complex system where said entities participate;

b) a plurality of input, regulator and output relationships between a plurality of said first and second variables;

c) functions to integrate said variables and their relationships, resulting in a network of one or more pathways, a plurality of said variables acting as junctions where different pathways merge or different pathways branch, or both; and

d) functions to compute the values of said variables, to simulate the behavior of said complex system during execution of said model.

71. The method of claim 70 further comprising defining or selecting, or any combination thereof, links between a plurality of said elements and a plurality of databases, thereby integrating data in said databases.

72. The method of claim 70 further comprising defining or selecting, or any combination thereof, functions associated with any of said variables enabling the successive interactive navigation between related variables.

73. The method of claim 70 further comprising defining or selecting, or any combination thereof, functions to generate visual representations of said network of pathways on the display means of said computer system.

74. The method of claim 70 further comprising defining or selecting, or any combination thereof, functions to generate on-the-fly visual representations of subsets of the networks of pathways which meet user-defined criteria, including selection of one or more of said variables as starting nodes or a direction upstream or downstream from said starting nodes, or any combination thereof.

75. The method of claim 70 further comprising defining or selecting, or any combination thereof, query functions in reference to one or more of said variables, including criteria based on the upstream or downstream position within said network of any first or second variable in relation to said reference one or more variables, or any combination thereof.

76. The method of claim 70 applied to test different model scenarios, further comprising assigning to at least one selected variable or parameter of said virtual model a plurality of initial values representative of different model scenarios.

77. The method of claim 70 wherein said computer system comprises monitoring capabilities applicable in conjunction with the virtual model, further comprising defining or selecting, or any combination thereof, program functions for i) mapping the one or more monitored variables of components of said complex system to their corresponding variables in said model, ii) comparing said monitored values to the simulated values of their corresponding variables; and iii) inferring or computing, or any combination thereof, any adjustments derived from said comparisons.

78. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 77.

79. The method of claim 70, wherein said computer system comprises controlling capabilities applicable in conjunction with the virtual model, further comprising defining or selecting, or any combination thereof, program functions to control the operation of said complex system by modifying any of its components based on the simulation of said model.

80. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 79.

81. The method of claim 70 further comprising defining or selecting, or any combination thereof, a plurality of organizing elements representing compartments, including space, time-interval, or function compartments, or any combination thereof, organized in one or more levels of a hierarchy, each level representing a respective level of complexity; and relationships between a plurality of said variables and their corresponding organizing elements in the hierarchy.

82. The method of claim 81 further comprising defining or selecting, or any combination thereof, query functions in reference to one or more of said variables, including criteria based on the upstream or downstream position within said network of any first or second variable in relation to said reference one or more variables or the relationships between said variable and any of said organizing elements, or any combination thereof.

83. The method of claim 70 wherein said regulator relationships comprise relationships of at least one of the following types: activator, inhibitor, catalyst, ligand, agonist, antagonist, repressor or promoter.

84. The method of claim 70, wherein said complex system is a biological system and the model is applied to identifying putative targets for manipulation to achieve any desired outcome, including affecting the expression of any gene or receptor, or the secretion of any substance, or any combination thereof, in various types of applications, including treatment of disease, improvement of livestock or food crops, or improvement of the environment, further comprising defining or selecting, or any combination thereof, functions to identify a target set of first or second variables, or any combination thereof, representing the putative target pools of entities or processes, which are upstream of a reference query set of one or more first or second variables, or any combination thereof, representing the pools of entities or processes which are biomarkers for said outcome.

85. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 84.

86. The method of claim 70, wherein said complex system is a biological system and the model is applied to identifying the putative effects of manipulation strategies for said biological system in various types of applications, including treatment of disease, improvement of livestock or food crops, or improvement of the environment, further comprising defining or selecting, or any combination thereof, functions to identify a downstream set of first or second variables, or any combination thereof, representing the pools of entities or processes of said biological system which are putatively affected by manipulating the entities or processes represented by a reference query set of one or more of said first or second variables, or any combination thereof.

87. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 86.

88. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 70.

89. A method for implementing models of biological complex systems in a computer system comprising aggregating a plurality of modules representing components of a complex system, wherein: i) a plurality of the modules comprise one or more terminals allowing the module to exchange signals with other modules; ii) the modules include process modules representing different types of processes; iii) a plurality of the process modules comprise input, regulator and output terminals; iv) the aggregating comprises establishing references or links, or any combination thereof, between the terminals of different modules resulting in a network of one or more crossing pathways which describes the topology of the complex system.

90. The method of claim 89, further comprising defining or selecting links between a plurality of said modules and a plurality of databases, thereby integrating data in said databases.

91. The method of claim 89, further comprising applying libraries of predefined and reusable modules in at least one of the composition or extension of said models, the libraries comprising different types of said modules, their components, or any combination thereof.

92. The method of claim 89, wherein said modules have references to or containment relationships with at least one compartment module in a hierarchy of at least one level representing at least one physical or conceptual compartment of the complex system.

93. The method of claim 89 wherein the modules comprise first modules representing pools of any number of entities of a given type, state and/or compartment and the references or links are references or links between the terminals of said first modules and the terminals of said process modules.

94. The method of claim 93, further comprising applying chaining functions to establish downstream or upstream relationships between chains of first modules and process modules, or any combination thereof, identified based on the sequential order of said first modules and process modules as defined by said references or links.

95. The method of claim 93, further comprising generating visual representations of said model on the display means of said computer system, with nodes representing a plurality of said modules linked according to said references or links resulting in a network of crossing pathways.

96. The method of claim 93 wherein a plurality of the entities represented are of at least one of the following types: cell, gene, mRNA, protein or any assembly, complex or combination thereof.

97. The method of claim 96 applied to the development of agents, the agents including drugs or any other preventive, diagnostic, prognostic or therapeutic agents or procedures, further comprising:

a) including in the model of the applicable biological system at least one first module or process module representing at least one putative agent or its interaction with at least one target in said biological system, or biomarker for a desired outcome in said biological system; and

b) applying said model to identify any number of pools of entities or processes affected as a result of applying said at least one putative agent or involved in achieving said desired outcome in said biological system.

98. The method of claim 97, wherein the method is applied to identifying putative beneficial or harmful effects of at least one agent acting upon said biological system, comprising identifying any number of first modules or process modules in said virtual model which are downstream of the at least one first module or process module representing the at least one agent or its interaction with the at least one target in said biological system.

99. The method of claim 97, wherein the method is applied to characterizing putative combined therapy regimes or putative harmful side effects resulting from interactions between the effects of two or more agents, comprising identifying any number of first modules or process modules in said virtual model which are shared by the pathways downstream from the two or more of said first modules or process modules representing said two or more agents or their interactions with their targets in said biological system.

100. The method of claim 97, wherein the method is applied to characterizing candidate target pools of entities or processes applicable in the selection of agents to influence at least one desired outcome in said biological system, comprising identifying any number of first modules or process modules in said virtual model which are upstream of the at least one first module or process module representing at least one pool of entities or process which is a biomarker for said desired outcome, wherein said upstream elements represent pools of entities or processes which are putative targets for manipulation in said biological system to achieve said desired outcome.

101. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 97.

102. The method of claim 96 applied to designing a strategy for controlling the expansion or further differentiation of progenitor or other precursor living cells in a cell culture or reactor to produce outcome cells of one or more types having characteristic phenotypes, further comprising:

a) including in the model of the applicable cellular system at least one first module or process module representing at least one pool of entities or process which is a biomarker to be induced or modified to embody a characteristic outcome cell phenotype;

b) defining or selecting functions to identify at least one candidate target for putative inducing agents represented by at least one first module or process module in said model which is upstream of the at least one first module or process module representing said phenotype biomarker; and

c) designing a strategy for controlling the expansion or further differentiation of said precursor cells in said cell culture or reactor based on at least one of said identified candidate targets for putative inducing agents which would induce the desired outcome cell phenotype.

103. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 102.

104. The method of claim 93 further comprising the step of defining or selecting stoichiometric coefficients for said process modules which relate said inputs and outputs.

105. The method of claim 93 wherein a plurality of said modules comprise quantitative or semi-quantitative variables or parameters, or any combination thereof, and the aggregating comprises applying functions over said variables or parameters to infer or simulate the behavioral model of the complex system.

106. The method of claim 105 applied to the analysis, dynamic simulation, steady state simulation, or optimization of the biological complex models, or any combination thereof.

107. The method of claim 105 wherein at least one of said modules is implemented as a lumped parameter system, black-box system, abstraction, encapsulation, aggregation, or any combination thereof.

108. The method of claim 105, wherein the variables of said process modules comprise rate variables representing the rate of conversion of said inputs into said outputs.

109. The method of claim 105 applied to modeling of a set of related biological complex systems or their states, further comprising creating one or more network topologies corresponding to models or submodels representing the biological system or its subsystems; wherein one of said network topologies of said set is a master network topology representing a base biological system or state; and the difference between any two of said biological systems or their states is described through changes in: i) the values of one or more parameters, ii) the initial values of one or more variables, iii) the topology derived from the references or links, iv) the exclusion or additional inclusion of one or more modules of the master network topology, or v) any combination thereof.

110. The method of claim 109, wherein said network topologies are abstract modules which can be reused as submodels to compose a plurality of more complex models or alternative implementation models.

111. The method of claim 105 further comprising applying said model to designing manipulation strategies or for controlling the operation of said complex system based on the behavior of the simulated model.

112. The method of claim 105 applied to monitoring or controlling the operation of said complex system, further comprising:

a) mapping one or more monitored or controlled variables of any component of said complex system to the corresponding ones of said variables which represent them in said model;

b) executing the model and comparing the values of said monitored or controlled variables to the corresponding simulated values of the variables which represent them in said model;

c) inferring or computing any corrective adjustments derived from said comparisons; and

d) applying any of said corrective adjustments to modify the corresponding simulated or controlled variables.

113. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 112.

114. The method of claim 105 applied to testing different model scenarios, further comprising assigning to at least one of said variables or parameters of said model a plurality of initial values representative of different model scenarios.

115. The method of claim 105 applied to studying the operation of said biological system or to designing manipulation strategies to control said operation for various applications, including those in the areas of disease prevention or treatment, therapeutics, diagnostics, drug production, livestock, food crops, food production or the environment, wherein the model of the biological system is at the subcellular, cellular, or multi-cellular level, or any combination thereof.

116. The method of claim 105 applied to agent development, including drugs or any other preventive, diagnostic, prognostic or therapeutic agents or procedures, or any combination thereof, further comprising identifying the pools of entities or processes which are putatively affected by applying at least one candidate agent or involved in attaining a desired outcome in said biological system, wherein the model of the applicable biological system, at the subcellular, cellular or multi-cellular level, or any combination thereof, includes at least one first or second element representing the at least one agent or its interaction with the biological system, or a biomarker for the desired outcome in said biological system.

117. The method of claim 105 applied to predicting in a computer system a behavior of a biochemical system, further comprising comparing two or more models of a biochemical system under different conditions, and identifying correlative changes of the values of one or more of said variables between said two models with said different conditions, wherein said correlative changes predict a behavior of said biochemical system.

118. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 105.

119. A computer readable medium or media comprising instructions for implementing in a computer system the method of claim 93.
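The upstream and downstream relationships recited in claims 94 and 98 to 100 can be pictured informally as reachability queries over a directed graph of pool nodes and process nodes. The following Python sketch is an illustration only, not the implementation of the invention: the class `PathwayNetwork`, its methods, and all node names are hypothetical, and the example pathway is invented for demonstration.

```python
from collections import defaultdict, deque


class PathwayNetwork:
    """Toy directed graph of pool nodes and process nodes (hypothetical names)."""

    def __init__(self):
        self.successors = defaultdict(set)    # node -> nodes directly downstream
        self.predecessors = defaultdict(set)  # node -> nodes directly upstream

    def add_process(self, process, inputs=(), regulators=(), outputs=()):
        # Inputs and regulators feed the process; the process feeds its outputs.
        for pool in (*inputs, *regulators):
            self.successors[pool].add(process)
            self.predecessors[process].add(pool)
        for pool in outputs:
            self.successors[process].add(pool)
            self.predecessors[pool].add(process)

    @staticmethod
    def _reach(start, edges):
        # Breadth-first traversal; the starting node itself is not reported.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen and nxt != start:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def downstream(self, node):
        return self._reach(node, self.successors)

    def upstream(self, node):
        return self._reach(node, self.predecessors)

    def shared_downstream(self, *nodes):
        # Nodes reachable from every one of the given starting nodes.
        return set.intersection(*(self.downstream(n) for n in nodes))


# Invented example: two agents converging on one biomarker.
net = PathwayNetwork()
net.add_process("binding", inputs=["drug_A", "receptor"], outputs=["complex"])
net.add_process("signaling", inputs=["complex"], outputs=["kinase_active"])
net.add_process("transcription", inputs=["kinase_active"],
                regulators=["drug_B"], outputs=["marker_mRNA"])

affected_by_a = net.downstream("drug_A")                 # cf. claim 98
shared = net.shared_downstream("drug_A", "drug_B")       # cf. claim 99
candidate_targets = net.upstream("marker_mRNA")          # cf. claim 100
```

In this sketch regulators are treated like inputs for the purpose of reachability, which is an assumption of the example; a fuller model would keep the relationship type on each edge.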


Description

COMPACT DISC APPENDIX

Tables 1-233 referenced in the specification are pseudo-code listings provided in the Compact Disc Appendix, which comprises one file with 213 pages.

TECHNICAL FIELD

The present invention in its broadest form relates to computer-based systems, methods and visual interfaces for providing an integrated development and deployment framework for visual modeling and dynamic simulation of Virtual Models of complex systems, which can be further integrated with monitoring and control devices to control the operation of the complex systems modeled and can be used for information retrieval. More particularly, the complex systems that are the focus of the exemplary embodiments of this invention are living organisms or subsystems or populations thereof, comprising any combination of biological and regulatory networks of genetic, biochemical and/or signal-transduction pathways, at the molecular, cellular, physiological or population levels.

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND ART

Knowledge-Based and Model-Based System for Monitoring and Control

Process industries, including the pharmaceutical, biotechnology, chemical, food, environmental and others, may save millions of dollars by using artificial intelligence for process optimization to control complex production facilities. Large-scale cultivations of microorganisms or mammalian cells are extreme cases in terms of complexity, when considering them as individual manufacturing plants involved in complex chemical synthesis. Current systems monitor very general types of phenomena, such as gas pressure, pH, and on some occasions, the concentration of some product that correlates with cell growth or production, but those parameters are usually poor indicators of how much of the desired product is produced. Other methods for designing monitoring and control systems for laboratory and industrial applications have been described, such as the one described in the patent application published as EP 0 367 544 A2 (Int. Dev. Res. Center) 9 May 1990, which uses a graphical interface to graphically model the set of instruments and controllers of such monitoring and control systems and a natural language to allow the integration of the knowledge of experts into the automated control facilities. Most monitoring systems are concerned with the overall processes that occur within the physical constraints of given reaction tanks, but do not model the many compartmentalized subsystems contained in each of those tanks with biological systems, to more finely tune the productivity of those subsystems. Complex mixtures of chemical reactions can be finely controlled externally by modifying the types and amounts of inputs added, if one could predict what will happen by adding those inputs, which requires a good knowledge and a model of such a system of reactions. This is particularly the case with biological cellular systems that have very sophisticated methods to transduce the signals provided by ligands in their external environment to the interior of the cell, resulting in the execution of specific functions. Such detailed and accessible mechanistic models of those pathways of reactions are not currently used for monitoring and control systems, but would be highly desirable.

Several knowledge-based systems for monitoring and control functions have been used to incorporate the knowledge of experts into the automated control of production facilities. A knowledge-based system interprets data using diverse forms of knowledge added to the system by a human domain expert including: a) shallow knowledge or heuristics, such as human experience and interpretations or rules of thumb; and b) deep knowledge about the system behavior and interactions. Systems based mainly on the first type of knowledge are in general referred to as knowledge-based expert systems, and the logic is represented in the form of production rules. In the more advanced real-time expert systems, inferencing techniques are usually data-driven using forward chaining, but can also employ backward chaining for goal-driven tasks and for gathering data. The inference engine searches for and executes relevant rules, which are separate from the inference engine and therefore, the representation is intrinsically declarative.
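The distinction between data-driven forward chaining and goal-driven backward chaining over production rules can be illustrated with a minimal Python sketch. It is not taken from any particular expert-system shell; the function names, the rule format (a tuple of conditions and one conclusion), and the example facts are all assumptions made for this illustration.

```python
def forward_chain(facts, rules):
    """Data-driven: fire rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, _pending=frozenset()):
    """Goal-driven: prove the goal by recursively proving rule conditions."""
    if goal in facts:
        return True
    if goal in _pending:
        return False  # guard against cyclic rule sets
    pending = _pending | {goal}
    for conditions, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(c, facts, rules, pending) for c in conditions
        ):
            return True
    return False


# Hypothetical production rules kept separate from the inference functions.
RULES = [
    (("ph_low", "growth_slow"), "culture_stressed"),
    (("culture_stressed",), "raise_alarm"),
]

derived = forward_chain({"ph_low", "growth_slow"}, RULES)
alarm_provable = backward_chain("raise_alarm", {"ph_low", "growth_slow"}, RULES)
```

Because the rules are plain data passed to the inference functions, the rule base can be changed without touching the engine, which is the declarative property noted above.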

Object-oriented expert systems allow a powerful knowledge representation of physical entities and conceptual entities. In those systems, data and behavior may be unified in the class hierarchy. Each class has a template that defines each of the attributes characteristic of that class and distinguishes it from other types of objects. Manipulation and retrieval of the values of the data structures may be performed through methods attached to an object's class. Model-based systems can be derived from empirical models based on regression of data or from first-principle relationships between the variables. When sufficient information to model a process, or part of it, is available, a more precise and compact system can be built.

There are a number of commercially available shells and toolkits that facilitate the development and deployment of domain-specific knowledge-based applications. Of those, real-time expert-system shells offer capabilities for reasoning on the behavior of data over time. Each of the real-time object-oriented shells from various vendors offers its own set of advantages, and each follows a different approach, such as compiled versus interpreted, and offers a different level of graphic sophistication. The specific shell currently selected for the implementation of this invention is Gensym Corporation's G2 Version 3.0 system, and in part Version 4.0, which is designed for complex and large on-line applications where a large number of variables can be monitored concurrently. It is able to reason about time, to execute both time-triggered and event-triggered actions and invocations, and to combine heuristic and procedural reasoning, dynamic simulation, user interface, database interface capabilities, and other facilities that allow the knowledge engineer to concentrate on the representation and incorporation of domain-specific knowledge to create domain-specific applications. G2 provides a built-in inference engine, a simulator, prebuilt libraries of functions and actions, developer and user interfaces, and the management of their seamless interrelations. A built-in inspect facility permits users to search for, locate, and edit various types of knowledge. Among the capabilities of G2's inference engine are: a) data structures are tagged with time-stamps and validity intervals that are considered in all inferences and calculations, taking care of truth maintenance; and b) tasks intrinsic to G2 are managed by the real-time scheduler. Task prioritization, asynchronous concurrent operations, and real-time task scheduling are therefore automatically provided by this shell.
G2 also provides a graphic user interface builder, which may be used to create graphic user interfaces that are language independent and allow information to be displayed using colors, pictures and animation. Dynamic meters, graphs, and charts can be defined for interactive follow-up of the simulation. It also has debugger, inspect and describe facilities. The knowledge-bases can be saved as separate modules in ASCII files. The graphic views can be shared with networked remote CPUs or terminals equipped with X Windows server software.
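The idea of values tagged with a time-stamp and a validity interval, mentioned above, can be illustrated with a minimal Python sketch. This is not G2 code and does not reproduce G2's API; the class `TimedValue`, the helper `valid_sum`, and the sample readings are hypothetical names and numbers chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimedValue:
    """A value with a time-stamp and a validity interval (hypothetical sketch)."""
    value: float
    timestamp: float   # time at which the value was obtained
    validity: float    # how long after the time-stamp the value stays valid

    def current(self, now: float) -> Optional[float]:
        """Return the value if it is still valid at time `now`, else None."""
        if self.timestamp <= now <= self.timestamp + self.validity:
            return self.value
        return None


def valid_sum(values, now):
    """Sum the values only if every one of them is still valid at `now`."""
    current = [v.current(now) for v in values]
    if any(c is None for c in current):
        return None   # an expired input invalidates the derived result
    return sum(current)


# Illustrative readings (made-up numbers).
ph = TimedValue(7.2, timestamp=0.0, validity=10.0)
temp = TimedValue(37.0, timestamp=5.0, validity=10.0)
print(valid_sum([ph, temp], now=8.0))    # both inputs valid
print(valid_sum([ph, temp], now=12.0))   # ph has expired, so no result
```

The design point is that a derived result is withheld as soon as any input has expired, which is one simple way to approximate the truth maintenance described above.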

Computer-Aided Physiological and Molecular Modeling and Artificial Intelligence in Molecular Biology

Most computer-aided physiological and molecular modeling approaches have resulted in computer models of physiological function that are numerical mathematical models that relate the physiological variables using empirically determined parameters. Those models, which can become quite complex, aim at modeling the overall system.

Both molecular biology and medicine have been fields of previous activity in the application of artificial intelligence (AI). In molecular biology, although there were some early systems such as Molgen and Dendral, the activity has intensified recently as a consequence of the explosion in new technologies and the derived data, mostly related to the Human Genome project and the handling of the large amounts of sequence data generated, relating to both DNA and proteins. There has also been an increased interest in computer methodologies for 3D structural models of molecular interactions. For the current state of the art, see the topics covered in symposia such as the recent Second International Conference on Intelligent Systems for Molecular Biology, 1994, Stanford University, CA (its Proceedings are here included by reference). Here, only two projects will be mentioned that have some common objectives with the system that is the object of this invention. Discussions of other previous approaches are also included in those references.

The Molgen group at Stanford University has studied scientific theory formation in the domain of molecular biology, as reported by Karp, P. D. and Friedland, P. (included here by reference). This project relates to the system of this invention in that both "are concerned with biochemical systems containing populations of interacting molecules . . . in which the form of knowledge available . . . varies widely in precision from quantitative to qualitative", as those authors write. The Ph.D. dissertation of P. D. Karp (included here by reference) "developed a qualitative biochemistry for representing theories of molecular biology", as he summarizes in an abstract in AI Magazine, Winter 1990, pp 9-10. He developed three representation models to deal with biochemical pathways, each having different capabilities and using different reasoning approaches. Model 1 uses IntelliCorp's KEE frames to describe biological objects and KEE rules to describe chemical reactions between the objects, which he recognizes to have serious limitations because it is not able to represent much of the knowledge available to biologists. The objective of Model 2 is to predict reaction rates in a given reaction network, incorporating a combination of quantitative and qualitative reasoning about state-variables and their interdependencies. The drawbacks are that this model is not able to incorporate a description of the biological objects that participate in the reactions, and it does not have temporal reasoning capabilities, representing just a static description of the state variables and their relationships.
The third model, called GENSIM and used for both prediction and hypothesis formation, is an extension of Model 1 and is composed of three knowledge-bases or taxonomical hierarchies of classes of a) biological objects that participate in a gene-regulation system, b) descriptions of the biological reactions that can occur between those objects, and c) experiments with instances of those classes of objects. The GENSIM program predicts experimental outcomes by determining which reactions occur between the objects in one experiment, which create new objects that cause new reactions. Characteristics of the GENSIM program that may be relevant, although different, for the system of this invention are: a) chemical objects are homogeneous populations of molecules, objects can be decomposed into their component parts, and identical objects synthesized during a simulation are merged; b) chemical processes represent reactions between those populations as probabilistic events with two subpopulations, one that participates in the reaction and one that does not. Those processes can create objects and manipulate their properties, but cannot reason about quantitative state variables such as quantities. In his words, "processes . . . specify actions that will be taken if certain conditions hold", and in that sense are like production-rules; c) restrictions are specified in the form of preconditions for chemical reactions to happen; and d) temporal reasoning is not available, resulting again in static representations and simulating only behavior in very short time intervals.

The system of this invention integrates a variety of forms of knowledge representation, some of them totally novel, while some of these forms may have been treated by other authors similarly in some aspects. However, upon integration into a totally new approach, that treatment becomes a part of a novel representation and innovative system. Regarding the semi-quantitative simulation component of this invention, L. E. Widman (1991) describes a semi-quantitative simulation of dynamic systems in a different domain, with the assumption that " . . . questions can be answered in terms of relative quantities rather than absolute quantities . . . model parameters that are not specified explicitly are given the implicit, default values of 'normal' (unity) . . . ". As will become clear from the detailed descriptions in the following sections, the innovative tools and methods used in the present implementation are quite different. For example, while he assumes that " . . . the default, or implicit, value of 'normal' maps onto unity for parameters and onto zero for variables . . . . ", the assumption in the prebuilt modular components in the current implementation differs in that the default value of 'normal' may map onto values other than unity and zero, with those values being defined based on expert knowledge.

DISCLOSURE OF INVENTION

This invention describes an integrated computer-based system, methods and visual interfaces for providing a development and deployment framework for visual modeling and dynamic simulation of Virtual Models of complex systems, which can be further integrated with monitoring and control devices to monitor and control the operation of the complex systems modeled, based on the real-time simulation of those Virtual Models. The system of this invention can be used by scientists to build the Virtual Models, which in turn are to be used by scientists or process engineers to design, monitor and control a variety of processing units. The Virtual Models can also be used for information retrieval, by using the set of visual interfaces provided to perform a variety of tasks. The available information and data about those complex models are stored in modular, modifiable, expandable and reusable knowledge structures, with several layers of encapsulation that allow the details to be hidden or displayed at the desired level of complexity. More particularly, the Virtual Models in the present invention describe visual models of biochemical complex systems, comprising sets of icons representing processes and their participants linked into multidimensional pathways, further organized in a hierarchy of icons representing discrete time and space compartments, wherein such compartments may contain other compartments, and wherein those modular icons encapsulate in different layers all the information, data, and mathematical models that characterize and define each Virtual Model.

Some process industries use large-scale cultivation of microorganisms or mammalian cells, which are extreme cases in terms of complexity when considering those cells as the individual manufacturing plants involved in complex chemical synthesis. Microorganisms are the preferable systems for producing natural substances that have a multitude of uses, such as drugs, foods, additives, biodetergents, biopolymers, and other new and raw materials. Mammalian cells are the preferable systems for producing potent active substances for therapeutic and diagnostic uses. The ultimate level of complexity is using a whole animal as the live factory for continuous production of important secreted proteins. However, the current systems only monitor very general types of phenomena, such as gas pressure, pH, and on some occasions, the concentration of some product that correlates with cell growth or production. For example, for controlling the production of a particular secreted protein that is produced in very low amounts in relation to other proteins, the total amount of protein is measured, which is a very poor indicator of how much of the desired protein is produced. Complex mixtures of chemical reactions could be finely controlled externally by modifying the types and amounts of inputs added, if one could predict what will happen by adding those inputs, which requires a good knowledge and a model of such a system of reactions. This is particularly the case with biological cellular systems that have very sophisticated methods to transduce the signals provided by ligands in their external environment to the interior of the cell, resulting in the execution of specific functions. Such detailed and accessible mechanistic models of those pathways of reactions are not currently used for monitoring and control systems, but would be highly desirable.
This invention provides the system and methods that allow scientists to visually build detailed mechanistic models of the complex systems involved, and to further develop and use inference methods to integrate the simulation of those Virtual Models with inputs from monitoring devices to allow for the intelligent control of the operation of the complex system.

The accuracy and validity of knowledge-based systems correlates not only with the quality of the knowledge available to the developers but also with their ability to understand, interpret and represent that knowledge. Because of the complex interrelationships driving the biochemical processes within and between cells, it is necessary to provide the many options for knowledge representation required by those systems. This invention presents a hybrid dynamic expert system that combines: a) the inheritance and encapsulation features characteristic of an object-oriented modeling approach, b) the procedural and rule-based inference capability of an expert system, and c) a model-based simulation capability. It is an objective of this invention to provide a framework that: a) allows domain-experts to directly enter their knowledge to create visual models, and to modify them as needed based on additional experimentation, without the need of knowledge engineers as intermediaries, and b) provides a knowledge-base having a well defined structure for representing knowledge about entities, populations of entities, processes, pathways and interacting networks of pathways, providing a visual interface and associated methods capable of dealing with incomplete and constantly evolving information and data. The data structures and domain-specific knowledge-base are independent of application-specific use, allowing the application-specific knowledge-bases to expand without affecting the basic operation of the system. Specific applications can be quickly built by using the prebuilt building blocks and the paradigm of "Clone, Link, Configure, and Initialize". The structure of the domain-specific knowledge-base serves also as the infrastructure provided to store additional information about new objects and models. 
The innovations of this invention include but are not limited to the specific design, generation, integration, and use of the libraries of building blocks, access panels, and their associated methods, that allow for representation, interpretation, modeling and simulation of different types of entities and their states, their relations and interactions, the pools of each of those entities in different compartments, the processes in which those pools of entities participate, and their discrete compartmentalization in time and space, as well as the concepts that make the simulation of these very complex systems possible. A large Virtual Model can be built as a set of modules, each focusing on a different subsystem. Each of the modules can be run on top of the repository module that contains the class definitions and associated methods, and they can be dependent or independent of each other, or they can be maintained in separate CPUs and seamlessly integrated by the Shell's supervisor.

Such libraries provide the prebuilt building blocks necessary for visually representing the vast breadth of knowledge required for building Virtual Models of complex systems in general, and of biological systems in particular. The building blocks are classified in palettes that can be selected through system menus. The descriptive information, data, and the mathematical models are all encapsulated within the modular components, in the form of attributes or in the form of component icons, with a plurality of methods associated with each of the icons. The visual models are built by interconnecting components of each reservoir icon to components of the one or several process icons in which their entities participate as inputs or outputs, or by interconnecting components of each process icon to components of the one or several reservoir icons that provide inputs to or receive outputs from that process icon, resulting in complex networks of multidimensional pathways of alternating layers of reservoir icons and process icons, which may be located in the same or in different compartments. The time compartment icons, the reservoir icons, and the process icons or their components comprise sets of quantitative variables and parameters, and a set of associated methods that permit real-time simulations of the models created with those modular components. The system of this invention combines a number of programming paradigms, integrating the visual interface to define and constrain objects represented by icons, with procedural and inferential control and quantitative and semi-quantitative simulation methods, providing a variety of alternatives to deal with complex dynamic systems.

Of importance in simulating the behavior of complex systems is the need to model the different quantities and states at which the entities can be found in particular locations at different points in time, and also to model the events that cause the transitions from one state to another, or the translocation from one location to another, or the progress to the next phase in the time sequence. Teachings of this invention comprise: the representation of those states, transitions, and locations; the methods to implement their graphic modeling; and the methods to dynamically simulate the dynamics of the pools of entities in each state, location, or phase. In the particular domain implemented to illustrate this invention, there are several major types of states and transitions to be considered, depending on whether the entity to be considered is a biological system, organ, cell, cellular compartment, molecule or any other of their components. Only a few of the examples considered in the currently preferred embodiment of this invention are provided here.

The architecture of the system of this invention allows various bioProcesses to simultaneously compete for the contents of a bioPool, while also allowing one bioPool to participate in various bioProcesses in different capacities, with different units within a bioPool behaving in different ways, as determined by the class of bioReactant to which they are distantly connected via the bioPosts. With this implementation, all bioEngines that input units into a given bioPool and all the bioEngines that get input units from that bioPool are integrated. At the same time, all bioPools that input units into a given bioEngine and all the bioPools that get input units from that bioEngine are also linked. The result is a very complex multidimensional network of composite bioObjects, which provides the matrix for further reasoning and simulation by the program. For example, the formulas that provide the values for the encapsulated sets of variables and parameters refer to linked bioObjects, either graphically or distantly, providing the capability to compute concurrently and dynamically, as a large number of processors arranged both in parallel and in series within that matrix.

The modeled system's behavior is defined by mathematical components, represented by a set of model differential and algebraic equations that provide the values of the system's variables and describe their behavior, together with the set of associated parameters that control the behavior of the variables and the system as a whole. The system's variables and parameters are embedded and distributed throughout the system of connected structures, encapsulated within the subcomponents that define the system's architecture. The model can then be viewed as a set of embedded block diagram representations of the underlying equations that can be used for dynamic numerical simulation and prediction of the effects of perturbations on the system, and to ask what-if type questions.
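As a hedged illustration of how such a set of differential equations can be simulated numerically, the sketch below integrates a made-up system in which two processes compete for the contents of one pool A, producing pools B and C. It assumes first-order rate laws (dA/dt = -(k1 + k2)A, dB/dt = k1 A, dC/dt = k2 A) and a fixed-step Euler scheme; the function `simulate`, the rate constants, and the step size are illustrative assumptions and are not taken from the implementation described here.

```python
def simulate(a0, k1, k2, dt, steps):
    """Fixed-step Euler integration of two processes competing for pool A.

    Assumed model (illustrative only):
        dA/dt = -(k1 + k2) * A
        dB/dt =  k1 * A
        dC/dt =  k2 * A
    Returns a list of (time, A, B, C) tuples.
    """
    a, b, c = a0, 0.0, 0.0
    history = [(0.0, a, b, c)]
    for i in range(1, steps + 1):
        r1 = k1 * a            # rate of process 1 (A -> B)
        r2 = k2 * a            # rate of process 2 (A -> C)
        a -= (r1 + r2) * dt    # both processes draw from the same pool
        b += r1 * dt
        c += r2 * dt
        history.append((i * dt, a, b, c))
    return history


# "What-if" style use: change a rate constant and compare the outcomes.
baseline = simulate(a0=1.0, k1=0.2, k2=0.1, dt=0.1, steps=100)
perturbed = simulate(a0=1.0, k1=0.2, k2=0.4, dt=0.1, steps=100)
print(baseline[-1])
print(perturbed[-1])
```

In this sketch the variables (pool quantities) and parameters (rate constants) live in one function; in the architecture described above they would instead be distributed across the linked components.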

The compartmentalized bioModels, and the methods attached to them, encode knowledge that enables the program to reason about the containment of different parts of the model in several compartments, while the architecture of the network of linked bioObjects of diverse types is transparently maintained, regardless of the transfer of the bioObject icons to different locations. They also comprise quantitative variables and parameters, some relating to quantities and rates of translocation, distributed either within the corresponding reservoirs and processes or relating to time intervals that may be associated with the time compartments, all of which have associated simulation formulas to compute their values. In addition, the modeler can define expert rules to monitor the values of any of those variables while the simulation is running, and to either set other values or control the course of the simulation in a variety of ways. The expert rules can also reason about time or about events resulting from the simulation, and inference using those types of knowledge may direct further actions to be executed by the system.

The visual interface further provides quick access to several automated methods for compiling, retrieving, and displaying the modular components of the visual models as well as the information and data they contain. The system integrates inferential control with quantitative and semi-quantitative simulation methods, and provides a variety of alternatives to deal with complex dynamic systems and with incomplete and constantly evolving information and data. A number of functions and visual interfaces can be selected from the menus associated with each of those icons or their components, to extract in various forms the information contained in the models built with those building blocks, such as:

    • a) methods to create and display interactive networks of pathways, by programmatically integrating the components of the Virtual Model into multidimensional networks of interacting pathways, including branching and merging of pathways, cross-talk between pathways that share elements, and feed-back and feed-forward loops;
    • b) methods to perform complex predefined queries that combine criteria related to the structural composition of the bioEntities involved, the position of bioPools downstream or upstream of the bioPool taken as reference, the role that those pools of entities play in processes, the location of those pools and processes in the discrete time and space compartments, or any combination of them;
    • c) methods to dynamically simulate their continuous interactions, modifications and translocations to other compartments, and the time-dependency of such interactions, optionally using the encapsulated absolute-valued or scaled-valued parameters and variables.
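The upstream and downstream queries listed above can be sketched as a graph traversal over alternating pools and processes. The following Python fragment is a minimal illustration under stated assumptions: the dictionary `PROCESSES`, the pool and process names, and the functions `downstream` and `upstream` are hypothetical and do not reproduce the query methods of the implementation.

```python
from collections import deque

# Hypothetical model: each process lists its input pools and output pools.
PROCESSES = {
    "P1": {"inputs": ["A"], "outputs": ["B"]},
    "P2": {"inputs": ["B"], "outputs": ["C", "D"]},
    "P3": {"inputs": ["D", "E"], "outputs": ["F"]},
}


def _reach(start, enter_key, exit_key, processes):
    """Breadth-first walk from pool `start` through processes.

    A process is entered when the current pool appears under `enter_key`,
    and the walk continues to the pools listed under `exit_key`.
    The `seen` sets keep the walk finite even if the network has loops.
    """
    seen_pools, seen_procs = {start}, set()
    queue = deque([start])
    while queue:
        pool = queue.popleft()
        for name, proc in processes.items():
            if name in seen_procs or pool not in proc[enter_key]:
                continue
            seen_procs.add(name)
            for nxt in proc[exit_key]:
                if nxt not in seen_pools:
                    seen_pools.add(nxt)
                    queue.append(nxt)
    return seen_procs, seen_pools - {start}


def downstream(pool, processes=PROCESSES):
    """Processes and pools reachable by following inputs to outputs."""
    return _reach(pool, "inputs", "outputs", processes)


def upstream(pool, processes=PROCESSES):
    """Processes and pools reachable by following outputs back to inputs."""
    return _reach(pool, "outputs", "inputs", processes)


print(downstream("B"))
print(upstream("F"))
```

A query restricted to a selected compartment could be approximated by filtering the returned sets on a compartment attribute, which this sketch leaves out.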


    The examples provided to document the current implementation of this invention focus on modeling biochemical regulatory processes that are relevant for intracellular or intercellular signaling. This is, however, only one of the many potential applications of the system of this invention, and it should not be construed to limit the applicability of this system to the numerous other applications that involve complex systems in any other domain, which can be developed by minor modifications or additions to the system using the methodology here presented.

    There are numerous other uses of the core system, methods and visual interface of this invention, in addition to the monitoring and control applications described here. Of particular interest is the modeling and simulation of disease-specific conditions and the testing of the effects (both desired effects and unwanted side effects) of potential therapeutic agents. This invention also allows the analysis of disturbances such as potential environmental or biological inducers of disease in both physiological and pathophysiological models. The methods included in this invention can in a similar way be used in a variety of applications, including but not limited to: simulation and prediction of experimental results; interactive drug design; study of drug side-effects and multiple drug interactions; simulation of interactive causes in the induction and progression of disease, including both biological and environmental factors; diagnostics and clinical decision support; therapy planning; theoretical research; and numerous other applications.

    The foregoing and additional objectives, descriptions, features, operations and advantages of the present invention will be understood from the following detailed description of the preferred embodiments in combination with the accompanying figures.

    BRIEF DESCRIPTION OF DRAWINGS

    FIG. 1 is a high-level illustration of the various components integrated in the system of this invention to use the Virtual Models for control functions.

    FIG. 2 is a schematic representation of the organization in the system of this invention of the components of the Virtual Models in the domain of cell biology.

    FIG. 3 is a schematic representation of the organization of domain-specific processes in discrete space compartments and time compartments.

    FIG. 4 is a schematic representation of the handling of the dynamics of the progression of populations of cells through different states by means of the sets of pools of cells and processes characteristic of this invention.

    FIG. 5 is a schematic representation of the multiple layers of linked pools of entities and processes that result in the multidimensional pathways characteristic of this invention.

    FIG. 6 is a more detailed representation of how the various iconic components encapsulated in the icons representing the pools of entities and processes are linked in the current implementation of this invention.

    FIG. 7 is a detailed representation focusing on the iconic components of a process and their relations to its sources of inputs and the targets for its outputs.

    FIG. 8 is a detailed representation focusing on the iconic components of a pool of entities and its inputs and outputs.

    FIG. 9 shows domain menus that allow different types of access to different types of capabilities, for the purpose of modeling or using the virtual models, here focusing on the developer mode that allows developing additional tools.

    FIG. 10 is the table of attributes of a complex variable structure that allows the integration of measured values and the simulated values that facilitates the implementation of this invention.

    FIG. 11 is a library of predefined classes of variables, with generic simulation formulas associated with each class, available to the modeler of virtual models for a specific domain.

    FIG. 12 describes the sets of attributes of the components of a reservoir, focusing on the variables and parameters of its model-block component.

    FIG. 13 describes the sets of attributes of the components of a pool of entities, focusing on its variables and parameters.

    FIG. 14 describes the sets of attributes of the components of a process, focusing on the variables and parameters of its reactants.

    FIG. 15 describes the sets of attributes of the components of a process, focusing on the variables and parameters of its engine and products.

    FIG. 16 shows domain menus of the modeler mode that allow modelers to access the different palettes of prebuilt building blocks.

    FIG. 17 shows examples of different palettes of prebuilt molecular components.

    FIG. 18 shows an example of a complex molecular structure built with the prebuilt molecular components.

    FIG. 19 shows a palette with examples of prebuilt model-blocks.

    FIG. 20 shows a palette with examples of prebuilt composite processes.

    FIG. 21 shows a palette with examples of prebuilt reactants.

    FIG. 22 describes the tools for interactively establishing links between components that result in multidimensional pathways.

    FIG. 23 shows a palette with examples of prebuilt composite domain-specific compartments.

    FIG. 24a and FIG. 24b describe a domain-specific compartmentalization of the components of a Virtual Model, in this case focusing on the sequential phases of a cell's cycle and on its subcellular compartments.

    FIG. 25a and FIG. 25b describe a domain-specific implementation of a compartmentalized model of two cells interacting with each other and with the external environment.

    FIG. 26 describes a domain-specific characterization of the compartmentalized model from an external point of view.

    FIG. 27 describes a domain-specific implementation of an evolutionary tree representing the alternative successive states that a compartmentalized model may follow depending on the events that depend on their internal processes responding to the environment, from an external point of view.

    FIG. 28 is a schematic representation of the combination of the inference and simulation capabilities used in this invention to simulate Virtual Models of complex systems.

    FIG. 29 is a detailed representation of how in this invention pools of cells interact with pools of molecules as reactants of processes, whose products are either molecules or cells in a different state.

    FIG. 30 describes an example of the predefined Query Panels and their use.

    FIG. 31a and FIG. 31b describe how a user can request from any process within the Virtual Model the dynamic generation of the pathways of all the processes that are either upstream or downstream from that process.

    FIG. 32 describes an example of the predefined Navigation Panels that the user can request from any reservoir within the Virtual Model for the dynamic generation of constrained pathways of all the processes that are either upstream or downstream from that reservoir, but which are contained within a compartment selected by the user.

    FIG. 33 describes an example of the predefined Simulation Panels that the user can request from any reservoir within the Virtual Model for the dynamic generation of constrained pathways and for the control of the dynamic simulation of the kinetics of those pathways.

    FIG. 34 describes how such simulation can be followed by plotting the time-series of any of the quantitative values of any reservoir and process in the subsystem being simulated.

    FIGS. 35a and b describe an example of the predefined Experiment Panels that the user can select from the domain menus to request the dynamic generation of constrained pathways from multiple initial points and for the control of the dynamic simulation of the kinetics of those pathways.

    FIGS. 36a and b describe an example of such set of pathways from multiple initial points.

    BEST MODE FOR CARRYING OUT THE INVENTION

    Notes: The body of the present application has sections that may contain some discussion of prior art teachings, intermingled with innovative and specific discussion of the best mode to use that prior art in this invention as presently contemplated. To describe the preferred embodiments, it is necessary to include in the discussion the capabilities offered by the shell used as development and deployment framework for this invention (hereafter referred to as "the Shell"). The applicant specifically notes that statements made in any of those sections do not necessarily delimit the various inventions claimed in the present application, but rather are included to explain how the workings of an existing set of tools are used to illustrate the preferred embodiments of the new tools and applications claimed in the claims section. The currently preferred embodiment of this invention, as described in the present application, is based on the definitions of a particular Shell: Gensym Corp.'s G2 Expert System. There are several other attributes that relate to the Shell's built-in performance and formatting capabilities, which are not shown in those examples. Some information included within the body of this application was extracted from various sources describing the characteristics of G2, including user manuals (included here by reference), and some of this material is subject to copyright protection.

    Monitoring and Controlling the Operation of a Reactor using a Simulated Virtual Model

    The system object of this invention is a hybrid combination of model-based methods describing the explicit mechanistic reference behavior of the production system with the input from process sensors that monitor the state of the production system, triggering event-based control flow of operations characteristic of each system. The system implements rules that wait for events to happen. Such events may be complex combinations of individual events, such as selected measured values reaching certain predefined or simulated values, or falling within certain predefined ranges, including the implementation of fuzzy-logic within those rules. Such rules may fire other rules or invoke inference procedures or cause certain control actions to take place. This system can detect the current status of the production system, and when the selected monitored variables reach certain values specified by the simulated values of those variables, then certain control actions take place. This system incorporates monitoring capabilities with the knowledge-based model to provide expert control of the operation of the production system or biofermentor. This control system provides simultaneous supervision of any number of operating variables (such as intermediary or end-products, which in the case of cellular systems can be intracellular or secreted), and compares them with the simulated values of those variables resulting from the encapsulated mathematical models. Depending on the dynamically monitored behavior of the production system, the control system is able to compensate by feeding components at different rates, feeding additional components, or stopping the feeding of some of the previous components.

    Because of the user-friendly and intuitive interface, the bench scientists can participate in the design of the production system by editing the knowledge-base and the visual models themselves, incorporating information and data obtained in their own experimental research or from other published resources. It allows the direct incorporation of research into the scale-up operations. The scientists can rapidly build models by selecting any desired building block from the many palettes provided, cloning it and dropping it on the desired compartment, linking it to other building blocks in the model, configuring it by entering values of desired attributes, and initializing it to establish relationships with other building blocks in the model. The information and data are entered into the system as attributes of objects, either as user inputs, or directly from measuring devices using an interface between said measuring devices and the computer system.
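The clone, link, configure, and initialize sequence described above can be pictured with the following minimal Python sketch. It is only an analogy for the visual workflow: the class `Block`, the `PALETTE` dictionary, the attribute names, and the behavior of `initialize` are assumptions made for illustration and are not part of the Shell or of the implementation.

```python
import copy


class Block:
    """A hypothetical prebuilt building block kept on a palette."""

    def __init__(self, kind):
        self.kind = kind
        self.compartment = None
        self.attributes = {}
        self.links = []      # blocks this block feeds into
        self.targets = []    # names resolved at initialization


# Palette of master copies; modelers never edit these directly.
PALETTE = {"reservoir": Block("reservoir"), "process": Block("process")}


def clone(kind, compartment):
    """Copy a master block and drop the copy on a compartment."""
    block = copy.deepcopy(PALETTE[kind])
    block.compartment = compartment
    return block


def link(source, target):
    """Connect the output side of `source` to `target`."""
    source.links.append(target)


def configure(block, **attributes):
    """Enter attribute values such as a name or an initial quantity."""
    block.attributes.update(attributes)


def initialize(block):
    """Resolve relationships with linked blocks (here: record their names)."""
    block.targets = [t.attributes.get("name") for t in block.links]


pool = clone("reservoir", "cytoplasm")
proc = clone("process", "cytoplasm")
link(pool, proc)
configure(pool, name="kinase-pool", quantity=100.0)
configure(proc, name="phosphorylation")
initialize(pool)
print(pool.targets)
```

Deep-copying the master block keeps the palette unchanged, so the same prebuilt component can be reused for any number of instances.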

    The system provides a graphical computation and control language, where the objects communicate through the links established between them. Some of those linkages are built into the composite prebuilt building blocks, while the modeler establishes other links between appropriate components from different building blocks. Other information, such as the name, description, references, or the values of parameters specific to each component, is entered by the modeler. Variables in this system are themselves objects, and may be of two major classes: a) one-valued variables have only one value, which may be either provided during a simulation by a simulation formula or procedure, or inferred by any other means, such as rules or general formulas defined by the modeler; and b) two-valued variables have two values: the simulated-value, as before, and the measured-value, which is provided in real time through an external sensor mapped to said variable. The one-valued variables are used by default because of their smaller footprint, and their values are provided by default by generic simulation formulas that are specific to each class of variable. However, the modeler can replace them with the equivalent subclasses of two-valued variables for each instance of a component where the variable is mapped to an external sensor and both the measured-value and the simulated-value of such variable are desired. The modeler can also write specific formulas for any desired instance of a variable, which then override the default generic formulas. 
It is possible, through any of the inference mechanisms to compare the measured value and the simulated-value, either the current values or the values mapped at some time point in the past, since the system is able to keep a time-stamped history of both types of values, and take specified actions when the inference criteria are met, such as: causing a valve for a component feed to be more or less open or closed, or activating or deactivating whole branches of the model pathways being simulated. Such actions can alternatively be invoked as a result of comparing either the measured-value or the simulated-value to any predefined constant value or range of values, or the value of any other variable or parameter in the system. Or the action or set of actions could result from inferences involving any number of comparisons between measured-values or simulated-values for any number of variables, or any number of parameters, or any predefined constant values.
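    The distinction between one-valued and two-valued variables, and the comparison of time-stamped measured and simulated values, can be illustrated with a minimal sketch. The class names, the method names, the tolerance parameter, and the "open"/"close"/"hold" action labels are illustrative assumptions and do not appear in the implementation described here; histories are assumed to be appended in time order.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


def value_at(history, t):
    """Return the most recent value recorded at or before time t, or None."""
    times = [stamp for stamp, _ in history]
    i = bisect_right(times, t)
    return history[i - 1][1] if i else None


@dataclass
class OneValuedVariable:
    """Hypothetical variable holding only a simulated value and its history."""
    name: str
    simulated_history: list = field(default_factory=list)  # (time, value) pairs

    def set_simulated(self, t, value):
        self.simulated_history.append((t, value))

    def simulated_at(self, t):
        return value_at(self.simulated_history, t)


@dataclass
class TwoValuedVariable(OneValuedVariable):
    """Adds a measured value, as would be supplied by a mapped sensor."""
    measured_history: list = field(default_factory=list)

    def set_measured(self, t, value):
        self.measured_history.append((t, value))

    def measured_at(self, t):
        return value_at(self.measured_history, t)


def feed_valve_action(var, t, tolerance):
    """Compare measured and simulated values at time t and pick an action."""
    measured, simulated = var.measured_at(t), var.simulated_at(t)
    if measured is None or simulated is None:
        return "hold"   # nothing to compare yet
    if measured < simulated - tolerance:
        return "open"   # measured amount lags the model: feed more
    if measured > simulated + tolerance:
        return "close"  # measured amount exceeds the model: feed less
    return "hold"


glucose = TwoValuedVariable("glucose")
glucose.set_simulated(0, 5.0)
glucose.set_simulated(10, 4.0)
glucose.set_measured(0, 5.1)
glucose.set_measured(10, 3.0)
print(feed_valve_action(glucose, 10, tolerance=0.5))
```

    Because both histories are time-stamped, the same comparison can be made against values at a past time point simply by passing an earlier time.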

    As shown in FIG. 1, a system of this invention is composed of:
    • a) a reactor (102), which may comprise any combination of reaction tanks, fermentors, bioreactors, or other processing units;
    • b) one or more data acquisition devices (108), which may comprise any combination of hardware and software devices, such as sets of sensors (106) that measure the amounts of selected chemicals in the reactor, signal transducers, filters, amplifiers, data acquisition boards, appropriate device drivers, or any other required devices;
    • c) one or more computer systems (112) comprising CPU, memory, storage device, display, user input device, linked (110) to the data acquisition devices (108);
    • d) one or more computer programs (114) and one or more computer Virtual Models (116) of pathways of chemical interactions and other processes in the reactor (102), wherein the computer programs (114) and the Virtual Models (116) are used to quantitatively or semi-quantitatively simulate in real-time the Virtual Models (116), wherein the Virtual Models encapsulate one or more variables (118) that represent the quantities of certain monitored entities in the reactor (102) and one or more variables (128) that represent the quantities of certain entities or certain events, which, when reaching certain values during the simulations of the Virtual Models (116) by the computer programs (114), trigger certain control actions that affect the operation of the reactor (102), and wherein both the computer programs (114) and the Virtual Models (116) are loaded in the memory of one or more of the computer systems (112);
    • e) one or more monitoring interfaces (120) loaded in the memory of one or more of the computer systems (112), which act as bridges (122) or software interfaces between the data acquisition devices (108) and the computer programs (114) and allow passing of values (124, 126), such as the values of the amounts of certain entities in the reactor (102), as measured (104) by the corresponding sensors (106), to the corresponding variables (118) embedded in the Virtual Models (116) that represent said amounts;
    • f) one or more controller devices (138), which regulate the operation of the reactor (102), such as controlling the flow of certain inputs (142) to the reactor (102), wherein the controller devices (138) are linked (136) to the computer systems (112);
    • g) one or more controller interfaces (130) loaded in the memory of one or more of the computer systems (112), which act as bridges or software interfaces between the computer programs (114) and the controller devices (138) and allow passing of control signals generated by the computer programs (114) as a result of the values of any combination of any number of variables (128) embedded in the Virtual Models (116) reaching certain values.


    Depending on the application requirements, the interfaces may provide bridges to Supervisory Control and Data Acquisition (SCADA) systems, Distributed Control Systems (DCS) or Programmable Logic Controllers (PLCs), with the adequate protocol drivers, as well as to relational databases, object-oriented databases, ASCII files, and a number of other connectivity applications, allowing the program to send or receive values over said interface.

    The knowledge-based Virtual Models include model-based reasoning that models the dynamic behavior of processes in the reactor. This mechanistic approach may involve any number of variables to be monitored, including those measured and monitored in the reactor and those simulated and monitored in the Virtual Models. The Virtual Models provide a visual qualitative and quantitative description of the processes that happen inside the reactor, as well as a description of the participants in those processes. The system of this invention separates the representation of the physical systems as Virtual Models from the monitoring and control aspects, allowing the same Virtual Models to be integrated with different combinations of monitoring and control designs to solve different production needs.

    FIG. 1 also shows the directions of flow of data and control in the system of this invention. The amounts of certain entities, which are specific to each particular process design, are captured (104) by the corresponding set of biosensors (106) and, through the data acquisition devices (108), those values are passed (124) to the monitoring interface variables, which in turn pass those values (126) to the corresponding variables (118) in the Virtual Models (116) that represent those values. Those variables (118) are integrated with many other variables and parameters embedded in the Virtual Models (116) during the real-time simulation of the Virtual Models (116), including other variables (128) embedded in the Virtual Models (116) that represent quantities, rates, or other events, and which are monitored by the programs (114) during the simulation. Whenever, during the simulation, the values of any combination of any number of said monitored simulated variables (128), which are specific to each particular process design, reach certain values, the programs (114) pass control signals (134) through the appropriate control auxiliary structures (132) in the controller interfaces (130). Those signals are forwarded (140) to the controller devices (138), which in turn control the flow of inputs (142) and regulate (144) the operation of the bioreactor (102), which is being monitored (104) in a continuous manner.
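    The flow of data and control described for FIG. 1 can be outlined as one monitoring and control cycle. This is a minimal sketch under stated assumptions: the VirtualModel class, its single rate equation, the limit value, and the "reduce_feed" signal are hypothetical stand-ins chosen for illustration, and the reference numerals in the comments only indicate which element of FIG. 1 each line is meant to mimic.

```python
class VirtualModel:
    """Hypothetical stand-in for a Virtual Model (116)."""

    def __init__(self, conversion_rate):
        self.substrate = 0.0          # variable (118), fed from a sensor
        self.predicted_product = 0.0  # monitored simulated variable (128)
        self.conversion_rate = conversion_rate

    def step(self, dt):
        # Assumed toy kinetics: product accumulates in proportion to substrate.
        self.predicted_product += self.conversion_rate * self.substrate * dt


def control_cycle(model, read_sensor, send_control, limit, dt):
    """Run one cycle: sensor -> model variable -> simulation -> control signal."""
    model.substrate = read_sensor()       # (104) -> (124) -> (126) -> (118)
    model.step(dt)                        # simulation by the programs (114)
    if model.predicted_product >= limit:  # monitored variable (128) reaches a value
        send_control("reduce_feed")       # (134) -> (132) -> (140) -> (138)
        return True
    return False


signals = []
model = VirtualModel(conversion_rate=0.5)
fired = [
    control_cycle(model, lambda: 2.0, signals.append, limit=1.5, dt=1.0)
    for _ in range(2)
]
print(fired, signals)
```

    Passing the sensor reader and the controller as plain callables reflects the separation, noted above, between the Virtual Models and the monitoring and control designs attached to them.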

    Depending on the application requirements, the interfaces may provide bridges to Supervisory Control and Data Acquisition (SCADA) systems (such as HP's RTAP SCADA or others), Distributed Control Systems (DCS) (such as the Honeywell TDC3000 DCS or others) or Programmable Logic Controllers (PLCs) (such as Allen-Bradley's PLC3/PLC5 families, or others), with the adequate protocol drivers, as well as to relational databases, object-oriented databases, ASCII files, and a number of other connectivity applications, allowing the program to send or receive values over said interface. The program (114) and the interfaces (120, 130) are separate processes, which may be located on one or more host computers. In the latter case, a communications link, such as a link and port based on the TCP/IP or DECnet protocols, is required. The program can also use various interfaces simultaneously, linked to different types of devices. The interface serves as a bridge between selected variables embedded in the objects of the Virtual Models and their corresponding mappings in one or more external systems. Functions defined in the interface can be invoked by remote procedure calls (RPCs) defined in the program, and vice versa.

    The ideal system for the intelligent control of biotechnological processes must possess a large set of features reflecting the real-time aspects of the control system, as well as the specific characteristics of the biochemical processes involved. A variety of new sensor technologies may be used to provide the monitoring capabilities external to the Virtual Models. Furthermore, the system of this invention allows a variety of real-time control options to be integrated with the ability to use mechanistic models to drive automatic decision-making support for process control.

    The knowledge-based Virtual Models include model-based reasoning that models the dynamic behavior of processes in the reactor. This mechanistic approach may involve any number of variables to be monitored, including those measured and monitored in the reactor and those simulated and monitored in the Virtual Models. Both are used as inputs for the inference engine, to be compared against each other or against other specified values, and when the specified conditions are met they trigger actions that result in control operations on the reactor. The Virtual Models provide a visual qualitative and quantitative description of the processes that happen inside the reactor, as well as a description of the participants in those processes. In some cases, the reactor may contain chemicals in solution or bound to other structures, such as carriers or membranes. In other cases, the reactor may contain less defined mixtures resulting from cell extracts. In yet other cases, the reactor may contain cells of one or more types, interacting with each other and with their environment, the culture medium, where each such cell is itself a reactor that interacts with other cell reactors. The system of this invention provides capabilities to produce more or less detailed Virtual Models of any of those types of reactors, or Virtual Models of subsystems within such reactors.

    The program provides a domain-specific framework for continuous and discrete process modeling and simulation. It uses a hierarchical object-oriented representation to provide facilities for hierarchical systems modeling that handles multi-dimensional information, model-based and knowledge-based reasoning, temporal inferencing, and dynamic real-time monitoring and control. Furthermore, the system of this invention separates the representation of the physical systems as Virtual Models from the monitoring and control aspects, allowing the same Virtual Models to be integrated with different combinations of monitoring and control designs to solve different production needs.

    The present invention also refers to a domain-specific, application-independent, knowledge-based and model-based, object-oriented, iconic, real-time computer system capable of being used by scientists as a framework to construct specific and interactive information, modeling and simulation applications in the chemical and biochemical domains. It provides a variety of domain-specific, application-independent tools, graphical interfaces and associated methods that give users the capability to extract, interactively or automatically, the integrated knowledge contained in those applications or models built by the modeler, and to further use those models, among other uses, to navigate through the pathways of processes and explore those processes and their participants, or for quantitative real-time dynamic simulations. The computer system comprises a plurality of methods, hereinafter called "the methods", and diverse sets of objects, some of them representing either entities or concepts, hereinafter called bioObjects, and others representing auxiliary structures, which in general are referred to as tools. Objects are arranged in object-class hierarchies and workspace hierarchies. Methods may be associated with a class of objects, an individual instance of a class, or a specified group of instances within a class. Libraries of prebuilt knowledge-based generic bioObjects are provided as the building blocks that can be combined in prescribed ways to create diverse and new knowledge structures and models.

    In the description of this invention, the following clarifications should be noted:
    • a) a person using the underlying Shell to define the domain-specific application-independent but knowledge-based classes of objects, their associated methods, and the prebuilt knowledge-based building blocks is hereinafter referred to as a "developer", while the person using those building blocks to build application-dependent models, or expanding the libraries of prebuilt bioObjects, is hereinafter referred to as "modeler", and the person that extracts and uses the accumulated knowledge and runs simulations, is hereinafter referred to as a "user";
    • b) text expressions are case insensitive, and what is referred to in the description as bioObject or bioProcess may be referred to in the code as bio-object or bioprocess, or the names may appear in the code in lower case or all capitalized;
    • c) a window refers to a display of the program on a computer screen, while a workspace refers to one of the many containers of objects that may be displayed within a window, and the subworkspace of an object is the workspace (only one) that may be associated with that object and may encapsulate the components of that object (composite object);
    • d) the selection of a menu option is equivalent to clicking on that option, and both expressions are used in the following description of this invention;
    • e) expressions similar to "selection of option A displays B" are a short form meaning something similar to "selection of option A causes the invocation of the action or procedure specified in the definition of option A and, as a result of the execution of such action or procedure, B is displayed in the window from which option A was selected."


    The preferred embodiment of this invention integrates a variety of knowledge representation techniques, as required for the creation of virtual models of complex systems. The accuracy and validity of knowledge-based systems correlate not only with the quality of the knowledge available to the modelers but also with their ability to understand, interpret and represent that knowledge. The object-oriented approach provides a powerful knowledge representation of physical entities (such as organs, cells, DNA, enzymes, receptors, ligands, mediators, or ions) and conceptual entities (such as processes and cellular interactions, quantities and rates). In this system, data and behavior are unified in the objects. Two characteristics of object-oriented environments, encapsulation and inheritance, are very important for the design and implementation of the system of this invention. Objects are defined following class hierarchies in which the definition of each class specifies the types of attributes characteristic of all subclasses and/or instances of that class. Encapsulation hides the details behind each object, and it is implemented in two different forms: a) at the attribute level, which is the standard form of encapsulation in object-oriented approaches; and b) at the workspace level, a less common form of encapsulation related more to the iconic approach. Multiple levels of workspace encapsulation are supported (FIGS. 18 and 24), allowing modules with a multilayered structure with increasing levels of detail. Since the subworkspace is not inherited through the class hierarchy, neither are the components. However, once a generic instance of a class is completed with components in its subworkspace, it can be added to the corresponding palette (FIGS. 18, 20, 23) as a prebuilt building block, and the resulting composite object can be cloned, in which case all encapsulated levels of subworkspaces are also cloned. 
The interpreted domain-specific framework approach is easy to use and allows rapid building of iconic models. The domain-expert knowledge is represented in an easy-to-understand declarative form, separated from the methods that specify how that knowledge is used by the program. With the iconic approach, the user configuration for building models consists primarily of connecting iconic objects and filling out their tables of attributes. The visual framework enables reasoning about the interactions between objects and their compartmentalization. This graphical programming capability is particularly important because it allows the continuous expansion and modification of the system, necessary for this type of application, by scientist modelers with no programming skills. Its modularity allows parts of the system to be deleted, modified or created without affecting the operation of the rest of the system. The knowledge can also be extended and reused in different contexts and in different new applications. The same hierarchic architecture used to develop the bioObjects can be extended by the modeler to expand the libraries of bioObjects, by creating new objects or by cloning, configuring and modifying existing ones. With its iconic format, the system can be easily browsed and understood by the user.
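    Workspace-level encapsulation and the cloning of a composite object together with all of its nested subworkspaces can be pictured with a short sketch. The IconicObject class, its fields, and the example names are illustrative assumptions, and a deep copy is used here only as one plausible way to model the cloning behavior described above.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class IconicObject:
    """Hypothetical composite object with attribute and workspace encapsulation."""
    name: str
    attributes: dict = field(default_factory=dict)    # attribute-level encapsulation
    subworkspace: list = field(default_factory=list)  # encapsulated component objects

    def clone(self, new_name):
        # Deep copy so that every encapsulated subworkspace level is cloned too.
        twin = copy.deepcopy(self)
        twin.name = new_name
        return twin


motif = IconicObject("motif", {"kind": "binding-site"})
domain = IconicObject("domain", subworkspace=[motif])
prebuilt = IconicObject("generic-enzyme", {"state": "inactive"}, [domain])

# Clone the prebuilt building block and configure the clone independently.
enzyme = prebuilt.clone("kinase-1")
enzyme.subworkspace[0].subworkspace[0].attributes["kind"] = "ATP-site"
print(prebuilt.subworkspace[0].subworkspace[0].attributes["kind"])
```

    The palette instance is left unchanged when the clone is configured, which is what allows the same prebuilt building block to be reused many times.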

    The principle followed in the design and implementation of this invention is the breaking down of the knowledge about entities, processes and pathways into smaller functional units, to a level where the following requirements can be met: a) allowing their repeated use as building blocks in a variety of situations; b) allowing access to the structures and processes that are susceptible to control and regulation; and c) keeping the number of units manageable. The functional units, hereinafter called composite bioObjects, are further broken down into operational and standardized knowledge and data structures. These generic but domain-specific iconic objects add flexibility and allow the user to open tables of individual instances to input data, modify existing bioObjects, or create new types of bioObjects. Following this principle, the system of this invention comprises a variety of domain-specific knowledge structures, all encoded as objects, with further components, such as: a) other iconic bioObjects on their subworkspaces; b) standard graphic user interface objects, such as buttons to request data displays or input panels; c) non-iconic objects, which might be pointed to as attributes of other objects, such as domain-specific variables, parameters, lists or arrays; or d) methods referring to the bioObjects, their attributes or their components, such as actions, formulas (Tables 83-88), relations (Tables 75 and 218a), rules (Tables 76-81) and procedures (various Tables), which can be invoked interactively by the user from either the bioObjects or their components or from the standard graphic user interface objects or domain-menus, or invoked programmatically by other rules or procedures, and which are hereinafter referred to as the associated methods. 
It is an object of this invention the manner in which the combination of those types of knowledge structures encode, in a distributed form, the data and the knowledge that enables the system to compute and to reason about conditions and events defined by the developer, the modeler, any other external source, and/or generated during a dynamic simulation. Based on those conditions, the knowledge structures are used by the Shell to infer or simulate different behaviors, while data, information and/or control are propagated and actions are executed. As can be observed by comparison with the prior art previously discussed with the description in the sections to follow, the form of knowledge representation in the iconic models, libraries of composite bioObjects, graphical interface and associated methods, and the overall organization of the system object of this invention offer a very different and innovative approach that not only integrates the qualitative and quantitative description of chemical and biochemical objects with a set of state and dependent variables, such as amounts and rates, but also incorporates temporal reasoning and generates dynamic simulations.

    Therefore, one innovation of the present invention is the iconic interface and associated methods used to partition the domain knowledge into modular and interactive knowledge structures represented as iconic objects, the connections, interactions and relations between those structures, and their behavior, incorporating declarative and procedural information. Each independent functional iconic building block, a composite bioObject, comprises a set of iconic bioObjects and other objects and represents either:
    • a) the descriptive characterization of single units of molecular entities, molecular subcomponents, or molecular complexes, hereinafter called in general bioEntities (FIG. 18);
    • b) the description of a number of units with the same characteristics, defined as a population of such molecular entities or molecular complexes in specific states, hereinafter called a bioPool, which together with the potential inputs to and outputs from such bioPool is encapsulated in what is hereinafter called a bioReservoir (FIG. 8);
    • c) the interactions between fractions of different such populations of molecular entities or molecular complexes, or their transitions from one molecular state or discrete compartment to another over time, hereinafter called bioProcesses (FIG. 7); or
    • d) the organization and integration of networks of pathways composed of sets of linked bioReservoirs and bioProcesses (FIGS. 7, 8) organized in compartments, hereinafter called bioModels (FIG. 24). Low-level bioModels can be further organized in discrete location and time compartments, representing the subcellular organelles and the phases of the cell cycle respectively, which are themselves components of higher-level compartments representing single cells. All those types of compartments are themselves bioModel subclasses at different levels of complexity and detail in a hierarchy of subworkspaces.


    BioObject, a subclass of the class objects provided by the Shell, is the superclass at the top of various hierarchies of classes of objects such as the class hierarchies of bioEntity, bioReservoir, bioProcess, bioModel, and others. Each instance of a class is characterized by the set of attributes defined for the class, which can be inspected in the table of attributes attached to each object by selecting it with the mouse. Some attributes define configuration information, while other attributes describe the composition and characteristics of the object, and still others hold dynamic state information, such as the current value(s), data histories, and status. The value of some attributes of an object, such as the variables and parameters, can be dynamically modified at runtime.

    The knowledge-based modeling and simulation system of this invention interprets the information and data contained in the BioObjects' data structures, based on additional knowledge contained in the form of rules, formulas or procedures. Model-based knowledge is integrated and encapsulated in the structural, functional and behavioral models represented by the virtual iconic models or bioModels. Those virtual models are defined qualitatively (FIG. 24) by the locations, connections, and relations of their components, and quantitatively by their encapsulated mathematical models (FIGS. 12-15). The constants, parameters and variables that define those models are distributed through the different types of bioObjects, defined as their attributes, and the corresponding formulas and functions relate those data structures to others and characterize the particular system. The system integrates propagation of values, inference and control throughout the pathways. Model-generated results are used as input information for the inference engine, and dynamic models can also reason about the historic values of its variables as well as projecting values of variables into the future. All those different types of knowledge are added by human domain experts to incrementally build an integrated Virtual Model.

    In the current implementation of this invention, the variables embedded in the model that are set to get values from the sensors (such as the concentrations of the corresponding bioPools) inherit their properties from two parent classes: a class of float-variable and a class of interface-data-service. One of the attributes of the latter class is Interface, whose value defines the interface that will provide the current value for such variable. The value of such a variable, like that of any other variable in this system, can be set to be evaluated at set intervals, or it can be set to expire after a set validity interval. When the inference engine seeks the value of said variable, either at the preset update intervals or when such value is needed but has expired, a request is sent to the specified interface to provide a value. Such variables are registered in the interface. After the interface is set up, when the start_interface function is called by the program and the connection is established between the interface (107) process and the computer program (115) process, the functions executed by the interface comprise: initialize_context (initializes the connection between the program and the interface); pause_context; resume_context; shutdown_context; receive_registration (called when the program seeks to map a variable to an external data point for the first time); receive_deregistration; poll for data (checks periodically which registered variables need updating, retrieves the data values available, packages the data into the data structure used by the return functions, and sends the data back to the program (115)); get_data (called when the program requests data service for one or more registered variables); and set_data (called when the program executes one or more set actions within a rule or procedure; it sends a request to the interface to set the external data point to which the registered variable is mapped, and it may also call interface return functions to change the value of the registered variable after its corresponding value in the external system has been changed). The interfaces also perform many other functions related to passing objects and messages, checking data types, and several other API functions.
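    The registration of a sensor-mapped variable and the refresh of its value after a validity interval expires can be sketched as follows. Only the names receive_registration, receive_deregistration, get_data and set_data come from the description above; their Python signatures, the Interface and SensorVariable classes, and the dictionary standing in for the external system are assumptions made for illustration.

```python
class Interface:
    """Hypothetical bridge between the program and external data points."""

    def __init__(self, external_points):
        self.external_points = external_points  # stand-in for the external system
        self.registered = set()

    def receive_registration(self, point):
        self.registered.add(point)

    def receive_deregistration(self, point):
        self.registered.discard(point)

    def get_data(self, point):
        if point not in self.registered:
            raise KeyError(f"{point} is not registered")
        return self.external_points[point]

    def set_data(self, point, value):
        if point not in self.registered:
            raise KeyError(f"{point} is not registered")
        self.external_points[point] = value


class SensorVariable:
    """Variable whose value expires after a validity interval."""

    def __init__(self, point, interface, validity_interval):
        self.point = point
        self.interface = interface
        self.validity_interval = validity_interval
        self._value = None
        self._stamp = None
        interface.receive_registration(point)  # map the variable to its data point

    def value(self, now):
        expired = self._stamp is None or now - self._stamp > self.validity_interval
        if expired:
            # Value missing or expired: request a fresh one from the interface.
            self._value = self.interface.get_data(self.point)
            self._stamp = now
        return self._value


points = {"pH": 7.0}
iface = Interface(points)
ph = SensorVariable("pH", iface, validity_interval=5.0)
first = ph.value(now=0.0)
points["pH"] = 6.5            # the external system changes
cached = ph.value(now=3.0)    # still within the validity interval
refreshed = ph.value(now=6.0)
print(first, cached, refreshed)
```

    In this sketch the time is passed in explicitly so that the expiry logic is easy to follow; a real-time system would read it from a clock.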

    In the current implementation of this invention, methods such as procedures, rules, formulas and relations, defined in a structured natural language, may apply to a single instance or a group of instances, or to a class or a group of classes, and they may include references to connections and to connected objects. Methods may refer or apply to the values of attributes at different time points or to the behavior of variables or parameters over time. They can be performed: a) in response to given events or conditions, b) at predetermined time intervals, or c) upon request from other rules or procedures. Methods can be executed in real time, in simulated time, or as fast as possible, implementing different strategies concurrently and over extended periods of time. Arithmetic and symbolic expressions can be used independently or combined, and dynamic models may include anything from non-linear differential equations to logic expressions, to simulate analytic and/or heuristic behavior.

    The following sections will also discuss various of the innovative teachings of this invention that refer to methods and tools used t