Method and system for implicitly resolving pointing ambiguities in human-computer interaction (HCI)
Patent No. 6907581
Abstract
A method and system for implicitly resolving pointing ambiguities in human-computer interaction by implicitly analyzing user movements of a pointer toward a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object, using different categories of heuristic (statically and/or dynamically learned) measures, such as (i) implicit user pointing gesture measures, (ii) application context, and (iii) number of computer suggestions of each predicted user targeted object. Featured are user pointing gesture measures of (1) speed-accuracy tradeoff, namely total movement time (TMT) and amount of fine tuning (AFT) or tail length (TL), and (2) exact pointer position. A particular application context heuristic measure used is referred to as containment hierarchy. The invention is applicable to resolving a variety of different types of pointing ambiguities, such as composite object types of pointing ambiguities, involving different types of pointing devices, and to essentially any type of software and/or hardware methodology involving using a pointer, such as computer aided design (CAD), object based graphical editing, and text editing.
Claims
1. A method for implicitly resolving pointing ambiguities in human-computer interaction, comprising the steps of:
(a) intending by a user to select a user targeted object from a plurality of at least two objects in an object domain displayed by a computer executing a computer application including a pointing mechanism featuring a pointer dynamically moveable throughout said object domain;
(b) moving by said user said pointer towards said user targeted object;
(c) estimating by said computer user movement continuation of said pointer towards said user targeted object;
(d) forming by said computer a set of candidate predicted user targeted objects according to parameters selected from the group consisting of pointer movement continuation parameters obtained from step (c) and pointer position parameters;
(e) predicting by said computer said user targeted object from said set of said candidate predicted user targeted objects according to at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures, application context measures, and, number of computer suggestions of each predicted user targeted object measures, for generating by said computer a best predicted user targeted object;
(f) suggesting by said computer said best predicted user targeted object to said user; and
(g) making a decision by said user, said decision is selected from the group consisting of accepting said computer suggested best predicted user targeted object as said user targeted object and as correct, and, rejecting said computer suggested best predicted user targeted object as not said user targeted object and as incorrect, whereby if said decision is said accepting said computer suggested best predicted user targeted object as said user targeted object, then said user performs an acceptance action using said pointing mechanism, indicative that the pointing ambiguities are resolved.
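As an illustration only, steps (c) and (d) of claim 1 might be sketched as below. The claim does not fix how movement continuation is estimated or how candidates are chosen, so the linear extrapolation, the rectangular `Box` objects, and the names `extrapolate` and `candidate_set` are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical rectangular object: top-left corner plus width and height."""
    name: str
    x: float
    y: float
    w: float
    h: float

    def contains(self, px: float, py: float) -> bool:
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h


def extrapolate(prev, curr, steps=1.0):
    """Assumed continuation model: keep moving along the last displacement."""
    return (curr[0] + steps * (curr[0] - prev[0]),
            curr[1] + steps * (curr[1] - prev[1]))


def candidate_set(objects, prev, curr):
    """Objects under the current pointer position or under its estimated continuation."""
    ahead = extrapolate(prev, curr)
    return [o for o in objects if o.contains(*curr) or o.contains(*ahead)]


objects = [Box("a", 0, 0, 10, 10), Box("b", 12, 0, 10, 10), Box("c", 100, 100, 5, 5)]
candidates = candidate_set(objects, prev=(2, 5), curr=(8, 5))
```

Here the pointer sits inside object "a" and its extrapolated position falls inside object "b", so both become candidates for the prediction of step (e), while "c" is excluded.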
2. The method of claim 1, wherein said category of implicit user pointing gesture measures includes particular types of said heuristic measures selected from the group consisting of speed-accuracy tradeoff heuristic measures and exact pointer position heuristic measures.
3. The method of claim 1, wherein said category of implicit user pointing gesture measures includes particular types of said heuristic measures selected from the group consisting of total movement time (TMT) heuristic measures and amount of fine tuning (AFT) or tail length (TL) heuristic measures.
4. The method of claim 1, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying Fitts' Law for determining a total movement time parameter, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
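The formula of claim 4 can be evaluated directly. The following is a minimal sketch; the function name `fitts_tmt` and the numeric values of a, b, A, and W are hypothetical and chosen only for illustration.

```python
import math


def fitts_tmt(a: float, b: float, A: float, W: float) -> float:
    """Total movement time per Fitts' Law: TMT = a + b*log2((2*A)/W)."""
    if A <= 0 or W <= 0:
        raise ValueError("A and W must be positive")
    index_of_difficulty = math.log2((2.0 * A) / W)  # in 'bit' units
    return a + b * index_of_difficulty


# Hypothetical parameters: a = 0.1, b = 0.2, distance A = 160, size W = 20.
# The index of difficulty is log2(320/20) = 4 bits.
tmt = fitts_tmt(0.1, 0.2, 160.0, 20.0)
```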
5. The method of claim 1, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed version of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a predetermined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
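Solving the formula of claim 5 for W gives W = (2*A) / 2^((TMT-a)/b). A minimal sketch of this reversed use follows; the function name `predict_width` and the sample values are assumptions for illustration.

```python
import math


def predict_width(a: float, b: float, A: float, tmt: float) -> float:
    """Reversed Fitts' Law: solve TMT = a + b*log2((2*A)/W) for the size W."""
    if b == 0:
        raise ValueError("b must be non-zero to invert the formula")
    return (2.0 * A) / (2.0 ** ((tmt - a) / b))


# Hypothetical values: with a = 0.1, b = 0.2, A = 160 and an observed
# TMT of 0.9, the predicted target size is approximately 20.
predicted_w = predict_width(0.1, 0.2, 160.0, 0.9)
```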
6. The method of claim 1, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a Shannon formulation of Fitts' Law for determining said total movement time, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a predetermined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(A/W)+1], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(A/W)+1] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
7. The method of claim 1, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed Shannon formulation of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a pre-determined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
8. The method of claim 7, whereby extent by which said prediction of said size, W, of said user targeted object fits said reversed Shannon formulation of said Fitts' Law is defined as a fit written in a form, said fit=ABS(TMT-a-b*log2[(A/W)+1]).
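The fit of claim 8 can be computed per candidate object. In the sketch below, choosing the candidate with the smallest fit is an assumption about how the fit might be used, not something the claim states; the candidate names, distances, sizes, and parameter values are likewise hypothetical.

```python
import math


def shannon_fit(tmt: float, a: float, b: float, A: float, W: float) -> float:
    """fit = ABS(TMT - a - b*log2((A/W) + 1)); smaller means a closer match."""
    return abs(tmt - a - b * math.log2((A / W) + 1.0))


# Hypothetical candidates, each given as (distance A, size W).
candidates = {"small_icon": (150.0, 10.0), "large_panel": (150.0, 50.0)}
observed_tmt, a, b = 0.9, 0.1, 0.2

fits = {name: shannon_fit(observed_tmt, a, b, A, W)
        for name, (A, W) in candidates.items()}

# Assumed selection rule: prefer the candidate whose size best explains the TMT.
best = min(fits, key=fits.get)
```

With these values the small icon has an index of difficulty of 4 bits, which matches the observed movement time, so it is preferred over the large panel (2 bits).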
9. The method of claim 1, wherein said category of implicit user pointing gesture measures includes amount of fine tuning (AFT) or tail length (TL) heuristic measures for determining a tail length parameter, TL, of said user movement of said pointer as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a predetermined reference point, where said tail length parameter is described by a formula, said TL=a*W^2+b*W+c, where said a, said b, and said c, are empirically determined (AFT) or (TL) parameters and said asterisk symbol, *, is a multiplication operator.
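The tail length formula of claim 9 is a quadratic in the object size W, with the first term taken as the square of W. A minimal sketch follows; the function name `tail_length` and the coefficient values are hypothetical.

```python
def tail_length(a: float, b: float, c: float, W: float) -> float:
    """Tail length as a quadratic in object size: TL = a*W^2 + b*W + c."""
    return a * W ** 2 + b * W + c


# Hypothetical coefficients a = 0.01, b = -1.0, c = 30.0 and size W = 20:
# 0.01*400 - 20 + 30 = 14.
tl = tail_length(0.01, -1.0, 30.0, 20.0)
```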
10. The method of claim 1, whereby said category of implicit user pointing gesture measures includes particular types of exact pointer position heuristic measures selected from the group consisting of distance of said pointer from center of said user targeted object and direction of said moving by said user said pointer towards said user targeted object.
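The two exact pointer position measures of claim 10 might be computed as sketched below. The claim does not say how direction is quantified; using the cosine of the angle between the last pointer displacement and the vector to the object center is an assumption, as are the function names and the 2D coordinate tuples.

```python
import math


def distance_to_center(pointer, center) -> float:
    """Euclidean distance of the pointer from the object's center."""
    return math.hypot(center[0] - pointer[0], center[1] - pointer[1])


def direction_alignment(prev, curr, center) -> float:
    """Cosine of the angle between the last movement and the vector to the center.

    1.0 means the pointer moves straight toward the center, -1.0 straight away.
    Returns 0.0 when either vector has zero length (assumed convention).
    """
    mx, my = curr[0] - prev[0], curr[1] - prev[1]
    tx, ty = center[0] - curr[0], center[1] - curr[1]
    norm = math.hypot(mx, my) * math.hypot(tx, ty)
    if norm == 0:
        return 0.0
    return (mx * tx + my * ty) / norm


d = distance_to_center((0, 0), (3, 4))
toward = direction_alignment((0, 0), (1, 0), (5, 0))
sideways = direction_alignment((0, 0), (1, 0), (1, 3))
```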
11. The method of claim 1, whereby said category of application context measures is based on context, of said selecting said user targeted object by said user, including any information external to said selecting and relevant to understanding said selecting said user targeted object.
12. The method of claim 1, whereby said category of application context measures includes a containment hierarchy particular type of said heuristic measures.
13. A method for implicitly resolving pointing ambiguities in human-computer interaction, comprising the steps of:
(a) intending by a user to select a user targeted object from a plurality of at least two objects in an object domain displayed by a computer executing a computer application including a pointing mechanism featuring a pointer dynamically moveable throughout said object domain;
(b) moving by said user said pointer towards said user targeted object;
(c) selecting by said user a position of said pointer located in a vicinity of said user targeted object;
(d) estimating by said computer user movement continuation of said pointer towards said user targeted object;
(e) forming by said computer a set of candidate predicted user targeted objects according to parameters selected from the group consisting of pointer movement continuation parameters obtained from step (d) and pointer position parameters;
(f) predicting by said computer said user targeted object from said set of said candidate predicted user targeted objects according to at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures and application context measures, for generating by said computer a best predicted user targeted object; and
(g) selecting by said computer said computer generated best predicted user targeted object, whereby if said computer generated best predicted user targeted object is said user targeted object, then the pointing ambiguities are resolved.
14. The method of claim 13, wherein said category of implicit user pointing gesture measures includes particular types of said heuristic measures selected from the group consisting of speed-accuracy tradeoff heuristic measures and exact pointer position heuristic measures.
15. The method of claim 13, wherein said category of implicit user pointing gesture measures includes particular types of said heuristic measures selected from the group consisting of total movement time (TMT) heuristic measures and amount of fine tuning (AFT) or tail length (TL) heuristic measures.
16. The method of claim 13, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying Fitts' Law for determining a total movement time parameter, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
17. The method of claim 13, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed version of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
18. The method of claim 13, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a Shannon formulation of Fitts' Law for determining said total movement time, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a predetermined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(A/W)+1], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(A/W)+1] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
19. The method of claim 13, wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed Shannon formulation of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a pre-determined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
20. The method of claim 19, whereby extent by which said prediction of said size, W, of said user targeted object fits said reversed Shannon formulation of said Fitts' Law is defined as a fit written in a form, said fit=ABS(TMT-a-b*log2[(A/W)+1]).
21. The method of claim 13, wherein said category of implicit user pointing gesture measures includes amount of fine tuning (AFT) or tail length (TL) heuristic measures for determining a tail length parameter, TL, of said user movement of said pointer as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said tail length parameter is described by a formula, said TL=a*W^2+b*W+c, where said a, said b, and said c, are empirically determined (AFT) or (TL) parameters and said asterisk symbol, *, is a multiplication operator.
22. The method of claim 13, whereby said category of implicit user pointing gesture measures includes particular types of exact pointer position heuristic measures selected from the group consisting of distance of said pointer from center of said user targeted object and direction of said moving by said user said pointer towards said user targeted object.
23. The method of claim 13, whereby said category of application context measures is based on context, of said selecting said user targeted object by said user, including any information external to said selecting and relevant to understanding said selecting said user targeted object.
24. The method of claim 13, whereby said category of application context measures includes a containment hierarchy particular type of said heuristic measures.
25. A method for implicitly resolving pointing ambiguities in human-computer interaction, comprising the steps of:
(a) intending by a user to select a user targeted object from a plurality of at least two objects in an object domain displayed by a computer executing a computer application including a pointing mechanism featuring a pointer dynamically moveable throughout said object domain;
(b) moving by said user said pointer towards said user targeted object;
(c) implicitly resolving by said computer the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, whereby said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures, application context measures, and number of computer suggestions of each best predicted user targeted object measures;
wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed Shannon formulation of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a predetermined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units;
and wherein extent by which said prediction of said size, W, of said user targeted object fits said reversed Shannon formulation of said Fitts' Law is defined as a fit written in a form, said fit=ABS(TMT-a-b*log2[(A/W)+1]).
26. The method of claim 25, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of speed-accuracy tradeoff heuristic measures and exact pointer position heuristic measures.
27. The method of claim 25, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of total movement time (TMT) heuristic measures and amount of fine tuning (AFT) or tail length (TL) heuristic measures.
28. The method of claim 25, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying Fitts' Law for determining a total movement time parameter, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a predetermined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
29. The method of claim 25, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a reversed version of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
30. The method of claim 25, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a Shannon formulation of Fitts' Law for determining said total movement time, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a predetermined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(A/W)+1], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(A/W)+1] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
31. The method of claim 25, whereby said category of implicit user pointing gesture measures also includes particular types of exact pointer position heuristic measures selected from the group consisting of distance of said pointer from center of said user targeted object and direction of said moving by said user said pointer towards said user targeted object.
32. The method of claim 25, whereby said category of application context measures is based on context, of said selecting said user targeted object by said user, including any information external to said selecting and relevant to understanding said selecting said user targeted object.
33. The method of claim 25, whereby said category of application context measures includes a containment hierarchy particular type of said heuristic measures.
34. The method of claim 25, wherein said pointing ambiguities include an overlapping object ambiguity.
35. The method of claim 25, wherein said pointing ambiguities include a composite object ambiguity.
36. A method for implicitly resolving pointing ambiguities in human-computer interaction, comprising the steps of:
(a) intending by a user to select a user targeted object from a plurality of at least two objects in an object domain displayed by a computer executing a computer application including a pointing mechanism featuring a pointer dynamically moveable throughout said object domain;
(b) moving by said user said pointer towards said user targeted object;
(c) implicitly resolving by said computer the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, whereby said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures, application context measures, and number of computer suggestions of each best predicted user targeted object measures;
wherein said category of implicit user pointing gesture measures includes amount of fine tuning (AFT) or tail length (TL) heuristic measures for determining a tail length parameter, TL, of said user movement of said pointer as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said tail length parameter is described by a formula, said TL=a*W^2+b*W+c, where said a, said b, and said c, are empirically determined (AFT) or (TL) parameters and said asterisk symbol, *, is a multiplication operator.
37. A method for implicitly resolving pointing ambiguities in human-computer interaction, comprising the steps of:
(a) intending by a user to select a user targeted object from a plurality of at least two objects in an object domain displayed by a computer executing a computer application including a pointing mechanism featuring a pointer dynamically moveable throughout said object domain;
(b) moving by said user said pointer towards said user targeted object;
(c) implicitly resolving by said computer the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, whereby said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures and application context measures;
wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed Shannon formulation of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a pre-determined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units;
and wherein extent by which said prediction of said size, W, of said user targeted object fits said reversed Shannon formulation of said Fitts' Law is defined as a fit written in a form, said fit=ABS(TMT-a-b*log2[(A/W)+1]).
38. The method of claim 37, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of speed-accuracy tradeoff heuristic measures and exact pointer position heuristic measures.
39. The method of claim 37, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of total movement time (TMT) heuristic measures and amount of fine tuning (AFT) or tail length (TL) heuristic measures.
40. The method of claim 37, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying Fitts' Law for determining a total movement time parameter, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
41. The method of claim 37, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a reversed version of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a predetermined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
42. The method of claim 37, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a Shannon formulation of Fitts' Law for determining said total movement time, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a predetermined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(A/W)+1], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(A/W)+1] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
43. The method of claim 37, whereby said category of implicit user pointing gesture measures also includes particular types of exact pointer position heuristic measures selected from the group consisting of distance of said pointer from center of said user targeted object and direction of said moving by said user said pointer towards said user targeted object.
44. The method of claim 37, whereby said category of application context measures is based on context, of said selecting said user targeted object by said user, including any information external to said selecting and relevant to understanding said selecting said user targeted object.
45. The method of claim 37, whereby said category of application context measures includes a containment hierarchy particular type of said heuristic measures.
46. The method of claim 37, wherein said pointing ambiguities include an overlapping object ambiguity.
47. The method of claim 37, wherein said pointing ambiguities include a composite object ambiguity.
48. A method for implicitly resolving pointing ambiguities in human-computer interaction, comprising the steps of:
(a) intending by a user to select a user targeted object from a plurality of at least two objects in an object domain displayed by a computer executing a computer application including a pointing mechanism featuring a pointer dynamically moveable throughout said object domain;
(b) moving by said user said pointer towards said user targeted object;
(c) implicitly resolving by said computer the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, whereby said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures and application context measures;
wherein said category of implicit user pointing gesture measures includes amount of fine tuning (AFT) or tail length (TL) heuristic measures for determining a tail length parameter, TL, of said user movement of said pointer as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said tail length parameter is described by a formula, said TL=a*W^2+b*W+c, where said a, said b, and said c, are empirically determined (AFT) or (TL) parameters and said asterisk symbol, *, is a multiplication operator.
49. A system for implicitly resolving pointing ambiguities in human-computer interaction, comprising:
(a) a user intending to select a user targeted object from a plurality of at least two objects in an object domain;
(b) a pointing mechanism featuring a pointer dynamically moveable throughout said object domain and controllable by said user; and
(c) a computer displaying said plurality of said at least two objects in said object domain and executing a computer application including said pointer dynamically moveable throughout said object domain, whereby said computer implicitly resolves the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures, application context measures, and number of computer suggestions of each best predicted user targeted object measures;
wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed Shannon formulation of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a predetermined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units;
and wherein extent by which said prediction of said size, W, of said user targeted object fits said reversed Shannon formulation of said Fitts' Law is defined as a fit written in a form, said fit=ABS(TMT-a-b*log2[(A/W)+1]).
50. The system of claim 49, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of speed-accuracy tradeoff heuristic measures and exact pointer position heuristic measures.
51. The system of claim 49, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of total movement time (TMT) heuristic measures and amount of fine tuning (AFT) or tail length (TL) heuristic measures.
52. The system of claim 49, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying Fitts' Law for determining a total movement time parameter, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
53. The system of claim 49, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a reversed version of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
54. The system of claim 49, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a Shannon formulation of Fitts' Law for determining said total movement time, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(A/W)+1], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(A/W)+1] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
55. The system of claim 49, whereby said category of implicit user pointing gesture measures also includes particular types of exact pointer position heuristic measures selected from the group consisting of distance of said pointer from center of said user targeted object and direction of said moving by said user said pointer towards said user targeted object.
56. The system of claim 49, whereby said category of application context measures is based on context, of said selecting said user targeted object by said user, including any information external to said selecting and relevant to understanding said selecting said user targeted object.
57. The system of claim 49, whereby said category of application context measures includes a containment hierarchy particular type of said heuristic measures.
58. The system of claim 49, wherein said pointing ambiguities include an overlapping object ambiguity.
59. The system of claim 49, wherein said pointing ambiguities include a composite object ambiguity.
60. A system for implicitly resolving pointing ambiguities in human-computer interaction, comprising:
(a) a user intending to select a user targeted object from a plurality of at least two objects in an object domain;
(b) a pointing mechanism featuring a pointer dynamically moveable throughout said object domain and controllable by said user; and
(c) a computer displaying said plurality of said at least two objects in said object domain and executing a computer application including said pointer dynamically moveable throughout said object domain, whereby said computer implicitly resolves the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures, application context measures, and number of computer suggestions of each best predicted user targeted object measures;
wherein said category of implicit user pointing gesture measures includes amount of fine tuning (AFT) or tail length (TL) heuristic measures for determining a tail length parameter, TL, of said user movement of said pointer as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said tail length parameter is described by a formula, said TL=a*W^2+b*W+c, where said a, said b, and said c, are empirically determined (AFT) or (TL) parameters and said asterisk symbol, *, is a multiplication operator.
61. A system for implicitly resolving pointing ambiguities in human-computer interaction, comprising:
(a) a user intending to select a user targeted object from a plurality of at least two objects in an object domain;
(b) a pointing mechanism featuring a pointer dynamically moveable throughout said object domain and controllable by said user; and
(c) a computer displaying said plurality of said at least two objects in said object domain and executing a computer application including said pointer dynamically moveable throughout said object domain, whereby said computer implicitly resolves the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures and application context measures;
wherein said category of implicit user pointing gesture measures includes total movement time (TMT) heuristic measures based on applying a reversed Shannon formulation of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a pre-determined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units;
and wherein extent by which said prediction of said size, W, of said user targeted object fits said reversed Shannon formulation of said Fitts' Law is defined as a fit written in a form, said fit=ABS(TMT-a-b*log2[(A/W)+1]).
62. The system of claim 61, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of speed-accuracy tradeoff heuristic measures and exact pointer position heuristic measures.
63. The system of claim 61, wherein said category of implicit user pointing gesture measures also includes particular types of said heuristic measures selected from the group consisting of total movement time (TMT) heuristic measures and amount of fine tuning (AFT) or tail length (TL) heuristic measures.
64. The system of claim 61, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying Fitts' Law for determining a total movement time parameter, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
65. The system of claim 61, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a reversed version of Fitts' Law for predicting a size, W, of said user targeted object from a total movement time parameter, TMT, for performing a given task and from a distance, A, of said user targeted object from a predetermined reference point, where said Fitts' Law is described by a formula, said TMT=a+b*log2[(2*A)/W], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(2*A)/W] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
66. The system of claim 61, wherein said category of implicit user pointing gesture measures also includes total movement time (TMT) heuristic measures based on applying a Shannon formulation of Fitts' Law for determining said total movement time, TMT, for performing a given task as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a pre-determined reference point, where said Shannon formulation of said Fitts' Law is described by a formula, said TMT=a+b*log2[(A/W)+1], where said a and said b are empirically determined Fitts' Law parameters, said asterisk symbol, *, is a multiplication operator, and said factor log2[(A/W)+1] is an index of difficulty describing difficulty for said performing said given task in ‘bit’ units.
67. The system of claim 61, whereby said category of implicit user pointing gesture measures also includes particular types of exact pointer position heuristic measures selected from the group consisting of distance of said pointer from center of said user targeted object and direction of said moving by said user said pointer towards said user targeted object.
68. The system of claim 61, whereby said category of application context measures is based on context, of said selecting said user targeted object by said user, including any information external to said selecting and relevant to understanding said selecting said user targeted object.
69. The system of claim 61, whereby said category of application context measures includes a containment hierarchy particular type of said heuristic measures.
70. The system of claim 61, wherein said pointing ambiguities include an overlapping object ambiguity.
71. The system of claim 61, wherein said pointing ambiguities include a composite object ambiguity.
72. A system for implicitly resolving pointing ambiguities in human-computer interaction, comprising:
(a) a user intending to select a user targeted object from a plurality of at least two objects in an object domain;
(b) a pointing mechanism featuring a pointer dynamically moveable throughout said object domain and controllable by said user; and
(c) a computer displaying said plurality of said at least two objects in said object domain and executing a computer application including said pointer dynamically moveable throughout said object domain, whereby said computer implicitly resolves the pointing ambiguities by implicitly analyzing user movements of said pointer towards said user targeted object located in said object domain and predicting said user targeted object, said implicitly analyzing and predicting are performed by using at least one category of heuristic measures selected from the group consisting of implicit user pointing gesture measures and application context measures;
wherein said category of implicit user pointing gesture measures includes amount of fine tuning (AFT) or tail length (TL) heuristic measures for determining a tail length parameter, TL, of said user movement of said pointer as a function of a size, W, of said user targeted object and a distance, A, of said user targeted object from a predetermined reference point, where said tail length parameter is described by a formula, said TL=a*W^2+b*W+c, where said a, said b, and said c, are empirically determined (AFT) or (TL) parameters and said asterisk symbol, *, is a multiplication operator.
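As an illustrative aside to the formulas recited in the claims above, the following minimal Python sketch evaluates the Shannon formulation TMT=a+b*log2[(A/W)+1], its reversed form for predicting W, the fit measure fit=ABS(TMT-a-b*log2[(A/W)+1]), and the tail length formula TL=a*W^2+b*W+c. The function names, the candidate dictionary, and all numeric parameter values are hypothetical and chosen only for illustration; the claims state that a, b, and c are empirically determined. Ranking candidates by smallest fit is likewise an assumption of this sketch, not a statement of the claimed procedure.

```python
import math


def shannon_tmt(a, b, A, W):
    """Total movement time: TMT = a + b*log2[(A/W) + 1]."""
    return a + b * math.log2(A / W + 1)


def predict_width(a, b, A, tmt):
    """Reversed Shannon formulation: solve TMT = a + b*log2[(A/W) + 1] for W.

    Assumes b != 0 and tmt > a, so that the denominator is positive.
    """
    return A / (2 ** ((tmt - a) / b) - 1)


def fit(a, b, A, W, tmt):
    """fit = ABS(TMT - a - b*log2[(A/W) + 1]); smaller means a closer match."""
    return abs(tmt - shannon_tmt(a, b, A, W))


def tail_length(a, b, c, W):
    """Tail length: TL = a*W^2 + b*W + c."""
    return a * W ** 2 + b * W + c


def best_candidate(candidates, a, b, A, tmt):
    """Return the candidate name whose size W gives the smallest fit.

    `candidates` maps a hypothetical object name to its size W; all
    candidates are assumed to share the same distance A.
    """
    return min(candidates, key=lambda name: fit(a, b, A, candidates[name], tmt))


# Hypothetical parameters and measurements, for illustration only.
a, b, A = 0.1, 0.2, 300.0
observed_tmt = 0.85
print(best_candidate({"slice": 20.0, "circle": 100.0}, a, b, A, observed_tmt))
```

With these assumed values, a long observed movement time is closer to the time expected for the small object than for the large one, so the sketch favors the smaller candidate.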
Description
FIELD AND BACKGROUND OF THE INVENTION
The present invention relates to the field of human-computer interaction (HCI) focusing on using a pointer for selecting objects and, more particularly, to a method and system for implicitly resolving pointing ambiguities in human-computer interaction by implicitly analyzing user movements of a pointer toward a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object. Implicitly analyzing the user movements of the pointer and predicting the user targeted object are done by using different categories of heuristic (statically and/or dynamically learned) measures, such as (i) implicit user pointing gesture measures, (ii) application context, and (iii) number of computer suggestions of each best predicted user targeted object.
In the continuously developing field of human-computer interaction (HCI), human-computer interfaces are constantly getting closer to the cognitive level of the user. Contemporary interfaces try to make the human-computer dialogue as natural and non-formal as possible, thus taking advantage of the user's natural perceptual and communicative abilities.
The last decades have witnessed dramatic changes in this respect, with the introduction of new graphical user interface (GUI) concepts and devices such as graphical object manipulation, ‘mouse’ pointing devices and associated pointers, windows, pull-down dynamic menus, and icons, all of which are now ubiquitous. These concepts were the foundations for the ‘WYSIWYG’ (what you see is what you get) paradigm originated during the 1970's at Xerox PARC laboratories, and were later used extensively in commercial systems such as the Xerox Star (1981), the Apple Lisa (1982), the Apple Macintosh (1984), and currently in Microsoft Windows. The theoretical principles and psychological foundation of these interfaces were eventually termed ‘Direct Manipulation’ by Shneiderman, as described by Shneiderman, B., in The Future of Interactive Systems and The Emergence of Direct Manipulation, Behavior and Information Technology 1, 1982, 237-256, and reviewed by Myers, B. A., in Brief History of Human Computer Interaction Technology, ACM Interactions, 5(2), March 1998, 44-54. The trend of approaching the user's cognitive domain in the field of human-computer interaction has continued since, and is now manifested in the form of the ‘Perceptual User Interfaces’ paradigm as described by Turk, M. and Robertson, G., in Perceptual User Interfaces, Communications of the ACM, March 2000, 43(3), 33-34.
Direct Manipulation (DM) interfaces are characterized by continuous graphic representation of the system objects, physical actions instead of complex syntax, and incremental operations whose impact on the object is immediately visible. DM interfaces are generally shown to improve users' performance and satisfaction, and were cognitively analyzed as encouraging ‘Direct Engagement’ and reducing ‘Semantic Distance’, as described by Hutchins et al., in Direct Manipulation Interfaces, User Centered System Design: New Perspectives on Human-computer Interaction, Norman, D. A. and Draper, S. W. (eds.), Lawrence Erlbaum, New Jersey, USA, 1986, 87-124. Direct Engagement refers to the ability of users to operate directly on objects rather than conversing about objects using a computer language. The ‘Cognitive Distance’ factor describes the amount of mental effort needed to translate from the cognitive domain of the user to the level required by the user interface, as described by Jacob, R. J. K., in A Specification Language for Direct Manipulation User Interfaces, ACM Transactions on Graphics, 5(4), 1986, 283-317. The graphic nature of DM interfaces, with their rich visual representation of the system's objects, exploits the natural perceptual bandwidth of humans. It enables abundant and complex information to be presented to, and processed simultaneously and naturally by, users.
The intuitiveness of DM may also be attributed to the fact that it incorporates natural dialogue elements. For instance, the fact that both user's input and computer's output are performed on the same objects, also referred to as ‘inter-referential I/O’, is reminiscent of natural language, which often includes references to objects from earlier utterances, as described by Draper, S. W., in Display Managers as the Basis for User-Machine Communication, User Centered System Design: New Perspectives on Human-computer Interaction, Norman, D. A. and Draper, S. W. (eds.), Lawrence Erlbaum, NJ, USA, 1986, 339-352.
DM has some limitations, as not all interaction requirements can be reduced to graphical representation. An example of such limitation is the fact that DM is tailored for a demonstrative form of interaction, at the expense of neglecting descriptive interaction, that is, DM is suited for actions such as "delete this", but not for actions such as "delete all red objects that reside next to blue objects". This is a significant limitation of DM, as many human-computer interaction scenarios may benefit from descriptive operations. However, such limitations did not prevent DM from becoming a de facto standard in the last decades, as described by Buxton, B., in HCI and the Inadequacies of Direct Manipulation Systems, SIGCHI Bulletin, 25(1), 1993, 21-22.
Despite the dramatic advances of the last decades, user interface is still considered a bottleneck limiting successful interaction between human cognitive abilities and the computer's continually evolving and developing computational abilities. The goal of natural interfaces is being further pursued in contemporary research under the framework of ‘Perceptual User Interfaces’ (PUI). PUI seek to craft user interfaces that are more natural and compelling by taking advantage of the ways in which people naturally interact with each other. Human-computer interfaces are required, by the PUI paradigm, to be as transparent as possible, and to incorporate several modalities of input and output. Possible input methods include speech, facial expressions, eye movement, manual gestures and touch. Output may be given as a multimedia mix of visual, auditory and tactile feedback, as described in the above Turk and Robertson reference.
While there is consensus that user interfaces should become more ‘natural’ to the user, different concepts of ‘natural interfaces’ are being pursued. Many researchers focus on natural language interfaces. The main goal of this discipline is the understanding and deployment of spoken language as a means of entering input. These interfaces are often augmented by additional modalities like physical pointing, as described by Bolt, R. A., in Put-That-There: Voice and Gesture at the Graphics Interface, Computer Graphics, 14(3), 1980, 262-270, and described by Kobsa et al., in Combining Deictic Gestures and Natural Language for Referent Identification, Proceedings of the 11th International Conference on Computational Linguistics, Bonn, West Germany, 1986, 356-361, or lip movement recognition to improve the interface's robustness.
Other researchers try to incorporate natural dialogue principles into non-verbal dialogue. In the above Buxton reference, several drawbacks of using spoken language as a human-computer interface are presented. There, it is claimed that spoken languages are not actually natural (but rather, learned), are not universal (profound differences exist between languages), and are "single-threaded" (only one stream of words can be parsed at a time). Consequently, Buxton's approach is to rely on elements that are arguably more fundamental to natural dialogue, such as fluidity, continuity, and phrasing, and to integrate them into traditional graphical user interfaces. Similar approaches try to augment traditional GUI, which is said to consist of a syntactic, semantic, and lexical level, as described by Foley, J. D. et al. in Computer Graphics: Principles and Practice, Addison-Wesley, Reading, Mass., USA, 1990, with an additional discourse level. The discourse level refers to the ability to interpret each user's action in the context of previous actions, rather than as an independent utterance. The interpretation is performed according to human dialogue properties such as conversational flow and focus, as described by Perez, M. A. and Sibert, J. L., in Focus in Graphical User Interfaces, Proceedings of the ACM International Workshop on Intelligent User Interfaces, Addison-Wesley/ACM Press, FL, USA, 1993, and by Jacob, R. J. K., in Natural Dialogue in Modes Other than Natural Language, Dialogue and Instruction, Beun, R. J., Baker, M., and Reiner, M. (eds.), Springer-Verlag, Berlin, 1995, 289-301.
Another aspect of the cognitive gap between humans and computers is the incompatible object representation. While computer programs usually maintain a rigid definition of their interaction objects, users are often interested in objects that emerge dynamically while working. This disparity is a major obstacle on the way to natural dialogue. The gap can be somewhat bridged using ‘Gestalt’ Perceptual Grouping principles that may identify emergent perceptual objects that are of interest to the user. Such an effort was made in the "Per Sketch" Perceptually Supported Sketch Editor, as described by Saund, E. and Moran, T. P., in A Perceptually Supported Sketch Editor, Proceedings of the ACM Symposium on UI software and Technology, UIST, CA, USA, 1994, and in Perceptual Organization in an Interactive Sketch Editing Application, ICCV, 1995.
Perceptual Grouping principles may also be used in order to automatically understand references to multiple objects without the user needing to enumerate each of them. For example, following a user's reference to multiple objects, either with a special mouse action or verbal utterance, a ‘Gestalt’ based algorithm may be used for identifying the most salient group of objects as the target of the pointing action, as described by Thorisson, K. R., in Simulated Perceptual Grouping: An Application to Human-Computer Interaction, Proceedings of the Sixteenth annual Conference of the Cognitive Science Society, Atlanta, Ga., USA, Aug. 13-16, 1994, 876-881.
One profound difference between Perceptual User Interfaces and standard Graphical User Interfaces is that while input to GUIs is atomic and certain, input to PUIs is often uncertain and ambiguous, and hence its interpretation is probabilistic. This fact presents a challenge of creating robust mechanisms for dealing with uncertainties involved during human-computer interactions. Typically, the strategy for meeting this challenge is to integrate information from several sources, as described by Oviatt, S. and Cohen, P., in Multimodal Interfaces That Process What Comes Naturally, Communications of the ACM, 43(3), March, 2000, 45-53.
The present invention for implicitly resolving pointing ambiguities in human-computer interaction, described below, falls within the Perceptual Interface paradigm with respect to both objective and methodology. An important continuously underlying objective of the present invention is to enable transparent human-computer interaction within the Direct Manipulation (DM) paradigm. Methodology of the present invention is based on heuristically dealing with uncertain input, as is often the case with Perceptual Interfaces.
Pointing. A fundamental element of any linguistic interaction is an agreement on the subject to which a sentence refers. Natural conversation is often accompanied by deictic (logic) gestures that help identify the object of interest by showing its location, as described by Streit, M., in Interaction of Speech, Deixis and Graphical Interface, Proceedings of the Workshop on Deixis, Demonstration and Deictic Belief, held on occasion of ESSLLI XI, August, 1999. Deictic gestures performed in natural dialogue are very limited in information by themselves, as they only provide a general area of attention. However, these gestures are accurately understood in the conversation context, by integrating information from the spoken language and the gesture.
User interfaces need to incorporate an equivalent mechanism for determining the subject of user operations. DM interfaces implement this conversational element with the concept of the Current Object of Interest (COI), which is the designated object for the next user's actions. Many of the operations available to users of DM are designed to act upon the COI. The common method of designating the COI is by using a deictic gesture to point at the COI, that is, by clicking on the object. A typical interaction scenario consists of selecting the COI by pointing at it, and performing different actions on it, also referred to as the noun-verb or object-action paradigm.
The COI has a unique role in DM interfaces. One of the characteristics of DM interfaces is that they are modeless, whereby, each user's action is interpreted in the same manner, rather than according to a varying application ‘mode’. However, the COI mechanism is in fact a way of achieving modes in DM applications, as it reduces the acceptable inputs and determines the way inputs are interpreted, as described in the above Jacob, 1995, reference.
Pointing Ambiguities. Like their natural counterparts, user interface pointing gestures are limited in the information they convey. In scenarios where graphical representations are complex, pointing gestures are typically ambiguous. Simply clicking on the desired object is usually the most intuitive and common selection method. This method of selection is very precise in specifying the exact location of interest, but lacks any other information. In particular, in scenarios featuring complex graphical representations, the exact location information is not sufficient to determine the user targeted object, as several interaction objects may share the same location. In order to overcome this and other types of pointing ambiguities, currently used pointing techniques and mechanisms are extended in various ‘explicit’ ways, summarized herein below, each of which is at the expense of the desired invisibility of the interface.
Composite Objects Ambiguity. One of the problematic scenarios for target selection is dealing with hierarchical object situations in which some selectable objects are compounds of lower level selectable objects. In such cases, pointing is inherently ambiguous, since several interaction objects share any given pointing device (‘mouse’) click position. Furthermore, there is no area that is unique to any of the objects, which can serve as an unambiguous selection area. As shown in FIG. 1, a schematic diagram illustrating an example of the commonly occurring composite object type of pointing ambiguity, when a user clicks inside the circle A, the user may want to select either the inner slice B or the entire circle C. This particular type of pointing ambiguity is sometimes referred to as the "pars-pro-toto" ambiguity—mistaking the part for the whole, and vice versa, as described in the above Streit, 1999, reference.
The composite object ambiguity problem exists under the surface of many common human-computer interaction scenarios. For example, in a common word processor, system objects typically consist of letters, words, lines, and paragraphs, each being a possible current object of interest (COI). However, clicking the pointing device (mouse) when the pointer is in the area of a letter can be interpreted as pointing to any of the above system objects. As shown in FIG. 2, a schematic diagram illustrating an example of the commonly occurring composite object type of pointing ambiguity with respect to text objects, when a user clicks inside the area of the letter ‘U’ indicated by the pointer, the user may want to select the letter ‘U’, the word ‘User’, or the entire sentence including the indicated ‘U’. Frames shown in FIG. 2 represent the imaginary or potential selection area of each text object.
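The text example above can be sketched in Python as follows. This is a minimal illustration, assuming a plain string and a character index in place of a real pointer position; the function name `text_candidates` and the simple regular expressions used to delimit words and sentences are hypothetical choices, not part of the described system. It shows only that a single click position yields several nested candidate objects.

```python
import re


def text_candidates(text, index):
    """Return the letter, word, and sentence that contain character `index`.

    Returns an empty dict when the index is out of range or does not fall
    on a letter or digit. Word and sentence boundaries are approximated
    with simple regular expressions (an assumption of this sketch).
    """
    if not 0 <= index < len(text) or not text[index].isalnum():
        return {}
    candidates = {"letter": text[index]}
    # Word: the run of word characters that spans the index.
    for match in re.finditer(r"\w+", text):
        if match.start() <= index < match.end():
            candidates["word"] = match.group()
            break
    # Sentence: text up to and including the next '.', '!' or '?'.
    for match in re.finditer(r"[^.!?]+[.!?]?", text):
        if match.start() <= index < match.end():
            candidates["sentence"] = match.group().strip()
            break
    return candidates


print(text_candidates("User clicks here. Then waits.", 0))
```

All three returned objects share the clicked position, which is why position alone cannot identify the user targeted object.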
Currently used pointing techniques and mechanisms use or incorporate different explicit techniques to explicitly overcome or resolve composite object ambiguities. One currently used explicit technique is to ‘avoid the hierarchy’ by allowing access to only one level of hierarchy at a time. This solution is typical of vector graphics drawing software, whereby users may place graphical elements on a ‘canvas’ and manipulate them. Objects may be grouped together into a composite object, thus creating an object hierarchy. However, once grouped together, the elemental objects are not accessible unless explicitly ungrouped, in which case the composite object ceases to exist. In terms of the selection mechanism, only one type of object exists, and its selection is very straightforward. In such scenarios, additional, often undesirable, actions of grouping and/or ungrouping need to be performed. Moreover, a related significant limitation of the grouping/ungrouping technique is that the deeper or more extensive the grouping hierarchy becomes, the harder it is for users to select the elemental objects, as more ungrouping actions are required, as well as the need to reconstruct the hierarchy following successful completion of the ungrouping and selecting actions. An example of this approach is the MS Word drawing objects grouping mechanism, shown in FIG. 3, a schematic diagram illustrating one currently used technique of ‘avoiding the hierarchy’ for explicitly overcoming or resolving composite object types of ambiguities shown in FIGS. 1 and 2. In FIG. 3, objects A and B are each individual or elemental selectable objects. Once grouped together, objects A and B are part of a single compound or composite object C. A user clicking anywhere within compound object C selects entire object C. The only way of selecting individual or elemental objects A and/or B is by first ungrouping compound or composite object C.
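The cost of the grouping/ungrouping technique described above can be illustrated with a short Python sketch. It assumes a hypothetical representation in which a group is a nested list and an elemental object is a string; the function name `ungroup_steps` is likewise an assumption. The sketch counts how many ungrouping actions are needed before a given object becomes directly selectable.

```python
def ungroup_steps(node, target, depth=0):
    """Count the ungroup actions needed before `target` is selectable.

    A group is modelled as a list and an elemental object as a string
    (both are assumptions of this sketch). Returns None when `target`
    does not occur under `node`.
    """
    if node == target:
        return depth
    if isinstance(node, list):
        for child in node:
            steps = ungroup_steps(child, target, depth + 1)
            if steps is not None:
                return steps
    return None


# Composite object C of FIG. 3, modelled as a group of A and B.
C = ["A", "B"]
print(ungroup_steps(C, C))    # C itself is selectable without ungrouping
print(ungroup_steps(C, "A"))  # A requires ungrouping C first

# A deeper, hypothetical hierarchy requires more ungroup actions.
nested = [["A", "B"], "D"]
print(ungroup_steps(nested, "A"))
```

The count grows with the depth of the hierarchy, and the sketch does not model the further effort of regrouping afterwards.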
A second currently used explicit technique for overcoming or resolving composite object ambiguities is based on ‘having different modes of selection’. This technique enables a user to have direct access to any level of hierarchy, after setting the corresponding selection mode. This explicit technique is only applicable in cases where there is a limited number of hierarchy levels, and, if the hierarchy levels have an a priori meaning to the user. This technique is widely used in CAD applications, where users often need to define and use groups of objects. Again, the meaning of the grouping is that selecting one member of the group results in selecting all members of the entire group. The user may change operation of the selection mechanism properties to be either in a group mode or in a single object mode.
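A minimal Python sketch of such mode-dependent selection follows. The mode names "single" and "group", the function name `select`, and the representation of groups as sets of object names are all hypothetical; the sketch shows only that the same click resolves differently depending on a mode the user must set explicitly beforehand.

```python
def select(clicked, groups, mode):
    """Resolve a click on `clicked` according to the current selection mode.

    `groups` is a list of sets of object names (a hypothetical model).
    In "single" mode only the clicked object is selected; in "group" mode
    every member of the clicked object's group is selected.
    """
    if mode == "single":
        return {clicked}
    if mode == "group":
        for members in groups:
            if clicked in members:
                return set(members)
        return {clicked}  # object belongs to no group
    raise ValueError(f"unknown selection mode: {mode!r}")


groups = [{"line1", "line2", "arc1"}]
print(select("line1", groups, "single"))
print(sorted(select("line1", groups, "group")))
```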
A third currently used explicit technique for overcoming or resolving composite object ambiguities is based on using different procedures or protocols for performing the selection action itself, referred to as ‘extended selection’ procedures. For instance, MS WORD designates a double click action for selecting a word and a triple click action for selecting an entire paragraph. Another extended selection procedure is ‘click and drag’, whereby a user specifies an area of interest rather than a single point, either giving more information on a targeted object, or enabling the user to simultaneously select a plurality of targeted objects.
Two additional currently used explicit types of object selection techniques are ‘Pose Matching’ and ‘Path Tracing’, presented as part of the above mentioned "Per Sketch" Perceptually Supported Sketch Editor, which were developed specifically for disambiguating object selection in an environment with rich object interpretations, as described in the above Saund and Moran references. In the Pose Matching technique, a user performs a quick gesture that indicates the approximate location, orientation, and elongation of an intended or targeted object, while in the Path Tracing technique, a user traces an approximate path over the intended or targeted object's curve.
While the above described explicit techniques disambiguate the selection process, each one requires a user to have explicit knowledge of either the particular application modes or the particular selection procedures. Additionally, explicit selection techniques necessarily involve indirect user access to objects, thereby making the human-computer interaction dialogue less conversation like, and undesirably decreasing the invisibility and/or smoothness of the human-computer interface.
There are other commonly occurring scenarios in the field of human-computer interaction, focusing on object selection and pattern recognition, in which a single click of a pointing device is not enough for users to determine intended or targeted objects. Such scenarios feature one or more of the following types of pointing ambiguities: overlapping of objects, inaccurate pointing due to demanding and/or dynamic conditions such as small targets, moving targets, and a sub-optimal working environment, and, limited pointing devices such as mobile pointing devices, gesture recognition interfaces, and eye-movement interfaces.
To one of ordinary skill in the art, there is thus a need for, and it would be highly advantageous to have a method and system for implicitly resolving pointing ambiguities in human-computer interaction (HCI) by implicitly analyzing user movements of a pointer toward a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object. Moreover, there is a need for such an invention whereby implicitly analyzing the user movements of the pointer and predicting the user targeted object are done by using different categories of heuristic measures, which are widely applicable and extendable to resolving a variety of different types of pointing ambiguities such as composite object types of pointing ambiguities, involving different types of pointing devices besides the commonly used ‘mouse’ pointing device, and which are widely applicable to essentially any type of software and/or hardware methodology involving using a pointer, in general, and involving object selection, in particular.
SUMMARY OF THE INVENTION
The present invention relates to the field of human-computer interaction (HCI) focusing on using a pointer for selecting objects and, more particularly, to a method and system for implicitly resolving pointing ambiguities in human-computer interaction by implicitly analyzing user movements of a pointer toward a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object. Implicitly analyzing the user movements of the pointer and predicting the user targeted object are done by using different categories of heuristic (statically and/or dynamically learned) measures, such as (i) implicit user pointing gesture measures, (ii) application context, and (iii) number of computer suggestions of each best predicted user targeted object. Particular types of implicit user pointing gesture measures are (1) speed-accuracy tradeoff measures, and (2) exact pointer position measures. Two independent types of speed-accuracy tradeoff heuristic measures primarily used in the present invention are referred to as total movement time (TMT), and, amount of fine tuning (AFT) or tail-length (TL). A particular type of application context heuristic measure used in the present invention is referred to as containment hierarchy.
The present invention successfully overcomes several significant limitations of presently known techniques and methods for resolving pointing ambiguities in human-computer interaction. In particular, the present invention enables implicitly, rather than explicitly, resolving pointing ambiguities, thereby simplifying design and implementation of human-computer interfaces and human-computer interaction protocols, while preserving their transparent or invisible aspects to users. During object selection processes, explicit human-computer interaction operations are eliminated, accordingly, users are not required to learn how to perform different types of explicit operations in order to perform the object selection processes. The present invention provides the additional benefit of involving and encouraging natural ‘conversation like’ human-computer interaction rather than formal command dictation for using a pointer during object selection in an ambiguous object selection environment. An additional benefit of the present invention is the relatively fast user acquisition of a user intended targeted object in an ambiguous multiple object domain.
The present invention is widely applicable and extendable to resolving a variety of different types of pointing ambiguities such as composite object types of pointing ambiguities, involving different types of pointing devices and mechanisms besides the commonly used ‘mouse’ pointing device, and which are widely applicable to essentially any type of software and/or hardware methodology involving using a pointer, in general, and involving object selection, in particular. For example, the present invention is readily applicable to resolving pointing ambiguities occurring in object selection methodologies featured in computer aided design (CAD) software, object based graphical editing tools, and text editing.
Implementation of the method and system for implicitly resolving pointing ambiguities in human-computer interaction (HCI), of the present invention, involves performing or completing selected tasks or steps manually, automatically, or a combination thereof. Moreover, according to actual instrumentation and/or equipment used for implementing a particular preferred embodiment of the disclosed method and system, several selected steps of the present invention could be performed by hardware or by software on any operating system of any firmware or a combination thereof. For example, as hardware, selected steps of the invention could be performed by a computer chip or an electronic circuit. As software, selected steps of the invention could be performed by a plurality of software instructions being executed by a computer using any suitable operating system. In any case, selected steps of the present invention could be described as being performed by a data processor, such as a computing platform for executing a plurality of instructions.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention is herein described, by way of example only, with reference to the accompanying drawings, wherein:
FIG. 1 is a schematic diagram illustrating an example of the commonly occurring composite object type of pointing ambiguity;
FIG. 2 is a schematic diagram illustrating an example of the commonly occurring composite object type of pointing ambiguity with respect to text;
FIG. 3 is a schematic diagram illustrating one currently used technique of ‘avoiding the hierarchy’ for explicitly overcoming or resolving composite object types of ambiguities shown in FIGS. 1 and 2;
FIG. 4 is a schematic diagram illustrating use of the target (user targeted object) size parameter, W, for applying Fitts' Law to one (left side) and two (right side) dimensional settings, for evaluating the total movement time (TMT) type of speed-accuracy tradeoff heuristic measure, in accordance with the present invention;
FIG. 5 is a schematic diagram illustrating pointer speed as a function of time, for two typical pointer movements, A and B, where the circled area is the final acquisition stage, the ‘movement tail’, of the overall pointer movement, relating to the amount of fine tuning (AFT) or tail length (TL) type of speed-accuracy tradeoff heuristic measure, in accordance with the present invention;
FIG. 6 is a schematic diagram illustrating the experiment layout used for assessing the predictive power of the two types of speed-accuracy tradeoff heuristic measures, total movement time (TMT), and, amount of fine tuning (AFT) or tail length (TL), used in the present invention;
FIG. 7 is a schematic diagram illustrating computation of the tail length (TL), relating to the amount of fine tuning (AFT) or tail length (TL) type of speed-accuracy tradeoff heuristic measure, in accordance with the present invention;
FIG. 8 is a graph of experimental data and model prediction, using the Shannon formulation of Fitts' Law measure, of total movement time, TMT, plotted as a function of the index of difficulty, ID=log2[(A/W)+1];
FIG. 9 is a graph of experimental data and model prediction, using the fine tuning (AFT) or tail length (TL) measure, of tail length, TL, plotted as a function of the size of the target (user targeted object), W;
FIG. 10 is a graph of experimental data and Shannon formulation of Fitts' Law heuristic measure model prediction of total movement time, TMT, plotted as a function of the index of difficulty, ID=log2[(A/W)+1], for Near vs. Far Targets;
FIG. 11 is a graph of experimental data and (AFT) or (TL) heuristic measure model prediction of tail length, TL, plotted as a function of the size of the target (user targeted object), W, for Near vs. Far Targets;
FIG. 12 is a schematic diagram illustrating an exemplary preferred procedure for scoring candidate predicted user targeted objects, predicted by the computer from a set of candidate predicted user targeted objects according to the different categories of heuristic measures, for generating, by the computer, a best predicted user targeted object, in accordance with the present invention;
FIG. 13 is a schematic diagram illustrating an exemplary containment hierarchy as a particular type of application context heuristic measure;
FIG. 14 is a schematic diagram illustrating an exemplary simulation of an ambiguous multiple object domain of object based drawing or graphics software;
FIG. 15 is a schematic diagram illustrating an exemplary simulation of an ambiguous multiple object domain of text selection; and
FIG. 16 is a schematic diagram illustrating an exemplary simulation of an ambiguous multiple object domain of a computer aided design (CAD) environment.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention relates to the field of human-computer interaction (HCI) focusing on using a pointer for selecting objects and, more particularly, to a method and system for implicitly resolving pointing ambiguities in human-computer interaction by implicitly analyzing user movements of a pointer toward a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object. Implicitly analyzing the user movements of the pointer and predicting the user targeted object are done by using different categories of heuristic (statically and/or dynamically learned) measures, such as (i) implicit user pointing gesture measures, (ii) application context, and (iii) number of computer suggestions of each best predicted user targeted object. Particular types of implicit user pointing gesture measures are (1) speed-accuracy tradeoff measures, and (2) exact pointer position measures. Two independent types of speed-accuracy tradeoff heuristic measures primarily used in the present invention are referred to as total movement time (TMT), and, amount of fine tuning (AFT) or tail-length (TL). A particular type of application context heuristic measure used in the present invention is referred to as containment hierarchy.
It is to be understood that the invention is not limited in its application to the details of the order or sequence of steps of operation or implementation, or the construction, arrangement, composition, of the components, set forth in the following description, drawings, or examples. For example, the following description refers to the commonly used ‘mouse’ type of pointing device or mechanism associated with a pointer on a computer display screen, in order to illustrate implementation of the present invention. The invention is capable of other embodiments or of being practiced or carried out in various ways.
Also, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting. In particular, the description of the present invention refers to the term ‘user targeted object’ for clearly indicating that an ‘object’ is ‘targeted’ by a ‘user’. Prior art of the relevant related technology typically refers to this entity simply as the ‘target’ or ‘object’ of a ‘user’ or associated with the intention of a ‘user’. Accordingly, throughout the present disclosure, it is to be clearly understood that the presently used term ‘user targeted object’ is equivalent in meaning and functionality to the commonly used prior art terms of ‘target’ and ‘object’.
Steps, components, operation, and implementation of a method and system for implicitly resolving pointing ambiguities in human-computer interaction by analyzing and predicting user movements of a pointer, according to the present invention, are better understood with reference to the following description and accompanying drawings.
Two preferred embodiments of the method and system for implicitly resolving pointing ambiguities in human-computer interaction are described herein. In the first preferred embodiment of the invention, implicitly analyzing user movements of a pointer towards a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object are done by using three categories of heuristic measures, that is, (i) implicit user pointing gesture measures, (ii) application context, and (iii) number of computer suggestions of each best predicted user targeted object. In the first preferred embodiment, each sequence of the method includes a step wherein there is suggesting, preferably, by computer visual indication, a best predicted user targeted object to the user, whereby the user either accepts the computer suggested best predicted user targeted object as the user targeted object and as correct, or, rejects the computer suggested best predicted user targeted object as not the user targeted object and as incorrect.
Following each sequence of rejecting the suggested best predicted user targeted object, there is looping back along with feedback of generated heuristic data and information, for as many times as are needed, into the sequence of the method for suggesting another best predicted user targeted object, until the user accepts the suggested best predicted user targeted object as the user targeted object and as correct. Accepting the computer suggested best predicted user targeted object as the user targeted object and as correct is performed by the user performing an acceptance action, for example, ‘clicking’, using the pointing mechanism, for example, the mouse, with the pointer, indicative that the pointing ambiguities are resolved. In the first preferred embodiment, implicitly analyzing user movements of the pointer toward the user targeted object and suggesting the best predicted user targeted object are done ‘before’ or ‘prior to’ the user accepting the suggested best predicted user targeted object as the user targeted object and as correct, for example, by ‘clicking’ the pointing mechanism. Accordingly, the first preferred embodiment of the present invention is herein also referred to as the ‘pre-click suggestion’ preferred embodiment.
In the second preferred embodiment of the invention, implicitly analyzing the user movements of the pointer towards a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object are done by using two categories of heuristic measures, that is, (i) implicit user pointing gesture measures, and (ii) application context. In the second preferred embodiment, each sequence of the method includes a step wherein the user selects a pointer position located in a vicinity of the user targeted object, followed by a step wherein the user selects a computer best predicted user targeted object. In this preferred embodiment, the computer selects the best predicted user targeted object which is not suggested to the user for optionally accepting or rejecting, but, rather is either correct or incorrect, at completion of each sequence, thus, accounting for the absence of using the third category of heuristic measures, that is, category (iii) number of computer suggestions of each best predicted user targeted object, which is used in the first preferred embodiment of the invention. In the second preferred embodiment, if the computer generated best predicted user targeted object selected by the computer is not the user targeted object and is incorrect, the user optionally decides to repeat the sequence anew, for as many times as are needed, whereby, in contrast to the first preferred embodiment, there is no looping back along with feedback of generated heuristic data and information into the sequence of the method, until the user selects the computer best predicted user targeted object as the user targeted object.
In the second preferred embodiment of the invention, the ‘selecting action’, in the step of selecting, by the user, a pointer position located in the vicinity of the user targeted object, is performed, for example, by the user ‘clicking’ the pointing mechanism, for example, the mouse, of the pointer located in the vicinity of the user targeted object. In this preferred embodiment, implicitly analyzing user movements of the pointer towards the user targeted object and generating the computer best predicted user targeted object are done ‘after’ or ‘following’ the user selecting the pointer position located in the vicinity of the user targeted object, for example, by ‘clicking’ the pointing mechanism. Accordingly, the second preferred embodiment of the present invention is herein also referred to as the ‘post-click prediction’ preferred embodiment.
Relevant background and details of the different categories of heuristic (statically and/or dynamically learned) measures, such as (i) implicit user pointing gesture measures, (ii) application context, and (iii) number of computer suggestions of each best predicted user targeted object, used for implicitly analyzing user movements of a pointer towards a user targeted object located in an ambiguous multiple object domain and predicting the user targeted object are provided herein. This description is used for enabling implementation of the above indicated two preferred embodiments of the method and system of the present invention, that is, the first or ‘pre-click suggestion’ preferred embodiment, and, the second or ‘post-click prediction’ preferred embodiment, each of which is described in further detail hereinafter.
Implicit User Pointing Gesture Measures. The first category of heuristic (statically and/or dynamically learned) measures for implicitly analyzing the user movements of the pointer and predicting the user targeted object, according to the present invention, is implicit user pointing gesture measures. Particular types of implicit user pointing gesture measures are (1) speed-accuracy tradeoff measures, and (2) exact pointer position measures. Two independent types of speed-accuracy tradeoff heuristic measures primarily used in the present invention are referred to as total movement time (TMT), and, amount of fine tuning (AFT) or tail-length (TL).
Different techniques of gesture-based object selection are currently used in contemporary applications for overcoming the selection ambiguity problem. These techniques require a user to perform special gestures that describe the user targeted object, such as encircling, pose matching, and path tracing as described above. Since all the above techniques require the user to perform a specific gesture describing the user targeted object, they can be referred to as explicit techniques. While these methods successfully disambiguate or resolve ambiguities present in the selection process, they require some knowledge and practice from the user, and thus may be regarded to an extent as non-intuitive.
The present invention is different from the above explicit methods by focusing on the very basic methods of pointing and clicking using a pointing mechanism, such as a computer ‘mouse’. Consequently, the present invention resolves pointing ambiguities by using only parameters or measures which are associated with natural pointing gestures. A user is not expected to learn any special selection gesture. In fact, a user may not be aware of the information provided for resolving the pointing ambiguities.
In the present invention, implicitly analyzing user movements of a pointer and predicting a user targeted object located in an object domain, featuring some extent of ambiguity, are done by using, for example, the category of implicit user pointing gesture measures, where, particular types of implicit user pointing gesture measures are (1) speed-accuracy tradeoff measures, and (2) exact pointer position measures. Two independent types of speed-accuracy tradeoff heuristic measures primarily used in the present invention are referred to as total movement time (TMT), and, amount of fine tuning (AFT) or tail-length (TL). These implicit user pointing gesture heuristic measures are related to the speed-accuracy profile of the movement of the pointer and to the exact location of the pointer at the time of selecting a pointer location by using a pointing device or mechanism, such as by clicking on a ‘mouse’ type of pointing mechanism. It is herein described that these basic physical measures, that are a part of every pointing gesture, contain sufficient information for efficiently resolving pointing ambiguities.
Intuitively, the smaller a user targeted object is, the more time it will take to acquire it. This rule, generally referred to as the Speed-Accuracy tradeoff, as reviewed by Plamondon, R. and Alimi, A. M., in Speed/accuracy Tradeoffs In Target-directed Movements, Behavioral and Brain Sciences 20(2), 1997, 279-349, can supply a general scale factor for identifying a particular user targeted object. This scale factor is a very strong disambiguation clue, especially in the scenario of composite object ambiguity, in which possible user targeted object candidates tend to vary significantly in size. Following is a brief description of speed-accuracy tradeoff heuristic measures, followed by a description of two particular types of speed-accuracy tradeoff heuristic measures, total movement time (TMT) associated with the known Fitts' Law, and amount of fine tuning (AFT) or tail-length (TL), both based on different aspects of the speed-accuracy tradeoff, and both uniquely applied for resolving pointing ambiguities, in accordance with the present invention.
Speed-Accuracy Tradeoff Heuristic Measures. The phenomenon of speed-accuracy tradeoff as a measure has been studied in detail starting as early as the late nineteenth century. Empirical experiments conducted by Woodworth, R. S., described by him in The Accuracy of Voluntary Movement, Psychological Reviews Monograph Supplements, 3(3), 1899, involved comparing target directed movement in cases where the eyes of subjects were open and closed. Subjects were instructed to perform a directed movement in varying speed limits. Woodworth's main finding was that average spatial errors increase with movement speed in the eyes-open condition, but not in the eyes-closed condition, where the error rate was constantly high. Woodworth's account for the result was that in the eyes-closed condition, the subjects' moves were pre-programmed and guided only by a ballistic ‘initial impulse’, whereas in normal movements represented by the eyes-open condition, the ‘initial impulse’ phase is followed by a visual feedback correction phase that Woodworth named "current control". The faster the movement is, the less effective the visual feedback is, and thus at very high velocities there is a convergence in performance of the eyes-open and eyes-closed conditions.
Total Movement Time (TMT). Woodworth's experiments were followed by extensive work of other researchers, trying to quantify the speed-accuracy tradeoff phenomenon. These efforts resulted in the formulation of Fitts' Law, as described by Fitts, P. M., in The Information Capacity of the Human Motor System in Controlling the Amplitude of Movement, Journal of Experimental Psychology, 47(6), 1954, 381-391, stating that the total time it takes to perform a directed movement task is a function of the target (user targeted object) size and distance of the target from a pre-determined reference point. Fitts instructed subjects to move a stylus, or pointer, as fast as possible between two targets (user targeted objects) varying in widths and distances from the pre-determined reference point. Fitts observed that the total time taken to perform the task, herein, referred to as the ‘total movement time’ (TMT), is a function of the width of the target, W, and its distance, A, from a pre-determined reference point, whereby this phenomenon is described by the formula: TMT = a + b*log2[(2*A)/W], where a and b are empirically determined parameters, herein, also referred to as the Fitts' Law parameters, and the asterisk symbol, *, is the multiplication operator. The factor log2[(2*A)/W] is termed the index of difficulty, and describes the difficulty of performing a given task in ‘bit’ units.
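As a minimal sketch, the original Fitts' Law relation, TMT = a + b*log2[(2*A)/W], can be evaluated as follows. The function names and the numeric values chosen for a and b are illustrative assumptions and are not taken from the present disclosure.

```python
import math


def index_of_difficulty(A, W):
    """Original Fitts index of difficulty, log2[(2*A)/W], in bits."""
    return math.log2((2.0 * A) / W)


def fitts_tmt(A, W, a, b):
    """Predicted total movement time: TMT = a + b * log2[(2*A)/W]."""
    return a + b * index_of_difficulty(A, W)


# Hypothetical parameters: a = 0.1 s, b = 0.2 s/bit.
# A distance of 256 and a width of 8 give an index of difficulty of 6 bits.
print(index_of_difficulty(256, 8))
print(fitts_tmt(256, 8, a=0.1, b=0.2))
```

Halving the target width, or doubling the distance, adds one bit to the index of difficulty and therefore b time units to the predicted TMT.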
Fitts' Law has proven very robust, and was replicated successfully in numerous experiments, covering a wide range of pointing types of movements, experimental conditions and pointing type devices and mechanisms. Several models have been proposed to explain the validity of Fitts' Law. Some of the proposed models, such as the ‘Iterative Corrections Model’ described by Crossman, E. R. F. W. and Goodeve, P. J., in Feedback Control of Hand-Movement and Fitts' Law, Quarterly Journal of Experimental Psychology, 35A, 1963/1983, 251-278, are based on the controlled visual feedback movement phase. Other models, such as the ‘Impulse Variability Model’, proposed by Schmidt et al., in Motor Output Variability: A theory for the Accuracy of Rapid Motor Acts, Psychological Review, 86, 1979, 415-451, are based on the initial impulse of the pointer type movements, while other models, such as the ‘Optimized Initial Impulse Model’ proposed by Meyer et al., in Optimality in Human Motor Performances: Ideal Control of Rapid Aimed Movements, Psychological Review, 95, 1988, 340-370, are hybrids of the two factors of visual feedback movement and initial impulse movement. Yet a different kind of model or theory, the ‘Kinematic Theory’, proposed by Plamondon, R., in The Generation of Rapid Human Movements, Rapport Technique EPM/RT-93-5, École Polytechnique de Montreal, Feb. 1993, states that the speed-accuracy tradeoff phenomenon is an inherent constraint that emerges from the global neuromuscular system involved in pointing types of movements, as described in the previously cited Plamondon, R. and Alimi, A. M. reference above.
Application of Fitts' Law. The wide acceptance and robustness of Fitts' Law encouraged many human-computer interaction (HCI) researchers to adopt it as a tool for analyzing user performance in graphical user interface (GUI) tasks, as reviewed in the previously cited MacKenzie, I. S. reference above. Typically, Fitts' Law is used for predicting the time it takes to perform graphical pointing tasks. In the present invention, the traditional use of Fitts' Law is reversed, and applied as an estimator, not of total movement time (TMT), but rather of the size of a user targeted object given the total movement time (TMT) and distance of the user targeted object from a pre-determined reference point. In other words, instead of looking at a certain user targeted object and estimating the total movement time it takes to acquire it, according to the present invention, there is retrospectively measuring and analyzing the total movement time (TMT) taken for acquiring the user targeted object for drawing conclusions and predictions of the size of the user targeted object.
As a basis for devising the required formula, the present invention uses the Shannon formulation of Fitts' law, which includes a slight variant of the formulation of the index of difficulty factor of Fitts' law, as follows: TMT = a + b*log2[(A/W)+1], where the terms A, W, a, and b, are the same as previously defined above according to Fitts' Law.
The Shannon formulation is preferred for implementing the present invention, as it is shown by prior art to have a better fit with observations, and always yields a positive index of difficulty as opposed to the original index of difficulty formulation, log2[(2*A)/W], that may sometimes yield negative values, that is, when A<W/2, as described by MacKenzie, I. S., in Movement Time Prediction in Human-Computer Interfaces, Readings In Human-Computer Interaction, 2nd edition, Baecker, R. M., Buxton, W. A. S., Grudin, J. and Greenberg, S., (eds.), Los Altos, Calif., USA: Kaufmann, 1995, 483-493.
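The difference between the two index of difficulty formulations, log2[(2*A)/W] and log2[(A/W)+1], can be checked numerically. The sketch below also inverts the Shannon relation TMT = a + b*log2[(A/W)+1] algebraically for W, as one illustration of reading the law ‘in reverse’; the function names and parameter values are hypothetical, and the closed-form inversion is an illustration only, not a procedure stated in the present disclosure.

```python
import math


def id_original(A, W):
    """Original index of difficulty, log2[(2*A)/W]; negative when A < W/2."""
    return math.log2((2.0 * A) / W)


def id_shannon(A, W):
    """Shannon index of difficulty, log2[(A/W)+1]; positive for A > 0, W > 0."""
    return math.log2((A / W) + 1.0)


def width_from_tmt(tmt, A, a, b):
    """Solve TMT = a + b*log2[(A/W)+1] for W (requires tmt > a and b > 0)."""
    bits = (tmt - a) / b
    return A / (2.0 ** bits - 1.0)


# A short movement towards a large target: A = 10, W = 40, so A < W/2.
print(id_original(10, 40))  # negative
print(id_shannon(10, 40))   # positive

# Round trip with hypothetical parameters a = 0.1, b = 0.2.
tmt = 0.1 + 0.2 * id_shannon(256, 8)
print(width_from_tmt(tmt, 256, a=0.1, b=0.2))  # approximately 8
```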
For the reversed version of Fitts' Law, that is, the Shannon formulation, there is as input, a measured total movement time, TMT, a known distance of movement, A, and, several candidates of user targeted objects with sizes W_i. Given these parameters, the extent by which each candidate user targeted object fits the model, herein referred to as fit_i, is written in the following form: fit_i = |TMT - a - b*log2[(A/W_i)+1]|, whereby, the Fitts' Law empirical parameters, a and b, are preferably computed once, and reflect the properties and characteristics, such as type of pointing device or mechanism, and, computer speed, of the specific human-computer interaction environment or scenario. For better user personalization, Fitts' Law parameters can also be re-computed for each user.
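A minimal sketch of scoring candidates with the reversed law follows. It assumes that the fit of candidate i is the absolute difference between the measured TMT and the Shannon-formulation prediction for size W_i, so that a smaller value indicates a better fit; the function names, the candidate sizes, and the values of a and b are hypothetical.

```python
import math


def shannon_tmt(A, W, a, b):
    """Model prediction: TMT = a + b * log2[(A/W) + 1]."""
    return a + b * math.log2((A / W) + 1.0)


def fit_scores(tmt, A, widths, a, b):
    """Assumed fit: absolute residual between measured and predicted TMT."""
    return [abs(tmt - shannon_tmt(A, w, a, b)) for w in widths]


def best_candidate(tmt, A, widths, a, b):
    """Index of the candidate whose size best explains the measured TMT."""
    scores = fit_scores(tmt, A, widths, a, b)
    return min(range(len(widths)), key=lambda i: scores[i])


# Three nested candidates (for example letter, word, paragraph) with sizes W_i.
widths = [4.0, 20.0, 100.0]
measured_tmt = 0.91  # seconds, hypothetical measurement
print(fit_scores(measured_tmt, 300.0, widths, a=0.1, b=0.2))
print(best_candidate(measured_tmt, 300.0, widths, a=0.1, b=0.2))
```

With these assumed values the mid-sized candidate gives the smallest residual, since its predicted time of 0.9 s is closest to the measured 0.91 s.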
Fitts' original experimental setting instructed subjects to perform one dimensional target (user targeted object) selection. The effective size of the target (user targeted object) was consequently the length of the axis of the target (user targeted object) along the direction of approach by the stylus, or pointer. When extending Fitts' Law to two dimensional settings, a different definition is used for the size, W, of the target (user targeted object), for correctly representing the difficulty of acquiring the target (user targeted object). The definition used in the present invention for the size, W, of the target (user targeted object), is the length of the smaller axis of a box bounding the target (user targeted object), as shown in FIG. 4, a schematic diagram illustrating use of the target (user targeted object) size parameter, W, for applying Fitts' Law to one (left side) and two (right side) dimensional settings, for evaluating the total movement time (TMT) type of speed-accuracy tradeoff heuristic measure. This measure is shown to achieve good fit, and is computationally simple, as indicated by MacKenzie, I. S. and Buxton, W., in Extending Fitts' Law to Two-Dimensional Tasks, Proceedings of the CHI'92 Conference on Human Factors in Computing Systems, New York: ACM, USA, 1992, 219-226.
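The two dimensional size definition above, the length of the smaller axis of a box bounding the target, can be sketched as follows. An axis-aligned bounding box and the helper names are assumptions made for illustration.

```python
def bounding_box(points):
    """Return (width, height) of the axis-aligned box bounding the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs), max(ys) - min(ys))


def target_size(points):
    """Size W for two dimensional settings: the smaller bounding-box axis."""
    width, height = bounding_box(points)
    return min(width, height)


# A wide, short object such as a line of text: 120 units wide, 15 units high.
text_line = [(0, 0), (120, 0), (120, 15), (0, 15)]
print(target_size(text_line))
```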
Amount of Fine Tuning (AFT) or Tail Length (TL). Fitts' Law measurements are based on the total movement time (TMT) it takes to perform a task. For implicitly resolving pointing ambiguities, in the present invention, there is introduced a complementary speed-accuracy tradeoff measure that is based on inner properties of the pointer movement profile. The ‘tail length’ (TL) parameter captures the amount of ‘fine tuning’ needed for acquiring a target (user targeted object). The relationship between pointer movement tail length and size of the user targeted object, and validity of the relationship as a predictor of user targeted objects was studied.
Pointer movements towards a user targeted object typically consist of an acceleration phase followed by a deceleration phase. Pointer movements requiring only approximate terminal accuracy were shown to have acceleration and deceleration phases with similar lengths, resulting in a symmetric, Gaussian like, velocity profile. However, for pointer movements characterized by a high terminal accuracy, this symmetry breaks down, as the deceleration phase tends to be longer than the acceleration phase, in proportion to the required terminal accuracy, as described by Taylor, F. V. and Birmingham, H. P., in The Acceleration Pattern of Quick Manual Corrective responses, Journal of Experimental Psychology, 38, 1948, 783-795, and described by Beggs, W. D. A. and Howarth, C. I., in The Movement of the Hand Towards a Target, Quarterly Journal of Experimental Psychology, 24, 1972, 448-453.
The deceleration phase of pointer movements with high terminal accuracy typically ends with an asymptote of practically zero velocity that corresponds to the minute movements before final target (user targeted object) acquisition, or the ‘fine tuning’ of the pointer movement. The smaller the target (user targeted object) is, the more fine tuning required for acquiring the target (user targeted object). In the present invention, this fine tuning stage is termed the ‘movement tail’, as shown in FIG. 5, a schematic diagram illustrating pointer speed as a function of time, for two typical pointer movements, A and B, where the circled area is the final acquisition stage, the ‘movement tail’, of the overall pointer movement, relating to the amount of fine tuning (AFT) or tail length (TL) type of speed-accuracy tradeoff heuristic measure. This type of speed-accuracy tradeoff heuristic measure capturing the length of the ‘movement tail’ contains additional information relating to the pointer total movement time (TMT) described and evaluated by Fitts' Law. In particular, as can be seen in FIG. 5, the movement tail is characterized by a dramatic decrease in pointer speed, corresponding to a high deceleration value. In the present invention, the dramatic deceleration value is used for identifying the beginning of the fine tuning stage, where the specific threshold value of deceleration is empirically decided upon for each particular application environment.
The amount of fine tuning (AFT) or tail length (TL) type of speed-accuracy tradeoff heuristic measure has several features that make it attractive for applying to methods and systems for resolving pointing ambiguities. Firstly, this heuristic measure is local by nature, and hence computationally simpler than the total movement time (TMT) heuristic measure based on Fitts' Law. Specifically, the information needed for evaluating or computing the (AFT) or (TL) measure resides in the very last phase of the pointer movement, which does not require knowledge of when and where the pointer movement originated. Secondly, the (AFT) or (TL) measure is essentially independent of the Fitts' Law parameters, in that, pointer movements violating the model of total movement time (TMT) as represented by Fitts' Law, may still conform to the end of movement scheme as represented by the (AFT) or (TL) measure.
The local nature of the (AFT) or (TL) measure is especially suitable for real life pointing gestures that are sometimes not really target directed. In particular, real life human-computer interaction (HCI) pointing scenarios are sometimes different from the strict target directed movements of laboratory tests. For example, a user may start the pointer movement in an indecisive manner, in scenarios where the user is not sure of the final user targeted object at the start of the pointing movement gesture. In these scenarios, it can be subjective and quite difficult to identify the time of actual pointer movement start needed for evaluating the Fitts' Law measure of total movement time (TMT) and therefore, the Fitts' Law empirical parameters, a and b.
In contrast, the (AFT) or (TL) measure remains valid, as it is invariant to absolute speed of the pointer, since this measure is based on acceleration rather than speed. This invariance makes the (AFT) or (TL) measure robust when applied to different users, computers, and pointing device or mechanisms, for performing a particular task involving pointer movements, in general, and pointing ambiguities, in particular. However, the value of the acceleration threshold is somewhat dependent on the nature of the particular task, and may need to be empirically adjusted for different kinds of task profiles. More sophisticated movement parsing methods may be employed to extract the tail length of the pointer movement profile. However, the (AFT) or (TL) measure is preferred over more sophisticated pointer movement parsing algorithms because of its computational simplicity, which is important as it is preferably incorporated into computer applications executed extensively during computer runtime.
Given the locality of the amount of fine tuning (AFT) or tail length (TL) type of speed-accuracy tradeoff heuristic measure, it is expected to be highly dependent on the size of a user targeted object, yet largely independent of the distance of the user targeted object from a predetermined reference point. Consequently, the behavior of the (AFT) or (TL) measure is modeled as a function only of the size of the user targeted object. As shown in the empirical experiment described below, the relation between the amount of fine tuning (AFT) or tail length (TL) and size of the user targeted object, W, is well described as a second degree polynomial having the form of:
TL = a*W^2 + b*W + c
where a, b, and c, are empirically determined parameters, herein, also referred to as the (AFT) or (TL) parameters, and the asterisk symbol, *, is the multiplication operator. The size of the user targeted object, W, used for the (AFT) or (TL) heuristic measure of the present invention follows the same definition proposed in the previously cited MacKenzie and Buxton reference and used in the herein above described Shannon formulation of Fitts' Law, that is, the length of the smaller axis of a box bounding the target (user targeted object), as shown in FIG. 4.
Empirical Experiment. An empirical experiment was performed for a first objective of assessing the predictive power of each of the two types of speed-accuracy tradeoff heuristic measures, total movement time (TMT), and, amount of fine tuning (AFT) or tail length (TL), used in the present invention, as well as their combined power. A second objective was to test the assumption that the (AFT) or (TL) type of speed-accuracy tradeoff heuristic measure can be modeled as a function of the size of a user targeted object only, indicating its general independence of distance of the user targeted object from a reference point, located in an object domain.
The experiment was performed using a Windows NT computer operating system, with a 17″ display monitor set to 600×800 pixel resolution. The computer executed a computer application including a pointing device or mechanism, in particular, a ‘mouse’, featuring a pointer dynamically moveable throughout an object domain and running inside a 450×630 pixel window. Five persons or subjects manually operated the pointing device or mechanism, the ‘mouse’, with their preferred hand, where in this experiment all five subjects used their right hand.
The experimental procedure was as follows. The subjects were instructed to select targets, herein, also referred to as user targeted objects, as fast as they could with the mouse pointing mechanism, that is, each subject placed the cursor or pointer on the target (user targeted object) and pressed the left button of the mouse. The subjects pressed a small "Next" button at the left side of the computer display screen before each next target (user targeted object) was presented for selection by the user. This ensured that all pointer movements initiated at a single pre-determined, fixed or constant reference point in the object domain. FIG. 6 is a schematic diagram illustrating the experiment layout used for assessing the predictive power of the two types of speed-accuracy tradeoff heuristic measures, total movement time (TMT), and, amount of fine tuning (AFT) or tail length (TL), used in the present invention.
All targets (user targeted objects) in the object domain were square-shaped. The manipulated controlled variables were the size, W, of the target (user targeted object), expressed in pixels, in particular, 12, 25, 50, 100, and 200 pixels, as the length of the smaller axis of a box bounding the target (user targeted object), and the horizontal distance, A, also expressed in pixels, in particular, 200 and 350 pixels, from the target (user targeted object) to a pre-determined reference starting point in the object domain. The vertical position of the targets (user targeted objects) was distributed randomly along the vertical axis of the screen to avoid automatic pointer movement. The order of target (user targeted object) appearance was random with each given target (user targeted object) size and horizontal location. Target (user targeted object) selection was repeated 15 times for each subject for each of the 10 combinations of 2 different target (user targeted object) locations and 5 different target (user targeted object) sizes, amounting to a total of 150 target (user targeted object) selections per subject, each being a separate time or event of a series of pointer movements. The dependent variables were the total movement time (TMT) for the Fitts' Law measure, and length of the pointer movement tail (TL) for the amount of fine tuning (AFT) or tail length (TL) measure.
The dependent variables were computed as follows. The location of the pointer position in the object domain was sampled at constant time intervals during each pointer movement. A vector of the Pythagorean distances between each two consecutive samples was computed, to represent the pointer movement profile. Since absolute measurement units were not relevant for analyzing and understanding the results of this experiment, the computed distance vector was in fact regarded as a speed vector, without dividing it by the constant time interval between samples.
The total movement time (TMT) was represented simply as the number of pointer samples during each movement, that is, the length of the speed vector. For the amount of fine tuning (AFT) or tail length (TL) measure, an acceleration vector was computed, containing the differences between each two consecutive entries in the speed vector. The length of the tail was defined as the number of sample points in which acceleration was below 10 pixels per sample point, counting back from the end of the acceleration vector. This computation captured the notion of the fine tuning stage of the pointer movement previously described above, as shown in FIG. 7, a schematic diagram illustrating computation of the tail length (TL), relating to the amount of fine tuning (AFT) or tail length (TL) type of speed-accuracy tradeoff heuristic measure. The value of 10 pixels per sample point was extracted empirically prior to the experiment, by examining acceleration profiles of target directed movement during similar conditions. The resulting data was analyzed with no outlier removal. Out of the 150 independent times or events of a series of pointer movements measured for each subject, 100 were used for developing a prediction model, while the remaining 50 were used for assessing the prediction power of the model.
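The computation of the two dependent variables can be sketched as follows. This is an illustrative reading, not the implementation used in the experiment: the function names are hypothetical, and "acceleration below 10 pixels per sample point" is assumed here to refer to the magnitude of the acceleration, so that the sharp deceleration preceding the tail ends the count. The sample coordinates are made up.

```python
import math

def speed_vector(samples):
    """Distances between consecutive (x, y) samples taken at a constant interval."""
    return [math.dist(p, q) for p, q in zip(samples, samples[1:])]

def tail_length(speeds, threshold=10.0):
    """Count entries, back from the end of the acceleration vector,
    whose magnitude stays below the threshold (assumed reading)."""
    accel = [s2 - s1 for s1, s2 in zip(speeds, speeds[1:])]
    count = 0
    for value in reversed(accel):
        if abs(value) >= threshold:
            break
        count += 1
    return count

# Made-up horizontal movement: accelerate, decelerate sharply, then fine tune.
samples = [(0, 0), (5, 0), (25, 0), (65, 0), (105, 0),
           (125, 0), (130, 0), (132, 0), (133, 0), (133, 0)]
speeds = speed_vector(samples)
tmt = len(speeds)          # TMT is the length of the speed vector
tl = tail_length(speeds)   # TL counts the low-acceleration samples at the end
```

Because the tail is counted backwards from the selection event, the computation needs no knowledge of where or when the movement started, which is the locality property discussed earlier.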
For each size, W, of the target (user targeted object), and horizontal distance, A, from the target (user targeted object) to the predetermined reference starting point in the object domain, for 5*2=10 sampling categories, the average values of the measures were computed for each subject, that is, resulting in 10 average category values for each subject. For each subject, linear regression was used to determine the relation between the empirical data (average category values) and each of the measures.
In the case of applying the Shannon formulation of Fitts' Law measure, the regression was done between parameters y=TMT, and x=log2[(A/W)+1], resulting in evaluation of the Fitts' Law empirical parameters a and b from the above described Shannon formulation of Fitts' Law formula:
TMT = a + b*log2[(A/W)+1]
where A is the horizontal distance from the target (user targeted object) to the predetermined reference starting point in the object domain, and W is the size of the target (user targeted object) expressed as the length of the smaller axis of a box bounding the target (user targeted object). Experimental data and model prediction using the Shannon formulation of Fitts' Law measure are graphically shown in FIG. 8, a graph of experimental data and model prediction, using the Shannon formulation of Fitts' Law measure, of total movement time, TMT, plotted as a function of the index of difficulty, ID=log2[(A/W)+1].
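A minimal sketch of this regression, using ordinary least squares on the index of difficulty, is given below. The function name is hypothetical, and the records are synthetic values generated from assumed parameters (a=4, b=5) purely to show that the fit recovers them; they are not the experimental data.

```python
import math

def fit_fitts(records):
    """records: iterable of (A, W, TMT). Returns (a, b) by ordinary least squares
    of y = TMT against x = log2(A/W + 1)."""
    xs = [math.log2(A / W + 1) for A, W, _ in records]
    ys = [t for _, _, t in records]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Synthetic category averages for the 5 sizes and 2 distances of the experiment layout.
records = [(A, W, 4.0 + 5.0 * math.log2(A / W + 1))
           for A in (200, 350) for W in (12, 25, 50, 100, 200)]
a, b = fit_fitts(records)
```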
In the case of applying the amount of fine tuning (AFT) or tail length (TL) measure, the regression was performed between parameters y=TL, x1=W, and x2=W^2, resulting in evaluation of the (AFT) or (TL) empirical parameters a, b, and c, from the above described (AFT) or (TL) measure formula:
TL = a*W^2 + b*W + c
Experimental data and model prediction using the fine tuning (AFT) or tail length (TL) measure are graphically shown in FIG. 9, a graph of experimental data and model prediction, using the fine tuning (AFT) or tail length (TL) measure, of tail length, TL, plotted as a function of the size of the target (user targeted object), W.
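The second degree polynomial regression can be sketched with the normal equations, solved here by Gauss-Jordan elimination so that the example needs no external library. The function name is hypothetical, and the tail lengths are synthetic values generated from assumed parameters (a=0.001, b=-0.4, c=45), not the experimental data.

```python
def fit_quadratic(ws, tls):
    """Least-squares (a, b, c) for TL = a*W^2 + b*W + c via the normal equations."""
    rows = [[w * w, w, 1.0] for w in ws]
    # Augmented 3x4 system (X^T X | X^T y).
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * t for r, t in zip(rows, tls))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

# Synthetic tail lengths for the five target sizes used in the experiment.
ws = [12, 25, 50, 100, 200]
tls = [0.001 * w * w - 0.4 * w + 45.0 for w in ws]
a, b, c = fit_quadratic(ws, tls)
```

For larger data sets a library least-squares routine would be the more usual and numerically safer choice; the hand-written solver is used here only to keep the sketch self-contained.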
Both the Fitts' Law (Shannon formulation) measure and the (AFT) or (TL) measure were accurately described by the respective regression models. Two fit to model values are reported for each speed-accuracy tradeoff heuristic measure:
Average individual fit: an average of the fit to model levels of each individual subject (Sub).
Average general fit: the fit to model of the averaged data of all five subjects.
The regression parameter, r^2, the standard error, Se (coefficient), for each coefficient or empirical parameter of each speed-accuracy tradeoff measure, that is, a and b, for the Fitts' Law (Shannon formulation) measure, and, a, b, and c, for the (AFT) or (TL) measure, and the standard error of the parameter, y, for each speed-accuracy tradeoff measure, are presented in the following four tables, two tables for each speed-accuracy tradeoff heuristic measure.

Fitts' Law (Shannon formulation) measure: TMT = a + b*log2[(A/W)+1]

Coefficients, Individual Subjects
          Sub1    Sub2    Sub3    Sub4    Sub5    Average
r^2       0.92    0.96    0.78    0.87    0.8     0.866
Se (b)    0.60    0.57    0.94    0.8     1.11    0.8
Se (a)    1.8     1.7     2.86    2.42    3.67    2.49
Se (y)    2.3     2.17    2.86    3.08    4.28    2.94

Coefficients, Average Data
          Average
r^2       0.966
Se (b)    0.4
Se (a)    1.23
Se (y)    1.56

(AFT) or (TL) measure: TL = a*W^2 + b*W + c

Coefficients, Individual Subjects
          Sub1    Sub2    Sub3    Sub4    Sub5    Average
r^2       0.99    0.98    0.98    0.98    0.94    0.97
Se (a)    0.00    0.00    0.00    0.00    0.00    0.00
Se (b)    0.01    0.06    0.03    0.04    0.08    0.04
Se (c)    0.41    2.18    1.35    1.7     2.91    1.71
Se (y)    0.4     2.1     1.3     1.64    2.8     1.65

Coefficients, Average Data
          Average
r^2       0.988
Se (a)    0.00
Se (b)    0.036
Se (c)    1.3
Se (y)    1.25
The two values of each parameter for each model both indicate the degree by which the speed-accuracy tradeoff heuristic measures are accurately described using the respective formula. As seen in the tables, with the Fitts' Law (Shannon formulation) measure, the average individual fit to model is r^2=0.866, and the fit for the average data is r^2=0.966. With the (AFT) or (TL) measure, the average individual fit to model is r^2=0.97, and the fit for the average data is r^2=0.988. It is noted that the (AFT) or (TL) measure fit is higher than the Fitts' Law (Shannon formulation) measure fit in both cases.
Predictive Power. The high fit to model values of the two models reflect that the averaged data are very well represented by the derived models, even for each individual subject. However, this accurate representation does not ensure good prediction rates, as high variance in the data might make model prediction ineffective. As prediction rate is the ultimate goal of using the speed-accuracy tradeoff heuristic measures, this issue was confronted directly, by using the regression curve computed from 100 pointer movements of each subject, as a model for predicting the size of the target (user targeted object) of the remaining 50 pointer movements for that subject. The prediction test was performed on 4 ambiguous target (user targeted object) combinations: [target 1 vs. target 5], [target 1 vs. target 3], [target 1 vs. target 2], and [target 1 vs. target 3 vs. target 5], where target 1 is the smallest target and target 5 is the biggest target. For instance, in the [target 1 vs. target 5] condition, the prediction test was performed as follows: from the 50 independent pointer movements that were not included in the regression model, the pointer movements that were towards either target 1 or target 5 were extracted. Then, a scoring method, described in detail below in the implementation guidelines section, was used for predicting whether each pointer movement was towards target 1 or towards target 5. The prediction was then verified using the knowledge of the real target (user targeted object).
Three different prediction settings were used:
1—Shannon formulation of Fitts' Law speed-accuracy tradeoff measure.
2—Amount of fine tuning (AFT) or tail length (TL) speed-accuracy tradeoff measure.
3—Combined measure of both 1 and 2.
In the combined measure setting, a simple voting was conducted between the scores of the two separate measures 1 and 2, knowing that the scores of both measures were normalized, and assuming that the two measures are equally trustworthy.
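The combined setting can be sketched as below. The scoring method itself is described elsewhere in the specification and is not reproduced here, so the sketch simply assumes that each measure supplies one score per candidate, where a higher score means a more likely target. The function names, the sum-to-one normalization, and the example scores are assumptions for illustration.

```python
def normalize(scores):
    """Scale scores so they sum to 1, making the two measures comparable."""
    total = sum(scores)
    return [s / total for s in scores]

def combined_prediction(fitts_scores, tail_scores):
    """Equal-weight vote: add the normalized scores and pick the highest."""
    f, t = normalize(fitts_scores), normalize(tail_scores)
    joint = [x + y for x, y in zip(f, t)]
    winner = max(range(len(joint)), key=joint.__getitem__)
    return winner, joint

# Hypothetical scores for two candidates: the measures disagree,
# and the more decisive one (tail length) settles the vote.
winner, joint = combined_prediction([0.6, 0.4], [0.2, 0.8])
```

Equal weights reflect the stated assumption that the two measures are equally trustworthy; unequal weights would be a natural extension if one measure proved more reliable for a given task.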
The results of these tests are summarized in the following tables of ‘Hit Rates’. It is noted that the Tail Length measure yielded, on average, better results in all prediction categories. It is also noted that, in general, the combined measure of both Fitts' Law and Tail Length yielded better results than each of the two measures did separately, indicating effective integration of the two speed-accuracy tradeoff heuristic measures for predicting a user targeted object in the object domain presented to each user or subject.

Fitts Hit Rates (in percents)
                 Sub1     Sub2     Sub3     Sub4     Sub5     Average
1 Vs 5           85       85       90       85       85       86
1 Vs 3           55       85       75       70       70       71
1 Vs 2           65       60       70       60       55       62
1 Vs 3 Vs. 5     50       63.33    66.66    70       60       62

Tail Hit Rates (in percents)
                 Sub1     Sub2     Sub3     Sub4     Sub5     Average
1 Vs 5           80       80       90       100      95       89
1 Vs 3           65       75       70       75       80       73
1 Vs 2           60       65       60       70       65       64
1 Vs 3 Vs. 5     56.66    73.33    76.66    76.66    66.66    69.994

Joint Tail and Fitts Hit Rates (in percents)
                 Sub1     Sub2     Sub3     Sub4     Sub5     Average
1 Vs 5           85       85       90       100      95       91
1 Vs 3           65       70       70       70       70       69
1 Vs 2           65       60       65       75       60       65
1 Vs 3 Vs. 5     56.66    66.66    73.33    83.33    76.66    71.33
Near Versus Far Analysis. The above data of the empirical experiment were further analyzed by separating the results of the closer (Near) targets (user targeted objects) from the results of the farther (Far) targets (user targeted objects). This process was done for each subject as well as for a cross-subject average. The results are graphically shown in FIG. 10, a graph of experimental data and Shannon formulation of Fitts' Law heuristic measure model prediction of total movement time, TMT, plotted as a function of the index of difficulty, ID=log2[(A/W)+1], for Near vs. Far Targets, and, in FIG. 11, a graph of experimental data and (AFT) or (TL) heuristic measure model prediction of tail length, TL, plotted as a function of the size of the target (user targeted object), W, for Near vs. Far Targets.
The cross-subject average analysis revealed that while the near vs. far conditions appear balanced for Fitts' Law (Shannon formulation) measure they demonstrate a slight bias for the Tail Length measure, that is, the tail tends to be longer for closer targets (user targeted objects). This bias indicates that the (AFT) or (TL) speed-accuracy tradeoff heuristic measure is not completely invariant to the horizontal distance, A, from the target (user targeted object) to the pre-determined reference starting point in the object domain, and that a more accurate prediction model may be derived by integrating this factor into the computation.
Exact Pointer Position Heuristic Measures. In most human-computer interaction (HCI) environments, a user targeted object is explicitly selected, by a user, by locating the pointer position and performing an explicit selection action using a pointing mechanism, for example, by clicking on a ‘mouse’ type of pointing mechanism anywhere on the user targeted object, regardless of the exact location of the pointer position within the user targeted object. Other human-computer interaction environments feature special explicit interaction areas within the user targeted object, typically, edges, or designated ‘handles’ associated with the user targeted object, for explicitly enabling activation of special operations on the user targeted object. In the latter case, the special interaction areas are allocated in advance and have a unique role, and thus function as independent human-computer interaction user targeted objects. In typical computer applications including a pointer, or a cursor, dynamically moveable throughout an object domain, a user placing the pointer, or cursor, on any of these special interaction areas usually invokes a visual feedback, typically, a change in shape of the pointer, or cursor, to ensure the unique interpretation of the input of the pointer, or cursor, movement by the user.
In contrast to these explicit and deterministic types of human-computer interaction, the present invention includes the use of a second particular type of the above described category of implicit user pointing gesture measures, which is herein referred to as exact pointer position heuristic measures. The exact pointer position is used in an implicit and probabilistic manner, while integrating this information with the other types of heuristic measures, such as the previously described Fitts' Law (Shannon formulation) and (AFT) or (TL) types of speed-accuracy tradeoff heuristic measures, as well as with the other two categories of heuristic measures, that is, application context heuristic measures, and, number of computer suggestions of each best predicted user targeted object heuristic measures.
The exact pointer position heuristic measure conveys information both with regard to the final position of the pointer and with regard to the recent user movement continuation, also referred to as user movement trend, of the pointer towards a user targeted object in an ambiguous multiple object domain. Specifically, the present invention features two kinds of exact pointer position heuristic measures, as follows:
1—Distance from the center of an object. The closer the pointer position is to the center of an object in an ambiguous multiple object domain, the more likely that object is the user targeted object in the multiple object domain. In order to avoid a bias for small objects in the ambiguous multiple object domain, the proximity of the pointer position to the center of an object is normalized with the area of the object.
2—Direction of pointer movement. Higher scores are assigned to pointer movements that are towards the center of an object in an ambiguous multiple object domain, than to those pointer movements that tend to be farther away from the center of an object. This reflects the observation that users tend to perform the object selection action using the pointing mechanism, for example, clicking the mouse, with the pointer position close to, but not beyond, the center of an object, and tend not to overshoot the center of the object before performing the selection action. Another justification for a penalty for getting away from, or beyond, the center of an object, is that the user already had an opportunity to perform the object selection action using the pointing mechanism, for example, clicking the mouse, with the pointer position more centrally located in the immediate vicinity of the object, and is thus less likely to select at a pointer position located beyond the center of the object than at pointer positions located close to, and not beyond, the center of the object.
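The two exact pointer position heuristic measures listed above can be sketched as follows. The text states only that proximity is normalized with the area of the object; dividing the distance by the square root of the area is an assumed choice that keeps the ratio dimensionless. The function names, the bounding box representation (left, top, right, bottom), and the simple towards/away test are likewise assumptions for illustration.

```python
import math

def center(bbox):
    """Center of a bounding box given as (left, top, right, bottom)."""
    left, top, right, bottom = bbox
    return ((left + right) / 2, (top + bottom) / 2)

def normalized_center_distance(pointer, bbox):
    """Distance from pointer to object center, scaled by sqrt(area) (assumed)."""
    left, top, right, bottom = bbox
    area = (right - left) * (bottom - top)
    return math.dist(pointer, center(bbox)) / math.sqrt(area)

def moving_towards_center(previous, current, bbox):
    """True if the latest pointer step reduced the distance to the center."""
    c = center(bbox)
    return math.dist(current, c) < math.dist(previous, c)

# Hypothetical 20 x 20 pixel object and two recent pointer positions.
box = (0, 0, 20, 20)
d = normalized_center_distance((15, 10), box)
approaching = moving_towards_center((20, 10), (15, 10), box)
receding = moving_towards_center((10, 10), (15, 10), box)
```

A lower normalized distance and an approaching movement would both raise the score of a candidate object; how these values are turned into scores and combined with the other heuristic measures is left to the scoring method of the specification.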
Application Context Heuristic Measures. The second category of different heuristic (statically and/or dynamically learned) measures for implicitly analyzing the user movements of the pointer and predicting the user targeted object, according to the present invention, is application context heuristic measures. A particular type of application context heuristic measure used in the present invention is referred to as containment hierarchy.
The context of the selection action of a user is defined as any information that is external to the object selection action itself, but is relevant to understanding the obje |