The Interaction of Spatial Reference Frames and Hierarchical Object Representations:
A Computational Investigation of Drawing in Hemispatial Neglect

Jeffrey Beng-Hee Ho
Department of Psychology
Carnegie Mellon University
Pittsburgh, PA 15213-3890
ho+@cmu.edu

Marlene Behrmann
Department of Psychology
Carnegie Mellon University
Pittsburgh, PA 15213-3890
mb9h@crab.psy.cmu.edu

David C. Plaut
Department of Psychology
Carnegie Mellon University, and
Center for the Neural Basis of Cognition
Pittsburgh, PA 15213-3890
plaut@cmu.edu

In Proceedings of the 17th Annual Conference of the Cognitive Science Society, pages 148-153. Hillsdale, NJ: Lawrence Erlbaum Associates.

Abstract

In drawing a figure, hemispatial neglect patients typically produce an adequate representation of parts on the right of the figure while omitting significant features on the left. This contralateral neglect is influenced by multiple spatial reference frames and by the hierarchical structure of the object(s) in the figure. The current work presents a computational characterization of the interaction among these influences to account for the way in which neglect manifests in drawing. Neglect is simulated by a ``lesion'' (monotonic drop-off from right to left) that can affect performance in both object-centered and viewer-centered reference frames. The joint effects of neglect in both these frames provide a coherent account of the drawing performance of a patient, JM, and may be extended to account for the copying performance of other patients across a range of objects and scenes.

Introduction

Hemispatial or unilateral neglect is a visuospatial deficit, typically caused by brain damage to the right parietal lobe, in which patients fail to perceive or act on information that appears on the side of space opposite the lesion. While patients with (left-sided) neglect have normal intellectual abilities and intact primary motor and sensory function, they do not notice objects on the left, may leave food untouched on the left side of the plate, and may not shave or bathe the left side of the body. Neglect is generally interpreted as a failure to distribute attention evenly along the horizontal meridian such that less attention is deployed to the left than to the right (Cohen, Romero, Servan-Schreiber, & Farah, 1994, Kinsbourne, 1993, Mozer & Behrmann, 1990, Posner, 1988).

A central question in understanding neglect is what constitutes ``left''---that is, with respect to what frame of reference is the left side defined? Possibilities include viewer-centered frames (e.g., aligned with the retina, head, or trunk), environment-centered frames (e.g., aligned with the room, table, or page), and object-centered frames (i.e., aligned with depicted objects). Under most viewing conditions, these frames are all aligned, so there is no way to evaluate which frame determines the nature of neglect behavior.

In fact, recent evidence suggests that neglect behavior is sensitive to spatial information defined simultaneously with respect to multiple reference frames. When viewer-centered and object-centered frames are deconfounded by rotating the stimulus or the viewer, patients continue to exhibit viewer-centered neglect but also fail to report information on the left of the object even though this information is located to the right of midline of the viewer and/or the environment (Behrmann & Moscovitch, 1994, Driver & Halligan, 1991, Young, Hellawell, & Welch, 1992). A compelling example of this object-centered neglect is the case of NG, a patient with right-sided neglect, who failed to read the rightmost letters of a word even when the word was presented vertically or in mirror-reverse format (Caramazza & Hillis, 1990). Object-centered neglect can also be demonstrated under fixed viewing conditions. When viewing an equilateral triangle with a gap on one side, patients with left-sided neglect fail to detect the gap more often when they are biased to see the triangle as pointing in a direction that places the gap on the left of the perceived major axis (Driver, Baylis, Goodrich, & Rafal, 1994). These data support the view that the spatial positions of parts of an object are coded with respect to a reference frame aligned with the principal axis of the object itself (Marr, 1982, Marr & Nishihara, 1978), and that visual attention is allocated, at least in part, relative to this frame.

Object-centered effects in neglect may manifest in more complicated ways in a copying task, in which a target stimulus, often a daisy or a clock, is presented upright in the center of a blank piece of paper. Patients with neglect often produce an adequate representation of the right side of the figure while leaving out significant features on the left. For example, the left drawing in Figure 1 shows the performance of a neglect patient, JM, in copying an upright daisy, in which the leftmost petals were omitted. Again, though, the standard copying task confounds the influences of reference frames centered on the viewer, the environment, and the object. If parts are located with respect to an object-centered frame, then, when copying a daisy rotated from the upright, patients should continue to neglect to draw parts on the left defined intrinsically with respect to the object itself. As is evident in Figure 1, patient JM still omits features on the object-centered left of the daisy when copying misoriented versions of the target daisy.

Figure 1: Neglect patient JM's copying of a daisy presented in different orientations.

Interestingly, object-centered neglect in copying may occur not only for a single object but also when the figure to be copied contains multiple items. In this case, patients may omit features on the left of an object while still including features on the right of another object that is further to the left on the page (Gainotti, Messerli, & Tissot, 1972, Marshall & Halligan, 1993). Moreover, the object-centered deficit may appear even within subparts of a single, complex object. When patient PP (Driver & Halligan, 1991) was required to copy a single wheel, presented on the left or right of a page, she omitted spokes on the left of the wheel. When two wheels were presented, one on the left and one on the right, she omitted spokes only on the left of the left wheel. When the two wheels appeared as parts of a larger object (a bicycle), PP omitted the left wheel entirely, retaining only the right one.

These findings make sense if the representation of an object has a hierarchical structure in which its parts are in themselves objects at a smaller spatial scale, and which decompose further into their own parts at an even smaller scale (Marr & Nishihara, 1978, Palmer, 1977). During the copying of a complex figure, a reference frame aligned with a part of the object serves as the context frame for locating and drawing its subparts. Thus, the object-centered frame is not fixed throughout the task; rather, objects are recursively decomposed and dynamically assigned to roles as objects and parts depending on the current relevant level of the hierarchy (see Hinton, 1990). Accounting for the copying performance of neglect patients (and of normal subjects) is complicated because, at one point in time, the context frame may represent the spatial coordinates for copying a particular part, whereas at a second point in time, this same part may itself define the context frame for the copying of its own subparts.

The goal of this paper is to examine how hierarchical object representations might interact with spatial reference frames to explain the performance of patients who show neglect both with respect to multiple frames of reference and at multiple levels of the object hierarchy. First, we examine whether, as suggested above, the performance of normal subjects in copying misoriented versions of a daisy is mediated by a hierarchical representation of the daisy. We then implement this process as a conventional tree-traversal algorithm over a hierarchical data structure representing the daisy. During the traversal, the position of each component is maintained relative to both the local object-centered frame and the global viewer-centered frame. By imposing a spatially defined lesion, analogous to the deficit hypothesized to underlie the attentional impairment in patients with right-parietal damage, we demonstrate how neglect can arise with respect to both the viewer- and object-centered reference frames even when objects are misaligned from their canonical orientation.

Hierarchical Representations in Drawing

It is commonly assumed that hierarchical object representations are used to structure drawing (Taylor & Tversky, 1992) and that this representation is the same one that mediates perception (Kosslyn, 1987, VanSommers, 1989). In the case of the daisy, we assume that the hierarchical representation is composed of three major parts (parents), each of which can be broken down into its subparts (children) (see Figure 2). These children are decomposed further---for example, the central stem decomposes into the oblique stems, which break down further to encompass the leaves. The representation used in this study has four levels, as illustrated in Figure 2.

Figure 2: A daisy and its hierarchical representation.

To verify that this hierarchical object representation adequately captures drawing performance, we had 20 normal subjects generate 3 copies of a daisy presented in each of 4 orientations (upright, 90 deg rotation to the left or right, and inverted). We tracked the order of strokes used by the subjects. Drawing performance was considered to obey the hierarchical representation if the order in which the components were drawn followed a depth-first traversal order through the hierarchy (ignoring the order among subparts). In other words, once a stroke within a particular subtree is drawn, all of its components and subcomponents must be drawn before a stroke within another subtree at the same level is drawn. Any stroke that did not adhere to this rule was counted as a violation of the hierarchy. Across all subjects and drawing conditions, the mean number of hierarchy violations was 1.3 (SD 0.84), and was not significantly affected by the orientation of the daisy (F < 1). This number of violations is significantly lower than the mean obtained from 120 randomly-generated stroke sequences (17.2; SD 2.6; F[1,238] = 3953, p < .001). This finding supports the proposal that the performance of normal subjects in copying the daisy is based on traversing a hierarchical representation like the one in Figure 2.
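
As an illustration only, the violation count described above could be operationalized roughly as follows. This is a sketch under stated assumptions, not the scoring procedure actually used: the part names and the toy hierarchy are hypothetical, and a stroke is scored as a violation when it returns to a subtree that was started earlier and then left.

```python
def count_violations(parent, strokes):
    """Count strokes that re-enter a subtree after it was left.

    parent:  maps each part to its parent (the root maps to None).
    strokes: the order in which parts were drawn.
    Assumed scoring rule; the hierarchy used below is a toy example.
    """
    def subtrees(part):
        # The part itself plus its ancestors, excluding the root.
        chain = []
        while parent.get(part) is not None:
            chain.append(part)
            part = parent[part]
        return set(chain)

    seen = set()      # subtrees in which at least one stroke was drawn
    previous = set()  # subtrees containing the immediately preceding stroke
    violations = 0
    for stroke in strokes:
        current = subtrees(stroke)
        # Re-entering a subtree that was started and then abandoned.
        if any(t in seen and t not in previous for t in current):
            violations += 1
        seen |= current
        previous = current
    return violations


# Hypothetical three-part hierarchy, much simpler than Figure 2.
toy_parent = {
    "daisy": None,
    "head": "daisy", "petal_l": "head", "petal_r": "head",
    "stem": "daisy", "leaf_l": "stem", "leaf_r": "stem",
    "pot": "daisy",
}
depth_first = ["head", "petal_l", "petal_r", "stem", "leaf_l", "leaf_r", "pot"]
interleaved = ["head", "petal_l", "stem", "petal_r", "leaf_l", "leaf_r", "pot"]
```

In this toy case the depth-first order yields no violations, whereas the interleaved order yields two (the return to the head for `petal_r`, and the return to the stem for `leaf_l`).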

Implementation of Neglect Drawing

Method

Given the evidence that normal subjects use a hierarchical object representation when drawing, a computational investigation was carried out to explore the implications of a spatial impairment in object- and viewer-centered reference frames when drawing using a hierarchical representation. The hierarchy depicted in Figure 2 was represented as a conventional tree data structure, in which each node in the tree corresponded to a particular part of the daisy. The node for a part contained information on its location in the object-centered frame defined by its parent. Specifically, the object-centered frame for a part was oriented and centered on its parent, with a scale defined by the horizontal extent of the parent (with x-coordinates ranging between +/-1). The viewer-centered frame was always upright, centered on the entire daisy, and used a scale defined by the horizontal extent of the daisy. Thus, for instance, the rightmost petal in the upright daisy has a viewer-centered x-coordinate of about 0.5 (i.e., the horizontal position of its center is about half way between the midline of the daisy and the tip of the right leaf) and an object-centered x-coordinate of about 2.0 (i.e., its horizontal distance from the center of its parent, the circle, is about twice the radius of the circle). For a misoriented daisy, the viewer-centered positions of parts changed accordingly but their object-centered positions remained the same.
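
The node structure and the two kinds of coordinates might be sketched as below. The class name `Part`, the field names, and the y-coordinates are hypothetical; only the x-values of about 0.5 and 2.0 for the rightmost petal come from the text. For simplicity the sketch rotates positions about the center of the daisy and ignores any rescaling of the viewer-centered frame after rotation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Part:
    name: str
    obj_x: float     # x in the frame of the parent (parent spans -1..+1)
    canon_xy: tuple  # assumed: position in the upright, daisy-centered frame
    children: list = field(default_factory=list)


def viewer_x(part, rotation_deg):
    """Viewer-centered x after rotating the daisy counterclockwise."""
    angle = math.radians(rotation_deg)
    x, y = part.canon_xy
    return x * math.cos(angle) - y * math.sin(angle)


# Hypothetical fragment of the hierarchy: flower head and rightmost petal.
head = Part("head", obj_x=0.0, canon_xy=(0.0, 0.6))
petal = Part("right_petal", obj_x=2.0, canon_xy=(0.5, 0.6))
head.children.append(petal)
```

Rotating by 180 degrees moves the petal to a viewer-centered x of about -0.5 while `obj_x` stays at 2.0, which mirrors the statement that only the viewer-centered positions change for a misoriented daisy.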

For a particular orientation of the daisy, the probability that a part would be drawn (i.e., not neglected) in a particular frame was assumed to be a monotonically-increasing function of its horizontal position in that frame (Figure 3). The specific (exponential) form of this function is not critical as it influences only quantitative aspects of the results. Notice that the probability of drawing a part is near 1.0 on the right side of the frame, about 0.9 at the midline, and drops off sharply towards the left of the frame. The overall likelihood that a part is drawn was assumed to be a weighted average of its separate probabilities in the viewer-centered frame and in the object-centered frame---the effects of different relative weightings are explored below. All else being equal, the effect of neglect is generally stronger in the object-centered frame than in the viewer-centered frame because the former is defined more locally (i.e., parts typically fall outside the +/-1 frame defined by the horizontal extent of their parent).
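
A minimal sketch of such a function and of the weighted average follows. The exact curve and its parameters are not given in the text, so the form 1 - 0.1 exp(-kx) with k = 2 is purely an assumption chosen to match the qualitative description (about 0.9 at the midline, near 1.0 on the right, a sharp drop on the left); the function names are hypothetical.

```python
import math


def p_drawn(x, k=2.0):
    """Assumed lesion curve: probability of drawing a part at position x.

    Equals 0.9 at the midline (x = 0), approaches 1.0 to the right,
    and falls off exponentially to the left, clipped to [0, 1].
    """
    return min(1.0, max(0.0, 1.0 - 0.1 * math.exp(-k * x)))


def local_probability(viewer_x, object_x, w_viewer):
    """Weighted average of the two frames; the weights sum to 1."""
    w_object = 1.0 - w_viewer
    return w_viewer * p_drawn(viewer_x) + w_object * p_drawn(object_x)
```

With these assumed parameters, a part at x = -1 has a probability of roughly 0.26 in that frame, and a part far outside the frame on the left is clipped to 0, which illustrates why positions beyond the +/-1 extent of a parent are affected so strongly.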

Figure 3: The probability that a part is drawn (i.e., not neglected) as a function of its horizontal position within a reference frame.

A simple depth-first tree traversal algorithm was used to determine the neglect pattern. At every node, the probability that the corresponding part was drawn was calculated based on its viewer-centered and object-centered coordinates. We assumed that if a part was not drawn, then none of its subparts would be drawn. Thus, the probability of a part being drawn is the product of the probability of its parent being drawn and its own local probability based on its positions in the viewer- and object-centered frames. The order of traversal among children of the same parent was irrelevant. The outcome of the tree traversal was that every part was assigned a probability of being drawn based on the orientation of the daisy and the particular weightings of the viewer- and object-centered frames.
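
The traversal can be sketched in a few lines. The dictionary layout, the key names, and the local probabilities in the example are hypothetical; the only rule taken from the text is that a part's probability is its parent's probability multiplied by its own local probability.

```python
def draw_probabilities(node, parent_prob=1.0, out=None):
    """Depth-first traversal assigning each part its probability of being drawn.

    node: dict with "name", "local" (the part's own local probability),
          and an optional list of "children" (assumed layout).
    """
    out = {} if out is None else out
    prob = parent_prob * node["local"]  # parent must be drawn first
    out[node["name"]] = prob
    for child in node.get("children", []):
        draw_probabilities(child, prob, out)
    return out


# Hypothetical branch: central stem -> left oblique stem -> left leaf.
stem = {
    "name": "stem", "local": 0.9,
    "children": [{
        "name": "left_stem", "local": 0.5,
        "children": [{"name": "left_leaf", "local": 0.4}],
    }],
}
result = draw_probabilities(stem)
```

Here the leaf ends up with a probability of about 0.18 (0.9 x 0.5 x 0.4), lower than either of its ancestors, which is why deeply nested parts on the left are the most likely to be omitted.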

Results and Discussion

To investigate the relative contribution of the viewer- and object-centered neglect on drawing performance, we calculated the probability of each part being drawn for daisies in all four orientations---up, left, down, and right---first when the viewer- and object-centered effects had independent influences on drawing performance and then when performance was influenced by a combination of viewer- and object-centered effects. To explore the independent effects of the two frames, the weighting of either the viewer- or object-centered effect was set at 1 and the other effect was set at 0. Thereafter, combinations of the two frames were examined when the weighting of one frame was 0.75, 0.5, or 0.25 and that of the other was set to produce a sum of 1.

Because a misoriented daisy, unlike the upright daisy, allows the viewer- and object-centered effects to be decoupled, Figure 4 illustrates the independent contribution of viewer-centered neglect and of object-centered neglect in a left-facing daisy. The numbers superimposed on the daisy indicate the probability of each feature being drawn, calculated according to the algorithm described above. It is important to recognize that the probability of a part being drawn is contingent on the probability of its parent being drawn---if the parent or containing object is omitted, so is the child. The probabilities for subparts such as the petals and leaves, therefore, reflect the joint probability of parent and child both being drawn and are consequently lower than the probability of the parent alone.

As is evident from this figure, when the viewer-centered influence is 1.0 with no object-centered influence (Figure 4a), information on the viewer-centered left has a fairly low probability of being drawn, with the probability of the daisy head being 0.75 and the surrounding left and right petals ranging from 0.45 to 0.63 respectively. The petal that occupies the leftmost position has a probability of 0.38. In contrast, when the viewer-centered effect is set to have no influence and neglect arises solely within the object-centered frame (Figure 4b), information to the right of the canonical midline of the daisy has a high probability of being drawn (approximately 0.94) whereas the petals and leaf on the left of the intrinsic axis have a very low probability of being drawn (approximately 0.24). The leaf on the canonical left stem has a probability of 0.06 because it is conditional on its parent stem being drawn and because it occupies the most extreme left position in the object-centered frame.

Figure 4: The probabilities that the parts of a left-facing daisy are drawn when neglect operates (a) solely in the viewer-centered frame, and (b) solely in the object-centered frame(s).

Figure 5 illustrates how a mixture of neglect in the viewer- and object-centered frames affects the probability of a feature being drawn. As is evident from the upright daisy, the object-centered effect has a somewhat stronger influence on performance---when it is set at 0.75, the probability of the petals and stem on the left being drawn is lower than when the viewer-centered effect is set at 0.75. In all cases, the midline structures (flowerhead, stem, and pot) have a high probability of being drawn. The slightly stronger influence of the object is also seen in the upside-down figure where, even when the two effects are equivalent at 0.5, the probability of drawing parts on the left, defined in the object-centered frame, is lower than the probability of drawing parts on the left of the viewer-centered frame (canonical right of daisy). The interaction of the two effects is best seen in the 90 deg misoriented daisies. In the left-facing daisy, the probability of drawing the petals on the canonical left (where the petals are also on the viewer-centered left) is much lower than in the right-facing daisy (where these same petals appear on the right of the viewer-centered frame). As the severity of either the viewer- or the object-centered neglect is increased, the probability of drawing such a left-sided petal decreases. For example, on the left-facing daisy, when the viewer- and object-centered effects are equivalent, at 0.5, the probability of drawing the petal to the immediate left of the canonical midline is 0.36. As the object-centered effect is decreased to 0.25, the probability of that petal being drawn increases to 0.41, and as the object-centered effect is increased to 0.75, the probability decreases to 0.30. The important point is that when a petal appears on the left of both the object- and the viewer-centered frame, it has a high probability of being neglected, relative to when it appears on the left in only one frame. This is in marked contrast to the same petal (canonical left) on the right-facing daisy, whose probability of being drawn, irrespective of the relative contribution of object- and viewer-centered neglect, remains approximately constant at about 0.97. The crucial result from this analysis is that neglect has joint effects on the distribution of attention in both frames and that these are exacerbated as the severity of neglect---particularly in the object-centered frame---increases.

Figure 5: The effects of various mixtures of neglect in the viewer-centered frame and in the object-centered frame(s).

The probabilities in the figures can be interpreted as indicating the frequency with which individual parts would be drawn over a large sample of drawings. In order to account for the drawing performance of a single neglect patient, JM, and to generate daisies that would adequately characterize his performance, we adopted a discrete approach in which we converted the probabilities into binary (drawn or omitted) values. We selected a threshold of 0.57 such that features with values below this threshold were not drawn (i.e., were neglected). Figure 6 contains the renderings of the daisies in four orientations when this threshold value is used. The relative weightings of viewer- and object-centered neglect were 0.6 and 0.4, respectively, as these best accounted for JM's drawings. As can be seen from these drawings, the data are reasonably well captured by this mixture of object- and viewer-centered neglect with the threshold as the cut-off value for performance. One apparent discrepancy is that JM's drawing of the left-facing daisy does not contain the upper-right petal (Figure 1b). As it turns out, he initially drew this petal and then erased it, removing a small part of the circle along with it. Another aspect of his data not yet explained is the right-facing daisy (Figure 1d), in which petals on the canonical object left are omitted but petals are also omitted from the left of the daisy head defined in viewer-centered coordinates. This pattern again reflects the joint effect of viewer- and object-centered effects. One interesting possibility is that after JM drew the daisy head (which he did first), because the circle in the center has no intrinsic axis, it has no obvious left-right asymmetry. This ambiguity lends itself to neglect in both frames: In an object-centered frame, a petal to the left is ignored and in a viewer-centered frame, petals to the left are also ignored. Although we cannot account for these data in our current implementation and the joint effect of the two frames in Figure 1d is beyond the interactions we have modeled here, the model may be extended to account for these data utilizing the same principles as those incorporated in the simulations thus far.
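
The thresholding step can be sketched as a simple filter over the per-part probabilities. The function name and the example values are hypothetical; only the threshold of 0.57 and the rule that values below it are not drawn come from the text.

```python
def drawn_parts(probabilities, threshold=0.57):
    """Return the parts whose probability of being drawn reaches the threshold.

    probabilities: maps part name to its cumulative probability of being
    drawn (the product of local probabilities along the path from the root).
    """
    return {name for name, p in probabilities.items() if p >= threshold}


# Hypothetical values for illustration, not the probabilities of Figure 5.
example = {"head": 0.75, "right_petal": 0.70, "left_petal": 0.38, "stem": 0.57}
rendered = drawn_parts(example)
```

Because a part's cumulative probability can never exceed that of its parent, any part that passes the threshold also has a parent that passes it, so the rendered drawing never contains a subpart without the part it attaches to.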

Figure 6: Drawings produced by a mixture of 0.6 viewer-centered neglect and 0.4 object-centered neglect, assuming a threshold probability of drawing a part of 0.57.

Conclusions

We have presented a computational investigation of how hierarchical object representations might interact with multiple spatial reference frames. This combination of the representation and reference frames is able to reproduce the pattern of drawing performance observed in patients with hemispatial neglect. The deficit in patients with neglect is considered to be a failure to distribute attention evenly across space with the result that information on the contralateral side is ignored or omitted. Of much recent interest is that, in addition to neglect of information on the left defined by viewer-centered coordinates (Farah, Brunn, Wong, Wallace, & Carpenter, 1990, Karnath, 1994), patients may simultaneously ignore information on the left defined by coordinates intrinsic to an object (Behrmann & Moscovitch, 1994, Caramazza & Hillis, 1990, Driver & Halligan, 1991, Marshall & Halligan, 1993). We simulate the co-existence of neglect in more than one set of coordinates by assuming that the same deficit (instantiated as a monotonic drop-off of attention from right to left) underlies the distribution of attention in these different reference frames (but see Humphreys & Riddoch, 1994). By computing the probability of drawing a feature as a function of its left-right position in the object- and viewer-centered frames, we can explain why neglect patients fail to localize objects or to attend to information situated in a larger context frame. Through the dynamic reassignment of elements to object or part roles, this same model can account for neglect of objects on the left of a multi-object scene, neglect on the left of a single object, and neglect for features on the left of a part of a single object. We also show how, by varying the relative weighting of neglect in each frame, we can account for the drawing performance of neglect patient JM.

One aspect of neglect copying performance which is not easily explained by the current model and which is robust across patients is that some features are retained. Under the account we have proposed, these same features should probably be omitted. For example, patients almost never draw only the right half of the circle for the head of the flower, nor do they omit the lip of the pot (if the base is drawn), even if it occupies a position on the left of the spatial reference frame. Similarly, in clock drawing or copying, even if patients neglect to fill in the numbers on the left of the clock, they invariably draw the entire perimeter of the clock. A possible explanation for this retention of left-sided information derives from the nature of the object representation underlying the subject's performance. Several recent studies have shown that patients with neglect remain sensitive to Gestalt properties of the stimulus. Thus, if a feature on the left of the object's midline can be grouped together with a feature on the right to form a ``good'' figure, based on principles such as good continuation, symmetry or closure, the left-sided feature is less likely to be neglected (Ward, Goodrich, & Driver, 1994). The grouping of features according to Gestalt heuristics may be incorporated into the hierarchical representation adopted here and a rather direct extension of the current implementation can account for these seemingly contradictory results.

The approach adopted here has been to characterize systematically the behavior of a mechanism in which hierarchical object representations and multiple reference frames interact to co-determine performance of a system. The simulations are not intended to be a veridical instantiation of the neural mechanism underlying neglect nor to parallel directly the function of the parietal lobe. The principles embodied in this work, however, are consistent with many views that argue that the parietal lobe integrates and transforms data from one set of coordinates to another (Colby, 1991, Karnath, 1994, Stein, 1992). How the brain might actually implement a hierarchical representation and how it might achieve the dynamic reassignment of the components to parts and wholes are difficult research issues (see Hinton, 1990, for a connectionist approach to these problems).

The example of the daisy was used in this research because it is standardly used in the clinical assessment of neglect and because much is known about the way neglect patients perform on this task. The principles governing the joint effects of neglect in more than one reference frame, as proposed here, however, are believed to apply more generally. Predictions from this model may be generated to account for the copying behavior of neglect patients when they are confronted with more complex hierarchical objects and visual scenes.

Acknowledgments

Financial support for this research was provided by the National Institute of Mental Health (Grant MH47566), the McDonnell-Pew Program in Cognitive Neuroscience (Grant T89-01245-016), the Neural Processes in Cognition Summer Internship Program, and a Student Undergraduate Research Grant from Carnegie Mellon University.