Seeing the best way forward: Intentionality cues bias action perception toward the most efficient trajectory

Humans interpret others' behaviour as intentional and goal-directed, expecting others to take the most energy-efficient path to achieve their goals. Recent studies have shown that these expectations of efficient action provide a perceptual prediction of an ideal efficient trajectory, against which the observed action is evaluated, resulting in a distorted perceptual representation of unexpected inefficient actions. Here we show that these predictions rely on the inferred intentionality of the stimulus. Participants observed an actor reach for an object with a straight or arched trajectory. The actions were made efficient or inefficient by adding or removing an obstructing object. The action disappeared mid-trajectory and participants reported the last seen position of the hand on a touch screen. Replicating previous research, judgments of inefficient actions were biased toward the efficient prediction (straight trajectories upward to avoid the obstruction, arched trajectories downward towards the target). In two further experiments, we removed intentional cues by replacing the hand with a non-agentive ball (Exp 2), and by removing the biological profile of the motion by depicting the ball moving at a constant speed (Exp 3). Perceptual biases were substantially reduced when intentional cues were removed. Predictions of efficient action are at least partially perceptually represented and influence perceptual judgments of others' actions, biasing them towards these expectations. These predictions emerge from attributions of intentionality to the observed actor, triggered by the perception of agency and kinematics that follow biological motion profiles.

From early on, humans see others' behaviour as purposeful and goal-directed (Baillargeon, Scott & Bian, 2016; Baker, Saxe & Tenenbaum, 2009; Csibra & Gergely, 2007; Gergely & Csibra, 2003). A key signature of this "intentional stance" (Dennett, 1987) is the assumption that other people generally act rationally: they take the most energy-efficient path to achieve their goal, and expend additional energy only when an obstacle has to be overcome (e.g., Csibra & Gergely, 2007; Hunnius & Bekkering, 2014). This simple efficient action heuristic arises early in development and allows children to attribute intentionality to observed behaviours, even when carried out by inanimate objects (e.g., Gergely, Bekkering & Király, 2002; Gergely, Nádasdy, Csibra, & Bíró, 1995; Liu & Spelke, 2017). Human infants (and some non-human primates) show surprise when actors that are believed to be intentional violate these assumptions, for example, when they do not attempt to avoid an obstacle or take an unnecessarily long path towards their goal (Gergely & Csibra, 2003; Rochat, Serra, Fadiga & Gallese, 2008). Once established, this simple heuristic has been argued to form a stepping stone for more sophisticated abilities for reasoning about others (e.g., Gergely & Csibra, 2003; Wellman & Brandone, 2009). For example, seeing a seemingly inefficient action (e.g. reaching directly towards an object despite an obstacle in the way) can prompt the insight that others act according to beliefs and not objective reality (i.e. they may not have seen the obstacle), forming the basis of a prototypical theory of mind.
We have argued (Bach & Schenke, 2017; Hudson, McDonough, Edwards, & Bach, under review) that expectations of efficient action are, to some extent, perceptually represented, in the form of an ideal "reference" trajectory that a rational actor would take through a given environment, against which observed actions can be judged. This proposal emerges from recent predictive processing models of social perception (e.g. Bach & Schenke, 2017; Csibra, 2008; Kilner, Friston & Frith, 2007a,b; Zaki, 2013). These models argue that perception of others' actions, like perception in general, is hypothesis-driven. Any assumption about the external world (and the people within it) is translated into the perceptual input that would result from such a state, and can guide perception and be tested against actual stimulation (e.g. Clark, 2013; Friston & Kiebel, 2009; Hohwy, 2013). In non-social perception, such expectations explain several visual illusions (e.g. the dress illusion, Schlaffke et al., 2015), the switch between different bi-stable percepts (e.g., Kondo, Farkas, Denham, Asai & Winkler, 2017), or why the same objects can appear convex or concave depending on prior assumptions about light sources (Adams, Graf & Ernst, 2004). In social perception, simply attributing a goal to another person could similarly elicit associated predictions about how this individual would realise such a goal, specifying which action they may soon carry out (e.g. Bach, Bayliss & Tipper, 2011; Bach, Knoblich, Gunter, Friederici & Prinz, 2005; for theoretical arguments, see Bach, Nicholson & Hudson, 2014, 2015; Csibra, 2008; Kilner et al., 2007a,b). The principle of efficient action can make a direct contribution here, specifying the ideal "reference" trajectory that achieves the actor's goals with minimum energy expenditure, given the current environmental constraints, such as affordances of potential goal objects and potential obstacles in the way (e.g. Bach & Schenke, 2017; Bach et al., 2014).
In a recent series of studies, we used this paradigm to test whether, in the case of observed actions, the perceptual biases reflect the predictions derived from the assumption of efficient action (Hudson, McDonough, Edwards, & Bach, under review). Participants observed a hand reach for an object with a straight or arched trajectory. The actions were either efficient (reaching straight when the path was clear or arched over an obstacle) or inefficient (straight towards an obstacle or arched over empty space). The movement disappeared at some point on its course and participants reported the hand's last seen position, either on a touch screen or by comparing it to probe stimuli. Both measures revealed that perceptual judgements were biased by expectations of efficient action. Straight reaches were reported higher if there was an obstacle in the way, as if lifted to avoid it. Conversely, reaches with a high arched trajectory were reported lower if the path was clear. These biases were present automatically, but increased when participants explicitly predicted, prior to action onset, the most efficient trajectory through the scene, or when attention was drawn to the environmental constraints. Moreover, they could be disrupted by dynamic visual noise masks directly after stimulus offset, which specifically disrupt re-entrant top-down projections to visual cortex required for conscious visual experience of a stimulus, suggesting that the biases emerge during ongoing perception or directly after the sudden offset, when the visual system "fills in" the expected future path (Fahrenfort, Scholte, & Lamme, 2007; Lamme, Zipser & Spekreijse, 2002).
Together, these results indicate that the teleological stance is at least partly perceptually represented, providing an ideal reference trajectory that interacts with the action that was actually perceived. Here, we test on what stimulus features these predictions of efficient action depend. In children, as well as in adults, intention attribution (and the resulting surprise when seeing an inefficient action) depends on the presence of cues to intention (e.g. Johnson, 2000, 2003), such as seeing an agentive stimulus (such as a hand relative to a ball; e.g., Falck-Ytter, Gredeback & Hofsten, 2006), or observing movements with biological motion trajectories (e.g., Baron-Cohen, 1995; Leslie, 1994; Morewedge, Preston, & Wegner, 2007; Rakison & Poulin-Dubois, 2001). If such cues indeed trigger attributions of intentionality to others, and the expectation of efficient action is tied to such intentionality attribution, then they should also determine to what extent perceptual biases towards efficient actions are observed.
We first replicated the original experiment from Hudson and colleagues (under review), in which participants saw efficient and inefficient reaches and indicated the hand's perceived disappearance point on a touch screen. In further experiments, we progressively removed intentional cues. First, as in prior research on infant intention attribution, we replaced the hand with a non-agentive stimulus, a ball (e.g., Falck-Ytter et al., 2006), which nevertheless followed the same well-known biological motion trajectories and profiles as the hands in the previous experiment, showing the classical bell-shaped velocity profile of reaches towards objects (e.g., Beggs & Howarth, 1972). Second, humans are sensitive to motion cues that distinguish intentional biological agents from inanimate objects, such as self-propulsion and change of direction (Baron-Cohen, 1995; Leslie, 1994), or a trajectory and speed of movement that is similar to one's own movement (Morewedge et al., 2007; Rakison & Poulin-Dubois, 2001). In a third experiment, participants therefore saw the same ball as in Experiment 2, but it no longer followed a biological motion profile, removing all kinematic cues to intention. If biases toward efficient action emerge from cues that signal intentionality, they should be substantially reduced in Experiment 2, and further in Experiment 3, as cues to intentionality are removed.

Participants
Eighty-two participants took part (mean age = 21 years, SD = 4.3, 66 females; Experiment 1: n = 29, Experiment 2: n = 27, Experiment 3: n = 26). Nine additional participants were excluded due to performance (see Results). All participants were right-handed and had normal or corrected-to-normal vision, and were recruited from Plymouth University or the wider community for course credit or payment. The study received ethical approval from the University of Plymouth's ethics board, in accordance with the guidelines of the ESRC and the Declaration of Helsinki. A priori power analyses of the original experiment that Experiment 1 replicates (Hudson et al., under review, "Report Obstacle" condition) revealed that a sample size of 11 is required to achieve power of 0.95.

Apparatus
Presentation (NeuroBS) software was used to present the experiment via an HP EliteDisplay S230tm 23-inch widescreen (1920 x 1080) Touch Monitor. Verbal responses were recorded using Presentation's sound threshold logic via a Logitech PC120 combined microphone and headphone set.

Stimuli
Example stimuli can be seen in Figure 1A. For Experiment 1, forty videos of an arm reaching for an object were used, taken from our previous studies (Hudson et al., under review). To derive a set of stimuli of efficient actions, videos were filmed of an arm at rest to the right of the screen, which then began to reach for one of four objects (an apple, a packet of crisps, a glue stick or a stapler) on the left of the screen. The reaches were made with either a straight trajectory, directly reaching for the target object (Straight/Efficient), or an arched trajectory over one of three obstacles (an iPad, lamp or pencil holder; Arched/Efficient). Each video clip was then converted into 22 frames, where frame 1 depicted the hand at rest, and frame 22 depicted the actor's arm mid-way through the action. For each efficient action, an inefficient action sequence was created by digitally removing the obstacles from the Arched/Efficient videos (Arched/Inefficient), or by inserting the obstructing objects into the Straight/Efficient videos (Straight/Inefficient). The inefficient actions were therefore identical to the efficient actions in terms of movement kinematics, and differed only by the presence/absence of the obstacle. Finally, response frames were created by digitally removing the actor's arm from the scene, so that only the objects and background remained. Presenting this frame immediately after the action sequence gave the impression of the hand disappearing from the scene, and participants indicated the last seen location of the tip of the index finger on this frame with a touch response on screen.
For Experiment 2, the forty videos used in Experiment 1 were digitally manipulated so that the actor's hand was replaced with a ball, coloured using the same tones as the hand. The ball was the same size as the tip of the index finger and was positioned at the same coordinates in each frame. An additional frame was created by positioning the ball mid-air before the first frame (where the ball contacts the table), creating an illusory "bounce" motion. This provided a realistic context for the ball movement, reducing impressions of self-propelled movement that could also cue the observer that the motion is intentional (Luo & Baillargeon, 2005).
For Experiment 3, the forty videos of Experiment 2 were digitally manipulated so that the ball now appeared to move in a straight line and at a constant speed after the bounce frame, eliminating the biological motion profile. To achieve the straight trajectory, the line of best fit was calculated through the last four frames of each sequence of Experiment 1 (i.e. all possible disappearance points). The constant speed of the ball was created by recalculating the Y coordinates at equal distances along this line, between the first and last frame.
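The construction of the constant-speed trajectory can be sketched as follows. This is a minimal illustration and not the authors' stimulus-generation code: the pixel coordinates are hypothetical, the function names are invented for the example, and it assumes that the fitted line is sampled at equally spaced X positions between the first and last frame, which yields equal distances along the line.

```python
import math


def fit_line(points):
    """Least-squares line y = intercept + slope * x through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def constant_speed_path(x_first, x_last, n_frames, intercept, slope):
    """Place n_frames positions at equal spacing along the fitted line."""
    step = (x_last - x_first) / (n_frames - 1)
    xs = [x_first + i * step for i in range(n_frames)]
    return [(x, intercept + slope * x) for x in xs]


# Hypothetical pixel coordinates of the four possible disappearance points.
last_four = [(400.0, 300.0), (370.0, 310.0), (340.0, 320.0), (310.0, 330.0)]
intercept, slope = fit_line(last_four)

# Hypothetical start position (x = 700) and eight displayed frames.
path = constant_speed_path(700.0, 310.0, 8, intercept, slope)

# Equal spacing per frame means constant speed at a fixed frame duration.
step_lengths = [math.dist(a, b) for a, b in zip(path, path[1:])]
```

Because every frame is shown for the same duration, equal step lengths between consecutive positions correspond to a constant speed, in contrast to the bell-shaped velocity profile of the original reaches.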

Procedure
An example trial sequence can be seen in Figure 1B. Participants completed four blocks of 48 trials in which each condition was presented an equal number of times (Straight/Efficient, Straight/Inefficient, Arched/Efficient, Arched/Inefficient). At the start of each trial, participants saw an instruction to "Hold the spacebar", in response to which they pressed the spacebar with their right hand and kept it depressed. This ensured that they did not track the observed motion with their finger and could only initiate their response once the action sequence had disappeared. Participants then saw the first frame of the action sequence as a static image (the hand at rest in Experiment 1 and the "bounce" frame in Experiments 2 and 3) and were required to say "yes" into the microphone if there was an obstructing object present, and "no" if there was not. The action sequence began 1000 ms after a verbal response had been detected: every third frame of the action was presented for 80 ms each and the final frame was randomly selected after 5, 6, 7 or 8 frames (e.g. trials with a length of 8 frames showed frames 1-4-7-10-13-16-19-22). This final frame was then immediately replaced with the response frame to give the illusion that the hand/ball had disappeared. Participants released the spacebar and, with their right hand, touched the screen where they thought the final position of the tip of the observed index finger was in Experiment 1, or the final ball position in Experiments 2 and 3. As soon as a response was registered, the next trial began.
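The frame schedule described above can be sketched as follows. The constant and function names are hypothetical, and the total duration is simply the number of displayed frames multiplied by the 80 ms frame time; this is an illustration of the sampling scheme, not the original Presentation script.

```python
import random

FRAME_STEP = 3                   # every third frame of the 22-frame clip is shown
FRAME_DURATION_MS = 80           # display time per shown frame
POSSIBLE_LENGTHS = (5, 6, 7, 8)  # number of frames shown before disappearance


def frame_sequence(n_shown):
    """Frame numbers displayed on a trial that shows n_shown frames."""
    return [1 + FRAME_STEP * i for i in range(n_shown)]


def sample_trial(rng):
    """Randomly pick a sequence length and return (frames, duration in ms)."""
    n_shown = rng.choice(POSSIBLE_LENGTHS)
    return frame_sequence(n_shown), n_shown * FRAME_DURATION_MS


rng = random.Random(0)  # seeded only to make the example repeatable
frames, duration_ms = sample_trial(rng)

# The four possible final frames, one for each sequence length.
final_frames = [frame_sequence(n)[-1] for n in POSSIBLE_LENGTHS]
```

Under this scheme the action can disappear at frame 13, 16, 19 or 22, which matches the four possible disappearance points referred to in the stimulus description.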

Results
Data filtering was identical to Hudson et al. (under review). In all three experiments, trials were excluded if the correct response procedure was not followed (e.g. lifting the spacebar too early; 3.5%), or if response initiation or execution times were less than 200 ms or more than 3 SDs above the sample mean (2.2%; Initiation: mean = 393.7 ms, SD = 173.3; Execution: mean = 571.9 ms, SD = 203.3). Three participants were excluded because too few trials remained after trial exclusions (< 50% valid trials). Additional participants were excluded if the distance between the real and selected positions exceeded 3 SD of the mean (mean = 39.9 pixels, SD = 18.9; 2 participants excluded), or if the correlation between the real and selected positions was more than 3 SD below the median r value (X axis: median r = .940, SD = .041; Y axis: median r = .908, SD = .063; 4 participants excluded).
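The trial-level timing criteria can be sketched as below. The function names are hypothetical. The example plugs in the initiation and execution means and SDs reported above purely for illustration; the text does not state whether these descriptives were computed before or after exclusion, so the resulting cutoffs are an assumption.

```python
RT_FLOOR_MS = 200  # times below this are excluded
SD_LIMIT = 3       # times more than 3 SDs above the sample mean are excluded


def upper_cutoff(sample_mean, sample_sd, sd_limit=SD_LIMIT):
    """Largest time (ms) that is still accepted."""
    return sample_mean + sd_limit * sample_sd


def time_is_valid(time_ms, sample_mean, sample_sd):
    """True if a single initiation or execution time passes both criteria."""
    return RT_FLOOR_MS <= time_ms <= upper_cutoff(sample_mean, sample_sd)


# Reported descriptives (mean, SD) in ms, used here only as an illustration.
INITIATION = (393.7, 173.3)
EXECUTION = (571.9, 203.3)


def trial_is_valid(initiation_ms, execution_ms, followed_procedure):
    """A trial is kept only if the procedure was followed and both times pass."""
    return (
        followed_procedure
        and time_is_valid(initiation_ms, *INITIATION)
        and time_is_valid(execution_ms, *EXECUTION)
    )
```

With these illustrative values, the upper cutoffs would fall at roughly 914 ms for response initiation and roughly 1182 ms for response execution.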
Analysis was conducted on the predictive perceptual bias, calculated by subtracting the real final coordinates of the tip of the index finger/ball from the participant's selected coordinates on each trial. This resulted in separate "difference" scores along the X and Y axes, where positive X and Y scores represented a rightward and upward displacement respectively, and negative X and Y scores represented a leftward and downward displacement respectively. A score of 0 on both axes indicated that the participant selected the real final position exactly. These difference scores were entered into a 2x2x3 ANOVA for the X and Y axis separately, with Trajectory (straight vs arched) and Efficiency (efficient vs inefficient) as repeated-measures factors, and Experiment as a between-subjects factor.
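The difference scores and their condition means can be sketched as follows. The trial records and field names are hypothetical. The sketch assumes coordinates in which Y increases upward; raw touch-screen coordinates often increase downward, in which case the optional flag flips the sign so that positive Y still means an upward displacement.

```python
from collections import defaultdict
from statistics import mean


def difference_score(selected, real, y_axis_points_down=False):
    """Selected minus real position; +x is rightward and +y is upward."""
    dx = selected[0] - real[0]
    dy = selected[1] - real[1]
    if y_axis_points_down:
        dy = -dy  # convert screen coordinates to "up is positive"
    return dx, dy


def cell_means(trials):
    """Mean (dx, dy) for each Trajectory x Efficiency cell."""
    cells = defaultdict(list)
    for t in trials:
        key = (t["trajectory"], t["efficiency"])
        cells[key].append(difference_score(t["selected"], t["real"]))
    return {
        key: (mean(dx for dx, _ in scores), mean(dy for _, dy in scores))
        for key, scores in cells.items()
    }


# Hypothetical trials for one participant.
trials = [
    {"trajectory": "straight", "efficiency": "inefficient",
     "selected": (505.0, 304.0), "real": (500.0, 300.0)},
    {"trajectory": "straight", "efficiency": "inefficient",
     "selected": (497.0, 302.0), "real": (500.0, 300.0)},
    {"trajectory": "arched", "efficiency": "efficient",
     "selected": (450.0, 395.0), "real": (452.0, 390.0)},
]
means = cell_means(trials)
```

Per-participant cell means of this kind would then serve as the input to the repeated-measures ANOVA described above.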

Y axis
As in our prior work (Hudson et al., under review), we predicted (1) that inefficient actions would be perceptually "corrected" towards the more efficient action alternative, and (2) that these biases should be present in Experiment 1 but weaker when cues to intentionality are removed in Experiments 2 and 3. Indeed, the analysis revealed the predicted interaction of Trajectory and Efficiency (F(1,79) = 45.0, p < .001, ηp² = .363). The disappearance points for straight trajectories were perceived to be higher when the actions were inefficient (i.e. reaching towards an obstacle, 2.26px) than when the actions were efficient (no obstacle, -.967px; t(81) = 5.46, p < .001, d = .60). Conversely, the disappearance points for arched reaches were perceived to be lower for inefficient actions (7.87px) than for efficient actions (11.6px; t(81) = 4.81, p < .001, d = .53).
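As a worked illustration of these means, the size of the Trajectory x Efficiency interaction can be expressed as the total amount, in pixels, by which inefficient actions were shifted towards the efficient alternative. The sketch below uses only the four pooled means reported above; the variable names are hypothetical, and treating the interaction size as the sum of the two shifts is an assumption about how such a summary would be computed.

```python
# Mean Y-axis difference scores (pixels) reported in the text,
# pooled across the three experiments.
mean_y = {
    ("straight", "efficient"): -0.967,
    ("straight", "inefficient"): 2.26,
    ("arched", "efficient"): 11.6,
    ("arched", "inefficient"): 7.87,
}

# Straight reaches towards an obstacle are reported higher (upward shift).
upward_shift_straight = (
    mean_y[("straight", "inefficient")] - mean_y[("straight", "efficient")]
)

# Arched reaches over empty space are reported lower (downward shift).
downward_shift_arched = (
    mean_y[("arched", "efficient")] - mean_y[("arched", "inefficient")]
)

# Total correction towards the efficient trajectory.
total_correction = upward_shift_straight + downward_shift_arched

print(round(upward_shift_straight, 3),
      round(downward_shift_arched, 3),
      round(total_correction, 3))
```

On these pooled means, the upward shift for straight reaches is about 3.2 px and the downward shift for arched reaches is about 3.7 px, roughly 7 px in total.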
As unpredicted effects are subject to alpha inflation in an ANOVA due to multiple testing (e.g. Cramer, 2016), all additional results in the Y-axis and X-axis analyses should be interpreted with caution, and considered relative to a Bonferroni-adjusted alpha of .004.

Discussion
Previous studies have shown that observers' perceptual representations of others' actions are predictively biased towards the goals attributed to them (Hudson et al., 2009, 2012, 2016a,b, 2017; Hudson & Jellema, 2011) and that these predictions are informed by the assumption of efficient action, reflecting the specific trajectories that would allow an actor to efficiently reach the inferred goal (Hudson et al., under review). Here, we tested whether these predictions depend on cues to intentionality and varied whether the stimulus was a hand with biological motion kinematics (i.e. the bell-shaped velocity profile of reaching, Beggs & Howarth, 1972), a non-agentive ball that travelled the same biological motion trajectory as the hand, or a ball travelling a non-biological trajectory. As before, we asked participants to accurately report the moving object's last seen position after it suddenly disappeared.
The results replicated our prior work (Hudson et al., under review) in that perceptual reports of hand disappearance points were not veridical, but "corrected" towards the expected action kinematics of a rational, efficient actor. The disappearance points of hands reaching straight towards an obstacle were reported higher than if the path was clear. Similarly, the disappearance point of arched reaches was perceived lower if there was no obstacle to reach across, compared to when there was an obstacle. These biases towards efficient action depended, however, on cues to intentionality. We found that biases towards efficient action were (numerically) reduced when participants watched a non-intentional object, a ball, travel on the same biological motion trajectory as the hand, starting slowly and speeding up, as if self-propelled. Biases towards efficient action were, however, almost completely eliminated when the same ball was seen travelling with a non-biological trajectory that nevertheless traversed, on average, the same locations as the hands, but without showing the characteristic bell-shaped velocity profile of intentional actions towards objects (e.g., Beggs & Howarth, 1972).
These results confirm first that, as in our prior study (Hudson et al., under review), observers predict the ideal action trajectory that a rational actor, fully aware of all relevant environmental constraints, would take. Second, they show that these predictions influenced the perceptual judgments of observed actions, subtly biasing them towards the most efficient trajectory.
These findings are therefore in line with predictive processing models of social perception (e.g. Bach et al., 2014; Bach & Schenke, 2017; Hudson et al., 2016a,b, 2017; Kilner, Friston & Frith, 2007a,b; Zaki, 2013), which assume that perceptual experience of others' actions emerges from an integration of bottom-up sensory information and prior assumptions about the others' goals and how they would (best) realise them. Our data now show that, when observing the behaviour of others, these predictions of efficient action depend on bottom-up cues to intentionality derived from the object semantics and the specific trajectory and motion profile it has. Both types of cues have been previously identified as the basis for attributing intentionality to observed agents in children (e.g., Baron-Cohen, 1995; Leslie, 1994; Morewedge et al., 2007; Rakison & Poulin-Dubois, 2001). The finding that these cues also modulate predictive biases towards efficient action therefore directly supports our proposal that these predictions emerge from the attribution of intention to the observed actions (Hudson et al., 2016a,b, 2017; Hudson et al., under review), which then inform their perceptual representation. During action observation these top-down influences can compensate for the perceptual "blurring" during motion perception (i.e. motion sharpening, Bex et al., 1995; Hammett, 1997), or fill in missing steps (Muckli et al., 2005). They can serve one's own action, allowing it to be coordinated with the others' future behaviour or the end-state of their actions (e.g., Sebanz, Bekkering & Knoblich, 2006). Finally, they can be compared to actual behaviour, triggering revisions of prior assumptions if prediction errors become too large (e.g. Clark, 2013; Hohwy, 2013), signalling, for example, that a behaviour deemed to be intentional may, in fact, be unintentional, or that the actor is not aware of all relevant environmental constraints.
Further work now needs to resolve at what level cues to intentionality act to induce the biases towards efficient action. One possibility is that they reflect higher-level "cognitive" attributions of intention to others, which then feed back to lower-level perceptual processes via one's knowledge of own action kinematics (e.g., Ansuini, Cavallo, Bertone & Becchio, 2015; Csibra, 2008; Kilner, Friston & Frith, 2007a,b; Otten, Seth, & Pinto, 2017), thereby inducing biases similar to one's own action. However, imaging studies suggest that cues to intentionality may already act on lower-level regions within higher-level visual cortex, such as the superior temporal sulcus (e.g., Grossman, Donnelly, Price, Pickens, Morgan, Neighbor, & Blake, 2000; Saygin, 2007). The perceptual biases towards efficient actions could therefore also emerge from such "mid-level" recurrent interactions that act on the action's perceptual representations.
It is currently debated at what level of representation such effects on perceptual judgments could emerge. Several studies, both psychophysical and based on neuroimaging, have provided evidence that predictions exert downstream effects on early perceptual processes, across different modalities (e.g. vision, Ekman et al., 2017; Muckli et al., 2005; audition, Kondo et al., 2017), providing sensory "templates" of expected stimulation (Ekman et al., 2017), or filling in missing information during apparent motion (Muckli et al., 2005). Others argue that expectations influence primarily decision-related processes that integrate bottom-up with top-down information on all levels of the hierarchy (e.g., Bang & Rahnev, 2017; Rungratsameetaweemana, Itthipuripat, Salazar & Serences, 2018), or that they reflect attentional modulations of the response properties of neurons in early sensory areas (e.g., Desimone and Duncan, 1995; Serences and Kastner, 2014). Still others argue that many of the psychophysical effects of expectation may in fact reflect testing artefacts or demand effects, when participants realise what is being tested (e.g., Durgin, Baird, Greenburg, Russell, Shaughnessy & Waymouth, 2009; Firestone & Scholl, 2016).
While the precise mechanism has to be confirmed by neuroimaging studies, several aspects of prior work imply a role in the action's perceptual representation. First, when asked during piloting of the original set of studies, participants were unaware of the experimental hypotheses, arguing against demand effects. Indeed, the effects were already present very briefly (250 ms) after action offset, in psychophysical probe judgment tasks (Hudson et al., 2016a,b, 2017, under review; for a review of similar findings in non-biological motion perception, see Hubbard, 2015) that have been shown to be relatively robust against cognitive control processes (Courtney & Hubbard, 2008; Ruppel, Fleming & Hubbard, 2009). Most importantly, the biases towards efficient action are disrupted by brief (560 ms) dynamic visual noise masks that interfere with the re-entrant feedback from higher cortical areas to visual cortex that is required for the stabilisation of percepts for conscious access, during both perception (Breitmeyer & Ögmen, 2006; Fahrenfort, Scholte, & Lamme, 2007; Kinsbourne & Warrington, 1962; Lamme, Zipser & Spekreijse, 2002) and imagery (Dijkstra, Mostert, de Lange, Bosch & van Gerven, 2018). The observed biases in perceptual judgments are therefore unlikely to stem from unspecific perceptual changes in memory or motor control.
Instead, we propose that they play a role in ongoing motion perception, emerging from the recurrent interactions between lower and higher visual regions involved in stabilising percepts and compensating for the substantial blurring during motion perception.

Conclusions
The principle of efficient action allows observers to predict ideal reference trajectories that intentional actions will follow, given that the agent is fully aware of all relevant environmental constraints. The data presented here confirm that these predictions are at least partially perceptually represented and influence perceptual judgments of others' actions, biasing them towards these expectations. They show that these predictions emerge from attributions of intentionality to the observed actor, triggered by the perception of biological "agentive" objects and kinematics that follow biological motion profiles.

Figure 1. Stimulus conditions and trial sequence. The stimulus conditions are depicted in Panel A; these are the same for all three experiments. The Action Trajectory was either straight or arched over. The presence or absence of an obstructing object made the action trajectory either efficient or inefficient. Panel B depicts an example of a Straight/Inefficient trial in the Biological Ball experiment (top) and the Non-Biological Ball experiment (bottom). In all examples (Panels A and B), the hand/ball is in the initial start position, and the white markers depict the final four frames of the trajectory of the index finger tip. The action sequence disappeared at one of these four points. An example trial sequence is depicted in Panel C, showing an efficient arched trajectory over an obstruction.

Figure 2. The Trajectory X Efficiency interactions for the Biological Hand (A), Biological Ball (B), and Non-biological Ball (C) experiments. The difference between the real final position and the selected final position is plotted for the X axis and Y axis. The origin of each plot represents the real final position on any given trial (0px difference on each axis). Panel D depicts a comparison of the size of the Y axis interaction in pixels, equivalent to the total amount by which inefficient actions were corrected towards a more efficient trajectory. Error bars depict 95% confidence intervals.