{"id":1031720,"date":"2023-12-02T02:42:27","date_gmt":"2023-12-02T07:42:27","guid":{"rendered":"https:\/\/www.immortalitymedicine.tv\/learning-few-shot-imitation-as-cultural-transmission-nature-com\/"},"modified":"2024-08-17T15:10:11","modified_gmt":"2024-08-17T19:10:11","slug":"learning-few-shot-imitation-as-cultural-transmission-nature-com","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/human-reproduction\/learning-few-shot-imitation-as-cultural-transmission-nature-com.php","title":{"rendered":"Learning few-shot imitation as cultural transmission &#8211; Nature.com"},"content":{"rendered":"<p><p>GoalCycle3D task space    <\/p>\n<p>    We introduce GoalCycle3D, a 3D physical simulated task space    built in Unity38,39 which expands on    the GoalCycle gridworld environment of ref. 33. By anchoring    our task dynamics to this previous literature and translating    it to a 3D space, our results naturally extend prior work to a    more naturalistic and realistic environment. The resulting    richness is an important direction for the eventual deployment    of AI, highlighting which algorithmic novelties are required to    exceed the prior state-of-the-art in a more realistic setting.  <\/p>\n<p>    Similar to ref. 27, we decompose an    agents task as the direct product of a world, a game and a set    of co-players. The world comprises the size and topography of    the terrain and the locations of objects. The game defines the    reward dynamics for each player, which in GoalCycle3D amounts    to a correct ordering of goals. A co-player is another    interactive policy in the world, consuming observations and    producing actions. Each task can be viewed as a different    Markov decision process, thus presenting a distribution of    environments for reinforcement learning.  <\/p>\n<p>    While the 3D task space yields significant richness, it also    presents opportunities for handcrafting which would reduce the    generality of our findings. To avoid this, we make use of    procedural generation over a wide task space. More    specifically, we generate worlds and games uniformly at random    for training, and test generalisation to held-out probe tasks    at evaluation time, including a held-out human co-player, as    described in Probe Tasks. This train-test split provides data    that enables overfitting to be ruled out, just as in supervised    learning.  <\/p>\n<p>    Worlds are parameterised by world size, terrain bumpiness and    obstacle density. The obstacles and terrain create navigational    and perception challenges for players. Players are positively    rewarded for visiting goal spheres in particular cyclic orders.    To construct a game, given a number of goals n, an order    Sn is sampled    uniformly at random. The positively rewarding orders for the    game are then fixed to be {, 1}    where 1 is the opposite direction of the    order . An agent has a chance (frac{2}{(n-1)!}) of selecting a    correct order at random at the start of each episode. In all    our training and evaluation we use n4, so one is    always more likely to guess incorrectly. The positions and    orders of the goal spheres are randomly sampled at the start of    each episode.  <\/p>\n<p>    Players receive a reward of +1 for entering a goal in the    correct order, given the previous goals entered. The first goal    entered in an episode always confers a reward of +1. 
Players receive a reward of +1 for entering a goal in the correct order, given the previous goals entered. The first goal entered in an episode always confers a reward of +1. If a player enters an incorrect goal, they receive a reward of −1 and must now continue as if this were the first goal they had entered. If a player re-enters the last goal they left, they receive a reward of 0. The optimal policy is to divine a correct order, by experimentation or observation of an expert, and then visit the spheres in this cyclic order for the rest of the episode. Figure 1 summarises the GoalCycle3D task space.

Fig. 1: A 3D physically simulated task space. Each task contains procedurally generated terrain, obstacles and goal spheres, with parameters randomly sampled on task creation. Each agent is independently rewarded for visiting goals in a particular cyclic order, also randomly sampled on task creation. The correct order is not provided to the agent, so an agent must deduce the rewarding order either by experimentation or via cultural transmission from an expert. Our task space presents navigational challenges of open-ended complexity, parameterised by world size, obstacle density, terrain bumpiness and number of goals. Our agent observes the world using LIDAR (see Supplementary Movie 30).

The term cultural transmission has a variety of definitions, reflecting the diverse literature on the subject. For the purpose of clarity, we adopt a specific definition in this paper, one that captures the key features of few-shot imitation. Intuitively, the agent must improve its performance upon witnessing an expert demonstration and maintain that improvement within the same episode once the demonstrator has departed. However, what seems like test-time cultural transmission might actually be cultural transmission during training, leading to memorisation of fixed navigation routes. To address this, we measure cultural transmission in held-out test tasks and with human expert demonstrators (refs. 40,41), similar to the familiar train-test dataset split in supervised learning (ref. 42).

Capturing this intuition, we define cultural transmission from expert to agent to be the average of the improvement in agent score when an expert is present and the improvement in agent score when that expert has subsequently departed, normalised by the expert score, evaluated on held-out tasks that have never before been experienced by the agent. Mathematically, let $E$ be the total score achieved by the expert in an episode of a held-out task. Let $A_{\mathrm{full}}$ be the score of an agent with the expert present for the full episode. Let $A_{\mathrm{solo}}$ be the score of the same agent without the expert. Finally, let $A_{\mathrm{half}}$ be the score of the agent with the expert present from the start to halfway into the episode. Our metric of cultural transmission is

$$\mathrm{CT} := \frac{1}{2}\,\frac{A_{\mathrm{full}} - A_{\mathrm{solo}}}{E} + \frac{1}{2}\,\frac{A_{\mathrm{half}} - A_{\mathrm{solo}}}{E}. \qquad (1)$$

A completely independent agent doesn't use any information from the expert, so it has a value of CT near 0. A fully expert-dependent agent has a value of CT near 0.75: it scores nothing alone ($A_{\mathrm{solo}} \approx 0$) and matches the expert only while it is present ($A_{\mathrm{full}} \approx E$ and $A_{\mathrm{half}} \approx E/2$), giving $\frac{1}{2}(1) + \frac{1}{2}(\frac{1}{2}) = 0.75$. An agent that follows perfectly when the expert is present, but continues to achieve high scores once the expert is absent, has a value of CT near 1. This is the desired behaviour of an agent from a cultural transmission perspective, since the knowledge about how to solve the task was transmitted to, retained by and reproduced by the agent.
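Equation (1) is straightforward to compute from the four episode scores. A minimal sketch, with illustrative numbers:

```python
def cultural_transmission(expert_score, agent_full, agent_half, agent_solo):
    """Eq. (1): CT = 1/2*(A_full - A_solo)/E + 1/2*(A_half - A_solo)/E.

    The three agent scores come from evaluations of the same agent on a
    held-out task: expert present all episode, present for the first half
    only, and absent entirely.
    """
    e = float(expert_score)
    return 0.5 * (agent_full - agent_solo) / e \
         + 0.5 * (agent_half - agent_solo) / e

# Fully expert-dependent agent: follows while the expert is present,
# scores nothing alone.
print(cultural_transmission(expert_score=20, agent_full=20,
                            agent_half=10, agent_solo=0))   # 0.75

# Agent that follows and then remembers: matches the expert even after dropout.
print(cultural_transmission(expert_score=20, agent_full=20,
                            agent_half=20, agent_solo=0))   # 1.0
```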
We first examine how reinforcement learning can generate cultural transmission in a relatively simple setting: a 4-goal game in a 20 m × 20 m empty world. This is far from the most challenging task space for our algorithm, but it has a simplicity that is useful for developing our intuition. We find that an agent trained with memory (M), expert dropout (ED) and an attention loss (AL) on tasks sampled in this subspace experiences four distinct phases of training. The learning pathway of the agent passes through a cultural transmission phase to reach a policy that is capable of online adaptation, experimenting to discover and exploit the correct cycle within a single episode. By comparison, a vanilla RL baseline (M) is incapable of learning this few-shot adaptation behaviour. In fact, it completely fails to get any score on the task (see The role of memory, expert demonstrations and attention loss). Cultural transmission, then, functions as a bridge to few-shot adaptation.

The training cultural transmission metric shows four distinct phases over the training run, each corresponding to a distinct social learning behaviour of the agent (see Fig. 2). In phase 1 (red), the agent starts to familiarise itself with the task, learns representations and locomotion, and explores, without much improvement in score. In phase 2 (blue), with sufficient experience and representations shaped by the attention loss, the agent learns its first social learning skill: following the expert bot to solve the task. The training cultural transmission metric increases to 0.75, which suggests pure following.

Fig. 2: Training cultural transmission (left) and agent score (right) for training without ADR on 4-goal in a small empty world. Colours indicate four distinct phases of agent behaviour from left to right: (1) (red) startup and exploration, (2) (blue) learning to follow, (3) (yellow) learning to remember, (4) (purple) becoming independent from the expert.

In phase 3 (yellow), the agent learns the more advanced social learning skill that we call cultural transmission. It remembers the rewarding cycle while the expert bot is present and retrieves that information to continue to solve the task when the bot is absent. This is evident in a training cultural transmission metric approaching 1 and a continued increase in agent score.

Lastly, in phase 4 (purple), the agent is able to solve the task independently of the expert bot. This is indicated by the training cultural transmission metric falling back towards 0 while the score continues to increase. The agent has learned a memory-based policy that can achieve high scores with or without the bot present. More precisely, MEDAL displays an experimentation behaviour in this phase, which involves using hypothesis-testing to infer the correct cycle without reference to the bot, followed by exploiting that correct cycle more efficiently than the bot does (see Supplementary Movies 1–4). The bot is not quite optimal because, for ease of programming, it is hard-coded to pass through the centre of each correct goal sphere, whereas reward can be accrued by simply touching the sphere. Note by comparison with Fig. 3a that this experimentation behaviour does not emerge in the absence of prior social learning abilities.
Fig. 3: Score (left), training cultural transmission (CT, centre) and evaluation CT on empty world 5-goal probe tasks (right) over the course of training. a Comparing MEDAL with three ablated agents, each trained without one crucial ingredient: without an expert (M), memory (EDAL) or attention loss (MED). b Ablating the effect of expert dropout, comparing no dropout (MEAL) with expert dropout (MEDAL). We report the mean performance for each across 10 initialisation seeds for agent parameters and task procedural generation. We also include the expert's score and MEDAL's best seed for scale and upper-bound comparisons. The shaded area on the graphs is one standard deviation.

In other words, few-shot imitation creates the right prior for few-shot adaptation to emerge, which remarkably leads to improvement over the original demonstrator's policy. Note that social learning by itself is not enough to generate experimentation automatically; further innovation by reinforcement learning, on top of the culturally transmitted prior, is necessary for the agent to exceed the capabilities of its expert partner. Our agent stands on the shoulders of giants, and then riffs to climb yet higher.

We have shown that our MEDAL agent is capable of learning a test-time cultural transmission ability. Now we show that this set of ingredients is minimal, by demonstrating the absence of cultural transmission when any one of them is removed. In every experiment, MEDAL and its ablated cousins were trained on procedurally generated 5-goal, 20 m × 20 m worlds with no vertical obstacles and horizontal obstacles of density 0.0001 m⁻², and evaluated on the empty world 5-goal probes in Probe tasks. We use a variety of different dropout schemes, depending on the ablation: M- is trained with full dropout (the expert is never present), MEAL is trained with no dropout (the expert is always present) and all other agents are trained with probabilistic dropout.
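These three dropout schemes can be sketched as per-episode presence masks for the expert. The windowing into intervals and the per-interval probability below are illustrative assumptions, not the paper's training parameters.

```python
import random

def expert_presence(scheme, n_intervals=10, p_present=0.5, seed=None):
    """Sketch of expert dropout schedules over one episode.

    The episode is split into n_intervals equal windows; the mask says
    whether the expert co-player is present in each window.
    """
    rng = random.Random(seed)
    if scheme == "full":            # M-: expert never present
        return [False] * n_intervals
    if scheme == "none":            # MEAL: expert always present
        return [True] * n_intervals
    if scheme == "probabilistic":   # MEDAL: expert drops in and out
        return [rng.random() < p_present for _ in range(n_intervals)]
    raise ValueError(f"unknown scheme: {scheme}")

print(expert_presence("probabilistic", seed=0))
```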
Figure 3a shows that memory (M), the presence of an expert (E) and our attention loss (AL) are important ingredients for the learning of cultural transmission. When any of these is absent, the agent achieves a score of 0 and therefore also doesn't pick up any reward-influencing social cues from the expert (if present), accounting for a mean CT of 0.

First, we consider the M- ablation. By removing expert demonstrations and, consequently, all dependent components, the dropout (D) and attention loss (AL), the agent must learn to determine the correct goal ordering by itself in every episode. The MPO agent's exploration strategy is not sufficiently structured to deduce the underlying conceptual structure of the task space, so the agent simply learns a risk-averse behaviour of avoiding goal spheres altogether (see Supplementary Movie 5).

Next, we analyse the EDAL ablation. Without memory, our agent cannot form connections to previously seen cues, be they social, behavioural or environmental. When replacing the LSTM with an equally sized MLP (keeping the same activation functions and biases, but removing any recurrent connections), our agent's ability to register and remember a solution is reduced to zero.

Lastly, we turn to the MED ablation. Having an expert at hand is futile if the agent cannot recognise and pay attention to it. When we turn off the attention loss, the resulting agent treats other agents as noisy background information, attempting to learn as if it were alone. Vanilla reinforcement learning benefits from social cues to bootstrap knowledge about the task structure; the attention loss encourages it to recognise social cues. Note that the attention loss, like all auxiliary losses that shape neural representations, is only required at training time. This means that our agent can be deployed with no privileged sensory information at test time, relying solely on its LIDAR.

To isolate the importance of expert dropout, we compare our MEDAL agent (in which the expert intermittently drops in and out) with the previous state-of-the-art method ME-AL (in which the expert is always present). We use the same procedural generation and evaluation setting as in the previous section. Studying Fig. 3b, we see that the addition of expert dropout to the previous state of the art leads to better CT. MEDAL achieves higher CT both during training and when evaluated on empty world 5-goal probe tasks. This is because dropout encourages the learning of within-episode memorisation, a capability that was absent from previous agents (ref. 33) and which confers a higher cultural transmission score (see also Agents recall expert demonstrations with high fidelity).

As we have seen, learning cultural transmission in a fixed task distribution acts as a gateway for learning few-shot adaptation. While this is undeniably useful in its own right, it raises the question: how can an agent learn to transmit cultural information in more complex tasks? ADR is a method of expanding the task distribution across training time to maintain it in the Goldilocks zone for cultural transmission. It gradually increases the complexity of the training worlds in an open-ended procedurally generated space (parameterised by 7 hyperparameters).

Figure 4a shows an example expansion of the randomisation ranges for all parameters over the duration of an experiment. Training CT is maintained between the boundary update thresholds of 0.75 and 0.85. We see an initial start-up phase of ~100 hours when social learning first emerges in a small, simple set of tasks. Once training CT exceeded 0.75, all randomisation ranges began to expand. Different parameters expand at different times, indicating when the agent has mastered different skills, such as jumping over horizontal obstacles or navigating bumpy terrain. For intuition about the meaning of the parameter values, see Supplementary Movies 6–9.
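A minimal sketch of an ADR boundary update consistent with the description above. The exact update rule, direction of the comparisons and step size are assumptions for illustration; the paper states only that training CT is kept between the thresholds 0.75 and 0.85.

```python
def adr_update(ranges, training_ct, lower=0.75, upper=0.85, step=0.05):
    """Sketch of automatic domain randomisation (ADR) boundary updates.

    `ranges` maps each procedural-generation hyperparameter to a [low, high]
    randomisation range. Assumed rule: widen every range when training CT is
    above the upper threshold (agent is comfortable, make tasks harder) and
    narrow it when CT falls below the lower threshold (ease off).
    """
    for name, (low, high) in ranges.items():
        width = max(high - low, 1e-6)
        if training_ct > upper:
            ranges[name] = [low, high + step * width]
        elif training_ct < lower:
            ranges[name] = [low, max(low, high - step * width)]
    return ranges

ranges = {"world_size": [20.0, 24.0], "obstacle_density": [0.0, 0.001]}
print(adr_update(ranges, training_ct=0.9))   # both ranges widen slightly
```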
Fig. 4: a The expansion of parameter ranges over training for one representative seed in MEDAL-ADR training. b Score (left), training cultural transmission (CT, centre) and evaluation CT on complex world probe tasks (right) over the course of training for the automatic (A) and domain randomisation (DR) ablations of MEDAL-ADR. We report the mean performance for each across 10 initialisation seeds for agent parameters and task procedural generation. We also include the expert's score and the best MEDAL-ADR seed for scale and upper-bound comparisons. The shaded area on the graphs is one standard deviation.

To understand the importance of ADR for generating cultural transmission in complex worlds, we ablate the automatic (A) and domain randomisation (DR) components of MEDAL-ADR (for parameter values, see Supplementary Table D.1). The MEDAL agent is trained on worlds as complicated as the end point of the ADR curriculum. The MEDAL-DR agent is trained on worlds sampled uniformly between the minimal and maximal complexities of the ADR curriculum (i.e., no automatic adaptation of the curriculum). In Fig. 4b we observe that ADR is crucial for the generation of cultural transmission in complex worlds, with MEDAL-ADR achieving significantly higher scores and cultural transmission than both MEDAL-DR and MEDAL.

To demonstrate the recall capabilities of our best-performing agent, we quantify its performance across a set of tasks where the expert drops out. The intuition here is that if our agent is able to recall information well, then its score will remain high for many timesteps even after the expert has dropped out. However, if the agent is simply following the expert or has poor recall, then its score will instead drop immediately to near zero. To our knowledge, within-episode recall of a third-person demonstration has not previously been shown to arise from reinforcement learning. This is an important discovery, since the recent history of AI research has demonstrated the increased flexibility and generality of learned behaviours over pre-programmed ones. What's more, third-person recall within an episode amortises imitation onto a timescale of seconds and does not require perspective matching between co-players. As such, we achieve the fast adaptation benefits of previous first-person few-shot imitation works (e.g., refs. 22,43,44), but as a general-purpose emergent property of third-person RL rather than via a special-purpose first-person imitation algorithm.

For each task, we evaluate the score of the agent across ten contiguous 900-step trials, comprising an episode of experience for the agent. In the first trial, the expert is present alongside the agent, and thus the agent can infer the optimal path from the expert. From the next trial onwards, the expert is dropped out and therefore the agent must continue to solve the task alone. The world, agent and game are not reset at trial boundaries; we use the term trial to refer to the bucketing of score accumulated by each player within the time window. We consider recall from two different experts: a scripted bot and a human player. For both, we use the worlds from the 4-goal probe tasks (see Automatic domain randomisation).
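Since the trial structure is purely a bookkeeping device over one continuous episode, it reduces to bucketing per-step rewards. A sketch, with hypothetical reward timings:

```python
def bucket_scores(step_rewards, trial_len=900):
    """Bucket per-step rewards into contiguous trials (sketch).

    The episode is one continuous run (here ten 900-step trials); the world
    and game are never reset, so "trials" only partition the score.
    """
    return [sum(step_rewards[i:i + trial_len])
            for i in range(0, len(step_rewards), trial_len)]

# Illustrative rewards: an agent that keeps scoring for a few trials after
# the expert drops out at step 900, then degrades.
rewards = [0.0] * 9000
for t in range(0, 3600, 90):   # hypothetical goal entries every 90 steps
    rewards[t] = 1.0
print(bucket_scores(rewards))  # [10.0, 10.0, 10.0, 10.0, 0.0, ...]
```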
Figure 5 compares the recall abilities of our agent trained with expert dropout (MEDAL-ADR) and without (ME-AL, similar to the prior state of the art, ref. 33). Notably, after the expert has dropped out, we see that our MEDAL-ADR agent is able to continue solving the task for the first trial while the ablated ME-AL agent cannot. MEDAL-ADR maintains good performance for several trials after the expert has dropped out, despite the fact that the agent only experienced 1800-step episodes during training. From this, we conclude that our agent exhibits strong within-episode recall.

Fig. 5: Score of MEDAL-ADR and ME-AL agents across trials since the expert dropped out. a Experts are scripted bots. b Experts are human trajectories. Supplementary Movie 10 shows MEDAL-ADR's recall from a bot demonstration in a 3600-step (4-trial) episode. Supplementary Movie 31 shows MEDAL-ADR's recall from a human demonstration in an 1800-step (2-trial) episode.

To show causal information transfer from the expert to the agent in real time, we can adopt a standard method from the social learning literature. In the two-action task (refs. 28,29,30), subjects are required to solve a task with two alternative solutions. Half of the subjects observe a demonstration of one solution, while the others observe a demonstration of the alternative solution. If subjects disproportionately use the observed solution, this is evidence that supports imitation. This experimental approach is widely used in the field of social learning; we use it here as a behavioural analysis tool for artificial agents for the first time. Using the tasks from our game space analysis, we record the preference of the agent in pairs of episodes where the expert demonstrates the optimal cycles $\sigma$ and $\sigma^{-1}$. The preference is computed as the percentage of correct complete cycles completed by the agent that match the direction of the expert's cycle. Evaluating this over 1000 trials, we find that the agent's preference matched the demonstrated option 100% of the time, i.e., in every completed cycle of every one of the 1000 trials.

Trajectory plots further reveal the correlation between expert and agent behaviour (see Fig. 6). By comparing trajectories under different conditions, we can again argue that the cultural transmission of information from expert to agent is causal. The agent cannot solve the task when the bot is not placed in the environment (Fig. 6a). When the bot is placed in the environment, the agent is able to successfully reach each goal and then continue executing the demonstrated trajectory after the bot drops out (Fig. 6b). However, if an incorrect trajectory is shown by the expert, the agent still continues to execute the wrong trajectory (Fig. 6c).
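The preference measure can be sketched as follows (the direction encoding and function names are our own, not the paper's):

```python
def demonstration_preference(completed_cycles, demo_direction):
    """Sketch of the two-action preference measure.

    `completed_cycles` lists the direction of each correct complete cycle
    the agent finished (+1 for the demonstrated order sigma, -1 for its
    reverse); `demo_direction` is the direction the expert showed. Returns
    the percentage of the agent's correct cycles that match it.
    """
    if not completed_cycles:
        return 0.0
    matches = sum(1 for d in completed_cycles if d == demo_direction)
    return 100.0 * matches / len(completed_cycles)

# Paired episodes: the expert shows sigma in one and sigma^{-1} in the other.
print(demonstration_preference([+1, +1, +1], demo_direction=+1))  # 100.0
print(demonstration_preference([-1, -1], demo_direction=-1))      # 100.0
```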
Fig. 6: Trajectory plots for the MEDAL-ADR agent for a single episode. a The bot is absent for the whole episode. b The bot shows a correct trajectory in the first half of the episode and then drops out. c The bot shows an incorrect trajectory in the first half of the episode and then drops out. The coloured parts of the lines correspond to the colour of the goal sphere the agent and expert have entered, and the crosses correspond to when the agent entered an incorrect goal. Here, position refers to the agent's position along the z-axis. Supplementary Movies 11–13 correspond to each plot respectively.

To demonstrate the generalisation capabilities of our agents, we quantify their performance over a distribution of procedurally generated tasks, varying the underlying physical world and the overlying goal-cycle game. We analyse both in-distribution and out-of-distribution generalisation with respect to the distribution of parameters seen in training (see Supplementary Table C.2). Out-of-distribution values are calculated as 20% of the min/max in-distribution ADR values where possible, and are indicated by cross-hatched bars in all figures.

In every task, an expert bot is present for the first 900 steps and is dropped out for the remaining 900 steps. We define the normalised score as the agent's score in 1800 steps divided by the expert's score in 900 steps. An agent who can perfectly follow but cannot remember will score 1. An agent who can perfectly follow and can perfectly remember will score 2. Values in between correspond to increasing levels of cultural transmission.
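A minimal sketch of the normalised score and its two reference points:

```python
def normalised_score(agent_score_1800, expert_score_900):
    """Normalised score: agent's 1800-step score over the expert's 900-step score.

    The expert is only present for the first 900 steps, so a perfect
    follower with no recall scores ~1, and a perfect follower that also
    remembers the cycle for the second 900 steps scores ~2.
    """
    return agent_score_1800 / expert_score_900

print(normalised_score(agent_score_1800=10, expert_score_900=10))  # 1.0: follow only
print(normalised_score(agent_score_1800=20, expert_score_900=10))  # 2.0: follow + remember
```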
The space of worlds is parameterised by the size and bumpiness of the terrain (terrain complexity) and the density of obstacles (obstacle complexity). To quantify generalisation over each parameter in this space, we generate tasks with worlds sampled uniformly over the chosen parameter while setting the other parameters to their lowest in-distribution values. Games are then uniformly sampled across the possible numbers of crossings for 5 goals.

From Fig. 7a, we conclude that MEDAL-ADR generalises well across the space of worlds, demonstrating both following and remembering across the majority of the parameter variations considered, including when the world is out-of-distribution.

Fig. 7: a A slice through the world space allows us to disentangle MEDAL-ADR's generalisation capability across different world space parameters. b MEDAL-ADR generalises across the game space, demonstrating remembering capability both inside and outside the training distribution. We report the mean performance across 50 initialisation seeds for a and 20 initialisation seeds for b. The error bars on the graphs represent 95% confidence intervals. Supplementary Movies 14–20 demonstrate generalisation over the world space and game space.

The space of games is defined by the number of goals in the world as well as the number of crossings contained in the correct navigation path between them. To quantify generalisation over this space, we generate tasks across the range of feasible N-goal, M-crossing games in a flat empty world.

Figure 7b shows our agent's ability to generalise across games, including those outside its training distribution. Notably, MEDAL-ADR can perfectly remember all numbers of crossings for the in-distribution 5-goal game. We also see impressive out-of-distribution generalisation, with our agent exhibiting strong remembering in both 4-goal and 0-crossing 6-goal games. Even in complex 6-goal games with many crossings, our agent can still perfectly follow.

Deep learning models are not necessarily readily interpretable. On the other hand, interpretability is often desirable, or even a prerequisite, for deploying AI systems in the real world. Here, we demonstrate that our model is interpretable at the neural level. Training agents to imitate via meta-reinforcement learning embeds the logic for a state machine capable of approximately Bayes-optimal cultural transmission into the neural network's weights (ref. 45). By inspecting a trained agent's memory, we find clearly interpretable individual neurons. These neurons have specialised roles required for solving a new task online via cultural transmission, a subset of the sufficient statistics which drive the state machine (ref. 46). One, dubbed the social neuron, encapsulates the notion of agency; the other, called the goal neuron, captures the periodicity of the task.

To identify the social neuron, we use linear probing (refs. 47,48), a well-known and powerful method for understanding the intermediate layers of a deep neural network. We train an attention-based classifier to predict the presence or absence of an expert co-player from the memory state of the agent. The neuron with the maximum attention weight is defined to be a social neuron, and its activation crisply encodes the presence or absence of the expert in the world (Fig. 8a). Figure 8b shows a stark difference in prediction accuracy for expert presence between differently ablated agents. This suggests that the attention loss (AL) is at least partly responsible for incentivising the construction of socially aware representations.

Fig. 8: a Activations for MEDAL-ADR's social neuron. b We report the accuracy of three linear probing models trained to predict the expert's presence from the belief states of three agents (MED, MEDAL and MEDAL-ADR). We make two causal interventions (in green and purple) and a control check (in red) on the original test set (yellow). We report the mean performance across 10 different initialisation seeds. The small standard deviation error bars suggest a broad consensus across the 10 runs on which neurons encode social information. c Spikes in the goal neuron's activations correlate with the time the agent remains inside a goal (illustrated by coloured shading). The goal neuron was identified using a variance analysis, rather than the linear probing method in b.

To identify the goal neuron, we inspect the variance of memory neuron activations across an episode, finding a neuron whose activation is highly correlated with the entry of the agent into a goal sphere. Figure 8c shows that this neuron fires when the agent enters and remains within a goal sphere. Interestingly, it is not the presence or the following of an expert that determines the spikes, nor the observation of a positive reward. Appendix D.3 contains full details of our methods and results.
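A sketch of both neuron-identification readouts under the assumptions above. The array shapes, names and probe interface are illustrative, not the paper's code: we assume a trained probing classifier exposes one attention weight per memory neuron, and we read the goal neuron as the highest-variance neuron, which is one plausible reading of the variance analysis.

```python
import numpy as np

def find_social_neuron(memory_states, expert_present, attention_weights):
    """Pick the most-attended memory neuron and measure how crisply its
    activation separates expert-present from expert-absent timesteps."""
    idx = int(np.argmax(attention_weights))
    activations = memory_states[:, idx]
    gap = activations[expert_present].mean() - activations[~expert_present].mean()
    return idx, gap

def find_goal_neuron(memory_states):
    """Variance-analysis sketch: the memory neuron whose activation varies
    most across the episode, to be checked against goal-entry times."""
    return int(np.argmax(memory_states.var(axis=0)))

# Synthetic demo: neuron 3 encodes expert presence in the first half.
T, N = 100, 8
rng = np.random.default_rng(0)
expert = np.arange(T) < 50
mem = rng.normal(size=(T, N))
mem[:, 3] += expert * 2.0
attn = np.eye(N)[3]                           # pretend probe attends to neuron 3
print(find_social_neuron(mem, expert, attn))  # (3, gap of roughly 2.0)
```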
More: Learning few-shot imitation as cultural transmission - Nature.com, https://www.nature.com/articles/s41467-023-42875-2