Imputed genomes and haplotype-based analyses of the Picts of early medieval Scotland reveal fine-scale relatedness between Iron Age, early medieval and the modern people of the UK

The origins and ancestry of the Picts of early medieval Scotland (ca. AD 300-900) have traditionally been seen as a problem, prompted in part by exotic medieval origin myths, their enigmatic symbols and inscriptions, and the meagre textual evidence. The Picts, first mentioned in the late 3rd century AD, resisted the Romans and went on to form a powerful kingdom that ruled over a large territory in northern Britain. In the 9th and 10th centuries, Gaelic language, culture and identity became dominant, transforming the Pictish realm into Alba, the precursor to the medieval kingdom of Scotland. To date, no comprehensive analysis of Pictish genomes has been published, and questions about their biological relationships to other cultural groups living in Britain remain unanswered. Here we present two high-quality Pictish genomes (2.4 and 16.5X coverage) from central and northern Scotland dated from the 5th-7th century, which we impute and co-analyse with >8,300 previously published ancient and modern genomes. Using allele frequency and haplotype-based approaches, we can firmly place the Pictish genomes within the Iron Age gene pool in Britain and demonstrate local biological affinity. We also demonstrate the presence of population structure within Pictish groups, with Orcadian Picts being genetically distinct from their mainland contemporaries. When investigating Identity-By-Descent (IBD) with present-day genomes, we observe broad affinities between the mainland Pictish genomes and the present-day people living in western Scotland, Wales, Northern Ireland and Northumbria, but less with the rest of England, the Orkney islands and eastern Scotland, where the political centres of Pictland were located. The pre-Viking Age Orcadian Picts evidence a high degree of IBD sharing across modern Scotland, Wales, Northern Ireland, and the Orkney islands, demonstrating substantial genetic continuity in Orkney for the last ~2,000 years.
Analysis of mitochondrial DNA diversity at the Pictish cemetery of Lundin Links (n = 7) reveals absence of female endogamy, with implications for broader social organisation. Overall, our study provides novel insights into the genetic affinities and population structure of the Picts and direct relationships between ancient and present-day groups of the UK.


Introduction
We retrieved DNA from eight individuals, one from Balintore (Easter Ross) and seven from Lundin Links (Fife), representing the northern and southern parts of Pictland (Fig 1, Table 1, S1 Table, S1.1 Text). Two individuals, BAL003 and LUN004, were treated with the USER enzyme to remove post-mortem deamination (34) and were shotgun sequenced to medium and high coverage (2.4 and 16.5X) (Table 1, S1 Table). Seven individuals from Lundin Links were shotgun sequenced to a sufficient mitochondrial DNA (mtDNA) coverage for subsequent analyses (3.47-195.05X, Table 1). LUN001 and LUN003 are excluded from the population genetics analysis involving autosomal DNA as we found evidence for library index misassignment in the autosomal data, but not the mitochondrial data (S1.2 Text). Ten samples from Lundin Links (including LUN001, LUN002, LUN003 and LUN009) and three from Balintore were radiocarbon dated to the 5th-7th century AD (Table 1,

The mitochondrial haplogroups observed in the samples are common in present-day north-western Europeans, with the sub-clade J1c3 being identified in three individuals out of eight (Table 1, S5 Table).

To investigate population affinities of the individuals from Pictland, we performed Principal Component Analysis (PCA) and ADMIXTURE analyses on a dataset comprising present-day Europeans, the newly imputed genomes and the imputed ancient genomes from Margaryan et al. (9) Table). The PCA shows that the ancient individuals from Britain broadly fit within present-day diversity (Fig 2A). However, we notice some variability among these individuals as BAL003 and LUN004 fall within the modern Welsh cluster, but with BAL003 being notably closer to the present-day Scottish,

Fig 3). Included in this cluster are also Viking Age individuals from Britain, Iceland and Scandinavia; the latter likely correspond to individuals buried in Scandinavia but whose parents were from a British-like gene pool, consistent with results in Margaryan et al. (9). Based on outgroup-f3, the individuals from Orkney, Scotland, and England, dated from the Iron Age to medieval period, are symmetrically related to each other (S5 Fig). However, we also note that BAL003, but not LUN004, shows multiple instances of IBD sharing >4 cM with early medieval individuals from England (S21 Fig), which is also reflected in their relative position in the PCA (Fig 2A), implying substantial shared ancestry and possibly recent gene-flow from a source genetically similar to those samples. This implies that we cannot consider individuals from Pictland a homogeneous genetic group but instead a complex mixture of contemporary genetic ancestries.

The unlinked approach implemented in the ADMIXTURE analysis also reveals a minor but detectable genetic structure consistent with results from the PCA (Fig 2A)

Isles due to extensive admixture with Scandinavians, recent genomic research shows that genetic drift also played an important role (11,42).
This is consistent with our results that show a high proportion of shared IBD segments among modern Orcadians (>1 to >6 cM, S19 Fig

show that all early medieval individuals (excluding I0777) share more IBD with modern Danish than with any other present-day population (Fig 3), suggesting genetic continuity between modern-day Danish and the ancestors of these individuals (S1.6 Text).

The analysis also revealed high IBD sharing between early medieval individuals from England and present-day people across Britain following a southeast/northwest cline (Fig 4 and S22 Fig). This pattern suggests that Anglo-Saxon ancestry expanded out of south-eastern England followed by admixture with local populations, which is a scenario consistent with previous research (11,14,17,42,43). BAL003 and LUN004 share a high proportion of IBD segments with present-day people from western Scotland, Wales and Northern Ireland, similar to the individuals from Late Iron Age Orkney and England (Fig 4 and S22 Fig). However, unlike these individuals, LUN004, and to a lesser extent BAL003, shares relatively few IBD segments with the present-day eastern Scottish population sample (Fig 4 and S22 Fig). Byrne et al. (43) and Gilbert et al. (42) previously suggested that the genetic structure between western and eastern Scotland could result from the divide between the kingdoms of the Gaelic-speaking Dál Riata in the west and Picts in the east, which is seemingly in contradiction with the results presented here. Instead, the present-day genetic structure in Scotland likely results from more complex demographic processes that cannot be reduced to a single model.

We propose two non-exclusive processes that might explain the observed pattern of IBD sharing between the Iron Age and early medieval populations and the present-day Scottish population.
The first is substantial admixture from immigrants that brought Iron Age Orcadian- and England-like ancestries (likely independently), which partially replaced the eastern Scottish early medieval gene pool. Indeed, in the following centuries (AD 1100-1300), eastern Scotland received substantial immigration, such as settlers from Britain south of the Forth, France, and the Low Countries (44-46). Under this scenario, BAL003 and LUN004 are good representatives of the broader ancestry present in Scotland during the Pictish period. Alternatively, the ancestors of BAL003 and LUN004 share more IBD segments with present-day people from western Scotland, Wales, and Northern Ireland because they (or their direct ancestors) migrated from these regions but did not contribute substantially to later generations via admixture with local groups in eastern Scotland. This scenario is consistent with an emerging picture of west-east lifetime mobility of both males and females in the early medieval period in Scotland (47,48). Under such a model, it may be feasible that there are indeed still undiscovered 'pockets' of eastern Pictish-period ancestry, likely similar to that observed in Iron Age Orcadians, that was differentiated from ancestry carried by BAL003 and LUN004 and which contributed significantly to present-day populations from eastern Scotland. Oxygen and strontium isotope analysis of teeth from these individuals holds promise to characterise this further. Importantly, we also emphasise that stochasticity likely affected the pattern of IBD sharing in such a small sample size. Indeed, high variability in IBD sharing is observed amongst individuals from the early medieval and Iron Age groups, and to some extent between BAL003 and LUN004 (S22 Fig).
Our results also show substantial IBD sharing between Iron Age, Viking Age and present-day Orcadians, supporting our observations using allele-frequency based methods of strong genetic continuity in this region over time (Fig 2, 4 and S22 Fig). Therefore, the marked genetic differentiation between Orkney and mainland Britain is not only a result of Scandinavian admixture, as previously hypothesised (11,42,50-53), but also of pronounced genetic continuity that persisted for at least 2,000 years. The relatively low IBD sharing between BAL003 and LUN004 and modern-day Orcadians (Fig 4) suggests the emergence of Pictish culture in Orkney (21,22,36) was not associated with population replacement but largely due to cultural diffusion and connections.

IBD segments in Iron Age individuals from south-eastern England are widespread throughout western and northern Britain compared to the more recent Romano-British individuals from northern England; the latter, however, do not share substantial IBD with any present-day people of the British Isles (Fig 4 and S22 Fig). The only exception is 6DT3, who was from the same genetic population as two early medieval individuals (I0159 and I0773) with Scandinavian- and northern European-like ancestry ('pop12', S22 Fig, S1.6 Text). 6DT3 also shares relatively more IBD segments >1 cM with the present-day population from Scandinavia, Belgium and the UK (Fig 3), suggesting that Scandinavian-like ancestry could have spread to the British Isles before the Anglo-Saxon period.
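As an illustration of how IBD sharing of the kind discussed above might be summarised, the following minimal sketch totals the length of shared segments above a centimorgan threshold for each ancient individual and present-day population, then divides by the number of present-day samples in that population. The function name, the tuple layout and the toy identifiers and values are hypothetical and chosen for illustration only; this is not the pipeline used in the study.

```python
from collections import defaultdict


def mean_ibd_per_sample(segments, pop_sizes, min_cm=1.0):
    """Mean total IBD length (cM) shared per present-day sample.

    segments  : iterable of (ancient_id, population, length_cm) tuples
                (hypothetical layout, one tuple per detected segment)
    pop_sizes : dict mapping population name -> number of modern samples
    min_cm    : only segments strictly longer than this are counted
    """
    totals = defaultdict(float)
    for ancient_id, population, length_cm in segments:
        if length_cm > min_cm:
            totals[(ancient_id, population)] += length_cm
    # Normalise by sample size so that large reference panels
    # do not look artificially closer to the ancient genome.
    return {
        key: total / pop_sizes[key[1]]
        for key, total in totals.items()
    }


# Toy input (made-up values, not data from the study).
toy_segments = [
    ("ancient_A", "west", 1.5),
    ("ancient_A", "west", 4.5),
    ("ancient_A", "east", 0.8),  # below threshold, ignored
    ("ancient_A", "east", 2.0),
]
toy_sizes = {"west": 2, "east": 4}
toy_result = mean_ibd_per_sample(toy_segments, toy_sizes)
```

Raising `min_cm` (for example to 4 or 6) restricts the summary to longer segments, which tend to reflect more recent shared ancestry.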

Social organisation
Seven mtDNA genomes were retrieved at Lundin Links, which allows us to answer questions about the Pictish social organisation reflected in the individuals interred at the site. The use of the cemetery was relatively short, likely around 130 years (S2 Fig), and the individuals excavated were adults (S1 Table) (35). The diversity of mtDNA lineages was high, and none of the individuals shared an immediate maternal ancestor (S5 Table). It is worth noting that the two individuals retrieved from the horned cairns complex (S1 Fig) show evidence of familial links based on skeletal morphology (LUN001 and LUN009) (35), but are not maternally related (S5 Table). In a matrilocal system, which is typical of matrilineal descent, low female post-marital migration and high male migration decrease female mtDNA diversity (54-57). The high mtDNA diversity observed here therefore suggests that the individuals buried at Lundin Links were unlikely to have been practising matrilocality. Ongoing isotope analyses focused on the movement histories of the Lundin Links individuals using strontium, oxygen and other isotope approaches may further characterise sex-specific mobility. Additional Y-chromosome analyses will also help confirm whether patrilocality or neolocality was more common in Pictish society (54).

Seventy per cent of matrilocal societies are associated with a matrilineal system (58). Thus, it is unlikely that the community at Lundin Links followed a matrilineal inheritance system, which challenges older arguments for matrilineal succession among Pictish rulers (59). However, while some individuals buried at Lundin Links may have been of elevated social status, the relationship between people buried in monuments such as these and the Pictish uppermost elite is uncertain.
The cemetery evidences a wide diversity of cultural practices (35), mirrored in the high mitochondrial diversity, suggesting relatively high levels of mobility within the Pictish social structure at this level of society. The burials are organised in complex and stand-alone graves, made of round and square cairns and long cists (S1 Fig). This complexity suggests that, as social practices influence the genetic structure of populations, the social status of archaeological sites can, in turn, bias our understanding of population structure; in this case, the samples may only be representative of a small proportion of the overall Pictish population. Non-harmonious kinship systems (i.e., patrilocal and patrilineal or matrilocal and matrilineal societies) may also impact the genome in different ways. The lack of a broad sample and of useful markers (Y-chromosome) to enhance kinship- and mtDNA-based findings remains an obstacle to further illuminating Pictish descent patterns.

Our study provides novel insight into genetic affinity between ancient and modern populations of the British Isles, a rare opportunity to directly observe micro-scale evolution. High-quality genomes of two individuals buried in Scotland from the Pictish period, one from Balintore (BAL003) and one from Lundin Links (LUN004), reveal a close genetic affinity to Iron Age populations from Britain but with evidence of some genetic differentiation between samples. Overall, our data supports the current archaeological consensus arguing for regional continuity between the Late Iron Age and early medieval periods, but likely with complex patterns of migration, lifetime mobility and admixture. We also show that BAL003 and LUN004 were genetically differentiated from the pre-Viking Age Picts from Orkney, which suggests that Pictish culture spread to Orkney from Scotland primarily via cultural diffusion rather than direct population movement or inter-marriage. We detect strong continuity between local Iron Age and present-day peoples in Orkney but less pronounced affinity between early medieval and modern people in eastern Scotland. More ancient genomes from the Iron Age and early medieval periods in the UK are necessary to illuminate these relationships further, combined with analyses of lifetime mobility using complementary approaches (e.g., isotope analysis). On a more local level, our mtDNA analysis of individuals interred at Lundin Links is inconsistent with matrilocality. This finding argues against the older hypothesis that Pictish succession was based on a matrilineal system, assuming that wider Pictish society was organised in such a manner.

and sequenced on an Illumina HiSeq2000 platform.
BAL003 and LUN004 libraries were generated using enzymatic damage repair (34), from the same extracts as BAL003-b1e1l1 and LUN004-b1e1l1, respectively, and sequenced over 5 lanes for LUN004 and 6 lanes for BAL003.

Sequence processing and alignment
We discarded reads with indexes showing at least one mismatch. Read pairs were merged and adapter sequence removed using AdapterRemoval v2.1.7 (62), with a minimum overlap of 11 bp and summing base qualities in overlapping regions. Merged read pairs were mapped as single-end reads to the human reference genome build 37 with decoy sequences (hs37d5) using BWA aln v0.7.8 (63) with the non-default parameters -n 0.01 (maximum edit distance) and -o 2 (maximum number of gap opens), allowing more mismatches and indels, and disabled seeding with -l 16500 as in Lazaridis et al. (2014) (1) and Skoglund et al. (2014) (64). We collapsed duplicate reads having identical start and end coordinates into consensus reads using FilterUniqueSAMCons.py (65). Finally, we filtered the alignment so that only reads longer than 35 bp, with mapping quality >30, not containing indels, and with more than 90% matches with the reference were retained. We merged libraries sequenced over several lanes using SAMTOOLS v1.9 (63). Summary statistics of the obtained reads are presented in S1 Table.

The Y-chromosome makes up 2% of the genome (69).
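The final alignment filter described under sequence processing (length >35 bp, mapping quality >30, no indels, more than 90% matches with the reference) can be sketched as a simple predicate. This is a minimal illustration under stated assumptions, not the authors' script: the `Read` record, its `mismatches` field (imagined as coming from an edit-distance tag) and the function name `keep_read` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Read:
    """Simplified alignment record (hypothetical, for illustration)."""
    sequence: str
    mapq: int
    cigar: str
    mismatches: int  # assumed to be known, e.g. from an edit-distance tag


# Matches CIGAR operations such as "50M" or "1I".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def keep_read(read, min_len=35, min_mapq=30, min_identity=0.90):
    """Return True if the read passes all four filters from the text."""
    if len(read.sequence) <= min_len:   # must be longer than 35 bp
        return False
    if read.mapq <= min_mapq:           # mapping quality must exceed 30
        return False
    ops = CIGAR_OP.findall(read.cigar)
    if any(op in "ID" for _, op in ops):  # no insertions or deletions
        return False
    aligned = sum(int(n) for n, op in ops if op in "M=X")
    if aligned == 0:
        return False
    identity = (aligned - read.mismatches) / aligned
    return identity > min_identity      # more than 90% matching bases


# Toy reads (made-up values).
good = Read("A" * 50, 37, "50M", 2)
short = Read("A" * 30, 37, "30M", 0)
low_mapq = Read("A" * 50, 30, "50M", 0)
with_indel = Read("A" * 50, 37, "25M1I24M", 0)
divergent = Read("A" * 50, 37, "50M", 6)
```

All thresholds are strict inequalities, following the wording of the text ("longer than", ">30", "more than 90%").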

The biological sex of sequenced individuals was determined using the Ry parameter (70). Ry is the fraction of the Y-chromosome alignments (ny) compared to the total number of reads aligned to the X- and Y-chromosomes (nx + ny). The 95% confidence interval (CI) was computed as $R_y \pm 1.96 \times \sqrt{R_y (1 - R_y) / (n_x + n_y)}$. This method establishes whether an ancient individual can be assigned as male or female. If the lower CI limit of Ry is >0.077 the individual is assigned as male. If the upper CI limit of Ry is <0.016 the individual is assigned as female.

Mitochondrial and Y-chromosome haplogroups
We obtained the consensus mitochondrial DNA from endogenous reads, removing the bases with quality <30 (-q 30) using Schmutzi (67). The mitochondrial haplogroups were assigned using Haplogrep2 (71). The Y-chromosome haplogroup was obtained using pathPhynder (72)

Age/early medieval individual from Slovakia (DA119) (77); 8) one early medieval individual from the Czech Republic (RISE569) (3); 9) two pre-Christian individuals from Iceland (SBT-A1 and DAV-A8) (78) (S1.4 Text, S7 Table).
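The Ry-based sex assignment described earlier in this section can be sketched in a few lines. This is a minimal illustration, assuming the usual normal-approximation standard error for a proportion, sqrt(Ry(1 - Ry)/(nx + ny)); the function name `assign_sex` and the "undetermined" label for intermediate cases are our own choices, and the read counts in the usage lines are made up.

```python
import math


def assign_sex(n_x, n_y, z=1.96, male_min=0.077, female_max=0.016):
    """Return (ry, ci_low, ci_high, call) from X and Y read counts.

    n_x, n_y : number of reads aligned to the X and Y chromosomes
    z        : normal quantile for the CI (1.96 gives a 95% interval)
    """
    total = n_x + n_y
    if total == 0:
        raise ValueError("no reads aligned to X or Y")
    ry = n_y / total
    # Standard error of a proportion (assumed form of the CI).
    se = math.sqrt(ry * (1.0 - ry) / total)
    ci_low, ci_high = ry - z * se, ry + z * se
    if ci_low > male_min:
        call = "male"
    elif ci_high < female_max:
        call = "female"
    else:
        call = "undetermined"
    return ry, ci_low, ci_high, call


# Toy read counts (not data from the study).
call_a = assign_sex(9000, 1000)[3]
call_b = assign_sex(9950, 50)[3]
call_c = assign_sex(95, 5)[3]
```

With very few reads the interval widens and the call falls between the two thresholds, which is why low-coverage samples can remain unassigned.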

The EU and UK datasets were phased together with the genomes from Margaryan et al. (9) (S7 Table, S1.5 Text) and the re-imputed and newly imputed ancient genomes using BEAGLE 5.2 (79). We restricted the phasing to the intersection of the genotypes newly imputed in this study and those imputed in Margaryan et al. (9) to prevent BEAGLE 5.2 (79) from imputing sporadic missing genotypes. The window and overlap lengths were set as wider than any chromosome (

Table). VA, Viking Age.

populations. We tested for symmetry as D (Pict, ancient England; X, Mbuti) and plotted the resulting Z-scores. Details on the test results and sample size are in S10 Table.
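For readers unfamiliar with the symmetry test D(Pict, ancient England; X, Mbuti), the sketch below computes a D-statistic from per-site allele frequencies using the commonly used frequency-based definition, and derives a Z-score with a simple unweighted delete-one-block jackknife. It is an illustration under stated assumptions rather than the software used in the study: the function names, the tuple layout and the toy frequencies are hypothetical, and production tools typically use a weighted block jackknife over genomic blocks.

```python
import math


def d_statistic(sites):
    """D(P1, P2; P3, O) from (p1, p2, p3, o) allele-frequency tuples."""
    num = sum((p1 - p2) * (p3 - o) for p1, p2, p3, o in sites)
    den = sum(
        (p1 + p2 - 2 * p1 * p2) * (p3 + o - 2 * p3 * o)
        for p1, p2, p3, o in sites
    )
    return num / den


def jackknife_z(blocks):
    """Return (D, Z) using an unweighted delete-one-block jackknife.

    blocks : list of lists of site tuples, one inner list per block.
    """
    d_full = d_statistic([s for block in blocks for s in block])
    n = len(blocks)
    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic([s for j, block in enumerate(blocks) if j != i
                     for s in block])
        for i in range(n)
    ]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean) ** 2 for d in leave_one_out)
    return d_full, d_full / math.sqrt(variance)


# Toy frequencies (made up): P1 and P2 identical, so D is exactly zero.
symmetric = [(0.5, 0.5, 0.3, 0.0), (0.2, 0.2, 0.9, 0.1)]
# Toy blocks in which P1 consistently shares more alleles with P3.
toy_blocks = [
    [(0.9, 0.1, 0.8, 0.0)],
    [(0.7, 0.2, 0.9, 0.1)],
    [(0.6, 0.4, 0.7, 0.0)],
]
toy_d, toy_z = jackknife_z(toy_blocks)
```

A D close to zero (small absolute Z) is consistent with P1 and P2 being symmetrically related to P3, whereas a positive value indicates excess allele sharing between P1 and P3.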