Posted by Craig, July 2017

21-25 July 2017: Repositive presents cutting-edge developments in Prague

In just one week, our team heads to Czechia to present to the computational biology students and open source bioinformaticians of Prague and beyond. Five days, three Repositive experts and three events: summer certainly doesn't slow down a startup.

So, what can you expect from the world's genomic data experts?

ECCB - Student Council Symposium

Our tour of Prague kicks off with the ECCB - Student Council Symposium. Senior Bioinformatics Scientist Richard Shaw and CEO / Founder Fiona Nielsen speak to students, postdocs and young researchers in the fields of computational biology and bioinformatics. In her talk, Fiona will explain how she went from the lab to founding Repositive and give advice to young researchers looking to become entrepreneurs themselves.

Talk Title: A bioinformatician spark which led to a new platform for sharing genomic data

Time: 16:00 - 16:30

Speaker: Fiona Nielsen


Speaker Bio: Bioinformatics scientist specialised in genome analysis with 15 years of experience in software development and project management. Fiona left her job at Illumina Cambridge in 2013 to pursue her vision of enabling efficient genomic data sharing and founded the charity DNAdigest. In August 2014 she founded Repositive as a spin-out of DNAdigest.

https://www.youtube.com/watch?v=Mel97gQFodw&t=1342s https://www.youtube.com/watch?v=u-0wZ7UlIv0


Following on from the Student Council Symposium, Fiona and Richard head straight to ISMB/ECCB (European Conference on Computational Biology), where they continue to educate academic and corporate researchers alongside Intel, BMC Bioinformatics and Overleaf. Following the release of our 23andMe Data Collection, and in parallel with a paper being written by Scientific Lead Manuel Corpas, Richard presents a poster exploring dataset dynamics and limitations when mapping the genotypes of the 23andMe datasets on Repositive onto the 1000 Genomes principal components.

Poster Title: Prospecting in Contributed Personal Genomic Data

Poster slot number: Session B-450

Poster Presenter: Richard Shaw


Authors: Richard Shaw, Dennis Schwartz, Manuel Corpas and Fiona Nielsen

Poster Abstract: Open-access personal genomic data contributed directly by the individuals genotyped is a growing resource but, as with human genomic data in general, its storage is fragmented across multiple sites. Using the Repositive human genomic metadata aggregation platform (https://discover.repositive.io/?ECCB2017), we explored the landscape of such genomic data, within the scope of SNP array genotypes generated by a prominent provider (23andMe™).

Our approach was to search for metadata containing the name of the provider, download the corresponding (3137) data files and then filter out those files that did not match the format of interest (GRCh37 23andMe genotypes) or that appeared corrupted. An initial principal component analysis revealed that 122 of the 2402 remaining files were from the same individual as other genotypes in the dataset. Some corresponded to identical files submitted multiple times to the same or different repositories, but others to different versions of the same genotype.

Mapping the deduplicated set of 2280 genotypes onto principal component axes generated from a set of African, Asian and European genotypes from 1000 Genomes populations and then applying nearest neighbour classification showed that the dataset is composed predominantly of European ancestry genotypes. Promethease (https://www.snpedia.com/index.php/Promethease) analyses of these genotypes revealed, among other traits, a preponderance of male individuals.

With this analysis we have shown that it is possible to collect and aggregate a large dataset from open access data available across multiple data sources. The examination of the data may be useful in further investigations into linking genotype and phenotype.
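For readers curious about the projection and nearest neighbour step described in the abstract, here is a minimal sketch of the general technique. It is not the authors' pipeline: the function names, the 0/1/2 allele-count encoding, the population labels and the synthetic data are all assumptions made for illustration, and real genotype files would also need handling for missing calls and SNP matching.

```python
import numpy as np

def fit_pca(reference, n_components=2):
    """Fit principal axes on a (samples x SNPs) matrix of allele counts (0/1/2)."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal axes of the centred reference matrix.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(genotypes, mean, axes):
    """Map genotypes onto axes that were fitted on the reference panel."""
    return (genotypes - mean) @ axes.T

def nearest_neighbour_labels(ref_coords, ref_labels, query_coords):
    """Give each query sample the label of its closest reference sample."""
    diffs = query_coords[:, None, :] - ref_coords[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    return [ref_labels[i] for i in sq_dists.argmin(axis=1)]

# Synthetic stand-in for a reference panel: two hypothetical populations
# with very different allele frequencies across 50 SNPs.
rng = np.random.default_rng(0)
reference = np.vstack([
    rng.binomial(2, 0.1, size=(20, 50)),
    rng.binomial(2, 0.9, size=(20, 50)),
]).astype(float)
reference_labels = ["POP_A"] * 20 + ["POP_B"] * 20

# Synthetic "contributed" genotypes drawn from the first population.
contributed = rng.binomial(2, 0.1, size=(5, 50)).astype(float)

mean, axes = fit_pca(reference)
predicted = nearest_neighbour_labels(
    project(reference, mean, axes),
    reference_labels,
    project(contributed, mean, axes),
)
print(predicted)
```

The axes are fitted on the reference panel only and the contributed genotypes are projected afterwards, so the contributed data cannot shift the axes used for classification.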

BOSC 2017

Our tour of Prague ends with BOSC 2017 (https://www.open-bio.org/wiki/BOSC_2017#Sponsors), the Bioinformatics Open Source Conference, where Repositive is proud to be a sponsor alongside friends The Hyve, SevenBridges and GigaScience, and hopes to make new friends with fellow sponsors Mozilla Science Lab and eLIFE. Bioinformatics Developer Dennis Schwartz joins other open source advocates within the biological research community to promote the practice and philosophy of open source software development and open science.


Connect with Repositive

If you would like to meet us, find out more about our services, or just pick up some awesome Liz merchandise, come and find us in Prague.

Posted by
Craig Smith
Marketing Manager