Yesterday I attended my first-ever hackathon, held at the Natural History Museum in London. What on earth is a hackathon? Well, it’s when a bunch of computer programmers and others collaborate intensively on software projects. This one was organised by the PREDICTS and Living Planet Index (LPI) projects to develop ways to make it easier to find potential sources of data to add to their models of biodiversity. PREDICTS and the LPI use published data on species diversity and populations, respectively, to measure the state of the world’s biodiversity. There is lots of data out there, but finding it, checking that it meets the projects’ criteria and extracting it is laborious. This Biodiversity Hackathon brought together computer programmers and biodiversity experts to begin to solve these problems.
After an introduction to PREDICTS, LPI and the problems that need solving, ideas were pitched, groups formed and the hacking started. I was in a group exploring whether connections between the authors of papers already in the databases could be used to find other relevant papers. As my computer programming skills are still rudimentary, I was worried I would not be very useful, but my experience with biodiversity modelling and the PREDICTS project (or domain knowledge, as it is known) meant I could answer the programmers’ questions about the sort of papers we were looking for and advise on some of the social networking sites that researchers use, such as ResearchGate.
By manually searching for a sample of authors from the LPI database, we were able to find relevant papers by their co-authors, showing that the concept seemed to work. Automating the process proved more difficult, as few of the websites had APIs that would allow us to access the data, although it may be possible to use web scraping techniques.
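To make the co-author idea a bit more concrete, here is a minimal sketch in Python. It is not the code our group wrote on the day: the `candidate_papers` function, the paper identifiers and the author names are all made up for illustration, and it assumes the author lists have already been collected somehow (by hand, through an API or by scraping).

```python
from collections import Counter


def candidate_papers(papers, seed_ids):
    """Rank papers outside the seed set by how many seed authors they share.

    `papers` maps a paper id to its list of authors (hypothetical data).
    `seed_ids` is the set of papers already in the database.
    """
    # Collect every author who appears on a paper we already have.
    seed_authors = set()
    for paper_id in seed_ids:
        seed_authors.update(papers[paper_id])

    # Score each remaining paper by the number of authors it shares.
    scores = Counter()
    for paper_id, authors in papers.items():
        if paper_id in seed_ids:
            continue
        shared = seed_authors.intersection(authors)
        if shared:
            scores[paper_id] = len(shared)

    # Most shared authors first; ties broken alphabetically by paper id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented example data, not taken from the LPI or PREDICTS databases.
papers = {
    "paper_a": ["Author One", "Author Two"],
    "paper_b": ["Author Two", "Author Three"],
    "paper_c": ["Author One", "Author Two", "Author Four"],
    "paper_d": ["Author Five"],
}
ranked = candidate_papers(papers, {"paper_a"})
print(ranked)
```

Papers that share more authors with the existing database come out higher in the list, which is roughly what we were doing by hand. A person would still need to check each candidate against the projects’ criteria.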
At the end of the day, each group presented what they had been working on and how far they had got. I was really impressed with what people had managed to do in just a few hours and was particularly excited by the machine learning techniques used to suggest papers based on what was already in the database. These allow computers to become better at finding suitable papers as they are given more examples of which papers are suitable and which are not. As I am just starting to look for papers for my own project, I am really keen to try this sort of programming and save some time searching.
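I do not know exactly which methods the groups used, so the following is only my own toy illustration of the general idea of learning from labelled examples. It is a very small naive Bayes classifier written in plain Python; the function names `train` and `suitability_score` and all of the paper titles are invented for this sketch.

```python
import math
from collections import Counter


def train(labelled_titles):
    """Count words per label from (title, is_suitable) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for title, label in labelled_titles:
        doc_counts[label] += 1
        word_counts[label].update(title.lower().split())
    return word_counts, doc_counts


def suitability_score(title, word_counts, doc_counts):
    """Log-odds that a title is suitable (positive means 'probably suitable').

    Uses add-one smoothing and ignores words never seen in training.
    Assumes the training data contains at least one example of each label.
    """
    vocab = set(word_counts[True]) | set(word_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    score = math.log(doc_counts[True] / doc_counts[False])
    for word in title.lower().split():
        if word not in vocab:
            continue
        p_suitable = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_unsuitable = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_suitable) - math.log(p_unsuitable)
    return score


# Invented training examples: True = suitable, False = not suitable.
labelled = [
    ("bird species richness in logged tropical forest", True),
    ("beetle abundance across farmland and woodland sites", True),
    ("a new algorithm for image compression", False),
    ("survey of mobile network protocols", False),
]
word_counts, doc_counts = train(labelled)
likely = suitability_score("species abundance in tropical farmland", word_counts, doc_counts)
unlikely = suitability_score("image compression for mobile network", word_counts, doc_counts)
print(likely, unlikely)
```

With only four examples this is far too small to be useful, but it shows the principle: each extra labelled paper changes the word counts, so the scores should get more reliable as more data is added.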
I had a really enjoyable and inspiring day. Hopefully we will be able to continue to develop the ideas, and there will be more biodiversity hackathons in the future!
Check out my Storify to see Tweets and photos from the day: