At CARTO we take pride in empowering the next generation of researchers and data-lovers to create projects that push the boundaries of how we understand our world. Recently, we shared our technology with Hannan Abid and Julian Moulton, two students transcribing, cleaning, and digitizing data at the American Museum of Natural History related to the historic species collection expeditions conducted in the Solomon Islands. Check out how Hannan and Julian used CARTO in their project!
Hannan Abid & Julian Moulton
The Pacific Ocean covers over 30% of the Earth’s surface, which is more than all the landmasses combined. Scattered in this vast region are thousands of islands and archipelagos that at the beginning of the 20th century were virtually unknown to scientists. In order to document the biodiversity of this area, and particularly of the region’s bird life, the American Museum of Natural History (AMNH) launched the Whitney South Sea Expedition (WSSE). During the expedition, which lasted from 1920 to 1935, some 40,000 bird specimens were collected from over 600 islands, and dozens of new species and subspecies were discovered and described (Figure 1). The majority of these specimens are housed today in the Department of Ornithology at the AMNH. These specimens are critical to our understanding of the region’s biodiversity, but few of them were cataloged with an accurate georeference as most specimen records included only a general locality such as the island name. This missing information hampers the use of these specimens for modern scientific research.
Fortunately, the participants in the WSSE maintained detailed field journals, which are deposited in the AMNH archives. Using these unpublished journals, the logbook of the France (the expedition's ship), specimen labels, and handwritten catalogs, we matched individual specimens to precise geographic locations along the expedition's route while referencing maps, gazetteers, and online sources. By discovering the collecting locality for each specimen, we hoped to depict early 20th-century avian diversity visually.
Because of time limitations, we focused our study on specimens collected within the Solomon Islands. The Solomon Islands consist of six major islands and over 900 smaller islands that lie to the east of New Guinea in the region known as Melanesia (Fig. 1). Although the island of Bougainville is geographically part of the Solomon Islands, politically it is part of Papua New Guinea, and thus excluded from this study. The Solomon archipelago has played a vital role in studies in evolutionary biology, notably Ernst Mayr and Jared Diamond’s The Birds of Northern Melanesia. In recent years, the AMNH has renewed interest in the Melanesian islands and several AMNH expeditions have resampled the islands.
Our objectives were to update specimen collecting information in the AMNH database and to create a map that visualized the route of the Whitney South Sea Expedition through the Solomon Islands. The AMNH database was generated by transcribing specimen data from handwritten ledgers. These ledgers were based on data transcribed from the original specimen labels, which were written by the collectors in the field. Because this data was transcribed several times from handwritten documents, errors were sometimes introduced into our sample. Additionally, the lack of standardized geographic names (alternate spellings, abbreviated references, etc.) meant that a given locality could be recorded in several different ways. Similarly, taxonomic names were not standardized, and sometimes were missing entirely. Figure 3 below describes the steps that were taken to generate the current database for specimens collected on the WSSE.
The first step in our data cleanup was standardizing the legacy geographic data. We extracted around 8,000 locality records from the AMNH database. This data contained 809 unique values, but many proved redundant as a result of the previously mentioned lack of standardization. The AMNH database contains multiple hierarchical locality fields, such as country, island, political subdivision, and precise locality. During data entry it was not always obvious in which field a particular value should be entered, and consequently many values were misassigned. We cleaned, standardized, and parsed this data down to 154 unique locality records.
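The kind of standardization described above can be sketched in a few lines of Python. This is a minimal illustration and not the workflow we actually used: the alias table, the `normalize_key` and `standardize` helpers, and the sample values are all hypothetical. The idea is to reduce each raw value to a cleaned key, look it up in a hand-built table of known variants, and set aside anything unrecognized for manual review.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical alias table (illustrative values, not taken from the AMNH database):
# cleaned key -> standardized locality name.
ALIASES = {
    "guadalcanal": "Guadalcanal",
    "guadalcanar": "Guadalcanal",
    "guadalcanal i": "Guadalcanal",
    "ysabel": "Santa Isabel",
    "santa isabel": "Santa Isabel",
    "sta isabel": "Santa Isabel",
}


def normalize_key(raw: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def standardize(raw: str) -> Optional[str]:
    """Return the standardized name, or None if the value needs manual review."""
    return ALIASES.get(normalize_key(raw))


# Made-up raw values showing alternate spellings and abbreviations.
raw_values = ["Guadalcanar", " GUADALCANAL I.", "Ysabel", "Sta. Isabel", "Unknown Bay"]
standardized = {value: standardize(value) for value in raw_values}
unique_names = {name for name in standardized.values() if name is not None}
needs_review = [value for value, name in standardized.items() if name is None]
```

Returning `None` for unknown values, rather than guessing, keeps ambiguous entries visible so that a person can resolve them against maps and gazetteers.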
These new records gave us general information about the locations visited by the WSSE. Although this information was a useful start, our project required specific locality records that could be georeferenced. A database containing such data did not exist at the time, so we decided to create one. We read through the unpublished field journals of the crewmembers of the WSSE and the logbook of the France. From these sources we were able to extract specific geographic and temporal data (Fig. 2), which we compiled in a spreadsheet with a row for each date and a column for each crewmember's notes on geography. This enabled us to associate dates with the specific localities visited by the WSSE.
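For readers who prefer code to spreadsheets, the date-by-crewmember layout can be approximated as a nested dictionary. The sketch below is only an illustration under assumed names: `build_date_table`, `conflicting_dates`, the observer labels, and the entries are hypothetical and do not come from the journals. It groups journal entries by date and flags days on which the sources disagree, which are the days that would need a closer reading.

```python
from collections import defaultdict
from datetime import date


def build_date_table(entries):
    """Group (day, observer, place) journal entries into {day: {observer: place}}."""
    table = defaultdict(dict)
    for day, observer, place in entries:
        table[day][observer] = place
    return dict(table)


def conflicting_dates(table):
    """Return, in date order, the days on which observers recorded different places."""
    return sorted(day for day, notes in table.items() if len(set(notes.values())) > 1)


# Placeholder entries: observers and localities are invented for illustration.
entries = [
    (date(1927, 6, 1), "observer_a", "Locality A"),
    (date(1927, 6, 1), "ship_log", "Locality A"),
    (date(1927, 6, 2), "observer_a", "Locality B"),
    (date(1927, 6, 2), "ship_log", "Locality C"),
]
table = build_date_table(entries)
to_check = conflicting_dates(table)
```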
Using the updated locality records and the temporal and spatial information extracted from the archival materials, we assigned latitude and longitude coordinates to each specimen. These updates not only improved our understanding of certain specimens' localities, but also provided detailed information about the WSSE's route. After fixing some outdated taxonomic information, our dataset was complete and ready to be visualized.
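Conceptually, this step is a join between specimen records and the expedition's itinerary, keyed on collecting date. The following Python sketch shows that idea under simplifying assumptions: one locality per date, and hypothetical field names, catalog identifiers, and placeholder coordinates. In practice, a date alone may not be enough and the island or collector may be needed to resolve a match.

```python
from datetime import date

# Hypothetical itinerary: collecting date -> (locality, latitude, longitude).
# Names and coordinates are placeholders, not values from the WSSE journals.
ITINERARY = {
    date(1927, 6, 1): ("Locality A", -9.50, 160.00),
    date(1927, 6, 2): ("Locality B", -8.25, 159.50),
}


def georeference(specimens, itinerary):
    """Return copies of the specimen records with locality and coordinates added.

    Specimens whose collecting date is absent from the itinerary keep
    None in the new fields so they can be reviewed by hand.
    """
    results = []
    for specimen in specimens:
        locality, lat, lon = itinerary.get(specimen["collected"], (None, None, None))
        results.append({**specimen, "locality": locality, "latitude": lat, "longitude": lon})
    return results


# Made-up specimen records with invented catalog identifiers.
specimens = [
    {"catalog_id": "X-001", "collected": date(1927, 6, 1)},
    {"catalog_id": "X-002", "collected": date(1927, 6, 2)},
    {"catalog_id": "X-003", "collected": date(1927, 7, 15)},
]
georeferenced = georeference(specimens, ITINERARY)
unmatched = [row["catalog_id"] for row in georeferenced if row["latitude"] is None]
```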
The purpose of our study was to render previously unusable data from a biologically significant area available for scientific research. We parsed and standardized 809 redundant locality records into 154; we extracted an additional 114 precise localities from the WSSE field journals, and we cleaned and georeferenced 8,172 records of avian specimens. With the help of colleagues from CARTO, we generated two maps visualizing these results. The first map, for instance, illustrates the specific localities at which 8,172 specimens from the WSSE were collected (Fig. 5). CARTO also allowed us to visualize the complete itinerary of the Whitney South Sea Expedition through the Solomon Islands.
Our research in the WSSE archives allowed us to assign georeferences to over 8,000 specimens and, with this data, to create digital maps illustrating the avian diversity of the Solomon Islands from 1927 to 1930. This data will be uploaded to the AMNH database and made available via online sources such as VertNet. Our work can support future studies in areas including species distribution modeling, environmental impact, and climate change. Although a large number of WSSE specimens were georeferenced in our study, many more throughout the South Pacific are in need of similar work.
We thank Javier de la Torre and Santiago Giraldo at CARTO for their help with mapping, and Tom Trombone for providing raw data. We extend our thanks to Mark Weckel, Nuala Caomhanach and the rest of the SRMP team. Paul Sweet has also been an amazing mentor and guide throughout this project. Finally, we dedicate our work to the crew of the WSSE, especially Rollo Beck (Fig. 4), for providing us with the rich details of their travels.
The outcome of the work is quite beautiful and revelatory. At CARTO, we helped them create an explorable version of the species collected using the CARTO Builder.
Happy Data Mapping!