Harvesting Historical Images and Documents in the SIO Archives

SIO Archivist Deborah Day collecting XBT data on R/V Melville, near New Zealand (COOK20MV)

The SIO Archives record the human endeavor in oceanography, with a collection of documents and photographs of work at sea going back to 1904. By following the threads of careers, project documents and publications, we will show how expeditions have been a fundamentally collaborative, interdisciplinary, and evolving process. These historical archives reveal a wide range of information that can make scientific data archives “come alive” for scientists and non-scientists alike.

For this project, the SIO Archives has selected documents and images relating to expeditions conducted by the Scripps Institution of Oceanography that were of particular importance for their contributions to geophysics. These materials describe where the scientists went, what they saw, and what they did. Included are photographs of life at sea and on shore as well as of individual scientists and of scientific instruments. The SIO Library is also scanning publications that describe the scientific findings from each of these cruises.

Preparing Textual Resources

For many of our historic voyages, we are fortunate in having expedition reports, which include the track of the vessel, the list of personnel and ports of call, the expedition objective and the scientific results of the expedition. The expedition reports are being used to select photographs, track charts, correspondence, cruise narratives and other content that illustrate the expedition, scientists, and work at sea.

Expedition reports are being scanned, OCR’d and encoded by Pacific Data Conversion Corporation. For preservation purposes, each page (including the cover, the title page, back matter, and advertisements) is scanned at 600 dpi as a TIFF with Group IV lossless compression and burned to CD-ROM media. The text is encoded in SGML using Level 4 TEI-Lite, with 218 accompanying tables, graphs, and photographs embedded throughout as 300 dpi GIF images. The encoded text is checked for accuracy and consistency, parsed, and enhanced.

Preparing Photographs

Thousands of oceanographic cruise images from the SIO Archives are being scanned for this project along with drawings, blueprints, diary excerpts, newspaper clippings, correspondence, and oceanographic instruments. With a few exceptions, all images are being digitized by Luna Imaging, Inc. Digital capture follows standards based on the California Digital Library Digital Image Format Standards, 2001.

Print photographs and 35mm slides were all scanned into uncompressed TIFF 6.0 format, at 3072 to 6144 pixels on the long side of each original. For more efficient web use, JPEG versions at 192, 768 and 1536 pixels were created.
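
As a rough illustration of how the web derivative dimensions relate to a master scan, the sketch below computes the pixel size of each JPEG version from a master's width and height. It assumes the 192, 768 and 1536 figures refer to the long side, as the master sizes do; the text does not say so explicitly. The function names are hypothetical, and the sketch does not represent the software actually used to create the derivatives.

```python
# Hypothetical helpers: the source does not name the tool that produced the JPEGs.
# Assumption: the derivative sizes are measured on the long side of the image.
JPEG_LONG_SIDES = (192, 768, 1536)


def derivative_size(width, height, long_side):
    """Scale (width, height) so the longer edge equals long_side, keeping the aspect ratio."""
    if width <= 0 or height <= 0 or long_side <= 0:
        raise ValueError("dimensions must be positive")
    scale = long_side / max(width, height)
    # Round to whole pixels, but never let the short edge collapse to zero.
    return (max(1, round(width * scale)), max(1, round(height * scale)))


def derivative_sizes(width, height):
    """Return the pixel dimensions of every web derivative for one master scan."""
    return {side: derivative_size(width, height, side) for side in JPEG_LONG_SIDES}
```

For example, a landscape master of 6144 by 4096 pixels would yield a 768-pixel derivative of 768 by 512 pixels under this assumption.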
 

Preparing Metadata

During the image selection process, the SIO Archives internally uses a Microsoft Access database to enter basic descriptive and administrative metadata for each item. Fifty-six metadata elements were adopted and now serve as the standard for the description of digital objects for the SIO Archives. After the scanned images are received from Luna and the metadata is populated, the database is exported to ASCII delimited text. Of the fifty-six metadata elements, only eighteen were selected for display to the public on this site. The metadata standard relies on existing standards where applicable, including:

Type: Internet Media Types
http://www.isi.edu/in-notes/iana/assignments/media-types/media-types

Topside Location: Alexandria Digital Library Gazetteer:
http://fat-albert.alexandria.ucsb.edu:8827/gazetteer/

Cruise name: GDC Cruise Index:
http://gdcmp1.ucsd.edu/gdc/cruises/cruise.index

Vessel: Vessel name from SIO Archives-authored authority list:
▪ ROGER authority file
▪ UNOLS:
http://www.unols.org/images/ships/shipimages.html

Ocean location: Grid number from 4th edition of International Hydrographic Organization Limits of Oceans and Seas

Underwater features: GEBCO Gazetteer of Undersea Feature Names:
http://www.ngdc.noaa.gov/mgg/gebco/underseafeatures.html

Subject headings: Keywords from Library of Congress Subject Headings controlled vocabulary
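
The export step described in this section, where the full metadata record is written to delimited text and only a subset of elements is shown publicly, can be sketched as follows. This is a minimal sketch under stated assumptions: the tab delimiter, the function name, and the choice of public fields are all hypothetical. The source does not say which delimiter was used or which eighteen elements were selected; the field names here are simply borrowed from the list above.

```python
import csv
import io

# Assumption: the real list of eighteen public elements is not given in the text.
# These names are taken from the standards list above purely for illustration.
PUBLIC_FIELDS = ["Type", "Cruise name", "Vessel", "Subject headings"]


def export_public(records, fields=PUBLIC_FIELDS, delimiter="\t"):
    """Write only the chosen fields of each record as delimited text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fields,
        delimiter=delimiter,      # assumed; the source only says "delimited"
        extrasaction="ignore",    # silently drop elements not meant for display
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)   # missing fields are written as empty strings
    return out.getvalue()


# Illustrative record only; "Internal note" stands in for a non-public element.
sample = [{"Type": "image/tiff", "Cruise name": "COOK20MV",
           "Vessel": "Melville", "Internal note": "not for display"}]
public_text = export_public(sample)
```

Filtering at export time, rather than in the database itself, keeps the full fifty-six-element record intact for internal use while limiting what reaches the public site.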

Preparing Published Literature

An extensive bibliography of scientific publications that resulted from each of the expeditions has been created, and, with the permission of the copyright holders, these publications will be scanned and presented as PDF files on the website.

Copyright ©  SIOExplorer