Digitization is the process of converting information held on physical media into a digital format. In information systems, it usually refers to converting text or an image, such as a photograph or map, into binary form, either by manual entry or with a scanning device, so that the result can be stored and retrieved for display. This is often done by scanning a paper copy and saving it as a raster image (TIFF or another format). Optical character recognition (OCR) software can then convert the image to searchable text, which may be saved in PDF format or as plain text.
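As a rough illustration of what converting an image into a binary format can involve, the Python sketch below thresholds a tiny grid of grayscale samples into black-and-white pixels, packs them into the bytes of a binary PBM (P4) raster image, and then reads them back for display. The sample values, the cutoff of 128, and the function names are assumptions made for this example. A real scanning workflow would normally rely on scanner software and a format such as TIFF.

```python
def threshold(gray_rows, cutoff=128):
    """Map 8-bit grayscale samples (0 = black, 255 = white) to bits: 1 = ink, 0 = paper."""
    return [[1 if sample < cutoff else 0 for sample in row] for row in gray_rows]


def pack_pbm(bit_rows):
    """Serialize rows of bits as a binary PBM (P4) image: text header plus packed rows."""
    height = len(bit_rows)
    width = len(bit_rows[0]) if height else 0
    header = f"P4\n{width} {height}\n".encode("ascii")
    body = bytearray()
    for row in bit_rows:
        # Each row occupies a whole number of bytes, most significant bit first.
        for start in range(0, width, 8):
            chunk = row[start:start + 8]
            byte = 0
            for bit in chunk:
                byte = (byte << 1) | bit
            byte <<= 8 - len(chunk)  # pad a final partial byte with zero bits
            body.append(byte)
    return header + bytes(body)


def unpack_pbm(data):
    """Recover the rows of bits from bytes produced by pack_pbm."""
    magic, dims, body = data.split(b"\n", 2)
    if magic != b"P4":
        raise ValueError("not a binary PBM image")
    width, height = map(int, dims.split())
    stride = (width + 7) // 8  # bytes per row
    rows = []
    for y in range(height):
        row_bytes = body[y * stride:(y + 1) * stride]
        rows.append([(row_bytes[x // 8] >> (7 - x % 8)) & 1 for x in range(width)])
    return rows


# Hypothetical 3x2 "scan": light paper with a few dark marks.
scan = [
    [250, 30, 240],
    [20, 25, 10],
]

stored = pack_pbm(threshold(scan))   # bytes that could be written to a .pbm file
for row in unpack_pbm(stored):       # retrieve the pixels and display them as text
    print("".join("#" if bit else "." for bit in row))
```

Running the sketch prints `.#.` followed by `###`, showing that the stored bytes carry enough information to redraw the thresholded image.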
Preservation and copyright are two major issues for digital libraries. Because digital materials can be duplicated and distributed easily, they raise copyright concerns. At the same time, the suitability of digital storage as an archival medium is still debated, whereas analog technologies such as microfilm are proven. See Digital Dark Ages and fair use.
- Europeana: cultural heritage database containing digitized images, sounds, texts, and videos all available for free download.
- Forgotten Books: more than 450,000 titles, most free to read online. An upgraded membership adds more functionality.
- Google Books: currently almost 30 million scanned books. Snippet views are available for books that are not in the public domain.
- Open Library: open-source, user-created site with cataloging functions and over 1.7 million public domain books.
- Project Gutenberg: the largest collection of free digitized books, most of them available in the ubiquitous plain text format.
- PublicLiterature.org: HTML versions of classics such as Beowulf and Dracula. Audio and feeds also available.