
Google Books & the Internet Archive

A Vital Part of Any History Student’s Toolkit

Here are a few hints for working with a couple of the most useful new information sources that have become available to us in the last few years. It is quite possible that some of you getting this message may be much more au fait with this than I am, given that you are members of the web-enabled generation and that I am emphatically not, in which case I’d appreciate getting your comments and suggestions for incorporation into a revised version of these guidance notes.

What is Google Books?

Google Books is an enormous project for digitizing the contents of major research libraries, principally in the US and the UK so far, and making them fully searchable and (within the constraints of the somewhat uncertain laws of copyright) viewable and even downloadable for users.

Why should it interest History students?

Principally because, among the materials already digitized, and most freely available, there is a wealth of C19th works, including many public documents and even entire runs of specialized magazines. UG and PG students working in the areas of British and US history for the C19th period are thus particularly well served with printed primary source material which adds enormously to the stock of knowledge at our disposal.

But what if I’m not specializing in the Victorian period?

This is where other parts of the growing Google Books collection become useful to you – notably the scanned texts of masses of recent books on everything under the sun where, by agreement with the publishers, parts of the text (typically Tables of Contents, the first few pages of most chapters, and a selection of pages from the text as a whole, on which search words that you’ve specified appear) are available to view. Sometimes a “Limited Preview” Google book will provide you with enough of the contents that you feel you can dispense with the costly business of e.g. making an inter-library loan application; or at least, you will know that the book is going to be worthwhile before you take that decision, and spend your and/or our money on a Document Delivery Service (DDS) request. Even a “Snippet View” of a book – giving you just three passages in which your search words appear – can provide enough hints to its content to guide this decision.

In order to get maximum value from a Limited Preview book, you should use the word search facility to select out those pages in which you are most likely to be interested. If you just scroll through such a book, you will find that there are great lacunae – sequences of pages apparently missing, or unavailable. Using the word search technique will enable you to view even a page which is apparently unavailable. There will still be some inaccessible pages, and you may run into what Google Books calls your “viewing limit” for a particular work, but in general I find that I can get most of what I want this way.

How do I access Google Books?

You can do this very easily via any web browser at http://books.google.com/, but it’s probably easiest if you open a Google account. This will enable you to save books and other items you find useful to a personal “Library,” which you can organize into categories that are meaningful to you, so that you can come back to them quickly and easily from any internet-connected device, anytime, anywhere. But you can get a lot out of Google Books without having an account. The choice is yours.

How do I find stuff on it?

Dead simple, of course – this is Google, after all! At the very simplest, you just type in keywords describing the topic you’re interested in, and leave it to Google’s usual search logic to produce a list of results ranked in order, with those which it thinks are most relevant coming first. Thus, for example, if I search today (26 October 2011) for “open fire place” I will get 16.3 million hits (four years ago it was just 26,500), with Putnam’s 1880 classic The Open Fire-Place in All Ages in pole position, because it has my three search words in the right order, in immediate proximity to one another, and (we can be confident) the same exact phrase repeated often throughout the volume. If we were to work down the list, we would find an increasing proportion of false positive results – e.g. works of military history in which the phrase “open fire” occurred – but basically we can rely on Google’s logic to produce good results first, and won’t waste too much time if we work through the hits in the order they’re served up to us.

How do I refine my search?

This is more a matter of experience than exact rules. You don’t want to specify so many search words that you get few hits and miss genuinely relevant material; on the other hand, you don’t want to be swamped either. Suck it and see.

There is also a very simple way of cutting down the number of results you may need to wade through. The above 16.3 million hits are the result of a default “Any view” search, and include items (a) where no preview at all is available – very frustrating, but at least you know they may be relevant, and perhaps worth looking at in a real rather than virtual library; (b) where all you get is what Google calls “Snippet view,” i.e. you will see three short extracts of a few lines from the book giving you a flavour of what’s there; (c) where you have a “Limited preview”; and (d) where a “Full view” is available. You can filter your results by specifying that all you’re interested in is Full View works, in which case the number of hits declines to 433,000, or Preview and Full View, in which case you’ll get 2.3 million.
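For anyone who prefers to script such searches, a similar view filter is offered by Google's separate Books API, which these notes do not otherwise cover. What follows is a minimal sketch, assuming the public `volumes` endpoint and its `q` and `filter` parameters still behave as documented when you run it. The function name `build_search_url` and the mapping of view names are purely illustrative, and the code only builds the web address; it does not fetch anything.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Google Books API (not part of the original notes).
API_ENDPOINT = "https://www.googleapis.com/books/v1/volumes"

# Map the view names used in these notes onto the API's `filter` values.
VIEW_FILTERS = {
    "any": None,           # default search: no filter parameter is sent
    "preview": "partial",  # at least part of the text is viewable
    "full": "full",        # all of the text is viewable
}


def build_search_url(keywords, view="any"):
    """Return a search URL for the given keywords and view restriction."""
    if view not in VIEW_FILTERS:
        raise ValueError(f"unknown view: {view!r}")
    params = {"q": keywords}
    if VIEW_FILTERS[view] is not None:
        params["filter"] = VIEW_FILTERS[view]
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(build_search_url("open fire place", view="full"))
```

Pasting the printed address into a browser should return the matching records as structured text rather than as a web page, which is only likely to be useful if you intend to process the results further.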

How do I squeeze maximum value out of my results?

Let’s suppose that you’ve decided just to work with Full View books, so that you’re not going to run into the frustration of knowing there’s something entirely relevant for you inside a volume, but not (or hardly) being able to see it. The second item (in today’s search; Google is very dynamic, so if I do exactly the same search tomorrow it will probably produce, not just more hits, but a different ranking) is an interesting-sounding book by Isaac Buxton written in 1810. If I click the link, it’ll take me to the page where there’s the best match for my search words (p. 170), and also tell me that there are another four pages in the book on which they occur. I can easily go from one to another, and can sort my page results in terms of their relevance or their running-order in a book. The latter is a really easy way of making your own personalized “index entry” for a book, so that you can just track a search term’s occurrences throughout the text – you don’t need to read the whole of it if you don’t want to; you can, in effect, scan the whole book at one go, and read it quickly with the aid of an index entry that’s been customized for your needs.

A single search term won’t squeeze all of the value out of a book, so even if you are reading a Google Book with keyword searching, it’s a good idea to try a number of overlapping terms that describe what you’re looking for. Suppose that what I’m really interested in is C19th ideas about the importance of open fire-places as sources of ventilation – health-giving fresh air. I can search within the book for the word “Ventilation” instead, and now I’ve got just one hit on a particular page.

How do I actually read onscreen?

Google Books has two modes for displaying books to you, page by page:

(i) Standard Mode, in which you can increase or decrease the page image size, and/or get two pages side-by-side in regular book view, or see thumbnails of a batch of pages at a time, and maximize the amount of the screen displaying pages by going to “Full Screen,” which eliminates distractions. All of these options are only available for Full View books; for some Limited Preview books, you can only see one page at a time.

(ii) Plain Text Mode, only available with Full View books, where you see the text as interpreted by Google’s Optical Character Recognition [OCR] programme, which is not 100 percent accurate but is still pretty amazing. The great advantage of this is that if you are taking notes from a Full View book, you can copy and paste quotations or whole interesting passages from the Plain Text version.

Whichever mode you are working with, you can also increase the amount of text you can see on screen at any particular time by using the standard Windows option View Full Screen (F11), which eliminates the Windows and browser menu bars at top and bottom, and gives you the biggest possible reading pane. Using these together, you can get a reading pane where there’s enough space that two book-pages are actually quite legible side-by-side.

Browsing: You can scroll through a book, from page to page; or move to a particular numbered page; or click the blue arrows to go forward or back a page; or, if you’ve done a word search within a book as described above, you can just go from the site of one hit to another (listed, with ‘snippets,’ in the menu bar on the RH side of the screen). It’s all very easy.

How do I take notes easily from a Google Book?

It depends how you work. If you take notes on paper, this is no problem. But if you take them (as I do) on the PC itself, then what you have to do is this:

open your note-taking software, whatever it may be, and reduce the size of its window so that it just occupies the top LH corner of the screen (I work with a window that’s about 4/5ths of the width of the screen, and a little over half the depth). This means that I can overlay my note-taking window on the reading pane, and by moving from one program to the other via Alt-TAB I can scroll through the book and take notes more or less as I go. (I can also, of course, drag the note-taking window around the screen if it gets in the way of reading the text.)

But I don’t just want to take notes, I want a copy!

There are various ways of doing this.

One of the easiest, with a Full View book, is just to download a PDF of the book to your own PC, or a Google eBook (good for some portable devices, and free for Full View, out-of-copyright works). These PDFs can of course be enormous, and you can’t search within them in the way you can via Google Books. (What you need to do is to search for particular pages on Google Books, and make a note of them – then you can go to them using the Adobe reader, but bear in mind: Google Books has intelligent pagination, whereas Adobe counts every page – blanks, front matter, etc. So p. 374 of a book won’t necessarily be the 374th page in a PDF file.) But if you want a permanent copy, and the ability to print off pages at will, this is a good option. You can build up your own little library of heavily-used works, for the duration of a project.
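The pagination point amounts to a fixed offset, at least within any stretch of a book that is numbered continuously. A minimal sketch of the arithmetic, assuming you have found one page where you know both the printed number and its position in the PDF; the function names and the figures are hypothetical, and the offset will shift if unnumbered plates or a fresh numbering sequence intervene.

```python
def page_offset(printed_page, pdf_position):
    """Difference between a page's position in the PDF and its printed number."""
    return pdf_position - printed_page


def pdf_position_for(printed_page, offset):
    """Position in the PDF of a printed page, given an offset measured earlier."""
    return printed_page + offset


# Hypothetical figures: printed p. 1 turns out to be the 15th page of the PDF.
offset = page_offset(printed_page=1, pdf_position=15)

# Where to look in the PDF for printed p. 374.
print(pdf_position_for(374, offset))  # 388
```

Measuring the offset once per book, near the pages you actually need, is usually enough.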

You can also cut-and-paste from Plain Text view of a Full View book, if what you are interested in is just shorter extracts and quotes of the kind needed for notes.

With a Full View book, you can also use the Clip option to copy a piece of text which you can then paste into your notes; you can also copy a picture.

Method:

* click Clip, and highlight the area of text or image that you want to copy using the cursor in the shape of a cross that appears on your screen; when you’ve marked a block of text or image like that, a new menu window will open giving you three options.

* The first, “Selection text,” is any text within the area you’ve highlighted, but lacking punctuation and formatting – copy the text you want from within the menu line, and then you can paste it into your notes;

* The second, “Image,” is a snapshot of your highlighted block as an image. Copy the long URL and then open a new browser page and paste this web address into the address bar. Depending on the browser you are using, you will be able to save this image as a low-resolution JPEG or PNG file to your PC.

What I do is to make sure I name these files distinctively – e.g. if they’re from an 1885 book by William Keep, I’ll name the image from p. 236 as Keep_1885_236, so that as these copied images accumulate on my PC (a) several pages from the same work will be in the right order, and (b) I will always be able to link images to my notes, so that I can supply a full bibliographical reference for them if I need to.
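If you save a great many clips, the naming scheme can be generated rather than typed. The sketch below is one possible way of doing so, not part of the method described above: the function name `image_name` is hypothetical, and the zero-padding of page numbers is an added assumption, included because names such as Keep_1885_41 and Keep_1885_236 would otherwise sort out of page order in a file listing.

```python
import re


def image_name(author, year, page, width=3):
    """Build a name like Keep_1885_236, zero-padding the page so files sort in page order."""
    # Strip spaces and punctuation so the surname is safe to use in a file name.
    surname = re.sub(r"[^A-Za-z0-9]+", "", author)
    return f"{surname}_{year}_{page:0{width}d}"


names = [image_name("Keep", 1885, p) for p in (236, 41, 7)]
print(sorted(names))  # ['Keep_1885_007', 'Keep_1885_041', 'Keep_1885_236']
```

For books of a thousand pages or more, pass `width=4` so that the padding still covers every page number.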

* * *

The problem arises when you’re working with a Limited Preview book. For copyright reasons, you can’t clip a copy of any text or images from these works. But, as usual, there’s a work-around. What you do is to use the Windows Full Screen (F11) and Google’s own page-viewing options (magnify/reduce, single or double page if available, full screen) to maximize the size of the page on your screen and minimize the amount of extraneous clutter; and then you use the PC’s Print Screen key to “snap” the whole screen as an image onto the clipboard, which you can then paste into (say) a Word document. If what you want is to copy several pages or double-page spreads from a single book, you can paste them into a single Word doc, and organize them very easily in that way.

Alternatively, you can paste these screenshots into image-editing software, which will enable you to e.g. save successive pages from a book as separate files, and also to clean up those images by cropping them down to the size of the page.

There are a few problems with this approach, e.g. you’ll find that the definition of the images is not great, for viewing/printing purposes. But it’s perfectly serviceable, and once you get the hang of it it is a good way to copy text and images from a Limited Preview book.

What else can you get from Google Books?

One of the valuable facilities is, in effect, a quick-and-dirty citation finder.

You may already be familiar with the Web of Knowledge’s citation-indexing system, or you may be in the habit of doing JSTOR searches for e.g. “Justin Willis” if you want to find references to our current head of department’s work (i.e. to track articles and reviews by him, as well as articles citing his work – a good way of spotting ‘families’ of related articles that you may need to get your head around). But Google Books will do something similar, and quite useful, for you, dealing with books, not just or mostly journal literature.

Method: suppose that the book (or article) I’m interested in, i.e. the range of whose scholarly connections I want to know, is e.g. my own first (The Right to Manage). I stick in a search for “Howell Harris Right Manage” (there is no point including common words like a, an, and, to, or the, which add nothing to a search) and I get about 13,500 hits [1,240 with Preview Available]. This is actually quite interesting for an author – answers the “who’s been sleeping in MY bed?” question – but, even if you don’t have this particular reason for wanting to know, it is useful to be able to form an almost instant image of the group of scholarly works which seem to have felt a need to cite an item. It tells you what’s connected, and when you’re shaping a research bibliography it helps to have this awareness of relevant literature in a field. As an indication of the accuracy of Google’s search algorithm, most of the authors at the top of the hit list are friends or research colleagues of mine, which probably helps explain why they think well of my stuff and cite it often enough for Google to notice; or people whose books I’ve reviewed; or people whose books I should have read.

* * *

Another way you can get this sense of a book’s scholarly interconnections is just by clicking on it and then clicking again on the About this book option in the menu column on the LH side of the screen. You will see:

links to reviews (and if they’re in EJs to which we subscribe, or JSTOR, you can get the text of the review via the UL’s links to individual journals)

links to related books – another good way of adding to your sense of the scholarship or printed sources relevant to a topic

links to common terms and phrases within the book, which is a good, quick way of getting a sense of what a book is about, and of course finding the pages within it where a particular term or phrase appears

This is an extremely good, quick way of forming an idea of where a particular book fits within the universe of scholarship, and figuring out what you should wish to read, if you are interested in the topic(s) connecting this extended family of books and articles together.

* * *

The trouble with Google Books is that there’s so much in it that the temptation is to investigate every little feature.

A few worth mentioning now in the menu column on the RH side of the “About this book” page:

* the link to the publisher’s website – this can be handy if publishers offer a “view sample chapter” option for a book; it will add to the amount of a Limited Preview text that you can read.

* the link to Amazon can offer the same facility, with their “Look Inside” service.

* the Amazon link will also show you if there’s a cheap secondhand copy available – if you’re doing a research project or are really interested in a topic, it may be worth buying a book (can even be cheaper and quicker than using DDS).

* the Find in a library option. Google interconnects with WorldCat, and thus you can see if the University Library holds an item. Suppose that it doesn’t? Stick in your postcode, and Google will tell you which other libraries in WorldCat’s world hold the book, sorted in order of how far away they are. This can actually be very useful, even if (all too frequently for a US historian) the answer is that the nearest library is on the US E. Coast.

I would be very glad to get feedback on this, if you try to follow these suggestions and find them either (a) useful or (b) too complicated to be helpful or (c) inadequate, and capable of being improved upon (there is a better way from A to B) or even (d) incomplete, because I have missed a feature you find valuable.

Howell Harris

[email protected]

I will add to this advice at a (not much) later date, notably to explain how to get around one of the most frustrating features of Google Books: it will tell you that a book (usually from the 1870s-1920s period) is available, but only in Snippet View, for copyright reasons. What can you do then? Well, one of the answers is to look for the same book in another online library that does not have such a restrictive interpretation of copyright law as Google. One of the best is the Internet Archive, http://www.archive.org/ which contains far more than simply books and periodicals. But if you search it just for texts, http://www.archive.org/details/texts, you will find lots. The search interface is not as clever as Google’s – it just looks for search words in a book’s bibliographical “metadata” (author, title, subjects, etc.) – but if e.g. we look for “Samuel Pepys” we will get (today) 359 hits sorted in the Internet Archive’s idea of probability order, with various different editions of the Diary and some old biographical studies at the top. The Internet Archive is in some ways nicer to use than Google Books – e.g. some people prefer the way it presents texts onscreen in its “Read Online” mode, http://www.archive.org/stream/samuelpepy00lubb#page/n9/mode/2up – it’s more book-like than Google Books. It’s easy to save an image from a page to your own PC, and you can also download the Full Text of a book for ease of note-taking. But a fuller explanation of the joys of the Internet Archive, and indeed of other similar services (e.g. for French libraries and texts the Gallica digital library, http://gallica.bnf.fr/?lang=EN which has a lot of stuff in English too), will have to await another day.
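The Internet Archive’s metadata search can also be driven from a script. This is a minimal sketch only, assuming the Archive’s `advancedsearch.php` endpoint and its `q`, `fl[]`, `rows` and `output` parameters, none of which are described in these notes. The function name `build_texts_query` is hypothetical, and the code merely constructs the address without contacting the site.

```python
from urllib.parse import urlencode

# Assumed search endpoint of the Internet Archive (not part of the original notes).
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_texts_query(phrase, rows=50):
    """Return a URL searching the metadata of texts for an exact phrase."""
    # Restrict the search to texts, mirroring the "just for texts" advice above.
    query = f'"{phrase}" AND mediatype:texts'
    params = [
        ("q", query),
        ("fl[]", "identifier"),  # fields to return for each hit
        ("fl[]", "title"),
        ("rows", rows),          # maximum number of hits to return
        ("output", "json"),
    ]
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


print(build_texts_query("Samuel Pepys"))
```

As with the website’s own search box, this only matches words in the bibliographical metadata, not in the full text of the books.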
