Everyday digital scholarship: Using web-based tools for research


Francesca Di Donato, University of Pisa, COST A32

In Our End are Fresh Beginnings: Perspectives for Open Scholarly Communities on the Web

München, 2 Oct. 2010

didonato@sp.unipi.it

Topics

• Searching the Web, or: How to make smart use of search engines

• Storing, organizing, sharing sources

• Disseminate your results

The research community had used links between paper documents for ages: Table of contents, indexes, bibliographies, and reference sections are hypertext links.... On the Web, however, [...] scientists could escape from the sequential organization of each paper and bibliography, to pick and choose a path of references that served their own interests. [1]

The Web is our library: How to search inside it?

A paradox as a premise

Plato, Meno, XIV 80d–e/81a

Meno: And how will you enquire, Socrates, into that which you do not know? What will you put forth as the subject of enquiry? And if you find what you want, how will you ever know that this is the thing which you did not know?

Socrates: I know, Meno, what you mean; but just see what a tiresome dispute you are introducing. You argue that man cannot enquire either about that which he knows, or about that which he does not know; for if he knows, he has no need to enquire; and if not, he cannot; for he does not know the very subject about which he is to enquire.

Topology of the Web

The web is a graph: an abstract representation of a set of objects (vertices), some pairs of which are connected by links (edges).

(Koenigsberg bridges, 1736)

The web is a directed graph (links go in only one direction), like the scholarly publications graph (where nodes are papers and links are citations).

Example nodes: Plato, Kant, Di Donato

A small world network:

In 2004, the degrees of separation on the Web were 19.
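A minimal sketch of what "degrees of separation" means on a directed graph: the smallest number of links one has to follow to get from one page to another. The toy graph and the function name are assumptions made for illustration only; they are not data from the talk.

```python
from collections import deque

# Hypothetical toy web: each page maps to the pages it links to (directed edges).
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}


def degrees_of_separation(graph, start, target):
    """Return the minimum number of links to follow from start to target.

    Uses breadth-first search; returns None when target is unreachable.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        for nxt in graph.get(page, []):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


forward = degrees_of_separation(TOY_WEB, "a", "e")
backward = degrees_of_separation(TOY_WEB, "e", "a")
print(forward, backward)
```

Because links go in only one direction, the distance from "a" to "e" is defined while the reverse path does not exist in this toy graph.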

On the Web, not all the nodes are equal: there are hubs and authorities

The biggest nodes are connected to most of the other nodes
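One common way to make "hubs and authorities" concrete is Kleinberg's HITS iteration. The slides do not name an algorithm, so this is only an illustrative sketch under that assumption: a page is a good hub if it links to good authorities, and a good authority if good hubs link to it. The link data is hypothetical.

```python
import math

# Hypothetical link data: page -> pages it links to.
TOY_LINKS = {
    "portal": ["paper1", "paper2", "paper3"],
    "blog": ["paper1"],
}


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def hits(graph, iterations=50):
    """Return (hub, authority) score dictionaries after a fixed number of rounds."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub score: sum of the authority scores of the pages linked to.
        new_hub = {n: sum(new_auth[d] for d in graph.get(n, [])) for n in nodes}
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


hub_scores, auth_scores = hits(TOY_LINKS)
print(max(hub_scores, key=hub_scores.get), max(auth_scores, key=auth_scores.get))
```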

http://thenextweb.org/wp-content/uploads/2008/06/fragmented.jpg

What, then, is the shape of the Web?

It has 4 continents... but we can explore only two of them

[Witten et al., p. 93]

Exploring the Web surface: on the use of SES

Though hundreds of search engines are freely and publicly available, very few capture the overwhelming majority of the audience. According to the well-known 80/20 rule, 80 percent of users are concentrated on 20 percent of applications.

Users trust their own ability as web searchers. More than 90 percent of people who use search engines say they are confident in the answers; half are very confident. Users also judge their research activities as successful in most cases.

The less Internet experience people have, the more successful they regard their own searches.

[I.H. Witten, M. Gori, T. Numerico, Web Dragons, pp. 23-4].

Surveys have revealed that more than two-thirds of users believe that search engines are a fair and unbiased source of information.

In SES we trust

A smart use of search engines is essential for a good researcher

Search engines are many and different

Rule n. 1

many!!!

http://www.searchlores.org/main.htm

Fravia’s map

Best s.e.: CUIL, MSNsearch, Google (GOOGLE dedicated page), Ask, Yahoo! (YAHOO dedicated page), Fast, Altavista

Useful s.e.: Wayback (past), Lycos, Gigablast, Swicki ("vertical"), IceRocket (webarchive), Rollyo ("vertical")

Graph s.e.: Touch (graph), Dicy (cluster), Mooter (cluster)

Second Tier: Alexa, Exalead (date & regexp), A9 (google's), Baidu

"Visual" s.e.: lyGO ("visual" search), yaouba ("visual" & anon), searchme ("visual" search)

Other: Entireweb, Excite (not † but very ill), Factbite (encycl), dmoz (directory), Furl (webarchive), [FTPSEARCH]

Searching techniques and tips @ fravia's: Golden rules, Long term, Short term, Deep web, Files repos, Targets, Local, Regional, Compound, Usenet, Accmail, Live searches, Combing, Klebing, Guessing, Databases, Allinones, Images, Books, Laws, Files, Filez, Passwords

Cadavers: Teoma († 07), Wisenut († 07), Ouverture († 07), Northernlight († 02), Webtop († 01)

and the web is full of websites and books on this subject

http://www.searchlores.org/main.htm

S.e. usage.....

...and coverage

Rule 2.

Learn how to formulate your queries

Rule 3.

Use operators

http://www.searchlores.org/operators.htm

Ex: Google operators

- site:
- allintitle: (all of the query words in the title)
- intitle: (that word in the title)
- allinURL: (all of the query words in the URL)
- inURL: (that word in the URL)
- cache:
- link:
- related: (pages that are "similar" to a specified web page)
- info: (google's info)
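A small sketch of how such operators can be combined into a single query string. The helper names and the `search.example` address are hypothetical; the operators are written in lowercase here, and only `site:`, `intitle:` and `inurl:` from the list are covered.

```python
from urllib.parse import quote_plus


def build_query(terms, site=None, intitle=None, inurl=None):
    """Join plain terms with optional site:, intitle: and inurl: operators."""
    parts = list(terms)
    if site:
        parts.append(f"site:{site}")
    if intitle:
        parts.append(f"intitle:{intitle}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    return " ".join(parts)


def to_url(query, base="https://search.example/search"):
    """Encode the query for use in a URL (base is a placeholder address)."""
    return f"{base}?q={quote_plus(query)}"


query = build_query(["kant", "enlightenment"], site="plato.stanford.edu", intitle="kant")
url = to_url(query)
print(query)
print(url)
```

Keeping the query as plain text until the last step makes it easy to paste the same string into several different engines.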

Other practical advice

1. Use lowercase letters

2. Use quotation marks [“”] for exact phrases

3. Insert errors (deliberate misspellings)

4. Use Booleans

5. Use the asterisk [*] as a wildcard
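The five tips can be combined in one query. The sketch below is only an illustration with made-up search terms and hypothetical helper names; it assumes an engine that expects the Boolean keyword OR in capitals, as Google does.

```python
def exact_phrase(text):
    """Tips 1 and 2: lowercase the words and wrap the phrase in quotation marks."""
    return '"' + text.lower() + '"'


def either(*words):
    """Tips 3 and 4: join alternatives (e.g. a deliberate misspelling) with OR."""
    return " OR ".join(word.lower() for word in words)


query = " ".join([
    exact_phrase("Open Access"),
    either("bibliography", "bibliografy"),
    exact_phrase("scholarly * communities"),  # tip 5: * stands in for any word
])
print(query)
```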

Long-term searching (e.g. a PhD thesis, a book)

<http://www.searchlores.org/longtermsearching.htm>

see also: http://www.searchlores.org/effective_searching.htm

1. Develop your search strategy: prepare a written plan
2. Prune your query!
3. Run preliminary searches
4. Explore the deep web
5. Identify "grey areas", which offer top-notch information most of the time:
- conference papers and proceedings,
- unpublished dissertations on relevant topics,
- "unofficial" messageboards,
- IRC channels,
- blogs
6. Try different approaches
7. Re-run your query using different languages
8. Keep records of all your search activities
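For the record-keeping step, even a plain CSV file is enough. The following is a minimal sketch; the field names and the sample entries are assumptions chosen for illustration, not a format prescribed by the talk.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SearchRecord:
    """One line of a search diary (all field names are hypothetical)."""
    day: str
    engine: str
    query: str
    language: str
    notes: str


def write_log(records, handle):
    """Write the records as CSV with a header row to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(SearchRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


records = [
    SearchRecord("2010-10-02", "engine A", "open access repositories", "en", "preliminary search"),
    SearchRecord("2010-10-02", "engine B", "archivi aperti", "it", "same query, different language"),
]

buffer = io.StringIO()  # stands in for a real file opened with open(..., "w", newline="")
write_log(records, buffer)
print(buffer.getvalue())
```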

Organizing your searches: results and paths

Web 2.0 or Social Web

1) The Web as a platform. Ex. Google account http://www.google.it
2) Software as a service (not as a product)
3) Decentralization: every client is a server (P2P). Ex. Napster, Emule, etc.
4) Some rights reserved. Ex. Creative Commons
5) User Generated Contents

Jan 1983

Jan 2007

Web 2.0: a video <http://www.youtube.com/watch?v=6gmP4nk0EOE>

The Machine is Us/ing Us

Some examples

Facebook
YouTube
eBay http://www.ebay.com
Myspace http://www.myspace.com
LinkedIn http://www.linkedin.com/
LibraryThing http://www.librarything.com/
Anobii http://www.anobii.com/
Zopa http://www.zopa.uk
Kiva http://www.kiva.org/
Twitter http://twitter.com/
Digg http://digg.com/
Lastfm http://www.last.fm/?setlang=en

Social networks for scholarly research

delicious http://delicious.com/
Connotea http://www.connotea.org/
CiteULike http://citeulike.org/
Zotero http://www.zotero.org
MediaCommons http://mediacommons.futureofthebook.org
Academia.edu http://www.academia.edu
PanMind http://www.panmind.org/

Other?

Disseminate your results

Two Scholarly Publishing systems:

1st. The "Printing Era"

2nd. The "Web age"

To publish means to make intellectual productions accessible for the public of readers.

In academia, dissemination means publishing

1st framework: the “printing era”

Actors: Scholars, Librarians, Scientific Institutions (Universities), Pressmen (Publishers)

1st framework > Market Scenario

• Inelastic market → “Serial Price Crisis”

Diagram labels: All Scholars, Universities, Publishers, Librarians, “Gatekeepers”, The Public of Readers

2nd framework: the “Web age”

Actors: Librarians, Scientific Institutions (Universities), Publishers, Scholars, Computer Scientists

A new scenario

Public Repositories

Open Journals

Traditional printed Journals

Print on demand

Blogs

Other channels

Dissemination channels

1) OA archives
2) Traditional and OA journals
3) New tools/paradigms such as Lulu.com or MediaCommons
4) On-line bibliographical tools (Citeulike)
5) Create your Wikipedia entries (using different languages)
6) Create your institutional homepage
7) Create your research blog

OA in practice

Infrastructure: Internet, Web, Sem Web

Protocols, standards: OAI-PMH, RSS, Dublin Core, ...

Applications for Archives and Journals:
- Eprints
- DSpace
- CDSWare
- Fedora
- OJS
- HyperJournal
- ......

Rights: National and International Law, Licenses (CC)

Policies: International declarations

Web services
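To give an idea of how the protocol layer works in practice, here is a minimal sketch of an OAI-PMH harvesting request and of reading Dublin Core titles from a response. The endpoint address and the sample record are invented for illustration; the verb, the `metadataPrefix` parameter and the XML namespaces follow the OAI-PMH 2.0 and Dublin Core conventions. No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint

NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request asking for unqualified Dublin Core."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hand-written sample of what a repository might return (not real data).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example title</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def titles(xml_text):
    """Return every dc:title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//dc:title", NAMESPACES)]


request_url = list_records_url(BASE_URL)
found = titles(SAMPLE_RESPONSE)
print(request_url)
print(found)
```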

MediaCommons

Kathleen Fitzpatrick, Scholarly Publishing in the Age of the Internet, <http://mediacommons.futureofthebook.org/mcpress/scholarlypublishing/>

The traditional publishing system is broken.

“What exactly do we in the humanities want the future of scholarship to look like, and what do we have to do in order to persuade ourselves, our senior colleagues, our departments, and our institutions — all of which tend, if unconsciously, toward an obstinate luddism — that such a future is not only acceptable but necessary?”

<http://mediacommons.futureofthebook.org/mcpress/scholarlypublishing/2-mla-task-force-recommendations/>

From “born-digital” to “consumed digital”

A publication should be evaluated without any media-based bias

The scholarly monograph in the digital context. How does it change?

The scholarly monograph must move online

Digital monographs can embed multimedia content (images, videos, etc.)

Blog-like monographs:

- trackbacks, as a means parallel to bibliographies of tracing scholarly discussions not simply backward in time but also forward, might reshape the nature of doing research;
- versioning, as a means of allowing a text to continue changing even after it’s been published, might reshape the processes of academic publishing;
- comments, as a means of including conversation about a text within the text, might reshape the nature of peer-review.

From peer review to peer-to-peer review

A democratic knowledge exchange system

Whitworth B., Friedman R., Reinventing Academic Publishing online. PART II. A Socio-Technical Vision, «First Monday», 14, 9, 2009, <http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2642/2287>.

How to calculate/quantify your impact

OA and evaluation

Plurality of sources and criteria (not only indexes: peer-to-peer review!)

Transparency of processes

Access to documents and data

OA and evaluation

In practice it’s possible

1. to calculate different indexes on OAI-compliant archives and journal networks

2. to use download metrics

3. to use social network analysis-based metrics, such as:

Degree Centrality: “The sum of the number of relationships pointing to and from an actor, i.e., their in- and out-degree, normalized by the total number of relationships in the social network”.

Closeness Centrality: “The average shortest path distance of an actor to all other actors in the network”.

Betweenness Centrality: “The frequency by which an actor is part of the shortest path between any pair of agents in the network”.
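The three definitions can be tried out on a very small citation network. This is a sketch under stated assumptions: the graph is hypothetical, every node must appear as a key, closeness is computed only over the actors that can actually be reached, and betweenness counts the share of shortest paths between other pairs that pass through an actor.

```python
from collections import deque

# Hypothetical citation network: paper -> papers it cites.
CITATIONS = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}


def bfs(graph, source):
    """Return (distances, number of shortest paths) from source to reachable nodes."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def degree_centrality(graph):
    """(in-degree + out-degree) divided by the total number of relationships."""
    total = sum(len(targets) for targets in graph.values())
    indeg = {n: 0 for n in graph}
    for targets in graph.values():
        for t in targets:
            indeg[t] += 1
    return {n: (indeg[n] + len(graph[n])) / total for n in graph}


def closeness_centrality(graph):
    """Average shortest-path distance to the reachable actors (None if none)."""
    result = {}
    for n in graph:
        dist, _ = bfs(graph, n)
        others = [d for node, d in dist.items() if node != n]
        result[n] = sum(others) / len(others) if others else None
    return result


def betweenness_centrality(graph):
    """Share of shortest paths between other pairs that pass through each actor."""
    paths = {n: bfs(graph, n) for n in graph}
    score = {n: 0.0 for n in graph}
    for s in graph:
        dist_s, sigma_s = paths[s]
        for t in dist_s:
            if t == s:
                continue
            for v in graph:
                if v in (s, t) or v not in dist_s:
                    continue
                dist_v, sigma_v = paths[v]
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score


degree = degree_centrality(CITATIONS)
closeness = closeness_centrality(CITATIONS)
betweenness = betweenness_centrality(CITATIONS)
print(degree, closeness, betweenness)
```

In this toy network "b" is cited by two papers and cites one, so it takes part in every relationship and lies on both of the two-step citation paths.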

questions?

References

[1] T. Berners-Lee, Weaving the Web, p. 38.

Photos:
slide 1: http://www.flickr.com/photos/batigolix/2367456992/
slides 3,4: http://www.flickr.com/photos/marinos/4040243418/
slide 6: http://files.splinder.com/07e2282cca381de636ce4f27d9413431.jpeg
http://it-it.abctribe.com/Disegni/Guide/Generiche/Maestro-scheda%281%29.jpg
slide 7: http://physics.weber.edu/carroll/honors_images/BarbasiBridges.jpg
slides 9,10: http://www.flickr.com/photos/ajc1/2553555562/
slide 11: http://m.blog.hu/ne/nemlinearis/image/erdosgraph.jpg
slide 14: http://www.flickr.com/photos/osvaldo_zoom/3506973686/
slide 15: http://media.ly/images/search_engines.jpg

A.-L. Barabási, Linked: The New Science of Networks, Perseus, Cambridge, MA 2002.

I.H. Witten, M. Gori, T. Numerico, Web Dragons. Inside the Myths of Search Engine Technology, Morgan Kaufmann Publishers-Elsevier, San Francisco 2007.

Fravia, Searchlores. Advanced Internet searching strategies & advice. Resources for basic, advanced & deep web seekers, http://www.searchlores.org.

On-line resources

Deep web searching. The lore of searching: how to exploit the shallow deep_web http://www.searchlores.org/deepweb_searching.htm

How to find books and texts http://www.searchlores.org/books.htm

Combing http://www.searchlores.org/combing.htm

Regional search engines http://www.searchlores.org/regional.htm

Blog http://www.searchlores.org/blog.htm

Essays http://www.searchlores.org/essays.htm

Classrooms http://www.searchlores.org/c_intro.htm

Conferences and workshops http://www.searchlores.org/mines.htm

The lore of (dinosauria) researching (An "how to" for young web seekers). How to research, evaluate and collate web material, by A+heist (heavily edited by fravia+), February 2008 http://www.searchlores.org/how_to_research.htm

http://www.gutenberg.org: Project Gutenberg
http://gutenberg.net.au/: Gutenberg australia (As the oz law is just 50 years max...)
http://gallica.bnf.fr/: Gallica
http://about.eserver.org/: Eserver
http://books.google.com/books?: Google books
http://scholar.google.com/: Google scholar
http://en.scientificcommons.org/: Index of OAI-compliant papers
http://www.archive.org/search.php?query=subject%3A%22search%22: Internet Archive
http://vlib.org.uk/: The WWW Virtual Library
http://digital.library.upenn.edu/books/search.html: The University of Pennsylvania Online Books Page
http://abu.cnam.fr/index.html: ABU: la Bibliothèque Universelle
http://www.opencontentalliance.org/: Open Content Alliance
http://www.readprint.com/: Our website offers thousands of free books for students, teachers, and the classic enthusiast.
http://www.gutenberg.org/: There are 17,000 free books in the Project Gutenberg Online Book Catalog.
http://www.bibliomania.com/: Free Online Literature with more than 2000 Classic Texts
http://digital.library.upenn.edu/books/: Upenn.edu, Listing over 25,000 free books on the Web
http://www.ipl.org/div/subject/browse/hum60.60.00/: The Internet Public Library, Literature Online Texts

Texts for SSH

http://www.literature.org/: An Online Library of Literature. Read. Learn. Think.
http://www.loc.gov/: The Library of Congress serves as the research arm of the US-Congress.
http://ota.ahds.ac.uk/: The Oxford Text Archive hosts AHDS Literature, Languages and Linguistics.
http://bcdlib.tc.ca/links-subjects.html: British Columbia digital library: The focus of this set of links is on collections of electronic texts (not individual titles) preserved through libraries, archives, museums and corporate or private initiatives.
http://un2sg4.unige.ch/athena/html/fran_fr.html: Textes d'auteurs d'expression française (texts by French-language authors)
http://www.intratext.com/: Full-text Digital Library committed to accuracy, accessibility and usability, offering texts and corpora as lexical hypertexts
http://www.fordham.edu/halsall/sbook2.html: Internet Medieval Sourcebook
http://pomoerium.com/links.htm: classics resources
http://germazope.uni-trier.de/Projects/DWB: Das Deutsche Wörterbuch von Jacob und Wilhelm Grimm auf CD-ROM und im Internet (the German dictionary by Jacob and Wilhelm Grimm, on CD-ROM and online)
http://www.ikp.uni-bonn.de/kant/: Das Bonner Kant-Korpus. Elektronische Edition der Gesammelten Werke Immanuel Kants (the Bonn Kant corpus: electronic edition of Immanuel Kant's collected works)
http://dewey.library.upenn.edu/sceti: SCETI. Virtual facsimiles of rare books and manuscripts.

http://dsal.uchicago.edu/index.html: Digital South Asia Library
http://www.fiu.edu/~mirandas/cardinals.htm: The Cardinals of the Holy Roman Church
http://www.1911encyclopedia.org/: (Britannica, eleventh edition)
http://www.newadvent.org/cathen/: The Catholic Encyclopedia
http://plato.stanford.edu/: Stanford Encyclopedia of Philosophy
http://ourworld.compuserve.com/homepages/cornwall_business_systems/index.htm: A Smaller Classical Dictionary of Biography, Mythology and Geography
http://lexicorient.com/e.o/index.htm: The Encyclopaedia of the Orient.
http://astronomy.nju.edu.cn/twkp/astrobook/Oth_Historical.html: Astronomical Books Online
http://www.biblegateway.com/: "Enter the Bible passage (e.g. John 3:16), keyword (e.g. Jesus, prophet, etc.) or topic (e.g. salvation) you want to find"
http://www.questia.com/Index.jsp: Questia, Your Online Library for Research. Search over 60,000 Scholarly Books and 1,000,000 Journals.

Finding laws, EU and UN documents
http://www.searchlores.org/laws.htm
http://www.searchlores.org/frav_eu2.htm
http://www.searchlores.org/eurosearch.htm
http://www.searchlores.org/frav_eu1.htm

Others
http://avaxhome.ws/
http://gigapedia.com/

See also: http://www.searchlores.org/books.htm

Universal library: http://www.searchlores.org/universallibrary.htm

Classicalia
http://www.thelatinlibrary.com/index.html: the latin library, latin
http://www.thelatinlibrary.com/neo.html: the latin library, neo-latin
http://www.forumromanum.org/literature/authors_a.html: Corpus scriptorum latinorum
http://www.fh-augsburg.de/~harsch/a_chron.html#latmed: BIBLIOTHECA AUGUSTANA (Bibliotheca Latina scriptorum latinorum collectio)
http://www.corpusthomisticum.org/iopera.html: CORPUS THOMISTICUM S. THOMAE DE AQUINO OPERA OMNIA
http://www.textkit.com/: Textkit is the Internet's largest provider of free and fully downloadable Greek and Latin grammars and readers. With currently 146 free books to choose from.
http://www.molfettanet.com/tradizioni/pesi_e_misure.htm: Pesi e misure nell'antichità (weights and measures in antiquity; in Italian)
http://www.ancientlibrary.com/wcd/: The wiki classical dictionary at ancientlibrary (currently down :-( )
http://www.perseus.tufts.edu/hopper/: Perseus Digital Library

De re orientalia
http://omphaloskepsis.com/collection/index.html: Omphaloskepsis. Omphaloskepsis provides free access to important works of eastern literature in digital format.

Whitworth B., Friedman R., Reinventing Academic Publishing online. PART II. A Socio-Technical Vision, «First Monday», 14, 9, 2009, <http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2642/2287>.

Fitzpatrick K., Scholarly Publishing in the Age of the Internet, <http://mediacommons.futureofthebook.org/mcpress/scholarlypublishing/>

Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities (Oct. 2003), <http://www.zim.mpg.de/openaccess-berlin/berlindeclaration.html>

Budapest Open Access Initiative (2001-2004), <http://www.soros.org/openaccess/>

Suber P., Open Access overview, <http://www.earlham.edu/~peters/fos/overview.htm>
