The Sociosemantic Web Research Notes
Librarians and archivists use the term metadata to identify
“descriptive information used to index, arrange, file, and improve access to a library’s or museum’s resources.”
“Defining Metadata” by A.J. Gilliland-Swetland. In M. Baca (ed.), Introduction to Metadata: Pathways to Digital Information (1998), Getty Information Institute, Los Angeles, CA, p. 41.
We employ a word or phrase to describe the subject of a document for the purposes of retrieval. We try to concisely encapsulate the aboutness of a document now, to support findability later. Metadata’s ability to help people find what they need is truly important to the Semantic Web.
“The Semantic Web” by Tim Berners-Lee, James Hendler, and Ora Lassila. Scientific American, May 17, 2001. Available at http://sciam.com
In 1988, sociologist Susan Leigh Star coined the term “boundary object” to describe artifacts or ideas that are shared but understood differently by multiple communities.
Hopefully we can use metadata as a boundary object, to foster translation, build shared understanding, and encourage real social progress.
The history of metadata is inextricably interwoven with hierarchy, for the organization of ideas and objects into categories and subcategories is fundamental to human experience. In a formal taxonomy, a single root node sits atop the hierarchy. Properties flow from class to subclass through the principle of inheritance. Each object and category is assigned a single location within the taxonomy.
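The three properties of a formal taxonomy described above (one root, inheritance from class to subclass, one location per category) can be illustrated with a small sketch. This is only an illustration under stated assumptions: the `Category` class, its method names, and the animal example data are hypothetical and do not come from any cited source.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Category:
    """One node in a strict hierarchy: exactly one parent, except the root."""
    name: str
    parent: Optional["Category"] = None
    properties: dict = field(default_factory=dict)

    def inherited_properties(self) -> dict:
        """Merge properties from the root down; a subclass overrides its ancestors."""
        inherited = self.parent.inherited_properties() if self.parent else {}
        return {**inherited, **self.properties}

    def path(self) -> list:
        """The single location of this category, as names starting at the root."""
        prefix = self.parent.path() if self.parent else []
        return prefix + [self.name]


# Hypothetical example data
root = Category("Animal", properties={"alive": True})
mammal = Category("Mammal", parent=root, properties={"warm_blooded": True})
dog = Category("Dog", parent=mammal, properties={"legs": 4})

print(dog.path())
print(dog.inherited_properties())
```

Because each node stores exactly one parent, an object can never appear in two places, which is the limitation that faceted approaches relax.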
Faceted classification uses multiple fields, or “facets,” to describe the objects within our collections. First defined in the 1930s by Indian librarian S.R. Ranganathan, faceted classification is a relational approach that accommodates navigation that varies by user and task, allowing objects to exist simultaneously in many locations.
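A minimal sketch of faceted navigation follows, assuming a tiny in-memory collection. The wine catalog, the facet names, and the two helper functions are hypothetical, chosen only to show how one object can be reached through several facets at once.

```python
from collections import Counter

# Hypothetical collection: each object carries a value for several facets.
catalog = [
    {"title": "Merlot 2001", "color": "red", "region": "California", "price": "under $20"},
    {"title": "Chianti 2000", "color": "red", "region": "Tuscany", "price": "over $20"},
    {"title": "Chardonnay 2002", "color": "white", "region": "California", "price": "under $20"},
]


def filter_by_facets(items, **selected):
    """Keep the items whose facet values match every selected facet."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)


# The same bottle is reachable by color, by region, or by both.
reds = filter_by_facets(catalog, color="red")
californian_reds = filter_by_facets(catalog, color="red", region="California")
print([item["title"] for item in californian_reds])
print(facet_counts(reds, "region"))
```

Counting the remaining values after each selection is what lets an interface show the user which refinements are still available.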
Controlled vocabularies are used to manage the ambiguity of language. We define equivalence relations to handle synonyms (variant terms that are equivalent for the purposes of retrieval), and we specify associative relationships to support [see also] links (often used for cross-sell and up-sell) that lead beyond hierarchy, creating thesaurus terms and relationship structures.
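The equivalence and associative relationships can be sketched as two lookup tables. The terms, the table names, and the helper functions below are assumptions made for illustration; a real thesaurus would also carry hierarchical (broader and narrower) relationships.

```python
# Equivalence: preferred term -> variant terms treated as synonyms for retrieval.
EQUIVALENT = {
    "automobile": {"car", "auto", "motorcar"},
}

# Associative: preferred term -> related terms offered as "see also" links.
RELATED = {
    "automobile": {"tires", "car insurance"},
}

# Invert the equivalence table so any variant resolves to its preferred term.
VARIANT_TO_PREFERRED = {
    variant: preferred
    for preferred, variants in EQUIVALENT.items()
    for variant in variants
}


def preferred_term(term):
    """Map a user's word onto the vocabulary's preferred term, if one exists."""
    normalized = term.strip().lower()
    return VARIANT_TO_PREFERRED.get(normalized, normalized)


def see_also(term):
    """Return the associative links for a term, resolved through its synonyms."""
    return sorted(RELATED.get(preferred_term(term), set()))


print(preferred_term("Car"))
print(see_also("auto"))
```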
Classification systems not only enable findability, they also facilitate understanding, influence identity, and claim authority.
In philosophy, an ontology is a theory about the nature of existence, of what types of things exist. The most typical kind of ontology for the Web has a taxonomy and a set of inference rules (Berners-Lee, http://sciam.com). The most visible ontology model is found in the Resource Description Framework (RDF). RDF is a W3C standard for describing and exchanging metadata. Its structure is a collection of triples, each consisting of a subject, a predicate, and an object. The triples are specified with XML tags.
“What is RDF?” by Tim Bray. Available at http://www.xml.com/lpt/a/2001/01/24/rdf.html
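The triple structure, and the pairing of a taxonomy with an inference rule, can be sketched in plain Python. This is not an RDF library and it does not parse RDF/XML: the `Triple` type, the `match` and `infer_subclasses` helpers, and the example.org identifiers are all hypothetical. The prefixed names (`dc:`, `rdfs:`, `ex:`) are used here only as readable string labels.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical statements: two describe a document, two form a small taxonomy.
triples = {
    Triple("http://example.org/doc1", "dc:title", "Research Notes"),
    Triple("http://example.org/doc1", "dc:creator", "Jane Doe"),
    Triple("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    Triple("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every part of the pattern that is given."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def infer_subclasses(store):
    """One inference rule: if A subClassOf B and B subClassOf C, then A subClassOf C."""
    inferred = set(store)
    changed = True
    while changed:
        changed = False
        links = [t for t in inferred if t.predicate == "rdfs:subClassOf"]
        for a in links:
            for b in links:
                if a.obj == b.subject:
                    new = Triple(a.subject, "rdfs:subClassOf", b.obj)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


closure = infer_subclasses(triples)
print(match(closure, subject="ex:Dog", predicate="rdfs:subClassOf"))
```

The rule is applied repeatedly until no new triples appear, so longer chains of subclasses are also covered.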
Efforts to standardize metadata for the Web eventually led to the creation of the Dublin Core metadata standard, which defines a simple element set for describing networked resources.
The 16 elements include: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, Rights, and Audience. http://dublincore.org/
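As a rough illustration, a resource description using the elements listed above can be held as a simple key-value record and checked against the element set. The record contents and the `validate_record` helper are hypothetical; the element names are the ones given in these notes.

```python
# The element names listed in these notes, lowercased for comparison.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher", "contributor",
    "date", "type", "format", "identifier", "source", "language", "relation",
    "coverage", "rights", "audience",
}


def validate_record(record):
    """Return the keys of a record that are not in the element set."""
    return sorted(key for key in record if key.lower() not in DUBLIN_CORE_ELEMENTS)


# Hypothetical description of a networked resource.
record = {
    "Title": "Example Research Notes",
    "Creator": "Jane Doe",
    "Subject": "metadata; classification",
    "Language": "en",
    "Identifier": "http://example.org/notes",
}

print(validate_record(record))
print(validate_record({"Title": "Untitled", "Mood": "curious"}))
```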
Sorting Things Out: Classification and its Consequences by Geoffrey C. Bowker and Susan Leigh Star. MIT Press (2000), p. 5
Folksonomy: the use of user-defined labels and tags to organize and share information; a kind of informal social classification.
Tags are a simple yet powerful social software innovation. Today millions of people are freely and openly assigning metadata to content and conversation. Unlike rigid taxonomy schemes that people dislike, the ease of tagging for personal organization with social incentives leads to a rich and discoverable folksonomy. Intelligence is provided by real people from the bottom up to aid social discovery. And with the right tag search and navigation, folksonomy outperforms more structured approaches to classification.
“Technorati Launches Tags,” a January 17, 2005 post on the blog of David Sifry, founder and CEO of Technorati, the self-described “authority on what’s going on in the world of weblogs.”
The advantage of folksonomies is not that they are better than controlled vocabularies, it is that they are better than nothing, because controlled vocabularies are not extensible to the majority of cases where tagging is needed.
“Folksonomies + Controlled Vocabularies” at http://www.corante.com/many/archives/2005/01/07/folksonomies_controlled_vocabularies.php
Users tag objects with keywords, with the option of multiple tags per object.
The tags are shared and become pivots for social navigation. Users can move fluidly between objects, tags, authors, and indexers.
The object serves as a seed for emergent community. Tags serve as threads that weave a disparate collection of objects together, creating an emergent category that is defined from the bottom up, the folksonomy.
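The tagging model described above (users, objects, tags, and pivots between them) can be sketched as a set of assignments. The `Folksonomy` class, its method names, and the users and URLs below are hypothetical and are not taken from any particular tagging service.

```python
from collections import Counter


class Folksonomy:
    """Stores (user, object, tag) assignments and offers simple pivots."""

    def __init__(self):
        self.assignments = set()

    def tag(self, user, obj, *tags):
        """A user may attach any number of free-form tags to an object."""
        for t in tags:
            self.assignments.add((user, obj, t.strip().lower()))

    def objects_for_tag(self, tag):
        """Pivot from a tag to every object that carries it."""
        return {o for (_, o, t) in self.assignments if t == tag}

    def users_for_tag(self, tag):
        """Pivot from a tag to the people who use it."""
        return {u for (u, _, t) in self.assignments if t == tag}

    def tags_for_object(self, obj):
        """Count how many users applied each tag to an object."""
        return Counter(t for (_, o, t) in self.assignments if o == obj)

    def related_tags(self, tag):
        """Count the other tags that appear on objects sharing this tag."""
        shared = self.objects_for_tag(tag)
        return Counter(
            t for (_, o, t) in self.assignments if o in shared and t != tag
        )


# Hypothetical users and objects
f = Folksonomy()
f.tag("alice", "http://example.org/a", "metadata", "semweb")
f.tag("bob", "http://example.org/a", "Metadata", "tagging")
f.tag("bob", "http://example.org/b", "tagging")

print(f.tags_for_object("http://example.org/a"))
print(f.related_tags("metadata"))
```

No category is declared in advance: the grouping "everything tagged metadata" exists only because individual users happened to choose the same word.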
What is great is that we did not have to pay (or wait for) librarians, ontologists, or other members of the “well-designed metadata crowd” to impose a top-down hierarchy.
How Buildings Learn by Stewart Brand. Penguin Books (1995)
The Clock of the Long Now by Stewart Brand. Basic Books (2000)
Semantic Web tools and standards create a powerful, enduring foundation. Taxonomies and ontologies provide a solid semantic network that connects interface to infrastructure. And the fast-moving, fashionable folksonomies sit on top: flexible, adaptable, and responsive to user feedback.
The Shape of Information
Genre is not identical to the medium of the communication – a memo may be realized on paper or in an electronic mail message, while electronic mail may be used to deliver memos or inquiries.
“Reproduced and Emergent Genres of Communication on the World Wide Web” by Kevin Crowston and Marie Williams. http://crowston.syr.edu/papers/genres-journal.html
Digital genre supports findability, document recognition, and navigation within a document.
Research has shown semantics and structure are codependent.
“It’s the journey and destination: shape and the emergent property of genre in evaluating digital documents” by Andrew Dillon and Misha Vaughan. http://www.ischool.utexas.edu/~adillon/publications/journey.html
Structure contributes to understanding and comprehension, while meaning helps establish a sense of location. Users can tell where they are in a document from the semantic content of individual paragraphs.
Ubicomp has injected growing numbers of physical objects into the category of document.
“What is a Document?” by Michael Buckland. http://www.sims.berkeley.edu/~buckland/whatdoc.html
Graphic and written records are representations of ideas or of objects, but the objects themselves can be regarded as documents if you are informed by observation of them.
Traité de documentation by Paul Otlet. Brussels: Editiones Mundaneum (1934), p. 217
The librarian and documentalist, Suzanne Briet, addressed this extension of meaning in a 1951 manifesto in which she asserts a document is “…evidence in support of a fact…any physical or symbolic sign, preserved or recorded, intended to represent…or to demonstrate a physical or conceptual phenomenon.”
Qu’est-ce que la documentation? by Suzanne Briet. Paris: EDIT (1951)
Otlet and Briet were trying to include the natural objects and artifacts of zoos, museums, and libraries, without admitting the entire world into the category of document.
Advances in the applications and technological features of mobile devices are driving the creation of whole new taxonomies of findable objects. Our understanding of the boundary objects called documents will continue to be the base reference for identifying new findable objects.
Researchers at Stanford and Microsoft have explored the application of timelines, temporal landmarks, and spatial memory for document management and retrieval.
“Milestones in Time: The Value of Landmarks in Retrieving Information from Personal Stores.” Available at http://research.microsoft.com/~sdumais/SISLandmarks-Interact2003-final.pdf
“Data Mountain: Using Spatial Memory for Document Management.” http://www.microsoft.com/usability/UEPostings/p153-robertson.pdf