Dictionaries have been transformed into electronic versions, and these versions are growing in popularity because people work largely with computers and Internet access has become affordable. This paper analyses two popular dictionaries, the electronic version of the Oxford English Dictionary and the WordNet electronic dictionary, and compares them. The paper also examines the term lexicography.
The Oxford English Dictionary – Electronic Version
The Oxford English Dictionary (OED) in its electronic form is a comprehensive version of the 20-volume printed OED. The dictionary has more than 301,100 entries and more than 350 million printed characters. In addition, there are more than 6.5 million phrases, combinations, pronunciations, etymologies, cross references, illustrative quotations and derivatives. The dictionary is available as an online subscription service and also on CD. The advantage of the online version is that updates and upgrades can be accessed as soon as they are provided. It is regarded as the most complete source of words in the English language. With the printed version, people usually referred to the dictionary mainly to check spellings and found only brief, truncated information; the electronic dictionary is much more intuitive and user friendly. Along with spellings, the dictionary provides detailed information (New, 22 March, 2000). An image of the home page is shown below.
A subscriber or a student can access the OED site from the campus or from any computer and log in to the website through the authentication system. Once the user opens the designated account, a wide range of features is available. The search can be refined to narrow the results to a specific, narrowly focussed range. The form is intuitive and has boxes for entering text and for advanced searches with Boolean operators such as NEAR, AND, OR, NOT and AND NOT. A number of drop-downs, radio buttons, check boxes and other elements are provided, and the query can then be submitted. Once the form is submitted, the PAT search and retrieval engine runs a matching query against the database, in which the words are indexed using SGML markup. The results are then converted into HTML and displayed dynamically on the user's screen. The speed at which the data is displayed depends on the user's network connectivity. The approach is novel in the sense that information retrieval and delivery is cached on mirror servers using an encoding system such as SGML. The language uses tags for storing any kind of label and can encode whole pages related to a word. Hyperlinks allow the user to jump to related texts. A screen arrangement of the dictionary is shown in the following figure.
The dictionary offers access to the 20-volume Second Edition, the three Additions Series volumes and the words revised each quarter. Users can select how entries are displayed by turning pronunciations, etymologies, variant spellings and quotations on and off. Everything from simple word look-ups to sophisticated Boolean searching, using any of the fields in the dictionary, can be done with speed and ease. It is possible to find a term when one knows the meaning but has forgotten the word. Wildcards can be used if one is not sure of a spelling, or to search for words with common characteristics. It is possible to search for quotations from a specified year or from a particular author and work, and to search for words that have come into English through other languages. The dictionary can also be searched for pronunciations, for accented and other special characters, for first cited dates, authors and works, and for words with a particular part of speech. Case-sensitive searches are also possible, and a search can be restricted to a previous result set. Once a phrase is searched, all related results are displayed (OED, 2001).
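The Boolean operators and the option of restricting a search to a previous result set can be pictured with ordinary set operations. The following is only a minimal sketch under stated assumptions: the entries, the `search` function and its `within` parameter are invented for illustration and say nothing about how the OED search engine is actually implemented.

```python
# Toy index: each headword maps to the set of words in its definition.
# The entries are invented for illustration; they are not OED data.
ENTRIES = {
    "granite": {"hard", "igneous", "rock"},
    "marble": {"metamorphic", "rock", "limestone"},
    "geology": {"science", "earth", "rock"},
    "ballad": {"song", "narrative"},
}


def search(term, within=None):
    """Return the headwords whose definition contains `term`.

    `within` optionally restricts the search to a previous result set.
    """
    pool = ENTRIES.keys() if within is None else within
    return {head for head in pool if term in ENTRIES[head]}


rock = search("rock")
science = search("science")

rock_and_science = rock & science          # AND: both terms present
rock_or_song = rock | search("song")       # OR: either term present
rock_not_science = rock - science          # AND NOT: first term only
refined = search("igneous", within=rock)   # search inside earlier results
```

Each query returns a set of headwords, so combining queries is just a matter of intersecting, uniting or subtracting those sets.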
The above figure shows a typical search form. There are a number of boxes with labels such as quotation author, writer and place, and users need to enter the first value in the first box. They are provided with operators such as Operation A, Operation B and so on. A number of other boxes are also provided, and these help to refine the search. On the right side are a number of further options for refining the search. It is therefore possible to set the search to run as a sequence, such as first Operation A and then Operation B. The image shows the search query built for the author named “Austen”. Case-sensitive operation is also allowed, so that “Austen” is regarded as different from “austen”.
Wild cards allow searching for a group of words when one is not sure of the exact term. If a wild card query such as “geo+” were run, the results would include a number of entries related to geology, rock formation and earth science, as well as rock music. If one wanted to refine the search to find information about a specific word such as granite, then the query should be run as “rock+” AND “marble”. A certain amount of learning is needed to make the most of the dictionary's search engine. An image of the results is shown in the following figure.
When any word is clicked, further options are provided and detailed information about the entry can be viewed.
The search results for the word ‘terrestrial’ are shown in the above figure. The meaning of the word is given at the top in bold letters, and the section below gives the detailed etymology of the word. Also included are references and texts where the word has been used. The first instances where the word occurred are also provided, and a reader can use a number of links at the top to view additional information such as pronunciation, spellings, etymology, quotations, date chart and any new additions. The dictionary also provides case-sensitive searching. If a search had to be run for a computer language called ‘BASIC’, it is possible to tick the case-sensitive check box and ensure that words such as ‘Basic’ and ‘basic’ are not included in the search results (QRG, 2007).
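Wildcard and case-sensitive look-ups of this kind can be sketched in a few lines. The example below is an illustration only: it uses the pattern syntax of Python's standard `fnmatch` module (`*` for any run of characters, `?` for a single character) as a stand-in, and the dictionary's own wildcard symbols may differ. The word list and the `lookup` function are hypothetical.

```python
import fnmatch

# A tiny, made-up list of headwords.
HEADWORDS = ["BASIC", "Basic", "basic", "basalt", "geology", "geometry", "terrestrial"]


def lookup(pattern, case_sensitive=False):
    """Return the headwords matching a wildcard pattern."""
    if case_sensitive:
        # fnmatchcase compares the strings exactly as written.
        return [w for w in HEADWORDS if fnmatch.fnmatchcase(w, pattern)]
    # Case-insensitive: fold both sides to lower case before matching.
    return [w for w in HEADWORDS if fnmatch.fnmatchcase(w.lower(), pattern.lower())]


print(lookup("geo*"))                        # words starting with "geo"
print(lookup("basic"))                       # every spelling of the word
print(lookup("BASIC", case_sensitive=True))  # only the upper-case form
```

With the case-sensitive flag set, the query for ‘BASIC’ no longer returns ‘Basic’ or ‘basic’, which is the behaviour described for the check box.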
The WordNet Electronic Dictionary
WordNet was created at Princeton University and is a large lexical database of the words of the English language. In the electronic form, nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms called synsets, each of which expresses a distinct concept. The synsets are interlinked by means of conceptual, semantic and lexical relations. The application is available as a free download with a browser, and the meanings of related words and concepts can also be accessed online. The structure is a useful tool for computational linguistics and natural language processing. The combined count of unique strings, synsets and word-sense pairs is given as more than 206,941, and that of monosemous and polysemous words and senses as more than 79,450. The database contains only “open-class words”, that is, nouns, verbs, adjectives and adverbs. Determiners, prepositions, pronouns, conjunctions and particles are not included (WordNet, 2007).
WordNet uses wnconnect, a program that finds and reports all possible connections between two terms in WordNet. Certain advanced interconnections are possible, and a concept map is displayed that shows how the terms are connected in the application. The application uses a number of concepts and terms of its own. Some of the important ones are (Fellbaum, May 1998):
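The idea of synsets linked by relations can be made concrete with a small data structure. This is a minimal sketch, not the layout of the WordNet database files: the `Synset` class, the names such as `dog.n` and the three toy entries are assumptions made for illustration, and only one kind of link is modelled, a pointer from a synset to a more general one.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of interchangeable words that expresses one concept."""
    name: str
    lemmas: list
    hypernyms: list = field(default_factory=list)  # names of more general synsets


# Toy data for illustration; not copied from the WordNet database files.
SYNSETS = {
    "dog.n": Synset("dog.n", ["dog", "domestic dog"], ["canine.n"]),
    "canine.n": Synset("canine.n", ["canine", "canid"], ["carnivore.n"]),
    "carnivore.n": Synset("carnivore.n", ["carnivore"]),
}


def synsets_for(word):
    """Return every synset that lists `word` among its lemmas."""
    return [s for s in SYNSETS.values() if word in s.lemmas]


def hypernym_chain(name):
    """Follow the first 'more general' pointer upward until none is left."""
    chain = [name]
    while SYNSETS[chain[-1]].hypernyms:
        chain.append(SYNSETS[chain[-1]].hypernyms[0])
    return chain


print([s.name for s in synsets_for("canid")])
print(hypernym_chain("dog.n"))
```

Looking up a word returns the concepts it belongs to rather than a single definition, and following the pointers from one synset to the next walks through increasingly general concepts.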
- adjective cluster: This is a group of adjective synsets. These are organized around antonymous pairs or triplets and contain two or more head synsets that represent antonymous concepts. Each head synset has one or more satellite synsets.
- attribute: This is a noun for which adjectives express values. For example the noun ‘weight’ is an attribute, and adjectives such as light and heavy express values.
- synset: This is a synonym set and represents a set of words that are interchangeable in some context.
- semantic pointer: This is a pointer that indicates a relation between synsets and word meanings.
- collocation: This is a string of two or more words that are connected by spaces or hyphens. Some examples are man-eating shark, blue-collar, depend on, line of products.
- basic synset: This term is used to help explain the differences in entering synsets in lexicographer files.
- exception list: This is a list of morphological transformations for words that are not regular and therefore cannot be processed in an algorithmic manner.
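The role of the exception list can be illustrated with a small lemmatizer that consults the list first and falls back on suffix rules only for regular words. This is a hedged sketch: the `EXCEPTIONS` table, the `RULES` list and the `base_form` function are hypothetical and much simpler than WordNet's own morphological processing, which would also check each candidate against the dictionary.

```python
# Irregular forms that no suffix rule can derive (a tiny illustrative list).
EXCEPTIONS = {"geese": "goose", "mice": "mouse", "children": "child"}

# Regular suffix rules, tried in order: (ending, replacement).
RULES = [("ies", "y"), ("ches", "ch"), ("shes", "sh"), ("s", "")]


def base_form(noun):
    """Return the base form of a plural noun."""
    # Step 1: irregular words are looked up, not computed.
    if noun in EXCEPTIONS:
        return EXCEPTIONS[noun]
    # Step 2: regular words are handled algorithmically by suffix rules.
    for ending, replacement in RULES:
        if noun.endswith(ending) and len(noun) > len(ending):
            return noun[: len(noun) - len(ending)] + replacement
    # Step 3: nothing applied, so the word is returned unchanged.
    return noun


for word in ["geese", "ponies", "churches", "dogs", "sheep"]:
    print(word, "->", base_form(word))
```

Without the exception list, a form such as geese would pass through the rules unchanged, since no regular suffix rule relates it to goose.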
In the actual database, the spaces within a collocation are replaced by an underscore (_) character, giving forms such as man-eating_shark, depend_on, line_of_products and so on. The hypernym and hyponym relationships that exist between the noun synsets can be interpreted as specialization relations between conceptual categories. The application can therefore be interpreted and used as a lexical ontology in programming. Some corrections may be needed before this is done, since it contains hundreds of basic semantic inconsistencies such as (i) the existence of common specializations for exclusive categories and (ii) redundancies in the specialization hierarchy. Transforming WordNet into a lexical ontology usable for knowledge representation should normally also involve distinguishing the specialization relations into subtypeOf and instanceOf relations, and associating intuitive unique identifiers with each category (Fellbaum, May 1998).
An image of the application is shown below.
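The underscore convention amounts to a simple key normalisation. The sketch below is illustrative only: `to_key` and `to_display` are hypothetical helper names, and the examples are limited to collocations whose words are separated by spaces.

```python
def to_key(collocation):
    """Join the space-separated words of a collocation with underscores."""
    return "_".join(collocation.split())


def to_display(key):
    """Turn a database-style key back into readable text."""
    return key.replace("_", " ")


print(to_key("depend on"))
print(to_key("line of products"))
print(to_display("line_of_products"))
```

Storing each collocation as a single token without blanks lets a multi-word expression be looked up in the same way as a single word.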
In WordNet, nouns and verbs are organized into hierarchies that are defined by hypernym, or IS A, relationships. For example, the first sense of the word dog has a hypernym hierarchy in which the words at the same level are synonyms of each other: some sense of dog is synonymous with some senses of domestic dog and Canis familiaris, and so on. The topic map conversion of WordNet is based on the W3C's RDF version of WordNet. The conversion had the following (slightly simplified) steps (Wandora, 2007):
- Importing each RDF file of WordNet into Wandora as a separate layer.
- Manually fixing the RDF triplets of each imported layer to topic map associations. Generally this required mapping the RDF subject and object to association roles.
- Fixing certain subject identifiers of imported topics.
- Constructing base and variant names for all words. Base names were constructed from the URIs of RDF subjects, and variant names from the base names. Simple regular expressions were used in name construction.
- Creating a lightweight topic hierarchy to connect WordNet topics to Wandora's base ontology.
An image of the topic map conversion is provided below.
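The mapping of RDF subjects and objects to association roles, and the construction of names from URIs with simple regular expressions, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `example.org` URIs, the `ROLES` table and the function names are invented, and the code does not use or reproduce Wandora's actual interfaces.

```python
import re

# Hypothetical triple in the style of an RDF export; the URIs are made up.
TRIPLES = [
    ("http://example.org/wn/synset-dog-noun-1",
     "http://example.org/wn/schema/hyponymOf",
     "http://example.org/wn/synset-canine-noun-1"),
]

# Assumed mapping from predicate to (role of subject, role of object).
ROLES = {"hyponymOf": ("hyponym", "hypernym")}


def local_name(uri):
    """Return the part of a URI after the last '/' or '#'."""
    return re.split(r"[/#]", uri)[-1]


def variant_name(base):
    """Derive a display name from a base name: 'synset-dog-noun-1' -> 'dog'."""
    match = re.fullmatch(r"synset-(.+)-[a-z]+-\d+", base)
    return match.group(1).replace("_", " ") if match else base


def to_association(triple):
    """Turn one (subject, predicate, object) triple into an association."""
    subject, predicate, obj = triple
    kind = local_name(predicate)
    subject_role, object_role = ROLES[kind]
    return {
        "type": kind,
        "roles": {subject_role: local_name(subject), object_role: local_name(obj)},
    }


association = to_association(TRIPLES[0])
print(association)
print(variant_name(association["roles"]["hyponym"]))
```

The triple's predicate becomes the association type, while its subject and object become the players of two named roles, so the direction of the original statement is preserved.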
Difference between OED and WordNet
WordNet does not carry information such as pronunciation, etymology and the forms of irregular verbs, and it contains only limited information about usage. While it contains a wide range of commonly used words, it in no way compares to the OED in terms of content, use, cross references, hyperlinks and other features. It also does not carry special domain vocabulary. Because it acts as an underlying database for different applications, those applications may not be usable in specific domains that WordNet does not cover. On the other hand, the application is available as a free download and there are no subscription charges for its use. The dictionary is of little use to an advanced user; it can be used by college students and by students who develop computer programs and databases. WordNet is meant primarily for programmers and developers who know Unix commands.
The OED is a very user-friendly application with an advanced search engine that allows users to easily find the meaning of almost any word used in the English language. It provides meanings, spellings, pronunciations, etymologies, quotations, references, datelines and other information for nouns, verbs, adverbs, prepositions and all other constructs of the English language. It is meant for people who have no knowledge of programming languages but have basic skills in browsing the Internet. It requires payment of a subscription fee.
The OED and WordNet are both online dictionaries. While the OED is a simple, easy-to-use dictionary suitable for lay people, WordNet offers far more limited information about words and is meant for people with some expertise in programming. The OED repository holds more than 6 million items, while WordNet holds far fewer.
References
- Fellbaum, Christiane. 1998. WordNet: An Electronic Lexical Database. MIT Press. ISBN-13: 978-0-262-06197-1.
- Navigation. 2007. About the Dictionary Tour.
- New, Juliet. 2000. ‘The world’s greatest dictionary’ goes online. Web.
- OED. 2001. The Art and Craft of Lexicography, 2nd edition. Cambridge University Press.
- QRG. 2007. Quick Reference Guide.
- Wandora. 2007. Topic map conversion. Web.
- WordNet. 2007. About WordNet. Web.