A Corpus-Based Research Study of the Play The Tragedy of Hamlet, Prince of Denmark

Corpora have been put to many different uses in fields as varied as natural language processing, critical discourse analysis and applied linguistics, to mention just a few. Corpus linguistics is an area of study based on the simultaneous analysis of many texts. Its aim is to search for linguistic patterns or language units such as keywords, lexemes, phraseological patterns and grammatical associations. The most prominent way of studying phrases in the corpus linguistic approach is collocation, the statistical tendency of words to co-occur. For example, the words big, good and great are collocates of the noun deal, as in a big deal, a good deal and a great deal.

In contrast, software that presents concordance lines simply identifies the target item (usually a word or phrase) each time it occurs in the corpus and presents each instance, or as many as are required, to the corpus user. Usually this is done with the target item in the center of the screen and a few words to the left and right of that item. This ‘key word in context’ presentation, as it is known, has a number of uses. Even the small amount of context is usually enough to show what the word or phrase means, what phrases it often occurs in, and/or the discourse function that it has. Quantitative information about word meaning and function that is not available automatically can therefore be calculated.

The character of a corpus is determined by the type of texts that constitute it. Whereas Dr. Johnson’s corpus consisted largely of works by Shakespeare, Milton, Dryden and other literary figures, a modern general corpus will contain both written and transcribed spoken material from a wide range of media such as books, magazines, newspapers, emails, television, radio and conversations. In this study, I apply corpus methods to the analysis of Shakespeare’s Hamlet. The research questions are:

  • What lexical patterns are characteristic of Shakespeare’s play?
  • What textual meaning do those patterns suggest?
  • What are the strengths and limitations of the corpus approach in the study of literary texts?

In the following sections I first give a brief introduction and then an account of the methodology adopted in this study, including the corpus descriptive tools and data preparation. The results of the study are then reported, followed by a discussion of the strengths and limitations of the corpus-driven approach to literary texts, as observed from a corpus of Shakespeare’s play Hamlet.


The study begins with the observation that corpus linguistic techniques have seldom been adopted in the stylistic analysis of literary texts. A corpus-driven approach was applied to an analysis of Shakespeare’s play Hamlet in order to see how well this method works with literary texts. The limitations of automatic analysis of texts should always be recognized; after all, computers will still only do what humans tell them to do.


  • Descriptive Tools:

There are three corpus descriptive linguistic tools that are used in this study to explore stylistic features and their textual functions in Shakespeare’s play The Tragedy of Hamlet, Prince of Denmark. The tools are: “keyness”, “concordances”, and “collocations”.

  • Keyness:

To answer the research questions stated above, the concept of “keyword” in corpus linguistics is a starting point in the analytical procedures. According to Scott and Tribble (2006), keywords are lexical items of significance to a text in question, because of their “unusual frequency in comparison with a reference corpus of some suitable kind”. The “unusual frequency” here refers to both “unusually high” and “unusually low” frequency. For the purpose of the present study, only items with “unusually high” frequency are considered.

Keywords are not necessarily the most frequent words in a text. They are important to the text because they are used “unusually often” when compared with other texts. To illustrate, if we consider a corpus of Shakespeare’s Hamlet alone, or even of all of Shakespeare’s works, the definite article “the” is found to be the most frequent word. But we cannot say that it marks Shakespeare’s writing style: because “the” is an article, it is likely to be used very often in any piece of writing, not just in Shakespeare’s plays. We therefore need to compare Shakespeare’s works with those of other authors to see whether he really used “the” significantly more often than others did. When compared with other authors’ writing, the article does not turn up as a keyword in Shakespeare’s works, because it is also used frequently by other authors. On the other hand, the word “blood”, whose frequency is lower than that of “the”, appears in the keyword list of Shakespeare’s Hamlet. This means that Shakespeare used the word “blood” significantly more often than other writers. Therefore, as Baker (2006) puts it, a keyword list gives a measure of saliency, not just frequency, of the lexical items in a text and hence can suggest further examination of their textual functions. This is the reason why keywords are fundamental to the corpus-stylistic analysis of Shakespeare’s play Hamlet in the present study: through the Keyword function in Rayson’s (2007) Wmatrix Tools (see below), lexical items that are characteristic of Hamlet are extracted, some of which will be further investigated in detail.

According to Scott and Tribble (2006), three kinds of words usually come out of a comparison as keywords: proper nouns, words that “human beings would recognize” as key, which tend to indicate a text’s “aboutness”, and words that are not usually identified consciously by readers as key but nonetheless occur in significantly high frequencies and so can be indicators of the style of a text, rather than of its content.
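The keyness comparison described above can be illustrated with a short sketch. It is not the procedure run by Wmatrix in this study. It assumes the log-likelihood statistic, which is widely used for keyword extraction, and two corpora that have already been split into tokens. The function names, the choice of reference corpus and the 6.63 cut-off (the chi-square critical value for p < 0.01 at one degree of freedom) are assumptions made for illustration only.

```python
import math
from collections import Counter


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood (G2) score for one word in a study vs. a reference corpus."""
    total = size_study + size_ref
    combined = freq_study + freq_ref
    # Expected frequencies if the word were spread evenly over both corpora.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    for observed, expected in ((freq_study, expected_study),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def keywords(study_tokens, ref_tokens, min_g2=6.63):
    """Return (word, score) pairs that are unusually frequent in the study corpus.

    Both token lists are assumed to be non-empty. Only "unusually high"
    frequencies are kept, as in the present study.
    """
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    results = []
    for word, freq in study.items():
        # Skip words that are not relatively more frequent in the study corpus.
        if freq / n_study <= ref[word] / n_ref:
            continue
        score = log_likelihood(freq, n_study, ref[word], n_ref)
        if score >= min_g2:
            results.append((word, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

A word such as “the”, which has a similar relative frequency in both corpora, scores close to zero and drops out, whereas a rarer word that is concentrated in the study corpus can score highly. This is the difference between frequency and saliency noted above.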

  • Collocations:

After keywords are extracted, their significance in the play needs to be explained. To this end, the concept of collocation, defined by Hoey (1991: 67) as “the relationship a lexical item has with items that appear with greater than random probability in its (textual) context”, is drawn upon. This definition emphasizes that the collocation of a word is not just a random co-occurrence of words, e.g. “she + is”; rather, the co-occurrence takes place in a text for some reason, as seen from the phrase “with greater than random probability in its (textual) context”. For example, as shown by Stubbs (2001: 28), common collocates of the word “seek” include “help”, “advice” and “support”. An examination of the collocational patterns of a word can therefore show the relationship between lexical items in a text, which in turn shows how words are used to create meanings in that text. To find out which words co-occur with the keywords “with greater than random probability” in Hamlet, a computer-assisted extraction of collocates, through the statistical measure Mutual Information (MI), is adopted.
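As a rough illustration of the Mutual Information measure mentioned above, the sketch below scores the collocates of a node word. It assumes one common, simple formulation, MI = log2(f(node, collocate) × N / (f(node) × f(collocate))), where N is the corpus size in tokens. Corpus tools differ in how they adjust for the size of the span, so the scores are not expected to match those of any particular program. The function name, the span of four words on either side and the minimum co-occurrence count are assumptions.

```python
import math
from collections import Counter


def mi_collocates(tokens, node, span=4, min_cooc=2):
    """Rank the words found within `span` tokens of `node` by a simple MI score."""
    freq = Counter(tokens)
    n = len(tokens)
    cooc = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        # Count every word in the window around this occurrence of the node.
        start, stop = max(0, i - span), min(n, i + span + 1)
        for j in range(start, stop):
            if j != i:
                cooc[tokens[j]] += 1
    scores = {}
    for word, pair_freq in cooc.items():
        if pair_freq < min_cooc:  # MI is unreliable for very rare pairs
            continue
        scores[word] = math.log2(pair_freq * n / (freq[node] * freq[word]))
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```

MI tends to favour rare words that happen to occur near the node, which is why a minimum co-occurrence threshold is usually applied before the list is interpreted.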


Table 1: Key Semantic Fields in Shakespeare’s Hamlet

Key Semantic Field               Sample Words in the Semantic Field
Degree boosters                  so, more, most, very, much
Kin                              father, mother, brother
Tragedy                          dead, blood, murder, sword, poison
Royalty                          my lord, majesty, court
                                 love, heaven, God, ghost, madness, hell, alas
Strong obligation or necessity   should, must, have to


Table 2: Key Grammatical Categories in Hamlet

Key Grammatical Category                            Sample Words in the Grammatical Category
Degree adverb                                       so, more, most, very, much
Be – infinitive
Modal auxiliary                                     should, would, could, might
Third person singular objective personal pronoun    he, him, her
Have – infinitive



  • Concordance:

A concordance is an alphabetical list of all the words used in a book or set of books, with information about where they can be found and usually about how they are used. In other words, a concordance shows the words of a literary work in all the contexts in which they are used. To find concordances in Shakespeare’s play Hamlet, we searched our data for all the instances of a word, together with the surrounding context of each instance found. The result is called a key-word-in-context (KWIC) concordance. We typed in a certain word (the key word), and in a few seconds we had a list of neatly aligned examples of that word as found in the data. We then sorted the list in a variety of ways in order to get a clearer picture of how the word relates to its context. No matter how we sorted the lines, their context remained available.
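The key-word-in-context procedure just described (search for a word, keep a few words on either side, then sort the lines) can be sketched as follows. This is a minimal illustration and not the tool used in the study. The function names, the simple lower-casing tokenizer and the default context width are assumptions.

```python
import re


def tokenize(text):
    """Lower-case the text and keep runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, keyword, width=4):
    """Return (left context, keyword, right context) for every hit."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


def sort_kwic(lines, side="right"):
    """Sort lines by right context, or by the words nearest the keyword on the left."""
    if side == "right":
        return sorted(lines, key=lambda line: line[2])
    return sorted(lines, key=lambda line: line[0].split()[::-1])


def format_kwic(lines):
    """Line up the keyword in one column, as in a concordance display."""
    pad = max((len(left) for left, _, _ in lines), default=0)
    return "\n".join(f"{left:>{pad}}  {kw}  {right}" for left, kw, right in lines)


# Usage with a single well-known line from the play.
hits = kwic(tokenize("To be, or not to be, that is the question"), "be", width=2)
print(format_kwic(sort_kwic(hits)))
print("No. of hits:", len(hits))
```

Because each line carries its own context, re-sorting changes only the order of the lines. This is why the context remains available however the list is sorted.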

Table 3: Concordance in Shakespeare’s play Hamlet

No. of hits

Interpretations of Overall Findings

After tabulating the key semantic fields and key grammatical categories, it can be observed that a number of items overlap, whether within the same category or across different linguistic groups. The keywords “so”, “more” and “most” belong to the key grammatical category “degree adverb” as well as to the key semantic field “degree boosters”. This shows that these overlapping items are especially characteristic of Shakespeare’s play Hamlet.

When these overlapping categories and items are grouped together, they yield the following groups of key linguistic features that mark the style of Shakespeare’s play.

(1) Words related to tragedy, comprising the keywords “dead”, “sword”, “poison”, “blood” and “murder”, which lie in the key semantic field “tragedy”.

(2) Words showing family relationships, comprising the keywords “father”, “mother” and “brother”, which lie in the key semantic field “kin”.

(3) Words showing royalty or the noble class, comprising “my lord”, “majesty” and “court”, which lie in the key semantic field “royalty”.

(4) Words related to high degree, comprising the keywords “so”, “more”, “most”, “very” and “much”, which lie in two fields: the key semantic field “degree boosters” and the key grammatical category “degree adverb”.

(5) Modal verbs, comprising the keywords “should”, “would”, “could” and “might”, which lie in the key grammatical category “modal auxiliary”.

On the basis of Scott and Tribble’s categorization of keywords, the above groups of linguistic features can be divided into two main groups. The first consists of lexical items that can be identified through observation and that suggest the content of the text. It comprises the tragedy, kin and royalty groups. The lexical items of these groups match what has been discussed in literary criticism of Shakespeare’s play Hamlet. For instance, the play deals with royal life, which is reflected in the keyness of “my lord”, “majesty” and “court” and of the semantic field “royalty”. The play is tragic, as indicated by keywords like “dead”, “sword”, “poison”, “blood” and “murder” in the semantic field “tragedy”.

The second group consists of lexical items that, as Scott and Tribble state, are not consciously identified by readers as key but nonetheless occur with significantly high frequency, and so can be regarded as indicators of the style of the text rather than of its content.

Words Related To High Degree:

Among the groups of key linguistic features, words related to high degree are the most characteristic of Shakespeare’s play Hamlet. This is shown by their occurrence at several linguistic levels:

  • Key words:

“so”, “more”, “most”

  • Key semantic field:

“degree booster”

  • Key grammatical category:

“degree adverb”

It is not only the density of words related to high degree but also the degree of their keyness that marks their significance to Hamlet.

An examination of the concordance lines of the words in this group shows that words denoting a high degree are used in close proximity to one another.

A strong density of high-degree words at certain points in the play constitutes an exaggerated discourse in Hamlet. This exaggeration encourages readers to feel that the part of the text they are reading cannot be interpreted at face value.
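The observation that high-degree words occur close to one another was made by reading concordance lines. A simple way to locate such passages automatically is sketched below. The word list is taken from the sample words for “degree boosters” in Table 1. The function name, the window size and the minimum number of hits are assumptions chosen for illustration, not settings used in the study.

```python
DEGREE_WORDS = {"so", "more", "most", "very", "much"}


def dense_windows(tokens, targets=DEGREE_WORDS, window=10, min_hits=3):
    """Find stretches where `min_hits` target words fall within `window` tokens.

    Returns (start, end) token positions of the first and last target word in
    each qualifying stretch. Overlapping stretches are reported separately.
    """
    positions = [i for i, token in enumerate(tokens) if token in targets]
    hits = []
    for k in range(len(positions) - min_hits + 1):
        start = positions[k]
        end = positions[k + min_hits - 1]
        if end - start < window:
            hits.append((start, end))
    return hits
```

Each reported stretch still has to be read in context, since the sketch only counts words and cannot tell whether a passage is in fact exaggerated.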

Auxiliary “Be” And “Have”:

The auxiliaries “be” and “have” are helping verbs which generally do not express the content of a text. In this play, however, their role seems to be greater than that of other groups. “Be” occurs 222 times and “have” 183 times.

To find out in what sort of textual environment these verbs are used, we extracted word clusters, listed below with their frequency in parentheses.

Word clusters of “be”

To be (34)

Be , - - be (22)

be- - (24)

Word clusters of “have”

have you (15)

you have (15)

have  ? (18)
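Cluster lists of this kind can be produced by counting the contiguous word sequences that contain the word of interest. The sketch below shows one way to do so. It assumes a tokenized text and treats a cluster as a fixed-length sequence of adjacent tokens. The function name and parameters are assumptions, and the settings of the tool used in the study are not known.

```python
from collections import Counter


def clusters(tokens, word, size=2, min_freq=1):
    """Count contiguous `size`-word sequences that contain `word`."""
    counts = Counter()
    for i in range(len(tokens) - size + 1):
        gram = tuple(tokens[i:i + size])
        if word in gram:
            counts[" ".join(gram)] += 1
    # Most frequent clusters first, dropping those below the threshold.
    return [(gram, n) for gram, n in counts.most_common() if n >= min_freq]
```

Calling `clusters(tokens, "be", size=2)` on the tokenized play would give two-word clusters such as “to be” with their frequencies, while `size=3` would give three-word clusters.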

Thus the corpus-based approach has drawn attention to the fact that it is this group of lexical items that is used strategically to create and hint at meaning between the lines in this play.

