
Corpus Linguistics

2023-07-12 Source: 要发发知识网


Summary of Corpus Linguistics, the book written by Douglas Biber, Susan Conrad, and Randi Reppen

Studies of language can be divided into two main areas: studies of structure and studies of use. Traditionally, linguistic analyses have emphasized structure: identifying the structural units and classes of a language and describing how smaller units can be combined to form larger grammatical units.

A different perspective, which is the focus of this book, is to emphasize language use. From this perspective, we can investigate how speakers and writers use the resources of their language. Rather than looking at what is theoretically possible in a language, we study the actual language used in naturally occurring texts.

Many studies of language use focus on a particular linguistic structure, investigating the ways in which seemingly similar structures occur in different contexts and serve different functions. Consider, for instance, three sentences that complete the meaning of the same verb in structurally different ways. A structural analysis would describe the grammatical similarities and differences among these three sentences, and all three options are equally grammatical ways to complete the meaning of the verb. However, an analysis of language use goes beyond traditional grammatical description to ask why the language should have multiple structures that are so similar in their meaning and grammatical function.

A corpus is a large and principled collection of natural texts. In subsequent chapters of this book, readers are introduced to some of the important issues relating to size and representativeness in corpus design. For some of the example analyses in the book, we have used corpora especially designed to address specific research questions. However, many of the example analyses use well-known corpora that are publicly available. There are four corpora (or parts of corpora) that we have used repeatedly in this book: the London-Lund Corpus, the Lancaster-Oslo/Bergen (LOB) Corpus, the conversation register from the British National Corpus (BNC), and certain registers from the Longman-Lancaster Corpus. Further information about them, including how to obtain them, is provided in the appendix.

All texts in the London-Lund, BNC, and LOB corpora are from British English. The academic prose and fiction texts in the Longman-Lancaster Corpus come from authors of both British and American English. Given the scope of the present book, and the large number of linguistic investigations already included, we have chosen to disregard national dialect differences here. However, the techniques introduced in this book could also be used to investigate differences across national dialects, and we would expect such a study to uncover interesting patterns of variation.

All of the chapters in this book are developed around example analyses. The introduction to each chapter states the research questions that will be addressed, so that you know exactly what aspects of language use are under investigation and why they are important. The discussion for each example analysis then includes the methodology, results, and interpretation of the results. The sample analyses are used to teach many aspects of corpus-based analysis. You will learn about the new kinds of research questions that can be asked and the new findings uncovered from corpus-based studies. In addition, you will learn about the kinds of analytical procedures needed to address these questions and the kinds of decisions that researchers make during corpus-based analyses. Although each of the chapters follows this general outline, they address language issues with a slightly different focus, as described in the next section.

Throughout this book we have emphasized the usefulness of the corpus-based approach for studying how speakers and writers use the linguistic resources available to them in their language. This approach takes advantage of: computers' capacity for fast, accurate, and complex analyses; the extensive information about language use found in large collections of natural texts from multiple registers; and the rich descriptions that result from integrating quantitative findings and functional interpretation. For these reasons, the corpus-based approach has made it possible to conduct new kinds of investigations into language use and to expand the scope of earlier investigations.

These advantages apply to the study of individual linguistic features as well as the characterization of language varieties. Thus, for example, the application of corpus-based techniques in lexicography makes it possible to study the collocations of words in a comprehensive way, identifying differences in the preferred senses of related words. Similarly, many researchers in the past have been interested in differences between speech and writing, but until the development of corpus-based methods it was impossible to discover the patterns of co-occurring features and dimensions of variation that characterize the two modes.
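The kind of collocation study mentioned above can be illustrated with a minimal sketch. It is not the procedure used in the book: the function names `tokenize` and `collocates`, the window size, and the toy sample text are all assumptions made for illustration, and a real study would work from a large, principled corpus rather than two sentences.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, window=2):
    """Count the words found within `window` tokens on either side of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Hypothetical toy text; a real study would read a large corpus instead.
sample = (
    "It was a big deal and a big mistake. "
    "A large number of people paid a large amount."
)
tokens = tokenize(sample)

big = collocates(tokens, "big", window=1)
large = collocates(tokens, "large", window=1)

# Words that neighbour one adjective but not the other in this sample.
print(sorted(set(big) - set(large)))
print(sorted(set(large) - set(big)))
```

Even on this tiny sample, the two near-synonyms attract different neighbouring words. With a full corpus, the same counts would normally be combined with frequency thresholds or an association measure before drawing any conclusions.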

As an approach to linguistics, corpus-based analysis provides a new perspective on language use. The findings of the example analyses presented in Corpus Linguistics, as well as the findings of other corpus-based studies, show that there are strong, systematic patterns in the way that language is used. In fact, one of the most consistent findings in corpus-based studies concerns the importance of register variation. You have seen how linguistic features from all levels, including lexical collocations, word frequencies, nominalizations, dependent clauses, and a full range of co-occurring features, have patterned differences across registers. Therefore, characterizations of "general English" are usually not characterizations of any variety at all, but rather a middle ground that describes no actual text or register.
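A small sketch can make the idea of patterned differences across registers concrete. It compares how often nominalizations occur in two registers, with counts normalized to a rate per 1,000 words so that texts of different lengths can be compared. Everything here is an assumption for illustration: the suffix pattern is a crude heuristic that will miscount words such as "city" or "moment", the function name `rate_per_thousand` is hypothetical, and the two one-sentence "registers" are invented stand-ins for real corpus samples.

```python
import re

# Assumed heuristic: treat words with these endings as nominalizations.
NOMINALIZATION = r"\w+(?:tion|ment|ness|ity)s?"


def rate_per_thousand(text, pattern):
    """Return how often tokens matching `pattern` occur per 1,000 words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if re.fullmatch(pattern, token))
    return 1000.0 * hits / len(tokens)


# Hypothetical miniature "registers"; real registers would hold many texts.
registers = {
    "conversation": "Well, I think we can talk about it later, you know.",
    "academic": (
        "The investigation of variation requires careful "
        "consideration of the distribution."
    ),
}

rates = {
    name: rate_per_thousand(text, NOMINALIZATION)
    for name, text in registers.items()
}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 1,000 words")
```

Averaging the two rates would give a figure that matches neither sample, which is the point made above about "general English".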

Findings about the strong association patterns in language, and about the importance of register, show how the corpus-based approach identifies important characteristics that need to be included in a full description of language use. The corpus-based approach can thus bring to the fore aspects of language use that have not received attention in traditional studies.

Our primary goal in the present book has been to illustrate the wide range of research questions within linguistics that can be addressed with the corpus-based approach. Thus, we included investigations of individual words, grammatical constructions, discourse patterns, varieties of texts, language acquisition, and historical and stylistic issues. However, the breadth of work that is possible with corpus-based studies is far greater than we have been able to illustrate in the example analyses here.

First of all, for the areas of research included in the book, many additional investigations could be done by asking related research questions and applying similar analytical techniques. For example, in grammar, you might be interested in the distribution of other word classes or structures across registers, or you might be interested in other structural variants, such as the factors influencing the choice between agentless passives and by-passives. For discourse analyses, you might be interested in tracking the use of other kinds of linguistic features, such as markers of stance. In the area of register analysis, you might be interested in registers used in the business world, investigating memos, reports, and similar documents.

In addition, there are other research questions that would require slightly different analytical techniques or different specialized corpora. In lexicography, for example, you might want to investigate specific kinds of collocations. Studies of this type might consider the association between synonyms and positive versus negative sentences, or the lexical associations of words within a specific closed class. Language acquisition studies could investigate questions such as: How does language development vary across learners from different language backgrounds? How do learners of different ages vary in their language use? Do second-language students' errors vary after instruction of different types? A historical perspective can also be used for lexical and grammatical issues: for example, how did the use of nominalizations develop across registers over several time periods? What changes occur in the preferred collocates of a word over time? Within historical linguistics, investigations could focus on issues such as: Have languages changed linguistically in similar ways over time? How does increased literacy affect the linguistic features found in a language? Based on your own particular interests, you can probably think of a large number of other research questions that could be addressed with the corpus-based approach.

There are also some areas of linguistics that we have not covered at all in this book. One very important area that we have neglected is computational linguistics and natural language processing. Much previous and ongoing research in this area uses corpus-based techniques for a variety of applications, including the development of taggers and parsers, information retrieval, text processing and production, and machine translation. There are numerous books and journals dealing with these topics; we refer interested readers to several of these at the end of the chapter.

Furthermore, though we have used English for the examples in the book, corpus-based research is equally important for the study of languages other than English. Such studies include investigations of lexical items and grammatical features in other languages, investigations of dimensions of variation in other languages, and comparisons of registers and the use of features across different languages.

Finally, the corpus-based approach can be used to study specific situations that are important in society, in order to learn more about the language used in those contexts. Workplace and classroom discourse, for example, are fruitful arenas for a range of studies examining language use across speakers, cultural groups, genders, and positions of differential power. As with other areas, analysis of a large body of authentic language can show the actual language patterns being used, rather than having to rely on intuition or anecdotes, which may or may not be accurate reflections of the dynamics between teachers and students or managers and employees.
