Monday, 22 December 2014

Testing Oral Production in Language Testing Course

Testing Oral Production

A.    What is meant by Speaking a Second Language?
Speaking is a complex skill requiring the simultaneous use of a number of different abilities, which often develop at different rates. Five components are generally recognized in analyses of the speech process:
1.      Pronunciation (including the segmental features, vowels and consonants, and the stress and intonation patterns)
2.      Grammar
3.      Vocabulary
4.      Fluency (the ease and speed of the flow of speech)
5.      Comprehension (oral communication requires a subject to respond to speech as well as to initiate it)

B.     The Major Problem in Measuring Speaking Ability

The central problem is the lack of general agreement on what “good” pronunciation of a second language really means: is comprehensibility to be the sole basis of the judgement, or must we demand a high degree of both phonemic and allophonic accuracy? Until that question is settled, we cannot put much confidence in oral ratings.

C.     Types of Oral Production Test

Most tests of oral production fall into one of the following categories:
1.      Relatively unstructured interviews, rated on a carefully constructed scale

Scored Interview
The simplest and most frequently employed method of measuring oral proficiency is to have one or more trained raters interview each candidate separately and record their evaluations of his competence in the spoken language.
As with other types of highly subjective measures, the great weakness of oral ratings is their tendency to have rather low reliability. Positive steps can be taken to achieve a tolerable degree of reliability for the scored interview. These are:
1.      Providing clear, precise, and mutually exclusive behavioral statements for each scale point
2.      Training the raters for their tasks
3.      Pooling the judgements of at least two raters per interview

Use of More Than One Scorer
The scoring of oral ability is generally highly subjective. Even with careful training, a single scorer is unlikely to be as reliable as one would wish. If two testers are involved in a (loosely defined) interview, then they can independently assess each candidate. If they agree, there is no problem. If they disagree, even after discussion, then a third assessor may be referred to.
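The two-scorer procedure just described can be sketched in a few lines of Python. This is only an illustration: the function name `resolve_score`, the integer scale, and the rule that the third assessor's score is final are all assumptions, since the text says only that a third assessor "may be referred to".

```python
from typing import Optional


def resolve_score(rater_a: int, rater_b: int, third: Optional[int] = None) -> Optional[int]:
    """Return the agreed score, or None if a third assessor is still needed."""
    if rater_a == rater_b:
        # The two testers agree independently: there is no problem.
        return rater_a
    if third is None:
        # Disagreement remains after discussion: refer to a third assessor.
        return None
    # Assumption (not stated in the text): the third assessor's score is final.
    return third


agreed = resolve_score(4, 4)
pending = resolve_score(3, 4)
settled = resolve_score(3, 4, third=4)
```

An institution might equally well average the three scores or take the majority view; the sketch fixes one rule only to keep the example short.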

2.      Highly structured speech samples (generally recorded), rated according to very specific criteria
As a rule, highly structured speech sample tests are in several parts, each designed to elicit a somewhat different kind of speech sample.
1.      Sentence Repetition
The examinee hears and then repeats a series of short sentences.
Scoring procedure: the rater listens for two specific pronunciation points per sentence, marking whether or not each is pronounced in an acceptable way.
Sentences                                   Point to be rated
1.      Jack always likes good food         vowel contrast in good:food
2.      We’ll be gone for six weeks         vowel contrast in six:weeks
3.      They’ve gone farther south          voiced-voiceless fricative contrast in farther:south

2.      Reading Passage
The examinee is given several minutes to read a passage silently, after which he is instructed to read it aloud at normal speed and with appropriate expression.
Scoring procedure: the rater marks two or more pronunciation points per sentence and then makes a general evaluation of the fluency of the reading.
Examiner’s copy of the test                       Points to be rated
While Mr. Brown read his newspaper his wife       primary stress
finished packing his clothes for the trip.        voiced final consonant (s)
The suitcase was already quite full, and she      vowel quality
was having a great deal of difficulty finding     primary stress
room for the shirts, socks, and handkerchiefs.    series intonation
Turning to her husband, she asked, “Are you       consonant cluster
sure you really want to go on this trip?”         intonation contour
“I’m sure,” replied Mr. Brown,                    intonation contour
“but how about you?”                              stress and pitch

3.      Sentence Conversion
The examinee is instructed to convert or transform sentences in specific ways (from positive to negative, statement to question, present tense to past). The voice on the tape gives the sentences one at a time, the examinee supplying the conversion in the pause that follows.
Scoring procedure: the rater scores each converted sentence on the basis of whether or not it is grammatically acceptable.

4.      Sentence Construction
The voice on the tape asks the examinee to compose sentences appropriate to specific situations.
Scoring procedure: the rater scores each sentence on an acceptable-unacceptable basis.
Example:
1. “You are trying to find the post office in a strange city. Ask a policeman for directions.”
2. “You have telephoned your friend Mary, but her mother answers and tells you that Mary is not at home. Ask her to leave a message for Mary to call you when she comes home.”



5.      Response to Pictorial Stimuli
The examinee is given time to study each of a series of pictures and then briefly describes what is going on in each scene.
Scoring procedure: for each picture the rater gives a separate rating of the examinee’s pronunciation, grammar, vocabulary, and fluency.
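As a rough illustration of how the separate per-picture ratings might be brought together, the Python sketch below averages each component across all the pictures. The five-point scale, the sample ratings, and the name `component_averages` are hypothetical; the text does not say how the separate ratings are combined.

```python
from statistics import mean

# The four components rated separately for each picture.
COMPONENTS = ("pronunciation", "grammar", "vocabulary", "fluency")


def component_averages(picture_ratings):
    """Average each component's rating across all pictures."""
    return {c: mean(r[c] for r in picture_ratings) for c in COMPONENTS}


# Hypothetical ratings for two pictures on an assumed 1-5 scale.
ratings = [
    {"pronunciation": 4, "grammar": 3, "vocabulary": 4, "fluency": 3},
    {"pronunciation": 4, "grammar": 4, "vocabulary": 2, "fluency": 3},
]
profile = component_averages(ratings)
```

Keeping the components apart, rather than collapsing them into one number, preserves the point made earlier that the different abilities often develop at different rates.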

3.      Paper-and-pencil objective tests of pronunciation, presumably providing indirect evidence of speaking ability.
Characteristic item types appearing in paper-and-pencil pronunciation tests:
1.      Rhyme words. The examinee is first presented with a test word which he is instructed to read to himself, after which he is to select, from among several alternatives, the one word which rhymes with the test word.
Ex:     1. could rhymes with     a. blood
                                 b. food
                                 c. would
        2. plays rhymes with     a. case
                                 b. raise
                                 c. press
2.      Word stress. The examinee is to decide which syllable in each test word receives the heaviest stress.
Ex:     1. frequently
        2. introduce
        3. develop
3.      Phrase stress. The examinee is to decide which one of several numbered syllables in each utterance would receive the heaviest stress.
Ex:     1. I know that Henry went to the movie, but where did John go?
        2. I’m certain Professor Brown wants to see you, but he’s in class just now.
To have confidence in such paper-and-pencil tests of speaking ability, we would need strong statistical evidence of their validity, that is, evidence that they are really testing what they purport to test. We would need some trustworthy external criterion (some reliable measure of how the subjects actually do speak), and it is the lack of such a measure that is still the chief stumbling block to all our efforts to evaluate oral production with real precision. Testing specialists who have used the paper-and-pencil objective tests have attempted to validate them by comparing test results with judges’ evaluations of the subjects’ oral reading of the test items.
Although we have been unable to establish either the validity or the invalidity of these tests by rigorous statistical methods, we can cite a number of observations which cast considerable doubt on their efficacy. First, the users of such tests have frequently observed that some students with superior pronunciation have done poorly on the tests, while high scores have sometimes been obtained by students who could barely be understood.
Secondly, one cannot help wondering about the technique of testing the production of the segmental phonemes by means of rhyme items.
Thirdly, even a casual examination of the range of problems treated in these tests inspires the strongest suspicions that they sample the total sound system most inadequately.
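The statistical validation discussed above is commonly expressed as a correlation between scores on the test and scores on an external criterion. The Python sketch below computes a Pearson correlation coefficient by hand. The data are invented purely for illustration, and the judges' ratings stand in for exactly the kind of trustworthy criterion that the text says is still lacking.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per student, same order in both lists.
paper_test_scores = [12, 15, 9, 18, 14]
judges_ratings = [3, 4, 2, 5, 3]
r = pearson_r(paper_test_scores, judges_ratings)
```

A coefficient near 1 would support the test's validity only to the extent that the criterion itself is reliable, which is the difficulty the text identifies.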
Summary
1.      The validity of paper-and-pencil objective techniques remains largely unproven; such techniques should therefore be used with caution, and certainly never as the sole measure of oral proficiency.
2.      The techniques of eliciting and rating highly structured speech samples show much promise, but such testing is still in the experimental stage and requires very great test-writing skill and experience.
3.      The scored interview, though not as reliable a measure as we would wish, is still probably the best technique for use in relatively informal, small-scale testing situations, and ways can be shown for substantially improving the effectiveness of this testing device.

D.    Improving the Scored Interview
General procedures
1.      Decide in advance on interview methods and rating standards.
Interviewers should devote approximately the same length of time to each interview, speak to the candidates at about the same rate of speed, and maintain the same level of difficulty in the questions they ask. For the scoring, the raters should reach basic agreement on methods and standards, ensuring a reasonable degree of uniformity.
2.      Conduct the interviews in some quiet place with suitable acoustics
A noisy room or poor acoustics will naturally impose an unfair burden on the candidates and greatly reduce the reliability and the validity of the ratings.
3.      Reserve sufficient time for each interview
Ten to fifteen minutes would seem essential as the minimum for each interview, though the time required will vary somewhat from candidate to candidate.
4.      Use at least two raters for each candidate
At least two independent ratings are necessary if satisfactory rater reliability is to be obtained.
5.      Rate the candidate without reference to other test scores
Candidates’ scores on other tests should be withheld from the raters until after they have completed their evaluations.
6.      Record the rating after the interview
Scoring should be done after the candidate leaves the room, and the next examinee should not enter until the marking has been completed.
7.      Obtain each candidate’s final score by pooling or averaging the two or more ratings that have been given him.
Where the ratings diverge widely, the candidate should be called back for a second evaluation.
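The pooling step can be sketched as follows in Python. The averaging follows the procedure above; the recall flag assumes that a second evaluation is reserved for candidates whose ratings diverge widely, and the `max_spread` threshold of one scale point is a hypothetical value, not one given in the text.

```python
from statistics import mean


def final_score(ratings, max_spread=1):
    """Average the ratings and flag the candidate if the raters diverge widely.

    Returns a (score, needs_second_evaluation) pair.
    """
    if len(ratings) < 2:
        # The procedure calls for at least two independent ratings.
        raise ValueError("at least two independent ratings are required")
    spread = max(ratings) - min(ratings)
    # max_spread is an assumed threshold chosen for illustration only.
    return mean(ratings), spread > max_spread


close_ratings = final_score([3, 4])
wide_ratings = final_score([2, 5])
```

In this sketch both candidates receive the same average, but only the second would be called back, which shows why the spread is worth checking alongside the pooled score.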
Suggestions for conducting the interview
1.      Beginning the interview
The interviewer should begin with social questions, speaking at normal conversational speed. If the candidate cannot comprehend what is being said, the interviewer should modify his speech somewhat, speaking more slowly and with some simplification of sentence structure and vocabulary, while making a mental note to score the candidate accordingly.
2.      Continuing the interview
The interviewer should move on to other areas of discussion and follow lines of questioning which the examinee has not been able to anticipate.
3.      Concluding the interview
Whatever the precise form that the conclusion takes, care should be taken not to give the candidate the impression that he is being cut off in the middle of a discussion.

CONCLUSION

The accurate measurement of oral ability is not easy. It takes considerable time and effort to obtain valid and reliable results. Nevertheless, where backwash is an important consideration, the investment of such time and effort may be considered necessary.
Readers are reminded that the appropriateness of content, descriptions of criterial levels, and elicitation techniques used in oral testing will depend upon the needs of individual institutions or organizations.





REFERENCES

Harris, David P. Testing English as a Second Language.

Hughes, Arthur. Testing for Language Teachers. Cambridge University Press.
