Sunday, January 13, 2008

Assessment in the Classroom

James Travers-Murison

McInerney (2002, pp. 350-3) states that assessment requires measurement of quantitative data followed by qualitative judgement of that data, called evaluation. The measurement must have validity: it must measure what it purports to measure (face validity); it must match the objectives for which it was designed (content validity); and it must compare with other assessments to check that it accurately assesses a student (criterion validity evidence). If it is accurate, this is called concurrent validity. Construct validity evidence measures the underlying theoretical construct of the assessment. Factor analysis is one method of providing such evidence: a statistical procedure that discovers structure within a large number of variables, reducing them to more basic composite variables called factors. Mau's item analysis may be an example (Zajda, p. 166).

Measurement must also be reliable over time. Coefficients of stability test whether results are consistent when a test is repeated; if results on equivalent forms of a test are similar, there is a coefficient of equivalence. Internal consistency tests whether the assessment produces similar results in different sections covering similar material (the Cronbach alpha technique). Finally, inter-marker reliability correlates scores from different markers.
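The Cronbach alpha check of internal consistency mentioned above can be sketched as a short calculation. The scores below are hypothetical (rows are students, columns are items on one test); values of alpha near 1 indicate that the items measure the same underlying material consistently.

```python
# Minimal sketch of Cronbach's alpha; the scores are hypothetical.
from statistics import pvariance

scores = [
    [8, 7, 9, 8],   # student 1's mark on each of 4 items
    [5, 6, 5, 4],
    [9, 9, 8, 9],
    [3, 4, 2, 3],
    [7, 6, 7, 8],
]

def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])                            # number of items
    items = list(zip(*rows))                    # transpose: one tuple per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var / total_var)

alpha = cronbach_alpha(scores)
print(round(alpha, 2))   # close to 1: these items are internally consistent
```

A teacher comparing two sections of a test on similar material could run the same calculation on each section and compare the two alpha values.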

McInerney (p. 352) says teachers generally do not test the validity of their assessments. Problems often stem from content being taught in a different manner to that assessed, or from assessments including concepts that were not taught: for example, teaching only addition but setting a test on addition and perimeters. The assessment measurement may also be inappropriate, for instance using multiple choice to test fluent writing style.

Achievement Targets seem to be the catch cry currently being bandied about by people like Rosalind Mau (Zajda, 2003) as the way to obtain the best in assessment. They aim to create in students five factors: substantive knowledge; higher-order thinking skills; behaviour to be demonstrated as an outcome during the class; a product to be achieved; and assessment that is tested, measured and evaluated.

Achievement Targets are met by two types of assessment. Formative assessment measures learning outcomes and monitors learning progress during the term. Summative assessment is used, generally at the end of the unit studied, to determine whether the instructional goal has been achieved.

High Quality Assessment should have clearly stated achievement targets, a matched assessment method so that curriculum content is consistent with assessment, and a variety of assessment methods, i.e.:

1. Tests, often called 'pencil and paper'.

2. Performance assessment: a checklist of specific behaviours in a written report or oral presentation.

3. Personal communication: interviews, journal writing, discussion, conferences.

4. Portfolios: combine the above with samples of students' work.

Norm Referenced Tests are standardized tests that compare students' performance against one another; they are considered meritocratic. Criticisms are that they can mismatch with a school's curriculum, that only superficial knowledge based on recall is tested, and that they tend to label poor students, reducing their self-efficacy. However, they are a simple and fairly accurate filtering system for higher education, and are commonly used and encouraged by government.

Criterion Referenced Tests are also standardized but measure students against certain criteria, which can be specified by the teacher. They are therefore more flexible and less judgmental than norm-referenced tests: there is no statistical comparison to other students, and all can pass if they have the knowledge and display it according to the criteria.

Meritocratic Assessment is the current egalitarian method of assessing students. It is called democratic as it is based on performance, not status, race, gender or religion. However, it has been criticised as inflexible and as focusing on a limited range of the multiple intelligences, such as mathematical and verbal-linguistic abilities.

The current view of effective schooling (Levin 1993) is that it must cover the following requirements:

1. Basic inputs of curriculum

2. Instructional material

3. Time for learning

4. Teaching practices

5. Community involvement

6. Principal leadership

7. Commitment

8. Accountability

Global technological change is now creating, according to Schlechty and Cole (Mau in Zajda, p. 161), an information society based on knowledge: one that uses ideas, concepts, symbols and abstractions to solve problems and produce products, and that is 'future orientated', based on problem solving and reasoning using higher-order thinking. Assessment of students therefore needs to cater for this climate, and older rote-learning methods are no longer valid.

There is now a strong emphasis on ensuring accountability of learning to meet these new goals using empirical tools. Standards are statements of expectations, or criteria for excellence, used to assure quality, indicate goals and promote change (National Council of Teachers of Mathematics; Romberg 1993 in Zajda, p. 161). A benchmark is a measure of student learning outcomes at a certain time relative to program objectives. It allows measurement of growth in learning outcomes and student ability. It is not a test; it gives insight into student performance and achievement.

Mau refers to the importance of valued achievement targets establishing specific objectives, which leads to benchmarks or standards being set (Zajda, p. 162). The benchmark is then assessed against specifications set by the stakeholders in education. Technical, professional, managerial, service and vocational jobs are then filled as the end result of this pragmatic education system, which to my mind looks like an attempt to create the alphas and betas of Aldous Huxley's Brave New World, or Orwell's 1984.

Norm-referenced testing is best, Mau seems to imply, stating that people, i.e. humanists, have been unfair to it (Zajda, p. 162). She says it in fact gives clear achievement targets and an equal chance for all to meet the criteria, and that if the test is clearly explained and prepared for by the teacher, there is no discrimination (cultural deficit) against poorer-performing students. She believes it can contain higher-order thinking, and that it suits policy makers' production-style benchmarking to evaluate success in education.

Instruction should be linked to achievement targets using three phases of instruction according to Mau (Zajda p.163).

1. Pre-instructional phase, where the teacher selects content and achievement targets.

2. Interactive phase, where the teacher presents, questions and provides practice of content and skills. Formative assessment is used here as a trial run during the class; later analysis of the target allows revision of methods.

3. Post-instructional phase, where the teacher checks for understanding and provides feedback. Summative assessment is then used to see whether the achievement target has been met, using benchmarks. An objective measure of learning outcomes is thus provided.

Aiming at the target is vital: all phases therefore require clear achievement targets and a systematic process of assessment that emphasises key concepts and assesses higher-order thinking. This process allows the teacher to modify teaching so that students understand.

Mau comes up with the curious 'Table of Specifications', which sounds not unlike something out of an automobile factory (Zajda, p. 164). The table requires the teacher to write the achievement targets (outcomes) clearly down the side, in rows. The criteria to hit, the question types and kinds of thinking levels, are explained to students and run across the top, in columns. The test items come from the concepts taught and shown on the table, so instruction is linked to achievement targets assessed according to the specifications set. Test results are analysed using the criteria, and objectives not met can then be retaught or reviewed. Quality is thus improved.

Example Table

Table of specifications on the history of WWI, Year 9:

| Content                       | Knows concepts | Comprehends concepts | Applies concepts | Total |
|-------------------------------|----------------|----------------------|------------------|-------|
| Causes of WWI                 |                |                      |                  |       |
| Main events in the war        |                |                      |                  |       |
| Conditions on the battlefield |                |                      |                  |       |
| Social concerns during war    |                |                      |                  |       |
| Results of WWI                |                |                      |                  |       |

Item analysis goes hand in hand with a specification table. It is a diagnostic assessment tool that evaluates:

1. Difficulty level: the percentage of students who get the item right.

2. Discrimination power of each item: whether the item distinguishes high from low achievement levels in students.

Item analysis example (in each cell, 1 = difficulty level, 2 = discrimination power):

| Content                       | Knows concepts | Comprehends concepts | Applies concepts | Total |
|-------------------------------|----------------|----------------------|------------------|-------|
| Causes of WWI                 | 1. 90%  2. 2   | 1. 70%  2. 6         | 1. 50%  2. 8     |       |
| Main events in the war        |                |                      |                  |       |
| Conditions on the battlefield |                |                      |                  |       |
| Social concerns during war    |                |                      |                  |       |
| Results of WWI                |                |                      |                  |       |
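The two item-analysis statistics can be sketched as a short calculation. The responses below are hypothetical (1 = correct, 0 = wrong; rows are students, columns are test items). Difficulty is the fraction of students answering correctly; for discrimination power, one common measure (the upper-lower index, assumed here since Mau does not give a formula) compares the success rate of the top-scoring group of students with that of the bottom-scoring group.

```python
# Hypothetical item analysis sketch: difficulty level and an
# upper-lower discrimination index for each item on a test.
responses = [          # 1 = correct, 0 = wrong; rows are students
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

def item_analysis(rows, group_frac=1/3):
    n_items = len(rows[0])
    ranked = sorted(rows, key=sum, reverse=True)   # best students first
    g = max(1, round(len(rows) * group_frac))      # size of top/bottom group
    top, bottom = ranked[:g], ranked[-g:]
    results = []
    for i in range(n_items):
        # difficulty: fraction of all students who got item i right
        difficulty = sum(r[i] for r in rows) / len(rows)
        # discrimination: top-group success rate minus bottom-group rate
        discrimination = (sum(r[i] for r in top) - sum(r[i] for r in bottom)) / g
        results.append((difficulty, discrimination))
    return results

for i, (p, d) in enumerate(item_analysis(responses), start=1):
    print(f"item {i}: difficulty {p:.2f}, discrimination {d:.2f}")
```

An item with high difficulty (nearly everyone correct) and low discrimination tells the teacher little; an item that the top group passes and the bottom group fails is doing useful diagnostic work.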



Item analysis shows what the class has learned well and which items need improvement or reconsideration for inclusion in the curriculum. Items that prove a problem can then be retaught in a different way. This approach encourages students to reflect on their own thinking processes and to appraise themselves, by seeing the criteria they are assessed on before and after the assessment process. The student is thus given clear direction as to where they are and are not achieving on a given item. This enhances self-esteem, as the student can more clearly understand why they were assessed and what they achieved (Zajda, p. 166).

In summary, Achievement Targets are of value in assessment because they:

1. Make lesson planning clear

2. Link assessment to the content area

3. Give clearly defined criteria for listed items

4. Encourage higher-order thinking and its assessment

5. Allow precise evaluation of the targets assessed, using the Table of Specifications and item analysis

6. Empower students to interact with and reflect on learning goals

As I have pointed out, this article by Mau seems an apology for an empiricist approach to assessment, one that supports norm-referenced testing and more conservative approaches. The danger is that education becomes too much like a production line, where specification tables, item analysis, quality control, consumer satisfaction and stakeholders become glib clichés for superficial teaching styles and methods of assessment. In principle the process she suggests is good, and is probably already in use by most competent teachers, whether or not they are aware of the terminology. But whether a teacher would have the time to do item analysis, by discrimination power and difficulty level, for each item on every assessment, assuming one did produce a table, is highly questionable given the shortage of time and pressure a full-time teacher is under. It may be just another academic exercise. Further research on teacher implementation would be beneficial.

Mau, R. (2003) Assessment in the Classroom, Ch.11 in Zajda, J. (ed.) Learning and Teaching. Melbourne: James Nicholas Pub.

McInerney, D. and McInerney, V. (2002) Educational Psychology: Constructing Learning (3rd ed.) Sydney: Prentice Hall Australia.