Observed test scores - a summary of chapter 4 of A conceptual introduction to psychometrics by G. J. Mellenbergh



A conceptual introduction to psychometrics by G. J. Mellenbergh - a summary

This is a summary of the book A conceptual introduction to psychometrics by G. J. Mellenbergh. The summary contains chapters 1 to 6 and focuses on developing psychological tests. The first chapter of this summary is free, but to support WorldSupporter and JoHo, ...


Introduction - a summary of chapter 1 of A conceptual introduction to psychometrics by G. J. Mellenbergh

A conceptual introduction to psychometrics
Chapter 1
Introduction

Test definitions
Test types

Test definitions

Psychometric terminology sometimes differs depending on the types of test applications.

A psychological or educational test: an instrument for the measurement of a person’s maximum or typical performance under standardized conditions, where the performance is assumed to reflect one or more latent attributes.

A test is defined to be a measurement instrument. It is for measurement in the first place.
A test is defined to measure performance. Two types of performance:
- Maximum performance tests ask the person to do his or her best to solve one or more problems. The answers to these problems can vary in correctness.
- Typical performance tests ask the person to respond to one or more tasks where the responses are typical for the person. The person’s responses cannot be evaluated on correctness, but they typify the person.
Performance is measured under standardized conditions.
Test performance must reflect one or more latent attributes. The test performance is observable, but the latent attributes cannot be observed.

Tests are distinguished from surveys. It is not assumed that survey questions reflect a latent attribute.

Subtest: an independent part of a test.
A (sub)test consists of one or more items.
Item: the smallest possible subtest of a test. The building blocks of a test.
A test consists of n items, and is called an n-item test.

One or more latent attributes affect test performance.
The number of latent attributes is the dimensionality of the test.
Dimensionality: the number of latent attributes (variables) that affect test performance.

Unidimensional test: a test that predominantly measures one latent attribute.
Multidimensional test: a test that measures more than one latent attribute.
Two-dimensional test: a test that measures two latent attributes. And so on…

Test types

Psychological and educational measurement instruments are divided into:

Mental test: consists of cognitive tasks
Physical test: consists of instruments to make somatic or physiological measurements

Maximum performance tests

A performance can be considered maximum in two different respects: when it is accurate and when it is fast.

Classified according to time:

Pure power test: consists of problems that the test taker tries to solve. The test taker has ample time to work on each of the test items, even on the most difficult ones.
Emphasis on measuring the accuracy with which the problems are solved.
Time-limited power test: the test is constructed so that the majority of test takers have enough time to solve the problems, and only a small minority needs more time.
Speed test: measures the speed taken to solve problems. Usually, the test consists


Developing maximum performance tests - a summary of chapter 2 of A conceptual introduction to psychometrics by G. J. Mellenbergh

A conceptual introduction to psychometrics
Chapter 2
Developing maximum performance tests

Construct of interest
Measurement mode
The objectives
The population
The conceptual framework
Item response mode
Administration mode
Item-writing guidelines
Item rating guidelines
Pilot studies on item quality
Compiling the first draft of the test

Seven elements

Construct
Measurement mode
Objectives
Population and subpopulations
Conceptual framework
Response mode
Administration mode

Construct of interest

The test developer must specify the latent variable of interest that has to be measured by the test.
Latent variable is a general term. The term construct is used when a substantive interpretation is given of the latent variable.
The latent variable (construct) is assumed to affect test takers’ item responses and test scores.

Constructs can vary in many different ways.

Vary in content of mental abilities, psychomotor skills or physical abilities
Construct may vary in scope
For example: from general intelligence to multiplication skill
Constructs vary from educational to psychological variables.

A good way to start a test development project is to define the construct that has to be measured by the test.
This definition describes the construct of interest, and distinguishes it from other, related constructs.
Usually, the literature on the construct needs to be studied before the definition can be given. Frequently, the definition can only be given once other elements of the test development plan are specified.

Measurement mode

Different modes can be used to measure constructs.

Self-performance mode
The test taker is asked to perform a mental or physical task
Self-evaluation mode
The test taker is asked to evaluate his or her ability to perform the task
Other-evaluation mode
Ask others to evaluate a person’s ability to perform a task

The objectives

The test developer must specify the objectives of the test. Tests are used for many different purposes.

Scientific vs practical
Individual level vs group level
Description (describe performances) vs diagnosis (adds a conclusion to a description) vs decision-making (decisions are based on tests)

The population

Target population: the set of persons to whom the test has to be applied.
The test developer must define the target population, and must provide criteria for the inclusion and exclusion of persons.
A target population can be split into distinct subpopulations. The test developer must specify whether subpopulations need to be distinguished. And, if so, they need to define the subpopulations, and to provide criteria


Typical performance tests - a summary of chapter 3 of A conceptual introduction to psychometrics by G. J. Mellenbergh

A conceptual introduction to psychometrics
Chapter 3
Typical performance tests

Construct of interest
Measurement mode
The objectives
Population
The conceptual framework
Item response mode
Administration mode
Item writing guidelines
Item rating guidelines
Pilot studies on item quality
Response tendencies
Compiling the first draft of the test

Typical performance tests assess behavior that is typical for the person.
These tests are used to measure attitudes, interests, values, opinions, and personality characteristics.

Construct of interest

The test developer has to specify the latent variable of interest that is assumed to affect test takers’ item responses and test scores.
The usual constructs of interest of typical performance tests are:

Attitudes
Interests
Values
Opinions
Personality characteristics

The responses to typical performance tests are not evaluated on their correctness, but are considered to typify a person.

At the start of a test development project, the researcher needs information on the construct of interest. This information can be obtained from different sources

A study of the literature on the construct and existing measurement instruments is nearly always needed at the start of a test development project
Different types of research can be done on the construct.

Focus group method
Uses small groups of persons who have experiential knowledge about the construct.
A focus group meets with the test developer to talk about their experiences with the construct.
Key informants method
Uses persons who have expert knowledge about the construct of interest. The test developer interviews these key informants about the construct.
Observation method

The test developer can use information from different sources to define the construct and, later in the test development process, he or she can use this information for item writing.

Measurement mode

Self-report mode
The test taker answers questions on a typical performance construct
Other-report mode
A person answers questions about another person’s construct
Somatic indicator mode
Uses somatic signs to measure constructs
Physical trace mode
Uses traces that are left behind to measure constructs

Each of these four modes can occur in two different varieties

Reactive measurement mode
When test takers can deliberately distort their construct value
Nonreactive measurement mode
When test takers cannot distort their construct value

The reactive/nonreactive distinction is only used for typical performance measurements, and not for maximum performance measurements.
A maximum performance test asks test takers to do the best they can to perform the task.

Each of the four response modes can occur in two versions

Self-report mode
Test takers are asked to respond to questions or stimuli to assess their


Observed test scores - a summary of chapter 4 of A conceptual introduction to psychometrics by G. J. Mellenbergh

A conceptual introduction to psychometrics
Chapter 4
Observed test scores

Item scoring by fiat
The sum score
The observed test score distribution

The aim of testing is to yield scores of test takers’ maximum or typical performance.
Two main types of test scores are distinguished

Observed test scores
Computed after the separate test items are scored.
Derived from the item scores by taking the unweighted or weighted sum of the item scores.
The latent variable is unobserved, and in general, the latent variable is not a simple sum of item scores.
Latent variable (construct) scores
To compute the latent variable score, a model is needed that specifies the relation between the latent variable and item responses.
The latent variable score is derived from the item responses under the assumption of a latent variable item response model.

Item scoring by fiat

Conventionally, items are scored by assigning ordinal numbers to the responses.
The scoring differs slightly between maximum and typical performance tests.

Maximum performance items are scored by assigning 0 to the lowest category, and consecutive rank numbers to subsequent categories.
Typical performance items are indicative or contra-indicative of the latent variable that is measured by the test, and the scoring of contra-indicative items has to be reversed with respect to the scoring of indicative items.
Dichotomous indicative typical performance items are scored by assigning 0 to the ‘no’ (don’t agree) category, and 1 to the ‘yes’ (agree) category.
Contra-indicative items, in contrast, are scored by assigning 0 to the ‘yes’ category, and 1 to the ‘no’ category.
The categories of ordinal-polytomous items are scored by assigning rank numbers to the categories.
Bounded-continuous items are scored in measurement units, such as centimeters.

Measurement by fiat: the item scores are assigned to a test taker’s responses without any theoretical justification.
For example, the scores 0 and 1 assigned to an incorrect and a correct answer, and the scores 1 to 5, are based on convention (by fiat) and not on psychometric theory.
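The scoring conventions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and starting the rank numbers of ordinal-polytomous items at 1 is an assumption (maximum performance items would start at 0).

```python
def score_dichotomous(response: str, indicative: bool = True) -> int:
    """Score a yes/no typical performance item by convention (by fiat)."""
    agrees = response.strip().lower() == "yes"
    # Indicative: no -> 0, yes -> 1. Contra-indicative: yes -> 0, no -> 1.
    return int(agrees) if indicative else int(not agrees)


def score_polytomous(category_rank: int, n_categories: int,
                     indicative: bool = True) -> int:
    """Score an ordinal-polytomous item with rank numbers 1..n_categories.

    Assumption: ranks start at 1. Contra-indicative items are reverse
    scored, so a high score always points to a high construct value.
    """
    if not 1 <= category_rank <= n_categories:
        raise ValueError("category_rank must lie between 1 and n_categories")
    if indicative:
        return category_rank
    return n_categories + 1 - category_rank
```

On a five-category item, for instance, the second category of a contra-indicative item would receive the score 4 under these assumptions.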

The sum score

The score of the j-th test taker on the k-th item is indicated by X_jk. The conventional test score of the j-th test taker on an n-item test is the unweighted sum of his (or her) item scores:

Us_j = X_j1 + X_j2 + … + X_jn

It may be argued that items differ in importance, and that they should be weighted differently.
The weighted sum score of the j-th test taker on an n-item test is:

Ws_j = w_1 X_j1 + w_2 X_j2 + … + w_n X_jn

w_1 is the weight assigned to the first item, and so on.
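As a small worked illustration of the two sum scores, the sketch below computes Us_j and Ws_j for one hypothetical test taker. The item scores and the weights are made up for the example and are not taken from the book.

```python
from typing import Sequence


def unweighted_sum(item_scores: Sequence[float]) -> float:
    """Us_j: the plain sum of one test taker's item scores."""
    return sum(item_scores)


def weighted_sum(item_scores: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Ws_j: each item score X_jk is multiplied by its weight w_k."""
    if len(item_scores) != len(weights):
        raise ValueError("need exactly one weight per item")
    return sum(w * x for w, x in zip(weights, item_scores))


# Hypothetical test taker j on a 4-item test, dichotomously scored.
x_j = [1, 0, 1, 1]
w = [1, 1, 2, 3]  # illustrative weights only

us_j = unweighted_sum(x_j)   # 1 + 0 + 1 + 1
ws_j = weighted_sum(x_j, w)  # 1*1 + 1*0 + 2*1 + 3*1
```

With all weights equal to 1, the weighted sum reduces to the unweighted sum.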
A problem with


Classical analysis of observed test scores - a summary of chapter 5 of A conceptual introduction to psychometrics by G. J. Mellenbergh

A conceptual introduction to psychometrics
Chapter 5
Classical analysis of observed test scores

Measurement precision of observed test scores
Information on a single observed score
Reliability of observed test scores in a population
Some properties of classical test theory
Parameter estimation

Measurement precision of observed test scores

Test scores are used in practical applications.

Measurement precision has two different aspects:

Information
Applies to the test score of a single person
The within-person aspect of measurement precision
Reliability
Applies to a population of persons.
The between-persons aspect of measurement precision

The concept of measurement precision applies to observed test scores as well as to latent variable scores.

Information on a single observed score

Functional thought experiment: fulfils a function within a theory.

True test score: the expected value of the observed test scores of the repeated test administrations in the thought experiment.
Test taker j’s true test score is the expected value of his (or her) independently distributed observed test scores from (hypothetical) repeated administrations of the test to the test taker.

The observed test score is a variable that varies across repeated test administrations.
The true score is constant.

Error of measurement: the difference between test taker j’s observed test score and his (or her) true score.
Test taker j’s error of measurement on an arbitrary measurement occasion is the difference between his (or her) observed test score and his (or her) true test score.
The expected value of the errors of measurement is 0.

The within-person error variance is an index for the precision of the measurement of a person’s true score.

Test taker j’s standard error of measurement: the square root of his (or her) within-person error variance.

Information: the reciprocal of a person’s within-person error variance.
A small amount of information means that test taker j’s observed test scores vary widely around j’s true score across repeated test administrations.
A large amount of information means that j’s observed test scores do not vary widely around j’s true score.
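The relation between within-person error variance, standard error of measurement and information can be made concrete with a short sketch. The repeated administrations exist only in the thought experiment, so the observed scores below are hypothetical, and the mean of five scores is merely a stand-in for the true score (an expected value).

```python
import math
import statistics


def standard_error_of_measurement(error_variance: float) -> float:
    """Square root of the within-person error variance."""
    return math.sqrt(error_variance)


def information(error_variance: float) -> float:
    """Reciprocal of the within-person error variance."""
    if error_variance <= 0:
        raise ValueError("error variance must be positive")
    return 1.0 / error_variance


# Hypothetical observed scores of one test taker over repeated administrations.
observed = [18.0, 22.0, 20.0, 19.0, 21.0]

# The mean only estimates the true score.
true_score_estimate = statistics.fmean(observed)
errors = [x - true_score_estimate for x in observed]

# Spread of the observed scores around that estimate, dividing by the
# number of administrations.
error_variance = statistics.pvariance(observed, mu=true_score_estimate)

sem = standard_error_of_measurement(error_variance)
info = information(error_variance)
```

Scores that scatter more widely would give a larger error variance, a larger standard error of measurement and therefore less information.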

Reliability of observed test scores in a population

Reliability: the differentiation of test scores of different test takers from a population.

Psychometrics uses two definitions of reliability

A theoretical definition
Operational definition.
Yields procedures to assess reliability.

Reliability concerns the differentiation between the true test scores of different test takers from a population.
The differentiation is good if test takers’ true scores can be precisely predicted from their observed test


Classical analysis of item scores - a summary of chapter 6 of A conceptual introduction to psychometrics by G. J. Mellenbergh

A conceptual introduction to psychometrics
Chapter 6
Classical analysis of item scores

Item score distributions
Classical item discrimination
Distractor analysis
The internal structure of the test

The conventional way of scoring items is by assigning ordinal numbers to the response categories.
Usually, these item scores are ordered with respect to the attribute that the item is assumed to measure. However, the assignment of these ordinal numbers lacks a theoretical justification.

Usually, the analysis of test scores is supplemented by an analysis of the item scores.

Item score distributions

The scores of a given item have a distribution in a population of N persons.

Location: the place on the scale where the item scores are centered
Dispersion: the scatter of the item scores
Shape: the form of the distributions

Classical item difficulty and attractiveness

The location of the item score distribution is used to define the classical item difficulty (maximum performance tests) and classical item attractiveness (typical performance tests) concepts.

Classical item difficulty: a parameter that indicates the location of the item score distribution in a population of persons.
Classical item attractiveness: a parameter that indicates the location of the item score distribution in a population of persons.

The two definitions are the same.

Classical item difficulty and attractiveness are defined in a population of persons.
Population-dependent and may differ between populations.

The mean is mainly used for this.
The mean of a dichotomously scored item is called the item p-value.

Item score variance and standard deviation

The most common parameters that are used in classical item score analysis are the variance and the standard deviation of the item scores.

Items that have a small item score variance, have little effect on the test score variance.

The variance of dichotomous item scores is a function of the item p-value.
For a given sample size, the variance has its maximum value at p=.5.
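For a dichotomously scored (0/1) item, the variance computed with the number of persons as divisor equals p(1 - p), which is why it peaks at p = .5. The sketch below checks this on hypothetical item scores; the function names are illustrative only.

```python
import statistics
from typing import Sequence


def item_p_value(scores: Sequence[int]) -> float:
    """Mean of a dichotomously scored (0/1) item."""
    return sum(scores) / len(scores)


def dichotomous_variance(p: float) -> float:
    """Variance of 0/1 item scores as a function of the item p-value."""
    return p * (1.0 - p)


# Hypothetical scores of four persons on one dichotomous item.
scores = [1, 0, 1, 1]
p = item_p_value(scores)
variance_from_p = dichotomous_variance(p)
variance_direct = statistics.pvariance(scores)  # divides by N

# Evaluate the function on a grid of p-values to locate its maximum.
grid = [i / 10 for i in range(11)]
p_at_maximum = max(grid, key=dichotomous_variance)
```

Items with a p-value close to 0 or 1 therefore have little variance and contribute little to the test score variance.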

Classical item discrimination

Location and dispersion parameters yield useful information on the items of a test.
But, these parameters do not indicate the extent to which an item contributes to the aim of a test to assess individual differences in the attribute that is measured by the test.

Classical item discrimination: a parameter that indicates the extent to which the item differentiates between the true test scores of a population of persons.
Defined in a population of persons, may vary between different populations.

The item-test and item-rest correlations

An appropriate index for discrimination between the true scores would be the product moment correlation between the item score and the true score in the population of persons.
Test taker j’s observed
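Because true scores are unobserved, the observed test score has to stand in for them in practice. The sketch below assumes the usual definitions: the item-test correlation correlates the item score with the total test score, and the item-rest correlation correlates it with the total of the remaining items. The data matrix and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product moment correlation between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_test_and_rest(data: List[List[int]], k: int) -> Tuple[float, float]:
    """Item-test and item-rest correlations of item k (0-based index)."""
    item = [row[k] for row in data]
    test = [sum(row) for row in data]
    # Rest score: the test score with item k itself left out.
    rest = [t - i for t, i in zip(test, item)]
    return pearson(item, test), pearson(item, rest)


# Hypothetical scores of five persons (rows) on three dichotomous items.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]

r_item_test, r_item_rest = item_test_and_rest(data, 0)
```

The item-test correlation comes out higher here because the item is part of the test score it is correlated with; the rest score removes that overlap.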



Author: SanneA
