Direct vs Indirect Assessment
What is Assessment?
At the program level, assessment is information or a collection of information centered on student learning. Assessment looks at how much knowledge students attain from their coursework over a program. The results of assessment help in making decisions to improve student learning within that program. Two basic forms of assessment are direct assessment and indirect assessment. These two types of assessment are not mutually exclusive and often work hand-in-hand to give program faculty the best picture of student learning within their programs.
Direct assessment of student learning is tangible, visible, and compelling evidence of exactly what students have and have not learned.
Direct assessment comes in a variety of forms:
- Case studies
- Clinical evaluations
- Oral presentations
Direct assessment asks students to show what they know. Students must be able to execute the skills they learn from the classroom and co-curricular activities.
Indirect assessment consists of proxy signs that students are probably learning. Indirect evidence is less clear and convincing and comes in a variety of forms:
- Overall GPA
- Student retention rates
- Graduation rates
- Job placement
(Palomba & Banta, 2013)
Distinguishing between Direct and Indirect Assessment
Any of the examples mentioned above, either under direct assessment or indirect assessment, could be either type of assessment. To determine whether an assessment methodology is direct or indirect, there are two important considerations:
- Who decides what was learned and/or how well it was learned?
- In direct assessment, a professional makes a decision regarding what was learned and how well it was learned. (E.g., a faculty member evaluating an essay)
- In indirect assessment, the student decides what was learned and how well it was learned. (E.g., surveys and teaching evaluations)
- Does the assessment measure the learning or is it a proxy for learning?
- Direct evidence: Students have completed some work or product that demonstrates they have achieved the learning outcome (E.g., projects, papers, exams, etc.).
- Indirect evidence: A proxy measure was used, such as participation in a learning activity, students’ opinions about what they learned, student satisfaction, etc. (E.g., number of students who visited an office or office hours, course grades*)
*How are course grades indirect assessment?
Course grades are based on many iterations of direct measurement. However, they are an indirect measurement of any one learning outcome because:
- They represent a combination of course learning outcomes; performance on these outcomes is averaged out in a final grade.
- They frequently include corrections not related to learning outcomes, such as extra credit or penalties for unexcused absences.
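The averaging effect described above can be illustrated with a minimal sketch. The outcome names, the scores, and the equal-weight grading rule are hypothetical, chosen only for illustration; actual grading schemes vary by course.

```python
# Hypothetical per-outcome scores (0-100) for two students in one course.
# Outcome names and the equal-weight grading rule are illustrative assumptions.
scores = {
    "Student A": {"written_communication": 95, "quantitative_reasoning": 65},
    "Student B": {"written_communication": 65, "quantitative_reasoning": 95},
}


def final_grade(outcome_scores, extra_credit=0.0):
    """Average all outcome scores, then add adjustments unrelated to learning."""
    average = sum(outcome_scores.values()) / len(outcome_scores)
    return average + extra_credit


grades = {name: final_grade(s) for name, s in scores.items()}
print(grades)  # {'Student A': 80.0, 'Student B': 80.0}

# Extra credit changes the grade without any change in demonstrated learning.
print(final_grade(scores["Student A"], extra_credit=5.0))  # 85.0
```

Both students receive the same final grade even though their performance on each outcome is reversed, so the grade alone cannot show which outcome either student achieved.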
Challenges in Conducting Direct Assessment
Challenge: Limited or inconsistent exposure to students (especially in co-curricular units)
- Captive audience. Have students complete assessment work while they are with you, either in the classroom or during co-curricular activities or contact hours.
Challenge: Limited faculty/staff time
- Make assessment everyone’s responsibility in your program. Bring all faculty/instructional staff into the assessment process, so faculty and instructional staff participate from the beginning (determining what questions you have about student learning) to the end (dissemination of report; discussion of results; recommendations and actions to continuously improve student learning) of the assessment process. Not only does this meet the HLC expectation [link to HLC criteria for assessment content] that assessment involve the substantial participation of faculty and instructional staff, it also distributes the workload by making assessment a natural part of the work you do as a faculty group. It also formalizes the informal work faculty and instructional staff constantly engage in when they discuss their programs – what is working or not working, what students are learning well and where students struggle within their programs, etc.
- Captive audience. Incorporate assessment planning, activities, discussion, etc. into existing departmental meetings and/or planned events, such as faculty/staff retreats.
- Team up. Especially in smaller programs and departments, it can be helpful to seek other smaller units that would like to explore similar questions about student learning. It may be useful for two curricular or co-curricular units to work together or even for a curricular and co-curricular unit to team up to explore some aspect of student learning. Not only does this split the workload between more faculty/instructional staff, but it may also give you a richer picture of not just what students are learning in the classroom, but how well they are transferring that knowledge to other disciplines and/or applying knowledge they gain in your program outside of the classroom.
- Plan ahead. A little time spent planning at the beginning of an academic year can save a lot of time later on in actually conducting your assessment project(s).
Challenge: Student motivation to fully participate and engage in the assessment. How do we know assessment results are valid if students are not putting forth their best effort?
- Bring students into the process.
- Explain what you are doing and why you are doing it
- Inform students of what will be done with the results of the assessment
- Offer to make results available to students
- Bring students into the discussion of the assessment results, including its implications and how they might be used
- Think about giving students an opportunity to provide feedback on the assessment itself
- Make it "count." Even if it only counts a little bit, it can make a big difference in students’ motivation to participate in assessment projects.
The Office of Teaching & Learning Assessment (Lewis 1300) offers free books, as well as books you can check out from the office library, with more information on direct and indirect assessment.
Free books offered by TLA
Assessment Clear and Simple: A Practical Guide for Institutions, Departments, and General Education, 2nd Ed. by Barbara E. Walvoord.
Assessment in Practice: Putting Principles to Work on College Campuses by Trudy Banta, Jon P. Lund, Karen E. Black, & Frances W. Oblander.
Books for checkout from TLA library
Assessing General Education Programs by Mary J. Allen.
Assessing Academic Programs in Higher Education by Mary J. Allen.
Learner-centered Assessment on College Campuses: Shifting the Focus from Teaching to Learning by Mary E. Huba & Jann E. Freed.
Outcomes Assessment in Higher Education: Views and Perspectives edited by David Allen.
Assessment Essentials: Planning, Implementing, and Improving Assessment in Higher Education by Catherine A. Palomba & Trudy W. Banta.
Online articles and book chapters
Actions Matter: The Case For Indirect Measures in Assessing Higher Education’s Progress On The National Education Goals, by Peter T. Ewell & Dennis P. Jones. Available here: http://www.jstor.org/stable/27797182?origin=JSTOR-pdf
Direct and Indirect Writing Assessment: Examining Issues of Equity and Utility by Ronald H. Heck & Marian Crislip. Available here: http://epa.sagepub.com/content/23/3/275.full.pdf+html
Additional Assessment Related Books & Articles
The Philosophy and Practice of Assessment and Evaluation in Higher Education by Alexander W. Astin & Anthony Lising Antonio.
Higher Education: Student Learning, Teaching, Programmes and Institutions by John Heywood.
As the name suggests, an assessment prompt is a “prompt” to provide some type of information. A prompt defines a task; it is a statement or question that tells students what they should do (E.g., in a survey item, essay question, or performance). An effective assessment prompt should “prompt” students to demonstrate the learning outcome that is being assessed.
Suskie, L. (2009). Assessing student learning: A common sense guide. Jossey-Bass: San Francisco, CA.
Two Main Categories of Assessment Prompts
Restricted response prompts limit the way in which students may present information. Multiple choice items are an example of a restricted response prompt.
- Advantage: Restricted response prompts tend to be easier to evaluate.
- Disadvantage: These types of prompts can restrict students’ abilities to provide diverse, individualized responses that provide richer information about the learning outcome of interest.
Extended response prompts give students latitude in deciding how to respond or provide information. Students may have flexibility in the format, length, and/or construction of their responses.
- Advantage: These prompts may provide richer information for assessment of the learning outcome of interest by allowing for diversity in students’ responses.
- Disadvantage: Extended response prompts require more time – both to determine a format for evaluating responses and for the actual evaluation process itself.
Suskie, L. (2009). Assessing student learning: A common sense guide. Jossey-Bass: San Francisco, CA.
Laying the Foundation for Writing Good Prompts
- Decide what you want students to learn from the experience.
- Determine how the learning aligns with your learning outcomes.
- Develop a meaningful task or problem related to the identified learning outcome.
- Determine the methods you will use to measure (scoring guide, rubric, reflection, etc.) students’ learning.
Examples of Prompts
“You are there” Scenarios
These prompts ask students to put themselves into a situation and respond. “You are there” prompts are a good way for students to demonstrate their ability to integrate and apply knowledge they gained from a particular program.
Below are a few potential “you are there” prompts, but the possibilities are endless. An advantage of this type of prompt is the ability to customize the prompt to the types of situations students are likely to face in your particular field.
- You are on the subway and overhear a conversation about...
- You are a corporate trainer leading a diversity workshop and...
- You are a consultant working with a community organization when...
- You are a business executive leading a high stakes meeting...
Surveys will generally be used as a secondary assessment methodology since the primary assessment methodology should be a direct measure of student learning. Please note that surveys may be a direct measure of student learning if they ask students to demonstrate what they have learned, rather than asking for students’ opinions about what they have learned, the program, etc.
Best Practices in Survey Design
- Limit the number of surveys your students receive
- Survey Fatigue – You are not the only one asking your students to complete surveys (E.g., institutional surveys, other programs, restaurants, grocery and retail stores, etc.)
- Have realistic expectations for survey response rates.
- What can increase response rates?
- What response rate do I need for assessment purposes? (I.e., assessment is not research, and requirements differ depending on whether or not you are using inferential statistics.)
- Include clear instructions with the survey instrument that clarify the purpose of the survey and provide respondents with expected procedures for responding to the survey instrument.
- Example: “This survey is designed to determine your feelings about several current human rights issues. Please read each question carefully. For each question, please consider each response option and choose the one option that best matches your feelings about the issue raised in that question.”
- Keep survey instruments as short as possible: The higher the time commitment to complete, the less likely students are to complete it.
- Make sure all portions of the survey are immediately and clearly visible.
- Use easy to read font size and type.
- Use high contrast background and font colors, such as black and white.
- In web-based surveys, use radio buttons instead of drop-down boxes to display response options.
- In web-based surveys, do not “hide” definitions respondents may need to interpret and respond to survey items by requiring respondents to click on a link or hover over an area to view definitions.
- Group questions about similar concepts/topics together.
- Questions should be clear and concise.
- Make sure questions are succinct, only providing as much information as is necessary for respondents to properly interpret what is being requested of them.
- Use language and concepts that are clear and familiar to survey respondents.
- Avoid the use of jargon, acronyms, and/or overly technical language in writing questions.
- Example: The question, “How clear were the ppt presentations about the opportunities for experiential learning for students in LAS and CSH?” contains multiple acronyms. Even if you think respondents should be familiar with the acronyms or jargon you are using, it is clearer to spell out acronyms and avoid jargon when possible. In this case, the question would be more clearly written as “How clear were the PowerPoint presentations about the opportunities for experiential learning for students in the College of Liberal Arts and Social Sciences and the College of Science and Health?”
- Avoid the use of double negatives.
- Example: The question “Do you believe ex-convicts should not be allowed to have gun licenses?” creates a double negative: in order to agree with the concept that ex-convicts should be allowed gun licenses, the respondent needs to disagree with the original statement.
- Avoid complex sentence structures
- Ask only one question at a time; avoid double-barreled questions.
- Example: In the question, “How satisfied are you with the depth and breadth of content covered in this class?” respondents must indicate their satisfaction with two different things ‘depth’ and ‘breadth.’ In this case, they may have different levels of satisfaction with depth than they do with breadth, making it difficult to provide one response to this question.
- Ask students about their primary and current experiences and knowledge.
- Avoid questions that request second-hand knowledge.
- Example: “How happy was your cohort with the experience of taking all core courses as a group?” In this question the respondent only has access to their own happiness with the cohort experience. In responding to this question they may indicate their own happiness, assuming everyone felt the same way, or have to guess how happy the rest of the students seemed with the experience.
- Avoid retrospective questions
- Example: “How comfortable did you feel with this content before you started this course? How comfortable do you feel with it now?” Students may not reliably have access to their comfort level before they started. A pre- post- design may be better if this is the information you are trying to access.
- Ensure questions don’t bias students to provide a certain response.
- Avoid leading questions.
- Example: “Considering the horrible human rights atrocities in countries such as Russia and North Korea, how do you feel about Communism as a form of government?” Clearly, the person who wrote this question is not a proponent of Communism and the question leads respondents to provide a similar viewpoint. People will be sensitive to social desirability cues from the leading nature of this question and will be unlikely to indicate support for Communism.
- Avoid questions that make assumptions.
- Example: “To what extent do you agree that a change in state laws would be the most effective way to support gay rights in the United States?” This question assumes the respondent is a proponent of gay rights. It will not be clear when someone disagrees with this question whether they are disagreeing with the idea that the country should support gay rights or if they are disagreeing with the proposition that changes in state laws are the most effective way to support gay rights in the country.
Writing Response Options
- Only include a neutral response option if you reasonably expect students to have no opinion about a question, one way or the other.
- Example: In the question “How satisfied are you with the variety of electives available to you in this program?” you could reasonably expect that anyone in the program should have an opinion about the variety of electives available – even if it is only slightly positive or slightly negative. In this case, it is better not to include a neutral option since a neutral or middle category may indicate a variety of things other than true neutrality (e.g., someone who is unwilling to respond to the question, someone who is ambivalent, someone who does not understand the question, someone who feels there is no response option that describes their opinions or attitudes, etc.).
- Use the smallest number of response options necessary to provide the full range of expected responses to your question.
- For scaled response options, four response options are generally adequate if you are not including a neutral response option, and five response options are adequate if you will include a neutral response option.
What is Triangulation?
Triangulation is defined as “multiple lines of evidence that lead to the same conclusion.” Experts recommend that a student’s learning be measured in several different ways. When we triangulate, or use different types of measures, we can draw more accurate conclusions about student learning (Allen, 2004, p. 172). Triangulation means more than using different types of assessment measures; it also means assessing students at multiple time intervals, such as the end of a chapter or unit, the end of the quarter, or the end of the semester. Through multiple assessment measures and time intervals, patterns and inconsistencies in student learning can be found and addressed (Landrigan, C. & Mulligan, T.).
Multiple methods used to triangulate include:
- Standardized exams
- Questionnaires for students
- Reports & essays
- Capstone courses
- Student Conferences
(Banta et al., 1996, 101-104) (Landrigan, C. & Mulligan, T.)
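As a rough illustration of looking for patterns and inconsistencies across measures, the sketch below checks whether several measures point to the same conclusion for each student. The student identifiers, the measure names, and the assumption that each measure has already been reduced to "met" or "did not meet" the outcome are hypothetical simplifications, not a prescribed method.

```python
# Hypothetical results: each measure has already been converted to
# True (met the outcome) or False (did not meet it). Names are illustrative.
results = {
    "student_1": {"standardized_exam": True, "essay": True, "capstone": True},
    "student_2": {"standardized_exam": True, "essay": False, "capstone": True},
    "student_3": {"standardized_exam": False, "essay": False, "capstone": False},
}


def triangulate(measures):
    """Return 'met' or 'not met' when all measures agree, otherwise 'mixed'."""
    outcomes = set(measures.values())
    if outcomes == {True}:
        return "met"
    if outcomes == {False}:
        return "not met"
    return "mixed"


summary = {student: triangulate(m) for student, m in results.items()}
for student, conclusion in summary.items():
    print(student, conclusion)
```

A "mixed" result does not mean the assessment failed; it flags a student, or a measure, worth a closer look before drawing conclusions.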
Allen, M.J. (2004). Assessing academic programs in higher education. Bolton, Massachusetts: Anker Publishing Company, Inc.
Banta, T.W., Lund, J.P., Black, K.E., & Oblander, F.W. (1996). Assessment in practice: Putting principles to work on college campuses. San Francisco: Jossey-Bass Publishers.
Landrigan, C. & Mulligan, T. Triangulating: The importance of multiple data points when assessing students. Retrieved from http://www.choiceliteracy.com/articles-detail-view.php?id=525
Reliability & Validity
Reliability and Validity in Assessment
Reliability and validity are important concepts in any form of inquiry, including assessment. However, it is also important to note that there are usually not the same demands for reliability and validity in assessment as there may be in research. While both are important considerations in the assessment of student learning outcomes, it’s also important to not become paralyzed in your perceived ability to draw conclusions from your assessment projects because you are unsure of the reliability or validity of your measurement. Instead, it may be more useful to allow these considerations to temper your conclusions.
As the name suggests, reliability refers to how consistent (or reliable) the results you achieve by assessing student learning are. How much a student has learned over the course of a program should be relatively stable and not change drastically in a short period of time or depending on who assesses the student.
Implications for assessment
Commonly, the issue of reliability comes up when multiple people are rating a piece of student work to determine how much a student has learned. Faculty and staff looking at an identical piece of student work should be arriving at identical (or very similar) conclusions. A few strategies to help improve the reliability of your assessment based on this issue include:
- Developing a better rubric or scoring guide. Frequently, reliability issues stem from poorly developed scoring guides or rubrics. Confusing scoring guides and rubrics are difficult for faculty and staff to use in a consistent manner. You can visit the Teaching Commons website for information to help you improve your scoring guides and rubrics.
- Conducting a norming session to help faculty and staff use rubrics and scoring guides in a more consistent manner.
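One way to check whether raters are reaching similar conclusions, for example before and after a norming session, is to compute simple agreement statistics. The sketch below is only an illustration: the rubric scores are made up, and the choice of exact agreement, adjacent agreement (within one rubric point), and Cohen's kappa is an assumption about what a program might find useful, not a requirement.

```python
from collections import Counter

# Hypothetical rubric scores (1-4) two raters gave the same ten papers.
rater_a = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]
rater_b = [4, 3, 2, 2, 4, 1, 3, 3, 3, 2]


def exact_agreement(a, b):
    """Proportion of papers on which both raters gave the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def adjacent_agreement(a, b):
    """Proportion of papers on which the scores differ by at most one point."""
    return sum(abs(x - y) <= 1 for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Exact agreement corrected for the agreement expected by chance.

    Undefined (division by zero) if chance agreement is exactly 1.
    """
    n = len(a)
    observed = exact_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


print(exact_agreement(rater_a, rater_b))         # 0.7
print(adjacent_agreement(rater_a, rater_b))      # 1.0
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.58
```

In this made-up data the raters never differ by more than one rubric point, but they match exactly on only seven of ten papers, which may suggest that the descriptions of neighboring rubric levels need clarification.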
Another common issue with reliability arises when using assignments from multiple courses or course sections to assess program learning outcomes. In this situation, reliability issues may occur because assignments from different courses or course sections may not be asking students to demonstrate the same knowledge, skills, or abilities. The best strategy to address this sort of reliability issue is to work on better communication among faculty and staff about a variety of issues, including:
- Defining learning outcomes
- Determining how students will demonstrate their acquisition of the relevant knowledge, skills, abilities, values, etc.
- Guidelines for defining assignments that will measure students’ achievement of each learning outcome
- Determining how the achievement of the learning outcome will be measured
Validity is concerned with the accuracy of the conclusions you draw based on the use of a measurement instrument. Validity is a concept that refers primarily to the conclusions you draw from conducting an assessment project. A common misconception is that validity refers to the instrument being used to measure a learning outcome. However, validity is not inherent to a measurement tool or instrument. In fact, any instrument that has high validity for one purpose will almost surely not be valid for another. For example, a driver’s test may have high validity for making conclusions about a person’s ability to safely drive a car, but would be ridiculous for making determinations about a student’s ability to communicate effectively in writing. Also, validity may be specific to the population that was studied and one should not assume because an instrument was valid for one population it will necessarily be valid for a different population. For example, an instrument that you used to measure undergraduate students’ achievement of a learning outcome may not be valid for measuring a similar learning outcome for graduate students.
There are different aspects of validity and while these were formerly assessed separately, Samuel Messick (1989) suggests that all aspects fall under construct validity in what is commonly referred to as the unified theory of validity. Messick advocates for making an argument for the validity of a measurement tool based on these different aspects of construct validity. For an inference or conclusion to be ‘valid,’ all aspects of validity should be considered. In other words, a single aspect of validity should not be considered sufficient for drawing conclusions about the validity of conclusions or inferences being drawn.
At its simplest level, construct validity simply means that the instrument being used to measure a particular construct is fully measuring that construct and only that construct.
- Representativeness: Does the instrument measure all relevant aspects of a construct or does it leave some out? For example, an instrument measuring students’ communication skills based solely on writing does not represent the oral communication aspect of ‘communication skills.’
- Construct Irrelevance: Does the instrument measure anything unrelated to the construct? For example, an instrument being used to measure students’ knowledge of different mineral types should measure only that knowledge and not include other constructs, such as students’ reading abilities (if the test uses overly complex vocabulary) or test savviness (perhaps students’ ability to guess the correct answer based on the structure of the test if, for example, all of the above or none of the above is always the correct answer).
- Predictive Ability: How well does the measurement instrument predict future performance or outcomes? For example, does the SAT predict students’ future performance in college?
- Consistency: What is the degree of agreement between results on the measurement instrument and other measures of the same construct?
- Consequential Validity: What are the intended and unintended consequences of the use and interpretation of a measurement instrument? Messick argues for the importance of considering the social and value implications of the use and interpretation of the results of a measurement instrument when considering its validity.
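For the consistency aspect above, one common way to express the degree of agreement between two measures of the same construct is a correlation coefficient. The sketch below computes a Pearson correlation by hand. The scores are hypothetical, and using Pearson's r here is an assumption for illustration rather than something Messick's framework prescribes.

```python
from math import sqrt

# Hypothetical scores for six students on two measures of the same construct.
rubric_scores = [2, 3, 3, 4, 1, 4]       # faculty rubric ratings (1-4)
exam_scores = [61, 72, 75, 90, 50, 85]   # exam percentages


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


r = pearson_r(rubric_scores, exam_scores)
print(round(r, 2))
```

A value near 1 suggests the two measures rank students similarly. With a sample this small, it should temper your conclusions rather than settle them, in keeping with the advice above about reliability and validity in assessment.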