After the foundational work of identifying SLOs and aligning them with curricular and co-curricular learning opportunities is completed, the process of measuring how well students are achieving the desired SLOs begins. This step is often characterized by an overabundance of jargon and a lack of clarity, causing many of those involved to become intimidated and disengage from the process. However, the process does not need to be as complex as some authors make it out to be.
Before engaging in the assessment, it is essential to ensure that the process is not overly burdensome given available time and resources. Assessors should consider data that is either currently available, easily accessible, or can be gathered without too much effort. Often information that is already collected in the normal course of business may be repurposed for assessment purposes. For example, exams, writing assignments, research papers, performances, and group projects that are part of regular course assignments are excellent sources of assessment data. In some cases, data gathered from sources outside of the classroom may also be utilized (e.g., feedback from clinical placements, internships).
Choosing Appropriate Assessment Tools
Selecting an approach for assessing outcomes requires careful consideration; there are many methods for measuring student achievement or outcomes, and very often, more than one can be used simultaneously. However, regardless of the methods employed, it is important for assessors to choose an approach that provides the program with enough information with which to make reasonably informed judgments about how well students are performing. Results from different assessment activities can show a pattern of students’ performance, which provides confidence in the assessment results.
Below are several considerations to be mindful of when assessing outcomes:
Direct and Indirect Evidence
Direct evidence of student learning is tangible, visible, self-explanatory evidence of exactly what students have and have not learned (Suskie, 2009). Direct methods require students to use knowledge, skills, abilities, etc., to directly demonstrate their achievement of the learning outcome(s). Below are several frequently used tools:
Standardized Tests – Many educational companies publish tests designed to measure a variety of skills and competencies such as critical thinking, writing, or mathematical problem-solving skills. The advantage of using these tests is that they are nationally normed and have published validity and reliability metrics. The disadvantage is that the skills and competencies they measure may not align with a program’s learning goals and objectives. Examples include Compass, the GRE, and the Collegiate Learning Assessment (CLA).
Locally developed tests – Internally developed tests used for final exams, assignments, comprehensive exams, etc. can be designed to focus specifically on program and course learning outcomes. A locally developed test should be accompanied by a test blueprint to help ensure that the test focuses on the learning outcomes being assessed. The downside of these tests is that, in most cases, they do not provide comparison results with other colleges, and they need to be adopted across the institution (or within a program) if comparisons across units are desired.
Scores and pass rates on certification/licensure exams – These measures primarily apply to specialized accredited programs. Since the learning outcomes of the program and those of the accrediting agencies are intertwined, scores on these exams are considered direct measures of student learning.
Course Artifacts – Perhaps the most frequently used evidence of student learning comes directly from assignments provided to students in the regular course of classes. Essays, reports, presentations, performances, etc. are sometimes referred to as “authentic assessments” because they demonstrate that students can successfully transfer the knowledge and skills gained in the classroom to various contexts, scenarios, and situations beyond the classroom. These assessments should be accompanied by a rubric or a scoring mechanism to ensure that the results are valid and reliable.
Pre-test/Post-test Evaluation – These test results enable faculty to monitor student progression and learning throughout prescribed periods of time. The results are often useful for determining where skills and knowledge deficiencies exist and most frequently develop. However, pre- and post-tests are sometimes difficult to design and implement, and the evidence they produce is not always as clear as it may appear, because learning growth may be attributable to other experiences that are not directly addressed in a course.
Capstone experiences – A capstone is a cumulative course, assignment, or experience designed to tie together various elements of a program. Examples include research projects, theses, dissertations, etc. Capstone experiences are often used to assess a wide variety of knowledge and skills.
Portfolios – Collections of evidence and reflections documented over the course of a program or course. An electronic portfolio is referred to as an eportfolio. Information about the students’ skills, knowledge, development, quality of writing, and critical thinking can be acquired through a comprehensive collection of work samples over time.
Indirect evidence provides signs that students are “probably” learning, but evidence of exactly what they are learning is less clear and not as convincing (Suskie, 2009). Indirect evidence may be used to supplement direct evidence, but indirect evidence alone is not adequate proof that students are achieving desired outcomes. Examples include:
Course Grades – Grades are a good indication that students are achieving goals, but they are not sufficient evidence of student learning because by themselves, grades do not provide enough information about the learning that was tested or the criteria that were used. Problems associated with grading include: grade inflation, lack of consistent standards, vague criteria among courses and institutions, non-learning criteria used in the grading process (e.g., attendance), and student motivation that focuses too narrowly on grades. On the other hand, individual assignment grades may be considered direct evidence if the grade is based on a rubric or scoring guide.
Retention and Graduation Rates – Retention and graduation rates indicate that students are progressing through the curriculum, but they do not provide any indication of what students are learning or where they are excelling or falling short. Many students return to an institution without developing the skills needed to advance in careers or to be successful in four-year colleges.
Admission Rates into four-year and graduate schools – Admission rates are considered indirect because there is not a direct link between the achievement of learning outcomes and admission. Admission into a four-year institution (or graduate school) is based on a number of factors, some of which may not be directly related to the knowledge, skills, and abilities obtained in an undergraduate program. Moreover, because entrance standards vary greatly across institutions, it is hard to determine to what extent learning is a factor in admissions.
Student satisfaction/engagement survey results – Results from student surveys do not tell us what students have learned, but they may help programs better understand why students are or are not successful. A program can use the results to improve processes and services and create the best possible learning environment for students to succeed.
Focus Groups and Interviews – Like surveys, feedback provided directly from students is often valuable in determining the reasons why students may not be meeting the expectations faculty have set. Thematic analyses of this feedback are useful in identifying common obstacles that may be impeding student success.
Subjective vs. Objective Assessments
As the name implies, subjective assessments involve the judgement of the assessor. Examples include performances, written assignments, and some tests where the impression of the assessor determines the score or performance level. To minimize bias in subjective assessments, a rubric or similar scoring device should be employed to aid in evaluating student artifacts. Ideally, multiple raters should be involved in scoring subjective artifacts so that the reliability of the results can be checked. See the explanation of rubrics below for more information.
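When two raters score the same set of artifacts, their agreement can be checked with a simple calculation. The sketch below is illustrative only: the function names and the ratings are hypothetical, and it assumes each rater assigned one rubric level (1 to 4) to each artifact. It reports simple percent agreement and Cohen's kappa, a common statistic that adjusts agreement for chance.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of artifacts on which the two raters chose the same rubric level."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of artifacts.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    observed = percent_agreement(rater_a, rater_b)
    n = len(rater_a)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters pick the same level at random,
    # given how often each rater used each level.
    expected = sum(counts_a[level] * counts_b[level] for level in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical level for every artifact
    return (observed - expected) / (1 - expected)


# Hypothetical rubric levels assigned by two raters to the same eight papers.
rater_a = [3, 2, 4, 3, 1, 2, 3, 4]
rater_b = [3, 2, 3, 3, 1, 2, 4, 4]

print(f"Percent agreement: {percent_agreement(rater_a, rater_b):.2f}")
print(f"Cohen's kappa: {cohens_kappa(rater_a, rater_b):.2f}")
```

Low agreement usually suggests that the raters interpret the scoring criteria differently and that a norming session would be worthwhile before the results are used.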
Unlike subjective assessments, objective assessments occur when the scoring procedure is completely specified, thereby enabling agreement among different scorers and minimizing biases. Objective measures are infrequently used in many disciplines (e.g., the Arts), but they are frequently employed in fields where correct/incorrect answers are elicited. Examples include multiple-choice tests, certification exams, demonstrations of a procedure, SATs, etc. Completely objective tests can be limiting because they may not reveal why students are answering a question incorrectly or why they are struggling. For this reason, objective assessments can be supplemented with other tools. However, with more complex statistical analyses, such as item response theory, objective test results can be used to identify areas where students are struggling and to revise questions that may not be clear to students.
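Item response theory requires specialized software, but a much simpler classical item analysis can already point to questions worth reviewing. The sketch below is not item response theory; it only computes the proportion of students who answered each question correctly. The function names, the 0.5 threshold, and the response data are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of students answering each item correctly.

    `responses` holds one list per student, with 1 for a correct answer
    and 0 for an incorrect answer, in the same item order for everyone.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


def flag_hard_items(difficulties, threshold=0.5):
    """Positions (starting at 0) of items answered correctly by fewer than `threshold` of students."""
    return [i for i, p in enumerate(difficulties) if p < threshold]


# Hypothetical results: four students, three multiple-choice questions.
responses = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

difficulties = item_difficulty(responses)
print("Proportion correct per item:", difficulties)
print("Items to review:", flag_hard_items(difficulties))
```

A flagged item does not by itself show whether the question is unclear or the material was not learned; that judgment still rests with the faculty reviewing it.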
Rubrics
A rubric is a scoring guide that provides a set of criteria that define, and often describe, important dimensions of the work being completed, critiqued, and assessed. Rubrics frequently contain levels of achievement for each dimension being measured, sometimes with a numerical score assigned to each dimension. Rubrics can also be simple checklists or tables that highlight important components of the artifacts being scored. Rubrics work well for performance assessments that require the judgement of the assessor. Examples include papers, performances, field experiences, and portfolios.
Creating an effective rubric is not a simple task. Developing one that works well should be viewed as an iterative process that requires testing (norming) and refinement to ensure that it properly aligns with the desired outcomes. Once a rubric has been adopted, using it in a course or throughout a program of study offers numerous potential benefits. Suskie (2009) highlights several, including:
- Help students understand the instructors’ expectations for assignments
- Can inspire better student performance by indicating exactly what is valued in their work
- Make scoring student artifacts easier and faster as they remind faculty what they are looking for
- Make scoring more accurate, unbiased and consistent. Every student is being assessed using the same criteria
- Improve communication with students. Help to identify students’ strengths and weaknesses
- Reduce arguments with students about their performance. Shift focus from grades to how students can improve their performance
- Allow for more objective measures of performance across multiple courses and assignments
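To make the idea of levels and dimensions concrete, the sketch below shows one way rubric scores might be recorded and summarized. Everything in it is hypothetical: the four level names, the three criteria, the target level of 3, and the student scores are invented for illustration and are not taken from any particular rubric.

```python
# Hypothetical achievement levels and criteria for a writing rubric.
RUBRIC_LEVELS = {1: "Beginning", 2: "Developing", 3: "Proficient", 4: "Exemplary"}
CRITERIA = ["thesis", "evidence", "organization"]


def validate_scores(scores):
    """Check that every artifact has a valid level for every criterion."""
    for artifact in scores:
        for criterion in CRITERIA:
            if artifact.get(criterion) not in RUBRIC_LEVELS:
                raise ValueError(f"Missing or invalid level for '{criterion}': {artifact}")


def share_meeting_target(scores, criterion, target=3):
    """Proportion of artifacts rated at or above the target level on one criterion."""
    ratings = [artifact[criterion] for artifact in scores]
    return sum(rating >= target for rating in ratings) / len(ratings)


# Hypothetical rubric scores for four student papers.
scores = [
    {"thesis": 4, "evidence": 3, "organization": 2},
    {"thesis": 3, "evidence": 2, "organization": 3},
    {"thesis": 2, "evidence": 2, "organization": 4},
    {"thesis": 3, "evidence": 4, "organization": 3},
]

validate_scores(scores)
for criterion in CRITERIA:
    share = share_meeting_target(scores, criterion)
    print(f"{criterion}: {share:.0%} at '{RUBRIC_LEVELS[3]}' or above")
```

Reporting results by criterion rather than as a single total shows where students are strong and where they fall short, which a combined score would hide.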
Formative vs. Summative Assessments
Another valuable way to categorize outcomes assessment is as formative or summative. Formative assessment is the process of gathering and evaluating information about student learning during the progression of a course or program, the results of which are used repeatedly to improve teaching and learning. Formative assessments are utilized to initiate immediate change to instruction, curriculum, and other student supports.
Formative assessment approaches are used all the time, but are often not thought of as assessments. For example, demonstrating proper posture to a student during a dance routine or providing students feedback on a draft of their paper may be considered formative assessments because the goal is not to assess whether they have met the desired outcome, but rather to assess how students are progressing on their path to achieving it. Other formative assessment examples include using feedback prompts in class (e.g., a quick poll or a clicker), mini-papers, homework assignments, discussion board posts, and student self-assessments.
Summative assessments, in contrast, involve gathering information at the conclusion of a course, program, or students’ academic careers. They help answer the overarching question – How well are students meeting the expected learning outcomes? Examples of summative assessments include final exams, capstone projects, research papers, and final performances. Changes made as the result of summative assessments are designed to impact the next cohort of students taking the course or program and will have minimal impact on the students subject to the assessment itself. Most students and faculty are accustomed to summative assessments; grades are the most familiar example.
Quantitative vs. Qualitative Data
When planning your assessments, it is helpful to think about the types of data that will be collected. Data are often broken down into two categories: quantitative and qualitative.
Quantitative data are measures expressed as numbers that can be used for statistical analyses. Multiple-choice tests, rubric scores, graduation and placement rates, and closed-ended survey questions are all examples of measures that produce quantitative data. In many disciplines, quantitative data are the norm and are preferred to qualitative data because they are viewed as objective and therefore more reliable.
Qualitative data are not typically measured using numbers or used to perform statistical analyses. Instead, qualitative data are categorized based on properties, attributes, labels, and other identifiers gathered through interviews, focus groups, open-ended questions, and observations. However, qualitative data can be summarized using counts in a process known as content or thematic analysis.
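Summarizing qualitative data with counts can be as simple as tallying the themes assigned to each response. The sketch below assumes a reviewer has already read the open-ended responses and coded each one with one or more themes; the function name, theme labels, and data are hypothetical.

```python
from collections import Counter


def theme_counts(coded_responses):
    """Count how many respondents mentioned each theme.

    `coded_responses` holds one list of theme labels per respondent.
    A theme is counted once per respondent, even if it was coded twice.
    """
    counts = Counter()
    for themes in coded_responses:
        counts.update(set(themes))
    return counts


# Hypothetical codes assigned to four focus-group participants' comments.
coded_responses = [
    ["time management", "advising"],
    ["advising"],
    ["cost", "time management"],
    ["advising", "advising"],
]

for theme, count in theme_counts(coded_responses).most_common():
    print(f"{theme}: mentioned by {count} of {len(coded_responses)} respondents")
```

The counts indicate how widespread a theme is, but the original comments should still be read alongside them to understand what students actually meant.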
So which type of data should be collected when assessing student learning? The answer to this question will be dictated by the measure(s) being employed. Quantitative measures are very familiar to educators and are often easier (and faster) to summarize. Assessments producing qualitative data are often more time consuming to collect, analyze, and report on. Increasingly, assessments that use both quantitative and qualitative data, or a mixed-method approach, are preferable as they allow for a more holistic assessment of the phenomenon being measured.
Surveys
Surveys are an effective assessment tool that may be used to supplement direct evidence of student learning. The use of surveys has proliferated in recent years in higher education as software used to create and launch them has become less expensive and easier to use. Surveys are often used to measure student perceptions of learning or to provide feedback on aspects of a course or program. Surveys can also be utilized to gather feedback from others about students – e.g., a survey of field placement coordinators about student performance.
Surveys are inexpensive to administer, but crafting well-devised questions, launching a survey, and analyzing the results require the technical expertise of trained researchers. BCC licenses software to create in-house surveys and has staff on hand to help faculty and staff develop surveys and analyze the results. BCC periodically administers surveys to various cohorts of students to measure their perceptions of learning and the college experience. Examples include the CUNY Student Experience Survey, the Community College Survey of Student Engagement (CCSSE), alumni surveys, and others.
Another type of survey requiring special mention is student evaluations of teaching and learning (also known as course evaluations). While they are ubiquitous in higher education, course evaluations are not assessments of student learning. The results of these surveys may be used to help inform decisions about courses and faculty, but they do not indicate whether learning outcomes are being achieved.
Suskie, L. (2009). Assessing student learning: A common sense guide (2nd edition). Jossey-Bass.