Digital Writing: Assessment and Evaluation

Chapter 13

Re-Mediating Writing Program Assessment

Karen Langbehn, Megan McIntyre, and Joe Moxley

ABSTRACT

This chapter explores ways that an online writing collection, distribution, and feedback system, My Reviewers, collapses traditionally linear, one-dimensional modes of writing assessment in which teacher written comments, grading, and program assessment occur at separate intervals of time. We report on ways we have used learning analytics to close the assessment loop and provide value-added assessment and identify benefits to intertwining the purposes of teacher response with the purposes of program assessment in terms of rhetoric, context, time, and space. We further argue that, by using My Reviewers to analyze how students and teachers are responding to a curriculum in real time, the WPA and teachers work with one another to fine-tune assignments and crowdsource effective teacher responses, thus improving program assessment results at the same time as we clarify teacher responses to student writers.

SYNTHESIZING TEACHER RESPONSE WITH PROGRAM ASSESSMENT

First-year composition (FYC) at the University of South Florida (USF) is a large program of 70–90 instructors and more than 3,000 students in any given semester. The program consists of a two-semester sequence: ENC 1101 introduces students to some of the conventions of college-level writing and emphasizes research practices and the recursive nature of the writing process. ENC 1102 focuses on social arguments, including discussions of rhetoric in advertising, Rogerian argument, and performing rhetoric as a social action.

Figure 1. Screenshot of My Reviewers

Writing instructors at USF use My Reviewers—a web-based, document markup, peer review, and assessment tool developed by USF—to comment on student intermediate and final drafts. Students in the writing program write in response to shared projects, which have been developed via a datagogical process (Vieregge, Stedman, Mitchell, & Moxley, 2012). Throughout the semester, administrators and teachers may consult learning analytics at My Reviewers to identify aggregated data patterns, including student performance according to shared community rubric criteria (Focus, Organization, Evidence, Style, and Format) by section, teacher, and/or program. Instructors may also compare their grading averages with other teachers in the program. The database now includes over 80,000 student essays reviewed by teachers via My Reviewers.

Since 2007, writing program administrators and instructors in our FYC program have been crowdsourcing assessment practices, including the development of a shared community rubric used to assess all intermediate and final drafts of major projects. My Reviewers, our program’s response to the age of peer production (Anderson, 2006), is a revision and a reconceptualization of assessment that collapses the traditional procedure of (first) teacher response and (then) program assessment. Students and teachers access My Reviewers using our university’s single sign-on Net ID system to upload and download writing documents and teacher/peer responses. In an effort to facilitate efficient use of this system, we’ve also developed a series of corresponding reference materials (videos, podcasts, sample reviewed texts, etc.) that are posted and available immediately upon login to My Reviewers. Below are some examples of our videos, which we share with students to help them understand the rationale for My Reviewers as well as how to navigate its tools.

Video 1. My Reviewers Overview

Video 2. My Reviewers Video Series: Evidence, an Introduction

Video 3. My Reviewers Video Series: In Style

Video 4. My Reviewers Video Series: In Focus

Video 5. My Reviewers Video Series: On Format

Video 6. My Reviewers Video Series: Organization

Over the past few years, graduate students and the WPA have conducted several research studies in hopes of better understanding how networked assessment tools, like My Reviewers, can inform writing pedagogy and writing program administration. In Agency in the Age of Peer Production, a book-length qualitative study of our FYC program, Vieregge et al. (2012) explored the ways My Reviewers reconfigures the agency of composition instructors and writing program administrators. Joe Moxley (2012), USF’s WPA, reported that use of My Reviewers enabled our Spring 2010 instructors to reach an unprecedented level of inter-rater reliability: The 10 independent evaluators’ scores were statistically the same as the classroom instructors’ scores on seven of our eight measures. In another study, Zack Dixon and Moxley (in production) analyzed 120,000 instructor in-text comments and endnotes to assess the degree to which teachers use rubric terms when critiquing student documents in an effort to further define the genre of teacher response (see also Smith, 2003).

In this chapter, we discuss the ways we have used assessment data from the Spring 2011 and Fall 2011 semesters to make significant, productive changes to our FYC curriculum. We begin by grounding our networked assessment effort in current theory and research. Then, we review the data from Spring 2011 and focus on Project 2: Historiography (hereafter referred to as Project 2) as our primary example because assessment results suggested that the existing curriculum failed to encourage student engagement with complex subjects like historiography. Instead, the complexity of the curriculum seemed to frustrate students and teachers alike. Subsequently, we explain our response: a clarification of Project 2 meant to better facilitate student critical engagement with historical writing as reflective of the writing theories and processes we’ve prioritized throughout our FYC curriculum. Lastly, we compare student performance on Project 2 in the Fall 2011 semester with performance in the Spring 2011 semester. Students showed statistically significant improvement on several measures, including three of the four critical thinking criteria and the overall rubric grade (see Table 1). From this comparison, we conclude that networked assessment tools like My Reviewers can close the assessment loop and provide value-added assessment through increased agency and interaction for instructors and students, epitomizing what we see as a democratic education. By making aggregated results transparent to teachers and writing program administrators in real time, networked assessment practices enable university writing programs to intertwine the purposes of teacher response with the purposes of program assessment in rhetoric, context, time, and space.

CONTEXT: USING MY REVIEWERS FOR WRITING PROGRAM ASSESSMENT

My Reviewers facilitates two assessment purposes simultaneously: First, it is the USF FYC digital program assessment tool, and, second, it is the medium by which students upload writing assignments, conduct peer reviews, and access teacher responses and grades. Students also consult My Reviewers for definitions of rubric terms and sample marked-up papers. Perhaps most importantly, My Reviewers enables WPAs and teachers, to a more limited extent, to instantly access assessment data regarding every student and teacher in the FYC program. Hence, if there are 2,000 students in ENC 1101, our first composition course, My Reviewers provides real-time assessment data as these 2,000 students work through the curriculum, from intermediate to final drafts, from Project 1 to Project 4. This means that teachers can effectively trace student successes and challenges throughout the submission of multiple drafts of projects and adjust teaching methods in response to data they’re accumulating in real time via My Reviewers. At the culmination of the course, when the FYC curriculum development team asks teachers to identify specific challenges of the existing curriculum, teachers have access to these drafts and their specific feedback. Without a digital medium to allow for this sort of contextualized calibration, teachers would have no means of accurately measuring their effectiveness in a timely, significant way. WPAs, understandably, have even greater access than the teachers to the large amount of data available via My Reviewers; and they use the learning analytics to review all teacher successes and challenges with the curriculum’s projects as the curriculum is being taught.

These analytics enhance the effectiveness of a number of the WPA’s responsibilities, one of which is facilitating the practicum, a required, graduate-level course for new FYC instructors. The primary purpose of the practicum is for first-time FYC instructors, the WPAs, and FYC mentors to collaborate in discussion and application of composition teaching theory and pedagogy, as well as to problematize and develop creative alternatives to the existing curriculum. Without the advantages of a digital assessment tool, the WPAs wouldn’t be able to use the practicum to provide relevant, evolving assessment data for proposing or practicing new theories. For example, at a glance, WPAs can see if teachers are keeping pace with one another; they can also view teacher responses to students and compare teachers’ grades. In turn, teachers may assess how their students’ grades compare with other teachers’ grades in the writing program (although their permissions hide teacher name and sort by section number; see Figure 1). We believe that this transparency helps facilitate a community of learning (Blake, 1995).


Figure 2. Average Grades by Class, Spring 2012 (ENC 1102)
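The section-level comparison described above can be sketched in a few lines of Python. This is a minimal illustration only, not the My Reviewers implementation: the record fields (`section`, `teacher`, `grade`), the sample values, and the function name `section_averages` are all hypothetical. The sketch shows one way to report average grades by section number while leaving teacher names out of the view.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical grade records; field names and values are illustrative only.
records = [
    {"section": "001", "teacher": "Teacher A", "grade": 3.0},
    {"section": "001", "teacher": "Teacher A", "grade": 2.5},
    {"section": "002", "teacher": "Teacher B", "grade": 3.5},
    {"section": "002", "teacher": "Teacher B", "grade": 2.5},
    {"section": "003", "teacher": "Teacher C", "grade": 2.0},
]

def section_averages(rows):
    """Return (section, mean grade) pairs sorted by section number.

    Teacher names are deliberately dropped so the comparison stays anonymous.
    """
    by_section = defaultdict(list)
    for row in rows:
        by_section[row["section"]].append(row["grade"])
    return [(section, round(mean(grades), 2))
            for section, grades in sorted(by_section.items())]

averages = section_averages(records)
program_mean = round(mean(row["grade"] for row in records), 2)

for section, avg in averages:
    print(f"Section {section}: {avg:.2f} (program mean {program_mean:.2f})")
```

A teacher looking at such a view sees only where a section falls relative to the program mean, which matches the reflective (rather than corrective) purpose the chapter describes.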

We interpret a learning community as “focused on enhancing the learning of each student, and guided by a vision of what the organization must become to facilitate this, in particular, so that the individuals in the organization must also be continually learning” (Nugent et al., 2008). Because My Reviewers was launched and initiated prior to its complete development, as new features were introduced, My Reviewers became new again and again to users. Therefore, all FYC teachers were learners, continually learning how best to use the system, and subsequently crowdsourcing best practices for challenging, and sometimes frustrating, user issues. Most importantly, they were learning in a variety of contexts: during the required orientation week, with their mentor groups, in the practicum, and, more informally, by crowdsourcing one another’s emerging expertise. We believe that this variety of contexts across time facilitated all types of learners.

Our learning community was a key advantage, even a necessity, in successfully implementing My Reviewers and its features. For instance, one of the potentially more controversial features, the ability to see grades by section as shown in Figure 1, allowed teachers to access (anonymous) quantitative data about how other FYC teachers were grading projects. The purpose of developing this feature and enabling these capabilities was to provide teachers with a reliable, easily accessible point of comparison so that they could generally determine the degree to which their grades reflected norms among other FYC teacher grades. Many of our FYC teachers were first-time teachers, and, as a result, they were anxious about grading. By critically reviewing other FYC teacher grades on particular projects, they had the benefit of more specifically identifying their challenges, either in teaching the project or in grading it. Of course, we realize that the intention of this feature is not for new teachers to adjust their grades so that they reflect a more seasoned teacher’s grades, but rather to enable new teachers to identify discrepancies among grading trends and then to critically reflect on their pedagogy and develop understandings about why these discrepancies may exist.

It’s important for us to emphasize that we don’t claim a direct cause-and-effect relationship between teacher ability to compare grades among courses and the creation of a learning community. However, we do see a relationship between the learning community we had been developing prior to implementing our digital assessment tool and the (mostly) positive role of this community in adjusting to My Reviewers’ ensuing challenges.

Much of the personality of our existing learning community is the result of ongoing efforts at developing community prior to initiating My Reviewers. The development of this learning community began in 2007, when we received funding for the FYC Mentoring Program. FYC Mentors were expected to function as catalysts in developing teaching strategies, planning and enacting program assessment, and creating a collaborative community. We found that the mentoring program has significantly contributed to the creation of a more cohesive community by connecting administrators with both returning and new incoming teachers. The mentoring program’s goal was to add an extra layer of collaboration in addition to our use of social software and peer-production tools, but both initiatives were designed to create community among our oft-changing population of instructors (Vieregge et al., 2012).

Initially, we digitized program assessment to adapt to the technology of the day, to begin where our learners were. Our learners were obviously adapting to technology, albeit mostly in social rather than educational contexts. Therefore, we challenged ourselves to adapt to social change as it affects our institutions, our classrooms, our teachers, and our students. Ignoring the prevalence and efficiency of technology may have, at first, appeared to be the easier choice; however, such ignorance seems antithetical to our purposes as educators, as those who ought to engage with (often imperfect and/or incompletely developed) technological tools for the sake of beginning where our learners are and challenging the entire community to adapt, improve, contribute to, and advance our most basic philosophies. It was time for us to build and maintain writing assessment theories and practices consonant with our teaching and research (Huot, 1996). Our program’s emphasis on teacher commentary—especially as it was, and still is, a concerted interest of the practicum meetings, and the primary concern of our FYC mentors—led us to digitized assessment as an effort to elicit more of a dialogic response from students. Given that students were using technological tools as their primary medium for writing, we felt we ought to do the same. Therefore, our goal became the development of a tool for responding to student writing—a tool that they would respond with rather than react to. In particular, we were interested in engaging students in conversations about writing. We agree with Diana Awad Scrocco (2012) that student responses to teacher commentary ought to be conversational, meaning teacher commentary should be “open-ended, interrogative, specific, and individualized to the writer [because] conversational comments ideally elicit a response from writers, encouraging them to brainstorm new ideas about the text or consider a different perspective on an issue in the text” (p. 277). 
Because My Reviewers provides students with viable options for responding to teacher and peer commentary from the same space in which the commentary occurs, we believe they will be more likely to respond with commentary by accessing the various, embedded resources via My Reviewers—more so than they would be if those embedded resources were absent or disconnected, as they are in traditional, linear modes of assessment. As a contribution to Jeff Rice’s (2011) and Brian Huot’s (1996) call for the development of an alternative theory of networked assessment, we sought to create a hermeneutic framework, one “grounded in the philosophical and scientific traditions of phenomenology” (Broad, 2000, p. 244). In Bob Broad’s extensive study, he asserted that “positivist models of interpretation and evaluation break down quickly and repeatedly in the context of rhetorically robust communal writing assessment” (p. 248). Although we were aware that developing a technological tool for assessment would be a hard sell in various ways, we believed that accomplishing this goal would not only enable us to communicate more dialogically with students, and as a result, improve writing across our FYC courses, but also that doing so would enable us to more accurately address writing contexts and the effect of context on assessment values. For instance, the rubric, explanations and examples of rubric criteria, examples of well-written projects, and additional/embedded resources provide adequate context situating the projects, meaning that the commentary/grading that occurs within that context is an extension of the clearly articulated expectations of the projects. In developing My Reviewers with these considerations in mind, we sought to accomplish a hermeneutic assessment, one that:

involve[s] holistic, integrative interpretations of collected performances. . . that privilege[s] readers who are most knowledgeable about the context in which the assessment occurs, and that ground those interpretations not only in the textual and contextual evidence available, but also in a rational debate among the community of interpreters. (Moss, 1994, p. 7)

Furthermore, our decision to digitize program assessment derived from the necessity of developing a shared curriculum. First-year composition is one of the few classes that most students at the University of South Florida are required to take. Because of our program’s scope and size (our total seat count has ranged from 7,000 to 9,500 students per academic year over the last 8 years) and because FYC is part of the General Education Curriculum (Florida’s 36-hour general education program, designed to introduce college and university students to the fundamental knowledge, skills, and values that are essential to the study of academic disciplines; Florida Department of Education), these courses draw the attention of university administrators. Because of these implications, we’re especially aware of how the success of students and teachers on an individual (classroom) level impacts success on a programmatic level and a departmental level and, eventually, reflects positively on our university as an institution. Because, like Dylan Wiliam (2007), we believe that “changing teachers’ minute-to-minute and day-to-day formative assessment practices is the most powerful way to increase student achievement” (pp. 200–201), we are especially sensitive to the ways in which ineffective practices cause problems beyond individual classrooms and the ways in which successful formative assessment at the classroom level provides the best opportunities to improve student learning and our shared curriculum. By studying these formative assessment practices and using them to gauge the relative effectiveness of our curricular decisions, we believed we could improve student learning. As a result, we felt it was vital to enable the WPA and other FYC staff members to assess classroom assignments and evaluations through a shared assessment space and criteria.

Rice (2011) described the WPA’s extended agency in terms of tracing, in which the tracing of recurring links may allow WPAs the opportunity to connect the dots in novel, informative ways, facilitating the purpose of networked assessment: successfully conducting an alternative, extended method, implicating response and assessment, while simultaneously developing a new vocabulary based in the language of new media. Ultimately, the fundamental purpose of My Reviewers—and its subsequent implications for WPAs and teachers—is democratic and pedagogical: It teaches us about the relationships circulating in our program that we have yet to see as being an influential part of a given network and teaches us about the inherent value of self-reflexivity as a critical, dynamic process of writing program assessment.

Additionally, given our responsibilities to students, teachers, and administrators, we hoped to develop a unique system capable of addressing seemingly conflicting goals. We recognized that teachers and students desired individual agency, but, at the same time, that institutional agency was accountable for ensuring a consistent and reliable learning experience for students. Careful deliberation about how best to negotiate these pressures led us to the conclusion that we needed to peer-produce a shared curriculum and assessment practices.

Lastly, our decision to digitize assessment can be traced to our accountability for preparing our program for the Southern Association of Colleges and Schools (SACS) re-accreditation. For this reason, as well as the two primary reasons explained above, our assessment tool was launched in Fall 2008 and has become the focus around which our seemingly contradictory agencies (as WPAs, composition teachers, and students of composition) converge, inform one another, and accomplish our foundational goal: collapsing the traditional binary that exists between teacher assessment (response) and program assessment, enabling our department the opportunity to engage in a more genuine community of learners who collaborate on traditionally non-collaborative responsibilities like program assessment. Most generally, and most significantly, My Reviewers is our interpretation of how rhetorical education in the 21st century ought to be practiced: as that which prepares students to command attention across a wide range of rhetorical performances—written, oral, visual, and digital (Rice, 2011).

TEACHER RESPONSE AND PROGRAM ASSESSMENT

Teacher response has often been practiced separately from program assessment primarily because of the process-oriented ideology in which the two (separate, different, unrelated) functions occurred: First, teacher response; then, program assessment. As a result of this traditional, linear operation, composition teachers and scholars have typically felt frustrated by, cut off from, and uninterested in assessment because the inner workings of writing assessment had been kept private among “measurement specialists” who possessed expert knowledge about measurement.

Although the assessment process wasn’t intentionally linear, the means of facilitating a simultaneous and collapsed engagement of response with assessment wasn’t initially deemed important—let alone technologically possible. Understandably, because of the separateness of the two functions (in their contexts, rhetorics, spaces, and times) teacher response wasn’t easily compatible with program assessment because it occurred before assessment—assessment under different persuasions and for competing rationales. Therefore, as a post-process procedure, there was a distinct disconnect between teacher responses and program assessment. Most detrimentally, program assessment wasn’t prioritized in the context of the composition classroom because, quite simply, it wasn’t understood as a direct responsibility of composition teachers.

Instead, program assessment was the responsibility of WPAs and, more specifically, measurement specialists concerned with quantitative consistency: reliable, numerical scores of individual student papers from independent judges trained in the objective rhetoric of a fixed terminology pertaining to ethics, fairness, and validity. Validity, in its traditional use in program assessment, referred to whether the assessment measured what it purported to measure, given the assumption that an assessment’s value is limited to distinct goals and properties in the instrument itself. In contrast, networked assessment repurposes validity to emphasize the accuracy of an assessment and the impact of the assessment process on teaching and learning at a specific site (along with that individual site’s mission and goals), on the assumption that the value of an assessment can only be known and accounted for in a specific context. As Rice (2011) explained:

new assessment procedures recognize the importance of context, rhetoric, and other characteristics integral to a specific purpose and institution; site-based, controlled locally, and moving away from the implicit standardization of assessment conclusions, a “topoi of assessment,” toward a networked tracing, an articulation cannot be standardized or predicated on a value-based system. (p. 29)

In tandem with the ideology of networked assessment, My Reviewers takes into account context, rhetoric, and other characteristics (i.e., agency as it manifests bottom-up, by teachers and students, and top-down, as it informs program- and university-wide measurements for success). As we explain below, My Reviewers helps to close the assessment loop, integrating program assessment into the classroom and instructor assessment into program assessment.

USING MY REVIEWERS TO CLOSE THE ASSESSMENT LOOP

In Fall 2009, we began revising our ENC 1101 curriculum into a three-project sequence, which consisted of an annotated bibliography project, historiography project (Project 2), and video project, as described in the video.

Video 7. My Reviewers video on Historiography

Because Project 2 is the most controversial in terms of our assessment analytics, we use it here as a means of grounding our theoretical claims about the benefits and efficiency of an accessible, embedded, collapsed process of digital assessment via My Reviewers.

As mentioned above, ENC 1101 and 1102 are subject to USF’s General Education program requirements, which meant that during the 2008–2009 academic year, we were tasked with incorporating history, a General Education requirement, into the ENC 1101 curriculum. Our specific responsibility was to incorporate history as an evolving process of interpretation and enable students to “recognize a diverse range of historical interpretations” (General Education Council, 2011). In response to this charge, administrators and instructors in the FYC program created a project in which students would be asked to develop a 1200–1500-word analysis of various interpretations of an historical event, figure, or idea. The students were required to complete an annotated bibliography of six sources whose publication dates had to span at least a 10-year period. The assignment also required students to “address the way that interpretations of this event, figure, or idea have changed and the way that each writer’s perspective influences his or her interpretation.” Specifically, the assignment directed that students:

write a 1200–1500-word essay that analyzes four different interpretations of an historical event, figure, or idea. . . Your analysis should address the way that interpretations of this event, figure, or idea have changed and the way that each writer’s perspective influences his or her interpretation. Your analysis should also address how your own definition of history and your own ideas on how history is written fit into the chosen historical event. . . To successfully complete this assignment, you will need to identify the facts regarding the event or figure, explain the importance of this event, figure, or idea, and incorporate research on each author’s cultural and historical context in an effort to explain why there are discrepancies in each author’s interpretation.

From its inception, Project 2 was problematic for teachers and students: Teachers struggled with how to teach the requisite skills for each of the component parts, and first-year students struggled with the complexity of historiography as well as the number and difficulty of skills required to complete the assignment. Because Project 2 was the second of three projects in a 16-week semester, students drafted it between weeks four and nine and then moved on to a final project that also required a complex thesis. First-year students, only a few weeks into their college careers, struggled with Project 2 to the extent that teachers often softened requirements or took other steps to ease the process for their students. From email discussions and face-to-face meetings among teachers, program administrators knew that the integration of historiography was proving difficult, but our reliance on the end-of-the-year assessment process prevented us from pinpointing the exact difficulties with the project at the time because we were unable to examine a large cross-section of student work in a timely manner.

Here, especially, we emphasize the limitations of traditional assessment, most significantly time. Of the four factors (rhetoric, context, time, space) implicating assessment, time was one of the most significant in motivating us to create and implement digital assessment, because digital assessment enables us to instantly access pertinent data (learning analytics) that inform us of our successes and challenges in teaching from the existing curriculum. Although traditional assessment certainly yields these same data, it does so much more slowly. As a result, by the time the learning analytics are calibrated and applied to the existing curriculum, they’re no longer useful in addressing the context from which they were derived; instead, they’re applied to subsequent students, who are likely to have different successes and challenges than those determined by the learning analytics of previous composition students.

During the following academic year, My Reviewers was implemented as the primary space in which student work was evaluated, archived, and assessed, which enabled WPAs to examine evaluation and assessment data from My Reviewers for each of the student essays uploaded and to then consider these data along with additional, specific feedback from instructors. Program administrators and curriculum facilitators examined assessment data, surveyed instructors, examined student feedback, and determined that the project was simply too complex for the first-year sequence in which it was embedded.

During that same summer, in response to assessment data and instructor response pertaining to the challenges of Project 2, WPAs significantly clarified the purpose of the project by eliminating the annotated bibliography component, narrowing the focus of the assignment’s requirements, and reducing the number of required secondary sources. Though the revised project still requires students to examine various perspectives on an issue, we believe that it does so more efficiently:

Project 2, a 600–800 word bibliographic essay, asks students to investigate a conversation surrounding their topic. This second project asks students to understand the conversation surrounding their chosen topic by examining four relevant sources. Two of these sources must be at least 10 years apart so that students can see how interpretations of an event, concept, or person evolve over time and that textual scholarship is an ongoing conversation. Students learn that knowledge is typically not something static that one unearths so much as a conversation, a set of shared assumptions. After reading and summarizing these sources, students will write a short essay that provides background on their topic, makes connections between at least two of their sources, and draws some conclusions regarding the conversation surrounding the topic.

Obviously, the new project retains many features of the first project, but repositions these tasks as part of an exploratory bibliographic essay, instead of as an argument about the malleability of history.

Table 1 offers a side-by-side comparison of the two approaches to Project 2. A comparison of the data from both approaches to the project suggests that students performed better under the new curriculum than the old: Students showed statistically significant improvement in three of the four categories measuring critical thinking (Focus: Critical Thinking, Evidence: Critical Thinking, and Organization: Critical Thinking). Additionally, students demonstrated statistically significant improvement in the Focus: Basic Writing category and in their overall mean grades for the project. We also saw a drop in the standard deviation for each of the aforementioned categories, which suggests that students scored closer to the mean score under the new curriculum than they previously had.

Criterion	Student work from 2010–2011 curriculum		Student work from 2011–2012 revised curriculum
	Mean	Standard Deviation	Mean	Standard Deviation
Focus: Basic Writing	3.12_a	1.17	3.36_b	.98
Focus: Critical Thinking	2.65_a	1.05	2.85_b	.96
Evidence: Critical Thinking	2.63_a	1.10	2.93_b	1.01
Organization: Basic Writing	2.91_a	.95	2.90_a	.87
Organization: Critical Thinking	2.74_a	1.02	2.97_b	.91
Style: Basic Writing	2.82_a	1.00	2.85_a	.88
Style: Critical Thinking	2.78_a	.96	2.72_a	.88
Format: Basic Writing	3.03_a	.93	2.95_a	1.07
Rubric Grade	2.77_a	.82	2.93_b	.75
Note: Values in the same row and sub-table not sharing the same subscript are significantly different at p < 0.05 in the two-sided test of equality for column means. Cells with no subscript are not included in the test. Tests assume equal variances; tests are adjusted for all pairwise comparisons within a row of each innermost sub-table using the Bonferroni correction.

Table 1. Comparison of Rubric Criteria Scores (2011–2012)

Table 1 compares the scores in each rubric area, as well as the final rubric score, for Project 2: Historiography and Project 2: Brief Bibliographic Essay. The scores indicate that, with the exception of Style: Critical Thinking, students demonstrated statistically significant improvement in each of the critical thinking rubric areas as well as in the final rubric score. Additionally, with the exception of Format: Basic Writing, the standard deviation (a measure of how far individual scores typically fall from the mean) was lower in every category. We believe this indicates that students better understood the project, a view supported by the statistically significant improvement in the Focus: Basic Writing category, which asks instructors to evaluate how well students actually met assignment requirements.
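The kind of comparison reported in the note to Table 1 (a two-sided test of equality of means that assumes equal variances) can be sketched from summary statistics alone. The sketch below is illustrative rather than a reproduction of our analysis: the sample size `N` is hypothetical because the number of papers per curriculum is not reported here, and the p-value uses a large-sample normal approximation instead of the exact t distribution. With only two columns there is a single pairwise comparison per row, so the Bonferroni adjustment leaves the p-value unchanged.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample t statistic under the equal-variance (pooled) assumption."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / std_err


def two_sided_p(t):
    """Two-sided p-value using a large-sample normal approximation."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# HYPOTHETICAL sample size per curriculum; the real counts are not given here.
N = 1000

# (old mean, old SD, new mean, new SD), copied from two rows of Table 1.
rows = {
    "Focus: Critical Thinking": (2.65, 1.05, 2.85, 0.96),
    "Style: Basic Writing": (2.82, 1.00, 2.85, 0.88),
}

results = {}
for name, (m_old, sd_old, m_new, sd_new) in rows.items():
    t = pooled_t(m_old, sd_old, N, m_new, sd_new, N)
    p = two_sided_p(t)
    results[name] = (t, p, p < 0.05)
    print(f"{name}: t = {t:.2f}, p = {p:.4f}, significant = {p < 0.05}")
```

Under this assumed sample size, the first row comes out significant and the second does not, which matches the subscripts in Table 1. A different `N` would change the test statistics, so the numbers printed here should not be read as our reported results.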

Because of the data accessible via My Reviewers, we have been able not only to make curriculum revisions in response to teacher feedback and rubric results, but also to examine projects across time, from year to year, and to evaluate the efficacy of our changes to the curriculum (for example, see Figure 2). Essentially, My Reviewers allows us to revise the curriculum in substantial ways because it shows us where project weaknesses are and where we need to revise our directions and expectations. As a tool for curriculum revision, My Reviewers has allowed us to identify weaknesses quickly by examining rubric results for large numbers of student papers on particular projects, and it has provided important feedback on whether the changes made to address those weaknesses have been effective. Ultimately, in analyzing our data and responding with critique and revision of our curriculum, we intend to synthesize digital writing activity with rhetorical education.
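The year-to-year comparison described above amounts to grouping rubric scores by curriculum year and criterion and then summarizing each group. The following is a minimal sketch of that aggregation, not the My Reviewers implementation: the `records` list and the `summarize` helper are hypothetical and invented for illustration, and the scores are not real student data.

```python
from collections import defaultdict
from statistics import mean, stdev

# HYPOTHETICAL rubric records, invented for illustration only:
# (curriculum year, rubric criterion, score on the 0-4 rubric scale)
records = [
    ("2010-2011", "Focus: Basics", 2),
    ("2010-2011", "Focus: Basics", 3),
    ("2010-2011", "Focus: Basics", 4),
    ("2011-2012", "Focus: Basics", 3),
    ("2011-2012", "Focus: Basics", 3),
    ("2011-2012", "Focus: Basics", 4),
    ("2011-2012", "Focus: Basics", 4),
]


def summarize(rows):
    """Group scores by (year, criterion) and return (mean, sample SD, count)."""
    grouped = defaultdict(list)
    for year, criterion, score in rows:
        grouped[(year, criterion)].append(score)
    return {key: (mean(s), stdev(s), len(s)) for key, s in grouped.items()}


summary = summarize(records)
for (year, criterion), (avg, sd, n) in sorted(summary.items()):
    print(f"{year}  {criterion}: mean={avg:.2f}, sd={sd:.2f}, n={n}")
```

A higher mean alongside a lower standard deviation in the later year is the pattern we read as evidence that students understood the revised assignment more consistently.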

Figure 2. Comparisons demonstrating improvements in major rubric categories (2010–2011 and 2011–2012).

Figure 2 above shows marked improvement in the Focus: Basics category. The sole purpose of this category is to measure how well and to what extent a student has completed the requirements of the assignment. An improvement in this category suggests a better understanding of the new assignment.

Criteria	Level	Emerging (0)	1	Developing (2)	3	Mastering (4)
Focus	Basics	Does not meet assignment requirements		Partially meets assignment requirements		Meets assignment requirements
Focus	Critical Thinking	absent or weak thesis; ideas are underdeveloped, vague, or unrelated to thesis; poor analysis of ideas relevant to thesis		Predictable or unoriginal thesis; ideas are partially developed and related to thesis; inconsistent analysis of subject relevant to thesis		Insightful/ intriguing thesis; ideas are convincing and compelling; cogent analysis of subject relevant to thesis
Evidence	Critical Thinking	Sources and supporting details lack credibility; poor synthesis of primary and secondary sources/evidence relevant to thesis; poor synthesis of visuals/ personal experience / anecdotes relevant to thesis; rarely distinguishes between writer's ideas and source's ideas		Fair selection of credible sources and supporting details; unclear relationship between thesis and primary and secondary source s/ evidence; ineffective synthesis of sources/ evidence relevant to thesis; occasionally effective synthesis of visuals / personal experience / anecdotes relevant to thesis; inconsistently distinguishes between writer's ideas and source's ideas		Credible and useful sources and supporting details; cogent synthesis of primary and secondary source s/ evidence relevant to thesis; clever synthesis of visuals / personal experience/ anecdotes relevant to thesis; distinguishes between writer's ideas and source’s ideas
Organization	Basics	Confusing opening; absent, inconsistent, or non-relevant topic sentences; few transitions and absent or unsatisfying conclusion		Uninteresting or somewhat trite introduction, inconsistent use of topic sentences, segues, transitions, and mediocre conclusion		Engaging introduction, relevant topic sentences, good segues, appropriate transitions, and compelling conclusion
Organization	Critical Thinking	Illogical progression of supporting points; lacks cohesiveness		Supporting points follow a somewhat logical progression; occasional wandering of ideas; some interruption of cohesiveness		Logical progression of supporting points; very cohesive
Style	Basics	Frequent grammar/ punctuation errors; inconsistent point of view		Some grammar / punctuation errors occur in some places; somewhat consistent point of view		Correct grammar and punctuation; consistent point of view
Style	Critical Thinking	Significant problems with syntax, diction, word choice, and vocabulary		Occasional problems with syntax, diction, word choice, and vocabulary		Rhetorically sound syntax, diction, word choice, and vocabulary; effective use of figurative language
Format	Basics	Little compliance with accepted documentation style (i.e., MLA, APA) for paper formatting, in-text citations, annotated bibliographies, and works cited; minimal attention to document design		Inconsistent compliance with accepted documentation style (i.e., MLA, APA) for paper formatting, in-text citations, annotation bibliographies, and works cited; some attention to document design		Consistent compliance with accepted documentation style (i.e., MLA, APA) for paper formatting, in-text citations, annotated bibliographies, and works cited; strong attention to document design

Table 2. Rubric

Thus, as shown in Figure 2 and Table 2, it appears our new curriculum for Project 2 was successful at helping students learn necessary skills like integrating evidence and developing theses or purpose statements. This success may be because the simplified prompt and direction enabled teachers to spend less time explaining the form and purpose of the assignment and more time instructing students in how to integrate evidence and develop theses. The score in the Focus: Critical Thinking category, which is meant to measure thesis development and depth of analysis, improved at least in part because students performed better when asked to create a thesis that serves one specific purpose (i.e., to make an assertion about how the conversation surrounding a topic may or may not have changed over a 10-year period).

Further, the revised Project 2 allowed students to focus on the work required and to actively read, research, and use sources more productively. The change from six annotated sources to four summarized sources, only two of which would be discussed in the final version of the essay, meant that students had more time to focus on the specifics of these sources and that instructors (who no longer had to model annotated bibliographies) had more time to teach summary and critical reading skills. As we recommended in the teacher discussion of the project, many teachers chose to assign the summaries one at a time as homework, which gave teachers the ability to work with students on summary and source integration even before students completed a full draft of the bibliographic essay. The increased time and attention to summary may account for some of the statistically significant improvement seen under the new curriculum.

This focus on summary, as well as the simplification of the assignment, may also have affected student organizational strategies and the relative cohesiveness of papers. As shown by higher scores in the Organization: Critical Thinking category, students appear to have found the new project’s streamlined purpose easier to fulfill and their ideas easier to connect: the revised version of the project saw an increase of nearly a quarter of a point (from 2.74 to 2.97) on a four-point scale in this category.

Improvements in the aforementioned categories also likely account for the statistically significant improvement in the final mean rubric score, although the total length of the project (600–800 words as opposed to 1200–1500 words), the reduction in the number of sources, and the simplification of the purpose of the assignment are also certainly factors, regardless of their degree of impact on the categories above.

CONCLUSIONS

By analyzing in real time how students and teachers are responding to a curriculum, WPAs and teachers can work with one another to fine-tune assignments and crowdsource effective teacher feedback, thus improving program assessment results. Most significantly, because of the immediacy enabled by a digital system, they can accomplish this as the semester unfolds: WPAs trace teacher responses and grades, as indicators of strengths and challenges, on a day-to-day basis.

Concerning Project 2, there is little doubt that the historiography project would, eventually, have been changed, but under traditional assessment processes that change would have been much longer in coming. Students, teachers, the WPA, and university administrators would have been unduly challenged, frustrated, and disappointed at performance, assessment, and retention rates throughout the time in which learning analytics were gathered and the time at which they were calibrated and, finally, applied. Digital assessment speeds and simplifies evidence-based curricular revision: Instead of making changes based only on anecdotal reports or small sample sizes, networked assessment tools enable WPAs to aggregate a much larger amount of data and more quickly analyze the relative effectiveness of our curricular decisions. The comparisons above would be nearly impossible without My Reviewers, especially given the relatively large numbers of students, teachers, and papers.

In addition to the affordances of efficiency and breadth of application, using a digital mode of assessment meant that teacher responses to student writing were more significant on two accounts: First, as a means of tracing student challenges and successes throughout their enrollment in the FYC program and, second, as a means of using individual student data to articulate trends in the potential deficiencies of our curriculum and to address those challenges much more immediately than was previously possible with a traditional assessment system.

Digital assessment has enabled us to respond to one of the major deficiencies of traditional assessment: the amount of time required to gather, calibrate, and apply assessment results to the curriculum. In the time traditionally required to develop an interpretation of assessment results and then to revise the curriculum based upon those results, the students challenged by the curriculum being analyzed have already exited the course. In a traditional mode of assessment, curriculum changes occur too slowly to positively affect students who are challenged by the curriculum being studied and analyzed for deficiencies; instead, changes to the curriculum affect future students, who are likely to have different challenges.

Additionally, as discussed above, digital assessment has enabled us to respond to the disconnectedness between teacher assessment and program assessment. Although the initial stages of implementing My Reviewers were admittedly rocky (see Vieregge et al., 2012), the learning community we had worked to establish well before the implementation of My Reviewers was a positive asset in enabling us to sell teachers on the merits of a digital assessment tool. Ultimately, our experience suggests that the teachers most aware of and engaged in their pedagogy are the very people who are proactive in influencing others perhaps less zealous about our conversion to a digital system.

We contend that My Reviewers closes the assessment loop (teacher response, student performance/teacher assessment, and program assessment) and that, in doing so, it gives teachers significant agency in program assessment by virtue of their engagement. Because their individual inputs of student data are aggregated by the networked assessment tool, as opposed to a team of measurement specialists, FYC teachers are aware of the significant implications of the data they are entering. By entering student information into the system, teachers enhance their own agency because they are contributing directly and immediately to the aggregation of program assessment data. Therefore, we suggest that the two traditionally contradictory processes work in tandem: Teacher response, in terms of a successfully developed curriculum, can be measured just as program assessment, in terms of aggregated teacher response, is deduced and analyzed. Ultimately, networked assessment tools like My Reviewers re-conceive, extend, and further justify the significance of the agency and influence of WPAs and teachers because they allow teachers (and, by extension, the program) to connect with students in a non-traditional, digital medium, one which they use almost exclusively for writing tasks. Our persistence in continuing to improve, extend, and perfect this technology worked as a model for our teachers of how engaging with digitally mediated approaches to learning (Arafeh, Levin, Rainie, & Lenhart, 2002; Caruso & Kvavik, 2005; Caruso & Salaway, 2007; Prensky, 2001, 2005) strengthened our agency in motivating student learning.

Because our FYC teachers are practicing a mode of communication in sync with student preferences for technology, we are beginning to see that these teachers are more likely to communicate effectively with their students, therefore enabling greater student success and teacher agency and ultimately, we hope, increased long-term retention university-wide. Alternatively, at times, FYC teachers reported that students had problem-solved one aspect or another of My Reviewers and then informed them of their solutions, flattening the traditional “teacher-as-expert” assumption and redistributing agency to the students. But in these scenarios, as in those where the teacher achieves greater agency with students by communicating via technology, agency is the ultimate value-add.

Although we are in the early, speculative stages of researching ways networked assessment tools reconfigure teacher methods of responding to student writing, our preliminary research suggests that networked assessment tools enhance the agency of individual instructors and inform curricular changes (Vieregge et al., 2012). As is the case with all research endeavors, especially those shaped by continually evolving technology, we are always reflecting on how best to conduct additional research to better understand how these tools can be used to track student improvement as writers, critical thinkers, and researchers. Furthermore, our concern is to efficiently apply that data to inform significant, timely responses to student proficiency, teacher agency in classrooms and department meeting spaces, and WPA scope of positive influence both within the department and across the university. Most importantly, a digital assessment technology like My Reviewers helps us to ensure present and future relevance in our field, one that is significantly and positively affected by technology and its agency in augmenting teacher investment in student learning.

REFERENCES

Anderson, Chris. (2006). People power. Wired, 14 (7). Retrieved from http://www.wired.com/wired/archive/14.07/people.html

Arafeh, Sousan; Levin, Doug; Rainie, Lee; & Lenhart, Amanda. (2002). The digital disconnect: The widening gap between Internet-savvy students and their schools. Washington, DC: Pew Internet and American Life Project. Retrieved from http://www.pewinternet.org/Reports/2002/The-Digital-Disconnect-The-widening-gap-between-Internetsavvy-students-and-their-schools.aspx

Blake, Elizabeth S. (1995). Talking about research: Are we playing someone else’s game? In Joe M. Moxley & Lagretta Tallent Lenker (Eds.), The politics and processes of scholarship (pp. 27–40). Westport, CT: Greenwood.

Broad, Bob. (2000). Pulling your hair out: Crises of standardization in communal writing assessment. Research in the Teaching of English, 35 (2), 213–260.

Caruso, Judy B. & Kvavik, Robert B. (2005). ECAR study of students and information technology, 2005: Convenience, connection, control, and learning roadmap. EDUCAUSE Center for Applied Research. Retrieved from http://connect.educause.edu/library/abstract/ECARStudyofStudentsa/37610

Caruso, Judy B. & Salaway, Gail. (2007). ECAR study of undergraduate students and information technology, 2007-roadmap. EDUCAUSE Center for Applied Research. Retrieved from http://connect.educause.edu/library/abstract/TheECARStudyofUnderg/45077

Dixon, Zack, & Moxley, Joe M. (In production.). Big data, analytics, and open assessment. Assessing Writing.

Florida Department of Education. (2006). Pathways to success. In Planning on Pursuing a Bachelor’s Degree? Pathways to Success. Retrieved from http://www.fldoe.org/articulation/pdf/Pathways_to_Success.pdf

General Education Council. (2011). Historical process context and process rubric. Tampa, FL: University of South Florida.

Huot, Brian. (1996). Computers and assessment: Understanding two technologies. Computers and Composition, 13, 231–243.

Moss, Pamela A. (1994). Can there be validity without reliability? Educational Researcher, 23 (2), 5–12.

Moxley, Joe M. (2012). Aggregated assessment and “Objectivity 2.0.” Presentation at the European Chapter of the Association for Computational Linguistics. Avignon, France.

Moxley, Joe. (2010). Assessment 3.0: Reconfiguring grading spaces with My Reviewers. Presentation at the Computers and Writing Conference. Purdue University. West Lafayette, IN.

Moxley, Joe M.; Dixon, Zack; Carabelli, Jason; & McIntyre, Megan. (2012). Reflections from the panopticon: Inside looking in. Presentation at the Conference on Writing Program Administration. Albuquerque, NM.

Moxley, Joe. (In press). Using a community rubric across courses and sections to assess reasoning and writing ability—Toward a theory of social assessment. Assessing Writing.

Nugent, Jeff S.; Reardon, R. Martin; Smith, Fran G.; Rhodes, Joan A.; Zander, Mary Jane; & Carter, Teresa J. (2008). Exploring faculty learning communities: Building connections among teaching, learning, and technology. International Journal of Teaching and Learning in Higher Education, 20, 51–58.

Prensky, Marc. (2001). Digital natives, digital immigrants, part II: Do they really think differently? On the Horizon, 9 (6), 1–9.

Prensky, Marc. (2005). Listen to the natives. Educational Leadership, 63 (4), 8–13.

Rice, Jeff. (2011). Networked assessment. Computers and Composition, 28, 28–39.

Scrocco, Diana Awad. (2012). Do you care to add something: Articulating the student interlocutor’s voice in writing response dialogue. Teaching English in the Two-Year College, 39 (3), 274–292.

Smith, Summer. (2003). The role of technical expertise in engineering and writing teachers’ evaluations of students’ writing. Written Communication, 20, 37–80.

Vieregge, Quentin; Stedman, Kyle D.; Mitchell, Taylor Joy; & Moxley, Joseph M. (2012). Agency in the age of peer production. Urbana, IL: National Council of Teachers of English.

Wiliam, Dylan. (2007). Content then process: Teacher learning communities in the service of formative assessment. In Larry Ainsworth, Lisa Almeida, Anne Davies, & Richard DuFour (Eds.), Ahead of the curve: The power of assessment to transform teaching and learning (pp. 183–204). Bloomington, IN: Solution Tree Press.