Observations on the SAGE sample exam

Earlier this year I wrote a multi-part series on the PARCC test samples, picking at potential pitfalls and interface issues. I did so on the assumption that PARCC would be my state’s new test.

Then (allegedly) the price tag on the bid came in too high, and we went with the American Institutes for Research instead. They have a contract to administer the Smarter Balanced test (so for the part of the US doing that one, this should interest y’all) but the test we will be seeing is customized for Arizona, presumably out of their test banks. This is close to the situation in Utah, which has a sample of what they are calling the SAGE test. Since there is no Arizona sample yet I decided to try my hand at Utah’s.

I’d like to think my approach to PARCC was gently scolding, but there’s no way around it: this test is very bad. One friend’s comment after going through some problems: “I’m starting to think this is an undergrad psych experiment instead.”

Question #1 is straightforward. Question #2 is where the action starts to happen:


Adding a point on the line gets the r-value closer to 1, but with no information on the exact coordinate points (those are nowhere near the grid lines) or the original r-value I believe this problem is impossible as written.
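To show why the missing coordinates matter, here is a minimal sketch using made-up data (not the points from the test item, which can't be read off the grid). It adds a point lying exactly on the least-squares line at different x positions and recomputes r. The helper names are my own.

```python
from math import sqrt

def sums_of_squares(points):
    """Return (Sxx, Syy, Sxy, mean_x, mean_y) for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxx, syy, sxy, mean_x, mean_y

def pearson_r(points):
    """Pearson correlation coefficient of the points."""
    sxx, syy, sxy, _, _ = sums_of_squares(points)
    return sxy / sqrt(sxx * syy)

def least_squares_line(points):
    """Slope and intercept of the least-squares regression line."""
    sxx, _, sxy, mean_x, mean_y = sums_of_squares(points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical scatter plot; the real item's coordinates are unknown.
data = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
slope, intercept = least_squares_line(data)
r_before = pearson_r(data)

def r_after_adding(x_new):
    """r after adding one point that sits exactly on the fitted line."""
    return pearson_r(data + [(x_new, slope * x_new + intercept)])

print("original r:", round(r_before, 4))
for x_new in (3, 6, 20):
    print("add point at x =", x_new, "-> r =", round(r_after_adding(x_new), 4))
```

With this data, a point added at the mean of x leaves r unchanged, while points farther out along the line pull r closer and closer to 1. The size of the change depends entirely on the existing coordinates and on where the new point goes, which is exactly the information the problem withholds.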


Question #3 is fairly sedate although they screwed up by neglecting to specify they wanted positive answers; (17, 19) and (-17, -19) both work but the problem implies there is only one valid pair. I’d like to draw attention to the overkill of the interface, which includes pi and cube roots for some reason. There seem to be multiple “levels” to the numerical interface, with “digits and decimal point and negative sign” being the simplest all the way up to “including the arctan if for some reason you need that” but without much rhyme or reason to the complexity level for a particular problem.

Case in point:


The percents in the problem imply the answer will also be delivered as x%, but there is absolutely no way to type a percent symbol in the line (just typing % with the keyboard is unrecognized). So something like 51% would need to be typed as .51. Fractions are also unrecognized.


Here’s the Common Core standard:

Derive the formula for the sum of a finite geometric series (when the common ratio is not 1), and use the formula to solve problems. For example, calculate mortgage payments.
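For reference, the derivation the standard has in mind takes only a few lines. The symbols (first term a, common ratio r, number of terms n) are my notation, not the test's.

```latex
\begin{align*}
S_n &= a + ar + ar^2 + \cdots + ar^{n-1} \\
r S_n &= ar + ar^2 + \cdots + ar^{n-1} + ar^{n} \\
S_n - r S_n &= a - ar^{n} \\
S_n &= \frac{a\,(1 - r^{n})}{1 - r}, \qquad r \neq 1
\end{align*}
```

Note that everything here is a finite sum: the standard says nothing about infinite series.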


I could linger on the bizarre conditional clause that makes up this problem (*why* one would ever need a line with a y-intercept greater than one given in table form, yet also perpendicular to some other particular line, is beyond me) but instead I’ll point out the interface to the right, which is how all lines are drawn. (Just lines: there seems to be no way to draw parabolas and so forth like in the PARCC interface.) To add a line you click on “Add Arrow” (not intuitive naming) and click a starting point and an ending point. Notice that the line does not “complete” itself but rather hangs as an odd fragment on the graph. Also, fixing mistakes requires clicking “delete” and then the line, except if you click right on the line the points do not disappear, so you have to repeat delete-click-delete-click on each of the points to clear everything out.

Oh, and the super-tiny cursor button is what you click if you want to move something around rather than delete and add. There was not enough room to have a button called “Move”?


First off, “objective function” is not a Common Core vocabulary word and linear programming is not in Common Core besides, at least not as presented in this question.

Solve linear equations and inequalities in one variable, including equations with coefficients represented by letters.

Graph the solutions to a linear inequality in two variables as a half-plane (excluding the boundary in the case of a strict inequality), and graph the solution set to a system of linear inequalities in two variables as the intersection of the corresponding half-planes.

Besides that, the grammar is very sloppy. It should say “the objective function z = -3x + 4y” without lumping it in a set of five statements where the student has to fish and presume the function being meant is the first line because it is the only one that includes all three variables in function form.


I include this problem only to indicate how wide the swerves in difficulty of this test are. First linear programming, then a simple definition, and then…


I came up with sqrt(2) and 2, but notice how the number line only accepts “0 1 2 3 4 5 6 7 8 9 . -” in the input. There is no way to indicate a square root.

Fractions are also right out, so one possible answer that does work (0.25 and 0.5) is very hard to get to. (I confess I was stumped and needed a hint from a friend.)


I drove my eyes crazy trying to get the right numbers on the axis to match up, especially on Survey 1, which is not even placed against the number line. I thought the PARCC snap-to-half-grid was bad, but this is a floating snap-to-half-grid, which means it is very unclear if one has in fact aligned one’s graph with 6.7.


My average is going to be imaginary. The level of input that each problem allows is again quite erratic.

Incidentally, I found no “degrees” button, which I guess means all arcsines and so forth are supposed to be in radians. (I was taught that arcsin means the unrestricted inverse, which is not a function and gives all possible answers, but they’re using it here to mean the single-valued function with a restricted range.)
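For what it's worth, the test's convention matches what most software does. A minimal Python sketch of the difference: `math.asin` returns only the principal value, in radians, while the helper `all_arcsin_degrees` (a hypothetical name of my own) lists every angle in a chosen interval with the given sine.

```python
import math

def principal_arcsin_degrees(value):
    """Principal value of arcsin, converted from radians to degrees."""
    return math.degrees(math.asin(value))

def all_arcsin_degrees(value, lo=0.0, hi=360.0):
    """Every angle in [lo, hi) degrees whose sine equals value."""
    base = math.degrees(math.asin(value))
    solutions = set()
    # The two solution families are base + 360k and (180 - base) + 360k.
    for family in (base, 180.0 - base):
        k = math.floor((lo - family) / 360.0)
        angle = family + 360.0 * k
        while angle < hi:
            if angle >= lo:
                # Rounding hides floating-point noise and merges duplicates.
                solutions.add(round(angle, 9))
            angle += 360.0
    return sorted(solutions)

print(math.asin(0.5))                  # principal value, in radians
print(principal_arcsin_degrees(0.5))   # the same angle in degrees
print(all_arcsin_degrees(0.5))         # all solutions between 0 and 360 degrees
```

The restricted version gives one answer per input; the unrestricted reading gives two per full turn (or one, when the sine is 1 or -1).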


This (very easy) geometry problem requires a very nasty number of clicks (I ended up using 12) for something that can be done by hand in 10 seconds. With practice I could do it in 30 but my first attempts involved misclicks. Couldn’t the student just place the point and that would be enough? Why is the label step necessary? How many points are deducted if the student forgets to drag C somewhere semi-close to the point? How close is close enough?


Since this is a small experimental probability set, I just made sure there were 10 trials. I do not believe this is what the test makers intended.


Is my letter “D” close enough? I could easily see this being accepted by a human but the parameters of the computer-grader are uncertain.


This question is extremely vague. What is considered acceptable here? Does it have to just look slightly bell-curvy? Since there is no axis label one could just claim the y-axis maximum is very high and the graph is a normal distribution without clicking any squares at all.


First, note the undocumented feature that the arrows will “merge” to a point if they are at the same position. I was first confused by this problem because I had no idea how to draw it.

Also, notice how I’m having trouble here getting slopes of 1 and -1 if I attempt to make the graph look “correct” by spanning the entire axis.

The correct side to shade is indicated by a single dot, which is puzzling and potentially confusing.


Their logic here is if the digits repeat, it is a rational number. It took me several read-throughs to discover that the first number does, in fact, repeat. By their same logic if I wrote


it would be a repeating number, but of course it is e. The repeated digits should use a bar over them to reduce both the ambiguity and the scavenger-hunt-for-numbers quality of the problem as it stands.
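To make the ambiguity concrete, here is a small sketch of my own (not anything from the test). It builds the rational number whose digits really do repeat "1828" forever and compares it with e at two display widths.

```python
import math
from fractions import Fraction

# The rational number 2.7182818281828..., with "1828" repeating forever:
# 2.7 + 0.0182818281828... = 27/10 + 1828/99990
repeating = Fraction(27, 10) + Fraction(1828, 99990)

short_repeating = f"{float(repeating):.9f}"
short_e = f"{math.e:.9f}"
long_repeating = f"{float(repeating):.12f}"
long_e = f"{math.e:.12f}"

print(short_repeating, short_e)  # identical at 9 decimal places
print(long_repeating, long_e)    # they part ways a few digits later
```

At nine decimal places a rational number and an irrational one print identically, so a finite string of digits with no bar (or at least an explicit statement about the pattern) cannot settle which one is meant.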


I am fairly certain the Common Core intent is to only have linear inequality graphs, not absolute value:

Graph the solutions to a linear inequality in two variables as a half-plane (excluding the boundary in the case of a strict inequality), and graph the solution set to a system of linear inequalities in two variables as the intersection of the corresponding half-planes.

This standard could be stretched, perhaps

Understand that the graph of an equation in two variables is the set of all its solutions plotted in the coordinate plane, often forming a curve (which could be a line).

but the intent is for the overall conceptual understanding that graph = solutions, not permission to run wild with inequality graphs.


I am fairly certain this is computer-graded. I can think of many ways to phrase what I presume is the intended answer (“at the least either a or b has to be irrational”) but the statement is open enough other answers could work (“neither a nor b can be zero”).

It is possible I am in error on something, in which case I welcome corrections. (I don’t know Utah too well — it is possible Utah made some additions to the standards which null out a few of my objections.) Otherwise, I would plead with the companies working on these tests to please check them carefully for all the sorts of issues I am pointing out above.

11 Responses

  1. Great post! I agree with the interface issues on the SBAC. There are so many versions of calculators/number entering interfaces, that students struggle with how to enter them. The other major issue I saw in practice tests was that there were multiple screens inside of one with many scrolling bars. It was hard to see the questions on the page, and it was hard to manipulate the interface.

  2. […] Dyer posted an excellent critique of some of the exam items and interface choices made by the American Institute for Research and […]

  3. Jason, I’m in AZ too. Thank you for this analysis. #15 gave me a minute of much needed laughter. I believe, in #12 the formula is not directly applicable as it’s about multiple deposits. Each of them will have a corresponding individual formula with different numbers of years of accumulation. Regarding Utah, I don’t believe they added anything on their own. I’m pretty sure about infinite geometric series and linear programming.

    • Ah, I see now; it’s an annuity problem. Unfortunately the typical formulas given in finance books are still not the ones presented.

      I would consider the problem still not Common Core.

  4. I looked at the other grades. There is some room for improvement there too (alignment, presentation, etc.). As a side note, there are some clues that a Russian was heavily involved in creating these sets, though, of course, I can’t be sure.

  5. This test was the training test for SAGE, not the actual summative test students took. These questions were designed to reflect all the possible question and answer formats a student could encounter. The content of the questions was not supposed to reflect the core or be indicative of the content of the questions a student would see on their test.

    • I figured something had to be up — the infinite sum was too off for nobody to catch it. (The main page claims “These tests are aligned to the Utah Core Standards” so they probably need to reword a little.)

      I’m unclear why they would put up a not-Core-compliant test; if their test bank is well-organized enough (as it should be by that point) it should be easy to produce an actually-compliant-with-standards test that would give people a better idea of what’s going on. (If any actual AIR reps are reading this and want to chime in, feel free.)

      I am hoping the complaints about interface issues (I can’t be the only one making them, right?) will trickle down to the actual test.

    • Perhaps you had better supply one for evaluation by teachers and other interested parties. And please don’t go on about intellectual property rights.

  6. […] test dubbed AZMerit, which is similar to how Utah and Florida are implementing their state tests (colleague and fellow blogger Jason Dyer has a play-by-play of what those tests looks like). In February 2015  (this year), Arizona almost repealed the Common Core standards – instead, […]
