My basic issue with cognitive load theory

The idea of “working memory” — well established since the 1950s — is that the maximum number of objects someone can hold in working memory is 7 plus or minus 2. There have been some revisions to the idea since (mainly that the size of the chunks matters; for instance, learners in languages that use fewer syllables for their numbers have an easier time memorizing number sequences).

This was extrapolated to educational theory in the 1980s via “cognitive load theory,” which holds that the learner’s working memory capacity should not be exceeded. This tends to be used to justify “direct instruction,” where the teacher lays out some example problems and the students repeat problems matching the examples; the theory is that by matching examples, students incur as little cognitive load as possible.

Cognitive load theory has some well-remarked problems with a lack of falsification and a lack of connection with modern brain science. These issues likely deserve their own posts.

My issue with cognitive load theory as applied to education is more basic: the contention that direct instruction requires less working memory than any discovery-based alternative. It certainly is asserted often

All problem-based searching makes heavy demands on working memory. Furthermore, that working memory load does not contribute to the accumulation of knowledge in long-term memory because while working memory is being used to search for problem solutions, it is not available and cannot be used to learn.

but the assertion does not match what I see in reality.

To illustrate, here’s a straightforward example — defining convex and concave polygons — done with three discovery-type lessons and direct instruction.

Discovery Lesson #1

Click on the image below to use an interactive application. Use what you learn to write a working definition of “convex” and “concave”.


Then draw one example each of a convex polygon and a concave polygon. Justify why your pictures are correct.

Discovery #2

The polygons on the left are convex; the polygons on the right are concave. Give a working definition for “convex” and “concave”.


Then draw one example each of a convex polygon and a concave polygon (not copying any of the figures above). Justify why your pictures are correct.

Discovery #3


The polygons on the left are convex; the polygons on the right are concave. Looking at the pictures, try to decide what the difference is between the two.

…after discussion…

A convex polygon is a polygon with all interior angles less than 180º.
A concave polygon is a polygon with at least one interior angle greater than 180º. The polygons on the left are convex; the polygons on the right are concave.

Draw one example each of a convex polygon and a concave polygon (not copying any of the figures above). Justify why your pictures are correct.

Direct Instruction

A convex polygon is a polygon with all interior angles less than 180º.
A concave polygon is a polygon with at least one interior angle greater than 180º. The polygons on the left are convex; the polygons on the right are concave.


Draw one example each of a convex polygon and a concave polygon (not copying any of the figures above). Justify why your pictures are correct.
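As an aside, the 180º definition in the lessons above translates directly into a mechanical test, which is one way of seeing it as a single compact idea. A minimal sketch (the function names are my own illustration, not part of any lesson):

```python
# Sketch of the 180º definition as a computable test: a polygon is convex
# exactly when every turn between consecutive edges goes the same direction
# (i.e. no interior angle exceeds 180º). Names are my own illustration.

def cross(o, a, b):
    """z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def is_convex(vertices):
    """True if the polygon (vertices listed in order) has no reflex angle."""
    n = len(vertices)
    signs = set()
    for i in range(n):
        c = cross(vertices[i], vertices[(i + 1) % n], vertices[(i + 2) % n])
        if c != 0:
            signs.add(c > 0)
    return len(signs) <= 1  # all turns one way -> convex

print(is_convex([(0, 0), (4, 0), (4, 4), (0, 4)]))   # True  (square)
print(is_convex([(0, 0), (4, 1), (8, 0), (4, 4)]))   # False (reflex at (4, 1))
```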


Parsing and understanding technical words creates a demand on memory. The hardcore cognitive load theorist would claim such a demand is less than that of having the student create their own definition, but is that really the case? The student using their own words can rely on more comfortable and less technical vocabulary than the one reading the technical definition. The technical definition is easy to misunderstand and the intuitive visualization is only clear to a student if they have the subsequent examples.

Discovery #1 does not appear to have heavy cognitive load. On the contrary, being able to immediately switch between “convex” and “concave” upon passing the 180º mark is much more tactile and intuitive than either of the other lessons. Parsing technical language creates more mental demands than simply moving a visual shape.

There might be a problem of a student in Discovery #1 or Discovery #2 coming up with an incorrect definition, but that’s why discovery is hard without a teacher present.

Discovery #3 is identical to the direct lesson except that the definition and examples swap places. Having a non-technical intuition built up before trying to parse the technical definition makes the definition easier to read; again the lesson appears to have less cognitive demand.

Overestimating and underestimating

One of the basic assumptions of cognitive load theorists seems to be that the mental demands of discovery are given all at once. In practice, the demands are spread out by some sort of scaffolding. For instance, in Discovery #3 the intuitive discussion of the pictures and the definition are NOT given at the same time. Only after students have settled on an idea of the difference between the shapes — essentially reducing it down to one mental object — is the definition given, which as I already pointed out is easier to read for a student who now has some context.

On the other hand, cognitive load theorists seem to underestimate the demands of direct instruction. While students tend not to parse definitions as exact, entire sentences (that would clearly fail the “only seven units” test), mathematical language is routinely dense and specific enough that breaking any supposed limit is quite easy. Taking in the direct instruction example above in one go would require a.) parsing and accepting the new term “convex,” b.) the same for “concave,” c.) recalling the definition of “polygon,” d.) the same for “interior angles,” e.) keeping in mind the visual of greater and less than 180º, f.) keeping track of “at least one” meaning 1, 2, 3, or more, and g.) parsing the connection between a–f and the examples given below.

There are obviously counters to some of these — the definitions, for instance, should be internalized to a degree that they are easy to grab from long-term memory — but the list doesn’t look that different from a “discovery” lesson, and doesn’t possess the advantage of reducing pressure on vocabulary and language.

The overall concern

In truth, working memory is well understood for memorizing digit sequences (called digit span), but the research gets fuzzy as tasks start to include images and sounds. Any declaration (including my own) that working memory is busted by a particular task, when the task involves mixed media, is essentially arbitrary.

On top of that, the brain is associative to such an extent that memory feats are possible which appear to violate these limits. For instance, there is a memory trick I used to perform for audiences where they would give me a list of 20 objects and I would repeat the list backwards. The trick works by pre-memorizing a list of 20 objects quite thoroughly — 1 for pencil, 2 for swan, say — and then associating the incoming list with those objects. If the first object given was “yo-yo” I would imagine a yo-yo hanging off a pencil. The trick is quite doable by anyone and — given the fluency of the retrieval — suggests that associations of images have a separate status whose capacity exceeds that of standard “working memory”. (This is also how the competitors of the World Memory Championship operate, allowing them feats like memorizing 300 random words in 5 minutes.)
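For the curious, the peg-list bookkeeping is mechanical enough to sketch in a few lines. The post only names “1 for pencil, 2 for swan”; the remaining peg words here are my own invention, and a real performer memorizes all 20:

```python
# Sketch of the peg-list trick described above. Only "pencil" and "swan"
# come from the post; the other peg words are my own invented examples.

pegs = {1: "pencil", 2: "swan", 3: "tree", 4: "sailboat", 5: "hook"}

def memorize(items):
    """Pair each incoming item with its pre-memorized peg object."""
    return {i: (pegs[i], item) for i, item in enumerate(items, start=1)}

def recall_backwards(associations):
    """Walk the peg list in reverse, retrieving each paired item."""
    return [associations[i][1] for i in sorted(associations, reverse=True)]

given = ["yo-yo", "umbrella", "lamp"]  # e.g. a yo-yo hanging off a pencil
print(recall_backwards(memorize(given)))  # ['lamp', 'umbrella', 'yo-yo']
```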

Students missing test questions due to computer interface issues

I’ve been writing a series looking at Common Core exams delivered by computer, searching for issues. Mathematical issues did crop up, but the more subtle and universal ones were about the interface.

Part 1: Observations on the PARCC sample Algebra I exam
Part 2: Observations on the PARCC sample Algebra II exam
Part 3: Observations on the PARCC sample Geometry exam
Part 4: Observations on the SAGE sample exam

While the above observations were from my experience with design and education, I haven’t had a chance to experience actual students trying the problems.

Now that I have, I want to focus on one problem in particular which is on the AIR samples for Arizona, Utah, and Florida. First, here is the blank version of the question:


Here is the intended correct answer:


Student Issue #1:


In this case, it appears a student didn’t follow the “Drag D to the grid to label this vertex” instruction.

However, at least one student did see the instruction but was baffled how to carry it out (the “D” is easy to miss the way it sits at the top of a large white space). Even granting that a student missed that particular instruction, is the failure to drag a letter really the reason you want students to miss the points?

Also, students who are used to labeling points do so directly next to the point; dragging a label over is an entirely different reflex. Even a student used to Geogebra would get this problem wrong, as points in Geogebra are labeled automatically. I do not know of any graphical interface other than this test that requires the user to add a label separately.

Student Issue #2:


Again, it appears possible the full directions were not read, but a fair number of students were unaware line connection was even possible, because they missed the existence of the “connect line” tool.

In problems where the primary activity was to create a line this was not an issue, but since the primary mathematical step here involves figuring out the correct place to add a point, students became blind to the line interface.

In truth I would prefer it if the lines were added automatically; clearly their presence is not what is really being tested here.

Student Issue #3:


This one’s in the department of “I wouldn’t have predicted it” problems, but it looks like the student just tried their best at making a parallelogram and felt it was fine to add another point as long as it was close to “C”. The freedom of being allowed to add extra points suggests this. If the quadrilateral were formed automatically with the addition of point “D” (as I already suggested) this problem would be avoided. Another possibility would be to have the “D” attached to the point as it gets dragged into place, and to disallow adding more than one point.

Induction simplified

When first teaching about the interior angles of a polygon, I had an elaborate lesson that involved students drawing quadrilaterals, pentagons, hexagons, and so on, measuring and collecting data, and finally making a theory. They’d then verify that theory by drawing triangles inside the polygons and realizing Interior Angles of a Triangle had returned.

I didn’t feel like students were convinced or satisfied, partly because the measurements were off enough due to error that there was a “whoosh, it is true” at the end, but mostly because the activity took so long the idea was lost. That is, even though they had scientifically investigated and rigorously verified something, they ended up taking it on faith because the path that led to the formula was a jumble.

I didn’t have as much time this year, so I threw this up as bellwork instead:


Nearly 80% of the students figured out the blanks with no instructions from me. They were even improvising the formulas. Their intuitions were set, they were off to the races, and it took 5 minutes.
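For reference, the formula the bellwork builds toward (triangulating an n-gon from one vertex into n - 2 triangles) is a one-liner:

```python
# The pattern the bellwork leads to: an n-gon triangulates into n - 2
# triangles from one vertex, so the interior angles total (n - 2) * 180.

def interior_angle_sum(n):
    """Sum of the interior angles of an n-sided polygon, in degrees."""
    return (n - 2) * 180

for n, name in [(3, "triangle"), (4, "quadrilateral"),
                (5, "pentagon"), (6, "hexagon")]:
    print(name, interior_angle_sum(n))
# triangle 180, quadrilateral 360, pentagon 540, hexagon 720
```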

Observations on the SAGE sample exam

Earlier this year I wrote a multi-part series on the PARCC test samples, picking at potential pitfalls and interface issues. This was done with the assumption this would be my state’s new test.

Then (allegedly) the price tag on the bid came in too high, and we went with the American Institutes for Research instead. They have a contract to administer the Smarter Balanced test (so for the part of the US doing that one, this should interest y’all) but the test we will be seeing is customized for Arizona, presumably out of their test banks. This is close to the situation in Utah, which has a sample of what they are calling the SAGE test. Since there is no Arizona sample yet I decided to try my hand at Utah’s.

I’d like to think my approach to PARCC was gently scolding, but there’s no way around it: this test is very bad. One friend’s comment after going through some problems: “I’m starting to think this is an undergrad psych experiment instead.”

Question #1 is straightforward. Question #2 is where the action starts to happen:


Adding a point on the line gets the r-value closer to 1, but with no information on the exact coordinate points (those are nowhere near the grid lines) or the original r-value, I believe this problem is impossible as written.
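The principle the question leans on is real, even if the problem is unanswerable as given. Here is a sketch with invented data, since the actual coordinates cannot be read off the plot:

```python
# The principle behind the question: a new point sitting on the trend line
# pulls r toward 1. The data here is invented for illustration; the
# question's actual coordinates are not readable from its plot.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs)
                    * sum((y - my) ** 2 for y in ys))
    return num / den

xs, ys = [1, 2, 3, 4], [1.0, 2.4, 2.6, 4.0]
r_before = pearson_r(xs, ys)
r_after = pearson_r(xs + [10], ys + [10.0])  # add a point on the y = x trend
print(round(r_before, 3), round(r_after, 3))  # the second is closer to 1
```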


Question #3 is fairly sedate, although they screwed up by neglecting to specify they wanted positive answers; (17, 19) and (-17, -19) both work, but the problem implies there is only one valid pair. I’d like to draw attention to the overkill of the interface, which includes pi and cube roots for some reason. There seem to be multiple “levels” to the numerical interface, with “digits and decimal point and negative sign” being the simplest, all the way up to “including arctan if for some reason you need that,” but without much rhyme or reason to which level a particular problem gets.

Case in point:


The percents in the problem imply the answer will also be delivered as x%, but there is absolutely no way to type a percent symbol on the answer line (typing % with the keyboard goes unrecognized). So something like 51% would need to be typed as .51. Fractions are also unrecognized.


Here’s the Common Core standard:

Derive the formula for the sum of a finite geometric series (when the common ratio is not 1), and use the formula to solve problems. For example, calculate mortgage payments.
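For context, the formula the standard wants derived, plus its mortgage application, looks like the sketch below. The loan figures are hypothetical, included only to illustrate the standard’s own example:

```python
# The formula the standard asks students to derive: with common ratio r != 1,
# the first n terms a, ar, ..., ar^(n-1) sum to a(1 - r^n)/(1 - r). The
# mortgage numbers below are hypothetical, matching the standard's example.

def geometric_sum(a, r, n):
    """Sum of the first n terms of a geometric series (r != 1)."""
    return a * (1 - r ** n) / (1 - r)

print(geometric_sum(1, 2, 10))  # 1 + 2 + 4 + ... + 512 = 1023.0

# The standard monthly-payment formula comes from the same derivation:
principal, monthly_rate, months = 100_000, 0.005, 360  # hypothetical loan
payment = principal * monthly_rate / (1 - (1 + monthly_rate) ** -months)
print(round(payment, 2))  # roughly 599.55
```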


I could linger on the bizarre conditional clause that frames this problem (why one would ever need a line that has a y-intercept greater than one, is given in table form, and is also perpendicular to some other particular line is beyond me), but instead I’ll point out the interface to the right, which is how all lines are drawn. (Just lines: there seems to be no way to draw parabolas and so forth as in the PARCC interface.) To add a line you click “Add Arrow” (not intuitive naming), then click a starting point and an ending point. Notice that the line does not “complete” itself but rather hangs as an odd fragment on the graph. Also, fixing mistakes requires clicking “delete” and then the line; except that if you click right on the line, the points do not disappear, so you have to repeat delete-then-click on each of the points to clear everything out.

Oh, and the super-tiny cursor button is what you click if you want to move something around rather than delete and add. There was not enough room to have a button called “Move”?


First off, “objective function” is not a Common Core vocabulary word and linear programming is not in Common Core besides, at least not as presented in this question.

Solve linear equations and inequalities in one variable, including equations with coefficients represented by letters.

Graph the solutions to a linear inequality in two variables as a half-plane (excluding the boundary in the case of a strict inequality), and graph the solution set to a system of linear inequalities in two variables as the intersection of the corresponding half-planes.

Besides that, the grammar is very sloppy. It should say “the objective function z = -3x + 4y” rather than lumping the function into a set of five statements where the student has to fish around and presume the intended function is the first line, because it is the only one that includes all three variables in function form.


I include this problem only to indicate how wide the swerves in difficulty of this test are. First linear programming, then a simple definition, and then…


I came up with sqrt(2) and 2, but notice how the number line only accepts “0 1 2 3 4 5 6 7 8 9 . -” in the input. There is no way to indicate a square root.

Fractions are also right out, so one possible answer that does work (0.25 and 0.5) is very hard to get to. (I confess I was stumped and had to be hinted by a friend.)


I drove my eyes crazy trying to get the right numbers on the axis to match up, especially on Survey 1, which is not even placed against the number line. I thought the PARCC snap-to-half-grid was bad, but this is a floating snap-to-half-grid, which makes it very unclear whether one has in fact aligned one’s graph with 6.7.


My average is going to be imaginary. The level of input that each problem allows is again quite erratic.

Incidentally, I found no “degrees” button, which I guess means all arcsines and so forth are supposed to be in radians. (I was taught that arcsin means the unrestricted inverse — that is, it is not a function and gives all possible answers — but they’re using it here to mean the function with restricted domain.)


This (very easy) geometry problem requires a nasty number of clicks (I ended up using 12) for something that can be done by hand in 10 seconds. With practice I could do it in 30 seconds, but my first attempts involved misclicks. Couldn’t the student just place the point and that would be enough? Why is the label step necessary? How many points are deducted if the student forgets to drag C somewhere semi-close to the point? How close is close enough?


Since this is a small experimental probability set, I just made sure there were 10 trials. I do not believe this is what the test makers intended.


Is my letter “D” close enough? I could easily see this being accepted by a human but the parameters of the computer-grader are uncertain.


This question is extremely vague. What is considered acceptable here? Does it have to just look slightly bell-curvy? Since there is no axis label one could just claim the y-axis maximum is very high and the graph is normal distribution without clicking any squares at all.


First, note that it is an undocumented feature that the arrows will “merge” into a point if they are placed at the same position. I was at first confused by this problem because I had no idea how to draw it.

Also, notice how I’m having trouble effecting a slope of 1 and -1 when I attempt to make the graph look “correct” by spanning the entire axis.

The correct side to shade is indicated by a single dot, which is puzzling and potentially confusing.


Their logic here is that if the digits repeat, the number is rational. It took me several read-throughs to discover that the first number does, in fact, repeat. By the same logic, if I wrote


it would be a repeating number, but of course it is e. The repeated digits should have a bar over them, to reduce both the ambiguity and the scavenger-hunt-for-numbers quality of the problem as it stands.
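The reason bar notation matters is that a repeating block pins down an exact fraction. A sketch of the underlying fact (the example decimals are my own, not the number from the test):

```python
# Why "the digits repeat" means rational: a repeating k-digit block is an
# exact fraction over 10^k - 1, shifted past any non-repeating head. The
# decimals below are my own examples, not the number from the test.
from fractions import Fraction

def repeating_to_fraction(non_repeating, repeating):
    """0.(non_repeating)(repeating)(repeating)... as an exact Fraction.
    Both arguments are digit strings: ('1', '6') -> 0.1666... = 1/6."""
    k, m = len(repeating), len(non_repeating)
    whole = int(non_repeating + repeating)
    head = int(non_repeating) if non_repeating else 0
    return Fraction(whole - head, (10 ** k - 1) * 10 ** m)

print(repeating_to_fraction("", "142857"))  # 0.142857142857... = 1/7
print(repeating_to_fraction("1", "6"))      # 0.1666...         = 1/6
```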


I am fairly certain the Common Core intent is to only have linear inequality graphs, not absolute value:

Graph the solutions to a linear inequality in two variables as a half-plane (excluding the boundary in the case of a strict inequality), and graph the solution set to a system of linear inequalities in two variables as the intersection of the corresponding half-planes.

This standard could be stretched, perhaps

Understand that the graph of an equation in two variables is the set of all its solutions plotted in the coordinate plane, often forming a curve (which could be a line).

but the intent is for the overall conceptual understanding that graph = solutions, not permission to run wild with inequality graphs.


I am fairly certain this is computer-graded. I can think of many ways to phrase what I presume is the intended answer (“at the least either a or b has to be irrational”) but the statement is open enough other answers could work (“neither a nor b can be zero”).

It is possible I am in error on something, in which case I welcome corrections. (I don’t know Utah too well — it is possible Utah made some additions to the standards which null out a few of my objections.) Otherwise, I would plead with the companies working on these tests to please check them carefully for all the sorts of issues I am pointing out above.

Telling left from right

I had a discussion last week when reviewing slope that went like this:

Student: Wait, how can you tell if the slope is positive or negative just by looking?

Me: Well, if you imagine traveling on the line from left to right, if you’re moving up the slope is positive and moving down the slope is negative.

Student: …What?

Me: (points) So, starting over here … (slides hand) … and traveling this way … this slope is moving up. Starting over here … (slides hand) … this slope is moving down.

Student: But I don’t understand where you start.

Me: You start on the left.

Student: I’m still confused.

Me: (delayed enlightenment) Wait … can you tell your right from your left?

Student: No.

This isn’t the picture that was up at the time, but it’s in the same genre.

Left-right confusion (LRC) affects a reasonably large chunk of the population (the lowest estimate I’ve heard is 15%) but is one of those things teachers might be blissfully unaware is a real thing. (Note that LRC is at something of a continuum and affects women more than men.)

My own mother (who was a math teacher) has this problem, and has to use her ring finger whenever she needs to tell her right from her left. She reports that thinking about the graph as “reading a book” lets her get the slope direction correct.

Teaching the strange phrasing of technical mathematics

I’ve given my first geometry test this year (yes, I’m back on geometry, high five to all my geo-buds) and this is the first test I’ve given with the PARCC in mind.

Specifically, I made the phrasing match the technical language of PARCC questions, and I had my first experience with what happens when the students encounter something truly alien to them, like:

The point R is at (0, 3) and the point S is at (16, 7). Draw the line segment \overline{RS}.

Find a point L on the line \overline{RS} such that \overline{RS} is four times as long as \overline{RL}.

The first line didn’t go so badly. In a way students can plow through without reading it (draw a line segment? OK, there must be points … there they are!) but the second line caused lots of bafflement.

It is a circumstance where vocabulary isn’t the issue, but phrasing is.

The issues seemed to be:
a.) Not being clear where the question was; in this case it was the directive to “find L”.
b.) Students who got past the first hurdle were unclear how to juggle the phrasing after; essentially the student brain seemed to go — first I find L, but to do that I need to worry about RS and RL, and somehow RS is — wait, what?
c.) Rather than being told what to do (find the midpoint between X and Y), they had to hold a conditional in their head and fuss with it a bit before they even understood they wanted a quarter-point to answer the problem.
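For the record, once the indirection is untangled the computation itself is a single step: L sits a quarter of the way from R to S.

```python
# The computation hiding under the phrasing: RS is four times RL, so L is
# a quarter of the way from R to S.

def point_along(p, q, t):
    """Point a fraction t of the way from p to q."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

R, S = (0, 3), (16, 7)
L = point_along(R, S, 1 / 4)
print(L)  # (4.0, 4.0)
```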

This sort of conditional indirection seems to be common in PARCC questions: rather than being told what needs to be accomplished, the student is given some geometric object to place, THEN told what will happen once that object is in place, THEN has to start unpiling what needs to be accomplished for those conditions to hold.

This sort of thing is routine in formal math texts but does not seem to be in the high school experience at all.


(From the PARCC sample Geometry test.)

Does anyone have experience teaching this sort of thing? How does one get students — both fluent and English language learners included — to read statements like the one above without blanking out?

Observations on the PARCC sample Geometry exam

Part 1: Observations on the PARCC sample Algebra I exam
Part 2: Observations on the PARCC sample Algebra II exam
Part 3: Observations on the PARCC sample Geometry exam

Calculator part: 18 of 25

Use the information provided in the animation to answer the questions about the geometric construction.

To pause the animation, select the animation window.

The students are supposed to watch a video of a construction and then say things about the proof enacted through the constructions. This is a very specific skill that needs to be practiced. Daniel Schneider kindly sent me a link to a website with a large number of construction animations (along with proofs) in case you need more to use in class.

However, there’s a serious interface problem. Here’s what the video looks like when paused, as well as a question to go with it:


Point “C” is completely covered. Whoops.

Non-calculator part: 6 of 7


This is one of those simple-looking questions with enough of a trick to it that I’m not sure how many students will get it right.

Part A requires students to work a double-completing-the-square manipulation, hopefully not getting sidetracked by the presence of b on the right hand side:

x^2 + y^2 - 4x + 2y = b
x^2 - 4x + y^2 + 2y = b
x^2 - 4x + 4 + y^2 + 2y + 1 = b + 4 + 1
(x-2)^2 + (y+1)^2 = b + 5

Part B requires noticing that a radius of 7 means the right hand side will be 49, so b + 5 = 49 and thus b = 44.
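A quick numeric spot-check of the algebra above, for anyone who wants to verify the manipulation:

```python
# Numeric spot-check of the completing-the-square work above:
# x^2 + y^2 - 4x + 2y = (x - 2)^2 + (y + 1)^2 - 5 for all x, y,
# and a radius of 7 forces b + 5 = 49, so b = 44.

def original(x, y):
    return x ** 2 + y ** 2 - 4 * x + 2 * y

def completed(x, y):
    return (x - 2) ** 2 + (y + 1) ** 2 - 5

for x, y in [(0, 0), (3, -1), (-2, 5), (0.5, 7.25)]:
    assert original(x, y) == completed(x, y)

b = 7 ** 2 - 5
print(b)  # 44
```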

In principle this problem is solvable, but given the lack of partial credit on a problem with a “trick,” I worry that a student who can normally complete the square would still get no points due to the indirection.

Calculator part: 13 of 25


This problem’s rough for three reasons:

a.) Even with the phrase “the pipe is open at both ends” in there, this is something of a background-knowledge problem; the students need to know that the “outer surface” excludes the circles on the top and bottom.

b.) There are volume formulas on the formula sheet but no surface area formulas. Thus the students need to have memorized \pi d h or be able to derive it, and know enough to exclude the circles.

c.) If \pi is set to 3.14, the answer comes out to 1356.48. If \pi is set to 3.14159 or something with more digits (not unusual, since graphing calculators have a “pi” button) the answer comes out to roughly 1357.168. Rounding to the nearest integer can thus give either 1356 or 1357 as an answer.
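The rounding ambiguity is easy to reproduce. The dimensions below are hypothetical, chosen only so that d times h is 432, which is consistent with both answers quoted:

```python
# Reproducing the rounding ambiguity: the outer surface of an open pipe is
# pi * d * h. The dimensions here are hypothetical, chosen only so that
# d * h = 432, consistent with both quoted answers.
import math

d, h = 12, 36  # hypothetical: any pair with d * h = 432 behaves the same

with_314 = 3.14 * d * h
with_full_pi = math.pi * d * h
print(round(with_314, 2), round(with_314))          # 1356.48 1356
print(round(with_full_pi, 3), round(with_full_pi))  # 1357.168 1357
```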

Non-calculator part: 7 of 7


Out of all the problems on the PARCC final exam for geometry, 28% are related to transformations.

I can understand a transformational emphasis in general: it leads to a function-transformation understanding of graphs (which is far more powerful and useful than looking at each kind of graph individually). However, why do so many of the dilation and rotation problems — 4 out of the 9 — involve centers not at the origin? This is not rhetorical; I really want to know where the utility is.

Non-calculator part: 20 of 25


This is one of the easier problems on the test, but it assumes background the students don’t necessarily have. I can guess what a “collar” means here (even though I’ve never heard the word used in this context), but my ELL students are more likely to interpret it as gibberish.

Calculator part: 10 of 25


This is very similar to the other problem in relying somewhat on background knowledge. Technically speaking, one can ignore all the external material about merchant vessels and probes and focus on the math, but the brain of the ELL student doesn’t have an easy time removing the context.

Also note the weirdness of the rounding; in problem 20 the rounding needed to be done to the nearest tenth, while in this problem the rounding needs to be done to the nearest integer off the list.

Calculator part: 3 of 25


I’m noting this one because nothing in my current textbook (Carnegie Learning, written for Common Core) has anything resembling this kind of problem. Anyone have a source with problems that are similar?

Calculator part: 23 of 25


I don’t think I’ve ever given this much emphasis to the vocabulary of proof. Getting my students to keep the reflexive, symmetric, and transitive properties of congruence straight is going to be a nightmare and a half.

Ok, one last problem, from HS Sample Math Items, 7 of 10 (so not the final exam, but the open response part):


Here the angle bisector video returns (complete with unhelpful play button covering the diagram when paused) but the student is supposed to free-write a proof.

Here is how you type the first line as given:

1.) Pick “geometry” on the side and pick the short line in the upper right; that’s a “line segment” and will give you a blank box under a line segment so you can type letters.

2.) Type the letters you want under the line segment. If you accidentally type more than two letters, the extra keypresses will be ignored.

3.) Go to “relations” above “geometry” and find the congruence symbol. Pick that. This will give the congruence symbol and a blank box.

4.) Pick the “line segment” and it will take the blank box that just appeared and put a line segment over it.

5.) Type the letters you need for the other line segment.

That’s one step of the proof; now you just need to give a reason and then do four more steps.

(What would constitute a valid reason here, by the way? The mathopenref site I linked to early in this post just states “They were both drawn with the same compass width” — would this be considered valid by graders?)