Highlights from ICER 2013

2014-06-28

{ cs education }
{ research }
{ cs2 }
{ conferences }
{ cs1 }
{ icer }
{ travel }

A few weeks back, I attended ICER 2013 at UC San Diego. Afterwards, I went up to San Francisco and had some adventures there (and at Yosemite), and then spent time in Vancouver seeing friends before coming home.

ICER this year was a solid conference, as always. I liked that this year things reverted to having two-minute roundtable discussions after every talk, before the Q&A. It makes for a much more engaging conference.

All hail the UCSD Sun God, who benevolently oversaw the conference.

My favourite talk this year was definitely Tom Park et al’s “Towards a taxonomy of errors in HTML/CSS”. HTML/CSS tends not to get studied in the CS ed crowd, and as Tom’s talk illustrated, that’s a big shame. HTML is for many people the first (and sometimes only) formal language that they encounter.

It turns out that many of the stumbling blocks people hit when learning programming languages also show up when they learn HTML: syntax errors, figuring out inconsistent syntax, learning that things need to be consistent – even just learning that what you see is not what you get.

In compsci we tend to overlook teaching HTML since it’s a markup language, not a programming language. But what we deal with in compsci is formal languages, and the simplest ones are the markup languages. Playing with a markup language is actually a much simpler place to start than giving novices a fully-fledged, Turing-complete language.

Other research talks of note:

Peter Hubwieser et al presented a paper on categorizing PCK in computer science that I liked; I’d love to see more work on PCK in CS, and look forward to seeing subsequent work using their framework.
Colleen Lewis et al performed a replication study looking at AP CS exams. I love replication studies, so I may be a bit biased towards it :) In the original paper, they found that the first AP CS exam’s scores were strongly predicted by only a handful of questions – and those questions were ones like:
int p = 3; int q = 8; p = q; q = p;
_what are the values of p and q?_
In Colleen’s paper, they found that the newer AP CS exams are much more balanced: things are not predicted by a small number of questions. Good to see!
Robert McCartney et al revisited the famous McCracken study that found that students can’t program after CS1, and found that students can, but that educators have unrealistic expectations for what students would know by the end of CS1.
Michelle Friend and Robert Cutler found that grade 8 students, when asked to write and evaluate search algorithms, favour a sentinel search (go every 100 spots, then every 10, then every 1, etc) over binary search.
Mike Hewner found that CS students pick their CS courses with really no knowledge of what they’ll be learning in the class. It’s one of those findings that’s kind of obvious in retrospect, but we educators really tend to assume our students know what they’re in for. Really, a student about to take a CS2 class doesn’t know what the “hash tables” or “graphs” coming up are. Students pick classes more around who the prof is, the class’ reputation, and time of day.
Finally, Michael Lee et al found that providing assessments improves how many levels people will play in an educational game that teaches programming. It’s a neat paper, but the finding is kind of predictable given the literature on feedback for students.

I enjoyed Michael’s ICER talk from two years ago much more. He found in the same educational game that the more the compiler was personified, the more people played the game. Having a compiler that gives emoticon-style facial expressions and uses first-person pronouns (“I think you missed a semicolon” vs. “Missing semicolon”) makes a dramatic difference in how much people engage with learning computing. That’s a fairly ground-breaking discovery and I highly recommend the paper.
The conference was, of course, not limited to research talks:
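To make the search from Friend and Cutler's talk concrete, here is a minimal Python sketch of one way to read "go every 100 spots, then every 10, then every 1", set next to a binary search. The function names and the exact stepping rule are assumptions made for illustration; this is not the students' code or anything taken from the paper.

```python
from bisect import bisect_left


def strided_search(items, target, strides=(100, 10, 1)):
    """Search a sorted list by jumping ahead in shrinking strides.

    Hypothetical reading of the students' approach: keep jumping by the
    current stride while the landing spot is still <= target, then
    switch to the next, smaller stride. Returns an index or -1.
    """
    position = 0
    for stride in strides:
        while (position + stride < len(items)
               and items[position + stride] <= target):
            position += stride
    if items and items[position] == target:
        return position
    return -1


def binary_search(items, target):
    """Standard binary search on a sorted list, built on bisect_left."""
    index = bisect_left(items, target)
    if index < len(items) and items[index] == target:
        return index
    return -1


items = list(range(0, 2000, 2))  # sorted even numbers 0..1998
print(strided_search(items, 1234), binary_search(items, 1234))  # both 617
```

The strided version is arguably closer to how a person looks something up in a phone book, which may be part of its appeal; binary search needs far fewer probes as the list grows.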
The doctoral consortium, as always, was a great experience. I got good feedback, about a dozen things to read, as well as awesome conference-buddies!
The lightning talks were neat. My favourite was Ed Knorr’s talk on teaching fourth-year databases, since third-year and fourth-year CS courses are so often overlooked in the CS ed community. I also liked Kathi Fisler’s talk on using outcome-based grading in CS.
The discussion talks were interesting! Elena Glassman talked about having students compare solution approaches in programming, which nicely complements the work that I presented this year.
The keynote also touched on the importance of comparing-and-contrasting. My takeaway from the keynote, however, was a teaching tip. Scott, the speaker, found that students learnt more from assignments if they were asked upon hand-in to grade themselves on the assignment. (Students would basically get a bonus if their self-assessment agreed with the TAs’ assessment.) It’s such a small thing to add onto the hand-in process, and adds so much to how much students get out of it. I’ll definitely have to try this next time I teach.
Overall, a great time, and I’m looking forward to ICER 2014 in Glasgow!

Comparing and contrasting algorithms is better than sequential presentation

2014-06-28

In five days, I’ll be heading to ICER 2013 in San Diego, where I’ll be presenting a paper “Comparing and Contrasting Different Algorithms Leads to Increased Student Learning” (pdf here).

The findings in a nutshell: if you present two different approaches to solving a compsci problem side-by-side and have students compare them, the students will understand the problem better than if you present those approaches sequentially. And importantly, the students will be better at transferring their understanding of the problem to similar problems.

Why is this notable? Because the sequential approach accounts for roughly 95% of how we teach algorithms and data structures! Just this past term when teaching CS2 I did things this way: a unit on binary trees, then a unit on BSTs, then a unit on heaps. Yet the evidence we have here is that it’s better to show variations in algorithms and data structures in parallel.

Background Knowledge

This study is a replication of two studies in math education – a study on algebra problems, and a follow-up study on estimation problems, both by Bethany Rittle-Johnson and Jon R. Star.

In the original algebra study, students were randomly assigned to one of two groups:

a control group where students were given workbooks in which they saw a worked example solving a problem using one approach and answered questions about it; then on the next page they saw a second worked example solving the problem using a different approach, and answered questions about it
an experimental group where the workbooks presented the two worked examples side-by-side and students worked on all those problems on the same page
The study has a pretest-intervention-posttest model, where the pretest and posttest are the same test. This allows the researchers to see how much students learnt from the intervention. The tests probed three things:
procedural knowledge – how to solve a problem
procedural flexibility – being able to solve a problem with multiple approaches; being able to generate, recognize and evaluate different approaches
conceptual knowledge
And what did they find?
the compare & contrast group did better than the sequential group with respect to procedural knowledge and procedural flexibility
the two groups did the same on conceptual knowledge
The conceptual knowledge thing was a bit of a surprise to them. So, they did another study! This time, they did a pretest-intervention-posttest-followup study. That’s the same as before, but with a second posttest given some time after the study, to see how much students retain. In this study, the math problems were about estimation.

What did they find?

Again, the compare and contrast group did better on the posttest and the follow-up with regard to procedural knowledge and procedural flexibility
But again, the two groups were the same on conceptual knowledge.
It bothered them some more that conceptual knowledge did not differ between the groups. Some similar studies in other fields would lead one to predict that comparing and contrasting would help conceptual knowledge too. So, they looked closer at their data:
For the students who started off with some conceptual knowledge, the compare & contrast condition led to more learning.
For the students who had no conceptual knowledge to begin with, it didn’t matter which condition they were given.
They speculated that you need to have some existing knowledge of a problem for comparing and contrasting to really shine.

Our study

We ran our study as a pretest-intervention-posttest-followup study, following the estimation study. In our study, CS2 students compared different techniques for hashing. We ran the study in three different sections of CSC 148 at UToronto, with 241 students participating.
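To give a feel for what "different techniques for hashing" can look like when placed side by side, here is a small Python sketch of two common collision-handling strategies. Choosing chaining and linear probing is an assumption made purely for illustration, as are the class names; the post does not say which techniques the study workbooks actually used.

```python
class ChainedTable:
    """Illustrative hash table: colliding keys share a bucket (chaining)."""

    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]

    def insert(self, key):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        if key not in bucket:
            bucket.append(key)

    def contains(self, key):
        return key in self.buckets[hash(key) % len(self.buckets)]


class ProbedTable:
    """Illustrative hash table: a colliding key moves to the next free slot.

    Keys must not be None, since None marks an empty slot here.
    """

    def __init__(self, size=8):
        self.slots = [None] * size

    def insert(self, key):
        size = len(self.slots)
        start = hash(key) % size
        for step in range(size):
            index = (start + step) % size
            if self.slots[index] is None or self.slots[index] == key:
                self.slots[index] = key
                return
        raise RuntimeError("table is full")

    def contains(self, key):
        size = len(self.slots)
        start = hash(key) % size
        for step in range(size):
            index = (start + step) % size
            if self.slots[index] is None:
                return False  # an empty slot ends the probe sequence
            if self.slots[index] == key:
                return True
        return False


# 3, 11 and 19 all land on index 3 in a table of size 8.
chained, probed = ChainedTable(), ProbedTable()
for key in (3, 11, 19):
    chained.insert(key)
    probed.insert(key)
print(chained.buckets[3])  # [3, 11, 19]
print(probed.slots[3:6])   # [3, 11, 19]
```

Seen together, the two classes differ only in what happens on a collision, which is the kind of contrast a side-by-side presentation is meant to surface.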

Not that surprisingly, students in the compare-and-contrast group performed better at procedural knowledge and procedural flexibility – and the two groups performed the same on conceptual knowledge.

But we found the opposite of Rittle-Johnson and Star when we looked closer at conceptual knowledge:

students who started with _no_ conceptual knowledge gained more from the compare-and-contrast condition than from the sequential condition
students who started with some conceptual knowledge performed the same in both conditions
What we think is going on here is that the compare-and-contrast approach lets students build a schema. By looking at what is different and similar, they can decipher what the important aspects of a problem are.

For the students who already have such a schema in place, when they see things sequentially, they can relate things back to their schema. They already know what the important aspects of a problem are, and so can compare on the fly.

The same goes for an expert like me, or any other instructor. When I look at a hash function, I already know the aspects of hashing that can be varied to solve problems differently. When I see something new presented, I can relate it back to what I already know.

Another difference comes with yardsticks. Experts work with yardsticks like big O notation and complexity classes, which allow us to scale up our knowledge very easily. For novices, it’s a lot easier to handle “mergesort is faster than selection sort” than “mergesort is O(n lg n)”.
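One way to show novices the concrete comparison before the abstract yardstick is to count the work each sort does on the same input. The following is a rough Python sketch under that assumption; the function names and the choice to count element comparisons are illustrative and were not part of the study.

```python
import random


def selection_sort(items):
    """Return (sorted copy, number of element comparisons made)."""
    a = list(items)
    comparisons = 0
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons


def merge_sort(items):
    """Return (sorted copy, number of element comparisons made)."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, left_count = merge_sort(a[:mid])
    right, right_count = merge_sort(a[mid:])
    comparisons = left_count + right_count
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two
    merged.extend(right[j:])  # still has elements left
    return merged, comparisons


data = random.Random(0).sample(range(1000), 64)
_, selection_count = selection_sort(data)
_, merge_count = merge_sort(data)
print("selection sort comparisons:", selection_count)  # 2016 for any 64 items
print("merge sort comparisons:", merge_count)
```

Selection sort always makes n(n-1)/2 comparisons, while merge sort stays under n lg n, so the gap widens quickly as the input grows. That gives "faster than" a concrete meaning before big O is introduced.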

For us experts, it makes sense to present information sequentially – because we can easily process things that way. For our students – the ones learning things completely afresh – that’s a lot harder.