Computational Sophistication of Games Programmed by Children: A Model for Its Measurement

In this episode I unpack Werner, Denner, Campe, and Torres’ (2020) article titled “Computational sophistication of games programmed by children: A model for its measurement,” which describes how the authors updated the game computational sophistication (GCS) model to account for computational learning evident within 39 games developed by pairs of middle school children.

Welcome back to another episode of the
CSK8 podcast my name is Jared O'Leary in
this week's episode I'm unpacking the
paper titled computational
sophistication of games programmed by
children a model for its measurement it
was written by Linda Werner Jill Denner
Shannon Campe and David M Torres as
always you can find a link to this paper in
the show notes as well as links to the
authors' Google Scholar profiles so you
can read more works by them simply visit
the show notes by clicking on the link
in the app that you're listening to this
on or by visiting jaredoleary.com and
there you will find the citation for
this particular article as well as many
other show notes that are relevant to
this particular episode alright so the
abstract for this paper reads as follows
quote this article builds on prior work
that aims to measure computational
learning CL during middle school since
game computational sophistication GCS
has been used as a proxy for a student's
engagement in CL we build on their model
to more completely describe the
relationship between different types of
building blocks of computer games and
GCS in doing so we present a single
quantitative measurement for GCS our
model called GCS 2.0 has face validity
for 39 games each programmed by a pair of
middle school children we choose four of
these games two with high GCS and two with
low GCS and discuss the computational
building blocks found in each game we do
this to help the reader better
understand our measurement of GCS and
its relationship to CL end quote okay so
if I were to summarize this article into
a single sentence I would describe it as
an article that unpacks how the authors
updated the game computational
sophistication model GCS to account for
computational learning evident within 39
games developed by pairs of middle
school children so the article itself is
organized around the question quote how
can we improve the game computational
sophistication model to make it easier
to use for assessing the computational
learning of middle school children
programming games end quote that question
is from page 12 : - and I modified it
slightly to actually just speak out the
acronyms so the authors begin the
article by citing Cooper et al's
suggestion quote
but for education at the k-12 level we
should be focusing on computational
learning since it emphasizes the central
role that a computer can play in
enhancing the learning process and
improving achievement of k-12 students
in STEM and other courses end quote that
quote is from page 12:1 however the
authors also suggest that one of the
problems for widespread adoption of CT
is that it is very difficult to measure
student learning when engaging in
computational thinking activities so
here's a quote from Chris Stephenson and
Chris is Google's head of computer
science education programs so this quote
kind of builds on the sentiment
that I just previously mentioned quote
I'm not sure yet that we understand
enough about how to measure
computational thinking knowledge and
learning even in our own discipline
space let alone across other disciplines
there are little pieces but there isn't
really anything comprehensive around
which we can construct a real framework
for assessment yet unquote that's from
pages 12:1 to 12:2 so one space
where middle school kids and high school
kids are engaging in computational
learning is through programming video
games which the authors cite as kind of
a medium for promoting computational
knowledge and skills in a way that is
often motivating for kids I'm sure we
can all think of the kids that we work
with that are highly motivated by video
games myself being one of them even
though I'm not a kid
although that's debatable sometimes
catch me in an amusement park with
rollercoasters and I will be like a
child running around from ride to ride
anyways so the review of literature in
this particular article cites other
articles related to developing CT
frameworks however many of the studies
that they cite did not actually focus on
programming video games specifically or
did so in a way that was not easily
scalable and transferable to other
programming languages so the GCS model
that they are using in this particular
study is quote based on the analysis of
olds using the Alice programming
environment end quote that's from page
12:3 and the authors provide a helpful
summary of how this model aligns with a
framework for how novice programmers
learn how to code and that's on page
12:3 which I recommend taking a look at
however the authors suggest that the
original GCS model while it may have
been helpful for some cases it was not
distilled down to a single assessment
that could be used to analyze student
work specifically using quantitative
measurements so this particular article
tries to remedy that so the study itself
looked at a 12 week long voluntary
after-school program where 39 pairs of
middle school kids across four different
schools spent about two hours per week
programming games in Alice which is a 3d
storytelling and game environment and
I'll include a link to it in the show
notes just in case you want to actually
try it out in your class or just for
funsies so to actually analyze the games
that were submitted by these 39 pairs
they had two undergrads analyze all 39
games quote for programming constructs
patterns and game mechanics based on the
original GCS model end quote from page
12:4 and if you're into
inter-rater reliability they were at
97.84 percent
and if you're not into that that
basically means that the two undergrads
who analyzed the games independently so
on their own when they came together and
compared their scores or labels that
they used on the code they came to the
same conclusion about 97.84
percent of the time which is
really high okay so the next main
section of this particular article kind
of unpacks what the 2.0 version of GCS
is so on page 12:5 there is a table
that lists the programming patterns used
during analysis of a project and what
level each pattern is considered from
simple to complex simple being a level
one for each of the levels you could get a score
from zero to the maximum level so level
one can have a score of zero or one
level two can have zero one or two and
a level three can have a score of zero
one two or three so for example writing
code that controls an object or camera
with the mouse or keyboard is considered
a level one pattern while programming a
vehicle to cause objects within the
vehicle to move in unison is a level two
skill so what that specifically means is
like if your sprite
jumps into a car
and then the car starts
driving away we want the person that's
inside the car to also go with it so
they move in unison otherwise it would
just be alright the car goes zooming off
then the sprite is still sitting there
floating in the air and an example of
a level 3 skill is creating a timer that
causes something to occur when the timer
hits a certain time or a threshold so if
you want to see all of the 21 patterns
check out page 12:5 and each one of
them includes the name of the pattern
what skill it is considered to be what
level and then a description of what
that pattern is ok now the second main
portion of this particular model on 12 :
mechanics such as like player
interaction or levels or puzzles or
racing or shooting or whatever now the
way the patterns and the mechanics work
together is they kind of come together
to create an overall score or rating for
a particular project so they do so by
adding up the game mechanics score with
the pattern score now the game mechanics
score is basically a formula that
accounts for the number of mechanics
used in the game so for example if you
have a game that has multiple levels has
player interaction and is a racing game
that's three different kinds of game
mechanics that are being used but if it
only has player interaction that would
be one now the pattern score is another
formula that accounts for the number and
overall quality of patterns used
within the game so for example if you
use nested if-else conditionals that
is one type of pattern and if you use
counters that's another type of pattern
so you'd have two and depending on how
well you use them you'd get a different
score for each one of those so the
authors explain how they came up with
the algorithms for the scores and then
provide some hypothetical examples of
how a game might score using the formula
whether it's a high quality game or a low
quality game in terms of level of
sophistication and the description does
get pretty granular so if you're really
interested in learning more about
exactly how this model works they spell
it out pretty good so next the authors
provide two case studies of a high
scoring game and then two case studies
of a low scoring game so I recommend you
read those if you want to see some
examples of what would a high scoring
game look like
in Alice and what would a low scoring game
look like so in the discussion section
of this paper the authors suggest that
the GCS model could be used with other
programming languages outside of Alice
alone so for example it could be applied
to Scratch or Ruby or Swift or
JavaScript
however they note that the specific
patterns and mechanics would need to
then be translated over into other
languages or platforms so if you're like
super excited about this GCS model and
you want to apply it into a language
other than Alice you're gonna have to do
some translating before you can actually
use it in your classroom on those other
platforms but if you look at the
appendices at the very bottom it's
several pages long and it spells out
each one of the patterns and mechanics
so it should be relatively easy for you
to kind of create your own version of
this the authors also note in the
discussion that they were looking at the
product created by two people rather
than what each person individually
learned and this is important because we
cannot make the assumption that each
person has the same understanding as
their partner when collaborating on a
project for example I'm sure we've all
worked with groups or seen groups where
one person knows significantly more than
another and they're able to work
together on a project now speaking of
working together one of the really
interesting findings related to the
partnering was that quote students who
reported higher levels of friendship
with their partners at the beginning of
the course had lower GCS scores at the
end no other significant correlations
were found end quote so that's on page 12 :
and I would love to see some follow up
on this why is it that in this
particular study students who were
friends or had higher levels of
friendship ended up scoring lower on
their overall product than students who
had lower friendships at the start
of this course I would love to see some
follow-up studies that explore this more
or even have the authors on this
particular podcast that way I can kind
of ask them what they think
hypothetically might have caused this so in
the conclusion section of this article
the authors suggest that this model can
be helpful for assessing computational
learning however they explicitly state
quote we do not suggest that the GCS
stand alone as an end of
course assessment of programming skills
end quote from page 12:15 instead what
they recommend is pairing this with
other assessment frameworks so for
example they recommend pairing it with
self evaluation now this is something
that I was really hoping that they
mentioned in the conclusion and I'm so
glad they did I strongly agree with this
you can't rely on one assessment or one
assessment type in any class that
you're working with you need to be
constantly engaging in formative
summative and ipsative assessments
throughout a class to kind of get a more
of a holistic approach to understanding
what your kids actually know and can do
and I could talk for hours about why
that is very important but I'm going to
instead kind of point to some resources
that unpack it a little bit more in the
show notes alright so as always I now
want to kind of share some of my
lingering questions or thoughts now
these questions and lingering thoughts
are not meant as a critique of the
article or the authors they're simply my
own ponderings or wonderment that came
out of reading the article itself so one
of the things that I was thinking of is
I'm curious how this model would differ
if the analysis was based on the process
rather than the product so in particular
if you looked at this from a
constructivist perspective the idea that
knowledge is socially constructed and
individually understood and therefore is
unknowable rather than a constructionist
perspective which is based off of Papert's
ideas that we've talked about in
this podcast before and is the idea that
like knowledge is constructed through
the act of creating whether that be like
creating a game like in this particular
article or maybe building a rocking
chair out of wood or even creating some
kind of a mental model or something all
those are like acts of constructionist
practices so in other words
what I'm wondering is what if we
analyzed the learning evident through
the social interaction and the overall
process rather than the end product
result how would that have changed this
model and the things that would have
been looked at or discussed or labeled
or assessed now again the authors do
note that this type of assessment should
not be like a standalone the only form
of assessment you engage in the
summative assessment so it could be
paired with more process based
assessment
but I'm wondering in particular about
the model itself how it would have
differed if it was looking at the code
throughout the process rather than code
as the end result so as an example for
another question that I was kind of
thinking of is how might the model
differ if it also accounted for code
that was initially included or attempted
but ultimately removed from the final
product so the authors do account for
what they call non operational code
which is like the code that's in the
game but it doesn't actually work or
wasn't finalized it's kind of like
off on the side not really doing
anything but I'm curious what about the
code that was deleted so I'm sure we've
all kind of like worked on stuff where
maybe you're creating something and you
start tinkering with something and then
you go you know what I don't like this
idea or I have another idea
and you just erase a portion of what you
had previously worked on what would
those things that were erased have
potentially shown in terms of student
understanding so another question that I
had is how might educators use this
model in combination with ipsative
assessments so ipsative assessments are
quote an assessment the student makes
against their own performance so that
they are measuring their personal
progression against their own previous
work end quote now this quote is from an
article by Savage and Fautley which I
will link to in the show notes so in
other words it's kind of like a
self-reflective assessment where you
kind of compare your own understanding
with your prior understandings so if you
were to take this GCS model and do it
multiple times and go okay the previous
time I used the GCS model on my project
I scored let's say a 10 but now I scored
a 12 why is that what about my learning
has changed how were these two projects
rated differently and what does that say
about my own understanding over time so
let's say a month ago I was rated at
that but then I incorporated X Y and Z
patterns or game mechanics and now I'm
rated at this higher score what does
that say about my own learning now one
of the things I like to also do with
ipsative assessment is to go one step
further and then kind of predict their
own future learning so when kids engage
in ipsative assessments in the classes
that I work with they would reflect on
what they learned in relation to their
prior learning but then I have them
think through okay what is it
you want to learn next what do you what
are you missing or what do you want to
expand upon or dive deeper into with
your next project that you're gonna
start because again we can always learn
more we can always dive deeper so even
if you get a 100% on this project what
next so the next question that I had for
this article was how might this model
account for variances in complexity
evident across the different patterns
and mechanics so for example the nested
if-else statements are listed as a level 1 pattern which can
receive a maximum score of 1 while
timers are listed as a level 3 pattern
which can receive a maximum score of 3
however I can think of several games
where the use of a timer is much simpler
than the nested conditionals in
particular you can dive several layers
deep in different branches of nested
conditionals whereas a timer might be
simply initializing a variable creating
a loop changing the variable by one and
then waiting a second that's it so as
another example a counter is listed as a
level 3 pattern which again can have a
maximum score of 3 while embedded
methods are listed as a level 1 pattern
and methods again are basically another
name for functions and in this case I
again can think of many examples where
keeping score by changing a variable can
be much simpler than sprites that have
several functions embedded within other
functions so my last question while
reading through this article was how
could a model like this account for
sophistication evident with the
efficiency of code so in the show notes I
include a link to a tweet where
multiple years ago I shared out how I
was working on this infinite drum set
and I took the snare code and I narrowed
it down from 64 lines of code to 9 lines
of code all while maintaining the exact
same functionality in other words my
code was much more efficient in terms of
the amount of overall space it was using
and its overall design so when
thinking of instances like that how
could this model also account for
efficiency in code because at the moment
it's just are you using these patterns
and are you using these mechanics not
necessarily how efficiently or
effectively are you using them and again
that's not a critique on the authors or
the article itself I do highly recommend
people
read through this and I would love to
hear how people are using this potential
model inside their classroom now with
all that being said it's just a quick
reminder that if you're interested in
checking out some of the resources I
mentioned like the assessment stuff or
trying out Alice just click on the show
notes for this particular episode and it
has several links in there if you
enjoyed this episode I hope you consider
sharing it with somebody else for
example do you know a middle school
computer science educator who'd be
really interested in learning more about
different assessment types if so share
this with them and they'll probably be
like thanks person
I appreciate you with that being said
thank you so much for listening to this
episode stay tuned next week for another
interview and then two weeks from now
another unpacking scholarship episode I
hope you all have a wonderful week
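
For readers who want to see the shape of the scoring described in the transcript, here is a minimal Python sketch. It only mirrors what is said in the episode: each pattern has a level of 1, 2, or 3, a pattern's score runs from zero up to its level, and a game's overall rating adds a pattern score to a game mechanics score. The two formulas below (a plain sum of pattern scores, and one point per distinct mechanic) are simplifying assumptions made for illustration and are not the formulas from the paper. The names `PatternRating`, `pattern_points`, `mechanics_points`, and `combined_points` are hypothetical, as are the example ratings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternRating:
    """One rated programming pattern (hypothetical structure)."""
    name: str
    level: int  # 1, 2, or 3, as in the pattern table described in the episode
    score: int  # from 0 up to and including the pattern's level

    def __post_init__(self):
        if self.level not in (1, 2, 3):
            raise ValueError(f"level must be 1, 2, or 3, got {self.level}")
        if not 0 <= self.score <= self.level:
            raise ValueError(
                f"score for a level {self.level} pattern must be 0..{self.level}"
            )


def pattern_points(ratings):
    """Assumption: the pattern score is a plain sum of per-pattern scores."""
    return sum(rating.score for rating in ratings)


def mechanics_points(mechanics):
    """Assumption: one point per distinct game mechanic used."""
    return len(set(mechanics))


def combined_points(ratings, mechanics):
    """Overall rating: pattern points plus mechanics points."""
    return pattern_points(ratings) + mechanics_points(mechanics)


# Made-up ratings for an imaginary game, for illustration only.
ratings = [
    PatternRating("mouse or keyboard control", level=1, score=1),
    PatternRating("vehicle moves its passengers", level=2, score=2),
    PatternRating("timer", level=3, score=1),
]
mechanics = ["levels", "player interaction", "racing"]

print(combined_points(ratings, mechanics))
```

Anyone adapting the model to another language or platform would swap in the paper's own formulas and its full list of patterns and mechanics in place of these stand-ins.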

Article

Werner, L., Denner, J., Campe, S., & Torres, D. M. (2020). Computational sophistication of games programmed by children: A model for its measurement. ACM Transactions on Computing Education, 20(2), 12:1-12:23.

Abstract

“This article builds on prior work that aims to measure computational learning (CL) during middle school. Since game computational sophistication (GCS) has been used as a proxy for a student’s engagement in CL we build on their model to more completely describe the relationship between different types of building blocks of computer games and GCS. In doing so, we present a single quantitative measurement for GCS. Our model, called GCS 2.0, has face validity for 39 games, each programmed by a pair of middle school children. We choose four of these games, two with high GCS and two with low GCS, and discuss the computational building blocks found in each game. We do this to help the reader better understand our measurement of GCS and its relationship to CL.”

Author Keywords

Computational thinking, student assessment, K-12 education, computational sophistication, computational learning, game programming, pair programming, middle school, assessment, and measurement

My One Sentence Summary

This article unpacks how the authors updated the game computational sophistication (GCS) model to account for computational learning evident within 39 games developed by pairs of middle school children.

Some Of My Lingering Questions/Thoughts

I'm curious how this model would have differed if the analysis was based on the process rather than the product.
How might the model differ if it also accounted for code that was initially included or attempted, but ultimately removed from the final product?
How might educators use this model in combination with ipsative assessments?
How might this model account for variances in complexity evident across the different patterns and mechanics?
How could a model like this account for sophistication evident with the efficiency of code?
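
To make the question about variances in complexity more concrete, here is a small Python sketch of the comparison I describe in the episode. The timer follows the steps mentioned there: initialize a variable, loop, change the variable by one, and wait a second. The nested conditional is a made-up example of branching several layers deep. Both functions, their names (`run_timer`, `choose_action`), and the game scenario are hypothetical and are not taken from the paper or from any of the student games.

```python
import time


def run_timer(threshold, on_threshold, sleep=time.sleep):
    """A simple timer: count up once per second, then trigger an event."""
    elapsed = 0                  # initialize a variable
    while elapsed < threshold:   # loop until the threshold is reached
        sleep(1)                 # wait a second
        elapsed += 1             # change the variable by one
    on_threshold()
    return elapsed


def choose_action(has_key, door_locked, enemy_nearby):
    """Nested if-else branches, several layers deep."""
    if enemy_nearby:
        if has_key:
            if door_locked:
                return "unlock door and escape"
            else:
                return "run through door"
        else:
            if door_locked:
                return "hide"
            else:
                return "run through door"
    else:
        if door_locked:
            if has_key:
                return "unlock door"
            else:
                return "search for key"
        else:
            return "walk through door"


# Usage: a no-op sleep stands in for the one second wait so this runs instantly.
events = []
run_timer(3, lambda: events.append("time is up"), sleep=lambda seconds: None)
print(events)
print(choose_action(has_key=False, door_locked=True, enemy_nearby=False))
```

In this sketch the timer is a handful of lines with a single loop, while the conditional has three levels of branching, which is the kind of mismatch between a pattern's listed level and its actual complexity that the question is getting at.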

Resources/Links Relevant to This Episode

Other podcast episodes that were mentioned or are relevant to this episode
- Assessment Considerations: A Simple Heuristic
  - In this episode I read and unpack my (2019) publication titled “Assessment Considerations: A Simple Heuristic,” which is intended to serve as a heuristic for creating or selecting an assessment.
- How to Get Started with Computer Science Education
  - In this episode I provide a framework for how districts and educators can get started with computer science education for free.
- Rethinking the Roles of Assessment in [Computer Science] Education
  - In this episode I unpack Scott’s (2012) publication titled “Rethinking the roles of assessment in music education,” which summarizes three roles of assessment (assessment of learning, assessment for learning, and assessment as learning) that I discuss in relation to computer science education.
- More episodes related to assessment
- More episodes related to computational thinking
- More episodes related to video games
- All other episodes
Alice, the 3D programming environment used in this study
Ipsative assessment resources
- Learn more about ipsative assessments (as well as summative and formative assessments) by reading this article or reading this resource that I created
- Savage, J., & Fautley, M. (2016). Assessment processes and digital technologies. In A. King & E. Himonides (Eds.), Music, Technology, and Education: Critical Perspectives (pp. 210–224). Abingdon: Routledge.
Here’s the tweet I mention
Find other CS educators and resources by using the #CSK8 hashtag on Twitter