Video Games and CS Education: Data analysis to understand how students think about game design

By Kristina Holsapple

Problem Description:

This data analysis explores how computer science students, specifically introductory students, think about video games and video game design. In designing application programming interfaces (APIs) for online tools, it is becoming more common for developers to consult their intended audience throughout the design process. When data regarding user preconceptions exists, developers can use this data to guide the design principles of an API. Although tools currently exist to help programmers develop video games, the broad problem motivating this analysis is that no prior data on student preconceptions of game design exists. Thus, current game development tools have not been designed with respect to how users understand and expect the tool to work. Analysis of this data has the potential to guide the future design of a student- and classroom-friendly library for students to learn computer science through game design.

Problem Background:

It is well known in Computer Science Education literature that games are a motivating context for students to learn computer science. For example, positive factors of game development in educational settings include enthusiastic response to instruction and appeal to different student demographics, according to "Teaching Computer Science through Game Design" by Mark Overmars (2004), and improved problem-solving skills, according to Mete Akcaoglu's "Learning problem-solving through making games at the game design and learning summer program" (2014).

Students use game development tools to develop games. However, users of Application Programming Interfaces (APIs) of programming tools can experience a challenging learning curve, according to Myers and Stylos (2016). Myers and Stylos discuss how human-centered design can improve API usability. User-centered design includes involving the intended users in the design process of a tool.

The most prominent example of user-centered design in Computer Science Education is the development of the programming language Quorum. Quorum's design was evidence-oriented, meaning that design choices were made with user perceptions of programming in mind. Quorum was designed to support computer science for all, a movement to make computer science more inclusive and reach more students with computer science education.

Python is a popular beginner programming language, and game development libraries do exist for Python. Pygame claims to be light-weight, simple, and easy-to-use, although there is no peer-reviewed evidence to support this claim. Arcade promotes its Python game development library as being easy-to-learn and ideal for people learning to program. Although this intention is supportive of the learning experience, Arcade's design lacks the evidence that is central to user-centered design.

When a tool is designed around its audience's understanding of it, the audience's learning and use of the tool should ideally feel more natural and offset the learning curve of both game development and API usage. Research that supports the learning experience of students is valuable because learning is valuable.


The data analyzed comes from two surveys given to introductory computer science students. In the fall semester of 2020, my co-PI and I designed and conducted our first survey (referred to later and in code as survey version 1/v1), asking CISC108: Introduction to Computer Science I students about their preconceptions of game design vocabulary and logic (see further details later in Research Questions). After reviewing how students understood and answered questions, we acknowledged shortcomings of our survey.

Based on the limitations of survey v1, we altered some question phrasing to establish our second survey (survey version 2/v2). Changes were minimal and meant to mitigate bias. Overall, the game vocabulary and logic we asked about in version 1 remained the topics of this second version. We launched this survey with CISC108 and CISC106: Introduction to Computer Science for Engineers students in the spring semester of 2021.

The survey data consists of two types of questions: free-response and multiple choice. Free-response questions (referred to later and in code as 'Open' questions) prompted students to answer questions with a text response. Multiple choice questions (referred to later and in code as 'Closed' questions) asked students questions with a list of 3-8 multiple choice options.

This data analysis comprehensively examines responses from versions 1 and 2 of our survey, with specific focus on differences in responses between survey versions as well as differences in responses based on students' self-reported prior experience with programming and game design.

Research Questions:

To design an evidence-oriented game development library for students, it is necessary to understand how students think about video games and game development.

The questions I answer with this dataset are:

RQ1. What vocabulary do novices use when thinking about video games?

Specifically, how do students think about terms commonly referred to as:

This question and these concepts were justified in a couple of ways. Anecdotally, as an introductory student who learned CS with game development, I struggled for weeks with the vocabulary and concept of sprites (interactive game graphics). More importantly, existing game library APIs such as Pygame and Arcade refer to many of these concepts with different terms. For multiple choice questions inquiring about these topics, the options were chosen from existing game library APIs. Without an existing consensus on how students perceive these concepts, this survey data can offer insight into which terms students most relate to.

RQ2. Is there a significant difference between survey results based on survey version (1 or 2) or participants' prior programming and game development experience?

This question is of importance because it indicates how we can treat the data for further analysis. We initially decided to conduct a second iteration of the survey due to the small sample size of version 1. If there is no significant difference based on survey version, we can pool data from both versions together for a larger sample size and more analysis. Questions that are significantly different based on survey version offer insight into survey change implications.

Analyzing results based on prior programming and game design experience offers insight into how students with different prior knowledge may interact with game design libraries. Students were sorted into four groups:

Although this research emphasizes novice preconceptions (the less experience the better), significant differences between these groups may indicate design challenges in catering to different users' needs. It is important to be transparent and forthcoming about such challenges throughout the design process.

RQ3. Do results indicate game development vocabulary that best aligns with how novices think about video games? If so, what vocabulary is ideal?

Based on answers to RQ2, it will be interesting to see if there are clear results in support of API design for a game design library. The goals of this analysis are to (1) understand how students think about game design before learning game design in a classroom setting and (2) use these results to guide the evidence-based design of a student-friendly game library.

We hope our results support the development of a game library with the same intentions as libraries such as Pygame and Arcade. These intentions include understanding, valuing, and supporting the learning experience of students as they navigate learning computer science, an already difficult journey. Our process is unique in seeking out evidence from the students we hope to help in order to include them in the design process. In this way, rather than only claiming to support learners once the library has been designed, our design process is comprehensively supportive of students.

Ethical Concerns:

Ethical considerations of this investigation include the delicate nature of student education. Student education is important and should not be unnecessarily interrupted or disturbed. Research regarding education runs the risk of disrupting students' education, and this should be acknowledged and accounted for. Additionally, video games in the media tend to target male audiences. A concern is that game development in education may favor male students if not presented appropriately.

Regarding data science specifically, ethical concerns also exist in analyzing this specific data. We acknowledge bias throughout our design, and want to acknowledge that this data is only the start of evidence-oriented game libraries. Placing too much emphasis on only the data offered here overvalues a predominantly male sample from a predominantly white institution such as the University of Delaware.

Additionally, although we hope to take a more scientific and evidence-based approach to game library development, not everything was justified by scientific decision-making. Specifically, the way we chose the concepts investigated in our survey questions does not have scientific justification. We chose concepts that anecdotally caused students problems, as well as concepts for which multiple terms exist in current game libraries, but the concepts we chose are not comprehensive. Choosing to focus on the concepts we did runs the risk of completely disregarding other concepts that students struggle with. This ethical concern can be mitigated in future stages of the design process. For example, once a prototype of the game library exists and we have students test the library, we can ask them broader questions to evaluate the challenges they faced using the library, during which concepts we have not yet collected data for may become apparent. For now, we can acknowledge this ethical concern, aware that our data is not exhaustive, and commit to monitoring this concern as we move forward with the data.

Helper functions and data

Throughout the analysis, I noticed I was repeating code. I defined these functions to help decompose and organize my analysis.
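As an illustration of the kind of helper that removes this repetition, here is a minimal sketch assuming the responses live in a pandas DataFrame. The function name `response_table`, the `Version` column, and the toy answers are hypothetical and are not the notebook's actual helpers.

```python
import pandas as pd

def response_table(df, question, group_col):
    """Cross-tabulate answers to one question against a grouping column.

    Rows are the groups (for example survey version), columns are the
    answers, and each cell counts participants in that group who gave
    that answer. Rows with a missing group or answer are ignored.
    """
    subset = df[[group_col, question]].dropna()
    return pd.crosstab(subset[group_col], subset[question])

# Toy data for illustration only.
toy = pd.DataFrame({
    "Version": ["v1", "v1", "v2", "v2", "v2"],
    "ClosedSprite": ["objects", "characters", "objects", None, "characters"],
})
table = response_table(toy, "ClosedSprite", "Version")
print(table)
```

The same table can then feed both a frequency plot and an independence test, which is why it is worth isolating in one function.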

Analysis process

My data comes from two surveys, which I will refer to as survey version one and two. Analysis will be done on each survey individually. Then testing will be conducted to determine when it is appropriate to combine survey results.

Data Cleaning/Transformation Survey v1

Load fall 2020 survey 1 data from CSV file. Drop data irrelevant to data analysis.
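A minimal sketch of this step, assuming pandas. The CSV text is inlined so the example runs on its own; the real file name is not shown in this write-up, and the column names (`Timestamp`, `Email`, `PriorGameDev`, `ClosedSprite`) are stand-ins.

```python
import io
import pandas as pd

# Stand-in for the fall 2020 export (hypothetical columns and rows).
raw_csv = io.StringIO(
    "Timestamp,Email,PriorGameDev,ClosedSprite\n"
    "2020-10-01,a@example.com,Yes,objects\n"
    "2020-10-02,b@example.com,No,characters\n"
)
survey_v1 = pd.read_csv(raw_csv)

# Drop columns that play no role in the analysis.
irrelevant = ["Timestamp", "Email"]
survey_v1 = survey_v1.drop(columns=irrelevant)
print(list(survey_v1.columns))
```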

Drop rows without survey responses. Filter prior game development experience to yes or no.
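One way this cleaning step could look, again assuming pandas. The rule used to collapse answers to yes or no (checking the leading word) is an assumption made for illustration, as are the column names and toy answers.

```python
import pandas as pd

survey_v1 = pd.DataFrame({
    "PriorGameDev": ["Yes", "no ", "Yes, a little in Scratch", None, "No"],
    "ClosedSprite": ["objects", "characters", "objects", None, None],
})
question_cols = ["ClosedSprite"]  # hypothetical list of question headers

# Drop participants who answered none of the survey questions.
answered = survey_v1.dropna(subset=question_cols, how="all")

def to_yes_no(answer):
    """Collapse an answer to 'Yes' or 'No' by its leading word (assumed rule)."""
    if pd.isna(answer):
        return answer  # leave missing answers missing
    text = str(answer).strip().lower()
    return "Yes" if text.startswith("yes") else "No"

answered = answered.assign(PriorGameDev=answered["PriorGameDev"].map(to_yes_no))
print(answered)
```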

Define list of headers for survey questions.

Data Distribution Survey v1

Make table observing distribution of experience levels.
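A sketch of such a table, assuming two yes/no columns for prior programming and prior game development experience. The column names and the toy rows are hypothetical.

```python
import pandas as pd

survey_v1 = pd.DataFrame({
    "PriorProgramming": ["Yes", "Yes", "No", "No", "Yes"],
    "PriorGameDev":     ["Yes", "No",  "No", "No", "No"],
})

# Count participants for each combination of the two experience questions.
experience_counts = pd.crosstab(
    survey_v1["PriorProgramming"],
    survey_v1["PriorGameDev"],
    margins=True,  # adds row and column totals labelled "All"
)
print(experience_counts)
```

Each interior cell corresponds to one experience group, and the margins give a quick check that every participant was counted once.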

Data Cleaning Spring 21 Survey v2

Load spring 2021 survey 2 data from CSV file and format file.

Remove columns from results not pertinent to data analysis.

Remove non-consenting responses and determine number of participants.

Filter prior experience question to True/False
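The consent filter and the True/False recoding might be sketched as follows with pandas. The `Consent` column, its answer strings, and the toy rows are assumptions, not the actual survey export.

```python
import pandas as pd

survey_v2 = pd.DataFrame({
    "Consent": ["I consent", "I do not consent", "I consent", "I consent"],
    "PriorGameDev": ["Yes", "Yes", "No", "Yes"],
})

# Keep only consenting participants, then count them.
consenting = survey_v2[survey_v2["Consent"] == "I consent"].copy()
n_participants = len(consenting)

# Recode the prior-experience answer as a boolean.
consenting["PriorGameDev"] = consenting["PriorGameDev"].eq("Yes")
print(n_participants, consenting["PriorGameDev"].tolist())
```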

Get gender demographics

Data Distribution Survey v2 and Comparison

Edit survey_headers to account for changes to survey.

Find participants per levels of experience and compare to v1.
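Because the two surveys have different numbers of participants, proportions are easier to compare than raw counts. A minimal sketch, assuming each participant already carries an experience-group label; the labels and toy values here are hypothetical.

```python
import pandas as pd

# Hypothetical experience-group labels per participant.
v1_groups = pd.Series(["both", "programming only", "neither", "both"])
v2_groups = pd.Series(["neither", "neither", "programming only", "neither"])

# Share of participants in each group, side by side per survey version.
comparison = pd.DataFrame({
    "v1": v1_groups.value_counts(normalize=True),
    "v2": v2_groups.value_counts(normalize=True),
}).fillna(0.0)  # a group absent from one survey has a share of 0
print(comparison)
```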

Compared to v1, the distribution of experience is different. Noticeably more participants have no prior experience.

Compare Survey v1 and v2 Distributions

Compare responses from surveys 1 and 2 with Chi-Square Independence Test.

Perform Chi-Square Independence Test for every shared survey question.

If significant difference based on survey version, investigate relationship between responses per survey version and prior experience.

Null hypothesis: Survey results are independent of the variable they are sliced upon (either survey version or experience level).

For p-values less than our significance level of 0.05, we reject the null hypothesis that the results are independent. For example, ClosedMovement's p-value based on survey version is 0.000001, so we reject the null hypothesis and conclude that responses are not independent of survey version.
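The test and decision rule described above can be sketched with `scipy.stats.chi2_contingency`, assuming pandas and SciPy are available. The question names `ClosedA` and `ClosedB`, the helper `independence_p_value`, and the toy data are hypothetical; note that SciPy applies Yates' continuity correction to 2x2 tables by default, which may differ from the settings used in the original analysis.

```python
import pandas as pd
from scipy.stats import chi2_contingency

ALPHA = 0.05  # significance level used in the text

def independence_p_value(df, question, group_col):
    """Return the chi-square independence p-value for question vs. group."""
    subset = df[[group_col, question]].dropna()
    observed = pd.crosstab(subset[group_col], subset[question])
    chi2, p_value, dof, expected = chi2_contingency(observed)
    return p_value

# Toy data: ClosedA differs sharply by version, ClosedB does not.
responses = pd.DataFrame({
    "Version": ["v1"] * 40 + ["v2"] * 40,
    "ClosedA": ["x"] * 35 + ["y"] * 5 + ["x"] * 5 + ["y"] * 35,
    "ClosedB": ["x"] * 20 + ["y"] * 20 + ["x"] * 20 + ["y"] * 20,
})
shared_questions = ["ClosedA", "ClosedB"]

p_values = {q: independence_p_value(responses, q, "Version")
            for q in shared_questions}
for q, p in p_values.items():
    verdict = "reject" if p < ALPHA else "fail to reject"
    print(f"{q}: p = {p:.4f} -> {verdict} the null hypothesis")
```

Passing an experience-group column instead of `Version` as `group_col` gives the second slicing described above.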

Pooling V1 and V2 when appropriate

For questions where we fail to reject the null hypothesis that responses are independent of survey version, we pooled v1 and v2 results together. When pooled, there were 194 responses. Maintain total responses from surveys for questions with no significant difference and unusable data ('OpenScreen', which did not measure the proper concept).
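A sketch of the pooling rule, assuming per-question p-values by survey version have already been computed. The p-values, question names, and rows below are made up for illustration.

```python
import pandas as pd

ALPHA = 0.05
v1 = pd.DataFrame({"ClosedA": ["x", "y"], "ClosedB": ["x", "x"]})
v2 = pd.DataFrame({"ClosedA": ["y", "y", "x"], "ClosedB": ["y", "y", "y"]})
p_by_version = {"ClosedA": 0.60, "ClosedB": 0.01}  # made-up p-values

# Pool only the questions whose responses do not differ by version.
poolable = [q for q, p in p_by_version.items() if p >= ALPHA]
pooled = pd.concat(
    [v1[poolable].assign(Version="v1"), v2[poolable].assign(Version="v2")],
    ignore_index=True,
)
print(poolable, len(pooled))
```

Keeping a `Version` column in the pooled frame makes it possible to split the data again if a later check calls for it.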

For each question:


No significant difference based on experience.

animation and action are close in frequency.


No significant difference based on experience.

repeat most frequent, iteration also fairly popular.


No significant difference based on experience.

'when' evidently most common.


Significant difference based on experience.

As initially noticed in V1, moment is more popular among participants without prior game development experience and not as popular among students with game development and programming experience. state is still most popular overall.


Significant difference based on experience. All results shown were entered at least two times. Counts account for plurals; for example, character represents responses of character or characters.
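The plural handling and the two-response threshold could be sketched as below using only the standard library. Stripping a trailing "s" is a deliberately naive rule chosen for illustration, and the raw responses are invented; the original analysis may have normalized terms differently.

```python
from collections import Counter

def normalize(term):
    """Lowercase, trim, and strip a simple trailing plural 's' (naive rule)."""
    word = term.strip().lower()
    if word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word

# Invented free-response answers.
raw = ["Character", "characters", "sprite", "Sprites ", "avatar", "boss"]

counts = Counter(normalize(r) for r in raw)
# Keep only terms that were entered at least two times.
frequent = {term: n for term, n in counts.items() if n >= 2}
print(frequent)
```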

Closed Sprite

Significant difference based on experience.

More participants without prior game development experience preferred characters. Objects most popular overall.

When V1 and V2 data cannot be pooled

Investigating responses with significant differences between V1 and V2.


rectangle more popular in V2, draw_rectangle more popular in V1.


Significant difference based on survey version. The option added in V2 makes the visualizations hard to compare across versions.

Because window was not offered as an option in V1, its absence affected the V1 results. V2 suggests window is the most popular option, with screen second.