AiGORA

Can I use AI to analyse students’ free-text responses?

For some reason, this semester a lot of students responded to free-text questions on my course evaluations (usually nobody replies). While I am grateful on the one hand because there is of course a lot of very useful information in there that I will be able to learn a lot from, I am also overwhelmed and cannot possibly spend hours to analyse all of it. Can I upload it to AI and ask for a summary?

Mirjam Glessmer · 18 Jun 2026

Responses from the team

2 perspectives from the community

  • Mirjam Glessmer

    There are several points I think are worth considering here. 

    First, those free-text answers are the students’ intellectual property and might contain sensitive, personal data. I would be very careful with uploading that type of text to an LLM (although LU’s Copilot license allegedly protects our data). If you run an AI model locally on your computer, that might not be a concern.

    Second, I would be very wary of the quality of the outputs. In my own research, we find that LLMs are not suitable for qualitative research for many reasons (Glessmer & Forsyth, 2025), and Nguyen & Welch (2025) come to the same conclusion using a very different approach. Jowsey et al. (2025), in the piece “We reject the use of generative artificial intelligence for reflexive qualitative research” signed by 419 qualitative researchers from 32 countries, write very directly that they reject GenAI in reflexive qualitative research because (1) GenAI is incapable of meaning-making; (2) qualitative research needs to be done by humans; and (3) GenAI is too harmful to the environment and to the human workers who need to filter out toxic content. Of course, all three articles focus on scholarly use of qualitative data, but their arguments are still relevant here.

    There is also more anecdotal evidence that nicely illustrates the problems with AI analysing qualitative data. It is directly relevant even to analysis that is done “just for a quick look” and does not need to be of research quality:

    • a very nice Bluesky post by Sasha Gusev with probably the shortest write-up of a study in the history of AI: “I assigned random gender/ethnicity labels to scientific abstracts from the literature and then asked Claude to do a thematic analysis. Claude identified a clinical versus computational split for female/male authors and a DEI focus for Black/URM authors. All in completely random data.” Full prompt and output available on GitHub.
    • “Real signals or artificial stereotypes? Adventures with a cultural Copilot” by Adam Kucharski on Substack: An artificial dataset (200 responses assigned the label “UK”, plus the same 200 responses assigned the label “US” in the same file), and the AI “finds” lots of cultural differences in expressiveness, language style, emotional framing, and cultural tone that are clearly not in the dataset. Another dataset about career aspirations in 5 countries (again, with identical data for each), and the output again reports lots of cultural differences. So if people haven’t gotten the message yet: Do not use LLMs for analysis of qualitative data!

    Lastly, I would think about what it says about the relationship between you and your students if they put in the effort to give you written feedback and you choose not to read it yourself, and instead read an artificially created summary. Maybe it is worth the effort to actually read it yourself?

    References:

    • Glessmer, M. S., & Forsyth, R. (2025). Superficially Plausible Outputs from a Black Box: Problematising GenAI Tools for Analysing Qualitative SoTL Data. Teaching and Learning Inquiry, 13, 1–9. https://doi.org/10.20343/teachlearninqu.13.4
    • Nguyen, D. C., & Welch, C. (2025). Generative Artificial Intelligence in Qualitative Data Analysis: Analyzing—Or Just Chatting?. Organizational Research Methods, 10944281251377154.
    • Jowsey, T., Braun, V., Clarke, V., Lupton, D., & Fine, M. (2025). We reject the use of generative artificial intelligence for reflexive qualitative research. Qualitative Inquiry, 10778004251401851.
  • Kirsty Dunnett

    There's not a lot to add to Mirjam's answer in terms of potentially using generative AI to analyse the responses. However, the question then arises: what sort of analysis do you need for your purposes? Clearly something more systematic than impressions after skimming through, but perhaps not something so labour intensive as inductive thematic coding. You will definitely need to read through all the comments yourself, but a deductive approach to coding (leading to a more quantitative overview) may save you time while avoiding the various problems of generative AI. After all, you were in the course too, have your own thoughts about how it went, know from experience some typical student reactions, and also know what may or may not be possible to change.

    The main thing to be careful with here is not to set yourself up to simply confirm your own impressions and thoughts about what to change.

    A starting list for deductive coding might contain items under the following headers:

    • new thing(s) tried: liked/disliked.
    • personal (dis)satisfactions shared by students (list out yours and see which ones students noted enough to comment on).
    • reassuring comments (e.g., old things liked; general compliments).
    • typical complaints (be specific: e.g., timing, workload too high).
    • interesting or unclassifiable comments (to come back to).
    • concrete suggestions (to come back to).

    I recommend assigning a maximum of 10-12 codes in any given read-through (this really speeds things up because you know almost exactly what each code means). If you have more codes than you can easily keep in mind at any one time, you can mark students' responses as completely coded so that you do not need to return to them in a second read-through. The only ones to examine in more detail are then those marked as 'interesting', 'unclassifiable' or 'concrete suggestions'.

    It may also be efficient to break long comments into their constituent parts, so that it is easier to keep track of where each code assignment comes from (this also reduces the need to reread parts that have already been coded).
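    If you keep your hand-assigned codes in a small script or spreadsheet, the quantitative overview comes almost for free. The following Python sketch is only one possible way to do the bookkeeping: the code names, the `Segment` structure, and the example comments are all made up for illustration, and every code is still assigned by you while reading, not by any software.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical code labels, loosely based on the starting list above.
CODES = {
    "new_liked", "new_disliked", "reassuring",
    "complaint_timing", "complaint_workload",
    "interesting", "suggestion",
}
FOLLOW_UP = {"interesting", "suggestion"}  # codes to come back to later
MAX_CODES_PER_PASS = 12                    # upper end of the 10-12 recommended


@dataclass
class Segment:
    response_id: int  # which student response this part came from
    text: str         # one constituent part of a (possibly longer) comment
    codes: set = field(default_factory=set)  # assigned by hand while reading


def tally(segments):
    """Count how many distinct responses received each code."""
    counts = Counter()
    seen = set()
    for seg in segments:
        unknown = seg.codes - CODES
        if unknown:
            raise ValueError(f"unknown codes: {sorted(unknown)}")
        for code in seg.codes:
            # Count a code once per response, even if several parts carry it.
            if (seg.response_id, code) not in seen:
                seen.add((seg.response_id, code))
                counts[code] += 1
    return counts


def needs_second_look(segments):
    """Return the segments flagged for closer reading later."""
    return [seg for seg in segments if seg.codes & FOLLOW_UP]


assert len(CODES) <= MAX_CODES_PER_PASS

# Invented example comments, already split into parts and coded by hand.
segments = [
    Segment(1, "The weekly quizzes helped me keep up.", {"new_liked"}),
    Segment(1, "The workload in week 5 was too high.", {"complaint_workload"}),
    Segment(2, "Too much work overall.", {"complaint_workload"}),
    Segment(3, "Could the lab reports be due on Mondays?", {"suggestion"}),
]

counts = tally(segments)
for code, n in counts.most_common():
    print(f"{code}: {n} response(s)")

for seg in needs_second_look(segments):
    print(f"come back to response {seg.response_id}: {seg.text}")
```

    Counting each code once per response (rather than once per part) keeps a single long comment from dominating the overview; if you would rather count every part, drop the `seen` check.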

    From a quick internet search, the following clear description of the different approaches to qualitative coding may be useful: https://limbd.org/the-big-3-qualitative-coding-approaches-inductive-deductive-and-abductive/
