Generative artificial intelligence (GenAI), particularly tools built on large language models such as ChatGPT, is rapidly becoming part of students’ daily learning routines. These tools can generate explanations, summaries, and detailed responses on demand, but their effect on science learning remains complex and unsettled. A recent study of upper secondary students in Southwest China traces a movement from initial reliance on AI-generated content toward a more critical, reconstructed understanding of scientific concepts. Using a constructivist grounded theory design, the researchers interviewed 21 students aged 15 to 18 about how they navigate the epistemic challenges that AI poses in their science education.

The findings indicate that GenAI use in science learning is not a single, uniform behavior but a dynamic process shaped by task demands, students’ prior knowledge, time pressure, and prevailing teacher norms. Students often turn to GenAI in search of efficiency and at first place provisional trust in the fluent, authoritative tone of its responses. That trust is not static. Students test it through internal consistency checks against what they already know, external corroboration with established resources, and a growing awareness of GenAI’s limitations. Based on these judgments, they adapt their strategies: refining their prompts, selectively reconstructing the AI-generated content, or, when the output proves unreliable, abandoning the tool altogether. This interplay between trust and critical evaluation forms the core of what the study calls regulated epistemic engagement with AI in science learning.

The Evolving Landscape of AI in Education

The rapid proliferation of GenAI tools presents educational institutions worldwide with both opportunity and challenge. As these technologies become more sophisticated and accessible, integrating them into academic settings requires a deeper understanding of how students interact with them, particularly in disciplines like science that demand rigorous analytical thinking and evidence-based reasoning. While meta-analyses suggest that GenAI can bolster student learning and motivation under specific conditions, its effects are far from uniform, varying significantly across disciplines, tasks, and pedagogical approaches. The critical question for educators and researchers alike is not merely whether students are using these tools, but how they interpret, scrutinize, and ultimately act upon the information AI provides.

In science education, this issue is particularly consequential. Students are increasingly turning to GenAI for a spectrum of learning activities, from seeking straightforward information and clarifying complex theories to summarizing dense texts and even supporting the composition of written assignments. However, science learning transcends the mere reception of plausible answers. It fundamentally requires students to evaluate evidence, distinguish between assertion and explanation, align claims with established disciplinary reasoning, and critically assess the warrant for knowledge claims. The very features that make GenAI attractive – its speed, coherence, and authoritative voice – can also inadvertently lead learners to conflate plausibility with validity, posing an epistemic risk if uncritically embraced.

Gaps in Current Research

Despite a burgeoning field of research on GenAI in education, several critical limitations persist. A significant imbalance exists in the literature, with a disproportionate focus on higher education settings. Reviews consistently highlight university-level studies, while school-based learning, especially at the upper secondary level, remains comparatively underexplored. This is a crucial distinction, as upper secondary science education is shaped by unique epistemic and instructional contexts, including stringent curricular accountability, high-stakes assessments, and a close alignment with formal disciplinary norms. Findings from university settings cannot be directly transposed to these more constrained and accountability-driven environments.

Furthermore, a substantial portion of current research concentrates on the adoption, attitudes, and general usage patterns of GenAI. These studies are valuable for identifying what students see as useful (efficiency, clarification, and idea generation) and what worries them (accuracy and dependency), but they often say little about what students do after receiving an output. The pivotal question in science learning remains how students determine the trustworthiness of a specific AI-generated explanation, answer, or line of reasoning, and consequently whether to use, revise, or reject it.

Finally, the process dimension of student GenAI engagement is insufficiently theorized within school science education. Existing studies hint that reliance on GenAI is heterogeneous and task-dependent rather than uniform. Some school-based research indicates that GenAI can support science learning and self-regulated learning when pedagogically scaffolded, and other scholarship raises epistemic questions about AI-generated representations. What remains poorly understood is the precise mechanism by which upper secondary students move from initial exposure to an AI response to subsequent evaluative and behavioral decisions in relatively unstructured learning contexts.

This research gap can be more effectively addressed by integrating insights from three key areas: research on trust in artificial intelligence, which emphasizes appropriate reliance calibrated to system capabilities and context; self-regulated learning theory, which posits that productive tool use hinges on learner planning, monitoring, and reflection; and science education scholarship, which underscores scientific literacy as the critical evaluation of evidence, reasoning about expertise, and engaging with knowledge claims. Collectively, these perspectives advocate for understanding GenAI use in science learning as a dynamic process of epistemic judgment and regulatory action.

The Chinese Context: Educational Pressures and AI Adoption

The study’s setting in Southwest China provides a unique lens through which to examine GenAI adoption. Upper secondary education in China is intensely competitive, with the national college entrance examination (Gaokao) serving as a high-stakes gateway to higher education. This environment places immense pressure on students to master vast amounts of material efficiently and accurately. Consequently, tools that promise time savings and enhanced understanding are likely to be highly attractive.

The research design specifically incorporated two types of upper secondary schools: a provincial key high school, representing institutions with high academic rigor and resources, and a municipal non-key general high school, reflecting a broader range of academic demands and student profiles. This deliberate choice aims to capture variations in learning cultures, teacher expectations, and task stakes, all of which can significantly influence how students approach and utilize GenAI. The inclusion criterion of prior GenAI use for at least three months for science learning tasks ensures that participants have practical, experience-based insights rather than speculative opinions.

A Process-Oriented Investigation

To address the identified gaps, this study embarked on an in-depth qualitative investigation. Employing a constructivist grounded theory design, the research sought to develop a theoretical framework directly from the data, focusing on the unfolding process of student engagement with GenAI. Semi-structured interviews, conducted in Mandarin, allowed for rich, narrative accounts of students’ experiences. The interview guide was meticulously designed to elicit concrete, episode-based narratives rather than abstract attitudes, focusing on the sequence of initiation, evaluation, and subsequent actions related to GenAI outputs in science learning.

The analysis, conducted concurrently with data collection, involved iterative coding, constant comparison, and memo-writing to build a robust theoretical model. This process allowed for emergent categories to shape further data collection, ensuring the findings were deeply grounded in participant experiences. Trustworthiness was enhanced through multiple strategies, including peer debriefing, negative case analysis, and maintaining an audit trail.

Key Findings: From Provisional Trust to Critical Reconstruction

The study’s findings illuminate a prevalent pattern in how upper secondary students engage with GenAI in science learning, characterized by a movement from initial trust towards increasingly critical reconstruction. This process is not a rigid, linear progression but a dynamic interplay shaped by context and experience.

Initiation and Provisional Trust

Initial engagement with GenAI was frequently a matter of efficiency-driven instrumental adoption. Students, facing heavy workloads and tight deadlines, turned to GenAI as a means to save time and reduce effort. They perceived the tools as fast-response mechanisms for generating explanations, summaries, or problem-solving steps. As one student articulated, "There’s just too much physics homework every night. When I can’t finish the problems before the deadline, I just feed the question into DeepSeek, and it instantly gives me the full steps and answer. It’s so convenient, it saves me at least an hour every day." This instrumental use often preceded any deep evaluation of content accuracy.

This efficiency-driven approach was frequently coupled with initial trust under the halo of technical authority. The fluent, well-organized, and technically precise language of AI outputs often created an impression of reliability, particularly for students encountering unfamiliar concepts. One student’s account illustrates this heuristic trust: "When I read the response, it looks very professional, full of technical terms that I barely understand, and it’s written in a very confident way. My first reaction was to trust it, because it looks like it knows what it’s talking about." However, this initial acceptance depended on the task context. Students expressed greater trust for lower-stakes tasks like brainstorming or summarizing, while exhibiting marked caution for tasks requiring precise reasoning, experimental design, or exam-relevant accuracy.

Cognitive Evaluation and Epistemic Judgment

The transition from provisional trust to critical evaluation was marked by several key processes. Internal consistency checking based on prior knowledge served as the first line of defense. Students compared AI responses with what they had learned in class or from textbooks. Discrepancies immediately triggered doubt. "It explained photosynthesis, but it reversed the light-dependent and light-independent reactions, and mixed up the inputs and outputs. I could tell immediately it was wrong, because I memorized that entire process from the textbook backwards and forwards," reported one participant.

When internal checks were insufficient or tasks were high-stakes, students engaged in cross-validation through external information sources. Textbooks, reputable online resources, and ultimately teachers or peers became crucial arbiters of truth. "If it’s a concept I don’t know well, I can’t tell if it’s wrong just by reading it. So I’ll check the textbook first, to see what the official explanation is. If the AI’s answer matches the textbook, then I’ll trust it. If not, I’ll go with the textbook," explained one student. However, this verification process was often sacrificed under time pressure, where the immediate need to complete assignments superseded rigorous checking.

Over time, repeated encounters with AI errors fostered a metacognitive awareness of GenAI limitations. Students began to recognize that convincing fluency did not equate to accuracy. Experiences of receiving incorrect answers, inconsistent information, or poor grades after relying on AI led to a recalibration of trust. "I’ve seen it give three different answers to the same physics problem when I ask it three times in a row. That’s when I realized it doesn’t actually ‘understand’ the problem, it’s just generating plausible-sounding text," a student reflected.

Behavioral Decision-Making and Content Reconstruction

Following evaluation, students adopted various strategies. Iterative prompt refinement emerged as a key skill. Students learned to ask more specific and nuanced questions to elicit better-quality responses. "At first I just asked ‘Why is the sky blue?’ and the answer was really shallow. Later I learned to be specific: ‘Explain why the sky is blue using Rayleigh scattering, at high school physics level, with no university math, and step-by-step formula explanation.’ Then the answer was actually useful for my class."

More significantly, students engaged in content integration and reconstruction. This involved selecting useful elements from AI outputs, reorganizing them, paraphrasing the language, and integrating them with personal reasoning and evidence from other sources. One student vividly described this as: "The AI’s paragraphs are like bricks. I don’t just take the whole house the AI built and put my name on it. I pick out the bricks I need… Then I build my own beams and pillars: my original viewpoints, my own reasoning, the evidence from my experiments and the textbook. Then I put the whole house (the assignment) together myself." This process allowed students to retain agency and ensure the final work was their own.

In cases where outputs were deemed too unreliable or difficult to repair, students resorted to strategic abandonment and substitution. They would cease using GenAI for a particular task and revert to self-solving, seeking peer help, or consulting teachers. "I asked it to solve a hard calculus derivative problem for physics, and it gave two wrong answers with made-up reasoning. I realized it can’t handle this kind of multi-step creative problem-solving. So I closed it, worked it out myself step by step, and asked my teacher to check it. It was faster that way than fixing the AI’s mistakes."
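The sequence of evaluation and action described above can be summarized as a small decision rule. The Python sketch below is only an illustration of that sequence, not a model proposed by the study, which is qualitative and reports no algorithm or thresholds. The names Episode, Action, and decide, the field names, and the max_failures cutoff of two are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    ACCEPT_PROVISIONALLY = "accept provisionally"
    VERIFY_EXTERNALLY = "verify externally"
    REFINE_PROMPT = "refine prompt"
    RECONSTRUCT = "reconstruct"
    ABANDON = "abandon"


@dataclass
class Episode:
    """One interaction with a GenAI output (hypothetical fields)."""
    high_stakes: bool                       # e.g. exam-relevant or precise reasoning
    conflicts_with_prior_knowledge: bool    # internal consistency check failed
    external_check: Optional[bool] = None   # None = not yet checked against a textbook/teacher
    failed_attempts: int = 0                # earlier unusable answers on this task


def decide(ep: Episode, max_failures: int = 2) -> Action:
    """Illustrative decision rule; max_failures is an assumed cutoff."""
    # Repeated unusable outputs: stop and switch to self-solving, peers, or teachers.
    if ep.failed_attempts >= max_failures:
        return Action.ABANDON
    # A clash with prior knowledge or an external source triggers doubt and a better prompt.
    if ep.conflicts_with_prior_knowledge or ep.external_check is False:
        return Action.REFINE_PROMPT
    # Nothing checked yet: the stakes decide whether to corroborate first.
    if ep.external_check is None:
        return Action.VERIFY_EXTERNALLY if ep.high_stakes else Action.ACCEPT_PROVISIONALLY
    # Output survived both checks: select, reorganize, and re-author it.
    return Action.RECONSTRUCT


print(decide(Episode(high_stakes=True, conflicts_with_prior_knowledge=False)).value)
```

A real student’s judgment is far less tidy than this rule; in particular, the interviews suggest that time pressure often short-circuits the verification branch, which the sketch does not model.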

Implications for Science Education and Pedagogy

The study’s findings carry significant implications for how science education should adapt in the age of AI. The central message is that GenAI use is best understood as a form of regulated epistemic engagement. This reframing shifts the focus from mere adoption to the active, ongoing process of negotiation, evaluation, and transformation of AI-generated content.

Trust in GenAI, on these students’ accounts, is not a stable trait but a task-contingent calibration. Students adjust their reliance based on the specific scientific demands of a task and the perceived stakes. This highlights the importance of disciplinary context in shaping AI interaction. The findings also suggest that prior knowledge is a critical determinant of a student’s ability to evaluate AI outputs: students with stronger foundational knowledge are better placed to identify subtle inaccuracies and to resist superficially convincing but flawed information.

The concept of reconstruction emerges as a particularly valuable educational outcome. When students actively select, reorganize, and re-author AI-generated material, they are not merely consuming information but actively constructing understanding. This process requires a deeper level of engagement than simple verification or paraphrasing and is crucial for ensuring that AI serves as a tool for learning rather than a shortcut.

For educators, these insights suggest several pedagogical shifts. Firstly, simply mandating "responsible AI use" is insufficient. Classrooms need to integrate verification as a routine part of scientific inquiry, perhaps through tasks that require students to document their checking processes. Secondly, explicit instruction in reconstruction strategies may be necessary to guide students in transforming AI outputs into academically sound and personally meaningful work. Finally, teacher norms play a crucial role. Instead of outright bans, establishing clear, bounded guidelines that distinguish between supportive AI functions and those that bypass critical thinking can foster more productive engagement.

Future Directions and Broader Impact

While this study provides valuable insights, its scope is limited to a specific cultural and educational context. Future research should extend to other geographical regions and educational systems to test how far these findings generalize. Triangulating interview data with behavioral traces, such as prompt histories or assignment drafts, would offer a more comprehensive picture of AI interaction. Furthermore, developing methods to measure task-contingent trust calibration and reconstruction strategies would make larger-scale studies possible.

As GenAI continues to evolve and embed itself more deeply into educational ecosystems, understanding the nuanced ways students engage with these tools is paramount. This research suggests that fostering critical reconstruction, rather than merely controlling usage, is key to harnessing the potential of AI for meaningful science learning. By navigating the journey from initial trust to critical reconstruction, students can develop the essential scientific literacy skills needed to thrive in an increasingly AI-mediated world.
