Most research on critical thinking (CT) in project-based learning (PBL) has focused on whether CT improves following instruction, typically through composite or dimension-specific scores. Less is known about how learners experience individual CT subskills during task engagement. A recent qualitative study examines this question, exploring how secondary English as a Foreign Language (EFL) students articulate their use of CT subskills during demanding intercultural project-based tasks. The findings suggest that these subskills are not uniformly experienced or activated, highlighting the importance of understanding the conditions that shape their effective deployment.

Introduction: The Growing Imperative of Critical Thinking in Globalized Education

In an increasingly interconnected world, the ability to think critically is paramount for learners navigating diverse perspectives, evaluating information from myriad sources, and adapting reasoning across linguistic and cultural boundaries. This is particularly true in secondary EFL contexts, where students often prepare for international study. Critical thinking serves as a vital cognitive toolkit, enabling them to process unfamiliar communicative situations, assess the appropriateness of linguistic choices, and re-evaluate assumptions when engaging with culturally diverse audiences. When EFL tasks require the creation of materials for international audiences, the cognitive demands intensify, compelling learners to interpret audience backgrounds, evaluate potential misunderstandings, and adjust their explanatory strategies to bridge cultural gaps. In many Sino-foreign cooperative secondary schools in China, students regularly participate in English-medium project tasks that simulate communication with international peers, making the pedagogical integration and study of critical thinking in these settings a significant concern.
Project-based learning has emerged as a prominent pedagogical approach for fostering CT development. Characterized by extended inquiry around authentic problems, collaborative decision-making, and the creation of tangible outcomes, PBL is believed to create fertile ground for students to engage in active problem-solving, assume agency in planning their work, and present outcomes to audiences beyond the teacher. Research has documented cognitive benefits associated with PBL in EFL and related contexts, including improvements in problem-solving capacity, metacognitive awareness, and collaborative reasoning, but much of this evidence stems from quantitative designs. These studies typically employ standardized tests or self-report measures to assess whether students' CT improves following PBL interventions. While they establish that PBL can support CT development when measured as overall test scores or general cognitive gains, they offer limited insight into the actual deployment of specific CT subskills during project work, or into whether certain subskills become more prominent depending on task characteristics. This outcome-focused orientation often leaves the underlying cognitive processes opaque.

To address this gap, a qualitative study focused on the learner experience of CT subskills. The research drew upon Facione's influential framework, which identifies six core CT subskills: interpretation, analysis, inference, evaluation, explanation, and self-regulation. While this framework has been widely operationalized in assessment instruments, its application in EFL research has predominantly focused on measuring CT as composite scores or dimension-specific subscales, aiming to determine whether overall ability or individual dimension scores increase after instruction. This approach often treats the six subskills as coequal and separable components.
However, there is a relative scarcity of process-oriented evidence detailing how these subskills are mobilized, sequenced, or constrained in authentic EFL tasks, or whether some subskills function as foundational moves while others emerge conditionally based on task demands and perceived stakes.

Unpacking Critical Thinking: Facione's Framework and its Application

Facione's (1990) framework, derived from the American Philosophical Association's Delphi project, offers a structured understanding of critical thinking. It identifies six core subskills:

- Interpretation: Comprehending and expressing the meaning of diverse experiences, data, phenomena, events, conventions, criteria, roles, norms, or practices.
- Analysis: Identifying the intended or actual inferential relationships among statements, questions, concepts, descriptions, or other forms of representation intended to express beliefs, judgments, experiences, reasons, information, or principles.
- Inference: Identifying and securing elements needed to draw reasonable conclusions; forming conjectures and hypotheses; considering relevant information and deducing the consequences of decisions, actions, beliefs, or principles.
- Evaluation: Assessing the credibility of statements or other representations of personal conviction, and assessing the logical strength of inferential relationships among statements, descriptions, questions, or other forms of representation.
- Explanation: Stating and presenting the results of one's reasoning; justifying that reasoning in terms of evidence, conceptual considerations, methodological considerations, criteria, and context; and presenting one's argumentation in the form of credible, coherent, and persuasive explanations.
- Self-regulation: Self-consciously monitoring one's cognitive activities, the elements used in those activities, and the results educed, particularly by applying skills in analysis and evaluation to one's own inferential judgments with a view toward questioning, confirming, validating, or correcting one's reasoning or results.

In EFL research, this framework has often served as a measurement structure. Studies such as those by Indah et al. (2022) and Algouzi et al. (2023) have reported dimension-specific scores or total CT scores, focusing on pre-post changes following interventions. While valuable for quantifying CT attainment, these approaches often overlook the dynamic interplay and subjective experience of these subskills during authentic task engagement. The question of whether learners perceive these subskills as equally salient, differently prioritized, or unevenly coordinated within complex tasks remains largely underexplored. This is especially pertinent in EFL contexts, where learners must navigate higher-order thinking through an additional linguistic medium, potentially influencing which subskills become more prominent at different task junctures.

Project-Based Learning and the Pursuit of Critical Thinking: From Outcomes to Processes

PBL's inherent characteristics (sustained inquiry, collaborative decision-making, and audience-oriented work) make it a natural fit for fostering critical thinking. Meta-analytic evidence offers partial support for this association, with some studies reporting positive effects on thinking skills, including specific aspects of critical thinking such as analysis and evaluation. For instance, a synthesis of 66 studies by Zhang and Ma (2023) reported a moderate positive effect of PBL on thinking skills overall. Similarly, Song et al. (2025b) found significant improvements in analyzing and evaluating skills following an online PBL intervention with Chinese college EFL learners.
However, a recurring limitation across many of these studies is their reliance on aggregate CT scores. This approach reveals whether PBL improves CT as a whole but cannot elucidate which subskills are most responsive to project work or how they are activated across task phases. Emerging qualitative work is beginning to offer more differentiated accounts. Studies by Song et al. (2025a) and Wu (2011) have begun to explore the mechanisms through which PBL supports CT and the metacognitive strategies employed by learners, providing early evidence that CT-related cognitive engagement can vary across task phases.

This gap becomes particularly pronounced in intercultural project tasks. Tasks requiring students to explain cultural practices to unfamiliar audiences demand audience-specific interpretation, selective content decisions shaped by anticipated misunderstandings, and evaluative judgments about cultural appropriateness. CT and intercultural communicative competence (ICC) share overlapping cognitive skills, including analysis, interpretation, inference, and evaluation. Yet empirical studies often measure these as composite constructs rather than tracing the mobilization of individual subskills across task phases. Understanding how subskills are experienced and coordinated under such conditions necessitates approaches that move beyond aggregate scores and attend to learners' firsthand accounts of their task engagement.

Methodology: A Deep Dive into Learner Narratives

This qualitative study adopted an exploratory, process-oriented design to investigate how secondary EFL learners deploy critical thinking subskills during intercultural project-based tasks. The research focused on understanding students' subjective sense-making processes as they articulated their reasoning, identified salient cognitive moves, and navigated decisions when interpreting and organizing cultural information for international audiences.
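Since Facione's six subskills also anchor the kind of deductive coding frame used in studies like this one, a toy sketch may help make that idea concrete. The snippet below is purely illustrative: the indicator phrases are invented for this example and are not the study's instrument, and real qualitative coding (for instance in NVivo) relies on human judgment rather than keyword matching. A first automated pass like this could at most flag candidate segments for a human coder to review.

```python
"""Illustrative sketch only: a keyword-based first pass that pre-tags
interview segments with Facione's six CT subskills before manual coding.
All indicator phrases are invented for this example."""

# Deductive coding frame: each subskill mapped to hypothetical cues
# that might surface in translated transcript segments.
CODING_FRAME = {
    "interpretation": ["figure out why", "understand the audience", "what they struggle"],
    "analysis": ["most essential", "what to include", "how to structure"],
    "inference": ["would not be interested", "might not understand", "anticipated"],
    "evaluation": ["is this appropriate", "checking against the rubric", "culturally inappropriate"],
    "explanation": ["added context", "explained why", "plainer speech"],
    "self-regulation": ["revised the tone", "checked my work", "more or less okay"],
}

def pre_tag(segment: str) -> list[str]:
    """Return the subskill codes whose cues appear in a segment.
    Segments with no match are left for purely inductive coding."""
    text = segment.lower()
    return [code for code, cues in CODING_FRAME.items()
            if any(cue in text for cue in cues)]

segment = "I kept thinking foreign students would not be interested, so I cut it."
print(pre_tag(segment))  # → ['inference']
```

The design mirrors the study's hybrid logic in miniature: the frame supplies the deductive pass, while untagged segments remain open to inductive interpretation.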
Research Context and Participants

The study involved twelve Grade 12 students from a Sino-foreign cooperative secondary school in China. These students had completed a series of five intercultural project-based tasks during their final academic year. All participants had a minimum of ten years of EFL study and were enrolled in an international curriculum track designed to prepare them for overseas university study, with English serving as a primary medium of instruction. To ensure the unhindered articulation of complex reasoning, interviews were conducted in Chinese. Participants were recruited through purposeful sampling, selecting those who demonstrated a capacity to articulate their reasoning processes during previous post-task reflections. This strategy may have favored more articulate students, and the findings should be interpreted with this in mind.

Project Tasks

The students had engaged in five distinct types of project-based tasks, all requiring the creation of materials for international audiences:

- Podcast: Recording audio episodes introducing Chinese cultural topics (e.g., ancient inventions, regional cuisine, learning style differences).
- Idea Talk: Creating presentations explaining Chinese secondary school routines (e.g., morning reading, evening self-study, dormitory systems) to prospective international students.
- Infographics: Designing visual comparisons of cultural practices (e.g., first-aid procedures) in accessible poster formats.
- Interactive Map: Developing digital orientation materials with embedded audio introductions to campus landmarks and cultural sites.
- Role-play: Simulating high-stakes scenarios involving communication with international dormitory supervisors during medical incidents, requiring judgments about culturally appropriate actions and institutional protocols.

Data Collection

Semi-structured interviews were conducted one week after task completion.
This interval allowed for reflective distance while preserving accessible memories of decision-making processes. Interviews, lasting between 30 and 40 minutes, were audio-recorded and transcribed verbatim. The interview protocol encouraged students to recall a particularly memorable project task and describe their thinking processes. Questions probed their understanding of task requirements and audience characteristics, the cognitive skills they employed, how these skills aided their work, and which skills they felt were less necessary. Follow-up prompts invited detailed reconstruction of specific decision-making moments, informed by Facione's CT framework but remaining open-ended to allow for student-led narratives.

Data Analysis

A hybrid thematic analysis was employed, combining deductive coding based on Facione's six CT subskills with inductive attention to emergent patterns. Transcripts were analyzed to identify segments where students discussed their thinking. An initial coding frame based on Facione's subskills (interpretation, analysis, inference, evaluation, explanation, self-regulation) was applied using NVivo 15. Segments related to audience understanding, content prioritization, anticipatory reasoning, and appropriateness judgments were coded accordingly. Inductively identified patterns, such as how students specified their target audience or reasoned in high-stakes scenarios, were also coded and integrated into the evolving frame. The coding process involved iterative refinement, with particular attention paid to ambiguities, especially between inference and evaluation. Comparisons across task types and students explored whether certain subskills consistently emerged as foundational moves or as later-stage refinements.

Trustworthiness and Ethical Considerations

Trustworthiness was ensured through thick description, within-case and cross-case comparisons, and collaborative coding discussions. Rich contextual description supported transferability.
An analytic log and iterative memo-writing documented interpretive development for dependability, while reflexive memos fostered vigilance against selective interpretation for confirmability. The study received ethical approval from the university's institutional ethics committee. All participants provided informed consent, understood that participation was voluntary, and had their identities protected through pseudonymization.

Findings: The Uneven Landscape of Critical Thinking Engagement

The study revealed distinct patterns in how students described the salience and sequencing of critical thinking subskills across their project work, organized into three functional clusters.

Foundational Moves: Interpreting the Task and Analyzing Content

Across student accounts, interpreting and analyzing were consistently described as the initial and most foundational cognitive moves. These two processes were closely intertwined, with interpretation informing analytic decisions about content prioritization and organization. Students commonly began by identifying potential points of cultural mismatch with their intended audience. They actively scanned materials to pinpoint areas where international students might encounter confusion. For instance, a student working on an "Idea Talk" project recalled reading teacher-provided articles to "figure out why they come and what they struggle to adapt to" (S01). Similarly, a student creating an "Interactive Map" focused on reading to "figure out where they usually get stuck" (S03). This interpretive work aimed to locate anticipated gaps in audience knowledge, moving beyond simply comprehending the task brief.

In role-play scenarios, this interpretive work extended to recognizing differences in institutional and procedural frameworks. A student simulating a medical emergency abroad understood that she was operating within a foreign system: "I first knew this was abroad, in their dorm, under their rules" (S02).
This recognition guided her communication, shifting focus from merely describing the problem to explaining adherence to the host institution's protocols. Students also drew on specific intercultural contrasts, such as differing norms around photography or unique first-aid practices, to sharpen their sense of what might require contextualization for an international audience. This early interpretive work significantly shaped not only content choices but also the communicative stance. As one student explained, "once you understand the other party's situation, you know what to say and how to say it" (S01). Without this understanding, she noted, "you might easily sound like you are lecturing them." This process facilitated a shift from an instructional register to one that acknowledged difference without imposing judgment. Accounts where this foundation was less developed often revealed greater uncertainty in later decision-making.

Following interpretation, students moved into analytic work: deciding what content to include or exclude and how to structure their output. This filtering process was particularly prominent in content-rich tasks. An infographic creator described the challenge of "dig[ging] out the most essential stuff that could be compared" (S08). Another student framed this analysis in audience-centered terms, asking what "visiting students most want to know and what's most useful" (S06). Analysis also extended to sequencing, with a role-play participant organizing her communication deliberately: "I'd say the location first, then the symptoms, then what we'd already done" (S02), characterizing this as "thinking through how to say it without getting confused."

While interpretation and analysis were often described in close succession, the boundary was not always sharp. Some students characterized early work as "analyzing how foreign students might think about us" (S04), blending audience interpretation with evaluative comparison.
In tasks with more abstractly defined audiences, interpretation could merge with analysis. These variations suggest that the sequencing of these foundational processes was influenced by audience specificity and the nature of provided materials.

Double-Edged Processes: Inferring and Evaluating Across Task Contexts

While interpretation and analysis were relatively stable across tasks, inference and evaluation occupied less predictable ground. Students' accounts portrayed these subskills as capable of either strengthening or weakening their communicative choices, depending on the specific conditions of their application.

Productive Inference and Evaluation

In their productive forms, inference and evaluation enabled students to anticipate audience confusion and engage in culturally reflective judgment. Students who framed their inferences as questions or possibilities used them to pre-empt misunderstandings and add context. One student inferred that international students would be unfamiliar with "morning reading" and planned additional explanation accordingly (S01). Another anticipated that a foreign dorm supervisor might not understand the practice of informing parents first and prepared her response (S02). This form of inference acted as anticipatory perspective-taking, preparing for gaps in shared knowledge.

Evaluation took on a substantive character when students perceived real consequences. In high-stakes role-plays, students moved beyond simple correctness to weigh cultural appropriateness. One student shifted from believing "with first aid, as long as it's fast, it's fine" to recognizing the need to "consider the other person's cultural habits" (S05). Another articulated the evaluative process as "quickly weighing in your mind: is what I'm doing appropriate? Will this be offensive or culturally inappropriate?" (S10).
This substantive evaluation aligned with Byram's concept of critical cultural awareness, enabling students to evaluate practices against explicit criteria, and echoed the emphasis on openness and respect in Deardorff's ICC model.

Constraining Inference and Evaluation

A different pattern emerged when inference was applied with high confidence but limited grounding. Several students described making strong predictions about audience interests and then using those predictions to delete content. One student reflected that she had "overused inference," removing material about evening self-study because she "kept thinking to myself that foreign students definitely would not be interested" (S01). Teacher feedback later indicated that this content effectively conveyed cultural differences. Another student nearly reduced the school's history to a single photograph, reasoning that "foreigners would definitely find it boring" (S09). This student later reflected: "I almost overlooked our school's core cultural heritage because of my own 'taken-for-granted' assumption. This inference thing, when used well, it's considerate, but when overused, it's like looking at people through tinted glasses." The repeated use of "definitely" suggested that inferences were treated as certainties rather than tentative guesses, often without external challenge.

Under lower-stakes conditions, evaluation often became procedural. Students focused on checking against the teacher's rubric rather than engaging deeply with cultural content or audience perception. One student noted her evaluation "was basically just checking against the teacher's rubric" (S01). Another acknowledged her group's evaluation amounted to "going through the motions," with members agreeing work was "more or less okay" without deeper scrutiny (S08). This was attributed to time pressure and the perception that "this isn't for real foreigners to see, it's just an assignment."
The distinction between productive and constraining forms of inference and evaluation hinged not on their presence but on how they were applied and under what conditions. Productive use involved framing inferences as questions and engaging in evaluation when perceived consequences were tangible. Constraining use involved treating inferences as facts and reducing evaluation to a procedural check under lower-stakes conditions or when rubrics focused solely on technical features. The availability of feedback played a crucial role, allowing students to recognize the gap between their predictions and audience interests.

Late-Stage Refinements: Explaining and Self-Regulating Under Constraint

Explanation and self-regulation were commonly described as appearing toward the end of the task process, functioning as refinements of already developed material. Both were also noted as vulnerable to practical constraints such as time pressure and fatigue. Explanation involved making culturally embedded practices intelligible to an audience presumed to lack shared background knowledge. A student creating an interactive map explained the rationale behind "class meetings" as "for building class spirit," concerned that the audience "might think it was the teacher criticizing students" (S03). Another student explained why she contacted the dorm's emergency desk rather than dialing a general emergency number, adding "this is the procedure for international students" to preempt the impression of delayed assistance (S02). Explanation also involved adjusting language, moving from formal or potentially arrogant phrasing to plainer speech to ensure audience comprehension (S07, S04). In a few instances, explanation prompted students to deepen their own understanding, such as discovering a regional dish's origins as an intercultural product (S11).

Self-regulation appeared less frequently and was typically described as a final checking stage.
The most common form involved revising tone to soften directness or reduce imperative language, such as changing "you must" to "you can" (S03) or replacing comparative judgments with neutral phrasing like "learning styles are somewhat different" (S04). However, students also described self-regulation as constrained by exhaustion, leading to perfunctory checks where work was approved as "more or less okay" (S07, S09). Many acknowledged "cut[ting] corners" at this stage (S09).

The accounts of explanation and self-regulation highlighted their dependence on prior stages. When interpretation and analysis provided a clear foundation, explanation could focus on contextualization. Conversely, if earlier stages were rushed, explanation became more uncertain and self-regulation more perfunctory. The thinness of descriptions for these later stages, compared to earlier processes, indicated their perceived lesser centrality and their susceptibility to practical limitations.

Discussion: Re-evaluating Critical Thinking in Intercultural Contexts

The study's findings challenge the notion of critical thinking subskills as equally salient or simultaneously activated components. Instead, learners described a loosely ordered sequence in which interpretation and analysis served as foundational, consistent moves, while inference and evaluation exhibited variability contingent on task conditions. Explanation and self-regulation emerged as later-stage refinements, often compressed by practical constraints.

This experiential perspective complements existing quantitative findings that suggest differential responsiveness of subskills to instruction. Tola Chala et al. (2025) reported larger gains for analysis and interpretation than for evaluation in Ethiopian EFL student-teachers, while Song et al. (2025b) noted significant improvements in analyzing skills but marginal changes in evaluating skills among Chinese college EFL learners.
These studies, while measuring outcomes rather than processes, align with the present study's observation of uneven prominence.

The study's exploration of inference and evaluation as "double-edged processes" is particularly significant. In their productive forms, these subskills facilitated anticipatory perspective-taking and culturally reflective judgment, aligning with the goals of intercultural communicative competence. However, when applied with unchecked assumptions or under low-stakes conditions, they could narrow content and produce premature closure. This suggests that the relationship between CT and ICC is more conditional than often assumed, with the cognitive processes of CT potentially working against intercultural goals if not carefully guided.

The identified conditions shaping productive versus constraining CT engagement (perceived task stakes, rubric design, and feedback availability) offer practical implications for educators. The fact that perceived consequences, rather than actual audience presence, seemed to activate more substantive evaluation highlights the importance of designing tasks where the implications of cultural misjudgment feel tangible. Furthermore, assessment rubrics that explicitly include cultural appropriateness as a weighted criterion, alongside technical features, could encourage deeper evaluation. Structured feedback loops, whether through peer review or engagement with external audiences, are crucial for interrupting the self-confirming nature of unchecked inference and fostering revisable thinking.

Implications for Practice and Future Research

The findings offer several design principles for integrating CT and ICC in secondary EFL classrooms:

- Explicit Audience Specificity: Clearly defining target audiences from the outset can strengthen interpretive foundations and mitigate premature inference.
- Feedback Loops: Incorporating peer or external feedback mechanisms can challenge assumptions and foster more substantive evaluation. Simulated audience responses or structured peer review can serve this purpose.
- Illustrative Case Examples: Providing concrete examples of intercultural misunderstandings can make the stakes of inference and evaluation more tangible, particularly in lower-risk tasks.
- Redesigned Assessment Rubrics: Rubrics should explicitly include cultural appropriateness and audience responsiveness as weighted criteria, signaling their importance beyond technical proficiency.
- Guided Inference and Evaluation: Scaffolding these processes through guided questions can help students articulate and examine their assumptions, especially at points vulnerable to compression.

Future research could extend these findings through longitudinal and multi-site designs to assess the stability of observed patterns across diverse contexts. Incorporating real-time data collection methods, such as think-aloud protocols and classroom observations, alongside artifact analysis, would provide richer triangulation of perceived cognitive processes and actual work quality. Experimental variations of task stakes, rubric design, and feedback timing could further illuminate causal relationships.

Conclusion: Towards a Nuanced Understanding of Critical Thinking in Action

This study provides a task-based perspective on critical thinking in intercultural EFL learning, revealing that learners experience its subskills not as uniformly salient but as unevenly distributed and conditionally activated across task phases. Interpretation and analysis form a foundational base, while inference and evaluation exhibit variability dependent on factors like perceived stakes and feedback. Explanation and self-regulation emerge as late-stage refinements vulnerable to practical constraints.
By understanding these nuances, educators can design more effective project-based learning experiences that foster deeper and more culturally reflexive critical thinking.