This article is a companion paper that expands on published research and provides accessible insights into the work by learning scientists from the University of Massachusetts Amherst’s College of Education and Manning College of Information & Computer Sciences (Gattupalli et al., 2025), presented at the IEEE International Conference on Advanced Learning Technologies (ICALT 2025).
What many considered unlikely—AI tutoring that runs entirely offline, in ordinary school labs—now functions in practice. I have pursued this question because contemporary educational technology often conflates intelligence with connectivity, even as large populations in the global South and underserved regions of the global North experience intermittent power, limited bandwidth, and persistent cost barriers. During the pandemic, UNESCO reported striking gaps in home internet and device access; those figures crystallized a simple design mandate for me: if pedagogy depends on infrastructure that learners do not have, the pedagogy fails them. The header image—a globe held in human hands—signals the inversion I sought: technology should be made to reach the learner, not the other way around.
Scope and contribution
This field report summarizes a school-based deployment of on-device, open-source AI tutoring in rural Telangana, India, designed under the SHIELD framework—Secure Handling of Information in Educational Devices. The study shows that a locally hosted model can scaffold introductory programming without sending data to external servers, while protecting privacy and keeping teachers central to instructional decision-making. The work accompanies a peer-reviewed paper presented at ICALT 2025. This project was possible only through the collaboration of the school’s headmaster, teaching staff, and—above all—the students who participated; their commitment shaped the research from design through reflection.
Why offline AI?
I began from a consequential equity problem. If educational AI presumes persistent connectivity, it systematically excludes schools where infrastructure is fragile or costly. Cloud services remain valuable, yet they encode a dependency model—recurring fees, vendor control, and data externalization—that many public schools cannot sustain. I asked whether a carefully constrained, curriculum-aligned tutor could operate entirely offline—no external calls, no telemetry—while still delivering formative scaffolding that honors local pedagogy and pacing. The question is less about novelty than fitness for context: can AI be designed to meet learners where they are and extend the teacher’s reach without substituting for it?
Introducing: The SHIELD Framework
SHIELD is an acronym for Secure Handling of Information in EducationaL Devices. It is a lightweight pedagogical-governance and design framework that anchors development in three commitments: instructional integrity, privacy by design, and educator oversight. Instructional integrity is implemented through structured system-prompting techniques, such as the Context, Role, Expectations (CRE) technique, which is embedded at the system level. In practice, the tutor frames each interaction in relation to stated learning objectives, adopts a supportive pedagogical role, and limits its responses to hints, worked examples, and explanations rather than direct answers. Privacy by design is realized by keeping all inference and logging on local machines; student work never leaves the classroom. Human oversight is maintained through teacher mediation: the teacher directs use, interprets feedback with students, and retains authority over assessment and pacing. SHIELD therefore operates at the intersection of product choices and classroom protocol; it is as much an implementation discipline as a pedagogical stance.
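To make the CRE technique concrete, here is a minimal sketch of how such a system prompt might be assembled from its three components. The function name, field wording, and example text are illustrative assumptions, not the prompt deployed in the study:

```python
# Illustrative sketch of a Context-Role-Expectations (CRE) system prompt.
# The wording below is hypothetical; the study's actual prompt is not
# reproduced here.

def build_cre_prompt(context: str, role: str, expectations: str) -> str:
    """Assemble a system prompt from the three CRE components."""
    return (
        f"Context: {context}\n"
        f"Role: {role}\n"
        f"Expectations: {expectations}"
    )

system_prompt = build_cre_prompt(
    context="Introductory Python lesson on loops, aligned with the state curriculum.",
    role="You are a supportive tutor who guides, but never solves for, the student.",
    expectations=(
        "Respond with hints, worked examples, or explanations. "
        "Do not provide complete solutions to graded exercises."
    ),
)
```

Structuring the prompt this way keeps each commitment inspectable: a teacher can read the Expectations field and verify that the tutor is configured for hints rather than answers.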
Implementation
The deployment took place in a rural government high school in Telangana State. Nineteen students, aged 15–17, participated in weekly one-hour sessions over three months (October–December 2024). The lab consisted of twenty mid-range desktops configured to run an open-source large language model locally via LM Studio. After benchmarking several candidates on code-generation and repair tasks inspired by HumanEval, I selected a Llama-family model for its balance of code competence and resource efficiency; the model was quantized so it could run on CPUs where discrete GPUs were unavailable. A minimal teacher console summarized session activity for supervision within the local network. The instructional sequence aligned with the state curriculum for introductory Python (variables, control flow, lists, simple functions, debugging), and the tutor was configured with prompt templates that reinforced classroom norms, including academic integrity and productive struggle.
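For readers who want to reproduce this kind of setup: LM Studio exposes an OpenAI-compatible HTTP API on the local machine. The sketch below builds a chat-completion request body for that API; the endpoint URL (LM Studio's default local port), the model name, and the temperature value are assumptions for illustration, not the study's exact configuration:

```python
import json

# LM Studio's local server speaks the OpenAI chat-completions format.
# The port below is LM Studio's default; adjust to your configuration.
ENDPOINT = "http://localhost:1234/v1/chat/completions"

def make_request_body(system_prompt: str, student_question: str) -> bytes:
    """Build the JSON body for a chat-completion request to the local server."""
    body = {
        "model": "local-model",  # LM Studio serves whichever model is loaded
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": student_question},
        ],
        "temperature": 0.3,  # lower temperature keeps tutoring responses focused
    }
    return json.dumps(body).encode("utf-8")

body = make_request_body(
    "You are a supportive Python tutor. Offer hints, not full solutions.",
    "My for loop prints one extra number. Why?",
)
# This body would be POSTed to ENDPOINT (e.g., with urllib.request);
# because the server runs on the same machine, no data leaves the lab.
```

The key property for SHIELD is in the address: every request resolves to localhost, so the privacy guarantee is structural rather than contractual.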
Methods and Metrics
The study used a pragmatic field-deployment design with mixed methods to capture changes in learner experience and interaction behavior. Before and after the deployment, students completed short surveys measuring perceived programming self-efficacy, learner autonomy, and usefulness of the offline tutor using five-point Likert items. Students also wrote brief reflections on how the tool helped or hindered their learning. System logs, stored locally and anonymized for analysis, recorded categories of requests and the evolution of query types during the term. Descriptive statistics summarize survey shifts, while thematic coding of reflections and log categories traces movement from surface-level requests to more strategic problem-solving. The goal was not causal estimation but educationally meaningful insight into feasibility, classroom fit, and learner experience under offline conditions.
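The log analysis rested on categorizing student queries and tracking how category frequencies shifted over the term. A hedged sketch of what such keyword-based categorization could look like is below; the category names mirror the trajectory described in the findings, but the keyword lists and matching logic are illustrative, not the study's coding scheme:

```python
# Illustrative keyword-based categorization of logged student queries.
# The keyword lists are assumptions; the study's actual coding scheme
# combined thematic coding with log categories.

CATEGORIES = {
    "syntax": ["syntax", "how do i write", "indentation", "colon"],
    "debugging": ["error", "traceback", "doesn't work", "wrong output"],
    "refinement": ["faster", "cleaner", "simplify", "better way"],
}

def categorize_query(query: str) -> str:
    """Assign a logged student query to the first matching category."""
    lowered = query.lower()
    for category, keywords in CATEGORIES.items():
        if any(kw in lowered for kw in keywords):
            return category
    return "other"

print(categorize_query("I get an IndexError traceback on line 3"))  # → debugging
```

Counting category labels per week then yields the kind of trajectory reported in the findings: syntax questions early, debugging mid-term, refinement later.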
Findings
Perceptions of capability and autonomy increased over the study window. Baseline means were already favorable (approximately 4.0–4.2 on a five-point scale) and rose to above 4.5 post-deployment. Students commonly described the tutor as a “coding buddy” or “study partner,” language that signals a shift from dependence on teacher availability to self-initiated help-seeking. Open-ended reflections highlighted a reduction in unproductive waiting and an increase in perseverance when encountering syntax or logic errors. Interaction traces corroborated this trajectory: early sessions were dominated by syntax formulation questions; mid-study interactions concentrated on debugging and error interpretation; later sessions emphasized algorithmic refinement and code clarity. Teachers reported fewer interruptions for minor issues and more time for conceptual discussion, project guidance, and formative assessment conferences. From a feasibility perspective, the system operated reliably without internet connectivity, incurred no recurring license costs, and required only modest maintenance once configured—parameters that matter in budget-constrained public schools.
What this does—and does not—claim
The study demonstrates viability rather than finality. It establishes that on-device AI tutoring can be aligned with curriculum, monitored by teachers, and delivered with local data stewardship in a real school. It does not claim superiority to particular cloud services, nor does it report standardized achievement gains. The focus is on removing the structural precondition of connectivity and observing whether pedagogically useful scaffolding remains. Under SHIELD, it does.
Equity and the global distribution of infrastructure
Designing “offline-first” is a strategy for equity, not a concession to scarcity. When the pedagogical core functions without the network, bandwidth becomes an enhancer rather than a gatekeeper. This orientation matters in the global South, where schools face recurring costs and infrastructural volatility, and in the global North, where rural districts, disaster-affected communities, and privacy-sensitive institutions require local control of data and uptime. In both contexts, an offline-capable tutor widens participation by decoupling learning support from network availability while preserving the teacher’s role as orchestrator of activity and assessor of understanding. The work therefore argues for a reframing of the field’s default assumptions: connectivity is valuable, but it should not be the price of admission to intelligent support.
Risks and safeguards
Any AI tutor—offline or not—carries risks. Models can generate incorrect or misleading code. Students may over-rely on automated hints rather than developing independent debugging skills. And tutors may encode biases present in training data. SHIELD addresses these through several measures: (1) system prompting limits outputs to hints and explanations, never full solutions, to preserve productive struggle; (2) teacher mediation ensures that the tutor functions as one resource among many, not the arbiter of correctness; (3) logging provides visibility into student-tutor interactions, enabling teacher oversight; and (4) local hosting eliminates vendor tracking and gives institutions direct control over the system’s behavior and updates. These are not perfect solutions, but they establish a governance posture where humans remain accountable.
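Safeguard (1) can be reinforced with a local post-filter that flags tutor responses that look like complete solutions rather than hints. The heuristic below (flagging long contiguous fenced code blocks) is an illustrative assumption in the spirit of that safeguard, not the mechanism deployed in the study:

```python
# Illustrative post-filter: flag responses whose fenced code blocks run
# longer than a hint-sized fragment. The threshold is an assumption.

MAX_CODE_LINES = 3  # hints may include short fragments, not full programs

def looks_like_full_solution(response: str) -> bool:
    """Heuristically flag responses containing long contiguous code blocks."""
    in_code, run, longest = False, 0, 0
    for line in response.splitlines():
        if line.strip().startswith("```"):
            in_code = not in_code
            run = 0
            continue
        if in_code:
            run += 1
            longest = max(longest, run)
    return longest > MAX_CODE_LINES

hint = "Check your range() bounds.\n```python\nrange(len(xs))\n```"
print(looks_like_full_solution(hint))  # → False
```

Because the filter runs on the same machine as the model, it stays under the institution's direct control, consistent with safeguard (4).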
Implications for research and policy
The study’s implications extend beyond one rural school. First, it suggests that research on educational AI should expand evaluation criteria beyond learning gains to include feasibility, equity of access, and alignment with teacher autonomy. Second, it demonstrates that open-source models, properly configured, can deliver curriculum-aligned support at scale without dependency on external infrastructure. Third, it calls for policy frameworks that prioritize local data stewardship and on-device inference as equity strategies, not merely technical preferences. Educational AI policy currently assumes connectivity; this work shows that assumption is neither necessary nor neutral. Policymakers and funders should incentivize infrastructure-independent designs as a mechanism for widening participation and reducing structural inequity.
Future directions
SHIELD is a starting point, not a destination. Future work should investigate longer deployments, additional subjects beyond programming, and comparative studies across connectivity contexts. We also need rigorous study of how teachers integrate offline AI into formative assessment cycles, how students develop metacognitive strategies when tutoring is always available, and how local hosting affects maintenance burden and cost sustainability at district scale. Another critical direction is participatory design with educators and students in under-resourced settings, ensuring that offline AI reflects their priorities, not Silicon Valley’s assumptions about what counts as “smart.” Finally, as models improve and hardware becomes cheaper, we must continually ask whether offline-first designs remain necessary—or whether cloud dependency will again be normalized as infrastructure expands. The answer will vary by context, which is precisely why research must attend to equity as a design principle, not an afterthought.
Conclusion
This work shows that AI tutoring does not require the internet. It requires thoughtful design, local stewardship, and respect for teachers as decision-makers. SHIELD offers one model for how that might be done, tested in a real school with real students under real constraints. The results are promising: students gained confidence, teachers retained agency, and the system operated reliably without connectivity. The larger question remains open: will the field treat offline-first AI as a niche solution for “the disconnected,” or as a core equity strategy that expands what counts as infrastructure? The answer will shape who gets to participate in AI-mediated learning—and who is systematically excluded.
Acknowledgments
This research was conducted with the full cooperation of the school administration, teaching staff, and students in rural Telangana. I am deeply grateful for their openness, curiosity, and willingness to experiment with new approaches to computing education. Special thanks to the headmaster for facilitating access and to the teachers who integrated the SHIELD tutor into their instructional practice. This work would not have been possible without their partnership. I also thank my doctoral committee—Ivon Arroyo, Beverly Woolf, and Elizabeth McEneaney—for their guidance and support throughout this project.
