Gizmo is an AI-powered flashcard and quiz app that generates study materials from any uploaded content (lecture notes, PDFs, textbooks) and then adapts the review schedule to each learner's memory retention patterns. Its 13 million users and $22M in investment are not a coincidence; they validate a specific product thesis: AI can personalise learning at a level that human teachers cannot achieve at scale, and that personalisation drives measurably better outcomes. For engineering teams building EdTech products, or any product where behaviour change is the desired outcome, the patterns Gizmo exemplifies are worth understanding deeply. They are not tricks; they are the result of applying cognitive science with AI that has enough compute to individualise at scale.
What Makes a Learning Product Truly AI-Native?
The term "AI-powered learning" is overused. Most EdTech products that claim AI use it in one of two shallow ways: generating quiz questions from content (useful but not transformative), or providing a chatbot interface over a static knowledge base (conversational, but not adaptive). Truly AI-native learning products use AI at three deeper levels. First: knowledge state modelling — the system maintains a probabilistic model of what each learner knows, how confident they are in each concept, and how quickly they forget it. Second: adaptive sequencing — the system chooses what to present next based on the learner's current knowledge state, not a fixed curriculum order. Third: generative content — the system generates assessment items, explanations, and examples at the learner's current level, not pulling from a fixed question bank. These three capabilities together are what Gizmo and similar AI-native learning apps have that traditional EdTech does not.
Spaced Repetition: The Algorithm That Drives Retention
Spaced repetition is one of the most empirically validated techniques in learning science. The principle: review an item shortly before you are about to forget it, and let each successful recall extend the interval before the next review. Over time, items are reviewed less frequently as they move into long-term memory. What AI changes is the accuracy of the forgetting curve model for each individual item and each individual learner.
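To make the scheduling principle concrete, here is a minimal sketch that assumes the classic exponential forgetting curve, R(t) = exp(-t / S), where S is a per-item memory stability measured in days. The function names, the stability values, and the 90% retention target are illustrative assumptions, not a description of Gizmo's model.

```python
import math


def retention(days_elapsed: float, stability_days: float) -> float:
    """Predicted recall probability under an assumed exponential forgetting curve."""
    return math.exp(-days_elapsed / stability_days)


def days_until_review(stability_days: float, target_retention: float = 0.9) -> float:
    """Days until predicted recall falls to target_retention (solves R(t) = target)."""
    if not 0.0 < target_retention < 1.0:
        raise ValueError("target_retention must be strictly between 0 and 1")
    return -stability_days * math.log(target_retention)


if __name__ == "__main__":
    # Hypothetical stabilities: a new card, a maturing card, a well-learned card.
    for stability in (2.0, 10.0, 30.0):
        print(stability, round(days_until_review(stability), 2))
```

A higher stability pushes the next review further out, which is the effect each successful recall is meant to have.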
Traditional spaced repetition uses the SM-2 algorithm, which schedules reviews from self-rated recall quality using a fixed interval formula and a per-item ease factor. AI-augmented spaced repetition replaces this with a learned model that accounts for: the specific content type (procedural vs declarative vs conceptual), the learner's individual memory characteristics, interference from similar items, contextual factors (time of day, session length), and active recall performance quality. In production implementations, AI-optimised spaced repetition shows 30-50% improvement in retention at the same review time investment compared to SM-2.
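For reference, the SM-2 baseline is small enough to sketch in full. The version below follows the commonly published description of SM-2 (0-5 quality rating, first intervals of 1 and 6 days, ease factor floor of 1.3); the `CardState` and `sm2_update` names are assumptions for illustration, and published implementations differ on details such as whether the ease factor changes after a failed recall.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0      # consecutive successful recalls
    interval_days: int = 0    # gap before the next review
    ease_factor: float = 2.5  # SM-2 starting ease


def sm2_update(state: CardState, quality: int) -> CardState:
    """Apply one review. `quality` is the learner's 0-5 rating of recall."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be in 0..5")

    if quality < 3:
        # Failed recall: restart the interval sequence, leave the ease unchanged.
        return CardState(0, 1, state.ease_factor)

    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        # Later intervals grow by the ease factor held before this review.
        interval = round(state.interval_days * state.ease_factor)

    miss = 5 - quality
    ease = state.ease_factor + (0.1 - miss * (0.08 + miss * 0.02))
    return CardState(state.repetitions + 1, interval, max(1.3, ease))
```

A learned forgetting model would replace the fixed interval rules and the single ease factor with per-learner, per-item predictions, while keeping the same review loop around it.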
Adaptive Content Paths and Prerequisite Graphs
Static curricula assume all learners start from the same point and learn at the same pace. AI-native learning products replace the static curriculum with a knowledge graph: concepts are nodes, prerequisite relationships are edges, and the learner's current knowledge state determines which path through the graph is optimal. The engineering components required:
- Knowledge graph construction — define concepts and their prerequisite relationships. For user-generated content (like Gizmo's uploaded notes), an LLM extracts concepts and infers relationships automatically.
- Learner knowledge state estimation — a Bayesian Knowledge Tracing (BKT) or Deep Knowledge Tracing (DKT) model that estimates the probability a learner has mastered each concept based on their response history.
- Path selection algorithm — given the knowledge state and learning goal, select the optimal next concept. Typically a graph traversal prioritising concepts where prerequisites are met and which unlock the most downstream concepts.
- Content generation — given the selected concept and learner's current level, generate an explanation, example, or assessment item at the appropriate difficulty using a structured LLM prompt.
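The path selection step above can be sketched as a small graph routine. This is one possible heuristic under stated assumptions: the `PREREQS` graph, the 0.9 mastery threshold, and the function names are all hypothetical, and the mastery probabilities are assumed to come from a separate knowledge tracing model.

```python
from collections import deque

# Hypothetical prerequisite graph: concept -> direct prerequisites.
PREREQS = {
    "fractions": [],
    "ratios": ["fractions"],
    "percentages": ["fractions"],
    "probability": ["ratios", "percentages"],
    "statistics": ["probability"],
}


def downstream_count(concept, prereqs):
    """Count concepts that depend on `concept`, directly or transitively."""
    dependents = {c: [] for c in prereqs}
    for c, reqs in prereqs.items():
        for r in reqs:
            dependents[r].append(c)
    seen, queue = set(), deque([concept])
    while queue:
        for nxt in dependents[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


def next_concept(mastery, prereqs, threshold=0.9):
    """Pick an unmastered concept whose prerequisites are all mastered,
    preferring the one that unlocks the most downstream concepts."""
    def mastered(c):
        return mastery.get(c, 0.0) >= threshold

    ready = [c for c in prereqs
             if not mastered(c) and all(mastered(r) for r in prereqs[c])]
    if not ready:
        return None  # everything is mastered, or the graph has no entry point
    # Ties fall to the first candidate in graph order.
    return max(ready, key=lambda c: downstream_count(c, prereqs))
```

For example, `next_concept({"fractions": 0.95, "ratios": 0.95}, PREREQS)` returns `"percentages"`, because `"probability"` stays locked until both of its prerequisites are mastered.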
Engagement Architecture: Keeping Learners Coming Back
The hardest problem in EdTech is not building the learning system — it is getting learners to use it consistently. Dropout rates in traditional online courses run 80-90%. AI-native learning products that drive high engagement share several architectural patterns: daily review streaks that create loss aversion dynamics, micro-session design (3-5 minutes) that fits into commutes and breaks, progress visibility showing knowledge state improvement over time, and proactive nudges when items are about to expire from memory based on the forgetting curve. None of these require novel AI research — they require careful product and engineering decisions that keep the user coming back at the right frequency for retention to compound.
What This Means for Engineering Teams
For engineering teams building learning products — corporate training platforms, skill development tools, onboarding systems, or consumer EdTech — the Gizmo model provides a clear architecture to evaluate against. The key question is: does your product adapt to the individual learner's knowledge state, or does it deliver the same content in the same order to everyone? The engineering investment to move from content delivery to adaptive learning is significant but well-defined: knowledge graph, knowledge state model, adaptive sequencing, and generative content components. None of these require novel AI research — they require careful engineering of established techniques using available LLMs and ML models.
Pillai Infotech builds custom learning platforms and EdTech products for corporate training, professional development, and consumer markets. Our AI developers have hands-on experience with knowledge tracing models, spaced repetition systems, and LLM-based content generation for adaptive learning. Our custom software development engagements include full-stack EdTech platforms built on these patterns from the ground up.
Frequently Asked Questions
What is the difference between Bayesian Knowledge Tracing and Deep Knowledge Tracing?
Bayesian Knowledge Tracing (BKT) is a probabilistic model that estimates knowledge state for each concept independently using four parameters. It is interpretable and works well with small datasets. Deep Knowledge Tracing (DKT) uses a recurrent neural network that models dependencies between concepts — it performs better when interdependencies are complex and when sufficient interaction data is available (typically 10,000+ learner interactions). For new products with limited data, start with BKT.
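The four BKT parameters are usually written P(L0), P(T), P(G) and P(S). A minimal sketch of the standard per-concept update follows; the class and function names are assumptions for illustration, and real parameter values would be fitted to your own response data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BKTParams:
    p_init: float   # P(L0): concept already known before any practice
    p_learn: float  # P(T): chance of learning the concept on each attempt
    p_guess: float  # P(G): correct answer without mastery
    p_slip: float   # P(S): incorrect answer despite mastery


def bkt_update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Return P(known) after observing one response."""
    if correct:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den  # Bayes step: condition on the observed response
    # Learning step: the learner may have acquired the concept on this attempt.
    return posterior + (1 - posterior) * params.p_learn


def trace(responses, params: BKTParams):
    """Run the update over a sequence of True/False responses."""
    p, history = params.p_init, []
    for correct in responses:
        p = bkt_update(p, correct, params)
        history.append(p)
    return history
```

Because each concept is tracked by its own instance of this update, the model stays interpretable: every estimate can be traced back to four numbers and a response history.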
How does Gizmo generate questions from uploaded content?
Gizmo uses an LLM to extract key concepts from uploaded text (PDFs, notes, slides), then generates flashcard-style question-answer pairs for each concept. The quality of generated questions depends heavily on the prompt design — effective prompts specify question type (recall, application, conceptual), difficulty level, and require verification that answers are unambiguously correct.
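As an illustration of the prompt design points above, here is a hedged sketch of a prompt builder plus a reply check. Gizmo's actual prompts are not public; the wording, the 1-5 difficulty scale, the JSON reply shape, and both function names are assumptions. The call to the LLM itself is deliberately left out.

```python
import json

QUESTION_TYPES = ("recall", "application", "conceptual")


def build_flashcard_prompt(concept: str, excerpt: str,
                           question_type: str = "recall",
                           difficulty: int = 2) -> str:
    """Assemble a prompt asking for one flashcard as JSON (hypothetical wording)."""
    if question_type not in QUESTION_TYPES:
        raise ValueError(f"unknown question type: {question_type}")
    if not 1 <= difficulty <= 5:
        raise ValueError("difficulty must be in 1..5")
    return "\n".join([
        f"Write one {question_type} flashcard about the concept: {concept}.",
        f"Target difficulty: {difficulty} on a 1-5 scale.",
        "Use only the source excerpt below.",
        "The answer must be unambiguous and directly supported by the excerpt.",
        'Reply with JSON: {"question": ..., "answer": ..., "evidence": ...},',
        "where evidence is an exact quote from the excerpt.",
        "",
        "Source excerpt:",
        excerpt,
    ])


def validate_flashcard(raw_reply: str, excerpt: str) -> dict:
    """Parse a model reply and reject cards whose evidence is not in the excerpt."""
    card = json.loads(raw_reply)
    if not isinstance(card, dict):
        raise ValueError("reply must be a JSON object")
    for field in ("question", "answer", "evidence"):
        if not isinstance(card.get(field), str) or not card[field].strip():
            raise ValueError(f"missing field: {field}")
    if card["evidence"] not in excerpt:
        raise ValueError("evidence is not a verbatim quote from the excerpt")
    return card
```

Requiring a verbatim evidence quote is a cheap proxy for answer verification: it does not prove the answer is correct, but it filters out cards that are not grounded in the uploaded material.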
What data do you need to build an effective spaced repetition system?
You need four fields: item ID, learner ID, timestamp, and response (correct/incorrect). Response time and a confidence rating are optional but valuable. The SM-2 algorithm works with the four required fields from the first session, provided you map each response onto its 0-5 recall quality scale. AI-optimised models improve significantly with 20+ review events per item per learner. Start with SM-2 and layer in a learned forgetting model once you have sufficient interaction data.
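A review log with these fields can be modelled in a few lines. The sketch below is an assumption about shape, not a prescribed schema: the `ReviewEvent` field names and the helper are hypothetical, and the default of 20 events simply mirrors the threshold mentioned above.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ReviewEvent:
    item_id: str
    learner_id: str
    timestamp: datetime
    correct: bool
    response_ms: Optional[int] = None  # optional: time taken to answer
    confidence: Optional[int] = None   # optional: learner's self-rating


def pairs_ready_for_learned_model(events, min_events: int = 20):
    """Return (learner_id, item_id) pairs with enough history to fit a learned model."""
    counts = Counter((e.learner_id, e.item_id) for e in events)
    return {pair for pair, n in counts.items() if n >= min_events}
```

Pairs below the threshold can stay on SM-2 scheduling, so the switch to a learned model can happen item by item rather than as a single migration.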
Is this architecture suitable for corporate training and onboarding?
Yes — adaptive learning is particularly high-value in corporate training because learners have highly variable prior knowledge. AI-adaptive onboarding that routes experienced hires through advanced content reduces time-to-productivity. Spaced repetition significantly improves compliance training retention for regulations that must be remembered and applied in practice.
How much engineering effort does building an AI-native learning platform require?
A basic AI-native learning platform with content ingestion, LLM-generated flashcards, SM-2 spaced repetition, and a mobile-responsive UI requires approximately 3-4 months with a team of 3-4 engineers. A full adaptive system with knowledge graphs and DKT knowledge state modelling requires 6-9 months and a team that includes an ML engineer experienced with knowledge tracing models.