This is NOT just a navigation page anymore. This is a complete teaching document. By the time you finish reading, you will know the full workflow:
Step 1: Capture faces → capture.html
Step 2: Export JSON + images
Step 3: Place files correctly
Step 4: Run recognition → recognize.html
Everything depends on correct training data. If training fails → recognition will ALWAYS show "Unknown".
This project is mostly powered by JavaScript running directly in the browser.
Some functionality comes from your own inline JavaScript inside
capture.html and recognize.html, and some functionality
comes from external JavaScript libraries loaded through script tags.
This section lists the important JavaScript sources, explains why each one is
important, and gives the direct links used in the project.
face-api.js is the main AI-related JavaScript library used by both
capture.html and recognize.html.
It provides the actual face detection, landmark extraction,
face descriptor generation, and matching support.
Without this file, the project cannot detect faces or recognize anyone.
In capture.html, it is used mainly for
detecting a face box so that the face crop can be saved properly.
In recognize.html, it is used for:
The inline JavaScript in capture.html is not loaded as a separate external file, but it is still one of the most important JavaScript parts of the project. It contains your custom logic for the capture workflow.
Its importance is very high because it handles:
camera access through getUserMedia()
export of the faces.json file
Without this JavaScript, the library alone would not be enough. The page would have AI capability available, but no app behavior, no storage, and no export logic.
The inline JavaScript in recognize.html is the custom JavaScript that turns the recognition page into a full live recognition tool.
Its importance is also extremely high because it controls:
loading data/faces.json
building the FaceMatcher
This file is what connects the AI library with your actual recognition app behavior. Without it, there would be no training reload system, no live HUD, no logs, and no real recognition workflow.
Camera access is not an external file, but it is one of the most important browser
JavaScript capabilities used by both pages.
The key API is:
navigator.mediaDevices.getUserMedia()
It is important because it gives the web page access to the camera. Without this browser API, neither capture nor recognition can work.
Your Start Camera event handlers in both pages depend on this API first. If camera access fails, almost the whole workflow stops immediately.
Your project also relies heavily on HTML canvas through JavaScript. Both pages use canvas drawing contexts for important visual and image work.
In capture.html, canvas is important for:
In recognize.html, canvas is important for:
The capture page uses browser storage through JavaScript:
localStorage
This is important because the page stores captured face images and metadata locally before export. That means the user can save several faces, refresh the gallery, and then export later.
Without localStorage in the current design, the user would lose the captured training set on refresh unless a backend API were added.
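A minimal sketch of how captured faces could be kept in browser storage and read back, assuming the records live as one JSON array under a single key. The key name `capturedFaces`, the record fields, and both helper names are hypothetical and not taken from capture.html. The storage object is passed in, so the same code works with the browser's `localStorage` or any object that has `getItem` and `setItem`.

```javascript
// Hypothetical storage key; the real page may use a different one.
const STORAGE_KEY = "capturedFaces";

// Read the saved captures. Falls back to an empty list when nothing is
// stored yet or the stored text is not a valid JSON array.
function loadCaptures(storage) {
  try {
    const parsed = JSON.parse(storage.getItem(STORAGE_KEY) ?? "[]");
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Append one capture (person name + image data URL) and write the list back.
// Returns the new number of stored captures.
function saveCapture(storage, name, dataUrl) {
  const captures = loadCaptures(storage);
  captures.push({ name, dataUrl, savedAt: Date.now() });
  storage.setItem(STORAGE_KEY, JSON.stringify(captures));
  return captures.length;
}
```

In the page this would be called as `saveCapture(localStorage, "Champak", canvas.toDataURL("image/jpeg"))`. Browser storage has a limited quota, so `setItem` can throw once many images are stored; a real page would probably want to catch that and ask the user to export.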
The recognition page uses the browser Fetch API to load
data/faces.json and also to load the referenced training images.
This is important because the recognition page does not hardcode the people and images directly into JavaScript. Instead, it reads them dynamically from files, which makes the system flexible and easier to maintain.
Without fetch, the page could not load your JSON config or external training images in the current architecture.
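A rough sketch of loading and sanity-checking the config with fetch. The helper name `loadFacesConfig` is hypothetical, and so is the assumed file shape: either a plain array of people or an object with a `people` array, where each person has a `name` and a `files` list. Adjust it to whatever your faces.json actually contains.

```javascript
// Load the training config and keep only usable entries.
// `fetchFn` defaults to the global fetch; passing it in keeps the sketch testable.
async function loadFacesConfig(url = "data/faces.json", fetchFn = globalThis.fetch) {
  const res = await fetchFn(url, { cache: "no-store" });
  if (!res.ok) {
    // A wrong path usually shows up here as HTTP 404.
    throw new Error(`Could not load ${url}: HTTP ${res.status}`);
  }
  const data = await res.json();
  // Assumption: an array of people, or an object with a `people` array.
  const people = Array.isArray(data) ? data : data?.people;
  if (!Array.isArray(people)) {
    throw new Error(`${url} does not contain a list of people`);
  }
  // Drop entries without a name or without at least one image path.
  return people.filter(
    (p) => p && typeof p.name === "string" && Array.isArray(p.files) && p.files.length > 0
  );
}
```

Failing loudly on a bad status makes a misplaced file visible right away, instead of silently producing a matcher with no people in it.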
Your pages also include the AdSense script tag. This script is not part of the face capture or face recognition logic, but it is still a JavaScript file included in the page.
Its role is monetization and ad loading, not face processing. So it is present, but it is not a functional dependency for the recognition workflow itself.
In short, the most important functional JavaScript source in this project is face-api.js, but your own inline JavaScript inside capture.html and recognize.html is what actually turns that library into a working application. The browser APIs such as getUserMedia, Canvas, fetch, and localStorage are also essential because they provide camera access, image processing, file loading, and local persistence.
We will start from the MOST IMPORTANT part:
btnStartCam.addEventListener("click", startCamera);
This line connects button → function.
When user clicks button → startCamera() runs.
async function startCamera() {
Async because we use await (camera + promises).
stream = await navigator.mediaDevices.getUserMedia({
video: { facingMode: facing },
audio: false,
});
CRITICAL LINE
This asks browser: "Give me camera stream"
If user denies → error
If page is served over plain HTTP (not HTTPS or localhost) → fails
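The two failure cases above can be turned into readable messages. This is only a sketch: `describeCameraError` is a hypothetical helper and the message wording is made up, but the error names are the standard ones browsers use when getUserMedia rejects.

```javascript
// Map a getUserMedia failure to a message for the user.
// `secure` should be window.isSecureContext when called from the page.
function describeCameraError(err, secure = true) {
  if (!secure) {
    // On insecure pages the camera API is typically not available at all.
    return "Camera needs a secure page: use HTTPS or http://localhost.";
  }
  switch (err && err.name) {
    case "NotAllowedError":
      return "Camera permission was denied.";
    case "NotFoundError":
      return "No camera was found on this device.";
    case "NotReadableError":
      return "The camera could not be read (it may be in use by another app).";
    case "OverconstrainedError":
      return "No camera matches the requested facingMode.";
    default:
      return `Camera failed: ${err && err.message ? err.message : "unknown error"}`;
  }
}
```

Inside startCamera(), the getUserMedia call would sit in a try/catch, and the catch branch could show `describeCameraError(err, window.isSecureContext)` in whatever status element the page has.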
---
video.srcObject = stream;
Attach camera to video element
---
await video.play();
Start video playback
---
running = true;
System state → camera is active
---
loop();
Starts infinite detection loop
---
async function loop() {
if (!running) return;
await detectFrame();
draw();
requestAnimationFrame(loop);
}
This runs continuously like a game loop.
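The loop above can be wrapped so that one bad frame does not kill it and so that stopping is explicit. This is a sketch under assumptions: `createLoop` and its `schedule` parameter are hypothetical names, and in the page you would pass `requestAnimationFrame` as `schedule`.

```javascript
// Build a start/stop loop around injected callbacks.
function createLoop({ detectFrame, draw, schedule }) {
  let running = false;

  async function tick() {
    if (!running) return;          // same guard as `if (!running) return;`
    try {
      await detectFrame();         // wait for detection before drawing
      draw();
    } catch (err) {
      console.error("frame failed:", err); // log and keep looping
    }
    if (running) schedule(tick);   // queue the next frame only while active
  }

  return {
    start() {
      if (running) return;         // ignore a second click on Start Camera
      running = true;
      schedule(tick);
    },
    stop() {
      running = false;
    },
    isRunning: () => running,
  };
}
```

Because each tick awaits detectFrame() before scheduling the next one, detections never overlap even when the model is slower than the screen refresh rate.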
Flow:
const dets = await faceapi
  .detectAllFaces(video, new faceapi.TinyFaceDetectorOptions())
  .withFaceLandmarks()
  .withFaceDescriptors();
This is the CORE AI step.
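It does three things: detects every face in the frame, locates the landmarks on each face, and computes a descriptor for each face.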
ctx.strokeRect(b.x, b.y, b.width, b.height);
Draws red box on face
---
const best = matcher.findBestMatch(d.descriptor);
COMPARES FACE WITH TRAINING DATA
---
label = best.label;
Final output → person's name
---
matcher = new faceapi.FaceMatcher(labeled, 0.55);
0.55 = strictness (the largest descriptor distance that still counts as a match)
Lower = stricter
Higher = more lenient
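To see why a lower number is stricter, here is a simplified stand-in for threshold matching. It is not face-api.js's own code: the function names are hypothetical, descriptors are plain arrays, and each person is scored by the mean Euclidean distance from the query to their training descriptors.

```javascript
// Euclidean distance between two equal-length descriptors.
function euclideanDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// `labeled` is [{ label, descriptors: [[...], ...] }].
// Returns the closest person, or "unknown" when even the closest one
// is not nearer than `threshold`.
function findBestMatch(labeled, query, threshold = 0.55) {
  let best = { label: "unknown", distance: Infinity };
  for (const { label, descriptors } of labeled) {
    if (descriptors.length === 0) continue;
    const total = descriptors.reduce((s, d) => s + euclideanDistance(d, query), 0);
    const mean = total / descriptors.length;
    if (mean < best.distance) best = { label, distance: mean };
  }
  return best.distance < threshold
    ? best
    : { label: "unknown", distance: best.distance };
}
```

With a small threshold, a face has to land very close to the training descriptors to get a name, so more faces fall back to "unknown". Raising it accepts looser matches and increases the chance of naming the wrong person.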
---
const res = await fetch("data/faces.json");
Loads training config
---
for (const person of people)
For each person → load images → extract descriptors
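The per-person step can be sketched like this. `buildLabeledDescriptors` and the `describeImage` callback are hypothetical names: the callback is assumed to resolve to a descriptor for one image path, or to nothing when no face is found in that image.

```javascript
// Collect descriptors per person and remember which images were unusable.
async function buildLabeledDescriptors(people, describeImage) {
  const labeled = [];
  const skipped = [];
  for (const person of people) {
    const descriptors = [];
    for (const file of person.files) {
      try {
        const descriptor = await describeImage(file);
        if (descriptor) descriptors.push(descriptor);
        else skipped.push(file);   // image loaded, but no face was detected
      } catch {
        skipped.push(file);        // image could not be loaded at all
      }
    }
    // A person with zero usable images cannot be matched, so leave them out.
    if (descriptors.length > 0) {
      labeled.push({ label: person.name, descriptors });
    }
  }
  return { labeled, skipped };
}
```

In the page, `describeImage` could load the image with `faceapi.fetchImage(path)` and run `faceapi.detectSingleFace(...).withFaceLandmarks().withFaceDescriptor()` on it, and each entry would be wrapped in `new faceapi.LabeledFaceDescriptors(label, descriptors)` before going into the FaceMatcher. Logging the `skipped` list is a quick way to find out why someone always comes back as "Unknown".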
---
Recognition works like:
getUserMedia()
---
When user clicks "Save":
{
name: "Champak",
files: [
"images/faces/champak-1.jpg"
]
}
---
Capture page groups all saved faces and creates:
faces.json
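A sketch of the grouping step, assuming faces.json is an array of `{ name, files }` entries and that images are named `<slug>-<number>.jpg` as in the Champak example. `slugify` and `buildFacesJson` are hypothetical helper names, and the real export may use a different layout.

```javascript
// Turn a name into a file-name-safe slug: "Champak Lal" -> "champak-lal".
function slugify(name) {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Group saved captures by person and assign numbered image paths.
function buildFacesJson(captures, imageDir = "images/faces") {
  const byName = new Map();
  for (const { name } of captures) {
    const files = byName.get(name) ?? [];
    files.push(`${imageDir}/${slugify(name)}-${files.length + 1}.jpg`);
    byName.set(name, files);
  }
  // Map keeps insertion order, so people appear in the order first saved.
  return [...byName].map(([name, files]) => ({ name, files }));
}
```

The paths written here must match where the exported images are actually placed, because the recognition page loads each one by exactly this path.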
Every face shows "Unknown"
Means → training failed
---
Means → detection failed
---
Check:
Check:
Think of system like this:
Capture = Teaching
Recognition = Testing
No teaching → no recognition.