How Chat moderation works to keep students safe

Misty Flowers

July 01, 2024 19:17

This article covers how content moderation works in Finalsite Chat, including how the hybrid AI and human moderation system processes text, images, and video, and what actions are taken for each harm category.

💡Quick answers

Is Finalsite's AI trained on school chat data? No. Chat data is analyzed in real time for safety but is never used to train AI models. Private conversations stay within the secure school environment.
What happens when harmful content is detected? Depending on severity, the system takes one of six actions: no action, flag for human review, shadow block (hidden from others without notifying the sender), block (removed with sender notified), bounce and flag (content rejected and sent to the admin queue), or bounce and block (content rejected and account disabled).
Does the AI make final decisions on discipline? No. Borderline content is flagged for a school administrator to review. A human always evaluates context before any disciplinary action is taken.
What types of content does the moderation system catch? Text is scanned for bullying, child safety risks, self-harm, hate speech, sexual content, threats, PII sharing, spam, substance abuse, and academic dishonesty. Images and videos are scanned across 11 categories including explicit content, violence, hate symbols, and QR codes.
Is Chat HIPAA-compliant? No. Protected health information must not be transmitted through Chat. Use a HIPAA-compliant database or secure email for medical documents.
What video formats are supported for moderation scanning? MP4 and MOV containers using the H.264 (AVC) codec. Files encoded with HEVC/H.265 or Apple ProRes are not currently supported.
Does this comply with student privacy laws? Yes. Finalsite Chat is compliant with FERPA and COPPA. No student data is used to train AI models.

Security and privacy in Finalsite Chat

At Finalsite, your privacy and security are our top priority. Robust privacy settings protect your community's data with care and integrity. To learn more, please view our Privacy Policy, and contact us at privacy@finalsite.com with any questions or concerns. We strive to maintain a safe chat environment for productive engagement between users and teachers.

Data integrity: No large language models (LLMs) are trained using your school's private chat data.
Global protection: AI text moderation has no language limits, protecting diverse communities across all dialects.

HIPAA and health security

It is our policy that no protected health information (PHI) subject to the Health Insurance Portability and Accountability Act (HIPAA) be disclosed through our platform. For transmitting medical documents, we recommend the following:

Use a HIPAA-compliant database to ensure the highest level of security.
Ask parents and guardians to submit any information that includes protected health information via secure email.

How content moderation works

Every message (text, image, or video) undergoes real-time analysis through three primary engines:

AI Text Harm Detection (LLM): Analyzes context, intent, and sentiment.
AI Image Recognition: Powered by AWS Rekognition to scan for symbols, nudity, and violence.
AI Video Moderation: Frame-by-frame analysis of MP4 and MOV files.

Moderation actions explained

Depending on the severity and context, the AI will trigger one of the following actions:

Action	What It Does	Visibility
No action	Content passes through automatically.	Visible to all.
Flag	Marked for human review in the dashboard.	Visible while awaiting review.
Shadow block	Hidden from the public; the sender is unaware.	Hidden (Spam prevention).
Block	Removed immediately; sender is notified.	Hidden.
Bounce & flag	Rejected immediately and sent to the admin queue.	Hidden (Safety critical).
Bounce & block	Content rejected and the user account is disabled.	Hidden (Severe violations).
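As an illustration only, the table above can be written as a small lookup. This is a minimal sketch, not Finalsite's implementation: the names `Action`, `Effect`, and `EFFECTS` are hypothetical, and any field the table does not state explicitly is assumed to be false.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The six moderation actions listed in the table above."""
    NO_ACTION = "No action"
    FLAG = "Flag"
    SHADOW_BLOCK = "Shadow block"
    BLOCK = "Block"
    BOUNCE_AND_FLAG = "Bounce & flag"
    BOUNCE_AND_BLOCK = "Bounce & block"


@dataclass(frozen=True)
class Effect:
    """Hypothetical summary of what an action does to a message."""
    visible_to_others: bool
    queued_for_review: bool
    account_disabled: bool


# Values follow the table; anything the table leaves unstated is assumed False.
EFFECTS = {
    Action.NO_ACTION: Effect(True, False, False),
    Action.FLAG: Effect(True, True, False),           # visible while awaiting review
    Action.SHADOW_BLOCK: Effect(False, False, False),  # sender is unaware
    Action.BLOCK: Effect(False, False, False),         # sender is notified
    Action.BOUNCE_AND_FLAG: Effect(False, True, False),
    Action.BOUNCE_AND_BLOCK: Effect(False, False, True),
}


def is_visible(action: Action) -> bool:
    """Return True if other users can still see the content."""
    return EFFECTS[action].visible_to_others
```

Only "No action" and "Flag" leave content visible; every other action hides it from other users.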

Moderation harm categories

Category	What it catches	Action	Logic/Goal
Bullying	Name-calling, insults, social exclusion, rumors, mocking.	Flag	Allows humans to distinguish between actual bullying and a teacher reporting a student's behavior.
Child safety	Grooming patterns, off-platform requests, boundary violations.	Bounce & flag	Safety critical; requires immediate rejection and admin review.
Self-harm	Suicidal ideation, self-injury, crisis messages.	Flag	Prioritizes intervention over silencing so a counselor can reach out.
Hate speech	Slurs, attacks on race, religion, gender, or orientation.	Block	Clear policy violation with no legitimate school use.
Sexual content	Explicit descriptions, solicitation, sexualized comments.	Block	Inappropriate for a K-12 environment.
Threats	Violence, school threats, intimidation, blackmail.	Bounce & flag	Safety critical; requires immediate school safety escalation.
PII sharing	Addresses, SSNs, financial info, sharing others' private info.	Flag	Distinguishes between a parent sharing their own info vs. a student doxxing someone.
Spam	Advertising, phishing, scams, repetitive "noise."	Shadow block	User remains unaware their message was hidden to reduce disruption.
Substance abuse	Drug dealing, underage drinking, promotion of misuse.	Flag	Context matters (e.g., an incident report vs. drug promotion).
Academic dishonesty	Selling test answers, plagiarism, cheating coordination.	Flag	Allows the school to handle the disciplinary process internally.
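The category-to-action mapping in the table above can be sketched as a lookup. The mapping itself comes from the table; everything else is an assumption made for illustration. In particular, this article does not say what happens when one message matches several categories, so the "most restrictive action wins" rule and the severity order used below are hypothetical.

```python
from enum import IntEnum


class Action(IntEnum):
    # The numeric order is an ASSUMED severity ranking, not documented behavior.
    NO_ACTION = 0
    FLAG = 1
    SHADOW_BLOCK = 2
    BLOCK = 3
    BOUNCE_AND_FLAG = 4


# Category -> action, copied from the text harm category table.
TEXT_CATEGORY_ACTIONS = {
    "bullying": Action.FLAG,
    "child_safety": Action.BOUNCE_AND_FLAG,
    "self_harm": Action.FLAG,
    "hate_speech": Action.BLOCK,
    "sexual_content": Action.BLOCK,
    "threats": Action.BOUNCE_AND_FLAG,
    "pii_sharing": Action.FLAG,
    "spam": Action.SHADOW_BLOCK,
    "substance_abuse": Action.FLAG,
    "academic_dishonesty": Action.FLAG,
}


def resolve_action(detected_categories):
    """Pick the most restrictive action among detected categories (assumed rule)."""
    return max(
        (TEXT_CATEGORY_ACTIONS[category] for category in detected_categories),
        default=Action.NO_ACTION,
    )
```

For example, under these assumptions `resolve_action(["bullying", "threats"])` resolves to `Action.BOUNCE_AND_FLAG`, and a message with no detected categories resolves to `Action.NO_ACTION`.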

AI image and video moderation: 11 categories

Visual content is processed via AWS Rekognition to detect harmful imagery, symbols, or sensitive information within files. The 11 flagged categories below apply to both static images and video files; a twelfth, swimwear and underwear, is detected but set to no action. An image is analyzed as a single unit, while a video undergoes frame-by-frame inspection so that harmful content, symbols, or PII can be detected at any timestamp.

Category	What it catches	Action	Why this action
Explicit	Pornographic content, sexual acts.	Flag	No legitimate school use; requires review.
Non-explicit nudity	Exposed body parts.	Flag	Allows for review of health or art class edge cases.
Violence	Fighting, weapons, gore, aggression.	Flag	Sports photos or news clips may trigger; requires human context.
Visually disturbing	Mutilation, shock content.	Flag	Prevents trauma while allowing for review.
Drugs and tobacco	Paraphernalia, usage imagery.	Flag	Distinguishes between drug promotion and health class materials.
Alcohol	Beverages, consumption imagery.	Flag	Monitoring for event photos (e.g., school fundraisers).
Rude gestures	Offensive hand signals.	Flag	Context matters; allows for review of student behavior.
Gambling	Cards, dice, slot machines.	Flag	Distinguishes math class activities from actual wagering.
Hate symbols	Extremist or offensive imagery.	Flag	Monitors for hate speech while allowing for history class edge cases.
PII (visual)	Credit cards, IDs, passports, faces.	Flag	Catches sensitive documents shared within images.
QR code	Links that could bypass domain lists.	Flag	Prevents bypassing of blocked domain lists.
Swimwear and underwear	Revealing clothing (swim team, PE).	No action	Set to no action to avoid false positives in a school setting.
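To make the frame-by-frame idea concrete, here is a minimal sketch of how per-frame findings might be grouped by category and timestamp. It does not call AWS Rekognition; the input shape (a list of `(timestamp_seconds, category)` pairs) and the function name `summarize_video` are hypothetical. The category-to-action values come from the table above.

```python
from collections import defaultdict

# Category -> action, copied from the image and video category table.
VISUAL_CATEGORY_ACTIONS = {
    "explicit": "Flag",
    "non_explicit_nudity": "Flag",
    "violence": "Flag",
    "visually_disturbing": "Flag",
    "drugs_and_tobacco": "Flag",
    "alcohol": "Flag",
    "rude_gestures": "Flag",
    "gambling": "Flag",
    "hate_symbols": "Flag",
    "pii_visual": "Flag",
    "qr_code": "Flag",
    "swimwear_and_underwear": "No action",
}


def summarize_video(frame_detections):
    """Group flagged categories by the timestamps (in seconds) where they appear.

    frame_detections is an assumed input shape: (timestamp_seconds, category) pairs.
    Categories set to "No action" are dropped from the summary.
    """
    flagged = defaultdict(list)
    for timestamp, category in frame_detections:
        if VISUAL_CATEGORY_ACTIONS[category] == "Flag":
            flagged[category].append(timestamp)
    return {category: sorted(times) for category, times in flagged.items()}
```

A reviewer could use a summary like this to jump straight to the moments in a video that triggered a flag.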

Important note on processing and supported formats

The platform utilizes asynchronous processing, allowing images and videos to be accepted immediately while the analysis is performed in the background.

For videos, the engine conducts a frame-by-frame analysis to detect objects, scenes, and activities. Once processing is complete, results are typically delivered via system notifications or stored in your dashboard for review.
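The accept-first, analyze-later pattern described above can be sketched with Python's `asyncio`. This is an illustration of asynchronous processing in general, not the platform's code: `analyze`, `accept_upload`, the in-memory `results` dictionary, and the short sleep that stands in for the analysis are all hypothetical.

```python
import asyncio


async def analyze(upload_id: str, results: dict) -> None:
    """Stand-in for background analysis; a real scan would take much longer."""
    await asyncio.sleep(0.01)
    results[upload_id] = "no findings"


async def accept_upload(upload_id: str, results: dict, background: set) -> str:
    """Accept the upload immediately and schedule analysis in the background."""
    task = asyncio.create_task(analyze(upload_id, results))
    background.add(task)                        # keep a reference so the task is not lost
    task.add_done_callback(background.discard)  # clean up once analysis finishes
    return "accepted"


async def demo():
    results, background = {}, set()
    status = await accept_upload("video-1", results, background)
    pending_at_accept = "video-1" not in results  # analysis has not finished yet
    await asyncio.gather(*background)             # wait for background analysis
    return status, pending_at_accept, results
```

Running `asyncio.run(demo())` shows the upload being accepted before any result exists, with the result appearing only after the background task completes.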

Supported formats:

Images: JPEG, PNG
Video containers: MP4, MOV
Required codec: H.264 (AVC)
Note on QuickTime: While the .mov container is supported, QuickTime files recorded on iPhone with the HEVC/H.265 codec or with Apple ProRes are not currently supported. For consistent results, please ensure files are encoded using the standard H.264 codec.
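If you want to check a file before uploading it, the rules above can be sketched as follows. This is a hedged example, not a Finalsite tool: it assumes the open-source `ffprobe` utility (part of FFmpeg) is installed, that formats can be judged by file extension (`.jpg`/`.jpeg`, `.png`, `.mp4`, `.mov`), and the function names are hypothetical. `ffprobe` reports H.264 as `h264`, HEVC as `hevc`, and ProRes as `prores`.

```python
import json
import subprocess
from pathlib import Path
from typing import Optional

SUPPORTED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed extensions for JPEG, PNG
SUPPORTED_VIDEO_EXTENSIONS = {".mp4", ".mov"}
REQUIRED_VIDEO_CODEC = "h264"  # ffprobe's codec_name for H.264 (AVC)


def probe_video_codec(path: str) -> str:
    """Return the codec name of the first video stream, using ffprobe."""
    completed = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name", "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    streams = json.loads(completed.stdout).get("streams", [])
    return streams[0]["codec_name"] if streams else ""


def is_supported(filename: str, video_codec: Optional[str] = None) -> bool:
    """Apply the supported-format rules listed above to a file name and codec."""
    extension = Path(filename).suffix.lower()
    if extension in SUPPORTED_IMAGE_EXTENSIONS:
        return True
    if extension in SUPPORTED_VIDEO_EXTENSIONS:
        return video_codec == REQUIRED_VIDEO_CODEC
    return False
```

For a video, you would call `is_supported(path, probe_video_codec(path))`. A `.mov` file that reports `hevc` or `prores` would need to be re-encoded to H.264 before it can be scanned.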

Configure your chat moderation tools

Once you are familiar with how moderation works, see Moderate your Chat feature for the tools available to take action on flagged content.

FAQs

Q: Does the AI "learn" from my messages or data?
A: No. The AI is not trained on your school’s data. It acts like a digital security guard—it "reads" a message once to check for safety, then immediately forgets it. Your private conversations never leave the secure school environment.

Q: Is my student’s data being sold to AI companies?
A: No. Finalsite does not sell or share student data. The AI is used exclusively for safety moderation (blocking bullying/threats) and accessibility (real-time translation).

Q: What exactly is the AI scanning for?
A: The system performs a real-time check on text, images, and videos for safety risks (bullying, self-harm), privacy leaks (home addresses, SSNs), and inappropriate content (violence, nudity).

Q: Does the AI make final disciplinary decisions?
A: No. If the AI detects a borderline issue, it flags the message for a school administrator. A real person at your school always reviews the context before any disciplinary action is taken.

Q: Does this comply with student privacy laws?
A: Yes. Finalsite Chat is fully compliant with FERPA and COPPA. Because we do not "train" our models on your data, your student’s digital footprint remains strictly protected.

Q: Why use AI in our school app at all?
A: It provides 24/7 protection by blocking harmful content in milliseconds and allows families who speak different languages to communicate instantly with teachers via translation.