Finalsite employs a Hybrid Moderation Policy that blends Natural Language Processing (NLP), advanced AI automation, and nuanced human judgment to screen harmful content efficiently and accurately. With multiple moderation levels in place, this system helps create a safe and supportive chat environment. This article will guide you through customizing moderation settings to best fit your district or school's needs.
Important Note
No Large Language Models (LLMs) are trained using your school's chat data.
This is a Chat for School Leaders article, helping district and school admins with everything they need to manage chat setups for all Chat Managers and Chat Participants at their district or school. School leaders typically hold either the District Admin or School Admin role.
Not a school leader? Check out Chat for Families or Chat for Teachers for more personalized help!
In this Article
Security and privacy in Finalsite Chat
At Finalsite, ensuring your privacy and security is our top priority. Robust privacy settings are in place to ensure that your community's data is protected with care and integrity. Please view our Privacy Policy to learn more, and contact us at privacy@finalsite.com with any questions or concerns. We strive to maintain a safe chat environment for productive engagement between users and teachers.
HIPAA and health security
It is our policy that no protected health information (PHI) subject to the Health Insurance Portability and Accountability Act (HIPAA) be disclosed through our platform. For transmitting medical documents, we recommend the following:
- Utilize a HIPAA-compliant database to ensure the highest level of security.
- Request that parents and guardians submit information, including any protected health information, via secure email.
How content moderation works
In Finalsite Chat, moderation combines active monitoring by district and school administrators with a user reporting process. You can rest assured that immediate blocking filters are also in place for a wide variety of inappropriate words and scenarios.
Key features of Finalsite's moderation
- Extensive harm detection
  - Detects approximately 40 types of harmful content
  - Supports detection in 15 languages for global applicability
- AI-powered moderation
  - Employs both NLP and machine learning for precise content analysis
  - Automatically flags or blocks potential issues to ease human moderator workload
- Human moderation dashboard
  - User-friendly interface enabling efficient review and decision-making
  - Empowers moderators to manage complex moderation tasks easily
- Multi-media moderation
  - Covers images and videos, ensuring comprehensive user safety
Moderation actions explained
- Flag: Content marked for human review but remains visible.
- Block: Content immediately prevented from posting.
- Bounce: Content returned to the sender for correction.
Detailed moderation workflow
When a user sends a message (text, image, or video) in the chat, the content undergoes real-time moderation via multiple engines:
- Blocklists: Checks submitted text against custom-defined lists of prohibited words or phrases.
- AI text harm detection engine: Analyzes textual content for harmful expressions in multiple languages.
- AI semantic filters: Identifies harmful content by contextual similarity and semantic understanding.
- AI image recognition: Evaluates visual content for inappropriate imagery or harmful symbols.
Based on the results from these moderation engines, one of the following moderation decisions is automatically made:
- Allow message: Content is appropriate, posted without intervention.
- Flag for review: Content is borderline; remains visible but is flagged for human moderation review.
- Block message: Content clearly violates guidelines and is immediately prevented from posting.
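The decision flow above can be sketched as a simple rule-based pipeline. This is a minimal illustration only: the blocklist entries, category names, and ordering here are hypothetical assumptions for the sketch, not Finalsite's actual implementation.

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"            # posted without intervention
    FLAG = "flag for review"   # visible, queued for human review
    BLOCK = "block"            # prevented from posting

# Hypothetical word list and harm categories, for illustration only.
BLOCKLIST = {"badword1", "badword2"}
BLOCK_CATEGORIES = {"threat", "sexual_harassment", "racism"}
FLAG_CATEGORIES = {"dating", "ads", "politics"}

def moderate(text: str, ai_categories: set) -> Decision:
    """Apply the blocklist first, then AI harm-category results."""
    # 1. Blocklist check: any prohibited word blocks immediately.
    words = {w.strip(".,!?").lower() for w in text.split()}
    if words & BLOCKLIST:
        return Decision.BLOCK
    # 2. AI engines: severe categories block, borderline ones flag.
    if ai_categories & BLOCK_CATEGORIES:
        return Decision.BLOCK
    if ai_categories & FLAG_CATEGORIES:
        return Decision.FLAG
    # 3. Otherwise the message is appropriate and posts normally.
    return Decision.ALLOW
```

In the real system the `ai_categories` input would come from the text, semantic, and image engines described above; the key point is that the most severe result wins.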
Moderators can manually review flagged messages via the built-in Admin Moderation Dashboard, where they can approve, block, or bounce the content.
Additionally, any chat user can manually flag received messages from others. These manually flagged messages also appear in the Moderation Dashboard, requiring human moderators to review and take necessary actions.
Flagged for review
Text, images, and videos flagged by our AI model for toxicity, threats, or inappropriate language are reviewed by administrators.
Here is a list of the communication types that are flagged for review:
- Dating: Soliciting romantic or personal relationships.
- Reputation harm: Intended to harm an individual's or entity's reputation.
- Platform bypass: Circumventing platform rules or moderation systems.
- Ads: Promotional messages from businesses or individuals.
- Useless: Irrelevant content not enriching conversation.
- PII: Sharing personally identifiable information (PII) online without consent.
- Underage user: Disclosure of being underage.
- Link: Comments containing external links.
- Geopolitical: Discussions on international political relations.
- Negative criticism: Hostile or unconstructive feedback.
- Boycott: Advocating protests against products or policies.
- Politics: Comments on political parties, figures, or governance.
AI image flagging
- Rude gestures: Images that include offensive gestures.
- Gambling: Images that appear intended to initiate gambling or wagers.
Blocked immediately
To maintain a safe and appropriate chat environment, certain types of content will be immediately blocked and not permitted for display. Messages containing profanity are blocked before sending, with the sender notified of the violation.
Below is a list of items that fall into these restricted categories:
- Threat: Comments designed to intimidate or scare another person.
- Sexual harassment: Comments and abusive behavior designed to sexually humiliate a person.
- Moral harassment: Comments designed to morally humiliate a person.
- Self harm: Intentional act or threats of causing harm to oneself.
- Terrorism: Threats or incitement of violence or hostage-taking.
- Racism: Discrimination or prejudice based on race or ethnicity.
- LGBTQIA+ Phobia: Discrimination against the LGBTQIA+ community.
- Misogyny: Prejudice or discrimination against women.
- Ableism: Prejudice towards individuals living with disabilities.
- Pedophilia: Comments promoting sexual attraction towards minors.
- Insult: Disrespectful or abusive language targeting an individual.
- Hatred: Expressions intended to injure individuals, groups, or entities.
- Body Shaming: Harmful comments on someone's physical appearance or modifications.
- Doxxing: Sharing personal information without consent for intimidation or revenge.
- Vulgarity: Crude, offensive, or explicit language.
- Sexually explicit: Explicit sexual references or material.
- Drug explicit: Promoting or discussing drug use.
- Weapon explicit: Discussion or promotion of arms and weapons usage.
- Scam: Attempts to deceive or extort money from users.
- Flood: Repetitive or spam messages overwhelming communication channels.
- Terrorism reference: Referencing terrorist activities or extremist groups.
AI image blocking
- Explicit
- Non-explicit nudity
- Swimwear/underwear
- Violence
- Visually disturbing
- Drugs & tobacco
- Alcohol
- Hate Symbols
AI video blocking
- Explicit content
- Non-explicit nudity
- Swimwear/underwear
- Violence & visually disturbing
- Substances (drugs, tobacco, alcohol)
- Inappropriate Content (Rude gestures, gambling, hate symbols)
Moderate in the app
Adjust Posting permissions
Want to change who can post messages in a certain chat room? Posting permissions let you do just that! Certain chat rooms can be set up so that only school moderators can post, which is great for announcement rooms.
Here's how to adjust Posting permissions in a chat room:
Step 1: Open up the chat room whose permissions you want to adjust.
Step 2: Press and hold the room name at the top of the chat room to open your chat room settings.
Step 3: In the Settings section, tap on Posting permissions.
Step 4: By default, each chat room is set to Everyone. Tap the radio button next to School moderators only. This means that anyone with the Chat Participant role will not be able to post; anyone with the Chat Manager, School Admin, or District Admin role can still post in a chat room set to School moderators only.
Quick Tutorial: Adjust Posting permissions
Here's an interactive tutorial.
** Best experienced in Full Screen (click the icon in the top right corner before you begin) **

Moderate in the Chat module
The Message moderation tab
In the Chat feature within your Mobile Apps module, you will find a Message moderation tab to manage all of the items in chat communication that are reported, reviewed, and blocked.
There are three tabs within Message moderation: Reported, Reviewed, and Blocked. Here are the different actions that can be taken within each area.
The Reported tab is where any reported communication will first display, along with a number representing how many items are ready to be managed.
Take the following actions:
- Show full chat
- Hide message
- Unreport an item if it was reported by mistake or the report is overturned.
- Click the 3-dot menu to view other actions such as:
- 24-hour chat room block
- 24-hour global block
- Permanent chat room block
- Permanent global block
- Hide message
- Unreport
In the Reviewed tab, items that have already been reviewed and acted on will populate.
Take the following actions in the Reviewed tab:
- Show full chat
- Hide message
- Click the 3-dot menu to view other actions such as:
- 24-hour chat room block
- 24-hour global block
- Permanent chat room block
- Permanent global block
- Hide message
The Blocked tab is where you can manage any user that has been blocked and take new actions if needed.
Here are the actions you can take in the Blocked tab:
- Show full chat
- Hide message
- Click the 3-dot menu to take another action:
- 24-hour chat room block
- 24-hour global block
- Permanent chat room block
- Permanent global block
- Hide message
Filter your results
Within each of the three areas above, you can also filter and narrow down your results for quick and seamless moderation!
- Click into the All times field to view a calendar and select a range of time to repopulate the results below.
- Click All categories to filter by moderation category: Bullying or harassment, inappropriate language, discrimination or hate speech, threats of violence, sharing of personal information, or inappropriate content.