We maintain a library of trained models that you can begin using immediately, because speed matters once you have made it a priority to set a new safety standard for your community.

Since each community is unique, these models can be immediately customized to reflect your guidelines. They continually and automatically update, keeping you informed and in control no matter how quickly your community evolves.



Toxic

Our toxic model detects content that includes anything considered sexual, insulting, harassing, or hateful; this type of content would typically be removed from a platform but would not necessarily result in the poster being banned from the site.


Hate Speech

Our hate speech model detects attacks against a person or group on the basis of attributes such as race, religion, ethnic origin, national origin, sex, disability, sexual orientation, or gender identity.

Severe Toxic

Our severe toxic model detects content that includes anything considered to be severely sexual, insulting, harassing, or hateful; this type of content would typically result in the user who posted it being banned from the platform.
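The difference between the toxic and severe toxic models suggests a simple mapping from detections to moderation actions. The following is a minimal Python sketch under stated assumptions: the label names `toxic` and `severe_toxic`, scores in the range 0 to 1, the 0.8 threshold, and the `choose_action` helper are all hypothetical and are not taken from any actual API.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REMOVE = "remove"
    REMOVE_AND_BAN = "remove_and_ban"


# Hypothetical label names and threshold, chosen only for illustration.
TOXIC = "toxic"
SEVERE_TOXIC = "severe_toxic"
THRESHOLD = 0.8


def choose_action(scores: dict[str, float], threshold: float = THRESHOLD) -> Action:
    """Map per-model scores (assumed to lie in [0, 1]) to a moderation action."""
    # Check the severe label first so it takes precedence over plain toxic.
    if scores.get(SEVERE_TOXIC, 0.0) >= threshold:
        return Action.REMOVE_AND_BAN
    if scores.get(TOXIC, 0.0) >= threshold:
        return Action.REMOVE
    return Action.ALLOW
```

Under these assumptions, `choose_action({"toxic": 0.9})` returns `Action.REMOVE`, while a high `severe_toxic` score escalates to `Action.REMOVE_AND_BAN`. A real deployment would tune the threshold and actions to its own community guidelines.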



Solicitation

Our solicitation model detects acts of trying to obtain services from someone, offering services to someone, or mentioning services as a part of negotiation.

This can be narrowed down to specific types of solicitation such as solicitation of weapons, drugs, sex, or human trafficking.

Self Harm

Our self-harm model detects content that explicitly or suggestively indicates that someone struggles with, or has struggled with, self-harm, as well as content that encourages others to harm themselves.


Drugs

Our drugs model detects content that suggestively or explicitly references legal or illegal drugs, drug paraphernalia, or drug usage.


Weapons

Our weapons model detects content with explicit mentions of weapons, ammunition, or accessories.

Sexual Content

Our sexual content model detects content that directly or suggestively mentions sexual activity, mentions or implies sexual arousal, or comments on physical appearance in a way that exhibits sexual interest or consideration.


Misogyny

Our misogyny model detects content that appears to support a dislike of, contempt for, or ingrained prejudice against women.


Grooming

Our grooming model detects content in which someone is being recruited with the intent to solicit content or membership.

This can be narrowed down to specific types of grooming, such as underage grooming or white nationalist grooming.


Profanity

Our profanity model detects any content with vulgar, profane, or obscene language.


Insults

Our insults model detects content including language that degrades, shames, or attacks any group, individual, organization, entity, or inanimate object.


Violence

Our violence model detects content with mentions of any physical action, or intent to commit physical action, that may inflict harm upon a living being, a representation of a living being, or oneself.

Sexual Harassment

Our sexual harassment model detects content including sexual remarks and advances that are unwelcome and unwanted by the receiver.

Graphic Violence

Our graphic violence model detects content with descriptions of violent acts that would be considered brutal and mentions of injury, blood, gore, bodily fluids, or internal organs that are likely the result of a violent act.

General Harassment

Our general harassment model detects content that includes aggressive pressure or intimidation and behavior that annoys, threatens, intimidates, alarms, or puts a person in fear of their safety.


Bullying

Our bullying model detects content that seeks to harm, intimidate, or coerce a person or group perceived as vulnerable.


Spam

Our spam model detects content that includes irrelevant or inappropriate messages sent to a large number of recipients.


Shaming

Our shaming model detects content that is intended to cause someone to feel ashamed or inadequate.


Fraud

Our fraud model detects deceptive or misrepresentative content that relates to payments or accounts and is typically illegal in nature.

Harmful Links & Phishing

Our harmful links and phishing model detects content that appears to be from reputable companies but is actually intended to induce individuals to reveal personal information, such as passwords and credit card numbers.

Off Topic

Our off-topic model detects content that drifts in subject matter from the intent of the platform.

For example, soliciting dates in the comments section of an e-commerce site.

Entity Recognition

Our entity recognition model recognizes specific entities being discussed within content.


Sentiment

Our sentiment model detects whether content is generally positive, negative, or neutral.


Emotions

Our emotions model detects the emotion that the author expresses in the text (i.e., Joy, Anger, Fear, or Sadness).