Guides
PII Protection
NeuronEdge uses a dual-engine protection system combining 102 optimized regex patterns with NER-based machine learning to detect and redact 105+ types of personally identifiable information.
How Protection Works
1. Regex Engine (Primary)
102 optimized regex patterns run in Rust/WASM for sub-millisecond detection. Patterns are designed for high precision to minimize false positives while catching common PII formats like SSNs, credit cards, emails, and phone numbers.
2. NER Engine (Supplementary)
Named Entity Recognition using Workers AI (Llama 3.2) detects contextual entities like names, organizations, and locations that don't follow fixed patterns. This catches PII that regex alone would miss.
3. Deduplication
Results from both engines are merged and deduplicated to ensure each PII instance is only redacted once, preventing double-replacement issues.
Entity Categories
NeuronEdge supports 105+ entity types organized into 8 categories:
Identity
Personal identifiers and government-issued IDs
PERSONSSNPASSPORTDRIVERS_LICENSEDOBNATIONAL_IDTAX_IDContact
Contact information and addresses
EMAILPHONEADDRESSZIP_CODEPO_BOXFinancial
Financial account numbers and payment data
CREDIT_CARDBANK_ACCOUNTIBANSWIFTROUTING_NUMBERCVVMedical
Protected health information (PHI)
MEDICAL_RECORDHEALTH_PLANNPIDEA_NUMBERDIAGNOSISPRESCRIPTIONLocation
Geographic and network location data
ADDRESSCOORDINATESIP_ADDRESSMAC_ADDRESSGEOLOCATIONTechnical
Secrets, credentials, and API keys
API_KEYPASSWORDAWS_KEYGITHUB_TOKENJWTPRIVATE_KEYOrganization
Organization names and business identifiers
ORGGPECOMPANY_IDEINDUNSCompliance
Regulatory-specific identifiers
GDPR_IDCCPA_IDHIPAA_IDRedaction Formats
Choose how detected PII is replaced before sending to the LLM provider:
Token Format
All tiersReplaces PII with type-based placeholders. Simple, readable, and reversible. The LLM sees the entity type but not the value.
// Input
"My name is John Smith and my SSN is 123-45-6789"
// Sent to LLM
"My name is [PERSON] and my SSN is [SSN]"
// Response (restored)
"Hello John Smith, I see you provided your SSN..."Hash Format
Professional+Generates deterministic hash-based placeholders. The same input always produces the same hash, enabling consistent references across conversations.
// Input
"Contact John Smith at john@example.com"
// Sent to LLM
"Contact [HASH:a1b2c3d4] at [HASH:e5f6g7h8]"
// The same values always produce the same hashes
// Useful for multi-turn conversationsSynthetic Format
Professional+Generates realistic fake data that maintains semantic meaning. The LLM sees believable placeholder data that looks real but isn't.
// Input
"My name is John Smith, SSN 123-45-6789, email john@example.com"
// Sent to LLM
"My name is Sarah Johnson, SSN 987-65-4321, email sarah.j@demo.org"
// Response uses synthetic data, then restored to originalDetection Modes
Balance between speed and thoroughness with detection modes:
| Mode | Engines | Latency | Use Case |
|---|---|---|---|
| real-time | Regex only | <1ms | Ultra-low latency, streaming |
| balanced | Regex + lightweight NER | ~5ms | Recommended default |
| thorough | Regex + full NER | ~15ms | Maximum accuracy |
Configuring Protection
Set redaction preferences per-request using headers:
curl -X POST https://api.neuronedge.ai/v1/openai/chat/completions \
-H "Authorization: Bearer ne_live_..." \
-H "X-Provider-API-Key: sk-..." \
-H "X-NeuronEdge-Format: synthetic" \
-H "X-NeuronEdge-Mode: balanced" \
-H "X-NeuronEdge-Entities: PERSON,SSN,EMAIL" \
-d '{"model": "gpt-5.2", "messages": [...]}'Or configure defaults in your policy to avoid per-request headers.