Overview
KafkaCode uses multiple detection methods to identify privacy issues, secrets, and compliance violations in your source code.
Detection Categories
- Secrets Detection: API keys, tokens, credentials
- PII Detection: Personally identifiable information
- Compliance Checks: GDPR, CCPA requirements
- Context Analysis: AI-powered semantic analysis
Secrets Detection
Critical Level Secrets
AWS Access Keys
Pattern: AKIA[0-9A-Z]{16}
Severity: Critical (100 points)
Private Keys
Pattern: -----BEGIN (RSA |EC )?PRIVATE KEY-----
Severity: Critical (100 points)
Stripe API Keys
Pattern: sk_live_[0-9a-zA-Z]{24}
Severity: Critical (100 points)
Database Credentials
Pattern: Password/credentials in connection strings
Severity: Critical (100 points)
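The three regular expressions listed for critical secrets can be tried out with a short script. This is a minimal sketch, not KafkaCode's implementation: the dictionary layout, the function name find_critical_secrets, and the line-by-line scan are assumptions made for illustration. Database credentials are left out because the section gives no concrete regular expression for them.

```python
import re

# Patterns copied from the section above. The names and structure here
# are illustrative assumptions, not part of KafkaCode's API.
CRITICAL_PATTERNS = {
    "AWS Access Key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Private Key": re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),
    "Stripe API Key": re.compile(r"sk_live_[0-9a-zA-Z]{24}"),
}
CRITICAL_POINTS = 100  # severity score documented for critical findings


def find_critical_secrets(text):
    """Return (name, line_number, points) for every line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in CRITICAL_PATTERNS.items():
            if pattern.search(line):
                findings.append((name, line_number, CRITICAL_POINTS))
    return findings
```

Calling find_critical_secrets on the contents of a file returns one tuple per match, which makes it easy to see which lines would need attention.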
High Level Secrets
OAuth Tokens
Pattern: GitHub, GitLab, and other OAuth tokens
Severity: High (50 points)
JWT Secrets
Pattern: jwt_secret, JWT_SECRET assignments
Severity: High (50 points)
API Keys
Pattern: Generic API key patterns
Severity: High (50 points)
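The JWT secret rule is described only as "jwt_secret, JWT_SECRET assignments", so the exact expression KafkaCode uses is not known. The sketch below assumes one plausible reading: the name jwt_secret (in any case) followed by = or : and a quoted literal. The regular expression and the function name find_jwt_secret_assignments are hypothetical.

```python
import re

# Hypothetical pattern: the name, then "=" or ":", then a quoted string literal.
JWT_SECRET_ASSIGNMENT = re.compile(
    r"""\bjwt_secret\b\s*[:=]\s*(['"])(?P<value>[^'"]+)\1""",
    re.IGNORECASE,
)


def find_jwt_secret_assignments(text):
    """Return (line_number, literal_value) for each hardcoded JWT secret assignment."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        match = JWT_SECRET_ASSIGNMENT.search(line)
        if match:
            findings.append((line_number, match.group("value")))
    return findings
```

Because the pattern requires a quoted literal right after the assignment operator, a line that reads the secret from an environment variable is not reported.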
PII Detection
Medium Level PII
Email Addresses
Pattern: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
Severity: Medium (10 points)
GDPR Consideration: Email addresses are PII under GDPR
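The email pattern can be applied directly with Python's re module. This is a small sketch for experimenting with the expression; the function name find_emails is an assumption and says nothing about how KafkaCode reports matches.

```python
import re

# Email pattern copied from the section above.
EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def find_emails(text):
    """Return every substring of text that matches the email pattern."""
    return EMAIL_PATTERN.findall(text)
```

Note that the pattern alone cannot tell a real address from a placeholder such as one on example.com, so a match is a candidate finding rather than proof of stored PII.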
Phone Numbers
Pattern: Various international formats
Severity: Medium (10 points)
CCPA Consideration: Phone numbers are personal information
Social Security Numbers
Pattern: \d{3}-\d{2}-\d{4}
Credit Card Numbers
Pattern: Luhn algorithm validated sequences
Severity: Critical (100 points)
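The Luhn check mentioned above is a standard checksum: starting from the rightmost digit, every second digit is doubled (subtracting 9 when the result exceeds 9), and the total must be divisible by 10. The sketch below implements that check. The 13 to 19 digit length limit is an assumption based on common card number lengths, not something the section states.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in number pass the Luhn checksum."""
    # Ignore separators such as spaces and dashes.
    digits = [int(ch) for ch in number if ch.isdigit()]
    # Assumed length bounds for card numbers; adjust if needed.
    if len(digits) < 13 or len(digits) > 19:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

The checksum filters out most random digit runs, which is why a sequence that passes it is treated as far more significant than a plain 16-digit match.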
Low Level PII
IP Addresses
Pattern: \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
Severity: Low (1 point)
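The IP address pattern accepts any four groups of one to three digits, so it also matches strings with out-of-range octets. The sketch below pairs the documented pattern with a validation step from Python's standard ipaddress module. The validation step and the function name find_ipv4_addresses are additions for illustration; the section does not say whether KafkaCode performs a similar check.

```python
import ipaddress
import re

# IP pattern copied from the section above.
IP_PATTERN = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")


def find_ipv4_addresses(text):
    """Return regex candidates that also parse as valid IPv4 addresses."""
    valid = []
    for candidate in IP_PATTERN.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            # Out-of-range octets and similar malformed candidates are skipped.
            continue
        valid.append(candidate)
    return valid
```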
URLs with Sensitive Paths
Pattern: URLs containing /api/, /admin/, /secret/
Severity: Low (1 point)
High Entropy Strings
KafkaCode detects strings with high randomness that might be secrets:
- Entropy > 4.5 and length > 16: Potential secret
- Entropy > 5.0 and length > 24: Likely secret
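The two thresholds above can be sketched as follows. The section does not define how entropy is measured; this example assumes Shannon entropy in bits per character, computed from character frequencies within the string. The function names and the returned labels are illustrative.

```python
import math
from collections import Counter


def shannon_entropy(value: str) -> float:
    """Shannon entropy of value in bits per character (assumed metric)."""
    if not value:
        return 0.0
    length = len(value)
    counts = Counter(value)
    return -sum((n / length) * math.log2(n / length) for n in counts.values())


def classify_entropy(value: str) -> str:
    """Apply the documented thresholds, checking the stricter rule first."""
    entropy = shannon_entropy(value)
    if entropy > 5.0 and len(value) > 24:
        return "likely secret"
    if entropy > 4.5 and len(value) > 16:
        return "potential secret"
    return "not flagged"
```

Under this definition a string with n distinct characters has at most log2(n) bits of entropy per character, so exceeding 4.5 requires at least 23 distinct characters and exceeding 5.0 requires at least 33. Short or repetitive values such as ordinary words therefore stay below both thresholds.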
Sensitive Keywords
Detection of sensitive data based on variable naming:
- Critical Keywords
- High Keywords
- Medium Keywords
Context-Aware Detection
The AI analyzer understands code context:
Example 1: Configuration vs Hardcoded
Example 2: Test Data vs Real Data
Example 3: Public vs Private
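To make the first two contrasts concrete, the sketch below separates values read from configuration, placeholder or test values, and hardcoded literals. It is a simple regex heuristic standing in for the AI analyzer, whose actual logic is not described here. The placeholder word list, the environment lookup patterns, and the function name classify_assignment are all assumptions.

```python
import re

# Assumed markers of configuration lookups in Python and JavaScript code.
ENV_LOOKUP = re.compile(r"os\.environ|os\.getenv|process\.env")
# Assumed hints that a literal is a placeholder or test value.
PLACEHOLDER = re.compile(r"example|changeme|dummy|test|your[_-]", re.IGNORECASE)
# A quoted literal on the right-hand side of an assignment.
QUOTED_LITERAL = re.compile(r"""=\s*(['"])(?P<value>.+?)\1""")


def classify_assignment(line: str) -> str:
    """Classify where the assigned value appears to come from."""
    if ENV_LOOKUP.search(line):
        return "configuration"
    match = QUOTED_LITERAL.search(line)
    if match is None:
        return "no literal"
    if PLACEHOLDER.search(match.group("value")):
        return "placeholder or test data"
    return "hardcoded"
```

Only the "hardcoded" result would be worth reporting in this sketch; the other outcomes correspond to cases the analyzer is described as recognizing and treating differently.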
Compliance-Specific Detection
GDPR Compliance
Personal Data
- Name, email, phone
- IP addresses
- Location data
- Cookies with PII
Special Categories
- Health data
- Biometric data
- Genetic data
- Religious/political views
CCPA Compliance
Personal Information
- Contact information
- Financial information
- Purchase history
- Browsing history
Identifiers
- Device IDs
- IP addresses
- Cookie IDs
- Account usernames
False Positive Reduction
KafkaCode uses several techniques to reduce false positives:
1. Context Analysis: AI understands if a value is a placeholder, test data, or real credential
2. Assignment Context: Only flags sensitive keywords when they’re being assigned values
3. Environment Variable Detection: Recognizes when values come from env vars or config files
4. Comment Analysis: Understands # TODO or # FIXME comments that mention sensitive data
Best Practices
Do's
- ✅ Use environment variables for all secrets
- ✅ Store credentials in secure vaults (AWS Secrets Manager, etc.)
- ✅ Use .env files with .gitignore
- ✅ Rotate secrets regularly
- ✅ Use different secrets for dev/staging/prod
Don'ts
- ❌ Never commit secrets to version control
- ❌ Don’t hardcode API keys or passwords
- ❌ Don’t store PII unnecessarily
- ❌ Don’t log sensitive information
- ❌ Don’t share secrets in plain text
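As an illustration of the first recommendation, a secret can be read from an environment variable at startup instead of being written into the source. This is a minimal sketch; the helper name get_required_secret is hypothetical, and the error message deliberately includes only the variable name, never its value.

```python
import os


def get_required_secret(name: str) -> str:
    """Read a secret from the environment, failing fast if it is missing."""
    value = os.environ.get(name)
    if not value:
        # Report the variable name only, so the secret itself is never logged.
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

Failing at startup when a variable is missing makes misconfiguration visible early and removes the temptation to fall back to a hardcoded default.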

