Getting Started with LLM Proxy

HIPAA-compliant AI in 5 minutes

~5 min setup
How it works: RedactiPHI's LLM Proxy sits between your application and LLM providers (OpenAI, Anthropic, etc.). PHI is automatically de-identified before reaching the LLM, then re-identified in responses. Just change your base URL - your existing code works unchanged.
1. Choose Your Integration Method
Two ways to get started

Web Chat Interface

Use our built-in HIPAA-compliant chat. Just bring your API key - no code required.

Open Chat

API Integration

Drop-in replacement for OpenAI/Anthropic SDKs. Just change your base URL.

See Code Examples
2. Integrate with Your Code
Change one line - that's it
# Before (direct to OpenAI)
import openai

client = openai.OpenAI(
    api_key="sk-..."
)

# After (PHI-protected via RedactiPHI)
import openai

client = openai.OpenAI(
    api_key="sk-...",  # Your OpenAI key
    base_url="https://llm.redact.health/v1/proxy/openai"  # Add this line!
)

# Use exactly as before - PHI protection is automatic
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": "Summarize: Patient John Smith, DOB 03/15/1958..."
    }]
)
# Response contains original PHI - automatically re-identified!
print(response.choices[0].message.content)
// Before (direct to OpenAI)
import OpenAI from 'openai';

const openai = new OpenAI({
    apiKey: 'sk-...'
});

// After (PHI-protected via RedactiPHI)
import OpenAI from 'openai';

const openai = new OpenAI({
    apiKey: 'sk-...',  // Your OpenAI key
    baseURL: 'https://llm.redact.health/v1/proxy/openai'  // Add this!
});

// Use exactly as before
const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{
        role: 'user',
        content: 'Summarize: Patient John Smith, DOB 03/15/1958...'
    }]
});
console.log(response.choices[0].message.content);
# Direct API call with PHI protection
curl https://llm.redact.health/v1/proxy/openai/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-..." \
  -d '{
    "model": "gpt-4",
    "messages": [{
      "role": "user",
      "content": "Summarize: Patient John Smith, DOB 03/15/1958..."
    }]
  }'
That's the only code change needed. Your PHI is now automatically de-identified before it reaches OpenAI and re-identified in the response. The LLM never sees real patient data.
3. Supported Providers
Works with all major LLM providers

  • OpenAI: /v1/proxy/openai (GPT-4, GPT-3.5, embeddings)
  • Anthropic: /v1/proxy/anthropic (Claude 3.5, Claude 3 Opus/Sonnet/Haiku)
  • Azure OpenAI: /v1/proxy/azure/{deployment} (your Azure-hosted GPT models)
  • Google Gemini: /v1/proxy/gemini/{model} (Gemini Pro, Gemini Ultra)
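
For example, switching the Anthropic Python SDK to the proxy is the same one-line base URL change shown above for OpenAI. The snippet below is a minimal sketch: it assumes the /v1/proxy/anthropic endpoint accepts the standard Anthropic Messages API and that your Anthropic key is passed through (BYOK); the model name is only an example.

# PHI-protected Anthropic via RedactiPHI (sketch)
import anthropic

client = anthropic.Anthropic(
    api_key="sk-ant-...",  # Your Anthropic key
    base_url="https://llm.redact.health/v1/proxy/anthropic"  # Add this line!
)

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # example model name
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": "Summarize: Patient John Smith, DOB 03/15/1958..."
    }]
)
print(response.content[0].text)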

4. Verify It's Working
Quick checklist to confirm setup
  • Try the Chat Interface: Go to /chat, enter your API key, and send a message with PHI (e.g., "Patient John Smith...").
  • Check PHI Detection: Click "View De-ID" in chat to see the tokenized version sent to the LLM.
  • Verify Re-identification: The response should contain original patient names, not tokens like [NAM_abc123] (a programmatic check is sketched below).
  • Run the Test Suite: Go to /llm-proxy and click "Run Tests" to verify your configuration.
Security Note: For BYOK (Bring Your Own Key) mode, your API keys are used directly for LLM calls and are not stored on our servers. For managed keys, configure them in the Keys section.
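
If you prefer to verify from code rather than the chat UI, the sketch below sends a prompt containing PHI through the proxy and checks that the answer comes back re-identified (the original name, not a token). The prompt and model are just example values.

# Programmatic check: the re-identified response should contain the original
# name, not a token like [NAM_abc123]
import openai

client = openai.OpenAI(
    api_key="sk-...",  # BYOK: your own OpenAI key
    base_url="https://llm.redact.health/v1/proxy/openai"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": "Repeat the patient's name back to me: Patient John Smith, DOB 03/15/1958."
    }]
)

answer = response.choices[0].message.content
assert "John Smith" in answer, "expected the re-identified name in the response"
assert "[NAM_" not in answer, "tokens should not appear in the re-identified response"
print("Re-identification OK:", answer)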

Ready to Try It?

Experience HIPAA-compliant AI with our interactive demo or jump straight into the chat.

Common Questions
What PHI types are detected?

Names, dates of birth, ages, addresses, phone numbers, SSNs, MRNs, email addresses, facility names, provider names, and more. See our full PHI type list in Policies.

Does this work with streaming responses?

Yes! Streaming is fully supported. PHI is re-identified in real-time as chunks arrive.
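
For example, with the OpenAI Python SDK you enable streaming the usual way; this sketch assumes the proxy forwards OpenAI's standard streaming format, with each chunk already re-identified when it reaches you.

# Streaming through the proxy: pass stream=True as usual
import openai

client = openai.OpenAI(
    api_key="sk-...",
    base_url="https://llm.redact.health/v1/proxy/openai"
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": "Summarize: Patient John Smith, DOB 03/15/1958..."
    }],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)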

What if the LLM hallucinates PHI?

The LLM only sees tokens like [NAM_abc123], not real data. It can't leak PHI it never had access to. If it invents a name, it won't match any token and won't be "re-identified" to anything.
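
To make that concrete, here is a simplified illustration of the idea, not the service's actual implementation (only the [NAM_...] token pattern is borrowed from above): re-identification maps back only tokens that were issued during de-identification, so an invented token resolves to nothing.

# Illustrative sketch only: known tokens map back, invented tokens do not
token_map = {"[NAM_abc123]": "John Smith"}  # created during de-identification

def reidentify(text: str, mapping: dict) -> str:
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

print(reidentify("Summary for [NAM_abc123]: condition stable.", token_map))
# -> "Summary for John Smith: condition stable."

print(reidentify("Also mentions [NAM_zzz999], a token the LLM invented.", token_map))
# -> unchanged: the made-up token matches nothing, so no real PHI is produced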

Is there latency overhead?

Minimal - typically 50-150ms for de-identification and re-identification combined. The LLM call itself dominates total response time.
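
If you want to measure the overhead yourself, one rough approach is to time the same request sent directly to OpenAI and via the proxy; the difference approximates the de-identification plus re-identification cost. This is only a sketch, and single-shot timings will vary.

# Rough latency comparison: direct vs. via the proxy
import time
import openai

PROMPT = [{"role": "user", "content": "Summarize: Patient John Smith, DOB 03/15/1958..."}]

def timed_call(base_url=None):
    client = openai.OpenAI(api_key="sk-...", base_url=base_url)
    start = time.perf_counter()
    client.chat.completions.create(model="gpt-4", messages=PROMPT)
    return time.perf_counter() - start

direct = timed_call()
proxied = timed_call("https://llm.redact.health/v1/proxy/openai")
print(f"direct: {direct:.2f}s  proxy: {proxied:.2f}s  overhead: ~{(proxied - direct) * 1000:.0f}ms")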