A database security story at a scale most tools weren't built for

14K+ Microservices. 300 Databases. 1 Bn+ Daily Accesses. Razorpay's Security Team Has Context on All of It.

The numbers are almost beside the point. What Razorpay's security team actually needed wasn't more data — it was meaning behind the data. An IP address tells you something happened. A service account name tells you which system. Neither tells you whether it was a DBA doing routine work, a compromised application, or an AI agent accessing customer PII it was never meant to touch. At this scale, that ambiguity isn't a gap in reporting. It's a breach waiting to happen.

“Aurva gives us real-time, identity-aware visibility into data access, helping us prevent unauthorized use and privilege escalation while meeting regulatory guidelines. As access becomes more agentic and ephemeral, we rely on Aurva to tie queries to identities and flag anomalies. It is a critical layer in keeping production safe at scale.”
-Ashwath K., Security Head, Razorpay

  • 1 Bn+ daily access requests processed
  • Automated compliance reporting and workflows
  • 99.99% uptime requirement
  • < 2% false positive rate


Summary

Razorpay, India's leading payment gateway, needed visibility into data exfiltration risks and database access patterns across its infrastructure.

Traditional monitoring solutions couldn't provide 100% visibility into database operations and network connections without impacting performance.

After deploying Aurva's eBPF-based monitoring, Razorpay achieved:

  • ✅ ~1 billion database queries and network calls monitored per day
  • ✅ 14,000+ microservices and 300+ databases covered
  • ✅ No performance impact on payment processing systems
  • ✅ Real-time detection of database access and network patterns
  • ✅ Automated compliance reporting
“Partnering with Aurva since the early days has been a genuinely collaborative experience: one where the platform has grown alongside our needs. What began as solving our egress monitoring challenges has evolved into something far more powerful. Today, Aurva gives us real-time visibility into every data access event: across databases, services, and AI agents tied to real identities and enriched with behavioral context. The addition of DAM, DSPM, and data flow capabilities means we're not just audit-ready; we're proactively catching anomalies, enforcing least privilege, and closing security gaps before they become incidents.”
— Manikandan Rajappan, Staff Security Engineer, Razorpay

About Company:

Razorpay is India's leading payment solutions provider, powering payments for over 5 million businesses. Processing 7 billion+ transactions annually, Razorpay operates one of Asia's largest payment infrastructures.

Industry:

Fintech (Payments & Banking)

Company Size:

1,000–5,000 employees (approx.)

Region:

Primarily India & international expansion

Environment:

PostgreSQL, MySQL, SQL Server & more

Product:

  • Data Activity Monitoring
  • AI Security
  • Data Security Posture Management
  • Data Flow Management
  • External Threat Monitoring

Integrations:

Slack

The Challenge: Three Visibility Gaps

Razorpay's security team faced three interconnected problems:

| Gap | What they couldn't see | Business impact |
| --- | --- | --- |
| Network Egress | Which apps connected to external domains, data sent to third parties | Incidents took days to investigate, risky migrations |
| Data Access Context | Who accessed what data, when, and whether human or application | DBA activities unmonitored, no unauthorized access detection |
| Sensitive Data Access | Who accessed PII/PCI data and whether it was exfiltrated | Compliance gaps, data governance unenforceable |

Traditional solutions didn't work:

  • Agent-based monitoring: 5-10% CPU overhead, incomplete coverage
  • Proxy-based solutions: 5-10ms latency per query, single point of failure
  • Native database audit logs: 20-40% database CPU, terabytes of logs daily
  • Traditional SIEM: IP addresses are ephemeral, missing identity attribution

The Requirement: Full visibility with no performance impact, no application changes, and 100% coverage.

The Aurva Solution: eBPF-Based Monitoring

Why eBPF

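eBPF (extended Berkeley Packet Filter) lets sandboxed programs run inside the Linux kernel, attached to hooks such as system calls and network events. Aurva's collectors use these hooks to observe database and network traffic from the kernel, so they sit outside the application's request path.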

Out-of-Line Architecture:

Application → Database (zero latency)
            ↓
    [eBPF in Kernel] (observes asynchronously)
            ↓
    Processing Pipeline → Elasticsearch

Benefits for Razorpay:

  • Zero latency: Payment processing path is not affected
  • Can't be bypassed: Kernel-level visibility
  • 100% coverage: Captures all database protocols and network connections automatically
  • Automatic context: Kernel provides process ID, user, container, pod metadata
  • No application changes: Deploy once, monitor everything
“By leveraging eBPF, we achieved complete visibility into outbound and database calls across all microservices without the latency. This out-of-line approach ensures zero impact on our payment processing path, providing node-agnostic monitoring with no performance trade-offs.”
— Manikandan Rajappan, Staff Security Engineer, Razorpay

Solution 1: Network Egress Visibility

Aurva provided kernel-level visibility into every outbound connection:

  • Every outbound connection captured: From every pod, container, and VM
  • Application identity resolution: Which microservice, which version, which pod
  • Domain and endpoint tracking: Map of third-party integrations
  • Anomaly detection: Traffic patterns flagged in real-time
  • Port and protocol analysis: Identify unexpected network behavior
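
The anomaly detection idea in the list above can be illustrated with a per-service baseline of known external domains. This is a minimal sketch and not Aurva's implementation: the `EgressBaseline` class, the service name, and the domains are hypothetical, and a real system would also weigh ports, volume, and timing.

```python
from collections import defaultdict


class EgressBaseline:
    """Tracks which external domains each service is expected to contact."""

    def __init__(self):
        # service name -> set of lower-cased domains seen during baselining
        self._known = defaultdict(set)

    def learn(self, service, domain):
        """Record a domain as expected for this service."""
        self._known[service].add(domain.lower())

    def check(self, service, domain):
        """Return an alert dict if the domain is new for this service, else None."""
        domain = domain.lower()
        if domain in self._known.get(service, set()):
            return None
        return {
            "service": service,
            "domain": domain,
            "reason": "unexpected external domain",
        }


baseline = EgressBaseline()
baseline.learn("payments-api", "api.partner-bank.example")  # hypothetical names
alert = baseline.check("payments-api", "unknown-host.example")
print(alert)
```

Keeping the baseline per service rather than global means a domain that is routine for one microservice still raises an alert when a different microservice starts calling it.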

Real Use Cases at Razorpay

Suspicious Domain Detection: Real-time alerts when applications connect to unexpected external domains, enabling investigation of potential data exfiltration or supply chain attacks.

Third-Party Migration: During a Redshift endpoint migration, Aurva identified every application connecting to the old endpoint, enabling a safe cutover with zero downtime.

NAT Analysis: Visibility into which applications used which NAT gateways enabled cost optimization.

  • 50 million network connections monitored per day
  • Migration risk reduced
  • Cost optimized via NAT analysis
  • Time reduced for security investigations

Solution 2: Database Activity Coverage

Aurva captured every database query with full identity context:

  • Every database query captured: PostgreSQL, MySQL, MongoDB, Redis, and more
  • Identity resolution: Distinguish human users (DBAs, engineers) from application identities
  • Custom policy engine: Set environment-specific alerts:
    • "Alert on any DROP TABLE in production"
    • "Flag sensitive table access by non-approved applications"
    • "Detect unusual query patterns from specific users"
  • Real-time alerting: Detection to alert in <1 minute
  • Full audit trail: Every query, every user, every result
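
The example policies quoted above can be sketched as small predicates over a query event. This is an illustration under stated assumptions, not Aurva's policy language: the `QueryEvent` fields, the policy names, and the `SENSITIVE_TABLES` and `APPROVED_APPS` sets are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueryEvent:
    environment: str    # e.g. "production"
    identity: str       # resolved user or service name
    identity_type: str  # "human" or "application"
    table: str
    sql: str


@dataclass
class Policy:
    name: str
    matches: Callable[[QueryEvent], bool]


DROP_TABLE = re.compile(r"\s*DROP\s+TABLE\b", re.IGNORECASE)
SENSITIVE_TABLES = {"customers", "cards"}  # hypothetical
APPROVED_APPS = {"billing-service"}        # hypothetical

POLICIES = [
    # "Alert on any DROP TABLE in production"
    Policy(
        "drop-table-in-production",
        lambda e: e.environment == "production"
        and DROP_TABLE.match(e.sql) is not None,
    ),
    # "Flag sensitive table access by non-approved applications"
    Policy(
        "sensitive-table-non-approved-app",
        lambda e: e.table in SENSITIVE_TABLES
        and e.identity_type == "application"
        and e.identity not in APPROVED_APPS,
    ),
]


def evaluate(event: QueryEvent) -> List[str]:
    """Return the names of every policy the event violates."""
    return [p.name for p in POLICIES if p.matches(event)]


event = QueryEvent("production", "dba_anna", "human", "orders", "DROP TABLE orders")
print(evaluate(event))
```

The third example policy, unusual query patterns from specific users, needs a behavioral baseline rather than a static predicate, so it is left out of this sketch.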

Real Use Cases at Razorpay

DBA Activity Monitoring: Track every DBA operation in production: who ran what, when, and on which database. Used for insider threat detection and compliance.

Application Behavior Baseline: Understand normal query patterns per application and detect when an application behaves differently (a potential compromise or bug).

User + Critical Application Monitoring: Monitoring policies for privileged users accessing sensitive databases, with alerting on unusual patterns.

  • ~1 billion queries monitored per day
  • <1% CPU overhead used to monitor queries
  • No impact on application performance
  • DBA investigations are more effective

Solution 3: PII/PCI Discovery with Access Tracking

Aurva connected sensitive data discovery with real-time access tracking:

  • Sensitive data discovery: Automatic scanning of databases and datalakes for PII/PCI data
  • Real-time correlation: Connect sensitive data with:
    • Who accessed it (user or application identity)
    • When it was accessed (timestamps, frequency)
    • What queries touched sensitive columns
    • Whether sensitive data left the infrastructure (network egress correlation)
  • DSPM + Datalake: Continuous scanning of both operational databases and analytics datalakes

Real Use Cases at Razorpay

Compliance Reporting: Automated, audit-ready reports showing which applications access PII/PCI data, the frequency and patterns of that access, and any unusual access.

Data Onboarding: When new applications or datasets are deployed, Aurva automatically discovers sensitive data, baselines normal access patterns, and sets up monitoring policies.

Exfiltration Detection: Correlate sensitive data access with network egress in real time. A finding such as "Application X queried customer PII and made an outbound connection to an unknown domain" triggers an investigation.
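
The exfiltration correlation described here can be sketched as a time-window join between PII access events and egress events from the same application. This is a simplified illustration, not Aurva's detection logic: the event types, the 60-second window, and the nested-loop join are assumptions chosen for readability over throughput.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class PiiAccess:
    app: str
    ts: float    # seconds since epoch
    column: str


@dataclass(frozen=True)
class Egress:
    app: str
    ts: float    # seconds since epoch
    domain: str


def correlate(
    accesses: Iterable[PiiAccess],
    egresses: Iterable[Egress],
    known_domains: Set[str],
    window_s: float = 60.0,  # hypothetical correlation window
) -> List[Tuple[PiiAccess, Egress]]:
    """Pair each PII access with egress from the same app to an
    unknown domain that happens within window_s seconds afterwards."""
    egress_list = list(egresses)
    findings = []
    for a in accesses:
        for e in egress_list:
            same_app = e.app == a.app
            unknown = e.domain not in known_domains
            in_window = 0 <= e.ts - a.ts <= window_s
            if same_app and unknown and in_window:
                findings.append((a, e))
    return findings


accesses = [PiiAccess("app-x", 100.0, "customers.email")]
egresses = [Egress("app-x", 130.0, "unknown-host.example")]
print(correlate(accesses, egresses, known_domains={"api.partner.example"}))
```

At production volume the nested loop would be replaced by a streaming join keyed on application identity, but the matching condition stays the same.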

  • 261,749 columns identified with PII/PCI data
  • 80% of sensitive data access now monitored
  • Automated compliance reporting

Scaling to Billions: One Challenge, Three Battlegrounds

Moving from pilot to production exposed a single unified challenge: sustaining real-time monitoring guarantees across a pipeline processing 1 billion database queries and 50 million network connections per day. What looked like four separate symptoms (memory exhaustion, indexing lag, runaway cost, and degraded alert latency) turned out to be the same problem expressed across three layers of the HILL architecture.

After: Architecture at Scale

┌─────────────────────────────────────────────────┐
│  Razorpay Infrastructure                        │
│  (20 K8s clusters, 14000+ services, 300+ DBs)   │
└──────────────┬──────────────────────────────────┘
               │
       [eBPF Collectors]
     (DaemonSet on each node)
     - <2% CPU overhead
     - Kernel-level capture
     - Smart filtering
     - Zero-copy transfer
               │
               ↓ (gRPC, aggregated)
               │
     ┌─────────┴─────────┐
     │                   │
     │  [Processor]      │  [Processor]
     │  (Auto-scaling)   │  (Multi-region)
     │                   │
     └─────────┬─────────┘
               │
               ↓
     ┌─────────────────────┐
     │   Storage Tiering   │
     │                     │
     │  Hot:  Elasticsearch│
     │        (7 days)     │
     │  Warm: ES + S3      │
     │        (8-90 days)  │
     │  Cold: S3           │
     │        (90+ days)   │
     └─────────────────────┘
               │
               ↓
     ┌─────────────────────┐
     │    Alert Engine     │
     │  + Compliance UI    │
     └─────────────────────┘
eBPF Collectors → [ Processor ] → [ Storage ] → [ Alert Engine ]
  • Processor - Enriches raw kernel events with identity context (pod, service, DB user) and routes them downstream.
  • Storage - Tiered persistence: Hot (Elasticsearch, 7 days) → Warm → Cold (S3, 90+ days).
  • Alert Engine - Evaluates security policies in real time and dispatches notifications.
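
The tiering rule can be written as a function of event age. This is a sketch only: the day boundaries come from the diagram above, but how day 90 itself is classified is an assumption (treated as warm here), and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

HOT_MAX_DAYS = 7    # Hot: Elasticsearch, 7 days
WARM_MAX_DAYS = 90  # Warm: ES + S3, days 8-90 (day 90 as warm is an assumption)


def storage_tier(event_time: datetime, now: datetime) -> str:
    """Map an event's age in whole days to a storage tier."""
    age_days = (now - event_time).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"
    if age_days <= WARM_MAX_DAYS:
        return "warm"
    return "cold"


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
for days in (3, 8, 91):
    print(days, storage_tier(now - timedelta(days=days), now))
```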

The three layers are tightly coupled—a struggling Processor backs up Storage; lagging Storage stales the Alert Engine. The pipeline fails or succeeds as a system.

Before: The Pain Points

| Layer | Problem | Impact |
| --- | --- | --- |
| Processor | 10K queue × 75KB/event; 2–4 DB lookups per log; unbounded PII buffers | 8GB memory, OOM kills, silent coverage gaps |
| Storage | Individual writes; monolithic index; full fidelity = $50K/month | 2-hour indexing lag, unsustainable cost |
| Alert Engine | Stale data from lagging storage; no deduplication | Undefined latency, alert storms |

What Changed When the Scale Hit Billions

  • Processor memory: 8GB → 2GB, zero OOM kills, 12,000 events/sec sustained
  • Indexing lag: 2 hours → <30 seconds
  • Storage cost: 80% reduction at ~1TB indexed per day
  • Alert latency: <1 minute end-to-end, false positive rate <2%
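
A quick back-of-the-envelope check shows how the sustained Processor rate lines up with the daily query volume. The calculation assumes traffic is spread evenly across the day, which real payment traffic is not, so treat it as an average and not a peak requirement.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def avg_events_per_sec(events_per_day: int) -> float:
    """Average rate needed to keep up with a daily event volume."""
    return events_per_day / SECONDS_PER_DAY


def events_per_day(rate_per_sec: int) -> int:
    """Daily capacity at a constant sustained rate."""
    return rate_per_sec * SECONDS_PER_DAY


print(round(avg_events_per_sec(1_000_000_000)))  # 11574
print(events_per_day(12_000))                    # 1036800000
```

In other words, 1 billion queries per day averages out to roughly 11,600 events per second, and 12,000 events per second sustained corresponds to about 1.04 billion events per day.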

How We Solved It

Processor Layer

Three fixes in concert: expanded the worker pool (10 → 50 workers) while cutting queue depth (10K → 1K) with upstream backpressure, so the system slows gracefully instead of accumulating silently; replaced per-event synchronous permission lookups (2–4 DB round-trips each) with a TTL-refreshed in-memory cache; and added TTL-based cleanup for PII log buffers that had been growing indefinitely.
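
Two of these fixes can be sketched with the Python standard library: a bounded queue whose producer is told to slow down when it fills, and a TTL cache in front of the permission lookup. This is an illustration, not the production Processor: `TTLCache`, `submit`, and the loader function are hypothetical, and the cache here reloads lazily on expiry, whereas the source describes a TTL-refreshed cache without giving the mechanism.

```python
import queue
import time


def queue_memory_mb(depth: int, event_kb: int = 75) -> float:
    """Worst-case queue footprint in decimal MB at 75KB per event."""
    return depth * event_kb / 1000


class TTLCache:
    """In-memory cache whose entries expire after ttl_s seconds."""

    def __init__(self, ttl_s, loader, clock=time.monotonic):
        self._ttl = ttl_s
        self._loader = loader   # e.g. the permission lookup against the DB
        self._clock = clock
        self._entries = {}      # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]       # fresh entry, no DB round-trip
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def purge_expired(self) -> int:
        """Drop expired entries so the cache cannot grow without bound."""
        now = self._clock()
        expired = [k for k, (exp, _) in self._entries.items() if exp <= now]
        for k in expired:
            del self._entries[k]
        return len(expired)


def submit(q: queue.Queue, event, timeout_s: float = 0.05) -> bool:
    """Enqueue an event, or return False so the caller can back off."""
    try:
        q.put(event, timeout=timeout_s)
        return True
    except queue.Full:
        return False


print(queue_memory_mb(10_000), queue_memory_mb(1_000))  # 750.0 75.0
```

The arithmetic shows why queue depth mattered: at 75KB per event, a full 10K queue holds about 750MB, while a 1K queue is capped at about 75MB. The same TTL-based purge pattern applies to the PII log buffers.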

Storage Layer

Switched from individual writes to bulk indexing—5,000 events per _bulk call—which alone collapsed the 2-hour indexing lag to under 30 seconds. Partitioned data into daily indices to isolate query scope and simplify retention. Applied risk-weighted sampling: routine queries are sampled, but PII/PCI-touching queries, all write operations, and DBA commands are captured at 100% fidelity regardless of tier. Result: 80% cost reduction with zero blind spots where they matter.
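
The storage changes can be sketched as three small functions: a sampling decision that never drops high-risk events, a daily index name, and a batcher that builds newline-delimited bodies for the Elasticsearch `_bulk` endpoint. This is a sketch under assumptions: the event field names, the `db-activity` index prefix, and the 10% sample rate are hypothetical, since the source does not state them.

```python
import json
import random
from datetime import datetime, timezone

BULK_SIZE = 5000  # events per _bulk call, as described above


def must_keep(event: dict) -> bool:
    """PII/PCI-touching queries, writes, and DBA commands are never sampled out."""
    return bool(
        event.get("touches_pii") or event.get("is_write") or event.get("is_dba")
    )


def should_index(event: dict, sample_rate: float = 0.1, rng=random.random) -> bool:
    """Keep every high-risk event; keep routine events at sample_rate (hypothetical)."""
    return must_keep(event) or rng() < sample_rate


def daily_index(ts: float, prefix: str = "db-activity") -> str:
    """Name of the daily index for a UTC timestamp (prefix is hypothetical)."""
    day = datetime.fromtimestamp(ts, tz=timezone.utc)
    return f"{prefix}-{day:%Y.%m.%d}"


def bulk_bodies(events, batch_size: int = BULK_SIZE):
    """Yield NDJSON bodies: one action line plus one document line per event."""
    lines, count = [], 0
    for event in events:
        lines.append(json.dumps({"index": {"_index": daily_index(event["ts"])}}))
        lines.append(json.dumps(event))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"  # _bulk bodies must end with a newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"


events = [{"ts": 0, "n": i} for i in range(12)]
print(len(list(bulk_bodies(events, batch_size=5))))  # 3
```

Each yielded body would be sent as a single `POST /_bulk` request. Routing every document to a daily index by its own timestamp means retention can be handled by dropping whole indices.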

Alert Engine

Pre-compiled policies into efficient pattern-matching automata (eliminating runtime interpretation); cached evaluation results keyed on normalized query fingerprints so identical queries bypass the engine entirely; ran multiple load-balanced instances for horizontal throughput; and added a deduplication layer that groups high-frequency similar events into a single aggregated alert. Alert latency dropped to under 1 minute; the false positive rate held below 2%.
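
The fingerprint cache and deduplication steps can be sketched as follows. This is a simplified illustration, not Aurva's engine: the normalization rules are a rough approximation of query fingerprinting, and `evaluate_fingerprint` is a hypothetical stand-in for the pre-compiled policy automata.

```python
import re
from collections import Counter
from functools import lru_cache

_STRING = re.compile(r"'(?:[^']|'')*'")      # single-quoted SQL string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # standalone numeric literals
_SPACE = re.compile(r"\s+")


def fingerprint(sql: str) -> str:
    """Replace literals with '?' and normalize whitespace and case."""
    s = _STRING.sub("?", sql)
    s = _NUMBER.sub("?", s)
    return _SPACE.sub(" ", s).strip().lower()


@lru_cache(maxsize=100_000)
def evaluate_fingerprint(fp: str) -> bool:
    """Hypothetical policy check; cached so repeated fingerprints skip evaluation."""
    return fp.startswith("drop table")


def aggregate_alerts(queries):
    """Group violating queries by fingerprint into one alert with a count."""
    counts = Counter()
    for sql in queries:
        fp = fingerprint(sql)
        if evaluate_fingerprint(fp):
            counts[fp] += 1
    return [{"fingerprint": fp, "count": n} for fp, n in counts.items()]


print(fingerprint("SELECT * FROM users WHERE id = 42"))
print(aggregate_alerts(["DROP TABLE t1", "drop table  t1", "SELECT 1"]))
```

Because the cache is keyed on the fingerprint and not the raw SQL, queries that differ only in literal values share one evaluation result, and the aggregation turns a burst of identical violations into a single alert.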

“I strongly recommend Aurva for their unwavering focus on solving customer challenges. Their team consistently delivers features ahead of expectations, often in half the anticipated time. In today’s AI-driven era, where achieving complete visibility into agent communication is increasingly complex, Aurva has already embedded solutions to address this in their core DNA. I look forward to continuing this successful partnership with Razorpay in the years to come.”
— Manikandan Rajappan, Staff Security Engineer, Razorpay

Key Design Decisions:

  • Horizontal Scaling: Processors scale based on load (Kubernetes HPA)
  • Storage Tiering: 80% cost reduction vs all-hot storage
  • Smart Sampling: 30% processing reduction

Coverage and Scale

Daily Monitoring:

  • 1 billion database queries
  • 50 million network connections
  • 14,000+ microservices monitored
  • 300+ databases covered

Performance:

  • Application latency impact: 0ms
  • Database CPU overhead: 0% (native audit logs would be 20-40%)
  • eBPF collector CPU: <1% per node
“After extensive trial and error to find the ideal infrastructure, production profiling with real traffic became essential for uncovering performance bottlenecks that only emerge at scale.”
— Manikandan Rajappan, Staff Security Engineer, Razorpay

The Ongoing Partnership

Current Focus:

  • Advanced threat detection: Correlate multiple signals (database access + network egress + user behavior)
  • Data lineage tracking: Understand data flow from source to destination

Future Roadmap:

  • Query performance optimization using visibility data
  • Capacity planning based on access patterns

USA

AURVA INC. 1241 Cortez Drive, Sunnyvale, CA, USA - 94086

India

Aurva, 4th Floor, 2316, 16th Cross, 27th Main Road, HSR Layout, Bengaluru – 560102, Karnataka, India

© 2025 Aurva. All rights reserved.