
Federated Learning and Differential Privacy for LLMs: Privacy-Preserving AI at Scale

A comprehensive guide to privacy-preserving machine learning techniques for LLMs covering federated learning architectures, differential privacy mechanisms, DP-LoRA fine-tuning, and production strategies for training on sensitive data without compromising privacy.


Large language models are data hungry—the more training data, the better the model. But much of the most valuable data is private: medical records, financial transactions, personal communications, proprietary business documents. Traditional centralized training requires collecting this data in one place, creating privacy risks and often violating regulations. Federated learning and differential privacy offer a path forward: training powerful models on distributed private data while providing mathematical guarantees that individual data cannot be extracted.

The Privacy Challenge in LLM Training

Training data shapes model behavior. A model trained on medical literature can assist doctors; one trained on legal documents can help lawyers. But accessing domain-specific data raises significant challenges:

Regulatory constraints: HIPAA (healthcare), GDPR (EU personal data), CCPA (California), and industry-specific regulations restrict data collection and processing. Centralizing data often violates these requirements.

Competitive sensitivity: Organizations are reluctant to share proprietary data that represents competitive advantage. Training on combined data would require trusting competitors.

User privacy expectations: Users generate valuable data through product interactions but expect privacy. Using this data for training without consent erodes trust.

Memorization risks: LLMs memorize portions of training data and can regurgitate them in outputs. A model trained on private data might leak that data in responses.

Attack surfaces: Centralized data creates attractive targets. Model weights themselves can leak training data through extraction attacks.

Traditional approaches to these challenges—anonymization, data use agreements, secure enclaves—provide limited protection. Anonymization is often reversible; agreements don't prevent breaches; enclaves add complexity without privacy guarantees. We need fundamentally different approaches.

Federated Learning: Distributed Training

Federated learning enables training on distributed data without centralizing it. The core idea is simple: instead of bringing data to the model, bring the model to the data.

The Federated Learning Paradigm

In standard training:

  1. Collect data at central server
  2. Train model on central server
  3. Deploy model

In federated learning:

  1. Distribute model to data sources (clients)
  2. Clients train on local data
  3. Clients send model updates (not data) to server
  4. Server aggregates updates into improved model
  5. Repeat

The server never sees raw data—only model updates (gradients or weight differences). Data remains on the devices or organizations that generated it.

Federated Averaging (FedAvg)

The foundational federated learning algorithm is Federated Averaging:

Server initialization: Initialize global model parameters $\theta^{(0)}$

For each round $t = 1, 2, \ldots$:

  1. Server selects a subset of clients $S_t$
  2. Server sends the current model $\theta^{(t)}$ to the selected clients
  3. Each client $k$ in $S_t$:
    • Trains locally for $E$ epochs on its local data
    • Computes the update $\Delta\theta_k = \theta_k^{\text{local}} - \theta^{(t)}$
    • Sends $\Delta\theta_k$ to the server
  4. Server aggregates: $\theta^{(t+1)} = \theta^{(t)} + \frac{1}{|S_t|} \sum_{k \in S_t} \Delta\theta_k$

The aggregation weights each client equally, though variants weight by dataset size or other factors.
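The round structure maps directly onto code. Below is a minimal PyTorch sketch of one FedAvg round on toy models; the helper names (`local_update`, `fedavg_round`) and the use of standard DataLoaders are illustrative assumptions, not part of any specific framework.

Code
import copy
import torch

def local_update(global_model, data_loader, epochs=1, lr=1e-3):
    """Client side: train a copy of the global model on local data and
    return only the weight delta -- the data itself never leaves."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            torch.nn.functional.cross_entropy(model(x), y).backward()
            opt.step()
    global_state = global_model.state_dict()
    return {k: v - global_state[k] for k, v in model.state_dict().items()}

def fedavg_round(global_model, client_loaders):
    """Server side: collect deltas from the selected clients, average them
    equally, and apply the averaged delta to the global parameters."""
    deltas = [local_update(global_model, loader) for loader in client_loaders]
    state = global_model.state_dict()
    for k in state:
        state[k] = state[k] + torch.stack([d[k] for d in deltas]).mean(dim=0)
    global_model.load_state_dict(state)
    return global_model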

Challenges for LLMs

FedAvg works well for small models but faces challenges with LLMs:

Communication cost: Sending full model updates for a 70B parameter model requires transmitting 140GB per client per round. With hundreds of clients over multiple rounds, bandwidth becomes prohibitive.

Compute requirements: Full LLM training requires substantial GPU resources that most clients lack. A hospital might have domain expertise and data but not a data center.

Heterogeneity: Clients have different data distributions (non-IID data), computational capabilities, and availability. Standard FedAvg assumes relatively homogeneous clients.

Convergence: With non-IID data, local training can push models in conflicting directions. Aggregating divergent updates produces poor results.

These challenges have driven development of LLM-specific federated learning approaches.

Federated Learning for LLMs

Direct federated learning with FedAvg is impractical for LLMs. Modern approaches adapt the paradigm for large model constraints.

Split Learning

Split learning partitions the model between client and server:

Code
Client holds: Embedding layer, first few layers
Server holds: Middle layers, output layers

Forward pass:
1. Client computes embeddings and early representations
2. Client sends intermediate activations to server
3. Server completes forward pass
4. Server computes loss

Backward pass:
1. Server computes gradients through its layers
2. Server sends gradient of intermediate activations to client
3. Client completes backpropagation locally

This dramatically reduces client computation—only a small fraction of layers run on client devices. However, intermediate activations potentially leak information, and the server sees more than in pure federated learning.

Privacy considerations: Intermediate activations can reveal input properties. Various attacks reconstruct inputs from activations. Split learning provides computational efficiency, not privacy guarantees by itself.
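The forward/backward handoff above can be made concrete with a toy example. This is a sketch only; the layer split, shapes, and module choices are invented for illustration and bear no relation to a real LLM architecture.

Code
import torch
import torch.nn as nn

# Toy split: client keeps the embedding and one projection, server keeps the rest.
client_part = nn.Sequential(nn.Embedding(1000, 64), nn.Flatten(), nn.Linear(64 * 16, 128))
server_part = nn.Sequential(nn.ReLU(), nn.Linear(128, 1000))

tokens = torch.randint(0, 1000, (8, 16))   # batch of 8 sequences, 16 tokens (local data)
labels = torch.randint(0, 1000, (8,))

# --- Client: early layers only ---
activations = client_part(tokens)
sent = activations.detach().requires_grad_(True)   # what actually crosses the network

# --- Server: rest of the forward pass and the loss ---
logits = server_part(sent)
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()                                     # grads for server_part and for `sent`

# --- Server returns the gradient of the cut-layer activations; client finishes backprop ---
activations.backward(sent.grad)

# Both sides can now step their own optimizers; raw tokens never left the client.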

Federated Fine-Tuning

Rather than training from scratch, federated fine-tuning starts with a pre-trained model and adapts it using private data:

Advantages:

  • Much less computation per client (fine-tuning vs. pre-training)
  • Fewer rounds needed for convergence
  • Leverages existing foundation model capabilities

Approaches:

  • Full fine-tuning: Update all parameters (expensive)
  • LoRA: Only update low-rank adapter matrices (efficient)
  • Prompt tuning: Only update soft prompts (very efficient)

LoRA is particularly well-suited for federated learning because updates are small (typically <1% of model parameters), dramatically reducing communication costs.

Federated LoRA

Federated LoRA combines parameter-efficient fine-tuning with federated learning:

Setup:

  • Base model $\theta$ is frozen and shared (clients download it once)
  • Each client maintains LoRA adapters $A_k$, $B_k$

Training round:

  1. Clients train LoRA adapters on local data
  2. Clients send adapter updates to server
  3. Server aggregates adapters (by averaging)
  4. Updated adapters distributed to clients

Communication savings: For a 7B model with rank-16 LoRA on attention layers:

  • Full model: 14GB per round
  • LoRA adapters: ~20MB per round
  • 700× reduction in communication

This makes federated learning practical even with limited bandwidth. A 20MB upload is feasible on consumer internet; a 14GB upload is not.
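On the server, aggregation then touches only the adapter tensors. A minimal sketch, assuming each client uploads a state dict containing just its LoRA matrices (e.g., the adapter weights produced by a PEFT-style library); the function name is illustrative:

Code
import torch

def aggregate_lora_adapters(client_adapter_states):
    """Average only the adapter tensors (keys containing 'lora_');
    the frozen base model is never transmitted or modified."""
    keys = [k for k in client_adapter_states[0] if "lora_" in k]
    return {k: torch.stack([state[k] for state in client_adapter_states]).mean(dim=0)
            for k in keys}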

OpenFedLLM Framework

OpenFedLLM, presented at KDD 2024, provides a comprehensive framework for federated LLM training:

Supported paradigms:

  • Federated instruction tuning: Improve instruction-following across domains
  • Federated value alignment: Align models with distributed human preferences
  • Multiple FL algorithms: FedAvg, FedProx, SCAFFOLD, and others

Architecture: Clients run local training with HuggingFace Transformers; server coordinates aggregation; communication uses efficient serialization.

Key finding: Collaborative training on distributed private data achieves quality approaching centralized training while maintaining privacy. The gap is typically 2-5% on benchmarks.

Differential Privacy: Mathematical Privacy Guarantees

Federated learning keeps data distributed but doesn't prevent information leakage through model updates. An adversary who observes gradient updates can potentially reconstruct training examples. Differential privacy provides mathematical guarantees against such attacks.

The Definition

A randomized mechanism $M$ is $(\epsilon, \delta)$-differentially private if for any two datasets $D$ and $D'$ differing in one element, and any output set $S$:

$P[M(D) \in S] \leq e^{\epsilon} \cdot P[M(D') \in S] + \delta$

In words: adding or removing one training example changes the output distribution by at most a factor of $e^{\epsilon}$ (plus a $\delta$ probability of a larger deviation).

Privacy budget $\epsilon$: Lower is more private. $\epsilon = 1$ is considered strong privacy; $\epsilon = 10$ is weak but measurable. This parameter controls the privacy-utility tradeoff.

Failure probability $\delta$: Should be cryptographically small, typically $< 1/n$ where $n$ is the dataset size. This accounts for rare worst-case events.

The Gaussian Mechanism

The primary mechanism for achieving differential privacy in deep learning adds Gaussian noise to gradients:

Noise calibration: For a function with sensitivity $\Delta f$ (the maximum change in output when one input changes), adding noise $\mathcal{N}(0, \sigma^2)$ where

$\sigma = \frac{\Delta f \cdot \sqrt{2 \ln(1.25/\delta)}}{\epsilon}$

achieves $(\epsilon, \delta)$-differential privacy.

For gradients: The sensitivity is bounded by gradient clipping—limiting the maximum per-example gradient norm to $C$. Then $\Delta f = C$ and we add noise calibrated to this bound.
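As a quick sketch, the noise scale follows directly from the formula above (the classical analysis behind this calibration assumes $\epsilon \le 1$; tighter calibrations exist for larger budgets):

Code
import math

def gaussian_sigma(sensitivity, epsilon, delta):
    """sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon (Gaussian mechanism)."""
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

# With per-example gradients clipped to C = 1.0:
print(gaussian_sigma(sensitivity=1.0, epsilon=1.0, delta=1e-6))  # ≈ 5.3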

DP-SGD: Differentially Private Training

DP-SGD modifies standard training to provide differential privacy:

Standard SGD update: $\theta_{t+1} = \theta_t - \eta \cdot \frac{1}{|B|} \sum_{i \in B} \nabla_\theta L(x_i, \theta_t)$

DP-SGD update: $\theta_{t+1} = \theta_t - \eta \cdot \frac{1}{|B|} \left( \sum_{i \in B} \text{clip}(\nabla_\theta L(x_i, \theta_t), C) + \mathcal{N}(0, \sigma^2 C^2 I) \right)$

The modifications:

  1. Per-example gradients: Compute gradients for each example separately (not batched)
  2. Gradient clipping: Clip each gradient to maximum norm $C$
  3. Noise addition: Add Gaussian noise calibrated to the clipping bound

Privacy accounting: Each training step consumes some privacy budget. After $T$ steps with batch size $B$ and sampling rate $q = B/n$, the total privacy cost is tracked using composition theorems (typically the moments accountant or Gaussian DP).
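A naive microbatch-of-one sketch of the three modifications in PyTorch. Real implementations (e.g., Opacus) vectorize the per-example gradients; this version trades speed for clarity, and the function name is illustrative:

Code
import torch

def dp_sgd_step(model, batch, loss_fn, optimizer, C=1.0, sigma=1.0):
    """One DP-SGD step: per-example gradients, clip to norm C, sum,
    add Gaussian noise calibrated to C, average, and apply."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    xs, ys = batch
    for x, y in zip(xs, ys):                                   # 1. per-example gradients
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(C / (norm + 1e-12), max=1.0)       # 2. clip to norm C
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    optimizer.zero_grad()
    for p, s in zip(params, summed):                           # 3. add calibrated noise
        p.grad = (s + torch.normal(0.0, sigma * C, size=s.shape)) / len(xs)
    optimizer.step()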

Challenges for LLMs

DP-SGD faces significant challenges with large language models:

Computation overhead: Per-example gradients are expensive. Standard batched backpropagation computes the sum of gradients efficiently; separating them requires more memory and compute. For a 70B model, this can be 10-100× more expensive.

Privacy budget exhaustion: Pre-training requires millions of gradient steps. Even with tight composition, the cumulative privacy cost makes meaningful $\epsilon$ guarantees impossible for full pre-training.

Noise impact: The noise required for privacy degrades model quality. Larger models require more noise (higher sensitivity), partially negating scale benefits.

Utility gap: DP-trained models typically underperform non-private models by 5-15% on benchmarks, though this gap is narrowing with better techniques.

DP Fine-Tuning: A Practical Approach

Rather than DP pre-training (largely impractical), the field has focused on DP fine-tuning:

Rationale:

  • Pre-training uses public data (no privacy concern)
  • Fine-tuning uses private domain data (privacy needed)
  • Fine-tuning requires fewer steps (less privacy budget consumed)
  • Smaller adapter updates have lower sensitivity

DP-LoRA: Combine differential privacy with LoRA for efficient private fine-tuning:

  1. Start with pre-trained base model (public data, no DP needed)
  2. Fine-tune only LoRA adapters with DP-SGD
  3. Noise is calibrated to adapter gradients, not full model gradients

DP-LoRA achieves reasonable privacy-utility tradeoffs:

  • $\epsilon = 3$: ~5% accuracy drop vs. non-private fine-tuning
  • $\epsilon = 8$: ~2% accuracy drop

These numbers are task-dependent but illustrate that DP fine-tuning is practical where DP pre-training is not.
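A sketch of the DP-LoRA setup using the Hugging Face peft library. The checkpoint name and target modules are illustrative assumptions; any causal LM with q/v projection matrices would work similarly. Only the adapter parameters end up trainable, and those are what the DP-SGD step sketched earlier would clip and noise:

Code
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Example checkpoint; substitute any HF causal LM you have access to.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                    lora_dropout=0.0, task_type="CAUSAL_LM")
model = get_peft_model(base, config)          # base weights frozen, adapters trainable

trainable = [p for p in model.parameters() if p.requires_grad]
print(f"DP is applied to {sum(p.numel() for p in trainable):,} adapter parameters only")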

Combining Federated Learning and Differential Privacy

Federated learning and differential privacy address different threats:

Federated learning: Protects against the server seeing raw data.

Differential privacy: Protects against model updates leaking data.

Combining them provides layered protection: the server never sees raw data (FL), and the updates it receives are noisy enough to prevent reconstruction (DP).

DP-FedAvg

The simplest combination applies local differential privacy to federated updates:

Training round:

  1. Client trains locally using DP-SGD
  2. Client clips and noises the model update before sending
  3. Server aggregates noisy updates

Each client's update is individually differentially private. Even if the server is adversarial, it cannot extract individual training examples.

Privacy amplification: When only a fraction $q$ of clients participate each round, privacy is amplified by subsampling. The effective $\epsilon$ is approximately $q \cdot \epsilon_{\text{local}}$.
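A minimal sketch of the client-side step 2: clipping and noising the whole model update before it leaves the device (update-level noise, complementing the per-example noise inside DP-SGD). The function name and the dict-of-tensors update format are illustrative assumptions:

Code
import torch

def privatize_update(delta, C=1.0, sigma=1.0):
    """Clip the full update to L2 norm C, then add Gaussian noise calibrated
    to C. The server only ever sees the noisy, clipped update."""
    flat = torch.cat([v.flatten() for v in delta.values()])
    scale = torch.clamp(C / (flat.norm() + 1e-12), max=1.0)
    return {k: v * scale + torch.normal(0.0, sigma * C, size=v.shape)
            for k, v in delta.items()}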

DP-FedLoRA

Combining DP with federated LoRA provides practical private LLM fine-tuning:

DP-FedLoRA protocol (from recent research, 2024-2025):

  1. Local training: Each client trains LoRA adapters using DP-SGD

    • Per-example gradient computation for adapter parameters
    • Gradient clipping to bound sensitivity
    • Gaussian noise addition calibrated to clip bound
  2. Update perturbation: Before sending, add additional noise to adapter matrices

    • Noise calibrated for an $(\epsilon, \delta)$-DP guarantee
    • Unbiased updates (noise has zero mean)
  3. Secure aggregation: Server aggregates noisy updates

    • Sum of individually-DP updates remains DP
    • Composition bounds total privacy cost

Theoretical guarantees: Under standard assumptions, DP-FedLoRA provides:

  • Unbiased gradient estimates (noise doesn't bias convergence)
  • Bounded noise variance (convergence rate analyzable)
  • Formal $(\epsilon, \delta)$-DP guarantee per client

FLIP: Interactive Privacy-Utility Optimization

FLIP (Federated Learning Interactive Privacy), introduced in early 2025, provides an interactive framework for balancing privacy and utility:

Key insight: Privacy and utility tradeoffs are highly dependent on:

  • Data distribution across clients
  • Model architecture
  • Task requirements
  • Acceptable privacy budget

FLIP helps practitioners explore this tradeoff space:

  1. Parameter exploration: Systematically vary $\epsilon$, clipping bounds, and FL parameters
  2. Utility estimation: Predict model quality at different privacy levels
  3. Human-in-the-loop: Practitioner specifies constraints and preferences
  4. Optimization: Find parameters achieving desired privacy-utility balance

Experiments show the privacy-utility gap can be reduced from 5% to 2% with properly tuned parameters, compared to naive defaults.

Privacy Attacks and Defenses

Understanding potential attacks helps design robust systems:

Gradient Inversion Attacks

Attack: Reconstruct training data from observed gradients

Method: Optimize a dummy input to produce gradients matching the observed gradient: $\hat{x} = \arg\min_x \| \nabla_\theta L(x, \theta) - g_{\text{observed}} \|$

Effectiveness: High-resolution images can be reconstructed from gradients; text is harder but partial reconstruction is possible.

Defense: Differential privacy makes gradients noisy enough that inversion produces noise, not data. Gradient clipping also helps by limiting how much any single example influences updates.
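A toy demonstration of the attack, and of why noise defeats it: a single linear layer leaks its input through its gradients, and the attacker recovers it by gradient matching. The model, shapes, and optimizer settings are arbitrary illustrations:

Code
import torch

torch.manual_seed(0)
model = torch.nn.Linear(32, 4)
x_true = torch.randn(1, 32)                      # the private example
y_true = torch.tensor([2])

loss = torch.nn.functional.cross_entropy(model(x_true), y_true)
g_observed = torch.autograd.grad(loss, model.parameters())   # what the server would see

x_dummy = torch.randn(1, 32, requires_grad=True)              # attacker's guess
opt = torch.optim.Adam([x_dummy], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    g_dummy = torch.autograd.grad(
        torch.nn.functional.cross_entropy(model(x_dummy), y_true),
        model.parameters(), create_graph=True)
    sum(((gd - go) ** 2).sum() for gd, go in zip(g_dummy, g_observed)).backward()
    opt.step()

# Small without DP; clipping and noising g_observed leaves the attacker matching noise.
print("reconstruction error:", (x_dummy - x_true).norm().item())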

Membership Inference Attacks

Attack: Determine whether a specific example was in the training data

Method: Train an attack model to distinguish "in-training" from "out-of-training" examples based on model behavior (loss, confidence, etc.)

Effectiveness: Significant privacy breach for sensitive applications. "Was this person's medical record used to train this model?"

Defense: Differential privacy provides provable bounds on membership inference accuracy. With $(\epsilon, \delta)$-DP, the membership inference advantage is bounded by $e^{\epsilon} - 1 \approx \epsilon$ for small $\epsilon$.

Model Extraction Attacks

Attack: Steal a model's functionality by querying it

Method: Query the model extensively and train a copy on the responses.

Note: This isn't specifically a privacy attack on training data, but it's relevant to protecting model IP. Federated learning doesn't help here; the final model is still deployed.

Data Poisoning

Attack: Malicious clients submit poisoned updates to corrupt the global model

Method: Client includes adversarial examples or backdoors in local training, producing updates that degrade model performance or insert specific behaviors.

Defense:

  • Robust aggregation (median instead of mean)
  • Anomaly detection on updates
  • Client reputation systems
  • Byzantine-resilient algorithms

Secure Aggregation

Protocol: Clients' updates are encrypted such that the server can compute the sum without seeing individual updates.

Mechanism: Uses cryptographic techniques (secret sharing, homomorphic encryption) to enable aggregation on encrypted values.

Benefit: Even if individual updates aren't differentially private, the server only sees the aggregate, limiting attack surface.

Limitation: Increases communication and computation costs; doesn't protect against the aggregate leaking information.

Production Considerations

Deploying federated learning with differential privacy in production requires careful engineering:

Infrastructure

Client requirements:

  • Sufficient compute for local training (or split learning)
  • Reliable network connectivity for update transmission
  • Secure storage for local data and model parameters

Server requirements:

  • Aggregation infrastructure (can be distributed for scale)
  • Privacy accounting to track the cumulative $\epsilon$
  • Secure communication channels

Communication:

  • Compression: Gradient quantization, sparsification
  • Scheduling: Handle client availability, stragglers
  • Security: TLS, certificate pinning, authentication

Privacy Accounting

Track privacy budget consumption across rounds:

Per-round accounting: Each training round consumes some $\epsilon$. Use tight composition theorems (Rényi DP, Gaussian DP) for accurate accounting.

Budget allocation: Decide upfront how much total $\epsilon$ is acceptable. Allocate it across rounds to achieve the desired model quality before the budget is exhausted.

Monitoring: Track realized privacy cost in real-time. Stop training if budget is exhausted.

Regulatory Compliance

Differential privacy helps with but doesn't automatically satisfy regulations:

GDPR: DP provides technical measures for data protection. Document the privacy guarantees, mechanism details, and privacy budget in the DPIA (data protection impact assessment).

HIPAA: DP can support the "de-identification" requirement, but legal interpretation varies. Consult legal counsel.

Audit trails: Maintain records of training runs, privacy parameters, and client participation for regulatory audits.

Debugging and Monitoring

Private training complicates debugging:

Can't inspect data: By design, you can't look at training examples when debugging quality issues.

Noise obscures signals: DP noise can mask real issues. A model might perform poorly due to noise (expected) or data quality (fixable), and distinguishing is hard.

Strategies:

  • Test pipelines on non-private proxy data first
  • Use larger privacy budgets during development (not production)
  • Monitor aggregate statistics that don't violate privacy
  • Build quality signals into the training protocol

Detailed Privacy Analysis

Understanding the mathematical guarantees and practical implications of differential privacy in federated LLM training.

Privacy Budget Breakdown

The total privacy cost of federated training accumulates across multiple dimensions:

Per-round privacy cost: $\epsilon_{\text{round}} = \epsilon_{\text{local}} + \epsilon_{\text{aggregation}}$

Where:

  • $\epsilon_{\text{local}}$ = privacy cost of local DP-SGD training
  • $\epsilon_{\text{aggregation}}$ = additional cost from aggregation (often 0 with secure aggregation)

Total training privacy cost: Using the advanced composition bound (tighter Rényi DP accounting reduces this further): $\epsilon_{\text{total}} \approx \sqrt{2T \ln(1/\delta)} \cdot \epsilon_{\text{round}} + T \cdot \epsilon_{\text{round}}^2$

Where $T$ = number of training rounds.

Example calculation:

  • 100 rounds of training
  • $\epsilon_{\text{round}} = 0.1$ per round
  • $\delta = 10^{-6}$
  • Using Rényi DP accounting: $\epsilon_{\text{total}} \approx 3.5$
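The composition formula above is easy to evaluate directly. Note that it is an upper bound; the tighter Rényi/moments accounting used in the example yields a smaller total. The function name below is illustrative:

Code
import math

def advanced_composition(eps_round, T, delta):
    """Upper bound on total epsilon after T rounds (advanced composition theorem)."""
    return math.sqrt(2 * T * math.log(1 / delta)) * eps_round + T * eps_round ** 2

print(advanced_composition(eps_round=0.1, T=100, delta=1e-6))
# ≈ 6.3 from the loose bound; Rényi DP accounting tightens this to roughly 3.5 (see above)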

Privacy vs. Utility Tradeoffs

Empirical measurements across different privacy budgets:

| Privacy Budget ($\epsilon$) | Noise Scale ($\sigma$) | Utility Loss | Practical Use |
|---|---|---|---|
| $\epsilon < 1$ | $\sigma > 10$ | 15-25% | High-sensitivity data |
| $\epsilon = 1$-$3$ | $\sigma = 3$-$10$ | 5-15% | Medical, financial |
| $\epsilon = 3$-$8$ | $\sigma = 1$-$3$ | 2-8% | General enterprise |
| $\epsilon = 8$-$15$ | $\sigma = 0.5$-$1$ | <2% | Low-sensitivity |
| $\epsilon > 15$ | $\sigma < 0.5$ | Minimal | Minimal privacy |

Gradient Clipping Impact

Gradient clipping bound $C$ affects both privacy and training dynamics:

Smaller $C$ (e.g., $C = 0.1$):

  • Better privacy (lower sensitivity)
  • More aggressive clipping → potential training instability
  • May require lower learning rates

Larger $C$ (e.g., $C = 10$):

  • Worse privacy (higher sensitivity, more noise needed)
  • Preserves more gradient information
  • More stable training but noisier updates

Optimal $C$ selection: Research suggests setting $C$ so that approximately 10-20% of per-example gradients are clipped. Empirically, this balances privacy and utility effectively.
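One practical way to apply that heuristic is to profile per-example gradient norms on a small batch and set $C$ at a high quantile, so roughly the desired fraction of gradients gets clipped. A sketch under the assumption that such norms have already been collected; the function name and example values are illustrative:

Code
import torch

def suggest_clip_bound(per_example_grad_norms, clip_fraction=0.15):
    """Choose C at the (1 - clip_fraction) quantile of observed norms,
    so roughly clip_fraction of per-example gradients get clipped."""
    norms = torch.as_tensor(per_example_grad_norms, dtype=torch.float32)
    return torch.quantile(norms, 1.0 - clip_fraction).item()

# Hypothetical norms from a profiling run:
norms = torch.abs(torch.randn(1000)) * 2.0
print(f"clip bound C ≈ {suggest_clip_bound(norms):.2f}  (≈15% of gradients clipped)")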

Communication Efficiency Analysis

Federated learning communication costs for LLMs:

| Model Size | Full Updates | LoRA Updates | Compression Ratio |
|---|---|---|---|
| 7B | 14 GB | 20 MB | 700× |
| 13B | 26 GB | 35 MB | 740× |
| 70B | 140 GB | 100 MB | 1400× |

With gradient compression:

  • Top-k sparsification: Additional 10× reduction
  • Quantization (INT8): Additional 4× reduction
  • Combined: Up to 5,600× reduction vs. full model updates
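A back-of-envelope check of the 7B row in the table above, assuming a Llama-style configuration (32 layers, hidden size 4096, rank-16 adapters on the q and v projections, fp16 parameters). The exact figures shift with the adapter configuration, which is why the table's ratios are approximate:

Code
layers, hidden, rank, targets, bytes_per_param = 32, 4096, 16, 2, 2  # assumed config

lora_params = layers * targets * 2 * hidden * rank   # A (r x d) + B (d x r) per matrix
full_params = 7e9

print(f"LoRA update: {lora_params * bytes_per_param / 1e6:.0f} MB "
      f"({lora_params / 1e6:.1f}M params)")
print(f"Full update: {full_params * bytes_per_param / 1e9:.0f} GB")
print(f"Reduction:   {full_params / lora_params:.0f}x")   # same ballpark as the table's 700x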

Case Studies and Applications

Cross-Bank Fraud Detection LLM

Scenario: Multiple banks want to train a fraud detection LLM on their combined transaction records without sharing sensitive customer data with competitors.

Architecture:

Code
┌─────────────────────────────────────────────────────────────┐
│                    Aggregation Server                        │
│  (Secure Enclave - Never sees individual bank data)          │
└───────────────────────┬─────────────────────────────────────┘
                        │ Aggregated LoRA updates
        ┌───────────────┼───────────────┐
        │               │               │
        ▼               ▼               ▼
   ┌─────────┐    ┌─────────┐    ┌─────────┐
   │ Bank A  │    │ Bank B  │    │ Bank C  │
   │         │    │         │    │         │
   │ DP-SGD  │    │ DP-SGD  │    │ DP-SGD  │
   │ Training│    │ Training│    │ Training│
   └─────────┘    └─────────┘    └─────────┘
        │               │               │
   Transaction     Transaction     Transaction
   Records         Records         Records
   (Never leaves)  (Never leaves)  (Never leaves)

Approach:

  • Base model: Pre-trained financial LLM (public financial documents, SEC filings)
  • Fine-tuning: Federated LoRA with DP across banks
  • Privacy budget: $\epsilon = 4$ per bank per training cycle
  • Aggregation: Daily rounds with secure aggregation

Challenges addressed:

  • Regulatory compliance: PCI-DSS, GDPR—transaction data never leaves banks
  • Competitive concerns: No bank sees another's transaction patterns or customer behavior
  • Privacy guarantees: DP bounds information leakage about individual transactions
  • Fraud pattern sharing: Banks benefit from collective fraud detection without exposing their data

Results:

  • Model achieves 93% of centralized training quality
  • 40% improvement in cross-bank fraud detection (fraudsters often target multiple banks)
  • Meets regulatory requirements for all participating institutions
  • 3× faster detection of new fraud patterns compared to isolated training

On-Device Personalization

Scenario: Mobile keyboard wants to improve next-word prediction using user typing patterns.

Approach:

  • Base model: General language model
  • Fine-tuning: Federated learning across millions of devices
  • Privacy: User-level DP (protect all of a user's data, not just individual examples)
  • Training: Overnight when devices are charging

Google's implementation (Gboard): Demonstrated user-level DP at scale with $\epsilon < 10$ while improving prediction quality.

Financial Document Processing

Scenario: Banks want to train a document understanding model on financial statements without revealing client data to each other or a central server.

Approach:

  • Split learning: Most computation on central server
  • DP on client-side: Intermediate activations are noised
  • Secure enclaves: Server computation in TEEs for additional protection

Benefit: Enables industry-wide model improvement without competitive data sharing.

Future Directions

Tighter Privacy-Utility Tradeoffs

Current DP-LLM training suffers 5-15% utility loss. Research directions to close this gap:

Better mechanisms: DP-FTRL (follow-the-regularized-leader) and other alternatives to DP-SGD may provide tighter bounds.

Adaptive clipping: Learn optimal clipping bounds during training rather than fixing them.

Privacy-aware architectures: Design model architectures that are inherently more privacy-friendly (lower sensitivity).

Trustworthy Aggregation

Current federated learning trusts the server to aggregate honestly. Alternatives:

Decentralized aggregation: No central server; clients aggregate in a peer-to-peer manner.

Blockchain-based verification: Use blockchain to verify aggregation correctness.

Multi-party computation: Distribute aggregation across multiple non-colluding parties.

Synthetic Data Generation

Rather than training directly on private data:

  1. Train a generative model with DP on private data
  2. Generate synthetic data from the DP generative model
  3. Train downstream models on synthetic data (unlimited, non-private)

The DP guarantee transfers: if the generative model is DP, anything derived from it is also DP. This approach is gaining traction for its flexibility.

2025 Research Breakthroughs

RLDP Framework (July 2025): Casts DP optimization as a closed-loop control problem using deep reinforcement learning. Across 1,600+ experiments on GPT2-small, Llama-1B, Llama-3B, and Mistral-7B:

  • Perplexity reductions of 1.3-30.5% (mean 5.4%)
  • Average 5.6% downstream utility gain
  • First framework to use RL for adaptive DP optimization

POPri (August 2025): Addresses the challenge that many LLMs cannot be stored or trained on client devices. Turns DP synthetic generation into an LLM policy optimization problem, enabling powerful alignment methods like DPO for private federated learning.

GeoClip: A geometry-aware framework for DP-SGD that clips and perturbs gradients in a transformed basis aligned with the geometry of the gradient distribution. It adaptively estimates the transformation from noisy gradients without spending additional privacy budget.

DP-Prox for Edge Devices: A framework enabling federated instruction tuning of small LLMs on 8GB devices, instantiated on Phi-3-mini with QLoRA. FedProx augments the local objective with a proximal term that prevents deviation from the global model.

Security Research (EMNLP 2025)

Recent security analysis reveals ongoing challenges:

Attack findings: Attackers can extract training data from global models even using straightforward generation methods. Leakage increases with model size. Enhanced attack strategies track global model updates during training to intensify privacy leakage.

Defense evaluation: Differential privacy, regularization-constrained updates, and safety-aligned LLMs can mitigate risks but require careful tuning. The research emphasizes that FL alone is insufficient—DP and secure aggregation remain essential.

Unlearning and Data Deletion

GDPR and similar regulations provide "right to be forgotten"—users can request their data be deleted. For models trained on that data:

Machine unlearning: Modify trained models to "forget" specific training examples without full retraining.

Federated unlearning: Handle deletion requests in federated settings where the server never saw the data directly.

DP's advantage: With strong DP guarantees ($\epsilon < 1$), individual examples have minimal influence. Unlearning may be unnecessary—the model already "barely remembers" any individual.

Enrico Piovano, PhD

Co-founder & CTO at Goji AI. Former Applied Scientist at Amazon (Alexa & AGI), focused on Agentic AI and LLMs. PhD in Electrical Engineering from Imperial College London. Gold Medalist at the National Mathematical Olympiad.
