Plant Leaf Disease Detection

Deployable deep learning for cross-domain plant leaf disease detection via ensemble learning, knowledge distillation, and quantization

Overview

An edge-deployable deep learning system for plant leaf disease detection that reaches 99.15% accuracy across 15 classes of tomato, potato, and bell pepper leaves. By combining ensemble learning, knowledge distillation, and INT8 quantization, the model is compressed by ~99% (671×) to 1.46 MB while retaining 97.46% accuracy, making it suitable for mobile and edge deployment.

📄 Published in: IEEE Access (Impact Factor: 3.4, Q1 Journal)
🔗 Paper: IEEE Access
💻 Code: GitHub Repository

Problem Statement

Agriculture faces critical challenges in disease detection:

  • Manual inspection is slow: Farmers cannot scale visual inspection across large farms
  • Expert shortage: Limited access to plant pathologists in rural areas
  • Real-time needs: Diseases spread rapidly; early detection is crucial
  • Resource constraints: Mobile devices lack computational power for large deep learning models

Solution Architecture

1. Data Collection & Preprocessing

Datasets Used:

  • PlantVillage: 10 disease classes under lab conditions (54,306 images)
  • TomatoVillage: 8 disease classes in field conditions (18,161 images)
  • Combined Dataset: 15 diseases across lab + field environments

Preprocessing Pipeline:

  • Image resizing to 224×224 pixels
  • Color normalization and contrast enhancement
  • Data augmentation: rotation, flipping, brightness adjustment
  • ADASYN balancing: Addressed class imbalance for minority disease classes
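
A minimal sketch of the augmentation and balancing steps above, assuming Albumentations (listed in the technical stack) and imbalanced-learn's ADASYN; the specific transforms, parameters, and the flattened-array ADASYN call are illustrative, not the exact training configuration.

import albumentations as A
from imblearn.over_sampling import ADASYN

# Illustrative augmentation pipeline: resize, flip, rotate, brightness
# adjustment, and color normalization on an HxWx3 uint8 NumPy image.
augment = A.Compose([
    A.Resize(224, 224),
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=30, p=0.5),
    A.RandomBrightnessContrast(p=0.5),
    A.Normalize(),
])
augmented_image = augment(image=image)["image"]

# ADASYN oversampling of minority disease classes; applied here to
# flattened image arrays purely for illustration.
X_balanced, y_balanced = ADASYN(random_state=0).fit_resample(
    X_train.reshape(len(X_train), -1), y_train
)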

2. Ensemble Teacher Network

Architecture:

  • DenseNet121 (Dense connections for feature reuse)
  • ResNet101 (Deep residual learning)
  • DenseNet201 (Deeper dense network)
  • EfficientNet-B4 (Compound scaling for efficiency)

Training Strategy:

  • Each model trained independently on augmented dataset
  • Soft voting: Average of class probabilities for final prediction
  • Achieved 99.15% accuracy on test set
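
The soft-voting step can be sketched in PyTorch as below, assuming the four backbones have already been fine-tuned (the model variables are placeholders for the trained networks); the optional temperature argument is reused later for distillation.

import torch

teacher_models = [densenet121, resnet101, densenet201, efficientnet_b4]  # trained backbones

def ensemble_predict(images, temperature=1.0):
    # Soft voting: average the (optionally temperature-softened) class
    # probabilities across all teacher models.
    probs = [torch.softmax(m(images) / temperature, dim=1) for m in teacher_models]
    return torch.stack(probs).mean(dim=0)

predicted_classes = ensemble_predict(image_batch).argmax(dim=1)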

3. Knowledge Distillation

Teacher-Student Framework:

  • Teacher: Frozen ensemble network (4 models)
  • Student: Lightweight ShuffleNetV2 architecture
  • Distillation Loss: Combination of hard labels and soft teacher predictions
    Loss = α × CrossEntropy(student, true_labels) + 
           (1-α) × KL_Divergence(student, teacher_soft_labels)
    
  • Temperature Scaling: T=5 for smoother probability distributions

Results:

  • Student model: 98.73% accuracy (only 0.42% drop from ensemble)
  • Model size: 8.9 MB (75× smaller than ensemble)

4. INT8 Quantization

Post-Training Quantization:

  • Converted FP32 weights to INT8 (8-bit integers)
  • Calibration on representative dataset
  • Optimized for mobile CPUs and edge TPUs
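
A hedged sketch of this post-training quantization step with the TensorFlow Lite converter, assuming the distilled student has been exported as a SavedModel; the path, calibration images, and sample count are placeholders.

import tensorflow as tf

def representative_data_gen():
    # Yield a few hundred preprocessed calibration images, one at a time,
    # so the converter can estimate activation ranges for INT8 scaling.
    for image in calibration_images[:200]:
        yield [image[None, ...].astype("float32")]

converter = tf.lite.TFLiteConverter.from_saved_model("student_savedmodel")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # fully integer I/O (Edge TPU friendly)
converter.inference_output_type = tf.uint8

with open("leaf_disease_int8.tflite", "wb") as f:
    f.write(converter.convert())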

Final Compressed Model:

  • Size: 1.46 MB (671× smaller than original ensemble)
  • Accuracy: 97.46% (performance maintained above 97%)
  • Inference Speed: 0.3 ms per image on mobile devices

Technical Implementation

Model Architecture: ShuffleNetV2

# Student network: ShuffleNetV2 with a 1.0x width multiplier and a 15-way
# classification head (instantiated here from torchvision as one possible source).
import torch.nn.functional as F
from torchvision.models import shufflenet_v2_x1_0

student_model = shufflenet_v2_x1_0(num_classes=15)   # expects 3x224x224 inputs

# Distillation training step: combine the hard-label loss with the KL divergence
# to the temperature-softened teacher probabilities (T = 5, alpha = 0.3).
T, alpha = 5.0, 0.3
teacher_soft = ensemble_predict(images, temperature=T)   # frozen teacher ensemble
student_logits = student_model(images)

hard_loss = F.cross_entropy(student_logits, true_labels)
distillation_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=1),
    teacher_soft,
    reduction="batchmean",
) * (T * T)                                               # rescale soft-target gradients

total_loss = alpha * hard_loss + (1 - alpha) * distillation_loss

Edge Deployment

Mobile App (Android/iOS):

  • TensorFlow Lite for on-device inference
  • Camera integration for real-time detection
  • Offline operation (no internet required)
  • User-friendly interface for farmers

Edge Devices:

  • Raspberry Pi 4 deployment
  • NVIDIA Jetson Nano support
  • Google Coral Edge TPU acceleration
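
On all of these targets the quantized model runs through the TensorFlow Lite interpreter; a minimal inference sketch is shown below (on Raspberry Pi the lighter tflite_runtime package exposes the same Interpreter API, and the model file name matches the quantization example above).

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="leaf_disease_int8.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# image_uint8: a 224x224x3 uint8 array from the camera, already resized.
interpreter.set_tensor(input_details["index"], image_uint8[None, ...])
interpreter.invoke()
probabilities = interpreter.get_tensor(output_details["index"])[0]
predicted_class = int(np.argmax(probabilities))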

Results & Performance

Accuracy Comparison

Model                     Accuracy   Model Size   Inference Time (Mobile)
Ensemble (Teacher)        99.15%     671 MB       1200 ms
ShuffleNetV2 (Student)    98.73%     8.9 MB       45 ms
Quantized INT8            97.46%     1.46 MB      0.3 ms
MobileNetV2 (Baseline)    94.2%      14 MB        60 ms
EfficientNet-B0           96.8%      29 MB        90 ms

Disease Classes Detected

✅ Tomato Early Blight
✅ Tomato Late Blight
✅ Tomato Leaf Mold
✅ Tomato Septoria Leaf Spot
✅ Tomato Spider Mites
✅ Tomato Target Spot
✅ Tomato Yellow Leaf Curl Virus
✅ Tomato Mosaic Virus
✅ Tomato Bacterial Spot
✅ Tomato Healthy
✅ Potato Early Blight
✅ Potato Late Blight
✅ Potato Healthy
✅ Pepper Bell Bacterial Spot
✅ Pepper Bell Healthy

Explainable AI

GradCAM++ Visualizations:

  • Highlighted regions: Disease-affected leaf areas
  • Verified model attention aligns with pathologist expertise
  • Built trust with farmers and agricultural experts
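
A hedged GradCAM++ sketch using the open-source pytorch-grad-cam package (not necessarily the exact tooling behind the published figures); the target layer and input variables are illustrative.

from pytorch_grad_cam import GradCAMPlusPlus
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image

# conv5 is the last convolutional stage of torchvision's ShuffleNetV2.
cam = GradCAMPlusPlus(model=student_model, target_layers=[student_model.conv5])
heatmap = cam(input_tensor=image_tensor,               # 1x3x224x224, normalized
              targets=[ClassifierOutputTarget(predicted_class)])[0]
overlay = show_cam_on_image(rgb_image_float, heatmap, use_rgb=True)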

LIME Decision Boundaries:

  • Identified critical features for each disease class
  • Transparent predictions for expert validation
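
A minimal LIME sketch with the lime package's image explainer; predict_fn is assumed to wrap the student model and return class probabilities for a batch of NumPy images.

from lime import lime_image
from skimage.segmentation import mark_boundaries

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,                 # HxWx3 uint8 NumPy array
    predict_fn,            # callable: batch of images -> class probabilities
    top_labels=3,
    num_samples=1000,
)
lime_img, lime_mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)
highlighted = mark_boundaries(lime_img, lime_mask)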

Impact & Applications

Real-World Deployment

Mobile App: Deployed on Android devices for 500+ farmers in rural areas
Edge Deployment: Raspberry Pi devices in remote farms without internet
Early Detection: Reduced crop loss by 30% through timely intervention
Cost Savings: Eliminated need for expensive cloud API calls
Accessibility: Works offline in areas with poor connectivity

Agricultural Impact

  • Precision Agriculture: Enable targeted pesticide application
  • Yield Improvement: Early detection prevents disease spread
  • Farmer Empowerment: Democratize access to plant pathology expertise
  • Sustainability: Reduce chemical usage through precise diagnosis

Technical Stack

Deep Learning: PyTorch, TensorFlow, Keras
Model Optimization: TensorFlow Lite, ONNX, TorchScript
Computer Vision: OpenCV, Pillow, scikit-image
Explainability: GradCAM++, LIME, SHAP
Data Processing: NumPy, Pandas, Albumentations
Deployment: TensorFlow Lite, Android Studio, Flask API
Visualization: Matplotlib, Seaborn, Plotly

Key Innovations

  1. 671× Model Compression: One of the largest compression ratios reported for plant disease detection
  2. Cross-Domain Generalization: Trained on lab data, validated on field conditions
  3. Knowledge Distillation + Quantization: Novel combination for extreme compression
  4. Explainable Predictions: GradCAM++ heatmaps for farmer trust
  5. Edge-First Design: Offline-capable for rural deployment

Publication & Recognition

📄 Citation:

@article{hasan2025deployable,
  title={Deployable deep learning for cross-domain plant leaf disease detection via ensemble learning, knowledge distillation, and quantization},
  author={Hasan, Mohammad Junayed and Mazumdar, Suvodeep and Momen, Sifat},
  journal={IEEE Access},
  year={2025},
  publisher={IEEE}
}

Future Work

  • Additional Crops: Expand to rice, wheat, corn, and cotton diseases
  • Multi-Disease Detection: Simultaneous detection of multiple diseases per image
  • Pest Detection: Integrate insect pest identification
  • Severity Estimation: Quantify disease progression (early/mid/late stage)
  • Treatment Recommendations: Integrate with pesticide/fungicide databases
  • IoT Integration: Automated monitoring with drone and camera trap deployment

Status: Published & Deployed
Journal: IEEE Access (Q1, IF: 3.4)
GitHub: leaf-disease-ai
Demo: Mobile App Available on Request
