AI · Healthcare · Diagnostics · Published Research

Droplet — AI-Powered Healthcare Diagnostics

An AI-powered diagnostics platform that combines automated blood report analysis with stress detection to create a holistic healthcare tool, featuring semi-supervised learning and published research.


My Role

Lead ML Researcher & Primary Author

When

2023–2024 (B.Tech Final Year Project, NMIMS)

Status

Completed (Prototype + Published Research)

Semi-supervised ML · NLP stress detection · Published IEEE paper

Built With

Python · FastAPI · MySQL · Flutter · ML (Scikit-learn, XGBoost, SVM)

Published research on semi-supervised learning in healthcare AI

The Problem

Manual healthcare diagnostics are slow, error-prone, and fail to integrate mental health monitoring with physical health assessment.

  • Manual analysis limitations: Manual blood report analysis is slow, error-prone, and does not scale with growing healthcare demand.
  • Fragmented records: Lack of centralized report management leads to disconnected healthcare histories and missed patterns.
  • Mental health gaps: Traditional diagnostic tools often exclude mental health monitoring, missing crucial holistic health indicators.
  • Limited labeled data: Healthcare ML models struggle with insufficient labeled datasets for reliable disease prediction.

Our Solution

Droplet integrates multiple AI approaches into a modular diagnostics platform designed to learn effectively from limited labeled data.

  • Semi-supervised core: Flagship self-training pipeline that improves accuracy when labeled healthcare data are scarce.
  • NLP stress detection: Classifier aligned to Beck’s Depression Inventory constructs, trained on curated Reddit mental-health corpora.
  • Report ingestion: OCR for CBC reports with a simple records layer to manage patient test histories.
  • Mobile access: Flutter companion app for uploads and viewing predictions on device.
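
The self-training idea behind the semi-supervised core can be sketched with scikit-learn's `SelfTrainingClassifier`. This is a minimal illustration, not the pipeline from the paper: synthetic data from `make_classification` stands in for CBC features, the gradient-boosting base learner is one assumed choice, and `run_self_training`, `label_fraction`, and `threshold` are hypothetical names and values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.semi_supervised import SelfTrainingClassifier


def run_self_training(label_fraction=0.2, threshold=0.9, seed=0):
    """Train a self-training classifier on partially labeled synthetic data."""
    # Synthetic stand-in for CBC features (assumption: no real data is used here).
    X, y = make_classification(
        n_samples=600, n_features=12, n_informative=6,
        n_classes=3, class_sep=1.5, random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Hide most training labels; scikit-learn marks unlabeled rows with -1.
    rng = np.random.default_rng(seed)
    hidden = rng.random(len(y_train)) > label_fraction
    y_partial = y_train.copy()
    y_partial[hidden] = -1

    # The base learner must expose predict_proba; a pseudo-label is added
    # only when the top predicted probability exceeds `threshold`.
    base = GradientBoostingClassifier(random_state=seed)
    model = SelfTrainingClassifier(base, threshold=threshold)
    model.fit(X_train, y_partial)

    return {
        "model": model,
        "n_unlabeled": int(hidden.sum()),
        # labeled_iter_ > 0 means the row received a pseudo-label during training.
        "n_pseudo_labeled": int((model.labeled_iter_ > 0).sum()),
        "test_accuracy": float(model.score(X_test, y_test)),
    }


if __name__ == "__main__":
    result = run_self_training()
    print({k: v for k, v in result.items() if k != "model"})
```

A high threshold keeps noisy pseudo-labels out of the training set, which usually matters more than adding many of them. The right value depends on how well calibrated the base learner is.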

My Contribution as Lead ML Researcher

Led end-to-end research and engineering, from model design to publication.

  • Model development: Designed and benchmarked Decision Trees, Random Forest, XGBoost, SVM, Logistic Regression, and a semi-supervised self-training model.
  • Performance tuning: Reached ≈92.7–92.8% accuracy on the CBC module on our datasets, and tuned the XGBoost (diabetes) and SVM (heart-disease) sub-modules.
  • Deployment: Integrated models into a FastAPI backend with a Flutter client for uploads and results.
  • Publication: Primary author of a peer-reviewed paper on semi-supervised CBC diagnostics, published in IEEE proceedings (ICEEICT 2023).

92.8% peak accuracy (CBC module) · 6 ML algorithms benchmarked · 3 active modules

  • CBC (Multi-disease), Semi-Supervised [paper result]: CBC panel; self-training over DT/GB. Accuracy 92.8% · F1 ≈ 0.93 · P ≈ 0.95 · R ≈ 0.93
  • Heart Disease, SVM (RBF) [module]: ECG-derived + CBC; semi-supervised (self-learning). Accuracy 89.8% · F1 0.88 · AUC 0.93
  • Stress / Depression, NLP Classifier [module]: Reddit mental-health corpora; fine-tuned classifier. Accuracy 90.2% · F1 0.89 · FN↓ 7.5%

CBC results reflect the published study; heart-disease and stress/depression are separate Droplet modules (outside the CBC paper).
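
For readers comparing the accuracy, precision (P), recall (R), and F1 figures reported for the modules, the sketch below shows how such numbers can be computed for a multi-class problem. It assumes macro averaging, which this page does not confirm, and `macro_metrics` is a hypothetical helper written with the standard library only.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if (p + r) else 0.0)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,  # mean of per-class F1 scores
    }


if __name__ == "__main__":
    print(macro_metrics([0, 0, 0, 1, 1, 2], [0, 0, 1, 1, 1, 2]))
```

Under macro averaging, F1 is the mean of the per-class F1 scores, so it need not equal the harmonic mean of the averaged precision and recall.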

Results & Impact

Validated semi-supervised learning for CBC-based diagnostics and demonstrated a modular path toward broader health screening.

  • Research validation: Demonstrated feasibility of semi-supervised learning on CBC data through controlled experiments and benchmarking.
  • Practical utility: The prototype indicates potential for rapid triage and clinician decision support in low-label settings.
  • Academic recognition: Findings published in IEEE proceedings (ICEEICT 2023).
  • System view: Showcases how CBC diagnostics, cardiology signals, and mental-health NLP can coexist as a unified platform.

Reflection & Impact

Droplet shaped my approach to responsible ML for health: rigorous methods, conservative claims, and usable interfaces.

  • Healthcare ML design: Balanced model accuracy with clinical constraints and patient experience.
  • Research ownership: Ran the full loop—from problem framing to deployment and peer-reviewed publication.
  • Societal trajectory: Informed my later work on AI for governance and fair access to public services.
  • Interdisciplinary build: Combined supervised/semi-supervised ML with NLP to form coherent, testable modules.

Research Details

Methods and contributions from the CBC paper and adjacent Droplet modules.

  • Semi-supervised innovation: Self-training on top of a strong base learner (DT/GB) to leverage unlabeled CBC data.
  • Module integration: Combined CBC diagnostics with XGBoost-based diabetes and SVM-based heart-disease sub-modules, plus an NLP stress screen.
  • Evaluation: Cross-validated benchmarks on each module; CBC results align with the published paper’s metrics.
  • Publication: Peer-reviewed paper in IEEE proceedings (ICEEICT 2023) detailing the CBC semi-supervised approach.
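
A cross-validated benchmark of the kind mentioned under Evaluation might look like the sketch below. It is an illustration under assumptions, not the study's protocol: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost to keep the example dependency-light, the fold count and hyperparameters are arbitrary, and `benchmark_models` is a hypothetical helper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def benchmark_models(X, y, n_splits=5, seed=0):
    """Return {model name: (mean accuracy, std)} from stratified k-fold CV."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    candidates = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        # Stand-in for XGBoost (assumption, see lead-in).
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
        # Scale-sensitive models get a scaler fitted inside each fold.
        "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    results = {}
    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
        results[name] = (float(scores.mean()), float(scores.std()))
    return results


if __name__ == "__main__":
    # Synthetic data only; real module datasets are not reproduced here.
    X, y = make_classification(
        n_samples=400, n_features=10, n_informative=5, random_state=0
    )
    ranked = sorted(benchmark_models(X, y).items(), key=lambda kv: -kv[1][0])
    for name, (mean, std) in ranked:
        print(f"{name:20s} {mean:.3f} ± {std:.3f}")
```

Putting the scaler inside the pipeline keeps preprocessing statistics from leaking across folds, which keeps the comparison between models fair.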

Full methodology and experiment details are available in the published paper; module-specific evaluations are maintained internally.