VaaniSetuAI

ML model that converts Indian Sign Language (ISL) to text and text to ISL in real time

Description

Project Title: Sign Language to Text Translator using Google Cloud Vertex AI

Project Overview

This project aims to develop an AI-powered real-time sign language to text translation system using Google Cloud Vertex AI. The system processes live or recorded video streams, detects and classifies hand signs, and translates them into corresponding words or sentences. The solution is built with MLOps principles for automation, scalability, and continuous learning.

Problem Statement

Millions of people worldwide rely on sign language for communication, but a significant communication barrier exists between sign language users and non-signers. This project addresses this challenge by providing an AI-driven translation solution that converts sign language gestures into text in real time.

Project Workflow & Services Used

The implementation follows an end-to-end MLOps pipeline on Google Cloud, leveraging various services for data storage, preprocessing, model training, deployment, and monitoring.

1. Data Collection & Storage

  • Google Cloud Storage (GCS):

    • Stores raw videos, extracted image frames, and labeled datasets.

    • Organized in a structured format for easy access.

    • Example Structure:

      gs://sign-language-data/
        ├── raw_videos/
        ├── processed_frames/
        ├── labels.csv
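As a sketch of how this layout can be used from code, the helpers below build object paths under `processed_frames/` and upload a frame. The `frame_00000.jpg` naming convention is an illustrative assumption, and the upload helper assumes the `google-cloud-storage` client library plus application-default credentials:

```python
def frame_blob_path(video_id: str, frame_index: int,
                    bucket: str = "sign-language-data") -> str:
    """Build the GCS URI for one extracted frame, following the
    processed_frames/ layout shown above (naming is hypothetical)."""
    return f"gs://{bucket}/processed_frames/{video_id}/frame_{frame_index:05d}.jpg"

def upload_frame(local_path: str, video_id: str, frame_index: int) -> None:
    """Upload a single frame image to the dataset bucket."""
    from google.cloud import storage  # lazy import: only needed at upload time
    client = storage.Client()
    bucket = client.bucket("sign-language-data")
    blob_name = f"processed_frames/{video_id}/frame_{frame_index:05d}.jpg"
    bucket.blob(blob_name).upload_from_filename(local_path)
```

Keeping the path logic in one place means the preprocessing, training, and serving stages all agree on where frames live.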

2. Data Preprocessing & Feature Extraction

  • Google Cloud Dataflow:

    • Automates data preprocessing by extracting frames from videos.

    • Applies transformations such as image resizing, noise reduction, and augmentation.

    • Uses OpenCV and TensorFlow for feature extraction.
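A minimal sketch of the frame-extraction step, assuming `opencv-python` is available. The target sampling rate of 5 frames per second is an assumed tuning choice, not part of the project spec:

```python
def sample_indices(total_frames: int, fps: float, target_fps: float = 5.0) -> list[int]:
    """Pick evenly spaced frame indices so roughly `target_fps` frames
    are kept per second of video (target_fps is an assumption)."""
    step = max(1, round(fps / target_fps))
    return list(range(0, total_frames, step))

def extract_frames(video_path: str, size=(224, 224)):
    """Yield resized RGB frames from a video (requires opencv-python)."""
    import cv2  # lazy import so this module loads without OpenCV installed
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    keep = set(sample_indices(total, fps))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in keep:
            frame = cv2.resize(frame, size)
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV reads BGR
        idx += 1
    cap.release()
```

In the Dataflow pipeline this logic would run inside a `DoFn`, with augmentation (flips, brightness jitter) applied after resizing.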

3. Model Training & Development

  • Google Vertex AI Training:

    • Enables custom training of a deep learning model for sign language recognition.

    • Uses AutoML Vision for an initial baseline model and custom TensorFlow models where finer control over the architecture is needed.

    • Combines MediaPipe Hands for hand landmark detection with Convolutional Neural Networks (CNNs) for sign classification.

  • Google AI Platform Notebooks (now Vertex AI Workbench):

    • Provides an interactive environment for model experimentation and hyperparameter tuning.
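One way to feed MediaPipe Hands output into a classifier is to flatten its 21 hand landmarks into a fixed-length feature vector. The wrist-centering step below is an assumed normalization, not something the source prescribes:

```python
def landmarks_to_features(landmarks: list[tuple[float, float, float]]) -> list[float]:
    """Flatten the 21 (x, y, z) hand landmarks produced by MediaPipe
    Hands into a 63-dimensional feature vector, translated so the
    wrist (landmark 0) sits at the origin for position invariance."""
    if len(landmarks) != 21:
        raise ValueError("MediaPipe Hands emits exactly 21 landmarks per hand")
    wx, wy, wz = landmarks[0]
    return [coord for (x, y, z) in landmarks
            for coord in (x - wx, y - wy, z - wz)]
```

A small dense network or CNN trained on these vectors (one label per ISL sign) can then be packaged as a custom training job for Vertex AI Training.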

4. Model Deployment & API Serving

  • Vertex AI Endpoints:

    • Hosts the trained model as a scalable API.

    • Supports real-time and batch predictions.

  • Google Cloud Functions:

    • Processes real-time video frames and sends them to the model endpoint.

    • Returns predicted text results to the frontend application.
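A sketch of the serving call the Cloud Function would make, assuming the `google-cloud-aiplatform` SDK and a deployed image-classification endpoint. The instance shape and the idea that predictions decode to a text label are assumptions that depend on how the model was exported:

```python
import base64

def frame_to_instance(frame_bytes: bytes) -> dict:
    """Wrap raw JPEG bytes in the base64 JSON instance shape commonly
    used for Vertex AI image endpoints (shape is model-dependent)."""
    return {"content": base64.b64encode(frame_bytes).decode("ascii")}

def predict_sign(frame_bytes: bytes, endpoint_name: str):
    """Send one frame to a deployed Vertex AI endpoint; requires
    google-cloud-aiplatform and valid credentials."""
    from google.cloud import aiplatform  # lazy import: serving-time only
    endpoint = aiplatform.Endpoint(endpoint_name)
    response = endpoint.predict(instances=[frame_to_instance(frame_bytes)])
    return response.predictions[0]  # structure depends on the deployed model
```

The Cloud Function body would call `predict_sign` per frame and return the label to the frontend as JSON.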

5. Video Stream Processing & App Integration

  • Cloud Pub/Sub:

    • Handles real-time video streaming and message queuing.

    • Ensures smooth data flow between client devices and the backend model.

  • Cloud Run:

    • Deploys a lightweight backend service for video frame processing.

    • Manages API requests efficiently and scales automatically.
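The Pub/Sub flow above can be sketched as follows; the message schema (a GCS reference rather than raw pixels, to keep messages small) and the use of the device ID as an ordering key are design assumptions, and the publish helper requires the `google-cloud-pubsub` library:

```python
import json
import time

def frame_message(device_id: str, frame_index: int, frame_uri: str) -> bytes:
    """Serialize one frame event for Pub/Sub (schema is hypothetical)."""
    return json.dumps({
        "device_id": device_id,
        "frame_index": frame_index,
        "frame_uri": frame_uri,       # GCS reference, not raw pixels
        "published_at": time.time(),
    }).encode("utf-8")

def publish_frame(project: str, topic: str, message: bytes, device_id: str) -> None:
    """Publish with an ordering key so each device's frames are
    delivered in order (requires google-cloud-pubsub)."""
    from google.cloud import pubsub_v1  # lazy import: publish-time only
    publisher = pubsub_v1.PublisherClient(
        publisher_options=pubsub_v1.types.PublisherOptions(
            enable_message_ordering=True)
    )
    topic_path = publisher.topic_path(project, topic)
    publisher.publish(topic_path, message, ordering_key=device_id).result()
```

The Cloud Run service subscribes to the topic, fetches each referenced frame, and forwards it to the model endpoint.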

6. MLOps Pipeline & Automation

  • Vertex AI Pipelines (Kubeflow):

    • Automates data ingestion, training, model validation, and deployment.

    • Implements CI/CD for continuous improvement of model accuracy.

  • Cloud Scheduler:

    • Schedules model retraining jobs periodically.

    • Automates dataset updates with newly labeled sign language gestures.
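The scheduled job needs a policy for when a retraining run is worthwhile. A minimal sketch, with both thresholds as assumed tuning values: retrain once enough newly labeled gestures have accumulated, or unconditionally after a maximum age, then launch the Vertex AI pipeline:

```python
def should_retrain(new_labeled_samples: int, days_since_last_run: int,
                   min_samples: int = 500, max_age_days: int = 30) -> bool:
    """Decide whether the Cloud Scheduler-triggered job should launch
    a retraining pipeline run (thresholds are assumptions)."""
    return (new_labeled_samples >= min_samples
            or days_since_last_run >= max_age_days)
```

When this returns True, the triggered Cloud Function would submit a `PipelineJob` via the Vertex AI SDK to run the Kubeflow pipeline end to end.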

7. Model Monitoring & Logging

  • Vertex AI Model Monitoring:

    • Tracks model performance and detects concept drift.

    • Alerts the team if accuracy drops due to shifts in signing style or hand-sign appearance.

  • Cloud Logging & Cloud Monitoring:

    • Logs API requests and system performance.

    • Provides real-time analytics dashboards to monitor user activity.
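The drift alert described above can be reduced to a simple rolling-accuracy check; the 5-point tolerance below is an assumed threshold, and in practice Vertex AI Model Monitoring would compute this server-side:

```python
def accuracy_drop_alert(recent_accuracies: list[float], baseline: float,
                        tolerance: float = 0.05) -> bool:
    """Flag possible concept drift when the rolling mean accuracy falls
    more than `tolerance` below the baseline measured at deployment."""
    if not recent_accuracies:
        return False  # nothing observed yet, nothing to alert on
    rolling = sum(recent_accuracies) / len(recent_accuracies)
    return baseline - rolling > tolerance
```

A Cloud Monitoring alerting policy on this signal would then notify the team and, optionally, queue the affected frames for relabeling.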

Frontend & User Interface

  • Flutter (for Mobile App):

    • Provides an intuitive interface for real-time sign translation.

    • Uses Google ML Kit to process video input and send frames to the backend.

  • Web Dashboard (React.js + Firebase):

    • Displays translation results and analytics.

    • Allows users to upload new videos for model improvement.

Key Features & Impact

  • Real-time sign language translation using AI models.

  • Scalable and automated MLOps pipeline on Google Cloud.

  • Continuous improvement through model retraining.

  • Mobile and web support for accessibility.

  • Bridges the communication gap for the deaf and hard-of-hearing community.

Future Enhancements

  • Expand to multiple sign languages (e.g., ASL, BSL, ISL).

  • Integrate voice synthesis to convert text to speech.

  • Develop edge AI models for offline translation capabilities.

  • Crowdsourced data collection to improve dataset quality.

Conclusion

This project leverages Google Cloud Vertex AI and MLOps principles to create an end-to-end, scalable, and automated sign language translation system. It has the potential to significantly improve accessibility for sign language users worldwide, making communication easier and more inclusive.
