VaaniSetuAI

ML model that converts Indian Sign Language (ISL) to text and text to ISL in real time

Description

Project Title: Sign Language to Text Translator using Google Cloud Vertex AI

Project Overview

This project aims to develop an AI-powered real-time sign language to text translation system using Google Cloud Vertex AI. The system processes live or recorded video streams, detects and classifies hand signs, and translates them into corresponding words or sentences. The solution is built with MLOps principles for automation, scalability, and continuous learning.

Problem Statement

Millions of people worldwide rely on sign language for communication, but a significant communication barrier exists between sign language users and non-signers. This project addresses this challenge by providing an AI-driven translation solution that converts sign language gestures into text in real time.

Project Workflow & Services Used

The implementation follows an end-to-end MLOps pipeline on Google Cloud, leveraging various services for data storage, preprocessing, model training, deployment, and monitoring.

1. Data Collection & Storage

  • Google Cloud Storage (GCS):

    • Stores raw videos, extracted image frames, and labeled datasets.

    • Organized in a structured format for easy access.

    • Example Structure:

      gs://sign-language-data/
        ├── raw_videos/
        ├── processed_frames/
        ├── labels.csv
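As a sketch of how this layout can be used from code, the helpers below build object paths under `processed_frames/` and upload a frame. The `frame_00000.jpg` naming convention is an illustrative assumption, and the upload helper assumes the `google-cloud-storage` client library plus application-default credentials:

```python
def frame_blob_path(video_id: str, frame_index: int,
                    bucket: str = "sign-language-data") -> str:
    """Build the GCS URI for one extracted frame, following the
    processed_frames/ layout shown above (naming is hypothetical)."""
    return f"gs://{bucket}/processed_frames/{video_id}/frame_{frame_index:05d}.jpg"

def upload_frame(local_path: str, video_id: str, frame_index: int) -> None:
    """Upload a single frame image to the dataset bucket."""
    from google.cloud import storage  # lazy import: only needed at upload time
    client = storage.Client()
    bucket = client.bucket("sign-language-data")
    blob_name = f"processed_frames/{video_id}/frame_{frame_index:05d}.jpg"
    bucket.blob(blob_name).upload_from_filename(local_path)
```

Keeping the path logic in one place means the preprocessing, training, and serving stages all agree on where frames live.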

2. Data Preprocessing & Feature Extraction

  • Google Cloud Dataflow:

    • Automates data preprocessing by extracting frames from videos.

    • Applies transformations such as image resizing, noise reduction, and augmentation.

    • Uses OpenCV and TensorFlow for feature extraction.
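A minimal sketch of the frame-extraction step, assuming `opencv-python` is available. The target sampling rate of 5 frames per second is an assumed tuning choice, not part of the project spec:

```python
def sample_indices(total_frames: int, fps: float, target_fps: float = 5.0) -> list[int]:
    """Pick evenly spaced frame indices so roughly `target_fps` frames
    are kept per second of video (target_fps is an assumption)."""
    step = max(1, round(fps / target_fps))
    return list(range(0, total_frames, step))

def extract_frames(video_path: str, size=(224, 224)):
    """Yield resized RGB frames from a video (requires opencv-python)."""
    import cv2  # lazy import so this module loads without OpenCV installed
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    keep = set(sample_indices(total, fps))
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in keep:
            frame = cv2.resize(frame, size)
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # OpenCV reads BGR
        idx += 1
    cap.release()
```

In the Dataflow pipeline this logic would run inside a `DoFn`, with augmentation (flips, brightness jitter) applied after resizing.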

3. Model Training & Development

  • Google Vertex AI Training:

    • Enables custom training of a deep learning model for sign language recognition.

    • Uses AutoML Vision for an initial baseline model and custom TensorFlow models where finer control over the architecture is needed.

    • Combines MediaPipe Hands for hand landmark detection with Convolutional Neural Networks (CNNs) for sign classification.

  • Google AI Platform Notebooks (now Vertex AI Workbench):

    • Provides an interactive environment for model experimentation and hyperparameter tuning.
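One way to feed MediaPipe Hands output into a classifier is to flatten its 21 hand landmarks into a fixed-length feature vector. The wrist-centering step below is an assumed normalization, not something the source prescribes:

```python
def landmarks_to_features(landmarks: list[tuple[float, float, float]]) -> list[float]:
    """Flatten the 21 (x, y, z) hand landmarks produced by MediaPipe
    Hands into a 63-dimensional feature vector, translated so the
    wrist (landmark 0) sits at the origin for position invariance."""
    if len(landmarks) != 21:
        raise ValueError("MediaPipe Hands emits exactly 21 landmarks per hand")
    wx, wy, wz = landmarks[0]
    return [coord for (x, y, z) in landmarks
            for coord in (x - wx, y - wy, z - wz)]
```

A small dense network or CNN trained on these vectors (one label per ISL sign) can then be packaged as a custom training job for Vertex AI Training.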

4. Model Deployment & API Serving

  • Vertex AI Endpoints:

    • Hosts the trained model as a scalable API.

    • Supports real-time and batch predictions.

  • Google Cloud Functions:

    • Processes real-time video frames and sends them to the model endpoint.

    • Returns predicted text results to the frontend application.
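A sketch of the serving call the Cloud Function would make, assuming the `google-cloud-aiplatform` SDK and a deployed image-classification endpoint. The instance shape and the idea that predictions decode to a text label are assumptions that depend on how the model was exported:

```python
import base64

def frame_to_instance(frame_bytes: bytes) -> dict:
    """Wrap raw JPEG bytes in the base64 JSON instance shape commonly
    used for Vertex AI image endpoints (shape is model-dependent)."""
    return {"content": base64.b64encode(frame_bytes).decode("ascii")}

def predict_sign(frame_bytes: bytes, endpoint_name: str):
    """Send one frame to a deployed Vertex AI endpoint; requires
    google-cloud-aiplatform and valid credentials."""
    from google.cloud import aiplatform  # lazy import: serving-time only
    endpoint = aiplatform.Endpoint(endpoint_name)
    response = endpoint.predict(instances=[frame_to_instance(frame_bytes)])
    return response.predictions[0]  # structure depends on the deployed model
```

The Cloud Function body would call `predict_sign` per frame and return the label to the frontend as JSON.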

5. Video Stream Processing & App Integration

  • Cloud Pub/Sub:

    • Handles real-time video streaming and message queuing.

    • Ensures smooth data flow between client devices and the backend model.

  • Cloud Run:

    • Deploys a lightweight backend service for video frame processing.

    • Manages API requests efficiently and scales automatically.
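The Pub/Sub flow above can be sketched as follows; the message schema (a GCS reference rather than raw pixels, to keep messages small) and the use of the device ID as an ordering key are design assumptions, and the publish helper requires the `google-cloud-pubsub` library:

```python
import json
import time

def frame_message(device_id: str, frame_index: int, frame_uri: str) -> bytes:
    """Serialize one frame event for Pub/Sub (schema is hypothetical)."""
    return json.dumps({
        "device_id": device_id,
        "frame_index": frame_index,
        "frame_uri": frame_uri,       # GCS reference, not raw pixels
        "published_at": time.time(),
    }).encode("utf-8")

def publish_frame(project: str, topic: str, message: bytes, device_id: str) -> None:
    """Publish with an ordering key so each device's frames are
    delivered in order (requires google-cloud-pubsub)."""
    from google.cloud import pubsub_v1  # lazy import: publish-time only
    publisher = pubsub_v1.PublisherClient(
        publisher_options=pubsub_v1.types.PublisherOptions(
            enable_message_ordering=True)
    )
    topic_path = publisher.topic_path(project, topic)
    publisher.publish(topic_path, message, ordering_key=device_id).result()
```

The Cloud Run service subscribes to the topic, fetches each referenced frame, and forwards it to the model endpoint.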

6. MLOps Pipeline & Automation

  • Vertex AI Pipelines (Kubeflow):

    • Automates data ingestion, training, model validation, and deployment.

    • Implements CI/CD for continuous improvement of model accuracy.

  • Cloud Scheduler:

    • Schedules model retraining jobs periodically.

    • Automates dataset updates with newly labeled sign language gestures.
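The scheduled job needs a policy for when a retraining run is worthwhile. A minimal sketch, with both thresholds as assumed tuning values: retrain once enough newly labeled gestures have accumulated, or unconditionally after a maximum age, then launch the Vertex AI pipeline:

```python
def should_retrain(new_labeled_samples: int, days_since_last_run: int,
                   min_samples: int = 500, max_age_days: int = 30) -> bool:
    """Decide whether the Cloud Scheduler-triggered job should launch
    a retraining pipeline run (thresholds are assumptions)."""
    return (new_labeled_samples >= min_samples
            or days_since_last_run >= max_age_days)
```

When this returns True, the triggered Cloud Function would submit a `PipelineJob` via the Vertex AI SDK to run the Kubeflow pipeline end to end.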

7. Model Monitoring & Logging

  • Vertex AI Model Monitoring:

    • Tracks model performance and detects concept drift.

    • Alerts the team if accuracy drops due to shifts in signing style or hand-sign appearance.

  • Cloud Logging & Cloud Monitoring:

    • Logs API requests and system performance.

    • Provides real-time analytics dashboards to monitor user activity.
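The drift alert described above can be reduced to a simple rolling-accuracy check; the 5-point tolerance below is an assumed threshold, and in practice Vertex AI Model Monitoring would compute this server-side:

```python
def accuracy_drop_alert(recent_accuracies: list[float], baseline: float,
                        tolerance: float = 0.05) -> bool:
    """Flag possible concept drift when the rolling mean accuracy falls
    more than `tolerance` below the baseline measured at deployment."""
    if not recent_accuracies:
        return False  # nothing observed yet, nothing to alert on
    rolling = sum(recent_accuracies) / len(recent_accuracies)
    return baseline - rolling > tolerance
```

A Cloud Monitoring alerting policy on this signal would then notify the team and, optionally, queue the affected frames for relabeling.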

Frontend & User Interface

  • Flutter (for Mobile App):

    • Provides an intuitive interface for real-time sign translation.

    • Uses Google ML Kit to process video input and send frames to the backend.

  • Web Dashboard (React.js + Firebase):

    • Displays translation results and analytics.

    • Allows users to upload new videos for model improvement.

Key Features & Impact

  • Real-time sign language translation using AI models.

  • Scalable and automated MLOps pipeline on Google Cloud.

  • Continuous improvement through model retraining.

  • Mobile and web support for accessibility.

  • Bridges the communication gap for the deaf and hard-of-hearing community.

Future Enhancements

  • Expand to multiple sign languages (e.g., ASL, BSL, ISL).

  • Integrate voice synthesis to convert text to speech.

  • Develop edge AI models for offline translation capabilities.

  • Crowdsourced data collection to improve dataset quality.

Conclusion

This project leverages Google Cloud Vertex AI and MLOps principles to create an end-to-end, scalable, and automated sign language translation system. It has the potential to significantly improve accessibility for sign language users worldwide, making communication easier and more inclusive.
