Lightning Talk
Intermediate

Exploring FrameStory: Architecture and extending support to modern Python tooling for improved accessibility

Approved

The talk dives into how FrameStory works. FrameStory is a Python library for generating video descriptions, which are integral to making video content accessible to people with visual impairments.

It touches on aspects such as:

  • Image captioning using BLIP (Salesforce/blip-image-captioning-large) and its advantages over YOLO (traditionally used for object identification) and CLIP (traditionally used for semantic indexing and search of multimodal content)

  • Extraction of significant frames using OpenCV by checking whether the difference between consecutive frames exceeds a threshold

  • Generation and de-duplication of captions for significant frames
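
The three steps above can be sketched roughly as follows. This is an illustrative outline, not FrameStory's actual code: every function name, the mean-absolute-difference metric, the default threshold of 30.0, and the rule of dropping only consecutive duplicate captions are assumptions made for the example. Only the BLIP checkpoint name comes from the talk description. The OpenCV, Pillow and transformers imports are deferred so that the frame-selection and de-duplication logic can be tried without them.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")
_NO_FRAME = object()  # sentinel: no previous frame seen yet


def read_frames(path: str) -> Iterator[Any]:
    """Yield decoded BGR frames from a video file using OpenCV."""
    import cv2  # deferred: only needed when reading real video

    capture = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            yield frame
    finally:
        capture.release()


def mean_abs_diff(a: Any, b: Any) -> float:
    """Assumed metric: mean absolute grayscale difference of two frames."""
    import cv2

    gray_a = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(b, cv2.COLOR_BGR2GRAY)
    return float(cv2.absdiff(gray_a, gray_b).mean())


def select_significant(
    frames: Iterable[Frame],
    diff: Callable[[Frame, Frame], float],
    threshold: float,
) -> Iterator[Tuple[int, Frame]]:
    """Keep the first frame, then every frame whose difference from the
    frame immediately before it exceeds the threshold."""
    previous: Any = _NO_FRAME
    for index, frame in enumerate(frames):
        if previous is _NO_FRAME or diff(previous, frame) > threshold:
            yield index, frame
        previous = frame


def load_blip(name: str = "Salesforce/blip-image-captioning-large"):
    """Load the BLIP processor and model from Hugging Face transformers."""
    from transformers import BlipForConditionalGeneration, BlipProcessor

    return (
        BlipProcessor.from_pretrained(name),
        BlipForConditionalGeneration.from_pretrained(name),
    )


def caption_frame(frame_bgr: Any, processor: Any, model: Any) -> str:
    """Caption one OpenCV frame with BLIP."""
    import cv2
    from PIL import Image

    image = Image.fromarray(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    return processor.decode(output_ids[0], skip_special_tokens=True)


def dedupe_captions(captions: Iterable[str]) -> List[str]:
    """Drop empty captions and consecutive repeats, ignoring case and spacing."""
    kept: List[str] = []
    last_key = None
    for caption in captions:
        key = " ".join(caption.casefold().split())
        if key and key != last_key:
            kept.append(caption.strip())
            last_key = key
    return kept


def describe_video(path: str, threshold: float = 30.0) -> List[str]:
    """Hypothetical end-to-end pipeline: frames -> key frames -> captions."""
    processor, model = load_blip()
    key_frames = select_significant(read_frames(path), mean_abs_diff, threshold)
    captions = [caption_frame(f, processor, model) for _, f in key_frames]
    return dedupe_captions(captions)
```

Calling describe_video("talk.mp4") would need opencv-python, Pillow, transformers and a PyTorch install. Captioning only the selected frames, rather than every frame, is what keeps the number of BLIP calls small.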

Furthermore, the talk touches on framestoryx, a fork of FrameStory that I develop under FOSSIA, which makes the library compatible with modern Python tooling such as Poetry and uv for broader coverage.

This fork is being used with TranscribeIt, a free software multimedia accessibility application developed under FOSSIA.

Work is under way on the fork to support segmented descriptions, asynchronous I/O so that video downloads do not block asynchronous functions, and multilingual descriptions for improved accessibility.
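
One possible shape for non-blocking downloads is sketched below. It is not taken from framestoryx: the function names, the concurrency limit, and the choice of wrapping a blocking fetch with asyncio.to_thread (Python 3.9+) are all assumptions made for illustration.

```python
import asyncio
import urllib.request
from typing import Callable, Iterable, List


def fetch_bytes(url: str) -> bytes:
    """Blocking download of one URL (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


async def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_bytes,
    max_concurrent: int = 4,
) -> List[bytes]:
    """Run blocking fetches in worker threads so the event loop stays free.

    Results are returned in the same order as the input URLs.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(url: str) -> bytes:
        async with semaphore:  # cap the number of simultaneous downloads
            return await asyncio.to_thread(fetch, url)

    return list(await asyncio.gather(*(fetch_one(url) for url in urls)))
```

Inside an asynchronous function this would be used as `videos = await download_all(urls)`, leaving other coroutines free to run while the downloads are in progress.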

  • Understand how image captioning with BLIP works in FrameStory

  • Understand the process of significant frame extraction and its role in efficient video description generation

  • See the improvements to FrameStory made through the development of framestoryx, and the scope for further improvements for the future of accessibility

Introducing a FOSS project or a new version of a popular project
Story of a FOSS project - from inception to growth
Contributing to FOSS
Technology architecture

Keerthana Rajesh Kumar
Founder FOSSIA
https://libremusings.dev

Approvability: 100%
Approvals: 1
Rejections: 0
Not Sure: 0

I like to see FOSS being used for a11y

Reviewer #1
Approved