This talk dives into FrameStory, a Python library for generating video descriptions, which are integral to making video content accessible to people with visual impairments.
It touches on aspects such as:
Image captioning using BLIP (Salesforce/blip-image-captioning-large) and its advantages over YOLO and CLIP, which are traditionally used for object detection and for semantic indexing and search of multimodal content, respectively
Extraction of significant frames using OpenCV by checking whether the difference between consecutive frames exceeds a threshold
Generation and de-duplication of captions for significant frames
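The extraction and de-duplication steps above can be sketched roughly as follows. This is an illustrative sketch, not FrameStory's actual code: `frame_difference` stands in for OpenCV's per-pixel difference (computed here on plain nested lists so the example is self-contained), `dedupe_captions` uses `difflib` similarity, and the threshold values are assumptions.

```python
from difflib import SequenceMatcher

def frame_difference(frame_a, frame_b):
    """Mean absolute pixel difference between two grayscale frames
    (frames as 2-D lists of 0-255 ints; stands in for cv2.absdiff)."""
    total = count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count

def significant_frames(frames, threshold=30.0):
    """Keep the first frame, then any frame whose difference from the
    previously kept frame exceeds the threshold."""
    if not frames:
        return []
    kept = [frames[0]]
    for frame in frames[1:]:
        if frame_difference(kept[-1], frame) > threshold:
            kept.append(frame)
    return kept

def dedupe_captions(captions, similarity=0.85):
    """Drop captions that are near-duplicates of one already kept."""
    unique = []
    for caption in captions:
        if all(SequenceMatcher(None, caption, seen).ratio() < similarity
               for seen in unique):
            unique.append(caption)
    return unique

# Toy 2x2 "frames": a static shot followed by a scene change.
frames = [[[10, 10], [10, 10]],
          [[12, 11], [10, 10]],       # near-identical -> skipped
          [[200, 190], [180, 210]]]   # big change -> kept
print(len(significant_frames(frames)))  # 2
print(dedupe_captions(["a dog on grass", "a dog on the grass",
                       "a city street"]))
```

The same threshold idea carries over directly when the frames are real OpenCV/NumPy arrays; only the difference computation changes.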
Furthermore, the talk covers framestoryx, a fork of FrameStory I develop under FOSSIA, which extends the library to be compatible with modern Python tooling such as Poetry and uv for broader coverage.
This fork is used by TranscribeIt, a free-software multimedia accessibility application developed under FOSSIA.
Work on the fork is ongoing to support segmented descriptions, asynchronous I/O so that video downloads do not block other operations in asynchronous functions, and multilingual descriptions for improved accessibility.
Understand how image captioning with BLIP works in FrameStory
Understand how significant frames are extracted and why this makes video description generation efficient
See how framestoryx improves on FrameStory, and the scope for further improvements for the future of accessibility
I like to see FOSS being used for a11y