// Featured Project

RSANS Lyric Alignment & Visualization Engine

A headless CLI tool that turns audio and plain-text lyrics into a karaoke-style lyric video — entirely offline. Frame-accurate word timing, automatic rhyme detection, and ASS subtitle rendering via FFmpeg.

C++ CMake FFmpeg
RSANS Lyric Visualization Engine in action
Overview

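RSANS tokenizes the lyrics, runs the audio through an on-device Whisper model for word-level timing, and aligns the transcribed tokens against the lyric tokens with Needleman-Wunsch dynamic programming. A single dropped or misheard word then becomes a local gap instead of shifting every word after it.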
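The alignment step can be sketched roughly as follows. This is a minimal illustration of Needleman-Wunsch over word tokens, not the project's actual Aligner: the function name `alignTokens`, the exact-string comparison, and the score values are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Illustrative scores only; the project's real weights are not given in the text.
constexpr int kMatch = 2;
constexpr int kMismatch = -1;
constexpr int kGap = -1;

// Globally aligns lyric tokens against model tokens (hypothetical helper).
// Each returned pair is (lyricIndex, modelIndex); -1 marks a gap on that side.
std::vector<std::pair<int, int>> alignTokens(const std::vector<std::string>& lyric,
                                             const std::vector<std::string>& model) {
    const int n = static_cast<int>(lyric.size());
    const int m = static_cast<int>(model.size());

    // score[i][j] = best score aligning the first i lyric and first j model tokens.
    std::vector<std::vector<int>> score(n + 1, std::vector<int>(m + 1, 0));
    for (int i = 1; i <= n; ++i) score[i][0] = i * kGap;
    for (int j = 1; j <= m; ++j) score[0][j] = j * kGap;

    auto sub = [&](int i, int j) {
        return lyric[i - 1] == model[j - 1] ? kMatch : kMismatch;
    };

    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            score[i][j] = std::max({score[i - 1][j - 1] + sub(i, j),  // pair the tokens
                                    score[i - 1][j] + kGap,           // lyric word unmatched
                                    score[i][j - 1] + kGap});         // extra model word
        }
    }

    // Trace back from the bottom-right corner to recover the pairing.
    std::vector<std::pair<int, int>> pairs;
    int i = n, j = m;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 && score[i][j] == score[i - 1][j - 1] + sub(i, j)) {
            pairs.emplace_back(i - 1, j - 1);
            --i; --j;
        } else if (i > 0 && score[i][j] == score[i - 1][j] + kGap) {
            pairs.emplace_back(i - 1, -1);
            --i;
        } else {
            pairs.emplace_back(-1, j - 1);
            --j;
        }
    }
    assert(i == 0 && j == 0);
    std::reverse(pairs.begin(), pairs.end());
    return pairs;
}
```

If the model drops "me" from "hold me close tonight", the sketch pairs "me" with a gap and still matches "close" and "tonight" to their transcribed counterparts, which is the non-cascading behavior described above. A real aligner would likely normalize case and punctuation and score near-matches more leniently than exact string equality.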

Key Features
  • Frame-accurate lyric sync via an on-device whisper.cpp model
  • Union-find rhyme grouping with automatic shared color assignment
  • libass integration for precise glyph measurement and subtitle rendering
  • JSON-first composable pipeline — re-run any stage independently
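
The union-find rhyme grouping in the feature list could look something like the sketch below. The names `DisjointSet`, `roughlyRhymes`, and `assignRhymeColors` are hypothetical, and the rhyme test (same last two letters) is a deliberately crude stand-in, since the text does not say how RSANS actually detects rhymes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// Minimal union-find with path halving (hypothetical class, not the project's).
class DisjointSet {
public:
    explicit DisjointSet(std::size_t n) : parent_(n) {
        std::iota(parent_.begin(), parent_.end(), std::size_t{0});
    }
    std::size_t find(std::size_t x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // shorten the path as we walk up
            x = parent_[x];
        }
        return x;
    }
    void unite(std::size_t a, std::size_t b) { parent_[find(a)] = find(b); }

private:
    std::vector<std::size_t> parent_;
};

// Assumption for illustration: two words "rhyme" if their last two letters match.
bool roughlyRhymes(const std::string& a, const std::string& b) {
    if (a.size() < 2 || b.size() < 2) return false;
    return a.compare(a.size() - 2, 2, b, b.size() - 2, 2) == 0;
}

// Returns one color index per word; -1 means the word rhymes with nothing.
std::vector<int> assignRhymeColors(const std::vector<std::string>& words) {
    DisjointSet sets(words.size());
    for (std::size_t i = 0; i < words.size(); ++i)
        for (std::size_t j = i + 1; j < words.size(); ++j)
            if (roughlyRhymes(words[i], words[j])) sets.unite(i, j);

    // Count members per group so singletons stay uncolored.
    std::map<std::size_t, int> groupSize;
    for (std::size_t i = 0; i < words.size(); ++i) ++groupSize[sets.find(i)];

    std::map<std::size_t, int> colorOfRoot;
    std::vector<int> colors(words.size(), -1);
    int nextColor = 0;
    for (std::size_t i = 0; i < words.size(); ++i) {
        const std::size_t root = sets.find(i);
        if (groupSize[root] < 2) continue;
        auto it = colorOfRoot.find(root);
        if (it == colorOfRoot.end()) it = colorOfRoot.insert({root, nextColor++}).first;
        colors[i] = it->second;  // every member of a group shares its root's color
    }
    return colors;
}
```

Union-find fits here because rhyme is treated as transitive: if "night" rhymes with "light" and "light" with "sight", all three end up in one set and share a color without comparing every pair against a group list.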
Architecture

A CLI dispatcher routes four commands (analyze, rhyme, export, and full), each backed by an isolated C++ module. Whisper handles transcription, the Aligner runs Needleman-Wunsch over the lyric tokens versus the model's tokens, RhymeGrouper applies union-find, and the ASS renderer initializes libass to measure true glyph height before generating subtitle events. FFmpeg then burns the subtitles into the video and encodes the final MP4 through its subtitles filter (-vf subtitles=).
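
A dispatcher of this shape might be sketched as a lookup table from command name to handler. Everything below is an assumption for illustration: the handler names, the `Args` type, the exit codes, and in particular the idea that `full` simply chains the other three stages in order.

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <vector>

using Args = std::vector<std::string>;
using Handler = std::function<int(const Args&)>;

// Hypothetical stage entry points; the real modules would do the actual work.
int runAnalyze(const Args&) { std::cout << "analyze: transcribe and align\n"; return 0; }
int runRhyme(const Args&)   { std::cout << "rhyme: group rhyming words\n";   return 0; }
int runExport(const Args&)  { std::cout << "export: render ASS and encode\n"; return 0; }

// Assumed behavior: "full" runs each stage in order and stops at the first failure.
int runFull(const Args& args) {
    if (int rc = runAnalyze(args)) return rc;
    if (int rc = runRhyme(args)) return rc;
    return runExport(args);
}

// Looks up the command and forwards the remaining arguments to its handler.
int dispatch(const std::string& command, const Args& args) {
    static const std::map<std::string, Handler> table = {
        {"analyze", runAnalyze},
        {"rhyme", runRhyme},
        {"export", runExport},
        {"full", runFull},
    };
    const auto it = table.find(command);
    if (it == table.end()) {
        std::cerr << "unknown command: " << command << "\n";
        return 2;  // arbitrary non-zero exit code chosen for this sketch
    }
    return it->second(args);
}
```

Keeping each stage behind its own handler is what lets a JSON-first pipeline re-run one stage in isolation: `dispatch("rhyme", args)` would only need the previous stage's output on disk, not a fresh transcription.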

Screenshots