Splitting An Ogg Opus File Stream
Solution 1:
OpusFileSplitter can split Opus audio files.
Ogg pages can be read independently as long as the file starts with the header pages: the Beginning of Stream (BOS) identification header page and the comment header page. You can split one Ogg file into several files by creating new files that each start with those header pages, followed by the Ogg audio data pages. For example, this Ogg Opus file:
*************************************************
* Header * Audio Data * Audio Data * Audio Data *
* Page   * Page 1     * Page 2     * Page 3     *
*************************************************
Could be split into 2 files:
***********************
* Header * Audio Data *
* Page   * Page 1     *
***********************

************************************
* Header * Audio Data * Audio Data *
* Page   * Page 2     * Page 3     *
************************************
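The split pictured above could be sketched in Python roughly as follows. This is only an illustration, not OpusFileSplitter's code: the function name split_pages and its parameters are hypothetical, the pages are assumed to be already separated into byte strings, and the header is assumed to occupy the first two pages (identification header, then comment header), which may not hold for files with very large tags. The sketch also does not renumber pages or recompute checksums, so a strict decoder may complain about the result.

```python
def split_pages(pages, boundaries, header_count=2):
    """Build one byte string per output file.

    pages        -- list of raw Ogg pages (bytes), header pages first
    boundaries   -- indexes into the audio pages where a new file starts
    header_count -- ASSUMPTION: how many leading pages form the header
    """
    headers = b"".join(pages[:header_count])
    audio = pages[header_count:]
    # Turn the boundaries into consecutive [start, end) ranges.
    cuts = [0, *sorted(boundaries), len(audio)]
    return [
        headers + b"".join(audio[start:end])
        for start, end in zip(cuts, cuts[1:])
        if start < end  # skip empty ranges
    ]


# Placeholder byte strings stand in for real pages here.
pages = [b"<id>", b"<tags>", b"<audio1>", b"<audio2>", b"<audio3>"]
first, second = split_pages(pages, boundaries=[1])
```

Here first holds the header pages plus audio page 1, and second holds the header pages plus audio pages 2 and 3, matching the two files in the diagram.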
You're correct that audio packets can be split and span multiple pages. I'm assuming that a few milliseconds of audio could be lost if a file starts or ends on a page that contains an incomplete packet, but that should not disrupt speech recognition. Unfortunately, my local tests only used Opus files generated by the opusenc utility, which didn't split packets across pages, so I haven't exercised that case. Not spanning pages does seem to be a good thing for splitting files, though!
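If you want to detect that case rather than accept the loss, the Ogg page header (RFC 3533) has a flag for it: bit 0x01 of the header type byte is set when a page begins with data continued from the previous page. A minimal Python sketch, assuming you already have one whole page as bytes (the function name starts_mid_packet is made up for this example):

```python
CONTINUED_PACKET = 0x01  # header type flag: page continues a packet from the previous page


def starts_mid_packet(page: bytes) -> bool:
    """Return True if this Ogg page begins with the tail of an earlier packet."""
    if len(page) < 27 or page[:4] != b"OggS":
        raise ValueError("not an Ogg page")
    # Byte 5 is the header type field (continued / BOS / EOS flags).
    return bool(page[5] & CONTINUED_PACKET)
```

A splitter could then avoid starting a new file on any page for which this returns True.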
OpusFileSplitter.scanPages() shows how to find the page boundaries.
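For a rough idea of what finding page boundaries involves, here is a Python sketch based on the page layout in RFC 3533. It is not the actual scanPages() code, and the name scan_pages is hypothetical. Each page starts with the capture pattern OggS, stores its segment count at byte 26, and is followed by a lacing table whose values add up to the payload size.

```python
CAPTURE_PATTERN = b"OggS"
HEADER_FIXED_SIZE = 27  # bytes before the segment (lacing) table


def scan_pages(data: bytes):
    """Return (offset, length, header_type) for every Ogg page in data."""
    pages = []
    pos = 0
    while pos < len(data):
        header_end = pos + HEADER_FIXED_SIZE
        if data[pos:pos + 4] != CAPTURE_PATTERN or header_end > len(data):
            raise ValueError(f"no complete Ogg page header at offset {pos}")
        header_type = data[pos + 5]     # continued / BOS / EOS flags
        segment_count = data[pos + 26]  # number of lacing values
        lacing = data[header_end:header_end + segment_count]
        # Page length = fixed header + lacing table + payload bytes.
        length = HEADER_FIXED_SIZE + segment_count + sum(lacing)
        if len(lacing) != segment_count or pos + length > len(data):
            raise ValueError(f"truncated Ogg page at offset {pos}")
        pages.append((pos, length, header_type))
        pos += length
    return pages
```

The sketch reads the whole file into memory and does not verify page checksums, which keeps it short but makes it unsuitable for very large or damaged files.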