From Fedora Project Wiki


ibus-speech-to-text WhisperCpp support

Summary

ibus-speech-to-text 0.7.0 adds support for OpenAI's Whisper engine through pywhispercpp (Python bindings for WhisperCpp), alongside the existing Vosk engine.

Owner

Current status


Detailed Description

Key changes in ibus-speech-to-text 0.7.0:

  • A new backend engine option lets users choose between the Vosk and Whisper engines
  • A new GStreamer engine integrates WhisperCpp into the ibus-speech-to-text pipeline
  • Multiple Whisper models are supported, including locally installed models and online models downloaded from Hugging Face
  • A model is selected automatically based on the locale when possible
  • The setup tool UI is updated to allow backend switching and model management
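
The locale-based selection and the local model lookup described above can be sketched roughly as follows. This is only an illustration, not the code shipped in ibus-speech-to-text: the function names, the default model size, and the search directories are assumptions. It relies on two conventions from Whisper and WhisperCpp: English-only model variants carry a ".en" suffix, and model files are named ggml-<name>.bin.

```python
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical default size; the real project may choose differently.
DEFAULT_SIZE = "base"
# Whisper sizes that also exist as English-only ".en" variants.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def model_name_for_locale(locale: str, size: str = DEFAULT_SIZE) -> str:
    """Pick a Whisper model name for a locale such as 'en_US.UTF-8'."""
    # Keep only the language part: 'en_US.UTF-8' -> 'en'.
    language = locale.replace("-", "_").split("_")[0].split(".")[0].lower()
    if language == "en" and size in ENGLISH_ONLY_SIZES:
        return f"{size}.en"
    # Every other locale uses the multilingual model of the same size.
    return size


def find_local_model(name: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first existing model file for `name`, or None."""
    filename = f"ggml-{name}.bin"  # WhisperCpp file naming convention
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None  # a caller would fall back to downloading the model
```

With these assumptions, a caller would first resolve the model name for the session locale, then look for a local file, and only offer a download when nothing is found.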

Feedback

Benefit to Fedora

This Change brings the following benefits to Fedora:

  • Higher accuracy speech recognition
  • Greater flexibility by allowing users to choose between multiple backends


Scope

  • Proposal owners:
    • Package pywhispercpp ([1]) [done]
  • Other developers: N/A
  • Policies and guidelines: N/A (not needed for this Change)
  • Trademark approval: N/A (not needed for this Change)
  • Alignment with the Fedora Strategy:

Upgrade/compatibility impact

Existing ibus-speech-to-text installations will continue to use the Vosk backend by default. No existing configuration or functionality is removed.
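
As a minimal sketch of this fallback behaviour, assuming the backend choice is stored under a hypothetical "backend" key with the illustrative values "vosk" and "whisper" (the actual setting names in ibus-speech-to-text may differ):

```python
# Illustrative names only; not taken from the ibus-speech-to-text sources.
SUPPORTED_BACKENDS = ("vosk", "whisper")
DEFAULT_BACKEND = "vosk"


def effective_backend(settings: dict) -> str:
    """Return the backend to use, falling back to Vosk when unset or unknown."""
    value = str(settings.get("backend", DEFAULT_BACKEND)).lower()
    return value if value in SUPPORTED_BACKENDS else DEFAULT_BACKEND
```

An installation upgraded from an earlier version has no backend setting yet, so under this sketch effective_backend({}) resolves to Vosk and nothing changes until the user selects Whisper.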

Early Testing (Optional)

Do you require 'QA Blueprint' support? N

How To Test

Functionality Test

1. Install the required packages: sudo dnf install ibus-speech-to-text

2. Restart IBus with the ibus restart command

3. Add Speech To Text to the input sources

4. Launch the IBus STT Setup tool from the preferences to configure the engine and download a language model

5. In the Setup tool, select Whisper as the backend, then select and download a Whisper model from the list of models available for each locale
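
Before walking through the steps above, it can help to confirm that the pieces they rely on are visible on the test system. The following is a small, optional pre-flight sketch; the preflight function is hypothetical and not part of ibus-speech-to-text. It only checks that the ibus command is on PATH and that the pywhispercpp Python module can be found, without importing it or loading a model.

```python
import importlib.util
import shutil


def preflight() -> dict:
    """Report whether the pieces needed for the Whisper backend are visible."""
    return {
        # The `ibus` command-line tool is needed for `ibus restart`.
        "ibus": shutil.which("ibus") is not None,
        # pywhispercpp is the Python module behind the Whisper backend.
        "pywhispercpp": importlib.util.find_spec("pywhispercpp") is not None,
    }


if __name__ == "__main__":
    for name, found in preflight().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If either entry is reported as missing, reinstalling the package from step 1 is the first thing to try.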

User Experience

Users will see a new backend option in the ibus-speech-to-text settings, along with a selection of Whisper models to choose from.

Dependencies

  • pywhispercpp

Contingency Plan

  • Contingency mechanism: N/A (not a system-wide change)
  • Contingency deadline: N/A (not a system-wide change)
  • Blocks release? N/A (not a system-wide change)


Documentation

N/A (not a system-wide change)


Release Notes

ibus-speech-to-text now supports the WhisperCpp speech recognition engine via pywhispercpp, providing improved accuracy and multilingual support.