Cohere Unveils Open-Source Voice Model “Transcribe” for Automatic Speech Recognition

TechCrunch

Key Points

  • Cohere released Transcribe, an open‑source ASR model with 2 billion parameters.
  • The model runs on consumer‑grade GPUs and supports 14 languages.
  • Transcribe achieved an average word‑error rate of 5.42 percent on the Hugging Face Open ASR leaderboard.
  • Human evaluators gave it a 61 percent win rate over competing models for accuracy, coherence and usability.
  • It can process 525 minutes of audio per minute of compute time.
  • Cohere will integrate Transcribe into its North platform and offer a free API.
  • The model will also be available on Cohere’s Model Vault managed inference service.
  • Cohere reported $240 million in annual recurring revenue for 2025 and hinted at a possible IPO.
  • The launch was announced at a TechCrunch event in San Francisco, October 13‑15, 2026.

Enterprise AI company Cohere launched its first voice model, Transcribe, an open‑source automatic speech recognition system with 2 billion parameters. Designed to run on consumer‑grade GPUs, the model supports 14 languages and posts an average word‑error rate of 5.42 percent on the Hugging Face Open ASR leaderboard, which Cohere says beats every other model on that benchmark. Cohere plans to embed Transcribe in its North orchestration platform, offer free API access, and host it on its Model Vault service. The rollout follows reports of $240 million in annual recurring revenue and hints of a possible near‑term public listing.

Introducing Transcribe

Cohere, an enterprise‑focused AI company, announced the launch of its inaugural voice model called Transcribe. The model is open source and targets automatic speech recognition (ASR) use cases such as note‑taking and speech analysis. With a relatively modest size of 2 billion parameters, Transcribe can run on consumer‑grade graphics processing units, making self‑hosting feasible for a broad range of developers.
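To illustrate why a 2‑billion‑parameter model fits on consumer hardware, here is a rough back‑of‑the‑envelope sketch in Python. It estimates the memory needed for the weights alone. The numeric precisions listed are assumptions for illustration; the article does not say which precision Transcribe ships in, and real usage adds overhead for activations and audio buffers.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the model weights only, in GiB."""
    return num_params * bytes_per_param / 1024**3


PARAMS = 2e9  # parameter count reported in the article

# Assumed precisions (hypothetical; not confirmed for Transcribe).
for label, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
    print(f"{label}: {weight_memory_gib(PARAMS, nbytes):.2f} GiB")
```

Under the 16‑bit assumption the weights come to roughly 3.7 GiB, which is within the memory of many consumer graphics cards.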

Language Coverage and Performance

Transcribe currently supports fourteen languages: English, French, German, Italian, Spanish, Portuguese, Greek, Dutch, Polish, Chinese, Japanese, Korean, Vietnamese, and Arabic. On the Hugging Face Open ASR leaderboard, the model achieved an average word‑error rate (WER) of 5.42 percent, which Cohere says is lower than that of any other model on the benchmark. In human evaluations of accuracy, coherence and usability, Transcribe won 61 percent of comparisons against competing systems. Results were weaker in Portuguese, German and Spanish, where it fell behind some rivals.
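For readers unfamiliar with the metric, word‑error rate is the number of word substitutions, deletions and insertions needed to turn the model's transcript into the reference transcript, divided by the number of reference words. The Python sketch below computes it with a standard word‑level edit distance. It is a minimal illustration, not the leaderboard's scoring code: the function name is hypothetical, and benchmarks typically normalize text (case, punctuation) before scoring, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {score:.2%}")
```

A WER of 5.42 percent therefore corresponds to roughly five or six word errors per hundred reference words, averaged across the benchmark's test sets.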

Speed and Integration Plans

Cohere reports that Transcribe can process 525 minutes of audio per minute of compute time, a high throughput for a model of its class. The company intends to integrate the model into North, its enterprise agent orchestration platform, and will make it available through a free API. Transcribe will also be hosted on Model Vault, Cohere’s managed inference platform, for customers who prefer a managed service to self‑hosting.
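The throughput figure can be turned into a rough processing‑time estimate, as in the short Python sketch below. It assumes the reported ratio of 525 audio minutes per compute minute holds steadily; the article does not state the hardware or batch settings behind that number, so actual times may differ. The function name is hypothetical.

```python
def processing_minutes(audio_minutes: float, speed_factor: float = 525.0) -> float:
    """Estimated compute time, given audio minutes processed per compute minute."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_minutes / speed_factor


# Example: a 1,000-hour audio archive at the reported rate.
archive_minutes = 1000 * 60
print(f"{processing_minutes(archive_minutes):.1f} minutes of compute")
```

At the reported rate, a 1,000‑hour archive would take a little under two hours of compute.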

Market Context and Company Outlook

The launch comes as demand for speech‑recognition tools grows, fueled by note‑taking and dictation applications such as Granola and Wispr Flow. Earlier this year, Cohere reportedly told investors it generated $240 million in annual recurring revenue for 2025, and its CEO, Aidan Gomez, indicated the startup may go public “soon.”

Event Details

The announcement was made at a TechCrunch event in San Francisco, California, held October 13‑15, 2026.
