A guest post from the Swedish Financial Supervisory Authority (Finansinspektionen, FI), in which FI describes how it uses machine learning to analyze financial data and perform risk assessments.
How can new AI methods be used to improve the searchability and accessibility of visual heritage collections? We’ve taken advantage of the possibilities opened up by multimodal AI models to produce a demo showcasing these capabilities, using postcards as an example. Here we explain the project behind the demo and encourage you to explore it: https://lab.kb.se/bildsok
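To give a hint of how such multimodal search works under the hood, here is a sketch using a CLIP-style model from the sentence-transformers library. The post does not name the demo's actual model, so "clip-ViT-B-32" and the file names below are purely illustrative assumptions:

```python
# Hypothetical sketch of multimodal image search with a CLIP-style model.
# "clip-ViT-B-32" and the postcard file names are illustrative assumptions.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")

# Encode a small collection of postcard scans into the shared embedding space.
image_paths = ["postcard_001.jpg", "postcard_002.jpg", "postcard_003.jpg"]
image_embeddings = model.encode([Image.open(p) for p in image_paths])

# A free-text query is encoded into the same space, so retrieval is just
# nearest-neighbour search over cosine similarities.
query_embedding = model.encode("a harbour with sailing boats")
hits = util.semantic_search(query_embedding, image_embeddings, top_k=3)[0]
for hit in hits:
    print(image_paths[hit["corpus_id"]], hit["score"])
```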
KBLab, together with Språkbanken Text, has developed 75 freely available datasets to support research in lexicography and beyond. The datasets, released as Kubord 2, offer intriguing material for research in the humanities. Let's see what the datasets have to offer and where to find them!
Searching a database of speakers by their voice presents a unique challenge, as speakers' voices change as they age. We can represent a speaker's voice computationally with what we call "voiceprints". We can compare pairs of them to decide whether they belong to the same speaker, that is, to "verify" the speaker's identity. But how do we know that two voiceprints are still similar enough to each other when recorded at two different ages, for example at 40 and 45? In this project, I investigated how voiceprints age over a 9-year range, how their aging depends on when the first voiceprint was recorded, and the effects of the audio length used to create the voiceprints and of the speaker's gender. For this I used debate speeches from the Riksdag, the Swedish Parliament.
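As a minimal illustration of the verification step, assuming voiceprints are fixed-length embedding vectors, the comparison boils down to a cosine similarity checked against a tuned threshold. All vectors and numbers below are synthetic:

```python
# Minimal sketch of voiceprint verification: two fixed-length speaker
# embeddings are compared with cosine similarity and accepted as the same
# speaker above a tuned threshold. Vectors and the threshold are illustrative.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
voiceprint_age_40 = rng.normal(size=192)  # embedding of a speech at age 40
voiceprint_age_45 = voiceprint_age_40 + rng.normal(scale=0.3, size=192)  # drifted voice

score = cosine_similarity(voiceprint_age_40, voiceprint_age_45)
THRESHOLD = 0.7  # in practice tuned on held-out same/different speaker pairs
print("same speaker" if score >= THRESHOLD else "different speaker", round(score, 3))
```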
KBLab presents a robust, multi-label sentiment classifier trained on Swedish texts. The model is robust in the sense that it is trained on multiple datasets of different text types, and it allows labeling of neutral as well as positive and negative texts. It is available under the Apache 2.0 license on the Hugging Face Hub.
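A minimal usage sketch with the 🤗 transformers pipeline API. The model id below is our guess at the published name and should be checked against the Hub:

```python
# Usage sketch; the model id "KBLab/robust-swedish-sentiment-multiclass" is an
# assumption -- verify the exact name on the Hugging Face Hub.
from transformers import pipeline

classifier = pipeline("sentiment-analysis", model="KBLab/robust-swedish-sentiment-multiclass")
print(classifier("Jag älskar den här boken!"))  # expected: a positive label
print(classifier("Tåget var försenat igen."))   # expected: negative or neutral
```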
KBLab releases a neural network based text-to-speech model for Swedish. The model was trained on an open Swedish speech synthesis dataset from NST. We make our latest training checkpoint available for anyone wishing to finetune it on a new voice. We also contribute the model weights to the open source project Piper, where users can deploy a lightweight, optimized version of the model on their own computers, or even on a Raspberry Pi.
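For the Piper route, usage can be as simple as piping text to the CLI, which reads from standard input. A small sketch via Python's subprocess, assuming the `piper` binary is installed and that the Swedish NST voice is published under the id `sv_SE-nst-medium` (an assumption worth verifying):

```python
# Sketch of invoking the Piper CLI from Python; the voice id below is assumed.
import subprocess

text = "Hej! Detta är ett test av talsyntes."
subprocess.run(
    ["piper", "--model", "sv_SE-nst-medium", "--output_file", "test.wav"],
    input=text.encode("utf-8"),  # Piper reads the text to synthesize on stdin
    check=True,
)
```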
We describe a typical topic modeling use case where BERTopic is applied to scientific abstracts in the research field of education. We discuss the limitations of BERTopic and potential ways to overcome them. Some further exploratory data analysis is performed using the topic model as a base.
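For readers who want to try the workflow themselves, a minimal BERTopic run looks like the sketch below. Since the education abstracts are not bundled here, we substitute the classic 20 Newsgroups corpus as stand-in data:

```python
# Minimal BERTopic run; 20 Newsgroups stands in for the scientific abstracts.
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

docs = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes")).data[:2000]

topic_model = BERTopic(language="multilingual", min_topic_size=10)
topics, probs = topic_model.fit_transform(docs)

print(topic_model.get_topic_info().head())  # topic sizes and keyword labels
print(topic_model.get_topic(0))             # top words of the largest topic
```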
KBLab releases RixVox, a speech dataset comprising 5,500 hours of speech from parliamentary debates. The speeches have been aligned with transcripts from the written protocols and contain additional metadata such as the speaker's gender, electoral district and birth year. RixVox is open and free for everyone to download and use.
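A quick way to peek at the data without downloading all 5,500 hours is streaming via 🤗 datasets. The dataset id "KBLab/rixvox" follows the Hub naming convention but is worth verifying:

```python
# Stream a single RixVox example to inspect the available metadata fields;
# the dataset id is an assumed Hub name.
from datasets import load_dataset

rixvox = load_dataset("KBLab/rixvox", split="train", streaming=True)
for example in rixvox.take(1):
    print(sorted(example.keys()))  # list the metadata columns without guessing names
```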
The Riksdag is the Parliament of Sweden. It has made available twenty years of parliamentary debates through its website and open data platform. Each speech is accompanied by rich metadata, including a transcript and markers indicating its start location and duration within the media file of the debate. However, we find that a not insignificant portion of the speeches have been misaligned, with the metadata being particularly unreliable for debates prior to 2012. In this work, we employ automatic speech recognition and speaker diarization to devise a fully automated approach to aligning transcripts with their corresponding speech in audio files.
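To give a flavor of the approach, the sketch below shows the simplest version of the matching step: normalize the ASR output and the protocol text, then measure their overlap with fuzzy string matching. The actual pipeline additionally relies on speaker diarization and word-level timestamps:

```python
# Simplified illustration of the alignment idea: locate the protocol speech
# inside the ASR transcript via fuzzy string matching. Texts are toy examples.
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    return " ".join(text.lower().split())

asr_transcript = normalize("herr talman jag vill börja med att tacka utskottet")
protocol_speech = normalize("Herr talman! Jag vill börja med att tacka utskottet")

matcher = SequenceMatcher(None, asr_transcript, protocol_speech, autojunk=False)
match = matcher.find_longest_match(0, len(asr_transcript), 0, len(protocol_speech))
overlap = match.size / max(len(protocol_speech), 1)
print(f"overlap ratio: {overlap:.2f}")  # high overlap -> speech correctly located
```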
KBLab has released a BERT model fine-tuned on NLI tasks, which can be used for zero-shot text classification.
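A usage sketch via the zero-shot classification pipeline, which wraps the NLI formulation. The model id below is an assumption; look up KBLab's NLI-finetuned model on the Hugging Face Hub:

```python
# Zero-shot classification sketch; the model id is assumed, not confirmed.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="KBLab/megatron-bert-large-swedish-cased-165-zero-shot",
)
result = classifier(
    "Regeringen föreslår höjd skatt på drivmedel.",
    candidate_labels=["politik", "sport", "kultur"],
)
print(result["labels"][0], result["scores"][0])  # highest-scoring label first
```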
KBLab's Swedish sentence transformer has been updated to a newer version. The new version features an increased maximum sequence length of 384 tokens, allowing users to encode longer documents. It also performs better on retrieval tasks, such as matching questions and answers.
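A small retrieval sketch with the updated model, matching a question against candidate answers; we assume the model id "KBLab/sentence-bert-swedish-cased" as published on the Hugging Face Hub:

```python
# Question-answer retrieval with the Swedish sentence transformer.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("KBLab/sentence-bert-swedish-cased")

answers = [
    "Biblioteket öppnar klockan nio på vardagar.",
    "Forskarsalen kräver ett lånekort.",
]
question_embedding = model.encode("När öppnar biblioteket?")
answer_embeddings = model.encode(answers)

# Rank the candidate answers by cosine similarity to the question.
hits = util.semantic_search(question_embedding, answer_embeddings, top_k=1)[0]
print(answers[hits[0]["corpus_id"]], hits[0]["score"])
```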
Topic modeling is an exciting option for exploring and finding patterns in large volumes of text data. While this previously required considerable programming skills, a recent innovation has simplified the method to make it more accessible for researchers in and beyond the academy. We explain how BERTopic harnesses KBLab’s language models to produce state-of-the-art topic modeling, and we offer some tips on how to get started.
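Concretely, pointing BERTopic at one of KBLab's models is a one-liner: pass a sentence transformer as the embedding backend. The sketch below uses KBLab's Swedish sentence transformer; the fit step is commented out since it needs your own document collection:

```python
# Plugging a KBLab sentence transformer into BERTopic as the embedding backend.
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

embedding_model = SentenceTransformer("KBLab/sentence-bert-swedish-cased")
topic_model = BERTopic(embedding_model=embedding_model, language="multilingual")
# topics, probs = topic_model.fit_transform(documents)  # documents: list of Swedish texts
```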
We present _OverLim_, a new benchmark for evaluating large language models for Swedish, Danish, and Norwegian, created by translating a subset of the GLUE and SuperGLUE benchmark tasks. The lack of suitable downstream tasks for these three Scandinavian languages has made it difficult to evaluate new models that are being published; model developers cannot easily see whether their training has succeeded, and comparison between various models becomes even more tedious. While the translations were done using state-of-the-art models, their quality was not double-checked with native speakers, meaning that any results on these datasets should be interpreted carefully. The dataset is available via the _huggingface_ 🤗 ecosystem and can be downloaded at https://huggingface.co/datasets/KBLab/overlim.
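A loading sketch via 🤗 datasets. Since the config name for each task is an assumption on our part (here "sst", following the GLUE originals), the snippet first lists the available tasks:

```python
# Load one OverLim task; "sst" is an assumed config name, so we list configs first.
from datasets import get_dataset_config_names, load_dataset

print(get_dataset_config_names("KBLab/overlim"))           # available tasks
sst = load_dataset("KBLab/overlim", "sst", split="train")  # assumed config name
print(sst[0])
```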
We at KBLab have trained a Swedish version of an entity disambiguation model called Bootleg, developed by the Hazy Research Lab at Stanford. The model is trained on Swedish Wikipedia and can be used to disambiguate named entities based on their context.
We present a remix of the venerable SUC 3.0 dataset for Swedish Named Entity Recognition (NER), and explore the effect of hyperparameter optimization (HPO) for this task and dataset using our Swedish BERT model KB-BERT. We publish the data with a balanced train-development-test split, using both manually and automatically annotated tags, as a huggingface 🤗 dataset at https://huggingface.co/datasets/KBLab/sucx3_ner.
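A loading sketch; the config name "original_cased" and the column names follow common conventions for this dataset and should be checked against the dataset card:

```python
# Load the remixed SUC 3.0 NER data; config and column names are assumptions
# based on the usual 🤗 NER dataset conventions.
from datasets import load_dataset

suc = load_dataset("KBLab/sucx3_ner", "original_cased")
print(suc)  # train/validation/test splits
print(suc["train"][0]["tokens"], suc["train"][0]["ner_tags"])
```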
What does AI mean for libraries and how could libraries pave the way for ethical AI? KBLab reflects on these questions in an article in the Open Access journal College & Research Libraries.
We are very pleased to announce that our director Love Börjeson has been nominated for the prestigious prize Årets AI Svensk 2021 (AI Swede of the Year 2021), awarded by TechSverige.
While language models such as BERT are effective at many tasks, they have limited use when it comes to information retrieval and large scale similarity comparisons. In this post we introduce a Swedish sentence transformer which produces semantically meaningful sentence embeddings suitable for use in semantic search applications. We evaluate the model on SuperLim (Swedish SuperGLUE), where it achieves the highest published scores on SweParaphrase (a test set to evaluate sentence similarity). The model is publicly available on Huggingface.
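For a taste of what these embeddings enable, the sketch below scores the similarity of two Swedish sentences, in the spirit of SweParaphrase; the model id is the one published on the Hub:

```python
# Sentence-similarity sketch: embed two sentences and compare them with cosine
# similarity. A value near 1.0 indicates near-paraphrases.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("KBLab/sentence-bert-swedish-cased")
emb = model.encode(["Hunden springer i parken.", "En hund leker utomhus."])
print(util.cos_sim(emb[0], emb[1]).item())
```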
We trained a bilingual Swedish-Norwegian [ELECTRA](https://github.com/google-research/electra) language model in a federated setup, showcasing LM training when various corpora cannot be shared directly. The main goal of the project was to validate the feasibility of the approach, as well as to study some key numerical properties affecting the performance of the federated learning (FL) process.
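To make the federated setup concrete, here is a toy sketch of federated averaging (FedAvg), the aggregation step at the heart of most FL schemes. The actual training involved considerably more machinery, so treat this purely as an illustration:

```python
# Conceptual FedAvg sketch: each participant trains locally on data that
# cannot be shared, and only the model weights are averaged centrally.
import numpy as np

def federated_average(client_weights: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    """Average each named parameter across clients (equal client weighting)."""
    return {
        name: np.mean([weights[name] for weights in client_weights], axis=0)
        for name in client_weights[0]
    }

# Two toy "clients" with a single parameter matrix each.
swedish_client = {"embedding": np.ones((2, 2))}
norwegian_client = {"embedding": 3 * np.ones((2, 2))}
print(federated_average([swedish_client, norwegian_client])["embedding"])  # all 2.0
```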
We built some topic models on the National Library's SOU collection, as an example of what can be done with the Library's materials. The curated dataset can be a useful resource for other researchers interested in this collection, or a good playground for students who need a large dataset of freely available Swedish text.
High-quality, curated datasets for smaller languages like Swedish are hard to come by for most NLP tasks. We at KBLab decided to do something about it and are enlisting the help of our colleagues to annotate data to improve our models. Currently we are working on an NER dataset particularly suited to library purposes, and we are also starting a project to transcribe radio speech for a new speech recognition model.
The process of digitizing historical newspapers at the National Library of Sweden involves scanning physical copies of newspapers and storing them as images. In order to make the scanned contents machine readable and searchable, OCR (optical character recognition) procedures are applied. This results in a wealth of information being generated from different data modalities (images, text and OCR metadata). In this article we explore how features from multiple modalities can be integrated into a unified advertisement classifier model.
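As a schematic of the idea, a late-fusion classifier can concatenate per-modality feature vectors before a shared classification head. The dimensions and the two-class setup below are illustrative assumptions, not the architecture from the article:

```python
# Schematic late-fusion classifier: concatenate image, text and OCR-metadata
# features and feed them to a shared head. All dimensions are illustrative.
import torch
import torch.nn as nn

class AdClassifier(nn.Module):
    def __init__(self, img_dim=512, txt_dim=768, meta_dim=16):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(img_dim + txt_dim + meta_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 2),  # advertisement vs. editorial content
        )

    def forward(self, img_feat, txt_feat, meta_feat):
        return self.head(torch.cat([img_feat, txt_feat, meta_feat], dim=-1))

model = AdClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 768), torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 2])
```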
A short introduction showcasing the capabilities of the distill package for scientific and technical writing. Get yourself set up and start publishing posts!