Using Machine Learning To Make Sure No One Is Left Behind

August 2, 2022

As the use of technology continues to grow, Machine Learning and Artificial Intelligence are increasingly being used to solve problems in many areas, including education, health and communication.

Kathleen Siminyu, a machine learning fellow and NLP (Natural Language Processing) researcher at Mozilla Foundation, is working on an interesting Mozilla project called Common Voice, a data set platform that enables language communities to build language data sets.

A data set for speech recognition is essentially a piece of text accompanied by an audio recording of what is in the text. That is the data you would feed into a machine learning algorithm for it to start learning how to transcribe speech into text in Kiswahili or any other language. From there, it is able to map words to their respective sounds, which are broken down into smaller parts of speech. Good examples are the captions you see on TV in a language different from the one being spoken, or the auto-generated captions in teleconferencing tools like Zoom.
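To make the idea of text paired with audio concrete, here is a minimal Python sketch of what one entry in such a data set could look like. It is an illustration only: the `SpeechSample` class, its field names, the file paths and the example sentences are all assumptions made for this article, not the actual Common Voice format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechSample:
    """One hypothetical data set entry: a recording plus the text it contains."""
    audio_path: str  # where the recording is stored (made-up path)
    sentence: str    # the text that is read aloud in the recording
    locale: str      # language code, e.g. "sw" for Kiswahili


def build_training_pairs(samples, locale):
    """Return (audio, text) pairs for one language, ready to feed to a model."""
    return [(s.audio_path, s.sentence) for s in samples if s.locale == locale]


# Illustrative entries only; none of these files exist.
samples = [
    SpeechSample("clips/sw_0001.mp3", "Habari ya asubuhi", "sw"),
    SpeechSample("clips/sw_0002.mp3", "Karibu nyumbani", "sw"),
    SpeechSample("clips/en_0001.mp3", "Good morning", "en"),
]

pairs = build_training_pairs(samples, "sw")
for audio, text in pairs:
    print(audio, "->", text)
```

A transcription model would be trained on many thousands of such pairs, learning which sounds in the audio correspond to which words in the text.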

More about Common Voice

Kathleen explains that the Kiswahili Common Voice data set supports the development of speech transcription models that can be used in domains such as agriculture and finance. These models help ensure that more languages, including their various dialects, are incorporated in those domains so that everyone can understand.

She is currently working on ensuring that the diversity of Kiswahili speakers, in terms of age, gender, accent and language variant/dialect, is catered for in the dataset and models created.
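One simple way to check whether a data set caters for this kind of diversity is to count how many recordings come from each group of speakers. The Python sketch below shows the idea under stated assumptions: the record layout, the `variant` field and the labels `variant_a` and `variant_b` are hypothetical placeholders, and the 20% threshold is an arbitrary choice for illustration, not a figure used by the project.

```python
from collections import Counter


def coverage(records, field):
    """Count recordings per group; missing values are counted as 'unspecified'."""
    return Counter(r.get(field) or "unspecified" for r in records)


def underrepresented(counts, min_share):
    """List the groups whose share of all recordings falls below min_share."""
    total = sum(counts.values())
    return sorted(group for group, n in counts.items() if n / total < min_share)


# Hypothetical metadata for six recordings; the labels are placeholders.
records = [
    {"variant": "variant_a"},
    {"variant": "variant_a"},
    {"variant": "variant_a"},
    {"variant": "variant_a"},
    {"variant": "variant_b"},
    {},  # speaker did not provide this detail
]

counts = coverage(records, "variant")
gaps = underrepresented(counts, min_share=0.2)
print(counts)
print("Needs more recordings:", gaps)
```

The same count could be repeated for age, gender or accent to see where more contributions are needed.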

Why is this so important?

Well, the main reason is how diverse we are, not just in Kenya but across Africa as well. Even within a single culture or tribe, a language can have several dialects, meaning that a word or term can have different meanings depending on the dialect. This is why these data sets are vital in relaying information: they help ensure that the correct message reaches everyone.

She is particularly excited to be working on this project with Mozilla Foundation, which is a non-profit, because the resource is open to everyone who wants to use it to better their communities. It will also help more developers to use these data sets in various sectors that benefit their communities. Furthermore, since this is a resource that will benefit everyone, it requires the collective effort of communities to help collect the data, that is, recordings in the different languages and dialects.
