I'm guessing that of the 4,600 new hours of speech, maybe 4,100 of those hours a...

heyhillary · on Aug 16, 2021

Thanks so much for sharing your comment. Gender equality in participation in Common Voice is something we really want to improve and champion. As part of the Kiswahili language community engagement, our team is implementing a gender action plan that covers both participation and use cases for the dataset. We hope to consult on, adapt, and replicate gender inclusion work already done by community members, and to use the gender action plan to improve representation and involvement of all genders in open source projects such as Common Voice.

LoriP · on Aug 5, 2021

To be fair, I'm not sure that's the best guess :) There seem to be more female voices than male ones to me. Anyhow, I'd wager there's at least a 50:50 mix.

Edman274 · on Aug 6, 2021

I probably should've looked this up before I decided to comment, but at least according to this:

https://commonvoice.mozilla.org/en/datasets

The ratio of male to female tagged voices in the English dataset is 45 percent male to 15 percent female. (The remaining 40 percent is untagged.) Odds are good that the ratio is closer to 75:25 than 50:50, at least by hours of recorded audio.
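
For anyone checking the arithmetic, here is a rough sketch of where the 75/25 figure comes from. It assumes the untagged 40 percent splits the same way as the tagged voices, which is only a guess; the function name `tagged_split` is made up for illustration.

```python
def tagged_split(male_pct: float, female_pct: float) -> tuple[float, float]:
    """Renormalize the tagged shares so they sum to 100, ignoring untagged voices.

    Assumption: untagged voices follow the same male/female split as tagged ones.
    """
    tagged = male_pct + female_pct
    if tagged <= 0:
        raise ValueError("no tagged voices to compare")
    return 100 * male_pct / tagged, 100 * female_pct / tagged


# Figures quoted above for the English dataset: 45% male, 15% female, 40% untagged.
male, female = tagged_split(45, 15)
print(f"{male:.0f}:{female:.0f}")
```

If the untagged voices lean differently from the tagged ones, the real split could land anywhere between 45:55 and 85:15, so this is an estimate and not a measurement.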