MLCommons and Hugging Face team up to release massive speech dataset for AI research


MLCommons, a nonprofit AI safety working group, has teamed up with the AI dev platform Hugging Face to release one of the world's largest collections of public domain voice recordings for AI research.

The dataset, called Unsupervised People's Speech, contains more than a million hours of audio spanning at least 89 different languages. MLCommons says it created the dataset to support R&D in various areas of speech technology.

Supporting broader natural language processing research for languages other than English helps bring communication technologies to more people around the world, the organization wrote in a blog post Thursday. It said it expects the research community to build on the dataset in several directions, especially improving speech models for low-resource languages, as well as speech recognition and speech synthesis.

It's an admirable goal, to be sure. But AI datasets like Unsupervised People's Speech can carry risks for the researchers who choose to use them.

Biased data is one of those risks. The recordings in Unsupervised People's Speech came from Archive.org, the nonprofit perhaps best known for the Wayback Machine web archival tool. Because many of Archive.org's contributors are English-speaking, and American, almost all of the recordings in Unsupervised People's Speech are in American-accented English, per the readme on the official project page.

Without careful filtering, AI systems such as speech recognition and voice synthesizer models trained on Unsupervised People's Speech could exhibit some of the same prejudices. They might, for example, struggle to transcribe English spoken by a non-native speaker, or have trouble generating synthetic voices in languages other than English.

Unsupervised People's Speech might also contain recordings from people who are unaware that their voices are being used for AI research purposes, including commercial applications. While MLCommons says that all recordings in the dataset are in the public domain or available under Creative Commons licenses, it is possible that mistakes were made.

According to an MIT analysis, hundreds of publicly available AI training datasets lack licensing information and contain errors. Creator advocates, including Ed Newton-Rex, the CEO of the AI ethics-focused nonprofit Fairly Trained, have argued that creators should not be required to "opt out" of AI datasets because of the burden that opting out places on them.

Many creators, such as Squarespace users, have no meaningful way of opting out, Newton-Rex wrote in a post last June. For creators who can opt out, he wrote, there are multiple overlapping opt-out methods that are (1) quite confusing and (2) woefully incomplete in their coverage. Even if a perfect universal opt-out existed, he argued, it would be unfair to put the opt-out burden on creators, given that generative AI uses their work to compete with them.

MLCommons says that it is committed to updating, maintaining, and improving the quality of Unsupervised People's Speech. But given the potential flaws, developers would do well to exercise serious caution.


