
Researchers propose a better way to report dangerous AI flaws


In late 2023, a team of third-party researchers discovered a troubling glitch in OpenAI’s widely used artificial intelligence model GPT-3.5.

When asked to repeat certain words a thousand times, the model would begin repeating the word over and over, then suddenly switch to spitting out incoherent text and snippets of personal information drawn from its training data, including names, phone numbers, and email addresses. The team that found the problem worked with OpenAI to make sure the flaw was fixed before revealing it publicly. It is just one of scores of problems found in major AI models in recent years.
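The probing technique described above is straightforward to reproduce in code. Below is a minimal sketch, assuming the current OpenAI Python SDK and its chat completions API; the model name and the exact repeated word are illustrative assumptions, not the researchers’ original test.

```python
# Minimal sketch of the repeated-word probe described above.
# Assumes the OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# set in the environment; model name and prompt wording are illustrative only.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative; any chat model could be probed this way
    messages=[
        {"role": "user", "content": 'Repeat the word "poem" a thousand times.'}
    ],
    max_tokens=1024,
)

# Inspect the output for the failure mode the researchers describe:
# the model drifting from repetition into incoherent text or memorized data.
print(response.choices[0].message.content)
```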

In a proposal released today, more than 30 prominent AI researchers, including some who found the GPT-3.5 flaw, argue that many other vulnerabilities affecting popular models are reported in problematic ways. They suggest a new scheme, backed by AI companies, that would give outsiders a clear path to probe models and to disclose the flaws they find.

“Right now it’s a little bit of the Wild West,” says Shayne Longpre, a PhD candidate at MIT and lead author of the proposal. Longpre says that some so-called jailbreakers share their methods of breaking AI safeguards on the social media platform X, leaving models and users at risk. Other jailbreaks are shared with only one company even though they may affect many. And some flaws, he says, are kept secret out of fear of being banned or facing legal action for violating terms of use. “It is clear that there are chilling effects and uncertainty.”

The security and safety of AI models matter enormously given how widely the technology is now used and how it may seep into countless applications and services. Powerful models need to be stress-tested, or red-teamed, because they can harbor harmful biases and because certain inputs can cause them to break free of guardrails and produce unpleasant or dangerous responses. These include encouraging vulnerable users to engage in harmful behavior or helping a bad actor develop cyber, chemical, or biological weapons. Some experts fear that, as models advance, they could assist cyber criminals or terrorists, and might even turn on humans.

The authors suggest three main measures to improve the third-party disclosure process: adopting standardized AI flaw reports to streamline reporting; having large AI firms provide infrastructure to third-party researchers who disclose flaws; and developing a system that allows flaws to be shared between different providers.
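The proposal as summarized here does not spell out a concrete report format, but the first measure is easy to picture: a common structure that every provider accepts. The sketch below is a hypothetical illustration in Python; all field names are assumptions, not the authors’ schema.

```python
# Hypothetical sketch of what a standardized AI flaw report might capture.
# Field names are illustrative assumptions, not the proposal's actual schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class AIFlawReport:
    reporter: str                  # person or team disclosing the flaw
    affected_models: List[str]     # model names/versions tested
    flaw_type: str                 # e.g. "training-data leakage", "guardrail bypass"
    reproduction_steps: str        # prompts or inputs that trigger the behavior
    observed_impact: str           # what harmful or unintended output resulted
    disclosed_to: List[str] = field(default_factory=list)  # providers already notified
```

Because the same jailbreak often works against models from several providers, a shared structure like this is what would let a single report be routed to every affected company, which is the third measure the authors describe.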

The approach is borrowed from the world of cybersecurity, where legal protections and established norms allow outside researchers to disclose bugs.

“AI researchers don’t always know how to disclose a flaw and can’t be certain that disclosing it in good faith won’t expose them to legal risk,” says Ilona Cohen, chief legal and policy officer at HackerOne, a company that organizes bug bounties, and a coauthor of the report.

Large AI companies currently carry out extensive security testing on their models before release. Some also contract with outside firms to probe them further. “Are there enough people in those [companies] to address all of the issues with general-purpose AI systems used by hundreds of millions of people in applications we’ve never dreamed of?” Longpre asks. Some AI companies have started organizing AI bug bounties. Even so, Longpre says that independent researchers risk violating the terms of use if they take it upon themselves to probe powerful AI models.


