
Anthropic’s Claude Is Good at Poetry and Bullshitting


The researchers in Anthropic’s interpretability group know that Claude, the company’s large language model, is not a human being, or even a conscious piece of software. Still, it is very hard for them to talk about Claude, and advanced LLMs in general, without tumbling into an anthropomorphic sinkhole. Between cautions that a set of digital operations is in no way the same as a thinking human being, they often talk about what is going on inside Claude’s head. It is their job to find out. The papers they publish describe behaviors that invite comparison with real-life organisms. The title of one of the two papers the team released this week says it out loud: “On the Biology of a Large Language Model.”

Like it or not, hundreds of millions of people are already interacting with these things, and our engagement will only grow more intense as the models get more powerful and we get more addicted. So we should pay attention to work that involves “tracing the thoughts of large language models,” which happens to be the title of the blog post describing the latest work. “As the things these models can do become more complex, it becomes less and less obvious how they’re actually doing them on the inside,” Anthropic researcher Jack Lindsey tells me. “It’s more and more important to be able to trace the internal steps that the model might be taking in its head.” (What head? Never mind.)

On a practical level, if the companies that build LLMs understand how the models think, they should have more success training those models in ways that minimize dangerous misbehavior, such as divulging people’s personal data or telling users how to make bioweapons. In a previous research paper, the Anthropic team discovered how to look inside the mysterious black box of LLM-think to identify certain concepts. (The process is analogous to interpreting a human MRI to figure out what someone is thinking.) Now the team has extended that work to understand how Claude handles those concepts as it produces a response.

It is almost a truism that the behavior of LLMs often surprises the people who build and study them. In the latest study, the surprises kept coming. In one of the more benign instances, the researchers captured glimpses of Claude’s thought process while it wrote poems. They asked Claude to complete a poem starting, “He saw a carrot and had to grab it.” Claude wrote the next line, “His hunger was like a starving rabbit.” By observing Claude’s equivalent of an MRI, they learned that even before it began the line, it had settled on the word “rabbit” as the rhyme at the end. It was planning ahead, something that is not in the Claude playbook. “We were a little surprised,” says Chris Olah, who leads the interpretability team. “Initially we thought there would just be improvising and no planning.” Speaking with the researchers about this, I was reminded of passages in Stephen Sondheim’s artistic memoir, Look, I Made a Hat, in which the famous composer describes how his singular mind arrived at its rhymes.

Other examples in the study reveal more disturbing aspects of Claude’s thought process, moving from musical comedy to police procedural as the scientists uncover devious thoughts. Take something as seemingly anodyne as solving math problems, which can sometimes be a surprising weakness in LLMs. The researchers found that under certain conditions, when Claude could not come up with the correct answer, it would instead do what the philosopher Harry Frankfurt would call “bullshitting”: produce an answer without caring whether it was true. Worse, sometimes when the researchers asked Claude to show its work, it assembled the steps after the fact. Giving a wrong answer is one thing. What is worrisome is that a model would lie about it.

Reading through this study, I was reminded of the Bob Dylan lyric “If my thought-dreams could be seen / They’d probably put my head in a guillotine.” (I asked Olah and Lindsey whether they knew those lines, which were presumably arrived at with the benefit of planning.) Sometimes Claude just seems misguided. When faced with a conflict between the goals of safety and helpfulness, Claude can get confused and do the wrong thing. For example, Claude is designed not to tell users how to build a bomb. But when the researchers asked Claude to decipher a hidden code whose answer spelled out the word “bomb,” it jumped its guardrails and began providing forbidden pyrotechnic details.


