Why AI still needs humans in the loop, at least for now

Large language models can help you write code, or rewrite old copy so it reads as modern. They can make it easier to quickly grasp the main points of a research paper or news story by generating and answering questions about it. Or they can get things awkwardly wrong.

Large language models like GPT-3 are key to search engines like Google and Bing; they also provide suggested responses in email and chat, try to finish your sentences in Word and power coding assistants like GitHub Copilot.

But they're not perfect. Discussions of the harm they can cause usually focus on what comes from learning from everything posted on the web, which includes the less positive opinions some people hold. Large language models trained on huge text sources such as online communities can end up repeating some rather offensive remarks. And when a model learns from writing with a common bias, such as interviewers referring to men by their surnames and women by their first names, or assuming that doctors are men and nurses are women, those biases are more likely to emerge in what the model writes.

See: Facebook: Here Comes AI for Metaverse

Possible harms with code generation include code that is wrong but looks right. It's still up to the programmer to review AI-powered suggestions and make sure they understand what the code does, but not everyone will.

The "human in the loop" review stage is important for the responsible use of large language models, because it's the chance to catch a problem before the copy is published or the code goes into production. Code licensing is one of the issues with generated code, but AI-generated text can create all kinds of headaches, some embarrassing and some more dangerous.

Large language models work by predicting the next word in a sentence, then the next word after that, and so on, all the way to the end of the sentence, paragraph or code snippet, looking at each word in the context of all the words around it.
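That "predict the next word, then the next" loop can be sketched with a toy bigram model. This is nothing like the transformer architecture real models use — it only looks one word back, and the tiny training sentence is invented for illustration — but it has the same autoregressive shape:

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def generate(following, start, max_words=10):
    """Repeatedly pick the most likely next word until nothing follows."""
    sentence = [start]
    while len(sentence) < max_words:
        candidates = following.get(sentence[-1])
        if not candidates:
            break
        sentence.append(candidates.most_common(1)[0][0])
    return " ".join(sentence)

corpus = "the model predicts the next word and the next word after that"
model = train_bigrams(corpus)
print(generate(model, "the"))
```

A real model conditions on the whole context window rather than a single preceding word, which is what lets it disambiguate meaning rather than just echo frequent phrases.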

This means the search engine can understand that a query asking "what can aggravate a concussion" is about what to do when someone suffers a head injury, not about the symptoms or causes of concussion.

Another approach is to pair large language models with other types of machine learning model to avoid whole classes of harm. Always choosing the most likely word can mean a large language model only ever gives you the obvious answer, like always answering "birds" when asked "what can fly" and never "butterflies" or "vaccinated airline passengers". Adding a binary model that characterizes different types of birds might get you "birds can fly, except for ostriches, penguins and other flightless birds."

Using a binary model alongside a large language model is one example of how Bing uses multiple AI models to answer questions. Many of them are there to deal with the number of different ways we have of saying the same thing.
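A minimal sketch of that pairing, with both models faked as simple functions — the canned answer table and the flightless-bird set are invented stand-ins, not anything Bing actually uses:

```python
def most_likely_answer(question):
    """Stand-in for a large language model: returns one most-probable answer."""
    canned = {"what can fly": "birds"}
    return canned.get(question.lower(), "unknown")

def is_flightless(bird):
    """Stand-in for a binary classifier trained to spot flightless birds."""
    return bird in {"ostrich", "penguin", "kiwi", "emu"}

def answer_with_exceptions(question, known_birds):
    """Pair the two models: qualify the generic answer with classifier output."""
    answer = most_likely_answer(question)
    if answer == "birds":
        exceptions = sorted(b for b in known_birds if is_flightless(b))
        if exceptions:
            return (f"birds can fly, except for {', '.join(exceptions)} "
                    "and other flightless birds")
    return answer

print(answer_with_exceptions("What can fly", {"sparrow", "ostrich", "penguin"}))
```

The point of the design is that neither model has to be perfect: the generator produces the fluent answer, and the cheap binary model catches a known class of exceptions before the answer ships.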

Information about entities such as the Eiffel Tower is stored as vectors, so Bing can tell you the height of the tower even if your query doesn't include the word Eiffel – the question "how tall is the tower in Paris" gets the correct answer. Microsoft's Generic Intent Encoder converts search queries into vectors so that it can pick up what people want to see (and click on) in search results even when the vocabulary they use is very different.
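The way vector lookup tolerates different wording can be sketched with cosine similarity. The three-dimensional "embeddings" below are made up by hand purely for illustration — a real encoder like the Generic Intent Encoder learns high-dimensional vectors from data:

```python
import math

def cosine_similarity(a, b):
    """Angle-based similarity: 1.0 for identical direction, ~0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hand-made toy vectors: two phrasings of the same intent, one unrelated query.
query_vectors = {
    "how tall is the eiffel tower": [0.9, 0.8, 0.1],
    "height of the tower in paris": [0.85, 0.75, 0.2],
    "best croissant in paris":      [0.1, 0.3, 0.9],
}

q1 = query_vectors["how tall is the eiffel tower"]
q2 = query_vectors["height of the tower in paris"]
q3 = query_vectors["best croissant in paris"]

# Queries with the same intent end up closer together than ones that merely
# share a word ("paris"), so the engine can match them to the same answer.
print(cosine_similarity(q1, q2) > cosine_similarity(q1, q3))
```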

Bing uses Microsoft's large language models (as does the Azure Cognitive Search service, which lets you build a custom search tool for your own documents and content) to rank search results, pull excerpts from web pages, highlight the best result or the key phrases that help you see whether a page contains the information you're looking for, and suggest different terms that might get you better results. None of that changes the text itself, except maybe the emphasis within a sentence.

But Bing also uses a large language model called Turing Natural Language Generation to summarize information from web pages in search results, rewriting and shortening the snippet you see so it's a better answer to the question you typed. So far, so useful.

On some Bing searches, you'll see a list of questions under "People also ask." Originally, these were just related queries typed by other Bing users, so if you search for "accounting courses" you'll also see questions like how long it takes to qualify as an accountant, saving you from typing those other searches yourself.

See: Gartner launches 2021 tech startup hype cycle: Here’s what’s happening and what’s heading

Bing doesn't always have question-and-answer pairs that match a given search, so last year Microsoft started using Turing NLG to generate questions and answers for documents in advance, rather than waiting for someone to type a search that would generate them on demand, so more searches get those extra ideas and handy nuggets.
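The offline-versus-on-demand distinction can be sketched like this. Everything here is a hypothetical stand-in: `generate_qa` fakes what a model like Turing NLG does (write a real question and answer), and the substring matching stands in for proper ranking:

```python
def generate_qa(text):
    """Stand-in for a generative model: fake a Q&A pair from the first sentence."""
    first = text.split(".")[0].strip()
    return ("What is the key point of this page?", first)

def build_qa_index(documents):
    """Offline step: run generation once per document, before any search arrives."""
    return {doc_id: generate_qa(text) for doc_id, text in documents.items()}

def related_questions(query, qa_index):
    """Online step: a cheap lookup at query time, no generation needed."""
    return [qa for doc_id, qa in qa_index.items() if query in doc_id]

docs = {
    "accounting-courses": "Accounting qualifications typically take three to five years. Course fees vary.",
    "concussion-care": "Rest helps a concussion heal. See a doctor if symptoms worsen.",
}
index = build_qa_index(docs)
print(related_questions("accounting", index))
```

Moving the expensive generation step offline is what lets more searches show the extra questions: the per-query cost drops to a lookup, at the price of storage and of the pairs occasionally going stale.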

For news stories, the Q&A can show you more detail than the headline and excerpt you see in the results. But it only helps if the question Bing generates to match the answer is accurate.

Over the summer, one of Bing's generated questions showed that common metaphors can be a problem for AI tools. They can be confused by metaphorical headlines reporting a celebrity's criticism of someone's actions: one of the questions Turing wrote that I saw clearly misunderstood who was doing what in a particular news story.

The generative language model that creates these question-and-answer pairs isn't part of Cognitive Search, and Microsoft only offers GPT-3 itself (which can do the same kind of language generation) in private preview, so the average business probably doesn't need to worry about making these kinds of mistakes on its own search pages yet. But it does show that these models can make mistakes, so you need a process ready to deal with them.

A search engine can't have a human look at every search results page before you see it; the point of AI models is to handle problems at a scale too large for humans. But companies may still want a human to review the writing they generate with a large language model. Don't take the human out of the loop of everything, just yet.
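One way to keep that human in the loop is a review queue that nothing leaves without sign-off. A minimal sketch — the class, the example texts and the reviewer address are all invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class ReviewQueue:
    """Generated text waits here for a human sign-off before publication."""
    pending: list = field(default_factory=list)
    published: list = field(default_factory=list)

    def submit(self, text):
        """Model output enters the queue instead of going straight to readers."""
        self.pending.append(text)

    def approve(self, text, reviewer):
        """A named human takes responsibility before anything ships."""
        self.pending.remove(text)
        self.published.append((text, reviewer))

    def reject(self, text):
        """Bad generations are dropped rather than published."""
        self.pending.remove(text)

queue = ReviewQueue()
queue.submit("AI-generated product description")
queue.submit("AI-generated answer that misreads the headline")
queue.approve("AI-generated product description", reviewer="editor@example.com")
queue.reject("AI-generated answer that misreads the headline")
print(queue.published)  # only human-approved text ships
```

Recording who approved each item also gives you the audit trail you'll want when a generated mistake does slip through.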
