How good are AI systems, really, at answering questions about specific industry domains? And can they be trusted to give sound security advice, or to sniff out propaganda in online news? Two recent studies by CCS faculty seek answers to these and other practical questions about the efficacy of AI systems; in the process, they are helping to better define the strengths and limitations of the technology in real-world deployments.
For one study, Assistant Professor Danny Y. Huang and his student Vijay Prakash teamed up with researchers from JPMorgan Chase, Cornell Tech, and Northeastern University to assess how clearly, simply, and correctly Large Language Models (LLMs) could address security questions. The second study, conducted by Rachel Greenstadt and her Ph.D. candidate Julia Jose, explores the efficacy of LLMs at identifying propaganda in online news articles.
The security study analyzed 1,244 responses from popular LLMs to 900 security questions and found that LLMs can excel at providing clear, user-friendly answers to basic security knowledge questions, but may struggle with more complex requests. For example, recommendations on password security often included outdated practices, and the advice offered frequently oversold system capabilities or was too generic to be of use. Additionally, responses varied when queries were paraphrased or repeated, and answers to procedural questions, such as step-by-step guides for configuring security settings, were often riddled with errors. The researchers thus concluded that while LLMs hold promise for making security advice more accessible, users should treat them as supplementary tools rather than primary sources for security-related queries.
The full paper is currently available through arXiv at https://arxiv.org/pdf/2411.14571.
In the second study, Greenstadt and Jose tested several LLMs, including OpenAI’s GPT-3.5 and GPT-4, and Anthropic’s Claude 3 Opus, to see how well they were able to identify common propaganda techniques in online news articles. Specifically, they looked for examples of:
Name-calling: Labeling a person or idea negatively to discredit it without evidence.
Loaded language: Using words with strong emotional implications to influence an audience.
Doubt: Questioning the credibility of someone or something without justification.
Appeal to fear: Instilling anxiety or panic to promote a specific idea or action.
Flag-waving: Exploiting strong patriotic feelings to justify or promote an action or idea.
Exaggeration or minimization: Representing something as excessively better or worse than it really is.
In a paper presented in June 2024 at the 5th International Workshop on Cyber Social Threats, one of several specialty workshops offered at the 18th International AAAI Conference on Web and Social Media, Greenstadt and Jose reported that while these AI models showed some promise, they consistently underperformed more specialized systems designed for propaganda detection.
“LLMs tend to perform relatively well on some of the more common techniques such as name-calling and loaded language. Their accuracy declines as the complexity increases,” Greenstadt explained in an October 2024 news brief (see https://engineering.nyu.edu/news/large-language-models-fall-short-detecting-propaganda). One encouraging note from the study was that GPT-4 showed improvements over its predecessor, GPT-3.5, and outperformed a simpler baseline model in detecting certain techniques like name-calling, appeal to fear, and flag-waving. Yet, Greenstadt cautions, “We still have a long way to go before AI can reliably assist in this critical task, especially with more nuanced techniques. [Our results] also serve as a reminder that, for now, human discernment remains crucial in identifying and countering propaganda in news media.”
Read the paper at https://workshop-proceedings.icwsm.org/pdf/2024_06.pdf.