Tag: prompt injection

Claude for Chrome: Anthropic’s Cautious Leap into Browser-Based AI Assistance

Anthropic is piloting Claude for Chrome, an extension that allows Claude to interact directly within the browser. This product deep-dive explores its capabilities, the significant safety challenges it addresses, and the pilot program

llm security

nvidia ai red team

NVIDIA AI Red Team Unveils Critical LLM Security Vulnerabilities and Mitigation Strategies

The NVIDIA AI Red Team has identified three paramount security vulnerabilities in Large Language Model (LLM) applications: remote code execution via LLM-generated code, data leakage through insecure access controls in RAG systems, and data exfiltration via active content rendering of LLM outputs. This analysis details these risks and outlines NVIDIA's recommended countermeasures to fortify LLM implementations.

ai security

prompt injection

Bypassing OpenAI’s Guardrails: A Simple Prompt Injection Vulnerability

Researchers have demonstrated a critical vulnerability in OpenAI's Guardrails framework, showing how simple prompt injection attacks can bypass its safety mechanisms, raising concerns about AI self-regulation.

peer review

"Positive Review Only": Researchers Embed Hidden AI Prompts in Academic Papers to Influence Peer Review

A recent investigation by Nikkei has uncovered a controversial practice where researchers are embedding hidden prompts within their academic preprints. These prompts, designed to be invisible to human readers but detectable by AI, instruct artificial intelligence tools to generate exclusively positive reviews. This tactic, found in papers from 14 institutions across eight countries, primarily in computer science, has sparked debate about the ethics and integrity of AI in the peer-review process.