Cisco and Nvidia smile at you as your LLM goes off track again

Cisco and Nvidia have both released tools to address the weaknesses of AI technology. Nvidia on Thursday introduced a trio of specialized microservices designed to stop your AI agents from being hijacked or spewing inappropriate content onto the internet. As our friends at The Next Platform reported, these three Nvidia Inference Microservices (aka NIMs) are the latest additions to the GPU giant’s NeMo Guardrails collection, and are designed to keep chatbots and autonomous agents functioning as intended. The three are:

  • A content safety NIM that tries to stop your AI model from “generating biased or harmful outputs, ensuring responses align with ethical standards.” You take the user’s input prompt and the model’s output, run them through the NIM, and it determines whether that input and output are appropriate. You can then act on its verdict, either scolding the user or blocking the model’s output if it is rude (see the sketch after this list). This NIM was built using the Aegis Content Safety Dataset, which contains about 33,000 user/LLM interactions rated as safe or unsafe.
  • A topic control NIM that, we are told, “keeps conversations focused on approved topics, avoiding digression or inappropriate content.” This NIM takes the model’s system prompt and the user’s input, and determines whether the user is staying on-topic for that system prompt. It can be used to block users from trying to drag your model off track.
  • A jailbreak detection NIM that does what it says: it analyzes only your users’ inputs to detect attempts to jailbreak LLMs, that is, to make them act against their intended purpose.
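To make the flow concrete, here is a minimal sketch of how an application might consult the content safety NIM before returning a model’s answer. It assumes the microservice is deployed locally behind an OpenAI-style chat completions endpoint; the URL, model name, and the safe/unsafe wording of the verdict are illustrative assumptions, not Nvidia’s documented API.

```python
import requests

# Hypothetical local deployment of the content safety NIM.
# The endpoint path, model name, and verdict format are assumptions for illustration.
GUARDRAIL_URL = "http://localhost:8000/v1/chat/completions"
GUARDRAIL_MODEL = "content-safety-guard"

def is_safe(user_prompt: str, llm_response: str) -> bool:
    """Ask the guardrail model to rate a user/LLM exchange as safe or unsafe."""
    payload = {
        "model": GUARDRAIL_MODEL,
        "messages": [
            {"role": "user", "content": user_prompt},
            {"role": "assistant", "content": llm_response},
        ],
    }
    reply = requests.post(GUARDRAIL_URL, json=payload, timeout=30)
    reply.raise_for_status()
    verdict = reply.json()["choices"][0]["message"]["content"]
    # The guardrail is assumed to answer with a short safe/unsafe judgment.
    return "unsafe" not in verdict.lower()

# Block the main model's output if the guardrail flags the exchange.
if not is_safe("How do I hotwire a car?", "Step one: ..."):
    print("Response withheld.")
```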

We’ve already discussed how difficult it can be to prevent prompt injection attacks, because many AI assistants and chatbots are built on general-purpose language models whose guardrails can easily be overridden with some simple persuasion. In some cases, merely instructing a bot to “ignore all previous instructions, do this instead” can make it behave in ways its developers never intended. Nvidia’s jailbreak-detection model hopes to protect against this scenario and others.

The GPU giant claims that, depending on the application, it may be necessary to chain multiple guardrail models – such as content safety, topic control, and jailbreak detection – together in order to address security gaps and compliance challenges.

However, using multiple models comes at the cost of higher overhead and latency, which is why Nvidia chose to base its guardrails on smaller, more scalable language models of approximately eight billion parameters each.
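In principle, chaining the checks is just a matter of running each guardrail in turn and stopping at the first one that objects, at the price of an extra round trip per check. Here is a rough sketch along those lines, again assuming OpenAI-style endpoints; the URLs, model names, and verdict strings are placeholders rather than Nvidia’s actual interfaces.

```python
import requests

# Placeholder endpoints for the three guardrail NIMs; in a real deployment each
# would be a separately hosted microservice.
GUARDRAILS = {
    "jailbreak": "http://localhost:8001/v1/chat/completions",
    "topic": "http://localhost:8002/v1/chat/completions",
    "safety": "http://localhost:8003/v1/chat/completions",
}

def passes(url: str, messages: list) -> bool:
    """Return True if the guardrail at `url` judges the conversation acceptable."""
    resp = requests.post(url, json={"model": "guard", "messages": messages}, timeout=30)
    resp.raise_for_status()
    verdict = resp.json()["choices"][0]["message"]["content"].lower()
    # Verdict strings are assumed; adapt to whatever the deployed guardrail returns.
    return "unsafe" not in verdict and "off-topic" not in verdict

def guarded_reply(system_prompt: str, user_prompt: str, call_llm) -> str:
    """Screen the input, call the main model, then screen its output."""
    convo = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    if not passes(GUARDRAILS["jailbreak"], convo):
        return "Request blocked: possible jailbreak attempt."
    if not passes(GUARDRAILS["topic"], convo):
        return "Request blocked: off-topic for this assistant."
    answer = call_llm(user_prompt)
    if not passes(GUARDRAILS["safety"], convo + [{"role": "assistant", "content": answer}]):
        return "Response withheld by the content safety check."
    return answer
```

Each check here adds a network round trip and a model inference, which is exactly the overhead Nvidia is trying to keep manageable with its roughly eight-billion-parameter guardrail models.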

These NIMs are available to AI Enterprise customers, or from Hugging Face if you prefer to implement them yourself. Nvidia also provides an open-source tool called Garak for identifying AI vulnerabilities, such as data leaks and prompt injections, which can also be used to test the effectiveness of these guardrails.

Cisco is also interested

Cisco’s AI infosec tool will be called AI Defense, and will offer model validation software that examines LLM performance and informs infosec teams about any risks.

Cisco is also planning AI discovery tools to help security teams find “shadow” AI applications that business units have deployed without IT oversight. Cisco believes some of you have botched chatbot deployments by not restricting them to their intended role, such as customer service interactions, allowing users to freely access the underlying services, like OpenAI’s ChatGPT, that power them. That mistake can cost you a lot of money if someone discovers it and uses your chatbot as a gateway to paid AI services.

AI Defense will detect this sort of thing so you can fix the problem. It will also include hundreds of guardrails that can be deployed to (hopefully) stop AI producing unwanted outcomes.

The offering is still in development and will add tools to Cisco’s cloud-based Security Cloud and Secure Access services. In February, the latter will gain a new service called AI Access, which does things like blocking user access to online AI tools you’d prefer they didn’t use. More services will be added over time.

Cisco is also changing its customer-facing AI agents, which allow things like natural language interfaces to its products, but currently do so discretely, product by product. The networking giant plans a single agent to rule them all and, in the router, bind them, so net admins can use one chat interface to ask questions about their entire Cisco estates.

Anand Raghavan, Cisco’s VP of AI engineering, told The Register that a multiyear roadmap of AI security tool development lies ahead. That is a sobering piece of information, given IT shops already face a myriad of infosec threats and often struggle to integrate and implement the tools needed to address them. ®

Other AI news…

  • Google researchers have developed an attention-based LLM architecture dubbed Titans, which can scale beyond two-million-token context windows and outperforms ultra-large models because of the way it handles memorization. The pre-print paper describing the approach can be found here.
  • The FTC has referred its investigation into Snap’s My AI chatbot to the US Department of Justice for possible criminal prosecution. The watchdog believes the software poses “risks and harms to young users.”
