SecurityandLLM_blog_Leadimage_1200x514
Stuart_Gunter
Stuart Gunter Principal Consultant

AI, Our Thinking Wed 13th September, 2023

How to Use Large Language Models (LLMs) safely and securely

Late in 2022, ChatGPT made its debut and dramatically changed the world. Within 5 days, it had reached 1 million users – an unprecedented adoption rate for any consumer application since the dawn of the Internet. Since then, many companies have started to think much bigger about what AI could mean for their customers and the wider world.

Here at Equal Experts we’ve seen a growing number of our customers evaluate how they can take advantage of this powerful technology. Business leaders are also rightly concerned about how to harness it safely and securely. In this article, we’ll explore some of the risks associated with adopting an LLM and discuss how these can be safely managed.

What is an LLM?

So what exactly is a large language model? Let’s see what ChatGPT has to say about that:

Follow that up with a question about what some common uses are for LLMs, and you’ll get a long list of suggestions ranging from content generation and customer support to professional document generation for medical, scientific and legal fields, as well as creative writing and art. The possibilities are immense!

Note: The above section is the only part of this article written with the assistance of an LLM!

When it comes to securing these kinds of systems, the good news is that we’re not starting from scratch. Decades of security research are still relevant, although there are some emerging security practices that you need to be aware of.

Tried and tested security practices are still important

Securing LLM architectures does not mean throwing out everything we know about security. The same principles we’ve been using to design secure systems are still relevant and equally important when building an LLM-based system.

As with many things, you have to get the basics right first – and we encourage you to start strong with security foundations such as least privilege access, secure network architecture, strong IAM controls, data protection, security monitoring, and secure software delivery practices such as those described in our Secure Delivery Playbook. Without these foundations in place, you risk building LLM castles on sand, and the cost and complexity in retrofitting foundational security controls is very high.

When it comes to LLM architectures, building in tried and tested security controls is critical, but not sufficient. LLMs bring with them some unique security challenges that need to be addressed. Security is contextual; there is no one-size-fits-all solution. So what do those LLM-specific security concerns look like?

New security practices are emerging

This is a new area of research that’s still undergoing a lot of change, but there are consistent themes across the industry around some of the major areas to focus on. You should see these as an LLM-specific security layer on top of the foundational security controls we described earlier. In many cases, these are not entirely new practices; instead, they are facets of security that we’ve been thinking about for a long time but are now being observed through a different lens.

Governance

Security governance models need to be updated to incorporate AI-specific concerns. This defines clear roles & responsibilities and sets expectations on everyone in the organisation, and provides a mechanism to help you to maintain suitable security standards over time. Simon Case, Head of Data at Equal Experts, has written a great article describing data governance.

Data governance and security governance need to work hand-in-hand. A strong AI security governance model should provide clear guidance to engineering teams in the kinds of security risks they need to protect against and security principles they need to follow, allowing them to adopt the best controls to meet those needs.

Some areas to consider when defining your governance approach include:

  • Model usage: What is the organisation’s policy on using SaaS models (e.g. OpenAI) vs self-hosted models? What are the data protection and regulatory implications of these decisions?
  • Data usage: Does the organisation permit AI-based solutions on PII or commercially-sensitive data? Do you allow use of customer data in LLM development?
  • Privacy and ethics: What use cases are acceptable for LLMs? Is automated decision making allowed, or is human oversight required? Can an LLM be exposed to customers or is it exclusively for internal use?
  • Agency and explainability: How do you ensure users can trust the decisions / outputs from AI systems? Are AI systems permitted to make decisions autonomously or is human intervention required?
  • Legal: What laws and regulations apply, and what internal engagement is needed with company legal teams when designing and building an LLM-based solution?
Delivery

The secure development practices that have become commonplace over the past few years remain applicable, but now need to be reexamined in light of new AI-specific threats. For example:

  • Secure by design: Do your security teams have the right knowledge of AI-based systems to guide engineering teams towards a secure design?
  • Training data security: Where is your training data sourced from? How can you validate the accuracy and provenance of that data? What controls do you have in place to protect it from leaks or poisoning attacks?
  • Model security: How are you protecting against malicious inputs, such as prompt injection? Where is your trained model stored, and who has access to it?
  • Supply chain security: What checks are in place to validate the components used in the system? How are you ensuring isolation and access control between components and data?
  • Testing: Are you conducting adversarial testing against your model? Have your penetration testers got the right skills and experience to assess LLM-based systems?
Operations

Monitoring an LLM-based system requires that you understand the new threats that come into play, including threats to the underlying data as well as the abuse of the LLM itself (e.g. leading to undesirable actions or output). Some areas to consider are:

  • Response accuracy: How do you ensure the model continues to produce accurate results? Can you detect model abuse that taints outputs? How do you detect and correct model drift?
  • Abuse detection: Can you identify abusive inputs to the system? Can you identify when model outputs are being used for harm? Do you have incident response plans to protect against these situations?
  • Pipeline security: Have you threat-modelled your delivery pipelines? Are they monitored as production systems?
  • AI awareness: Do your security operations teams understand AI systems sufficiently to monitor them? Have you updated your security processes to factor in changes with AI?

Do remember this is a very new field and the state of the art is changing all the time. It’s important to understand that the industry’s knowledge of the threats and countermeasures is evolving, so will need constant attention throughout the lifecycle of your LLM-based product. There are many useful guides and frameworks to draw on when defining your own organisation-specific approach to LLM adoption, such as Google’s Secure AI Framework. I would encourage you to invest time researching these when defining your own AI adoption plans.

How can I use LLMs safely within my business?

Given how new this technology is, we would recommend your first LLM-based project is focused on an internal use case. This allows you to get familiar with the technology in a safe environment before looking to adopt it for more ambitious goals. For example, we’ve seen teams evaluate LLMs to improve internal platform documentation, making it easier for product teams to onboard to the platform while reducing the support burden on the platform team. This particular example provides an excellent proving ground for LLMs because:

  • Users are all employees with a common interest in improving the system
  • There is no direct customer impact
  • Undesired LLM output can be monitored and feedback loops designed to improve and correct model behaviour
  • The model is trained on a well-defined set of high quality documentation, free from commercially sensitive data or PII
  • The system has no ability to take autonomous actions on other systems

This is a fast-moving area with constant improvements, but there are sound principles for ensuring secure adoption. 

Conclusion

Start with a strong foundation built on secure software engineering principles & practices, ensure you know exactly how you’re using LLMs and what data could be processed, and apply reasonable security controls for the new and emerging threats.

How can Equal Experts help?

LLMs are not a technology to fear or avoid. They’re a powerful new capability that should be pursued cautiously, with a well thought out plan that takes all the relevant security issues into account. The approach we encourage at Equal Experts is to conduct a discovery to identify the most valuable problems to be solved. Once you’ve tested your ideas and agreed on a particular problem to solve with an LLM, consider running an inception to align everyone on the team and de-risk delivery. Of course, security should be at the forefront of the conversation throughout this process. We have extensive experience in all of these areas and would love to help you get started in this new field. If you’re exploring ideas around LLMs, do get in touch.