AI Safety Cape Town: Ensuring a Beneficial Future for AI

Innovation City Cape Town proudly welcomes its newest member, AI Safety Cape Town, a pioneering research organisation dedicated to ensuring that artificial intelligence remains a force for good in our rapidly evolving world. At the helm of this initiative is Benjamin Sturgeon, CTO and Co-Founder, whose approach aims to expand the community of experts tackling AI safety problems. In this exclusive member Q&A, Benjamin shares insights into the crucial work of AI Safety Cape Town, their innovative projeCape Towns, and the profound impact they aspire to make on the future of AI.

Who is AI Safety Cape Town and why are you sorely needed in this world?

AI Safety Cape Town is a field-building research organisation dedicated to expanding the community of people working on AI safety problems. Our goal is to ensure that AI systems are beneficial and not net harmful to society. We focus on a broader scope of problems compared to cybersecurity, which ensures digital safety, by building policies, metrics, and technology to ensure that AI systems themselves are beneficial and safe.

What sparked your interest in AI safety?

For a long time, it wasn’t clear how to make progress on AI safety or how to gain traction in the field. The emergence of powerful AI systems like ChatGPT provided a sense of direction for the technology and created practical ways to conduct meaningful safety research. Now, there are numerous research avenues to explore, such as performing evaluations on large language models like ChatGPT-4 and Claude, as well as things like mechanistic interpretability to better understand how these systems work and how we could guarantee they would produce safe outputs.

Can you give us a practical example of how your product/service works?

Our field-building efforts include educational workshops, social gatherings, and talks. Additionally, we conduct research in AI safety, particularly focusing on understanding how AI systems may impact human autonomy and decision-making. We connect experts with junior researchers for small collaborative projects. Currently, we’re working on creating a benchmark for large language models to ensure they empower users and support their goals, rather than steering them in undesirable directions.

How do you handle complex, sensitive or political issues with your models?

We believe that models should be designed not to guide users toward a particular path of reasoning or manipulate their actions. Instead, they should prompt users to consider what they think is important. For example, if a model doesn’t have enough information, it should ask follow-up questions to best assist the user. We believe that as models become more capable, there is massive scope for collective and individual disempowerment, and we wish to shift the values with which models are being built to reflect a desire to empower humans.

How far are we in achieving this?

Preliminary results show that Claude 3.5 asks good follow-up questions 65% of the time, ChatGPT-4 does so 45% of the time, and GPT-3.5 only 13% of the time. We are in the process of expanding our evaluations to a broader suite of test categories, and we plan to release this as a public benchmark in a few months.

What are some challenges you’ve faced as a startup?

We are still early in our process and haven’t published this paper yet. Generally, corporations are supportive and often request to buy our evaluations. Hypothetically, we would sell these evaluations and then use them to test their models. However, funding remains a significant challenge, as it is primarily sourced from the US. Additionally, we operate within a traditional academic cycle despite not having a conventional academic background. Much of our funding is sourced from philanthropic organisations that are excited to fund work in this area.

Do you have any latest and exciting company news to share?

We host regular events to expand knowledge and exposure to AI safety. We also run fellowships that offer more in-depth opportunities in policy, governance, and technical research around AI. We have a paper reading club every Wednesday where we read and discuss a paper, followed by dinner. If you’re interested in joining our mailing list, click here.

What’s on your business growth wish list?

We aim to grow our mailing list and create a vibrant community of people who regularly engage in discussions about AI safety and have insightful conversations.

What is your take on doing business in South Africa?

We have our meetups here and connections with the University of Cape Town Maths Department, where the amazing Jonathan Shock is a supervisor. We believe there’s a strong tech community in South Africa that could contribute significantly to AI safety. Africa has a unique opportunity to shape its stance and policy on AI, providing a voice on this crucial topic.

Learn more about Benjamin here.

Visit AI Safety Cape Town here.

Tags: