In May, my team launched a Generative AI Chatbot for business, designed to accelerate productivity by streamlining common tasks.
I have documented the development process in the following articles.
- OpenAI ChatGPT
- AI - Rise of the Machines
- Generative AI for Business
- Prompt Engineering
- Generative AI for Business - Update
- Generative AI - Embeddings
In the last article, I outlined our plan and architecture to ingest business data through a technique known as “embeddings”, which groups similar information using floating point numbers.
The goal was to unlock additional value by enabling the Generative AI Chatbot to respond to prompts with specific business context.
For example, Enterprise businesses have massive datasets (structured and unstructured) that can be a challenge to discover and/or consume, impacting speed to value. We hope that by exposing this data via natural language (Generative AI Chatbot) we can mitigate these pain points, whilst also unlocking additional insights from previously obscured data.
Over the past month, we have been testing this concept across a series of end-user pilots, continuously evaluating and tuning the architecture and data. The desired result is to consistently deliver accurate responses, referenced from business data, avoiding hallucinations.
The screenshot below highlights the business context architecture in action.
In this example, we have ingested our Information Security policy directives and standards (20+ documents and 100+ pages).
The Generative AI Chatbot is prompted with a specific question “What is our password policy?”.
The answer provided is pulled directly from an unstructured document, including the source information to allow for user verification (if required).
The right column shown in the screenshot is for testing purposes only, providing additional detail regarding the specific data chunks and embeddings being referenced. This supports the troubleshooting process, allowing users to better understand why a specific response is being provided.
Overall, testing has been very successful. The architecture tuning process has been surprisingly straightforward, focused on ensuring that the ingested data has been chunked and grouped appropriately, covering issues that can occur from overlapping context windows, etc.
We also “industrialised” the architecture ready for scale, including additional features to enforce/maintain information protection, etc. For example, you do not want restricted/sensitive data to be surfaced via the Generative AI Chatbot by accident.
The bigger challenge has been the data itself, specifically when working with unstructured data (e.g., knowledge articles, policies, procedures, processes)
We have discovered that many of our documents are technically correct, but lack the appropriate context, making them difficult to consume (by a human or a machine). We have also identified a few instances where documents contain incorrect (out of date) information and/or conflict with information found in another document.
When working with thousands of documents, written over many years by different authors, these data issues are to be expected.
When reviewed in isolation by a human, the impact of an individual issue is likely very small. However, when combined, without any ability to apply additional context, the response from a Generative AI Chatbot can be misleading (inaccurate, confusing, conflicting).
As a result, the majority of our time has been spent working with the data owners, supported by some new guidelines and templates to ensure the data is optimised for consumption.
Regardless of our future with Generative AI, this process has been useful in identifying and improving our source data.
As we conclude the pilots, we plan to proceed with a limited production launch, targeting a specific business area. This approach will allow us to continue to learn as we scale, balancing the effort/risk/reward.
In addition to the business context, we have also continued to evolve the core production capabilities. The screenshot below highlights the latest web interface, incorporating a wide range of user experience improvements.
We have also added some specific new features. For example:
Upload Files: Directly upload unstructured documents to be used as part of a specific prompt. This removes the need to copy/paste bodies of text.
Model Switcher: Our architecture is decoupled, allowing for multiple models to be used. At this time we have enabled OpenAI GPT-3.5 (via Azure OpenAI) and Google PaLM2 in production, with OpenAI GPT-4 and Meta Llama 2 in non-production. The user can switch models instantly using the model switcher, whilst maintaining all context and features.
Personas: New pre-primed personas to support specific business tasks. These personas will also be used to surface business context, including individual user access controls, meaning personas can be restricted and secured for specific user groups.
Dashboard: Usage dashboard highlighting the holistic statistics and estimated return on investment.
As highlighted by our dashboard (screenshot below), adoption continues to grow, outpacing our expectations.
Since May, we have seen 2,654 unique users leverage the Generative AI Chatbot, which is approximately 38% of the target audience. On average, this includes 600+ daily interactions.
Considering this growth has been organic (no formal organisational change management intervention), combined with the minimal investment to build/run(approximately $120pm total), we are very pleased with the progress and therefore excited to see what comes next!