Technology | September 11, 2024

Precision and Efficiency in Domain-Specific Text2SQL Conversion: Introducing Skypoint AI Platform’s SherloQ

Editor’s note: DataStax customer Skypoint, a provider of data, analytics, and AI services to healthcare providers, has developed a new Text2SQL engine that enables highly accurate natural-language database queries. In this guest post, Skypoint CEO Tisson Mathew walks us through the how and why of SherloQ, which was built on Astra DB. 

In the rapidly evolving world of AI-driven data analytics, precision and efficiency are paramount. This is especially true in industries where accuracy can directly impact critical decisions.

Skypoint's SherloQ sets new benchmarks in the Text2SQL domain, particularly for regulated industries like healthcare, the public sector, and financial services. 

While general-purpose AI tools have made strides in democratizing data access, SherloQ’s specialized approach delivers unmatched accuracy and cost-efficiency, making it the tool of choice for organizations that demand more from their data.

The SherloQ difference: 92% accuracy in action

In benchmarks on production workloads, SherloQ achieved a 92% accuracy rate in converting natural language into SQL queries. This is a significant improvement over the 65% accuracy achieved by GPT-3.5 combined with prompt engineering and five or more few-shot queries per user prompt.

This level of precision isn’t an accident: it's the result of a deliberate focus on industry-specific training using Llama 3.1 models. Unlike general-purpose models that are built to cater to a broad audience, SherloQ is engineered to understand the unique terminologies, data structures, and nuances of industries that operate under strict regulatory frameworks.

This precision is crucial for organizations where even a minor error in data interpretation can lead to significant consequences. For example, in the healthcare sector, SherloQ’s ability to accurately interpret and generate SQL queries helps ensure that patient data is analyzed correctly, leading to better clinical decisions and improved patient outcomes. In the financial services industry, SherloQ can help prevent costly mistakes in financial reporting or risk assessment, where precision is critical for maintaining compliance and making sound financial decisions.

Cost-efficiency through domain specificity

Accuracy alone isn’t enough; organizations also need solutions that are cost-effective and provide a strong return on investment. This is another area where SherloQ shines. Compared to general-purpose AI tools like Snowflake’s Cortex Analyst, which often come with high operational costs due to their broad applicability and computational demands, SherloQ offers a more focused and efficient solution.

SherloQ’s domain-specific approach enables it to operate with a leaner, more efficient model, reducing both the computational overhead and associated costs. This efficiency translates directly into savings for organizations that adopt SherloQ, making it an ideal solution for businesses looking to maximize their analytics budget without compromising on performance or accuracy.

SherloQ’s seamless integration with existing data infrastructures means that organizations don’t need to invest in costly system upgrades or extensive retraining programs. Whether your organization relies on Microsoft SQL Server, Snowflake, Google BigQuery, or any other major database platform, SherloQ fits into your existing ecosystem.

Advanced features: Building a production-grade solution

Beyond being a high-accuracy, cost-efficient Text2SQL engine, SherloQ is also a production-grade solution designed for real-world applications. 

SherloQ offers advanced error handling and reflection capabilities. These enable it to learn from past interactions, continuously improving its query accuracy and relevance over time. This self-improving mechanism is particularly valuable in regulated industries, where the ability to adapt and refine queries based on past performance is critical for maintaining compliance and ensuring accurate data analysis.
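To make the idea of error handling with reflection more concrete, here is a minimal sketch of one common pattern: run the generated SQL, and if the database rejects it, pass the error message back to the generator for another attempt. This is an illustration under stated assumptions, not SherloQ's implementation. The `generate_sql` function is a hypothetical stand-in for a model call, the `expenses` table is invented, and the loop only covers retries within a single request rather than learning across past interactions.

```python
import sqlite3


def generate_sql(question, feedback=None):
    """Hypothetical stand-in for a model call (not a real SherloQ API).

    Returns a query with a wrong column name on the first try, then a
    corrected query once it has been shown the database error.
    """
    if feedback is None:
        return "SELECT SUM(amt) FROM expenses"
    return "SELECT SUM(amount) FROM expenses"


def answer(conn, question, max_attempts=3):
    """Generate SQL, execute it, and retry with the error text on failure."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        sql = generate_sql(question, feedback)
        try:
            rows = conn.execute(sql).fetchall()
            return sql, rows, attempt
        except sqlite3.Error as exc:
            # Reflection step: keep the failing query and the error so the
            # next generation attempt can correct itself.
            feedback = f"Query failed: {sql!r} -> {exc}"
    raise RuntimeError(f"No valid query after {max_attempts} attempts: {feedback}")


# Invented in-memory data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?)",
    [("housekeeping", 120.0), ("housekeeping", 80.0)],
)

sql, rows, attempts = answer(conn, "What are total expenses?")
print(attempts, sql, rows)
```

Capping the number of attempts keeps latency and cost bounded when the generator cannot recover from an error.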

Another key feature of SherloQ is its state-management capabilities, which help it maintain context throughout the query-generation process. SherloQ uses LangChain and Astra DB, DataStax’s vector database. Astra DB, part of DataStax’s AI PaaS, helps SherloQ perform dynamic few-shot prompting by finding the examples closest in structure and content to each new input query. This enables SherloQ to leverage the most relevant patterns to generate accurate SQL queries.
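A rough sketch of dynamic few-shot prompting may help here. The version below is deliberately simplified and makes several assumptions: it uses a toy bag-of-words embedding and an in-memory list in place of a real embedding model and Astra DB, it does not call LangChain, and the example questions, table names, and function names are all invented for illustration. The technique it shows is the same in outline: embed the incoming question, rank stored question/SQL pairs by similarity, and place the closest ones in the prompt.

```python
import math
import re
from collections import Counter

# Hypothetical example store of (question, SQL) pairs. A production system
# would keep these in a vector database with learned embeddings.
EXAMPLES = [
    ("total housekeeping expenses by quarter",
     "SELECT quarter, SUM(amount) FROM expenses "
     "WHERE category = 'housekeeping' GROUP BY quarter;"),
    ("number of patients admitted per month",
     "SELECT month, COUNT(*) FROM admissions GROUP BY month;"),
    ("average length of stay by facility",
     "SELECT facility, AVG(length_of_stay) FROM stays GROUP BY facility;"),
]


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(question, k=2):
    """Return the k stored examples most similar to the question."""
    q = embed(question)
    ranked = sorted(EXAMPLES, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(question, k=2):
    """Assemble a few-shot prompt from the nearest examples."""
    shots = "\n\n".join(
        f"Question: {q}\nSQL: {s}" for q, s in select_examples(question, k)
    )
    return f"{shots}\n\nQuestion: {question}\nSQL:"


print(build_prompt("quarterly housekeeping expenses trend", k=1))
```

Because the examples are chosen per query rather than fixed in the prompt, the prompt stays short while still showing the model patterns relevant to the question at hand.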

These tools ensure that SherloQ can handle complex, multi-step queries. This level of sophistication makes SherloQ not just a tool for data querying, but a strategic asset for organizations looking to enhance their data-driven decision-making processes.

Seamless integration and security

It’s not enough for a tool to be accurate and cost-effective; it also needs to be easy to use and secure. SherloQ delivers on both fronts with its seamless integration capabilities and robust security features.

It’s designed to integrate effortlessly with modern communication platforms like Teams and Slack, enabling organizations to access and share data insights without disrupting their workflows. This integration ensures that SherloQ fits into existing processes, making data-driven decision-making more accessible and efficient across an organization.

SherloQ ensures that data is always protected. It comes with enterprise-grade security features, including robust encryption, multi-factor authentication, and compliance with industry standards such as HIPAA and HITRUST. This focus on security is critical for organizations in regulated industries, where data breaches can have severe legal and financial implications.

Case study: SherloQ reduces query latency and improves accuracy

The true measure of any technology is its impact in the real world, and SherloQ has already proven its value in production. 

A post-acute healthcare operator’s finance team used SherloQ to generate quarterly trends for housekeeping expenses. SherloQ produced a precise SQL query that accounted for potential pitfalls like division by zero, and structured the results in a way that provided clear, actionable insights. Compared to earlier models, SherloQ reduced query latency from 29 seconds to 10 seconds, while also improving accuracy and reliability by over 30%.
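For readers curious what guarding against division by zero looks like in a quarterly trend query, here is a small sketch. It is not the query SherloQ generated: the `expenses` table, its columns, and the data are invented, and it runs on an in-memory SQLite database (window functions need SQLite 3.25 or later). The relevant part is `NULLIF(prev_total, 0)`, which turns a zero denominator into NULL so the percentage change comes back empty instead of failing. SQLite itself returns NULL on division by zero, but engines such as Microsoft SQL Server raise an error, which is why the guard matters in a portable query.

```python
import sqlite3

# Invented sample data: one zero-spend quarter followed by two normal ones.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (quarter TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [
        ("2023-Q1", "housekeeping", 0.0),
        ("2023-Q2", "housekeeping", 1000.0),
        ("2023-Q3", "housekeeping", 1250.0),
    ],
)

QUERY = """
WITH totals AS (
    SELECT quarter, SUM(amount) AS total
    FROM expenses
    WHERE category = 'housekeeping'
    GROUP BY quarter
),
with_prev AS (
    SELECT quarter, total,
           LAG(total) OVER (ORDER BY quarter) AS prev_total
    FROM totals
)
SELECT quarter,
       total,
       -- NULLIF makes a zero denominator NULL, so no division by zero occurs.
       ROUND(100.0 * (total - prev_total) / NULLIF(prev_total, 0), 1) AS pct_change
FROM with_prev
ORDER BY quarter;
"""

rows = conn.execute(QUERY).fetchall()
for quarter, total, pct_change in rows:
    print(quarter, total, pct_change)
```

In this sample, the first quarter has no prior quarter and the second follows a zero-spend quarter, so both report no percentage change, while the third reports a 25.0% increase.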

Whether it’s improving operational efficiency, reducing costs, or enhancing decision-making accuracy, SherloQ is a powerful tool that can help organizations unlock the full potential of their data.

Why SherloQ is the future of Text2SQL in regulated industries

As the demand for AI-driven analytics continues to increase, the need for specialized, high-accuracy solutions grows, too. General-purpose tools like Snowflake’s Cortex Analyst have their place, but for organizations in regulated industries, the precision and cost-efficiency offered by SherloQ make it a superior choice.

SherloQ’s combination of industry-specific accuracy, cost-effectiveness, and advanced features sets a new standard in Text2SQL technology. By providing a tool that is not only highly accurate but also easy to integrate, secure, and scalable, Skypoint has created a solution that meets the needs of modern businesses while also positioning them for future success.

Experience the difference that SherloQ can make for your organization: request a demo and discover how this innovative Text2SQL engine can transform your data-driven decision-making processes.
