Arbutus AI ML FAQs

What new AI/ML capabilities are available in the AI/ML menu?

AI - Uses ChatGPT. It requires internet connectivity and is only enabled once configured with your organization's OpenAI account details. The following AI capabilities are available in the AI/ML menu:

• Smart Query
• Data Categorization
• Arbutus Assistant

ML - Uses fully contained, on-premise Machine Learning algorithms via calls to industry-standard Python procedures. No internet connectivity, external communication, or internal configuration is required. The following ML capabilities are available in the AI/ML menu:

• Clusters
• Outliers
• Sentiment Analysis

Both AI and ML capabilities are accessible through a user-friendly, wizard-driven interface. No scripting (programming) knowledge is required. Please see our AI/ML User Guide for full details on each capability and guidance on how to access and use them.

Do the AI/ML features work offline?

What are the best ways to use these tools?

To use these AI/ML tools effectively, first understand what they do and where and when to use them in your analytics. This includes understanding your data, the input requirements, and the prompts presented when you run the tools. If you do not understand your data, you may not get the results you want.

At a very high level, the best ways to use these tools are as follows:
• Outliers: To detect abnormal numeric values, e.g., data points that are far from the norm.
• Clusters: To group similar records, e.g., grouping of similar data points.
• Sentiment Analysis: To determine the emotional tone or sentiment expressed in a piece of text, e.g., classifying the sentiment as positive, negative, or neutral.
• Data Categorization: To submit free-text fields to ChatGPT and receive responses about those data points.

Please see our AI/ML User Guide for full details on each capability and guidance on how to access and use them.

What's required to use AI features in Arbutus?
Do I need ChatGPT set up for this to work?

Yes, you will need ChatGPT set up for all of our AI functionality, e.g., Data Categorization, to work.
The following requirements also apply:
• Confirm that your organization's policies allow the use of ChatGPT.
• Have an OpenAI API account with ChatGPT.
• Enable AI functionality with your API key. Please refer to item #2 in the 'IMPORTANT - PLEASE NOTE' section on page 3 of our AI/ML User Guide for full details.
• Have an active internet connection.

What happens if I don't configure ChatGPT?

If ChatGPT is not configured, none of the AI functionality will work.

The following requirements also apply:
• Confirm that your organization's policies allow the use of ChatGPT.
• Have an OpenAI API account with ChatGPT.
• Enable AI functionality with your API key. Please refer to item #2 in the 'IMPORTANT - PLEASE NOTE' section on page 3 of our AI/ML User Guide for full details.
• Have an active internet connection.

Do AI apps require an internet connection?

Yes, our AI apps require an active internet connection in order to work.

The following requirements also apply:
• Confirm that your organization's policies allow the use of ChatGPT.
• Have an OpenAI API account with ChatGPT.
• Enable AI functionality with your API key. Please refer to item #2 in the 'IMPORTANT - PLEASE NOTE' section on page 3 of our AI/ML User Guide for full details.

What does Smart Query do?

Smart Query is an AI App that lets you ask natural language questions (see Note below) about a dataset and get responses from OpenAI based on its interpretation of those questions. The AI Smart Query App involves interaction with OpenAI and ChatGPT.

Note: Natural Language Processing (NLP) is what allows ChatGPT (and other AI tools) to understand what you are saying, i.e., the question you are asking, and to respond as a human would, using real, natural language.

Please see our AI/ML User Guide for more information and guidance.

How do I access Smart Query in Arbutus?

What kinds of questions can I ask Smart Query?

While Arbutus provides the Smart Query tool to enable users to interact with their data using natural language, it is important to note that the interpretation and response is handled entirely by the AI engine.

Our integrated AI capabilities allow you to ask questions in natural language and receive helpful insights about your data. The clarity, detail, and structure of your questions will influence the quality of the responses, and because AI interprets queries in real time, the same question may yield different results. Taking time to learn the basics of prompt engineering (how to frame questions clearly and provide relevant context) will help you get more consistent and useful outcomes.

Smart Query lets you ask questions about the dataset that is passed to OpenAI and returns responses based on ChatGPT's interpretation of those questions.

Here are some examples of questions that you can ask in Smart Query:
• Which insurance provider has the highest Charge_Amount, and what is it?
• I have a date of birth field in my data. I would like to know how many patients were born between 1937 and 1941.
• Show me duplicate insurance provider names.
• I would like to know the total charge amount (field is Charge_Amount) each for the following Insurance Providers: Aetna, Blue Cross. I would like two separate totals.

Please see our AI/ML User Guide for more information and guidance.
Can I use Smart Query on related tables?

When you run the AI Smart Query App, one of the items you are prompted to select is an Arbutus table. If this happens to be a table that is related to another table, or multiple tables, then only those fields that are included in the current view, including any fields from the related tables, are analyzed.

Therefore, while you can use Smart Query on related tables, only those fields in the current view are analyzed.

How accurate is Smart Query in Arbutus?

While the AI Smart Query App is launched from an Arbutus procedure, the results are ultimately generated through OpenAI's ChatGPT (via a Python procedure) and then returned to Arbutus for display. Therefore, the results and output are not generated by Arbutus.

The interpretation and response are handled entirely by the AI engine. Our integrated AI capabilities allow you to ask questions in natural language and receive helpful insights about your data. The clarity, detail, and structure of your questions will influence the quality of the responses, and because AI interprets queries in real time, the same question may yield different results.

As the results are generated by OpenAI, it is strongly recommended that users exercise due diligence by reviewing and validating the output within Analyzer to ensure its accuracy and reliability.
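The validation recommended here would normally be done in Analyzer itself. Purely as an illustration of the logic, the following Python sketch recomputes per-provider totals independently and compares them with the figures an AI response reported. The sample rows, the `Insurance_Provider` field name, the helper names, and the tolerance are all hypothetical; only `Charge_Amount` appears in the example questions.

```python
from collections import defaultdict

# Hypothetical sample rows; in practice these would come from your own table.
rows = [
    {"Insurance_Provider": "Aetna", "Charge_Amount": 120.50},
    {"Insurance_Provider": "Blue Cross", "Charge_Amount": 300.00},
    {"Insurance_Provider": "Aetna", "Charge_Amount": 79.50},
    {"Insurance_Provider": "Cigna", "Charge_Amount": 45.00},
]


def totals_by_provider(rows, providers):
    """Sum Charge_Amount separately for each requested provider."""
    totals = defaultdict(float)
    for row in rows:
        if row["Insurance_Provider"] in providers:
            totals[row["Insurance_Provider"]] += row["Charge_Amount"]
    return dict(totals)


def matches(reported, recomputed, tolerance=0.005):
    """True when both results cover the same providers with equal totals."""
    return reported.keys() == recomputed.keys() and all(
        abs(reported[name] - recomputed[name]) <= tolerance for name in reported
    )


recomputed = totals_by_provider(rows, {"Aetna", "Blue Cross"})
# Figures as they might be transcribed from an AI response (hypothetical).
reported = {"Aetna": 200.0, "Blue Cross": 300.0}
print(matches(reported, recomputed))
```

If the comparison fails, treat the AI response as unverified and rerun or rephrase the question.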
Can Smart Query run advanced commands like Classify or Summarize?
What happens if Smart Query can't interpret my prompt?
Can Smart Query help new users learn Arbutus syntax?

When you run the AI Smart Query App, all processing takes place in the background and is not visible to you. Once processing is done, the results are passed back to Arbutus for display. Very few actions are recorded in the Command Log, so it offers little help to new users who want to learn Arbutus syntax. This is mainly because most of the processing is done outside of Arbutus via a Python procedure that interacts with OpenAI and ChatGPT.

When running an AI SmartApp, e.g., Smart Query, in Arbutus, I understand that my data is passed on to OpenAI for analysis. What happens to my data?

Please note that our AI functionality requires internet connectivity and an active ChatGPT account. It is disabled and non-functional by default, which ensures that no data is sent off premise unless the customer's organization and users explicitly decide to do so. It is only enabled once configured with your organization's OpenAI account details.

The AI Apps use any information you share via the input prompts and any data from a table that you select. This information is then passed on to OpenAI (via ChatGPT) for analysis.

Therefore, it is imperative that you confirm that your organization's policy allows the use of ChatGPT.

How do I use Data Categorization in Arbutus?
What is Data Categorization?

Can I control how the categories are created in Data Categorization?
Where does the output from Data Categorization appear?

The response received from ChatGPT is essentially a filtered list of records from the current table (the one you selected), based on the description field from that same table.
For example, if the task you entered was "Analyze the comments in the selected field and create an output containing items categorized as 'Food issues'", you could expect to see a filtered list of all records from the current table indicating 'Food issues', based on the description field that you also selected. A new output table is not created.

Is Arbutus Assistant just for querying?
How do I use Arbutus Assistant in Arbutus?

Can Arbutus Assistant automate analysis?
What's the difference between Arbutus Assistant and Smart Query?

The main difference between Arbutus Assistant and Smart Query is that the latter uses a dataset, e.g., an Arbutus table containing data, which is selected by the user, in its interaction with ChatGPT. However, both these AI Apps prompt for questions, which are also entered by the user, for ChatGPT to analyze and provide a response. Please see our AI/ML User Guide for more information and guidance.

What do Clusters do?

Clustering is the process of identifying and grouping data points that are similar to each other. It is a common technique in data analysis, machine learning, and market segmentation.
Clustering helps in recognizing patterns, organizing data, and making predictions. For example, in real estate, clustering could be used to categorize buyers into groups based on budget, location preference, or property type, allowing for better-targeted marketing strategies.
Please see our AI/ML User Guide for more information and guidance.
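To make the idea of grouping similar data points concrete, here is a minimal k-means sketch in plain Python. This is an assumption for illustration only: the FAQ does not name the algorithm behind the Clusters ML App, and the function name, sample points, and naive initialization are hypothetical.

```python
from math import dist
from statistics import mean


def kmeans(points, k, iterations=20):
    """Tiny k-means sketch: returns one cluster label per input point."""
    # Naive initialization (assumption): use the first k points as centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(axis) for axis in zip(*members))
    return labels


# Two numeric fields per record, e.g. two measures for each vendor (made up).
points = [(1.0, 2.0), (9.0, 10.0), (1.5, 1.8), (8.5, 9.5), (1.2, 2.1), (9.2, 10.1)]
labels = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Records that share a label are the ones the procedure considers similar on the chosen numeric fields.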
How do I use Clusters in Arbutus?
How do I control the number of clusters?

When you run the Clusters ML App, you are prompted for the number of clusters. Choosing the right number may involve some experimentation. The right number of clusters helps ensure that the clustering results are both accurate and useful for your specific application; too few or too many may lead to poor segmentation. Therefore, it is recommended that you first get a good understanding of your data.

Please see our AI/ML User Guide for more information and guidance.
Can I see what defines each cluster?

No, you will not see a description of what defines each cluster. After running the Clusters ML App, you will notice a new field/column called kclusters in the resulting output table. The range of values in this field, e.g., 0, 1, 2, depends on the number of clusters that you entered at the prompt when running the Clusters ML App. However, you will not see any descriptions of what these values mean or represent. At a minimum, you should understand that each of those values indicates a grouping of records with similar characteristics based on the fields clustered on. You can do further analysis on that new field (kclusters), e.g., run the Summarize or Classify command, to verify and understand why the records are grouped as they are.
Can I interpret or label Arbutus Clusters?

After running the Clusters ML App, you will notice a new field/column called kclusters in the resulting output table. The range of values in this field, e.g., 0, 1, 2, depends on the number of clusters that you entered at the prompt when running the Clusters ML App. However, you will not see any descriptions of what these values mean or represent. At a minimum, you should understand that each of those values indicates a grouping of records with similar characteristics based on the fields clustered on. You can do further analysis on that new field (kclusters), e.g., run the Summarize or Classify command, to verify and understand why the records are grouped as they are.
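In Arbutus this follow-up analysis would be done with commands such as Summarize or Classify. The Python sketch below only illustrates the underlying idea: group the output rows by the kclusters value and compare record counts and averages per group. The sample rows, the other field names, and the `profile_clusters` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows from a Clusters output table; only `kclusters` is a
# field name taken from the text.
rows = [
    {"vendor_id": "V01", "invoice_total": 120.0, "invoice_count": 3, "kclusters": 0},
    {"vendor_id": "V02", "invoice_total": 9800.0, "invoice_count": 40, "kclusters": 1},
    {"vendor_id": "V03", "invoice_total": 130.0, "invoice_count": 5, "kclusters": 0},
    {"vendor_id": "V04", "invoice_total": 10200.0, "invoice_count": 44, "kclusters": 1},
]


def profile_clusters(rows, label_field, numeric_fields):
    """Return the record count and per-field mean for each cluster label."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_field]].append(row)
    profile = {}
    for label, members in sorted(groups.items()):
        profile[label] = {"count": len(members)}
        for field in numeric_fields:
            profile[label][f"mean_{field}"] = mean(m[field] for m in members)
    return profile


profile = profile_clusters(rows, "kclusters", ["invoice_total", "invoice_count"])
for label, stats in profile.items():
    print(label, stats)
```

In this made-up data, cluster 0 holds low-value, low-volume vendors and cluster 1 holds high-value, high-volume vendors. A comparison like this is what lets you attach your own label to each value.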

How do I know how many clusters I need in Clustering?

When you run the Clusters ML App, you are prompted for the number of clusters. Choosing the right number may involve some experimentation. The right number of clusters helps ensure that the clustering results are both accurate and useful for your specific application; too few or too many may lead to poor segmentation. Therefore, it is recommended that you first get a good understanding of your data. Please see our AI/ML User Guide for more information and guidance.
Can I see what the clusters mean?

After running the Clusters ML App, you will notice a new field/column called kclusters in the resulting output table. The range of values in this field, e.g., 0, 1, 2, depends on the number of clusters that you entered at the prompt when running the Clusters ML App. However, you will not see any descriptions of what these values mean or represent. At a minimum, you should understand that each of those values indicates a grouping of records with similar characteristics based on the fields clustered on. You can do further analysis on that new field (kclusters), e.g., run the Summarize or Classify command, to verify and understand why the records are grouped as they are.
Can I choose the number of clusters in Arbutus Clusters?

Yes. When you run the Clusters ML App, you are prompted for the number of clusters. Choosing the right number may involve some experimentation. The right number of clusters helps ensure that the clustering results are both accurate and useful for your specific application; too few or too many may lead to poor segmentation. Therefore, it is recommended that you first get a good understanding of your data. Please see our AI/ML User Guide for more information and guidance.
What kind of fields are used in Clusters?

When you run the Clusters ML App, you are prompted for the following five fields: Relationship Key Field, Numeric Field 1, Numeric Field 2, Numeric Field 3, and Numeric Field 4. The Relationship Key Field is the key field by which the records in your dataset are grouped, e.g., Vendor ID. The numeric fields (1 to 4) are the fields on which you would like to run Clusters, in other words, the data points that you want to group into their respective clusters. You must specify at least two numeric fields. Selecting just one numeric field results in a pop-up error message and termination of the procedure.

Please see our AI/ML User Guide for more information and guidance.
What fields do I need to run Clusters?

When you run the Clusters ML App, you are prompted for the following five fields: Relationship Key Field, Numeric Field 1, Numeric Field 2, Numeric Field 3, and Numeric Field 4. The Relationship Key Field is the key field by which the records in your dataset are grouped, e.g., Vendor ID. The numeric fields (1 to 4) are the fields on which you would like to run Clusters, in other words, the data points that you want to group into their respective clusters. You must specify at least two numeric fields. Selecting just one numeric field results in a pop-up error message and termination of the procedure.

Please see our AI/ML User Guide for more information and guidance.
Where is the Clusters output stored?

How does Outliers work?
How do I use Outliers in Arbutus?
How are Outliers identified in Arbutus?

The Arbutus ML Outliers functionality executes an industry-standard Python procedure (iqr.py) that identifies outliers using the Interquartile Range (IQR) methodology. The procedure calculates the medians and the lower and upper boundaries for the dataset; any value (from the numeric field being tested for outliers) that falls outside the range between the lower and upper boundaries is considered an outlier. Please see our AI/ML User Guide for more information.
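The general IQR rule can be sketched in a few lines of Python. This is not the contents of iqr.py, which the FAQ does not show. It is a minimal illustration that assumes the conventional boundaries Q1 - 1.5 × IQR and Q3 + 1.5 × IQR, and flags values outside them. The function name, the multiplier `k`, the quartile method, and the sample amounts are all assumptions.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return (lower, upper, outliers) under the conventional IQR rule."""
    # Quartile computation method is an assumption; other methods give
    # slightly different boundaries.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr  # lower boundary
    upper = q3 + k * iqr  # upper boundary
    # Values OUTSIDE the boundaries are the outliers.
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


amounts = [100, 102, 98, 101, 99, 103, 97, 500]
lower, upper, outliers = iqr_outliers(amounts)
print(lower, upper)  # 93.5 107.5
print(outliers)      # [500]
```

A larger `k` widens the boundaries and flags fewer values; whether the Arbutus procedure exposes such a setting is not stated here.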

What kind of Outliers method is used?
Where does Arbutus Outliers store results?
Can I customize the sensitivity of Outliers detection in Arbutus?

What does Sentiment Analysis do?

Sentiment Analysis, also known as opinion mining, is a natural language processing (NLP) technique used in machine learning (ML) to determine the emotional tone or sentiment expressed in a piece of text. The primary aim is to classify the sentiment as positive, negative, or neutral. Essentially, sentiment analysis uses NLP and ML technologies to train computer software to analyze and interpret text similarly to humans. Please see our AI/ML User Guide for more information and guidance.
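As a rough illustration of classifying text as positive, negative, or neutral, the sketch below uses a tiny word-list scorer. It is a deliberately simplified stand-in and an assumption throughout: the FAQ does not describe the model behind the Sentiment Analysis ML App, and the word lists and function name are hypothetical.

```python
import re

# Tiny hypothetical word lists; real sentiment models are far richer.
POSITIVE = {"good", "great", "excellent", "friendly", "love", "helpful"}
NEGATIVE = {"bad", "poor", "terrible", "rude", "slow", "cold"}


def classify_sentiment(text):
    """Label text by counting positive words minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comments = [
    "The staff were friendly and helpful",
    "Food was cold and service was slow",
    "Order arrived on Tuesday",
]
for comment in comments:
    print(classify_sentiment(comment), "-", comment)
```

A scorer this simple misses negation and sarcasm, which is one reason output from any sentiment tool should be spot-checked against the source text.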
How do I run Sentiment Analysis in Arbutus?
Can I run it on multiple fields?
Can I train it or change the scale?

Does Sentiment Analysis work on any text?

Sentiment analysis is best suited for textual data that expresses opinions, emotions, or attitudes. The goal is to determine whether the sentiment conveyed is positive, negative, or neutral, and sometimes to identify more nuanced emotions like anger, joy, or sadness. Examples of such fields include customer feedback/reviews, survey responses, support logs, news articles, product descriptions, and so on. Please see our AI/ML User Guide for more information and guidance.
Does Sentiment Analysis in Arbutus use ChatGPT?
What kind of fields work best for Sentiment Analysis?

Sentiment analysis is best suited for textual data that expresses opinions, emotions, or attitudes. The goal is to determine whether the sentiment conveyed is positive, negative, or neutral, and sometimes to identify more nuanced emotions like anger, joy, or sadness. Examples of such fields include customer feedback/reviews, survey responses, support logs, news articles, product descriptions, and so on. Please see our AI/ML User Guide for more information and guidance.
Can I change or train the Sentiment Analysis model in Arbutus?

What if my organization doesn't allow external API use?
Are there any missing capabilities in this release?

AI and machine learning technologies are evolving rapidly, and Arbutus is committed to staying at the forefront of that evolution. While this current release delivers robust AI/ML functionality designed to support and enhance audit and data analytics workflows, we recognize that there is always room for refinement and growth. Our team is actively working on advancing the intelligence, accuracy, and completeness of AI/ML capabilities, and users can expect notable enhancements and new features in the upcoming releases.
Can I automate AI/ML Apps?
Can I run AI/ML Apps on multiple fields?

Can I schedule AI/ML procedures or include them in automation?
How are AI/ML results saved?

How the AI/ML results are saved depends on the App used. The following list gives high-level information on where each App saves its results:

• Outliers (ML) - Creates a new output table containing all existing fields from the source table selected for testing, plus one new field created as a result of the testing.

• Clusters (ML) - Same as above.

• Sentiment Analysis (ML) - Same as above, but with two new fields created and included in the output table.

• Smart Query (AI) - Creates a new output table with two fields: one showing the question that you entered and another showing the result (answer) returned by ChatGPT.

• Data Categorization (AI) - A new output table is not created. Instead, a filter showing the filtered results is applied to the view of the table that was analyzed in ChatGPT.

• Arbutus Assistant (AI) - A new output table is not created. Instead, the response from ChatGPT to the question you asked is displayed in an Arbutus dialog. You have the option to save this response (displayed in the dialog) to an Arbutus procedure.
As a user, what's missing from these new tools?

AI and machine learning technologies are evolving rapidly, and Arbutus is committed to staying at the forefront of that evolution. While this current release delivers robust AI/ML functionality designed to support and enhance audit and data analytics workflows, we recognize that there is always room for refinement and growth. Our team is actively working on advancing the intelligence, accuracy, and completeness of AI/ML capabilities, and users can expect notable enhancements and new features in the upcoming releases.
