Title: | Run Multiple Large Language Model Predictions Against a Table, or Vectors |
---|---|
Description: | Run multiple 'Large Language Model' predictions against a table. The predictions run row-wise over a specified column. It works using a one-shot prompt, along with the current row's content. The prompt that is used depends on the type of analysis needed. |
Authors: | Edgar Ruiz [aut, cre], Posit Software, PBC [cph, fnd] |
Maintainer: | Edgar Ruiz <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-11-19 05:41:25 UTC |
Source: | https://github.com/mlverse/mall |
Use a Large Language Model (LLM) to classify the provided text as one of the options provided via the labels argument.
    llm_classify(
      .data,
      col,
      labels,
      pred_name = ".classify",
      additional_prompt = ""
    )

    llm_vec_classify(x, labels, additional_prompt = "", preview = FALSE)
.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
labels | A character vector with at least 2 labels to classify the text as |
pred_name | A character vector with the name of the new column where the prediction will be placed |
additional_prompt | Inserts this text into the prompt sent to the LLM |
x | A vector that contains the text to be analyzed |
preview | If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. |
llm_classify returns a data.frame or tbl object. llm_vec_classify returns a vector that is the same length as x.
    library(mall)

    data("reviews")

    llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

    llm_classify(reviews, review, c("appliance", "computer"))

    # Use 'pred_name' to customize the new column's name
    llm_classify(
      reviews,
      review,
      c("appliance", "computer"),
      pred_name = "prod_type"
    )

    # Pass custom values for each classification
    llm_classify(reviews, review, c("appliance" ~ 1, "computer" ~ 2))

    # For character vectors, instead of a data frame, use this function
    llm_vec_classify(
      c("this is important!", "just whenever"),
      c("urgent", "not urgent")
    )

    # To preview the first call that will be made to the downstream R function
    llm_vec_classify(
      c("this is important!", "just whenever"),
      c("urgent", "not urgent"),
      preview = TRUE
    )
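
The formula syntax in the examples above pairs each label with a value to return. As a rough illustration of what that pairing expresses, the sketch below takes such formulas apart in plain base R. It is not mall's internal code, and the names `labels`, `label_names`, and `label_values` are hypothetical.

```r
# Hypothetical illustration: each two-sided formula holds a label on the
# left-hand side and the value to return on the right-hand side.
labels <- c("appliance" ~ 1, "computer" ~ 2)

# c() on formulas yields a list; element [[2]] is the left-hand side
# and element [[3]] is the right-hand side of each formula
label_names <- vapply(labels, function(f) as.character(f[[2]]), character(1))
label_values <- vapply(labels, function(f) as.numeric(f[[3]]), numeric(1))

# Lookup table from label text to the value that should be returned
setNames(label_values, label_names)
```

Reading the formula as "label on the left, value on the right" explains why the resulting column holds 1 or 2 rather than the label text.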
Use a Large Language Model (LLM) to process the provided text using the instructions from prompt.
    llm_custom(.data, col, prompt = "", pred_name = ".pred", valid_resps = "")

    llm_vec_custom(x, prompt = "", valid_resps = NULL)
.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
prompt | The prompt to append to each record sent to the LLM |
pred_name | A character vector with the name of the new column where the prediction will be placed |
valid_resps | If the response from the LLM is not open-ended, but limited to a known set of values, provide those options in a vector. This function will set to NA any response not in the options. |
x | A vector that contains the text to be analyzed |
llm_custom returns a data.frame or tbl object. llm_vec_custom returns a vector that is the same length as x.
    library(mall)

    data("reviews")

    llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

    my_prompt <- paste(
      "Answer a question.",
      "Return only the answer, no explanation",
      "Acceptable answers are 'yes', 'no'",
      "Answer this about the following text, is this a happy customer?:"
    )

    reviews |>
      llm_custom(review, my_prompt)
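
The valid_resps argument restricts the accepted answers to a known set. The base R sketch below shows that kind of filtering, assuming answers outside the set become NA. The helper `keep_valid()` and the sample `responses` are hypothetical, and this is not mall's internal code.

```r
# Hypothetical helper: keep only the responses found in valid_resps,
# replacing everything else with NA
keep_valid <- function(responses, valid_resps) {
  responses[!responses %in% valid_resps] <- NA
  responses
}

# Made-up model output; the third answer ignores the prompt's instructions
responses <- c("yes", "no", "Yes, very happy")

keep_valid(responses, c("yes", "no"))
```

A filter like this keeps an occasional free-form answer from leaking into a column that downstream code expects to contain only 'yes' or 'no'.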
Use a Large Language Model (LLM) to extract a specific entity, or entities, from the provided text.
    llm_extract(
      .data,
      col,
      labels,
      expand_cols = FALSE,
      additional_prompt = "",
      pred_name = ".extract"
    )

    llm_vec_extract(x, labels = c(), additional_prompt = "", preview = FALSE)
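.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
labels | A vector with the entities to extract from the text |
expand_cols | If multiple labels are passed, set this flag to TRUE to create a new column per label. If labels is a named vector, the names are used as the new column names. Defaults to FALSE. |
additional_prompt | Inserts this text into the prompt sent to the LLM |
pred_name | A character vector with the name of the new column where the prediction will be placed |
x | A vector that contains the text to be analyzed |
preview | If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. |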
llm_extract returns a data.frame or tbl object. llm_vec_extract returns a vector that is the same length as x.
    library(mall)

    data("reviews")

    llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

    # Use 'labels' to let the function know what to extract
    llm_extract(reviews, review, labels = "product")

    # Use 'pred_name' to customize the new column's name
    llm_extract(reviews, review, "product", pred_name = "prod")

    # Pass a vector to request multiple things, the results will be pipe
    # delimited in a single column
    llm_extract(reviews, review, c("product", "feelings"))

    # To get multiple columns, use 'expand_cols'
    llm_extract(reviews, review, c("product", "feelings"), expand_cols = TRUE)

    # Pass a named vector to set the resulting column names
    llm_extract(
      .data = reviews,
      col = review,
      labels = c(prod = "product", feels = "feelings"),
      expand_cols = TRUE
    )

    # For character vectors, instead of a data frame, use this function
    llm_vec_extract("bob smith, 123 3rd street", c("name", "address"))

    # To preview the first call that will be made to the downstream R function
    llm_vec_extract(
      "bob smith, 123 3rd street",
      c("name", "address"),
      preview = TRUE
    )
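
When several entities are requested without expand_cols, the results come back pipe delimited in a single column. The base R sketch below splits such strings into one column per entity after the fact. The sample strings in `extracted` are made up, the exact spacing around the pipe is an assumption, and this is not mall's internal code.

```r
# Hypothetical single-column results, one pipe-delimited string per row
extracted <- c("tv | happy", "laptop | frustrated")

# Split on the literal pipe character, then trim surrounding whitespace
parts <- strsplit(extracted, "|", fixed = TRUE)
parts <- lapply(parts, trimws)

# Bind the pieces row-wise into a data frame, one column per entity
result <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(result) <- c("product", "feelings")

result
```

Setting expand_cols = TRUE in the original call avoids this step. The manual split is mainly useful for results that were already stored in the single-column form.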
Use a Large Language Model (LLM) to perform sentiment analysis on the provided text.
    llm_sentiment(
      .data,
      col,
      options = c("positive", "negative", "neutral"),
      pred_name = ".sentiment",
      additional_prompt = ""
    )

    llm_vec_sentiment(
      x,
      options = c("positive", "negative", "neutral"),
      additional_prompt = "",
      preview = FALSE
    )
.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
options | A vector with the options that the LLM should use to assign a sentiment to the text. Defaults to: 'positive', 'negative', 'neutral' |
pred_name | A character vector with the name of the new column where the prediction will be placed |
additional_prompt | Inserts this text into the prompt sent to the LLM |
x | A vector that contains the text to be analyzed |
preview | If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. |
llm_sentiment returns a data.frame or tbl object. llm_vec_sentiment returns a vector that is the same length as x.
    library(mall)

    data("reviews")

    llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

    llm_sentiment(reviews, review)

    # Use 'pred_name' to customize the new column's name
    llm_sentiment(reviews, review, pred_name = "review_sentiment")

    # Pass custom sentiment options
    llm_sentiment(reviews, review, c("positive", "negative"))

    # Specify values to return per sentiment
    llm_sentiment(reviews, review, c("positive" ~ 1, "negative" ~ 0))

    # For character vectors, instead of a data frame, use this function
    llm_vec_sentiment(c("I am happy", "I am sad"))

    # To preview the first call that will be made to the downstream R function
    llm_vec_sentiment(c("I am happy", "I am sad"), preview = TRUE)
Use a Large Language Model (LLM) to summarize text
    llm_summarize(
      .data,
      col,
      max_words = 10,
      pred_name = ".summary",
      additional_prompt = ""
    )

    llm_vec_summarize(x, max_words = 10, additional_prompt = "", preview = FALSE)
.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
max_words | The maximum number of words that the LLM should use in the summary. Defaults to 10. |
pred_name | A character vector with the name of the new column where the prediction will be placed |
additional_prompt | Inserts this text into the prompt sent to the LLM |
x | A vector that contains the text to be analyzed |
preview | If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. |
llm_summarize returns a data.frame or tbl object. llm_vec_summarize returns a vector that is the same length as x.
    library(mall)

    data("reviews")

    llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

    # Use max_words to set the maximum number of words to use for the summary
    llm_summarize(reviews, review, max_words = 5)

    # Use 'pred_name' to customize the new column's name
    llm_summarize(reviews, review, 5, pred_name = "review_summary")

    # For character vectors, instead of a data frame, use this function
    llm_vec_summarize(
      "This has been the best TV I've ever used. Great screen, and sound.",
      max_words = 5
    )

    # To preview the first call that will be made to the downstream R function
    llm_vec_summarize(
      "This has been the best TV I've ever used. Great screen, and sound.",
      max_words = 5,
      preview = TRUE
    )
Use a Large Language Model (LLM) to translate a text to a specific language
    llm_translate(
      .data,
      col,
      language,
      pred_name = ".translation",
      additional_prompt = ""
    )

    llm_vec_translate(x, language, additional_prompt = "", preview = FALSE)
.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
language | Target language to translate the text to |
pred_name | A character vector with the name of the new column where the prediction will be placed |
additional_prompt | Inserts this text into the prompt sent to the LLM |
x | A vector that contains the text to be analyzed |
preview | If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. |
llm_translate returns a data.frame or tbl object. llm_vec_translate returns a vector that is the same length as x.
    library(mall)

    data("reviews")

    llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

    # Pass the desired language to translate to
    llm_translate(reviews, review, "spanish")
Specifies the back-end provider and the model to use during the current R session.
    llm_use(
      backend = NULL,
      model = NULL,
      ...,
      .silent = FALSE,
      .cache = NULL,
      .force = FALSE
    )
backend | The name of a supported back-end provider. Currently only 'ollama' is supported. |
model | The name of a model supported by the back-end provider |
... | Additional arguments that this function will pass down to the integrating function. In the case of Ollama, it will pass those arguments to ollamar::chat(). |
.silent | Avoids console output |
.cache | The path to save model results, so they can be re-used if the same operation is run again. To turn off, set this argument to an empty character: "" |
.force | Flag that tells the function to reset all of the settings in the R session |
A mall_session object.
    library(mall)

    llm_use("ollama", "llama3.2")

    # Additional arguments will be passed 'as-is' to the downstream
    # R function; in this example, to ollamar::chat()
    llm_use("ollama", "llama3.2", seed = 100, temperature = 0.1)

    # During the R session, you can change any argument individually
    # and it will retain all of the previously used arguments
    llm_use(temperature = 0.3)

    # Use .cache to modify the target folder for caching
    llm_use(.cache = "_my_cache")

    # Leave .cache empty to turn off this functionality
    llm_use(.cache = "")

    # Use .silent to avoid the print out
    llm_use(.silent = TRUE)
Use a Large Language Model (LLM) to see if something is true or not based on the provided text.
    llm_verify(
      .data,
      col,
      what,
      yes_no = factor(c(1, 0)),
      pred_name = ".verify",
      additional_prompt = ""
    )

    llm_vec_verify(
      x,
      what,
      yes_no = factor(c(1, 0)),
      additional_prompt = "",
      preview = FALSE
    )
.data | A data.frame or tbl object that contains the text to be analyzed |
col | The name of the field to analyze, supports tidy-eval |
what | The statement or question that needs to be verified against the provided text |
yes_no | A size 2 vector that specifies the expected output. It is positional. The first item is expected to be the value to return if the statement about the provided text is true, and the second if it is not. Defaults to: factor(c(1, 0)) |
pred_name | A character vector with the name of the new column where the prediction will be placed |
additional_prompt | Inserts this text into the prompt sent to the LLM |
x | A vector that contains the text to be analyzed |
preview | If TRUE, returns the R call that would have been used to run the prediction, for the first record in x only. Defaults to FALSE. |
llm_verify returns a data.frame or tbl object. llm_vec_verify returns a vector that is the same length as x.
    library(mall)

    data("reviews")

    llm_use("ollama", "llama3.2", seed = 100, .silent = TRUE)

    # By default it will return 1 for 'true', and 0 for 'false',
    # the new column will be a factor type
    llm_verify(reviews, review, "is the customer happy")

    # The yes_no argument can be modified to return a different response
    # than 1 or 0. First position will be 'true' and second, 'false'
    llm_verify(reviews, review, "is the customer happy", c("y", "n"))

    # Numbers can also be used, for cases where you wish to match
    # the output values of existing predictions
    llm_verify(reviews, review, "is the customer happy", c(2, 1))
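
The yes_no argument is positional: the first element stands for "true" and the second for "false". The base R sketch below illustrates that mapping. The helper `map_yes_no()` is hypothetical, the "yes"/"no" answers are an assumption about what a model might return, and this is not mall's internal code.

```r
# Hypothetical helper: pick the first element of yes_no for a "yes" answer
# and the second element for anything else
map_yes_no <- function(answers, yes_no = factor(c(1, 0))) {
  yes_no[ifelse(answers == "yes", 1L, 2L)]
}

# Custom character outputs: first position is 'true', second is 'false'
map_yes_no(c("yes", "no", "yes"), c("y", "n"))

# Default output: a factor holding 1 for 'true' and 0 for 'false'
map_yes_no(c("yes", "no", "yes"))
```

Because the mapping depends only on position, swapping the two elements of yes_no inverts the output without any change to the question passed in what.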
Mini reviews data set
reviews
A data frame that contains 3 records. The records are of fictitious product reviews.
    library(mall)

    data(reviews)

    reviews