LLMs in the R Space

R-Ladies Gaborone

Luis D. Verde Arregoitia

2025-06-28

Hello!

  • Mammals, macroecology, conservation
  • R user since 2011 / ‘blogger’ since 2015
  • Certified Instructor - Posit (RStudio) & The Carpentries
  • Online and in-person courses all year (Physalia Courses Online, R for the Rest of Us)
  • rOpenSci mentor 2023-2026
  • Package developer
  • R conference speaker + organizer
  • R-Ladies collaborator (St. Louis 🇺🇸; Buenos Aires 🇦🇷; Gaborone 😃 🇧🇼)

Language Models?

Large Language Models?

Generative AI?



AI-generated image created with DALL·E via ChatGPT

After drinking some water, the dog went back to the bed to feed her ___________.

puppies: 0.12, babies: 0.10, family: 0.07, offspring: 0.042, kittens: 0.02, etc.

Language models

Predict and generate text by estimating the probability of tokens occurring in a given sequence.
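The cloze example above can be sketched in a few lines of R. The probabilities are the illustrative values from the slide, not the output of a real model:

```r
# Illustrative next-word probabilities (values from the slide, not a real model)
next_word_probs <- c(
  puppies = 0.12, babies = 0.10, family = 0.07,
  offspring = 0.042, kittens = 0.02
)

# Greedy decoding: always pick the most likely token
greedy_pick <- names(which.max(next_word_probs))

# Sampling: draw a token in proportion to its probability
# (sample() normalizes the weights for us)
set.seed(1)
sampled_pick <- sample(names(next_word_probs), size = 1, prob = next_word_probs)

greedy_pick
```

Real models do this over a vocabulary of tens of thousands of tokens, one token at a time.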


Large language models

Add many more parameters and train on massive datasets, enabling more complex tasks


Transformers

A neural network architecture introduced in 2017 that lets models focus on the most relevant parts of their input (attention), making training and inference more efficient

LLMs can help us with:

Text generation, translation, sentiment analysis, writing and debugging code, generating images and videos, etc.

However

Let’s consider:

  • Growing evidence of negative effects on learning
  • Biases
  • Hype
  • Valid anti-AI sentiment
  • Issues with training material (piracy, fair use considerations, copyright)
  • Provider intentions

🚫 Hype 🚫

(hint: disregard the hype and use what makes you more efficient)

“If you don’t become an AI/prompt engineer by tomorrow you’re a loser with no future”


“Your product needs AI or it will be worthless”


“All your training in data and coding was a waste of time”

Still…

I believe everyone deserves the ability to automate tedious tasks in their lives with computers - Simon Willison

..providing a way for people to talk with machines in plain language constitutes a dramatic step forward in making computing accessible to everyone - Carl Bergstrom (paraphrased)

Where are the models? Who makes them?

Many popular LLMs are hosted by cloud providers


Many more! (let’s have a look)

Cloud-Hosted models

  • Relatively easy setup
    • Create account, set up billing if applicable, get an API key
  • Access to new and in-development models and massive computing power
  • Can be costly
  • Need internet access
  • We send our data to the API

Local models

Download and run smaller models on our own hardware

  • Data not shared with a provider
  • No ongoing costs
  • Work offline
  • Steep learning curve
  • Large memory requirements
  • Slower performance


Examples:

Ollama (Simplifies running open-source LLMs locally)
Hugging Face transformers (lots of open-source models)
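As a sketch of the local route, here is how a model served by Ollama could be queried from R with the ellmer package (introduced later in this talk). It assumes Ollama is installed and running locally, and that the model name used here (llama3.2) has already been pulled; the model choice is an example, not a recommendation:

```r
library(ellmer)

# Assumes the Ollama server is running locally and that a model
# has already been pulled, e.g. on the command line: ollama pull llama3.2
chat <- chat_ollama(model = "llama3.2")

# Everything stays on our machine; no API key or internet access needed
chat$chat("Explain what a tibble is in one sentence.")
```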

Which model do I use?

Considerations

  • Pricing
    • Free tiers, price per token, billing policies
  • Speed
  • Privacy
  • Hardware

On pricing

Nothing is more costly than something given free of charge
- Japanese proverb.

  • nothing is offered for free unless there is something to be gained for the party involved

  • if we are not paying for the product, then we are the product

Interacting with LLMs


In the browser

Web-based chat interfaces, often with conversation histories and file upload features.


Programmatically through an API

Application Programming Interfaces (APIs) are the rules and protocols that determine how one application can request data from another.

APIs

AI-generated image created with DALL·E 3 via Bing Image Generator

Let different software systems talk to each other and exchange data without having to understand each other's inner workings.

APIs

Allow your applications (R scripts, browser-based prompts, mobile apps) to:

  • Send input text (prompts)
  • Specify which LLM model to use
  • Receive generated text, code, or structured data back
  • Handle authentication (e.g., using API keys)
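Under the hood these are ordinary HTTP requests. A minimal sketch with the httr2 package, building (but not sending) an OpenAI-style chat completion request; the endpoint, model name, and the OPENAI_API_KEY variable are illustrative assumptions, not the only options:

```r
library(httr2)

# Build (but do not send) an OpenAI-style chat completion request
req <- request("https://api.openai.com/v1/chat/completions") |>
  req_auth_bearer_token(Sys.getenv("OPENAI_API_KEY")) |>  # authentication
  req_body_json(list(
    model = "gpt-4o-mini",                                # which model to use
    messages = list(list(role = "user", content = "Hello!"))  # the prompt
  ))

req$url
# Actually sending it would be: resp <- req_perform(req)
```

Packages like ellmer wrap all of this so we rarely have to write requests by hand.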

API Keys

Working with APIs means we need to identify ourselves to the model provider and authenticate our encrypted connection to their servers.


Overall workflow

  • Create an account with a model provider
  • Generate an API key
  • Keep it safe but reachable by your LLM tools


In R, most of the relevant packages can work with keys stored in environment variables.

  • Never share your API keys in public repositories or documents
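A minimal sketch of the environment-variable pattern in base R; the key name OPENAI_API_KEY is a conventional example, not a requirement:

```r
# One-time setup: open ~/.Renviron and add a line like
#   OPENAI_API_KEY=sk-...
# usethis makes the file easy to find and edit:
# usethis::edit_r_environ()

# After restarting R, packages (and our own code) can read the key
# without it ever appearing in a script:
key <- Sys.getenv("OPENAI_API_KEY")
nzchar(key)  # TRUE if the key is set, FALSE otherwise
```

Because .Renviron lives outside the project, the key never ends up in a script we might share or commit.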

H. Wickham - Managing secrets

Ted Laderas - A gRadual introduction to web APIs and JSON

Why LLMs + R

AI-generated image created with DALL·E 3 via Bing Image Generator

Decent enough outputs

  • Working code, actual problems solved

Explosion of programs and features in late 2024

  • Relevant keynotes, blog posts, demos

Reduce context switching

Shifting our attention between different tasks or programs can be tiring and make us less productive and efficient.

Browser ↩︎️↪️ IDE ↩︎️↪️ Word Processor ↩︎️↪️ Cellphone

Lots of copying and pasting may introduce errors.

Why the LLMs + R guide?

  • Learn about Quarto books

  • Show off 📦 hexsession

  • Contribute in Spanish to improve access to these tools

  • Keep up with developments and learn to use the tools I include

Getting started

Questions we might type into an LLM chat window:

> How do I add a subtitle to my ggplot figure?
> What are the arguments for pivot_wider()?
> I can’t join my table_1 object with my dat3 data frame, help!

LLMs in R

Trying to help someone over the phone vs. helping someone at the computer

Young Thug and Lil Durk Troubleshooting meme - imgflip

Three quick demos

‘Continue’ extension

“an open-source AI code assistant”

  • runs as an extension in VS Code (and OSS Code forks including Positron) and JetBrains

Cool IDE features:

  • Chat
  • Autocomplete
  • Context Items

ellmer and friends

Bridging R and LLMs

  • Robust and designed specifically for easy interaction with LLMs directly from R

  • Broad provider support and meant for enterprise/production use

  • Allows models to extend their capabilities by executing R code

  • Can produce output that is immediately usable in R
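A minimal sketch of the basic ellmer workflow, assuming an API key is already stored in the OPENAI_API_KEY environment variable; chat_openai() is one of several provider functions (chat_groq(), chat_ollama(), ...), and the model name here is an example:

```r
library(ellmer)

# Create a chat object for a provider; ellmer reads the API key
# from the matching environment variable (here, OPENAI_API_KEY)
chat <- chat_openai(model = "gpt-4o-mini")

# Send a prompt and print the streamed reply in the console
chat$chat("Write R code to count missing values in each column of a data frame.")
```

The chat object keeps the conversation history, so follow-up calls to chat$chat() build on earlier turns.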

ellmer demos

  • chatGroqDemo.R

  • extractPDFdemo.R

gander

  • More than a simple chat window
  • Powered by ellmer

gander receives a snapshot of our environment with every request and recognizes what we are talking about, including object and variable names

ganderDemo.R

Not part of this talk but interesting

(and somewhat covered in the Quarto Guide)

  • Model Context Protocol (MCP): the “USB-C port for AI applications”

  • Retrieval-Augmented Generation (RAG): enhance the accuracy and reliability of LLMs with additional information from relevant data sources.

  • AI agents: leverage MCP to do many things with some degree of ‘autonomy’

Staying Up-to-Date

  • Very dynamic and fast-moving field
  • Key players are constantly innovating and collaborating
  • Conferences often introduce novel tools
  • Blogs and social media

Closing thoughts

  • LLM-based tools are not always necessary. The answers you need may already be in the documentation, a book, your colleagues, a blog or tutorial, etc.


  • Use for efficiency but always verify outputs


  • Experiment and share your successes!

Acknowledgements

Team at Posit making, maintaining, and improving several of these tools: Simon Couch, Garrick Aden-Buie, Hadley Wickham, Joe Cheng, Tomasz Kalinowski, Winston Chang, and many others.

MLverse team.

Albert Rapp and Chris Brown for sharing tutorials early in the life cycles of many of these tools.

Other developers and maintainers: Frank Hull, Gabriel Kaiser

Everyone who shares, gives me feedback on the guide, or points me to new tools I missed.

🤔🤔🤔

Thanks!

(these slides will be linked in the LLMs Guide by Monday)