Tutorials

Registration for all tutorials is closed. We appreciate your interest in them and hope that you will enjoy the rest of the conference. Access details were emailed to all enrolled participants on 19 June.

Session 1 (Tutorial) - Monday, 20 June 2022, 4:00 - 7:30am CDT

Déployer une application Shiny: Comment faire en pratique ?

  • Capacity: 20
  • Language: French
  • Duration: 3h

Abstract

When it comes to Shiny, most discussions, videos and tutorials focus on developing the application and optimizing it.

But then what?

Building a Shiny app is one thing, but understanding what that means in terms of deployment is quite another!

This tutorial aims to help Shiny developers deploy their applications on several platforms (RStudio Connect, Shiny Server and ShinyProxy).

Understanding how this works and being able to do it yourself is the first step toward a better grasp of the IT world that surrounds our Shiny bubble.

Instructors

This tutorial is taught by data scientists:

Cervan Girard is enthusiastic, motivated, reliable, constructive and effective when it comes to training and development. He loves sharing his passion for the R language with learners of all levels.

For Antoine Languillaume, building software is a bit like building an ingenious clock: it takes patience, perseverance and a good dose of creativity. He will make you see that computer code is not some cabalistic rune; it is a fully fledged medium of communication that anyone can master!

Arthur Bréant loves exploring data in all its forms: relational, non-relational, ordered or unordered. His enthusiasm is even stronger when it comes to presenting information in an application, preferably a Shiny one. He will share the ThinkR team's best recipes for successful deployments.

Docker for R users

  • Capacity: 100
  • Language: English
  • Duration: 3.5h

Abstract

Docker is one of the main tools for reproducibility, CI/CD, automation, productionization, and MLOps. In this tutorial, you will learn the foundations of Docker with applications for R programming, including creating Docker images, setting up an R development environment, and deploying R code on GitHub Actions with a container.

Instructors

Rami Krispin is a data science and engineering manager who mainly focuses on time series analysis, forecasting, and MLOps applications with R. He is the author of Hands-On Time Series Analysis with R and several R packages, including the TSstudio package for time series analysis and forecasting applications. Since the beginning of the COVID-19 outbreak, he has contributed to COVID-19 open-source projects, including creating several packages such as coronavirus, covid19italy, and covid19sf, and integrating CI/CD tools such as Docker, GitHub Actions, and Bash to automate data streaming and communicate results with live dashboards.

Rahul Sangole (http://www.rsangole.com) is a data scientist who focuses on time-series analyses, multivariate anomaly detection, and reproducible work in R. He has experience creating production-ready ETL pipelines, Shiny dashboards, and analyses running in Docker. He’s an avid advocate for Docker as the modus operandi for R development. He has created various solutions to break down the barriers to entry to adopt Docker for development in both RStudio and VS Code.

Introduction to spatial data analysis in R

  • Capacity: 20
  • Language: English
  • Duration: 2h

Abstract

R serves as an excellent open-source platform for performing various spatial analyses. With recent developments in packages such as sf and terra that can handle large datasets quickly, spatial data wrangling and modeling have also become simpler.

This tutorial is for R users who are familiar with basic data wrangling in R but new to spatial analysis. We will introduce various spatial data types and some basic R packages used to download and analyze spatial data, such as sf and raster, in a tidyverse style. Participants will learn how to transform spatial data (focusing on raster and vector data) to different projections, create buffers of different radii, and extract information from spatial data. We will extract points from a raster, calculate distance and area, perform spatial joins, and derive summaries of raster images. We will explore the conversion of raster to vector data and vice versa. Finally, we will create static and interactive maps. We will also share other resources that will help participants continue their learning journey.
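To give a flavor of these operations, here is a minimal sketch using sf and raster, assuming local files sites.shp and elevation.tif (hypothetical names) that share a coordinate reference system:

```r
# A minimal sketch of the kind of workflow described above; file names
# and the target CRS are hypothetical.
library(sf)
library(raster)

sites <- st_read("sites.shp")                   # read vector data
elevation <- raster("elevation.tif")            # read raster data

sites_utm <- st_transform(sites, crs = 32643)   # reproject to UTM zone 43N
buffers <- st_buffer(sites_utm, dist = 1000)    # 1 km buffers around sites

# Extract raster values at the site locations (coordinates as a matrix)
elev_at_sites <- extract(elevation, st_coordinates(sites))
```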

Instructors

Meenakshi Kushwaha (https://meenakshi.rbind.io/) is an environmental health researcher and certified RStudio instructor. Meenakshi has worked on a range of environmental health issues, including lead poisoning, air pollution, and water quality in high and low/middle-income countries.

Adithi R. Upadhya (https://adithirugis.netlify.app/) is a geospatial data analyst at ILK Labs, Bengaluru, India. Her work consists of analysing air quality data, applying different techniques and models, and generating high-resolution maps. She is also the developer and maintainer of packages like mmaqshiny and pollucheck, which have been published in the Journal of Open Source Software.

Pratyush Agarwal is the field research coordinator at ILK Labs. He manages all the air quality instruments and fieldwork. He is a big foodie and travel enthusiast and likes to watch movies in his leisure time.

Causal machine learning with DoubleML

  • Capacity: 75
  • Language: English
  • Duration: 3h

Abstract

Machine learning is frequently used for predicting outcome variables. But in many cases, we are interested in causal questions: Why do customers churn? What is the effect of a price change on sales? How can we evaluate an A/B test?

This three-hour tutorial offers an introduction to causal machine learning with the R package DoubleML. First, a general motivation and introduction to causal machine learning is provided. Participants will learn about situations in which causal questions become important and how ML methods can be used to answer them. We will briefly discuss common pitfalls in the use of ML for causality and introduce the building blocks of double machine learning, which offers a general framework for causal machine learning in a variety of causal models. Second, participants will be introduced to DoubleML and learn about the most important steps of a causal analysis with it. DoubleML makes it possible to conduct causal inference with a variety of learners supported by the mlr3 ecosystem. Code demonstrations are provided as reproducible examples, together with a short outlook on how users can apply, modify and extend the DoubleML package for their own applications. Third, participants will get started with causal machine learning and solve hands-on case studies on their own or in teams. Our target audience is users with basic knowledge of ML concepts and intermediate R knowledge. Participants will benefit from prior experience in machine learning, particularly with mlr3, and with object orientation based on R6.
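As a taste of the workflow, here is a minimal sketch of fitting a partially linear model with DoubleML and an mlr3 learner, using the package's built-in data generator. Note that the nuisance-learner argument names (ml_l, ml_m) are those of recent DoubleML releases and may differ in older versions:

```r
# A minimal sketch of a DoubleML analysis in a partially linear model,
# using the package's built-in simulated-data generator.
library(DoubleML)
library(mlr3)
library(mlr3learners)

set.seed(1234)
dml_data <- make_plr_CCDDHNR2018(n_obs = 500)  # returns a DoubleMLData object

learner <- lrn("regr.ranger", num.trees = 100)
dml_plr <- DoubleMLPLR$new(dml_data,
                           ml_l = learner$clone(),
                           ml_m = learner$clone())
dml_plr$fit()
dml_plr$summary()  # estimated causal effect of d on y
```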

Instructors

Philipp Bach is a postdoctoral researcher in statistics at the University of Hamburg, Germany. His research focuses on implementing and applying methods for causal machine learning. He is a developer of and contributor to the R packages DoubleML and hdm. He enjoys being part of the R community and is looking forward to getting to know the tutorial participants at useR! 2022.

Martin Spindler is a professor of statistics and applications in business administration at the University of Hamburg, Germany. He is the founder of the start-up EconomicAI, which transfers ML-based causal models to industry. His research focuses on theoretical extensions of causal inference in high-dimensional models, as well as their applications in academic research and industry projects. He is a developer of and contributor to the R packages hdm and DoubleML. Martin can’t wait to learn more about tutorial participants and their interest in causal machine learning.

Oliver Schacht is a PhD candidate in statistics at the University of Hamburg, Germany. His research focuses on implementing and applying causal machine learning and causal reinforcement learning in academic and industry applications. Oliver is really excited to teach one of the interactive lab sessions in this tutorial.

Introduction to Git and GitHub

  • Capacity: 20
  • Language: English
  • Duration: 2.5h

Abstract

A version control system records changes to a file or set of files over time so that these modifications can be tracked, reviewed, and restored. Git is a version control system that allows you to review or roll back to earlier versions. Git is installed and maintained on your local system and gives you a self-contained record of your ongoing versions. Compared to other systems, Git is responsive, easy to use, and free.

GitHub is a cloud-based platform that allows you to store and share your projects. Unlike Git, GitHub is exclusively cloud-based. As one of the largest cloud-based code-sharing platforms, GitHub can give your project wide exposure: the more people who review your project, the more attention and use it is likely to attract.
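As a purely R-flavored illustration (an assumption on our part; the tutorial may well teach the Git command line directly), the usethis package wraps the most common setup steps:

```r
# A minimal sketch of initializing Git and publishing to GitHub from R,
# assuming usethis is installed and a GitHub token is configured.
library(usethis)

use_git()     # initialize a Git repository for the current project
use_github()  # create a GitHub repository and push, using the stored token
```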

Instructor

Gavin Masterson’s role as a consulting data scientist at Fathom Data includes work as a data engineer, data analyst, database manager, dashboard designer, report writer and more. He is highly skilled in R, SQL (particularly PostgreSQL), Git and data analysis, and has also conducted research as a herpetologist and population ecologist in South Africa. There’s more about him at https://www.linkedin.com/in/gavin-masterson-9224921a/.

Creating and deploying a statistical blog with the blogdown package

  • Capacity: 30
  • Language: Spanish
  • Duration: 2.5h

Abstract

This tutorial, Creación y publicación de un blog estadístico con el paquete blogdown, will show the fundamental pieces of the system, and attendees will be able to build and even publish their first statistical website. A GitHub account is recommended, as well as having the tidyverse and blogdown packages installed.
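A minimal sketch of the core blogdown workflow, assuming blogdown and Hugo are installed:

```r
# Run inside an empty directory or new RStudio project.
library(blogdown)

new_site()     # scaffold a new site with the default Hugo theme
serve_site()   # preview the site locally with live reload
stop_server()  # stop the preview server when done
```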

Instructor

Emilio L. Cano is a professor of computer science and statistics at Rey Juan Carlos University. His research interests include applied statistics, statistical learning and quality methodologies. He has given more than 1,000 hours of in-company training. He is the author of the SixSigma R package, published on CRAN with an average of 1,500 downloads per month, and of two Springer monographs on quality methodologies with R. He maintains an ongoing transfer of research results to companies via technology transfer contracts. He is also president of the UNE (member of ISO) standardization technical subcommittee CTN 66/SC 3 (Statistical Methods), a collaborating teacher at the Spanish Association for Quality (AEC), and president of the R Hispano association (the Spanish R users group).

Session 2 (Tutorial) - Monday, 20 June 2022, 9:00am - 12:30pm CDT

Introduction to responsible machine learning · Wprowadzenie do modelowania predykcyjnego · La guía del viajero al aprendizaje automático responsable · Sorumlu makine öğrenmesi (machine learning) rehberi · Cùng xây dựng Model Machine Learning

  • Capacity: 200 (EN), 40 (PL), 40 (ES), 40 (TR), 20 (VN)
  • Languages: English, Polish, Spanish, Turkish and Vietnamese
  • Duration: 2.8h

Abstract

[EN] Complex machine learning models are frequently used in predictive modeling. There are many examples of random-forest-like or boosting-like models in medicine, finance, agriculture, etc.

But who trusts a black box? In this workshop we will show why and how to analyse the structure of black-box models.

This will be a hands-on workshop in three parts. Each part will consist of a short lecture followed by time for practice and discussion. Using the analysis of a specific dataset as an example, we will show the basics of modelling with tree-based models. We will then show how to evaluate and analyse such models using XAI techniques.

Among packages, we will learn about randomForest, party, mlr3, DALEX, modelStudio and arenar.

The tutorial will be based on stories and examples from the comic book at https://github.com/BetaAndBit/RML.

[TR] Do you want to train advanced prediction models for data in tabular format, but find their black-box structure off-putting?

• Have you heard of the explainable artificial intelligence (XAI) tools used to explain prediction models and want to learn how to use them?

• Would you like to develop prediction models as responsibly as possible, eliminating biases?

If your answer to any of the questions above is YES, we invite you to this “Introduction to Responsible Machine Learning” tutorial!
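A minimal sketch of the kind of XAI workflow described above, using DALEX on a random forest (the titanic_imputed dataset ships with DALEX):

```r
library(DALEX)
library(randomForest)

# Fit a black-box model, then wrap it in a DALEX explainer
model <- randomForest(survived ~ ., data = titanic_imputed)
explainer <- explain(model,
                     data = titanic_imputed[, colnames(titanic_imputed) != "survived"],
                     y = titanic_imputed$survived)

model_parts(explainer)    # permutation-based variable importance
model_profile(explainer)  # partial-dependence-style model profiles
```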

Instructors

Przemysław Biecek, professor in responsible machine learning at Warsaw University of Technology, is deeply interested in education, applied statistics and applications in medicine. Biecek will give this tutorial in English.

Anna Kozak, senior data scientist at the Warsaw University of Technology, teaches data visualisation, and in business she implements XAI in financial applications. She will give this tutorial in Polish.

Juan Correa has a PhD in behavioral data science and is a researcher-professor at CESA School of Business in Colombia. He will give this tutorial in Spanish.

Mustafa Cavus has a PhD in statistics and is a postdoctoral researcher at Warsaw University of Technology. He is currently working on AutoML tools and organising the R community in Turkey. He will give this tutorial in Turkish.

Ly Thien is a student of mathematics and data analysis at Warsaw University of Technology. Thien will give this tutorial in Vietnamese. A free electronic copy of the text in Vietnamese is available at https://betaandbit.github.io/RML_VN/.

Programación de scripts en R para el procesamiento de QGIS

  • Capacity: 20
  • Language: Spanish
  • Duration: 2.5h

Abstract

QGIS is a leading open-source GIS application that in recent years has become very popular among geographic information systems researchers and professionals in public and private entities, especially technical-scientific institutions, thanks to its large ecosystem and its interoperability with other scientific software such as GRASS, SAGA, R, Python, OTB, and GDAL. It is combined with R mainly to solve statistical analysis tasks on spatial and non-spatial data. Since version 2.14 (Essen), QGIS has been able to integrate R into its toolbox. This functionality was originally part of the QGIS core; from version 3.0 (Girona) onwards, it is provided as an extension. From the perspective of R users, this change of approach allows more independent development according to the requirements of the R spatial community.

This tutorial will be 100 percent hands-on from start to finish. We will learn how to write and add new tools to the QGIS processing toolbox using the R programming language, acquiring the skills to take advantage of the new QGIS API through the integration offered by the Processing R Provider plugin.
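As an illustration, here is a minimal sketch of a script in the annotation format the Processing R Provider plugin documents; all names and defaults are hypothetical:

```r
##Example scripts=group
##Buffer a layer=name
##Layer=vector
##Distance=number 1000
##Buffered=output vector

# Body: plain R code. With the plugin's sf-based loading, the input
# vector layer arrives as an sf object.
library(sf)
Buffered <- st_buffer(Layer, dist = Distance)
```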

Instructors

Gabriel Gaona is a data science enthusiast, interested in spatial analysis for meteorology and hydrology. His work involves spatial analysis applications in energy, meteorology, hydrology, environment, health and public spaces. He has more than 5 years of experience in R, spatial analysis and data science education at the undergraduate and graduate levels at universities in Ecuador.

Antony Barja is a researcher in training at Health Innovation Lab, interested in spatial analysis of data focused on infectious diseases. His current work and publications involve spatial analysis in natural hazards, landscape ecology and health. He has more than 3 years of experience managing several free and open source spatially oriented technologies.

Futureverse: Parallelization in R

  • Capacity: 40
  • Language: English
  • Duration: 2.5h

Abstract

In this tutorial, you will learn how to use the future framework to turn sequential R code into parallel R code with minimal effort.

There are a few ways to parallelize R code. Some solutions come built-in with R (parallel package) and others are provided through R packages available on CRAN. The future framework, available on CRAN since 2016 and used by hundreds of R packages, is designed to unify and leverage common parallelization frameworks in R, to make new and existing R code faster with minimal effort by the developer.

The futureverse (https://futureverse.org) allows you, as the developer, to keep your favorite programming style. For example, future.apply provides one-to-one alternatives to base R’s apply() and lapply() functions, furrr provides alternatives to purrr’s map() functions, and doFuture provides support for foreach syntax. At the same time, the user can switch to a parallel backend of their choice: they can parallelize on their local machine, across multiple local or remote machines, in the cloud, or via a job scheduler on a high-performance computing (HPC) cluster. As a developer, you do not have to worry about which backend the user picks; your future-based code will remain the same regardless of the parallel backend.
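A minimal sketch of this idea, assuming the future and future.apply packages are installed:

```r
library(future)
library(future.apply)

plan(multisession, workers = 4)  # parallelize on the local machine

slow_sqrt <- function(x) { Sys.sleep(0.5); sqrt(x) }

# Drop-in parallel replacement for lapply(); this line stays the same
# no matter which backend the user selects with plan().
y <- future_lapply(1:8, slow_sqrt)

plan(sequential)  # switch back to sequential processing
```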

PS. We will not cover asynchronous Shiny programming using futures and promises in this tutorial.

Instructor

Henrik Bengtsson is an associate professor at the University of California, San Francisco (UCSF), where he does work in cancer research and statistical method development. He has a background in computer science and mathematical statistics, has used R since 2000, is a member of the R Foundation, and serves as the ISC Director of the R Consortium. Henrik maintains 30+ CRAN and Bioconductor packages and is the developer of the future framework.

RStudio in the Amazon EC2 Cloud: A total beginner’s guide

CANCELED. Emails about alternate arrangements were sent to enrolled participants on 14 June.

Lessons for designing scalable and maintainable Shiny apps

  • Capacity: 225
  • Language: English
  • Duration: 3.3h

Abstract

This tutorial is aimed at experienced Shiny developers who want to productionize their apps (especially in the context of a large organization). We will share lessons we have learned about the design, implementation details, and continuous integration and deployment of Shiny apps. Participants will learn how to split their code between business logic and user interface, and how to use Shiny modules to make configurable Shiny apps. We will show how to simplify the reactivity graph, create dynamic user interfaces and implement R6 classes for advanced use cases. Finally, we will explain how to organize the code in packages for testing and deployment.
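A minimal sketch of the business-logic/UI split with a Shiny module; the histogram example is illustrative, not taken from the tutorial:

```r
library(shiny)

# Business logic: a plain function, testable without Shiny
plot_sample <- function(n) hist(rnorm(n), main = paste(n, "draws"))

# User interface and server wrapped in a reusable module
histogramUI <- function(id) {
  ns <- NS(id)
  tagList(
    sliderInput(ns("n"), "Sample size", min = 10, max = 1000, value = 100),
    plotOutput(ns("plot"))
  )
}

histogramServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    output$plot <- renderPlot(plot_sample(input$n))
  })
}

ui <- fluidPage(histogramUI("hist1"))
server <- function(input, output, session) histogramServer("hist1")
shinyApp(ui, server)
```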

Instructors

Pawel Rucki graduated in 2015 from the University of Warsaw with a degree in econometrics and quantitative economics. Having worked with R for almost 10 years, Pawel has applied it to geospatial data analysis, credit risk assessment, financial provisions calculation and clinical trial data analysis. He joined Roche in 2019 and is currently the engineering team lead responsible for one of the biggest R codebases in Roche product development.

Daniel Sabanes Bove studied statistics and obtained his PhD in 2013 for his research work on Bayesian model selection. He started in Roche as a biostatistician and then worked at Google as a data scientist before rejoining Roche, where he is currently leading a statistical engineering team that works on productionizing R packages, Shiny modules and how-to templates for data scientists. Daniel has been programming in R for 18 years; he is (co-)author of multiple R packages published on CRAN and Bioconductor, as well as the book Likelihood and Bayesian Inference: With Applications in Biology and Medicine.

Adrian Waddell joined Roche in 2016, and he is currently part of the Data Science Acceleration group. Adrian holds a PhD in statistics with a focus on interactive data visualization and exploration from the University of Waterloo, Canada, and a bachelor’s degree in data analysis and process design from Zurich University of Applied Sciences. In 2019 Adrian co-initiated the NEST project at Roche, and he is currently the technical lead. NEST is a software development project for creating R-based tools to analyze clinical trials data for exploratory and regulatory use.

Introduction to dimensional reduction in R

  • Capacity: 35
  • Language: English
  • Duration: 3.5h

Abstract

Biology is ever more dependent on high-dimensional data. In this context, dimension reduction is used to extract information, aid data visualization and build better models, but evaluating and exploring these techniques can be challenging. In this hands-on tutorial, we will apply dimensional reduction techniques (PCA, ICA, multidimensional scaling, t-SNE, and UMAP) to data to see how they behave. Our goal is to give students an overview of some frameworks that can be used in R for this, thus harnessing R’s power. First, students will use the tidyverse and tidymodels frameworks to learn PCA and ICA. Second, they will use other packages to learn multidimensional scaling. Third, we will see how this field is still growing rapidly and how we can use reticulate to execute Python code from R. We will then use reticulate together with scikit-learn to run t-SNE and UMAP from R. Last, students will discuss what they have learned and their main takeaways from the tutorial.
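A minimal sketch of PCA in the tidymodels style with the recipes package, using the built-in iris data as a stand-in for high-dimensional biological data:

```r
library(recipes)

# Normalize the predictors, then project onto the first two components
pca_rec <- recipe(~ ., data = iris[, 1:4]) |>
  step_normalize(all_numeric_predictors()) |>
  step_pca(all_numeric_predictors(), num_comp = 2)

pca_scores <- bake(prep(pca_rec), new_data = NULL)
head(pca_scores)  # PC1 and PC2 columns
```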

Instructor

Isabella Bicalho Frazeto is a bioinformatician interested in using every modeling tool out there to describe biological phenomena. She has experience working with machine learning for clinical data, omics data analysis and research. Currently, she is working with computer vision and machine learning applied to developmental biology.

Session 3 (Tutorial) - Monday, 20 June 2022, 2:00 - 5:30pm CDT

Larger-than-memory data workflows with Apache Arrow

  • Capacity: 50
  • Language: English
  • Duration: 3.5h

Abstract

As datasets become larger and more complex, the boundaries between data engineering and data science are becoming blurred. Data analysis pipelines with larger-than-memory data are becoming commonplace, creating a gap that needs to be bridged–between engineering tools designed to work with very large datasets on the one hand, and data science tools that provide the analysis capabilities used in data workflows on the other. One way to build this bridge is with Apache Arrow, a multi-language toolbox for working with larger-than-memory tabular data. Arrow is designed to improve performance and efficiency, and places emphasis on standardization and interoperability among workflow components, programming languages, and systems. The arrow package provides a mature R interface to Apache Arrow, making it an appealing solution for data scientists working with large data in R.

In this tutorial you will learn how to use the arrow R package to create seamless engineering-to-analysis data pipelines. You’ll learn how to use interoperable data file formats like Parquet or Feather for efficient storage and data access. You’ll learn how to exercise fine control over data types to avoid common data pipeline problems. During the tutorial you’ll be processing larger-than-memory files and multi-file datasets with familiar dplyr syntax, and working with data in cloud storage. The tutorial doesn’t assume any previous experience with Apache Arrow: instead, it will provide a foundation for using arrow, giving you access to a powerful suite of tools for analyzing larger-than-memory datasets in R.
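A minimal sketch of such a pipeline, assuming a multi-file Parquet dataset at a hypothetical path with hypothetical column names:

```r
library(arrow)
library(dplyr)

ds <- open_dataset("data/taxi_parquet")  # multi-file Parquet dataset on disk

result <- ds |>
  filter(passenger_count > 1) |>
  group_by(vendor_id) |>
  summarise(mean_fare = mean(fare_amount)) |>
  collect()  # only now is the (small) result pulled into memory
```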

Instructors

Danielle Navarro is a distinguished cognitive and data scientist, professional educator, open source R developer, and coauthor of ggplot2: Elegant Graphics for Data Analysis (3rd edition).

Jonathan Keane is an engineering and data science manager at Voltron Data. They have been passionate about R since undergrad and have developed or contributed to a number of open-source projects over the years.

Stephanie Hazlitt is a data scientist, an avid R user, and an engineering manager at Voltron Data, with a passion for supporting people and teams in learning, creating and sharing data science products and tools.

How to create publication-ready tables in R

  • Capacity: 225
  • Language: English
  • Duration: 3.5h

Abstract

This tutorial will teach attendees how to make fully reproducible, publication-ready tables for academic journals, public-facing talks and online displays. The strengths and weaknesses of popular table-making packages will be shown. After the guiding principles for table design are explained, attendees will learn how to build tables. The course will use a mixture of lecture and hands-on practice to reproduce beautiful tables that have already been published, and attendees will learn how to take these tables to the next level. Attendees will learn how to make both static and interactive tables; topics include quick descriptive statistics with the table1 package; tables to support hypothesis testing with p-values using the gt and gtsummary packages; complex tables that can be integrated directly into Microsoft Word with the flextable package; and beautiful interactive tables with the reactable and reactablefmtr packages. The audience will also learn how to use the rUM package as a shortcut to building academic papers.
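A minimal sketch of one of these topics, a summary table with p-values built with gtsummary (the trial dataset ships with the package):

```r
library(gtsummary)

trial |>
  tbl_summary(by = trt, include = c(age, grade, response)) |>
  add_p()  # add p-values comparing the treatment groups
```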

Instructors

Raymond Balise, PhD, is an award-winning lecturer, applied statistician and biostatistician working in the Miller School of Medicine at the University of Miami. His decades of experience studying cancer, health disparities, HIV and addiction have led to hundreds of peer-reviewed abstracts, posters, and papers.

Lauren Nahodyl, MS, is a biostatistician for the University of Miami Miller School of Medicine. Lauren’s research currently focuses on the areas of cardiovascular disease and cancer disparities.

Anna Calderon, BA, is a data analyst at the University of Miami with an interest in the intersection of healthcare research and machine learning. She has experience working in clinical research involving vulnerable populations and is currently involved in the application of machine learning to predict outcomes of substance use disorder treatments.

Francisco Cardozo, MS, is a data analyst and a PhD student at the University of Miami. He has been working on the evaluation of international preventive programs and developing tools to help communities quickly access the information they need to prevent alcohol use in adolescents.

Machine learning with tidymodels

  • Capacity: 40
  • Language: English
  • Duration: 3.3h

Abstract

This workshop will provide a gentle introduction to machine learning with R using the modern suite of predictive modeling packages called tidymodels. We will build, evaluate, compare, and tune predictive models. Along the way, we’ll learn about key concepts in machine learning including overfitting, the holdout method, the bias-variance trade-off, ensembling, cross-validation, and feature engineering. Learners will gain knowledge about good predictive modeling practices, as well as hands-on experience using tidymodels packages like parsnip, rsample, recipes, yardstick, and tune.
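A minimal sketch of such a workflow, using the built-in mtcars data for brevity:

```r
library(tidymodels)

set.seed(123)
car_split <- initial_split(mtcars)   # holdout method: train/test split
car_train <- training(car_split)
car_test  <- testing(car_split)

lm_spec <- linear_reg() |> set_engine("lm")

wf_fit <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(lm_spec) |>
  fit(data = car_train)

predict(wf_fit, car_test) |>
  bind_cols(car_test) |>
  metrics(truth = mpg, estimate = .pred)  # evaluate on the holdout set
```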

Instructor

Emil Hvitfeldt is a software engineer at RStudio and part of the tidymodels team’s effort to improve R’s modeling capabilities. He maintains several packages within the realms of modeling, text analysis, and color palettes. He taught statistical machine learning as an adjunct professor at American University. He co-authored the book Supervised Machine Learning for Text Analysis in R.

Desarrollo de aplicaciones de Shiny a nivel comercial (sponsored by Appsilon)

  • Capacity: 50
  • Language: Spanish
  • Duration: 3h

Abstract

How do you transform your idea into a production-ready app?

In this interactive tutorial, we will delve into several key aspects that will allow you to organize, simplify, and accelerate the development of your project into a professional Shiny app.

This tutorial will cover structure and good practices to start and maintain a project (renv, unit testing, linters, CSS, config), modularization (R6 classes, Shiny modules, klmr’s box package), performance optimization (profvis, benchmarking) and data management (DBI, pool). For each of these topics, there will be an interactive practice session to gain hands-on experience.
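A minimal sketch of the data-management topic, pooled database connections in a Shiny app; the in-memory SQLite source is illustrative:

```r
library(shiny)
library(pool)
library(DBI)

# One shared connection pool, created at app start-up
pool <- dbPool(RSQLite::SQLite(), dbname = ":memory:")
dbWriteTable(pool, "mtcars", mtcars)
onStop(function() poolClose(pool))  # release connections on shutdown

ui <- fluidPage(tableOutput("tbl"))
server <- function(input, output, session) {
  output$tbl <- renderTable(
    dbGetQuery(pool, "SELECT cyl, AVG(mpg) AS mpg FROM mtcars GROUP BY cyl")
  )
}
shinyApp(ui, server)
```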

Instructors

Oriol Senan is a Shiny developer at Appsilon and a contributor to Bioconductor. Previously, he was a researcher in complex systems applied to computational biology.

Federico Rivadeneira is a molecular biologist with a background in molecular evolution and ecology. He currently works as an R Shiny developer at Appsilon.

Agustin Perez Santangelo is a molecular biologist and cognitive scientist from Argentina. He currently works as an R Shiny developer, a technology he has had fun with in the past (including a runner-up prize in the 2021 Shiny Contest). He enjoys translating ideas into code.

Regression modeling strategies

  • Capacity: 225
  • Language: English
  • Duration: 3.5h

Abstract

The tutorial will provide an overview of how to relax linearity assumptions using regression splines, of model specification, of problems with stepwise variable selection, and of various issues in multivariable modeling and validation. It will include a case study illustrating many of the methods covered. Materials related to the course, along with course notes, may be obtained from http://hbiostat.org/rms.
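A minimal sketch of relaxing a linearity assumption with a restricted cubic spline in the rms package; mtcars stands in for real study data:

```r
library(rms)

d <- mtcars
dd <- datadist(d); options(datadist = "dd")  # required for Predict()

fit <- ols(mpg ~ rcs(hp, 4) + wt, data = d)  # 4-knot spline on hp
anova(fit)          # includes a test of nonlinearity for hp
plot(Predict(fit))  # plot the estimated partial effects
```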

Instructor

Frank Harrell is a professor of biostatistics at the Vanderbilt University School of Medicine. His research areas include development and validation of predictive models, Bayesian modeling, missing data, statistical graphics, biomedical research, and drug development.

Visualizaciones efectivas con ggplot2: más allá de las opciones por defecto

  • Capacity: 40
  • Language: Spanish
  • Duration: 3.5h

Abstract

ggplot2 is a powerful package that allows us to create data visualizations with just a few lines of code. In part, this is possible thanks to the great default values it provides for the elements of a plot. Although these may be enough when exploring your data, when you need data visualizations that effectively communicate your results to an audience, more customization might be required. To this end, this hands-on tutorial aims to help participants gain a deeper understanding of ggplot2’s layered structure and of how to customize the data and non-data elements of a plot. Participants will write code that allows them to use different geoms in a flexible way and to customize elements such as color palettes, fonts, axes and legends. This practical knowledge will be accompanied by a reflection on how these design elements can help us communicate our results more effectively and accessibly.
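A minimal sketch of moving beyond the defaults: a custom palette, labels, and theme tweaks (all values illustrative):

```r
library(ggplot2)

ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(size = 2, alpha = 0.8) +
  scale_color_manual(values = c("#1b9e77", "#d95f02", "#7570b3")) +
  labs(title = "Sepal dimensions by species",
       x = "Sepal length (cm)", y = "Sepal width (cm)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "bottom",
        plot.title = element_text(face = "bold"))
```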

Instructors

Stephanie Orellana is a data scientist focused on spatial data and natural resources. She likes maps, data visualization, programming, and teaching coding to people who don’t have a background in computer science. She is an RStudio trainer and one of the co-organizers of RLadies Santiago.

Riva Quiroga is a linguist with interests in data visualization, text mining, coding literacy, and communication skills. She is an RStudio trainer, a Carpentries instructor, and a co-organizer of RLadies Santiago, RLadies Valparaíso, and LatinR.