News and Notes from the Makers of Nexus | Sonatype Blog

OSS Index contributor asks: Where 'R' you?

Written by Katie McCaskey | June 23, 2020

Editor's note: Many people contribute their time and talents to open source projects. It's always interesting to discover the diversity of expertise and perspective.

Many developers are introduced to Sonatype by way of Sonatype Nexus Repository OSS, DepShield, or other free developer tools. Others discover us through OSS Index, our free catalog of open source components and scanning tools. Still others find us the "old-fashioned" way: through shared passions and community contribution.

To learn more, I chatted with Dr. Colin Gillespie of Jumping Rivers, a data science training and consultancy firm with a strong focus on R and Python. Colin is co-author of the book Efficient R Programming, among other accomplishments. He is also a recent contributor to OSS Index.

How did you get involved with open source software?

During my PhD (1999-2002), the department made the switch from S+ to the newly released programming language R v.1.0. I used R (as well as Fortran 77) to simulate epidemic-type processes. After my PhD, I moved to Newcastle University as a Statistics lecturer. As an academic, I was able to combine my interests (computing) with my research. One of the nice things is that all of this could be open source.

How did you hear about Sonatype?

Jeffry [of Sonatype] emailed one of the R mailing lists about an issue that came up in the newly created oysteR package. The description of the package piqued my interest.

What prompted you to get involved with oysteR?

One of the services that Jumping Rivers offers is monitoring R- and Python-related infrastructure. For example, it's straightforward to create a Shiny or Flask dashboard, but it's difficult to monitor it for vulnerabilities. When I came across oysteR, it seemed like the natural tool to fit into our toolchain.

After seeing the message on R-help, I thought I would make a simple pull request and fix the issue. However, getting the package on CRAN required a little more work.

CRAN is the R package repository. Unlike PyPI, it has fairly strict rules for uploading a package. These rules include a series of automated tests, such as checks on file encodings, examples, and documentation. But there is also a human element, where someone looks at your package. Before getting on CRAN, the package is also run on a variety of operating systems (Windows, Linux, and Mac), in both 32- and 64-bit varieties.

What kinds of projects are you working on now?

R is a bit of an odd language. Unlike Python or C++, the majority of the R community don't have a formal programming background. Instead, they are data scientists, medics, epidemiologists, and basically anyone who works with data. In a recent R course, I even taught a lawyer who was looking to assess the fairness of judges!

Over the last few months, I've been working on an R package called inteRgrate that looks to enforce incredibly strict, pedantic standards via continuous integration. The nucleus of this package is implementing Jumping Rivers' opinionated coding standards in an automated manner. It has now evolved into something more general.

---

Browse additional community contributions and projects at the Sonatype Exchange. And if you're inspired, please consider contributing.