unheadr

My first ever attempt at writing and sharing an R package is now on GitHub.

Click here for installation instructions, examples, and source code.

The functions themselves are essentially the same that I’ve blogged about in the past. They have saved me countless hours of manual data wrangling, and I found myself sourcing these functions whenever I work with other people’s data in my field of comparative biology. Rather than having random scripts floating around, I decided to put them together into a package. This also forced me to write better documentation and clean up some of the messy code.

It was also a good excuse to procrastinate and make a crude hex logo. The design is based on the two-headed crocodile from Zapotec mythology on which we all exist, floating above a primordial sea.

package functions

untangle2

The star of the package. By now there are a couple of people other than myself that use it (at my insistence) to turn human-readable data into usable data by putting extraneous header rows where they below (in a separate column). The motivation and inner workings for this function are described here and here.

unbreak_vals

This is a niche function for very specific usecases. It uses regex to fix values that are broken up across two rows. This usually happens when we are formatting a table and we need to fit it on a page.

unwrap_cols

This function is similar to unbreak_vals, but it is made to unwrap and glue string elements that have been wrapped across multiple rows for presentation purposes, with an inconsistent number of empty or NA values padding out the vertical space in some of the columns.

package writing

I’m essentially a (clueless) tourist to the package development scene. I still managed to write something that I think is useful thanks to Hadley Wickham’s R Packages book, with additional help from the usethis package documentation, notes I took at the 2018 RStudio conference, and advice and Twitter encouragement by JD Long.


Final remarks:

a) I haven’t written unit tests and I’m not sure I’ll even be able to without an insane amount of time reading up on the topic and looking at other people’s testing code. This seems to be the package-making stage at which even the best documentation is not very beginner-friendly.

b) Any feedback is more than welcome.

c) If there is an existing package that this could be pulled into and improved please let me know.

d) There is now some redundancy across the package README, the function documentation, and this blog. Preferable to having poorly-documented functions floating around.

e) Shout out to everyone at the BES Macroecology meeting!