Publication

How R Developers explain their Package Choice: A Survey

by Aditi A Malviya Thakur, Audris Mockus, Russell L Zaretzki, Bogdan Bichescu, Randy Bradley
Publication Type
Conference Paper
Book Title
2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
Publication Date
Page Numbers
1 to 12
Publisher Location
New Jersey, United States of America
Conference Name
2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)
Conference Location
New Orleans, Louisiana, United States of America
Conference Sponsor
IEEE
Conference Date
-

Background: Contemporary software development relies heavily on reusing already implemented functionality, usually in the form of packages.

Aims: We aim to shed light on developers' preferences when selecting packages in the R language.

Method: We create and administer a survey to over 1000 developers who have added one of two common dataframe enhancement libraries in R to their projects: data.table or tidyr. We design the questionnaire using Social Contagion Theory (SCT), following prior work on technology adoption, and ensure that key dimensions affecting developer choice are considered.

Results: Of the 1085 developers we contacted, 803 completed the survey, which asked them to prioritize various factors known to affect developer perceptions of package quality and to provide their background. Most respondents self-identified as data scientists with two to five years of work experience. We found significant differences between the preferences of developers who chose data.table and those who chose tidyr. Surprisingly, package reputation based on easy-to-see measures, such as the number of stars on GitHub, was not an important factor for either group.

Conclusions: Our findings demonstrate the inherently social nature of package adoption. They can help in designing future studies on how different populations of developers decide which software packages to use in their projects. Finally, package developers and maintainers can benefit from a better understanding of the prime concerns of their packages' users.