Usage, Costs, and Benefits of Continuous Integration in Open-Source Projects


Continuous integration (CI) systems automate the compilation, building, and testing of software. Despite CI rising as a big success story in automated software engineering, it has received almost no attention from the research community.

For example, how widely is CI used in practice, and what are some costs and benefits associated with CI? Without answering such questions, developers, tool builders, and researchers make decisions based on folklore instead of data.

In this work, we use three complementary methods to study in-depth the usage of CI in open-source projects. To understand what CI systems developers use, we analyzed 34,544 open-source projects from GitHub. To understand how developers use CI, we analyzed 1,529,291 builds from the most popular CI system. To understand why projects use or do not use CI, we surveyed 442 developers. With this data, we answered 14 questions related to the usage, cost, and benefits of CI. Among our results, we show evidence that supports the popular claim that CI helps projects release more often. We also discovered that 70% of the most popular projects from GitHub use CI, as well as finding that the overall percentage of projects using CI continues to grow, making it important and timely to focus more research on CI.

Full results can be found in our paper, to appear at the 31st IEEE/ACM International Conference on Automated Software Engineering (ASE 2016). Tech Report: Usage, Costs, and Benefits of Continuous Integration in Open-Source Projects





This work is a joint effort between Oregon State University and University of Illinois at Urbana-Champaign



Empirical Study

Data Corpora

To understand the extent to which CI is used and what CI systems developers use, we analyzed three different data sources:

  • Breadth Corpus: projects
  • Survey
  • Depth Corpus: projects, pull requests, and builds


Research Questions

RQ1: What percentage of open-source projects use CI?

We found that 40% of all the projects from our breadth corpus use CI. Thus, CI is indeed used widely and warrants further investigation. Additionally, we know that our scripts do not find all usage of CI. We can reliably detect the use of (public) CI services only if their API makes it possible to query the CI service based on knowing the GitHub organization and project name. Therefore, the results we present are a lower bound on the number of projects that use CI.

RQ2: What is the breakdown of usage of different CI services?

Next we investigate which CI services are the most widely used in our breadth corpus. We show that Travis CI is by far the most widely used CI service. Because of this result, we feel confident that our further analysis can focus on the projects that use Travis CI as a CI service, and that analyzing such projects gives representative results for usage of CI services in open-source projects. We also found that some projects use more than one CI service. In our breadth corpus, of all the projects that use CI, 14% use more than one CI service. We think this is an interesting result that deserves future attention.

Percentages add to greater than 100% since developers can use more than one CI service.

RQ3: Do certain types of projects use CI more than others?

CI usage by project popularity: We want to determine whether more popular projects are more likely to use CI. Our intuition is that if CI leads to better outcomes, then we would expect to see higher usage of CI among the most popular projects (or, alternatively, that projects using CI get better and thus more popular). The accompanying figure shows that the most popular projects (as measured by the number of stars) are also the most likely to use CI (Kendall's τ , p < 0.00001). We grouped the projects from our breadth corpus into 64 equal-size groups, ordered by number of stars, and then calculated the percentage of projects in each group that use CI. Each group has around 540 projects. In the most popular group, 70% of projects use CI. As the projects become less popular, the percentage of projects using CI declines to 23%.
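The grouping step described above can be sketched as follows. This is a minimal illustration, not the scripts used in the study: the function name `ci_usage_by_popularity` and the `(stars, uses_ci)` input shape are assumptions made for the example.

```python
from typing import List, Tuple


def ci_usage_by_popularity(
    projects: List[Tuple[int, bool]], n_groups: int = 64
) -> List[float]:
    """Return the percentage of projects using CI in each popularity group.

    `projects` holds hypothetical (stars, uses_ci) pairs. Projects are sorted
    by star count and split into `n_groups` groups of (nearly) equal size;
    the first entry of the result is the least popular group.
    """
    ordered = sorted(projects, key=lambda p: p[0])
    n = len(ordered)
    percents = []
    for g in range(n_groups):
        # Integer arithmetic keeps group sizes within one project of each other.
        group = ordered[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue  # fewer projects than groups
        using = sum(1 for _, uses_ci in group if uses_ci)
        percents.append(100.0 * using / len(group))
    return percents


# Tiny made-up data set: 8 projects split into 4 groups of 2.
sample = [(1, False), (2, False), (3, True), (4, False),
          (5, True), (6, True), (7, True), (8, True)]
print(ci_usage_by_popularity(sample, n_groups=4))
```

With 34,544 projects and 64 groups, this split yields groups of 539 or 540 projects, which matches the "around 540" figure above.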

Observation

Popular projects are more likely to use CI.

CI usage by language: We now examine CI usage by programming language. Are there certain languages for which the projects written primarily in such languages use CI more than others? The accompanying table of CI usage by language shows languages from our breadth corpus, sorted by the percentage of projects in each language that use CI. The data shows that there are indeed certain languages whose projects use CI more than others. Notice that the usage of CI does not correlate with the number of projects using that language (comparing the number of projects using a language with its rank by percentage, Kendall's τ , p > 0.68). In other words, the languages that use CI the most include both popular languages like Ruby and less popular languages like Scala. Similarly, among the languages that use CI less, we notice both popular languages such as Objective-C and Java and less popular languages such as VimL. However, we did observe that many of the languages that have the highest CI usage are dynamically-typed languages (e.g., Ruby, PHP, CoffeeScript, Clojure, Python, and JavaScript). One possible explanation is that, in the absence of a static type system that can catch errors early on, projects in these languages use CI to provide extra safety.

Observation

We observe a wide range of projects that use CI. The popularity of the language does not correlate with the probability that a project uses CI.

CI usage by project popularity. Projects are grouped into equal-size groups; groups on the right have the most stars.

Language Total Projects # Using CI Percent CI
Scala 329 221 67.17%
Ruby 2721 1758 64.61%
Go 1159 702 60.57%
PHP 1806 982 54.37%
CoffeeScript 343 176 51.31%
Clojure 323 152 47.06%
Python 3113 1438 46.19%
Emacs Lisp 150 67 44.67%
JavaScript 8495 3692 43.46%
Other 1710 714 41.75%
C++ 1233 483 39.17%
Swift 723 273 37.76%
Java 3371 1188 35.24%
C 1321 440 33.31%
C# 652 188 28.83%
Perl 140 38 27.14%
Shell 709 185 26.09%
HTML 948 241 25.42%
CSS 937 194 20.70%
Objective-C 2745 561 20.44%
VimL 314 59 18.79%

RQ4: When did open-source projects adopt CI?

We next study when projects began to adopt CI. We answer this question with our depth corpus, because the breadth corpus does not have the date of the first build, which we use to determine when CI was introduced to the project. Notice that we are collecting data from Travis CI, which was founded in 2011. The figure shows that CI has experienced a steady growth over the last 5 years.

We also analyzed the age of each project when developers first introduced CI, and we found that the median time was around 1 year. Based on this data, we conjecture that while many developers introduce CI early in a project's development lifetime, it is not always seen as something that provides a large amount of value during the very initial development of a project.

Observation

The median project introduces CI a year into development.

CI adoption over time. Each bar shows the percentage of projects using CI, out of all the projects that would eventually adopt CI.

RQ5: Do developers plan on continuing to use CI?

Is CI a passing "fad" in which developers will lose interest, or will it be a lasting practice? While only time will tell what the true answer is, to get some sense of what the future could hold, we asked developers in our survey if they plan to use CI for their next project. We asked them how likely they were to use CI on their next project, using a 5-point Likert scale ranging from definitely will use to definitely will not use. This figure shows that developers feel very strongly that they will be using CI for their next project. The top two options, 'Definitely' and 'Most Likely' account for 94% of all our survey respondents, and the average of all the answers was 4.54. While this seems like a pretty resounding endorsement for the continued use of CI, we decided to dig a little deeper. Even among respondents who are not currently using CI, 53% said that they would 'Definitely' or 'Most Likely' use CI for their next project.

Observation

While CI is widely used in practice nowadays, we predict that in the future, CI adoption rates will increase even further.

RQ6: Why do open-source projects choose not to use CI?

One way that we evaluate the costs of CI is to ask developers why they do not use CI. In our survey, we asked respondents whether they chose to use or not use CI, and if they indicated that they did not, then we asked them to tell us why they do not use CI.

Table 4 shows the percentage of the respondents who selected particular reasons for not using CI. As mentioned before, we build this list of reasons by collecting information from various popular internet sources. Interestingly, the primary cost that respondents identified was not a technical cost; instead, the reason for not using CI was that "The developers on my project are not familiar enough with CI."

The second most selected reason was that the project does not have automated tests. This speaks to a real cost for CI, in that much of its value comes from automated tests, and some projects find that developing good automated test suites is a substantial cost. Even in the cases where developers had automated tests, some questioned the use of CI in particular and of regression testing in general; one respondent (P74) even said "In 4 years our tests have yet to catch a single bug."

Observation

The main reason why open-source projects choose to not use CI is that the developers are not familiar enough with CI.

Reasons developers gave for not using CI

Reason Percent
The developers on my project are not familiar enough with CI 47.00
Our project doesn’t have automated tests 44.12
Our project doesn’t commit often enough for CI to be worth it 35.29
Our project doesn’t currently use CI, but we would like to in the future 26.47
CI systems have too high maintenance costs (e.g., time, effort, etc.) 20.59
CI takes too long to set up 17.65
CI doesn’t bring value because our project already does enough testing 5.88

RQ7: How often do projects evolve their CI configuration?

We ask this question to identify how often developers evolve their CI configurations. Is it a "write-once-and-forget-it" situation, or is it something that evolves constantly? The Travis CI service is configured via a YAML file, named .travis.yml, in the project's root directory. YAML is a human-friendly data serialization standard. To determine how often a project has changed its configuration, we analyzed the history of every .travis.yml file and counted how many times it has changed. We calculate the number of changes from the commits in our depth corpus. This figure shows the number of changes (commits) to the .travis.yml file over the life of the project. We observe that the median project changes its CI configuration 12 times, but one of the projects changed its CI configuration 266 times. This leads us to conclude that many projects set up CI once and then have minimal involvement (25% of projects have 5 or fewer changes to their CI configuration), but some projects do find themselves changing their CI setup quite often.
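One way to count configuration changes for a single repository is to list the commits that touched .travis.yml. The sketch below assumes a local clone and the `git` command-line tool, and its function names are hypothetical; it is not the tooling used in the study. Counting every commit that touches the file, including the one that created it, is also an assumption.

```python
import subprocess
from typing import List


def parse_commit_hashes(log_output: str) -> List[str]:
    """Split `git log --format=%H` output into a list of commit hashes."""
    return [line.strip() for line in log_output.splitlines() if line.strip()]


def count_config_changes(repo_path: str, config_file: str = ".travis.yml") -> int:
    """Count the commits in a local clone that touched the CI configuration.

    `--follow` keeps tracking the file across renames; `--format=%H` prints
    one commit hash per line.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--follow", "--format=%H", "--", config_file],
        capture_output=True,
        text=True,
        check=True,
    )
    return len(parse_commit_hashes(result.stdout))
```

Calling `count_config_changes("path/to/clone")` on each project and taking the median of the results would give a figure comparable to the one reported above.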

Observation

Some projects change their configurations relatively often, so it is worthwhile to study what these changes are.

RQ8: What are some common reasons projects evolve their CI configuration?

To better understand the changes to the CI configuration files, we analyzed all the changes that were made to the .travis.yml files in our depth corpus. Because YAML is a structured language, we can parse the file and determine which part of the configuration was changed. The most common changes were to the build matrix, which in Travis specifies a combination of runtime, environment, and exclusions/inclusions. For example, a build matrix for a project in Ruby could specify the runtimes rvm 2.2, rvm 1.9, and jruby, the build environments rails2 and rails3, and the exclusions/inclusions, e.g., exclude: jruby with rails2. All combinations will be built except those excluded, so in this example there would be 5 different builds. Other common changes included the dependent libraries to install before building the project (what .travis.yml calls before_install) and changes to the build scripts themselves. Also, many other changes were due to version changes of dependencies.
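The build count in the Ruby example can be checked with a short sketch. It models the matrix as the cross product of runtimes and environments minus the excluded pairs; this mirrors the description above rather than Travis CI's actual implementation, and the helper name `expand_matrix` is hypothetical.

```python
from itertools import product

# Values taken from the example in the text.
runtimes = ["rvm 2.2", "rvm 1.9", "jruby"]
environments = ["rails2", "rails3"]
excluded = {("jruby", "rails2")}


def expand_matrix(runtimes, environments, excluded):
    """Return every (runtime, environment) pair that is not excluded."""
    return [
        (runtime, env)
        for runtime, env in product(runtimes, environments)
        if (runtime, env) not in excluded
    ]


builds = expand_matrix(runtimes, environments, excluded)
for runtime, env in builds:
    print(runtime, env)
print(len(builds))  # 3 runtimes x 2 environments - 1 exclusion = 5
```

Every runtime or environment added to the matrix multiplies the number of builds, which may help explain why this section of the configuration is edited so often.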

Observation

Many CI configuration changes are driven by dependency changes and could be potentially automated.

Config Area Total Edits Percentage
Build Matrix 9718 14.70
Before Install 8549 12.93
Build Script 8328 12.59
Build Language Config 7222 10.92
Build Env 6900 10.43
Before Build Script 6387 9.66
Install 4357 6.59
Whitespace 3226 4.88
Build platform Config 3058 4.62
Notifications 2069 3.13
Comments 2004 3.03
Git Configuration 1275 1.93
Deploy Targets 1079 1.63
After Build Success 1025 1.55
After Build Script 602 0.91
Before Deploy 133 0.20
After Deploy 79 0.12
Custom Scripting 40 0.06
After Build Failure 39 0.06
After Install 14 0.02
Before Install 10 0.02
Mysql 5 0.01
After Build Success 3 --
Allow Failures 2 --

RQ9: How long do CI builds take on average?

Another cost of using CI is the time to build the application and run all the tests. This cost has two parts: the energy consumed by the computing power that runs these builds, and the time developers may spend waiting to see whether their build passes before they merge their changes. Longer build times therefore mean more wasted developer time.

The average build time is just under 500 seconds. To compute the average build times, we first removed all the canceled (incomplete, manually stopped) build results and compared only the times for errored, failed, and passed (completed) builds. To further understand the data, we looked at each outcome independently. Interestingly, we find that passing builds run faster than either errored or failed builds. The difference between errored and failed builds is significant (Wilcoxon, p < 0.00001), as are the differences between passed and errored (Wilcoxon, p < 0.000001) and between passed and failed (Wilcoxon, p < 0.000001).

We find this result surprising, as our intuition is that passing builds should take longer: if an error state is encountered early on, the process can abort and return earlier. Perhaps many of the faster-running passing builds are not generating a meaningful result and should not have been run. However, more investigation is needed to determine what the exact reasons for this are.
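The filtering and averaging steps for build times can be sketched as below. The outcome labels, the `(outcome, seconds)` record shape, and the sample durations are assumptions made for illustration; they are not data from the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical labels for builds that ran to completion; canceled builds
# (and anything else) are dropped before averaging.
COMPLETED = {"passed", "failed", "errored"}


def mean_build_times(builds):
    """Return (overall mean, per-outcome means) in seconds for completed builds."""
    by_outcome = defaultdict(list)
    for outcome, seconds in builds:
        if outcome in COMPLETED:
            by_outcome[outcome].append(seconds)
    per_outcome = {outcome: mean(times) for outcome, times in by_outcome.items()}
    all_times = [s for times in by_outcome.values() for s in times]
    overall = mean(all_times) if all_times else None
    return overall, per_outcome


# Made-up durations, for illustration only.
sample = [("passed", 100.0), ("passed", 200.0), ("failed", 400.0),
          ("errored", 600.0), ("canceled", 5.0)]
overall, per_outcome = mean_build_times(sample)
print(overall, per_outcome)
```

Grouping by outcome before averaging is what makes it possible to compare passed, failed, and errored builds separately.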

RQ10: Why do open-source projects choose to use CI?

Having found that CI is widely used in open-source projects (RQ1), and that CI is most widely used among the most popular projects on GitHub (RQ3), we want to understand why developers choose to use CI. However, why a project uses CI cannot be determined from a code repository. Thus, we answer this question using our survey data.

We show the percentage of the respondents who selected particular reasons for using CI. As mentioned before, we build this list of reasons by collecting information from various popular internet sources. The two most popular reasons were "CI makes us less worried about breaking our builds" and "CI helps us catch bugs earlier". One respondent (P285) added: "Acts like a watchdog. You may not run tests, or be careful with merges, but the CI will. :)"

Martin Fowler is quoted as saying "Continuous Integration doesn't get rid of bugs, but it does make them dramatically easier to find and remove." However, in our survey, very few projects felt that CI actually helped them during the debugging process.

Observation

Projects use CI because it helps them catch bugs early and makes them less worried about breaking the build. However, CI is not widely perceived as helpful with debugging.

Reasons for Using CI, as reported by survey participants

Reason Percent
CI makes us less worried about breaking our builds 87.71
CI helps us catch bugs earlier 79.61
CI allows running our tests in the cloud, freeing up our personal machines 54.55
CI helps us deploy more often 53.22
CI makes integration easier 53.07
CI runs our tests in a real-world staging environment 46.00
CI lets us spend less time debugging 33.66

RQ11: Do projects with CI release more often?

One of the more common claims about CI is that it helps projects release more often, e.g., CloudBees' motto is "Deliver Software Faster". Over 50% of the respondents to our survey gave this as a reason why they use CI. We analyzed our data to see if we can indeed find evidence that would support this claim.

We found that projects that use CI do indeed release more often than either (1) the same projects before they used CI or (2) the projects that do not use CI. To be able to compare across projects and periods, we calculated the release rate as the number of releases per month. Projects that use CI average 0.54 releases per month, while projects that do not use CI average 0.24 releases per month. That is more than double the release rate, and the difference is statistically significant (Wilcoxon, p < 0.00001). To identify the effect of CI, we also compared, for projects that use CI, the release rate both before and after the first CI build. We found that projects that eventually added CI used to release at a rate of 0.34 releases per month, well below the 0.54 rate at which they release now with CI. Once again, this difference is statistically significant (Wilcoxon, p < 0.00001).
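A release rate of this kind, and the before/after comparison around the first CI build, can be sketched as follows. The function names, the average month length of 30.44 days, and the sample dates are assumptions for illustration; the study's own scripts may define a month differently.

```python
from datetime import date
from typing import List, Tuple

DAYS_PER_MONTH = 30.44  # assumption: average length of a calendar month


def release_rate(releases: List[date], start: date, end: date) -> float:
    """Releases per month within the half-open period [start, end)."""
    months = (end - start).days / DAYS_PER_MONTH
    if months <= 0:
        raise ValueError("end must be after start")
    count = sum(1 for released in releases if start <= released < end)
    return count / months


def rates_before_after_ci(
    releases: List[date], project_start: date, first_ci_build: date, project_end: date
) -> Tuple[float, float]:
    """Release rates before and after the first CI build of one project."""
    return (
        release_rate(releases, project_start, first_ci_build),
        release_rate(releases, first_ci_build, project_end),
    )


# Made-up project history: one release before CI, two after.
releases = [date(2015, 2, 1), date(2015, 8, 1), date(2015, 9, 1)]
before, after = rates_before_after_ci(
    releases, date(2015, 1, 1), date(2015, 7, 1), date(2016, 1, 1)
)
print(round(before, 2), round(after, 2))
```

Normalizing by the length of each period is what allows projects of different ages, and the periods before and after CI adoption, to be compared on the same scale.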

Observation

Projects that use CI release more than twice as often as those that do not use CI.

Release frequency statistics while using Travis

Using Travis Number of Versions Released Monthly
Yes 0.54 versions/month
No 0.24 versions/month

RQ12: Do projects which use CI accept more pull requests?

For a project that uses a CI service such as Travis CI, when the CI server builds a pull request, it annotates the pull request on GitHub with a visual cue such as a green check mark or a red "X" that shows whether the pull request was able to build successfully on the CI server. Our intuition is that this extra information can help developers better decide whether or not to merge a pull request into their code. To determine if this extra information indeed makes a difference, we compared the pull request acceptance rates between pull requests that have this CI information and pull requests that do not have it. Note that projects can exclude some branches from their repository to not run on the CI server, so just because a project uses CI on some branch, there is no guarantee that every pull request contains the CI build status information.

Table 7 shows the results for this question. We found that pull requests without CI information were 5% more likely to be merged than pull requests with CI information. Our interpretation of this result is that those 5% of pull requests have problems that are identified by the CI. By not merging these pull requests, developers can avoid breaking the build. This difference is statistically significant (Fisher's Exact Test: p < 0.00001). This also fits with our survey result that developers say that using CI makes them less worried about breaking the build. One respondent added that CI "Prevents contributors from releasing breaking builds".

Observation

CI build status can help developers avoid breaking the build by not merging problematic pull requests into their projects.

RQ13: Do pull requests with CI builds get accepted faster (in terms of calendar time)?

Once a pull request is submitted, the code is not merged until the pull request is accepted. The sooner a pull request is accepted, the sooner the code is merged into the project. In the previous question, we saw that projects accept fewer (i.e., reject or ignore more) pull requests with CI than pull requests without CI. In this question, we consider only accepted pull requests, and ask whether there is a difference in the time it takes for projects to accept pull requests with and without CI. One reason developers gave for using CI is that it helps make integration easier. One respondent added "To be more confident when merging PRs". If integration is indeed easier, does it then translate into pull requests being integrated faster?

This figure shows the distributions of the time to accept pull requests, with and without CI. To build this figure, we selected, from our depth corpus, all the pull requests that were accepted, both with and without build information from the CI server. The mean time with CI is 81 hours, but the median is only 5.18 hours. Similarly, the mean time without CI is 140 hours, but the median is 6.8 hours. Comparing the median times to accept pull requests, we find that the median pull request with CI information is merged 1.6 hours faster than the median pull request without CI information. This difference is statistically significant (Wilcoxon, p < 0.0000001).
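The large gap between the mean and the median suggests a skewed distribution, where a few pull requests that stay open for a long time pull the mean up. The sketch below illustrates this with made-up acceptance times; the numbers and the helper name `summarize_accept_times` are hypothetical and are not data from the depth corpus.

```python
from statistics import mean, median


def summarize_accept_times(hours):
    """Return the mean and median time to acceptance, in hours."""
    return {"mean": mean(hours), "median": median(hours)}


# Made-up acceptance times in hours; one slow pull request in each group.
with_ci = [1.0, 2.0, 5.0, 8.0, 400.0]
without_ci = [2.0, 3.0, 7.0, 10.0, 700.0]

summary_ci = summarize_accept_times(with_ci)
summary_no_ci = summarize_accept_times(without_ci)
median_gap = summary_no_ci["median"] - summary_ci["median"]
print(summary_ci, summary_no_ci, median_gap)
```

Because a single outlier dominates each mean, comparing medians gives a better picture of the typical pull request. Applied to the medians reported above, the same subtraction gives 6.8 - 5.18 = 1.62 hours, i.e., the 1.6-hour difference.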

Observation

CI build status can make integrating pull requests faster. When using CI, the median pull request is accepted 1.6 hours sooner.

RQ14: Do CI builds fail less on master than on other non-master branches?

The most popular reason that participants gave for using CI was that it helps avoid breaking the build. Thus, we analyze this claim in the depth corpus. Does the data show a difference in the way developers use CI with the master branch vs. with the other branches? Is there any difference between how many builds fail on master vs. on the other branches? Perhaps developers take more care when writing a pull request for master than for another branch.

This table shows the percentage of builds that pass in pull requests to the master branch, compared to all other branches. We found that pull requests are indeed more likely to pass when they are on master.

Observation

CI builds on the master branch pass more often than on the other branches.

Survey Results

Other Responses: Why projects use CI

RQ10 shows the results of asking developers why they choose to use CI. However, participants were also able to enter free-form answers into an "other" field. Those answers are listed below.

CI allows us to test across multiple environments
PR Verification
It forces contributors to run the tests (which they might not otherwise do).
It's the only professional thing to do
Strong Collaborative Graph Based Federated Authentication with the cloud
CI automatically tests pull requests from external contributors
CI removes the worries about security
It is IMPOSSIBLE to collaborate without CI, builds will break constantly
also CI services give access to windows and other legacy machines, making it easier to build binaries for multiple platforms
CI helps us test against multiple platforms
Test on many configurations
CI results are the _only_ answer to question 'Are we OK?'
CI makes reviewing PRs from external contributors less scary
gives external contributors confidence that they aren't breaking the project
To be more confident when merging PRs
CI ensures everyone who contributes has their contribution vetted in the same environment
CI lets us easily show others that the tests pass after our changes
It's our product
It help in reviwing and validating contributions from other people
integrates other services like Transifex for automatic l10n updates
Can run tests that I otherwise would not be able to on localhsot
helps testing on multiple platforms
merge pull requests
We test on multiple platforms
CI helps review code from outsiders without worry about breaking something
Prevents contributors from releasing breaking builds
runs on multiple OS/Node.js combinations
CI gives submitters of pull requests immediate feedback about whether their contribution is acceptable
CI allows us to test on multiple platform versions (PHP 5, PHP 7, etc)
Makes it possible to test across a wider range of target environments
Standardized environment
Build and deploy binaries
Automates part of process of reviewing user contributions
CI in combination with gitub allow checking PRs without the need to pull the code to local machine
Acts like a watchdog. You may not run tests, or be careful with merges, but the CI will. :)
tests across a variety of platforms
CI makes us notice cross platform breakage early
CI allows time-consuming, repetitive tests to be run automatically
testing more envinronments, different node versions for example
CI helps us test different platforms
CI helps testing with MANY different environments (language version, dependency version, etc.)
Run tests on pull requests
Runs tests in platforms other than our own, continously provides runnable builds
Allow users to see if the project builds ok.
CI Allows us to test on a huge variety of Operating SYtems and processor Architectures
we verify pull requests to our open project
CI allows as to test on different platforms/browsers
Helps in open source for commiters to fix their bugs prior to needing review
CI tests across all supported platforms, devs usually only test on a single local platform
CI helps us automate any repetitive tasks
CI checks for style
CI builds the program for deploys too
CircleCI makes it easy to integrate with Docker by building and pushing Docker images on every commit
CI runs our tests in multiple platforms
CI allows us to do performance comparisons between hardware types to know exact resource utilization profiles of applications.
CI provides accountability
CI output provides automated feedback for authors of pull requests on open source projects.
We can easily test an all browser/platform combinations.
The big benefit to us is that it enforces a successful build on every change — it’s easy to see when someone breaks tests.
CI helps us get automated feedback on community contributions

Other Responses: Why projects do not use CI

RQ6 shows the results of asking developers why they choose not to use CI. However, participants were also able to enter free-form answers into an "other" field. Those answers are listed below.

There is not code to compile - they're scripts
In 4 years our tests have yet to catch a single bug.

Other Responses: What CI platform do you use?

We also asked survey participants "Does your project use Continuous Integration?"
The available responses were:

  • No
  • Yes, we use TravisCI
  • Yes, we use Jenkins
  • Yes, we use AppVeyor
  • Other:______________

All "Other" responses are listed below.

TravisCI and internal solutions within company
GoCD
We use both TravisCI and AppVeyor
TravisCI, AppVeyor, and a self-hosted Buildbot
We use both TravisCI and AppVeyor
Internal tools
Visual Studio Team Services
Circle CI
GitlabCI
GitlabCI
All of the above, plus CircleCI
teamcity
Visual Studio Team Services Online
Yes, we use Codeship
We use both TravisCI and AppVeyor
travis, circle, drone.io, taskcluster, buildbot, jenkins
Use travisCI, appveyor and circleCI
Janky
Evergreen
all of them
We use TravisCI and AppVeyor
Visual Studio Team Services
SemaphoreCI
Visual Studio Team Service
Custom build solution
TravisCI and SemaphoreCI
Codeship
TravisCI and CircleCI and TeamCity
all of the above
TravisCI AND AppVeyor
We use Travis and Appveyor
TeamCity
Travis and Jenkins
Azure and Kudu
Evergreen
Evergreen
travis, appveyor and circleci
VSO
Circle CI
Evergreen
Evergreen
CircleCI
Home made in Tcl (recidiv)
Buildbot
Citclei
alll of them
Teamcity, VSTS
Travis + AppVeyor (Windows support is important too)
http://wercker.com/
Circle, AppVeyor & Jenkins
SemaphoreCI
CircleCI
TeamCity
TeamCity
Bamboo
CirclCI and AppVeyor
http://semaphoreci.com
wercker
CircleCI
CircleCI
TravisCI and AppVeyor
VSTS
CircleCI
TFS
We use Travis CI and Appveyor
Drone.io
CircleCI
All of the above
Yes, CircleCI
All three
CircleCI
Teamcity
TFS
BuildDefinition VSOnline
CodeShip
We use Travis + AppVeyor
semaphoreci.com
we use a mix of TravisCI and Jenkins
HARROW.UO
Xcode server
GitLab CI
We use a proprietary Microsoft CI system
SemaphoreCI
Jenkins, and AppVeyor
CodeShip
Visual Studio Team Services
Both TravisCI and Jenkins
Drone, Travis CI, Jenkins, Gitlab CI
TeamCity
TaskCluster
GitLab CI
TeamCity
Buildkite
Some TravisCI, some CircleCI
Both Travis and AppVeyor
Yes, we use both Travis and Jenkins
Visual Studio Team Services
Team City
Yes — CircleCi
Yes, I use TravisCI and CircleCI
CircleCI
CircleCI
GitLab CI
adHoc
CircleCi
Team Foundation Server
We use both Travis and Jenkins
CircleCI


Acknowledgements

We thank CloudBees for sharing with us the list of open-source projects using CloudBees and thank Travis for fixing a bug in their API to enable us to collect all relevant build history. We thank Semih Okur for feedback on the survey. We thank Mihai Codoban, Kory Kraft, Denis Bogdanas, Shane McKee, Nicholas Nelson, and Sruti Srinivasa Ragavan for their feedback on earlier versions of this paper. This work was partially funded through the NSF CCF-1439957 grant.


This webpage built with: D3, D3 Bar Graphs, D3 Stacked Bar Graphs, Bootstrap, Odometer, Inview, D3Pie, Plot.ly, Font Awesome, and JQuery


This work is a part of the COPE project

Version 1.5