Continuous integration (CI) systems automate the compilation, building, and testing of software. Despite CI rising as a big success story in automated software engineering, it has received almost no attention from the research community.
For example, how widely is CI used in practice, and what are some costs and benefits associated with CI? Without answering such questions, developers, tool builders, and researchers make decisions based on folklore instead of data.
In this work, we use three complementary methods to study in-depth the usage of CI in open-source projects. To understand what CI systems developers use, we analyzed 34,544 open-source projects from GitHub. To understand how developers use CI, we analyzed 1,529,291 builds from the most popular CI system. To understand why projects use or do not use CI, we surveyed 442 developers. With this data, we answered 14 questions related to the usage, cost, and benefits of CI. Among our results, we show evidence that supports the popular claim that CI helps projects release more often. We also discovered that 70% of the most popular projects from GitHub use CI, as well as finding that the overall percentage of projects using CI continues to grow, making it important and timely to focus more research on CI.
We found that 40% of all the projects from our breadth corpus use CI. Thus, CI is indeed used widely and warrants further investigation. Additionally, we know that our scripts do not find all usage of CI. We can reliably detect the use of (public) CI services only if their API makes it possible to query the CI service based on knowing the GitHub organization and project name. Therefore, the results we present are a lower bound on the number of projects that use CI.
Next, we investigate which CI services are the most widely used in our breadth corpus. We show that Travis CI is by far the most widely used CI service. Because of this result, we feel confident that our further analysis can focus on the projects that use Travis CI as a CI service, and that analyzing such projects gives representative results for usage of CI services in open-source projects. We also found that some projects use more than one CI service. In our breadth corpus, of all the projects that use CI, 14% use more than one CI service. We think this is an interesting result that deserves future attention.
Percentages add to greater than 100% since developers can use more than one CI service.
CI usage by project popularity: We want to determine whether more popular projects are more likely to use CI. Our intuition is that if CI leads to better outcomes, then we would expect to see higher usage of CI among the most popular projects (or, alternatively, that projects using CI get better and thus more popular). The accompanying figure shows that the most popular projects (as measured by the number of stars) are also the most likely to use CI (Kendall's τ, p < 0.00001). We grouped the projects from our breadth corpus into 64 equally sized groups, ordered by number of stars. We then calculated the percentage of projects in each group that use CI. Each group has around 540 projects. In the most popular group, 70% of projects use CI. As the projects become less popular, the percentage of projects using CI declines to 23%.
Popular projects are more likely to use CI.
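The grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the scripts used in the study: the function name `ci_usage_by_popularity`, the `(star_count, uses_ci)` input format, and the toy data are all assumptions made for the example.

```python
from typing import List, Tuple


def ci_usage_by_popularity(projects: List[Tuple[int, bool]], n_groups: int) -> List[float]:
    """Return the percentage of projects using CI in each popularity group.

    `projects` holds hypothetical (star_count, uses_ci) pairs. Groups are
    ordered from least to most starred, and their sizes differ by at most one.
    """
    if not 0 < n_groups <= len(projects):
        raise ValueError("n_groups must be between 1 and the number of projects")
    ordered = sorted(projects, key=lambda project: project[0])
    total = len(ordered)
    percentages = []
    for g in range(n_groups):
        # Integer arithmetic spreads any remainder evenly across the groups.
        start = g * total // n_groups
        end = (g + 1) * total // n_groups
        group = ordered[start:end]
        using_ci = sum(1 for _, uses_ci in group if uses_ci)
        percentages.append(100.0 * using_ci / len(group))
    return percentages


# Hypothetical toy data: 8 projects split into 4 groups of 2.
toy_projects = [(5, False), (1, False), (40, True), (12, False),
                (300, True), (90, False), (2500, True), (1200, True)]
print(ci_usage_by_popularity(toy_projects, 4))  # [0.0, 50.0, 50.0, 100.0]
```

With the breadth corpus, the same call with `n_groups=64` would yield one percentage per group of roughly 540 projects.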
We observe a wide range of projects that use CI. The popularity of the language does not correlate with the probability that a project uses CI.
Projects. Grouped into equal size groups, groups on the right have the most stars.
|Language||Total Projects||# Using CI||Percent CI|
We next study when projects began to adopt CI. We answer this question with our depth corpus, because the breadth corpus does not have the date of the first build, which we use to determine when CI was introduced to the project. Notice that we are collecting data from Travis CI, which was founded in 2011. The figure shows that CI has experienced a steady growth over the last 5 years.
We also analyzed the age of each project when developers first introduced CI, and we found that the median time was around 1 year. Based on this data, we conjecture that while many developers introduce CI early in a project's development lifetime, it is not always seen as something that provides a large amount of value during the very initial development of a project.
The median project introduces CI a year into development.
CI adoption over time. Each bar shows the percentage of projects using CI out of all the projects that would eventually adopt CI.
Is CI a passing "fad" in which developers will lose interest, or will it be a lasting practice? While only time will tell what the true answer is, to get some sense of what the future could hold, we asked developers in our survey if they plan to use CI for their next project. We asked them how likely they were to use CI on their next project, using a 5-point Likert scale ranging from definitely will use to definitely will not use. This figure shows that developers feel very strongly that they will be using CI for their next project. The top two options, 'Definitely' and 'Most Likely' account for 94% of all our survey respondents, and the average of all the answers was 4.54. While this seems like a pretty resounding endorsement for the continued use of CI, we decided to dig a little deeper. Even among respondents who are not currently using CI, 53% said that they would 'Definitely' or 'Most Likely' use CI for their next project.
While CI is widely used in practice nowadays, we predict that in the future, CI adoption rates will increase even further.
One way that we evaluate the costs of CI is to ask developers why they do not use CI. In our survey, we asked respondents whether they chose to use or not use CI, and if they indicated that they did not, then we asked them to tell us why they do not use CI.
Table 4 shows the percentage of the respondents who selected particular reasons for not using CI. As mentioned before, we build this list of reasons by collecting information from various popular internet sources. Interestingly, the primary cost that respondents identified was not a technical cost; instead, the reason for not using CI was that "The developers on my project are not familiar enough with CI."
The second most selected reason was that the project does not have automated tests. This speaks to a real cost for CI, in that much of its value comes from automated tests, and some projects find that developing good automated test suites is a substantial cost. Even in the cases where developers had automated tests, some questioned the use of CI in particular (and regression testing in general); one respondent (P74) even said "In 4 years our tests have yet to catch a single bug."
The main reason why open-source projects choose to not use CI is that the developers are not familiar enough with CI.
|The developers on my project are not familiar enough with CI||47.00|
|Our project doesn’t have automated tests||44.12|
|Our project doesn’t commit often enough for CI to be worth it||35.29|
|Our project doesn’t currently use CI, but we would like to in the future||26.47|
|CI systems have too high maintenance costs (e.g., time, effort, etc.)||20.59|
|CI takes too long to set up||17.65|
|CI doesn’t bring value because our project already does enough testing||5.88|
We ask this question to identify how often developers evolve their CI configurations. Is it a "write-once-and-forget-it" situation, or is it something that evolves constantly? The Travis CI service is configured via a YAML file, named .travis.yml, in the project's root directory. YAML is a human-friendly data serialization standard. To determine how often a project has changed its configuration, we analyzed the history of every .travis.yml file and counted how many times it has changed. We calculated the number of changes from the commits in our depth corpus. This figure shows the number of changes/commits to the .travis.yml file over the life of the project. We observe that the median project changes its CI configuration 12 times, but one of the projects changed its CI configuration 266 times. This leads us to conclude that many projects set up CI once and then have minimal involvement (25% of projects have 5 or fewer changes to their CI configuration), but some projects do find themselves changing their CI setup quite often.
Some projects change their configurations relatively often, so it is worthwhile to study what these changes are.
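One way to count configuration changes for a single repository is to list the commits that touched .travis.yml. The sketch below assumes a local clone and the `git` command-line tool; the helper names `parse_commit_hashes` and `count_config_changes` are hypothetical, and this is not necessarily how the study extracted its counts.

```python
import subprocess
from typing import List


def parse_commit_hashes(log_output: str) -> List[str]:
    """Split `git log --format=%H` output into one hash per non-empty line."""
    return [line.strip() for line in log_output.splitlines() if line.strip()]


def count_config_changes(repo_path: str, config_file: str = ".travis.yml") -> int:
    """Count the commits in the clone at `repo_path` that touched `config_file`."""
    result = subprocess.run(
        # --follow keeps tracking the file across renames.
        ["git", "-C", repo_path, "log", "--follow", "--format=%H", "--", config_file],
        capture_output=True,
        text=True,
        check=True,
    )
    return len(parse_commit_hashes(result.stdout))
```

Calling `count_config_changes("/path/to/clone")` on each project and taking the median of the results would give a figure comparable to the one reported above, under the assumption that every commit touching the file counts as one change.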
To better understand the changes to the CI configuration files, we analyzed all the changes that were made to the .travis.yml files in our depth corpus. Because YAML is a structured language, we can parse the file and determine which part of the configuration was changed. The most common changes were to the build matrix, which in Travis specifies a combination of runtime, environment, and exclusions/inclusions. For example, a build matrix for a project in Ruby could specify the runtimes rvm 2.2, rvm 1.9, and jruby, the build environments rails2 and rails3, and the exclusions/inclusions, e.g., exclude: jruby with rails2. All combinations will be built except those excluded, so in this example there would be 5 different builds (3 runtimes × 2 environments, minus 1 exclusion). Other common changes included the dependent libraries to install before building the project (what .travis.yml calls before_install) and changes to the build scripts themselves. Also, many other changes were due to the version changes of dependencies.
Many CI configuration changes are driven by dependency changes and could be potentially automated.
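The build count in the Ruby build matrix example can be checked with a short sketch. It only illustrates the arithmetic (the Cartesian product of runtimes and environments, minus exclusions); it is not Travis CI's implementation, and the function name `expand_build_matrix` is an assumption made for the example.

```python
from itertools import product

# Values taken from the Ruby example in the text.
runtimes = ["rvm 2.2", "rvm 1.9", "jruby"]
environments = ["rails2", "rails3"]
excluded = {("jruby", "rails2")}


def expand_build_matrix(runtimes, environments, excluded):
    """Return every (runtime, environment) pair that is not excluded."""
    return [combo for combo in product(runtimes, environments) if combo not in excluded]


builds = expand_build_matrix(runtimes, environments, excluded)
for runtime, environment in builds:
    print(f"{runtime} + {environment}")
print(len(builds))  # 3 * 2 - 1 = 5
```

Because the matrix grows multiplicatively, adding a single runtime or environment adds several builds at once, which may help explain why this part of the configuration is edited so often.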
|Config Area||Total Edits||Percentage|
|Build Language Config||7222||10.92|
|Before Build Script||6387||9.66|
|Build platform Config||3058||4.62|
|After Build Success||1025||1.55|
|After Build Script||602||0.91|
|After Build Failure||39||0.06|
|After Build Success||3||--|
Another cost of using CI is the time to build the application and run all the tests. This cost includes both the energy for the computing power to run these builds and the time developers spend waiting to see whether their build passes before they merge in the changes, so longer build times mean more wasted developer time.
The average build time is just under 500 seconds. To compute the average build time, we first removed all the canceled (incomplete, manually stopped) build results and compared only the times for completed builds (error, failed, and passed). To further understand the data, we looked at each outcome independently. Interestingly, we find that passing builds run faster than either error or failed builds. The difference between error and failed is significant (Wilcoxon, p < 0.00001), as is the difference between passed and error (Wilcoxon, p < 0.000001) and between passed and failed (Wilcoxon, p < 0.000001).
We find this result surprising, as our intuition is that passing builds should take longer: if an error state is encountered early on, the process can abort and return earlier. Perhaps many of the faster-running passing builds are not generating a meaningful result and should not have been run. However, more investigation is needed to determine what the exact reasons are.
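The filtering and per-outcome averaging described above might look like the following sketch. The `(status, seconds)` input format, the status strings, and the toy durations are hypothetical and were chosen only to make the example self-contained; they are not data from the study.

```python
from collections import defaultdict
from statistics import mean


def build_time_summary(builds):
    """Return (overall mean, mean per outcome) over completed builds only."""
    # Drop canceled builds: they were stopped manually and never completed.
    completed = [(status, seconds) for status, seconds in builds if status != "canceled"]
    per_outcome = defaultdict(list)
    for status, seconds in completed:
        per_outcome[status].append(seconds)
    overall = mean(seconds for _, seconds in completed)
    return overall, {status: mean(times) for status, times in per_outcome.items()}


# Hypothetical toy data: (status, build duration in seconds).
toy_builds = [("passed", 120.0), ("passed", 180.0), ("failed", 400.0),
              ("error", 500.0), ("canceled", 30.0)]
overall, by_outcome = build_time_summary(toy_builds)
print(overall)     # 300.0
print(by_outcome)  # {'passed': 150.0, 'failed': 400.0, 'error': 500.0}
```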
Having found that CI is widely used in open-source projects (RQ1), and that CI is most widely used among the most popular projects on GitHub (RQ3), we want to understand why developers choose to use CI. However, why a project uses CI cannot be determined from a code repository. Thus, we answer this question using our survey data.
We show the percentage of the respondents who selected particular reasons for using CI. As mentioned before, we build this list of reasons by collecting information from various popular internet sources. The two most popular reasons were "CI makes us less worried about breaking our builds" and "CI helps us catch bugs earlier". One respondent (P285) added: "Acts like a watchdog. You may not run tests, or be careful with merges, but the CI will. :)"
Martin Fowler is quoted as saying "Continuous Integration doesn't get rid of bugs, but it does make them dramatically easier to find and remove." However, in our survey, very few projects felt that CI actually helped them during the debugging process.
Projects use CI because it helps them catch bugs early and makes them less worried about breaking the build. However, CI is not widely perceived as helpful with debugging.
|CI makes us less worried about breaking our builds||87.71|
|CI helps us catch bugs earlier||79.61|
|CI allows running our tests in the cloud, freeing up our personal machines||54.55|
|CI helps us deploy more often||53.22|
|CI makes integration easier||53.07|
|CI runs our tests in a real-world staging environment||46.00|
|CI lets us spend less time debugging||33.66|
One of the more common claims about CI is that it helps projects release more often, e.g., CloudBees' motto is "Deliver Software Faster". Over 50% of the respondents from our survey claimed it was a reason why they use CI. We analyze our data to see if we can indeed find evidence that would support this claim.
We found that projects that use CI do indeed release more often than either (1) the same projects before they used CI or (2) the projects that do not use CI. In order to be able to compare across projects and periods, we calculated the release rate as the number of releases per month. Projects that use CI average 0.54 releases per month, while projects that do not use CI average 0.24 releases per month. That is more than double the release rate, and the difference is statistically significant (Wilcoxon, p < 0.00001). To identify the effect of CI, we also compared, for projects that use CI, the release rate both before and after the first CI build. We found that projects that eventually added CI used to release at a rate of 0.34 releases per month, well below the 0.54 rate at which they release now with CI. Once again, this difference is statistically significant (Wilcoxon, p < 0.00001).
Projects that use CI release more than twice as often as those that do not use CI.
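As an illustration of the release-rate metric, the sketch below computes releases per month for one project before and after its first CI build. The text does not say how a month is measured, so the 30.44-day average month is an assumption, as are the function name `releases_per_month` and all dates in the toy data.

```python
from datetime import date

DAYS_PER_MONTH = 30.44  # Assumption: average length of a Gregorian month.


def releases_per_month(release_dates, period_start, period_end):
    """Releases in [period_start, period_end) divided by the period length in months."""
    days = (period_end - period_start).days
    if days <= 0:
        raise ValueError("period_end must be after period_start")
    count = sum(1 for d in release_dates if period_start <= d < period_end)
    return count / (days / DAYS_PER_MONTH)


# Hypothetical project: created in 2013, first CI build on 2014-01-01.
project_start = date(2013, 1, 1)
first_ci_build = date(2014, 1, 1)
observed_until = date(2015, 1, 1)
releases = [date(2013, 3, 1), date(2013, 7, 1), date(2013, 11, 1),
            date(2014, 2, 1), date(2014, 4, 1), date(2014, 6, 1),
            date(2014, 8, 1), date(2014, 10, 1), date(2014, 12, 1)]

rate_before = releases_per_month(releases, project_start, first_ci_build)
rate_after = releases_per_month(releases, first_ci_build, observed_until)
print(f"before CI: {rate_before:.2f}, with CI: {rate_after:.2f}")
```

Normalizing by the length of each period is what makes the before and after rates comparable even when a project spent much less time without CI than with it.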
|Using Travis||Number of Versions Released Monthly|
For a project that uses a CI service such as Travis CI, when the CI server builds a pull request, it annotates the pull request on GitHub with a visual cue such as a green check mark or a red 'X' that shows whether the pull request was able to build successfully on the CI server. Our intuition is that this extra information can help developers better decide whether or not to merge a pull request into their code. To determine if this extra information indeed makes a difference, we compared the pull request acceptance rates between pull requests that have this CI information and pull requests that do not have it. Note that projects can exclude some branches in their repository from running on the CI server, so just because a project uses CI on some branch, there is no guarantee that every pull request contains the CI build status information.
Table 7 shows the results for this question. We found that pull requests without CI information were 5% more likely to be merged than pull requests with CI information. Our interpretation of this result is that those 5% of pull requests have problems that are identified by the CI. By not merging these pull requests, developers can avoid breaking the build. This difference is statistically significant (Fisher's Exact Test: p < 0.00001). This also fits with our survey result that developers say that using CI makes them less worried about breaking the build. One respondent added that CI "Prevents contributors from releasing breaking builds".
CI build status can help developers avoid breaking the build by not merging problematic pull requests into their projects.
Once a pull request is submitted, the code is not merged until the pull request is accepted. The sooner a pull request is accepted, the sooner the code is merged into the project. In the previous question, we saw that projects accept fewer (i.e., reject or ignore more) pull requests with CI than pull requests without CI. In this question, we consider only accepted pull requests, and ask whether there is a difference in the time it takes for projects to accept pull requests with and without CI. One reason developers gave for using CI is that it helps make integration easier. One respondent added "To be more confident when merging PRs". If integration is indeed easier, does it then translate into pull requests being integrated faster?
This figure shows the distributions of the time to accept pull requests, with and without CI. To build this figure, we selected, from our depth corpus, all the pull requests that were accepted, both with and without build information from the CI server. The mean time with CI is 81 hours, but the median is only 5.18 hours. Similarly, the mean time without CI is 140 hours, but the median is 6.8 hours. Comparing the median times to accept the pull requests, we find that the median pull request with CI information is merged 1.6 hours faster than the median pull request without CI information. This difference is statistically significant (Wilcoxon, p < 0.0000001).
CI build status can make integrating pull requests faster. When using CI, the median pull request is accepted 1.6 hours sooner.
The most popular reason that participants gave for using CI was that it helps avoid breaking the build. Thus, we analyze this claim in the depth corpus. Does the data show a difference in the way developers use CI with the master branch vs. with the other branches? Is there any difference between how many builds fail on master vs. on the other branches? Perhaps developers take more care when writing a pull request for master than for another branch.
This table shows the percentage of builds that pass in pull requests to the master branch, compared to all other branches. We found that pull requests are indeed more likely to pass when they are on master.
CI builds on the master branch pass more often than on the other branches.
RQ10 shows the results of asking developers why they choose to use CI. However, participants were also able to enter free form answers into an "other" field.
|CI allows us to test across multiple environments|
|It forces contributors to run the tests (which they might not otherwise do).|
|It's the only professional thing to do|
|Strong Collaborative Graph Based Federated Authentication with the cloud|
|CI automatically tests pull requests from external contributors CI removes the worries about security|
|It is IMPOSSIBLE to collaborate without CI, builds will break constantly|
|also CI services give access to windows and other legacy machines, making it easier to build binaries for multiple platforms|
|CI helps us test against multiple platforms|
|Test on many configurations|
|CI results are the _only_ answer to question 'Are we OK?'|
|CI makes reviewing PRs from external contributors less scary|
|gives external contributors confidence that they aren't breaking the project|
|To be more confident when merging PRs|
|CI ensures everyone who contributes has their contribution vetted in the same environment|
|CI lets us easily show others that the tests pass after our changes|
|It's our product|
|It help in reviwing and validating contributions from other people|
|integrates other services like Transifex for automatic l10n updates|
|Can run tests that I otherwise would not be able to on localhsot|
|helps testing on multiple platforms|
|merge pull requests|
|We test on multiple platforms|
|CI helps review code from outsiders without worry about breaking something|
|Prevents contributors from releasing breaking builds|
|runs on multiple OS/Node.js combinations|
|CI gives submitters of pull requests immediate feedback about whether their contribution is acceptable|
|CI allows us to test on multiple platform versions (PHP 5, PHP 7, etc)|
|Makes it possible to test across a wider range of target environments|
|Build and deploy binaries|
|Automates part of process of reviewing user contributions|
|CI in combination with gitub allow checking PRs without the need to pull the code to local machine|
|Acts like a watchdog. You may not run tests, or be careful with merges, but the CI will. :)|
|tests across a variety of platforms|
|CI makes us notice cross platform breakage early|
|CI allows time-consuming, repetitive tests to be run automatically|
|testing more envinronments, different node versions for example|
|CI helps us test different platforms|
|CI helps testing with MANY different environments (language version, dependency version, etc.)|
|Run tests on pull requests|
|Runs tests in platforms other than our own, continously provides runnable builds|
|Allow users to see if the project builds ok.|
|CI Allows us to test on a huge variety of Operating SYtems and processor Architectures|
|we verify pull requests to our open project|
|CI allows as to test on different platforms/browsers|
|Helps in open source for commiters to fix their bugs prior to needing review|
|CI tests across all supported platforms, devs usually only test on a single local platform|
|CI helps us automate any repetitive tasks|
|CI checks for style|
|CI builds the program for deploys too|
|CircleCI makes it easy to integrate with Docker by building and pushing Docker images on every commit|
|CI runs our tests in multiple platforms|
|CI allows us to do performance comparisons between hardware types to know exact resource utilization profiles of applications.|
|CI provides accountability|
|CI output provides automated feedback for authors of pull requests on open source projects.|
|We can easily test an all browser/platform combinations.|
|The big benefit to us is that it enforces a successful build on every change — it’s easy to see when someone breaks tests.|
|CI helps us get automated feedback on community contributions|
RQ6 shows the results of asking developers why they choose not to use CI. However, participants were also able to enter free form answers into an "other" field.
|There is not code to compile - they're scripts|
|In 4 years our tests have yet to catch a single bug.|
We also asked survey participants "Does your project use Continuous Integration?"
The available responses were:
|TravisCI and internal solutions within company|
|We use both TravisCI and AppVeyor|
|TravisCI, AppVeyor, and a self-hosted Buildbot|
|We use both TravisCI and AppVeyor|
|Visual Studio Team Services|
|All of the above, plus CircleCI|
|Visual Studio Team Services Online|
|Yes, we use Codeship|
|We use both TravisCI and AppVeyor|
|travis, circle, drone.io, taskcluster, buildbot, jenkins|
|Use travisCI, appveyor and circleCI|
|all of them|
|We use TravisCI and AppVeyor|
|Visual Studio Team Services|
|Visual Studio Team Service|
|Custom build solution|
|TravisCI and SemaphoreCI|
|TravisCI and CircleCI and TeamCity|
|all of the above|
|TravisCI AND AppVeyor|
|We use Travis and Appveyor|
|Travis and Jenkins|
|Azure and Kudu|
|travis, appveyor and circleci|
|Home made in Tcl (recidiv)|
|alll of them|
|Travis + AppVeyor (Windows support is important too)|
|Circle, AppVeyor & Jenkins|
|CirclCI and AppVeyor|
|TravisCI and AppVeyor|
|We use Travis CI and Appveyor|
|All of the above|
|We use Travis + AppVeyor|
|we use a mix of TravisCI and Jenkins|
|We use a proprietary Microsoft CI system|
|Jenkins, and AppVeyor|
|Visual Studio Team Services|
|Both TravisCI and Jenkins|
|Drone, Travis CI, Jenkins, Gitlab CI|
|Some TravisCI, some CircleCI|
|Both Travis and AppVeyor|
|Yes, we use both Travis and Jenkins|
|Visual Studio Team Services|
|Yes — CircleCi|
|Yes, I use TravisCI and CircleCI|
|Team Foundation Server|
|We use both Travis and Jenkins|
We thank CloudBees for sharing with us the list of open-source projects using CloudBees and thank Travis for fixing a bug in their API to enable us to collect all relevant build history. We thank Semih Okur for feedback on the survey. We thank Mihai Codoban, Kory Kraft, Denis Bogdanas, Shane McKee, Nicholas Nelson, and Sruti Srinivasa Ragavan for their feedback on earlier versions of this paper. This work was partially funded through the NSF CCF-1439957 grant.