Accelerating CI – Part 3: Incremental CI

In my previous blog posts, I explained how parallelization and resource allocation optimization help developers accelerate their CI workflows. Today, I would like to introduce you to another method of optimizing software development processes, which I call “incremental CI”. As its name suggests, it relies on analyzing the source code differences between two versions.

Incremental processing

To further increase productivity, developers now include tasks other than builds and tests in their CI workflows. The most common examples are tools that check code style against coding standards and tools that measure code metrics. Some of these tools, such as gofmt and golint, process only the files passed to them as inputs and are not affected by changes in other files. For such tools, the results for unchanged input files are exactly the same no matter how many times the tool is executed. By reusing the results of the previous execution for those files, Inspecode achieves a significant performance gain.
To enable this behavior in Inspecode, set the incremental parameter to true, as shown in the example below:

inspecode:
    experimental:
        incremental: true

It is important to note that, with Inspecode, incremental processing can also be enabled or disabled on a tool-by-tool basis. In the example below, incremental processing is enabled on gofmt and disabled on golint.

inspecode:
    experimental:
        incremental: true
    gofmt: default
    golint:
        experimental:
            incremental: false

It is also important to note that there are several cases that might cause different results, even if the input source files are not modified:

Changes are made in the tool’s option settings
The tool’s configuration file is updated
The tool is upgraded to a newer version

To avoid issues in these scenarios, we take a conservative approach: in each of these cases, the tool processes all of its input files, whether or not they were updated.
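The conservative rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not Inspecode's implementation: the function names, the use of SHA-256, and the way the tool version, options, and configuration are combined into a single fingerprint are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Optional


def fingerprint(tool_version: str, options: str, config_text: str) -> str:
    """Hash everything other than the source files that can change a tool's output."""
    h = hashlib.sha256()
    for part in (tool_version, options, config_text):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator, so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def files_to_process(
    current_files: Dict[str, bytes],
    previous_hashes: Dict[str, str],
    current_fp: str,
    previous_fp: Optional[str],
) -> List[str]:
    """Return the files the tool must process again (hypothetical helper)."""
    # Conservative path: options, config, or tool version changed,
    # so every input file is processed again.
    if current_fp != previous_fp:
        return sorted(current_files)
    # Incremental path: only new or modified files are processed;
    # results for the remaining files are reused from the previous run.
    return sorted(
        path
        for path, content in current_files.items()
        if hashlib.sha256(content).hexdigest() != previous_hashes.get(path)
    )
```

When the fingerprint is unchanged and no file content differs, the returned list is empty, which corresponds to the case where the whole task can be skipped.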

Automatic task skipping

Automatic task skipping greatly increases the speed of CI workflows. If incremental processing is enabled and Inspecode detects no changes in any of a tool’s target files, the tool’s execution task is skipped automatically. You don’t need to write special words like [ci skip] in commit messages. For example, if all the updates concern only the README file while the Go source code files are untouched, the execution of golint is skipped. Inspecode reports the same golint issues that were detected in the previous task, so the status of the skipped task also remains unchanged from the previous task.
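As a rough sketch of that decision, the following Python function checks whether any changed file matches a tool's target patterns. The function name and the pattern dialect (Python's fnmatch, where * also matches path separators) are assumptions for illustration; the glob rules Inspecode actually applies may differ.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def should_skip(changed_files: Iterable[str], target_patterns: List[str]) -> bool:
    """Return True when no changed file matches any of the tool's target patterns."""
    return not any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in target_patterns
    )


# A commit that only touches the README leaves golint's targets unchanged.
print(should_skip(["README.md"], ["*.go"]))
# A commit that also edits a Go file does not.
print(should_skip(["README.md", "cmd/main.go"], ["*.go"]))
```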

Similarly, if the input and ignore clauses in rocro.yml leave a tool with no target files, Inspecode skips the execution of the tool automatically. The snippet below illustrates a case where all the target files for golint are ignored:

inspecode:
    experimental:
        ignore: "**/*.go"
    golint: default

In the above example, task execution is skipped regardless of whether incremental processing is enabled or disabled, and the status of the task is always Skipped. If a task’s status is Skipped even though you did not cancel the job that includes it, review the input and ignore settings.
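To see why such a task ends up with nothing to do, here is a small Python sketch of resolving a tool's target files from input and ignore patterns. The helper name resolve_targets and the fnmatch-style patterns are hypothetical; they stand in for whatever matching rules rocro.yml really uses.

```python
from fnmatch import fnmatchcase
from typing import List


def resolve_targets(
    files: List[str], input_patterns: List[str], ignore_patterns: List[str]
) -> List[str]:
    """Keep files that match an input pattern and no ignore pattern."""
    selected = [
        f for f in files if any(fnmatchcase(f, p) for p in input_patterns)
    ]
    return [
        f for f in selected if not any(fnmatchcase(f, p) for p in ignore_patterns)
    ]


repo = ["cmd/main.go", "pkg/util.go", "README.md"]

# golint targets Go files, but every Go file is ignored: the target list
# is empty, so there is nothing for the tool to run on.
print(resolve_targets(repo, ["*.go"], ["*.go"]))
```

An empty result here corresponds to the Skipped status described above, independent of whether incremental processing is enabled.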

This blog post introduced you to incremental processing, one of the methods Inspecode uses to help developers speed up their CI workflows. Currently, Inspecode analyses file changes and applies incremental processing only when it can conservatively determine that the issues reported by the tools will not change. In the future, we are thinking about making incremental processing the default option. I hope this blog post included useful information that will help you further optimize your CI workflows.

Accelerating CI – Part 2: Optimization of resource allocation

In the previous post, I focused on accelerating CI through parallelization. However, because cost limits the hardware resources available, distributing those limited resources efficiently among parallel tasks is the key to further speedup. In this post, I will show you how to improve job throughput by optimizing the allocation of CPU usage.

In Rocro, a series of processes triggered by an event such as a git push or a pull request is called a job, and a process executed on an individual container within a job is called a task. In other words, a job is a collection of tasks. For example, in the following rocro.yml, the entire process described in the YAML is a job, while the process of each tool (such as gofmt), the setup process for executing these tools, and so on are tasks.

inspecode:
    gofmt: default
    golint: default
    go-test: default

Since the execution time of a job depends on its slowest task, improving the throughput of the job requires leveling the execution time of the tasks as much as possible. For Inspecode/Docstand, CPU usage can be specified with the cpu option in rocro.yml. cpu: 1 indicates that the tool can use one CPU core to its full extent. With cpu: 1, 3.75 GiB of memory is allocated, and the memory allocation is proportional to the CPU usage. You can specify the amount in units of 1/1000 of a core, either with a decimal fraction such as cpu: 0.25 or by appending m (milli) to an integer value. cpu: 250m is equivalent to cpu: 0.25.
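The arithmetic behind these settings can be written out as a short Python sketch. The function names are made up for this example, and the parsing only covers the two notations mentioned above (a decimal fraction and the m suffix); it is not Rocro's own parser.

```python
GIB_PER_CORE = 3.75  # from the text: cpu: 1 is allocated 3.75 GiB of memory


def parse_cpu(value) -> float:
    """Convert a cpu setting such as 1, 0.25, or "250m" into a number of cores."""
    text = str(value).strip()
    if text.endswith("m"):
        # Milli notation: 250m means 250/1000 of a core.
        return int(text[:-1]) / 1000
    return float(text)


def memory_gib(value) -> float:
    """Memory allocated for a cpu setting, proportional to the CPU usage."""
    return parse_cpu(value) * GIB_PER_CORE


print(parse_cpu("250m"))   # 0.25
print(memory_gib("500m"))  # 1.875
print(memory_gib(1.25))    # 4.6875
```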

If go-test takes the longest time in the above rocro.yml, assigning more CPU resources to go-test than to the other tools will shorten the processing time of the whole job:

inspecode:
    gofmt:
        machine:
            cpu: 0.25
    golint:
        machine:
            cpu: 500m
    go-test:
        machine:
            cpu: 1.25

This fine-grained level of CPU resource optimization is not available in other CI tools and services. Inspecode/Docstand are completely free during the beta period, and you can use a total of 8 CPU cores for free, so I hope you will try out various optimizations.

Accelerating CI – Part 1: Parallelization

Hello everyone, I am the CTO of Rocro Inc. Today, I am starting the Rocro Engineering Blog. In this blog, we hope to share not only news about our products but also the know-how we have gained through product development.

Rocro is a group of web services for software developers using GitHub and Bitbucket. With many excellent services already available in the market, why did I start the Rocro project? One of the reasons is to further accelerate CI. With many-core processors and auto-scalable cloud services becoming more and more popular, the best way to accelerate CI is to divide jobs as finely as possible and parallelize them. In this part, I will explain how Rocro supports parallelization.

Rocro’s Inspecode/Docstand execute all tools in parallel (*1). For example, if you write rocro.yml like the following in Inspecode, the three tools (gofmt, golint, go test) will run in parallel.

inspecode:
    gofmt: default
    golint: default
    go-test: default

In this way, you can parallelize the execution of tools just by listing the tool names. Even without rocro.yml, Inspecode will detect the major language in the Git repository and automatically execute the appropriate tools in parallel.
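As a local analogy for tool-level parallelism, the Python sketch below runs independent tasks concurrently and collects their results. It is not how Inspecode schedules containers; the function name is hypothetical and the three lambdas are stand-ins for real tool invocations.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_tools_in_parallel(tools: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run every tool concurrently and return its result keyed by tool name."""
    # One worker per tool, so no tool waits for another to finish.
    with ThreadPoolExecutor(max_workers=max(1, len(tools))) as pool:
        futures = {name: pool.submit(fn) for name, fn in tools.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-in callables; a real setup would invoke the actual tools here.
results = run_tools_in_parallel(
    {
        "gofmt": lambda: "gofmt done",
        "golint": lambda: "golint done",
        "go-test": lambda: "go-test done",
    }
)
print(results)
```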

It is also possible to split the input given to the tool for further parallelization. For example, by writing the following rocro.yml, the input of go test is split into two and executed in parallel.

inspecode:
    gofmt: default
    golint: default
    go-test:
        - input:
            - /path/to/package1
            - /path/to/package2
        - input:
            - /path/to/package3
            - /path/to/package4

If the execution time of one tool dominates the execution time of the entire job, tool-level parallelization alone does not make the job much faster. In such a case, it is better to split the tool’s input in an appropriate way to accelerate the execution.
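One simple way to produce such a split is to cut the list of packages into contiguous groups of roughly equal size, as in the Python sketch below. The function name is hypothetical, and splitting by package count is only an assumption; when packages differ greatly in test time, a split based on measured durations would level the tasks better.

```python
from typing import List


def split_inputs(packages: List[str], shards: int) -> List[List[str]]:
    """Split packages into at most `shards` contiguous groups of similar size."""
    if shards < 1:
        raise ValueError("shards must be at least 1")
    size = -(-len(packages) // shards)  # ceiling division
    if size == 0:
        return []  # nothing to split
    return [packages[i:i + size] for i in range(0, len(packages), size)]


packages = [
    "/path/to/package1",
    "/path/to/package2",
    "/path/to/package3",
    "/path/to/package4",
]

# Each group would become one input entry under go-test in rocro.yml.
for group in split_inputs(packages, 2):
    print(group)
```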

In the next part, I will talk about accelerating CI by optimizing resource allocation.

*1 In order to perform automatic optimization and parallelization, rocro.yml adopts a declarative description style as much as possible.