Tag Archives: software analytics

The Problem with Software Analytics

Software Analytics is the marriage of data science and software engineering. It aims to use data generated from software and from software engineering processes to provide insights for creating better software.

The following is a quote from a software analytics roundtable discussion in 2013. All of the roundtable members are leading academics at prestigious universities, chosen because they are accomplished and know the field well. Now, onto the quote.

Modern software services such as GitHub, BitBucket, Ohloh, Jira, FogBugz, and the like employ wide use of visualization and even bug effort estimation. We can pat ourselves on the backs even if those developers never read a single one of our papers.
Here is the source in IEEE Computer (which you most likely cannot access unless you are an academic): “Roundtable: What’s Next in Software Analytics”. For non-academics, an InfoQ reprint is available free online.

The academic research community cannot take credit for what GitHub, BitBucket, and others have done. Yes, that community is doing some excellent work, but most software practitioners never see it because it stays hidden in academic journals. The advancements may have occurred simultaneously and coincidentally, but there is no clear causal relationship. Unfortunately, the academic research is not reaching the hands of software practitioners.

I would like to think the target audience for software engineering research would be software engineers, project managers, and developers. However, as this quote points out, those practitioners hardly ever see the research. If the research does not reach its intended audience, then there is a clear problem, and one that needs to be fixed.

Unfortunately, I do not yet know what the fix is. If you have any ideas, please leave a comment below.

If there is enough interest, maybe I will start something (just don’t know what that something is).

Scoring A Software Development Organization With A Single Number

I just finished my PhD in the Computational Science and Statistics program at South Dakota State University. My dissertation focused on the area of software analytics, sometimes called Data-Driven Software Engineering. Specifically: how does a Software Development Organization evaluate itself? Students have a G.P.A. (Grade Point Average), but organizations have no similar evaluation method.

The dissertation introduces the C.R.I. (Cumulative Result Indicator), a single number for evaluating the performance of a Software Development Organization. The C.R.I. focuses on five primary elements of a Software Development Organization:

  1. Quality
  2. Availability
  3. Satisfaction
  4. Schedule
  5. Requirements

The C.R.I. specifies what data needs to be collected and how that data can be combined into a score. Naturally, this solution will not work in every situation, but it provides a consistent method of evaluation, and it is flexible enough to use only some of the elements, or even additional ones.
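To make the idea concrete, here is a minimal sketch of how per-element measurements might be rolled up into a single number. The five element names come from the list above, but the normalization to [0, 1], the weights, and the weighted-average aggregation rule are illustrative assumptions on my part, not the dissertation's actual formula.

```python
# Hypothetical sketch: combine per-element scores into one C.R.I.-style number.
# Element names come from the post; weights and the aggregation rule are
# illustrative assumptions, not the dissertation's actual method.

ELEMENTS = ["quality", "availability", "satisfaction", "schedule", "requirements"]

def cri_score(scores, weights=None):
    """Combine per-element scores (each in [0, 1]) into a number in [0, 100].

    scores  -- dict mapping each element name to a normalized score
    weights -- optional dict of relative weights; defaults to equal weighting
    """
    if weights is None:
        weights = {e: 1.0 for e in ELEMENTS}  # equal weighting by default
    total_weight = sum(weights[e] for e in ELEMENTS)
    weighted_sum = sum(scores[e] * weights[e] for e in ELEMENTS)
    return 100.0 * weighted_sum / total_weight

# Example: a hypothetical organization's normalized measurements.
example = {
    "quality": 0.92,        # e.g. fraction of releases without critical defects
    "availability": 0.999,  # e.g. uptime
    "satisfaction": 0.80,   # e.g. survey score
    "schedule": 0.70,       # e.g. fraction of milestones hit on time
    "requirements": 0.85,   # e.g. fraction of requirements delivered
}
print(round(cri_score(example), 1))  # → 85.4
```

A weighted average keeps the score easy to explain, and choosing the weights is where an organization would encode which elements matter most to it.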

That is the brief one-minute overview of the dissertation. Feel free to read more of the details in the document below.

The source and data files are available on GitHub: Dissertation Scoring SDO.

You can also see the results of the analysis on Sense: Scoring an SDO.

This is the first in a series of posts on Data-Driven Software Engineering. In the next few weeks, I will be posting more about the topic. Some posts will be excerpts from the dissertation, and others will be new thoughts on the topic. Stay Tuned!