For those interested, here’s how I turned the git logs into the calendar view in the previous excuse-ridden post on why edgeyo’s development has been slow for the past 1.5 months.

  1. The git log is piped into a csv file:

         xuanyi@gallifrey:~$ git log master --date=short --pretty=format:"%h%x09%an%x09%ad%x09%s" > edgeyoCommitLog.csv

  2. Here’s what the csv looks like: one commit per row, with the abbreviated hash, author name, date and subject separated by tabs.
  3. A table of commit counts is created with Excel’s pivot table, with Date and CommitCount as columns. The file is saved as edgeyoCommits.csv.
  4. R is used to create the visualization (ggplot2 and RColorBrewer are required):

         df <- read.csv('edgeyoCommits.csv') # load the csv (columns: Date, CommitCount)
         dates <- strptime(df$Date, "%d/%m/%Y") # parse the dd/mm/yyyy dates once
         df$Day <- dates$wday # define Day: day of the week, 0 = Sunday
         df$Week <- -(dates$yday %/% 7 + 1) # define Week: negated so later weeks are drawn lower
         df$Month <- dates$mon # define Month: 0 = January

         library(ggplot2) # load ggplot2
         library(RColorBrewer) # load RColorBrewer

         # use bare column names inside aes() so that faceting by Month works
         p <- ggplot(df, aes(x = Day, y = Week, fill = CommitCount))
         # a trailing + tells R that the expression continues on the next line
         plottedP <- p + geom_tile() +
                 scale_fill_gradientn(colours = brewer.pal(5, "Blues")) +
                 facet_wrap(~ Month, nrow = 1)
         # opts() and theme_rect() are no longer available in current ggplot2;
         # theme() and element_rect() are their replacements
         finalP <- plottedP + theme(panel.background = element_rect(fill = '#FFFFFF', colour = '#FFFFFF'))
         finalP

  5. Clean up in Photoshop!
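The pivot table step could also be scripted instead of done in Excel. Here is a minimal sketch, not what was actually used for the post: it assumes the tab-separated log from the first step, and the function name count_commits_per_day is made up for illustration. It rewrites the dates as dd/mm/yyyy so that they match the format the R script parses.

```shell
# Count commits per day from a tab-separated log file with the columns
# hash, author, date (YYYY-MM-DD) and subject.
# Prints one "dd/mm/yyyy,count" row per day, oldest day first.
count_commits_per_day() {
  awk -F '\t' '
    NF >= 3 { count[$3]++ }          # third column is the commit date
    END {
      for (d in count) {
        split(d, p, "-")             # p[1] = year, p[2] = month, p[3] = day
        # keep the ISO date in front so the rows can be sorted by date
        printf "%s,%s/%s/%s,%d\n", d, p[3], p[2], p[1], count[d]
      }
    }
  ' "$1" | sort | cut -d, -f2-       # sort by ISO date, then drop it
}

# Hypothetical usage, writing the file that the R script reads:
#   { echo "Date,CommitCount"; count_commits_per_day edgeyoCommitLog.csv; } > edgeyoCommits.csv
```

Days without any commits do not appear in the output, which is also how a pivot table over the log would behave.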

And this is what the final product looks like:

Notes

  • This visualization uses geom_tile, which is documented in the ggplot2 reference. facet_wrap was also used.
  • Only the master branch was visualized. This was because the topic was actually about the master branch: that’s the branch we deploy from.
  • Prior to creating the pivot table, the log file was scrubbed of any commits smaller than 20 lines.
  • For the purposes of the analysis, we only looked at commits from the 27th of December onwards, since that’s when edgeyo became truly public facing.
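The scrubbing of small commits could be automated as well. The sketch below is only one possible approach and rests on assumptions: it takes "lines" to mean insertions plus deletions as reported by git log --shortstat, it relies on a leading "@" added to the pretty format to mark commit lines, and filter_small_commits is a hypothetical name.

```shell
# Keep only commits that changed at least MIN lines (insertions + deletions).
# Reads `git log --shortstat` output on stdin, where every commit line starts
# with an "@" marker (added through the --pretty format, see usage below).
# Commits without a stat line, such as plain merges, count as 0 lines.
filter_small_commits() {
  awk -v min="${1:-20}" '
    function emit() {
      if (commit != "" && changed >= min) print commit
    }
    /^@/ { emit(); commit = substr($0, 2); changed = 0; next }
    / changed/ {
      # e.g. " 3 files changed, 25 insertions(+), 2 deletions(-)"
      for (i = 2; i <= NF; i++)
        if ($i ~ /^(insertion|deletion)/) changed += $(i - 1)
    }
    END { emit() }
  '
}

# Hypothetical usage:
#   git log master --date=short --shortstat \
#     --pretty=format:"@%h%x09%an%x09%ad%x09%s" | filter_small_commits 20 > edgeyoCommitLog.csv
```

The output has the same tab-separated columns as the original log, so the later steps would not need to change.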

Discussion

  • What can be done better? Here’s an idea: there are four of us, so it would be very simple to split each day into four quadrants, with each quadrant representing one person.
  • Is this a better visualization than gitk’s graph? It depends. In the context of the previous blog post, where we were trying to figure out where we went wrong, it was useful. The answer was immediately apparent once we placed events on dates (an embarrassing number of personal circumstances were slowing us down a lot).
  • The Use of Visualization: We use visualization a lot internally at edgeyo. ggplot2 and other tools are useful to us when analysing our users (yes, we know edgeyo is not running at full capacity yet, but we’ve already done quite a few analyses). The good news is that we’ve decided to share them more often on the blog :D. Tell me what you think!
 
