Chord Similarity analysis table

chord similarity
Using a vector space to analyse similarity between chords in an interactive table

Matt Crump


January 22, 2024

piano chord similarity analysis. cartoon. colorful. musical. science.

This is a quick post with some tools for looking at chord similarity.

I took all of the chords on this music theory page, and represented them in a chord vector space using one-hot codes.

Each chord is represented as a vector with 12 features, corresponding to each of the 12 possible notes. If a note is in a chord, then the note feature get’s a 1 in the vector. All other features are set to 0.

Here’s a few examples of what the vectors look like.

key type item C Db D Eb E F Gb G Ab A Bb B
C key C note 1 0 0 0 0 0 0 0 0 0 0 0
C scale C major scale 1 0 1 0 1 1 0 1 0 1 0 1
C triads C major triad 1 0 0 0 1 0 0 1 0 0 0 0
C triads C minor triad 1 0 0 1 0 0 0 1 0 0 0 0
C triads C6 1 0 0 0 1 0 0 1 0 1 0 0

I created vectors for 47 chords across all keys. This results in a matrix with 564 rows (for each chord vector) and 12 columns for each note feature.

Next, I used the vector cosine to compute the similarity between each chord and every other chords. That results in a 564x564 chord-similarity matrix.

In the matrix, similarity ranges from 0 (no similarity) to 1 (perfect similarity). The similarity function basically tracks how many overlapping features there are between two chords. If each chord has zero shared notes, then similarity will be 0. As a chord increases in the number of shared notes, then similarity increases.

If I have extra time I may play around with some additional ways to visualize this matrix. For now, I’m using the datatable library to load the matrix into this webpage. It’s a bit clunky and very large.

The table should scroll left and right. The columns can be sorted ascending or descending by clicking the little arrows beside a column name. Sorting from the largest value to the smallest value will re-arrange the row-names on the left side. This produces an ordered list of chords from most to least similar to the column name that was clicked.

Shiny app


There were originally tables in this post, but they were basically unusable. Ideally, I’d like to search the tables on my ipad while it is sitting on my piano.

I tried to make a shiny app using webr (that would run without a server), but I couldn’t make it work.

So, here is a shiny app that does work:

First order chord similarity

# import excel sheet
c_chord_excel <- rio::import("chord_vectors.xlsx")

# grab feature vectors
c_chord_matrix <- as.matrix(c_chord_excel[,4:15])

# assign row names to the third column containing chord names
row.names(c_chord_matrix) <- c_chord_excel[,3]

# define all keys
keys <- c("C","Db","D","Eb","E","F","Gb","G","Ab","A","Bb","B")

# the excel sheet only has chords in C
# loop through the keys, permute the matrix to get the chords in the next key
# add the permuted matrix to new rows in the overall chord_matrix
for (i in 1:length(keys)) {
  if (i == 1) {
    # initialize chord_matrix with C matrix
    chord_matrix <- c_chord_matrix
  } else {
    #permute the matrix as a function of iterator
    new_matrix <- cbind(c_chord_matrix[, (14-i):12],c_chord_matrix[, 1:(13-i)] )
    # rename the rows with the new key
    new_names <- gsub("C", keys[i], c_chord_excel[,3])
    row.names(new_matrix) <- new_names
    # append the new_matrix to chord_matrix
    chord_matrix <- rbind(chord_matrix,new_matrix)

# calculate similarities between keys
key_similarities <- lsa::cosine(chord_matrix)

# calculate similarities between keys
chord_similarities <- lsa::cosine(t(chord_matrix))

chord_properties <- tibble(
  type = rep(c_chord_excel$type,length(keys)),
  key = rep(keys, each = dim(c_chord_matrix)[1]),
  chord_names  = row.names(chord_similarities)

chord_similarities <- cbind(chord_properties,
row.names(chord_similarities) <- NULL

# print table
# try datatable options...didn't work as I wanted
# DT::datatable(
#   chord_similarities[1:50,1:50],
#   extensions = c("FixedColumns","SearchBuilder","Buttons"),
#   options = list(
#     paging = TRUE,
#     pageLength =  25,
#     fixedColumns = list(leftColumns = 4),
#     dom = "BQlfrtip",
#     searchBuilder = TRUE,
#     search = list(return = TRUE),
#     buttons = 'colvis'
#   )
# )

  extensions = "FixedColumns",
  options = list(
    paging = TRUE,
    pageLength =  25,
    fixedColumns = list(leftColumns = 4)

Second-order similarities

update: this table has been moved to the shiny app

This table uses the above cosine similarity vectors for each chord as the basis vectors, and then computes another similarity matrix. The approach has interesting properties in other domains like computational modeling of word semantics. Just curious to look at it in this context.

One basic observation is that in the first-order similarity space (above), there are many 0 similarities between pairs. For example, the single note C has 0 similarity to every other single note, because each note only has one feature, and the features never overlap. However, in the second-order matrix, individual notes now have some positive similarity to each other. This is because the whole vector of similarities for C (from above), now has many shared elements with the vectors for the other chords, and a similarity can be computed.

I suspect this table will have some more musically interesting things going on.

first_order <- lsa::cosine(t(chord_matrix))
second_order <- lsa::cosine(first_order)

second_order_similarities <- cbind(chord_properties,
row.names(second_order_similarities) <- NULL

  extensions = "FixedColumns",
  options = list(
    paging = TRUE,
    pageLength =  25,
    fixedColumns = list(leftColumns = 4)