In the prelude, I mentioned that approximately 50k new tracks are uploaded to music streaming services every day.
There is a cosmological theory that the Universe has structure on all scales: from moons, planets, and stars, to individual galaxies, to clusters of galaxies, to superclusters, and so on without end.
To resolve the stale playlist problem, the first step is to get an idea of the scale of the Musical Universe. To this end, I created a map (i.e., a force-directed graph) of the 50 largest genre galaxies in the Musical Universe. The map can help us answer the question: do genre superclusters exist among the top 50 music genres on Spotify? Can you find any?
Read my answer
Nodes are encoded as circles. Each node is either a track (gray) or a genre (colored). For genre nodes, color (a sequential scale running from red through yellow to blue) encodes the number of incoming links (i.e., the in-degree). More significant (more connected) genres are bluer, while less significant genres are more reddish or yellowish (much as hotter stars emit a bluish color while cooler stars emit a reddish or yellowish color).
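To make the color encoding concrete, here is a minimal sketch of how in-degree could be turned into a red-yellow-blue color. It is written in plain JavaScript rather than D3, and the sample links, the color stops, and the function names (`inDegrees`, `redYellowBlue`, `genreColors`) are all assumptions made for illustration; the scale actually used in the visualization is not specified here.

```javascript
// Hypothetical sample data: each link points from a track to a genre.
const links = [
  { source: "track-1", target: "rock" },
  { source: "track-2", target: "rock" },
  { source: "track-3", target: "rock" },
  { source: "track-2", target: "pop" },
];

// Count incoming links (in-degree) per target node.
function inDegrees(links) {
  const counts = new Map();
  for (const { target } of links) {
    counts.set(target, (counts.get(target) || 0) + 1);
  }
  return counts;
}

// Illustrative color stops (assumed values, not the original palette).
const STOPS = [
  [215, 48, 39],   // red
  [255, 255, 191], // yellow
  [69, 117, 180],  // blue
];

// Map t in [0, 1] to an rgb() string by linear interpolation between stops.
function redYellowBlue(t) {
  const clamped = Math.min(1, Math.max(0, t));
  const scaled = clamped * (STOPS.length - 1);
  const i = Math.min(Math.floor(scaled), STOPS.length - 2);
  const f = scaled - i;
  const [r, g, b] = STOPS[i].map((c, k) =>
    Math.round(c + f * (STOPS[i + 1][k] - c))
  );
  return `rgb(${r}, ${g}, ${b})`;
}

// Color each genre by its in-degree, normalized to the observed range.
function genreColors(links) {
  const counts = inDegrees(links);
  const values = [...counts.values()];
  const min = Math.min(...values);
  const max = Math.max(...values);
  const colors = new Map();
  for (const [genre, degree] of counts) {
    const t = max === min ? 1 : (degree - min) / (max - min);
    colors.set(genre, redYellowBlue(t));
  }
  return colors;
}

const colors = genreColors(links);
```

In this toy data, the genre with the most incoming links lands at the blue end of the scale and the genre with the fewest lands at the red end. A D3 version could swap the hand-rolled interpolation for a built-in scheme such as `d3.interpolateRdYlBu`.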
Yes! Hover over any node to see its name and connections. You can also grab (i.e., click and drag) nodes to see how they affect the graph layout. Future releases may incorporate filtering and more details-on-demand. Stay tuned!
Mike Bostock's D3 force-directed graph example
Sophie Engle's Graph Demos
All work was completed by Kai unless noted otherwise.
One reviewer reported that the graph did not render when they tried to view it. I suspect this was not a bug but a consequence of the large dataset: the viewer likely left the page before the graph finished rendering. D3's force-directed layout is computationally expensive (particularly with large data) because node positions are recalculated on every clock tick. I therefore asked whether any methods could speed up rendering without compromising the quality of the node positions. Professor Engle suggested d3-force-reuse, which I integrated into my final visualization. It gains its speed by reusing Barnes-Hut approximations across iterations instead of computing new ones at each iteration of the layout algorithm.
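The idea of reusing an approximation across iterations can be sketched with a toy layout loop. This is not the d3-force-reuse API or its implementation: the single center-of-mass aggregate stands in for a full Barnes-Hut quadtree, and the names `buildApproximation`, `applyRepulsion`, `runLayout`, and the `reuseEvery` parameter are hypothetical, chosen only to show where the savings come from.

```javascript
// Toy stand-in for the expensive step: summarize all nodes as one aggregate
// (node count and center of mass). Real Barnes-Hut builds a quadtree of such
// aggregates; a single aggregate keeps this sketch short.
let buildCount = 0;
function buildApproximation(nodes) {
  buildCount += 1;
  const n = nodes.length;
  const cx = nodes.reduce((sum, d) => sum + d.x, 0) / n;
  const cy = nodes.reduce((sum, d) => sum + d.y, 0) / n;
  return { n, cx, cy };
}

// Push each node away from the aggregate's center of mass.
function applyRepulsion(nodes, approx, strength) {
  for (const d of nodes) {
    const dx = d.x - approx.cx;
    const dy = d.y - approx.cy;
    const dist2 = dx * dx + dy * dy || 1e-6; // avoid division by zero
    d.x += (strength * approx.n * dx) / dist2;
    d.y += (strength * approx.n * dy) / dist2;
  }
}

// Run the layout, rebuilding the approximation only every `reuseEvery` ticks
// and reusing the cached one in between.
function runLayout(nodes, ticks, reuseEvery) {
  let approx = null;
  for (let i = 0; i < ticks; i++) {
    if (i % reuseEvery === 0) approx = buildApproximation(nodes);
    applyRepulsion(nodes, approx, 0.01);
  }
  return nodes;
}

// Hypothetical starting positions.
const makeNodes = () => [
  { x: 0, y: 0 },
  { x: 1, y: 0 },
  { x: 0, y: 1 },
  { x: 1, y: 1 },
];

buildCount = 0;
runLayout(makeNodes(), 300, 1); // rebuild on every tick
const buildsEveryTick = buildCount;

buildCount = 0;
runLayout(makeNodes(), 300, 10); // reuse each approximation for 10 ticks
const buildsWithReuse = buildCount;
```

Over 300 ticks, rebuilding on every tick performs 300 builds, while reusing each approximation for 10 ticks performs 30. The trade-off is that forces between rebuilds are computed from slightly stale positions, which is why the reuse interval has to be balanced against layout quality.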