Spotify data analysis

Spotisis /spo-ti-sis/ noun The analysis of one’s Spotify streaming history using Python.

I was reading through a lot of data science guides and project ideas when I came across an article in which the author compared his song choices with his friend’s. I wanted to do something similar, so I set out to analyse my own streaming history and compare it with what the world listens to.

Through this, I aim to find out more about my music preferences and how they differ from the world’s general picks.

I never really put much thought into my music preference before this project — it was always kind of dependent on my mood, and when someone asked me what type of music I like, I had no answer — because it varied from one hour to another.

I’ve split this project into 2 sections:

Part A is the analysis of my music streaming history.

  • Timeline of my streaming history
  • Day preference
  • Favorite artist
  • Favorite songs
  • Spirit of the songs
  • Diversity

Part B is the comparison of my 50 most streamed songs with the top 50 songs streamed globally in 2019.

The data

Spotify lets every user request a download of their full streaming history, and Part A depends entirely on that export. Spotify also has an excellent Developer Platform that makes its data available to the public. Alongside my personal data, I used the audio features option, which breaks a song down and gives it a score for a number of different attributes. The attributes are as follows:

  • Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence that the track is acoustic.

  • Danceability: A description of how suitable a track is for dancing, based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

  • Energy: A measure from 0.0 to 1.0 that represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy.

  • Instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content.

  • Liveness: Detects the presence of an audience in the recording.

  • Loudness: The overall loudness of a track in decibels (dB). Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.

  • Speechiness: Detects the presence of spoken words in a track.

  • Valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track.

  • Tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.

  • Mode: Indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor by 0.

  • Key: The estimated overall key of the track.

The dataset was a little messy, so I used Pandas to clean it up according to my need for each section. The entire code can be found on the GitHub link at the end of this article.
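
As a rough illustration of that clean-up step, the sketch below loads a few play records into Pandas. It assumes the export uses the `endTime`, `artistName`, `trackName` and `msPlayed` fields; the sample rows and the `load_history` helper are made up for illustration and are not the code from the GitHub repository.

```python
import pandas as pd

# Hypothetical sample rows; the field names are assumed to match
# the streaming-history files in Spotify's data export.
SAMPLE_RECORDS = [
    {"endTime": "2020-02-24 08:15", "artistName": "Artist A",
     "trackName": "Song 1", "msPlayed": 210000},
    {"endTime": "2020-02-24 08:19", "artistName": "Artist B",
     "trackName": "Song 2", "msPlayed": 4000},
    {"endTime": "2020-02-25 17:40", "artistName": "Artist A",
     "trackName": "Song 3", "msPlayed": 185000},
]


def load_history(records):
    """Build a tidy DataFrame from raw play records (illustrative helper)."""
    df = pd.DataFrame(records)
    df["endTime"] = pd.to_datetime(df["endTime"])   # parse timestamps
    df["minutesPlayed"] = df["msPlayed"] / 60_000   # milliseconds -> minutes
    return df.drop_duplicates().reset_index(drop=True)


history = load_history(SAMPLE_RECORDS)
print(history.dtypes)
```

With a real export, the records would presumably be read from the downloaded JSON files instead of a hard-coded list.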

For Part B, I used this dataset from Kaggle.

Before we begin, I just want to say something… Don’t come at me for my music choice!

Part A

1. Timeline of my streaming history

I know that I spend a lot of time listening to music, but I didn’t know I spent that much time! The data dates back to late June of 2019 and was highly varied.

On February 24th 2020, I spent a staggering 535 minutes (almost 9 hours) on Spotify, the most in the past year! There’s no definite answer as to why the gap between the highest and lowest daily values (the lowest was a matter of seconds) is so large, but I did sign up for Spotify Premium around that time, so maybe that was the reason? Push the promos harder you guys ;)
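
A daily timeline like this can be built by summing play time per calendar day. The following is a minimal sketch on made-up plays, assuming `endTime` and `msPlayed` columns as in Spotify's export; it is not the author's exact code.

```python
import pandas as pd

# Made-up plays; the column names are assumptions about the export format.
plays = pd.DataFrame({
    "endTime": pd.to_datetime([
        "2020-02-23 21:00", "2020-02-24 09:00",
        "2020-02-24 13:30", "2020-02-24 22:10",
    ]),
    "msPlayed": [180_000, 240_000, 200_000, 220_000],
})

# Sum the listening time per calendar day and convert to minutes.
daily_minutes = (
    plays.groupby(plays["endTime"].dt.date)["msPlayed"].sum() / 60_000
)

busiest_day = daily_minutes.idxmax()   # day with the most listening time
print(busiest_day, daily_minutes.max())
```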

2. Day preference

Does the day of the week affect how long I spend listening to music?

I usually listen to music while walking to and from college, so I would have predicted more listening time on weekdays. But Sunday is my chill day, so it makes sense that it turned out to be the day I spent the most time listening to music.
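
One way to check a weekday preference is to group the plays by day name. This sketch uses invented rows and assumed column names (`endTime`, `msPlayed`), so the totals are purely illustrative.

```python
import pandas as pd

# Invented plays for illustration only.
plays = pd.DataFrame({
    "endTime": pd.to_datetime([
        "2020-02-21 08:00",  # Friday
        "2020-02-23 15:00",  # Sunday
        "2020-02-23 20:00",  # Sunday
        "2020-02-24 08:00",  # Monday
    ]),
    "msPlayed": [120_000, 300_000, 240_000, 180_000],
})

# Total minutes per weekday, largest first.
minutes_by_weekday = (
    plays.groupby(plays["endTime"].dt.day_name())["msPlayed"].sum() / 60_000
).sort_values(ascending=False)

print(minutes_by_weekday)
```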

3. Favorite artists

Do I have a favorite artist?

According to the data, I actually do. There were two factors I considered: the number of times I played an artist’s songs and the total amount of time I spent listening to them.

When looking through the data, I found that some of the songs were played for only a few seconds, which reduced the accuracy of the results.

The graphs below show the top 15 artists under both categories.

Lauv, Shawn Mendes, One Direction and Justin Bieber held the top 4 positions in both graphs, while the rest were reordered.
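
The two rankings can be sketched with a single group-by. The rows below are invented, and the 30-second cut-off for ignoring skipped plays (`MIN_MS_PLAYED`) is an assumption, since the article does not state how the very short plays were handled.

```python
import pandas as pd

MIN_MS_PLAYED = 30_000  # assumed cut-off; the article does not state one

# Invented plays for illustration only.
plays = pd.DataFrame({
    "artistName": ["Artist A", "Artist A", "Artist B",
                   "Artist B", "Artist B", "Artist C"],
    "msPlayed": [200_000, 190_000, 3_000, 250_000, 5_000, 400_000],
})

# Drop plays that were skipped after a few seconds.
full_plays = plays[plays["msPlayed"] >= MIN_MS_PLAYED]

# Count plays and total the listening time for each artist.
per_artist = full_plays.groupby("artistName")["msPlayed"].agg(
    play_count="count", total_ms="sum"
)
per_artist["total_minutes"] = per_artist["total_ms"] / 60_000

top_by_count = per_artist.sort_values("play_count", ascending=False).head(15)
top_by_time = per_artist.sort_values("total_minutes", ascending=False).head(15)
print(top_by_count)
print(top_by_time)
```

In this toy data the two rankings disagree about who comes first, which is the kind of reordering the two graphs show.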

4. Which songs were played most?

Was it by the same 15 artists?

Yes, it was — Lauv took 5 of the 15 spots!

I realised that some of the top 15 artists (based on the amount of time spent listening to their songs) were on the list because of one or two songs which were repeated multiple times.

For example, Memories by Maroon 5 was the most played song (played for a total of 184 minutes). Compared to the total time spent listening to the group (430 minutes), the difference was 246 minutes. As a percentage, this means that more than 40% of the time spent listening to Maroon 5 was spent on Memories alone.
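
The arithmetic behind those figures is easy to verify. The snippet below only uses the two numbers quoted in the text; the variable names are illustrative.

```python
# Figures quoted in the article.
memories_minutes = 184   # time spent on "Memories"
maroon5_minutes = 430    # total time spent on Maroon 5

other_songs_minutes = maroon5_minutes - memories_minutes
memories_share = memories_minutes / maroon5_minutes * 100

print(other_songs_minutes)        # 246
print(round(memories_share, 1))   # 42.8
```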

It’s a good song. Admit it.

5. Spirit of the song

Do I listen to positive songs?

Using the valence attribute from Spotify’s audio features, I tried to find out the general spirit of the top 50 songs I listen to. Valence ranges from 0 to 1, with 1 being the most positive.

For the sake of classification:

  • low spirit: 0 ≤ valence < 0.5
  • neutral: 0.5 ≤ valence < 0.6
  • high spirit: 0.6 ≤ valence ≤ 1
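
These three bands translate directly into a small function. This is a minimal sketch of the thresholds listed above; the function name `classify_spirit` is made up for illustration.

```python
def classify_spirit(valence: float) -> str:
    """Map a valence score in [0, 1] to one of the three spirit bands."""
    if not 0.0 <= valence <= 1.0:
        raise ValueError("valence must be between 0 and 1")
    if valence < 0.5:
        return "low spirit"
    if valence < 0.6:
        return "neutral"
    return "high spirit"


print(classify_spirit(0.18))  # low spirit
print(classify_spirit(0.55))  # neutral
print(classify_spirit(0.6))   # high spirit
```

Note how narrow the neutral band is compared with the other two, so most songs will land in either the low or the high group.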

(I named it as ‘spirit’ because ‘positive’ and ‘negative’ didn’t feel right)

I was pretty unsure about this one and was utterly surprised by the results.

So I listen to more low-spirit songs?? That doesn’t make sense!

When I cross-referenced the song names with their valence scores, I realised that this may not have been the most accurate representation. Ed Sheeran’s Photograph had a valence of 0.18, which put it in the ‘low spirit’ category. Although it’s not a super high-spirited song, it’s not that low either!

6. Diversity of songs

How do the audio features of the songs compare to one another?

The spirit of the songs made me curious about how the songs vary from one another in terms of their audio features, so I compared my top 3 most played songs. I believe that my song choices are highly diverse.

Those who are familiar with these songs know just how much they vary from one another — they give such different vibes, but I needed the data to prove it.

There is A LOT of difference, most noticeable in the loudness and acousticness attributes.

The next part builds on this diversity.

Part B

Is my music too diverse? How does it fare when compared to the global top 50?

Apart from the mode, everything is different! I prefer less groovy, instrumental-based songs with lower energy levels, while the global hits suggest people lean towards fast-paced, energetic songs that they can dance to.

The difference between my music’s average tempo (beats per minute) and the global average is 4 BPM. According to research, songs at 120 BPM are considered fast-paced. My preference seems to be a little slower, though not by much.
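
A comparison of this kind comes down to averaging each audio feature over the two song lists and taking the difference. The sketch below uses invented feature values (not the article's data or the Kaggle dataset) and a hypothetical `average_features` helper.

```python
from statistics import mean

# Invented feature values for illustration only; not the article's data.
my_top_songs = [
    {"danceability": 0.55, "energy": 0.45, "tempo": 108.0},
    {"danceability": 0.60, "energy": 0.50, "tempo": 116.0},
]
global_top_songs = [
    {"danceability": 0.75, "energy": 0.70, "tempo": 118.0},
    {"danceability": 0.70, "energy": 0.65, "tempo": 122.0},
]


def average_features(songs):
    """Return the mean of every audio feature across a list of songs."""
    return {name: mean(song[name] for song in songs) for name in songs[0]}


mine = average_features(my_top_songs)
world = average_features(global_top_songs)

# Positive values mean the global list scores higher on that feature.
difference = {name: world[name] - mine[name] for name in mine}
print(difference)
```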

Conclusion

This project was a blast to do. I thoroughly enjoyed learning more about my music preferences and comparing them to the global hits. Now that I am backed by the data, I can say that my music is highly diversified and that I do have a favourite artist: Lauv (considering the amount of time I’ve spent listening to his songs, it wouldn’t be justified to say otherwise!).

Following this article, I would like to continue by applying some machine learning knowledge to create a recommender system based on my music preferences.

Feel free to comment and view the entire code on my GitHub!

Big thanks to Vlad Gheorghe for his brilliant explanation (huge lifesaver!).

Translated from: https://medium.com/swlh/analysis-of-my-spotify-streaming-history-57a6088c3d3
