Identify Profitable Advertising Locations Using Washington DC Metro Data

In-Depth Analysis

Having lived in Washington DC for the past year, I have come to realize that the WMATA metro is the lifeline of this vibrant city. The network is extensive and well connected throughout the DMV area. When I first moved to the capital without a car, I often hopped on the metro to get around. I have always loved train journeys, so unsurprisingly the metro became my favorite way to explore this beautiful city. On my travels, I often notice the product placements and advertisements on metro platforms, near escalators and elevators, inside the trains, and so on. A good analysis of metro ridership data would help advertisers identify which stops are busiest at which times, so as to increase ad exposure. I chanced upon this free dataset and decided to dive deep into it. In this article, I’ll walk you through my analysis.

Step 1: Importing necessary libraries

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
from wordcloud import WordCloud, STOPWORDS
from nltk.corpus import stopwords

Step 2: Reading the data

We load the original data into a pandas dataframe called ‘df_metro’.

df_metro = pd.read_csv("DC MetroData.csv")

Step 3: Eyeballing the data and length of the dataframe

df_metro.head()
df_metro.columns
len(df_metro)
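
Beyond eyeballing the first rows, it can help to confirm the column types and look for missing values. The sketch below uses a small made-up sample rather than the real CSV: the column names (Entrance, Exit, Day, Time, Riders) follow the ones used later in this article, and every value is invented purely for illustration.

```python
import pandas as pd

# Tiny made-up sample standing in for df_metro (all values are invented).
sample = pd.DataFrame({
    "Entrance": ["Vienna", "Shady Grove", "Dupont Circle"],
    "Exit": ["Farragut West", "Farragut North", "Smithsonian"],
    "Day": ["Weekday", "Weekday", "Saturday"],
    "Time": ["AM Peak", "AM Peak", "Midday"],
    "Riders": [900, 850, None],
})

# Data type of each column; Riders becomes float64 because of the missing value.
column_types = sample.dtypes

# Number of missing values in each column.
missing_per_column = sample.isnull().sum()

print(column_types)
print(missing_per_column)
```

On the real data, the same two calls on df_metro would show whether any cleaning is needed before the analysis.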

Step 4: Checking distinct values under different columns

Let us check the unique values in the ‘Time’ column:

df_metro['Time'].value_counts().sort_values()

The unique values in the ‘Day’ column are as follows:

df_metro['Day'].value_counts().sort_values()

The next step is to work through a few questions.

Q1. What are the popular entrances and exits?

Counting the records for each metro stop and arranging the counts in descending order shows which entrances and exits appear most often in the data.

df_metro['Entrance'].value_counts().sort_values(ascending=False).head()
df_metro['Exit'].value_counts().sort_values(ascending=False).head()
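
Note that value_counts() counts rows, not passengers. If each row describes one entrance/exit/time combination with its own ‘Riders’ figure (an assumption about the dataset layout), a ranking by total riders may differ from a ranking by record count. A minimal sketch on invented data:

```python
import pandas as pd

# Made-up rows; each row is one entrance/exit combination (values invented).
sample = pd.DataFrame({
    "Entrance": ["Vienna", "Vienna", "Dupont Circle", "Union Station"],
    "Exit": ["Farragut West", "Smithsonian", "Farragut West", "Farragut West"],
    "Riders": [900, 300, 150, 100],
})

# Rank exits by number of records (what value_counts() does).
exits_by_records = sample["Exit"].value_counts()

# Rank exits by total riders instead: sum Riders per exit station.
exits_by_riders = (
    sample.groupby("Exit")["Riders"].sum().sort_values(ascending=False)
)

print(exits_by_records.head())
print(exits_by_riders.head())
```

Comparing the two rankings on the real data is a quick way to check that the record counts are a fair proxy for footfall.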

The popular locations seem to be:

  1. Gallery Place-Chinatown: Major attractions include Capital One Arena (which draws big crowds for sporting events and concerts), restaurants, bars, etc.

  2. Foggy Bottom: Government offices in the area make it a popular commute destination.

  3. Pentagon City: Its location just 2 miles from the National Mall in downtown Washington makes the area a popular site for hotels and businesses.

  4. Dupont Circle: International embassies are located in the area.

  5. Union Station: An important location for long-distance travelers.

  6. Metro Center: A popular downtown location.

  7. Fort Totten: Its Metro station serves as a popular transfer point for the Green, Yellow and Red lines.

Takeaway: Advertisers should target the popular metro stations above, which have high rider footfall, to grab maximum buyer attention.

Q2. What does the ridership look like during different days/times of the week?

This can be answered by simply plotting the ridership data across different days and times. We will use the seaborn library to create this visualization.

sns.set_style("whitegrid")
ax = sns.barplot(x="Day", y="Riders", hue="Time",
                 data=df_metro,
                 palette="inferno_r")
ax.set(xlabel='Day', ylabel='# Riders')
plt.title("Rider Footfall on different Days/Times")
plt.show()
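
By default, seaborn's barplot draws the mean of the y variable for each group, so each bar represents the average ‘Riders’ value per record rather than a total. The same numbers can be read off a pivot table. The sketch below uses invented values and assumes the ‘Day’, ‘Time’ and ‘Riders’ columns used in the plot:

```python
import pandas as pd

# Made-up ridership records (values invented for illustration).
sample = pd.DataFrame({
    "Day": ["Weekday", "Weekday", "Weekday", "Saturday", "Saturday"],
    "Time": ["AM Peak", "AM Peak", "PM Peak", "Midday", "Midday"],
    "Riders": [900, 700, 850, 200, 100],
})

# Average riders per record for every Day/Time combination,
# i.e. the bar heights a default barplot would show.
mean_riders = sample.pivot_table(
    index="Day", columns="Time", values="Riders", aggfunc="mean"
)

print(mean_riders)
```

Combinations that do not occur in the data show up as NaN in the table.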

Takeaway: The metro is a popular choice for commuting to work in the city, so as expected rider footfall is highest on weekdays, particularly during AM Peak and PM Peak. Companies planning to roll out new products should target these slots to attract attention and generate consumer interest. For advertising opportunities during the weekend, the most attractive time slot seems to be Midday, closely followed by PM Peak.

Q3. What are the busy routes during a typical weekday?

To analyze this question, we will consider routes with a footfall of more than 500 riders. First, we create a dataframe ‘busy_routes’ that contains the weekday routes with more than 500 riders. Second, we filter this dataframe to keep only the ‘AM Peak’ data. Third, we sort the filtered output.

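# Keep only weekday records ('Weekday' is assumed to be the label used in the 'Day' column)
weekday = df_metro[df_metro['Day']=='Weekday']
# 'Merge' is assumed to be a column holding the entrance-exit route label
busy_routes = weekday[weekday['Riders']>500][['Merge', 'Time', 'Riders']]
peak_am = busy_routes.query('Time=="AM Peak"')
peak_am.sort_values('Riders').tail()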
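
The snippet above relies on a ‘Merge’ column whose creation is not shown in this article. One plausible reading is that it is a route label built from the entrance and exit stations. The sketch below builds such a column on invented data; the " - " separator and the sample values are assumptions, not taken from the dataset.

```python
import pandas as pd

# Made-up records standing in for df_metro (values invented).
sample = pd.DataFrame({
    "Entrance": ["Vienna", "Shady Grove", "Dupont Circle"],
    "Exit": ["Farragut West", "Farragut North", "Smithsonian"],
    "Day": ["Weekday", "Weekday", "Saturday"],
    "Time": ["AM Peak", "AM Peak", "Midday"],
    "Riders": [900, 850, 120],
})

# Assumed route label: entrance and exit joined with " - ".
sample["Merge"] = sample["Entrance"] + " - " + sample["Exit"]

# Same filtering as in the article: weekday routes with more than 500 riders.
weekday = sample[sample["Day"] == "Weekday"]
busy_routes = weekday[weekday["Riders"] > 500][["Merge", "Time", "Riders"]]

print(busy_routes)
```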

We repeat the same steps for ‘PM Peak’:

peak_pm = busy_routes.query('Time=="PM Peak"')
len(peak_pm)
peak_pm.sort_values('Riders').tail()

Takeaway: The routes with high footfall during AM Peak are the same ones with high footfall during PM Peak, for example West Falls Church - Farragut West, Vienna - Farragut West and Shady Grove - Farragut North. This suggests that these are the popular work commute routes: people travel to work in the Farragut area during AM Peak and return to their homes in Vienna, Falls Church or Shady Grove during PM Peak. Advertisers should target these high-traffic commute routes to get the most out of their advertisements and product placements.
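
One way to check this commute pattern directly is to pair each AM Peak route with the PM Peak route that runs in the opposite direction. The sketch below is a possible approach on invented data, assuming separate ‘Entrance’ and ‘Exit’ columns; it is not the method used in the original analysis.

```python
import pandas as pd

# Made-up weekday records (values invented for illustration).
sample = pd.DataFrame({
    "Entrance": ["Vienna", "Farragut West", "Shady Grove",
                 "Farragut North", "Dupont Circle"],
    "Exit": ["Farragut West", "Vienna", "Farragut North",
             "Shady Grove", "Union Station"],
    "Time": ["AM Peak", "PM Peak", "AM Peak", "PM Peak", "AM Peak"],
    "Riders": [900, 880, 850, 820, 600],
})

am = sample[sample["Time"] == "AM Peak"]
pm = sample[sample["Time"] == "PM Peak"]

# Swap Entrance and Exit on the PM rows so that a return trip
# carries the same (Entrance, Exit) pair as the morning trip.
pm_reversed = pm.rename(
    columns={"Entrance": "Exit", "Exit": "Entrance", "Riders": "Riders_pm"}
)[["Entrance", "Exit", "Riders_pm"]]

# Inner join: keep only AM routes that have a matching PM return route.
round_trips = am.merge(pm_reversed, on=["Entrance", "Exit"])

print(round_trips[["Entrance", "Exit", "Riders", "Riders_pm"]])
```

Routes that appear in the result with high values in both ‘Riders’ and ‘Riders_pm’ are the likely two-way commute corridors.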

Q4. What are the popular metro routes during the weekends?

Let us perform an analysis similar to the one we did for the weekday. Since we are dealing with weekend data here, where ridership is lower, we will consider routes with a footfall of more than 200 riders.

saturday = df_metro[df_metro['Day']=='Saturday']
busy_routes_sat = saturday[saturday['Riders']>200][['Merge', 'Time', 'Riders']]
busy_routes_sat.sort_values('Riders').tail()

sunday = df_metro[df_metro['Day']=='Sunday']
busy_routes_sun = sunday[sunday['Riders']>200][['Merge', 'Time', 'Riders']]
busy_routes_sun.sort_values('Riders').tail()

Takeaway: Smithsonian is an extremely popular destination for tourists and city-dwellers alike because of its several museums and its proximity to the White House, the Capitol, national monuments, war memorials, etc. Our analysis tells us that the crowds head out from Crystal City, Pentagon City, Vienna and Franconia to the Smithsonian during Midday, and return during PM Peak. Many of these riders are likely young families with kids, which makes them an ideal audience for companies launching products meant for younger populations, including children.

Q5. As an advertiser, which locations should I target during Late Night?

We will do a similar analysis to identify which metro stations are ideal for putting out advertisements late at night. For ‘Late Night’, we will consider routes with a footfall of more than 50 riders.

# 'Late Night' is a value of the 'Time' column, so filter on 'Time' rather than 'Day'
late_night = df_metro[df_metro['Time']=='Late Night']
busy_routes_latenight = late_night[late_night['Riders']>50][['Merge', 'Time', 'Riders']]
busy_routes_latenight.sort_values('Riders').tail()
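
The same filter-then-sort pattern is repeated for weekdays, Saturdays, Sundays and late nights, so it could be wrapped in a small helper. The sketch below is one possible way to do that: top_routes is a hypothetical name that is not part of the original analysis, the sample rows are invented, and the ‘Merge’, ‘Day’, ‘Time’ and ‘Riders’ columns are assumed to exist as in the snippets above.

```python
import pandas as pd

def top_routes(df, min_riders, day=None, time=None, n=5):
    """Return the n busiest routes with more than min_riders riders.

    Hypothetical helper: optionally filters on the 'Day' and 'Time'
    columns before applying the rider threshold.
    """
    subset = df
    if day is not None:
        subset = subset[subset["Day"] == day]
    if time is not None:
        subset = subset[subset["Time"] == time]
    subset = subset[subset["Riders"] > min_riders]
    ranked = subset.sort_values("Riders", ascending=False)
    return ranked[["Merge", "Time", "Riders"]].head(n)

# Made-up records (route names and rider counts are invented).
sample = pd.DataFrame({
    "Merge": ["Gallery Place - Clarendon", "Dupont Circle - U Street",
              "Vienna - Farragut West", "Crystal City - Smithsonian"],
    "Day": ["Weekday", "Saturday", "Weekday", "Saturday"],
    "Time": ["Late Night", "Late Night", "AM Peak", "Midday"],
    "Riders": [80, 40, 900, 250],
})

late_night_routes = top_routes(sample, min_riders=50, time="Late Night")
saturday_routes = top_routes(sample, min_riders=200, day="Saturday")

print(late_night_routes)
print(saturday_routes)
```

Unlike the snippets in the article, which sort ascending and use tail(), this helper sorts in descending order so that the busiest route comes first.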

Takeaway: Late at night, riders take the metro from popular locations with a buzzing nightlife, such as Gallery Place, Clarendon, Dupont Circle and U Street. Advertisers who want to appeal to this section of the population (which would normally be a younger crowd) should therefore consider targeting these metro stations to grab maximum attention.

Closing remarks: This dataset was fairly straightforward, so we did not spend much time cleaning and wrangling the data. With the given data, we were able to find the sweet spots that should give advertisers the most value for their money. Thanks for reading!

Translated from: https://medium.com/@tanmayee92/identify-profitable-advertising-locations-using-washington-dc-metro-data-a03c5c4fc18f
