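These are study notes covering some of the sample code and all of the assignment code from Chapters 12, 13, 15, and 16 of Python for Everybody, the Coursera specialization offered by the University of Michigan. All of the assignment code has passed the course's tests.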

I. Chapter 12

0. Sample Code

# To run this, download the BeautifulSoup zip file
# http://www.py4e.com/code3/bs4.zip
# and unzip it in the same directory as this file

from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")

# Retrieve all of the anchor tags
tags = soup('a')
for tag in tags:
    # Look at the parts of a tag
    print('TAG:', tag)
    print('URL:', tag.get('href', None))
    print('Contents:', tag.contents[0])
    print('Attrs:', tag.attrs)

1. Scraping HTML Data with BeautifulSoup

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup
import re
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter-')
# Pass the context so the SSL settings above actually take effect
html = urllib.request.urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, 'html.parser')

total = 0
count = 0
tags = soup('span')
for tag in tags:
    text = tag.contents[0]  # named text rather than str to avoid shadowing the builtin
    lst = re.findall('[0-9]+', text)
    try:
        num = int(lst[0])
        count += 1
        total += num
    except IndexError:
        # No digits in this span, so skip it
        continue

print('Count', count)
print('Sum', total)

2. Following Links in HTML Using BeautifulSoup

from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input("Enter URL: ")
count = int(input("Enter count: "))
position = int(input("Enter position: ")) - 1

for i in range(count):
    html = urlopen(url, context=ctx).read()
    soup = BeautifulSoup(html, "html.parser")
    lst = list()
    tags = soup('a')
    for tag in tags:
        lst.append(tag.get('href', None))
    # Follow the link at the requested position on each page
    url = lst[position]
    print("Retrieving: ", url)

II. Chapter 13

1. Extracting Data from XML

import urllib.request, urllib.parse, urllib.error
import ssl
import xml.etree.ElementTree as ET

total = 0
url = input("Enter location: ")
print("Retrieving ", url)

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

uh = urllib.request.urlopen(url, context=ctx)
data = uh.read()
print('Retrieved', len(data), 'characters')
tree = ET.fromstring(data)
counts = tree.findall('.//count')
print('Count: ', len(counts))
lst = tree.findall('comments/comment')
for i in lst:
    num = int(i.find('count').text)
    total += num
print("Sum: ", total)
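The parsing step can be tried without a network connection. The following is a minimal sketch, assuming a small inline XML string (`sample_xml`, invented here for illustration) with the same `comments/comment/count` layout that the code above reads; it is not the course's actual data, and `sum_counts` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, shaped like the structure the code above reads
sample_xml = """
<commentinfo>
  <comments>
    <comment><name>Ann</name><count>97</count></comment>
    <comment><name>Bob</name><count>3</count></comment>
  </comments>
</commentinfo>
"""

def sum_counts(xml_text):
    """Return (number of <count> elements, sum of their integer values)."""
    tree = ET.fromstring(xml_text)
    counts = tree.findall('.//count')
    return len(counts), sum(int(c.text) for c in counts)

n, total = sum_counts(sample_xml)
print('Count:', n)
print('Sum:', total)
```

Using `.//count` finds every `count` element at any depth, so the sketch does not depend on the exact nesting of the document.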

2. Extracting Data from JSON

import urllib.request, urllib.parse, urllib.error
import ssl
import json

total = 0
url = input("Enter location: ")
print("Retrieving ", url)

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

uh = urllib.request.urlopen(url, context=ctx)
data = uh.read()
print('Retrieved', len(data), 'characters')
info = json.loads(data)
print('Count:', len(info['comments']))
# Loop variable named comment rather than dict to avoid shadowing the builtin
for comment in info['comments']:
    item = comment['count']
    total += int(item)
print('Sum: ', total)

3. Using the GeoJSON API

import urllib.request, urllib.parse, urllib.error
import json
import ssl

api_key = False
if api_key is False:
    api_key = 42
    serviceurl = 'http://py4e-data.dr-chuck.net/json?'
else:
    serviceurl = 'https://maps.googleapis.com/maps/api/geocode/json?'

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

while True:
    address = input('Enter location: ')
    if len(address) < 1: break

    parms = dict()
    parms['address'] = address
    if api_key is not False: parms['key'] = api_key
    url = serviceurl + urllib.parse.urlencode(parms)

    print('Retrieving', url)
    uh = urllib.request.urlopen(url, context=ctx)
    data = uh.read().decode()
    print('Retrieved', len(data), 'characters')

    try:
        js = json.loads(data)
    except:
        js = None

    if not js or 'status' not in js or js['status'] != 'OK':
        print('==== Failure To Retrieve ====')
        continue

    place_id = js['results'][0]['place_id']
    print('Place id', place_id)

III. Chapter 15

1. Counting Email in a Database

① Sample Code

import sqlite3

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()

cur.execute('DROP TABLE IF EXISTS Counts')

cur.execute('''
CREATE TABLE Counts (email TEXT, count INTEGER)''')

fname = input('Enter file name: ')
if (len(fname) < 1): fname = 'mbox-short.txt'
fh = open(fname)
for line in fh:
    if not line.startswith('From: '): continue
    pieces = line.split()
    email = pieces[1]
    cur.execute('SELECT count FROM Counts WHERE email = ? ', (email,))
    row = cur.fetchone()
    if row is None:
        cur.execute('''INSERT INTO Counts (email, count)
                VALUES (?, 1)''', (email,))
    else:
        cur.execute('UPDATE Counts SET count = count + 1 WHERE email = ?',
                    (email,))
    conn.commit()

# https://www.sqlite.org/lang_select.html
sqlstr = 'SELECT email, count FROM Counts ORDER BY count DESC LIMIT 10'

for row in cur.execute(sqlstr):
    print(str(row[0]), row[1])

cur.close()

② Assignment

# Chapter 15 -- Counting Email in a Database
# Input file: mbox.txt / output file: orgdb.sqlite
import sqlite3
import re

conn = sqlite3.connect('orgdb.sqlite')
cur = conn.cursor()
cur.execute('DROP TABLE IF EXISTS Counts')
cur.execute('''
CREATE TABLE Counts (org TEXT, count INTEGER)''')

fname = input('Enter file name: ')  # fname >>> mbox.txt
fh = open(fname)
for line in fh:
    if not line.startswith('From: '): continue
    pieces = line.split()
    email = pieces[1]
    # Keep only the domain part after the @ sign
    org = re.findall('@(.+)', email)[0]
    cur.execute('SELECT count FROM Counts WHERE org = ? ', (org,))
    row = cur.fetchone()
    if row is None:
        cur.execute('''INSERT INTO Counts (org, count)
                VALUES (?, 1)''', (org,))
    else:
        cur.execute('UPDATE Counts SET count = count + 1 WHERE org = ?',
                    (org,))
# Commit once after the loop; committing on every row is much slower on the full mbox.txt
conn.commit()

# The code below is only used to verify the result
sqlstr = 'SELECT org, count FROM Counts ORDER BY count DESC'
for row in cur.execute(sqlstr):
    print(str(row[0]), row[1])

cur.close()
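The select-then-insert-or-update pattern used above can be checked without `mbox.txt`. The sketch below is a minimal illustration, assuming a handful of invented sample lines (`sample_lines`) and an in-memory SQLite database; `count_orgs` is a hypothetical helper name, not part of the course code.

```python
import re
import sqlite3

# Hypothetical sample lines, shaped like the 'From: ' lines in mbox.txt
sample_lines = [
    'From: alice@example.edu',
    'From: bob@example.org',
    'From: carol@example.edu',
    'Subject: not a sender line',
]

def count_orgs(lines):
    """Count sender domains using an in-memory SQLite database."""
    conn = sqlite3.connect(':memory:')
    cur = conn.cursor()
    cur.execute('CREATE TABLE Counts (org TEXT, count INTEGER)')
    for line in lines:
        if not line.startswith('From: '):
            continue
        email = line.split()[1]
        org = re.findall('@(.+)', email)[0]
        cur.execute('SELECT count FROM Counts WHERE org = ?', (org,))
        if cur.fetchone() is None:
            # First time this domain is seen
            cur.execute('INSERT INTO Counts (org, count) VALUES (?, 1)', (org,))
        else:
            cur.execute('UPDATE Counts SET count = count + 1 WHERE org = ?', (org,))
    conn.commit()
    rows = cur.execute(
        'SELECT org, count FROM Counts ORDER BY count DESC, org').fetchall()
    conn.close()
    return rows

print(count_orgs(sample_lines))
```

The secondary sort on `org` is an addition in this sketch so that ties come back in a predictable order.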

2. Multi-Table Database - Tracks

① Sample Code

import xml.etree.ElementTree as ET
import sqlite3

conn = sqlite3.connect('trackdb.sqlite')
cur = conn.cursor()

# Make some fresh tables using executescript()
cur.executescript('''
DROP TABLE IF EXISTS Artist;
DROP TABLE IF EXISTS Album;
DROP TABLE IF EXISTS Track;

CREATE TABLE Artist (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name    TEXT UNIQUE
);

CREATE TABLE Album (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    artist_id  INTEGER,
    title   TEXT UNIQUE
);

CREATE TABLE Track (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title TEXT  UNIQUE,
    album_id  INTEGER,
    len INTEGER, rating INTEGER, count INTEGER
);
''')

fname = input('Enter file name: ')
if ( len(fname) < 1 ) : fname = 'Library.xml'

# <key>Track ID</key><integer>369</integer>
# <key>Name</key><string>Another One Bites The Dust</string>
# <key>Artist</key><string>Queen</string>
def lookup(d, key):
    found = False
    for child in d:
        if found : return child.text
        if child.tag == 'key' and child.text == key :
            found = True
    return None

stuff = ET.parse(fname)
all = stuff.findall('dict/dict/dict')
print('Dict count:', len(all))
for entry in all:
    if ( lookup(entry, 'Track ID') is None ) : continue

    name = lookup(entry, 'Name')
    artist = lookup(entry, 'Artist')
    album = lookup(entry, 'Album')
    count = lookup(entry, 'Play Count')
    rating = lookup(entry, 'Rating')
    length = lookup(entry, 'Total Time')

    if name is None or artist is None or album is None :
        continue

    print(name, artist, album, count, rating, length)

    cur.execute('''INSERT OR IGNORE INTO Artist (name)
        VALUES ( ? )''', ( artist, ) )
    cur.execute('SELECT id FROM Artist WHERE name = ? ', (artist, ))
    artist_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Album (title, artist_id)
        VALUES ( ?, ? )''', ( album, artist_id ) )
    cur.execute('SELECT id FROM Album WHERE title = ? ', (album, ))
    album_id = cur.fetchone()[0]

    cur.execute('''INSERT OR REPLACE INTO Track
        (title, album_id, len, rating, count)
        VALUES ( ?, ?, ?, ?, ? )''',
        ( name, album_id, length, rating, count ) )

    conn.commit()
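The `lookup` helper relies on the export format storing each value in the element immediately after its `<key>`. The sketch below is a small illustration of that behavior, assuming a single hand-written entry (`sample_entry`, invented here) in the key/value layout quoted in the comments above rather than a real `Library.xml`.

```python
import xml.etree.ElementTree as ET

# Hypothetical track entry in the key/value layout quoted in the comments above
sample_entry = ET.fromstring(
    '<dict>'
    '<key>Track ID</key><integer>369</integer>'
    '<key>Name</key><string>Another One Bites The Dust</string>'
    '<key>Artist</key><string>Queen</string>'
    '</dict>'
)

def lookup(d, key):
    """Return the text of the element that follows <key>key</key>, or None."""
    found = False
    for child in d:
        if found:
            # The previous child was the matching key, so this is its value
            return child.text
        if child.tag == 'key' and child.text == key:
            found = True
    return None

print(lookup(sample_entry, 'Name'))
print(lookup(sample_entry, 'Album'))
```

A missing key returns `None`, which is why the main loop can skip entries that lack a name, artist, or album.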

② Assignment

import xml.etree.ElementTree as ET
import sqlite3

conn = sqlite3.connect('trackdb.sqlite')
cur = conn.cursor()

cur.executescript('''
DROP TABLE IF EXISTS Artist;
DROP TABLE IF EXISTS Genre;
DROP TABLE IF EXISTS Album;
DROP TABLE IF EXISTS Track;

CREATE TABLE Artist (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name    TEXT UNIQUE
);

CREATE TABLE Genre (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name    TEXT UNIQUE
);

CREATE TABLE Album (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    artist_id  INTEGER,
    title   TEXT UNIQUE
);

CREATE TABLE Track (
    id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title TEXT  UNIQUE,
    album_id  INTEGER,
    genre_id  INTEGER,
    len INTEGER, rating INTEGER, count INTEGER
);
''')

fname = 'Library.xml'

def lookup(d, key):
    found = False
    for child in d:
        if found : return child.text
        if child.tag == 'key' and child.text == key :
            found = True
    return None

stuff = ET.parse(fname)
all = stuff.findall('dict/dict/dict')
for entry in all:
    if ( lookup(entry, 'Track ID') is None ) : continue

    name = lookup(entry, 'Name')
    artist = lookup(entry, 'Artist')
    genre = lookup(entry, 'Genre')
    album = lookup(entry, 'Album')
    track = lookup(entry, 'Track')
    count = lookup(entry, 'Play Count')
    rating = lookup(entry, 'Rating')
    length = lookup(entry, 'Total Time')

    if name is None or artist is None or genre is None or album is None :
        continue

    cur.execute('''INSERT OR IGNORE INTO Artist (name)
        VALUES ( ? )''', ( artist, ) )
    cur.execute('SELECT id FROM Artist WHERE name = ? ', (artist, ))
    artist_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Genre (name)
        VALUES ( ? )''', ( genre, ) )
    cur.execute('SELECT id FROM Genre WHERE name = ? ', (genre, ))
    genre_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Album (title, artist_id)
        VALUES ( ?, ? )''', ( album, artist_id ) )
    cur.execute('SELECT id FROM Album WHERE title = ? ', (album, ))
    album_id = cur.fetchone()[0]

    cur.execute('''INSERT OR REPLACE INTO Track
        (title, album_id, genre_id, len, rating, count)
        VALUES ( ?, ?, ?, ?, ?, ? )''',
        ( name, album_id, genre_id, length, rating, count ) )

    conn.commit()

# SQL statement used to check the result:
sqlstr = '''SELECT Track.title, Artist.name, Album.title, Genre.name
    FROM Track JOIN Genre JOIN Album JOIN Artist
    ON Track.genre_id = Genre.id AND Track.album_id = Album.id
        AND Album.artist_id = Artist.id
    ORDER BY Artist.name, Track.title LIMIT 3'''

for row in cur.execute(sqlstr):
    print(str(row[0]), row[1], row[2], row[3])

cur.close()

3. Many Students in Many Courses

① Sample Code

import json
import sqlite3

conn = sqlite3.connect('rosterdb.sqlite')
cur = conn.cursor()

# Do some setup
cur.executescript('''
DROP TABLE IF EXISTS User;
DROP TABLE IF EXISTS Member;
DROP TABLE IF EXISTS Course;

CREATE TABLE User (
    id     INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name   TEXT UNIQUE
);

CREATE TABLE Course (
    id     INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title  TEXT UNIQUE
);

CREATE TABLE Member (
    user_id     INTEGER,
    course_id   INTEGER,
    role        INTEGER,
    PRIMARY KEY (user_id, course_id)
)
''')

fname = input('Enter file name: ')
if len(fname) < 1:
    fname = 'roster_data_sample.json'

# [
#   [ "Charley", "si110", 1 ],
#   [ "Mea", "si110", 0 ],

str_data = open(fname).read()
json_data = json.loads(str_data)

for entry in json_data:

    name = entry[0]
    title = entry[1]

    print((name, title))

    cur.execute('''INSERT OR IGNORE INTO User (name)
        VALUES ( ? )''', ( name, ) )
    cur.execute('SELECT id FROM User WHERE name = ? ', (name, ))
    user_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Course (title)
        VALUES ( ? )''', ( title, ) )
    cur.execute('SELECT id FROM Course WHERE title = ? ', (title, ))
    course_id = cur.fetchone()[0]

    cur.execute('''INSERT OR REPLACE INTO Member
        (user_id, course_id) VALUES ( ?, ? )''',
        ( user_id, course_id ) )

    conn.commit()

② Assignment

import json
import sqlite3

conn = sqlite3.connect('rosterdb.sqlite')
cur = conn.cursor()

cur.executescript('''
DROP TABLE IF EXISTS User;
DROP TABLE IF EXISTS Member;
DROP TABLE IF EXISTS Course;

CREATE TABLE User (
    id     INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    name   TEXT UNIQUE
);

CREATE TABLE Course (
    id     INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    title  TEXT UNIQUE
);

CREATE TABLE Member (
    user_id     INTEGER,
    course_id   INTEGER,
    role        INTEGER,
    PRIMARY KEY (user_id, course_id)
)
''')

fname = 'roster_data.json'

str_data = open(fname).read()
json_data = json.loads(str_data)

for entry in json_data:

    name = entry[0]
    title = entry[1]
    role = entry[2]

    cur.execute('''INSERT OR IGNORE INTO User (name)
        VALUES ( ? )''', ( name, ) )
    cur.execute('SELECT id FROM User WHERE name = ? ', (name, ))
    user_id = cur.fetchone()[0]

    cur.execute('''INSERT OR IGNORE INTO Course (title)
        VALUES ( ? )''', ( title, ) )
    cur.execute('SELECT id FROM Course WHERE title = ? ', (title, ))
    course_id = cur.fetchone()[0]

    cur.execute('''INSERT OR REPLACE INTO Member
        (user_id, course_id, role) VALUES ( ?, ?, ? )''',
        ( user_id, course_id, role ) )

    conn.commit()

sqlstr1 = '''SELECT User.name, Course.title, Member.role
    FROM User JOIN Member JOIN Course
    ON User.id = Member.user_id AND Member.course_id = Course.id
    ORDER BY User.name DESC, Course.title DESC, Member.role DESC LIMIT 2'''

for row in cur.execute(sqlstr1):
    print(str(row[0]), row[1], row[2])

sqlstr2 = '''SELECT 'XYZZY' || hex(User.name || Course.title || Member.role ) AS X
    FROM User JOIN Member JOIN Course
    ON User.id = Member.user_id AND Member.course_id = Course.id
    ORDER BY X LIMIT 1'''

for row in cur.execute(sqlstr2):
    print(str(row[0]))

cur.close()
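The many-to-many layout (two entity tables plus a `Member` junction table) can be exercised on a few rows in memory. This is a minimal sketch, assuming an invented three-row roster (`sample_roster`) and a simplified schema without `AUTOINCREMENT`; `load_roster` is a hypothetical helper name, not part of the course code.

```python
import sqlite3

# Hypothetical roster rows: [name, course title, role]
sample_roster = [
    ['Charley', 'si110', 1],
    ['Mea', 'si110', 0],
    ['Mea', 'si106', 0],
]

def load_roster(rows):
    """Load roster rows into an in-memory database and return the connection."""
    conn = sqlite3.connect(':memory:')
    cur = conn.cursor()
    cur.executescript('''
    CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE Course (id INTEGER PRIMARY KEY, title TEXT UNIQUE);
    CREATE TABLE Member (
        user_id INTEGER, course_id INTEGER, role INTEGER,
        PRIMARY KEY (user_id, course_id)
    );
    ''')
    for name, title, role in rows:
        # INSERT OR IGNORE keeps one row per unique name or title
        cur.execute('INSERT OR IGNORE INTO User (name) VALUES (?)', (name,))
        cur.execute('SELECT id FROM User WHERE name = ?', (name,))
        user_id = cur.fetchone()[0]
        cur.execute('INSERT OR IGNORE INTO Course (title) VALUES (?)', (title,))
        cur.execute('SELECT id FROM Course WHERE title = ?', (title,))
        course_id = cur.fetchone()[0]
        cur.execute(
            'INSERT OR REPLACE INTO Member (user_id, course_id, role) VALUES (?, ?, ?)',
            (user_id, course_id, role))
    conn.commit()
    return conn

conn = load_roster(sample_roster)
rows = conn.execute('''
    SELECT User.name, Course.title, Member.role
    FROM User JOIN Member JOIN Course
    ON User.id = Member.user_id AND Member.course_id = Course.id
    ORDER BY User.name, Course.title''').fetchall()
print(rows)
```

Each person and each course is stored once, and only the `Member` rows grow with the number of enrollments.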

IV. Chapter 16

1. Step 1

① geoload

import urllib.request, urllib.parse, urllib.error
import http
import sqlite3
import json
import time
import ssl
import sys

api_key = False
# If you have a Google Places API key, enter it here

if api_key is False:
    api_key = 42
    serviceurl = "http://py4e-data.dr-chuck.net/json?"
else:
    serviceurl = "https://maps.googleapis.com/maps/api/geocode/json?"

# Additional detail for urllib
# http.client.HTTPConnection.debuglevel = 1

conn = sqlite3.connect('geodata.sqlite')
cur = conn.cursor()

cur.execute('''
CREATE TABLE IF NOT EXISTS Locations (address TEXT, geodata TEXT)''')

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

fh = open("where.data")
count = 0
for line in fh:
    if count > 200 :
        print('Retrieved 200 locations, restart to retrieve more')
        break

    address = line.strip()
    print('')
    cur.execute("SELECT geodata FROM Locations WHERE address= ?",
        (memoryview(address.encode()), ))

    try:
        data = cur.fetchone()[0]
        print("Found in database ", address)
        continue
    except:
        pass

    parms = dict()
    parms["address"] = address
    if api_key is not False: parms['key'] = api_key
    url = serviceurl + urllib.parse.urlencode(parms)

    print('Retrieving', url)
    uh = urllib.request.urlopen(url, context=ctx)
    data = uh.read().decode()
    print('Retrieved', len(data), 'characters', data[:20].replace('\n', ' '))
    count = count + 1

    try:
        js = json.loads(data)
    except:
        print(data)  # We print in case unicode causes an error
        continue

    if 'status' not in js or (js['status'] != 'OK' and js['status'] != 'ZERO_RESULTS') :
        print('==== Failure To Retrieve ====')
        print(data)
        break

    cur.execute('''INSERT INTO Locations (address, geodata)
            VALUES ( ?, ? )''', (memoryview(address.encode()), memoryview(data.encode()) ) )
    conn.commit()
    if count % 10 == 0 :
        print('Pausing for a bit...')
        time.sleep(5)

print("Run geodump.py to read the data from the database so you can visualize it on a map.")

2. Step 2

② geodump

import sqlite3
import json
import codecs

conn = sqlite3.connect('geodata.sqlite')
cur = conn.cursor()

cur.execute('SELECT * FROM Locations')
fhand = codecs.open('where.js', 'w', "utf-8")
fhand.write("myData = [\n")
count = 0
for row in cur :
    data = str(row[1].decode())
    try: js = json.loads(str(data))
    except: continue

    if not('status' in js and js['status'] == 'OK') : continue

    lat = js["results"][0]["geometry"]["location"]["lat"]
    lng = js["results"][0]["geometry"]["location"]["lng"]
    if lat == 0 or lng == 0 : continue
    where = js['results'][0]['formatted_address']
    where = where.replace("'", "")
    try :
        print(where, lat, lng)

        count = count + 1
        if count > 1 : fhand.write(",\n")
        output = "["+str(lat)+","+str(lng)+", '"+where+"']"
        fhand.write(output)
    except:
        continue

fhand.write("\n];\n")
cur.close()
fhand.close()
print(count, "records written to where.js")
print("Open where.html to view the data in a browser")
