Introduction of Feedy Masashi Shibata #pyconapac #pyconapac2016

Page 1: Introduction of Feedy

Introduction of Feedy

Masashi Shibata

#pyconapac #pyconapac2016

Page 2: Introduction of Feedy

Masashi Shibata

Student Programmer in Japan

c-bata

PyCon JP staff

c_bata_en

Page 3: Introduction of Feedy

RSS Feed

<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
  <channel>
    <title>c-bata’s weblog</title>
    <link>http://example.com</link>
    <description>c-bata’s weblog</description>
    <item>
      <title>Introduction of Feedy</title>
      <link>http://example.com/foo</link>
      <description>short description</description>
    </item>
    <item>
      <title>XML Tutorial</title>
      <link>http://example.com/bar</link>
      <description>Description about bar</description>
    </item>
    :

RSS Feed XML

Each feed item consists of:

- title
- url
- description
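As a rough illustration of these three fields, the sketch below reads them from an RSS 2.0 document with Python's standard xml.etree.ElementTree module. The sample XML mirrors the slide above; the name parse_items is an assumption made for this example and is not part of Feedy.

```python
import xml.etree.ElementTree as ET

# Sample feed modelled on the slide (closed properly so it parses).
SAMPLE_RSS = """<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
  <channel>
    <title>c-bata's weblog</title>
    <link>http://example.com</link>
    <description>c-bata's weblog</description>
    <item>
      <title>Introduction of Feedy</title>
      <link>http://example.com/foo</link>
      <description>short description</description>
    </item>
    <item>
      <title>XML Tutorial</title>
      <link>http://example.com/bar</link>
      <description>Description about bar</description>
    </item>
  </channel>
</rss>
"""


def parse_items(rss_text):
    """Return one dict per <item> with its title, url and description.

    parse_items is a hypothetical helper for illustration only.
    """
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE_RSS):
        print(entry["title"], entry["url"])
```

Note that the feed itself carries only text fields and a link, which is why images need a second step.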

Page 4: Introduction of Feedy

Collecting images from RSS Feed

<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
  <channel>
    <title>c-bata’s weblog</title>
    <link>http://example.com</link>
    <description>c-bata’s weblog</description>
    <item>
      <title>Introduction of Feedy</title>
      <link>http://example.com/foo</link>
      <description>Description about foo</description>
    </item>
    <item>
      <title>XML Tutorial</title>
      <link>http://example.com/bar</link>
      <description>Description about bar</description>
    </item>
    :

RSS Feed XML

<html lang=en>
  <head>
    <title>introduction of feedy | c-bata’s weblog</title>
  </head>
  <body>
    <h1>Introduction of Feedy</h1>
    :
  </body>
</html>

Article 1

<html lang=en>
  <head>
    <title>introduction of feedy | c-bata’s weblog</title>
  </head>
  <body>
    <h1>Introduction of Feedy</h1>
    :
  </body>
</html>

Article 2

FEED ITEMS

If you want to collect images, you have to fetch the HTML of each article.
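To make that second step concrete, here is a minimal sketch that pulls image URLs out of one article's HTML. It uses only the standard library (html.parser and urllib.parse) and works on an in-memory string, so no network access is needed. ImageCollector, collect_images and the sample HTML are assumptions for illustration; in practice the HTML would come from fetching each item's link.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects the src of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # skip <img> tags that have no src attribute
            self.images.append(urljoin(self.base_url, src))


def collect_images(html, base_url):
    """Hypothetical helper: return absolute image URLs found in html."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.images


# Made-up article body, shaped like the "Article 1" HTML above.
ARTICLE_HTML = """<html lang=en>
<head><title>introduction of feedy | c-bata's weblog</title></head>
<body>
<h1>Introduction of Feedy</h1>
<img src="/images/logo.png">
<img src="http://example.com/images/photo.jpg" alt="photo">
<img alt="no source">
</body>
</html>"""

if __name__ == "__main__":
    for url in collect_images(ARTICLE_HTML, "http://example.com/foo"):
        print(url)
```

Repeating this for every item in every feed is the bookkeeping that a small framework can take over.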

Page 5: Introduction of Feedy
Page 6: Introduction of Feedy

Scrapy is a little complex when you just want to fetch RSS feed items.

http://doc.scrapy.org/en/1.0/topics/architecture.html

Page 7: Introduction of Feedy

$ pip install feedy

Page 8: Introduction of Feedy

Usage

from feedy import Feedy

app = Feedy('feedy.dat')

@app.add('<RSS_FEED_URL>')
def func(info, body):
    # do something
    pass

if __name__ == '__main__':
    app.run()
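For readers unfamiliar with this decorator style, the toy class below sketches how a registry like app.add could work in principle. It is not Feedy's implementation: MiniFeedy, fake_fetch and the shape of info (a dict with a "title" key) are all assumptions for illustration, and nothing is downloaded.

```python
class MiniFeedy:
    """Toy stand-in that only mimics decorator-style registration."""

    def __init__(self):
        # Maps a feed URL to the callbacks registered for it.
        self.handlers = {}

    def add(self, feed_url):
        """Return a decorator that registers a callback for feed_url."""
        def decorator(func):
            self.handlers.setdefault(feed_url, []).append(func)
            return func  # leave the original function usable
        return decorator

    def run(self, fetch_entries):
        """Call every callback once per entry.

        fetch_entries(feed_url) must yield (info, body) pairs; passing it
        in keeps this sketch free of any network code.
        """
        for feed_url, funcs in self.handlers.items():
            for info, body in fetch_entries(feed_url):
                for func in funcs:
                    func(info, body)


app = MiniFeedy()
seen = []


@app.add("http://example.com/rss")
def remember(info, body):
    # Record the title and the size of the HTML body.
    seen.append((info["title"], len(body)))


def fake_fetch(feed_url):
    # Hypothetical fetcher that yields one hard-coded entry.
    yield {"title": "Introduction of Feedy"}, "<html></html>"


app.run(fake_fetch)
```

The point of the pattern is that each feed gets its own small function, and the loop over feeds and entries lives in one place.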

Page 9: Introduction of Feedy

with BeautifulSoup4

from feedy import Feedy
from bs4 import BeautifulSoup

app = Feedy('feedy.dat')

@app.add('<RSS_FEED_URL>')
def func(info, body):
    soup = BeautifulSoup(body, "html.parser")
    # do something

if __name__ == '__main__':
    app.run()

body: HTML body of each feed item

Page 10: Introduction of Feedy

Collecting images using Feedy

from feedy import Feedy
from bs4 import BeautifulSoup

app = Feedy('feedy.dat')

@app.add('http://rss.cnn.com/rss/edition.rss')
def cnn(info, body):
    soup = BeautifulSoup(body, "html.parser")
    # print the URL of every <img class="foo"> in the article
    for x in soup.find_all('img', attrs={'class': 'foo'}):
        print(x['src'])

if __name__ == '__main__':
    app.run()

Page 11: Introduction of Feedy

Adding other RSS feeds

@app.add('http://site1.com/rss')
def site1(info, body):
    soup = BeautifulSoup(body, "html.parser")
    for x in soup.find_all('img', attrs={'class': 'foo'}):
        print(x['src'])

@app.add('http://any-other-website.com/rss')
def site2(info, body):
    soup = BeautifulSoup(body, "html.parser")
    for x in soup.find_all('img', attrs={'class': 'bar'}):
        print(x['src'])

Page 12: Introduction of Feedy

Plugins

Page 13: Introduction of Feedy

Getting social share counts

from feedy import Feedy
from feedy_plugins import social_share_plugin

app = Feedy(store='feedy.dat', ignore_fetched=True)
app.install(social_share_plugin)

@app.add('http://rss.cnn.com/rss/edition.rss')
def cnn_shared(info, body, social_count):
    article = {
        'pocket': social_count['pocket_count'],
        'facebook': social_count['facebook_count'],
    }
    print(article)
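The extra social_count argument suggests that a plugin wraps each callback and passes it one more keyword argument. The sketch below shows that idea in isolation. It is an assumption about the general pattern, not the code of feedy_plugins: make_count_plugin, stub_lookup and the "link" key in info are hypothetical, and the counts are placeholder zeros rather than real share data.

```python
from functools import wraps


def make_count_plugin(count_lookup):
    """Build a plugin that injects a social_count keyword argument.

    count_lookup(url) is any callable returning a dict of counts.
    """
    def plugin(callback):
        @wraps(callback)
        def wrapper(info, body, **kwargs):
            # Look up counts for the article URL and hand them on.
            kwargs["social_count"] = count_lookup(info["link"])
            return callback(info, body, **kwargs)
        return wrapper
    return plugin


def stub_lookup(url):
    # Placeholder values only; a real plugin would query a service.
    return {"pocket_count": 0, "facebook_count": 0}


plugin = make_count_plugin(stub_lookup)


@plugin
def show_counts(info, body, social_count):
    return {
        "pocket": social_count["pocket_count"],
        "facebook": social_count["facebook_count"],
    }


result = show_counts({"link": "http://example.com/foo"}, "<html></html>")
```

Keeping the lookup behind a plugin means the callback stays a plain function of its arguments and can be tested with a stub, as done here.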

Page 14: Introduction of Feedy

github.com/c-bata/feedy

Pull requests and issues are welcome :)

More details are available on GitHub.

Page 15: Introduction of Feedy

Thank you!