21.05.2021

How to learn German while staying up to date on all things reef aquariums? Korallenriff. A great blog with a ton of original content on reefing. From how-to videos to product releases, they cover it all. It is a project by ankerundmeehr, accompanied by a YouTube channel, a magazine, and a saltwater wiki.

I struggled to stay up to date with their content. Unable to find an RSS feed for the blog, I decided to scrape it and build my own. Meet korallenriff-feed, a Python script that runs every 6 hours as an AWS Lambda function. The script itself is simple, using:

  • requests for HTTP communication
  • BeautifulSoup to parse HTML and scrape data
  • FeedGenerator to create an RSS XML file from the scraped data
  • boto3 to upload that RSS XML file to a publicly accessible S3 bucket

The requirements for this project led me to Lambda because I didn’t want something always running, wanted Python, and wanted an easy deploy flow. Normally for me this means Serverless or Terraform. However, I’ve not been super happy with either lately. Terraform is not elegant when it comes to simple Lambdas, and Serverless is too much JS for my taste. The search for something different led me to Zappa.

Zappa is “Serverless Python”. In other words, an easy way to deploy event-driven Python code as Lambda functions plus API gateways. In my case I didn’t need a gateway; luckily, it was easy to disable. I found the Zappa development flow smooth. Within an hour of getting started I had a hello world Lambda up and running. I found the scheduling straightforward, the permissions relaxed and easy to develop with, and debugging via CloudWatch logs / console logs nice. It was really refreshing. Now that it is set up, I can sit back and enjoy the articles without much worry about maintenance.
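
For a rough idea of what such a setup can look like, here is a sketch that writes a `zappa_settings.json` with the API gateway turned off and a 6-hour schedule. The settings keys are ones Zappa documents, but the values are my assumptions and not the project's real config: the stage name, runtime, region, deployment bucket, and the `feed.handler` module path are all hypothetical.

```python
import json

# Illustrative values only; adjust stage name, region, bucket and module path.
zappa_settings = {
    "production": {
        "project_name": "korallenriff-feed",
        "runtime": "python3.8",                      # assumed runtime
        "aws_region": "eu-central-1",                # assumed region
        "s3_bucket": "example-zappa-deployments",    # hypothetical deploy bucket
        "apigateway_enabled": False,                 # no HTTP endpoint needed
        "keep_warm": False,                          # no keep-warm pings for a cron job
        "events": [
            {
                "function": "feed.handler",          # hypothetical module.function
                "expression": "rate(6 hours)",       # run every 6 hours
            }
        ],
    }
}


def write_settings(path="zappa_settings.json"):
    """Serialize the settings dict to the JSON file Zappa reads."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(zappa_settings, fh, indent=4)


if __name__ == "__main__":
    write_settings()
```

With a file like this in place, `zappa deploy production` should create the function and its schedule, and `zappa tail production` streams the CloudWatch logs to the console.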

rss reader app


The feed is public here: https://travisshears.com/personal/korallenriff-feed/rss.xml

Full code is in the SourceHut repo.

- [ tech, lambda, aws, python, reeftank ]