Pycast

A collection of bits of code that allow derivation of flows, aimed at simulating breach scenarios for quick assessments of risk.

Development

When moving back to Linux from OS X, I couldn't find a video podcast tool I liked that ran on Linux. Having started to work with Python at work, I thought I might be able to knock up something to scrape video podcast feeds. This was one of the first Python scripts I wrote; it's a little messy and doesn't work with every video podcast, but I like it! It requires the feedparser library to be installed and a text file containing the RSS URLs of the podcasts you wish to download. It also uses wget, which isn't included in all OS X releases; it can be obtained via the MacPorts utility. The script keeps a log of everything it has downloaded, so it won't try to download the same episode twice! See below for a link to download all the files and a syntax-highlighted view of the code.
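As a sketch of the setup (the URLs and the pycast.py file name here are placeholders of my choosing, not anything the script requires), feeds.txt holds one RSS feed URL per line and sits in the same directory as the script:

				http://example.com/videoshow/feed.rss
				http://example.com/another-show/rss.xml

Running python pycast.py from that directory then checks each feed in turn.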

			
				#!/usr/bin/env python
				# Pycast - downloads the latest episode of each video podcast
				# feed listed in feeds.txt, logging what it has already fetched

				import feedparser
				import os
				import os.path
				import urllib2

				# Todo
				# Check to see if there are more than X episodes in folder - if so, delete oldest
				# Prompt user to ask how many episodes to keep
				# Turn elements of the script into functions

				#create lists
				current_file = []
				log_file = []

				# Check for log file - create if not present
				if os.path.exists('log.txt'):
				    print "log already exists"
				else:
				    open('log.txt', 'w').close()


				# Check for feed file - create if not present
				if os.path.exists('feeds.txt'):
				    print "feeds already exists"
				else:
				    open('feeds.txt', 'w').close()

				# Open feed list - must be in same directory as script
				openfile = open('feeds.txt', 'r')
				feeds = openfile.readlines()
				openfile.close()

				#remember the starting dir - we chdir back to it on each pass through the loop
				script_dir = os.getcwd()

				#Loop through feeds
				for line in feeds:

				    ###parse feed###
				    d = feedparser.parse(line)
				    title = d['feed']['title']
				    #find the url of the latest episode and store it in 'current'
				    current = d.entries[0].links[1].href
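				    # (assumes the media enclosure is the second link in the entry -
				    # this is why the script doesn't work with every video podcast)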

				    ###check for existing file in directory###

				    #read current url into list then split list at last entry for file name
				    current_file = current.split('/')
				    #file name = 'current_episode'
				    current_episode = current_file[-1]
				    ###check against the log file and record new episodes###

				    #read log file
				    os.chdir(script_dir)
				    f = open('log.txt', 'r')
				    log_file = f.readlines()
				    f.close()


				    #check proposed downloads against log file
				    if current_episode + '\n' in log_file:
				        print 'Current episode of ' + title + ' already downloaded'

				    else:
				        print 'New episode of ' + title + ' detected'
				        #write to log
				        os.chdir(script_dir)
				        f = open('log.txt', 'a')
				        f.write(current_episode + '\n')
				        f.close()
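				        # note: the episode is logged before the download below runs,
				        # so an episode whose download fails won't be retried next time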

				        #check to see if there is a dir for podcast
				        if os.path.exists(title):
				            os.chdir(title)
				            #download data and save it to disk
				            response = urllib2.urlopen(current)
				            f = open(current_episode, 'wb')
				            f.write(response.read())
				            f.close()

				        #if not, make one
				        else:
				            os.mkdir(title)
				            os.chdir(title)
				            #download data with wget
				            os.system('wget ' + current)

				print 'Check complete'
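
Assuming the script is saved as pycast.py (again, my name for it, nothing in the code fixes it), a typical run looks roughly like the following, with wget's progress output appearing between the detection messages; 'Example Show' stands in for whatever title each feed reports:

				$ python pycast.py
				log already exists
				feeds already exists
				New episode of Example Show detected
				Check complete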