Search YouTube Videos for a Phrase and Return a Timestamp
I love Dave Ramsey. His daily shows are streamed live on YouTube and then made available to watch there. The “Debt Free Screams” are my favorite part of the show. Because I usually don’t have time to watch the entire episode, I would like to find the short segments that contain the “Debt Free Screams”.
The Challenge: Search a YouTube video for a specific phrase and return a list of timestamps for when that phrase was spoken.
The Solution: Behold… Daily Debt Free Screams
#!/bin/bash
#
# This script is configured to search for the phrase "in the lobby of"; it needs more work
# to make the search phrase configurable.
#
# Usage: ./getYoutubeTimestampOfPhrase.sh <youtube-url>
# Output: A JSON with the timestamp of each occurrence of the phrase, e.g.
# {"id":"9gyLR0OR1jM","timestamp":"5712","title":"The Dave Ramsey Show (07-13-17)"},
# {"id":"9gyLR0OR1jM","timestamp":"9314","title":"The Dave Ramsey Show (07-13-17)"}
#
# Note: you must pre-install youtube-dl (https://github.com/rg3/youtube-dl)
#

# First we'll download the captions as a .vtt file
/usr/local/bin/youtube-dl --write-auto-sub --skip-download -o '%(title)s_%(id)s.%(ext)s' "$1" #&> /dev/null

# set up some config variables
youtube="https://www.youtube.com/watch?v="
BASEDIR=$(dirname "$0")
queue_files="${BASEDIR}/*.en.vtt"
json=""

for queue_file in $queue_files; do
    # skip the unexpanded glob when no caption files exist
    if [[ ! -f "$queue_file" ]]; then
        continue
    fi

    # Auto-generated captions put a timing tag after every word. The pattern is anchored
    # so the line starts with "in<HH:MM:SS", which is what makes cut -c4-11 return HH:MM:SS.
    timestamp=$(grep -E '^in<[[:digit:]:.]*><c> the</c><[[:digit:]:.]*><c> lobby</c><[[:digit:]:.]*><c> of' "$queue_file" | cut -c4-11)

    if [[ -n "$timestamp" ]]; then
        # convert HH:MM:SS to seconds for the &t= URL parameter
        newtimes=$(printf %s "$timestamp" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + $3 }')

        # The file is named <title>_<id>.en.vtt. This assumes the ID is the last
        # 11 characters, so underscores in the title or the ID do not break the split.
        name=$(basename "$queue_file" .en.vtt)
        videoId="${name: -11}"
        title="${name%_"$videoId"}"

        for newtime in $newtimes; do
            videolink="${youtube}${videoId}&t=${newtime}"
            videoData=$(printf '{"id":"%s","timestamp":"%s","title":"%s"}\n' "$videoId" "$newtime" "$title")
            json="${json}${videoData},"
            echo "$videolink"
        done
    fi

    rm "$queue_file"
done

# strip the trailing comma and append the results
echo "${json%?}" >> out.json
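The script's header says the search phrase still needs to be made configurable. Here is a rough sketch of one way that might look. The function names build_phrase_regex and phrase_timestamps are my own inventions for illustration and are not part of the script above. The sketch assumes the caption line starts with the first word of the phrase, that every later word is wrapped in a <c> tag preceded by a timing tag (as in the auto-generated .vtt files), and that the phrase contains no regex metacharacters.

```shell
#!/bin/bash

# Hypothetical helper: build a grep -E pattern for a plain phrase.
# First word is bare; each later word looks like <HH:MM:SS.mmm><c> word</c>.
build_phrase_regex() {
    local tag='<[[:digit:]:.]*>'
    local regex="" word index=0
    local -a words
    read -ra words <<< "$1"          # split on whitespace without globbing
    for word in "${words[@]}"; do
        if (( index == 0 )); then
            regex="^${word}"         # anchored: phrase must start the line
        elif (( index == 1 )); then
            regex="${regex}${tag}<c> ${word}"
        else
            regex="${regex}</c>${tag}<c> ${word}"
        fi
        index=$(( index + 1 ))
    done
    printf '%s\n' "$regex"
}

# Hypothetical helper: print HH:MM:SS for each line of a .vtt file matching the phrase.
# The timestamp begins right after "<first word><", so the cut offsets
# depend on the length of the first word.
phrase_timestamps() {
    local phrase="$1" file="$2"
    local first="${phrase%% *}"
    local from=$(( ${#first} + 2 ))
    local to=$(( ${#first} + 9 ))
    grep -E "$(build_phrase_regex "$phrase")" "$file" | cut -c"${from}-${to}"
}
```

With these in place, the hard-coded grep line could be swapped for something like phrase_timestamps "$2" "$queue_file", with the phrase passed as a second argument to the script. For the phrase "in the lobby of" the cut range works out to 4-11, the same as the original.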