Some of my scripts for daily use on a Linux desktop

Tuesday, February 2, 2010

How to get pictures from RSS feeds from deviantArt (get_deviation_search.sh v1.0)

Each time you run a search on the deviantArt website (or browse a category), an RSS feed is generated.
For example, to see the newest deviations in the fractal category, the RSS feed generated at the bottom of the page is http://backend.deviantart.com/rss.xml?q=in%3Adigitalart/fractals%20sort%3Atime&type=deviation.
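
To see what the script will look for, fetch the feed and check the lines containing medium="image": that attribute marks the tag holding the url of the full-view picture. A quick way to have a look (assuming curl and grep are installed; the script itself uses GET from libwww-perl):

#peek at the image entries of the fractal feed
curl -s "http://backend.deviantart.com/rss.xml?q=in%3Adigitalart/fractals%20sort%3Atime&type=deviation" | grep 'medium="image"'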

For my Photo Album script, I wanted to download/save pictures from the RSS feed to my hard drive.
Well, I wrote something and it works, but it can certainly be improved!

I call this script get_deviation_search.sh and it should be run with at least three arguments:
1st arg : folder where to save the pictures
2nd arg : delay (in minutes) for repeating the script (0 is for a single run)
3rd arg and following : URLs of the RSS feeds to follow
More parameters can be set inside the script.


#example for the photomanipulations and fractals categories and a search for "conky"

./get_deviation_search.sh /home/wlourf/deviant 5 \
  http://backend.deviantart.com/rss.xml?q=in%3Adigitalart/fractals%20sort%3Atime\&type=deviation \
  http://backend.deviantart.com/rss.xml?q=in%3Adigitalart/photomanip%20sort%3Atime\&type=deviation \
  http://backend.deviantart.com/rss.xml?q=in%3Adigitalart/fractals%20sort%3Atime\&type=deviation \
  http://backend.deviantart.com/rss.xml?q=sort%3Atime%20conky\&type=deviation


IMPORTANT : in the RSS feed URLs, the & symbol has to be replaced by \&, otherwise it won't work.
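
If you prefer not to escape each &, putting the whole URL between single quotes should also work (same idea, shown here with the "conky" search feed from the example above):

#the url between single quotes instead of escaping the &
./get_deviation_search.sh /home/wlourf/deviant 5 'http://backend.deviantart.com/rss.xml?q=sort%3Atime%20conky&type=deviation'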

This script can also be downloaded here


In French (let's go crazy):
This script downloads the images from the RSS feeds generated on the search result pages of deviantArt.
It works with at least 3 arguments:
1: the folder where to save the images
2: the time to wait between each loop (0 for a single run)
3 and following: the RSS links (in these links, & must be replaced by \&)


The script :

#! /bin/bash
#This script gets pictures from RSS feeds from the deviantArt website and saves them to the hard disk
#
# by wlourf http://u-scripts.blogspot.com/
# v1.0 - 02 Feb. 2010
#
#1st arg : folder where to save the pictures
#2nd arg : delay (in minutes) for repeating the script (0 is for a single run)
#3rd arg and following : urls of the rss feed to follow

#some parameters are set here
file1=/tmp/get_deviant_rss.txt
file2=/tmp/get_liste.txt
#display the filename even if the picture is not saved this time (i.e. it was saved previously)
display_files_not_saved=false
#if the script is running in gnome-terminal, a right click on the file link can open it
display_files_with_link=true
############################# end ###########################

clear

if [[ $# -lt 3 ]]; then
  echo 'This script needs at least 3 arguments : '
  echo ' - full path of the folder where to save the pictures'
  echo ' - run the script every X minutes, 0 for a single run'
  echo ' - one or more RSS feeds to watch (take care to escape the & symbol with \&)'
  exit
fi

folder=$1
repeat=$2
args=$*
rss=${args#$1" "}
rss=${rss#$2" "}
nbUrl=0

if [[ ${folder: -1} != "/" ]]; then
  folder="$folder/"
fi

for url in $rss
do
  ((nbUrl++))
  tabRSS[$nbUrl]=$url
done

nbLoops=0
flag=true
while $flag; do

  echo "-------------------"
  echo "folder = "$folder
  echo "-------------------"
  echo "loop every $repeat minutes"
  echo "-------------------"

  ((nbLoops++))

  mkdir -p "$folder"
  cd "$folder"

  for ((u=1 ; u<=$nbUrl ; u++))
  do
    #read one rss feed
    echo "-------------------"
    echo "rss $u/$nbUrl = "${tabRSS[u]}
    GET ${tabRSS[u]} > $file1
    
    #extract the link to the fullview image
    match="medium=\"image\""
    url_line=""

    begin="http"
    end="\" height"

    table=()
    idx=1
    #put the links in a table for better display
    while read line
    do
      if [[ "$line" =~ "${match}" ]]; then
        url_line=$line
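        #expr "string" : "regex" prints the length of the match; since .* is greedy,
        #a is the offset of the last "http" in the line and b the length of the
        #url between that offset and the last '" height' that follows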
        a=$(($(expr "$url_line" : ".*$begin")-${#begin}))
        b=$(($(expr "$url_line" : ".*$end")-$a-${#end}))
        url_link=${url_line:$a:$b}
        table[$idx]=$url_link
        ((idx++))
      fi
    done < $file1

    #read the table and save the image if not already saved
    nbLinks=$(($idx-1))
    nbSaved=0
    for ((a=1 ; a<=$nbLinks ; a++))
    do
      link=${table[a]}
      img_name=$(expr match "$link" '.*\(/.*\)')
      img_name=${img_name:1}
      txt="loop $nbLoops - url $u - $a/$nbLinks"
      if $display_files_with_link; then
        strFile="file://"$folder$img_name
      else
        strFile=$img_name
      fi
      if [ -f "$img_name" ]; then
        if $display_files_not_saved; then
          echo $txt" - *** "$strFile" *** already saved"
        fi
      else
        echo $txt" - "$strFile
        wget -q $link
        ((nbSaved++))
      fi

    done
    echo
    echo "$nbSaved/$nbLinks pictures saved for loop number $nbLoops and rss= $url"
    echo "-------------------"
    echo
  done
  
  if [[ $repeat -eq "0" ]]; then
    flag=false
  fi

  sl=$(($repeat*60))
  echo "Wait $((sl/60)) minutes till :"
  date --date "$sl sec"

  sleep $sl
done

echo
echo "Finish! after $nbLoops loop(s)"
