Tuesday, December 15, 2009

Removing ads from Prothom-alo.com

I wrote the following style to remove the *UGLY* ads from prothom-alo.com. You are going to need the following to be able to use it:
  1. Firefox ...best browser on the planet
  2. Stylish or GreaseMonkey ...lets you apply user-provided styles/scripts which override the server-provided stuff for a particular webpage.
  3. My CSS script ...removes ads from prothom-alo.com (a sketch of the idea is right below).
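
Since the style itself isn't pasted in this post, here is a minimal sketch of the idea, just to show what a Stylish rule for this looks like. The selectors below are made-up placeholders; the real ad containers have to be picked out of prothom-alo.com's markup (View Source or Firebug helps):

@-moz-document domain(prothom-alo.com) {
    /* hypothetical selectors -- replace with the site's actual ad containers */
    .ad-banner,
    #sidebar-ads,
    iframe[src*="ads"] {
        display: none !important;
    }
}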

A complete screenshot is here: http://imgur.com/U6czu.png

OneManga.com leecher

This one leeches manga from www.onemanga.com. I was too lazy to implement the chapter-ripper part, so it only leeches a whole series when given the right URL as an argument (a sketch of how the chapter ripper could work follows the script).

** code comes with no warranty whatsoever. I'm not responsible if it breaks anybody's computer. Use it at your own risk. You have been warned!!! **

Example:
onemanga.sh http://www.onemanga.com/School_Rumble/

#!/bin/bash

RED='\e[0;31m'
CYAN='\e[0;36m'
NC='\e[0m' # No Color

if [ -z "$1" ]
then
    echo
    echo "[*] usage: `basename $0` manga_url"
    echo
    exit
else
    ## take the last non-empty component of the URL as the manga name
    manga_name=`echo $1 | awk -F '/' '{for(i=NF;i>=0;i--){if(length($i)>0){print $i;break;}}}'`
fi

main_url="http://www.onemanga.com"

rm -rf ${manga_name}

## find the list of chapters
echo -n -e "${CYAN}[*]${RED} Finding total chapters in ${CYAN} $manga_name ${NC}= "
wget -q -nv ${main_url}/${manga_name} -O tmp.txt
chapters=`cat tmp.txt | grep '<td class="ch-subject"><a href="/' | awk -F '"' '{print $4}'`

count=0
for c in $chapters
do
    mkdir -p ./$c
    count=$((count+1))
done
echo -e "${CYAN}${count}${NC}"
##

## parse each chapter and download its images
for chapter in $chapters
do
    pwd=`pwd`

    cd ./$chapter

    ## initial wget
    echo -e "${CYAN}[*]${RED} Trying to find the image base url${NC}"

    ## find the first page in the chapter
    wget -q -nv $main_url/$chapter -O tmp.txt
    page=`cat tmp.txt | grep "Begin reading" | awk -F '"' '{print $2}'`

    ## now go to that page & find the image base url
    wget -q -nv ${main_url}${page} -O tmp.txt 2>/dev/null
    image=`cat tmp.txt | grep "img_url" | awk -F '"' '{for(i=1;i<NF;i++){if($i ~ "jpg"){print $i}}}' | awk -F '/' '{print $NF}'`
    image_base=`cat tmp.txt | grep "img_url" | awk -F '"' '{for(i=1;i<NF;i++){if($i ~ "jpg"){print $i}}}' | sed s/"$image"//g`
    echo -e "${RED}>>${NC} $image_base"

    ## download: page names sit in <option> tags; skip non-page
    ## entries like credits/extras/covers
    d=$((d+1))
    names=`cat tmp.txt | awk '{for(i=1;i<=NF;i++){if($i ~ "selected")go++}{if(go>1){print}}}' | grep "</option>" | grep -v -e 'credits' -e 'extra' -e 'cover' | awk -F '"' '{print $2}'`

    n=0
    for k in $names
    do
        n=$((n+1))
    done

    echo -e "${CYAN}[*]${RED} Downloading ${CYAN}$n ${RED}images from chapter ${CYAN}$chapter ${RED} ... ##${CYAN}$((count-d+1))${RED}##${CYAN}$count${RED}##${NC}"
    for k in $names
    do
        #echo -e "${RED}>>${NC} ${image_base}${k}.jpg"
        wget -nv "${image_base}${k}.jpg"
    done

    cd $pwd
done
##
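
As for the chapter-ripper part I skipped: it could be bolted on the same way the MangaFox script below handles it, i.e. an optional argument that filters the chapter loop. A rough, untested sketch of just that filter (the chapter list is hard-coded here for illustration; in the real script it comes from the scraping step above):

#!/bin/bash
## stand-alone demo of a hypothetical single-chapter filter
chapters="School_Rumble/001 School_Rumble/002 School_Rumble/003"  # normally scraped
specific_chapter="$1"   # optional: the one chapter to keep

for chapter in $chapters
do
    ## when a chapter was requested, skip everything else
    if [ -n "$specific_chapter" ] && [ "`basename $chapter`" != "$specific_chapter" ]
    then
        continue
    fi
    echo "would download: $chapter"
done

Run it with no argument and all three chapters go through the loop; run it as ./filter.sh 002 and only School_Rumble/002 survives.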

MangaFox.com leecher

This is a bash script to leech a whole series or a particular chapter from www.mangafox.com. I wrote this script for my niece; hope it helps you too. Usage is simple: the 1st argument is the URL to a particular manga (or just the manga's name, as the examples below show), and the 2nd (optional) argument is the chapter in that manga.

** code comes with no warranty whatsoever. I'm not responsible if it breaks anybody's computer. Use it at your own risk. You have been warned!!! **

Example:
mangafox.sh http://www.mangafox.com/manga/school_rumble/ 282
or
mangafox.sh School_Rumble 282
or
mangafox.sh http://www.mangafox.com/manga/school_rumble/
or
mangafox.sh School_Rumble

#!/bin/bash

RED='\e[1;31m'
CYAN='\e[1;36m'
NC='\e[0m' # No Color
YLW='\e[1;33m'
WHITE='\e[0;37m'

main_url="http://www.mangafox.com/manga"
wget_param="--tries=10 --retry-connrefused"

## usage
if [ -z "$1" ]
then
    echo
    echo -e "${CYAN}[*]${RED} usage: `basename $0` manga_url${NC}"
    echo
    exit
else
    ## take the last non-empty component of the URL (or the bare name) as the manga name
    manga_name=`echo $1 | awk -F '/' '{for(i=NF;i>=0;i--){if(length($i)>0){print $i;break;}}}'`
    if [ ! -z "$2" ]
    then
        specific_chapter="$2"
    fi
fi
##

function find_chapters()
{
    TMP="${manga_name}_find_chapters.tmp"

    echo -n -e "${CYAN}[*]${RED} Finding total chapters in ${CYAN} $manga_name ${NC}= "
    wget $wget_param -q -nv "${main_url}/${manga_name}/?no_warning=1" -O $TMP
    chapters=`cat $TMP | grep -e 'class="chico">' | grep -v -e '</td>' -e '#listing' | awk -F '"' '{print $2}' | sed 's/^\/manga\///g'`

    count=0
    for c in $chapters
    do
        mkdir -p ./$c
        #echo $c ##debug
        count=$((count+1))
    done
    echo -e "${CYAN}${count}${NC}"
}

function parse_chapter_n_download()
{
    PAGES="pages.tmp"
    PAGE="page_html.tmp"

    for chapter in $chapters
    do
        pwd=`pwd`

        if [ "$specific_chapter" ]
        then
            mkdir -p "$specific_chapter" 2>/dev/null
            chapter=$specific_chapter
        fi

        cd ./$chapter

        ## find the total number of pages in the chapter
        echo -n -e "${CYAN}[*]${RED} Total pages in ${CYAN} $chapter ${NC}= "
        wget -q -nv $wget_param $main_url/$chapter -O $PAGES
        pages=`cat $PAGES | grep '^.*<option value=.*<\/select>.*$' -m1 | awk '{for(i=1;i<=NF;i++){if($(i-1)~"value"){print $i}}}' | sed -e 's/selected//g;s/option//g;s/[<>\/"=]//g;'`

        n=0
        for k in $pages
        do
            #echo $k ##debug
            n=$((n+1))
        done
        echo -e "${CYAN}$n${NC}"

        ## now I have a list of (1,2,3...).html pages
        for p in $pages
        do
            wget $wget_param -q -nv $main_url/$chapter/${p}.html -O $PAGE
            img_url=`cat $PAGE | grep 'onclick="return enlarge();' | awk '{for(i=1;i<=NF;i++){if($i~"http://"){print $i}}}' | sed 's/src=//g;s/["=]//g'`
            img=`echo $img_url | awk -F '/' '{print $NF}'`
            echo -e -n "${CYAN}>>${WHITE} $img_url ${RED} ... ${NC}"
            wget $wget_param -q -nv $img_url
            if [ -e $img ]
            then
                echo -e "${CYAN}[done]${NC}"
            else
                echo -e "${YLW}[failed]${NC}"
            fi
        done

        cd $pwd

        if [ "$specific_chapter" ]
        then
            exit;
        fi
    done
}

function main()
{
    rm -rf ${manga_name}
    find_chapters
    parse_chapter_n_download
}

main
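
Not part of the script, but a handy sanity check after a run: count how many images actually landed in each chapter directory. Run this from inside the freshly created manga directory (it assumes the per-chapter layout the script creates):

## count downloaded .jpg files under every directory
find . -type d | while read d
do
    n=`find "$d" -maxdepth 1 -name '*.jpg' | wc -l`
    [ "$n" -gt 0 ] && echo "$d: $n image(s)"
done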

Thursday, December 10, 2009

Listing directories only in bash

Now this is embarrassing. All these years with a Unix operating system and I couldn't figure this out by myself. Anyway, here is how you do it:

Method 1:

ls -d */

Method 2:

echo */
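
Both tricks rely on the trailing slash: the */ glob only expands to names that are directories, so plain files never show up. A quick illustration (file and directory names here are made up):

$ ls -F
notes.txt  pics/  src/
$ ls -d */
pics/  src/
$ echo */
pics/ src/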