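The script can now output the list of books I have read as a PDF. The Perl script itself only emits TeX source, so convert that to PDF with the usual tools. You need to create a file named asin.txt, with one ASIN per line, in the same directory.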
perl photo.pl > book.tex && platex book.tex && dvips -o book.ps book.dvi && ps2pdf book.ps book.pdf
Source
use strict;
use warnings;
use WWW::Mechanize;

# Patterns used to scrape each book page
my $title_pattern = '<title>(.*?)</title>';
my $price_pattern = '<span class="price bold">\s(.*?)</span>';
my $jpg_pattern   = 'http://ec1.images-amazon.com/images/I/.*?\.jpg';

my $url;
my $w = WWW::Mechanize->new;

print_header();

# asin.txt holds one ASIN per line
open(my $fh, '<', 'asin.txt') or die "Cannot open asin.txt: $!";
my @asins = <$fh>;
close($fh);

foreach my $asin (@asins) {
    chomp($asin);
    $url = "http://d.hatena.ne.jp/asin/" . $asin;
    $w->get($url);
    my $content = $w->content();

    print '\begin{itemize}', "\n";
    print '\item ';
    get_title($content, $title_pattern);
    print '\item ';
    get_price($content, $price_pattern);
    print '\end{itemize}', "\n";

    get_photo($w, $content, $jpg_pattern, $asin);
    print_photo($asin);
}

print_footer();

# Print the page title (first capture group of the title pattern)
sub get_title {
    my ($content, $title_pattern) = @_;
    my $title;
    if ($content =~ /$title_pattern/) {
        $title = $1;
    }
    print $title, "\n";
}

# Print the price (first capture group of the price pattern)
sub get_price {
    my ($content, $price_pattern) = @_;
    my $price;
    if ($content =~ /$price_pattern/) {
        $price = $1;
    }
    print $price, "\n";
}

# Download the cover image as $asin.jpg and convert it to $asin.eps
sub get_photo {
    my ($w, $content, $jpg_pattern, $asin) = @_;
    my $filename;
    if ($content =~ /$jpg_pattern/) {
        $filename = $&;    # the whole matched image URL
    }
    my $response = $w->get($filename);
    open(my $ofh, '>', "$asin.jpg") or die "Cannot write $asin.jpg: $!";
    binmode $ofh;
    print $ofh $response->content;
    close($ofh);
    # ImageMagick converts the JPEG to EPS so that it can be included from TeX
    system("convert $asin.jpg eps2:$asin.eps");
}

sub print_header {
    print '\documentclass[twocolumn]{jsarticle}', "\n";
    print '\usepackage{wrapfig}', "\n";
    print '\usepackage[dvipdfm]{graphicx}', "\n";
    print '\begin{document}', "\n";
}

sub print_footer {
    print '\end{document}', "\n";
}

sub print_photo {
    my $photo = shift;
    print '\includegraphics[scale=1]', "{$photo.eps}\n";
}
The resulting PDF looks like this.
Bugs
- If Amazon hosts the image file at a different location, the image cannot be fetched
- For pages that do not display a price, the price cannot be fetched
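Both bugs come from a pattern that fails to match, which leaves a variable undefined. One possible way to soften this, as a rough sketch only: the helper first_match_or and the sample HTML fragments below are hypothetical and not part of the script above. The helper returns a fallback value when the pattern does not match, so the caller can print a placeholder or skip the image.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): return the first capture
# group of $pattern in $content, or $fallback when the pattern does not match.
sub first_match_or {
    my ($content, $pattern, $fallback) = @_;
    return $content =~ /$pattern/ ? $1 : $fallback;
}

# Same patterns as in the script above
my $price_pattern = '<span class="price bold">\s(.*?)</span>';
my $jpg_pattern   = 'http://ec1.images-amazon.com/images/I/.*?\.jpg';

# Made-up page fragments, for illustration only
my $with_price    = '<span class="price bold"> 1,980 yen</span>';
my $without_price = '<div>no price and no image here</div>';

# Price: fall back to a placeholder instead of printing an undefined value
print first_match_or($with_price,    $price_pattern, 'price unavailable'), "\n";
print first_match_or($without_price, $price_pattern, 'price unavailable'), "\n";

# Image: wrap the pattern in parentheses so the whole URL becomes $1,
# and skip the download when nothing matched
my $jpg = first_match_or($without_price, "($jpg_pattern)", undef);
print defined $jpg ? "fetch $jpg\n" : "no image found, skipping\n";
```

In the script itself, the same check would mean emitting \includegraphics only for books whose image was actually downloaded. This does not find images hosted elsewhere; it only keeps the run from breaking on them.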