ruby 1.8.7 + rails 2.1.0

Open http://www.google.cn/finance?q=600001 and you will see a news panel on the right side of Google Finance. That panel is filled with headlines fetched from other sources.
Screenshot:


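Let's build something similar. Rails can parse RSS in two ways: with a ready-made library (either Ruby's bundled rss library or a third-party one such as feedtools or rubyrss), or by parsing the XML directly.
Using the library: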
    require 'rss/2.0'
    require 'open-uri'
    url = "http://news.google.cn/news?pz=1&ned=ccn&hl=zh-CN&topic=b&output=rss"
    # the second argument (false) turns off validation of the feed
    @feed = RSS::Parser.parse(open(url).read, false)
    @feed.items.each do |item|
      puts item.title
      puts item.link
      puts item.description
    end


Parsing the XML directly:
Create an RssParser class under lib so that it can be called from anywhere.
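require 'net/http'
require 'uri'
require 'rexml/document'

class RssParser
  # Fetches the feed at url and returns the channel info plus an array of item hashes
  def self.run(url)
    xml = REXML::Document.new(Net::HTTP.get(URI.parse(url)))
    data = {
      :title    => xml.root.elements['channel/title'].text,
      :home_url => xml.root.elements['channel/link'].text,
      :rss_url  => url,
      :items    => []
    }
    xml.elements.each('//item') do |item|
      new_item = {}
      item.elements.each do |e|
        # strip a leading "dc:" prefix from the element name;
        # '\1' must be single-quoted to act as a backreference
        new_item[e.name.gsub(/^dc:(\w)/, '\1').to_sym] = e.text
      end
      data[:items] << new_item
    end
    data
  end
end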


Using it in an action:
  def test
    feed = RssParser.run("http://news.google.cn/news?pz=1&ned=ccn&hl=zh-CN&topic=b&output=rss")
    # take the first three items of the feed
    feed1 = feed[:items][0]
    feed2 = feed[:items][1]
    feed3 = feed[:items][2]
    # combine the items into an array
    @feeds = [feed1, feed2, feed3]
    # parse the pubDate strings into DateTime objects
    @feeds.each {|x| x[:pubDate] = DateTime.parse(x[:pubDate].to_s)}
    # sort by ascending pubDate
    @feeds.sort! {|a,b| a[:pubDate] <=> b[:pubDate]}
    # reverse the array to get descending pubDate
    @feeds.reverse!
    # the block variable is named item so it does not clobber feed on Ruby 1.8
    @feeds.each do |item|
      puts item[:title]
      puts item[:link]
      puts item[:pubDate]
    end
  end
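
The sort-then-reverse steps in the action can also be written with sort_by. The following is only a minimal sketch, assuming each item is a hash whose :pubDate is an RFC 822 date string as in RSS 2.0. The helper name newest_first and the sample items are made up for illustration, and dropping items with an unparsable date is a choice of this sketch (the action above would raise instead).

```ruby
require 'time'

# Hypothetical helper: returns copies of the items sorted newest first.
# Items whose :pubDate cannot be parsed are dropped (an assumption of this sketch).
def newest_first(items)
  dated = items.map do |item|
    begin
      item.merge(:pubDate => Time.rfc2822(item[:pubDate].to_s))
    rescue ArgumentError
      nil
    end
  end
  dated.compact.sort_by { |item| item[:pubDate] }.reverse
end

# Made-up sample data, not taken from a real feed.
items = [
  { :title => 'older',  :pubDate => 'Sat, 13 Dec 2003 18:30:02 GMT' },
  { :title => 'newest', :pubDate => 'Mon, 15 Dec 2003 09:00:00 GMT' },
  { :title => 'middle', :pubDate => 'Sun, 14 Dec 2003 12:00:00 GMT' },
  { :title => 'broken', :pubDate => 'not a date' }
]

newest_first(items).each do |item|
  puts "#{item[:pubDate].utc.iso8601}  #{item[:title]}"
end
```

Because the helper returns new hashes, the original item hashes keep their string dates.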


So where do title, link, and description come from? They are elements of the RSS 2.0 XML structure, which typically looks like this:
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
  <title>Example Feed</title>
<description>Insert witty or insightful remark here</description>
<link>http://example.org/</link>
<lastBuildDate>Sat, 13 Dec 2003 18:30:02 GMT</lastBuildDate>
<managingEditor>johndoe@example.com (John Doe)</managingEditor>
<item>
<title>Atom-Powered Robots Run Amok</title>
<link>http://example.org/2003/12/13/atom03</link>
<guid isPermaLink="false">urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</guid>
<pubDate>Sat, 13 Dec 2003 18:30:02 GMT</pubDate>
<description>Some text.</description>
</item>
</channel>
</rss>

You can also view the source of the RSS page, or puts the result of @feed = RSS::Parser.parse(open(url).read, false); either way you will see an XML document structured like the one above.
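
To see how those element names map to values without touching the network, the sample document can be walked with REXML. This is a minimal sketch using a trimmed copy of the sample above; the variable names are made up for illustration.

```ruby
require 'rexml/document'

# Trimmed copy of the sample RSS 2.0 document.
sample_rss = <<-XML
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Example Feed</title>
<link>http://example.org/</link>
<item>
<title>Atom-Powered Robots Run Amok</title>
<link>http://example.org/2003/12/13/atom03</link>
<pubDate>Sat, 13 Dec 2003 18:30:02 GMT</pubDate>
<description>Some text.</description>
</item>
</channel>
</rss>
XML

doc = REXML::Document.new(sample_rss)

# Channel-level fields sit directly under <channel>.
channel_title = doc.root.elements['channel/title'].text

# Each <item> becomes a hash keyed by the names of its child elements.
items = []
doc.elements.each('//item') do |item|
  fields = {}
  item.elements.each { |child| fields[child.name.to_sym] = child.text }
  items << fields
end

puts channel_title
items.each { |i| puts "#{i[:title]} | #{i[:link]} | #{i[:pubDate]}" }
```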

Now let's build the news panel shown in the first screenshot.
This part can live in a partial, so all we need is a helper and the partial files.
helper:
def feed_collection(param)
  require 'rss/2.0'
  require 'open-uri'
  # from news.google.cn
  urlhot = "http://news.google.cn/news?pz=1&ned=ccn&hl=zh-CN&topic=b&output=rss"
  urlfinance = "http://news.google.cn/news?pz=1&ned=ccn&hl=zh-CN&topic=ecn&output=rss"
  urlfund = "http://news.google.cn/news?pz=1&ned=ccn&hl=zh-CN&topic=stc&output=rss"
  urlfinancing = "http://news.google.cn/news?pz=1&ned=ccn&hl=zh-CN&topic=pf&output=rss"
  case param
  when 'hot'
    RSS::Parser.parse(open(urlhot).read, false)
  when 'finance'
    RSS::Parser.parse(open(urlfinance).read, false)
  when 'fund'
    RSS::Parser.parse(open(urlfund).read, false)
  when 'financing'
    RSS::Parser.parse(open(urlfinancing).read, false)
  end
end
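
The four URLs in feed_collection differ only in their topic parameter, so the case statement could be replaced by a lookup table. The following is a sketch of that idea, not the original helper. The name feed_url is hypothetical, and raising on an unknown section is a choice of this sketch (the case statement returns nil there, which then fails later on .items).

```ruby
# Hypothetical helper: builds the feed URL for a section name.
# Topic codes are the ones used in the feed_collection helper.
def feed_url(section)
  topics = { 'hot' => 'b', 'finance' => 'ecn', 'fund' => 'stc', 'financing' => 'pf' }
  topic = topics.fetch(section) do
    raise ArgumentError, "unknown feed section: #{section.inspect}"
  end
  "http://news.google.cn/news?pz=1&ned=ccn&hl=zh-CN&topic=#{topic}&output=rss"
end

puts feed_url('finance')
```

With this in place, the body of the helper would reduce to a single RSS::Parser.parse(open(feed_url(param)).read, false) call.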

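def feed_link(param)
  require 'cgi'
  # pull out the percent-encoded target URL (e.g. http%3A%2F%2F...) and decode it;
  # returns nil when the link contains no such part
  encoded = param && param.slice(/(http%).*(&)/)
  CGI.unescape(encoded).gsub(/&/,'') if encoded
end

def feed_title(param)
  # keep the part of the title before the last "-" (the headline)
  param.slice(/.*(-)/).gsub(/-/,"") if param
end

def feed_from(param)
  # keep the part of the title after the first " - " (the news source)
  param.slice(/( - ).*/).from(2) if param
end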
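
The helpers above assume that a title looks like "Headline - Source" and that the link carries a percent-encoded URL. Below is a minimal sketch of the same two extractions that keeps hyphens inside the headline and returns nil instead of failing on unexpected input. The helper names split_title and embedded_url, and the sample strings, are hypothetical. It uses URI.decode_www_form_component from the standard library, which is not available on Ruby 1.8.7, so treat it as an illustration for newer Rubies.

```ruby
require 'uri'

# Hypothetical helper: splits "Headline - Source" at the LAST " - ",
# so hyphens inside the headline are preserved.
# Returns [headline, source]; source is nil when there is no separator.
def split_title(raw)
  text = raw.to_s
  headline, separator, source = text.rpartition(' - ')
  separator.empty? ? [text, nil] : [headline, source]
end

# Hypothetical helper: finds a percent-encoded http URL inside a link
# (everything from "http%" up to the next "&") and decodes it.
# Returns nil when the link contains no such part.
def embedded_url(link)
  encoded = link.to_s[/http%[^&]*/]
  encoded && URI.decode_www_form_component(encoded)
end

# Made-up inputs for illustration.
p split_title('Pre-market movers - Example News')
p split_title('No source here')
p embedded_url('http://news.example/url?sa=t&url=http%3A%2F%2Fexample.org%2Fa-b&cid=1')
```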


partial: _feednews.html.erb
<div class="slides">
<div><%= render :partial => 'shared/feednews_item',:collection => feed_collection("hot").items %></div>
<div><%= render :partial => 'shared/feednews_item',:collection => feed_collection('finance').items %></div>
<div><%= render :partial => 'shared/feednews_item',:collection => feed_collection('fund').items %></div>
<div><%= render :partial => 'shared/feednews_item',:collection => feed_collection('financing').items %></div>
</div>

Note: this relies on the jQuery loopedSlider plugin (a slideshow), so only the first div is shown when the page loads. See:
http://github.com/nathansearles/loopedSlider/tree/master

partial: _feednews_item.html.erb
<ul>
<% unless feednews_item.nil? %>
<li class="news"><a href="<%= feed_link(feednews_item.link) %>" target="_blank"><%= feed_title(feednews_item.title) %></a>

<span class="grey small"><span> <%= feed_from(feednews_item.title) %></span>&nbsp;&mdash;&nbsp;<span><%= feednews_item.pubDate.to_date %></span></span></li>
<% end %>
</ul>

OK, that works. Here is a screenshot of my implementation:


ref:
http://www.rubycentral.com/book/ref_c_string.html
http://www.javaeye.com/topic/60620
http://www.troubleshooters.com/codecorn/ruby/basictutorial.htm#_Regular_Expressions
http://paranimage.com/15-jquery-slideshow-plugins/#respond
http://hi.baidu.com/todayz/blog/item/83c1b219d966fd4142a9ad5f.html
http://dennis-zane.javaeye.com/blog/57538
http://sporkmonger.com/projects/feedtools/
http://rubyrss.com/
http://www.superwick.com/archives/2007/06/09/rss-feed-parsing-in-ruby-on-rails/
http://www.ruby-forum.com/topic/144447




written by feng
posted on 2009-08-18 10:45 fl1429  Category: Rails
