I wrote a Ruby YouTube URL parser. It is designed to take a YouTube URL in one of the following structures (these are the YouTube URL structures I could find; maybe there are more?):
http://youtu.be/sGE4HMvDe-Q
http://www.youtube.com/watch?v=Lp7E973zozc&feature=relmfu
http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1
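The aim is to save only the id of the clip or playlist so that it can be embedded: for a clip that is 'sGE4HMvDe-Q', and for a playlist it is 'p/A0C3C1D163BE880A'.
The parser I wrote works for these URLs, but it seems a bit brittle and long-winded. Can anyone suggest a nicer Ruby approach to this problem?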
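def parse_youtube url
  # Drop the scheme, then split the rest into path segments.
  a = url.split('//').last.split('/')
  # Take the last segment and strip 'watch?v=' and any query parameters.
  b = a.last.split('watch?v=').last.split('?').first.split('&').first
  if a[1] == 'p'
    url = "p/#{b}"
  else
    url = b
  end
end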
Solution
def parse_youtube url
  regex = /(?:.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/
  url.match(regex)[1]
end

urls = %w[http://youtu.be/sGE4HMvDe-Q
          http://www.youtube.com/watch?v=Lp7E973zozc&feature=relmfu
          http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1]

urls.each {|url| puts parse_youtube url }
# sGE4HMvDe-Q
# Lp7E973zozc
# p/A0C3C1D163BE880A
Depending on how you use this, you may want to add stronger validation that the URL really is from YouTube.
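One way such a check might look is sketched below. It is not part of the original answer: the method name `parse_youtube_checked` and the `YOUTUBE_HOSTS` allow-list are hypothetical, and the list of hosts is an assumption that may need extending. The sketch parses the URL with Ruby's standard `uri` library, rejects anything whose host is not on the list, and returns nil instead of raising when nothing matches.

```ruby
require 'uri'

# Hypothetical allow-list (an assumption); add other YouTube hosts as needed.
YOUTUBE_HOSTS = %w[youtu.be youtube.com www.youtube.com].freeze

# Same pattern as in the answer above.
ID_REGEX = /(?:.be\/|\/watch\?v=|\/(?=p\/))([\w\/\-]+)/

# Returns the clip or playlist id, or nil when the URL is malformed,
# is not http(s), is not on an allowed host, or contains no id.
def parse_youtube_checked(url)
  uri = URI.parse(url)
  return nil unless uri.is_a?(URI::HTTP) # URI::HTTPS is a subclass of URI::HTTP
  return nil unless YOUTUBE_HOSTS.include?(uri.host.to_s.downcase)

  match = url.match(ID_REGEX)
  match && match[1]
rescue URI::InvalidURIError
  nil
end

puts parse_youtube_checked('http://youtu.be/sGE4HMvDe-Q')                      # sGE4HMvDe-Q
puts parse_youtube_checked('http://example.com/watch?v=Lp7E973zozc').inspect   # nil
```

Returning nil rather than raising is a design choice for this sketch; callers that prefer an exception could raise an ArgumentError instead.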
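Update:
Coming back to this a few years later. I've always been annoyed by how sloppy the original answer was. Since the validity of the YouTube domain wasn't being verified anyway, I've removed some of the slop.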
NODE            EXPLANATION
--------------------------------------------------------------------------------
(?:             group, but do not capture:
  .               any character except \n
  be              'be'
  \/              '/'
 |               OR
  \/              '/'
  watch           'watch'
  \?              '?'
  v=              'v='
 |               OR
  \/              '/'
  (?=             look ahead to see if there is:
    p               'p'
    \/              '/'
  )               end of look-ahead
)               end of grouping
(               group and capture to \1:
  [\w\/\-]+       any character of: word characters (a-z, A-Z, 0-9, _), '\/', '\-'
                  (1 or more times (matching the most amount possible))
)               end of \1
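The same breakdown can also be kept next to the pattern itself by using Ruby's extended (`x`) regex mode, which ignores whitespace and allows `#` comments inside the literal. The following is only a sketch of that idea: the constant `YOUTUBE_ID` and the method `extract_id` are hypothetical names, and the pattern is intended to be equivalent to the one in the answer, written with `%r{}` so the slashes need no escaping.

```ruby
# Commented version of the answer's pattern, using extended mode.
YOUTUBE_ID = %r{
  (?:              # group, but do not capture:
      .be/         #   any character followed by 'be/' (short youtu.be links)
    | /watch\?v=   #   or '/watch?v=' (regular clip links)
    | /(?=p/)      #   or '/' only when 'p/' follows (playlist links)
  )
  ([\w/\-]+)       # capture 1: one or more word characters, '/' or '-'
}x

# Returns the captured id, or nil when the pattern does not match.
def extract_id(url)
  match = YOUTUBE_ID.match(url)
  match && match[1]
end

puts extract_id('http://youtu.be/sGE4HMvDe-Q')                              # sGE4HMvDe-Q
puts extract_id('http://www.youtube.com/watch?v=Lp7E973zozc&feature=relmfu') # Lp7E973zozc
puts extract_id('http://www.youtube.com/p/A0C3C1D163BE880A?hl=en_US&fs=1')   # p/A0C3C1D163BE880A
```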