Suppose I have text like this:
text<-c("[McCain]: We need tax policies that respect the wage earners and job creators. [Obama]: It's harder to save. It's harder to retire. [McCain]: The biggest problem with American healthcare system is that it costs too much. [Obama]: We will have a healthcare system,not a disease-care system. We have the chance to solve problems that we've been talking about... [Text on screen]: Senators McCain and Obama are talking about your healthcare and financial security. We need more than talk. [Obama]: ...year after year after year after year. [Announcer]: Call and make sure their talk turns into real solutions. AARP is responsible for the content of this advertising.")
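I want to remove (edit: get rid of) all of the text between [ and ], along with the brackets themselves. What is the best way to do this? Here is my feeble attempt using a regular expression and the stringr package: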
str_extract(text,"\\[[a-z]*\\]")
Thanks for any help!
Use this:
gsub("\\[[^\\]]*\\]", "", text, perl=TRUE)
What the regular expression means:
\[        # '['
[^\]]*    # any character except ']' (0 or more times, matching as many as possible)
\]        # ']'
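As a quick check of the pattern, here is a minimal sketch in base R. The string `sample_text` is a hypothetical shortened stand-in for the full text in the question, and the second call is only an assumption about what you may want in the end (dropping the colon and space that follow each tag), not part of the original pattern.

```r
# Hypothetical short sample with the same shape as the text in the question.
sample_text <- "[McCain]: We need tax policies. [Obama]: It's harder to save."

# Remove every [ ... ] tag, brackets included.
# perl = TRUE makes \] inside the character class an escaped ']'.
no_tags <- gsub("\\[[^\\]]*\\]", "", sample_text, perl = TRUE)

# Assumed variant: also remove an optional colon and any whitespace
# that directly follows each tag.
no_speakers <- gsub("\\[[^\\]]*\\]:?\\s*", "", sample_text, perl = TRUE)

print(no_tags)
print(no_speakers)
```

The first call leaves the ": " that followed each tag in place, because the pattern matches only the brackets and what is inside them. If that residue is unwanted, the second pattern is one way to remove it.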