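I have a file containing many vowels with diacritics. I need to make these replacements:
Replace ā, á, ǎ, and à with a.
Replace ē, é, ě, and è with e.
Replace ī, í, ǐ, and ì with i.
Replace ō, ó, ǒ, and ò with o.
Replace ū, ú, ǔ, and ù with u.
Replace ǖ, ǘ, ǚ, and ǜ with ü.
Replace Ā, Á, Ǎ, and À with A.
Replace Ē, É, Ě, and È with E.
Replace Ī, Í, Ǐ, and Ì with I.
Replace Ō, Ó, Ǒ, and Ò with O.
Replace Ū, Ú, Ǔ, and Ù with U.
Replace Ǖ, Ǘ, Ǚ, and Ǜ with Ü.
I know I can replace them one at a time: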
sed -i 's/ā/a/g' ./file.txt
Is there a more efficient way to replace all of them?
If you check the man page of the iconv tool:
//TRANSLIT
When the string "//TRANSLIT" is appended to --to-code, transliteration is activated. This means that when a character cannot be represented in the
target character set, it can be approximated through one or several similarly looking characters.
So we can do:
kent$ cat test1
Replace ā, á, ǎ, and à with a.
Replace ē, é, ě, and è with e.
Replace ī, í, ǐ, and ì with i.
Replace ō, ó, ǒ, and ò with o.
Replace ū, ú, ǔ, and ù with u.
Replace ǖ, ǘ, ǚ, and ǜ with ü.
Replace Ā, Á, Ǎ, and À with A.
Replace Ē, É, Ě, and È with E.
Replace Ī, Í, Ǐ, and Ì with I.
Replace Ō, Ó, Ǒ, and Ò with O.
Replace Ū, Ú, Ǔ, and Ù with U.
Replace Ǖ, Ǘ, Ǚ, and Ǜ with Ü.
kent$ iconv -f utf8 -t ascii//TRANSLIT test1
Replace a, a, a, and a with a.
Replace e, e, e, and e with e.
Replace i, i, i, and i with i.
Replace o, o, o, and o with o.
Replace u, u, u, and u with u.
Replace u, u, u, and u with u.
Replace A, A, A, and A with A.
Replace E, E, E, and E with E.
Replace I, I, I, and I with I.
Replace O, O, O, and O with O.
Replace U, U, U, and U with U.
Replace U, U, U, and U with U.
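
Note that converting to ASCII also turns ü and Ü into u and U, while the question asks for those two letters to be kept. If that matters, one possible alternative is a single sed call with one expression per target letter. The following is only a sketch: the function name strip_tone_marks is made up for illustration, and it assumes the file is UTF-8 with precomposed characters (not a base letter followed by a combining mark).

```shell
#!/bin/sh
# Hypothetical helper: reads text on stdin, writes the result to stdout.
# Each -e expression maps the four tone-marked forms of one vowel to its
# plain form. Assumes UTF-8 input with precomposed characters.
strip_tone_marks() {
    sed -E \
        -e 's/ā|á|ǎ|à/a/g' \
        -e 's/ē|é|ě|è/e/g' \
        -e 's/ī|í|ǐ|ì/i/g' \
        -e 's/ō|ó|ǒ|ò/o/g' \
        -e 's/ū|ú|ǔ|ù/u/g' \
        -e 's/ǖ|ǘ|ǚ|ǜ/ü/g' \
        -e 's/Ā|Á|Ǎ|À/A/g' \
        -e 's/Ē|É|Ě|È/E/g' \
        -e 's/Ī|Í|Ǐ|Ì/I/g' \
        -e 's/Ō|Ó|Ǒ|Ò/O/g' \
        -e 's/Ū|Ú|Ǔ|Ù/U/g' \
        -e 's/Ǖ|Ǘ|Ǚ|Ǜ/Ü/g'
}

# Example: the ü is kept, only the tone mark is dropped.
printf 'nǚ lǜ mā\n' | strip_tone_marks   # prints: nü lü ma
```

Alternation is used instead of a bracket expression such as [āáǎà] because it matches the literal byte sequences and so should not depend on the current locale. To edit a file in place, the same -e expressions can be passed to sed -i as in the question.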