LINUX: Using sed to remove HTML tags

In this video we look at removing HTML tags from a web page with sed, the Linux stream editor. We start with a simple demo using letters, then move on to the tag symbols. It really shows the power of the Linux command line.
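As a rough sketch of the progression described above (the exact commands used in the video are not given here, so the patterns and the helper name `strip_tags` are assumptions), the same substitution idea can be tried first on letters and then on angle brackets:

```shell
# Letter demo: delete everything from a "b" up to the next "d" on each line.
printf 'abcde\n' | sed 's/b[^d]*d//g'

# Hypothetical helper: delete everything from a "<" up to the next ">" on each line.
strip_tags() {
  sed 's/<[^>]*>//g'
}

# Same idea with the tag symbols.
printf '<p>Hello <b>world</b></p>\n' | strip_tags
```

The `[^>]*` part matches any run of characters that are not `>`, which keeps each match from running past the end of the current tag.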
Comments

A good beginning for explaining sed. Anyone watching this who actually wants to strip HTML tags using sed should know there is a lot more to it, because you have to deal with multiline tags and other exceptions. In most cases I find it easier to just use something like elinks -dump or the html2text command. But of course the technique you demonstrate can be used in many cases.

climagic
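To illustrate the multiline point from the comment above, here is a minimal sketch, assuming a sed that accepts the usual label and branch commands (GNU and BSD sed both do). The helper name `strip_tags_multiline` is hypothetical. It reads the whole input into the pattern space before substituting, so a tag split across lines is still removed:

```shell
# Hypothetical helper: append every remaining line to the pattern space,
# then strip tags once over the whole input.
strip_tags_multiline() {
  sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/<[^>]*>//g'
}

# A tag that opens on one line and closes on the next.
printf '<a\n  href="x">link</a> text\n' | strip_tags_multiline

# Alternatives mentioned in the comment, if those tools are installed:
#   elinks -dump page.html
#   html2text page.html
```

This still does not cover the other exceptions, such as a `>` inside an attribute value, HTML comments, or the contents of script and style elements, which is why a dedicated tool is usually the easier route.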