Simple web scraping with Bash: Ski Report
#262/September 2022 | Linux Magazine
With one line of Bash code, Pete scrapes the web and builds a desktop notification app to get the daily snow report.
While recently doing a small project, I was amazed by how much web scraping I could do with just one line of Bash. I used the text-based Lynx browser [1] and then piped the output to a grep search. Figure 1 shows the one-line Bash example that scrapes the current snow depth from the Sunshine Village Snow Forecast web page.
In this article, I will introduce some techniques to easily scrape web pages, and then I will create a desktop notification script that provides the daily snow forecast.
The Lynx Text Browser
For my Bash web scraping, I started out with command-line tools such as curl [2] combined with the html2text [3] utility. This technique definitely works, but I found that the Lynx browser offers a one-step solution with slightly cleaner text output.
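The two-step approach can be sketched roughly as follows. This is a minimal illustration rather than the author's exact command: the helper name `page_text_curl` and the example URL are assumptions, and it presumes that curl and an html2text command that reads HTML on standard input are both installed.

```shell
#!/usr/bin/env bash
# Two-step fetch: curl downloads the HTML, html2text converts it to plain text.
# page_text_curl is a hypothetical helper name used only for illustration.
page_text_curl() {
  # -s hides the progress meter; -L follows redirects.
  curl -sL "$1" | html2text
}

# Usage (placeholder URL, requires network access):
#   page_text_curl "https://example.com/" | head -n 20
```

Wrapping the pipeline in a function keeps the download and conversion steps together, so the output can be piped to a search tool in the same way as a Lynx dump.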
To install Lynx on Raspbian/Debian/Ubuntu, use:
sudo apt install lynx
The Lynx -dump option writes a web page to standard output as plain text, with HTML tags, HTML encoding, and JavaScript removed. Figure 2 shows that a Lynx dump can greatly clean up the original web page and make searching considerably easier.
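To illustrate how a text dump can be narrowed down with grep, here is a small sketch. The helper name `first_match`, the search phrase, the sample text, and the URL in the comment are all assumptions made for this example; the real page's wording may differ.

```shell
#!/usr/bin/env bash
# Filter a text dump down to the line of interest.
# The search phrase and sample text below are made up for illustration.

# Print the first line on stdin that contains the phrase (case-insensitive).
first_match() {
  grep -i -m 1 -- "$1"
}

# Offline demo: sample text standing in for `lynx -dump` output.
printf '%s\n' 'Resort report' 'Top Snow Depth: 120 cm' 'Bottom Snow Depth: 80 cm' |
  first_match 'snow depth'

# Against a live page (placeholder URL, needs lynx and network access):
#   lynx -dump "https://example.com/snow-report" | first_match 'snow depth'
```

The offline demo prints `Top Snow Depth: 120 cm`. Because the page has already been reduced to plain text, a simple case-insensitive phrase search is often enough, with no HTML parsing needed.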
This story is from the #262/September 2022 edition of Linux Magazine.