Notes on Web Scraping
#programming
Separate web crawling (finding and fetching pages) from web scraping (extracting data from them).
Use something like Urlbox to save the page to S3, get a webhook when the saved page is ready, and then use a separate tool to scrape it.
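A rough sketch of that split, assuming a local directory stands in for S3 and a plain function call stands in for the webhook. The names `save_snapshot`, `on_snapshot_ready`, and `run_demo` are made up for illustration and are not part of Urlbox or any other library. The point is only that the scrape step reads the saved copy and never touches the live site.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def save_snapshot(store: Path, key: str, html: str) -> Path:
    """Crawl side: persist the raw page (a local directory stands in for S3)."""
    path = store / key
    path.write_text(html, encoding="utf-8")
    return path


def on_snapshot_ready(path: Path) -> dict:
    """Scrape side: roughly what a webhook handler might do once the snapshot exists."""
    parser = TitleParser()
    parser.feed(path.read_text(encoding="utf-8"))
    return {"source": path.name, "title": parser.title.strip()}


def run_demo() -> dict:
    # Hypothetical page content; a real crawl step would fetch or render this.
    page = "<html><head><title>Example page</title></head><body></body></html>"
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_snapshot(Path(tmp), "example.html", page)
        return on_snapshot_ready(saved)
```

Because the scraper only sees stored snapshots, the extraction logic can be rerun or fixed later without fetching the pages again.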
Playwright seems to be a useful tool in the web scraping stack.
Pandas read_html() seems useful for pulling HTML tables into DataFrames.
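A small sketch of what that might look like, assuming pandas is installed along with an HTML parser it can use (lxml, or BeautifulSoup with html5lib). The table markup is made-up sample data; in practice it would come from a saved page.

```python
from io import StringIO

import pandas as pd

# Made-up sample markup standing in for a saved page snapshot.
HTML = """
<table>
  <tr><th>name</th><th>price</th></tr>
  <tr><td>widget</td><td>9.5</td></tr>
  <tr><td>gadget</td><td>12.0</td></tr>
</table>
"""

# read_html returns a list with one DataFrame per <table> it finds.
# Wrapping the string in StringIO avoids the deprecation warning that
# recent pandas versions raise for literal HTML strings.
tables = pd.read_html(StringIO(HTML))
prices = tables[0]
```

The same call also accepts a URL or a file path, so it can be pointed at a stored snapshot instead of an inline string.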