notes-on-web-scraping
#programming
Separate web crawling (discovering and fetching pages) from web scraping (extracting data from pages already fetched).
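
A minimal sketch of that split, using only the standard library. The names `discover_links` and `scrape_title` are made up for illustration: one function only finds URLs to visit next, the other only extracts data from HTML it is handed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Crawling concern: find the URLs a page links to."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


class TitleExtractor(HTMLParser):
    """Scraping concern: pull one piece of data (the <title>) out of a page."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def discover_links(html: str, base_url: str) -> list[str]:
    """Crawler side: return absolute URLs found in the page."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def scrape_title(html: str) -> str:
    """Scraper side: return the page title, with no knowledge of fetching."""
    parser = TitleExtractor()
    parser.feed(html)
    return parser.title.strip()
```

Because neither function fetches anything, the scraper can be rerun against saved copies of pages without hitting the site again.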
Use something like Urlbox to render the page and save it to S3, get a webhook when the saved copy is ready, then run a separate scraper over that copy.
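
A rough sketch of the webhook side of that flow. The payload keys `status` and `s3_key` are hypothetical, not Urlbox's actual schema, and `load_html` stands in for whatever reads the saved object from S3.

```python
from typing import Callable, Optional


def handle_render_webhook(
    payload: dict,
    load_html: Callable[[str], str],
    scrape: Callable[[str], dict],
) -> Optional[dict]:
    """Run the scraper once the saved page is reported ready.

    The payload keys "status" and "s3_key" are hypothetical; check the
    rendering service's webhook documentation for the real field names.
    """
    if payload.get("status") != "succeeded":
        return None  # render failed or is still pending; nothing to scrape
    key = payload.get("s3_key")
    if not key:
        return None  # no saved object to read
    html = load_html(key)  # e.g. a thin wrapper around an S3 get-object call
    return scrape(html)
```

Passing the loader and the scraper in as arguments keeps the handler testable with an in-memory dict in place of a bucket, for example `handle_render_webhook(payload, fake_bucket.__getitem__, my_scraper)`.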
Playwright seems to be a useful tool in the web scraping stack: it drives a real browser, so JavaScript-rendered pages can be captured.
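
A small sketch of fetching rendered HTML with Playwright's sync API, assuming `pip install playwright` and `playwright install chromium` have been run. The function name `fetch_rendered_html` and the choice to wait for network idle are illustrative, not a recommendation for every site.

```python
def fetch_rendered_html(url: str, timeout_ms: int = 30_000) -> str:
    """Return the page's HTML after the browser has run its JavaScript."""
    # Imported here so the module still loads where Playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # "networkidle" waits until network activity settles; "load" or
            # "domcontentloaded" return sooner on simpler pages.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()
```

The returned string can be saved as-is and handed to a scraper later, which keeps the browser step out of the extraction code.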
pandas read_html() seems useful: it parses the <table> elements in a page into a list of DataFrames.
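
A quick example of `read_html()` on an inline table, assuming pandas plus an HTML parser backend (lxml, or bs4 with html5lib) is installed. The sample table and the helper name `tables_from_html` are made up for illustration.

```python
from io import StringIO

import pandas as pd

SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>price</th></tr>
  <tr><td>apple</td><td>1.5</td></tr>
  <tr><td>pear</td><td>2.0</td></tr>
</table>
"""


def tables_from_html(html: str) -> list[pd.DataFrame]:
    """Parse every <table> in the HTML into its own DataFrame."""
    # Wrapping the string in StringIO avoids the deprecation warning that
    # recent pandas versions raise for literal HTML strings.
    return pd.read_html(StringIO(html))


tables = tables_from_html(SAMPLE_HTML)
prices = tables[0]  # one DataFrame per table, in document order
```

The header row made of `<th>` cells becomes the column names, so `prices["price"]` is already numeric.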