Web Scraping With VBA – Intro

Background

There are a lot of reasons why you may want to scrape data from websites, and there are a lot of different programming languages you can use to do it. If you ask someone who knows what they are doing, they will likely tell you to use Python for web scraping. Python is free, flexible, and powerful. However, if you do not consider yourself a programmer and you have never used Python before, learning and using it may seem like a daunting task. Most people in business have used Microsoft Excel, and a good number have probably had exposure to Excel macros, which are written in VBA (Visual Basic for Applications). VBA may not have the power and flexibility of Python, and it certainly does not have the prestige, but it is actually quite effective and straightforward to use for web scraping.

One nice feature of using VBA for web scraping is that it naturally mimics the actions of a real user. When you scrape data with VBA, you are controlling a web browser via code, so your requests look to the hosting site much as they would if a human were manually navigating it. You do not have to worry about ‘tricking’ the site into thinking you are a real user with seemingly complicated methods that you likely do not understand (assuming you are a beginner).
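To give a feel for what "controlling a web browser via code" looks like, here is a minimal sketch. It assumes a Windows machine where Internet Explorer's automation object is still available, and the URL is only a placeholder; the macro name ReadPageTitle is made up for illustration.

```vba
Option Explicit

' Minimal sketch: open a page in Internet Explorer and read its title.
' Assumptions: Windows with the InternetExplorer.Application object available;
' the URL below is a placeholder, not a site from this tutorial.
Sub ReadPageTitle()
    Const READYSTATE_COMPLETE As Long = 4
    Dim ie As Object

    ' Late binding, so no library reference has to be set in the VBA editor.
    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True                      ' show the window so you can watch it work
    ie.Navigate "https://example.com"      ' placeholder URL

    ' Wait until the browser reports that the page has finished loading.
    Do While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop

    ' Write the page title to the Immediate window (Ctrl+G in the VBA editor).
    Debug.Print ie.document.Title

    ie.Quit
    Set ie = Nothing
End Sub
```

To try it, paste the code into a new module in the VBA editor (Alt+F11 in Excel) and run ReadPageTitle. Because the browser itself loads the page, the site sees an ordinary browser visit.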

Bottom line, VBA can be a great tool to use for web scraping, even if you are not an experienced programmer. Below are links to subsequent posts that will provide a short tutorial on how to scrape data from a website using VBA. Keep in mind that automating requests can put undue strain on a website. If a website suspects that you are a bot and not a real user, it may throttle your access, put up CAPTCHAs, or even block you from the site altogether. If you find yourself needing to scrape hundreds of pages from a website, check whether it offers an API that lets you obtain the same information more efficiently and in a way the website sanctions. However, using a site’s API will require more programming knowledge and may be outside of your wheelhouse.

Scraping HTML tables
