Python Facebook Posts Scraper with Requests and BeautifulSoup4

A few weeks ago I wrote about the importance of using User Agents when we scrape data, and my examples showed the response from Twitter when we used the correct User Agent. This time I want to do the same with Facebook. We’re going to scrape the posts in user profiles, Facebook pages, and groups.

What are we going to get?

A list of items with the following values:

Field	Description
published	Formatted date and time of publication
description	Post text content
images	List of images in posts
post_url	The unique post URL
external_links	External links found in the description
like_url	The Like URL
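
To make the shape of one item concrete, here is a minimal sketch using a TypedDict (Python 3.8+). The field names come from the table above; the Python types, the date format, and the sample values are my assumptions for illustration only, not real scraper output.

```python
from typing import List, TypedDict


class Post(TypedDict):
    """Hypothetical shape of one scraped item; the field types are assumed."""
    published: str             # formatted publication date and time
    description: str           # text content of the post
    images: List[str]          # image URLs found in the post
    post_url: str              # unique URL of the post
    external_links: List[str]  # external links found in the description
    like_url: str              # the Like URL of the post


# Placeholder values only, to show the structure.
example_post: Post = {
    "published": "2020-06-27 10:00:00",
    "description": "Example post text",
    "images": ["https://example.com/image.jpg"],
    "post_url": "https://example.com/post/1",
    "external_links": ["https://example.com"],
    "like_url": "https://example.com/like/1",
}
```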

Scraping output

Let’s start

1. Get Python (Python 3.7+ recommended)

2. Clone or download this repository

git clone https://github.com/adeoy/FacebookPostsScraper.git

3. Install the project requirements

pip install -r requirements.txt

Let’s explain

First of all, all the code is in my GitHub repository: https://github.com/adeoy/FacebookPostsScraper

We create a requests session.
Set the User Agent of an old Nokia C3 phone on the requests session (the Nokia C3 gives me better results during the scraping than other phones).
Check whether we have a session cookie saved on our computer. If not, log in to Facebook with email and password and save the session cookie on our computer (we need to log in because our friends’ private profiles can’t be scraped without authentication).
Request a profile and scrape the posts using BeautifulSoup and CSS selectors.
Return the results.
Have fun 🙂
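
The first three steps can be sketched roughly like this. It is a minimal illustration, not the code from the repository: the function names, the JSON cookie file, and the User Agent string are my assumptions (the string is an illustrative Nokia C3 style value and may differ from the one the class really uses). The login request itself is left out.

```python
import json
from pathlib import Path

import requests

# Illustrative old-phone User Agent; the exact string in the repository may differ.
USER_AGENT = (
    "NokiaC3-00/5.0 (07.20) Profile/MIDP-2.1 Configuration/CLDC-1.1 "
    "Mozilla/5.0 AppleWebKit/420+ (KHTML, like Gecko) Safari/420+"
)


def build_session(cookie_file: Path) -> requests.Session:
    """Create a session with the mobile User Agent and restore saved cookies, if any."""
    session = requests.Session()
    session.headers.update({"User-Agent": USER_AGENT})
    if cookie_file.exists():
        saved = json.loads(cookie_file.read_text())
        session.cookies.update(saved)  # name -> value pairs
    return session


def save_cookies(session: requests.Session, cookie_file: Path) -> None:
    """Write the session cookies to disk so the next run can skip the login."""
    cookies = requests.utils.dict_from_cookiejar(session.cookies)
    cookie_file.write_text(json.dumps(cookies))
```

You would call build_session(Path('cookies.json')) at startup, log in only if no cookies were restored, and then call save_cookies. Storing only name and value pairs drops the cookie domain and path, which is fine for a sketch but worth knowing about.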

I already made a class to manage the whole process. First, we need to instantiate an object of FacebookPostsScraper, passing the email and password. Optionally, if your Facebook account isn’t in English, you also need to set the text of the link that opens a post, which only appears in the mobile version of Facebook. Don’t worry if you don’t understand: ask me in the comments for the language you need and I will reply. BTW, these are the values for English and Spanish:

English: ‘Full Story’
Spanish: ‘Historia completa’
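
To show why that text matters, here is a small sketch of how links can be picked out by their visible text with BeautifulSoup and a CSS selector. The HTML is a hand-written stand-in (the real mobile markup will differ), and find_post_urls is a hypothetical helper, not a method of the class.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for a mobile page; the real markup will differ.
SAMPLE_HTML = """
<div class="post">
  <p>First post</p>
  <a href="/posts/1">Full Story</a>
  <a href="/like/1">Like</a>
</div>
<div class="post">
  <p>Second post</p>
  <a href="/posts/2">Full Story</a>
</div>
"""


def find_post_urls(html: str, post_url_text: str = "Full Story") -> list:
    """Return the href of every link whose visible text equals post_url_text."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        a["href"]
        for a in soup.select("a[href]")
        if a.get_text(strip=True) == post_url_text
    ]


print(find_post_urls(SAMPLE_HTML))  # ['/posts/1', '/posts/2']
```

With an account in Spanish, the same helper would be called as find_post_urls(html, 'Historia completa'). If the text doesn’t match the account language, no links are found, which is why the parameter exists.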

Once you instantiate an object, the class automatically logs in to Facebook and prepares the session for the requests. Now you can call the method get_posts_from_profile and pass a Facebook profile URL to get the posts.

Edit June 27th, 2020. Now you can export the scraped posts to CSV, Excel, and JSON. See the end of each example to check it out.

Examples

Example with a single URL

from FacebookPostsScraper import FacebookPostsScraper as Fps
from pprint import pprint as pp

# Enter your Facebook email and password
email = 'YOUR_EMAIL'
password = 'YOUR_PASSWORD'

# Instantiate an object
fps = Fps(email, password, post_url_text='Full Story')

# Example with single profile
single_profile = 'https://www.facebook.com/BillGates'
data = fps.get_posts_from_profile(single_profile)
pp(data)

fps.posts_to_csv('my_posts')  # You can export the posts as CSV document
# fps.posts_to_excel('my_posts')  # You can export the posts as Excel document
# fps.posts_to_json('my_posts')  # You can export the posts as JSON document

Example with multiple URLs

from FacebookPostsScraper import FacebookPostsScraper as Fps
from pprint import pprint as pp

# Enter your Facebook email and password
email = 'YOUR_EMAIL'
password = 'YOUR_PASSWORD'

# Instantiate an object
fps = Fps(email, password, post_url_text='Full Story')

# Example with multiple profiles
profiles = [
    'https://www.facebook.com/zuck', # User profile
    'https://www.facebook.com/thepracticaldev', # Facebook page
    'https://www.facebook.com/groups/python' # Facebook group
]
data = fps.get_posts_from_list(profiles)
pp(data)

fps.posts_to_csv('my_posts')  # You can export the posts as CSV document
# fps.posts_to_excel('my_posts')  # You can export the posts as Excel document
# fps.posts_to_json('my_posts')  # You can export the posts as JSON document
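
If you prefer to handle the export yourself, the returned list of dictionaries can be written with the standard library alone. This is only a sketch under my own assumptions (the export_posts name, the ' | ' separator for list fields, and the file naming are hypothetical); it is not how the class implements its export methods.

```python
import csv
import json
from pathlib import Path

FIELDS = ["published", "description", "images",
          "post_url", "external_links", "like_url"]


def export_posts(posts: list, basename: str) -> None:
    """Write a list of post dicts to <basename>.csv and <basename>.json."""
    with Path(f"{basename}.csv").open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for post in posts:
            row = dict(post)
            # Lists do not fit in one CSV cell, so join them with a separator.
            row["images"] = " | ".join(post["images"])
            row["external_links"] = " | ".join(post["external_links"])
            writer.writerow(row)
    # JSON keeps the lists as they are.
    Path(f"{basename}.json").write_text(
        json.dumps(posts, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

Calling export_posts(data, 'my_posts') would create my_posts.csv and my_posts.json in the current directory. Every item is expected to contain exactly the six fields listed earlier.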

Final thoughts

I also recommend checking out this book, where I learned some cool web scraping tricks: https://amzn.to/3umlGuc.

Please feel free to ask anything you want in the comments section.
