Fastest Way to Load Data from MongoDB into Pandas
Posted by Aly Sivji in Data Analysis
In this post, I will describe how to use BSON-NumPy to pull data out of Mongo and into pandas. While this library is still in the prototype stage, it's hard to ignore the 10x speed improvement that comes from reading BSON documents directly into NumPy.
For this example, I will be analyzing a collection of my tweets
A Gentle Introduction to Context Managers: The Pythonic Way of Managing Resources
Summary
- Explore with statements and the context manager protocol
- Implement context manager class to query MongoDB
- Convert try...finally block to with block and increase code readability
I recently read Steve McConnell's Code Complete to level up my software development skill set. The book has helped me become more deliberate about programming and problem solving in general. Before I sit down to write a single line of code, I take some time to plan out the work I am going to do rather than coding by the seat of my pants
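To make the summary of this post a little more concrete, here is a minimal sketch of converting a try...finally block into a with block. It is not the code from the post itself: FakeConnection is a hypothetical stand-in for a database client (it is not a real MongoDB or PyMongo API), and the names ManagedConnection, count_with_try_finally, and count_with_context_manager are assumptions made up for illustration.

```python
class FakeConnection:
    """Hypothetical stand-in for a database client (not a real MongoDB API)."""

    def __init__(self):
        self.closed = False

    def find(self):
        # Pretend these documents came back from a query.
        return [{"_id": 1}, {"_id": 2}]

    def close(self):
        self.closed = True


def count_with_try_finally():
    """Before: cleanup is written by hand in a finally clause."""
    conn = FakeConnection()
    try:
        return len(conn.find()), conn
    finally:
        conn.close()  # runs even if find() raises


class ManagedConnection:
    """After: the context manager protocol owns setup and cleanup."""

    def __enter__(self):
        self.conn = FakeConnection()
        return self.conn  # bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.conn.close()
        return False  # do not suppress exceptions from the with body


def count_with_context_manager():
    with ManagedConnection() as conn:
        return len(conn.find()), conn
```

Both functions close the connection on every exit path. The with version moves the cleanup into one reusable class, so callers cannot forget it.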
Scheduling Web Scrapers on the PythonAnywhere Cloud (Scrapy Part 2)
(Note: This post is part of my reddit-scraper series)
Summary
- Running Scrapy spider as a script
- Scheduling script to run on PythonAnywhere cloud
Previously on Siv Scripts, we created a web scraping pipeline to pull Top Posts from Reddit and store them in a MongoDB collection. At this stage, we …
Building a Flask Web Application (Flask Part 2)
(Note: This post is part of my reddit-scraper series)
Summary
- Web frameworks intro
- Explore the Flask microframework
- Understand the Model-View-Controller (MVC) pattern
- Build Flask web app
Last time we started our web application adventure by learning how to generate dynamic HTML webpages from data stored in MongoDB using MongoEngine and …
Generating HTML Pages from MongoDB with MongoEngine and Jinja2 (Flask Part 1)
(Note: This post is part of my reddit-scraper series)
Summary
- Overview of MongoDB
- Discussion of Object-Relational Mapping (ORM)
- Use MongoEngine to get items out of MongoDB
- Render HTML pages using Jinja2
- Interact with REST API to send emails with Requests
Previously on Siv Scripts, we implemented a web scraping pipeline …
Scraping Websites into MongoDB using Scrapy Pipelines
Summary
- Discuss advantages of using Scrapy framework
- Create Reddit spider and scrape top posts from list of subreddits
- Implement Scrapy pipeline to send scraped data into MongoDB
Wouldn't it be great if every website had a free API we could poll to get the data we wanted?
Sure, we could …