Why does ActiveRecord often have performance issues?
What is this
This article explains why applications built on ActiveRecord (※) tend to have performance problems.
※ Here, "ActiveRecord" means the ActiveRecord library of Ruby on Rails.
What is often said about rich ORMs like ActiveRecord
Feature | Rich ORM like ActiveRecord | Go's database/sql |
---|---|---|
Productivity | High | Low |
Performance | Some overhead | High |
Flexibility | Limited by abstraction | High |
Consistency | Consistent API | Hard to maintain consistency |
Validation and Callbacks | Built-in | Need to implement manually |
Dependencies | Many | Few |
Learning Cost | Low at first, but grows later because of the many features | High at first because it requires deep SQL knowledge, but low later |
Introduction to ActiveRecord
Introduction to ActiveRecord of Ruby on Rails
What is ActiveRecord?
ActiveRecord is the Object Relational Mapping (ORM) layer supplied with Ruby on Rails. It provides an interface and binding between the tables in a relational database and the Ruby program code that manipulates database records.
Key concepts
1. Models
ActiveRecord models are Ruby classes that are mapped to database tables.
Each model inherits from ApplicationRecord, which in turn inherits from ActiveRecord::Base.
2. Migrations
Migrations are Ruby classes that are used to make changes to the database schema over time in a consistent and easy way.
3. CRUD Operations
ActiveRecord makes it easy to create, read, update, and delete (CRUD) records in the database.
Example Application
erDiagram
    POST {
        int id PK
        string title
        text body
        datetime created_at
        datetime updated_at
    }
    COMMENT {
        int id PK
        int post_id FK
        text body
        datetime created_at
        datetime updated_at
    }
    POST ||--o{ COMMENT : "has many"
Step 1: Creating a New Rails Application
First, let's create a new Rails application:
$ rails new blog
$ cd blog
Step 2: Generating a Model
Let's generate a Post model with a title and body:
$ rails generate model Post title:string body:text
This command generates a migration file, the model file, and the test files for the Post model.
Step 3: Running Migrations
Run the migrations to create the posts table in the database:
$ rails db:migrate
Step 4: Using the Model
Now that we have our model and table, let's use ActiveRecord to interact with the database.
Creating a Record
post = Post.new(title: 'First Post', body: 'This is the body of the first post')
post.save
Or in a single step:
Post.create(title: 'First Post', body: 'This is the body of the first post')
Reading Records
Find a post by its ID:
post = Post.find(1)
Find all posts:
posts = Post.all
Find posts with specific conditions:
posts = Post.where(title: 'First Post')
Updating Records
post = Post.find(1)
post.update(title: 'Updated Post Title')
Deleting Records
post = Post.find(1)
post.destroy
Step 5: Validations
ActiveRecord allows you to add validations to your models to ensure that only valid data is saved to the database. For example:
class Post < ApplicationRecord
  validates :title, presence: true
  validates :body, presence: true
end
With these validations, ActiveRecord will prevent saving a Post without a title or body:
post = Post.new(title: '')
post.save # returns false
post.errors.full_messages # returns ["Title can't be blank", "Body can't be blank"]
Step 6: Associations
ActiveRecord makes it easy to manage associations between models. For example, let's say we have a Comment model that belongs to a Post:
Generate the Comment model:
$ rails generate model Comment post:references body:text
$ rails db:migrate
Set up the association in the models:
class Post < ApplicationRecord
  has_many :comments, dependent: :destroy
end
class Comment < ApplicationRecord
  belongs_to :post
end
Now, you can create comments for a post and access them easily:
post = Post.create(title: 'First Post', body: 'This is the body of the first post')
post.comments.create(body: 'This is a comment')
post.comments # returns all comments for this post
Conclusion
This introduction covers the basics of ActiveRecord in Ruby on Rails. With these fundamentals, you can start building and interacting with your database models effectively. For more advanced topics, refer to the Rails Guides.
Classification of performance issues
Ruby
- Dynamically typed language
- Requires a runtime interpreter
Advantages of Go
- Statically typed language
- Concurrent GC
- Native code (compiled ahead of time)
- Compile-time optimizations such as inlining and dead code elimination
- Lightweight runtime with small overhead
ORM
- Because of its many abstractions (e.g. queries, transactions, connections), improving performance requires ORM-specific techniques
ActiveRecord library
To summarize, some default behaviors and conventions can cause performance issues in certain circumstances, and optimizing them requires some knowledge and technique.
- Lazy loading by default makes it easy to introduce N+1 problems
- Similar-looking methods behave differently underneath (including the queries they issue), and telling them apart requires some knowledge
- Instantiating ActiveRecord objects costs a lot of memory
- Validations run on every save by default
- All columns are selected (SELECT *) by default
- Changing the default behavior of connections, transactions, load timing, etc. requires some technique
Understanding N+1 Problem and Lazy Loading in ActiveRecord
What is the N+1 Problem?
The N+1 problem is a performance issue that occurs when fetching data from a database. It happens when one query is executed to fetch the main records (the "1"), and then N additional queries are executed to fetch related records for each of the main records.
Example of N+1 Problem
Consider the following code that retrieves posts and their comments:
posts = Post.all
posts.each do |post|
  puts post.title
  post.comments.each do |comment|
    puts comment.body
  end
end
In this example:
- SELECT * FROM posts is executed once.
- For each post, SELECT * FROM comments WHERE post_id = ? is executed, N times in total.
This results in a total of 1 + N queries, which can significantly degrade performance when N is large.
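The 1 + N arithmetic can be reproduced without a database. The following is a minimal plain-Ruby sketch, not ActiveRecord itself: FakeDB, its two methods, and the sample rows are hypothetical stand-ins that only count how many queries the access pattern above would issue.

```ruby
# Plain-Ruby simulation of the N+1 access pattern (no Rails or database).
# FakeDB and its methods are hypothetical stand-ins for real SQL queries.
class FakeDB
  POSTS = [
    { id: 1, title: 'First Post' },
    { id: 2, title: 'Second Post' },
    { id: 3, title: 'Third Post' }
  ].freeze

  COMMENTS = [
    { id: 1, post_id: 1, body: 'Nice' },
    { id: 2, post_id: 1, body: 'Thanks' },
    { id: 3, post_id: 3, body: 'Hello' }
  ].freeze

  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  # Stands in for: SELECT * FROM posts
  def all_posts
    @query_count += 1
    POSTS
  end

  # Stands in for: SELECT * FROM comments WHERE post_id = ?
  def comments_for(post_id)
    @query_count += 1
    COMMENTS.select { |comment| comment[:post_id] == post_id }
  end
end

# Runs the N+1 access pattern and returns the number of simulated queries.
def count_n_plus_one_queries(db = FakeDB.new)
  db.all_posts.each do |post|   # 1 query for the posts
    db.comments_for(post[:id])  # 1 more query per post (N in total)
  end
  db.query_count
end

puts "queries: #{count_n_plus_one_queries}"
```

With the three sample posts, the counter ends at 4 (1 + 3). Adding more posts to the sample data grows the count linearly, which is the core of the problem.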
What is Lazy Loading?
Lazy loading means that related table data is only fetched when it is actually accessed. This is the default behavior in ActiveRecord. The first time an association is accessed on a record, a separate query is executed for that record.
Lazy Loading Example
In the previous example, post.comments triggers a new query the first time it is accessed on each post, so comments are lazily loaded post by post. This is what leads to the N+1 problem.
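The idea of "load on first access" can be sketched in plain Ruby. This is only an illustration of the mechanism, not how ActiveRecord is implemented: LazyPost, its loader block, and build_lazy_post are hypothetical names, and the block stands in for a SQL query.

```ruby
# Plain-Ruby sketch of a lazily loaded, memoized association.
# LazyPost and its loader block are hypothetical; they only mimic the idea.
class LazyPost
  attr_reader :title, :load_count

  def initialize(title, &loader)
    @title = title
    @loader = loader  # block that stands in for the SQL query
    @load_count = 0   # how many times the "query" has run
    @comments = nil
  end

  # The "query" runs on first access only; later calls reuse the result.
  def comments
    return @comments unless @comments.nil?

    @load_count += 1
    @comments = @loader.call
  end
end

def build_lazy_post
  LazyPost.new('First Post') { ['This is a comment'] }
end

post = build_lazy_post
puts post.load_count  # 0: nothing has been loaded yet
post.comments
post.comments
puts post.load_count  # 1: loaded once, then reused
```

Each object loads its own comments independently, so iterating over N such objects still costs N loads. Caching per object does not remove the N+1 problem.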
Eager Loading as a Solution to the N+1 Problem
Eager loading is a strategy to fetch all necessary data up front in a small, fixed number of queries. In ActiveRecord, this can be achieved using the includes method. Eager loading reduces the number of queries by loading all related records at once beforehand.
Eager Loading Example
-posts = Post.all
+posts = Post.includes(:comments).all
 posts.each do |post|
   puts post.title
   post.comments.each do |comment|
     puts comment.body
   end
 end
In this example:
- SELECT * FROM posts is executed once.
- SELECT * FROM comments WHERE post_id IN (?) is executed once, when the posts relation is loaded (at posts.each in this code, because ActiveRecord relations are evaluated lazily) and before any post.comments is accessed.
This results in a total of 2 queries, regardless of the number of posts, thus solving the N+1 problem.
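The two-query strategy can also be sketched in plain Ruby. This is a simulation under stated assumptions, not ActiveRecord's implementation: PreloadDB, its methods, preload_posts_with_comments, and the sample rows are hypothetical. It fetches the posts, fetches all of their comments with one IN-style lookup, and matches them up in memory.

```ruby
# Plain-Ruby simulation of preloading (no Rails or database).
# PreloadDB and its methods are hypothetical stand-ins for real SQL queries.
class PreloadDB
  POSTS = [
    { id: 1, title: 'First Post' },
    { id: 2, title: 'Second Post' },
    { id: 3, title: 'Third Post' }
  ].freeze

  COMMENTS = [
    { id: 1, post_id: 1, body: 'Nice' },
    { id: 2, post_id: 1, body: 'Thanks' },
    { id: 3, post_id: 3, body: 'Hello' }
  ].freeze

  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  # Stands in for: SELECT * FROM posts
  def all_posts
    @query_count += 1
    POSTS
  end

  # Stands in for: SELECT * FROM comments WHERE post_id IN (...)
  def comments_for_posts(post_ids)
    @query_count += 1
    COMMENTS.select { |comment| post_ids.include?(comment[:post_id]) }
  end
end

# Loads posts and all of their comments in two simulated queries,
# then attaches the comments to each post in memory.
def preload_posts_with_comments(db)
  posts = db.all_posts
  comments = db.comments_for_posts(posts.map { |post| post[:id] })
  by_post_id = comments.group_by { |comment| comment[:post_id] }

  posts.map do |post|
    post.merge(comments: by_post_id.fetch(post[:id], []))
  end
end

db = PreloadDB.new
preload_posts_with_comments(db).each do |post|
  puts "#{post[:title]}: #{post[:comments].size} comment(s)"
end
puts "queries: #{db.query_count}"
```

The query counter stays at 2 no matter how many posts are in the sample data, because the per-post work happens in memory (group_by) instead of in the database.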
(Appendix) When Lazy Loading is Not Relevant
If there are no related tables, lazy loading and the N+1 problem are irrelevant.
For example, fetching records from a single table does not involve related records, so there is no concern about lazy loading or multiple queries.
Example with No Related Tables
users = User.all
users.each do |user|
  puts user.name
end
In this case:
- Only SELECT * FROM users is executed.
There are no related records to fetch, so lazy loading and the N+1 problem do not apply.
Summary
- N+1 Problem:
  - Occurs when related records are fetched with one additional query per parent record.
- Lazy Loading:
  - Default behavior where related data is fetched when accessed, leading to the N+1 problem.
- Eager Loading:
  - Strategy to fetch all related data up front in a fixed number of queries using includes, solving the N+1 problem.
- (Single Table:)
  - Lazy loading and the N+1 problem are not relevant when there are no related tables.
ActiveRecord Pattern
The ActiveRecord pattern is simply an object design pattern:
An object wraps and corresponds to a DB record, encapsulates DB access, includes domain logic, and therefore has both data and behavior.
Of course, the pattern itself cannot cause performance issues because it is just a design pattern, but it has an affinity with the lazy loading of the ActiveRecord library and therefore with N+1 problems.
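To make the pattern concrete, here is a minimal plain-Ruby sketch. It is an illustration only and has nothing to do with the internals of the ActiveRecord library: PatternPost, its in-memory STORE, and the summary method are hypothetical, and a real implementation would issue SQL instead of reading and writing a Hash.

```ruby
# Minimal plain-Ruby sketch of the Active Record pattern.
# PatternPost and STORE are hypothetical; STORE stands in for a posts table.
class PatternPost
  STORE = {} # id => row

  attr_accessor :id, :title, :body

  def initialize(title:, body:, id: nil)
    @id = id
    @title = title
    @body = body
  end

  # Data access is encapsulated in the object itself.
  def save
    self.id ||= STORE.size + 1
    STORE[id] = { title: title, body: body }
    true
  end

  # Builds an object from a stored row; raises KeyError if the id is unknown.
  def self.find(id)
    row = STORE.fetch(id)
    new(id: id, **row)
  end

  # Domain logic lives next to the data.
  def summary(length = 10)
    body.length > length ? "#{body[0, length]}..." : body
  end
end

PatternPost.new(title: 'First Post', body: 'This is the body of the first post').save
puts PatternPost.find(1).summary
```

Because each object hides its own data access behind ordinary method calls, a caller cannot tell from the call site whether a method touches the database. That is the affinity with lazy loading mentioned above.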
Discussion