
Using Ruby on Rails and AWS SQS for Website Crawling

Recently, I returned to Ruby on Rails, one of my favourite web application frameworks. This time, I aimed to build a simple management tool for scheduling website crawls—a key component of a side project I'm working on.

The tool's purpose is straightforward:

  1. Maintain a list of websites to crawl.
  2. Schedule crawls for these websites.
  3. Trigger a Golang process for the actual crawling task.

To facilitate communication between the Rails app and the Golang service, I chose AWS SQS (Simple Queue Service). SQS provides a reliable way to send, receive, and manage messages between distributed systems.

Adding an SQS Service in Rails

In Rails, services are often used to encapsulate business logic that doesn’t belong in the standard MVC structure. For my application, I created a services/sqs_send_service.rb to handle sending messages to SQS queues.

Here’s the implementation:  

require "aws-sdk-sqs"

class SqsSendService
  # Client is a class method that returns an instance of the SQS client
  # @return [Aws::SQS::Client] An instance of the SQS client
  def self.client
    @client ||= Aws::SQS::Client.new
  end

  # The initialize method sets the queue URL
  # @param sqs_queue_url [String] The URL of the SQS queue
  # @return [SqsSendService] An instance of the SqsSendService class
  def initialize(sqs_queue_url = nil)
    sqs_queue_url ||= ENV["SQS_QUEUE_URL"]
    @queue_url = sqs_queue_url
  end

  # The send_message method sends a message to the SQS queue
  # @param message_body [String] The body of the message
  # @param message_attributes [Hash] The attributes of the message
  # @return [Aws::SQS::Types::SendMessageResult] The result of the send message operation
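  def send_message(message_body, message_attributes = {})
    # client is defined on the class, so it must be reached via self.class;
    # calling self.client on an instance would raise NoMethodError.
    self.class.client.send_message({
      queue_url: @queue_url,
      message_body: message_body,
      message_attributes: message_attributes
    })
  end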
end

Key Features:
  • Reusable SQS Client: The class-level client method memoizes a single Aws::SQS::Client, so every SqsSendService instance in the process shares it instead of building a new client for each message.
  • Dynamic Configuration: The queue URL can be set either via the sqs_queue_url parameter or an environment variable (ENV["SQS_QUEUE_URL"]), making it flexible for different environments.
  • Message Attributes: The send_message method supports custom attributes, allowing for enhanced message context when sending data.
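The attributes passed to send_message need to follow the nested shape SQS uses, where each attribute carries a data_type and a value. As a minimal sketch, a small helper could build that shape from a flat hash. The helper name build_message_attributes and the website_id and priority keys are my own illustrations, not part of the service above or the AWS SDK.

```ruby
# Hypothetical helper (not part of SqsSendService or the AWS SDK):
# converts a flat hash into the nested structure SQS uses for
# message attributes, with a data_type and string_value per entry.
def build_message_attributes(attrs)
  attrs.each_with_object({}) do |(name, value), result|
    # SQS distinguishes "Number" from "String"; both travel as strings.
    data_type = value.is_a?(Numeric) ? "Number" : "String"
    result[name.to_s] = { data_type: data_type, string_value: value.to_s }
  end
end

# Illustrative attribute names only.
attributes = build_message_attributes(website_id: 42, priority: "high")
```

The resulting hash could then be passed as the second argument, for example SqsSendService.new.send_message(body, attributes).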

Why Use SQS?

AWS SQS is a powerful tool for decoupling services. In my project:
  • The Rails app handles scheduling and sends crawl requests via SQS.
  • The Golang service listens to the SQS queue and processes the crawl jobs.

This separation allows each service to scale independently. If the Golang service goes offline temporarily, crawl requests wait in the queue (within its message retention period) and are processed once the service is back.
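For the two services to stay decoupled, they only need to agree on the shape of the message body. Here is a rough sketch of how the Rails side might serialise a crawl request as JSON. The function name build_crawl_request and the fields website_id, url, and scheduled_at are assumptions for illustration; the actual payload in my project may differ.

```ruby
require "json"
require "time" # adds Time#iso8601

# Hypothetical payload builder: the field names are illustrative and
# would need to match whatever the consuming service expects.
def build_crawl_request(website_id:, url:, scheduled_at: Time.now)
  JSON.generate(
    website_id: website_id,
    url: url,
    # getutc returns a new Time in UTC without mutating the argument.
    scheduled_at: scheduled_at.getutc.iso8601
  )
end

body = build_crawl_request(website_id: 1, url: "https://example.com")
```

A string built this way can be used as the message_body argument of send_message, and the consumer can decode it with any JSON library.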

Final Thoughts

This project reminded me of why I enjoy working with Rails—its flexibility allows me to integrate tools like SQS seamlessly. By leveraging Rails services, I kept my application’s design clean and modular.

If you're looking to integrate AWS SQS into your Rails app, a service like SqsSendService provides a solid foundation.
