Note: The approach described in this blog relies on the rsync command, so it applies to Linux and other Unix-like systems and is not available out of the box on Windows.

In this article, we will explore how to use rsync in Ruby to sync files from a local machine to a remote location, how to improve its error logging, and how rsync is better than scp.

Let's first understand what rsync is:

rsync is a Linux command to sync files from one location to another. We can run this command on the terminal and it will copy files from one directory to another, either locally or even on a remote location.

scp is another Linux command which allows file transfer from one location to another similar to rsync. You can refer to this article about scp to learn more about it.

Why did we prefer rsync over scp?

rsync has some advantages over scp:

  1. rsync uses a delta-transfer algorithm, along with a few other optimizations, that sends only the differences between the source and destination files. This makes it faster than scp, which always performs a plain transfer
  2. rsync provides an option (--partial) to keep partially transferred files. If a transfer is interrupted in the middle of copying a file, you can resume it from where it left off
  3. rsync automatically verifies that each file has been transferred correctly

I found this Stack Overflow post helpful while comparing scp to rsync.

How to use the rsync command?

rsync [options] [source] [destination]

Following are the options which we can use with rsync:

-v : verbose
-r : copies data recursively. It doesn't preserve timestamps and permissions while transferring data
-a : archive mode, it allows copying files recursively and it also preserves symbolic links, file permissions, user & group ownerships and timestamps
-z : compress file data
-h : human-readable, output numbers in a human-readable format
-e : specify the remote shell to use, for example ssh

The above options (and more) can also be accessed by running rsync -h in the console.
We can also combine multiple options as needed, for example rsync -avhe ssh [source] [destination]. Note that -e takes an argument (here ssh), so it must come last in the combined group.
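When the options are assembled from Ruby, keeping the command as a list of arguments avoids quoting problems with paths that contain spaces. The following is a minimal sketch; the helper name build_rsync_command and its keyword parameters are hypothetical and only serve as an illustration.

```ruby
require 'shellwords'

# Hypothetical helper: builds the rsync argument list.
# `flags` is a string of single-letter options such as 'avh';
# `remote_shell` is passed to -e when it is given.
def build_rsync_command(source, destination, flags: 'avh', remote_shell: nil)
  argv = ['rsync', "-#{flags}"]
  argv += ['-e', remote_shell] if remote_shell
  argv + [source, destination]
end

argv = build_rsync_command('/local_dir', 'user@remote-host:/remote_dir/',
                           remote_shell: 'ssh')

# Shellwords.join escapes each argument and joins them into one
# command-line string, which is handy for logging.
puts Shellwords.join(argv)
```

The resulting array can also be handed to system(*argv), which runs the command directly without going through a shell.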

How does rsync work with SSH?

It is important to transfer data over a secure connection so that nobody can read the data while it is being transferred. We can achieve this by using rsync with the SSH (Secure Shell) protocol.
As we can see in the options list above, the -e option specifies the remote shell that rsync uses to copy/sync files to/from a remote location.

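So with ssh, the rsync command looks like this (-a is included because /local_dir is a directory, and rsync skips directories unless -r or -a is given):

$ rsync -a -e ssh /local_dir user@remote-host:/remote_dir/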

We can also change the ownership of the files while copying/syncing them to a remote location:

$ rsync -avhe ssh --chown=USER:GROUP /local_dir user@remote-host:/remote_dir/

The above command will sync all the files present in the /local_dir directory to the /remote_dir directory on remote-host, and set their owner to USER and their group to GROUP on the remote side.

How to use rsync in Ruby?

So far, we have seen how to run the rsync command in a Linux/Unix terminal. Now let's see how to use this command in Ruby to sync files to a remote location.

There are multiple ways to call shell commands from Ruby, some of which are as follows:

  1. Kernel#` i.e. backticks
    It returns the standard output of the shell command

      `echo "Hello World!"`       #=> "Hello World!\n"
      cmd = "echo 'Hello World!'"
      value = `#{cmd}`            #=> "Hello World!\n"
    
  2. Built-in syntax, %x( cmd )
    It also returns the standard output of the shell command, just like backticks. The character following %x is the delimiter, and we can use almost any non-alphanumeric character for it, for example (, {, [ or <. For more details refer to this Stack Overflow post.

      %x( echo 'Hello World!')  #=> "Hello World!\n"
    
  3. Kernel#exec
    It replaces the current process with the shell command, so no Ruby code after the call is executed.
    In the following example, exec exits the ruby console and executes the echo command.

      exec("echo 'Hello World!'")  #=> It exits ruby console and prints "Hello World" on terminal
    
  4. Kernel#system
    It executes the given command in a subshell.
    It returns true if the command exits with a zero status, false if it exits with a non-zero status, and nil if the command could not be executed.

      system('invalid command')                 #=> nil
      system('echo system command executed')    #=> true
      system('echo new command | grep another') #=> false
    

For more information, refer to this rubyguide.
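The practical difference between these methods is what they hand back to Ruby. The short sketch below contrasts them; exec is left out because it would replace the running Ruby process. The redirection to File::NULL is only there to keep the example quiet.

```ruby
# Backticks and %x capture the command's standard output as a String.
out_backticks = `echo 'Hello World!'`
out_percent_x = %x[echo 'Hello World!'] # same behaviour, different delimiter

# system does not capture the output; it returns true, false or nil.
# The command's output is discarded here by redirecting it to File::NULL.
ok = system("echo 'Hello World!'", out: File::NULL)

puts out_backticks.inspect
puts out_percent_x.inspect
puts ok.inspect
```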

We will use the system method for error tracking, since it passes the command's output through to STDOUT and returns true, false or nil according to the result.

Let's create a method using Kernel#system to execute the rsync command:

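def sync_files
  # -a (archive mode) makes rsync recurse into /local_dir;
  # without -r or -a the directory would be skipped.
  cmd = 'rsync -a -e ssh /local_dir user@remote-host:/remote_dir/'

  system cmd
end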

In the above method, the string cmd is the rsync command that we want to execute on the terminal. We pass it to the system method. All seems to be good so far, right?

Wait, what happens if the rsync command fails due to an unexpected error, such as the server being out of service?

How to track system errors in Ruby?

First, let's understand what the system method returns:
system returns true if the command exits with a zero status and false if it exits with a non-zero status. It returns nil if the command could not be executed at all.
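The three return values can be observed with the standard Unix utilities true and false. This is a small sketch; the third command name is made up and assumed not to exist on the machine.

```ruby
# `true` exits with status 0, `false` exits with status 1.
succeeded = system('true')
failed    = system('false')

# Assumption: no program with this name is installed, so the
# command cannot be started and system returns nil.
not_found = system('no_such_command_for_this_example')

puts succeeded.inspect
puts failed.inspect
puts not_found.inspect
```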

We can check the error status using $?. It holds an object of class Process::Status with the process ID and exit code of the last executed child process. For example:

#<Process::Status: pid 9167 exit 2>

We can also use $?.success? to check the status; it returns true if the last command succeeded.

raise "rsync failed with status code #{$?.exitstatus}" unless $?.success?
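To see what $? contains without needing a remote host, the sketch below runs a stand-in command that simply exits with status 2 and then inspects the resulting Process::Status object. The command is an assumption chosen for illustration, not an rsync call.

```ruby
# Run a shell that exits with status 2 (a stand-in for a failing command).
system('sh', '-c', 'exit 2')

# Copy $? right away, because the next child process overwrites it.
status = $?

puts status.class
puts status.exitstatus
puts status.success?
puts status.pid.class
```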

Let's modify the above sync_files method to track errors:

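def sync_files
  # -a (archive mode) makes rsync recurse into /local_dir
  system 'rsync -a -e ssh /local_dir user@remote-host:/remote_dir/'

  # exitstatus is nil if rsync was terminated by a signal,
  # so the check below relies on success? rather than zero?
  status = $?.exitstatus

  raise "rsync failed with status code #{status}" unless $?.success?
end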

Now, the above method will raise an error whenever the rsync command fails for any reason.
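The error raised by sync_files contains only the exit code. If the error output is wanted in the log as well, one option is Open3.capture3 from Ruby's standard library, which returns STDOUT, STDERR and the Process::Status together. The following is a sketch under assumptions: run_and_report and RSYNC_EXIT_CODES are hypothetical names, the table lists only a few of the exit codes documented in the rsync man page, and the usage example runs a stand-in shell command instead of a real rsync call.

```ruby
require 'open3'

# A few exit codes as documented in the rsync man page (not the full list).
RSYNC_EXIT_CODES = {
  1  => 'syntax or usage error',
  12 => 'error in rsync protocol data stream',
  23 => 'partial transfer due to error',
  30 => 'timeout in data send/receive'
}.freeze

# Hypothetical helper: runs a command given as an argument list and raises
# with the exit code and the captured STDERR when the command fails.
def run_and_report(*argv)
  stdout, stderr, status = Open3.capture3(*argv)
  return stdout if status.success?

  code = status.exitstatus # nil if the process was killed by a signal
  reason = RSYNC_EXIT_CODES.fetch(code, 'see the rsync man page')
  raise "command failed with status code #{code.inspect} (#{reason}): #{stderr.strip}"
end

# Usage with a stand-in command that mimics a failure with exit code 12.
begin
  run_and_report('sh', '-c', 'echo "connection refused" >&2; exit 12')
rescue RuntimeError => e
  puts e.message
end
```

For a real sync, the argument list would be the rsync command itself, for example run_and_report('rsync', '-a', '-e', 'ssh', '/local_dir', 'user@remote-host:/remote_dir/').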

I hope you enjoyed this article and learned how to use rsync in Ruby. Thank you.