Rails Architecture and Scalability


Everybody knows where the “Rails doesn’t scale” claim began:

Panic, we must not use Ruby on Rails because Twitter had scalability problems.

I’m a Java developer who loves the Ruby language but hasn’t written anything in Rails. Despite this, I decided to look deeper into the famous “Rails doesn’t scale” statement to understand the root cause of the problem.

So I collected publicly available Ruby on Rails architecture and scalability case studies (videos of conference talks, reports and blog posts) and tried to extract the general patterns of architecture and scalability issues.

There are two types of scalability issues:

  1. Application performance - when the web application can’t handle huge traffic
  2. Delivery velocity - when it becomes hard to make changes in a big Rails application, run quick tests, deploy it and manage a big team

Note: You are not another “Twitter”, so you don’t need to worry about scalability issues right from the beginning of the project (few web apps on the Internet get enough traffic to even care about scalability). Your goal is to ship your product as quickly as possible. But at the same time you’d like to use Rails (for its productivity) and keep your application potentially scalable (in every sense).

The most interesting part is that scalability is mostly about architecture, databases, caching, event queues and disk IO, and less about the Rails framework.

Rails deployment architecture

Let’s review the common Rails deployment approaches (see Deployment with Ruby on Rails)

Simple Rails Setup

One Rails instance handles all requests. Rails is single-threaded here, so only one request is processed at a time.

Typical Rails Setup

  • A load-balancer distributes the incoming requests
  • Some load-balancers will deliver static requests themselves
  • Several Rails instances handle all requests
  • Number of concurrent requests equals number of Rails instances
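
To make the last point concrete, here is a rough back-of-the-envelope sketch. It assumes every instance serves exactly one request at a time; the function name and the sample numbers are mine, not taken from the case studies.

```ruby
# Upper bound on throughput when each Rails instance handles one request at a time.
# instances and avg_seconds_per_request are illustrative inputs.
def max_requests_per_second(instances, avg_seconds_per_request)
  unless avg_seconds_per_request.positive?
    raise ArgumentError, 'avg_seconds_per_request must be positive'
  end

  instances / avg_seconds_per_request.to_f
end

# 4 instances, 200 ms per request on average
puts max_requests_per_second(4, 0.2)
```

Real numbers will be lower, because this ignores queueing at the load balancer and uneven request times.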

Application server (Phusion Passenger)

  • Uses the Phusion Passenger application server
  • Makes setup easier on the single-machine level
  • Multiple servers still require a load balancer
  • Suitable for mass hosting
  • Upcoming standard way of deploying Rails

Recommended Rails Application Setups

Small Site

Apache with mod_rails/Phusion Passenger

Medium Site

  • Apache/Nginx as frontend proxy
  • Passenger as backend
  • Deliver static files with Apache/Nginx

Large Site

  • Redundant load balancer
  • Redundant proxy
  • Phusion Passenger/mod_rails

Scale up Traditional Rails Application

Caching

There might be cases when this is not enough. In that case we should start looking in the caching direction by introducing Memcached and/or Redis (based on Konstantin Gredeskoul’s slides).
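
The idea behind this kind of caching can be sketched in a few lines. This is only an illustration of the cache-aside pattern: a plain Hash stands in for Memcached or Redis, and TinyCache with its ttl and clock parameters are hypothetical names I made up for the example.

```ruby
# Minimal cache-aside sketch. A Hash stands in for Memcached or Redis;
# TinyCache, ttl_seconds and clock are hypothetical names for illustration.
class TinyCache
  def initialize(ttl_seconds, clock: -> { Time.now.to_f })
    @ttl = ttl_seconds
    @clock = clock
    @store = {}
  end

  # Returns the cached value for key if it has not expired yet;
  # otherwise runs the block, stores its result and returns it.
  def fetch(key)
    entry = @store[key]
    return entry[:value] if entry && entry[:expires_at] > @clock.call

    value = yield
    @store[key] = { value: value, expires_at: @clock.call + @ttl }
    value
  end
end

lookups = 0
cache = TinyCache.new(60)
2.times { cache.fetch('user:1') { lookups += 1; 'Alice' } }
puts lookups # the expensive block ran only for the first fetch
```

The expensive lookup (a database query, a rendered fragment) goes into the block, so the caller does not care whether the value came from the cache or was just computed.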

Long-running task scaling

(based on Konstantin Gredeskoul’s slides)

  • Background jobs with Resque (it sits on top of Redis)
  • Use Solr/Elasticsearch instead of doing complex joins
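
The background job idea can be sketched without any infrastructure. The example below is not Resque’s API; it uses Ruby’s built-in Queue and a thread as an in-process stand-in, and run_in_background plus the “resized” placeholder work are hypothetical.

```ruby
# In-process stand-in for a Resque-style queue. run_in_background and the
# "resized" placeholder are hypothetical; Resque itself keeps jobs in Redis
# and runs workers as separate processes.
def run_in_background(files)
  jobs = Queue.new
  results = Queue.new

  worker = Thread.new do
    # Queue#pop blocks until a job arrives; the :stop marker ends the loop.
    while (job = jobs.pop) != :stop
      results << "resized #{job}" # placeholder for the slow work
    end
  end

  files.each { |file| jobs << file } # the "request" only enqueues and moves on
  jobs << :stop
  worker.join

  Array.new(results.size) { results.pop }
end

puts run_in_background(%w[a.png b.png]).inspect
```

The point is that the code handling the web request only pays the cost of the enqueue, while the slow part runs elsewhere.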

Rails moving towards SOA and micro services

The architectures shown above are all variations of a monolithic architecture. This type of architecture has some problems:

  • Development pain points:
    • Controllers and models contain a lot of logic
      • ~1000 Models/Controllers, 200K LOC, 100s of jobs
    • Merge issues arise in a big team (20-30+)
    • Lots of contributors and no ownership
    • Difficult deployments with long integration cycles
    • Tests are not green, and it’s really hard to keep test quality stable

The monolithic Rails app should evolve into ecosystems of connected services. It’s becoming quite common for Rails apps to be working mainly as clients to other services.

Splitting application into small pieces

  • Split into smaller applications (based on Konstantin Gredeskoul’s slides)
    • Contains web UI, logic and data
      • Extract look and feel into gem to share across apps
    • May combine with other apps
    • May rely on common libraries
    • Typically run in their own Ruby VM
  • Extract services and create APIs
    • Create client API wrapper gems for consumers
  • Extract libraries (gems)
    • Create a shared base client gem library
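
A client API wrapper of the kind mentioned above might look roughly like this. Everything here is hypothetical (the UsersClient class, the /users/:id path, the users.internal host); the transport is injected so that the sketch runs without a network.

```ruby
require 'json'

# Hypothetical client wrapper that a "users" service could ship as a gem.
# Consumers call find(id) and never build URLs or parse JSON themselves.
class UsersClient
  # transport: any object responding to call(url) and returning a JSON string
  def initialize(base_url, transport)
    @base_url = base_url
    @transport = transport
  end

  def find(id)
    JSON.parse(@transport.call("#{@base_url}/users/#{id}"))
  end
end

# Fake transport for illustration; a real one would perform an HTTP GET.
fake_transport = lambda do |url|
  JSON.generate('id' => url.split('/').last.to_i, 'name' => 'Alice')
end

client = UsersClient.new('http://users.internal', fake_transport)
puts client.find(42)['name']
```

Keeping the transport separate also makes it easy for consuming apps to stub the service in their own tests.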

Reference (sample) service-oriented / micro service architecture

Now we have more than one Rails application and many services that communicate using messaging, distributed caches, etc.

Groupon

Flipkart

Gilt

I’ve collected many more case studies, see here.

Conclusion

Moving a monolithic Rails application to a micro service architecture is not a one-shot action. It’s a long run with lots of trade-offs. Moreover, a micro service architecture is not a silver bullet; it’s just one alternative way to scale your application (see Recommended Rails Application Setups).

The key idea is to develop your application with SRP (the Single Responsibility Principle) in mind. The more modular your application is, the more scalable it is.

I’m planning to add more architecture case studies to my collection (not only Rails related). Stay tuned.

Rails related tech components

Collection of major technology components mentioned in case studies.

Web Servers / Proxy

Application Servers

Libraries / Tools

References

How to Download Jars From Maven Central


We know how to download Java libraries with their dependencies (transitive ones included) via a Maven pom.xml, an Ant/Ivy build.xml script, a Gradle build.gradle script, etc. But what if we need to download them without these scripts?

There are several ways to do this. Assume that we’d like to download the spark-core library (groupId=com.sparkjava, artifactId=spark-core, version=2.1) with all its dependencies from Maven Central into a lib folder.

Use Maven3 dependency plugin

Here are three variants for downloading the library:

Download library with all dependencies
# Specify repoUrl (it's optional)
mvn dependency:get -DrepoUrl=http://download.java.net/maven/2/ -DgroupId=com.sparkjava -DartifactId=spark-core -Dversion=2.1

# OR use default repoUrl
mvn dependency:get -DgroupId=com.sparkjava -DartifactId=spark-core -Dversion=2.1

# OR use parameter artifact as groupId:artifactId:version
mvn dependency:get -Dartifact=com.sparkjava:spark-core:2.1

Now we need to copy the downloaded artifacts into our working directory:

Copy jars from local maven repo
mvn dependency:copy-dependencies -f $HOME/.m2/repository/com/sparkjava/spark-core/2.1/spark-core-2.1.pom -DoutputDirectory=$(pwd)/lib
# the previous command doesn't copy spark-core-x.x.jar, that's why we should copy it manually
cp $HOME/.m2/repository/com/sparkjava/spark-core/2.1/spark-core-2.1.jar $(pwd)/lib
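
The paths under $HOME/.m2/repository used above follow the standard Maven repository layout: dots in the groupId become directory separators, followed by the artifactId, the version and the file name. Here is a small sketch of that mapping; the function name is my own.

```ruby
# Builds the repository-relative path of an artifact in the standard Maven layout.
# maven_artifact_path is a hypothetical helper name.
def maven_artifact_path(group_id, artifact_id, version, ext = 'jar')
  File.join(group_id.tr('.', '/'), artifact_id, version,
            "#{artifact_id}-#{version}.#{ext}")
end

puts maven_artifact_path('com.sparkjava', 'spark-core', '2.1')
puts maven_artifact_path('com.sparkjava', 'spark-core', '2.1', 'pom')
```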

Use standalone Ivy

We can use Ivy as standalone jar to download Maven dependencies without creating Ant build file:

# 1. Download the latest ivy jar (currently it's v.2.4.0)
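# (the URL is quoted so the shell does not treat "?" as a glob, and -o sets the output file name explicitly)
curl -L -o ivy-2.4.0.jar "http://search.maven.org/remotecontent?filepath=org/apache/ivy/ivy/2.4.0/ivy-2.4.0.jar"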

# 2. Run ivy.jar to retrieve all dependencies
java -jar ivy-2.4.0.jar -dependency com.sparkjava spark-core 2.1 -retrieve "lib/[artifact]-[revision](-[classifier]).[ext]"

As you can see, the Ivy approach is much simpler. The only con (or pro, depending on your view) is that ivy.jar has to be downloaded separately.
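
The -retrieve pattern deserves a short explanation. As far as I understand Ivy’s pattern syntax, [token] is replaced with the token value, and a part in parentheses is optional: it is dropped when the token inside it has no value. The sketch below imitates that behaviour; it is my own approximation, not Ivy’s code, and expand_pattern is a made-up name.

```ruby
# Approximates how an Ivy-style pattern is expanded: [token] is replaced by its
# value, and a parenthesized optional part is dropped when a token inside it is empty.
# expand_pattern is a hypothetical helper, not part of Ivy.
def expand_pattern(pattern, tokens)
  without_optional = pattern.gsub(/\(([^()]*)\)/) do
    part = Regexp.last_match(1)
    names = part.scan(/\[(\w+)\]/).flatten
    names.all? { |name| !tokens[name].to_s.empty? } ? part : ''
  end
  without_optional.gsub(/\[(\w+)\]/) { tokens[Regexp.last_match(1)].to_s }
end

pattern = 'lib/[artifact]-[revision](-[classifier]).[ext]'
puts expand_pattern(pattern, { 'artifact' => 'spark-core', 'revision' => '2.1', 'ext' => 'jar' })
puts expand_pattern(pattern, { 'artifact' => 'spark-core', 'revision' => '2.1',
                               'classifier' => 'sources', 'ext' => 'jar' })
```

So the optional classifier part only shows up in the file name for artifacts that actually have a classifier.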

Calling Ivy from Groovy or Java

Here I’ve decided to keep Evgeny Goldin’s code snippet as a reference for myself. Downloading artifacts programmatically is not a common operation, but it’s always nice to know the general idea of how it can be done, especially since the Ivy documentation is not very informative.

Groovy snippet of calling Ivy
import org.apache.ivy.Ivy
import org.apache.ivy.core.module.descriptor.DefaultDependencyDescriptor
import org.apache.ivy.core.module.descriptor.DefaultModuleDescriptor
import org.apache.ivy.core.module.id.ModuleRevisionId
import org.apache.ivy.core.resolve.ResolveOptions
import org.apache.ivy.core.settings.IvySettings
import org.apache.ivy.plugins.resolver.URLResolver
import org.apache.ivy.core.report.ResolveReport
import org.apache.ivy.plugins.parser.xml.XmlModuleDescriptorWriter


public File resolveArtifact(String groupId, String artifactId, String version) {
        //creates clear ivy settings
        IvySettings ivySettings = new IvySettings();
        //url resolver for configuration of maven repo
        URLResolver resolver = new URLResolver();
        resolver.setM2compatible(true);
        resolver.setName('central');
        //you can specify the url resolution pattern strategy
        resolver.addArtifactPattern(
            // Maven Central only accepts HTTPS requests nowadays
            'https://repo1.maven.org/maven2/[organisation]/[module]/[revision]/[artifact](-[revision]).[ext]');
        //adding maven repo resolver
        ivySettings.addResolver(resolver);
        //set to the default resolver
        ivySettings.setDefaultResolver(resolver.getName());
        //creates an Ivy instance with settings
        Ivy ivy = Ivy.newInstance(ivySettings);

        File ivyfile = File.createTempFile('ivy', '.xml');
        ivyfile.deleteOnExit();

        String[] dep = [groupId, artifactId, version]

        DefaultModuleDescriptor md =
                DefaultModuleDescriptor.newDefaultInstance(ModuleRevisionId.newInstance(dep[0],
                dep[1] + '-caller', 'working'));

        DefaultDependencyDescriptor dd = new DefaultDependencyDescriptor(md,
                ModuleRevisionId.newInstance(dep[0], dep[1], dep[2]), false, false, true);
        md.addDependency(dd);

        //creates an ivy configuration file
        XmlModuleDescriptorWriter.write(md, ivyfile);

        String[] confs = ['default'];
        ResolveOptions resolveOptions = new ResolveOptions().setConfs(confs);

        //init resolve report
        // File.toURL() is deprecated, so go through toURI() first
        ResolveReport report = ivy.resolve(ivyfile.toURI().toURL(), resolveOptions);

        //so you can get the jar library
        File jarArtifactFile = report.getAllArtifactsReports()[0].getLocalFile();

        return jarArtifactFile;
}

resolveArtifact( 'log4j', 'log4j', '1.2.16' )

References