Installing and Configuring Confluence Data Center on AWS


Or at least how I did it


Disclaimer

This is my experience and steps I performed, YMMV.
Some of these steps might not be required for your environment and some other steps might be.
Some of the values might be different for you, where I think that might be the case they will appear as variables like so: ${VARIABLE}.
This guide is to help you, not tell you what to do, so I take no responsibility if you screw up. OK?

Official documentation can be found here [1].

The Goal

We hope to achieve an instance of Confluence running on a cluster of nodes.

Side-note: The only important difference between Confluence and Confluence Data Center is that the Data Center version runs as a cluster of nodes.
It is exactly the same piece of software, but using a different licence.

We will be using:

  • Confluence version 7.2.1
  • RHEL7.5
  • m5.large (2 vCPU, 8 GB Mem) for nodes
  • 60 GB for node storage
  • Oracle 12c SE for the database, running on RDS with type db.m4.large (2 vCPU, 8 GB Mem)
  • 200 GB of Database storage
  • EFS for the shared storage
  • ALB for the load balancer that serves to the cluster

The load balancer will use HTTPS both to the clients and to the Confluence nodes.

The subnets, number of nodes and other various configurations not listed are likely of no importance to the build.

Useful Information

Here is some information that could prove useful.

These are the defaults and are assumed below; if your values differ (you would likely know), substitute them as appropriate.

Home Directory ${CONFLUENCE_HOME} = /var/atlassian/application-data/confluence/

Install Directory ${CONFLUENCE_INSTALL} = /opt/atlassian/confluence/

The keytool is located in ${CONFLUENCE_INSTALL}/jre/bin/

If you navigate to Confluence using the same hostname as you use for Jira, timing issues can occur that result in being logged off for inactivity. This can happen if you are installing both on the same node without different URLs, or using SSH tunnelling, and therefore accessing both Confluence and Jira via 'localhost'. Using separate hostnames or different browsers gets around this issue.

Pre-install steps

A number of tasks are recommended before installing Confluence. These should be done on all the nodes:

  • Update hostname of your nodes
  • Update the proxy settings of the node
    • This is particularly important to do for your package manager (e.g. yum) as the install step may require some additional packages
    • Don’t forget the proxy bypass list too
  • Join the domain
  • Mount the EFS file system.
    1. Create the shared directory. e.g. /confluence-shared
    2. Add to /etc/fstab. e.g.
${IP/HOST OF EFS MOUNT}:/ ${SHARED DIRECTORY} nfs4 defaults,_netdev 0 0
  • Download the installers and upload to each node
    • Make the installer executable
chmod +x ${INSTALLER}
  • Download and upload the Oracle DB drivers
    • Extract the tar.gz file
tar -xf ${DRIVERS ARCHIVE}
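The EFS mount step above can be sketched as a small script. The helper name `efs_fstab_entry` and the example address 10.0.0.50 are illustrative assumptions, not values from my environment; the mount options are the ones from the fstab line above.

```shell
#!/bin/sh
# Hypothetical helper: print the /etc/fstab entry for an EFS mount target.
# $1 = IP or hostname of the EFS mount target, $2 = local shared directory.
efs_fstab_entry() {
    printf '%s:/ %s nfs4 defaults,_netdev 0 0\n' "$1" "$2"
}

# Example usage (commented out because it changes the system):
#   sudo mkdir -p /confluence-shared
#   efs_fstab_entry 10.0.0.50 /confluence-shared | sudo tee -a /etc/fstab
#   sudo mount -a                      # mount everything listed in /etc/fstab
#   mount | grep /confluence-shared    # confirm the share is mounted
```

Running `mount -a` straight after editing /etc/fstab is a cheap way to catch a typo before the next reboot does.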

Command-line installation

These steps need to be done on all the nodes.

Now we can install Confluence:

sudo ${INSTALLER}

When prompted, enter the following values:

  1. o # Start installation
  2. 1 # Express installation, use defaults
  3. i # Install
  4. y # Start Confluence service

Next we need to add the Oracle DB drivers

sudo cp -r ${DB DRIVERS FOLDER}/* ${CONFLUENCE_INSTALL}/lib/

Next we want to ensure that permissions are set correctly. We will do this by setting the owner to a user called ‘confluence’, which was created as part of the install.

sudo chown -R confluence ${CONFLUENCE_HOME}
sudo chown -R confluence ${CONFLUENCE_INSTALL}
sudo chown -R confluence ${SHARED DIRECTORY}

Finally, restart the Confluence service so that these post-installation changes are picked up.

While you can use systemctl to start and stop the service, the method below is recommended by the Confluence documentation and prints its output to the command line.

sudo /etc/init.d/confluence stop
sudo /etc/init.d/confluence start
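To check whether a node came back up after a restart, something like the sketch below might help. It assumes Confluence exposes a `/status` endpoint that returns a small JSON document such as `{"state":"RUNNING"}` (a freshly installed, unconfigured node would report a different state), and that the HTTP connector listens on port 8090 unless you pass another port. The function names are my own.

```shell
#!/bin/sh
# Extract the value of "state" from a status document such as {"state":"RUNNING"}.
parse_state() {
    printf '%s' "$1" | sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Query a node's status endpoint and print its state.
# $1 = host (default localhost), $2 = HTTP port (assumed default 8090).
node_state() {
    parse_state "$(curl -s --max-time 5 "http://${1:-localhost}:${2:-8090}/status")"
}

# Example usage:
#   node_state localhost
```

An empty result means the node did not answer or returned something unexpected, which usually just means it is still starting.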

Database Configuration

This needs to be done prior to the Web Interface Configuration but can be done before or after the actual install via the command-line.

Confluence requires that a database user be created with certain permissions rather than the DB admin user. The following commands create a tablespace, user and grant that user the minimum permissions it needs:

create tablespace confluencetablespace datafile size 1G autoextend on;
create user confluenceuser identified by ${PASSWORD} default tablespace confluencetablespace quota unlimited on confluencetablespace;
grant connect to confluenceuser;
grant resource to confluenceuser;
grant create table to confluenceuser;
grant create sequence to confluenceuser;
grant create trigger to confluenceuser;

Web Interface Configuration

This is done on a single node, so it is a good idea to stop the Confluence service on all but one node first.

This is done as a series of forms with ‘points of no return’. Thus if you make a mistake and are unable to go back, you can start again by deleting the contents of ${CONFLUENCE_HOME} and restarting the Confluence service. You can then start the Web interface configuration again.

  1. Using a web browser, navigate to the node still running Confluence over HTTP (the default port is 8090).

  2. First page

    • Select “Production Installation”
  3. Get apps

    • These additional apps are out of scope for this guide
  4. License Key

    • Enter your own License, either a purchased or evaluation one.
  5. Configure cluster node

    • Enter a cluster name and the location of the shared directory
    • While multicast is not supported in AWS VPCs, we will configure an alternative in a separate step; so leave as multicast for now
  6. Set up your database

    • Database type = Oracle 12c
    • Connection type = Simple
    • Username and Password should be that of the user created in an earlier step
    • Fill in the other fields with your own appropriate values
    • It is a good idea to test the connection
    • If you previously went through this setup and are doing it again, then you may be advised that the database contains existing data. Continue with the setup anyway.
  7. Load Content

    • Create a sample site. This can be deleted later, but it gets you up and going.
  8. Configure User Management

    • Select User Management. We will configure Active Directory integration in a later step
  9. Admin Account

    • I’ll let you decide what you want to put here
  10. Setup Successful

    • Select Start
  11. Log in

    • Log in with the admin account you created earlier

Increasing Memory

At this point, you have Confluence installed on all nodes, but only configured on one of them.

  1. Open the file ${CONFLUENCE_INSTALL}/bin/setenv.sh
  2. Modify the values for -Xms and -Xmx for min and max heap size respectively
    • Due to the way Java works, this setting does not limit the maximum amount of memory that will be used. There is additional overhead for managing the Java Virtual Machine as well.
    • Setting these parameters to the same value may reduce the memory management overhead.
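If you would rather script the heap change than edit the file by hand on every node, a sketch along these lines could work. It assumes the heap flags appear in setenv.sh in the usual `-Xms1024m -Xmx1024m` form; `set_heap` is a hypothetical helper name and the 4g values are only an example, not a recommendation.

```shell
#!/bin/sh
# Hypothetical helper: rewrite every -Xms/-Xmx value in a setenv.sh-style file.
# $1 = file, $2 = min heap (e.g. 4g), $3 = max heap (e.g. 4g).
set_heap() {
    tmp=$(mktemp) || return 1
    sed -e "s/-Xms[0-9]*[kKmMgG]/-Xms$2/g" \
        -e "s/-Xmx[0-9]*[kKmMgG]/-Xmx$3/g" "$1" > "$tmp" &&
        cat "$tmp" > "$1"    # overwrite in place so owner and mode are kept
    status=$?
    rm -f "$tmp"
    return $status
}

# Example usage (run as a user that can write the file):
#   set_heap "${CONFLUENCE_INSTALL}/bin/setenv.sh" 4g 4g
#   grep -n 'Xm[sx]' "${CONFLUENCE_INSTALL}/bin/setenv.sh"   # review the result
```

Take a copy of setenv.sh first; the substitution touches every matching flag in the file, including any in comments.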

Adding to the Load Balancer

At this point, you have Confluence installed on all nodes, but only configured on one of them.

Tomcat (the server that powers Confluence) needs to be told about the proxy or reverse proxy that users connect through when they are not connecting directly. This is therefore important for clustering.

  1. Open ${CONFLUENCE_INSTALL}/conf/server.xml
  2. Do NOT comment out the default http connector. This can break the clustering and internal RSS feeds
  3. Uncomment the HTTPS behind proxy connector
    • Also change the value for proxyName to the url you will use to connect to the load balancer
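After editing server.xml, a quick text check can confirm that the proxyName you set is actually in the file (handy once the file has been copied to several nodes). This is a rough sketch: `list_proxy_names` is a hypothetical helper, and because it is a plain text match it also lists values from connectors that are still commented out.

```shell
#!/bin/sh
# List every proxyName value found in a Tomcat server.xml.
# Note: this is a plain text match, so commented-out connectors are listed too.
list_proxy_names() {
    grep -o 'proxyName="[^"]*"' "$1" | sed 's/^proxyName="//; s/"$//'
}

# Example usage:
#   list_proxy_names "${CONFLUENCE_INSTALL}/conf/server.xml"
```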

Clustering

At this point, you have Confluence installed on all nodes, but only configured on one of them.

  1. Stop the Confluence service on all nodes
  2. Update ${CONFLUENCE_HOME}/confluence.cfg.xml to tell the cluster how to reach the other nodes. There are two ways to do this: either hard-code the IPs of all the cluster members, or give Confluence access to the AWS APIs so it can discover them based on attributes such as specified tags [2].

The AWS discovery option requires that the Confluence nodes have access to the EC2 endpoint of the AWS APIs.
You might need to set a proxy for this, but that is not covered in this guide.

Set: confluence.cluster.join.type = aws
Remove: confluence.cluster.address
Add: confluence.cluster.aws.region = ap-southeast-2 (or whichever region is appropriate)
Add: confluence.cluster.aws.tag.key and set it to your own tag name, e.g. Application
Add: confluence.cluster.aws.tag.value and set it to your own tag value, e.g. Confluence
Add: confluence.cluster.aws.iam.role = <Name (not ARN) of IAM role to assume> # Note that this requires the Confluence nodes to have permission to assume their own role

Note that there are other properties that can be used other than tags. See the documentation for more details.

Hard-coding the IPs is less scalable, but if the AWS APIs are not reachable then it is your only option.

Set: confluence.cluster.join.type = tcp_ip
Remove: confluence.cluster.address
Add: confluence.cluster.peers and set it to a comma-delimited list of the IPs of all the nodes in the cluster, e.g. 10.0.0.1,10.0.0.2,10.0.0.3

  3. Copy the install directory and home directory from the node you performed the web configuration on to the remaining nodes (be careful to preserve permissions).

  4. Start the servers back up ONE AT A TIME!
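The property changes from step 2 can be sketched as below. This assumes confluence.cfg.xml stores each setting as a `<property name="...">value</property>` element, so compare the output with the entries already in your own file before pasting anything in. The function names and the role name in the example are hypothetical.

```shell
#!/bin/sh
# Print one confluence.cfg.xml property element.
cfg_property() {
    printf '    <property name="%s">%s</property>\n' "$1" "$2"
}

# Properties for AWS discovery.
# $1 = region, $2 = tag key, $3 = tag value, $4 = IAM role name (not ARN).
aws_cluster_properties() {
    cfg_property confluence.cluster.join.type aws
    cfg_property confluence.cluster.aws.region "$1"
    cfg_property confluence.cluster.aws.tag.key "$2"
    cfg_property confluence.cluster.aws.tag.value "$3"
    cfg_property confluence.cluster.aws.iam.role "$4"
}

# Properties for a fixed peer list.
# $1 = comma-delimited list of node IPs.
tcp_ip_cluster_properties() {
    cfg_property confluence.cluster.join.type tcp_ip
    cfg_property confluence.cluster.peers "$1"
}

# Example usage:
#   aws_cluster_properties ap-southeast-2 Application Confluence my-confluence-role
#   tcp_ip_cluster_properties 10.0.0.1,10.0.0.2,10.0.0.3
```

Whichever option you pick, remember that confluence.cluster.address still has to be removed and that the existing join type entry is replaced rather than duplicated.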

Enable HTTPS

Follow these steps if you need end-to-end encryption and offloading SSL termination to the load balancer is not sufficient.

These instructions assume that you have a certificate chain, a signed certificate and a private key for the certificate.

  1. Stop the cluster

  2. Generate a keystore using the keytool

keytool -genkeypair -keysize 2048 -alias tomcat -keyalg RSA -sigalg SHA256withRSA

The prompted information is not important as we will be deleting the key soon anyway, but it does generate a keystore which we will use.

  3. Move it to the appropriate directory and change the ownership if required
mv ~/.keystore ${CONFLUENCE_HOME}/confluence.jks
chown confluence ${CONFLUENCE_HOME}/confluence.jks
  4. Delete the cert currently in the keystore
keytool -delete -alias tomcat -keystore ${CONFLUENCE_HOME}/confluence.jks
  5. Create a temporary pkcs12 keystore using the certs and key you have
openssl pkcs12 -export -in ${SIGNED CERTIFICATE} -inkey ${PRIVATE KEY} -out tmpkeystore -name tomcat -CAfile ${CERT CHAIN/BUNDLE/ROOT} -caname root
  6. Import the temporary keystore into the real keystore
keytool -importkeystore -deststorepass ${KEYSTORE PASSWORD} -destkeypass ${KEYSTORE PASSWORD} -destkeystore ${CONFLUENCE_HOME}/confluence.jks -srckeystore tmpkeystore -srcstoretype PKCS12 -srcstorepass ${SOURCE KEYSTORE PASSWORD} -alias tomcat

${SOURCE KEYSTORE PASSWORD} is the export password you gave in step 5.

The default keystore password is ‘changeit’.

Once the import has succeeded, you can delete the temporary keystore ‘tmpkeystore’.

  7. Modify the file ${CONFLUENCE_INSTALL}/conf/server.xml

    • On the HTTPS connector, add the properties:
      • keystorePass="${KEYSTORE PASSWORD}"
      • keystoreFile="${CONFLUENCE_HOME}/confluence.jks"
      • clientAuth="false"
      • sslProtocol="TLSv1.2"
      • sslEnabledProtocols="TLSv1.2"
      • SSLEnabled="true"
      • And make sure the protocol is: protocol="org.apache.coyote.http11.Http11NioProtocol"
  8. Copy both the server.xml and the jks file to the other nodes

  9. Start the cluster back up
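A mismatched certificate and private key is an easy mistake to make before building the pkcs12 keystore, and it only shows up later as a confusing Tomcat error. The sketch below compares the public key embedded in the certificate with the one derived from the private key; it assumes PEM-encoded files and an unencrypted key (openssl will prompt for a passphrase otherwise). `cert_matches_key` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeeds when the certificate ($1) and private key ($2) share a public key.
cert_matches_key() {
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
    key_pub=$(openssl pkey -in "$2" -pubout) || return 1
    [ "$cert_pub" = "$key_pub" ]
}

# Example usage:
#   cert_matches_key signed-cert.pem private-key.pem && echo "cert and key match"
#
# After the import, the keystore entry can be inspected with:
#   keytool -list -keystore ${CONFLUENCE_HOME}/confluence.jks -alias tomcat
```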

Install Trusted CA

These steps should be done on each of the nodes

If you are using private certificates, you might need to install the certificate, or its intermediate or root certificates, onto the servers.

To add a certificate to the trust store of the OS:

sudo yum install ca-certificates
sudo cp ${CERTIFICATE FILE} /etc/pki/ca-trust/source/anchors/
sudo update-ca-trust enable
sudo update-ca-trust extract
sudo update-ca-trust

You might need to add the certificates to Java’s Trust Store as well. This can be done with the commands:

sudo ${CONFLUENCE_INSTALL}/jre/bin/keytool -import -alias Confluence -keystore ${CONFLUENCE_INSTALL}/jre/lib/security/cacerts -file ${CERTIFICATE FILE}
sudo /etc/init.d/confluence restart

Single Sign-On

Official documentation can be found here [3].

Adding a Directory

  1. In the Confluence Console, go to Settings > User Management > User Directories
  2. Click “Add Directory”
  3. Configuration
    • Name: I recommend the name of the domain like ’example.com’
    • Directory Type = Microsoft Active Directory
    • Hostname: The Hostname or IP of the server running LDAP service. A good practice is to make all the Domain Controllers accessible to your Confluence nodes in which case you can use the domain name (e.g. example.com) as this should return the IPs of all the Domain Controllers.
    • Port: 389 is the default, or 636 when using SSL. Using SSL is recommended, but it is not an option if your LDAP service is not configured for it. I suggest you try SSL first and fall back to non-SSL if that fails.
    • Username & Password: Use your own values
    • LDAP Schema Parameters: These will be particular to your setup
    • LDAP Permissions: Select “Read Only with Local Groups”
    • Default Group Memberships = “confluence-users”
    • User Schema Settings: Depending on the configuration of your SAML provider, you might need to change the parameters ‘User Object Filter’ and ‘User Name Attribute’. For example, the default is to use the short name, but in my case the SAML provider used the full email address of the user, so I had to change the values as below:
      • User Object Filter = “(&(objectCategory=Person)(userPrincipalName=*))”
      • User Name Attribute = “userPrincipalName”
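Before saving the directory settings, it can be worth testing the object filter directly against the domain. The sketch below builds a base DN from a domain name and shows an ldapsearch query using the filter above. The helper name `domain_to_base_dn`, the service account svc-confluence@example.com, and the assumption that your base DN simply mirrors the domain name are all illustrative.

```shell
#!/bin/sh
# Convert a DNS domain (example.com) into an LDAP base DN (dc=example,dc=com).
domain_to_base_dn() {
    printf 'dc=%s\n' "$1" | sed 's/\./,dc=/g'
}

# Example query (needs the openldap-clients package and a real bind account):
#   ldapsearch -x -H ldap://example.com \
#       -D 'svc-confluence@example.com' -W \
#       -b "$(domain_to_base_dn example.com)" \
#       '(&(objectCategory=Person)(userPrincipalName=*))' userPrincipalName
```

If the query returns the users you expect, with userPrincipalName values that match what your SAML provider sends, the same filter should behave the same way in Confluence.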

Adding a SAML provider

This is not a tutorial on how to configure your SAML provider, but how to configure Confluence to use it

  1. In the Confluence Console, go to Settings > System > SSO 2.0
  2. Configure as such:
    • Authentication Method: Select “SAML single sign-on”
    • The ‘SAML SSO 2.0 Settings’ parameters depend on your SAML provider settings
    • Login mode: This determines if when accessing Confluence, you will be automatically redirected to your SAML provider (e.g. ADFS) or you need to use a special url to sign on with that method (this is the “Assertion Consumer Service URL” at the bottom of the page). Keeping SAML as the secondary method is recommended while performing initial configuration and testing so you can still use the local directory (e.g. the primary admin account).

Written by Kieran Goldsworthy, Cloud Engineer and Architect

