Docker Docker Docker Dockerize everything!

In this post you will learn:

  • What is Docker?
  • How to install and run it
  • More about Docker images
  • How to return to the host-machine without killing my container?
  • How to use your own internal Docker registry to store images
  • How to push your image to Artifactory
  • The Dockerfile: Best practices & versioning
  • How to copy a Docker image to some other Docker host
  • Docker networking
  • How to monitor containers

So, let’s get started.

What is Docker?

It is magic powered by unicorn blood.

Think of it as a Virtual Machine but, instead of stacking a Hypervisor and a guest Operating System below the application you want to run, it shares a “sub-context” of the host’s Linux Kernel and lets you run other Linuxes within the same Linux: completely separated environments with their own libraries + OS tools + applications + exposed ports, etc. Why is it so cool, you might ask? Once you assemble your container with everything you want, you can take a snapshot of it (known as a Docker IMAGE) and spin up new containers from that snapshot. You can run multiple instances of a given image. Think of it in OOP terms: just as you create instances out of a class, you create containers out of a Docker image.

Beyond the virtualized, isolated characteristics, it is really time-efficient. There’s no Guest OS to boot, so a container starts in a split second with the software you want: Docker simply runs the command you set for the container as a foreground process. If that process is a web or application server, it comes up straight away, ready to serve requests.

How to install and run it

The installation is very straightforward. You can find the details in the link below:

Don’t forget to start your Docker daemon:

# /bin/systemctl start docker.service

If you are in a Mac OS environment, find the little Docker icon on the top-right corner of your screen:

Then you can start a container and play in a completely isolated / virtualized environment:
# docker run -it --name mycontainer centos /bin/sh

Quick review of the docker run syntax:

Parameter                 Description
-d (detached)             Runs the container in the background (not interactive)
--name                    Name of the container
-h (hostname)             Hostname within the Docker network
--link                    Allow communication with another container in the Docker network
-p (port)                 Published port (<host_port>:<container_port>)
-v (volume)               Mapped volume/disk path (<host_path>:<container_path>)
<image>                   Name of the Docker image
-w (working directory)    Initial directory for the container command
-t (tty / terminal)       Assign a pseudo-tty to the container
-i (input)                Keep STDIN of the container open (interactive)

Here is a slightly more complex example (running a local project just to illustrate):

# docker run -d --name mykanban -h mykanbanapp --link mydbcontainer -p 8080:8080 -w /opt/mykanban -v /home/marcelo/Projects/MyOnlineKanban/mykanban:/opt/mykanban java:8 /usr/bin/jjs -cp lib/mongo-2.10.1.jar httpsrv.js

More about Docker images

So, let’s say I want to create my image with a “netcat” pre-installed. I would need to run:

# docker run -it --name my-docker-name centos /bin/sh

If that is the first time you are trying to spin a container out of the “centos” image, the Docker daemon will go to Docker Hub and get that image for you:

Unable to find image 'centos:latest' locally
latest: Pulling from library/centos
a3ed95caeb02: Pull complete
Digest: sha256:1a62cd7c773dd5c6cf08e2e28596f6fcc99bd97e38c9b324163e0da90ed27562
Status: Downloaded newer image for centos:latest

Then you can install what you need:

# yum install nc

Installed: nc.x86_64 2:6.40-7.el7 Complete!

How to return to the host-machine without killing my container?

If you entered a container with `docker exec` you can just type `exit` to leave the container. However, if you started a container with `docker run` then you should use the following shortcut:

ctrl+p ctrl+q

And now you can see that your container is still running (because you started it with a perpetual shell terminal process: /bin/sh):

# docker ps -n 1


CONTAINER ID   IMAGE    COMMAND     CREATED          STATUS          PORTS   NAMES
f03d4ba0a56f   centos   "/bin/sh"   22 minutes ago   Up 22 minutes           nc-image

So you can now “commit” that container and create your first DOCKER IMAGE (i.e., basically taking a snapshot of the container and turning that state into an image):

# docker commit f03d4ba0a56f nc-server

Then it becomes part of the images available in this Docker Host server:

# docker images

REPOSITORY    TAG        IMAGE ID       CREATED          SIZE
nc-server     centos     ff5450d8c273   6 seconds ago    278.8 MB

We can create new images out of base images for different purposes, and we can even extend them for specific use cases.
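Going back to the class/instance analogy, here is a minimal sketch that starts several containers from one image. The function name `spin_instances`, the `web` name prefix and the `sleep 3600` placeholder command are assumptions made for illustration; only `docker run`, `-d` and `--name` come from Docker itself.

```shell
# Start COUNT detached containers from the same IMAGE, like creating
# several instances of one class. The "web" prefix and the sleep
# command are illustrative placeholders.
spin_instances() {
    image=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # Each container gets its own name but starts from the same image.
        docker run -d --name "web$i" "$image" sleep 3600
        i=$((i + 1))
    done
}

# Usage: spin_instances nc-server 3
```

Each container gets its own filesystem layer on top of the image, so whatever you change inside one of them does not show up in the others.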

How to use your own internal Docker registry to store images


Here is how you connect to it:

Go to Artifactory and click on your user name on the top right:

Provide the password once more and click on the gear icon to generate an API key (i.e., the Artifactory encrypted password).

And, finally, login:

# docker login

Login Succeeded


How to push your image to Artifactory

First we need to tag it:

Here’s the syntax => docker tag [OPTIONS] IMAGE[:TAG] [REGISTRYHOST/][USERNAME/]NAME[:TAG]

# docker tag my-busybox tag

# docker push
The push refers to a repository []
06cc5a7ff579: Pushed
test-tag: digest: sha256:82b9618df57b5fc2ebed3d79c3d26e3ccb51e3f302348979b7534af555e2913a size: 940

# docker images | grep test-tag
REPOSITORY TAG IMAGE ID SIZE  17 minutes ago 1.113 MB

Pushed and tagged.
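The tag-then-push sequence can be wrapped in a small helper. This is only a sketch: the function names `image_ref` and `tag_and_push` are made up for this example, and `registry.example.com` is a placeholder host, not the registry used in this post.

```shell
# Build a full image reference: [REGISTRYHOST/][USERNAME/]NAME[:TAG].
# Empty registry, user or tag parts are simply left out.
image_ref() {
    registry=$1
    user=$2
    name=$3
    tag=$4
    ref=$name
    if [ -n "$user" ]; then ref="$user/$ref"; fi
    if [ -n "$registry" ]; then ref="$registry/$ref"; fi
    if [ -n "$tag" ]; then ref="$ref:$tag"; fi
    printf '%s\n' "$ref"
}

# Tag a local image with the full reference, then push it.
# Arguments: LOCAL_IMAGE REGISTRY USER NAME TAG
tag_and_push() {
    local_image=$1
    target=$(image_ref "$2" "$3" "$4" "$5")
    docker tag "$local_image" "$target" && docker push "$target"
}

# Usage (registry.example.com is a placeholder):
#   tag_and_push my-busybox registry.example.com "" my-busybox test-tag
```

The registry host is part of the image name itself, which is how `docker push` knows where to send the image.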

The Dockerfile: Best practices & versioning

Committing your Docker container into an image is a bad practice because the whole process is very manual and not very flexible. Imagine that you want to install an earlier version of “netcat”: you would need to jump inside a container created from the image you committed earlier, then uninstall netcat and install the other version. Or you would need to create another container from scratch. It’s just too messy. Now imagine a more granular change involving multiple points of configuration within the same container (e.g., service packs, JVM arguments, port configuration, OS-level tweaks, etc.). It’s a nightmare to manage all that by manually committing changes.

Therefore, do not commit containers !!! THAT WAS JUST FOR SHOW! — USE DOCKERFILES!!!

Following good automation practices: if you need to apply a number of custom steps to assemble your container, it is a bad idea to spin it and commit it. To solve that problem we use the “Dockerfile”.

1) Create a “Dockerfile” under your project folder: /home/user/Projects/my-nc-server

2) Introduce the instructions you need, e.g.:
FROM centos
MAINTAINER Marcelo Costa <>

RUN yum -y install nc
RUN yum -y install net-tools

3) Create a new custom image with the following command (running within the “my-nc-server” directory):

# docker build .

You can also introduce the [name-of-the-image]:[tag] notation with the “--tag” parameter:

# docker build --tag my-nc-server:test-tag .
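On the versioning side, one common approach is to keep the Dockerfile in version control and derive the image tag from the current revision. The sketch below assumes the project folder is a Git repository; `build_versioned` and `current_version` are hypothetical helper names, and the `dev` fallback tag is an arbitrary choice.

```shell
# Build an image from the Dockerfile in DIR and tag it NAME:VERSION,
# so every image can be traced back to one Dockerfile revision.
build_versioned() {
    name=$1
    version=$2
    dir=${3:-.}
    docker build --tag "$name:$version" "$dir"
}

# Derive a version from the current Git commit; fall back to "dev"
# when the folder is not a Git repository (or git is unavailable).
current_version() {
    git rev-parse --short HEAD 2>/dev/null || echo dev
}

# Usage: build_versioned my-nc-server "$(current_version)" .
```

With this, changing the netcat version means editing the Dockerfile, committing it, and rebuilding, instead of patching a running container by hand.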

*More info:

How to copy a Docker image to some other Docker host

You can also copy images as packages with the Save/Export & Load commands.

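What is the difference between Save and Export? Answer: Save persists an image (with its layers and tags), whereas Export persists a container’s filesystem as a flat archive that you bring back with “docker import”.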

Here is how you do it:

# docker save my-busybox >my-busybox.tar
# scp my-busybox.tar user@somemachine:/home/user/


my-busybox.tar 100% 1299KB 1.3MB/s 00:00

# scp user@somemachine:/home/user/my-busybox.tar .

my-busybox.tar 100% 1299KB 1.3MB/s 00:00

# docker load < my-busybox.tar
1834950e52ce: Loading layer 1.311 MB/1.311 MB

# docker images | grep busy
my-busybox latest 5d8cbe820583 About an hour ago 1.113 MB
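If you have SSH access to the other Docker host, the intermediate .tar file can be skipped by piping one command into the other. A minimal sketch, assuming the remote user is allowed to run `docker` there; `copy_image` is a name made up for this example.

```shell
# Stream an image straight to another Docker host: "docker save"
# writes the archive to stdout and "docker load" on the remote side
# reads it from stdin, so no .tar file is left behind.
copy_image() {
    image=$1
    remote=$2   # e.g. user@somemachine
    docker save "$image" | ssh "$remote" docker load
}

# Usage: copy_image my-busybox user@somemachine
```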

Docker networking

Docker offers 3 types of network configuration: bridge, host and none.
You can define “none” if you want to waste a lot of time configuring everything yourself.

The “host” option sucks really bad: the container shares the network stack of the Docker host and sees all of its network interfaces, so there is no magic of isolated virtualization.

The default option “bridge” is applied when none of the others are specified. This option creates a “docker0” interface in your Linux and, for each container that you start, a Virtual Ethernet interface is created along with it (usually named “veth<crazy_sequence_of_characters>”). Any request that targets a port mapped between the Docker Host and the Container is handled by the docker0 network, forwarded to the respective “veth” and then lands on the “eth0” of the container.

e.g., sandbox01 → dockerhost : ens34 :: docker0 :: vethXXX → container : eth0

An example of how the network interfaces connect with each other:

Be aware that scripts under “/etc/sysconfig/network-scripts” that contain the name of that interface can potentially block this flow, depending on their instructions.

  • Yeah, a very specific caveat here… you guessed it right, I had faced an issue with that and got stuck for a few days on it 😦

How to monitor containers

Ideally, if you have a container orchestration system like Kubernetes, then you can consider sophisticated tools like Prometheus. If you just want to monitor containers from within the Docker host, here are some useful commands:

docker top
# docker top nc-server


UID    PID     PPID    C   STIME   TTY     TIME       CMD
root   28524   28510   0   11:36   pts/1   00:00:00   nc -vv -l 8080

docker stats

# docker stats nc-server

CONTAINER   CPU %   MEM USAGE / LIMIT   MEM %   NET I/O    BLOCK I/O    PIDS
nc-server   0.00%   9.769MB/12.42GB     0.08%   48B/648B   9.409MB/0B   0
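For scripts or cron jobs, a single reading is usually handier than the live view. The sketch below relies on the `--no-stream` and `--format` options of `docker stats`, which need a reasonably recent Docker version; the `stats_snapshot` name and the output layout are my own choices.

```shell
# Print one CPU / memory reading for a container instead of the
# continuously refreshing view.
stats_snapshot() {
    docker stats --no-stream \
        --format '{{.Name}} cpu={{.CPUPerc}} mem={{.MemUsage}}' "$1"
}

# Usage: stats_snapshot nc-server
```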

Keep in mind:

docker exec -it <container-name> /bin/bash
ctrl+p & ctrl+q (return to docker host)
docker logs -ft <container-name>

That’s it. Just some general Docker instructions that should bring you up to speed if you have never played with it before. Please provide suggestions to expand this article in the comments.


Playing with Emacs

Ok, here it goes…


sudo apt-get install emacs


Add the following lines to your “~/.emacs.d/init.el” file:

(require 'package)
(add-to-list 'package-archives '("melpa" . "") t)

*More about Emacs package management in this post from a Bulgarian dude.

Start Emacs

  • emacs
  • emacs -nw
    *I prefer to use Emacs in its “no-window” mode.

Find/Open files

C-x C-f = Find file



Exit Emacs

C-x C-c

Marking, Cutting, Copying & Pasting

C-space = Starts marking
*(move around the document to select the fragment of content you want).

C-g = Cancel mark

C-w = Cut

M-w = Copy

C-y = Yank / Paste

Undo / Redo

C-x u = Undo

C-Shift _ = Redo

Save file

C-x C-s = Save the current buffer (C-x s offers to save every modified buffer)

Moving between points of interest

Mark one or more sections of the document with C-space and C-g then use

C-u C-space = moves to the previously marked location


C-x b = presents list of buffers at the bottom of the interface (aka: mini-buffer)
*move to the next buffer with C-s (step) and to the previous buffer with C-r (return).

C-x C-b = presents list of buffers on the main screen with details
*You can tag which buffers to delete with “d” and can undo that action with “u”


C-x 3 = Splits the window horizontally (two windows side by side)

C-x 2 = Splits the window vertically (one window above the other)

C-x o = Other. Moves to the other window

C-x 0 (zero) = Closes the active window.


C-h k = Key. Press the key combination and Emacs will take you to the help page containing the instructions to it.

CUA Mode

To demonstrate the power of this feature, let us create an unordered list by writing some HTML code:

  1. Write a list of names:

Fat Mike
Melvin
El Hefe
Smelly

2. Enter cua-mode by typing “M-x cua-mode”. This mode allows you to select text through rectangle marks. Officially, you start selecting with “C-Enter” (aka: C-RET). However, that did not work in my Xubuntu terminal. If C-RET does not work for you either, the key mapping for this function must be customized. Here’s how you do it:

  • While in cua-mode, type “M-x” to execute a command in the mini-buffer.
  • Type “customize-variable” and press ENTER.
  • Type “cua-rectangle-mark-key” and press ENTER.
  • In this interface, navigate to the editable text field and set another key to replace “ENTER” (RET) in the shortcut that selects text rectangularly in cua-mode. You set the new shortcut by pressing the desired key sequence, which should be captured in the text field. Once the new key is set, navigate to the “Apply and save” link and press ENTER. (I have selected C-. to be the new shortcut for my rectangular selection.)

Now that your cua-mode selection is working, select all 4 names in the list starting at the end of “Smelly” and marking the text all the way up to “Fat Mike”, i.e., the beginning of the first line. Now type “<li>”, you will see that it will move all 4 lines and replicate the typing in all of them:

<li>Fat Mike
<li>Melvin
<li>El Hefe
<li>Smelly

3. You can disable the rectangular selection by pressing the shortcut once more (e.g., C-. ). If you wish to add an attribute to all the “list item” tags (<li>) you can move the cursor within the tag and select all the rows once more within the same column. Start typing and you should get:

<li id="punk">Fat Mike
<li id="punk">Melvin
<li id="punk">El Hefe
<li id="punk">Smelly

4. If you want to introduce a sequence of numbers, you can achieve that by pressing “M-n” (numerical sequence). Move the cursor to the end of the value of the “id” attribute and, with the rectangular selection within that column, add an underscore character (_) and press “M-n”. The mini-buffer presents the options for the sequence (“start value”, “Increment”, etc.). Just press ENTER to accept the default values and you should see the result:

<li id="punk_0">Fat Mike
<li id="punk_1">Melvin
<li id="punk_2">El Hefe
<li id="punk_3">Smelly

That concludes the cua-mode section.

Emacs as a Python IDE

Here’s how to install the “elpy” mod against Emacs to introduce some Python IDE features:

Type “M-x list-packages”, wait until it connects to the packages’ repository and press “C-s” to search for the package called “elpy” (you can browse back and forth between packages by pressing C-M-s and C-M-r, respectively; again, consider the “step” and “return” words to remember these navigation options). Once you find the package, press ENTER and confirm the installation.

The next step is to initialize the Python IDE features by editing your “~/.emacs.d/init.el” file (and include some lines to fix some key binding issues):

;; Enable elpy's Python IDE features
(package-initialize)
(elpy-enable)
;; Fixing a key binding bug in elpy for snippet expansion
(define-key global-map (kbd "C-c k") 'yas-expand)
;; Fixing another key binding bug in iedit mode
(define-key global-map (kbd "C-c o") 'iedit-mode)


The “elpy” mod includes some cool features, such as:

  1. Syntax highlighting / colors
  2. Auto-complete
  3. Interpreter failures / Static Analysis
  4. Special shortcuts to increase productivity

Common productivity shortcuts

M-; = Comment multiple lines of code. Select the same lines and use the same shortcut to uncomment the code.

C-c C-r r = Refactor. It presents refactoring options (e.g., extract a given snippet of code and move to a separate function).

C-c k = Expand Kode. It can auto-complete the common format of a given function, e.g., for, if, etc.

C-c C-d = Documentation. Shows the corresponding excerpt from the help page for a given function or Python instruction.

C-c C-e = Simultaneous Editing. This shortcut allows you to rename a given variable or function by changing all of its occurrences within the same script simultaneously. Use the same shortcut again to leave the edit mode.

C-h m = Manual. Presents an instructions manual with all the key bindings associated with the modes/mods that are being used in that given buffer.


M-x flyspell-mode = This enables a real-time spell checking mechanism. It’s one of the “Minor Modes” that ship OOTB with Emacs. Once you enable it, misspelled words turn red. To check the suggestions, press M-$ and the options should show up at the top of the screen.


Version Control: Emacs & GIT

M-x vc-diff = This command shows the difference between the local modified version of the file and the version that is currently committed into the HEAD.

C-x v u = versioning-undo. It reverts the file, discarding the changes made since the last commit.

C-x v ~ = Open a specific revision (just need the first characters as the input) in a separate buffer.

C-x v l = versioning-log. Open log showing all the revisions associated with the file currently opened in the buffer.

  • You can browse through the revisions and press “f” when the cursor is on a given revision ID to open it on a separate buffer.
  • You can also see a diff report between the selected revision and the subsequent one by pressing “d”.

C-x v i = insert. It adds a file to the staging area within the emacs interface.

To stage, commit, push and pull within the Emacs interface, you can download a new package called Magit. Once you download it, restart Emacs and check the status of the staged/committed files by running M-x magit-status. Here are some of the main commands:

  • s = Stage (add) the file(s).
  • c = Commit the file(s)
  • b = Switch to a different branch.

More info here:

To play with these shortcuts, I recommend you change the default behavior for GIT’s commit message editing and add Emacs as its default text editor. While committing with Magit, remember to use C-c C-c to leave the current buffer where you edit the commit message.

Here are the commands to make this change:

$ git config --global core.editor "emacs -nw"
$ export GIT_EDITOR="emacs -nw"

* But of course, you can always use the "git commit -m 'my message'" approach; then the text editor doesn’t matter.

To be continued…





Damn you “localhost” ! ..and other musings about documentation

We code stuff.

And, at some point, for some of this stuff, we create documentation.

We do this to explain how the awesome stuff we create works, or to give laymen some guidance on how to use it; sometimes both. People forget stuff, move to different companies or different teams, die, or convert to Orthodox Latvian. Whatever the reason, there comes a point where a given piece of technology has to be maintained and extended, and the documentation is one of the pillars of such an endeavour.

Documentation comes in many formats:

  • Official documentation: Vague, dull and filled with subliminal messages that reinforce the brand around the software’s manufacturer.
  • Blog post / Wiki page: Way better. Sometimes hosted internally in some web-based collaboration system, Blogs and Wikis are widely used to document whatever is developed and can be extended through the comments and collaboration of all the team members. Personally, I like Blogs. Their informal tone makes me enjoy the learning process.
  • The code: Behold the pseudo-axiom that states “The code IS the documentation”. It is never outdated and, if done with elegance, it is clear enough to guide everyone through the understanding of whatever has been implemented. Code comments can at times smooth things over, depending on how cryptic a given snippet of code is perceived to be.
  • The ticket: Provides full awareness of the timeline of the story / task / defect. If you have some sort of Agile Planning or bug-tracking solution that connects to the actual source-code management system, that is even better. However, it gets polluted really fast with comments, misleading information and attachments (logs, screenshots).

But of course, there are other ways to document things, some interesting practices might involve recording a presentation followed by a demo, uploading the slide deck, making the video available somewhere. How about a podcast with your fellow programmers? Or perhaps letting the newcomers dig through the code and ask them to produce the documentation as part of their ramp-up process?

The catalyst for this post was an incident that happened a long time ago, in a galaxy far far away. One of the Ops guys followed some instructions on a wiki page to recreate some collections in a SolrCloud environment. The instructions had something similar to this:

Once you ssh into the server execute the following command: 

The problem is that the Ops guy accidentally logged into a Production server instead of the Staging server he was looking for and, suddenly, all the indexing data from thousands of customers was gone, in the blink of an eye. I’m not exposing the exact command that was executed but, just so you know, we have proper SSL configuration to avoid any misguided interaction with our SolrCloud API. However, the usage of the client certificate is obviously granted to the Operations team.


So, here is the question: Can/Should we blame the documentation? Some might think we should integrate some additional mechanism within the existing security API to avoid such mistakes. Or, even easier, we could wrap the command in a bash script that presents some scary ASCII art with a skull and bones, letting the user know that he is about to run a sensitive operation against a particular IP address / hostname. Regardless of the approach, this kind of goes against the “fix the process, not the problem” principle. After we restored the collection from backup, the documentation was adjusted with a placeholder:
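As an aside on the wrapper-script idea mentioned above, a minimal sketch could look like the following (minus the ASCII art). It is not the script or the placeholder we ended up with: `confirm_target`, `run_guarded` and `./recreate-collections.sh` are hypothetical names used only to illustrate the “make the operator retype the host name” pattern.

```shell
# Ask the operator to retype the target host before running a
# sensitive command. Returns 0 only when the typed name matches.
confirm_target() {
    target=$1
    printf 'WARNING: about to run a destructive command against "%s".\n' "$target" >&2
    printf 'Type the host name again to continue: ' >&2
    read -r answer || return 1
    [ "$answer" = "$target" ]
}

# Run COMMAND... only after the target host has been confirmed.
run_guarded() {
    target=$1
    shift
    if confirm_target "$target"; then
        "$@"
    else
        echo "Aborted: host name did not match." >&2
        return 1
    fi
}

# Usage (recreate-collections.sh is a hypothetical script):
#   run_guarded "$(hostname)" ./recreate-collections.sh
```

Passing `$(hostname)` forces the operator to type the name of the machine they are actually logged into, which is exactly the check that was missing in the incident.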


Through this exercise of sharing my personal musings about documentation, I guess I will take a stab at compiling some best practices for creating good documentation. Hopefully, the following guidelines can be adapted to different types of documentation, whether it is about architecture, features, troubleshooting steps, etc.

#1 – You must have empathy

Try to put yourself in someone else’s shoes. Yeah, this one can be extremely relative and vague but, just like everything else in life, I believe it is worth leaving the proper breadcrumbs and sending the elevator back down to help other people reach the level of understanding you have. Just imagine how amazing it would be if you could avoid all those IM windows that constantly blink while you try to concentrate and write some code: just reply with “RTFM” (Read The F**ing Full Manual) and send the link. Keep the following practices in mind:

  • Write an overview
  • Provide links to other resources
  • Expand acronyms
  • Collect feedback once you post it and amend if necessary

#2 – Help the visual learners with some diagrams

Describe the basic architecture and perhaps even dive into the components involved in the request flow. For automated operations, another idea is to describe a timeline of events and present the entities involved in the orchestration. Use point #1 as guidance on which visual elements would be more appropriate for the scenario you are working on.


#3 – Name it, tag it and categorize everything

The documentation you create is useless unless it can be found. So make sure you put some meaningful name and add some tags to make it easy for your internal collaboration system to index it properly. Group the pages into sections that make sense and advertise your documentation in your next technical update or knowledge sharing session.

Of course, there is no silver bullet. The documentation will become outdated, even the code can turn into a misleading amalgamation of legacy and working methods (some people don’t realize they have a version-control system so they decide to leave old stuff in the code “just in case”). However, with the proper set of references, links and comments (or other forms of general collaboration), there are alternatives to find the up-to-date information, either by going to the ticket and checking the latest updates on it or by checking the commit history of the components associated with the use case under investigation.

Now, I want to collect some feedback on this post so just comment if you feel there is something missing here.


Devops: Buzzword or the catalyst to fight conformity?

I have been meaning to write about this for quite some time now because this is the kind of stuff that should be chewing on every techie’s ear lately. Let me summarize the concept of DevOps from the point of view of a typical old-school manager (it’s funnier this way):

“ANARCHY! Developers jumping out of their cubicles and bashing into the server room, bringing chaos and unleashing pandemonium within the company”.

Now, here’s what it really means:

 “To bring Development and Operations together to build and deliver software more effectively and efficiently”.

This is cool, but I want to take this post beyond the main aspects of DevOps. A good Release & Deployment process is definitely a subject for an extensive discussion, but the essence of it, the restlessness, is the point I want to touch on today.

We love technology and we love to experiment with the “new toys”, either hardware or software (in my case, specifically, it’s definitely software due to budget issues). But I sincerely believe that the majority doesn’t want to assimilate any of these latest libraries/middlewares/APIs/Frameworks/methodologies/egregores frivolously. There’s value behind these tools; otherwise we wouldn’t have the hype around them, and the companies (or independent entities) behind such technologies wouldn’t be succeeding as they are. Now here comes the challenge: how do you introduce these changes to your project? It helps if you are the Senior Developer, and it’s even more helpful if you are the Team Lead. But what about mere mortals, the developers fighting in the trenches on a daily basis, or even the enthusiasts labelled as “Systems Engineer” or “Support Analyst” (yeah, I’m including myself in this category) who just don’t have a voice to break paradigms? Some of them will give up and comply, another group will leave the company, and then there are those who will turn the apparently irreversible mess into something better.

I will present the archetypes that I’ve defined for each one of these developers (or IT Professionals in general):

The first group that gives up can be classified as “Furniture that writes code” – They are the guys that come to work everyday to do what they’re told, never bring anything new to the table, wait until 5 PM so they can go home and wait for Death to pay them a visit.


There’s the second group that I call “The Prodigious Tourists” – These guys (and girls) are geniuses, they carry a bias against mainstream stuff like Java or .NET, always leaning towards trending stuff, most of them would write a “Hello World” and start spreading the word about the new “silver bullet” that is out on the market, everything that you use is legacy technology for them, their skills are just as good as their ability to keep whining about all the company problems without presenting any tangible idea to solve them. They will, in most cases, leave the company to work for some cool startup where the receptionist is dressed as a Pokemon, then, as its product/service catalogue evolves, this company hires a consultant, things start getting too bureaucratic and they will pack their bag and move on to the next one.


And then we have “The Mavericks” –  Office pariahs, people in the coffee room laugh at them because of their crazy ideas, they want to improve things, naive day dreamers that should not be near a server, they will struggle with their limited network access & awareness of office politics to enhance processes leaving a trace of rejected Proofs of Concept along the way.


Maybe my interpretation of the latter is a little bit hyperbolic, but this one brings us closer to the profile of someone that needs to be involved in your company’s DevOps initiative, or any other cultural-change initiative for that matter. The restlessness should go beyond DevOps. The term was coined and gained notoriety to tackle a specific (and critical) problem: delivering software. So did “Agile” and “Extreme Programming”, which came before it. But what about the other inefficient processes that you have identified within the company? Why do you need 5 tickets to copy a file to that Websphere node? Why does your security request take 3 weeks to be processed? Why are Developers not committing their Stored Procedures into version-control? Every company has similar issues, and it’s easy to ignore them despite the pain and over-bureaucracy that they bring to your project. You can say that the problem lies in another department and is therefore out of your scope, or that you don’t have a voice, no political power whatsoever, to raise a flag about these problems, so you can’t do anything about it. These are all valid points, as long as you wait for the right moment to strike and don’t let this flame of nonconformity be snuffed out. The worst excuse that I can imagine is the classic “That’s the way things are done around here”:



I heard about Hudson (proprietary father of Jenkins) before the “Continuous Integration” revolution, and the little DTSTTCPW (Do The Simplest Thing That Could Possibly Work) programs that were being used for Unit Testing arose way before the “Agile Manifesto”. But a methodology only becomes evangelizable when these cool buzzwords start flying around, which is definitely beneficial because the manager likes whatever he reads in trending magazines.


That’s why DevOps is so cool: it gives you an opportunity to play with the new toys and, most importantly, to fix processes. The road to building and delivering software has so many aspects that it presents plenty of opportunities to enhance and/or eliminate things. Now you can finally share your opinions and ideas; you can externalize all your frustration.


That’s it. If your team has a lot of messy processes and you are worried about how you should approach DevOps, there’s a brilliant talk by John Esser entitled “Creating a Culture for Continuous Delivery” that gives you 8 lessons to start breaking the paradigms within your company. I believe it’s an amazing place to start. You can read about the tools, install Jenkins on your machine and code a bunch of automation scripts but, in the end, the company culture will present itself as the most challenging obstacle. Good luck!

Nashorn and the JVM Monitoring Challenge – PART 2

Hello and welcome to the second part of the “Nashorn and the JVM Monitoring Challenge” series. We will continue our quest to unveil what kind of chaotic things we see once the JVM starts processing the bytecode generated from untyped dynamic languages.

Let’s start with some good news: you don’t have to follow the OpenJDK build instructions to start playing with Nashorn anymore. It was finally integrated into an early build of Oracle’s JDK 8. I haven’t tried it myself (still playing in Ubuntu 12 with OpenJDK 8), but it should be there.

So, let’s get started. The agenda for today is:

  1. Running a test script with the Nashorn Javascript engine
  2. Understanding Invokedynamic (It should be helpful to dive more into what we are going to see later)
  3. Monitor the JVM while running a Nashorn application

Running a test script

The Nashorn engine can be loaded in a Java class; then, with the ScriptEngine instance, you can use the eval() method to execute Javascript code. Just write and compile the following class:

import javax.script.*;
public class EvalFile {
 public static void main(String[] args) throws Exception {
     // create a script engine manager
     ScriptEngineManager factory = new ScriptEngineManager();
     // create JavaScript engine
     ScriptEngine engine = factory.getEngineByName("nashorn");
     // evaluate JavaScript code from given file - specified by first argument
     engine.eval(new java.io.FileReader(args[0]));
 }
}

Once you have your ‘EvalFile’ class ready, create a dummy .js file just to give it a try, write something like:

print('Hello World');

Then you can execute this script like this:

java -cp nashorn.jar:. EvalFile dummy.js

In order to speed things up, I’ve created an alias to invoke the ‘jjs’ command-line tool so I don’t have to use this EvalFile class.

$ alias jjs='/var/jdk8/openjdk8/nashorn/bin/jjs'

Now, the .js file can be executed like this:

jjs dummy.js
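If you prefer to stay in Java, here is a minimal sketch of the same idea that evaluates a string instead of a file. The class name EvalString and the helper findEngine are my own names for illustration; the sketch also assumes that the "nashorn" engine may be missing (getEngineByName returns null when no engine is registered under that name), so it checks before calling eval():

```java
import java.util.Optional;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

class EvalString {
    // Hypothetical helper: wraps the lookup so a missing engine is explicit
    static Optional<ScriptEngine> findEngine(String name) {
        return Optional.ofNullable(new ScriptEngineManager().getEngineByName(name));
    }

    public static void main(String[] args) throws Exception {
        Optional<ScriptEngine> engine = findEngine("nashorn");
        if (engine.isPresent()) {
            // eval() also accepts the script source as a String
            engine.get().eval("print('Hello World');");
        } else {
            System.out.println("No 'nashorn' engine is registered in this JVM");
        }
    }
}
```

The null check matters more than it looks: without it, a JVM that doesn't ship the engine fails with a NullPointerException on the eval() line instead of a readable message.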

Moving on. My test script here will only be used to explore how we can keep track of the chain of function calls and the number/size of objects that are allocated on the JVM’s heap. If you really want to experience the power of Nashorn, you can refer to the official ‘Java Scripting Programmer’s Guide’.

This test comprises two files:

— Model.js —

function Person(name, address, phone) {
    this.name = name;
    this.address = address;
    this.phone = phone;
    this.sayHello = function() {
        return "Hi, my name is " + this.name;
    };
}

— testNashorn.js —

var Thread = java.lang.Thread;
load("Model.js");
print("Welcome to testNashorn.js");
var people = [];
for(var i=0;i<100;i++){
    var p = new Person("Marcelo","Caucaia Street 17","+353 086 5555555");
    people.push(p);
}
//Lots of objects were allocated into memory at this point
Thread.sleep(30000);
var myFunction = function(){
    var text = people[0].sayHello();
    print(text);
};
myFunction();

The first one (Model.js) is where I’m declaring my Person “Class”: 3 attributes and 1 meth.. sorry, function. The second file (testNashorn.js) is the actual program. It first assigns the Thread Java class to a Javascript variable, then loads Model.js, i.e., runs the Javascript code inside that file and prepares our Person() function/constructor. It then declares an empty array and enters a loop that creates 100 objects in memory. After the ‘for’ loop, since I need a moment to trigger my script that generates a heap dump, I decided to add a ‘Thread.sleep(30000)’ there (30 secs is more than enough). Once the program wakes up, it declares a function (myFunction) that prints the value returned by the ‘sayHello()’ function of the first object stored in the ‘people’ array, and this function is called right afterwards.

Now, we can run the program:

jjs testNashorn.js

The output should be something like this:

Welcome to testNashorn.js
[30 sec pause]
Hi, my name is Marcelo

That’s it, we have our test script, let’s move on to the second topic.

Understanding invokedynamic

Ok, we already know that the Nashorn Javascript engine is 100% compliant with ECMA-262 5.1 and that it is fully implemented on top of the new “invokedynamic” bytecode instruction, which is why it is faster and more compliant than Mozilla’s Rhino. But what’s invokedynamic?

Executing Java code is not the JVM’s sole purpose. All Java code is compiled into bytecode, and this is the piece that gets consumed and processed by the JVM; if a programming language can be compiled to bytecode, then its instructions can be executed by the JVM (e.g., Clojure or Scala). Bytecode is a compact, simplified form of code that is not meant to be read by humans and sits closer to machine-level instructions, which helps performance.

The JVM has approximately 200 “opcodes” to invoke methods, handle access to fields and control objects and arrays. The following table presents the types of invocation bytecode operations that were available before JDK version 7:

Opcode            Used
invokestatic      For static methods
invokevirtual     For non-private instance methods
invokespecial     For private instance methods
invokeinterface   For the receiver that implements the interface
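To see these invocation types in practice, here is a small sketch. InvocationKinds is a made-up class for illustration, and the comments show the opcode I would expect javac to emit for each call. You can check it yourself by compiling the class and running ‘javap -c InvocationKinds’:

```java
import java.util.ArrayList;
import java.util.List;

class InvocationKinds {
    static int twice(int x) {       // static method
        return 2 * x;
    }

    String greet(String name) {     // non-private instance method
        return "Hi, " + name;
    }

    static String describe() {
        InvocationKinds self = new InvocationKinds(); // constructor call: invokespecial
        List<String> names = new ArrayList<>();       // constructor call: invokespecial
        names.add("Marcelo");                         // receiver typed as an interface: invokeinterface
        int n = twice(names.size());                  // size(): invokeinterface, twice(): invokestatic
        return self.greet(names.get(0)) + " x" + n;   // greet(): invokevirtual
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "Hi, Marcelo x2"
    }
}
```

Two details to keep in mind when reading the javap output: constructors also go through invokespecial, and on JDK 9 and later the string concatenation itself is compiled to an invokedynamic call, so you may see the new instruction even in this plain Java class.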

A simple method invocation starts from a given “Call Site”, which is where the request is initiated. It is assembled from the name of the method, its signature (argument types and return type) and the arguments that are passed to the method. The JVM will process this Call Site information and go through a set of operations: it looks for that method’s code within memory (Lookup), checks if the types involved in the operation match (Type Checking), invokes the actual code (Branch) and then caches the location of that method so that, if it is needed again soon, the JVM already knows that memory address and speeds up the process (Cache).


The new “Invokedynamic” bytecode operation allows the JVM to customize how the resources for the Call Site are assembled (dynamically) and to perform a different set of operations so the field or method can be accessed (invoked). Instead of the regular Call Site, it combines bytecode (the invokedynamic operation with a name and signature) with a bootstrap method, which is the component that connects the Call Site with the “Method Handle”. Once the handle finds the correct way of making this invocation, the JVM optimizes the operation and the invokedynamic bytecode stays attached to the “Target Method” so that these steps are not processed again.

In a scenario where a scripting language running within the JVM needs to access a specific function, it initiates the process by providing the bootstrap method with the invokedynamic instructions (name of the function followed by the arguments and the return type). The JVM looks for the function within a Method Table (a list of functions that are not associated with any object or class) based on the arguments that are defined at runtime (Lookup). Once it finds the function, it performs some language-specific type checking (Type Checking) and then finishes the bootstrap process by connecting the Call Site with the Method Handle so it can be executed (Branch). This connection is performed only once, but Call Sites can later be connected to new Method Handles.
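Plain Java source cannot emit an invokedynamic instruction directly, but the java.lang.invoke API exposes the same building blocks. The sketch below is only an approximation of the flow described above: there is no real bootstrap method here, and CallSiteSketch, hello and shout are hypothetical names. It looks up two method handles by name and type, links a MutableCallSite to the first one, and later relinks the same call site to the second:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

class CallSiteSketch {
    // Two hypothetical targets with the same signature: (String) -> String
    static String hello(String name) { return "Hi, my name is " + name; }
    static String shout(String name) { return "HI, MY NAME IS " + name.toUpperCase(); }

    static String[] demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType type = MethodType.methodType(String.class, String.class);
            // Lookup + type checking: resolve each target by name and exact type at runtime
            MethodHandle hello = lookup.findStatic(CallSiteSketch.class, "hello", type);
            MethodHandle shout = lookup.findStatic(CallSiteSketch.class, "shout", type);
            // Linking: connect the call site to its first method handle
            MutableCallSite site = new MutableCallSite(hello);
            MethodHandle invoker = site.dynamicInvoker();
            String first = (String) invoker.invokeExact("Marcelo");
            // Relinking: the same call site now points at a different target
            site.setTarget(shout);
            String second = (String) invoker.invokeExact("Marcelo");
            return new String[] { first, second };
        } catch (Throwable t) {
            // invokeExact is declared to throw Throwable
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

The invoker returned by dynamicInvoker() behaves like an invokedynamic instruction linked to that call site, which is why the second call picks up the new target without any change on the caller’s side.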


If you want a more in depth explanation, I strongly recommend this blog post here:

Monitoring the JVM while running a Nashorn application

We have reached the last part of today’s post, it’s time to diagnose the Rhinoceros (or… at least, try).


Let’s start with some Thread Dumps: if we run our testNashorn.js and take a few thread dumps (using the instructions documented in our previous post), this is what we get once we load them into our Thread Dump analyzer:


That’s it, say goodbye to the good old readable stack trace. This first analysis reinforces once more how this change of paradigm will affect the way Java performance analysis and troubleshooting is done nowadays. The chain of execution presented in the stack trace of the “Main” thread resembles bytecode instructions, and the best clue to identify what initiated each set of instructions is the name of the Javascript file that is declared in the “jdk.nashorn.internal.scripts.Script” class (it can be found at the bottom of the stack trace). There are some familiar things there, like the JVM native and internal threads, but the rest got pretty cryptic for me.
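If you can run code inside the same JVM, one way to cut through the noise is to keep only the frames whose class name starts with that “jdk.nashorn.internal.scripts.Script” prefix and print the file name and line they report. This is just a sketch: ScriptFrames and scriptFiles are names I made up, and the sample frames in main() are synthetic, not copied from a real dump:

```java
import java.util.ArrayList;
import java.util.List;

class ScriptFrames {
    // Class-name prefix that shows up on frames generated for scripts
    static final String SCRIPT_PREFIX = "jdk.nashorn.internal.scripts.Script";

    // Returns "file:line" for every frame that belongs to a generated script class
    static List<String> scriptFiles(StackTraceElement[] stack) {
        List<String> files = new ArrayList<>();
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().startsWith(SCRIPT_PREFIX) && frame.getFileName() != null) {
                files.add(frame.getFileName() + ":" + frame.getLineNumber());
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // Synthetic frames for illustration only (hypothetical class and method names)
        StackTraceElement[] sample = {
            new StackTraceElement("jdk.nashorn.internal.scripts.Script$testNashorn", "runScript", "testNashorn.js", 12),
            new StackTraceElement("java.lang.invoke.LambdaForm$MH", "invoke", null, -1),
            new StackTraceElement("jdk.nashorn.tools.Shell", "main", "Shell.java", 1)
        };
        System.out.println(scriptFiles(sample)); // prints [testNashorn.js:12]
    }
}
```

In a live process, the same method can be fed with the arrays returned by Thread.getAllStackTraces(). For dumps taken from outside the JVM you would have to filter the text output instead.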

So, what’s our alternative? As far as I know, there isn’t any. We can only pass some arguments when running the program and get some debug data that is supposed to give us clues as to where the calls are coming from, but it is not very clear. I believe that, if we grok the concepts behind ‘invokedynamic’, we can use the “--print-code” Nashorn argument to produce debugging output that can be interpreted based on the dynamic calls that are generated by the Nashorn engine:


We can also get more verbose results with the following command:

jjs -Dnashorn.debug=true -Dnashorn.methodhandles.debug.stacktrace=true --log=codegen:info testNashorn.js

What if the program hangs during the execution of a particular function? During the development phase it is pretty easy to just attach a debugger and step through the function calls, but what if we are troubleshooting something in the production environment? I’m not sure we have a valid alternative for that. If you have any ideas, please share them in the comment section below.

What about Heap Dumps? Let’s see what we get when we take a Heap Dump during the execution of our testNashorn.js script:


Yep, it also got a little weird here. Since we don’t have packages and typed classes, there’s no easy way to track down where our “Classes” are and the number/size of objects associated with them. I did some investigation and found this “jdk.nashorn.internal.scripts.JO” object that apparently serves as a wrapper for the objects created through Javascript functions. The downside is that it doesn’t separate the objects based on their “Class” (at least I didn’t find any parameter that pointed me anywhere near the “jdk.nashorn.internal.scripts.Script$Person” object), so if you have 100 instances of ‘Person’ and 100 instances of ‘Car’, they will be mingled in this sea of ‘JO’ instances (I haven’t tested other object forms yet, e.g., Object Literals; not sure if we would see something different). So, how do we easily keep track of the size of objects created from a given function()? Well, we could rely on the format of the attributes, play with OQL (Object Query Language) and isolate a given set of objects to determine how much space they are taking up, but that’s just messy. Currently, there are a few DEBUG parameters documented in “$OPENJDK8_HOME/nashorn/docs/DEVELOPER_README“. Some of them are quite interesting and might provide the answers we need, e.g.:


Now, to conclude this post, I leave you with a message from Jim Laskey. This is one of the replies that he sent me when I was questioning his team about these concerns:

“The next round of development will be focusing on tools, so what you are trying to accomplish will get easier. You have an opportunity to provide us guidance on this…. Stack crawls will get better once we start working on debugging APIs.”.

So there you have it. If you are interested in contributing to these debug APIs, I hope this post can provide some guidance and/or raise awareness of the difficulties that we might face in the near future, when we will be troubleshooting Nashorn-based enterprise applications.

Good Nashorning everyone! 😀