Here is how I set up Ubuntu 18 in VirtualBox on macOS 10.12.6 Sierra to try out GPT-2.

First of all, install VirtualBox. My version is 6.0.8 r130520. Then download Ubuntu (my version is 18.04.2, desktop edition) and install it with 30 GB of disk space and 4096 MB of RAM (you will need at least 22 GB of disk space). You may notice that the virtual screen is very small, with seemingly no way to enlarge the resolution. If this happens to you, just endure it until the installation is done, then shut down the VM. Right-click on your VM, then Settings » Display » Screen » Graphics Controller = VBoxSVGA. The next time you start the VM, you can simply resize the window to change the virtual screen size. You can also scale the window (same resolution, but it looks bigger), which is handy if you have a huge monitor with a very high resolution.
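If you prefer to script the VM setup instead of clicking through the GUI, the same configuration can be sketched from the host terminal with VBoxManage. This is only an outline under my assumptions; the VM name "ubuntu18" and the disk filename are placeholders:

```shell
# create and register the VM ("ubuntu18" is a placeholder name)
VBoxManage createvm --name ubuntu18 --ostype Ubuntu_64 --register

# 4096 MB RAM, and the VBoxSVGA graphics controller (the tiny-screen fix above)
VBoxManage modifyvm ubuntu18 --memory 4096 --graphicscontroller vboxsvga

# a 30 GB virtual disk (size is in MB)
VBoxManage createmedium disk --filename ubuntu18.vdi --size 30720
```

You would still attach the disk and the Ubuntu ISO before the first boot; the point here is just that the memory, disk size, and graphics-controller choices above map directly onto VBoxManage flags.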

Next, you may want to disable the screen saver; otherwise your VM will keep asking for your password every 5 minutes. Go to the bottom left, then Settings -> Power -> Blank screen = Never. Close the window.
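If you prefer the terminal, the same setting can be changed with gsettings (this assumes the default GNOME desktop that ships with Ubuntu 18.04):

```shell
# 0 disables the blank-screen timeout entirely
gsettings set org.gnome.desktop.session idle-delay 0
```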

The next thing I highly recommend is enabling copy and paste between the host and the guest. Follow this link. Also, pin the Terminal to the dock when you have a chance.
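For reference, bidirectional clipboard sharing can also be switched on from the host terminal while the VM is running. This assumes the Guest Additions are already installed in the guest, and "ubuntu18" is again a placeholder VM name (the syntax below is the VirtualBox 6.0 form):

```shell
VBoxManage controlvm ubuntu18 clipboard bidirectional
```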

Now we are ready to install GPT-2. I got most of the details from this original post.

#0 Update system

sudo apt update

#1 Install system-wide dependencies

The commands are from here. Basically we are installing TensorFlow, Keras, Caffe, Caffe2, CUDA, cuDNN, and the NVIDIA drivers. Copy and paste the following long one-liner into your terminal. The installation may take a while depending on your internet speed. If ⌘-V or Ctrl-V doesn't work, just right-click and paste.

LAMBDA_REPO=$(mktemp) && \
wget -O${LAMBDA_REPO} && \
sudo dpkg -i ${LAMBDA_REPO} && rm -f ${LAMBDA_REPO} && \
sudo apt-get update && sudo apt-get install -y lambda-stack-cuda

At some point you will encounter a window titled Configuring libcudnn7, and there seems to be no way to press the OK button. Someone found the solution: press Alt + Enter (Option + Enter on a Mac keyboard). A dialog will pop up; just select agree and press Enter.

When done, sudo reboot. Once you log back in, the background will have the Lambda logo.
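Before moving on, it is worth a quick sanity check that the stack actually installed (this assumes Lambda Stack put TensorFlow into the system Python):

```shell
# driver / GPU visibility
nvidia-smi

# TensorFlow import and version
python3 -c "import tensorflow as tf; print(tf.__version__)"
```

Note that a plain VirtualBox VM has no GPU passthrough, so nvidia-smi may report no devices and TensorFlow will fall back to the CPU; the import succeeding is the part that matters here.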

#2: Install Python dependencies & GPT-2 code

The following commands will pull the GPT-2 code and the 345M model file.

sudo apt-get install python3-venv
git clone
cd gpt-2/
python3 -m venv ~/venv-gpt-2
. ~/venv-gpt-2/bin/activate
pip install tensorflow-gpu==1.12
pip install -r requirements.txt
python3 ./ 345M

#3: Run the model

We need to change some default parameters first. In case you are curious what top_k is, here is the comment from the file src/

top_k=0 : Integer value controlling diversity. 1 means only 1 word is
considered for each step (token), resulting in deterministic completions,
while 40 means 40 words are considered at each step. 0 (default) is a
special setting meaning no restrictions. 40 generally is a good value.
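To make that comment concrete, here is a minimal numpy sketch of top-k filtering (my own illustration, not code from the GPT-2 repo): keep the k largest logits, mask the rest to negative infinity, and renormalize, so that sampling can only ever pick one of the k surviving tokens.

```python
import numpy as np

def top_k_logits(logits, k):
    """Keep only the k largest logits; mask the rest to -inf.

    k == 0 is the 'no restrictions' special case described above.
    """
    if k == 0:
        return logits
    kth_largest = np.sort(logits)[-k]  # smallest value still allowed
    return np.where(logits < kth_largest, -np.inf, logits)

# toy distribution over a 5-token vocabulary
logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
filtered = top_k_logits(logits, k=2)

# softmax over the filtered logits: masked tokens get probability 0
probs = np.exp(filtered - filtered.max())
probs /= probs.sum()
```

With k=2, only the two highest-scoring tokens keep nonzero probability; with k=1 sampling becomes deterministic, and with k=0 every token stays eligible, exactly as the comment says.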

sed -i 's/top_k=0/top_k=40/g' src/
sed -i "s/model_name='117M'/model_name='345M'/g" src/

Once you have changed the parameters, run the script.

python3 src/

When you see Model prompt >>>, you need to enter your prompt as a single line. For example, I went to Wikipedia and copied the first paragraph of the Donald Trump article: Donald John Trump (born June 14, 1946) is the 45th and current president of the United States. Before entering politics, he was a businessman and television personality. Paste it into the terminal, then press ENTER. It will take a while; it took me about 2 minutes to generate the following.

Early life Edit

Donald Trump was born in New York City on June 14, 1946.[1] Donald was first elected to the New York City borough Senate on May 20, 1986, just days before he was born.[2] By the time he was nine, he and his brother Donald Jr. were the guests of honor at the Governor-General’s invitation, and became close to the Governor-General’s wife Marianne Rockefeller. He is listed as having a B.A. degree from the Queens College of New York in 1983.[3]

Career Edit

Donald worked at a bank from 1969 to 1980. He eventually became a partner at the law firm G.W. Bush & Associates, which he helped to found in 1983. Trump’s practice was headquartered on Fifth Avenue and 125th Street in Lower Manhattan but was later moved from this location to Trump Tower. During a meeting at Trump Tower with Ronald Reagan’s daughter Meghan, Trump proposed to his sister as a romantic dinner. In 1986, Trump sold the building he had owned for $1 million, which left him with a large and substantial income, which he was using to build Trump Tower. Trump told Meghan that he planned to invest $10 million into Trump Tower during his presidential campaign.[4] According to her account, Trump gave her his entire earnings of $250,000 and stated he planned to take home $100 million. He also gave out a $500,000 guarantee to each of his children, saying that if they would win the Presidency he meant to keep the remaining $250,000 for them.[5]

One month prior to the election, on November 30, he made his plans to build a casino gambling establishment in Atlantic City known as the Taj Mahal. He also announced plans to build an international hotel called Trump Castle that would be open 24 hours a day 7 days a week, seven days a week. He announced that they would build the hotel on his property, which would be used for “a major international hotel”. He did, however, reveal that he would have to agree to build the project on other land if he wanted his family to get a piece of the profit. He also told this story to the New York Post, claiming this was a good deal because he would own Trump Castle and, therefore, the resort.

After the election, Trump began working at the Trump Hotel, a hotel that was located near his property that was to be demolished in an effort to build

Truthfulness aside, the above text is randomly generated: run it again with the same input and the output should be different. Longer input tends to produce more cohesive output. Short input yields about the same amount of text, but sometimes it does not make much sense.

There you go. This is the power of GPT-2. Some people are worried, some are not. What do you think?

3 thoughts on “GPT-2”

  1. Thank you for this! After countless attempts, this is the closest I’ve come to actually being able to test GPT-2. It seems that the full stack worked well a couple years ago, but updates since then have broken component compatibility. If you still have a working GPT-2 VM (particularly one that’s been updated with the full GPT-2 dataset now available), is there any chance you’d share it? I can provide hosting space if needed.


  2. Hi Matthew, I had one in my VirtualBox on macOS, but I deleted it already since it took up space; plus my MBP isn’t that fast at running GPT-2 in a virtual box anyway. I think you will get much better results if you run it directly on Linux with a decent GPU.


  3. Thanks again, I did manage to get it working, and with the XL model (might be worth updating the reference in the tutorial from 345M to 1558M). Key steps seem to have been ‘pip3 uninstall tensorflow-gpu’ and ‘pip3 install --upgrade tensorflow-gpu==1.4’. I’m reasonably competent in Linux, resolving dependencies, and compiling from source, but the whole ML stack seems like a house of cards. Speed is certainly a factor, but considering how many parts there are, how they all interconnect, and how easily things break in a VM, I’m not willing to risk breaking my actual work set-up just to get a faster run.

